Splitting Audio Clip

Splitting the audio segment into individual sentences is by far the most time-consuming task. I have searched high and low for an automated tool that would at least attempt to guess where each sentence begins and ends and split the clip for me. Unfortunately, the few I found failed badly and wasted more time than they saved. The reason is that when a native English narrator reads a paragraph, she paces herself to convey meaning and emotion, not to mark sentence boundaries. There are long pauses between some sentences, and short pauses or none at all between others. Recording noise makes it worse, and so do sentences whose internal pauses are as long as the pauses at sentence boundaries.
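To make the problem concrete, here is a minimal Python sketch of a naive pause-based splitter. It is an illustration only, not how any particular tool works: the function name find_pauses, the amplitude threshold, the minimum pause length, and the synthetic signal are all assumptions made up for this example.

```python
def find_pauses(samples, rate, threshold=0.02, min_pause=0.3):
    """Return (start_sec, end_sec) for each quiet run of at least min_pause seconds.

    samples   -- sequence of amplitudes in the range -1.0..1.0 (hypothetical input)
    rate      -- samples per second
    threshold -- amplitudes below this count as silence (assumed value)
    min_pause -- shortest pause treated as a sentence boundary (assumed value)
    """
    pauses = []
    run_start = None  # index where the current quiet run began, if any
    for i, s in enumerate(samples):
        if abs(s) < threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and (i - run_start) / rate >= min_pause:
                pauses.append((run_start / rate, i / rate))
            run_start = None
    # A quiet run may extend to the very end of the clip.
    if run_start is not None and (len(samples) - run_start) / rate >= min_pause:
        pauses.append((run_start / rate, len(samples) / rate))
    return pauses


# Synthetic clip at 100 samples per second:
#   sentence 1 = 0.5 s speech, 0.4 s internal pause, 0.5 s speech
#   then only a 0.1 s pause before sentence 2 (0.5 s speech)
rate = 100
speech = [0.5] * 50
samples = speech + [0.0] * 40 + speech + [0.0] * 10 + speech

pauses = find_pauses(samples, rate)
print(pauses)  # [(0.5, 0.9)]
```

The only pause reported is the one inside sentence 1, while the real boundary at 1.4 s is missed. Lowering min_pause to catch the short boundary also keeps the internal pause, so no single setting separates the two. On a noisy recording the quiet stretches may never fall below the threshold at all.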

I split my sentences manually in a sound editor. For that, I use spwave, written by Hideki BANNO. spwave lets me visually identify each sentence and then automatically saves each one to its own WAV file! All I have to do is load the clip file, select a sentence with the mouse, and press Shift-V (Extract And Autosave). A window with the selection pops up. I just close it.


% cd /data/English/Segment12/English
% spwave en12_.wav
	

spwave numbers the selections you make and produces a sequence of files whose names are derived from the name of the file being edited. At the end, I have a sequence of 30 individual sentences in English:


% ls

en012_01.wav  en012_10.wav  en012_19.wav  en012_28.wav
en012_02.wav  en012_11.wav  en012_20.wav  en012_29.wav
en012_03.wav  en012_12.wav  en012_21.wav  en012_30.wav
en012_04.wav  en012_13.wav  en012_22.wav
en012_05.wav  en012_14.wav  en012_23.wav
en012_06.wav  en012_15.wav  en012_24.wav
en012_07.wav  en012_16.wav  en012_25.wav
en012_08.wav  en012_17.wav  en012_26.wav
en012_09.wav  en012_18.wav  en012_27.wav
	

Neat! spwave also lets you interrupt processing and pick up next time where you stopped. It saves all your past selections to a text file that you can load later on; File->Label->Open Label does the trick. That comes in handy when you are processing big chunks of data.
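When the work is spread over several sittings, it is easy to skip a sentence without noticing. The following Python sketch checks that the numbered files form a complete sequence. It assumes the en012_NN.wav naming and the count of 30 from the listing above; the function name missing_sentences and its parameters are hypothetical, not part of spwave.

```python
import re


def missing_sentences(filenames, prefix="en012_", expected=30):
    """Return the sentence numbers in 1..expected that have no matching file.

    Assumes files are named <prefix><number>.wav, as in the listing above.
    """
    pattern = re.compile(re.escape(prefix) + r"(\d+)\.wav")
    found = set()
    for name in filenames:
        match = pattern.fullmatch(name)
        if match:
            found.add(int(match.group(1)))
    return [n for n in range(1, expected + 1) if n not in found]


# Demonstration on generated names rather than a real directory.
complete = [f"en012_{n:02d}.wav" for n in range(1, 31)]
incomplete = [name for name in complete if name != "en012_17.wav"]

print(missing_sentences(complete))    # []
print(missing_sentences(incomplete))  # [17]
```

To run the check against a real directory, pass os.listdir(".") (after import os) as the filenames argument.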