Speech tools
Jean-Philippe Goldman
03.03.2004

Two questions
  - What kind of data?
  - Which task?

What kind of data?
  - Speech content: noise, multivoice
  - Data file: sound, transcription, pitch curve
  - Sampling and quantization: 16k, 12k, 8k, 4k; 8 bit
  - Size: 16k / 16 bit = 256 kbps = 1.9 MB/min = 115 MB/h
  - Format
      Sound: wav, wma, mp3, ogg, aiff, aifc, au, vox, raw, sd, CSL, Ogg Vorbis, NIST Sphere
      Transcription: HTK, TIMIT, TextGrid, Phondat
  - Number of files

Which task?
  - Visualization and editing: record, play, edit, mix, add effects
  - Analysis: spectral, pitch
  - Annotation: segmentation, labeling
  - Speech manipulation: filtering, mixing, adding effects, prosodic manipulation
  - Scripting: batch, communication with outside
  - Plotting

Examples of tasks
  - build stimuli for an experiment (e.g. cross-splicing)
  - manage a speech database for a TTS engine
  - create a prosodic database
  - analyze a speech corpus from experiment recordings
  - verify and correct an automatic segmentation

Two questions
  - What kind of data?
  - Which task?
Two rules
  - there is no single tool that does everything
  - there are plenty of ways to do one thing

Tool features
  - Visualization
  - Editing
  - Analysis
  - Speech manipulation
  - Annotation
  - Scripting
  - Plotting
  - Supported formats
  - Platform / installation
  - Evolution / community
  - Accessibility
  - Price

Software
  - Goldwave: audio editor
  - Esps/Xwaves: routines + visual
  - Praat: speech analysis
  - Wavesurfer: speech editor
  - Transcriber: annotation tool
  - Matlab: general-purpose software
  - OGI speech tools: routines + application development
  - winpitch, pitchworks, phonedit, cooledit

Goldwave
  Self-defined as "top rated professional digital audio editor".

Goldwave
  pros
  - editing
  - good memory management for big files
  - many FX (noise reduction)
  - real-time spectrum and VU meters
  - various formats
  - batch conversion, chained effects
  - easy interface
  cons
  - nothing for speech (pitch, formant)
  - Windows only
  - no scripting
  Good for file editing, not for speech.

Esps/Waves
  - Developed by Entropic / AT&T
  - Now public
  - The comp.speech FAQ says:
      Esps: comprehensive set of speech analysis/processing tools
      Waves is a graphical front end for speech processing (waveforms, spectrograms, pitch); includes a signal labeling utility

Esps/Waves
  pros
  - powerful
  - designed for big files
  cons
  - UNIX only (free BSD)
  - non-standard formats
  - requires programming skills
  - development has stopped

Praat
  - Developed by P. Boersma and D. Weenink at the Institute of Phonetic Sciences, University of Amsterdam
  - General-purpose speech tool: editing, segmentation and labeling, prosodic manipulation

Praat
  pros
  - designed for speech analysis (not only sound editing or spectrogram visualization)
  - nice GUI
  - scripting
  - active development and community
  - prosodic manipulation
  cons
  - limited scripting language
  - native format of transcription and pitch files

WaveSurfer
  - Open Source tool for sound visualization and manipulation
  - speech/sound analysis and sound annotation/transcription
  - platform for more advanced or specialized applications: extending WaveSurfer with new custom plug-ins, or embedding WaveSurfer visualization components in other applications
  - Requires the Snack toolkit

Transcriber
  - Authors: C. Barras, E. Geoffrois
  - Relies on Snack and Tcl/Tk
  - Good for annotation
  - Nice, simple GUI
  - No speech analysis

Matlab (Mathworks)
  - Math environment
  - Signal processing toolbox: filter design, spectral analysis, waveform generation, linear prediction
  - voicebox (2002), mike.brookes@ic.ac.uk
  - pitch determination algorithm (2002), Xuejing Sun, sunxj@northwestern.edu
  - colea speech editor (1998), Philip Loizou, loizou@utdallas.edu, Univ. of Texas, Dallas

Matlab (Mathworks)
  pros
  - open
  - powerful scripting
  - excellent plotting
  cons
  - poor speech community and standards
  - not designed for big files

OGI speech tools / CSLU Toolkit
  - Development started in 1992, in C on Unix, at the Center for Spoken Language Understanding (CSLU) at OGI
  - Includes:
      an X Windows display tool (LYRE): display and edit speech signal, spectrograms, phoneme labels and other information
      a set of C library routines (LIBNSPEECH): utilities for converting file formats, filtering, neural network training, vector quantizer
      a database utility to automate speech-database-related enquiries
      a set of Perl scripts which have been used mainly to automate the use of the OGI Speech Tools
      man pages
  - RAD (rapid application development)
  - Points of entry at package (C), script (Tcl) and GUI (Tk) levels
  - Free for research use

Summary
  Columns: Price, Comm., Evolut., OS, Format, Plot, Script, Annot., Manip., Anal., Edit
  Rows: Goldwave; Esps/Waves; Praat; wavesurfer + snack; transcriber; OGI Toolkit; matlab + Sigproc + packages
  Cell values, in extraction order (row and column alignment lost): win, 40, C, sh, Unix, free, yes, native, console, sendpraat, src, free, C, tcl/tk, python, src, free, xml, free, free, native, no, BSD, stud. 100 + 40/tbx

  yes, but requires some development

Expect to do conversions
  - Sound files: goldwave (Windows), sox (Unix)
  - Transcription files: scripts to convert text-formatted label files

Links
  - www.goldwave.com
  - www.speech.kth.se/software (ESPS)
  - www.praat.org
  - www.speech.kth.se/software (WaveSurfer)
  - www.cse.ogi.edu/toolkit
  - www.mathworks.com (Matlab)
  - www.lpl.univ-aix.fr/sqlab (Phonedit)
  - www.sciconrd.com/pworks.htm (PitchWorks)
  - www.winpitch.com (WinPitch)
  - www.adobe.com (CoolEdit / Audition)
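
Worked example: storage cost of uncompressed speech

The size figures quoted under "What kind of data" (16 kHz, 16 bit: 256 kbps, about 1.9 MB per minute and 115 MB per hour) follow from simple arithmetic. The sketch below reproduces them under two assumptions that the slides do not state: the audio is uncompressed PCM with a single channel, and 1 MB means 10^6 bytes. The function name pcm_size and its parameters are hypothetical, chosen only for illustration.

```python
def pcm_size(sample_rate_hz, bits_per_sample, channels=1):
    """Estimate the data rate of uncompressed PCM audio.

    Assumes raw PCM (no header, no compression) and decimal
    megabytes (1 MB = 1,000,000 bytes).
    """
    bits_per_second = sample_rate_hz * bits_per_sample * channels
    bytes_per_second = bits_per_second / 8
    return {
        "kbps": bits_per_second / 1000,
        "mb_per_min": bytes_per_second * 60 / 1e6,
        "mb_per_hour": bytes_per_second * 3600 / 1e6,
    }


if __name__ == "__main__":
    # The case quoted in the slides: 16 kHz sampling, 16-bit quantization.
    rate = pcm_size(16000, 16)
    print(f"{rate['kbps']:.0f} kbps, "
          f"{rate['mb_per_min']:.1f} MB/min, "
          f"{rate['mb_per_hour']:.0f} MB/h")
```

Calling pcm_size(8000, 8) in the same way gives the cost of telephone-quality recordings, which helps when deciding whether a tool needs to handle large files comfortably.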