Columbia COMS W4706 - Limited Domain TTS using Festival

This preview shows page 1-2-3-4-5-6 out of 19 pages.

Slide titles:
• Project 1 – Limited Domain TTS using Festival
• Project 1
• Limited Domain TTS project
• Outline – Creating your TTS
• Part A – environment variables
• Part A – build directories
• Part A – build prompts and utterances
• Part A – build and run speech synthesizer
• Part A – final output
• Part B: Designing a limited domain TTS system
• Example: Time
• Part B – preparing your limited domain TTS
• Tips for designing prompts
• Part C – completing your limited domain TTS
• Tips for Recording prompts
• Submission
• Submission continued
• Script file
• Summary

Project 1 – Limited Domain TTS using Festival
CS4706 Spring 2010

Project 1
• Build a limited domain (LDOM) TTS system
• Festival TTS
  – http://www.festvox.org/
  – Uses Scheme as scripting language
  – Installed (with scripts) on speech lab Linux machines
• Design TTS based on design of your dialog systems. Do in same teams.
• Sign up for times in speech lab ASAP
  – http://www.cs.columbia.edu/~speech/sign-up/index.php
  – Choose main machines first. Use overflow machines if none available
• CS Linux account required
  – http://www.cs.columbia.edu/crf/

Limited Domain TTS project
• Part A (easy)
  – Set up and test a first small TTS system (talking clock) using Festival-provided scripts
  – Each team member should do this part with their own voices
• Part B&C
  – Do this in your teams
  – Design and implement your own LDOM TTS
  – Base this on your dialog system domain
  – Use some of the same procedures as in Part A
  – Note: these slides are an overview of the assignment. See full instructions at: http://www.cs.columbia.edu/~julia/courses/CS4706/hw/PROJ1.htm

Outline – Creating your TTS
• Create utterance list (text) to specify the words and phrases in your domain as said in context (to capture coarticulation effects).
• Record these utterances.
• For all recorded utterances, Festival aligns utterance audio to utterance text.
• Festival generates voice synthesizer for your domain (to cover words in your utterance list).
• Write script which translates an input code to the new utterance text to be synthesized.
  Synthesized utterance must use words found in utterance list.
• Run script to synthesize the given new utterance.

Part A – environment variables
– Add these lines to your ~/.bashrc file:
  • export PATH=/proj/speech/tools/festival/festival/bin:$PATH
  • export PATH=/proj/speech/tools/festival/speech_tools/bin:$PATH
  • export FESTVOXDIR=/proj/speech/tools/festival/festvox
  • export ESTDIR=/proj/speech/tools/festival/speech_tools
– Note: after first setting these variables, you have to log out and back in for the changes to the ~/.bashrc file to take effect.
  • Or do source ~/.bashrc in a running shell
  • Check by running echo $FESTVOXDIR. The value specified above should be displayed.
– Festival documentation on the “telling the time” example:
  • http://festvox.org/bsv/x1003.html

Part A – build directories
• Each student should do Part A
• Create directory and cd to it
  – mkdir /proj/speech/users/cs4706/USERNAME
  – cd /proj/speech/users/cs4706/USERNAME
  – mkdir time
  – cd time
• Set up directory
  – $FESTVOXDIR/src/ldom/setup_ldom SLP time xyz
• These two files (among others) are created:
  – etc/time.data, contains a set of utterances covering the possible word and phrase choices in the domain
  – festvox/SLP_time_xyz.scm, defines several functions (in Scheme) to convert a time like 07:57 into an utterance like The time is now, a little after five to eight, in the morning.
• For Part A of the homework you do not need to edit these files, but you will in Parts B and C.

Part A – build prompts and utterances
• Generate prompts from utterance list (outputs to prompt-wav/ subdir)
  – festival -b festvox/build_ldom.scm '(build_prompts "etc/time.data")'
• Record from prompts. The system will prompt you to say one utterance at a time.
  (see recording tips)
  – bin/prompt_them etc/time.data
  – Note: Recorded files go to wav/ subdirectory
• Create label files aligning utterance text to recorded utterances (outputs to lab/ subdir)
  – bin/make_labs prompt-wav/*.wav
• Build utterance structure
  – festival -b festvox/build_ldom.scm '(build_utts "etc/time.data")'
• Copy utterance list text file (used in next processing steps)
  – cp etc/time.data etc/txt.done.data

Part A – build and run speech synthesizer
• Extract pitchmarks and fix them (outputs to pm/ subdir)
  – bin/make_pm_wave wav/*.wav
  – bin/make_pm_fix pm/*.pm
• Power normalization (outputs to wavn/ subdir)
  – bin/simple_powernormalize wav/*.wav
• MCEP vectors (outputs to mcep/ subdir)
  – bin/make_mcep wav/*.wav
• Build synthesizer
  – festival -b festvox/build_ldom.scm '(build_clunits "etc/time.data")'
  – Note: it can only produce words that you have specifically given it in your utterance list.
• Run it
  – festival festvox/SLP_time_xyz_ldom.scm '(voice_SLP_time_xyz_ldom)'
  – (saytime)
  – (saythistime "07:57")
  – (saythistime "14:22")

Part A – final output
• Generate three wav files using your TTS as follows:
  – cd /proj/speech/users/cs4706/USERNAME/time
  – festival
  – (load "festvox/SLP_time_xyz_ldom.scm") (voice_SLP_time_xyz_ldom)
  – (Parameter.set 'Audio_Method 'Audio_Command) (Parameter.set 'Audio_Required_Rate 16000) (Parameter.set 'Audio_Required_Format 'wav)
  – (Parameter.set 'Audio_Command "cp $FILE time1.wav") (saytime)
  – (Parameter.set 'Audio_Command "cp $FILE time2.wav") (saythistime "07:57")
  – (Parameter.set 'Audio_Command "cp $FILE time3.wav") (saythistime "14:22")

Part B: Designing a limited domain TTS system
• Part B and C should be done in your groups
• Base your TTS on a possible set of output utterances from your dialog system design.
• In your actual dialog system, you will probably need to expand on this set.
• Include at least five degrees of freedom

Example: Time
• In the talking clock the input is a string of the form
  – HH:MM
• And the output is a sentence of the form:
  – The time is now, EXACTNESS MINUTE INFO(, in the DAYPART), where:
    • EXACTNESS = {exactly, just after, a little after, almost}
    • MINUTE = {-, five past, ten past, quarter past, twenty past, twenty-five past, half past, twenty-five to, twenty to, quarter to, ten to, five to}
    • INFO = {one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, midnight}
    • DAYPART = {morning, afternoon, evening}
• For example
  – 07:57 => The time is now, a little after five to eight, in the morning
• Four degrees of
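
The talking-clock mapping from HH:MM to utterance text can be sketched in bash as a rough illustration of how the four slots are filled. This is not the Scheme code in festvox/SLP_time_xyz.scm: the function name time_to_utterance is hypothetical, the word lists are copied from the slide, and the rounding thresholds, daypart boundaries, and midnight handling are assumptions chosen so that the 07:57 example comes out as shown above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Festival): map HH:MM to the utterance
# text of the talking-clock example. Word lists come from the slide; the
# rounding thresholds, daypart boundaries, and midnight handling are
# assumptions chosen to reproduce the 07:57 example.

time_to_utterance() {
  local hh=${1%%:*} mm=${1##*:}
  # Force base 10 so values such as "08" and "09" are not read as octal.
  local hour=$((10#$hh)) minute=$((10#$mm))
  local rem=$((minute % 5))

  # EXACTNESS: offset from the previous five-minute mark (assumed thresholds).
  local exactness
  case $rem in
    0) exactness="exactly" ;;
    1) exactness="just after" ;;
    2) exactness="a little after" ;;
    *) exactness="almost" ;;
  esac

  # MINUTE: round to the nearest five-minute slot (0..12).
  local slot=$(( (minute + 2) / 5 ))
  # "to" phrases and a rounded-up full hour name the next hour.
  if (( slot >= 7 )); then hour=$(( (hour + 1) % 24 )); fi
  if (( slot == 12 )); then slot=0; fi

  local -a minute_words=("" "five past" "ten past" "quarter past"
    "twenty past" "twenty-five past" "half past" "twenty-five to"
    "twenty to" "quarter to" "ten to" "five to")
  local -a hour_words=(twelve one two three four five six
    seven eight nine ten eleven)

  # INFO and DAYPART: hour 0 is spoken as "midnight" with no daypart (assumed).
  local hour_word=${hour_words[$((hour % 12))]}
  local daypart=""
  if (( hour == 0 )); then
    hour_word="midnight"
  elif (( hour < 12 )); then
    daypart="morning"
  elif (( hour < 18 )); then
    daypart="afternoon"
  else
    daypart="evening"
  fi

  local phrase=$exactness
  if [[ -n ${minute_words[$slot]} ]]; then phrase+=" ${minute_words[$slot]}"; fi
  phrase+=" $hour_word"
  if [[ -n $daypart ]]; then phrase+=", in the $daypart"; fi
  printf 'The time is now, %s\n' "$phrase"
}

# Prints: The time is now, a little after five to eight, in the morning
time_to_utterance "07:57"
```

Every word the function can emit comes from one of the four fixed lists, which is the property a limited domain voice depends on: the synthesizer can only say words that appear in the recorded utterance list.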

