SPHSC 503 – Speech Signal Processing
UW – Summer 2006

Homework 1 – due Wednesday 6/28 in class

Since this is the first homework in this course, you may wonder what to hand in. Your homework should include all requested plots, and it should answer all questions and assignments in the homework. For example, for problem 1.1 you don’t need to provide anything for part a, but you need to provide a plot for each of parts b and c, the duration of the signal in seconds for part d, and estimates at t=0.2 and t=0.4 of the distance in seconds between peaks (part e, 2 estimates) and of the fundamental frequency (part f, 2 estimates). You’re free to include any additional information, such as Matlab commands, supporting plots and comments, but it is not required.

You may either bring a hard copy of your homework to class on Wednesday or email me your homework. For the latter option, I recommend collecting all your plots and answers in a single word-processing document, such as a Microsoft Word file.

Problem 1.1 – Loading and analyzing a speech signal

Download the file ex1_3.wav from the class website, and save it in a folder on your computer (if you’re in SAV 137, the suggested location for the file is C:\Temp\SPHSC503\ex1_3.wav). The file contains the spoken word “zero” sampled at 10 kHz. Change the current directory in Matlab to the directory that contains the saved file.

a. Load the ex1_3.wav file into Matlab. You can either use Matlab’s Import Wizard, by double-clicking on the filename in Matlab’s current directory window, or use the wavread command (see help wavread for details; newer Matlab releases provide audioread instead).
b. Plot the speech signal against its index (n) by using the interactive tools or by using the plot command. Label the axes and the plot appropriately.
c. Plot the speech signal as a function of time, and label the axes and the plot appropriately.
Hint: you need to create a ‘time’ vector t, and then plot the signal with “plot(t,y)”. You can create the correct time vector by dividing the speech signal’s index vector by the sampling frequency of the signal, for example: t = n / fs;

d. What is the duration of the signal in seconds? It may be helpful to use the figure’s zoom, pan and data cursor tools.

Voiced speech, such as vowels, is characterized by a series of high-energy peaks in the speech signal. Those peaks are created by the repeated opening and closing of the vocal cords.

e. Make a rough estimate of the distance in seconds between the peaks in the “zero” speech signal around t=0.2 and t=0.4 seconds, corresponding to the two vowels. Again, it may be helpful to use the figure’s zoom, pan and data cursor tools.
f. Convert the measured distances in seconds from part e into estimates of the fundamental frequency (in Hz) of the speech signal around t=0.2 and t=0.4 seconds.

Problem 1.2 – Measuring fundamental frequency with correlation

In problem 1.1f, you manually found an estimate for the fundamental frequency of the speech signal. In this problem, we will use a technique called correlation to estimate the fundamental frequency automatically.

Correlation is a measure of the degree to which two sequences are similar. It is related to convolution, and its mathematical expression looks like the convolution sum. There are two kinds of correlation: auto-correlation (correlation between a signal and itself) and cross-correlation (correlation between two different signals). They are defined as follows:

Auto-correlation: $r_{xx}[l] = \sum_{n=-\infty}^{\infty} x[n]\, x[n-l]$
Cross-correlation: $r_{xy}[l] = \sum_{n=-\infty}^{\infty} x[n]\, y[n-l]$

In Matlab, both types of correlation can be computed using the xcorr function from the Signal Processing Toolbox.
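To make the definition concrete, the auto-correlation sum can also be evaluated directly, without any toolbox function. The sketch below does this for the short sequence x = [1 1 1]. It assumes that samples outside the sequence are zero, which turns the infinite sum into a finite one; the names N, k and m are illustrative and not part of the homework.

```matlab
% Direct evaluation of r_xx[l] = sum over n of x[n]*x[n-l]
% for a finite-length sequence (samples outside 1..N assumed zero).
x = [1 1 1];
lmax = 3;
N = length(x);
l = -lmax:lmax;               % lags to evaluate
rxx = zeros(size(l));
for k = 1:length(l)
    for n = 1:N
        m = n - l(k);         % index of the shifted sample x[n-l]
        if m >= 1 && m <= N   % skip terms that fall outside the sequence
            rxx(k) = rxx(k) + x(n) * x(m);
        end
    end
end
disp(rxx);                    % values for l = -3,...,3
```

The result is the triangle-shaped sequence 0 1 2 3 2 1 0, with its maximum at zero lag. For this input, xcorr(x,lmax) returns the same values up to rounding.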
For example,

   >> x = [1 1 1]; lmax = 3;
   >> [rxx,l] = xcorr(x,lmax);
   >> stem(l,rxx); % a triangle

computes and plots the auto-correlation of x, $r_{xx}[l]$, for l=-lmax,…,lmax. Similarly,

   >> x = [1 1 1]; y = [-1 0 1]; lmax = 3;
   >> [rxy,l] = xcorr(x,y,lmax);
   >> stem(l,rxy); % 2 up, 2 down

computes and plots the cross-correlation of x and y, $r_{xy}[l]$, for l=-lmax,…,lmax.

a. Clear the workspace (clear), load the speech signal from problem 1.1 again, and extract a voiced section of the speech signal using yvoiced1 = y(1900:2300); Plot the voiced section against its index, nvoiced1 = 1900:2300;
b. Compute and plot the autocorrelation of the voiced section for lmax=250. You can modify the example code above to do this. Label your x-axis as ‘Lag (in samples)’.

Notice the following in your plot of the autocorrelation: the autocorrelation has its highest peak at zero lag (l=0). This is a necessary property of all autocorrelations. The autocorrelation also has strong positive peaks at equal distances to the left and right of the zero lag point.

c. Determine the value of the lag for the next strongest peak to the left and right of zero lag. The zoom tools of the plot window may be helpful for this.
d. Divide the value of the lag you found in part c by the sampling frequency to get the value of the lag in seconds.
e. The value of the lag in seconds should roughly match the distance between the peaks you found in problem 1.1e for t=0.2. Is that the case for you? Convert the value of the lag in seconds to a frequency in Hz. This value should correspond to the fundamental frequency from problem 1.1f for t=0.2.
f. Repeat a-e for the second voiced section around t=0.4, nvoiced2 = 3800:4200;

Problem 1.3 – Measuring fundamental frequency with a pitch estimator

It is possible to fully automate the estimation of the fundamental frequency of a speech signal with the methods used in problems 1.1 and 1.2.
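As a rough illustration of such automation, the sketch below picks the strongest autocorrelation peak within a plausible range of pitch periods and converts its lag to a frequency. It is one possible approach under stated assumptions, not necessarily how any particular pitch estimator works. The input is a synthetic pulse train rather than the homework signal, and the 50 to 400 Hz search range and the names f0_true, lagMin, lagMax and f0_est are all assumptions made for this example.

```matlab
% Synthetic stand-in for voiced speech (an assumption, not the homework data):
% one unit pulse every 100 samples at fs = 10 kHz, i.e. a 100 Hz pulse rate.
fs = 10000;                      % sampling frequency in Hz
f0_true = 100;                   % hypothetical fundamental frequency in Hz
x = zeros(1, 400);               % 40 ms of signal
x(1:round(fs/f0_true):end) = 1;  % pulses at samples 1, 101, 201, 301

lmax = 250;
[rxx, l] = xcorr(x, lmax);       % autocorrelation for lags -lmax..lmax

% Search only positive lags inside an assumed pitch range of 50-400 Hz,
% which excludes the zero-lag peak.
lagMin = round(fs / 400);        % shortest period considered, in samples
lagMax = round(fs / 50);         % longest period considered, in samples
idx = find(l >= lagMin & l <= lagMax);
[~, k] = max(rxx(idx));          % strongest peak inside the search range
lag = l(idx(k));                 % its lag in samples

f0_est = fs / lag;               % lag in samples -> frequency in Hz
fprintf('Estimated F0: %.1f Hz\n', f0_est);
```

For this pulse train the strongest peak in the search range lies at a lag of 100 samples, so the estimate is 100 Hz. Real speech is less regular, so the search range and the section length usually need some care.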
In this problem, we will use a pitch estimator to estimate the pitch of the entire speech signal.

a. Download the file pitchestimate.m from the class website. This m-file contains the Matlab function pitchestimate, which can be used as follows:

   % y,fs are the input signal and sampling frequency
   >> [t,p]=pitchestimate(y,fs);
   % t contains the times at which the fundamental frequency was estimated
   % p contains the estimates of the fundamental frequency
   >> plot(t,p)

See help pitchestimate for details.
b. Clear the workspace, load the speech signal again, and estimate and plot its fundamental

