Music Database with Song Identification Using Perceptual Audio Hashing

Submitted To
Hamood-ur Rehman
Dr. Brian L. Evans
Dr. Bruce Pennycook

Prepared By
Eric Heinen

EE464H Senior Design Project
Electrical and Computer Engineering Department
University of Texas at Austin
Spring 2006

CONTENTS

LIST OF FIGURES
LIST OF TABLES
EXECUTIVE SUMMARY
1.0 INTRODUCTION
2.0 DESIGN PROBLEM STATEMENT
    2.1 MOTIVATION
    2.2 BACKGROUND
    2.3 REQUIREMENTS AND GOALS
    2.4 DESIGN PARAMETERS
    2.5 CONSTRAINTS
3.0 DESIGN PROBLEM SOLUTION
    3.1 FRAMING
    3.2 FEATURE EXTRACTION
    3.3 CLUSTERING
4.0 DESIGN IMPLEMENTATION
5.0 TEST AND EVALUATION
    5.1 BEAT DETECTOR CUTOFF FREQUENCY
    5.2 NUMBER OF FRAMES PER BEAT
    5.3 SONG DUPLICATION THRESHOLD
6.0 TIME AND COST CONSIDERATIONS
7.0 SAFETY AND ETHICAL ASPECTS OF DESIGN
8.0 CONCLUSIONS AND RECOMMENDATIONS
REFERENCES

LIST OF FIGURES

1 Three-Stage Perceptual Audio Hashing Framework
2 MIT Media Lab Beat Detection Algorithm
3 Example Percussion Spectrogram
4 YIN Pitch Detection Results
5 Auditory Nerve Image (ANI) Pitch Detection Results
6 Cross-Correlation Pitch Detection Results
7 Graphical User Interface (GUI) Implementation

LIST OF TABLES

1 Initial Beat Detection Results
2 Improved Beat Detection Results
3 Beat Detector Cutoff Frequency Results

EXECUTIVE SUMMARY

Digital audio files require relatively large amounts of memory, so a music database's interface should catch potential song duplications. Without relying on artist and song information in filenames or metadata, this problem becomes a matter of signal processing.

I therefore employed the concept of perceptual audio hashing. The idea is to generate a song identifier, upon file addition, that describes the song's perceptual content in a reduced yet thorough way. Catching song duplication is then simply a matter of comparing these identifiers. Furthermore, with the right type of search algorithm, database users might be able to find perceptually similar songs.

Perceptual audio hashing generally follows a three-stage framework, but I focused exclusively on the first two stages: framing and feature extraction. For the framing stage, I decided to use beat detection, such that there is a constant number of frames per beat. I hoped that this would minimize the complexity of comparing renditions of the same song performed at different tempos. I tested different algorithms, picked the best one, and improved it by adjusting the limits of the frequency bands it analyzed within the source audio.

For the feature extraction stage, I implemented only pitch detection. I tested three algorithms, and the one I decided to use was of my own design. I used the output of this stage, which I called a note vector, as the database identifier. I then developed an algorithm for comparing these identifiers, and defined song duplication as the event that the match percentage exceeds a threshold.

After implementing all of the aforementioned code in MATLAB, I created a graphical user interface in LabVIEW to implement a music database.
With this interface, I then tested the overall solution and made some final design choices. In the end, the database recognized song duplications with high probability, and also recognized my rendition of a song as being very similar to the original. The design had some shortcomings, but the results in general were encouraging and I am interested in extending this project in the future.

1.0 INTRODUCTION

The purpose of this report is to discuss an engineering problem that I considered, and how I worked to design, implement, test, and refine a solution. In short, I designed a music database