Rap Lyric Generator
Hieu Nguyen, Brian Sa
June 4, 2009

1 Research Question

Writer’s block can be a real pain for lyricists when composing their song lyrics. Some say it’s because it is pretty hard to come up with lyrics that are clever but also flow with the rest of the song. We wanted to tackle this problem by using our own song lyric generator that utilizes some Natural Language Generation techniques. In the general case, our lyric generator takes a corpus of song lyrics and outputs a song based on the words from the corpus. It also has the ability to produce lines that emulate song structure (rhyming and syllables) and lines that are tied to a specific theme. Using the ideas produced by our song lyric generator, we hope to provide lyricists with some inspiration for producing an awesome song.

We chose to use only rap lyrics for our lyric corpus because we thought the language used in rap lyrics was very specific to its domain, and thus interesting to read. Also, the lyrics often have a similar structure (similar word length per line and similar rhyming schemes). Our lyric generator can be applied to any other type of lyric, such as rock or pop, or even to poems that have some structure and rhyming.

2 Related Work

Natural Language Generation is a rapidly evolving field of natural language processing. It can be used in fun hobby projects such as chat-bots and lyric generators, or it can have applications that would aid a larger range of people. There has been work in automatically generating easy-to-read summaries of financial, medical, or any other sort of data. An interesting application was the STOP Project, created by Reiter et al. Based on some input data about smoking history, the system produces a brochure that tries to get the user to quit smoking, fine-tuned to the user’s input data. The process is divided into three steps: planning (producing content), microplanning (adding punctuation and whitespace), and realization (producing the brochure).
The system did produce readable and quite persuasive output. But results showed that the tailored brochures were no more effective than the default non-tailored brochures.

Work in Natural Language Generation revolves around creating systems that produce text that makes sense in content, grammar, lexical choice, and overall flow. The systems also need to produce output that is non-repetitive, so they need to do things like combine short sentences with the same subject. In general, Natural Language Generation systems need to trick readers into thinking that the generated text was actually written by a human.

CS224N Spring 2009, Final Project. Hieu Nguyen, Brian Sa

-=talking=-
Lets get it on every time
Holler out "Your mine"
[Chorus] [10sion not singing]
And I say "a yia yia yia" -=singing=-
Let’s get it on every time
Holler out "Your mine"
And I say "Oh oo oh oo oh oh oh oh oh" -=singing=-
So if you willin’ you wit it then we can spend time
And I say "a yia yia yia" -=singing=-

Figure 1: Excerpt from 10sion’s “Let’s Get It On”

Chorus:
Everybody light your Vega,
everybody light your Vega,
everybody smoke, woo hoooo (2x)

Chorus
Now first let’s call for the motherfuckin indo
Pull out your crutch and put away your pistol
<Rest of verse>

Figure 2: Excerpt from 11/5’s “Garcia Vegas”

3 Implementation

3.1 Data

3.1.1 Rap Lyrics

We crawled a hip-hop lyrics site (www.ohhla.com) and pulled in about 40,000 lyrics from artists ranging from 2pac to Zion I, putting them into a MySQL database. We then preprocessed a subset of those lyrics by removing the header, removing unnecessary punctuation and whitespace, and lowercasing all the alphabet characters. Finally, we split the content of the lyrics into chorus and verse flatfiles. This was actually not a trivial task. The lyrics from the site were in various formats and used different headers, so it was difficult to tell where chorus sections began and ended.

As seen in Figures 1 and 2, the two lyrics use different formatting for Chorus headers.
Also, as in “Garcia Vegas”, it was hard to tell whether a section actually corresponded to the chorus, or if the word Chorus was just used to indicate a repeat of the chorus. This occurred in several other songs. We solved this by using a state machine as we were parsing the lyrics line-by-line to keep track of which section we were in. We had to manually create the transition rules for the state machine. For example, if we saw Chorus then a blank line, we would assume that the next section is actually the verse.

Each flatfile contains a single lyrical line (which we will define as a “sentence”) per line in the file. Our language model is trained on this data.

3.1.2 Rhyming Words

We used a rhyming database (rhyme.sourceforge.net) to produce words that rhymed with a given input word. The rhymer’s default usage is through the command line, and although this produced results, we eventually decided to create flatfiles of all word → rhyme possibilities for all the words in our chorus and our verse corpora. These files also included the syllable count of the words. When our lyric generator is loaded, it loads all of the rhyme flatfiles into memory.

3.2 Language Model

Our rap generator uses two language models: one that produces the chorus, and one that produces the verse. They are essentially the same model, except trained on different corpora.

We originally started out with a linear-interpolated Trigram Model that weights the scores of absolute-discounted unigram, bigram, and trigram models according to hand-set weights. Although this produced decent results, there was a general lack of flow in the sentences because our model only looked at a 2-word history to produce the next word.
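
The interpolation described above can be sketched roughly as follows. This is a minimal illustration, not the code used in the project: the class names, the discount of 0.75, the weights (0.2, 0.3, 0.5), and the choice to spread the discounted probability mass uniformly over the vocabulary are all assumptions made for the example.

```python
from collections import Counter, defaultdict


class DiscountedNgram:
    """Order-n model with absolute discounting (hypothetical helper).

    Assumption: the mass removed by discounting is spread uniformly
    over the vocabulary, which keeps each distribution normalized.
    """

    def __init__(self, order, sentences, vocab, discount=0.75):
        self.order = order
        self.vocab = vocab
        self.discount = discount
        self.counts = defaultdict(Counter)  # history tuple -> next-word counts
        for sent in sentences:
            padded = ["<s>"] * (order - 1) + sent + ["</s>"]
            for i in range(order - 1, len(padded)):
                history = tuple(padded[i - order + 1:i])
                self.counts[history][padded[i]] += 1

    def prob(self, word, history):
        # Keep only the last (order - 1) words of the history.
        key = tuple(history[-(self.order - 1):]) if self.order > 1 else ()
        followers = self.counts.get(key)
        if not followers:
            return 1.0 / len(self.vocab)  # unseen history: uniform
        total = sum(followers.values())
        reserved = self.discount * len(followers) / total
        discounted = max(followers[word] - self.discount, 0.0) / total
        return discounted + reserved / len(self.vocab)


class InterpolatedModel:
    """Weighted sum of unigram .. n-gram models with hand-set weights."""

    def __init__(self, sentences, weights=(0.2, 0.3, 0.5), discount=0.75):
        assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
        vocab = {w for s in sentences for w in s} | {"</s>"}
        self.weights = weights
        self.max_order = len(weights)
        self.vocab = sorted(vocab)
        self.models = [
            DiscountedNgram(n, sentences, vocab, discount)
            for n in range(1, self.max_order + 1)
        ]

    def prob(self, word, history):
        # Pad so that short histories still line up with the <s> markers.
        padded = ["<s>"] * (self.max_order - 1) + list(history)
        return sum(
            w * m.prob(word, padded)
            for w, m in zip(self.weights, self.models)
        )


# Toy corpus, one "sentence" (lyrical line) per entry.
corpus = [line.split() for line in [
    "you know you gotta love it",
    "you know i love new york",
    "i love it",
]]
model = InterpolatedModel(corpus)
print(model.prob("love", ["you", "gotta"]))
```

In this sketch, passing three weights gives a trigram model with a 2-word history, and passing four weights would extend the same code to a 3-word history.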
Here is an example line from our Trigram model:

    comfort pigeons feeble need me i don’t park there’s a knot

We then created a linear-interpolated Quadgram Model that weights the scores of absolute-discounted unigram, bigram, trigram, and quadgram models according to hand-set weights. This produced much better results, like this example:

    what you know you gotta love it new york city

3.3 Sentence Generation

For each section in the song (chorus or verse) we generate a set number of lines using the corresponding language model. For each line we generate, we actually generate a certain number of candidate lines (K, default = 30) from the model, and rank them according to a score. Then we pick the sentence with the best score,
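
The generate-and-rank step of Section 3.3 can be sketched as below. The text available here is cut off before the score is defined, so the scoring function is left as a parameter. The names pick_best_line, toy_generate, and toy_score are hypothetical, the toy score (prefer lines near eight words) is only a stand-in, and treating the highest score as the best is an assumption.

```python
import random


def pick_best_line(generate_line, score, k=30, rng=None):
    """Draw k candidate lines and return the highest-scoring one.

    generate_line(rng) -> list of words; score(line) -> number.
    Both are supplied by the caller; neither is the authors' function.
    """
    rng = rng or random.Random()
    candidates = [generate_line(rng) for _ in range(k)]
    return max(candidates, key=score)


def toy_generate(rng):
    # Placeholder generator: random words, NOT a language model.
    words = ["what", "you", "know", "love", "it", "new", "york", "city"]
    return [rng.choice(words) for _ in range(rng.randint(3, 12))]


def toy_score(line):
    # Placeholder score: prefer lines close to eight words long.
    return -abs(len(line) - 8)


best = pick_best_line(toy_generate, toy_score, k=30, rng=random.Random(0))
print(" ".join(best))
```

Keeping the generator and the score as separate arguments means the same routine could serve both the chorus and the verse models, with K controlling how many candidates are drawn per line.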

