Paraphrase Detection Using Recursive Autoencoders
CS224n
Eric Huang
Richard Socher, Jeffrey Pennington, Professor Andrew Ng

Paraphrase Detection
• Microsoft Research Paraphrase Corpus
• Sentence 1: Amrozi accused his brother, whom he called "the witness", of deliberately distorting his evidence.
• Sentence 2: Referring to him as only "the witness", Amrozi accused his brother of deliberately distorting his evidence.
• Class: 1 (true paraphrase)

Autoencoder

Recursive Autoencoder

Unsupervised Training
• 152,487 sentences from English Gigaword dataset
• Minimize the sum of reconstruction errors at all nodes

Nearest Neighbors
• the U.S.
  • a U.S., the second biggest U.S., the most experienced U.S.
• executive director
  • council director, general director, assistant director

Aggregate Features
• 10 Settings
• Top node
• Avg/Min/Max of:
  • Leaf nodes
  • Non-Leaf nodes
  • All nodes

Similarity Matrix

         The     dog     sits
The      1       0.001   0.001
puppy    0.001   0.9     0.001
stays    0.001   0.001   0.5

Similarity Matrix

                   The     dog     sits    The dog   The dog sits
The                1       0.001   0.001   0.05      0.05
puppy              0.001   0.9     0.001   0.8       0.4
stays              0.001   0.001   0.5     0.001     0.4
The puppy          0.05    0.8     0.001   0.9       0.5
The puppy stays    0.05    0.4     0.4     0.5
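
The Similarity Matrix slides compare every node of one sentence against every node of the other. The sketch below shows one way such a matrix could be filled in. It rests on several assumptions that are not in the slides: cosine similarity is used as the measure (the deck does not say which measure produced its numbers), the 3-dimensional vectors are invented for illustration, and the helper names `cosine_similarity` and `similarity_matrix` are hypothetical. In the deck, the vectors would presumably come from the recursive autoencoder, with phrase nodes such as "The dog" added as extra rows and columns.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_u * norm_v)


def similarity_matrix(row_nodes, col_nodes):
    """Build a matrix with one row per node in row_nodes and one column per
    node in col_nodes. Each node is a (label, vector) pair."""
    return [
        [cosine_similarity(row_vec, col_vec) for _, col_vec in col_nodes]
        for _, row_vec in row_nodes
    ]


# Invented toy vectors; real ones would be learned node representations.
sentence_1 = [
    ("The", [1.0, 0.0, 0.0]),
    ("dog", [0.0, 1.0, 0.0]),
    ("sits", [0.0, 0.0, 1.0]),
]
sentence_2 = [
    ("The", [1.0, 0.0, 0.0]),
    ("puppy", [0.0, 0.9, 0.1]),
    ("stays", [0.1, 0.0, 0.8]),
]

# Rows follow sentence 2 and columns follow sentence 1, as on the slide.
matrix = similarity_matrix(sentence_2, sentence_1)

print(" " * 6 + " ".join(f"{label:>6}" for label, _ in sentence_1))
for (label, _), row in zip(sentence_2, matrix):
    print(f"{label:>6}" + " ".join(f"{value:6.3f}" for value in row))
```

With these toy vectors, the largest value in each row falls on the diagonal, mirroring the slide's pattern where "puppy" aligns with "dog" and "stays" with "sits". The exact numbers differ from the slide because both the vectors and the measure are stand-ins.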