Paraphrase Detection Using Recursive Autoencoders CS224n Eric Huang Richard Socher, Jeffrey Pennington, Professor Andrew Ng

Paraphrase Detection Using Recursive Autoencoders

CS224n

Eric Huang

Richard Socher, Jeffrey Pennington, Professor Andrew Ng

Paraphrase Detection

• Microsoft Research Paraphrase Corpus

• Sentence 1: Amrozi accused his brother, whom he called "the witness", of deliberately distorting his evidence.

• Sentence 2: Referring to him as only "the witness", Amrozi accused his brother of deliberately distorting his evidence.

• Class: 1 (true paraphrase)

Autoencoder

Recursive Autoencoder

Unsupervised Training

• 152,487 sentences from English Gigaword dataset

• Minimize the sum of reconstruction errors at all nodes
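
The training objective above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: it assumes a binary tree given as nested 2-tuples of leaf vectors, a tanh encoder, a linear decoder, and squared-error reconstruction. The names `encode`, `decode`, and `total_reconstruction_error` are hypothetical.

```python
import math

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def encode(We, be, c1, c2):
    """Parent vector p = tanh(We [c1; c2] + be) (assumed encoder form)."""
    z = matvec(We, c1 + c2)
    return [math.tanh(zi + bi) for zi, bi in zip(z, be)]

def decode(Wd, bd, p):
    """Reconstruct the concatenated children from the parent (assumed linear decoder)."""
    z = matvec(Wd, p)
    return [zi + bi for zi, bi in zip(z, bd)]

def total_reconstruction_error(tree, params):
    """Return (node vector, summed squared reconstruction error over all non-leaf nodes)."""
    if not isinstance(tree, tuple):
        return tree, 0.0  # a leaf is a word vector and contributes no error
    We, be, Wd, bd = params
    c1, e1 = total_reconstruction_error(tree[0], params)
    c2, e2 = total_reconstruction_error(tree[1], params)
    p = encode(We, be, c1, c2)
    recon = decode(Wd, bd, p)
    err = sum((r - c) ** 2 for r, c in zip(recon, c1 + c2))
    return p, e1 + e2 + err
```

Unsupervised training would then adjust `We`, `be`, `Wd`, and `bd` to reduce this total over all training sentences; the optimizer is left out of the sketch.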

Nearest Neighbors

• the U.S.
  • a U.S., the second biggest U.S., the most experienced U.S.

• executive director
  • council director, general director, assistant director

Aggregate Features

• 10 Settings
  • Top node
  • Avg/Min/Max of:
    • Leaf nodes
    • Non-leaf nodes
    • All nodes
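
One way to read the ten settings (the top node, plus average, minimum, and maximum over each of three node groups) is as element-wise aggregates of a sentence's node vectors. The sketch below follows that reading, which is an assumption; the slide does not say exactly what is aggregated. The function name `aggregate_settings` and the convention that the last non-leaf vector is the top node are hypothetical.

```python
def mean(xs):
    """Arithmetic mean of a non-empty sequence."""
    return sum(xs) / len(xs)

def elementwise(fn, vectors):
    """Apply fn to each dimension across a list of equal-length vectors."""
    return [fn(column) for column in zip(*vectors)]

def aggregate_settings(leaf_vecs, nonleaf_vecs):
    """Return a dict holding the 10 aggregate feature vectors for one sentence."""
    groups = {
        "leaf": leaf_vecs,
        "nonleaf": nonleaf_vecs,
        "all": leaf_vecs + nonleaf_vecs,
    }
    # Assumption: the root (top node) is stored last among the non-leaf vectors.
    feats = {"top": list(nonleaf_vecs[-1])}
    for group_name, vecs in groups.items():
        for agg_name, fn in (("avg", mean), ("min", min), ("max", max)):
            feats[f"{agg_name}_{group_name}"] = elementwise(fn, vecs)
    return feats
```

Features from two sentences could then be compared or concatenated before classification; that step is outside this sketch.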

Similarity Matrix

         The     dog     sits
The      1       0.001   0.001
puppy    0.001   0.9     0.001
stays    0.001   0.001   0.5

Similarity Matrix

                  The     dog     sits    The dog   The dog sits
The               1       0.001   0.001   0.05      0.05
puppy             0.001   0.9     0.001   0.8       0.4
stays             0.001   0.001   0.5     0.001     0.4
The puppy         0.05    0.8     0.001   0.9       0.5
The puppy stays   0.05    0.4     0.4     0.5       0.8
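
A matrix like the ones above can be built by comparing every node vector (word or phrase) of one sentence with every node vector of the other. The sketch below uses cosine similarity as a stand-in; the slides do not state which similarity measure produced the numbers shown, so the measure and the function names are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def similarity_matrix(row_nodes, col_nodes):
    """Entry [i][j] compares node i of one sentence with node j of the other."""
    return [[cosine(u, v) for v in col_nodes] for u in row_nodes]
```

With the word vectors listed first and the phrase vectors after them, the top-left block holds the word-to-word similarities and the remaining entries cover the word-to-phrase and phrase-to-phrase comparisons, matching the layout of the larger matrix.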

Results