
Straight to the Tree: Constituency Parsing with Neural Syntactic Distance

Yikang Shen*, Zhouhan Lin*, Athul Paul Jacob, Alessandro Sordoni, Aaron Courville, Yoshua Bengio

University of Montreal, Microsoft Research, University of Waterloo

Overview
- Motivation
- Syntactic Distance based Parsing Framework
- Model
- Experimental Results

ICLR 2018: Neural Language Modeling by Jointly Learning Syntax and Lexicon

[Figure: PRPN architecture (syntactic distance, structured self-attention, LSTM), used as a language model (61 ppl) and an unsupervised constituency parser (68 UF1)]

Supervised Constituency Parsing with Syntactic Distance?

[Shen et al., 2018]

Chart-based Neural Parsers

1. High computational cost: the complexity of CYK decoding is O(n^3).

2. Complicated loss function.

Transition-based Neural Parsers

1. Greedy decoding: incomplete trees (the shift and reduce steps may not match).

2. Exposure bias: the model is never exposed to its own mistakes during training.

[Stern et al., 2017; Cross and Huang, 2016]

Overview
- Motivation
- Syntactic Distance based Parsing Framework
- Model
- Experimental Results

Intuitions

Only the order of splits (or combinations) matters for reconstructing the tree.

Can we model the order directly?

Syntactic distance

For each split point, its syntactic distance should follow the same order as the height of the corresponding node in the tree.

[Figure: example tree with non-terminals N1 and N2 above split points S1 and S2; the split under the higher node receives the larger distance]

Convert to binary tree

[Stern et al., 2017]

Tree to Distance

The height of each non-terminal node is the maximum height of its children, plus 1.
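A minimal sketch of this conversion, assuming a binarized tree; the `Node` class and function names are illustrative helpers, not the authors' code:

```python
# Sketch of the tree-to-distance conversion: the distance at a split point
# equals the height of the node joining the two subtrees.
class Node:
    def __init__(self, left=None, right=None, word=None):
        self.left, self.right, self.word = left, right, word

def tree_to_distance(node):
    """Return (distances, height) for a binarized tree."""
    if node.word is not None:                  # leaf node
        return [], 0
    d_left, h_left = tree_to_distance(node.left)
    d_right, h_right = tree_to_distance(node.right)
    height = max(h_left, h_right) + 1          # the height rule above
    return d_left + [height] + d_right, height
```

For a sentence of n words this returns n - 1 distances, one per adjacent word pair, together with the tree height.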

[Figure: worked example converting a labeled tree to distances, with constituent labels S, VP, NP, the merged unary chain S-VP, and ∅ marking spans without a label]

Distance to Tree

The split point for each bracket is the one with the maximum distance.
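A sketch of the inverse direction, recursing on the argmax split (function name is illustrative):

```python
def distance_to_tree(words, distances):
    """Rebuild the tree top-down: the split point of each span is the
    position with the maximum syntactic distance."""
    if len(words) == 1:
        return words[0]
    i = max(range(len(distances)), key=lambda k: distances[k])  # argmax split
    left = distance_to_tree(words[:i + 1], distances[:i])
    right = distance_to_tree(words[i + 1:], distances[i + 1:])
    return (left, right)
```

For example, distance_to_tree(["She", "loves", "dogs"], [2, 1]) returns ("She", ("loves", "dogs")): the larger distance at the first split makes it the top split.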


Overview
- Motivation
- Syntactic Distance based Parsing Framework
- Model
- Experimental Results

Framework for inferring the distances and labels:
- Distances
- Labels for leaf nodes
- Labels for non-leaf nodes

Inferring the distances
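A minimal sketch of one way to realize this step, in line with the conclusion slide's "stack of standard recurrent and convolutional layers"; all sizes and the exact layer arrangement here are assumptions, not the paper's configuration:

```python
import torch
import torch.nn as nn

class DistanceEstimator(nn.Module):
    """Embed words, contextualize with a BiLSTM, then score each of the
    n-1 split points with a width-2 convolution and a linear output."""
    def __init__(self, vocab_size, emb_dim=100, hidden=200):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                            bidirectional=True)
        # width-2 convolution: one feature vector per adjacent word pair,
        # i.e., one per split point
        self.conv = nn.Conv1d(2 * hidden, hidden, kernel_size=2)
        self.score = nn.Linear(hidden, 1)   # scalar distance per split point

    def forward(self, word_ids):                          # (batch, n)
        h, _ = self.lstm(self.emb(word_ids))              # (batch, n, 2*hidden)
        f = torch.relu(self.conv(h.transpose(1, 2)))      # (batch, hidden, n-1)
        return self.score(f.transpose(1, 2)).squeeze(-1)  # (batch, n-1)
```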

Pairwise learning-to-rank loss for distances

A variant of hinge loss:

$\mathcal{L}^{rank} = \sum_{i,\, j>i} \big[\, 1 - \operatorname{sign}(d_i - d_j)\,(\hat{d}_i - \hat{d}_j) \,\big]_+$

[Figure: the loss plotted against $\hat{d}_i - \hat{d}_j$, hinged at $+1$ when $d_i > d_j$ and at $-1$ when $d_i < d_j$]
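A sketch of this loss in PyTorch; the function name is mine, and it assumes 1-D tensors of predicted and gold distances for a single sentence:

```python
import torch

def rank_loss(d_hat, d_gold):
    """Pairwise hinge-style rank loss: penalize every pair (i, j > i) whose
    predicted distance order disagrees with the gold order."""
    diff = d_hat.unsqueeze(1) - d_hat.unsqueeze(0)          # d_hat[i] - d_hat[j]
    sign = torch.sign(d_gold.unsqueeze(1) - d_gold.unsqueeze(0))
    hinge = torch.relu(1.0 - sign * diff)
    pairs = torch.triu(torch.ones_like(hinge), diagonal=1)  # keep only j > i
    return (hinge * pairs).sum()
```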


Framework for inferring the distances and labels:
- Labels for leaf nodes
- Labels for non-leaf nodes

Inferring the Labels

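Roughly, label prediction can be framed as softmax classification at each word (leaf labels, including merged unary chains like S-VP) and at each split point (non-leaf labels), sharing the features computed for distance estimation. A minimal sketch with assumed feature shapes and sizes:

```python
import torch.nn as nn

class LabelClassifiers(nn.Module):
    """Sketch: two label classifiers over features from the distance
    estimator (all sizes are assumptions)."""
    def __init__(self, hidden=200, n_labels=100):
        super().__init__()
        self.leaf = nn.Linear(2 * hidden, n_labels)      # label above each word
        self.internal = nn.Linear(hidden, n_labels)      # label at each split

    def forward(self, word_feats, split_feats):
        # word_feats:  (batch, n, 2*hidden)   BiLSTM states per word
        # split_feats: (batch, n-1, hidden)   conv features per split point
        return self.leaf(word_feats), self.internal(split_feats)
```

Both heads would be trained with standard cross-entropy against the gold labels (with ∅ as an ordinary class), alongside the rank loss on distances.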

Putting it together

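At test time, the model predicts distances and labels, then runs the same argmax-split recursion while attaching labels. A sketch reusing the helpers above; span_labels is assumed to hold one predicted label per split point (leaf labels omitted for brevity):

```python
def decode(words, distances, span_labels):
    """Full decode sketch: recurse on the argmax-distance split and attach
    the predicted constituent label at each internal node."""
    if len(words) == 1:
        return words[0]
    i = max(range(len(distances)), key=lambda k: distances[k])
    left = decode(words[:i + 1], distances[:i], span_labels[:i])
    right = decode(words[i + 1:], distances[i + 1:], span_labels[i + 1:])
    return (span_labels[i], left, right)
```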

Overview
- Motivation
- Syntactic Distance based Parsing Framework
- Model
- Experimental Results

Experiments: Penn Treebank

Experiments: Chinese Treebank

Experiments: Detailed statistics in PTB and CTB

Experiments: Ablation Test

Experiments: Parsing Speed

Conclusions and Highlights

- A novel constituency parsing scheme: predicting tree structure from a set of real-valued scalars (syntactic distances).

- Completely free from compounding errors.
- Strong performance compared to previous models.
- Significantly more efficient than previous models.
- Easy deployment: the model is no more than a stack of standard recurrent and convolutional layers.

One more thing... why does it work now?

- Rank losses have been well studied in the learning-to-rank literature since 2005 (Burges et al., 2005).

- Models good at learning these syntactic distances were not widely available until the rediscovery of the LSTM in 2013 (Graves, 2013).

- Efficient regularization methods for LSTMs did not mature until 2017 (Merity et al., 2017).

Thank you!

Questions?

Yikang Shen, Zhouhan Lin
MILA, Université de Montréal
{yikang.shn, lin.zhouhan}@gmail.com

Paper:
Code:
