Artificial Intelligence

Rishabh Nigam

Shubhdeep Kochhar

Computational modelling of Grammar Acquisition

The Problem

● Computational framework for Grammar Acquisition

● Unsupervised Learning from a real corpus

● Why the problem

● The algorithm can learn complex syntax, generate novel grammatical sentences, and prove useful in other fields that call for structure discovery from raw data, such as bioinformatics.

The Algorithm

● ADIOS – Automatic Distillation Of Structure

● What it does:

The MEX criterion: it builds a 2D matrix M with entries M[i,j] = PR[i,j] or PL[i,j]. This matrix is then searched for steep decreases in PR[i,j] and PL[i,j], which indicate possible equivalence classes between them.
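The search for a steep decrease can be illustrated with a minimal sketch. It assumes PR[i,j] is estimated as the fraction of corpus occurrences of words i..j-1 of a path that are followed by word j; PL would be the mirror image, computed right to left. The function names and the threshold `eta` are hypothetical and are not taken from the ADIOS code.

```python
from collections import Counter

def ngram_counts(sentences, max_len):
    """Count every contiguous word sequence of up to max_len words."""
    counts = Counter()
    for words in sentences:
        for i in range(len(words)):
            for j in range(i + 1, min(i + max_len, len(words)) + 1):
                counts[tuple(words[i:j])] += 1
    return counts

def right_probabilities(path, counts, start=0):
    """Assumed PR[start, j]: share of occurrences of path[start:j]
    that continue with path[j], for j = start+1 .. len(path)-1."""
    probs = []
    for j in range(start + 1, len(path)):
        prefix = counts[tuple(path[start:j])]
        full = counts[tuple(path[start:j + 1])]
        probs.append(full / prefix if prefix else 0.0)
    return probs

def steep_drops(probs, eta=0.65):
    """Positions k where probs[k] / probs[k-1] falls below eta (hypothetical threshold)."""
    return [k for k in range(1, len(probs))
            if probs[k - 1] > 0 and probs[k] / probs[k - 1] < eta]

# Toy corpus: four sentences that share a prefix and differ in the last word.
corpus = [s.split() for s in (
    "where is the big square",
    "where is the big circle",
    "where is the big box",
    "where is the big door",
)]
counts = ngram_counts(corpus, max_len=5)
probs = right_probabilities(corpus[0], counts)
drops = steep_drops(probs)
```

In this toy corpus the probability stays at 1 along the shared prefix and drops at the final word, which is the position where the four sentences diverge.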

Codes Used

● MEX criterion

● Training scripts

● Generating scripts

--> Shimon Edelman and Zach Solan made this code available.

Work done so far

● Converted the CHILDES database and the Hindi database (WordNet) into a format readable by the ADIOS algorithm.

● Ran the algorithm on the CHILDES database and the Hindi database.

● Corresponded briefly with Shushobhan Nayak and ran the algorithm on his small commentary database.
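The conversion step in the first bullet can be sketched as follows. This is a rough illustration, not the script actually used: it assumes CHILDES transcripts in CHAT format (speaker lines start with `*`, headers with `@`, dependent tiers with `%`) and assumes the target format is one sentence per line wrapped in begin and end markers. The markers `*` and `#` and both function names are assumptions made for this example.

```python
def chat_to_sentences(lines):
    """Yield the word list of each utterance on a CHAT speaker line."""
    for line in lines:
        if not line.startswith("*"):
            continue  # skip @ headers and % dependent tiers
        _, _, utterance = line.partition(":")
        words = utterance.split()
        if words:
            yield words

def to_adios_line(words, begin="*", end="#"):
    """Wrap one sentence in the assumed begin and end markers."""
    return " ".join([begin, *words, end])

transcript = [
    "@Begin",
    "*MOT:\tthere you are .",
    "%mor:\tadv|there pro|you",
    "*CHI:\twho is that ?",
]
converted = [to_adios_line(words) for words in chat_to_sentences(transcript)]
```

Keeping the extraction and the wrapping in separate functions lets the same wrapper be reused for the Hindi corpus, where only the sentence extraction would differ.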

Running on the CHILDES database

● E6478 {we,you,youse}

● P6479 (I,think) 0.0068258047 1 2 201

● P6480 (E6481,you,are) 0 1 3 18

● E6481 {there,here}

● P6482 (who,E6466) 0.0039371848 1 2 36

● P6483 (P6439,P6434,Emily) 0 0.33333334 5.4000001 4

● P6484 (he,is,E6485) 0.0058915019 1 3 28

● E6485 {.,here}

● P6486 (are,we,P6402) 0.0043362379 1 4 10

● P6487 (wait,to,E6488,E6489) 6.1452389e-05 0.5 4 15

● E6488 {we,you}

● E6489 {hear,see}

For example, P6480 (E6481,you,are) with E6481 {there,here} --> "there you are" and "here you are", both sentences in the corpus used.
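The expansion in this example can be reproduced with a short sketch. It assumes a pattern (P) is an ordered sequence of symbols and an equivalence class (E) is a set of interchangeable alternatives, as the listing suggests; the dictionaries and the `expand` function are illustrative names, not part of the ADIOS code.

```python
from itertools import product

def expand(symbol, patterns, classes):
    """Return every word sequence that `symbol` can generate."""
    if symbol in classes:
        # Equivalence class: any one member may fill the slot.
        results = []
        for member in classes[symbol]:
            results.extend(expand(member, patterns, classes))
        return results
    if symbol in patterns:
        # Pattern: expand each element, then combine them in order.
        parts = [expand(s, patterns, classes) for s in patterns[symbol]]
        return [sum(combo, []) for combo in product(*parts)]
    return [[symbol]]  # plain word

patterns = {"P6480": ["E6481", "you", "are"]}
classes = {"E6481": ["there", "here"]}
sentences = [" ".join(words) for words in expand("P6480", patterns, classes)]
print(sentences)
```

Because patterns and classes can refer to each other, the same recursion also covers nested entries such as P6487 (wait,to,E6488,E6489).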

Running on the Hindi database

● ID seq p-value gen len occ

● P3487 (भी�,प्रचलि�त) 0.0042799711 1 2 5

● P3488 (के ,E3489,भीगों�) 0 1 3 11

● E3489 {वि�भिभीन्न,मु��यमु}

● P3490 (E3491,के�,भीषा,मु�) 1.9848347e-05 1 4 6

● E3491 {वि�ज्ञान,बो��-च�}

● P3492 (मु�,E3493,�प,केरन,से) 0.0037000179 1 5 4

● E3493 {मिमु�,घो��केर,प�सेकेर}

● P3494 (वि�षा,नष्ट,हो�त) 0.001850009 1 3 4

● P3495 (सेमुन,भीगों) 0.0059099197 1 2 26
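Listings like the ones above can be read into structured records with a small parser. This sketch assumes the layout visible in the slides: pattern lines (P) carry a parenthesised sequence followed by the four numeric columns named in the header (p-value, gen, len, occ), while equivalence-class lines (E) carry a brace-delimited member list and no numbers. The layout is inferred from the listings, not from ADIOS documentation, and `parse_line` is a hypothetical helper.

```python
import re

# ID, then "(...)" for a pattern or "{...}" for a class, then optional numeric columns.
LINE = re.compile(r"^(?P<id>[PE]\d+)\s+[({](?P<members>.*)[)}]\s*(?P<rest>.*)$")

def parse_line(line):
    """Parse one listing line; return None if it is not a P or E entry."""
    match = LINE.match(line.lstrip("● ").strip())
    if match is None:
        return None
    entry = {
        "id": match.group("id"),
        "members": [m.strip() for m in match.group("members").split(",")],
    }
    fields = match.group("rest").split()
    if len(fields) == 4:  # p-value, gen, len, occ
        entry["p_value"] = float(fields[0])
        entry["gen"] = float(fields[1])
        entry["len"] = float(fields[2])  # not always an integer in the listings
        entry["occ"] = int(fields[3])
    return entry

pattern = parse_line("● P6479 (I,think) 0.0068258047 1 2 201")
eq_class = parse_line("● E6485 {.,here}")
```

Parsed this way, the `occ` column can be used to sort patterns by how often they occur in the corpus.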

Running on the Commentary

● P447 (the,E448,square) 0 1 3 65

● E448 {large,big}

● P449 (big,square) 7.212162e-05 1 2 38

● P450 (the,little,E451) 0 1 3 49

● E451 {circle,square}

● P452 (the,big,box) 0 1 3 34

● E455 {opens,closes,enters}

● P456 (the,E457) 0.0055941939 1 2 91

● E457 {bottom,corner,door,entrance}

● P458 (P449,E459,the) 0 1 4 4

● E459 {leaves,closes,enters}

● E461 {--,inside,and,leaves,left,opens,closes,enters}

Precision And Recall

Precision - the proportion of C_learner sentences (those generated by the learner) accepted by the Teacher

Recall - the proportion of C_target sentences (those generated by the target grammar) accepted by the Learner

Values found: around 0.6 precision and 0.5 recall
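The two definitions can be written out as a minimal sketch. It assumes the teacher and the learner are each available as a function that accepts or rejects a sentence; the function names and the toy sentences are made up for illustration and are not the experimental data.

```python
def acceptance_rate(sentences, accepts):
    """Fraction of `sentences` for which `accepts(sentence)` is true."""
    sentences = list(sentences)
    if not sentences:
        return 0.0
    return sum(1 for s in sentences if accepts(s)) / len(sentences)

def precision_recall(c_learner, c_target, teacher_accepts, learner_accepts):
    """Precision: learner output accepted by the teacher.
    Recall: target sentences accepted by the learner."""
    precision = acceptance_rate(c_learner, teacher_accepts)
    recall = acceptance_rate(c_target, learner_accepts)
    return precision, recall

# Toy stand-ins: each "grammar" is simply a set of sentences it accepts.
teacher_ok = {"a b", "a c", "a a", "a d"}
learner_ok = {"a b", "a c", "b b", "c c", "a a"}
c_learner = ["a b", "a c", "b b", "c c", "a a"]  # generated by the learner
c_target = ["a b", "a c", "b d", "a d"]          # generated by the target grammar
precision, recall = precision_recall(
    c_learner, c_target, teacher_ok.__contains__, learner_ok.__contains__
)
```

In practice the teacher is whatever judges grammaticality (a reference grammar or human raters), so only the two `accepts` callables would change.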

References

[1] Heidi R. Waterfall, Ben Sandbank, Luca Onnis and Shimon Edelman, An empirical generative framework for computational modeling of language acquisition, Cambridge University Press, 2010

[2] Zach Solan, PhD thesis, supervised by Professor David Horn, Professor Shimon Edelman, and Professor Eytan Ruppin, Tel Aviv University

Thank You
