Synopsis of NLP project

  • 7/23/2019 Synopsis of NLP project

    Synopsis of

    NLP Based Grammar Checker

    For the award of degree of

    Bachelor in Engineering in Computer Science.

    RAJIV GANDHI PROUDYOGIKI VISHWAVIDYALAYA

    (University of Technology of Madhya Pradesh)

    BHOPAL (M.P.)

    Submitted by

    Prakash Jha
    Kumar Prasanna
    Ishita Verma
    Prakhar Gupta

    Department of Computer Science Engineering

    Sagar Institute of Research & Technology

    Session "#$%"#$'

    NLP Based Grammar Checker

    A grammar checker is one of the basic Natural Language Processing (NLP) tools for any
    language. The NLP field is relatively new in India and a lot of tools have yet to be developed.
    One of these is a grammar checker.

    Goals

    To implement a text processing system which checks the grammar of the input text and
    identifies the types of errors.

    Description in detail:

    1. POS tagging

    Before grammar checking can be performed on a text, it needs to be run
    through a part-of-speech (POS) tagger and parser. This enables the grammar
    checker to recognise types of words within each sentence. The text is first run
    through a POS tagger which generates a tag for each word in a sentence. The
    tag indicates the word's class. Next, the text (with tags) is run through a parser
    which performs syntactic analysis on it, adding tags to parts of the sentence,
    marking phrases within it and syntactic roles.
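
    The tagging step can be illustrated with a minimal sketch in Python. It assumes a tiny hand-made lexicon with Penn-Treebank-style tags; the names LEXICON, tokenize and pos_tag are hypothetical, and neither the tag set nor the language reflects the project's actual tagger.

```python
import re

# Toy lexicon mapping lowercase words to Penn-Treebank-style tags.
# These entries are illustrative assumptions, not the project's tag set.
LEXICON = {
    "the": "DT",
    "students": "NNS",
    "are": "VBP",
    "playing": "VBG",
    "football": "NN",
    "in": "IN",
    "playground": "NN",
    ".": ".",
}


def tokenize(text):
    """Split text into alphabetic words and sentence-final punctuation marks."""
    return re.findall(r"[A-Za-z]+|[.!?]", text)


def pos_tag(text):
    """Return (token, tag) pairs; words missing from the lexicon get 'UNK'."""
    return [(tok, LEXICON.get(tok.lower(), "UNK")) for tok in tokenize(text)]


if __name__ == "__main__":
    for token, tag in pos_tag("The students are playing football in the playground."):
        print(token, tag)
```

    A dictionary lookup cannot resolve words that belong to several classes, so a real tagger would also need to take the surrounding context into account.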

    2. Making Chunk-based Sentence Patterns

    Chunking is the process of parsing a sentence into a chunk-based sentence
    structure. A chunk is a textual unit of adjacent POS tags which displays the relations
    between their internal words. The input English sentence is put into chunk structure
    using hand-written rules. This structure represents how the chunks fit together to form
    the constituents of the sentence.

    Context-Free Grammar (CFG): CFGs constitute an important class of grammars, with a
    broad range of applications including programming languages, natural language
    processing, bioinformatics and so on. CFG rules, which have a single symbol on the
    left-hand side, are a sufficiently powerful formalism to describe most of the
    structure in natural language.
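
    The idea of building chunks from adjacent POS tags with hand-written rules can be sketched as follows. This is only an illustration under stated assumptions: the rule patterns, the Penn-Treebank-style tags and the chunk labels NC (noun chunk), VC (verb chunk), PPC (prepositional chunk) and END are hypothetical choices, not the project's actual rule set.

```python
import re

# Hand-written chunk rules: (chunk label, regex over space-terminated POS tags).
# Labels and patterns are illustrative assumptions.
CHUNK_RULES = [
    ("NC", r"(?:DT )?(?:JJ )*(?:NNS?|PRP) "),   # optional determiner, adjectives, noun
    ("VC", r"(?:MD )?(?:VB[DGNPZ]? )+"),        # optional modal, one or more verbs
    ("PPC", r"IN "),                            # a preposition
    ("END", r"\. "),                            # sentence-final period
]


def chunk(tags):
    """Group a list of POS tags into (label, [tags]) chunks, left to right."""
    chunks = []
    i = 0
    while i < len(tags):
        # Serialise the remaining tags so each rule can be matched at the front.
        rest = "".join(tag + " " for tag in tags[i:])
        for label, pattern in CHUNK_RULES:
            match = re.match(pattern, rest)
            if match:
                n = len(match.group(0).split())
                chunks.append((label, tags[i:i + n]))
                i += n
                break
        else:
            # No rule applies: keep the tag as an unknown single-tag chunk.
            chunks.append(("UNK", [tags[i]]))
            i += 1
    return chunks


if __name__ == "__main__":
    tags = ["DT", "NNS", "VBP", "VBG", "NN", "IN", "DT", "NN", "."]
    print(chunk(tags))
```

    Rules are tried in list order at each position, so the order of CHUNK_RULES decides which chunk wins when several patterns could match.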

    A context-free grammar G = (V, T, S, P) is given by

    A finite set V of variables or non-terminal symbols.

    A finite set T of symbols or terminal symbols. We assume that the sets V and T are
    disjoint.

    A start symbol S ∈ V.

    A finite set P ⊆ V × (V ∪ T)* of productions. A production (A, α), where A ∈ V and
    α ∈ (V ∪ T)* is a se[...]

    [...]reduce parsing begins with the input sentence and combines words into higher-level
    chunks until the unit finally becomes a sentence.
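
    The definition G = (V, T, S, P) and the bottom-up combination of words into higher-level units can be sketched in Python. This is a minimal sketch assuming a greedy shift-reduce strategy without backtracking; the toy grammar and the names VARIABLES, TERMINALS, START, PRODUCTIONS and shift_reduce are hypothetical and not taken from the project.

```python
# A toy grammar G = (V, T, S, P); all symbols and rules are illustrative assumptions.
VARIABLES = {"S", "NP", "VP"}          # V: non-terminal symbols
TERMINALS = {"DT", "NN", "VB"}         # T: terminal symbols (POS tags here)
START = "S"                            # S: start symbol
PRODUCTIONS = [                        # P: pairs (A, alpha) with A in V
    ("NP", ("DT", "NN")),
    ("NP", ("NN",)),
    ("VP", ("VB", "NP")),
    ("S", ("NP", "VP")),
]

# Sanity checks that mirror the definition.
assert VARIABLES.isdisjoint(TERMINALS)
assert START in VARIABLES
assert all(lhs in VARIABLES for lhs, _ in PRODUCTIONS)


def shift_reduce(tokens):
    """Greedy shift-reduce recogniser: True if tokens reduce to the start symbol."""
    stack = []
    queue = list(tokens)
    # Prefer longer right-hand sides so 'DT NN' is reduced before a lone 'NN'.
    rules = sorted(PRODUCTIONS, key=lambda p: len(p[1]), reverse=True)
    while True:
        for lhs, rhs in rules:
            n = len(rhs)
            if len(stack) >= n and tuple(stack[-n:]) == rhs:
                stack[-n:] = [lhs]      # reduce: replace the handle with its left side
                break
        else:
            if not queue:
                break                   # nothing to reduce and nothing left to shift
            stack.append(queue.pop(0))  # shift the next input symbol
    return stack == [START]


if __name__ == "__main__":
    print(shift_reduce(["DT", "NN", "VB", "DT", "NN"]))
    print(shift_reduce(["VB", "NN"]))
```

    Because the recogniser always reduces as soon as it can, it may reject sentences that an ambiguous grammar would accept; a full parser would need lookahead or backtracking.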

    Parsing chunks by using CFG:

    The syntactic chunk structure of a sentence is necessary to determine its grammar
    correctness. In the proposed system, ten general chunk types are used to make the
    chunk structure as shown in Table.

    The proposed grammar checker identifies the chunks using CFG based bottom-up
    parsing for assembling POS tags into higher level chunks, until a complete sentence
    has been found. For example, a simple sentence "The students are playing football in
    the playground." is chunked as follows:

    NC_VC_NC_PPC_NC_END (Chunk-based Sentence Pattern)

    NC_VC_NC_PPC_NC

    NC_VC_NC_PPC

    NC_VC_NC
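
    One way a chunk-based sentence pattern could be used to judge grammatical correctness is to compare it with a list of accepted patterns. The sketch below assumes chunk labels such as NC (noun chunk), VC (verb chunk), PPC (prepositional chunk) and END, joined by underscores; the set VALID_PATTERNS and the function names are hypothetical and far smaller than a real checker would need.

```python
# Hypothetical set of accepted chunk-based sentence patterns.
VALID_PATTERNS = {
    "NC_VC_END",
    "NC_VC_NC_END",
    "NC_VC_NC_PPC_NC_END",
}


def sentence_pattern(chunk_labels):
    """Join chunk labels into a single chunk-based sentence pattern string."""
    return "_".join(chunk_labels)


def is_grammatical(chunk_labels):
    """Return True if the sentence pattern is one of the accepted patterns."""
    return sentence_pattern(chunk_labels) in VALID_PATTERNS


if __name__ == "__main__":
    labels = ["NC", "VC", "NC", "PPC", "NC", "END"]
    print(sentence_pattern(labels), is_grammatical(labels))
```

    A pattern that is not in the accepted set only signals that something is wrong; identifying the type of error would need additional rules.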

    System Components

    1. POS Tagger

    2. Chunk Based Grammar Checker

    Applications

    Text Processing

    Machine Translation Systems

    Search Engine

    Spell-checker

    Grammar Checker

    Named Entity Identification

    Information Extraction

    Information Retrieval

    Text Classification and Clustering

    Question Answering Systems

    Custom Search Systems

    Technologies Used

    PHP

    AngularJS