7/23/2019 Synopsis of NLP project
Synopsis of
NLP Based Grammar Checker
For the award of degree of
Bachelor in Engineering in Computer Science.
RAJIV GANDHI PROUDYOGIKI VISHWAVIDYALAYA
(University of Technology of Madhya Pradesh)
BHOPAL (M.P.)
Submitted by
Prakash Jha, Kumar Prasanna
Ishita Verma, Prakhar Gupta
Department of Computer Science Engineering
Sagar Institute of Research & Technology
Session "#$%"#$'
NLP Based Grammar Checker
A grammar checker is one of the basic Natural Language Processing (NLP) tools for any
language. The NLP field is relatively new in India and a lot of tools have yet to be developed.
One of these is a grammar checker.
Goals
To implement a text processing system which checks the grammar of input text and identifies
the types of error.
Description in detail:
1. POS tagging
Before grammar checking can be performed on a text, it needs to be run
through a part-of-speech (POS) tagger and parser. This enables the grammar
checker to recognise types of words within each sentence. The text is first run
through a POS tagger, which generates a tag for each word in a sentence. The
tag indicates the word's class. Next, the text (with tags) is run through a parser,
which performs syntactic analysis on it, adding tags to parts of the sentence and
marking phrases within it and syntactic roles.
For example: [example figure missing]
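The tagging step described above can be sketched as a simple lexicon lookup. This is only a minimal illustration, not the project's tagger: the `LEXICON` table, the Penn Treebank style tag names, and the fallback tag for unknown words are all assumptions made for the example.

```python
# Toy lexicon-based POS tagger.
# ASSUMPTIONS: the lexicon, the Penn Treebank style tag names and the
# default tag are illustrative only; they are not taken from the synopsis.
import re

LEXICON = {
    "the": "DT",
    "students": "NNS",
    "are": "VBP",
    "playing": "VBG",
    "football": "NN",
    "in": "IN",
    "playground": "NN",
    ".": ".",
}


def tokenize(text):
    """Split text into lowercase word tokens and single punctuation marks."""
    return re.findall(r"[a-z]+|[^\sa-z]", text.lower())


def pos_tag(text, default_tag="NN"):
    """Return (token, tag) pairs; unknown words fall back to default_tag."""
    return [(tok, LEXICON.get(tok, default_tag)) for tok in tokenize(text)]


if __name__ == "__main__":
    sentence = "The students are playing football in the playground."
    for token, tag in pos_tag(sentence):
        print(f"{token}/{tag}")
```

A real tagger would use a trained statistical model rather than a fixed table; the lookup is used here only to show the shape of the tagger's output, one tag per word.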
2. Making Chunk-based Sentence Patterns
Chunking is the process of parsing a sentence into a chunk-based sentence
structure. A chunk is a textual unit of adjacent POS tags which display the relations
between their internal words. The input English sentence is put into chunk structure by
using hand-written rules. The structure represents how these chunks fit together to form the
constituents of the sentence.
Context Free Grammar (CFG): CFGs constitute an
important class of grammars, with a broad range of applications including
programming languages, natural language processing, bioinformatics and so on.
CFG rules have a single symbol on the left-hand side, and CFGs are a sufficiently powerful
formalism to describe most of the structure in natural language.
A context-free grammar $G = (V, T, S, P)$ is given by:
A finite set $V$ of variables or non-terminal symbols.
A finite set $T$ of symbols or terminal symbols. We assume that the sets $V$ and $T$ are
disjoint.
A start symbol $S \in V$.
A finite set $P \subseteq V \times (V \cup T)^*$ of productions. A production is a pair $(A, \alpha)$, where $A \in V$ and
$\alpha \in (V \cup T)^*$ is a sequence of variables and terminals.
Shift-reduce parsing begins with the input sentence and combines words into higher-level
chunks until the unit finally becomes a sentence.
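The definition above and the reduce step can be made concrete with a short sketch. The grammar below (`TOY_GRAMMAR`), its symbol names, and the greedy reduce-first strategy are assumptions chosen for illustration; the synopsis does not specify the project's actual grammar or parser.

```python
# A CFG as the 4-tuple (V, T, S, P) plus a greedy shift-reduce recognizer.
# ASSUMPTIONS: the toy grammar and the greedy strategy are illustrative only.
from dataclasses import dataclass


@dataclass(frozen=True)
class CFG:
    variables: frozenset   # V: non-terminal symbols
    terminals: frozenset   # T: terminal symbols
    start: str             # S: start symbol, must be in V
    productions: tuple     # P: pairs (A, alpha), alpha a tuple of symbols

    def validate(self):
        """Check the conditions stated in the definition of a CFG."""
        if not self.variables.isdisjoint(self.terminals):
            raise ValueError("V and T must be disjoint")
        if self.start not in self.variables:
            raise ValueError("start symbol must be a variable")
        symbols = self.variables | self.terminals
        for head, body in self.productions:
            if head not in self.variables:
                raise ValueError(f"bad production head: {head}")
            if any(sym not in symbols for sym in body):
                raise ValueError(f"unknown symbol in body of {head}")
        return True


def shift_reduce(grammar, tokens):
    """Reduce whenever the top of the stack matches a production body,
    otherwise shift the next token. There is no backtracking, so this
    greedy version can reject inputs that a full parser would accept."""
    stack, queue = [], list(tokens)
    while True:
        for head, body in grammar.productions:
            n = len(body)
            if n and tuple(stack[-n:]) == body:
                del stack[-n:]          # pop the matched body ...
                stack.append(head)      # ... and push the production head
                break
        else:
            if not queue:
                break                   # nothing to reduce, nothing to shift
            stack.append(queue.pop(0))  # shift
    return stack == [grammar.start]


TOY_GRAMMAR = CFG(
    variables=frozenset({"S", "NP", "VP"}),
    terminals=frozenset({"DT", "NN", "VB"}),
    start="S",
    productions=(
        ("NP", ("DT", "NN")),
        ("VP", ("VB", "NP")),
        ("S", ("NP", "VP")),
    ),
)

if __name__ == "__main__":
    TOY_GRAMMAR.validate()
    print(shift_reduce(TOY_GRAMMAR, ["DT", "NN", "VB", "DT", "NN"]))
```

Here the terminals are POS tags rather than words, which matches the idea of parsing over tagger output. Grammars with unit-production cycles would make this greedy loop run forever, so the sketch assumes there are none.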
Parsing chunks by using CFG:
The syntactic chunk structure of a sentence is necessary to determine its grammatical
correctness. In the proposed system, ten general chunk types are used to make the
chunk structure as shown in Table.
The proposed grammar checker identifies the chunks using CFG-based bottom-up
parsing for assembling POS tags into higher-level chunks, until a complete sentence
has been found. For example, a simple sentence "The students are playing football in
the playground." is chunked as follows:
NC_VC_NC_PPC_NC_END (Chunk-based Sentence Pattern)
NC_VC_NC_PPC_NC
NC_VC_NC_PPC
NC_VC_NC
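Assembling POS tags into a chunk-based sentence pattern can be sketched as follows. This is a hedged illustration, not the proposed system: the rule table `CHUNK_RULES`, the chunk labels (`NC`, `VC`, `PPC`, `END`), the underscore-joined pattern format, and the set `VALID_PATTERNS` are all assumptions, and a greedy longest-match chunker stands in for the CFG-based parser.

```python
# Greedy longest-match chunker over POS tags.
# ASSUMPTIONS: rules, chunk labels and valid patterns are hypothetical.
CHUNK_RULES = [
    (("DT", "NNS"), "NC"),    # determiner + plural noun -> noun chunk
    (("DT", "NN"), "NC"),     # determiner + noun        -> noun chunk
    (("NN",), "NC"),          # bare noun                -> noun chunk
    (("VBP", "VBG"), "VC"),   # auxiliary + participle   -> verb chunk
    (("IN",), "PPC"),         # preposition              -> prepositional chunk
    ((".",), "END"),          # sentence-final full stop
]

# Hypothetical set of sentence patterns accepted as grammatical.
VALID_PATTERNS = {"NC_VC_NC_PPC_NC_END", "NC_VC_NC_END", "NC_VC_END"}


def chunk(tags):
    """Scan left to right, always applying the longest matching rule.
    Tags that match no rule are labelled UNK."""
    rules = sorted(CHUNK_RULES, key=lambda rule: len(rule[0]), reverse=True)
    chunks, i = [], 0
    while i < len(tags):
        for body, label in rules:
            if tuple(tags[i:i + len(body)]) == body:
                chunks.append(label)
                i += len(body)
                break
        else:
            chunks.append("UNK")
            i += 1
    return chunks


def sentence_pattern(tags):
    """Join chunk labels into a single pattern string."""
    return "_".join(chunk(tags))


def is_grammatical(tags):
    """Accept a sentence only if its pattern is in the known set."""
    return sentence_pattern(tags) in VALID_PATTERNS


if __name__ == "__main__":
    # POS tags for "The students are playing football in the playground."
    tags = ["DT", "NNS", "VBP", "VBG", "NN", "IN", "DT", "NN", "."]
    print(sentence_pattern(tags), is_grammatical(tags))
```

Checking the resulting pattern against a list of accepted patterns is one simple way to flag a sentence as ungrammatical; a sentence whose tags produce an unlisted pattern, such as two noun chunks with no verb chunk, is rejected.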
System Components
1. POS Tagger
2. Chunk Based Grammar Checker
Applications
Text Processing
Machine Translation Systems
Search Engine
Spell-checker
Grammar Checker
Named Entity Identification
Information Extraction
Information Retrieval
Text Classification and Clustering
Question Answering Systems
Custom Search Systems
Technologies Used
PHP
AngularJS