Natural Language Processing and its Application
Dr. Samir Rustamov, Assistant Professor, School of IT & Engineering, ADA University
2018-05-28



What is Natural Language Processing (NLP)?

• Natural Language Processing (NLP) is a field of artificial intelligence that enables computers to interact with humans in natural language.

Ultimate goal: natural human-to-computer communication


The Turing Test ("Can machines think?", A. M. Turing, 1950)

[Diagram: Computer, Human, Human Judge]

• A human judge asks tele-typed questions to a computer and a human.
• The computer's job is to act like a human.
• The human's job is to convince the judge that he is not a machine.
• The computer is judged "intelligent" if it can fool the judge.
• The judgment of intelligence rests on whether the system gives appropriate answers to the questions.


Natural Language Processing Market to Reach $22.3 Billion by 2025


Forms of Natural Language

• The input/output of an NLP system can be:
  • written text
  • speech

• To process written text, we need:
  • lexical, syntactic, and semantic knowledge about the language
  • discourse information and real-world knowledge

• To process spoken language, we need everything required to process written text, and we must also handle the challenges of speech recognition and speech synthesis.


Components of NLP

Natural Language Understanding: mapping the given input in natural language into a useful representation.

Natural Language Generation: producing output in natural language from some internal representation.

NL Understanding is much harder than NL Generation. Still, both of them are hard.


Why is NL Understanding hard?

• Natural language is extremely rich in form and structure, and very ambiguous.
• One input can mean many different things. Ambiguity can occur at different levels:
  • Lexical (word-level) ambiguity -- different meanings of words
  • Syntactic ambiguity -- different ways to parse the sentence
  • Interpreting partial information -- how to interpret pronouns

• Many inputs can mean the same thing.
• The interaction among components of the input is not clear.


Example of ambiguity

• Some interpretations of: Adamı gördüm.
  1. I saw the man.
  2. I saw Adam.
  3. I saw my island.
  4. I visited my island.
  5. I saw my ADA.
  6. I visited my ADA.
  7. I bribed the man.

• Semantic ambiguity:
  • gör -- to see
  • gör -- to visit
  • gör -- to bribe


Resolve Ambiguities

• Lexical disambiguation -- resolving part-of-speech ambiguity and resolving word-sense ambiguity are two important kinds of lexical disambiguation.
• Syntactic ambiguity -- can be addressed by probabilistic parsing.


What is PoS tagging? Why is it important?

“gül” - ?

Each word has a part-of-speech tag that describes its category. POS taggers try to find the POS tags for the words.


Why does it matter? Applications

• Machine translation: “Daşı daşı”


Language Processing

Phonetics & Phonology (speech sounds)

Morphology, Lexicon (words & their forms)

Syntax, Parsing (structure of sentences)

Semantics (meaning of sentences)

Pragmatics (meaning in context & for a purpose)

Discourse (connected sentence processing in a larger body of text)


Applications for spelling correction

Web search

Phones

Word processing


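Spelling correction

f(
- Cin seddi dunyanın yedi mocusəsindən birdir.
- Niye?
- Cunki duzəldikleti en uzun omurlu sheydi.
) =
- Çin səddi dünyanın yeddi möcüzəsindən biridir.
- Niyə?
- Çünki düzəltdikləti ən uzun ömürlü şeydir.

(English gloss: "The Great Wall of China is one of the seven wonders of the world." "Why?" "Because it is the longest-lasting thing they have made.")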
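The mapping f from misspelled to corrected text can be sketched in a simplified, word-by-word form. This is a minimal illustration, not the method behind the slide: it assumes a known vocabulary and picks the entry with the smallest Levenshtein (edit) distance. The names `edit_distance`, `correct`, and `VOCAB` are hypothetical, and the vocabulary is just a handful of words taken from the corrected sentence above.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions that turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def correct(word: str, vocabulary: list[str]) -> str:
    """Return the vocabulary entry closest to `word` (ties go to the earliest entry)."""
    return min(vocabulary, key=lambda v: edit_distance(word, v))


# Hypothetical mini-vocabulary, for illustration only.
VOCAB = ["Çin", "səddi", "dünyanın", "yeddi", "möcüzəsindən",
         "biridir", "Niyə", "Çünki", "ən", "uzun", "ömürlü"]

print(correct("Niye", VOCAB))    # Niyə
print(correct("omurlu", VOCAB))  # ömürlü
```

A practical corrector would also use word frequencies and sentence context, because several vocabulary words can lie at the same distance from a misspelling.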


Machine translation

Original text:
When the space organization NASA first started sending up astronaunts they discovered ballpoint pens would not work in zero gravity. To solve the problem, NASA scientists spent ten years and 12 billion to develop a pen that would write in zero gravity, upside down, under water, on all types of surface, and at temperatures ranging from below freezing to 300 C. Russians used a pencil.

Google:
NASA kosmik təşkilatı ilk dəfə astronavtların göndərilməsinə başladıqları zaman, ballpoint qələmləri sıfır çəkisi ilə işləməyəcəkdi. Problemi həll etmək üçün, NASA alimləri sıfır ağırlıq, baş aşağı, su altında, bütün səthlərdə və aşağıda dondurmadan 300 dərəcə qədər dəyişən temperaturda yazacaq bir qələm hazırlamaq üçün on il və 12 milyard dollar sərf etmişdir. Ruslar bir qələm istifadə edirdi.

Dilmanc:
Kosmos təşkilatı NASA ilk astronaunts qaldıraraq başlayanda onlar kəşf diyircəkli qələmlər çəkisizlik şəraitində işləməyəcək. Problemin həlli, NASA alimləri çəkisizlik yazacaq qələm inkişaf etdirmək, on il və 12 milyard xərcləyib, tərsinə, su altında, bütün səthinin növləri, 300 c-aşaxta tutmuş və temperaturda. karandaş istifadə olunan Ruslar.


Why is machine translation hard?

• Requires both understanding the “from” language and generating the “to” language.

• How can we teach a computer a “second language” when it doesn’t even really have a first language?

• Can we do machine translation without solving natural language understanding and natural language generation first?


Text Classification

• Assigning subject categories, topics, or genres
• Spam detection
• Authorship identification
• Age/gender identification
• Language identification
• Sentiment analysis
• …
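As one illustration of the tasks listed above, spam detection can be sketched with a multinomial naive Bayes classifier. Naive Bayes is chosen here only because it is compact; the slides do not name a specific algorithm. The function names and the four training messages are made up for the example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def train(examples: list[tuple[str, str]]):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {w for counts in word_counts.values() for w in counts}
    return label_counts, word_counts, vocab


def classify(text: str, model) -> str:
    """Return the label with the highest log-probability under naive Bayes."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        total_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_docs)  # log prior
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up training data, for illustration only.
TRAIN = [
    ("win money now", "spam"),
    ("free prize win", "spam"),
    ("meeting at noon", "ham"),
    ("lunch meeting tomorrow", "ham"),
]
model = train(TRAIN)
print(classify("win a free prize", model))          # spam
print(classify("meeting tomorrow at noon", model))  # ham
```

The same structure applies to the other tasks in the list (topic, language, or sentiment labels) by changing the labels in the training data.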


Information retrieval

• Information retrieval is the activity of obtaining information resources relevant to an information need from a collection of information resources.
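A minimal sketch of this definition: treat the information need as a keyword query, the collection as a list of short documents, and rank documents by TF-IDF score. TF-IDF is an assumption made for illustration, not something the slide prescribes, and the three documents in `DOCS` are hypothetical.

```python
import math
from collections import Counter


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def rank(query: str, documents: list[str]) -> list[tuple[float, str]]:
    """Score each document by the summed TF-IDF weight of the query terms."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    results = []
    for doc, tokens in zip(documents, tokenized):
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term in tf:
                idf = math.log(n_docs / df[term])
                score += (tf[term] / len(tokens)) * idf
        results.append((score, doc))
    # Highest score first; documents with equal scores keep their original order.
    return sorted(results, key=lambda pair: pair[0], reverse=True)


# Hypothetical three-document collection.
DOCS = [
    "machine translation converts text between languages",
    "spelling correction fixes typing errors in text",
    "speech recognition converts audio to text",
]
for score, doc in rank("machine translation", DOCS):
    print(f"{score:.3f}  {doc}")
```

Note that a term occurring in every document (here "text") gets an IDF of zero, so it does not help distinguish documents.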


Text Summarization

• Goal: produce an abridged version of a text that contains information that is important or relevant to a user.
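One simple way to approximate this goal is extractive summarization: keep the sentences whose words occur most often in the text. The sketch below assumes that word frequency is a reasonable proxy for importance, which is a strong simplification. The function name `summarize` and the sample text are hypothetical.

```python
import re
from collections import Counter


def summarize(text: str, n_sentences: int = 1) -> str:
    """Keep the n highest-scoring sentences, in their original order."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence: str) -> float:
        tokens = re.findall(r"\w+", sentence.lower())
        # Average word frequency, so long sentences are not favoured by length alone.
        return sum(freq[t] for t in tokens) / max(len(tokens), 1)

    keep = set(sorted(sentences, key=score, reverse=True)[:n_sentences])
    return " ".join(s for s in sentences if s in keep)


# Made-up sample text, for illustration only.
TEXT = "NLP systems process text. NLP systems also process speech. Pencils are cheap."
print(summarize(TEXT))  # NLP systems process text.
```

Relevance to a particular user, which the goal above also mentions, would need extra input such as a query. This sketch ignores that part.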


Dialogue systems

• A dialog system or conversational agent (CA) is a computer system intended to converse with a human, with a coherent structure.


Question Answering


Deep Learning Algorithms and Their NLP Usage

Neural Network (NN, feed-forward)
• Part-of-speech tagging
• Tokenization
• Named entity recognition
• Intent extraction

Recurrent Neural Networks (RNN)
• Machine translation
• Question answering systems
• Image captioning

Recursive Neural Networks
• Parsing sentences
• Sentiment analysis
• Paraphrase detection
• Relation classification
• Object detection

Convolutional Neural Network (CNN)
• Sentence/text classification
• Relation extraction and classification
• Spam detection
• Categorization of search queries
• Semantic relation extraction


References:

1. Navdeep Singh Gill. Overview of Artificial Intelligence and Natural Language Processing. https://www.upwork.com/hiring/for-clients/artificial-intelligence-and-natural-language-processing-in-big-data/
2. Natural Language Processing Market to Reach $22.3 Billion by 2025. https://www.tractica.com/newsroom/press-releases/natural-language-processing-market-to-reach-22-3-billion-by-2025/
3. Dan Jurafsky. Natural Language Processing Lectures.
4. Prof. Dr. İlyas Çiçekli. BİL711 Natural Language Processing.