Introduction to Formal Linguistics

Simon Dobnik

Department of Philosophy, Linguistics and Theory of Science

September 3, 2015

Based on slides by Robin Cooper

Outline

Practicalities

Overview of linguistics

Phonetics and Phonology

Morphology

Syntax

Semantics

Lexicon

A broader view

Practicalities

The course website

LT2112 H15 Introduction to formal linguistics on https://gul.gu.se

https://gul.gu.se/courseId/65958/content.do?id=26978419

http://gul.gu.se/public/courseId/70822/lang-en/publicPage.do


Course lecturers

• Ellen Breitholtz (morphology)
• Simon Dobnik (syntax and semantics with pragmatics, course organiser)
• Johan Gross (phonetics and phonology)

Overview of linguistics

Linguistics – a scientific view of language

• formal: explicit, exact (to an extent)
• Noam Chomsky, starting mid-fifties
• but goes back to ancient grammarians (Pāṇini, 4th cent. B.C.)
• nineteenth century (historical perspective, diachronic, Hermann Paul: sentences are the sum of their parts)
• pre-Chomskyan 20th century – synchronic (Saussure), structuralists (Leonard Bloomfield, Charles Hockett, Zellig Harris)


Linguistic methods

• corpus linguistics
• formal analysis
• experimental methods

Computational linguistics

... the scientific study of human language – specifically of the system of rules and the ways in which they are used in communication – using mathematical models and formal procedures that can be realised and validated using computers; a cross-over of many disciplines. (Stanford Linguistics Professor, 1980s)

Borrowed from Stephan Oepen’s slide

Computational Linguistics

Wikipedia

University of Saarland


A language module

[Figure: a language module connecting text input and speech input to speech output and text output. Components shown: speech recognizer/synthesizer, morphological analyzer/generator, syntactic parser/generator, semantic analyzer/reasoner, lexicon, grammar, knowledge base and dialogue planner.]

Phonetics and Phonology


Articulatory phonetics

• how we use the mouth and the vocal tract to produce speech sounds
• classification of speech sounds according to articulation


The vocal tract

From Wikipedia.


The IPA chart

http://www.internationalphoneticalphabet.org/ipa/

[Figure: the International Phonetic Alphabet (revised to 2005, © 2005 IPA). Panels: consonants (pulmonic), consonants (non-pulmonic: clicks, voiced implosives, ejectives), vowels, other symbols, diacritics, suprasegmentals, tones and word accents.]

Pulmonic consonants are arranged by place of articulation (bilabial, labiodental, dental, alveolar, postalveolar, retroflex, palatal, velar, uvular, pharyngeal, glottal) and by manner (plosive, nasal, trill, tap or flap, fricative, lateral fricative, approximant, lateral approximant). Where symbols appear in pairs, the one to the right represents a voiced consonant. Shaded areas denote articulations judged impossible.

Vowels are arranged by backness (front, central, back) and by height (close, close-mid, open-mid, open). Where symbols appear in pairs, the one to the right represents a rounded vowel.

The IPA chart for pulmonic consonants


The IPA chart for vowels


Acoustic phonetics

• the data from sound waves
• can we recognise speech sounds from the acoustic data?
• not just acoustic data: McGurk effect, video
• continuous speech to discrete speech sounds, co-articulation


Spectrogram

From Wikipedia.


Phonology

• phonemes (kit, cat)
• phonological rules ([s]ip, [z]ip – sip[s], zip[s] ~ bib[z], pub[z])
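
The plural alternation in the phonological rule above (sip[s], zip[s] ~ bib[z], pub[z]) can be sketched in Python. This is a minimal illustration rather than a full account of English: the function name `plural_allomorph` and the small set `VOICELESS` are assumptions made for the example, and stems ending in sibilants are deliberately left out.

```python
# Hypothetical, simplified inventory of voiceless stem-final sounds.
VOICELESS = {"p", "t", "k", "f"}


def plural_allomorph(final_sound: str) -> str:
    """Return the plural suffix sound for a stem ending in final_sound.

    The suffix agrees in voicing with the sound before it:
    [s] after a voiceless sound, [z] after a voiced one.
    """
    return "s" if final_sound in VOICELESS else "z"


# Stems from the slide, paired with their final sound.
for stem, final in [("sip", "p"), ("zip", "p"), ("bib", "b"), ("pub", "b")]:
    print(f"{stem}[{plural_allomorph(final)}]")
```

The rule looks only at the final sound of the stem, not at its first sound, which is why sip and zip behave alike here even though [s] and [z] distinguish them as words.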


Morphology


Inflectional morphology

• different forms in a paradigm
• e.g. singular vs plural (cat vs cats); verb forms (run, runs, ran)
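
A paradigm can be pictured as a small table from grammatical features to word forms. The sketch below is only an illustration: the feature labels (`"3sg present"` and so on) and the names `PARADIGMS`, `analyse` and `generate` are assumptions, not part of the course material.

```python
# Hypothetical paradigms: lemma -> {feature label: inflected form}.
PARADIGMS = {
    "cat": {"singular": "cat", "plural": "cats"},
    "run": {"base": "run", "3sg present": "runs", "past": "ran"},
}


def analyse(form: str) -> list[tuple[str, str]]:
    """Return every (lemma, feature) pair whose paradigm cell equals form."""
    return [
        (lemma, feature)
        for lemma, cells in PARADIGMS.items()
        for feature, cell in cells.items()
        if cell == form
    ]


def generate(lemma: str, feature: str) -> str:
    """Look up the inflected form for a lemma and a feature label."""
    return PARADIGMS[lemma][feature]


print(analyse("cats"))
print(analyse("ran"))
print(generate("run", "3sg present"))
```

Listing irregular forms such as ran directly in the table sidesteps the question of rules; a real morphological analyzer/generator would more likely combine rules with a list of exceptions.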

Derivational morphology

• creating new words, perhaps of a different category, perhaps with a different meaning
• clever ~ cleverness, able ~ ability

Other morphological processes

• it is not clear whether there is a sharp boundary between morphology and syntax
• cliticization – John’s coming, je l’ai vu
• compounding – language technology course assessment
• sometimes not just a sum of meanings of sub-parts: white house, White House


Syntax


Parts of speech

• dog – noun
• run – verb
• the – determiner, definite article

Construction types

• the dog – noun phrase
• the dog ran – sentence
• the thief [who saw the policeman] ran into the shop – relative clause
• I wonder [who saw the policeman] – embedded question

Grammars and grammar rules

• sentences may consist of a noun phrase followed by a verb phrase – S → NP VP
• phrase structure grammars, context free grammars (Chomsky hierarchy)
• are natural languages context free?
• features (e.g. number agreement): *the dog run, *the dogs runs
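
The rule S → NP VP and the role of features can be sketched with a toy recognizer. This is a minimal illustration under stated assumptions: the lexicon, the rule NP → Det N, the single-verb VP and the number feature values `"sg"`/`"pl"` are all invented for the example and cover only the sentences on the slide.

```python
# Toy lexicon: word -> (category, number). "any" means unmarked for number.
LEXICON = {
    "the": ("Det", "any"),
    "dog": ("N", "sg"),
    "dogs": ("N", "pl"),
    "runs": ("V", "sg"),
    "run": ("V", "pl"),
}


def parse_np(words):
    """NP -> Det N. Return the number feature of the NP, or None."""
    if len(words) != 2:
        return None
    det, noun = (LEXICON.get(w) for w in words)
    if det is None or noun is None:
        return None
    if det[0] == "Det" and noun[0] == "N":
        return noun[1]
    return None


def is_sentence(text):
    """S -> NP VP, where VP is a single verb agreeing in number with NP."""
    words = text.split()
    if len(words) != 3:
        return False
    np_number = parse_np(words[:2])
    verb = LEXICON.get(words[2])
    if np_number is None or verb is None or verb[0] != "V":
        return False
    return verb[1] == np_number


# Print each string, starred if the toy grammar rejects it.
for s in ["the dog runs", "the dogs run", "the dog run", "the dogs runs"]:
    print(("" if is_sentence(s) else "*") + s)
```

Without the final agreement check, the plain context free rule would accept all four strings; the feature is what rules out the starred examples.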

Syntactic structures

From here.


Semantics


Semantic properties and model theory

• “to know the meaning of a (declarative) sentence is to know the conditions under which it would be true”
• truth in a model

Logic

• propositional logic
• first order logic
• predicates, constants, variables, quantifiers
• Every television presenter has a secret.
  ∀x.(television_presenter(x) ⇒ ∃y.(secret(y) ∧ have(x, y)))
  ∃y.(secret(y) ∧ ∀x.(television_presenter(x) ⇒ have(x, y)))
• model theory for logic
• inference
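
The two readings of "Every television presenter has a secret" can be told apart by evaluating them in a model. The sketch below assumes a small hypothetical model (two presenters, two secrets, each presenter having a different secret); the names `presenters`, `secrets` and `have` are illustrative only.

```python
# A hypothetical model: individuals plus the extension of each predicate.
presenters = {"ann", "bob"}
secrets = {"s1", "s2"}
have = {("ann", "s1"), ("bob", "s2")}


def every_has_some():
    """∀x.(television_presenter(x) ⇒ ∃y.(secret(y) ∧ have(x, y)))"""
    return all(any((x, y) in have for y in secrets) for x in presenters)


def some_shared_by_every():
    """∃y.(secret(y) ∧ ∀x.(television_presenter(x) ⇒ have(x, y)))"""
    return any(all((x, y) in have for x in presenters) for y in secrets)


print(every_has_some())        # first reading
print(some_shared_by_every())  # second reading
```

In this model the first reading is true and the second is false, so the two formulas have different truth conditions. Quantifying directly over the sets of presenters and secrets is a shortcut for the ⇒ and ∧ restrictions in the formulas.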

Pragmatics

• language in use
• speech acts (assert, query, ...)
• language in context (deictic pronouns I, you, but also demonstratives (this, that) and tense)
• presuppositions (my wife is coming → I have a wife, my wife isn’t coming → I have a wife)


Lexicon


Words and phrases

• “the lexicon is a list of words”
• seems also to include phrases – look up (the number), keep track of (the score), kick the bucket
• more information than just the words: phonology, morphology, syntax, semantics
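
One way to picture a lexicon that holds phrases as well as words, with more than just the word form per entry, is a table of structured records. The sketch below is an assumption-laden illustration: the class `LexicalEntry`, its field names, the rough transcriptions and the glosses are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LexicalEntry:
    form: str        # citation form; may be a multi-word expression
    phonology: str   # rough transcription (illustrative only)
    morphology: str  # e.g. where inflection attaches
    syntax: str      # part of speech or construction type
    semantics: str   # informal gloss


# The lexicon maps each citation form to its entry.
LEXICON = {
    e.form: e
    for e in [
        LexicalEntry("dog", "/dɒg/", "regular plural", "noun", "domestic canine"),
        LexicalEntry("look up", "/lʊk ʌp/", "inflects on 'look'",
                     "particle verb", "search for (information)"),
        LexicalEntry("kick the bucket", "/kɪk ðə ˈbʌkɪt/", "inflects on 'kick'",
                     "verb phrase idiom", "die"),
    ]
}

entry = LEXICON["kick the bucket"]
print(entry.syntax, "-", entry.semantics)
```

The idiom needs its own entry because its meaning is not the sum of the meanings of kick, the and bucket.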


A broader view

Some other areas of linguistics

... which may be relevant to language technology:

• historical linguistics
• comparative linguistics and language typology
• dialect studies
• sociolinguistics
• psycholinguistics (language acquisition, human language processing)

Language variation and universals

• languages are different but there’s a limit on how different they are
• language universals
• Sam read the books in the living-room
• Did Sam read the books in the living-room?
• *Living-room the in books the read Sam?
• Sam read the books which are in the living-room
• Which room did Sam read the books in __?
• *Which room did Sam read the books which are in __?


Everybody can talk

• ... except perhaps because of sickness, developmental characteristics or unusual social conditions
• native speakers
• linguistic (un)consciousness (lexicon vs grammar rules)


Language acquisition

From here.


Linguistics and psychology

• developmental psychology
• human processing
• should language technologists be concerned with this?
• should language technology systems imitate humans?


Why is linguistics (and language technology) difficult?

• natural languages are complex
• interaction with context
• multimodality, body language
• difficult to give a precise scientific theory of our linguistic behaviour


Human languages and other languages

• animal languages
• artificial languages (logic, programming languages)
• human languages

Some properties of human languages

• displacement (talking about things not present, time/tense, negation, (im)possibilities)
• arbitrary (compare different words for common objects in unrelated languages)
• productive (take any sentence, can you create a longer sentence which contains it?)
• discrete (digitisation)
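
The productivity question (can any sentence be extended into a longer one that contains it?) can be illustrated with a one-line embedding step. The function name `embed`, the frame "said that" and the default speaker are assumptions chosen for the example; many other frames would do.

```python
def embed(sentence: str, speaker: str = "Sam") -> str:
    """Wrap a sentence in a new clause, giving a longer sentence that contains it."""
    return f"{speaker} said that {sentence}"


# Each application yields a new, longer sentence; there is no last step.
s = "the dog ran"
for _ in range(3):
    s = embed(s)
    print(s)
```

Because the step can be applied to its own output, there is no longest sentence, which is one sense in which human languages are productive.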
