Languages and Grammars
• A grammar specifies the rules for constructing well-formed sentences in a language
• Every language, including a programming language, has a grammar
Applications
• Grammar checkers in word processors
• Programming language compilers
• Natural language queries (Google, etc.)
Generate Sentences in English
• Given a vocabulary and grammar rules, one can generate some random and perhaps rather silly sentences
• Vocabulary - the set of words belonging to the parts of speech (nouns, verbs, articles, prepositions)
• Grammar - the set of rules for building phrases in a sentence (noun phrase, verb phrase, prepositional phrase)
The Structure of a Sentence
sentence
noun phrase verb phrase
A sentence is a noun phrase followed by a verb phrase
The Structure of a Sentence
sentence
noun phrase verb phrase
article noun
A noun phrase is an article followed by a noun
The Structure of a Sentence
Similar to the behavior of strings so far
sentence
noun phrase verb phrase
article noun
the girl
Pick actual words for those parts of speech at random
The Structure of a Sentence
sentence
noun phrase verb phrase
article noun verb noun phrase prepositional phrase
the girl
A verb phrase is a verb followed by a noun phrase and a prepositional phrase
The Structure of a Sentence
sentence
noun phrase verb phrase
article noun verb noun phrase prepositional phrase
the girl hit
Pick a verb at random
The Structure of a Sentence
sentence
noun phrase verb phrase
article noun verb noun phrase prepositional phrase
article noun
the girl hit
Expand a noun phrase again
The Structure of a Sentence
sentence
noun phrase verb phrase
article noun verb noun phrase prepositional phrase
article noun
the girl hit the boy
Pick an article and a noun at random
The Structure of a Sentence
sentence
noun phrase verb phrase
article noun verb noun phrase prepositional phrase
article noun preposition noun phrase
the girl hit the boy
A prepositional phrase is a preposition followed by a noun phrase
The Structure of a Sentence
sentence
noun phrase verb phrase
article noun verb noun phrase prepositional phrase
article noun preposition noun phrase
the girl hit the boy with
Pick a preposition at random
The Structure of a Sentence
sentence
noun phrase verb phrase
article noun verb noun phrase prepositional phrase
article noun preposition noun phrase
article noun
the girl hit the boy with
Expand another noun phrase
The Structure of a Sentence
sentence
noun phrase verb phrase
article noun verb noun phrase prepositional phrase
article noun preposition noun phrase
article noun
the girl hit the boy with a bat
More random words from the parts of speech
Representing the Vocabulary
nouns = ['bat', 'boy', 'girl', 'dog', 'cat', 'chair', 'fence', 'table', 'computer', 'cake', 'field']
verbs = ['hit', 'threw', 'pushed', 'ate', 'dragged', 'jumped']
prepositions = ['with', 'to', 'from', 'on', 'below', 'above', 'beside']
articles = ['a', 'the']
Use a list of words for each part of speech (lexical category)
Picking a Word at Random
nouns = ['bat', 'boy', 'girl', 'dog', 'cat', 'chair', 'fence', 'table', 'computer', 'cake', 'field']
verbs = ['hit', 'threw', 'pushed', 'ate', 'dragged', 'jumped']
prepositions = ['with', 'to', 'from', 'on', 'below', 'above', 'beside']
articles = ['a', 'the']
import random
print(random.choice(verbs)) # Prints a randomly chosen verb
The random module includes functions to select numbers, sequence elements, etc., at random
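A few of those functions, as a minimal sketch. The seed value and the variable names word, roll, and pair are illustrative choices, not part of the slides.

```python
import random

verbs = ['hit', 'threw', 'pushed', 'ate', 'dragged', 'jumped']

random.seed(42)                  # optional: makes the random sequence repeatable

word = random.choice(verbs)      # one element of a sequence
roll = random.randint(1, 6)      # an integer from 1 through 6, inclusive
pair = random.sample(verbs, 2)   # two distinct elements, in random order

print(word, roll, pair)
```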
Grammar Rules
sentence = nounphrase verbphrase
nounphrase = article noun
verbphrase = verb nounphrase prepositionalphrase
prepositionalphrase = preposition nounphrase
A sentence is a noun phrase followed by a verb phrase
Etc., etc.
Define a Function for Each Rule
# sentence = nounphrase verbphrase
def sentence():
    return nounphrase() + ' ' + verbphrase()
Each function builds and returns a string that is an instance of the phrase
Separate phrases and words with a space
Define a Function for Each Rule
# sentence = nounphrase verbphrase
def sentence():
    return nounphrase() + ' ' + verbphrase()

# nounphrase = article noun
def nounphrase():
    return random.choice(articles) + ' ' + random.choice(nouns)
When a part of speech is reached, select an instance at random from the relevant list of words
Call sentence() to Try It Out
# sentence = nounphrase verbphrase
def sentence():
    return nounphrase() + ' ' + verbphrase()

# nounphrase = article noun
def nounphrase():
    return random.choice(articles) + ' ' + random.choice(nouns)

…

for x in range(10):
    print(sentence())  # Display 10 sentences
You can also generate examples of the other phrases by calling their functions
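Putting the pieces together, a complete generator might look like the sketch below. The slides show only sentence() and nounphrase(); the bodies of verbphrase() and prepositionalphrase() here are an assumption, written by following the grammar rules in the same style.

```python
import random

nouns = ['bat', 'boy', 'girl', 'dog', 'cat', 'chair', 'fence', 'table',
         'computer', 'cake', 'field']
verbs = ['hit', 'threw', 'pushed', 'ate', 'dragged', 'jumped']
prepositions = ['with', 'to', 'from', 'on', 'below', 'above', 'beside']
articles = ['a', 'the']

# sentence = nounphrase verbphrase
def sentence():
    return nounphrase() + ' ' + verbphrase()

# nounphrase = article noun
def nounphrase():
    return random.choice(articles) + ' ' + random.choice(nouns)

# verbphrase = verb nounphrase prepositionalphrase
# (assumed body, not shown in the slides)
def verbphrase():
    return random.choice(verbs) + ' ' + nounphrase() + ' ' + prepositionalphrase()

# prepositionalphrase = preposition nounphrase
# (assumed body, not shown in the slides)
def prepositionalphrase():
    return random.choice(prepositions) + ' ' + nounphrase()

for _ in range(5):
    print(sentence())               # a full sentence
print(prepositionalphrase())        # any phrase can be generated on its own
```

Every sentence produced this way has the same eight-word shape: article noun verb article noun preposition article noun.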
Kinds of Symbols in a Grammar
• Terminal symbols: words in the vocabulary of the language
• Non-terminal symbols: words that describe phrases or portions of sentences
• Metasymbols: used to construct rules
Metasymbols for a Grammar
Metasymbol   Use
" "          Enclose literal items
=            Means "is defined as"
[ ]          Enclose optional items
{ }          Enclose zero or more items
( )          Group together required choices
|            Indicates a choice
A Grammar of Arithmetic Expressions
expression = term { addingOperator term }
term = factor { multiplyingOperator factor }
factor = primary ["^" primary ]
primary = number | "(" expression ")"
number = digit { digit }
digit = "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
addingOperator = "+" | "-"
multiplyingOperator = "*" | "/"
Example sentences: 3, 4 + 5, 5 + 2 * 3, (5 + 2) * 3 ^ 4
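The function-per-rule technique used for English sentences carries over to this grammar. The sketch below generates random arithmetic expressions: { } becomes a loop, [ ] becomes an if, and | becomes a random choice. The probabilities and the MAX_DEPTH limit are arbitrary assumptions, chosen only to keep the output short.

```python
import random

MAX_DEPTH = 2   # assumed limit on nested parentheses

# expression = term { addingOperator term }
def expression(depth=0):
    result = term(depth)
    while random.random() < 0.3:                  # { } : zero or more times
        result += ' ' + random.choice('+-') + ' ' + term(depth)
    return result

# term = factor { multiplyingOperator factor }
def term(depth):
    result = factor(depth)
    while random.random() < 0.3:
        result += ' ' + random.choice('*/') + ' ' + factor(depth)
    return result

# factor = primary [ "^" primary ]
def factor(depth):
    result = primary(depth)
    if random.random() < 0.2:                     # [ ] : optional
        result += ' ^ ' + primary(depth)
    return result

# primary = number | "(" expression ")"
def primary(depth):
    if depth < MAX_DEPTH and random.random() < 0.2:   # | : a choice
        return '(' + expression(depth + 1) + ')'
    return number()

# number = digit { digit }
def number():
    result = random.choice('0123456789')
    while random.random() < 0.3:
        result += random.choice('0123456789')
    return result

for _ in range(5):
    print(expression())
```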
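Alternative Notation: Train Track
term = factor { multiplyingOperator factor }
[Train-track diagram: a track through factor, with a loop back through "*" or "/"]
primary = number | "(" expression ")"
[Train-track diagram: two alternative tracks, one through number, the other through "(" expression ")"]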
Parsing
• A parser analyzes a source program to determine whether or not it is syntactically correct
[Diagram: source language program → Parser → OK or not OK, with syntax error messages]
Scanning
• A scanner picks out words in a source program and sends these to the parser
[Diagram: source language program → Scanner → tokens → Parser → OK or not OK; the Scanner reports lexical error messages, the Parser reports syntax error messages]
The Scanner Interface
Scanner(aString)   Creates a scanner on a source string
get()              Returns the current token (at the cursor)
next()             Advances the cursor to the next token
Tokens
• A Token object has two attributes:
  – type (indicating an operand or operator)
  – value (an int if it’s an operand, or the source string otherwise)
• Token types are
  – Token.EOE
  – Token.PLUS, Token.MINUS
  – Token.MUL, Token.DIV
  – Token.INT
  – Token.UNKNOWN
The Token Interface
Token(source)   Creates a token from a source string
str(aToken)     String representation
isOperator()    True if an operator, false otherwise
getType()       Returns the type
getValue()      Returns the value
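One way the two interfaces could be implemented, as a minimal sketch. Only the method names and token type names come from the slides. The numeric type codes, the private attributes, the use of an empty string to signal end of expression (EOE), and the choice to load the first token in the constructor are all assumptions.

```python
DIGITS = '0123456789'

class Token:
    # Type codes; the numeric values are arbitrary (assumption)
    UNKNOWN, EOE, INT, PLUS, MINUS, MUL, DIV = range(7)
    _OPERATORS = {'+': PLUS, '-': MINUS, '*': MUL, '/': DIV}

    def __init__(self, source):
        if source == '':                              # assumed EOE convention
            self._type, self._value = Token.EOE, source
        elif all(ch in DIGITS for ch in source):      # operand: store an int
            self._type, self._value = Token.INT, int(source)
        else:                                         # operator or unknown
            self._type = Token._OPERATORS.get(source, Token.UNKNOWN)
            self._value = source

    def __str__(self):
        return str(self._value)

    def isOperator(self):
        return self._type in Token._OPERATORS.values()

    def getType(self):
        return self._type

    def getValue(self):
        return self._value

class Scanner:
    def __init__(self, aString):
        self._source = aString
        self._pos = 0
        self._current = None
        self.next()                      # load the first token

    def get(self):
        return self._current

    def next(self):
        s = self._source
        while self._pos < len(s) and s[self._pos].isspace():
            self._pos += 1               # skip whitespace
        if self._pos == len(s):
            self._current = Token('')    # end of expression
            return
        start = self._pos
        if s[start] in DIGITS:           # a run of digits is one INT token
            while self._pos < len(s) and s[self._pos] in DIGITS:
                self._pos += 1
        else:                            # any other character stands alone
            self._pos += 1
        self._current = Token(s[start:self._pos])

scanner = Scanner('3 + 42 * 5')
while scanner.get().getType() != Token.EOE:
    print(scanner.get())                 # prints 3, +, 42, *, 5 on separate lines
    scanner.next()
```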
Recursive Descent Parsing
• Each rule in the grammar translates to a Python parsing method
def expression(self):
    self.term()
    token = self.scanner.get()
    while token.getType() in (Token.PLUS, Token.MINUS):
        self.scanner.next()
        self.term()
        token = self.scanner.get()
expression = term { addingOperator term }
Recursive Descent Parsing
• Each method is responsible for a phrase in an expression
def term(self):
    self.factor()
    token = self.scanner.get()
    while token.getType() in (Token.MUL, Token.DIV):
        self.scanner.next()
        self.factor()
        token = self.scanner.get()
term = factor { multiplyingOperator factor }
Recursive Descent Parsing
primary = number | "(" expression ")"
def primary(self):
    token = self.scanner.get()
    if token.getType() == Token.INT:
        self.scanner.next()
    elif token.getType() == Token.L_PAR:
        self.scanner.next()
        self.expression()
        self.accept(self.scanner.get(), Token.R_PAR, "')' expected")
        self.scanner.next()
    else:
        self.fatalError(token, "bad primary")