JAMOOS
An Object-Oriented Language for Grammars
Yuri Tsoglin
Supervised by:
Dr. Yossi Gil
MODULE A
A X Y Z;
X “JAM”;
Y {“O” … }++
Z “S”;
END
A simple example
How do we write a grammar for a Pascal program in Yacc?

Program: program Name semicolon Decls Body;
Decls: Decls Decl | /* empty */ ;
…
Conditional: if Exp then Statement OptElse;
OptElse: else Statement | /* empty */ ;
…

Note: even this is not sufficient. We must also define in a lexical file that semicolon means “;”.
Problems with Yacc
• Lists and optional elements are defined in an unnatural way.
• Tokens must be defined in a separate file.
• All productions must have semantic features of the same type, which is defined separately.
• Error handling relies on a special error token.
• No support for a language library.
• No internal symbol table handling.
What do we have in JAMOOS
• Equivalence between programs and grammars.
• Lingual features:
– Class definitions in EBNF form
– Default fields
– Automatic field naming
– Type OK
– Error handling and error types
– Tree computation metaphor
– Modular definitions and generic modules
– Dictionaries
• Grammatical features:
– Extended BNF grammars
– Three predefined kinds of tokens
– Generic (parametrized) grammars
– Language embedding
– Improved parse error handling
– Internal symbol table handling
Class definitions
• Each class definition also defines a grammar production.
• Extended BNF: structures such as lists, optional components, and choices are represented as such.
• No need for a separate “lexical” file. All tokens are written as they are within the grammar definition.
The above Yacc example can be redefined this way:
Program program Name “;” {Decls …} Body;
…
Conditional if Exp then Statement [else Statement];
Class, Rule, Procedure
• Every definition in JAMOOS can be read and understood as all of the following:
– Rule (in a BNF)
– Class (as in OO)
– Procedure (as in imperative programming) with local variables, input, output, and input-output arguments.
• Example: A X Y;
– Rule: Symbol A can derive an X and a Y.
– Class: Class A has two components, X and Y.
– Procedure: Procedure A calls procedures X and Y.
• A definition has fields …
Classification of Fields
• Every field represents a value to be computed, a syntactic or semantic element, or a component of a class.
• Properties of a field:
_: Name: Type := Initializer
• Type: Almost always exists
– Could be a primitive type, a class, or a compound type.
• Name: Optional (automatic naming can be used)
• Initializer: Optional
• Perishability prefix: Optional
Kinds of Fields
| Field kind | Constructor argument? | Life time begin | Life time end | Has initializer? | Procedural equivalent |
| Component | YES | CTOR invocation | With object | NO | IN-OUT argument |
| Perishable | YES | CTOR invocation | CTOR return | NO | IN argument |
| Attribute | NO | Initialization by CTOR | With object | YES | OUT argument |
| Temporary | NO | Initialization by CTOR | CTOR return | YES | Local variable |
Detailed Example
Addition Expression “+” Expression
FEATURES
  value:INTEGER := [[
    return $Expression#1.value + $Expression#2.value ;
  ]]
END
This can be understood in the following three ways:
• Addition is a production of two Expressions with a “+” between them, having an integer semantic feature value whose value is computed using the given C++ code.
• Addition is a class consisting of three fields: two unnamed fields of type Expression and one of type INTEGER named value. The constructor of this type gets two parameters of type Expression, assigns their values to the first two fields, and assigns the result of the C++ computation to the third field.
• Addition is a procedure which gets two IN-OUT parameters of type Expression and assigns to a third, OUT parameter a value computed using the C++ code.
The Special Return Field
• Let us slightly change the above example:
Addition Expression “+” Expression
FEATURES
  value:INTEGER := [[
    return $1.value + $2.value ;
  ]]
• Why not write just $1+$2 like in Yacc?
• A field named return is a default field. To refer to it, its name can be omitted.
Addition Expression “+” Expression
FEATURES
  return:INTEGER := [[
    return $1 + $2 ;
  ]] -- assuming Expression also has a return field
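The class reading of the Addition definition can be sketched in plain C++. This is only an illustration of the idea, not the code JAMOOS generates: the struct `Expression`, its `value` member standing in for the default field, and the helper `add_values` are assumptions made for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for the JAMOOS class Expression: it carries only
// its default ("return") field, modelled here as an int member.
struct Expression {
    int value;
};

// Class reading of the Addition definition: two components and one attribute.
struct Addition {
    Expression left;   // first unnamed component ($1)
    Expression right;  // second unnamed component ($2)
    int value;         // attribute: computed by the constructor, never passed in

    // The constructor receives only the two components. The attribute is
    // initialized from them, which is the role of the [[ ... ]] initializer.
    Addition(Expression l, Expression r)
        : left(l), right(r), value(left.value + right.value) {}
};

// Hypothetical helper: construct an Addition node and read its attribute.
inline int add_values(int a, int b) {
    return Addition(Expression{a}, Expression{b}).value;
}
```

Calling `add_values(2, 3)` constructs one Addition node from two Expression nodes and yields 5. C++ initializes members in declaration order, so `value` can safely read `left` and `right`.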
Program program _:Name “;” decls:{ Decl … } Body
FEATURES
  num_vars: INTEGER := decls.num_vars;
END
VarsDeclaration var _:vars:{ (var_list:{ Name “,” … }+ “:” Type “;”) … }+
FEATURES
  variables: { (Variable Type) … }+ := [[
    for (int i=0; i<$@vars; i++)
      for (int j=0; j<$@vars[i].var_list; j++)
        ADD (vars [i].var_list [j] vars [i].Type);
  ]]
END
Notice how JAMOOS and C++ are mutually embedded.
Internal Classes
We have no methods!
• There are no methods as such.
• Internal classes can be used as methods.
• A constructor call for an internal class is like a method call.
• If a method needs local variables, these are fields of the internal class.
• The return field may be used to “return” only the necessary value.
Tree Computation
• An execution of a JAMOOS program is nothing but
– A nested chain of constructor calls, or
– An execution of a bottom-up or top-down parser, or
– A nested execution of procedures and functions.
• Each constructor call builds an object, which becomes a node in the abstract syntax tree.
• The constructor computes all the attributes by executing their initializers, in the order of their appearance in the definition.
• When parsing, constructor calls are made implicitly by the parser.
• At the start, the constructor of class Main is called.
Summary of the 3 Aspects of Definitions
| Grammatical aspect | OO aspect | Procedural aspect |
| Grammar production | Class | Procedure |
| Right-hand side components | Fields | Procedure arguments |
| Parsing | Constructor calls | Procedure calls |
| Semantic actions / features | Attributes | OUT arguments |
| Syntax / semantic errors | Error types | Exception throwing |
| Embedded languages | Modular definitions | Modular definitions |
| Generic grammars | Generic classes | Generic procedures |
| Default user action | Type OK | Imperative code |
| Tokens (terminals) | Primitive types | Primitive types |
| Variables (non-terminals) | Class names | Procedure names |
| Selection in right-hand side | Abstract class | --- |
| Semantic value of a symbol | Default field | Return value (of a function) |
Four kinds of compound types:
• List (similar to arrays)
• Optional (similar to pointers)
• Choice (as in C’s union or Pascal’s variant records)
• Sequence (as in C’s struct)
More examples:
CompoundStatement begin { Statement “;” … } end;
ForLoop for Var “:=” lower:Exp
  up OF to
  |
  down OF downto
  upper:Exp do Statement;
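For readers who think in C++, the four compound types have rough standard-library counterparts. The mapping below is an assumption made for illustration, not how JAMOOS represents them: it needs C++17, it uses `std::optional` where the slide compares optionals to pointers, and all type and helper names (`Exp`, `Statement`, `make_loop`, `iterations`) are hypothetical.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <variant>
#include <vector>

// Hypothetical placeholder classes for the grammar symbols used below.
struct Exp { int value; };
struct Statement { std::string text; };

// List: { Statement ";" ... } becomes a sequence container.
using StatementList = std::vector<Statement>;
struct CompoundStatement { StatementList statements; };

// Sequence: the right-hand side as a whole becomes a struct.
// Optional: [ else Statement ] becomes a member that may be absent.
struct Conditional {
    Exp condition;
    Statement then_branch;
    std::optional<Statement> else_branch;
};

// Choice: up OF to | down OF downto becomes a tagged union.
struct Up {};
struct Down {};
using Direction = std::variant<Up, Down>;

struct ForLoop {
    std::string var;
    Exp lower;
    Direction direction;
    Exp upper;
    Statement body;
};

// Hypothetical helper: build a loop over the variable "i".
inline ForLoop make_loop(int lower, bool ascending, int upper) {
    Direction direction = Up{};
    if (!ascending) direction = Down{};
    return ForLoop{"i", Exp{lower}, direction, Exp{upper}, Statement{"skip"}};
}

// Number of iterations of a Pascal-style for loop with inclusive bounds.
inline int iterations(const ForLoop& loop) {
    const int step = std::holds_alternative<Up>(loop.direction) ? 1 : -1;
    const int count = (loop.upper.value - loop.lower.value) * step + 1;
    return count > 0 ? count : 0;
}
```

With this sketch, `iterations(make_loop(1, true, 10))` and `iterations(make_loop(10, false, 1))` both give 10, while an ascending loop from 5 to 1 gives 0.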
Tokens
There are three types of tokens:
• Keyword - any sequence of letters and digits (beginning with a letter).
if begin abc345
• String - any quoted sequence of characters.
“(” “…” “A^”
• Regular expression.
<[A-Za-z]*> <abc(de)*>
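The three token kinds can be told apart by their spelling alone. The following is a minimal sketch of such a classifier, assuming ASCII double quotes around strings and angle brackets around regular expressions. The names `TokenKind` and `classify` are hypothetical, and the real JAMOOS scanner may well work differently.

```cpp
#include <cassert>
#include <cctype>
#include <string>

enum class TokenKind { Keyword, String, RegularExpression, Unknown };

// Classify the spelling of a token as it would appear in a grammar definition.
inline TokenKind classify(const std::string& s) {
    // String: any quoted sequence of characters.
    if (s.size() >= 2 && s.front() == '"' && s.back() == '"')
        return TokenKind::String;
    // Regular expression: written between angle brackets.
    if (s.size() >= 2 && s.front() == '<' && s.back() == '>')
        return TokenKind::RegularExpression;
    // Keyword: letters and digits, beginning with a letter.
    if (!s.empty() && std::isalpha(static_cast<unsigned char>(s.front()))) {
        for (unsigned char c : s) {
            if (!std::isalnum(c)) return TokenKind::Unknown;
        }
        return TokenKind::Keyword;
    }
    return TokenKind::Unknown;
}
```

Under these assumptions, `begin` and `abc345` classify as keywords, `"("` as a string, and `<[A-Za-z]*>` as a regular expression, while `345abc` is rejected because it begins with a digit.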
Primitive Types
• Tokens define objects of primitive types, by default STRING.
• JAMOOS primitive types are:
INTEGER
REAL
BOOLEAN
CHARACTER
STRING
OK
Unit Type
• The unit type is called OK.
• Used primarily to designate imperative code fragments (usually in C++).
• An expression of this type may appear at any place within a constructor argument list.
Program program _:Name “;” decls:{ Decl … } Body
FEATURES
  print_num_vars: OK := [[ cout << $decls.num_vars; ]]
END
Error Types
• Both syntax and semantic errors are handled using error types.
• An object of an error type can “legally” be in an illegal state.
Procedure Header? Body; -- Header can be illegal
VariableName Id
FEATURES
  type:Type? := … -- Type can be illegal
END
• A special case is type OK?, which can be used to define assertions.
• Errors are generated by the special ERROR command.
• Any object can be tested for being in an illegal state.
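One way to picture an error type such as Type? is a wrapper that either holds a legal value or records why it is illegal. The sketch below is an analogy in C++17, not the JAMOOS mechanism. `Fallible`, `Ok`, `check`, and `type_of` are hypothetical names, and the lookup rule inside `type_of` is invented for the example.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Analogue of an error type T?: the object exists either way,
// but it may "legally" be in an illegal state.
template <typename T>
class Fallible {
public:
    static Fallible legal(T value) {
        Fallible f;
        f.value_ = std::move(value);
        return f;
    }
    // Analogue of the ERROR command: produce an object in the illegal state.
    static Fallible error(std::string message) {
        Fallible f;
        f.message_ = std::move(message);
        return f;
    }
    // Any object can be tested for being in an illegal state.
    bool is_legal() const { return value_.has_value(); }
    const T& value() const {
        assert(is_legal());  // reading an illegal object is a programming error
        return *value_;
    }
    const std::string& message() const { return message_; }

private:
    Fallible() = default;
    std::optional<T> value_;
    std::string message_;
};

// Analogue of the unit type OK; Fallible<Ok> then plays the role of OK?.
struct Ok {};

// An assertion in the style of OK?: legal when the condition holds.
inline Fallible<Ok> check(bool condition, std::string message) {
    return condition ? Fallible<Ok>::legal(Ok{})
                     : Fallible<Ok>::error(std::move(message));
}

// Invented lookup for the example: only the identifier "i" is declared.
inline Fallible<std::string> type_of(const std::string& id) {
    if (id == "i") return Fallible<std::string>::legal("INTEGER");
    return Fallible<std::string>::error("undeclared identifier: " + id);
}
```

Here `type_of("i")` is legal and holds `INTEGER`, whereas `type_of("x")` is an object in the illegal state whose message can still be inspected.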
What about inheritance?
Abstract class: the right-hand side defines all the subclasses.
Statement Assignment | Loop | Conditional | Compound | ProcCall;
Loop ForLoop | WhileLoop | RepeatLoop;
Grammatically, this is just a selection element of EBNF.
• Fields of an abstract class can be inherited or overridden in a subclass.
• When overridden, a field can be made either a component or an attribute.
Field Inheritance
Loop StepLoop | CondLoop;
CondLoop WhileLoop | RepeatLoop
FEATURES
  cond: Expression;
END
WhileLoop while @cond do Statement;
RepeatLoop repeat { Statement “;” … } until @cond;
Dictionaries
• Each field can depend on the fields of the descendants (so-called “generated features”).
There are also “inherited features”!
• Symbol tables can help in most practical cases.
• In JAMOOS they are called dictionaries.
Dictionaries (cont.)
• A dictionary is a mapping from strings to some type.
• So, to define a dictionary, we define the type of its elements.
• There is a stack of dictionaries for each dictionary type.
• Three operations on a dictionary:
– INSERT (a_string, an_element)
– SEARCH (a_string) - only current dictionary
– FERRET (a_string) - search through stack
• A class can be assigned a dictionary; the dictionary will be pushed on the stack each time an object of that class is accessed.
Dictionaries (cont.)
Example:
DICTIONARY Identifiers;
…
ProcDecl procedure Name Identifiers “(” {Param “,”} “)”;
The place of Identifiers within the definition defines when the dictionary must be constructed.
In this case, the dictionary is constructed after procedure and Name are matched.
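The stack-of-dictionaries idea, with its three operations, can be sketched as a small C++17 class. This is a sketch under assumptions rather than the JAMOOS implementation: `DictionaryStack`, `push`, `pop`, and `example_scopes` are hypothetical names, and the stack is assumed to start with one global dictionary.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// A stack of dictionaries, each mapping strings to elements of type T.
template <typename T>
class DictionaryStack {
public:
    // Entering an object whose class owns a dictionary pushes a new one.
    void push() { scopes_.emplace_back(); }
    void pop() {
        assert(scopes_.size() > 1);  // the global dictionary stays
        scopes_.pop_back();
    }
    // INSERT: add an element to the current (topmost) dictionary.
    void insert(const std::string& key, T element) {
        scopes_.back().insert_or_assign(key, std::move(element));
    }
    // SEARCH: look only in the current dictionary.
    std::optional<T> search(const std::string& key) const {
        return find_in(scopes_.back(), key);
    }
    // FERRET: look through the whole stack, innermost dictionary first.
    std::optional<T> ferret(const std::string& key) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            if (auto found = find_in(*it, key)) return found;
        }
        return std::nullopt;
    }

private:
    using Scope = std::map<std::string, T>;
    static std::optional<T> find_in(const Scope& scope, const std::string& key) {
        auto it = scope.find(key);
        if (it == scope.end()) return std::nullopt;
        return it->second;
    }
    std::vector<Scope> scopes_ = std::vector<Scope>(1);  // global dictionary
};

// Hypothetical scenario: a global x, then a procedure that declares y.
inline DictionaryStack<std::string> example_scopes() {
    DictionaryStack<std::string> identifiers;
    identifiers.insert("x", "INTEGER");  // global dictionary
    identifiers.push();                  // entering a ProcDecl
    identifiers.insert("y", "REAL");     // procedure-local dictionary
    return identifiers;
}
```

Inside the procedure, SEARCH finds only `y`, while FERRET also reaches the global `x`. That difference is what distinguishes the two lookup operations.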
Modularity and Genericity
• Definitions can be modularized.
• Each module can have type parameters.
MODULE Expression (Op)
Expression Unary | Binary | Parenthesized;
Unary Op Expression;
Binary Expression Op Expression;
Parenthesized “(” Expression “)”;
END
Similar to templates! Each class in the module is a template class parametrized by Op.
Now, any other module can use this module.
PascalExp Expression (Op=PascalOp);
PascalOp Arith | Bool;
Arith plus OF “+” | minus OF “-” | mult OF “*” | div OF “/”;
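Since the slides compare generic modules to templates, the Expression module and its Pascal instantiation can be approximated with C++ class templates. The sketch below rests on several assumptions: the `Operand` leaf is not in the module shown and is added only so that a finite expression can be built, `PascalOp` covers just the Arith alternatives, and `show`, `spelling`, and `pascal_example` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// MODULE Expression (Op): every class below is a template over Op.
template <typename Op>
struct Expression {  // abstract class: Unary | Binary | Parenthesized
    virtual ~Expression() = default;
    virtual std::string show() const = 0;
};

template <typename Op>
using ExprPtr = std::unique_ptr<Expression<Op>>;

// Hypothetical leaf, NOT part of the module on the slide.
template <typename Op>
struct Operand : Expression<Op> {
    std::string name;
    explicit Operand(std::string n) : name(std::move(n)) {}
    std::string show() const override { return name; }
};

template <typename Op>
struct Unary : Expression<Op> {
    Op op;
    ExprPtr<Op> operand;
    Unary(Op o, ExprPtr<Op> e) : op(o), operand(std::move(e)) {}
    std::string show() const override { return op.spelling() + operand->show(); }
};

template <typename Op>
struct Binary : Expression<Op> {
    ExprPtr<Op> left;
    Op op;
    ExprPtr<Op> right;
    Binary(ExprPtr<Op> l, Op o, ExprPtr<Op> r)
        : left(std::move(l)), op(o), right(std::move(r)) {}
    std::string show() const override {
        return left->show() + " " + op.spelling() + " " + right->show();
    }
};

template <typename Op>
struct Parenthesized : Expression<Op> {
    ExprPtr<Op> inner;
    explicit Parenthesized(ExprPtr<Op> e) : inner(std::move(e)) {}
    std::string show() const override { return "(" + inner->show() + ")"; }
};

// The Arith alternatives of PascalOp; Bool is left out of this sketch.
struct PascalOp {
    enum Kind { plus, minus, mult, div } kind;
    std::string spelling() const {
        switch (kind) {
            case plus:  return "+";
            case minus: return "-";
            case mult:  return "*";
            case div:   return "/";
        }
        return "?";
    }
};

// Counterpart of instantiating the module with Op=PascalOp.
using PascalExp = Expression<PascalOp>;

// Hypothetical usage: build (a + b) * c and print it back.
inline std::string pascal_example() {
    using Op = PascalOp;
    auto sum = std::make_unique<Binary<Op>>(
        std::make_unique<Operand<Op>>("a"), Op{Op::plus},
        std::make_unique<Operand<Op>>("b"));
    Binary<Op> product(std::make_unique<Parenthesized<Op>>(std::move(sum)),
                       Op{Op::mult}, std::make_unique<Operand<Op>>("c"));
    return product.show();
}
```

The only requirement the templates place on `Op` is a `spelling()` member, so another language could reuse the same classes with its own operator type.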
Calls between grammars
• Sometimes, we need one parser to call another parser.
For example:
• A version of Pascal allowing embedded code in Assembler.
• The PARSE command is used to call another parser.
EmbeddedAssemler PARSE(“\””,Assembly,“\””);
EmbeddedCPP PARSE(“[[”,CPP,“]]”);
THE END