CS 330 Programming Languages 09 / 28 / 2006 Instructor: Michael Eckmann



Michael Eckmann - Skidmore College - CS 330 - Fall 2006

Today's Topics
• Questions / comments?
• Chapter 4
  – Parsers
    • Bottom Up
• Perl
• Let's meet in the Linux Lab - Har 207 for Tuesday's class October 3rd.

Bottom Up parsers

• Bottom-up parsers
– Given a right sentential form, α, determine what substring of α is the right-hand side of the rule in the grammar that must be reduced to produce the previous sentential form in the rightmost derivation.
– This substring is called the handle.
– This is backwards from the way top-down parsing works.
– The most common bottom-up parsing algorithms are in the LR family (L = left-to-right scan of input, R = rightmost derivation, traced in reverse).

Bottom Up parsers

• Bottom-up parsers
– The whole process is: starting with the input sentence (all terminal symbols; this is also the last sentential form in a derivation), produce the sequence of sentential forms (going up) until all that remains is the start symbol.
– One step in the process of bottom-up parsing is: given a right sentential form, find the correct RHS to reduce to get the previous right sentential form in the derivation.
– What are we reducing a RHS to?

Bottom Up parsers

Example:
E -> E + T | T
T -> T * F | F
F -> ( E ) | id

E => [E + T]
  => E + [T * F]
  => E + T * [id]
  => E + [F] * id
  => E + [id] * id
  => [T] + id * id
  => [F] + id * id
  => [id] + id * id

Read from the bottom going up.

Remember: we start with the sentence (in this case id + id * id).

The bracketed symbols in each line are the ones that get reduced.

Bottom Up parsers

Example:

E -> E + T | T

T -> T * F | F

F -> ( E ) | id

E => E + T

=> E + T * F

=> E + T * id

=> E + F * id

=> E + id * id

=> T + id * id

=> F + id * id

=> id + id * id

• Note: Sometimes it is not obvious which RHS to reduce in each step. For instance, the right-sentential form E + T * id includes more than one RHS. Which RHSs are included in this sentential form?

• And why can we not choose the others?

Bottom Up parsers

• The correct choice of the RHS to reduce in any given step is called the handle (of that right-sentential form).

• The text contains intuition on how to find the handle; we won't cover that here. Just know that, for an unambiguous grammar, the handle of any right-sentential form is unique.

• To decide what part of a right sentential form is the handle, one should understand what a phrase is and what a simple phrase is. The handle is then defined to be the leftmost simple phrase within that right sentential form.

Bottom Up parsers

• Most bottom-up parsers are LR --- what is LR again?

• They use small programs and a parsing table.

Bottom Up parsers

• LR parsers consist of
– A parse stack
– An input string (the sentence to be checked for syntactic correctness)
– A parse table created from the grammar beforehand. Its rows are states, and its columns are terminal and nonterminal symbols.
– So, given an input symbol and a state, the program will look up in the table what to do.

arithmetic expression grammar

• 1. E -> E + T
• 2. E -> T
• 3. T -> T * F
• 4. T -> F
• 5. F -> ( E )
• 6. F -> id

• For the next slide, R means reduce and S means shift. E.g. R4 means reduce using production 4 above; S6 means shift the next symbol of input onto the stack and push state 6 onto the stack.

LR parsing table for arithmetic expression grammar

ACTION & GOTO

• The GOTO portion of the table is used to determine which state to push onto the stack after a reduction.

• The ACTION portion of the table is used to determine whether to shift or reduce based on the current state and next input symbol.

• Blank cells in the table imply syntax errors.
• accept means the sentence is syntactically correct.
• The stack starts with only state 0 on it.
• The input string is the complete sentence followed by a termination symbol, usually a $.

LR Parser structure

• The top of the stack is to the right; the next input symbol is the leftmost symbol of the remaining input. The S's are state numbers and the X's are grammar symbols.

Let's go through an example parse

• Before we go through an example parse, a few things should be stated.

• Recall that a shift pushes the next input symbol on the stack and then pushes the specified state on the stack as well. That's pretty straightforward.

Let's go through an example parse

• A reduce is more complex. When a reduce occurs,
• 1. the handle (the whole RHS) is popped off the stack (along with the state paired with each symbol in the handle)
• 2. then the LHS is pushed onto the stack
• 3. followed by another state pushed onto the stack. The state to be pushed is determined by the GOTO portion of the parse table.
– column is the LHS just pushed
– row is the state that was on the top of the stack after the handle and its associated states were popped.

Let's go through an example parse

• Now we're ready to go through an example. We'll determine if

id + id * id

is syntactically correct for the example grammar a few slides ago.

We'll do this by viewing the table on screen and I'll show the stack and input and action on the board.

Also, I'll write on the board what happens during a shift and a reduce.

Let's go through an example parse

• How about we determine if

( id ) id + id

is syntactically correct for the example grammar a few slides ago.

We'll do this by viewing the table on screen and I'll show the stack and input and action on the board.

Also, I'll write on the board what happens during a shift and a reduce.
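The shift and reduce steps described on the previous slides can be sketched as a small table-driven program. This is only a sketch under assumptions: the parsing table image is not reproduced in these notes, so the ACTION and GOTO entries below are the standard SLR table for this six-rule grammar as it usually appears in textbooks, and it may be numbered differently from the table shown in class. The names %action, %goto, @rules and lr_parse are made up for this illustration.

```perl
use strict;
use warnings;

# Productions numbered as on the grammar slide: [LHS, number of RHS symbols].
my @rules = (undef, ['E', 3], ['E', 1], ['T', 3], ['T', 1], ['F', 3], ['F', 1]);

# ACTION part (assumed standard table): state => { terminal => action }.
# sN = shift and push state N, rN = reduce by rule N, acc = accept.
# A missing entry is a blank cell, i.e. a syntax error.
my %action = (
    0  => { id => 's5', '(' => 's4' },
    1  => { '+' => 's6', '$' => 'acc' },
    2  => { '+' => 'r2', '*' => 's7', ')' => 'r2', '$' => 'r2' },
    3  => { '+' => 'r4', '*' => 'r4', ')' => 'r4', '$' => 'r4' },
    4  => { id => 's5', '(' => 's4' },
    5  => { '+' => 'r6', '*' => 'r6', ')' => 'r6', '$' => 'r6' },
    6  => { id => 's5', '(' => 's4' },
    7  => { id => 's5', '(' => 's4' },
    8  => { '+' => 's6', ')' => 's11' },
    9  => { '+' => 'r1', '*' => 's7', ')' => 'r1', '$' => 'r1' },
    10 => { '+' => 'r3', '*' => 'r3', ')' => 'r3', '$' => 'r3' },
    11 => { '+' => 'r5', '*' => 'r5', ')' => 'r5', '$' => 'r5' },
);

# GOTO part (assumed standard table): state => { nonterminal => next state }.
my %goto = (
    0 => { E => 1, T => 2, F => 3 },
    4 => { E => 8, T => 2, F => 3 },
    6 => { T => 9, F => 3 },
    7 => { F => 10 },
);

# Returns 1 if the token list is a sentence of the grammar, 0 otherwise.
sub lr_parse {
    my @input = (@_, '$');    # the sentence followed by the end marker
    my @stack = (0);          # the stack starts with only state 0 on it
    while (1) {
        my $state = $stack[-1];
        my $act   = $action{$state}{ $input[0] };
        return 0 unless defined $act;    # blank cell: syntax error
        return 1 if $act eq 'acc';
        my ($kind, $n) = $act =~ /^([sr])(\d+)$/;
        if ($kind eq 's') {
            # Shift: push the input symbol, then the new state.
            push @stack, shift(@input), $n;
        }
        else {
            # Reduce: pop the handle and its states, push the LHS,
            # then push the state found in GOTO[exposed state][LHS].
            my ($lhs, $len) = @{ $rules[$n] };
            splice @stack, -2 * $len;
            my $exposed = $stack[-1];
            push @stack, $lhs, $goto{$exposed}{$lhs};
        }
    }
}

for my $sentence ("id + id * id", "( id ) id + id") {
    my @tokens = split ' ', $sentence;
    printf "%-16s %s\n", $sentence, lr_parse(@tokens) ? "accepted" : "syntax error";
}
```

With this table the first sentence is accepted, and the second is rejected at the id that follows the closing parenthesis, because that cell of the table is blank.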

Perl

• You have two options:
• Get Perl on your own machine. Download Perl (version 5.003 or higher).
  – ActivePerl from ActiveState is recommended for Mac OS and Windows
  – Perl is installed by default under most Linux distributions
AND/OR
• Use the Linux machines in the new Linux lab.

Perl

• Perl (developed by Larry Wall and many others) is a language that has as its ancestors C, sed, awk and others. Sed is short for stream editor; awk is named for its authors, Aho, Weinberger and Kernighan (the K is for Kernighan, famous for his work on C).

• Sed and awk are good for pattern matching, editing and reporting on text files. Perl has these capabilities too. Some of Perl's syntax is C-like.

• PHP and JavaScript are “descendants” of Perl.

Perl

• How it's going to be “taught” in this course.
• We're going to dive in and see a lot of stuff so you get a general overview of the language quickly.
• We will go into more depth later, on several topics.
• I suggest you go through a tutorial with, and learn from, your peers.
• We will not learn all there is to know about Perl.

Perl

• Resources on the web
• www.perl.com
• A Perl tutorial
• http://www.comp.leeds.ac.uk/Perl/start.html
• Programming Perl book online
• http://www.unix.org.ua/orelly/perl/prog3/index.htm
• If that's not enough, you know how to use Google.
• I'll try to post some useful links on our course webpage.

Perl

• Variables need not be declared; they exist upon first use.

• If you happen to use a variable before it is assigned a value, then it is undefined (undef), which acts as 0, "", or false depending on its use.

• You can though, if you want to, declare variables with my or our (to be explained later).

• You can also do

use strict;

which provides error checking including disallowing the use of undeclared variables.
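A minimal sketch of these points, assuming a variable named $unset that is declared but never assigned (the variable names are made up for this illustration):

```perl
use strict;
use warnings;

my $unset;    # declared with my but never assigned, so its value is undef

my ($as_number, $as_string, $as_bool);
{
    # Silence the "uninitialized value" warnings for this demonstration only.
    no warnings 'uninitialized';
    $as_number = $unset + 0;                     # numeric use: acts as 0
    $as_string = $unset . "";                    # string use: acts as ""
    $as_bool   = $unset ? "true" : "false";      # boolean use: acts as false
}
print "number: $as_number, string: '$as_string', boolean: $as_bool\n";

# With strict in effect, using an undeclared variable is a compile-time error,
# so uncommenting the next line would stop the program before it runs:
# $undeclared = 5;
```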

Perl

– Scalars (hold a single value)
  • Names start with a $
– Arrays
  • Names start with an @
– Hashes (keyed lists aka associative arrays)
  • Names start with a %
– Subroutines
  • Names start with an &

Perl

• More about hashes
– They consist of pairs of data. The first in each pair is the key and the second is the value.
– Values can be returned based on their keys.
– We'll see an example in a couple of slides

Perl

• Example scalar assignment statements (and declarations). Scalars can hold strings or numerics (integers and floating point).
$age = 50;
$name = "Mike";

• Example array assignment statements (and declarations)
@grades_list = (100, 98, 43, 87, 92);
@people = ("Jerry Garcia", "Bobby Weir", "Phil Lesh");

• Example hash assignment statement (and declaration)
%course_names = ("CS106" => "Intro to CS 1",
                 "CS206" => "Intro to CS 2",
                 "CS330" => "Programming Languages");

Perl

• @people = ("Jerry Garcia", "Bobby Weir", "Phil Lesh");

• When referring to one element of an array, use $ because each element is a scalar and use the typical square brackets with index. Indices start at 0.

e.g. $people[0] would refer to the first element of the array.

• Oddly, scalars can be assigned from the whole array:

($lead, $rhythm, $bass) = @people;
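The indexing and list assignment above can be tried out as follows. This is a small sketch; the variables $first, $last and $count are added only for illustration.

```perl
use strict;
use warnings;

my @people = ("Jerry Garcia", "Bobby Weir", "Phil Lesh");

my $first = $people[0];       # one element is a scalar, so it is written with $
my $last  = $people[-1];      # a negative index counts from the end
my $count = scalar @people;   # an array in scalar context gives its length

# List assignment: each scalar on the left receives the element in the same position.
my ($lead, $rhythm, $bass) = @people;

print "$lead plays lead, $rhythm plays rhythm, $bass plays bass ($count people)\n";
```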

Perl

• Note: in hashes, a comma can be used instead of the =>. This is less readable, but valid syntax.

• %course_names = ("CS106" => "Intro to CS 1",
                   "CS206" => "Intro to CS 2",
                   "CS330" => "Programming Languages");

• Also, hashes can be used as lists (ignoring the key / value pair meaning). The list alternates keys and values: a key, then its value, then another key, then its value, and so on. Each key stays next to its value, but Perl does not guarantee which pair comes first.

• e.g. @course_info = %course_names;

• Here, $course_info[0] contains a key (for instance "CS106")

• and $course_info[1] contains the value for that key ("Intro to CS 1")

• etc.
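A short sketch of both points on this slide: the comma form of the hash, and a hash flattened into a list. The names %with_commas and %rebuilt are made up for the illustration, and the order in which the pairs print is not something to rely on.

```perl
use strict;
use warnings;

my %course_names = ("CS106" => "Intro to CS 1",
                    "CS206" => "Intro to CS 2",
                    "CS330" => "Programming Languages");

# The same hash written with plain commas instead of =>.
my %with_commas = ("CS106", "Intro to CS 1",
                   "CS206", "Intro to CS 2",
                   "CS330", "Programming Languages");

# Flattening: three pairs become a six-element list of alternating keys and values.
my @course_info = %course_names;

# Walk the list two elements at a time; the order of the pairs may vary between runs.
for (my $i = 0; $i < @course_info; $i += 2) {
    print "$course_info[$i] => $course_info[$i + 1]\n";
}

# Assigning the list back to a hash rebuilds the same key / value pairs.
my %rebuilt = @course_info;
```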

Perl

• %course_names = ("CS106" => "Intro to CS 1",
                   "CS206" => "Intro to CS 2",
                   "CS330" => "Programming Languages");

• To get a value out of a hash, you can use its key inside { }

• e.g. $name = $course_names{"CS330"};

• $name would contain the string "Programming Languages".
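Lookup by key can be sketched like this. The "CS999" entry is a made-up key used only to show a missing key and an insertion; exists, keys and sort are Perl built-ins.

```perl
use strict;
use warnings;

my %course_names = ("CS106" => "Intro to CS 1",
                    "CS206" => "Intro to CS 2",
                    "CS330" => "Programming Languages");

my $name = $course_names{"CS330"};    # look a value up by its key

# exists tests for a key without creating it ("CS999" is hypothetical).
my $has_cs999 = exists $course_names{"CS999"} ? 1 : 0;

# Assigning to a new key adds a pair to the hash.
$course_names{"CS999"} = "Hypothetical Course";

# keys returns the keys in no particular order, so sort them for stable output.
my @codes = sort keys %course_names;
foreach my $code (@codes) {
    print "$code: $course_names{$code}\n";
}
```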

Perl

• Output

• print "Hello, World!\n";

• $output_text = "Hello, World";

• print $output_text . "\n";

• print "$output_text\n"; # same as above because "" causes interpolation

• print '$output_text \n';

# would print the text literally: not the value of the variable, and no newline either

• print "\$output_text\n";

# would print $output_text literally, followed by a newline, because \ forces the $ to be printed so no variable is interpolated
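The three quoting cases above can be compared side by side. In this sketch the strings are stored in variables (names made up here) before printing, so each result can be inspected.

```perl
use strict;
use warnings;

my $output_text = "Hello, World";

my $interpolated = "$output_text\n";     # double quotes: variable and \n are both expanded
my $literal      = '$output_text \n';    # single quotes: every character is kept as typed
my $escaped      = "\$output_text\n";    # \$ keeps the dollar sign, \n is still a newline

print $interpolated;
print $literal, "\n";
print $escaped;
```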

Perl

• Operators

• For numerics:

+, -, *, /, %, ** (exponentiation)

• For strings:

. (concatenation),

x (repeats the string an integer number of times)

e.g.

$text = "Hey";

$text = $text x 3;

# result is HeyHeyHey

Perl

• Assignment operators (more than just the =)

• Here are some examples: *=, +=, ||=, x=, etc.

e.g.

$text = "Hey";

$text x= 3; # same as if we did $text = $text x 3;
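A few of the numeric, string and assignment operators from the last two slides, as a sketch with made-up variable names:

```perl
use strict;
use warnings;

my $power     = 2 ** 10;    # exponentiation: 1024
my $remainder = 17 % 5;     # modulus: 2

my $text     = "Hey";
my $greeting = $text . ", " . "you";    # concatenation: "Hey, you"
$text x= 3;                             # repetition-assignment: "HeyHeyHey"

my $name;                   # undef, which counts as false
$name ||= "anonymous";      # assigns only because $name was false

$power *= 2;                # same as $power = $power * 2, giving 2048

print "$power $remainder $greeting $text $name\n";
```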

Perl

• We covered:

– Printing

• Interpolation " " vs. literal text ' '

• \ escape character within " " to force a literal

– Operators (numeric and string) ., x, +, *, etc.

– Assignment operators *=, .=, ||=, etc.

– conditional operators (numeric and string)

• Next we'll cover:

– Logical &&, ||, !, and, or, not, xor (words vs. symbols different precedence)

– for, while, until, do-while, if, else, elsif

– File handling

– Regular expressions

– Assignments (chaining them --- b/c they return lvalues)

– chop, chomp

– subroutines

Perl

• Logical operators
• Both symbols and words
• &&, ||
• !
• and, or, not (also allowed but have lower precedence than their related symbols above.)
• Both forms of AND (&& and and) and OR (|| and or) are short circuit operators. That means that if the left operand determines the outcome of the whole condition, then the right operand is not evaluated.

Perl

• Short circuiting

• When using or or ||, if the left operand evaluates to true then the whole thing is true (because true or anything is true) so, the right side is not evaluated.

• When using and or &&, if the left operand evaluates to false then the whole thing is false (because false and anything is false) so, the right side is not evaluated.

• E.g. if ($x == 1 && $y < 0)

• # suppose $x had the value 0, it is unnecessary to evaluate the $y < 0 at all, because the condition will be false regardless.

Perl

• Short circuiting
• E.g. if ($z == 0 || $y < 0)
• # suppose $z had the value 0, it is unnecessary to evaluate the $y < 0 at all, because the condition will be true regardless.
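One way to see short circuiting happen is to make each operand record that it was evaluated. This is a sketch; the helper sub operand and the array @evaluated are hypothetical names invented for the demonstration.

```perl
use strict;
use warnings;

my @evaluated;    # records which operands actually ran

# Hypothetical helper: notes that it was called, then returns $value.
sub operand {
    my ($name, $value) = @_;
    push @evaluated, $name;
    return $value;
}

# Left operand is false, so && never evaluates the right operand.
my $and_result = operand("left-and", 0) && operand("right-and", 1);

# Left operand is true, so || never evaluates the right operand.
my $or_result = operand("left-or", 1) || operand("right-or", 0);

print "evaluated: @evaluated\n";    # only the two left operands are listed
```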

Perl

• Example of how Perl programmers use this short circuit feature to their advantage to make concise, readable code.

• open(FILEHAND, "<", $fname) or die "can't open file.\n";

• Because of the short circuit, the above works in the following way: the open function returns false if the file can't be opened. If that happens, the die function is called (see the or operator), which prints the error to STDERR and ends the program.

• If the file can be opened, then true is returned and the or part is not evaluated (executed).
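A sketch of the same idiom that can be run without ending the program: the open is wrapped in eval so the die can be observed. It assumes that no file named no_such_file_for_demo.txt exists in the current directory (the name is made up), and it uses a lexical file handle and the $! error variable, which go beyond what the slide shows.

```perl
use strict;
use warnings;

my $fname = "no_such_file_for_demo.txt";    # hypothetical name, assumed not to exist

# eval catches the die, so the script can report the failure and continue.
my $opened = eval {
    open(my $fh, "<", $fname) or die "can't open $fname: $!\n";
    close($fh);
    1;    # reached only when open succeeded
};
my $error = $opened ? "" : $@;    # $@ holds the message passed to die

print "open failed: $error" unless $opened;
```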

Perl

• open and die are two of Perl's built-in functions.
• STDIN, STDOUT, and STDERR are file handles that are automatically available and open in Perl programs. By default STDIN is the keyboard and the other two are the console.

• We'll come back to more about opening files later. Let's instead continue with our discussion of more operators.

Perl

• Conditional operators include:
• <, >, <=, >=, ==, != (for numeric comparison)
• lt, gt, le, ge, eq, ne (for string comparisons)
• <=> (numeric compare): a less-than symbol followed by an equals sign followed by a greater-than symbol, without spaces.
• cmp (string compare)

Perl

• These probably don't need much discussion:
• <, >, <=, >=, ==, != (for numeric comparison)
• lt, gt, le, ge, eq, ne (for string comparisons)

• But these two compare operators are interesting:
• <=> (numeric compare)
• cmp (string compare)
• The two compare operators above
  • Return -1 if left operand is less than right operand
  • Return 0 if left operand is equal to right operand
  • Return +1 if left operand is greater than right operand
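The three-way results are easiest to see with values where numeric and string order disagree, and they are what the built-in sort expects from its comparison block. A small sketch with made-up variable names:

```perl
use strict;
use warnings;

my $num_cmp = 2 <=> 10;        # -1: as numbers, 2 is less than 10
my $str_cmp = "2" cmp "10";    #  1: as strings, "2" comes after "1..."
my $equal   = 5 <=> 5;         #  0

my @grades = (100, 98, 43, 87, 92);

# sort calls the block with $a and $b and uses the -1 / 0 / +1 result.
my @by_number = sort { $a <=> $b } @grades;    # 43 87 92 98 100
my @by_string = sort { $a cmp $b } @grades;    # 100 43 87 92 98

print "numeric: @by_number\n";
print "string:  @by_string\n";
```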

Perl

• if, elsif, else structures
• (notice the odd spelling of elsif: there is no “e” after the “s”.)
• Why do you think?

• As expected, the elsif and else portions of the if structure are optional.

• You can have an if, followed by zero or more elsifs, followed by zero or one else.

• Also, there's an unless that can be used instead of the if (it can still take the elsif and else portions).

• unless reverses the test: its block runs when the condition is false, where the block of an if would run when the condition is true.

Perl

if ($count < 10)

{

# do something here

}

• Is the same as:

unless ($count >= 10)

{

# do something here

}

Perl (example of if/elsif/else)

if ($count < 10)
{
    # do something here
}
elsif ($count > 100)
{
    # do something here
}
else
{
    # do something here
}
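A runnable version of the same structure, as a sketch: the sub describe_count and the labels it returns are made up so that each branch does something visible.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a label for the branch that was taken.
sub describe_count {
    my ($count) = @_;
    if ($count < 10) {
        return "small";
    }
    elsif ($count > 100) {
        return "large";
    }
    else {
        return "medium";
    }
}

foreach my $count (5, 50, 500) {
    print "$count is ", describe_count($count), "\n";
}
```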

Perl

• Previous slides showed how to use if and unless on blocks of code.

• Interestingly, if, unless, while, until and foreach can all be used as modifiers to a simple statement.

• Examples:

print "Hello" unless $printing_is_off;

$total++ if $increase_total;
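The modifier forms can be sketched as below. The starting values are assumptions chosen for the demonstration, and the array @log stands in for printing so the effect can be checked.

```perl
use strict;
use warnings;

my $printing_is_off = 0;    # assumed setting for the demonstration
my $increase_total  = 1;    # assumed setting for the demonstration
my $total           = 0;
my @log;

push @log, "Hello" unless $printing_is_off;    # runs because the flag is false
$total++ if $increase_total;                   # $total becomes 1

$total += $_ foreach (1, 2, 3);                # adds 1, 2 and 3: $total becomes 7
$total *= 2 while $total < 50;                 # 14, 28, 56: stops once $total >= 50

print "@log, total is $total\n";
```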

Perl

• while, until, foreach and for are looping structures in Perl.

• while and for act as you’d expect from knowing C++ or Java.

• until executes its loop until the condition becomes true, whereas while executes its loop until the condition becomes false. Yeah it’s redundant. So much of Perl is.

Perl

• foreach works on list data (e.g. arrays.)

• Example:

foreach $element (@people)

{ print "$element is a person in the array\n"; }

# foreach iterates through all values of the array in the parens and uses the variable just after the word foreach to temporarily store the value. Then the code in the { }’s executes once for every element of the array.

Note: $element and @people are user-defined names (not special to Perl)
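A sketch of foreach under use strict, where the loop variable is declared with my. The second loop shows a detail the slide does not cover: the loop variable is an alias for each element, so changing it changes the array. The @grades values and the @lines array are made up for the illustration.

```perl
use strict;
use warnings;

my @people = ("Jerry Garcia", "Bobby Weir", "Phil Lesh");

my @lines;
foreach my $element (@people) {
    push @lines, "$element is a person in the array";
}
print "$_\n" foreach @lines;

# The loop variable aliases the current element, so this updates @grades in place.
my @grades = (100, 98, 43);
foreach my $grade (@grades) {
    $grade += 1;
}
print "@grades\n";
```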

Perl

open(INDATA, "<", "datafile.txt") or die "can't open datafile.txt";

Alternatively, one can combine the mode with the file name in one string: e.g. "<datafile.txt"

Modes:

< is input,

> is output (writing),

>> is append,

+< is read-write (assumes file exists) and

+> is read-write (creates the file if it does not exist, but empties it if it does).

Perl

Another thing to notice about Perl is that for calling the built-in functions we can use parentheses around the arguments or not use the parentheses.

e.g. open(FH, "<data.txt");

open FH, "<data.txt"; # both are valid and do the same thing.

But be careful with something like:

print ( 7 + 3 ) * 2; # this will cause the parens to enclose the args

# and therefore it would print 10 not 20.

We would want this instead: print (( 7 + 3 ) * 2);
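The parenthesis pitfall can be checked by capturing what print writes. This sketch redirects output into a string with an in-memory file handle and select, which are standard Perl features but not something covered on the slide; the variable names are made up. It also shows the unary plus form, another common way to keep the parentheses from being read as the argument list.

```perl
use strict;
use warnings;

# Send print's default output to a string so it can be inspected afterwards.
my $captured = "";
open(my $mem, ">", \$captured) or die "can't open in-memory file: $!\n";
my $old = select($mem);

{
    no warnings;          # Perl warns about the next line; silenced for the demo
    print (7 + 3) * 2;    # parsed as (print(7 + 3)) * 2, so it prints 10
}
print "\n";
print((7 + 3) * 2);       # outer parentheses enclose the whole expression: 20
print "\n";
print +(7 + 3) * 2;       # unary + stops ( from starting the argument list: 20
print "\n";

select($old);
close($mem);
print $captured;
```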

Perl

Use the angle brackets to read a line from a file handle.

Use print to write to a filehandle.

e.g. print FH "A line to be written to file\n";
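Writing, appending and reading can be put together in one sketch. It assumes a scratch file created with the core File::Temp module, so no real data file is touched, and it uses lexical file handles ($out, $app, $in) rather than the bareword handles on the slides.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file that is deleted when the program exits.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
close($tmp_fh);

# ">" opens for writing and empties the file first.
open(my $out, ">", $tmp_name) or die "can't open $tmp_name for writing: $!\n";
print $out "first line\n";
print $out "second line\n";
close($out);

# ">>" opens for appending, so earlier contents are kept.
open(my $app, ">>", $tmp_name) or die "can't open $tmp_name for appending: $!\n";
print $app "third line\n";
close($app);

# "<" opens for reading; angle brackets read from the handle.
open(my $in, "<", $tmp_name) or die "can't open $tmp_name for reading: $!\n";
my $first = <$in>;    # scalar context: reads one line
my @rest  = <$in>;    # list context: reads all remaining lines
close($in);

print $first, @rest;
```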

Perl

• When reading lines from STDIN or a file, the line will contain a \n at the end.

• It is often the case that you wish to get rid of it.

• Use chomp function to do this.

• $inline = <STDIN>;

• chomp($inline);

• Or

• chomp($inline = <STDIN>);

• chomp removes the end of record marker (a trailing newline by default) and returns the number of characters removed.

• chop is also a function. It removes the last character regardless of whether it is \n or not and returns that character.
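The difference between the two functions shows up when they are applied twice to the same string. In this sketch a string literal stands in for a line read from STDIN, and the variable names are made up.

```perl
use strict;
use warnings;

my $inline = "hello\n";    # stands in for a line read with <STDIN>

my $removed_first  = chomp($inline);    # removes the newline, returns 1
my $removed_second = chomp($inline);    # nothing left to remove, returns 0

my $chopped = chop($inline);            # removes and returns the last character, "o"

print "after chomp and chop: '$inline' (chop returned '$chopped')\n";
```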

Perl

• An interesting thing about return values:

• The assignment operators are interesting in that they return the variable on the LHS of the assignment as an lvalue. An lvalue is something that can have a value assigned to it.

• This allows chaining of assignments like:

• $num1 = $num2 = $num3 = 0; # 0 is assigned to num3, then num3 to num2 …

• And

• ($temp -= 32) *= 5/9;

• # the -= in parens returns the $temp as an lvalue which is assigned

• # a new value with the *= assignment.
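Both forms of chaining can be tried as below. This is a sketch: the starting temperature of 212 is an assumed Fahrenheit value chosen for the example, and the last line shows the same lvalue behavior in the chomp idiom from the previous slide.

```perl
use strict;
use warnings;

my ($num1, $num2, $num3);
$num1 = $num2 = $num3 = 0;    # assignments group right to left

my $temp = 212;               # assumed Fahrenheit reading
($temp -= 32) *= 5 / 9;       # subtract 32 from $temp, then scale that same variable
printf "%.1f degrees Celsius\n", $temp;

# The assignment inside chomp yields $clean as an lvalue, which chomp then trims.
chomp(my $clean = "some input\n");
print "'$clean'\n";
```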