Scripting Languages, Course 3. Diana Trandabăț. Master in Computational Linguistics - 1st year, 2013-2014

Today’s lecture

• What is Perl?
• How to install Perl?
• How to write Perl programs?
• How to run a Perl program?
  – perl program.pl

• Scalars


About programming

• Working with algorithms
• A program needs to contain exact commands
  – (Mostly) not: Go buy some bread
  – But: Put on your coat and shoes, open the door, go through it, close the door, go down the stairs…
• A program has a certain input
• Processes it
• Produces a certain output


Why Perl?

• Perl: often expanded as “Practical Extraction and Report Language”
• Easy to learn
• Simple syntax
• Open source, available for different platforms: Unix, Mac, Windows
• Good at manipulating text
  – Good at dealing with regular expressions
• TMTOWTDI - “There’s more than one way to do it”
• Extremely popular for CGI programming; also used for GUI programming.


Getting started…

• For Windows: install ActivePerl http://www.activestate.com/Products/ActivePerl
• You may use your university account (putty), and then you don’t have to install anything.
• Most Linux distributions come with Perl. To find out if you have it installed already, open a terminal and type
    perl -v
  which should give you the version of Perl that you have installed on your computer.


Before you start using Perl…

· Make sure Perl exists, find out what version it is
  · perl -v
· How do I get help?
  · perldoc perl (general info, TOC)
  · perldoc perlop (operators like +, *)
  · perldoc perlfunc (functions like chomp: > 200!)
  · perldoc perlretut (regular expressions: /ABC/)
  · perldoc perlreref (regular expression reference)
  · perldoc -f chomp (what does the chomp function do?)
  · perldoc IO::File (find out about a Perl module)
· Type q to quit when viewing help pages, or press the space bar for the next page.


How to write a Perl program

• Perl programs can be written in any text editor
  – Notepad, vim, even Word…
  – Recommended: A simple text editor with syntax highlighting
• Write the program code
• Save the file as xxx.pl
  – The .pl extension is not necessary, but useful


What is a Perl program like?

    #!/usr/bin/perl -w
    # This *very* simple program prints "Hello World!"

    print "Hello World!";


What is a Perl program like?

• The first line (#!/usr/bin/perl) is needed on Linux if you want to run the script directly; it is not mandatory on Windows, but it does no harm, so you may leave it in your code.
• The -w option tells Perl to produce extra warning messages about potential dangers. This is similar to
    #!/usr/bin/perl
    use warnings;
• White space (mostly) doesn't matter in Perl.
• All Perl statements end in a semicolon ;

    #!/usr/bin/perl -w
    # This *very* simple program prints "Hello World!"

    print "Hello World!";


What is a Perl program like?

• The content of a line after the # is a comment. It is ignored by the program, with the exception of the line #!/usr/bin/perl
• What are comments for, then?
  – They are for you, and others who will have to read the code
  – Imagine looking at a complex program in a few months and trying to figure out what it does
• Write as many comments as you can

    #!/usr/bin/perl -w
    # This *very* simple program prints "Hello World!"

    print "Hello World!";


What is a Perl program like?

• The line print "Hello World!"; is a Perl command
  – In this case, for printing text on the screen
• Every command should start on a new line
  – Not a Perl requirement, but crucial for readability
• Every command should end with a semicolon;
• Many commands take arguments
  – Here: "Hello World!"

    #!/usr/bin/perl -w
    # This *very* simple program prints "Hello World!"

    print "Hello World!";


What to do with the program?

• Perl works from the command line
• Windows: Start → Run… → cmd
• Go to the directory where you saved the program
  – E.g.: cd C:\Perl\MyPrograms
• Run the program:
  – perl program.pl
• See the results of your labours!


Exercise

• Create a folder for your Perl programs
• Open the editor of your choice and write the "Hello World" program
  – The command is print "Hello World!";
  – Don't forget the comment!
• Save the program
• Run it!
• What happens if you misspell the print command?


More on the first program

· Perl is case sensitive!
  · print is not the same as Print
  · $bio is not the same as $Bio
· print is a function which prints to the screen
· print("Hi") is (usually) the same as print "Hi"
· Inside "double quotes", \n starts a new line, \t prints a tab
· A function is called with zero or more arguments
  · Arguments are separated by commas
  · print takes as many arguments as you give it

    print "";               # legal, prints nothing, not even \n
    print("Hi", "There");   # prints HiThere
    print(Hi);              # wrong: without quotes, Hi is a bareword, not the string "Hi"
    print(1+1, 2+2, "\n");  # prints 24 and a newline


Variables

• The "Hello World" program always has the same output
  – Not a very useful program, as such
• We need to be able to change the output
• Variables are objects that can hold different values


Variables

• Names in Perl:
  – Start with a letter
  – Contain letters, numbers, and underscores "_"
  – Case sensitive
• Three major types:
  – $ Scalars (single value)
  – @ Arrays (lists of values)
  – % Hash tables
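As a rough illustration of the three sigils, the sketch below declares one variable of each kind. The variable names and the values stored in them are made up for the example.

```perl
use strict;
use warnings;

my $language = "Perl";                           # scalar: a single value
my @languages = ("Perl", "Python", "Ruby");      # array: an ordered list of values
my %first_release = (Perl => 1987, Python => 1991);  # hash: values looked up by key

# A single element of an array or hash is a scalar, so it is written with $.
print "$language is number ", scalar(@languages), " minus two in the list\n";
print "First element: $languages[0]\n";
print "Perl appeared in $first_release{Perl}\n";
```

Note that one element of an array or a hash is itself a single value, which is why it is accessed with the $ sigil.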


Scalars

• Start with a dollar sign "$"
• Can be of type:
  – Integer
  – Floating point
  – String/text
  – Binary data
  – Reference (like a pointer)
• Perl is not a strongly typed language (there is no need to declare a variable beforehand)
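A minimal sketch of what "not strongly typed" means in practice: the same scalar can hold a number and later a string, and Perl converts between the two as needed. The variable names are chosen only for this example.

```perl
use strict;
use warnings;

# One scalar variable can hold an integer, a floating point number, or a string.
my $value = 42;           # integer
$value = 3.14;            # floating point
$value = "forty-two";     # string

# Perl converts between numbers and strings depending on the operator.
my $count = "10";                 # a string that looks like a number
my $total = $count + 5;           # + treats it as a number
my $label = $total . " items";    # . treats the number as a string

print "$label\n";
```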


Defining variables

• To define a variable, write a dollar sign followed by the variable’s name
  – Names should consist of letters, numbers and the underscore
  – They should start with a letter
  – Variable names are case-sensitive!
    • $a and $A are different variables!
  – Generally, a variable’s name should tell you what the variable does

    # We define a variable "a" and assign it a value of "42"
    $a = 42;


Defining variables

• Variables can be assigned values
  – String: text (character sequence) in quotes/double quotes
  – Numbers
    • $a = 42;
    • $a = "some text";

    # We define a variable "a" and assign it a value of "42"
    $a = 42;


Declaring Variables

• Variables can also be declared with my
  – Tell the program there's a variable with that name
  – my $value = 1;
  – Use my the first time you use a variable
  – You don't have to give a value (the default is undef, which behaves like "" or 0, but -w may warn)
• Avoid typos
  – use strict; will force you to declare all variables you use with my
  – Put this at the top of (almost) any program
  – Now Perl will complain if you use an undeclared variable
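The following sketch shows my and use strict together. The variable names are illustrative; the misspelled name appears only inside a comment, because with use strict it would stop the program from compiling.

```perl
use strict;
use warnings;

my $value = 1;     # declared and given a value
my $empty;         # declared without a value: it holds undef

print "value is $value\n";
print "empty is ", (defined $empty ? "defined" : "undefined"), "\n";

# With "use strict", a misspelled name such as $vaule would be reported
# as an error at compile time instead of silently creating a new variable.
```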


Changing variables

• Arithmetic operations
  – $a = 42 / 2; # division
  – $a = 42 + 5; # addition
  – $a = $b * 2; # multiplication
  – $a = $a - $b; # subtraction
• Also useful:
  – $a += 42; # the same as $a = $a + 42;
  – The same for -, *, /
• String operations
  – $a = "some" . " text"; # concatenation
  – $a = $a . " more text";
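A short sketch of the compound assignment operators, including .= for strings, which works the same way as += does for numbers. The starting values are arbitrary.

```perl
use strict;
use warnings;

my $n = 42;
$n += 8;      # same as $n = $n + 8   -> 50
$n -= 10;     # same as $n = $n - 10  -> 40
$n *= 2;      # same as $n = $n * 2   -> 80
$n /= 4;      # same as $n = $n / 4   -> 20

my $text = "some" . " text";    # concatenation
$text .= " and more text";      # same as $text = $text . " and more text"

print "$n\n$text\n";
```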


Data flow

• Unless you say otherwise:
  – Data comes in through STDIN (Standard In)
  – Data goes out through STDOUT (Standard Out)
  – Errors go to STDERR (Standard Error)
    • The most recent system error is held in a ‘magic’ variable $!


Basic output

• We have already seen an output command
  – print "text";
  – print $a;
  – print "text $a";
  – print "text " . ($a+$b) . " more text."; # parentheses needed: . and + have the same precedence
  – Special characters:
    • \n – new line
    • \t – tabulator
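Because the concatenation operator . and the addition operator + have the same precedence and are evaluated left to right, a sum inside a concatenation should be wrapped in parentheses. A small sketch, with example values chosen arbitrarily:

```perl
use strict;
use warnings;

my $x = 2;
my $y = 3;

# Parentheses make sure the sum is computed before it is joined to the text.
my $message = "Sum: " . ($x + $y) . " apples.\n";
print $message;

# Inside double quotes, variables are interpolated and \t and \n are special.
print "x\ty\n$x\t$y\n";
```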


Exercise

• Define a variable
• Assign it a value of 15
• Print it
• Double the value
• Print it again
• Define another variable with the string "apples"
• Print both variables
• Change the first variable to its square and the second to "pears"
• Print both variables


Basic input

• The <> operator returns input from the standard source (usually, the keyboard)
• Syntax:
  – $a = <>;
• Don’t forget to tell the user what he’s supposed to enter!
• Try the following program:

    # This program asks the user for his name and greets him
    print "What is your name? ";
    $name = <>;
    print "Hello $name!";


Input, output and new lines

• As the user input is followed by the [Enter] key, the string in $name ends in a new line
• The chomp function deletes the new line at the end of a string
• Try the following, modified program:

    # This program asks the user for his name and greets him
    print "What is your name? ";
    $name = <>;
    chomp($name);
    print "Hello $name!";
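To see the effect of chomp without typing anything, the sketch below starts from a string that already ends in a newline, which is what <> would return after the user presses Enter. The name is just sample data.

```perl
use strict;
use warnings;

# Simulates what <> returns when the user types Diana and presses Enter.
my $name = "Diana\n";

print "Hello $name!\n";    # the "!" lands on its own line

chomp($name);              # removes the trailing newline, if there is one
print "Hello $name!\n";    # now everything is on one line
```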


If, else

• Until now, the course the program runs is fixed
• The if clause allows us to take different actions in different circumstances

    # Let's try out a conditional clause
    print "Please enter password: ";
    $password = <>;
    if ($password == 42) {
        print "Correct password! Welcome.";
    } else {
        print "Wrong password! Access denied.";
    }


If, else

• Note: = is the assignment operator, == is the numeric comparison operator (strings are compared with eq)
• else is an optional clause that is triggered if the if condition fails

    # Let's try out a conditional clause
    print "Please enter password: ";
    $password = <>;
    if ($password == 42) {
        print "Correct password! Welcome.";
    } else {
        print "Wrong password! Access denied.";
    }
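One way to compare a typed password as text is to remove the newline and use eq instead of ==. This is only a sketch under that assumption; password_ok is a hypothetical helper written for this example, not something from the slides.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: compares the typed text with the
# expected password as strings, using eq.
sub password_ok {
    my ($typed, $expected) = @_;
    chomp $typed;                   # remove the trailing newline, if any
    return $typed eq $expected;
}

print password_ok("42\n", "42")    ? "Welcome.\n" : "Access denied.\n";
print password_ok("42abc\n", "42") ? "Welcome.\n" : "Access denied.\n";
```

With == the second input would also be accepted (with a warning under -w), because Perl only looks at the leading number when it converts "42abc" to a numeric value.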


Exercise

• Try out the password program.
  – Why doesn't it work correctly? Fix it.
  – Tell the user if the number he entered is too large or too small
    • Hint: The comparison operators you’ll need are < and >


While

• What if we want to do checks until something happens?
• The while loop repeats commands as long as its condition is true, that is, until its exit criterion is met
• Note: in the example below, $password has no value at first, so it specifically doesn’t have the value 42

    # Now on to a "while" loop
    while ($password != 42) {
        print "Access denied.\n";
        print "Please enter password: ";
        $password = <>;
        chomp($password);
    }
    print "Correct password! Welcome.";
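The same loop can be tried without a keyboard by taking the attempts from a list. This is only a sketch: the list of attempts and the counter are assumptions added for illustration.

```perl
use strict;
use warnings;

# Assumed for illustration: the attempts come from a list, not from <>.
my @attempts = (7, 13, 42, 99);
my $tries    = 0;
my $password = 0;    # start with a value that is not 42

# The body runs again and again as long as the condition is true.
while ($password != 42) {
    $password = shift @attempts;    # take the next attempt from the list
    $tries++;
}

print "Correct password after $tries tries.\n";
```

The loop stops as soon as the condition becomes false, so the last value in the list is never looked at.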


Exercise

• Write a small game: take a number, and make the user guess it. Tell him if it's too high or too low. If the user gets it right, the program terminates.
  – If you like, you can take a random number:

    $random = int(rand(10));


Filehandles

• A filehandle is a way to interact with input or output
  – ‘<>’ interacts with files on the command line
• Filehandle names are simple strings with no symbols
  – I usually use all caps (SEQFILE), but that isn’t necessary
• You must open your filehandle before using it


Reading files

• What if we want to have input from a file, not from the user?
• Open file for reading:
  – open(INPUT, "<file.ext");
    • This is default behavior, so you don’t actually need the ‘<’
• Read a line:
  – $line = <INPUT>;
  – $line = <>; # is just a special case


Writing files

• What if we want to print to a file, not to the screen?
• Open file for writing:
  – open(OUTPUT, ">file.ext"); # open new file
    • Warning: If the file already exists, it is overwritten!!
• Write:
  – print OUTPUT "Some text...";
• Appending:
  – open NAME, ">>filename"; # append to old file


Reading files

• Perl Magic! <>
  – Opens the file (or files) given as arguments on the command line
  – Brings in one line of data at a time

    open(INPUT, "<test.txt");
    while ($line = <INPUT>) {
        chomp $line;
        $line_id++;
        print "$line_id:\t$line\n";
    }
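The same line-numbering loop can also be written with the three-argument form of open, a lexical filehandle and an error check, which is a common style in more recent Perl code. The sketch below is self-contained: it first creates a small temporary file (its contents are sample data) and stores the numbered lines in an array before printing them.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small temporary file so the example runs on its own.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print $tmp_fh "first\nsecond\nthird\n";
close $tmp_fh;

# Three-argument open with a lexical filehandle ($in) and an error check.
open(my $in, '<', $tmp_name)
    or die "Couldn't open $tmp_name for reading: $!\n";

my $line_id = 0;
my @numbered;
while (my $line = <$in>) {
    chomp $line;                          # remove the newline
    $line_id++;                           # count the line
    push @numbered, "$line_id:\t$line";   # remember the numbered line
}
close $in;

print "$_\n" for @numbered;
```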


Filehandle

• Flexible coding
  – I want to specify the file to open on the command line, rather than hard coding it

    $in_name = shift;
    $out_name = shift;
    # Use "or", not "||": "||" binds more tightly than the comma, so die would never run
    open FILE, "<$in_name" or die "Couldn't open $in_name for reading: $!\n";
    open OUT, ">$out_name" or die "Couldn't open $out_name for writing: $!\n";
    while ($line = <FILE>) {
        chomp $line;
        print OUT "Something about $line\n";
    }
    close OUT;
    close FILE;

• Usage: perl myscript.pl inputfile outputfile
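The read-and-write pattern can be wrapped in a subroutine so that the file names are passed in as arguments. The sketch below is one possible arrangement, not the course's own script: annotate_file is a hypothetical helper, and temporary files stand in for the names that would normally come from the command line.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: copies $in_name to $out_name, prefixing each line,
# and returns the number of lines processed.
sub annotate_file {
    my ($in_name, $out_name) = @_;
    open(my $in,  '<', $in_name)  or die "Couldn't open $in_name for reading: $!\n";
    open(my $out, '>', $out_name) or die "Couldn't open $out_name for writing: $!\n";
    my $count = 0;
    while (my $line = <$in>) {
        chomp $line;
        print $out "Something about $line\n";
        $count++;
    }
    close $out;
    close $in;
    return $count;
}

# Demo with temporary files; in a real script the names would come from @ARGV.
my ($fh, $in_name) = tempfile(UNLINK => 1);
print $fh "alpha\nbeta\n";
close $fh;
my (undef, $out_name) = tempfile(UNLINK => 1);

my $lines = annotate_file($in_name, $out_name);
print "Processed $lines lines\n";
```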


Pipelining

• The STDOUT of one script can serve as the STDIN of another script.
  – Use the pipe (‘|’) symbol to chain scripts together
• Nothing goes to the screen in between scripts
  – Instead, what would normally go to the screen is redirected and made the STDIN of the next script


Exercise

• Make a text file and fill it with a Wikipedia article
  – Count the number of definite and indefinite articles (the and a)
  – Count the number of numbers and digits
  – Insert a <number!> tag before every number


Great!

See you next time!