Llama to Ram: How to think like a perl weenie
Rob Napier
3/18/99

3/18/99 Llama to Ram -- Rob Napier 2

Introduction

In this talk we will move from the basics of perl syntax and grammar (Llama) to the philosophy behind perl and the tools of the trade (Ram).

This talk will not cover many advanced perl topics, and in particular won't cover performance issues or advanced data structures.

Topics of Discussion

Background and philosophy
Perl basics
Tools of the trade
Pitfalls
Perl rules
Gotchas

Background and Philosophy

Quotes
Influences
O’Reilly Bestiary

Quotes

Practical Extraction and Report Language
Perl is a language for getting things done.
There’s more than one way to do it.

Major Influences

C
sh
regex
unix
LISP and COBOL

O’Reilly Bestiary

Learning - Llama
Programming - Camel
Cookbook - Ram
Advanced - Leopard/Panther
Also Perl/Tk, Nutshell, Win32, and others

Perl Basics

Auto-conversion (coercion)
The search for Truth
Safety nets
Data types

Auto-conversion (coercion)

Strings <=> Numbers
References => Strings
undef => Strings
Scalars => Lists
Lists => Scalar
The dreaded 1’s
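A minimal sketch of these conversions, assuming a modern perl with the `warnings` pragma; every variable name here is made up for the illustration. The operator, not the variable, decides whether a value is treated as a number, a string, a list, or a count.

```perl
use strict;
use warnings;

# Strings <=> numbers: the operator decides the interpretation.
my $sum    = "10" + 5;        # numeric addition: 15
my $joined = "10" . 5;        # string concatenation: "105"

# An array in scalar context yields its element count.
my @items = qw(a b c);
my $count = @items;           # 3
my ($first) = @items;         # list assignment takes the first element: "a"

# undef => string: becomes "" (and warns unless the warning is disabled).
my $nothing;
my $text = do { no warnings 'uninitialized'; "[" . $nothing . "]" };

# Reference => string: something of the form "ARRAY(0x...)".
my $ref_text = "" . \@items;

print "$sum $joined $count $first $text $ref_text\n";
```

The difference between `$count` and `($first)` is the one that bites most often: the parentheses on the left turn a count into a list assignment.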

The Search for Truth

False: "", "0", 0, undef, ()
True: 1, ref, "0 but true", and most anything else
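The truth rules can be checked directly. The sketch below uses a hypothetical helper named `truth` and a few extra strings ("00", "0.0") that are not on the slide, to show that only the exact string "0" and the empty string are false.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how perl evaluates a scalar in boolean context.
sub truth { return $_[0] ? "true" : "false" }

my %result = (
    empty_string  => truth(""),
    string_zero   => truth("0"),
    number_zero   => truth(0),
    undefined     => truth(undef),
    string_00     => truth("00"),          # not exactly "0", so true
    string_0_0    => truth("0.0"),         # also true as a string
    zero_but_true => truth("0 but true"),  # true, yet numerically 0
    reference     => truth([]),            # references are always true
);

# An empty array in boolean (scalar) context has zero elements: false.
my @empty = ();
$result{empty_list} = @empty ? "true" : "false";

printf "%-14s %s\n", $_, $result{$_} for sort keys %result;
```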

Safety Nets

-w
use strict
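As an illustration of what the safety nets buy you, the sketch below compiles a snippet containing a misspelled variable. It assumes a perl new enough to have the `warnings` pragma (the lexical counterpart of `-w`), and `$total` and `$totl` are made-up names.

```perl
use strict;
use warnings;   # lexical counterpart of the -w switch

# Compile a snippet with a misspelled variable name under strict.
my $ok = eval q{
    use strict;
    my $total = 1;
    $totl = 2;          # typo: this variable was never declared
    1;
};
my $error = $@;

# Without strict the typo would silently create a new global variable.
print "strict caught: $error" unless $ok;
```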

Data types

Scalars
Lists
Hashes
Filehandles
References

Scalars

Numbers
Strings
References
undef
Typeglob
Filehandle

$foo = "bar";

Lists

Heterogeneous
Both list-like and array-like, but usually list-like

@foo = qw(bar baz bang);
$bing = $foo[1];        # one element: "baz" (note the $ sigil, not @)
@bang = @foo[0, 2];     # slice: ("bar", "bang")

Hashes

Associative arrays
keys, values, each, delete, exists

%foo = (apple => "red", orange => "orange", foo => 2);
$foo{apple} = "orange";
@foo{'apple', 'orange'};    # hash slice (quote the keys when listing several)
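The hash functions named above can be sketched as follows. The `%color` hash and its contents are invented for the example.

```perl
use strict;
use warnings;

my %color = (apple => "red", orange => "orange", plum => "purple");

my @names = sort keys %color;                 # apple orange plum
my $pairs = keys %color;                      # scalar context: 3

# each returns one (key, value) pair per call until the hash is exhausted.
my %length;
while (my ($fruit, $shade) = each %color) {
    $length{$fruit} = length $shade;
}

# exists asks "is the key there?", which differs from "is the value defined?"
$color{kiwi} = undef;
my $has_kiwi     = exists $color{kiwi};       # true
my $kiwi_defined = defined $color{kiwi};      # false

my $removed = delete $color{plum};            # returns the deleted value
my @slice   = @color{qw(apple orange)};       # hash slice: ("red", "orange")
```

Hash order is not predictable, which is why the key list is sorted before use.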

Filehandles

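open(FOO, "foo") or die "Can't open foo: $!";
while (<FOO>) { ... }    # read lines from FOO
while (<>) { ... }       # read lines from the files named in @ARGV, or from STDIN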

Includes predefined STDOUT, STDIN, STDERR

Typeglobs

Entries in the symbol table
Not used very often except for references to filehandles

References

Hard
Symbolic

Hard References

Similar to C-style pointers.

$scalarref = \$foo;
$arrayref = \@ARGV;
$hashref = \%ENV;
$coderef = \&handler;
$globref = \*foo;

$scalarref = \1;
$arrayref = [1, 2, ['a', 'b', 'c']];
$hashref = {'Adam' => 'Eve', 'Clyde' => 'Bonnie'};
$coderef = sub { print "Boink!\n" };

Symbolic References

Indirect references to variables
These can be very dangerous (and aren’t allowed under ‘use strict’)

$$scalarref; @$arrayref; %$hashref;
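The dereferencing syntax is the same for both kinds of reference; what differs is what the scalar holds. A minimal sketch with invented variable names, showing hard references in use and `use strict` rejecting a plain string used as a reference:

```perl
use strict;
use warnings;

my @list = (1, 2, 3);
my %map  = (Adam => "Eve");

my $arrayref = \@list;
my $hashref  = \%map;
my $coderef  = sub { return "Boink!" };
my $nested   = [1, 2, ['a', 'b', 'c']];

my $count   = @$arrayref;          # whole array in scalar context: 3
my $second  = $arrayref->[1];      # one element: 2
my $partner = $hashref->{Adam};    # one value: "Eve"
my $noise   = $coderef->();        # call the sub: "Boink!"
my $deep    = $nested->[2][1];     # arrow is optional between subscripts: "b"

# A symbolic reference is just a variable name held in a string.
# Under strict, dereferencing it is a run-time error.
my $name = "list";
my $ok   = eval { my @copy = @$name; 1 };
my $why  = $@;
```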

Tools of the trade

Lists
Hashes
Regex
Subs
Modules

Lists

Usually used as lists, instead of arrays
qw()
Sets
Slurping

Lists, seldom arrays

Usually use foreach, rather than subscripting into arrays. Instead of:

for ($i = 0; $i <= $#list; $i++) {
    do_something($list[$i]);
}

Do this:

foreach $elem (@list) {
    do_something($elem);
}

qw()

Very good way to set lists of quoted words:

@foo = qw(this is a test);

Sets

@isect = @diff = @union = ();
foreach $e (@a, @b) { $count{$e}++ }
foreach $e (keys %count) {
    push(@union, $e);
    push @{ $count{$e} == 2 ? \@isect : \@diff }, $e;
}
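Here is the same idea as a self-contained run, with two made-up input lists. It assumes neither list contains duplicates of its own (otherwise a count of 2 would not mean "in both"), and `@diff` ends up holding the symmetric difference.

```perl
use strict;
use warnings;

my @a = qw(red green blue);
my @b = qw(green blue yellow);

my (%count, @union, @isect, @diff);
$count{$_}++ for @a, @b;              # tally how many lists hold each element

for my $e (sort keys %count) {        # sorted only to make the output stable
    push @union, $e;
    push @{ $count{$e} == 2 ? \@isect : \@diff }, $e;
}

print "union: @union\n";              # blue green red yellow
print "isect: @isect\n";              # blue green
print "diff:  @diff\n";               # red yellow
```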

Slurping

Often it’s handy to just slurp a whole file and work on it in memory:

open(FOO, "foo") or die "Can't open foo: $!";
@foo = <FOO>;

Hashes

Swiss-army knife of perl
Associative arrays
Records
“In list” applications

Associative Arrays

%foo = (apple  => "red",
        orange => "orange");
print $foo{apple};

Records

The hard way

@entry = getpwuid($<);
%user = (name    => $entry[0],
         passwd  => $entry[1],
         uid     => $entry[2],
         gid     => $entry[3],
         quota   => $entry[4],
         comment => $entry[5],
         […],
         expire  => $entry[9]);
print "name is $user{name}\n";

Records cont

The easy way (i.e. the perl way)

@fields = qw(name passwd uid gid quota comment gcos dir shell expire);
@user{@fields} = getpwuid $<;
print "name is $user{name}\n";
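The same hash-slice trick works for any source of ordered fields. The sketch below applies it to a made-up colon-separated record rather than `getpwuid`, so that it runs on systems without a password database; the record and field names are assumptions for the illustration.

```perl
use strict;
use warnings;

# Hypothetical colon-separated record, in the style of /etc/passwd.
my $line   = "jdoe:x:1001:100:Jane Doe:/home/jdoe:/bin/sh";
my @fields = qw(name passwd uid gid gcos dir shell);

my %user;
@user{@fields} = split /:/, $line;   # hash slice: one assignment fills every key

print "name is $user{name}, shell is $user{shell}\n";
```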

“In list” applications

Maintaining list order

sub unique {
    my (@list) = (@_);
    my %seen = ();    # Hash to keep track of what we've seen
    my $item;         # Current item
    my @uniq;         # Unique list
    foreach $item (@list) {
        push (@uniq, $item) unless $seen{$item}++;
    }
    return @uniq;
}

“In list” applications cont

Trashing list order

sub unique {
    my (@list) = (@_);
    my %uniq;
    @uniq{@list} = ();    # hash slice: every item becomes a key
    return keys %uniq;
}

Regex

Very useful for getting a lot of things done fast.

Rob Napier: 9408 Erinsbrook Drive, Raleigh, NC 27613 (919)848-9523

/(.*):\s*([^,]*),\s*([^,]*),\s*(\w+)\s+(\d+)\s+(?=\()(.*)/

$name = $1;

$address = $2;

$city = $3;

$state = $4;

$zip = $5;

$phone = $6;
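A runnable version of this pattern, applied to a made-up record in the same layout. Capturing in list context assigns all six fields in one step, and checking the result avoids reading stale `$1`..`$6` values when the line does not match.

```perl
use strict;
use warnings;

# Made-up record in the same "Name: street, city, ST zip (area)number" layout.
my $line = "Jane Doe: 123 Main Street, Springfield, IL 62701 (555)555-0100";

# In list context the match returns the captured groups in order.
my ($name, $address, $city, $state, $zip, $phone) =
    $line =~ /(.*):\s*([^,]*),\s*([^,]*),\s*(\w+)\s+(\d+)\s+(?=\()(.*)/
    or die "Line did not match: $line\n";

print "$name lives in $city, $state ($zip); call $phone\n";
```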

Subs

Passing non-scalars
Returning non-scalars
Named parameters

Passing non-scalars

Try to move non-scalar to the end
If you can’t, pass a reference

sub foo {
    my @a = @{shift()};
    my @b = @{shift()};
    print "@a\n";
    print "@b\n";
}

foo(\@bar, \@baz);

Returning non-scalars

If you can return it as a flat list (or hash), then just return it.

If you have multiple, distinct return values, return a list of references

sub foo {
    my @a = qw(this is a test);
    my @b = qw(this is a test);
    return (\@a, \@b);
}
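On the calling side, the references are unpacked with a list assignment and dereferenced as needed. A small sketch with a hypothetical `partition` sub:

```perl
use strict;
use warnings;

# Hypothetical helper: splits numbers into two separate lists.
sub partition {
    my @numbers = @_;
    my @even = grep { $_ % 2 == 0 } @numbers;
    my @odd  = grep { $_ % 2 != 0 } @numbers;
    return (\@even, \@odd);      # two references keep the lists distinct
}

my ($even, $odd) = partition(1 .. 6);
print "even: @$even\n";          # even: 2 4 6
print "odd:  @$odd\n";           # odd:  1 3 5
```

Returning `(@even, @odd)` instead would flatten both into one list, and the caller could no longer tell where one ends and the other begins.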

Named parameters

sub thefunc {
    my %args = (
        INCREMENT => '10s',
        FINISH    => 0,
        START     => 0,
        @_,    # argument pair list goes here
    );
    if ($args{INCREMENT} =~ /m$/ ) { ..... }
}

thefunc(INCREMENT => "20s", START => "+5m", FINISH => "+30m");
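The defaults work because later pairs in a hash initializer overwrite earlier ones with the same key. A minimal runnable sketch, using a hypothetical `schedule` sub in place of `thefunc`:

```perl
use strict;
use warnings;

# Hypothetical sub: later pairs in the list override earlier defaults.
sub schedule {
    my %args = (
        INCREMENT => '10s',
        START     => 0,
        FINISH    => 0,
        @_,                       # caller's pairs win over the defaults
    );
    return "from $args{START} to $args{FINISH} every $args{INCREMENT}";
}

my $default = schedule();
my $custom  = schedule(FINISH => "+30m", INCREMENT => "20s");

print "$default\n";               # from 0 to 0 every 10s
print "$custom\n";                # from 0 to +30m every 20s
```

Callers can pass the pairs in any order and leave out whatever they are happy to default.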

Modules

File::Path
File::Find
File::Copy
Exporter
Getopt
sendmail
CGI
Cwd

perl4 pitfalls

use strict! (and debatably also use -w)
local => my
chop => chomp
require => use
Avoid globals
Avoid typeglobs
Investigate complex data structures

sh pitfalls

use strict!
<< instead of multiple prints
0=false, 1=true except on system calls
Don’t over-fork. Most of what you want is in perl:
  rm/rm -rf, find, ls, echo, grep, awk, sed, pwd, mkdir, mkdir -p, chown, chgrp, cp, ln
Avoid globals
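To make "don't over-fork" concrete, here is a sketch that does the work of several of those commands with core modules and builtins instead of shelling out. It assumes a perl recent enough for File::Temp, lexical filehandles, and three-argument `open`, and it works only inside a scratch directory; the file and directory names are invented.

```perl
use strict;
use warnings;
use Cwd        qw(getcwd);
use File::Copy qw(copy);
use File::Find qw(find);
use File::Path qw(mkpath rmtree);
use File::Spec;
use File::Temp qw(tempdir);

# Work in a scratch directory so nothing real is touched.
my $top  = tempdir(CLEANUP => 1);
my $here = getcwd();                                      # pwd

my $dir = File::Spec->catdir($top, "a", "b");
mkpath($dir);                                             # mkdir -p

my $src = File::Spec->catfile($dir, "one.txt");
open(my $out, ">", $src) or die "Can't write $src: $!";
print $out "hello\n";                                     # echo hello > one.txt
close($out) or die "Can't close $src: $!";

my $dst = File::Spec->catfile($dir, "two.txt");
copy($src, $dst) or die "Can't copy $src: $!";            # cp

my @found;
find(sub { push @found, $_ if -f $_ && /\.txt$/ }, $top); # find -name '*.txt'
@found = sort @found;

unlink($src) or die "Can't remove $src: $!";              # rm
rmtree(File::Spec->catdir($top, "a"));                    # rm -rf
my $gone = !-e $dir;
```

Besides saving a process per call, each step reports failure through a return value and `$!` rather than through text that has to be parsed.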

sh pitfalls (cont)

Don’t store lists as strings
Avoid temp files
Avoid excessive chdir()

C pitfalls

Avoid subscripting lists that you’re iterating over
printf -> print
Don’t fear labels, especially for using ‘last’ and ‘next’
Don’t try to split a string character by character. Use regex.
Don’t overlook POSIX
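Loop labels are what make `last` and `next` useful in nested loops, where C would need a flag variable or a goto. A small sketch with made-up data:

```perl
use strict;
use warnings;

my @rows = ([1, 2, 3], [4, -1, 6], [7, 8, 9]);
my @kept;

ROW: foreach my $row (@rows) {
    foreach my $cell (@$row) {
        next ROW if $cell < 0;     # skip the whole row, not just this cell
        last ROW if $cell > 7;     # stop both loops at once
    }
    push @kept, "@$row";
}

print "kept: $_\n" for @kept;      # kept: 1 2 3
```

The second row is skipped because of the -1, and the third row ends the outer loop when 8 is reached, so only the first row is kept.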

General perl pitfalls

Generally you don’t need to add .pl onto script names.

Often readdir() is a better tool than glob()
tr/a-z/A-Z/ -> uc()

Perl rules

Always return()
Always check your system return codes and close() returns
use strict
$# => scalar (or scalar context)
  Some people may debate this one, but I find it helps a lot.
Before writing anything complex, always check CPAN (www.cpan.org)
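Two of these rules in code form, as a sketch: checking return values (remembering that `system` reports success as 0, the reverse of `open` and `close`), and preferring an element count over `$#`. The file path is a deliberately nonexistent placeholder, and `$^X` (the running perl) stands in for an external command so the example stays portable.

```perl
use strict;
use warnings;

# open() and close() return true on success, false on failure.
my $opened     = open(my $fh, "<", "/no/such/dir/no-such-file");
my $open_error = $opened ? "" : "$!";

# system() returns the wait status, so 0 means success.
my $status_ok  = system($^X, "-e", "exit 0");
my $status_bad = system($^X, "-e", "exit 3");
my $exit_code  = $status_bad >> 8;       # the high byte holds the exit code

system($^X, "-e", "exit 0") == 0
    or die "command failed with status $?\n";

# $#list is the last index; the array in scalar context is the count.
my @list       = qw(a b c);
my $last_index = $#list;                 # 2
my $count      = scalar(@list);          # 3
```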

Gotchas

BEGIN { use strict; }
There is no real function prototype mechanism (perl prototypes aren’t what you think)
<< EOF
  Watch out for spaces before the EOF
!= vs ne, == vs eq, + vs .
  number vs. string
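The number-versus-string operators are easy to mix up because perl converts the operands to suit the operator instead of complaining. A short sketch with literal values chosen for the illustration:

```perl
use strict;
use warnings;

my $num_equal = ("1.0" == 1);        # numeric comparison: true
my $str_equal = ("1.0" eq "1");      # string comparison: false
my $added     = "3" + "4";           # numeric addition: 7
my $joined    = "3" . "4";           # concatenation: "34"

# Comparing words with == converts both to 0, so any two words "match".
# (With warnings enabled perl reports this; it is silenced here on purpose.)
my $oops = do { no warnings 'numeric'; "apple" == "orange" };

print "apple == orange\n" if $oops;
```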

Gotchas cont

&&, ||, and, or
  && and || bind tightly. “and” and “or” bind loosely
  Generally, you use && and || in boolean logic, while “and” and “or” are used for “or die” type error checking.
print "@foo" vs print @foo
  "@foo" adds spaces. This includes inside HERE-docs
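Both gotchas can be seen in a few lines. This is a sketch: `lookup` is a hypothetical sub that finds nothing, used so that the difference in binding shows up in what gets assigned.

```perl
use strict;
use warnings;

my @foo = qw(a b c);
my $interpolated = "@foo";           # elements joined with spaces: "a b c"
my $plain        = join "", @foo;    # what print @foo emits by default: "abc"

sub lookup { return }                # hypothetical lookup that finds nothing

# || binds tighter than =, so it supplies a default value.
my $tight = lookup() || "default";   # $tight = (lookup() || "default")

# "or" binds looser than =, so the assignment happens first.
my $fell_through = 0;
my $loose = lookup() or $fell_through = 1;   # ($loose = lookup()) or (...)
```

After this runs, `$tight` holds "default", while `$loose` is still undefined and only `$fell_through` records that the lookup failed. That is why `open(...) or die` works and `||` is kept for expressions.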

Gotchas cont

`foo` or warn
  This only warns if `foo` returns no output. If `foo` prints anything, even an error message, there is no warning.
split (' ', ...)
  Splits on whitespace /[ \n\t]+/, not just ' '.
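The difference between the special `' '` separator and a literal single-space pattern shows up as soon as the input has leading or repeated whitespace. A sketch with a made-up input line:

```perl
use strict;
use warnings;

my $line = "  alpha  beta\tgamma\n";

# Special case: skips leading whitespace and splits on runs of whitespace.
my @awk_style = split ' ', $line;

# A real pattern: every single space is a separator, tabs are not.
my @one_space = split / /, $line;

print scalar(@awk_style), " fields: @awk_style\n";   # 3 fields: alpha beta gamma
print scalar(@one_space), " fields with / /\n";      # 5 fields with / /
```

With `/ /` the two leading spaces and the doubled space each produce an empty field, and "beta\tgamma\n" stays in one piece.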

Wrapup

Thinking like a perl weenie means working with the language instead of against it. Even though “there’s more than one way to do it,” many of those ways fail to make good use of the power that perl offers.

The best way to learn to think in perl is to keep trying to make working scripts more perl-like. The best solution is usually the shortest solution that is still readable.
