FTP_ File Transfer Using Perl - The Perl Journal, Autumn 1996

www.foo.be/docs/tpj/issues/vol1_3/tpj0103-0007.html

FTP: File Transfer Using Perl

Graham Barr

In my last article I showed you how to use Perl to create, process and send e-mail. I also gave a brief introduction to the Simple Mail Transfer Protocol (SMTP), one of the many protocols used across the Internet. In this article and those that follow I'll introduce some of the others.

In particular, this article will show you how to create an FTP (File Transfer Protocol) client. You've probably used ftp, the program that provides a user interface to the FTP protocol. The difference between the program and the protocol is subtle but important, because the program I'll develop in this column is also an interface to FTP.

It might not surprise you to hear that most of the work has already been done: a Perl 5 module, Net::FTP, interprets the FTP protocol for you. So you won't have to mess with the nuts and bolts of FTP (defined in RFC 959) to have your program send and receive files all by itself.

FTP is a client-server protocol. That is, there's a server which listens for connections on an agreed-upon port (FTP uses 21 by default). Once a connection is made, the server hands it to a new socket dedicated to that client, which leaves the listening socket on port 21 free to accept the connection from the next client. The client and server communicate conversationally, with the client sending commands defined in the FTP protocol to the server, and the server sending responses back to the client. This is the architecture for many well-known protocols on the Internet such as SMTP, NNTP, and HTTP.

Here's an example of a conversation between an FTP server and a client. It shows what communication is necessary to connect, login, change directory and retrieve a file. The commands sent from the client to the server begin with a keyword such as USER or RETR; the server's replies begin with a three-digit code.

    220 ftphost FTP server (SunOS 4.1) ready.
    USER anonymous
    331 Guest login ok, send ident as password.
    PASS [email protected]
    230 Guest login ok, access restrictions apply.
    CWD pub
    250 CWD command successful.
    PWD
    257 "/pub" is current directory.
    PORT 127,0,0,1,16,110
    200 PORT command successful.
    RETR testfile
    150 ASCII data connection for testfile (127.0.0.1,4206) (0 bytes).
    226 ASCII Transfer complete.
    QUIT
    221 Goodbye.
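To make the shape of that conversation concrete, here is a minimal sketch that separates server replies from client commands and labels each reply by the first digit of its code, using the reply classes from RFC 959. The subroutine name parse_reply() is made up for this illustration; it is not part of Net::FTP, which does this bookkeeping for you.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Meaning of the first digit of a reply code, as defined in RFC 959.
my %reply_class = (
    1 => 'positive preliminary',
    2 => 'positive completion',
    3 => 'positive intermediate',
    4 => 'transient negative',
    5 => 'permanent negative',
);

# Split one line of the control conversation into (code, text, class).
# Returns an empty list for client commands, which carry no reply code.
# parse_reply() is a hypothetical helper, not a Net::FTP method.
sub parse_reply {
    my ($line) = @_;
    return unless $line =~ /^([1-5])(\d\d)[ -](.*)$/;
    return ("$1$2", $3, $reply_class{$1});
}

for my $line ('331 Guest login ok, send ident as password.',
              'RETR testfile',
              '226 ASCII Transfer complete.') {
    my ($code, $text, $class) = parse_reply($line);
    print defined $code ? "$code ($class): $text\n" : "command: $line\n";
}
```

The 331 reply is "intermediate" because the server is waiting for the PASS command before it can complete the login.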

The FTP protocol actually uses two connections: one for the commands just shown, and one for the actual data transfer. The PORT command tells the server which socket address the client is listening on for the data connection. The server uses this information (4 IP octets and a 2-byte port address) to make the data connection.
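The arithmetic behind the PORT argument can be sketched in a few lines. The two subroutine names below are invented for this example; Net::FTP builds and sends the argument itself. The sample value is the one from the conversation shown earlier.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Decode "h1,h2,h3,h4,p1,p2" into a dotted host and a port number.
# (decode_port_arg and encode_port_arg are hypothetical helpers.)
sub decode_port_arg {
    my ($arg) = @_;
    my @n = split /,/, $arg;
    die "expected six numbers in '$arg'\n" unless @n == 6;
    my $host = join '.', @n[0 .. 3];
    my $port = $n[4] * 256 + $n[5];    # high byte first
    return ($host, $port);
}

# Encode a dotted host and a port number back into a PORT argument.
sub encode_port_arg {
    my ($host, $port) = @_;
    return join ',', split(/\./, $host), $port >> 8, $port & 0xff;
}

my ($host, $port) = decode_port_arg('127,0,0,1,16,110');
print "$host port $port\n";                   # 127.0.0.1 port 4206
print encode_port_arg($host, $port), "\n";    # 127,0,0,1,16,110
```

The decoded port, 4206, is the same number the server echoes in its 150 reply.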

You will see from the examples that Net::FTP simplifies this interface by keeping track of the status and providing methods for each of the commands.

My first program contacts a Comprehensive Perl Archive Network (CPAN) site and retrieves all modules that have been uploaded within a given number of days. First, initialization:

    #!/usr/bin/perl

    # Load the Net::FTP package
    use Net::FTP;
    use File::Listing qw(parse_dir);

    # Look for files under 7 days (in seconds)
    $age = 7*24*60*60;

    # Change this to the name of your nearest CPAN host
    $CPANhost = 'CPAN';

    # A likely path to the CPAN/modules directory
    $CPANpath = '/mirrors/CPAN/modules';

Now we need to construct a Net::FTP object which will talk to the remote server. The Net::FTP constructor takes, as arguments, the FTP hostname followed by some options:

Port: the port number (or name) for the remote host

Timeout: the initial timeout value, in seconds, for responses (defaults to 120)

Debug: the debug level

    # Create a new Net::FTP object, changing the
    # timeout to 60 seconds
    $ftp = Net::FTP->new($CPANhost, Timeout => 60)
        or die "Cannot contact $CPANhost: $!";

Once a connection has been made, the login() method must be called before any other methods. login() takes three optional arguments: login, password and account.

If no arguments are supplied, Net::FTP searches the .netrc file in your home directory (on UNIX machines). (Net::Netrc, which is what interprets the .netrc file, has only been tested on UNIX platforms.) If no login information is found, the login defaults to "anonymous."

When doing .netrc lookups, Net::FTP performs certain security checks, just like the ftp program. You must own the file, and nobody else should be able to read or write to it. If these checks fail, Net::FTP ignores your .netrc.
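The two checks described above can be approximated with stat(). This is only a sketch of the idea, assuming a UNIX-like system; it is not the code Net::Netrc actually runs, and netrc_is_private() is a name invented here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Return true if $path is owned by the current user and grants no
# permissions to group or others. A hypothetical stand-in for the
# checks described in the text, not Net::Netrc's own implementation.
sub netrc_is_private {
    my ($path) = @_;
    my @st = stat($path) or return 0;    # missing file: nothing to use
    return 0 if $st[4] != $<;            # owner must be the real uid
    return 0 if $st[2] & 077;            # any group/other bit is too open
    return 1;
}

# Try it on a scratch file rather than a real .netrc.
my ($fh, $file) = tempfile(UNLINK => 1);
chmod 0644, $file;
print "mode 0644: ", (netrc_is_private($file) ? "used" : "ignored"), "\n";
chmod 0600, $file;
print "mode 0600: ", (netrc_is_private($file) ? "used" : "ignored"), "\n";
```

If your own .netrc is being skipped, `chmod 600 ~/.netrc` is the usual remedy.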

If no password is given and the login is 'anonymous' then Net::FTP guesses your e-mail address and sends it as the password.

The third argument is account information which might be required by the FTP server. For anonymous FTP it's unnecessary.

    # We'll login to the ftp server as anonymous;
    # specifying a login id prevents a .netrc lookup.
    $ftp->login('anonymous')
        or die "Can't login ($CPANhost):" . $ftp->message;

O.K. - so the server has accepted us. Little does it know that we aren't mere surfers! First we need to change directory to the root of the CPAN modules and retrieve a recursive directory listing. (Recursive directory listings take a long time on large filesystems. That can annoy FTP site maintainers, so only do this when necessary.) By changing directories first, we reduce the size of the listing and therefore the time required to transmit it.

    # Change the working directory
    $ftp->cwd($CPANpath)
        or die "Can't change directory ($CPANhost):" . $ftp->message;

    # Retrieve a recursive directory listing
    @ls = $ftp->ls('-lR');

Before we start to transfer the files we need to tell the FTP server what type of file we're expecting. Different machines store files in different ways - what a wonderful world we live in. That's why FTP supports multiple transfer modes:

ASCII: Data is transferred as 8-bit bytes with <CRLF> denoting end-of-line. This is the default mode.

EBCDIC: This type is intended for transfer between the few hosts which still use EBCDIC instead of ASCII.

IMAGE: The data are sent as contiguous bits which, for transfer, are packed into the 8-bit transfer bytes. The receiving site must store the data as contiguous bits. Also called BINARY.

LOCAL: The data is transferred in logical bytes of a size chosen by the client.

However, only two of these are supported by Net::FTP: ASCII and IMAGE. In binary (IMAGE) mode the files are transferred as is, but in ASCII mode some translations, such as <CRLF> to <NL>, can be performed.
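As an illustration of the kind of translation ASCII mode implies, the sketch below converts between network line endings (CRLF) and the local newline. It assumes a system where "\n" is a single linefeed byte, and the two subroutine names are invented for the example; Net::FTP performs its own translation internally.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Convert CRLF line endings from the wire to the local newline, as an
# ASCII-mode receiver would. (Hypothetical helper, not a Net::FTP call.)
sub from_network {
    my ($data) = @_;
    $data =~ s/\015\012/\n/g;
    return $data;
}

# Convert local newlines to CRLF, as an ASCII-mode sender would.
sub to_network {
    my ($data) = @_;
    $data =~ s/\n/\015\012/g;
    return $data;
}

my $wire  = "line one\015\012line two\015\012";
my $local = from_network($wire);
printf "%d bytes on the wire, %d bytes locally\n",
    length($wire), length($local);
```

The byte counts differ, which is exactly why a tarball or other binary file must be fetched in binary mode: any byte pair that happens to look like a line ending would otherwise be rewritten.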

    # We probably want binary,
    # although some files may be ASCII
    $ftp->binary();

Now we have a recursive directory listing in @ls and an FTP connection in $ftp. We use the parse_dir() subroutine in the File::Listing module to split our directory listing into its components. (File::Listing is available in the libwww distribution in the CPAN.)

From these components we can access the filename, the last time the file was written, and its type, which can be one of l, d, or f, representing links, directories, and files.

    foreach $file (parse_dir(\@ls)) {
        my($name, $type, $size, $mtime, $mode) = @$file;

        # We only want to process plain files,
        # we shall ignore symbolic links
        next unless $type eq 'f';

        # Check age of file against $age
        # $mtime is a UNIX time: seconds since 1 Jan 1970
        # $^T is the time this script started.
        if ($^T - $mtime < $age) {
            print "Retrieving ", $name, "\n";

            # Get the file from the ftp server
            $ftp->get($name)
                or warn "Couldn't get '$name', skipped: $!";
        }
    }

    # Close the connection to the FTP server.
    $ftp->quit
        or die "Couldn't close the connection cleanly: $!";

    # We're done!
    exit;
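The age test in that loop can be tried without a server. The sketch below feeds hand-made records, in the same five-field shape the loop unpacks, through the same type and age checks. The file names and the recent_files() helper are invented for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $age = 7 * 24 * 60 * 60;    # seven days, in seconds

# Made-up records in the shape used by the loop above:
# [name, type, size, mtime, mode]
my @listing = (
    [ 'by-module/Net/new.tar.gz', 'f', 1024, $^T - 2 * 86400,  0100644 ],
    [ 'by-module/Net/old.tar.gz', 'f', 2048, $^T - 30 * 86400, 0100644 ],
    [ 'by-module/Net',            'd', 512,  $^T - 86400,      0040755 ],
);

# Return the names of plain files modified less than $max_age seconds
# before the script started. (Hypothetical helper for this example.)
sub recent_files {
    my ($max_age, @entries) = @_;
    return map  { $_->[0] }
           grep { $_->[1] eq 'f' && $^T - $_->[3] < $max_age } @entries;
}

print "$_\n" for recent_files($age, @listing);
```

Only the two-day-old plain file is printed; the directory and the month-old file are filtered out.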

Before I go into more detail you'll need to know the four commands made available by FTP for retrieving and storing files:

RETR: Retrieve (get) a file from the server.

STOR: Store (put) a file on the server, overwriting it if it's already there.

STOU: Store (put) a file on the server by generating a unique name.

APPE: Append to a file on the server.
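As a rough guide to how these commands surface in the module, the sketch below pairs each one with the Net::FTP methods of the corresponding name as found in current releases of the module: a low-level method that hands back a data connection, and a higher-level one that moves a whole file. It only asks the class whether the methods exist, so no server is contacted. The pairing table is my own summary rather than something taken from this article.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Net::FTP;

# FTP command => [low-level method, whole-file method].
# The table is an illustrative summary, assuming a current Net::FTP.
my %methods = (
    RETR => [ 'retr', 'get' ],
    STOR => [ 'stor', 'put' ],
    STOU => [ 'stou', 'put_unique' ],
    APPE => [ 'appe', 'append' ],
);

for my $cmd (sort keys %methods) {
    my ($low, $high) = @{ $methods{$cmd} };
    my $status = (Net::FTP->can($low) && Net::FTP->can($high))
               ? 'available' : 'missing';
    print "$cmd: $low() and $high() are $status\n";
}
```

The first program in this article used get(); the programs that follow use retr(), stor() and stou() so that they can handle the data connections themselves.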

Now if you want to get adventurous and speed up transfer, you can use multiple FTP connections managed either by multiple processes or by a select() call. The latter is demonstrated below, with several Net::FTP objects, one per connection.

    #!/usr/bin/perl

    use Net::FTP;
    use File::Listing qw(parse_dir);

    # We'll need to open and write some files
    use FileHandle;

    # Look for files under 7 days (in seconds)
    $age = 7*24*60*60;

    # Change this to the name of your nearest CPAN host
    $CPANhost = 'CPAN';

    # The path to the CPAN/modules directory on most CPAN hosts
    $CPANpath = '/mirrors/CPAN/modules';

    # Create the initial connection
    $ftp = connection();

    # Retrieve a recursive directory listing
    @ls = $ftp->ls('-lR');

    # Set the transfer mode to binary
    $ftp->binary or die "Cannot set binary mode: $!";

    # Create a list of files we want to get
    @files = ();

    foreach $file (parse_dir(\@ls)) {
        my($name, $type, $size, $mtime, $mode) = @$file;

        # We only want to process plain files
        next unless $type eq 'f';

        # Compare the age of file to $age
        if ($^T - $mtime < $age) {
            push(@files, $name)
        }
    }

    # The maximum number of connections to make
    $max_connection = 4;
    $max_connection = @files if @files < $max_connection;

    # Create a list of connections. We already have one: $ftp.
    @ftp = ($ftp);

    for ($i = 1 ; $i < $max_connection ; $i++) {
        my $ftp = connection();
        $ftp->binary or die "Cannot set binary mode: $!";
        push(@ftp, $ftp);
    }

    print "Using ", scalar(@ftp), " connections,\n";
    print " to download ", scalar(@files), " files.\n";

    # Keep a list of data connections
    @data = ();

    # We'll start off with an empty file set.
    $fdset = "";

    # Prime the ftp servers with RETR commands
    while (@ftp && @files) {
        my $ftp = shift @ftp;
        my $file = shift @files;
        my($data, $fh) = init_xfer($ftp, $file);
        push(@data, [$data, $fh]);
    }

    # Close any unused connections
    while (@ftp) {
        my $ftp = shift @ftp;
        $ftp->close or warn "Can't close connection cleanly: $!";
    }

We now have several FTP data connections to the same server, each in charge of one file. To service all of these connections simultaneously, we need select() to tell us when there's data to be read. We loop for as long as there is data to read; on each iteration, up to 1024 bytes are read from any descriptor with data available. If an EOF is found, the descriptor is closed. If there are still more files to be retrieved, a new file is requested on the corresponding command socket. This creates another descriptor. If there are no more files to transfer, the command socket is closed - when the list of data descriptors is empty, we'll know the transfer is complete.
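If the bit-vector form of select() is unfamiliar, it can be explored locally before pointing it at a server. In the sketch below two pipes stand in for FTP data connections; readable() is a name invented for this example, and the one-second timeout is an arbitrary choice so the script cannot hang.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the handles from @handles that have data waiting, using the
# same vec()/select() technique as the download loop.
# (readable() is a hypothetical helper, not part of Net::FTP.)
sub readable {
    my ($timeout, @handles) = @_;

    # One bit per descriptor we want select() to watch.
    my $fdset = '';
    vec($fdset, fileno($_), 1) = 1 for @handles;

    # select() overwrites its argument, so hand it a copy.
    my $nfound = select(my $rout = $fdset, undef, undef, $timeout);
    die "select: $!" if $nfound == -1;

    return grep { vec($rout, fileno($_), 1) } @handles;
}

# Two pipes stand in for two data connections.
pipe(my $r1, my $w1) or die "pipe: $!";
pipe(my $r2, my $w2) or die "pipe: $!";

# Only the second pipe has anything to read.
syswrite($w2, "hello") or die "syswrite: $!";

for my $fh (readable(1, $r1, $r2)) {
    sysread($fh, my $buf, 1024);
    print "read '$buf' from descriptor ", fileno($fh), "\n";
}
```

The sketch uses sysread() rather than buffered reads, since mixing select() with buffered input can leave data sitting in a buffer that select() cannot see.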

    # Loop while we have connections. They'll be closed and
    # removed from @data when transfers finish and @files
    # is empty.
    while (@data) {
        $nfound = select($rout=$fdset, undef, undef, undef);
        next unless $nfound;
        die "select: $!" if ($nfound == -1);
        my @d = @data;

        # Empty @data, connections will be added back into @data
        # if they're still in use later.
        @data = ();

        foreach $con (@d) {
            my($data, $fh) = @$con;

            # Do we have data waiting on this connection?
            if (vec($rout, fileno($data), 1)) {
                my $buf = "";

                # Read some data. This may block if there's
                # less than 1024 bytes ready for reading. To
                # reduce the blocking time, use a smaller number.
                my $l = $data->read($buf, 1024);
                die "Error reading data: $!" if $l < 0;

                if ($l) {
                    # Write the data to the local file
                    syswrite($fh, $buf, $l);
                } else {
                    # The data transfer is complete, so we can
                    # close the data connection
                    my $ftp = finish_xfer($data, $fh);

                    # Reuse the FTP connection if there are
                    # files left to retrieve.
                    if (@files) {
                        my $file = shift @files;
                        @$con = init_xfer($ftp, $file);
                    } else {
                        # close the FTP connection and remove it
                        # from @data
                        $ftp->close
                            or warn "Can't close connection: $!";

                        # the connection is no longer in use
                        undef $con;
                    }
                }
            }

            # If the connection is still in use, return it to
            # @data
            push(@data, $con) if defined $con;
        }
    }

And finally, the three subroutines we've been using: connection(), init_xfer(), and finish_xfer().

    # Create a new connection to the ftp server
    sub connection {
        # Create a new Net::FTP object
        my $ftp = Net::FTP->new($CPANhost, Timeout => 60)
            or die "Can't contact $CPANhost: $!";

        # We shall login to the ftp server as anonymous;
        $ftp->login('anonymous')
            or die "Can't login ($CPANhost):" . $ftp->message;

        # Change the working directory
        $ftp->cwd($CPANpath)
            or die "Can't change directory ($CPANhost):" . $ftp->message;

        return $ftp;
    }

    # Initialize a file transfer
    sub init_xfer {
        my($ftp, $file) = @_;

        # Send the retr command, and get a file descriptor
        # for the socket
        my $data = $ftp->retr($file)
            or die "Can't retrieve file '$file': $!";

        # Store all files locally, in the current directory
        my ($path) = ($file =~ m!([^/]+)$!);

        # Open a filehandle to the local file
        my $fh = FileHandle->new($path, "w")
            or die "Cannot open file '$path': $!";
        print "Retrieving $file as $path ...\n";

        # Add data connection into fdset for select()
        vec($fdset, fileno($data), 1) = 1;

        return ($data, $fh);
    }

    # Cleanup after a file transfer has completed
    sub finish_xfer {
        my($data, $fh) = @_;

        # Get the ftp command object
        my $ftp = $data->cmd;

        # Remove data connection from fdset for select()
        vec($fdset, fileno($data), 1) = 0;

        # Close the data connection
        $data->close or warn "Cannot close data connection: $!";

        # Close the local file
        close($fh) or warn "Can't close filehandle: $!";

        return $ftp;
    }

As you can see, the whole problem becomes a lot more complex, fun, or obscure, depending on how twisted you are.

So far we've looked at transferring files to and from one server. But what if we have two remote servers and want to transfer a file from one to the other? FTP contains a powerful facility for doing this, but first let's consider the obvious solution.

You could transfer the remote file to the local filesystem and then transfer it to the other remote server. Better would be to connect to each of the servers simultaneously, and perform sequential reads and writes between them using the local machine as a waystation. The code for this is shown below.

    #!/usr/bin/perl

    use Net::FTP;

    # Create connections to both remote servers...
    $ftpf = Net::FTP->new('from') or die "Cannot connect to 'from': $!";
    $ftpd = Net::FTP->new('dest') or die "Cannot connect to 'dest': $!";

    # ...and login to them.
    $ftpf->login('anonymous') or die "Can't login to 'from'";
    $ftpd->login('anonymous') or die "Can't login to 'dest'";

    # Place both servers into the correct transfer mode.
    # In this case I'm using ASCII.
    $ftpf->ascii() && $ftpd->ascii()
        or die "Can't set ASCII mode: $!";

    # Send the RETR command to the source server
    # and obtain a file descriptor
    $ffile = '/pub/testfile';
    $fdf = $ftpf->retr($ffile)
        or die "Can't retrieve '$ffile': $!";

    # Send the STOR command to the destination server
    # and obtain a file descriptor
    $sfile = '/pub/outfile';
    $fdd = $ftpd->stor($sfile)
        or die "Cannot store '$sfile': $!";

    # Read and write the data between the two file descriptors
    while ($fdf->read($buf, 1024)) {
        $fdd->write($buf, length $buf);
    }

    # Close the two data connections, then the command connections.
    $fdf->close() && $fdd->close()
        or die "Can't close connections: $!";
    $ftpf->quit() && $ftpd->quit()
        or die "Can't quit ftp connections: $!";

While this is better than reading the whole file to the local filesystem and re-sending it, this process is still not as good as it could be. Consider the situation when the file in question is rather large, say over 10MB. It takes a long time to transfer just once, and here we're actually transferring it twice which could, potentially, double the transfer time. For those who pay by the minute, this could get expensive.

This is where the PASV ("passive") command comes in handy. Assuming that both of the remote servers can connect to one another, you can transfer the file directly:

    #!/usr/bin/perl

    use Net::FTP;

    # Create connections to both remote servers...
    $ftpf = Net::FTP->new('from') or die "Can't connect to 'from': $!";
    $ftpd = Net::FTP->new('dest') or die "Can't connect to 'dest': $!";

    # ...and login to them.
    $ftpf->login('anonymous') or die "Can't login to 'from'";
    $ftpd->login('anonymous') or die "Can't login to 'dest'";

    # Place both servers into the correct transfer mode.
    # In this case I'm using ASCII.
    $ftpf->ascii() && $ftpd->ascii()
        or die "Can't set ASCII mode: $!";

    # Send the PASV command to the destination server.
    # This returns a port address.
    $port = $ftpd->pasv
        or die "Can't put FTP host in passive mode: $!";

    # Send the port address to the source server so it
    # knows where to send the data.
    $ftpf->port($port) or die "Error sending port: $!";

    # Send the RETR and STOU commands to the servers
    $rfile = '/pub/testfile';
    $ftpf->retr($rfile) or $ftpf->ok
        or die "Can't retrieve '$rfile': $!";
    $sfile = '/pub/outfile';
    $ftpd->stou($sfile)
        or die "Can't store '$sfile': $!";

    # Wait for the transfer to complete
    $ftpd->pasv_wait($ftpf) or die "Transfer failed: $!";

    # No data connections were opened locally, so only the
    # command connections need to be closed.
    $ftpf->quit() && $ftpd->quit()
        or die "Can't quit ftp connections: $!";

After creating the connections, and placing them in the correct transfer mode, we send the destination server a PASV command. This tells the server, for the next command, that it should listen on a port for a connection instead of making the connection itself. The PASV command returns the port at which it is listening. We then send this information to the source server with a PORT command, which tells the server where to make the data connection for the next command. Once this is done we send the two commands, which start the transfer between the two servers, and wait for the transfer to complete.
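To see what is being passed from one server to the other, here is a sketch that pulls the address out of a PASV reply in the form RFC 959 gives for it, "227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)". The reply text, the address (taken from a range reserved for documentation) and the pasv_address() helper are all made up for the illustration; in the program above, the pasv() method hands back this string and port() sends it on.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Extract "h1,h2,h3,h4,p1,p2" from a 227 reply, or return nothing if
# the line is not a usable PASV reply. (Hypothetical helper.)
sub pasv_address {
    my ($reply) = @_;
    return unless $reply =~ /^227\b.*?(\d+(?:,\d+){5})/;
    return $1;
}

# A made-up reply from the destination server.
my $reply = '227 Entering Passive Mode (192,0,2,10,19,137)';
my $addr  = pasv_address($reply);

# This is the command the source server would then be sent.
print "PORT $addr\n";    # PORT 192,0,2,10,19,137
```

Because the same six-number format is used by both PASV replies and PORT commands, the string can be relayed unchanged.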

The programs in this article are available on CPAN at modules/by-author/id/GBARR/ftp_eg.tar.gz and on the TPJ web site.
