IEEEJournal_The Internet - A Tutorial


    The Internet: a tutorial
    by J. Crowcroft

    The Internet is a world-wide packet-switched network that connects together well over 10 million computers in over 100 countries for the purpose of information sharing. This paper is a tutorial on Internet technology: how it works now and how it is changing as the network becomes more commercial. It covers the underlying design of the Internet, the connectionless service model, and the applications that run over the network: the World Wide Web and other popular systems that have made the network growth so explosive in recent years.

    1 The Internet: What kind of service?

    The Internet is the Information Superhighway; it has become a cliche to say so. However, before embarking on a drive around the World Wide Web, it is important to understand how the roads themselves work (and to understand who pays road tax).

    The Internet is undergoing a stormy adolescence as it moves from being a playground for academics to a commercial service. With more than 50% of the network commercially provided and more than 50% of the subscribers being businesses, the Internet is now a very different place from what it was in the 1980s. Growth has occurred most sharply in Europe and in the commercial sector in the last two years.

    It is normal to describe the Internet under two headings. The first is host software (applications and support); this forms what one might call the information services. The second is network support; this is the access and transfer technology used to move the information around.

    Hosts, networks and routers

    The components that make up the Internet, illustrated in Fig. 1, are threefold:

    Hosts: These are the workstations, PCs, servers and mainframes on which applications are run.

    Networks: These may be local-area nets (Ethernet LANs, for example), point-to-point leased lines or dial-up (telephone, ISDN, X.25) links that carry traffic between one computer and another, or wide-area networks (WANs) using switching technology such as SMDS (Switched Multimegabit Data Service) or ATM (asynchronous transfer mode).

    Routers: These glue together all the different network technologies to provide a ubiquitous service to deliver packets (a packet is a small unit of data convenient for computers to bundle up data for sending and receiving). Routers are usually just special purpose computers which are good at talking to network links. Some people use general purpose machines as low-performance (low-cost) routers, e.g. PCs or Unix boxes with multiple LAN cards or serial line cards or modems.

    Names, addresses and routes

    Every computer (host or router) in a well run part of the Internet has a name. The name is usually given to a device by its owner. Internet names are actually hierarchical and look rather like postal addresses. For example, the name of my computer is waffle.cs.ucl.ac.uk. I allocated it the name waffle; the department in which I work called itself CS (Computer Science); the university it is in called itself UCL (University College London); the academic community called themselves ac; and the Americans called us the UK. The name tells me what something is organisationally. The Internet calls this the Domain Name System.

    Everything in any part of the Internet that is to be reached must have an address. The address tells the computers in the Internet (hosts and routers) where something is topologically. Thus the address is hierarchical. My computer's address is 128.16.8.88. My university asked the IANA (Internet Assigned Numbers Authority) for a network number* and was given the number 128.16.x.y, in which we could fill in the x and y how we liked, to number the computers on our network. We divided our computers into groups on different LAN segments and numbered the segments 1-256 (x), and then the hosts 1-256 (y) on each segment. When an organisation asks for a number for its net, it is asked how many computers it has and is assigned a network number big enough to accommodate that number of computers. Nowadays, if you have a large network, you will be given a number of numbers!

    * The task of allocating numbers to sites in the Internet has now become so vast that it has been delegated to a number of organisations around the world. Ask your Internet provider where they get the numbers from if you are interested.

    Principal abbreviations

    DNS     Domain Name Service
    FTP     File Transfer Protocol
    GIF     Graphics Interchange Format
    HTML    HyperText Markup Language
    HTTP    HyperText Transfer Protocol
    IANA    Internet Assigned Numbers Authority
    IP      Internet Protocol
    LAN     local-area network
    MIME    Multipurpose Internet Mail Extensions
    NIC     Network Information Center
    RDP     Reliable Data Protocol
    RFC     Request for Comments
    SMTP    Simple Mail Transfer Protocol
    TCP     Transmission Control Protocol
    UDP     User Datagram Protocol
    WWW     World Wide Web

    ELECTRONICS & COMMUNICATION ENGINEERING JOURNAL JUNE 1996 113

    Authorized licensed use limited to: CADENCE DESIGN SYSTEMS. Downloaded on August 18,2010 at 05:01:17 UTC from IEEE Xplore. Restrictions apply.

    Fig. 1 A piece of the Internet

    Everything in the Internet must be reachable. The route to a host will traverse one or more networks. The easiest way to picture a route is by thinking how a letter to a friend in a foreign country gets there.

    You post the letter in a postbox. It is picked up by a postman (LAN) and taken to a sorting office (router). There, the sorter looks at the address and sees that the letter is for another country, and sends it to the sorting office for international mail. This then carries out a similar procedure. And so on, until the letter gets to its destination. If the letter was for the same network then it would get immediate local delivery. Notice that the routers (sorting offices) don't all have to know all the details about everywhere, just about the next hop to go to. Notice also that the routers (sorting offices) have to consult tables of where to go next (e.g. international sorting office). Routers chatter to each other all the time figuring out the best (or even just usable) routes to places.

    The way to picture this is to imagine a road system with a person standing at every intersection who is working for the Road Observance Brigade. This person (Rob) reads the road names of the roads meeting at the intersection and writes them down on a card, with the number 0 after each name. Every few minutes, Rob holds up the card to any neighbour standing down the road at the next intersection. If they are doing the same, Rob writes down their list of names, but adds 1 to the numbers read off the other card. After a while, Rob is now telling people about the neighbours' roads several roads away! Of course, Rob might get two ways to get somewhere! Then he crosses out the one with the larger number.

    Performance

    The Internet today moves packets around without due regard to any special priorities. The speed at which a packet goes once it starts to be transmitted is the speed of the wire (LAN, point-to-point link, dial-up link etc.) on the next hop. The range of communication technology speeds is illustrated in Fig. 2. However, if there are a lot of users, packets get held up inside routers (like letters in sorting offices at Christmas). Because the Internet is designed to be interactive, unlike the mail (even electronic mail) system with its slow turnaround, routers generally do not hang on to packets for very long. Instead, they just drop them on the floor when things get busy! This then means that hosts have to deal with a network that loses packets.
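The card-swapping procedure that Rob follows (the Road Observance Brigade picture above) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the intersection and road names are invented, two intersections are treated as neighbours when they share a road, and all cards are swapped in synchronised rounds rather than every few minutes. Real routing protocols have to deal with much more, such as failures and timeouts.

```python
def exchange_cards(roads_at, rounds):
    """Simulate Rob's procedure.

    roads_at maps a (hypothetical) intersection name to the set of roads
    meeting there. Returns, per intersection, a card mapping each known
    road to the smallest hop count seen so far.
    """
    # Each Rob starts by writing down the local roads with the number 0.
    cards = {here: {road: 0 for road in roads} for here, roads in roads_at.items()}

    # Assumption: intersections that share a road are neighbours.
    neighbours = {
        here: [other for other in roads_at
               if other != here and roads_at[here] & roads_at[other]]
        for here in roads_at
    }

    for _ in range(rounds):
        # Everyone holds up the card as it was at the start of the round.
        shown = {here: dict(card) for here, card in cards.items()}
        for here in cards:
            for other in neighbours[here]:
                for road, hops in shown[other].items():
                    # Add 1 to the neighbour's number; keep the smaller entry.
                    if hops + 1 < cards[here].get(road, float("inf")):
                        cards[here][road] = hops + 1
    return cards


# Three intersections in a line: A -- B -- C (names are made up).
roads_at = {
    "A": {"High St", "Mill Ln"},
    "B": {"Mill Ln", "Park Rd"},
    "C": {"Park Rd", "Quay St"},
}
cards = exchange_cards(roads_at, rounds=2)
print(cards["A"])
```

After two rounds, the Rob at intersection A has learned about a road that only meets intersection C, two hops away, which is the "several roads away" effect described in the text.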


    Hosts generally have conversations that last a little longer than a single packet: at least one packet in each direction, but usually several in each direction.

    In fact, it is worse than that. The network can automatically decide to change the routes it is using because of a problem somewhere. Then it is possible for a new route to appear that is better. Suddenly, all packets will follow the new route. But if there were already some packets half way along the old route, they may get to their destination after some of the later packets (a bit like people driving to a party and some smart late driver taking a short cut and overtaking the earlier leavers). So a host has to be prepared to put up with out of order packets, as well as lost packets.

    Protocols

    All this communication is done using standard languages to exchange blocks of data in packets, simply by putting envelopes or wrappers called headers round the packet. The work of routing and addressing is done by the Internet Protocol, or IP. The work of host communication is done by the Transmission Control Protocol, or TCP. TCP/IP is often used as the name for the Internet protocols, including some of the higher level information services. TCP does all the work to solve the problems of packet loss, corruption and reordering that the IP layer may have introduced, through a number of end to end reliability and error recovery mechanisms. If you like, you can think of IP as a bucket brigade and TCP as a drainpipe. So to send a block of data, a TCP header is added to it to protect it from wayward network events and an IP header to get it routed to the right place.

    Hosts and applications

    The emergence of some simple APIs (application programming interfaces) and GUIs (graphical user interfaces) has led to the rapid growth of new user-friendly applications in the Internet. Information services provided by archive and Web servers are accessible through WWW and Mosaic, Archie and Prospero, gopher, and WAIS and Z39.50.

    Fig. 2 Range of network performance (horizontal axis: geographic coverage, km; regions labelled computer busses, local-area networks, packet satellite, terrestrial)

    Fig. 3 shows a screendump of a workstation running Mosaic, a popular client program for accessing the Web servers around the world. In this picture are a window showing a map of the UK and two other pictures, both GIFs (graphics interchange formats) of photographs. The top right picture is of University College London. The lower right picture is one of the satellite images stored on Edinburgh University Computing Services' Web server every hour from a live satellite weather feed, which comes from one of the many weather satellites that send out unencrypted images periodically over the ether. Many sites simply point a satellite dish in the right direction and soak up the data for later dissemination on the terrestrial Internet.

    Fig. 3 Screendump of workstation running Mosaic

    The value of the service is now clearly becoming the value of the information rather than the communication channel. This means that some mechanism for charging and auditing access to information is a new requirement. This must be a secure mechanism to assure people that charges (or audit trails) are correct.

    Fig. 4 shows a screendump with three applications running. The top left is a Gopher client, the top right is Archie, and the bottom right is WAIS. From all of these interfaces, having found a piece of information, we can retrieve it, print it or mail it to other people.

    2 Information servers - Are you being served?

    We are familiar with ways to get information in the non-network world: we can go to a library, or buy a book in a bookshop; we can telephone companies or individuals by looking up their names in a phonebook; we can sit around and watch TV or listen to the radio. In the networked world, there are a number of ways of carrying out the same kind of activities.

    Different kinds of information services have different models of use and different ways of holding information. Almost all fit into the client/server model that has become widespread in distributed computing.

    Client/server communication is quite easy to understand in terms of roles and is very closely analogous to what happens in a shopping situation. An assistant in a shop awaits a customer. The assistant doesn't know in advance which customer might arrive (or even how many: the store manager is supposed to make sure that enough assistants are employed to just about cope with the maximum number of shoppers arriving at any one time). A server on the network is typically a dedicated computer that runs a program called the server. This awaits requests from the network, according to some specified protocol, and serves them, one or more at a time, without regard for who they come from.

    There are a variety of refinements of this model, such as requiring authentication or registration with the server before other kinds of transactions can be undertaken, but almost all the basic systems on the Internet work like this for now.

    Information servers can be categorised along a number of different axes:

    Synchronous versus asynchronous: Synchronous servers respond as you type/click at your computer. Asynchronous ones save up their answers and return them some time later. Sometimes, your system will actually not even send the request for information to the server until you have finished composing the whole request, or even later, to save time (and possibly money, since night-time network access may be cheaper and/or faster).

    Browsable versus searchable: Browsable servers allow you to move from one piece of information to another. Typically, the managers/keepers have structured the information with links, or else the information is hierarchical (like most organisations' job structures or payrolls). Searchable servers allow you to search for particular items by giving keywords. Usually, this means that the managers of the information have created indexes, although sometimes it just means that the server is running on a very fast computer that can search all through the data. This latter approach is becoming increasingly impractical as the quantity of information kept online grows to massive proportions.

    Distributed versus replicated: Distributed information servers hold only the information entered at their site and may have links for the user to follow to other servers at other sites. Replicated systems copy the information around at quiet times, so that all servers are replicas of each other. This means that it doesn't matter which server you access in the latter case, so you might as well go for the nearest or cheapest (likely to be both).

    Transport protocols

    The Internet provides a way to get packets (convenient units of data for computers and routers) from any host computer to one or more other host computers. However, the network protocols make no guarantees about delivering a packet. In fact, a packet may get lost, may arrive after others sent later or may be distorted. A packet might even arrive that simply wasn't sent!

    To counter this, host computers incorporate transport protocols, which use the Internet to carry the application information around but which also send a variety of other information to provide checking and correction or recovery from such errors.

    There is a spectrum of complexity in transport protocols, depending on the application requirements. Three representative ones are:

    (a) The User Datagram Protocol (UDP): UDP is a send and forget protocol. It provides just enough information at the start of each packet to tell what application is running and to check if the packet was distorted en route. UDP is used by applications that have no requirement for an answer, typically, and don't really care if the other end received the message. A typical example of this might be a server that announces the time on the network, unsolicited.

    (b) The Reliable Data Protocol (RDP): RDP is a generic name for a collection of protocols; the most relevant one here is that used by Prospero. RDP type protocols are similar to TCP (see below), but with reduced complexity at the start and end of a conversation, and with good support for sequences of exchanges of chunks of data, often known as remote procedure calls and sometimes incorrectly called transactions. The term transaction refers to an atomic, or uninterruptible, exchange of data and is usually much more expensive to implement than a simple remote procedure call.

    (c) The Transmission Control Protocol (TCP): TCP is the protocol module that provides reliability and safety. It is designed to cope with the whole gamut of network failures and adapts elegantly to the available resources in the network. It even tries to be fair to all users.
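To make the idea of recovering from lost, duplicated and out-of-order packets more concrete, here is a minimal sketch of a receiver that numbers its expectations and only hands data to the application in order. It is an illustration, not TCP's actual algorithm: the sequence numbers count whole packets (an assumption made for simplicity), and retransmission requests, checksums and timers are left out entirely.

```python
def deliver_in_order(arrivals, first_seq=0):
    """Reassemble packets that may arrive late, twice, or not at all.

    arrivals is an iterable of (sequence_number, data) pairs in the order
    the network happened to deliver them. Returns a tuple of:
      - the data delivered to the application, in sequence order,
      - the next sequence number still awaited,
      - the sequence numbers held back because an earlier packet is missing.
    """
    held = {}          # packets that arrived ahead of a gap
    delivered = []
    next_seq = first_seq

    for seq, data in arrivals:
        if seq < next_seq or seq in held:
            continue   # duplicate of something already seen: ignore it
        held[seq] = data
        # Release every packet that is now contiguous with what was delivered.
        while next_seq in held:
            delivered.append(held.pop(next_seq))
            next_seq += 1

    return delivered, next_seq, sorted(held)


# Packet 1 is overtaken by packet 2, packet 1 is duplicated, packet 3 is lost.
arrivals = [(0, "a"), (2, "c"), (1, "b"), (1, "b"), (4, "e")]
delivered, awaiting, held_back = deliver_in_order(arrivals)
print(delivered, awaiting, held_back)
```

In this run the application sees "a", "b", "c" in order, while packet 4 stays in the buffer until the missing packet 3 turns up. A send and forget protocol such as UDP does none of this bookkeeping, which is the trade-off the list above describes.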

    File Transfer Protocol (FTP)

    The original information service was called FTP. This is probably the world's least friendly information service. It operates really at the level of machine-machine communication and is used by more modern client programs just as a way of getting something from a server. However, it is still used as a sort of lowest common denominator means of access to a file on a remote computer.

    Internet FTP is interactive or synchronous, which means that you formulate your commands as you type at the terminal. FTP maintains a control connection between the client and server and sends commands over this in ASCII or ordinary text!
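As a small illustration of what "commands in ASCII" means on the control connection, the sketch below builds command lines and splits a reply into its parts. The details are not taken from this tutorial: the command verbs (RETR, QUIT), the CRLF line ending and the three-digit reply codes come from the FTP specification (RFC 959), and the file name is invented. No network connection is opened.

```python
def ftp_command(verb, argument=None):
    """Encode one control-connection command as an ASCII line ending in CRLF."""
    line = verb if argument is None else f"{verb} {argument}"
    return line.encode("ascii") + b"\r\n"


def parse_reply(raw):
    """Split a single-line server reply into (numeric code, text)."""
    line = raw.decode("ascii").rstrip("\r\n")
    code, _, text = line.partition(" ")
    return int(code), text


# What a client would write to the control connection to fetch a file
# (the file name is hypothetical) and then end the session.
fetch = ftp_command("RETR", "readme.txt")
goodbye = ftp_command("QUIT")

# A greeting of the kind a server sends when the control connection opens.
greeting = parse_reply(b"220 Service ready\r\n")
print(fetch, goodbye, greeting)
```

Because everything on the control connection is plain text, the same exchange can be typed by hand, which is part of why more modern client programs can use FTP simply as a way of getting something from a server.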

    Fig. 4 Screendump with three applications running


    When data is going to move, the client and server open a data connection. The data connection can keep going whilst the user issues further commands.

    Electronic mail

    E-mail is either the saviour of modern society, and trees, or the devil on the icing on the cake of technocracy. Electronic mail, at its simplest, is a replacement for paper letters (snail mail) or the facsimile. Sending e-mail (Fig. 5) is easy if you know the address of the person you want to get it to. You type in the message using whatever facility you are familiar with and then submit it to the mail system (put it in the postmaster's bag!). Then a series of automatic systems (message handlers) will sort and carry it to the destination, just like post offices and sorting offices do with paper mail.

    Fig. 5 Example of electronic mail

    The protocol used for electronic mail in the Internet is called the Simple Mail Transfer Protocol, or SMTP. The model is one of message handling systems and user agents all talking to each other, both using the same protocol. The user program invokes SMTP to send to a receiver, which may be a mail relay or actual recipient system.

    An SMTP mail address looks like this:

    J.Crowcroft@cs.ucl.ac.uk

    The general form of such an address is:

    User@Domain

    The Domain is as defined in the Domain Name Service (DNS) for the host implementing SMTP. The DNS name is translated to an IP address. The sending system merely opens a TCP connection to the site and then talks the SMTP protocol.

    Mail lists

    Some mail system managers use info-servers to maintain mail lists. Mail lists are ways of sending a message at one go to groups of people having a common interest or purpose. They resemble, but are completely different in implementation from, bulletin boards, which are discussed below. Mail lists are very useful when used discriminatingly. On the other hand, because it is as easy to send to a list as to an individual, sometimes users propagate junk mail to large groups of people. The most common piece of junk mail is to do with list management (e.g. 'please add me to this list' or 'please remove me from this list', which should be directed to list managers, usually as listname-request) but other human errors include sending irrelevant or offensive information.

    Bulletin boards

    Online bulletin boards are analogous to the pinboards we are all used to from offices and schools. There is a fundamental difference between a bboard and a mail list: mail arrives in an individual's mailbox, and the individual's attention is drawn to it; bulletins arrive on a bboard, and users decide whether or when they want to read that bboard, if at all.

    A less fundamental difference is the protocol. Bulletin boards are effectively a single mailbox. Thus the overhead of delivery in terms of computer storage is much lower for a bboard than for a mail list.

    Archie

    Archive servers appeared in the mid 1980s. Initially, they were a logical extension of FTP servers. They provide indexed repositories of files for retrieval through a simple protocol called Prospero.

    Archive servers had been in place manually for some time, simply as well maintained FTP servers. The first attempt at automating co-ordination was to use a simple protocol. This involved periodically exchanging a recursive directory listing of all the files present on a given server with all other known servers. Thereafter, access to a given server for a file present on another could have two results: either the client could be redirected to the right server, or the current server could fetch it and then return it to the client.

    These two approaches are called referral and chaining in some communities, or iteration and recursion in others. These ideas are discussed further below.

    Whois and Finger

    Whois is one of the oldest and simplest information servers in the Internet. It allows you to look up someone's e-mail address, and other information that a user may be happy to give away, simply by knowing their name (Fig. 6). Originally, it was a purely central server run on the ARPANET for all managers/contacts of networks attached to the ARPANET for the DCA (Defence Communications Agency) (RFC 954; see Bibliography).

    Basically, a Whois server runs on TCP port* 43 and awaits simple command lines (in ASCII text, ending with CRLF†). The server simply looks up the command line or name specification in a file (perhaps using fuzzy or soundex matching) and responds, possibly with multiple matches. Whois is for keeping organisation contact information.

    Note that each returned entry has a NIC handle to distinguish it (i.e. to act as a unique key). NIC is the Network Information Center, of which there are now many around the Internet.

    Fig. 6 Example of Whois output

    Finger: The Finger protocol derives from the document RFC 742. A Finger server runs on UDP or TCP port 79. It expects either a null string, in which case a list of all people using the system is returned, or, if a string is given, provides information available concerning that person (whether logged in or not).

    Some people find Whois and Finger alarmingly insecure. One particular scare in the Internet concerning security was due to a simple, but extremely effective, gaping bug in the most widely used implementation of the Finger server; this may be why it scares some people.‡

    DNS

    The Domain Name System is really designed as a Network Information Service for internal use by tools rather than directly by users. However, the names it holds appear in location information currently used by many services and are also the basis for electronic mail routing.
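Looking back at the Whois exchange described earlier in this section (a command line in ASCII ending with CRLF, answered with zero or more entries, each carrying a handle), the server side can be sketched as a pure function. Everything concrete here is an assumption made for illustration: the handles and entries are invented, the reply layout is not the real Whois output format, and plain substring matching stands in for the fuzzy or soundex matching the text mentions. No socket is opened; a real server would read the request from TCP port 43.

```python
def whois_lookup(request, directory):
    """Answer one Whois-style request.

    request   -- bytes as read from the connection: ASCII text ending in CRLF
    directory -- mapping of (hypothetical) handle to a contact entry
    Returns the reply bytes: one line per match, each ending in CRLF.
    """
    query = request.decode("ascii").rstrip("\r\n").strip().lower()

    # Substring match as a simple stand-in for fuzzy/soundex matching.
    matches = [
        f"{handle}: {entry}"
        for handle, entry in sorted(directory.items())
        if query in entry.lower()
    ]
    if not matches:
        matches = ["No match"]
    return ("\r\n".join(matches) + "\r\n").encode("ascii")


# Invented directory; each handle acts as the unique key for its entry.
directory = {
    "AB1": "Ann Brown, ann@example.org",
    "CB2": "Carl Brown, carl@example.org",
    "DS3": "Dee Smith, dee@example.org",
}
reply = whois_lookup(b"brown\r\n", directory)
print(reply.decode("ascii"))
```

The query "brown" matches two entries, which shows why each returned entry needs a handle to tell the matches apart.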

    The DNS model is that all objects on the net have a name that should be that given by the people responsible for the object. However, this name is only part of the full way to specify the object. The fully distinguished name is part of a hierarchy of names, which are written as per a postal address, for example:

    swan.computer-lab.cambridge.ac.uk
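The hierarchy in such a name can be made explicit by listing the levels it passes through, from the most general to the most specific. The following is a small sketch; the function name is invented for illustration and the code only manipulates the text of the name, it does not consult any name server.

```python
def name_levels(full_name):
    """List the levels of a hierarchical name, most general first."""
    labels = full_name.rstrip(".").split(".")
    # Build ever longer suffixes: "uk", "ac.uk", "cambridge.ac.uk", ...
    return [".".join(labels[start:]) for start in range(len(labels) - 1, -1, -1)]


levels = name_levels("swan.computer-lab.cambridge.ac.uk")
for level in levels:
    print(level)
```

Each printed line corresponds to one level of the hierarchy, ending with the full name of the individual object.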

    * The TCP port numbers direct data to the appropriate application software program.

    † CRLF is carriage return (ASCII character 13) followed by linefeed (ASCII character 10).

    ‡ This bug is long since fixed. Basically, the Finger daemon had storage for receiving a limited request/command, but could actually be handed a larger amount of information from the transport protocol. The extra information would overwrite the stack of the executing Finger server program. An ingenious hacker could exploit this by sending a Finger command carefully constructed with executable code that carried out his desired misdemeanour. The problem was exacerbated on many systems where the Finger server ran as a special privileged process (root!), for no particular reason other than laziness of the designers of the default configuration. Thus the wily hacker gained access to arbitrary rights on the system.

    The top level is a country code (e.g. .uk) or US specific (e.g. .com). Any organisation owns its own level and the names of the levels below. Any string is usable. Aliases are allowed: names more friendly than addresses.

    Any owner of a name space must run a server. The owning site then must inform sites at a level above where their server is. At the same time, they tell their server where the level above is.

    Applications (FTP, Mail, TELNET, Mosaic, etc.) use a library function to call the resolver. They give the resolver function a name and it sends a request to the local site DNS server. The DNS server responds with either:

    (a) the answer, either from its own tables or from another server it asked on the user's behalf. The latter is called chaining.
    (b) a site which can answer. This is called referral.

    The Domain Name System holds general purpose resource records.

    Wide Area Information Server - WAIS

    The Wide Area Information Server idea is based on a search model of information, rather than a browse one. Sites that run WAIS servers have created a collection of indexed data that can then be retrieved by searches on these indexes. The access protocol to WAIS servers is based on the standard developed for library searching by ANSI (American National Standards Institute) with the unlikely title Z39.50 (also known as Information Retrieval Service and Protocol Standard).

    WAIS has four parts (like most information services except the richer WWW): the client, the server, the database and the protocol.

    Client programs (e.g. the X Windows client xwaisq) construct queries and send them using the protocol to the appropriate server. The server responds and includes a relevance measure for the results of the search match to the query.

    The actual operation of the protocol is quite complex, as it permits exchanges to be broken into separate parts.
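The search model (an index built in advance, queries answered with a relevance measure) can be illustrated with a toy example. This is not how WAIS itself is implemented, and it has nothing to do with the Z39.50 protocol: the documents are invented, and counting how often the query words occur is simply one easy relevance measure chosen for the sketch.

```python
from collections import defaultdict


def build_index(documents):
    """Map each word to the documents containing it, with occurrence counts."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word][doc_id] = index[word].get(doc_id, 0) + 1
    return index


def search(index, words):
    """Return (document ID, score) pairs, most relevant first."""
    scores = defaultdict(int)
    for word in words:
        for doc_id, count in index.get(word.lower(), {}).items():
            scores[doc_id] += count   # assumed relevance: total word hits
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Invented collection of documents, keyed by a document ID.
documents = {
    "d1": "internet routing and internet addressing",
    "d2": "mail routing",
    "d3": "weather images",
}
index = build_index(documents)
results = search(index, ["internet", "routing"])
print(results)
```

The index is built once by the site running the server; each query then only touches the index entries for its words rather than scanning all the data, which is what keeps searching practical as the collection grows.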


    Table 1 Gopher response types

    hone-book server

    connection closes. Beware

    onnection closes. Bewaren3270 sessionl e . C l i en t dec ides how to d isp lay

    of the important design principles inthe Internet has always been tominimise the number of places thatneed to keep track of who is doingwhat. In the case of statelessinformation servers this means thatthey do not keep track of whichclients are accessing them. In otherwords, between one access and thenext, the server and protocol areconstructed in such a way that theydo not care who, why, when orwhere the next access comes from.This is essential to the reliabilityof the server and to making suchsystems work in very-large-scalenetworks such as the Internet withWAIS permits retrieval of bibliographic as well as contents(including images) data.A search request consists of seed words (or keys) ypedby the user into the client, together witha list of documents(identified by a unique global ID). The response is quitecomplex and includes a list of records, including thefollowing fields:

    headline- asically a title/descriptionrank -relative relevance of the documentformats- ist of formats available (text/postscript etc.)document IDlengthGopherGopher is a service that listens for TCP connections onport 70. It responds to trivial string requests from clientswith answers preceded by a single character identifyingthe type, a name and a selector. The client then chooseswhat to do an d how to display any actual data returned(Table 1).World Wide WebThe World Wide Web makes all these previous serviceslook like stone tablets and smoke signals. In fact the Webis better than that! It can read stone tablets and send smokesignals too!The World Wide Web service is made up of severalcomponents. Client programs (e.g. Mosaic, Lynx) accessservers (e.g. HTTP Daemons) using the protocol HTTP(HyperText Transfer Protocol). Servers hold data, writtenin a language called HTML, the HyperText MarkupLanguage. As indicated by its name, it i s a language (inother words it consists of keywords and grammar for usingthem) for marking-up text that is hyper!*The pages in the World Wide Web are held in.HTMLformat and delivered from WW W servers to clients in thisform, albeit wrapped in MIME (Multipurpose IqternetMail Extensions) and conveyed by HTTP.A note on stateless serversAlmost all information servers above are described asstateless. State iswhat networking people call memory. One

    potentially huge numbers of clients: if the server diddepend on a client, then any client failure would leave theserver in the lurch, possibly not able to continue, or elseserving other clients with reduced resources.3 Security, performance guarantees and b illi ngImprovements in CPU, memory and storage performance/price have made many new applications possible which arefinding increasing use as a result of reduced connectivitycosts. A corresponding increase in network functionalityhas yet to happen. The Internet is experiencing a numberof problems due to the growth in itssize and the breadth ofthe community using it. These include:

    scale: The number of systems is exceeding the range of numbers available for addressing systems in the Internet, a problem similar to the one that everyone in the UK is accustomed to from time to time with telephone numbers. The way these numbers are allocated is also leading to problems with the amount of memory in the router boxes that hold together the Internet. Currently, these need to hold the full list of every site in the Internet. A more hierarchical approach (like the phone system or the postal system) will fix this.

    security: Security concerns are broad. The range includes privacy of information, authentication of users and systems to each other, access control to systems and resources, prevention of denial of service, and other kinds of misuse, even including covert signalling. Security is really not a question that is relevant when talking about the Internet itself. What needs to be secured are hosts and information. However, the network must provide relevant hooks for security to be implemented.

    billing: Currently, the charging model in most of the Internet is a leasing one. Bills are for the speed of access, not the amount of usage. However, many believe that at least during busy periods, or else for priority service, usage-based billing will be necessary as a negative feedback mechanism. This will also require security so that the right people can be billed legally.*

    guarantees: The Internet has not historically provided guarantees of service. Many providers have done so, but typically by overprovisioning the internal resources of their networks. In the long run, this may prove viable, but at least for the next few years mechanisms will be needed to control guarantees, especially of timeliness of delivery of information. For example, many information providers, such as the share trading and news businesses, value their commodity by time.

    * Hyper comes from the Greek prefix meaning 'above' or 'over', and generally means that some additional functionality is present compared with simple text. In this case, this additional functionality is in two forms: graphics or other media, and links or references to other pieces of (hyper-)text. These links are another component of the WWW called Uniform Resource Locators.
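The hierarchical approach mentioned under 'scale' can be illustrated with a small sketch. It is not taken from the paper: the four site prefixes are hypothetical (drawn from a documentation address range), and the `lookup` helper is an illustrative name. The sketch only shows why a router far from a provider can hold one covering entry instead of one entry per site, assuming the sites were allocated contiguous blocks from that provider.

```python
import ipaddress

# Hypothetical site prefixes, all allocated from one provider's block
# (198.51.100.0/24 is reserved for documentation examples).
site_routes = [
    ipaddress.ip_network("198.51.100.0/26"),
    ipaddress.ip_network("198.51.100.64/26"),
    ipaddress.ip_network("198.51.100.128/26"),
    ipaddress.ip_network("198.51.100.192/26"),
]

# A flat table needs one entry per site. With hierarchical allocation,
# distant routers can keep a single covering prefix instead.
aggregated = list(ipaddress.collapse_addresses(site_routes))


def lookup(table, address):
    """Return the most specific (longest) prefix in table that contains address."""
    addr = ipaddress.ip_address(address)
    matches = [net for net in table if addr in net]
    return max(matches, key=lambda net: net.prefixlen) if matches else None
```

With these assumed prefixes the four site entries collapse into the single route 198.51.100.0/24. A router inside the provider would still use the detailed table, much as a local telephone exchange knows individual subscriber numbers while a distant exchange needs only the area code.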

    Performance parameters

    There are three key parameters to worry about in the network, and these are important if you intend using a part of the Internet to deliver commercial or dependable WWW services:

    (a) Errors: Transmission technology is never perfect. Even glass fibre transmission has occasional errors.

    (b) Delay or latency: A network is not infinitely fast. In fact, now that we are building a global society, the speed of light that Einstein was so keen on is becoming a significant factor. Also, busy networks run slower.

    (c) Throughput: Different networks are built for different amounts of traffic. So although some networks may have lower latency they may also have lower throughput. Normally, though, latency and throughput are largely unconnected. Typically, throughput is a feature of how much you pay, whereas latency is a feature of the distance you are communicating over plus the busyness of the net.

    Internet service models

    The Internet provides a best-effort service. Data can be sent whenever you want: you do not have to know that there is a receiver ready, or that the path exists between you and a potential receiver, or that there are adequate resources along the path for your data. You do not have to give a credit card number or order number so you can be billed. You do not have to check the wire to make sure there are no eavesdroppers.

    Some people are uncomfortable with this model. They point out that this makes it hard to carry traffic that needs certain kinds of performance guarantees, or to make communication secure, or to bill people. These three aspects of the Internet are intimately connected and the rest of this paper briefly discusses how research at University College London and elsewhere is leading to a new Internet model which accommodates them.

    * There are some who believe that every type of Internet access should be billed for on a usage basis. This is problematic, and in fact it has been shown that it does not maximise profit. Only the user knows how urgent a file transfer is. With very many types of data around, the net can only charge for the ones it really knows about, like long-distance voice or high-quality video. Currently, the more users there are, the less share each gets, so that a bill based on the actual connect time is not reasonable and a bill based on amount of data transferred does not reflect the actual value seen by the user.

    Best effort and charging: The current model for charging for traffic in the Internet is that sites connect via some point of presence of a provider, and pay a flat fee per month according to the speed of the line they attach with, whether they use it to capacity or not.

    For existing applications this model is perfect. Most sites wish to exchange data which has a value that is not increased radically by being delivered immediately. For instance, when I send electronic mail, or transfer a file, the usefulness to me is in the exchange.

    The network provider maximises profit by admitting all traffic and simply providing a fair share. As the speed decreases, the usefulness to me decreases, so I am prepared to pay less. But the increase in possible income from the additional users outweighs this. The underlying constant cost of adding an additional user to the Internet is so low that this is always true.

    However, there are other kinds of traffic that this best-effort model does not suit at all and these will now be considered.

    Real-time traffic: The Internet has been used for more than 4 years now to carry audio and video traffic around the world. The problem with this traffic is that it requires guarantees. It has a minimum bandwidth below which audio becomes incomprehensible, and even compressed video is just not usable. For human interaction, there is also a maximum delay above which conversation becomes intolerable.

    In the experimental parts of the Internet, we have reprogrammed the routers which provide the interconnection to recognise these kinds of traffic and to give it regular service. There are two aspects to this. Firstly, the minimum bandwidth guarantee must be met and this is done by looking more frequently at the queues of traffic in the net for traffic that needs more capacity. Secondly, the delay seen by this traffic is affected by the variations in the other traffic on the network and by the basic transmission time (speed of light, or thereabouts; this is still a significant factor for transmission around the world, however it is one that cannot be altered).

    As the other traffic is increased, the video or audio experiences increasing delays and variation of delay. So long as these stay within tolerable limits, the receiver can adapt continuously (e.g. in silences in audio or between video frames) and all is well. The receiver must compensate both for a variation of inter-packet arrival times and for a possibly varying mean delay between sender and receiver. If the mean delay is constant, then this makes things simpler. Meanwhile, any spare capacity carries the old best-effort traffic as before.

    However, when the total amount of traffic is higher than capacity, the system starts to fail. At this point there are three views on how to proceed:

    (a) Engineer the network so that there is enough capacity. This is feasible only while most people's access speed is limited by the subscriber loop, or tail circuits that go to their homes/offices. When everyone has fibre to the home/desktop, the potential for drowning the Internet will be alarming. Note though, that the phone network is currently over-engineered so, for audio capacity, everyone could certainly be switched over to the Internet, and all phones switched over to Internet-based terminals with a flat-fee model. The Internet operates with a flat fee rather than usage-based charging, because it has capacity for all existing applications. If those existing applications incorporate interactive voice traffic, then the charging model still applies, and the advantage of a single network for voice and data is clear.

    (b) Police the traffic by asking people who have real-time requirements to make a call set-up as they do with the telephone networks. When the net is full, calls are refused unless someone is prepared to pay a premium and incur the wrath of other users by causing them to be cut off!

    (c) Simply bill people more as the net gets busier. This model is proposed by economists at Harvard and is similar to models of charging for road traffic proposed by the Transport Studies group at UCL. We believe it is optimal. Since we have already reprogrammed the routers to recognise real-time traffic, we have the ability to charge on the basis of logging of this traffic. Note that we can now charge differentially as well. Until the network service offers guarantees which differentiate users' traffic, it is difficult to charge on the basis of usage (for time, or packets, or service). Now that guaranteed services are beginning to be feasible, this sort of charging will become possible. However, in introducing charging in this way, we have maintained all the original advantages of the Internet (no call set-up, easy to rendezvous etc.).

    Jon Crowcroft is a professor of networked systems in the Department of Computer Science, University College London, where he is responsible for a number of European- and US-funded research projects in multimedia communications. He has been working in these areas for over 15 years. He graduated in Physics from Trinity College, Cambridge University in 1979 and gained an MSc degree in Computing in 1981 and a PhD in 1993. He is a member of the ACM, the British Computer Society and the IEE and is a senior member of the IEEE. He is general chair for the ACM Special Interest Group on Communications. He is also on the editorial teams for the Transactions on Networks and the Journal of Internetworking. With Mark Handley he is the co-author of WWW: Beneath the surf and of Open distributed systems, both published by UCL Press.
    Address: Department of Computer Science, University College London, Gower Street, London WC1E 6BT, UK
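Option (c) above, billing more as the net gets busier on the basis of logged real-time traffic, can be sketched roughly as follows. Everything specific here is an assumption made for illustration and does not come from the paper or from the Harvard model it mentions: the linear price curve, the 80% threshold, the price values, and the log format of (user, units sent, network load at the time) are all hypothetical.

```python
def congestion_price(load, capacity, base_price=0.0, peak_price=1.0, threshold=0.8):
    """Price per unit of real-time traffic at a given network load.

    Assumed curve: base_price while utilisation is at or below threshold,
    then rising linearly to peak_price when the network is full.
    """
    utilisation = min(load / capacity, 1.0)
    if utilisation <= threshold:
        return base_price
    fraction_of_peak = (utilisation - threshold) / (1.0 - threshold)
    return base_price + (peak_price - base_price) * fraction_of_peak


def bill(traffic_log, capacity):
    """Total charge per user from a log of (user, units_sent, load) records."""
    totals = {}
    for user, units, load in traffic_log:
        charge = units * congestion_price(load, capacity)
        totals[user] = totals.get(user, 0.0) + charge
    return totals


# Hypothetical log: user "a" sends mostly while the net is quiet,
# user "b" sends only while it is full.
example_log = [("a", 10, 50), ("b", 10, 100), ("a", 4, 100)]
example_totals = bill(example_log, capacity=100)
```

Under these assumed numbers, traffic sent at half load costs nothing, so user "a" pays only for the 4 units sent at full load, while user "b" pays for all 10. No call set-up is involved: the sender simply transmits, and the charge follows from the logged traffic afterwards.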

    4 Conclusions

    In this tutorial we have looked at the service model of the world-wide Internet. We have discussed the kinds of applications that have evolved to make information readily available to non-expert users, and we have looked at the underlying technology used to route and transfer the data around the network. Finally we have looked at some of the areas that are being explored as the Internet is extended to add new services on a more commercial basis.

    Bibliography

    Details of Internet protocols are given in the RFC (Request for Comments) documents maintained by the Network Information Center at Network Solutions Inc. They are available on the Internet at many sites, e.g. the UK Academic FTP site, ftp.doc.ic.ac.uk.

    1 TRANSMISSION CONTROL PROTOCOL, RFC 793, edited by Jon Postel, September 1981
    2 INTERNET PROTOCOL, RFC 791, edited by Jon Postel, September 1981
    3 STEVENS, W. R.: TCP/IP Illustrated, Vol. 1 and Vol. 2 (Addison-Wesley, 1992 and 1994), ISBN 0-201-63354-X
    4 PARTRIDGE, C.: Gigabit networking (Addison-Wesley, 1993), ISBN 0-201-56333-9
    5 HANDLEY, M., and CROWCROFT, J.: World Wide Web: beneath the surf (UCL Press, London, 1995), ISBN 1-85728-435-6

    © IEE: 1996

    Brian J. Thomas

    122 ELECTRONICS & COMMUNICATION ENGINEERING JOURNAL JUNE 1996
