Technical University of Denmark Department of Informatics and Mathematical Modelling Design of a Hardware Network Address Translation Unit for a Single Chip High-Speed Ethernet Router A Thesis in Informatics by Martin Rolsted Jensen Submitted in Partial Fulfillment of the Requirements for the Degree of Master of Science June 2009 IMM-M.Sc.-2009-27

Design of a Hardware - DTU Electronic Theses and …etd.dtu.dk/thesis/244423/ep09_27.pdf · Technical University of Denmark Department of Informatics and Mathematical Modelling Design


Technical University of Denmark

Department of Informatics and Mathematical Modelling

Design of a Hardware

Network Address Translation Unit

for a Single Chip

High-Speed Ethernet Router

A Thesis in Informatics

by

Martin Rolsted Jensen

Submitted in Partial Fulfillment of the Requirements

for the Degree of

Master of Science

June 2009

IMM-M.Sc.-2009-27


Technical University of Denmark
Informatics and Mathematical Modelling
Building 321, DK-2800 Kongens Lyngby, Denmark
Phone +45 45253351, Fax +45 [email protected]


Summary

The general objective of this thesis is to design and evaluate a Network Address & Port Translation (NAPT) core, which is typically found as a part of a Small-Office or Small-Home (SOHO) Ethernet router. The NAPT core makes it possible for, e.g., a wireless router to share a single public IP address, provided by an Internet Service Provider (ISP), among several internal hosts on a private Local Area Network (LAN), even when the internal hosts are connecting simultaneously.

The main functionality of the NAPT core is to modify the header information of both inbound and outbound IP packets traversing the NAPT. This process of manipulation, also called translation, involves time-critical lookups in various tables. To comply with the expected operation, these tables are typically realized with ordinary memory blocks and incorporate some algorithm to make the lookup fast. In this work a table structure is realized with Content Addressable Memory (CAM) modules for the lookup process and is evaluated, first of all to find out whether a CAM can actually be used for the purpose and secondly to find out whether it can support a network bandwidth of 2 Gbps.

The reader is first introduced to the fundamentals of the NAPT in Chapter 2, after which the actual analysis and design of a NAPT core is carried out in Chapter 3. That chapter is structured item by item, with each item treating one of the demands. Typically the structure is the same for each item: an introduction to the problem, followed by one or more design solutions, and finally a discussion of the pros and cons and the relevant security considerations. The implemented parts of the project are presented in Chapter 4 and a functional test of the implemented parts is carried out in Chapter 5. The achieved results are presented in Chapter 6 and potential future work is discussed in Chapter 7. Finally, a conclusion is given in Chapter 8.


Resume

The objective of this thesis is to design and evaluate a Network Address & Port Translation (NAPT) core, which is typically found as a part of a Small Office or Small Home (SoHo) Ethernet router. The NAPT core makes it possible for, e.g., a wireless router to share a single public IP address, assigned by an Internet Service Provider (ISP), among several internal hosts on a private Local Area Network (LAN), even when the internal hosts have simultaneous connections.

The main function of the NAPT core is to modify the header information of both inbound and outbound IP packets passing through the NAPT gateway. This manipulation process, also called translation, involves time-critical lookups in various tables. To meet the expected functionality, these tables are typically realized with ordinary memory blocks and incorporate algorithms to speed up the lookup.

In this thesis the table structure is realized using Content Addressable Memory (CAM) modules for the lookup process, and it is evaluated first and foremost to demonstrate whether a CAM can be used for the purpose at all and, secondly, whether it can support a network bandwidth of 2 Gbps.

To begin with, the reader is introduced to the fundamentals of a NAPT in Chapter 2, after which the actual analysis and design of the NAPT core is carried out in Chapter 3. This chapter is structured in detail by topic, with each topic treating one of the requirements. Typically, each topic starts with a problem description followed by one or more design solutions. Finally, the solution is discussed in terms of its advantages and disadvantages as well as any security issues.

The implemented parts of the project are presented in Chapter 4, and a functional test of the implementation is carried out in Chapter 5. The achieved results are then accounted for in Chapter 6, after which future investigations and development potential are proposed in Chapter 7. Finally, an overall conclusion is given in Chapter 8.


Preface

This thesis was prepared at Informatics and Mathematical Modelling (IMM), the Technical University of Denmark (DTU), in partial fulfillment of the requirements for acquiring the M.Sc. degree in engineering.

The thesis deals with the development of a 2 Gbps hardware Network Address & Port Translation (NAPT) core intended to be used as a part of a Small-Office or Small-Home (SoHo) Ethernet router.

The reader should preferably have a basic knowledge of digital electronic design and an intermediate knowledge of the TCP/IP architecture.

The thesis work was carried out during the period September 2008 – March 2009 at, and in cooperation with, Vitesse Semiconductor Corporation A/S.

The project is accompanied by a CD-ROM with all source code. A description of the CD-ROM contents can be found in Appendix H.

Nørrebro, June 2009

Martin Rolsted Jensen


Acknowledgment

First of all, I would like to thank Docent Jens Sparsø from the Department of Informatics and Mathematical Modelling (IMM) at the Technical University of Denmark (DTU), who agreed to be my official supervisor.

My gratitude goes to Kristian Ehlers, Martin Bak and Martin Olsen from Vitesse Semiconductor Corporation A/S for their advice and help throughout the project period and for the time they spent reading my thesis. This thesis would certainly not have been what it is without the advice and comments they gave me.

It is a real pleasure for me to thank all the other employees at Vitesse Semiconductor Corporation A/S for their kindness and for the great time I had at the informal get-together Vitesse offered during my stay.

I would like to heartily thank my wife Kia Norsker for her support and encouragement throughout the period, especially for the time spent reading this thesis and for her fruitful comments on this work.

Finally, I would like to thank my family and friends for their encouragement, with special thanks to my mother for her numerous dinner offers.


Contents

Summary
Resume
Preface
Acknowledgment

1 Introduction
1.1 Motivation & Objective
1.2 Delimitation
1.3 Overview
1.4 Nomenclature

2 Fundamentals of Network Address Translation
2.1 Traditional NAT


3 Analysis and Design of a NAPT Core
3.1 NAPT Tables Mapping Behavior
3.2 NAPT Tables Filtering Behavior
3.3 Port Assignment Behavior
3.4 Connection Establishment
3.5 Connection Refresh and Termination
3.6 The Transport Layer Checksum and its Algorithm

4 Implementation of a NAPT Design
4.1 NAPT Core
4.2 Checksum Processor

5 Test and Verification
5.1 Test of the mapping behavior
5.2 Test of the filtering behavior
5.3 Test of the session timer
5.4 Test of the delete procedure

6 Results
6.1 Bandwidth Estimation
6.2 Bandwidth of the NAPT Core
6.3 Checksum Processor Requirements
6.4 Latency Estimation
6.5 Area Estimation


6.6 Discussion

7 Discussion
7.1 Summary of Main Contributions
7.2 Future Work

8 Conclusion

A Working description of the M.Sc. Thesis

B UDP example PPT
B.1 UDP

C Flowchart: Packet Processing

D Flowchart: Clean-Up Processing

E VHDL Source Code
E.1 Behavioral-HNAPT
E.2 RTL-Checksum

F ICMP Message Types

G CPU Interrupt Codes

H CD-ROM Content
H.1 General
H.2 Source Code


H.3 Test vectors

I List of Acronyms


List of Figures

1.1 Time series of IANA allocations.
1.2 Typical deployment scenarios of routers with an embedded NAT gateway.
1.3 Components of a standard broadband router containing an embedded NAT gateway.
2.1 Illustration of a static Basic Network Address Translation operation.
2.2 Illustration of a dynamic Basic Network Address Translation operation.
2.3 Illustration of a Network Address and Port Translation (NAPT) operation.
2.4 Examples of a NAT gateway with an Endpoint-Independent mapping behavior.
2.5 Examples of a NAT gateway with an Address and Port-Dependent mapping behavior.
2.6 Examples of a NAT gateway with a Port-Dependent mapping behavior.


2.7 Examples of a NAT gateway with an Address-Dependent mapping behavior.
2.8 Examples of a NAT gateway with a Connection-Dependent mapping behavior.
2.9 The NAT gateway's firewall/endpoint-filtering rule is set to Independent-Endpoint.
2.10 The NAT gateway's firewall/endpoint-filtering rule is set to Address-Dependent.
2.11 The NAT gateway's firewall/endpoint-filtering rule is set to Port-Dependent.
2.12 The NAT gateway's firewall/endpoint-filtering rule is set to Address and Port-Dependent.
2.13 External source IP address and port NAT hairpinning behavior.
2.14 Internal source IP address and port NAT hairpinning behavior.
3.1 Without relay.
3.2 With relay.
3.3 Illustration of the procedure to handle an outbound packet belonging to a session unknown to the mapping table.
3.4 Illustration of the procedure to handle an outbound packet belonging to a session already known to the mapping table.
3.5 Illustration of the procedure to handle an inbound packet which has a valid mapping in the mapping table.
3.6 A tree structure which illustrates the relationship between the mapping and filtering table.
3.7 Illustrates the creation of a new full-session on behalf of an outbound packet, which has no valid entry beforehand in either NAT table, CAM-0 or CAM-1.


3.8 Illustrates the creation of a new sub-session on behalf of an outbound packet, which has a valid mapping beforehand in NAT table CAM-0 but not in CAM-1.
3.9 Illustrates the situation where an outbound packet traverses the NAT gateway and a full-session has been established beforehand.
3.10 Illustrates the situation where an inbound packet traverses the NAT gateway and a full-session has been established beforehand.
3.11 The modified mapping table (CAM-0) to support the dynamic assignment of filtering rules.
3.12 Full port range solution.
3.13 A section of the modified mapping table.
3.14 Port number and range preserved.
3.15 Port range preserved.
3.16 Creating a virtual range to expand the upper port range.
3.17 CPU session table.
3.18 Outbound packet procedure where the session is unknown beforehand.
3.19 Outbound packet procedure where the session is known beforehand in the mapping table.
3.20 Inbound packet procedure where the session is known beforehand in the mapping table.
3.21 The UDP header consists of only 4 fields. The use of two of those is optional (pink background in table). The numbers are the bit sizes of the fields.
3.22 Sequence of UDP packets in a data transmission between host A and E through a NAPT gateway.


3.23 Content of the CPU session table after processing the first outbound initializing UDP packet.
3.24 Content of the NAPT tables after processing the first outbound initializing UDP packet.
3.25 An illustration of the different fields a TCP header consists of. The numbers are the bit sizes of the fields.
3.26 Establishment, data transmission and termination phase of a 3-way handshake.
3.27 Left: Ordinary TCP 3-way handshake. Right: Interfered between SYN and SYN-ACK by an ICMP error message caused by endhost or network.
3.28 Left: Interfered between SYN and SYN-ACK by an ICMP time exceeded message caused by network. Right: STUNT #1 TCP-traversal approach.
3.29 Left: Ordinary TCP 3-way handshake. Right: Interfered between SYN and SYN-ACK by an RST packet caused by endhost.
3.30 Left: Interfered between SYN-ACK and ACK by an RST packet caused by endhost. Right: Interfered between SYN-ACK and ACK by an ICMP error message caused by endhost or network.
3.31 TCP - Simultaneous Open.
3.32 Establishment, data transmission and termination phase of a simultaneous-open.
3.33 Left: STUNT #1 Right: STUNT #2.
3.34 Left: NATblaster Right: P2PNAT.
3.35 Time line sequence of TCP packets, which makes up a TCP connection establishment, data transmission and termination phase between host A:121 and E:90 through the NAPT gateway.
3.36 Content of the CPU session table after processing the first initial packet (SYN) in the TCP connection establishment phase.


3.37 Content of the NAPT tables after processing the first initial packet (SYN) in the TCP connection establishment phase.
3.38 Content of the CPU session table after processing the second packet (SYN-ACK) in the TCP connection establishment phase.
3.39 Content of the NAPT tables after processing the second packet (SYN-ACK) in the TCP connection establishment phase.
3.40 Content of the CPU session table after processing the last packet (ACK) in the TCP connection establishment phase.
3.41 Content of the NAPT tables after processing the last packet (ACK) in the TCP connection establishment phase.
3.42 ICMP Error message layout. The numbers are the bit sizes of the fields.
3.43 ICMP Query message layout. The numbers are the bit sizes of the fields.
3.44 ICMP error message example; the application on host E is down and therefore host E responds with an ICMP error message.
3.45 Content of the NAPT tables and the CPU session table for the concerned session.
3.46 ICMP query message example; host A uses the utility program Ping to discover the existence of the endhost E.
3.47 Content of the CPU session table after processing the first outbound initializing ICMP query packet.
3.48 Content of the NAPT tables after processing the first outbound initializing ICMP packet.
3.49 Sequence of UDP packets in a data transmission between host A and E through a NAPT gateway.
3.50 Content of the NAPT tables and the CPU session table after processing the second, inbound UDP packet.
3.51 Content of the NAPT tables after processing the third outbound UDP packet.


3.52 Content of the NAPT tables after a session clean-up, where the ACT-bit is cleared after a session timer restart.
3.53 Simultaneous close packet sequence.
3.54 Time line sequence of TCP packets, which makes up the data transmission and termination phase between host A:121 and E:90 through the NAT gateway.
3.55 Content of the NAPT table after processing the data carrying TCP packet.
3.56 Content of the NAPT table after processing the data acknowledging TCP packet.
3.57 Content of the CPU and NAPT tables after processing the first TCP packet where the FIN flag is set.
3.58 Content of the CPU and NAPT tables after processing the second TCP packet where the FIN flag is set.
3.59 Content of the NAPT tables.
3.60 UDP/TCP packet with pseudo header used for checksum calculation.
4.1 State diagram of the main design.
4.2 Router Core.
4.3 NAPT Core.
4.4 NAPT main processing.
4.5 Propagation delay model of applied components in the checksum processor, e.g. full-adder, inverter etc. All values are given in picoseconds [ps].
4.6 Hardware implementation of the checksum algorithm.
4.7 Checksum Processor Stage I.
4.8 4-2 adder unit.


4.9 Checksum Processor Stage II.
5.1 17 outbound packets are sent from 17 different hosts on the private side of the network to the same host on the public network.
5.2 Results of the 17 outbound packets that were sent from 17 different hosts on the private side of the network to the same host on the public network.
5.3 17 inbound packets are sent as response to the 17 outbound.
5.4 Results of the 17 inbound packets that were sent in response to the 17 outbound.
5.5 17 outbound packets are sent from 1 host on the private side of the network to 17 different hosts on the public network.
5.6 Results of the 17 outbound packets that were sent from one host on the private side of the network to 17 different hosts on the public network.
5.7 17 inbound packets are sent as response to the 17 outbound.
5.8 Results of the 17 inbound packets that were sent in response to the 17 outbound.
6.1 Twin-buffer system solution.
6.2 NAPT Buffer 1 is loading input data.
6.3 The NAPT is processing the first packet in NAPT Buffer 1, while NAPT Buffer 2 is loading input data.
6.4 The NAPT is processing the second packet in NAPT Buffer 2, while NAPT Buffer 1 is loading input data and the WAN Output Buffer is getting drained with a bandwidth of 2 Gbps.
6.5 Outbound minimum sized IPv4/UDP packet with a NAPT running at 150 MHz.
6.6 Outbound minimum sized UDP packet with a NAPT running at 175 MHz.


6.7 Outbound minimum sized UDP packet with a NAPT running at 200 MHz.
6.8 Checksum calculation timing diagram.
B.1 Illustration of a static Network Address Translation operation.
B.2 Illustration of a static Network Address Translation operation.
B.3 Illustration of a static Network Address Translation operation.
B.4 Illustration of a static Network Address Translation operation.
B.5 Illustration of a static Network Address Translation operation.
B.6 Illustration of a static Network Address Translation operation.
B.7 Illustration of a static Network Address Translation operation.
B.8 Illustration of a static Network Address Translation operation.
B.9 Illustration of a static Network Address Translation operation.
B.10 Illustration of a static Network Address Translation operation.
B.11 Illustration of a static Network Address Translation operation.
B.12 Illustration of a static Network Address Translation operation.
B.13 Illustration of a static Network Address Translation operation.
B.14 Illustration of a static Network Address Translation operation.


List of Tables

5.1 Create a session: Host A → host B
5.2 Check the session created: Host B → host A
5.3 After one large delay - Check the created session again: Host B → host A
5.4 Create a session: Host A → host B
5.5 Check the session created: Host B → host A
5.6 After a half large delay - Restart the session-timer: Host A → host B
5.7 After one large delay - Check the created session again: Host B → host A
5.8 Create a session: Host A → host B
5.9 Check the session created: Host B → host A
5.10 After one large delay - Check the created session again: Host B → host A
5.11 Create a session to show that it gets the released port number: Host P → host B


5.12 Create another session to show that it gets another NAPT port number: Host A → host B
5.13 Create a new full-session: Host A → host B
5.14 Create a new sub-session after half the delay of the session-timer: Host A → host C
5.15 Inbound packet after slightly more than half the delay of the session-timer: Host B → host A
5.16 Inbound packet: Host C → host A
7.1 The NAT table size of a selection of commercial routers with NAT capability
F.1 Q = Query, E = Error, S = Support, • = Must, ◦ = May, = Should not


Chapter 1

Introduction

The infrastructure of the Internet today is built upon the Internet Protocol Suite (generally known as TCP/IP), which is a collection of communication protocols used for packet exchange on the Internet and other similar networks. The TCP/IP model can be viewed as a set of layers, where each layer solves a set of problems involving the transmission of data and provides a well-defined service to the upper layer protocols based on the use of services from the lower layers. The TCP/IP model, also called the TCP/IP stack, consists of four layers [28]. From lowest to highest, these are the Link Layer, the Internet Layer, the Transport Layer, and the Application Layer. Figure 1.1a illustrates the layered architecture with some of the common protocols used by the different layers of the TCP/IP stack. In practice, the layered TCP/IP model is implemented by encapsulating data, which provides the necessary abstraction of protocols and services. Data is encapsulated further at each layer as it works its way down the TCP/IP stack, as illustrated in Figure 1.1b. At the receiving endhost the opposite operation, decapsulation, takes place at each layer on the way up the TCP/IP stack.
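The encapsulation and decapsulation steps described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the bracketed byte strings are hypothetical placeholders standing in for real protocol headers, and the function names are invented for this example.

```python
# Illustrative sketch only: the bracketed byte strings are placeholders,
# not real protocol headers.


def encapsulate(payload: bytes) -> bytes:
    """Wrap application data in one placeholder header per lower layer."""
    segment = b"[TCP]" + payload   # Transport Layer prepends its header
    packet = b"[IP]" + segment     # Internet Layer prepends its header
    frame = b"[ETH]" + packet      # Link Layer prepends its header
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Strip the placeholder headers in the opposite order, bottom-up."""
    for header in (b"[ETH]", b"[IP]", b"[TCP]"):
        if not frame.startswith(header):
            raise ValueError(f"expected header {header!r}")
        frame = frame[len(header):]
    return frame
```

Each layer treats everything handed down from the layer above as opaque payload, which is why the receiver can remove the headers one at a time without knowing what lies further inside.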

The Internet Protocol

The Internet Protocol (IP) at the Internet Layer has been the dominant protocol in use ever since the great migration from the Network Control Protocol (NCP), which was the first standard networking protocol on the ARPANET (the first wide area network, which has evolved into the Internet that we know today), to the TCP/IP-based Internet. The migration was officially completed on January 1, 1983, when the new protocols were permanently activated [1]. Years before the full switchover to a TCP/IP-based network, several versions of the TCP/IP protocols were proposed and tried out. This ongoing research and development led to stability in the fourth version of the Internet Protocol, named IPv4, the standard protocol still in use on the Internet today.

(a) The Internet Protocol Suite (b) Data encapsulation at the different layers

The primary purpose of using the IP on the Internet is to provide routing capabilities, and it does so by introducing host addressing. Assigning unique addresses to individual hosts connected to the Internet makes it possible to identify the network of the requested host and the host itself, and thereby facilitates routing of data. The addresses used for routing on the Internet are called IP addresses and are carried as a part of the IP header, which is used to encapsulate the payload at the Internet layer of the TCP/IP stack. The IP addresses of the sender and of the requested host are both carried in the IP header, just as a classic envelope carries both the address and the return address.

The IPv4 address is built upon a binary data word with a size of 32 bits, which naturally limits the total number of simultaneously connected computers on the Internet to $2^{32}$, which equals almost 4.3 billion individually addressable computers. The actual usable range is smaller, as some of the addresses are reserved for special purposes such as private networks (∼ 18 million addresses) or multicast addresses (∼ 270 million addresses). This reduces the number of addresses that can be allocated as public Internet addresses.
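The approximate figures above can be reproduced with a short calculation. The sketch below assumes that "private networks" refers to the three RFC 1918 blocks and that "multicast addresses" refers to the 224.0.0.0/4 block; the thesis does not list the blocks explicitly, so these choices are assumptions made for illustration.

```python
import ipaddress

# Assumed blocks (not listed in the text): RFC 1918 private ranges
# and the IPv4 multicast range.
PRIVATE_BLOCKS = ["10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16"]
MULTICAST_BLOCK = "224.0.0.0/4"

# A 32-bit address word gives 2^32 distinct addresses.
total_addresses = 2 ** 32

# Sum the sizes of the private blocks and take the size of the multicast block.
private_addresses = sum(
    ipaddress.ip_network(block).num_addresses for block in PRIVATE_BLOCKS
)
multicast_addresses = ipaddress.ip_network(MULTICAST_BLOCK).num_addresses

print(f"total:     {total_addresses:,}")
print(f"private:   {private_addresses:,}")
print(f"multicast: {multicast_addresses:,}")
```

Under these assumptions the private blocks add up to roughly 17.9 million addresses and the multicast block to roughly 268 million, in line with the rounded numbers quoted in the text.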

The Internet Protocol address shortage scenario

While the number of available addresses seemed to be enough in the early days of the Internet, the situation has changed over the years as the popularity of the Internet has grown. Ever since the early 1980s, when the IP was standardized, there has been a tremendous demand for IP addresses. This demand has led to several obstacles for the Internet community to overcome over time. Mostly these obstacles have to do with the way the IP addresses are managed and with the allocation strategy, but as time passes, the IPv4 address depletion scenario has become the all-important concern of today. The time series in Figure 1.1 displays the cumulative address allocations [2] of the Internet Assigned Numbers Authority (IANA), the entity that oversees global IP address allocation.

Figure 1.1: Time series of IANA allocations.

The allocation unit is a /8 block, which means that every bit combination of the 8 most significant bits (MSB) serves as a unit. This gives a total of 256 units, of which some are reserved, as already mentioned.

IPv6 as a long-term solution to the IPv4 address shortage scenario

At the time of writing, the most reliable prediction of the IPv4 address depletion is that IANA will run out of free units on 08-Jun-2011. Fortunately a long-term solution to the address depletion problem already exists in the form of the Internet Protocol version 6 (IPv6), which is the next-generation Internet Layer protocol for the Internet, proposed for the first time in [9] in 1996. IPv6 is meant to be the successor of IPv4, and it solves the IP address depletion problem by providing a much larger address space. The address space in the next-generation Internet Protocol has been increased by using a 128-bit data word instead of the 32-bit word used by the fourth version of the Internet Protocol. The new address space thus supports $2^{128}$ (about $3.4 \times 10^{38}$) addresses, a number which gives approximately $4.9 \times 10^{28}$ addresses for each of the roughly 6.9 billion ($6.9 \times 10^{9}$) people alive in 2009 [3].
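As a quick check of the figures quoted above, the per-person share follows from dividing the size of the address space by the population estimate:

```latex
\frac{2^{128}}{6.9 \times 10^{9}}
  \approx \frac{3.4 \times 10^{38}}{6.9 \times 10^{9}}
  \approx 4.9 \times 10^{28}
```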

Since the formats of IPv4 packets and IPv6 packets are significantly different, the two protocols are not interoperable, and as a consequence the infrastructure of the Internet must provide support for IPv6. While some of the changes for the new protocol can be made relatively painlessly, vital parts of the Internet, such as routers, require explicit support for IPv6. In light of the expense of a full transformation from an IPv4-based Internet to an IPv6-based Internet, it would be inappropriate to fix a single cut-over date on which the changes become effective once and for all. Instead a transition period is a possibility, a period in which the two standards co-exist, as seen today.

Network Address Translators as an intermediate solution to the IPv4 address shortage

The introduction of the Network Address Translation (NAT) approach, which was described for the first time in [20] by the Dane K. Egevang in the early 1990s as a way to extend the lifetime of the current IPv4 protocol, has had a tremendous impact on the length of the transition period for the migration from an IPv4-based to an IPv6-based Internet. More than a decade after the announcement of the next-generation Internet Protocol, a recent study [15] by Google indicates that the penetration of IPv6 traffic on the Internet is still less than one percent in any country. The extensive use of NATs is the main reason for the IPv6 deployment delay.

Many variations of NAT exist. The one of concern in this work is often referred to as "Traditional NAT". Traditional NAT provides a way to connect a network that uses private IP addresses, better known as a private network or an intranet, to a public network, i.e. a network where public addresses are used. Generally speaking, an IP address is either public or private; public IP addresses, unlike private addresses, can be routed on the Internet. When public IP addresses are assigned, routes are programmed into the routers of the Internet, so that traffic to the assigned public IP addresses can reach its destination. A small portion of the complete range of IP addresses has been reserved as private addresses, as mentioned earlier. Private addresses can only live in private intranets, and packets carrying them will be discarded if they reach the Internet, i.e. the public realm, without passing a border gateway that performs a network address translation from a private address to a public one. The class of private IP addresses was originally created due to the shortage of publicly registered IP addresses. The introduction of private IP addresses makes it possible for a number of private networks throughout the Internet to use the same IP addresses internally. This approach saves IP addresses and slows down the demand for new public addresses.


Traditional NAT comes in two versions: one called Basic NAT and the other Network Address and Port Translation (NAPT). Basic NAT translates only the IP address, while NAPT also includes the transport identifier (such as the TCP/UDP port or ICMP query ID) at the Transport Layer of the Internet Protocol Suite. Basic NAT uses a pool of public addresses to provide access for one or many private hosts on the internal network. Having fewer public addresses than private addresses limits the number of simultaneous connections, also called sessions, that the NAT can handle. The NAPT approach requires only one public IP address to provide access for one or many private hosts on the internal network, as it uses the transport identifier to keep track of which host on the private network is involved in the packet exchange.

Unless mentioned otherwise, NAT throughout this report refers to Traditional NAT, namely Basic NAT as well as NAPT. A comprehensive walk through the fundamentals of the operation of the different NAT versions, Basic NAT and NAPT, is given in Chapter 2. For now it is enough to conclude that the NAT approach is used mainly to extend the lifetime of IPv4 until the next-generation Internet Protocol, IPv6, takes over.

Location of NATs in the Internet

The predominant use of NAT gateways is as border gateways between the Internet and a Local Area Network (LAN), typically embedded in Small Office/Home Office (SoHo) routers, but also as a part of high-performance network equipment. Figure 1.2 shows three different scenarios where a NAT gateway is used to gain access to the Internet.

Scenario A shows a common and widespread use of the Traditional NAT approach, where normally the NAPT variant is applied. Using the NAPT variant in such a setup is very popular, as it makes do with only one public IP address from an Internet Service Provider (ISP) to support a number of hosts on the private network.

Areas in the public domain that offer a wireless connection to the Internet via a Wi-Fi hotspot are typically based on the Traditional NAT approach. Scenario B illustrates an example where a Wi-Fi hotspot with an embedded NAT gateway enables access to the Internet.

The last scenario, C, illustrates a setup where the ISP offers Internet access through a NAT gateway. The principal idea of having an ISP offer access to the Internet through a NAT gateway is to economize on IP address usage. Ideally the ISP would have an IP address for each subscriber, but since only a fraction of the subscribers are connected to the Internet simultaneously, the actual need is less than the number of subscribers. Every time a subscriber requests access to the Internet, the ISP dynamically reserves an IP address from a pool of addresses and assigns it to the subscriber for as long as needed. In case the last IP address of the pool is taken, the NAT either switches to NAPT mode or simply rejects new requests. By means of this practice an ISP can offer Internet connectivity to a larger group of subscribers than it has IP addresses at its disposal.

Figure 1.2: Typical deployment scenarios of routers with an embedded NAT gateway.

Router exploration

The most common use of NATs is found in home or small-business broadband routers, analogous to scenario A in Figure 1.2.

A broadband router combines the features of a traditional network switch, a firewall, and a Dynamic Host Configuration Protocol (DHCP) server. Broadband routers are designed for convenience in setting up home networks, particularly for homes with high-speed cable modem or DSL Internet service.

Typical Router features

• Static Routing

Static routing makes it possible to customize specific routes of data through the network, described by fixed paths. The fixed paths are manually added to the routing table. Incoming packets are matched against the routing table on the basis of the destination IP address, and if an entry with a matching IP address exists, the packet is directed to a next-hop gateway. The information about the next-hop gateway is stored in the matching entry and consists of an IP address, netmask and interface.

Figure 1.3: Components of a standard broadband router containing an embedded NAT gateway.

• Dynamic Host Configuration Protocol (DHCP) server

A DHCP-server dynamically manages the network information, such as the IP addresses, for internal hosts running a DHCP-client. When a host enters the internal network, it requests an IP address and other relevant information from the DHCP-server. Based on the host’s Media Access Control (MAC) address, the DHCP-server assigns a free IP address, if one is available. The assignment is time limited, and the host must renew its information on a regular basis.

• Domain Name System (DNS) server

The DNS-server translates domain names meaningful to humans into the numerical (binary) identifiers associated with networking equipment. The DNS-server includes a cache with information about recent lookups. If the local DNS-server is unable to answer the request, it sends the request to one or more designated DNS-servers. In this scenario, with a SoHo router, the ISP to which the router connects will usually supply this DNS-server.

• Multicast Enabler

A module of the router which enables multicast-specific traffic to be passed to and from a host on the private network behind the NAPT.

• Universal Plug and Play (UPnP) Support

UPnP comes with a solution for NAT traversal through the Internet Gateway Device (IGD) Protocol. A router can expose itself as an IGD, allowing any local UPnP controller to perform a variety of actions, including retrieving the external IP address of the device, enumerating existing port mappings, and adding and removing port mappings. By adding a port mapping, a UPnP controller behind the IGD can enable traversal of the IGD from an external address to an internal client.

• Quality of Service (QoS) Engine

Introduces prioritized traffic handling, which means that some traffic is assigned a higher priority than other traffic, e.g. Voice over Internet Protocol (VoIP) and traffic originating from online gaming.

• Application-Level Gateway (ALG)

Certain applications need explicit handling to work successfully in the presence of a NAT/NAPT unit. An ALG unit, which is interconnected with the NAT unit, takes care of these applications on an individual basis by allowing customized NAT traversal filters to be added, such that the router supports these applications.

• NAPT unit

Does not need any further introduction.

• Port Forwarding

Enables external access to one or multiple hosts on the private network by adding a static mapping for each in the NAPT table.

• Layer 2 Switch

Uses the MAC address at Layer 2 of the IP Suite to forward traffic. The layer 2 switch registers the MAC address of a sending host, together with the port it uses, in a table. This ensures in most cases that traffic is directed to the right port the first time, as the information about the port resides in the table together with the MAC address of the responding frame. In case the entry does not exist, the frame is forwarded to every existing port, and the switch registers the port on which the response subsequently arrives.

• MAC Filter

Restricts the access to the network by means of the requesting host’s MAC address. Only hosts with a MAC address that has been added to an internal MAC-filter list by an administrator are able to gain access to the network.


• Stateful Packet Inspection (SPI)

SPI is a unit which monitors the state of network connections, such as TCP and UDP communication, traveling across it. The unit is programmed to distinguish legitimate packets for different types of connections. Only packets matching a known connection state will be allowed by the firewall; others will be rejected.

• Spoofing filter

Spoofing is the creation of TCP/IP packets using somebody else’s IP address. Routers use the "destination IP address" to forward packets through the Internet, but ignore the "source IP address". A spoofing filter can in most cases prevent spoofing attacks by examining the "source IP address", provided the source IP address is known beforehand, i.e. a host on the private network has established a connection and registered information about the external destination.
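The learning behaviour described for the Layer 2 switch in the feature list can be sketched in a few lines. This is a minimal illustration only: the class name `LearningSwitch`, the port numbers and the shortened MAC strings are hypothetical, and the sketch floods to every port except the ingress port, which is an assumption about common switch behaviour rather than something stated in the text.

```python
class LearningSwitch:
    """Toy model of MAC learning; not the router design of this thesis."""

    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port on which it was last seen

    def handle_frame(self, in_port, src_mac, dst_mac):
        """Return the list of ports the frame is forwarded to."""
        # Learn (or refresh) the port used by the sending host.
        self.mac_table[src_mac] = in_port

        out_port = self.mac_table.get(dst_mac)
        if out_port is not None:
            # Known destination: forward to the right port the first time.
            return [out_port]
        # Unknown destination: flood to every other port (assumption:
        # the ingress port is excluded).
        return [port for port in self.ports if port != in_port]


switch = LearningSwitch(ports=[1, 2, 3])
first = switch.handle_frame(1, "aa:aa", "bb:bb")   # bb:bb unknown, flood
reply = switch.handle_frame(2, "bb:bb", "aa:aa")   # aa:aa learned on port 1
second = switch.handle_frame(1, "aa:aa", "bb:bb")  # bb:bb learned on port 2
print(first, reply, second)
```

The dictionary stands in for the switch table; a hardware switch would also age out entries, which is left out here.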

1.1 Motivation & Objective

The NAT approach is not a new invention and has been around for quite a while. The basic principles are well understood, but in a rapidly changing world, where new communication protocols and/or applications frequently see the light of day, vendors of network equipment, in this case NAT units, need to keep abreast of developments to compete with rival companies. Furthermore, in addition to the specific demands of new communication protocols or applications, the eternal demand for larger bandwidth applies at the same time. As a consequence of those demands, the NAT approach is subject to constant development.

The first implementations of NAT were done purely in software, typically running on an ordinary computer as a part of an operating system such as UNIX, Linux or Windows. Examples of popular software solutions are Netfilter/iptables, which is the de facto standard NAT/packet-filtering/firewall tool for Linux 2.4 and later kernels; IP Filter, which is a FreeBSD/NetBSD/OpenBSD implementation; and last but not least Microsoft’s Internet Connection Sharing (ICS), which implements NAT for Windows systems. Software-based NATs are still in use, but with the introduction of Fast Ethernet (100 Mbit/s) it seems that hardware-accelerated solutions have gained popularity [23] [35]. As most NAT implementations are commercial and therefore proprietary designs, it is difficult to comment on the current implementation trend, as little or no literature is published.


To support the next generation of Gigabit NATs, a hardware-accelerated solution using Content Addressable Memory (CAM) is examined in this thesis work, and a design solution will be tried out to achieve a wire-speed bandwidth of at least 1 Gigabit per second bi-directional. As table lookups are a fundamental task of every NAT, it is expected that the use of CAM as a key element will lead to success, that is, to achieving the aforementioned goal of designing a NAPT with a bi-directional wire-speed bandwidth of 1 Gigabit per second. The presumption comes from the fact that CAM modules are the fastest choice available for table lookups, as all entries are compared in parallel.
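To give a feeling for what wire speed means for the lookup logic, the following back-of-the-envelope sketch computes the worst-case packet rate. It assumes standard Ethernet framing (64-byte minimum frame, 8 bytes of preamble and start-of-frame delimiter, 12 bytes of inter-frame gap); these figures and all names in the code are assumptions for illustration and are not taken from the thesis, whose own bandwidth estimate may be derived differently.

```python
# Assumed standard Ethernet framing figures (not from the thesis text).
LINK_RATE_BPS = 1_000_000_000   # 1 Gbit/s in one direction
MIN_FRAME_BYTES = 64            # minimum Ethernet frame
PREAMBLE_SFD_BYTES = 8          # preamble + start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12       # mandatory idle time between frames


def worst_case_packet_rate(link_rate_bps):
    """Packets per second when the link carries only minimum-size frames."""
    bits_on_wire = 8 * (MIN_FRAME_BYTES + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES)
    return link_rate_bps / bits_on_wire


per_direction_pps = worst_case_packet_rate(LINK_RATE_BPS)
both_directions_pps = 2 * per_direction_pps
# Average time available per packet if one unit serves both directions.
time_budget_ns = 1e9 / both_directions_pps

print(f"per direction:   {per_direction_pps:,.0f} packets/s")
print(f"both directions: {both_directions_pps:,.0f} packets/s")
print(f"time budget:     {time_budget_ns:.0f} ns per packet")
```

Under these assumptions a single translation unit serving both directions has a few hundred nanoseconds per packet for all of its table lookups, which motivates looking at a lookup structure with a short, fixed search time.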

To sum up, the topic of this thesis is to help solve the NAPT task in hardware with the use of CAM. Such a solution would be of great importance to the telecommunication industry and would be useful in the next generation of SoHo routers.

The unified objectives of this thesis are divided into three parts and described below.

• Explore and analyse the essential requirements, in terms of architecture, protocol support etc., which need to be satisfied for a NAPT core to work properly in a modern Internet infrastructure.

• Design an architecture and determine the behavior of a NAPT core which fulfills the requirements stated above with the use of CAM modules as lookup tables, such that the model is capable of handling a bi-directional packet stream with a bandwidth of 2 Gbps.

• Implement the resulting architecture as a clock-true and structural/behavioral model with the use of a Hardware Description Language (HDL) to prove the concept of the proposed architecture and to give a bandwidth estimate of the implemented model.

1.2 Delimitation

1.2.1 Internet Layer

At the Internet layer the following considerations and delimitations are made for the following protocols.

IP


Since the NAT resides in a border gateway, where it translates packets passing between the local private network and another network, typically the Internet, it ought to support every protocol present to provide full transparency to the communicating end-hosts. Since Traditional NAT, both the Basic NAT and the NAPT approach, mainly uses the IP address and port number for the translation process, the focus regarding the Traditional NAT approach is centered on the Internet and Transport layers of the Internet Protocol Suite (Figure 1.1a). At the Internet layer the fourth version of the Internet Protocol is the dominant protocol and constitutes the foundation of the packet exchange on the Internet, as mentioned earlier. The Internet Protocol should definitely be supported by any NAT. IPv6, the counterpart to IPv4, should be supported as well, but IPv6 probably does not need to undergo any NAT operation, since there will be enough IP addresses available; it is therefore delimited in this work. Routers which accept both IP versions must, besides including a NAT to support the fourth version, also provide a mechanism for simply passing version six packets without any modifications.

The Internet Protocol provides IP fragmentation, so that packets can be fragmented into pieces small enough to pass over a network with a smaller Maximum Transmission Unit (MTU) than the original packet size. The processing of IP fragments is a challenge for NAPT gateways, as only the first fragment carries the transport-layer header with the port numbers. Subsequent fragments carry only an identifier that ties them to the first fragment. Since the actual handling in terms of IP address and port translation is similar to that of ordinary packets, IP fragmentation is delimited in this work.

Internet Protocol security (IPsec)

Both versions of the IP come in a secure version called IPsec [21], where the majority of the IP packet is encrypted. As a consequence, the NAT approach cannot cope with those packets, as the information it needs is encrypted. IPsec is a popular choice when introducing security at the Internet Layer of the IP Suite. The solution in use today is based on encapsulation, i.e. the whole IPsec packet is embedded in a "normal" IP/UDP packet at the sending end and decapsulated at the receiving end. From the NAT’s point of view the IPsec packet is masked and appears as an ordinary IP packet. The approach used for masking IPsec packets is called NAT-T [34] [4] and is handled entirely at the communicating end-hosts without involvement of the NAT. Therefore the support of IPsec packets is not considered a part of the NAT in this work either, and is delimited.

IP Multicast & IGMP

IP Multicast and the Internet Group Management Protocol (IGMP) are often employed for streaming media and Internet television applications. The support for IP multicast data flows and for IGMP, the protocol used by receivers to join a multicast group, is not treated, as it is considered an extension to ordinary UDP unicast data flows. The typical transport layer protocol used to carry the actual data in a multicast connection is in fact UDP. An extensive evaluation of UDP with regard to NAPT gateways will indeed be provided.

Internet Control Message Protocol (ICMP)

The ICMP protocol, or at least some of the features it provides, is considered a fundamental and necessary part of the Internet. The support for ICMP will be treated in this work with an emphasis on the crucial features. ICMP is by nature an Internet Layer protocol, but it resembles a Transport layer protocol in the way that the NAT handles it. This is why it is referred to as a transport layer protocol throughout this work.

1.2.2 Transport Layer

At the Transport layer the following considerations and delimitations are made for the following protocols.

UDP, TCP, DCCP and SCTP

At the transport layer the commonly applied protocols TCP and UDP are considered, while more recently developed protocols like the Datagram Congestion Control Protocol (DCCP) and the Stream Control Transmission Protocol (SCTP) are not. There are two main reasons for leaving those protocols out of this work. The first is the absence of native support in operating systems. The second is that the basic principles of the new protocols are based on the existing fundamental Transport layer protocols UDP and TCP, so the handling with regard to the NAT is much alike. This means that it should be possible to extend the protocol support in the NAT with just a little effort. While the first argument seems weak, as support for the next generation of transport protocols is of vital interest for staying compatible in the market, leaving the protocols out is justified, as this work is not a complete treatment of every aspect of a NAT and, as the second argument explains, UDP and TCP form the basis of the new protocols and share much of their architecture.


1.2.3 Application Layer

As the point of departure the application layer is not considered by the NAT approach, but unfortunately some applications place IP addresses in the application layer payload. The NAT unit does not by default pay regard to the application layer; therefore the concept of the ALG has been introduced. An ALG monitors the application layer for selected applications. Since this handling is not standardized, the ALG must be configurable to handle each application concerned individually. One of the most popular applications which requires explicit attention and management is the File Transfer Protocol (FTP). As the name implies, the protocol is used to manage file transfer between communicating endhosts. As a part of the procedure when initiating a file transfer, the IP address and the transport layer port number are embedded as payload (in American Standard Code for Information Interchange (ASCII)). The ALG in the NAT gateway must detect this situation and translate the IP address and port number in the payload as well as the IP address and port number in the packet’s header. The support for FTP in the shape of an ALG in NAT gateways has over time become more or less a standard feature of all commercial NATs, but it is considered a plugin to the fundamental NAT operation and is therefore not treated further in this work.
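Although the ALG is outside the scope of this work, the payload rewrite described above can be illustrated with a small sketch for the FTP `PORT` command, which carries the address and port as six comma-separated decimal numbers. The function name `rewrite_port_command`, the `translate` callback and the example addresses and ports are hypothetical and chosen for illustration only; the sketch also ignores that a real ALG must adjust TCP sequence numbers and checksums when the payload length changes.

```python
import re

# FTP active mode: "PORT h1,h2,h3,h4,p1,p2" where port = p1 * 256 + p2.
PORT_COMMAND = re.compile(r"PORT (\d+),(\d+),(\d+),(\d+),(\d+),(\d+)\r\n")


def rewrite_port_command(line, translate):
    """Rewrite the address and port embedded in an FTP PORT command.

    `translate` is a hypothetical callback mapping (private_ip, private_port)
    to (public_ip, public_port), standing in for the NAPT table lookup.
    Lines that are not PORT commands are returned unchanged.
    """
    match = PORT_COMMAND.fullmatch(line)
    if match is None:
        return line
    numbers = [int(group) for group in match.groups()]
    private_ip = ".".join(str(n) for n in numbers[:4])
    private_port = numbers[4] * 256 + numbers[5]

    public_ip, public_port = translate(private_ip, private_port)
    fields = public_ip.split(".") + [str(public_port // 256), str(public_port % 256)]
    return "PORT " + ",".join(fields) + "\r\n"


def example_translate(ip, port):
    # Fixed mapping for illustration only.
    return ("197.76.27.4", 40000)


rewritten = rewrite_port_command("PORT 10,0,0,3,19,137\r\n", example_translate)
print(repr(rewritten))
```

In the example the private endpoint 10.0.0.3, port 5001, is replaced by the public endpoint 197.76.27.4, port 40000, inside the ASCII payload.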

1.3 Overview

In fulfillment of the objectives, the remainder of this thesis is divided into seven chapters, where each chapter requires knowledge from the preceding chapters.

Chapter 2 Fundamentals of NAT: Presents a summary of the fundamentals of the NAT approach, with a main focus on Traditional NAT, but other types of NAT are treated briefly for completeness.

Chapter 3 Analysis and design of a NAPT core: Covers the exploration and design of a NAPT core based on CAM modules.

Chapter 4 Implementation of a NAPT core: Documents the implemented proof-of-concept model.

Chapter 5 Test and Verification: A functional test of the implemented model is carried out to verify selected cases.

Chapter 6 Results: Determination of how to estimate the bandwidth and latency of the implemented model and a presentation of the results.


Chapter 7 Discussion: Presents a discussion of possible ideas for future work.

Chapter 8 Conclusion: The conclusion of the work done in this thesis.

1.4 Nomenclature

Public/Global/External realm/interface = The interface on the public side of the NAPT gateway, i.e. the Internet.

Private/Local/Internal realm/interface = The interface on the private side of the NAPT gateway, i.e. the private host/internal application.

Uni-directional = Only initiated outbound sessions can create a mapping between address realms.

Computer = host = endhost = endpoint

NAT/NAPT gateway = Part of the router which handles the NAT/NAPT related work.

NAPT/NAT core = Hardware part of the NAT/NAPT gateway.

Inbound = Packets or connections originating from the public interface to the private interface.

Outbound = Packets or connections originating from the private interface to the public interface.


Chapter 2

Fundamentals of Network Address Translation

This section introduces basic NAT terminology and conventions used throughout the thesis, and then outlines general NAT traversal techniques that apply equally to TCP, UDP and ICMP.

2.1 Traditional NAT

This section describes the most common Network Address Translation operation, normally referred to as Traditional NAT, which has already been briefly mentioned in the Introduction (Chapter 1). Traditional NAT makes it possible for hosts within a private network to get transparent access to a host residing in a public network. The transparency holds in most cases, but there are situations where it is difficult or impossible to accomplish. Those troublesome situations are not covered, with reference to the Delimitation in section 1.2. Sessions in a traditional NAT are uni-directional, which means that only initiated outbound sessions can create a binding between address realms. Initiated inbound sessions are only allowed on an exceptional basis, where a static binding is created beforehand. Traditional NAT covers two variations of operation, which are called Basic NAT and Network Address and Port Translation, or NAPT. Basic NAT manipulates IP addresses only, while NAPT also includes transport identifiers such as the TCP/UDP port or ICMP query ID.

Basic NAT and NAPT will be described in detail in the following subsections, and examples of their use will be given.

2.1.1 Basic Network Address Translation (Basic NAT)

Static Translation

In a Network Address Translator using static address binding there is a one-to-one address mapping for hosts between a private network address and a public network address. The assignments of address bindings are based on a pre-configuration carried out by an administrator and not on session flows. The behavior of a NAT which makes use of static bindings differs from the dynamic translation approach in particular in that a host on the public network is allowed to initiate a session.

Example:

The following example, illustrated in Figure 2.1, shows a network configuration where a private network is connected to a public network through a router which incorporates a NAT gateway. The NAT in this example is configured as a pure static NAT with only one address binding, which maps the private IP address 10.0.0.3 to the public IP address 197.76.27.4 and vice versa. Inbound and outbound sessions are treated equally. It should be noted that a unique public address is needed for every private-to-public binding.

Figure 2.1: Illustration of a static Basic Network Address Translation operation.
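
The static binding of this example can be sketched in a few lines of Python. This is only an illustration and not part of the design; the dictionary-based packet representation and the names STATIC_BINDINGS, translate_outbound and translate_inbound are assumptions made for the sketch, as is the choice to drop packets that have no binding.

```python
# Static Basic NAT: one pre-configured one-to-one binding (addresses from the example).
STATIC_BINDINGS = {"10.0.0.3": "197.76.27.4"}                       # private -> public
REVERSE_BINDINGS = {pub: priv for priv, pub in STATIC_BINDINGS.items()}


def translate_outbound(packet):
    """Rewrite the source address of a packet leaving the private network."""
    public = STATIC_BINDINGS.get(packet["src"])
    if public is None:
        return None  # no binding configured: this sketch simply drops the packet
    return {**packet, "src": public}


def translate_inbound(packet):
    """Rewrite the destination address of a packet entering the private network."""
    private = REVERSE_BINDINGS.get(packet["dst"])
    if private is None:
        return None  # no binding configured: this sketch simply drops the packet
    return {**packet, "dst": private}
```

Because the binding is static, translate_inbound works without any preceding outbound packet, which is the property that allows a public host to initiate a session.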

Dynamic Translation

In the dynamic translation approach the private network shares a pool of public network addresses. Address bindings in the NAT gateway are created for outbound sessions and removed either when the session terminates or when a timer expires. If the pool of public network addresses is larger than or equal to the number of private network addresses, simultaneous access is feasible for all hosts; otherwise a policy is needed to control the address assignments. The exact nature of address assignment is specific to individual NAT implementations.

Example:

The following example, illustrated in Figure 2.2, shows a network configuration where a private network is connected to a public network through a router that incorporates a NAT gateway. The NAT in this example is configured as a pure dynamic NAT, where the private network shares a pool of public network addresses in the range 197.76.27.4-197.76.27.5. The private host with the IP address 10.0.0.3 sends a packet to the public host with the IP address 197.76.33.1. The blue path shows the initial outbound packet from the private host to the public host, for which the NAT gateway picks a public address from the pool and creates a binding for that session. As soon as the binding is assigned, packets can flow in either direction as long as the binding exists.

Figure 2.2: Illustration of a dynamic Basic Network Address Translation operation.
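
A minimal sketch of the dynamic approach, assuming a first-free assignment policy and an idle timer, could look as follows. The class name DynamicNat, the timeout value and the explicit now argument are assumptions made for the illustration; as stated above, the real assignment policy is implementation specific.

```python
class DynamicNat:
    """Dynamic Basic NAT: private hosts share a pool of public addresses."""

    def __init__(self, pool, timeout=60.0):
        self.free = list(pool)      # public addresses not yet bound
        self.bindings = {}          # private address -> (public address, last-used time)
        self.timeout = timeout      # idle time after which a binding expires (assumed value)

    def expire(self, now):
        """Remove bindings that have been idle longer than the timeout."""
        for private, (public, last_used) in list(self.bindings.items()):
            if now - last_used > self.timeout:
                del self.bindings[private]
                self.free.append(public)

    def outbound(self, private, now):
        """Return the public address for an outbound packet, or None if the pool is exhausted."""
        self.expire(now)
        if private in self.bindings:
            public, _ = self.bindings[private]
        elif self.free:
            public = self.free.pop(0)
        else:
            return None             # pool exhausted: an assignment policy would be needed here
        self.bindings[private] = (public, now)
        return public

    def inbound(self, public, now):
        """Return the private address bound to a public address, or None if no binding exists."""
        self.expire(now)
        for private, (bound, _) in self.bindings.items():
            if bound == public:
                self.bindings[private] = (bound, now)   # refresh the idle timer
                return private
        return None
```

With the pool 197.76.27.4-197.76.27.5, the first outbound packet from 10.0.0.3 is bound to 197.76.27.4, and a third private host gets no address until one of the two existing bindings has expired.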

2.1.2 Network Address and Port Translation (NAPT)

Network Address and Port Translation, also called dynamic translation with overloading or IP masquerading (a dear child has many names), extends the aforementioned Basic NAT by also including the transport identifier, e.g. TCP port numbers, UDP port numbers or ICMP identifiers. Including the port number in the binding in conjunction with the IP address enables multiple hosts on a private network to have simultaneous sessions sharing a single public network address. This approach is commonly used in Small Office Home Office (SOHO) routers, where it gives a group of hosts access to the Internet through a single public network address assigned by an Internet Service Provider (ISP). For the record, it is of course possible to have more than one public network address assigned to the NAPT; for this scenario see the section "Multihomed NATs".

Just as with dynamic Basic NAT, a new address binding is created for a new outbound session, although this depends on the binding behavior of the NAT in question, which is described in detail later in this section. Sessions are removed either by termination or by expiration of a timer, again analogous to the aforementioned Basic NAT.

Example:

The example in Figure 2.3 illustrates the basic Network Address and Port Translation operation, where a host with the IP address 10.0.0.3 and port number 7321 on the private network sends a packet to a telnet server with the IP address 197.76.33.1 and port number 23. When the outbound packet leaves the NAT gateway, a session has been created, consisting of a binding between a pair of tuples: the private IP address and port, and a public IP address and port. The source IP address and port of the packet are then changed to the external IP address of the NAT gateway and a unique port number. Any subsequent response from the telnet server to the sending host on the private network has to be addressed to the NAT gateway, where the correct private IP address and port number are looked up and the packet's destination address and port are replaced with them.
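
The translation in this example can be sketched as below. The external address of the NAT gateway (NAT_PUBLIC_IP) and the first public port (FIRST_PUBLIC_PORT) are not given in the text and are assumed values, and the class Napt with its methods is a hypothetical illustration rather than the implemented core. Bindings are never removed in this sketch, so counting them is enough to produce a unique port.

```python
NAT_PUBLIC_IP = "197.76.27.4"   # assumed external address of the NAT gateway
FIRST_PUBLIC_PORT = 5000        # assumed start of the port range handed out by the gateway


class Napt:
    def __init__(self):
        self.out = {}    # (private ip, private port) -> public port
        self.back = {}   # public port -> (private ip, private port)

    def outbound(self, src, dst):
        """Translate the source tuple of an outbound packet, creating a binding if needed."""
        if src not in self.out:
            port = FIRST_PUBLIC_PORT + len(self.out)   # unique because bindings are never removed here
            self.out[src] = port
            self.back[port] = src
        return (NAT_PUBLIC_IP, self.out[src]), dst

    def inbound(self, src, dst):
        """Translate the destination tuple of an inbound packet, or return None if it is unknown."""
        ip, port = dst
        if ip != NAT_PUBLIC_IP or port not in self.back:
            return None
        return src, self.back[port]
```

An outbound packet from ("10.0.0.3", 7321) to ("197.76.33.1", 23) leaves with the source tuple ("197.76.27.4", 5000), and the reply addressed to that tuple is delivered to ("10.0.0.3", 7321).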

Figure 2.3: Illustration of a Network Address and Port Translation (NAPT) operation.

Endpoint Mapping Behavior

For a NAT gateway to open a session to the public network, regardless of the Transport Layer protocol, an outbound packet from a host on the private network is needed, as with the dynamic approach. From the NAT gateway's point of view, a session is defined as a mapping between a pair of tuples: a private IP address and port (private IP address:private port) and a public IP address and port (public IP address:public port). The outbound packet's private IP address and port are translated to the mapped public IP address and port specific to that session, so that any following inbound packets can be directed to the right host on the private network. The public IP address and port are both assigned by the NAT gateway. The way this is done, together with the administration policy, defines the NAT gateway's behavior. It is of great importance for many applications, since this knowledge is useful to end-hosts attempting to traverse NAT gateways: it allows them to predict the mapped address and port of a connection based on previous connections.

The NAT gateway's mapping behavior for TCP packet flows has been captured and further classified in [32]; the classes are more or less comparable to the UDP behavior classification in [31]. The different classes, based on [32], are listed and described below.

Endpoint-Independent mapping behavior

The mapping is determined only by the source address and port and is independent of the destination address and port. An existing mapping is reused for subsequent sessions if the packet is sent from the same IP address and port on the private network, independent of the destination's IP address and port on the public network; otherwise a new binding is created and used for the new session. This behavior equals Cone behavior in [31] and is referred to as "Endpoint-Independent Mapping" in [10]. The example below demonstrates the behavior of an Endpoint-Independent NAT gateway and is based on a simple setup, where the same host A:1 on a private network sends four packets through a NAT gateway, initially empty, to three different destinations B:10, B:11 and C:10 on the public network. The first packet transmission to the public host B:10 creates a binding in the NAT table. As can be seen, that binding applies to the following transmissions as well, since the destination's IP address and port are irrelevant. Figure 2.4 illustrates the described example, where the NAT gateway has an Endpoint-Independent mapping behavior.

Figure 2.4: Examples of a NAT gateway with an Endpoint-Independent mapping behavior.

Address and Port-Dependent Mapping Behavior

The mapping is determined by both the source IP address and port and the destination IP address and port. An existing mapping is reused for subsequent sessions if the packet is sent from the same IP address and port on the private network and is addressed to the same IP address and port on the public network; otherwise a new binding is created and used for the new session. This behavior equals Symmetric behavior in [31] and is referred to as "Address and Port-Dependent Mapping" in [10]. The example below demonstrates the behavior of an Address and Port-Dependent NAT gateway and is based on the same simple setup as the previous example, where the same host A:1 on a private network sends four packets through a NAT gateway, initially empty, to three different destinations B:10, B:11 and C:10 on the public network. The first packet transmission to the public host B:10 creates a binding in the NAT table, and the second transmission reuses the binding since the tuple pairs (IP address:port) are identical. For the third and fourth transmissions a new binding is created each time, because the port number and the IP address, respectively, differ from those of the first and second transmissions. Figure 2.5 illustrates the described example, where the NAT gateway has an Address and Port-Dependent mapping behavior.

Figure 2.5: Examples of a NAT gateway with an Address and Port-Dependent mapping behavior.

Port-Dependent Mapping Behavior

The mapping is determined by both the source IP address and port and the destination port. An existing mapping is reused for subsequent sessions if the packet is sent from the same IP address and port on the private network and is addressed to the same port on the public network; otherwise a new binding is created and used for the new session. The example below demonstrates the behavior of a Port-Dependent NAT gateway and is based on the same simple setup as the previous examples, where the same host A:1 on a private network sends four packets through a NAT gateway, initially empty, to three different destinations B:10, B:11 and C:10 on the public network. The first packet transmission to the public host B:10 creates a binding in the NAT table, and the second transmission reuses the binding since the relevant parts of the tuple pairs (IP address:port) are identical. A new binding is created for the third transmission, as the destination port differs. The fourth and last transmission reuses the first binding, as the destination port is the same and the destination IP address is irrelevant. Figure 2.6 illustrates the described example, where the NAT gateway has a Port-Dependent mapping behavior.

Figure 2.6: Examples of a NAT gateway with a Port-Dependent mapping behavior.

Address-Dependent Mapping Behavior

The mapping is determined by both the source IP address and port and the destination IP address. An existing mapping is reused for subsequent sessions if the packet is sent from the same IP address and port on the private network and is addressed to the same IP address on the public network; otherwise a new binding is created and used for the new session. This behavior is referred to as "Address-Dependent Mapping" in [10]. The example below demonstrates the behavior of an Address-Dependent NAT gateway and is based on a simple setup analogous to the previous examples, where the same host A:1 on a private network sends four packets through a NAT gateway, initially empty, to three different destinations B:10, B:11 and C:10 on the public network. The first packet transmission to the public host B:10 creates a binding in the NAT table, and the second and third transmissions reuse the binding since the relevant parts of the tuple pairs (IP address:port) are identical. A new binding is created for the fourth transmission, as the IP address differs. Figure 2.7 illustrates the described example, where the NAT gateway has an Address-Dependent mapping behavior.

Figure 2.7: Examples of a NAT gateway with an Address-Dependent mapping behavior.

Connection-Dependent Mapping Behavior

The mapping is determined by the individual outbound packet transmissions alone. An existing mapping is never reused; the NAT gateway creates a new binding for each outbound packet. Connection-Dependent mapping behavior is relatively rare in more recent NAT gateway implementations [32]. The example below demonstrates the behavior of a Connection-Dependent NAT gateway and is based on a simple setup analogous to the previous examples, where the same host A:1 on a private network sends four packets through a NAT gateway, initially empty, to three different destinations B:10, B:11 and C:10 on the public network. The behavior is quite intuitive, and it is easily seen that a new binding is created for each transmission. In this example the full source and destination information is included in the binding; this is not an absolute requirement. Figure 2.8 illustrates the described example, where the NAT gateway has a Connection-Dependent mapping behavior.

Figure 2.8: Examples of a NAT gateway with a Connection-Dependent mapping behavior.
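
The five mapping behaviors differ only in which parts of a packet decide whether an existing binding is reused. The following sketch expresses this as the key under which a binding would be stored. The function names and behavior strings are assumptions made for the illustration, and the order of the four transmissions (B:10, B:10, B:11, C:10) is inferred from the descriptions of the examples above.

```python
def mapping_key(behavior, src, dst, packet_no):
    """Return the key under which a binding is stored for the given mapping behavior.

    src and dst are (address, port) tuples. packet_no is only used for the
    connection-dependent behavior, where every outbound packet gets its own binding.
    """
    dst_addr, dst_port = dst
    if behavior == "endpoint-independent":
        return (src,)
    if behavior == "address-dependent":
        return (src, dst_addr)
    if behavior == "port-dependent":
        return (src, dst_port)
    if behavior == "address-and-port-dependent":
        return (src, dst)
    if behavior == "connection-dependent":
        return (src, dst, packet_no)
    raise ValueError(f"unknown mapping behavior: {behavior}")


def count_bindings(behavior, packets):
    """Count the bindings created when the packets pass an initially empty NAT gateway."""
    keys = {mapping_key(behavior, src, dst, n) for n, (src, dst) in enumerate(packets)}
    return len(keys)


# The four transmissions of the examples: host A:1 sends to B:10, B:10, B:11 and C:10.
A = ("A", 1)
PACKETS = [(A, ("B", 10)), (A, ("B", 10)), (A, ("B", 11)), (A, ("C", 10))]
```

For these four packets, count_bindings gives 1 binding for endpoint-independent, 2 for address-dependent, 2 for port-dependent, 3 for address-and-port-dependent and 4 for connection-dependent mapping, in agreement with the descriptions above.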

Firewall/Endpoint-filtering Behavior

Firewall/Endpoint-filtering behavior should be distinguished from the binding behavior and viewed as a set of rules for established sessions in a NAT gateway. When an outbound session creates a binding between two hosts on a private and a public network, respectively, a firewall/Endpoint-filtering rule defined by an administrator is attached to that specific session. Imagine a situation where an application running on a host behind an Endpoint-Independent NAT gateway needs a symmetric connection to a server on the public network. If no firewall/Endpoint-filtering is applied, every host on the public network that knows the mapped address and port of the private host could reach it as soon as the binding has been established. The firewall/Endpoint-filtering can restrict the access to a host on the private network in four different ways. Those four configuration modes are presented and illustrated in the following sections.

Endpoint-Independent Filtering Behavior

Once an initial outbound packet from a host on a private network has created a binding in the NAT gateway, any inbound packet destined for the host on the private network is allowed to pass through regardless of the source IP address and port. Figure 2.9 illustrates this scenario.

Figure 2.9: The NAT gateway's firewall/endpoint-filtering rule is set to Endpoint-Independent.

Address-Dependent Filtering Behavior

Once an initial outbound packet from a host on a private network has created a binding in the NAT gateway, only inbound packets destined for the host on the private network with a source IP address matching the destination IP address of the initial packet are allowed to pass through, regardless of the source port. Figure 2.10 illustrates this scenario.

Figure 2.10: The NAT gateway's firewall/endpoint-filtering rule is set to Address-Dependent.

Port-Dependent Filtering Behavior

Once an initial outbound packet from a host on a private network has created a binding in the NAT gateway, only inbound packets destined for the host on the private network with a source port matching the destination port of the initial packet are allowed to pass through, regardless of the source IP address. Figure 2.11 illustrates this scenario.

Figure 2.11: The NAT gateway's firewall/endpoint-filtering rule is set to Port-Dependent.

Address and Port-Dependent Filtering Behavior

With this filtering behavior, only inbound packets from the host on the public network whose IP address and port equal the destination IP address and port of the initial and later outbound packets sent from a host on the private network are allowed to pass through. Address and Port-Dependent filtering is the strictest filtering mode and is used where restricted application transparency is important. Figure 2.12 illustrates this scenario.
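
The four filtering modes can be summarized as a single decision function. This is a sketch only: the function name inbound_allowed and the rule strings are assumptions, and the set of contacted endpoints stands in for whatever the NAT gateway records about the outbound packets of a binding.

```python
def inbound_allowed(rule, contacted, sender):
    """Decide whether an inbound packet may pass an existing binding.

    contacted is the set of (address, port) endpoints the private host has sent
    packets to through this binding; sender is the (address, port) of the inbound packet.
    """
    addr, port = sender
    if rule == "endpoint-independent":
        return True                                   # any sender may use the binding
    if rule == "address-dependent":
        return any(a == addr for a, _ in contacted)   # source port is ignored
    if rule == "port-dependent":
        return any(p == port for _, p in contacted)   # source address is ignored
    if rule == "address-and-port-dependent":
        return sender in contacted                    # both must match
    raise ValueError(f"unknown filtering rule: {rule}")
```

If the private host has only contacted B:10, a packet from B:99 passes the address-dependent rule, a packet from C:10 passes the port-dependent rule, and only a packet from B:10 passes the address-and-port-dependent rule.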

Figure 2.12: The NAT gateway's firewall/endpoint-filtering rule is set to Address and Port-Dependent.

2.1.3 Hairpinning Behavior

A situation where hosts on a private network behind a NAT gateway exchange packets among themselves through the NAT gateway is referred to as hairpinning. NAT hairpinning behavior can further be divided into two types, based on the address policy used by the NAT gateway. The first is called "External source IP address and port NAT hairpinning behavior", and an example is provided in Figure 2.13. As a starting point, imagine that an address:port binding already exists for host B on the private network with the IP address 10.0.0.3 and port number 3587. The other host A on the private network, with IP address 10.0.0.2 and port number 1053, knows the public IP address and port number of host B and wants to send it a packet. When the NAT gateway receives the packet from host A, a new address:port binding is assigned for the source address:port, which is changed as normal. Since the destination address of the packet is the same as the NAT gateway's own address, the NAT gateway recognizes a hairpin situation, and the public destination address:port is also looked up and changed. Changing the source address:port to the external one is what characterizes the "External source IP address and port hairpinning behavior". Figure 2.14 shows an example analogous to the one above, except that the source address:port is left unchanged; the NAT gateway is then said to have an "Internal source IP address and port hairpinning behavior".

Figure 2.13: External source IP address and port NAT hairpinning behavior.

Figure 2.14: Internal source IP address and port NAT hairpinning behavior.
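
The difference between the two hairpinning behaviors can be sketched as follows. The public address of the NAT gateway (NAT_IP) and the public port numbers used in the usage note are assumed values that do not appear in the text, and for brevity both hosts are assumed to have a binding already.

```python
NAT_IP = "197.76.27.4"  # assumed public address of the NAT gateway


def hairpin(packet, bindings, external_source=True):
    """Translate a packet sent by one private host to the public tuple of another.

    packet is a (source, destination) pair of (address, port) tuples. bindings maps
    private (address, port) tuples to public port numbers on NAT_IP; both hosts are
    assumed to have a binding already. Returns the packet as delivered on the
    private network, or None if it is not a hairpin case.
    """
    src, dst = packet
    by_public_port = {port: private for private, port in bindings.items()}
    if dst[0] != NAT_IP or dst[1] not in by_public_port:
        return None
    new_dst = by_public_port[dst[1]]        # destination is mapped back to the private tuple
    if external_source:
        new_src = (NAT_IP, bindings[src])   # external source behavior: sender appears as its public tuple
    else:
        new_src = src                       # internal source behavior: sender keeps its private tuple
    return new_src, new_dst
```

With host B (10.0.0.3:3587) bound to the assumed public port 6000 and host A (10.0.0.2:1053) bound to 6001, a packet from A to 197.76.27.4:6000 arrives at B with source 197.76.27.4:6001 in the external case and with source 10.0.0.2:1053 in the internal case.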

Chapter 3

Analysis and Design of a NAPT Core

3.1 NAPT Tables Mapping Behavior

The design of the NAPT table's mapping behavior requires a specification. This section discusses how the mapping behavior of the NAPT gateway may affect applications residing on both sides of the NAPT gateway. Explanations with real-world examples are given as part of the argumentation to justify possible design choices.

When a private host opens a new outbound session through a NAPT, the NAPT needs to create an address & port mapping containing information about the session. Section 2.1.2 illustrates five different mapping behaviors, each of which has great impact on whether the NAPT reuses existing mappings for successive packets or creates new mappings. Many applications are sensitive to the NAPT behavior with regard to multiple simultaneous sessions established to different public hosts. Imagine a situation where an application on a host in the private realm uses a UNilateral Self-Address Fixing (UNSAF) method in a start-up procedure to determine its public address & port. The UNSAF method involves an UNSAF client at the private host and an UNSAF server residing on the public network. The UNSAF client sends a request to the UNSAF server, and the UNSAF server responds by sending a packet back to the UNSAF client with the public source address & port of the request packet in its payload. The UNSAF client is now able to pass on the public address & port to the originating application. For this start-up procedure to work, the NAPT mapping must be the same regardless of the destination host when the application is in operation, i.e. the NAPT mapping behavior needs to be "Endpoint-Independent".
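
Why the UNSAF procedure depends on the mapping behavior can be shown with a small simulation. All addresses, port numbers and names in the sketch are hypothetical; the class Napt only models the choice of public port and is not the proposed design.

```python
class Napt:
    """Models only how a public port is chosen for an outbound packet."""

    def __init__(self, endpoint_independent):
        self.endpoint_independent = endpoint_independent
        self.ports = {}                                  # mapping key -> public port

    def public_port(self, src, dst):
        # Endpoint-independent: the key ignores the destination.
        # Otherwise (address and port-dependent): the destination is part of the key.
        key = src if self.endpoint_independent else (src, dst)
        if key not in self.ports:
            self.ports[key] = 5000 + len(self.ports)     # assumed port numbering
        return self.ports[key]


def learned_port_is_usable(napt):
    """Check whether the port reported by the UNSAF server is the one a peer sees."""
    app = ("10.0.0.3", 7321)              # hypothetical application on the private host
    unsaf_server = ("197.76.33.1", 3478)  # hypothetical UNSAF server
    peer = ("197.76.40.9", 4000)          # hypothetical peer the application talks to later
    learned = napt.public_port(app, unsaf_server)   # reported in the server's payload
    actual = napt.public_port(app, peer)            # what the peer really sees
    return learned == actual
```

Here learned_port_is_usable(Napt(endpoint_independent=True)) is True, whereas with endpoint_independent=False the peer sees a different port than the one the application learned, so the start-up procedure fails.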

In general, one should aim for a NAPT mapping behavior that causes the least harm to applications living in a NAPT environment. The following example describes a situation where an application on a private host requires an "Endpoint-Independent mapping" behavior of the NAPT it is behind in order to work properly. Unfortunately the NAPT in question cannot fulfill this requirement, since its mapping behavior is "Address and Port-Dependent mapping". To make this scenario work anyway a relay can be used, but this is often impractical. Figure 3.1 illustrates the case where no relay is involved, and it can be seen that every host on the public network sees a different port number, as expected with a NAPT with an "Address and Port-Dependent mapping" behavior. To overcome this problem a relay is located in the public realm and acts as something resembling a reverse NAPT, virtually placed between the actual NAPT and the involved hosts on the public network. The outcome is that all participants refer to the same IP address and port number, as with the "Endpoint-Independent mapping" approach, as shown in Figure 3.2.

Figure 3.1: Without relay.

Figure 3.2: With relay.

Since the NAPT procedure is not standardized, the behavior of all aspects can differ from implementation to implementation, resulting in great confusion for application designers, which has increased the demand for implementation guidelines for vendors of NAPTs and NATs. This confusion led to the work done in [10] and [13], which attempts to clarify and justify some of these issues by specifying best current practices for handling TCP and UDP traffic. The outcome of these best current practices is a list of requirements for each of the two protocols, each with an associated key word such as "MUST", "MUST NOT" or "RECOMMENDED". A complete list of key words can be found in section 3 of [10]; they are to be interpreted as described in [6].

With the examples and the discussion in mind, the final decision about the mapping behavior of the NAPT can be taken. The already mentioned [10] and [13], which give the best current practices that allow many applications, such as multimedia communications or online gaming, to work consistently, endorse the decision to use an "Endpoint-Independent mapping" approach, as they state that a NAT/NAPT MUST have an "Endpoint-Independent mapping" behavior.

As a side benefit from a hardware designer's point of view, the total area cost is minimized by selecting the "Endpoint-Independent mapping", since only the private host's source address and port need to be stored in the NAPT table, as shown in Figure 2.4.

3.1.1 CAM Based Mapping Table

This section presents a CAM-table based hardware solution for handling the basic address & port mapping between a private and a public realm.

As the CAM has a vital part to play, examples are given to support the understanding of the proposed solution. The requirements from the specification of the mapping behavior above that are crucial for the design are taken into account.

As stated before, only outbound sessions are capable of creating mappings between the private and the public realm. When the very first outbound packet meets an empty table structure inside the NAPT, the session is created, followed by the actual address and port translation. What the translation process needs, and what the mapping creation process must deliver, is a unique identifier, which is used as the basis for the NAPT port number of that session. In the examples shown in this section, the NAPT port number is generated with a simple approach where an offset is added to the unique identifier. To comply with the requirement of an "Endpoint-Independent mapping" behavior, this unique identifier needs to point at that specific application on the private network, which means that it must be reused for successive packets originating from the very same application, independent of the destination's address and port.

A CAM module is used as the core mapping between the private and public realms: the address of an entry serves as the unique identifier, and the content holds the original source address & port of the session together with a Valid-bit. The Valid-bit is used to distinguish available entries from entries in use. Since the CAM module has its characteristic parallel search for bit patterns, it is possible to search for entries where the Valid-bit indicates that the entry is free; in this example a Valid-bit equal to zero means a free entry. One should remember that in case of multiple matches, the entry with the highest CAM address is returned.
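
The behavior of such a CAM module can be modelled in software as shown below. This is a behavioral sketch only, not the hardware description: the class BinaryCam, the word width WORD_BITS (a 32-bit address, a 16-bit port and the Valid-bit) and the placement of the Valid-bit in the least significant position are assumptions, and the parallel compare of the hardware is replaced by a loop.

```python
WORD_BITS = 49                      # assumed: 32-bit address + 16-bit port + Valid-bit
FULL_MASK = (1 << WORD_BITS) - 1    # compare every bit of the word
VALID_MASK = 1                      # compare the Valid-bit only (assumed to be the LSB)


class BinaryCam:
    """Software model of a binary CAM with a lookup mask."""

    def __init__(self, depth):
        self.entries = [0] * depth  # every word starts as zero, i.e. Valid-bit cleared

    def write(self, address, word):
        self.entries[address] = word

    def read(self, address):
        return self.entries[address]

    def search(self, data, mask):
        """Return the highest address whose masked content equals the masked data, else None."""
        for address in range(len(self.entries) - 1, -1, -1):
            if ((self.entries[address] ^ data) & mask) == 0:
                return address
        return None
```

Searching an empty five-entry CAM with data 0 and VALID_MASK returns address 4, the highest free entry; once a word with the Valid-bit set has been written there, the same search returns address 3.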

To show how this proposal actually works and how it handles the address & port translation, some examples are given with an explanation of typical outbound and inbound packet flows.

Outbound packet handling

Outbound packets traversing the NAPT either belong to a session already established in the mapping table or belong to an unknown session, which needs to be created. The two procedures for handling these situations are listed below.

Outbound session unknown to the mapping table:

• Lookup in CAM to get the NAPT port no. of the session (no match)

• Lookup for a free entry in CAM via the Valid-bit

• Set-up new binding

• Perform address & port translation etc.

Outbound session known to the mapping table:

• Lookup in CAM to get the NAPT port no. of the session

• Perform address & port translation etc.
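
The two procedures above can be combined into one function, sketched below. The table is modelled as a plain list instead of a CAM, PORT_OFFSET is an assumed value for the offset that turns the unique identifier into the NAPT port number, and the handling of a full table is left open.

```python
PORT_OFFSET = 1024   # assumed offset added to the CAM address to form the NAPT port number


def handle_outbound(table, src):
    """Return the NAPT port for an outbound packet, creating a binding if necessary.

    table is a list modelling the CAM: entry i holds the (address, port) tuple of
    the session stored at CAM address i, or None when the Valid-bit is cleared.
    """
    # Lookup in CAM to get the NAPT port number of the session.
    index = next((i for i in reversed(range(len(table))) if table[i] == src), None)
    if index is None:
        # Lookup for a free entry via the Valid-bit (highest free address first).
        index = next((i for i in reversed(range(len(table))) if table[i] is None), None)
        if index is None:
            return None            # table full: handling of this case is outside the sketch
        table[index] = src         # set up the new binding
    return PORT_OFFSET + index     # basis for the address & port translation
```

On an empty five-entry table the first packet from ("B", 121) is stored at address 4 and gets the port PORT_OFFSET + 4; every later packet from the same source tuple gets the same port, which is the endpoint-independent behavior required above.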

Clearly, known and unknown packets are treated differently. To give a better understanding of the steps involved and of the use of the CAM module, a walk-through of both cases is provided in the following two examples.

Example #1

A host on the private network with the address B and the port number 121 sends a packet to a destination, which is irrelevant to this example. When the packet is received by the NAPT gateway, the first step is to determine whether the packet belongs to an already existing session. A lookup is performed in the CAM module, with input data equal to a concatenation of the packet's source address & port and a single bit representing the Valid-bit, which is set to one since only valid entries are of interest, and with a mask consisting of all ones. In this example the CAM module is initially empty, and therefore no match is found in the lookup. The first mapping table in Figure 3.3 shows step 1: the lookup with the relevant input data above the table and the output below it. The missing match triggers the session setup procedure, where the first step is the search for a free entry that can be used for the new session. For this second lookup, a search for an entry with a cleared Valid-bit is done by setting the least significant bit of the input data to zero and using a mask that restricts the comparison to the least significant bit of the CAM entries. The mapping table in the middle of Figure 3.3 shows the second lookup and its output. Since the mapping table is not full, a match is found, at address 4 in the CAM module. The last step in the setup procedure is a write operation that saves the session data in the free entry at address 4 for further use. The written data equals the bit pattern used in the first lookup: a concatenation of the packet's source address & port and a single bit representing the Valid-bit. The session is now created, and the NAPT gateway can go on with the address & port translation etc.

Figure 3.3: Illustration of the procedure to handle an outbound packet belonging to a session unknown to the mapping table.
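
The three steps of Example #1 can be traced at bit level with the sketch below. The field widths, the table depth of five entries and the stand-in address 10.0.0.3 for host B are assumptions made for the illustration; only the port number 121, the two masks and the resulting address 4 are taken from the example.

```python
ADDR_BITS, PORT_BITS = 32, 16             # assumed field widths (IPv4 address, TCP/UDP port)
WORD_BITS = ADDR_BITS + PORT_BITS + 1     # plus the Valid-bit in the least significant position
FULL_MASK = (1 << WORD_BITS) - 1          # step 1: compare every bit
VALID_MASK = 1                            # step 2: compare the Valid-bit only


def pack(address, port, valid):
    """Concatenate source address, source port and Valid-bit into one CAM word."""
    return (((address << PORT_BITS) | port) << 1) | valid


def search(entries, data, mask):
    """Masked compare over all entries; the highest matching address wins."""
    hits = [a for a, word in enumerate(entries) if ((word ^ data) & mask) == 0]
    return max(hits) if hits else None


entries = [0] * 5                          # initially empty table with five entries (assumed depth)
key = pack(0x0A000003, 121, 1)             # host B is given the stand-in address 10.0.0.3
step1 = search(entries, key, FULL_MASK)    # lookup for an existing session
step2 = search(entries, 0, VALID_MASK)     # lookup for a free entry
entries[step2] = key                       # write the new session to the free entry
step3 = search(entries, key, FULL_MASK)    # the same lookup now succeeds
```

Here step1 is None (no match), step2 is 4 (the free entry) and step3 is 4 again, which is the lookup a subsequent packet of the same session performs.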

Example #2

If the very same host on the private network, with address B and port number 121 as in the example above, wants to communicate further, the subsequent packet flow reuses the mapping created at the arrival of the initial packet. The packet processing is therefore reduced to a simple lookup to get the unique identifier, as illustrated in Figure 3.4. As expected, the outcome is a match, and the address equals 4, as for the initial packet in the previous example.

Figure 3.4: Illustration of the procedure to handle an outbound packet belonging to a session already known to the mapping table.

Inbound packet handling

To complete the explanation of the use of the mapping table, this part looks at how inbound packets are handled. If the unique identifier extracted from the destination port number of the packet is within the address range of the table, a read operation can be performed. Figure 3.5 shows a situation where the NAPT gateway receives a response from the destination it is communicating with. The read operation requires an address, just as with an ordinary RAM, and the output is the content at that address. In this arrangement a check of the Valid-bit is necessary for the NAPT core to ensure that the session at that address is valid. If the entry turns out to be invalid, an exception has occurred and the packet is discarded.

Figure 3.5: Illustration of the procedure to handle an inbound packet which has a valid mapping in the mapping table.
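
The inbound handling can be sketched in the same style. As before, the table is modelled as a list in which None stands for a cleared Valid-bit, and PORT_OFFSET is an assumed value for the offset between CAM address and NAPT port number.

```python
PORT_OFFSET = 1024   # assumed offset between CAM address and NAPT port number


def handle_inbound(table, dst_port):
    """Return the private (address, port) for an inbound packet, or None to discard it."""
    index = dst_port - PORT_OFFSET     # unique identifier extracted from the destination port
    if not 0 <= index < len(table):
        return None                    # outside the address range of the table
    entry = table[index]               # RAM-like read at that address
    if entry is None:
        return None                    # Valid-bit cleared: exception, the packet is discarded
    return entry
```

With the session of host B stored at address 4, an inbound packet for port PORT_OFFSET + 4 is translated to ("B", 121), whereas packets for a free entry or for a port outside the table range are discarded.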

3.1.2 Discussion/Summary

The presented solution takes great advantage of CAM technology, on which it is founded. CAM tables are powerful for tasks where very fast lookups are needed. Since nothing comes for free, a CAM is expensive in terms of a larger transistor count per stored bit compared with an ordinary RAM-based solution. Furthermore, a parallel lookup in hardware is power-hungry, so heat dissipation could influence both the choice of package technology and the need for forced cooling.

Fortunately a lookup-mask is used to control, whether a specific content bit isessential for the lookup process or not. Because of this there is no need for athird bit-value of ”don’t care”. In the light of that, one can benefit by selectinga binary CAM in flavor of a ternary CAM because of the area ratio betweenthem.
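A minimal sketch of a masked lookup in a binary CAM is given here to make the role of the lookup mask concrete. The function names are hypothetical, the loop is a sequential stand-in for what the hardware does in parallel, and the highest-address-wins priority is assumed from the examples in this chapter.

```python
from typing import List, Optional


def masked_match(key: int, content: int, mask: int) -> bool:
    """True if key and content agree on every bit position where mask is 1."""
    return (key ^ content) & mask == 0


def cam_search(entries: List[int], key: int, mask: int) -> Optional[int]:
    """Return the highest address whose content matches under the mask."""
    for address in range(len(entries) - 1, -1, -1):
        if masked_match(key, entries[address], mask):
            return address
    return None


entries = [0b1010, 0b1100, 0b0110]
print(cam_search(entries, 0b0110, 0b1111))  # all bits compared
print(cam_search(entries, 0b1000, 0b1000))  # only the top bit compared
```

Because the mask is supplied with the search key, the stored words themselves only ever hold zeros and ones, which is why a binary CAM is sufficient.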

3.2 NAPT Tables Filtering Behavior

From the discussion above it has been defined that a NAPT must have an “Endpoint-Independent Mapping” behavior, and since the mapping and filtering behavior are the core functionalities of the NAPT, the work in this section is built upon, and strongly related to, the solution for the mapping behavior presented above. One could see the filtering part as an extension of the mapping part. The actual filtering behavior observed in the field is presented in section 2.1.2. In this section a specification of the filtering behavior is carried out.

The filtering behavior of the NAPT gateway should be configurable by an administrator at startup time to meet any user demands. For the NAPT gateway to be deterministic in its behavior, it is recommended that the filtering behavior is static during runtime. As an alternative to the static filtering mode, a dynamic filtering approach is a possibility, though it spoils the demand of a NAPT having a deterministic behavior. When a NAPT gateway runs in a dynamic filtering mode, each session can have its own filtering rule, determined by an administrator of the network in which the NAPT gateway resides. The static and dynamic approaches are described in the following two sections through design proposals given as examples.

According to [10] and [13] it is “RECOMMENDED” that the filtering behavior is set to “Address and Port-Dependent” filtering if strict filtering is required, and the filtering rule “MAY” be configurable. By making the filtering behavior configurable, the requirements stated in [10] and [13] are met.

3.2.1 NAPT Core with Configurable Static Filtering

At startup time the NAPT, running in a static filtering mode, assigns itself a filtering rule from a control register, which can be written by the CPU. The filtering rule is by default set to no filtering, and in case it has not been changed, no packet filtering occurs. It is essential to understand that statically assigned filtering rules mean the same filtering rule for all sessions.

In view of the fact that the proposed designs generally only differ from each other in details regarding the filtering modes, the point of departure will be the “Address & Port-Dependent” filtering, since it includes every aspect of the design challenges. The presentation of the “Address & Port-Dependent” filtering design will be completed with a look at the differences regarding the other filtering modes.

Using the strictest filtering rule implies that information about the target destination, both address & port, needs to be stored as a counterpart to the session in such a way that only packets from already known destinations are accepted. This constellation can be viewed as a tree structure, where the mapping information, which identifies the original source, is the root and the different destinations act as branches; this tree structure is presented in a general form in Figure 3.6 below.

Figure 3.6: A tree structure which illustrates the relationship between the mapping and filtering table.

The first branching seen from the right equals the mapping table discussed in section 3.1.1 Design Proposal - CAM Based Mapping Table. The number after the # character stands for the address of the session in the CAM module, and the source address & port to the right represent the content of that entry. For each entry in the basic mapping table a new branching appears, which consists of the address of the session together with the address & port of destination hosts on the public network. It is obvious that a session can communicate with a batch of hosts, each demanding an entry in the second branching.

To support the filtering behavior, the CAM technology is still used, as an extension of the first mapping table, CAM-0. This new table will be called the filtering table or CAM-1 from here on. The table contains its own address, the destination IP address and port, a column for the CAM-0 address, and the well-known Valid-bit. This can be seen in the following examples.

Outbound packet handling

An outbound packet can be classified into three classes depending on the status of the internal mapping and filtering tables. The first class is called a new full-session, where the word full indicates that neither the pair of source address & port nor the pair of destination address & port of the outbound packet is known beforehand in either table. This situation occurs every time the pair of source address & port is new to the mapping table; that is, every time a new application on a host residing on the private network initiates a session, or when the session-timer of that session has expired during the communication between two applications residing on each side of the NAPT gateway.

Example: New full-session

Figure 3.7 illustrates a situation where a private application with the TU port number 121 running on the host with address B wants to communicate with the public application with the TU port number 90 and address C. The NAPT gateway is configured to have an “Address and Port-Dependent” filtering behavior. The mapping table (CAM-0) has in this example an address offset of 1024. The mapping table (CAM-0) is initially empty, and a search for a free entry results in the highest numbered address in the table; here it will be address 1028. The packet’s source address & port are saved at the assigned entry, and the Valid-bit is set to indicate that the entry is now occupied. The filtering table (CAM-1) is also initially empty, and a search for a free entry results, as before, in the highest numbered address in the table; here the address will be 4. The packet’s destination address & port are saved at the assigned entry, and the Valid-bit is set to indicate that the entry is now occupied. The source address & port translation process replaces the packet’s source address with the public address of the NAPT gateway and the source port with the unique identifier, the CAM-0 address. It should be evident that successive packets from the same private application get the same unique identifier, which means the same NAPT port number, as required to achieve an “Endpoint-Independent Mapping” behavior.

(a) Setup (b) NAT Tables

Figure 3.7: Illustrates the creation of a new full-session on behalf of an outbound packet, which has no valid entry beforehand in either of the NAT tables CAM-0 and CAM-1.

A second class is called a new sub-session. In this case only the source address & port of the packet are known from earlier packet transfers from the specific application on that host, but the destination address & port are new to the filtering table. In brief, this situation occurs when the concerned application on the private host with the source address & port has transmitted recently (the source part of the mapping is still in the mapping table; no timer expiration for that sub-session), though it was to another destination.

Example: New sub-session

Figure 3.8 illustrates a situation where a private application with the TU port number 121 running on the host with address B wants to communicate with the public application with the TU port number 91 and address C. The NAPT gateway is configured to have an “Address and Port-Dependent” filtering behavior. The mapping table (CAM-0) has in this example an address offset of 1024. Since the mapping table (CAM-0) already contains the concerned session, the unique identifier is reused. The filtering table (CAM-1) does not include the concerned destination for the specific session, therefore a new entry is needed. A search for a free entry results, as before, in the highest numbered available address in the table; here the address will be 3. The packet’s destination address & port are saved at the assigned entry, and the Valid-bit is set to indicate that the entry is now occupied. The source address & port translation process replaces the packet’s source address with the public address of the NAPT gateway and the source port with the unique identifier.

(a) Setup (b) NAT Tables

Figure 3.8: Illustrates the creation of a new sub-session on behalf of an outbound packet, which has a valid mapping beforehand in NAT table CAM-0 but not in CAM-1.

Lastly, the third class covers the situation where both the source address & port and the destination address & port are known in advance from earlier sessions.

Example: Reuse of mapping and filtering entry

Figure 3.9 illustrates a situation where a private application with the TU port number 121 running on the host with address B wants to communicate with the public application with the TU port number 90 and address C. The NAPT gateway is configured to have an “Address and Port-Dependent” filtering behavior. The mapping table (CAM-0) has, in this example, an address offset of 1024. Since the session and destination are already known from earlier packet transmissions, the final header manipulation can take place directly. The source address & port translation process replaces the packet’s source address with the public address of the NAPT gateway and the source port with the unique identifier.
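The three outbound classes can be summarized in a small software model. It is a sketch under stated assumptions rather than the hardware design: the class `NaptTables` and its method names are hypothetical, a free entry is modelled as `None` in place of a cleared Valid-bit, and both tables are given five entries purely so that the addresses agree with the examples (1028 in CAM-0 with offset 1024, then 4 and 3 in CAM-1).

```python
from typing import List, Optional, Tuple

Endpoint = Tuple[str, int]  # (address, TU port)


class NaptTables:
    def __init__(self, cam0_size: int = 5, cam1_size: int = 5, offset: int = 1024):
        self.offset = offset
        self.cam0: List[Optional[Endpoint]] = [None] * cam0_size
        # CAM-1 content: (CAM-0 address of the session, destination endpoint)
        self.cam1: List[Optional[Tuple[int, Endpoint]]] = [None] * cam1_size

    @staticmethod
    def _find(table, content) -> Optional[int]:
        """Highest address holding `content`; content None means a free entry."""
        for address in range(len(table) - 1, -1, -1):
            if table[address] == content:
                return address
        return None

    def outbound(self, src: Endpoint, dst: Endpoint) -> Tuple[str, int]:
        """Classify an outbound packet and return (class, NAPT port)."""
        index = self._find(self.cam0, src)
        known_source = index is not None
        if not known_source:
            index = self._find(self.cam0, None)
            if index is None:
                raise RuntimeError("mapping table full")
            self.cam0[index] = src
        napt_port = self.offset + index
        filter_entry = (napt_port, dst)
        if self._find(self.cam1, filter_entry) is not None:
            return "reuse", napt_port
        free = self._find(self.cam1, None)
        if free is None:
            raise RuntimeError("filtering table full")
        self.cam1[free] = filter_entry
        return ("new sub-session" if known_source else "new full-session"), napt_port


tables = NaptTables()
print(tables.outbound(("B", 121), ("C", 90)))  # first packet of the session
print(tables.outbound(("B", 121), ("C", 91)))  # same source, new destination
print(tables.outbound(("B", 121), ("C", 90)))  # both already known
```

In all three cases the returned NAPT port is the same, which is the property needed for the endpoint-independent mapping behavior.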

(a) Setup (b) NAT Tables

Figure 3.9: Illustrates the situation where an outbound packet traverses the NAT gateway and a full-session has been established beforehand.

Inbound packet handling

As illustrated in Figure 3.10, inbound packets, which arrive on the right side of the tree structure, search for an entry that matches the packet’s source address and source port together with its destination port, which is the address of the session, alias the session’s unique NAPT port number (if the simple port assignment policy is used). If the packet succeeds in finding an entrance, i.e. passes the filter, it can follow the path and get the original source address and port in the same way as with the basic NAPT table without filtering. In the reverse situation, where the packet does not succeed in finding an entrance, the packet will be discarded.

Example:

Figure 3.10 illustrates a situation where a public application with the TU port number 90 running on the host with address C wants to communicate with the private application with the TU port number 121 and address B. The precondition for this to work is that an outbound packet has created a mapping in the NAPT tables before any inbound packet can traverse the NAPT gateway. The NAPT gateway is configured to have an “Address and Port-Dependent” filtering behavior. The mapping table (CAM-0) has in this example an address offset of 1024. The first lookup in the filtering table (CAM-1) results in a match at address 3 and the packet is therefore valid, since the destination was known from an earlier outbound packet flow. The process of manipulating the header data can then occur with data obtained from a lookup in the mapping table (CAM-0). As mentioned before, a packet with no match in the filtering table will be discarded, unless the filtering is disabled, i.e. the gateway is running in an Endpoint-Independent Filtering mode.
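A short sketch of the inbound path with static “Address and Port-Dependent” filtering follows. The function `inbound` and its parameters are hypothetical, free entries are again modelled as `None`, and the table contents are chosen for illustration only rather than copied from the figure.

```python
from typing import List, Optional, Tuple

Endpoint = Tuple[str, int]  # (address, TU port)


def inbound(cam0: List[Optional[Endpoint]],
            cam1: List[Optional[Tuple[int, Endpoint]]],
            offset: int,
            remote: Endpoint,
            napt_port: int,
            filtering_enabled: bool = True) -> Optional[Endpoint]:
    """Return the private endpoint the packet is translated to, or None to discard."""
    # First lookup: the sender together with the session address must be in CAM-1.
    if filtering_enabled and (napt_port, remote) not in cam1:
        return None
    # Second lookup: read CAM-0 at the session address.
    index = napt_port - offset
    if not 0 <= index < len(cam0):
        return None
    return cam0[index]


cam0: List[Optional[Endpoint]] = [None] * 5
cam1: List[Optional[Tuple[int, Endpoint]]] = [None] * 5
cam0[4] = ("B", 121)            # session with NAPT port 1028
cam1[4] = (1028, ("C", 90))     # known destination for that session

print(inbound(cam0, cam1, 1024, ("C", 90), 1028))
print(inbound(cam0, cam1, 1024, ("C", 91), 1028))
```

With `filtering_enabled=False` the first lookup is skipped, which corresponds to the Endpoint-Independent Filtering mode.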

(a) Setup (b) NAT Tables

Figure 3.10: Illustrates the situation where an inbound packet traverses the NAT gateway and a full-session has been established beforehand.

Every other filtering mode

The support of the other filtering behaviors is relatively easy to achieve with the proposed model. Recall that the NAPT gateway, as a part of the startup procedure, reads the filtering control register to obtain its filtering mode. What actually happens is that the mask used for lookup in the second NAPT table (CAM-1) is determined. Running in an “Address and Port-Dependent” filtering mode, the lookup mask consists entirely of ones. The reason is that every bit of the input data is required to match the content, i.e. the destination address & port together with the address of the session and the Valid-bit. By changing the lookup mask, all filtering modes can be supported. To visualize this, every lookup mask is shown beneath. There is no mask for the no filtering mode, since it makes no sense to look up nothing. The ’-’ character in the bit vector has only one purpose: to make the vector easier to read by separating the individual components it covers.

• Address & Port Filtering Mask = 111...111-111...111-111...111-1

• Address Filtering Mask = 111...111-000...000-111...111-1

• Port Filtering Mask = 000...000-111...111-111...111-1
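The three masks can be generated from the field widths, as the following sketch shows. The field order (address, port, session address, Valid-bit) is read off the vectors above; the widths of 32 bits for an IPv4 address and 16 bits for the session address are assumptions made for the example, and the function names are hypothetical.

```python
ADDR_BITS = 32      # assumed: IPv4 destination address
PORT_BITS = 16      # TU port number
SESSION_BITS = 16   # assumed width of the CAM-0 address column
VALID_BITS = 1


def ones(width: int) -> int:
    return (1 << width) - 1


def build_mask(match_addr: bool, match_port: bool) -> int:
    """Concatenate the per-field masks; session address and Valid-bit always count."""
    mask = ones(ADDR_BITS) if match_addr else 0
    mask = (mask << PORT_BITS) | (ones(PORT_BITS) if match_port else 0)
    mask = (mask << SESSION_BITS) | ones(SESSION_BITS)
    mask = (mask << VALID_BITS) | ones(VALID_BITS)
    return mask


def show(mask: int) -> str:
    """Render the mask with '-' between the individual fields."""
    total = ADDR_BITS + PORT_BITS + SESSION_BITS + VALID_BITS
    bits = format(mask, "0{}b".format(total))
    a, p, s = ADDR_BITS, ADDR_BITS + PORT_BITS, ADDR_BITS + PORT_BITS + SESSION_BITS
    return "-".join([bits[:a], bits[a:p], bits[p:s], bits[s:]])


print(show(build_mask(True, True)))    # Address & Port filtering
print(show(build_mask(True, False)))   # Address filtering
print(show(build_mask(False, True)))   # Port filtering
```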

3.2.2 NAT Core with Configurable Dynamic Filtering

The dynamic approach is very similar to the static one in operation, except that the filtering rule can be set individually for every session at creation. At the administrator level a list of rules acts as a reference in such a way that the NAPT core can examine the list at every creation of a new session and assign the desired rule to that specific session. This interaction between the NAPT core and the CPU, which handles the user interaction, will of course reduce the speed of the creation process, but as mentioned earlier it is a feature of the NAPT gateway and it can be disabled if needed. For the NAPT core to support the feature, independent of the CPU, for subsequent packet flows belonging to the session, a column with a size of two bits is added to the mapping table. These two bits can represent the four different filtering modes available. In the list below the bit pattern is shown for each of the filtering modes.

• 00 - No filtering

• 01 - Port-dependent filtering mode

• 10 - Address-dependent filtering mode

• 11 - Address & Port dependent filtering mode

The bit pattern is used to determine which filtering lookup mask to use when a lookup in CAM-1 is performed. As a consequence, the procedure for an inbound packet is reversed: first a lookup in CAM-0 to get the filtering mode together with the original source address & port, followed by a lookup in CAM-1 to determine whether the packet comes from a known destination and is therefore acceptable.

In Figure 3.11 beneath, a modified version of the mapping table (CAM-0) with an extra column added for the new filtering mode can be seen.

Figure 3.11: The modified mapping table (CAM-0) to support the dynamic assignment of filtering rules.
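The reversed inbound procedure for dynamic filtering can be sketched as follows. The two-bit encoding is taken from the list above; everything else is a modelling assumption: CAM-0 is represented as a dictionary keyed by NAPT port, CAM-1 as a list of tuples, and the names `passes_filter` and `inbound` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

NO_FILTERING = 0b00
PORT_DEPENDENT = 0b01
ADDRESS_DEPENDENT = 0b10
ADDRESS_AND_PORT_DEPENDENT = 0b11

# CAM-0 model: NAPT port -> (filtering mode, private address, private port)
Cam0 = Dict[int, Tuple[int, str, int]]
# CAM-1 model: (NAPT port of the session, remote address, remote port)
Cam1 = List[Tuple[int, str, int]]


def passes_filter(mode: int, cam1: Cam1, session: int,
                  remote_addr: str, remote_port: int) -> bool:
    """Compare only the fields selected by the session's two mode bits."""
    if mode == NO_FILTERING:
        return True
    check_addr = bool(mode & 0b10)
    check_port = bool(mode & 0b01)
    for entry_session, addr, port in cam1:
        if entry_session != session:
            continue
        if check_addr and addr != remote_addr:
            continue
        if check_port and port != remote_port:
            continue
        return True
    return False


def inbound(cam0: Cam0, cam1: Cam1, napt_port: int,
            remote_addr: str, remote_port: int) -> Optional[Tuple[str, int]]:
    entry = cam0.get(napt_port)          # first lookup: CAM-0
    if entry is None:
        return None
    mode, private_addr, private_port = entry
    if not passes_filter(mode, cam1, napt_port, remote_addr, remote_port):
        return None                      # second lookup (CAM-1) failed
    return private_addr, private_port


cam0 = {1028: (ADDRESS_DEPENDENT, "B", 121), 1027: (NO_FILTERING, "A", 80)}
cam1 = [(1028, "C", 90)]
print(inbound(cam0, cam1, 1028, "C", 91))  # address matches, port ignored
print(inbound(cam0, cam1, 1028, "D", 90))  # unknown address
```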

3.2.3 Discussion/Summary

The proposed solution for managing the filtering behavior by means of CAM technology is suitable where high performance is needed, since a lookup is resolved in only one clock cycle. In applications where support for a large number of simultaneous sessions has high priority, the solution could come under pressure, and a strict tear-down policy may therefore be necessary. If further empirical measurements or others’ work show that the capacity of the filtering table should be larger than what can be justified with the use of CAM technology, another lookup approach has to be considered. Because of possible data redundancy in the filtering table, further work to minimize it would be beneficial; a solution would in all likelihood compromise the high performance obtained with the proposed solution, but it could be worth looking at if it turns out that the size of the filtering table matters.

To clarify the mapping and filtering behavior, a step-by-step example is provided in Appendix B. In the example, the mapping behavior of the NAPT gateway is endpoint-independent and its filtering behavior is address and port-dependent.

3.3 Port Assignment Behavior

While designing the behavior of the NAPT core tables in Section 3.1, a simple approach for selecting the unique NAPT source port number was chosen in order to keep the focus on the main subject of that section. Although this approach may function properly for most applications, a further study is needed to confirm the solution, or at least to make the port assignment of the NAPT application friendly. This section starts out by looking briefly at the concept of port numbers, how they are used and managed by applications and operating systems, and their interaction with Transport Layer protocols such as TCP and UDP, however from the perspective of a NAPT gateway. Next, two design proposals are given, which in the end lead to a discussion of which port number assignment solution to use further in this work to make the NAPT gateway as application friendly as possible.

What are application port numbers and what are they used for

In computer networking, port numbers are used by the kernel to distinguish between and to identify applications holding one or more connections via the Transport Layer protocol to endpoints on a network. An application typically gets a unique randomly chosen port number (unless the application belongs to either the “well-known” or “registered” range of port numbers; more about ranges later in this section) from the operating system. When establishing a connection to the network, a binding is thereby created between the application and the port number. Note that applications may bind to multiple ports. When e.g. an Ethernet packet arrives at the NIC of a host and carries a Transport Layer packet, which specifies a source and destination port in its header, the operating system is able to forward it to the right application, if it exists. The actual port number is a 16-bit unsigned integer ranging from 0 to 65535 and is administrated by IANA, which has further subdivided the entire range into three. These ranges are “well-known” from 0 to 1023, “registered” from 1024 to 49151 and “dynamic/private” from 49152 through 65535. The majority of port numbers from the “well-known” and the “registered” range are reserved for specific uses. An example of the use of ports is the Internet mail system (e-mail). A server used for sending and receiving e-mails provides both a Simple Mail Transfer Protocol (SMTP) service (for sending) and a Post Office Protocol 3 (POP3) service (for receiving). These are handled by different server processes, and the port number is used to determine which data is associated with which process. By convention, the SMTP server listens on port 25, while POP3 listens on port 110. The IANA is responsible for maintaining the official assignments of port numbers. IANA does not enforce adherence to these assignments; they are simply a set of recommended uses. Sometimes applications use port numbers for different purposes than the official assignments suggest.
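The three ranges can be expressed as a small classification function. The boundaries are those given in the text; the function name `port_range` is hypothetical.

```python
def port_range(port: int) -> str:
    """Classify a 16-bit port number into one of the three IANA ranges."""
    if not 0 <= port <= 65535:
        raise ValueError("port number must fit in 16 bits")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic/private"


for port in (25, 110, 1024, 49152):
    print(port, port_range(port))
```

Both mail ports from the example, 25 and 110, fall into the “well-known” range.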

Why bother about port number assignments

An intuitively chosen strategy for the NAPT gateway’s port assignment behavior would be to translate the port number with a one-to-one match on each side of the NAPT gateway, meaning that the NAPT gateway preserves the port number, such that a packet with port number X on the private network would get the same port number X on the public network after the NAPT translation. Unfortunately this approach could lead to port overloading in situations where more than one application simultaneously requests the same port number. The probability of a port assignment race could be reduced if the NAPT gateway has a pool of external IP addresses, though this solution gives just a brief respite, since new problems would occur the moment the pool is drained. To avoid port overloading, the NAPT must either abandon the request from the requesting application or switch to a non-deterministic behavior, e.g. choose to violate the one-to-one rule. Since a fully transparent network address and port translation approach isn’t possible without breaking the deterministic behavior, even if we introduce one or two extra external IP addresses (one or two ports are probably a realistic scenario in SOHO routers), another approach is needed. So far a simple approach has been used for demonstration purposes. This approach picks an arbitrary port number based on the address of the allocated entry in the mapping table. While this approach works well for most applications, it may violate conditions determined by other applications. The objective is therefore to create a port assignment behavior which takes special needs into consideration to support as many applications as possible.

For most protocols the port numbers in the “well-known” and “registered” ranges are destination ports and not source ports. This statement can be exemplified with an everyday example, where a client (end-user) initiates a request to a server (web site) using the Hypertext Transfer Protocol (HTTP) request/response standard. In between the client and the server, a NAPT gateway is placed, having the client on the private network and the server on the public network.

The client establishes a TCP connection to a particular port on a host (port 80 by default). An HTTP server listening on that port waits for the client to send a request message. Upon receiving the request the server sends back a status line, such as “HTTP/1.1 200 OK”, and a message of its own, the body of which is perhaps the requested resource, an error message or some other information. This example illustrates that HTTP uses the “well-known” port number 80 as a destination port and has no restrictions on the choice of the source port number.

There are, however, applications/protocols which have restrictions on the source port number, as for instance the Network File System (NFS), which expects the TCP/UDP port number to be in the “well-known” range if it performs port monitoring. The idea behind port monitoring is that binding to ports in the “well-known” range is a privileged operation on the client, and so the client is enforcing file access permissions on its end. Unfortunately, in most cases the use of privileged ports and port monitoring for security is at best an inconvenience to the attacker and should not be depended on. To accommodate NFS connections where the NFS server performs port monitoring, the NAPT gateway ought to keep the client’s source port number within the “well-known” port range. This is not necessarily a strict one-to-one preservation, since the port monitoring only demands the source port number to be within the “well-known” port number range.

Another example of an application/protocol having expectations on the source port is the Real-time Transport Protocol (RTP). RTP is usually used in conjunction with the Real-time Transport Control Protocol (RTCP). While RTP carries the media streams, e.g. audio and video, RTCP is used to monitor transmission statistics and QoS information. When used in conjunction, RTP is usually originated and received on even port numbers, whereas RTCP uses the next higher odd port number. To support the described scenario, the NAPT gateway needs to accomplish port number parity and contiguity preservation.

Port number parity preservation implies, from a NAPT gateway point of view, that it respects the parity of the source port number, i.e. an even private port number is mapped to an even public port number. This behavior respects the rule that RTP uses even ports and RTCP uses odd ports. It is important to note that the RTP protocol allows any port number to be used for RTP and RTCP if the two port numbers are specified separately, for example using [16]. Implementing port parity preservation would make it possible for applications that do not support explicit specification of both RTP and RTCP port numbers to communicate properly.

Port number contiguity preservation is required to meet the contiguity rule, where RTCP port number = RTP port number + 1. Sequential port number assignment could be a possible solution to the contiguity preservation problem, though it assumes that the application will open a mapping for RTP first and then for RTCP. It is not always practical to enforce this requirement in all cases. Imagine a situation where the NAPT port numbers are assigned in a linear manner, but combined with port preservation for certain applications. Let us say that the pointer for the next available port number in the linear port assignment approach has reached port number 25, but the next port number, port number 26, has already been assigned to an application which enforces port preservation. In this case the linear approach must be clever enough to handle this situation. One could argue that a clever monitoring of RTP packet flows could resolve the problem, but even if this holds, it would certainly break the deterministic behavior of the NAPT gateway.
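The two preservation rules can be written as simple predicates, shown in the sketch below together with a search for a free even/odd pair. The pair search is only one conceivable way to honor both rules at once and is not a method proposed in this work; all function names and the set `taken` are hypothetical.

```python
from typing import Optional, Set, Tuple


def preserves_parity(private_port: int, public_port: int) -> bool:
    """Even maps to even and odd maps to odd."""
    return private_port % 2 == public_port % 2


def is_contiguous_pair(rtp_port: int, rtcp_port: int) -> bool:
    """RTP on an even port, RTCP on the next higher odd port."""
    return rtp_port % 2 == 0 and rtcp_port == rtp_port + 1


def find_rtp_pair(taken: Set[int], low: int = 1024,
                  high: int = 65535) -> Optional[Tuple[int, int]]:
    """Lowest even port p in [low, high) for which p and p + 1 are both free."""
    start = low + (low % 2)          # round up to the next even port
    for port in range(start, high, 2):
        if port not in taken and port + 1 not in taken:
            return port, port + 1
    return None


# Ports 25 and 26 are occupied, e.g. by applications that required port preservation.
taken = {25, 26}
print(find_rtp_pair(taken, low=24, high=40))
```

In this constellation the pairs (24, 25) and (26, 27) are both unusable, so the search has to skip ahead, which illustrates why a purely sequential pointer is not sufficient.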

Best Current Practices

According to [10] and [13] it is “RECOMMENDED” that the NAT/NAPT gateway preserves the “well-known” port range, i.e. if a session initiating packet has a source port number in the range 0-1023 on the private network, the NAPT port number on the public network should be in the same range. The same applies the other way around for the remaining port range 1024-65535, which should also be preserved. Lastly, but most importantly, it is stated that the NAPT gateway “MUST NOT” have a port assignment behavior of port overloading.

In the following, two different solutions will be proposed. The first is a hardware (NAPT) based approach and the second is a software (CPU) based approach. The hardware approach will describe how the NAPT core handles different range preservations and port number preservation, including some examples. The software approach will describe how the CPU takes responsibility for the port number assignment instead of the NAPT core and the changes needed for this. It will also include examples.

Lastly a discussion will take place and one of the solutions will be chosen, which will be used in the subsequent analysis and design.

3.3.1 Hardware port number assignment

In this section a hardware solution providing port number and range preservation with a full port range is presented. Further down an alternative solution is given, which supports port range preservation with a reduced port range (static boundary).

Port number and range preservation with a full port range

The basic idea in the simple port assignment behavior for a session, presented briefly in section 3.1 for demonstration purposes, was to directly attach the unique NAPT port number to the address of the session’s entry in the mapping table (CAM-0). Since CAM units usually are proprietary units, for which no de facto standard exists, they can vary greatly, but regarding the layout of the address space they are typically comparable with general purpose storage devices; at least that is the assumption made here. With the knowledge that for each public IP address the NAPT gateway may have up to a total of 65536 concurrent sessions at its disposal (recall that the port number is a 16-bit unsigned integer, which gives 65536 possible combinations), a straightforward solution is to cover the full port range with a large CAM table, as illustrated in Figure 3.12.

Figure 3.12: Full port range solution.

Every host on the private network is capable of using any port number from 0 to 65535, and since this range, as mentioned above, can only be found once, true port preservation isn’t possible without introducing port overloading. But since many connections are short-lived and of a random nature, semi port preservation can be implemented by the NAPT. As a part of the initialization procedure for a new session, the NAPT tries to get a CAM entry whose address matches the packet’s source port number. If this is not possible, the NAPT will find another entry where only the range is alike. Of course there could be situations where neither requirement can be met, and in such a situation the NAPT will simply discard the packet or send an error message.

So far a Valid-bit has been used to distinguish between used and unused entries in the mapping table, but to support the port range preservation, an extension to the Valid-bit is needed. It is no longer enough just to indicate a taken/untaken entry. A way to differentiate between the “well-known” and the combined “registered” and “dynamic/private” port range is required. To make this possible, the Range-bit column is introduced to the mapping table (CAM-0). The Range-bit approach works like a virtual cut in the mapping table between address 1023 and 1024. The full range 0-65535 is then cut into two sub-ranges. Every entry from address 0-1023 has the same value in the Range-bit column, and the rest of the entries, 1024-65535, share the opposite value. Applying this single bit column to the table, the NAPT is capable of searching for a free entry in either sub-range. A section centered about the dividing line is shown in Figure 3.13 to visualize the extension added to the mapping table regarding this approach. To handle the semi port preservation no table extension is needed, only an extra lookup in case the present port is taken.

Figure 3.13: A section of the modified mapping table.

As a recapitulation two examples are provided: one illustrating the procedure when the initialization of a session succeeds in preserving the port number, and another where the port number is taken and only the port range is preserved. In the following examples it is assumed that action has been taken beforehand to ensure that the session does not exist and that the creation of a full session is needed. Furthermore, the creation of an entry in the filtering table etc. is omitted for simplicity.

Example: Port number and range preserved

An application with port number 1 on the private host A initiates a new session in the mapping table (CAM-0). The entry with the address matching the packet's source port number is read, and the Valid-bit is evaluated to confirm or deny whether it is taken. Since the entry is free, both port number and port range preservation can be met. The steps relevant for this extension are listed below, and the case described is illustrated in Figure 3.14. Note that there is a one-to-one match between the source TU port and the address of the entry for the created session (marked in blue).

Creation of a full session where the entry with the address matching the source port number of the packet is free:

• Read the content of the entry with the address matching the source port number

• Set-up new binding

• Perform address & port translation etc.

Figure 3.14: Port number and range preserved.

Example: Port range preserved

This example is a direct extension of the previous example. Another application with the same port number as before, but running on a different host (B), initiates a new session in the mapping table. As before, the entry with the address matching the source port number is read, but this time it is already taken and therefore the port number cannot be preserved. A lookup is then performed for a free entry where the Valid-bit is zero, indicating "not taken", and the Range-bit is also zero, since the application's port number belongs to the lower port range. Since there are multiple matches, the returned address points to the matching entry with the highest address, in this case 1023, as shown in Figure 3.15. The procedure for this type of scenario is listed below.

Creation of a full session where the entry with the address matching the source port number of the packet is taken:

• Read the content of the entry with the address matching the source port number (occupied)


• Determine whether to search in the lower or upper port range by using a compare unit.

• Lookup for a free entry in CAM-0 via the Valid-bit and Range-bit

• Set-up new binding

• Perform address & port translation etc.

Figure 3.15: Port range preserved.
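The two procedures above can be summarized in a small software model. This is only a sketch under stated assumptions: the names `allocate_entry`, `range_bit` and `RANGE_SPLIT` are hypothetical, the Range-bit is derived from the entry address instead of being stored, and the loop stands in for what the CAM would do in a single lookup. The choice of the highest free address follows the example above.

```python
RANGE_SPLIT = 1024          # entries 0-1023 form the lower range
TABLE_SIZE = 65536          # one entry per possible port number

def range_bit(port):
    """Return 0 for the lower port range and 1 for the upper port range."""
    return 0 if port < RANGE_SPLIT else 1

def allocate_entry(valid, src_port):
    """Return the entry address (= NAPT port) for a new session, or None.

    valid[i] models the Valid-bit of entry i (assumed layout).
    """
    # Step 1: read the entry whose address equals the source port.
    if not valid[src_port]:
        valid[src_port] = True
        return src_port                      # port number and range preserved
    # Step 2: search for a free entry (Valid-bit = 0) in the same range.
    if range_bit(src_port) == 0:
        lo, hi = 0, RANGE_SPLIT
    else:
        lo, hi = RANGE_SPLIT, TABLE_SIZE
    for addr in range(hi - 1, lo - 1, -1):   # highest matching address first
        if not valid[addr]:
            valid[addr] = True
            return addr                      # only the port range preserved
    return None                              # range exhausted: discard packet

valid = [False] * TABLE_SIZE
first = allocate_entry(valid, 1)    # host A, port 1: entry 1 is free
second = allocate_entry(valid, 1)   # host B, same port: entry 1 is taken
```

With an initially empty table, the first call preserves port number 1 and the second call falls back to the highest free entry of the lower range.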

This approach is self-defeating with regard to port preservation, as the likelihood of getting a preserved port number gets smaller as the table fills up, especially in the lower range. But as stated in the beginning of the section this is not actually a problem, as the source port number typically only serves for return purposes. To increase the likelihood of preserving the port number, one could skip the range preservation of this approach and dedicate the "dynamic/private" port range as a second-choice range. This range should be identified with the Range-bit just as described above. The downside of not preserving the port range can be solved by introducing an ALG for every problematic application. The ALG will monitor the packet flow and take special care of those applications. Obviously this solution is most viable if the number of problematic applications is small, as ALGs are typically implemented in software and are therefore slow compared to a hardware solution. As mentioned in the delimitation, ALGs will not be treated further in this work.

Port range preservation with a reduced port range (static boundary)

While it is pleasant to have the full range of port numbers available in the mapping table, it is not always wise, since perhaps only a small percentage of the entries are in use at any time. Imagine a situation analogous to the one above, but with a reduced port range in that the CAM table consists of only 1500 entries. Having only entry addresses in the range 0-1499, port preservation is definitely not possible using a direct one-to-one match between the mapping table's address and the NAPT port number. Port range preservation is still possible, even though the address range of the mapping table is a lot smaller than the possible port number range. In the scenario with 1500 entries, only 1500 - 1024 = 476 entries are available for the upper port range, while the entire lower range is covered. Since the compressed upper range is likely to be exhausted faster than the lower range, it could be beneficial to move the boundary between the upper and lower range, permitting the upper range to use entries from the lower range. This borrowing approach can only succeed if the borrowed entries appear to come from the upper range; otherwise the port range would not be preserved. The technique is quite simple. A suitable range of entries is virtually cut out and an offset is added in such a way that the entries appear to be placed on top of the actual mapping table. This technique is illustrated in Figure 3.16. Note that the Range-bit for every virtual entry is set, such that the entries can be recognized as belonging to the upper range. The illustration shows only one of many possible scenarios, as the mapping table size can vary and the same goes for the virtual range.

Figure 3.16: Creating a virtual range to expand the upper port range.

Outbound packet handling

An outbound packet which does not relate to any session already established in the mapping table, but satisfies the conditions for a session creation (these conditions vary according to the actual transport protocol), is a candidate for a session creation. The first step in creating a new session using this approach is to determine whether it belongs to the upper or the lower range. If the source port number of the packet is less than 1024, a search for an entry with both the Range-bit and the Valid-bit cleared is carried out. Recall that the influence of the different bit columns can be controlled by the mask applied to lookups in the CAM tables. To ignore all bits other than the Range-bit and the Valid-bit, the lookup uses a mask equal to the bit-vector 000...000-000...000-1-1. The outcome is either a match, which indicates that there is an empty entry, or a miss. In case of a match the session is created and the session's NAPT port number gets the same value as the address of the entry. In case of a miss the packet is simply discarded (or an error message is sent). If the source port number of the outbound packet belongs to the upper range, a free entry from the upper range in the mapping table is needed. A lookup as before is carried out, however with the Range-bit set instead of cleared; the same mask applies. If the result of the lookup is a match, the new session is created, but if the entry address is in the virtual range, an offset needs to be added before the actual NAPT translation is performed. Based on the illustration in Figure 3.16, an offset of 1000 is necessary if the address from the lookup is between 500 and 1023. Again, if the result of the lookup is a miss, the only thing to do is to discard the packet or send an error message.
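The masked lookup and the offset handling can be illustrated with a reduced software model. The sketch assumes the numbers of the scenario in Figure 3.16 (1500 entries, entries 500-1023 borrowed by the upper range, offset 1000). Each entry is reduced to the two bits that the mask leaves visible, and the names `make_table`, `cam_lookup` and `assign_napt_port` are hypothetical. That the highest matching address wins is an assumption taken from the earlier example.

```python
TABLE_SIZE = 1500
VIRTUAL = range(500, 1024)   # entries borrowed by the upper range
OFFSET = 1000                # makes entries 500-1023 appear as 1500-2023

def make_table():
    """Each entry is a 2-bit word: bit 1 = Range-bit, bit 0 = Valid-bit."""
    table = []
    for addr in range(TABLE_SIZE):
        upper = addr >= 1024 or addr in VIRTUAL
        table.append((1 if upper else 0) << 1)   # Valid-bit starts cleared
    return table

def cam_lookup(table, key, mask):
    """Return the highest address whose masked content equals the masked key."""
    for addr in range(len(table) - 1, -1, -1):
        if (table[addr] & mask) == (key & mask):
            return addr
    return None

def assign_napt_port(table, src_port):
    """Find a free entry in the range of src_port and return its NAPT port."""
    want_range = 0 if src_port < 1024 else 1
    # Key: requested Range-bit, Valid-bit cleared. Mask: compare both bits.
    addr = cam_lookup(table, key=want_range << 1, mask=0b11)
    if addr is None:
        return None                       # miss: discard or send an error
    table[addr] |= 1                      # set the Valid-bit
    return addr + OFFSET if addr in VIRTUAL else addr
```

In a real CAM word the remaining columns would be masked out by the leading zeros of the bit-vector; they are simply left out of this model.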

Inbound packet handling

The procedure for handling inbound packets is very much like the one presented in Section 3.1.1, with the exception of inbound packets with a NAPT port number in the virtual range 1500-2023. These packets need some special attention before the procedure already known applies. The preprocessing of an inbound packet in the virtual range is simple: subtract the offset before further processing. Thereafter a simple read operation of the entry with the address equal to the result of the subtraction is performed, and if the Valid-bit in the entry is set, the reverse address translation can be accomplished.
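The inbound preprocessing amounts to a small address calculation, sketched below for the scenario of Figure 3.16. The function names are hypothetical. The rejection of destination ports 500-1023 is an assumption of this sketch: those physical entries are lent to the upper range and therefore cannot hold a session with the same port number.

```python
TABLE_SIZE = 1500
BORROWED = range(500, 1024)         # physical entries lent to the upper range
VIRTUAL_PORTS = range(1500, 2024)   # NAPT ports that map onto those entries
OFFSET = 1000

def inbound_entry_address(napt_port):
    """Map the destination (NAPT) port of an inbound packet to an entry address."""
    if napt_port in VIRTUAL_PORTS:
        return napt_port - OFFSET        # back to the physical entry 500-1023
    if 0 <= napt_port < TABLE_SIZE and napt_port not in BORROWED:
        return napt_port                 # direct one-to-one match
    return None                          # no entry can exist for this port

def handle_inbound(valid, napt_port):
    """Return the entry to read for reverse translation, or None to discard."""
    addr = inbound_entry_address(napt_port)
    if addr is None or not valid[addr]:
        return None                      # unknown session
    return addr
```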

3.3.2 CPU port number assignment

So far the unique NAPT port number has more or less been tied to the address of the session's entry in the table. One major disadvantage of the table-address-based port selection approach is the lack of flexibility once the final approach has been chosen and implemented. Imagine a scenario where the above-mentioned solution (port range preservation with a reduced port range (static boundary)) has been chosen and implemented in the released product, and a client later calls the sales office of the vendor with a request for a different port selection strategy. With a hardware-based port selection approach, the common and widespread procedure of releasing a simple firmware update does not help. It has been shown that some applications have specific demands on the NAPT gateway's port selection strategy in order to work properly. To minimize the risk of losing some markets, a flexible solution for the port selection approach is presented in the following.

The flexible solution is based on the idea that a co-existing CPU assigns the unique NAPT port number independently of the address of the entry. For this to work, the assigned NAPT port number has to be included in each entry in the mapping table (CAM-0), and the lookup procedure presented in 3.1.1 is slightly changed to cope with the new approach. The solution implies that the CPU is involved in every connection establishment and termination phase of a session, in order to manage the reservation and release of NAPT port numbers. Besides providing a flexible port selection procedure, where a firmware update is possible, a powerful firewall capability comes with the solution. As every new session establishment is guaranteed to be handled partly by the CPU, a software-based firewall could be added, such that an administrator can set up any security policy for possible combinations of the tuple consisting of a session's source IP address, destination IP address, source port, destination port and protocol.

The internal communication between the CPU and the NAPT core relies on an exchange of messages in the shape of codes; interrupt codes from the NAPT core to the CPU and return codes from the CPU to the NAPT core. An incomplete list of codes can be seen in appendix G. Those codes are the result of the protocol walk-through in section 3.4 and 3.5, where the codes are used in connection with the presented examples.

To be able to manage the responsibility of assigning port numbers, performing firewall checks etc., the CPU is equipped with a data structure, a so-called CPU session table, whose content can be seen in Figure 3.17. The CPU session table includes a fingerprint of the session together with information about the current mode and state, which is used explicitly by, but not restricted to, TCP connections. While most of the fields are self-explanatory, the mode and state fields need further introduction. The mode field can be used by administrators to clarify the current mode of a TCP session, i.e. establishment, data transmission or termination mode. The state field is actually a ring buffer containing the history of a session four packets back in time. This feature is necessary, as will be seen later, to control the establishment and termination phases of TCP connections.

Figure 3.17: CPU session table
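As an illustration of how such a session table entry could be held in software, a minimal sketch follows. The field names are assumptions, since the exact layout of Figure 3.17 is not reproduced here; only the fingerprint tuple, the mode field, the four-packet state history and the valid bit are taken from the text.

```python
from collections import deque
from dataclasses import dataclass, field

@dataclass
class CpuSession:
    # Fingerprint of the session (assumed field names)
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str
    napt_port: int
    mode: str = ""           # e.g. "establishment"; left unused for UDP
    # Ring buffer holding the history of the last four packets
    state: deque = field(default_factory=lambda: deque(maxlen=4))
    valid: bool = True

    def record(self, packet_kind):
        """Remember the latest packet; the oldest of four drops out."""
        self.state.append(packet_kind)

session = CpuSession("10.0.0.2", 121, "198.51.100.7", 90, "TCP", 121)
for kind in ["SYN", "SYN-ACK", "ACK", "DATA", "FIN"]:
    session.record(kind)
```

After five recorded packets only the most recent four remain in `session.state`, which is the ring-buffer behavior described above.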

Below is a guided description of the methodical adjustments carried out to support the flexible port assignment model, which includes the CPU and the physical changes made to the mapping table. The handling of both outbound and inbound packets is treated in the following, starting with the outbound handling. The description and examples in this section are a direct extension of, and strongly related to, the description given in section 3.1.1, where the basic mapping behavior was given.

Outbound packet handling

Outbound packets traversing the NAPT either belong to a session already established in the mapping table or belong to a session that is completely unknown and needs to be created. The two different procedures for handling these situations are listed below.

Outbound session unknown to the mapping table:

• Lookup in CAM to get the NAPT port number of the session (no match)

• Lookup for a free entry in CAM via the Valid-bit

• The CPU assigns a unique NAPT port number and performs the address & port translation etc.

• Set-up new mapping

Outbound session known to the mapping table:

• Lookup in CAM to get the NAPT port no. of the session

• Read the content of the entry which contains the unique NAPT port number, besides the original source IP address and port number.

• Perform address & port translation etc.

The first case treated is the one where the outbound packet does not belong to an already created session. It is assumed that the mapping table is initially empty and, for simplicity, the filtering procedure is omitted.

Example: Unknown session

A UDP packet from an application with the port number 121 running on host B on the private network is received by the NAPT gateway. A lookup including all columns, except for the NAPT port column, is performed. As the packet does not belong to any created session, the lookup results in a miss, as shown in the first table in Figure 3.18.

Figure 3.18: Outbound packet procedure where the session is unknown beforehand

On the basis of the miss, a lookup for a free entry is performed (second table), and since the table is empty, a free entry is found at address 4. Next, the NAPT core interrupts the CPU and all necessary steps are carried out, e.g. firewall check, port number assignment, header manipulation, checksum recalculation and creation of a fingerprint for the session in the CPU session table. When the CPU is done, it hands over the responsibility for the packet to the NAPT core, which then creates the session in the NAPT tables (third table) if the packet passes through the firewall; otherwise the packet is discarded. Lastly the packet is forwarded to the right interface.

Example: Known session

A UDP packet which belongs to a session already known in the mapping table is handled entirely in hardware. To determine this, a lookup in the mapping table is performed to see if the combination of the packet's source IP address and port number exists in the mapping table. In this case, where the session concerned exists beforehand, the lookup in the mapping table results in a hit, as illustrated by the table to the left in Figure 3.19. To obtain the unique NAPT port number, the content of the entry with the address returned by the lookup is read. The read operation is illustrated by the table to the right in Figure 3.19. Finally a filtering check, a recalculation of the checksum and the header manipulation are carried out before the packet is forwarded to the right interface.

Figure 3.19: Outbound packet procedure where the session is known beforehand in the mapping table
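The two outbound cases can be modelled in a few lines of software. This is a sketch only: `MappingTable` is a hypothetical stand-in for CAM-0 with the added NAPT port column, `cpu_assign_port` represents the interrupt to the CPU, the filtering table is left out as in the examples, and the choice among several free entries is arbitrary here, whereas the hardware has its own priority rule.

```python
class MappingTable:
    """Software stand-in for CAM-0 extended with a NAPT port column."""

    def __init__(self, size):
        self.entries = [None] * size      # None models a cleared Valid-bit

    def lookup(self, src_ip, src_port):
        """Match on source IP and port; the NAPT port column is masked out."""
        for addr, entry in enumerate(self.entries):
            if entry is not None and entry[:2] == (src_ip, src_port):
                return addr
        return None

    def free_entry(self):
        """Return the address of some entry with a cleared Valid-bit."""
        for addr, entry in enumerate(self.entries):
            if entry is None:
                return addr
        return None

def outbound(table, src_ip, src_port, cpu_assign_port):
    """Return the NAPT port for an outbound packet, or None if the table is full."""
    addr = table.lookup(src_ip, src_port)
    if addr is None:                               # unknown session
        addr = table.free_entry()
        if addr is None:
            return None
        napt_port = cpu_assign_port(src_port)      # interrupt to the CPU
        table.entries[addr] = (src_ip, src_port, napt_port)
    return table.entries[addr][2]                  # read the NAPT port column
```

For a known session the CPU callback is never invoked, which mirrors the statement that such packets are handled entirely in hardware.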

Inbound packet handling

Inbound session known to the mapping table:

• Lookup in CAM with the destination port number of the packet to find the entry of the session.

• Read the content of the entry to obtain the original source IP address and port number of the session.

• Perform address & port translation etc.

Inbound session unknown to the mapping table:

• Lookup in CAM with the destination port number of the packet to find the entry of the session (no match).

• Silently discard the packet.

Example: Known session


A lookup in the third column with the received inbound packet's destination port number determines whether the packet belongs to an already created session or not. If the outcome of the lookup is a hit, the session exists and a subsequent read is necessary to obtain the original source IP address and port number of the host on the private network. The details of the lookup and read operation can be seen in Figure 3.20. The filtering check, recalculation of the checksum and header manipulation are carried out before the packet is forwarded.

Figure 3.20: Inbound packet procedure where the session is known beforehand in the mapping table

In the case where the session is unknown to the mapping table, i.e. the lookup results in a miss, the packet is simply silently discarded.
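The inbound procedure, including the silent discard, can be sketched in the same reduced style. The entry layout `(src_ip, src_port, napt_port)` and the function name `inbound` are assumptions made for illustration; the filtering check is left out.

```python
def inbound(entries, dst_port):
    """Return the private (IP, port) for an inbound packet, or None to discard.

    entries: list of (src_ip, src_port, napt_port) tuples, or None for
    entries with a cleared Valid-bit.
    """
    for entry in entries:
        if entry is not None and entry[2] == dst_port:   # lookup on NAPT port
            src_ip, src_port, _ = entry                   # read of the entry
            return src_ip, src_port                       # new destination
    return None                                           # miss: silent discard

entries = [None, ("10.0.0.3", 121, 40000)]
```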

3.3.3 Discussion/Summary

In the light of the information given in this section, it is obvious that no optimal solution exists. The hardware-based approach might be a straightforward and fast solution if the port assignment algorithm used is simple, i.e. no parity and contiguity preservation or any other sophisticated features are applied. A hardware solution implementing those "sophisticated" preservation approaches would at best lead to a rather complex device. The most significant problem with the hardware approach in general is its static behavior, which entails an overall inflexibility in the NAPT port number assignment algorithm when support for new applications is needed.

The proposed solution, where the responsibility of assigning unique NAPT port numbers is handled by a CPU, solves the inflexibility of the hardware solution, though it introduces an interrupt every time a new session needs to be created. Besides slowing down the creation process with the introduction of a software routine, it requires an extension of the mapping table, such that the CPU-assigned NAPT port number can be included.

The firewall concept was introduced in the presentation of the CPU-based approach, and since a firewall has more or less become a de facto standard in commercial NAT/NAPT gateways, it carries great weight in the final comparison of the two solutions.

Both solutions are viable, each having its advantages and disadvantages, and a vendor has to compromise when making the choice. In the rest of this work the CPU-based approach is applied, despite the fact that it increases the size of the mapping table and that it introduces a relatively large interrupt overhead every time a session is created. Given the importance of having control of the session creation process, the ability to deny certain sessions access to the NAPT gateway and the flexibility introduced by the software routines, the CPU-based approach is the preferred choice.

[10] and [13] recommend, as a minimum requirement, that the port assignment algorithm of the NAPT gateway preserves the port range in order to be application friendly. As the port assignment algorithm is easily changeable when it is executed as a software routine, and since full coverage of every special requirement a possible application may have is not attempted here, a rather simple approach is chosen and used throughout the rest of this work. The approach is simple: preserve the port number if possible, otherwise randomly pick a free one from the entire range of port numbers.
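The chosen approach is small enough to be written out directly. The following is a minimal sketch, assuming the set of ports in use is kept as a Python set; the function name `select_napt_port` and the `rng` parameter are hypothetical, and the port range 0-65535 is taken from the text.

```python
import random

def select_napt_port(src_port, in_use, rng=random):
    """Preserve the source port if it is free, else pick a random free port."""
    if src_port not in in_use:
        in_use.add(src_port)
        return src_port                   # port number preserved
    free = [p for p in range(65536) if p not in in_use]
    if not free:
        return None                       # every port number is taken
    port = rng.choice(free)               # random pick from the entire range
    in_use.add(port)
    return port
```

Since the routine runs in software, replacing it with a range-preserving variant would only require restricting the list of free candidates to the range of `src_port`.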

3.4 Connection Establishment

The objective of this section is to shed light on the connection establishment process for the different protocols supported by the NAPT, in this case UDP, TCP and ICMP. For each protocol an evaluation of the connection process will be given to clarify the requirements. The connection establishment processes differ greatly from each other, but they all end with the creation of a mapping in the NAPT, which allows the actual data transfer to occur.

The first protocol treated is UDP, which is the least complex commonly used general-purpose data transmission protocol, and as a natural continuation of the treatment of the connection establishment, it will be followed by an evaluation of the TCP connection establishment. TCP is much more complex than its counterpart UDP, especially regarding the connection establishment phase, where more than one way of establishing a connection exists. Managing the connection establishment phase too strictly in order to gain control could harm the many proposed techniques for traversing NAPT gateways, which enable TCP peer-to-peer connections to work properly. This subject will be discussed while evaluating the TCP connection establishment. The ICMP protocol is also treated. Its typical use is to signal errors occurring in an ongoing data transmission, which has already created a mapping beforehand, but there are scenarios where ICMP needs to create a mapping on its own. To enable certain applications to work, it is therefore necessary for the NAPT to handle ICMP connection establishment as well. All in all this section should end up with a clear view of what precautions to take while designing the connection establishment part of a NAPT gateway supporting UDP, TCP and ICMP, and how to actually design an architecture handling the supported protocols.

All examples throughout the Connection Establishment section are based on the constellation of CPU-assigned NAPT port numbers as described in section 3.3, which implies that the NAPT tables and the lookup procedures are changed in accordance with the port assignment approach. The general considerations about the mapping and filtering behavior presented and explained in 3.1 and 3.2 still hold true.

3.4.1 User Datagram Protocol - UDP Connection Establishment

The UDP [26] is associated with simplicity, as it is a minimalistic data exchange protocol. The header of a UDP packet consists of only four fields (8 bytes in length) besides the actual data payload. An illustration of a UDP packet with the fields named and sized can be seen in Figure 3.21.

Figure 3.21: The UDP header consists of only 4 fields. The use of two of those is optional (pink background in table). The numbers are the bit size of the fields

Based on the concise exploration of the UDP just given, it can be concluded that the UDP protocol contains very little functionality. It provides the essential addressing capability that the port number offers, and the checksum field in the header gives an optional opportunity to make a checksum calculation of the UDP packet. As a consequence of this simplicity, the UDP does not establish a lasting connection between devices; it does not acknowledge received data or retransmit lost messages, and it certainly does not concern itself with esoterics such as flow control and congestion management.
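To make the header layout concrete, the four 16-bit fields (source port, destination port, length and checksum) can be unpacked with a few lines of Python. The function name `parse_udp_header` and the sample bytes are made up for illustration; the port numbers 121 and 90 are borrowed from the example used later in this section.

```python
import struct

def parse_udp_header(segment):
    """Unpack the four 16-bit UDP header fields (network byte order)."""
    src_port, dst_port, length, checksum = struct.unpack("!HHHH", segment[:8])
    return {"src_port": src_port, "dst_port": dst_port,
            "length": length, "checksum": checksum}

# 8 header bytes followed by a 4-byte payload; checksum field left at zero
header = parse_udp_header(bytes.fromhex("0079005a000c0000") + b"data")
```

The length field counts header and payload together, hence 12 for a 4-byte payload.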

Despite its functional limitations, the UDP excels in situations where performance is more important than completeness, e.g. multimedia applications, and in TCP/IP applications where the underlying protocol consists of only a very simple request/reply exchange, e.g. a short request message is sent from a client to a server, and a short reply message goes back from the server to the client. In this situation there is no real need to set up a connection like TCP does. Also, if only one short message is sent, it can be carried in a single IP datagram. This means there is no need to worry about data arriving out of order, flow control between the devices and so forth.

Because of the lack of states in the data exchange procedure of the UDP, the NAPT core encounters difficulties in handling UDP sessions in a strict manner. The establishment/creation of a UDP session in the NAPT core is triggered by the first outbound packet wanting to traverse the NAPT gateway. Recall that inbound UDP packets not belonging to an already established session are dropped, as they are considered a security risk. To gain insight into the establishment procedure of the NAPT gateway handling UDP packets, an example is provided below.

Example: UDP session establishment

Imagine a situation where a host named A, running an application identified with port number 121 on the private network, initiates a data transfer through a NAPT gateway to an application with port number 90 on a host named E on the Internet, as illustrated in Figure 3.22. The data sequence consists of three packets, of which only the first one is of interest in this section. It is further assumed that the internal tables of the NAPT gateway are initially empty in this example.

Figure 3.22: Sequence of UDP packets in a data transmission between host A and E through a NAPT gateway.

When the first packet is received at the internal interface of the NAPT gateway and stored in a buffer, its identity is checked in the NAPT tables, i.e. the mapping and filtering tables, CAM-0 and CAM-1. As the session is unknown to the NAPT tables in this case, the CPU is interrupted with code 20, which basically informs the CPU to execute the software routines necessary for a UDP session to be established in the software layer of the NAPT gateway. The first of these routines runs a search for possible firewall settings made explicitly for the session concerned, typically a combination of the source and destination addresses and ports. The security policy chosen is not restricted to the addresses and ports of the packet, but could in principle be of any kind, for instance a time period filtering. After the verification step, the CPU continues with the creation procedure, provided that the session is approved, i.e. no firewall restriction is valid for that particular session. Otherwise the packet is discarded without notifying the initiating endhost: the hardware layer is notified with a return code of 20.1 to discard the packet without further action.

If the packet is accepted, a vendor-specific port-selection software routine, as described in section 3.3, selects a NAPT port number. As mentioned earlier, this number is used to masquerade the source port number of the outbound packet and to be a unique identifier for the session.

Next a fingerprint of the session is saved in the CPU's session table. The purpose of saving a fingerprint in a session table in the case of a UDP session is to provide the administrator of the NAPT gateway with a logging feature. The content of the session table after creation of the session concerned can be seen in Figure 3.23. Note that the source port number is preserved and as a result the NAPT port number is the same as the source port number. The valid bit, abbreviated with a V in the session table, is intended for administrative usage to clarify the availability of the entry, like in the case of the mapping and filtering tables in section 3.1 and 3.2.

Figure 3.23: Content of the CPU session table after processing the first outbound initializing UDP packet.


Since the UDP protocol is stateless, the mode and state fields are left unused.

Finally the actual header manipulation of the packet is carried out: the original source address is replaced with the NAPT gateway's external IP address and the source port number is replaced with the one selected by the CPU. The checksum, which is described in detail in section 3.6, is updated if it is in use, i.e. the checksum field does not consist entirely of zeros; otherwise it is left unchanged.
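One common way to update a checksum after a header field has changed is the incremental update of RFC 1624, shown here as a sketch. It is not necessarily the method described in section 3.6, and the function names are hypothetical. Per RFC 768 an all-zero checksum field means that the sender computed no checksum, and a computed value of zero is transmitted as all ones. The update is applied once for every 16-bit word that the translation changes, i.e. the two halves of the source IP address and the source port.

```python
def ones_add(a, b):
    """16-bit one's complement addition with end-around carry."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)

def full_checksum(words):
    """Reference: Internet checksum over a list of 16-bit words."""
    s = 0
    for w in words:
        s = ones_add(s, w)
    return ~s & 0xFFFF

def update_checksum(checksum, old_word, new_word):
    """Incremental update (RFC 1624, eq. 3): HC' = ~(~HC + ~m + m')."""
    s = ones_add(~checksum & 0xFFFF, ~old_word & 0xFFFF)
    s = ones_add(s, new_word)
    return ~s & 0xFFFF

def update_udp_checksum(checksum, old_word, new_word):
    """Leave an unused (all-zero) UDP checksum untouched."""
    if checksum == 0:
        return 0
    updated = update_checksum(checksum, old_word, new_word)
    return updated or 0xFFFF    # a computed zero is sent as all ones
```

The incremental form touches only the changed words and is therefore independent of the payload length.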

The remaining steps in the creation procedure are done by the hardware on cue from a return code of 20.0.
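The software side of the UDP session creation can be outlined as follows. The sketch only covers the firewall check, the port selection and the fingerprint; the header manipulation is left out. The packet representation as a dictionary, the rule interface (a callable returning True to block) and all names are assumptions made for illustration, whereas the codes 20, 20.0 and 20.1 are those used in the text.

```python
RET_CONTINUE = "20.0"   # hardware completes the session creation
RET_DISCARD = "20.1"    # hardware drops the packet without further action

def on_udp_interrupt(pkt, firewall_rules, session_table, select_port):
    """Handle interrupt code 20 for the first outbound packet of a UDP session."""
    # Firewall check: any matching rule blocks the session
    if any(rule(pkt) for rule in firewall_rules):
        return RET_DISCARD, None
    # Port selection by the (replaceable) software routine
    napt_port = select_port(pkt["src_port"])
    # Fingerprint of the session for logging purposes
    key = (pkt["src_ip"], pkt["src_port"], pkt["dst_ip"], pkt["dst_port"])
    session_table[key] = {"protocol": "UDP", "napt_port": napt_port, "valid": True}
    return RET_CONTINUE, napt_port

pkt = {"src_ip": "10.0.0.2", "src_port": 121,
       "dst_ip": "198.51.100.7", "dst_port": 90}
```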

These remaining steps include saving relevant packet header data in the NAPT tables: source IP address and port number in CAM-0 and destination IP address and port number in CAM-1. In addition, both tables include the CPU-selected NAPT port number. In the session memory the ALI-bit is set to indicate that the session is currently active. The ALI-bit is used by the NAPT core to distinguish between the states a session can exist in, active or inactive, and plays a central role in the refresh/termination part of the session handling, which will be treated in the next section 3.5. The content of the internal NAPT tables after the creation of this specific session can be seen in Figure 3.24.

Figure 3.24: Content of the NAPT tables after processing the first outbound initializing UDP packet.

A last step is needed to complete the creation procedure: a session timer needs to be started. The purpose of this timer is to manage unused sessions. Recall that UDP sessions are stateless, and therefore only a timer can be used to estimate the liveness of the session. In [10] it is recommended that the value of the session timer is set to a minimum of 2 minutes. The behavior of the NAPT core when the session timer expires, and the session management in general, are, like the use of the ALI-bit, treated further in the next section 3.5.
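A session timer of this kind can be modelled very simply. The sketch below only shows a deadline that is started at creation and can be refreshed; it deliberately does not model the ALI-bit or the expiry behavior, which are the subject of section 3.5. The class name and the explicit `now` arguments are assumptions that keep the example deterministic.

```python
UDP_SESSION_TIMEOUT = 120.0   # seconds; the recommended minimum of 2 minutes

class SessionTimer:
    def __init__(self, now, timeout=UDP_SESSION_TIMEOUT):
        self.timeout = timeout
        self.deadline = now + timeout     # timer started at session creation

    def refresh(self, now):
        """Restart the timer, e.g. when the session shows activity."""
        self.deadline = now + self.timeout

    def expired(self, now):
        return now >= self.deadline
```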

The packet is then ready to be handed over to the external interface and transmitted to the endhost.


3.4.2 Transmission Control Protocol - TCP Connection Establishment

The Transmission Control Protocol (TCP) [17] is one of the main protocols of the Internet Protocol Suite (TCP/IP). TCP is a rather complex data exchange protocol compared to the above-mentioned UDP protocol, as it guarantees delivery of data and also guarantees that packets will be delivered in the same order in which they were sent. Among its other management tasks, TCP controls the message size, the rate at which messages are exchanged, and network traffic congestion. As a result of the increased functionality, the TCP header is correspondingly larger; it consists of a minimum of 20 bytes. Figure 3.25 shows the general layout of a TCP packet. Besides the source and destination port, the flags U-Urgent, A-Acknowledgment, P-Push, R-Reset, S-Synchronise and F-Finalise are of great importance to the NAPT gateway. TCP is a stateful protocol. To determine the current state of a connection, the NAPT gateway evaluates the flags of the received packets.

Figure 3.25: An illustration of the different fields a TCP header consists of. The numbers are the bit sizes of the fields.

As touched on in the beginning of the main section, there is more than one way to establish a TCP connection. The standard [17] prescribes two main methods to establish a connection: the 3-way handshake and simultaneous-open. The 3-way handshake is a popular and frequently used connection establishment method and must therefore be supported by the NAPT, but as peer-to-peer connections gain currency nowadays, the simultaneous-open connection establishment method also becomes more widespread, as it is the connection type used to establish direct connections between peer-to-peer applications. For successful NAPT TCP traversal both the 3-way handshake and the simultaneous-open must be supported by the NAPT. Not all current NAPTs in the field are able to manage the simultaneous-open. A test [32] of sixteen NAT products in a lab and 93 home NATs in the wild shows that all NATs allow the 3-way handshake, but 13.6% of the NATs do not support the simultaneous-open. Those 13.6% will need a different approach to establish a peer-to-peer connection.

The section starts with an introduction to the 3-way handshake and the simultaneous-open approach, to give an understanding of how the NAPT in this work handles each of these connection methods. The introduction is followed by a review of TCP NAT-traversal approaches that have been proposed in recent literature, to give an overview of what a NAPT must support if it is to be fully application friendly. Lastly a rather detailed example representing the solution for the 3-way handshake in this work is given. The coverage of the 3-way handshake should be sufficient to clarify the NAPT behavior, as the handling of the simultaneous-open is procedurally very similar.

3-way handshake

TCP stands out from the other transport layer protocol, UDP, as it provides reliable, ordered delivery of a stream of bytes from one application on one host to another application on another host. As a part of the solution to provide those advantages, every connection undergoes a connection establishment phase, a data transmission phase and a termination phase. The connection establishment phase is treated in this section as a whole, but only the 3-way handshake is examined in this part, to outline the issues in supporting it. Figure 3.26 shows a snapshot of a time period illustrating the different phases a TCP connection undergoes and the types of packets involved. The blue packets belong to either the establishment phase or the termination phase. The pink packets are undefined for the moment. Of special interest in this section is of course the establishment phase, which is described so as to end up with a road map for the implementation. The terminology is important for the understanding and is covered first, before going into details.

A connection is initiated by a SYN packet, and this step is therefore named connection initiation. In other words, connection initiation implies an unsolicited SYN packet, i.e. a received SYN packet for which no mapping exists beforehand. The connection initiation is pointed out in Figure 3.26. The actual connection establishment is defined as being fulfilled when the packet pattern SYN, SYN-ACK and ACK has been observed by the NAPT.

The actual mapping in the NAPT tables is created every time the NAPT observes an unsolicited SYN packet. However, the connection can in practice stay in the establishment phase indefinitely, e.g. if the requested endhost is dead due to a power failure and therefore does not respond at all; the initiating host will at best retry, but it could also give up the connection for good. To solve this problem a timer for the establishment phase is introduced. According to [28], which defines and discusses the requirements for Internet host software, a TCP connection in the establishment phase can stay idle for at most 4 minutes while waiting for in-flight packets to be delivered. This time period is called the “transitory connection idle-timeout”. From the NAPT’s viewpoint a connection gets at least 4 minutes to present the packet pattern SYN, SYN-ACK, ACK; otherwise the mapping will be terminated. This behavior should allow almost all current applications to work; recall that there is no 100% solution.

Figure 3.26: Establishment, data transmission and termination phase of a 3-way handshake.

For a NAPT gateway to handle this, it needs to react to every received SYN packet and create a session, unless doing so breaks the security policy determined by the administrator, in which case the session is blocked. Furthermore a session timer belonging to the created mapping, with an expiration period slightly larger than 4 minutes, must be started. Successive packets which belong to the same session must be monitored, and when the packet pattern SYN, SYN-ACK and ACK has been registered, the session changes status from the establishment phase to the data transmission phase, as visualized in Figure 3.26.
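The monitoring described above can be sketched as a small state machine. This is only an illustration under stated assumptions, not the implementation of this work: the names `Phase`, `HandshakeTracker` and `observe` are hypothetical, packet direction is ignored for brevity, and the 10-second margin added to the 4-minute transitory timeout is an arbitrary choice for "slightly larger".

```python
from enum import Enum

# Assumption: "slightly larger than 4 minutes" is modelled as 4 min + 10 s.
TRANSITORY_TIMEOUT_S = 4 * 60 + 10


class Phase(Enum):
    SYN_SEEN = 1       # unsolicited SYN forwarded, waiting for SYN-ACK
    SYN_ACK_SEEN = 2   # SYN-ACK forwarded, waiting for the final ACK
    TRANSMISSION = 3   # pattern SYN, SYN-ACK, ACK has been observed
    EXPIRED = 4        # timer ran out during establishment


class HandshakeTracker:
    """Tracks one session created by an unsolicited SYN (hypothetical helper)."""

    def __init__(self, now: float):
        self.phase = Phase.SYN_SEEN
        self.deadline = now + TRANSITORY_TIMEOUT_S

    def observe(self, flags: set, now: float) -> Phase:
        """Update the phase from the TCP flags of the next packet."""
        if self.phase is not Phase.TRANSMISSION and now > self.deadline:
            self.phase = Phase.EXPIRED
        elif self.phase is Phase.SYN_SEEN and flags == {"SYN", "ACK"}:
            self.phase = Phase.SYN_ACK_SEEN
        elif self.phase is Phase.SYN_ACK_SEEN and flags == {"ACK"}:
            self.phase = Phase.TRANSMISSION
        return self.phase
```

Packets that do not match the expected pattern leave the phase unchanged in this sketch; how such packets are actually treated is the subject of the following paragraphs.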

One question emerges in the light of the presented connection establishment sequence: what to do with packets that interfere with the connection establishment sequence? The short answer is to silently drop them all. This solution will work, but it will cause trouble when e.g. the endhost is unavailable and, in response to the SYN packet, creates and sends an ICMP Port Unreachable (type 3, code 3) error message back (the illustration to the right in Figure 3.27).


Figure 3.27: Left: Ordinary TCP 3-way handshake. Right: Interfered between SYN and SYN-ACK by an ICMP error message caused by endhost or network.

If the NAPT does not allow error messages to interfere with the connection establishment sequence and therefore erroneously filters them, the initiating host will be unaware of the problem. Typically the host will have to wait until its own application timer expires before it can retransmit the initiating packet, even though the endhost is unavailable or closed. The ICMP error message must not terminate the session, as this would prevent reliable performance of the simultaneous-open connection establishment method, discussed later. The NAPT should simply wait for the rest of the 3-way handshake connection establishment sequence to appear, even though it will not appear and the NAPT knows this (it has forwarded an error message). It will become clear why the NAPT must ignore its knowledge of the error when the simultaneous-open connection establishment method is covered.

Other types of messages can be received in response to the initiating packet, e.g. the non-fatal ICMP time exceeded message indicating that the Time To Live (TTL) was set too low. This message is used by the majority of NAT TCP-traversal approaches currently available. Filtering this type of packet will reduce the application designer’s choices for traversing a NAPT.

As an example, the STUNT #1 TCP-traversal approach [14] requires the NAPT to permit the non-fatal ICMP time exceeded message to appear between the initiating SYN and the SYN-ACK packet. Figure 3.28 illustrates to the left the connection sequence seen by the NAPT for the concerned session and the STUNT #1 TCP-traversal approach to the right.

The STUNT #1 approach works in this way: both endpoints send an initial SYN with a TTL high enough to cross their own NATs, but small enough that the packets are dropped in the network as soon as the TTL expires. The endpoints learn the initial TCP sequence number used by their operating system’s (OS) stack by listening for the outbound SYN over PCAP or a RAW socket. Both endpoints inform a globally reachable STUNT server of their respective sequence numbers, after which the STUNT server spoofs a SYN-ACK to each host with the sequence numbers appropriately set. The ACK completing the TCP handshake goes through the network as usual.

Figure 3.28: Left: Interfered between SYN and SYN-ACK by an ICMP time exceeded message caused by network. Right: STUNT #1 TCP-traversal approach.

Besides the ICMP messages, RST packets can also occur between the SYN and SYN-ACK packets. In that case the NAPT must forward the packet only if it answers the original SYN packet: the TCP sequence number of the RST packet must be zero and the TCP acknowledge number must match the stored TCP sequence number + 1 of the initiating SYN packet. Storing and comparing the TCP sequence and acknowledge numbers prevents spurious RSTs from entering the private network. The mapping must terminate if the RST is valid; otherwise the RST must be silently dropped and the connection establishment sequence remains in force, such that the establishment phase can continue. In that way spurious RSTs with wrong TCP sequence or acknowledge numbers are filtered. For the sake of clarity, an illustration of a scenario where an RST packet is sent in response to a SYN is provided to the right in Figure 3.29. Note that the TCP acknowledge number of the RST is consistent with the TCP sequence number + 1 of the initiating SYN packet.
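The validity check for an RST received between SYN and SYN-ACK can be written as a single comparison. The sketch below is a minimal illustration; the function name `rst_answers_syn` and its parameters are hypothetical. The only detail added to the rule stated above is the wrap-around of the 32-bit TCP sequence space.

```python
def rst_answers_syn(rst_seq: int, rst_ack: int, stored_syn_seq: int) -> bool:
    """Return True if an RST seen after the initiating SYN is valid.

    Rule from the text: the RST sequence number must be zero and its
    acknowledge number must equal the stored SYN sequence number + 1.
    """
    expected_ack = (stored_syn_seq + 1) % 2**32  # sequence numbers are 32 bits wide
    return rst_seq == 0 and rst_ack == expected_ack


def handle_rst(rst_seq: int, rst_ack: int, stored_syn_seq: int) -> str:
    """Forward a valid RST and terminate the mapping; silently drop a spurious one."""
    if rst_answers_syn(rst_seq, rst_ack, stored_syn_seq):
        return "forward_and_terminate"
    return "drop"
```

For example, with a stored SYN sequence number of 1000, an RST carrying sequence number 0 and acknowledge number 1001 is forwarded, while any other combination is dropped and the establishment phase continues.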

Bottom line: for the NAPT to be friendly to applications and TCP-traversal approaches, it must allow ICMP error messages (type 3) and ICMP time exceeded messages (type 11) to pass in between the initiating SYN packet and the SYN-ACK packet. Furthermore, for security reasons the NAPT should terminate the mapping, or block successive packets belonging to the session, when a valid RST packet is forwarded in response to a SYN packet, and the NAPT must not change its connection establishment state when receiving spurious RSTs, but should just silently drop them. The TCP sequence number must also be used to gain maximum control of the whole connection establishment sequence.

Figure 3.29: Left: Ordinary TCP 3-way handshake. Right: Interfered between SYN and SYN-ACK by an RST packet caused by endhost.

The SYN-ACK, ACK part of the connection establishment sequence can be interfered with as well and is handled in the same way as the SYN, SYN-ACK case described above. The NAPT must allow ICMP messages of both type 3 and type 11 to traverse the NAPT in response to the SYN-ACK packet. The ICMP message must leave the mapping unaffected. In case an RST packet belonging to the referenced connection is received, its sequence number must be the same as the stored TCP acknowledge number copied from the SYN-ACK packet. If it does not match, the RST packet must be classified as a spurious RST packet. The mapping must terminate if the RST is valid. Figure 3.30 illustrates the two scenarios.

Figure 3.30: Left: Interfered between SYN-ACK and ACK by an RST packet caused by endhost. Right: Interfered between SYN-ACK and ACK by an ICMP error message caused by endhost or network.

Simultaneous Open

Besides the well-known 3-way handshake establishment used by client-server applications to initiate a connection, an alternative method of connection establishment named simultaneous-open exists. In a simultaneous-open connection initiation, two peers simultaneously send a SYN packet to each other, the two packets cross each other in the network, and both applications respond by sending a SYN-ACK packet to the other to indicate that the connection is established. For a NAPT to permit peer-to-peer applications it has to support the packet sequence introduced by the simultaneous-open. Figure 3.31 below illustrates the packet sequence introduced by the simultaneous-open way of establishing a connection.

Figure 3.31: TCP - Simultaneous Open.

When the connection establishment phase of the simultaneous-open approach has completed, the scenario is like the 3-way handshake approach, with a data transmission phase followed by a connection termination phase as shown in Figure 3.32.

Figure 3.32: Establishment, data transmission and termination phase of a simultaneous-open.


The simultaneous-open connection establishment sequence (SYN, SYN, SYN-ACK and SYN-ACK) can, as in the case of the 3-way connection establishment, be interfered with by other packets, e.g. ICMP messages and RST packets. Furthermore the order in which the SYN-ACK packets arrive can differ due to network delay etc. Many of the precautions made for the 3-way handshake are also valid for the simultaneous-open. For instance, ICMP messages of both type 3 and type 11 are allowed to interfere with the connection establishment phase and must not terminate the mapping. RST packets are treated as fatal errors, causing the mapping to terminate or block any further packets for the referenced connection.
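Because the two SYN-ACK packets may arrive in either order, a tracker for the simultaneous-open cannot expect one fixed sequence. The following sketch shows one possible way to handle this by recording which directions have been seen; the class name `SimultaneousOpenTracker` and the direction labels "out" and "in" are assumptions made for illustration, and the handling of ICMP and RST packets is left out.

```python
class SimultaneousOpenTracker:
    """Order-independent tracking of SYN, SYN, SYN-ACK, SYN-ACK (hypothetical)."""

    BOTH = {"out", "in"}

    def __init__(self):
        self.syn_from = set()      # directions from which a SYN was seen
        self.syn_ack_from = set()  # directions from which a SYN-ACK was seen

    def observe(self, direction: str, flags: set) -> bool:
        """Record one packet; return True once the connection is established."""
        if flags == {"SYN"}:
            self.syn_from.add(direction)
        elif flags == {"SYN", "ACK"} and self.syn_from == self.BOTH:
            # A SYN-ACK only counts after both SYN packets have crossed.
            self.syn_ack_from.add(direction)
        return self.syn_ack_from == self.BOTH
```

In this sketch the session counts as established as soon as a SYN-ACK has been seen in both directions, regardless of which one arrived first.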

Handling unsolicited inbound SYN packets where the support is disabled

A fully transparent NAPT allows both in- and outbound SYN packets to initiate a connection. For security reasons the decision whether a NAPT should allow connection initiation by unsolicited inbound SYN packets should be offered as an option to the administrator. The handling of in- and outbound connection establishment is alike, just reversed, and therefore only connections initiated by outbound SYN packets are treated further. Before digging into the details of how the NAPT handles connection initiations, both 3-way handshake and simultaneous-open, a short look is taken at how the NAPT should handle a received unsolicited inbound SYN packet where the support for inbound establishment is disabled, at least for this mapping.

One intuitive and simple approach is to silently drop those inbound packets without any response. This solution leaves the sending initiator unaware of the problem until its internal session timer expires or it in some other way figures out that an error has occurred. This is of course undesirable from the endhost’s point of view, since these self-fixing methods are typically quite slow; the sooner the application has knowledge of an error, the sooner it can react. To avoid a situation where an application is idling because it is unaware of a packet drop, the NAPT could respond with an error message, either an ICMP or an RST packet, informing that a problem has occurred. Unfortunately, sending a TCP RST or ICMP Port Unreachable (type 3, code 3) error message in response to an unsolicited SYN packet could cause the endhost to abort the connection attempt. This is a problem when the endhost is trying to establish a connection simultaneously with a host on the private side of the NAPT and the solicited outbound SYN packet has not yet created the necessary mapping in the NAPT, e.g. due to network congestion. If the NAPT just silently drops the packet, the endhost will after some time retransmit the SYN packet, and there is a possibility that a mapping then exists, created by the delayed outbound SYN packet.

On one hand, sending an error message based on an inbound SYN packet may harm applications attempting a simultaneous-open, but on the other hand it helps the endhost to discover an error earlier. Since the NAPT is not able to determine the underlying intention of a SYN packet received from the public realm, there is no absolute solution to the problem. A compromise has been suggested in [13]: the NAPT delays the error message for a period of 6 seconds after the inbound SYN is received. In other words, if an outbound SYN packet is received within the 6 seconds, no error message is sent. In the opposite case, where no outbound SYN packet arrives in the meantime, the NAPT responds with an ICMP Port Unreachable (type 3, code 3) error message. The solution allows peer-to-peer applications to perform a simultaneous-open and it reduces the time the endhost has to wait in case of an erroneous packet. The decision whether the NAPT should silently drop the SYN packet or respond with an error message should be offered as an option to the administrator, as responses to unsolicited inbound SYN packets could be misused by an attacker.
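The 6-second compromise can be illustrated with a small record kept per unsolicited inbound SYN. This is a sketch only; the class `PendingInboundSyn` and its methods are hypothetical names, and time is passed in explicitly as seconds so that the logic stays independent of any particular timer mechanism.

```python
ERROR_DELAY_S = 6.0  # delay suggested in [13]


class PendingInboundSyn:
    """Remembers an unsolicited inbound SYN until the delay has passed."""

    def __init__(self, received_at: float):
        self.received_at = received_at
        self.cancelled = False

    def outbound_syn_seen(self) -> None:
        # A matching outbound SYN arrived and created the mapping,
        # so the error message must not be sent.
        self.cancelled = True

    def should_send_icmp_error(self, now: float) -> bool:
        """True once 6 seconds have passed without a matching outbound SYN."""
        return not self.cancelled and now - self.received_at >= ERROR_DELAY_S
```

When `should_send_icmp_error` returns True, the NAPT would respond with the ICMP Port Unreachable (type 3, code 3) message, provided the administrator has enabled error responses.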

3.4.3 TCP NAT-Traversal approaches

The preceding subsection describes the two fundamental ways of establishing a TCP connection between two applications on their respective hosts (3-way handshake and simultaneous-open, according to the standard [17]) and the handling of these in the connection establishment phase. The NAPT must monitor and control the connection establishment first and foremost to determine when to declare the connection fully established and switch the status of the session from the connection establishment state to the data transmission state, as visualized in both Figure 3.26 and Figure 3.32. While strict control of the connection establishment phase is obviously a benefit for determining when to change the session state, the downside is the reduced flexibility caused by the fixed connection establishment sequences. It has been shown that in order to support the STUNT #1 TCP-traversal approach, it is necessary to allow the connection establishment sequence to be interfered with by an ICMP time exceeded message. This subsection presents some of the options an application designer has today for traversing NAPT gateways, making TCP peer-to-peer applications possible. It is necessary to have this knowledge, as it affects the way the NAPT should handle the connection establishment phase. In the following, four different approaches are presented. STUNT #1 is already known, but is included for the sake of completeness, as it has a counterpart named STUNT #2 [14] [22]; the other two approaches are NATBlaster [5] and P2PNAT [11].

STUNT


The STUNT #1 approach presented earlier requires a public STUNT server, which is not always feasible. The second approach, STUNT #2, solves exactly that problem. One endhost sends out a SYN packet with a low Time-To-Live (TTL) that is just sufficient to traverse its NAT, and immediately afterwards terminates the connection attempt, after which it creates a passive TCP socket on the same IP address and port. The other endhost then initiates a regular TCP connection, as illustrated on the right in Figure 3.33.

For the NAPT gateway to support this approach it must not consider the ICMP error message as a fatal error, and it must also accept an inbound SYN packet following an outbound SYN packet.

Figure 3.33: Left: STUNT #1 Right: STUNT #2.

NATBlaster

In the NATBlaster approach each endhost sends out a SYN packet with a low TTL value. The packets are dropped somewhere in the middle of the network. The concerned hosts exchange the sequence numbers of their respective SYN packets, and each produces the SYN-ACK packet that the other end expects. The SYN-ACK packet is injected into the network through a RAW socket. Once the SYN-ACK packets are received, the TCP connection between the two endhosts is completed with the exchange of the prescribed ACK packets. An illustration of the NATBlaster approach is shown to the left in Figure 3.34.

For the NAPT gateway to support this approach it must not consider the ICMP error message as a fatal error. It must also accept an outbound SYN-ACK packet immediately after an outbound SYN packet. Lastly, the approach fails if the NAT gateway changes the sequence number of the SYN packet.


P2PNAT

The P2PNAT approach takes advantage of the simultaneous-open scenario presented in 3.4.2. Both endhosts initiate a connection by sending a SYN packet. If the two SYN packets cross in the network, both endhosts respond with a SYN-ACK packet. For the NAPT gateway to support this approach, it must accept an inbound SYN packet after an outbound SYN packet; in other words, it must support the packet sequence of the simultaneous-open scenario.

Figure 3.34: Left: NATblaster Right: P2PNAT.

Example: TCP Connection Establishment

This example illustrates step by step the handling of a 3-way handshake connection establishment, with initially empty NAPT session tables. To simplify the example, nothing interferes with the establishment sequence during the handshake, i.e. no error packets or RST packets are sent in response to any of the packets in the 3-way handshake sequence.

Imagine a situation similar to the UDP example regarding the overall setup, where a host named A, running an application identified by port number 121 on the private network, initiates a data transfer through a NAPT gateway to an application with port number 90 on a host named E on the Internet, as illustrated in Figure 3.35. The data sequence consists of a total of nine packets, of which only the first three are of interest in this section.

When the NAPT receives an outbound SYN packet and no mapping exists for the session, i.e. the source IP address and source port are unknown to the CAM-0 table, the packet processing is forwarded to the CPU with an interrupt carrying the information that the session does not already exist (interrupt code 40). Before the actual forwarding the NAPT has to make sure that there is space left in both NAPT tables, CAM-0 and CAM-1, for a new session. If this is not the case, the CPU is instead instructed to create an ICMP error message (interrupt code 67), here in the form of an ICMP destination unreachable error packet (type 3, code 3) notifying the initiating host that the NAPT is not able to accommodate the request.

Figure 3.35: Time line sequence of TCP packets, which makes up a TCP connection establishment, data transmission and termination phase between host A:121 and E:90 through the NAPT gateway.

In this case (interrupt code 40), where the session is unknown beforehand and there is an empty entry, the CPU begins by checking the firewall settings for the packet to determine whether the session is allowed to be created. The firewall check is configurable by an administrator to meet the chosen security policy, which means that it is possible e.g. to design specific filters based on a combination of the packet’s source IP address, source port, destination IP address, destination port and protocol.
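A filter built from a combination of these five fields can be sketched as follows. The rule format, the names `Rule` and `session_allowed`, the first-match semantics and the default-allow policy are all assumptions made for illustration; the text above does not prescribe how the firewall rules are represented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    """One filter rule; a field left as None matches any value (hypothetical format)."""
    src_ip: Optional[str] = None
    src_port: Optional[int] = None
    dst_ip: Optional[str] = None
    dst_port: Optional[int] = None
    protocol: Optional[str] = None
    allow: bool = True


FIELDS = ("src_ip", "src_port", "dst_ip", "dst_port", "protocol")


def session_allowed(packet: dict, rules: list, default: bool = True) -> bool:
    """The first matching rule decides; otherwise the default policy applies."""
    for rule in rules:
        if all(getattr(rule, f) in (None, packet[f]) for f in FIELDS):
            return rule.allow
    return default
```

As a usage example, a rule list containing only `Rule(dst_port=90, protocol="tcp", allow=False)` would block every new TCP session towards port 90 while leaving all other sessions to the default policy.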

Based on the outcome of the firewall check, the CPU proceeds in one of two ways. If the firewall check fails, it discards the packet and forms an ICMP destination unreachable error packet (type 3, code 9), which is then forwarded back to the initiating host by the NAPT core (return code 67.0), though this could violate the security policy for the NAPT gateway. Otherwise it continues with the next step, which is assigning a NAPT port number to the session.

The NAPT port selection strategy used in this example preserves the masqueraded source port number (NAPT port equals source port) if it is available, i.e. not in use by another session. If the requested NAPT port number is already in use, another random port number from the same port range is chosen. Besides assigning a NAPT port number, the CPU saves a fingerprint of the packet so that an administrator can monitor the internal activity of the NAPT. An example of what a TCP entry in the CPU session table, which is used to manage sessions, consists of is shown in Figure 3.36.

Figure 3.36: Content of the CPU session table after processing the first initial packet (SYN) in the TCP connection establishment phase.

In short, the CPU session table should include all information that may be needed for each protocol it supports. Some information is mandatory for the NAPT to perform its core task, while other information is nice to have for an administrator. For instance, as the NAPT monitors the establishment sequence of packets, it needs to save the current status of the establishment packet sequence, such that it knows what type of packet it should expect to see next in the 3-way handshake. In contrast, what is nice to have is a fingerprint of the running sessions in the NAPT, represented by the source and destination IP addresses and ports together with the protocol and the NAPT port number. When the CPU has checked the firewall settings for the packet, chosen a suitable NAPT port number, updated the CPU session table and modified the packet’s header, the further processing is forwarded to the NAPT core together with some instructional information (a return code 40.0 in this case) about what the NAPT needs to do to complete the mapping creation.
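The port selection strategy used in this example (preserve the source port if it is free, otherwise pick a random free port from the same range) can be sketched as below. The function name `select_napt_port` is hypothetical, and the split into the two ranges 0-1023 and 1024-65535 is an assumption, since the ranges are not specified in this example.

```python
import random

# Assumption: two port ranges, well-known ports and all remaining ports.
PORT_RANGES = [(0, 1023), (1024, 65535)]


def select_napt_port(source_port: int, in_use: set, rng=random) -> int:
    """Preserve the source port if it is free, else choose a random free
    port from the same range. Raises RuntimeError if the range is full."""
    if source_port not in in_use:
        return source_port
    low, high = next((lo, hi) for lo, hi in PORT_RANGES if lo <= source_port <= hi)
    free = [p for p in range(low, high + 1) if p not in in_use]
    if not free:
        raise RuntimeError("no free NAPT port in the range of the source port")
    return rng.choice(free)
```

For host A in the example, `select_napt_port(121, set())` returns 121, whereas with port 121 already taken the result is some other free port between 0 and 1023. Building the list of free ports on every call is simple but slow; it is chosen here only to keep the sketch short.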

When the NAPT core takes over, it reads the return code and thereby gets a recipe of what to do. In this case, where the return code equals 40.0, the NAPT core’s task is to write the session-relevant data into the NAPT tables, both CAM-0 and CAM-1, and set the ALI-bit, TM-bit and TS-bit in the session memory to their proper values. For every initial SYN packet the ALI-bit is set to indicate that the session is alive (used by the NAPT clean-up procedure to differentiate between dead and alive sessions). The TM-bit (Timer Mode) is cleared to specify the mode of the session’s timer; as stated above, the timer mode during a connection establishment is called the “transitory connection idle-timeout” and the timer value must not be less than 4 minutes. The TM-bit is used by the clean-up procedure, which is introduced in section 3.5.2, to differentiate between the connection phases and thereby select the timer value when resetting the timer. Lastly the TS-bit (Transitory State) is set; it is used by the NAPT to differentiate between the transitory connection phase and the connection established phase.

Figure 3.37 below shows the NAPT tables after the NAPT core has written the relevant data into the respective tables.

Figure 3.37: Content of the NAPT tables after processing the first initial packet (SYN) in the TCP connection establishment phase.

To sum up, an outbound SYN packet is received and, since no mapping exists, the NAPT core checks that there is a free entry in both CAM-0 and CAM-1 before it interrupts the CPU with an interrupt code to inform it. The CPU chooses a procedure based on the provided interrupt code; here it must check the firewall settings for the intended mapping, select a NAPT port number, register the session and modify the packet’s header before it can return the responsibility to the NAPT core. The last thing needed before the packet is forwarded to the public realm is an update of the NAPT tables.

The next packet received by the NAPT is, as expected, an inbound SYN-ACK packet belonging to the previous SYN packet. Even though the packet belongs to an active mapping, it is categorized as being part of an establishment phase. This is based partly on the fact that a SYN-ACK packet belongs to a TCP establishment phase only, and partly on the TS-bit, which is set to indicate that the concerned mapping is in an establishment phase. Therefore the further processing of the packet is, like that of the previous one, forwarded to the CPU. Again an interrupt code is attached to guide the CPU, here interrupt code 41, which signifies that the packet belongs to an establishment sequence. The task for the CPU consists of modifying the header, i.e. replacing the destination IP address and port number of the packet with the stored original source IP address and port number and recalculating the checksum of the packet, together with an adjustment of the current status of the state order, which results in a slightly changed CPU session table as illustrated in Figure 3.38.

Figure 3.38: Content of the CPU session table after processing the second packet (SYN-ACK) in the TCP connection establishment phase.

The CPU has now completed the modification of both the CPU session table and the SYN-ACK packet’s header. The responsibility for the packet is subsequently handed to the NAPT core together with a return code 41.0, which informs the NAPT core to simply forward the packet to the private network without updating any NAPT tables. If for any reason the packet does not belong to the connection establishment phase, the return code would be 41.1, a message notifying the NAPT core to silently discard the packet.

The internal NAPT tables after the SYN-ACK packet processing are shown in Figure 3.39.

Figure 3.39: Content of the NAPT tables after processing the second packet (SYN-ACK) in the TCP connection establishment phase.

The last packet in the establishment sequence is an outbound ACK packet, which triggers a CPU interrupt with interrupt code 41, as the TS-bit is set. In this case only the TS-bit can tell whether the packet belongs to a connection establishment phase or not.

The task performed by the CPU includes replacing the source IP address with the NAPT gateway's external IP address and the source port number with the unique NAPT port number, recalculating the checksum of the packet, and updating the session's fingerprint in the CPU session table. As the 3-way connection establishment is completed with the reception of the ACK packet, the Mode field in the CPU session table is changed to transmission (trans.). The return code is 41.2 when the packet has been processed successfully and 41.3 when it has not.

The internal CPU session table after the ACK packet processing is shown in Figure 3.40.

Figure 3.40: Content of the CPU session table after processing the last packet (ACK) in the TCP connection establishment phase.

Since the connection establishment is completed, the NAPT core must, on the basis of return code 41.2, change the connection idle-timeout from transitory to established, meaning that the timer value is reset to 2 hours and 4 minutes. To make sure that the session-timer gets another 2 hours and 4 minutes instead of 4 minutes at subsequent restarts carried out by the clean-up procedure, the TM-bit is set in the session memory, indicating that the alternative time period is chosen.

The TS-bit is likewise cleared, as subsequent "normal" packets belonging to the session do not need to be treated by the CPU.

If return code 41.3 is returned, the packet is silently discarded, as in the case of the previous SYN-ACK packet.

The content of the NAPT tables after completion of the establishment process is shown in Figure 3.41.

Figure 3.41: Content of the NAPT tables after processing the last packet (ACK) in the TCP connection establishment phase.
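The reaction of the NAPT core to the return codes 41.0 to 41.3 can be summarized in a small software model. The following Python sketch is only an illustration of the behavior described above and not the hardware implementation; the names `SessionMemory` and `apply_return_code` are hypothetical, and it is assumed that a packet answered with 41.2 is forwarded after the session memory has been updated.

```python
from dataclasses import dataclass

TRANSITORY_TIMEOUT_S = 4 * 60               # 4 minutes
ESTABLISHED_TIMEOUT_S = 2 * 3600 + 4 * 60   # 2 hours and 4 minutes


@dataclass
class SessionMemory:
    """Hypothetical model of the per-session bits used during TCP establishment."""
    ts_bit: bool = True     # set while the mapping is in the establishment phase
    tm_bit: bool = False    # set when the alternative (established) period applies
    timer_s: int = TRANSITORY_TIMEOUT_S


def apply_return_code(session: SessionMemory, code: str) -> str:
    """Return the action the NAPT core takes for a CPU return code."""
    if code == "41.0":
        # SYN-ACK accepted: forward without updating any NAPT tables.
        return "forward"
    if code in ("41.1", "41.3"):
        # Packet does not fit the establishment sequence: silently discard.
        return "discard"
    if code == "41.2":
        # ACK accepted: switch from the transitory to the established timeout.
        session.timer_s = ESTABLISHED_TIMEOUT_S
        session.tm_bit = True    # later restarts use the long period
        session.ts_bit = False   # later packets bypass the CPU
        return "forward"         # assumption: the ACK itself is forwarded
    raise ValueError(f"unknown return code {code!r}")
```

After `apply_return_code(session, "41.2")` the modelled session carries a cleared TS-bit, a set TM-bit and a timer value of 7440 seconds.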


3.4.4 Internet Control Message Protocol - ICMP Connection Establishment

Internet Control Message Protocol (ICMP) is an error reporting and diagnostic utility and is considered a required part of any IP implementation. Understanding ICMP and knowing what can possibly generate a specific type of ICMP message is necessary in the design of NAPT gateways.

ICMP messages are used by routers, intermediary devices or hosts to communicate updates or error information to other routers, intermediary devices, or hosts. Each ICMP message contains three fields that define its purpose and provide a checksum: the TYPE, CODE and CHECKSUM fields. The TYPE field identifies the ICMP message, the CODE field provides further information about the associated TYPE field, and the CHECKSUM field provides a method for determining the integrity of the message.

A complete list of all possible types of ICMP messages can be found in appendix F. In this work only a subset of the full list is considered, but since the types share great similarities, extending the NAPT gateway's capability should be relatively easy. The reason is that even though a large list of possible ICMP messages is available, only two packet layouts exist for all types of messages, namely the ICMP Query and Error message layouts. The two layouts are shown in Figures 3.42 and 3.43.

Figure 3.42: ICMP Error message layout. The numbers are the bit sizes of the fields.

Figure 3.43: ICMP Query message layout. The numbers are the bit sizes of the fields.

The alert reader will note that no port number exists for ICMP messages, as it does for UDP and TCP. To support ICMP, some modifications have to be made to the way the packet is handled by the NAPT gateway.

According to [25], which lists some behavioral recommendations a NAPT gateway ought to follow, not all types of ICMP messages need to be supported. Some types are even marked as types that should not be supported. The list in appendix F includes a column that states, by class, whether each type is to be supported or not. The classes are MUST, MAY and SHOULD NOT. In the following only the MUST-supported types are considered, but again the model should be easy to expand to handle all types, as the MUST-supported types fall into both main categories, Query and Error messages. For the record, the MUST-supported types are Destination Unreachable (type 3, code 0-15), Time Exceeded (type 11, code 0-1) and Echo Request/Reply (type 8 and 0, code 0) messages.
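As an illustration of how the TYPE and CODE fields select the handling, the following Python sketch reads the first three header fields of an ICMP message and sorts the message into the Query or Error layout if it is one of the MUST-supported types listed above. The table `MUST_SUPPORT` and the function names are hypothetical and serve only as a sketch of the classification; all other types are simply reported as unsupported here.

```python
import struct

# MUST-supported types from the text: type -> (layout, allowed codes).
MUST_SUPPORT = {
    0: ("query", range(0, 1)),    # Echo Reply, code 0
    8: ("query", range(0, 1)),    # Echo Request, code 0
    3: ("error", range(0, 16)),   # Destination Unreachable, code 0-15
    11: ("error", range(0, 2)),   # Time Exceeded, code 0-1
}


def parse_icmp_header(message: bytes):
    """Return (TYPE, CODE, CHECKSUM) from the first four bytes of an ICMP message."""
    if len(message) < 4:
        raise ValueError("ICMP message too short")
    return struct.unpack("!BBH", message[:4])


def classify(message: bytes):
    """Return 'query', 'error', or None for types outside the MUST-supported set."""
    icmp_type, code, _checksum = parse_icmp_header(message)
    layout, codes = MUST_SUPPORT.get(icmp_type, (None, ()))
    return layout if code in codes else None
```

A "port unreachable" message (type 3, code 3) is classified as `"error"` by this sketch, while a Redirect message (type 5) yields `None`.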

Before looking at the real-world treatment of ICMP packets through examples, some generalities regarding the handling of the two main types of ICMP messages are brought to light in the following.

ICMP Error Messages

ICMP error messages originate only as a response to an already ongoing data transmission such as UDP or TCP. As a result an error message never creates a session by itself; it simply uses the existing session to which it belongs. ICMP error messages can also turn up in an ICMP query request/reply flow. Although the ICMP header does not itself include any port number, the payload includes a partial copy (see Figure 3.42) of the original packet causing the error. This copy includes the information the NAPT gateway requires to forward the packet.

Since error messages are used across different protocols, and since they neither create, refresh nor terminate any sessions, they do not really belong to either of the sections ICMP Connection Establishment or ICMP Refresh and Termination. Despite this, the evaluation, consideration and treatment of ICMP error messages are mainly placed in this section, and a detailed example of a situation where an error message occurs, together with the handling done by the NAPT gateway, is given later in this section.

ICMP Query Messages

ICMP Query messages have the ability to create sessions on their own. A popular application, known by almost everyone in the field, is the Ping application. Ping sends a request to a host and, if the request is successfully received, a reply is sent back. It is obvious that an outbound Ping request must be able to create a session in the NAPT gateway by itself. Unfortunately the Query message layout does not include port numbers at all, as UDP and TCP do, but instead it has an Identifier field in the header, which can be used to create a session in the NAPT gateway. Regarding the session-timer's time period for ICMP Query messages, [25] argues that the timer value should be larger than 60 seconds. A Ping example is provided as one of the examples below.

Example: ICMP Error Message forwarding (inbound)

This example presents the solution in this work with a step-by-step handling of an ICMP error message coming from the public realm because of an error that occurred somewhere on the path to the destination endhost, e.g. a router signaling "destination network administratively prohibited", which means that the packet causing the error has been discarded administratively by a router.

As mentioned above and shown in Figure 3.42, a copy of the packet that caused the error is embedded in the payload of the ICMP error message. With this information it is possible for the NAPT gateway to forward the ICMP error message to the right host on the private network using the already established session, i.e. the one the original outbound packet belongs to. The solution applies to both type 3 and type 11 error messages and all their respective codes.

To simplify the example it treats a general case, where the error message occurs during a normal data transfer between two endhosts and is caused by an outbound packet, independent of the protocol type, either UDP or TCP. The UDP protocol is used for illustration in the example.

Figure 3.44 illustrates a scenario where an application with port number 121 on host A on a private network sends an outbound packet to an application with port number 90 on host E, which resides on the public network (step 1 in Figure 3.44).

The packet undergoes the normal header translation by the NAPT gateway and is afterwards forwarded (step 2 in Figure 3.44). Unfortunately the application on the endhost is down for some reason. The host therefore replies with an ICMP error packet with the message "port unreachable" (type 3, code 3). The error packet is destined for the NAPT gateway and carries a full or partial copy of the original packet (step 3 in Figure 3.44).

When the NAPT gateway receives an ICMP error message, in this case an inbound error packet, the first thing it does is to determine whether the packet belongs to an already known session or not. A lookup in the mapping table (CAM-0) is performed using the source port number of the embedded packet (the unique NAPT port number of the session that caused the error) together with the protocol type of the embedded packet as input for the lookup procedure.

Figure 3.44: ICMP error message example; the application on host E is down and therefore host E responds with an ICMP error message.

The content of the NAPT tables applicable to this example is shown in Figure 3.45. It can be seen that a lookup in the mapping table results in a hit, and a subsequent read of the content of the entry provides the NAPT core with the information necessary to complete the packet translation.

The destination IP address of the IP header must be replaced with the IP address from the lookup procedure, and the source IP address and port number in the embedded packet must likewise be replaced with the IP address and port number from the lookup procedure (step 4 in Figure 3.44).

Before forwarding can take place, the packet's validity has to be checked by a lookup in the filtering table as usual, but with the destination IP address and port number from the embedded packet. It is not possible to make any check including the outer IP address, as the origin of the error message is unpredictable: failures can happen anywhere in transit on the Internet. The ICMP packet can therefore come from an unknown IP address and still be acceptable.
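The lookup, validity check and translation of the inbound error message can be sketched in software as follows. This is a simplified Python model under stated assumptions and not the CAM-based implementation: the mapping table is modelled as a dictionary keyed by NAPT port number and protocol, the filtering table as a set that additionally holds the remote address and port, and all class names, addresses and the NAPT port number 5000 are hypothetical. Only the port numbers 121 and 90 are taken from the example.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class EmbeddedPacket:
    """Header copy of the original outbound packet carried in the ICMP payload."""
    protocol: str
    src_ip: str
    src_port: int   # NAPT port number after the outbound translation
    dst_ip: str
    dst_port: int


@dataclass(frozen=True)
class IcmpError:
    src_ip: str     # outer source, deliberately not checked
    dst_ip: str     # outer destination, the gateway's public address
    embedded: EmbeddedPacket


def translate_inbound_error(msg, mapping_table, filtering_table) -> Optional[IcmpError]:
    """Return the translated message, or None if it must be silently discarded."""
    emb = msg.embedded
    # Lookup with the embedded source port (NAPT port) and embedded protocol.
    private = mapping_table.get((emb.src_port, emb.protocol))
    if private is None:
        return None
    # Validity check with the embedded destination address and port.
    if (emb.src_port, emb.protocol, emb.dst_ip, emb.dst_port) not in filtering_table:
        return None
    private_ip, private_port = private
    restored = replace(emb, src_ip=private_ip, src_port=private_port)
    return replace(msg, dst_ip=private_ip, embedded=restored)
```

With a mapping entry `(5000, "UDP") -> ("192.168.0.10", 121)`, an error message whose embedded packet carries source port 5000 is directed to 192.168.0.10 with the embedded source port restored to 121; the outer source address is left untouched.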


Figure 3.45: Content of the NAPT tables and the CPU session table for the concerned session.

Because it is an ICMP error message, the NAPT core forwards the packet to the CPU with interrupt code 65. The CPU either accepts or rejects the ICMP packet and sends return code 65.0 (accepted) or 65.1 (fail, discard). In this case the packet is accepted.

The CPU is involved mainly for two reasons: first, so that the administrator can monitor exceptions in a log, and second, to reject a possible Denial-of-Service (DoS) attack caused by flooding the NAPT gateway with ICMP error packets. The administrator can specify an ICMP error security policy to minimize the risk of being exploited by an attacker.

Example: ICMP Query session establishment

Imagine a situation where a host named A runs the well-known application Ping, which tries to get a reply by sending a request to an endhost E on the public network, as illustrated in Figure 3.46. The Ping application uses an ICMP query message of type 8 and code 0 to carry the request and an ICMP query message of type 0 and code 0 for the reply. The ICMP query format does not include any source port number, but instead it includes an Identifier number, which is used as a substitute for the missing source port number. The data sequence consists of two packets, of which only the first is of interest in this section. It is further assumed that the internal tables of the NAPT gateway are initially empty in this example.

The NAPT core starts by determining the originating interface, in this case the internal interface. It then checks for a possible active session in the NAPT tables. To determine whether the session exists, a lookup in the mapping table (CAM-0) is performed with the source IP address, the Identifier number (used as source port number) and the protocol type as input. In this example no active session exists beforehand, so a session needs to be created. A free entry in both the mapping and the filtering table (CAM-0 and CAM-1) is reserved before a CPU interrupt is raised with code 60 to indicate that the session is unknown and needs to be created. If no free entries are found, the CPU is instead informed to create an ICMP error message (type 3, code 9), which should be directed toward the sender of the ICMP query request.

Figure 3.46: ICMP query message example; host A uses the utility program Ping to discover the existence of endhost E.

The CPU starts by checking any firewall settings managed by an administrator that may exist for this packet. If this check fails, the packet is discarded and return code 60.1 is forwarded to the NAPT core. If the creation is allowed, the CPU assigns a unique NAPT port number to the session and saves the relevant information by updating its own session table. The CPU's session table, as it would look in this case, is shown in Figure 3.47.

The NAPT port number is used to mask the original Identifier number instead of the usual source port number, which does not exist in the ICMP query packet layout; see Figure 3.43.

Now the CPU should replace the ID field of the ICMP header with the unique NAPT port number and change the source IP address of the IP header to the NAPT's public IP address. The situation is shown in Figure 3.46 in steps 1 and 2. When the CPU has successfully completed its tasks, it forwards a message with return code 60.0 to the NAPT core.

Figure 3.47: Content of the CPU session table after processing the first outbound initializing ICMP query packet.

The NAPT core executes the procedure associated with the return code. In this case the task is to write the session data into the NAPT tables, set the ALI-bit in the session memory and lastly start a session-timer with a time period of 60 seconds for the concerned session. The content of the NAPT tables after the creation of the session is shown in Figure 3.48.

Figure 3.48: Content of the NAPT tables after processing the first outbound initializing ICMP packet.

Now the ICMP query message of type 8 and code 0, i.e. the ICMP request message, can be sent to the external interface.
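The decision flow of the session establishment for an outbound ICMP query can be condensed into the following Python sketch. It is a simplified model under several assumptions: the NAPT tables are reduced to dictionaries with a fixed capacity, the firewall decision of the CPU is passed in as the flag `cpu_allows`, the NAPT port number is supplied by the caller, and all names are hypothetical. The codes 60, 60.0 and 60.1 and the 60 second timer are taken from the example above.

```python
from dataclasses import dataclass, field

ICMP_QUERY_TIMEOUT_S = 60


@dataclass
class NaptTables:
    """Hypothetical software stand-in for the mapping table and session memory."""
    capacity: int
    mapping: dict = field(default_factory=dict)   # (src_ip, identifier, proto) -> NAPT port
    ali_bit: dict = field(default_factory=dict)   # NAPT port -> entry in use
    timer_s: dict = field(default_factory=dict)   # NAPT port -> session-timer period


def handle_outbound_query(tables, src_ip, identifier, cpu_allows, napt_port):
    """Return (action, napt_port) for an outbound ICMP query packet."""
    key = (src_ip, identifier, "ICMP")   # Identifier replaces the source port
    if key in tables.mapping:
        return ("forward", tables.mapping[key])      # session already exists
    if len(tables.mapping) >= tables.capacity:
        return ("icmp_error_type3_code9", None)      # no free entry available
    # Interrupt code 60: the CPU decides whether the session may be created.
    if not cpu_allows:
        return ("discard", None)                     # return code 60.1
    # Return code 60.0: write the session, set the ALI-bit, start the timer.
    tables.mapping[key] = napt_port
    tables.ali_bit[napt_port] = True
    tables.timer_s[napt_port] = ICMP_QUERY_TIMEOUT_S
    return ("forward", napt_port)
```

The returned NAPT port number is the value that replaces the Identifier field of the forwarded request.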

3.4.5 Discussion/Summary

Generally, any packet, regardless of the protocol, ought to have its checksum verified before any further processing is initiated, to confirm the validity of the packet. The checksum of UDP packets is optional, so UDP packets are exempted from this statement when the checksum is disabled. If the verification of the checksum is skipped for any reason, an erroneous packet that undergoes the prescribed header manipulation, including the recalculation of the checksum that is crucial for the translation process, could as a consequence appear error free. Disabling the verification of the checksum could harm applications, as they would have little chance of detecting errors.
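The effect can be demonstrated with a small Python sketch. It uses the 16-bit one's-complement Internet checksum over a deliberately simplified, hypothetical packet format (a 2-byte checksum, a 2-byte port field and a payload); a real UDP or TCP checksum also covers a pseudo-header, which is left out here. The function names are assumptions made for the illustration.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum over data (padded to an even length)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def seal(body: bytes) -> bytes:
    """Prefix the body with a checksum computed with the checksum field zeroed."""
    return struct.pack("!H", internet_checksum(b"\x00\x00" + body)) + body


def is_valid(packet: bytes) -> bool:
    """A packet is intact when the checksum over the whole packet is zero."""
    return internet_checksum(packet) == 0


def translate(packet: bytes, new_port: int, verify_first: bool):
    """Rewrite the port field and recompute the checksum over the whole packet."""
    if verify_first and not is_valid(packet):
        return None   # corrupted packet is dropped before translation
    return seal(struct.pack("!H", new_port) + packet[4:])
```

If a packet with a corrupted payload is passed to `translate` with `verify_first=False`, the result passes `is_valid`, which means the corruption is hidden from the receiver; with `verify_first=True` the packet is rejected.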

Besides the described dynamic connection establishment for each protocol, a static connection establishment should also be an option. A static connection is permanent, i.e. the session timer is disabled. Static sessions are created by the administrator of the NAPT gateway. This feature could be used as a last resort to make an application behind the NAPT work, or when a server is placed behind the NAPT gateway.

UDP Connection Establishment

Allowing inbound UDP packets to establish connections by creating sessions in the NAPT gateway is deemed to be a security risk, as it opens the gateway to potential intruders with possibly malicious intentions. If there is a demand for the NAPT gateway to have the option of creating sessions on the basis of inbound packets, the NAPT gateway could be equipped with an administratively controlled option that allows predetermined sessions to be created. There will still be a security risk associated with this feature, as the predetermined sessions are available for potential intruders to gain access to the private network behind the NAPT gateway. The decision whether the NAPT should allow inbound packets to establish connections must be based on a weighing of security against flexibility, in other words a compromise. One could argue that the security considerations for letting inbound packets create sessions and for allowing static sessions are the same.

With regard to the session timer value used in the establishment procedure, no optimal value exists, as it is a matter of balance between flexibility and resource constraints. Endhost applications will benefit from a large session timer value, but this could lead to table exhaustion. As mentioned in the beginning of the section, [10] advises that the UDP session timer value be set to at least 2 minutes. As there is no standard on the topic, it is up to the vendor to select an appropriate session timer value that meets a chosen compromise between flexibility and the NAPT gateway's resource constraints. The session timer value could vary between port numbers, such that known applications could be handled on an exceptional basis. It should be mentioned that deterministic behavior is important, so a certain consistency must be maintained. Applications behind NAPT gateways should determine whether they actually are behind a NAPT gateway and then, if necessary, send keep-alive packets to maintain the session in the NAPT gateway.

A DoS attack is an attempt to make a computer resource unavailable to its intended users. If the NAPT gateway responds with an error message every time an endhost tries to establish a connection by sending an inbound packet, the NAPT gateway exposes itself to potential DoS attacks. An attacker could abuse this type of NAPT behavior by constantly flooding the external interface of the NAPT gateway with malicious packets and thereby force the NAPT gateway down. The NAPT gateway must not respond with an ICMP error message on receiving an inbound UDP packet that has no active session; it should simply discard the packet.

TCP Connection Establishment

A commonly known DoS attack performed against NAPT gateways is SYN flooding. A SYN flood attack consists of a client sending TCP connection requests faster than the NAPT gateway can process them. Since every SYN packet is processed by the CPU, it is possible for an administrator to set up a security policy, which for instance could reject multiple requests from the same endhost or other suspicious SYN patterns. It should be noted that sophisticated SYN attacks are difficult to detect and prevent, simply because they look like normal TCP connection requests.
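One possible form of such a policy is a limit on the number of SYN packets accepted from the same endhost within a time window. The Python sketch below illustrates this idea only; it is not part of the presented solution, and the class name `SynRateLimiter` as well as the parameters `max_syns` and `window_s` are hypothetical values an administrator would have to choose.

```python
from collections import defaultdict, deque


class SynRateLimiter:
    """Sliding-window limit on SYN packets per source address (illustrative only)."""

    def __init__(self, max_syns: int, window_s: float):
        self.max_syns = max_syns
        self.window_s = window_s
        self._seen = defaultdict(deque)   # source IP -> timestamps of accepted SYNs

    def allow(self, src_ip: str, now: float) -> bool:
        """Return True if a SYN from src_ip at time now may be processed."""
        stamps = self._seen[src_ip]
        # Forget SYNs that have fallen out of the time window.
        while stamps and now - stamps[0] >= self.window_s:
            stamps.popleft()
        if len(stamps) >= self.max_syns:
            return False                  # too many requests: reject this SYN
        stamps.append(now)
        return True
```

With `SynRateLimiter(max_syns=2, window_s=1.0)` a third SYN from the same address within one second is rejected, while SYN packets from other addresses are unaffected. As noted above, such a simple policy does not stop an attack that spreads its requests over many source addresses.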

For TCP connections a sequence number exists in the packet's header. This can be monitored to maintain a strict security and flow control policy. In this solution it is not monitored, though it should be straightforward to implement in the CPU for the connection establishment phase. Monitoring the sequence number makes it harder for a blind attacker to traverse the NAPT gateway, because the sequence number now also has to be known before a successful attack can take place. For an attacker who monitors the packet flow himself, the monitoring of the sequence number will have more or less no effect.

As mentioned in [13], it is "RECOMMENDED" that the idle-timeout in the transitory state be at least 4 minutes and that the established connection idle-timeout be at least 2 hours and 4 minutes. In the solution this could be made configurable by the administrator, despite the risk of not being application friendly if the idle-timeout is set to less than 4 minutes or 2 hours and 4 minutes, depending on the state.

ICMP Connection Establishment

ICMP error messages that have no active session in the NAPT gateway must be silently discarded regardless of the originating realm, because it would constitute a major security risk if the NAPT gateway tried to forward the packet anyway although no mapping existed. If the NAPT gateway correctly dropped the message but, as a service, sent a response to the endhost informing it that no session exists, this could be exploited by malicious people to mount DoS attacks against the NAPT gateway by flooding it with ICMP error packets. Silently discarding the packets prevents those attacks.


The checksum of the embedded packet in an ICMP error message is not considered by the NAPT gateway. The reason is that no endhost receiving the error message can rely on the validity of that checksum, as only a small part of the original packet that caused the error message is guaranteed to be included. Recall that the entire packet is required in order to calculate the checksum.

The general value of the ICMP query session-timer should be made configurable to maximize the flexibility of the NAPT gateway. Setting the timer value too high would tie up entries in the NAPT tables in a wasteful manner. Setting the session-timer too low could result in a premature release of the session, which spoils connections in progress. The recommendation given in [25] is a compromise between the two extremes.

The two main types of ICMP messages, query and error, are supported. Only a few of all possible subtypes were treated, but full coverage should be feasible in light of the great similarities among the different types of query and error messages. The supported subtypes should be chosen by an administrator on an individual basis, based on the selected security policy for the NAPT gateway.

3.5 Connection Refresh and Termination

In the previous section the connection establishment was described and quite detailed examples were given for each supported protocol. This section focuses on the maintain/refresh and termination phases of a session's lifetime in the NAPT gateway. The structure of the section is kept similar to that of the connection establishment section, with a general introduction to the subject and a differentiated treatment of the supported protocols. Furthermore the examples provided build upon the examples in the connection establishment section as a direct continuation.

When a session has been established, the session is defined to be in data transmission mode, where packets belonging to the session can traverse the NAPT gateway as long as the session exists. As mentioned in the establishment section, a session is given a time period to live when it is created in the NAPT gateway. The principal mode of operation regarding the refreshment of this time period is as follows: every time a packet traverses the NAPT gateway in the outbound direction, the ACT-bit for the concerned session is set, if not already set. The clean-up procedure, which comes into action when the NAPT is idling, examines the session memory bits session by session in a linear way. For those sessions where the ALI- and ACT-bits are set, the session timer is restarted and the ACT-bit is cleared. Recall that the ALI-bit indicates whether the session is actually in use. It is also important to mention that not all sessions are treated at once. Every time a session has been examined and possibly treated, the clean-up procedure saves the current session number and returns to packet processing mode. If no packets are present, the NAPT gateway switches back to the clean-up procedure to continue the evaluation of the sessions.

The clean-up process ignores sessions where the ALI-bit is not set, regardless of any other parameters. In cases where the ALI-bit is set but the ACT-bit is cleared, the mode of operation depends on the session timer status. If the session timer for the concerned session has not yet expired, the session is active and no action is needed from the clean-up procedure's point of view. If the session timer has expired, the situation is completely different and the circumstances call for a deletion of the session. In other words, the deletion procedure is triggered for a session where the ALI-bit indicates that the session takes up an entry in the NAPT tables, the ACT-bit indicates no packet activity since the last visit, and the session timer has expired. The actual steps involved in the deletion procedure are treated and illustrated by examples for each supported protocol below.
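The decisions taken by the clean-up procedure for a single session can be summarized in the following Python sketch. It is a software model of the described behavior and not the hardware implementation; the names `Session` and `CleanUp` are hypothetical, time is passed in explicitly as `now`, and the deletion is reduced to clearing the ALI-bit, whereas the actual deletion also involves the NAPT tables and the CPU.

```python
from dataclasses import dataclass


@dataclass
class Session:
    ali: bool = False         # entry is in use
    act: bool = False         # outbound activity since the last visit
    expires_at: float = 0.0   # point in time at which the session timer expires
    period_s: float = 120.0   # time period granted at every restart


class CleanUp:
    """Examines one session per call and remembers where to continue."""

    def __init__(self, sessions):
        self.sessions = sessions
        self.index = 0        # saved session number

    def step(self, now: float) -> str:
        session = self.sessions[self.index]
        self.index = (self.index + 1) % len(self.sessions)
        if not session.ali:
            return "ignored"                  # unused entry
        if session.act:
            session.expires_at = now + session.period_s
            session.act = False
            return "refreshed"                # timer restarted, ACT-bit cleared
        if now >= session.expires_at:
            session.ali = False               # simplified deletion
            return "deleted"
        return "kept"                         # idle, but timer not yet expired
```

Each call to `step` corresponds to one visit of the clean-up procedure before control returns to packet processing.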

The refresh and clean-up mechanism described above applies to all protocols, both stateless and stateful. Moreover it is the only way for a session of a stateless protocol to be terminated in the NAPT gateway. Recall from the treatment of the TCP protocol that an explicit termination sequence exists, but depending on the termination sequence alone could end in a permanent occupation of entries in the NAPT tables if something went wrong and the NAPT did not register the termination sequence correctly or at all.

3.5.1 User Datagram Protocol - UDP Connection Refresh and Termination

The basic principles of the UDP protocol were treated in the previous section 3.4.1. In this subsection a direct continuation of the UDP session establishment example is presented to clarify the procedures involved.

Example: UDP Session Refresh and Termination

This example is a continuation of the "UDP session establishment" example in section 3.4.1. A connection is established between the application with port number 121 on host A on the private network behind the NAPT gateway and an application with port number 90 running on host E on the Internet. The sequence diagram from the establishment example is shown again in Figure 3.49 and, as stated, a response to the initial packet has been sent to the application on host A from the application on host E.

Figure 3.49: Sequence of UDP packets in a data transmission between host A and E through a NAPT gateway.

The packet is received by the external interface of the NAPT gateway, saved and processed before it is actually retransmitted. The packet's source address and port together with the destination port number, which is the masked destination port number, i.e. the NAPT port number, are looked up in both NAPT tables, the mapping table (CAM-0) and the filtering table (CAM-1). It is assumed in this example that all the response packets to the initiating packet are received by the NAPT gateway within the time period in which the connection is still alive. As the session is still active, both lookups result in hits and the relevant fields of the packet's header can be modified by the NAPT. The modification includes replacing the destination address and port of the packet with the information looked up in the mapping table (CAM-0) and updating the checksum if necessary.

The packet is now ready to be forwarded to the private network. It is important to note that the status of the ACT-bit has been left unchanged for security reasons: if an inbound packet had the capability to refresh the session timer through the ACT-bit, a potential intruder could abuse this to keep the session alive forever. Since the packet is found to be part of an already existing session, the packet processing is done entirely in hardware without the involvement of the CPU. The content of the NAPT tables and the session table in the CPU is left unchanged after the completion of the packet processing, and can be seen in Figure 3.50.

The third and last packet in this example is another outbound packet similar to the first packet, which created the session. The packet is as before received at the internal interface of the NAPT gateway and the procedure is analogous to that of the packet just described, except for the direction. With the source IP address, port number and protocol type of the packet, a lookup in the mapping table is performed and, as expected, it results in a hit. The NAPT port number from the looked-up entry replaces the original source port number of the packet, and the IP address of the external interface of the NAPT gateway replaces the source IP address. A lookup in the filtering table is performed to verify that the endhost is known, and therefore no further action is needed. Again the checksum of the packet is recalculated if it is in use. Finally the ACT-bit in the session memory is set, such that the clean-up procedure is aware of the session's activity when it processes the concerned session. Recall that a raised ACT-bit entails that the session is allocated another time slot of 2 minutes by restarting the session timer. The content of the NAPT tables after the processing of the packet can be seen in Figure 3.51. As the CPU is not involved, its session table remains the same as in Figure 3.50.

Figure 3.50: Content of the NAPT tables and the CPU session table after processing the second, inbound UDP packet.

Figure 3.51: Content of the NAPT tables after processing the third outbound UDP packet.

Suppose that soon after the last packet is processed, the clean-up procedure turns its attention to the concerned session and finds that both the ALI-bit and the ACT-bit in the session memory are set. The outcome of the clean-up procedure is a restarted session timer and a cleared ACT-bit. The changes to the session memory can be seen in Figure 3.52.

Figure 3.52: Content of the NAPT tables after a session clean-up, where the ACT-bit is cleared after a session timer restart.

While the session timer is counting down, the session is preserved, but as soon as the countdown has finished and the clean-up procedure reaches the session, the session is terminated by deleting the entries it occupies in both the NAPT tables and the CPU's session table. Besides clearing the ALI-bit in the session memory and the valid-bit in both the mapping and filtering tables, the CPU is interrupted with code 70 and the entry in the CPU's session table is also released.
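
The refresh and termination rules of the UDP example can be summarised in a small behavioural sketch. It is only an illustration under assumptions: the names `SessionEntry`, `on_packet` and `clean_up` are hypothetical and not taken from the implementation, the session timer is modelled as a plain countdown in seconds, and the deletion of the table entries and the CPU interrupt with code 70 are reduced to a returned flag.

```python
from dataclasses import dataclass

UDP_TIMEOUT_S = 120  # 2 minutes, the UDP time slot used in the text


@dataclass
class SessionEntry:
    """Hypothetical model of one session-memory entry."""
    ali: bool = True            # ALI-bit: the session is alive
    act: bool = False           # ACT-bit: outbound activity since last clean-up
    timer: int = UDP_TIMEOUT_S  # remaining seconds of the session timer


def on_packet(entry: SessionEntry, outbound: bool) -> None:
    """Only outbound packets may mark the session as active."""
    if outbound:
        entry.act = True
    # Inbound packets leave the ACT-bit untouched for security reasons.


def clean_up(entry: SessionEntry) -> bool:
    """Visit one session; return True if the session was terminated."""
    if not entry.ali:
        return False
    if entry.act:
        # Activity was observed: restart the timer and clear the ACT-bit.
        entry.timer = UDP_TIMEOUT_S
        entry.act = False
        return False
    if entry.timer <= 0:
        # Expired without activity: release the session. In the design this
        # also clears the valid-bits and raises CPU interrupt code 70.
        entry.ali = False
        return True
    return False
```

In this sketch an inbound packet never changes the entry, so an outside host cannot postpone the termination, which is the property argued for above.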

3.5.2 Transmission Control Protocol - TCP Connection Refresh and Termination

The basic principles of the TCP protocol were treated in the previous section 3.4.2. In this subsection a direct continuation of the TCP session establishment example is presented to clarify the procedures involved.

Before continuing with the example and digging into the packet sequence in Figure 3.35, some protocol-specific issues need to be covered.

When the TCP session is established, the connection is in the data transmission phase. The NAPT gateway behavior in this phase is similar to that for an established UDP connection, except that the time period available for the TCP connection is 2 hours and 4 minutes compared to the 2 minutes in the UDP case. An application that has established a TCP connection is able to close the connection when the data exchange is completed. In order to close the connection, both applications must signal their desire by sending a TCP packet with the FIN flag set in the header, and both endhosts must subsequently reply with a TCP packet with the ACK flag set in the header.

The TCP standard supports half close, a situation where one application signals a close by sending a TCP packet with the FIN flag set and gets a reply packet with the ACK flag set. By doing this the application signals that it has no more data to send but is still willing to receive data, as long as the other endhost application has not closed. For the NAPT to handle this situation correctly, it must keep the session timer value at 2 hours and 4 minutes until at least two TCP packets with the FIN flag set are registered.

Besides the described closing procedure a simultaneous close exists. This packet sequence is quite similar to the simultaneous open approach, except for the flags set. Figure 3.53 shows the packet sequence for the simultaneous close approach.

Figure 3.53: Simultaneous close packet sequence.

For a NAPT gateway to handle the different approaches for connection termination, the termination phase must not be brought into effect until packets belonging to the same session, with the FIN flag set, have traversed the NAPT gateway in both directions. The transitory connection phase, where the idle-timeout is 4 minutes, does not become effective until a second packet with the FIN flag set has been seen in the opposite direction.
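
The timeout rule above can be illustrated with a short sketch. It assumes a hypothetical per-session record that remembers in which directions a FIN has been seen; the names `TcpSession`, `note_fin` and `idle_timeout` are illustrative and do not come from the implementation.

```python
from dataclasses import dataclass

ESTABLISHED_TIMEOUT_S = 2 * 3600 + 4 * 60  # 2 hours and 4 minutes
TRANSITORY_TIMEOUT_S = 4 * 60              # 4 minutes


@dataclass
class TcpSession:
    """Hypothetical record of the FIN flags observed for one session."""
    fin_outbound: bool = False  # FIN seen from the private network
    fin_inbound: bool = False   # FIN seen from the public network


def note_fin(session: TcpSession, outbound: bool) -> None:
    """Register a packet with the FIN flag set, per direction."""
    if outbound:
        session.fin_outbound = True
    else:
        session.fin_inbound = True


def idle_timeout(session: TcpSession) -> int:
    """The transitory timeout applies only after a FIN in both directions."""
    if session.fin_outbound and session.fin_inbound:
        return TRANSITORY_TIMEOUT_S
    return ESTABLISHED_TIMEOUT_S
```

Tracking the direction rather than merely counting FIN packets means that a retransmitted FIN from the same side does not end the data transmission phase of a half-closed connection.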

Example: TCP Session Refresh and Termination

Like the previous UDP example, this connection refresh and termination example is a continuation of the connection establishment example in section 3.4.3. A connection is established between the application with port number 121 on host A on the private network behind the NAPT gateway and an application with port number 90 running on host E on the Internet. The last part of the sequence diagram from the previous establishment example is shown in Figure 3.54. The first two packets in the sequence account for the data transmission phase and consist of an outbound packet with 10 bytes of data and a packet acknowledging the received data in the opposite direction. Immediately after receiving the acknowledgement from the endhost, the termination phase is triggered and a normal connection termination is started.

Figure 3.54: Timeline sequence of TCP packets which make up the data transmission and termination phases between host A:121 and E:90 through the NAT gateway.

The packet processing carried out in the TCP data transmission phase is more or less equal to the UDP packet processing procedure, i.e. packets from the private network must refresh the session timer for the concerned session, while packets from the public network must not refresh the session timer. As mentioned earlier, the established connection idle-timeout must be at least 2 hours and 4 minutes in the case of an established TCP packet flow.

When the data carrying TCP packet arrives at the NAPT gateway's interface, the source IP address, source port number and protocol type are used as input to a lookup in the mapping table (CAM-0). The lookup results in a hit in this case, since the entry was created in the establishment phase. The assigned NAPT port number is read and used together with the destination IP address and port number as input for a lookup in the filtering table (CAM-1), to see whether the destination already exists or a new destination must be created. Since the concerned packet exists in both the mapping and filtering tables, no further action is needed besides the header manipulation, where the source IP address and port are masqueraded and the checksum is recalculated. Before the packet is forwarded, the ACT-bit in the session memory is set to indicate that session activity has been observed. The CPU is not involved in the data transmission phase, as it would reduce the bandwidth.

Figure 3.55 shows the content of the NAPT tables after the first data carrying packet has been forwarded by the NAPT gateway.

Figure 3.55: Content of the NAPT table after processing the data carrying TCP packet.

The previous outbound data carrying packet informs the NAPT core to refresh the session timer, and it is assumed that the actual refresh is done before the data acknowledging inbound packet is received by the NAPT gateway. The changes made to the session memory can be seen in Figure 3.56.

At the time the acknowledging inbound packet is received, the destination port number, i.e. the unique NAPT port number, and the protocol type are looked up in columns three and four, respectively, of the mapping table, CAM-0. In this case the unique NAPT port number exists, so the lookup results in a hit, and a subsequent read of the entry provides the NAPT core with the IP address and port number of the host on the private network.

A check is made against the filtering table to see whether the packet is allowed to enter the private network. The source IP address and port number together with the destination port number of the packet are used as input to the filtering table. As the right information exists in the filtering table, the packet passes the check. If the packet did not pass the check, it would simply be silently discarded by the NAPT gateway.

The packet undergoes a header manipulation, where the destination IP address and port number are replaced with the IP address and port number read from the entry in the mapping table, and as usual the checksum is recalculated. The session memory is left unchanged, as inbound packets do not have the privilege to refresh the session timer. The treatment of the acknowledging inbound packet is also done entirely in the NAPT core without involving the CPU.
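
The two lookups and the header manipulation described for the data transmission phase can be sketched in software as follows. This is a minimal model under assumptions, not the hardware design: the mapping table (CAM-0) and the filtering table (CAM-1) are represented by a Python dictionary and a set, the key layouts are simplified, and all addresses as well as the NAPT port number 1024 are hypothetical values chosen for illustration. Checksum recalculation and the ACT-bit update are left out.

```python
from typing import Optional

EXTERNAL_IP = "198.51.100.1"  # hypothetical public address of the gateway

# Mapping table (CAM-0): private endpoint and protocol -> NAPT port number.
mapping = {("192.168.0.10", 121, "TCP"): 1024}

# Filtering table (CAM-1): remote endpoints known for a NAPT port number.
filtering = {(1024, "203.0.113.5", 90)}


def translate_outbound(pkt: dict) -> Optional[dict]:
    """Masquerade source address and port of a packet from the LAN."""
    napt_port = mapping.get((pkt["src_ip"], pkt["src_port"], pkt["proto"]))
    if napt_port is None:
        return None  # no session: the design would involve the CPU here
    if (napt_port, pkt["dst_ip"], pkt["dst_port"]) not in filtering:
        return None  # unknown destination: also left to the CPU
    return {**pkt, "src_ip": EXTERNAL_IP, "src_port": napt_port}


def translate_inbound(pkt: dict) -> Optional[dict]:
    """Restore the private destination of a packet from the WAN."""
    for (ip, port, proto), napt_port in mapping.items():
        if napt_port == pkt["dst_port"] and proto == pkt["proto"]:
            if (napt_port, pkt["src_ip"], pkt["src_port"]) not in filtering:
                return None  # fails the filter check: silently discarded
            return {**pkt, "dst_ip": ip, "dst_port": port}
    return None  # no matching session: silently discarded
```

A CAM performs the reverse search on the NAPT port number in a single lookup; the loop in `translate_inbound` only imitates that behaviour.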

The content of the NAPT tables in Figure 3.56 is left unchanged after the processing of the acknowledging inbound packet.

Figure 3.56: Content of the NAPT table after processing the data acknowledging TCP packet.

When the NAPT gateway receives the next packet, it detects that the FIN flag is set. Every TCP packet with a set FIN flag triggers a CPU interrupt with code 42, and the CPU routine matching code 42 takes over. The CPU simply registers that one of the participants in the connection has chosen to close down for further data transmission. This situation equals the half close situation presented in the introduction of the section. A timer refresh should be made anyway and the time period should be maintained, as the session is considered to be in the data transmission phase even though only one end is capable of sending data. The ACT-bit in the session memory is set and the TM-bit is maintained as it is.

With the use of the CPU session table, the packet check, the header translation and the recalculation of the checksum are done entirely by the CPU. If the packet is accepted, a return code 42.0 is sent to the NAPT core, informing it to update its session memory as explained above and to forward the packet to the right interface. Otherwise, if the outcome of the overall check turns out to be negative, the CPU informs the NAPT core to silently discard the packet with return code 42.1.

The content of both the CPU session table and the NAPT tables after processing the packet with the FIN flag set is shown in Figure 3.57.

Figure 3.57: Content of the CPU and NAPT tables after processing the first TCP packet where the FIN flag is set.

The following acknowledge response to the TCP packet with the FIN flag set is treated in exactly the same way as the second packet in the data transmission sequence, which acknowledges the data carrying packet. If the packet belongs to the stored session and passes the filtering check, the packet's header is modified by the NAPT in accordance with the normal procedure and the packet is forwarded to the private network. No modifications of the session memory take place, as it is an ordinary inbound packet. The packet is silently discarded if it does not pass the filtering rule or simply does not belong to an active session.

The second-last packet received by the NAPT gateway is of special interest, as it is the second FIN packet in the termination sequence. All FIN packets are consistently directed to the CPU with interrupt code 42, and since it is the last FIN, the status of the connection is changed from the data transmission phase to the termination phase. This switch entails that the session timer is restarted with the value of the "transitory connection idle-timeout" (4 minutes), regardless of its current value. The connection must not be closed entirely, as the endhost expects to get an acknowledgement in reply to the FIN packet. To gain maximum control of the following packet flow, the TS-bit is set. This ensures that all packets belonging to the session are redirected to and treated by the CPU. This behavior prevents intruders from abusing the transitory time period, as only the last acknowledge packet is allowed to traverse the NAPT gateway. The ACT-bit is not in use anymore, as it makes no sense to refresh the session timer on behalf of the last packet of that specific session traversing the NAPT gateway.

The CPU performs the header translation, which means changing the destination IP address and port number of the packet to the source IP address and port number from the CPU session table, makes the filtering check, updates the fingerprint and lastly recalculates the checksum. This is obviously only carried out if the packet belongs to an already created session. If the packet is accepted, the CPU hands over the responsibility to the NAPT core in the shape of return code 42.2. The NAPT core makes the changes to the session memory before it forwards the packet to the private network. If the packet does not fulfill the demands, a return code 42.3 is passed to the NAPT, informing it to silently discard the packet.

The content of both the CPU session table and the NAPT tables is shown in Figure 3.58.

Figure 3.58: Content of the CPU and NAPT tables after processing the second TCP packet where the FIN flag is set.

The last packet received is the acknowledge packet belonging to the second FIN packet, but since the TS-bit is set, the packet's further treatment is handed over to the CPU. Besides the normal packet header manipulation, the session timer is cleared, which means it is set to appear expired, and even though the packet is outbound, the ACT-bit is not set. The combination of a cleared ACT-bit and an expired session timer implies that the session is deleted as soon as the clean-up process reaches the specific session. To ensure that subsequent packets do not pass through the NAPT gateway in the period until the session is actually deleted, the TS-bit is set. Thereby any further received packets belonging to the session are ensured to be treated by the CPU, i.e. discarded.

3.5.3 Internet Control Message Protocol - ICMP Query Connection Refresh and Termination

The basic principles of the ICMP protocol were treated in the previous section 3.4.4. In this subsection a direct continuation of the ICMP query example is presented to clarify the procedures involved.

Example: ICMP Query Session Refresh and Termination

In the ICMP query session establishment example host A initiated an ICMP query session in the NAPT gateway by sending an ICMP query request to endhost E. In this example the handling of the reply from host E is treated. As is the case for both the UDP and TCP protocols, subsequent outbound packets must refresh the session in the usual way by setting the ACT-bit in the session memory. In this example only a single reply is treated and therefore no refresh occurs, but the handling is very much like that in the UDP and TCP examples above.

When the reply message is received, the NAPT core determines the originating interface and performs a lookup for an active session in the mapping table CAM-0, using the Identifier number as the NAPT port number together with the type of protocol. The lookup results in a hit, and a read operation of the entry gives the IP address of the private host and the original Identifier number. A filter check can be made, but only the packet's source IP address can be verified, since no source port number exists. Figure 3.59 shows the content of the NAPT tables.

Figure 3.59: Content of the NAPT tables.

A CPU interrupt is made with code 63, and the CPU takes over and checks its state order for the session to verify that the packet is a reply to a former request, as it would expect. If it verifies the packet, it updates its fingerprint and translates the header by replacing the destination IP address and the Identifier number of the packet with the IP address and Identifier number from the entry previously read. It then sends a return message with code 63.0 to the NAPT core, which tells it that the reply is accepted. If the CPU does not accept the reply message, it gives the NAPT core a return message with code 63.1, which says that the packet should be silently discarded.

If the ICMP reply is accepted, the ICMP query packet is forwarded to the internal interface of the NAPT gateway. These steps are illustrated in Figure 3.46, steps 3 and 4.

Because of the involvement of the CPU, the NAPT gateway can figure out that the reply matches the request, and it can therefore conclude that the Ping operation is fulfilled. The session timer is therefore cleared, indicating that the session is ready to be deleted. Any inbound packets received in the following time period, until the session is deleted, are silently discarded by the NAPT gateway. Outbound packets are treated as new sessions.

3.5.4 Discussion/Summary

UDP Connection Refresh and Termination

Because the "packet processing" path of execution is prioritized over the "clean-up" path, there is a theoretical security risk. Since the "packet processing" has higher priority than the clean-up procedure, an intruder could detect an active UDP session from the outside of the NAPT gateway and then, by keeping the "packet processing" busy with spoofed packets, prevent the clean-up path of execution from taking place. As a result all currently active sessions would stay open forever, though this requires a continuous stream of packets with a large enough bandwidth to keep the packet processing busy. The intruder thereby gains access to the private network, however not without bringing the NAPT gateway down because of the heavy packet load. It is considered a theoretical security risk, as the attack requires a continuous high bandwidth equal to or larger than the NAPT gateway's bandwidth. If the security risk is not acceptable, one could introduce constrained periodic clean-up interrupts. Such a behavior reduces the occupation period, but as a downside the interrupts would reduce the bandwidth when the NAPT gateway is busy "processing packets".

TCP Connection Refresh and Termination

As mentioned in 3.4.5, the sequence number can be monitored in the connection establishment phase. In the data transmission phase this is undesirable, because it is very complicated to apply in hardware. In the termination phase it also would not make sense to monitor the sequence number for the expected flow, because other (data) packets can still arrive, since the connection could be only half closed.

ICMP Connection Refresh and Termination

Forwarded error messages must neither refresh nor terminate the session that pertains to the payload embedded within the ICMP Error packet. This behavior eliminates the possibility of an attacker keeping the session alive forever by sending spoofed ICMP error packets.

3.6 The Transport Layer Checksum and its Algorithm

In this subsection only the checksum of the transport layer (layer 4) is considered. The recalculation of the network layer (layer 3) checksum is left out, as it is considered a part of the router core, which is only briefly covered in this work to get the overall picture. Strictly speaking, the ICMP protocol is an exception to this, as it belongs to layer 3, but it is treated as a layer-4 protocol in this work. This subsection contains an introduction to the checksum, that is, how the checksum is used and how it is calculated, all in conjunction with NAPT. This is followed by an analysis of two different design suggestions for handling the task of recalculating the transport layer checksum after the NAPT core's manipulation of a transport layer packet.

The checksum is used to detect errors in segments, e.g. UDP, TCP and ICMP. The checksum of a segment is calculated at creation time or when changes have been made to the segment, in each case with an appropriate algorithm based on the segment type. The different algorithms are presented later in this subsection. The checksum calculation is done just before transmission of the segment, and at any point in the network a checksum check can be made to verify the validity of the segment. If the check results in an error, the packet can be silently discarded, since retransmission is cheaper than repair. In that connection it is important to understand that there can still be undetected errors in a segment, despite the application of this rather simple error detection technique. For this reason a number of protocols use more sophisticated error detection techniques, which are outside the scope of this work.

As mentioned above, the actual algorithms for calculating the checksum of the different segments supported by the NAPT differ slightly in how the checksum emerges, but they share the core calculation concept, which allows extensive reuse of the core calculation engine among the different segments.

Generally seen, the concept is the same for all segments. To calculate the checksum:

• Adjacent octets are paired

• The checksum field is cleared (0000 0000 0000 0000)

• The one's complement sum is calculated

• The one's complement negation of this sum is placed into the checksum field

To verify the checksum:

• Adjacent octets are paired

• The one's complement sum over these fields is calculated

• If the result == 1111 1111 1111 1111 (-0 in one's complement), the check succeeds.

• Otherwise the packet is dropped.

Example:

Fictive TCP/UDP segment represented in octets:

02 16 65 85 32 11 94 47 00 00 (00 00 = checksum field)

Form 16-bit words:

0216 6585 3211 9447 0000

Calculate the 2's complement sum (store the sum in a 32-bit word):

0216 + 6585 + 3211 + 9447 + 0000 = 0001 2DF3

Add the carries (0001) to get the 16-bit 1's complement sum:

2DF3 + 0001 = 2DF4

Calculate the 1's complement of the 1's complement sum:

NOT(2DF4) = D20B

We send the packet including the checksum D2 0B:

02 16 65 85 32 11 94 47 D2 0B

At the receiving end:

0216 + 6585 + 3211 + 9447 + D20B = 0001 FFFE

FFFE + 0001 = FFFF

which checks OK.
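
The worked example can be reproduced with a few lines of Python. The sketch below is a software illustration of the common calculation concept only; the function names `ones_complement_sum` and `checksum` are chosen for this illustration and do not refer to any part of the hardware design.

```python
def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum over the data, big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad an odd-length segment with one zero octet
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold the carries back in until the sum fits in 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def checksum(data: bytes) -> int:
    """One's complement of the sum; the checksum field must be zero."""
    return ~ones_complement_sum(data) & 0xFFFF


segment = bytes.fromhex("02166585321194470000")
value = checksum(segment)                    # D20B for the example above
sent = segment[:-2] + value.to_bytes(2, "big")
received_ok = ones_complement_sum(sent) == 0xFFFF
```

The receiver side check corresponds to the last two lines: summing the segment including the transmitted checksum gives FFFF when no error is detected.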

ICMP Echo (type 8) or Echo Reply (type 0) Message

The checksum is the 16-bit one's complement of the one's complement sum of the ICMP message, starting with the ICMP Type. For computing the checksum, the checksum field should be zero. If the total length is odd, the received data is padded with one octet of zeros for computing the checksum. This checksum may be replaced in the future.

ICMP Destination Unreachable (type 3) or Time Exceeded (type 11) Message

The checksum is the 16-bit one's complement of the one's complement sum of the ICMP message, starting with the ICMP Type. For computing the checksum, the checksum field should be zero. This checksum may be replaced in the future.

TCP and UDP segment

To calculate the TCP/UDP checksum a "pseudo header" is added to the TCP/UDP header. This pseudo header consists of information taken from the segment's appurtenant IP header as listed underneath and illustrated in Figure 3.60.

The checksum field is the 16-bit one's complement of the one's complement sum of all 16-bit words in the pseudo header, TCP/UDP header and data. If a segment contains an odd number of TCP/UDP header and data octets, the last octet is padded on the right with zeros to form a 16-bit word for checksum purposes. The pseudo header and pad are not transmitted as part of the segment. While computing the checksum, the checksum field itself is replaced with zeros.
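
As an illustration of the pseudo header, the following sketch computes the checksum of a small UDP datagram. The pseudo header layout (source address, destination address, a zero octet, the protocol number and the TCP/UDP length) follows RFC 768 and RFC 793; the function names, the addresses and the payload are assumptions made for this example only.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum over the data, big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad on the right with one zero octet
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src_ip: str, dst_ip: str, proto: int, length: int) -> bytes:
    """IPv4 pseudo header: source, destination, zero, protocol, length."""
    return (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!BBH", 0, proto, length))


def transport_checksum(src_ip: str, dst_ip: str, proto: int,
                       segment: bytes) -> int:
    """Checksum of a TCP/UDP segment whose checksum field is zero."""
    data = pseudo_header(src_ip, dst_ip, proto, len(segment)) + segment
    return ~ones_complement_sum(data) & 0xFFFF


UDP = 17  # IP protocol number of UDP
# Hypothetical datagram: source port 121, destination port 90, 4 data octets.
datagram = struct.pack("!HHHH", 121, 90, 12, 0) + b"ping"
csum = transport_checksum("192.168.0.10", "203.0.113.5", UDP, datagram)
filled = datagram[:6] + struct.pack("!H", csum) + datagram[8:]
```

Because the IP addresses are part of the pseudo header, a NAPT that rewrites an address or a port number must also update the transport layer checksum, which is the motivation for the checksum processor discussed next.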

Figure 3.60: UDP/TCP packet with pseudo header used for checksum calculation.

The main task of the checksum processor is to calculate the new checksum of the modified packet, as described above, every time a new packet is present. The idea is to have a sequential adder, which reads a bit-vector from the packet buffer and then adds the value to an accumulator on the fly for every 32 bits in the packet. The sequential read-and-add behavior has been chosen with regard to the size of the arithmetic core. A parallel adder structure would simply be too large, as the maximum packet size is 65709 bytes for an IPv4 packet. The requested performance of adding a 2 Gbps bit stream should offhand be met with a sequential adder approach using a carry-save representation of the temporary result in the accumulator. This type of addition omits or reduces the carry-propagation chain present in other traditional adders. The downside of a carry-save representation is that the result needs to be converted into a conventional representation before the final checksum processing can be carried out. The conversion is done by a well-known carry-propagate adder, which is the cheapest possible adder in the field, and because of the relatively small vector bit-width it is chosen as a first choice. There exist a number of different adders, each with a different area and performance, which could be chosen instead. The preferred adder is the one which just barely meets the performance requirements, as it typically results in the smallest possible implementation. A final negation of the resulting bit-vector produced by the carry-propagate adder is needed before the replacement of the old checksum value in the packet can take place. This is done by placing an array of inverters right after the carry-propagate adder. Now the replacement of the checksum can take place.
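
The carry-save idea can be modelled bit-accurately in software. The sketch below is an assumption-laden illustration rather than the proposed hardware: it accumulates 16-bit words (instead of 32-bit reads from the packet buffer) in a 32-bit accumulator kept as a separate sum vector and carry vector, and the names `carry_save_add` and `checksum_carry_save` are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # width of the modelled accumulator


def carry_save_add(s: int, c: int, x: int) -> tuple:
    """One 3:2 compression step: no carry ripples across bit positions."""
    new_s = s ^ c ^ x                           # bitwise sum
    new_c = ((s & c) | (s & x) | (c & x)) << 1  # bitwise carry, shifted up
    return new_s & MASK32, new_c & MASK32


def checksum_carry_save(words) -> int:
    """Accumulate 16-bit words in carry-save form, then finish."""
    s, c = 0, 0
    for w in words:
        s, c = carry_save_add(s, c, w)
    total = (s + c) & MASK32        # single carry-propagate addition
    while total >> 16:              # fold the carries (end-around carry)
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF          # the array of inverters


result = checksum_carry_save([0x0216, 0x6585, 0x3211, 0x9447, 0x0000])
```

Each accumulation step only uses bitwise operations, so its delay does not depend on the word width; the one carry-propagate addition is paid once per packet, which mirrors the trade-off described above.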

Chapter 4

Implementation of a NAPT Design

In this chapter a NAPT core and part of a virtual router core are implemented as a behavioral model in a hardware description language, to prove the concept of some of the design solutions from Chapter 3. As a starting point a model supporting the UDP protocol is selected, as it includes most of the core functionalities that a NAPT must have to support all three transport layer protocols UDP, TCP and ICMP. Before digging into the details of the NAPT core, which is the main focus, the justification for and the design of the so-called virtual router core are described.

The router core is built to serve as the underlying platform, whose first job is to handle the underlying network layer. The underlying network layer is represented by an input and an output file, where IPv4/UDP packets are placed. Packets that are to enter the router are placed in the input file, either manually or with a packet generating program following a simple protocol made for this work. The output file is initially empty, but after a simulation run it will contain the packets which have been processed and forwarded. Exception packets are silently discarded in this version of the model, but the intention is to involve the CPU every time an exception occurs, and a packet drop (discarding) is defined in this work as an exception.

For every IPv4/UDP packet written in the input file, additional information is needed: the direction of the packet, either inbound (from WAN) or outbound (from LAN), and a variable (a natural number), which is used by the router core to delay the packet for some time in the router core before it is processed. This last piece of information, the delay variable, enables time delays between successive packets coming from the network, which is especially useful when testing the session timers etc. in the model. For the record, it should be mentioned that this feature is not a part of a typical router implementation in the field. It only serves as a test feature.

The router core consists of a controller and a number of components, whose respective tasks each represent a state in the state diagram in Figure 4.1. The pink circles represent tasks done by the virtual router core and the blue ones represent the NAPT core's jobs. All the states, both pink and blue, are controlled by the controller of the router core. The router core continuously executes the states sequentially, as in the state diagram, as long as there are IPv4/UDP packets in the network, in this case the input file.

Figure 4.1: State diagram of the main design.

As long as there are packets available in the input file (end of file not reached), alias the network, the following procedure is carried out: An IPv4/UDP packet plus additional data is read from the input file and placed in a packet-buffer, an ordinary 32-bit wide static RAM block, in the first state after idle. The next state picks the relevant data from the packet-buffer and presents it to the NAPT core, such that the NAPT core is exempted from the packet classification part of the router, which is an important task when dealing with a multi-protocol NAPT. The third and fourth states are the actual NAPT tasks, where the header translation is done, which will be described later in this chapter. When the third and fourth states have been carried out, the fifth state writes output data from the NAPT back to the packet-buffer. The last state completes the router's task by writing the whole modified IPv4/UDP packet back to the underlying network, i.e. the output file. The router controller acts as a chairman, which gives the individual states the right to do whatever they are intended to do.
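
The sequential execution of the states can be sketched as a small software model. The state names below are hypothetical labels for the six states described in the text, and the return to idle after the last state is an assumption, since the exact transitions are only given by the state diagram.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    RECEIVE = auto()   # read packet and additional data into the buffer
    SET_DATA = auto()  # present the relevant header data to the NAPT core
    NAPT_1 = auto()    # first NAPT task
    NAPT_2 = auto()    # second NAPT task
    GET_DATA = auto()  # write NAPT output back to the packet-buffer
    TRANSMIT = auto()  # write the modified packet to the output file


ORDER = [State.RECEIVE, State.SET_DATA, State.NAPT_1,
         State.NAPT_2, State.GET_DATA, State.TRANSMIT]


def next_state(state: State, packets_left: bool) -> State:
    """The controller grants the states the right to run, one at a time."""
    if state is State.IDLE:
        return State.RECEIVE if packets_left else State.IDLE
    if state is State.TRANSMIT:
        return State.IDLE  # assumed: return to idle after each packet
    return ORDER[ORDER.index(state) + 1]


def run(num_packets: int) -> list:
    """Return the sequence of states visited for a given input size."""
    trace, state, remaining = [], State.IDLE, num_packets
    while True:
        nxt = next_state(state, remaining > 0)
        if nxt is State.IDLE and state is State.IDLE:
            return trace  # end of the input file reached
        if nxt is State.RECEIVE:
            remaining -= 1
        trace.append(nxt)
        state = nxt
```

Keeping one state per task, as in this model, is what makes the router core easy to extend, at the cost of the strictly sequential and therefore slower execution.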

The implementation goal of the router core was to make a solution which is as flexible and expandable as possible, and as a consequence it is not speed optimized; e.g. the state where a packet is read from the network and the state where relevant data are read from the buffer and presented to the NAPT core could be combined and parallelized. Furthermore, the NAPT core and the checksum processing should in the final implementation be implemented as described in Chapter 3 and Chapter 6, where they work concurrently with each other. From the block diagram provided in Figure 4.2 it is easily seen that the model of the router core is configurable and capable of supporting any further needed procedures.

All the files of which the implementation of the router core in Figure 4.2 consists are listed below, each with a short explanation of its purpose. As mentioned in the preface, all the files are located on the enclosed CD-ROM, see Appendix H.

setDataUnit.vhd : Prepare NAPT data from packet buffer unit

getDataUnit.vhd : Write-back NAPT data to packet buffer unit

netwRx.vhd : Receive Packet from network unit

netwTx.vhd : Transmit Packet to network unit

pktMem.vhd : 32-bit Static Random Access Memory (packet data)

sysCntlUnit.vhd : Test bench controller

sysclk.vhd : System clock

testBench.vhd : Test bench

(4.1)


Figure 4.2: Router Core.

4.1 NAPT Core

4.1.1 Hardware Architecture

The hardware architecture of the NAPT core consists of a controller unit, a session-timer core, two 16x65-bit TCAM units and an ordinary 32-bit static random access memory unit. A block diagram of the hardware architecture of the NAPT core can be seen in Figure 4.3. As already mentioned in the beginning of this chapter, the model is limited to handling IPv4 packets with an embedded UDP datagram. This limitation does not defeat the main purpose of the model, which is to prove some of the fundamental features that a NAPT core needs to cope with and to estimate the packet processing speed, as these are more or less common to all three transport layer protocols UDP, TCP and ICMP. To explain the big picture without getting bogged down in the details, a brief survey of the mentioned components is given.

• NAPT Controller Unit

The NAPT controller unit is the main control unit. It is responsible for the correct handling of the presented packet header data originating from the router core and maintains the cleanup procedure of old and unused sessions. The NAPT controller unit is basically one large state machine.

Figure 4.3: NAPT Core.

• Session-timer Core

The session-timer core acts as a coprocessor to the NAPT controller unit and manages everything regarding session-timers. It does so at the request of the NAPT controller unit, e.g. to start or restart a timer, or to get the status of a specific session's timer. Besides acting as an interface, it also includes the actual timers and the count-down mechanism.

• CAM-0

The CAM-0 table is the mapping table, where the relationship between the uniquely assigned NAPT port number and the identity of an application on the private network is saved.

• CAM-1

The CAM-1 table is the filtering table, where the relationship between the uniquely assigned NAPT port number and the identity of an application on the public network is saved.


• SRAM

The SRAM holds session-relevant data that is not lookup critical in this work. Only the ALI-bit and the ACT-bit appear. Each session has its own 32-bit entry in the memory.

• CPU

The CPU is mainly used to handle the creation and closure of sessions in this model, such that port numbers can be assigned and released. Ideally the CPU should also handle the user interface, i.e. configuration and system management, together with all sorts of exceptions.

4.1.2 NAPT Configuration

The implemented NAPT core has an endpoint-independent mapping behavior, which was described in 3.1, and an address and port dependent filtering behavior (3.2). The filtering behavior could easily be changed to one of the other possible filtering modes just by changing the lookup mask, but by default it is address and port dependent. The port assignment is handled by the CPU. There is no port number or range preservation, only a simple assignment strategy where the largest available port number is assigned to the session. The reason for this choice is to focus on other parts of the NAPT core; as the port assignment strategy is part of the software layer and belongs to the creation paths of the NAPT, it is not considered a highly speed-critical part of the NAPT core. The final product may implement several assignment strategies, which only demands a configurable piece of software.
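How a lookup mask can select the filtering mode may be illustrated with a small software model of a masked (ternary) lookup. This is a hedged sketch: the field layout of the key (32-bit address, 16-bit port, 16-bit NAPT port), the helper names `pack_key` and `tcam_lookup`, and the three mask constants are assumptions made for the example and are not taken from the implementation.

```python
def pack_key(ip, port, napt_port):
    """Pack a 32-bit IPv4 address, a 16-bit port and a 16-bit NAPT port into one integer.

    The field order is an assumption for illustration only.
    """
    return (ip << 32) | (port << 16) | napt_port


# A set mask bit means "this bit must match"; a cleared bit means "don't care".
MASK_ADDR_PORT = pack_key(0xFFFFFFFF, 0xFFFF, 0xFFFF)  # address and port dependent
MASK_ADDR_ONLY = pack_key(0xFFFFFFFF, 0x0000, 0xFFFF)  # address dependent
MASK_ENDPOINT_INDEP = pack_key(0x00000000, 0x0000, 0xFFFF)  # endpoint-independent


def tcam_lookup(entries, key, mask):
    """Return the address of the first valid entry that equals key on all masked bits."""
    for addr, entry in enumerate(entries):
        if entry is not None and (entry ^ key) & mask == 0:
            return addr
    return None
```

With a single rule stored for remote endpoint 192.0.2.1:53, a packet from port 54 of the same address misses under `MASK_ADDR_PORT` but hits under `MASK_ADDR_ONLY`, which is the sense in which only the mask has to change.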

Compared to the full-blown model described in Section 7.2.1, no CPU buffer is implemented, i.e. only one session is processed at a time. This simplification reduces the main processing state diagram to the one shown in Figure 4.4, where the only choices are packet processing or session cleanup. The prioritization strategy is maintained, i.e. packets always have higher priority than a session cleanup.

4.1.3 Packet Processing Paths

The packet processing paths have the highest priority and are therefore the first choice when the controller returns from any other task. There are two main paths according to the packet direction: inbound (from WAN) or outbound (from LAN).

Outbound packet flow


Figure 4.4: NAPT main processing.

The packet processing flows for both in- and outbound packets are shown in Appendix C.

The first thing the NAPT core does when processing an outbound packet is to look up the session's unique NAPT port number in CAM-0, i.e. the mapping table, with the packet's source IP address & port. If the outcome of the lookup is a match, the returned address of the matching entry is used by the NAPT controller to read the content of the entry directly after the lookup is completed. If the outcome of the lookup is not a match, the packet does not yet belong to an already created session. Before treating that case further, the case where the packet does belong to an already created session is treated. The content of the entry that has just been looked up is obtained by a read operation, to get the NAPT port number used for the session. All data needed for a complete modification is then ready: the NAPT port number used for that session and the NAPT IP address, which is soft-coded into a register as a part of the initialization phase of the NAPT. The remaining question is whether the destination is known to the filtering table, represented by the CAM-1 table. To find out, a lookup in CAM-1 is carried out using the destination IP address & port number of the packet together with the NAPT port number obtained in the previous lookup. If this lookup results in a match between the content of an entry and the destination IP address & port and the NAPT port number, the destination endhost is known and the outbound packet processing can be ended by setting the ACT-bit. This ensures that when the cleanup process gets to that session, the session-timer is restarted because of the ACT-bit.

The two cases where the lookup does not end with a match were skipped above while clarifying the packet processing path in which the outbound packet is known to both NAPT tables, the mapping table CAM-0 and the filtering table CAM-1. The two remaining packet processing paths are quite similar to each other, but are treated separately, beginning with the case where the packet is completely unknown to both session tables, which is called a full-session creation. If the lookup in CAM-0 results in a miss, the session is unknown not only to the mapping table but also to the filtering table, as a filtering rule cannot exist without a mapping, as seen in Figure 3.6. Since it makes no sense to interrupt the CPU and ask for a NAPT port number before making sure that there is an empty space in both the mapping and the filtering table (CAM-0 & CAM-1), a lookup for a free entry is made in both tables. The table addresses are saved to be used after the CPU processing. The CPU is interrupted right after the lookup for free entries. The task done by the CPU is a port number assignment, which is taken in linear order from a list. This behavior could be changed to any possible port assignment behavior, but for obtaining a performance estimate it is not important how the NAPT port number is assigned, as the performance is measured running the best-case packet flow. The packet processing path ends with an update of both CAM tables, where the source IP address & port and NAPT port number are written into CAM-0 and the destination IP address & port and NAPT port number are written into CAM-1, at the entries with the addresses found prior to the CPU interrupt. The ALI-bit is set to indicate that the session is in use, and then the session-timer is started.

The last possible path remains: a packet that is unknown only to the filtering table, which is called the creation of a sub-session. The session is known to the mapping table and already has an assigned port number, but the endhost is unknown to the filtering table. The NAPT core discovers the missing entry in the filtering table as a consequence of a miss when looking up the destination IP address & port and NAPT port number. The first step to cope with this is to look up a free entry, as in the aforementioned case of a full-session creation, but in this case only in the filtering table. The CPU is interrupted before the final creation to make sure that the session is allowed to be created. The NAPT's firewall policy can be set by the user as described in Section 3. In this implementation the CPU does not support a user-defined firewall policy. The session creation is ended by writing all relevant data, i.e. the destination IP address & port and NAPT port number, into the filtering table at the address of the free entry found before the CPU interrupt. As with the full-session, the very last thing to do is to set the ALI-bit and start a session-timer.
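The three outbound paths described above (known session, full-session creation and sub-session creation) can be summarized in a small software model. This is a sketch under stated assumptions: the class `NaptModel`, its method names, the tuple layout of the table entries and the downward port counter standing in for the CPU interrupt are all hypothetical, and the session-timer is left out.

```python
class NaptModel:
    """Software stand-in for CAM-0, CAM-1 and the ALI/ACT bits (illustrative only)."""

    def __init__(self, size=16):
        self.cam0 = [None] * size   # mapping table: (src_ip, src_port, napt_port)
        self.cam1 = [None] * size   # filtering table: (dst_ip, dst_port, napt_port)
        self.ali = [False] * size   # one ALI-bit per filtering-table address
        self.act = [False] * size   # one ACT-bit per filtering-table address
        self.next_port = 65535      # assumption: the CPU hands out ports downwards

    @staticmethod
    def _find(table, wanted):
        """Return the address of the first entry for which wanted(entry) holds, else None."""
        for addr, entry in enumerate(table):
            if wanted(entry):
                return addr
        return None

    def _assign_port(self):
        """Stand-in for the CPU interrupt that assigns a NAPT port number."""
        port, self.next_port = self.next_port, self.next_port - 1
        return port

    def outbound(self, src, dst):
        """Process one outbound packet; src and dst are (ip, port) tuples."""
        hit0 = self._find(self.cam0, lambda e: e is not None and e[:2] == src)
        if hit0 is None:                                  # full-session creation
            free0 = self._find(self.cam0, lambda e: e is None)
            free1 = self._find(self.cam1, lambda e: e is None)
            if free0 is None or free1 is None:
                return ("discard", None)                  # exception: no free entry
            port = self._assign_port()
            self.cam0[free0] = (*src, port)
            self.cam1[free1] = (*dst, port)
            self.ali[free1] = True
            return ("full-session", port)
        port = self.cam0[hit0][2]
        hit1 = self._find(self.cam1, lambda e: e == (*dst, port))
        if hit1 is not None:                              # session fully known
            self.act[hit1] = True
            return ("known", port)
        free1 = self._find(self.cam1, lambda e: e is None)
        if free1 is None:
            return ("discard", None)
        self.cam1[free1] = (*dst, port)                   # sub-session creation
        self.ali[free1] = True
        return ("sub-session", port)
```

Note that the free-entry lookups are made before the port is assigned, so a full table never consumes a port number, which mirrors the ordering argued for in the text.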

Inbound packet flow


For the inbound packet flow to be successful, it needs the original source IP address & port. To obtain this information, a lookup in the mapping table is carried out by searching for an entry with the NAPT port number. If the lookup results in a miss, the packet is discarded, as the NAPT does not have an active mapping for the session. Otherwise the original source IP address & port is obtained by reading the entry at the address returned by the matching lookup. The last step for the NAPT controller is to check the validity of the packet by looking up its destination IP address & port and NAPT port number in the filtering table. In this implementation an inbound packet can neither refresh the session-timer nor create sessions, due to the chosen security policy.

In all cases where an exception occurs, e.g. because there are no free entries in one of the tables, the current packet processing is terminated. In later versions the exception should be directed to the CPU by an interrupt and a code.
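The inbound path can be sketched in the same spirit. The function name `inbound` and the tuple layout of the table entries are assumptions for illustration; the sketch also interprets the filtering check as a match on the remote endpoint of the session (the sender of the inbound packet) together with the NAPT port number.

```python
def inbound(cam0, cam1, remote, napt_port):
    """Return the private (ip, port) to forward to, or None if the packet is discarded.

    cam0 holds (private_ip, private_port, napt_port) tuples or None (mapping table);
    cam1 holds (remote_ip, remote_port, napt_port) tuples or None (filtering table).
    """
    # Step 1: search the mapping table for an entry with this NAPT port number.
    mapping = next((e for e in cam0 if e is not None and e[2] == napt_port), None)
    if mapping is None:
        return None                  # no active mapping: discard
    # Step 2: validate the packet against the filtering table.
    if (*remote, napt_port) not in cam1:
        return None                  # no filtering rule for this endpoint: discard
    # Neither the ACT-bit nor the session-timer is touched on the inbound path.
    return mapping[:2]
```

A packet from an endpoint without a filtering rule is dropped even though a mapping exists, which is the address and port dependent filtering behavior of the default configuration.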

4.1.4 Cleanup Processing Paths

When no packets are waiting to be treated, the NAPT core cleans up possibly unused sessions. The cleanup procedure starts by obtaining the global session pointer, which is initialized to zero at boot time. This pointer, whose value equals an address in the filtering session table (i.e. unique and permanent for each session), points to the session currently under investigation. After finishing what is necessary, if anything needs to be done at all, the pointer is advanced such that it points to the next session. In this way the cleanup procedure only handles one session at a time, which shortens the time the cleanup processing occupies the core. Imagine instead a solution that cleans up the whole session list at once in idle mode. Once this process is started, the NAPT core is occupied for a period equal to the time needed for the specific task of each individual session, multiplied by the total number of available sessions in the NAPT. At best no session needs any attention, but at worst they all need the most demanding task to be carried out. In other words, the latency caused by the cleanup procedure is minimized by implementing a one-at-a-time cleanup procedure. If no packet buffer is used, the number of lost packets is reduced; conversely, if a packet buffer is used, the buffer is used more efficiently, as the buffering period is likewise shortened. Returning to the cleanup process at the point where the current session pointer value has been obtained, the next step is to gather information about that session from the session data memory, located in the SRAM shown in Figure 4.3, together with the status of the session-timer of that particular session from the session-timer core. Based on the gathered information, one of the following paths in Figure D is taken. The choices are listed and an explanation is given for each of them.


• Delete

The delete path is taken if the session is alive, indicated by the ALI-bit, but has timed out and no previous packet has set the ACT-bit. The current session is removed from the filtering table, and if the session has no other entries in the filtering table, the entry in the mapping table is removed as well. In practice this is done by reading the entry of the filtering table at the global session pointer to obtain the NAPT port number used for the session concerned. Then the entry just read is deleted and a search for sessions using the same NAPT port number is carried out. Remember that the NAPT port number is reused for subsequent so-called sub-sessions; therefore, if the entry concerned was the only one, the associated entry in the mapping table needs to be deleted as well. Before leaving the cleanup process, the global session pointer is advanced by one and the session data entry in the session data memory is cleared. Finally the CPU is interrupted with a code, so that the session can be deleted in the CPU's session table as well.

• Do Nothing

Do nothing means exactly that. The session is in this case alive, indicated by the ALI-bit, the session-timer status is "not expired", and no packets have traversed the NAPT since the last visit, i.e. the ACT-bit is cleared. This also covers a session that has just started (only the ALI-bit is set at session creation) when the cleanup process gets to the session before a successive packet for that session has been processed. Before leaving the cleanup process, the global session pointer is advanced by one.

• Start/Restart

The start/restart path of the cleanup procedure is taken if the current session is alive, indicated by the ALI-bit being set in the session data memory, and the ACT-bit is also set, indicating session activity, i.e. at least one packet has traversed the NAPT core since the last visit. Activity on a session that is alive means a restart of the session-timer regardless of its status. Before leaving the cleanup process, the global session pointer is advanced by one and the ACT-bit is cleared, as it has served its purpose for now.

• Not Alive

Not alive means that the session is not in use, indicated by the ALI-bit being zero, and therefore no work needs to be carried out except for advancing the global session pointer by one, such that it points to the next session.
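The choice between the four cleanup paths depends only on the ALI-bit, the ACT-bit and the timer status, which can be written as a small decision function. This is a minimal sketch; the function names, the string labels and the wrap-around of the global session pointer at the end of the table are assumptions made for the example.

```python
def cleanup_path(ali, act, expired):
    """Select the cleanup path for one session from its ALI-bit, ACT-bit and timer status."""
    if not ali:
        return "not_alive"
    if act:
        return "restart"       # activity restarts the timer regardless of its status
    if expired:
        return "delete"
    return "do_nothing"


def cleanup_step(pointer, ali, act, expired):
    """Handle exactly one session and return (path, advanced pointer).

    ali, act and expired are equally long lists indexed by the filtering-table address.
    """
    path = cleanup_path(ali[pointer], act[pointer], expired[pointer])
    if path == "restart":
        act[pointer] = False                      # the ACT-bit has served its purpose
    elif path == "delete":
        ali[pointer] = act[pointer] = False       # clear the session data entry
    return path, (pointer + 1) % len(ali)         # assumed wrap-around at the table end
```

Because `cleanup_step` touches a single session per call, the time the core is occupied between two packets is bounded by the most demanding single path rather than by the table size.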


4.1.5 Session-Timer Core

The session-timer core, which handles the commands originating from the NAPT core controller during runtime, consists internally of a session-timer interface controller and an actual timer core. There are four possible commands (represented by two bits), of which three are used. The available commands are idle, start/restart and get session-timer status. The session-timer interface controller controls the actual timer core, which contains a session-timer table, where the timer value for each individual session is stored together with a bit indicating the aliveness of the session. At initialization time the entire session-timer table is cleared, including the aliveness bits, and an associated counter is started. Every time the counter expires, the session-timer table is examined, and for every entry where the aliveness bit is set the value is counted down by one, unless it is already zero, which indicates that the session-timer for the session concerned has expired. In the case where only the UDP protocol is supported, the individual session-timers should be set to at least 2 minutes, as mentioned in Section 3.4.1. With the session-timer core's global counter set to expire every second, the session-timer value for each session in the session-timer table must be set to 120 when started.

To sum up, the session-timer interface controller reacts to requests from the NAPT controller and, depending on the command, performs one of the following actions.

• Idle "00"

The session-timer core is running and the session-timer values for alive sessions are reduced by one every time the second counter expires.

• Start/restart "01"

The session-timer value for the session concerned, identified by the filtering table address, is set to 120 in this model, as the UDP protocol demands sessions to be valid for at least 120 seconds. In both cases, start and restart, the ALI-bit is set, although it is already set in the case of a restart.

• Get status "11"

As the command name "get status" indicates, the value of the session-timer for the session concerned is obtained and returned to the NAPT controller.
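The command behavior listed above can be modelled in a few lines. The class `SessionTimerCore`, its method names and the `tick` call standing in for the expiry of the one-second counter are hypothetical; only the two-bit command codes and the start value of 120 come from the text.

```python
CMD_IDLE = 0b00
CMD_START = 0b01        # start/restart
CMD_GET_STATUS = 0b11
TIMEOUT_TICKS = 120     # 120 one-second ticks, i.e. the 2 minute UDP timeout


class SessionTimerCore:
    """Software model of the session-timer table (illustrative only)."""

    def __init__(self, size=16):
        self.value = [0] * size        # remaining ticks per session
        self.alive = [False] * size    # aliveness bit per session

    def tick(self):
        """Called when the global counter expires: count alive, non-zero timers down."""
        for addr in range(len(self.value)):
            if self.alive[addr] and self.value[addr] > 0:
                self.value[addr] -= 1

    def command(self, cmd, addr=0):
        """Execute one command from the NAPT controller for the session at addr."""
        if cmd == CMD_START:
            self.value[addr] = TIMEOUT_TICKS
            self.alive[addr] = True
        elif cmd == CMD_GET_STATUS:
            return self.value[addr]    # zero means the session-timer has expired
        return None                    # idle (and start) return nothing
```

A timer that has reached zero stays at zero on further ticks, so an expired session is reported as expired until the cleanup procedure deletes or restarts it.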


4.2 Checksum Processor

In the following section the applied hardware architecture of the checksum processor is presented and evaluated. The critical path is determined for the different stages in the architecture and, on that basis, the propagation delay is calculated. The result is used as a reference in Chapter 6, where the performance is discussed. The checksum processor is implemented and tested as a part of the router core introduced earlier in this chapter, but with the intended router core presented in 3, 6 and 7.2.1 in mind. The resulting implementation is flexible and could be adapted to another environment with a minimum of effort. First, an overview of the architecture of the checksum processor as a whole is presented, focusing mainly on the main controller of the checksum processor and its purpose. Secondly, the architecture of the arithmetic core is described in detail and a performance estimate is given.

The implemented checksum processor is controlled by a main controller, which is basically a state machine mastering the control lines of the arithmetic core and the overall behavior of the checksum processor, such as reading and writing data from/to the packet buffer. The controller is invoked on cue from the underlying router core every time a packet has arrived. The interface between the checksum processor, alias the controller core, and the router core consists of a request and an acknowledge signal. The controller core acts on the request signal and signals the completion with the acknowledge signal. The internal control signals between the controller and the arithmetic part consist of an accumulator clear signal and an accumulator load signal. The load signal places the intermediate result in the accumulator, while the clear signal clears the accumulator. Every time a bit-vector has been read successfully from the packet buffer and added to the current accumulator value, the load signal is triggered to place the result in the accumulator. The clear signal is applied before every new calculation to reset the accumulator.

Before presenting the hardware architecture of the arithmetic core, a delay model for the basic components is given in Figure 4.5, with the purpose of being able to calculate the propagation delay of the different stages in the checksum processor. The delay model is a simplified model, where the effect of the load on the gate outputs is ignored. The actual delay values for the full-adder used to estimate the propagation delay are taken from [7], which describes and implements full-adders in Metal Oxide Semiconductor (MOS) current-mode logic using 0.18 µm Complementary Metal Oxide Semiconductor (CMOS) technology. The delay of the register represents the sum of the setup and clock-to-output time delays. The arithmetic core consists of two stages: stage I, which is a sequential multi-operand adder unit with an accumulator represented by registers, and stage II, the final preparation of the result from the sequential multi-operand addition in stage I. The architecture of the arithmetic core can be seen in Figure 4.6.

| FA, from \ to | co | s |
|---|---|---|
| x | 89 ps | 50 ps |
| y | 110 ps | 77 ps |
| ci | 58 ps | 106 ps |

| Component | from | to | delay |
|---|---|---|---|
| INV | x | s | 15 ps |
| Register | Dout | Din | 150 ps |

Figure 4.5: Propagation delay model of the components applied in the checksum processor, e.g. full-adder, inverter etc. All values are given in picoseconds [ps].

Figure 4.6: Hardware implementation of the checksum algorithm.

When the checksum processor is invoked, it reads 32-bit bit-vectors from the packet buffer one at a time, as data is loaded into the buffer. Each bit-vector is then added to the present value in the accumulator, which is initially zero. A redundant adder with a carry-save representation of the accumulator is selected to carry out the actual addition, as it distinguishes itself by omitting the carry-propagation chain in the sequential part of the addition. As a consequence two registers are needed, one for the sum and one for the carry. The size of the registers is determined on the basis of the maximum magnitude possible, which is the sum over a maximum-sized IPv4 packet consisting of pure ones. According to the algorithm for the checksum calculation, the gathered bit-vector is separated into two 16-bit words, such that they are weighted equally, each with a maximum magnitude of 65536, and the IPv4 packet has a theoretical size limit of 65536 bytes. The magnitude of the carry-save accumulator needs to be:

The maximum value of the accumulator is $S = (\text{number of 16-bit words}) \cdot (\text{16-bit unsigned magnitude})$, which gives $S = 32768_{\text{16-bit words}} \cdot 65536$, and since $2^{16} = S$, 16 bits are necessary.

As the data word gathered from the packet buffer is represented by two 16-bit vectors, and the carry and sum in the accumulator are likewise 16-bit vectors, the adder needed is a [4:2] adder, which reduces the input data words and the current value of the accumulator (in carry-save representation) into a new accumulator value, also in carry-save representation. Figure 4.7 shows stage I, the sequential multi-operand adder with accumulator registers for both carry and sum.
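The principle of accumulating in carry-save form and resolving the carries only once at the end can be checked with a short software reference. This is a sketch, not a model of the hardware: the function names are hypothetical, the [4:2] reduction is built from two chained 3:2 carry-save steps, and Python's unbounded integers are used for the accumulator with the end-around carry folded in afterwards, whereas the hardware works with fixed-width registers.

```python
def csa(a, b, c):
    """3:2 carry-save step: return (sum, carry) such that sum + carry == a + b + c."""
    return a ^ b ^ c, ((a & b) | (a & c) | (b & c)) << 1


def checksum(words32):
    """One's complement checksum over 32-bit words, accumulated in carry-save form."""
    acc_s, acc_c = 0, 0                               # accumulator cleared before a new calculation
    for word in words32:
        hi, lo = (word >> 16) & 0xFFFF, word & 0xFFFF # split into two 16-bit words
        s1, c1 = csa(hi, lo, acc_s)                   # [4:2] reduction, first half
        acc_s, acc_c = csa(s1, c1, acc_c)             # [4:2] reduction, second half
    total = acc_s + acc_c                             # stage II: carry-propagate addition
    while total >> 16:                                # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF                            # final negation
```

No carry ripples through the loop body; the only carry-propagating addition is the single `acc_s + acc_c` after the last word, which is the motivation for the redundant representation.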

Figure 4.7: Checksum Processor Stage I.

The propagation delay of a round trip of stage I, the sequential multi-operand adder, is the sum of the propagation delays of the accumulator register and the [4:2] adder module, as expressed in Equation (4.2).

$T_{\mathrm{StageI}} = T_{4:2} + T_{\mathrm{reg}}$ (4.2)

Applying the [4:2] adder in Figure 4.8, the propagation delay can be expressed in terms of standard full-adders, which turns the former equation into Equation (4.3). The critical path for the [4:2] adder equals two times the largest propagation delay from one input to the sum of a full-adder.

$T_{\mathrm{StageI}} = 2 \cdot T_{\mathrm{FA}(y-s)} + T_{\mathrm{reg}}$ (4.3)


Figure 4.8: 4-2 adder unit.

Using the propagation delay values for the full-adder and register in Figure 4.5, the propagation delay for a round trip in the sequential multi-operand adder results in:

$T_{\mathrm{StageI}} = 2 \cdot 50\,\mathrm{ps} + 150\,\mathrm{ps} = 250\,\mathrm{ps}$ (4.4)

When the final result of the sequential multi-operand adder appears, the carry-save representation of the accumulator needs to be converted into a conventional representation. A well-known carry-propagate adder is selected for that purpose, as described in Section 3.6. The final bit-vector needs to be negated, which is done by inverters placed on each bit-line from the carry-propagate adder. Figure 4.9 shows the implemented carry-propagate adder with inverters. The red line indicates the critical path in stage II of the checksum processor. As with stage I, the critical path is determined and an estimate of the actual delay value is provided using the propagation delays of the basic components found in Figure 4.5. The critical path for stage II is the sum of the carry-propagate adder propagation delay $T_{\mathrm{CPA}}$ and an inverter propagation delay $T_{\mathrm{INV}}$. Focusing on the carry-propagate adder, a further division is made to express the propagation delay in terms of full-adder propagation delays. Equation (4.6) shows the divided critical path of stage II, where the carry-propagate adder delay is represented by one input-to-carry-output delay, fourteen carry-input-to-carry-output delays and one carry-input-to-sum delay. An estimated value of the total propagation delay of stage II is presented in (4.7).

Figure 4.9: Checksum Processor Stage II.

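$T_{\mathrm{StageII}} = T_{\mathrm{CPA}} + T_{\mathrm{INV}}$ (4.5)

$T_{\mathrm{StageII}} = T_{\mathrm{FA}(y-co)} + 14 \cdot T_{\mathrm{FA}(ci-co)} + T_{\mathrm{FA}(ci-s)} + T_{\mathrm{INV}}$ (4.6)

$T_{\mathrm{StageII}} = 110\,\mathrm{ps} + 14 \cdot 58\,\mathrm{ps} + 106\,\mathrm{ps} + 15\,\mathrm{ps} = 1043\,\mathrm{ps}$ (4.7)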
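The arithmetic of the stage II estimate can be reproduced with a few lines of Python. The constant and function names are hypothetical, and the generalization to other adder widths assumes a plain ripple-carry structure in which the number of carry-input-to-carry-output stages is the width minus two; only the 16-bit case is taken from the text.

```python
# Delays in picoseconds, taken from the delay model in Figure 4.5.
T_FA_Y_CO = 110    # full-adder, input y to carry output
T_FA_CI_CO = 58    # full-adder, carry input to carry output
T_FA_CI_S = 106    # full-adder, carry input to sum
T_INV = 15         # inverter


def stage2_delay_ps(width=16):
    """Critical path of a ripple carry-propagate adder followed by an inverter.

    One y->co stage, (width - 2) ci->co stages and one ci->s stage are traversed.
    """
    return T_FA_Y_CO + (width - 2) * T_FA_CI_CO + T_FA_CI_S + T_INV
```

For the 16-bit adder this gives 110 + 14 · 58 + 106 + 15 = 1043 ps, in agreement with the value stated for stage II.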


Chapter 5

Test and Verification

The test described in this chapter is not a complete test of the model implemented in 4.1. It is a so-called alpha test, where some of the core behaviors are functionally verified. The test model has an endpoint-independent mapping behavior, an endpoint address & port dependent filtering behavior and port assignment by software; since the port assignment behavior of the two models (port assignment by either hardware or software) is the same, it makes no difference which of the models is used. The mapping and filtering tables both have a size of 16 entries.

The first test illustrates that only 16 mappings can exist at once and that all subsequent packets requesting a full-session creation are discarded by the NAPT until at least one of the already created sessions is deleted by a session timeout.

The second test is like the first, except that it illustrates that only 16 filtering rules can exist at once and that all subsequent packets requesting a session creation, both full-sessions and sub-sessions, are discarded by the NAPT until at least one of the already created sessions is deleted by a session timeout.

In the third test the session timer is tested; the test consists of three sub tests. The first one illustrates that the timer actually works, i.e. it is started and the session is deleted after an administratively selected time period. The second sub test tests the purpose of the ACT-bit, which causes the session timer to be restarted when it is set by a subsequent packet belonging to the session. The third sub test illustrates that in-use NAPT port numbers are released by the session timer when it has expired.

Finally the deletion process is tested, here with the deletion of a sub-session. Two sessions are created from the same private host to two public hosts. One of the sessions times out while the other does not, because of different creation times. The test shows that the remaining session is still alive, i.e. only the entry in the filtering table is deleted.

5.1 Test of the mapping behavior

Test of the boundaries of CAM-0

17 outbound packets are sent from 17 different hosts on the private side of the network to the same host on the public network, as illustrated in Figure 5.1. Since the size of the mapping table in this example is set to 16, only 16 of the 17 outbound packets are allowed to create a session, and the NAPT therefore discards the 17th packet. The outcome is illustrated in Figure 5.2. The last packet is filtered as expected. A check is made by sending the 17 outbound packets in return as inbound packets, which can be seen in Figure 5.3. As expected, only 16 of the packets are allowed, namely the first 16, which created a session. The outcome is shown in Figure 5.4. The test files (both in- and output files) used by the implemented model to test the boundary of CAM-0 are located in the directory /test/CAM0boundary on the included CD-ROM.

Figure 5.1: 17 outbound packets are sent from 17 different hosts on the private side of the network to the same host on the public network.

Figure 5.2: Results of the 17 outbound packets that were sent from 17 different hosts on the private side of the network to the same host on the public network.

Figure 5.3: 17 inbound packets are sent as response to the 17 outbound.

Figure 5.4: Results of the 17 inbound packets that were sent in response to the 17 outbound.

5.2 Test of the filtering behavior

Test of the boundaries of CAM-1 and for endpoint-independent behavior

17 outbound packets are sent from one host on the private side of the network to 17 different hosts on the public network, as illustrated in Figure 5.5. Since the size of the filtering table in this example is set to 16, only 16 of the 17 outbound packets are allowed to create a session, and the NAPT therefore discards the 17th packet. The outcome is shown in Figure 5.6. The last packet is filtered as expected. A check is made by sending the 17 outbound packets in return as inbound packets, which can be seen in Figure 5.7. As expected, only 16 of the packets are allowed, namely the first 16, which created a session. The outcome is shown in Figure 5.8. Besides testing the boundary of the filtering table, the test also shows the endpoint-independent behavior, as all the packets from host A got the same NAPT port number, which can be verified in Figure 5.6. The test files (both input and output files) used by the implemented model to test the boundary of CAM-1 and the endpoint-independent behavior are located in the directory /test/CAM1boundary on the included CD-ROM.

Figure 5.5: 17 outbound packets are sent from one host on the private side of the network to 17 different hosts on the public network.

Figure 5.6: Results of the 17 outbound packets that were sent from one host on the private side of the network to 17 different hosts on the public network.

Figure 5.7: 17 inbound packets are sent as response to the 17 outbound packets.

Figure 5.8: Results of the 17 inbound packets that were sent in response to the 17 outbound packets.
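The behavior exercised by this test can be imitated with a small software model. The sketch below is not the thesis implementation: the class `NaptModel`, its method names and the 17 remote endpoints are hypothetical, and the descending port assignment starting at 0x000F is only an assumption read off the result tables in this chapter. It merely illustrates how a 16-entry filtering table combined with an endpoint-independent mapping leads to the observed result.

```python
FILTER_TABLE_SIZE = 16  # size of the filtering table used in the test


class NaptModel:
    """Toy model: one shared mapping per private endpoint, one filter per remote."""

    def __init__(self, filter_size=FILTER_TABLE_SIZE, first_port=0x000F):
        self.filter_size = filter_size
        self.next_port = first_port   # assumed: ports are handed out downwards
        self.mapping = {}             # (private ip, private port) -> NAPT port
        self.filters = set()          # (NAPT port, remote ip, remote port)

    def outbound(self, src, dst):
        """Return the NAPT port used for the packet, or None if it is discarded."""
        port = self.mapping.get(src)
        if port is not None and (port, *dst) in self.filters:
            return port                       # packet of an already known session
        if len(self.filters) >= self.filter_size:
            return None                       # filtering table full: discard
        if port is None:                      # first packet from this endpoint
            port = self.next_port
            self.next_port -= 1
            self.mapping[src] = port
        self.filters.add((port, *dst))        # new filtering rule for this remote
        return port

    def inbound(self, napt_port, remote):
        """An inbound packet is forwarded only if a matching filter exists."""
        return (napt_port, *remote) in self.filters


napt = NaptModel()
host_a = (0xC0A80001, 0x051F)
remotes = [(0xFC045193, 0x0050 + i) for i in range(17)]  # hypothetical endpoints
outbound_ports = [napt.outbound(host_a, r) for r in remotes]
inbound_ok = [napt.inbound(0x000F, r) for r in remotes]
```

In this model the first 16 outbound packets all receive the same NAPT port, the 17th is discarded, and only the first 16 inbound replies pass the filter.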

5.3 Test of the session timer

Show the basic behavior of the timer on a session


An outbound packet from host A to host B creates a session, which includes a session timer in the NAPT gateway. An inbound packet is sent from host B to host A immediately afterwards to show the existence of the session. The inbound packet is expected to be forwarded, as a mapping for the packet exists. After a long delay (longer than the session timer period), an inbound packet is again sent from host B to host A. This packet is discarded as expected, since the session timer has expired. The packet flow is listed below with the actual source and destination IP addresses and port numbers. The test files (both input and output files) used by the implemented model to test the session timer's behavior are located in the directory /test/sessionTimer on the included CD-ROM. The input and output of the current test are shown in Tables 5.1, 5.2 and 5.3.

Table 5.1: Create a session: Host A → host B

file    direction  src. IP addr.  src. port  dest. IP addr.  dest. port
Input   outbound   0xC0A80001     0x051F     0xFC045193      0x0050
Output             0x42F947B7     0x000F     0xFC045193      0x0050

Table 5.2: Check the session created: Host B → host A

file    direction  src. IP addr.  src. port  dest. IP addr.  dest. port
Input   inbound    0xFC045193     0x0050     0x42F947B7      0x000F
Output             0xFC045193     0x0050     0xC0A80001      0x051F

Table 5.3: After one large delay - Check the created session again: Host B → host A

file    direction  src. IP addr.  src. port  dest. IP addr.  dest. port
Input   inbound    0xFC045193     0x0050     0x42F947B7      0x000F
Output

Show the effect of the session's ACT-bit on the session timer

An outbound packet from host A to host B creates a session, which includes a session timer in the NAPT gateway. An inbound packet is sent from host B to host A immediately afterwards to show the existence of the session. The inbound packet is expected to be forwarded, as a mapping for the packet exists. After a delay (half as long as the session timer period), an outbound packet is again sent from host A to host B. This packet restarts the timer by means of the ACT-bit. After slightly more than half a session timer period, an inbound packet is again sent from host B to host A. This packet is forwarded by the NAPT, since the session timer has been restarted. Recall that the time period between the first outbound packet and the last inbound packet is slightly larger than a session timer period, so the session would have expired without the restart. The packet flow is listed below with the actual source and destination IP addresses and port numbers. The test files (both input and output files) used by the implemented model to test the session timer's behavior are located in the directory /test/sessionTimer on the included CD-ROM. The input and output of the current test are shown in Tables 5.4, 5.5, 5.6 and 5.7.
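The two timer tests described so far can be replayed against a very small model. The sketch below is a simplification under stated assumptions: the class `SessionTimer` and the period of 100 time units are hypothetical, and an outbound packet simply restarts the timer, which stands in for the ACT-bit mechanism without reproducing how the hardware evaluates the bit.

```python
class SessionTimer:
    """Toy session timer: the session expires `period` time units after the
    most recent outbound packet (assumed stand-in for the ACT-bit restart)."""

    def __init__(self, period):
        self.period = period
        self.started = None  # time of session creation or last restart

    def outbound(self, now):
        # The first outbound packet creates the session, later ones restart it.
        self.started = now

    def inbound_allowed(self, now):
        # Inbound packets are forwarded only while the timer is running.
        return self.started is not None and now - self.started < self.period


PERIOD = 100  # hypothetical session timer period

# Basic behavior: create, check immediately, check again after a long delay.
basic = SessionTimer(PERIOD)
basic.outbound(0)
basic_results = [basic.inbound_allowed(1), basic.inbound_allowed(150)]

# Restart: a second outbound packet at half the period keeps the session alive.
restart = SessionTimer(PERIOD)
restart.outbound(0)
first_check = restart.inbound_allowed(1)
restart.outbound(50)
second_check = restart.inbound_allowed(105)  # more than one period after creation
```

In the first run the late inbound packet is rejected, while in the second run it is accepted although more than one full period has passed since the session was created.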

Table 5.4: Create a session: Host A → host B

file    direction  src. IP addr.  src. port  dest. IP addr.  dest. port
Input   outbound   0xC0A80001     0x051F     0xFC045193      0x0050
Output             0x42F947B7     0x000F     0xFC045193      0x0050

Table 5.5: Check the session created: Host B → host A

file    direction  src. IP addr.  src. port  dest. IP addr.  dest. port
Input   inbound    0xFC045193     0x0050     0x42F947B7      0x000F
Output             0xFC045193     0x0050     0xC0A80001      0x051F

Table 5.6: After half a large delay - Restart the session timer: Host A → host B

file    direction  src. IP addr.  src. port  dest. IP addr.  dest. port
Input   outbound   0xC0A80001     0x051F     0xFC045193      0x0050
Output             0x42F947B7     0x000F     0xFC045193      0x0050

Table 5.7: After one large delay - Check the created session again: Host B → host A

file    direction  src. IP addr.  src. port  dest. IP addr.  dest. port
Input   inbound    0xFC045193     0x0050     0x42F947B7      0x000F
Output             0xFC045193     0x0050     0xC0A80001      0x051F

Show the availability of a port number after a session is deleted, i.e. after the session timer has expired

An outbound packet from host A to host B creates a session, which includes a session timer in the NAPT gateway. An inbound packet is sent from host B to host A immediately afterwards to show the existence of the session in the NAPT. The inbound packet is expected to be forwarded, as a mapping for the packet exists. After a delay slightly longer than a session timer period, an inbound packet is again sent from host B to host A. This inbound packet is discarded, since the session timer has expired and the NAPT port number has been released. An outbound packet from another private host P to host B then creates a session in the NAPT gateway and gets the port number that the first packet got, since it has been released and is the first port number the NAPT chooses when it is initially empty. An outbound packet similar to the first one (host A to host B) is sent again to show that it gets the next available NAPT port number. The packet flow is listed below with the actual source and destination IP addresses and port numbers. The test files (both input and output files) used by the implemented model to test the session timer's behavior are located in the directory /test/sessionTimer on the included CD-ROM. The input and output of the current test are shown in Tables 5.8 to 5.12.

Table 5.8: Create a session: Host A → host B

file    direction  src. IP addr.  src. port  dest. IP addr.  dest. port
Input   outbound   0xC0A80001     0x051F     0xFC045193      0x0050
Output             0x42F947B7     0x000F     0xFC045193      0x0050

Table 5.9: Check the session created: Host B → host A

file    direction  src. IP addr.  src. port  dest. IP addr.  dest. port
Input   inbound    0xFC045193     0x0050     0x42F947B7      0x000F
Output             0xFC045193     0x0050     0xC0A80001      0x051F

Table 5.10: After one large delay - Check the created session again: Host B → host A

file    direction  src. IP addr.  src. port  dest. IP addr.  dest. port
Input   inbound    0xFC045193     0x0050     0x42F947B7      0x000F
Output

Table 5.11: Create a session to show that it gets the released port number: Host P → host B

file    direction  src. IP addr.  src. port  dest. IP addr.  dest. port
Input   outbound   0xC0A80002     0x051F     0xFC045193      0x0050
Output             0x42F947B7     0x000F     0xFC045193      0x0050

Table 5.12: Create another session to show that it gets another NAPT port number: Host A → host B

file    direction  src. IP addr.  src. port  dest. IP addr.  dest. port
Input   outbound   0xC0A80001     0x051F     0xFC045193      0x0050
Output             0x42F947B7     0x000E     0xFC045193      0x0050

5.4 Test of the delete procedure

Show the deletion of a sub-session

An outbound packet from host A to host B creates a session in the NAPT gateway. After a delay of half the period of the session timer, another outbound packet is sent from host A, but now to host C. After a further delay of slightly more than half the period of the session timer, an inbound packet is sent from host B to host A. As expected, this packet is discarded, since the session timer of the concerned session has expired and the session has been deleted. Immediately afterwards, a second inbound packet is sent from host C to host A, and this packet is forwarded, as its session still exists. Recall that the two sessions share the same mapping as a result of the endpoint-independent mapping behavior, but each has its own filtering rule. When the first session timer expires, only the entry in the filtering table is deleted, since the two sessions share the mapping entry. The packet flow is listed below with the actual source and destination IP addresses and port numbers. The test files (both input and output files) used by the implemented model to test the delete procedure are located in the directory /test/sessionDelete on the included CD-ROM. The input and output of the current test are shown in Tables 5.13, 5.14, 5.15 and 5.16.

Table 5.13: Create a new full-session: Host A → host B

file    direction  src. IP addr.  src. port  dest. IP addr.  dest. port
Input   outbound   0xC0A80001     0x051F     0xFC045193      0x0050
Output             0x42F947B7     0x000F     0xFC045193      0x0050

Table 5.14: Create a new sub-session after half the delay of the session timer: Host A → host C

file    direction  src. IP addr.  src. port  dest. IP addr.  dest. port
Input   outbound   0xC0A80001     0x051F     0xFC045193      0x0051
Output             0x42F947B7     0x000F     0xFC045193      0x0051

Table 5.15: Inbound packet after slightly more than half the delay of the session timer: Host B → host A

file    direction  src. IP addr.  src. port  dest. IP addr.  dest. port
Input   inbound    0xFC045193     0x0050     0x42F947B7      0x000F
Output

Table 5.16: Inbound packet: Host C → host A

file    direction  src. IP addr.  src. port  dest. IP addr.  dest. port
Input   inbound    0xFC045193     0x0051     0x42F947B7      0x000F
Output             0xFC045193     0x0051     0xC0A80001      0x051F


Chapter 6

Results

Based on the experience and knowledge gathered from the implemented model, this chapter presents and argues for an estimate of the performance of the NAPT core in terms of maximum bandwidth and latency.

As stated in the specification, the NAPT core should as a minimum support a bidirectional bandwidth of 1 Gbps, and since the NAPT core is unidirectional, the actual bandwidth demand is 2 Gbps. There is no requirement for the latency, but an estimate is included for completeness.

6.1 Bandwidth Estimation

The bandwidth is estimated using the twin-buffer solution shown in Figure 6.1.

The maximum bandwidth of the NAPT is defined, in this work at least, as the best-case bandwidth. The best case is the one where the input data stream consists entirely of packets belonging to already known sessions, i.e. there are no new connection establishments in between, and no exceptions such as NAPT session depletion occur.


Figure 6.1: Twin-buffer system solution.

The network input data used for the estimation of the bandwidth is represented by a constant stream of IPv4 packets with an empty option field (i.e. header size = 20 bytes) and a variable sized UDP packet embedded as payload. The size of the UDP packet ranges from a minimum length of 8 bytes (UDP header only, no application data in the payload) to a maximum of 65,515 bytes (UDP header plus 65,507 bytes of application data), which is the practical limit imposed by the underlying IPv4 protocol.

Regardless of the packet size, the NAPT core must keep up with a constant input data rate of 2 Gbps after session initialization to accomplish the performance goal, provided that the requirements of the best-case scenario are met. In practice an input data stream with a mix of minimum and maximum sized packets is used. These corner cases cover the entire span of packet sizes and are sufficient to determine the requirements of the NAPT core and the checksum calculation, and thereby to conclude whether the bandwidth goal is achievable or not.

An example is provided to illustrate the arrival and processing of first a maximum sized packet followed by two minimum sized packets. All packets belong to the same already created session in the NAPT tables.

The NAPT core in the example is configured as the implemented model in Chapter 4, i.e. with an endpoint-independent mapping behavior, an address and port dependent filtering behavior and a software port assignment behavior. It has been shown in that chapter that the NAPT procedure is independent of the domain from which the packet comes. In other words, the bandwidth estimation is independent of the direction of the packet (inbound/outbound).

Example

Figure 6.2 shows NAPT buffer 1 being loaded with data (a maximum sized packet) from either the LAN or the WAN in an otherwise empty buffer system. The checksum processor starts calculating the new checksum on the fly, as far as it can. The data fields that are going to be translated by the NAPT core are not valid until the NAPT core has completed the step in which the modified data are determined. The rest of the checksum processor's job is carried out concurrently with the NAPT core. The checksum processor is treated in more detail later; until then it is assumed that the second part of the checksum process is done concurrently with the NAPT and is finished by the time the translation is completed.

Figure 6.2: NAPT Buffer 1 is loading input data.

As soon as the packet is fully loaded, the NAPT core does what is necessary, and the checksum processor starts processing the modified header data as soon as it is ready. At the same time the input data stream is redirected to NAPT buffer 2. Figure 6.3 illustrates the NAPT core processing the packet in NAPT buffer 1, while NAPT buffer 2 is loaded with data. The dotted line in NAPT buffer 2 is an indicator of the size of the next packet, which is a minimum sized packet. The indicator is only included to enhance the understanding of the illustration.

Figure 6.3: The NAPT is processing the first packet in NAPT Buffer 1, while NAPT Buffer 2 is loading input data.


As soon as the NAPT core and the checksum processor are done, the packet is placed in the appropriate output buffer (here WAN) and the NAPT switches over to NAPT buffer 2, where the packet has just been fully loaded. The input data is immediately redirected to NAPT buffer 1. This step is illustrated in Figure 6.4.

Figure 6.4: The NAPT is processing the second packet in NAPT Buffer 2, while NAPT Buffer 1 is loading input data and the WAN Output Buffer is being drained at a bandwidth of 2 Gbps.
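The buffer alternation in the example can be checked with a short timing model. The following sketch is only an illustration: the function `late_packets` is hypothetical, the packet sizes are the 65,535-byte IPv4 maximum and the 28-byte IPv4/UDP minimum, and the NAPT processing times passed in are assumed values. A packet is flagged when its processing has not finished by the time the following packet is fully loaded into the other buffer, which is the situation in which the input stream would have to be stalled. The model only flags such packets; it does not simulate the stall itself.

```python
NS_PER_BIT = 0.5  # one bit takes 0.5 ns on a 2 Gbps input stream


def late_packets(packet_bytes, t_proc_ns):
    """Return the indices of packets whose NAPT processing is still running
    when the next packet has been completely loaded into the other buffer."""
    late = []
    load_start = 0.0   # time at which the current packet starts loading
    napt_free = 0.0    # time at which the NAPT core becomes idle
    for i, size in enumerate(packet_bytes):
        loaded = load_start + size * 8 * NS_PER_BIT
        done = max(loaded, napt_free) + t_proc_ns   # processing starts when both are ready
        napt_free = done
        if i + 1 < len(packet_bytes):
            next_loaded = loaded + packet_bytes[i + 1] * 8 * NS_PER_BIT
            if done > next_loaded:
                late.append(i)
        load_start = loaded   # the next packet loads back to back
    return late


stream = [65535, 28, 28]                 # one maximum sized and two minimum sized packets
fast_core = late_packets(stream, 70.0)   # assumed processing time below 112 ns
slow_core = late_packets(stream, 120.0)  # assumed processing time above 112 ns
```

With a processing time below the 112 ns load time of a minimum sized packet no packet is flagged, whereas a processing time above it flags the packets that are followed by a minimum sized packet.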

6.2 Bandwidth of the NAPT Core

For this solution to work properly, the NAPT processing time must be less than the load time of a new packet from the network. Otherwise the data stream must be stalled until the NAPT core is done. This raises the question of how long the load time period, and thereby the maximum NAPT processing time period, is.

The load processing time period depends heavily on the packet size, and since small packets cause fast switches between the buffers, the worst-case scenario, seen from the NAPT perspective, is when the smallest possible packet is presented. The minimum packet sizes are listed below for an IPv4 packet with an embedded minimum sized UDP or TCP packet, respectively.

\[ S_{minIPv4UDP} = S_{minIPv4} + S_{minUDP} = 20\,\text{bytes} + 8\,\text{bytes} = 224\,\text{bits} \tag{6.1} \]

\[ S_{minIPv4TCP} = S_{minIPv4} + S_{minTCP} = 20\,\text{bytes} + 20\,\text{bytes} = 320\,\text{bits} \tag{6.2} \]

With a maximum network bandwidth of 2 Gbps, each bit takes 0.5 ns, which results in the following load processing time periods for UDP and TCP:


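\[ T_{maxUDPproc} = 224\,\text{bits} \cdot 0.5\,\frac{\text{ns}}{\text{bit}} = 112\,\text{ns} \tag{6.3} \]

\[ T_{maxTCPproc} = 320\,\text{bits} \cdot 0.5\,\frac{\text{ns}}{\text{bit}} = 160\,\text{ns} \tag{6.4} \]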
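The figures in (6.1) to (6.4) can be reproduced with a few lines of arithmetic. The helper `load_time_ns` and the constant names below are illustrative names introduced here, not part of the thesis model.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
UDP_HEADER_BYTES = 8    # minimum UDP packet: header only
TCP_HEADER_BYTES = 20   # minimum TCP packet: header only


def load_time_ns(size_bytes, rate_gbps=2.0):
    """Time needed to receive size_bytes at rate_gbps (1 Gbps is 1 bit per ns)."""
    return size_bytes * 8 / rate_gbps


min_udp_bits = (IPV4_HEADER_BYTES + UDP_HEADER_BYTES) * 8   # equation (6.1)
min_tcp_bits = (IPV4_HEADER_BYTES + TCP_HEADER_BYTES) * 8   # equation (6.2)
t_udp_ns = load_time_ns(IPV4_HEADER_BYTES + UDP_HEADER_BYTES)  # equation (6.3)
t_tcp_ns = load_time_ns(IPV4_HEADER_BYTES + TCP_HEADER_BYTES)  # equation (6.4)
```

The UDP case gives the tighter budget and is therefore the one the NAPT core has to meet.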

Appendix C shows the NAPT processing steps for inbound and outbound packets.

The NAPT processing time period is the sum of the individual propagation delay contributions in the sequences. Since the outbound path is the longest, it determines the longest NAPT processing time period.

Taking a closer look at the hardware paths of the NAPT core, equation (6.5) can be derived, representing the total number of clock cycles needed to complete a packet:

\[ N_{UDPproc} = 2n + p + 2 \ \text{clock cycles} \tag{6.5} \]

The variables n and p represent the number of system clock cycles for a lookup and a read process in the CAM, respectively, regardless of the type of CAM. It is assumed that all input data for the NAPT core is available at the start, having been prepared during the packet load process, and that the actual packet modification is done on the fly, which is possible as the data for the modification is ready early in the process. Multiplying the equation by the reciprocal of the frequency gives the NAPT processing time period, which must be shorter than the load processing time. Equation (6.6) expresses the NAPT processing time period, while equation (6.7) expresses the overall demand.

\[ T_{NAPTpktProc} = (2n + p + 2) \cdot \frac{1}{sysfreq} \tag{6.6} \]

\[ T_{NAPTpktProc} \le T_{minPktLoadProc} \le 112\,\text{ns} \tag{6.7} \]

As the lookup and read times are variables depending on the applied CAM, three plots (Figures 6.5, 6.6 and 6.7) are provided to show the possible bandwidths for several different CAM tables running at three different system clock frequencies. The general equation behind the plots is given in (6.8).

\[ \text{Bandwidth [Gbps]} = \frac{S_{minIPv4UDP}}{T_{NAPTpktProc}} = \frac{224}{(2n + p + 2) \cdot \frac{1}{\text{sys freq [MHz]}}} \tag{6.8} \]

If, for instance, a system is running at 200 MHz and, as in this case, the lookup and read delays are 5 and 2 clock cycles respectively, the NAPT processing takes 2 · 5 + 2 + 2 = 14 clock cycles, or 70 ns, and the system will be capable of handling a network bandwidth of approx. 3.2 Gbps.
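Equations (6.6) and (6.8) are easy to evaluate for other CAM timings and clock frequencies. The sketch below uses hypothetical function names. Note that dividing bits by a time expressed in cycles per MHz yields Mbit/s, so the processing time is converted to nanoseconds here to obtain the result directly in Gbps (bits per nanosecond).

```python
def napt_processing_time_ns(n, p, sys_freq_mhz):
    """Equation (6.6): (2n + p + 2) clock cycles at the given system clock."""
    cycles = 2 * n + p + 2
    return cycles * 1000.0 / sys_freq_mhz  # one clock cycle lasts 1000 / f[MHz] ns


def napt_bandwidth_gbps(n, p, sys_freq_mhz, packet_bits=224):
    """Equation (6.8) for a minimum sized IPv4/UDP packet, in Gbps."""
    return packet_bits / napt_processing_time_ns(n, p, sys_freq_mhz)


# Lookup delay of 5 and read delay of 2 clock cycles, as in the text above.
bw_200 = napt_bandwidth_gbps(n=5, p=2, sys_freq_mhz=200)  # 224 bits / 70 ns, about 3.2
bw_150 = napt_bandwidth_gbps(n=5, p=2, sys_freq_mhz=150)
meets_goal = bw_150 >= 2.0
```

Sweeping n and p over the values of interest gives the same kind of curves as the bandwidth plots for the three clock frequencies.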


Figure 6.5: Outbound minimum sized IPv4/UDP packet with a NAPT running @ 150 MHz.

6.3 Checksum Processor Requirements

The precondition for the bandwidth estimation was that the checksum calculation period is shorter than the NAPT processing period. An example is given to clarify the requirements for this and the influence of the checksum processor, using the solution described in Section 4.2.

For a minimum sized outbound packet, the first 32 bits of the packet are read by the checksum processor and added to the accumulator. This must be done within 0.5 ns/bit · 32 bit = 16 ns for a 2 Gbps input data stream. This procedure is repeated throughout the whole load process, with the exception of the source IP address and port, which are going to be modified by the NAPT core. The source IP address field, i.e. the field where the NAPT gateway's public IP address will be placed, is instead skipped by the checksum processor. The NAPT gateway's public IP address is taken from a register and added to the checksum as normal in the time period where the checksum processor would otherwise read and add the value. Right after the load time period, the NAPT processing starts concurrently with the checksum processor. The checksum processor needs to read and add the last 32-bit word in the NAPT buffer and then wait for the NAPT core to deliver the last data needed, the NAPT port number. The NAPT core needs to complete its first two steps, the NAPT port lookup in CAM-0 and the subsequent read of the looked-up entry in CAM-0, before the checksum processor can continue with a final addition of the NAPT port number to the accumulator and thereafter write the new checksum of the packet back to the checksum field in the packet.

Figure 6.6: Outbound minimum sized UDP packet with a NAPT running @ 175 MHz.

The checksum processor must therefore be able to calculate the last part of the final checksum and write it back before the lookup process in the CAM-1 table is completed.

The timing diagram in Figure 6.8 shows the interface between the checksum processor and the NAPT buffer during a checksum calculation of the last two 32-bit words and a write back of the result. It shows that the final addition and write back can be completed in 2 clock cycles.

Figure 6.7: Outbound minimum sized UDP packet with a NAPT running @ 200 MHz.

For the bandwidth estimation to be valid, the time period of the 2 clock cycles must be less than the time period of the CAM-1 table lookup step. Furthermore, the time period of a checksum memory read and addition must be shorter than both the CAM-0 lookup and read steps together and the load time of 32 bits into the NAPT buffer.
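The deferred accumulation described in this section relies on the fact that ones' complement addition can be carried out in any order, so the translated fields can be added after all other words. The sketch below illustrates this property only. It is not the thesis' checksum processor: the function names are hypothetical, the four-word "packet" is a simplified stand-in rather than a real UDP pseudo-header layout, and the addresses and ports are borrowed from the test tables in Chapter 5.

```python
def add32(acc, word):
    """Ones' complement addition of two 32-bit values (end-around carry)."""
    s = acc + word
    return (s & 0xFFFFFFFF) + (s >> 32)


def fold16(acc):
    """Fold a 32-bit ones' complement sum into a 16-bit checksum value."""
    s = (acc & 0xFFFF) + (acc >> 16)
    s = (s & 0xFFFF) + (s >> 16)
    return ~s & 0xFFFF


def checksum_direct(words):
    """Reference: sum all words of the already translated packet."""
    acc = 0
    for w in words:
        acc = add32(acc, w)
    return fold16(acc)


def checksum_deferred(words, ip_index, port_index, public_ip, napt_port):
    """Sum the untranslated packet while it loads, substituting the public IP
    from a register and postponing the source port until it is known."""
    acc = 0
    for i, w in enumerate(words):
        if i == ip_index:
            acc = add32(acc, public_ip)    # register value instead of the private IP
        elif i == port_index:
            acc = add32(acc, w & 0xFFFF)   # keep only the destination port half
        else:
            acc = add32(acc, w)
    acc = add32(acc, napt_port << 16)      # final addition once the port is delivered
    return fold16(acc)


# Simplified word list: source IP, destination IP, source/destination port, length word.
original = [0xC0A80001, 0xFC045193, 0x051F0050, 0x00080000]
translated = [0x42F947B7, 0xFC045193, 0x000F0050, 0x00080000]
same = checksum_direct(translated) == checksum_deferred(
    original, ip_index=0, port_index=2, public_ip=0x42F947B7, napt_port=0x000F)
```

Both routes give the same 16-bit result, which is why the checksum processor only has one addition and the write back left to do once the NAPT port number arrives.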

6.4 Latency Estimation

The latency of the NAPT is independent of the packet size, but depends heavily on the type of packet, i.e. which hardware path it follows in the NAPT. In the case of a packet that belongs to an already known session, the latency equals (6.6). This is, as with the bandwidth, the best-case latency of the NAPT. The latencies of the two other NAPT paths, the sub-session and full-session creation respectively, are harder to estimate, as the CPU interrupt latency and the best-case and worst-case execution times are unknown in this work. From a system perspective, here the twin-buffer solution, the best-case latency is the sum of the packet load time period and the NAPT processing period, and it is therefore heavily dependent on the packet size. Equation (6.9) expresses the variation of the best-case latency between the two packet size extrema.

Figure 6.8: Checksum calculation timing diagram.

\[ T_{NAPT} + T_{minLoadTime} \le SystemLatency \le T_{NAPT} + T_{maxLoadTime} \tag{6.9} \]
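As a worked instance of (6.9), assume the example configuration used earlier in this chapter (200 MHz system clock, a lookup delay of n = 5 and a read delay of p = 2 clock cycles) and, as an additional assumption, a maximum packet of 65,535 bytes, which is the largest total length an IPv4 packet can have:

```latex
\begin{align*}
T_{NAPT} &= (2 \cdot 5 + 2 + 2) \cdot 5\,\text{ns} = 70\,\text{ns} \\
T_{minLoadTime} &= 224\,\text{bits} \cdot 0.5\,\tfrac{\text{ns}}{\text{bit}} = 112\,\text{ns} \\
T_{maxLoadTime} &= 65535 \cdot 8\,\text{bits} \cdot 0.5\,\tfrac{\text{ns}}{\text{bit}} = 262140\,\text{ns} \\
182\,\text{ns} &\le SystemLatency \le 262210\,\text{ns}
\end{align*}
```

Under these assumptions the load time dominates the best-case system latency for all but the smallest packets.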

6.5 Area Estimation

The area needed for the CAM tables depends on the number of entries necessary in the tables, which in turn depends on the type of session, i.e. whether it is a new full-session or a new sub-session. The natural choice is to analyse the situation with new full-sessions, since this is the most demanding with respect to the usage of entries in the CAM tables. Once the number of sessions has been clarified through a field analysis, which is not performed in this work, the area estimation should be straightforward, since the number of bits used in each entry in each CAM table is known.

6.6 Discussion

This section discusses an example of a system configuration where the system clock frequency equals the maximum clock frequency of the single-cycle memory block, which is a fundamental block in the system design and is used for storing session-specific data and buffering network data. In this example the maximum clock frequency of the memory block is 200 MHz, and the system frequency is chosen to be 200 MHz as well. If the CAM table uses 4 system clock cycles to read an entry in the table, then the plot in Figure 6.7 shows that the lookup latency of the CAM must be no more than 8 system clock cycles to achieve the required 2 Gbps bandwidth. The timing requirements for the checksum processor can be derived from the timing behavior of the CAM table.

Common to all checksum calculations where a 2 Gbps bandwidth is required and the buffer width is 32 bits is that a read operation from the NAPT buffer and a subsequent addition must be handled within the time it takes to load 32 bits from the network, which equals 0.5 ns/bit · 32 bit = 16 ns. Of this, 5 ns is reserved for the read operation alone, leaving 11 ns for the addition.

During the first part of the NAPT procedure, where the information is gathered from CAM-0, the last data word of the packet needs to be read from the packet buffer by the checksum processor and added to the accumulator. This must be completed within the time period of the first lookup in CAM-0 and the subsequent read operation from CAM-0, that is (8 + 4 =) 12 system clock cycles in total, which corresponds to 60 ns at 200 MHz. This checksum task is unproblematic, since it equals the ordinary read and add operation, which should be completed in less than 16 ns according to the aforementioned demand, and this is less than the available 60 ns.

The second part of the NAPT procedure is most demanding in the inbound case, because the checksum processor needs to add both the newly looked up source IP address and port to the accumulator, make the final preparation of the new checksum and write it back. This has to be completed within the time period of a lookup in the CAM-1 table, which is 8 system clock cycles, corresponding to 40 ns. The addition part demands less than 22 ns of the available 40 ns, which leaves 13 ns for the final preparation of the checksum after the 5 ns write back operation is subtracted.

The resulting maximum time periods for the addition (11 ns) and the final preparation of the checksum (13 ns) are both larger than the half clock cycle that is available for each of them according to the timing diagram of the implementation in Figure 6.8. In other words, the goal of a 2 Gbps bandwidth with a system clock frequency of 200 MHz can be achieved if a round trip of the sequential addition part of the checksum processor and the final preparation of the checksum can each be completed within half a system clock cycle, which in this case equals 2.5 ns.
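The timing budget of this example can be re-derived step by step. The sketch below only restates the arithmetic of the discussion; the constant and function names are illustrative, and the 5 ns memory access time is taken as one clock cycle of the single-cycle memory at 200 MHz.

```python
CLK_NS = 5.0          # system clock period at 200 MHz
NS_PER_BIT = 0.5      # 2 Gbps input stream
READ_CYCLES = 4       # CAM read delay assumed in the example
MEM_ACCESS_NS = 5.0   # one read or write of the single-cycle memory
MIN_LOAD_NS = 112.0   # load time of a minimum sized IPv4/UDP packet


def max_lookup_cycles(p, clk_ns, budget_ns):
    """Largest lookup delay n for which (2n + p + 2) cycles fit in the budget."""
    n = 0
    while (2 * (n + 1) + p + 2) * clk_ns <= budget_ns:
        n += 1
    return n


lookup_cycles = max_lookup_cycles(READ_CYCLES, CLK_NS, MIN_LOAD_NS)
word_load_ns = 32 * NS_PER_BIT                          # time to load one 32-bit word
add_budget_ns = word_load_ns - MEM_ACCESS_NS            # left for one addition
cam0_window_ns = (lookup_cycles + READ_CYCLES) * CLK_NS # CAM-0 lookup plus read
cam1_window_ns = lookup_cycles * CLK_NS                 # CAM-1 lookup
final_prep_ns = cam1_window_ns - 2 * add_budget_ns - MEM_ACCESS_NS
half_cycle_ns = CLK_NS / 2                              # available per the timing diagram
```

With these inputs the lookup delay may be at most 8 cycles, and the resulting budgets of 11 ns and 13 ns are both well above the 2.5 ns half cycle.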

Clean-up, deletion and creation of sessions will affect the performance negatively, since the results are calculated on the basis of the best-case scenario, where the input data stream consists entirely of packets belonging to already known sessions and there is no interference from the clean-up procedure. How large the negative effect will be depends on the situation and on all the factors involved in the needed procedure, whether it is the clean-up or the packet process.

Because of the similarities in the NAPT core and checksum processor operation for both UDP and TCP, and the fact that UDP has the smallest packet size, a bandwidth estimate of the NAPT core processing IPv4 packets carrying a UDP packet is sufficient, as it covers the size range of the TCP protocol.


Chapter 7

Discussion

7.1 Summary of Main Contributions

The following three sections describe the main contributions of this thesis to the Network Address Translation approach.

7.1.1 The CAM-table based NAPT Approach

A central part of this thesis is the development of the CAM-table based NAPT approach, as described in Chapter 3. The CAM-table based NAPT procedure can be used by border network equipment, such as routers, where a separation between a private and a public network is wanted. The actual separation is done by masquerading the source IP address and port number at the Network and Transport Layers of the TCP/IP suite. The proposed approach has been estimated to support very high bandwidths, even at a low system frequency (see Chapter 6).


7.1.2 A Behavioral/Structural HDL Implementation

An HDL behavioral/structural implementation of a CAM-table based NAPT was constructed as described in Chapter 4. The implementation uses network dump files to simulate a network traffic flow. The format of the network traffic dump files is quite simple; it should therefore be straightforward to use dump files produced by the popular application Wireshark, a network protocol analyzer for Unix and Windows, for further test purposes.

7.1.3 Very high speed integrated circuit Hardware Description Language (VHDL) Entities

The following VHDL entities were constructed and used in this thesis:

• chksumProcCore.vhd: The Checksum Processor

• setDataUnit.vhd: Prepare NAPT data from packet buffer unit

• getDataUnit.vhd: Write-back NAPT data to packet buffer unit

• hnatCore.vhd: The Hardware Network Address Translation core

• netwRx.vhd: Receive Packet from network unit

• netwTx.vhd: Transmit Packet to network unit

• sram.vhd: 32-bit Static Random Access Memory (session memory)

• pktMem.vhd: 32-bit Static Random Access Memory (packet buffer)

• sysCntlUnit.vhd: Test bench controller

• sysclk.vhd: System clock

• testBanch.vhd: Test bench

7.2 Future Work

The following sections describe interesting areas that could improve the NAPT approach.


7.2.1 Parallel execution

The proposed solution is based on a sequential execution of the packet flow. This solution suffers from possibly long interrupt periods, where the CPU stalls the execution of sessions other than the one concerned. An interesting, but not evaluated, solution to the problem of long hardware stall periods while the CPU performs the necessary tasks is to include a buffer structure between the hardware and the CPU. The buffer should be a double buffer containing the requests from the hardware and the replies from the CPU, respectively. Every time a packet needs attention from the CPU, it is placed in the request buffer and the hardware continues with other packets, while the CPU does what is needed with the packet. When the CPU has completed the requested task, it places the packet in the reply buffer. The prioritized execution of the main tasks, i.e. the packet processing and the clean-up procedure, must be extended with a reply processing task, which has the highest priority of the now three main tasks. The solution must take appropriate care of situations where e.g. a UDP packet is received, recognized as being unknown to the NAPT tables and as a result sent to the CPU for further processing. Successive packets in the same UDP packet flow must then be discarded until the creation of the session is completed.
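The request/reply structure outlined above could be organized roughly as in the following sketch. It is purely illustrative and not an evaluated design: the class `Dispatcher`, its method names and the flow identifiers are hypothetical, and the relative priority of packet processing over the clean-up procedure is an assumption, since only the top priority of the reply processing is stated above.

```python
from collections import deque


class Dispatcher:
    """Toy model of the request and reply buffers between hardware and CPU."""

    def __init__(self):
        self.requests = deque()  # packets handed from the hardware to the CPU
        self.replies = deque()   # packets handed back from the CPU
        self.pending = set()     # flows whose session creation is in progress

    def receive(self, flow, known):
        """Classify an incoming packet without stalling the hardware."""
        if known:
            return "forward"
        if flow in self.pending:
            return "discard"     # session creation not finished yet
        self.pending.add(flow)
        self.requests.append(flow)
        return "queued"

    def cpu_step(self):
        """The CPU serves one request and places the result in the reply buffer."""
        if self.requests:
            self.replies.append(self.requests.popleft())

    def next_task(self, packets_waiting, cleanup_due):
        """Reply processing first; packet before clean-up is an assumed order."""
        if self.replies:
            return "reply"
        if packets_waiting:
            return "packet"
        if cleanup_due:
            return "cleanup"
        return "idle"

    def process_reply(self):
        """Complete the session creation for the oldest reply."""
        flow = self.replies.popleft()
        self.pending.discard(flow)
        return flow


d = Dispatcher()
first = d.receive("flowX", known=False)    # unknown flow goes to the CPU
second = d.receive("flowX", known=False)   # successor packet is discarded
before = d.next_task(packets_waiting=True, cleanup_due=True)
d.cpu_step()
after = d.next_task(packets_waiting=True, cleanup_due=True)
completed = d.process_reply()
```

The hardware never waits for the CPU in this model; it only checks the reply buffer before picking its next task.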

7.2.2 The NAPT table size

A natural question when talking about NAPT gateways is how large the NAPT tables should be. No precise answer exists, as it depends on the application. Table 7.1 lists the NAT table sizes of a selection of commercial SOHO routers. An in-depth analysis of the number of sessions needed in different environments would be desirable before a final conclusion is drawn about the optimal table size.

Table 7.1: The NAT table size of a selection of commercial routers with NAT capability

Vendor and model          Year   NAT table size
Zyxel Prestige 310        2000   256
Actiontec MI424WR         2007   1024
Zyxel 660H-T1             2007   1024
Billion BiPAC 7404VGP     2007   1500
Netgear FWG114P           2004   2000
Netgear DG834G            2007   2048
Linksys-Cisco WRT54GL     2005   4096
D-Link DGL-4300           2006   8192

7.2.3 Combined mapping and filtering table

The current solution is built on two CAM modules, one for the mapping and one for the filtering information. An intuitive idea is to merge those two tables into one CAM module, placing one after the other and separating them virtually by introducing a single bit in each entry that indicates the group membership. It is at the moment not clear whether the combination is beneficial, as the area saved by eliminating one control unit must be larger than the total waste of data bits in the filtering part of the combined table. Recall that an entry of the filtering table is smaller than an entry of the mapping table. Furthermore, combining the two tables prevents parallel lookup in the tables, which is possible for inbound packets in the solution with two CAM modules.
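As an illustration of the group bit, the following Python sketch models a single table in which every row carries one extra bit that separates mapping entries from filtering entries. The names `CombinedCam`, `MAPPING`, `FILTERING` and `wasted_bits` are hypothetical, and the entry widths passed to `wasted_bits` are free parameters rather than figures from the design.

```python
MAPPING, FILTERING = 0, 1  # value of the hypothetical group bit


class CombinedCam:
    """One table holding both groups; the group bit is part of the search key."""

    def __init__(self, size):
        self.rows = [None] * size  # None marks a free row

    def write(self, group, key):
        """Store the key in the first free row; return its address or None."""
        for addr, row in enumerate(self.rows):
            if row is None:
                self.rows[addr] = (group, key)
                return addr
        return None

    def lookup(self, group, key):
        """Return the address of the matching row within one group, else None."""
        for addr, row in enumerate(self.rows):
            if row == (group, key):
                return addr
        return None


def wasted_bits(filter_entries, mapping_width, filtering_width):
    """Unused data bits when filtering entries occupy mapping-sized rows."""
    return filter_entries * (mapping_width - filtering_width)
```

Because both groups share one search port in this model, an inbound packet needs two consecutive lookups, which is the loss of parallelism mentioned above. The value returned by `wasted_bits` is the quantity that has to be weighed against the area of the eliminated control unit.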

7.2.4 Prioritized packet flow

In the current design, packets are received in a First In First Out (FIFO) buffer and processed in that order. Some traffic, however, requires a high Quality of Service (QoS), such as multimedia streams or VoIP. A NAPT gateway could include a selection procedure that prioritizes the execution of packets in the input buffer. This behavior would make the NAPT gateway friendly to multimedia applications.
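One possible selection procedure is sketched below in Python: a priority queue that serves the most urgent traffic class first and keeps FIFO order within a class. The class name `PriorityInputBuffer` and the numeric priority levels are assumptions; how a packet is classified (for example by port number or a header field) is left open.

```python
import heapq
import itertools


class PriorityInputBuffer:
    """Input buffer that releases packets by priority, FIFO within a priority."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # arrival counter keeps FIFO order

    def push(self, packet, priority):
        """Lower numbers are served first (0 = most urgent)."""
        heapq.heappush(self._heap, (priority, next(self._seq), packet))

    def pop(self):
        """Return the next packet to process, or None if the buffer is empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

A strict priority scheme like this one can starve low-priority traffic under sustained load, so a real selection procedure would probably need some form of ageing or weighting.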

7.2.5 Evaluation of TCP corner cases

The TCP protocol has a number of corner cases, as described in [17]. It is advisable to test the behavior of the final implemented NAPT gateway when presented with those corner cases.

7.2.6 Support for fragmented IP packets

The Internet Protocol allows IP fragmentation, so that IP packets can be fragmented into pieces small enough to pass over a link with a smaller Maximum Transmission Unit (MTU) than the original IP packet size. Unfortunately for NAPT gateways, only the first fragment carries the transport-layer header with the port numbers; the following fragments are recognized only by an identifier. A NAPT gateway could simply discard fragmented packets and send back an ICMP error message saying "Packet too Big", or better, it could feature a way to handle fragmented IP packets.
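One way to handle fragments, sketched here in Python under simplifying assumptions, is to remember the port numbers of the first fragment under the key (source address, destination address, protocol, identifier) and to reuse them for the following fragments. The class `FragmentTracker` and the packet field names are hypothetical; the sketch assumes in-order arrival and omits the timeout that a real implementation would need for lost fragments.

```python
class FragmentTracker:
    """Remembers the ports of a first fragment for the fragments that follow."""

    def __init__(self):
        # (src_ip, dst_ip, protocol, ident) -> (src_port, dst_port)
        self._ports = {}

    def classify(self, pkt):
        """Return (src_port, dst_port) for the packet, or None if unknown."""
        key = (pkt["src_ip"], pkt["dst_ip"], pkt["protocol"], pkt["ident"])
        if pkt["frag_offset"] == 0:
            # First fragment (or unfragmented packet): ports are in the packet.
            ports = (pkt["src_port"], pkt["dst_port"])
            if pkt["more_fragments"]:
                self._ports[key] = ports
            return ports
        ports = self._ports.get(key)
        if ports is not None and not pkt["more_fragments"]:
            del self._ports[key]       # last fragment seen, forget the datagram
        return ports
```

With the ports recovered, a later fragment can be passed through the same table lookups as the first one. A fragment for which `classify` returns `None` would have to be buffered or discarded.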

7.2.7 Hairpinning support

Hairpinning support has become an important feature, because peer-to-peer connections between hosts that sit behind separate NAPT gateways, which in turn are both attached to the private side of the same top-level NAPT gateway, cannot work without it [11]. Support for hairpinning is relatively easy to implement as an add-on to the existing approach.
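The idea can be sketched in Python as follows. A packet arriving from the private side and addressed to a mapped port on the gateway's own public address is translated on both the source and the destination side and sent back into the private network. The function name `hairpin`, the two dictionaries and the documentation address used for `NAT_IP` are assumptions made for the example, not part of the implemented design.

```python
NAT_IP = "203.0.113.1"  # hypothetical public address of the NAPT gateway


def hairpin(pkt, out_map, in_map):
    """Translate a private-side packet addressed to the gateway's public side.

    out_map: (private_ip, private_port) -> public_port
    in_map:  public_port -> (private_ip, private_port)
    Returns the rewritten packet, or None if hairpinning does not apply.
    """
    if pkt["dst_ip"] != NAT_IP or pkt["dst_port"] not in in_map:
        return None
    src_public_port = out_map.get((pkt["src_ip"], pkt["src_port"]))
    if src_public_port is None:
        return None                    # sender has no session of its own yet
    dst_ip, dst_port = in_map[pkt["dst_port"]]
    # The receiver sees the sender's public endpoint, so its reply will
    # again pass through the gateway and be translated back.
    return dict(pkt, src_ip=NAT_IP, src_port=src_public_port,
                dst_ip=dst_ip, dst_port=dst_port)
```

Presenting the sender's public endpoint as the source keeps the addressing consistent from the receiver's point of view. The filtering check on the destination mapping is omitted from this sketch.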

7.2.8 Multicast and IGMP support

IP multicast is a technique for one-to-many communication over an IP infrastructure. It is often employed for streaming media and Internet television applications. A host behind a NAPT gateway should, as a minimum, be able to join a so-called group and afterwards receive the actual multicast traffic. The protocol used by receivers to join a group is the Internet Group Management Protocol (IGMP), which must therefore be supported to the extent necessary.

7.2.9 IPv6 support

Only IPv4 has been covered in this work, as it is the protocol of concern when talking about NAPT gateways. However, as mentioned in the introduction, IPv6 is the next-generation Internet Protocol, and support for both Internet Protocols will soon be in demand.


Chapter 8

Conclusion

The decreasing supply of unallocated IPv4 addresses has made NAPT gateways popular, and the ever-increasing need for higher network bandwidth forces vendors of broadband equipment to keep up with the development. Additionally, the increase in hacker attacks etc. has spurred a public demand for security and safety in private networks; demands that could be partially met by deployment of these technologies.

The three objectives of this thesis were: to explore and analyse the essential requirements in terms of architecture, protocol support etc.; to design and develop an architecture that fulfills these requirements with the use of CAM tables, such that the model is capable of handling a bi-directional packet stream with a bandwidth of 2 Gbps; and finally to implement this new approach.

In Chapter 3 this thesis presents a comprehensive step-by-step evaluation and design, with progressive complexity, of a NAPT gateway, satisfying the first two objectives.

Sections 3.1 and 3.2 present a solution to the general mapping and filtering behavior of the NAPT approach, using binary CAM modules. If a bandwidth reduction for inbound packet flows can be tolerated, the originally proposed solution, which uses two individual CAM modules for the mapping and filtering tables, can easily be merged into a single CAM module as described in Section 7.2.3.

Several assignment strategies for the unique port number used to mask sessions originating from the private network are covered in Section 3.3. A CPU-based solution was chosen rather than a purely hardware-based solution for two main reasons: first, the flexibility it provides (the port assignment strategy is decoupled from the hardware), which could be valuable if changes have to be made; and second, it is the natural implementation choice when it comes to the extended firewall capabilities of the NAPT gateway.
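The flexibility argument can be made concrete with a small Python sketch of a software port allocator in which the assignment strategy is a replaceable detail. The class `PortAllocator`, its parameters and the two strategies shown (lowest free port, or a random free port) are illustrative assumptions and not the strategies analysed in Section 3.3.

```python
import random


class PortAllocator:
    """Software port assignment over a fixed range, tracked in a usage bitmap."""

    def __init__(self, first_port, count, rng=None):
        self.first_port = first_port
        self.in_use = [False] * count
        self.rng = rng                 # None: lowest free port; else random pick

    def allocate(self):
        """Return a free public port, or None if the range is exhausted."""
        free = [i for i, used in enumerate(self.in_use) if not used]
        if not free:
            return None
        index = self.rng.choice(free) if self.rng else free[0]
        self.in_use[index] = True
        return self.first_port + index

    def release(self, port):
        """Mark a port as free again when its session is removed."""
        self.in_use[port - self.first_port] = False
```

Passing `rng=random.Random()` switches the allocator to random assignment without touching anything else, which is the kind of change that would require a redesign in a purely hardware-based solution.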

The NAPT behavior while handling the three most commonly used protocols carried over IP, namely UDP, TCP and ICMP, is treated in Sections 3.4 and 3.5. Solutions for handling the supported protocols are given for the establishment, maintenance and termination phases that a session can be in. In addition, some protocol-specific security issues related to the design have been considered.

The last objective is satisfied by the work presented in Chapter 4. Based on the cycle-accurate HDL implementation (consisting of two CAM modules), an equation for the bandwidth of the NAPT core has been derived, and a bandwidth estimate has been provided on the basis of the system clock frequency and the type of CAM module. With CAM tables where lookups are performed within a few tens of nanoseconds, the goal of supporting a bandwidth of at least 2 Gbps must be considered feasible with the presented solution driven by a system frequency of 150 MHz.
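To give a feeling for the numbers, the Python sketch below computes how many clock cycles are available per packet in the worst case. It is not the bandwidth equation derived in Chapter 4. It assumes minimum-size 64-byte Ethernet frames, 20 bytes of per-frame overhead on the wire (preamble, start-of-frame delimiter and inter-frame gap), and that the 2 Gbps figure is the aggregate rate the core has to process.

```python
def cycles_per_packet_budget(clock_hz, bandwidth_bps,
                             frame_bytes=64, overhead_bytes=20):
    """Clock cycles available per packet at a given line rate.

    overhead_bytes covers preamble + start-of-frame delimiter (8 bytes)
    and the inter-frame gap (12 bytes) of Ethernet.
    """
    wire_bits = (frame_bytes + overhead_bytes) * 8
    packets_per_second = bandwidth_bps / wire_bits
    return clock_hz / packets_per_second


budget = cycles_per_packet_budget(clock_hz=150e6, bandwidth_bps=2e9)
print(f"cycles available per minimum-size packet: {budget:.1f}")
```

Under these assumptions roughly 50 clock cycles are available for each minimum-size packet at 150 MHz, so the sum of all lookup, read and write delays per packet has to stay below that figure.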

Besides providing information used for the bandwidth estimation, the implementation also represents a proof of concept of the behavior of the designed NAPT approach. Some essential behavioral tests of the implemented model have been carried out in Chapter 5, with positive results.


Appendix A

Working description of the M.Sc. Thesis

Project title
Design of a hardware network address translation unit for a single chip high-speed Ethernet router

Project title (Danish)
Design af et hardware modul til netværksadresse oversættelse i en enkelt chip højhastigheds-Ethernet router.

Participant(s)
Martin Rolsted Jensen
Jens Sparsø (DTU supervisor)
Vitesse Semiconductor Corporation A/S (collaborator)

Project description
The main goal of this M.Sc. thesis project is to come up with a hardware solution to the Network Address Translation (NAT) procedure, which has become a standard feature in routers for home and small-office Internet access. NAT is usually implemented in software, but with the steadily increasing need for bandwidth, the software solution has become a bottleneck in modern high-speed networks.

The project is carried out as a collaboration between the candidate and Vitesse Semiconductor Corporation A/S and is intended to be part of a larger redesign of an existing product. The main activity in relation to the thesis takes place at Vitesse Semiconductor Corporation A/S.

The project includes an exhaustive survey of the router environment of which the unit is intended to be a part, before the actual Hardware Network Address Translation (HNAT) architecture is specified. The design part of the architecture includes performance analysis to achieve a wire speed of at least 1000 Mbps.

Project steps in point form

• Identify the role of the HNAT in a network System-on-Chip (SOC) approach

• Design HNAT architecture

• Specify subcomponents

• Define a communication interface

• Calculate critical path (estimate latency)

• Calculate area

• Implement a structural description of the HNAT architecture in a Hardware Description Language (HDL)

• Test the design by simulation

Project issues
For the actual translation of a public network address to one of possibly multiple private addresses and vice versa, an efficient data structure is needed as a look-up table. A suitable data structure has to be selected to meet the strict demand for minimal latency. The unavoidable trade-off between area and speed when choosing a data structure has to be considered.

Since a potentially large number of private addresses can coexist simultaneously, their count has a huge impact on the table size, which also has to be considered.

Success criteria
The expected final outcome of the project is a complete design of an HNAT architecture with a wire speed of at least 1000 Mbps. A structural description of the HNAT should be implemented and simulated with the use of an HDL. The aim of the simulation is to prove functionality.


Appendix B

UDP example PPT

B.1 UDP


B.1.1 UDP I

New full outbound session
Since a lookup for the packet's source address & port in CAM-0 results in a miss, a free entry is needed in both CAM-0 and CAM-1 for this new session. Save all relevant data at the free entries in both CAM-0 and CAM-1.

Figure B.1: Illustration of a static Network Address Translation operation.


B.1.2 UDP II

Outbound NAT operation
Perform the actual NAT operation on the packet, which means giving it a new source address and port. Use the address of the free CAM-0 entry as the new source port of the packet.

Figure B.2: Illustration of a static Network Address Translation operation.


B.1.3 UDP III

New full outbound session
Since a lookup for the packet's source address & port in CAM-0 results in a miss, a free entry is needed in both CAM-0 and CAM-1 for this new session. Save all relevant data at the free entries in both CAM-0 and CAM-1.

Figure B.3: Illustration of a static Network Address Translation operation.


B.1.4 UDP IV

Outbound NAT operation
Perform the actual NAT operation on the packet, which means giving it a new source address and port. Use the address of the free CAM-0 entry as the new source port of the packet.

Figure B.4: Illustration of a static Network Address Translation operation.


B.1.5 UDP V

New outbound session
Since a lookup for the packet's source address & port in CAM-0 results in a match, a second lookup is needed, this time in CAM-1, for the packet's destination address & port together with the address of the match from the previous CAM-0 lookup (also called the address of the session). If the outcome is a match, the destination for that session is known and everything is in order; otherwise a free entry in CAM-1 is needed to create the new destination for that session.

Save the packet's destination address & port together with the session's address at the free entry in CAM-1, exactly as was done when creating a new full outbound session.

Figure B.5: Illustration of a static Network Address Translation operation.


B.1.6 UDP VI

Outbound NAT operation
Perform the actual NAT operation on the packet, which means giving it a new source address and port. Use the address of the free CAM-0 entry as the new source port of the packet.

Figure B.6: Illustration of a static Network Address Translation operation.
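The outbound steps of slides I to VI can be summarized in a small behavioural Python model. It is only a sketch of the walk-through, not the implemented HDL: the class `Cam`, the function `outbound`, the documentation address used for `NAT_IP` and the `PORT_BASE` offset added to the CAM-0 address are assumptions made for the example.

```python
NAT_IP = "203.0.113.1"  # hypothetical public address of the gateway
PORT_BASE = 50000       # hypothetical offset: public port = PORT_BASE + CAM-0 address


class Cam:
    """Minimal behavioural CAM: search by content, get the row address back."""

    def __init__(self, size):
        self.rows = [None] * size      # None marks a free row

    def lookup(self, key):
        return self.rows.index(key) if key in self.rows else None

    def write_free(self, key):
        """Write the key into the first free row; return its address or None."""
        addr = self.lookup(None)
        if addr is not None:
            self.rows[addr] = key
        return addr


def outbound(pkt, cam0, cam1):
    """Translate an outbound packet; return None if a table is full."""
    src = (pkt["src_ip"], pkt["src_port"])
    dst = (pkt["dst_ip"], pkt["dst_port"])
    session = cam0.lookup(src)
    if session is None:                # miss in CAM-0: new full outbound session
        session = cam0.write_free(src)
        if session is None:
            return None
    if cam1.lookup((dst, session)) is None:   # new destination for the session
        if cam1.write_free((dst, session)) is None:
            return None
    # The CAM-0 address of the session determines the new source port.
    return dict(pkt, src_ip=NAT_IP, src_port=PORT_BASE + session)
```

A second packet from the same private endpoint to a new destination reuses the CAM-0 entry and only adds a row to CAM-1. For brevity the sketch does not roll back a CAM-0 entry when CAM-1 turns out to be full.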


B.1.7 UDP VII

Inbound request for a NAT operation
Get the original source address & port for that session from CAM-0 using the packet's destination port, if the entry is valid; otherwise discard the packet.

If a lookup in CAM-1 with the packet's source address & port together with the destination port results in a match, the session is known and the packet can proceed; otherwise discard the packet.

Figure B.7: Illustration of a static Network Address Translation operation.


B.1.8 UDP VIII

Inbound NAT operation
Perform the actual NAT operation on the packet, which means giving it a new destination address & port. Use the original source address & port found in the earlier CAM-0 lookup.

Figure B.8: Illustration of a static Network Address Translation operation.
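The inbound steps can be modelled in the same behavioural style. In the Python sketch below the two tables are plain lists, and the function name `inbound`, the packet field names and the `PORT_BASE` offset between the public port and the CAM-0 address are assumptions made for the example.

```python
PORT_BASE = 50000  # hypothetical offset: public port = PORT_BASE + CAM-0 address


def inbound(pkt, cam0_rows, cam1_rows):
    """Translate an inbound packet, or return None to discard it.

    cam0_rows[session] holds (private_ip, private_port) or None.
    cam1_rows holds ((remote_ip, remote_port), session) entries or None.
    """
    # Step 1: the destination port addresses CAM-0 directly (a read, no search).
    session = pkt["dst_port"] - PORT_BASE
    if not 0 <= session < len(cam0_rows) or cam0_rows[session] is None:
        return None                    # no valid mapping entry: discard
    # Step 2: filtering lookup in CAM-1 with the remote endpoint and the session.
    remote = (pkt["src_ip"], pkt["src_port"])
    if (remote, session) not in cam1_rows:
        return None                    # unknown remote endpoint: discard
    # Step 3: restore the original private address and port.
    private_ip, private_port = cam0_rows[session]
    return dict(pkt, dst_ip=private_ip, dst_port=private_port)
```

Steps 1 and 2 depend only on fields of the incoming packet, so with two separate CAM modules they can be issued in parallel. The model performs them one after the other for clarity.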


B.1.9 UDP IX

Inbound request for a NAT operation
Get the original source address & port from CAM-0 using the packet's destination port, if the entry is valid; otherwise discard the packet.

If a lookup in CAM-1 with the packet's source address & port together with the address of the session results in a match, the session is known and the packet can proceed; otherwise discard the packet.

Figure B.9: Illustration of a static Network Address Translation operation.


B.1.10 UDP X

Inbound NAT operation
Perform the actual NAT operation on the packet, which means giving it a new destination address & port. Use the original source address & port found in the earlier CAM-0 lookup.

Figure B.10: Illustration of a static Network Address Translation operation.


B.1.11 UDP XI

Inbound request for a NAT operation
Get the original source address & port from CAM-0 using the packet's destination port, if the entry is valid; otherwise discard the packet.

If a lookup in CAM-1 with the packet's source address & port together with the address of the session results in a match, the session is known and the packet can proceed; otherwise discard the packet.

Figure B.11: Illustration of a static Network Address Translation operation.


B.1.12 UDP XII

Figure B.12: Illustration of a static Network Address Translation operation.


B.1.13 UDP XIII

Inbound request for a NAT operation
Get the original source address & port from CAM-0 using the packet's destination port, if the entry is valid; otherwise discard the packet.

If a lookup in CAM-1 with the packet's source address & port together with the address of the session results in a match, the session is known and the packet can proceed; otherwise discard the packet.

Figure B.13: Illustration of a static Network Address Translation operation.


B.1.14 UDP XIV

Figure B.14: Illustration of a static Network Address Translation operation.


Appendix C

Flowchart: Packet Processing


Appendix D

Flowchart: Clean-Up Processing


Appendix E

VHDL Source Code

E.1 Behavioral-HNAPT

-- ---------------------------------------------------------------------------
--
-- Title      : Hardware Network Address Translation
--            :
-- Developers : Martin Rolsted Jensen
--            :
-- Purpose    :
--            :
--            :
-- Revision   : 1.0 18-08-08 Initial version
--
-- ---------------------------------------------------------------------------
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
library test_library;
use test_library.tcam2_types.all;

entity hnatCore is
  port (
    clk        : in    std_logic;
    nReset     : in    std_logic;
    -- Control signals
    reqNAT     : in    std_logic;
    ackNAT     : out   std_logic;
    discard    : out   std_logic;
    reqINT     : out   std_logic;
    ackINT     : in    std_logic;
    -- Memory interface
    data       : inout std_logic_vector(31 downto 0);
    addr       : out   std_logic_vector(17 downto 0);
    nWE        : out   std_logic;
    nCS        : out   std_logic;
    nOE        : out   std_logic;
    -- Memory interface
    gen_data   : inout std_logic_vector(31 downto 0);
    gen_addr   : out   std_logic_vector(17 downto 0);
    gen_nWE    : out   std_logic;
    gen_nCS    : out   std_logic;


    gen_nOE    : out   std_logic;
    -- Generic packet data
    direction  : in    std_logic;
    cnt        : in    std_logic_vector(31 downto 0);
    -- IP header info of the packet
    version    : in    std_logic_vector(3 downto 0);
    IHL        : in    std_logic_vector(3 downto 0);
    protocol   : in    std_logic_vector(7 downto 0);
    srcIPaddr  : in    std_logic_vector(31 downto 0);
    destIPaddr : in    std_logic_vector(31 downto 0);
    -- TCP & UDP header info of the packet
    srcPort    : in    std_logic_vector(15 downto 0);
    destPort   : in    std_logic_vector(15 downto 0);
    -- TCP only
    FIN        : in    std_logic;
    SYN        : in    std_logic;
    RST        : in    std_logic;
    FACK       : in    std_logic;
    --
    natIPaddr  : out   std_logic_vector(31 downto 0);
    natPort    : inout std_logic_vector(15 downto 0);
    prvIPaddr  : out   std_logic_vector(31 downto 0);
    prvPort    : inout std_logic_vector(15 downto 0));

end hnatCore;

architecture behavioural of hnatCore is

  ---------------------------------------------------------------------------
  ---------------------------------------------------------------------------
  -- TCAM COMPONENT
  ---------------------------------------------------------------------------
  ---------------------------------------------------------------------------
  component tcam2_top
    generic (
      -----------------------------------------------------------------------
      -- Width of a FULL TCAM word:
      -----------------------------------------------------------------------
      -- (g_DATA_WID must be a multiple of 4 if half/quad subword support is
      -- required)
      g_DATA_WID : integer := 49;

      -- Total number of addresses (where
      -- one address is used for each subword):
      g_NUM_ADDR : integer := 16;

      -- Number of subwords per full word:
      -- (Must be set to 1, if subwords must not be supported, and otherwise it
      -- must normally be 4 (to support QUAD words))
      g_NUM_SUBWORDS_PER_FULLWORD : integer := 1);

    port (
      sys_clk : in std_logic;
      sys_rst : in std_logic;
      -----------------------------------------------------------------------
      -- Normal access inputs:
      -----------------------------------------------------------------------
      -- Specifies TCAM operation (IDLE, READ, WRITE, LOOKUP):
      access_cmd_i : in t_ACCESS_CMD_STRM;

      -- Row address for read and write operations:
      access_addr_i : in std_logic_vector(ceiling_log2(g_NUM_ADDR)-1 downto 0);

      -- Selection between column 0 and column 1:
      access_col_sel_i : in std_logic;

      -- Data input when writing to either column 0 or column 1 in a row:
      access_data_i : in std_logic_vector(g_DATA_WID-1 downto 0);

      -- Mask input used only during LOOKUP operation:
      -- Note: it is not used during e.g. WRITE operations:
      access_global_mask_i : in std_logic_vector(g_DATA_WID-1 downto 0);

      -- Word indication (QUAD, HALF, FULL) used during READ, WRITE and LOOKUP operations:
      -- (It must be set to TCAM2_FULL_WORD, if subwords are not supported)
      access_wrd_i : in t_TCAM2_WORD_SEL;

      -----------------------------------------------------------------------
      -- Configuration and status interface:
      -----------------------------------------------------------------------
      init_tcam_cfg_i  : in  std_logic;
      init_tcam_done_o : out std_logic;

      -----------------------------------------------------------------------
      -- TCAM lookup result interface:


      -----------------------------------------------------------------------
      rslt_match_vld_o : out std_logic;
      rslt_match_o     : out std_logic;
      rslt_addr_o      : out std_logic_vector(ceiling_log2(g_NUM_ADDR)-1 downto 0);
      -- Output indicating if the lookup was associated with a full, half or quad word:
      -- (If half/quad words are not supported (g_NUM_SUBWORDS_PER_FULLWORD=1) then
      -- this output must be ignored).
      rslt_wrd_o : out t_TCAM2_WORD_SEL;
      -----------------------------------------------------------------------
      -- TCAM read data interface:
      -----------------------------------------------------------------------
      rd_data_vld_o : out std_logic;
      rd_data_o     : out std_logic_vector(g_DATA_WID-1 downto 0);
      rd_wrd_o      : out t_TCAM2_WORD_SEL);
  end component;

  constant NAT_IP_ADDRESS : std_logic_vector(31 downto 0) := X"42F947B7";
  constant CC_PER_INT     : natural := 8024;  --127
  constant LOOKUP_DLY     : natural := 3;
  constant READ_DLY       : natural := 2;
  constant CPU_DLY        : natural := 6;

  constant TIM : std_logic_vector(6 downto 0) := "0001000";

  -- clock cycles per minutes
  constant CC_PER_SEC : natural := 5;

  -- Main NAT FSM signals
  type state_type is (init_state, idle_state,
                      init_col_0_TCAM_0_1_state, init_col_1_TCAM_0_1_state,
                      CAM_0_map_lookup_state, CAM_1_filt_lookup_state,
                      CAM_0_free_lookup_state, CAM_1_free_lookup_state,
                      wr_map_col_0_TCAM_0_state, wr_map_col_1_TCAM_0_state,
                      wr_filt_col_0_TCAM_1_state, wr_filt_col_1_TCAM_1_state,
                      discard_state, tx_state,
                      rd_map_TCAM_0_state, CAM_1_inb_lookup_state,
                      int_cpu_state,
                      clean_up_state, reset_timer_state,
                      del_col_0_TCAM_0_state, del_col_1_TCAM_0_state,
                      del_CAM_1_lookup_state, del_col_0_TCAM_1_state,
                      del_col_1_TCAM_1_state, rd_map_TCAM_1_state,
                      CAM_0_map_read_state, CAM_0_map_inb_read_state,
                      CAM_0_map_inb_lookup_state);

  signal state, next_state : state_type;
  signal stateName         : string(1 to 10);

  -- Timer FSM signals
  type timer_state_type is (init_timer_state, idle_timer_state,
                            expired_timer_state, done_timer_state);

  signal timer_state, next_timer_state : timer_state_type;
  signal timer_state_name              : string(1 to 10);

  -- Main Timer FSM signals
  type main_timer_state_type is (init_main_timer_state, idle_main_timer_state);

  signal main_timer_state, next_main_timer_state : main_timer_state_type;
  signal main_timer_state_name                   : string(1 to 10);

  -- TCAM-0 signals
  signal s_access_cmd_0 : t_ACCESS_CMD_STRM;

  signal s_access_addr_0 : std_logic_vector(ceiling_log2(16)-1 downto 0);

  signal s_access_col_sel_0 : std_logic;

  signal s_access_data_0 : std_logic_vector(65-1 downto 0);

  signal s_acess_global_mask_0 : std_logic_vector(65-1 downto 0);

  signal s_access_wrd_0 : t_TCAM2_WORD_SEL;

  signal s_init_tcam_cfg_0  : std_logic;
  signal s_init_tcam_done_0 : std_logic;

  signal s_rslt_match_vld_0 : std_logic;
  signal s_rslt_match_0     : std_logic;
  signal s_rslt_addr_0      : std_logic_vector(ceiling_log2(16)-1 downto 0);
  signal s_rslt_wrd_0       : t_TCAM2_WORD_SEL;

  signal s_rd_data_vld_0 : std_logic;
  signal s_rd_data_0     : std_logic_vector(65-1 downto 0);


  signal s_rd_wrd_0      : t_TCAM2_WORD_SEL;

  -- TCAM-1 signals
  signal s_access_cmd_1 : t_ACCESS_CMD_STRM;

  signal s_access_addr_1 : std_logic_vector(ceiling_log2(16)-1 downto 0);

  signal s_access_col_sel_1 : std_logic;

  signal s_access_data_1 : std_logic_vector(65-1 downto 0);

  signal s_acess_global_mask_1 : std_logic_vector(65-1 downto 0);

  signal s_access_wrd_1 : t_TCAM2_WORD_SEL;

  signal s_init_tcam_cfg_1  : std_logic;
  signal s_init_tcam_done_1 : std_logic;

  signal s_rslt_match_vld_1 : std_logic;
  signal s_rslt_match_1     : std_logic;
  signal s_rslt_addr_1      : std_logic_vector(ceiling_log2(16)-1 downto 0);
  signal s_rslt_wrd_1       : t_TCAM2_WORD_SEL;

  signal s_rd_data_vld_1 : std_logic;
  signal s_rd_data_1     : std_logic_vector(65-1 downto 0);
  signal s_rd_wrd_1      : t_TCAM2_WORD_SEL;

  -- Other signals
  signal s_temp        : std_logic_vector(49-1 downto 0);
  signal tmp_natPort   : std_logic_vector(15 downto 0);
  signal s_tmp_rd_data : std_logic_vector(49-1 downto 0);

  signal s_filter_addr : std_logic_vector(31 downto 0);
  signal s_filter_port : std_logic_vector(15 downto 0);

  signal session_type : std_logic_vector(1 downto 0);

  signal CAM_0_addr : std_logic_vector(15 downto 0);
  signal CAM_1_addr : std_logic_vector(15 downto 0);

  signal inc : std_logic;

  signal cnt_start    : std_logic;
  signal cnt_done     : std_logic;  -- ?????
  signal tmp_init_cnt : std_logic_vector(3 downto 0);

  signal int_cnt     : std_logic_vector(15 downto 0);  --5
  signal int_cnt_clr : std_logic;

  signal delay_cnt2 : std_logic_vector(3 downto 0);
  signal delay_clr2 : std_logic;

  signal min_cnt     : std_logic_vector(15 downto 0);
  signal min_cnt_clr : std_logic;

  type AA is array (15 downto 0) of std_logic_vector(6 downto 0);

  signal cnt_arr : AA;
  signal n       : natural;

  signal main_timer_code : std_logic_vector(1 downto 0);

  signal timer_status : std_logic;

  signal sec_cnt_hold : std_logic;

  signal i                     : natural;
  signal session_timer_restart : std_logic;
  signal session_number        : natural;

  signal tmp_CAM_0_port : std_logic_vector(15 downto 0);

  -- signal CAM_0_addr_4 : std_logic_vector(3 downto 0);

  type NAPT_port_table_CPU is array (15 downto 0) of std_logic;

  signal cpu_port_arr : NAPT_port_table_CPU;

  signal cpu_code : std_logic_vector(1 downto 0);

begin

  ---------------------------------------------------------------------------
  -- NAT Main controller
  ---------------------------------------------------------------------------
  p1 : process (state, tmp_init_cnt, ackINT)
  begin


    case state is

      -------------------------------------------------------------------------
      -- Init Procedure of the NAT
      -------------------------------------------------------------------------
      when init_state =>
        stateName <= "init      ";

        -- Tri-state TCAM-0
        s_access_cmd_0 <= TCAM2_ACCESS_CMD_IDLE;
        s_access_addr_0 <= (others => 'Z');
        s_access_col_sel_0 <= 'Z';
        s_access_data_0 <= (others => 'Z');
        s_acess_global_mask_0 <= (others => 'Z');
        s_access_wrd_0 <= TCAM2_FULL_WORD;
        s_init_tcam_cfg_0 <= '1';

        -- Tri-state TCAM-1
        s_access_cmd_1 <= TCAM2_ACCESS_CMD_IDLE;
        s_access_addr_1 <= (others => 'Z');
        s_access_col_sel_1 <= 'Z';
        s_access_data_1 <= (others => 'Z');
        s_acess_global_mask_1 <= (others => 'Z');
        s_access_wrd_1 <= TCAM2_FULL_WORD;
        s_init_tcam_cfg_1 <= '1';

        int_cnt_clr <= '0'; ----------------------------------------------------

        cnt_start <= '0'; ------------------------------------------------------

        -- Clear the cpu natPort list
        for h in 0 to 15 loop
          cpu_port_arr(h) <= '0';
        end loop;

        -- Used to differentiate between a full-, sub- or ordinary-session
        session_type <= "00";

        -- Clear/Stop delay counter
        delay_clr2 <= '0';

        -- Reset the clean-up pointer
        i <= 0;

        -- Timer-code set to idle
        main_timer_code <= (others => '0');

        -- Tri-state
        CAM_0_addr <= (others => 'Z');

        -- Tri-state
        CAM_1_addr <= (others => 'Z');

        tmp_CAM_0_port <= (others => 'Z');

        cpu_code <= "00";

        -- Tri-state the memory bus
        data <= (others => 'Z');
        addr <= (others => 'Z');
        nWE <= 'Z';
        nCS <= 'Z';
        nOE <= 'Z';

        -- Tri-state the session memory bus
        gen_data <= (others => 'Z');
        gen_addr <= (others => 'Z');
        gen_nWE <= 'Z';
        gen_nCS <= 'Z';
        gen_nOE <= 'Z';

        -- Zero the control signals
        ackNAT <= '0';
        discard <= '0';
        reqINT <= '0';

        -- Tri-state the natIPaddr & natPort
        natIPaddr <= (others => 'Z');
        natPort <= (others => 'Z');

        -- General init of NAT done
        if (s_init_tcam_done_0 = '1') and (s_init_tcam_done_1 = '1') then
          --
          cnt_start <= '1'; ----------------------------------------------------
          next_state <= init_col_0_TCAM_0_1_state;
        -- Wait for an init response from the CAM tables
        else


          next_state <= init_state;
        end if;

      -------------------------------------------------------------------------
      -- Init col-0 of all entries in TCAM-0 & TCAM-1
      -------------------------------------------------------------------------
      when init_col_0_TCAM_0_1_state =>
        stateName <= "init col 0";

        -- Clear col-0 of an entry in TCAM-0 (tmp_init_cnt)
        s_access_cmd_0 <= TCAM2_ACCESS_CMD_WR;
        s_access_addr_0 <= std_logic_vector(unsigned(tmp_init_cnt));
        s_access_col_sel_0 <= '0';
        s_access_data_0 <= X"00000000" & X"0000" & X"0000" & '0';

        -- Clear col-0 of an entry in TCAM-1 (tmp_init_cnt)
        s_access_cmd_1 <= TCAM2_ACCESS_CMD_WR;
        s_access_addr_1 <= std_logic_vector(unsigned(tmp_init_cnt));
        s_access_col_sel_1 <= '0';
        s_access_data_1 <= X"00000000" & X"0000" & X"0000" & '0';

        -- The initialization of every col-0 in CAM-0 & CAM-1 is done
        if unsigned(delay_cnt2) = 15 then
          next_state <= init_col_1_TCAM_0_1_state;

        -- Continue initializing col-0 of the next entry in CAM-0 & CAM-1
        else
          -- Start delay counter
          delay_clr2 <= '1';

          next_state <= init_col_0_TCAM_0_1_state;
        end if;

      -------------------------------------------------------------------------
      -- Init col-1 of all entries in TCAM-0 & TCAM-1
      -------------------------------------------------------------------------
      when init_col_1_TCAM_0_1_state =>
        stateName <= "init col 1";

        -- Clear col-1 of an entry in TCAM-0 (tmp_init_cnt)
        s_access_cmd_0 <= TCAM2_ACCESS_CMD_WR;
        s_access_addr_0 <= std_logic_vector(unsigned(tmp_init_cnt)); -----------
        s_access_col_sel_0 <= '1';
        s_access_data_0 <= X"00000000" & X"0000" & X"0000" & '0';

        -- Clear col-1 of an entry in TCAM-1 (tmp_init_cnt)
        s_access_cmd_1 <= TCAM2_ACCESS_CMD_WR;
        s_access_addr_1 <= std_logic_vector(unsigned(tmp_init_cnt)); -----------
        s_access_col_sel_1 <= '1';
        s_access_data_1 <= X"00000000" & X"0000" & X"0000" & '0';

        -- The initialization of every col-1 in CAM-0 & CAM-1 is done
        if unsigned(delay_cnt2) = 15 then

          -- Clear/Stop delay counter
          delay_clr2 <= '0';

          -- Start interrupt counter
          int_cnt_clr <= '1'; --------------------------------------------------

          next_state <= idle_state;

        -- Continue initializing col-1 of the next entry in CAM-0 & CAM-1
        else
          next_state <= init_col_1_TCAM_0_1_state;
        end if;

      -------------------------------------------------------------------------
      -- Idle state
      -------------------------------------------------------------------------
      when idle_state =>
        stateName <= "Idle      ";

        -- Tri-state TCAM-0
        s_access_cmd_0 <= TCAM2_ACCESS_CMD_IDLE;
        s_access_addr_0 <= (others => 'Z');
        s_access_col_sel_0 <= 'Z';
        s_access_data_0 <= (others => 'Z');
        s_acess_global_mask_0 <= (others => 'Z');

        -- Tri-state TCAM-1
        s_access_cmd_1 <= TCAM2_ACCESS_CMD_IDLE;
        s_access_addr_1 <= (others => 'Z');
        s_access_col_sel_1 <= 'Z';
        s_access_data_1 <= (others => 'Z');
        s_acess_global_mask_1 <= (others => 'Z');


        -- Tri-state the session memory bus
        gen_addr <= (others => 'Z');
        gen_data <= (others => 'Z');
        gen_nCS <= 'Z';
        gen_nWE <= 'Z';
        gen_nOE <= 'Z';

        -- Set the timer controller to idle
        main_timer_code <= "00";

        -- Tri-state the CAM-1 address buffer
        -- (Used by the HNAT to identify its timer)
        CAM_1_addr <= (others => 'Z'); -----------------------------------------

        -- Set the session type to "unknown"
        session_type <= "00";

        -- Take the packet processing path
        if reqNAT = '1' then

          -- Outbound
          if direction = '1' then
            next_state <= CAM_0_map_lookup_state;

          -- Inbound
          elsif direction = '0' then
            next_state <= CAM_0_map_inb_lookup_state;

          -- Discard
          else
            next_state <= discard_state;
          end if;

        -- Take the clean-up processing path
        else
          -- Clear the ack flag of the NAT
          ackNAT <= '0';

          -- Clear the discard flag
          discard <= '0';

          -- Setup RAM pins (Read)
          gen_nCS <= '0';
          gen_nWE <= '1';
          gen_nOE <= '0';
          gen_data <= (others => 'Z');
          gen_addr <= std_logic_vector(to_unsigned(i, 18));

          -- Prepare a get-timer-status request for the session
          main_timer_code <= "11";
          session_number <= i;

          next_state <= clean_up_state;

        end if;


      --***********************************************************************
      -------------------------------------------------------------------------
      -------------------------------------------------------------------------
      -- NAT OUTBOUND PROCEDURES
      -------------------------------------------------------------------------
      -------------------------------------------------------------------------
      --***********************************************************************


      -------------------------------------------------------------------------
      -- Lookup the packet's srcIPaddr and srcPort in CAM-0 to find out whether
      -- it is a known/sub-session or a new full-session to the mapping table.
      -------------------------------------------------------------------------
      when CAM_0_map_lookup_state =>
        stateName <= "lookup map";

        -- Lookup to determine whether the packet's source IP addr & port are
        -- located in a valid entry in CAM-0
        s_access_cmd_0 <= TCAM2_ACCESS_CMD_CMP;
        s_access_addr_0 <= (others => 'Z');
        s_access_data_0 <= srcIPAddr & srcPort & X"0000" & '1';
        s_acess_global_mask_0 <= X"00000000" & X"0000" & X"FFFF" & '0';

        -- Execute when CAM-0 is done with the lookup process
        if unsigned(delay_cnt2) = LOOKUP_DLY + 1 then

          -- Clear/Stop delay counter
          delay_clr2 <= '0';

          -- The NAT IP address is set to the constant NAT_IP_ADDRESS


          natIPaddr <= NAT_IP_ADDRESS;

          -- The lookup process results in a HIT (Known-session or new sub-session)
          if s_rslt_match_0 = '1' then

            -- The returned address from the lookup
            -- CAM_0_addr_4 <= std_logic_vector(to_unsigned(to_integer(unsigned(s_rslt_addr_0)), 4));
            CAM_0_addr <= std_logic_vector(to_unsigned(to_integer(unsigned(s_rslt_addr_0)), 16));

            -- Read the entry in CAM-0 which has the address: CAM_0_addr_4 address to get
            -- the natPort for the session
            next_state <= CAM_0_map_read_state;

          -- The lookup process results in a MISS (New full-session)
          else

            -- New full-session
            session_type <= "11";

            -- Find a free entry in CAM-0
            next_state <= CAM_0_free_lookup_state;
          end if;

        -- While CAM-0 is busy
        else
          -- Start delay counter
          delay_clr2 <= '1';

          -- Repeat this state until the lookup is done
          next_state <= CAM_0_map_lookup_state;
        end if;

        -- report "Map lookup" severity note;

      -------------------------------------------------------------------------
      -- Read the content of the entry in CAM-0 with the CAM-0 address from the
      -- previous CAM-0 lookup.
      -------------------------------------------------------------------------
      when CAM_0_map_read_state =>
        stateName <= "read CAM 0";

        -- Setup CAM-0 to read mode
        s_access_cmd_0 <= TCAM2_ACCESS_CMD_RD;

        -- Read the masked IP address and TU port with the unique NAT port number
        -- s_access_addr_0 <= CAM_0_addr_4;
        s_access_addr_0 <= CAM_0_addr(3 downto 0);
        s_access_col_sel_0 <= '0';

        -- Execute when CAM-0 is done with the reading process
        if unsigned(delay_cnt2) = READ_DLY + 1 then

          -- Clear/Stop delay counter
          delay_clr2 <= '0';

          -- Get the masked port number
          natPort <= s_rd_data_0(16 downto 1);

          -- Lookup to determine whether the session is known to the filtering
          -- table CAM-1
          next_state <= CAM_1_filt_lookup_state;

        -- While CAM-0 is busy
        else
          -- Start delay counter
          delay_clr2 <= '1';

          -- Repeat this state until the read is done
          next_state <= CAM_0_map_read_state;
        end if;

        -- report "Rd map CAM-1" severity note;

      -------------------------------------------------------------------------
      -- Lookup the packet's destIPaddr and destPort and natPort in CAM-1 to
      -- find out whether it is a known/sub-session to the filtering table.
      -------------------------------------------------------------------------
      when CAM_1_filt_lookup_state =>
        stateName <= "lookup fil";

        -- Lookup to determine whether the packet's dest. IP addr. & port and
        -- natPort are located in a valid entry in CAM-1


        s_access_cmd_1 <= TCAM2_ACCESS_CMD_CMP;
        s_access_addr_1 <= (others => 'Z');
        s_access_data_1 <= destIPAddr & destPort & natPort & '1';
        s_acess_global_mask_1 <= X"00000000" & X"0000" & X"0000" & '0';

        -- Execute when CAM-1 is done with the lookup process
        if unsigned(delay_cnt2) = LOOKUP_DLY + 1 then

          -- Clear/Stop delay counter
          delay_clr2 <= '0';

          -- Lookup results in a HIT (Known-session)
          if s_rslt_match_1 = '1' then

            -- New ordinary-session
            session_type <= "01";

            -- The returned address from the lookup
            CAM_1_addr <= std_logic_vector(to_unsigned(to_integer(unsigned(s_rslt_addr_1)), 16));

            -- Next state: Forward packet to router core
            next_state <= tx_state;

          -- Lookup results in a MISS (New sub-session)
          else
            -- New sub-session
            session_type <= "10";

            -- Find a free entry in CAM-1
            next_state <= CAM_1_free_lookup_state;
          end if;

        -- While CAM-1 is busy
        else
          -- Start delay counter
          delay_clr2 <= '1';

          -- Repeat this state until the lookup is done
          next_state <= CAM_1_filt_lookup_state;
        end if;

        -- report "Filter lookup" severity note;

      -------------------------------------------------------------------------
      -- Lookup for a free entry in CAM-0
      -------------------------------------------------------------------------
      when CAM_0_free_lookup_state =>
        stateName <= "lookup em0";

        s_access_cmd_0 <= TCAM2_ACCESS_CMD_CMP;
        s_access_addr_0 <= (others => 'Z');
        s_access_data_0 <= X"00000000" & X"0000" & X"0000" & '0';
        s_acess_global_mask_0 <= X"FFFFFFFF" & X"FFFF" & X"FFFF" & '0';

        -- Execute when CAM-0 is done with the lookup process
        if unsigned(delay_cnt2) = LOOKUP_DLY + 1 then

          -- Clear/Stop delay counter
          delay_clr2 <= '0';

          -- Lookup results in a HIT (A free entry has been found)
          if s_rslt_match_0 = '1' then

            -- The returned address from the lookup
            CAM_0_addr <= std_logic_vector(to_unsigned(to_integer(unsigned(s_rslt_addr_0)), 16));

            -- Find a free entry in CAM-1
            next_state <= CAM_1_free_lookup_state;

          -- Lookup results in a MISS (No free entry has been found)
          else
            -- Discard the packet
            next_state <= discard_state;

            report "Mapping table full!!" severity note;
          end if;

        -- While CAM-0 is busy
        else
          -- Start delay counter
          delay_clr2 <= '1';

          -- Repeat this state until the lookup is done


          next_state <= CAM_0_free_lookup_state;
        end if;

        --report "Free entry lookup CAM-0" severity note;

      -------------------------------------------------------------------------
      -- Lookup for a free entry in CAM-1
      -------------------------------------------------------------------------
      when CAM_1_free_lookup_state =>
        stateName <= "lookup em1";

        s_access_cmd_1 <= TCAM2_ACCESS_CMD_CMP;
        s_access_addr_1 <= (others => 'Z');
        s_access_data_1 <= X"00000000" & X"0000" & X"0000" & '0';
        s_acess_global_mask_1 <= X"FFFFFFFF" & X"FFFF" & X"FFFF" & '0';

        -- The session is a new full-session
        if unsigned(delay_cnt2) = LOOKUP_DLY + 1 and session_type = "11" then

          -- Clear/Stop delay counter
          delay_clr2 <= '0';

          -- Lookup results in a HIT (A free entry has been found)
          if s_rslt_match_1 = '1' then

            -- The returned address from the lookup
            CAM_1_addr <= std_logic_vector(to_unsigned(to_integer(unsigned(s_rslt_addr_1)), 16));

            -- Interrupt the CPU to get a NAPT port number and to create the
            -- session in the NAPT table in the CPU
            next_state <= int_cpu_state;

          -- Lookup results in a MISS (No free entry has been found)
          else
            report "Filter table full!!" severity note;

            -- Discard the packet
            next_state <= discard_state;
          end if;

        -- The session is a new sub-session
        elsif unsigned(delay_cnt2) = LOOKUP_DLY + 1 and session_type = "10" then

          -- Clear/Stop delay counter
          delay_clr2 <= '0';

          -- Lookup results in a HIT (A free entry has been found)
          if s_rslt_match_1 = '1' then

            -- The returned address from the lookup
            CAM_1_addr <= std_logic_vector(to_unsigned(to_integer(unsigned(s_rslt_addr_1)), 16));

            -- Interrupt the CPU to create the session in the NAPT table in the CPU
            next_state <= int_cpu_state;

          -- Lookup results in a MISS (No free entry has been found)
          else
            report "Filter table full!!" severity note;

            -- Discard the packet
            next_state <= discard_state;
          end if;

        -- While CAM-1 is busy
        else
          -- Start delay counter
          delay_clr2 <= '1';

          -- Repeat this state until the lookup is done
          next_state <= CAM_1_free_lookup_state;
        end if;

        -- report "Free entry lookup CAM-1" severity note;

      -------------------------------------------------------------------------
      -- Write the source address and port in TCAM-0 col no. 0
      -- Used for mapping creation
      -------------------------------------------------------------------------
      when wr_map_col_0_TCAM_0_state =>
        stateName <= "wr CAM0 c0";

        s_access_cmd_0 <= TCAM2_ACCESS_CMD_WR;
        s_access_addr_0 <= CAM_0_addr(3 downto 0);


        s_access_col_sel_0 <= '0';
        s_access_data_0 <= srcIPaddr & srcPort & natPort & '1';

        -- Next state: Write mapping data into col-1 of CAM-0
        next_state <= wr_map_col_1_TCAM_0_state;

        -- report "Write map to col-0 in CAM-0" severity note;

      -------------------------------------------------------------------------
      -- Write the source address and port in TCAM-0 col no. 1
      -- Used for mapping creation
      -------------------------------------------------------------------------
      when wr_map_col_1_TCAM_0_state =>
        stateName <= "wr CAM0 c1";

        s_access_cmd_0 <= TCAM2_ACCESS_CMD_WR;
        s_access_addr_0 <= CAM_0_addr(3 downto 0);
        s_access_col_sel_0 <= '1';
        s_access_data_0 <= srcIPaddr & srcPort & natPort & '1';

        -- Next state: Write filter data into col-0 of CAM-1
        next_state <= wr_filt_col_0_TCAM_1_state;

        -- report "Write map to col-1 in CAM-0" severity note;

      -------------------------------------------------------------------------
      -- Write the destination address, port and natPort in TCAM-1 col no. 0
      -- Used for filter creation
      -------------------------------------------------------------------------
      when wr_filt_col_0_TCAM_1_state =>
        stateName <= "wr CAM1 c0";

        s_access_cmd_1 <= TCAM2_ACCESS_CMD_WR;
        s_access_addr_1 <= CAM_1_addr(3 downto 0);
        s_access_col_sel_1 <= '0';
        s_access_data_1 <= destIPaddr & destPort & natPort & '1';

        -- Next state: Write filter data into col-1 of CAM-1
        next_state <= wr_filt_col_1_TCAM_1_state;

        -- report "Write filt to col-0 in CAM-1" severity note;

      -------------------------------------------------------------------------
      -- Write the destination address, port and natPort in TCAM-1 col no. 1
      -- Used for filter creation
      -------------------------------------------------------------------------
      when wr_filt_col_1_TCAM_1_state =>
        stateName <= "wr CAM1 c1";

        s_access_cmd_1 <= TCAM2_ACCESS_CMD_WR;
        s_access_addr_1 <= CAM_1_addr(3 downto 0);
        s_access_col_sel_1 <= '1';
        s_access_data_1 <= destIPaddr & destPort & natPort & '1';

        -- Next state: Forward packet to router core
        next_state <= tx_state;

        -- report "Write filt to col-1 in CAM-1" severity note;


      --***********************************************************************
      -------------------------------------------------------------------------
      -------------------------------------------------------------------------
      -- NAT INBOUND PROCEDURES
      -------------------------------------------------------------------------
      -------------------------------------------------------------------------
      --***********************************************************************


      -------------------------------------------------------------------------
      -- Look up the entry in CAM-0 whose NAT port field matches the packet's
      -- destination port number.
      -------------------------------------------------------------------------
      when CAM_0_map_inb_lookup_state =>
        stateName <= "rd map    ";

        -- Lookup to determine whether the packet's destination port (compared
        -- against the NAT port field) is located in a valid entry in CAM-0
        s_access_cmd_0 <= TCAM2_ACCESS_CMD_CMP;
        s_access_addr_0 <= (others => 'Z');
        s_access_data_0 <= X"00000000" & X"0000" & destPort & '1';
        s_acess_global_mask_0 <= X"FFFFFFFF" & X"FFFF" & X"0000" & '0';

        -- Execute when CAM-0 is done with the lookup process
        if unsigned(delay_cnt2) = LOOKUP_DLY + 1 then -- 4

          -- Clear/Stop delay counter


          delay_clr2 <= '0';

          -- The lookup process results in a HIT (i.e. lookup success)
          if s_rslt_match_0 = '1' then

            -- The returned address from the lookup
            -- CAM_0_addr_4 <= std_logic_vector(to_unsigned(to_integer(unsigned(s_rslt_addr_0)), 4));
            CAM_0_addr <= std_logic_vector(to_unsigned(to_integer(unsigned(s_rslt_addr_0)), 16));

            -- Read the entry in CAM-0 which has the address: CAM_0_addr address to get
            -- the original srcIPaddr and srcPort for the session
            next_state <= CAM_0_map_inb_read_state;

          -- The lookup process results in a MISS (i.e. no lookup success)
          else
            -- Discard the packet
            next_state <= discard_state;

            report "Unknown inbound packet" severity note;
          end if;

        -- While CAM-0 is busy
        else
          -- Start delay counter
          delay_clr2 <= '1';

          -- Repeat this state until the lookup is done
          next_state <= CAM_0_map_inb_lookup_state;
        end if;

      -------------------------------------------------------------------------
      -- Read the content of the entry in CAM-0 with the CAM-0 address from the
      -- previous CAM-0 lookup.
      -------------------------------------------------------------------------
      when CAM_0_map_inb_read_state =>
        stateName <= "rd inb C0 ";

        -- Setup CAM-0 to read mode
        s_access_cmd_0 <= TCAM2_ACCESS_CMD_RD;
        -- s_access_addr_0 <= CAM_0_addr_4;
        s_access_addr_0 <= CAM_0_addr(3 downto 0);
        s_access_col_sel_0 <= '0';

        -- Execute when CAM-0 is done with the reading process
        if unsigned(delay_cnt2) = READ_DLY + 1 then

          -- Clear/Stop delay counter
          delay_clr2 <= '0';

          -- Get the original srcIPaddr, srcPort
          prvPort <= s_rd_data_0(32 downto 17);
          prvIPaddr <= s_rd_data_0(64 downto 33);

          -- Check the validity of the packet
          next_state <= CAM_1_inb_lookup_state;

        -- While CAM-0 is busy
        else
          -- Start delay counter
          delay_clr2 <= '1';

          -- Next state: Present state
          next_state <= CAM_0_map_inb_read_state;
        end if;

        -- report "Rd map CAM-1" severity note;

      -------------------------------------------------------------------------
      -- Is the inbound packet known to the filtering table?
      -- Search for a valid entry in CAM-1 with the content equal to the
      -- concatenation of the packet's srcIPaddr, srcPort and destPort
      -------------------------------------------------------------------------
      when CAM_1_inb_lookup_state =>
        stateName <= "inb lookup";

        -- Setup CAM-1 to lookup mode
        s_access_cmd_1 <= TCAM2_ACCESS_CMD_CMP;
        s_access_addr_1 <= (others => 'Z');
        s_access_data_1 <= srcIPAddr & srcPort & destPort & '1';
        s_acess_global_mask_1 <= X"00000000" & X"0000" & X"0000" & '0';

        -- Execute when CAM-1 is done with the lookup process


        if unsigned(delay_cnt2) = LOOKUP_DLY + 1 then

          -- Clear/Stop delay counter
          delay_clr2 <= '0';

          -- Lookup results in a HIT (Validity check: OK)
          if s_rslt_match_1 = '1' then

            -- Next state: Forward packet to router
            next_state <= Tx_state;

          -- Lookup results in a MISS (Validity check: NOT OK)
          else
            -- Next state: Discard packet
            next_state <= discard_state;

            report "Filtered inbound packet" severity note;
          end if;

        -- While CAM-1 is busy
        else
          -- Start delay counter
          delay_clr2 <= '1';

          -- Next state: Present state
          next_state <= CAM_1_inb_lookup_state;
        end if;

        -- report "Inb lookup CAM-1" severity note;


      --***********************************************************************
      -------------------------------------------------------------------------
      -------------------------------------------------------------------------
      -- FORWARD THE PACKET TO THE ROUTER CORE OR DISCARD IT
      -------------------------------------------------------------------------
      -------------------------------------------------------------------------
      --***********************************************************************


      -------------------------------------------------------------------------
      -- Discard the packet
      -------------------------------------------------------------------------
      when discard_state =>
        stateName <= "discard   ";

        -- Set the discard signal
        discard <= '1';

        next_state <= idle_state;

        -- report "Discard state" severity note;

      -------------------------------------------------------------------------
      -- Acknowledge the NAT procedure
      -------------------------------------------------------------------------
      when tx_state =>
        stateName <= "Tx        ";

        -- Set ACT-bit
        if direction = '1' then

          -- Setup the RAM pins (Write)
          gen_nCS <= '0';
          gen_nWE <= '0';
          gen_nOE <= '1';

          -- Setup the address and data
          gen_addr <= "00" & CAM_1_addr;

          -- Full-session set only alive-bit
          if session_type = "11" then
            gen_data <= X"00000002";

            -- Start the session timer
            main_timer_code <= "01";
            session_number <= to_integer(unsigned(CAM_1_addr));

          -- Sub-session set only alive-bit
          elsif session_type = "10" then
            gen_data <= X"00000002";

            -- Start the session timer
            main_timer_code <= "01";
            session_number <= to_integer(unsigned(CAM_1_addr));

          -- Ordinary-session set both the act-bit and alive-bit


          elsif session_type = "01" then
            gen_data <= X"00000003";

          else
            report "Error" severity note;
          end if;

        end if;

        -- Set the ack flag of the NAT
        ackNAT <= '1';

        next_state <= idle_state;

        -- report "Tx state" severity note;


      --***********************************************************************
      -------------------------------------------------------------------------
      -------------------------------------------------------------------------
      -- CPU ROUTINE
      -------------------------------------------------------------------------
      -------------------------------------------------------------------------
      --***********************************************************************


      -------------------------------------------------------------------------
      -- Virtual CPU
      -------------------------------------------------------------------------
      when int_cpu_state =>
        stateName <= "int cpu   ";

        -- Execute when CPU is done
        if unsigned(delay_cnt2) = CPU_DLY + 1 then

          -- Clear/Stop delay counter
          delay_clr2 <= '0';

          if cpu_code = "11" then
            -- Clear the port number
            cpu_port_arr(i) <= '0';

            next_state <= rd_map_TCAM_1_state;

          -- New sub-session
          elsif session_type = "10" then

            -- Create the actual session in the mapping tables col-0 CAM-1
            next_state <= wr_filt_col_0_TCAM_1_state;

          -- New full-session
          elsif session_type = "11" then

            for p in 15 downto 0 loop
              if cpu_port_arr(p) = '0' then
                cpu_port_arr(p) <= '1';
                natPort <= std_logic_vector(to_unsigned(p, 16));
                exit;
              end if;
            end loop;

            -- Create the actual session in the mapping tables col-0 CAM-0
            next_state <= wr_map_col_0_TCAM_0_state;

          end if;

        -- While CPU is busy
        else
          -- Start delay counter
          delay_clr2 <= '1';

          -- Repeat this state until the CPU is done
          next_state <= int_cpu_state;
        end if;

        -- report "Inb lookup CAM-1" severity note;


      --***********************************************************************
      -------------------------------------------------------------------------
      -------------------------------------------------------------------------
      -- CLEAN-UP PROCEDURES
      -------------------------------------------------------------------------
      -------------------------------------------------------------------------
      --***********************************************************************



      -------------------------------------------------------------------------
      -- CLEAN-UP PROCEDURES
      -------------------------------------------------------------------------
      when clean_up_state =>
        stateName <= "cleanUp   ";

        -- The current session is alive
        if gen_data(1) = '1' then

          -- Session timer expired and ACT-bit = '1' => Restart timer
          if timer_status = '1' and gen_data(0) = '1' then
            main_timer_code <= "01";

            -- Clear ACT-bit
            gen_nCS <= '0';
            gen_nWE <= '0';
            gen_nOE <= '1';

            gen_addr <= std_logic_vector(to_unsigned(i, 18));
            gen_data <= X"FFFFFFFE" and gen_data;

            if i = 15 then
              i <= 0;
            else
              i <= i + 1;
            end if;

            next_state <= idle_state;
            report "Restart Timer: Timer Expired (Timer status = 1)" severity note;

          -- Session timer is NOT expired and ACT-bit = '1' => Restart timer
          elsif timer_status = '0' and gen_data(0) = '1' then
            main_timer_code <= "01";

            -- Clear session ACT-bit in RAM
            gen_nCS <= '0';
            gen_nWE <= '0';
            gen_nOE <= '1';

            gen_addr <= std_logic_vector(to_unsigned(i, 18));
            gen_data <= X"FFFFFFFE" and gen_data;

            if i = 15 then
              i <= 0;
            else
              i <= i + 1;
            end if;

            next_state <= idle_state;
            report "Restart Timer: Timer NOT Expired (Timer status = 0)" severity note;

          -- Session timer is expired and ACT-bit = '0' => Delete timer
          elsif timer_status = '1' and gen_data(0) = '0' then
            main_timer_code <= "00";

            -- Clear all session info in RAM (Write)
            gen_nCS <= '0';
            gen_nWE <= '0';
            gen_nOE <= '1';

            gen_addr <= std_logic_vector(to_unsigned(i, 18));
            gen_data <= X"00000000";

            -- next_state <= rd_map_TCAM_1_state;
            cpu_code   <= "11";
            next_state <= int_cpu_state; ---------------------------------------
            report "Delete Session: Timer Expired (Timer status = 1)" severity note;

          -- Session timer is NOT expired and ACT-bit = '0' => Do Nothing
          elsif timer_status = '0' and gen_data(0) = '0' then
            main_timer_code <= "00";

            -- Clear ACT-bit
            gen_nCS <= '0';
            gen_nWE <= '0';
            gen_nOE <= '1';

            gen_addr <= std_logic_vector(to_unsigned(i, 18));
            gen_data <= X"FFFFFFFE" and gen_data;

            if i = 15 then
              i <= 0;
            else
              i <= i + 1;
            end if;



            next_state <= idle_state;
            -- report "Do Nothing: Timer NOT Expired (Timer status = 0)" severity note;

          else

            if i = 15 then
              i <= 0;
            else
              i <= i + 1;
            end if;

            main_timer_code <= "00";
            next_state <= idle_state;
            report "Error" severity note;

          end if;

        -- The current session is NOT alive
        else

          main_timer_code <= "00";

          -- Adjust the clean-up pointer
          if i = 15 then
            i <= 0;
          else
            i <= i + 1;
          end if;

          next_state <= idle_state;

          -- report "NOT Alive" severity note;
        end if;

      -------------------------------------------------------------------------
      -- Read the content of the entry in CAM-1 with the clean-up pointer.
      -------------------------------------------------------------------------
      when rd_map_TCAM_1_state =>
        stateName <= "rd map    ";

        -- Release the data bus
        gen_nCS  <= 'Z';
        gen_nWE  <= 'Z';
        gen_nOE  <= 'Z';
        gen_addr <= (others => 'Z');
        gen_data <= (others => 'Z');

        -- Setup CAM-1 to read mode
        s_access_cmd_1 <= TCAM2_ACCESS_CMD_RD;

        -- Read the masked IP address and TU port with the unique NAT port number
        s_access_addr_1    <= std_logic_vector(to_unsigned(i, 4));
        s_access_col_sel_1 <= '0';

        cpu_code <= "00";

        -- Execute when CAM-1 is done with the reading process
        if unsigned(delay_cnt2) = READ_DLY + 1 then

          -- Clear/Stop delay counter
          delay_clr2 <= '0';

          -- It is a valid entry i.e. the VALID-bit is set
          if s_rd_data_1(0) = '1' then

            -- Get the masked IP address and port number
            tmp_CAM_0_port <= s_rd_data_1(16 downto 1);

            -- Next state: Delete col no. 0 in TCAM-1
            next_state <= del_col_0_TCAM_1_state;

          -- It is NOT a valid entry i.e. the VALID-bit is cleared
          else

            -- Next state: Idle
            next_state <= idle_state;
          end if;

        -- While CAM-1 is busy
        else
          -- Start delay counter
          delay_clr2 <= '1';

          -- Next state: Present state
          next_state <= rd_map_TCAM_1_state;
        end if;



        -- report "Rd map CAM-1" severity note;

      -------------------------------------------------------------------------
      -- Delete col no. 0 in TCAM-1
      -------------------------------------------------------------------------
      when del_col_0_TCAM_1_state =>
        stateName <= "dl CAM1 c0";

        s_access_cmd_1     <= TCAM2_ACCESS_CMD_WR;
        s_access_addr_1    <= std_logic_vector(to_unsigned(i, 4));
        s_access_col_sel_1 <= '0';
        s_access_data_1    <= X"00000000" & X"0000" & X"0000" & '0';

        next_state <= del_col_1_TCAM_1_state;

        -- report "Write map to col-0 in CAM-0" severity note;

      -------------------------------------------------------------------------
      -- Delete col no. 1 in TCAM-1
      -------------------------------------------------------------------------
      when del_col_1_TCAM_1_state =>
        stateName <= "dl CAM1 c1";

        s_access_cmd_1     <= TCAM2_ACCESS_CMD_WR;
        s_access_addr_1    <= std_logic_vector(to_unsigned(i, 4));
        s_access_col_sel_1 <= '1';
        s_access_data_1    <= X"00000000" & X"0000" & X"0000" & '0';

        next_state <= del_CAM_1_lookup_state;

        -- report "Write map to col-0 in CAM-0" severity note;

      -------------------------------------------------------------------------
      -- Look up TCAM-1 for remaining valid entries that use the same NAT port
      -------------------------------------------------------------------------
      when del_CAM_1_lookup_state =>
        stateName <= "del CAM1  ";

        s_access_cmd_1        <= TCAM2_ACCESS_CMD_CMP;
        s_access_addr_1       <= (others => 'Z');
        s_access_data_1       <= X"00000000" & X"0000" & X"000" & tmp_CAM_0_port(3 downto 0) & '1';
        s_acess_global_mask_1 <= X"FFFFFFFF" & X"FFFF" & X"0000" & '0';

        if unsigned(delay_cnt2) = LOOKUP_DLY + 1 then

          -- Clear/Stop delay counter
          delay_clr2 <= '0';

          -- Hit
          if s_rslt_match_1 = '1' then

            if i = 15 then
              i <= 0;
            else
              i <= i + 1;
            end if;

            next_state <= idle_state;

          -- Miss
          else
            next_state <= del_col_0_TCAM_0_state;
          end if;

        else
          -- Start delay counter
          delay_clr2 <= '1';

          next_state <= del_CAM_1_lookup_state;
        end if;

      -------------------------------------------------------------------------
      -- Delete col no. 0 in TCAM-0
      -------------------------------------------------------------------------
      when del_col_0_TCAM_0_state =>
        stateName <= "dl CAM0 c0";

        s_access_cmd_0     <= TCAM2_ACCESS_CMD_WR;
        s_access_addr_0    <= tmp_CAM_0_port(3 downto 0);
        s_access_col_sel_0 <= '0';
        s_access_data_0    <= X"00000000" & X"0000" & X"0000" & '0';

        next_state <= del_col_1_TCAM_0_state;

        -- report "Write map to col-0 in CAM-0" severity note;



      -------------------------------------------------------------------------
      -- Delete col no. 1 in TCAM-0
      -------------------------------------------------------------------------
      when del_col_1_TCAM_0_state =>
        stateName <= "dl CAM0 c1";

        s_access_cmd_0     <= TCAM2_ACCESS_CMD_WR;
        s_access_addr_0    <= tmp_CAM_0_port(3 downto 0);
        s_access_col_sel_0 <= '1';
        s_access_data_0    <= X"00000000" & X"0000" & X"0000" & '0';

        if i = 15 then
          i <= 0;
        else
          i <= i + 1;
        end if;

        next_state <= idle_state;

        -- report "Write map to col-0 in CAM-0" severity note;

      when others =>
        next_state <= idle_state;
    end case;

  end process;

  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  -----------------------------------------------------------------------------
  -- Register part of the FSM
  -----------------------------------------------------------------------------
  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  regs: process(clk, nReset)
  begin
    if nReset = '0' then
      state <= init_state;
    elsif clk'event and clk = '1' then
      state <= next_state;
    end if;
  end process regs;

  --***********************************************************************
  -------------------------------------------------------------------------
  -------------------------------------------------------------------------
  -- MAIN TIMER CONTROLLER
  -------------------------------------------------------------------------
  -------------------------------------------------------------------------
  --***********************************************************************

  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  -----------------------------------------------------------------------------
  -- Main Timer controller
  -----------------------------------------------------------------------------
  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  p17: process(main_timer_state, int_cnt)
  begin
    case main_timer_state is

      -------------------------------------------------------------------------
      -- INITIATION PROCEDURE
      -------------------------------------------------------------------------
      when init_main_timer_state =>
        main_timer_state_name <= "init";
        sec_cnt_hold <= '0';

        timer_status          <= '0';
        session_timer_restart <= '0';
        next_main_timer_state <= idle_main_timer_state;

      -------------------------------------------------------------------------
      -- IDLE PROCEDURE
      -------------------------------------------------------------------------
      when idle_main_timer_state =>
        main_timer_state_name <= "idle";

        -- Idle
        if main_timer_code = "00" then
          session_timer_restart <= '0';
          next_main_timer_state <= idle_main_timer_state;

        -- Get session timer status
        elsif main_timer_code = "11" then
          session_timer_restart <= '0';
          if unsigned(cnt_arr(i)) = 0 then
            timer_status <= '1';
          else


            timer_status <= '0';
          end if;

          next_main_timer_state <= idle_main_timer_state;

        -- Restart session timer
        elsif main_timer_code = "01" then
          session_timer_restart <= '1';
          next_main_timer_state <= idle_main_timer_state;

        -- Start session timer
        elsif main_timer_code = "10" then
          session_timer_restart <= '0';
          next_main_timer_state <= idle_main_timer_state;

        -- otherwise
        else
          session_timer_restart <= '0';
          next_main_timer_state <= idle_main_timer_state;
        end if;

        -- report "idle state" severity note;

    end case;
  end process;

  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  -----------------------------------------------------------------------------
  -- Register part of the main timer controller (FSM)
  -----------------------------------------------------------------------------
  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  FSM_main_timer_regs: process(clk, nReset)
  begin
    if nReset = '0' then
      main_timer_state <= init_main_timer_state;
    elsif clk'event and clk = '0' then
      main_timer_state <= next_main_timer_state;
    end if;
  end process FSM_main_timer_regs;


  --***************************************************************************
  -----------------------------------------------------------------------------
  -----------------------------------------------------------------------------
  -- TIMER CONTROLLER
  -----------------------------------------------------------------------------
  -----------------------------------------------------------------------------
  --***************************************************************************


  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  -----------------------------------------------------------------------------
  -- Session Down Counter Controller
  -----------------------------------------------------------------------------
  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  p13: process(timer_state, min_cnt)
  begin
    case timer_state is

      -------------------------------------------------------------------------
      -- Initiation procedure
      -------------------------------------------------------------------------
      when init_timer_state =>
        timer_state_name <= "init   ";
        min_cnt_clr <= '0';
        n <= 0;

        cnt_arr(0)  <= (others => '0');
        cnt_arr(1)  <= (others => '0');
        cnt_arr(2)  <= (others => '0');
        cnt_arr(3)  <= (others => '0');
        cnt_arr(4)  <= (others => '0');
        cnt_arr(5)  <= (others => '0');
        cnt_arr(6)  <= (others => '0');
        cnt_arr(7)  <= (others => '0');
        cnt_arr(8)  <= (others => '0');
        cnt_arr(9)  <= (others => '0');
        cnt_arr(10) <= (others => '0');
        cnt_arr(11) <= (others => '0');
        cnt_arr(12) <= (others => '0');
        cnt_arr(13) <= (others => '0');
        cnt_arr(14) <= (others => '0');
        cnt_arr(15) <= (others => '0');

        next_timer_state <= idle_timer_state;

      -------------------------------------------------------------------------


      -- Wait for the second timer to expire
      -------------------------------------------------------------------------
      when idle_timer_state =>
        timer_state_name <= "idle   ";

        if session_timer_restart = '1' then
          cnt_arr(session_number) <= TIM; -- "1000000"; --0001000
        end if;

        if unsigned(min_cnt) = 0 then
          next_timer_state <= expired_timer_state;
        else
          min_cnt_clr <= '1';
          next_timer_state <= idle_timer_state;
        end if;

      -------------------------------------------------------------------------
      -- If the current table entry (NAT port) is active then count its counter
      -- one down otherwise skip the table entry i.e. do nothing
      -------------------------------------------------------------------------
      when expired_timer_state =>
        timer_state_name <= "expired";
        min_cnt_clr <= '0';

        if session_timer_restart = '1' then
          cnt_arr(session_number) <= TIM; -- "1000000"; --0001000
        end if;

        if cnt_arr(n) = "0000000" then
          null;
        else
          cnt_arr(n) <= std_logic_vector(unsigned(cnt_arr(n)) - 1);
        end if;

        next_timer_state <= done_timer_state;

      -------------------------------------------------------------------------
      -- Move the entry pointer to the next entry (if the present entry is the
      -- last entry then point to the first entry of the table)
      -------------------------------------------------------------------------
      when done_timer_state =>
        timer_state_name <= "done   ";
        n <= n + 1;

        if session_timer_restart = '1' then
          cnt_arr(session_number) <= TIM; -- "1000000"; --0001000
        end if;

        if n = 15 then
          n <= 0;
          next_timer_state <= idle_timer_state;
        else
          n <= n + 1;
          next_timer_state <= expired_timer_state;
        end if;
    end case;
  end process;

  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  -----------------------------------------------------------------------------
  -- Register part of the FSM
  -----------------------------------------------------------------------------
  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

  FSM_timer_regs: process(clk, nReset, sec_cnt_hold)
  begin
    if nReset = '0' then
      -- min_cnt_clr <= '1';
      timer_state <= init_timer_state;
    elsif sec_cnt_hold = '1' then
      null;
    elsif clk'event and clk = '1' then
      timer_state <= next_timer_state;
    end if;
  end process FSM_timer_regs;

  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  -----------------------------------------------------------------------------
  -- Second counter
  -----------------------------------------------------------------------------
  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

  p14: process(clk, nReset, min_cnt_clr)
    variable internal_count: natural;
  begin
    -- Reset (asynchronous)
    if nReset = '0' then


      internal_count := 0;
    -- Clear (asynchronous)
    elsif min_cnt_clr = '0' then
      internal_count := CC_PER_SEC;
    -- Clock with rising edge
    elsif clk'event and clk = '1' then
      if internal_count = 0 then
      else
        internal_count := internal_count - 1;
      end if;
    end if;
    min_cnt <= std_logic_vector(to_unsigned(internal_count, 16)); --6
  end process;


  --***********************************************************************
  -------------------------------------------------------------------------
  -------------------------------------------------------------------------
  -- DIV. COUNTERS
  -------------------------------------------------------------------------
  -------------------------------------------------------------------------
  --***********************************************************************


  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  -----------------------------------------------------------------------------
  -- Address counter for the initiation procedure of the NAT
  -----------------------------------------------------------------------------
  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

  p7: process(clk, nReset)
    variable internal_count: natural;
  begin
    if nReset = '0' then                  -- reset (asynchronous)
      internal_count := 15;
    elsif clk'event and clk = '1' then    -- clock
      if cnt_start = '1' then
        if internal_count = 15 then

          internal_count := 0;
        else
          internal_count := internal_count + 1;
        end if;
      else
        internal_count := 0;
      end if;
    end if;
    tmp_init_cnt <= std_logic_vector(to_unsigned(internal_count, 4));
  end process;

  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  -----------------------------------------------------------------------------
  -- Global interrupt counter for the clean-up and deletion procedure
  -----------------------------------------------------------------------------
  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

  p8: process(clk, nReset, int_cnt_clr)
    variable internal_count: natural;
  begin
    -- Reset (asynchronous)
    if nReset = '0' then
      internal_count := 0;
    -- Clear (asynchronous)
    elsif int_cnt_clr = '0' then
      internal_count := CC_PER_INT; -- 127; --63
    -- Clock with falling edge
    elsif clk'event and clk = '0' then
      if internal_count = 0 then
      else
        internal_count := internal_count - 1;
      end if;
    end if;
    int_cnt <= std_logic_vector(to_unsigned(internal_count, 16)); --6
  end process;

  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  -----------------------------------------------------------------------------
  -- General delay counter
  -----------------------------------------------------------------------------
  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

  p9: process(clk, nReset, delay_clr2)
    variable internal_count: natural;
  begin
    -- Reset (asynchronous)
    if nReset = '0' then
      internal_count := 0;


    -- Clear (asynchronous)
    elsif delay_clr2 = '0' then
      internal_count := 0;
    -- Clock with falling edge
    elsif clk'event and clk = '0' then
      if internal_count = 15 then
        internal_count := 0;
      else
        internal_count := internal_count + 1;
      end if;
    end if;
    delay_cnt2 <= std_logic_vector(to_unsigned(internal_count, 4));
  end process;

  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  -----------------------------------------------------------------------------
  -- PORT MAP OF TCAM-0
  -----------------------------------------------------------------------------
  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

  p2: tcam2_top
    generic map(
      ---------------------------------------------------------------------------
      --Width of a FULL TCAM word:
      ---------------------------------------------------------------------------
      --(g_DATA_WID must be a multiple of 4 if half/quad subword support is
      --required)
      g_DATA_WID => 65,

      --Total number of addresses (where
      --one address is used for each subword):
      g_NUM_ADDR => 16,

      --Number of subwords per full word:
      --(Must be set to 1, if subwords must not be supported, and otherwise it
      --must normally be 4 (to support QUAD words))
      g_NUM_SUBWORDS_PER_FULLWORD => 1)
    port map (
      sys_clk => clk,
      sys_rst => nReset,

      --Normal access inputs:
      -- Specifies TCAM operation (IDLE, READ, WRITE, LOOKUP):
      access_cmd_i => s_access_cmd_0,

      -- Row address for read and write operations:
      access_addr_i => s_access_addr_0,

      --Selection between column 0 and column 1:
      access_col_sel_i => s_access_col_sel_0,

      --Data input when writing to either column 0 or column 1 in a row:
      access_data_i => s_access_data_0,

      --Mask input used only during LOOKUP operation:
      --Note: it is not used during e.g. WRITE operations:
      access_global_mask_i => s_acess_global_mask_0,

      --Word indication (QUAD, HALF, FULL) used during READ, WRITE and LOOKUP
      --operations:
      --(It must be set to TCAM2_FULL_WORD, if subwords are not supported)
      access_wrd_i => s_access_wrd_0,

      --configuration and status interface:
      init_tcam_cfg_i  => s_init_tcam_cfg_0,
      init_tcam_done_o => s_init_tcam_done_0,

      -- TCAM lookup result interface:
      rslt_match_vld_o => s_rslt_match_vld_0,
      rslt_match_o     => s_rslt_match_0,
      rslt_addr_o      => s_rslt_addr_0,

      --Output indicating if the lookup was associated with a full, half or quad
      --word:
      --(If half/quad words are not supported (g_NUM_SUBWORDS_PER_FULLWORD=1) then
      -- this output must be ignored).
      rslt_wrd_o => s_rslt_wrd_0,

      -- TCAM read data interface:
      rd_data_vld_o => s_rd_data_vld_0,
      rd_data_o     => s_rd_data_0,
      rd_wrd_o      => s_rd_wrd_0);

  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
  -----------------------------------------------------------------------------
  -- PORT MAP OF TCAM-1
  -----------------------------------------------------------------------------


  --+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

  p3 : tcam2_top
    generic map(
      ---------------------------------------------------------------------------
      --Width of a FULL TCAM word:
      ---------------------------------------------------------------------------
      --(g_DATA_WID must be a multiple of 4 if half/quad subword support is
      --required)
      g_DATA_WID => 65,

      --Total number of addresses (where
      --one address is used for each subword):
      g_NUM_ADDR => 16,

      --Number of subwords per full word:
      --(Must be set to 1, if subwords must not be supported, and otherwise it
      --must normally be 4 (to support QUAD words))
      g_NUM_SUBWORDS_PER_FULLWORD => 1)

    port map (
      sys_clk => clk,
      sys_rst => nReset,

      --Normal access inputs:
      -- Specifies TCAM operation (IDLE, READ, WRITE, LOOKUP):
      access_cmd_i => s_access_cmd_1,

      -- Row address for read and write operations:
      access_addr_i => s_access_addr_1,

      --Selection between column 0 and column 1:
      access_col_sel_i => s_access_col_sel_1,

      --Data input when writing to either column 0 or column 1 in a row:
      access_data_i => s_access_data_1,

      --Mask input used only during LOOKUP operation:
      --Note: it is not used during e.g. WRITE operations:
      access_global_mask_i => s_acess_global_mask_1,

      --Word indication (QUAD, HALF, FULL) used during READ, WRITE and LOOKUP
      --operations:
      --(It must be set to TCAM2_FULL_WORD, if subwords are not supported)
      access_wrd_i => s_access_wrd_1,

      --Configuration and status interface:
      init_tcam_cfg_i => s_init_tcam_cfg_1,
      init_tcam_done_o => s_init_tcam_done_1,

      -- TCAM lookup result interface:
      rslt_match_vld_o => s_rslt_match_vld_1,
      rslt_match_o => s_rslt_match_1,
      rslt_addr_o => s_rslt_addr_1,

      --Output indicating if the lookup was associated with a full, half or quad
      --word:
      --(If half/quad words are not supported (g_NUM_SUBWORDS_PER_FULLWORD=1) then
      -- this output must be ignored).
      rslt_wrd_o => s_rslt_wrd_1,

      -- TCAM read data interface:
      rd_data_vld_o => s_rd_data_vld_1,
      rd_data_o => s_rd_data_1,
      rd_wrd_o => s_rd_wrd_1);

end behavioural;
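
The port map comments above describe a TCAM with 16 rows of 65-bit words, a global mask that applies only during LOOKUP, and a result consisting of a match flag and a row address. The following Python sketch is a behavioural illustration only and is not taken from the thesis sources. The class and method names are hypothetical, and three points are assumptions rather than documented behaviour of tcam2_top: that each row stores a data word plus a per-row care mask, that a mask bit of 1 means "compare this bit", and that the lowest matching address wins.

```python
from typing import List, Optional, Tuple

WIDTH = 65                    # g_DATA_WID in the port map
NUM_ADDR = 16                 # g_NUM_ADDR in the port map
ALL_ONES = (1 << WIDTH) - 1   # mask with every bit set


class TcamModel:
    """Hypothetical software model of a ternary CAM lookup."""

    def __init__(self) -> None:
        # Each row holds (data, care_mask); None marks an unused row.
        self.rows: List[Optional[Tuple[int, int]]] = [None] * NUM_ADDR

    def write(self, addr: int, data: int, care_mask: int = ALL_ONES) -> None:
        """Store a data word and its care mask (assumed layout) in one row."""
        self.rows[addr] = (data & ALL_ONES, care_mask & ALL_ONES)

    def lookup(self, key: int, global_mask: int = ALL_ONES):
        """Return (match, addr); the lowest matching row wins (assumption)."""
        for addr, row in enumerate(self.rows):
            if row is None:
                continue
            data, care = row
            # Only bits enabled by both masks take part in the comparison.
            compare_bits = care & global_mask
            if ((key ^ data) & compare_bits) == 0:
                return True, addr
        return False, None
```

Clearing bits in `global_mask` widens a lookup without rewriting any row, which is one plausible reason for the mask being a LOOKUP-only input.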

E.2 RTL-Checksum

-- -----------------------------------------------------------------------------
--
-- Title      : Checksum Processor
--            :
-- Developers : Martin Rolsted Jensen
--            :
-- Purpose    : Calculates the checksum of the NAPT modified packet
--            :
--            :
-- Revision   : 1.0 18-08-08 Initial version
--


-- -----------------------------------------------------------------------------
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity chksumProcCore is
  port (
    clk : in std_logic;
    nReset : in std_logic;
    -- Control signals
    req : in std_logic;
    ack : out std_logic;
    -- Memory interface
    data : inout std_logic_vector(31 downto 0);
    addr : out std_logic_vector(17 downto 0);
    nWE : out std_logic;
    nCS : out std_logic;
    nOE : out std_logic;
    direction : in std_logic;
    cnt : in std_logic_vector(31 downto 0);
    -- IP header info of the packet
    IHL : in std_logic_vector(3 downto 0);
    -- Chksum
    chksum : out std_logic_vector(15 downto 0)
  );
end chksumProcCore;

architecture behavioural of chksumProcCore is

  component four_to_two_16bit_unit
    port (
      x_0, x_1, x_2, x_3 : in std_logic_vector(15 downto 0);
      sc, ps : out std_logic_vector(15 downto 0));
  end component;

  component PS_C_Reg
    port (
      clk : in std_logic;
      clear : in std_logic;
      load : in std_logic;
      reg_in0 : in std_logic_vector(15 downto 0);
      reg_in1 : in std_logic_vector(15 downto 0);
      reg_out0 : out std_logic_vector(15 downto 0);
      reg_out1 : out std_logic_vector(15 downto 0));
  end component;

  component cpa_unit
    generic (
      left : natural := 15);
    port (
      a : in std_logic_vector(left downto 0);
      b : in std_logic_vector(left downto 0);
      cin : in std_logic;
      sum : out std_logic_vector(left downto 0);
      cout : out std_logic);
  end component;

  component ones_complement_unit
    generic (
      left : natural := 15);
    port (
      x : in std_logic_vector(left downto 0);
      y : out std_logic_vector(left downto 0)
    );
  end component;

  component chksum_reg
    port (
      clk : in std_logic;
      clear : in std_logic;
      load : in std_logic;
      reg_in : in std_logic_vector(15 downto 0);
      reg_out : out std_logic_vector(15 downto 0));
  end component;

  type state_type is (init, idle, step1, step2, step3);
  signal state, next_state : state_type;
  signal stateName : string(1 to 10);

  signal s_cpa : std_logic;
  signal s_outpPSnext : std_logic_vector(15 downto 0);
  signal s_outpCSnext : std_logic_vector(15 downto 0);
  signal s_oper0 : std_logic_vector(15 downto 0);
  signal s_oper1 : std_logic_vector(15 downto 0);
  signal s_outpPS : std_logic_vector(15 downto 0);
  signal s_outpCS : std_logic_vector(15 downto 0);
  signal s_chksum : std_logic_vector(15 downto 0);
  signal s_fin_chksum : std_logic_vector(15 downto 0);



  signal s_clear_chksum : std_logic;
  signal s_load_chksum : std_logic;

  signal s_clear : std_logic;
  signal s_load : std_logic;
  signal s_reg0 : std_logic_vector(15 downto 0);
  signal s_reg1 : std_logic_vector(15 downto 0);

begin

  -----------------------------------------------------------------------------
  -- Finite State Machine (FSM) for the checksum calculation
  -----------------------------------------------------------------------------
  p1 : process (state, req)
    variable tmp : std_logic_vector(31 downto 0);
    variable v_cnt : natural;
    variable diff : integer := 0;
  begin
    case state is
      when init =>
        -- stateName is string(1 to 10), so literals are padded to 10 characters
        stateName <= "init      ";
        chksum <= (others => 'Z');
        s_oper0 <= (others => '0');
        s_oper1 <= (others => '0');
        data <= (others => 'Z');
        addr <= (others => 'Z');
        nWE <= 'Z';
        nCS <= 'Z';
        nOE <= 'Z';
        tmp := (others => 'Z');
        ack <= '0';
        v_cnt := 6; -- 0
        s_clear <= '1';
        s_load <= '0';
        s_clear_chksum <= '1';
        s_load_chksum <= '0';
        s_cpa <= '0';

        next_state <= idle;

      -- idle state
      when idle =>
        stateName <= "Idle      ";
        if req = '1' then
          -- Compensate for different IP header length
          -- (IHL is an unsigned 4-bit field, so it is converted as unsigned)
          diff := to_integer(unsigned(IHL)) - 5;
          next_state <= step1;
        else
          s_oper0 <= (others => '0');
          s_oper1 <= (others => '0');
          data <= (others => 'Z');
          addr <= (others => 'Z');
          nWE <= 'Z';
          nCS <= 'Z';
          nOE <= 'Z';
          tmp := (others => 'Z');
          ack <= '0';
          v_cnt := 6; -- 0
          s_clear <= '1';
          s_load <= '0';
          s_clear_chksum <= '0';
          s_load_chksum <= '0';

          next_state <= idle;
        end if;

      -- Setup addr
      when step1 =>
        stateName <= "step1     ";
        nCS <= '0';
        nWE <= '1';
        nOE <= '0';
        addr <= std_logic_vector(to_unsigned(v_cnt + diff, 18));
        s_clear <= '0';
        s_load <= '1';
        next_state <= step2;

      -- Read data
      when step2 =>
        stateName <= "step2     ";
        v_cnt := v_cnt + 1;
        tmp := data;
        s_clear <= '0';
        s_load <= '0';
        s_oper0 <= data(15 downto 0);
        s_oper1 <= data(31 downto 16);


        if v_cnt = to_integer(signed(cnt)) then
          s_clear_chksum <= '0';
          s_load_chksum <= '1';

          next_state <= step3;
        else
          -- next_state <= idle;
          next_state <= step1;
        end if;

      -- Ack the checksum calculation
      when step3 =>
        -- stateName is string(1 to 10), so the literal is padded to 10 characters
        stateName <= "step3     ";
        s_clear <= '1';
        s_load <= '0';
        s_clear_chksum <= '0';
        s_load_chksum <= '0';
        ack <= '1';
        next_state <= idle;

    end case;
  end process;

  regs : process (clk, nReset)
  begin
    if nReset = '0' then
      state <= init;
    elsif clk'event and clk = '1' then
      state <= next_state;
    end if;
  end process regs;

  -------------------------------------------------------------------------------
  -- 16-bit multioperand adder made of 4:2 adder modules
  -------------------------------------------------------------------------------
  p5 : four_to_two_16bit_unit
    port map (
      -- Input
      x_0 => s_outpPSnext,
      x_1 => s_outpCSnext,
      x_2 => s_oper0,
      x_3 => s_oper1,
      -- Output
      sc => s_outpPS,
      ps => s_outpCS);

  -------------------------------------------------------------------------------
  -- Register for the Pseudo-Sum and Carry vectors
  -------------------------------------------------------------------------------
  p6 : PS_C_Reg port map (
    clk => clk,
    clear => s_clear,
    load => s_load,
    reg_in0 => s_outpPS,
    reg_in1 => s_outpCS,
    reg_out0 => s_outpPSnext,
    reg_out1 => s_outpCSnext);

  -----------------------------------------------------------------------------
  -- 16-bit Carry Propagate Adder
  -----------------------------------------------------------------------------
  p8 : cpa_unit
    port map (
      a => s_outpPSnext,
      b => s_outpCSnext,
      cin => s_cpa,
      sum => s_chksum,
      cout => s_cpa);

  -----------------------------------------------------------------------------
  -- 16-bit One's Complement
  -----------------------------------------------------------------------------
  p9 : ones_complement_unit
    port map (
      x => s_chksum,
      y => s_fin_chksum);

  -----------------------------------------------------------------------------
  -- Register for the checksum
  -----------------------------------------------------------------------------
  p10 : chksum_reg
    port map (
      clk => clk,
      clear => s_clear_chksum,
      load => s_load_chksum,
      reg_in => s_fin_chksum,
      reg_out => chksum);



end architecture behavioural;

-------------------------------------------------------------------------------
-- Full Adder module
-------------------------------------------------------------------------------
library ieee;
use ieee.std_logic_1164.all;

entity full_adder is
  port (x, y, c_in : in std_logic;
        c_out, z : out std_logic
  );
end full_adder;

architecture rtl of full_adder is
begin
  z <= x xor y xor c_in;
  c_out <= ((x and y) or (x and c_in) or (y and c_in));
end architecture rtl;

-------------------------------------------------------------------------------
-- 4:2 module
-------------------------------------------------------------------------------
library ieee;
use ieee.std_logic_1164.all;

entity four_to_two is
  port (
    x_0, x_1, x_2, x_3, c_in0, c_in1 : in std_logic;
    c_out0, c_out1, s0, s1 : out std_logic);
end four_to_two;

architecture rtl of four_to_two is

  component full_adder
    port (
      x, y, c_in : in std_logic;
      c_out, z : out std_logic);
  end component;

  signal a : std_logic;

begin

  fa1 : full_adder
    port map(
      x => x_0,
      y => x_1,
      c_in => x_2,
      c_out => c_out0,
      z => a);

  fa2 : full_adder
    port map(
      x => x_3,
      y => a,
      c_in => c_in0,
      c_out => c_out1,
      z => s1);

  s0 <= c_in1;

end architecture rtl;

-------------------------------------------------------------------------------
-- Register for the Pseudo-Sum and Carry vectors
-------------------------------------------------------------------------------
library ieee;
use ieee.std_logic_1164.all;

entity PS_C_Reg is
  port (
    clk : in std_logic;
    clear : in std_logic;
    load : in std_logic;
    reg_in0 : in std_logic_vector(15 downto 0);
    reg_in1 : in std_logic_vector(15 downto 0);
    reg_out0 : out std_logic_vector(15 downto 0);
    reg_out1 : out std_logic_vector(15 downto 0));
end PS_C_Reg;

architecture rtl of PS_C_Reg is
begin

  -- clear acts asynchronously, so it is listed in the sensitivity list
  process (clk, clear)
  begin


    if clear = '1' then
      reg_out0 <= (others => '0');
      reg_out1 <= (others => '0');
    elsif clk'event and clk = '1' then
      if load = '1' then
        reg_out0 <= reg_in0;
        reg_out1 <= reg_in1;
      end if;
    end if;
  end process;

end architecture rtl;

-------------------------------------------------------------------------------
-- 16-bit multioperand adder made of 4:2 adder modules
-------------------------------------------------------------------------------
library ieee;
use ieee.std_logic_1164.all;

entity four_to_two_16bit_unit is
  port (x_0, x_1, x_2, x_3 : in std_logic_vector(15 downto 0);
        sc, ps : out std_logic_vector(15 downto 0)
  );
end four_to_two_16bit_unit;

architecture rtl of four_to_two_16bit_unit is

  component four_to_two
    port (
      x_0, x_1, x_2, x_3, c_in0, c_in1 : in std_logic;
      c_out0, c_out1, s0, s1 : out std_logic);
  end component;

  signal s_c0 : std_logic_vector(15 downto 0);
  signal s_c1 : std_logic_vector(15 downto 0);

begin

  -- Bit 0 takes its carries from bit 15 (end-around carry)
  u0 : four_to_two
    port map (
      x_0 => x_0(0), x_1 => x_1(0), x_2 => x_2(0), x_3 => x_3(0),
      c_in0 => s_c0(15), c_in1 => s_c1(15),
      c_out0 => s_c0(0), c_out1 => s_c1(0),
      s0 => ps(0), s1 => sc(0));

  G2 : for N in 1 to 14 generate
    four_to_two_array : four_to_two
      port map (
        x_0 => x_0(N), x_1 => x_1(N), x_2 => x_2(N), x_3 => x_3(N),
        c_in0 => s_c0(N-1), c_in1 => s_c1(N-1),
        c_out0 => s_c0(N), c_out1 => s_c1(N),
        s0 => ps(N), s1 => sc(N));
  end generate G2;

  u15 : four_to_two
    port map (
      x_0 => x_0(15), x_1 => x_1(15), x_2 => x_2(15), x_3 => x_3(15),
      c_in0 => s_c0(14), c_in1 => s_c1(14),
      c_out0 => s_c0(15), c_out1 => s_c1(15),
      s0 => ps(15), s1 => sc(15));

end architecture rtl;

-------------------------------------------------------------------------------
-- 16-bit Carry Propagate Adder
-------------------------------------------------------------------------------
library ieee;
use ieee.std_logic_1164.all;

entity cpa_unit is
  generic (left : natural := 15);
  port (a : in std_logic_vector(left downto 0);
        b : in std_logic_vector(left downto 0);
        cin : in std_logic;
        sum : out std_logic_vector(left downto 0);
        cout : out std_logic);
end entity cpa_unit;

architecture rtl of cpa_unit is
begin
  adder : process (a, b, cin)
    variable carry : std_logic;
    variable isum : std_logic_vector(left downto 0);
  begin

    carry := cin;

    -- Loop over the generic width instead of a hard-coded 0 to 15
    for N in 0 to left loop
      isum(N) := a(N) xor b(N) xor carry;
      carry := (a(N) and b(N)) or (a(N) and carry) or (b(N) and carry);
    end loop;

    sum <= isum;


    cout <= carry;
  end process adder;

end architecture rtl;

-------------------------------------------------------------------------------
-- 16-bit One's Complement
-------------------------------------------------------------------------------
library ieee;
use ieee.std_logic_1164.all;

entity ones_complement_unit is
  generic (left : natural := 15);
  port (x : in std_logic_vector(left downto 0);
        y : out std_logic_vector(left downto 0));
end ones_complement_unit;

architecture rtl of ones_complement_unit is
begin
  y <= not x;
end architecture rtl;

-------------------------------------------------------------------------------
-- Register for the checksum
-------------------------------------------------------------------------------
library ieee;
use ieee.std_logic_1164.all;

entity chksum_reg is
  port (
    clk : in std_logic;
    clear : in std_logic;
    load : in std_logic;
    reg_in : in std_logic_vector(15 downto 0);
    reg_out : out std_logic_vector(15 downto 0));
end chksum_reg;

architecture rtl of chksum_reg is
begin

  -- clear acts asynchronously, so it is listed in the sensitivity list
  process (clk, clear)
  begin
    if clear = '1' then
      reg_out <= (others => '0');
    elsif clk'event and clk = '1' then
      if load = '1' then
        reg_out <= reg_in;
      end if;
    end if;
  end process;

end architecture rtl;
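
When checking simulation output of the checksum datapath, a small software reference can be useful. The Python sketch below is not part of the thesis sources and its function names are hypothetical. It assumes the unit computes the standard 16-bit one's complement checksum: each 32-bit memory word is split into two 16-bit halves (as the step2 state does), the halves are added with end-around carry (as the carry feedback on the carry propagate adder suggests), and the final sum is inverted.

```python
def ones_complement_add16(a: int, b: int) -> int:
    """Add two 16-bit values with end-around carry."""
    total = a + b
    # Fold the carry out of bit 15 back into bit 0.
    return (total & 0xFFFF) + (total >> 16)


def checksum_from_words(words32) -> int:
    """Return the 16-bit one's complement checksum of 32-bit words."""
    acc = 0
    for word in words32:
        # Split each word into its low and high 16-bit halves.
        acc = ones_complement_add16(acc, word & 0xFFFF)
        acc = ones_complement_add16(acc, (word >> 16) & 0xFFFF)
    # The checksum is the bitwise inverse of the accumulated sum.
    return (~acc) & 0xFFFF


if __name__ == "__main__":
    # A 20-byte IPv4 header with its checksum field set to zero.
    header = [0x45000073, 0x00004000, 0x40110000, 0xC0A80001, 0xC0A800C7]
    print(hex(checksum_from_words(header)))  # prints 0xb861
```

Running the same function over a header that already carries a correct checksum yields zero, which gives a quick pass/fail check for test vectors.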


Appendix F

ICMP Message Types


Type  Code  Description                                               Q  E  S

0     0     echo reply (Ping reply)                                   X     Must
3           destination unreachable:
      0     network unreachable                                          X  Must
      1     host unreachable                                             X  Must
      2     protocol unreachable                                         X  Must
      3     port unreachable                                             X  Must
      4     fragmentation needed but don't-fragment bit set              X  Must
      5     source route failed                                          X  Must
      6     destination network unknown                                  X  Must
      7     destination host unknown                                     X  Must
      8     source host isolated                                         X  Must
      9     destination network administratively prohibited              X  Must
      10    destination host administratively prohibited                 X  Must
      11    network unreachable for TOS                                  X  Must
      12    host unreachable for TOS                                     X  Must
      13    communication administratively prohibited by filtering       X  Must
      14    destination host precedence violation                        X  Must
      15    precedence cutoff in effect                                  X  Must
4     0     source quench (elementary flow control)                      X  Should not
5           redirect:
      0     redirect for network                                         X  May
      1     redirect for host                                            X  May
      2     redirect for TOS and network                                 X  May
      3     redirect for TOS and host                                    X  May
8     0     echo request (Ping request)                               X     Must
9     0     router advertisement                                      X     Must
10    0     router solicitation                                       X     Must
11          time exceeded:
      0     time-to-live 0 during transit (Traceroute)                   X  Must
      1     time-to-live 0 during reassembly                             X  Must
12          parameter problem:
      0     IP header bad (catchall error)                               X  May
      1     required option missing                                      X  May
13    0     timestamp request                                         X     May
14    0     timestamp reply                                           X     May
15    0     information request (obsolete)                            X     Should not
16    0     information reply (obsolete)                              X     Should not
17    0     address mask request                                      X     May
18    0     address mask reply                                        X     May

Table F.1: Q = Query, E = Error, S = Support (Must, May, or Should not)
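
The query/error split in Table F.1 decides how a NAPT treats an ICMP message, so it lends itself to a simple lookup. The following Python sketch is illustrative only and is not part of the thesis sources. The names are hypothetical, and it assumes the Q and E columns follow the standard classification of ICMP types.

```python
# ICMP types listed as queries in Table F.1 (assumed standard classification).
ICMP_QUERY_TYPES = {0, 8, 9, 10, 13, 14, 15, 16, 17, 18}
# ICMP types listed as errors in Table F.1.
ICMP_ERROR_TYPES = {3, 4, 5, 11, 12}


def classify_icmp(icmp_type: int) -> str:
    """Return 'query', 'error', or 'unknown' for an ICMP type number."""
    if icmp_type in ICMP_QUERY_TYPES:
        return "query"
    if icmp_type in ICMP_ERROR_TYPES:
        return "error"
    # Types outside Table F.1 are not classified here.
    return "unknown"
```

For example, `classify_icmp(8)` returns `"query"` for an echo request and `classify_icmp(3)` returns `"error"` for destination unreachable.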


Appendix G

CPU Interrupt Codes


Protocol  Hardware to CPU    CPU to hardware  Task

UDP       20-Create                           Create session
                             20.0             Create session in CAM tables, set status in
                                              session memory, and start session timer
                             20.1             Fail, discard
TCP       40-SYN create                       Create session
                             40.0             Create session in CAM tables, set status in
                                              session memory, and start session timer
                             40.1             Fail, discard
                             40.2             Restart session timer
                             40.3             Fail, discard
          41-Establishment                    Packet part of conn. est. sequence
                             41.0             Success, update session memory
                             41.1             Fail, discard
                             41.2             Success, update session memory and
                                              change session timer (2H4M)
                             41.3             Fail, discard
          42-Termination                      Packet part of termination sequence
                             42.0             Success, update session memory
                             42.1             Fail, discard
                             42.2             Success, update session memory and
                                              change session timer (4M)
                             42.3             Fail, discard
                             42.4             Success, change session timer (0M)
                             42.5             Fail, discard
ICMP      60-Query request                    Create session, query
                             60.0             Create session in CAM tables, set status in
                                              session memory, and start session timer
                             60.1             Fail, discard
          63-Query reply                      Packet part of query sequence
                             63.0             Success, change session timer (0M)
                             63.1             Fail, discard
          65-Error                            ICMP error
                             65.0             Accepted
                             65.1             Fail, discard
          67-Error create                     Create ICMP error
                             67.0             Forward the ICMP error type X code X
                                              to int./ext. interface
Delete    70-Delete                           Delete session
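
The CPU-to-hardware codes in the table have the form request.subcode, and every odd sub-code listed is a "Fail, discard" response. The Python sketch below shows one way such codes could be decoded. It is a hypothetical illustration, not part of the thesis sources, and the odd-means-failure rule is an observation from the table rather than a rule the thesis states.

```python
def parse_code(code: str):
    """Split a code such as '41.2' into (request, subcode) integers."""
    request, _, subcode = code.partition(".")
    return int(request), int(subcode)


def is_failure(code: str) -> bool:
    """Return True for odd sub-codes, which the table lists as failures."""
    _, subcode = parse_code(code)
    return subcode % 2 == 1
```

With this rule, `is_failure("42.5")` is true and `is_failure("40.2")` is false, matching the table rows for those codes.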


Appendix H

CD-ROM Content

Overview of the CD-ROM:

H.1 General

doc: The report.

H.2 Source Code

VHDL source code
src/vhd/chksumProcCore.vhd: The Checksum Processor
src/vhd/setDataUnit.vhd: Prepare NAPT data from packet buffer unit
src/vhd/getDataUnit.vhd: Write-back NAPT data to packet buffer unit
src/vhd/hnatCore.vhd: The Hardware Network Address Translation core
src/vhd/netwRx.vhd: Receive Packet from network unit
src/vhd/netwTx.vhd: Transmit Packet to network unit
src/vhd/sram.vhd: 32-bit Static Random Access Memory (session data)
src/vhd/pktMem.vhd: 32-bit Static Random Access Memory (packet data)


src/vhd/sysCntlUnit.vhd: Test bench controller
src/vhd/tcam2_top.vhd: Ternary Content Addressable Memory
src/vhd/tcam2_top-behav.vhd: TCAM model
src/vhd/sysclk.vhd: System clock
src/vhd/testBanch.vhd: Test bench

TCL compile script
src/tcl/compile_script.tcl: TCL compile script

H.3 Test vectors

Test of the mapping behavior
test/CAM0boundary/CAM0boundary.vec: Test of the boundaries of CAM-0
test/CAM0boundary/CAM0boundaryOut.vec: Output

Test of the filtering behavior
test/CAM1boundary/CAM1boundary.vec: Test of the boundaries of CAM-1 and for endpoint-independent behavior
test/CAM1boundary/CAM1boundaryOut.vec: Output

Test of the session timer
test/sessionTimer/timerBehav.vec: Show the basic behavior of the timer on a session
test/sessionTimer/timerBehavOut.vec: Output

test/sessionTimer/ACTbit.vec: Show the effect of a session's ACT bit on the session timer
test/sessionTimer/ACTbitOut.vec: Output

test/sessionTimer/timerExp.vec: Show the availability of a port number after a session is deleted, i.e. after the session timer has expired
test/sessionTimer/timerExpOut.vec: Output

Test of the delete procedure
test/sessionDelete/testDelete.vec: Show the deletion of a sub-session
test/sessionDelete/testDeleteOut.vec: Output


Appendix I

List of Acronyms

ALG Application-Level Gateway

ASCII American Standard Code for Information Interchange

CAM Content Addressable Memory

CMOS Complementary Metal Oxide Semiconductor

DCCP Datagram Congestion Control Protocol

DHCP Dynamic Host Configuration Protocol

DNS Domain Name System

FIFO First In First Out

FTP File Transfer Protocol

HDL Hardware Description Language

HTTP Hypertext Transfer Protocol

IANA Internet Assigned Numbers Authority

ICMP Internet Control Message Protocol

ICS Internet Connection Sharing


IGD Internet Gateway Device

IGMP Internet Group Management Protocol

IP Internet Protocol

IPsec Internet Protocol security

IPv4 Internet Protocol version 4

IPv6 Internet Protocol version 6

ISP Internet Service Provider

LAN Local Area Network

MAC Media Access Control

MOS Metal Oxide Semiconductor

MTU Maximum Transmission Unit

NAT Network Address Translation

NAPT Network Address and Port Translation

NCP Network Control Protocol

NFS Network File System

NIC Network Interface Controller

POP3 Post Office Protocol 3

QoS Quality of Service

RTCP Real-time Transport Control Protocol

RTP Real-time Transport Protocol

SCTP Stream Control Transmission Protocol

SMTP Simple Mail Transfer Protocol

SoHo Small office/Home office

SPI Stateful Packet Inspection

UNSAF UNilateral Self-Address Fixing

UPnP Universal Plug and Play

VHDL Very high speed integrated circuit Hardware Description Language

VoIP Voice over Internet Protocol

Page 235: Design of a Hardware - DTU Electronic Theses and …etd.dtu.dk/thesis/244423/ep09_27.pdf · Technical University of Denmark Department of Informatics and Mathematical Modelling Design

Bibliography

[1] http://www.livinginternet.com/i/ii summary.htm.

[2] http://www.potaroo.net/tools/ipv4/index.html.

[3] http://www.ibiblio.org/lunarbin/worldpop.

[4] V. Volpe L. DiBurro A. Huttunen, B. Swander and M. Stenberg. Udpencapsulation of ipsec esp packets. RFC 3948, January 2005.

[5] Wilson G. Biggadike A., Ferullo D. and Perrig A. Natblaster: Establish-ing tcp connections between hosts behind nats. In Proceedings of ACMSIGCOMM ASIA Workshop (Beijing, China), April 2005.

[6] S. Bradner. Key words for use in rfcs to indicate requirement levels. RFC2119, Harvard University, March 1997.

[7] Elizabeth J. Brauer and Yusuf Leblebici. Sub-70 ps full adder in moscurrent-mode logic using 0.18 ım cmos technology. Northern ArizonaUniversity and Swiss Federal Institute of Technology, December 2003.

[8] Pyda Srisuresh Bryan Ford and Dan Kegel. State of peer-to-peer (p2p) com-munication across network address translators (nats). RFC 5128, March2008.

[9] S. Deering and R. Hinden. Internet protocol, version 6 (ipv6) specification.RFC 2460, December 1998.

[10] Audet F. Network address translation (nat) behavioral requirements forunicast udp. RFC 4787, January 2007.

Page 236: Design of a Hardware - DTU Electronic Theses and …etd.dtu.dk/thesis/244423/ep09_27.pdf · Technical University of Denmark Department of Informatics and Mathematical Modelling Design

212 BIBLIOGRAPHY

[11] Srisuresh P. Ford B. and Kegel D. Peer-to-peer communication acrossnetwork address translators. In Proceedings of the 2005 USENIX AnnualTechnical Conference (Anaheim, CA), April 2005.

[12] Huston G. Anatomy - a look inside network address translators. August2004.

[13] Biswas K. Guha S. and Kegel D. Nat behavioral requirements for tcp. RFC5382, October 2008.

[14] Takeda Y. Guha S. and Francis P. Nutss: A sip-based approach to udpand tcp network connectivity. In Proceedings of SIGCOMM’04 Workshops(Portland, OR), pp. 43-48, August 2004.

[15] Steinar H. Gunderson. Global ipv6 statistics - measuring the current stateof ipv6 for ordinary users. RIPE57 DUBAI, October 2008.

[16] C. Huitema. Real time control protocol (rtcp) attribute in session descrip-tion protocol (sdp). RFC 3605, October 2003.

[17] Postel J. Transmission control protocol, std 7. RFC 793, September 1981.

[18] C. Huitema J. Rosenberg and R. Mahy. Traversal using relays around nat(turn): Relay extensions to session traversal utilities for nat (stun). draft-ietf-behave-turn-14, April 2009.

[19] P. Matthews J. Rosenberg, R. Mahy and D. Wing. Session traversal utilitiesfor nat (stun). RFC 5389, October 2008.

[20] Egevang K. and P. Francis. The ip network address translator (nat). RFC1631, May 1994.

[21] S. Kent and R. Atkinson. Security architecture for the internet protocol.RFC 2401, November 1998.

[22] Eppinger J. L. Tcp connections for p2p apps: A software approach tosolving the nat problem. Tech. Rep. CMU-ISRI-05-104, Carnegie MellonUniversity, Pittsburgh, PA, January 2005.

[23] Stefan Lundstr’o m. Hardware design of a network address translator.Master’s thesis, Master Thesis, LuleaUniversity of Technology, 2002.

[24] Srisuresh P. and M. Holdrege. Ip network address translator (nat) termi-nology and considerations. RFC 2663, August 1999.

[25] B. Ford S. Sivakumar P. Srisuresh and S. Guha. Nat behavioral require-ments for icmp. RFC 5508, April 2009.

[26] J.B. Postel. User datagram protocol. RFC 768, August 1980.

Page 237: Design of a Hardware - DTU Electronic Theses and …etd.dtu.dk/thesis/244423/ep09_27.pdf · Technical University of Denmark Department of Informatics and Mathematical Modelling Design

BIBLIOGRAPHY 213

[27] J.B. Postel. Internet protocol. RFC 791, September 1981.

[28] Braden R. Requirements for internet hosts - communication layers, std 3.RFC 1122, October 1989.

[29] Stevens W. Richard. TCP/IP Illustrated Volume 1 - The Protocols.Addison-Wesley, 1994.

[30] J. Rosenberg. Interactive connectivity establishment (ice): A protocol fornetwork address translator (nat) traversal for offer/answer protocols. draft-ietf-mmusic-ice-19, May 2008.

[31] Huitema C. Rosenberg J., Weinberger J. and R. Mahy. Stun - simple traver-sal of user datagram protocol (udp) through network address translators(nats). RFC 3489, March 2003.

[32] Guha S. and Francis P. Characterization and measurement of tcp traver-sal through nats and firewalls. Proceedings of the Internet MeasurmentConference (Berkeley CA), October 2005.

[33] Hain T. Architectural implications of nat. RFC 2993, November 2000.

[34] A. Huttunen T. Kivinen, B. Swander and V. Volpe. Negotiation of nat-traversal in the ike. RFC 3947, January 2005.

[35] Teruhiko Nagatomo Tomokazu Aoki and Kazuya Asano. High-speedip/ipsec processor lsis. FUJITSU Sci Tech. J. 42,2,p.214-226(April 2006),2005.