The United States air transportation network analysis Dorothy Cheung


The United States air transportation network analysis

Dorothy Cheung

Introduction

• The problem and its importance

• Missing Pieces

• Related works in summary

• Methodology
  – Data set
  – Network Generation
  – Network Analysis

• Conclusion

Outline

• The problem and its importance

• Missing Pieces

• Related works

• Methodology
  – Data set
  – Network Generation
  – Network Analysis

• Conclusion

The problem and its importance

• Problem
  – Analyze the air transportation network in the U.S.
    • The network is driven by profits and politics
    • Goal: better understand the network structure, not maximize utility

• Importance
  – Economy: transport of goods and services
  – Air traffic flow: convenience
  – Health studies: propagation of diseases


Missing pieces

• A substantial amount of research on the network already exists, with a focus on utility optimization.

• Commercial enterprises: OAG and Innovata

• However, there is a lack of research analyzing the network features studied in class.


Related works

Air transportation network analyses

• WAN – World-wide Airport Network

• ANI – Airport Network of India

• ANC – Airport Network of China

Related works

Summary: features of air transportation networks

• Small-world network (compared with random graphs)
  – Small average shortest path
  – High average clustering coefficient
  – Degree mixing differs

• Scale-free: power-law degree distribution

                        WAN          ANI             ANC
Avg. shortest path      4.4          4               2.067
Avg. clustering coef.   0.62         0.6574          0.733
Degree mixing           Assortative  Disassortative  Disassortative
Power-law exponent      1.0          2.2 +/- 0.1     1.65



Methodology

• Data Set

• Network Generation

• Network Analysis

Methodology – Data Set

Legend
  OAI  : Office of Airline Information
  RITA : Research and Innovative Technology Administration
  BTS  : Bureau of Transportation Statistics

[Diagram labels: T100; OAI, RITA; BTS database; My data]


Methodology – Data Set

Domestic Air Traffic Hubs [1]

Methodology – Data Set

• Domestic scheduled flights
  – Passengers, cargo, and mail
  – Military excluded

• Market data vs. segment data
  – Market: used
    • Counts a passenger once for as long as they stay on the same flight number
  – Segment: not used
    • Counts a passenger again on every leg of the flight

• Month-specific: July 2011

Methodology – Data Set

• Relevant information
  – Number of passengers
  – Amount of cargo: freight and mail
  – Origin city
  – Destination city

PASSENGERS  FREIGHT  MAIL  ORIGIN_CITY_NAME  DEST_CITY_NAME  DEST_CITY_NUM  DEST_STATE_ABR  DEST_STATE_FIPS  DEST_STATE_NM  DEST_WAC  YEAR  QUARTER  MONTH  DISTANCE_GROUP  CLASS
59          700      17    Akhiok, AK        Kodiak, AK      1017           AK              2                Alaska         1         2011  3        7      1               F
19          200      2     Akhiok, AK        Kodiak, AK      1017           AK              2                Alaska         1         2011  3        7      1               L
24          0        0     Akhiok, AK        Kodiak, AK      1017           AK              2                Alaska         1         2011  3        7      1               F
2           0        0     Akiachak, AK      Akiak, AK       1024           AK              2                Alaska         1         2011  3        7      1               F
176         47748    2250  Adak Island, AK   Anchorage, AK   1029           AK              2                Alaska         1         2011  3        7      3               F
20          0        0     Adak Island, AK   Anchorage, AK   1029           AK              2                Alaska         1         2011  3        7      3               L
105         28       320   Akiachak, AK      Bethel, AK      1055           AK              2                Alaska         1         2011  3        7      1               F

Sample .csv from BTS
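As a rough illustration of how rows like these could be collapsed into weighted edges, here is a minimal Python sketch. It is not the author's tool. The column names come from the sample, but the comma-separated, quoted layout of `SAMPLE_CSV` and the helper name `aggregate_edges` are assumptions made for this example. The weight is taken to be passengers plus freight plus mail, summed over all rows that share an origin and a destination.

```python
import csv
import io
from collections import defaultdict

# Three rows copied from the sample; the quoting and delimiter are assumed.
SAMPLE_CSV = '''PASSENGERS,FREIGHT,MAIL,ORIGIN_CITY_NAME,DEST_CITY_NAME
59,700,17,"Akhiok, AK","Kodiak, AK"
19,200,2,"Akhiok, AK","Kodiak, AK"
176,47748,2250,"Adak Island, AK","Anchorage, AK"
'''

def aggregate_edges(csv_text):
    """Sum passengers + freight + mail for every (origin, destination) pair."""
    weights = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        load = float(row["PASSENGERS"]) + float(row["FREIGHT"]) + float(row["MAIL"])
        weights[(row["ORIGIN_CITY_NAME"], row["DEST_CITY_NAME"])] += load
    return dict(weights)

edges = aggregate_edges(SAMPLE_CSV)
for (origin, dest), weight in sorted(edges.items()):
    print(f"{origin} -> {dest}: {weight:.0f}")
```

Rows that differ only in CLASS (F or L in the sample) end up in the same edge here; whether the original tool merges them the same way is not stated on the slides.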

Methodology – Network Generation

• Network
  – 850 nodes: airports
  – 21405 entries
    • Weighted edges: sum of passengers and cargo
  – Directed and undirected network input files for Pajek [2] and GUESS [5]

Methodology – Network Generation

[Diagram: .CSV -> ParseCSV (Microsoft.Jet.OLEDB.4.0 provider) -> Data Table -> GenerateNwk (LINQ) -> PajekDirected.net, PajekUndirected.net, GUESSDirected.gdf, GUESSUndirected.gdf]

Network generation tool written in C# using LINQ (Language Integrated Query)
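The slides do not show the tool's code, so the following is only a hedged Python sketch of the last stage: turning a table of weighted edges into Pajek's plain-text .net layout (`*Vertices`, then `*Arcs` for a directed network or `*Edges` for an undirected one). The function name `to_pajek`, the sample weights, and the rule of adding the two directions together for the undirected file are assumptions for illustration.

```python
def to_pajek(edges, directed=True):
    """Render {(origin, dest): weight} as the text of a Pajek .net file."""
    nodes = sorted({city for pair in edges for city in pair})
    index = {city: i for i, city in enumerate(nodes, start=1)}  # Pajek ids start at 1

    if not directed:
        # Assumption: merge A->B and B->A into one edge by adding their weights.
        merged = {}
        for (a, b), w in edges.items():
            key = tuple(sorted((a, b)))
            merged[key] = merged.get(key, 0) + w
        edges = merged

    lines = [f"*Vertices {len(nodes)}"]
    lines += [f'{index[city]} "{city}"' for city in nodes]
    lines.append("*Arcs" if directed else "*Edges")
    lines += [f"{index[a]} {index[b]} {w}" for (a, b), w in sorted(edges.items())]
    return "\n".join(lines) + "\n"

sample = {
    ("Akhiok, AK", "Kodiak, AK"): 997,
    ("Kodiak, AK", "Akhiok, AK"): 3,  # made-up return flow, for illustration only
    ("Adak Island, AK", "Anchorage, AK"): 50174,
}
print(to_pajek(sample, directed=True))
print(to_pajek(sample, directed=False))
```

The GUESS .gdf files would need a different writer; only the Pajek side is sketched here.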


Methodology – Network Generation

The U.S. Air Transportation Network drawn in Pajek

Methodology – Network Analysis

• Metrics
  – Degree distributions and correlations
    • Top 10 most connected cities
    • Top 10 most central cities
  – Small-world network?
    • Shortest path length
    • Clustering coefficient
    • Compare against WAN, ANI, and ANC
  – Cumulative degree distribution and the power law
  – Resilience
  – Assortativity: rich club?
  – Random graph
  – Z-score: TBD?

Methodology – Network Analysis

– Degree distributions and correlations
  • Directed network
  • Pajek:
    In degree : Net -> Partitions -> Degree -> Input
    Out degree : Net -> Partitions -> Degree -> Output
    Both : Net -> Partitions -> Degree -> All

– Shortest path length
  • Directed network
  • Pajek:
    Net -> Paths between 2 vertices -> Diameter

– Clustering coefficient
  • Directed network
  • Pajek:
    Net -> Vector -> Clustering Coefficients -> CC1
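To make the three measures concrete outside Pajek, here is a small Python sketch on a hypothetical four-node network (`ARCS` is made up for this example). It assumes unweighted arcs, averages shortest paths over reachable ordered pairs only, and ignores arc direction for clustering; Pajek's own conventions may differ on each of these points.

```python
from collections import deque

# Hypothetical toy network; each arc is (origin, destination).
ARCS = [("A", "B"), ("B", "A"), ("A", "C"), ("C", "A"), ("B", "C"), ("C", "D")]

def degrees(arcs):
    """Return {node: (in_degree, out_degree)}."""
    deg = {}
    for u, v in arcs:
        deg.setdefault(u, [0, 0])[1] += 1
        deg.setdefault(v, [0, 0])[0] += 1
    return {n: tuple(d) for n, d in deg.items()}

def shortest_path_stats(arcs):
    """Average shortest path length over reachable ordered pairs, and the diameter."""
    succ = {}
    for u, v in arcs:
        succ.setdefault(u, set()).add(v)
        succ.setdefault(v, set())
    lengths = []
    for source in succ:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from source
            u = queue.popleft()
            for v in succ[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        lengths += [d for n, d in dist.items() if n != source]
    return sum(lengths) / len(lengths), max(lengths)

def average_clustering(arcs):
    """Average local clustering coefficient, ignoring arc direction."""
    nbrs = {}
    for u, v in arcs:
        if u != v:
            nbrs.setdefault(u, set()).add(v)
            nbrs.setdefault(v, set()).add(u)
    total = 0.0
    for ns in nbrs.values():
        k = len(ns)
        if k < 2:
            continue  # nodes with fewer than two neighbors contribute 0
        links = sum(1 for a in ns for b in ns if a < b and b in nbrs[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(nbrs)

print(degrees(ARCS))
print(shortest_path_stats(ARCS))
print(average_clustering(ARCS))
```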

Methodology – Network Analysis

– Cumulative degree distribution and the power law
  • Directed network

  Step 1 in Pajek:
    – Create a partition of all degrees
    – Export the partition to a tab-delimited file:
      Tools -> Export to Tab Delimited File -> Current Partition

  Step 2 in MATLAB [6]:
    – Generating a power-law integer distribution
      X = GetInput.m : reads the partition from the tab-delimited file (X => X.name, X.label, X.degree)
    – Calculating the cumulative distribution
      cumulativecounts.m [4]
      [xlincumulative, ylincumulative] = cumulativecounts(X.degree)
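The MATLAB helpers are not reproduced on the slides, so the Python sketch below only shows one plausible reading of the second step: for each distinct degree k, the fraction of nodes with degree at least k, followed by a crude least-squares slope on log-log axes. The names `cumulative_counts` and `loglog_slope` are hypothetical, and the behavior of the course's `cumulativecounts.m` may differ. For a power-law degree density with exponent gamma, the cumulative curve is expected to fall off with slope roughly -(gamma - 1).

```python
import math
from collections import Counter

def cumulative_counts(degrees):
    """For each distinct degree k, return the fraction of nodes with degree >= k."""
    counts = Counter(degrees)
    n = len(degrees)
    ks = sorted(counts)
    fractions = []
    remaining = n
    for k in ks:
        fractions.append(remaining / n)
        remaining -= counts[k]
    return ks, fractions

def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x); all values must be positive."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return num / den

# Made-up degree list; drop zero-degree nodes before fitting.
ks, fractions = cumulative_counts([1, 1, 1, 2, 2, 5])
print(ks, fractions)
print(loglog_slope(ks, fractions))
```

A straight-line fit on log-log axes is only a quick check; it is sensitive to the noisy tail of the distribution.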

Methodology – Network Analysis

– Resilience
  What percentage of nodes must be removed to reduce the size of the giant component by half?

• Consider:
  – Random attack
  – Targeted attack: remove the nodes with the highest degree and betweenness centrality

• Undirected network with 850 nodes

• GUESS toolbars: resiliencedegree.py and resiliencebetweenness.py, downloaded from cTools [4]

• Compare against a random network (random and targeted attacks)
  GUESS: makeSimpleRandom(numberOfNodes, numberOfEdges)
  => numberOfNodes = 850, numberOfEdges = 21405
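The GUESS scripts themselves are not shown, so this is a stand-alone Python sketch of the resilience question under stated assumptions: the network is undirected and unweighted, nodes are removed in a fixed order, and the targeted order uses the initial degree only (no recomputation after each removal, and no betweenness variant). The toy star network and all function names are hypothetical.

```python
import random
from collections import deque

def build_adj(edges):
    """Undirected adjacency sets from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def giant_component_size(adj, removed):
    """Size of the largest connected component after dropping `removed` nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best

def fraction_removed_to_halve(adj, order):
    """Remove nodes in `order` until the giant component is at most half its original size."""
    target = giant_component_size(adj, set()) / 2
    removed = set()
    for node in order:
        removed.add(node)
        if giant_component_size(adj, removed) <= target:
            return len(removed) / len(adj)
    return 1.0

# Toy hub-and-spoke network: one hub "H" and six leaves.
star = build_adj([("H", f"L{i}") for i in range(6)])
by_degree = sorted(star, key=lambda n: len(star[n]), reverse=True)
random_order = random.Random(0).sample(list(star), len(star))

print("targeted:", fraction_removed_to_halve(star, by_degree))
print("random:  ", fraction_removed_to_halve(star, random_order))
```

On a hub-and-spoke toy like this, removing the hub alone is enough in the targeted case, which is the kind of contrast the random-versus-targeted comparison is meant to expose.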

Methodology – Network Analysis

– Assortativity: rich club?
  • Draw conclusions from graphical analysis in GUESS

– Random graph
  • It is difficult to construct a realistic random network that models the real network [3].

– Z-score?
  • To be determined.

Methodology – Network Analysis

• Expectations/Predictions
  – Larger-degree nodes are more central (betweenness). Consider LAX, SFO, HOU, JFK, etc.
  – Small world, as compared to WAN, ANI, and ANC
  – Scale-free: power-law degree distribution
  – Disassortative
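The slides plan to judge degree mixing graphically in GUESS. As a numeric complement, and not as the method used in the presentation, degree mixing can be summarized by the Pearson correlation between the degrees at the two ends of each edge: negative values indicate disassortative mixing (hubs linked mostly to low-degree nodes), positive values indicate assortative mixing. The Python sketch below assumes an undirected, unweighted edge list; `degree_assortativity` is a hypothetical name.

```python
def degree_assortativity(edges):
    """Pearson correlation between the degrees at the two ends of each undirected edge."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Count every edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]

    mean = sum(xs) / len(xs)  # xs and ys hold the same values, so they share a mean
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return cov / var  # raises ZeroDivisionError if every node has the same degree

# A hub with three spokes is as disassortative as a network can be.
hub = [("H", "a"), ("H", "b"), ("H", "c")]
print(degree_assortativity(hub))
```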


Conclusion

The United States air transportation network analysis

• The problem and its importance

• Missing Pieces

• Related works – WAN, ANI, ANC

• Methodology
  – Data set: BTS (Bureau of Transportation Statistics)
  – Network Generation: directed and undirected network input files
  – Network Analysis:
    • Degree distribution
    • Small-world network as compared to WAN, ANI, and ANC
    • Cumulative degree distribution and power law
    • Resilience
    • Assortativity
    • Z-score: TBD?

References for this presentation

1. T-100 reporting guide, RITA, http://www.rita.dot.gov/, www.transtats.bts.gov, http://www.bts.gov/programs/airline_information/.
2. Pajek, program for large network analysis, http://vlado.fmf.uni-lj.si/pub/networks/pajek/.
3. Albert-Laszlo Barabasi and Reka Albert, “Emergence of Scaling in Random Networks”, Department of Physics, University of Notre Dame, October 1999.
4. CTools, https://ctools.umich.edu/portal.
5. GUESS, graph exploration system, http://graphexploration.cond.org/.
6. MATLAB, The language of technical computing, http://www.mathworks.com/products/matlab/index.html