The United States air transportation network analysis
Dorothy Cheung
Introduction
• The problem and its importance
• Missing Pieces
• Related works in summary
• Methodology
  – Data set
  – Network Generation
  – Network Analysis
• Conclusion
The problem and its importance
• Problem
  – Analyze the air transportation network in the U.S.
    • The network is driven by profits and politics
    • Goal: better understand the network structure, not maximize utility
• Importance
  – Economy: transport of goods and services
  – Air traffic flow: convenience
  – Health studies: propagation of diseases
Missing pieces
• Ample research exists on the network, with a focus on utility optimization.
• Commercial enterprises: OAG and Innovata
• But … there is a lack of research analyzing the network features studied in class.
Related works
Air transportation network analysis
• WAN – World-wide Airport Network
• ANI – Airport Network of India
• ANC – Airport Network of China
Related works
Summary: features of air transportation networks
• Small-world network (compared with random graphs)
  – Small average shortest path
  – High average clustering coefficient
  – Degree mixing differs
• Scale-free, power-law degree distribution

                        WAN          ANI             ANC
Avg. shortest path      4.4          4               2.067
Avg. clustering coef.   0.62         0.6574          0.733
Degree mixing           Assortative  Disassortative  Disassortative
Power-law exponent      1.0          2.2 +/- 0.1     1.65
Methodology
• Data Set
• Network Generation
• Network Analysis
Methodology – Data Set
Legend
OAI : Office of Airline Information
RITA : Research and Innovative Technology Administration
BTS : Bureau of Transportation Statistics
[Diagram labels: T100, OAI, RITA, BTS database, My data]
Methodology – Data Set
Domestic Air Traffic Hubs [1]
Methodology – Data Set
• Domestic scheduled flights
  – Passengers, cargos, and mail
  – Military excluded
• Market Data vs. Segment Data
  – Market : Used
    • Counts a passenger once on the same flight number
  – Segment : Not used
    • Counts a passenger once per leg, so possibly more than once
• Month specific : July 2011
Methodology – Data Set
• Relevant information
  – Number of Passengers
  – Number of Cargos : Freight and Mail
  – Origin City
  – Destination City

PASSENGERS,FREIGHT,MAIL,ORIGIN_CITY_NAME,DEST_CITY_NAME,DEST_CITY_NUM,DEST_STATE_ABR,DEST_STATE_FIPS,DEST_STATE_NM,DEST_WAC,YEAR,QUARTER,MONTH,DISTANCE_GROUP,CLASS
59,700,17,"Akhiok, AK","Kodiak, AK",1017,AK,2,Alaska,1,2011,3,7,1,F
19,200,2,"Akhiok, AK","Kodiak, AK",1017,AK,2,Alaska,1,2011,3,7,1,L
24,0,0,"Akhiok, AK","Kodiak, AK",1017,AK,2,Alaska,1,2011,3,7,1,F
2,0,0,"Akiachak, AK","Akiak, AK",1024,AK,2,Alaska,1,2011,3,7,1,F
176,47748,2250,"Adak Island, AK","Anchorage, AK",1029,AK,2,Alaska,1,2011,3,7,3,F
20,0,0,"Adak Island, AK","Anchorage, AK",1029,AK,2,Alaska,1,2011,3,7,3,L
105,28,320,"Akiachak, AK","Bethel, AK",1055,AK,2,Alaska,1,2011,3,7,1,F

Sample .csv from BTS
Methodology – Network Generation
• Network
  – 850 nodes: airports
  – 21405 entries
    • Weighted edges: sum of passengers and cargos
  – Directed and undirected network input files for Pajek [2] and GUESS [5].
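The edge weighting described above can be sketched in Python. This is an illustration only, not the author's C# tool. It assumes the weight of an edge is passengers + freight + mail, added as reported without unit conversion, and that rows sharing the same origin and destination are summed. The function name `build_edges` and the trimmed `SAMPLE` text are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Three rows taken from the BTS sample, trimmed to the five relevant columns.
SAMPLE = '''PASSENGERS,FREIGHT,MAIL,ORIGIN_CITY_NAME,DEST_CITY_NAME
59,700,17,"Akhiok, AK","Kodiak, AK"
19,200,2,"Akhiok, AK","Kodiak, AK"
176,47748,2250,"Adak Island, AK","Anchorage, AK"
'''


def build_edges(csv_text):
    """Return (directed, undirected) dicts mapping city pairs to summed weights."""
    directed = defaultdict(float)
    undirected = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: weight = passengers + freight + mail, no unit conversion.
        weight = float(row["PASSENGERS"]) + float(row["FREIGHT"]) + float(row["MAIL"])
        origin, dest = row["ORIGIN_CITY_NAME"], row["DEST_CITY_NAME"]
        directed[(origin, dest)] += weight
        # The undirected key ignores direction by sorting the two endpoints.
        undirected[tuple(sorted((origin, dest)))] += weight
    return dict(directed), dict(undirected)


directed, undirected = build_edges(SAMPLE)
for pair, weight in sorted(directed.items()):
    print(pair, weight)
```

Keeping both dictionaries mirrors the slide's plan to emit directed and undirected input files from the same data.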
Methodology – Network Generation
[Diagram: .CSV → ParseCSV (Microsoft.Jet.OLEDB.4.0 provider) → Data Table → GenerateNwk (LINQ) → PajekDirected.net, PajekUndirected.net, GUESSDirected.gdf, GUESSUndirected.gdf]
Network Generation Tool written in C# using LINQ (Language Integrated Query)
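For readers unfamiliar with the Pajek input format, here is a minimal Python sketch of writing a weighted edge list as .net text. It is a stand-in for illustration, not the C# tool from the slide. The function name `to_pajek` and the example weights are hypothetical. The format itself (a `*Vertices` section with 1-based ids, followed by `*Arcs` for directed or `*Edges` for undirected links) is standard Pajek.

```python
def to_pajek(edges, directed=True):
    """Render a {(source, target): weight} dict as Pajek .net text."""
    labels = sorted({city for pair in edges for city in pair})
    # Pajek vertex ids start at 1.
    index = {label: i for i, label in enumerate(labels, start=1)}
    lines = ["*Vertices %d" % len(labels)]
    lines += ['%d "%s"' % (index[label], label) for label in labels]
    # *Arcs marks directed links, *Edges marks undirected links.
    lines.append("*Arcs" if directed else "*Edges")
    for (source, target), weight in sorted(edges.items()):
        lines.append("%d %d %s" % (index[source], index[target], weight))
    return "\n".join(lines) + "\n"


# Hypothetical two-edge example.
example = {
    ("Akhiok, AK", "Kodiak, AK"): 997,
    ("Adak Island, AK", "Anchorage, AK"): 50174,
}
net_text = to_pajek(example, directed=True)
print(net_text)
```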
Methodology – Network Generation
The U.S. Air Transportation Network drawn in Pajek
Methodology – Network Analysis
• Metrics
  – Degree distributions and correlations
    • Top 10 most connected cities
    • Top 10 most central cities
  – Small world network?
    • Shortest path length
    • Clustering coefficient
    • Compare against WAN, ANI, and ANC
  – Cumulative degree distribution and the power law
  – Resilience
  – Assortativity : Rich-club?
  – Random graph
  – Z-Score TBD?
Methodology – Network Analysis
– Degree distributions and correlations
  • Directed network
  • Pajek:
      In degree : Net -> Partitions -> Degree -> Input
      Out degree : Net -> Partitions -> Degree -> Output
      Both : Net -> Partitions -> Degree -> All
– Shortest path length
  • Directed network
  • Pajek:
      Net -> Paths between 2 vertices -> Diameter
– Clustering coefficient
  • Directed network
  • Pajek:
      Net -> Vector -> Clustering Coefficients -> CC1
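The three metrics above can also be computed outside Pajek. The following Python sketch is an independent illustration and not Pajek's implementation. It assumes an unweighted list of directed arcs, averages path lengths over reachable ordered pairs only, and ignores arc direction for the clustering coefficient. All function names and the toy hub-and-spoke network are hypothetical.

```python
from collections import deque


def degrees(arcs):
    """Return (in_degree, out_degree) dicts for a list of directed (u, v) arcs."""
    indeg, outdeg = {}, {}
    for u, v in arcs:
        for node in (u, v):
            indeg.setdefault(node, 0)
            outdeg.setdefault(node, 0)
        outdeg[u] += 1
        indeg[v] += 1
    return indeg, outdeg


def path_lengths(arcs):
    """Return (average shortest path length, diameter) over reachable ordered pairs."""
    adj = {}
    for u, v in arcs:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set())
    total, pairs, longest = 0, 0, 0
    for start in adj:
        # Breadth-first search gives hop distances from start.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in adj[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        for d in dist.values():
            if d > 0:
                total += d
                pairs += 1
                longest = max(longest, d)
    average = total / pairs if pairs else 0.0
    return average, longest


def average_clustering(arcs):
    """Average local clustering coefficient, ignoring arc direction."""
    adj = {}
    for u, v in arcs:
        if u != v:
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)
            continue
        # Count links among the neighbors, each unordered pair once.
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)


# Toy network: a hub with three spokes, plus one direct A-B route, both directions.
arcs = [("HUB", "A"), ("A", "HUB"), ("HUB", "B"), ("B", "HUB"),
        ("HUB", "C"), ("C", "HUB"), ("A", "B"), ("B", "A")]
in_degree, out_degree = degrees(arcs)
avg_path, diameter = path_lengths(arcs)
clustering = average_clustering(arcs)
```

Note that the diameter is the longest shortest path, which is a different quantity from the average shortest path length reported for WAN, ANI, and ANC.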
Methodology – Network Analysis
– Cumulative degree distribution and the power law
  • Directed network
  Step 1 in Pajek:
    – Create a partition of all degree
    – Export the partition in a tab delimited file
        Tools -> Export to Tab Delimited File -> Current Partition
  Step 2 in MATLAB [6]:
    – Generating a power law integer distribution
        X = GetInput.m : reads the partition from the tab delimited file (X => X.name, X.label, X.degree)
    – Calculating the cumulative distribution
        cumulativecounts.m [4]
        [xlincumulative,ylincumulative] = cumulativecounts(X.degree)
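The cumulative-distribution step can be illustrated in Python. This sketch is not a port of cumulativecounts.m, whose exact behavior is not shown in the slides. It assumes the cumulative distribution is P(K >= k) over the distinct degree values, and it adds a rough least-squares slope on log-log axes as a stand-in for a proper power-law fit. Function names and the degree list are hypothetical.

```python
import math
from collections import Counter


def cumulative_distribution(degree_values):
    """Return (ks, probs) where probs[i] is the fraction of nodes with degree >= ks[i]."""
    counts = Counter(degree_values)
    n = len(degree_values)
    ks = sorted(counts)
    remaining = n
    probs = []
    for k in ks:
        probs.append(remaining / n)
        remaining -= counts[k]
    return ks, probs


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x); a rough estimate only.

    All xs and ys must be positive, so zero-degree nodes have to be dropped first.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mean_x = sum(lx) / len(lx)
    mean_y = sum(ly) / len(ly)
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den


# Hypothetical degree sequence chosen so that P(K >= k) = 1/k exactly.
sample_degrees = [1, 1, 1, 1, 2, 2, 4, 8]
ks, probs = cumulative_distribution(sample_degrees)
slope = loglog_slope(ks, probs)
```

If the degree distribution follows p(k) ~ k^(-gamma), the cumulative distribution falls off as k^(-(gamma - 1)), so gamma is estimated as 1 minus the fitted slope. For the toy sequence the slope is -1, giving gamma = 2.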
Methodology – Network Analysis
– Resilience
  What percentage of nodes must be removed to reduce the size of the giant component by half?
• Consider:
  – Random attack
  – Targeted attack : remove nodes with the highest degree and betweenness centrality measures
• Undirected network with 850 nodes
• GUESS toolbars: resiliencedegree.py and resiliencebetweenness.py, downloaded from cTools [4]
• Compare against a random network (random and targeted attacks)
    GUESS : makeSimpleRandom(numberOfNodes, numberOfEdges)
    => numberOfNodes = 850, numberOfEdges = 21405
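The resilience question can be sketched in plain Python. This is not the GUESS resiliencedegree.py script, whose internals are not shown here. It assumes an undirected adjacency dict, and it ranks nodes for the targeted attack once by their degree in the intact network. All names and the toy star network are hypothetical.

```python
import random


def giant_component_size(adj, removed):
    """Size of the largest connected component after dropping the removed nodes."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            node = stack.pop()
            size += 1
            for nxt in adj[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best


def fraction_to_halve(adj, order):
    """Fraction of nodes removed, in the given order, before the giant component halves."""
    target = giant_component_size(adj, set()) / 2.0
    removed = set()
    for count, node in enumerate(order, start=1):
        removed.add(node)
        if giant_component_size(adj, removed) <= target:
            return count / len(adj)
    return 1.0


def degree_attack_order(adj):
    """Targeted attack: highest degree first, ranked once on the intact network."""
    return sorted(adj, key=lambda n: (-len(adj[n]), n))


def random_attack_order(adj, seed=0):
    """Random attack: a seeded shuffle so runs are repeatable."""
    order = sorted(adj)
    random.Random(seed).shuffle(order)
    return order


# Toy star network: one hub connected to six leaves.
leaves = ["L%d" % i for i in range(6)]
star = {"HUB": set(leaves)}
star.update({leaf: {"HUB"} for leaf in leaves})

targeted = fraction_to_halve(star, degree_attack_order(star))
randomized = fraction_to_halve(star, random_attack_order(star))
```

On the star, the targeted attack halves the giant component after removing the hub alone (1 of 7 nodes), while a random attack needs at most 4 of 7. Recomputing the component after every removal is quadratic in the number of nodes, which is acceptable for a network of 850 nodes.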
Methodology – Network Analysis
– Assortativity : Rich-club?
  • Draw conclusions from graphical analysis in GUESS
– Random graph
  • Difficulty in constructing a realistic random network that models the real network [3].
– Z-Score?
  • To be determined.
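As a numeric complement to the graphical analysis, degree mixing can be summarized by the degree assortativity coefficient, the Pearson correlation between the degrees at the two ends of each edge. The Python sketch below is a swapped-in technique for illustration, not the method used in the slides. It assumes an undirected, unweighted edge list, and the function name is hypothetical.

```python
def degree_assortativity(edges):
    """Pearson correlation of end-point degrees over an undirected edge list.

    Positive values indicate assortative mixing, negative values disassortative.
    Returns NaN when every node has the same degree (zero variance).
    """
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    xs, ys = [], []
    for u, v in edges:
        # Each undirected edge is counted once in each direction.
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]
    n = len(xs)
    mean = sum(xs) / n
    cov = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    if var == 0:
        return float("nan")
    return cov / var


# Toy star: the hub only connects to low-degree leaves, so mixing is disassortative.
star_edges = [("HUB", "A"), ("HUB", "B"), ("HUB", "C")]
r_star = degree_assortativity(star_edges)
```

A hub-and-spoke network like the toy star gives r = -1, the extreme disassortative case.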
Methodology – Network Analysis
• Expectations/Predictions
  – Larger-degree nodes are more central (betweenness). Consider LAX, SFO, HOU, JFK, etc.
  – Small world as compared to WAN, ANI, and ANC
  – Scale-free power-law distribution
  – Disassortative
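The first prediction, that larger-degree nodes are more central, can be checked by comparing degree with betweenness centrality. The Python sketch below uses Brandes' algorithm for unweighted graphs. It assumes an undirected adjacency dict and is an illustration only, not the Pajek or GUESS computation. The function name and toy network are hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph.

    adj maps each node to the set of its neighbors.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}    # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back toward s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Undirected graph: every pair was counted from both end points.
    return {v: c / 2.0 for v, c in score.items()}


# Toy star: the hub has the highest degree and lies on every leaf-to-leaf path.
star = {"HUB": {"A", "B", "C"}, "A": {"HUB"}, "B": {"HUB"}, "C": {"HUB"}}
central = betweenness(star)
by_degree = sorted(star, key=lambda n: -len(star[n]))
by_betweenness = sorted(central, key=lambda n: -central[n])
```

In the toy star the top node is the same under both rankings. On the real network the two top-10 lists can be compared in the same way.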
Conclusion
The United States air transportation network analysis
• The problem and its importance
• Missing Pieces
• Related works – WAN, ANI, ANC
• Methodology
  – Data set : BTS (Bureau of Transportation Statistics)
  – Network Generation : Directed and undirected network input files
  – Network Analysis :
    • Degree distribution
    • Small world network as compared to WAN, ANI, and ANC
    • Cumulative degree distribution and power law
    • Resilience
    • Assortativity
    • Z-score – TBD?
References for this presentation
1. T-100 reporting guide, RITA, http://www.rita.dot.gov/, www.transtats.bts.gov, http://www.bts.gov/programs/airline_information/.
2. Pajek, program for large network analysis, http://vlado.fmf.uni-lj.si/pub/networks/pajek/.
3. Albert-Laszlo Barabasi and Reka Albert, “Emergence of Scaling in Random Networks”, Department of Physics, University of Notre-Dame, October, 1999.
4. CTools, https://ctools.umich.edu/portal.
5. GUESS, graph exploration system, http://graphexploration.cond.org/.
6. Matlab, The language of technical computing, http://www.mathworks.com/products/matlab/index.html