Enhancing the Quality of Transferred Household Travel Survey Data: A Bayesian Updating Approach Using MCMC with Gibbs Sampling

Yongping Zhang
Kouros Mohammadian, PhD
Department of Civil and Materials Engineering
University of Illinois at Chicago

The 11th TRB National Transportation Planning Applications Conference, May 7, 2007



Page 2: Data Transferability

The idea is to use data collected in one context in a new context. This can reduce or eliminate the need for a large data collection in the application context.

Previous Studies
• ITE trip generation tables
• NCHRP 365 (Nancy McGuckin, et al.)
    • Highly aggregate
• ORNL’s NPTS/NHTS transferability study (Pat Hu, et al.)
    • Aggregate (CT level)
• Data simulation (Stopher and Greaves)
    • Disaggregate (HH level), C&RT classification method, limited number of independent variables

Page 3: Project Approach

• Consider a larger set of variables
    • NHTS and CTPP datasets
• Use quantifiable variables that can be easily predicted or are available from other sources (e.g., PUMS)
• Consider variables representing land use, urban form, and transportation system characteristics
• Advanced clustering, updating, and simulation approaches

Page 4: Data

Data Sources
• 2001 NHTS, 2000 CTPP, PUMS, 2003 TTI, Tiger/Line GIS data files

Data Cleaning
• 33 variables of demographics, socio-economics, and land use
• Individual level: age group, race/ethnicity, education, occupation
• Household level: HH size, income, adults, vehicles, drivers, workers
• Census tract level: housing, employment, and population densities
• New variables

Page 5: New Variables

• Intersection density (Tiger/Line)
    • No. of intersections / Area
• Road density (Tiger/Line)
    • Road length / Area
• Pedestrian environment (Tiger/Line)
    • Block size: Road length / No. of intersections
• Transit friendly environment (CTPP)
    • Transit users / Total no. of workers
    • Transit trips / Total no. of trips
• Congestion factor
    • Travel time index (TTI report for 85 MSAs)
    • Avg. travel time / Free flow TT in that region
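The ratio-type variables above can be sketched in a few lines of Python. This is a minimal illustration only: the `TractInputs` container, its field names, the units, and the sample numbers are all hypothetical and do not come from the deck.

```python
from dataclasses import dataclass


@dataclass
class TractInputs:
    """Hypothetical tract-level inputs; field names and units are illustrative."""
    area_sq_mi: float
    road_length_mi: float
    intersections: int
    transit_commuters: int
    workers: int


def contextual_variables(t: TractInputs) -> dict:
    """Compute the ratio-type variables listed on the slide."""
    return {
        "intersection_density": t.intersections / t.area_sq_mi,
        "road_density": t.road_length_mi / t.area_sq_mi,
        # Block-size proxy: average road length per intersection.
        "block_size": t.road_length_mi / t.intersections,
        "transit_share_of_workers": t.transit_commuters / t.workers,
    }


def travel_time_index(avg_travel_time: float, free_flow_time: float) -> float:
    """Congestion factor: average travel time over free-flow travel time."""
    return avg_travel_time / free_flow_time


# Made-up example tract.
tract = TractInputs(area_sq_mi=2.0, road_length_mi=30.0, intersections=120,
                    transit_commuters=150, workers=1000)
variables = contextual_variables(tract)
```

A travel time index above 1 indicates that average travel takes longer than free-flow travel in the region.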

Page 6: Dependent Variables

Travel characteristics (from the NHTS trip file, aggregated to HH level):
• VMT for each household
• No. of trips
• No. of mandatory trips
• No. of maintenance trips
• No. of discretionary trips
• No. of transit trips in the HH
• No. of private vehicle trips
• No. of non-motorized (bicycle and walk) trips
• No. of tours
• Average trips per tour
• Average trip distance in miles for all HH members
• No. of transit users in the HH
• No. of carpool users in the HH
• Percentage of public transit usage in the HH
• Percentage of carpool usage among workers in the HH
• Total commute distance in the HH
• Average commute distance in the HH

Page 7: [Flowchart of the overall framework. Boxes: NHTS 2001; Transferability Model; ANN Module; Look-up Tables; Formatted PUMS; Population Synthesizer; Synthetic Households with 33 Variables; Synthetic HHs with Cluster Membership; Transferred Travel Attributes; Bayesian Updating; Updated Transferred Travel Attributes; Monte Carlo Simulation; Simulated Travel Attributes; Add-on Data in NHTS 2001; Validation Data with Cluster Membership; Validation with Simulation.]

Page 8: Clustering

• Classification schema is a critical issue
• Clustering methods tested include: K-Means, hierarchical, C&RT, TwoStep, ANN
• 11 clusters were generated using the TwoStep clustering method
• ONLY national data is used

Page 9: Clusters

1. Rich and Smart:
    • middle-aged families
    • professional or managerial white collar jobs
    • graduate degrees
    • high incomes
    • majority live in suburbs
    • greater part are White, but also some Asian
2. Young Achievers:
    • young couples without children or mainly with pre-school children
    • college degrees
    • white collar jobs in sales, service, technical, and professional fields
    • mid-range income
    • higher percentages live in suburbs or rural areas
3. Kids-centered Families:
    • middle-aged, working class families
    • pre-school and school age children
    • usually have college education
    • mid-range to high income
    • primarily White and live in suburbs or towns

Page 10: Clusters, cont.

4. Rural Blues:
    • working class, middle-aged families
    • pre-school and school age children
    • mainly high school graduates
    • blue collar jobs (farming, manufacturing, etc.)
    • low to mid-range income
    • greater part are White and mainly live in rural areas or small towns
5. Working Mixing Pot:
    • working class White, Black, Asian, or Hispanic
    • single adults or couples
    • college or high school education
    • low to mid-range income
6. Mainstream Families:
    • mid-scale, upper-middle age, White
    • large working class couples or families with older children
    • college or high school education
    • mid-range to high income
    • suburbs or rural areas

Page 11: Clusters, cont.

7. Senior Couples:
    • senior couples
    • majority working and some retired
    • greater part are White, but include some Black, Asian, or American-Indians
    • suburbs or rural areas
8. Sustaining Minority Families:
    • low income
    • middle-aged, working class families
    • mainly Hispanic or Black, but also some Asian and White
    • majority have not finished high school
    • service, sales, manufacturing, farming, or construction jobs
9. Forever Youngs:
    • White senior couples, empty nesters
    • mostly retired, but some have sales, service, or managerial jobs
    • low to mid-range income

Page 12: Clusters, cont.

10. Traditional Seniors:
    • mainly retired single individuals and some retired couples
    • low income
    • majority are White, but some Black, Asian, or American-Indians
11. Neo Urbans:
    • small families/couples or single individuals
    • dense urban areas
    • college education
    • low to mid-range income
    • sales, service, or professional jobs
    • dominant race is White, but a significant number are Black, Asian, and Hispanic

Page 13: Cluster-Based Travel Characteristics

Cluster | Statistic | No. of HH Trips | No. of Auto Trips | No. of Non-Motorized Trips | % of Transit | % of Carpool | Avg Commute Distance | No. of Tours | Avg Trip Per Tour | Avg Annual VMT | Mandatory Trips per Person | Maintenance Trips per Person | Discretionary Trips per Person | Auto Trips per Person | Transit Trips per Person | Non-Motorized Trips per Person | Tours per Person
1 | Mean | 14.84 | 12.90 | 1.47 | 0.03 | 0.05 | 19.43 | 10.77 | 1.41 | 13,008 | 0.68 | 1.27 | 1.10 | 3.96 | 0.05 | 0.44 | 3.28
1 | S.D. | 8.47 | 11.74 | 3.18 | 0.11 | 0.19 | 132.95 | 9.43 | 0.40 | 11,984 | 0.70 | 1.10 | 0.95 | 2.06 | 0.21 | 0.74 | 1.45
2 | Mean | 12.95 | 11.50 | 1.10 | 0.02 | 0.07 | 15.90 | 9.47 | 1.39 | 13,450 | 0.63 | 1.09 | 0.91 | 3.47 | 0.04 | 0.32 | 2.87
2 | S.D. | 8.17 | 11.08 | 2.95 | 0.11 | 0.23 | 36.80 | 8.81 | 0.44 | 14,713 | 0.63 | 1.02 | 0.91 | 1.92 | 0.22 | 0.68 | 1.39
3 | Mean | 17.02 | 14.88 | 1.43 | 0.02 | 0.04 | 13.53 | 12.26 | 1.41 | 13,501 | 0.62 | 1.20 | 1.04 | 3.79 | 0.03 | 0.36 | 3.11
3 | S.D. | 9.69 | 13.67 | 3.13 | 0.09 | 0.16 | 40.93 | 10.30 | 0.37 | 12,627 | 0.64 | 1.05 | 0.92 | 2.07 | 0.14 | 0.63 | 1.44
4 | Mean | 15.68 | 13.69 | 1.24 | 0.02 | 0.07 | 12.54 | 11.41 | 1.40 | 12,705 | 0.60 | 1.08 | 0.93 | 3.47 | 0.04 | 0.31 | 2.89
4 | S.D. | 8.86 | 12.78 | 2.72 | 0.10 | 0.23 | 22.21 | 9.73 | 0.44 | 14,721 | 0.57 | 0.96 | 0.90 | 1.94 | 0.24 | 0.57 | 1.34
5 | Mean | 5.62 | 4.97 | 0.51 | 0.04 | 0.04 | 15.26 | 4.01 | 1.45 | 11,309 | 0.77 | 1.21 | 1.05 | 3.87 | 0.08 | 0.40 | 3.12
5 | S.D. | 3.49 | 4.67 | 1.46 | 0.18 | 0.18 | 59.75 | 3.39 | 0.60 | 13,337 | 1.02 | 1.32 | 1.24 | 2.65 | 0.43 | 0.99 | 1.72
6 | Mean | 13.06 | 11.94 | 0.83 | 0.02 | 0.07 | 14.30 | 9.68 | 1.36 | 12,655 | 0.79 | 1.09 | 0.95 | 3.91 | 0.03 | 0.26 | 3.14
6 | S.D. | 7.68 | 11.01 | 2.31 | 0.09 | 0.21 | 47.65 | 8.67 | 0.39 | 12,233 | 0.74 | 1.07 | 0.92 | 2.09 | 0.18 | 0.56 | 1.41
7 | Mean | 9.16 | 8.42 | 0.63 | 0.02 | 0.04 | 15.77 | 6.56 | 1.44 | 10,995 | 0.59 | 1.19 | 1.01 | 3.80 | 0.03 | 0.29 | 2.96
7 | S.D. | 5.31 | 7.65 | 1.85 | 0.10 | 0.18 | 47.41 | 5.75 | 0.53 | 12,107 | 0.79 | 1.16 | 1.02 | 2.21 | 0.19 | 0.70 | 1.56
8 | Mean | 12.38 | 10.40 | 1.29 | 0.05 | 0.10 | 12.09 | 9.14 | 1.39 | 10,998 | 0.49 | 0.97 | 0.77 | 2.92 | 0.09 | 0.34 | 2.51
8 | S.D. | 8.24 | 11.76 | 3.43 | 0.15 | 0.25 | 21.55 | 9.74 | 0.45 | 13,200 | 0.54 | 0.97 | 0.83 | 2.01 | 0.34 | 0.68 | 1.37
9 | Mean | 8.91 | 8.11 | 0.70 | 0.01 | 0.04 | 10.49 | 6.06 | 1.57 | 8,606 | 0.14 | 1.41 | 1.21 | 3.84 | 0.01 | 0.32 | 2.86
9 | S.D. | 4.95 | 6.86 | 1.79 | 0.09 | 0.18 | 20.18 | 4.95 | 0.68 | 8,484 | 0.40 | 1.25 | 1.07 | 2.27 | 0.14 | 0.69 | 1.57
10 | Mean | 4.61 | 3.95 | 0.51 | 0.04 | 0.02 | 8.60 | 3.21 | 1.54 | 5,972 | 0.12 | 1.45 | 1.25 | 3.69 | 0.07 | 0.47 | 2.99
10 | S.D. | 2.72 | 3.53 | 1.35 | 0.19 | 0.15 | 14.03 | 2.64 | 0.77 | 8,278 | 0.46 | 1.50 | 1.25 | 2.58 | 0.42 | 1.00 | 1.64
11 | Mean | 6.58 | 3.39 | 2.03 | 0.27 | 0.05 | 21.71 | 4.90 | 1.41 | 7,504 | 0.56 | 1.05 | 0.90 | 1.88 | 0.64 | 1.19 | 2.72
11 | S.D. | 4.27 | 4.80 | 3.35 | 0.39 | 0.17 | 174.94 | 4.69 | 0.51 | 12,759 | 0.73 | 1.19 | 1.03 | 2.18 | 1.07 | 1.63 | 1.60

Page 14: [Framework flowchart, repeated from Page 7.]

Page 15: Transferability

• An ANN model (with genetic algorithm) is used to simulate cluster membership as a function of 11 factors for each HH in the add-on datasets
• The model has 92.4% prediction potential
• Travel characteristics are transferred from national clusters to add-on data according to their cluster membership
• Weighted observed and predicted travel characteristics are compared
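The transfer step itself amounts to a table look-up keyed on cluster membership, followed by a weighted comparison. The sketch below assumes cluster membership has already been predicted; the household records and weights are made up, and the three national means are the trips-per-person values reported later in the deck for clusters 2, 3, and 4.

```python
# National mean trips per person by cluster (values from the deck's summary table).
NATIONAL_TRIPS_PER_PERSON = {2: 3.94, 3: 4.38, 4: 3.95}


def transfer(households, lookup):
    """Attach the national cluster value to each household by cluster membership."""
    return [{**hh, "trips_per_person": lookup[hh["cluster"]]} for hh in households]


def weighted_mean(records, value_key, weight_key="weight"):
    """Survey-weighted mean of one attribute."""
    total_weight = sum(r[weight_key] for r in records)
    return sum(r[value_key] * r[weight_key] for r in records) / total_weight


# Hypothetical add-on households with predicted cluster membership and weights.
add_on = [
    {"hh_id": "A", "cluster": 2, "weight": 1.0},
    {"hh_id": "B", "cluster": 3, "weight": 2.0},
    {"hh_id": "C", "cluster": 4, "weight": 1.0},
]
transferred = transfer(add_on, NATIONAL_TRIPS_PER_PERSON)
predicted_mean = weighted_mean(transferred, "trips_per_person")
```

The same `weighted_mean` helper could be applied to the observed add-on values to obtain the other side of the comparison.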

Page 16: Comparison of Weighted Trip Count per Person

[Figure: Trip Count per Person (0 to 5) by cluster (1 to 11) for National, New York, Wisconsin, and Baltimore.]

Page 17: Comparison of Weighted Mandatory Trips per Person

[Figure: Mandatory Trips per Person (0.0 to 1.0) by cluster (1 to 11) for National, New York, Wisconsin, and Baltimore.]

Page 18: Original Comparison of Transit Usage

Not so good! Some clusters need improvement.

• Compared to the number of trips, the prediction of transit usage is not as good.
• Clusters 5, 8, 10, and 11 show significant differences and need improvement.

[Figure: Percentage of Transit Usage in the Household (No. of Transit Users / Household Size), 0.00 to 0.50, by cluster (1 to 11) for National, Baltimore, New York, and Wisconsin.]

Page 19: Improvement to Clusters Using C&RT

1. The first level of the tree is grown on the number of vehicles in the household (owns a vehicle or not).
2. The improvement of the model due to a level is defined as improvement / (variance of Node 0).
3. For example, here 0.0017 corresponds to 13.3%, 0.0009 to 7.05%, and 0.0002 to 1.57%.
4. Total model improvement is about 22%.
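The arithmetic behind these percentages can be checked with a short calculation. The variance of Node 0 is not stated on the slide, so the sketch backs it out from the first pair (0.0017 and 13.3%); that back-calculation is an assumption. The second improvement is taken as 0.0009, since that is the value that reproduces roughly 7% under the same root variance.

```python
def relative_improvement(improvement: float, root_variance: float) -> float:
    """Share of the root-node variance explained by one tree level."""
    return improvement / root_variance


# Root variance is not given on the slide; infer it from 0.0017 <-> 13.3%.
root_variance = 0.0017 / 0.133

level_improvements = [0.0017, 0.0009, 0.0002]
shares = [relative_improvement(i, root_variance) for i in level_improvements]
total_share = sum(shares)  # roughly 0.22, i.e. about 22%
```

Summing the per-level shares is what yields the "about 22%" total quoted for the tree.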

Page 20: Considering Distributions: Trip Rate

[Figure: Contrast of Distributions of Trip Rate per Person in Cluster 5, National vs. NY; x-axis: Trips per Person per Day (0 to 25); y-axis: Probability (0.00 to 0.20).]

Nice match shown! However, this is not always the case. How to improve the transferability?

Page 21: Considering Distributions: Trip Distance

[Figure: Comparison of National and NY Add-on Distributions for Cluster 2; x-axis: Trip Distance per Person (Miles), 0 to 140; y-axis: Probability, 0 to 0.03.]

Not so good! Needs to be improved.

Page 22: Considering Distributions

• Various distributions were fitted to the dataset, including: Normal, Gamma, Weibull, Exponential, Max Extreme, Lognormal, Logistic, Student’s t, Min Extreme, Triangular, General Beta, Pareto, Uniform, Binomial, Geometric, Hypergeometric, and Poisson.
• The fitting results are interpreted by:
    • examining the rankings of the three fit statistics: A-D, K-S, and Chi-squared
    • visually judging plots, density curves, and cumulative curves
    • p-values and critical values at different significance levels
• Non-normal distributions are dominant (e.g., Gamma)
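As an illustration of ranking candidate distributions by a fit statistic, the sketch below computes the Kolmogorov-Smirnov (K-S) statistic by hand for two candidates on synthetic data. It covers only K-S, not A-D or Chi-squared, and the candidate set, the parameters, and the sample are all invented for the example; the deck does not say which software produced its rankings.

```python
import math
import random


def ks_statistic(sample, cdf):
    """Largest gap between the empirical CDF and a candidate CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def exponential_cdf(scale):
    return lambda x: 1.0 - math.exp(-x / scale) if x > 0 else 0.0


def uniform_cdf(low, high):
    return lambda x: min(1.0, max(0.0, (x - low) / (high - low)))


rng = random.Random(42)
sample = [rng.expovariate(1 / 20.0) for _ in range(1000)]  # synthetic data, mean 20

candidates = {
    "exponential": exponential_cdf(20.0),
    "uniform": uniform_cdf(0.0, 150.0),
}
# Smaller K-S statistic means a closer fit, so sort ascending.
ranking = sorted(candidates, key=lambda name: ks_statistic(sample, candidates[name]))
```

In practice the candidate parameters would be estimated from the data before the statistic is computed, and the ranking would be cross-checked against the other statistics and the plots.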

Page 23: Gamma Distribution

[Figures: PDF and CDF of the Gamma distribution; Gamma function formula.]

• k > 0 is the shape parameter
• θ > 0 is the scale parameter
• the location parameter determines where the origin is located
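For reference, the standard three-parameter (shifted) Gamma distribution with shape k, scale θ, and a location parameter can be written as below. The symbol ℓ for the location is a notational assumption, since the slide does not name it.

```latex
\begin{align*}
f(x \mid \ell, k, \theta) &= \frac{(x-\ell)^{k-1}\, e^{-(x-\ell)/\theta}}{\Gamma(k)\,\theta^{k}}, \qquad x > \ell \\
F(x \mid \ell, k, \theta) &= \frac{\gamma\!\left(k,\, (x-\ell)/\theta\right)}{\Gamma(k)} \\
\Gamma(k) &= \int_0^{\infty} t^{k-1} e^{-t}\, dt, \qquad
\gamma(k, z) = \int_0^{z} t^{k-1} e^{-t}\, dt \\
\mathbb{E}[X] &= \ell + k\theta, \qquad \operatorname{SD}[X] = \theta\sqrt{k}
\end{align*}
```

As a check against the deck's own numbers, ℓ = -0.83, k = 5.42, and θ = 0.88 give a mean of about 3.94, which matches the national trips-per-person mean reported for cluster 2.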

Page 24: Fitted Distribution with Parameters for each Variable by Cluster

Page 25: [Framework flowchart, repeated from Page 7.]

Page 26: Bayesian Updating

• Local updating can significantly improve the quality of the transferred data
• Used Bayesian updating
    • Traditionally, the transferability literature has studied only variables with normal distributions, because the posterior is simple to calculate from a normal prior and likelihood.
    • In practice, the variables of interest (i.e., the likelihood) can take various distributional forms.

Page 27: Bayesian Updating

$$k(\theta \mid x) = \frac{f(x \mid \theta)\, g(\theta)}{\int_0^{\infty} f(x \mid \theta)\, g(\theta)\, d\theta}$$

• f(x|θ) is the probability function for the observed data x (i.e., local sample), given the unknown parameter θ
• g(θ) is the prior distribution for θ
• k(θ|x) is the posterior distribution for θ given observed data x

• The technique can be expanded to situations when no prior data is available.
• The analyst can do successive updating, using the new information without losing the gains from the old one.
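A grid approximation makes the updating rule and the successive-updating property concrete. The sketch below is an illustration under simple assumptions that are not from the deck: θ is the mean of an exponential likelihood, the prior is flat over a grid of candidate values, and the five observations are made up. Updating with two batches in sequence gives the same posterior as updating once with all the data.

```python
import math


def grid_posterior(thetas, prior, data, likelihood):
    """k(theta|x) on a grid: prior times likelihood, normalized to sum to 1."""
    unnormalized = [
        p * math.prod(likelihood(x, theta) for x in data)
        for theta, p in zip(thetas, prior)
    ]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def exponential_pdf(x, mean):
    """f(x|theta) for an exponential distribution parameterized by its mean."""
    return math.exp(-x / mean) / mean


thetas = [10.0 + 0.5 * i for i in range(81)]   # candidate means from 10 to 50
prior = [1.0 / len(thetas)] * len(thetas)      # flat prior (assumption)
batch_1 = [22.0, 35.0, 18.0]                   # made-up observations
batch_2 = [41.0, 27.0]

after_first = grid_posterior(thetas, prior, batch_1, exponential_pdf)
successive = grid_posterior(thetas, after_first, batch_2, exponential_pdf)
all_at_once = grid_posterior(thetas, prior, batch_1 + batch_2, exponential_pdf)

posterior_mode = thetas[successive.index(max(successive))]
```

Here the posterior from the first batch simply becomes the prior for the second, which is what lets new information be added without discarding earlier gains.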

Page 28: Bayesian Updating (2)

• The national sample of NHTS 2001 is used as the source for the prior information
• A small local sample is randomly selected from the NY add-on, leaving the rest for validation
• The bootstrap method is used to resample the data and justify the prior distribution assumptions of the parameters of interest (i.e., scale and shape for Normal distribution)
• A Normal distribution is fitted to each of the resampled datasets
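One way to picture the bootstrap step is to resample a dataset with replacement, re-estimate the shape and scale on each resample, and summarize how those estimates vary. The sketch below does this under several assumptions that are not from the deck: a two-parameter Gamma (no location), method-of-moments estimation, and synthetic data standing in for the national sample. The shape 1.45 and scale 21.28 used to generate the synthetic data echo the national trip-distance parameters reported for cluster 2.

```python
import random
import statistics


def gamma_moments_fit(data):
    """Method-of-moments estimates: shape = mean^2 / var, scale = var / mean."""
    mean = statistics.fmean(data)
    var = statistics.variance(data)
    return mean * mean / var, var / mean


def bootstrap_parameters(data, n_resamples, rng):
    """Mean and SD of shape and scale estimates across bootstrap resamples."""
    fits = []
    for _ in range(n_resamples):
        resample = rng.choices(data, k=len(data))  # sample with replacement
        fits.append(gamma_moments_fit(resample))
    shapes, scales = zip(*fits)
    return {
        "shape": (statistics.fmean(shapes), statistics.stdev(shapes)),
        "scale": (statistics.fmean(scales), statistics.stdev(scales)),
    }


rng = random.Random(7)
national = [rng.gammavariate(1.45, 21.28) for _ in range(2000)]  # synthetic stand-in
prior_summary = bootstrap_parameters(national, n_resamples=200, rng=rng)
```

The resulting means and standard deviations could then serve as the parameters of Normal priors on shape and scale, which is one reading of the slide.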

Page 29: Bayesian Updating (3)

• Then, Markov Chain Monte Carlo (MCMC) simulation with Gibbs sampling is used to update the prior with the small local sample.
• Assuming the updated variables of interest are still Gamma distributed, the posteriors of the parameters are used to derive the updated means and SDs of the variables.
• Updated parameters are then compared with the validation data and national data to test the effectiveness of the updating procedure.
• The comparisons show that significant improvement is achieved.
• The improvement increases with the local sample size
    • a relatively cost-effective sample size is suggested
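The deck does not give the details of its sampler, so the following is only a textbook Gibbs sampler, shown to illustrate the mechanics of alternating conditional draws. It uses a Normal likelihood with unknown mean and precision and semi-conjugate priors, not the Gamma model described above. The prior values, the synthetic local sample, and all function names are assumptions; the prior mean 30.67 and the sample size of 75 merely echo numbers that appear elsewhere in the deck.

```python
import random
import statistics


def gibbs_normal(data, mu0, tau0, a0, b0, n_iter, burn_in, rng):
    """Gibbs sampler for x_i ~ Normal(mu, 1/tau).

    Assumed priors: mu ~ Normal(mu0, 1/tau0), tau ~ Gamma(shape=a0, rate=b0).
    Each sweep draws mu given tau, then tau given mu.
    """
    n = len(data)
    xbar = statistics.fmean(data)
    mu, tau = xbar, 1.0 / statistics.variance(data)  # starting values
    draws = []
    for it in range(n_iter):
        # mu | tau, data is Normal with precision tau0 + n * tau.
        precision = tau0 + n * tau
        mean = (tau0 * mu0 + tau * n * xbar) / precision
        mu = rng.gauss(mean, precision ** -0.5)
        # tau | mu, data is Gamma(a0 + n/2, rate = b0 + SS/2).
        ss = sum((x - mu) ** 2 for x in data)
        tau = rng.gammavariate(a0 + n / 2.0, 1.0 / (b0 + ss / 2.0))  # 2nd arg is scale
        if it >= burn_in:
            draws.append((mu, tau))
    return draws


rng = random.Random(0)
local_sample = [rng.gauss(27.0, 6.0) for _ in range(75)]  # synthetic local data
draws = gibbs_normal(local_sample, mu0=30.67, tau0=1.0 / 25.0, a0=2.0, b0=50.0,
                     n_iter=3000, burn_in=500, rng=rng)
posterior_mean = statistics.fmean(m for m, _ in draws)
```

With 75 local observations and a fairly weak prior, the posterior mean sits close to the local sample mean, pulled slightly toward the prior mean. For a Gamma likelihood the shape parameter has no standard conjugate update, so a sampler for that model would need a different conditional step for the shape.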

Page 30: RMSE Change with Sample Size

[Figure: RMSE (0 to 1) versus Sample Size (30 to 190).]

• Root Mean Square Error (RMSE) decreases as the sample size increases.
• There is instability when the sample size within each cluster is smaller than 45 observations.
• A sample size of 75 per cluster seems to be the most cost-effective plan.
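RMSE itself is straightforward to compute. The slide does not say exactly which quantities enter its RMSE, so the sketch below makes an assumption: it compares cluster means against the New York values across clusters 2, 3, and 4, using the trip-distance means reported in the deck's summary table.

```python
import math


def rmse(predicted, observed):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


# Mean trip distance per person for clusters 2, 3, 4 (from the deck's table).
national = [30.67, 31.18, 29.59]
updated = [28.10, 30.18, 28.75]
new_york = [27.33, 29.62, 26.67]

rmse_national = rmse(national, new_york)  # about 2.72
rmse_updated = rmse(updated, new_york)    # about 1.32
```

On these three clusters the updated means are closer to the New York values than the national means are, which is the direction of improvement the deck reports.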

Page 31: Updating Results

[Figure: Comparisons of Updated, National, and NY Add-on Distributions for Trip Distance per Person in Cluster 2; x-axis: Trip Distance per Person per Day (miles), 0 to 120; y-axis: Probability, 0 to 0.03.]

Updated mean values are significantly improved towards the validation data.

Page 32: Summary of Updating Results

Trip Rates per Person

Cluster | National Location | National Shape | National Scale | National Mean | Updated Location | Updated Shape | Updated Scale | Updated Mean | NY Location | NY Shape | NY Scale | NY Mean
2 | -0.83 | 5.42 | 0.88 | 3.94 | -0.83 | 5.15 | 0.92 | 3.91 | -0.30 | 3.47 | 1.14 | 3.66
3 | -3.13 | 12.31 | 0.61 | 4.38 | -3.13 | 12.05 | 0.61 | 4.22 | -1.66 | 8.44 | 0.67 | 3.99
4 | -0.99 | 6.42 | 0.77 | 3.95 | -0.99 | 6.05 | 0.80 | 3.85 | -0.42 | 4.43 | 0.89 | 3.53
8 | -0.13 | 3.14 | 1.15 | 3.48 | -0.13 | 2.90 | 1.12 | 3.13 | 0.18 | 2.40 | 1.24 | 3.16
11 | 0.04 | 2.52 | 1.47 | 3.75 | 0.04 | 2.44 | 1.45 | 3.58 | 0.32 | 2.20 | 1.40 | 3.39

Trip Distance per Person

Cluster | National Location | National Shape | National Scale | National Mean | Updated Location | Updated Shape | Updated Scale | Updated Mean | NY Location | NY Shape | NY Scale | NY Mean
2 | -0.09 | 1.45 | 21.28 | 30.67 | -0.09 | 1.34 | 21.04 | 28.10 | -0.07 | 1.32 | 20.84 | 27.33
3 | -0.49 | 1.68 | 18.91 | 31.18 | -0.49 | 1.62 | 18.93 | 30.18 | 0.11 | 1.53 | 19.31 | 29.62
4 | -0.22 | 1.61 | 18.55 | 29.59 | -0.22 | 1.45 | 19.98 | 28.75 | -0.02 | 1.30 | 20.59 | 26.67
5 | -0.09 | 1.20 | 24.93 | 29.93 | -0.09 | 1.20 | 24.03 | 28.84 | -0.09 | 1.19 | 23.97 | 28.36
6 | -0.43 | 1.91 | 18.12 | 34.18 | -0.43 | 1.89 | 18.22 | 34.01 | -0.08 | 1.58 | 21.40 | 33.69
7 | 0.11 | 1.48 | 22.69 | 33.58 | 0.11 | 1.54 | 21.69 | 33.51 | -0.08 | 1.52 | 20.75 | 31.55
8 | -0.12 | 1.06 | 24.08 | 25.38 | -0.12 | 1.03 | 24.03 | 24.63 | -0.09 | 0.90 | 22.91 | 20.53
9 | -0.09 | 1.16 | 21.43 | 24.72 | -0.09 | 1.16 | 22.23 | 25.65 | -0.03 | 1.17 | 22.17 | 25.91

"Updated" columns are the National-updated values; "NY" columns are the State of New York values.

Page 33: [Framework flowchart, repeated from Page 7.]

Page 34: Population Synthesizing and Travel Data Simulation

• Using PUMS data, the NYC population is synthesized.
    • All of the contextual factors were calculated for each HH.
    • A synthetic population with all 33 required variables was generated.
    • Using the ANN model, cluster memberships are obtained.
• Travel data are simulated for each HH using Monte Carlo simulation of each travel attribute with updated parameters of the fitted distributions.
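The simulation step can be sketched as drawing each household's attribute from a shifted Gamma distribution whose parameters depend on the household's cluster. The parameters below are the "National-updated" trips-per-person values for clusters 2 and 11 from the deck's summary table; the synthetic household list and the function name are hypothetical.

```python
import random
import statistics

# (location, shape, scale) for trips per person, National-updated, by cluster.
UPDATED_TRIP_RATE = {2: (-0.83, 5.15, 0.92), 11: (0.04, 2.44, 1.45)}


def simulate_attribute(cluster, params, rng):
    """Draw one value from the cluster's shifted Gamma distribution."""
    location, shape, scale = params[cluster]
    return location + rng.gammavariate(shape, scale)


rng = random.Random(2007)
# Hypothetical synthetic households alternating between the two clusters.
synthetic_households = [
    {"hh_id": i, "cluster": 2 if i % 2 == 0 else 11} for i in range(20000)
]
for hh in synthetic_households:
    hh["trips_per_person"] = simulate_attribute(hh["cluster"], UPDATED_TRIP_RATE, rng)

mean_cluster_2 = statistics.fmean(
    hh["trips_per_person"] for hh in synthetic_households if hh["cluster"] == 2
)
```

With a negative location, a small share of draws can fall below zero, so a practical implementation would need some rule for those cases; the deck does not say how they were handled.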

Page 35: Comparison of Simulated and Add-on NYC Samples (Trips per Person)

[Figure: Trips per person for Simulated and NYC Add-on samples; x-axis 0 to 12; y-axis 0 to 5.]

Page 36: Comparison of Simulated and Add-on NYC Samples (Trip Distance per Person)

[Figure: Trip Distance per Person (Miles), 0 to 60, by Cluster Membership (1 to 11), for Simulated NYC and NYC Add-on.]