Data Analytics: Challenges and the Internet of Moving Things
Constantine Caramanis
The University of Texas at Austin, Electrical and Computer Engineering Department


Takeaway #1: Communication is useful


Infrastructure-based sensing
- Sensing includes radar, LIDAR, cameras, and weather
- Coordinate traffic through intersections, support automated driving
- Collect data about collisions and near-misses for planning
- Effective with non-connected cars, bicycles, and pedestrians


Radar-aided millimeter wave communication

- Millimeter wave is used for both radar sensing and high-bandwidth communication
- Radar can be used to configure the communication link more efficiently

[Figure: mmWave base station (BS) supporting V2X and radar. Labels: antennas, radar beam, communication beams]


Key Challenges: Noise + Noise + Noise
- Different noise characteristics in the data
- Raw data measurements from other cars, from infrastructure, and from local sensors
- Partial occlusions: objects may disappear and reappear
- Maps can be wrong: sparse but arbitrary corruption in the data
- Inconsistent measurements from different sensing modalities

Hidden Structure and Missing Data: a visual illustration

Key Challenges: Mixed Populations
- Higher-level data abstractions
- Inferring behavior, e.g., deviations from trajectory

Mixtures and Non-Linearities in Large Scale Data Analysis
- Linear, logistic, and non-linear regression are fundamental for prediction and planning
- Examples: transit time vs. daily flows, flow vs. speed, responses to network stressors or diversions or to future demand and flow patterns
- Mixtures: populations are mixed and may require simultaneous clustering and regression/classification when clustering as a data-preprocessing step is impossible
- Nonlinearities: discover structure without expensive or intractable non-parametric models
- New algorithms for:
  - Solving the simultaneous clustering-regression problem (tensor methods)
  - Structure recovery through unknown non-linear transforms (second-moment methods)

[Figure: map of ramps used by neighborhood traffic. Streets shown: 45th St, 38th St, I-35, Red River St, Airport Blvd, Guadalupe St, Lamar Blvd, 32nd St, Manor Rd, 51st St, Barbara Jordan Blvd. Neighborhoods shown: Northfield, Windsor Park, Ridgetop, Hyde Park/Northfield, Delwood II, Hancock, North University, Cherrywood/Wilshire Wood/Delwood I, Mueller, Other. Source: Dr. J. Duthie]

Notes from Constantine Caramanis: Mixed regression is the problem where you observe pairs (y, x) that are related via

y = z*x*beta_1 + (1-z)*x*beta_2 + noise,

where z is 0 or 1 and is not observed.

Instead of all data (y, x) being related by one (noisy) linear relationship, y = x*beta + noise, each data point follows one of two (or more) possible linear relationships. The interesting setting is when the x's, looked at by themselves, are not clusterable, so you cannot pre-process. Note that if they were clusterable, you would just cluster the data and then solve an individual regression on each cluster, as usual.
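To make the mixed regression setting concrete, here is a minimal sketch in Python with NumPy. It is an illustration only: it uses plain alternating minimization with random restarts (assign each point to the line that fits it better, then refit each line by least squares), which is a simpler stand-in and not the tensor method mentioned on the slides. The function names, the Gaussian design for x, and all parameter values are assumptions made for this example.

```python
import numpy as np


def make_mixed_data(n, beta_1, beta_2, noise_std, rng):
    """Draw (x, y) pairs where each y follows one of two linear models.

    The x's are i.i.d. standard Gaussian (an assumption of this sketch),
    so they carry no cluster structure on their own.
    """
    d = beta_1.shape[0]
    x = rng.standard_normal((n, d))
    z = rng.integers(0, 2, size=n)  # hidden label, never shown to the solver
    y = np.where(z == 1, x @ beta_1, x @ beta_2)
    y = y + noise_std * rng.standard_normal(n)
    return x, y, z


def fit_mixed_regression(x, y, n_iters=50, n_restarts=10, seed=0):
    """Alternating minimization for two-component mixed linear regression.

    Returns the pair of coefficient vectors with the smallest
    min-over-components squared residual among all restarts.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    best = None
    for _ in range(n_restarts):
        b1 = rng.standard_normal(d)
        b2 = rng.standard_normal(d)
        for _ in range(n_iters):
            # Step 1: assign each point to the component that fits it better.
            use_first = (y - x @ b1) ** 2 <= (y - x @ b2) ** 2
            # Step 2: refit each component on its assigned points.
            if use_first.sum() >= d:
                b1 = np.linalg.lstsq(x[use_first], y[use_first], rcond=None)[0]
            if (~use_first).sum() >= d:
                b2 = np.linalg.lstsq(x[~use_first], y[~use_first], rcond=None)[0]
        loss = np.minimum((y - x @ b1) ** 2, (y - x @ b2) ** 2).mean()
        if best is None or loss < best[0]:
            best = (loss, b1, b2)
    return best[1], best[2]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    beta_1 = np.array([1.0, -2.0, 0.5])
    beta_2 = np.array([-1.5, 1.0, 2.0])
    x, y, _ = make_mixed_data(2000, beta_1, beta_2, 0.05, rng)
    est_1, est_2 = fit_mixed_regression(x, y)
    print("estimated components:", est_1, est_2)
```

The two estimates come back in arbitrary order, since nothing in the data distinguishes "component 1" from "component 2". Alternating minimization can also stall in a poor local solution from an unlucky start, which is why the sketch keeps the best of several restarts.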

For the nonlinear part, think of logistic regression. It says: P(y = 1 | x) = f(x*beta), where f is the logistic function (the inverse of the logit).

Suppose that you do not know f. Must you learn it, if what you care about is learning beta? Our results say that you do not: if all you want is to learn beta (say, its support, if you have a sparse problem), then there is no need to learn f.
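A small numerical illustration of why f may not need to be learned, under a strong assumption: if the x's are standard Gaussian, then by Stein's lemma the average of y*x is proportional to beta for any smooth link f, so the direction of beta can be read off without fitting f. This is a simple first-moment estimator chosen for illustration, not the second-moment method referred to on the slides. It recovers beta only up to scale, and the function names and the two example links are hypothetical.

```python
import numpy as np


def logistic(t):
    """Standard logistic link."""
    return 1.0 / (1.0 + np.exp(-t))


def clipped_linear(t):
    """A different monotone link, standing in for an 'unknown' f."""
    return np.clip(0.5 + 0.3 * t, 0.0, 1.0)


def simulate_binary_responses(x, beta, link, rng):
    """Draw y in {0, 1} with P(y = 1 | x) = link(x @ beta)."""
    p = link(x @ beta)
    return (rng.random(x.shape[0]) < p).astype(float)


def estimate_direction(x, y):
    """Estimate beta / ||beta|| without using the link function.

    Assumes the rows of x are i.i.d. standard Gaussian. Centering y does not
    change the expectation (x has mean zero) but reduces the variance.
    """
    m = (x * (y - y.mean())[:, None]).mean(axis=0)
    return m / np.linalg.norm(m)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    beta = np.array([2.0, -1.0, 0.0, 0.0, 1.0])  # sparse: two zero entries
    x = rng.standard_normal((20000, beta.shape[0]))
    for link in (logistic, clipped_linear):
        y = simulate_binary_responses(x, beta, link, rng)
        direction = estimate_direction(x, y)
        cosine = direction @ beta / np.linalg.norm(beta)
        print(link.__name__, "cosine with true beta:", round(float(cosine), 3))
```

The same estimator is applied to both links and never sees which one generated the data. For a sparse beta, the entries of the estimated direction that are close to zero point to the coordinates outside the support.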

Dynamics, Transportation and Data Science
- The two themes above are tightly interrelated
- Inference of dynamics becomes a sensing modality
- And different sensing capabilities require/impose different inference needs

Upshot
- Basic statistical and algorithmic research, models, and computation are still a fundamental bottleneck
- Computing infrastructure: Where does computing take place? Where are the sensors?
- Cost, speed, and communication challenges and tradeoffs