Scaling Sensors with Data Synthesis Catharine van Ingen eScience Group Microsoft Research It was six * men of Indostan, to learning much inclined, Who went to see the elephant though all of them were blind, That each by observation, might satisfy his mind. * data reporting error

Scaling Sensors with Data Synthesis


Page 1: Scaling Sensors with  Data Synthesis

Scaling Sensors with Data Synthesis

Catharine van Ingen
eScience Group
Microsoft Research

It was six* men of Indostan, to learning much inclined,
Who went to see the elephant though all of them were blind,
That each by observation, might satisfy his mind.

* data reporting error

Page 2: Scaling Sensors with  Data Synthesis

Unprecedented Data Availability

• Created by the confluence of fast internet connectivity, commodity computing and advanced sensor technologies

• Ever more pressing challenge is how to make sense of it all

Page 3: Scaling Sensors with  Data Synthesis

Navigating in Real-Time and Real-Space

Temporal scales (years):
• Photosynthesis: 10^-6 to 10^-3 y
• Crop cycles: 10^0 y
• Competition, gap creation: 10^1 y
• Succession, mortality: 10^2 y
• Species migration, soil formation: 10^3 y
• Speciation, extinction: 10^6 y
• Evolution: 10^9 y

Spatial scales (meters):
• Chloroplast: 10^-6 m
• Stomata: 10^-5 m
• Leaf: 10^-2 to 10^-1 m
• Plant: 10^-1 to 10^0 m
• Canopy: 10^0 to 10^3 m
• Landscape: 10^3 to 10^5 m
• Continent: 10^6 m
• Globe: 10^7 m

Sensors are the ante; Synthesis is the game

• Challenge: How do we use data to think about the future when the past is no longer a good predictor?

• Strategy: Scale up and down to bridge understanding and observational capabilities

• Approach: {mashup, derive, validate, analyze} repeat

• Hope: There are some technologies and methodologies that generalize to other disciplines with time and space drivers

Page 4: Scaling Sensors with  Data Synthesis

Data-Driven Science Meets Public Policy and Economics

• GPP, or gross photosynthetic production, is a component of carbon fixation and is tied to water balance

• Implications for biofuels – GPP is higher in southern temperate forests than in the Midwest Corn Belt

Thanks to Dennis Baldocchi and Youngryel Ryu (UC Berkeley) 2010

Page 5: Scaling Sensors with  Data Synthesis

About That Map

• Existing upscaling methods leverage sensor categorical aggregates

• Black(ish) box statistics applied to land cover informed by modeled or remote sensed meteorology

• Parameterization for biophysical model synthesis computation

• Simulation is not an option

• Radiative transfer meets turbulence meets systems biology

• Existing climate models “do not evince much skill” at capturing the biological processes

• Science disclaimer: Biofuel is more complex

• Efficient and renewable biofuel production includes factors such as harvest efficiency and transportation costs

Page 6: Scaling Sensors with  Data Synthesis

Theory Meets Reality

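• Big reduction: many inputs

• Not a matrix: some inputs have geospatial categorical dependencies

$$ ET = \frac{\Delta R_n + \rho_a c_p \, \delta q / r_a}{\lambda_v \left( \Delta + \gamma \left( 1 + r_s / r_a \right) \right)} $$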

ET = Water volume evapotranspired (m^3 s^-1 m^-2)

Δ = Rate of change of saturation specific humidity with air temperature (Pa K^-1)

λv = Latent heat of vaporization (J/g)

Rn = Net radiation (W m^-2)

cp = Specific heat capacity of air (J kg^-1 K^-1)

ρa = Dry air density (kg m^-3)

δq = Vapor pressure deficit (Pa)

ra = Resistance of air (s m^-1)

rs = Resistance of plant stoma, air (s m^-1)

γ = Psychrometric constant (γ ≈ 66 Pa K^-1)

Estimating resistance across a catchment can be tricky

Penman-Monteith (1964)
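As a rough illustration of how many inputs reduce to one number, the Penman-Monteith form on this slide can be sketched in Python. The function name, argument names and the sample values are hypothetical and chosen only for illustration; the sketch assumes resistances in s m^-1 and λv in J/g, in which case the result is a water mass flux in g m^-2 s^-1 rather than a volume.

```python
def penman_monteith_et(delta, r_n, rho_a, c_p, dq, r_a, r_s, lambda_v, gamma=66.0):
    """Evaluate the Penman-Monteith resistance form for a single location.

    delta    : slope of saturation humidity vs. air temperature (Pa K^-1)
    r_n      : net radiation (W m^-2)
    rho_a    : dry air density (kg m^-3)
    c_p      : specific heat capacity of air (J kg^-1 K^-1)
    dq       : vapor pressure deficit (Pa)
    r_a, r_s : aerodynamic and stomatal resistance (s m^-1, assumed units)
    lambda_v : latent heat of vaporization (J g^-1)
    gamma    : psychrometric constant (Pa K^-1)
    """
    if r_a <= 0 or lambda_v <= 0:
        raise ValueError("r_a and lambda_v must be positive")
    # Radiative term plus aerodynamic (vapor deficit) term.
    numerator = delta * r_n + rho_a * c_p * dq / r_a
    # Stomatal resistance enters only through the ratio r_s / r_a.
    denominator = lambda_v * (delta + gamma * (1.0 + r_s / r_a))
    return numerator / denominator


# Illustrative (not measured) mid-day values for a vegetated surface.
et_example = penman_monteith_et(
    delta=145.0, r_n=400.0, rho_a=1.2, c_p=1005.0,
    dq=1000.0, r_a=50.0, r_s=70.0, lambda_v=2450.0,
)
```

Only `r_a` and `r_s` lack a direct instrument reading in this sketch, which is one way to read the slide's remark that estimating resistance across a catchment can be tricky.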

Page 7: Scaling Sensors with  Data Synthesis

Heterogeneous Data Sources

[Figure: data sources placed by spatial scale (local, 0.1, 1, 10, 100, 1000, 10 000 km, global; plot/site, countries, EU) and temporal scale (hour, day, week, month, year, decade, century). Sources shown: forest inventory plots, eddy covariance sensor towers, tall tower sensor observatories, forest/soil inventories, land surface remote sensing, remote sensing of CO2.]

Thanks to Markus Reichstein (Max Planck) 2010

Page 8: Scaling Sensors with  Data Synthesis

Sourcing from Imagery, Sensors, Models, Field Data and Wisdom

• NCEP/NCAR: ~100 MB (4K files)

• Vegetative clumping: ~5 MB (1 file)

• Climate classification: ~1 MB (1 file)

• FLUXNET curated field dataset: 2 KB (1 file)

• FLUXNET curated sensor dataset: 30 GB (960 files)

• NASA MODIS imagery archives: 5 TB (600K files)

10 US years

1 global year ~ 13 US years

http://www.fluxdata.org

Page 9: Scaling Sensors with  Data Synthesis

Validation Classic

Local: direct pixel comparison with ground deployment

• Known good or known bad

Global: qualitative map views and large aggregates comparison

• Includes inter-annual variations

Global GPP: 118 ± 26 PgC/y; literature range 107-167

Radiation model expected to underestimate in the tropics
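The two validation styles on this slide can be sketched as simple checks: a per-pixel bias and RMSE against ground measurements, and an overlap test between a global estimate and a literature range. The function names and the pixel/tower numbers below are hypothetical; only the 118 ± 26 and 107-167 figures come from the slide.

```python
import math


def bias_and_rmse(estimated, observed):
    """Return (mean bias, RMSE) of pixel estimates against ground observations."""
    if len(estimated) != len(observed) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [e - o for e, o in zip(estimated, observed)]
    bias = sum(diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return bias, rmse


def interval_overlaps(center, half_width, low, high):
    """True if [center - half_width, center + half_width] intersects [low, high]."""
    return center - half_width <= high and center + half_width >= low


# Local check: made-up pixel values vs. made-up tower values (same units).
local_bias, local_rmse = bias_and_rmse([5.0, 6.0, 7.0], [4.0, 6.0, 8.0])

# Global check: the slide's estimate against the slide's literature range.
global_consistent = interval_overlaps(118.0, 26.0, 107.0, 167.0)
```

A zero bias with a non-zero RMSE, as in the made-up local example, is a reminder that large aggregates can agree while individual pixels do not.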

Page 10: Scaling Sensors with  Data Synthesis

Validation Vanguard

Shows high summer water use in the rice growing region of the Sacramento Valley and (blue) rock outcrop

The great frontier of unknown unknowns

• Qualitative map observations require local knowledge – crowd source via citizen science?

• Geospatial feature determination errors can be significant

Page 11: Scaling Sensors with  Data Synthesis

Scaling: The Synthesis Trifecta

• Science

  • Incorporate discovered or known omissions such as elevation, fires, storms, fertilizer

  • Regional analysis flame tests

• Sensors

  • Refining existing sensors and variable derivations

  • Incorporating new emerging sensors such as web cams

• Substrate

  • Move compute to data

  • Supercomputer size, but not supercomputing friendly

  • Data discovery, reuse, harmonization

Sensors are ~20 km apart – one shows impact of calibration drift

Phenocam detecting leaf green up and green down

Sacramento Delta 10 year average evapotranspiration
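For the phenocam example on this slide, one commonly used greenness index is the green chromatic coordinate, GCC = G / (R + G + B), averaged over a region of the image. The slide does not say which index was used, so the sketch below is an assumption; the function names, the 50% amplitude threshold and the sample series are hypothetical, and the series is assumed to start in the dormant season.

```python
def green_chromatic_coordinate(red, green, blue):
    """GCC = G / (R + G + B) for mean channel values of an image region."""
    total = red + green + blue
    if total == 0:
        raise ValueError("channel sum is zero; GCC is undefined")
    return green / total


def green_up_index(gcc_series, threshold_fraction=0.5):
    """Index of the first sample at or above a fraction of the seasonal amplitude.

    Returns None when the series is flat. Assumes the series begins before
    leaf-out, so the first crossing is the green-up rather than mid-season.
    """
    lowest, highest = min(gcc_series), max(gcc_series)
    if highest == lowest:
        return None
    threshold = lowest + threshold_fraction * (highest - lowest)
    for index, value in enumerate(gcc_series):
        if value >= threshold:
            return index
    return None


# Made-up GCC values for successive observation dates in one season.
season = [0.33, 0.34, 0.36, 0.41, 0.43, 0.42, 0.37, 0.33]
green_up = green_up_index(season)
```

Green-down could be found the same way by scanning the series in reverse from the end of the season.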

Page 12: Scaling Sensors with  Data Synthesis

Anecdote, Analysis, Action

I was walking Dry Creek and saw stranded fish…

..had local farmers turned on sprinklers?

Flow vs Temperature 2008 Detail