Jie Liu
Senior Researcher
Microsoft Research
Redmond, WA 98052
[email protected]
Microsoft-FAPESP Environmental Science Workshop, Nov. 11th, 2010



Networked Embedded Computing Group @ MSR

Jie Liu

Suman Nath

Bodhi Priyantha

Michel Goraczko

Aman Kansal

Dimitrios Lymberopoulos

Special Thanks to our LATAM Interns
• Andre Santanche, Universidade Estadual de Campinas
• Börje Felipe Fernandes Karlsson, Pontifícia Universidade Católica do Rio de Janeiro
• Leonardo Barbosa Oliveira, Universidade Estadual de Campinas
• Heitor Soares Ramos Filho, Universidade Federal de Minas Gerais
• Miguel Palomera, Universidad Nacional Autónoma de México

Sensing Made Easy -- As It Appears
• Sensing, computing, and networking are becoming increasingly accessible
  – Low-cost sensors, radios, and devices
  – Low barrier to entry for accessing cloud services
  – Ubiquitous cellular and WiFi coverage
• Ride the consumer device wave
  – Sales of sensor-rich smartphones: 270 million in 2010 (IDC)
  – Potential of smart grid: 70 million single-family houses in the US
• A round of academic investments around the world
  – US: Cyber-physical systems
  – EU: CONET
  – China: Internet of Things

Sensing as a Foundation for Science
[Diagram labels: Collaboration, Collection, Extraction]
At Microsoft Research, we tackle platform and systems issues in sensing the physical world at scale.

Data Centers

Data Center Genome

• A data center can have hundreds of thousands of servers and other equipment, and can consume tens of megawatts.
• The complex interactions among utilization, computing, and cooling make improving efficiency challenging.

MeshID: Low cost scalable asset location and tracking
• Low cost RFIDs on each server
• In-rack RFID reader arrays
• Transit units to convert RFIDs to wireless sensors
• Mesh ad hoc network among transit units

WiFlock: Ultra-Low Duty Cycle Discovery
• Automatic discovery and group formation when nodes come within communication range.
• A lower duty cycle means better energy efficiency but longer discovery latency.
• The faster a node can wake up, the lower the duty cycle it can achieve. We can do 80 µs with the CC2500, or equivalently a 0.2% duty cycle.
• Expected lifetime: 5 to 7 years.
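The duty-cycle arithmetic behind these bullets can be sketched as follows. This is a minimal illustration, not the WiFlock implementation. It reads the wake-up figure on this slide as 80 µs, which is an assumption about the extracted text, and the current draws and battery capacity are hypothetical placeholders, not CC2500 datasheet values.

```python
def wake_period_s(on_time_s, duty_cycle):
    """Interval between wake-ups such that on_time / period == duty_cycle."""
    if not 0 < duty_cycle <= 1:
        raise ValueError("duty_cycle must be in (0, 1]")
    return on_time_s / duty_cycle


def average_current_ma(duty_cycle, active_ma, sleep_ma):
    """Time-weighted average of the active and sleep current draws."""
    return duty_cycle * active_ma + (1 - duty_cycle) * sleep_ma


def lifetime_years(capacity_mah, avg_ma):
    """Battery lifetime, ignoring self-discharge and any other loads."""
    return capacity_mah / avg_ma / (24 * 365)


ON_TIME_S = 80e-6   # assumed reading of the slide's wake-up figure
DUTY_CYCLE = 0.002  # 0.2%, from the slide

period = wake_period_s(ON_TIME_S, DUTY_CYCLE)  # one wake-up every 40 ms

# Hypothetical placeholder values, chosen only to exercise the formulas.
avg_ma = average_current_ma(DUTY_CYCLE, active_ma=20.0, sleep_ma=0.001)
years = lifetime_years(capacity_mah=2000.0, avg_ma=avg_ma)
```

Halving the duty cycle at a fixed wake-up time doubles the interval between wake-ups, which is the latency cost noted above.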

[Figure: timing diagram of the beacon and listen slots of two nodes, x and y]

Genomotes
• Each server has a dozen fans.
• Together with the AC fans, they make the temperature distribution complex.
• Wireless sensors are ideal for non-intrusive, continuous monitoring.

RACNet: Reliable ACquisition Network (MSR & JHU)
Technology highlights
• Local data caches
• Multi-channel, multi-hop collection tree
• Token-passing data retrieval over collection trees
• End-to-end reliability
[Diagram label: Database]
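One way the highlights above could fit together is sketched below: a token walks the collection tree, the node holding the token uploads its locally cached readings, and a reading leaves the cache only once it has been acknowledged. This is a toy model under stated assumptions, not the RACNet protocol. The function names, the depth-first token order, and the `link_ok` loss model are all hypothetical.

```python
def token_order(tree, root):
    """Depth-first order in which the token visits the nodes of the tree."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        # Push children in reverse so the leftmost child is visited first.
        stack.extend(reversed(tree.get(node, [])))
    return order


def collect(tree, root, caches, link_ok, max_tries=5):
    """Retrieve cached readings node by node; only the token holder sends."""
    database = {}
    for node in token_order(tree, root):
        if node == root:
            continue
        for reading in list(caches[node]):
            for attempt in range(max_tries):
                if link_ok(node, reading, attempt):
                    # Acknowledged: store it, then drop it from the cache.
                    database.setdefault(node, []).append(reading)
                    caches[node].remove(reading)
                    break
    return database


# Hypothetical topology and data: base station with children a and b; c under a.
tree = {"base": ["a", "b"], "a": ["c"]}
caches = {"a": [1, 2], "b": [3], "c": [4, 5]}


def lossy(node, reading, attempt):
    """Toy loss model: even-valued readings are lost on the first attempt."""
    return attempt > 0 or reading % 2 == 1


database = collect(tree, "base", caches, lossy)
```

Readings that are never acknowledged stay in the node's cache, so a later token round can retry them.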

Virtual and Soft Sensors
• Not all variables of interest are directly measurable (e.g., per-virtual-machine power consumption).
• Virtual and soft sensors compute these values indirectly.
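As an illustration of computing a value indirectly, the sketch below estimates per-VM dynamic energy from resource counters with a simple linear model. It is a minimal example under stated assumptions, not the model used in the talk. The `PowerModel` class, its coefficients, and the counter units are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PowerModel:
    """Linear dynamic-power model; all coefficients are hypothetical."""
    cpu_w: float   # watts at 100% CPU utilization
    mem_w: float   # watts per million memory accesses per second
    disk_w: float  # watts per MB/s of disk I/O

    def dynamic_watts(self, cpu_util, mem_rate, disk_rate):
        return (self.cpu_w * cpu_util
                + self.mem_w * mem_rate
                + self.disk_w * disk_rate)


def energy_joules(model, samples, dt_s=1.0):
    """Integrate estimated power over counter samples taken every dt_s seconds."""
    return sum(model.dynamic_watts(*s) * dt_s for s in samples)


model = PowerModel(cpu_w=40.0, mem_w=2.0, disk_w=0.5)

# Per-VM counter samples: (cpu_util, mem_rate, disk_rate), one per second.
vm_samples = {
    "vm1": [(0.5, 1.0, 4.0), (0.25, 0.0, 0.0)],
    "vm2": [(0.1, 0.5, 0.0)],
}
vm_energy = {vm: energy_joules(model, s) for vm, s in vm_samples.items()}
```

Because the model is linear in the counters, the per-VM estimates add up to the estimate for the whole machine's dynamic energy.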

[Figure: four plots.
(1) Energy Model Error: measured, estimate, and error, in watts vs. time (s).
(2) Component Dynamic Energy: CPU, memory, and disk, in watts vs. time (s).
(3) Application Dynamic Energy: total, App 1, and App 2, in watts vs. time (s).
(4) Component Dynamic Energy (Cumulative): CPU, memory, and disk, in joules, for total, App 1, and App 2.]
Example: JouleMeter (diagram labels: perf counters, power)

Software and Virtual Sensors
Sensor net as service providers
– Services encapsulate data and computation
– Inputs and outputs are labeled with clear semantics
– Services can be discovered, composed, and executed on demand
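The idea of discovering and composing services by their labeled inputs and outputs can be sketched as a small backward-chaining planner. This is an illustrative sketch only. The `Service` type, the label names, and the "first provider wins" rule are assumptions, not the system shown in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str
    inputs: frozenset  # semantic labels this service consumes
    output: str        # semantic label this service produces


def plan(goal, registry, available):
    """Backward-chain from a goal label to an executable service order."""
    order, planned = [], set()

    def resolve(label, path):
        if label in available:
            return  # raw data that already exists
        if label in path:
            raise ValueError("cyclic dependency on " + label)
        providers = [s for s in registry if s.output == label]
        if not providers:
            raise LookupError("no service produces " + label)
        service = providers[0]  # assumption: take the first match
        for needed in sorted(service.inputs):
            resolve(needed, path | {label})
        if service.name not in planned:
            planned.add(service.name)
            order.append(service.name)

    resolve(goal, frozenset())
    return order


# Hypothetical registry: a power model feeding a per-VM attribution service.
registry = [
    Service("vm_split", frozenset({"server_power", "vm_cpu_share"}), "vm_power"),
    Service("power_model", frozenset({"cpu_util"}), "server_power"),
]
steps = plan("vm_power", registry, available={"cpu_util", "vm_cpu_share"})
```

The returned order lists each service after the services it depends on, so it can be handed to a scheduler as is.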

[Figure: system architecture diagram. Labels shown: User, Query, Planner, Service graph, Service Embedding, Tasking ML, Service Scheduler, Service schedule, Execution Run-time, Service Discovery/Self-monitoring, Service info, resource info, Tasking, Progress, Report. Phases marked: Schedule time, Run time.]

Scalable Data Management
• Unprecedented data collection capability creates massive amounts of data
  – Microsoft data centers collect ~1 TB/day
• Data management challenges
  – Information extraction
  – Archiving
  – Analysis

Cypress Approach
• Compress data to reduce storage and I/O cost
• Query processing on compressed data
  – Semantic-based compression to reduce storage and I/O cost
  – Achieve 100x compression with bounded maximum error (L∞)
  – Answer useful queries directly from compressed data
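A bounded maximum error and querying without decompression can both be illustrated with plain quantization plus run-length encoding. This is a generic sketch, not the Cypress algorithm. The function names and the sample readings are hypothetical.

```python
from itertools import groupby


def compress(values, eps):
    """Quantize to a grid of width 2*eps, then run-length encode the codes."""
    step = 2 * eps
    codes = [round(v / step) for v in values]
    return [(code, len(list(run))) for code, run in groupby(codes)]


def decompress(runs, eps):
    """Reconstruct values; each is within eps of the original (L-infinity bound)."""
    step = 2 * eps
    return [code * step for code, count in runs for _ in range(count)]


def count_above(runs, eps, threshold):
    """Lower and upper bounds on how many original samples exceed threshold."""
    step = 2 * eps
    sure = sum(n for code, n in runs if code * step - eps > threshold)
    possible = sum(n for code, n in runs if code * step + eps > threshold)
    return sure, possible


# Hypothetical temperature readings, compressed with a 0.25-degree error bound.
readings = [20.0, 20.1, 20.05, 20.4, 20.45, 23.0, 23.1, 20.0]
EPS = 0.25
runs = compress(readings, EPS)
restored = decompress(runs, EPS)
max_error = max(abs(a - b) for a, b in zip(readings, restored))
bounds = count_above(runs, EPS, threshold=22.0)
```

The selection query never touches the raw readings: the error bound alone determines which runs certainly match and which might.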

• Column-oriented compression: precision reduction [selection]
• Trickles: single-stream spectrum analysis [correlation, histogram]
• GAMPS: cross-stream compression [selection, similarity]

Compress components into "trickles"
[Figure: a 2880-sample signal is split into LoF, Residual, and Spikes components (2880 samples each), and each component is compressed into a trickle:
• LoF → downsample → LoF trickle (144)
• Spikes → lossless compression → Spike trickle (56)
• Residual → sketch with random projection → Sketch trickle (200)]
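The decomposition in this figure can be approximated with standard techniques: block averaging for the low-frequency part, thresholding for spikes, and a random ±1 projection for the residual. The sketch below is an illustration under those assumptions, not the Trickles algorithm. The 2880, 144, and 200 sizes come from the figure, while the test signal, the threshold, and all function names are hypothetical.

```python
import math
import random


def decompose(signal, window, spike_threshold):
    """Split a signal into block means, sparse spikes, and a dense residual."""
    n = len(signal)
    blocks = [signal[i:i + window] for i in range(0, n, window)]
    lof_trickle = [sum(b) / len(b) for b in blocks]  # downsampled low frequency
    detail = [signal[i] - lof_trickle[i // window] for i in range(n)]
    spikes = {i: d for i, d in enumerate(detail) if abs(d) > spike_threshold}
    residual = [0.0 if i in spikes else d for i, d in enumerate(detail)]
    return lof_trickle, spikes, residual


def sketch(vector, k, seed=0):
    """Project onto k random +/-1 directions, scaled by 1/sqrt(k)."""
    rng = random.Random(seed)
    scale = k ** -0.5
    return [scale * sum(rng.choice((-1.0, 1.0)) * v for v in vector)
            for _ in range(k)]


# Hypothetical signal: one sine period, two spikes, and small deterministic noise.
N, WINDOW = 2880, 20
signal = [math.sin(2 * math.pi * i / N)
          + (5.0 if i in (100, 2000) else 0.0)
          + 0.01 * ((i * 7919) % 13 - 6)
          for i in range(N)]

lof_trickle, spikes, residual = decompose(signal, WINDOW, spike_threshold=1.0)
sketch_trickle = sketch(residual, k=200)
```

Sketches built with the same seed can be compared directly: their dot product approximates the dot product of the original residuals, which is one way correlation-style queries could be answered without the full data.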

Trickles + GAMPS
[Figure: compression results for Trickles and GAMPS. Data labels: Messenger, Server Counter Data. Stage labels: LoF & Spike, HiF Trickle, Discard noise. Sizes relative to the original (100%): 13% (8x), 5% (20x), 8% (12x), 1% (100x).]

SenseWeb: Wikipedia of Sensors
• Sensor registration
• Metadata mgmt
• Store and query data
• Share data with others
• One stop portal
• Discover data source
• Spatio-temporal viz.
http://atom.research.microsoft.com/sensewebv3/sensormap/
[Diagram labels: SenseWeb, USGS, SwissEx, Seamonster, ...]
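The registration, storage, discovery, and query functions listed above can be pictured with a small in-memory registry. This is a hypothetical sketch, not the SenseWeb API. Every class, method, sensor ID, and coordinate below is invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Sensor:
    sensor_id: str
    kind: str    # metadata, e.g. "temperature"
    lat: float
    lon: float
    readings: list = field(default_factory=list)  # (timestamp, value) pairs


class Registry:
    """Toy sensor registry with spatial discovery and temporal queries."""

    def __init__(self):
        self._sensors = {}

    def register(self, sensor):
        self._sensors[sensor.sensor_id] = sensor

    def store(self, sensor_id, timestamp, value):
        self._sensors[sensor_id].readings.append((timestamp, value))

    def discover(self, kind, south, west, north, east):
        """IDs of sensors of one kind inside a latitude/longitude box."""
        return sorted(s.sensor_id for s in self._sensors.values()
                      if s.kind == kind
                      and south <= s.lat <= north
                      and west <= s.lon <= east)

    def query(self, sensor_id, start, end):
        """Readings with start <= timestamp < end."""
        return [(t, v) for t, v in self._sensors[sensor_id].readings
                if start <= t < end]


registry = Registry()
registry.register(Sensor("s1", "temperature", 47.64, -122.13))
registry.register(Sensor("s2", "temperature", -23.55, -46.63))
registry.store("s1", 10, 21.5)
registry.store("s1", 20, 22.0)

nearby = registry.discover("temperature", south=47.0, west=-123.0, north=48.0, east=-122.0)
window = registry.query("s1", start=0, end=15)
```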

Thoughts
• Every system is different
  – Power options & power consumption
  – Infrastructure vs infrastructure-less
  – Stationary vs mobile
  – Engineered vs opportunistic vs participatory
• Extracting information from data is the challenge
  – Modeling, validation, learning, hypothesis testing
  – Archiving vs streaming
• Network effects
  – Crowd sourcing/citizen scientists
  – Trust
• Can we design sensing systems at scale?
  – Computer scientists and real scientists need to collaborate
  – Fast prototyping and feedback

ACM SenSys 2011 in Seattle
• Conference dates: Nov. 1-4, 2011
• Submission deadline: April 1, 2011
• Workshop proposals are welcome!