cloudera-inc
DESCRIPTION
Opower is a fast-moving energy management SaaS company that collects sensor data from nearly all of the major utilities in the United States, covering more than 45 million American households, along with major utilities in 5 countries throughout Europe and AsiaPac. Opower manages more than 100 billion meter reads, spanning high-frequency power data (AMI), smart thermostat data, and weather data. Currently all data at Opower is stored in HBase or Hadoop (and is notably not security sensitive). This talk covers Opower’s HBase architecture, highlights current and potential uses of data in HBase, shares the vision for Opower’s future projects and directions, and reveals how Opower’s big data management has allowed the company to help its utility clients save enough energy to power a city of nearly 200,000 people and save utility customers more than $70 million since 2008.
Citation preview
My life with HBase
Drawn to Scale
Drawn to Scale, Opower, Cloudera, Factset
About Opower
Opower is a customer engagement platform for the utility industry
Home energy reports
Customized utility bills
Energy efficiency programs for utilities
Opower runs on analytics
Analytics run on Hadoop + HBase
Opower analysis relies on data from a variety of sources
» Electric Utility Usage Data
» Gas Utility Usage Data
» Thermostat data
» Weather data
OPOWER Platform: Data Storage & Processing, Disaggregation Algorithms, Shared Energy Signature Repository
Opower’s first architecture could not support their analytic vision
MySQL
Scalability? Performance? Data integration?
Analytic workflow instead of analytic apps:
SQL -> CSV -> R -> too little, too slow
Problem #1 Data Lake Cost
Usage AMI Regional AMI Sensor Data Data Lake
Problem #2 Slower and slower queries
Smart-grid-scale data
Lots of supporting data: weather, demographics, etc.
Problem #3 It was taking lots of “magic”
Intense analytics
Strange schemas
Segmented queries
Hadoop + HBase at Opower
Opower determined that they needed an entirely new data architecture
NexGen Architecture @ Opower
Early success: HBase AMI
What rocked
Endless, cheap scalability
The analytics team loved it!
What sucked
Hard on the ops team: still trying to grok it
NoSchema p1.
Creating schema
Managing metadata
Schema <=> Performance
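To make the "Schema <=> Performance" point concrete: in HBase, rows are stored sorted by row key, so the key layout decides which reads are cheap range scans. The sketch below is a hypothetical illustration, not Opower's actual schema. The key layout (meter ID, a zero byte, then a reversed timestamp) and the names `make_row_key` and `scan_range` are assumptions made for this example, and it uses no HBase client at all, only byte strings.

```python
import struct
from typing import Tuple

MAX_TS = 2**63 - 1  # largest signed 64-bit value, used to reverse timestamps


def make_row_key(meter_id: str, epoch_ms: int) -> bytes:
    """Hypothetical composite key: <meter_id> 0x00 <reversed timestamp>.

    Assumes meter IDs never contain a zero byte and 0 <= epoch_ms <= MAX_TS.
    """
    reversed_ts = MAX_TS - epoch_ms  # newer reads get smaller values, so they sort first
    return meter_id.encode("utf-8") + b"\x00" + struct.pack(">q", reversed_ts)


def scan_range(meter_id: str, start_ms: int, end_ms: int) -> Tuple[bytes, bytes]:
    """Return (start_row, stop_row) covering start_ms <= t <= end_ms for one meter.

    start_row is inclusive and stop_row is exclusive, the usual scan convention.
    """
    start_row = make_row_key(meter_id, end_ms)  # newest read in the window sorts first
    stop_row = make_row_key(meter_id, start_ms) + b"\x00"  # just past the oldest read
    return start_row, stop_row


keys = [make_row_key("meter-42", t) for t in (1000, 2000, 3000)]
start, stop = scan_range("meter-42", 1000, 2000)
in_window = [k for k in sorted(keys) if start <= k < stop]
```

With this layout, all reads for one meter are contiguous and the newest come first, so "latest N reads for a meter" is a short scan. The trade-off is that queries across all meters for one time window no longer map to a single contiguous range.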
What sucked
HA
Failover
Snapshots
No secondary index
Aggregation is slow (Rollup/OLAP)
Poor client performance
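One common workaround for slow aggregation is to maintain rollups at write time, so a query reads one precomputed value instead of scanning raw reads. The sketch below is an assumption-laden illustration rather than anything from the talk: `RollupStore` is a hypothetical in-memory stand-in for a raw table plus an hourly rollup table, and it involves no HBase API.

```python
from collections import defaultdict

SECONDS_PER_HOUR = 3600


class RollupStore:
    """Hypothetical in-memory stand-in for a raw table plus an hourly rollup table."""

    def __init__(self):
        self.raw = {}                     # (meter_id, epoch_s) -> kWh
        self.hourly = defaultdict(float)  # (meter_id, hour_start_s) -> summed kWh

    def put_read(self, meter_id, epoch_s, kwh):
        """Store a raw read and adjust the hourly total by the change in value."""
        key = (meter_id, epoch_s)
        previous = self.raw.get(key, 0.0)  # 0.0 if this read is new
        self.raw[key] = kwh
        hour_start = epoch_s - (epoch_s % SECONDS_PER_HOUR)
        # Adding the delta keeps the rollup correct when a read is overwritten.
        self.hourly[(meter_id, hour_start)] += kwh - previous

    def hourly_usage(self, meter_id, hour_start_s):
        """Read one precomputed total instead of scanning the raw reads."""
        return self.hourly.get((meter_id, hour_start_s), 0.0)


store = RollupStore()
for t in (0, 900, 1800, 2700):   # four 15-minute reads in the first hour
    store.put_read("m1", t, 0.25)
store.put_read("m1", 3600, 0.5)  # one read in the second hour
```

The cost moves from read time to write time: every write touches two places, and in a real cluster the two updates would need to be kept consistent, which this sketch does not address.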
It would be better if only …
Developers were not forced to know how the data is stored, indexed, etc.
There were nicer APIs and better query languages (SQL?)
Version migrations were easy
Hierarchical tables
Real-time tuning
Did I mention HA?
In summary
HBase has helped Opower achieve their analytic vision
But they’ve still got a long way to go
HBase still has a long way to go