Upload
others
View
5
Download
0
Embed Size (px)
Citation preview
© copyright 2003 hewlett-packard company 1
31 March 2003FAST 2003 keynote - john wilkes
Data services –from data to containers
FAST 2003 keynotejohn wilkes
slide 231 March 2003 FAST 2003 keynote - john wilkes
Key messages
Rising system complexity +rising abilities +rising expectations
Solution:• define data QoS needs• use storage QoS abilities• automate storage + data
management
Our target should be:data services
© copyright 2003 hewlett-packard company 2
31 March 2003FAST 2003 keynote - john wilkes
slide 331 March 2003 FAST 2003 keynote - john wilkes
persistent data
applications, data, storage
consumers, customers
applicationsapplication(s)application(s)
application(s)application(s)business logic, web services, OS, …
business logic, web services, OS, …
servicesLocaldata
storage systemcontainers for data
applications
© copyright 2003 hewlett-packard company 3
31 March 2003FAST 2003 keynote - john wilkes
slide 531 March 2003 FAST 2003 keynote - john wilkes
Personal applications
The digital life• Information on the move• Data at home• Interactions everywhere
• An information-hungry society
“When I get a little money, I buy books. And if there is any left over, I buy food.”
– Erasmus
slide 631 March 2003 FAST 2003 keynote - john wilkes
Personal applications
Wherever I go … there’s my data
– from islands of isolated data(work, home, on the move,PC, laptop, PDA, server, …)
– to anywhere, anytime access to data
© copyright 2003 hewlett-packard company 4
31 March 2003FAST 2003 keynote - john wilkes
slide 731 March 2003 FAST 2003 keynote - john wilkes
Personal applications
1TB in my pocket!
Now what?– security?– privacy?– resiliency?– freshness of data?– relevance?– validity?
slide 831 March 2003 FAST 2003 keynote - john wilkes
Personal applications + the back-end
there is no middle!
© copyright 2003 hewlett-packard company 5
31 March 2003FAST 2003 keynote - john wilkes
slide 931 March 2003 FAST 2003 keynote - john wilkes
Enterprise (commercial) applications
• on-line– “business critical” communications
• email, workflow systems, …– OLTP (e.g., order entry)– customer interactions (e.g., Verizon)
• back office– SAP, ERM, …– day-to-day finance systems– logistics planning and operations– payroll, …
• …
slide 1031 March 2003 FAST 2003 keynote - john wilkes
sample enterprise IT plan
Call Event Management Training Services
eDelivery
Sales & Marketing
eDelivery
Project Management HW & SW Supply ChainIM & Reporting
FinanceReference Systems
Bluestone App Server
.NET App Environment
CustomerConnectivity
CISL, CPRS, HAO,Predictive, ISEE
Shared Appse.g. Siebel
App Servers(NT)
HP-UXWeblogicJ2EE AppServers
SiteminderSiebel
eChannelPortal (OOTB)
JavaScript/JSP/NTIIS/ ASP.NET, Apache/Javascript, Weblogic/JSP
Uniform Portal/Web Application Framework (One or More Instances)
B2B &Fulfillment
Partner Major Account SMB Consumer
SoftwareManagementPatches
Other Clients
W2K.NET AppServers
SRS(VendorMaster)
PPCProduct Pricing Central
(Material MasterPurchasing Info Records
PartPricingCost
Consulting ProductsService Master)
CRS(Customer
MasterAP/Japan/
Europe/LA/NA)
Skillpack
PartnersBPS
(CSN & CPN)
Upfront Service Order(pre-paid). For service delivery
& future renewal
IndirectRegistration
ServiceEntitlement
B2BiBulk Rosettanet
eClaims
Entitlement Service
eSupport
eSellingStoresSMB
Major AccountConsumer
Sales CallCenter
InternetMarketplaces
Claim WebService
EDISupportpackRegistration
EDITrans-
formationOrderPacks
OrderePacks
ServiceEntitlement
ServiceEntitlement
ePackRegistration
Obligation Feedfrom HPS SAP
SCA
PostalServiceTeams
Consumer
J2EE App Environment
RegisterConsumer
Packs
OrderPacks
OrderPacks
WWEO
WWSNRS
IntegratedWarranty
Terms Mgmt.& Entitlement
Solution
Obligation/Entitlement ODS
Service Bites,PrintAdvantage,
University, Profiler,PIN
Parts ServicesReportingExcaliburUnit Config
.NET App Environment
XML Cannonicals
PRSPricing
Reference
PMGProduct Master
Armor Aware WMS
WWPACK
SBW
University
Web BasedExams
Training on theWeb
(Enrolement)
E-Testing(CAT)
ExternalCompanies
CA(Compaq Accreditation)
Accredited EngineerWeb Access
Exam UpdatesGeneral UserAccess
LearningUtility
ESB Learning
SAP ODS
WESInterface
(SIS)
GSEMInterface
(SIS)
RosettanetCase
Exchange
DocumentumMinimal KM
Management
ConcentraExtended KMManagement
OperationalData Store
HPCSManagment
ProvisionerXML Extract Heavy UI
SAW Portal
Light UI(xMeTaL)
SearchOffline Server
AnalyticsUI
Spider
Third PartyAcquisition
LegacySources
Legacy UIs
From OpenviewService Desk
CATSW - DeliveryLabor Tracking
EQUATEIC Billing
Business Intelligence -TBD
Project Management
Manual
Manual
From MPC
Serv
ice
Laye
r - M
essa
ging
Bac
kbon
e (E
IA) o
r ETL
(Inf
orm
atic
a)
Serv
ice
Laye
r - E
TL (I
nfor
mat
ica)
Serv
ice
Laye
r - E
TL (I
nfor
mat
ica)
iGSO BWFrom HPS SAP
WFM IMODSFrom WFM
OperationalMartsWFM-IM
SmartiGSO (BW)Part Page?
MPC
OdessaODS
Odessa DataSources:
3PL, Tabula, PIPE,etc
WW HPS WarehouseConstellation
StrategicMartsICEMANCalistoWMS
MagnetoAuroraSVR
ExcaliburERGOPSDM
iGSO Mgmt
Other DataSources:
Qspeak, ISEE,Oscar, Kahuna,
Passkey 2000 etc
Master Data: Customer,Product, Supplier, Organisation,Geography, BOM, Employee,Chart of Accounts, Material
Event Repositories:Service Delivery, Orders &Contracts, Supply Chain, Finance,Engagement Mgmt, Sales &Marketing
ExcaliburExternal
WebReporting
Ad-HocReporting
Operation.Excellence
Tables
BalancedScorecard
Parts Service: CSN Global Business Services
PortableRepair EDI
SONICEDI
Orders / Shipments
VOYAGER/iHUB
XelusPlan / Extend
ProductionRequisitions
Inventory / Costs
UniversityOrdersGAP?
Financial Postings
Tech/Training Orders
Taxware
SAPAPO
PEPs E-Stores
Vendor EDIS-Plus
UPSLG (NA)
APO active but not usedfor Order Routing in Step 1
CWS /Keychain
To iGSO BW
PowerInterface
OrderSupportpackWeb Service
QuoteManagement
QuotePricing
SupportpackRegister Web
Service
Watson
ContractAdmin
QL
GSEM SDK
EIA
WFMClarify
WES InterfaceReference DataLoad (CDO)
IM Extract
University BulkLoad
Technician Info
Skillpack BulkLoad
ScoreHPS SAP "A"Future State
ACI/WTI
Prod DivEscalations
Paging(Case Update)
WFM Copy
Click Schedule
Case - WES+
PGUReferenceAuthoring
Call CenterAuthoring
Agent
eTouch
KM LightAuthoring
Parts Update
OpenView
- Service Level Mgmt- Help Desk Mgmt- Incident Mgmt- Configuration Mgmt- Change Mgmt- Problem Mgmt- Application Monitoring- EvSPlent Detection- Capacity Mgmt- Work Order Mgmt- Reporting- Network Fault Mgmt
To CATSW(Labor Tracking)
eVictor(Request Mgmt)
EASI, ATM, eVictor(Provisioning)
Ruleware(Process Guides,Escalation Mgmt)
JETRouting, Notification,
Work Force Mgmt
RadiaInformation Mgmt
Two-way HPCECase Exchange(WebMethods)
WFM ClientCall Center
Workflow Agent
CustomerSystemConnect
eMaileResponse Mgr
eMailDelano
WFM MobileClient
(Untethered)
WFM ICAClient (Citrix)
Citrix
CWAeSupport
KCSIntegration
CSNBusinessServices
Hotlinks
Entitlement
OM InterfacePer incident
quote
Credit Card(Corporate)
SageTo ServiceNotes ODS
ITRC
ConsumptionUpdates (5.0)GEMS
ProductDivision CHS's
To SAP &WWPack
Bulk Load
Bulk Load
SearchEngine
Analytics
Search UI
Onsite Agents
IntranetPortal
MobileFormats
ServiceNotes
Linkage
HPS SAPEnterprise
Instance "A"
FinanceFI/CO/SPLFuture:SCORE
HPFO SAPEnterprise
Instance "B"
ARAPGL
AP, AR, GL
AP, AR,AP/AR Clearing
Lighthouse GL
HPFO BusinessWarehouse
To SAPODS
EQUATEICO Audit Detail
From SalesSAP Systems
Hyperion
HPFOMaster Data
WWCLASS(Foreign Trade)
HarmonizedCodes
Sprint/007
iGSO CentralHPS SAP
EnterpriseInstance "A"
VISTAPSG "O"
IPG "N"
ESG "M"Sales SAPEnterprise
ARAPGL
HPS SAPEnterprise
Instance "A"
ContractsPartsPacksQuotes
Prophecy [email protected]
Siebel (+MS/C&I)Prophecy
(Sales Funnel)
DirectRegistration
SWATFinW
OMEGASales Comp
IPC
OM
SUMWISDOMIMAGING
Translation
Managed Services eDelivery
Ruleware(Customer
Data)
EMS(Escalation
Mgmt)
Openview(Portal)
JET(Notification)
Support Software, ePro,eCase, Instant Software
(eSupport Portal)
eProKnowledge
Telephone
UDDI
To SageToEntitlement
CDOQuoteObject
Subscribe
ServiceEntitlement
Contract &Quote Lookup
PackObject(CCP?)
PackObject(CCP?)
WarrantyRegistrations
?
CCDB overFAI Warranty Registrations
MS Resource Management
PeoplesoftHR
PORGY(HR Data andOrg Hierarchy
Practices)
Skillpack(Skills)
RMMP(Resource Market
Place)
To MPC IM DW
ED(Enpterprise
Directory)
Consulting //PursuitKnowledge Managementt
Livelink(K-net and KMS)
SPS(Sharepoint Portal
Services)
STS(Sharepoint Team
Services)
Groove(External
Colloboration)
Procurement SAP "A"
VendorTransmission of POs
(Purchase Order)
DTT(Goods Receipt)
DTT(Goods
Movement)
MPC SAP EnterpriseInstance "A"
SD, MM,FI/CO/SPLPS, HR, CATSWCapture OrdersManage OrdersBill / InvoiceProcurementProcess Fin. Trans.Plan / Manage BudgetsManage Cash & LiquidityAnalyze & Report Results
Local DB(Oracle/Unix)
ContractManage
QuoteConfig
Novient(Resource
Management)
HR
ERGO(Consultant Effort
& HR)
Omega(MS Expenses)
MS Project(Small - MIddle
Projects)
Large Projects(TBD)
Sales Omega(Sales Comp)
CRO(Order
Recognition)
VCP(Credit
Management)
DDT(Italy only)
WISDOM(Billing Output
Imagiing)COPE/BETBids/Quotes
New Tax?(Worldwide Tax)
ESSEC(Foreign Trade)
Invoice Printing(Offline Printingand Distribution)
RPL(Restricted Parties
List)
FusionHP Products
WatsonHP Products
Config and Quote
Order
From MM
SWSCHPS SAP
EnterpriseInstance "C"
Labs, Mkt,Prod Groups
ISSW(EntitlementGenration)
FTP toShippers
GLiS
eRendezvous
SW Mastering
To HPFO
To CalistoIM
From PPC
SUM(Address Update)
OM
ContractCDO
Internal Portal
SLM(Ext Portal)
IOM/HPOM
GSOIAInterface
iGSOInterface
Central UNASTNT (EMEA)
iGSO CentralHPS SAP 4.6Instance "D"
ESSEC
WWClassFS-SPOC DSPS/
FS50I-COST/CCS/
GPSy
ORION
PPC RIFLESFT98
OMNI
Trade
EMEA Reference
RPL
Eiffel/Heart OREquate
Sub Financials
SAPHPFO
"Tsquared"Taxware
SRS
FinanceKahuna/
3D Comet/Tabular
Sokrates
Returns
HPS Application LandscapeFuture State 2005 - DRAFT
November 12, 2002
MPC ODS
SAPCOPA
WebReports
@IM PMMSPortal
Returns
Multi-domainThin Clients
OrchestrationSales [email protected]
PackCPP
ContractCDO
EventObject Price Product
ConsumptionUpdate
Pub/Sub
WarrantyRegist
SubscribePublish
PackObject(CCP?)
Elf-pack QL
PackObject(CCP?)
CREST
To WES+
PWA
simplifie
d!
© copyright 2003 hewlett-packard company 6
31 March 2003FAST 2003 keynote - john wilkes
slide 1131 March 2003 FAST 2003 keynote - john wilkes
these are not desktop systems!
slide 1231 March 2003 FAST 2003 keynote - john wilkes
Enterpise – what’s next?
Scientific applications as predictors of future trends?
• Huge quantities of data– instruments, sensors, time sequences, …
• Data often:– stochastic (“merely representative”)– large-scale– specialized access patterns
• Some examples: bioinformatics, physics, enterprise data mining inputs
© copyright 2003 hewlett-packard company 7
31 March 2003FAST 2003 keynote - john wilkes
slide 1331 March 2003 FAST 2003 keynote - john wilkes
Scientific applications – Sanger
Graphics from: Phil Butcher, Meeting user demands: a solution architecture, http://www.sanger.ac.uk/Info/IT/, Sanger Institute, 2002
slide 1431 March 2003 FAST 2003 keynote - john wilkes
Scientific applications – Sanger
Graphics from: Phil Butcher, Meeting user demands: a solution architecture, http://www.sanger.ac.uk/Info/IT/, Sanger Institute, 2002
© copyright 2003 hewlett-packard company 8
31 March 2003FAST 2003 keynote - john wilkes
slide 1531 March 2003 FAST 2003 keynote - john wilkes
Scientific applications – Sanger
Graphics from: Phil Butcher, Meeting user demands: a solution architecture, http://www.sanger.ac.uk/Info/IT/, Sanger Institute, 2002
slide 1631 March 2003 FAST 2003 keynote - john wilkes
Scientific applications – CERN
© copyright 2003 hewlett-packard company 9
31 March 2003FAST 2003 keynote - john wilkes
slide 1731 March 2003 FAST 2003 keynote - john wilkes
Scientific applications – CERN
slide 1831 March 2003 FAST 2003 keynote - john wilkes
Scientific applications – CERN
© copyright 2003 hewlett-packard company 10
31 March 2003FAST 2003 keynote - john wilkes
slide 1931 March 2003 FAST 2003 keynote - john wilkes
Scientific applications – CERN
slide 2031 March 2003 FAST 2003 keynote - john wilkes
Scientific applications – CERN
detector proper40MHz collisions 1MB/event
50,000 data channels200 GB buffering
~1TB/s
~500Gb/s
event filtering(1 CPU/event)
~0.5GB/s
data storage ~5PB/year
© copyright 2003 hewlett-packard company 11
31 March 2003FAST 2003 keynote - john wilkes
slide 2131 March 2003 FAST 2003 keynote - john wilkes
Scientific applications – data grids
Today, CERN has 4000–6000 active clients, 2000 of which are offsite.Tomorrow, they will have ~ten tier1 data/computation centers spread across the globe.
data
© copyright 2003 hewlett-packard company 12
31 March 2003FAST 2003 keynote - john wilkes
slide 2331 March 2003 FAST 2003 keynote - john wilkes
data my preferencescontainer user keystroke history log
data user keystroke historycontainer file (byte vector)
data file (named)container volume (virtual block vector)
data volume fragmentcontainer LU
data LU fragmentcontainer disk drive
Data versus storage (containers)
application
file system
disk array
volume virtualization system
slide 2431 March 2003 FAST 2003 keynote - john wilkes
Data attributes
• what’s true about the data?– how much?– rate of growth?– access patterns?
• what do we want to be true about the data?– as above, plus …– resiliency?– security?– semantics?
– plus … predictability in the face of change
© copyright 2003 hewlett-packard company 13
31 March 2003FAST 2003 keynote - john wilkes
slide 2531 March 2003 FAST 2003 keynote - john wilkes
Data attributes
All data not created equal
Some data:• has little value• never changes• can be regenerated
Not all data needs to be:• accessible• kept forever• secret• up to date
slide 2631 March 2003 FAST 2003 keynote - john wilkes
Data attributes – QoS
• size• access
– from where?– how fast?– when? (expectations; remote vs wired)
• resilience– data loss/corruption
(operator error, software bugs, viruses, …)• security
– who can access/change/control?• semantics
– consistency? updates? correctness?
SLO: a set of QoS objectives
SLA: a contract to provide them (adds penalties, monitoring rules, etc)
SLO: a set of QoS objectives
SLA: a contract to provide them (adds penalties, monitoring rules, etc)
© copyright 2003 hewlett-packard company 14
31 March 2003FAST 2003 keynote - john wilkes
slide 2731 March 2003 FAST 2003 keynote - john wilkes
Data attributes - size
the data itself
• current size
• over time– growth rates?– variance?
• unwanted data?
slide 2831 March 2003 FAST 2003 keynote - john wilkes
Data attributes - size
after it’s packaged
• wastage?
• slack capacity?
• duplicates• duplicates
© copyright 2003 hewlett-packard company 15
31 March 2003FAST 2003 keynote - john wilkes
slide 2931 March 2003 FAST 2003 keynote - john wilkes
Data attributes – access
• from where?• local SAN/LAN• wired WAN• disconnected/mobile
• how fast?– QoS parameters
• when? – “availability”
slide 3031 March 2003 FAST 2003 keynote - john wilkes
Data attributes – access
• stream: an access pattern• behaviors: I/O rate, I/O request size,
spatial locality, temporal locality, cache affinity, phasing, …
• requirements: bandwidth, response time
• data: information being accessed
• store: a container– e.g., Logical Unit, Logical Volume – provider of resources to realize demands
(capacity, bandwidth, …)
store
DataData
StreamStream
Stream
© copyright 2003 hewlett-packard company 16
31 March 2003FAST 2003 keynote - john wilkes
slide 3131 March 2003 FAST 2003 keynote - john wilkes
Data attributes – access variability
i3125om5, RC = 8GB, WC = 8GB; cache outcome percentages by I/O address range
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
0 400 800 1200 1600 2000 2400 2800 3200 3600 steadystatetrace time [seconds]
I/O's
ave
average overfull
trace
averagefor
secondhalf
readhits
reads
writes
Coalesced write
Warm write
Cold write
Delayed write
Compulsory read miss
Replacement read miss
Read hit - reread
Read hit from write cache
Sample results for a single run
cache hits for a traced openmailworkload
cache hits for a traced openmailworkload
Graph by the HPL Sonora teamGraph by the HPL Sonora team
slide 3331 March 2003 FAST 2003 keynote - john wilkes
Downtime matters($/hour of downtime – frequency)
<$50k46%
<$100k15%
<$250k13%
<$500k9%
<$1m9%
<$5m4%
>=$5m4%
Data from: 2001 cost of downtime online survey Eagle Rock Alliance, NJ
© copyright 2003 hewlett-packard company 17
31 March 2003FAST 2003 keynote - john wilkes
slide 3431 March 2003 FAST 2003 keynote - john wilkes
Causes of data loss or unavailability
softwareprogram
malfunction
computer virus
hardware or system
malfunction
human error
site disaster
44%
3%
7%
14%32%
Source: Ontrack
slide 3531 March 2003 FAST 2003 keynote - john wilkes
Data attributes – access
• “Number of nines” availability doesn’t reflect important info– performance? (normal and degraded modes)– outage frequency?– outage duration?
99 999Time
QoS
Met
ric
0
normal behavior
outage duration
QoS degradationfailure
© copyright 2003 hewlett-packard company 18
31 March 2003FAST 2003 keynote - john wilkes
slide 3631 March 2003 FAST 2003 keynote - john wilkes
Data attributes – access
• Reliability: likelihood system up continuously from 0 to t
• Availability: likelihood system will be up at time t
• Performability: likelihood system has performance p at time t
Time
QoS
Met
ric
0
normal behavior
outage duration
QoS degradationfailure
slide 3831 March 2003 FAST 2003 keynote - john wilkes
Data attributes – resilience/reliability
Resilience/reliability– lack of data loss or corruption, from:
• operator error, software bugs, viruses, …• hardware failures (a container property)
• simple model: annual failure rate– AFR = 1/MTTDL (“mean time to data loss”)– any size loss is equally bad
• a richer model:< size [bytes], type [recent/arbitrary], rate [per year]>
© copyright 2003 hewlett-packard company 19
31 March 2003FAST 2003 keynote - john wilkes
slide 3931 March 2003 FAST 2003 keynote - john wilkes
Data attributes – resilience/reliability
Recent data loss
• usage is driven by:– efficiency needs (e.g., buffering)– user’s conceptual model (“yesterday’s state”)
• recovery– return to/retrieve data from a time-based recovery point– re-apply appropriate intermediate changes
slide 4031 March 2003 FAST 2003 keynote - john wilkes
backup data also has accessibility, performance, and resilience needs
backupsbackups
data-levelbackup/snapshot
Data attributes – resilience/reliability
Implementation techniques
DataData
copy1
storage-levellocal mirrors
data mappedto storagecontainer
copy 2
storage-level remote mirrors
© copyright 2003 hewlett-packard company 20
31 March 2003FAST 2003 keynote - john wilkes
slide 4131 March 2003 FAST 2003 keynote - john wilkes
Data attributes – security
Who can access/change/control data?
• implementation:– physical security (common today in FibreChannel SANs)– host, network, storage device
• secure transmission• secure storage
slide 4231 March 2003 FAST 2003 keynote - john wilkes
Data attributes – semantics
• correctnessan information-level construct: not knowable at the data level
• consistency/coherency between versions“two-phase locking reduces inconsistencies to a dull roar”
– data versions (e.g., backups, snapshots, archive)– data copies (e.g., software release)– storage copies (e.g., local/remote mirrors)
© copyright 2003 hewlett-packard company 21
31 March 2003FAST 2003 keynote - john wilkes
slide 4331 March 2003 FAST 2003 keynote - john wilkes
Data life cycle and data placement
• Overlaid on all the previous discussion
• Sample phases:– gathered, generated– production use – access/updates– demotion/archiving– discarding/expunging
• All phases can exploitstorage (container) hierarchy
offline tape
near-line tape
near-online disk
low-enddisk
high-enddisk
slide 4431 March 2003 FAST 2003 keynote - john wilkes
storage
© copyright 2003 hewlett-packard company 22
31 March 2003FAST 2003 keynote - john wilkes
slide 4531 March 2003 FAST 2003 keynote - john wilkes
Storage
• disk drives• MEMS, MRAM• disk arrays
• tape drives• tape libraries
• storage systems (SANs)• storage management systems
Image courtesy of Seagate Technology, Inc.© 2000 Seagate Technology, Inc.
slide 4631 March 2003 FAST 2003 keynote - john wilkes
0
10,000
20,000
30,000
40,000
50,000
60,000
1997 1998 1999 2000 2001 2002 2003 2004 2005 2006
Disk storage capacity shipped expected to grow at 62% CAGR
Dis
c st
orag
e sh
ippe
d (P
etab
ytes
)
Source: IDC 2002, Seagate
Units shipped (millions)
Units shipped
0
50
100
150
200
250
300
350
estimates
Disc capacity
© copyright 2003 hewlett-packard company 23
31 March 2003FAST 2003 keynote - john wilkes
Average disk platter size
0
5
10
15
20
25
30
35
40
1994 1995 1996 1997 1998 1999 2000 2001 2002
Plattersize(GB)
Mobile
Desktop
Enterprise
Average
Source: Peter A. Brew, Seagate Technology, R&D Division, Dec. 2002
Date
© copyright 2003 hewlett-packard company 24
31 March 2003FAST 2003 keynote - john wilkes
slide 5031 March 2003 FAST 2003 keynote - john wilkes
Storage devices – MEMS
Electron beam
From: “Scientific American” MagazineMay 2000 Issue, Page 72
~ 30 µm (about the diameterof a human hair)
~10x faster than a disk driveCost < flash RAMGbytes per moduleNonvolatilePortable: small, rugged, low power
~10x faster than a disk driveCost < flash RAMGbytes per moduleNonvolatilePortable: small, rugged, low power
© copyright 2003 hewlett-packard company 25
31 March 2003FAST 2003 keynote - john wilkes
slide 5131 March 2003 FAST 2003 keynote - john wilkes
Storage devices – MEMS
Media-mover prototype• replaces spinning disks with a
micro-machined motor• faster access at lower power
and lower cost
From: Jim Brug and Rich Elder, HP Labs
slide 5231 March 2003 FAST 2003 keynote - john wilkes
Magnetic RAM (MRAM) technology
Images from: Jim Brug and Rich Elder, HP Labs
Cross-section of memory cell
Word line
magnetic field: Hy
magnetic field: Hx
M ii
Bit line
i
~ DRAM speeds< DRAM cost100’s MB/chipnon-volatile
~ DRAM speeds< DRAM cost100’s MB/chipnon-volatile
© copyright 2003 hewlett-packard company 26
31 March 2003FAST 2003 keynote - john wilkes
slide 5331 March 2003 FAST 2003 keynote - john wilkes
Power, etc
Storage devices – disk arrays
to host
XOR
Arraycontroller Cache
stripe data parity
slide 5531 March 2003 FAST 2003 keynote - john wilkes
High-end disk array – physical structure
DKC
R1 DKU
L2 DKUAdditional DKU
L1 DKUDisk Subsystem Basic UnitDKC and R1 DKU
Additional DKU
R2 DKU
AC Power Boxes (typical)
Cache memory batteries
Disks (typical)
Front DKC Logic Box:CHIPS, ½ Cache, ½ Shared
Memory, ½ CHIP PowerRear DKC Logic Box:
ACPs, ½ Cache, ½ Shared Memory, ½ CHIP Power
HP xp1024 disk array
© copyright 2003 hewlett-packard company 27
31 March 2003FAST 2003 keynote - john wilkes
slide 5631 March 2003 FAST 2003 keynote - john wilkes
High-end disk array – logical structure
max. 4 CHIP pairs, up to 64 ports
up to 1024 drives36 GB 15 K rpm and 73 GB 10 K rpm disks
write-mirroreddata cache
data cacheup to 64 GB;128 GB 2H02
sharedmemory
up to 3 GB;6 GB 2H02
maximum sequential data transfer:
3.2 GB/s peak from cache2 GB/s sustained from disk
CHIP CHIP CHIP CHIP
basebaseopt
opt
crossbar switches and shared memory interconnect10 GB/s data crossbar and 5 GB/s control; total 15 GB/sec interconnect bandwidth
0123101231
0123101231
0123101231
0123101231
0123101231
0123101231
0123101231
0123101231
ACP
ACP
ACP
ACP
0 1 2 31 0 1 2 31
0 1 2 31 0 1 2 31
0 1 2 31 0 1 2 31
0 1 2 31 0 1 2 31
0 1 2 31 0 1 2 31
0 1 2 31 0 1 2 31
0 1 2 31 0 1 2 31
0 1 2 31 0 1 2 31
ACP
ACP
ACP
ACP
L1 disksL2 disks R2 disksR1 disks
data:2 bytes
* 166 MHz*2/board
*16 boards=10 GB/s
Control:0.5 bytes*166 MHz*4/board
*16 boards=5 GB/s
HP xp1024 disk array
slide 5731 March 2003 FAST 2003 keynote - john wilkes
tape
Storage connectivity (“SN”)
tape
host computer systems
diskarrays
© copyright 2003 hewlett-packard company 28
31 March 2003FAST 2003 keynote - john wilkes
slide 5831 March 2003 FAST 2003 keynote - john wilkes
Disk systems getting smarter – and collective
1988 john wilkes – DataMesh
1996 Garth Gibson – NASD
1999+ Jim Gray – “disks are becoming computers” (Storage bricks have arrived, FAST 2002)
Disk controller+ 1Ghz CPU+ 1GB RAM
Communications:Infiniband, Ethernet, radio…
Applications:Web, DBMS, Files, OS
2001 IBM Almaden –Collective Intelligent bricks
(aka IceCube)
2003 HPL – federated array of bricks (FAB)
slide 5931 March 2003 FAST 2003 keynote - john wilkes
File/record layerFile/record layer
Database(dbms)
File system(FS)
The SNIA shared storage model
Stor
age
dom
ain
Block layerBlock layer
Storage devices (disks, …)Storage devices (disks, …)
Serv
ices
Serv
ices
Dis
cove
ry, m
onito
ring
Dis
cove
ry, m
onito
ring
Res
ourc
e m
gmt,
conf
igur
atio
nR
esou
rce
mgm
t, co
nfig
urat
ion
Sec
urity
, bill
ing
Sec
urity
, bill
ing
Red
unda
ncy
mgm
t (ba
ckup
, …)
Red
unda
ncy
mgm
t (ba
ckup
, …)
Hig
h av
aila
bilit
y (fa
il-ov
er, …
)H
igh
avai
labi
lity
(fail-
over
, …)
Cap
acity
pla
nnin
gC
apac
ity p
lann
ing
Application
Network
Host
DeviceBlock aggregation
Cop
yrig
ht ©
200
0-20
03, S
tora
ge N
etw
orki
ng In
dustr
y A
ssoc
iatio
n
© copyright 2003 hewlett-packard company 29
31 March 2003FAST 2003 keynote - john wilkes
slide 6031 March 2003 FAST 2003 keynote - john wilkes
Device block-aggregation
Network block-aggregation
Host block-aggregation
The SNIA shared storage model
Host
. with
LVM
and
softw
are R
AID
1. Direct-attachFi
le/re
cord
laye
rB
lock
laye
r
NAS head
4. NAS head
2. SN-attach
Host
. with
LVM
Disk array
SN
5. SN aggregation
Aggregationappliance
Blockmetadata
server
File metadata
server
6. Metadata aggr’n
3. NAS server
LAN
HostHost
NASserver
Application
Cop
yrig
ht ©
200
0-20
03, S
tora
ge N
etw
orki
ng In
dustr
y A
ssoc
iatio
n
slide 6131 March 2003 FAST 2003 keynote - john wilkes
Storage systems – parts of larger systems
accesstier
webtier
applicationtier
databasetier
edge routers
routingswitchesauthentication, DNS,
intrusion detect, VPNweb cache 1st level firewall
2nd level firewall
load balancingswitches
web servers
web page storage(NAS)
databaseSQL servers
storage areanetwork(SAN)
applicationservers
files(NAS)
switches
switches
© copyright 2003 hewlett-packard company 30
31 March 2003FAST 2003 keynote - john wilkes
slide 6431 March 2003 FAST 2003 keynote - john wilkes
management
slide 6531 March 2003 FAST 2003 keynote - john wilkes
Management challenges:Administration cost
• Rules of thumb:– 1980: 1 data administrator / 10GB– 2000: 1 data administrator / 5TB
• Problem:– 5TB ~= $5k in a few years– admin cost >> storage cost!
• Conclusion: need to automate allstorage administration tasks
Adapted from Storage bricks have arrivedJim Gray, Microsoft Research, FAST 2002
© copyright 2003 hewlett-packard company 31
31 March 2003FAST 2003 keynote - john wilkes
slide 6631 March 2003 FAST 2003 keynote - john wilkes
Management challenges:Causes of system crashes
Brendan Murphy and Ted Gent, Measuring System and Software Reliability using an Automated Data Collection Process,Quality and Reliability Engineering International, 11:341-353, 1995. © John Wiley & Sons.
0%
10%
20%
30%
40%
50%
60%
70%
80%
90%
100%
1985 1993
Fractionof crashes other
system management
software failure
hardware failure
1985–1993 DEC OpenVMS systems
slide 6731 March 2003 FAST 2003 keynote - john wilkes
Management challenges:Multiple constituencies
business view
(storage)administrator
view
applicationview
© copyright 2003 hewlett-packard company 32
31 March 2003FAST 2003 keynote - john wilkes
slide 6831 March 2003 FAST 2003 keynote - john wilkes
Management challenges:some of the things to address
• Physical infrastructure– capacity planning – discovery, installation– allocation, qualification– configuration
• Logical configuration– data placement, security– volume/device virtualization
• QoS enforcement (runtime)– security, performance– data protection, recovery– consistency/coherency
• Monitoring– reporting– billing
slide 6931 March 2003 FAST 2003 keynote - john wilkes
Determine solution• select devices+configurations• assign load to devices
Construct solution• configure devices,
LVM, …• migrate data
Use the solution• do work• enforce QoS
Monitor QoS• offered load• system response
Understand needs• offered load• system components
businessrequirements
Runningsystem
Runningsystem
Monitor /analyze
Monitor /analyze
Configure /reconfigure
Configure /reconfigure
Design /redesign
Design /redesign
(Changing)
Management challenges: full automation
© copyright 2003 hewlett-packard company 33
31 March 2003FAST 2003 keynote - john wilkes
slide 7031 March 2003 FAST 2003 keynote - john wilkes
Management challenges: the design step
To achieve complete automation:
– Specify what’s wanted (goals), not how to achieve it (implementation)
– Management system must choose what to do, not people
– Human oversight + feedback to correct/refine choices
slide 7131 March 2003 FAST 2003 keynote - john wilkes
design
Explore design space,evaluate alternatives,make choices
automatic design system
automatic design system
Management challenges: the design step
specification
Models,estimators
device abilities,protection-technique effects
workloads, failure types
specification
Estimated, specified, or measured
user/businessgoals
spec
ifica
tion
Tradeoffs,objective functions
© copyright 2003 hewlett-packard company 34
31 March 2003FAST 2003 keynote - john wilkes
slide 7231 March 2003 FAST 2003 keynote - john wilkes
Management challenges: the design step
• There are many techniques available to us– which one to use?– in what circumstances?– with what settings?– are they working?
• Potential provocative idea: no new techniques until we can define these properties?
slide 7331 March 2003 FAST 2003 keynote - john wilkes
Management challenges: future work
Monitoring Storage QoS
Disasterrecovery
Data protection
Applicationfailover
Consolidation
Applicationlevelmanagement
hpl Hippodrome
Migration
Provisioning(design)
Workload analysis
Storageconfiguration
Virtualization
There’s plenty of scope for it!
© copyright 2003 hewlett-packard company 35
31 March 2003FAST 2003 keynote - john wilkes
slide 7431 March 2003 FAST 2003 keynote - john wilkes
data services
slide 7531 March 2003 FAST 2003 keynote - john wilkes
data service-drivenstorage systems
From some of the parts …… to the sum of the parts
“some assembly required”
© copyright 2003 hewlett-packard company 36
31 March 2003FAST 2003 keynote - john wilkes
slide 7631 March 2003 FAST 2003 keynote - john wilkes
Data services
Anywhere, anytime access to data
QoS-driven:• affordable• flexible• predictable• reliable
In a rapidly changing world,at scales that dwarf the desktop,while leaving people in control
slide 7731 March 2003 FAST 2003 keynote - john wilkes
Data services-driven storage systems
consumers, customers
application(s)application(s)application(s)application(s)business logic, web
services, OS, …business logic, web
services, OS, …
Localdata
© copyright 2003 hewlett-packard company 37
31 March 2003FAST 2003 keynote - john wilkes
slide 7831 March 2003 FAST 2003 keynote - john wilkes
Data services
Rising system complexity +rising abilities +rising expectations
Solution:• define data QoS needs• use storage QoS abilities• automate storage + data
management
Our target should be:data services
slide 7931 March 2003 FAST 2003 keynote - john wilkes
Data services
It’s going to be an exciting ride, at all levels:– storage devices– storage networking– storage management– data management
Welcome aboard!
© copyright 2003 hewlett-packard company 38
31 March 2003FAST 2003 keynote - john wilkes
slide 8031 March 2003 FAST 2003 keynote - john wilkes
Data services – from data to containers
thank you!
[email protected]://www.hpl.hp.com/research/ssp