With the arrival of the cloud and a business focus on service-based reporting, capturing data for Capacity Management has never been more important. These slides discuss the challenges of capturing the data required to answer the demands of the business.
- 1. www.metron-athene.com Data, data, everywhere, and not a bit to use. With the arrival of the cloud and a business focus on service-based reporting, capturing data has never been more important.
2. This is not a presentation about Big Data
    - Big Data
    - Any data?
3. Session Agenda
    - Why talk about data?
    - Business demands
    - My basic principles
    - Data Capture Techniques
    - Data Sources
    - The Obligatory Cloud Part
    - APM
    - CMIS
    - Reality
4. Why talk about data?
    - Dashboard, Dashboard, Dashboard
    - Alerts
    - Automation
    - CMIS
    - What does that all sit on? Raw data
    - Ever increasing number of requests to take data from an ever increasing number of sources
5. Frustration
6. A Problem Shared
    - Data you want
    - Cool data you got hold of
    - Solutions you found
    - Write them down, scrunch them up, and throw them up the front here.
    - Put your name on it
    - No prizes for hitting the presenter
    - Or just ask later
7. Session Agenda
8. What businesses are asking for
    - Have data for everything
        - Internal to a system
        - Across all infrastructure (build a service picture)
        - Business volumes & transaction response times
    - Don't deploy more agents
    - Ensure reliable data
    - Minimal storage
    - No staff
9. Where a lot of people are
    - A handful of tools for specific platforms
    - Designed for sys admin roles
    - No single person can access them all
    - No business data
    - Projections based on resource utilisations
    - Huge volumes of out-of-reach data
    - Some agents, some SNMP capture, some stuff nobody understands anymore
    - Limited staff
10. Session Agenda
11.
Data Capture (My Basic Principles)
    - More is (just this once) more
    - At data capture time get everything you will need
    - Time travel is still fiction
    - Quality is important
    - Put it under YOUR control
    - Full service picture
        - Resource
        - Application
        - Network
        - SAN
        - Business data
12. Session Agenda
13. Capture Techniques
    - Agentless (SNMP, WMI, etc.)
        - Is subject to more security issues, and to network quality.
        - Broken communication = lost data
        - Easier/faster implementation (often), less data of lower quality
    - Agent Based
        - Autonomous: data collected by a local process. If the server is up, data capture is running.
        - Broken communication = catch up later
        - Possibility to use existing agents
        - Overhead (system and human)
    - Vote now!
    - Blended Delivery Model: where most people are.
14. Session Agenda
15. Application/Service Data
    - Databases
        - Well thought out APIs or Windows Counters
        - Well thought out Agents do this
    - SAP
        - Various transactions return perf data (e.g. ST03)
    - What if there is no designed interface?
        - Logs, databases, write your own instrumentation
        - APM Tools
16.
Business/Application Transaction data (APM)
    - A user action = a transaction
    - Log on, Search, Add to Basket, Checkout, Payment = 5 transactions
    - Benefits
        - Common language
        - Service based
        - Defined SLAs
        - Real workload volumes (planning benefits)
    - Usual difficulties
        - No tool capturing this data (ask me for a recommendation)
        - No access to the data held (typically controlled by Operations)
        - No import facility to capacity tool
    - Avoid
        - Exporting data from both tools into Excel and manually cutting and pasting to get combined reports
17. SANs
    - Challenge
        - IOPS remains the biggest bottleneck
        - A surprising number of capacity managers are unaware of the storage capacity available
    - Where to get data
        - SMI-S (Storage Management Initiative Specification)
        - PowerShell Plugins: "Learn PowerShell or learn to serve fries" (some dude, 2008)
        - Storage vendor central control server: Operations Manager, StorageWorks, ControlCenter
    - Using the data?
        - Bring it into your capacity tool
18. In the last 6 months
    - Business / Customer transaction reports (multiple types)
    - OpenVMS T4 data
    - Historical CPU & Memory data from home grown scripts
    - NetApp, HP EVA
    - IP Pool allocation
    - Datacenter temperature & power
19. More detailed example
    - NetApp Operations Manager
    - DFM CLI Export
    - Occupancy and performance data for all LUNs, Volumes, Aggregates & Systems connected to Operations Manager.
    - dfm data export run d comma t 5 mins f avg h 1 day
    - Database tables in .csv
    - Script to produce something nicer to import
20. Session Agenda
21. The Cloud
    - http://xkcd.com/908/
22.
Basic Cloud Types & Challenges (IaaS)
    - Public Cloud (Worst Case)
        - No control
        - You put your faith in the provider
        - Monitor response times only?
    - Private Cloud (Best Case)
        - Full control
        - You are responsible, but have all the data
    - Community Cloud (Never seen)
        - Potential control
        - You are involved and may have access to the data
    - Hybrid Cloud (Where you're likely to be)
        - Some control
        - Full control of the Private Cloud portion only
23. Want to Benchmark the Public cloud?
    - "How hard can it be" (Jeremy Clarkson)
    - Get a VM up and running and see what workload it can handle
    - AWS results all over the place
    - Somebody else must have looked into this:
        - http://www.spec.org/osgcloud/ Still working on it. Join in? (I'm short of the $10,000 required)
        - http://datasys.cs.iit.edu/events/MTAGS12/i02.pdf "IaaS Cloud Benchmarking: Approaches, Challenges, and Experience", Alexandru Iosup, Radu Prodan, and Dick Epema
24. Benchmarking the Cloud problems
    - Cloud evolution
    - Changes made under your feet
    - We are no longer in the loop
    - "commercial clouds such as Amazon EC2 add frequently new functionality to their systems. Thus, the benchmarking results obtained at any given time may be unrepresentative for the future behaviour of the system." (Alexandru Iosup, Radu Prodan, and Dick Epema)
    - So why don't we continually benchmark the cloud? Because it's complex and expensive (Challenge 1 = how to do it cheap)
    - "A straightforward approach to benchmark both short-term dynamics and long-term evolution is to measure the system under test periodically, with judiciously chosen frequencies. However, this approach increases the pressure of the so-far unresolved Challenge 1." (Alexandru Iosup, Radu Prodan, and Dick Epema)
25. Benchmarking (more problems)
    - Even with lots of data, you'll have a hard time making it fit reality because you cannot replicate all the software involved.
    - "We have surveyed in our previous work over ten performance studies that use common benchmarks to assess the virtualization overhead on computation (5-15%), I/O (10-30%), and HPC kernels (results vary). We have shown in a recent study of four commercial IaaS clouds that virtualized resources obtained from public clouds can have a much lower performance than the theoretical peak, possibly because of the performance of the middleware layer." (Alexandru Iosup, Radu Prodan, and Dick Epema)
26. Long term observation
    - "We have observed the long-term evolution in performance of clouds since 2007. Then, the acquisition of one EC2 cloud resource took an average time of 50 seconds, and constantly increased to 64 seconds in 2008 and 78 seconds in 2009. The EU S3 service shows pronounced daily patterns with lower transfer rates during night hours (7PM to 2AM), while the US S3 service exhibits a yearly pattern with lowest mean performance during the months January, September, and October. Other services have occasional decreases in performance, such as SDB in March 2009, which later steadily recovered until December." (Alexandru Iosup, Radu Prodan, and Dick Epema)
27. Final nail in the coffin
    - "Depending on the provider and its middleware abstraction, several cloud overheads and performance metrics can have different interpretation and meaning." (Alexandru Iosup, Radu Prodan, and Dick Epema)
    - So you can't trust the data from clouds to be what you expect.
    - And you can't trust your existing benchmarks to represent the future.
    - So... what can you do?
28.
Private Cloud
    - You are in charge and you monitor the hardware utilisations
    - The Cloud still has physical limits, and soft limits
        - Resource Pools, Reservations etc.
    - Opportunity
        - Resource Utilisation and Service Information combined
        - Users, Processes, Transactions, Business Volumes
    - Challenge
        - Business decision based on easy capacity monitoring?
29. Sess