Big Data Solutions on Cloud – the way forward
By: K. A. Kiththi Perera, Chief Enterprise and Wholesale Officer, Sri Lanka Telecom
ITU-TRCSL Symposium on Cloud Computing 2015 Colombo
Session 04: Big Data Strategy in the Cloud and Applications
Big Data Analytics and Cloud Computing
• Two ICT initiatives are currently top of mind for organizations:
  – Big Data Analytics
  – Cloud Computing
• Big Data Analytics can:
  – Offer valuable insights to create competitive advantage
  – Spark new innovations
  – Drive revenue
• Cloud Computing can:
  – Enhance business agility and productivity
  – Enable greater efficiencies
  – Reduce costs
Both Technologies continue to evolve
Big Data
Harnessing Big Data
• OLTP: Online Transaction Processing (DBMSs)
• OLAP: Online Analytical Processing (Data Warehousing)
• RTAP: Real-Time Analytics and Processing (Big Data Architecture & technology)
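The difference between the first two workloads can be sketched with Python's built-in sqlite3 module. This is only an illustration: the `calls` table, its columns and the sample rows are hypothetical, and a single in-memory database stands in for what would normally be separate transactional and warehouse systems. RTAP is not covered here.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE calls ("
    "id INTEGER PRIMARY KEY, customer TEXT, region TEXT, minutes REAL)"
)

# OLTP-style work: many small transactions, each touching a single row.
records = [
    ("alice", "west", 12.0),
    ("bob", "west", 3.5),
    ("carol", "east", 20.0),
    ("alice", "west", 4.5),
]
for customer, region, minutes in records:
    with conn:  # commits (or rolls back) one transaction per record
        conn.execute(
            "INSERT INTO calls (customer, region, minutes) VALUES (?, ?, ?)",
            (customer, region, minutes),
        )

# OLTP-style read: fetch one row by its key.
first_call = conn.execute(
    "SELECT customer, minutes FROM calls WHERE id = ?", (1,)
).fetchone()

# OLAP-style work: one query that scans the table and aggregates it.
minutes_by_region = dict(
    conn.execute("SELECT region, SUM(minutes) FROM calls GROUP BY region")
)
```

The transactional part touches one row at a time, while the analytical query reads every row to produce a summary.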
Big Data – Variety and Complexity
What’s driving Big Data
- Ad-hoc querying and reporting
- Data mining techniques
- Structured data, typical sources
- Small to mid-size datasets

- Optimizations and predictive analytics
- Complex statistical analysis
- All types of data, and many sources
- Very large datasets
- More real-time in nature
Value of Big Data Analytics
• Big Data is more real-time in nature than traditional DW applications
• Traditional DW Architectures (e.g. Exadata, Teradata) are not well-suited for big data apps
• Shared-nothing, massively parallel processing, scale-out architectures are well-suited for big data apps
“Without big data, you are blind and deaf in the middle of a freeway”
Geoffrey Moore, management consultant and theorist
Need to have a high-performance and easy-to-use data transformation and analytic solution for Big Data
Scale and Architectures
Hadoop Functional Blocks
Hive - A SQL-like query layer (HiveQL) built on top of MapReduce for analyzing large data sets.
Pig - Enables the analysis of large data sets using Pig Latin, a high-level data-flow language.
Sqoop - ("SQL to Hadoop") A Java-based application designed for transferring bulk data between Apache Hadoop and non-Hadoop data stores.
Hadoop Core Components
• HDFS – Hadoop Distributed File System (Distributed Storage):
  – Distributed across multiple “nodes”
  – Natively redundant
  – “NameNode” tracks locations
• MapReduce (Distributed Processing):
  – Split a task across processors
  – Self-Healing, High Bandwidth
  – Clustered Storage
  – JobTracker manages TaskTrackers
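The "split a task across processors" idea can be sketched as a word count in plain Python. This is a single-process illustration of the map, shuffle and reduce steps, not Hadoop code: the function names and the sample chunks are hypothetical, and in a real cluster each chunk would be a block stored on a different node.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values of each key into one result."""
    return {key: sum(values) for key, values in groups.items()}


# Each string stands in for a block that a separate worker would process.
chunks = ["big data on cloud", "cloud for big data", "data gravity"]

mapped = [pair for chunk in chunks for pair in map_phase(chunk)]
word_counts = reduce_phase(shuffle(mapped))
```

Because each call to `map_phase` only looks at its own chunk, the map step can run on many machines at once; only the shuffle has to move data between them.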
Big Data and EDW to coexist?
Alternatives to Hadoop
• Many believe that Hadoop is the only option for Big Data
• Hadoop's historic focus on batch processing of data was well supported by ‘MapReduce’
• But there is a need for more flexible developer tools to support:
  – The larger market of ‘mid-size data sets’
  – Use cases that call for ‘real-time processing’
• Apache Spark: Preparing for the Next Wave of Reactive Big Data
Survey on Apache Spark
Hadoop and Spark – work together
Cloud for Big Data?
Economics of Cloud Users
Unused resources
• Pay by use instead of provisioning for peak
[Figure: resources (demand vs. capacity) over time, comparing a static data center with a data center in the cloud]
EDBT 2011 Tutorial
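The "pay by use instead of provisioning for peak" point can be made concrete with a little arithmetic. The hourly demand figures below are invented for illustration, and the sketch assumes the same price per server-hour in both models, which is a simplification.

```python
# Hypothetical demand, in servers needed during each hour of a period.
demand = [20, 35, 80, 100, 60, 30, 25, 40]

# Static data center: capacity is fixed at the peak for every hour.
peak_capacity = max(demand)
static_server_hours = peak_capacity * len(demand)

# Cloud: capacity follows demand, so only the used server-hours are paid for.
elastic_server_hours = sum(demand)

# Capacity that was provisioned in the static model but never used.
unused_server_hours = static_server_hours - elastic_server_hours
utilization = elastic_server_hours / static_server_hours
```

With these sample numbers the static data center pays for 800 server-hours but uses only 390 of them, so less than half of the provisioned capacity does useful work.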
Cloud Computing Modalities
• Hosted applications and services
• Pay-as-you-go model
• Scalability, fault-tolerance, elasticity, and self-manageability

• Very large data repositories
• Complex analysis
• Distributed and parallel data processing
“Can we outsource our IT software and hardware infrastructure?”
“We have terabytes of click-stream data – what can we do with it?”
Big Data - Cloud Option and Challenges
• Key to big data success:
  – Elastic infrastructure
  – Data gravity
• Cloud is emerging as an increasingly popular option for new analytics applications and for processing big data
• Challenge: movement of hundreds of terabytes or petabytes of data across the network
  – Traditional data is largely located in the Enterprise Data Warehouse
  – Limited speed in the WAN
• New data sets (weather data, census data, machine and sensor data) originate from outside the enterprise
  – Cloud becomes the ideal place to capture and process this data
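A rough calculation shows why moving this much data over a WAN is a challenge. The link speeds and the 80% link efficiency below are assumptions chosen for illustration, not figures from the presentation.

```python
SECONDS_PER_DAY = 86_400
TB = 10**12  # decimal terabyte, in bytes
PB = 10**15  # decimal petabyte, in bytes


def transfer_days(data_bytes, link_bits_per_second, efficiency=0.8):
    """Days needed to move data_bytes over a link.

    efficiency is the assumed fraction of the nominal link speed
    that is actually available for the transfer.
    """
    seconds = data_bytes * 8 / (link_bits_per_second * efficiency)
    return seconds / SECONDS_PER_DAY


# 100 TB over a 1 Gbit/s link, and 1 PB over a 10 Gbit/s link.
days_100tb_1gbps = transfer_days(100 * TB, 1e9)
days_1pb_10gbps = transfer_days(1 * PB, 10e9)
```

Under these assumptions both transfers take roughly eleven and a half days, which is why data gravity matters: it is often easier to run the analysis where the data already sits.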
Cloud Service Providers to offer “Hadoop/Spark as a service” bundled with “High Speed Connectivity”
SLT “akaza” cloud services
• IaaS: Infrastructure as a Service
• SaaS: Software as a Service
• DaaS: Desktop as a Service
• CaaS: Communication as a Service
• PaaS: Platform as a Service
Big Data Use Cases
01 Optimize Funnel Conversion
02 Behavioral Analytics
03 Customer Segmentation
04 Predictive Support
05 Market Analysis and Pricing Optimization
06 Predict Security Threats
Optimize Funnel Conversion
Big data analytics allows companies to track leads through the entire sales conversion process, from a click on an AdWords ad to the final transaction, in order to uncover insights on how the conversion process can be improved.
Company: T-Mobile
Industry: Communication
Employees: 38,000
Type: Optimize Funnel Conversion
Purpose: T-Mobile uses multiple indicators, such as billing and sentiment analysis, to identify customers who can be upgraded to higher-quality products, as well as those with a high customer lifetime value, so that its team can focus on retaining those customers.
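Tracking leads through the conversion process comes down to comparing how many leads reach each stage. The sketch below uses invented stage names and counts; a real funnel would be computed from click and transaction logs.

```python
# Hypothetical funnel: (stage name, number of leads that reached it).
funnel = [
    ("ad_click", 10_000),
    ("landing_page", 6_000),
    ("sign_up", 1_500),
    ("purchase", 300),
]


def step_conversion(stages):
    """Return (step label, conversion rate) for each pair of adjacent stages."""
    rates = []
    for (prev_name, prev_count), (name, count) in zip(stages, stages[1:]):
        rates.append((f"{prev_name} -> {name}", count / prev_count))
    return rates


rates = step_conversion(funnel)

# The step with the lowest rate is the first candidate for improvement.
weakest_step = min(rates, key=lambda step: step[1])

# End-to-end conversion from first click to final transaction.
overall_conversion = funnel[-1][1] / funnel[0][1]
```

Here the weakest step is the move from sign-up to purchase, where only one lead in five converts.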
Behavioral Analytics
With access to data on consumer behavior, companies can learn what prompts a customer to stick around longer, as well as learn more about their customers' characteristics and purchasing habits, in order to improve marketing efforts and boost profits.
Company: McDonald's
Industry: Food and Beverage
Employees: 750,000
Type: Behavioral Analytics
Purpose: McDonald's tracks vast amounts of data in order to improve operations and boost the customer experience. The company looks at factors such as the design of the drive-thru, information provided on the menu, wait times, size of orders and ordering patterns in order to optimize each restaurant to its particular market.
Customer Segmentation
By accessing data about the consumer from multiple sources, such as social media data and transaction history, companies can better segment and target their customers and start to make personalized offers to those customers.
Company: InterContinental Hotels Group
Industry: Hotel/Travel
Employees: 7,981
Type: Customer Segmentation
Purpose: IHG collects extensive data about its customers in order to provide a personalized web experience for each customer, so as to boost conversion rates. It also uses data analytics to evaluate and adjust its marketing mix.
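A very simple form of segmentation assigns each customer to a group using a few rules over transaction history. The customer records, thresholds and segment names below are hypothetical; production systems typically combine many more data sources and often use clustering instead of fixed rules.

```python
# Hypothetical per-customer summary built from transaction history.
customers = {
    "c1": {"spend": 1200.0, "visits": 14},
    "c2": {"spend": 80.0, "visits": 1},
    "c3": {"spend": 450.0, "visits": 6},
    "c4": {"spend": 30.0, "visits": 9},
}


def segment(profile, spend_cut=400.0, visit_cut=5):
    """Assign a segment label from total spend and visit count.

    The two cut-off values are illustrative assumptions.
    """
    high_spend = profile["spend"] >= spend_cut
    frequent = profile["visits"] >= visit_cut
    if high_spend and frequent:
        return "loyal high-value"
    if high_spend:
        return "occasional big spender"
    if frequent:
        return "frequent low-spend"
    return "at risk"


segments = {cid: segment(profile) for cid, profile in customers.items()}
```

Each segment can then receive its own offer, for example a retention offer for "at risk" customers.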
Predictive Support
Through sensors and other machine-generated data, companies can identify when a malfunction is likely to occur. The company can then proactively order parts and make repairs in order to avoid downtime and lost profits.
Company: Southwest Airlines
Industry: Travel
Employees: 45,000
Type: Predictive Support
Purpose: Southwest analyses sensor data from its planes in order to identify patterns that indicate a potential malfunction or safety issue. This allows the airline to address potential problems and make necessary repairs without interrupting flights or putting passengers in danger.
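One basic way to spot a pattern that may indicate a malfunction is to compare new sensor readings with a baseline of known-good readings. The sketch below is a simple threshold check under assumed values (the vibration readings, baseline size and three-sigma threshold are all hypothetical) and is not a description of any airline's actual method.

```python
from statistics import mean, stdev


def flag_anomalies(readings, baseline_size=5, threshold=3.0):
    """Return indexes of readings that deviate strongly from the baseline.

    The first baseline_size readings are assumed to be healthy; later
    readings are flagged when they lie more than threshold standard
    deviations away from the baseline mean.
    """
    baseline = readings[:baseline_size]
    mu, sigma = mean(baseline), stdev(baseline)
    return [
        index
        for index, value in enumerate(readings[baseline_size:], start=baseline_size)
        if abs(value - mu) > threshold * sigma
    ]


# Hypothetical vibration readings from one sensor, oldest first.
vibration = [1.00, 1.02, 0.98, 1.01, 0.99, 1.00, 1.03, 1.45, 1.60]
alerts = flag_anomalies(vibration)
```

The last two readings are flagged, which would be the cue to schedule an inspection before the part fails.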
“Information is the oil of the 21st century, and analytics is the combustion engine.”
Peter Sondergaard, Gartner Research