Big Data Sonovate QuickView Series #3

Quick view Big Data, brought by Oomph!, courtesy of our partner Sonovate


For more information get in touch on 020 7112 4949 or visit sonovate.com

Big data is changing the world dramatically right before our eyes – from the amount of data being produced to the way in which it’s structured and used.

This QuickView provides an overview of what big data is, how big it is, how it’s mined, refined and used, what the key roles are, and the popular big data techniques used by suppliers.

Contents

• Explosive Growth
• What is Big Data?
• Management of Big Data
• How Big is Big Data?
• Big Data in Numbers
• 5 Predictions For The $125 Billion Big Data Analytics Market in 2015
• 5 Key Big Data Positions Used in a Big Data Flow
• Top 10 Related IT Skills
• Top 10 Industries Hiring Big Data Professionals
• Top 10 Qualifications Sought by Hirers
• Top 10 Database and BI Skills Sought by Hirers
• Key Big Data Terms Demystified
• Popular Big Data Techniques and Vendors

“The most valuable commodity I know of is information.” - Gordon Gekko


Explosive Growth

The explosion of the data industry, which has been likened to oil in the 18th Century (an immensely valuable, untapped asset), is fuelling extraordinary demand for professionals with “big data” skills. Estimates suggest that between 2012 and 2017, use of big data could contribute £216 billion to the UK economy via business creation, efficiency and innovation, and generate 58,000 new jobs. Big data is big business.

According to a report by leading business analytics software and services company SAS, big data jobs have grown at an annual rate of 212% over the past five years. This presents both challenges and opportunities for businesses. A study by the Royal Academy of Engineering shows that British industry will need 1.25 million new graduates in science, technology, engineering and maths subjects between now and 2020 to maintain current employment numbers in an ever-evolving market.

With no guarantee that universities will produce the number of graduates needed to meet the demand, or that companies will invest in training and development for existing staff, companies who are seeking to implement a big data strategy will need to pursue a defined hiring strategy.

By working with hiring specialists to tap into the existing talent pool and extract the hard to find candidates to meet their objectives, businesses will benefit significantly.

According to Accenture, one of the world’s biggest parcel companies, which is also among the world’s largest big data users, spends $1bn annually to store and study 16 petabytes of data from every conceivable point of its business. The enormity of this statistic underlines how valuable big data is (in the right hands), how important it is for businesses to acquire, interpret and use the right data, and just how exciting the market potential is.

• 1.2 million new grads needed
• 212% job growth
• 58,000 new jobs

Source: SAS


What is Big Data?

Buzzword? Catchphrase? Technology? A lot has been said about big data in the last decade, and hundreds of definitions have been offered, such as:

“Big data is the derivation of value from traditional relational database-driven business decision making, augmented with new sources of unstructured data” - Oracle

“Datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyze” - McKinsey

“Big data is the data characterized by 3 attributes: volume, variety and velocity” - IBM

David Wellman’s succinct offering captures the essence of what big data is really about: “Big Data is not about the size of the data, it’s about the value within the data.” Relating this back to Accenture’s research on the parcel company that spends $1bn a year on its data, the value of big data lies in the specific purpose it is used for and, ultimately, its impact on the bottom line. To paraphrase William Bruce Cameron, “not everything that can be counted counts”.

Technological Factors Driving the Growth of Big Data

New sources of data are being created through:

• Digitisation of existing processes and services, for example online banking, email and medical records

• Automatic generation of data, such as web server logs that record web page requests

• Reduction in the cost and size of sensors found in aeroplanes, buildings and the environment

• Production of new gadgets that collect and transmit data, for example GPS location information from mobile phones and capacity updates from ‘smart’ waste bins

Enhanced Computing Capabilities Driving Big Data Include:

• Improved data storage at higher densities, for lower cost

• Greater computing power for faster and more complex calculations

• Cloud computing (remote access to shared computing resources via a device connected to a network), facilitating cheaper access to data storage, computation, software and other services

• Recent advances in statistical and computational techniques, which can be used to analyse and extract meaning from big data

• Development of new tools such as Apache Hadoop (which enables large data sets to be processed across clusters of computers) and extension of existing software, such as Microsoft Excel.


Mining

Big data can be acquired from a vast, and increasing, number of sources. These include images, sound recordings, user click streams that measure internet activity, and data generated by computer simulations (such as those used in weather forecasting). Key to managing data collection is metadata, which is data about data. An email, for example, automatically generates metadata containing the addresses of the sender and recipient, and the date and time it was sent, to aid the manipulation and storage of email archives. Producing metadata for big data sets can be challenging, and the metadata may not capture all the nuances of the data.
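The email example can be made concrete with a minimal sketch, assuming Python and its standard `email` package. The message, the addresses and the `extract_metadata` helper are made up for illustration; the point is only that sender, recipient and send time can be read from the headers without touching the body.

```python
from email import message_from_string
from email.utils import parseaddr, parsedate_to_datetime

# A made-up message; only the headers matter for this illustration.
RAW_EMAIL = """\
From: Alice Example <alice@example.com>
To: Bob Example <bob@example.com>
Date: Tue, 03 Mar 2015 09:30:00 +0000
Subject: Quarterly figures

Hi Bob, the figures are attached.
"""


def extract_metadata(raw):
    """Return the sender, recipient and send time recorded in the headers."""
    msg = message_from_string(raw)
    return {
        "sender": parseaddr(msg["From"])[1],      # address part only
        "recipient": parseaddr(msg["To"])[1],
        "sent_at": parsedate_to_datetime(msg["Date"]),
    }


metadata = extract_metadata(RAW_EMAIL)
print(metadata["sender"], metadata["recipient"], metadata["sent_at"].isoformat())
```

An archive indexed on these three fields can be sorted and filtered by correspondent or date without opening any message body.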

Refining

Data may undergo numerous processes to improve quality and usability before analysis, including:

• Extraction – pulling out required information from the initial data and expressing it in a structured form

• Cleansing – detecting and then correcting or removing corrupt or inaccurate records

• Standardisation – formatting data to aid interoperability

• Linkage – connecting records from different sources
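The four refining steps can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the raw order lines, the customer lookup and the function names are hypothetical, and a real pipeline would usually correct bad records where possible rather than only drop them.

```python
import re

# Hypothetical raw records and a second, separate source of customer names.
RAW_ORDERS = [
    "order=1001; customer= ALICE@Example.com ; amount=25.50",
    "order=1002; customer=bob@example.com; amount=not-a-number",
    "order=1003; customer=Bob@Example.com; amount=12.00",
]
CUSTOMERS = {"alice@example.com": "Alice", "bob@example.com": "Bob"}

PATTERN = re.compile(r"order=(\d+);\s*customer=\s*(\S+)\s*;\s*amount=(\S+)")


def extract(line):
    """Extraction: pull the required fields into a structured record."""
    match = PATTERN.search(line)
    if match is None:
        return None
    order_id, email, amount = match.groups()
    return {"order_id": int(order_id), "email": email, "amount": amount}


def cleanse(record):
    """Cleansing: remove records whose amount is not a valid number."""
    try:
        record["amount"] = float(record["amount"])
    except ValueError:
        return None
    return record


def standardise(record):
    """Standardisation: store email addresses in one consistent format."""
    record["email"] = record["email"].strip().lower()
    return record


def link(record, customers):
    """Linkage: attach the customer name held in the second source."""
    record["customer_name"] = customers.get(record["email"])
    return record


def refine(lines, customers):
    """Run every line through the four steps, skipping rejected records."""
    refined = []
    for line in lines:
        record = extract(line)
        record = cleanse(record) if record else None
        if record is None:
            continue
        refined.append(link(standardise(record), customers))
    return refined


refined_orders = refine(RAW_ORDERS, CUSTOMERS)
for row in refined_orders:
    print(row)
```

Order 1002 is rejected at the cleansing step because its amount is not numeric; the other two orders come out with lower-cased email addresses and a linked customer name.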


Management of Big Data

Use

Analytics are used to gain insight from data. They typically involve applying an algorithm (a sequence of calculations) to data to find patterns, which can then be used to make predictions or forecasts. Big data analytics encompass various inter-related techniques, including the following examples.

Data mining - identifies patterns by sifting through data. It can be applied to user click streams to understand how customers use web pages to inform web page design.

Machine learning - describes systems that learn from data. For example, a system that compares documents in two different languages can infer translation rules; human correction of any errors in the rules can result in the system learning how to improve the software.

Simulation - can be used to model the behaviour of complex systems. For example, building a trading simulation can help to assess the effectiveness of measures to reduce insider trading.
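As a small illustration of data mining applied to click streams, the sketch below counts how often visitors move directly from one page to another. The click streams and page names are invented, and counting transitions is only one simple pattern-finding approach among many.

```python
from collections import Counter

# Hypothetical click streams: the pages each visitor viewed, in order.
CLICK_STREAMS = [
    ["home", "pricing", "signup"],
    ["home", "blog", "pricing", "signup"],
    ["home", "pricing", "contact"],
    ["blog", "pricing", "signup"],
]


def transition_counts(streams):
    """Count how often visitors move directly from one page to another."""
    counts = Counter()
    for pages in streams:
        # Pair each page with the page viewed immediately after it.
        counts.update(zip(pages, pages[1:]))
    return counts


counts = transition_counts(CLICK_STREAMS)
for (source, target), n in counts.most_common(3):
    print(f"{source} -> {target}: {n}")
```

In this made-up data the most frequent move is from the pricing page to sign-up, which is the kind of pattern a web designer might use to decide where to place a sign-up button.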


How Big is Big Data?

Research group IDC predicts the digital universe will reach 40 zettabytes in size – that’s 40 trillion gigabytes – by 2020. That’s 50-fold growth in just one decade. There are now almost as many bits of data as there are known stars in the universe. 2013: 4.4 zettabytes, 2020: 44 zettabytes.

Source: Oracle 2012

What is a Zettabyte?

1 zettabyte is equal to:

1,000 exabytes
1,000,000 petabytes
1,000,000,000 terabytes
1,000,000,000,000 gigabytes
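The relationship between these units can be checked with a short sketch, assuming decimal (SI) units in which each step up is a factor of 1,000. The `UNITS` table and `convert` helper are illustrative names, not part of any library.

```python
# Decimal (SI) storage units, expressed in bytes.
UNITS = {
    "gigabyte": 10**9,
    "terabyte": 10**12,
    "petabyte": 10**15,
    "exabyte": 10**18,
    "zettabyte": 10**21,
}


def convert(value, from_unit, to_unit):
    """Convert a quantity between two of the units above."""
    return value * UNITS[from_unit] / UNITS[to_unit]


gigabytes_in_40_zb = convert(40, "zettabyte", "gigabyte")
print(f"{gigabytes_in_40_zb:,.0f} GB")
```

On this basis, 40 zettabytes works out to 40 trillion gigabytes.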

[Chart: data in zettabytes (ZB) by year, 2008 to 2020. Data is growing at a 40% compound annual rate, reaching nearly 45 ZB by 2020.]

1 terabyte holds the equivalent of roughly 210 single-sided DVDs.

In 2007, the estimated information content of all human knowledge was 295 exabytes.
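The compound growth claim can be sanity-checked with a little arithmetic. This is a rough sketch that assumes the 2013 figure of 4.4 ZB quoted earlier as the starting point; the function names are illustrative.

```python
def project(start_value, annual_rate, years):
    """Apply compound annual growth: start * (1 + rate) ** years."""
    return start_value * (1 + annual_rate) ** years


def implied_rate(start_value, end_value, years):
    """Compound annual growth rate implied by a start and an end value."""
    return (end_value / start_value) ** (1 / years) - 1


# 4.4 ZB in 2013, growing at 40% a year until 2020.
projected_2020 = project(4.4, 0.40, 2020 - 2013)

# Annual rate needed to get from 4.4 ZB in 2013 to 44 ZB in 2020.
rate_2013_2020 = implied_rate(4.4, 44, 2020 - 2013)

print(f"{projected_2020:.1f} ZB, {rate_2013_2020:.1%}")
```

Seven years of 40% growth from 4.4 ZB gives roughly 46 ZB, and a tenfold rise over seven years implies a rate of about 39% a year, so the quoted figures are broadly consistent with one another.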


Big Data in Numbers

• 64,000 UK organisations with 100 or more staff will have implemented big data analytics by 2020

• By 2020 over 1/3 of all data will live in or pass through the cloud

• 346,000 big data job opportunities will be created in the UK economy by 2020

• The digital universe will grow from 3.2 zettabytes today to 40 zettabytes in only 6 years

• Individuals create 70% of all data; enterprises store 80% of all data

• Data production will be 44 times greater in 2020 than it was in 2009

• The UK is forecasting a 222% increase in big data jobs by 2017

• 41% rise in “big data” jobs throughout the UK between 2011 and 2013

Source: IDC/SAS


5 Predictions For The $125 Billion Big Data Analytics Market in 2015

The big data and analytics market will reach $125 billion worldwide in 2015, according to IDC and The International Institute of Analytics (IIA).

Here are their top five predictions for 2015:

1. Over the next five years, spending on cloud-based big data and analytics (BDA) solutions will grow three times faster than spending on on-premise solutions.

2. The shortage of skilled staff will persist. In the US alone there will be 181,000 deep analytics roles in 2018 and five times that many positions requiring related skills in data management and interpretation. In the UK, there will be 47,000 big data roles by 2017, a 222% increase on 2013.

3. Growth in applications incorporating advanced and predictive analytics, including machine learning, will accelerate in 2015. These apps will grow 65% faster than apps without predictive functionality.

4. 70% of large organisations already purchase external data and 100% will do so by 2019. In parallel, more organisations will begin to monetise their data by selling it or providing value-added content.

5. Rich media (video, audio, image) analytics will at least triple in 2015 and emerge as the key driver for BDA technology investment.

Hiring Big Data Specialists: The Key Roles

• Data Analyst
• Big Data Developer
• Data Modeler
• Big Data Architect
• Business Data Analyst
• Data Scientist
• SAS Data Analyst
• SAP Data Analyst
• SQL Data Analyst
• Data Warehousing (DWH) Developer
• Data Centre Architect
• Master Data Analyst
• Data Governance Manager
• Big Data Consultant
• Data Warehousing (DWH) Analyst
• Data Integration Developer
• Data Migration Analyst
• Master Data Consultant
• Data Migration Manager
• Data Business Analyst
• Oracle Data Warehousing (DWH) Developer
• Market Data Engineer
• Data Migration Project Manager
• Data Centre Project Manager
• Data Centre Consultant
• Data Protection Manager
• SAS Data Integration (DI) Studio Developer


Big Data top 10s

For the six months to 3 March 2015, IT jobs within the UK citing big data also mentioned the following IT skills in order of popularity. The figures indicate the number of jobs and their proportion against the total number of IT job ads sampled that cited big data.

Top 10 Industries

1. Finance
2. Marketing
3. Banking
4. Retail
5. Telecoms
6. Advertising
7. Games
8. Pharmaceutical
9. Investment Banking
10. Legal

Top 10 Qualifications

1. Degree
2. PhD
3. Security Cleared
4. VCP4
5. SQL
6. MBA
7. Microsoft Certification
8. DV Cleared
9. ISEB
10. PMI Certification

Database & Business Intelligence

1. Hadoop
2. NoSQL
3. SQL Server
4. Data Warehouse
5. MongoDB
6. Apache Hive
7. MySQL
8. Data Mining
9. Apache Cassandra
10. SQL Server Integration Services


Top 10 Related IT Skills

1. Java
2. Hadoop
3. Agile Software Development
4. Analytics
5. SQL
6. Business Intelligence
7. Finance
8. NoSQL
9. Python
10. SQL Server


Key Big Data Terms Demystified

Hadoop

Hadoop is a complex software ecosystem central to a broad range of state-of-the-art big data technologies. Companies that work with data at super-massive scale inevitably need expert engineers who can work nimbly within the Hadoop framework.

NoSQL

NoSQL (commonly referred to as "Not Only SQL") represents a completely different framework of databases that allows for high-performance, agile processing of information at massive scale. In other words, it is a database infrastructure that has been very well adapted to the heavy demands of big data.

MongoDB

MongoDB is a leading NoSQL database that is very popular among companies with big data initiatives. Demand for talented engineers with MongoDB familiarity is very high.

Cassandra

Cassandra is a popular NoSQL technology stack that was originally developed at Facebook, and is now deployed at a large number of companies with big data initiatives.

Business Intelligence (BI)

BI is a critical capability in any data-driven organisation, responsible for making data visible and actionable for smarter decision-making. BI teams accomplish this by developing tools that make data easy to digest – i.e. data reporting, visualisation, and query platforms such as dashboards and OLAP tools. BI developers/analysts require sharp technical skills and comfort working with large database systems.

Database Administrators (DBA)

DBAs are vital engineers at any company with data infrastructure. The role of DBA has actually become more complex over the years. In the past, data may have been adequately managed on a single server. But the big data infrastructure of today is often comprised of a medley of intricate, interconnected data platforms, potentially involving large clusters of massively parallel processing servers. Demand for this type of DBA talent is very high.

Key Big Data Positions Used in a Big Data Flow

Data Hygienists make sure that data coming into the system is clean and accurate, and stays that way over the entire data lifecycle.

Data Explorers sift through mountains of data to discover the data you actually need.

Business Solution Architects put the discovered data together and organise it so that it's ready to analyse.

Data Scientists take this organised data and create sophisticated analytics models that, for example, help predict customer behavior and allow advanced customer segmentation and pricing optimization.

Campaign Experts turn the models into results. They have a thorough knowledge of the technical systems that deliver specific marketing campaigns, such as which customer should get what message when.


Popular Big Data Techniques and Vendors

Business Intelligence (BI)/Online Analytical Processing (OLAP):

Users interactively analyse multidimensional data. Users can roll up, drill down, and slice data. BI tools provide dashboard and report capabilities.

Cluster Analysis:

Segment objects (e.g., users) into groups based on similar properties or attributes
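One common way to segment users is k-means clustering. The sketch below is a bare-bones version written for illustration only: the users, their two attributes and the starting centroids are all made up, and production work would normally scale the attributes and use an established library.

```python
from math import dist

# Hypothetical users described by (visits per month, average basket in GBP).
USERS = {
    "u1": (2, 15.0), "u2": (3, 18.0), "u3": (2, 12.0),
    "u4": (20, 80.0), "u5": (22, 95.0), "u6": (19, 85.0),
}


def mean(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def assign(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then re-average."""
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for point in points:
            groups[assign(point, centroids)].append(point)
        # Keep the old centroid if a group ends up empty.
        centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
    return centroids


centroids = k_means(list(USERS.values()), centroids=[(0, 0.0), (30, 100.0)])
segments = {user: assign(features, centroids) for user, features in USERS.items()}
print(segments)
```

With this data the occasional, low-spend users fall into one segment and the frequent, high-spend users into the other.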

Data Mining:

Process to discover and extract new patterns in large data sets

Predictive Modeling:

A model is created to best predict the probability of an outcome

SQL:

A computer language that manages (e.g., query, insert, delete, extract) data from a relational database
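The operations listed above can be tried out with Python's built-in `sqlite3` module, as in this minimal sketch. The `orders` table and its rows are invented for the example, and the database lives only in memory.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
)

# Insert: add rows to the table.
conn.executemany(
    "INSERT INTO orders (customer, amount) VALUES (?, ?)",
    [("alice", 25.5), ("bob", 12.0), ("alice", 40.0)],
)

# Delete: remove rows that match a condition.
conn.execute("DELETE FROM orders WHERE customer = ?", ("bob",))

# Query: count and total the remaining orders per customer.
totals = conn.execute(
    "SELECT customer, COUNT(*), SUM(amount) FROM orders GROUP BY customer"
).fetchall()
print(totals)
conn.close()
```

The `?` placeholders pass values separately from the SQL text, which is the usual way to avoid SQL injection when values come from users.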

Crowdsourcing:

A process for collecting data from a large community or distributed group of people. Idea submission is a common crowdsourcing activity.

Textual Analysis:

Computer algorithms that analyse natural language. Topics can be extracted from text along with their linkages.

Sentiment Analysis:

A form of textual analysis that determines a positive, negative, or neutral reaction. Often used in marketing brand campaigns.
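A very simple form of sentiment analysis counts positive and negative words from a fixed word list. The sketch below takes that lexicon-based approach with tiny hand-made lists, purely for illustration; commercial tools typically rely on much larger lexicons or trained models.

```python
import re

# Tiny hand-made word lists, purely illustrative.
POSITIVE = {"love", "great", "excellent", "happy", "fast"}
NEGATIVE = {"hate", "poor", "terrible", "slow", "broken"}


def sentiment(text):
    """Label text positive, negative or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


REVIEWS = [
    "I love the new app, great design",
    "Delivery was slow and the box was broken",
    "It arrived on Tuesday",
]
for review in REVIEWS:
    print(sentiment(review), "-", review)
```

Word counting of this kind misses negation and sarcasm ("not great" would score as positive), which is one reason real systems go further.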

Network Analysis:

A methodology to analyse the relationship among nodes (e.g., people). On social media platforms, it can be used to create the social graph of follower and friends’ connections among users.
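A social graph of this kind can be represented with plain dictionaries, as in the following sketch. The "follows" relationships are invented, and follower count is used here as one simple measure of prominence among several that network analysis offers.

```python
from collections import defaultdict

# Hypothetical "follows" relationships: (follower, followed).
FOLLOWS = [
    ("ann", "bob"), ("bob", "ann"), ("cat", "ann"),
    ("dan", "ann"), ("dan", "bob"), ("cat", "dan"),
]


def build_graph(edges):
    """Return two adjacency maps: who each user follows, and who follows them."""
    following, followers = defaultdict(set), defaultdict(set)
    for source, target in edges:
        following[source].add(target)
        followers[target].add(source)
    return following, followers


following, followers = build_graph(FOLLOWS)

# Number of followers (in-degree) is a simple measure of a node's prominence.
follower_counts = {user: len(fans) for user, fans in followers.items()}
most_followed = max(follower_counts, key=follower_counts.get)

# Mutual connections ("friends"): pairs of users who follow each other.
friends = {tuple(sorted((a, b))) for a, b in FOLLOWS if a in following[b]}

print(most_followed, follower_counts, friends)
```

In this made-up network, ann has the most followers and ann and bob are the only pair who follow each other.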


[Table: Technique | Vendor | Transactional Data | Non-transactional | Social Data]


Reference Shelf:

• Cebr: "Data Equity: unlocking the value of big data", 2012
• Hadoop Summit 2014
• IDC: "Big Data and Analytics and Enterprise Applications Will Continue to Drive Software Market Growth Until 2018"
• Computer Weekly: "IT Department for Big Data Projects"
• McKinsey: "Big Data is the data the next frontier for innovation"
• McKinsey: "Big data: The next frontier for innovation, competition, and productivity"
• David Wellman: "What is Big Data"
• IDC: "Worldwide Big Data and Analytics Predictions for 2015"
• Accenture: Big Success with Big Data Survey
• Onrec: "How is the online recruitment IT sector faring"
• ITJobs Watch: "Big Data skills in IT jobs"
• POSTnote: "Big Data: An Overview"
• The International Institute of Analytics: "Analytics predictions for 2015"
• Data Jobs: "Key Big Data Terms Demystified"
• HBR: "Five Roles You Need on Your Big Data Team"
• CION Insight: "Digital Universe Expands at an Alarming Rate"
• The Telegraph: "Big data skills will lead to big IT jobs"
• MIT: "The Big Data Conundrum: How to Define It?"
