ANALYTICS IT’S ALL ABOUT DATA & ANALYTICS

Different Profiles of Analytics

DATA SCIENCE

•Job titles include data scientist, chief scientist, senior analyst, director of analytics, etc.

•Industries include digital analytics, search technology, marketing, fraud detection, astronomy, energy, healthcare, social networks, finance, forensics, security (NSA), mobile, telecommunications, and weather forecasts.

•Projects include taxonomy creation (text mining, big data), clustering applied to big data sets, recommendation engines, simulations, rule systems for statistical scoring engines, root cause analysis, automated bidding, forensics, exo-planet detection, and early detection of terrorist activity or pandemics.

•Main components are machine-to-machine communication and automation.

•Overlaps with computer science, statistics, machine learning, data mining, operations research, and business intelligence.

Resource: http://www.datasciencecentral.com/profiles/blogs/17-analytic-disciplines-compared

Different Profiles of Analytics

MACHINE LEARNING - Very popular computer science discipline

•Part of data science and closely related to data mining. Machine learning is about designing algorithms (like data mining), but the emphasis is on prototyping algorithms for production mode and designing automated systems.

•Python is now a popular language for ML development projects.

•Core algorithms include clustering and supervised classification, rule systems, and scoring techniques.

•A sub-domain close to artificial intelligence is deep learning.

Resource: http://www.datasciencecentral.com/profiles/blogs/17-analytic-disciplines-compared
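
Clustering is listed above as a core algorithm. As a rough illustration, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. The function name `kmeans`, the sample points and the starting centroids are all assumptions made for this example; none of them come from the slides.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2-D points: alternate assignment and update steps.

    `centroids` holds the caller's initial guesses, one per cluster.
    """
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        for i, cluster in enumerate(clusters):
            if cluster:  # keep the old centroid if a cluster ends up empty
                centroids[i] = (
                    sum(p[0] for p in cluster) / len(cluster),
                    sum(p[1] for p in cluster) / len(cluster),
                )
    return centroids, clusters

# Hypothetical data: two well-separated groups of points.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, clusters = kmeans(points, centroids=[(1.0, 1.0), (8.0, 8.0)])
```

Passing the starting centroids in explicitly keeps the sketch deterministic; production libraries usually choose them with a randomized scheme instead.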

Different Profiles of Analytics

DATA MINING

•Designing algorithms to extract insights from rather large and potentially unstructured data (text mining), sometimes called Nugget Discovery.

•Techniques include pattern recognition, feature selection, clustering and supervised classification, along with a few statistical techniques.

•Data mining thus has some intersection with statistics, and it is a subset of data science.

•Data miners use open-source tools and software such as RapidMiner.

Resource: http://www.datasciencecentral.com/profiles/blogs/17-analytic-disciplines-compared

Different Profiles of Analytics

PREDICTIVE MODELING

•Predictive modeling projects occur in all industries across all disciplines.

•Aims to predict the future from past data, usually but not always through statistical modeling.

•Predictions often come with confidence intervals.

•Roots of predictive modeling are in statistical science.

Resource: http://www.datasciencecentral.com/profiles/blogs/17-analytic-disciplines-compared
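
To make "predictions often come with confidence intervals" concrete, the sketch below fits a straight-line trend by least squares and attaches an approximate 95% interval to the next-period forecast. The function name `linear_forecast` and the sales figures are hypothetical. The interval is a simple normal approximation based on the residual spread, which is an assumption of this sketch rather than a method taken from the slides.

```python
from statistics import NormalDist, mean

def linear_forecast(y, confidence=0.95):
    """Fit y = a + b*t by least squares and forecast the next period.

    Returns (forecast, lower, upper). The interval is a rough normal
    approximation built from the residual standard deviation; it ignores
    the uncertainty in the fitted coefficients themselves.
    """
    n = len(y)
    t = list(range(n))
    t_bar, y_bar = mean(t), mean(y)
    slope = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y)) / sum(
        (ti - t_bar) ** 2 for ti in t
    )
    intercept = y_bar - slope * t_bar
    residuals = [yi - (intercept + slope * ti) for ti, yi in zip(t, y)]
    # Residual standard deviation with n - 2 degrees of freedom.
    sigma = (sum(r * r for r in residuals) / (n - 2)) ** 0.5
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    forecast = intercept + slope * n
    return forecast, forecast - z * sigma, forecast + z * sigma

# Hypothetical monthly sales figures (illustrative only).
sales = [102, 108, 113, 121, 124, 131]
forecast, lower, upper = linear_forecast(sales)
```

A fuller treatment would widen the interval to account for estimation error in the slope and intercept, typically using a t-distribution.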

Different Profiles of Analytics

STATISTICS

•Losing ground to data science, industrial statistics, operations research, data mining and machine learning, where the same clustering, cross-validation and statistical training techniques are used, albeit in a more automated way and on bigger data.

•Many professionals who were called statisticians 10 years ago have seen their job title changed to data scientist or analyst in the last few years.

•Modern sub-domains include statistical computing, statistical learning (closer to machine learning), computational statistics (closer to data science), data-driven (model-free) inference, sport statistics, and Bayesian statistics.

Resource: http://www.datasciencecentral.com/profiles/blogs/17-analytic-disciplines-compared

Different Profiles of Analytics

INDUSTRIAL STATISTICS

•Statistics frequently performed by non-statisticians (engineers with good statistical training), working on engineering projects such as yield optimization or load balancing (system analysts). They use very applied statistics, and their framework is closer to six sigma, quality control and operations research, than to traditional statistics. Also found in oil and manufacturing industries.

•Techniques used include time series, ANOVA, experimental design, survival analysis, signal processing (filtering, noise removal, deconvolution), spatial models, simulation, Markov chains, and risk and reliability models.

Resource: http://www.datasciencecentral.com/profiles/blogs/17-analytic-disciplines-compared

Different Profiles of Analytics

ACTUARIAL SCIENCES

•A subset of statistics focusing on insurance (car, health, etc.).

•Uses survival models: predicting when you will die and what your health expenditures will be, based on your health status (smoker, gender, previous diseases), to determine your insurance premiums.

•Also predicts extreme floods and weather events to determine premiums.

Resource: http://www.datasciencecentral.com/profiles/blogs/17-analytic-disciplines-compared
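
The survival models mentioned above can take many forms. One standard non-parametric choice, used here purely as an illustration, is the Kaplan-Meier estimator. The function name `kaplan_meier` and the policyholder data are hypothetical.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time with an observed event.

    durations: time until the event or until censoring, one per subject
    observed:  True if the event occurred, False if the subject was censored
    """
    curve = []
    survival = 1.0
    event_times = sorted({t for t, e in zip(durations, observed) if e})
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for d in durations if d >= t)
        # Events that actually happened at time t (censored subjects excluded).
        events = sum(1 for d, e in zip(durations, observed) if d == t and e)
        survival *= 1 - events / at_risk
        curve.append((t, survival))
    return curve

# Hypothetical data: years each policyholder was observed, and whether the
# event of interest occurred (True) or observation simply ended (False).
durations = [2, 3, 3, 5, 6, 8]
observed = [True, True, False, True, False, True]
curve = kaplan_meier(durations, observed)
```

Censored subjects still count in the at-risk group until they drop out, which is what separates this estimate from a naive share of survivors.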

Different Profiles of Analytics

HPC

•High performance computing, not a discipline per se, but should be of concern to data scientists, big data practitioners, computer scientists and mathematicians, as it can redefine the computing paradigms in these fields.

•HPC should not be confused with Hadoop and MapReduce: HPC is hardware-related, Hadoop is software-related (though it relies heavily on Internet bandwidth and on server configuration and proximity).

Resource: http://www.datasciencecentral.com/profiles/blogs/17-analytic-disciplines-compared

Different Profiles of Analytics

OPERATIONS RESEARCH (OR)

• It is about decision science and optimizing traditional business projects: inventory management, supply chain, pricing. Practitioners make heavy use of Markov chain models, Monte Carlo simulations, queuing and graph theory, and software such as AIMS, Matlab or Informatica.

• Big, traditional old companies use OR; new and small ones (start-ups) use data science to handle pricing, inventory management or supply chain problems.

• Car traffic optimization is a modern example of an OR problem, solved with simulations, commuter surveys, sensor data and statistical modeling.

• OR has a significant overlap with six sigma, also solves econometric problems, and has many practitioners and applications in the army and defense sectors.

Resource: http://www.datasciencecentral.com/profiles/blogs/17-analytic-disciplines-compared
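
Monte Carlo simulation and queuing, both named above, combine naturally. The sketch below estimates the average time a customer waits in a single-server queue and compares it with the textbook M/M/1 result. The function name `simulate_mean_wait`, the rates and the sample size are illustrative assumptions, not taken from the slides.

```python
import random

def simulate_mean_wait(arrival_rate, service_rate, customers=100_000, seed=42):
    """Monte Carlo estimate of the mean time a customer waits in queue, for a
    single server with exponential inter-arrival and service times."""
    rng = random.Random(seed)
    wait = 0.0        # waiting time of the current customer
    total_wait = 0.0
    for _ in range(customers):
        total_wait += wait
        service = rng.expovariate(service_rate)
        gap = rng.expovariate(arrival_rate)  # time until the next arrival
        # Lindley recursion: the next customer waits for whatever work remains.
        wait = max(0.0, wait + service - gap)
    return total_wait / customers

# Hypothetical rates: 0.5 arrivals and 1.0 service completions per minute.
estimate = simulate_mean_wait(arrival_rate=0.5, service_rate=1.0)

# Queueing theory gives rho / (mu - lambda) for the M/M/1 mean wait in queue.
theory = (0.5 / 1.0) / (1.0 - 0.5)
```

The simulated value should land close to the theoretical one. The same loop extends to service-time distributions for which no closed formula exists, which is where simulation earns its keep.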

Different Profiles of Analytics

ECONOMETRICS

•Econometrics is heavily statistical in nature, using time series models such as auto-regressive processes.

•Also overlapping with operations research (itself overlapping with statistics!) and mathematical optimization (simplex algorithm).

•Econometricians like ROC and efficiency curves.

Resource: http://www.datasciencecentral.com/profiles/blogs/17-analytic-disciplines-compared
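
As a small illustration of the auto-regressive processes mentioned above, the sketch below simulates an AR(1) series, x[t] = phi * x[t-1] + noise, and recovers phi by least squares. The function names, the value phi = 0.7 and the series length are assumptions chosen for this example.

```python
import random

def simulate_ar1(phi, n, sigma=1.0, seed=1):
    """Generate n values of x[t] = phi * x[t-1] + Gaussian noise, starting at 0."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, sigma))
    return x

def estimate_phi(x):
    """Least-squares estimate of phi: regress x[t] on x[t-1] with no intercept."""
    numerator = sum(current * previous for current, previous in zip(x[1:], x[:-1]))
    denominator = sum(previous * previous for previous in x[:-1])
    return numerator / denominator

series = simulate_ar1(phi=0.7, n=5000)
phi_hat = estimate_phi(series)
```

With a few thousand observations the estimate should sit close to the true coefficient. Real econometric work would also test for stationarity and consider higher-order lags.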

Different Profiles of Analytics

DATA ENGINEERING

• Performed by software engineers (developers) or architects (designers) in large organizations (sometimes by data scientists in tiny companies)

•A sub-domain currently under attack is data warehousing, as this term is associated with static, siloed conventional databases, data architectures, and data flows, threatened by the rise of NoSQL, NewSQL and graph databases.

•Transforming these old architectures into new ones (only when needed), or making them compatible with new ones, is a lucrative business.

Resource: http://www.datasciencecentral.com/profiles/blogs/17-analytic-disciplines-compared

Different Profiles of Analytics

BUSINESS INTELLIGENCE

• Focuses on dashboard creation, metric selection, producing and scheduling data reports (statistical summaries) sent by email or delivered/presented to executives, competitive intelligence (analyzing third party data), as well as involvement in database schema design (working with data architects) to collect useful, actionable business data efficiently.

• Typical job title is Business Analyst.

• Some are more involved with marketing, product or finance (forecasting sales and revenue).

• Some have learned advanced statistics such as time series, but most only use (and need) basic stats and light analytics, relying on IT to maintain databases and harvest data.

• BI and market research (but not competitive intelligence) are currently experiencing a decline.

• Part of the decline is due to not adapting to new types of data (e.g. unstructured text) that require engineering or data science techniques to process and extract value.

Resource: http://www.datasciencecentral.com/profiles/blogs/17-analytic-disciplines-compared

Different Profiles of Analytics

DATA ANALYTICS

•This has been the new term for business statistics since at least 1995, and it covers a large spectrum of applications including fraud detection, advertising mix modeling, attribution modeling, sales forecasts, cross-selling optimization (retail), user segmentation, churn analysis, computing the lifetime value of a customer and the cost of acquisition, etc.

• Except in big companies, data analyst is a junior role; these practitioners have much narrower knowledge and experience than data scientists.

Resource: http://www.datasciencecentral.com/profiles/blogs/17-analytic-disciplines-compared

Different Profiles of Analytics

BUSINESS ANALYTICS

•Same as data analytics, but restricted to business problems only.

•Tends to have a bit more of a financial, marketing or ROI flavor.

Resource: http://www.datasciencecentral.com/profiles/blogs/17-analytic-disciplines-compared

DATA SCIENTIST vs. DATA ANALYST

Resource: http://www.edureka.co/blog/difference-between-data-scientist-and-data-analyst/

DATA SCIENTIST vs. DATA ANALYST

• “Data Analyst” focuses on the movement and interpretation of data, typically with a focus on the past and present.

• Alternatively, a “Data Scientist” may be primarily responsible for summarizing data in such a way as to provide forecasting, or insight into the future, based on the patterns identified from past and current data.

Resource: http://captechconsulting.com/blog/richard-rivera/data-scientist-vs-data-analyst

DATA SCIENTIST vs. DATA ANALYST

•Business understanding – Determine Business Objectives, Assess Situation, Determine Data Mining Goals, Produce Project Plan

•Data understanding – Collect Initial Data, Describe Data, Explore Data, Verify Data Quality

•Data preparation – Select Data, Clean Data, Construct Data, Integrate Data

•Modeling – Select modeling technique, Generate Test Design, Build Model, Assess Model

•Evaluation – Evaluate Results, Review Process, Determine Next Steps

•Deployment – Plan Deployment, Plan Monitoring and Maintenance, Produce Final Report, Review Project

Resource: http://captechconsulting.com/blog/richard-rivera/data-scientist-vs-data-analyst

DATA SCIENTIST vs. DATA ANALYST

•A Data Scientist is often heavily involved in the cleaning and manipulation of data to support their modeling needs as well as the building and evaluating of model designs which are intended to help guide changes in business decisions.

•On the other hand, a Data Analyst may spend their time exploring data to support troubleshooting efforts or to generate ideas for useful reports to pitch to the customer.

• In general, while Data Analysts tend to be more Business focused, Data Scientists are often Mathematically focused.

Resource: http://captechconsulting.com/blog/richard-rivera/data-scientist-vs-data-analyst

DATA SCIENTIST vs. DATA ANALYST

• Data Analysts typically perform data migration and visualization roles that focus on describing the past; while Data Scientists typically perform roles manipulating data and creating models to improve the future.

Resource: http://captechconsulting.com/blog/richard-rivera/data-scientist-vs-data-analyst

DATA SCIENTIST vs. DATA ANALYST

• Situation:

• A large provider of streaming entertainment and data services wants to improve call center performance and extract tactical business value from call center data

•Project Domain:

• Logged performance data from the firm’s proprietary hardware platform and call center data tied to specific customers and device IDs

• In the above case, the Data Analyst and the Data Scientist would both use data, but in different ways. The Data Analyst would be concerned with reporting metrics, such as average call time, while the Data Scientist would be concerned with using the historical data to predict the future, such as call volumes in coming months. Both roles are equally important to the operation of the call center and help find solutions for the center to run smoothly. The figure below details some solutions each role creates.

Resource: http://captechconsulting.com/blog/richard-rivera/data-scientist-vs-data-analyst
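
The call-center contrast above can be sketched in a few lines: a descriptive metric for the analyst and a simple trend forecast for the scientist. All figures and variable names here are hypothetical. A straight-line trend is only one of many possible forecasting choices and is not taken from the source.

```python
from statistics import mean

# Hypothetical call-center data (illustrative only, not from the source).
call_minutes = [4.5, 6.0, 3.5, 8.0, 5.0]        # durations of individual calls
monthly_calls = [1200, 1260, 1310, 1390, 1450]  # call volume in five past months

# Data Analyst view: report what happened.
average_call_time = mean(call_minutes)

# Data Scientist view: fit a linear trend to past volumes by least squares
# and extrapolate it one month ahead.
months = list(range(len(monthly_calls)))
m_bar, c_bar = mean(months), mean(monthly_calls)
slope = sum((m - m_bar) * (c - c_bar) for m, c in zip(months, monthly_calls)) / sum(
    (m - m_bar) ** 2 for m in months
)
intercept = c_bar - slope * m_bar
next_month_forecast = intercept + slope * len(monthly_calls)
```

Both results come from the same logs: the first summarizes the past, the second makes a claim about a month that has not happened yet.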

Current Analytics Scenario

•2013

•Significant rise in Data Analytics and Big Data initiatives across all sectors in India. Finance, telecom, ecommerce and retail sectors showed higher investments in data related tools and technologies.

• Industries like healthcare, auto and manufacturing also increased their data related spend.

•Hiring picked up considerably, but the demand-supply gap remained large given the shortage of trained analytics professionals.

Stages in Analytics

So…..

Tech Term             | Stat/OR | Source of Data   | Size of Data  | Typical Software
Data Analysis         | May be  | Manual           | Usually small | SPSS, SYSTAT etc.
Business Intelligence | No      | Business process | Usually large | Business Objects, MicroStrategy, Pentaho etc.
Data Mining           | May be  | Business process | Very large    | Clementine, e-miner etc.
Analytics             | Yes     | Business process | Large         | SAS, R etc.

• Hence, a hallmark of Analytics is the application of statistics/OR in an industry setting using business process data.

• SAS is the most widely used tool. R (open source) is growing fast.

Current Analytics Scenario

•2013 – Key Facts

•Though SAS remained a popular tool, trends show that R has been increasing its share of the market. Several analytics companies use both SAS and R, depending on project and client preferences.

• The Big Data tools Hadoop and MapReduce were industry favourites. Companies are investing in building this capability, but are not using it effectively yet.

•Hadoop skills were in demand, coupled with SAS and R.

•Web analytics, text analytics and social media analytics began to gain popularity.

Current Analytics Scenario

•2014

•Demand for analytics is currently driven by businesses and organizations that want to up-skill their employees. We predict that in 2014, MBA colleges will place more emphasis on analytics as part of their regular curriculum to cater to the demand-supply gap for analytics professionals.

•A number of specialized analytics courses (such as the ones offered by the Great Lakes Institute of Management, Chennai and the Indian School of Business, Hyderabad) already gained popularity in 2013. We predict that many more such courses will be launched in 2014.

Current Analytics Salary

•Average entry-level salaries have increased 27% since 2013, from Rs. 520 thousand to Rs. 660 thousand per annum.

•Typically, there is a 250% increase in salary from entry-level analyst to manager.

•Managers in analytics command an annual salary upward of Rs. 1500 thousand.

•At senior levels, annual salaries are upward of Rs. 2500 thousand, which is more than a 60% increase from a manager's salary.
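
The 27% and "more than 60%" figures above can be checked directly from the quoted amounts. The short sketch below does so; the helper name `percent_increase` is an assumption, and the manager and senior amounts are the "upward of" floors quoted on the slide, not averages.

```python
def percent_increase(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100

# Figures quoted on the slide, in thousands of rupees per annum.
entry_2013, entry_now = 520, 660
manager_floor, senior_floor = 1500, 2500

# Entry-level growth since 2013.
entry_growth = percent_increase(entry_2013, entry_now)
# Step from the manager floor to the senior floor.
manager_to_senior = percent_increase(manager_floor, senior_floor)
```

Rounded, the entry-level growth comes to 27%, and the step from the manager floor to the senior floor is roughly 67%, consistent with "more than 60%".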

Let’s Conclude for Today

•Salaries will continue to increase, and we will see professionals from other sectors honing their analytics skills and switching careers.

•Analytics will so permeate India Inc. that, 15 years from now, we predict professionals with no analytics and big data skills will have no scope for growth.

•This seems harsh, but that is how dynamic data analytics is and how pivotal data-backed decision making will become.

THANK YOU

In God, We Trust... All Others must bring the "Data"

 

Srikanth Ayithy
about.me/srikanthayithy