Data Science: Big Data monetization
Robert Kopal, Ph.D., CSO, IN2data; Vice Dean for R&D, Algebra
…is like trying to describe a sunset...
...it should be easy...
...but somehow capturing the words is impossible.
Describing Data Science…
Source: Booz Allen Hamilton
Data Science is the art of turning data into actions.
This is accomplished through the creation of data products, which provide actionable information - without exposing decision makers to the underlying data or analytics.
Performing Data Science requires the extraction of timely, actionable information from diverse data sources to drive data products.
Data Lake. A Data Lake is a large object-based storage repository that holds data in its native format until it is needed.
Describing Data Science…
Source: Booz Allen Hamilton
Data Science is like cooking...Data Cooking...
Dishes are business problems...
Recipes are guidelines...
Ingredients are Data Sources...Data Applications...Data Components...
...However...
There is a big difference between chefs...and their dishes...
You can use the same ingredients and make a lousy dish...or an astonishing one...
You can add some innovative ingredient...and make your dish even better...
It is a challenge for Data Scientists...
Data Scientist makes all the difference...
Data Chef...
Describing Data Science…Data Cooking...
Source: Robert Kopal, Ph.D.
Discovering vs. reproducing questions
Data Science discovers new questions instead of merely answering existing ones
Data scientist vs. team of data scientists
Teamwork produces a synergy of computer scientists, mathematicians, and business experts
Proactivity vs. reactivity
Data Science seeks the answer to "What needs to be done?" rather than "What happened?"
Why is Data Science different?
Source: Booz Allen Hamilton
Looking into the past and the future
Source: Booz Allen Hamilton
It started as Business Intelligence…
• Deductive reasoning
• Looking backward
• Slicing and dicing data
• Data warehouse (DWH)
• Guessing the future based on the past
• Creating reports
• Analytics
… and then Data Science appeared
• Inductive and deductive reasoning
• Looking forward
• Interacting with data
• Distributed data in real time
• Prediction and advice
• Data Products
• Answering existing questions and creating new ones
• Initiative and incentive
… and their role in applying Data Science
Deductive reasoning
Formulating hypotheses about relationships and models
Experimenting with data to test hypotheses and models
Inductive reasoning
Discovering or refining hypotheses (exploratory data analysis)
Discovering new connections in order to create business incentives (new relationships, insights and analytic paths from the data)
Types of reasoning…
Deductive reasoning
Known as formal logic
Reasoning from known premises or assumptions that we take to be true
Conclusions are certain, inevitable, unavoidable…
Inductive reasoning
Known as informal logic
Reasoning from uncertain premises or assumptions whose truth we are not fully convinced of
Conclusions are probable, possible, plausible, reasonable…
A competitive advantage for organizations that want to lead the market
If you have perfect information...
...or...
...you have no information at all...
...then you have no problem.
Problems begin between those two extremes
The impact of Data Science on business decision-making
Source: Booz Allen Hamilton
Whether or not information is available, decisions must be made!
What has actually changed?
Source: Booz Allen Hamilton
People who run the business
People who manage infrastructure
Business decision makers directly connected to the data
[Figure: levels of analytics arranged by complexity and by value for the user (low to high)]
Reporting (what happened?)
Analysis (why did it happen?)
Monitoring (what is happening now?)
Forecasting (what could happen?)
Prediction (what will probably happen?)
Prescription (what must we do?)
From reactive to proactive
Prescriptive analytics
Data Science components
Knowledge, abilities, and skills of a Data Scientist ???
Knowledge, abilities, and skills of a Data Scientist
Source: Booz Allen Hamilton
Industry experience
Understanding the business environment
Analytical skills
Understanding the analytical environment
Computer science
Understanding the technology environment
Knowledge, abilities, and skills of a Data Scientist
Source: Booz Allen Hamilton
Individuals who are great at all three of the Data Science foundational skills are like unicorns - very rare and if you’re ever lucky enough to find one they should be treated carefully.
A shortage or a surplus of Data Scientists ???
Did you know...
the situation in Croatia ???
Knowledge, abilities, and skills of a Data Scientist ???
Educating Data Scientists...
Knowledge, skills, and competencies for the analytical and strategic levels of digital marketing.
The knowledge, competencies, and skills that students of the graduate program "Digital Marketing" acquire have a dual character:
business applications in digital marketing,
integrating, analyzing, and interpreting marketing data in digital form.
Be prepared for failure
Data Science is based on experimentation. Innovative solutions come from testing new ideas and ways of thinking. Failure is an acceptable by-product of experimentation.
Fail often and learn quickly
There are situations in which we explore multiple ways of solving a problem in order to find the best one.
Failing is good; failing quickly is even better
General guidelines 1/2
Keep the goal in mind
It is easy to get lost in details and implementation challenges. When that happens, the goal drops out of sight and the work starts to drift away from the optimal analytical process.
Dedication and focus lead to success
It is often necessary to try different approaches before finding the right one. It is easy to reach the point of discouragement. Stay committed to the analytical goal. Sometimes seemingly small and unimportant observations lead to great successes.
More complicated does not mean better
People who know the technology tend to explore very complex and advanced approaches to solving problems. Although this is sometimes necessary, an equally valuable result can often be achieved by simpler means. Simpler means easier and faster modeling, implementation, and verification.
General guidelines 2/2
Data Science applications
Data Science applications
Data Science is essential in the market competition of the future...
17-49% Productivity increase when an organization increases its use of data by 10%
11-42% Increase in ROA when an organization improves data accessibility by 10%
241% Increase in ROI when an organization uses big data to become more competitive
1000% Increase in ROI after introducing analytics across most of the organization and aligning daily operations with management goals
5-6% Performance improvement for organizations that make data-driven decisions
Source: Booz Allen Hamilton
Evolution of Data-Driven Decision-making
Data Science vs. Big Data
As a guide to the likely skills demand, the European Commission expects the market for big data to grow by 40% each year, reaching USD 16.9 billion worldwide in 2015.
According to the study "Worldwide Big Data Technology and Services, 2012–2015 Forecast" conducted by IDC, big data technology and services are expected to grow worldwide at a compound annual growth rate of 40% – about 7x that of the ICT market overall.
Another recent study on "Big Data Analytics: An assessment of demand for labour and skills, 2012-2017", conducted by e-skills UK and SAS, predicts that in the UK alone, the number of specialist big data staff working in large firms will increase by more than 240% over the next 5 years.
Another recent study in Ireland pointed out that, under a high growth scenario, demand from businesses expanding, as well as replacing people, could result in 21,000 job vacancies for big data analysts in the run-up to 2020.
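To make the growth figures above concrete, here is a minimal sketch of the compound-growth arithmetic. It takes only two numbers from the text (40% per year and USD 16.9 billion in 2015) and assumes, for illustration, that the 40% rate applies uniformly over the three years from 2012 to 2015. The function names are hypothetical, and the implied 2012 value is a back-calculation under that assumption, not a figure reported by IDC or the European Commission.

```python
def compound_growth(start_value, annual_rate, years):
    """Value after `years` of growth at a constant `annual_rate`."""
    return start_value * (1.0 + annual_rate) ** years


def implied_start_value(end_value, annual_rate, years):
    """Starting value that would reach `end_value` after `years` of constant growth."""
    return end_value / (1.0 + annual_rate) ** years


END_2015_USD_BN = 16.9  # from the text: market size in 2015
CAGR = 0.40             # from the text: 40% growth each year
YEARS = 3               # assumption: constant rate from 2012 to 2015

# Back out the market size that a 40% CAGR implies for 2012.
start_2012 = implied_start_value(END_2015_USD_BN, CAGR, YEARS)

# Rebuild the year-by-year trajectory from that implied starting point.
trajectory = [compound_growth(start_2012, CAGR, n) for n in range(YEARS + 1)]

for year, value in zip(range(2012, 2016), trajectory):
    print(f"{year}: USD {value:.1f} billion")
```

Under these assumptions the market nearly triples in three years (1.4 cubed is about 2.74), which is what makes a 40% compound rate so different from single-digit growth.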
Gartner Hype Cycle 2013.
Gartner Hype Cycle 2014.
The market has stabilized with a reasonable number of approaches to the Big Data paradigm, and new technologies and practices are additions to existing solutions
Data Science is essential in the market competition of the future...
From digital marketing to the marketing of the digital era
Monetizing the big data concept (and other data solutions)
Analytics in the digital economy
SNA & Big Data
Social data IS NOT like most Big Data.
This picture below IS NOT that USEFUL!
What we want is INTERESTING and USEFUL, not BIG.
SNA & Big Data
When investigating social/relational data, it is usually not the forest that is useful, but the clusters of various trees, and their relationships, inside the ecosystem. We not only want to "see the forest for the trees", but also see the patterns/clusters of trees in the forest!
SNA & Big Data
Big Data often contains small clusters, especially with social data.
Human networks usually contain dozens or hundreds of nodes; we usually do not have time/energy for thousands or millions of friends/colleagues!
Facebook's own research showed that people who claim hundreds or thousands of friends regularly interact with only dozens of them.
The goal is not to analyze the universe of data, but to find the significant clusters within all of that data.
At 10,000 meters, big data is not that interesting.
At 1000 meters, we start to see patterns/clumps.
At 10 meters we can play with emergent clusters that have real meaning and we start to learn what is happening inside our social ecosystem.
In Big Data, the important numbers are not the millions, but the many subgroups of dozens, hundreds, and maybe thousands that reveal meaning, and give us insight.
What is happening in the "social clusters" inside your ecosystem?
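As a rough illustration of looking for small clusters rather than the whole forest, the sketch below groups a relational data set into connected clusters and keeps only those within a chosen size range. Everything in it is an assumption made for illustration: the edge list and names are invented, and plain connected components (found with breadth-first search) stand in for whatever clustering method a real SNA study would use, such as a community-detection algorithm.

```python
from collections import defaultdict, deque


def connected_clusters(edges):
    """Group nodes into connected clusters using breadth-first search."""
    neighbours = defaultdict(set)
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    clusters = []
    for start in neighbours:
        if start in seen:
            continue
        # Explore everything reachable from this unvisited node.
        queue = deque([start])
        seen.add(start)
        members = set()
        while queue:
            node = queue.popleft()
            members.add(node)
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        clusters.append(members)
    # Largest clusters first.
    return sorted(clusters, key=len, reverse=True)


def clusters_in_size_range(clusters, min_size, max_size):
    """Keep only clusters whose size falls inside the given range."""
    return [c for c in clusters if min_size <= len(c) <= max_size]


# Hypothetical "who interacts with whom" pairs, invented for this example.
EDGES = [
    ("ana", "ivan"), ("ivan", "marko"), ("marko", "ana"),
    ("petra", "luka"), ("luka", "maja"), ("maja", "petra"), ("maja", "nika"),
    ("tom", "eva"),
]

clusters = connected_clusters(EDGES)
interesting = clusters_in_size_range(clusters, min_size=3, max_size=100)

for cluster in interesting:
    print(len(cluster), sorted(cluster))
```

The size filter is the point of the example: the analysis keeps the groups of a handful to a few hundred members and drops both isolated pairs and anything too large to interpret.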
Social physics
Social physics is a quantitative social science that describes reliable, mathematical connections between information and idea flow (1) on the one hand and people’s behavior (2) on the other.
Social physics helps us understand how ideas flow from person to person through the mechanism of social learning and how this flow of ideas ends up shaping the norms, productivity, and creative output of our companies, cities, and societies.
It enables us to predict the productivity of small groups, of departments within companies, and even of entire cities.
It also helps us tune communication networks so that we can reliably make better decisions and become more productive.
Social physics
The key insights obtained with social physics all have to do with the flow of ideas between people.
This flow of ideas can be seen in the pattern of telephone calls or social media messaging, of course, but also by assessing how much time people spend together and whether they go to the same places and have similar experiences.
Flows of ideas are central to understanding society not only because timely information is critical to efficient systems but, more important, because the spread and combination of new ideas is what drives behavior change and innovation.
This focus on the flow of ideas is why Alex Pentland chose the name “social physics.”
Just as the goal of traditional physics is to understand how the flow of energy translates into changes in motion, social physics seeks to understand how the flow of ideas and information translates into changes in behavior.
Social physics
The engine that drives Social Physics is Big Data: the newly ubiquitous digital data now available about all aspects of human life.
Social physics functions by analyzing patterns of human experience and idea exchange within the digital bread crumbs we all leave behind us as we move through the world — call records, credit card transactions, and GPS location fixes, among others.
These data tell the story of everyday life by recording what each of us has chosen to do.
And this is very different from what is put on Facebook; postings on Facebook are what people choose to tell each other, edited according to the standards of the day.
Who we actually are is more accurately determined by where we spend our time and which things we buy, not just by what we say we do.
Social physics
The process of analyzing the patterns within these digital bread crumbs is called reality mining, and through it we can tell an enormous amount about who individuals are.
Alex Pentland: …My students and I have found that we can use it to tell if people are likely to get diabetes or whether someone is the sort of person who will pay back loans. And by analyzing these patterns across many people, we are discovering that we can begin to explain many things (crashes, revolutions, bubbles) that previously appeared to be random “acts of God.”…
What don't we know that we don't know?
Knowledge portfolio
WE KNOW / WE DON'T KNOW
AWARE
We know that we know
Accessing, sharing, and storing knowledge.
We know that we don't know
Seeking and creating knowledge: basic analytics (asking questions).
UNAWARE
We don't know that we know
Discovering hidden or implicit knowledge.
We don't know that we don't know
Discovering key risks and opportunities.