Data Science: Big Data monetization Robert Kopal, Ph.D. CSO, IN2data Vice Dean for R&D, Algebra

  • Data Science: Big Data monetization

    Robert Kopal, Ph.D. CSO, IN2data Vice Dean for R&D, Algebra

  • Describing Data Science…

    …is like trying to describe a sunset...

    ...it should be easy...

    ...but somehow capturing the words is impossible.

    Source: Booz Allen Hamilton

  • Describing Data Science…

    Data Science is the art of turning data into actions.

    This is accomplished through the creation of data products, which provide actionable information without exposing decision makers to the underlying data or analytics.

    Performing Data Science requires the extraction of timely, actionable information from diverse data sources to drive data products.

    Data Lake. A Data Lake is a large object-based storage repository that holds data in its native format until it is needed.

    Source: Booz Allen Hamilton

  • Describing Data Science…Data Cooking...

    Data Science is like cooking...Data Cooking...

    Dishes are business problems...

    Recipes are guidelines...

    Ingredients are Data Sources...Data Applications...Data Components...

    ...However...

    There is a big difference between chefs...and their dishes...

    You can use the same ingredients and make a lousy dish...or an astonishing one...

    You can add an innovative ingredient...and make your dish even better...

    It is a challenge for Data Scientists...

    The Data Scientist makes all the difference...

    Data Chef...

    Source: Robert Kopal, Ph.D.

  • Why is Data Science different?

    Discovering vs. reproducing questions

    Data Science uncovers new questions instead of only answering existing ones.

    A data scientist vs. a team of data scientists

    Teamwork produces synergy among computer scientists, mathematicians, and business experts.

    Proactive vs. reactive

    Data Science looks for the answer to "what needs to be done?" rather than "what happened?"

    Source: Booz Allen Hamilton

  • Looking to the past and to the future

    Source: Booz Allen Hamilton

    It started as Business Intelligence…

    • Deductive reasoning

    • Looking backward

    • Slicing and dicing data

    • Data warehouse (DWH)

    • Guessing the future based on the past

    • Creating reports

    • Analytics

    … and then Data Science appeared

    • Inductive and deductive reasoning

    • Looking forward

    • Interacting with data

    • Distributed data in real time

    • Predicting and advising

    • Data Products

    • Answering existing questions and creating new ones

    • Initiative and impetus

  • Types of reasoning…

    Deductive reasoning

    Known as formal logic

    Reasoning from known premises or assumptions that we take to be true

    Conclusions are certain, inevitable, unavoidable…

    Inductive reasoning

    Known as informal logic

    Reasoning from uncertain premises or assumptions whose truth we are not fully convinced of

    Conclusions are probable, possible, plausible, reasonable…

    … and their role in applying Data Science

    Deductive reasoning

    Formulating hypotheses about relationships and models

    Experimenting with data in order to test hypotheses and models

    Inductive reasoning

    Discovering or refining hypotheses (exploratory data analysis)

    Discovering new connections in order to create business impetus (new relationships, insights and analytic paths from the data)

  • The impact of Data Science on business decision-making

    A competitive advantage for organizations that want to lead the market

    If you have perfect information...

    ...or...

    ...you have no information at all...

    ...then you have no problem.

    Problems begin between those two extremes.

    Whether or not information is available, decisions must be made!

    Source: Booz Allen Hamilton

  • What has actually changed?

    Source: Booz Allen Hamilton

    People who run the business

    People who manage infrastructure

    Business decision-makers are connected directly to the data

  • From reactive to proactive

    [Chart: analytics stages plotted by complexity and value to the user, each running from low to high. Stages from lowest to highest:]

    • reporting (what happened?)

    • analysis (why did it happen?)

    • monitoring (what is happening now?)

    • forecasting (what might happen?)

    • prediction (what will probably happen?)

    • prescription (what must we do?)
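The step from reactive to proactive analytics can be illustrated with a toy series. The following is only a minimal sketch under stated assumptions: the sales figures, the function names, the capacity threshold, and the choice of a straight-line trend are all hypothetical and not taken from the slides. It covers reporting, monitoring, prediction, and prescription; analysis ("why did it happen?") is left out because it needs explanatory variables that this toy data does not have.

```python
from statistics import mean

# Hypothetical monthly sales figures (illustrative only, not from the slides).
sales = [100, 104, 110, 113, 119, 124]


def report(series):
    """Reporting: what happened? Summarise the past."""
    return {"total": sum(series), "average": mean(series)}


def monitor(series, floor):
    """Monitoring: what is happening now? Compare the latest value to a floor."""
    return series[-1] >= floor


def predict_next(series):
    """Prediction: what will probably happen? Extend a least-squares line one step."""
    n = len(series)
    x_mean = (n - 1) / 2
    y_mean = mean(series)
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(series))
    s_xx = sum((x - x_mean) ** 2 for x in range(n))
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return intercept + slope * n


def prescribe(forecast, capacity):
    """Prescription: what must we do? Turn the forecast into an action."""
    return "add capacity" if forecast > capacity else "hold"


summary = report(sales)
on_track = monitor(sales, floor=120)
forecast = predict_next(sales)
action = prescribe(forecast, capacity=125)  # capacity is an assumed threshold
```

Each function consumes the same data but answers a different question, which is the point of the ladder: the later stages add a model and a decision rule on top of the earlier summaries.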

  • Prescriptive analytics

  • Data Science components

  • Knowledge, abilities, and skills of a Data Scientist???

  • Knowledge, abilities, and skills of a Data Scientist

    Source: Booz Allen Hamilton

    Industry experience

    Understanding the business environment

    Analytical skills

    Understanding the analytical environment

    Computer science

    Understanding the technology environment

  • Knowledge, abilities, and skills of a Data Scientist

    Source: Booz Allen Hamilton

    Individuals who are great at all three of the Data Science foundational skills are like unicorns - very rare and if you’re ever lucky enough to find one they should be treated carefully.

  • A shortage or a surplus of Data Scientists???

  • Did you know...

    the situation in Croatia???

    Knowledge, abilities, and skills of a Data Scientist???

  • Educating Data Scientists...

    Knowledge, skills, and competencies for the analytical and strategic levels of digital marketing.

    The knowledge, competencies, and skills acquired by students of the graduate programme "Digitalni marketing" (Digital Marketing) are dual in nature:

    business applications in digital marketing,

    integrating, analysing, and interpreting marketing data in digital form.

  • General guidelines 1/2

    Be prepared to fail

    Data Science is based on experimentation. Innovative solutions come from testing new ideas and new ways of thinking. Failure is an acceptable by-product of experimentation.

    Fail often and learn fast

    There are situations in which we explore multiple ways of solving a problem in order to find the best one.

    Failing is good; failing quickly is even better

  • General guidelines 2/2

    Keep the goal in mind

    It is easy to get lost in the details and challenges of implementation. When that happens, the goal drops out of sight and the work starts to drift away from the optimal analytical process.

    Dedication and focus lead to success

    It often takes several different approaches before the right one is found. It is easy to reach the point of discouragement. Stay committed to the analytical goal. Sometimes seemingly small and unimportant observations lead to big successes.

    More complicated does not mean better

    People who know the technology tend to explore very complex and advanced approaches to solving problems. Although that is sometimes necessary, an equally good result can often be achieved by simpler means. Simpler means easier and faster modelling, implementation, and verification.

  • Data Science applications

  • Data Science applications

  • Data Science is essential in the market competition of the future...

    17-49% increase in productivity when an organization increases its use of data by 10%

    11-42% increase in ROA when an organization improves data availability by 10%

    241% increase in ROI when an organization uses big data to improve competitiveness

    1000% increase in ROI after introducing analytics across most of the organization and aligning day-to-day operations with management's goals

    5-6% performance improvement for organizations that make decisions based on data

    Source: Booz Allen Hamilton

  • Evolution of Data-Driven Decision-making

  • Data Science vs. Big Data

    As a guide to the likely skills demand, the European Commission expects the market for big data to grow by 40% each year, reaching USD 16.9 billion worldwide in 2015.

    According to the study "Worldwide Big Data Technology and Services, 2012–2015 Forecast" conducted by IDC, big data technology and services are expected to grow worldwide at a compound annual growth rate of 40% – about 7x that of the ICT market overall.

    Another recent study, "Big Data Analytics: An assessment of demand for labour and skills, 2012-2017", conducted by e-skills UK and SAS, predicts that in the UK alone the number of specialist big data staff working in large firms will increase by more than 240% over the next 5 years.

    Another recent study in Ireland pointed out that, under a high growth scenario, demand from businesses expanding, as well as replacing people, could result in 21,000 job vacancies for big data analysts in the run-up to 2020.
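The growth figures quoted on this slide mix annual rates and multi-year totals, so a short calculation helps put them on the same footing. This is a rough sketch only: the function names are made up for illustration, and treating 2012 as the base year for the USD 16.9 billion figure is an assumption drawn from the forecast's 2012-2015 title, not something the slide states.

```python
def compound(value, annual_rate, years):
    """Value after growing at a constant annual rate for a number of years."""
    return value * (1 + annual_rate) ** years


def implied_annual_rate(total_increase, years):
    """Constant annual rate equivalent to a total increase over several years.

    A total_increase of 2.40 means +240%, i.e. a final level of 3.4x the start.
    """
    return (1 + total_increase) ** (1 / years) - 1


# +240% over 5 years (UK specialist big data staff) as an annual rate.
uk_staff_rate = implied_annual_rate(2.40, 5)   # roughly 0.277, i.e. about 28% a year

# Assumption: 2012 is the base year, so three years of 40% growth lead to 2015.
growth_factor = compound(1.0, 0.40, 3)         # 1.4 ** 3 = 2.744
implied_2012_market = 16.9 / growth_factor     # about 6.2 (USD billion)

# "About 7x that of the ICT market overall" implies ICT growth of roughly 40% / 7.
implied_ict_rate = 0.40 / 7                    # about 0.057
```

Read this way, "more than 240% in five years" corresponds to somewhat under 30% a year, which is lower than the 40% annual rate quoted for the market itself.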

  • Gartner Hype Cycle 2013.

  • Gartner Hype Cycle 2014.

    The market has stabilized around a reasonable number of approaches to the Big Data paradigm, and new technologies and practices are additions to existing solutions.

  • Data Science is essential in the market competition of the future...

    From digital marketing to marketing for the digital era

    Monetizing the big data concept (and other data solutions)

    Analytics in the digital economy

  • SNA & Big Data

    Social data IS NOT like most Big Data.

    This picture below IS NOT that USEFUL!

    What we want is INTERESTING and USEFUL, not BIG.

  • SNA & Big Data

    When investigating social/relational data, it is usually not the forest that is useful, but the clusters of various trees, and their relationships, inside the ecosystem. We not only want to "see the forest for the trees", but also see the patterns/clusters of trees in the forest!

  • SNA & Big Data

    Big Data often contains small clusters, especially with social data.

    Human networks usually contain dozens or hundreds of nodes; we usually do not have the time or energy for thousands or millions of friends and colleagues!

    Facebook's own research showed that people who claim hundreds or thousands of friends regularly interact with only dozens of them.

    The goal is not to analyze the universe of data, but to find the significant clusters within all of that data.

    At 10,000 meters, big data is not that interesting.

    At 1000 meters, we start to see patterns/clumps.

    At 10 meters we can play with emergent clusters that have real meaning and we start to learn what is happening inside our social ecosystem.

    In Big Data, the important numbers are not the millions, but the many subgroups of dozens, hundreds, and maybe thousands that reveal meaning, and give us insight.

    What is happening in the "social clusters" inside your ecosystem?
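Finding the small "social clusters" inside a large interaction data set can be sketched in a few lines. The following is a minimal illustration, not a method from the slides: the names, the interaction counts, and the `min_count` threshold are hypothetical, and connected components over frequently interacting pairs is used here as a simple stand-in for more refined community-detection techniques.

```python
from collections import defaultdict, deque


def find_clusters(interactions, min_count=3):
    """Group people who interact regularly.

    interactions maps a pair (a, b) to how often the two interacted.
    Pairs below min_count are treated as weak ties and ignored, then
    connected components of the remaining graph are returned.
    """
    graph = defaultdict(set)
    for (a, b), count in interactions.items():
        if count >= min_count:
            graph[a].add(b)
            graph[b].add(a)

    seen = set()
    clusters = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Breadth-first search collects everyone reachable from start.
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            node = queue.popleft()
            members.append(node)
            for neighbour in sorted(graph[node]):
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        clusters.append(sorted(members))
    return clusters


# Hypothetical interaction counts between six people.
interactions = {
    ("ana", "ivan"): 12,
    ("ivan", "marko"): 9,
    ("ana", "marko"): 7,
    ("petra", "luka"): 5,
    ("luka", "maja"): 4,
    ("marko", "petra"): 1,  # weak tie, dropped by the threshold
    ("ana", "maja"): 1,     # weak tie, dropped by the threshold
}

clusters = find_clusters(interactions)
```

With the weak ties filtered out, the six people split into two groups of three; lowering `min_count` to 1 merges them into a single undifferentiated blob, which is the "forest" view the slides warn is not very useful.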

  • Social physics

    Social physics is a quantitative social science that describes reliable, mathematical connections between information and idea flow (1) on the one hand and people’s behavior (2) on the other.

    Social physics helps us understand how ideas flow from person to person through the mechanism of social learning and how this flow of ideas ends up shaping the norms, productivity, and creative output of our companies, cities, and societies.

    It enables us to predict the productivity of small groups, of departments within companies, and even of entire cities.

    It also helps us tune communication networks so that we can reliably make better decisions and become more productive.

  • Social physics

    The key insights obtained with social physics all have to do with the flow of ideas between people.

    This flow of ideas can be seen in the pattern of telephone calls or social media messaging, of course, but also by assessing how much time people spend together and whether they go to the same places and have similar experiences.

    Flows of ideas are central to understanding society not only because timely information is critical to efficient systems but, more important, because the spread and combination of new ideas is what drives behavior change and innovation.

    This focus on the flow of ideas is why Alex Pentland chose the name “social physics.”

    Just as the goal of traditional physics is to understand how the flow of energy translates into changes in motion, social physics seeks to understand how the flow of ideas and information translates into changes in behavior.

  • Social physics

    The engine that drives Social Physics is Big Data: the newly ubiquitous digital data now available about all aspects of human life.

    Social physics functions by analyzing patterns of human experience and idea exchange within the digital bread crumbs we all leave behind us as we move through the world — call records, credit card transactions, and GPS location fixes, among others.

    These data tell the story of everyday life by recording what each of us has chosen to do.

    And this is very different from what is put on Facebook; postings on Facebook are what people choose to tell each other, edited according to the standards of the day.

    Who we actually are is more accurately determined by where we spend our time and which things we buy, not just by what we say we do.

  • Social physics

    The process of analyzing the patterns within these digital bread crumbs is called reality mining, and through it we can tell an enormous amount about who individuals are.

    Alex Pentland: “…My students and I have found that we can use it to tell if people are likely to get diabetes or whether someone is the sort of person who will pay back loans. And by analyzing these patterns across many people, we are discovering that we can begin to explain many things (crashes, revolutions, bubbles) that previously appeared to be random ‘acts of God.’…”

  • What don't we know that we don't know?

    Knowledge portfolio

    WE KNOW / WE DON'T KNOW

    CONSCIOUS

    We know that we know

    Accessing, sharing, and storing knowledge.

    We know that we don't know

    Seeking and creating knowledge: basic analytics (asking questions).

    UNCONSCIOUS

    We don't know that we know

    Discovering hidden or implicit knowledge.

    We don't know that we don't know

    Discovering key risks and opportunities.

  • Data Science: Big Data monetization

    Robert Kopal, Ph.D. CSO, IN2data Vice Dean for R&D, Algebra