Statistical Challenges in Big Data and Data Mining (Retos estadísticos en Big Data y Data Mining)

Manuel Febrero–Bande

Dpt. of Statistics, Mathematical Analysis and Optimization, Univ. de Santiago de Compostela

Jornada Data Science – INAMAT, Pamplona, 25 September 2017


Introduction Big Data Analysis New challenges, almost same solutions Conclusion

Facts about Big Data

• Big Data is about how to solve statistical problems with the available (and perhaps limited) computational resources. So it changes as time evolves: a Big Data problem of the 90s is now a toy example.

• The final goal is to obtain automatic inferences from the data (without human statistical participation), as it has been for the last 30 years (under different names).

• Big Data is the new magic word. Like "Abracadabra", it opens the door to a new fantasy world where everything is possible at the hand of the new Master of Ceremonies: the data scientist.

• Big Data promises better decisions (like Statistics in the last century) and big profits (this is new) after a huge investment in technology.

M. Febrero–Bande Retos Big Data


Table of Contents

1 Introduction

2 Big Data Analysis

3 New challenges, almost same solutions
    Exploratory Data Analysis
    Client profile
    A test example
    A classification example

4 Conclusion


Introduction

Several names have been used for Statistics in the era of computers:
• Expert systems (AI, 80s–90s)
• Machine Learning (AI, 90s– )
• Data (Stream) Mining (CS, 95– )
• Knowledge Discovery in Databases (KDD, 95– )
• Pattern Recognition (ML, 2000– )
• Structure Data Mining, Graph Mining (2003– )
• Business Analytics (BIDW, 2005– )
• Big Data (2011– )

The final goal is to obtain better and faster inferences from the data.


Data Mining and Machine Learning


Dimensions of Big Data

The V characteristics of Big Data are:
• Volume. Huge (and perhaps increasing) size of the collected data.
• Velocity. Data must be processed and results produced in limited time with limited computer resources.
• Variety. Heterogeneous and complex data representations.
• Veracity. Quality of the data and of its pre-processing.
• Value. New economic value that supports better decisions.
• Variability. Changes in the structure of the data as time goes on.


Risky assertions about Big Data

• Traditional statistics will not remain as relevant as it used to be.
    • sample and selection biases
    • limits of data sets
    • assumptions about data
• Correlations should replace models. The data speak for themselves.
    • spurious correlations
    • causality
• Precision of the results is not as essential as it was previously believed to be.
    • robustness
    • precision/complexity
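The "spurious correlations" caveat can be illustrated with a small simulation. The sketch below is not from the talk: the sample size, the number of variables and all object names are assumptions chosen for illustration. It generates a response and many predictors that are independent by construction, and then looks at the largest correlation found among them.

```r
# Illustrative simulation; every name and size here is an assumption.
set.seed(1)
n <- 50      # number of observations
p <- 1000    # number of candidate predictors, all pure noise

y <- rnorm(n)                          # response, independent of every predictor
X <- matrix(rnorm(n * p), nrow = n)    # n x p matrix of independent noise

r <- as.vector(cor(X, y))              # sample correlation of each column with y
max_abs_r <- max(abs(r))               # the "best" predictor found by screening

cat("Largest |correlation| among", p, "noise variables:",
    round(max_abs_r, 2), "\n")
```

With many candidate variables and few observations, the largest sample correlation tends to look substantial even though no variable is related to the response, which is one reason why correlations alone are a fragile substitute for a model.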


On Rigor in Science (Del rigor de la ciencia)

In that Empire, the art of Cartography reached such perfection that the map of a single province occupied an entire city, and the map of the Empire, an entire province. In time, these oversized maps no longer satisfied, and the colleges of cartographers drew up a Map of the Empire that was the size of the Empire and coincided with it point for point. Less devoted to the study of Cartography, the following generations understood that this extended Map was useless and, not without impiety, abandoned it to the inclemencies of the sun and the winters. In the Deserts of the West, tattered ruins of the Map survive, inhabited by animals and beggars; in the whole country there is no other relic of the Geographic Disciplines.

Jorge Luis Borges


Aspects of Big Data

• Big Data Management
    • Hardware. How to store such huge amounts of data?
    • Computation. How to collect and filter the sources of data?
• Big Data Processing
    • Preprocess. How to structure the data to be readily available?
    • Scheduling
    • Parallelization
• Big Data Techniques and Algorithms
    • Three components: Statistics, Statistics and Statistics
• Big Data Ethics


Steps in a Big Data Analysis

DATA (from hundreds to billions)

• Segmentation (Regions, Subject, Category, ...)
• Dim. Reduction (Feature extr., PC, PLS)
• Id. of Groups (kNN, K-means, Clustering, ...)
• Descriptive analysis: Summ. & Visualize, Outliers, Missing
• Predictive analysis: Regression, Classification, ...
• Inferential analysis: Estimation, Testing, ...

Training and Validation Sets, Comparison criteria

Specific issues in Real–Time Data: Outliers, Regression, Classification, Pattern Recognition, Association Measures, Simulation scenarios


Do we need Big Data?

• Big Data fever is founded on the business gains that Big Data tools can provide, but those gains can be achieved with both Big Data and small data.

• Big Data is sold as the means of solving all questions. But first the right question must be asked, and then the solution must be provided using the appropriate information/data.

• More data does not imply better information. The information is in the data, but sometimes we are unable to extract it.

• More data does not imply more accurate data/analysis.

• Big Data is unlikely to solve a problem of interest unless it has been specifically designed to solve it (as usual).


Issues about Big Data

Usual statistical questions that are also valid in Big Data:
• Why Big? Big N, small P; small N, big P; or big N, big P?
• Are the right variables available? And are the available variables accurate enough?
• Does the data represent the population we wish to make inferences about (or predict)?
• Time frame?
• Redundancies?
• Homogeneity over time?
• Missing data?


Basic Toolbox for Big Data

• Aggregation, Grouping and Blocking. Summarizing transactions by day or by hour can be enough. The groups can be homogeneous at certain levels.

• Compression and Sparsity exploitation. Sometimes the information can be compressed; for example, a functional trajectory can be summarized by a few coefficients of a basis.

• Sufficient statistics. What information is enough to be stored?

• Fragmentation and Divisibility (divide et impera). Solving many small problems is faster than solving the big one.

• Recursive vs global estimation. Recursive estimation can dramatically reduce the amount of memory needed for a problem.
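As one possible illustration of blocking, sufficient statistics and recursive estimation working together, the sketch below keeps only three numbers (count, mean and sum of squared deviations) and updates them one block at a time, so the full vector never has to be processed in a single pass. The function `update_stats`, the simulated data and the block size are assumptions made for this example, not the author's code.

```r
# Merge the running summary (n, mean, M2) with a new block of data.
# M2 is the sum of squared deviations from the current mean.
update_stats <- function(state, block) {
  nb  <- length(block)
  mb  <- mean(block)
  M2b <- sum((block - mb)^2)
  n     <- state$n + nb
  delta <- mb - state$mean
  list(
    n    = n,
    mean = state$mean + delta * nb / n,
    M2   = state$M2 + M2b + delta^2 * state$n * nb / n
  )
}

set.seed(1)
x <- rexp(1e5, rate = 1 / 40)                     # hypothetical "amount" variable
blocks <- split(x, ceiling(seq_along(x) / 1e4))   # 10 blocks of 10,000 values

state <- list(n = 0, mean = 0, M2 = 0)            # empty summary
for (b in blocks) state <- update_stats(state, b)

block_mean <- state$mean
block_var  <- state$M2 / (state$n - 1)            # sample variance from the summary
```

The block-wise results agree with `mean(x)` and `var(x)` up to floating-point error. The same update could in principle be applied to blocks read from disk or computed on different machines.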


Why is everything so tough with Big Data? I

• The development stage is tedious because every single attempt takes a long time.

• How to detect bias in your data?

• Data must be purged in an automatic way. Robustness vs Efficiency.

• Possible effects of Simpson's Paradox in numerical summaries.

• The diagnostic plots are usually not useful due to the number of graphic elements. Rethink the graphical displays.
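A Simpson-type reversal in numerical summaries can be reproduced with a tiny constructed data set. Everything below is hypothetical (two localities "A" and "B", two categories, fixed amounts): locality A has the higher mean within each category, yet locality B has the higher overall mean because its mix contains more expensive transactions.

```r
# Hypothetical data: counts and amounts are chosen by hand for illustration.
counts <- c(990, 10, 900, 100)
d <- data.frame(
  locality = rep(c("A", "A", "B", "B"), times = counts),
  category = rep(c("Super", "Travel", "Super", "Travel"), times = counts),
  amount   = rep(c(30, 600, 28, 500), times = counts)
)

# Overall mean by locality: A = 35.7, B = 75.2
overall <- tapply(d$amount, d$locality, mean)

# Mean by category and locality: A is higher than B in both rows
by_cat <- tapply(d$amount, list(d$category, d$locality), mean)

print(overall)
print(by_cat)
```

The aggregated comparison and the within-category comparison point in opposite directions, so a single global summary can mislead when the group composition differs.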


Why is everything so tough with Big Data? II

load("total.RData")

> Registros: 14710663
>    sexo              importe             localidad
>  M   :10589903   Min.   :-9010.00   Coruña    :6623162
>  V   : 4120327   1st Qu.:   12.00   Lugo      :1013069
>  NA's:     433   Median :   23.00   Ourense   : 893682
>                  Mean   :   36.70   Pontevedra:4619785
>                  3rd Qu.:   42.85   Resto     :1560965
>                  Max.   :25000.00
>    user  system elapsed
>    1.59    0.20    1.79


A couple of examples I

system.time(mimpor <- mean(importe))

>    user  system elapsed
>    0.77    0.09    1.19

system.time(medimpor <- median(importe))

>    user  system elapsed
>    0.31    0.01    0.32

system.time(mtrimpor <- mean(importe, trim = 0.05))

>    user  system elapsed
>    0.39    0.05    0.43
> Media: 39.28 Mediana: 24.32 Media truncada(5%): 30.94
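The gap between the three location estimates above is what one would expect when a few extreme values are present. The sketch below uses a small constructed vector (not the talk's data; the values and the 1% contamination rate are assumptions) to show how the mean, the median and the 5% trimmed mean react to a handful of gross errors.

```r
# Constructed example: 1000 "clean" values with mean 30.
clean <- rep(c(10, 20, 30, 40, 50), times = 200)

# Replace 1% of the values by a gross error (hypothetical).
contaminated <- clean
contaminated[1:10] <- 25000

# Three location estimates, as in the timings above.
loc <- function(x) {
  c(mean = mean(x), median = median(x), trimmed = mean(x, trim = 0.05))
}

res <- rbind(clean = loc(clean), contaminated = loc(contaminated))
print(round(res, 2))
```

In this constructed case the mean moves far away from 30 while the median and the trimmed mean stay close to it, which is the robustness versus efficiency trade-off mentioned earlier in the talk.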


A couple of examples II

> Media de importe
>   Coruña    Lugo Ourense Pontevedra   Resto
>    37.68   40.47   40.03      38.73   46.71
>          Coruña    Lugo Ourense Pontevedra   Resto
> AViaj    651.99  471.15  515.08     563.40  245.77
> Otros     34.43   35.95   35.14      35.91   42.98
> Tienda    46.75   50.14   50.01      45.72   53.51
> Ocio      34.79   68.93   50.45      48.20   45.11
> Super     29.18   32.02   28.90      28.76   27.23
> Casos
>          Coruña    Lugo Ourense Pontevedra   Resto
> AViaj      5026    1238     983       3501   17305
> Otros   4625995  654861  555500    3060637 1150450
> Tienda  1491502  278025  262802    1167781  232553
> Ocio      45926    1453    1539      33442   28332
> Super    173902   36256   35011     169909   35250
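Tables of this shape can be produced in base R with `tapply` and `table`. The slides do not show the code that generated them, so the following is only a sketch on simulated data: the data frame `datos` and the column name `categoria` are assumptions (only `importe` and `localidad` appear in the summary shown earlier), and the simulated amounts bear no relation to the real ones.

```r
# Simulated stand-in for the real data; `categoria` is a hypothetical column.
set.seed(1)
n <- 10000
datos <- data.frame(
  localidad = sample(c("Coruña", "Lugo", "Ourense", "Pontevedra", "Resto"),
                     n, replace = TRUE),
  categoria = sample(c("AViaj", "Otros", "Tienda", "Ocio", "Super"),
                     n, replace = TRUE),
  importe   = round(rexp(n, rate = 1 / 40), 2)
)

# Mean amount by locality
media_localidad <- tapply(datos$importe, datos$localidad, mean)

# Mean amount by category (rows) and locality (columns)
media_cruzada <- tapply(datos$importe,
                        list(datos$categoria, datos$localidad), mean)

# Number of cases in each cell
casos <- table(datos$categoria, datos$localidad)

print(round(media_cruzada, 2))
print(casos)
```

Reading the cell means together with the cell counts matters: the overall mean is the count-weighted average of the cell means, so cells with few cases (such as the travel category in the slide) can have extreme means without moving the marginal summaries much.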


timePlot

[Figure: time plot of the normalised level (y-axis, about 0.8 to 1.4) over the year (x-axis ticks: Jan, Apr, Jul, Oct), with two series: Suma (sum) and Media (mean).]


Calendar Plot

[Figure: calendar plot of Sum(importe) in 2015, one panel per month (January to December), days laid out Monday to Sunday; colour scale from 5e+05 to 3e+06.]


Calendar Plot II

[Figure: calendar heat map of Mean(importe) in 2015, one panel per month from enero (January) to diciembre (December); weekday columns l m m j v s d (Monday to Sunday); color scale from 34 to 50.]


Regression I

I Even the simplest techniques are hard to apply (in terms of computing time).

I We must adapt our inferences to the new paradigm (POPULATION BIG DATA).

I The idea of a single, simple model for your data that fulfils the usual hypotheses is (probably) wrong.

> [1] "Significative variables in more than 30% of repetitions"
> [1] "(Intercept)" "sexoV"
> [3] "segmento_edadAdultos 2" "segmento_edadJovenes"
> [5] "segmento_edadSenior" "localidadLugo"
> [7] "localidadOurense" "localidadResto"
> [9] "semana01"
> [1] "..."
> [1] "semana50"
> [2] "semana51"
> [3] "semana52"
> [4] "sexoV:localidadPontevedra"
> [5] "segmento_edadAdultos 2:localidadResto"
> [6] "segmento_edadJovenes:localidadResto"
> [7] "segmento_edadSenior:localidadResto"
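One way to read the output above is as the result of refitting the same regression on many subsamples and keeping the terms that are significant in more than 30% of them. The sketch below illustrates that idea in base R; it is not the author's code. The data are simulated, and the number of repetitions (`n_rep`), the subsample size (`n_sub`) and the 0.05 level are assumptions.

```r
# Hypothetical data: column names mimic the output above, values are simulated.
set.seed(1)
n <- 20000
dat <- data.frame(
  sexo      = factor(sample(c("M", "V"), n, replace = TRUE)),
  localidad = factor(sample(c("Coruna", "Lugo", "Ourense"), n, replace = TRUE))
)
# Simulated response: sexo has a real effect, localidad has none.
dat$importe <- 35 + 2 * (dat$sexo == "V") + rnorm(n, sd = 10)

n_rep <- 200   # number of repetitions (assumed)
n_sub <- 1000  # size of each subsample (assumed)

# For each repetition, fit the model on a subsample and flag
# the coefficients whose p-value is below 0.05.
signif_flags <- replicate(n_rep, {
  idx <- sample(n, n_sub)
  fit <- lm(importe ~ sexo + localidad, data = dat[idx, ])
  summary(fit)$coefficients[, "Pr(>|t|)"] < 0.05
})

# Proportion of repetitions in which each term is significant.
prop_signif <- rowMeans(signif_flags)
print(round(prop_signif, 3))
print(names(prop_signif)[prop_signif > 0.30])
```

With settings like these, a term with a real effect (`sexoV`) passes the 30% threshold, while a pure-noise factor is flagged in only a small fraction of the repetitions.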


Regression II

[Figure: Importe (y-axis, 28 to 40) against Semana (x-axis, 0 to 50); legend: SV, SS, BF.]


Effects

[Figure: eight kernel density plots (x-axis 30 to 40, y-axis Density 0.0 to 0.4), each with N = 500. Panels and bandwidths: semana11 (0.2753), semana12 (0.3022), semana13 (0.3112), semana14 (0.2884), semana49 (0.2866), semana50 (0.2903), semana51 (0.2899), semana52 (0.3216).]


Client profiles

I The filtering process must be carefully automated.

I Building each profile is an easily parallelized task.

I In this case, as a proof of concept, we reduce the complexity of the analysis by increasing the complexity of the statistical objects (functional data analysis).

I Because of the new nature of the objects, the classical inferences and numerical methods must be adapted accordingly.

[Figure: log(importe), 20 sample curves; x-axis t (0 to 50), y-axis X(t) (-1 to 5).]
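Because each client's profile depends only on that client's records, the construction can be split by client and distributed. The following is a minimal sketch, not the author's pipeline. It assumes a transaction table with hypothetical columns `cliente`, `semana` and `importe`, and it takes the log of the weekly mean amount as the profile, since the slides do not give the exact definition.

```r
# Hypothetical transaction table: client id, week of the year and amount.
set.seed(2)
n_clients <- 50
tx <- data.frame(
  cliente = sample(seq_len(n_clients), 5000, replace = TRUE),
  semana  = sample(1:52, 5000, replace = TRUE),
  importe = rexp(5000, rate = 1 / 40)
)

# One profile = one curve X(t), t = 1..52: log of the weekly mean amount
# (assumed definition). Weeks without purchases are left as NA.
build_profile <- function(d) {
  weekly <- tapply(d$importe, factor(d$semana, levels = 1:52), mean)
  log(as.numeric(weekly))
}

# Each client is processed independently of the others.
by_client <- split(tx, tx$cliente)
profiles  <- do.call(rbind, lapply(by_client, build_profile))

# On Unix-alikes the same step can be distributed with, for example:
# profiles <- do.call(rbind, parallel::mclapply(by_client, build_profile, mc.cores = 2))

dim(profiles)  # one row per client, one column per week
```

The filtering step mentioned in the first bullet (for example, discarding clients with too many empty weeks) would act on the rows of `profiles` before any functional analysis.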


Functional Means

[Figure: four panels of functional means, x-axis t (0 to 50), y-axis X(t).
 log(Importe) by Sexo (3.4 to 4.2); legend: M, V.
 log(Importe) by Localidad (3.0 to 4.2); legend: Coruña, Lugo, Ourense, Pontevedra, Resto.
 log(Importe) by Actividad (3.2 to 4.0); legend: Inactivo, Ocupado, Otras, Parado.
 log(Importe) by Segmento Edad (2.5 to 4.0); legend: Adultos 1, Adultos 2, Jovenes, Senior.]


Inferences about mean

[Figure: two panels, x-axis t (0 to 50), y-axis X(t) (3.5 to 4.2). Panel titles: "N=62547-Boot(500)" and "N=5000-Sample(500)".]
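The two panel titles suggest two ways of assessing the variability of the mean curve: 500 bootstrap resamples of all N = 62547 curves, and 500 random samples of N = 5000 curves. That reading is an assumption. The sketch below contrasts the two schemes on simulated curves in base R, with much smaller, hypothetical sizes (`n`, `m`) so that it runs quickly.

```r
set.seed(3)
n  <- 2000
tt <- 1:52

# Hypothetical curves: a smooth seasonal mean plus independent noise.
mu <- 3.8 + 0.2 * sin(2 * pi * tt / 52)
X  <- matrix(rnorm(n * length(tt)), n, length(tt)) +
      matrix(mu, n, length(tt), byrow = TRUE)

B <- 500  # number of replicates, as in the panel titles

# (a) Bootstrap: resample all n curves with replacement.
boot_means <- t(replicate(B, colMeans(X[sample(n, n, replace = TRUE), ])))

# (b) Subsampling: draw m < n curves without replacement.
m <- 200
sub_means <- t(replicate(B, colMeans(X[sample(n, m), ])))

# Pointwise spread of the replicated mean curves.
sd_boot <- apply(boot_means, 2, sd)
sd_sub  <- apply(sub_means, 2, sd)
round(c(bootstrap = mean(sd_boot), subsample = mean(sd_sub)), 3)
```

The subsample means are more dispersed than the bootstrap means because each one is based on fewer curves, so bands built from small samples describe the uncertainty of a sample of that size rather than that of the full data set.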


Comparing means

[Figure: Means by Localidad; x-axis semana (0 to 50), y-axis importe (3.4 to 4.2); legend: C, LU, OU, PO.]


Example of a testing procedure

rtest = anova.RPm(fdaobj, ~sexo + sedad + act + sexo:sedad + sexo:act + sedad:act,
    data.fac = dfdatos[, -c(1, 5)])

> [1] "Pvalues with all data"
>      sexo sedad   act sexo:sedad sexo:act sedad:act
> RP30    0     0 0.024       0.01    0.564     0.001
> [1] "P(pvalue<=0.05), 500 replicas of N=10000"
>       sexo      sedad        act sexo:sedad   sexo:act
>      0.970      1.000      0.156      0.148      0.078
>  sedad:act
>      0.204
> [1] "Mean(pvalue), 500 replicas of N=10000"
>       sexo      sedad        act sexo:sedad   sexo:act
>      0.008      0.000      0.328      0.385      0.493
>  sedad:act
>      0.341
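The output above shows effects that are significant with all the data but only occasionally significant in replicas of N = 10000. The sketch below reproduces that pattern in a much simpler setting and does not reproduce the functional test itself: it uses a simulated scalar response, two hypothetical factors and base R `anova(lm())`. All sizes and effect values are assumptions.

```r
set.seed(4)
n <- 200000

# Hypothetical factors; 'sexo' has a clear effect, 'act' a very small one.
d <- data.frame(
  sexo = factor(sample(c("M", "V"), n, replace = TRUE)),
  act  = factor(sample(c("Ocupado", "Parado"), n, replace = TRUE))
)
d$y <- 0.5 * (d$sexo == "V") + 0.03 * (d$act == "Parado") + rnorm(n)

# p-values of the two main effects from the ANOVA table.
pvals <- function(data) {
  tab <- anova(lm(y ~ sexo + act, data = data))
  setNames(tab[c("sexo", "act"), "Pr(>F)"], c("sexo", "act"))
}

p_all <- pvals(d)                                      # all data
p_rep <- replicate(200, pvals(d[sample(n, 5000), ]))   # replicas of N = 5000

signif(p_all, 3)
round(rowMeans(p_rep <= 0.05), 3)  # P(pvalue <= 0.05) over the replicas
round(rowMeans(p_rep), 3)          # Mean(pvalue) over the replicas
```

A tiny effect is declared significant with the whole data set yet is detected in only a minority of the smaller replicas, which is the practical question behind "population Big Data": with enough observations almost any effect becomes significant.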


A difficult scenario

$P(Y = i \mid X) = m(X_1, X_2)$ with $X = (X_1 \mid X_2 \mid Z_1 \mid \dots \mid Z_{48})$

Training: 2000, Validation: 1000

[Figure: scatter plot of X2 against X1, both axes from -1.0 to 1.0.]
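A data set with this structure (two informative covariates, 48 noise covariates, 2000 training and 1000 validation observations) can be simulated as follows. The slides do not give the form of m, so the rule used here, class membership decided by the distance to the origin, is purely an assumption chosen because it is non-linear and gives balanced classes.

```r
set.seed(5)
n_train <- 2000
n_valid <- 1000
n <- n_train + n_valid

# Two informative covariates on [-1, 1]^2 and 48 pure-noise covariates.
X12 <- matrix(runif(2 * n, -1, 1), n, 2, dimnames = list(NULL, c("X1", "X2")))
Z   <- matrix(rnorm(48 * n), n, 48, dimnames = list(NULL, paste0("Z", 1:48)))
X   <- data.frame(X12, Z)

# ASSUMPTION: m() is not given in the slides. Here group 1 is the disc
# X1^2 + X2^2 < 2/pi, which covers half of the square.
grupo <- factor(ifelse(X$X1^2 + X$X2^2 < 2 / pi, 1, 2))

train <- seq_len(n_train)
valid <- setdiff(seq_len(n), train)

# A linear logistic model on the informative covariates cannot follow
# a circular boundary, so its validation accuracy stays near 0.5.
fit  <- glm(grupo ~ X1 + X2, family = binomial,
            data = cbind(X, grupo = grupo)[train, ])
pred <- ifelse(predict(fit, newdata = X[valid, ], type = "response") > 0.5, 2, 1)
table(predicted = pred, observed = grupo[valid])
```

Under this assumed rule, a linear classifier performs close to chance even when it is given exactly the right covariates, which is the kind of difficulty the following slides explore.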


Classical tools

BIG problem: How to select the information?

> Best with Linear Corr: Z7 Z35
> Best with Dist. Corr: X1 X2 Z7 Z28

res.lda1 = lda(X, grupo)

>    grupo
>       1   2
>   1 572 454
>   2 435 539

res.lda2 = lda(X[, l2], grupo)

>    grupo
>       1   2
>   1 578 506
>   2 429 487
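The selection step above contrasts linear correlation with "Dist. Corr", read here as distance correlation, which can detect non-linear dependence that Pearson correlation misses. The sketch below writes out the sample distance correlation (the V-statistic version of Székely, Rizzo and Bakirov) in base R and applies it to simulated data. The variables `x1`, `z` and `y` are hypothetical and are not the data of the slides.

```r
# Sample distance correlation between two numeric vectors.
dist_cor <- function(x, y) {
  center <- function(v) {
    D <- as.matrix(dist(v))
    # Double centering: subtract row and column means, add the grand mean.
    sweep(sweep(D, 1, rowMeans(D)), 2, colMeans(D)) + mean(D)
  }
  A <- center(x)
  B <- center(y)
  dcov2 <- mean(A * B)                       # squared distance covariance
  sqrt(dcov2 / sqrt(mean(A * A) * mean(B * B)))
}

set.seed(6)
n  <- 500
x1 <- runif(n, -1, 1)
z  <- rnorm(n)                  # pure noise
y  <- as.numeric(x1^2 > 1 / 3)  # depends on x1 only through x1^2

round(c(lin_x1 = abs(cor(x1, y)), lin_z = abs(cor(z, y)),
        dc_x1  = dist_cor(x1, y), dc_z  = dist_cor(z, y)), 3)
```

Because `y` depends on `x1` symmetrically, the linear correlation is close to zero for both candidates, while the distance correlation is clearly larger for `x1` than for the noise variable.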


GLM, GAM and random Forest I

> [1] "GLM:X1 + X2 + Z7 + Z28"
>     1   2   1   2
> 1 573 453 578 506
> 2 434 540 429 487
> [1] "GAM:s(X1) + s(X2) + s(Z7) + s(Z28)"
>      1   2    1   2
> 1 1007   0 1007   0
> 2    0 993    0 993


GLM, GAM and random Forest II

>
> Call:
>  randomForest(x = X[, l2], y = grupo, ntree = 2000, importance = TRUE)
>                Type of random forest: classification
>                      Number of trees: 2000
> No. of variables tried at each split: 2
>
>         OOB estimate of  error rate: 3.55%
> Confusion matrix:
>     1   2 class.error
> 1 982  25  0.02482622
> 2  46 947  0.04632427
> Error rate0: 0.216


GLM, GAM and random Forest III

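[Figure: two random forest variable importance plots (MeanDecreaseAccuracy).
 First panel, importance axis 0 to 80, variables in the order extracted: Z33, Z19, Z21, Z27, Z37, Z39, Z5, Z45, Z8, Z14, Z12, Z29, Z31, Z43, Z20, Z32, Z3, Z41, Z9, Z25, Z42, Z16, Z7, Z10, Z18, Z35, Z47, Z46, X1, X2.
 Second panel, importance axis 0 to 400, variables: Z7, Z28, X1, X2.]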


Neural Networks and SVM I

res.nn0 = nnet(ff0, data = Xe, size = 15, rang = 0.2, maxit = 1000, trace = FALSE)

res.nn = nnet(ff1, data = Xe, size = 15, rang = 0.2, maxit = 500, trace = FALSE)

> [1] "Number of weights:91"

cbind(table(grnn0, grupo), table(grnn, grupo))

>     1   2    1   2
> 1 956  52 1007   0
> 2  51 941    0 993
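The reported number of weights can be checked by hand. For a single-hidden-layer network with bias units, one output unit and no skip-layer connections, the count is (inputs + 1) x hidden units + (hidden units + 1) x outputs. The helper `n_weights` below is a hypothetical name introduced for this check. That `ff1` uses four covariates and `ff0` all fifty is an assumption, although four inputs does reproduce the 91 printed above.

```r
# Number of weights of a single-hidden-layer network with bias units
# and no skip-layer connections.
n_weights <- function(n_inputs, size, n_outputs = 1) {
  (n_inputs + 1) * size + (size + 1) * n_outputs
}

n_weights(4, 15)   # four inputs, 15 hidden units: 75 + 16 = 91
n_weights(50, 15)  # fifty inputs, 15 hidden units: 765 + 16 = 781
```

With only 2000 training observations, a network with several hundred weights has ample room to fit noise covariates, which is worth keeping in mind when comparing the two confusion matrices.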


Neural Networks and SVM II

res.svm = svm(ff1, data = Xe, cross = 5)

> [1] "Number of SV:821"

cbind(table(res.svm0$fitted, grupo), table(res.svm$fitted, grupo))

>     1   2   1   2
> 1 948  96 983  29
> 2  59 897  24 964

[Figure: four panels in the (X1, X2) plane, titled GAM, Random Forest, Neural Network and SVM. Axes run from -1.0 to 1.0 in the first three panels and from -1.5 to 1.5 in the SVM panel.]


Predictive performance I

        GLM.1 GLM.2 GAM.1 GAM.2 RF.1 RF.2 NN.1 NN.2 SVM.1 SVM.2
G1(0)     262   266   513    15  466   62  245  283   287   241
G2(0)     240   232     9   463  161  311  222  250   220   252
G1(S)     249   279   527     1  509   19  517   11   508    20
G2(S)     251   221     5   467   20  452   13  459    16   456

Table: Predictive performance

What's better (in this case)?

I Although the performance of GAM seems to be optimal, none of the variables are marked as significant.

I The Random Forest procedure seems to be the best for detecting the significant variables.

I No useful inferences can be obtained from the NN and SVM procedures. Their good predictive ability comes at the cost of a huge number of parameters.
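The table can be summarized as one accuracy per method and covariate set. The sketch below copies the counts from the table and assumes that rows G1 and G2 are the observed groups, that columns `.1` and `.2` are the predicted groups, and that "(0)" and "(S)" stand for all covariates and the selected ones. That labelling is an interpretation, not something the slide states.

```r
# 2 x 2 confusion matrix: rows = observed group, columns = predicted group.
cm <- function(...) matrix(c(...), nrow = 2, byrow = TRUE)

# Counts copied from the table above (validation sample, 1000 observations).
counts <- list(
  GLM = list(all = cm(262, 266, 240, 232), sel = cm(249, 279, 251, 221)),
  GAM = list(all = cm(513,  15,   9, 463), sel = cm(527,   1,   5, 467)),
  RF  = list(all = cm(466,  62, 161, 311), sel = cm(509,  19,  20, 452)),
  NN  = list(all = cm(245, 283, 222, 250), sel = cm(517,  11,  13, 459)),
  SVM = list(all = cm(287, 241, 220, 252), sel = cm(508,  20,  16, 456))
)

# Accuracy = correctly classified / total.
accuracy <- function(m) sum(diag(m)) / sum(m)

acc <- t(sapply(counts, function(x) {
  c(all = accuracy(x$all), sel = accuracy(x$sel))
}))
round(acc, 3)
```

Under that reading, GLM stays near 0.49 and 0.47, GAM reaches 0.976 and 0.994, and Random Forest, NN and SVM move from 0.777, 0.495 and 0.539 to 0.961, 0.976 and 0.964 once the covariates are selected.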


Energy example I

[Figure: four panels of daily curves, colored by year (2008 to 2014).
 Y = Demanda-Real (actual demand), day t; x-axis t = 1/144, ..., 1 (10-minute grid); y-axis MW, 20000 to 35000.
 X = Demanda-Programada (scheduled demand), day t; x-axis t = 1/144, ..., 1 (10-minute grid); y-axis MW, 20000 to 35000.
 Energy; x-axis hours (5 to 20); y-axis MWh, 10000 to 40000.
 Price; x-axis hours (5 to 20); y-axis EUR/MWh, 0 to 150.]


Where is the information? I

>       En.h6 En.h12 En.h18 En.h24 Pr.h6 Pr.h12 Pr.h18 Pr.h24
> dsem  0.010  0.137  0.129  0.026 0.004  0.029  0.025  0.006
> year  0.095  0.059  0.044  0.120 0.114  0.169  0.148  0.185
> mes   0.052  0.026  0.029  0.059 0.052  0.046  0.070  0.042
> tmax  0.099  0.023  0.016  0.088 0.139  0.056  0.020  0.029
> tmin  0.058  0.019  0.009  0.078 0.083  0.043  0.029  0.026
> tmed  0.086  0.024  0.016  0.092 0.120  0.054  0.026  0.030
> velm     NA     NA     NA     NA    NA     NA     NA     NA
> rtemp 0.071  0.008  0.006  0.025 0.128  0.052  0.014  0.032


Where is the information? II

>         En.h6 En.h12 En.h18 En.h24 Pr.h6 Pr.h12 Pr.h18 Pr.h24
> En1.h1  0.465  0.204  0.210  0.481 0.030  0.029  0.025  0.030
> En1.h2  0.508  0.188  0.187  0.423 0.042  0.039  0.030  0.037
> En1.h3  0.552  0.197  0.184  0.391 0.052  0.046  0.035  0.040
> En1.h4  0.590  0.218  0.197  0.400 0.060  0.051  0.041  0.046
> En1.h5  0.611  0.229  0.201  0.405 0.066  0.053  0.042  0.047
> En1.h6  0.636  0.252  0.219  0.420 0.065  0.051  0.042  0.047
> En1.h7  0.633  0.310  0.265  0.421 0.047  0.043  0.040  0.041
> En1.h8  0.456  0.349  0.283  0.370 0.019  0.027  0.034  0.023
> En1.h9  0.331  0.346  0.281  0.306 0.008  0.016  0.021  0.013
> En1.h10 0.336  0.351  0.283  0.330 0.007  0.014  0.020  0.013
> En1.h11 0.327  0.361  0.303  0.334 0.006  0.015  0.018  0.012
> En1.h12 0.356  0.380  0.322  0.359 0.007  0.017  0.021  0.014
> En1.h13 0.349  0.385  0.346  0.372 0.008  0.015  0.019  0.014
> En1.h14 0.345  0.372  0.358  0.361 0.009  0.016  0.018  0.012
> En1.h15 0.387  0.377  0.380  0.408 0.013  0.018  0.021  0.015
> En1.h16 0.351  0.371  0.391  0.363 0.012  0.015  0.015  0.014
> En1.h17 0.346  0.374  0.394  0.349 0.011  0.016  0.017  0.012
> En1.h18 0.356  0.380  0.391  0.379 0.010  0.017  0.022  0.011
> En1.h19 0.400  0.383  0.368  0.478 0.010  0.020  0.041  0.014
> En1.h20 0.425  0.372  0.328  0.536 0.010  0.018  0.046  0.014
> En1.h21 0.448  0.389  0.324  0.585 0.011  0.022  0.050  0.019
> En1.h22 0.488  0.417  0.353  0.679 0.012  0.022  0.041  0.024
> En1.h23 0.521  0.434  0.395  0.759 0.013  0.021  0.039  0.024
> En1.h24 0.561  0.369  0.351  0.698 0.019  0.023  0.034  0.027

Where is the information? III

         En.h6 En.h12 En.h18 En.h24 Pr.h6 Pr.h12 Pr.h18 Pr.h24
Pr1.h1   0.063  0.014  0.010  0.014 0.412  0.420  0.334  0.494
Pr1.h2   0.068  0.012  0.008  0.010 0.447  0.383  0.277  0.457
Pr1.h3   0.074  0.013  0.010  0.010 0.460  0.339  0.210  0.386
Pr1.h4   0.073  0.014  0.011  0.012 0.468  0.332  0.196  0.371
Pr1.h5   0.070  0.013  0.011  0.012 0.482  0.334  0.192  0.363
Pr1.h6   0.079  0.011  0.011  0.013 0.532  0.386  0.230  0.392
Pr1.h7   0.083  0.011  0.013  0.017 0.564  0.453  0.281  0.409
Pr1.h8   0.050  0.013  0.013  0.013 0.464  0.476  0.354  0.400
Pr1.h9   0.041  0.016  0.014  0.014 0.426  0.474  0.350  0.369
Pr1.h10  0.042  0.016  0.013  0.018 0.425  0.496  0.382  0.409
Pr1.h11  0.053  0.017  0.014  0.022 0.467  0.549  0.411  0.459
Pr1.h12  0.059  0.017  0.014  0.024 0.499  0.587  0.440  0.488
Pr1.h13  0.058  0.018  0.017  0.026 0.492  0.595  0.460  0.487
Pr1.h14  0.064  0.016  0.015  0.025 0.530  0.613  0.458  0.508
Pr1.h15  0.065  0.015  0.014  0.021 0.545  0.606  0.474  0.522
Pr1.h16  0.067  0.014  0.014  0.019 0.562  0.593  0.447  0.489
Pr1.h17  0.063  0.015  0.015  0.019 0.552  0.593  0.464  0.473
Pr1.h18  0.056  0.019  0.017  0.025 0.503  0.591  0.520  0.482
Pr1.h19  0.048  0.023  0.017  0.040 0.376  0.529  0.595  0.460
Pr1.h20  0.053  0.030  0.020  0.067 0.251  0.398  0.522  0.357
Pr1.h21  0.045  0.021  0.013  0.045 0.274  0.436  0.512  0.389
Pr1.h22  0.058  0.029  0.024  0.065 0.297  0.463  0.489  0.409
Pr1.h23  0.069  0.022  0.018  0.041 0.473  0.638  0.571  0.583
Pr1.h24  0.093  0.019  0.015  0.025 0.588  0.622  0.510  0.633

Where is the information? IV

         En.h6 En.h12 En.h18 En.h24 Pr.h6 Pr.h12 Pr.h18 Pr.h24
y1       0.136  0.182  0.162  0.188 0.016  0.023  0.051  0.010
y7       0.104  0.399  0.374  0.247 0.013  0.056  0.088  0.010
x0       0.127  0.556  0.526  0.334 0.011  0.090  0.135  0.009
nu0      0.013  0.012  0.008  0.026 0.010  0.013  0.019  0.015
fu0      0.137  0.113  0.097  0.158 0.025  0.056  0.050  0.031
fu7      0.114  0.089  0.073  0.138 0.026  0.053  0.048  0.034
cc0      0.106  0.161  0.132  0.231 0.055  0.047  0.043  0.039
hi0      0.059  0.018  0.013  0.041 0.134  0.142  0.118  0.140
eo0      0.036  0.006  0.005  0.006 0.136  0.060  0.016  0.031
price1   0.083  0.025  0.021  0.036 0.586  0.622  0.511  0.558
price7   0.048  0.055  0.050  0.040 0.325  0.523  0.486  0.438
energy1  0.531  0.421  0.382  0.536 0.029  0.032  0.039  0.028
energy7  0.359  0.529  0.491  0.516 0.022  0.047  0.057  0.030
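
One way to read a table like this is to rank, for each response column, the predictors with the largest score. The sketch below does that in Python with the values copied from the table above. It rests on assumptions: this slide does not name the statistic, so the code only assumes that a larger value means stronger dependence, and the function name `rank_predictors` is hypothetical, not part of the original analysis.

```python
# Scores copied from the table above (rows: predictors, columns: responses).
# Assumption: a larger score means stronger dependence with the response.
responses = ["En.h6", "En.h12", "En.h18", "En.h24",
             "Pr.h6", "Pr.h12", "Pr.h18", "Pr.h24"]

scores = {
    "y1":      [0.136, 0.182, 0.162, 0.188, 0.016, 0.023, 0.051, 0.010],
    "y7":      [0.104, 0.399, 0.374, 0.247, 0.013, 0.056, 0.088, 0.010],
    "x0":      [0.127, 0.556, 0.526, 0.334, 0.011, 0.090, 0.135, 0.009],
    "nu0":     [0.013, 0.012, 0.008, 0.026, 0.010, 0.013, 0.019, 0.015],
    "fu0":     [0.137, 0.113, 0.097, 0.158, 0.025, 0.056, 0.050, 0.031],
    "fu7":     [0.114, 0.089, 0.073, 0.138, 0.026, 0.053, 0.048, 0.034],
    "cc0":     [0.106, 0.161, 0.132, 0.231, 0.055, 0.047, 0.043, 0.039],
    "hi0":     [0.059, 0.018, 0.013, 0.041, 0.134, 0.142, 0.118, 0.140],
    "eo0":     [0.036, 0.006, 0.005, 0.006, 0.136, 0.060, 0.016, 0.031],
    "price1":  [0.083, 0.025, 0.021, 0.036, 0.586, 0.622, 0.511, 0.558],
    "price7":  [0.048, 0.055, 0.050, 0.040, 0.325, 0.523, 0.486, 0.438],
    "energy1": [0.531, 0.421, 0.382, 0.536, 0.029, 0.032, 0.039, 0.028],
    "energy7": [0.359, 0.529, 0.491, 0.516, 0.022, 0.047, 0.057, 0.030],
}


def rank_predictors(scores, responses, top=2):
    """Return, for each response, the `top` predictors with the largest score."""
    ranking = {}
    for j, response in enumerate(responses):
        # Sort predictor names by their score in column j, largest first.
        ordered = sorted(scores, key=lambda name: scores[name][j], reverse=True)
        ranking[response] = ordered[:top]
    return ranking


ranking = rank_predictors(scores, responses)
for response, best in ranking.items():
    print(response, best)
```

With these values, `price1` ranks first for every price column, while the energy columns are led by `energy1` or `x0`.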

Table of Contents

1 Introduction

2 Big Data Analysis

3 New challenges, almost same solutions
    Exploratory Data Analysis
    Client profile
    A test example
    A classification example

4 Conclusion

Summary I

I A Big Data analyst is a white unicorn that combines a statistician, a computer scientist and a business analyst with good communication skills.

I The practical implementation in a company usually requires an interdisciplinary team.

I The era of Big Data is a big opportunity for data scientists, who must develop smarter procedures for analysis.

I Big Data Analysis is not a universal and automatic answer to all questions. It must be tailored and guided by humans. No machine can select or transform the variables without expert guidance.

I Always balance the complexities (from data, from methods and models, from restrictions).

Summary II

I Make sure that the scale of the solution is adapted to the size of your problem.

I What’s the question? Are you sure that you need a Big Data Platform to answer THAT question?

I Big Data is the new universe. Use it as you previously used the population.
