Cubes - Lightweight OLAP Framework


Cubes is a lightweight online analytical processing (OLAP) framework and HTTP OLAP server. Documentation: http://packages.python.org/cubes/


Cubes: Lightweight Online Analytical Processing

Stefan Urbanek, stefan.urbanek@gmail.com, @Stiivi

April 2011

Features

■ logical model (metadata)

■ aggregated data browsing

■ OLAP HTTP server with JSON interface

■ data and metadata localization

■ multiple backends

Introduction: cubes, facts and dimensions

[Figure: a data cube made up of data cells]

fact: most detailed information

Fact examples:

• contract

• donation

• spending

• invoice

• project

• ...

Facts are measurable. Measure examples:

• contract amount

• revenue

• duration

• price with VAT

dimensions

[Figure: data cube with three dimensions: location, type, time]

■ provide context for facts

■ used to filter queries or reports

■ control scope of aggregation of facts

■ used for ordering or sorting

■ define master-detail relationships

☛ [star]

hierarchies

[Figure: hierarchy labels: date (2010, May, 1st), region]

data mart: a subject area

Summary

■ fact – most detailed information for analysis

■ measure – an attribute for computation

■ dimension – context of facts

■ hierarchy – master-detail relationship
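The four terms above can be sketched in a few lines of Python. This is only an illustration, not how Cubes stores data: the `Contract` class, its field names, the `DATE_HIERARCHY` tuple and the sample values are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contract:
    """One fact: the most detailed record available for analysis."""
    # Dimension attributes: the context of the fact (hypothetical names).
    year: int
    month: int
    region: str
    category: str
    # Measure: a numeric attribute used for computation.
    amount: float


# Hierarchy: master-detail ordering of levels within the date dimension.
DATE_HIERARCHY = ("year", "month")

# Made-up sample facts.
facts = [
    Contract(2010, 5, "ee", "it", 120.0),
    Contract(2010, 6, "ee", "construction", 80.0),
    Contract(2009, 5, "pl", "it", 50.0),
]

# A measure can be aggregated over any set of facts.
total_amount = sum(fact.amount for fact in facts)
print(total_amount)
```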

Slicing and Dicing

[Figure: data cube with dimensions location (Estonia, Poland, Hungary, Slovakia, Czech Republic), type, and time (2006 to 2010), cut step by step]

■ spending in 2010

■ contracts in Estonia

■ contracts in Estonia in 2010

■ IT contracts

■ IT contracts in Estonia in 2010
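Slicing and dicing amounts to keeping only the facts that match chosen dimension values. A minimal sketch, assuming facts are plain dictionaries; the `slice_facts` helper, the key names and the sample values are hypothetical and not part of Cubes:

```python
# Made-up sample facts; keys stand for the location, type and time dimensions.
facts = [
    {"year": 2010, "region": "ee", "type": "it", "amount": 120.0},
    {"year": 2010, "region": "ee", "type": "construction", "amount": 80.0},
    {"year": 2010, "region": "pl", "type": "it", "amount": 60.0},
    {"year": 2009, "region": "ee", "type": "it", "amount": 50.0},
]


def slice_facts(facts, **cuts):
    """Keep only the facts whose dimension values match every given cut."""
    return [
        fact for fact in facts
        if all(fact[dimension] == value for dimension, value in cuts.items())
    ]


contracts_2010 = slice_facts(facts, year=2010)
estonia_2010 = slice_facts(facts, year=2010, region="ee")
it_estonia_2010 = slice_facts(facts, year=2010, region="ee", type="it")
```

Each additional cut narrows the cell further, which mirrors the progression from "spending in 2010" to "IT contracts in Estonia in 2010".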

Measures can be aggregated:

■ spending in 2010

■ revenue from IT projects

■ top 10 contractors
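The three aggregations listed above could look like this in plain Python. It is a sketch under assumptions: the `total` and `top_n` helpers, the `contractor` key and the sample numbers are invented for illustration.

```python
from collections import defaultdict

# Made-up sample facts.
contracts = [
    {"year": 2010, "type": "it", "contractor": "Alpha", "amount": 120.0},
    {"year": 2010, "type": "construction", "contractor": "Beta", "amount": 80.0},
    {"year": 2009, "type": "it", "contractor": "Alpha", "amount": 50.0},
    {"year": 2010, "type": "it", "contractor": "Gamma", "amount": 30.0},
]


def total(facts, measure="amount"):
    """Sum one measure over an iterable of facts."""
    return sum(fact[measure] for fact in facts)


def top_n(facts, key, n, measure="amount"):
    """Sum the measure per value of `key` and return the n largest totals."""
    totals = defaultdict(float)
    for fact in facts:
        totals[fact[key]] += fact[measure]
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


spending_2010 = total(f for f in contracts if f["year"] == 2010)
it_total = total(f for f in contracts if f["type"] == "it")
top_contractors = top_n(contracts, "contractor", n=10)
```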

Drilling down

drill down by date: looking at a more detailed level

[Figure: bar charts of ∑amount at three levels of the date hierarchy: top level (all years, total 250), year level (2006 to 2010), month level (Jan, Feb, Mar, Apr, ...)]
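Drilling down can be read as grouping the same facts by one more level of a hierarchy. A minimal sketch, assuming a two-level date hierarchy; the `aggregate` helper, `DATE_LEVELS` and the sample values are hypothetical:

```python
from collections import defaultdict

# Assumed levels of the date hierarchy, from least to most detailed.
DATE_LEVELS = ("year", "month")

# Made-up sample facts.
facts = [
    {"year": 2009, "month": 11, "amount": 40.0},
    {"year": 2010, "month": 3, "amount": 10.0},
    {"year": 2010, "month": 3, "amount": 5.0},
    {"year": 2010, "month": 4, "amount": 15.0},
]


def aggregate(facts, depth):
    """Sum amount grouped by the first `depth` levels of the date hierarchy."""
    totals = defaultdict(float)
    for fact in facts:
        key = tuple(fact[level] for level in DATE_LEVELS[:depth])
        totals[key] += fact["amount"]
    return dict(totals)


top_level = aggregate(facts, 0)    # one total for all years
year_level = aggregate(facts, 1)   # one total per year
month_level = aggregate(facts, 2)  # one total per (year, month)
```

The totals at every level add up to the same top-level sum; only the granularity changes.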

Logical Model: description based on how you analyze data

Logical Model

■ user’s or analyst’s perspective: how data are being measured, aggregated and reported

■ abstraction over physical data (✶ star or ❄ snowflake schema)

[Figure: chart of ∑amount with shares of 24%, 28%, 16%, 20% and 12%]

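Logical model objects and their properties

Legend: ! = required during cube creation and denormalisation; $ = required by browser

■ Model: name, label (localizable), description (localizable), locale

■ Cube: name, label (localizable), description (localizable), measures, details, fact table (!)

■ Dimension: name, label (localizable), description (localizable), default hierarchy

■ Hierarchy: name, label (localizable)

■ Level: name, label (localizable), key attribute, label attribute

■ Attribute: name, label (localizable), description (localizable), locales (!, $)

■ Join (!): master, detail, alias

■ Mapping (!): master, detail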

Slicer: Cubes OLAP server

[Figure: an application sends an HTTP request to Slicer and receives a JSON reply; Slicer uses the model and the Aggregation Browser, which works with a cell (point of view), ∑ aggregates and facts (details)]

GET /model

{
    "cubes": {
        "contracts": {
            "measures": [ { "name": "amount", "label": "Contract amount" } ],
            "dimensions": ["date", "supplier", "process_type", "cpv"]
        }
    },
    "dimensions": { "supplier": { ... } }
    ...
}

GET /aggregate: aggregate measures

GET /aggregate

{ "drilldown": {}, "summary": { "record_count": 19278, "zmluva_hodnota_sum": 11222821530.12966 } }
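A client might build the request URL and read the summary from the JSON reply roughly as follows. This is a sketch using only the Python standard library; `BASE_URL`, `aggregate_url` and `fetch_aggregate` are hypothetical names, and the server address is an assumption. The sample reply is the one shown above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:5000"  # hypothetical address of a running Slicer


def aggregate_url(base_url, **params):
    """Build an /aggregate URL; ':', '|' and ',' are left unescaped."""
    query = urlencode(params, safe=":|,")
    return f"{base_url}/aggregate" + (f"?{query}" if query else "")


def fetch_aggregate(base_url, **params):
    """Send the request and decode the JSON reply (needs a running server)."""
    with urlopen(aggregate_url(base_url, **params)) as response:
        return json.load(response)


# Parsing the reply shown above, without contacting any server.
sample_reply = (
    '{ "drilldown": {}, "summary": { "record_count": 19278, '
    '"zmluva_hodnota_sum": 11222821530.12966 } }'
)
summary = json.loads(sample_reply)["summary"]
print(summary["record_count"], summary["zmluva_hodnota_sum"])
```

The key `zmluva_hodnota_sum` follows the measure name used in the author's sample data, with the aggregation appended.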

cut=...: slice and dice with the cut parameter

GET /aggregate?cut=date:2010

{ "drilldown": {}, "summary": { "date.year": 2011, "record_count": 64, "zmluva_hodnota_sum": 78717997.108 } }

GET /aggregate?cut=date:2010|region:ee

∑amount for Estonia in 2010

cut

cut=date:2010|region:ee

Each cut is a dimension point written as dimension:path:

■ date:2010

■ region:ee

hierarchies

date:2010,12

The path follows the levels of the hierarchy: 2010 is the year level, 12 is the month level.

more hierarchies

date:2010,12|category:it,sw|region:ee

contracts in December 2010 in Estonia for IT – Software

■ pipe separates cuts

■ comma separates levels
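The cut syntax is simple enough to parse by hand. The following sketch splits a cut string on pipes and commas as described above; `parse_cut` and `format_cut` are hypothetical helpers written for this example, not functions of the Cubes library.

```python
def parse_cut(cut):
    """Split a cut string into {dimension: [path components]}.

    Pipes separate cuts, a colon separates the dimension from its path,
    and commas separate the hierarchy levels within the path.
    """
    cuts = {}
    for part in cut.split("|"):
        if not part:
            continue  # tolerate a trailing pipe
        dimension, _, path = part.partition(":")
        cuts[dimension] = path.split(",") if path else []
    return cuts


def format_cut(cuts):
    """Inverse of parse_cut: build the cut string from a dictionary."""
    return "|".join(
        f"{dimension}:{','.join(path)}" for dimension, path in cuts.items()
    )


parsed = parse_cut("date:2010,12|category:it,sw|region:ee")
print(parsed)
```

Path components are kept as strings here; whether a component such as 2010 should be treated as a number depends on the model and is left open.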

PUT /report: if you want multiple tables and charts with a single request

drilldown=...: get more details

[Figure: bar charts of ∑amount for all years, by year (2006 to 2010) and by month]

drilldown=date

implicit hierarchy: the drilldown reports on the level just below the one reached by the cut

■ report by year: drilldown=date

■ report by month: drilldown=date&cut=date:2010

■ report by day: drilldown=date&cut=date:2010,11

more dimensions, for cross-tables:

drilldown=date,supplier
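The implicit hierarchy rule can be expressed as "report on the first level not yet fixed by the cut". A minimal sketch, assuming the date hierarchy has the levels year, month and day; `DATE_LEVELS` and `drilldown_level` are hypothetical names:

```python
# Assumed levels of the date hierarchy, from least to most detailed.
DATE_LEVELS = ["year", "month", "day"]


def drilldown_level(levels, cut_path):
    """Return the level a drilldown reports on, given the path already cut."""
    depth = len(cut_path)
    if depth >= len(levels):
        raise ValueError("the cut already reaches the most detailed level")
    return levels[depth]


print(drilldown_level(DATE_LEVELS, []))              # drilldown=date
print(drilldown_level(DATE_LEVELS, ["2010"]))        # cut=date:2010
print(drilldown_level(DATE_LEVELS, ["2010", "11"]))  # cut=date:2010,11
```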

Attributes and Measures: naming

attribute names take the form dimension.attribute:

■ category.description

■ date.year

■ date.month

■ region.region_name

■ region.region_code

■ region.city_name

hierarchical attribute dependencies are implicit, defined in the model

there is no “date” data type: date components (year, month, day) are normal attributes of the date dimension

measure aggregation:

■ received_amount_sum

■ record_count
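The naming scheme can be captured in two small helpers. This is a sketch of the convention as it appears in the examples above (dimension.attribute, and measure name plus an aggregation suffix); `split_attribute` and `aggregated_name` are hypothetical functions, and the suffix rule is inferred from `received_amount_sum` rather than taken from the Cubes documentation.

```python
def split_attribute(reference):
    """Split 'date.year' into its dimension and attribute parts."""
    dimension, _, attribute = reference.partition(".")
    return dimension, attribute


def aggregated_name(measure, aggregation="sum"):
    """Name of an aggregated measure, e.g. 'received_amount_sum' (inferred rule)."""
    return f"{measure}_{aggregation}"


print(split_attribute("region.region_name"))
print(aggregated_name("received_amount"))
```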

page=...&order=...: ordering and pagination

page=3&pagesize=20 (3rd page with 20 results per page)

order=region.name (order by region name)

order=amount:desc (order by amount descending)
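What the server does with these parameters can be imitated on a list of result rows. The sketch below assumes that pages are numbered from 1, as the "3rd page" wording suggests, and that ascending order is the default; `parse_order`, `order_and_page` and the sample rows are hypothetical.

```python
def parse_order(order):
    """Split 'amount:desc' into (attribute, descending flag)."""
    attribute, _, direction = order.partition(":")
    return attribute, direction == "desc"


def order_and_page(rows, order, page, pagesize):
    """Sort rows by the order attribute and return one page (pages start at 1)."""
    attribute, descending = parse_order(order)
    ordered = sorted(rows, key=lambda row: row[attribute], reverse=descending)
    start = (page - 1) * pagesize
    return ordered[start:start + pagesize]


# Made-up result rows.
rows = [
    {"region.name": "Estonia", "amount": 30},
    {"region.name": "Poland", "amount": 70},
    {"region.name": "Hungary", "amount": 50},
    {"region.name": "Slovakia", "amount": 10},
]

by_amount_page_2 = order_and_page(rows, "amount:desc", page=2, pagesize=2)
by_name_page_1 = order_and_page(rows, "region.name", page=1, pagesize=2)
```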

Natural Ordering

attributes can have a default order specified in the model

order=year:asc

... might be omitted if the model contains:

{"name": "year", "order": "asc"}

lang=...: transparent localization

Localization

[Figure: master model + translations]

drilldown=type&lang=en

drilldown=type&lang=sk

report query is language independent
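One way to picture "master model plus translations" is a label lookup that prefers the requested language and otherwise uses the master model. This is a sketch only: `MASTER_LABELS`, `TRANSLATIONS`, `localized_label`, the sample labels and the fallback behaviour are assumptions, not the Cubes implementation.

```python
# Hypothetical labels from the master model (English).
MASTER_LABELS = {"type": "Contract type", "amount": "Amount"}

# Hypothetical translations, keyed by language code.
TRANSLATIONS = {"sk": {"type": "Typ zmluvy"}}


def localized_label(name, lang):
    """Return the label in `lang`, falling back to the master model label."""
    return TRANSLATIONS.get(lang, {}).get(name, MASTER_LABELS[name])


# The same query name "type" yields different labels per language.
print(localized_label("type", "en"))
print(localized_label("type", "sk"))
```

The query itself (`drilldown=type`) stays the same; only the labels in the reply change with `lang`.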

Cubes: online analytical processing

github/bitbucket: Stiivi

☛ References and recommended further reading

[star] Christopher Adamson: Star Schema, 2010