SDMX Global Conference, 28-30 September 2015. SDMX into the future. VTL (Validation and Transformation Language): A new technical standard for enhancing data validation and processing. Vincenzo Del Vecchio - Bank of Italy; Marco Pellegrino - Eurostat. SDMX TWG & VTL Task Force


SDMX Global Conference, 28-30 September 2015

SDMX into the future

VTL(Validation and Transformation Language)

A new technical standard for enhancing data validation and processing

Vincenzo Del Vecchio - Bank of Italy

Marco Pellegrino – Eurostat

SDMX TWG & VTL Task Force


Approach

SDMX originally focused on data collection and dissemination

Current trend: support more stages of the statistical production process

Generic Statistical Business Process Model


What is VTL

A standard language for defining validation and transformation rules

Validation (now)

Transformation (partially now, to be enriched at a later stage)

Main goals:

Define and preserve validation and transformation rules

Exchange and share rules

Apply rules in industrialized processes

Apply to several standards (e.g. SDMX, DDI, GSIM) thanks to a generic information model


The VTL Information Model

VTL is a “stand-alone” specification
– It can be used with SDMX, DDI, GSIM or potentially anything else
– It can be used on its own

VTL has its own information model
– All kinds of data are modelled as mathematical functions having independent variables (Identifiers) and dependent variables (Measures and Attributes)
– GSIM IM is used as a basis
– It can be mapped against SDMX
– It can be mapped against DDI
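The "data as a mathematical function" idea can be illustrated with a minimal Python sketch. This is not VTL syntax: the identifier names, measure and attribute names, and all values below are hypothetical, chosen only to show that each combination of Identifiers determines at most one data point carrying Measures and Attributes.

```python
from typing import Dict, Optional, Tuple

# Hypothetical Data Set: population by (age_group, civil_status).
# Identifiers are the independent variables, so they form the dictionary key.
# Measures and Attributes are the dependent variables, so they form the value.
Key = Tuple[str, str]

population: Dict[Key, dict] = {
    ("0-14", "single"): {"obs_value": 1200, "obs_status": "A"},
    ("15-64", "single"): {"obs_value": 3400, "obs_status": "A"},
    ("15-64", "married"): {"obs_value": 5100, "obs_status": "E"},
}


def lookup(data_set: Dict[Key, dict], age_group: str, civil_status: str) -> Optional[dict]:
    """Evaluate the 'function': one identifier combination gives at most one data point."""
    return data_set.get((age_group, civil_status))


point = lookup(population, "15-64", "married")
print(point["obs_value"])  # measure value for that identifier combination
```

Because the identifiers are the key, two data points with the same identifier values cannot coexist, which is the property the functional view relies on.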


Main VTL drivers (1)

Business orientation
– Designed for use by subject-matter experts

Integrated Approach
– Any kind of data
– Independent of the phase of the process
– Unique language for validation and calculation


Main VTL drivers (2)

IT implementation independence
– Independent of IT tools
– Allowing multiple tools
– Resilient to changes of tools

Active Role for processing
– Formal (described by means of BNF)
– Able to drive the validation & calculation software

Extensible and customizable


VTL 1.0 Operators


VTL features (1)

Declarative language based on Expressions:

D4 := Check( (D1 - D2) = D3)

D1, D2, D3: Operands

D4: Result

Check, -, =: Operators

Operates on Data Sets (SDMX Dataflows)

D1, D2, D3, D4 are typically Data Sets, e.g.:

D1 – population at time T by age and civil status

D2 – population at time T-1 by age and civil status

D3 – population flows between T-1 and T by age and civil status

D4 – consistency of population figures (true / false), by age and civil status

… and on parts of Data Sets, e.g. Time Series, Cross Sections, single Data Points
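To make the population example concrete, here is a minimal Python sketch of what an expression like D4 := Check( (D1 - D2) = D3) computes over whole Data Sets. It is not VTL and does not claim to reproduce the exact VTL 1.0 semantics: the helper names `subtract` and `check_equal`, the figures, and the choice to match only identifier combinations present in both operands are all assumptions made for illustration.

```python
# Each Data Set maps (age_group, civil_status) to a single measure value.
# All figures are invented for illustration.
d1 = {("0-14", "single"): 1000, ("15-64", "married"): 5000}  # population at T
d2 = {("0-14", "single"): 980, ("15-64", "married"): 4900}   # population at T-1
d3 = {("0-14", "single"): 20, ("15-64", "married"): 90}      # flows between T-1 and T


def subtract(a, b):
    """Subtract measures for identifier combinations present in both operands (assumed matching rule)."""
    return {key: a[key] - b[key] for key in a.keys() & b.keys()}


def check_equal(left, right):
    """Return a boolean Data Set: True where the two measures agree."""
    return {key: left[key] == right[key] for key in left.keys() & right.keys()}


# Rough analogue of: D4 := Check( (D1 - D2) = D3)
d4 = check_equal(subtract(d1, d2), d3)
print(d4)
```

The result has the same identifiers as the operands (age and civil status) and a boolean measure, which corresponds to the description of D4 as "consistency of population figures (true / false)".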


VTL features (2)

Supports operations on many types of statistical data, e.g.:

Dimensional and Unit data, Survey and register data,

Quantitative and qualitative data, …

... And can combine them, e.g.:

D1 – Securities Register (by security id)

D2 – Securities Holdings (by security holder, security id, date)

D3 := merge (D1, D2, on (D1#sec_id = D2#sec_id), return D2#sec_holder, D2#sec_id, D1#sec_type)

(produces D3 by adding to D2 the security type taken from D1)
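The effect of the merge expression can be sketched in Python as a join on the security id. This is only an illustration under stated assumptions: the sample records and the function name `merge_security_type` are hypothetical, and dropping holdings with no matching register entry is an assumed choice, not something the slide specifies.

```python
# D1: Securities Register, keyed by security id (values are invented).
sec_register = {
    "IT001": {"sec_type": "bond"},
    "IT002": {"sec_type": "share"},
}

# D2: Securities Holdings, identified by holder, security id and date.
sec_holdings = [
    {"sec_holder": "H1", "sec_id": "IT001", "date": "2015-06-30"},
    {"sec_holder": "H2", "sec_id": "IT002", "date": "2015-06-30"},
    {"sec_holder": "H3", "sec_id": "IT999", "date": "2015-06-30"},  # not in the register
]


def merge_security_type(register, holdings):
    """Join holdings with the register on sec_id and return holder, id and type."""
    result = []
    for row in holdings:
        entry = register.get(row["sec_id"])
        if entry is None:
            continue  # assumption: unmatched holdings are dropped
        result.append(
            {
                "sec_holder": row["sec_holder"],
                "sec_id": row["sec_id"],
                "sec_type": entry["sec_type"],
            }
        )
    return result


d3 = merge_security_type(sec_register, sec_holdings)
print(d3)
```

The returned records carry exactly the three components listed in the `return` clause of the expression: the holder and security id from D2 and the security type from D1.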


VTL features (3)

Can concatenate expressions:

D4 := Check( (D1 - D2) = D3)

D5 := if D4 = False then D2 else D1

(the result of the former is an operand of the latter)

Considers validation as a kind of Transformation (calculation), in order to:
• Use a common language
• Use validations and calculations together, e.g.:

Validation: D4 := Check( (D1 - D2) = D3)

Calculation: D5 := if D4 = False then D2 else D1
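A short Python sketch of how a validation result can feed a calculation, mirroring D5 := if D4 = False then D2 else D1. The helper `if_then_else`, the keys and the values are hypothetical, and the point-by-point evaluation shown here is an assumed reading of the expression rather than a statement of the VTL 1.0 specification.

```python
# Invented inputs: d4 is a boolean validation result, d1 and d2 hold measures.
d1 = {"a": 1000, "b": 5000}
d2 = {"a": 980, "b": 4900}
d4 = {"a": True, "b": False}


def if_then_else(condition, when_true, when_false):
    """Pick, for each identifier, the value from one of two Data Sets based on a boolean Data Set."""
    return {
        key: (when_true[key] if condition[key] else when_false[key])
        for key in condition
        if key in when_true and key in when_false
    }


# Condition "D4 = False", evaluated data point by data point.
d4_is_false = {key: value is False for key, value in d4.items()}

# Rough analogue of: D5 := if D4 = False then D2 else D1
d5 = if_then_else(d4_is_false, d2, d1)
print(d5)
```

Here the data point that failed validation ("b") takes its value from D2, while the one that passed ("a") keeps the value from D1, so validation and calculation are expressed with the same kind of operation.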


The Transformations graph

[Figure: a graph of Data Sets and Transformations, with groups labelled Collection activity n.1, Collection activity n.2, Collection activity n.3, Analysis & research models, Statistical products and Publications.]

Legend: Di = Data Set i, Tj = Transformation j
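Because each Transformation reads some Data Sets and produces another, the set of expressions forms a dependency graph, and a valid calculation order can be derived from it. The Python sketch below illustrates this with the standard-library `graphlib` module (Python 3.9+). The edges are taken from the two example expressions used earlier in this presentation (D4 and D5), not from the figure, whose exact connections are not recoverable here.

```python
from graphlib import TopologicalSorter

# Map each calculated Data Set to the Data Sets its Transformation reads.
# D4 := Check( (D1 - D2) = D3)
# D5 := if D4 = False then D2 else D1
dependencies = {
    "D4": {"D1", "D2", "D3"},
    "D5": {"D4", "D1", "D2"},
}

# static_order() yields every node after all of its predecessors.
order = list(TopologicalSorter(dependencies).static_order())
print(order)
```

Collected Data Sets (D1, D2, D3) come first in the resulting order, followed by D4 and then D5. `TopologicalSorter` raises `graphlib.CycleError` if the expressions depend on each other circularly.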


VTL features (4)

VTL 1.0 allows:
• Persistent and temporary results
• Operations on mono and multi measure data
• Dealing with missing data
• Dealing with Attributes and their propagation rules

VTL 1.1 will introduce:
• Other operators, mainly for validation purposes
• Reusable rules
• Bug fixing, fine tuning


VTL status

VTL 1.0: published in March 2015 – (http://sdmx.org/wp-content/uploads/2015/03/VTL-1-package-2015.zip)

VTL: part 1 - part 2

BNF (Extended Backus-Naur Form) Technical notation

VTL 1.1 (language extensions): work in progress

SDMX implementation: work in progress
– Messages for exchanging VTL rules
– Registry for storing VTL rules
– Web services for retrieving VTL rules


Governance and Standards Alignment

• VTL is maintained by the SDMX TWG through the VTL Task Force
– Extensions will be considered for inclusion in future versions

• VTL has already produced some feedback to GSIM for its next version
– VTL can be mapped against SDMX
– VTL can be directly utilized by DDI in those places where computations are included
– As GSIM processing Rules


SDMX into the future

Contribute to VTL 1.1!

Comments on VTL 1.0 and suggestions for improvement can be sent to the SDMX Technical Working Group

[email protected]

Thanks for your attention!

[email protected]