NEAR REAL-TIME DATA WAREHOUSING – THE FINAL FRONTIER?

2019. 10. 21.



Joel Acha

Elffar Analytics

Oracle Technologies since 1997

@[email protected]

#obihackers IRC channel


1. Data Warehouse

Let's start with a simple definition


"A data warehouse is a copy of transaction data specifically structured for query and analysis."

Ralph Kimball


Data Integration Methods

Traditional ETL

▹ Batch based

▹ High latency

CDC Replication

▹ "Real time"

▹ Low latency

Real Time Streaming

▹ Low latency

▹ In-line, in-memory transformation


Data Movement

▹ Batching

▹ Micro Batching

▹ Streaming


Main Characteristics

Batching

▹ New data elements grouped into a batch

▹ Based on a time-based batch interval

Micro Batching

▹ New data elements more frequently grouped into a batch

▹ Real-time analytics not essential

Streaming

▹ Event driven architecture

▹ Low latency is critical
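The three data movement styles above differ mainly in how many events are grouped before they are handed on. A minimal Python sketch, assuming events arrive in time order as hypothetical `(timestamp, payload)` pairs; the function names and interval values are illustrative only and not taken from any particular tool:

```python
from typing import Iterable, Iterator, List, Tuple

# Hypothetical event shape: (timestamp in seconds, payload).
Event = Tuple[float, str]


def batch_by_interval(events: Iterable[Event], interval: float) -> Iterator[List[Event]]:
    """Group time-ordered events into consecutive windows of `interval` seconds."""
    batch: List[Event] = []
    window_end = None
    for ts, payload in events:
        if window_end is None:
            # The first event opens the first window.
            window_end = ts + interval
        while ts >= window_end:
            # The event falls past the current window: emit what we have
            # (if anything) and advance to the next window.
            if batch:
                yield batch
                batch = []
            window_end += interval
        batch.append((ts, payload))
    if batch:
        yield batch


def stream(events: Iterable[Event]) -> Iterator[List[Event]]:
    """Event driven: every event is handed on as soon as it arrives."""
    for event in events:
        yield [event]


events = [(0.0, "a"), (1.0, "b"), (4.5, "c"), (5.0, "d"), (61.0, "e")]

hourly = list(batch_by_interval(events, 3600.0))  # batching: long interval
micro = list(batch_by_interval(events, 5.0))      # micro batching: short interval
streamed = list(stream(events))                   # streaming: one event at a time
```

With these sample events, the long interval produces a single batch, the short interval produces three smaller batches, and streaming hands on each of the five events individually. The only difference between batching and micro batching in this sketch is the size of `interval`.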


2. Big Data

Yet another simple definition...


"larger, more complex data sets, especially from new data sources."

source: oracle.com


BIG DATA

The final nail in the coffin?

3 V's:

1. Velocity

2. Volume

3. Variety


3. Data Lake

Last definition – I promise...


"A centralized repository that allows you to store all your structured and unstructured data at any scale."

source: amazon.com


Are Apples & Pears the same?

Big Data – Data Lake

▹ Schema on Read

▹ Data stored in raw form

▹ "Freeform"

Data Warehouse

▹ Schema on Write

▹ Structured Data

▹ Query limitations

▹ Value of data clear from the outset
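The schema-on-write versus schema-on-read contrast can be illustrated with a small Python sketch. Everything here is an assumption for illustration: `SALES_SCHEMA`, the in-memory lists standing in for a warehouse table and a lake, and the function names are hypothetical and do not correspond to any product's API.

```python
import json
from typing import Any, Dict, List

# Hypothetical warehouse schema: column name -> required Python type.
SALES_SCHEMA = {"order_id": int, "amount": float}


def write_to_warehouse(table: List[Dict[str, Any]], record: Dict[str, Any]) -> None:
    """Schema on write: validate before storing and reject anything that does not fit."""
    if set(record) != set(SALES_SCHEMA):
        raise ValueError(f"unexpected columns: {sorted(record)}")
    for column, expected in SALES_SCHEMA.items():
        if not isinstance(record[column], expected):
            raise ValueError(f"{column} must be {expected.__name__}")
    table.append(record)


def write_to_lake(lake: List[str], raw: str) -> None:
    """Schema on read: store the raw text untouched, whatever its shape."""
    lake.append(raw)


def read_amounts(lake: List[str]) -> List[float]:
    """Apply a schema only at query time, skipping records that do not fit it."""
    amounts = []
    for raw in lake:
        try:
            doc = json.loads(raw)
        except json.JSONDecodeError:
            continue  # free-form data stays in the lake but is ignored by this query
        if isinstance(doc, dict) and "amount" in doc:
            amounts.append(float(doc["amount"]))
    return amounts


warehouse: List[Dict[str, Any]] = []
lake: List[str] = []

write_to_warehouse(warehouse, {"order_id": 1, "amount": 9.5})
for raw in ['{"order_id": 2, "amount": "12.25", "note": "gift"}', "free-form log line"]:
    write_to_lake(lake, raw)
```

In this sketch the warehouse would reject the second record (extra column, amount stored as text), while the lake accepts both raw lines and leaves interpretation to `read_amounts` at query time.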


Complementary Technologies


source: DellEMC.com


Use Cases

Data Warehouse

▹ Highly curated data

▹ Structured standard reporting

Data Lake

▹ Data Scientist access to raw data

▹ Flexible, reactive business model


THE BEST OF BOTH WORLDS?

Streaming

▹ Next generation Data Integration

▹ In-flight

▹ Real-time

▹ Can load into data lakes and data warehouses
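One way to picture a streaming pipeline feeding both targets is sketched below in Python. This is an assumption-laden illustration, not a description of any specific streaming product: the event fields, the `transform` rule, and the lists standing in for the lake and the warehouse are all hypothetical.

```python
from typing import Any, Dict, Iterable, List, Optional

# Hypothetical event shape: a plain dictionary of fields.
Event = Dict[str, Any]


def transform(event: Event) -> Optional[Event]:
    """In-flight transformation: shape an event into a warehouse row, or drop it."""
    if "amount" not in event:
        return None  # nothing reportable for the curated side
    return {
        "id": event["id"],
        "amount": float(event["amount"]),
        "currency": str(event.get("currency", "")).upper(),
    }


def stream_events(events: Iterable[Event], lake: List[Event], warehouse: List[Event]) -> None:
    """Handle each event as it arrives and load both targets."""
    for event in events:
        lake.append(event)      # raw copy goes to the data lake
        row = transform(event)  # transformed in memory, per event
        if row is not None:
            warehouse.append(row)  # curated row goes to the data warehouse


lake: List[Event] = []
warehouse: List[Event] = []
stream_events(
    [
        {"id": 1, "amount": "10.50", "currency": "gbp"},
        {"id": 2, "comment": "no amount"},
    ],
    lake,
    warehouse,
)
```

Here the lake keeps both raw events, while the warehouse receives only the one event that could be shaped into a structured row.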
