Real-time Query Processing Roger Blake CSC 536 May 2, 2005


Real-time Query Processing

Roger Blake

CSC 536

May 2, 2005

Real-time Query Processing - Agenda

• Description of RTDBs

• Aspects of real-time query processing

• Real-time query approaches

• Query processing in conventional and real-time DBMS’s

Description of real-time databases

• In order to be a real-time database, a database needs to have:

– Timing constraints (deadlines), and

– Temporal data (data validity related to time)

• But you already knew that… what may not have been so clear are some of the misconceptions about real-time databases…

Description of real-time databases

• Here are a few misconceptions about real-time databases that have persisted:

– Must be special purpose and highly specialized

– Must be fast

– Must be in-memory

• There is not complete consensus, especially in practice, about what a real-time database even is

Aspects of real-time query processing

• Timing constraints (deadlines)

– Soft: A query missing a deadline results in less value than if the deadline were met

– Firm: A query missing a deadline means the results are useless

– Hard: A query missing a deadline means system failure

• Topology

– Distributed: A query references data on more than one machine

– Single node: A query uses data on a single machine, which is possibly at least partially on disk

– Embedded: The real-time database and queries are entirely in-memory and make use of a dedicated processor

• Periodicity and timings

– Dynamic: Queries and their timings are not known in advance, and cannot be estimated or bounded

– Extended Static: Queries and their timings are not known in advance but can either be bounded or estimated with a degree of certainty

– Static: Queries and their timings are known in advance and are deterministic
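
The soft / firm / hard distinction can be illustrated as a function giving the value of a query result relative to its deadline. This is only a minimal sketch: the names `DeadlineKind`, `HardDeadlineMissed`, and `result_value` are hypothetical, and the linear decay used for soft deadlines is an assumption (the slides only say a late result has "less value").

```python
from enum import Enum


class DeadlineKind(Enum):
    SOFT = "soft"
    FIRM = "firm"
    HARD = "hard"


class HardDeadlineMissed(Exception):
    """Stands in for 'system failure' when a hard deadline is missed."""


def result_value(kind, finish_time, deadline,
                 full_value=1.0, decay_per_second=0.1):
    """Value of a query result that finished at finish_time.

    Assumption: a late soft-deadline result loses value linearly with
    lateness; the slides do not specify the shape of the decay.
    """
    if finish_time <= deadline:
        return full_value              # deadline met: full value
    lateness = finish_time - deadline
    if kind is DeadlineKind.SOFT:
        # Late but still worth something, never below zero.
        return max(0.0, full_value - decay_per_second * lateness)
    if kind is DeadlineKind.FIRM:
        return 0.0                     # late result is useless
    # Hard deadline: lateness is treated as a failure, not a low value.
    raise HardDeadlineMissed(f"missed by {lateness} time units")
```

A scheduler could use such a function to decide whether a late query is still worth finishing: a firm query past its deadline can be dropped, while a soft one may still be worth completing.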

Real-time query approaches

• Priority Memory Management (PMM) algorithm

– Prioritizes queries by earliest deadline first (EDF)

– Queues and admits queries

– Allocates memory to each query based on min/max

– Adjusts memory allocations based on resource utilization
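
The admission and allocation steps listed for PMM can be sketched roughly as follows. This is a simplified illustration, not the published algorithm: the names `Query` and `allocate_memory` are hypothetical, memory is counted in abstract pages, admission is assumed to stop at the first query whose minimum does not fit, and the feedback step that adjusts allocations from resource utilization is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Query:
    name: str
    deadline: float   # earlier deadline = higher priority (EDF)
    min_mem: int      # pages needed to run at all
    max_mem: int      # pages beyond which more memory does not help


def allocate_memory(queries, total_mem):
    """Return (allocation by query name, names of queries left waiting)."""
    by_edf = sorted(queries, key=lambda q: q.deadline)
    allocation, waiting = {}, []
    free = total_mem

    # Pass 1: admit in EDF order, giving each query its minimum.
    # Assumption: once one query has to wait, all later ones wait too.
    for q in by_edf:
        if not waiting and q.min_mem <= free:
            allocation[q.name] = q.min_mem
            free -= q.min_mem
        else:
            waiting.append(q.name)

    # Pass 2: hand leftover memory to admitted queries, most urgent
    # first, up to each query's maximum.
    for q in by_edf:
        if q.name in allocation and free > 0:
            extra = min(q.max_mem - q.min_mem, free)
            allocation[q.name] += extra
            free -= extra

    return allocation, waiting
```

For example, with 8 pages and three queries whose (deadline, min, max) are A = (5, 2, 6), B = (3, 3, 4), and C = (9, 4, 8), the sketch admits B and A, leaves C queued, and splits the remaining pages so that B and A each end up with 4.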

Real-time query approaches

• B-tree indexing for real-time queries

– Starting point is a conventional B-tree with queries prioritized by EDF

Real-time query approaches

• B-tree indexing for real-time queries (cont’d)

– As each query searches a node, it locks that node and all its child nodes for either read or write.

– If there is a conflict, the query with the lower priority restarts at the top of the tree

– Rebalancing operations have the lowest priority
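
The conflict rule above can be sketched as a small lock table. This is an illustration under stated assumptions, not the scheme from the original work: `Query`, `NodeLocks`, `request`, and `release_all` are hypothetical names, each request locks a single node id (the slides describe locking a node together with its children), priority is EDF, and ties go against the requester.

```python
from dataclasses import dataclass


@dataclass
class Query:
    name: str
    deadline: float   # earlier deadline = higher priority (EDF)
    restarts: int = 0


class NodeLocks:
    def __init__(self):
        self.holders = {}   # node_id -> list of (Query, mode)

    def release_all(self, query):
        """Drop every lock held by query (it restarts from the root)."""
        for held in self.holders.values():
            held[:] = [(h, m) for h, m in held if h is not query]

    def request(self, query, node_id, mode):
        """Try to lock node_id for 'read' or 'write'.

        Returns the list of queries that must restart at the top of
        the tree; an empty list means the lock was granted cleanly.
        """
        current = self.holders.setdefault(node_id, [])
        # Two locks conflict when at least one of them is a write.
        conflicting = [h for h, m in current
                       if h is not query and "write" in (m, mode)]
        if not conflicting:
            current.append((query, mode))
            return []
        if all(query.deadline < h.deadline for h in conflicting):
            # Requester is more urgent: the holders lose and restart.
            for h in conflicting:
                self.release_all(h)
                h.restarts += 1
            current.append((query, mode))
            return conflicting
        # Otherwise the requester has the lower priority and restarts.
        self.release_all(query)
        query.restarts += 1
        return [query]
```

In this sketch a rebalancing operation can be given the lowest priority by modeling it as a `Query` with `deadline=float("inf")`, so it loses every conflict with a user query.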

Real-time query approaches

• R-tree – similar to algorithms used for non-real-time distributed queries

Figure: 4 physical locations mapped by the R-tree

Real-time query approaches

• R-tree adapted for real-time queries:

Real-time query approaches

• Freshness for base and derived real-time data

– Prioritizes queries by EDF

– Calculates Access Update Ratio (AUR) for each data item (access frequency / update frequency)

– Uses (and adjusts) an AUR threshold for when data is updated
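
The AUR calculation and threshold test can be sketched as below. The ratio itself comes from the slide; everything else is an assumption made for illustration: the function names are hypothetical, frequencies are approximated by raw counts over some observation window, and the threshold is assumed to separate items that are refreshed immediately from items refreshed only on demand. How the threshold is adjusted over time is not modeled.

```python
def access_update_ratio(access_count, update_count):
    """AUR = access frequency / update frequency for one data item."""
    if update_count == 0:
        # Never updated in the window: treat as maximally access-heavy.
        return float("inf")
    return access_count / update_count


def update_policy(access_count, update_count, threshold=1.0):
    """Pick a refresh policy for one data item.

    Assumption: items read often relative to how often they change
    (AUR at or above the threshold) are worth updating immediately;
    the rest are refreshed only when a query needs them.
    """
    aur = access_update_ratio(access_count, update_count)
    return "immediate" if aur >= threshold else "on_demand"
```

For instance, an item read 20 times and updated 5 times in the window has an AUR of 4 and would be kept fresh eagerly, while one read twice and updated 10 times (AUR 0.2) would not.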

Query processing in conventional and real-time DBMS’s

• A conventional DBMS has several goals that are at cross purposes with a real-time DBMS:

– Supports multiple platforms or versions, which requires abstraction from the hardware/OS

– Measured by TPC-D performance rules

– ACID properties are inviolate, which may be incompatible with real-time constraints

Query processing in conventional and real-time DBMS’s

• But conventional DBMS’s are sometimes used for data with temporal aspects, e.g. decision support systems (DSS)

select city.name, sum(package.weight)
from package /*real-time*/, city /*stored*/
where package.weight > 100
  and package.service = 'priority'
  and package.zip = city.zip
group by city.name

Query processing in conventional and real-time DBMS’s

Conventional DBMS's Real-time DBMS's

• Timing constraints (deadlines): Soft, Firm, Hard

• Topology: Distributed, Single node, Embedded

• Periodicity and timings: Dynamic, Extended Static, Static

Real-time query processing

• Algorithms have been developed for many aspects of real-time query processing

• Conventional DBMS’s don’t handle real-time queries except by ‘brute force’ and possibly some of the algorithms we’ve seen

• Challenges remain for real-time query processing

– Security

– Fault tolerance

– Scalability

– Generalization