Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
1 © Copyright 2011 EMC Corporation. All rights reserved.
Automatic Capture of Minimal, Portable, and Executable Bug Repros Using AMPERe
DBTest May 21, 2012
Lyublena Antova (Greenplum) Konstantinos Krikellas (Google)
Florian Waas (Greenplum)
2 © Copyright 2011 EMC Corporation. All rights reserved.
Query Optimization • Key to performance
– Possibly multiple orders of magnitude improvement
• Key to extensibility – Need optimizer support for new features/elements
• Key to customer fidelity – Nothing makes customers happier than a working
system with generates optimal results
3 © Copyright 2011 EMC Corporation. All rights reserved.
Error/Crash onProduction Cluster
Investigate Customer Bugs: Example
DBAdmin TechSupport
Report bug
Provide data
Request data
2012-05-15 23:35:01.399462 PDT,"antova","warehouse” 2012-05-15 23:35:01.873560 PDT,"antova","warehouse"
select sum(a) from R join S on (A=C) wrong file
Provide data
Request data
Log file Stack traces Configuration parameters
Schema definition & statistics
data changed in the meantime
Software Engineer
Provide data
Provide data
Request data Query
Log file
Incomplete data Redundant data Inconsistent data Manual collection
Long time to resolution
4 © Copyright 2011 EMC Corporation. All rights reserved.
Investigating Optimizer Bugs: Challenges • Query optimizers are complex
– and therefore bug-prone
• Repros for optimizer bugs hard to obtain – schema definition, statistics, offending query, logs, configuration
parameters, …
• Collection is done after the fact – may be too late in concurrent environments where data changes
constantly
• Collection and investigation require significant manual effort
• Long turnaround times for fixing optimizer bugs
5 © Copyright 2011 EMC Corporation. All rights reserved.
In an ideal world… • Bug repros are automatically generated when a bug is hit
– no need for manual collection
• Bug repros are executable – live repro vs. eyeballing static stack traces
• Bug repros are portable – problem can be debugged in-house without access to customer cluster
• Bug repros are minimal – no need to dump the whole database schema if query only touches a
single relation
• Bug repros can be added to test framework after the bug is fixed – make sure it doesn’t happen again
6 © Copyright 2011 EMC Corporation. All rights reserved.
AMPERe
Error/Crash onProduction Cluster
Executable Repro onDeveloper Laptop
Automatic Capture of Minimal Portable, and Executable bug Repros
Dump
Automatic Minimal Consistent Executable Portable
Short time to resolution
7 © Copyright 2011 EMC Corporation. All rights reserved.
ORCA: Greenplum’s Query Optimizer • Extensibility • Modularity • Multi-core/many-core enabled • Verifiability
• AMPERe is built into ORCA
8 © Copyright 2011 EMC Corporation. All rights reserved.
ORCA: Architecture
OrcaGPOS
Query Optimizer
Database System
SQL Parser ResultsExecutor
MetadataCache
Optimizer Engine
DXLQuery
DXLPlan
Catalog
DXLMD
Transmitted Data Request
DXL: Data eXchange Language: • Query • Plan • Metadata
9 © Copyright 2011 EMC Corporation. All rights reserved.
Optimizer
Engine
MDCache
GPDBGPDB
MinidumpPG
NoSQLDB
Oracle, MS SQL, MySQL, DB2, etc.
MD Authority
Dumps,Unittests
v4.2 (Rio) MultipleInstances
VanillaPostgres
Hadoop,MongoDB,
etc.ExternalCatalog
MDProvider plug-ins
MDAcc
MDAccMDAcc
10 © Copyright 2011 EMC Corporation. All rights reserved.
AMPERe: Capture Mechanism
Error/Crash onProduction Cluster
Executable Repro onDeveloper Laptop
Dump
• Based on DXL abstraction – abstraction of algebrized query, metadata,
statistics
• Capture mechanism – pre-serialize incoming query, parameters,
and accessed metadata objects and statistics – register objects with capture mechanism – at crash/exception time, harvest registered
objects and serialize them to a memory-mapped output buffer
11 © Copyright 2011 EMC Corporation. All rights reserved.
AMPERe Dump: Example <?xml version="1.0" encoding="UTF-8"?>!<dxl:DXLMsg xmlns:dxl=http://greenplum.com/dxl/v1>! <dxl:Thread Id="0">! <dxl:Stacktrace>! 1 0x000e8106df gpos::CException::Raise! 2 0x000137d853 COptTasks::PvOptimizeTask! 3 0x000e81cb1c gpos::CTask::Execute! 4 0x000e8180f4 gpos::CWorker::Execute! 5 0x000e81e811 gpos::CAutoTaskProxy::Execute! </dxl:Stacktrace>! <dxl:TraceFlags Value="gp_optimizer_hashjoin"/>! <dxl:Metadata>! <dxl:Type Mdid="0.9.1.0" Name="int4” Length="4" />! <dxl:RelStats Mdid="2.688.1.1" Name="r" Rows="10"/>! <dxl:Relation Mdid="0.688.1.1" Name="r” DistrPolicy="Hash" DistrColumns="0">! <dxl:Columns><dxl:Column Name="a” Mdid="0.9.1.0"/></dxl:Columns>! </dxl:Relation>! </dxl:Metadata>! <dxl:Query>! <dxl:OutputColumns>! <dxl:Ident ColId="1" Name="a" Mdid="0.9.1.0"/>! </dxl:OutputColumns>! <dxl:LogicalGet>! <dxl:TableDescriptor Mdid="0.688.1.1" Name="r">! <dxl:Columns><dxl:Column Mdid="0.9.1.0"/></dxl:Columns>! </dxl:TableDescriptor>! </dxl:LogicalGet>! </dxl:Query>! </dxl:Thread>!</dxl:DXLMsg>!
Stack Trace
Metadata
Query
Configuration
12 © Copyright 2011 EMC Corporation. All rights reserved.
AMPERe: Loading • Load AMPERe
dumps – load query from dump file – file-based metadata
provider on top of the dump file
Optimizer Engine
Metadata Cache
DXL Query
DXL MD
DXL Plan
AMPERe Dump
GPOS
Query Optimizer
File-based MD Provider
13 © Copyright 2011 EMC Corporation. All rights reserved.
Use cases • Customer support cases
– crashes and run-time exceptions – bad plan choices – mining support cases – integration with EMC DialHome
• QA – AMPERe-based test framework – cost model sensitivity and calibration
14 © Copyright 2011 EMC Corporation. All rights reserved.
Operational Characteristics
!"
#!"
$!"
%!"
&!"
'!!"
'#!"
(!'" (!#" (!)" (!$" (!*" (!%" (!+" (!&" (!," ('!" (''" ('#" (')" ('$" ('*" ('%" ('+" ('&" ('," (#!" (#'" (##"
AMPERe file size in KB for TPC-H queries
15 © Copyright 2011 EMC Corporation. All rights reserved.
Summary • Investigating optimizer bugs is hard
• AMPERe – automatic – minimal – portable – executable – extensible
• Use cases – customer support cases – QA
16 © Copyright 2011 EMC Corporation. All rights reserved.
Thanks!
17 © Copyright 2011 EMC Corporation. All rights reserved.
AMPERe Capture Mechanism: Workflow
Exception
Main thread
Create handlerRegister stack trace with handlerRegister input query with handlerRegister metadata accessor with handler
Register stack traces with handler
Spawn optimization
threads
Entry for T1: - stack trace
T1 T2 T3 T4
Entry for T2: - stack traceEntry for T3: - stack traceEntry for T4: - stack trace
Entry for main thread:- stack trace- input query - metadata
Register configuration parameters with handler
- configuration parameters
Threads abort optimization
Query errors out
DXL document
1
2
3
4
5
6
7