17
1 © Copyright 2011 EMC Corporation. All rights reserved. Automatic Capture of Minimal, Portable, and Executable Bug Repros Using AMPERe DBTest May 21, 2012 Lyublena Antova (Greenplum) Konstantinos Krikellas (Google) Florian Waas (Greenplum)

Automatic Capture of Minimal, Portable, and Executable Bug

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Automatic Capture of Minimal, Portable, and Executable Bug

1 © Copyright 2011 EMC Corporation. All rights reserved.

Automatic Capture of Minimal, Portable, and Executable Bug Repros Using AMPERe

DBTest May 21, 2012

Lyublena Antova (Greenplum) Konstantinos Krikellas (Google)

Florian Waas (Greenplum)

Page 2: Automatic Capture of Minimal, Portable, and Executable Bug

2 © Copyright 2011 EMC Corporation. All rights reserved.

Query Optimization • Key to performance

–  Possibly multiple orders of magnitude improvement

• Key to extensibility –  Need optimizer support for new features/elements

• Key to customer fidelity –  Nothing makes customers happier than a working

system with generates optimal results

Page 3: Automatic Capture of Minimal, Portable, and Executable Bug

3 © Copyright 2011 EMC Corporation. All rights reserved.

Error/Crash onProduction Cluster

Investigate Customer Bugs: Example

DBAdmin TechSupport

Report bug

Provide data

Request data

2012-05-15 23:35:01.399462 PDT,"antova","warehouse” 2012-05-15 23:35:01.873560 PDT,"antova","warehouse"

select sum(a) from R join S on (A=C) wrong file

Provide data

Request data

Log file Stack traces Configuration parameters

Schema definition & statistics

data changed in the meantime

Software Engineer

Provide data

Provide data

Request data Query

Log file

Incomplete data Redundant data Inconsistent data Manual collection

Long time to resolution

Page 4: Automatic Capture of Minimal, Portable, and Executable Bug

4 © Copyright 2011 EMC Corporation. All rights reserved.

Investigating Optimizer Bugs: Challenges •  Query optimizers are complex

–  and therefore bug-prone

•  Repros for optimizer bugs hard to obtain –  schema definition, statistics, offending query, logs, configuration

parameters, …

•  Collection is done after the fact –  may be too late in concurrent environments where data changes

constantly

•  Collection and investigation require significant manual effort

•  Long turnaround times for fixing optimizer bugs

Page 5: Automatic Capture of Minimal, Portable, and Executable Bug

5 © Copyright 2011 EMC Corporation. All rights reserved.

In an ideal world… •  Bug repros are automatically generated when a bug is hit

–  no need for manual collection

•  Bug repros are executable –  live repro vs. eyeballing static stack traces

•  Bug repros are portable –  problem can be debugged in-house without access to customer cluster

•  Bug repros are minimal –  no need to dump the whole database schema if query only touches a

single relation

•  Bug repros can be added to test framework after the bug is fixed –  make sure it doesn’t happen again

Page 6: Automatic Capture of Minimal, Portable, and Executable Bug

6 © Copyright 2011 EMC Corporation. All rights reserved.

AMPERe

@EMAIL

Error/Crash onProduction Cluster

Executable Repro onDeveloper Laptop

Automatic Capture of Minimal Portable, and Executable bug Repros

Dump

Automatic Minimal Consistent Executable Portable

Short time to resolution

Page 7: Automatic Capture of Minimal, Portable, and Executable Bug

7 © Copyright 2011 EMC Corporation. All rights reserved.

ORCA: Greenplum’s Query Optimizer • Extensibility • Modularity • Multi-core/many-core enabled • Verifiability

•  AMPERe is built into ORCA

Page 8: Automatic Capture of Minimal, Portable, and Executable Bug

8 © Copyright 2011 EMC Corporation. All rights reserved.

ORCA: Architecture

OrcaGPOS

Query Optimizer

Database System

SQL Parser ResultsExecutor

MetadataCache

Optimizer Engine

DXLQuery

DXLPlan

Catalog

DXLMD

Transmitted Data Request

DXL: Data eXchange Language: •  Query •  Plan •  Metadata

Page 9: Automatic Capture of Minimal, Portable, and Executable Bug

9 © Copyright 2011 EMC Corporation. All rights reserved.

Optimizer

Engine

MDCache

GPDBGPDB

MinidumpPG

NoSQLDB

Oracle, MS SQL, MySQL, DB2, etc.

MD Authority

Dumps,Unittests

v4.2 (Rio) MultipleInstances

VanillaPostgres

Hadoop,MongoDB,

etc.ExternalCatalog

MDProvider plug-ins

MDAcc

MDAccMDAcc

Page 10: Automatic Capture of Minimal, Portable, and Executable Bug

10 © Copyright 2011 EMC Corporation. All rights reserved.

AMPERe: Capture Mechanism

@EMAIL

Error/Crash onProduction Cluster

Executable Repro onDeveloper Laptop

Dump

• Based on DXL abstraction –  abstraction of algebrized query, metadata,

statistics

• Capture mechanism –  pre-serialize incoming query, parameters,

and accessed metadata objects and statistics –  register objects with capture mechanism –  at crash/exception time, harvest registered

objects and serialize them to a memory-mapped output buffer

Page 11: Automatic Capture of Minimal, Portable, and Executable Bug

11 © Copyright 2011 EMC Corporation. All rights reserved.

AMPERe Dump: Example <?xml version="1.0" encoding="UTF-8"?>!<dxl:DXLMsg xmlns:dxl=http://greenplum.com/dxl/v1>! <dxl:Thread Id="0">! <dxl:Stacktrace>! 1 0x000e8106df gpos::CException::Raise! 2 0x000137d853 COptTasks::PvOptimizeTask! 3 0x000e81cb1c gpos::CTask::Execute! 4 0x000e8180f4 gpos::CWorker::Execute! 5 0x000e81e811 gpos::CAutoTaskProxy::Execute! </dxl:Stacktrace>! <dxl:TraceFlags Value="gp_optimizer_hashjoin"/>! <dxl:Metadata>! <dxl:Type Mdid="0.9.1.0" Name="int4” Length="4" />! <dxl:RelStats Mdid="2.688.1.1" Name="r" Rows="10"/>! <dxl:Relation Mdid="0.688.1.1" Name="r” DistrPolicy="Hash" DistrColumns="0">! <dxl:Columns><dxl:Column Name="a” Mdid="0.9.1.0"/></dxl:Columns>! </dxl:Relation>! </dxl:Metadata>! <dxl:Query>! <dxl:OutputColumns>! <dxl:Ident ColId="1" Name="a" Mdid="0.9.1.0"/>! </dxl:OutputColumns>! <dxl:LogicalGet>! <dxl:TableDescriptor Mdid="0.688.1.1" Name="r">! <dxl:Columns><dxl:Column Mdid="0.9.1.0"/></dxl:Columns>! </dxl:TableDescriptor>! </dxl:LogicalGet>! </dxl:Query>! </dxl:Thread>!</dxl:DXLMsg>!

Stack Trace

Metadata

Query

Configuration

Page 12: Automatic Capture of Minimal, Portable, and Executable Bug

12 © Copyright 2011 EMC Corporation. All rights reserved.

AMPERe: Loading • Load AMPERe

dumps –  load query from dump file –  file-based metadata

provider on top of the dump file

Optimizer Engine

Metadata Cache

DXL Query

DXL MD

DXL Plan

AMPERe Dump

GPOS

Query Optimizer

File-based MD Provider

Page 13: Automatic Capture of Minimal, Portable, and Executable Bug

13 © Copyright 2011 EMC Corporation. All rights reserved.

Use cases •  Customer support cases

–  crashes and run-time exceptions –  bad plan choices –  mining support cases –  integration with EMC DialHome

•  QA –  AMPERe-based test framework –  cost model sensitivity and calibration

Page 14: Automatic Capture of Minimal, Portable, and Executable Bug

14 © Copyright 2011 EMC Corporation. All rights reserved.

Operational Characteristics

!"

#!"

$!"

%!"

&!"

'!!"

'#!"

(!'" (!#" (!)" (!$" (!*" (!%" (!+" (!&" (!," ('!" (''" ('#" (')" ('$" ('*" ('%" ('+" ('&" ('," (#!" (#'" (##"

AMPERe file size in KB for TPC-H queries

Page 15: Automatic Capture of Minimal, Portable, and Executable Bug

15 © Copyright 2011 EMC Corporation. All rights reserved.

Summary •  Investigating optimizer bugs is hard

•  AMPERe –  automatic –  minimal –  portable –  executable –  extensible

•  Use cases –  customer support cases –  QA

Page 16: Automatic Capture of Minimal, Portable, and Executable Bug

16 © Copyright 2011 EMC Corporation. All rights reserved.

Thanks!

Page 17: Automatic Capture of Minimal, Portable, and Executable Bug

17 © Copyright 2011 EMC Corporation. All rights reserved.

AMPERe Capture Mechanism: Workflow

Exception

Main thread

Create handlerRegister stack trace with handlerRegister input query with handlerRegister metadata accessor with handler

Register stack traces with handler

Spawn optimization

threads

Entry for T1: - stack trace

T1 T2 T3 T4

Entry for T2: - stack traceEntry for T3: - stack traceEntry for T4: - stack trace

Entry for main thread:- stack trace- input query - metadata

Register configuration parameters with handler

- configuration parameters

Threads abort optimization

Query errors out

DXL document

1

2

3

4

5

6

7