Query Processing and Optimizing on SSDs

Flash Group
Qingling Cao qingling1220@sina.com

Outline

• Introduction
• Page Layout on SSD
• Scan Approaches
• Join Algorithms
• Conclusion

Introduction

• Page layout and data structure
• Leverage fast random reads to speed up selection, projection, and join operations
• Database query processing engines traditionally emphasize sequential I/O

Page Layout on SSD

• Row Layout
• Column Layout: the values of one column (attribute) are stored in contiguous pages

[Figure: page layout diagram; surviving label: slot]

Page Layout on SSD

PAX Layout

The PAX layout is efficient on SSDs but not on disks. Why?

Page Layout on SSD

A mini-page should hold roughly the amount of data the device can read sequentially in the time of one skip (bandwidth × seek time).

• Disk: the sequential read bandwidth is 100 MB/s and a skip takes 3-4 ms, so a mini-page should be 300-400 KB. The full page size will then be in the megabyte range.
• IDE flash drive: the sequential read bandwidth is 28 MB/s and the seek time is 0.25 ms, so a mini-page should be 7 KB. The full page size can then be 32-128 KB.
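The mini-page sizes on this slide appear to follow from multiplying sequential bandwidth by seek time. A minimal sketch of that arithmetic, using only the figures quoted on the slide; the function name `mini_page_kb` is illustrative, and 1 MB is taken as 1000 KB:

```python
def mini_page_kb(bandwidth_mb_per_s: float, seek_ms: float) -> float:
    """Data (in KB) the device streams sequentially during one seek.

    Assumes 1 MB = 1000 KB, so MB/s multiplied by ms yields KB directly.
    """
    return bandwidth_mb_per_s * seek_ms


disk_low = mini_page_kb(100, 3)    # disk, 3 ms skip
disk_high = mini_page_kb(100, 4)   # disk, 4 ms skip
flash = mini_page_kb(28, 0.25)     # IDE flash drive
print(f"disk: {disk_low:.0f}-{disk_high:.0f} KB, flash: {flash:.0f} KB")
```

The result reproduces the 300-400 KB and 7 KB mini-page sizes stated above.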

Scan Approaches

• NSMScan: always reads the whole relation.
• FlashScan: reads only the relevant columns, e.g. select S from R where J

Scan Approaches

• FlashScanOPT(U): reads only the mini-pages that contain the needed tuples, e.g. select S from R where J
• FlashScanOPT(S): the attributes are sorted, so each mini-page is read at most once.
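To make the difference between the scan variants concrete, here is a minimal sketch that counts mini-page reads for `select S from R where J` on a toy relation. It assumes a simplified PAX-style page modelled as a dict from column name to mini-page; the function names, the toy data, and the treatment of NSMScan as "read every mini-page" are illustrative assumptions, not the original implementation.

```python
from typing import Callable, Dict, List, Tuple

# One PAX-style page: column name -> mini-page holding that column's values.
Page = Dict[str, List[int]]
Result = Tuple[List[int], int]  # (projected values, mini-pages read)
Pred = Callable[[int], bool]


def nsm_scan(pages: List[Page], j: str, s: str, pred: Pred) -> Result:
    """Baseline: every mini-page of every page is read."""
    out: List[int] = []
    reads = 0
    for page in pages:
        reads += len(page)  # all columns, needed or not
        out += [page[s][i] for i, v in enumerate(page[j]) if pred(v)]
    return out, reads


def flash_scan(pages: List[Page], j: str, s: str, pred: Pred) -> Result:
    """Read only the mini-pages of the columns the query mentions."""
    out: List[int] = []
    reads = 0
    for page in pages:
        reads += len({j, s})
        out += [page[s][i] for i, v in enumerate(page[j]) if pred(v)]
    return out, reads


def flash_scan_opt(pages: List[Page], j: str, s: str, pred: Pred) -> Result:
    """Read the predicate mini-page first, the projected one only on a match."""
    out: List[int] = []
    reads = 0
    for page in pages:
        reads += 1  # mini-page of the predicate column j
        hits = [i for i, v in enumerate(page[j]) if pred(v)]
        if hits:
            reads += 0 if s == j else 1
            out += [page[s][i] for i in hits]
    return out, reads


# Toy relation R: three pages, two tuples each, columns j, s and an unused x.
R = [
    {"j": [1, 2], "s": [10, 20], "x": [0, 0]},
    {"j": [3, 4], "s": [30, 40], "x": [0, 0]},
    {"j": [9, 9], "s": [50, 60], "x": [0, 0]},
]
for scan in (nsm_scan, flash_scan, flash_scan_opt):
    print(scan.__name__, scan(R, "j", "s", lambda v: v >= 9))
```

On this toy input all three scans return the same values, while reading 9, 6, and 4 mini-pages respectively.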

Scan Approaches

Experimental setup:
• Table: 70M tuples, 11 columns, 10 GB
• System: Intel Core 2 Duo at 2.33 GHz, 4 GB of RAM
• Storage: Mtron 32 GB SSD

Join Algorithms – past lessons

• Block Nested Loops Join
• Sort-Merge Join
• Grace Hash Join
• Hybrid Hash Join

Join Algorithms – past lessons

☆ Algorithms that stress random reads and avoid random writes as much as possible see bigger improvements on flash.

Experimental setup:
• Customer: 4.5M tuples, 730 MB
• Order: 45M tuples, 5 GB
• HDD: 5400 RPM, 320 GB
• SSD: OCZ Core series, 60 GB, SATA II

Join Algorithms – RARE-join

Select Name, Team from Player, Game where Player.Team = Game.Team

J1 (Player):
• Blue, P:4
• Green, P:3
• Red, P:2 → Red, P:5
• Orange, P:1 → Orange, P:6

J2 (Game):
• Blue, G:4
• Red, G:1
• Orange, G:2 → Orange, G:3

Join Algorithms – RARE-join

Join Index: <G:4, P:4> <G:1, P:2> <G:1, P:5> <G:2, P:1> <G:2, P:6> <G:3, P:1> <G:3, P:6>

Total I/O cost: |J1| + σ1|V1| + |J2| + σ2|V2|

Join Result: <Sarah, Blue> <Julie, Red> <Alex, Red> <Ben, Orange> <Lena, Orange> <Ben, Orange> <Lena, Orange>
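The join index in the Player/Game example can be reproduced with a short sketch. It assumes the join columns J1 and J2 have already been read and grouped by join value into in-memory dicts; the function name `build_join_index` and the dict representation are illustrative, and a real system would more likely hash or partition the join columns.

```python
from typing import Dict, List, Tuple


def build_join_index(j1: Dict[str, List[int]],
                     j2: Dict[str, List[int]]) -> List[Tuple[int, int]]:
    """Pair every Game id with every Player id sharing the same join value."""
    index: List[Tuple[int, int]] = []
    for value, game_ids in j2.items():
        for g in game_ids:
            for p in j1.get(value, []):  # values missing from J1 produce no pair
                index.append((g, p))
    return index


# Values copied from the slide example: Team -> tuple ids.
j1 = {"Blue": [4], "Green": [3], "Red": [2, 5], "Orange": [1, 6]}  # Player
j2 = {"Blue": [4], "Red": [1], "Orange": [2, 3]}                   # Game

join_index = build_join_index(j1, j2)
print(join_index)
# [(4, 4), (1, 2), (1, 5), (2, 1), (2, 6), (3, 1), (3, 6)]
```

Each pair is (G id, P id), in the same order as the join index listed on the slide. Green appears only in J1, so it contributes no pair.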

Join Algorithms – FlashJoin

[Figure: FlashJoin pipeline. Surviving labels: Read(A), Read(D); hashA, id1; hashD, id2; hashG, id1, id2; hashK, id3; id1, id2, id3; id1, id2]

Join Algorithms – Fetch Kernel

Join Index: <G:4, P:4> <G:1, P:2> <G:1, P:5> <G:2, P:1> <G:2, P:6> <G:3, P:1> <G:3, P:6>

Join Index (sorted on G): <G:1, P:2> <G:1, P:5> <G:2, P:1> <G:2, P:6> <G:3, P:1> <G:3, P:6> <G:4, P:4>

Each page is read no more than once.

Join Algorithms – Fetch Kernel

Join Index: <Red, G:1, P:2> <Red, G:1, P:5> <Orange, G:2, P:1> <Orange, G:2, P:6> <Orange, G:3, P:1> <Orange, G:3, P:6> <Blue, G:4, P:4>

Join Index (sorted on P): <Orange, G:2, P:1> <Orange, G:3, P:1> <Red, G:1, P:2> <Blue, G:4, P:4> <Red, G:1, P:5> <Orange, G:2, P:6> <Orange, G:3, P:6>
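The two sorting passes on the Fetch Kernel slides can be sketched as follows. This is an illustration under stated assumptions, not the original implementation: each id is treated as standing for one page, the dict `game_team` stands in for the Game table's Team column, and the helper name `fetch_sorted` is hypothetical.

```python
from typing import Dict, List, Tuple


def fetch_sorted(entries: List[tuple], id_pos: int,
                 lookup: Dict[int, str]) -> Tuple[List[tuple], int]:
    """Sort entries on one id column, then fetch each distinct id once.

    Returns the sorted entries, each prefixed with the fetched value,
    and the number of fetches issued.
    """
    ordered = sorted(entries, key=lambda e: e[id_pos])  # sorted() is stable
    out: List[tuple] = []
    fetches = 0
    last_id, value = None, None
    for entry in ordered:
        if entry[id_pos] != last_id:
            last_id = entry[id_pos]
            value = lookup[last_id]  # stands in for one page read
            fetches += 1
        out.append((value,) + entry)
    return out, fetches


# (G id, P id) pairs from the slides; Team values taken from J2.
join_index = [(4, 4), (1, 2), (1, 5), (2, 1), (2, 6), (3, 1), (3, 6)]
game_team = {1: "Red", 2: "Orange", 3: "Orange", 4: "Blue"}

# Pass 1: sort on G and fetch Team, touching each Game id once.
with_team, game_fetches = fetch_sorted(join_index, 0, game_team)

# Pass 2: re-sort on P so the Player side can be fetched in order too.
by_player = sorted(with_team, key=lambda e: e[2])

print(game_fetches)
print(by_player)
```

Because the sort is stable, entries with the same P id keep their G order, which matches the ordering shown on the slide. The repeated re-sorting is the price paid for reading each page at most once.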

Join Algorithms – FlashJoin

Experimental setup:
• R: 70M tuples, 10 GB
• S: 7M tuples, 1 GB
• System: Intel Core 2 Duo at 2.33 GHz, 4 GB of RAM
• Storage: Mtron 32 GB SSD

Join Algorithms – DigestJoin

• Row-based
• {JI, idx, idy}
• Minimize the I/O needed to fetch the join result

Join Algorithms – Page Fetching(1)

• Sort-merge join
• Join results are clustered
• Memory is sufficient
• Fetch the pages of the tuples as soon as they are produced

Join Algorithms – Page Fetching(2)

• Fetching instruction table
• Join candidate table

Join Index: (x1, A:1, C:1) (x2, B:1, D:1) (x3, A:2, C:2) (x4, B:2, D:2)

ft1 = {A:1, B:1, A:2, B:2}    ft2 = {C:1, D:1, C:2, D:2}
jct1 = {x1, x2, x3, x4}    jct2 = {y1, y2, y3, y4}

After sorting:
ft1 = {A:1, A:2, B:1, B:2}    ft2 = {C:1, C:2, D:1, D:2}
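The fetching instruction tables in this example can be derived from the join index with a few lines. The sketch assumes that an entry such as A:1 means "page A, slot 1" and that sorting groups the fetches by page; both are interpretations of the slide, and the names `fetch_tables` and `pages_in_order` are illustrative.

```python
from typing import List, Tuple

Tid = Tuple[str, int]  # (page, slot), e.g. ("A", 1) for A:1 (assumed meaning)

# Join index from the slide: (join value, tid in table 1, tid in table 2).
join_index = [
    ("x1", ("A", 1), ("C", 1)),
    ("x2", ("B", 1), ("D", 1)),
    ("x3", ("A", 2), ("C", 2)),
    ("x4", ("B", 2), ("D", 2)),
]


def fetch_tables(index: List[Tuple[str, Tid, Tid]]) -> Tuple[List[Tid], List[Tid]]:
    """Split the join index into one fetching instruction table per input."""
    ft1 = [left for _, left, _ in index]
    ft2 = [right for _, _, right in index]
    return ft1, ft2


def pages_in_order(ft: List[Tid]) -> List[str]:
    """Distinct pages in the order a sorted fetch table would visit them."""
    return list(dict.fromkeys(page for page, _ in sorted(ft)))


ft1, ft2 = fetch_tables(join_index)
print(sorted(ft1), pages_in_order(ft1))
print(sorted(ft2), pages_in_order(ft2))
```

Before sorting, ft1 alternates between pages A and B; after sorting, all slots of a page are adjacent, so each page needs to be fetched only once.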

Join Algorithms – Page Fetching(3)

• Join graph: G = (V1 ∪ V2, E), E ⊆ V1 × V2
• Segment, e.g. {1, a, b, c}, {a, 1, 2}

Join Algorithms – Page Fetching(3)

• Required storage size (RSS)
• Required cache size (RCS)
• <join_attr, tid1, tid2>

Conclusion

PAX:
• Scan algorithms have little room for improvement.
• RARE-join, FlashJoin.
• No writes.
• The join index will be sorted many times.
• The size of a mini-page is not fixed.

Conclusion

Row:
• DigestJoin.
• I/O is much higher than in other join algorithms.

Column:
• None.
• Storage is more flexible.
• Uses tuple reconstruction techniques.