80
www.huawei.com Security Level: HUAWEI TECHNOLOGIES CO., LTD. Faster HBase queries Introducing hindex Secondary indexes for HBase Rajeshbabu Chintaguntla [email protected] ApacheCon North America 2014

Hindex: Secondary indexes for faster HBase queries

Embed Size (px)

DESCRIPTION

HBase is a very popular data store due to its tight integration with Hadoop. However, query latencies can sometimes be high, specially when scanning tables on column values. This can also have other undesirable side effects - like timeouts at client, or lease expires. Hindex adds secondary indexes for HBase tables. The indexes are used for equals and range condition scans, and can turn full table scans to point/range scans. Hindex is 100% server-side solution based on co-processors and supports one or more indexes on a table, multi-column index, and also index based on part of a column value. Hindex stores region level index in a separate table, and colocates the user and index table regions with a custom load balancer.

Citation preview

Page 1: Hindex: Secondary indexes for faster HBase queries

www.huawei.com

Security Level:

HUAWEI TECHNOLOGIES CO., LTD.

Faster HBase queries Introducing hindex – Secondary indexes for HBase

Rajeshbabu Chintaguntla

[email protected]

ApacheCon North America 2014

Page 2: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 2

$ whoami

Apache HBase Committer

Senior Software Engineer, Huawei India R&D

Developed hindex – Secondary indexes for HBase

Page 3: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 3

Agenda

HBase – A brief introduction

Introduction to hindex

Usage

Test Results

Page 4: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 4

Agenda

HBase – A brief introduction

Introduction to hindex

Usage

Test Results

Page 5: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 5

Apache HBase

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

an open-source, distributed,

versioned, non-relational

database

modeled after Google’s BigTable

leverages distributed data storage

provided by HDFS

allows random, read/write access

to data in HDFS

GOAL: hosting very large tables -

billions of rows X millions of

columns

Page 6: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 6

HBase – Introduction

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

Sorted

lexicographically

Page 7: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 7

HBase – Introduction

name gender dept title mobile

Raj M SE 534

Ram M SSE

Anu M 123

Pia F 326

salary dob

1230 …

key

123

135

141

142

Jay SOA SA 521

Suma SOA SSE

Som M OIH SE

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

Sorted Sparse any number of columns

Page 8: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 8

HBase – Introduction

Sorted Sparse

name gender dept title mobile

Raj M OIH SE 534

Ram M SSE

Anu M 123

Pia F 326

salary dob

1230 …

key

123

135

141

142

Jay SOA SA 521

Suma SOA SSE

Som M OIH SE

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

Multi-dimensional

SSE

SSE

TL

data is versioned

Key=123

cf1

name T1 Raj

gender T1 M

dept T1 OIH

title T1 SE

title T2 SSE

mobile T1 534

cf2 salary T3 1230

dob T1 19880830

SSE

(key, column family, column, timestamp) -> value

Page 9: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 9

HBase – Introduction

Sorted Distributed Sparse leverages HDFS, data replicated across nodes

name gender dept title mobile

Raj M SE 534

Ram M SSE

Anu M 123

Pia F 326

salary dob

1230 …

key

123

135

141

142

Jay SOA SA 521

Suma SOA SSE

Som M OIH SE

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

Protects against failing node

Multi-dimensional

Page 10: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 10

HBase – Introduction

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

Regions

Sorted Sparse Auto-sharded split and re-distributed as data grows Distributed Multi-dimensional

Page 11: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 11

HBase – Introduction

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

Regions

Region Range

R1 120-145

R2 145-170

R3 …

… …

metadata Sorted Sparse Auto-sharded Distributed Multi-dimensional

Page 12: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 12

HBase – Introduction

HMaster

Mem store

HFile1 HFile2

Store cf1

Mem store

HFile3

HBlock1

HBlock2

:

HBlockN

HFile4

Store cf2

Region 1 Region 2

Region Server 1 Region Server 2

Master, region servers and zookeeper

Table horizontally divided into regions

Columns grouped into Column families – Vertical partition of tables

Memstore, HFiles in DFS. HFiles logically split into smaller blocks. Data read

write happen as blocks

MapReduce integration

HBlock1

HBlock2

:

HBlockN

HBlock1

HBlock2

:

HBlockN

HBlock1

HBlock2

:

HBlockN

Page 13: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 13

Coprocessors Allow to run client-supplied code on server-side,

Can extend the functionality of HBase without changing the kernel

Pre-Action

Action

Post-Action

Client

Observers – Like triggers

Runs extended functionality before

or after an action through hooks

provided by coprocessor

framework.

EndPoints – Like Stored procedures

Can run any time from client

The endpoint implementation will

then be executed remotely at the

target region(s)

results from those executions will

be returned to the client.

Page 14: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 14

Filters

Source: Lars, George, HBase The Definitive Guide, O’Reilly Media. 2011

Page 15: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 15

HBase – Query

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

SELECT NAME FROM PERSON WHERE KEY=141

NOTE: HBase does not have native support for SQL like query interface. It is used here for the sake of easy understanding. Similar query can be done using scanners and filters.

Page 16: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 16

HBase – Query by Rowkey

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

SELECT NAME FROM PERSON WHERE KEY=141

Page 17: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 17

HBase – Query by Column (w/o Index)

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

SELECT NAME FROM PERSON WHERE MOBILE=123

Page 18: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 18

HBase – Query by Column (w/o Index)

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

SELECT NAME FROM PERSON WHERE MOBILE=123

Page 19: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 19

HBase – Query by Column (w/o Index)

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

SELECT NAME FROM PERSON WHERE MOBILE=123

Full table

scan

Page 20: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 20

HBase – Query by Column (w/o Index)

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

SELECT NAME FROM PERSON WHERE MOBILE=123

Full table

scan

Billions of records full table

scan evil to performance

Page 21: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 21

HBase – Query by Column (w/o Index)

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

SELECT NAME FROM PERSON WHERE MOBILE=123

Full table

scan

Billions of records full table

scan evil to performance

Side effects – Client timeouts,

lease expiring

Page 22: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 22

Agenda

HBase – A brief introduction

Introduction to hindex

Usage

Test Results

Page 23: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 23

Introducing hindex

Coprocessor based server side implementation of secondary indexing solution

Separate index table, used by all indexes of a table.

Region wise indexing (aka local indexing)

Custom load balancer co-locates the index table regions with actual table regions.

Index table rowkey construction is:

region start key + index name+ indexed column(s) value + user table rowkey

Page 24: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 24

hindex: Architecture

HBase Client

HMaster

Balancer

Indexing Coprocessor

Cop

rocesor

Host

Cli

en

t Ext

RowKey cf1:col1

001 A

002 B

003 Z

004 C

005 A

006 A

… …

RegionServer

Cop

rocesor

Host

RowKey Index table cf

001_A_001

001_A_005

001_A_006

001_B_002

001_C_004

001_Z_003

Coprocessor handles the index data

A custom LoadBalancer does

collocation

Client Extn allows specifying index details while creating table, not needed for read/write

Primary User table Secondary index table

Client App

Page 25: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 25

HBase – Query by Column (w/ Index)

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

SELECT NAME FROM PERSON WHERE MOBILE=123

Index column: MOBILE

Page 26: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 26

HBase – Query by Column (w/ Index)

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

idx1

123 141

254 135

326 142

534 123

SELECT NAME FROM PERSON WHERE MOBILE=123

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

126 148

521 145

665 152

… …

Page 27: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 27

HBase – Query by Column (w/ Index)

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

idx1

123_141

254_135

326_142

534_123

SELECT NAME FROM PERSON WHERE MOBILE=123

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

126_148

521_145

665_152

To ensure user table row key order when a column value repeated in multiple rows

Page 28: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 28

HBase – Query by Column (w/ Index)

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

idx1

123_141

254_135

326_142

534_123

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

126_148

521_145

665_152

SELECT NAME FROM PERSON WHERE MOBILE=123

Page 29: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 29

HBase – Query by Column (w/ Index)

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

idx1

123_141

254_135

326_142

534_123

126_148

521_145

665_152

Regions

Index maintained per

region, not globally

Page 30: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 30

HBase – Query by Column (w/ Index)

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

idx1

123_141

254_135

326_142

534_123

126_148

521_145

665_152

Regions

Index maintained per

region, not globally

Network calls avoided

Page 31: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 31

HBase – Query by Column (w/ Index)

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

salary dob

1230 …

1750 …

2100 …

2270 …

key

123

135

141

142

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH SE 665

… … … … …

4300 …

1550 …

1270 …

.. …

145

148

152

idx1

123_141

254_135

326_142

534_123

126_148

521_145

665_152

Regions

Index maintained per

region, not globally

Handle

Region Movement

Region Splits

Page 32: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 32

Regions Co-location

HMaster

Balancer

RS1

RS2

R1

R2

Client

R1

R2

Actual Table

Index Table

A

B

B

C

A

B

B

C

Create table with regions

R1

R2

R1

R2

Page 33: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 33

Put operation

Table-> t1 & Column family -> cf

Index-> idx1(cf:q1) & idx2(cf:q2)

Index table -> t1_idx

HRegionServer

A

Coprocessor

Client User Region

R1

Index Region

R1

A

B

A

B

Put ‘t1’,’AAB’, ’cf:q1’,’5’, ’cf:q2’,’z1’

Put ‘t1_idx’,’Aidx15AAB’ Put ‘t1_idx’,’Aidx2z1AAB’

Page 34: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 34

Scan Operation

1) Create scanner for index region at server side

HRegionServer

A

Coprocessor

Client User Region

R1

Index Region

R1

A

B

A

B

Create scanner (condition cf:q1=5)

Create scanner on index region Start row : Aidx15 Stop row : Aidx16

Page 35: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 35

Scan Operation

2) Scan index table at server side and seek to exact rows in the user table

HRegionServer

A

Coprocessor

Client User Region

R1

Index Region

R1

A

B

A

B

next() Seek to exact row

1 2

3

4 5

Page 36: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 36

Scan Operation

Coprocessors read index and seek to exact row in the user table

Doing seeks on HFiles based on

rowkey obtained from index data

HFiles reads as block by block

Default block size is 64kb

Skipping block reads from HDFS

where data not at all present

Some times skipping full HFile

No need to read index details back to

client avoiding network extra usage.

Page 37: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 37

hindex: Usage

SELECT NAME FROM PERSON WHERE (DEPT=‘OIH’ OR TITLE=‘TL’) AND (400 > MOBILE AND MOBILE > 500)

Filters (with equal or range conditions)

AND Filters list with MUST_PASS_ALL OR Filters list with MUST_PASS_ONE

Page 38: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 38

HBase – Query with AND

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

key

123

135

141

142

idx1

BDI_142

OIH_123

OIH_141

OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH TL 665

… … … … …

145

148

152

OIH_152

SOA_145

SOA_148

idx2

TL_141

TL_142

TL_152

SA_145

SE_123

SSE_135

SSE_148

Page 39: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 39

HBase – Query with AND

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

key

123

135

141

142

idx1_BDI_142

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH TL 665

… … … … …

145

148

152

idx1_OIH_152

idx1_SOA_145

idx1_SOA_148

idx2_TL_141

idx2_TL_142

idx2_TL_152

idx2_SA_145

1) Single index table per user table, easier to collocate

2) Index table row keys have index name to store each index data together

Page 40: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 40

HBase – Query with AND

name gender dept title mobile

Raj M OIH SE 534

Ram M OIH SSE 254

Anu M OIH TL 123

Pia F BDI TL 326

key

123

135

141

142

idx1_BDI_142

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL

Jay M SOA SA 521

Suma F SOA SSE 126

Som M OIH TL 665

… … … … …

145

148

152

idx1_OIH_152

idx1_SOA_145

idx1_SOA_148

idx2_TL_141

idx2_TL_142

idx2_TL_152

idx2_SA_145

Create two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 41: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 41

HBase – Query with AND

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

idx2_TL_145

Idx2_TL_148

Idx2_TL_150

idx2_TL_152

Idx2_TL_160

idx2_SA_141

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 42: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 42

HBase – Query with AND

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

idx2_TL_145

Idx2_TL_148

Idx2_TL_150

idx2_TL_152

Idx2_TL_160

idx2_SA_141

123 < 142 scanner-1 can jump to idx1_OIH_142

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 43: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 43

HBase – Query with AND

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

idx2_TL_145

Idx2_TL_148

Idx2_TL_150

idx2_TL_152

Idx2_TL_160

idx2_SA_141

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 44: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 44

HBase – Query with AND

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

idx2_TL_145

Idx2_TL_148

Idx2_TL_150

idx2_TL_152

Idx2_TL_160

idx2_SA_141

142 = 142 (meets our condition) - Can fetch required data - Move both scanners to next

Pia

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 45: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 45

HBase – Query with AND

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

idx2_TL_145

Idx2_TL_148

Idx2_TL_150

idx2_TL_152

Idx2_TL_160

idx2_SA_141

152 > 145 scanner-2 can jump to idx2_TL_152

Pia

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 46: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 46

HBase – Query with AND

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

Idx2_TL_145

idx2_TL_148

idx2_TL_150

idx2_TL_152

Idx2_TL_160

idx2_SA_141

152 = 152 (meets our condition) - Can fetch required data - Move both scanners to next

Pia

Som

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 47: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 47

HBase – Query with AND

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

Idx2_TL_145

idx2_TL_148

idx2_TL_150

idx2_TL_152

idx2_TL_160

idx2_SA_141

Scanner-1 reaches end Close both scanners

Pia

Som

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 48: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 48

HBase – Query with AND (w/ 2-column index)

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_TL_160

idx1_OIH_SA_141

idx1_OIH_TL_142

idx1_OIH_TL_152

SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_SOA_SSE_135

idx1_SOA_TL_145

Idx1_SOA_TL_148

Idx1_SOA_TL_150

Idx1_SOA_TL_150

1 scanner: start: idx1_OIH_TL, end: idx1_OIH_TM

Page 49: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 49

HBase – Query with AND (w/ 2-column index)

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_TL_160

idx1_OIH_SA_141

idx1_OIH_TL_142

idx1_OIH_TL_152

SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_SOA_SSE_135

idx1_SOA_TL_145

Idx1_SOA_TL_148

Idx1_SOA_TL_150

Idx1_SOA_TL_150

1 scanner: start: idx1_OIH_TL, end: idx1_OIH_TM

Page 50: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 50

HBase – Query with AND (w/ 2-column index)

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_TL_160

idx1_OIH_SA_141

idx1_OIH_TL_142

idx1_OIH_TL_152

SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_SOA_SSE_135

idx1_SOA_TL_145

Idx1_SOA_TL_148

Idx1_SOA_TL_150

Idx1_SOA_TL_150

Pia

Meets condition - Can fetch required data

1 scanner: start: idx1_OIH_TL, end: idx1_OIH_TM

Page 51: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 51

HBase – Query with AND (w/ 2-column index)

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_TL_160

idx1_OIH_SA_141

idx1_OIH_TL_142

idx1_OIH_TL_152

SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_SOA_SSE_135

idx1_SOA_TL_145

Idx1_SOA_TL_148

Idx1_SOA_TL_150

Idx1_SOA_TL_150

Pia

Som

Meets condition - Can fetch required data

1 scanner: start: idx1_OIH_TL, end: idx1_OIH_TM

Page 52: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 52

HBase – Query with AND (w/ 2-column index)

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_TL_160

idx1_OIH_SA_141

idx1_OIH_TL_142

idx1_OIH_TL_152

SELECT NAME FROM PERSON WHERE DEPT=OIH AND TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_SOA_SSE_135

idx1_SOA_TL_145

Idx1_SOA_TL_148

Idx1_SOA_TL_150

Idx1_SOA_TL_150

Pia

Som

Scanner reaches end Close the scanner

1 scanner: start: idx1_OIH_TL, end: idx1_OIH_TM

Page 53: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 53

HBase – Query with OR

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

Idx2_TL_145

idx2_TL_148

idx2_TL_150

idx2_TL_152

idx2_TL_160

idx2_SA_141

123 <= 142 - Get required results - Move scanner-1 to next till exceeds 142

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 54: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 54

HBase – Query with OR

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

Idx2_TL_145

idx2_TL_148

idx2_TL_150

idx2_TL_152

idx2_TL_160

idx2_SA_141

Raj

123 <= 142 - Get required results - Move scanner-1 to next till exceeds 142

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 55: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 55

HBase – Query with OR

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

Idx2_TL_145

idx2_TL_148

idx2_TL_150

idx2_TL_152

idx2_TL_160

idx2_SA_141

141 <= 142 - Get required results - Move scanner-1 to next till exceeds 142

Raj

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 56: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 56

HBase – Query with OR

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

Idx2_TL_145

idx2_TL_148

idx2_TL_150

idx2_TL_152

idx2_TL_160

idx2_SA_141

Raj

Anu

141 <= 142 - Get required results - Move scanner-1 to next till exceeds 142

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 57: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 57

HBase – Query with OR

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

Idx2_TL_145

idx2_TL_148

idx2_TL_150

idx2_TL_152

idx2_TL_160

idx2_SA_141

142 == 142 - Get required results - Move both scanners in this case

Raj

Anu

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 58: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 58

HBase – Query with OR

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

Idx2_TL_145

idx2_TL_148

idx2_TL_150

idx2_TL_152

idx2_TL_160

idx2_SA_141

Raj

Anu

Pia

142 == 142 - Get required results - Move both scanners in this case

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 59: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 59

HBase – Query with OR

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

idx2_TL_145

idx2_TL_148

idx2_TL_150

idx2_TL_152

idx2_TL_160

idx2_SA_141

Raj

Anu

Pia

152 >= 145 (scanner-2 is behind) - Get required results - Move scanner-2 to next till exceeds 152

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 60: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 60

HBase – Query with OR

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

idx2_TL_145

idx2_TL_148

idx2_TL_150

idx2_TL_152

idx2_TL_160

idx2_SA_141

Raj

Anu

Pia

Jay

152 >= 145 (scanner-2 is behind) - Get required results - Move scanner-2 to next till exceeds 152

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 61: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 61

HBase – Query with OR

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

idx2_TL_145

idx2_TL_148

idx2_TL_150

idx2_TL_152

idx2_TL_160

idx2_SA_141

Raj

Anu

Pia

Jay

Suma

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 62: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 62

HBase – Query with OR

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

idx2_TL_145

idx2_TL_148

idx2_TL_150

idx2_TL_152

idx2_TL_160

idx2_SA_141

Raj

Anu

Pia

Jay

Suma

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

Page 63: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 63

HBase – Query with OR

name gender dept title mobile

Raj M OIH SE 534

Ram M SOA SSE 254

Anu M OIH SA 123

Pia F OIH TL 326

key

123

135

141

142

idx1_BDI_160

idx1_OIH_123

idx1_OIH_141

idx1_OIH_142

SELECT NAME FROM PERSON WHERE DEPT=OIH OR TITLE=TL

Jay M SOA TL 521

Suma F SOA TL 126

M SOA TL 325

Som M OIH TL 665

Su F BDI TL 928

… … … … …

145

148

150

152

160

idx1_OIH_152

idx1_SOA_135

idx1_SOA_148

Idx1_SOA_150

idx2_TL_142

idx2_TL_145

idx2_TL_148

idx2_TL_150

idx2_TL_152

idx2_TL_160

idx2_SA_141

Raj

Anu

Pia

Jay

Suma

Som

two scanners 1) start: idx1_OIH, end: idx1_OII 2) Start: idx2_TL, end: idx2_TM

Page 64: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 64

Region Split

Rowkey cf:col1

01 A

02 A

03 C

04 B

05 X

06 A

07 A

01

09

Rowkey cf

01_A_01

01_A_02

01_A_06

01_A_07

01_B_04

01_C_03

01_X_05

01

09

Split at this point

User table region Index table region

Rowkey cf:col1

01 A

02 A

03 C

04 B

Rowkey cf:col1

05 X

06 A

07 A

Rowkey cf

01_A_01

01_A_02

01_B_04

01_C_03

01

05

05

09

01

05

Rowkey Cf

05_A_06

05_A_07

05_X_05

05

09

Explicit split on index region is avoided (using custom split policy for index tables)

When user table region splits, corresponding index region also splits

Split key for index region same as that of user region

Page 65: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 65

Region Split

Rowkey cf:col1

01 A

02 A

03 C

04 B

05 X

06 A

07 A

01

09

Rowkey cf

01_A_01

01_A_02

01_A_06

01_A_07

01_B_04

01_C_03

01_X_05

01

09

User table region

Index table region

HalfStoreFileReader –Daughter A

HalfStoreFileReader –Daughter B

IndexHalfStoreFileReader –Daughter A –Daughter B

Custom HalfStoreFileReader for

reading index daughter regions.

IndexHalfStoreFileReader – Both

half store file readers start at same point i.e. beginning of file

Checks actual table rowkey and decide KV corresponding to it or not

IndexHalfStoreFileReader for

daughter B - Changes the key as per the daughter start key.

Page 66: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 66

Agenda

HBase – A brief introduction

Introduction to hindex

Usage

Test Results

Page 67: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 67

Usage: Getting Started

For HBase 0.94.x

https://github.com/Huawei-Hadoop/hindex/

For HBase 0.98 or trunk

https://issues.apache.org/jira/browse/HBASE-10222

Page 68: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 68

Usage: Configurations

Name Value

hbase.coprocessor.master.classes org.apache.hadoop.hbase.index.coprocessor

.master.IndexMasterObserver

hbase.coprocessor.region.classes org.apache.hadoop.hbase.index.coprocessor

.regionserver.IndexRegionObserver

hbase.coprocessor.wal.classes org.apache.hadoop.hbase.index.coprocessor

.wal.IndexWALObserver

hbase.master.loadbalancer.class org.apache.hadoop.hbase.index.SecIndexLoa

dBalancer

hbase.use.secondary.index true

Page 69: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 69

Usage: Creating Index

Page 70: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 70

Usage: Tools

TableIndexer tool to create index(es) for existing data

Bulk load tool to load user data to user table and index it at same time

Tool to check regions co-location and repair if any co-location mismatches

$HBASE_HOME/bin/hbase org.apache.hadoop.hbase.index.mapreduce.TableIndexer

-Dtablename.to.index=table -Dtable.columns.index= ‘IDX1=>cf1:[q1->datatype& length]’

$HBASE_HOME/bin/hbase org.apache.hadoop.hbase.index.mapreduce.IndexImportTsv

-Dimporttsv.columns=a,b,c -Dimporttsv.bulk.output=hdfs://storefile-outputdir <tablename>

<hdfs-data-inputdir>

$HBASE_HOME/bin/hbase org.apache.hadoop.hbase.index.util.SecondaryIndexColocator

$HBASE_HOME/bin/hbase org.apache.hadoop.hbase.index.mapreduce.IndexLoadIncrementalHFiles

<hdfs://storefileoutput> <tablename>

Page 71: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 71

Agenda

HBase – A brief introduction

Introduction to hindex

Usage

Test Results

Page 72: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 72

Test Results: Put Performance

Hardware Architecture : x86_64

CPU(s) : 24 (2 threads per core)

RS Heap size: 8GB

Topology 5 Region Servers

100 Regions (user table)

Data 100 GB data

500 bytes per record

Page 73: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 73

Test Results: Scan Performance

idx1 cf:q1 idx2 cf:q2

Search for a column value

Hardware Architecture : x86_64

CPU(s) : 24 (2 threads per core)

RS Heap size: 8GB

Topology 5 Region Servers

100 Regions (user table)

Data 50 GB data

500 bytes per record

Page 74: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 74

Test Results: Query with AND

idx1 cf:q1

idx2 cf:q2

Hardware Architecture : x86_64

CPU(s) : 24 (2 threads per core)

RS Heap size: 8GB

Topology 5 Region Servers

100 Regions (user table)

Data 50 GB data

500 bytes per record

Page 75: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 75

Test Results: Scan w/ Range Query

idx3 cf:q3

Hardware Architecture : x86_64

CPU(s) : 24 (2 threads per core)

RS Heap size: 8GB

Topology 5 Region Servers

100 Regions (user table)

Data 50 GB data

500 bytes per record

Page 76: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 76

Test Results: Scan w/ Multi Column Index

idx4 -> cf:q1,cf:q3

Hardware Architecture : x86_64

CPU(s) : 24 (2 threads per core)

RS Heap size: 8GB

Topology 5 Region Servers

100 Regions (user table)

Data 50 GB data

500 bytes per record

Page 77: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 77

Summary

Design

Supports multiple indexes and multi-column indexes on a table

Supports indexing on part of a column value

Supports equal and range condition scans using index

Supports dynamic add/drop index

Supports hints to skip index scan or specific indexes to use in the scan.

Intelligent Filter evaluation

Application usage

No changes required to perform read and write operations.

Use IndexAdmin (client extension) to perform admin operations like create, enable,

disable and drop on indexed table.

Need not perform admin operations separately on index table.

Upgrade/Integration

Minimal code changes in HBase kernel. HBase version upgrade is very easy.

Page 78: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 78

Roadmap

Contribute to HBase community (In progress – refer HBASE-9203)

Integration with Apache Phoenix

HBCK tool support for Secondary index tables

Pluggable Scan-Evaluation

Page 79: Hindex: Secondary indexes for faster HBase queries

HUAWEI TECHNOLOGIES CO., LTD.

Huawei Confidential 79

Q & A

https://github.com/Huawei-Hadoop/hindex/

Page 80: Hindex: Secondary indexes for faster HBase queries

Thank you www.huawei.com

Copyright©2011 Huawei Technologies Co., Ltd. All Rights Reserved.

The information in this document may contain predictive statements including, without limitation,

statements regarding the future financial and operating results, future product portfolio, new technology,

etc. There are a number of factors that could cause actual results and developments to differ materially

from those expressed or implied in the predictive statements. Therefore, such information is provided for

reference purpose only and constitutes neither an offer nor an acceptance. Huawei may change the

information at any time without notice.