Licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.

Apache HBase 1.0 Release

DESCRIPTION

An overview of the state of the HBase 1.0 release. Covers a quick HBase overview, the HBase timeline, new features for 1.0, and the upgrade path.

Page 1: Apache HBase 1.0 Release

Apache HBase 1.0 Release Nick Dimiduk, Hortonworks @xefyr n10k.com

Page 2: Apache HBase 1.0 Release

Release 1.0

“The theme of (eventual) 1.0 release is to become a stable base for future 1.x series of releases. 1.0 release will aim to achieve at least the same level of stability of 0.98 releases without introducing too many new features.”

Enis Söztutar, HBase 1.0 Release Manager

2014-11-18

Page 3: Apache HBase 1.0 Release

Agenda

•  A Brief History of HBase
•  What is HBase
•  Major Changes for 1.0
•  Upgrade Path

Page 4: Apache HBase 1.0 Release

A Brief History of HBase

How we got here

Page 5: Apache HBase 1.0 Release

The Early Years

•  2006: BigTable paper published by Google
•  2006: HBase development starts
•  2007: HBase added to Hadoop contrib
•  2007: Release Hadoop 0.15.0
•  2008: Hadoop graduates Incubator
•  2008: HBase becomes Hadoop sub-project
•  2008: Release HBase 0.18.1
•  2009: Release HBase 0.19.0
•  2009: Release HBase 0.20.0

Page 6: Apache HBase 1.0 Release

Into Production

•  2010: HBase becomes Apache top-level project
•  2011: Release HBase 0.90.0
•  2011: Release HBase 0.92.0
•  2011: HBase: The Definitive Guide published
•  2012: Release HBase 0.94.0
•  2012: First HBaseCon
•  2012: HBase Administration Cookbook published
•  2012: HBase In Action published

Page 7: Apache HBase 1.0 Release

Modern HBase

•  2013: HBaseCon 2013
•  2013: Release HBase 0.96.0
•  2013: Apache Phoenix enters Incubator
•  2014: Release HBase 0.98.0
•  2014: HBaseCon 2014
•  2014: Apache Phoenix graduates Incubator
•  2014: Release HBase 1.0
•  …
•  2015: Release HBase 2.0?

Page 8: Apache HBase 1.0 Release

What is HBase

HBase architecture in 5 minutes or less

Page 9: Apache HBase 1.0 Release

Data Model

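[Figure: logical view of Table A, organized into rows and column families. Each value is addressed by rowkey, column family, column qualifier, and timestamp.]

rowkey  column family  column qualifier  timestamp   value
a       cf1            "bar"             1368394583  7
a       cf1            "bar"             1368394261  "hello"
a       cf1            "foo"             1368394583  22
a       cf1            "foo"             1368394925  13.6
a       cf1            "foo"             1368393847  "world"
a       cf2            "2011-07-04"      1368396302  "fourth of July"
a       cf2            1.0001            1368387684  "almost the loneliest number"
b       cf2            "thumb"           1368387247  [3.6 kb png data]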
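The data model on this slide can be read as a nested map: rowkey, then column family, then column qualifier, then timestamp, then value. The following is a minimal in-memory sketch of that idea in Python. The class `SketchTable` and its methods are hypothetical names chosen for illustration; this is not the HBase client API, and the sample values are simply borrowed from the figure.

```python
from collections import defaultdict


class SketchTable:
    """Toy model of the logical layout; a hypothetical helper, not HBase itself."""

    def __init__(self):
        # rowkey -> column family -> column qualifier -> {timestamp: value}
        self._rows = defaultdict(lambda: defaultdict(lambda: defaultdict(dict)))

    def put(self, rowkey, family, qualifier, timestamp, value):
        """Store one versioned cell."""
        self._rows[rowkey][family][qualifier][timestamp] = value

    def get(self, rowkey, family, qualifier, timestamp=None):
        """Return the newest value at or before `timestamp` (newest overall if None)."""
        versions = self._rows.get(rowkey, {}).get(family, {}).get(qualifier, {})
        candidates = [ts for ts in versions if timestamp is None or ts <= timestamp]
        if not candidates:
            return None
        return versions[max(candidates)]


table = SketchTable()
table.put("a", "cf1", "foo", 1368393847, "world")
table.put("a", "cf1", "foo", 1368394583, 22)
table.put("a", "cf1", "foo", 1368394925, 13.6)

latest = table.get("a", "cf1", "foo")                          # newest version
older = table.get("a", "cf1", "foo", timestamp=1368394000)     # version as of an earlier time
```

Keeping every version under its timestamp is what lets a read ask for the state of a cell as of an earlier point in time, which is the role the timestamp dimension plays in the figure.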

Page 10: Apache HBase 1.0 Release

Logical Architecture

[Figure: Table A, with rows a through p, is split into Region 1, Region 2, Region 3, and Region 4. The regions are hosted by RegionServers alongside regions of other tables:]

Region Server 7: Table A, Region 1; Table A, Region 2; Table G, Region 1070; Table L, Region 25

Region Server 86: Table A, Region 3; Table C, Region 30; Table F, Region 160; Table F, Region 776

Region Server 367: Table A, Region 4; Table C, Region 17; Table E, Region 52; Table P, Region 1116
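One way to picture the figure: because a table's regions cover contiguous, sorted ranges of rowkeys, finding the RegionServer for a row amounts to a sorted lookup on region start keys. The sketch below illustrates that under stated assumptions. The `RegionMap` class is hypothetical, and the split points "e", "i", and "m" are assumed for illustration (the slide does not state the exact boundaries); only the region-to-server assignments come from the figure.

```python
import bisect


class RegionMap:
    """Hypothetical lookup table from rowkey to (region, server)."""

    def __init__(self, regions):
        # regions: list of (start_key, region_name, server_name); start keys are inclusive
        self._regions = sorted(regions)
        self._start_keys = [start for start, _, _ in self._regions]

    def locate(self, rowkey):
        """Return the (region, server) whose key range contains `rowkey`."""
        # Last region whose start key is <= rowkey
        idx = bisect.bisect_right(self._start_keys, rowkey) - 1
        if idx < 0:
            raise KeyError(f"no region covers rowkey {rowkey!r}")
        _, region, server = self._regions[idx]
        return region, server


# Split points are assumptions; server assignments follow the slide.
table_a = RegionMap([
    ("",  "Table A, Region 1", "Region Server 7"),
    ("e", "Table A, Region 2", "Region Server 7"),
    ("i", "Table A, Region 3", "Region Server 86"),
    ("m", "Table A, Region 4", "Region Server 367"),
])

region_for_b = table_a.locate("b")
region_for_k = table_a.locate("k")
region_for_m = table_a.locate("m")
```

The empty start key on the first region makes it cover every rowkey that sorts before the first split point, so each row maps to exactly one region.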

Page 11: Apache HBase 1.0 Release

Physical Architecture

HBase in distributed mode

…store/access data on HDFS. The master process does the distribution of regions among RegionServers, and each RegionServer typically hosts multiple regions.

Given that the underlying data is stored in HDFS, which is available to all clients as a single namespace, all RegionServers have access to the same persisted files in the filesystem and can therefore host any region (figure 3.8). By physically collocating DataNodes and RegionServers, you can use the data locality property; that is, RegionServers can theoretically read and write to the local DataNode as the primary DataNode.

You may wonder where the TaskTrackers are in this scheme of things. In some HBase deployments, the MapReduce framework isn’t deployed at all if the workload is primarily random reads and writes. In other deployments, where the MapReduce processing is also a part of the workloads, TaskTrackers, DataNodes, and HBase RegionServers can run together.

Full table T1

00001  John   415-111-1234
00002  Paul   408-432-9922
00003  Ron    415-993-2124
00004  Bob    818-243-9988
00005  Carly  206-221-9123
00006  Scott  818-231-2566
00007  Simon  425-112-9877
00008  Lucas  415-992-4432
00009  Steve  530-288-9832
00010  Kelly  916-992-1234
00011  Betty  650-241-1192
00012  Anne   206-294-1298

Table T1 split into 3 regions - R1, R2, and R3

T1R1: rows 00001 to 00004
T1R2: rows 00005 to 00008
T1R3: rows 00009 to 00012

Figure 3.6 A table consists of multiple smaller chunks called regions.

[Diagram: three hosts, each running a DataNode and a RegionServer.]

Figure 3.7 HBase RegionServer and HDFS DataNode processes are typically collocated on the same host.

[Figure: physical architecture. Labels: HBase Client, Master, ZooKeeper, NameNode, and a row of hosts that each run a RegionServer and a DataNode, with the HBase and HDFS layers marked.]

Page 12: Apache HBase 1.0 Release

Major Changes for 1.0

What’s all the excitement about?

Page 13: Apache HBase 1.0 Release

Stability: Co-Locate Meta with Master

•  Simplify, improve region assignment reliability
   –  Fewer components involved in updating “truth”
•  Master embeds a RegionServer
   –  Will host only system tables
   –  Baby step towards combining RS/Master into a single hbase daemon
•  Backup masters unchanged
   –  Can be configured to host user tables while in standby
•  Plumbing is all there, OFF by default

http://issues.apache.org/jira/browse/HBASE-10569

Page 14: Apache HBase 1.0 Release

Availability: Region Replicas

•  Multiple RegionServers host a Region
   –  One is “primary”, others are “replicas”
   –  Only primary accepts writes
•  Client reads against primary only or any
   –  Results marked as appropriate
•  Baby step toward quorum reads, writes
•  Plumbing is all there, OFF by default

http://issues.apache.org/jira/browse/HBASE-10070
http://www.slideshare.net/HBaseCon/features-session-1
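The read and write rules on this slide can be sketched as a small simulation: writes go only to the primary copy, reads may be served by the primary or by a replica, and a result served by a replica is marked so the caller knows it may lag. Everything below is a toy model under stated assumptions. `ReplicatedRegion`, its `sync` step, and the `stale` flag are hypothetical names for illustration and say nothing about how HBase actually propagates updates to replicas.

```python
class ReplicatedRegion:
    """Toy model of one region with a primary copy and read-only replicas."""

    def __init__(self, num_replicas):
        self.primary = {}
        self.replicas = [{} for _ in range(num_replicas)]

    def write(self, key, value):
        """Only the primary accepts writes."""
        self.primary[key] = value

    def sync(self):
        """Bring replicas up to date (stand-in for the real propagation mechanism)."""
        for replica in self.replicas:
            replica.clear()
            replica.update(self.primary)

    def read(self, key, replica_id=None):
        """Return (value, stale); stale is True when a replica served the read."""
        if replica_id is None:
            return self.primary.get(key), False
        return self.replicas[replica_id].get(key), True


region = ReplicatedRegion(num_replicas=2)
region.write("row1", "v1")
region.sync()
region.write("row1", "v2")          # replicas have not caught up yet

from_primary = region.read("row1")
from_replica = region.read("row1", replica_id=0)
```

In this model a client that can tolerate slightly old data reads from any copy and checks the flag, while a client that needs the latest write reads from the primary only.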

Page 15: Apache HBase 1.0 Release

Usability: Client API Cleanup

•  Improved self-consistency
•  Simpler semantics
•  Easier to maintain
•  Obvious @InterfaceAudience annotations

http://issues.apache.org/jira/browse/HBASE-10602
http://s.apache.org/hbase-1.0-api

https://github.com/ndimiduk/hbase-1.0-api-examples

Page 16: Apache HBase 1.0 Release

New and Noteworthy

•  Greatly expanded hbase.apache.org/book.html
•  Truncate table shell command
•  Automatic tuning of global MemStore and BlockCache sizes
•  BucketCache easier to configure
•  Compressed BlockCache
•  Pluggable replication endpoint
•  A Dockerfile to easily build and run HBase from source

Page 17: Apache HBase 1.0 Release

Under the Covers

•  ZooKeeper abstractions
•  Meta table used for assignment
•  Cell-based read/write path
•  Combining mvcc/seqid
•  Sundry security, tags, labels improvements

Page 18: Apache HBase 1.0 Release

Groundwork for 2.0

•  More, Smaller Regions
   –  Millions, 1G or less
   –  Less write amplification
   –  Splitting hbase:meta
•  Performance
   –  More off-heap
   –  Less resource contention
   –  Faster region failover/recovery
   –  Multiple WALs
   –  QoS/Quotas/Multi-tenancy
•  Rigging
   –  Faster, more intelligent assignment
   –  Procedure bus
   –  Resumable, query-able operations
•  Other possibilities
   –  Quorum/consensus reads, writes?
   –  Hydrabase, multi-DC consensus?
   –  Streaming RPCs?
   –  High level coprocessor API

Page 19: Apache HBase 1.0 Release

Semantic Versioning

•  Major/Minor/Patch version numbers
   –  Only major/minor pre-1.0
•  Dimensions
   –  Client/Server wire compatibility
   –  Server/Server wire and feature compatibility
   –  API compatibility
   –  ABI compatibility
   –  Proposal up for a vote

http://s.apache.org/hbase-semver

Page 20: Apache HBase 1.0 Release

Upgrade Path

Tell it to me straight, how bad is it?

Page 21: Apache HBase 1.0 Release

Online/Wire Compatibility

•  Direct migration from 0.94 supported
   –  Looks a lot like upgrade from 0.94 to 0.96: requires downtime
   –  Not tested yet, will be before release
•  RPC is backward-compatible to 0.96
   –  Enables mixing clients and servers across versions
   –  So long as no new features are enabled
•  Rolling upgrade "out of the box" from 0.98
•  Rolling upgrade "with some massaging" from 0.96
   –  I.e., 0.96 cannot read HFile v3, the new default
   –  Not tested yet, will be before release
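The upgrade options above reduce to a lookup keyed on the release series you are starting from. The sketch below encodes only what the slide states; the function name `upgrade_path` and the wording of the returned strings are illustrative choices, and any series the slide does not mention is reported as not covered rather than guessed at.

```python
# Upgrade routes to 1.0 as listed on the slide, keyed by starting release series.
UPGRADE_PATHS = {
    "0.94": "direct migration, requires downtime",
    "0.96": "rolling upgrade with some massaging",
    "0.98": "rolling upgrade out of the box",
}


def upgrade_path(current_version):
    """Map a version string such as '0.98.7' to the route the slide describes."""
    series = ".".join(current_version.split(".")[:2])
    return UPGRADE_PATHS.get(series, "not covered by the slide")


from_098 = upgrade_path("0.98.7")
from_094 = upgrade_path("0.94.24")
from_092 = upgrade_path("0.92.1")
```

Note that the slide marks the 0.94 and 0.96 routes as not yet tested at the time of the talk, so a table like this is a planning aid, not a guarantee.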

Page 22: Apache HBase 1.0 Release

Client Application Compatibility

•  API is backward compatible to 0.96
   –  No code change required
   –  You’ll start getting new deprecation warnings
   –  We recommend you start using new APIs
•  ABI is NOT backward compatible
   –  Cannot drop current application jars onto new runtime
   –  Recompile your application vs. 1.0 jars
   –  Just like 0.96 to 0.98 upgrade

Page 23: Apache HBase 1.0 Release

Hadoop Versions

•  Hadoop 1.x is NOT supported
   –  Bite the bullet; you’ll enjoy the performance benefits
•  Hadoop 2.x only
   –  Most thoroughly tested on 2.4.x, 2.5.x
   –  Probably works on 2.2.x, 2.3.x, but less thoroughly tested

https://hbase.apache.org/book/configuration.html#hadoop

Page 24: Apache HBase 1.0 Release

Java Versions

•  JDK 6 is NOT supported!
•  JDK 7 is the target runtime
•  JDK 8 support is experimental

https://hbase.apache.org/book/configuration.html#hadoop

Page 25: Apache HBase 1.0 Release

Developer Preview 0.99.x

•  Pre-release “beta” builds for testing
•  Not for production

DEVELOPER PREVIEWS: NOT FOR PRODUCTION

•  Try out the new features
•  Help us test your upgrade path
•  Be a part of history in the making!
•  0.99.1 available now

http://search-hadoop.com/m/DHED4186dj1

Page 26: Apache HBase 1.0 Release

Thanks!

HBase in Action (Manning), by Nick Dimiduk and Amandeep Khurana, foreword by Michael Stack

hbaseinaction.com

Nick Dimiduk github.com/ndimiduk

@xefyr

n10k.com

http://s.apache.org/hbase-1.0
