Hive 0.14 Does ACID
Alan Gates, @alanfgates
February 2015
© Hortonworks Inc. 2015

Hive Acid Updates 0.14

Adding ACID updates to Apache Hive

History
- Hive only updated partitions
  - INSERT OVERWRITE rewrote an entire partition
  - Forced daily or even hourly partitions
  - Could add files to partition directory, but no file compaction
- What about concurrent readers?
  - OK for inserts, but overwrite caused races
  - There is a ZooKeeper lock manager, but...
- No way to delete or update rows
- No INSERT INTO T VALUES
  - Breaks some tools

Why is ACID Critical?
- Hadoop and Hive have always worked without ACID
  - Perceived as tradeoff for performance
- But your data isn't static
  - It changes daily, hourly, or faster
- Ad hoc solutions require a lot of work
- Managing change makes the user's life better
- Do or Do Not, There is NO Try

Use Cases
- NOT OLTP!!!
- Updating a dimension table
  - Changing a customer's address
- Delete old records
  - Remove records for compliance
- Update/restate large fact tables
  - Fix problems after they are in the warehouse
- Streaming data ingest
  - A continual stream of data coming in
  - Typically from Flume or Storm
- NOT OLTP!!!

New SQL in Hive 0.14
- New DML
  - INSERT INTO T VALUES(1, 'fred', ...);
  - UPDATE T SET (x = 5[, ...]) WHERE ...
  - DELETE FROM T WHERE ...
  - Supports partitioned and non-partitioned tables; WHERE clause can specify partition but this is not required
- Restrictions
  - Table must have a format that extends AcidInputFormat (currently ORC)
  - Table must be bucketed and not sorted
    - Can use 1 bucket, but this will restrict write parallelism
  - Table must be marked transactional
    - create table T(...) clustered by (a) into 2 buckets stored as orc TBLPROPERTIES ('transactional'='true');

Why Not HBase?
- Good
  - Handles compactions for us
  - Already has similar data model with LSM
- Bad
  - No cross-row transactions
  - Would require us to write a transaction manager over HBase; doable, but not less work
  - HFile is column-family based rather than columnar
  - HBase focused on point lookups and range scans
    - Warehousing requires full scans

Design
- HDFS does not allow arbitrary writes
  - Store changes as delta files
  - Stitched together by client on read
- Writes get a transaction ID
  - Sequentially assigned by metastore
- Reads get committed transactions
  - Provides snapshot consistency
  - No locks required
  - Provide a snapshot of data from start of query

[Figure: Stitching Buckets Together]
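The read-side idea in the Design slide (base data plus delta files, stitched together on read, with only committed transactions visible) can be illustrated with a small in-memory sketch. This is not Hive's reader: `Event`, `read_snapshot`, and the dictionary representation of a base are hypothetical names chosen for illustration, and real ORC files, buckets, and splits are ignored.

```python
from typing import NamedTuple, Optional


class Event(NamedTuple):
    """One change record in a delta (hypothetical in-memory stand-in)."""
    txn: int               # transaction that wrote this event
    key: tuple             # (original transaction, bucket, row)
    op: str                # "i" insert, "u" update, "d" delete
    value: Optional[str]   # new row contents; None for deletes


def read_snapshot(base, deltas, committed):
    """Stitch base rows and delta events into one consistent view.

    base      -- dict mapping key -> value
    deltas    -- list of deltas, each a list of Event
    committed -- set of transaction IDs visible to this reader
    """
    rows = dict(base)
    # Only events from transactions committed when the query started are visible.
    visible = [e for delta in deltas for e in delta if e.txn in committed]
    # Apply in transaction order so the latest visible change wins.
    for e in sorted(visible, key=lambda e: e.txn):
        if e.op == "d":
            rows.pop(e.key, None)
        else:
            rows[e.key] = e.value
    return rows


base = {(1, 0, 0): "fred", (1, 0, 1): "wilma"}
deltas = [
    [Event(2, (1, 0, 0), "u", "FRED")],
    [Event(3, (1, 0, 1), "d", None)],      # transaction 3 is not committed
    [Event(4, (4, 0, 0), "i", "barney")],
]
snapshot = read_snapshot(base, deltas, committed={2, 4})
```

Because transaction 3 is outside the committed set, its delete is ignored and "wilma" stays in the snapshot, which is the sense in which no read locks are needed.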

HDFS Layout
- Partition locations remain unchanged
  - Still warehouse/$db/$tbl/$part
- Bucket files structured by transactions
  - Base files: $part/base_$tid/bucket_*
  - Delta files: $part/delta_$tid_$tid/bucket_*

Input and Output Formats
- Created new AcidInput/OutputFormat
  - Unique key is transaction, bucket, row
- Reader returns correct version of row based on transaction state
- Also added raw API for compactor
  - Provides previous events as well
- ORC implements new API
  - Extends records with change metadata
  - Add operation (d, u, i), transaction and key

Distributing the Work
- Need to split buckets for MapReduce
  - Need to split base and deltas the same way
  - Use key ranges
  - Use indexes
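As a rough illustration of the directory naming in the HDFS Layout slide, the sketch below picks the newest base and the deltas that begin after it from a list of directory names. The function name `select_dirs` is hypothetical, and the logic is a simplification: it looks only at the `base_$tid` and `delta_$tid_$tid` names and ignores transaction state (open or aborted transactions), which the real reader must also consult.

```python
import re

BASE_RE = re.compile(r"^base_(\d+)$")
DELTA_RE = re.compile(r"^delta_(\d+)_(\d+)$")


def select_dirs(names):
    """Return (newest base name or None, deltas starting after that base)."""
    base_tid, base_name = -1, None
    for name in names:
        m = BASE_RE.match(name)
        if m and int(m.group(1)) > base_tid:
            base_tid, base_name = int(m.group(1)), name

    deltas = []
    for name in names:
        m = DELTA_RE.match(name)
        # Deltas at or below the base's transaction ID are already folded into it.
        if m and int(m.group(1)) > base_tid:
            deltas.append((int(m.group(1)), int(m.group(2)), name))

    deltas.sort()  # read deltas in ascending transaction order
    return base_name, [name for _, _, name in deltas]


listing = [
    "delta_0028101_0028200",
    "base_0028000",
    "delta_0027901_0028000",   # older than the base, skipped
    "delta_0028001_0028100",
]
chosen_base, chosen_deltas = select_dirs(listing)
```

For this listing the sketch selects `base_0028000` followed by the two deltas covering transactions 0028001 through 0028200.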

Transaction Manager
- Existing lock managers
  - In memory: not durable
  - ZooKeeper: requires additional components to install, administer, etc.
- Locks need to be integrated with transactions
  - commit/rollback must atomically release locks
- We sort of have this database lying around which has ACID characteristics (metastore)
- Transactions and locks stored in metastore
- Uses metastore DB to provide unique, ascending ids for transactions and locks

Transaction Model
- In Hive 0.14 DML statements are auto-commit
  - Working on adding BEGIN, COMMIT, ROLLBACK
- Snapshot isolation
  - Reader will see consistent data for the duration of his/her query
  - May extend to other isolation levels in the future
- Current transactions can be displayed using new SHOW TRANSACTIONS statement

Locking Model
- Three types of locks
  - shared
  - semi-shared (can co-exist with shared, but not other semi-shared)
  - exclusive
- Operations require different locks
  - SELECT, INSERT: shared
  - UPDATE, DELETE: semi-shared
  - DROP, INSERT OVERWRITE: exclusive

Compactor
- Each transaction (or batch of transactions in streaming ingest) creates a new delta file
- Too many files = unhappy NameNode
- Need a way to
  - Collect many deltas into one delta: minor compaction
  - Rewrite base and delta to new base: major compaction

Minor Compaction
- Run when there are 10 or more deltas (configurable)
- Results in base + 1 delta

Before:
/hive/warehouse/purchaselog/ds=201403311000/base_0028000
/hive/warehouse/purchaselog/ds=201403311000/delta_0028001_0028100
/hive/warehouse/purchaselog/ds=201403311000/delta_0028101_0028200
/hive/warehouse/purchaselog/ds=201403311000/delta_0028201_0028300
/hive/warehouse/purchaselog/ds=201403311000/delta_0028301_0028400
/hive/warehouse/purchaselog/ds=201403311000/delta_0028401_0028500

After:
/hive/warehouse/purchaselog/ds=201403311000/base_0028000
/hive/warehouse/purchaselog/ds=201403311000/delta_0028001_0028500
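The naming arithmetic behind minor compaction can be sketched as follows: the merged delta spans the lowest starting and highest ending transaction ID of its inputs. Both function names are hypothetical, the seven-digit zero padding is inferred from the example paths, and the default threshold of 10 comes from the slide text; none of this is Hive's actual compactor code.

```python
import re

DELTA_RE = re.compile(r"^delta_(\d+)_(\d+)$")


def needs_minor_compaction(delta_names, threshold=10):
    """True once the number of deltas reaches the (configurable) threshold."""
    return len(delta_names) >= threshold


def minor_compaction_target(delta_names):
    """Name of the single delta that covers every input delta."""
    ranges = []
    for name in delta_names:
        m = DELTA_RE.match(name)
        if m is None:
            raise ValueError(f"not a delta directory: {name}")
        ranges.append((int(m.group(1)), int(m.group(2))))
    if not ranges:
        raise ValueError("no deltas to compact")
    lo = min(start for start, _ in ranges)
    hi = max(end for _, end in ranges)
    # Zero-pad to seven digits, matching the example directory names.
    return f"delta_{lo:07d}_{hi:07d}"


deltas = [
    "delta_0028001_0028100",
    "delta_0028101_0028200",
    "delta_0028201_0028300",
    "delta_0028301_0028400",
    "delta_0028401_0028500",
]
target = minor_compaction_target(deltas)
```

Applied to the five deltas in the listing, this yields `delta_0028001_0028500`, the same directory name shown after compaction.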

Major Compaction
- Run when deltas are 10% the size of base (configurable)
- Results in new base

Before:
/hive/warehouse/purchaselog/ds=201403311000/base_0028000
/hive/warehouse/purchaselog/ds=201403311000/delta_0028001_0028100
/hive/warehouse/purchaselog/ds=201403311000/delta_0028101_0028200
/hive/warehouse/purchaselog/ds=201403311000/delta_0028201_0028300
/hive/warehouse/purchaselog/ds=201403311000/delta_0028301_0028400
/hive/warehouse/purchaselog/ds=201403311000/delta_0028401_0028500

After:
/hive/warehouse/purchaselog/ds=201403311000/base_0028500

Compactor Continued
- Metastore thrift server will schedule and execute compactions
  - No need for user to schedule
  - User can initiate via new ALTER TABLE COMPACT statement
- No locking required; compactions run at same time as select and DML
  - Compactor is aware of readers and does not remove old files until readers have finished with them
- Current compactions can be viewed via new SHOW COMPACTIONS statement

Application: Streaming Ingest
- Data is flowing in from generators in a stream
- Without this, you have to add it to Hive in batches, often every hour
  - Thus your users have to wait an hour before they can see their data
- New interface in hive.hcatalog.streaming lets applications write small batches of records and commit them
  - Users can now see data within a few seconds of it arriving from the data generators
- Available for Apache Flume in HDP 2.1 and Storm in HDP 2.2

Configuration
- On the client
  - hive.support.concurrency=true
  - hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
  - hive.enforce.bucketing=true
- On the metastore server
  - hive.compactor.initiator.on=true
  - hive.compactor.worker.threads=1 # or more

Phases of Development
- Phase 1, Hive 0.13
  - Transaction and new lock manager
  - ORC file support
  - Automatic and manual compaction
  - Snapshot isolation
  - Streaming ingest via Flume
- Phase 2, Hive 0.14
  - INSERT VALUES, UPDATE, DELETE
- Phase 3, Hive 1.2(?)
  - Add support for only some columns in insert
    - INSERT into T (a, b) select c, d from U;
  - BEGIN, COMMIT, ROLLBACK
- Future (all speculative based on user feedback)
  - Integration with HCatalog
  - Versioned or point in time queries
  - Streaming ingest of updates and deletes
  - Additional isolation levels such as dirty read or read committed
  - MERGE

Conclusion
- JIRA: https://issues.apache.org/jira/browse/HIVE-5317
- Adds ACID semantics to Hive
- Uses SQL standard commands
  - INSERT, UPDATE, DELETE
- Provides scalable read and write access

Thank You!
Questions & Answers