
CenterEdge: Migrating from relational: data modeling and access – Couchbase Connect 2016


©2016 Couchbase Inc. 1


Migrating from relational: data modeling and access

Brant Burnett, Lead Developer, CenterEdge
Clarence Tauro, Sr Trainer, Couchbase
Marco Greco, Sr Engineer, N1QL R&D, Couchbase


Agenda

• Practical considerations for data and application migration
• Modeling in Couchbase
• Real life experience: CenterEdge


Practical Considerations


In this section

• Nomenclature
• Type and data model mapping
• Migrating data
• Business logic
• Monitoring


Nomenclature

Oracle   | Couchbase
Database | Bucket
Table    | Bucket
Row      | Document
Column   | Field


Type Mapping

Oracle (PL/SQL synonyms) | Couchbase
Number, Binary_real, Binary_integer (Smallint, Int, Dec, Decimal, Float, …) | Number
Char, Nchar, Varchar2, Nvarchar2 (Character, String) | String
Boolean | Boolean
Date, Timestamp | Handled via String
Interval (year to month, day to fraction) | Some support via _millis() functions


Modelling

Relational tables:

Customer
CustomerID | Name | DOB
CBC2016 | Jane Smith | 1990-01-30

Billing
CustomerID | Type | Cardnum | Expiry
CBC2016 | visa | 5827… | 2019-03
CBC2016 | master | 6274… | 2018-12

Connections
CustomerID | ConnId | Name
CBC2016 | XYZ987 | Joe Smith
CBC2016 | SKR007 | Sam Smith

Purchases
CustomerID | item | amt
CBC2016 | mac | 2823.52
CBC2016 | ipad2 | 623.52

Contacts
CustomerID | ConnId | Name
CBC2016 | XYZ987 | Joe Smith
CBC2016 | SKR007 | Sam Smith

Equivalent single document:

DocumentKey: CBC2016

{
  "Name": "Jane Smith",
  "DOB": "1990-01-30",
  "Billing": [
    { "type": "visa", "cardnum": "5827-2842-2847-3909", "expiry": "2019-03" },
    { "type": "master", "cardnum": "6274-2842-2847-3909", "expiry": "2019-03" }
  ],
  "Connections": [
    { "CustId": "XYZ987", "Name": "Joe Smith" },
    { "CustId": "PQR823", "Name": "Dylan Smith" }
  ],
  "Purchases": [
    { "id": 12, "item": "mac", "amt": 2823.52 },
    { "id": 19, "item": "ipad2", "amt": 623.52 }
  ]
}
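As a rough sketch of how relational rows like the ones above could be folded into one document, the Python below embeds child rows as arrays under the customer. The helper name `nest_customer` and the in-memory row lists are illustrative assumptions, not something shown in the talk.

```python
import json

# Hypothetical in-memory copies of a few of the relational rows.
customer = {"CustomerID": "CBC2016", "Name": "Jane Smith", "DOB": "1990-01-30"}
billing = [
    {"CustomerID": "CBC2016", "type": "visa", "expiry": "2019-03"},
    {"CustomerID": "CBC2016", "type": "master", "expiry": "2018-12"},
]
purchases = [
    {"CustomerID": "CBC2016", "item": "mac", "amt": 2823.52},
    {"CustomerID": "CBC2016", "item": "ipad2", "amt": 623.52},
]


def nest_customer(customer, **children):
    """Return (document_key, document) with child rows embedded as arrays."""
    key = customer["CustomerID"]
    # The key is stored outside the document body, so drop it from the body.
    doc = {k: v for k, v in customer.items() if k != "CustomerID"}
    for name, rows in children.items():
        doc[name] = [
            {k: v for k, v in row.items() if k != "CustomerID"}
            for row in rows
            if row["CustomerID"] == key
        ]
    return key, doc


key, doc = nest_customer(customer, Billing=billing, Purchases=purchases)
print(key)
print(json.dumps(doc, indent=2))
```

The foreign key column disappears because the relationship is now expressed by nesting rather than by a join.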


Migration

• Generalized process
• Commercial tools
  • Talend
  • Informatica
• Open source
  • Couchbase java importer
  • Oracle2couchbase
  • SQSL
• Importing from files


Migration Process

• High-level process to migrate data from an RDBMS to Couchbase using N1QL
• For each table
  • Determine primary key columns
  • Describe table
  • For each row
    • Generate document key from primary key columns
    • Generate document from projection list description, column values
    • INSERT INTO <bucket> (KEY, VALUE) VALUES ($1, $2)
      • Use key and document as placeholder values
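The per-row steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `::` key separator, the added `type` attribute, and the sample `dept` row are hypothetical, and the statement is only built, not sent to a server.

```python
import json


def make_key(row, pk_columns, sep="::"):
    """Build a document key by joining the primary key column values."""
    return sep.join(str(row[col]) for col in pk_columns)


def make_document(row, table):
    """Turn a row (column name -> value) into a JSON-ready document."""
    doc = dict(row)
    doc["type"] = table  # assumption: tag each document with its source table
    return doc


def insert_args(bucket, table, row, pk_columns):
    """Return the N1QL statement plus the positional arguments for $1 and $2."""
    statement = "INSERT INTO `{}` (KEY, VALUE) VALUES ($1, $2)".format(bucket)
    return statement, [make_key(row, pk_columns), make_document(row, table)]


# Hypothetical row from a table named dept with primary key deptno.
row = {"deptno": 10, "name": "Sales"}
statement, args = insert_args("default", "dept", row, ["deptno"])
print(statement)
print(json.dumps(args))
```

Passing the key and document as positional arguments keeps the statement text constant across rows, so it could be prepared once and reused.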


Talend

• Talend connector for Couchbase -- Talend 5.3 or later
  • http://developer.couchbase.com/documentation/server/4.5/connectors/talend/talend.html
• Ingesting unstructured data
• Couchbase view support
• Seamless integration with Couchbase
  • tCouchbaseInput
    • Incoming data transformed into JSON documents and stored in Couchbase.
    • User defines the data fields to be transformed into JSON attributes
  • tCouchbaseOutput: uses the schema mapping to transform JSON documents into target data formats
• ODBC/JDBC drivers (provided by Simba and CData)


Informatica

• Informatica PowerCenter
  • Needs ODBC driver
• Informatica Cloud
  • Needs JDBC driver
• ODBC/JDBC drivers (provided by Simba and CData)
• ETL & Data Integration
  • Load data from any relational system into Couchbase
  • Export Couchbase data into RDBMS
  • Seamlessly integrate Couchbase into the rest of the data fabric


Couchbase java importer

• Blog post by Laurent Doguin detailing the journey from process to code
• Java based, but the principle applies to other languages
• Geared to Postgres, but the principle applies to other engines
• Blog: http://blog.couchbase.com/2016/january/moving-sql-database-content-to-couchbase
• Source code: https://github.com/ldoguin/couchbase-java-importer


Oracle2couchbase

• Another blog post / open source tool
• By Manuel Hurtado
• Java based
• Migrates from Oracle
• Blog: http://blog.couchbase.com/2016/february/moving-data-from-oracle-to-couchbase
• Source and binary: https://github.com/mahurtado/oracle2couchbase


SQSL

• Client side SQL-like scripting language developed by yours truly two decades ago
• Several nifty features like
  • Expansion
  • Data driven operation
  • On the fly aggregation and redirection
  • User defined routines
• Have recently written a data source for Couchbase and a JSON library
• Source: http://www.sqsl.org
• Example:

let fromconn="sample";
connect to fromconn source db2cli;
connect to "couchbase://192.168.1.104:8091" source cb;
select * from db2inst1.dept connection fromconn
  insert into default (key, value) values($1, $2)
  using json:key("::", columns), json:row2doc(displaylabels, columns);


Import / Export utilities

• Upcoming version includes cbimport & cbexport
• File based utilities
• Need to export RDBMS data to file first
• Load directly into data store, bypassing N1QL


Business logic

• DDL
• Views
• Triggers
• Procedures
• Sequences
• Joins
• Transactions


Language comparison

Query Features | SQL on RDBMS | N1QL
DML | SELECT, INSERT, UPDATE, DELETE, MERGE | SELECT, INSERT, UPDATE, DELETE, MERGE
DDL | CREATE [INDEX, PROCEDURE, TABLE, TYPE, VIEW…]; ALTER [TABLE, TYPE, …]; DROP [INDEX, PROCEDURE, TABLE, TYPE, VIEW…] | CREATE [PRIMARY] INDEX; DROP [PRIMARY] INDEX
Query Operations | Select, Join, Project, Subqueries; Strict Schema; Strict Type checking | Select, Join, Project, Subqueries; Nest & Unnest; Look Ma! No Type Mismatch Errors!; JSON keys act as columns
Schema | Predetermined Columns | Fully addressable JSON; Flexible document structure
Data Types | SQL Data types; Conversion Functions | JSON Data types; Conversion Functions
Query Processing | INPUT: Sets of Tuples; OUTPUT: Set of Tuples | INPUT: Sets of JSON; OUTPUT: Set of JSON


DDL

• Only CREATE INDEX / DROP INDEX exist in N1QL
• Everything else should be removed from the application
• Temporary tables
  • Materialize results
  • Store in memory, or
  • Insert materialized document in a keyspace using a designated “type” field and a UUID() as key
  • DROP <temporary table> becomes DELETE FROM keyspace WHERE type=… and ID=<UUID>
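One way to picture the temporary-table replacement is the Python sketch below. It assumes the materialized rows are wrapped in a single document that carries its own `id` attribute; the `tmp_result` type name and both helper functions are hypothetical, and the DELETE statement is only constructed, not executed.

```python
import uuid


def materialize(rows, doc_type="tmp_result"):
    """Wrap materialized rows in one document keyed by a fresh UUID."""
    key = str(uuid.uuid4())
    doc = {"type": doc_type, "id": key, "rows": list(rows)}
    return key, doc


def drop_statement(keyspace, doc_type, key):
    """N1QL text and named parameters standing in for DROP <temporary table>."""
    statement = "DELETE FROM `{}` WHERE type = $type AND id = $id".format(keyspace)
    return statement, {"$type": doc_type, "$id": key}


key, doc = materialize([{"n": 1}, {"n": 2}])
statement, params = drop_statement("default", doc["type"], key)
print(statement, params)
```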


Views

• Access underlying keyspaces instead
• Something akin to views can be obtained with
  • View indexes
  • Functional indexes
  • CREATE INDEX … WHERE clauses


Statement blocks

• Handled by the application:
  • Triggers
  • Procedures


Sequences

• The eventual persistence engine handles atomic increments
• Special documents can be created with a counter and accessed atomically
• Can specify a delta on creation
• Must be done from SDK
  • In Python: [code sample on the slide was not captured]
• N1QL does not provide atomic increments
  • Use UUID() instead
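The counter behavior described above can be illustrated with a small in-memory stand-in. This is not Couchbase SDK code: `CounterStore` is a hypothetical class that only mimics the semantics (create with an initial value, then apply a delta atomically). The 2.x Python SDK is believed to expose the real operation as `Bucket.counter(key, delta=..., initial=...)`, which should be checked against the SDK documentation.

```python
import threading


class CounterStore:
    """In-memory stand-in for a key-value store with atomic counters."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def counter(self, key, delta=1, initial=0):
        """Atomically create the counter at `initial`, or add `delta` to it."""
        with self._lock:
            if key not in self._values:
                self._values[key] = initial
            else:
                self._values[key] += delta
            return self._values[key]


store = CounterStore()
first = store.counter("seq::order", delta=1, initial=456789)
second = store.counter("seq::order", delta=1, initial=456789)
print(first, second)
```

The lock plays the role of the server-side atomicity: two callers can never observe the same sequence value.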


Joins

• Two types of joins
  • Look up
  • Index
• Joins use the document key
• Joining side can be an expression
• Joined side is document key
• Full expression joins not supported


Transactions

• In two words: No need
• Document modification is atomic
• Consistency can be specified at the REST call level or SDK
  • REST example
    • Add scan_consistency=[not_bounded|at_plus|request_plus|statement_plus] to REST parameters
  • C SDK example
    • Use lcb_n1p_setconsistency (…, [LCB_N1P_CONSISTENCY_NONE, LCB_N1P_CONSISTENCY_RYOW, LCB_N1P_CONSISTENCY_REQUEST, LCB_N1P_CONSISTENCY_STATEMENT]) when setting up request args
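For the REST case, a request carrying the consistency parameter might be assembled as in the sketch below. The host, port and bucket name are assumptions (a local query node on the default query service port), and the request is built but deliberately not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: a query node reachable locally on the default query service port.
QUERY_URL = "http://localhost:8093/query/service"

VALID_LEVELS = {"not_bounded", "at_plus", "request_plus", "statement_plus"}


def build_query_request(statement, scan_consistency="not_bounded"):
    """Build (but do not send) a POST request for the N1QL REST API."""
    if scan_consistency not in VALID_LEVELS:
        raise ValueError("unknown scan_consistency: " + scan_consistency)
    body = urlencode({"statement": statement, "scan_consistency": scan_consistency})
    return Request(QUERY_URL, data=body.encode("utf-8"), method="POST")


req = build_query_request("SELECT * FROM `default` LIMIT 1", "request_plus")
print(req.full_url)
print(req.data.decode("utf-8"))
```

Validating the level on the client side simply catches typos early; the server remains the authority on which values it accepts.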


Monitoring

• Oracle
  • ALTER [system|session] SET timed_statistics=true turns on timed statistics collection.
  • V$SESSTAT, V$SYSSTAT, V$STATNAME dynamic performance views report timed statistics.
  • EXPLAIN PLAN explains a statement.
• MySQL
  • SET profiling=1 turns on profiling
  • SHOW PROFILES displays available query profiles
  • SHOW PROFILE displays the profile for a specific query
  • EXPLAIN <statement> produces query plan
• Couchbase
  • system:completed_requests virtual keyspace lists completed long running queries with timings and statistics
  • system:active_requests virtual keyspace lists active queries with timings and statistics
  • EXPLAIN <statement> explains request plan as a JSON document


Modeling


What is Data Modeling?


• A data model is a conceptual representation of the data structures that are required by a database

• The data structures include the data objects, the associations between data objects, and the rules which govern operations on the objects.


Conceptual Data Modeling

• Phase I – define entities, attributes and their relationships
  • Entities: main objects that your applications operate on
  • Attributes: properties that your applications keep track of for the entity
  • Relationships: connections to other entities (1-1, 1-many, many-many)

[Diagram: entities Airline, Airport, Landmark, Route, Passenger, Flight]


Physical Data Model

• Phase II - Map entities, attributes and their relationships to containers provided by the underlying database solution

Relational Databases | Couchbase Server
Databases | Buckets
Tables | Documents with type designator attribute OR Compound Keys
Rows | Items (Key-Value or Key-Document)
Columns | Attributes
Index | Index


Physical Data Modeling

Document key: [email protected]
{ "name": …, "flights": [ { "_id": "route_1000", "flight": … }, { "_id": "route_6421", "flight": … }, … ], … }

Document key: route_1000
{ "id": "1000", "airline": "AF", "sourceairport": "TLV", "destinationairport": "MRW", … }

Document key: airport_TLV
{ "id": "126701", "airportname": "TLV", "geo": { "lat": …, "long": … }, … }

[Diagram label: Flights]


Data Modeling Approaches

Relational: Required Normalization
• schema enforced by db
• same fields in all records
• Minimize data inconsistencies (one item = one location)
• Reduced update cost (no duplicated data)
• Preserve storage resources

NoSQL: Relaxed Normalization
• schema implied by structure
• fields may be empty, duplicate, or missing
• Optimized to planned/actual access patterns
• Flexibility with software architecture
• Supports clustered architecture
• Reduced server overhead


JSON Design Choices

• Couchbase Server neither enforces nor validates any particular document structure
• Choices that impact JSON document design:
– Single Root Attributes
– Objects vs. Arrays
– Array Element Types
– Timestamp Formats
– Property Names
– Empty and Null Property Values
– JSON Schema


Root Attributes vs. Embedded Attributes


• The choice of having a single root attribute or the “type” attribute embedded.


Objects vs. Arrays


• The choice of having an object type, or an array type


Array Element Types

• Array elements can be simple types, objects or arrays:
  • Array of strings
  • Array of objects


Timestamp Formats

• Working with timestamps has always been challenging
• When storing timestamps, you have at least 3 options:
  • Array of time components
  • String (ISO 8601)
  • Number (Unix style) (Epoch)
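The three storage options can be compared side by side with a short Python sketch. The sample instant is arbitrary, and storing the epoch value in milliseconds is an assumption chosen to line up with the _millis() style functions mentioned earlier.

```python
from datetime import datetime, timezone

# Arbitrary sample instant, expressed in UTC.
moment = datetime(2016, 11, 8, 13, 0, 0, tzinfo=timezone.utc)

# Option 1: array of time components.
as_array = [moment.year, moment.month, moment.day,
            moment.hour, moment.minute, moment.second]

# Option 2: ISO 8601 string, including the time zone offset.
as_iso8601 = moment.isoformat()

# Option 3: Unix epoch number (milliseconds here).
as_epoch_millis = int(moment.timestamp() * 1000)


def from_iso8601(text):
    """Parse the string form back into a timezone-aware datetime."""
    return datetime.fromisoformat(text)


print(as_array, as_iso8601, as_epoch_millis)
```

The string form stays human readable and sorts chronologically when every value uses the same offset, while the numeric form is the easiest to compare and do arithmetic on.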


Empty and Null Property Values

• Keep in mind that JSON supports optional properties
• If a property has a null value, consider dropping it from the JSON, unless there's a good reason not to
• N1QL makes it easy to test for missing or null property values
• Be sure your application code handles the case where a property value is missing

SELECT * FROM couchmusic1 WHERE userprofile.address IS NULL;

SELECT * FROM couchmusic1 WHERE userprofile.gender IS MISSING;
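On the application side, the same distinction between a null and a missing property could be handled as in this Python sketch. The `classify` helper and the sample document are illustrative assumptions.

```python
def classify(doc, path):
    """Return 'missing', 'null' or 'value' for a dotted attribute path."""
    current = doc
    for part in path.split("."):
        # A path that cannot be followed all the way means the property is missing.
        if not isinstance(current, dict) or part not in current:
            return "missing"
        current = current[part]
    return "null" if current is None else "value"


# Hypothetical document: address is present but null, gender is absent.
doc = {"userprofile": {"address": None, "username": "copilotmarks61569"}}

print(classify(doc, "userprofile.address"))
print(classify(doc, "userprofile.gender"))
print(classify(doc, "userprofile.username"))
```

A plain `doc.get("x")` would return `None` in both the null and the missing case, which is why the sketch checks key membership explicitly.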


JSON Schema

• Couchbase Server pays absolutely no attention to the shape of your JSON documents so long as they are well-formed
• There are times when it is useful to validate that a JSON document conforms to some expected shape
• JSON Schema is a JSON-based format for defining the structure of JSON data
• There are implementations for most popular programming languages
• Learn more here: http://json-schema.org


Data Nesting (aka Denormalization)

• As you know, relational database design promotes separating data using normalization, which doesn’t scale
• For NoSQL systems, we often avoid normalization so that we can scale
• Nesting allows related objects to be organized into a hierarchical tree structure where you can have multiple levels of grouping
• Rule of thumb is to nest no more than 3 levels deep unless there is a very good reason to do so
• You will often want to include a timestamp in the nested data


Example #1 of Data Nesting

• Playlist with owner attribute containing username of corresponding userprofile

Document Key: copilotmarks61569


Example #1 of Data Nesting

• Playlist with owner attribute containing a subset of the corresponding userprofile

* Note the inclusion of the updated attribute
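The slide's document images are not reproduced here, so the following Python is only a sketch of the idea: a playlist whose owner attribute holds a subset of the user profile plus an updated timestamp. Apart from the value copilotmarks61569, every attribute name and value is a hypothetical placeholder.

```python
from datetime import datetime, timezone


def embed_owner(playlist, userprofile, fields, now=None):
    """Return a copy of the playlist with a subset of the user profile embedded."""
    now = now or datetime.now(timezone.utc)
    owner = {name: userprofile[name] for name in fields if name in userprofile}
    # Record when the copy was taken so stale embedded data can be detected later.
    owner["updated"] = now.isoformat()
    nested = dict(playlist)
    nested["owner"] = owner
    return nested


# Hypothetical documents for illustration.
profile = {"username": "copilotmarks61569", "email": "user@example.com", "firstName": "Pat"}
playlist = {"type": "playlist", "name": "Morning mix", "owner": "copilotmarks61569"}
fixed = datetime(2016, 11, 8, 13, 0, tzinfo=timezone.utc)

nested = embed_owner(playlist, profile, ["username", "firstName"], now=fixed)
print(nested)
```

Embedding only the attributes the playlist views need keeps the document small, and the timestamp gives the application a way to decide when the copy should be refreshed.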


Choices with JSON Key Design

• A key formed of attributes that exist in the real world:
– Phone numbers
– Usernames
– Social security numbers
– Account numbers
– SKU, UPC or QR codes
– Device IDs


Surrogate Keys

• We often use surrogate keys when no obvious natural key exists
• They are not derived from application data
• They can be generated values
– 3305311F4A0FAAFEABD001D324906748B18FB24A (SHA-1)
– 003C6F65-641A-4CGA-8E5E-41C947086CAE (UUID)
• They can be sequential numbers (often implemented using the Counter feature of Couchbase Server)
– 456789, 456790, 456791, …
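The three kinds of surrogate key listed above can be generated with the Python standard library, as in this sketch. Hashing random bytes for the SHA-1 style key and using a local `itertools.count` in place of a server-side counter are both assumptions made for illustration.

```python
import hashlib
import itertools
import os
import uuid


def uuid_key():
    """Random (version 4) UUID rendered as an upper-case string."""
    return str(uuid.uuid4()).upper()


def sha1_key():
    """SHA-1 digest of random bytes, so the key is not derived from application data."""
    return hashlib.sha1(os.urandom(32)).hexdigest().upper()


# Local stand-in for a sequence; a shared counter would be needed across processes.
sequence = itertools.count(456789)

print(uuid_key())
print(sha1_key())
print(next(sequence), next(sequence))
```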


Making Tough Choices

• We must also make trade-offs in data modeling:
– Document size
– Atomicity
– Complexity
– Speed


Embed vs. Refer

• All of the previous trade-offs are usually rolled into a single decision – whether to embed or refer
• When to embed?
• When to refer?


Want to know more on Data Modeling?


• Session tomorrow – “Agile Document Models and Data Structures” at 1:00PM


Brant Burnett
Software Development Team Lead

Couchbase Community Expert


About CenterEdge Software

Point of Sale

Admissions & Ticketing

Party, Group & Event Bookings

Online Sales & Party Reservations

Time Clock & Labor Management

& More!


About CenterEdge Software

• Celebrating our 12-year anniversary

• Team of 50 in Roxboro, NC

• Sister company is Palace Pointe, a 100k sq. ft. entertainment venue for which our software was originally developed as an in-house system

• Over 600 facilities using our platform across the US and abroad

• FECs, Waterparks, Trampoline Parks, Amusement Parks, Skating Rinks, Bowling Centers, Zoos & Museums


Why Couchbase For CenterEdge’s Newest Cloud Platform?

• More scalable and performant than traditional SQL in the cloud

• Previous online store system uses 19 SQL servers, each hosting 30 stores

• As each store is only on a single server, it doesn’t handle spikes in load efficiently

• Servers can’t be scaled vertically without downtime for all 30 stores on that server

• Schema-less JSON increases flexibility as your system evolves, leaving schema enforcement in your data access layer

• Schema changes to large tables can result in downtime as data structure is updated across all records

• We were already using Couchbase for our shopping carts as well as a SQL caching layer, with great success. Now we can simplify the architecture with a single data layer.


Couchbase Cloud Data Flow Architecture

[Diagram: Web Servers, Remote Application Servers, and Couchbase nodes labeled Data, Data, Data, Index, Query]


Enforcing Schema

• Since Couchbase doesn’t enforce schema like SQL, your data access layer should do so instead
• At CenterEdge, each document type is only updated by a single service
  • Within that service, schema is enforced by serializing data from consistent POCOs
  • Schema changes can be supported using customized JSON converters during deserialization
  • IS MISSING is a good way to recognize the difference in attributes that weren’t stored because the document was saved using the old schema
• Where possible, try to predict possible schema needs in advance
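The talk describes this pattern with POCOs in .NET. A loosely comparable sketch in Python uses a dataclass as the single place where a document's shape is defined. The `Airline` class, its fields, and the assumption that `callsign` was added in a later schema version are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Airline:
    """Hypothetical document type; the field names are illustrative only."""
    name: str
    country: str
    callsign: Optional[str] = None  # assumption: added in a later schema version
    type: str = "airline"

    @classmethod
    def from_json(cls, text):
        """Deserialize, tolerating documents saved under the older schema."""
        raw = json.loads(text)
        if raw.get("type") != "airline":
            raise ValueError("unexpected document type")
        # Attributes absent from old documents fall back to a default value.
        return cls(name=raw["name"], country=raw["country"],
                   callsign=raw.get("callsign"))

    def to_json(self):
        """Serialize with every attribute present, so new writes share one shape."""
        return json.dumps(asdict(self), sort_keys=True)


old = Airline.from_json('{"type": "airline", "name": "Air France", "country": "France"}')
print(old.to_json())
```

Because every write goes through `to_json`, documents converge on the current shape as they are rewritten, while reads keep working for documents that have not been touched yet.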


Pay Attention To Document Modeling Up Front

• Watch out for documents that get too large
  • Might hit 20MB document size limit
  • High serialization/deserialization/networking performance penalties
  • Document contention as too many actions attempt to modify the document simultaneously
• Watch out for data spread across too many related documents
  • Lack of atomic transactions across multiple writes can result in partial updates
  • Can add latency if documents must be read in a chained manner (i.e. each document contains the key to the next document)
• Be sure to include document keys, or a way to construct them, where you may want to use N1QL JOIN or NEST operations
• Should the document key be stored inside the document, too?
  • Increases data size, as the key is in the document and in the metadata
  • Requires that the data layer maintain consistency
  • Can make queries easier since you don’t need to use META() function to get the key


View Indexes vs. Global Secondary Indexes

• Be sure to analyze what type of index is best for each workload
  • Views are great where pre-aggregating numbers is useful, such as reports, graphs, etc.
  • GSI is usually the best option for more generic queries, especially when you’re just trying to collect a set of documents
  • Views don’t scale as cleanly; they can’t be scaled independently via Multi-Dimensional Scaling
  • Views live on the data nodes, so they only scale as you add more data nodes
• At CenterEdge, our new platform started on Couchbase Server 3.0, before Global Secondary Indexes were an option
  • We used views and lookup documents for most of our indexing needs
  • We have run into problems with too many views per bucket causing performance bottlenecks
  • We’re currently transitioning many of these views into Global Secondary Indexes


Efficient Indexing Is Especially Important For Couchbase

• Primary key scans in SQL have always been inefficient
  • Every record in the table would be read and checked for a match to the WHERE predicate
  • For small tables, the performance penalty was negligible, and would usually go unnoticed
• Primary key scans in Couchbase are usually much worse
  • In our experience with production-scale data, this almost invariably results in queries timing out
  • Every record in the bucket is being read and checked for a match to the WHERE predicate
  • Can easily result in reading and parsing millions of JSON documents
  • Will also bust the in-memory cache on the data nodes if there is more data in the bucket than allocated memory
• Design every query to be supported by a Global Secondary Index
  • Helps even if the index isn’t an exact match
  • A good design can vastly reduce the number of documents scanned, making it more like a SQL primary key scan


Efficient Indexing Is Especially Important For Couchbase

/* Use predicate to only index documents of a certain type */
CREATE INDEX `airport_sourceairport` ON `travel-sample` (`sourceairport`)
WHERE `type` = 'airport'

/* To index the same attribute across multiple document types, include type attribute first */
CREATE INDEX `def_type_id` ON `travel-sample` (`type`, `id`)

/* A good practice is to create a fallback in case other indexes aren't used */
CREATE INDEX `def_type` ON `travel-sample` (`type`)

If you’re using the “type” attribute as the logical equivalent of a table in SQL, most indexes will include this attribute.


How To Store and Index Date/Times

• Date/Times are usually stored as ISO 8601 strings in JSON
• Use STR_TO_MILLIS(x) in indexes and queries to work with ISO 8601 strings

/* STR_TO_MILLIS converts an ISO8601 string to a Unix numeric representation */
/* It also handles the time zone specifier */
SELECT `Extent1`.* FROM `beer-sample` as `Extent1`
WHERE (`type` = 'beer')
AND (STR_TO_MILLIS(`Extent1`.`updated`) <= STR_TO_MILLIS("2010-01-01T00:00:00Z"))

/* STR_TO_MILLIS must also be used in the index, or the index cannot be used */
CREATE INDEX `beer_updated` ON `beer-sample` (STR_TO_MILLIS(`updated`))
WHERE `type` = 'beer'


Index Performance During Mutations

[Diagram: Airline SQL Table with Airline Indexes; Airport SQL Table with Airport Indexes; travel-sample Bucket with Bucket Indexes]

Remember that GSI indexes are similar to SQL indexes, but not the same


Training!

• Don’t just assume you can switch to any NoSQL platform without some training
  • Performance profile is different, and the penalties can appear in different places
  • Developers who know the pitfalls in advance can save you a lot of refactoring headaches later
• N1QL does help reduce the learning curve significantly
  • For .Net development shops, look at Linq2Couchbase to make it even easier!
• The operations department needs training, too!


Marco Greco
Senior Software [email protected]

Clarence J M Tauro, Ph.D.
Senior [email protected]

Brant Burnett
Lead [email protected]


Share your opinion on Couchbase

1. Go here: http://gtnr.it/2eRxYWn

2. Create a profile

3. Provide feedback (~15 minutes)



Thank You!