Informix NoSQL & Hybrid SQL detailed deep dive

Informix NoSQL-- Very Deep Dive

John F. Miller III, IBM

Keshava Murthy, IBM

© 2013 IBM Corporation

Speaker Introduction

Keshava Murthy

NoSQL & SQL Architect, STSM

Jef Treece

Product Manager

John F. Miller III

Lead Architect, STSM

Overview

• Business phenomenon of NoSQL
  − Value: Open Source vs Enterprise Class
• What is NoSQL, JSON and BSON?
  − Quick overview
  − IBM/Mongo Partnership
• New Informix NoSQL/JSON Capabilities
  − Solution Highlights
• Mongo Compatibility
  − The Power of a Hybrid Solution: NewSQL
  − Analytics
• Demo Deep Dive
  − Putting together an application
  − Seeing it live

BUSINESS PHENOMENON OF NOSQL



Informix NoSQL Capabilities

• Key-value Stores
  − Hashed key-value model
  − Capable of massive data storage
• Column Family Stores
  − Keys that point to multiple columns and column families
• Document Databases
  − JSON / BSON formatted documents
  − Collections of documents with key-value stored versions
• Graph Databases
  − Nodes / relationships between nodes
  − Properties of nodes
  − Data values
  − Supports SPARQL ?


Business Value of Informix NoSQL

• Increases business opportunity through capture and analysis of “interactive / interaction data” for a full understanding of customers and markets
• Lowers business barriers and reduces IT costs through Rapid Application Deployment
• Increases business ability to respond to volume changes through a scalable / extensible architecture
• Lowers IT overhead through use of low-cost commodity hardware
• Changes business view to “what is happening” versus “what happened”


Use Cases for Informix NoSQL

• Session Store
  − High volume ingestion / low latency access requirements
  − Linkage to a specific user provides immediate recognition of customer preferences
  − Session restore after a “break” returns immediately to the last functions performed, increasing customer confidence and connection to their desire(s)
• User Profile Store
  − Customer profiles, orders, and shipment status are immediately available and searchable
  − Fast access for authentication and preferences
  − Historical access and click streams increase personalization and targeting
• Content and Metadata Store
  − Document database for digital content, articles, instruction manuals
  − Key-value linkages connect content to customer profiles and history
  − Support for multiple image/data types provides fast access to different data types


Use Cases for Informix NoSQL (cont)

• Mobile Apps
  − Storing content and user data in schema-less formats speeds development, deployment, and changes to existing apps
  − Supports multi-format data storage associated with differing device types
  − Scalable storage provides for document, image, and text oriented storage and retrieval of app data / information
• Third Party Aggregation
  − High speed ingestion of data feeds from multiple sources: retail store loyalty programs, social media, marketing information, purchase histories by group, person, or industry
  − Ease of format management and analysis
• High Availability Cache
  − Storage for popular / frequent search results
  − Session information
  − High-frequency ads tailored to user profiles, locations, searches
  − Dual function cache for data store and fast response time for retrieval


Use Cases for Informix NoSQL (cont)

• Globally Distributed Data Repository
  − Scalable nodes and access across distributed systems
  − Location affinity for workload optimization
  − Capture of differing formats and languages to provide global and discrete views of the business and analytics of searches, purchases, etc.
• eCommerce
  − Ability to handle purchase spikes / peak times through scalable system architecture and low-cost commodity hardware
  − Rapid application deployment through a schema-less development model
  − Ability to combine transaction data with user actions reduces the cost of upgrading business apps for transaction management
• Social Media & Gaming
  − Rapid app development, deployment, and change implementation increase the ability to grow the customer base and respond to trends, or create them
  − Fast access to user profiles for authentication and preferences / historical information
  − Real-time analytics and trend identification


Use Cases for Informix NoSQL (cont)

• Ad Targeting
  − Fast access to user profiles and histories permits well-managed ad placement, increasing revenue and buy decisions
  − Click history and psychographic-based suggestions
  − Ability to ingest other feeds and quickly relate them to a specific customer: loyalty programs, social media, associations (LinkedIn, dating sites, industry groups, etc.)

Informix and MongoDB Have Free Editions

Editions     | Informix                                                                | MongoDB
Free         | Developer, Innovator-C                                                  | Basic
For Purchase | Express, Workgroup, Advanced Workgroup, Enterprise, Advanced Enterprise | Standard, Enterprise

MongoDB Subscriptions

                  | Basic                  | Standard               | Enterprise
Edition           | MongoDB                | MongoDB                | MongoDB Enterprise
Support           | 9am-9pm local, M-F     | 24x7x365               | 24x7x365
License           | AGPL                   | Commercial             | Commercial
Emergency Patches | Not Included           | Included               | Included
Price             | $2,500 / Server / Year | $5,000 / Server / Year | $7,500 / Server / Year

Subscription information obtained from 10Gen site, June 26, 2013. Additional monthly charges for backup services.

Price Point Comparison Estimate, 3-year cost

Hardware: Dual Core Intel Nehalem

                            | Innovator-C | Express (4 core, 8 GB, 2 nodes) | Workgroup (16 core, 16 GB, unlimited nodes)
Product Cost                | $0          | $8,540                          | $19,740
Support Subscription Year 1 | $1,680      | Included                        | Included
Support Renewal Year 2      | $1,680      | $1,708                          | $3,948
Support Renewal Year 3      | $1,680      | $1,708                          | $3,948
Total                       | $5,040      | $11,956                         | $27,636

Support Subscription Year 1 covers: 24 x 7 x 365, Production System Down, Development Call, Emergency Patches, Free Upgrades.

MongoDB Enterprise, 3-year cost: $22,500

Subscription information obtained from 10Gen site, June 26, 2013. Retail U.S. prices subject to change, valid as of June 26, 2013.


Back up


Informix NoSQL Key Messages

• Mobile computing represents one of the greatest opportunities for organizations to expand their business, requiring mobile apps to have access to all relevant data
• DB2 and Informix have embraced the JSON standard and MongoDB API to give mobile app developers the ability to combine critical data managed by DB2 or Informix with the next generation mobile apps
• DB2 JSON and Informix JSON let developers leverage the abilities of relational DBMS together with document store systems

IBM Confidential

Leverage all relevant enterprise data for your mobile app development

The Offer / Client Value

DB2/Informix JSON for Install Base Clients

• Any for-purchase edition of
  − Latest version of DB2 LUW 10.5
  − Latest version of Informix 12.1
• Includes modern interface with JSON/BSON support
  − Lets DB2/Informix be compatible with existing MongoDB apps with minimal/no application changes
  − Ability to handle a mix of structured and unstructured data
  − Flexible schema allows rapid delivery of apps
  − Compatible with MongoDB programming interfaces
• One solution for relational and non-relational data
  − Relational tables and NoSQL collections co-exist in the same database
  − Allows joins between NoSQL and relational tables
  − Joins utilize indexes on both relational and NoSQL
• Brings enterprise-level performance, reliability, and security to mobile apps
  − Includes ACID compliance
• Includes built-in online backup and restore capabilities
• Automatic Document Compression (only if they buy Advanced Enterprise or add-on compression with Enterprise Edition?)


DB2 JSON Go-to-Market Timeline

Phase 1 – June through August 15 Phase 2 – eGA – Q4 Phase 3 - 2014

Timing Jan 1, 2014 – TBD

Target Targeted clients/partners

Opportunistic white space

Install Base – Top xxxx accounts

Key partners TBD

Opportunistic white space

Who

Deliverables
• DB2 Trial code with Tech Preview Content
• Tech preview on dW
• A series of how-to articles on dW: Part 1, Part 2, Part 3, Part 4
• Video: DB2 JSON Overview
• Video: Getting started
• Update NoSQL product page with JSON content
• Re-promote DB2 Trial code
• IBM Champions for blogs
• dW DM for JSON Community covering both DB2 and Informix
• On-going

Events
• MongoDB World
• Tech Bits and Tech Talks
• Developer Days
• DB2 Night Show via BP
• Leverage existing 10Gen event calendar to insert content (these are potential, not confirmed)
• Connect with AIM, Rational for joint presence at 3rd conferences and events.


Informix JSON Go-to-Market Timeline

Phase 1 - EVP Phase 2 – eGA – Q4 Phase 3 - 2014

Timing Aug. 5, 2013 – Sep. 13, 2013 Sep. 13, 2013 – Dec. 31, 2013 Jan 1, 2014 – TBD

Target • Targeted clients/partners

• Opportunistic white space

• Install Base – Top xxxx accounts

• Key partners TBD

• Opportunistic white space

Who

Confirmed EVP participants
• Begooden BP
• Moukouri BP
• Consult-IX BP
• ADT BP
• Openbet Cust
• Technicolor White space cust

Potential EVP participants
• Nuvento White space cust
• SRCE BP
• Cisco OEM cust
• Zebra Fish BP
• Top xxxx accounts
• Key partners TBD
• (criteria to be determined for each)
• TBD

Deliverables
• NoSQL Client Business Value Deck
• NoSQL Technical White-paper
• 12.10.xC2 Release White Paper
• developerWorks article on NoSQL Client
• Updated NoSQL demo
• Content on Informix marketing page with re-direct to DevWorks page
• Promote Informix trial code?

Events
• Chat with the Labs
• IOD sessions
• IOD Lab
• Connect with AIM, Rational for joint presence at 3rd conferences and events.


IOD 2013 Roadmap:

Extend Database Application Versatility with NoSQL

Session No. Session Title

IPT-2059 JSON Document Support in IBM DB2 Schema Flexibility in a Trusted Database Environment

IPT-2909 Embracing NoSQL: IBM and the Community

IPT-3149 Big Data, Hadoop, and NoSQL: A Crash Course for a Fearless IBM DB2 Practitioner

IDB-1943 Firefighter Safety Enhancement through NoSQL Information Integration Architectures with IBM DB2 Graph Store

IDB-2075 Competitive Product Delivery with IBM DB2 JSON

IDB-1878 Agile Product Development Using IBM DB2 with JSON

IDX-3691 Get the Best of Both Worlds: Bring NoSQL to Your SQL Database

IDX-3722 Balancing Big Data Workloads with NoSQL Auto-Sharding

IDX-2551 Developing "Hybrid" NoSQL/SQL Applications using JSON/BSON with Informix

IPT-3070 Extending Information Management Solutions with NoSQL Capabilities--XML, RDF and JSON

IDZ-2599 NoSQL and DB2 for z/OS? Receiving Agility from a Trusted Enterprise Database


DB2/Informix Joint Events 2014

� Need to identify events where we can submit topics/speakers for maximum exposure (not just booth participation)


Some Typical NoSQL Use Cases- Mostly Interactive Web/Mobile

• Online/Mobile Gaming
  − Leaderboard (high score table) management
  − Dynamic placement of visual elements
  − Game object management
  − Persisting game/user state information
  − Persisting user generated data (e.g. drawings)
• Display Advertising on Web Sites
  − Ad Serving: match content with profile and present
  − Real-time bidding: match cookie profile with ad inventory, obtain bids, and present ad
• Communications
  − Device provisioning
• Social Networking/Online Communities
• E-commerce/Social Commerce
  − Storing frequently changing product catalogs
• Logging/message passing
  − Drop Copy service in Financial Services (streaming copies of trade execution messages into, for example, a risk or back office system)
• Dynamic Content Management and Publishing (News & Media)
  − Store content from distributed authors, with fast retrieval and placement
  − Manage changing layouts and user generated content


• NoSQL = RDBMS + JSON + Sharding
• NoSQL = JSON + Sharding
• Informix NoSQL = JSON + Sharding + Transaction + RDBMS (Hybrid) + HA (shared disk) + HDR

NOSQL, JSON AND BSON OVERVIEW

Technical Opportunities/ Motivation

What are NoSQL Databases?

Quick overview of JSON

What is sharding?

New Era in Application Requirements

• Store data from web/mobile applications in their native form
  − New web applications use JSON for storing and exchanging information
  − Very lightweight: write more efficient applications
  − It is also the preferred data format for mobile application back-ends
• Move from development to production in no time!
  − Ability to create and deploy flexible JSON schema
  − Gives power to application developers by reducing dependency on IT
  − Ideal for agile, rapid development and continuous integration

Why NoSQL?

• Non-traditional data management requirements driven by Web 2.0 companies
  − Document stores, key-value stores, graph and columnar DBMS
• The Three Vs:
  − Velocity: high frequency of data arrivals
  − Volume: BigData
  − Variability: unstructured data, flexibility in schema design
• New data interchange formats like JSON (JavaScript Object Notation) and BSON (Binary JSON)
• Scale-out requirements across heterogeneous environments: cloud computing

What is a NoSQL Database?

• Not Only SQL, or not allowing SQL
• A non-relational database management system
  − Does not require a fixed schema
  − Avoids join operations
  − Scales horizontally
  − No ACID (eventually consistent)
• Good with distributing data and fast application development

Provides a mechanism for storage and retrieval of data while providing horizontal scaling.

IBM Use Case Characteristics for JSON

Why Most Commercial Relational Databases cannot meet these Requirements

NoSQL Database Philosophy Differences

• No ACID
  − No ACID (Atomicity, Consistency, Isolation, Durability) guarantees
  − An eventual consistency model
• No Joins
  − Generally single row/document lookups
• Flexible Schema
  − No rigid format

Partnership with IBM and MongoDB

• MongoDB and IBM announced a partnership in June 2013
• There are many common use cases of interest addressed by the partnership
  − Accessing JSON data in DB2, Informix, and MongoDB using JSON query
  − Schema-less JSON data for a variety of applications
  − Making JSON data available for a variety of applications
  − Securing JSON data
• IBM and MongoDB are collaborating in 3 areas:
  − Open Governance: Standards and Open Source
  − Technology areas of mutual interest
  − Products and solutions

Basic Translation Terms/Concepts

Mongo/NoSQL Terms | Traditional SQL Terms
Database          | Database
Collection        | Table
Document          | Row
Field             | Column

Collection (each document is a set of key-value pairs):

{"name":"John","age":21}
{"name":"Tim","age":28}
{"name":"Scott","age":30}

Table (each document corresponds to a row):

Name  | Age
John  | 21
Tim   | 28
Scott | 30

JSON Details

• JSON Syntax Rules
  − JSON syntax is a subset of the JavaScript object notation syntax
  − Data is in name/value pairs
  − Data is separated by commas
  − Curly braces hold objects
  − Square brackets hold arrays
• JSON Name/Value Pairs
  − JSON data is written as key/value pairs
  − A key/value pair consists of a field name (in double quotes), followed by a colon, followed by a value: "name":"John Miller"
• The 6 types of JSON values
  − A number (integer or floating point)
  − A string (in double quotes)
  − A Boolean (true or false)
  − An array (in square brackets)
  − An object (in curly brackets)
  − Null

Example of Supported JSON Types

{
  "string": "John",
  "number": 123.45,
  "boolean": true,
  "array": [ "a", "b", "c" ],
  "object": { "str": "Miller", "num": 711 },
  "value": null,
  "date": ISODate("2013-10-01T00:33:14.000Z")
}

• Example of each JSON type
• Mongo-specific types (shown in blue on the original slide): date
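The six JSON value types can be checked mechanically. The following is a minimal sketch in plain JavaScript; the helper name `jsonType` is an illustration, not part of any Informix or MongoDB API. The Mongo-specific `ISODate(...)` field is left out because it is not valid JSON and `JSON.parse` would reject it.

```javascript
// Classify a parsed JSON value into one of the six JSON value types.
// jsonType is a hypothetical helper written for this illustration.
function jsonType(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  switch (typeof value) {
    case "number": return "number";
    case "string": return "string";
    case "boolean": return "boolean";
    case "object": return "object";
    default: throw new TypeError("not a JSON value: " + typeof value);
  }
}

// The example document from the slide, without the Mongo-specific date.
const doc = JSON.parse(`{
  "string": "John",
  "number": 123.45,
  "boolean": true,
  "array": ["a", "b", "c"],
  "object": { "str": "Miller", "num": 711 },
  "value": null
}`);

// Map each field name to the JSON type of its value.
const types = {};
for (const [key, value] of Object.entries(doc)) {
  types[key] = jsonType(value);
}
console.log(types);
```

Note that `null` and arrays both report `"object"` under JavaScript's `typeof`, which is why they are tested first.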

JSON: JavaScript Object Notation

• What is JSON?
  − JSON is a lightweight text-data interchange format
  − JSON is language independent
  − JSON is "self-describing" and easy to understand
• JSON is a syntax for storing and exchanging text information, much like XML. However, JSON is smaller than XML, and faster and easier to parse.


{"name":"John Miller","age":21,"count":27,"employees": [

{ "f_name":"John" , "l_name":"Doe" },

{ "f_name":"Anna","m_name" : "Marie","l_name":"Smith" },

{ "f_name":"Peter" , "l_name":"Jones" }

]}

BSON is a binary form of JSON.

The Power of JSON Drives Flexible Schema

• JSON key-value pairs enable a flexible schema

Data Access Examples

• Relational Representation

LName  | FName | Address
Miller | John  | 123 Blazer St

• JSON Representation

JSON_string = '{"LName":"Miller","FName":"John","Address":"123 Blazer St"}';

• JavaScript data access

var JSONelems = JSON.parse( JSON_string );
f_name = JSONelems.FName;
l_name = JSONelems.LName;
l_addr = JSONelems.Address;
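To make the "flexible schema" point concrete, here is a small sketch in plain JavaScript. The second document and the helper `fieldNames` are invented for illustration. The two documents sit in the same collection even though they do not share the same set of fields, something a fixed relational row layout would not allow without a schema change.

```javascript
// Two documents in one "collection"; the second one is a made-up example
// that adds MName and Phone and omits Address.
const collection = [
  '{"LName":"Miller","FName":"John","Address":"123 Blazer St"}',
  '{"LName":"Smith","FName":"Anna","MName":"Marie","Phone":"555-0100"}'
].map((text) => JSON.parse(text));

// Collect every field name that appears in any document.
function fieldNames(docs) {
  const names = new Set();
  for (const doc of docs) {
    for (const key of Object.keys(doc)) names.add(key);
  }
  return [...names];
}

const names = fieldNames(collection);

// A missing field is simply absent; the application decides how to treat it.
const addresses = collection.map((doc) =>
  "Address" in doc ? doc.Address : null
);

console.log(names);
console.log(addresses);
```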

Simple Code Example

• Creates the database “db” if it does not exist
• Creates the collection “posts” if it does not exist
• Insert a blog post record by user John

db.posts.insert( { "author": "John", "date": "2013-04-20", "post": "mypost…" } )

• Retrieve all posts by user John

db.posts.find( { "author": "John" } )

Dynamic Elasticity

• Rapid horizontal scalability
  − Ability for the application to grow by adding low-cost hardware to the solution
  − Ability to add or delete nodes dynamically
  − Ability to rebalance the data dynamically
• Application-transparent elasticity

Why Scale Out Instead of Up?

• Scaling Out
  − Adding more servers with fewer processors and less RAM
  − Advantages
    • Startup costs are much lower
    • Can grow in step with the application
    • Individual servers cost less
      − Several less expensive servers rather than fewer high-cost servers
      − Redundant servers cost more
    • Greater chance of isolating catastrophic failures
• Scaling Up
  − Adding more processors and RAM to a single server
  − Advantages
    • Less power consumption than running multiple servers
    • Less infrastructure (network, licensing, ...)

Difference between Sharding Data VS Replication

Shard keys in the example: state = "OR", state = "WA", state = "CA"

Sharding                                                                  | Replication
Each node holds a portion of the data (distributed by hash or expression) | Same data on each node
Inserted data is placed on the correct node                               | Data is copied to all nodes
Actions are shipped to applicable nodes                                   | Work is done on the local copy and modifications are propagated
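The contrast between the two columns can be sketched in a few lines of JavaScript. This is an in-memory toy, not how Informix or MongoDB implement either feature; the node layout and function names are assumptions made for the illustration.

```javascript
// Expression-based sharding: each node owns the rows for one state.
const shardNodes = { OR: [], WA: [], CA: [] };

function shardedInsert(row) {
  const node = shardNodes[row.state];
  if (!node) throw new Error("no shard for state " + row.state);
  node.push(row); // the row lands on exactly one node
}

// Replication: every node receives its own copy of every row.
const replicaNodes = [[], [], []];

function replicatedInsert(row) {
  for (const node of replicaNodes) node.push({ ...row });
}

const rows = [
  { id: 1, state: "OR" },
  { id: 2, state: "WA" },
  { id: 3, state: "CA" },
  { id: 4, state: "OR" }
];
for (const row of rows) {
  shardedInsert(row);
  replicatedInsert(row);
}

// Rows per node under each scheme.
const shardCounts = Object.fromEntries(
  Object.entries(shardNodes).map(([state, node]) => [state, node.length])
);
const replicaCounts = replicaNodes.map((node) => node.length);

console.log(shardCounts);   // each shard holds only its portion
console.log(replicaCounts); // each replica holds all four rows
```

With sharding the total storage stays at four rows spread over three nodes; with replication it is four rows on each node, which is what buys availability rather than growth.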

Motivation of Sharding

• Synergistic with the application strategy
• Start small and grow with commodity hardware
• Sharding is not

Sharding is not for Data Availability

• Sharding is for growth, not availability
• Redundancy of a node provides high availability for the data
  − Both Mongo and Informix allow for multiple redundant nodes
  − Mongo refers to this as Replica Sets and to the additional nodes as slaves
  − Informix refers to this as MACH and to the additional nodes as secondaries
• With Informix the secondary server can:
  − Provide high availability
  − Scale out
    • Execute selects
    • Allow inserts/updates/deletes on the secondary servers
    • Share disks with the master/primary node

Basic Data Distribution/Replication Terms

Term            | Description                                                                                 | Informix Term
Shard           | A single node or a group of nodes holding the same data (replica set)                       | Instance
Replica Set     | A collection of nodes containing the same data                                              | MACH Cluster
Shard Key       | The field that dictates the distribution of the documents. Must always exist in a document. | Shard Key
Sharded Cluster | A group of shards where each shard contains a portion of the data.                          | Grid/Region
Slave           | A server which contains a second copy of the data for read-only processing.                 | Secondary Server, Remote Secondary

Sharding is not for Data Availability




NEW INFORMIX NOSQL/JSON CAPABILITIES

IBM Use Case Characteristics for JSON

High Level Solution


Flexible Schema, Native JSON and BSON

• Two new built-in “first-class” data types called JSON and BSON
• Support for MongoDB client-side APIs throughout the entire database
• Enhanced horizontal scaling by enabling sharding on all database objects and models
• Adaptive default system initialization
  − Up and running in seconds
  − Adapts to the computing environment

Provide native JSON & BSON support in the Informix Database Server

Two New Data Types JSON and BSON

• Native JSON and BSON data types
• Index support for NoSQL data types
• Native operators and comparator functions allow for direct manipulation of the BSON data type
• Database Server seamlessly converts to and from
  − JSON ↔ BSON
  − Character data ↔ JSON

JSON and BSON Data Type Details

• Row locks the individual BSON/JSON document
  − MongoDB must lock the database
• Bigger documents: 2GB maximum size
  − MongoDB caps at 16MB
• Ability to compress documents
  − Not currently available in MongoDB
• Ability to intelligently cache commonly used documents
  − Not currently available in MongoDB

Client Applications

• New Wire Protocol Listener supports existing MongoDB drivers
• Connect to MongoDB or Informix with the same application!

[Diagram: MongoDB native clients, MongoDB web browser applications, and mobile applications connect through a MongoDB driver or the MongoDB shell. Over the MongoDB wire protocol they reach either MongoDB or the IBM NoSQL Wire Protocol Listener, which connects to the Informix DB through the JDBC driver.]

MongoDB Application Compatibility

• Ability to use any of the MongoDB client drivers and frameworks against the Informix Database Server
  − Little to no change required when running MongoDB programs
  − Informix listens on the same default port as Mongo, so there is no need to change it
• Leverage the different programming languages available
  − Language examples: C, C#, Erlang, Java, Node.js, Perl, Python, Ruby

Mongo Action | Description
db.customer.insert( { name: "John", age: 21 } ) | Insert the associated document into the customer collection of database “db”.
db.customer.find( { age: { $gt: 21 } } ) | Retrieve from the customer collection of database “db” all documents with an age greater than 21.

Basic Mongo Operations: Conceptual Operations

Mongo Action | Traditional SQL Action
db.customer.insert( { name: "John", age: 21 } ) | CREATE DATABASE if not exist db; CREATE TABLE if not exist customer; INSERT INTO customer VALUES ( { "name":"John", "age":21 } )
db.customer.find() | SELECT bson_doc FROM customer
db.customer.find( { age: { $gt: 21 } } ) | SELECT * FROM customer WHERE age > 21
db.customer.drop() | DROP TABLE customer
db.customer.ensureIndex( { name: 1, age: -1 } ) | CREATE INDEX idx_1 ON customer(name, age DESC)
db.customer.remove( { age: { $gt: 21 } } ) | DELETE FROM customer WHERE age > 21
db.customer.update( { age: { $gt: 20 } }, { $set: { status: "Drink" } }, { multi: true } ) | UPDATE customer SET bson_doc_field (status) = { "status":"Drink" } WHERE age > 20
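The conceptual mapping in the table can be mimicked with a toy translator. This is only a sketch of the idea in JavaScript: `findToSql`, `whereClause`, and `sqlLiteral` are hypothetical names, only a handful of comparison operators are handled, and the SQL the actual wire listener generates (shown later in this deck) uses BSON expressions rather than plain column names.

```javascript
// Mongo comparison operators mapped to SQL comparison operators.
const OPERATORS = new Map([
  ["$gt", ">"], ["$gte", ">="], ["$lt", "<"], ["$lte", "<="], ["$ne", "<>"]
]);

// Render a number or string as a SQL literal (single quotes doubled).
// A real translator would bind parameters instead of inlining values.
function sqlLiteral(value) {
  if (typeof value === "number") return String(value);
  if (typeof value === "string") return "'" + value.replace(/'/g, "''") + "'";
  throw new TypeError("unsupported literal: " + JSON.stringify(value));
}

// Turn a simple Mongo filter document into a WHERE clause.
function whereClause(filter) {
  const terms = [];
  for (const [field, cond] of Object.entries(filter)) {
    if (cond !== null && typeof cond === "object") {
      for (const [op, value] of Object.entries(cond)) {
        if (!OPERATORS.has(op)) throw new Error("unsupported operator " + op);
        terms.push(`${field} ${OPERATORS.get(op)} ${sqlLiteral(value)}`);
      }
    } else {
      terms.push(`${field} = ${sqlLiteral(cond)}`); // plain equality
    }
  }
  return terms.length ? " WHERE " + terms.join(" AND ") : "";
}

// Conceptual equivalent of db.<collection>.find(filter).
function findToSql(collection, filter = {}) {
  return `SELECT * FROM ${collection}${whereClause(filter)}`;
}

const findSql = findToSql("customer", { age: { $gt: 21 } });
const allSql = findToSql("customer");
console.log(findSql);
console.log(allSql);
```

The first call reproduces the `find( {age: { $gt:21 } } )` row of the table; an empty filter yields a query with no WHERE clause.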

Scaling Out Using Sharded Queries





Find sold cars for all states

1. Request data from local shard
2. Automatically sends request to other shards requesting data
3. Returns results to client

Scaling Out Using Sharded Inserts

Row: state = "OR"

1. Insert row sent to your local shard
2. Automatically forward the data to the proper shard

Scaling Out Using Sharded Delete

Delete: state = "OR" AND Dnum = "123123"

1. Delete condition sent to local shard
2. Local shard determines which shard(s) to forward the operation to
3. Execute the operation

Scaling Out Using Sharded Update

1. Insert a row on your local shard
2. Automatically forward the data to the proper shard

Scaling Out Adding a Shard

Command: Add Shard "NV"

1. Send command to local node
2. New shard dynamically added, data re-distributed (if required)

New shard key: state = "NV"

Sharding with Hash

• Hash-based sharding simplifies the partitioning of data across the shards
• Benefits
  − No data layout planning is required
  − Adding additional nodes is online and dynamic
• Cons
  − Adding an additional node requires data to be moved
• Data is automatically broken into pieces

Shard Key: HASH( gambler_ID )

Scaling Out with Hash Sharding - Insert

Row: Gambler_ID = "11"

1. Insert row sent to your local shard
2. Local shard determines which shard(s) to forward the insert to
3. Execute the operation
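The three steps above can be sketched as a tiny router in JavaScript. Everything here is an assumption made for illustration: the hash function (32-bit FNV-1a over ASCII keys), the modulo placement, and the shard counts are not taken from Informix, whose actual hash and placement rules are not described on these slides.

```javascript
// 32-bit FNV-1a hash of a key; assumes keys are short ASCII strings.
function hashKey(key) {
  let hash = 0x811c9dc5;
  const text = String(key);
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i) & 0xff;
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it unsigned 32-bit
  }
  return hash;
}

// Step 2: decide which shard owns the row.
function shardFor(key, shardCount) {
  return hashKey(key) % shardCount;
}

// Steps 1-3: the local shard receives the row, routes it, and the
// owning shard stores it.
function insertRow(shards, row) {
  const target = shardFor(row.gambler_ID, shards.length);
  shards[target].push(row);
  return target;
}

const shards = [[], [], [], []];
const keys = Array.from({ length: 100 }, (_, i) => String(i + 1));
for (const key of keys) insertRow(shards, { gambler_ID: key });

// The "con" from the slide: with this naive modulo placement, adding a
// fifth shard changes the owner of many existing keys, so data must move.
const moved = keys.filter((key) => shardFor(key, 4) !== shardFor(key, 5)).length;

console.log(shards.map((shard) => shard.length));
console.log(moved + " of " + keys.length + " keys would move");
```

No layout planning is needed because placement depends only on the key, which is the benefit the slide lists; the cost shows up in the `moved` count when the shard count changes.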


Informix NoSQL Cluster Architecture Overview

[Diagram: three cluster configurations]

• Two independent copies of the data and two servers to share the workload. Read/Write activity supported on all servers
• Three independent copies of the data, but four servers to share the workload (two servers share the same disk). Read/Write activity
• Two independent copies of the data, but three servers to share the workload (two servers share the same disks). Read/Write activity


SHARDING NEEDED HERE

INFORMIX NOSQL

DEEP DIVE INTO IMPLEMENTATION

Focus Areas.

• Applications need an easier way to persist objects.
  − Object-relational layers like Hibernate have performance and functional issues.
  − Support the Mongo API and its ecosystem
• Flexible Schema: Changing schema in an RDBMS is a significant operation, especially in production.
  − Needs exclusive access to the table and application downtime
  − Fast-changing schemas and sparse data have issues with a fixed schema.
  − Implement JSON (BSON) type, data store, and indexing
• Scale Out: While RDBMSs have solutions for cluster and MPP, features enabling application development and deployment in a cloud + MPP architecture are limited.
  − Use Informix flexible grid and replication to shard tables
  − Enhance query processing for distributed queries

Client Applications

• New Wire Protocol Listener supports existing MongoDB drivers
• Connect to MongoDB or Informix with the same application!
• IBM Wire Protocol Listener logic is shared with DB2 and Extreme Scale

[Diagram: MongoDB native client applications, MongoDB web browser applications, and the MongoDB shell use a MongoDB driver to connect either to MongoDB or, through the IBM NoSQL Wire Protocol Listener and the JDBC driver, to an Informix NoSQL cluster.]

Informix for SQL and NoSQL Applications

[Diagram: NoSQL apps connect through the IBM Wire Listener over JDBC connections, while SQL apps and tools connect through SQL drivers (ODBC/JDBC). Both reach query processing and distributed queries in Informix Dynamic Server (shard 1 through shard n). Each shard holds tables, JSON collections, indexes, and logs, and the shards are linked by Enterprise Replication + Flexible Grid (ER + Grid).]

[Diagram: the same architecture annotated with the statement flow. The Mongo application sends NoSQL/BSON requests through the IBM Wire Listener, which issues SQL + BSON statements (SELECT, INSERT, UPDATE, DELETE) to the server; SQL apps and tools send SQL directly.]

NoSQL Feature Checklist.

NoSQL Feature Informix Implementation

1. Flexible Schema
   Use the BSON and JSON data types. The complete row is stored in a single column; BSON and JSON are multi-rep types and can store up to 2GB.

2. Accessing KV pairs within JSON
   Translate the Mongo queries into SQL expressions to access key-value pairs. Informix has added expressions to extract specific KV pairs.

   SELECT bson_new(data, "{id:1, name:1, addr:1}") FROM tab;

3. Indexing
   Supports standard B-tree indexes via functional indexing.

   CREATE INDEX itid ON t(bson_value_int(data, "{id:1}"));
   CREATE INDEX itnamestate ON t(bson_value_lvarchar(data, "{name:1}"), bson_value_lvarchar(data, "{city:1}"));

   Informix also supports indexing BSON keys with different data types.

4. Sharding
   Supports range and hash based sharding. Informix has built-in technology for replication. Create identical tables on multiple nodes, then add metadata for partitioning the data across them based on range or hash expressions.

5. SELECT
   Limited support now. The Mongo API helps by disallowing joins. The translated query on a single table is transformed into a federated UNION ALL query; includes shard elimination.

NoSQL Feature Informix Implementation

6. Updates (single node)
   INSERT: Simple direct insert.
   DELETE: DELETE statement with WHERE bson_extract() > ?; or bson_value..() > ?
   UPDATE: bson_update(bson, "update expr") returns a new BSON document after applying the update expression. Simple updates to non-sharded tables will be a direct UPDATE statement in a single transaction.

   UPDATE tab SET bsoncol = bson_update(bsoncol, "expr") WHERE …

7. Updates (sharded env)
   INSERT: All rows are inserted into the LOCAL shard; replication threads read the logs, replicate each row to the right shard, and delete it from the local node (log-only inserts underway).
   DELETE: Do the local delete and replicate the “delete statements” to the target node in the background.
   UPDATE: Slow update via select-delete-insert.

8. Transaction
   Each statement is a distinct transaction in a single-node environment. The data and the operation are replicated via Enterprise Replication.

9. Isolation levels
   A NoSQL session can use any of the Informix isolation levels.
   -- Examples from Mongo applications.

10. Locking
   Applications control locking only on the node they’re connected to. Standard Informix locking semantics apply when data is replicated and applied on the target shard.
   -- Verify.. What’s the default locking level?
   -- Any option to change it? Just ONCONFIG??

Feature Informix Implementation

11. Hybrid access: from Mongo API to relational tables
   Mongo APIs can simply access Informix tables, views, and virtual tables as if they’re JSON collections. In db.foo.find(), foo can be a JSON collection, a relational table, a view, or a virtual table on top of timeseries or WebSphere MQ.

12. Hybrid access: from SQL to JSON data
   1. Directly get binary BSON, or cast to JSON to get it in textual form.
   2. Use expressions to extract specific key-value pairs.
   3. To be done: NoSQL collections will only have one BSON object in the table. We can “imply” the expressions when the SQL refers to a column.

   SELECT t.c1, t.c2 FROM t;
   → SELECT bson_extract(t.data, "{c1:1}"), bson_extract(t.data, "{c2:1}") FROM t;

So, that’s twelve steps for NoSQL Anonymous!

• Clients exchange BSON documents with the server both for queries and data.
• Thus, BSON becomes a fundamental data type.
• The explicit key-value pairs within the JSON/BSON document will be roughly equivalent to columns in relational tables.
• However, there are differences!
  − The type of the KV data encoded within BSON is determined by the client
  − The server is unaware of the data type of each KV pair at table definition time.
  − No guarantees that the data type for each key will remain consistent in the collection.
  − The keys in the BSON document can be arbitrary
  − While customers exploit flexible schema, they’re unlikely to create a single collection and dump everything under the sun into that collection.
  − Due to the limitations of the Mongo/NoSQL API, customers typically denormalize the tables (customer will have customer + customer addr + customer demographics, etc.) to avoid joins.

1. Flexible Schema

• Informix has a new data type, BSON, to store the data.
• Informix also has a JSON data type to convert between binary and text form.
• BSON and JSON are abstract data types (like spatial, etc.).
• BSON and JSON are multi-representational types:
  − Objects up to 4K are stored in data pages.
  − Larger objects (up to 2GB) are stored out of row, in BLOBs.
  − MongoDB limits objects to 16MB.
• This is all seamless and transparent to applications.

1. Flexible Schema – Informix Implementation

CREATE TABLE customer (data BSON);

• BSON is the binary representation of JSON.
  − It has the length and types of the key-value pairs in JSON.
• MongoDB drivers send and receive in BSON form.
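To show what "length and types of the key-value pairs" means on the wire, here is a minimal BSON encoder sketch in JavaScript (Node.js, using the built-in `Buffer`). It follows the public BSON layout for three element types only: string (type byte 0x02), boolean (0x08), and 32-bit integer (0x10). The function names are hypothetical, and storing integers as int32 is a choice made for this sketch; MongoDB drivers may pick a different numeric type.

```javascript
// Encode one key-value pair: type byte, NUL-terminated key, then the value.
function encodeElement(name, value) {
  const key = Buffer.from(name + "\0", "utf8");
  if (typeof value === "string") {
    const text = Buffer.from(value + "\0", "utf8");
    const length = Buffer.alloc(4);
    length.writeInt32LE(text.length); // byte count includes the trailing NUL
    return Buffer.concat([Buffer.from([0x02]), key, length, text]);
  }
  if (typeof value === "boolean") {
    return Buffer.concat([Buffer.from([0x08]), key, Buffer.from([value ? 1 : 0])]);
  }
  if (Number.isInteger(value) && value >= -(2 ** 31) && value < 2 ** 31) {
    const number = Buffer.alloc(4);
    number.writeInt32LE(value);
    return Buffer.concat([Buffer.from([0x10]), key, number]);
  }
  throw new TypeError("type not supported by this sketch");
}

// Encode a flat document: total length, elements, terminating NUL byte.
function encodeDocument(doc) {
  const body = Buffer.concat(
    Object.entries(doc).map(([name, value]) => encodeElement(name, value))
  );
  const size = Buffer.alloc(4);
  size.writeInt32LE(4 + body.length + 1); // length prefix + body + terminator
  return Buffer.concat([size, body, Buffer.from([0x00])]);
}

const encoded = encodeDocument({ hello: "world" });
console.log(encoded.toString("hex"));
console.log(encodeDocument({ name: "John", age: 21 }).length + " bytes");
```

Because every document and every string carries its length up front, a server can skip over fields it does not need without parsing text, which is one reason BSON rather than JSON text is the stored form.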


• We’ll have a number of extract expressions/functions.

• Expressions returning a base type:

bson_value_bigint(bsoncol, "key");
bson_value_lvarchar(bsoncol, "key.key2");
bson_value_date(bsoncol, "key.key2.key3");
bson_value_timestamp(bsoncol, "key");
bson_value_double(bsoncol, "key");
bson_value_boolean(bsoncol, "key");
bson_value_array(bsoncol, "key");
bson_keys_exist(bsoncol, "key");
bson_value_document(bsoncol, "key");
bson_value_binary(bsoncol, "key");
bson_value_objectid(bsoncol, "key");

• Expression returning a BSON subset, used for BSON indices:

bson_extract(bsoncol, "projection specification")

• Expressions to project out of a SELECT statement:

bson_new(bsoncol, "{key1:1, key2:1, key3:1}");
bson_new(bsoncol, "{key5:0}");
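The projection idea behind bson_new can be pictured with plain Python dictionaries. This is only a sketch of the include/exclude behaviour suggested by the two examples above, not the Informix implementation: the helper name `project` is hypothetical, nested keys are ignored, and MongoDB's default handling of `_id` is not modelled.

```python
def project(doc, spec):
    """Return a copy of doc filtered by a Mongo-style projection spec.

    Assumption: a spec with any truthy value keeps only the listed keys,
    and a spec with only 0 values drops the listed keys.
    """
    include = [key for key, flag in spec.items() if flag]
    exclude = [key for key, flag in spec.items() if not flag]
    if include:
        # Inclusion mode: keep listed keys that actually exist in the document.
        return {key: value for key, value in doc.items() if key in include}
    # Exclusion mode: keep everything except the listed keys.
    return {key: value for key, value in doc.items() if key not in exclude}


doc = {"key1": 1, "key2": "two", "key5": 5.0}
print(project(doc, {"key1": 1, "key2": 1, "key3": 1}))  # key3 is absent, so it is skipped
print(project(doc, {"key5": 0}))
```

Both calls return the same two keys here: the first by listing what to keep, the second by listing what to drop.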


Mongo query and the corresponding SQL query:

Mongo: db.customers.find();
SQL: SELECT SKIP ? data::bson FROM customers

Mongo: db.customers.find({},{num:1,name:1});
SQL: SELECT SKIP ? bson_new( data, '{ "num" : 1.0 , "name" : 1.0}')::bson FROM customers

Mongo: db.customers.find({}, {_id:0,num:1,name:1});
SQL: SELECT SKIP ? bson_new( data, '{_id:0.0, "num" : 1.0 , "name" : 1.0}')::bson FROM customers

Mongo: db.customers.find({status:"A"})
SQL: SELECT SKIP ? data FROM customers WHERE bson_extract(data, 'status') = "A"

Mongo: db.customers.find({status:"A"}, {_id:0,num:1,name:1});
SQL: SELECT SKIP ? bson_new( data, '{ "_id" : 0.0 , "num" : 1.0 , "name" : 1.0}')::bson FROM customers WHERE bson_extract(data, 'status') = "A"
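To make the find()-to-SELECT mapping concrete, here is a toy translator that emits SQL text in the style of the examples above. It is a sketch under stated assumptions, not the listener's code: `find_to_sql` is a hypothetical name, only equality predicates are handled, values are passed as bind parameters, and key names are not sanitized.

```python
import json


def find_to_sql(collection, query=None, projection=None):
    """Build (sql, params) for a Mongo-style find(); illustrative only."""
    if projection:
        # The slide's SQL shows projection flags as doubles (1.0 / 0.0).
        spec = json.dumps({key: float(flag) for key, flag in projection.items()})
        select = f"bson_new(data, '{spec}')::bson"
    else:
        select = "data::bson"
    sql = f"SELECT SKIP ? {select} FROM {collection}"
    params = []
    if query:
        # One equality predicate per key, joined with AND.
        predicates = [f"bson_extract(data, '{key}') = ?" for key in query]
        params = list(query.values())
        sql += " WHERE " + " AND ".join(predicates)
    return sql, params


sql, params = find_to_sql("customers", {"status": "A"}, {"_id": 0, "num": 1, "name": 1})
print(sql)
print(params)
```

Keeping the values out of the SQL text mirrors the `= ?` placeholder form that appears later in the indexing slides.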

3. Indexing

• Supports B-Tree indexes on any key-value pairs.

• Indices could be on simple basic type (int, decimal) or BSON

• Indices could be created on BSON and use BSON type comparison

• Listener translates ensureIndex() to CREATE INDEX

• Listener translates dropIndex() to DROP INDEX

Mongo query and the corresponding SQL query:

Mongo: db.customers.ensureIndex({orderDate:1})

SQL: CREATE INDEX IF NOT EXISTS w_x_1 ON w (bson_extract(data,'x') ASC) using bson (ns='{ "name" : "newdb.w.$x_1"}', idx='{ "ns" : "newdb.w" , "key" : {"x" : [ 1.0 , "$extract"]} , "name" : "x_1" , "index" : "w_x_1"}') EXTENT SIZE 64 NEXT SIZE 64

Mongo: db.customers.ensureIndex({orderDate:1, zip:-1})

SQL: CREATE INDEX IF NOT EXISTS v_c1_1_c2__1 ON v (bson_extract(data,'c1') ASC, bson_extract(data,'c2') DESC) using bson (ns='{ "name" : "newdb.v.$c1_1_c2__1"}', idx='{ "ns" : "newdb.v" , "key" : { "c1" : [ 1.0 , "$extract"] , "c2" : [ -1.0 , "$extract"]} , "name" : "c1_1_c2__1" , "index" : "v_c1_1_c2__1"}') EXTENT SIZE 64 NEXT SIZE 64

Mongo: db.customers.ensureIndex({orderDate:1}, {unique:true})

SQL: CREATE UNIQUE INDEX IF NOT EXISTS v_c1_1_c2__1 ON v (bson_extract(data,'c1') ASC, bson_extract(data,'c2') DESC) using bson (ns='{ "name" : "newdb.v.$c1_1_c2__1"}', idx='{ "ns" : "newdb.v" , "key" : { "c1" : [ 1.0 , "$extract"] , "c2" : [ -1.0 , "$extract"]} , "name" : "c1_1_c2__1" , "unique" : true , "index" : "v_c1_1_c2__1"}') EXTENT SIZE 64 NEXT SIZE 64
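The index names in the generated statements (w_x_1, v_c1_1_c2__1) appear to follow a simple pattern: each key is joined to its direction, and the minus sign of a descending key becomes an underscore. The sketch below reproduces that pattern as inferred from the examples only; `index_names` is a hypothetical helper and the real listener may apply further rules.

```python
def index_names(collection, keys):
    """Derive the short name, the SQL index name, and the key clause.

    keys is a list of (key, direction) pairs with direction 1 or -1.
    The naming rule is an assumption inferred from the slide's examples.
    """
    parts = []
    clauses = []
    for key, direction in keys:
        # "c2" with -1 becomes "c2_-1", then "c2__1" once "-" is replaced.
        parts.append(f"{key}_{direction}".replace("-", "_"))
        order = "ASC" if direction > 0 else "DESC"
        clauses.append(f"bson_extract(data,'{key}') {order}")
    short = "_".join(parts)
    return short, f"{collection}_{short}", ", ".join(clauses)


short, full, clause = index_names("v", [("c1", 1), ("c2", -1)])
print(f"CREATE INDEX IF NOT EXISTS {full} ON v ({clause}) using bson")
```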

3. Indexing

db.w.find({x:1,z:44},{x:1,y:1,z:1})

Translates to:

SELECT bson_new( data, '{ "x" : 1.0 , "y" : 1.0 , "z" : 1.0}')::bson

FROM w

WHERE ( bson_extract(data, 'x') = '{ "x" : 1.0 }'::json::bson ) AND

( bson_extract(data, 'z') = '{ "z" : 44.0 }'::json::bson)

Estimated Cost: 2

Estimated # of Rows Returned: 1

1) keshav.w: SEQUENTIAL SCAN

Filters: (informix.equal(informix.bson_extract(keshav.w.data ,'z' ),UDT )

AND informix.equal(informix.bson_extract(keshav.w.data ,'x' ),UDT))

3. Indexing

•Functional Index is built on bson expressions

CREATE INDEX IF NOT EXISTS w_x_1 ON w (bson_extract(data,'x') ASC)

using bson (ns='{ "name" : "newdb.w.$x_1"}',

idx='{ "ns" : "newdb.w" , "key" : {"x" : [ 1.0 , "$extract"]} , "name"

: "x_1" , "index" : "w_x_1"}')

EXTENT SIZE 64 NEXT SIZE 64

• The listener is aware of the available indexes and therefore generates the right predicates.

db.w.find({x:1});

gets translated to

SELECT SKIP ? data FROM w WHERE bson_extract(data, 'x') = ?

3. Indexing

db.w.find({x:5,z:5}, {x:1,y:1,z:1})

Translates to:

SELECT bson_new( data, '{ "x" : 1.0 , "y" : 1.0 , "z" : 1.0}')::bson FROM w WHERE ( bson_extract(data, 'x') = '{ "x" : 5.0 }'::json::bson ) AND ( bson_extract(data, 'z') = '{ "z" : 5.0 }'::json::bson )

Estimated Cost: 11
Estimated # of Rows Returned: 1

1) keshav.w: INDEX PATH

Filters: informix.equal(informix.bson_extract(keshav.w.data ,'z' ), UDT )

(1) Index Name: keshav.w_x_1
Index Keys: informix.bson_extract(data,'x') (Serial, fragments: ALL)
Lower Index Filter: informix.equal(informix.bson_extract(keshav.w.data ,'x' ),UDT )

4. Mongo Sharding

4. Mongodb SHARDING (roughly)

• Shard a single table by range or hashing.

• Mongos directs each INSERT to the target shard.

• Mongos tries to eliminate shards for updates, deletes, and selects as well.

• FIND (SELECT) can run against only a single table.

• Mongos also works as the coordinator for multi-node operations.

• Once a row is inserted into a shard, it remains there despite any key update.

• No transactional support on multi-node updates.

− Each document update stands on its own.

4. Informix Sharding

[Diagram: an app server with the Mongo driver connects through JDBC listeners to six Informix servers (Informix/1 to Informix/6) linked by Enterprise Replication + Flexible Grid. Each server n holds the shards Location/n, Sales/n, and Customer/n.]

4. Informix sharding + High Availability


[Diagram: JDBC listeners sit in front of Enterprise Replication + Flexible Grid. Each of the six shards (Informix/1 to Informix/6) has a Primary, an SDS/HDR secondary, and an RSS secondary.]

4. Sharding – Informix Implementation

• Shard a single table by range or hashing.

cdr define shard myshard mydb:usr1.mytab \
  --type=delete --key="bson_get(bsoncol, 'STATE')" --strategy=expression \
  --versionCol=version \
  servA "in ('TX', 'OK')" \
  servB "in ('NY', 'NJ')" \
  servC "in ('AL', 'KS')" \
  servD remainder

cdr define shard myshard mydb:usr1.mytab \
  --type=delete --key=state --strategy=hash --versionCol=version \
  servA servB servC servD

• Shard a single table by range or hashing. <<Mongo syntax>>
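The two strategies in the cdr commands above can be sketched as routing functions. This is a minimal illustration assuming the same server names and state lists; the function names are hypothetical, and CRC32 merely stands in for whatever hash function Informix actually uses.

```python
import zlib

# Expression strategy: each server owns an explicit list of key values.
EXPRESSION_SHARDS = {
    "servA": {"TX", "OK"},
    "servB": {"NY", "NJ"},
    "servC": {"AL", "KS"},
}
REMAINDER = "servD"  # receives every row that matches no expression

# Hash strategy: rows are spread over all servers by a hash of the key.
HASH_SERVERS = ["servA", "servB", "servC", "servD"]


def route_by_expression(doc):
    """Pick the server whose expression matches the STATE key."""
    state = doc.get("STATE")
    for server, states in EXPRESSION_SHARDS.items():
        if state in states:
            return server
    return REMAINDER


def route_by_hash(doc):
    """Pick a server from a stable hash of the state key (CRC32 is an assumption)."""
    key = str(doc.get("state")).encode("utf-8")
    return HASH_SERVERS[zlib.crc32(key) % len(HASH_SERVERS)]


print(route_by_expression({"STATE": "NY"}))
print(route_by_expression({"STATE": "CA"}))
print(route_by_hash({"state": "CA"}))
```

The expression strategy keeps related rows together and makes shard elimination easy for equality predicates; the hash strategy trades that locality for an even spread.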

4. Sharding – Informix Implementation

•Shard a single table by range or hashing.

•Sharding is transparent to application.

•Each CRUD statement will only touch a single table.

•Limitation of MongoDB…Makes it easier for Informix.

•Lack of joins is a big limitation for SQL applications.

•Lacks transactional support for distributed update.

4. Informix Sharding

• An identical table is created on each node, and metadata is replicated to each node.

• Schema-based replication is the foundation for our sharding.

• CRUD operations can go to any node.

− We’ll use replication and other techniques to reflect the data in the target node, eventually.

− Right now, replication is asynchronous.

− Informix has synchronous replication, but it is not used for sharding now.

5. SELECT on sharded tables

•The query can be submitted to any of the nodes via the listener.

•That node acts as the “coordinator” for the distributed query.

•It also does the node elimination based on the query predicate.

• After that, the query is transformed into a UNION ALL query.

SELECT SKIP ? bson_new( data, '{"_id":0.0 ,"num":1.0 ,"name" : 1.0}')::bson
FROM customers@rsys1:db
WHERE bson_extract(data, 'name') = "A" or bson_extract(data, 'name') = "X"

is transformed into:

SELECT SKIP ? bson_new( data, '{"_id":0.0 ,"num":1.0 ,"name" : 1.0}')::bson



UNION ALL

SELECT SKIP ? bson_new( data, '{"_id":0.0 ,"num":1.0 ,"name" : 1.0}')::bson



6. INSERT: Single node

•If necessary, automatically create database & table (collection) on the application INSERT

• Collections: CREATE TABLE t(a GUID, d BSON);

• The GUID column is needed to ensure a unique row across the shards. It is also used as the PK for replication.

• The client application inserts JSON; the client API converts it into BSON, generates _id (object id) if necessary, and sends it to the server over JDBC/ODBC.

• The server saves the data into this table as BSON, with an automatically generated GUID.

6. DELETE: Single node

• Mongo remove() calls are translated to SQL DELETE.

• Will always remove all the qualifying rows.

• WHERE clause translation is the same as for SELECT.

6. UPDATE: Single node

•Simple set, increment, decrement updates are handled directly by the server.

Mongo: db.w.update({x: {$gt:55}}, {$set:{z:9595}});

SQL: UPDATE w SET data = bson_update(data, "{$set:{z:9595}}") WHERE bson_extract(data, 'x') > "{x:55}"::bson ?

•bson_update is a built-in expression updating a BSON document with a given set of operations.

•Complex updates are handled via select batch, delete, insert.

•Always updates all the rows or no rows, under a transaction.
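The effect of a simple $set or increment update can be modelled on Python dictionaries. This is a sketch of the semantics only, not of bson_update itself; `apply_update` and `update_many` are hypothetical helpers, and a decrement is expressed as `$inc` with a negative value.

```python
import copy


def apply_update(doc, ops):
    """Return a new document with $set and $inc operations applied."""
    result = copy.deepcopy(doc)
    for key, value in ops.get("$set", {}).items():
        result[key] = value
    for key, delta in ops.get("$inc", {}).items():
        # A missing key is treated as 0 before incrementing.
        result[key] = result.get(key, 0) + delta
    return result


def update_many(collection, predicate, ops):
    """Apply ops to every qualifying document, leaving the others untouched."""
    return [apply_update(doc, ops) if predicate(doc) else doc for doc in collection]


w = [{"x": 10, "z": 1}, {"x": 60, "z": 2}, {"x": 99}]
# Equivalent in spirit to: db.w.update({x: {$gt: 55}}, {$set: {z: 9595}})
updated = update_many(w, lambda doc: doc.get("x", 0) > 55, {"$set": {"z": 9595}})
print(updated)
```

Because `update_many` builds a new list, the caller can swap it in all at once, which loosely mirrors the "all the rows or no rows" behaviour described above.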

7. INSERT: shard implementation

•If the table is sharded:

•Insert the data into local table as usual.

• Replication threads in the background evaluate the log records to see which rows must move.

• A replication thread moves each such row to its target shard and deletes it from the local table.

• Work is underway to avoid the insert into the local table.

• For each inserted row that ends up in a non-local shard, simply generate the logical log record and avoid the insert into the local table and its indices.

7. DELETE : shard implementation

•Application delete could delete rows from any of the shards in the table.

• DELETE will come in as a shard_delete() procedure.

– Execute procedure shard_delete(tabname, delete_stmt);

– This procedure will issue the delete locally.

– It will then INSERT the delete statement into a “special” shard delete table (a single one for all tables).

–Enterprise replication will propagate the delete to applicable target systems.

7. UPDATE: shard implementation

• When the update has to be done on multiple nodes:

• The client application issues an UPDATE, and the client API converts that into three operations: SELECT, DELETE, and INSERT.

•GUID column is needed to ensure unique row across the SHARD

•Client application inserts JSON, client API converts this into BSON & sends to server.

8. Transactions (single node)

• Mongo does not have the notion of transactions.

− Each document update is atomic, but not the application statement.

• For the first release of Informix-NoSQL:

− By default, the JDBC listener simply uses the AUTO COMMIT option.

− Each server operation (INSERT, UPDATE, DELETE, SELECT) is automatically committed after it completes.

− No locks are held across multiple operations.

− However, customers can create multi-statement transactions via $sql, issuing begin work, commit work, and rollback work.

8. Transactions (sharded environment)

•In sharded environment, mongo runs database via two different instances: mongos and mongod.

− Mongos simply redirects operations to the relevant mongod.

− No statement-level transactional support.

• Informix

− Informix does not have the 2-layer architecture.

− The Informix server the application is connected to becomes the transaction coordinator.

− Informix does have 2-phase commit transaction support, but it is unused for NoSQL for now.

− SELECT statements go through the distributed query infrastructure.

− INSERT, UPDATE, and DELETE go through Enterprise Replication.

9. ISOLATION levels

•Default isolation level is DIRTY READ.

• Change this directly or via sysdbopen().

•You can also use USELASTCOMMITTED variable in ONCONFIG.

•If you’re using procedures for executing multi-statement transaction, you can set it within your procedure.

10. LOCKING

•Page level locking is the default.

•You can change it to ROW level locks easily.

•ALTER TABLE jc MODIFY lock mode (row)

• DEF_TABLE_LOCKMODE onconfig variable.

• SET LOCK MODE can be set via sysdbopen()

•Each statement is executed with auto-commit and locking semantics will apply there.

Hybrid Power

Hybrid Access between relational & JSON Collections

[Diagram: the SQL API (standard ODBC, JDBC, .NET, OData, etc.; language: SQL) and the MongoDB API (NoSQL; Mongo APIs for Java, JavaScript, C++, C#, ...) both reach relational tables and JSON collections.]

Benefits of Hybrid Power

� Access consistent data from its source

� Avoid ETL, continuous data sync and conflicts.

� Exploit the power of SQL, MongoAPI seamlessly

� Exploit the power of RDBMS technologies in MongoAPI:

− Informix Warehouse accelerator,

− Cost based Optimizer

− R-tree indices for spatial, Lucene text indexes, and more.

� Access all your data thru any interface: MongoAPI or SQL.

� Store data in one place and efficiently transform and use them on demand.

� Existing SQL based tools and APIs can access new data in JSON

Why do you need hybrid access?

Data model should not restrict data access.

[Diagram: the SQL API reaches JSON collections through direct SQL access and dynamic views; the MongoDB API (NoSQL; Mongo APIs for Java, JavaScript, C++, C#, ...) reaches relational tables as row types.]

Hybrid Power

MongoAPI to relational data

Hybrid access: From MongoAPI to relational tables.

You want to develop an application with MongoAPI, but...

1. You already have relational tables with data.

2. You have views on relational data

3. You need to join tables

4. You need queries with complex expressions. E.g. OLAP window functions.

5. You need to get results from a stored procedure.

6. You need to exploit Informix stored procedure

7. You need federated access to other data

8. You have timeseries data.

How to treat relational data as JSON store.

• Relational data (relations or result sets) can be treated as structured JSON documents; each column name-value becomes a key-value pair.

• SELECT partner, pnum, country from partners;

partner    pnum  country
Pronto     1748  Australia
Kazer      1746  USA
Diester    1472  Spain
Consultix  1742  France

{partner: "Pronto", pnum:"1748", Country: "Australia"}

{partner: "Kazer", pnum:"1746", Country: "USA"}

{partner: "Diester", pnum:"1472", Country: "Spain"}

{partner: "Consultix", pnum:"1742", Country: "France"}

• The listener translates the query and the data object between relational and JSON/BSON form.
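The row-to-document mapping described above amounts to pairing each column name with its value. A minimal sketch, assuming the four partner rows shown; `rows_to_documents` is a hypothetical helper, and unlike the slide it keeps pnum as an integer and uses the lowercase column names.

```python
import json

columns = ["partner", "pnum", "country"]
rows = [
    ("Pronto", 1748, "Australia"),
    ("Kazer", 1746, "USA"),
    ("Diester", 1472, "Spain"),
    ("Consultix", 1742, "France"),
]


def rows_to_documents(columns, rows):
    """Turn each result-set row into a document keyed by column name."""
    return [dict(zip(columns, row)) for row in rows]


docs = rows_to_documents(columns, rows)
for doc in docs:
    print(json.dumps(doc))
```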

[Diagram: a Mongo application connects to the IBM Wire Listener, which uses JDBC connections to Informix Dynamic Server (tables, indexes, logs, distributed queries). Both the JSON collection (Customer) and the relational table (partners) are returned to the application as JSON.]

Access JSON:

db.customer.find({state:"MO"})
SELECT bson_new(bson, '{}') FROM customer WHERE bson_value_lvarchar(bson, 'state') = "MO"

Access relational:

db.partners.find({state:"CA"})
SELECT * FROM partners WHERE state = "CA"


Accessing data in relational tables.

create table partners(pnum int, name varchar(32), country varchar(32));

db.partners.find();
SELECT * FROM partners;

db.partners.find({name:"Pronto"});
SELECT * FROM partners WHERE name = "Pronto";

db.partners.find({name:"Pronto"}, {pnum:1, country:1});
SELECT pnum, country FROM partners WHERE name = "Pronto";

db.partners.find({name:"Pronto"}, {pnum:1, country:1}).limit(10);
SELECT LIMIT 10 pnum, country FROM partners WHERE name = "Pronto";

db.partners.find({name:"Pronto"}, {pnum:1, country:1}).sort({pnum:1})
SELECT pnum, country FROM partners WHERE name = "Pronto" ORDER BY pnum ASC

db.partners.find({name:"Pronto"}, {pnum:1, country:1}).sort({pnum:-1})
SELECT pnum, country FROM partners WHERE name = "Pronto" ORDER BY pnum DESC

db.t.find({a:{$gt:1}}, {a:1, b:1}).sort({b:-1})
SELECT SKIP ? a, b FROM t WHERE a > 1.0 ORDER BY b DESC

Accessing data in relational tables.

• db.partners.save({pnum:1632, name:"EuroTop", country:"Belgium"});

• INSERT into partners(pnum, name, country) values(1632, "EuroTop", "Belgium");

• db.partners.remove({name:"Artics"});

• DELETE FROM partners WHERE name = "Artics";

• db.partners.update({country:"Holland"}, {$set:{country:"Netherland"}}, {multi: true});

• UPDATE partners SET country = "Netherland" WHERE country = "Holland";

• db.partners.drop();

• DROP TABLE partners;

Views and JOINS

• A view is a dynamically created relation built from one or more database tables or procedures.

create table pcontact(pnum int, name varchar(32), phone varchar(32));

insert into pcontact values(1748, "Joe Smith", "61-123-4821");

insert into pcontact values(1746, "John Kelley", "1-729-284-2893");

insert into pcontact values(1472, "Ken Garcia", "34-829-2842");

insert into pcontact values(1742, "Adam Roy", "33-380-3892");

create view partnerphone(pname, pcontact, pphone) as
select a.name, b.name, b.phone
FROM pcontact b left outer join partners a on (a.pnum = b.pnum);

� db.partnerphone.find();

{ "pname" : "Pronto", "pcontact" : "Joe Smith", "pphone" : "61-123-4821" }

{ "pname" : "Kazer", "pcontact" : "John Kelley", "pphone" : "1-729-284-2893" }

{ "pname" : "Diester", "pcontact" : "Ken Garcia", "pphone" : "34-98-829-2842" }

{ "pname" : "Consultix", "pcontact" : "Adam Roy", "pphone" : "33-82-380-3892" }

� db.partnerphone.find({pname:"Pronto"})

{ "pname" : "Pronto", "pcontact" : "Joe Smith", "pphone" : "61-123-4821" }
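The same view-then-find pattern can be tried locally with Python's built-in sqlite3 module. SQLite stands in for Informix here purely for illustration, and the `find` helper is a hypothetical stand-in for the listener: it handles equality filters only and trusts the view and key names it is given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
conn.executescript("""
CREATE TABLE partners(pnum INT, name VARCHAR(32), country VARCHAR(32));
CREATE TABLE pcontact(pnum INT, name VARCHAR(32), phone VARCHAR(32));
INSERT INTO partners VALUES (1748, 'Pronto', 'Australia');
INSERT INTO partners VALUES (1746, 'Kazer', 'USA');
INSERT INTO pcontact VALUES (1748, 'Joe Smith', '61-123-4821');
INSERT INTO pcontact VALUES (1746, 'John Kelley', '1-729-284-2893');
CREATE VIEW partnerphone(pname, pcontact, pphone) AS
  SELECT a.name, b.name, b.phone
  FROM pcontact b LEFT OUTER JOIN partners a ON (a.pnum = b.pnum);
""")


def find(view, query=None):
    """Return the rows of a table or view as documents, Mongo find() style."""
    query = query or {}
    sql = f"SELECT * FROM {view}"
    if query:
        sql += " WHERE " + " AND ".join(f"{key} = ?" for key in query)
    return [dict(row) for row in conn.execute(sql, list(query.values()))]


print(find("partnerphone", {"pname": "Pronto"}))
```

The application never sees the join: it queries the view by name, exactly as it would a collection.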

Complex expressions, e.g. OLAP window functions

create view contactreport(pname, pcontact, totalcontacts) as

select b.name, a.name,

count(a.name) over(partition by b.pnum)

from pcontact a left outer join partners b on (a.pnum = b.pnum);

db.contactreport.find({pname:"Pronto"})

{ "pname" : "Pronto", "pcontact" : "Joel Garner", "totalcontacts" : 2 }

{ "pname" : "Pronto", "pcontact" : "Joe Smith", "totalcontacts" : 2 }

Seamless federated access

1. create database newdb2;

2. create synonym oldcontactreport for newdb:contactreport;

> use newdb2

> db.oldcontactreport.find({pname:"Pronto"})

{ "pname" : "Pronto", "pcontact" : "Joel Garner", "totalcontacts" : 2 }

{ "pname" : "Pronto", "pcontact" : "Joe Smith", "totalcontacts" : 2 }

SELECT data FROM oldcontactreport WHERE bson_extract(data, 'pname') = "Pronto";

• create synonym oldcontactreport for custdb@nydb:contactreport;

Get results from a stored procedure.

create function "keshav".p6() returns int, varchar(32);
define x int; define y varchar(32);
foreach cursor for select tabid, tabname into x,y from systables
return x,y with resume;
end foreach;
end function;

create view "keshav".v6 (c1,c2) as

select x0.c1 ,x0.c2 from table(function p6())x0(c1,c2);

� db.v6.find().limit(5)

{ "c1" : 1, "c2" : "systables" }

{ "c1" : 2, "c2" : "syscolumns" }

{ "c1" : 3, "c2" : "sysindices" }

{ "c1" : 4, "c2" : "systabauth" }

{ "c1" : 5, "c2" : "syscolauth" }

Access Timeseries data

create table daily_stocks

( stock_id integer,

stock_name lvarchar,

stock_data timeseries(stock_bar)

);

-- Create virtual relational table on top (view)

EXECUTE PROCEDURE

TSCreateVirtualTab('daily_stocks_virt',

'daily_stocks', 'calendar(daycal),origin(2011-01-03 00:00:00.00000)' );

create table daily_stocks_virt

( stock_id integer,

stock_name lvarchar,

timestamp datetime year to fraction(5),

high smallfloat,


db.daily_stocks_virt.find()

{ "stock_id" : 901, "stock_name" : "IBM", "timestamp" : ISODate("2011-01-03T06:00:00Z"), "high" : 356, "low" : 310, "final" : 340, "vol" : 999 }





{ "stock_id" : 902, "stock_name" : "HPQ", "timestamp" : ISODate("2011-01-03T06:0


{ "stock_id" : 902, "stock_name" : "HPQ", "timestamp" : ISODate("2011-01-04T06:0



db.daily_stocks_virt.find({stock_name:"IBM"})

{ "stock_id" : 901, "stock_name" : "IBM", "timestamp" : ISODate("2011-01-03T06:0






db.daily_stocks_virt.find({stock_name:"IBM"}).sort({final:-1})







mongos>


You want to develop an application with MongoAPI, but...

1. You already have relational tables with data.

2. You have views on relational data

3. You need to join tables

4. You need queries with complex expressions. E.g. OLAP window functions.

5. You need to get results from a stored procedure.

6. You need to exploit Informix stored procedure

7. You need federated access to other data

8. You have timeseries data.

Put all of these together!

Hybrid Power

Schemaless SQL

Hybrid access: From SQL API to JSON Collections.

[Diagram: SQL applications use JDBC connections to reach Informix (tables, indexes, logs, distributed queries), which holds both relational tables (partners) and JSON collections (Customer).]

Access JSON:

SELECT bson_new(bson, '{customer:1}') FROM customer WHERE bson_value_lvarchar(bson, 'state') = "MO"

Access relational:

SELECT * FROM partners WHERE state = "CA";

Join JSON collections

JSON Collection V:

{ "_id" : ObjectId("526a1bb1e1819d4d98a5fa4b"), "c1" : 1, "c2" : 2 }

JSON Collection w:

{ "_id" : ObjectId("526b005cfb9d36501fe31605"), "x" : 255, "y" : 265, "z" : 395}

{ "_id" : ObjectId("52716cdcf278706cb7bf9955"), "x" : 255, "y" : 265, "z" : 395}

{ "_id" : ObjectId("5271c749fa29acaeba572302"), "x" : 1, "y" : 2, "z" : 3 }

SELECT c1, c2, x,y,z

FROM V, W

WHERE V.c1 = W.x;

Join JSON collections

-- Wrap around the built-in expressions

SELECT bson_value_int(jc1.data, 'x'),

bson_value_lvarchar(jc1.data, 'y'),

bson_value_int(jc1.data, 'z') ,

bson_value_int(jc2.data, 'c1'),

bson_value_lvarchar(jc2.data, 'c2')

FROM w jc1, v jc2

WHERE bson_value_int(jc1.data, 'x') = bson_value_int(jc2.data, 'c1');
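What this join computes can be shown on the sample documents of collections v and w, with Python dictionaries in place of BSON. This is a sketch of the result only: the hash-join approach and the `join` helper are assumptions for illustration, and the Informix optimizer chooses its own plan.

```python
v = [{"c1": 1, "c2": 2}]
w = [
    {"x": 255, "y": 265, "z": 395},
    {"x": 255, "y": 265, "z": 395},
    {"x": 1, "y": 2, "z": 3},
]


def join(left, right, left_key, right_key):
    """Equi-join two lists of documents on left_key = right_key."""
    index = {}
    for doc in right:
        # Documents without the key are skipped, much as a NULL never matches.
        if right_key in doc:
            index.setdefault(doc[right_key], []).append(doc)
    return [
        {**left_doc, **right_doc}
        for left_doc in left
        if left_key in left_doc
        for right_doc in index.get(left_doc[left_key], [])
    ]


result = join(w, v, "x", "c1")
print(result)
```

Only the w document with x = 1 finds a partner in v, so the result holds a single merged document.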

Join JSON collections

-- Create a view to make the access simple.

create view vwjc(jc1x, jc1y, jc1z, jc2c1, jc2c2) as

SELECT bson_value_int(jc1.data, 'x'),

bson_value_lvarchar(jc1.data, 'y'),

bson_value_int(jc1.data, 'z') ,

bson_value_int(jc2.data, 'c1'),

bson_value_lvarchar(jc2.data, 'c2')

FROM w jc1, v jc2

WHERE bson_value_int(jc1.data, 'x') =

bson_value_int(jc2.data, 'c1');

• db.vwjc.find();

• db.vwjc.find({jc1x:100});

• db.vwjc.find({jc1x:{$gte:10}}).sort({jc1y:1})

You want to perform complex analytics on JSON data

� BI Tools like Cognos, Tableau generate SQL on data sources.

� Option 1: Do ETL

� Need to expose JSON data as views so it’s seen as a database object.

− We use implicit casting to convert to compatible types

− The references to non-existent key-value pair returns NULL

� Create any combination of views

− A view per JSON collection

− Multiple views per JSON collection

− Views joining JSON collections, relational tables and views.

� Use these database objects to create reports, graphs, etc.

[Diagram: SQL and BI applications connect over ODBC/JDBC to Informix, which holds relational tables (Orders, CRM, Inventory, partners) and JSON collections (Customer) exposed as tables and views, with analytics served by Informix Warehouse Accelerator.]

Ability for MongoDB API to Access All Data Models

[Diagram: a node.js application using the MongoDB drivers, alongside native client, web browser, and mobile applications, reaches Traditional SQL, NoSQL (JSON), TimeSeries, and MQ Series data. The drivers shown are the Informix SQLI drivers, the MongoDB drivers, and the IBM DRDA drivers.]

All three drivers support all data models

Benefits of Hybrid Power

�Access consistent data from its source

�Avoid ETL, continuous data sync and conflicts.

�Exploit the power of SQL, MongoAPI seamlessly

�Exploit the power of RDBMS technologies via MongoAPI:

− Informix Warehouse accelerator,

− Cost based Optimizer & power of SQL

− R-tree indices for spatial, Lucene text indexes, and more.

�Access all your data thru any interface: MongoAPI & SQL

�Store data in one place and efficiently transform and use them on demand.

�Existing SQL based tools and APIs can access new data in JSON

---------John----------

Simple SQL statements which Operate on the BSON


CREATE TABLE mycollection (data BSON);

INSERT INTO mycollection (data) VALUES ( '{"name":"John"}'::JSON );

CREATE INDEX myindx ON mycollection(bson_extract(data,"name")) USING BSON;

SELECT data::JSON FROM mycollection;

The SQL to Create a Collection

� Formal definition of a collection used by the Wire Listener


CREATE COLLECTION TABLE mycollection(id char(128),modcnt integer,data bson,flags integer);

New Built-in BSON Expressions

bson_value_array(lvarchar doc, lvarchar key) returns lvarchar as BSON array

bson_value_bigint(lvarchar doc, lvarchar key) returns bigint

bson_value_binary(lvarchar doc, lvarchar key) returns lvarchar as BSON binary

bson_value_boolean(lvarchar doc, lvarchar key) returns boolean

bson_value_code(lvarchar doc, lvarchar key) returns lvarchar as string

bson_value_date(lvarchar doc, lvarchar key) returns datetime

bson_value_document(lvarchar doc, lvarchar key) returns lvarchar as BSON object

bson_value_double(lvarchar doc, lvarchar key) returns float

bson_value_int(lvarchar doc, lvarchar key) returns bigint

bson_value_lvarchar(lvarchar doc, lvarchar key) returns lvarchar as string

bson_value_objectid(lvarchar doc, lvarchar key) returns lvarchar as string

bson_value_timestamp(lvarchar doc, lvarchar key) returns datetime

bson_keys_exist(lvarchar doc, lvarchar key(s)) returns Boolean

bson_new(lvarchar doc, lvarchar projection_list) return lvarchar as BSON binary

bson_type(lvarchar doc, lvarchar key) return integer

bson_size(lvarchar doc, lvarchar key) return integer

bson_extract(lvarchar doc, lvarchar) return lvarchar as BSON binary

Combining Tables and Collections


SELECT customer.*, bson_value_lvarchar(data,"customer_code") AS customer_code
FROM mycollection, customer
WHERE bson_value_lvarchar(data,"fname") = customer.fname
AND bson_value_lvarchar(data,"lname") = customer.lname
AND customer_num < 102;

QUERY: (OPTIMIZATION TIMESTAMP: 07-21-2013 23:15:31)

Estimated Cost: 6
Estimated # of Rows Returned: 10

1) miller3.customer: INDEX PATH

(1) Index Name: miller3. 100_1
Index Keys: customer_num (Serial, fragments: ALL)
Upper Index Filter: miller3.customer.customer_num < 102

2) miller3.mycollection: INDEX PATH

Filters: informix.equal(BSON_VALUE_LVARCHAR (miller3.mycollection.data , 'fname' ) ,miller3.customer.fname )

(1) Index Name: miller3.mycollection_ix1
Index Keys: informix.bson_value_lvarchar(data,'lname') (Serial, fragments: ALL)
Lower Index Filter: equal(BSON_VALUE_LVARCHAR (miller3.mycollection.data , 'lname' ) ,miller3.customer.lname )

NESTED LOOP JOIN

Query Optimizer

Understanding Informix BSON Indexes

• Indexes are created on BSON data and support:

– Primary Key (enforced across all nodes)

– Unique Indexes (enforced at a single node level)

– Composite Indexes (8 parts)

– Arrays

{ "fname":"Sadler", "lname":"Sadler", "company":"Friends LLC", "age":21, "count":27, "phone": [ "408-789-1234", "408-111-4779" ] }

create index fnameix1 on customer(bson_value_lvarchar(bson,"fname")) using bson;

create index lnameix2 on customer(bson_value_lvarchar(bson,"lname")) using bson;

create index phoneix3 on customer(bson_value_lvarchar(bson,"phone")) using bson;

Understanding Informix BSON Indexes


create index fnameix1 on customer(bson_value_lvarchar(bson,"fname")) using bson;

create index lnameix2 on customer(bson_value_lvarchar(bson,"lname")) using bson;

create index phoneix3 on customer(bson_value_lvarchar(bson,"phone")) using bson;

select * from customer where bson_value_lvarchar(bson,"fname") = "Ludwig";

-- use fnameix1

select * from customer where bson_value_lvarchar (bson,"lname") = "Sadler";

-- use lnameix2

select * from customer where bson_value_lvarchar(bson,"phone") = "408-789-8091";

-- use phoneix3

select * from customer where bson_value_lvarchar(bson,"phone") = "415-822-1289"
OR bson_value_lvarchar(bson,"phone") = "408-789-8091";

-- use phoneix3

select * from customer where bson_value_lvarchar(bson,"company") = "Los Altos Sports";

-- no index on company, so a sequential scan is used


Informix 12.1

Convert Traditional SQL Tables to JSON

• Mongo API

– Leverage the numerous open source drivers for Mongo

– Extend the power of the Mongo API to execute on:

• Traditional SQL tables

• TimeSeries/Sensor Data


Not just a Hybrid Database

� Hybrid databases only solve half of the problem

• Hybrid applications are here to stay

Scaling Out – Sharding Data

Shard Key: state = "CA"

• Each node in the environment holds a portion of the data

• Shard by either hash or expression

• When data is inserted, it is automatically moved to the correct node

• Queries automatically aggregate data from the required node(s)

Graphical Administration showing a Sharded System

• Graphical administration of the shards keeps things simple

� Monitor the entire environment from a single graphical console

Simply Administration - OAT Manages JSON Objects

Simplify the “up and running” Experience

� Install and setup a typical database instance with only 3 questions:

− Where to place the Product?

− Where to place the Data?

− How many users do you anticipate?

� Newly installed instance adapts to the resources on the computer

Description

Auto tuning of CPU VPS

Auto Table Placement

Auto Buffer pool tuning

Auto Physical Log extension

Auto Logical Log Add

Auto Read Ahead

SIMPLE POWER – INFORMIX HYBRID DATABASE CAPABILITIES

� Relational and non-relational data in one system

� NoSQL/MongoDB Apps can access Informix Relational Tables

� Distributed Queries

� Multi-statement Transactions

� Enterprise Proven Reliability

� Enterprise Ready Security

� Enterprise Level Performance

Informix provides the capability to leverage

the abilities of both relational DBMS and document store systems.

MongoDB does not. It is a document store system lacking key abilities like transaction durability.

Round Peg Square Hole

The DBA and/or programmer no longer has to decide upfront whether it is better to use a system made entirely of JSON documents or a system made only of SQL tables. Instead, there is a single database in which the programmer decides whether a JSON document or a SQL table is optimal for the data and its usage.

Informix Specific Advantages with Mongo Drivers

� Traditional SQL tables and JSON collections co-existing in the same database

• Using the MongoDB client drivers, query, insert, update, and delete:

− JSON collections

− Traditional SQL tables

− Timeseries data

� Join SQL tables to JSON collections utilizing indexes

� Execute business logic in stored procedures

� Provide a view of JSON collections as a SQL table

− Allows existing SQL tools to access JSON data

• Enterprise-level functionality

Scalability

� Better performance on multi-core, multi-session scenarios

− Architecture has finer grain locking – not entire database

− Better concurrency because less resources locked

� Document Compression

− 60% to 90% observed

� Bigger documents – 2GB maximum size

− MongoDB caps at 16MB

� Informix has decades of optimization on single node solution


Some NoSQL Use Cases - Mostly Interactive Web/Mobile

� Online/Mobile Gaming

− Leaderboard (high score table) management

− Dynamic placement of visual elements

− Game object management

− Persisting game/user state information

− Persisting user generated data (e.g. drawings)

� Display Advertising on Web Sites

− Ad Serving: match content with profile and present

− Real-time bidding: match cookie profile with ad inventory, obtain bids, and present ad

� Dynamic Content Management and Publishing (News & Media)

− Store content from distributed authors, with fast retrieval and placement

− Manage changing layouts and user generated content

• E-commerce/Social Commerce

– Storing frequently changing product catalogs

• Social Networking/Online Communities

• Communications

– Device provisioning

• Logging/message passing

– Drop Copy service in Financial Services (streaming copies of trade execution messages into, for example, a risk or back office system)

Dual Capability

� Hybrid databases are a way of the past

• Hybrid applications are here to stay

Client Applications

[Diagram: native client, web browser, and mobile applications connect to Informix NoSQL through the Informix SQLI drivers, the MongoDB drivers, or the IBM DRDA drivers.]

All three drivers support all data models

Ability for All Clients to Access All Data Models

[Diagram: each driver (Informix SQLI, MongoDB, IBM DRDA), including the MongoDB drivers used by a node.js application, reaches Traditional SQL, NoSQL (JSON), TimeSeries, and MQ Series data.]

Lance Code Snippet

BUILDING A REAL LIFE APPLICATION

IOD Demo and Beyond

• A fun, exciting demo application that attendees can interact with and that draws attention to IBM technologies

• A social networking photo application for smartphones and tablets that lets the end user take photos, tag them, and add them to an Informix NoSQL hybrid database

• Random photos can be displayed at sessions, at booths, or on attendee smart devices

IOD Attendee Photo Application

Allow conference attendees to take and share photos!

Technology Highlights

• Create a hybrid mobile web application using NoSQL, traditional SQL, and timeseries

• Utilize JSON collections, SQL tables, and timeseries together

• Utilize IBM Dojo Mobile tools to build a mobile application

• Leverage new Mongo client-side drivers for fast application development

• Demonstrate sharding with over 100 nodes

• Cloud-based solution on top of Amazon Cloud

• Provide real-time analytics on all forms of data

• Leverage the existing popular analytic front end IBM Cognos

• Utilize an in-memory columnar database accelerator to provide real-time trending analytics on data

IOD Photo App - UPLOAD

Van Goghtag

Mobile Device Application Architecture

[Diagram: photo application (IBM Dojo Mobile); Apache web server; Informix JSON listener; Informix with the photo collection and the user table]

Application Architecture (Big Picture)

[Diagram: IOD application; Apache web server; Informix JSON listener; Informix with the photo collection; sharding commands; nodes Node1, Node2, Node3 through Node101 on Amazon Cloud, labeled with ranges Aa-Am, An-Az, Ba-Bf, Bg-Bp]

“Dalmatian” shard farm: 101 sharded nodes on Amazon Cloud

Photo collection sharded across these nodes

Photo Application Schema

[Diagram: NoSQL collections (activity_photos); TimeSeries (activity_data timeseries(photo_like))]

Photo Application NoSQL Collections

[Diagram: activity_photos]

Application Considerations

• Photo metadata varies from camera to camera

• A picture and all its metadata are stored together in a single document

• Pictures are stored in a NoSQL collection

• Pre-processing on the phone ensures that only reasonably sized photos are sent over the network
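A minimal sketch of the in-document layout described above, written in Python so it runs without a database or driver. The helper name build_photo_document and the four sample bytes are assumptions for illustration; the "name" and "img_data" keys and the data:image/jpeg;base64, prefix follow the deck's sample document.

```python
import base64
import json


def build_photo_document(name, image_bytes, exif):
    """Return one document holding the image and its camera metadata."""
    # Metadata keys differ from camera to camera, so copy whatever is present.
    doc = dict(exif)
    doc["name"] = name
    encoded = base64.b64encode(image_bytes).decode("ascii")
    # The image travels inside the document as a base64 data URI.
    doc["img_data"] = "data:image/jpeg;base64," + encoded
    return doc


# Hypothetical input: only the first four bytes of a JPEG file.
doc = build_photo_document(
    "DSC_0078.JPG",
    b"\xff\xd8\xff\xe0",
    {"Make": "NIKON CORPORATION", "Model": "NIKON D60"},
)
print(json.dumps(doc, indent=2))
```

Because the metadata dictionary is copied as-is, a photo from a camera that reports different fields needs no schema change.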

Example of Live JSON Photo Data

{"_id":ObjectId("526157c8112c2fe70cc06a75"),
"Make":"NIKON CORPORATION","Model":"NIKON D60","Orientation":"1",
"XResolution":"300","YResolution":"300","ResolutionUnit":"2",
"Software":"Ver.1.00 ","DateTime":"2013:05:15 19:46:36",
"YCbCrPositioning":"2","ExifIFDPointer":"216","ExposureTime":"0.005",
"FNumber":"7.1","ExposureProgram":"Not defined","ISOSpeedRatings":"100",
"Contrast":"Normal","Saturation":"Normal","Sharpness":"Normal",
"SubjectDistanceRange":"Unknown","name":"DSC_0078.JPG",
"img_data":"data:image/jpeg;base64,/9j/4AAQSkZJRgABAQAAAQABAAD/2wBDABcQ

Sharding

� NoSQL “Dalmatian” cloud deployment

� Sharding by hash

• Heterogeneous cloud deployment with sharding

− Processor type and number: Intel and ARM processors

− Operating system types and versions: Linux, Windows, AIX, Apple, Solaris, HP, …
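Sharding by hash can be illustrated with a short Python sketch. This is a generic model under stated assumptions, not Informix's implementation: the deck does not say which hash function the server uses, so SHA-256 stands in, and the function name shard_for and the sample keys are hypothetical. Only the node count of 101 comes from the deck.

```python
import hashlib

NODE_COUNT = 101  # size of the "Dalmatian" shard farm in the deck


def shard_for(key, node_count=NODE_COUNT):
    """Map a shard key to a node number in the range 0..node_count-1."""
    # SHA-256 is an assumption; any stable hash spreads keys the same way.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % node_count


# The same key always lands on the same node, whatever platform computes it.
for key in ("526157c8112c2fe70cc06a75", "photo-2", "photo-3"):
    print(key, "->", shard_for(key))
```

Because placement depends only on the key and the node count, nodes with different processors or operating systems can all compute the same placement.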

Basic PHP Programming Overview Information

• List of NoSQL collection names and SQL table names

� Function to set the active database and return the Collection

private $conn;

private $dbname = "photo_demo";
private $photoCollectionName = "photos";
private $contactsCollectionName = "contacts";
private $sqlCollectionName = 'system.sql';
private $userTableName = "users";
private $tagsTableName = "tags";
private $likesTableName = "likes";

private $photoQueryProjection = array("_id" => 1, "tags" => 1, "user_id" => 1, "img_data" => 1);

/**
 * Get a collection (or SQL table) by name from the active database.
 * @param string $collectionName
 * @return MongoCollection
 */
private function getCollection($collectionName) {
    return $this->conn->selectDB($this->dbname)->selectCollection($collectionName);
}
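The $photoQueryProjection array above is a Mongo-style inclusion projection: fields flagged with 1 are returned and everything else is left out. A simplified Python model of that behavior, for top-level fields only, might look like this. The function name apply_projection and the sample values are assumptions; the real projection is applied by the server, not by client code.

```python
def apply_projection(document, projection):
    """Keep only the top-level fields an inclusion projection selects."""
    wanted = {field for field, flag in projection.items() if flag}
    # MongoDB returns _id unless the projection explicitly sets it to 0.
    if projection.get("_id", 1):
        wanted.add("_id")
    return {field: value for field, value in document.items() if field in wanted}


photo_query_projection = {"_id": 1, "tags": 1, "user_id": 1, "img_data": 1}

# Hypothetical stored photo document with extra camera metadata.
photo = {
    "_id": "526157c8112c2fe70cc06a75",
    "Make": "NIKON CORPORATION",
    "Model": "NIKON D60",
    "tags": ["iod"],
    "user_id": 7,
    "img_data": "data:image/jpeg;base64,/9j/4A==",
}

projected = apply_projection(photo, photo_query_projection)
print(sorted(projected))
```

Projecting away the camera metadata keeps query results small when the application only needs the image, its tags, and its owner.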

Insert Example

� Information is placed in the contacts collection

Insert Data into a Collection

/**
 * Insert user's contact information into the contacts collection.
 */
public function insertContact($json) {
    if (!is_array($json)) {
        return "Contact info not in JSON format.";
    }
    try {
        // Contacts are stored in the NoSQL collection named by $contactsCollectionName
        $result = $this->getCollection($this->contactsCollectionName)->insert($json);
        if ($result["ok"] != 1) {
            return $result["err"];
        }
    } catch (MongoException $e) {
        return $e->getMessage();
    }
    return "ok";
}

� Very simple to insert JSON data into a collection using the MongoAPIs

Retrieve Collection Information

� Very simple to retrieve data from a collection using the MongoAPIs

� Data is returned as a JSON document

/**
 * Get contact info
 */
public function adminContacts() {
    $contactsCollection = $this->getCollection($this->contactsCollectionName);
    $cursor = $contactsCollection->find();
    $results = $this->getQueryResults($cursor);
    return $results;
}

Naughty Pictures

Delete a Photo and its Information

/**
 * Delete photo
 */
public function deletePhoto($id) {
    try {
        // First delete from likes and tags tables
        $query = array('photo_id' => $id['_id']);
        $result = $this->getCollection($this->likesTableName)->remove($query);
        if ($result["ok"] != 1) {
            return $result["err"];
        }
        $result = $this->getCollection($this->tagsTableName)->remove($query);
        if ($result["ok"] != 1) {
            return $result["err"];
        }

        // Then delete the photo from the collection
        $query = array('_id' => new MongoId($id['_id']));
        $result = $this->getCollection($this->photoCollectionName)->remove($query);
        if ($result["ok"] != 1) {
            return $result["err"];
        }
    } catch (MongoException $e) {
        return $e->getMessage();
    }
    return "ok";
}

• Deleting from SQL tables and from NoSQL collections uses exactly the same API calls

Executing a Stored Procedure in MongoAPI

/**
 * Get the user_id for a particular user name (email address).
 *
 * Calls a stored procedure that will insert into the users table if the
 * user does not exist yet and returns the user_id.
 *
 * @param string $username
 * @return int $user_id
 */
public function getUserId($username) {
    $username = trim($username);

    try {
        // Double any single quotes so the name stays inside the SQL string literal
        $sql = "EXECUTE FUNCTION getUserID('" . str_replace("'", "''", $username) . "')";
        $result = $this->getCollection($this->sqlCollectionName)->findOne(array('$sql' => $sql));
        if (isset($result['errmsg'])) {
            return "ERROR. " . $result['errmsg'];
        }
        return $result['user_id'];
    } catch (MongoException $e) {
        return "ERROR. " . $e->getMessage();
    }
}
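The query document sent to the system.sql pseudo-collection is an ordinary map with a single $sql key. The Python sketch below only builds that document; it does not contact a server. The helper names sql_string_literal and build_sql_passthrough_query and the sample address are assumptions, and doubling single quotes is standard SQL string-literal escaping rather than anything specific to this demo.

```python
def sql_string_literal(value):
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_sql_passthrough_query(username):
    """Build the query document that asks the listener to run getUserID."""
    statement = "EXECUTE FUNCTION getUserID(" + sql_string_literal(username.strip()) + ")"
    return {"$sql": statement}


# Hypothetical user name containing a quote character.
query = build_sql_passthrough_query("  o'brien@example.com ")
print(query)
```

Building the statement through one quoting helper keeps user-supplied names from terminating the literal early.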

Real Time Analytics

• Customer Issues

− Several different models of data (SQL, NoSQL, TimeSeries/Sensor)

− NoSQL is not strong at building relations between collections

− The most valuable analytics combine the results of all data models

− Most prominent analytic systems are written using standard SQL

− ETL & YAS (Yet Another System)

• Solution

− Provide a mapping of the required data in SQL form

− Enables common tools like Cognos

Analytics on a Hybrid Database

[Diagram: Informix with the photo collection and the user table; access paths MongoAPI and SQL]

Photo Application SQL Mapping of NoSQL PHOTO Collection


activity_photos

Mapping A Collection To A SQL Table

CREATE VIEW photo_metadata (gpslatitude, gpslongitude, make, model, orientation,
    datetimeoriginal, exposuretime, fnumber, isospeedratings,
    pixelxdimension, pixelydimension)
AS SELECT
    BSON_VALUE_LVARCHAR(x0.data, 'GPSLatitude'),
    BSON_VALUE_LVARCHAR(x0.data, 'GPSLongitude'),
    BSON_VALUE_LVARCHAR(x0.data, 'Make'),
    BSON_VALUE_LVARCHAR(x0.data, 'Model'),
    BSON_VALUE_LVARCHAR(x0.data, 'Orientation'),
    BSON_VALUE_LVARCHAR(x0.data, 'DateTimeOriginal'),
    BSON_VALUE_LVARCHAR(x0.data, 'ExposureTime'),
    BSON_VALUE_LVARCHAR(x0.data, 'FNumber'),
    BSON_VALUE_LVARCHAR(x0.data, 'ISOSpeedRatings'),
    BSON_VALUE_LVARCHAR(x0.data, 'PixelXDimension'),
    BSON_VALUE_LVARCHAR(x0.data, 'PixelYDimension')
FROM photos x0;
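Since every column in the view follows the same BSON_VALUE_LVARCHAR pattern, the statement can be generated from a list of field names. The Python sketch below is one way to do that; the function name build_view_sql and the identifier check are assumptions added for illustration, and the generated text mirrors the view above rather than any Informix tooling.

```python
def build_view_sql(view_name, collection, fields):
    """Generate a CREATE VIEW statement exposing BSON fields as SQL columns."""
    # Accept plain identifiers only, so a field name cannot alter the statement.
    for name in (view_name, collection, *fields):
        if not name.isidentifier():
            raise ValueError("not a plain identifier: %r" % (name,))
    columns = ", ".join(field.lower() for field in fields)
    selects = ",\n    ".join(
        "BSON_VALUE_LVARCHAR(x0.data, '%s')" % field for field in fields
    )
    return (
        "CREATE VIEW %s (%s)\n" % (view_name, columns)
        + "AS SELECT\n    %s\n" % selects
        + "FROM %s x0;" % collection
    )


print(build_view_sql("photo_metadata", "photos", ["GPSLatitude", "Make", "Model"]))
```

Adding a newly observed metadata field to the analytics view then means adding one name to the list and recreating the view.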

IWA Integration

• What is trending at IOD?

• IWA is set up to work on frequently updated snapshots of the data, performing analytic queries on the photo metadata

• Because BSON/JSON data cannot be directly compressed and searched, the snapshot process would selectively extract data from the documents and copy it to IWA

• Use Genero to show the power of IWA
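The selective extraction step can be pictured as flattening each document into a fixed set of columns before it is copied to the accelerator. The Python sketch below models only that flattening and a simple "what is trending" count; it does not use IWA or Informix. The chosen fields, the function name snapshot_rows, and the sample documents are all assumptions.

```python
from collections import Counter

# Hypothetical choice of fields worth copying into the snapshot.
SNAPSHOT_FIELDS = ("Make", "Model", "ISOSpeedRatings")


def snapshot_rows(documents, fields=SNAPSHOT_FIELDS):
    """Flatten each document to a fixed-width row; missing fields become None."""
    return [tuple(doc.get(field) for field in fields) for doc in documents]


# Hypothetical photo documents; the bulky img_data field is never copied.
photos = [
    {"Make": "NIKON CORPORATION", "Model": "NIKON D60", "ISOSpeedRatings": "100", "img_data": "..."},
    {"Make": "NIKON CORPORATION", "Model": "NIKON D60"},
    {"Make": "ExampleCam", "Model": "X1", "ISOSpeedRatings": "200"},
]

rows = snapshot_rows(photos)
trending = Counter(row[0] for row in rows).most_common(1)
print(rows)
print(trending)
```

Fixed-width rows are what a columnar store can compress and scan, which is why the schema-less documents are flattened first.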

Mobile Device Application Architecture

[Diagram: Informix with the photo collection and the user table; access paths MongoAPI and SQL]

Configuring Informix on Amazon Cloud Is Simple

• Instantiate the Amazon image

• Set up the storage

• Install the product

• Start the system

• Configure sharding

Questions