© 2015 MapR Technologies


Evolving from RDBMS to NoSQL + SQL

Jim Scott – Director, Enterprise Strategy & Architecture

@kingmesal #strataconf


Why Does This Matter

• 90%+ of the use cases do not deal with “relational” data

• RDBMS data models are more complex than a single table
– One-to-many relationships require multiple tables
– Creating code to persist data takes time and QA

• Inferred (or removed) keys used without actual foreign keys
– Difficult for others to understand relationships

• Transactional tables never look the same as analytics tables
– OLTP -> ETL -> OLAP
– This takes significant time to build


Topics

• Changing Data Models
– Relational Model to JSON Model

• A New Database for JSON Data
– Document Database (OJAI)

• Querying JSON Data and More
– Drill

• Resources


Empowering “as it happens” businesses by speeding up the data-to-action cycle


Changing Data Models


180 Tables NOT SHOWN!


236 tables to describe 7 kinds of things


Searching for Elvis

// Find discs where Elvis was credited
> SELECT distinct album_id, name FROM
    (SELECT id album_id, artist_id, name, FLATTEN(credit) FROM release) albums
  join
    (SELECT distinct artist_id FROM
      (SELECT id artist_id, FLATTEN(alias) FROM artist where name like 'Elvis%Presley')
    ) artists
  USING (artist_id);


Benefits

• Extended relational model allows massive simplification
– On a real example, we see >20x reduction in number of tables

• Simplification drives improved introspection
– This is good

• Apache Drill gives very high performance execution for extended relational problems

• You can try this out today


A New Database for JSON Data


Basics of the API

• http://ojai.github.io/

• Entry point to a table - DocumentStore
– insert()
– insertOrReplace()
– find()
– delete()
– replace()
– update()
– increment()


Working with JSON in Java

• Step 1 – Create instance of JSON Serializer

Gson gson = new Gson();

• Step 2 – Serialize POJO to JSON

String json = gson.toJson(myObject);

• Step 3 – Deserialize JSON into POJO

MyObject myObject = gson.fromJson(json, MyObject.class);


Creating Documents in Java OJAI

• Use static methods on class org.ojai.json.Json

Document doc = Json.newDocument(myObject);

Document doc = Json.newDocument(jsonString);

• Alternatively
– Use builders
– Stream from disk
– Use InputStream


Creating New Documents

• DocumentStore.insert(doc)

Done!

• DocumentStore.insertOrReplace(doc)

Done!

Easy right?


Updating Existing Documents

• DocumentStore.update(_id, DocumentMutation)

• Mutation methods
– mutation.append(FieldPath, "user visited URL");
– mutation.set("field.name", "What a great example");
– mutation.increment("field", 1);
– mutation.merge("field", Map<String, Object>);
– mutation.setOrReplace(…);
– mutation.delete(field);

Yes, these are atomic.


Deleting Documents

• DocumentStore.delete(doc);

Done!

• DocumentStore.delete(_id);

Done!

This is easy too, right?


Finding Documents

• DocumentStore.find(QueryCondition);

• Query condition setup:
– qc.is("field", EQUAL, "blue")
    .and().notExists("other.field")
    .or().like("field", "%purple")
    .or().matches("another.field", "regular expression")


Querying JSON Data and More


How To Bring SQL to Non-Relational Data Stores?

Familiarity of SQL

• ANSI SQL semantics

• BI (Tableau, MicroStrategy, etc.)

• Low latency

Agility of NoSQL

• No schema management
– HDFS (Parquet, JSON, etc.)
– HBase
– …

• No transformation
– No silos of data

• Ease of use


Drill Supports Schema Discovery On-The-Fly

Schema Declared In Advance (SCHEMA ON WRITE, SCHEMA BEFORE READ)

• Fixed schema

• Leverage schema in centralized repository (Hive Metastore)

Schema Discovered On-The-Fly (SCHEMA ON THE FLY)

• Fixed schema, evolving schema or schema-less

• Leverage schema in centralized repository or self-describing data


Drill’s Data Model is Flexible

[Figure: data formats placed on two axes, flexibility (fixed schema to dynamic schema) and complexity (flat to complex). Formats shown: JSON, BSON, HBase, Parquet, Avro, CSV, TSV.]

RDBMS/SQL-on-Hadoop table

Name     | Gender | Age
Michael  | M      | 6
Jennifer | F      | 3

Apache Drill table

{ name: { first: Michael, last: Smith }, hobbies: [ski, soccer], district: Los Altos }
{ name: { first: Jennifer, last: Gates }, hobbies: [sing], preschool: CCLC }


Enabling “As-It-Happens” Business with Instant Analytics

Traditional approach: Hadoop data -> Data modeling -> Transformation -> Data movement (optional) -> Users

New business questions, source data evolution

Total time to insight: weeks to months

Exploratory approach: Hadoop data -> Users

Total time to insight: minutes


Evolution Towards Self-Service Data Exploration

                                 | Traditional BI w/ RDBMS | Self-Service BI w/ RDBMS | SQL-on-Hadoop | Self-Service Data Exploration
Data Modeling and Transformation | IT-driven               | IT-driven                | IT-driven     | Optional
Data Visualization               | IT-driven               | Self-service             | Self-service  | Self-service

Zero-day analytics


Common Use Cases

• Raw Data Exploration

• JSON Analytics

• DWH offload

Sources pictured: Hive, HBase, Files, Directories, …, {JSON}, Parquet, Text Files, …


Drill Enables ‘SQL-on-Everything’

SELECT * FROM dfs.yelp.`business.json`

• dfs: Storage plugin instance
– DFS (Text, Parquet, JSON)
– HBase/MapR-DB
– Hive Metastore/HCatalog
– Easy API to go beyond Hadoop

• yelp: Workspace
– Sub-directory
– HBase namespace
– Hive database

• `business.json`: Table
– Pathnames
– Hive table
– HBase table


Reuse Existing SQL Tools and Skills

Leverage SQL-compatible tools (BI, query builders, etc.) via Drill’s standard ODBC, JDBC and ANSI SQL support

Enable business analysts, technical analysts and data scientists to explore and analyze large volumes of real-time data


Security Controls


Access Controls that Scale

PAM Authentication + User Impersonation

Fine-grained row and column level access control with Drill Views – no centralized security repository required

[Figure: users (U), Drill View 1, Drill View 2, and the underlying sources: Files, HBase, Hive]


Granular Security via Drill Views

Raw File (/raw/cards.csv)
Owner: Admins
Permission: Admins

Name | City     | State | Credit Card #
Dave | San Jose | CA    | 1374-7914-3865-4817
John | Boulder  | CO    | 1374-9735-1794-9711

Data Scientist View (/views/maskedcards.csv)
Owner: Admins
Permission: Data Scientists
Not a physical data copy

Name | City     | State | Credit Card #
Dave | San Jose | CA    | 1374-1111-1111-1111
John | Boulder  | CO    | 1374-1111-1111-1111

Business Analyst View
Owner: Admins
Permission: Business Analysts

Name | City     | State
Dave | San Jose | CA
John | Boulder  | CO
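The two views can be read as functions applied to the raw rows at query time. The sketch below is only an illustration in plain Java, not how Drill defines views (those are created in SQL); the class and method names are hypothetical, and the masking rule is inferred from the sample values on the slide.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative model only: Drill views are defined in SQL, not in Java. */
final class MaskedViewSketch {

    // Rows as they appear in /raw/cards.csv on the slide.
    static List<List<String>> rawRows() {
        return List.of(
            List.of("Dave", "San Jose", "CA", "1374-7914-3865-4817"),
            List.of("John", "Boulder", "CO", "1374-9735-1794-9711"));
    }

    // Keeps the first four digits and overwrites the remaining groups,
    // which reproduces the masked values shown on the slide.
    static String maskCard(String cardNumber) {
        return cardNumber.substring(0, 4) + "-1111-1111-1111";
    }

    // Data scientist view: all four columns, card number masked.
    static List<List<String>> scientistView(List<List<String>> rows) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> row : rows) {
            out.add(List.of(row.get(0), row.get(1), row.get(2), maskCard(row.get(3))));
        }
        return out;
    }

    // Business analyst view: the card column is not projected at all.
    static List<List<String>> analystView(List<List<String>> rows) {
        List<List<String>> out = new ArrayList<>();
        for (List<String> row : rows) {
            out.add(List.of(row.get(0), row.get(1), row.get(2)));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(scientistView(rawRows()));
        System.out.println(analystView(rawRows()));
    }
}
```

Nothing is stored for either view; each result is recomputed from the raw rows whenever it is requested, which is the sense of "not a physical data copy".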


Ownership Chaining

Combine Self Service Exploration with Data Governance

Raw File (/raw/cards.csv): John (Owner)

Name | City     | State | Credit Card #
Dave | San Jose | CA    | 1374-7914-3865-4817
John | Boulder  | CO    | 1374-9735-1794-9711

Data Scientist (/views/V_Scientist): Jane (Read), John (Owner)

Name | City     | State | Credit Card #
Dave | San Jose | CA    | 1374-1111-1111-1111
John | Boulder  | CO    | 1374-1111-1111-1111

Analyst (/views/V_Analyst): Jack (Read), Jane (Owner)

Name | City     | State
Dave | San Jose | CA
John | Boulder  | CO

Access path: Jack queries the view V_Analyst

V_Analyst
• Does Jack have access to V_Analyst? -> YES
• Who is the owner of V_Analyst? -> Jane
• Drill accesses V_Analyst as Jane (Impersonation hop 1)

V_Scientist
• Does Jane have access to V_Scientist? -> YES
• Who is the owner of V_Scientist? -> John
• Drill accesses V_Scientist as John (Impersonation hop 2)

RAW FILE
• Does John have permissions on raw file? -> YES
• Who is the owner of raw file? -> John
• Drill accesses source file as John (no impersonation here)

*Ownership chain length (# hops) is configurable

[Figure legend: ownership chaining, access path]
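The hop-by-hop walk above can be sketched as a loop. This is a hypothetical plain-Java model, not Drill's implementation: the `Resource` class, the `resolve` method, and the way the hop limit is enforced are all assumptions made for illustration. Only the users, objects, and permissions come from the slide.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical model of ownership chaining; not Drill's implementation. */
final class OwnershipChainSketch {

    /** A file or view: its owner, its readers, and the object it reads from (null for a raw file). */
    static final class Resource {
        final String owner;
        final Set<String> readers;
        final String source;

        Resource(String owner, Set<String> readers, String source) {
            this.owner = owner;
            this.readers = readers;
            this.source = source;
        }
    }

    /** The three objects and their permissions as shown on the slide. */
    static Map<String, Resource> slideCatalog() {
        return Map.of(
            "V_Analyst", new Resource("Jane", Set.of("Jack"), "V_Scientist"),
            "V_Scientist", new Resource("John", Set.of("Jane"), "/raw/cards.csv"),
            "/raw/cards.csv", new Resource("John", Set.of(), null));
    }

    /**
     * Walks from the queried object down to the raw file and records which
     * user each object is accessed as. Throws SecurityException when access
     * is denied or when more than maxHops impersonation hops would be needed.
     */
    static List<String> resolve(Map<String, Resource> catalog, String user, String target, int maxHops) {
        List<String> trace = new ArrayList<>();
        String actingAs = user;
        int hops = 0;
        String name = target;
        while (name != null) {
            Resource resource = catalog.get(name);
            if (resource == null) {
                throw new IllegalArgumentException("unknown object: " + name);
            }
            boolean isOwner = resource.owner.equals(actingAs);
            if (!isOwner && !resource.readers.contains(actingAs)) {
                throw new SecurityException(actingAs + " cannot read " + name);
            }
            if (!isOwner) {
                // Switching to the owner's identity counts as one impersonation hop.
                hops++;
                if (hops > maxHops) {
                    throw new SecurityException("ownership chain exceeds " + maxHops + " hops");
                }
                actingAs = resource.owner;
            }
            trace.add(name + " as " + actingAs);
            name = resource.source;
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(resolve(slideCatalog(), "Jack", "V_Analyst", 2));
    }
}
```

With a limit of 2 the walk reaches the raw file as John, matching the slide; with a limit of 1 the same query is rejected at V_Scientist in this model.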


Security Summary

• Logical

– No physical data copies/silos

• Granular

– Row level and column level security controls

• De-centralized

– User impersonation respecting storage system permissions

– No separate permission repository for granular controls

– Integrated with Hadoop File System permissions and LDAP

• Self-service w/ governance

– If you have access to data, you control who can access it and how widely

– Audits


Using Drill with Yelp


Business dataset

{
  "business_id": "4bEjOyTaDG24SY5TxsaUNQ",
  "full_address": "3655 Las Vegas Blvd S\nThe Strip\nLas Vegas, NV 89109",
  "hours": {
    "Monday": {"close": "23:00", "open": "07:00"},
    "Tuesday": {"close": "23:00", "open": "07:00"},
    "Friday": {"close": "00:00", "open": "07:00"},
    "Wednesday": {"close": "23:00", "open": "07:00"},
    "Thursday": {"close": "23:00", "open": "07:00"},
    "Sunday": {"close": "23:00", "open": "07:00"},
    "Saturday": {"close": "00:00", "open": "07:00"}
  },
  "open": true,
  "categories": ["Breakfast & Brunch", "Steakhouses", "French", "Restaurants"],
  "city": "Las Vegas",
  "review_count": 4084,
  "name": "Mon Ami Gabi",
  "neighborhoods": ["The Strip"],
  "longitude": -115.172588519464,
  "state": "NV",
  "stars": 4.0,
  "attributes": {
    "Alcohol": "full_bar",
    "Noise Level": "average",
    "Has TV": false,
    "Attire": "casual",
    "Ambience": {
      "romantic": true, "intimate": false, "touristy": false, "hipster": false,
      "classy": true, "trendy": false, "casual": false
    },
    "Good For": {
      "dessert": false, "latenight": false, "lunch": false,
      "dinner": true, "breakfast": false, "brunch": false
    }
  }
}


Zero to Results in 2 minutes

Install

$ tar -xvzf apache-drill-1.0.0.tar.gz

Launch shell (embedded mode)

$ bin/sqlline -u jdbc:drill:zk=local
$ bin/drill-embedded

Query files and directories

> SELECT state, city, count(*) AS businesses
  FROM dfs.yelp.`business.json`
  GROUP BY state, city
  ORDER BY businesses DESC
  LIMIT 10;

Results

+------------+------------+-------------+
| state      | city       | businesses  |
+------------+------------+-------------+
| NV         | Las Vegas  | 12021       |
| AZ         | Phoenix    | 7499        |
| AZ         | Scottsdale | 3605        |
| EDH        | Edinburgh  | 2804        |
| AZ         | Mesa       | 2041        |
| AZ         | Tempe      | 2025        |
| NV         | Henderson  | 1914        |
| AZ         | Chandler   | 1637        |
| WI         | Madison    | 1630        |
| AZ         | Glendale   | 1196        |
+------------+------------+-------------+


Directories are implicit partitions

SELECT dir0, SUM(amount)
FROM sales
WHERE dir1 IN ('q1', 'q2')
GROUP BY dir0

sales
├── 2014
│   ├── q1
│   ├── q2
│   ├── q3
│   └── q4
└── 2015
    └── q1
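How a file's location turns into dir0, dir1, and so on can be sketched in a few lines of Java. This only models the naming rule (each directory level below the queried path becomes the next dirN value); it is not Drill code, and the file name `orders.csv` is a made-up example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Models the dir0, dir1, ... naming rule only; not Drill code. */
final class DirColumnsSketch {

    /** Returns [dir0, dir1, ...]: the directory names between the queried root and the file. */
    static List<String> dirColumns(Path root, Path file) {
        Path relative = root.relativize(file);
        List<String> dirs = new ArrayList<>();
        // Every path element except the last one (the file name) is a directory level.
        for (int i = 0; i < relative.getNameCount() - 1; i++) {
            dirs.add(relative.getName(i).toString());
        }
        return dirs;
    }

    public static void main(String[] args) {
        // orders.csv is a hypothetical file inside the sales/2014/q1 directory.
        List<String> dirs = dirColumns(Paths.get("sales"), Paths.get("sales/2014/q1/orders.csv"));
        System.out.println("dir0=" + dirs.get(0) + ", dir1=" + dirs.get(1));
    }
}
```

For a file under sales/2014/q1, dir0 is the year and dir1 is the quarter, which is why the query above can filter on dir1 and group by dir0.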


Intuitive SQL Access to Complex Data

// It’s Friday 10pm in Vegas and looking for Hummus

> SELECT name, stars, b.hours.Friday friday, categories
  FROM dfs.yelp.`business.json` b
  WHERE b.hours.Friday.`open` < '22:00'
    AND b.hours.Friday.`close` > '22:00'
    AND REPEATED_CONTAINS(categories, 'Mediterranean')
    AND city = 'Las Vegas'
  ORDER BY stars DESC
  LIMIT 2;

+------------+------------+------------+------------+
| name | stars | friday | categories |
+------------+------------+------------+------------+
| Olives | 4.0 | {"close":"22:30","open":"11:00"} | ["Mediterranean","Restaurants"] |
| Marrakech Moroccan Restaurant | 4.0 | {"close":"23:00","open":"17:30"} | ["Mediterranean","Middle Eastern","Moroccan","Restaurants"] |
+------------+------------+------------+------------+

Query data with any levels of nesting


Reviews dataset

{ "votes": {"funny": 0, "useful": 2, "cool": 1}, "user_id": "Xqd0DzHaiyRqVH3WRG7hzg", "review_id": "15SdjuK7DmYqUAj6rjGowg", "stars": 5, "date": "2007-05-17", "text": "dr. goldberg offers everything ...", "type": "review", "business_id": "vcNAWiLM4dR7D2nwwJ7nCA"}


ANSI SQL Compatibility

// Get top cool rated businesses

SELECT b.name
FROM dfs.yelp.`business.json` b
WHERE b.business_id IN (
  SELECT r.business_id
  FROM dfs.yelp.`review.json` r
  GROUP BY r.business_id
  HAVING SUM(r.votes.cool) > 2000
  ORDER BY SUM(r.votes.cool) DESC);

+------------+
| name |
+------------+
| Earl of Sandwich |
| XS Nightclub |
| The Cosmopolitan of Las Vegas |
| Wicked Spoon |
+------------+

Use familiar SQL functionality (Joins, Aggregations, Sorting, Sub-queries, SQL data types)


Logical Views

// Create a view combining business and reviews datasets

> CREATE OR REPLACE VIEW dfs.tmp.BusinessReviews AS
  SELECT b.name, b.stars, r.votes.funny, r.votes.useful, r.votes.cool, r.`date`
  FROM dfs.yelp.`business.json` b, dfs.yelp.`review.json` r
  WHERE r.business_id = b.business_id;

+------------+------------+
| ok | summary |
+------------+------------+
| true | View 'BusinessReviews' created successfully in 'dfs.tmp' schema |
+------------+------------+

> SELECT COUNT(*) AS Total FROM dfs.tmp.BusinessReviews;

+------------+
| Total |
+------------+
| 1125458 |
+------------+

Lightweight file system based views for granular and de-centralized data management


Materialized Views AKA Tables

> ALTER SESSION SET `store.format` = 'parquet';

> CREATE TABLE dfs.yelp.BusinessReviewsTbl AS
  SELECT b.name, b.stars, r.votes.funny funny, r.votes.useful useful, r.votes.cool cool, r.`date`
  FROM dfs.yelp.`business.json` b, dfs.yelp.`review.json` r
  WHERE r.business_id = b.business_id;

+------------+---------------------------+
| Fragment | Number of records written |
+------------+---------------------------+
| 1_0 | 176448 |
| 1_1 | 192439 |
| 1_2 | 198625 |
| 1_3 | 200863 |
| 1_4 | 181420 |
| 1_5 | 175663 |
+------------+---------------------------+

Save analysis results as tables using familiar CTAS syntax


Repeated Values Support

// Flatten repeated categories

> SELECT name, categories FROM dfs.yelp.`business.json` LIMIT 3;

+------------+------------+
| name | categories |
+------------+------------+
| Eric Goldberg, MD | ["Doctors","Health & Medical"] |
| Pine Cone Restaurant | ["Restaurants"] |
| Deforest Family Restaurant | ["American (Traditional)","Restaurants"] |
+------------+------------+

> SELECT name, FLATTEN(categories) AS categories FROM dfs.yelp.`business.json` LIMIT 5;

+------------+------------+
| name | categories |
+------------+------------+
| Eric Goldberg, MD | Doctors |
| Eric Goldberg, MD | Health & Medical |
| Pine Cone Restaurant | Restaurants |
| Deforest Family Restaurant | American (Traditional) |
| Deforest Family Restaurant | Restaurants |
+------------+------------+

Dynamically flatten repeated and nested data elements as part of SQL queries. No ETL necessary
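What FLATTEN does to each record can be sketched in plain Java: one output row per array element, with the other columns repeated. This is an illustration of the row expansion only, not Drill's implementation; the `Business` class and method names are hypothetical, and the sample data is taken from the result table above.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrates the row expansion performed by FLATTEN; not Drill's implementation. */
final class FlattenSketch {

    /** Minimal stand-in for a record with a repeated "categories" field. */
    static final class Business {
        final String name;
        final List<String> categories;

        Business(String name, List<String> categories) {
            this.name = name;
            this.categories = categories;
        }
    }

    // The three records shown in the first result table.
    static List<Business> sample() {
        return List.of(
            new Business("Eric Goldberg, MD", List.of("Doctors", "Health & Medical")),
            new Business("Pine Cone Restaurant", List.of("Restaurants")),
            new Business("Deforest Family Restaurant", List.of("American (Traditional)", "Restaurants")));
    }

    /** Emits one (name, category) row per element of each record's categories array. */
    static List<List<String>> flatten(List<Business> businesses) {
        List<List<String>> rows = new ArrayList<>();
        for (Business business : businesses) {
            for (String category : business.categories) {
                rows.add(List.of(business.name, category));
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        for (List<String> row : flatten(sample())) {
            System.out.println(row.get(0) + " | " + row.get(1));
        }
    }
}
```

Three input records with five categories in total become five rows, the same shape as the second result table.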


Checkins dataset

{
  "checkin_info": {
    "3-4": 1, "13-5": 1, "6-6": 1, "14-5": 1, "14-6": 1, "14-2": 1, "14-3": 1,
    "19-0": 1, "11-5": 1, "13-2": 1, "11-6": 2, "11-3": 1, "12-6": 1, "6-5": 1,
    "5-5": 1, "9-2": 1, "9-5": 1, "9-6": 1, "5-2": 1, "7-6": 1, "7-5": 1,
    "7-4": 1, "17-5": 1, "8-5": 1, "10-2": 1, "10-5": 1, "10-6": 1
  },
  "type": "checkin",
  "business_id": "JwUE5GmEO-sH1FuwJgKBlQ"
}


Supports Dynamic / Unknown Columns

> SELECT KVGEN(checkin_info) checkins FROM dfs.yelp.`checkin.json` LIMIT 1;

+------------+
| checkins |
+------------+
| [{"key":"3-4","value":1},{"key":"13-5","value":1},{"key":"6-6","value":1},{"key":"14-5","value":1},{"key":"14-6","value":1},{"key":"14-2","value":1},{"key":"14-3","value":1},{"key":"19-0","value":1},{"key":"11-5","value":1},{"key":"13-2","value":1},{"key":"11-6","value":2},{"key":"11-3","value":1},{"key":"12-6","value":1},{"key":"6-5","value":1},{"key":"5-5","value":1},{"key":"9-2","value":1},{"key":"9-5","value":1},{"key":"9-6","value":1},{"key":"5-2","value":1},{"key":"7-6","value":1},{"key":"7-5","value":1},{"key":"7-4","value":1},{"key":"17-5","value":1},{"key":"8-5","value":1},{"key":"10-2","value":1},{"key":"10-5","value":1},{"key":"10-6","value":1}] |
+------------+

> SELECT FLATTEN(KVGEN(checkin_info)) checkins FROM dfs.yelp.`checkin.json` limit 6;

+------------+
| checkins |
+------------+
| {"key":"3-4","value":1} |
| {"key":"13-5","value":1} |
| {"key":"6-6","value":1} |
| {"key":"14-5","value":1} |
| {"key":"14-6","value":1} |
| {"key":"14-2","value":1} |
+------------+

Convert Map with a wide set of dynamic columns into an array of key-value pairs
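The conversion KVGEN performs can be sketched in plain Java: a map whose keys are not known in advance becomes a list of {key, value} entries that can then be flattened like any other array. This models the transformation only and is not Drill's implementation; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Models the map-to-array conversion performed by KVGEN; not Drill's implementation. */
final class KvgenSketch {

    /** Turns each map entry into a small {"key": ..., "value": ...} map, preserving entry order. */
    static <V> List<Map<String, Object>> kvgen(Map<String, V> map) {
        List<Map<String, Object>> pairs = new ArrayList<>();
        for (Map.Entry<String, V> entry : map.entrySet()) {
            Map<String, Object> pair = new LinkedHashMap<>();
            pair.put("key", entry.getKey());
            pair.put("value", entry.getValue());
            pairs.add(pair);
        }
        return pairs;
    }

    public static void main(String[] args) {
        // First three entries of checkin_info from the checkins dataset.
        Map<String, Integer> checkinInfo = new LinkedHashMap<>();
        checkinInfo.put("3-4", 1);
        checkinInfo.put("13-5", 1);
        checkinInfo.put("6-6", 1);
        for (Map<String, Object> pair : kvgen(checkinInfo)) {
            System.out.println(pair);
        }
    }
}
```

Once the keys are data rather than column names, a query no longer needs to know them ahead of time.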


Resources


Drill is Top-Ranked SQL-on-Hadoop

Source: Gigaom Research, 2015

Key:

• Number indicates company’s relative strength across all vectors

• Size of ball indicates company’s relative strength along individual vector

“Drill isn’t just about SQL-on-Hadoop. It’s about SQL-on-pretty-much-anything, immediately, and without formality.”


OJAI and MapR-DB

Where to find it…
– The source: https://github.com/ojai/ojai
– The site: http://ojai.github.io/
– Python bindings: https://github.com/mapr-demos/python-bindings
– Javascript bindings: https://github.com/mapr-demos/js-bindings

Ready to play with your data?
– Download the sandbox: http://maprdb.io
– Examples:
  • Java: https://github.com/mapr-demos/maprdb-ojai-101
  • Python: https://github.com/mapr-demos/maprdb_python_examples


Drill Walkthrough

• Example queries

• Conversion from relational model to flat JSON model

https://www.mapr.com/blog/drilling-healthy-choices

https://www.mapr.com/blog/evolution-database-schemas-using-sql-nosql


Q & A

@kingmesal maprtech

[email protected]

Engage with us!

MapR

maprtech

mapr-technologies