Change RelationalDB to GraphDB with OrientDB

Speaker: Apaichon Punopas

Sponsored by

เครือข่ายโปรแกรมเมอร์ไทย (Thai Programmers Network)

โค้ดชิวๆ

What is a Relational DB?

It is a way of storing information in:

• tables

• columns

• rows

A table can be related to other tables.

What is NoSQL?

A NoSQL (often interpreted as Not only SQL) database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Motivations for this approach include simplicity of design, horizontal scaling, and finer control over availability.

Types of NoSQL

Relational vs NoSQL format

Relational:

Employee

EmployeeId  FirstName  LastName  HiredDate   PositionId
1           Apaichon   Punopas   13/11/2013  1
2           Tony       Jar       1/1/2011    3

Position

PositionId  PositionName
1           Senior Developer
2           Developer Manager
3           Actor

NoSQL:

Employee

{EmployeeId: 1, FirstName: "Apaichon", LastName: "Punopas", HiredDate: "2013-11-13", PositionId: "#16:0"}

Position

{PositionId: "#16:0", PositionName: "Senior Developer"}

Relational DB Pros and Cons

Pros

• Flexible and well-established.

• Short learning curve.

• Data access through SQL.

• Large development efforts and with large databases are well understood.

• The fundamental structure, i.e., a table, is easily understood and the design and normalization

Cons

• Performance problems with complicated data structures.

• Lack of support for complex base types, e.g., drawings.

• SQL is limited when accessing complex data.

• Knowledge of the database structure is required to create ad hoc queries.

• Locking mechanisms

NoSQL Pros and Cons

Pros

• Mostly open source.

• Horizontal scalability.

• Support for Map/Reduce.

• No need to develop a fine-grained data model.

• Very fast for adding new data.

• No need to change code when the data structure is modified.

• Ability to store complex data types in a single item of storage.

Cons

• Immaturity

• Possible database administration issues

• Data relationships like SQL

• No indexing support (some DBs)

• No ACID (some DBs)

• Absence of standardization

Who is using NoSQL?

• All big companies use NoSQL.

Who is using OrientDB?

THJUG

• Nobody uses Java anymore, I'm nobody

A.K.A

• Nobody uses OrientDB anymore, I'm nobody

NoSQL trends

http://news.yahoo.com/nosql-databases-eat-relational-database-191517881.html

Database trends

Job trends

Will NoSQL replace Relational?

• NoSQL databases are eating into the relational database market.

• New ventures and startups mostly start with NoSQL.

• Relational databases are more widely used in the enterprise and have more mature features. It will be difficult to replace them all within 10 - 20 years, and a new database type may emerge in the future.

Why do joins suck?

Join steps

• Add related data into at least 2 tables

• Select data

• Map data

• Reduce the result set

Employee

EmployeeId  FirstName  LastName  HiredDate   PositionId
1           Apaichon   Punopas   13/11/2013  1
2           Tony       Jar       1/1/2011    3

Position

PositionId  PositionName
1           Senior Developer
2           Developer Manager
3           Actor

SELECT e.*, p.PositionName FROM Employee e INNER JOIN Position p ON e.PositionId = p.PositionId

How does OrientDB join?

Join steps

• Add related data into at least 2 tables.

It's already joined!

Employee

@rid   FirstName  LastName  HiredDate   PositionId
#22:0  Apaichon   Punopas   13/11/2013  #23:0

Position

@rid   PositionName
#23:0  Senior Developer

select @rid as employeeId, firstName, lastName, positionId.positionName from Employee
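The contrast between a key-matching join and a stored link can be sketched in plain Python. This is an illustration only, not OrientDB code: the dictionaries stand in for tables and records, and the function names `join_by_key` and `follow_link` are hypothetical.

```python
# Relational style: two "tables" related by a PositionId key value.
employees = [
    {"EmployeeId": 1, "FirstName": "Apaichon", "PositionId": 1},
    {"EmployeeId": 2, "FirstName": "Tony", "PositionId": 3},
]
positions = [
    {"PositionId": 1, "PositionName": "Senior Developer"},
    {"PositionId": 3, "PositionName": "Actor"},
]

def join_by_key(employees, positions):
    """Inner join: for every employee, search the positions for a matching key."""
    result = []
    for e in employees:
        for p in positions:
            if e["PositionId"] == p["PositionId"]:
                result.append((e["FirstName"], p["PositionName"]))
    return result

# Link style: records are stored under their record ID, and the employee
# holds the record ID of its position instead of a key value to match.
records = {
    "#22:0": {"firstName": "Apaichon", "positionId": "#23:0"},
    "#23:0": {"positionName": "Senior Developer"},
}

def follow_link(records, rid, link_field, target_field):
    """Dereference a link: one direct lookup by record ID, no matching step."""
    target = records[records[rid][link_field]]
    return target[target_field]

joined = join_by_key(employees, positions)
linked = follow_link(records, "#22:0", "positionId", "positionName")
print(joined)
print(linked)
```

The join has to compare key values at query time, while the link version goes straight from one record to the other, which is the idea behind `positionId.positionName` in the query.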

Welcome to OrientDB

Luca Garulli
CEO, Founder

Luca Olivari
President

www.orientdb.org

What is a Graph DB?

Graph Theory

G = (V, E)

V = Vertex

E = Edge
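As a small illustration of G = (V, E), a graph can be written down as a set of vertices and a set of edges. The vertex names and the helper `neighbours` below are made up for this sketch and treat edges as directed (from, to) pairs.

```python
# G = (V, E): a set of vertices and a set of edges between them.
V = {"Apaichon", "Tony", "OrientDB"}
E = {("Apaichon", "Tony"), ("Apaichon", "OrientDB")}  # directed (from, to) pairs

def neighbours(vertex, edges):
    """Return the vertices reachable from `vertex` over one outgoing edge."""
    return {dst for src, dst in edges if src == vertex}

# Every edge must connect two vertices that belong to V.
assert all(src in V and dst in V for src, dst in E)

out_of_apaichon = neighbours("Apaichon", E)
print(sorted(out_of_apaichon))
```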

Certificate of Achievement

This certificate is awarded to

Attendee

You understand Graph Theory.

Apaichon Punopas

เครือข่ายโปรแกรมเมอร์ไทย (Thai Programmers Network)

Today OrientDB 2.0 is not only a Graph DB.

It is a multi-model database.

Document Model

The data in this model is stored inside documents. A document is a set of key/value pairs (also referred to as fields or properties) where a key allows access to its value. Values can hold primitive data types, embedded documents, or arrays of other values.

{firstName: "A", lastName: "LA", friends: [{firstName: "A"}, {firstName: "B", lastName: "lB"}]}

Graph Model

A graph represents a network-like structure consisting of Vertices (also known as Nodes) interconnected by Edges (also known as Arcs). OrientDB's graph model is represented by the concept of a property graph, which defines the following:

• Vertex - an entity that can be linked with other Vertices.

• Edge - an entity that links two Vertices.

Supported Types

• Popular types, the same as in other databases, such as boolean, integer, double, string, binary, etc.

• Embedded -> JSON such as {name: "A", friends: [{name: "B"}, {name: "C"}]}

• Link -> Record ID

Class

A Class is a concept taken from the Object Oriented paradigm. In OrientDB it defines a type of record. It's the closest concept to a Relational DBMS Table. Classes can be schema-less, schema-full, or mixed.

Schema Type

• Schema-Full: enable strict mode at the class level and set all the fields as mandatory.

• Schema-Less: create classes with no properties. The default mode is non-strict, so records can have arbitrary fields.

• Schema-Hybrid, also called Schema-Mixed, is the most used: create classes and define some fields, but leave each record free to define its own custom fields.
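The three schema modes can be pictured with a toy validator. This is a simplified sketch, not OrientDB's validation logic: the function `validate` and its parameters are hypothetical, and it assumes every declared field is mandatory.

```python
def validate(record, mandatory_fields, strict):
    """Return True if `record` is acceptable for a class.

    mandatory_fields: field names the class declares as mandatory.
    strict: when True, fields not declared on the class are rejected.
    """
    missing = [f for f in mandatory_fields if f not in record]
    extra = [f for f in record if f not in mandatory_fields]
    if missing:
        return False
    if strict and extra:
        return False
    return True

record = {"name": "Jay", "nickname": "J"}

schema_full = validate(record, ["name"], strict=True)     # undeclared field rejected
schema_less = validate(record, [], strict=False)          # arbitrary fields allowed
schema_hybrid = validate(record, ["name"], strict=False)  # declared plus custom fields
print(schema_full, schema_less, schema_hybrid)
```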

Cluster

A cluster is a place where a group of records are stored. Perhaps the best equivalent in the relational world would be a Table. By default, OrientDB will create one cluster per class. All the records of a class are stored in the same cluster, which has the same name as the class. You can create up to 32,767 (2^15-1) clusters in a database.

Record ID

In OrientDB each record has its own self-assigned unique ID within the database, called the Record ID or RID. It is composed of two parts:

• cluster-id is the ID of the cluster. Each database can have a maximum of 32,767 clusters (2^15-1).

• cluster-position is the position of the record inside the cluster. Each cluster can handle up to 9,223,372,036,854,775,808 (2^63) records, namely 9,223,372 trillion records!

#<cluster-id>:<cluster-position>
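A minimal sketch of reading the `#<cluster-id>:<cluster-position>` format in Python. The helper `parse_rid` is hypothetical, handles only non-negative IDs, and assumes the position fits in a signed 64-bit integer.

```python
import re

MAX_CLUSTER_ID = 2**15 - 1   # 32,767, the cluster limit stated above
MAX_POSITION = 2**63 - 1     # assumption: position fits in a signed 64-bit integer

RID_PATTERN = re.compile(r"#(\d+):(\d+)")

def parse_rid(text):
    """Split '#<cluster-id>:<cluster-position>' into two integers."""
    match = RID_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a record ID: {text!r}")
    cluster_id, position = int(match.group(1)), int(match.group(2))
    if cluster_id > MAX_CLUSTER_ID or position > MAX_POSITION:
        raise ValueError(f"record ID out of range: {text!r}")
    return cluster_id, position

print(parse_rid("#22:0"))
```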

Inheritance

• Classes support inheritance, the same as in the OOP concept.

Index

• OrientDB supports 4 kinds of indexes.

Security

• Security can be drilled down to the record level, and SSL is supported.

Caching

• OrientDB has several caching mechanisms that act at different levels.

Functions

A Function is an executable unit of code that can take parameters and return a result. Using Functions you can perform Functional programming where logic and data are all together in a central place. Functions are similar to the Stored Procedures of RDBMS.

• Functions can be executed via SQL, Java, REST and Studio.

Transactions

OrientDB is an ACID compliant DBMS.

A database transaction, by definition, must be atomic, consistent, isolated and durable. Database practitioners often refer to these properties of database transactions using the acronym ACID.

Hooks (Triggers)

• A hook works like a trigger. It lets the user application intercept internal events before and after each CRUD operation against records. You can use hooks to write custom validation rules, to enforce security, or even to orchestrate external events such as replication to a Relational DBMS.
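The before/after idea can be sketched with a toy in-memory store. This is a generic illustration, not the OrientDB hook API: the class `HookedStore`, its attributes, and the rule `require_name` are all hypothetical.

```python
class HookedStore:
    """A toy in-memory record store that calls hooks around each create."""

    def __init__(self):
        self.records = {}
        self.before_create = []  # callables run before a record is stored
        self.after_create = []   # callables run after a record is stored

    def create(self, rid, record):
        for hook in self.before_create:
            hook(record)          # a hook may raise to veto the operation
        self.records[rid] = record
        for hook in self.after_create:
            hook(record)

def require_name(record):
    """Custom validation rule: every record needs a non-empty name."""
    if not record.get("name"):
        raise ValueError("name is required")

audit_log = []
store = HookedStore()
store.before_create.append(require_name)
store.after_create.append(lambda record: audit_log.append(record["name"]))

store.create("#24:0", {"name": "Jay"})
try:
    store.create("#24:1", {"surname": "Miner"})  # rejected by require_name
except ValueError as error:
    print("rejected:", error)
print(audit_log)
```

The "before" hook plays the role of a validation rule, and the "after" hook shows how an external side effect such as an audit trail could be triggered.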

API

OrientDB supports 3 kinds of drivers:

• Native binary remote, which talks directly to the TCP/IP socket using the binary protocol

• HTTP REST/JSON, which talks directly to the TCP/IP socket using the HTTP protocol

• Java wrapped, as a layer that links in some way to the native Java driver. This is pretty easy for languages that run on the JVM, like Scala, Groovy and JRuby

Scalability

Programming Language Driver

• Most popular languages are supported.

SQL

Most NoSQL products have a custom query language. OrientDB focuses on standards when it comes to query languages. Instead of inventing "Yet Another Query Language", we started from the widely used and well understood SQL.

SQL - Select

• select from OUser

• select from #10:3

• select from [#10:1, #10:3, #10:5]

• select from OUser where name like 'l%'

• select sum(salary) from Employee where age < 40 group by job

• select from Employee where any() like 'Apa%'

• select from china:Customers

SQL - Insert

• insert into Employee (name, surname, gender) values ('Jay', 'Miner', 'M')

• insert into Employee set name = 'Jay', surname = 'Miner', gender = 'M'

• insert into Employee content {name : 'Jay', surname : 'Miner', gender : 'M'}

SQL - Update

• update Employee set local = true where city = 'London'

• update Employee merge { local : true } where city = 'London'

Delete

• delete from Employee where city <> 'London'

• delete from [#24:0, #24:1, #24:2]

Sub Query

select from Document
let $temp = (
  select @rid, $depth from (
    traverse V.out, E.in from $parent.current
  )
  where @class = 'Concept'
    and (id = 'first concept' or id = 'second concept')
)
where $temp.size() > 0

Traverse

Traverse is a special command that retrieves the connected records by crossing the relationships. This command works not only with the graph API but also at the document level. This means you can traverse relationships between invoices and customers without the need to model the domain using the Graph API.

traverse * from #9:1
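What a traversal does can be sketched over plain Python dictionaries. This is an illustration of the idea, not how OrientDB implements the command: the function `traverse`, the record contents, and the rule for recognising a link are assumptions made for this sketch.

```python
def traverse(records, start_rid):
    """Visit every record reachable from `start_rid` by following links.

    A field value counts as a link when it is a record ID present in `records`,
    or a list containing such record IDs.
    """
    visited, stack = [], [start_rid]
    while stack:
        rid = stack.pop()
        if rid in visited:
            continue
        visited.append(rid)
        for value in records[rid].values():
            candidates = value if isinstance(value, list) else [value]
            for candidate in candidates:
                if isinstance(candidate, str) and candidate in records:
                    stack.append(candidate)
    return visited

# Hypothetical documents: two invoices that link to the same customer.
records = {
    "#9:1": {"number": "INV-1", "customer": "#10:0"},
    "#10:0": {"name": "Jay", "city": "#11:0"},
    "#11:0": {"name": "London"},
    "#9:2": {"number": "INV-2", "customer": "#10:0"},
}
reachable = traverse(records, "#9:1")
print(reachable)
```

Starting from invoice #9:1, the walk reaches the customer and the city through links alone; invoice #9:2 is not visited because nothing reachable links to it.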

My favourites in OrientDB

• There are many things I like in OrientDB that I have never found in other DBs.

• insert and update with JSON

• save - automatically inserts or updates when a value is passed with @rid

• validate properties with regular expressions

• median - with other DBs I got bad performance and had to develop it myself, but OrientDB includes it and it is fast

• array and JSON hierarchy - keeping an array in one field makes it easy to use with data visualisation

• expand - expands an array horizontally, like a table with rows and columns

Appendix

Prerequisite

• JVM

Installation

• Download at http://www.orientechnologies.com/download/

• Extract the file

• Go to the bin directory, then run server.sh or server.bat

Appendix

Management Studio

• By default it runs on port 2480

• Open a browser, then go to http://localhost:2480

Appendix

Console

• Go to the bin directory, then run console.sh or console.bat

Thank you

• Delicious, and enjoy using it