Introduction to TitanDB

Bharat Singh

Software Consultant

Knoldus Software LLP.

Agenda

● Graph Database
– What is Graph Database
– Need for Graph Database

● Titan DB
– Why Titan DB
– CAP theorem
– Architecture overview
– Future of TitanDB

● Apache TinkerPop
– What is Apache TinkerPop
– Need for Apache TinkerPop

What is Graph Database

● A database that uses graph structures for semantic queries with nodes, edges and properties to represent and store data.

● Most graph databases are NoSQL in nature

● Often store data in an underlying key-value store or document-oriented database.

● Store relationships between values as first-class citizens.
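
A minimal sketch of the model described above, assuming a toy in-memory store. The names `Vertex`, `Edge`, and `PropertyGraph` are hypothetical and do not belong to any graph database's API; the point is only that edges are stored as their own records, with their own properties.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    vertex_id: int
    label: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    out_v: int   # id of the source vertex
    in_v: int    # id of the target vertex
    label: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Toy in-memory property graph; relationships are stored explicitly."""

    def __init__(self):
        self.vertices = {}
        self.edges = []

    def add_vertex(self, vertex_id, label, **properties):
        self.vertices[vertex_id] = Vertex(vertex_id, label, properties)
        return self.vertices[vertex_id]

    def add_edge(self, out_v, in_v, label, **properties):
        edge = Edge(out_v, in_v, label, properties)
        self.edges.append(edge)
        return edge

    def neighbours(self, vertex_id, label):
        """Vertices reached from vertex_id over outgoing edges with this label."""
        return [self.vertices[e.in_v] for e in self.edges
                if e.out_v == vertex_id and e.label == label]


graph = PropertyGraph()
graph.add_vertex(1, "person", name="Alice")
graph.add_vertex(2, "person", name="Bob")
graph.add_edge(1, 2, "knows", since=2014)  # the relationship carries its own property

names = [v.properties["name"] for v in graph.neighbours(1, "knows")]
print(names)
```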

Need for Graph Database

● Data is more connected: it is shared across multiple applications on the web

● It is easier to query data stored in a graph structure where nodes are highly connected

● It removes the need to perform multiple join operations between adjacent neighbours

● It allows the use of many graph algorithms that help in optimization

● It allows data to be visualized, hidden relationships to be inferred, and predictions to be derived from the data.
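
To make the point about joins concrete, here is a small sketch comparing the two approaches on a "friends of friends" query. The data and function names are made up for illustration. The relational version rescans the whole friendship table for every hop, while the graph version only follows the edges stored at the current vertex.

```python
from collections import defaultdict

# Relational view: one row per friendship, as in a join table.
friendships = [("alice", "bob"), ("bob", "carol"), ("bob", "dave"), ("carol", "erin")]


def friends_of_friends_join(rows, person):
    """Self-join the friendship table: every hop scans all rows again."""
    result = set()
    for a, b in rows:
        if a == person:
            for c, d in rows:
                if c == b and d != person:
                    result.add(d)
    return result


# Graph view: each vertex keeps direct references to its neighbours.
adjacency = defaultdict(list)
for a, b in friendships:
    adjacency[a].append(b)


def friends_of_friends_traversal(adj, person):
    """Follow stored edges: each hop touches only the current vertex's neighbours."""
    result = set()
    for friend in adj.get(person, []):
        for fof in adj.get(friend, []):
            if fof != person:
                result.add(fof)
    return result


join_result = friends_of_friends_join(friendships, "alice")
traversal_result = friends_of_friends_traversal(adjacency, "alice")
print(sorted(join_result), sorted(traversal_result))
```

Both functions return the same answer; the difference is how much data each one has to look at per hop.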

Why Titan DB

● Support for very large graphs. Titan graphs scale with the number of machines in the cluster.

● Support for ACID properties and eventual consistency.

● Support for very many concurrent transactions and operational graph processing.

● Titan’s transactional capacity scales with the number of machines in the cluster and answers complex traversal queries on huge graphs in milliseconds.

● Vertex-centric indices provide vertex-level querying to solve the infamous super node problem (a vertex with so many incident edges that scanning them all is too slow).

● Provides an optimized disk representation to allow for efficient use of storage and speed of access.

● Open source with the liberal Apache 2 license.
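
The idea behind a vertex-centric index can be sketched as follows. This is a conceptual illustration only, not Titan's actual storage layout or API; the class `SuperNode` and its methods are hypothetical. Edges on a single vertex are kept sorted by label and a sort key, so a query for a narrow range uses binary search instead of scanning every edge.

```python
import bisect


class SuperNode:
    """A vertex whose incident edges are kept sorted by (label, sort_key)."""

    def __init__(self):
        self._keys = []     # sorted (label, sort_key) pairs
        self._targets = []  # target vertex for the key at the same position

    def add_edge(self, label, sort_key, target):
        key = (label, sort_key)
        pos = bisect.bisect_right(self._keys, key)
        self._keys.insert(pos, key)
        self._targets.insert(pos, target)

    def edges_in_range(self, label, low, high):
        """Targets of `label` edges with low <= sort_key < high, via binary search."""
        start = bisect.bisect_left(self._keys, (label, low))
        end = bisect.bisect_left(self._keys, (label, high))
        return self._targets[start:end]


celebrity = SuperNode()
for t in range(100):
    celebrity.add_edge("posted", t, f"post{t}")
for t in range(10_000):
    celebrity.add_edge("followedBy", t, f"user{t}")  # sort key: a timestamp

# Only the ten most recent followers are touched, not all 10,100 edges.
recent = celebrity.edges_in_range("followedBy", 9_990, 10_000)
print(len(recent), recent[0], recent[-1])
```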

Features of Titan DB

● Support for various storage backends:
– Apache Cassandra
– Apache HBase
– Oracle BerkeleyDB

● Support for global graph data analytics, reporting, and ETL through integration with big data platforms:
– Apache Spark
– Apache Giraph
– Apache Hadoop

● Support for geo, numeric range, and full-text search via:
– ElasticSearch
– Solr
– Lucene

● Native integration with the TinkerPop graph stack:
– Gremlin graph query language
– Gremlin graph server
– Gremlin applications

CAP Theorem

● CAP Theorem

– C=Consistency

– A=Availability

– P=Partition tolerance

● HBase favours consistency

– At expense of yield

– i.e. non completed requests

● Cassandra favours availability

– At expense of harvest

– i.e. completeness of answer

● Berkeley DB is non-distributed
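
The yield/harvest trade-off above can be illustrated with a toy model. The shard layout and function names are assumptions made for this sketch, and it does not describe how HBase or Cassandra work internally. One of four shards is unreachable: the consistency-first strategy refuses to answer (lower yield), while the availability-first strategy answers with partial data (lower harvest).

```python
def consistency_first(shards, reachable):
    """Answer only when every shard is reachable; otherwise fail the request."""
    if all(name in reachable for name in shards):
        return [row for rows in shards.values() for row in rows]
    return None  # request not completed: this costs yield


def availability_first(shards, reachable):
    """Always answer from whichever shards are reachable: this costs harvest."""
    return [row for name, rows in shards.items() if name in reachable for row in rows]


def harvest(answer, shards):
    """Fraction of the full data set that is reflected in the answer."""
    total = sum(len(rows) for rows in shards.values())
    return len(answer) / total


shards = {"s1": [1, 2], "s2": [3, 4], "s3": [5, 6], "s4": [7, 8]}
reachable = {"s1", "s2", "s3"}  # s4 is cut off by a network partition

cp_answer = consistency_first(shards, reachable)
ap_answer = availability_first(shards, reachable)

print("consistency-first completed:", cp_answer is not None)
print("availability-first harvest:", harvest(ap_answer, shards))
```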

Architecture overview of Titan DB

Future of TitanDB

● Aurelius is the startup behind Titan, an open source graph database

● DataStax, the company that delivers Apache Cassandra™ to the enterprise, acquired Aurelius on Feb 3rd, 2015

● The Aurelius team will join DataStax to build DataStax Enterprise (DSE) Graph, adding graph database capabilities into DSE alongside Apache Cassandra

What is Apache TinkerPop

● A Graph processing system, currently under Apache incubation

● Has TinkerPop3 Structure API
– Graph, Element, Property

● Has TinkerPop3 Process API
– TraversalSource, GraphComputer

● Gremlin query language
– A scripting language for graph traversal and mutation

● REST API
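
Gremlin traversals are written as a chain of steps, such as `g.V().has(...).out(...).values(...)`. The following is a toy Python imitation of that chaining style, for illustration only. `ToyGraphSource` and `ToyTraversal` are hypothetical classes; this is neither the TinkerPop API nor a Gremlin implementation.

```python
class ToyTraversal:
    """Holds the vertices the traverser is currently at; each step returns a new traversal."""

    def __init__(self, vertices, edges, current):
        self._vertices = vertices  # vertex id -> property dict
        self._edges = edges        # list of (out_id, label, in_id)
        self._current = current    # ids of the current vertices

    def has(self, key, value):
        """Keep only vertices whose property `key` equals `value`."""
        kept = [v for v in self._current if self._vertices[v].get(key) == value]
        return ToyTraversal(self._vertices, self._edges, kept)

    def out(self, label):
        """Move along outgoing edges with the given label."""
        moved = [in_id for v in self._current
                 for (out_id, lbl, in_id) in self._edges
                 if out_id == v and lbl == label]
        return ToyTraversal(self._vertices, self._edges, moved)

    def values(self, key):
        """End the traversal by reading one property from each current vertex."""
        return [self._vertices[v][key] for v in self._current]


class ToyGraphSource:
    def __init__(self, vertices, edges):
        self._vertices = vertices
        self._edges = edges

    def V(self):
        """Start a traversal at every vertex."""
        return ToyTraversal(self._vertices, self._edges, list(self._vertices))


g = ToyGraphSource(
    {1: {"name": "Alice"}, 2: {"name": "Bob"}, 3: {"name": "Carol"}},
    [(1, "knows", 2), (1, "knows", 3), (2, "knows", 3)],
)

result = g.V().has("name", "Alice").out("knows").values("name")
print(result)
```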

Need for Apache TinkerPop

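Dealing with such complex databases requires a well-implemented API from the vendor. But using a vendor-specific API ties the application to that vendor and makes migrating to another database very difficult.

Apache TinkerPop provides the solution: a common API that applications can use across graph databases that support it.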
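
The benefit of a vendor-neutral API can be sketched in a few lines. Everything here is hypothetical: `GraphStore` stands in for a shared interface, and the two store classes stand in for two vendors with different internals. The application function is written once and runs unchanged against either.

```python
from abc import ABC, abstractmethod


class GraphStore(ABC):
    """Hypothetical vendor-neutral interface that application code depends on."""

    @abstractmethod
    def add_edge(self, source, target):
        ...

    @abstractmethod
    def neighbours(self, vertex):
        ...


class AdjacencyListStore(GraphStore):
    """Stand-in for vendor A: keeps a dict of neighbour lists."""

    def __init__(self):
        self._adj = {}

    def add_edge(self, source, target):
        self._adj.setdefault(source, []).append(target)

    def neighbours(self, vertex):
        return list(self._adj.get(vertex, []))


class EdgeListStore(GraphStore):
    """Stand-in for vendor B: keeps a flat list of edges."""

    def __init__(self):
        self._edges = []

    def add_edge(self, source, target):
        self._edges.append((source, target))

    def neighbours(self, vertex):
        return [t for s, t in self._edges if s == vertex]


def reachable_in_two_hops(store, start):
    """Application logic written once against the shared interface."""
    return {m for n in store.neighbours(start) for m in store.neighbours(n)}


results = []
for store in (AdjacencyListStore(), EdgeListStore()):
    for source, target in [("a", "b"), ("b", "c"), ("b", "d")]:
        store.add_edge(source, target)
    results.append(reachable_in_two_hops(store, "a"))

print(results[0] == results[1])
```

Switching backends changes only which store is constructed; `reachable_in_two_hops` stays the same.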

References

● https://en.wikipedia.org/wiki/Graph_database

● http://thinkaurelius.github.io/titan/

● http://tinkerpop.apache.org/docs/3.2.0-incubating/reference/

● http://www.datastax.com/2015/02/datastax-acquires-aurelius-the-experts-behind-titandb


Thank You
