1 Copyright © 2011, Oracle and/or its affiliates. All rights reserved.
Oracle Big Data
Tapping into Diverse Data Sets
Transactions
Information
Architectures
Today: Decisions based on database data
Big Data: Decisions based on all your data
• Video and Images
• Machine-Generated Data
• Social Data
• Documents
Case: On-line Ads and Content
Low Latency
• NoSQL DB: Lookup user profile; add user if not present
• Expert System: Real-time: Determine best ad to place on page for this user
Batch
• Web logs
• Actual ads served
• HDFS
• Profiles (NoSQL DB)
• High scale data reductions
• BI and Analytics
• Billing
• Predictions on browsing (input into Expert System)
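The low-latency path on this slide (look up the user profile, add the user if not present, then pick an ad) can be sketched as follows. This is a minimal illustration only: the in-memory `ProfileStore`, the `Profile` fields, and the `choose_ad` rule are hypothetical stand-ins, not the NoSQL DB API or the expert system from the deck.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical user profile; the field names are illustrative only."""
    user_id: str
    interests: dict = field(default_factory=dict)  # topic -> view count


class ProfileStore:
    """In-memory stand-in for the key-value profile store (assumption)."""

    def __init__(self):
        self._profiles = {}

    def lookup_or_add(self, user_id):
        # Look up the user profile; add the user if not present.
        profile = self._profiles.get(user_id)
        if profile is None:
            profile = Profile(user_id)
            self._profiles[user_id] = profile
        return profile


def choose_ad(profile, ads_by_topic, default_ad):
    """Toy decision rule: pick the ad for the user's most-viewed topic."""
    if not profile.interests:
        return default_ad
    top_topic = max(profile.interests, key=profile.interests.get)
    return ads_by_topic.get(top_topic, default_ad)


store = ProfileStore()
ads = {"sports": "ad-running-shoes", "travel": "ad-flights"}

# First visit: the user is unknown, so a profile is created on the fly.
new_user = store.lookup_or_add("u1")
first_ad = choose_ad(new_user, ads, "ad-generic")

# Later visit: the profile now carries browsing history.
new_user.interests["travel"] = 3
second_ad = choose_ad(store.lookup_or_add("u1"), ads, "ad-generic")
```

The point of the sketch is the single-record get-or-create step, which is the access pattern the slide assigns to the key-value store.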
Case: On-line Ads and Content
NoSQL DB
• Dynamic and rapidly changing schema
• Scalable single record lookup
HDFS
• Low cost, high scale storage
• Write once, read many times
Hadoop
• High scale batch processing
• Highly customizable infrastructure
RDBMS
• Deep analytics and BI value add
• Reporting for large user community
Big Data Is About…
Tapping into diverse data sets
Finding and monetizing hidden relationships
Driving data-based business decisions
Big Data: Infrastructure Requirements
Acquire
• Low, predictable Latency
• High Transaction Count
• Flexible Data Structures
Organize
• High Throughput
• In-Place Preparation
• All Data Sources/Structures
Analyze
• Deep Analytics
• Agile Development
• Massive Scalability
• Real Time Results
Divided Solution Spectrum
[Diagram: solutions placed across the Acquire, Organize, and Analyze phases; axes: Data Variety (Schema-less/Unstructured to Schema) and Information Density (“Low Density” to “High Density”)]
NoSQL: Flexible, Specialized, Developer Centric
• Distributed File Systems
• Transaction (Key-Value) Stores
• MapReduce Solutions
SQL: Trusted, Secure, Administered
• DBMS (OLTP)
• ETL
• DBMS (DW)
• Advanced Analytics
Oracle Integrated Software Solution Stack
[Diagram: products placed across the Acquire, Organize, and Analyze phases; axes: Data Variety (Unstructured to Schema) and Information Density]
• Oracle NoSQL DB
• HDFS
• Hadoop
• Oracle Loader for Hadoop
• Oracle Data Integrator
• Oracle Database (OLTP)
• Oracle Database (DW)
• In-DB Analytics: “R”, Mining, Text, Graph, Spatial
• Oracle BI EE
Oracle’s Big Data solution
Phases: Stream, Acquire, Organize, Analyze & Visualize
[Diagram: Oracle Big Data Appliance, Oracle Exadata, and Oracle Exalytics, connected by InfiniBand]
Why build a Hadoop Appliance?
• Time to Build?
• Required Expertise?
• Cost and Difficulty Maintaining?
Oracle Engineered Solutions
Big Data Appliance
• Hadoop
• NoSQL Database
• Oracle Loader for Hadoop
• Oracle Data Integrator
Oracle Exadata
• OLTP & DW
• Data Mining & Oracle R
• Semantics
• Spatial
Exalytics
• Speed of Thought Analytics
Big Data Appliance: Hardware
18 Sun X4270 M2 Servers
• 48 GB memory per node; 864 GB memory total
• 2 CPUs (6-core Intel) per node, 216 cores total
• 12 x 2 TB HDDs per node; 432 TB raw disk total
3 InfiniBand switches
• 40 Gb/sec InfiniBand – 100 total ports (for internal backplane and interconnection to Exadata)
• 10 Gb/sec Ethernet – 16 total ports (for connection to datacenter)
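The rack totals quoted on this slide follow directly from the per-node figures. A short check, using only the numbers stated above (the variable names are illustrative):

```python
# Per-node figures as stated on the hardware slide.
NODES = 18
MEMORY_GB_PER_NODE = 48
CPUS_PER_NODE = 2
CORES_PER_CPU = 6
DISKS_PER_NODE = 12
TB_PER_DISK = 2

# Rack totals derived from the per-node figures.
total_memory_gb = NODES * MEMORY_GB_PER_NODE
total_cores = NODES * CPUS_PER_NODE * CORES_PER_CPU
raw_tb_per_node = DISKS_PER_NODE * TB_PER_DISK
total_raw_tb = NODES * raw_tb_per_node

print(f"memory: {total_memory_gb} GB, cores: {total_cores}, "
      f"raw disk: {raw_tb_per_node} TB per node, {total_raw_tb} TB total")
```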
Big Data Appliance
Cluster of industry standard servers for Hadoop and NoSQL Database
• Focus on Scalability and Availability at low cost
Compute and Storage
• 18 High-performance low-cost servers acting as Hadoop nodes
• 24 TB Capacity per node
• 2 6-core CPUs per node
• Hadoop triple replication
• NoSQL Database triple replication
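Triple replication means each block of data is stored three times, so the space available for unique data is roughly one third of raw capacity. A rough estimate, assuming a replication factor of 3 applies to all stored data and ignoring operating system, temporary, and filesystem overhead (the helper name is hypothetical):

```python
def usable_capacity_tb(raw_tb, replication_factor=3):
    """Approximate space for unique data when every block is stored
    `replication_factor` times. Overheads are ignored (assumption)."""
    return raw_tb / replication_factor


nodes = 18
raw_tb_per_node = 24  # 24 TB capacity per node, as stated on the slide

rack_raw_tb = nodes * raw_tb_per_node
rack_usable_tb = usable_capacity_tb(rack_raw_tb)
node_usable_tb = usable_capacity_tb(raw_tb_per_node)

print(f"raw: {rack_raw_tb} TB, approx. unique data: {rack_usable_tb:.0f} TB")
```

Real usable capacity would be lower once scratch space and non-replicated system areas are accounted for.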
10GigE Network
• 8 10GigE ports
• Datacenter connectivity
InfiniBand Network
• Redundant 40Gb/s switches
• IB connectivity to Exadata
Big Data Appliance Building Block
• High-performance storage server built from industry standard components
• 12 disks - 2TB 7200 RPM High Capacity SAS
• 2 Six-Core Intel Xeon Processors (L5640)
• Dual ported 40 Gb/sec InfiniBand
• Optimized software layout:
• Hadoop HDFS
• HBase and Hive
• NoSQL Database and Replicas
• Hardware by Sun
• Software by Oracle
Scale Out to Infinity
Scale out by connecting racks to each other using InfiniBand
• 72 Nodes
• 864 Cores
• 1.7 PB Storage
Big Data Appliance: Software
Big Data for the Enterprise
• Foundation Software:
– Oracle Linux
– Oracle Java VM
– Open-source Apache Hadoop Distribution
– Open-source R Distribution
• Application Software:
– Oracle NoSQL Database Enterprise Edition – New
– Oracle Loader for Hadoop – New
– Oracle Data Integrator Application Adapter for Hadoop – New
Oracle Big Data Appliance Software
• Oracle Linux 5.6
• Java HotSpot VM
• Apache Hadoop Distribution v0.20.x
• R Distribution
• Oracle NoSQL Database Enterprise Edition
• Oracle Data Integrator Application Adapter for Hadoop
• Oracle Loader for Hadoop
Big Data Appliance
Big Data for the Enterprise
• Optimized and Complete
– Everything you need to store and integrate your lower information density data
• Integrated with Oracle Exadata
– Analyze all your data
• Easy to Deploy
– Risk Free, Quick Installation and Setup
• Single Vendor Support
– Full Oracle support for the entire system and software set
Oracle NoSQL Database
A distributed, scalable key-value database
• Simple Programming and Operational Model
– Simple Major + Sub key and Value data structure
– ACID transactions
– Configurable consistency & durability
– Scalable throughput, bounded latency
• Commercial Grade Software and Support
– General-purpose
– Reliable – Based on proven Berkeley DB JE HA
– Easy to install and configure
• Easy Management
– Web-based console, API accessible
– Manages and Monitors: Topology; Load; Performance; Events; Alerts
[Diagram: Applications, each with a NoSQLDB Driver, connect to Storage Nodes in Data Center A and Data Center B]
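The "Major + Sub key and Value" structure can be pictured as a two-level map: a major key selects a group of related records, and a sub key selects one value within that group. The sketch below is an in-memory illustration under that assumption; `MajorMinorStore` and its methods are hypothetical and do not reproduce the Oracle NoSQL Database driver API.

```python
from collections import defaultdict


class MajorMinorStore:
    """In-memory illustration of a major key + sub key -> value layout."""

    def __init__(self):
        # major key (tuple of strings) -> {sub key (tuple) -> value bytes}
        self._data = defaultdict(dict)

    def put(self, major, minor, value):
        self._data[tuple(major)][tuple(minor)] = value

    def get(self, major, minor):
        # Plain .get() avoids creating empty groups for unknown major keys.
        return self._data.get(tuple(major), {}).get(tuple(minor))

    def multi_get(self, major):
        """Return every record under one major key, ordered by sub key."""
        return dict(sorted(self._data.get(tuple(major), {}).items()))


kv = MajorMinorStore()
kv.put(["users", "u1"], ["profile", "name"], b"Ada")
kv.put(["users", "u1"], ["profile", "city"], b"London")
kv.put(["users", "u2"], ["profile", "name"], b"Grace")

name = kv.get(["users", "u1"], ["profile", "name"])
u1_records = kv.multi_get(["users", "u1"])
missing = kv.get(["users", "u9"], ["profile", "name"])
```

Grouping by major key is what makes it cheap to fetch or update all of one user's records together.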
Oracle Loader for Hadoop
[Diagram: inputs are partitioned and transformed into Oracle-ready format, then loaded into a table for query]
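The "partition and transform" step can be illustrated conceptually: records are routed to partitions by key, then each partition is ordered and rendered into a load-ready text form. This is a sketch under stated assumptions, not Oracle Loader for Hadoop itself; the function name, the CRC-based hash routing, and the comma-delimited output are all illustrative choices.

```python
import zlib


def partition_and_transform(records, num_partitions, key_field, columns):
    """Route records to partitions by key hash, sort within each partition,
    and render each row as a delimited line (illustrative format only)."""
    partitions = [[] for _ in range(num_partitions)]
    for record in records:
        key = str(record[key_field]).encode("utf-8")
        # crc32 gives a hash that is stable across runs, unlike hash().
        index = zlib.crc32(key) % num_partitions
        partitions[index].append(record)

    output = []
    for rows in partitions:
        rows.sort(key=lambda r: str(r[key_field]))
        output.append([",".join(str(r[c]) for c in columns) for r in rows])
    return output


records = [
    {"user_id": "u2", "ad": "ad-flights"},
    {"user_id": "u1", "ad": "ad-running-shoes"},
    {"user_id": "u2", "ad": "ad-hotels"},
    {"user_id": "u3", "ad": "ad-generic"},
]
parts = partition_and_transform(records, 3, "user_id", ["user_id", "ad"])
```

Because routing depends only on the key, all rows for a given key end up in the same partition, which lets each partition be loaded independently.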
Streaming Access to HDFS
[Diagram: HDFS data files (Datafile_part_1, Datafile_part_2, Datafile_part_m, Datafile_part_n, Datafile_part_x) are accessed from Oracle Database through FUSE as an External Table and queried through a View or a Table Function (Map, Reduce)]
Oracle Data Integrator
Easily integrate data from any source
Expanded functionality:
=> Construct Hadoop jobs to transform and load data into Oracle
=> Leverage Oracle Loader for Hadoop and/or Hive
Exadata: A Platform for Analytics
• Data Mining
• Text Analytics
• Spatial Analytics
• Graph Analytics
• Statistics
• Integrate into Applications
In-database Statistics and Advanced Analytics with R
• Deliver enterprise-level advanced analytics based on the R environment
1. Oracle’s Distribution of Open Source R
• Enterprise support for open-source R
• Enhanced performance with Intel MKL libraries for x86 hardware
2. Oracle R Enterprise
• Eliminates R’s memory constraint by enabling R to work directly and transparently on database-resident data
• Transparently leverages Oracle’s in-database analytics via the R language
• Enables integration of R scripts into enterprise production applications and OBIEE dashboards
• Leverages latest R algorithms and CRAN packages
Oracle R Architecture
• R workspace console: no changes to the user experience
• Function push-down: data transformation & statistics
• Oracle statistics engine: scale to large data sets
• OBIEE, Web Services: embed in operational systems
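Function push-down means the aggregation runs inside the database and only the result travels to the client, instead of the client pulling every row into memory first. The sketch below contrasts the two approaches using Python's built-in `sqlite3` as a stand-in database; it is an analogy for the idea only, not Oracle R Enterprise, and the `clicks` table and its columns are invented for illustration.

```python
import sqlite3
import statistics

# In-memory SQLite database as a stand-in for the database tier (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clicks (user_id TEXT, seconds_on_page REAL)")
conn.executemany(
    "INSERT INTO clicks VALUES (?, ?)",
    [("u1", 10.0), ("u1", 20.0), ("u2", 30.0), ("u2", 50.0)],
)

# Client-side approach: fetch every row, then aggregate in client memory.
rows = conn.execute("SELECT user_id, seconds_on_page FROM clicks").fetchall()
grouped = {}
for user_id, seconds in rows:
    grouped.setdefault(user_id, []).append(seconds)
client_side = {user: statistics.mean(vals) for user, vals in grouped.items()}

# Push-down approach: the database computes the statistic and returns
# one row per group, so client memory no longer bounds the data size.
pushed_down = dict(
    conn.execute(
        "SELECT user_id, AVG(seconds_on_page) FROM clicks GROUP BY user_id"
    )
)
conn.close()
```

Both paths give the same answer here; the difference is where the work happens and how much data crosses the wire.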
The preceding is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied
upon in making purchasing decisions. The development, release, and timing of any
features or functionality described for Oracle’s products remains at the sole
discretion of Oracle.