Unstructured Storage: The Foundation for Your Information Advantage
Alexander Graf, Advisory Systems Engineer
Unstructured Storage Solutions
• 50%+ of global GDP will be digitized by 2021 [1]
• 48% are unsure what their industry will look like in 3 years
• 45% fear they will be obsolete in 3-5 years [2]
• 92% see digital business initiatives as critical [2]
[1] IDC FutureScape: Worldwide IT Industry 2018 Predictions, Oct 2017, Doc # US43171317
[2] Dell Digital Transformation Index
Digital Transformation Is Disrupting Every Industry
AUTOMOTIVE, MEDIA, MANUFACTURING, HEALTHCARE, LIFE SCIENCES, AND MORE…
AI drives business outcomes
Automotive
Science
Retail
Finance
Manufacturing
…and more
New revenue streams | Increase revenue run rate | Operational efficiency
• Recommendation engines
• Cross sell and up-sell
• Risk analysis
• Fraud detection
• Sentiment analysis
• Chatbots
• Speech to text to speech
• Intent analysis to actions
• Similar products
• Visual search
• Object recognition
• Anomaly detection
Dell - Internal Use - Confidential
WHAT IS AI, MACHINE LEARNING AND DEEP LEARNING?
Artificial Intelligence refers to the simulation of intellectual tasks by a machine, with little to no input from a human or programmer, using machine learning techniques.
Machine learning refers to the process of “training” the machine by feeding it large amounts of data so that it learns how to respond, rather than being explicitly programmed.
Deep learning is a form of machine learning that uses many-layered artificial neural networks, parallel processing, and massive volumes of data to enable faster and more accurate artificial intelligence.
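To make "learns how to respond, rather than being explicitly programmed" concrete, here is a minimal sketch in plain Python. It is a toy illustration, not anything from the slides: the function name `fit_line` and the sample data are made up, and ordinary least squares stands in for "training". The rule y = 2x + 1 is never written into the program; it is recovered from the examples.

```python
def fit_line(xs, ys):
    """Fit y = w*x + b to example points with ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    w = cov / var
    b = mean_y - w * mean_x
    return w, b


# Hypothetical training data generated by the hidden rule y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
prediction = w * 10.0 + b  # response to an input the program never saw
print(w, b, prediction)
```

Deep learning follows the same idea at a much larger scale: many more parameters, many more examples, and parallel hardware to fit them.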
CUSTOMERS DEMAND OUTCOMES FROM DATA
THE DATA CENTRE OF TODAY
• DATA LIVES ON DISK AND TAPE
• MOVE DATA TO THE CPU AS NEEDED
• FOCUS ON DEEP STORAGE HIERARCHY
THE DATA CENTRE OF TOMORROW
• DATA RESIDES NEAR THE CPU AND MEMORY
• OUTCOMES ARE DRIVEN BY COMPUTE CENTRIC DESIGN
• MOVE FASTER, STORE MORE, COMPUTE EVERYTHING
CHALLENGE: Transforming Data Into Value
• CREATING BUSINESS IMPACT FROM DATA
• KEEPING UP WITH DATA GROWTH
• UNLOCKING DATA TRAPPED IN SILOS
SIMPLICITY AT SCALE: Seamlessly extend management and policies across a massively growing data set.
EXTRACT VALUE FROM DATA: Support high performance workloads and deliver faster time to insights with all-flash.
UNIFIED DATA LAKE: Eliminate data silos and reduce obstacles across the Edge, Core and cloud.
Dell EMC Isilon: Support the most demanding workloads with the ability to scale performance and capacity as needed.
High Performance
Single File System, One Namespace
Unmatched Efficiency
Simplicity & Ease of Use
Linear Scalability
Easy Growth
Isilon and OneFS
Isilon Workload Consolidation
TCO Optimization: Simplicity and Ease of Use
• Automation:
  • NO manual intervention
  • NO reconfiguration
  • NO server or client mount point or application changes
  • NO data migrations
  • NO RAID
Single File System Spans All Nodes
Directories and Files Striped Across Cluster
TCO Optimization: AutoBalance Capacity and Performance
[Figure: AutoBalance Across Nodes. Node states shown: EMPTY, FULL, BALANCED.]
• Automation Balances Data
  • Reduces Costs, Complexity, Risk
  • Eliminates Hot Spots
  • NO data migrations
  • NO RAID
• Delivers Over 80% Storage Utilization
• Push-Button Linear Scaling
  • Under 60 Seconds
  • Transparent to Users and Applications
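The effect of AutoBalance can be sketched with simple arithmetic. This is a conceptual illustration only: it is not the OneFS algorithm, and the function name `autobalance` and the node sizes are assumptions chosen for the example. It redistributes existing data so that every node ends up at the same fill ratio after an empty node joins the cluster.

```python
def autobalance(used_tb, capacity_tb):
    """Return per-node usage after spreading data evenly by capacity.

    Conceptual model only: every node ends at the same fill ratio.
    """
    if len(used_tb) != len(capacity_tb):
        raise ValueError("used_tb and capacity_tb must have the same length")
    fill_ratio = sum(used_tb) / sum(capacity_tb)
    return [cap * fill_ratio for cap in capacity_tb]


# Hypothetical cluster: four nearly full 100 TB nodes plus one new empty node.
used = [90.0, 90.0, 90.0, 90.0, 0.0]
capacity = [100.0] * 5

balanced = autobalance(used, capacity)
print(balanced)  # each node carries an equal share of the 360 TB
```

In this model no data is lost or duplicated; the total stays the same and only the per-node share changes, which is why hot spots disappear without a manual migration.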
Unconstrained Scale
ISILON SIMPLICITY AND EASE OF USE
• Single volume and file system
• Directories and files striped across cluster nodes
• Automation
  • NO manual intervention
  • NO reconfiguration
  • NO client mount point changes
  • NO application changes
  • NO data migrations
  • NO RAID or LUNs
© Copyright 2017 Dell Inc.
Enterprise Grade Features: In-Place Analytics
Speeds Time to Insight
Enterprise Data Protection for Hadoop
Lower costs
• Eliminates dedicated Hadoop infrastructure
Increase flexibility
• Simultaneous support for any Apache-compliant Hadoop distribution
HADOOP ARCHITECTURE – DAS VS ISILON
[Figure: DAS layout with a NameNode and combined Data Node + Compute Nodes on Ethernet, next to an Isilon layout with separate Compute Nodes on Ethernet and Isilon nodes providing name node and data node services over native HDFS.]
TRADITIONAL “SHARE-NOTHING” HADOOP
[Figure: Existing Virtualized Data Center with Existing Primary Storage holding the unstructured data (copy 1), next to a SHARE-NOTHING Hadoop Infrastructure whose nodes hold copies 2, 3 and 4.]
• Hadoop on a Stick (R=3) means 5 data copies ($$$$)
• Data has to be copied to the Hadoop cluster before analysis can begin (Time to Results)
How will you maintain data consistency when a file changes on your primary storage?
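The cost of those copies can be estimated with a rough calculation. The sketch below takes the slide's figure of 5 data copies at face value and compares it with a single shared copy stored at the 80% utilization quoted on the AutoBalance slide. The 100 TB dataset size and the function names are assumptions for illustration.

```python
def raw_capacity_for_copies(dataset_tb, copies):
    """Raw capacity consumed when the dataset is stored `copies` times."""
    return dataset_tb * copies


def raw_capacity_at_utilization(dataset_tb, utilization):
    """Raw capacity needed when `utilization` of raw space holds user data."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return dataset_tb / utilization


dataset_tb = 100.0  # hypothetical dataset size

# Share-nothing: 5 data copies, as stated on the slide.
share_nothing_tb = raw_capacity_for_copies(dataset_tb, copies=5)

# Shared storage: one copy at the 80% utilization quoted earlier in the deck.
shared_tb = raw_capacity_at_utilization(dataset_tb, utilization=0.80)

print(share_nothing_tb, shared_tb, share_nothing_tb / shared_tb)
```

Under these assumptions the share-nothing layout needs about four times the raw capacity, before counting the effort of keeping the copies consistent.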
ISILON “SHARE-EVERYTHING” HADOOP
• Start using Hadoop NOW with unused processing and RAM available in your VMware environment
• No replication required (use your existing data)
• Access to the same data via NAS and HDFS protocols
• Time to results is extremely fast using already existing data, with NO COPIES or wasted $$$$
• Analysis can begin with the 1st VM
[Figure: Existing Virtualized Data Center with New Hadoop Compute Nodes reaching the Unstructured Data on Existing Primary Storage over the Data Center Network, using the native HDFS protocol.]
TIME-TO-RESULTS
Data Copy Analysis versus In-Place Analysis
[Figure: Data Copy Analysis. Data is copied from Existing Primary Storage to Hadoop on a Stick.]
Have you ever copied 100TB from Primary Storage to a Hadoop system?
How long does it take to copy 100TB from one place to another over a 10Gb link?
>24 Hours
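The figure can be checked with back-of-the-envelope arithmetic. The sketch below assumes decimal units (1 TB = 10^12 bytes, 10 Gb/s = 10^10 bits per second) and a hypothetical 90% effective link efficiency to account for protocol overhead. Neither assumption comes from the slide.

```python
def transfer_hours(size_bytes, link_bits_per_s, efficiency=1.0):
    """Hours needed to move `size_bytes` over a link at the given efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    seconds = size_bytes * 8 / (link_bits_per_s * efficiency)
    return seconds / 3600


SIZE_BYTES = 100e12   # 100 TB, decimal units (assumption)
LINK_BPS = 10e9       # 10 Gb/s link

ideal_hours = transfer_hours(SIZE_BYTES, LINK_BPS)
realistic_hours = transfer_hours(SIZE_BYTES, LINK_BPS, efficiency=0.9)

print(round(ideal_hours, 1), round(realistic_hours, 1))
```

Even at full line rate the copy takes roughly 22 hours, so any realistic overhead pushes it past the 24 hours quoted above.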
[Figure: In-Place Analysis. Hadoop Compute Nodes read the relevant data for analysis from Existing Primary Storage over the Data Center Network.]
Support for Multiple Hadoop Landscapes (or even different versions/distros)
[Figure: several MapReduce clusters on virtual servers, with Cloudera and IBM shown as example distributions, sharing one DATA LAKE that provides name node and data node services. Access protocols shown: HDFS, NFS, FTP, SMB.]
Increase Utilization to Control Costs
[Figure: workloads Hadoop 1, Hadoop 2 and HBase on one consolidated cluster.]
• Consolidated cluster has access to the entire pool of physical resources
• Take advantage of multi-tenancy to increase utilization during non-peak hours
Object
• CLOUD SCALE STORAGE
• GLOBAL DATA ACCESS
• CLOUD-NATIVE / MODERN APPS
What Is ECS?
ECS is a universal object content store
Primarily Object: S3, Swift, CAS
Most Cost Effective Tier of Storage
• Lowest cost per TB
• Same data protection overhead for small and large files
Data and Metadata Stored as Objects
• Metadata search native capability
No Coded Limits
• Easily scalable
• Infinitely expandable
Geodistributed
• Data globally accessible by Web, mobile and cloud apps
• Reduce data protection overhead with 3+ sites
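Why protection overhead can fall once three or more sites participate is easiest to see with a generic model. The sketch below is an assumption-laden illustration, not a statement of how ECS is implemented: it assumes a 12+4 erasure code inside each site (overhead 16/12, about 1.33) and an XOR-style parity scheme across sites in which one site's worth of capacity protects the others. Neither figure appears on the slide, so the real numbers should be checked against the product documentation.

```python
def geo_overhead(sites, local_overhead=16 / 12):
    """Raw-to-usable ratio for `sites` sites under the assumed model.

    Assumptions (not from the slides): 12+4 erasure coding locally and
    XOR-style parity across sites, so 1 of every N sites' capacity is parity.
    """
    if sites < 1:
        raise ValueError("sites must be at least 1")
    if sites == 1:
        return local_overhead
    # With 2 sites this degenerates to a full mirror (factor 2).
    return local_overhead * sites / (sites - 1)


for n in (1, 2, 3, 4, 8):
    print(n, round(geo_overhead(n), 2))
```

In this model two sites behave like a full mirror, and each additional site spreads the parity cost over more data, which matches the slide's point that the overhead shrinks from three sites upward.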
ECS Target Use Cases
• Cloud Backup
• Tiered Archive
• Cloud Native Apps (web/mobile)
• Sync & Share
• Analytics
• Cloud Gateway
• IoT
[Figure: SITE 1, SITE 2, SITE 3]
Scale Effortlessly - Store Efficiently - Access Globally
Ready Solution: Hortonworks Hadoop with Isilon
High density Consolidated Data Lake
Compute Configuration: Modular Infrastructure
Solution benefits
• Scale Storage independently from Compute
• Minimize data movement
• Eliminate Shadow IT projects
• Current Isilon customers: leverage existing File Management processes
Differentiation
• Industry Leading storage density and scaling
• Consolidates data silos with one copy of data
• Enterprise-grade File Management
• File-level regulatory compliance out-of-the-box
• Current Isilon customers: brings analytics to where the data exists in Isilon
Configuration
• Hortonworks Data Platform Ent+ and Isilon OneFS
• Scales from 100TB to 64 PB
• Compute Nodes: 6x PowerEdge FC630 with 8x 1.2TB HDD per Sled
• Infrastructure Nodes: 4x PowerEdge FC630 with 3x 1.2TB HDD per Sled
• Shared Storage Nodes: 4x Isilon X410 with 102TB HDD / 3.2TB SSD / 256 GB; 2x QDR Infiniband Switch 8 ports
• Pod Network: 2x Dell EMC Networking S4048 10GbE Pod Switches; 1x Dell EMC Networking S3048 10GbE Pod Switches
• Cluster Network: 2x Dell EMC Networking S6000 40GbE Cluster Switches
Ready Solution: Cloudera Hadoop with Isilon
High density Consolidated Data Lake
Compute Configuration: Rack-Server Infrastructure
Solution benefits
• Scale Storage independently from Compute
• Minimize data movement
• Eliminate Shadow IT projects
• Current Isilon customers: leverage existing File Management processes
Differentiation
• Industry Leading storage density and scaling
• Consolidates data silos with one copy of data
• Enterprise-grade File Management
• File-level regulatory compliance out-of-the-box
• Current Isilon customers: brings analytics to where the data exists in Isilon
Configuration
• Cloudera Enterprise Data Hub and Isilon OneFS
• Scales from 100TB to 64 PB
• Compute Nodes: 6x PowerEdge R630 each with 8x 1.2TB HDD
• Infrastructure Nodes: 4x PowerEdge R630 each with 3x 1.2TB HDD
• Shared Storage Nodes: 4x Isilon X410 with 102TB HDD / 3.2TB SSD / 256 GB; 2x QDR Infiniband Switch 8 ports
• Pod Network: 2x Dell EMC Networking S4048 10GbE Pod Switches; 1x Dell EMC Networking S3048 10GbE Pod Switches
• Cluster Network: 2x Dell EMC Networking S6000 40GbE Cluster Switches
Introducing Ready Solutions for AI
Validated stack built to handle most demanding AI workloads
Deep Learning with
Machine Learning with
Simpler AI Experience | Faster, Deeper AI Insights | Proven AI Expertise
• 30% Improved data scientist productivity
• Up to 2.9X Performance vs. competition
• 98% Lower training time
Self-service for data scientists
Selection of AI frameworks & libraries
Industry-leading, scale-out architecture
Single point of support
Experienced Partners
Data Thinking
• Consulting: Data, Algorithms, Compute, Mindset
• Guiding companies to data leadership and creatorship
Data Science
• Ideation & Scoping of Use Cases
• Data Analysis
• Development of machine learning algorithms
• Proofs of Concept
Data Engineering
• Architecture design and concepts
• Engineering and deployment
• Testing and test management
• Application management
DataOps
• Managed, hybrid, cloud infrastructures
• DevOps Application management
• Hadoop and beyond: solutions at scale
• Security concepts and system design
*um Hadoop-as-a-Service
1. Hadoop hardware on-prem at the customer datacenter or off-prem at the UM datacenter
2. *um provides fully managed platform services, including the Hadoop layer
3. Customer-specific analytics software (Tableau, SAS or others)
managed by
Compute nodes