2
REFERENCE ARCHITECTURE CenturyLink ® Big Data as a Service (BDaaS) Foundation Aggregate data from disparate LOB applications, visualize data analytics using CenturyLink and/or 3rd party tools. Quickly spin up/down Spot Cloudera Clusters in this fully managed and secure production Cloudera environment in the CenturyLink Cloud ® . BDaaS Foundation solution from CenturyLink is an intelligent hosting service enabling you to stand-up Big Data Platforms faster and at lower cost, and therefore achieve faster time-to-market and time-to-value for driving real business insights and results. Customers can purchase BDaaS for a monthly fee rather than an up-front capital expenditure. Utilization based pricing further reduces your risk with hourly pricing only for resources used. CenturyLink BDaaS is truly the best of both worlds, delivering raw compute power of physical servers and the automation and pay-as-you-go flexibility of Cloud. Deployable in all of CenturyLink’s 60+ global data centers with redundant MPLS backbone and high-bandwidth circuits connecting your network with the CenturyLink Cloud. Cloudera Enterprise connects directly using Hive or Impala to aggregate data from disparate LOB applications or sources hosted in any environment or location. Spot Cloudera Clusters can be spun up and down as needed through the control panel, to facilitate software stack validation without disrupting data development and analysis tasks on the sandbox cluster. Ingest data and perform inquiries for real-time search, time series, geo-spatial, IoT, key/value data. Production MySQL Cluster ensures that a failure of the MySQL Cloudera metadata instance does not bring down the entire cluster. CenturyLink always recommends an active-active replication setup for production clusters. CenturyLink’s Decision Analyzer, Model Controller and Navigator can be leveraged in SAP solutions for data analytics and visualization. Customers can also use Apache Hadoop and other SAP licensed software or open stack software for analytics activities. Data Sources OLTP, ERP, CRM Systems Documents & Emails Web Logs, Click Streams Machine Generated Sensor Data Geo-location Data Social Networks ETL ETL NoSQL Cluster—DBaaS Dedicated Bare Metal Orchestrate Service Multi-Data Center Fault Tolerance MySQL Cluster— Cloudera Metadata Dedicated Bare Metal Active-Active Cloudera Enterprise Data Hub Dedicated Bare Metal Public Cloud Public Cloud Object Storage S3 Compatible API Unlimited Storage Production Backup Data Visualization Decision Analyzer Model Controller Navigator 3rd Party Tools

Big Data as a-Service Reference Architecture - · PDF fileREFERENCE ARCHITECTURE CenturyLink® Big Data as a Service (BDaaS) Foundation ... data center yielding a robust and always-on

Embed Size (px)

Citation preview

Page 1: Big Data as a-Service Reference Architecture - · PDF fileREFERENCE ARCHITECTURE CenturyLink® Big Data as a Service (BDaaS) Foundation ... data center yielding a robust and always-on

REFERENCE ARCHITECTURE

CenturyLink® Big Data as a Service (BDaaS) FoundationAggregate data from disparate LOB applications, visualize data analytics using CenturyLink and/or 3rd party tools. Quickly spin up/down Spot Cloudera Clusters in this fully managed and secure production Cloudera environment in the CenturyLink Cloud®.

BDaaS Foundation solution from CenturyLink is an intelligent hosting service enabling you to stand-up Big Data Platforms faster and at lower cost, and therefore achieve faster time-to-market and time-to-value for driving real business insights and results. Customers can purchase BDaaS for a monthly fee rather than an up-front capital expenditure. Utilization based pricing further reduces your risk with hourly pricing only for resources used. CenturyLink BDaaS is truly the best of both worlds, delivering raw compute power of physical servers and the automation and pay-as-you-go flexibility of Cloud. Deployable in all of CenturyLink’s 60+ global data centers with redundant MPLS backbone and high-bandwidth circuits connecting your network with the CenturyLink Cloud.

Cloudera Enterprise connects directly using Hive or Impala to aggregate data from disparate LOB applications or sources hosted in any environment or location.

Spot Cloudera Clusters can be spun up and down as needed through the control panel, to facilitate software stack validation without disrupting data development and analysis tasks on the sandbox cluster. Ingest data and perform inquiries for real-time search, time series, geo-spatial, IoT, key/value data.

Production MySQL Cluster ensures that a failure of the MySQL Cloudera metadata instance does not bring down the entire cluster. CenturyLink always recommends an active-active replication setup for production clusters.

CenturyLink’s Decision Analyzer, Model Controller and Navigator can be leveraged in SAP solutions for data analytics and visualization. Customers can also use Apache Hadoop and other SAP licensed software or open stack software for analytics activities.

Data Sources

OLTP, ERP,CRM Systems

Documents & Emails

Web Logs, Click Streams

Machine Generated

Sensor Data

Geo-location Data

Social Networks

ETL

ETL

NoSQL Cluster—DBaaSDedicated Bare Metal

Orchestrate ServiceMulti-Data Center Fault Tolerance

MySQL Cluster—Cloudera MetadataDedicated Bare MetalActive-Active

Cloudera EnterpriseData Hub

Dedicated Bare Metal

Public Cloud

PublicCloud

Object StorageS3 Compatible APIUnlimited Storage

Production

Backup

DataVisualization

Decision AnalyzerModel ControllerNavigator3rd Party Tools

Page 2: Big Data as a-Service Reference Architecture - · PDF fileREFERENCE ARCHITECTURE CenturyLink® Big Data as a Service (BDaaS) Foundation ... data center yielding a robust and always-on

©2016 CenturyLink. All Rights Reserved. The CenturyLink mark, pathways logo and certain CenturyLink product names are the property of CenturyLink. All other marks are the property of their respective owners. Services not available everywhere. Business customers only. CenturyLink may change or cancel services or substitute similar services at its sole discretion without notice. cm160195

REFERENCE ARCHITECTURE

CenturyLink® Big Data as a Service (BDaaS) FoundationObject Storage can facilitate backups of the Hadoop file system in a highly scalable and fault tolerant distributed database. Data are saved using highly redundant industry standard methods. Additionally, object storage data are automatically replicated to an additional data center yielding a robust and always-on object storage solution that can be leveraged for inexpensive backup and data sourcing (e.g. for sandbox or snap clusters).

Sandbox environments utilizing Public Cloud Cloudera Cluster on Dedicated Bare Metal, and Spot Cloudera Cluster Virtual Servers

Snapshot server backups to block storage are always available in CenturyLink Cloud®. Automated rolling server backups are also available to cover real time high IOPS (up to 20,000) backups.

Like infrastructure can be provided for customers migrating from an existing pilot with SAS and QlikView analytics and data visualization products, to maintain operational processing.

Standard Node Configuration: 20 Core Intel Xeon E5-2650 v3 @ 2.30 GHz, 256 GB RAM, 6 x 4TB Data HDD raw @ 7200 RPM, CDH Licensed. Node quantity is custom based on individual deployment basis.

Persistent point-to-point IPSec VPN tunnels can support speeds up to 10 Gbps for secure organization-wide access to your big data platforms and analytics.

Security

• Kerberos Authentication and standalone Key Distribution Center (KDC) or direct Active Directory (AD) integration• LDAP Authorization and Apache Sentry policies for role-based access controls (RBAC). HDFS permissions and extended ACLs

where required• Data Governance capabilities like centralized auditing, audit reports, data lineage and unified technical metadata across the cluster• Optional Security Components including Dedicated Firewalls, Dedicated Network based IDS/IPS, Host based IDS/IPS, Host based Content Integrity Monitoring, Web Application Firewall, Proxy Services, DDOS Protection, DLP/Content Management, Email/Content Scanning, and Threat Management/Vulnerability Assessments