Page 1 © Hortonworks Inc. 2014
Discover HDP 2.2: Using Apache Ambari to Manage Hadoop Clusters
Hortonworks. We do Hadoop.
Page 2 © Hortonworks Inc. 2014
Speakers
Justin Sears
Hortonworks Product Marketing Manager
Jeff Sposetti
Hortonworks Senior Director of Product Management and Committer for Apache Ambari
Mahadev Konar
Hortonworks Co-Founder, Committer and PMC Member for Apache Hadoop, Apache Ambari & Apache ZooKeeper
Page 3 © Hortonworks Inc. 2014
Agenda
• Introduction to Apache Ambari
• New Ambari Innovation in HDP 2.2
– Configuration Enhancements, including Versioning & History
– Ambari Administration, including Views Framework
– Ambari Stacks “Stack Advisor”
• Demo
• Q & A
We’ll move quickly:
• Attendee phone lines are muted
• Text any questions to Mahadev Konar using Webex chat
• Questions answered at the end
• Unanswered questions and answers in upcoming blog post
Page 4 © Hortonworks Inc. 2014
Big Data, Hadoop & Data Center Re-platforming
Business Drivers
• From reactive analytics to proactive interactions
• Insights that drive competitive advantage & optimal returns
Financial Drivers
• Cost of data systems, as % of IT spend, continues to grow
• Cost advantages of commodity hardware & open source software
Technical Drivers
• Data is growing exponentially & existing systems overwhelmed
• Predominantly driven by NEW types of data that can inform analytics
There is an inequitable balance between vendor and customer in the market
Page 5 © Hortonworks Inc. 2014
New Types of Data
Hadoop Value:
• Clickstream: Capture and analyze website visitors’ data trails and optimize your website
• Sensors: Discover patterns in data streaming automatically from remote sensors and machines
• Server Logs: Research logs to diagnose process failures and prevent security breaches
• Sentiment: Understand how your customers feel about your brand and products – right now
• Geographic: Analyze location-based data to manage operations where they occur
• Unstructured: Understand patterns in files across millions of web pages, emails, and documents
Page 6 © Hortonworks Inc. 2014
A Shift from Reactive to Proactive Interactions
HDP and Hadoop allow organizations to use data to shift interactions from reactive (post-transaction) to proactive (pre-decision)
• A shift in Advertising: from mass branding to 1x1 targeting
• A shift in Financial Services: from educated investing to automated algorithms
• A shift in Healthcare: from mass treatment to designer medicine
• A shift in Retail: from static branding to real-time personalization
• A shift in Telco: from break then fix to repair before break
Page 7 © Hortonworks Inc. 2014
Enterprise Goals for the Modern Data Architecture
• Consolidate siloed data sets, structured and unstructured
• Central data set on a single cluster
• Multiple workloads across batch, interactive and real time
• Central services for security, governance and operation
• Preserve existing investment in current tools and platforms
• Single view of the customer, product, supply chain
[Diagram: Modern Data Architecture. Applications: Business Analytics, Custom Applications, Packaged Applications. Data System: RDBMS, EDW and MPP alongside Hadoop, with YARN: Data Operating System running interactive, real-time and batch workloads on HDFS (Hadoop Distributed File System) across nodes 1 to N. Sources: existing systems (CRM, ERP, other) plus Clickstream, Web & Social, Geolocation, Sensor & Machine, Server Logs and Unstructured data.]
Page 8 © Hortonworks Inc. 2014
YARN Transformed Hadoop & Opened a New Era
YARN: The Architectural Center of Hadoop
• Common data platform, many applications
• Support multi-tenant access & processing
• Batch, interactive & real-time use cases
[Diagram: Batch, interactive & real-time data access on YARN: Data Operating System (Cluster Resource Management) over HDFS (Hadoop Distributed File System). Engines shown: Script (Pig), SQL (Hive), Java/Scala (Cascading), NoSQL (HBase, Accumulo), Stream (Storm), Search (Solr), In-Memory (Spark) and Others (ISV Engines), with Tez and Slider as intermediate layers.]
Page 9 © Hortonworks Inc. 2014
YARN Extends Hadoop to Other Data Center Leaders
YARN: The Architectural Center of Hadoop
• Common data platform, many applications
• Support multi-tenant access & processing
• Batch, interactive & real-time use cases
• Supports 3rd-party ISV tools (e.g. SAS, Syncsort, Actian)
YARN Ready Applications
• Facilitate ongoing innovation and enterprise adoption via an ecosystem of new and existing “YARN Ready” solutions
Page 10 © Hortonworks Inc. 2014
Enterprise Hadoop: Central Set of Services
Enables Apache Hadoop to be an Enterprise Data Platform with centralized services for:
• Governance
• Operations
• Security
Everything that plugs into Hadoop inherits these services
• Governance: Load data and manage according to policy
• Operations: Deploy and effectively manage the platform
• Security: Provide layered approach to security through Authentication, Authorization, Accounting, and Data Protection
[Diagram: Security, Governance and Operations services surrounding batch, interactive & real-time data access on YARN: Data Operating System (Cluster Resource Management) and HDFS (Hadoop Distributed File System). Operations: Provision, Manage & Monitor (Ambari, Zookeeper) and Scheduling (Oozie). Data access engines: Script (Pig), SQL (Hive), Java/Scala (Cascading), Stream (Storm), Search (Solr), NoSQL (HBase, Accumulo), In-Memory (Spark) and Others (ISV Engines), with Tez and Slider.]
Page 11 © Hortonworks Inc. 2014
Hortonworks Data Platform 2.2
HDP Delivers Enterprise Hadoop
YARN is the architectural center of HDP
• Common data set across all applications
• Batch, interactive & real-time workloads
• Multi-tenant access & processing
Provides comprehensive enterprise capabilities
• Governance
• Security
• Operations
Enables broad ecosystem adoption
• ISVs can plug directly into Hadoop
The widest range of deployment options
• Linux & Windows
• On premises & cloud
[Diagram: HDP 2.2 components. Batch, interactive & real-time data access on YARN: Data Operating System (Cluster Resource Management) and HDFS (Hadoop Distributed File System): Script (Pig), SQL (Hive), Java/Scala (Cascading), NoSQL (HBase, Accumulo), Stream (Storm), Search (Solr), In-Memory (Spark) and Others (ISV Engines), with Tez and Slider. Governance (Data Workflow, Lifecycle & Governance): Falcon, Sqoop, Flume, Kafka, NFS, WebHDFS. Security (Authentication, Authorization, Audit, Data Protection): Storage: HDFS; Resources: YARN; Access: Hive; Pipeline: Falcon; Cluster: Ranger; Cluster: Knox. Operations: Provision, Manage & Monitor (Ambari, Zookeeper) and Scheduling (Oozie). Deployment Choice: Linux, Windows, Cloud, On-Premises.]
Page 14 © Hortonworks Inc. 2014
How do you Operate a Hadoop Cluster?
Apache Ambari is a framework to provision,
manage and monitor Hadoop clusters
Page 15 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Apache Ambari Themes
Operate Hadoop at Scale
Deliver the core operational capabilities to provision, manage and monitor Hadoop clusters at scale.
Integrate with the Enterprise
Robust API for integration with existing enterprise systems, such as Teradata Viewpoint and Microsoft SCOM.
Extend for the Ecosystem
Provide an extensible platform for Enterprises, Partners and the Community, via Stacks and Views.
Page 16 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
What’s New in Ambari 1.7.0
Core Services
• ResourceManager HA
• Capacity Scheduler Refresh Queues
• HDFS Rebalance
• Service Config Versioning + History
• Manage -env.sh Files
• Set <final> Config Properties
• Download Client Configs
Ambari Platform
• Ambari Administration
• Ambari Views Framework
• Ambari Blueprints Export Configs
• Ubuntu 12 Platform Support
Stacks
• Support for HDP 2.2
• Stack Advisor
For a complete list of enhancements…
http://www.slideshare.net/hortonworks/apache-ambari-whats-new-in-170
Page 18 © Hortonworks Inc. 2014
Configuration Versioning and History
• Service Config Versions (saved per service)
• List of Config History
• Compare Versions
• Filter by “Changed Properties”
• Revert Changes (i.e. “Make Current”)
• Audit Log of Changes
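The behaviour listed above (per-service versions, compare, filter by changed properties, revert, audit log) can be sketched in a few lines of Python. This is a toy in-memory model, not Ambari's implementation: the class name, method names and sample HDFS values are assumptions made for illustration, and the sketch assumes that "Make Current" saves a copy of the older version as a new version.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceConfigHistory:
    """Toy per-service config history (hypothetical, not Ambari's data model)."""
    service: str
    versions: list = field(default_factory=list)   # index 0 holds version 1
    audit_log: list = field(default_factory=list)  # (version, user, note) tuples

    def save(self, properties, user, note=""):
        """Store a snapshot as the next version and record who saved it."""
        self.versions.append(dict(properties))
        number = len(self.versions)
        self.audit_log.append((number, user, note))
        return number

    def current(self):
        """Return the latest version number and a copy of its properties."""
        return len(self.versions), dict(self.versions[-1])

    def compare(self, a, b):
        """Return only the properties whose values differ between versions a and b."""
        old, new = self.versions[a - 1], self.versions[b - 1]
        keys = sorted(set(old) | set(new))
        return {k: (old.get(k), new.get(k)) for k in keys if old.get(k) != new.get(k)}

    def make_current(self, number, user):
        """Revert by saving a copy of an older version as a new version."""
        return self.save(self.versions[number - 1], user, f"Reverted to V{number}")


history = ServiceConfigHistory("HDFS")
history.save({"dfs.replication": "3", "dfs.blocksize": "134217728"}, "admin", "initial")
history.save({"dfs.replication": "2", "dfs.blocksize": "134217728"}, "admin", "lower replication")

changed = history.compare(1, 2)             # "Filter by Changed Properties"
reverted = history.make_current(1, "admin")  # "Make Current"
print(changed, reverted)
```

Keeping every revert as a new version means the history and audit log only ever grow, so no earlier state is lost.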
Page 20 © Hortonworks Inc. 2014
Service Configuration Controls
• Most Recent Versions (view, compare, revert)
• Compare Versions
• Revert Version
• Filter by “Changed”
Page 22 © Hortonworks Inc. 2014
Ambari Extension Points
[Diagram: Ambari Server, Ambari Web and several Ambari Agents, with Stacks, and the two extension points: Ambari Views and Ambari Stacks. Language labels shown: java, js, python.]
Page 24 © Hortonworks Inc. 2014
Ambari Views Framework
Goal: enable the delivery of custom UI experiences in Ambari Web
Developers can extend the Ambari Web interface
• Views expose custom UI features for Hadoop Services
Ambari Admins can entitle Views to Ambari Web users
• Entitlements framework for controlling access to Views
Page 26 © Hortonworks Inc. 2014
View Components
• Serve client-side assets (such as HTML + JavaScript)
• Expose server-side resources (such as REST endpoints)
[Diagram: a View packages client-side assets (.js, html), shown in Ambari Web, and server-side resources (java), shown in Ambari Server, with REST connections to Hadoop and other systems.]
Page 27 © Hortonworks Inc. 2014
Versions and Instances
• Deploy multiple versions and create multiple instances of a view
• Manage accessibility and usage
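A minimal sketch of how a view, its versions and its instances nest when addressed over REST. The path layout below follows the Ambari REST API as best recalled and should be checked against the Views documentation; the helper name and the FILES / 0.1.0 / MyFiles values are hypothetical.

```python
from urllib.parse import quote


def view_instance_path(view, version, instance):
    """Build the resource path for one instance of one version of a view.

    Assumed layout: /api/v1/views/<view>/versions/<version>/instances/<instance>
    """
    parts = ["api", "v1", "views", view, "versions", version, "instances", instance]
    # Percent-encode each segment so unusual instance names cannot break the path.
    return "/" + "/".join(quote(part, safe="") for part in parts)


# Two instances of the same view version get two distinct resource paths.
first = view_instance_path("FILES", "0.1.0", "MyFiles")
second = view_instance_path("FILES", "0.1.0", "TeamFiles")
print(first)
print(second)
```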
Page 28 © Hortonworks Inc. 2014
Choice of Deployment Model
• For Hadoop Operators: Deploy Views in an Ambari Server that is managing a Hadoop cluster
• For Data Workers: Run Views in a “standalone” Ambari Server
[Diagram: two Ambari Servers around one Hadoop cluster (store & process). Operators manage the cluster and may have Views deployed. Data Workers use the cluster and use a “standalone” Ambari Server for Views.]
Page 29 © Hortonworks Inc. 2014
Learn More About Views Framework
https://github.com/apache/ambari/blob/trunk/ambari-views/docs/index.md
https://github.com/apache/ambari/tree/trunk/ambari-views/examples
https://cwiki.apache.org/confluence/display/AMBARI/Views
https://github.com/apache/ambari/tree/trunk/contrib/views
Page 33 © Hortonworks Inc. 2014
Ambari Stacks
• Defines a consistent Stack lifecycle interface that can be extended
• Encapsulates Stack Versions, Services, Components, Dependencies, Cardinality, Configurations, Commands
• Dynamically add Stack + Service definitions
[Diagram: Ambari ({rest} and <ambari-web>) on top of Stacks, which define services such as HDFS, YARN, MR2, Hive, Pig, Oozie, HBase, Storm and Falcon.]
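As a rough illustration of what a Stack definition encapsulates, the sketch below models a stack version, its services, their components and each component's cardinality, then checks a proposed layout against it. The dictionary layout, the function name and the host counts are hypothetical; the cardinality strings ("1", "1+", "1-2", "ALL") are modeled on the style used in Ambari stack definitions and should be treated as an assumption here.

```python
def cardinality_ok(cardinality, count, total_hosts):
    """Check a host count against a cardinality string such as '1', '1+', '0-1' or 'ALL'."""
    if cardinality == "ALL":
        return count == total_hosts
    if cardinality.endswith("+"):
        return count >= int(cardinality[:-1])
    if "-" in cardinality:
        low, high = (int(part) for part in cardinality.split("-"))
        return low <= count <= high
    return count == int(cardinality)


# Hypothetical, heavily simplified stack definition.
stack = {
    "name": "HDP",
    "version": "2.2",
    "services": {
        "HDFS": {"components": {"NAMENODE": "1-2", "DATANODE": "1+"}},
        "ZOOKEEPER": {"components": {"ZOOKEEPER_SERVER": "1+"}},
    },
}

# Proposed layout: number of hosts assigned to each component on a 3-host cluster.
layout = {"NAMENODE": 1, "DATANODE": 3, "ZOOKEEPER_SERVER": 0}

problems = [
    component
    for service in stack["services"].values()
    for component, cardinality in service["components"].items()
    if not cardinality_ok(cardinality, layout.get(component, 0), total_hosts=3)
]
print(problems)
```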
Page 34 © Hortonworks Inc. 2014
Stacks In Action
http://hortonworks.com/partners/certified/ops-ready/
Page 35 © Hortonworks Inc. 2014
Stack Advisor
• Extends Ambari Stacks to include a “Stack Advisor”
• Provides recommendations for and performs validation on component layout & configuration
• Improves Stack pluggability
• Exposes new REST endpoints: /recommendations and /validations
• REST endpoints used during Cluster Install Wizard and Configs UI
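A hedged sketch of what a call to these endpoints might look like from Python. It only builds the request and does not send it. The server address, the stack name and version in the path, and the body fields (recommend, services, hosts) are assumptions drawn from recollection of the Ambari REST API rather than from these slides, so verify them against the Ambari documentation; authentication is omitted.

```python
import json
import urllib.request

ENDPOINTS = ("recommendations", "validations")


def stack_advisor_request(base_url, stack, version, endpoint, body):
    """Build (but do not send) a POST request to a Stack Advisor endpoint."""
    if endpoint not in ENDPOINTS:
        raise ValueError(f"unknown Stack Advisor endpoint: {endpoint}")
    url = f"{base_url.rstrip('/')}/api/v1/stacks/{stack}/versions/{version}/{endpoint}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        # Ambari expects this header on modifying requests; the value is arbitrary.
        headers={"X-Requested-By": "ambari"},
        method="POST",
    )


# Hypothetical request asking for a recommended component layout.
request = stack_advisor_request(
    "http://ambari.example.com:8080",
    "HDP",
    "2.2",
    "recommendations",
    {
        "recommend": "host_groups",
        "services": ["HDFS", "YARN", "ZOOKEEPER"],
        "hosts": ["host1.example.com", "host2.example.com"],
    },
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen, with credentials added, would perform the call against a real server.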