Overview
SCALE14x 2016
Agenda/Schedule
- Apache Bigtop Overview
- Apache Spark Overview/Getting Started
- Lunch Break
- Apache Ignite
- Workshop, tutorial, open time
http://workshops.bigtop.rocks (click on Agenda button)
What is Bigtop?
Setting the standard for testing, packaging and integration of leading big/fast data components
and many others…
Components as Building Blocks
Dependency Hell!!
[Figure: component dependency matrix covering hdfs, zookeeper, hbase, kafka, spark, ..., mapred, oozie, hive, etc.]
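One way to picture the dependency problem: every component has to be built and tested after the components it sits on. The sketch below is illustrative only. The dependency edges are simplified assumptions, not Bigtop's actual build graph, and `build_order` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Assumed, simplified edges: component -> components it depends on.
# Illustrative only; not taken from Bigtop's real build definitions.
ASSUMED_DEPS = {
    "hdfs": set(),
    "zookeeper": set(),
    "mapred": {"hdfs"},
    "hbase": {"hdfs", "zookeeper"},
    "kafka": {"zookeeper"},
    "hive": {"hdfs", "mapred"},
    "oozie": {"hdfs", "mapred"},
    "spark": {"hdfs"},
}


def build_order(deps):
    """Return one valid order in which each component follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    print(build_order(ASSUMED_DEPS))
```

Even with eight components the ordering constraints pile up quickly, which is the point of the slide: someone has to own the whole graph.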
Build all the Things!!!
The BOM
Build of Materials (BOM)
* List of >=1 components
* Gradle for build/actions
* Produce sets of debs/rpms
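A minimal sketch of the BOM idea, assuming a BOM is just a mapping from component name to version and that each component yields one build action per package format. The `EXAMPLE_BOM` contents, the `package_tasks` helper, and the `<component>-<format>` task naming are assumptions for illustration, not Bigtop's actual BOM file or Gradle task list.

```python
# Hypothetical BOM: component name -> version.
# Versions are placeholders, not an actual Bigtop release.
EXAMPLE_BOM = {
    "zookeeper": "x.y.z",
    "hadoop": "x.y.z",
    "spark": "x.y.z",
}


def package_tasks(bom, formats=("deb", "rpm")):
    """List one assumed build task name per component and package format."""
    if not bom:
        # The slide says a BOM lists at least one component.
        raise ValueError("a BOM needs at least one component")
    return [f"{name}-{fmt}" for name in sorted(bom) for fmt in formats]


if __name__ == "__main__":
    for task in package_tasks(EXAMPLE_BOM):
        print(task)
```

Keeping the component list in one place means adding a component to the stack is a one-line change, and the set of packages to produce follows from it.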
Bigtop Origins
Yahoo!, 2010
Created, fostered early Hadoop community
Working on Hadoop 0.20 stack
2011
Yahoo! engineers move to Cloudera, solving early problems of packaging and maintaining the first commercially supported Hadoop distro
Early value add
Provide a common foundation for proper integration of a growing number of Hadoop family components
Foundation provides a solid base for validating applications running on top of the stack(s)
Provide neutral packaging and deployment/config
Early Mission Accomplished
Foundation for commercial Hadoop distros/services
Leveraged by app providers…
What now?
We are done, right?!?
Industry/Ecosystem Evolution & New Community Needs/Ideas
Where should we spend our time? Which users should benefit?
Moving beyond out-of-the-box (oob) MapReduce…
Lambda/Stream Architectures
HDFS + Zookeeper +
Get out from the Apache dome
New focus and target end users
Data engineers vs distro builders
Enhance Operations/Deployment
Reference implementations & tutorials
Laying new foundation with 1.0+
Self-starter, non-kitchen sink building
- Making Gradle tooling smarter
- Jenkins job autogen
- Leveraging containers for parallelization
Data data data…
Smarter/Realistic test data
- bigpetstore
- bigtop-bazaar
- weather data gen
Tutorial/Learning Data sets
- githubarchive.org
- more tbd…
Deployment/Mgmt
Updated Puppet modules
- newest best practices
- next level enhanced security options
Wider range of starter deployment topologies
Include some handling of test/tutorial data
More components…
Sounds interesting, how can I help?
* Join mailing list, ask questions, suggest features, etc.
* Contribute (components, tutorials, docs)
* Report bugs
Thank You, Q&A
Nate D’[email protected]
@kaiyzen