
Spark infrastructure


Page 1: Spark infrastructure

What is Spark

Apache Spark is an open source framework for fast, in-memory data processing. It currently supports Scala, Java and Python. Besides the core libraries, there is support for streaming, machine learning, DataFrames, integration with R, and a version of SQL.
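To make those pieces concrete, here is a minimal sketch in Scala (Spark 1.x-era API, matching this deck) that touches the core RDD API and the DataFrame/SQL layer. The app name and file paths are placeholders, not from the original slides.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object QuickLook {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("QuickLook"))

    // Core API: an in-memory word count over an RDD.
    val counts = sc.textFile("hdfs:///data/notes.txt")
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
    counts.take(10).foreach(println)

    // DataFrames and SQL driven from the same context.
    val sqlContext = new SQLContext(sc)
    val events = sqlContext.read.json("hdfs:///data/events.json")
    events.registerTempTable("events")
    sqlContext.sql("SELECT COUNT(*) FROM events").show()

    sc.stop()
  }
}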

Eric Marshall

Page 2: Spark infrastructure

Spark compatibility and ecosystem

• Spark runs in a clustered environment of arbitrary size and is designed to sit on top of distributed storage such as HDFS, Cassandra, or S3 (see the sketch below).
• Spark integrates with schedulers including YARN and Mesos.
• Spark scales well and has been deployed on a cluster of 8,000 nodes at the time of this writing.
• Spark can read from nearly all sources and has performant connectors to NoSQL and SQL datastores and to tools like Tableau.
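As an illustration (not from the original deck), the same RDD call can read from different storage backends just by changing the URI scheme. This assumes an existing SparkContext sc; the host, bucket, and paths are placeholders.

// Read the same way from HDFS and from S3; credentials come from the Hadoop configuration.
val fromHdfs = sc.textFile("hdfs://namenode:8020/logs/2015/*.gz")
val fromS3   = sc.textFile("s3n://example-bucket/logs/part-*")
println(fromHdfs.count() + fromS3.count())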

Page 3: Spark infrastructure

Spark and Hadoop

Spark can read from nearly all sources and has performant connectors to the Hadoop ecosystem, to other NoSQL and SQL datastores, and to tools like Tableau. Spark can connect to streams or work in batches.
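For the streaming case, a hypothetical Spark Streaming sketch (DStream API) that consumes a socket stream in 10-second micro-batches looks like this; the host and port are placeholders, and a batch job would simply use sc.textFile instead.

import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

// Micro-batch every 10 seconds from a text socket source.
val ssc = new StreamingContext(new SparkConf().setAppName("StreamSketch"), Seconds(10))
val lines = ssc.socketTextStream("stream-host", 9999)
lines.count().print()   // print the record count of each micro-batch
ssc.start()
ssc.awaitTermination()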

Spark can also run in a standalone clustered mode with HDFS or any form of shared file system (such as NFS mounted on each node at the same path).

Spark can run highly available: it is resilient to Worker failures and will move work to other Workers. Spark supports standby Masters or can rely on the cluster’s scheduling software.
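For standby Masters in standalone mode, a ZooKeeper-based recovery configuration looks roughly like the sketch below; the ZooKeeper hosts are placeholders.

# In conf/spark-env.sh on each Master (sketch):
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
  -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 \
  -Dspark.deploy.zookeeper.dir=/spark"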

Alternatively, Spark can run within Hadoop as a YARN job, reading from and writing to HDFS and connecting to other data sources.
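As a hypothetical example, submitting an application to YARN in cluster mode might look like this; the jar name, class, and resource sizes are placeholders.

spark-submit \
  --master yarn-cluster \
  --class com.example.QuickLook \
  --num-executors 10 \
  --executor-memory 4g \
  quicklook-assembly.jar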

Page 4: Spark infrastructure

Spark Tasks

Spark is agnostic regarding the underlying cluster manager. Spark applications run as independent sets of processes on a cluster, coordinated by the SparkContext object in your main program (called the driver program).

Specifically, to run on a cluster, the SparkContext can connect to several types of cluster managers (either Spark’s own standalone cluster manager or Mesos/YARN), which allocate resources across applications.
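In code, the choice of cluster manager comes down to the master URL handed to the SparkContext. A minimal sketch, with placeholder host names:

import org.apache.spark.{SparkConf, SparkContext}

// spark:// selects Spark's standalone manager; mesos://mesos-host:5050 selects Mesos;
// for YARN the master is typically set when the job is submitted.
val conf = new SparkConf()
  .setAppName("ClusterDemo")
  .setMaster("spark://master-host:7077")
val sc = new SparkContext(conf)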

Each application has its own executor processes, which run tasks in multiple threads, provide isolation between Spark contexts, and serve as a unit of work on the scheduling side.

If configured to do so, Spark uses resources dynamically, scaling executors up and down as the work demands (currently supported only via YARN).
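A sketch of enabling dynamic allocation on YARN; the executor bounds are placeholders, and the external shuffle service must be running on the NodeManagers.

val conf = new SparkConf()
  .setAppName("ElasticJob")
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true")        // keeps shuffle data available when executors are removed
  .set("spark.dynamicAllocation.minExecutors", "2")
  .set("spark.dynamicAllocation.maxExecutors", "50")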