DESCRIPTION
You know Pig is more than a farm animal and that Hive is not some ultra-hip bar. You've moved beyond the buzzwords and the word-count demos. Now… you're ready to figure out how it all fits in. In this session we will review common integration scenarios, proven patterns, and best practices for integrating Big Data solutions into your existing data warehouse and BI architecture. Learn how you too can ride the Big Data wave without reinventing the wheel, both enhancing the information you currently deliver and solving problems that were previously unapproachable.
MAKING BUSINESS INTELLIGENT www.pragmaticworks.com
The BI Guy's Little Guide to Big Data
Chris Price, Senior BI Consultant
BluewaterSQL

Introductions…
Chris Price, Senior BI Consultant with Pragmatic Works
Author, Regular Speaker, Data Geek & Super Dad
BluewaterSQL http://bluewatersql.wordpress.com cprice@pragmaticworks.com

Big Data: Data Explosion
As recently as 2000, only ¼ of data was digital; the rest was on paper, film, or other analog media
According to IBM, 90% of data was created in the last 2 years
Data volume is now growing 10x every 5 years
Approximately 85% comes from new sources
Consumerization: 4.3 connected devices per adult; 27% use social media input

Big Data
[Figure: data sources grouped by volume (Megabytes, Gigabytes, Terabytes, Petabytes) and by data complexity (variety and velocity)]
Traditional: Payables, Payroll, Inventory, Contacts, Orders, Campaigns
Web: Web Logs, Search Marketing, Recommendations, Affiliates, Advertising, Mobile, Collaboration, eCommerce
Big Data: Service Logs, Spatial & GPS coordinates, Data market feeds, eGov feeds, Weather, Text/image, Click stream, Wikis/blogs, Sensors, RFID/Devices, SMS, HD Audio/video
Source: Brian Mitchel, TechEd 2013

Big Data is, well… Big
Drove $28b in IT investment in 2012
Expected to grow to $34b in 2014
Challenges
Data Volumes (Hardware/Storage Economics)
Data Diversity (Multiple Types & Sources)
Data Velocity (Real-Time)
User Expectations
How do we plan/integrate…?

Agenda
Hadoop Landscape
Current BI/DW Landscape
BI/DW & Hadoop Intersection
Tools/Techniques/Strategies

Hadoop Ecosystem

Hadoop on Windows
HDInsight on Windows Azure
Seamlessly scale in the cloud
Backed by Azure Storage Vault (ASV)
Hortonworks Data Platform (HDP)
On-premise
Based on HDFS

Current Landscape
[Diagram]
Data Sources: Traditional Sources (CRM, ERP, LOB, Web)
BI/DW System: DW, Cubes
Client Tools: Reporting Services, SharePoint, Microsoft Applications

Future Landscape
[Diagram]
Data Sources: Traditional Sources (CRM, ERP, LOB, Web), New Sources (Email, Logs, Social Media, Sensor)
BI/DW System: DW, Cubes
Hadoop
Client Tools: Reporting Services, SharePoint, Microsoft Applications

Business Scenario
[Diagram components: Sensor Data, Flume, WebHDFS, Hadoop/HDFS, Sqoop, DW, Cube, Reporting Tools; ODBC connections]

What about Azure?
[Diagram components: AzCopy, Azure Blob Storage, Hadoop, Sqoop, DW, Cube, Reporting Tools; ODBC connections]

Tools, Techniques & Strategies
Enterprise Data Services: WebHDFS, Sqoop, HCatalog, Pig/Hive
Enterprise Operational Services: Oozie
Other: Windows Azure Blob Storage & AzCopy, Hive ODBC, PolyBase

WebHDFS
Born from HFTP; intended as a replacement
Widely used by Yahoo
High-performance, first-class native protocol using an industry-standard RESTful mechanism
Complete interface for reading, writing & managing files
Supports secure authentication
Data locality – requests are sent to the data nodes

WebHDFS – Get Example
Request:
curl -i -L "http://host:port/webhdfs/v1/foo/bar?op=OPEN"
Response:
HTTP/1.1 307 TEMPORARY_REDIRECT
Content-Type: application/octet-stream
Location: http://datanode:50075/webhdfs/v1/foo/bar?op=OPEN&offset=0
Content-Length: 0

HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 22

Hello, webhdfs user!
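The two responses in the Get example can be sketched in Python: the first server answers op=OPEN with a 307 redirect, and the client repeats the request against the data node named in the Location header. This is a minimal sketch, not a WebHDFS client. The names `open_file`, `fake_fetch`, and `Fetch` are hypothetical, and `fake_fetch` only replays the slide's responses in memory instead of contacting a cluster.

```python
from typing import Callable, Dict, Tuple

# (status code, response headers, body) for one HTTP GET; a stand-in for a real client.
Response = Tuple[int, Dict[str, str], bytes]
Fetch = Callable[[str], Response]


def open_file(fetch: Fetch, url: str, max_redirects: int = 3) -> bytes:
    """Read a file via op=OPEN, following redirects to the data node."""
    for _ in range(max_redirects + 1):
        status, headers, body = fetch(url)
        if status == 200:
            return body
        if status == 307 and "Location" in headers:
            url = headers["Location"]  # the data node that serves the file
            continue
        raise RuntimeError(f"unexpected HTTP status {status} for {url}")
    raise RuntimeError("too many redirects")


def fake_fetch(url: str) -> Response:
    """Hypothetical in-memory server that replays the exchange from the slide."""
    if url == "http://host:port/webhdfs/v1/foo/bar?op=OPEN":
        location = "http://datanode:50075/webhdfs/v1/foo/bar?op=OPEN&offset=0"
        return 307, {"Location": location}, b""
    if url.startswith("http://datanode:50075/"):
        headers = {"Content-Type": "application/octet-stream"}
        return 200, headers, b"Hello, webhdfs user!\r\n"
    return 404, {}, b""


data = open_file(fake_fetch, "http://host:port/webhdfs/v1/foo/bar?op=OPEN")
print(data.decode().strip())
```

With curl, the -L flag does the same job: it tells curl to follow the redirect automatically.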

WebHDFS – More Examples
Rename request:
curl -i -X PUT "http://host:port/webhdfs/v1/foo/bar?op=RENAME&destination=/foo/bar2"
Create directory request:
curl -i -X PUT "http://host:port/webhdfs/v1/foo2?op=MKDIRS"
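All of these requests share one URL pattern: the /webhdfs/v1 prefix, the HDFS path, and an op parameter plus any operation-specific parameters. A minimal sketch of that pattern, assuming a hypothetical helper `webhdfs_url` and a placeholder host and port (namenode.example.com and 50070 are illustrative assumptions, not taken from the slides):

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host: str, port: int, path: str, op: str, **params: str) -> str:
    """Build a WebHDFS v1 URL for an absolute HDFS path and an operation."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    # Keep "/" readable in parameter values such as destination=/foo/bar2.
    query = urlencode({"op": op, **params}, safe="/")
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


# Placeholder cluster address (an assumption for illustration only).
HOST, PORT = "namenode.example.com", 50070

open_url = webhdfs_url(HOST, PORT, "/foo/bar", "OPEN")
rename_url = webhdfs_url(HOST, PORT, "/foo/bar", "RENAME", destination="/foo/bar2")
mkdirs_url = webhdfs_url(HOST, PORT, "/foo2", "MKDIRS")

for url in (open_url, rename_url, mkdirs_url):
    print(url)
```

OPEN is sent as a GET, while RENAME and MKDIRS are sent as PUT requests, as in the curl commands.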

Sqoop
Tool designed to efficiently move data between Hadoop (Hive & HBase) and an RDBMS
Importing (single and all tables)
Exporting
Eval (query execution)
Merge (multiple HDFS datasets)
Incremental imports
Generates MapReduce jobs
Can control the level of parallelism
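As a rough illustration of the import, incremental import, and parallelism options, the sketch below assembles a `sqoop import` command line without running it. The helper `sqoop_import_args`, the JDBC URL, the table, and the paths are hypothetical; the flags are the standard Sqoop 1 import options, where --num-mappers sets how many parallel map tasks the generated MapReduce job uses.

```python
import shlex
from typing import List, Optional


def sqoop_import_args(jdbc_url: str, table: str, target_dir: str,
                      num_mappers: int = 4,
                      check_column: Optional[str] = None,
                      last_value: Optional[int] = None) -> List[str]:
    """Assemble a `sqoop import` command line as an argument list."""
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(num_mappers),  # level of parallelism
    ]
    if check_column is not None:
        if last_value is None:
            raise ValueError("incremental imports need last_value")
        # Only rows whose check column exceeds last_value are imported.
        args += ["--incremental", "append",
                 "--check-column", check_column,
                 "--last-value", str(last_value)]
    return args


# Hypothetical connection details, for illustration only.
JDBC = "jdbc:sqlserver://dbhost:1433;databaseName=Sales"

full_load = sqoop_import_args(JDBC, "Orders", "/data/orders", num_mappers=8)
incremental = sqoop_import_args(JDBC, "Orders", "/data/orders",
                                check_column="OrderID", last_value=1000)

print(shlex.join(full_load))
print(shlex.join(incremental))
```

Building the command as a list keeps each value a single argument, which matters for JDBC URLs that contain characters such as semicolons.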

Sqoop Demo

HCatalog/Hive/Pig
HCatalog – metadata & table management
Users interact with a set of defined tables
Abstracts away the where/how of data storage
Allows for consistent access
Pig – ETL/data transformation
Scripting: Pig Latin
Java user-defined functions (Piggybank/DataFu)
Hive – SQL-like interface
Allows ad hoc queries for data summarization and analysis
ODBC connector

Demo: Pig & Hive

Oozie
Scalable, reliable, extensible workflow management system/job scheduler
Triggered by time or data availability
Can run and orchestrate multiple jobs: MapReduce and Streaming MapReduce, Hive, Pig
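The idea of orchestrating several jobs can be illustrated with a toy model: each action names the node to go to on success and the node to go to on failure, and the runner follows those transitions until it reaches a terminal node. This is only a sketch of the control flow, not Oozie's API or workflow format. `Action`, `run_workflow`, and the job names are hypothetical, and the lambdas stand in for real Pig and Hive jobs.

```python
from typing import Callable, Dict, List, NamedTuple


class Action(NamedTuple):
    run: Callable[[], bool]  # returns True when the job succeeds
    ok: str                  # next node on success
    error: str               # next node on failure


def run_workflow(actions: Dict[str, Action], start: str) -> List[str]:
    """Follow ok/error transitions from `start` until 'end' or 'fail'."""
    visited: List[str] = []
    node = start
    while node not in ("end", "fail"):
        if node in visited:
            raise ValueError(f"cycle detected at {node!r}")
        visited.append(node)
        action = actions[node]
        node = action.ok if action.run() else action.error
    visited.append(node)  # record the terminal node
    return visited


# Stand-in jobs: a Pig transformation followed by a Hive summary.
happy_path = run_workflow({
    "pig-etl": Action(lambda: True, "hive-summary", "fail"),
    "hive-summary": Action(lambda: True, "end", "fail"),
}, start="pig-etl")

failed_path = run_workflow({
    "pig-etl": Action(lambda: False, "hive-summary", "fail"),
    "hive-summary": Action(lambda: True, "end", "fail"),
}, start="pig-etl")

print(happy_path)
print(failed_path)
```

In this model a failed Pig step routes straight to the failure node, so the Hive step never runs.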

Windows Azure Blob Storage
Also called Azure Storage Vault (ASV)
Scalable, persistent storage with built-in geo-replication
Azure HDInsight clusters are wired for ASV; on-premise HDP uses HDFS
Separates data from compute nodes
Clusters can be created and dropped, minimizing costs
Multiple clusters can share data
The Azure Flat (Quantum 10) mesh grid network is the key
Violates the principle of data locality but outperforms HDFS and Azure competitors

Windows Azure Blob Storage
Source: http://dennyglee.com/2013/03/18/why-use-blob-storage-with-hdinsight-on-azure

AzCopy
Copies files to and from Windows Azure Blob Storage
Similar to Robocopy
Command line:
AzCopy C:\Beer https://stg.blob.core.windows.net/data/Beer /destKey:<MyKey> /S /V
Recursively (/S) copies all files in the Beer directory with verbose (/V) logging
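When the copy is scripted, the same command can be assembled as an argument list. The sketch below assumes a hypothetical helper `azcopy_args` and mirrors only the options shown on the slide (/destKey, /S, /V); other AzCopy versions may use a different syntax, so treat the flags as specific to this example. It builds the arguments but does not run AzCopy.

```python
from typing import List


def azcopy_args(source: str, dest: str, dest_key: str,
                recursive: bool = True, verbose: bool = True) -> List[str]:
    """Assemble the slide's AzCopy command as an argument list."""
    args = ["AzCopy", source, dest, f"/destKey:{dest_key}"]
    if recursive:
        args.append("/S")  # copy the directory tree recursively
    if verbose:
        args.append("/V")  # verbose logging
    return args


# "<MyKey>" is the slide's placeholder for the storage account key.
upload = azcopy_args(r"C:\Beer",
                     "https://stg.blob.core.windows.net/data/Beer",
                     "<MyKey>")
print(" ".join(upload))
```

Passing the list to a process launcher rather than a shell string avoids quoting problems with the backslash in the Windows path.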

AzCopy Demo

PolyBase
Part of Parallel Data Warehouse; allows integration of relational and non-relational data
Creates external tables via an HDFS bridge
Allows on-the-fly joins within SQL Server
Supports parallel imports from HDFS and exports to HDFS

Resources
Bloggers
Denny Lee: http://dennyglee.com
Carl Nolan: http://blogs.msdn.com/b/carlnol/archive/tags/hadoop+streaming
Cindy Gross: http://blogs.msdn.com/b/cindygross/archive/tags/big+data
Books
Hadoop: The Definitive Guide - Tom White
Programming Pig - Alan Gates
Programming Hive - Edward Capriolo
Hadoop MapReduce Cookbook - Srinath Perera
Links to this Presentation
http://bluewatersql.wordpress.com/resources
http://www.slideshare.net/bluewatersql/big-dataguide

Thank you
BluewaterSQL http://bluewatersql.wordpress.com cprice@pragmaticworks.com
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Introductionshellip
Chris Price Senior BI Consultant with Pragmatic Works
AuthorRegular SpeakerData Geek amp Super Dad
BluewaterSQL httpbluewatersqlwordpresscom cpricepragmaticworkscom
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Big Data Data Explosion
As recently as 2000 only frac14 of data was digital Paper film or other analog media
According to IBM 90 of data created in last 2 years Data volume now growing 10 every 5 years Approximately 85 from new sources
Consumerization 43 connected devices per adult 27 use social media input
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Big Data
Data Complexity Variety and Velocity
Terabytes
Gigabytes
Megabytes
Petabytes
Big Data
Service Logs
Spatial amp GPS coordinates
Data market feeds
eGov feeds
Weather
Textimage
Click stream
Wikisblogs
Sensors
RFIDDevices
SMS
HD Audiovideo
Web
Web Logs
Search Marketing
Recommendations
Affiliates
Advertising
Mobile
Collaboration
eCommerceTraditionalPayables
Payroll
Inventory
Contacts
Orders
Campaigns
Source Brian Mitchel TechEd 2013
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Big Data is wellhellipBig Drove $28b in IT investment in 2012
Expected to grow to $34b in 2014 Challenges
Data Volumes (HardwareStorage Economics) Data Diversity (Multiple Types amp Sources) Data Velocity (Real-Time) User-Expectations
How do we planintegratehelliphellip
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Agenda Hadoop Landscape Current BIDW Landscape BIDW amp Hadoop Intersection
ToolsTechniquesStrategies
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Hadoop Ecosystem
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Hadoop on Windows HDInsight on Windows Azure
Seamlessly scale in the cloud Backed by Azure Storage Vault (ASV)
Hortonworks Data Platform (HDP) On-Premise Based on HDFS
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Current Landscape
Clie
nt T
ools
Reporting Services SharePoint Microsoft Applications
DATA
SO
URC
ES
Traditional Sources (CRMERPLOBWeb)
BID
W S
yste
m
DW Cubes
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Clie
nt T
ools
BID
W S
yste
m
DW
Reporting Services SharePoint Microsoft Applications
DATA
SO
URC
ES
Traditional Sources (CRMERPLOBWeb)
Cubes
Future Landscape
Hadoop
New Sources (Email Logs Social Media Sensor)
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Business Scenario
DW Cube
HadoopHDFS
ODBCODBC
Sqoo
p
OD
BC
Reporting Tools
Flume
Sensor DataWebHDFS
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
What about Azure
DW Cube
Hadoop
ODBCODBC
Sqoo
p
OD
BC
Reporting Tools
AzCopy
Azure Blob Storage
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Tool Techniques amp Strategies
Enterprise Data Services WebHDFS Sqoop Hcatalog PigHive
Enterprise Operational Services Oozie
Other Windows Azure Blob Storage amp AzCopy Hive ODBC Polybase
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
WebHDFS
Born from HFTP intended as a replacement Widely used by Yahoo
High performance first class native protocol using industry standard RESTful mechanism
Complete interface for reading writing amp managing files
Supports secure authentication Data Locality ndash requests sent to data nodes
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
WebHDFS ndash Get Example
Requestcurl -i -L httphostportwebhdfsv1foobarop=OPEN
ResponseHTTP11 307 TEMPORARY_REDIRECT Content-Type applicationoctet-stream Location httpdatanode50075webhdfsv1foobarop=OPENampampoffset=0 Content-Length 0
HTTP11 200 OK Content-Type applicationoctet-streamContent-Length 22
Hello webhdfs user
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
WebHDFS ndash More Examples
Rename Requestcurl -i -X PUT httphostportwebhdfsv1foobarop=RENAMEampampdestination=foobar2
Create Directory Requestcurl -i -X PUT httphostportwebhdfsv1foo2op=MKDIRS
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Sqoop
Tool designed to efficiently move data between Hadoop (Hive amp Hbase) and RDBMS Importing (single and all tables) Exporting Eval (Query Execution) Merge (Multiple HDFS datasets) Incremental Imports
Generates MapReduce jobs Can control the level of parallelism
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Sqoop Demo
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
HCatalogHivePig
Hcatalog ndash Metadata amp table management Users interact with a set of defined tables Abstracts away the wherehow of data storage Allows for consistent access
Pig ndash ETLData Transformation Scripting Pig Latin Java User-Defined Functions (PiggybankDataFu)
Hive ndash SQL-like interface Allows ad-hoc queries for data summarizations
and analysis ODBC Connector
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Demo Pig amp Hive
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Oozie
Scalable Reliable Extensible Workflow Management SystemJob Scheduler
Triggered by Time Data Availability
Can run and orchestrate multiple jobs MapReduce and Streaming MapReduce Hive Pig
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Windows Azure Blob Storage
Also called Azure Storage Vault (ASV) Scalable persistent highly-scalable storage with
built-in geo-replication Azure HDInsight clusters are wired for ASV
On-Premise HDP uses HDFS Separates data from compute nodes
Clusters can be created and dropped minimizing costs Multiple clusters can share data
The Azure Flat (Quantum 10) mesh grid network is the key Violates the principal of data locality but out-performs
HDFS and Azure competitors
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Windows Azure Blob Storage
Source httpdennygleecom20130318why-use-blob-storage-with-hdinsight-on-azure
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
AzCopy
Windows Azure Blob Storage Copies files to and from
Similar to Robocopy Command-lineAzCopy CBeer httpsstgblobcorewindowsnetdataBeer destKeyltMyKeygt S V
Recursively (S) copies all files in the Beer directory with Verbose (V) logging
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
AzCopy Demo
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
PolyBase
Part of Parallel Data Warehouse allows integration of relational and non-relational data
Creates external tables via a HDFS bridge Allows on-the-fly joins within SQL Server
Supports parallel Imports from HDFS Exports to HDFS
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Resources Bloggers
Denny Lee httpdennygleecom
Carl Nolan httpblogsmsdncombcarlnolarchivetagshadoop+streaming
Cindy Gross httpblogsmsdncombcindygrossarchivetagsbig+data
Books Hadoop the Definite Guide - Tom White Programming Pig - Alan Gates Programming Hive - Edward Capriolo Hadoop MapReduce Cookbook - Srinath Perera
Links to this Presentation httpbluewatersqlwordpresscomresources httpwwwslidesharenetbluewatersqlbig-dataguide
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscomMAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Thank you
BluewaterSQL httpbluewatersqlwordpresscom cpricepragmaticworkscom
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Big Data Data Explosion
As recently as 2000 only frac14 of data was digital Paper film or other analog media
According to IBM 90 of data created in last 2 years Data volume now growing 10 every 5 years Approximately 85 from new sources
Consumerization 43 connected devices per adult 27 use social media input
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Big Data
Data Complexity Variety and Velocity
Terabytes
Gigabytes
Megabytes
Petabytes
Big Data
Service Logs
Spatial amp GPS coordinates
Data market feeds
eGov feeds
Weather
Textimage
Click stream
Wikisblogs
Sensors
RFIDDevices
SMS
HD Audiovideo
Web
Web Logs
Search Marketing
Recommendations
Affiliates
Advertising
Mobile
Collaboration
eCommerceTraditionalPayables
Payroll
Inventory
Contacts
Orders
Campaigns
Source Brian Mitchel TechEd 2013
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Big Data is wellhellipBig Drove $28b in IT investment in 2012
Expected to grow to $34b in 2014 Challenges
Data Volumes (HardwareStorage Economics) Data Diversity (Multiple Types amp Sources) Data Velocity (Real-Time) User-Expectations
How do we planintegratehelliphellip
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Agenda Hadoop Landscape Current BIDW Landscape BIDW amp Hadoop Intersection
ToolsTechniquesStrategies
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Hadoop Ecosystem
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Hadoop on Windows HDInsight on Windows Azure
Seamlessly scale in the cloud Backed by Azure Storage Vault (ASV)
Hortonworks Data Platform (HDP) On-Premise Based on HDFS
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Current Landscape
Clie
nt T
ools
Reporting Services SharePoint Microsoft Applications
DATA
SO
URC
ES
Traditional Sources (CRMERPLOBWeb)
BID
W S
yste
m
DW Cubes
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Clie
nt T
ools
BID
W S
yste
m
DW
Reporting Services SharePoint Microsoft Applications
DATA
SO
URC
ES
Traditional Sources (CRMERPLOBWeb)
Cubes
Future Landscape
Hadoop
New Sources (Email Logs Social Media Sensor)
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Business Scenario
DW Cube
HadoopHDFS
ODBCODBC
Sqoo
p
OD
BC
Reporting Tools
Flume
Sensor DataWebHDFS
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
What about Azure
DW Cube
Hadoop
ODBCODBC
Sqoo
p
OD
BC
Reporting Tools
AzCopy
Azure Blob Storage
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Tool Techniques amp Strategies
Enterprise Data Services WebHDFS Sqoop Hcatalog PigHive
Enterprise Operational Services Oozie
Other Windows Azure Blob Storage amp AzCopy Hive ODBC Polybase
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
WebHDFS
Born from HFTP intended as a replacement Widely used by Yahoo
High performance first class native protocol using industry standard RESTful mechanism
Complete interface for reading writing amp managing files
Supports secure authentication Data Locality ndash requests sent to data nodes
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
WebHDFS ndash Get Example
Requestcurl -i -L httphostportwebhdfsv1foobarop=OPEN
ResponseHTTP11 307 TEMPORARY_REDIRECT Content-Type applicationoctet-stream Location httpdatanode50075webhdfsv1foobarop=OPENampampoffset=0 Content-Length 0
HTTP11 200 OK Content-Type applicationoctet-streamContent-Length 22
Hello webhdfs user
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
WebHDFS ndash More Examples
Rename Requestcurl -i -X PUT httphostportwebhdfsv1foobarop=RENAMEampampdestination=foobar2
Create Directory Requestcurl -i -X PUT httphostportwebhdfsv1foo2op=MKDIRS
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Sqoop
Tool designed to efficiently move data between Hadoop (Hive amp Hbase) and RDBMS Importing (single and all tables) Exporting Eval (Query Execution) Merge (Multiple HDFS datasets) Incremental Imports
Generates MapReduce jobs Can control the level of parallelism
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Sqoop Demo
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
HCatalogHivePig
Hcatalog ndash Metadata amp table management Users interact with a set of defined tables Abstracts away the wherehow of data storage Allows for consistent access
Pig ndash ETLData Transformation Scripting Pig Latin Java User-Defined Functions (PiggybankDataFu)
Hive ndash SQL-like interface Allows ad-hoc queries for data summarizations
and analysis ODBC Connector
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Demo Pig amp Hive
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Oozie
Scalable Reliable Extensible Workflow Management SystemJob Scheduler
Triggered by Time Data Availability
Can run and orchestrate multiple jobs MapReduce and Streaming MapReduce Hive Pig
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Windows Azure Blob Storage
Also called Azure Storage Vault (ASV) Scalable persistent highly-scalable storage with
built-in geo-replication Azure HDInsight clusters are wired for ASV
On-Premise HDP uses HDFS Separates data from compute nodes
Clusters can be created and dropped minimizing costs Multiple clusters can share data
The Azure Flat (Quantum 10) mesh grid network is the key Violates the principal of data locality but out-performs
HDFS and Azure competitors
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Windows Azure Blob Storage
Source httpdennygleecom20130318why-use-blob-storage-with-hdinsight-on-azure
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
AzCopy
Windows Azure Blob Storage Copies files to and from
Similar to Robocopy Command-lineAzCopy CBeer httpsstgblobcorewindowsnetdataBeer destKeyltMyKeygt S V
Recursively (S) copies all files in the Beer directory with Verbose (V) logging
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
AzCopy Demo
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
PolyBase
Part of Parallel Data Warehouse allows integration of relational and non-relational data
Creates external tables via a HDFS bridge Allows on-the-fly joins within SQL Server
Supports parallel Imports from HDFS Exports to HDFS
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Resources Bloggers
Denny Lee httpdennygleecom
Carl Nolan httpblogsmsdncombcarlnolarchivetagshadoop+streaming
Cindy Gross httpblogsmsdncombcindygrossarchivetagsbig+data
Books Hadoop the Definite Guide - Tom White Programming Pig - Alan Gates Programming Hive - Edward Capriolo Hadoop MapReduce Cookbook - Srinath Perera
Links to this Presentation httpbluewatersqlwordpresscomresources httpwwwslidesharenetbluewatersqlbig-dataguide
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscomMAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Thank you
BluewaterSQL httpbluewatersqlwordpresscom cpricepragmaticworkscom
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Big Data
Data Complexity Variety and Velocity
Terabytes
Gigabytes
Megabytes
Petabytes
Big Data
Service Logs
Spatial amp GPS coordinates
Data market feeds
eGov feeds
Weather
Textimage
Click stream
Wikisblogs
Sensors
RFIDDevices
SMS
HD Audiovideo
Web
Web Logs
Search Marketing
Recommendations
Affiliates
Advertising
Mobile
Collaboration
eCommerceTraditionalPayables
Payroll
Inventory
Contacts
Orders
Campaigns
Source Brian Mitchel TechEd 2013
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Big Data is wellhellipBig Drove $28b in IT investment in 2012
Expected to grow to $34b in 2014 Challenges
Data Volumes (HardwareStorage Economics) Data Diversity (Multiple Types amp Sources) Data Velocity (Real-Time) User-Expectations
How do we planintegratehelliphellip
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Agenda Hadoop Landscape Current BIDW Landscape BIDW amp Hadoop Intersection
ToolsTechniquesStrategies
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Hadoop Ecosystem
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Hadoop on Windows HDInsight on Windows Azure
Seamlessly scale in the cloud Backed by Azure Storage Vault (ASV)
Hortonworks Data Platform (HDP) On-Premise Based on HDFS
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Current Landscape
Clie
nt T
ools
Reporting Services SharePoint Microsoft Applications
DATA
SO
URC
ES
Traditional Sources (CRMERPLOBWeb)
BID
W S
yste
m
DW Cubes
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Clie
nt T
ools
BID
W S
yste
m
DW
Reporting Services SharePoint Microsoft Applications
DATA
SO
URC
ES
Traditional Sources (CRMERPLOBWeb)
Cubes
Future Landscape
Hadoop
New Sources (Email Logs Social Media Sensor)
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Business Scenario
DW Cube
HadoopHDFS
ODBCODBC
Sqoo
p
OD
BC
Reporting Tools
Flume
Sensor DataWebHDFS
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
What about Azure
DW Cube
Hadoop
ODBCODBC
Sqoo
p
OD
BC
Reporting Tools
AzCopy
Azure Blob Storage
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Tool Techniques amp Strategies
Enterprise Data Services WebHDFS Sqoop Hcatalog PigHive
Enterprise Operational Services Oozie
Other Windows Azure Blob Storage amp AzCopy Hive ODBC Polybase
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
WebHDFS
Born from HFTP intended as a replacement Widely used by Yahoo
High performance first class native protocol using industry standard RESTful mechanism
Complete interface for reading writing amp managing files
Supports secure authentication Data Locality ndash requests sent to data nodes
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
WebHDFS ndash Get Example
Requestcurl -i -L httphostportwebhdfsv1foobarop=OPEN
ResponseHTTP11 307 TEMPORARY_REDIRECT Content-Type applicationoctet-stream Location httpdatanode50075webhdfsv1foobarop=OPENampampoffset=0 Content-Length 0
HTTP11 200 OK Content-Type applicationoctet-streamContent-Length 22
Hello webhdfs user
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
WebHDFS ndash More Examples
Rename Requestcurl -i -X PUT httphostportwebhdfsv1foobarop=RENAMEampampdestination=foobar2
Create Directory Requestcurl -i -X PUT httphostportwebhdfsv1foo2op=MKDIRS
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Sqoop
Tool designed to efficiently move data between Hadoop (Hive amp Hbase) and RDBMS Importing (single and all tables) Exporting Eval (Query Execution) Merge (Multiple HDFS datasets) Incremental Imports
Generates MapReduce jobs Can control the level of parallelism
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Sqoop Demo
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
HCatalogHivePig
Hcatalog ndash Metadata amp table management Users interact with a set of defined tables Abstracts away the wherehow of data storage Allows for consistent access
Pig ndash ETLData Transformation Scripting Pig Latin Java User-Defined Functions (PiggybankDataFu)
Hive ndash SQL-like interface Allows ad-hoc queries for data summarizations
and analysis ODBC Connector
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Demo Pig amp Hive
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Oozie
Scalable Reliable Extensible Workflow Management SystemJob Scheduler
Triggered by Time Data Availability
Can run and orchestrate multiple jobs MapReduce and Streaming MapReduce Hive Pig
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Windows Azure Blob Storage
Also called Azure Storage Vault (ASV) Scalable persistent highly-scalable storage with
built-in geo-replication Azure HDInsight clusters are wired for ASV
On-Premise HDP uses HDFS Separates data from compute nodes
Clusters can be created and dropped minimizing costs Multiple clusters can share data
The Azure Flat (Quantum 10) mesh grid network is the key Violates the principal of data locality but out-performs
HDFS and Azure competitors
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Windows Azure Blob Storage
Source httpdennygleecom20130318why-use-blob-storage-with-hdinsight-on-azure
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
AzCopy
Windows Azure Blob Storage Copies files to and from
Similar to Robocopy Command-lineAzCopy CBeer httpsstgblobcorewindowsnetdataBeer destKeyltMyKeygt S V
Recursively (S) copies all files in the Beer directory with Verbose (V) logging
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
AzCopy Demo
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
PolyBase
Part of Parallel Data Warehouse allows integration of relational and non-relational data
Creates external tables via a HDFS bridge Allows on-the-fly joins within SQL Server
Supports parallel Imports from HDFS Exports to HDFS
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Resources Bloggers
Denny Lee httpdennygleecom
Carl Nolan httpblogsmsdncombcarlnolarchivetagshadoop+streaming
Cindy Gross httpblogsmsdncombcindygrossarchivetagsbig+data
Books Hadoop the Definite Guide - Tom White Programming Pig - Alan Gates Programming Hive - Edward Capriolo Hadoop MapReduce Cookbook - Srinath Perera
Links to this Presentation httpbluewatersqlwordpresscomresources httpwwwslidesharenetbluewatersqlbig-dataguide
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscomMAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Thank you
BluewaterSQL httpbluewatersqlwordpresscom cpricepragmaticworkscom
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Big Data is wellhellipBig Drove $28b in IT investment in 2012
Expected to grow to $34b in 2014 Challenges
Data Volumes (HardwareStorage Economics) Data Diversity (Multiple Types amp Sources) Data Velocity (Real-Time) User-Expectations
How do we planintegratehelliphellip
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Agenda Hadoop Landscape Current BIDW Landscape BIDW amp Hadoop Intersection
ToolsTechniquesStrategies
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Hadoop Ecosystem
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Hadoop on Windows HDInsight on Windows Azure
Seamlessly scale in the cloud Backed by Azure Storage Vault (ASV)
Hortonworks Data Platform (HDP) On-Premise Based on HDFS
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Current Landscape
Clie
nt T
ools
Reporting Services SharePoint Microsoft Applications
DATA
SO
URC
ES
Traditional Sources (CRMERPLOBWeb)
BID
W S
yste
m
DW Cubes
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Clie
nt T
ools
BID
W S
yste
m
DW
Reporting Services SharePoint Microsoft Applications
DATA
SO
URC
ES
Traditional Sources (CRMERPLOBWeb)
Cubes
Future Landscape
Hadoop
New Sources (Email Logs Social Media Sensor)
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Business Scenario
DW Cube
HadoopHDFS
ODBCODBC
Sqoo
p
OD
BC
Reporting Tools
Flume
Sensor DataWebHDFS
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
What about Azure
DW Cube
Hadoop
ODBCODBC
Sqoo
p
OD
BC
Reporting Tools
AzCopy
Azure Blob Storage
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Tool Techniques amp Strategies
Enterprise Data Services WebHDFS Sqoop Hcatalog PigHive
Enterprise Operational Services Oozie
Other Windows Azure Blob Storage amp AzCopy Hive ODBC Polybase
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
WebHDFS
Born from HFTP intended as a replacement Widely used by Yahoo
High performance first class native protocol using industry standard RESTful mechanism
Complete interface for reading writing amp managing files
Supports secure authentication Data Locality ndash requests sent to data nodes
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
WebHDFS ndash Get Example
Requestcurl -i -L httphostportwebhdfsv1foobarop=OPEN
ResponseHTTP11 307 TEMPORARY_REDIRECT Content-Type applicationoctet-stream Location httpdatanode50075webhdfsv1foobarop=OPENampampoffset=0 Content-Length 0
HTTP11 200 OK Content-Type applicationoctet-streamContent-Length 22
Hello webhdfs user
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
WebHDFS ndash More Examples
Rename Requestcurl -i -X PUT httphostportwebhdfsv1foobarop=RENAMEampampdestination=foobar2
Create Directory Requestcurl -i -X PUT httphostportwebhdfsv1foo2op=MKDIRS
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Sqoop
Tool designed to efficiently move data between Hadoop (Hive amp Hbase) and RDBMS Importing (single and all tables) Exporting Eval (Query Execution) Merge (Multiple HDFS datasets) Incremental Imports
Generates MapReduce jobs Can control the level of parallelism
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Sqoop Demo
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
HCatalogHivePig
Hcatalog ndash Metadata amp table management Users interact with a set of defined tables Abstracts away the wherehow of data storage Allows for consistent access
Pig ndash ETLData Transformation Scripting Pig Latin Java User-Defined Functions (PiggybankDataFu)
Hive ndash SQL-like interface Allows ad-hoc queries for data summarizations
and analysis ODBC Connector
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Demo Pig amp Hive
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Oozie
Scalable Reliable Extensible Workflow Management SystemJob Scheduler
Triggered by Time Data Availability
Can run and orchestrate multiple jobs MapReduce and Streaming MapReduce Hive Pig
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Windows Azure Blob Storage
Also called Azure Storage Vault (ASV) Scalable persistent highly-scalable storage with
built-in geo-replication Azure HDInsight clusters are wired for ASV
On-Premise HDP uses HDFS Separates data from compute nodes
Clusters can be created and dropped minimizing costs Multiple clusters can share data
The Azure Flat (Quantum 10) mesh grid network is the key Violates the principal of data locality but out-performs
HDFS and Azure competitors
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Windows Azure Blob Storage
Source httpdennygleecom20130318why-use-blob-storage-with-hdinsight-on-azure
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom

AzCopy
Copies files to and from Windows Azure Blob Storage
Similar to Robocopy
Command-line:
  AzCopy C:\Beer https://stg.blob.core.windows.net/data/Beer /destKey:<MyKey> /S /V
Recursively (/S) copies all files in the Beer directory with Verbose (/V) logging

AzCopy Demo

PolyBase
Part of Parallel Data Warehouse; allows integration of relational and non-relational data
  Creates external tables via an HDFS bridge
  Allows on-the-fly joins within SQL Server
Supports parallel
  Imports from HDFS
  Exports to HDFS

Resources
Bloggers
  Denny Lee: http://dennyglee.com
  Carl Nolan: http://blogs.msdn.com/b/carlnol/archive/tags/hadoop+streaming
  Cindy Gross: http://blogs.msdn.com/b/cindygross/archive/tags/big+data
Books
  Hadoop: The Definitive Guide - Tom White
  Programming Pig - Alan Gates
  Programming Hive - Edward Capriolo
  Hadoop MapReduce Cookbook - Srinath Perera
Links to this Presentation
  http://bluewatersql.wordpress.com/resources
  http://www.slideshare.net/bluewatersql/big-dataguide

Thank you
BluewaterSQL  http://bluewatersql.wordpress.com  cprice@pragmaticworks.com

Agenda
  Hadoop Landscape
  Current BI/DW Landscape
  BI/DW & Hadoop Intersection
  Tools/Techniques/Strategies

Hadoop Ecosystem

Hadoop on Windows
HDInsight on Windows Azure
  Seamlessly scale in the cloud
  Backed by Azure Storage Vault (ASV)
Hortonworks Data Platform (HDP) On-Premise
  Based on HDFS

Current Landscape (diagram)
  Data Sources: Traditional Sources (CRM/ERP/LOB/Web)
  BI/DW System: DW, Cubes
  Client Tools: Reporting Services, SharePoint, Microsoft Applications

Future Landscape (diagram)
  Data Sources: Traditional Sources (CRM/ERP/LOB/Web), New Sources (Email, Logs, Social Media, Sensor)
  BI/DW System: DW, Cubes
  Hadoop
  Client Tools: Reporting Services, SharePoint, Microsoft Applications

Business Scenario (diagram)
  Components: Sensor Data, Hadoop/HDFS, DW, Cube, Reporting Tools
  Connectors: Flume, WebHDFS, Sqoop, ODBC

What about Azure? (diagram)
  Components: Azure Blob Storage, Hadoop, DW, Cube, Reporting Tools
  Connectors: AzCopy, Sqoop, ODBC

Tools, Techniques & Strategies
Enterprise Data Services
  WebHDFS
  Sqoop
  HCatalog
  Pig/Hive
Enterprise Operational Services
  Oozie
Other
  Windows Azure Blob Storage & AzCopy
  Hive ODBC
  PolyBase

WebHDFS
Born from HFTP; intended as its replacement
  Widely used by Yahoo
High-performance, first-class native protocol using an industry-standard RESTful mechanism
  Complete interface for reading, writing & managing files
Supports secure authentication
Data Locality - requests sent to data nodes

WebHDFS - Get Example
Request:
  curl -i -L "http://host:port/webhdfs/v1/foobar?op=OPEN"
Response:
  HTTP/1.1 307 TEMPORARY_REDIRECT
  Content-Type: application/octet-stream
  Location: http://datanode:50075/webhdfs/v1/foobar?op=OPEN&offset=0
  Content-Length: 0

  HTTP/1.1 200 OK
  Content-Type: application/octet-stream
  Content-Length: 22

  Hello, webhdfs user!

WebHDFS - More Examples
Rename Request:
  curl -i -X PUT "http://host:port/webhdfs/v1/foobar?op=RENAME&destination=/foobar2"
Create Directory Request:
  curl -i -X PUT "http://host:port/webhdfs/v1/foo2?op=MKDIRS"
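
The curl requests above all follow one URL pattern: the `/webhdfs/v1` prefix, the HDFS path, and an `op` query parameter plus any operation-specific parameters. A small Python helper can sketch that pattern. The host name `namenode` and port `50070` in the usage lines are assumptions for illustration; use whatever your NameNode actually exposes.

```python
from urllib.parse import quote, urlencode


def webhdfs_url(host, port, path, op, **params):
    """Build a WebHDFS REST URL for the given HDFS path and operation."""
    # Keep "/" readable in parameter values such as destination=/foobar2.
    query = urlencode({"op": op, **params}, safe="/")
    return f"http://{host}:{port}/webhdfs/v1{quote(path)}?{query}"


if __name__ == "__main__":
    # Hypothetical NameNode address, for illustration only.
    print(webhdfs_url("namenode", 50070, "/foobar", "OPEN"))
    print(webhdfs_url("namenode", 50070, "/foobar", "RENAME",
                      destination="/foobar2"))
    print(webhdfs_url("namenode", 50070, "/foo2", "MKDIRS"))
```

The helper only builds URLs. OPEN is issued as a GET, while RENAME and MKDIRS are issued as PUT requests, as in the curl examples.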
What about Azure
DW Cube
Hadoop
ODBCODBC
Sqoo
p
OD
BC
Reporting Tools
AzCopy
Azure Blob Storage
MAKING BUSINESS INTELLIGENT wwwpragmaticworkscom
Tool Techniques amp Strategies
Enterprise Data Services WebHDFS Sqoop Hcatalog PigHive
Enterprise Operational Services Oozie
Other Windows Azure Blob Storage amp AzCopy Hive ODBC Polybase

WebHDFS
Born from HFTP; intended as a replacement
Widely used by Yahoo
High-performance, first-class native protocol using an industry-standard RESTful mechanism
Complete interface for reading, writing & managing files
Supports secure authentication
Data locality - requests sent to data nodes

WebHDFS - Get Example
Request:
curl -i -L "http://host:port/webhdfs/v1/foo/bar?op=OPEN"
Response:
HTTP/1.1 307 TEMPORARY_REDIRECT
Content-Type: application/octet-stream
Location: http://datanode:50075/webhdfs/v1/foo/bar?op=OPEN&offset=0
Content-Length: 0

HTTP/1.1 200 OK
Content-Type: application/octet-stream
Content-Length: 22

Hello, webhdfs user!
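
The two responses in the Get example reflect a two-step read: the first host answers with a 307 redirect, and the data node named in the Location header then returns the file content. A minimal Python sketch of what that Location header carries; the helper name `redirect_target` is hypothetical, and nothing here contacts a cluster.

```python
from urllib.parse import parse_qs, urlsplit

PREFIX = "/webhdfs/v1"  # path prefix used by the URLs in the example


def redirect_target(location):
    """Split a WebHDFS redirect URL into data node host, port, HDFS path, and parameters."""
    parts = urlsplit(location)
    if not parts.path.startswith(PREFIX):
        raise ValueError("not a WebHDFS v1 URL")
    hdfs_path = parts.path[len(PREFIX):]
    # parse_qs returns a list per key; keep the first value of each.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return parts.hostname, parts.port, hdfs_path, params


# Location header taken from the 307 response in the Get example.
host, port, path, params = redirect_target(
    "http://datanode:50075/webhdfs/v1/foo/bar?op=OPEN&offset=0"
)
print(host, port, path, params)
```

The `-L` flag in the curl request is what makes the client follow this redirect automatically.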

WebHDFS - More Examples
Rename request:
curl -i -X PUT "http://host:port/webhdfs/v1/foo/bar?op=RENAME&destination=/foo/bar2"
Create directory request:
curl -i -X PUT "http://host:port/webhdfs/v1/foo2?op=MKDIRS"
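
The same rename and create-directory calls can be prepared from a script. The sketch below only builds the PUT requests with the Python standard library and does not send them; the address `namenode:50070` and the helper `put_request` are assumptions for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://namenode:50070/webhdfs/v1"  # hypothetical host and port


def put_request(path, op, **params):
    """Prepare (but do not send) a WebHDFS PUT request such as RENAME or MKDIRS."""
    # safe="/" keeps slashes in values like destination=/foo/bar2 readable.
    query = urlencode({"op": op, **params}, safe="/")
    return Request(f"{BASE}{path}?{query}", method="PUT")


rename = put_request("/foo/bar", "RENAME", destination="/foo/bar2")
mkdirs = put_request("/foo2", "MKDIRS")

for req in (rename, mkdirs):
    print(req.get_method(), req.full_url)
```

Passing either request to `urllib.request.urlopen` would submit it to a reachable cluster.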

Sqoop
Tool designed to efficiently move data between Hadoop (Hive & HBase) and an RDBMS
  Importing (single and all tables)
  Exporting
  Eval (query execution)
  Merge (multiple HDFS datasets)
  Incremental imports
Generates MapReduce jobs
Can control the level of parallelism
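
To make the parallelism and incremental-import points concrete, here is a rough Python sketch that assembles a `sqoop import` command line. The JDBC URL, table, column, and paths are hypothetical; the script prints the command rather than running it, so no Sqoop installation is assumed.

```python
import shlex


def sqoop_import_args(jdbc_url, table, target_dir, mappers=4,
                      check_column=None, last_value=None):
    """Return the argument list for a `sqoop import` of a single table."""
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
        "--num-mappers", str(mappers),  # number of parallel map tasks
    ]
    if check_column is not None:
        # Incremental append: import only rows newer than the last run.
        args += ["--incremental", "append", "--check-column", check_column]
        if last_value is not None:
            args += ["--last-value", str(last_value)]
    return args


# Hypothetical SQL Server source and HDFS target directory.
args = sqoop_import_args(
    "jdbc:sqlserver://dbhost:1433;databaseName=Sales",
    "Orders", "/data/orders", mappers=8,
    check_column="OrderID", last_value=1000,
)
print(shlex.join(args))
```

Raising `--num-mappers` increases the number of parallel map tasks, and with it the number of concurrent connections to the source database.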

Sqoop Demo

HCatalog/Hive/Pig
HCatalog - Metadata & table management
  Users interact with a set of defined tables
  Abstracts away the where/how of data storage
  Allows for consistent access
Pig - ETL/data transformation
  Scripting: Pig Latin
  Java user-defined functions (Piggybank/DataFu)
Hive - SQL-like interface
  Allows ad-hoc queries for data summarization and analysis
  ODBC connector

Demo: Pig & Hive

Oozie
Scalable, reliable, extensible workflow management system/job scheduler
Triggered by: time, data availability
Can run and orchestrate multiple jobs: MapReduce and streaming MapReduce, Hive, Pig
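
The two trigger types can be pictured with a toy model. This is not Oozie's API (Oozie jobs are declared in XML); it is a plain-Python sketch of the idea that a run starts only once its scheduled time has passed and its input data is complete. The `_SUCCESS` marker file and all names below are assumptions for illustration.

```python
from datetime import datetime
from pathlib import Path
import tempfile


def is_ready(nominal_time, now, input_dirs, done_flag="_SUCCESS"):
    """True once the scheduled time has passed and every input directory holds its done flag."""
    time_ok = now >= nominal_time
    data_ok = all((Path(d) / done_flag).exists() for d in input_dirs)
    return time_ok and data_ok


with tempfile.TemporaryDirectory() as tmp:
    weblogs = Path(tmp) / "weblogs"  # stand-in for an input dataset directory
    weblogs.mkdir()
    due = datetime(2013, 6, 1, 2, 0)
    now = datetime(2013, 6, 1, 2, 5)

    print(is_ready(due, now, [weblogs]))  # time has passed, data not yet complete
    (weblogs / "_SUCCESS").touch()        # upstream job signals completion
    print(is_ready(due, now, [weblogs]))  # both conditions satisfied
```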

Windows Azure Blob Storage
Also called Azure Storage Vault (ASV)
Persistent, highly scalable storage with built-in geo-replication
Azure HDInsight clusters are wired for ASV
  On-premise HDP uses HDFS
Separates data from compute nodes
  Clusters can be created and dropped, minimizing costs
  Multiple clusters can share data
The Azure Flat (Quantum 10) mesh grid network is the key
  Violates the principle of data locality but out-performs HDFS and Azure competitors
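
Because the data lives in a storage account rather than on the cluster, any cluster that can reach the account can address the same files by URI. A small sketch of the ASV-style address, assuming the `asv://container@account.blob.core.windows.net/path` form; the account `stg` and container `data` are borrowed from the AzCopy example, and the helper name is hypothetical.

```python
def asv_uri(container, account, path):
    """Build an ASV-style URI for a blob path in a storage account."""
    return f"asv://{container}@{account}.blob.core.windows.net/{path.lstrip('/')}"


# The Beer folder from the AzCopy example, as a cluster would address it.
print(asv_uri("data", "stg", "/Beer"))
```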