34
This Conference brought to you by www.ttcus.com Linkedin/Group: Technology Training Corporation Technology Training Corporation @Techtrain Corporation www.ttcus.com

This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Embed Size (px)

Citation preview

Page 1: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

This Conference brought to you bywww.ttcus.com

Linkedin/Group:Technology Training Corporation

Technology Training Corporation

@Techtrain

Corporationwww.ttcus.com

Page 2: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Optimizing Data StorageOptimizing Data Storage and Management for Bi D t i th F d lBig Data in the Federal GovernmentGreg Gardner, PhD, CISSPColonel, Infantry, U.S. Army (Retired)Chief Architect Defense and IntelChief Architect, Defense and Intel SolutionsNetApp

2

Page 3: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Conference Agendas…and Baseball Lineupsand Baseball Lineups

No. 1, or lead off, hitter: Traditionally, lead off hitters were fast and reached base often. SES

No. 2 hitter: Managers often used a hitter with exceptional bat control in the number 2 spot. RADM

No. 3 hitter: This spot usually is reserved for the team's best hitter, who often hits for a high

d till h IDAaverage and still has power. No. 4, or cleanup, hitter: This batter is usually the

t ' bi t h th t

IDA

team's biggest home-run threat.

3NetApp Confidential - Internal Use Only

NIST

Page 4: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Conference Agendas…and Baseball Lineupsand Baseball Lineups

No. 7 and 8 hitters: These players often are weak hitters who are more important to the team forhitters who are more important to the team for their defensive capabilities. The No. 7 hitter might have a little more pop in his bat that can drive in p pthe players hitting in front of him.

The No. 8 batter is often light-hitting, but ideally AARON

g g ycan reach base on walks.

This allows the manager to use the No. 9 hitter to GREG

sacrifice, making his otherwise sure out into one that at least can advance the baserunner.

/ G

4NetApp Confidential - Internal Use Only

ALAN / GERARD

Page 5: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

All This Data and Not Even to the “Peak”

Peak of Inflated ExpectationsVISIBILITY

Slope of Enlightenment

Plateau of Productivity

Technology TriggerTrough of Disillusionment

Slope of Enlightenment

Estimated size of the35 Zettabytes 5 Billion

Smart phones

TIME

Estimated size of the digital universe in 2020

Smart phones

30 BillionPieces of new content to

80%UnstructuredPieces of new content to

Facebook per month

5

Unstructureddata

Page 6: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Data Analytics PlatformPlatform

Intelligent Search Real Time Search Index

UserRules Engine

Man

agem

ent

Big Data Analytics Scale Out Platform Fault Tolerant

N1 N2 N3

Parallel Analytics Platform

Visualization

M Data ArchivalN4 NN…

Nodes

Data CollectorsData

C Data

Hub Location

CollectorsData CollectorsData

CollectorsData CollectorsAgent

Adapters

Agent

Adapters

©2012 Zaloni, Inc. All Rights Reserved.

Edge Location Edge Location

Page 7: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Coke’s Smart Dispenser Monitoring and Analytics

Micro-dispensing Technology to serve 100+ drinks

Cartridges store concentrated ingredients in dispenser and are RFID enabled.enabled.

RFID chips detect supplies and call-home

Other call home data: Daily debug, Diagnostics, Product usage

100,000 Freestyles in 2012

©2012 Zaloni, Inc. All Rights Reserved.

Page 8: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Coke Freestyle Hadoop Analytical Architecture

Collector

Scale out Real time

Data collection

.

.Dispenser

Collector..

Collector

.Dispenser

Agent Collector

.

Analytics

Local MaintainerHDFS

N1 N2 N3 NN…

Hadoop ClusterHBase

Analytics PlatformRules Engine

Visualization N1 N2 N3 NN…

Lucene IndexIntelligent Search

Visualization

Regional Analyst / Manager

©2012 Zaloni, Inc. All Rights Reserved.

National Sales Director

Page 9: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Local Data Capture for Machine Maintenance and Stockage

©2012 Zaloni, Inc. All Rights Reserved.

Page 10: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Regional Metrics and Intelligent Search

©2012 Zaloni, Inc. All Rights Reserved.

Page 11: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

National Level Metrics and Visualization

©2012 Zaloni, Inc. All Rights Reserved.

Page 12: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Same model for MilitMilitaryConditions Based M i tMaintenance(CBM)??

Page 13: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

CBM+ Hadoop Analytical Architecture

Scale out Real time

Data Bn Level lid tiEdge-T Collector collection consolidation

(BCCS Stack)

Edge-T Collector

Analytics

Crew ChiefAccessing 2410

HDFS

N1 N2 N3 NN…

Hadoop ClusterHBase

Analytics PlatformRules Engine

Visualization

MechanicAccessing LOGSA

& Parts N1 N2 N3 NN…

Lucene IndexIntelligent Search

Visualization& Parts

©2012 Zaloni, Inc. All Rights Reserved.

Avn Bde A/S4 Briefing Cdr on Maint Status

Page 14: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Local Data Capture / Retreival of M i t d L i ti l i f tiMaintenance and Logistical information

Page 15: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Insert AILA Network slideData

Page 16: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Condition BasedMaintenance (CBM)Maintenance (CBM)

PREVENTIVEPREVENTIVE INDICATORSINDICATORS DIAGNOSTICSDIAGNOSTICS PROGNOSTICSPROGNOSTICS ONON--CONDITIONCONDITIONPREVENTIVEPREVENTIVE INDICATORSINDICATORS DIAGNOSTICSDIAGNOSTICS PROGNOSTICSPROGNOSTICS ONON--CONDITIONCONDITION

• Reactive Maintenance

• Time Based

• Digital Source Collector Installation• Knowledge Development• Digital Source Collector Installation• Knowledge Development

• Proactive Maintenance

• ‘On Condition’

CBM Program The Purpose of Army Maintenance is to Generate Combat Power.

• Time Based Inspection/Overhaul

• Fault Diagnosis• Remaining Useful Life Calculation• Inspection Targeting

• Fault Diagnosis• Remaining Useful Life Calculation• Inspection Targeting

• On ConditionInspection/Overhaul

CBM Program Objectives:

• Decrease Maintenance Burden on the Soldier

Main Rotor HubMain Rotor ShaftBifilarsSwashplate ASSYSwashplate Guide

M/R SpindleSpherical BearingM/R Spindle Tie RodControl HornM/R Shaft ExtenderM/R D

T/R BladeT/R Pitch Change HornPC Links

T/R Pitch Change ShaftT/R Pitch Change Bearing

The Purpose of Army Maintenance is to Generate Combat Power.AR 750-1

• Increase Platform Availability and Readiness

• Enhance Safety

Swashplate Bearing**PC Links

M/R Damper

Tail Rotor GB**Retention Plates (2)Gearshaft

Viscous Bearings (4)**

M/R BladeM/R Blade Tip CapM/R Blade Expandable PinM/R Blade Cuff

T/R Driveshafts (7)**• Reduce Operations &

Support (O&S) Costs

Key CBM Enablers• Digital Source Collectors M i T i i M d l Engine (No 1 & 2) Oil C l A i l F

Intermediate GB**VibrationAbsorbers (2)

Digital Source Collectors• Flight Line Diagnostics• Data Fusion/Analysis

Currently Monitored by Vibration

Main Transmission ModuleAccessory ModuleGenerators (2)Hydraulic Pumps (2)Planetary Carrier

Engine (No. 1 & 2)Driveshafts (2)**Input Modules (2)

Oil Cooler Axial FanOil Cooler Fan Bearing**APU

This is NOT industry best practice…

Page 17: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas
Page 18: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Aviation Logistics Data

*2410IMMC

OSMISCEAC*OLR

IMMCCOBRAIMMC

*SOF/ASMAMCOM SAFETY

DailyYearly Daily

Quarterly

Weekly Yearly

Data sources currently being

displayed in FMAS marked with *

FMDR

*ACEIMMC

*AWRPIMSOL

Daily

*AMTracksAMCOM SAFETY

Daily

*LAR’SIMMC

Daily

*RIDB/Gold Book

*WOLFLOGSA

PIMSOL

MonthlyTOP

Live LinkDaily

*ACLOCPIMSOL

Daily

*RMISMonthly

RIDB/Gold BookLOGSA

Daily

TIER

Live Link Data Subscribers

Selected Data AccessDaily

*LMPLOGSA

Daily

RMISPIMSOL

CDDBLOGSA

D ilULLS HUMS

MDATPCMCIA

C d

SubscribersAED

Q-Tec

*PQDRAED

Monthly

18

Laptops(PC/QC/Shops)

Daily

BATTALION SERVERS

MDAT Card

DailyAC Laptops

Page 19: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

What’s The Problem?

I tiIncentives

Culture

19NetApp Confidential - Internal Use Only

Page 20: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Big Data ABCs

Analytics for extremelylarge datasets

GainInsight

Big

KeepEverything

Go Fast

gData

Performance for data intensive workloads

Secure boundlessdata storageg

20NetApp Confidential – Limited Use

Page 21: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Big Data Market Requirements

Distributed compute farms with standard software

Very high streaming bandwidth

Boundaryless containers (10’s PBs)standard software

Simpler systems, faster to deploy

Optimized balance

bandwidth Dense

configurations -GB/s/Rack Unit

(10 s PBs) Simple access to

dynamic datasets New access Optimized balance

between compute & storage

Reliable data

Large data sets (containers)

Simpler system

New access technology(CDMI) with legacy application interfaces

(metadata)p y

configurations Highly efficient, self-managing systems

21

Page 22: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Big Content Solutions

Solution Workloads/Environment

File Services Multiapplication/client workloads Frequent read/write access

Enterprise ContentRepositories

Infinite container Active archive

Distributed Content Repositories

Large, multisite repositoryP li b d d t tRepositories Policy-based data management

Page 23: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

What Is Apache Hadoop?An open-source software framework that: Supports data-intensive distributed Name Node /

HDFS

ppapplications

Enables applications to work with thousands of nodes and petabytes

Name Node /JobTracker

thousands of nodes and petabytes of data

Runs on a collection of commodity, h d thi

Map ReduceData Nodes /TaskTracker

shared-nothing servers Has two key components:

– HDFS. Hadoop Distributed File System

:

p y– MapReduce. Programming model for

processing and generating large datasetsData Nodes /TaskTracker

NetApp Confidential - Internal and Partner Use Only 23

Page 24: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

HadoopTarget MarketsTarget Markets

Full Motion Video (FMV)Full Motion Video (FMV)– Government agencies building multi-petabyte video capture/playback solutions from

satellites, drones, etc.

Media Content Management (MCM) Media Content Management (MCM)– Broadcast and cable networks video w/ ingest/playback active repositories

Seismic Processing Solution (SPS)– Upstream Energy customers building Seismic Processing systems

Open Solution for Hadoop (OSH)– Orgs building large Hadoop clusters requiring high bandwidth and densityOrgs. building large Hadoop clusters requiring high bandwidth and density

HPC Solution for Lustre (Lustre)– Government and research institutes building HPC solutions

24NetApp Confidential - Internal Use Only

Page 25: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Vertical ApplicationsEnterprise Sales reports

C t

Government Event

monitoring

Retail Sales reports

T t

Financial Sales reports

F d Customer analysis

Cost trends Quality

monitoring Terrorism

monitoring Data mining

Target marketing

New product rollout

Fraud detection

Targeted marketing Quality

controlData mining

Internet E-commerce

rollout Customer

loyalty Customer

marketing Customer data Compliance

Web sites Data mining Customer data

buying habits Inventory

predictions New customer

acquisition

NetApp Confidential - Internal and Partner Use Only 25

Page 26: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Enterprise-Class Features with Hadoop

Dynamic Disk Pools SSD Cache and Hybrid Storage Enhanced Snapshot CopiesDynamic Disk Pools6x faster rebuild times (minutes, not days) andcontinuous high performance during drive rebuild

SSD Cache and Hybrid StorageExpedited access to “hot” data through automated, real-time caching to SSD; mix and match SSDs and HDDs

Enhanced Snapshot CopiesMore precise recovery point objectives and faster recovery

Encrypted Drive SupportExtended security enhancements for compliance

Enterprise Replication Cost-effective enterprise-class disaster recovery of data with

Thin ProvisioningImprove storage utilization up to 35%

26

and regulationsFC and IP replication and eliminate overprovisioning

Page 27: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Use the ESG ReportValidated for the EnterpriseValidated for the Enterprise

Improved the performance of the H d l t b 62%Hadoop cluster by 62%

Completed jobs under drive f il th 200% f tfailure more than 200% faster

Delivered linear performance scalability as data nodes setsscalability as data nodes, sets expanded

The Open Solution for Hadoop improves capacity and performance efficiency and recoverability compared to a server-based DAS deployment.

ESG 2012

27

- ESG, 2012

Page 28: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Use Case Example: NetApp Auto Support

Phone home data representing information about the status NetApp storage controllers

Correlate disk latency (hot) with disk type

g

disk type– 24 billion records – 4 weeks to run query– Hadoop implementation 10.5 hours

Bug detection through pattern matchingmatching– 240 billion records – Too large to

run– Hadoop implementation 18 hours

28

Page 29: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

NIF: Big Science Needs Big Data

With a private cloud, NIF was able to:

• Deliver nonstop, 24/7 availability for critical data

• Cut planned downtime by 60 hours• Cut planned downtime by 60 hours per year to maximize facility availability

• Reduce latency by 97% to provide data to scientists within 15 minutes of an experimentan experiment

• Enable secure multi-tenancy to protect sensitive scientific and

© 2014 NetApp, Inc. All rights reserved. NetApp Proprietary – Limited Use Only29

government data

Page 30: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

State-of-the-Science in HPC Today

16 M / 20 P k P t FLOPS (20 1015)

LLNL Sequoia - designed to be the fastest supercomputer and storage combination on the planet

16 Max / 20 Peak PetaFLOPS (20 x 1015) 1.6 Million Cores 1.6 PB main memoryy > 1.3 Terabyte/sec using:

– 55 PB of usable storage – Lustre Cluster Parallel File SystemLustre Cluster Parallel File System– QDR (40 Gigabit) Infiniband

~8 MW of PowerSi l ti f Simulations for – Nuclear Weapons Viability – Counter Terrorism Now #1 of the Top 500 HPC

Systems in the world – Energy Security – Climate Change

30Press Release: http://www.netapp.com/us/company/news/news-rel-20110928-990734.html

Systems in the world.http://top500.org.

Page 31: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Big Data in HR: Some Fundamental PrinciplesPrinciples

Transparency PrivacyPrivacy “Do No Harm” Validity and Verification Validity and Verification Security

31NetApp Confidential - Internal Use Only

Page 32: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Big Data: Transforming theDesign Philosophy of Future InternetDesign Philosophy of Future Internet

32NetApp Confidential - Internal Use Only

Page 33: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

Remember: With Big Data it’s all aboutWith Big Data, it s all about…

I tiIncentives

Culture

33NetApp Confidential - Internal Use Only

Page 34: This Conference brought to you by - Big · PDF fileThis Conference brought to you by Linkedin/Group: Technology Training Corporation Technology Training ... NetApp 2. Conference Agendas

34