Transcript

© Hortonworks Inc. 2011–2017. All Rights Reserved

Ingesting Drone Data into Big Data Platforms
Timothy Spann (@PaasDev)
Oracle Code NYC 2017
https://github.com/tspannhw/IngestingDroneData

[BRK1238-NYC]


Agenda

• Drones 101
• Metadata Insights
• Ingestion Formats: MQTT, Files, REST JSON, Images, FTP and more.
• Apache NiFi
• Big Data Storage: HDFS, HBase, Phoenix, Hive


Code Walk Through

• Processing Images with TensorFlow for Image Recognition
• Writing a Java 8 Processor for Sentiment Analysis
• Writing a Java 8 Processor for analyzing HTML
• Writing a Java 8 Microservice for retrieving Phoenix/HBase data
• Writing a Java 8 Microservice for retrieving Hive data
• Writing a NiFi flow to wire it all together


UAV / Drones

A drone is an unmanned aircraft, also known as an unmanned aerial vehicle (UAV) or unmanned aircraft system (UAS).

For enterprise purposes, we are thinking of serious drones that carry high-resolution video/still cameras with onboard sensors and GPS.

Almost all drones are regulated by the Federal Aviation Administration (FAA), and under its strict regulations you must register your drone. If you aren't prepared for the regulations or the cost, keep your drone in your house or under half a pound.


MetaData EXIF Geotagging

JPEG photographs taken from UAVs carry extra data stored in EXIF. Apache NiFi can extract this information, which is very helpful. The most important information it contains is the GPS latitude and longitude.

GPS Altitude Ref: Sea level
GPS Altitude: 21 metres
GPS Longitude: -73° 4' 50.41
GPS Latitude: 40° 45' 16.51
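EXIF stores the position as degrees, minutes and seconds, which usually has to become decimal degrees before it can be stored or mapped. A minimal Python sketch of that conversion, assuming the string layout shown on this slide; the function name `dms_to_decimal` is illustrative and not part of NiFi or any EXIF library:

```python
import re

# Matches strings such as -73° 4' 50.41 or 40° 45'16.51" (trailing quote optional).
DMS_PATTERN = re.compile(r"""^\s*(-?)(\d+)°\s*(\d+)'\s*(\d+(?:\.\d+)?)"?\s*$""")


def dms_to_decimal(dms):
    """Convert a degrees/minutes/seconds string to signed decimal degrees."""
    match = DMS_PATTERN.match(dms)
    if match is None:
        raise ValueError("unrecognized DMS value: %r" % dms)
    sign, degrees, minutes, seconds = match.groups()
    value = int(degrees) + int(minutes) / 60.0 + float(seconds) / 3600.0
    # The sign is captured separately so that values such as -0° 30' 0 stay negative.
    return -value if sign == "-" else value


if __name__ == "__main__":
    print(round(dms_to_decimal("-73° 4' 50.41"), 6))
    print(round(dms_to_decimal("40° 45' 16.51"), 6))
```

For the values on this slide, the results agree with the `geo:long` and `geo:lat` fields in the extracted metadata to within about 0.00001 degrees.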


hdfs dfs -get /drone/meta/Bebop2_20160920085308-0400.json

{"date":"2016-09-20T08:53:08","Compression":"JPEG","ExifVersion":"2.10","ComponentsConfiguration":"YCbCr","file.group":"root","CompressionType":"Baseline","Image Description":"{\"product_id\":\"090C\",\"uuid\":\"CE112B4276E13339AFBAE10E9ED794E3\",\"run_date\":\"2016-09-20T083537-0400\",\"filename\":\"Bebop_2_2016-09-20T085308-0400_CE112B.jpg\",\"media_date\":\"2016-09-20T085308-0400\"}","NumberofComponents":"3","Component2":"Cbcomponent:Quantizationtable1,Samplingfactors1horiz/1vert","Focal Length":"1.83mm","Component 1":"Ycomponent:Quantizationtable0,Samplingfactors2horiz/2vert","YCbCr Positioning":"Center ofpixelarray","tiff:ResolutionUnit":"Inch","uuid":"9576a956-ec32-4314-bc8b-bd5bb43af4f8","Date/TimeOriginal":"2016:09:2008:53:08","ShutterSpeedValue":"1/249sec","X Resolution":"72dotsperinch","tiff:Make":"PARROT","path":"/","PhotometricInterpretation":"YCbCr","Component3":"Crcomponent:Quantizationtable1,Samplingfactors1horiz/1vert","Unique ImageID":"[32bytes]","F-Number":"F2.3","modified":"2016-09-20T08:53:08","FocalLength35":"6mm","tiff:BitsPerSample":"8","ExposureProgram":"Program action(high-speedprogram)","GPSVersionID":"2.200","GPSLatitudeRef":"N","meta:creation-date":"2016-09-20T08:53:08","exif:FNumber":"2.2999999166745724","GPSAltitudeRef":"Sea level","Exposure Time":"8597231/2147483647sec","GPS Longitude":"-73° 4'50.41\"","Creation-Date":"2016-09-20T08:53:08","ISOSpeedRatings":"426","Make":"PARROT","Orientation":"Top,leftside(Horizontal/normal)","MeteringMode":"(Other)","tiff:Orientation":"1","GPSLongitudeRef":"W","tiff:Software":"Dragon3.9.0","exif:FocalLength":"1.8300001180412324","filename":"Bebop2_20160920085308-0400.jpg","XMPValueCount":"9","geo:long":"-73.080669","file.owner":"root","Software":"Dragon3.9.0","ExifImageHeight":"1088pixels","tiff:YResolution":"72.0","YResolution":"72dotsperinch","GPSLatitude":"40° 45'16.51\"","dc:description":"{\"product_id\":\"090C\",\"uuid\":\"CE112B4276E13339AFBAE10E9ED794E3\",\"run_date\":\"2016-09-20T083537-0400\",\"filename\":\"Bebop_2_2016-09-20T085308-0400_CE112B.jpg\",\"media_date\":\"2016-09-20T085308-0400\"}","geo:lat":"40.754585","FlashPixVersion":"1.00","DataPrecision":"8bits","White Balance":"Flash","tiff:ImageLength":"1088","description":"{\"product_id\":\"090C\",\"uuid\":\"CE112B4276E13339AFBAE10E9ED794E3\",\"run_date\":\"2016-09-20T083537-0400\",\"filename\":\"Bebop_2_2016-09-20T085308-0400_CE112B.jpg\",\"media_date\":\"2016-09-20T085308-0400\"}","dcterms:created":"2016-09-20T08:53:08","dcterms:modified":"2016-09-20T08:53:08","Last-Modified":"2016-09-20T08:53:08","file.permissions":"rwxrwxrwx","exif:ExposureTime":"0.00400339765660623","Last-Save-Date":"2016-09-20T08:53:08","GPSAltitude":"21metres","absolute.path":"/opt/demo/dronedata/","ColorSpace":"Undefined","File Size":"853159bytes","meta:save-date":"2016-09-20T08:53:08","file.creationTime":"2016-09-20T12:53:10+0000","Date/TimeDigitized":"2016:09:2008:53:08","FileName":"apache-tika-941370357006559178.tmp","Content-Type":"image/jpeg","Aperture Value":"F2.3","X-Parsed-By":"org.apache.tika.parser.DefaultParser,org.apache.tika.parser.jpeg.JpegParser","FileModifiedDate":"Tue Sep2023:33:24UTC2016","tiff:XResolution":"72.0","file.lastModifiedTime":"2016-09-20T12:53:10+0000","exif:DateTimeOriginal":"2016-09-20T08:53:08","Date/Time":"2016:09:2008:53:08","ExifImageWidth":"1920pixels","Image Height":"1088pixels","Image Width":"1920pixels","Unknown tag(0xc62f)":"[19bytes]","ResolutionUnit":"Inch","tiff:Model":"Bebop2","exif:IsoSpeedRatings":"426","MaxApertureValue":"F2.3","ExposureMode":"Auto exposure","Model":"Bebop 2","file.lastAccessTime":"2016-09-20T23:33:24+0000","tiff:ImageWidth":"1920","WhiteBalanceMode":"Auto whitebalance"}
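Note that the `Image Description` field is itself a JSON document stored as an escaped string, so it needs a second decoding pass. A small Python sketch of pulling a few fields out of a record like the one above; the sample is trimmed to keys that appear in this file, and the `summarize` helper and its output field names are hypothetical:

```python
import json

# Trimmed sample using a few of the keys from the metadata file above.
SAMPLE = r'''{"geo:lat":"40.754585","geo:long":"-73.080669","GPSAltitude":"21metres",
"Image Description":"{\"product_id\":\"090C\",\"uuid\":\"CE112B4276E13339AFBAE10E9ED794E3\"}"}'''


def summarize(metadata_json):
    """Return the coordinates and drone identifiers from one metadata record."""
    meta = json.loads(metadata_json)
    # Second pass: the description value is JSON encoded inside a JSON string.
    description = json.loads(meta.get("Image Description", "{}"))
    return {
        "latitude": float(meta["geo:lat"]),
        "longitude": float(meta["geo:long"]),
        "product_id": description.get("product_id"),
        "uuid": description.get("uuid"),
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```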


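import paho.mqtt.client as mqtt

# Connect to the broker with a username and password.
client = mqtt.Client()
client.username_pw_set("username", "password")
# connect(host, port, keepalive in seconds)
client.connect("MQTT_Broker", 14162, 60)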

Sources and Formats

{ "product_id": "090C", "uuid": "CE112B4276E13339AFBAE10E9ED794E3", "run_date": "2016-09-20T083537-0400", "filename": "Bebop_2_2016-09-20T084230-0400_CE112B.jpg", "media_date": "2016-09-20T084230-0400" }

http://localhost:9999/dronelist
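The `run_date` and `media_date` fields in the message above use a compact timestamp with no colons, so a standard ISO 8601 parser may reject them. A minimal sketch of decoding such a message, assuming Python 3 (where `strptime` accepts `%z` offsets such as `-0400`); the `parse_drone_message` helper is illustrative only:

```python
import json
from datetime import datetime

MESSAGE = '''{ "product_id": "090C", "uuid": "CE112B4276E13339AFBAE10E9ED794E3",
"run_date": "2016-09-20T083537-0400",
"filename": "Bebop_2_2016-09-20T084230-0400_CE112B.jpg",
"media_date": "2016-09-20T084230-0400" }'''

# Date, then "T", then HHMMSS with no separators, then the UTC offset.
DRONE_TIME_FORMAT = "%Y-%m-%dT%H%M%S%z"


def parse_drone_message(payload):
    """Decode the JSON payload and convert its two timestamps to datetimes."""
    record = json.loads(payload)
    for field in ("run_date", "media_date"):
        record[field] = datetime.strptime(record[field], DRONE_TIME_FORMAT)
    return record


if __name__ == "__main__":
    record = parse_drone_message(MESSAGE)
    print(record["media_date"].isoformat())
    print(record["media_date"] - record["run_date"])
```

The same function could be applied to a payload received over MQTT or fetched from a REST endpoint, since both deliver the message as text.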


Reporting Drone Data Ingest

- NiFi pulls in Bebop 2 Drone images
- NiFi routes and parses metadata from drone images including geo data
- NiFi uses TensorFlow Inception v3 to recognize objects in image
- NiFi stores images, metadata and enriched data in Hadoop.
- NiFi ingests social and weather feeds
- Java 8 Processor Runs Sentiment
- Spring Boot App Displays Hive Data
- Python Sentiment Analysis scripts


Developed by the NSA over the last 8 years.

"NSA's innovators work on some of the most challenging national security problems imaginable," "Commercial enterprises could use it to quickly control, manage, and analyze the flow of information from geographically dispersed sites – creating comprehensive situational awareness"

-- Linda L. Burger, Director of the NSA Technology Transfer Program

NiFi Developed by the National Security Agency


• Visual Command and Control: For agile and immediate creation, configuration, control of dataflows
• Data Lineage (Provenance): Ensures trust of your data
• Data Prioritization: Because not all data is of equal importance
• Data Buffering / Back-Pressure: Since not all senders/receivers/connections work perfectly all the time
• Control Latency vs Throughput: Adapt to different situations with different requirements
• Secure Control Plane / Data Plane: Security of data, and data access
• Scale out Clustering: Scalability
• Extensibility: Ecosystem flexibility and growth

Apache NiFi: Designed for 8 challenges of global enterprise dataflow


FlowFile
• Unit of data moving through the system
• Content + Attributes (key/value pairs)

Processor
• Performs the work, can access FlowFiles

Connection
• Links between processors
• Queues that can be dynamically prioritized

Terminology
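To make the three terms concrete, here is a toy Python model of them. It is a conceptual sketch only and does not use or mirror the NiFi API: the class names, the `tag_images` processor and the `priority` attribute are all invented for illustration, and the prioritizer is fixed when the connection is created rather than changed at runtime.

```python
import heapq
import itertools


class FlowFile(object):
    """Content plus key/value attributes."""

    def __init__(self, content, attributes=None):
        self.content = content
        self.attributes = dict(attributes or {})


class Connection(object):
    """A queue between two processors, ordered by a prioritizer function."""

    def __init__(self, prioritizer):
        self._prioritizer = prioritizer    # lower value is delivered first
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps arrival order

    def put(self, flowfile):
        entry = (self._prioritizer(flowfile), next(self._counter), flowfile)
        heapq.heappush(self._heap, entry)

    def get(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


def tag_images(flowfile):
    """A stand-in processor: reads a FlowFile and adds one attribute."""
    is_image = flowfile.attributes.get("filename", "").endswith(".jpg")
    flowfile.attributes["is.image"] = str(is_image).lower()
    return flowfile


if __name__ == "__main__":
    queue = Connection(lambda ff: int(ff.attributes.get("priority", "5")))
    queue.put(FlowFile(b"{}", {"filename": "weather.json", "priority": "9"}))
    queue.put(FlowFile(b"", {"filename": "Bebop2_20160920085308-0400.jpg", "priority": "1"}))
    print(tag_images(queue.get()).attributes)
```

The processor only sees FlowFiles handed to it by a connection, and the connection decides the order, which is the separation of concerns the terminology describes.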


Typical NiFi Cluster Logical View


Connecting Data Between Ecosystems Without Coding: 180+ Processors

All Apache project logos are trademarks of the ASF and the respective projects.

Operations: Hash, Extract, Merge, Duplicate, Scan, GeoEnrich, Replace, Convert, Split, Translate, Route Content, Route Context, Route Text, Control Rate, Distribute Load, GenerateTableFetch, JoltTransformJSON, Prioritized Delivery, Encrypt, Tail, Evaluate, Execute, Fetch

Protocols and formats: HTTP, Syslog, Email, HTML, Image, HL7, FTP, UDP, XML, SFTP, AMQP, WebSocket


Hadoop Distributed File System (HDFS)

• Fault-Tolerance
• Multiple copies provide performance boost
• Replication Level is configurable
• Full checksums
• Rack awareness
• Files split into blocks distributed on three* servers
• Commodity Hardware
• Near limitless horizontal scalability
• Looks like Linux File System
• Web UI
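The block-splitting and replication points above come down to simple arithmetic. A rough sketch, assuming a 128 MB block size and a replication level of 3 (both are configurable, so treat the defaults here as assumptions); the `hdfs_footprint` function is illustrative and not an HDFS API:

```python
import math


def hdfs_footprint(file_size_bytes, block_size_bytes=128 * 1024 * 1024, replication=3):
    """Estimate how many blocks and block replicas a file occupies."""
    blocks = int(math.ceil(file_size_bytes / float(block_size_bytes)))
    return {
        "blocks": blocks,
        "block_replicas": blocks * replication,
        # The last block is not padded, so raw usage is size times replication.
        "raw_bytes": file_size_bytes * replication,
    }


if __name__ == "__main__":
    # The 853159-byte drone image from the metadata slide fits in one block.
    print(hdfs_footprint(853159))
    # A 1 GiB video would be split into several blocks.
    print(hdfs_footprint(1024 ** 3))
```

Small files such as individual drone images each occupy their own block entry, which is one reason flows often merge many small files before writing to HDFS.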

Hadoop Scalable Storage and Compute

Hive LLAP High Performance SQL Data Mart


What Are Apache HBase and Phoenix?

• Flexible Schema
• Millisecond Latency
• SQL and NoSQL Interfaces
• Store and Process Petabytes of Data
• Scale out on Commodity Servers
• Integrated with YARN
• 100% Open Source
• On Top of HDFS

[Diagram: YARN (Data Operating System) running HBase RegionServers 1 through N on top of HDFS (Permanent Data Storage). Callouts: Flexible Schema, Extreme Low Latency, Directly Integrated with Hadoop, SQL and NoSQL Interfaces]


What Are Apache Phoenix and HBase?

Apache HBase is a distributed database modeled after Google's Bigtable and designed to provide real-time access to data in Hadoop. Apache Phoenix provides an ANSI SQL interface to HBase.

Features:
• Real-Time Data Management for Hadoop
• PB+ Scale
• ANSI SQL Interface
• Secondary Indexes
• Cross-DC Replication
• Fine-Grained Security
• Develop in JDBC, ODBC, .NET, and more


What Is Apache Hive?

Apache Hive is a SQL data warehouse infrastructure that delivers fast, scalable SQL processing on Hadoop and in the Cloud.

Features:
• Extensive SQL:2011 Support
• ACID Transactions
• In-Memory Caching
• Cost-Based Optimizer
• User-Based Dynamic Security
• Replication and Disaster Recovery
• JDBC and ODBC Support
• Compatible with every major BI Tool
• Proven at 300+ PB Scale
• On Top of HDFS


CODE!!!!


python classify_image.py --image_file /opt/demo/dronedata/Bebop2_20160920083655-0400.jpg
solar dish, solar collector, solar furnace (score = 0.98316)
window screen (score = 0.00196)
manhole cover (score = 0.00070)
radiator (score = 0.00041)
doormat, welcome mat (score = 0.00041)

bazel-bin/tensorflow/examples/label_image/label_image --image=/opt/demo/dronedata/Bebop2_20160920083655-0400.jpg
tensorflow/examples/label_image/main.cc:204] solar dish (577): 0.983162
I tensorflow/examples/label_image/main.cc:204] window screen (912): 0.00196204
I tensorflow/examples/label_image/main.cc:204] manhole cover (763): 0.000704005
I tensorflow/examples/label_image/main.cc:204] radiator (571): 0.000408321
I tensorflow/examples/label_image/main.cc:204] doormat (972): 0.000406186

TensorFlow via Python or C++ Binary (Java Library Is New!)
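When the Python classifier is called from a flow, its text output has to be turned into structured values before it can be stored as attributes or rows. A minimal sketch of parsing the `(score = ...)` lines shown above; the `parse_classifications` helper is hypothetical, and the pattern only covers the Python script's output format, not the C++ log lines:

```python
import re

# One result per line: comma-separated labels followed by "(score = 0.12345)".
SCORE_LINE = re.compile(r"^(?P<labels>.+?)\s*\(score\s*=\s*(?P<score>[0-9.]+)\)\s*$")


def parse_classifications(output):
    """Return a list of (labels, score) tuples from classify_image.py output."""
    results = []
    for line in output.splitlines():
        match = SCORE_LINE.match(line.strip())
        if match:
            labels = [label.strip() for label in match.group("labels").split(",")]
            results.append((labels, float(match.group("score"))))
    return results


if __name__ == "__main__":
    sample = (
        "solar dish, solar collector, solar furnace (score = 0.98316)\n"
        "window screen (score = 0.00196)\n"
    )
    for labels, score in parse_classifications(sample):
        print(labels[0], score)
```

Lines that do not match the pattern, such as TensorFlow warnings, are skipped rather than treated as errors.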


/opt/demo/sentiment/run.sh
python /opt/demo/sentiment/sentiment.py "$@"

from nltk.sentiment.vader import SentimentIntensityAnalyzer
import sys

# Requires the VADER lexicon: nltk.download('vader_lexicon')
# Score the text passed as the first command-line argument.
sid = SentimentIntensityAnalyzer()
ss = sid.polarity_scores(sys.argv[1])
print('Compound {0} Negative {1} Neutral {2} Positive {3}'.format(
    ss['compound'], ss['neg'], ss['neu'], ss['pos']))

or

# Reduce the compound score to a single label.
if ss['compound'] == 0.00:
    print('Neutral')
elif ss['compound'] < 0.00:
    print('Negative')
else:
    print('Positive')

Sentiment Analysis via Python

Pretty easy and I will show you.

Another example with TextBlob.

You will need Python 2.7 and PIP installed.


https://pip.pypa.io/en/latest/installing/
http://www.nltk.org/install.html

wget https://bootstrap.pypa.io/get-pip.py
python get-pip.py
sudo pip install -U nltk
sudo pip install -U numpy
sudo pip install -U textblob
sudo python -m textblob.download_corpora

Installing Sentiment Analysis Libraries for Python 2.7

Installing TensorFlow for Python 2.7
See: https://www.tensorflow.org/install/install_mac

Installation is getting easier for TensorFlow, but you will need build tools and Python installed. You need to find out if you have a GPU supported by CUDA. If so, your performance will be greatly improved.


Installation

Download the binary from here: http://hortonworks.com/downloads/#dataflow

Or here: https://nifi.apache.org/download.html

Or on Mac: brew install nifi

https://nifi.apache.org/docs/nifi-docs/html/getting-started.html#starting-nifi

bin/nifi.sh start


• https://hortonworks.com/hadoop-tutorial/learning-ropes-apache-nifi/
• https://github.com/jfrazee/awesome-nifi
• https://dzone.com/articles/getting-started-with-apache-nifi-and-hdf
• https://nifi.apache.org/docs.html
• https://community.hortonworks.com/articles/4356/getting-started-with-nifi-expression-language-and.html
• https://github.com/tspannhw/rpi-sensehat-mqtt-nifi
• https://github.com/tspannhw/iot-scripts
• https://dzone.com/articles/apache-nifi-10-cheatsheet

Learning More


• https://www.parrot.com/us/drones/parrot-bebop-2#parrot-bebop-2
• https://en.wikipedia.org/wiki/Unmanned_aerial_vehicle
• http://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/label_image/README.md
• https://www.tensorflow.org/tutorials/image_recognition
• https://github.com/tensorflow/tensorflow/blob/master/tensorflow/java/src/main/java/org/tensorflow/examples/LabelImage.java
• https://community.hortonworks.com/content/kbentry/83100/deep-learning-iot-workflows-with-raspberry-pi-mqtt.html
• https://sites.google.com/site/pud2gpxkmlcsv/
• http://knowbeforeyoufly.org/
• https://www.faa.gov/uas/getting_started/

Resources


• Ken Kranz – Director of UAS Big Data at Cognizant. @kenkranz
• Joe Witt – Senior Director of Engineering at Hortonworks. @joewitt26
• Chris Casano – Solution Engineer Manager at Hortonworks.
• Tom McCuch – Sr. Technical Director at Hortonworks. @tmccuch
• Ingesting Drone Data into Big Data Platforms: https://github.com/tspannhw/IngestingDroneData

Thanks


Contact:

Timothy Spann @PaaSDeV

www.meetup.com/futureofdata-princeton
https://dzone.com/users/297029/bunkertor.html
community.hortonworks.com/users/9304/tspann.html

