Real-Time Ingesting and Transforming Sensor Data and Social Data with NiFi and TensorFlow

Timothy Spann, Hortonworks (@PaaSDev)

© Hortonworks Inc. 2011 – 2017. All Rights Reserved


Agenda

• What do we want to do?
• Why?
• How?
• Apache NiFi
• TensorFlow
• Natural Language Processing
• Demo
• Questions

What do we want to do?

• MiniFi ingests camera images and sensor data
• Run TensorFlow Inception v3 to recognize objects in image
• NiFi stores images, metadata and enriched data in Hadoop
• NiFi ingests social data and feeds
• NiFi analyzes sentiment of textual data

Why Gather and Analyze Social Media Stream?

- Automate processes to maximize Social Media team’s time
- Improved response time to requests, complaints and emergencies in social media
- Predictive analytics to know when and where problems will happen
- Learn where unhappy customers are and address instantly

Collect: Bring Together
Aggregate all data from sensors, geo-location devices, machines and social feeds

Conduct: Mediate the Data Flow
Mediate point-to-point and bi-directional data flows, delivering data reliably to HBase, Hive, Slack and Email.

Curate: Gain Insights
Parse, filter, join, transform, fork, query, sort, dissect; enrich with weather, location, NLP and TensorFlow.

Why Apache NiFi?

• Guaranteed delivery
• Data buffering
  - Backpressure
  - Pressure release
• Prioritized queuing
• Flow specific QoS
  - Latency vs. throughput
  - Loss tolerance
• Data provenance
• Supports push and pull models
• Hundreds of processors
• Visual command and control
• Over fifty sources
• Flow templates
• Pluggable/multi-role security
• Designed for extension
• Clustering

DATA ENRICHMENT

DATA DISCOVERY

Inception v3

PREDICTIVE ANALYTICS

Sentiment Analysis

Why TensorFlow? Also Apache MXNet, PyTorch and DL4J.

• Google
• Multiple platform support
• Hadoop integration
• Spark integration
• Keras
• Large Community
• Python and Java APIs
• GPU Support
• Mobile Support
• Inception v3
• Clustering
• Fully functional demos
• Open Source
• Apache Licensed
• Large Model Library
• Buzz
• Extensive Documentation
• Raspberry Pi Support

Apache NiFi Integration with TensorFlow Options

• TensorFlow (C++, Python, Java) via ExecuteStreamCommand
• TensorFlow NiFi Java Custom Processor
• TensorFlow Running on Edge Nodes (MiniFi)
• TensorFlow Mobile (iOS, Android, RPi)
• TensorFlow on Spark (Yahoo) via Livy, S2S, Kafka
• TensorFlow Running in Containers in YARN 3.0 on Hadoop
• gRPC Call to TensorFlow Serving

ExecuteStreamCommand To TensorFlow

https://community.hortonworks.com/articles/58265/analyzing-images-in-hdf-20-using-tensorflow.html
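
ExecuteStreamCommand pipes the incoming flowfile's content to the command's standard input and turns whatever the command writes to standard output into the outgoing flowfile. A minimal Python sketch of a script shaped for that contract follows. The names `describe` and `run` are hypothetical, and the magic-byte format check is only a stand-in for the point where a TensorFlow classifier would be invoked; it is not taken from the linked article.

```python
import hashlib
import io
import json


def describe(data):
    """Summarize raw bytes; a real flow would call the classifier here instead."""
    if data.startswith(b"\xff\xd8\xff"):
        kind = "jpeg"
    elif data.startswith(b"\x89PNG\r\n\x1a\n"):
        kind = "png"
    else:
        kind = "unknown"
    return {
        "bytes": len(data),
        "format": kind,
        "sha256": hashlib.sha256(data).hexdigest(),
    }


def run(stdin, stdout):
    """Read the whole payload from a binary stream and write one JSON line."""
    payload = stdin.read()
    stdout.write(json.dumps(describe(payload)) + "\n")


if __name__ == "__main__":
    # Demonstration with in-memory streams (fake JPEG header plus padding).
    fake_in = io.BytesIO(b"\xff\xd8\xff\xe0" + b"\x00" * 16)
    fake_out = io.StringIO()
    run(fake_in, fake_out)
    print(fake_out.getvalue(), end="")
```

When saved as a script for NiFi, the demonstration block would instead call `run(sys.stdin.buffer, sys.stdout)` after `import sys`, so the flowfile content arrives on standard input.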

TensorFlow via Python

python classify_image.py --image_file /dir/solarroofpanel.jpg

solar dish, solar collector, solar furnace (score = 0.98316)
window screen (score = 0.00196)
manhole cover (score = 0.00070)
radiator (score = 0.00041)
doormat, welcome mat (score = 0.00041)
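
The classifier prints one line per prediction, with comma-separated labels followed by a score. A small sketch of turning that text into structured records, for example before storing it as metadata, might look like this. The function name `parse_predictions` and the regular expression are illustrative assumptions, not part of classify_image.py.

```python
import json
import re

# One prediction per line: labels, then "(score = N)". Spaces around "=" are optional.
LINE_PATTERN = re.compile(
    r"^(?P<labels>.+?)\s*\(score\s*=\s*(?P<score>[0-9.]+)\)\s*$"
)

# Sample text copied from the slide output.
SAMPLE_OUTPUT = """\
solar dish, solar collector, solar furnace (score = 0.98316)
window screen (score = 0.00196)
manhole cover (score = 0.00070)
radiator (score = 0.00041)
doormat, welcome mat (score = 0.00041)
"""


def parse_predictions(text):
    """Return a list of {"labels": [...], "score": float}, skipping other lines."""
    predictions = []
    for line in text.splitlines():
        match = LINE_PATTERN.match(line.strip())
        if match is None:
            continue
        labels = [label.strip() for label in match.group("labels").split(",")]
        predictions.append({"labels": labels, "score": float(match.group("score"))})
    return predictions


if __name__ == "__main__":
    # Print the first (highest-scoring) prediction as JSON.
    print(json.dumps(parse_predictions(SAMPLE_OUTPUT)[0]))
```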

TensorFlow Java Processor in NiFi

https://community.hortonworks.com/content/kbentry/116803/building-a-custom-processor-in-apache-nifi-12-for.html

https://github.com/tspannhw/nifi-tensorflow-processor

TensorFlow Running on Edge Nodes (MiniFi)

Installing TextBlob for Python

pip install -U textblob
python -m textblob.download_corpora

Installing spaCy for Python

pip install -U spacy
python -m spacy.en.download all

https://community.hortonworks.com/articles/76935/using-sentiment-analysis-and-nlp-tools-with-hdp-25.html

Installing NLTK for Python 2.7

pip install -U nltk
pip install -U numpy

http://www.nltk.org/install.html
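
After running the install commands above, a quick check that the packages are importable can save a failed flow later. This is a sketch using only the standard library; the helper name `missing_packages` is hypothetical.

```python
import importlib.util

# Top-level packages installed by the pip commands on this slide.
REQUIRED = ["textblob", "spacy", "nltk", "numpy"]


def missing_packages(names):
    """Return the names that cannot be located on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All NLP packages are importable.")
```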

Local Sentiment Analysis via Python

run.sh

python sentiment.py "$@"

sentiment.py

# Requires the VADER lexicon: nltk.download('vader_lexicon')
from nltk.sentiment.vader import SentimentIntensityAnalyzer
import sys

# Score the text passed as the first command-line argument.
sid = SentimentIntensityAnalyzer()
ss = sid.polarity_scores(sys.argv[1])
print('Compound {0} Negative {1} Neutral {2} Positive {3}'.format(
    ss['compound'], ss['neg'], ss['neu'], ss['pos']))
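
sentiment.py prints its four scores on one line. Assuming that output format, the sketch below parses the line back into numbers and maps the compound score to a label. The helper names are hypothetical, the sample values are made up, and the ±0.05 cut-offs follow the convention commonly used with VADER rather than anything stated in the slides.

```python
import re

# Each score is printed as a name followed by a number; whitespace is optional.
PAIR_PATTERN = re.compile(r"(Compound|Negative|Neutral|Positive)\s*(-?\d+(?:\.\d+)?)")


def parse_scores(line):
    """Turn 'Compound 0.6 Negative 0.0 ...' into {'compound': 0.6, ...}."""
    return {name.lower(): float(value) for name, value in PAIR_PATTERN.findall(line)}


def label_for(compound, threshold=0.05):
    """Map a compound score in [-1, 1] to a coarse sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    # Made-up sample line in the format printed by sentiment.py.
    sample = "Compound 0.6249 Negative 0.0 Neutral 0.494 Positive 0.506"
    scores = parse_scores(sample)
    print(label_for(scores["compound"]), scores)
```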

Installing Apache OpenNLP NiFi Processor

Apache OpenNLP for Entity Resolution Processor
https://github.com/tspannhw/nifi-nlp-processor

Requires installation of NAR and Apache OpenNLP BINs

This is a non-supported processor that I wrote and put into the community.

https://community.hortonworks.com/articles/80418/open-nlp-example-apache-nifi-processor.html

Installing Stanford CoreNLP Processor

Stanford CoreNLP Processor
https://github.com/tspannhw/nifi-corenlp-processor

Requires install of NAR and Stanford English Models
http://nlp.stanford.edu/software/stanford-english-corenlp-2017-06-09-models.jar

This is a non-supported processor that I wrote and put into the community.

https://community.hortonworks.com/articles/81270/adding-stanford-corenlp-to-big-data-pipelines-apac-1.html

Contact:

Timothy Spann

@PaaSDev

http://www.meetup.com/futureofdata-princeton

https://dzone.com/users/297029/bunkertor.html

https://github.com/tspannhw/dws2017sydney/blob/master/README.md

http://community.hortonworks.com/users/9304/tspann.html

https://community.hortonworks.com/content/kbentry/116803/building-a-custom-processor-in-apache-nifi-12-for.html
https://community.hortonworks.com/articles/118132/minifi-capturing-converting-tensorflow-inception-t.html
https://community.hortonworks.com/articles/73833/an-example-websocket-application-in-apache-nifi-11.html
https://community.hortonworks.com/articles/81694/extracttext-nifi-custom-processor-powered-by-apach.html
https://community.hortonworks.com/articles/79842/ingesting-osquery-into-apache-phoenix-using-apache.html
https://community.hortonworks.com/articles/67980/using-command-line-security-tools-from-apache-nifi.html
https://community.hortonworks.com/articles/52415/processing-social-media-feeds-in-stream-with-apach.html
https://community.hortonworks.com/articles/121916/controlling-big-data-flows-with-gestures-minifi-ni.html
https://community.hortonworks.com/articles/86570/hosting-and-ingesting-data-from-web-pages-desktop.html
https://community.hortonworks.com/articles/63228/monitoring-your-containers-with-sysdig-from-hdf-20.html
https://community.hortonworks.com/articles/101679/iot-ingesting-gps-data-from-raspberry-pi-zero-wire.html
https://community.hortonworks.com/articles/101904/part-2-iot-augmenting-gps-data-with-weather.html
https://community.hortonworks.com/articles/110475/ingesting-sensor-data-from-raspberry-pis-running-r.html
https://community.hortonworks.com/articles/76240/using-opennlp-for-identifying-names-from-text.html
https://community.hortonworks.com/articles/76935/using-sentiment-analysis-and-nlp-tools-with-hdp-25.html

https://community.hortonworks.com/articles/76924/data-processing-pipeline-parsing-pdfs-and-identify.html
https://community.hortonworks.com/articles/80339/iot-capturing-photos-and-analyzing-the-image-with.html
https://community.hortonworks.com/articles/122077/ingesting-csv-data-and-pushing-it-as-avro-to-kafka.html
https://github.com/tspannhw/nifi-tensorflow-processor
https://community.hortonworks.com/articles/118148/creating-wordclouds-from-dataflows-with-apache-nif.html
https://community.hortonworks.com/articles/110469/simple-backup-and-restore-of-hdfs-data-via-hdf-30.html
https://github.com/tspannhw/rpi-rainbowhat
https://community.hortonworks.com/articles/108718/ingesting-rdbms-data-as-new-tables-arrive-automagi.html
https://community.hortonworks.com/articles/108947/minifi-for-ble-bluetooth-low-energy-beacon-data-in.html
https://community.hortonworks.com/articles/108966/minifi-for-sensor-data-ingest-from-devices.html
https://github.com/tspannhw/rpi-sensehat-minifi-python
https://community.hortonworks.com/articles/107379/minifi-for-image-capture-and-ingestion-from-raspbe.html
https://community.hortonworks.com/articles/104255/ingesting-and-testing-jms-data-with-nifi.html
https://community.hortonworks.com/articles/103863/using-an-asus-tinkerboard-with-tensorflow-and-pyth.html

https://community.hortonworks.com/articles/104226/simple-backups-of-hadoop-with-apache-nifi-12.html
https://community.hortonworks.com/articles/99861/ingesting-ibeacon-data-via-ble-to-mqtt-wifi-gatewa.html
https://community.hortonworks.com/articles/92345/store-a-flow-to-disk-and-then-reserialize-it-to-co.html
https://community.hortonworks.com/articles/92495/monitor-apache-nifi-with-apache-nifi.html
https://community.hortonworks.com/articles/92496/qadcdc-our-how-to-ingest-some-database-tables-to-h.html
https://community.hortonworks.com/articles/89455/ingesting-gps-data-from-onion-omega2-devices-with.html
https://community.hortonworks.com/articles/87397/steganography-with-apache-nifi-1.html
https://community.hortonworks.com/articles/87632/ingesting-sql-server-tables-into-hive-via-apache-n.html
https://community.hortonworks.com/articles/88404/adding-and-using-hplsql-and-hivemall-with-hive-mac.html

Hortonworks Community Connection

Read access for everyone, join to participate and be recognized

• Full Q&A Platform (like StackOverflow)
• Knowledge Base Articles
• Code Samples and Repositories

Community Engagement

Participate now at: community.hortonworks.com

4,000+ Registered Users
10,000+ Answers
15,000+ Technical Assets

One Website!