
Demo: Tweeting Camera - A New Paradigm of Event-based Smart Sensing Device

Yuhui Wang 1, Christian von der Weth 2, Thomas Winkler 3, and Mohan Kankanhalli 4

1 NUS Graduate School for Integrative Sciences and Engineering
1,2,4 Interactive Digital Media Institute, 4 Department of Computer Science
National University of Singapore, Singapore
3 ams AG

[email protected], [email protected], [email protected]

ABSTRACT
Cameras are among the most widely used physical sensors monitoring our world. However, high bandwidth requirements and privacy concerns impede sharing their data with the public, who could benefit from being notified about ongoing situations. In contrast, smart cameras are currently designed for dedicated scenarios, i.e., users are limited by the predefined algorithms on board. In this work, we demonstrate a novel paradigm of tweeting cameras for event detection and recognition which can be customized by users for different purposes. Similar to humans, the camera is able to "tweet" via social networks once it detects events of interest, instead of continuously streaming video data. By following the camera and replying to its tweets, humans can join the sensing loop and help the camera improve its self-learning. We showcase our system in face and general event recognition scenarios, where the camera learns from humans what it has captured and tweets once the event status changes.

CCS Concepts
• Human-centered computing → Interaction design; • Computer systems organization → Embedded and cyber-physical systems; Sensors and actuators;

Keywords
tweeting camera; smart camera sensors; event detection

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).
ICDSC '16, September 12-15, 2016, Paris, France
© 2016 Copyright held by the owner/author(s).
ACM ISBN 978-1-4503-4786-0/16/09.
DOI: http://dx.doi.org/10.1145/2967413.2974034

1. INTRODUCTION
Today's CCTV cameras are a major contributor to the phenomenon of Big Data [4], as they continuously monitor their surroundings. Most cameras or installations only passively observe the world, requiring manual inspection of the raw visual footage. The high bandwidth consumption and privacy concerns involved prevent camera data from being made publicly available to parties that could benefit from receiving situational information stemming from these "digital eyes". In contrast, smart cameras, which combine video capturing, processing, and communication on a single embedded platform [6], have been built to analyze captured images and enable a variety of applications ranging from surveillance [2] and event detection [3] to smart homes [1]. However, existing smart camera solutions are typically designed for a dedicated use case and deployed in a closed setting, i.e., without any effort to share the data. Moreover, the predefined algorithms running on them cannot be customized to the different needs of users, and there are no mechanisms for supporting interaction between users and cameras.

In this demo, we propose a novel paradigm of an event-based tweeting camera system to complement classic streaming-based visual sensors. The camera is designed to "tweet" via social networks about captured events of interest (e.g., meeting events or faces), so that users can obtain information about an ongoing situation. The tweeting camera is built on the off-the-shelf Raspberry Pi platform, which processes the captured images by extracting semantic information about occurring events and then posts it on Twitter via its associated Twitter account. Users can follow this camera to be notified about any live updates. Moreover, to protect privacy, for a known event the camera tweets only a symbolic semantic compression (text) instead of disclosing raw images. For a new event, the camera tweets a cartoonized image. The novelty of our approach is that authenticated users (e.g., the camera owner) can reply to such tweets, thus providing labels for the image of that new event. Based on these new labels, the camera re-trains its model and learns to correctly classify the previously unknown event. Simply put, tweeting cameras become smarter over time by utilizing labelled data newly created by human users in the loop.
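The following Python sketch illustrates this tweet-or-learn decision at a conceptual level; all helper functions, the label set, and the confidence threshold are hypothetical placeholders, not the code running on the actual prototype.

```python
# Illustrative sketch of the tweet-or-learn decision described above.
# All helpers, the label set, and the 0.8 threshold are hypothetical placeholders.

KNOWN_LABELS = {"meeting", "empty room"}

def classify_event(image_path):
    """Placeholder for the cloud-based classifier; returns (label, confidence)."""
    return "unknown", 0.0

def cartoonize(image_path):
    """Placeholder: would return a cartoonized version of the image."""
    return image_path

def post_tweet(text, media=None):
    """Placeholder: would post via the camera's Twitter account and return a tweet id."""
    print("TWEET:", text, "(with media)" if media else "")
    return 1

def handle_frame(image_path):
    label, confidence = classify_event(image_path)
    if label in KNOWN_LABELS and confidence > 0.8:
        # Known event: tweet only a symbolic text description (privacy-preserving).
        post_tweet("Event detected: %s" % label)
    else:
        # Unknown event: tweet a cartoonized image and ask the owner for a label.
        return post_tweet("Unknown event, please reply with a label",
                          media=cartoonize(image_path))

def handle_reply(reply_text):
    # The owner's reply supplies a label that is added to the known set and
    # later used to re-train the recognition model (re-training omitted here).
    KNOWN_LABELS.add(reply_text.strip().lower())
```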

2. HARDWARE COMPONENTS
Our camera prototype (Fig. 1) is based on a Raspberry Pi 3 single-board computer, which is equipped with a 1.2 GHz 64-bit quad-core ARMv8 CPU, 1 GB of SDRAM, and an on-board wireless module. A Raspberry Pi camera module or a Sony IMX219 8-megapixel sensor can be connected to the board via the camera serial interface (CSI) or USB, respectively. Both cameras support static images with a maximum resolution of 2592x1944 pixels. The system runs an embedded Raspbian Jessie OS booted from a 64 GB microSD card.
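As a minimal capture sketch, the snippet below assumes the standard picamera Python library available on Raspbian; the file path and fixed resolution are illustrative choices only, not taken from the prototype.

```python
# Minimal still-image capture on a Raspberry Pi, assuming the picamera library.
from time import sleep
from picamera import PiCamera

camera = PiCamera()
camera.resolution = (2592, 1944)      # maximum still-image resolution noted above

camera.start_preview()
sleep(2)                              # let the sensor adjust exposure and gain
camera.capture('/home/pi/frame.jpg')  # still image later handed to the Event Handler
camera.stop_preview()
camera.close()
```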

Figure 1: Prototype of Tweeting Camera

Figure 2: Tweeting Camera Software Architecture

3. SOFTWARE ARCHITECTURE
The tweeting camera comprises three software components, namely the Event Handler, Logic Processing, and Data Communication (Fig. 2):

The Event Handler detects abnormal events or new faces and recognizes known events or faces in the images captured by the camera. To achieve high accuracy, we use the Visual Recognition service of the IBM Bluemix cloud platform [7] for event classification and the open source biometric recognition library OpenBR [5] for the face recognition task. The training step is triggered when the camera receives human replies, and the camera's tweets are generated based on the responses of the cloud services. The Logic Processing component processes these responses so that only "interesting" events are reported; a pipeline implemented with a query algebra processes and manipulates the data, and triggers invoke user-defined actions. The Data Communication component implements a Python-based Internet connection with an associated Twitter account. The camera calls the Twitter API to post information from the Logic Processing component as well as to receive replies from humans, which trigger a new learning process.
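The paper only states that the Twitter API is called from Python, so the sketch below of the Data Communication component uses the tweepy library as one possible realization; the library choice, credentials, and function names are assumptions.

```python
# Hedged sketch of the Data Communication component using tweepy (an assumption).
import tweepy

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")   # placeholder keys
auth.set_access_token("ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")    # placeholder tokens
api = tweepy.API(auth)

def post_event(text, image_path=None):
    """Post a symbolic event description, optionally with a cartoonized image."""
    if image_path:
        return api.update_with_media(image_path, status=text)
    return api.update_status(text)

def fetch_owner_replies(since_id=None):
    """Return new mentions/replies; the Event Handler uses them as training labels."""
    return api.mentions_timeline(since_id=since_id)
```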

4. APPLICATION EXAMPLES AND DEMO
To demonstrate our tweeting camera system, we showcase how it works in the scenarios of face recognition and group meeting event recognition. Figure 3 shows the face learning procedure of the camera. A captured image is first sent to the Event Handler for face recognition as well as event classification. Once OpenBR returns a response indicating that a new face has been detected, the Logic Processing component determines whether this information should be tweeted. A tweet is only generated once the status of the physical world changes (e.g., from no face to a face appearing, or from event category c1 to category c2). Between tweets, we assume that the event status remains the same. The owner of the camera can reply to a tweet, telling the camera who the person in the embedded image is. The Data Communication component constantly receives new replies and passes them to the Event Handler to update the face database. The next time the same face is captured by the camera, it will be recognized and only the information about a known face will be tweeted. A video illustrating face recognition as well as meeting event recognition (i.e., meeting starts, meeting finishes, room occupied) can be found on our project website 1. Due to the complexity of events and the involved processing steps, images are taken and analyzed every 10 seconds. Such a sampling rate is sufficient for events such as meetings, presentations, or home monitoring, where the scene does not change significantly within a short time.
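A short sketch of this change-gated sensing loop is given below: a frame is analyzed every 10 seconds and a tweet is generated only when the detected status changes. capture_image() and classify_event() are hypothetical placeholders, not the prototype's code.

```python
# Sketch of the change-gated sensing loop described above (placeholders only).
import time

def capture_image():
    return "frame.jpg"            # placeholder for the camera capture step

def classify_event(image_path):
    return "room occupied"        # placeholder for the cloud-based classifier

def sensing_loop(post_tweet, interval=10):
    last_status = None
    while True:
        status = classify_event(capture_image())
        if status != last_status:          # tweet only on a status change
            post_tweet("Status update: %s" % status)
            last_status = status
        time.sleep(interval)               # 10-second sampling rate
```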

Figure 3: Tweets Posted by Tweeting Camera

5. ACKNOWLEDGEMENT
This research was conducted at the SeSaMe Centre. It is supported by the Singapore NRF under its IRC@SG Funding Initiative and administered by the IDMPO.

6. REFERENCES
[1] M. Brezovan and C. Badica. A review on vision surveillance techniques in smart home environments. In International Conference on Control Systems and Computer Science, pages 471–478, 2013.

[2] A. Dziri, M. Duranton, and R. Chapuis. Reliable multi-object tracking dealing with occlusions for a smart camera. In ICDSC, pages 26–31, 2015.

[3] X. Hong, W. Ma, Y. Huang, P. Miller, W. Liu, and H. Zhou. Evidence reasoning for event inference in smart transport video surveillance. In ICDSC, pages 36:1–36:6, 2014.

[4] T. Huang. Surveillance Video: The Biggest Big Data (http://www.computer.org/web/computingnow/archive/february2014). Computing Now, 7(s), February 2014.

[5] J. C. Klontz, B. F. Klare, S. Klum, A. K. Jain, and M. J. Burge. Open source biometric recognition. In BTAS, pages 1–8, Sept 2013.

[6] B. Rinner and W. Wolf. An introduction to distributed smart cameras. Proceedings of the IEEE, 96(10):1565–1575, Oct 2008.

[7] A. Sotsenko, J. Zbick, M. Jansen, and M. Milrad. Flexible and contextualized cloud applications for mobile learning scenarios. In Mobile, Ubiquitous, and Pervasive Learning, pages 167–192. 2016.

1 http://sesame.comp.nus.edu.sg/demo/camtweet/