Multi-Modal Context Aware Monitoring System for Smart Classrooms

Demonstration using Reference Solutions

Sangeeta [email protected]

Intel CorporationChandler, Arizona, USA

Sanjay V [email protected]

Intel CorporationChandler, Arizona, USA

Gabriel L [email protected]

Intel CorporationChandler, Arizona, USA

Wendy [email protected]

Intel Technology Sdn.Bhd.Bayan Lepas, Pulau Pinang, Malaysia

Michelle [email protected]

Intel Technology Sdn.Bhd.Bayan Lepas, Pulau Pinang, Malaysia

ABSTRACT
Given the proliferation of IoT sensors, every location is now monitored by several different sensory endpoints. IoT sensors enable context awareness for a given system or environment by allowing automated reactions to sensor readings or events. Combining audio, video, and environmental sensors provides an effective multi-modal monitoring approach, resulting in less uncertainty about an event, better response times, and improved safety. In this project showcase abstract we explain the rationale and details of an end-to-end demonstration that shows the efficacy of a multi-modal system in enhancing classroom safety.

CCS CONCEPTS
• Hardware → Emerging architectures; Sensor devices and platforms; Sensors and actuators; Beamforming; Sound-based input / output; • Theory of computation → Machine learning theory.

KEYWORDS
Multi-Modal, Environmental sensors, Audio, Video Sensors, Smart Campus, Safety, Context Awareness

ACM Reference Format:
Sangeeta Ghangam, Sanjay V Addicam, Gabriel L Silva, Wendy Chin, and Michelle Lai. 2019. Multi-Modal Context Aware Monitoring System for Smart Classrooms: Demonstration using Reference Solutions. In Proceedings of Annual KDD Conference Project Showcase (KDD '19). ACM, New York, NY, USA, 3 pages. https://doi.org/10.1145/nnnnnnn.nnnnnnn

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
KDD '19, August 04–08, 2019, Anchorage, AK
© 2019 Copyright held by the owner/author(s). Publication rights licensed to ACM.
ACM ISBN 978-x-xxxx-xxxx-x/YY/MM. . . $15.00
https://doi.org/10.1145/nnnnnnn.nnnnnnn

1 INTRODUCTION
According to the University of Washington [1], there are 3.85 gun deaths per 100,000 people in the United States, a figure that does not include the large number of unreported firearm crimes. These incidents are devastating, especially in crowded public places such as schools and event venues. Schools have become targets because it is difficult to keep campuses completely closed. Additionally, most campuses lack gunshot and crime detection technology that can supply emergency responders with real-time data during an active shooter event to mitigate casualties. Response times after an incident are normally predicated on how fast law enforcement agencies are notified. The immediate reaction of the people impacted by an incident varies with its severity; when panicked, most people will try to move towards safety. A monitoring system made of different sensors at the location of the incident provides a mechanism to automatically alert both the response teams and the people in the vicinity about the actions they can take to reach safety. A measurable reduction in emergency response time has been observed when sensors in a crash recorder automatically notify response teams about a road accident [2]. While the monitoring techniques demonstrated in this showcase can be applied to many scenarios, we picked smart classrooms because safety is one of the key value propositions for IoT in classrooms [3]. In the following sections we describe the modules that constitute the monitoring system used in the demonstration, followed by setup considerations and results from a simulated incident in a classroom environment.

2 PROJECT PARTNERS
The sensor hub includes sensors from Bosch and SGX, selected per United States Environmental Protection Agency (EPA) guidelines. The sensor application cloud software has been developed using technology from Amazon. The video analytics software uses an emotion recognition algorithm based on AffectNet, which was licensed from the University of Denver. The media and display pipeline has been built using technology from Microsoft and custom Media Foundation transforms.


Table 1: List of sensors included with the sensor hub

Sensor                   Technology       Threshold
BME280 (Temperature)     Metal Oxide      22 C
MICS6814 (NO2)           Electrochemical  High (151-200 ppm)
PMS5003 (Particulate)    Optical          High (70-499)

3 PROJECT DESCRIPTION
The demonstration scenario includes a video and audio recording from a classroom session in progress. The recording consists of regular teacher-student interaction for the first 8-10 minutes, after which we simulate an air quality event followed by multiple alarms towards the end of 15 minutes. This recording is used to generate real-time inference in our analytics, sensor, and audio applications, from client device to cloud.

Following is a system block diagram and a description of each component used in the demonstration.

Figure 1: Block diagram

3.1 Sensor Input
• Cameras – For the purposes of this demonstration we have collected video/audio samples, which are passed through the software on the host system for real-time inference. To generate the video samples we used two cameras set up to function as IP cameras for surveillance/monitoring. The cameras are positioned to capture the entire room: students, teacher, and exits.

• Sensor hub with environmental and audio sensors – A sensor hub reference design is used to connect the sensors listed in Table 1. The threshold values are used to generate events in the software; the choice of sensors and thresholds is based on Indoor Air Quality (IAQ) guidelines for schools published by the EPA [4]. In addition to the environmental sensors, the sensor hub includes two linear mic arrays for capturing audio events such as alarms, gunshots, glass breaks, and screams. The mic arrays are designed for 360-degree audio coverage, indoors and outdoors. They provide Time Division Multiplexed (TDM) data to the sensor hub, which includes a codec to convert the data to USB 2.0. Figure 2 shows the sensor hub and sensors used for the demo; a sketch of the threshold-based event generation follows the figure.

Figure 2: Sensor Hub Reference Hardware
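
To make the event generation concrete, the following is a minimal sketch of threshold-based eventing over the Table 1 sensors. It is illustrative only, not the reference implementation: the function names are ours, the readings are simulated (as the environmental values are in the demo, see Section 3.2), and the thresholds mirror Table 1.

    # Illustrative threshold-based event generation; not the reference
    # implementation. Thresholds mirror Table 1; readings are simulated.
    import random
    import time

    THRESHOLDS = {
        "BME280_temperature_C": 22.0,  # metal oxide temperature sensor
        "MICS6814_NO2": 151.0,         # electrochemical; "High" band starts at 151
        "PMS5003_particulate": 70.0,   # optical; "High" band starts at 70
    }

    def read_sensors():
        # Simulated hub readings; a real deployment would poll the hub here.
        return {
            "BME280_temperature_C": random.uniform(18.0, 26.0),
            "MICS6814_NO2": random.uniform(0.0, 200.0),
            "PMS5003_particulate": random.uniform(0.0, 500.0),
        }

    def poll_and_generate_events(publish, interval_s=1.0, iterations=10):
        # Compare incoming data with thresholds; emit one event per crossing.
        for _ in range(iterations):
            for name, value in read_sensors().items():
                if value >= THRESHOLDS[name]:
                    publish({"sensor": name, "value": round(value, 2),
                             "event": "Alert", "ts": time.time()})
            time.sleep(interval_s)

    poll_and_generate_events(print)  # stand-in publisher; MQTT in Section 3.2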

3.2 Host Computer
• The audio sensing occurs in real time, while the environmental sensor values are simulated. The Sensor and Audio application manages the setup of the hub per user configuration and compares the incoming data against thresholds to generate events. Events and regularly monitored sensor data are sent to the MQTT broker in the AWS cloud instance.

• The analytics application is an end-to-end media and analytics software reference utilizing the Intel Media SDK, the Intel OpenVINO Toolkit, and Microsoft Media Foundation to decode camera feeds and analyze the video frames. The video frames are analyzed against several convolutional neural networks (some processing concurrently) for:
– emotion recognition – a possible use case is detecting panic emotions in the classroom
– facial recognition – a possible use case is detecting blacklisted intruders in the classroom or on campus
– student behavior recognition (e.g. raising hands, sitting, and standing) – a possible use case is detecting panic-stricken behavior

• The inference can be processed on a CPU, a GPU, or an accelerator such as Movidius, per user configuration. The inference data is published via MQTT over websockets to a dashboard. A sketch of the concurrent frame analysis follows, and Figure 3 shows an example of real-time inference using the analytics application.
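
The following is a minimal sketch of the concurrent multi-network pattern described above. The three recognition functions are placeholders standing in for the actual networks; the real pipeline (Intel Media SDK, OpenVINO, Media Foundation) is not reproduced here.

    # Illustrative concurrent dispatch of one decoded frame to several
    # analysis networks; the model functions below are placeholders.
    from concurrent.futures import ThreadPoolExecutor

    def recognize_emotions(frame):
        # Placeholder for the AffectNet-based emotion-recognition network.
        return {"task": "emotion", "labels": []}

    def recognize_faces(frame):
        # Placeholder for facial recognition against a blacklist.
        return {"task": "face", "matches": []}

    def recognize_behavior(frame):
        # Placeholder for behavior recognition (hands raised, sitting, standing).
        return {"task": "behavior", "actions": []}

    MODELS = (recognize_emotions, recognize_faces, recognize_behavior)

    def analyze_frame(frame):
        # Run all networks on one frame, some concurrently; gather results.
        with ThreadPoolExecutor(max_workers=len(MODELS)) as pool:
            futures = [pool.submit(model, frame) for model in MODELS]
            return [f.result() for f in futures]

    results = analyze_frame(frame=b"decoded-frame-bytes")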


Figure 3: Inference depicting emotions
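
Both the Sensor and Audio application and the analytics application push their outputs to the cloud over MQTT. Below is a minimal publisher sketch using paho-mqtt with its websockets transport; the broker host, port, and topic names are assumptions for illustration, not the demo's actual configuration.

    # Illustrative MQTT-over-websockets publisher (paho-mqtt 1.x style);
    # broker host, port, and topic names are hypothetical.
    import json
    import paho.mqtt.client as mqtt

    client = mqtt.Client(transport="websockets")
    client.connect("broker.example.com", 9001)  # hypothetical AWS-hosted broker
    client.loop_start()

    def publish_event(topic, payload):
        # Serialize and publish one sensor event or inference result.
        client.publish(topic, json.dumps(payload), qos=1)

    publish_event("classroom/inference",
                  {"frame": 42, "emotion": "fear", "confidence": 0.87})
    publish_event("classroom/sensors",
                  {"sensor": "MICS6814_NO2", "event": "Alert", "value": 163.1})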

3.3 Cloud Data Management/Dashboard (AWS EC2)

All sensor data and events are uploaded to the cloud for storage and 'near-real-time' processing (latency of <5 minutes). Separate EC2 instances manage data storage in MongoDB and provide metadata search capability. The bulk of the eventing from sensor data is managed by scripts running in an Elastic MapReduce instance, which provides latency assurances for 'near-real-time' processing [5]. The scripts contain classification heuristics that assign severity based on the number of events in a given time period. The events that contribute to a severity rating are Audio Events (Alarm, Gunshot, Glass Break, Scream) and Sensor Events (Alert).

Severity Classification
• Low: Up to 5 messages in less than 10 minutes
• Medium: Up to 15 messages in 10 or more minutes
• High: Up to 30 messages in 15 or more minutes
• If the number of messages and/or the timeline exceeds the bands above, the status is elevated to 'Critical'
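
Read literally, these bands translate into a small heuristic. The sketch below implements the classification as published; counts or windows falling outside every band default upward to 'Critical'.

    # Severity heuristic following the classification bands above.
    def classify_severity(message_count: int, window_minutes: float) -> str:
        if message_count <= 5 and window_minutes < 10:
            return "Low"
        if message_count <= 15 and window_minutes >= 10:
            return "Medium"
        if message_count <= 30 and window_minutes >= 15:
            return "High"
        return "Critical"  # more messages and/or a longer timeline than above

    assert classify_severity(4, 5) == "Low"
    assert classify_severity(12, 12) == "Medium"
    assert classify_severity(25, 20) == "High"
    assert classify_severity(40, 20) == "Critical"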

Event data is stored in a database and is available for display on the dashboard via a REST API. In this demo we show an example dashboard and sensor event stream, in the cloud, via a browser, next to the video analytics processing results. Each piece of information is placed in its own 'zone' on the screen. Figure 4 shows the result in the cloud, including the sensor and audio events.

Figure 4: Events on dashboard
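
As a sketch of how a dashboard client could pull stored events through the REST API, assuming a hypothetical endpoint and response shape (the demo's actual API is not documented here):

    # Hypothetical REST query for recent events; the endpoint URL, parameters,
    # and response schema are assumptions for illustration.
    import requests

    def fetch_recent_events(base_url="http://dashboard.example.com"):
        resp = requests.get(f"{base_url}/api/events",
                            params={"minutes": 10}, timeout=5)
        resp.raise_for_status()
        return resp.json()  # e.g. [{"type": "Gunshot", "severity": "High"}, ...]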

4 RESULTS
• The video analytics accurately detects the behavior and emotions of the participants in the classroom session. When the NO2 reading exceeds its threshold, the students' reactions change, and the system identifies this change.

• The NO2 sensor and audio sensor accurately detect events as they occur in the classroom session.

• Events are processed in the cloud to generate alerts. Participants in the classroom access this data and are able to move in the right direction for safety. The absence of directional data with an audio event, or a missed sensor event, would have resulted in an unsafe outcome for the classroom participants.

5 CONCLUSION
A multi-modal approach combining environmental, audio, and video sensors dramatically reduces uncertainty when faced with an incident. This helps enable a safer environment both indoors and outdoors.

6 LINKS TO DEMO VIDEOS
For ease of viewing, videos have been collected for each of the above results:

• Video Analytics [6]
• Audio/Sensor Event Detection [7]
• Cloud Dashboard [8]

REFERENCES
[1] NPR Article. https://www.npr.org/sections/goatsandsoda/2017/10/06/555861898/gun-violence-how-the-u-s-compares-to-other-countries
[2] Howard R. Champion, J. S. Augenstein, B. Cushing, K. H. Digges, R. Hunt, R. Larkin, A. C. Malliaris, W. J. Sacco, and J. H. Siegel. Reducing Highway Deaths and Disabilities with Automatic Wireless Transmission of Serious Injury Probability Ratings from Crash Recorders to Emergency Medical Services Providers. NHTSA.
[3] Maryam Bagheri and Siavosh Haghighi Movahed. The Effect of the Internet of Things (IoT) on Education Business Model. Sheffield Hallam University Research Archive (SHURA). http://shura.shu.ac.uk/14405/
[4] EPA Guidelines for Schools. https://www.epa.gov/iaq-schools
[5] Amazon AWS EMR. https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-overview-benefits.html
[6] Video Analytics. https://www.dropbox.com/s/k9sw0n8nzvoz23a/video.mp4?dl=0
[7] Audio-Sensor Event Detection. https://www.dropbox.com/s/vyy3vtz2o9xl3s1/audio-sensor.mp4?dl=0
[8] Cloud Dashboard. https://www.dropbox.com/s/hkijzbg8kry95bk/cloud.mp4?dl=0