Seminar presentation of Bandit Framework for Systematic Learning in Wireless Video-Based Face Recognition

WELCOME

Bandit Framework for Systematic Learning in Wireless Video-Based Face Recognition
Guided by Vindya Vijayan, VKCET
Presented by JiJI BJ, VKCET

Introduction
Analysis of video streams for face recognition on remote computing clusters has become common over the last ten years.
This process experiences time-varying and unknown channel conditions, traffic loads, and processing constraints.
Examples: Google Goggles, Google Glass object recognition, Facebook face tagging, Microsoft photo gallery face recognition.
A group of devices in a single wireless network is called a wireless cluster.

A group of remotely accessed computers is called a cloud.
The face recognition transaction system consists of:
Video content producers, which capture, encode, and transmit the video streams.
A cloud computing cluster, which analyzes the visual data coming from the wireless cluster.
A message that the mobile device receives from the remote cluster.
To reduce contention in the wireless network, each device chooses the encoding bitrate and the number of frames to produce when it uploads its video content.
To reduce task-scheduling congestion in the cloud, the required processing time is scaled.
Finally, each device receives a reply from the cloud within a predetermined time.

Each device achieves reliable face recognition while minimizing the required wireless transmission and cloud-based processing under highly varying contention and congestion conditions.

Illustration of object or face recognition transactions between mobile devices and a cloud

Related work
Every application must maintain its accuracy rate while minimizing its cost and the number of video frames.
Most existing solutions design and configure wireless multimedia applications based on a dynamic cloud.
Reinforcement learning is regarded as the best-suited approach for wireless multimedia applications with cloud processing.
For faster learning, a multi-armed bandit (MAB) approach is proposed.
Each option has an index that combines its estimated performance with its uncertainty, and the index is updated using feedback.
The MAB algorithm chooses the option with the highest index, which keeps execution fast.
To exploit context information, we use contextual bandit algorithms.
The proposed framework learns independently for each context and supports multiple users.
MAB performance is measured by a regret bound: the difference between the expected recognition rate and the recognition rate achieved by the online learning algorithm.
The regret grows as O(log k) in the number of recognition transactions k.
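The index idea described under Related work can be sketched with UCB1, the index policy analyzed in the Auer, Cesa-Bianchi, and Fischer paper listed in the references. This is only an illustrative sketch, not the framework's own algorithm: the class name `UCB1`, the helper `run_ucb1`, and the Bernoulli success probabilities are assumptions made for the example.

```python
import math
import random


class UCB1:
    """UCB1 index policy: estimated mean reward plus an uncertainty bonus."""

    def __init__(self, n_arms):
        self.counts = [0] * n_arms      # times each arm was played
        self.means = [0.0] * n_arms     # running mean reward per arm

    def index(self, arm, t):
        # Unplayed arms get an infinite index so each is tried once.
        if self.counts[arm] == 0:
            return math.inf
        bonus = math.sqrt(2.0 * math.log(t) / self.counts[arm])
        return self.means[arm] + bonus

    def select(self, t):
        # Play the arm with the highest index at round t (t >= 1).
        return max(range(len(self.counts)), key=lambda a: self.index(a, t))

    def update(self, arm, reward):
        # Feedback step: incremental update of the running mean.
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]


def run_ucb1(success_probs, rounds, seed=0):
    """Simulate Bernoulli arms (hypothetical); return (policy, expected regret)."""
    rng = random.Random(seed)
    policy = UCB1(len(success_probs))
    best = max(success_probs)
    regret = 0.0
    for t in range(1, rounds + 1):
        arm = policy.select(t)
        reward = 1.0 if rng.random() < success_probs[arm] else 0.0
        policy.update(arm, reward)
        # Regret accumulates the gap between the best arm and the played arm.
        regret += best - success_probs[arm]
    return policy, regret
```

Calling `run_ucb1([0.2, 0.5, 0.9], 2000)` plays each arm once and then concentrates on the arm with the highest success probability, so the accumulated regret grows far more slowly than the number of rounds.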

Contents
SYSTEM DESCRIPTION AND SYSTEM MODEL
  Video Capturing and Encoding
  Visual Analysis
  Wireless Transmission
  System Model
DISTRIBUTED, DEVICE-ORIENTED, BANDIT LEARNING ALGORITHM
  Device oriented Contextual Learning Algorithm
  Service oriented Contextual Learning Algorithm
NUMERICAL RESULT
CONCLUSION
REFERENCE

SYSTEM DESCRIPTION AND SYSTEM MODEL
Video Capturing and Encoding
Each mobile device has a video camera that captures several frames of faces.
Each frame is illuminated through artificial modulation of the light of the flash.
Each video frame can be cropped to the object or face area by automated face detection algorithms.

Example of five frames captured under different illumination elevation (E) and azimuth (A) angles from the Yale Face Database
The cropped frames are then compressed using a standard-compliant video codec such as MPEG/ITU-T H.264/AVC.

Wireless Transmission
The framework uses either wireless local area network (WLAN) infrastructures, such as IEEE 802.11 WLANs, or WiMAX/LTE-based cellular networks.
Multiple wireless devices share the same spectrum. Each device can adapt its transmission parameters depending on the number of concurrent transmitters.
Our solution can learn the behavior of such adaptation mechanisms and the transmission settings to use under time constraints.

Visual Analysis
The cloud computing cluster processes multiple visual analysis tasks concurrently, so its task scheduler experiences highly varying levels of congestion.
The cloud computing infrastructure may have to adapt the accuracy of its feature matching algorithm, as well as the number of video frames processed.
For face recognition, the server matches the provided video information to a pre-established database of stored images.
L1 minimization, salient-point extraction and matching, and support vector machines are examples of algorithms used for face recognition.
When the majority of the video frames are matched to the same person in the database, the match is successful and the result is returned.
The video frames that match the same person can be stored in a set, called a-priori, to reduce the false positive rate.
Similarly, the system has a false negative rate, mainly due to varying illumination, motion blur, or pose.

System Model
There are m mobile devices, indexed by the set M = {1, 2, ..., m}.
A is the set of all possible actions (settings), A = {a_1, a_2, ..., a_S}, where S is the size of the settings space.
T is the discrete set of contention levels of the wireless medium.
G is the discrete set of congestion levels of the cloud infrastructure.
k indexes the recognition transactions.

For each recognition transaction k:
Each device observes the current wireless contention level, t(k) ∈ T, and cloud congestion level, g(k) ∈ G, selects the bitrate and the number of frames to capture, and transmits the encoded video to the cloud.
The cloud decodes the video and performs feature matching against the database.
Each device gets the result from the cloud, either a label or an error message.
The device performs further recognition attempts until a successful result is obtained or the process is cancelled.
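The transaction steps above can be sketched as a small loop. This is a minimal sketch under stated assumptions: `cloud_reply`, `run_transaction`, `capture_and_match`, `min_fraction`, and `max_attempts` are hypothetical names, and the per-frame labels stand in for whatever the cloud's feature matching returns. The default threshold follows the majority rule from the Visual Analysis slide; the numerical setup later in the deck requires more than 80%, which corresponds to `min_fraction=0.8`.

```python
from collections import Counter


def cloud_reply(frame_labels, min_fraction=0.5):
    """Return the winning label, or None (error message) if no label wins.

    frame_labels: per-frame match results; None marks an unmatched frame.
    min_fraction: share of frames that must agree (strictly greater than).
    """
    if not frame_labels:
        return None
    votes = Counter(label for label in frame_labels if label is not None)
    if not votes:
        return None
    label, count = votes.most_common(1)[0]
    return label if count > min_fraction * len(frame_labels) else None


def run_transaction(capture_and_match, max_attempts=5):
    """Repeat recognition attempts until a label is returned or attempts run out.

    capture_and_match: hypothetical callable taking the attempt number and
    returning the per-frame labels produced by the cloud for that attempt.
    """
    for attempt in range(1, max_attempts + 1):
        result = cloud_reply(capture_and_match(attempt))
        if result is not None:
            return result, attempt
    return None, max_attempts
```

The number of attempts returned by `run_transaction` is the quantity the learning algorithms try to keep small.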

Device-Oriented Model
Devices systematically learn the best transmission setting to maximize their own recognition rate per attempt.
All devices use the contention level of the wireless medium and the congestion level in the cloud as contexts.

Service-Oriented Model
The cloud systematically learns the best setting that maximizes the cluster's average recognition rate per attempt.
The cloud uses the wireless contention level as context, along with the recognition rate and the cloud congestion level.

These two models are used to derive a bandit-based systematic learning framework.

DISTRIBUTED, DEVICE-ORIENTED, BANDIT LEARNING ALGORITHM

Device-Oriented Model
Mobile devices select their own actions and learn through their interaction with the wireless medium and the cloud.
There are two stages for any recognition transaction:
Exploration stage, where the mobile device chooses an arbitrary transmission setting to update the estimated recognition rate per attempt.
Exploitation stage, where the mobile device selects the transmission setting that yields the highest estimated recognition rate per attempt.
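The two stages can be illustrated with a per-context learner kept on each device. This is a rough sketch, not the algorithm from the slides: the class name `DeviceLearner` is hypothetical, and the rule that each setting needs roughly log(k) samples before exploitation starts is an assumption chosen for illustration (the slides only say that an arbitrary setting is chosen during exploration).

```python
import math
import random
from collections import defaultdict


class DeviceLearner:
    """Per-context learner kept by one device; context = (contention, congestion)."""

    def __init__(self, actions, seed=0):
        self.actions = list(actions)
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)    # (context, action) -> attempts
        self.rates = defaultdict(float)   # (context, action) -> estimated rate

    def choose(self, context, k):
        # Assumed exploration rule: each action needs about log(k) samples.
        needed = math.log(k + 1)
        under = [a for a in self.actions if self.counts[(context, a)] < needed]
        if under:
            return self.rng.choice(under), "explore"
        best = max(self.actions, key=lambda a: self.rates[(context, a)])
        return best, "exploit"

    def update(self, context, action, recognized):
        # Running average of the recognition outcome (1 = success, 0 = failure).
        key = (context, action)
        self.counts[key] += 1
        self.rates[key] += (float(recognized) - self.rates[key]) / self.counts[key]
```

Here an action could be a (bitrate, number of frames) pair, and because estimates are stored per context, what is learned under low contention does not influence decisions under high contention.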

Device-oriented model

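Sub-optimality Gap and Minimum Sub-optimality Gap
The sub-optimality gap defines the performance difference between the best and the other transmission settings.
Sub-optimality gap: for a transmission setting a ∈ A, the recognition rate of the best setting minus the recognition rate of a.
Minimum sub-optimality gap: the smallest sub-optimality gap among the settings that are not the best.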
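A small worked example may help with the two gap definitions. The sketch below assumes the recognition rates of all settings are known for one fixed context, and it takes the minimum gap over the settings with a strictly positive gap; the function names and the example rates are hypothetical.

```python
def suboptimality_gaps(rates):
    """Map each setting to (best rate - its rate) for one fixed context."""
    best = max(rates.values())
    return {setting: best - rate for setting, rate in rates.items()}


def minimum_gap(rates):
    """Smallest strictly positive gap, or None if every setting is optimal."""
    positive = [g for g in suboptimality_gaps(rates).values() if g > 0]
    return min(positive) if positive else None
```

For rates of 0.9, 0.75, and 0.5, the gaps are 0, about 0.15, and about 0.4, so the minimum sub-optimality gap is about 0.15. A smaller minimum gap means more attempts are needed to tell the best setting apart from the runner-up.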

Service-Oriented Model
The cloud makes the decisions and all mobile devices simply obey.
The congestion level in the cloud is indirectly controlled by the settings decided for each device.
There are two stages for any recognition transaction:
Exploration stage, where the cloud chooses an arbitrary transmission setting depending on the contention level and updates the estimated recognition rate per attempt.
Exploitation stage, where the cloud selects the transmission setting that yields the highest expected average recognition rate per attempt.
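The service-oriented stages can be sketched from the cloud's point of view. This is an illustrative sketch only: `CloudLearner` and `min_samples` are hypothetical names, the fixed exploration budget per setting is an assumption, and the reward is taken to be the plain average of the devices' outcomes for one round.

```python
import random
from collections import defaultdict


class CloudLearner:
    """Cloud-side learner; the context is the wireless contention level only."""

    def __init__(self, actions, min_samples=3, seed=0):
        self.actions = list(actions)
        self.min_samples = min_samples    # assumed exploration budget per action
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)
        self.avg_rates = defaultdict(float)

    def choose(self, contention):
        under = [a for a in self.actions
                 if self.counts[(contention, a)] < self.min_samples]
        if under:
            # Exploration stage: try a setting that lacks samples.
            return self.rng.choice(under)
        # Exploitation stage: highest estimated average recognition rate.
        return max(self.actions, key=lambda a: self.avg_rates[(contention, a)])

    def update(self, contention, action, device_outcomes):
        # Reward is the cluster-wide average over the devices' outcomes.
        reward = sum(device_outcomes) / len(device_outcomes)
        key = (contention, action)
        self.counts[key] += 1
        self.avg_rates[key] += (reward - self.avg_rates[key]) / self.counts[key]
```

Compared with a learner on each device, the cloud keys its estimates on the contention level alone and scores a setting by how the whole cluster fared, not by a single device's result.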

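Sub-optimality Gap and Minimum Sub-optimality Gap
Sub-optimality gap: the difference between the highest expected average recognition rate per attempt and that of another transmission setting.
Minimum sub-optimality gap: the smallest such nonzero difference.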

Numerical Result
General Setup
Our system consists of mobile devices connected via an IEEE 802.11 WLAN to a computing cluster.
Videos of human faces are produced from random images of persons taken from the Extended Yale Face Database.
The camera flash is modulated to emulate light coming from different angles.
The H.264/AVC codec is used for encoding the video.
The 2D-PCA algorithm is used at the cloud side for face recognition.

Wireless Transmission
Wireless contention is generated via the well-known backoff and retransmission mechanism of the Distributed Coordination Function (DCF).
The default settings of the DCF simulator of Bianchi's method are used.

Face Recognition on the Cloud
More than 80% of the received video frames have to match the same person in the database.
A limited number of the received video frames is used by 2D-PCA.
Eigenvectors are used for the distance calculation.

Device-Oriented Model
Average attempts per transaction

Average bit stream size

Service-Oriented Model
Average attempts per transaction

Average bit stream size

Conclusion
We propose a contextual bandit framework for learning contention and congestion conditions in object or face recognition via wireless mobile streaming and cloud-based processing.
The bandit framework converges to the value of the oracle solution.
It quickly adjusts to contention and congestion conditions.
It can be used in multi-user wireless video streaming systems and high-level multimedia signal processing systems.

References
A. Anandkumar, N. Michael, and A. Tang. Opportunistic spectrum access with multiple players.
P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem.
W. Bao, H. Li, N. Li, and W. Jiang. A liveness detection method for face recognition based on optical flow field.
http://www.Wikipedia.com

Thank YOU
