Confidence Weighting for Sensor Fingerprinting

Presenter/Author: Scott McCloskey, Honeywell Labs, Minneapolis, MN, USA [email protected]


HONEYWELL PROPRIETARY

Outline of Talk

1. Motivation for Sensor Fingerprinting

2. Review of Chen’s Method

3. Independent Testing & Analysis

4. Confidence Weighting to Handle Persistent Edges

5. Experimental Results

6. Future Work


Common Source Camera Identification

• Problem: Given two videos (or sets of images), can we determine whether or not they were taken with the same camera?

• Scenario: Videos of two IED events are posted to YouTube. If they were taken with the same camera, we establish a link between the events.

• Applications: forensic data analysis, social network analysis…

Signature Data: Advantages (+) / Disadvantages (−)

Image/video header data: + Quick and easy; − Easily spoofed

Model-Level Identification

Lens distortions: − Cameras w/ interchangeable/zoom lenses

CFA interpolation: − Monochrome images/video

Device-Level Identification

Dead pixels, dark noise: − Typically corrected in-camera

Photo-response non-uniformity (PRNU) of camera’s sensor: + Device specific; + Signature space is large; + Difficult to correct in-camera


Photo-Response Non-Uniformity (PRNU)

Due to material and manufacturing imperfections, each pixel on a sensor has a slightly different (non-uniform) response to incoming light. This is most noticeable in images of uniformly-illuminated flat fields.

Step 1: Signature Extraction

1. Separate each frame into scene content and noise components.

2. The average noise component is the signature.

Step 2: Signature Comparison

1. Compute the cross-correlation of the input signatures.

2. Measure the sharpness of the peak.

3. Compare to a threshold.

Algorithm proposed by: M. Chen, J. Fridrich, and M. Goljan in Source Digital Camcorder Identification Using Sensor Photo-Response Non-Uniformity. Proc. of SPIE, January 2007.

• Because the magnitude of this noise is related to environmental conditions (temperature) and because most scenes are not flat fields, the non-uniformity is not corrected in camera.

• PRNU-based sensor fingerprints can distinguish between a large number of devices. Even if we presume only that we can distinguish three levels of response per pixel (normal, high, low), the number of possible signatures for a 1 MP sensor is 3^1,000,000, which is practically infinite.
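To give a sense of the size of that signature space, the short sketch below counts the decimal digits of three raised to the number of pixels. The names PIXELS and LEVELS are illustrative only; the figures (1 MP, three response levels) are the ones assumed in the bullet above.

```python
import math

PIXELS = 1_000_000   # a 1 MP sensor, as in the bullet above
LEVELS = 3           # normal, high, low

# LEVELS**PIXELS is far too large to print usefully, so work with its
# base-10 logarithm and count the decimal digits instead.
log10_count = PIXELS * math.log10(LEVELS)
decimal_digits = math.floor(log10_count) + 1

print(f"{LEVELS}^{PIXELS} has {decimal_digits} decimal digits")
# prints "3^1000000 has 477122 decimal digits"
```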


Signature Extraction

• In other applications, Computer Vision methods abstract away differences between cameras in order to recognize scene objects (faces, etc.). Here we need the opposite: to abstract away differences between scenes and recognize camera-specific signatures.

• Given an input video, we remove scene content from each frame by applying a de-noising method and subtracting that result from the original.

• The maximum likelihood estimate of the PRNU signature is:

where I_k is the raw frame, Î_k is the de-noised frame, K is the number of frames, and P is the signature.

[Figure: an input frame separated into scene content and noise]
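The extraction step can be sketched as follows. This is a minimal illustration under stated assumptions, not the method as presented: the estimator form (noise residuals weighted by frame intensity and normalized by the summed squared intensity) is assumed from the general style of Chen et al.'s estimator, and box_denoise is a stand-in mean filter rather than the de-noiser used in the talk. All function names are hypothetical.

```python
import numpy as np

def box_denoise(frame, radius=1):
    """Stand-in de-noiser (assumption): mean filter over a square window."""
    padded = np.pad(frame, radius, mode="edge")
    size = 2 * radius + 1
    out = np.zeros_like(frame, dtype=float)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + frame.shape[0], dx:dx + frame.shape[1]]
    return out / size**2

def estimate_signature(frames, denoise=box_denoise):
    """Assumed estimator: sum_k (I_k - denoise(I_k)) * I_k / sum_k I_k**2."""
    num = np.zeros(frames[0].shape, dtype=float)
    den = np.zeros(frames[0].shape, dtype=float)
    for frame in frames:
        frame = frame.astype(float)
        residual = frame - denoise(frame)   # noise component of this frame
        num += residual * frame
        den += frame ** 2
    return num / np.maximum(den, 1e-12)

# Synthetic check: flat fields of varying brightness seen through one sensor.
rng = np.random.default_rng(0)
prnu = 0.02 * rng.standard_normal((32, 32))          # hypothetical PRNU pattern
frames = [rng.uniform(100, 200) * (1 + prnu) + rng.normal(0, 2, prnu.shape)
          for _ in range(50)]
signature = estimate_signature(frames)
corr = np.corrcoef(signature.ravel(), prnu.ravel())[0, 1]
print(f"correlation with the simulated PRNU: {corr:.2f}")
```

On flat fields like these the recovered signature should correlate strongly with the simulated pattern; textured scenes are where the choice of de-noiser starts to matter.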


Signature Comparison

1. Compute cross-correlation of signatures at different scales.

2. Measure the magnitude of the peak using Peak-to-Secondary Ratio (PSR). This is simply the ratio of the heights of the largest and second largest peaks in the cross-correlation.

3. Compare the PSR to a threshold that determines whether the two videos are said to match.

[Figure panels: Mismatch, Match]

Videos from the same camera will have similar PRNU patterns, and their cross-correlation function will appear similar to a delta function. Mismatched videos will have dissimilar PRNU patterns, and the cross-correlation will be a random pattern.
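The comparison steps can be sketched as below, at a single scale only. Two simplifications are assumptions of this sketch: the cross-correlation is circular (computed with the FFT), and the "second largest peak" is approximated by the largest value outside a small neighbourhood of the main peak. The function names and the exclude parameter are hypothetical.

```python
import numpy as np

def cross_correlation(a, b):
    """Circular cross-correlation of two mean-subtracted signatures."""
    a = a - a.mean()
    b = b - b.mean()
    return np.real(np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))))

def peak_to_secondary_ratio(corr, exclude=2):
    """Highest peak divided by the highest value outside its neighbourhood."""
    peak_idx = np.unravel_index(np.argmax(corr), corr.shape)
    peak = corr[peak_idx]
    masked = corr.copy()
    rows = np.arange(peak_idx[0] - exclude, peak_idx[0] + exclude + 1) % corr.shape[0]
    cols = np.arange(peak_idx[1] - exclude, peak_idx[1] + exclude + 1) % corr.shape[1]
    masked[np.ix_(rows, cols)] = -np.inf      # hide the main peak
    second = masked.max()
    return float("inf") if second <= 0 else peak / second

# Synthetic signatures: two noisy copies of one pattern, and an unrelated one.
rng = np.random.default_rng(1)
pattern = rng.standard_normal((64, 64))
sig_a = pattern + rng.standard_normal((64, 64))
sig_b = pattern + rng.standard_normal((64, 64))
sig_c = rng.standard_normal((64, 64)) + rng.standard_normal((64, 64))

psr_match = peak_to_secondary_ratio(cross_correlation(sig_a, sig_b))
psr_mismatch = peak_to_secondary_ratio(cross_correlation(sig_a, sig_c))
print(f"PSR match: {psr_match:.1f}, PSR mismatch: {psr_mismatch:.1f}")
```

A match/non-match decision then reduces to comparing the PSR with a threshold chosen on labeled data.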


Evaluating Chen’s Algorithm: Test Videos

• Testing presented in the original paper was somewhat limited, with little analysis of the results.

• In order to understand the strengths and weaknesses of the algorithm, we test it against a suite of videos which represent a wide range of potential inputs:

– indoor/outdoor scenes

– zooming/moving/stationary camera

– flat fields, highly-textured scenes

– image stabilization

– data from camcorders and digital still cameras with video mode

– night mode (feature on camcorders) and daylight mode

• When available, video data is acquired without compression.

• All test videos are 30 f.p.s. for 40 seconds (K=1200).

• We test uncompressed video, as well as XVID-compressed derivatives.


Evaluating Chen’s Algorithm: Results

Test Scenes:

A – outdoor, moving
B – indoor, flat field
C – indoor, tripod
D – indoor, moving
E – indoor, moving
F – outdoor, stabilization
G – indoor, moving
H – indoor, zooming
I – indoor, moving (night mode)
J – flat field (night mode)
X – indoor, moving

[Figure: pairwise match results for the test scenes. Key: True Match, True Non-match, False Match, False Non-match]


Problem 1: Digital Image Stabilization

• A common feature on most video cameras, image stabilization compensates for camera motion that may disorient or nauseate the viewer.

• Optical image stabilization uses a floating lens element to smooth out camera motion. Not a problem.

• Digital image stabilization uses sensors to measure camera motion. Digitized frames are shifted to compensate.

• The PRNU estimate relates to the sensitivity of sensor pixels. A pixel location in the video is assumed to correspond to the same sensor location in each frame. The shifting of frames violates this assumption.

• We are attempting to characterize the extent to which stabilization can be handled, in terms of the percentage of frames that are shifted.
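The effect of shifted frames can be illustrated with a toy simulation. This is not the characterization referred to above, only a sketch under simplifying assumptions: frames are flat fields, the residual is taken as the frame divided by its mean minus one, and stabilization is modeled as a circular shift of 1 to 3 pixels on a given fraction of frames. All names and parameters are hypothetical.

```python
import numpy as np

def signature_correlation(shift_fraction, n_frames=50, size=32, seed=0):
    """Correlation between the true PRNU and a signature estimated from
    frames of which a given fraction has been digitally shifted."""
    rng = np.random.default_rng(seed)
    prnu = 0.02 * rng.standard_normal((size, size))
    total = np.zeros((size, size))
    n_shifted = round(shift_fraction * n_frames)
    for k in range(n_frames):
        frame = rng.uniform(100, 200) * (1 + prnu) + rng.normal(0, 2, prnu.shape)
        if k < n_shifted:
            dy, dx = rng.integers(1, 4, size=2)        # shift of 1-3 pixels
            frame = np.roll(frame, (dy, dx), axis=(0, 1))
        total += frame / frame.mean() - 1.0            # flat-field residual
    estimate = total / n_frames
    return np.corrcoef(estimate.ravel(), prnu.ravel())[0, 1]

corr_none = signature_correlation(0.0)
corr_half = signature_correlation(0.5)
corr_all = signature_correlation(1.0)
print(f"0% shifted: {corr_none:.2f}, 50%: {corr_half:.2f}, 100%: {corr_all:.2f}")
```

In this toy setting the estimate degrades as more frames are shifted, because shifted frames place each sensor pixel's response at the wrong video location.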


Problem 2: Persistent Edge Content

• De-noising has long been studied in image processing, and the problem is well known to be ill-posed.

• Most de-noising methods misclassify some portion of high-frequency scene content as noise, particularly near edges.

• When estimating the signature, then, the area around edges will be problematic. If the video features stationary objects, as is the case with tripod-mounted cameras, edges appear in the extracted signature.

• Edges in the signature can cause mis-classifications, particularly false negatives. False positives may also occur, if these spurious edges appear in similar locations in videos from different cameras.

[Figure: Interview Video; Extracted Signature]


Confidence Weighting

• Chen’s method treats each pixel of each frame the same, regardless of its content. This conflicts with the intuition that flat regions of a scene are more useful for PRNU estimation.

• Because separating noise from scene content is harder near edges, errors there are inevitable. We should keep those errors from contributing significantly to the estimated signature.

• Based on this reasoning, we propose confidence weighting for sensor fingerprinting. Specifically, we wish to prevent erroneous noise estimates near texture/edges from distorting the estimated signature. Within frames, we weight against regions likely to produce erroneous noise estimates.


Confidence Weighting for Persistent Edge Content

[Figure: Interview Video; Extracted Signature; Confidence Map]

General Idea: Analyze each frame to predict failures of the de-noising method. Use this to generate a confidence map that weights the contribution of different scene regions to the estimated fingerprint. Low-confidence regions are not allowed to introduce spurious features to the fingerprint.

Experiments use the confidence weight

where p is a pixel, G is a Gaussian filter, and ∇ is the gradient operator.
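A confidence map built from those ingredients (a pixel p, a Gaussian filter G, and the gradient ∇) can be sketched as below. The functional form used here, exp(-(G * |∇I|)(p) / scale), is a hypothetical choice for illustration and is not the weight used in the experiments; sigma, scale, and all function names are likewise assumptions. The weighted average at the end shows one way such a map could down-weight residuals near edges.

```python
import numpy as np

def gaussian_blur(image, sigma=2.0):
    """Separable Gaussian blur with edge padding (NumPy only)."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2 * sigma**2))
    kernel /= kernel.sum()
    padded = np.pad(image, radius, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)

def confidence_map(frame, sigma=2.0, scale=10.0):
    """Hypothetical weight: near 1 in flat regions, near 0 around strong edges."""
    gy, gx = np.gradient(frame.astype(float))
    grad_mag = np.hypot(gx, gy)
    return np.exp(-gaussian_blur(grad_mag, sigma) / scale)

def weighted_signature(frames, residuals, **kwargs):
    """Per-pixel confidence-weighted average of noise residuals."""
    num = np.zeros(frames[0].shape)
    den = np.zeros(frames[0].shape)
    for frame, residual in zip(frames, residuals):
        conf = confidence_map(frame, **kwargs)
        num += conf * residual
        den += conf
    return num / np.maximum(den, 1e-12)

# A frame with a vertical step edge at column 16.
frame = np.full((32, 32), 50.0)
frame[:, 16:] = 200.0
conf = confidence_map(frame)
print(f"confidence at edge: {conf[16, 16]:.2f}, in flat region: {conf[16, 2]:.2f}")
```

Pixels near the step edge receive low confidence while flat regions keep a weight close to one, so a persistent edge contributes little to the accumulated signature.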


Experimental Results

Test Scenes:

A – outdoor, moving
B – indoor, flat field
C – indoor, tripod
D – indoor, moving
E – indoor, moving
F – outdoor, stabilization
G – indoor, moving
H – indoor, zooming
I – indoor, moving (night mode)
J – flat field (night mode)
X – indoor, moving

[Figure: pairwise match results for the Old Method and the New Method. Key: True Match, True Non-match, False Match, False Non-match]


Other Applications of Confidence Weighting

• We have shown that confidence weighting can be used to improve the quality of extracted PRNU signatures by discriminating between regions within frames.

• The same framework can be expanded to include discrimination between frames, based on their differing utility to signature estimation. We plan to investigate two cues:

– signal amplification (gain) per frame. Cameras adjust to varying light by modifying the gain, increasing it when illumination decreases. Frames with higher gain will have relatively higher levels of noise, from which PRNU will be better estimated.

– keyframe/interframe characterization. Most video compression formats are heterogeneous, with certain keyframes preserved at a higher quality. Noise estimated from such frames is likely to be more useful for PRNU estimation.

• In addition to relative discrimination, confidence measures can be used to determine when the extracted signature is sufficient, or whether more/better frames are needed.
