79
Shu Kong Dept. of Computer Science University of California, Irvine Multigrid Predictive Filter Flow for Unsupervised Learning on Videos

Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Shu KongDept. of Computer Science

University of California, Irvine

Multigrid Predictive Filter Flow for Unsupervised Learning on Videos

Page 2: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Notice

Copyright (C) 2019 Shu Kong

Slides for academic use purpose; [link] for Google slides with videos.

The original slides contain multiple projects for in-person presentation. This public version only includes a subset that is about mgPFF.

More contents will be released later this year when no conflicts are involved; or provided with personal request.

Regards,Shu

Page 4: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Video

Video is an electronic medium for the recording, copying, playback, broadcasting, and display of moving visual media.

Page 5: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Video

Video is an electronic medium for the recording, copying, playback, broadcasting, and display of moving visual media.

● transporting information, entertainment, surveillance, assistive robots, autonomous vehicle...

Page 6: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Video

Video is an electronic medium for the recording, copying, playback, broadcasting, and display of moving visual media.

● transporting information, entertainment, surveillance, assistive robots, autonomous vehicle...

● natural signal, free and massive in amount...● often coming with free audio/caption.

Page 7: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Video Mining

To train machines using videos --● What to use from the videos?● How to train? Train for what?● What/how to make a difference?

Page 8: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

1. Unsupervised Learning with Multigrid Predictive Filter Flowvideo inst. seg./tracking, pose tracking, long-range flow, video shot det.

Outline of Video Mining

[Kong & Fowlkes, 2019]

Page 9: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

fully-supervised learning for trackingpro: state-of-the-art tracking performance on benchmarks

Fully-Supervised Learning for Tracking

[Li et al, 2017][Girdhar et al, 2018]

Page 10: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

fully-supervised learning for trackingpro: state-of-the-art tracking performance on benchmarkscon: expensive to annotate training set, domain/data bias

Fully-Supervised Learning for Tracking

[McDole et al. 2018][Kondratiev&Sorokin, 2018]

Page 11: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

pro: w/o annotation, domain-agnostic, cognitively, 2-week newborn can track w/o knowing semantics

Unsupervised Learning for Tracking

Page 12: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

● low-level vision based method;○ better generalization, compact model, cognitive observation

● exploiting temporal consistency (frame reconstruction);● allowing for learning transferable features, or on specific

data;● broader application.

Principles of the Idea

Page 13: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Method:● making direct, fine-grained predictions of how to reconstruct

a video frame from pixels of another frame ● being trained using simple photometric reconstruction error

Multigrid Predictive Filter Flow (mgPFF)

Page 14: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Method:● making direct, fine-grained predictions of how to reconstruct

a video frame from pixels of another frame ● being trained using simple photometric reconstruction error

Highlights:● unsupervised learning on free-form videos with single GPU;● easy training, long-range pixel connection;● extremely compact, 4.6MB;● fast computation, (0.1sec for a pair of 256x256 images).

Multigrid Predictive Filter Flow (mgPFF)

Page 15: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

generalizing optical flow with space-variant filters at each pixel

more powerful in modeling subpixel movement

Filter Flow [Seitz & Baker, 2009]

Page 16: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

models image transformation by linear mapping, where each pixel in only depends on local neighborhood centered at the same place in

Filter Flow [Seitz & Baker, 2009]

Page 17: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

pro: powerful, elegant, interpretable, applicable stereo, optical flow, deblur, deconvolution, morphing, defocus, affine alignment

Filter Flow [Seitz & Baker, 2009]

Page 18: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

pro: powerful, elegant, interpretable, applicable stereo, optical flow, deblur, deconvolution, morphing, defocus, affine alignment

con: optimization solver, impracticale.g., 22 hrs for optical flow on an image pair of 584x388 resolution

Filter Flow [Seitz & Baker, 2009]

Page 19: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

pro: powerful, elegant, interpretable, applicable stereo, optical flow, deblur, deconvolution, morphing, defocus, affine alignment

con: optimization solver, impracticale.g., 22 hrs for optical flow on an image pair of 584x388 resolution

Filter Flow [Seitz & Baker, 2009]

too slow☹

Page 20: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

optimization-based solver

Predictive Filter Flow [Kong & Fowlkes, 2018]

Page 21: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

optimization-based solverwe train a function/model that predicts the transformation specific to image pair (I1, I2) under the assumption that the image pairs (I1, I2) are drawn from some fixed joint distribution

Predictive Filter Flow [Kong & Fowlkes, 2018]

Page 22: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

optimization-based solverwe train a function/model that predicts the transformation specific to image pair (I1, I2) under the assumption that the image pairs (I1, I2) are drawn from some fixed joint distribution

Predictive Filter Flow [Kong & Fowlkes, 2018]

Page 23: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

With sampled image pairs , we seek parameters that minimize the difference between a recovered image and the real one measured by some loss

Predictive Filter Flow [Kong & Fowlkes, 2018]

Page 24: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

locality Kernel size, im2col with inner productCNN

non-negativity ReLUsum-to-one softmax

Predictive Filter Flow [Kong & Fowlkes, 2018]

Page 25: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

locality Kernel size, im2col with inner productCNN

non-negativity ReLUsum-to-one softmax

Predictive Filter Flow [Kong & Fowlkes, 2018]

Page 26: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

PFF for Unsupervised Learning on Videos

Page 27: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

alternative: grid sampling layer in the Spatial Transformer Network [Jaderberg et al. 2015]

PFF for Unsupervised Learning on Videos

Page 28: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

alternative: grid sampling layer in the Spatial Transformer Network [Jaderberg et al. 2015]

PFF for Unsupervised Learning on Videos

grid sampling

Page 29: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

alternative: grid sampling layer in the Spatial Transformer Network [Jaderberg et al. 2015]

PFF for Unsupervised Learning on Videos

grid sampling

hard to train ☹only if near correct prediction

Page 30: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

alternative: grid sampling layer in the Spatial Transformer Network [Jaderberg et al. 2015]

PFF for Unsupervised Learning on Videos

grid sampling Filter Flow

? easy to train ☺all pixels provide gradient info

hard to train ☹only if near correct prediction

0.2 0.1

0.3 0.1

.05 .02

.06

.04

.00

.01

.02.00

.02

.01

Page 31: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

alternative: grid sampling layer in the Spatial Transformer Network [Jaderberg et al. 2015]

PFF for Unsupervised Learning on Videos

grid sampling Filter Flow

inverse easy to train ☺all pixels provide gradient info

hard to train ☹only if near correct prediction

0.2 0.1

0.3 0.1

.05 .02

.06

.04

.00

.01

.02.00

.02

.01

Page 32: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

[-2,2]

[-2,-2] [2,-2]

[2,2]

11

1.11.4

12

22*0.6+ *0.3+ *0.1=

Voting for Offsetencouraging unimodal shape of the filter for the offset prediction

also allowing for visualization

Page 33: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

another challenge: requiring large flow size to capture large displacement.

PFF for Unsupervised Learning on Videos

Page 34: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

another challenge: requiring large flow size to capture large displacement.

If pixel movement is in [-40, 40] pixels, then a filter flow size should be no less than 80, meaning 80x80 kernel for each pixel.

PFF for Unsupervised Learning on Videos

Page 35: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

another challenge: requiring large flow size to capture large displacement.

If pixel movement is in [-40, 40] pixels, then a filter flow size should be no less than 80, meaning 80x80 kernel for each pixel.

If the image is 256x256, then the output is 256x256x6400!

PFF for Unsupervised Learning on Videos

Page 36: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

another challenge: requiring large flow size to capture large displacement.

If pixel movement is in [-40, 40] pixels, then a filter flow size should be no less than 80, meaning 80x80 kernel for each pixel.

If the image is 256x256, then the output is 256x256x6400!

Our solution is Multigrid PFF.

PFF for Unsupervised Learning on Videos

Page 37: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

multigrid filter

Multigrid PFF for Large Displacement

Decompose large sparse linear operator into a product of more compact terms

Page 38: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

multigrid filter coarse-to-fine

multi-resolution frames

Multigrid PFF for Large Displacement

1/16

1/8

1/4

1/2

Page 39: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Multigrid PFF

[-2,2]

[-2,-2] [2,-2]

[2,2]

11

1.11.4

12

22*0.6+ *0.3+ *0.1=

voting for offset

Page 40: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Rather than 256x256x6400, with PFF of 11*11 kernel size for all scales, we have output with mgPFF as 256x256x121+128x128x121+64x64x121+32x32x121.

Multigrid Predictive Filter Flow (mgPFF)

Page 41: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Rather than 256x256x6400, with PFF of 11*11 kernel size for all scales, we have output with mgPFF as 256x256x121+128x128x121+64x64x121+32x32x121.

With self-similarity across scales, sharing the weights to make it compact, resulting into a model of 4.6MB.

Multigrid Predictive Filter Flow (mgPFF)

Page 42: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

training on free-form videos (e.g., the complete Sintel Movie).

byproduct: video transition/shot detection

Multigrid Predictive Filter Flow (mgPFF)

Page 43: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Multigrid Predictive Filter Flow (mgPFF)

Page 44: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Multigrid Predictive Filter Flow (mgPFF)

target: frameA rec-Asource: frameB

FF to reconstruct A using B

Page 45: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Applications of mgPFF

various tasks, for example--1. transition/shot detection2. video instance tracking, human pose tracking3. long-range flow

Page 46: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

simply propagating the mask using the estimated flow

mgPFF for Instance Tracking

Page 47: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

simply propagating the mask using the estimated flow

tracking right hand

mgPFF for Instance Tracking

frame-0

Page 48: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

simply propagating the mask using the estimated flow

tracking bird frame-0

mgPFF for Instance Tracking

Page 49: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

simply propagating the mask using the estimated flow

tracking head

mgPFF for Instance Tracking

frame-0

Page 50: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

simply propagating the mask using the estimated flow

benchmarking on the DAVIS dataset

mgPFF for Instance Tracking

Page 51: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

mgPFF for Instance Tracking

K=3 [1, t-2, t-1] using first and previous two frames for tracking

Page 52: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

mgPFF for Instance Tracking

K=1 [t-1] using the previous frame for tracking

Page 53: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

mgPFF for Instance Tracking

K=1 [1] using the first frame for tracking

Page 54: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

mgPFF for Instance Tracking

Page 55: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

mgPFF for Instance Tracking

how it deals with heavy occlusion

Page 56: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

mgPFF for Instance Tracking

how it deals with large deformation

Page 57: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

mgPFF for Pose Tracking

Page 58: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

mgPFF for Pose Tracking

Page 59: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

mgPFF for Pose Tracking

occlusion on the knees

Page 60: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

mgPFF for Pose Tracking

joints moving out of the box

Page 61: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

mgPFF for Pose Tracking

similar color between hair and wall

Page 62: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

mgPFF for Pose Tracking

motion blur on the elbow

Page 63: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

mgPFF for Long-Range Flow

Page 64: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

1. unsupervised learning framework on free-form videos;2. compact model (4.6MB), easy training, fast computation;3. better perf. of video tracking, great power for long-range flow;4. interpretable in terms of decision making (per-pixel tracking);

Summary: mgPFF for Video Mining

Page 65: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

1. unsupervised learning framework on free-form videos;2. compact model (4.6MB), easy training, fast computation;3. better perf. of video tracking, great power for long-range flow;4. interpretable in terms of decision making (per-pixel tracking);5. reminiscent of a variety of flow-based tasks

video compression, frame interpolation, activity/action cls., optical flow, etc.

Summary: mgPFF for Video Mining

Page 66: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

1. unsupervised learning framework on free-form videos;2. compact model (4.6MB), easy training, fast computation;3. better perf. of video tracking, great power for long-range flow;4. interpretable in terms of decision making (per-pixel tracking);5. reminiscent of a variety of flow-based tasks

video compression, frame interpolation, activity/action cls., optical flow, etc.6. interpretable model for good (transparent decision making)

e.g., medical image enhancement

Summary: mgPFF for Video Mining

Page 67: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

PFF for Single Image Reconstruction

Page 68: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

PFF for Single Image Reconstruction

[Kong & Fowlkes, unpublished]

Page 69: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

PFF for Single Image Reconstruction

[Kong & Fowlkes, unpublished]

Page 70: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

PFF for Single Image Reconstruction

[Kong & Fowlkes, 2018]

non-uniform deblur

Page 71: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

PFF for Single Image Reconstruction

[Kong & Fowlkes, 2018]

lossy compression artifact reduction

Page 72: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

PFF for Single Image Reconstruction

[Kong & Fowlkes, 2018]

single image super-resolution

Page 73: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

1. unsupervised learning framework on free-form videos;2. compact model (4.6MB), easy training, fast computation;3. better perf. of video tracking, great power for long-range flow;4. interpretable in terms of decision making (per-pixel tracking);5. reminiscent of a variety of flow-based tasks

video compression, frame interpolation, activity/action cls., optical flow, etc.6. interpretable model for good (transparent decision making)

e.g., medical image enhancement7. abundant future work

combining higher-level info., mobile dev., etc.

Summary: mgPFF for Video Mining

Page 74: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

1. Unsupervised Learning with Multigrid Predictive Filter Flowvideo inst. seg./tracking, pose tracking, long-range flow, video shot det.

2. tba3. tba4. Conclusion with discussion

Outline of Video Mining

Page 75: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

1. Learning with videos in a more affordable way (not much supervision required)

2. low-vision mining to mid-level application, high-level learning

With videos, a lot is happening

Conclusion

Page 76: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

1. Learning with videos in a more affordable way (not much supervision required)

2. low-vision mining to mid-level application, high-level learning

With videos, a lot is happening; some future explorations --● visual commonsense/knowledge

○ affordance, correspondence, parts, etc.● better human-machine intersection (assistive robots)● better intelligent systems

Conclusion

Page 77: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Thanks

Page 78: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Shu Kong & Charless Fowlkes, 2019

Thanks

Page 79: Multigrid Predictive Filter Flow for Unsupervised Learning ...skong2/slides/mgpff_public_version.pdf · con: expensive to annotate training set, domain/data bias Fully-Supervised

Q&A

Thanks