36
Shu Kong CS, ICS, UCI Recurrent Scene Parsing with Perspective Understanding In the Loop

Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

  • Upload
    others

  • View
    0

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Shu KongCS, ICS, UCI

Recurrent Scene Parsing with Perspective Understanding In the Loop

Page 2: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

1. Background

2. Attention to Perspective: Depth-aware Gating

3. Recurrent Refining

4. Attentional Mechanism

5. Conclusion and Future Work

Outline

Page 3: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

1. Background

Outline

Page 4: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Background

Semantic Segmentation with Deep Convolutional Neural Networks

Keywords: skip connection, multi-scale, upsampling

Page 5: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Background

DeepLab is a strong baseline (based on ResNet architecture), yet simple and straightforward.

It sums up feature maps at different scales using atrous convolution, i.e. convolution with various dilate rates.

[1] DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

Page 6: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

1. a trous (French) -- holes (English)

2. Atrous convolution (skipping/inserting zero)

Background

[1] DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

Page 7: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

fusing responses with multiple atrous kernels of different rates.

Background

Page 8: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Background

That's all about the baseline.

[1] DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs

Page 9: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Large Perspective Image

The fusion of multi-scale feature maps exhibits some degree of scale invariance;

but it's not obvious this invariance covers the range scale variantion existing in perspective images.

Page 10: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Large Perspective Image

large range scale variantion in perspective images.

car

pole

white/black board

charis

Page 11: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

1. Background

2. Attention to Perspective: Depth-aware Gating

Outline

Page 12: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

disparity, or depth, conveys the scale information.

Depth-aware pooling module

Page 13: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

select the right scale with depth

Depth-aware pooling module

Page 14: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

quantize the disparity into five scales with dilate rates {1, 2, 4, 8, 16}

Depth-aware pooling module

Page 15: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Alternatively, learning depth estimator, and testing without depth

Depth-aware pooling module

Page 16: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Alternatively, learning depth estimator, and testing without depth

reliable monocular depth estimation

Depth-aware pooling module

Page 17: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

more configurations to compare --

1. sharing the parameters in this pooling module (multiPool)

Depth-aware pooling module

Page 18: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

more configurations to compare --

1. sharing the parameters in this pooling module (multiPool)

2. averaging the feature vs. depth-aware gating

Depth-aware pooling module

Page 19: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

more configurations to compare --

1. sharing the parameters in this pooling module (multiPool)

2. averaging the feature vs. depth-aware gating

3. MultiPool vs. MultiScale (input)

Depth-aware pooling module

Page 20: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

more configurations to compare --

1. sharing the parameters in this pooling module (multiPool)

2. averaging the feature vs. depth-aware gating

3. MultiPool vs. MultiScale (input)

Depth-aware pooling module

Page 21: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Qualitative Results -- street images

Depth-aware pooling module

Page 22: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Qualitative Results -- panorama images

Depth-aware pooling module

Page 23: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

1. Background

2. Attention to Perspective: Depth-aware Gating

3. Recurrent Refining

Outline

Page 24: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Recurrently refining the results by adapting the predicted depth

Recurrent Refinement Module

Page 25: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

unrolling the recurrent module during training

adding a loss to each unrolled loop

embedding the depth-aware gating module in the loops

Recurrent Refinement Module

Page 26: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Recurrently refining the results by adapting the predicted depth

Recurrent Refinement Module

Page 27: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Qualitative Results -- NYU-depth-v2 indoor dataset

Recurrent Refinement Module

Page 28: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Qualitative Results -- Cityscapes

yellow --> closer --> larger pooling size

Recurrent Refinement Module

Page 29: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Qualitative Results -- Stanford-2D-3D (panoramas)

blue --> closer --> larger pooling size

Recurrent Refinement Module

Page 30: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

1. Background

2. Attention to Perspective: Depth-aware Gating

3. Recurrent Refining

4. Attentional Mechanism

Outline

Page 31: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Some slides from this point are removed due to research conflicts.

They will be disclosed in the future.

Attention to Scale Again

Page 32: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Attention to Scale Again

Page 33: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

1. Background

2. Attention to Perspective: Depth-aware Gating

3. Recurrent Refining

4. Attentional Mechanism

5. Conclusion and Future Work

Outline

Page 34: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

1. Attentional module is powerful.

Conclusion and Future Work

Page 35: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

1. Attentional module is powerful.

2. Such attentional module should be also useful in various pixel-level tasks, e.g. pixel embedding for instance grouping, depth estimation, surface normal estimation, etc.

Conclusion and Future Work

Page 36: Recurrent Scene Parsing with Perspective Understanding In the …skong2/img/depthGatingSeg_slides.pdf · Some slides from this point are removed due to research conflicts. They will

Thanks