Automatic Dense Semantic Mapping From Visual Street-level Imagery

Sunando Sengupta [1], Paul Sturgess [1], Lubor Ladicky [2], Philip H.S. Torr [1]

[1] Oxford Brookes University
[2] Visual Geometry Group, Oxford University

http://cms.brookes.ac.uk/research/visiongroup/index.php

Dense Semantic Map

• Generate an overhead view of an urban region.

• Every pixel in the map view is associated with an object class label.

Legend: Building, Road, Tree, Vegetation, Fence, Signage, Sky, Pavement, Car, Pedestrian, Bollard, Shop Sign, Post.

Dense Semantic Map

• Street images are captured inexpensively from a vehicle with multiple mounted cameras [1].

[1] Yotta DCL, “Yotta dcl case studies,” Available: http://www.yottadcl.com/surveys/case-studies/

Semantic Mapping Framework

• The semantic mapping framework comprises two stages:

– Semantic image segmentation at street level.

– Ground plane labelling at a global level.

• One of the first attempts to do overhead mapping from street level images.

Figure: street level image acquisition → image segmentation → ground plane labelling.

Semantic Image Segmentation

Label every pixel in the image with an object class.

Figure: input (raw image) → Automatic Labeller → output (labelled image). Object class labels shown: Sky, Pavement, Car, Pedestrian, Bollard, Shop Sign, Post.

Semantic Image Segmentation

• We use the Conditional Random Field (CRF) framework.

• Each pixel is a node in a grid graph G = (V, E).

• Each node is a random variable x taking a label from the label set.

Figure: input image → CRF construction → final segmentation.

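Semantic Image Segmentation - CRF

• Total energy

\[
E(x) = \underbrace{\sum_{i \in V} \psi_i(x_i)}_{E_{pix}}
     + \underbrace{\sum_{i \in V,\, j \in N_i} \psi_{ij}(x_i, x_j)}_{E_{pair}}
     + \underbrace{\sum_{c \in C} \psi_c(x_c)}_{E_{region}}
\]

where V is the set of pixels, N_i is the neighbourhood of pixel i, and C is the set of pixel regions.

• Optimal labelling given as

\[
x^{*} = \arg\min_{x} E(x)
\]

• Total energy E = Epix + Epair + Eregion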
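To make the sum of the three terms concrete, here is a minimal Python sketch that evaluates E = Epix + Epair + Eregion on a toy row of four pixels and finds the lowest-energy labelling by brute force. All costs, the two-label set, and the helper names (`total_energy`, `potts`, `region_consistency`) are assumptions invented for illustration; they are not the potentials used in the talk.

```python
from itertools import product


def total_energy(labels, unary, edges, pair_cost, regions, region_cost):
    """Return E = Epix + Epair + Eregion for one labelling."""
    e_pix = sum(unary[i][label] for i, label in enumerate(labels))
    e_pair = sum(pair_cost(labels[i], labels[j]) for i, j in edges)
    e_region = sum(region_cost([labels[i] for i in region]) for region in regions)
    return e_pix + e_pair + e_region


def potts(label_i, label_j):
    """Plain Potts pairwise cost (hypothetical constant 0.5)."""
    return 0.0 if label_i == label_j else 0.5


def region_consistency(region_labels):
    """Hypothetical region cost: penalise regions with mixed labels."""
    return 0.0 if len(set(region_labels)) == 1 else 0.3


# Toy data: 4 pixels in a row, labels 0 = road, 1 = car (made-up costs).
unary = [[0.2, 1.0], [0.3, 0.9], [0.8, 0.4], [1.0, 0.1]]
edges = [(0, 1), (1, 2), (2, 3)]
regions = [(0, 1), (2, 3)]


def energy_of(labels):
    return total_energy(labels, unary, edges, potts, regions, region_consistency)


# Exhaustive search is only feasible for tiny problems like this one.
best = min(product(range(2), repeat=len(unary)), key=energy_of)
print(best, energy_of(best))
```

Exhaustive search stands in for a real optimiser here only because the toy problem has 16 possible labellings.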

• Epix - Models an individual pixel's cost of taking a label.

– Computed via the dense boosting approach.

– Multi-feature variant of TextonBoost [1].

Figure: example per-pixel label scores, Car 0.2 and Road 0.3.

[1] L. Ladicky, C. Russell, P. Kohli, and P. H. Torr, “Associative hierarchical CRFs for object class image segmentation,” in ICCV, 2009.

• Epair - Models the interactions between neighbouring pixels.

– Encourages label consistency in adjacent pixels.

– Sensitive to edges in images.

– Contrast sensitive Potts model: neighbouring nodes xi and xj are joined by an edge term g(i,j).
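The slides do not give the formula for g(i,j). A form commonly used for contrast sensitive Potts models is g(i,j) = θp + θv · exp(−θβ · ‖Ii − Ij‖²), where Ii and Ij are the colours of the two pixels. The sketch below assumes that form; the parameter names and default values are hypothetical.

```python
import math


def contrast_weight(colour_i, colour_j, theta_p=0.1, theta_v=1.0, theta_beta=0.01):
    """Assumed edge term g(i, j): large for similar colours, small across strong edges."""
    diff_sq = sum((a - b) ** 2 for a, b in zip(colour_i, colour_j))
    return theta_p + theta_v * math.exp(-theta_beta * diff_sq)


def pairwise_cost(label_i, label_j, colour_i, colour_j):
    """Contrast sensitive Potts: no cost for equal labels, g(i, j) otherwise."""
    if label_i == label_j:
        return 0.0
    return contrast_weight(colour_i, colour_j)


# Cutting between two similar pixels is expensive; cutting across an edge is cheap.
flat = pairwise_cost("road", "car", (120, 120, 120), (122, 121, 119))
edge = pairwise_cost("road", "car", (120, 120, 120), (20, 30, 200))
print(flat, edge)
```

With this choice, label boundaries are pushed towards image edges, which is the behaviour the "sensitive to edges" bullet describes.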

• Eregion - Models the behaviour of a group of pixels.

– Classifies a region.

– Encourages all the pixels in a region to take the same label.

– Groups of pixels are given by multiple mean-shift segmentations.

Figure: example region label scores, Car 0.3 and Road 0.1.

Semantic Image Segmentation

• Solved using the alpha-expansion algorithm [1].

Figure sequence: input image, followed by the road, building, sky and pavement expansions, and the final solution.

[1] Y. Boykov et al., “Fast Approximate Energy Minimization via Graph Cuts,” in ICCV, 1999.
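The expansion sequence above can be illustrated with a small sketch of the alpha-expansion loop. In each move, any subset of pixels may switch to the label alpha, and the lowest-energy result is kept. Note the swap: the algorithm in [1] solves each move with a graph cut, whereas this sketch enumerates every subset by brute force, which only works for a handful of pixels. The energy (unary costs plus a plain Potts term with constant `smooth`) and all numbers are assumptions for illustration.

```python
from itertools import product


def energy(labels, unary, edges, smooth):
    """Unary costs plus a plain Potts penalty on each edge with differing labels."""
    e = sum(unary[i][label] for i, label in enumerate(labels))
    e += sum(smooth for i, j in edges if labels[i] != labels[j])
    return e


def expansion_move(labels, alpha, unary, edges, smooth):
    """Best labelling reachable by letting any subset of pixels switch to alpha."""
    best, best_e = list(labels), energy(labels, unary, edges, smooth)
    for switch in product((False, True), repeat=len(labels)):
        cand = [alpha if s else label for s, label in zip(switch, labels)]
        cand_e = energy(cand, unary, edges, smooth)
        if cand_e < best_e:
            best, best_e = cand, cand_e
    return best, best_e


def alpha_expansion(unary, edges, smooth, n_labels, init=0):
    """Cycle over labels until no expansion move lowers the energy."""
    labels = [init] * len(unary)
    current = energy(labels, unary, edges, smooth)
    improved = True
    while improved:
        improved = False
        for alpha in range(n_labels):
            new_labels, new_e = expansion_move(labels, alpha, unary, edges, smooth)
            if new_e < current:
                labels, current, improved = new_labels, new_e, True
    return labels, current


# Toy chain of 5 pixels; labels 0 = road, 1 = pavement, 2 = building (made-up costs).
unary = [[0.1, 1.0, 1.0], [0.2, 0.9, 1.0], [0.9, 0.8, 1.0],
         [1.0, 0.1, 0.9], [1.0, 0.2, 0.8]]
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
labels, final_energy = alpha_expansion(unary, edges, smooth=0.5, n_labels=3)
print(labels, final_energy)
```

The loop terminates because every accepted move strictly lowers the energy and the number of labellings is finite.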

Ground Plane Labelling

• Combine many labellings from street level imagery.

Figure: street level labellings → Automatic Labeller → output (labelled ground plane).

Ground Plane CRF

• A CRF is defined over the ground plane.

• Each ground plane pixel (zi) is a random variable taking a label from the label set.

• The energy for the ground plane CRF is

E(Z) = Egpix + Egpair

Ground Plane Pixel Cost

• We assume a flat world.

• A ground plane region is estimated.

• Each point in the image projects to a unique point on the ground plane, creating a homography.

• The image labelling is mapped to the ground plane via the homography.

• Labels projected from many views are combined in a histogram.

• The normalised histogram gives the naïve probability of the ground plane pixel taking a label.

Figure: image labels (Road, Pavement, Post/Pole) mapped through the homography onto the ground plane Z, with a label histogram for each ground plane pixel.
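The projection and voting steps can be sketched as follows, assuming each view comes with a known 3×3 homography H from image pixels to ground plane coordinates. The function names, the grid `cell_size`, and the toy homographies are hypothetical; how the homographies are estimated is outside this sketch.

```python
from collections import Counter, defaultdict


def apply_homography(H, u, v):
    """Map image pixel (u, v) to ground plane coordinates with a 3x3 homography H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    return x / w, y / w


def accumulate_votes(views, cell_size=1.0):
    """Build one label histogram per ground plane cell from many labelled views.

    views: iterable of (H, labelled_pixels), labelled_pixels = [((u, v), label), ...]
    """
    histograms = defaultdict(Counter)
    for H, labelled_pixels in views:
        for (u, v), label in labelled_pixels:
            x, y = apply_homography(H, u, v)
            cell = (int(x // cell_size), int(y // cell_size))
            histograms[cell][label] += 1
    return histograms


def normalise(hist):
    """Turn label counts into the naive per-label probability."""
    total = sum(hist.values())
    return {label: count / total for label, count in hist.items()}


# Two toy views: an identity homography and a shift of one unit along x.
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
shift_x = [[1, 0, 1], [0, 1, 0], [0, 0, 1]]
views = [
    (identity, [((2, 3), "road"), ((3, 3), "pavement")]),
    (shift_x, [((1, 3), "road"), ((2, 3), "road")]),
]
histograms = accumulate_votes(views)
print({cell: normalise(h) for cell, h in histograms.items()})
```

Cells seen from several views with conflicting labels end up with a spread-out histogram, which is what the later CRF smoothing has to resolve.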

Ground Plane Labelling

• A histogram is built for every ground plane pixel, giving Egpix.

• A pairwise cost (Egpair) is added to induce smoothness.

– Contrast sensitive Potts model
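The slides do not say how the histogram is turned into the cost Egpix. One common choice, assumed here and not confirmed by the source, is the negative log of the normalised histogram, with a small `eps` to keep unseen labels finite. The function name and `eps` value are hypothetical.

```python
import math


def ground_pixel_cost(hist, label_set, eps=1e-6):
    """Assumed Egpix: negative log of the normalised label histogram."""
    total = sum(hist.get(label, 0) for label in label_set)
    costs = {}
    for label in label_set:
        # Fall back to a uniform distribution for cells that received no votes.
        p = hist.get(label, 0) / total if total else 1.0 / len(label_set)
        costs[label] = -math.log(p + eps)
    return costs


costs = ground_pixel_cost({"road": 3, "pavement": 1}, ["road", "pavement", "car"])
print(costs)
```

Frequently voted labels receive a low cost, and labels never observed at a cell receive a high but finite one.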

Ground Plane Labelling

• Final CRF solution obtained using alpha expansion.

Figure sequence: road, building, pavement and car expansions, followed by the final solution.

Dataset

• Subset of the images captured by the van.

– 14.8 km of track, 8000 images from each camera.

• Pixel-level labelled ground truth images. Dataset available [1].

• 13 object categories.

• Training - 44 images, testing - 42 images.

[1] http://cms.brookes.ac.uk/research/visiongroup/projects/SemanticMap/index.php

Semantic Image Segmentation (SIS) Results

• Input images, output of our image level CRF, ground truths.

• Used the Automatic Labelling Environment [1].

Figure rows: input images, semantic segmentation, ground truth.

[1] L. Ladicky and P. H. S. Torr, The Automatic Labelling Environment. Code available: http://cms.brookes.ac.uk/staff/PhilipTorr/ale.htm

Semantic Map Results

Semantic map of Pembroke city

Ground Plane Map Evaluation

Figure rows: street images, back-projected map results, ground truth.

• We back-project the ground plane map into the image domain and evaluate the results.

• Global pixel accuracy of 86%
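Global pixel accuracy is the fraction of labelled pixels whose predicted class matches the ground truth. A minimal sketch of that measure follows; the `void_label` argument for skipping unlabelled ground truth pixels is an assumption, since the slides do not say how such pixels are handled.

```python
def global_pixel_accuracy(predicted, ground_truth, void_label=None):
    """Fraction of non-void ground truth pixels whose predicted label is correct."""
    correct = 0
    total = 0
    for pred_row, gt_row in zip(predicted, ground_truth):
        for pred, truth in zip(pred_row, gt_row):
            if truth == void_label:
                continue  # unlabelled ground truth pixels are not scored
            total += 1
            correct += int(pred == truth)
    return correct / total if total else 0.0


predicted = [["road", "road", "car"], ["pavement", "road", "car"]]
ground_truth = [["road", "road", "road"], ["pavement", None, "car"]]
print(global_pixel_accuracy(predicted, ground_truth))
```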

Results

Conclusions

• Presented a method to generate an overhead-view semantic map.

• Experiments on long tracks (~15 km); the method can be scaled up to country-wide mapping.

• Dataset available [1].

[1] http://cms.brookes.ac.uk/research/visiongroup/projects/SemanticMap/index.php

Future Work

• Perform 3D street level semantic mapping and reconstruction.

• Add detailed street level information such as signs and information boards.

Thank you!

Oxford Brookes Vision Group, Oxford Brookes University
http://cms.brookes.ac.uk/research/visiongroup/index.php

• Using a single view creates a shadow effect for objects that violate the flat world assumption, which gives wrong label estimates.

Figure: single view vs. multi-view.
