End-to-end Lane Detection through Differentiable Least-Squares Fitting

Wouter Van Gansbeke, Bert De Brabandere, Davy Neven, Marc Proesmans, Luc Van Gool

arXiv:1902.00293v3 [cs.CV] 5 Sep 2019

ECE 285 – Autonomous Driving Systems

Presented by – Anirudh Swaminathan – April 23, 2020

Why Lane Detection?

Detecting lanes is important because:

Position vehicle within the lane

Plan future trajectory, lane departures

Lane Detection Background

Methods before this paper:

Two step pipelines

First step -> segment lane line markings

Second step -> fit a lane line model to post-processed mask

2-stage examples

Classical -> SIFT [20] / SURF [2] for feature extraction

RANSAC / spline / polynomial for parameters of best fitting model

Deep Learning Based Instance Segmentation – LaneNet [24]

Curve fitting mostly same

[24] D. Neven, B. De Brabandere, S. Georgoulis, M. Proesmans, and L. Van Gool. Towards end-to-end lane detection: an instance segmentation approach. arXiv:1802.05591, 2018.

[2] H. Bay, T. Tuytelaars, and L. Van Gool. SURF: Speeded up robust features. In Proceedings of the European Conference on Computer Vision, 2006.

[20] D. G. Lowe. Object recognition from local scale-invariant features. In Proceedings of the IEEE International Conference on Computer Vision, 1999.

Objective of the Paper

End-to-end manner

Directly regress lane parameters

Motivation

Why single step?

Parameters not optimized for true task

True task – estimating lane curvature parameters

Proxy task – Segmenting lane markings

Prevents instabilities in curve fitting

2 step -> sensitive to outliers

End-to-end -> implicitly learn features to prevent instabilities

Methodology

Key Idea -> Integrate curve fitting step as a differentiable in-network optimization step

Deep Network for the feature extraction step

Key Idea -> A geometric loss function for the network

Framework

The framework consists of 3 main modules:

Deep network to generate weighted pixel coordinates

Differentiable weighted least squares fitting module

Geometric Loss Function

Example Architecture – Figure 1 from the paper

Generating Weighted Pixel Coordinates

First Module of network

Normalized Coordinates -> x map and y map

Each coordinate -> weight w

Feature map -> same spatial dimensions as that of input image

Feature Maps

Non-negative weights

Width – w, height – h; m = w * h

m triplets generated – (x, y, w), where w in a triplet is the pixel weight, not the image width

One feature map for each lane
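The step from a weight map to weighted points can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the function name `weighted_pixel_coordinates`, the [0, 1] normalization range, and squaring the raw output to obtain non-negative weights are all assumptions made for the example.

```python
import numpy as np

def weighted_pixel_coordinates(weight_map):
    """Flatten an (h, w) non-negative weight map into m = h * w triplets (x, y, w)."""
    height, width = weight_map.shape
    # Normalized coordinate maps; the [0, 1] range is an assumption of this sketch.
    xs = np.linspace(0.0, 1.0, width)
    ys = np.linspace(0.0, 1.0, height)
    x_map, y_map = np.meshgrid(xs, ys)  # both have shape (height, width)
    return np.stack([x_map.ravel(), y_map.ravel(), weight_map.ravel()], axis=1)

# Toy 2x3 "network output"; squaring is one hypothetical way to get non-negative weights.
raw = np.array([[0.5, -1.0, 0.0],
                [2.0, 0.0, 1.0]])
triplets = weighted_pixel_coordinates(raw ** 2)  # shape (6, 3)
```

With one weight map per lane, the same x map and y map would be reused and only the last column would change.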

Weighted Least Squares Layer

m triplets (x, y, w) -> weighted points in 2D space

Fit curve

Module output -> n parameters of best-fitting curve

Background - Least Squares Fitting

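$X\beta = Y$; $X \in \mathbb{R}^{m \times n}$; $\beta \in \mathbb{R}^{n \times 1}$; $Y \in \mathbb{R}^{m \times 1}$

X is the input, $\beta$ are the parameters, and Y is the output

Least Squares -> $\beta = \arg\min_{\beta} \|X\beta - Y\|^2$

Normal Equation -> $\beta = (X^T X)^{-1} X^T Y$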
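As a small worked example of the normal equation, the sketch below fits a parabola to noise-free synthetic points, writing the parameter vector as `beta`. The data and variable names are made up for illustration; the linear system is solved directly rather than forming the inverse, which is the usual numerically safer choice.

```python
import numpy as np

# Synthetic points on y = 1 + 2x + 3x^2 (hypothetical example data).
x = np.linspace(-1.0, 1.0, 20)
y = 1.0 + 2.0 * x + 3.0 * x ** 2

# Design matrix with columns [1, x, x^2], shape (m, n) = (20, 3).
X = np.stack([np.ones_like(x), x, x ** 2], axis=1)

# Normal equation: solve (X^T X) beta = X^T Y.
beta = np.linalg.solve(X.T @ X, X.T @ y)
```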

Background – Weighted Least Squares

Least squares extended

$W \in \mathbb{R}^{m \times m}$; Diagonal matrix -> one weight per observation

$\beta = \arg\min_{\beta} \|W^{1/2}(X\beta - Y)\|^2$

$X^T W X \beta = X^T W Y$

$\beta = (X^T W X)^{-1} X^T W Y$
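The effect of the weights can be seen in a short sketch: giving an outlier zero weight removes its influence on the fit. The helper name `weighted_least_squares` and the toy data are assumptions for this example, with `beta` denoting the parameter vector and `w` the diagonal of W.

```python
import numpy as np

def weighted_least_squares(X, y, w):
    """Solve (X^T W X) beta = X^T W y with W = diag(w)."""
    XtW = X.T * w  # scales column i of X^T by w_i, equivalent to X^T @ diag(w)
    return np.linalg.solve(XtW @ X, XtW @ y)

x = np.linspace(0.0, 1.0, 10)
y = 0.5 - 1.0 * x + 2.0 * x ** 2
y[3] += 5.0  # inject one outlier

X = np.stack([np.ones_like(x), x, x ** 2], axis=1)

w = np.ones_like(x)
w[3] = 0.0  # down-weight the outlier completely

beta_weighted = weighted_least_squares(X, y, w)
beta_uniform = weighted_least_squares(X, y, np.ones_like(x))
```

Here `beta_weighted` recovers the true parabola, while `beta_uniform` is pulled away by the outlier. In the framework above the weights come from the network rather than being set by hand.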

Backprop through the layer

Equations involve differentiable matrix operations

Calculate the derivative of $\beta$ wrt W

Refer to [10] to derive backprop

[10] M. B. Giles. An extended collection of matrix derivative results for forward and reverse mode automatic differentiation. Technical report, University of Oxford, 2008.
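One way to see that the layer is differentiable is to differentiate the weighted normal equation directly. With $A = X^T W X$ and parameter vector $\beta$, a standard matrix-calculus step gives $\partial\beta / \partial w_i = A^{-1} x_i (y_i - x_i^T \beta)$, where $x_i$ is row i of X. The sketch below checks this expression against finite differences; it is an illustration of the idea, not the derivation in [10] or the authors' implementation, and the function names are hypothetical.

```python
import numpy as np

def wls(X, y, w):
    """Weighted least squares solution of (X^T W X) beta = X^T W y."""
    XtW = X.T * w
    return np.linalg.solve(XtW @ X, XtW @ y)

def dbeta_dw(X, y, w):
    """Jacobian of beta wrt the weights, shape (n, m); column i is A^{-1} x_i r_i."""
    beta = wls(X, y, w)
    A = (X.T * w) @ X
    residual = y - X @ beta  # r_i = y_i - x_i^T beta
    return np.linalg.solve(A, X.T * residual)

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 8)
X = np.stack([np.ones_like(x), x, x ** 2], axis=1)
y = rng.normal(size=8)
w = rng.uniform(0.5, 1.5, size=8)

analytic = dbeta_dw(X, y, w)

# Central finite differences, one weight at a time.
eps = 1e-6
numeric = np.zeros_like(analytic)
for i in range(len(w)):
    dw = np.zeros_like(w)
    dw[i] = eps
    numeric[:, i] = (wls(X, y, w + dw) - wls(X, y, w - dw)) / (2 * eps)
```

In an autodiff framework such as PyTorch this gradient would normally be obtained automatically from the matrix operations, so the explicit Jacobian is only for intuition.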

Geometric Loss Function - precursor

Usually, L2 loss used for curve fitting

Here, $\beta_i$ and $\hat{\beta}_i$ -> generated and ground-truth curve parameters

Geometric Loss Function

Lane Detection -> geometric interpretation

Minimize squared area between predicted curve and ground truth curve

Geometric Meaning

Geometric Loss Function – Parabola

This paper -> lane curves parabolic

$y = \beta_0 + \beta_1 x + \beta_2 x^2$; $\Delta\beta_i = \beta_i - \hat{\beta}_i$
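For a parabola, the squared area between the two curves up to a distance t has a closed form. Writing the coefficient differences as d0, d1, d2 and expanding $\int_0^t (d_0 + d_1 x + d_2 x^2)^2 dx$ gives $d_2^2 t^5/5 + d_1 d_2 t^4/2 + (d_1^2 + 2 d_0 d_2) t^3/3 + d_0 d_1 t^2 + d_0^2 t$. The sketch below evaluates this and compares it with a numerical integral; the function name and the example coefficients are made up for illustration.

```python
import numpy as np

def geometric_loss_parabola(beta, beta_hat, t):
    """Integral over [0, t] of the squared difference between two parabolas."""
    d0, d1, d2 = np.asarray(beta, dtype=float) - np.asarray(beta_hat, dtype=float)
    return (d2 ** 2 * t ** 5 / 5
            + d1 * d2 * t ** 4 / 2
            + (d1 ** 2 + 2 * d0 * d2) * t ** 3 / 3
            + d0 * d1 * t ** 2
            + d0 ** 2 * t)

beta = [0.1, -0.3, 0.8]      # hypothetical predicted parameters
beta_hat = [0.0, 0.2, 0.5]   # hypothetical ground-truth parameters
t = 1.5

closed_form = geometric_loss_parabola(beta, beta_hat, t)

# Numerical check with the trapezoidal rule.
xs = np.linspace(0.0, t, 20001)
diff = (beta[0] - beta_hat[0]) + (beta[1] - beta_hat[1]) * xs + (beta[2] - beta_hat[2]) * xs ** 2
f = diff ** 2
numeric = np.sum((f[1:] + f[:-1]) / 2 * np.diff(xs))
```

Unlike a plain L2 loss on the parameters, this expression weights each coefficient error by how much it changes the curve over [0, t].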

Optional Transformations

Weighted coordinates -> another reference frame

Use fixed transformation matrix H

Lane line -> better as parabola from top-down/ortho view(BEV)
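Applying a fixed transformation to the coordinates can be sketched as a homography on homogeneous points. The matrix `H` below is an arbitrary hypothetical example, not the one used in the paper, and `apply_homography` is a name chosen for this illustration. Only the coordinates are transformed; each point keeps its weight.

```python
import numpy as np

def apply_homography(H, x, y):
    """Map points (x, y) through a 3x3 homography H and dehomogenize."""
    pts = np.stack([x, y, np.ones_like(x)], axis=0)  # shape (3, m)
    u, v, s = H @ pts
    return u / s, v / s

# Hypothetical fixed transformation (assumption for this sketch).
H = np.array([[1.0, 0.2, 0.0],
              [0.0, 1.5, 0.0],
              [0.0, 0.5, 1.0]])

x = np.array([0.0, 0.5, 1.0])
y = np.array([0.0, 0.5, 1.0])
x_bev, y_bev = apply_homography(H, x, y)
```

Because H is fixed, the transformation is a differentiable operation on the coordinates and does not add learnable parameters.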

Experiment – Ego Lane Detection

Ego lane -> the current lane of the vehicle

Two lane markings -> one left and one right

Parabola -> up to fixed distance t from car

Overall error = average over 2 lanes, and average over images

Dataset

TuSimple Dataset [29]

Manually select and clean up the annotations of 2535 images

Filter out images where ego-lane cannot be detected unambiguously

20% images -> validation set

Images from a single temporal sequence are never split across the train and val sets

[29] TuSimple. TuSimple benchmark, 2017.

Annotation

Ground truth curve parameters -> parabola

Draw curve of fixed thickness as dense label

Baseline – Cross-entropy training

Training -> segmentation

Per-pixel binary cross-entropy loss

Testing -> segmentation mask generated

Parabola fitted in least squares sense
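The two-step baseline at test time can be sketched as thresholding the predicted mask and fitting a parabola to the surviving pixels. The threshold value, the choice to model the column as a function of the row, and the toy mask are assumptions of this example; the slides do not specify these details.

```python
import numpy as np

def fit_parabola_to_mask(prob_mask, threshold=0.5):
    """Fit col = b0 + b1 * row + b2 * row^2 to pixels above the threshold."""
    rows, cols = np.nonzero(prob_mask > threshold)
    X = np.stack([np.ones(len(rows)), rows, rows ** 2], axis=1).astype(float)
    beta, *_ = np.linalg.lstsq(X, cols.astype(float), rcond=None)
    return beta

# Toy probability mask with a straight lane marking: col = 3 + row.
mask = np.zeros((10, 16))
for r in range(10):
    mask[r, 3 + r] = 0.9

beta_baseline = fit_parabola_to_mask(mask)
```

The hard threshold is not differentiable, so in this setup the fitting error cannot be propagated back to the segmentation network.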

End-to-end training

ERFNet [28] -> network architecture

350 epochs; 1 GPU; 256*512 resolution; batch size 8

Adam [19] with LR $10^{-4}$

PyTorch [26]

[28] E. Romera, J. M. Alvarez, L. M. Bergasa, and R. Arroyo. Efficient convnet for real-time semantic segmentation. In IEEE Intelligent Vehicles Symposium, pages 1789–1794, 2017.

[19] D. P. Kingma and J. Ba. Adam: A method for stochastic optimization. In Proceedings of the International Conference on Learning Representations, 2015.

[26] A. Paszke, S. Gross, S. Chintala, G. Chanan, E. Yang, Z. DeVito, Z. Lin, A. Desmaison, L. Antiga, and A. Lerer. Automatic differentiation in PyTorch. In NIPS-W, 2017.

Detour - ERFNet

ERFNet

Semantic Segmentation

Typical Encoder – Decoder architecture

Last layer -> adapted to output 2 feature maps

One for each ego lane

Transform weighted coordinates using fixed H to top-down view

[28] E. Romera, J. M. Alvarez, L. M. Bergasa, and R. Arroyo. Efficient convnet for real-time semantic segmentation. In IEEE Intelligent Vehicles Symposium, pages 1789–1794, 2017.

Results - Quantitative

Result – Training curves

Qualitative Results

Analysis

Lower error than cross-entropy method

Convergence slower -> supervision signal weaker

Generated weight maps look like segmentation maps

The network eventually discovers that the most consistent way to satisfy the loss function is to focus on the visible lane markings in the image, and to map them to a segmentation-like representation in the weight maps.

Further Experiments – Multi-lane detection

4 lanes total -> ortho-view

Line prediction branch; horizon prediction branch

Horizon prediction branch -> regression -> estimate horizon

Line prediction branch -> whether lane is present or not

Architecture

Architecture details

Side branches -> 4 conv layers -> each 3x3

Then max pool -> FC layer

Losses for 3 tasks -> combined linearly

Dataset

3626 images

20% validation set

2782 test set images

Comparison

ERFNet without backprop through least squares layer -> baseline

[25] Spatial CNN

[24] -> Instance Segmentation approach

[24] D. Neven, B. De Brabandere, S. Georgoulis, M. Proesmans, and L. Van Gool. Towards end-to-end lane detection: an instance segmentation approach. arXiv:1802.05591, 2018.

[25] X. Pan, J. Shi, P. Luo, X. Wang, and X. Tang. Spatial as deep: Spatial CNN for traffic scene understanding. In AAAI, 2018.

Results - Quantitative

Results - Qualitative

Analysis

Improve upon baseline by 0.7%

Faster than benchmarks in test time -> no post-processing required

71 fps on NVIDIA 1080Ti

ADVANTAGES

Optimized for true task -> prevents instabilities in curve fitting

Offers a degree of interpretability

Generated weight maps -> segmentation-like

Can be inspected and visualized

Geometry-aware criterion as loss function

Handle large variance, faded lane markings

Moves complexity from post-processing to network -> one-shot fitting

DISADVANTAGES

Loss function -> more complicated for higher order curves

Fixed transformation H to ortho-view

If the ground plane is different (e.g. sloping uphill), then bad lane parameters at test time

Local minimum possible (noted by the authors)

Vanishing point on the horizon / left corner of the image used as features -> already a good curve -> no further improvement

Multi-lane -> fixed number of maps -> pre-defined order

Lane changes hard; order ambiguous

Instance segmentation -> not subject to specific order

Quantitative results -> comparatively worse (see slide 40)

KEY TAKEAWAYS

Including differentiable in-network optimization step.

Geometric Loss function relevant to the task

Question to the class

Why do you think that the loss in the Least Squares layer is back-propagated only to the coordinate weights, and not to the coordinates themselves?

THANK YOU!

Questions?
