Deep Classification Network for Monocular Depth Estimation Azeez Oluwafemi 1,2, Yang Zou 2, B.V.K. Vijaya Kumar 2 1.InstaDeep, 2. Carnegie Mellon University ABSTRACT Monocular Depth Estimation is usually treated as a supervised and regression problem when it actually is very similar to semantic segmentation task since they both are fundamentally pixel-level classification tasks. We applied depth increments that increases with depth in discretizing depth values and then applied Deeplab v2 [1] and the result was higher accuracy. We were able to achieve a state-of-the-art result on the KITTI dataset [2] and outperformed existing architecture by an 8% margin Conclusion We've been able to demonstrate how a depth estimation task can be formulated as a semantic segmentation problem since the two tasks are fundamentally just pixel-level classification tasks at the core. Simply discretizing the depth map and treating the binned depths as classes and applying a state-of-the-art semantic segmentation network can produce a result that outperforms existing results. Table 1: Performance on KITTI. K is KITTI, CS is Cityscapes. 1.25, 1.252 and 1.253 are pre-defined thresholds for Accuracy under threshold metric (d) BENCHMARK PERFORMANCE FURTHER RESEARCH In the future, we would investigate domain gaps in monocular depth maps estimation and apply class balanced self training to attempt to reduce the gap. References [1] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L Yuille. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE transactions on pattern analysis and machine intelligence, 40(4):834–848, 2018. [2] Andreas Geiger, Philip Lenz, Christoph Stiller, and Raquel Urtasun. Vision meets robotics: The kitti dataset. The International Journal of Robotics Research, 32(11):1231–1237, 2013. PROBLEM FORMULATION The vehicle routing problem (VRP) is an important problem class in the field of operations research and transport logistics optimization. Its original formulation has been defined over 60 years ago by [1] and consists of a fleet of identical vehicles serving a set of customers with a certain demand from a single depot and having a certain capacity. This formulation is referred to as the capacitated vehicle routing problem (CVRP). The CVRP can be stated as follows: FIGURE 1: Depth Predictions on KITTI}. The Image on the left column, ground truth in the middle and our model prediction on the right. The color bar below shows the range of depths. Blue is nearer. BENCHMARK METRICS STEADY STATE FIGURE 2: Deeplab [1] . An input image is passed through a Deep Convolutional Neural Network(DCNN) such as ResNet101, using atrous convolution to reduce downsampling. The score map output is then interpolated for upsampling to original image resolution. Low-level details are finally incorporated with the final result through a pre-processing step of fully connected conditional random field (CRF) FIGURE 3: UD (middle) and SID (bottom) to discretize depth interval [a, b] into sub intervals di where i = 0, 1,2,...K and K = number of class. UD is Uniform Discretization, SID is spatially increasing interval discretization where di in d0, d1,..., dK are depth interval boundaries.

Deep Classification Network for Monocular Depth Estimation€¦ · Azeez Oluwafemi 1,2, Yang Zou 2, B.V.K. Vijaya Kumar 2 1.InstaDeep, 2. Carnegie Mellon University ABSTRACT Monocular

Download PDF Report

Upload
others
View
0
Download
0

Embed Size (px)

Citation preview

Deep Classification Network

for Monocular Depth EstimationAzeez Oluwafemi 1,2, Yang Zou 2, B.V.K. Vijaya Kumar 2

1.InstaDeep, 2. Carnegie Mellon University

ABSTRACTMonocular Depth Estimation is usually treated as a supervised and regression

problem when it actually is very similar to semantic segmentation task since they

both are fundamentally pixel-level classification tasks. We applied depth

increments that increases with depth in discretizing depth values and then

applied Deeplab v2 [1] and the result was higher accuracy. We were able to

achieve a state-of-the-art result on the KITTI dataset [2] and outperformed

existing architecture by an 8% margin

ConclusionWe've been able to demonstrate how a depth estimation task can be formulated

as a semantic segmentation problem since the two tasks are fundamentally just

pixel-level classification tasks at the core. Simply discretizing the depth map

and treating the binned depths as classes and applying a state-of-the-art

semantic segmentation network can produce a result that outperforms existing

results.

Table 1: Performance on KITTI. K is KITTI, CS is Cityscapes. 1.25, 1.252 and 1.253 are pre-defined

thresholds for Accuracy under threshold metric (d)

BENCHMARK PERFORMANCE

FURTHER RESEARCHIn the future, we would investigate domain gaps in monocular depth maps

estimation and apply class balanced self training to attempt to reduce the

gap.

References[1] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy,

and Alan L Yuille. Deeplab: Semantic image segmentation with deep

convolutional nets, atrous convolution, and fully connected crfs. IEEE

transactions on pattern analysis and machine intelligence, 40(4):834–848,

2018.

[2] Andreas Geiger, Philip Lenz, Christoph Stiller, and Raquel Urtasun. Vision

meets robotics: The kitti dataset. The International Journal of Robotics

Research, 32(11):1231–1237, 2013.

PROBLEM FORMULATIONThe vehicle routing problem (VRP) is an important problem class in the field of

operations research and transport logistics optimization. Its original formulation

has been defined over 60 years ago by [1] and consists of a fleet of identical

vehicles serving a set of customers with a certain demand from a single depot

and having a certain capacity. This formulation is referred to as the capacitated

vehicle routing problem (CVRP).

The CVRP can be stated as follows:

FIGURE 1: Depth Predictions on KITTI}. The Image on the left column,

ground truth in the middle and our model prediction on the right.

The color bar below shows the range of depths. Blue is nearer.

BENCHMARK METRICS

STEADY STATE

FIGURE 2: Deeplab [1] . An input image is passed through a Deep Convolutional Neural

Network(DCNN) such as ResNet101, using atrous convolution to reduce downsampling. The score

map output is then interpolated for upsampling to original image resolution. Low-level details are

finally incorporated with the final result through a pre-processing step of fully connected conditional

random field (CRF)

FIGURE 3: UD (middle) and SID (bottom) to discretize depth interval [a, b] into sub intervals di where

i = 0, 1,2,...K and K = number of class. UD is Uniform Discretization, SID is spatially increasing

interval discretization

where di in d0, d1,..., dK are depth interval boundaries.

Rendering Portraitures from Monocular Camera and Beyondopenaccess.thecvf.com/content_ECCV_2018/papers/...Rendering Portraitures from Monocular Camera and Beyond Xiangyu Xu1,2 Deqing

Documents

GPS-Denied Indoor and Outdoor Monocular Vision Aided ...daslab.okstate.edu › sites › default › files › backgrounds › ... · 1.1.1 Monocular vision aided inertial navigation

Documents

Linear depth estimation from an uncalibrated, monocular ...cseweb.ucsd.edu/~ravir/ECCV_Polarize.pdfLinear depth estimation from an uncalibrated, monocular polarisation image William

Documents

Robust Large Scale Monocular Visual SLAM

Documents

Monocular Depth Estimation Using Neural Regression Forestweb.engr.oregonstate.edu/~sinisa/research/publications/cvpr16_NRF.pdf · Monocular Depth Estimation Using Neural Regression

Documents

Autonomous Navigation of Generic Monocular …robotics.iiit.ac.in/uploads/Main/Publications/Bipin_etal_ICRA_15.pdf · Autonomous Navigation of Generic Monocular Quadcopter ... is

Documents

Transient Monocular Visual Loss

Documents

Robust Monocular Epipolar Flow Estimationttic.uchicago.edu/~rurtasun/publications/yamaguchi_et_al_cvpr13.pdfRobust Monocular Epipolar Flow Estimation Koichiro Yamaguchi 1;2 David McAllester

Documents

Stereo and Monocular Vision Applied to Computer Aided ... · Stereo and Monocular Vision Applied to Computer Aided Navigation for Aerial and Ground Vehicles Stereo vision Monocular

Documents

Unsupervised Learning of Monocular Depth Estimation and ...openaccess.thecvf.com/content_cvpr_2018/...Learning... · Unsupervised Learning of Monocular Depth Estimation and Visual

Documents

Monocular Deprivation

Documents

Monocular Pedestrian Recognition Using Motion Parallax

Documents

Monocular temporal hemianopia · monocular scotoma with a normal field in the fellow eye.' A junctional scotoma more often consists of a monocular temporal hemianopia with a central

Documents

Monocular-Binocular Visual Acuity of Adults

Documents

Automated Virtual Navigation and Monocular Localization of ...openaccess.thecvf.com/content_cvpr_2018_workshops/papers/w30/… · Automated Virtual Navigation and Monocular Localization

Documents

VIJAYAKUMAR BHAGAVATULA (B.V.K. Vijaya Kumar)kumar/Kumar_CV_files/Bhagavatula_CV.pdf · 38. “Equalization methods for optical data storage,” Quantum Electronics, Shrewesbury,

Documents

Real-time Decentralized Monocular SLAM

Documents

Preparing for London 2012 Olympic Games by Adelakun Oluwafemi A

Documents

MONOCULAR MONOCULAiRe

Documents

Weapon Mounted Mini Thermal Monocular

Documents

Handheld Thermal Monocular Camera

Documents

Generative Adversarial Networks for unsupervised monocular ...vision.deis.unibo.it/~smatt/Papers/3DRW2018/Monogan.pdf · Generative Adversarial Networks for unsupervised monocular

Documents

LEARNING AND INFERENCE ALGORITHMS FOR MONOCULAR

Documents

Visualization of Convolutional Neural Networks for Monocular ...openaccess.thecvf.com/content_ICCV_2019/papers/Hu...Visualization of Convolutional Neural Networks for Monocular Depth

Documents

Learning Monocular Depth Estimation Infusing Traditional ...openaccess.thecvf.com/content_CVPR_2019/papers/...Infusing_Tradi… · Learning monocular depth estimation infusing traditional

Documents

VIJAYAKUMAR BHAGAVATULA (B.V.K. Vijaya Kumar)users.ece.cmu.edu/~kumar/Kumar_CV_files/Bhagavatula_CV.pdf · 2009. 10. 24. · 10/24/2009 p. 1 VIJAYAKUMAR BHAGAVATULA (B.V.K. Vijaya

Documents

Collision Avoidance for Quadrotors with a Monocular Camera · Collision Avoidance for Quadrotors with a Monocular Camera 3 Fig.1: Our goal is to enable a quadrotor with a monocular

Documents

REMODE: Probabilistic, Monocular Dense Reconstruction in ...rpg.ifi.uzh.ch › docs › ICRA14_Pizzoli.pdf · REMODE: Probabilistic, Monocular Dense Reconstruction in Real Time Matia

Documents

High Speed Obstacle Avoidance using Monocular …ai.stanford.edu/~asaxena/rccar/ICML_ObstacleAvoidance.pdfHigh Speed Obstacle Avoidance using Monocular Vision and Reinforcement Learning

Documents

MANAGEMENT - Nigerian Scholars · ct00007 oluwafemi kolawole seun 27. ct00021 oluwafemi ogungbayi akin 28. oluwafemi oluwaseun ayo 29. ct00038 oluwamayowa sotunde c 30. ct00009 opeyemi

Documents