DEEP LEARNING FOR SLOW MOTION VIDEO · 2019-05-24 · DEEP LEARNING CONTAINERS Took learnings from...

Thomas True, Senior Applied Engineer for Professional Video

Andrew Page, Director Advanced Technologies

DEEP LEARNING FOR SLOW MOTION VIDEO

Deep Learning

WHAT IS DEEP LEARNING?

Artificial Intelligence (AI): general coverall for machines doing interesting things

Machine Learning (ML):computers complete tasks without explicit programming

Neural Networks (NN):one technique to achieve ML

Deep Learning (DL):adds “hidden layers” to Neural Networks to solve complex problems

Great explanation: https://goo.gl/hkayWG

The differences between AI, ML & DL

DEEP LEARNING APPLICATION DEVELOPMENT

UntrainedNeural Network

Deep Learning

Framework

TRAININGLearning a new capability

from existing data

Trained ModelNew Capability

App or ServiceFeaturing Capability

INFERENCEApplying this capability

to new data

Trained ModelOptimized for Performance

WHAT PROBLEM ARE YOU SOLVING?Defining the AL/DL Task

QUESTION AI/DL TASK

Is “it” present

or not?Detection

What type of thing

is “it”?Classification

To what extent

is “it” present?Segmentation

What is the likely

outcome? Prediction

What will likely

satisfy the objective?Recommendation

What would be a

new variant?Generation

INPUTSEXAMPLE

OUTPUTS

Text Data Images

AudioVideo

Object Identification(Labeling)

Object Detection

Feature Tracking

Denoised Pixel Values

Animation Pose Selection

Texture Creation

WHAT IS “SUCCESSFUL” DEEP LEARNING?

What characteristics do successful deep learning applications share?

1. Define the problem; Make sure DL is appropriate

2. Collect a lot of the right data (labelled data is even better)

3. Ensure ROI; choose a large task

Many things but not “all the things!”

Slow Motion Video

Defining the Problem

MULTI-FRAME VARIABLE LENGTH INTERPOLATIONInterpolate an Arbitrary Number of Frames Between Two Input Frames

T = 0 T = 1

? ? ? ?

[Mahajan et al. SIGGRAPH’09]

SINGLE-FRAME INTERPOLATIONInterpolate Only the Single Frame Inbetween

T = 0 T = 1T = 0.5

[Long et al. ECCV 2016, Niklaus et al. CVPR 2017, Liu et al. ICCV 2017]

RECURSIVE SINGLE-FRAME INTERPOLATIONNo Time Order for Real-Time Applications

T = 0 T = 1Step 1Step 2 Step 2

RECURSIVE SINGLE-FRAME INTERPOLATIONInefficient for Arbitrary Number of Frames

T = 0 T = 1T = 1/3 T = 2/3

MULTI-FRAME VARIABLE LENGTH INTERPOLATIONCan Interpolate in Correct Time Order

T = 0 T = 1

? ? ? ?

MULTI-FRAME VARIABLE LENGTH INTERPOLATIONCan Interpolate in Reverse Time Order

T = 0 T = 1

? ? ? ?

MULTI-FRAME VARIABLE LENGTH INTERPOLATIONCan Interpolate All at Once

T = 0 T = 1

? ? ? ?

MULTI-FRAME VARIABLE LENGTH INTERPOLATIONCan Interpolate Only Selected Frames

T = 0 T = 1

? ? ? ?

T = 1/3 T = 2/3

Super SloMo

SUPER SLOMONVIDIA Research Published at IEEE CVPR 2018

https://arxiv.org/pdf/1712.00080.pdf

Huaizu Jiang, Deqing Sun, Varun Jampani, Ming-Hsuan Yang, Erik Learned-Miller, Jan

Kautz. Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation. IEEE

CVPR 2018

SUPER SLOMOFrame Synthesis Using Optical Flow

T = 0 T = 1T = t ∈(0,1)

SUPER SLOMOVisibility Information

T = 0 T = 1T = t

𝜶 = 1 𝜶 = 0

OPTICAL FLOWImage Motion Correspondence

Color key

Baker et al. IJCV’11

Input Optical flow

Data credit: Liu et al. CVPR’08

OPTICAL FLOWCNNs Estimates Flow Between Input Frames

inputimages

bi-directflow

optical flow computation between inputs

FlowNet, FlowNet2, SpyNet, PWC-Net…

SUPER SLOMO OPTICAL FLOWChallenge: Flow from Unknown to Input

T = 0 T = 1T = t

SUPER SLOMO OPTICAL FLOWSimplifed 1D Setting

𝑇 = 0 𝑇 = 1

𝐹1→0

𝐹0→1

SUPER SLOMO OPTICAL FLOWGoal: Compute Motion Vector that Combines Forward and Backward Motion

𝑇 = 0 𝑇 = 𝑡 𝑇 = 1

𝐹𝑡→1

𝐹𝑡→0

SUPER SLOMO OPTICAL FLOWSolution: Borrow from Bi-Directional Flow

𝑇 = 0 𝑇 = 𝑡 𝑇 = 1

𝐹𝑡→0

𝐹1→0

𝑡𝐹1→0

Same spatial location

SUPER SLOWMO OPTICAL FLOWSolution: Borrow from the Bi-Directional Optical Flow

𝑇 = 0 𝑇 = 𝑡 𝑇 = 1

𝐹0→1

(1 − 𝑡)𝐹0→1

𝐹𝑡→0

𝑇 = 0 𝑇 = 𝑡 𝑇 = 1

𝐹𝑡→0

𝐹0→1

(𝑡 − 1)𝐹0→1

SUPER SLOWMO OPTICAL FLOWSolution: Borrow from the Bi-Directional Optical Flow

SUPER SLOWMO OPTICAL FLOWBilinear Combination of Two Approximations

𝑇 = 0 𝑇 = 𝑡 𝑇 = 1(𝑡 − 1)𝐹0→1

𝑡𝐹1→0

𝐹𝑡→0 ≈ 𝑡 𝑡𝐹1→0 + 1 − 𝑡 (𝑡 − 1)𝐹0→1

𝐹𝑡→0

SUPER SLOMO NETWORK ARCHITECTUREAdd Flow Approximation

inputimages

bi-directflow

optical flow computation between input

flowapprox

Deterministic, parameter-free

𝐹𝑡→0 ≈ 𝑡 𝑡𝐹1→0 + 1 − 𝑡 (𝑡 − 1)𝐹0→1

ASSUMPTION: SMOOTH MOTION FIELD

𝑇 = 0 𝑇 = 𝑡 𝑇 = 1

𝐹1→0

𝐹0→1

OPTICAL FLOWMotion Fields are Piecewise Smooth

Ground truth

optical flow

MPI-Sintel [Butler et al. ECCV’12] HAMA [Liu et al. CVPR’08] Middlebury [Baker et al. ICCV’07]

PROBLEM: MOTION BOUNDARIES

𝑇 = 0 𝑇 = 𝑡 𝑇 = 1

𝐹0→1

Motion boundary

𝑇 = 0 𝑇 = 𝑡 𝑇 = 1

𝐹1→0

𝐹0→1

Motion boundary

𝑇 = 0 𝑇 = 𝑡 𝑇 = 1

𝐹1→0

𝐹0→1

𝐹𝑡→0

(𝑡 − 1)𝐹0→1

𝑡𝐹1→0

Motion boundary

FLOW REFINEMENT

flowapprox

Flow refinement module

refinedflow

input images

warped images

SUPER SLOMO OPTICAL FLOWRefinement Near Motion Boundaries

intermediate

flow approximations

intermediate

refined flowdifferenceinput

FLOW REFINEMENT AND VISIBILITY REASONING

flowapprox

Flow refinement module

refinedflow

visibilitymaps

input images

warped images

prediction

intermediate optical flow

inputimages

warped input images

visibilitymaps

predictionw/ visibility maps

predictionw/o visibility maps

መ𝐼𝑡 = 𝛼𝑔 𝐼0, 𝐹𝑡→0 + 1 − 𝛼 𝑔(𝐼1, 𝐹𝑡→1)

𝐹𝑡→0

𝐹𝑡→1

𝑔 𝐼0, 𝐹𝑡→0

𝑔(𝐼1, 𝐹𝑡→1)

1 − 𝛼

መ𝐼𝑡

g: warping function

PREDICTION USE FLOW AND VISIBILITY MAP

PREDICTION USING FLOW AND VISIBILITY MAP

intermediate optical flow

inputimages

warped input images

visibilitymaps

መ𝐼𝑡 = 𝛼𝑔 𝐼0, 𝐹𝑡→0 + 1 − 𝛼 𝑔(𝐼1, 𝐹𝑡→1)

𝐹𝑡→0

𝐹𝑡→1

𝑔 𝐼0, 𝐹𝑡→0

𝑔(𝐼1, 𝐹𝑡→1)

1 − 𝛼

predictionw/ visibility maps

predictionw/o visibility maps

g: warping function

OVERALL NETWORK ARCHITECTURECombination of Two CNNs

flowapprox

refinedflow

visibilitymaps

inputimages

bi-directflow

prediction

at each time step t

optical flow computation between input arbitrary-time image synthesis

input images

warped images

TRAINING DATASET

YouTube 240-fps Adobe 240-fps*

*provided by Oliver Wang

GENERATIVE ADVERSARIAL NETWORK (GAN)G tries to fool D

The Art CriticThe Art Forger

PERFORMANCERunning Time (PyTorch)

Training (6 V100) Inference (1 V100)

~1000 videos, 0.3M frames:

10 days

7 1280*720 images:

•@Thomas True•Have we done any perf analysis of the existing GPUD4V implementations on Matrox at 8k to see where the bottlenecks actually are?

EVS AI ENGINE

AI generated

Regular camera

Artificial Intelligence hallucinated intermediate images

Regular

Slow Motion

AI Engine

EVS AI ENGINECreating On-Demand Slow Motion

REPLAY

SERVER

AI ENGINE

IN = X VIDEO

FRAMES

OUT = NX VIDEO

FRAMES

Commands

Instructions are sent

to the AI engine

Normal speed

baseband video

or file

Slow motion video

baseband feed

or file

PreviewINSERTION

OF TC, …

LTC, UMD, …

information

EVS AI ENGINEMimicking a Supermotion Camera

CAMERA

AI ENGINE

IN = X VIDEO

FRAMES

OUT = NX VIDEO

FRAMES

Normal speed

baseband

Slow motion

video in

multiple

phases

REPLAY

SERVER

Commands

NVIDIA NGX

NVIDIA NGX: DL FOR CREATIVE APPLICATIONS

Delivering Deep Learning Research Into Creative Applications

• The framework to make NV DL research available in end user applications through a common service interface

• DL Features: SloMo, Up-Res, InFilling, DLAA …

• Created, trained and updated by NVIDIA, will be available to any application running on an NGX GPU

• NGX SDK - ISVs can add these DL features to their application without going through the expense of developing and training https://developer.nvidia.com/rtx/ngx

Results

QUALITY ISSUESMotion Blur

Objects

move too

Motion blur

EXPOSURE TIME

SHARPENING TECHNIQUES

QUALITY ISSUESMissing Information

Objects

move too

Missing

information

SIGNIFICANTLY REDUCED

BY DOMAIN SPECIFIC TRAINING

QUALITY ISSUESMissing Information

HOW DO WE IMPROVE THE QUALITY?

More training data with variety of frame rates from 30 fps to 240 fps

Application Specific Training (Sports, Drama, etc.)

Post process filtering

More Training

Getting Started in DL

DEEP LEARNING CONTAINERS

Took learnings from scores of teams across NVIDIA and industry

Built containers to speed and ease deployment

Goals:

• Get engineers working on Deep Learning in minutes

• Ensure full GPU acceleration and latest

optimizations on all frameworks

• Isolate frameworks (some don’t play nice)

• Share, collaborate, and test applications across

different environments

VIRTUAL MACHINES VS. CONTAINERSMotivations

Packaging mechanism for applications

► Consistent and reproducible deployment

► Lightweight with faster startup than VMs

► Logical isolation from other applications (at the OS level)

No first-class support for GPUs was available in runtimes

Image credits

GPU SUPPORT IN DOCKER: 1.0

Open-source Project on GitHub since 2016>2MM downloads

>7.5K stars

>13 MM pulls of CUDA images from DockerHub

Enabled Various Use-cases► NGC optimized containers from NVIDIA

► Adopted by major deep learning frameworks

NVIDIA GPU CLOUDDeep Learning Across Platforms

NVIDIA TITAN (powered by NVIDIA

Volta or Pascal)

NVIDIA DGX-1, DGX-2 and

DGX Station

Amazon EC2 P3 instances with NVIDIA Volta

Use across workstation, local servers and cloud

To learn more about NVIDIA GPU Cloud, visit:https://nvidia.com/cloud

To sign up, go to:https://nvidia.com/ngcsignup

NVIDIA GPU CLOUDDownloadable Containers

DEEP LEARNING FOR SLOW MOTION VIDEO · 2019-05-24 · DEEP LEARNING CONTAINERS Took learnings from...

Documents

Bright Cluster Manager: Using the NVIDIA NGC Deep Learning Containers · 2018-10-30 · E_Masthead_Secondary_2-column_v2 Using the NVIDIA NGC Deep Learning Containers Run Tensorflow

Deep Learning: An Introduction from the NLP Perspective Deep Learning - An...Deep Learning Approach 1 Deep Learning Approach 2 Disclaimer I am not (yet) an expert in Deep Learning

CONTAINERS DEMOCRATIZE HPC - Nvidia...S9525 - Containers Democratize HPC TU 1-2 S9500 - Latest Deep Learning Framework Container Optimizations W 9-10 SE285481 - NGC User Meetup W 7-9

Executing Deep Learning Strategies Masterclass Preview - Enterprise Deep Learning

Tutorial: Deep Reinforcement Learning - Machine Learning ...hunch.net/~beygel/deep_rl_tutorial.pdfTutorial: Deep Reinforcement Learning - Machine Learning

An Introduction to Deep Learning · Deep Learning An Introduction to ... •Deep Learning •Feed Forward Network •Problems of Deep Learning •AutoEncoders •Restricted Boltzmann

Parallel File System for AI and Deep Learning · #scalable #mds_integrated #local_lustre_cache #lustre_aware designed_for_ai #machine_learning_oriented #containers #read_write #rh_alternative

UNSUPERVISED DEEP LEARNING - TAUweb.eng.tau.ac.il/.../uploads/2016/12/Unsupervised-Deep-Learning.pdf · UNSUPERVISED DEEP LEARNING Erez Aharonov Noam Eilon Deep Learning Seminar School

수백만사용자대상... · Amazon ECS 및Amazon SageMak er 등다양한컨테이너환경사용가능 AWS Deep Learning Containers TensorFlow (8) MXNet (8) Training (4) GPU with

High Performance Deep Learning Clusterson-demand.gputechconf.com/supercomputing/...Essential deep learning tools for data scientists, researchers and engineers. 22 NGC Containers We

DEEP PLANTERS & CONTAINERS - rooflite soil · 2019. 3. 14. · • Deep planters & containers are essentially small scale rooftop gardens that have a total depth of more than 18 inches

Intro to Deep Learning - bpTT Bright Minds … · 17/08/2020 · Intro to Deep Learning Deep Sequence Modeling Deep Computer Vision Deep Generative Modeling Deep Learning Applications

Docker Containers Deep Dive

Deep Learning for Laser Based Odometry Estimation - …juxi.net/workshop/deep-learning-rss-2016/papers/Nicolai - Deep... · Deep Learning for Laser Based Odometry Estimation ... deep

THE DEEP LEARNING REVOLUTION - Nvidia...THE DEEP LEARNING REVOLUTION Superhuman Breakthroughs in Modern Artiﬁcial Intelligence Powered by GPUs JOIN THE DEEP LEARNING ERA DEEP LEARNING

From Reinforcement Learning to Deep Reinforcement …fagostin/assets/files/...Keywords: Machine learning · Reinforcement learning Deep learning · Deep reinforcement learning 1 Introduction

AWS Deep Learning Containers · AWS Deep Learning Containers (Deep Learning Containers) are a set of Docker images for training and serving models in TensorFlow, TensorFlow 2, PyTorch,

20151216 Computer Vision Meetup Deep Learning Deep Learning - Rene...Deep Learning . René Donner Deep Learning Overview 2 ... coursera ... 20151216 Computer Vision Meetup Deep Learning

Bringing the best of Containers and Machine Learning together · models that solve business problems. Use techniques such as Machine Learning or Deep Learning and works with Tanya

Deep Learning for Generic Object Detection: A Survey · deep learning, probabilistic models, autoencoders, manifold learning, and deep networks 17 Deep learning LeCun et al. (2015)