
Picture: a probabilistic programming language for scene perception

Tejas D Kulkarni1, Pushmeet Kohli2, Joshua B Tenenbaum1, Vikash Mansinghka1

1Brain and Cognitive Science, Computer Science and Artificial Intelligence Lab, Massachusetts Institute of Technology. 2Microsoft Research Cambridge.

Probabilistic scene understanding systems aim to produce high-probability descriptions of scenes conditioned on observed images or videos, typically either via discriminatively trained models or generative models in an "analysis by synthesis" framework. Discriminative approaches lend themselves to fast, bottom-up inference methods and relatively knowledge-free, data-intensive training regimes, and have been remarkably successful on many recognition problems. Generative approaches hold out the promise of analyzing complex scenes more richly and flexibly, but have been less widely embraced for two main reasons: inference typically depends on slower forms of approximate inference, and both model-building and inference can involve considerable problem-specific engineering to obtain robust and reliable results. These factors make it difficult to develop simple variations on state-of-the-art models, to thoroughly explore the many possible combinations of modeling, representation, and inference strategies, or to richly integrate complementary discriminative and generative modeling approaches to the same problem. More generally, to handle increasingly realistic scenes, generative approaches will have to scale not just with respect to data size but also with respect to model and scene complexity. This scaling will arguably require general-purpose frameworks to compose, extend, and automatically perform inference in complex structured generative models – tools that for the most part do not yet exist.

Here we present Picture, a probabilistic programming language [1] that aims to provide a common representation language and inference engine suitable for a broad class of generative scene perception problems. We see probabilistic programming as key to realizing the promise of "vision as inverse graphics". Generative models can be represented via stochastic code that samples hypothesized scenes and generates images given those scenes. Rich deterministic and stochastic data structures can express complex 3D scenes that are difficult to manually specify. Multiple representation and inference strategies are specifically designed to address the main perceived limitations of generative approaches to vision. Instead of requiring photorealistic generative models with pixel-level matching to images, we can compare hypothesized scenes to observations using a hierarchy of more abstract image representations such as contours, discriminatively trained part-based skeletons, or deep neural network features. Top-down Monte Carlo inference algorithms include not only traditional Metropolis-Hastings, but also more advanced techniques for inference in high-dimensional continuous spaces, such as elliptical slice sampling and Hamiltonian Monte Carlo, which can exploit the gradients of automatically differentiable renderers. These top-down inference approaches are integrated with bottom-up, automatically constructed data-driven proposals, which can dramatically accelerate inference by eliminating most of the "burn-in" time of traditional samplers and enabling rapid mode-switching.
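To make the analysis-by-synthesis loop concrete, the following is a minimal toy sketch (not the Picture system itself): the "scene" is just the position and width of a 1D Gaussian blob, the "renderer", feature map nu, and grid-scan initialization are all invented stand-ins for the paper's approximate renderer, representation layer, and data-driven proposals. It illustrates the pattern of bottom-up initialization followed by top-down Metropolis-Hastings refinement in feature space.

```python
import numpy as np

rng = np.random.default_rng(0)

def render(scene, size=32):
    # Toy "approximate renderer": rasterize a 1D Gaussian blob whose
    # position x and width r are the latent scene variables.
    grid = np.arange(size)
    return np.exp(-0.5 * ((grid - scene["x"]) / scene["r"]) ** 2)

def nu(image):
    # Toy "representation layer": coarse local averages standing in
    # for contour / part-based / deep-network features.
    return image.reshape(-1, 4).mean(axis=1)

def log_likelihood(observed, scene, sigma=0.05):
    # Comparator: Gaussian likelihood on feature-space differences.
    d = nu(observed) - nu(render(scene))
    return -0.5 * np.sum((d / sigma) ** 2)

observed = render({"x": 20.0, "r": 3.0})   # stands in for the observed image

# Bottom-up ("data-driven") initialization: a coarse grid scan over scenes.
candidates = [{"x": float(x), "r": r} for x in range(0, 32, 4) for r in (2.0, 4.0)]
scene = max(candidates, key=lambda s: log_likelihood(observed, s))
ll = log_likelihood(observed, scene)

# Top-down refinement: random-walk Metropolis-Hastings over the scene trace.
for _ in range(2000):
    prop = {k: v + rng.normal(0.0, 0.3) for k, v in scene.items()}
    prop["r"] = abs(prop["r"]) + 1e-3       # keep the blob width positive
    ll_prop = log_likelihood(observed, prop)
    if np.log(rng.uniform()) < ll_prop - ll:
        scene, ll = prop, ll_prop
```

After refinement, the sampled scene concentrates near the ground truth (x = 20, r = 3); swapping the random-walk proposal for a gradient-based one (HMC on a differentiable renderer) or an elliptical slice move changes only the proposal step, which is the modularity the paragraph above describes.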

We demonstrate Picture on three challenging vision problems: inferring the 3D shape and detailed appearance of faces, the 3D pose of articulated human bodies, and the 3D shape of medially-symmetric objects. The vast majority of code for image modeling and inference is reusable across these and many other tasks. We show that Picture yields competitive performance with optimized baselines on each of these benchmark tasks.

[1] Noah Goodman, Vikash Mansinghka, Daniel Roy, Keith Bonawitz, and Daniel Tarlow. Church: a language for generative models. arXiv preprint arXiv:1206.3255, 2012.

[2] Richard David Wilkinson. Approximate Bayesian computation (ABC) gives exact results under the assumption of model error. Statistical Applications in Genetics and Molecular Biology, 12(2):129–141, 2013.

[3] David Wingate, Noah D. Goodman, A. Stuhlmueller, and J. Siskind. Nonstandard interpretations of probabilistic programs for efficient inference. NIPS, 23, 2011.

This is an extended abstract. The full paper is available at the Computer Vision Foundation webpage.

[Figure 1 diagram: (a) Scene Language → Approximate Renderer → Representation Layer ν(·) (deep neural net, contours, skeletons, or pixels) → likelihood or likelihood-free comparator λ(ν(ID), ν(IR)), with scene trace Sρ, rendering-tolerance latents Xρ, and observed image ID. (b) Inference engine: given the current (Sρ, Xρ) and image ID, automatically produces MCMC, HMC, elliptical slice, and data-driven proposals qP, qhmc, qslice, qdata mapping (Sρ, Xρ) → (S′ρ, X′ρ). (c) Random samples drawn from example probabilistic programs: 3D face, 3D object, and 3D human-pose programs. (d) Observed images with inferred reconstructions, the inferred models re-rendered with novel poses and novel lighting, and comparisons of Picture against baselines.]
Figure 1: Overview: (a) All models share a common template; only the scene description S and image ID change across problems. Every probabilistic graphics program f defines a stochastic procedure that generates both a scene description and all the other information needed to render an approximation IR of a given observed image ID. The program f induces a joint probability distribution on these program traces ρ. Every Picture program has the following components. Scene Language: describes 2D/3D scenes and generates particular scene-related trace variables Sρ ∈ ρ during execution. Approximate Renderer: produces a graphics rendering IR given Sρ and latents Xρ for controlling the fidelity or tolerance of rendering. Representation Layer: transforms ID or IR into a hierarchy of coarse-to-fine image representations ν(ID) and ν(IR) (deep neural networks, contours, and pixels). Comparator: during inference, IR and ID can be compared using a likelihood function or a distance metric λ (as in Approximate Bayesian Computation [2]). (b) Inference Engine: automatically produces a variety of proposals and iteratively evolves the scene hypothesis S to reach a high-probability state given ID. (c) Representative random scenes drawn from probabilistic graphics programs for faces, objects, and bodies. (d) Illustrative results: we demonstrate Picture on a variety of 3D computer vision problems and check their validity with respect to ground truth annotations and task-specific baselines.
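The caption's likelihood-free comparator with a rendering tolerance can be sketched as follows; this is an illustrative ABC-style stand-in, not code from the paper, and the Gaussian kernel on feature distance is one common choice whose bandwidth plays the role of the tolerance latent X.

```python
import numpy as np

def abc_log_score(nu_obs, nu_ren, tolerance):
    # Likelihood-free comparator in the style of ABC [2]: score the
    # hypothesis by the feature-space distance between the observed
    # image's features nu_obs and the rendered image's features nu_ren,
    # through a Gaussian kernel whose bandwidth is the rendering
    # tolerance X (a latent that inference can also adapt).
    d = np.linalg.norm(nu_obs - nu_ren)
    return -0.5 * (d / tolerance) ** 2 - np.log(tolerance)

# A loose tolerance scores a slightly mismatched render almost as well
# as an exact one; a tight tolerance penalizes the same mismatch heavily.
obs = np.array([1.0, 0.5, 0.2])
ren = np.array([0.9, 0.6, 0.2])
loose = abc_log_score(obs, ren, tolerance=1.0)
tight = abc_log_score(obs, ren, tolerance=0.01)
assert loose > tight
```

Treating the tolerance as a latent lets inference start with coarse, forgiving comparisons (fast mode exploration) and anneal toward strict ones, which is one reading of the "fidelity or tolerance of rendering" controlled by Xρ in the caption.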
