Visual Perception of 3D Shape
Roland W. Fleming, Max Planck Institute for Biological Cybernetics
Manish Singh, Rutgers University – New Brunswick

  • Slide 1

Visual Perception of 3D Shape
Roland W. Fleming, Max Planck Institute for Biological Cybernetics
Manish Singh, Rutgers University – New Brunswick

  • Slide 2

  • Slide 3

The problem of 3D perception
Bishop Berkeley (1685-1753): "It is I think agreed by all that distance of itself, and immediately, cannot be seen. For distance being a line directed end-wise to the eye, it projects only one point in the fund of the eye, which point remains invariably the same whether the distance be longer or shorter."
[Figure labels: P1, P2, P]

  • Slide 4

The problem of 3D perception
The optics of the eye project the 3D world onto a 2D image plane on the retina. What we as behaving organisms care about is the 3D structure of the world. Unfortunately, the projection from 3D to 2D is not invertible.
[Figure labels: Image (2D), World (3D)]

  • Slide 5

The problem of 3D perception
    - Multiple surfaces are consistent with any given image, so 3D shape perception is fundamentally ambiguous.
    - It is an inference from incomplete information.

  • Slide 6

Ambiguities in 3D Perception
Necker Cube: 2 dominant interpretations

  • Slide 7

Ambiguities in 3D Perception
2 dominant interpretations
Only a handful of legal interpretations are generally experienced. Why?
Note that neither of these two interpretations is a correct perspective projection!

  • Slide 8

Philosophical Schools
Constructivism (e.g. Helmholtz, Gregory, Rock)
    - vision is ill-posed: sensory data are impoverished
    - the world we see is a construction
    - perception is a process of inductive inference
    - extra-retinal information and assumptions about the world play a central role
Direct Perception (e.g. Gibson)
    - the ambient optic array contains sufficient information to support action
    - we perceive the world directly, through active interaction
    - the relevant information is global and comparative

  • Slide 9

Philosophical Schools
Gestalt Perception (e.g.
Koffka, Metzger, Köhler)
    - vision is all about structure
    - the interpretation that we experience is determined by the interaction of simple rules describing the organization of the interpretation
    - the simplest interpretation is favoured: Prägnanz
[Figure label: time]

  • Slide 10

Explaining the Necker Cube
2 dominant interpretations
    - Constructivism: the percepts are the most probable interpretations.
    - Direct Perception: the relevant image information specifies these interpretations, but such ambiguous images are rarely encountered in the real world, and we normally resolve the ambiguity through interaction.
    - Gestalt: the percepts are the simplest, most orderly interpretations.

  • Slide 11

Perception Pipeline
[Diagram labels: image]

  • Slide 12

Perception Pipeline
[Diagram labels: image; cues: shading, texture]

  • Slide 13

Perception Pipeline
[Diagram labels: image; cues: shading, texture; shape estimate (twice)]

  • Slide 14

Perception Pipeline
[Diagram labels: image; cues: shading, texture; priors; shape estimate (twice)]
Priors:
    - Surfaces are generally smooth
    - Texture tends to be isotropic
    - Light usually comes from above

  • Slide 15

Generic Viewpoint Assumption
Koenderink & van Doorn (1979). Binford (1981). Freeman (1994).

  • Slide 16

Image-based material editing
Khan, Reinhard, Fleming & Bülthoff (2006). Transactions on Graphics: Proceedings of SIGGRAPH '06. ACM SIGGRAPH.
[Figure labels: transparency, re-textured]
    - Given a single photograph as input, modify the material appearance of an object.
    - A physically correct solution is not possible: aim for a perceptually correct solution.
    - Exploit assumptions of human vision to develop heuristics.
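The cues-plus-priors pipeline of Slides 11 to 14 can be illustrated with a small numerical sketch. One common way to formalise it, which the slides themselves do not commit to, is to treat each cue and the prior as an independent Gaussian estimate of the same quantity (here, surface slant in degrees) and fuse them by precision weighting. The function name `combine_gaussian` and all numbers below are invented for illustration.

```python
def combine_gaussian(estimates):
    """Fuse independent Gaussian estimates given as (mean, variance) pairs.

    Each estimate is weighted by its precision (1 / variance); the fused
    variance is the inverse of the summed precisions.
    """
    precisions = [1.0 / var for _, var in estimates]
    total = sum(precisions)
    mean = sum(m * p for (m, _), p in zip(estimates, precisions)) / total
    return mean, 1.0 / total


# Hypothetical slant estimates in degrees: (mean, variance).
shading_cue = (40.0, 100.0)   # assumed to be the less reliable cue
texture_cue = (30.0, 25.0)    # assumed to be the more reliable cue
prior = (0.0, 400.0)          # assumed weak prior favouring fronto-parallel surfaces

fused_mean, fused_var = combine_gaussian([shading_cue, texture_cue, prior])
print(f"fused slant: {fused_mean:.1f} deg, variance: {fused_var:.1f}")
```

Under these assumptions the fused estimate lies closest to the most reliable cue, and its variance is smaller than that of any single input, which is one way to read the claim that several weak cues together give a more stable shape estimate.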
  • Slide 17

Crude Shape Reconstruction
Light from the side: shadows and the intensity gradient lead to substantial distortions of the face.
[Figure labels: original, reconstructed depths]

  • Slide 18

Importance of viewpoint
Substantial errors in depth reconstruction are not visible in the transformed image.
[Figure labels: transformed image, correct viewpoint]

  • Slide 19

Importance of viewpoint

  • Slide 20
  • Slide 21
  • Slide 22
  • Slide 23
  • Slide 24
  • Slide 25

  • Slide 26

Seen from Above

  • Slide 27
  • Slide 28

  • Slide 29

Hollow Mask Illusion
Convexity and familiarity combine to yield a strong sense that the mask is convex, even when it is concave. But note that the apparent lighting and shape are different.
[Figure labels: convex, concave, transition]

  • Slide 30

Bas-Relief Ambiguity
Scenes related to one another by an affine transformation are indistinguishable from one another.
Belhumeur, Kriegman & Yuille (1997)

  • Slide 31

Bas-Relief Ambiguity
Scenes related to one another by an affine transformation are indistinguishable from one another.
Belhumeur, Kriegman & Yuille (1997)

  • Slide 32

Bas-Relief Ambiguity
    - Belhumeur, Kriegman & Yuille (1997) showed that shape from shading information is fundamentally ambiguous. For direct illumination, scenes that are related to one another by an affine transformation (scaling + shearing) yield pixel-for-pixel identical images.
    - Despite this, we rarely experience any ambiguity in the perception of shaded objects. Everyday perception gives us the impression that we see objects in a correct and stable way. But do we?
    - Koenderink and colleagues have shown that perceived shape varies considerably from day to day, with the percepts typically related to one another by an affine transformation.
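The bas-relief ambiguity can be checked numerically. The sketch below assumes a Lambertian height field z = f(x, y) lit by a distant source, ignores attached and cast shadows, and applies the depth transformation z' = λz + μx + νy (the scaling and shearing mentioned above). It also assumes, following my reading of the result, that the albedo and light vector change along with the surface. The function names `render` and `bas_relief` and all parameter values are illustrative, not taken from the slides.

```python
import numpy as np


def render(fx, fy, albedo, light):
    """Lambertian image of a height field, given its partial derivatives.

    Shadows are ignored: the dot product is not clamped at zero.
    """
    normals = np.stack([-fx, -fy, np.ones_like(fx)], axis=-1)
    normals /= np.linalg.norm(normals, axis=-1, keepdims=True)
    return albedo * (normals @ light)


def bas_relief(fx, fy, albedo, light, lam, mu, nu):
    """Apply z' = lam*z + mu*x + nu*y with matching albedo and light changes.

    Assumes lam > 0.
    """
    fx2, fy2 = lam * fx + mu, lam * fy + nu
    norm_old = np.sqrt(fx**2 + fy**2 + 1.0)
    norm_new = np.sqrt(fx2**2 + fy2**2 + 1.0)
    albedo2 = albedo * norm_new / (lam * norm_old)
    light2 = np.array([light[0], light[1],
                       mu * light[0] + nu * light[1] + lam * light[2]])
    return fx2, fy2, albedo2, light2


# A Gaussian bump with analytic derivatives.
y, x = np.mgrid[-2:2:64j, -2:2:64j]
f = np.exp(-(x**2 + y**2))
fx, fy = -2 * x * f, -2 * y * f
albedo = np.ones_like(f)
light = np.array([0.3, 0.2, 1.0])   # direction and strength, arbitrary choice

image_a = render(fx, fy, albedo, light)
flattened = bas_relief(fx, fy, albedo, light, lam=0.5, mu=0.2, nu=-0.1)
image_b = render(*flattened)
print(np.allclose(image_a, image_b))
```

The flattened, sheared surface has different gradients everywhere, yet the two rendered images agree to floating-point precision, so shading alone cannot distinguish the two scenes.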
  • Slide 33

Light from Above
In the absence of other information to indicate shape or lighting direction, the brain assumes light comes from above.
[Figure labels: light from below, light from above]

  • Slide 34

Light from Above
In the absence of other information to indicate shape or lighting direction, the brain assumes light comes from above.
[Figure labels: light from below, light from above]

  • Slide 35

Linear Perspective

  • Slide 36
  • Slide 37

  • Slide 38

Bounding Contours
Dejan Todorović, 2009. Adapted and used with permission.

  • Slide 39

Bounding Contours
Dejan Todorović, 2009. Adapted and used with permission.

  • Slide 40

Bounding Contours

  • Slide 41

Structure from Motion
Individual frames carry a relatively weak sense of 3D shape. It is only through optic flow (motion) that the shape is revealed.

  • Slide 42

  • Slide 43

Shape from Texture
The pattern of compressions and rarefactions across the image indicates something about the 3D shape.

  • Slide 44

Shape from Texture
Isotropic compression of textures due to distance

  • Slide 45

Shape from Texture
Anisotropic compression of textures due to slant

  • Slide 46

Anisotropic compression of textures due to slant

  • Slide 47

Shape from Texture
Anisotropic compression of textures due to slant

  • Slide 48

    - Anisotropic compression specifies surface orientation up to a 180° ambiguity in the surface tilt. This means we can experience perceptual flips (bistability) when there are no other cues to specify convexity vs. concavity.
    - Under orthographic projection, there is no isotropic compression and no convergence, so we can see the red line as lying either on a ridge or in a valley.

  • Slide 49

Under perspective projection, isotropic compression (scale gradient) and convergence cues resolve the ambiguity. We experience the red line as lying on a ridge, and not in a valley.

  • Slide 50

Homogeneous: the statistics of the texture are uniform from location to location.
This is necessary to ensure that changes in the statistics of the texture observed in the image are due solely to the process of projection into the image plane and are not intrinsic to the texture itself.
Isotropic: the texture does not have a dominant local orientation. This is necessary to ensure that anisotropic compressions are aligned with the depth gradient of the surface.
Assumptions in Shape from Texture

  • Slide 51

Illusory distortions of shape
Inspired by Todd & Thaler, VSS '05

  • Slide 52

Illusory distortions of shape

  • Slide 53

Illusory distortions of shape
Inspired by Todd & Thaler, VSS '05

  • Slide 54
  • Slide 55
  • Slide 56

  • Slide 57

Interaction of light with surface

  • Slide 58

[Figure labels: Matte, Glossy, Mirrored]

  • Slide 59

Confounding Effects of Illumination
    - Identical materials can lead to very different images.
    - Different materials can lead to very similar images.
Images: Ron O. Dror. All rights reserved.

  • Slide 60

Ambiguity between Illumination and Shape

  • Slide 61

Classical Shape from Shading
Visual system estimates surface orientation from image intensity.
[Figure labels: reflectance map, image]

  • Slide 62

Classical Shape from Shading
[Figure label: reflectance map]
    - Image intensity is a scalar but surface orientation is a vector.
    - Recovering orientation from intensity is under-constrained.
    - A large amount of computer vision research proposes ways to reduce this ambiguity.
Problem: image intensity is ambiguous.

  • Slide 63

Classical Shape from Shading
Visual system estimates surface orientation from image intensity.
[Figure label: reflectance map]
    - Circular logic: estimating the reflectance map requires knowing the geometry.
    - Under typical viewing conditions, it is unclear how well subjects can estimate the reflectance map.
Problem: reflectance map is unknown.

  • Slide 64

Classical Shape from Shading
Visual system estimates surface orientation from image intensity.
[Figure label: reflectance map]
    - There is no principled way of predicting when human shape perception should succeed or fail.
    - Successes are attributed to correct estimation of the reflectance map, errors to incorrect estimates of the reflectance map. But why and when should this occur?
Problem: predicting human perception.

  • Slide 65

Alternative approach
    - Use image measurements other than intensity.
    - Use the kinds of image measurements the visual system employs at the front end.
[Figure labels: reflectance map, image]

  • Slide 66

Mirrors
    - No stereopsis
    - No diffuse shading
    - No texture
Nothing but a distorted reflection of the world surrounding the object! Yet we perceive the 3D shape. How?
Fleming, Torralba & Adelson (2004). Journal of Vision.

  • Slide 67

Curvatures determine distortions
[Figure label: highly curved]

  • Slide 68

Curvatures determine distortions
[Figure label: slightly curved]
Anisotropies in surface curvature lead to powerful distortions of the reflected world.

  • Slide 69

Eigenvectors of Hessian matrix
Intrinsic principal curvatures

  • Slide 70

[Figure labels: image, depths]

  • Slide 71

Population codes

  • Slide 72

Orientation fields
Ground truth

  • Slide 73

3D shape appears to be conveyed by the continuously varying patterns of orientation across the image of a surface.

  • Slide 74

Beyond specularity
[Figure labels: Specular reflection, Diffuse reflection]

  • Slide 75

Orientations in shading

  • Slide 76

Orientation fields in shading

  • Slide 77

Reflectance as Illumination
Mirrors in an increasingly blurry world

  • Slide 78

[Figure label: highly curved]

  • Slide 79

[Figure label: slightly curved]
Anisotropies in surface curvature lead to anisotropies in the image.

  • Slide 80

Light Warps
Vergne, Pacanowski, Barla, Granier & Schlick (2009). Light Warping for Enhanced Surface Depiction. In SIGGRAPH '09: ACM SIGGRAPH 2009 Papers. ACM SIGGRAPH 2009. All rights reserved.

  • Slide 81

Light Warps
Vergne, Pacanowski, Barla, Granier & Schlick (2009).
Light Warping for Enhanced Surface Depiction. In SIGGRAPH '09: ACM SIGGRAPH 2009 Papers. ACM SIGGRAPH 2009. All rights reserved.

  • Slide 82

Apparent Ridges
Judd, Durand & Adelson (2007). Apparent Ridges for Line Drawing. ACM Transactions on Graphics: Proceedings of SIGGRAPH 2007. ACM SIGGRAPH 2007. All rights reserved.

  • Slide 83

Apparent Ridges
Judd, Durand & Adelson (2007). Apparent Ridges for Line Drawing. ACM Transactions on Graphics: Proceedings of SIGGRAPH 2007. ACM SIGGRAPH 2007. All rights reserved.

  • Slide 84

Texture vs. Reflectance

  • Slide 85

  • Slide 86

Shape from Smear

  • Slide 87
  • Slide 88
  • Slide 89

  • Slide 90

Higher level shape properties
    - Neither object is physically unstable (falling over).
    - But: one affords being toppled more than the other.

  • Slide 91

Perceived Shape is Multi-Scale
[Figure labels: Coarse, Mid, Fine]

  • Slide 92

Perceived Shape is Multi-Scale
Mesh Saliency
Lee, C. H., Varshney, A. & Jacobs, D. W., Mesh saliency, in SIGGRAPH '05: ACM SIGGRAPH 2005 Papers, pp. 659-666 (New York, NY, USA: ACM, 2005). ACM SIGGRAPH 2005. All rights reserved.

  • Slide 93

Perceived Shape is Multi-Scale
Lee, C. H., Varshney, A. & Jacobs, D. W., Mesh saliency, in SIGGRAPH '05: ACM SIGGRAPH 2005 Papers, pp. 659-666 (New York, NY, USA: ACM, 2005). ACM SIGGRAPH 2005. All rights reserved.
[Figure labels: Coarse spatial scale, Fine spatial scale]
Applications:
    - Level of Detail
    - Hiding Watermarks
    - Viewpoint selection

  • Slide 94

Conclusions
    - There are many different cues to 3D shape, which the human visual system can draw on under typical viewing conditions.
    - Most cues are ambiguous or unreliable if considered in isolation. The secret of conveying shape effectively is to provide multiple cues.
    - Orientation fields may be an important common language in human shape processing. There are probably many other applications in CG that can exploit this.
    - Many of the assumptions made by human vision can be exploited in computer graphics applications.
    - Richer, more perceptual representations of geometry are an exciting challenge for the future.
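The orientation fields discussed in Slides 69 to 76 and in the conclusions can be illustrated with a minimal sketch. The slides refer to population codes and do not state how the fields were computed; the code below substitutes a block-averaged structure tensor, a standard and simpler estimator of dominant local orientation. The function name `orientation_field`, the block size, and the test grating are all assumptions made for illustration.

```python
import numpy as np


def orientation_field(image, block=8):
    """Dominant gradient orientation and coherence per image block.

    Returns angles in radians in (-pi/2, pi/2] (orientation is defined
    modulo 180 degrees) and a coherence value between 0 and 1. Image
    structure such as stripes runs perpendicular to the returned angle.
    """
    gy, gx = np.gradient(image.astype(float))   # axis 0 is y, axis 1 is x
    h = (image.shape[0] // block) * block
    w = (image.shape[1] // block) * block

    def block_mean(a):
        # Average over non-overlapping block x block tiles.
        return a[:h, :w].reshape(h // block, block, w // block, block).mean(axis=(1, 3))

    jxx = block_mean(gx * gx)
    jyy = block_mean(gy * gy)
    jxy = block_mean(gx * gy)
    theta = 0.5 * np.arctan2(2.0 * jxy, jxx - jyy)
    coherence = np.hypot(jxx - jyy, 2.0 * jxy) / (jxx + jyy + 1e-12)
    return theta, coherence


# Synthetic grating whose intensity varies along a direction of 30 degrees.
angle = np.radians(30.0)
yy, xx = np.mgrid[0:64, 0:64]
grating = np.sin(0.3 * (xx * np.cos(angle) + yy * np.sin(angle)))

theta, coherence = orientation_field(grating)
print(np.degrees(np.median(theta)), np.median(coherence))
```

On the synthetic grating the recovered orientation should sit close to 30 degrees with coherence near 1. On an image of a shaded or mirrored surface the same measurement would vary smoothly from block to block, which is the kind of pattern the slides describe as carrying 3D shape.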