Camera Culture. Ramesh Raskar, Associate Prof, Media Lab, MIT. Course WebPage: http://raskar.info/course.html


Page 1: Camera Culture

Camera Culture

Ramesh Raskar, Associate Prof, Media Lab, MIT

Course WebPage: http://raskar.info/course.html

Page 2: Camera Culture

Today’s Plan

• Summary, Camera for image ‘search’

• Visual Social Computing + Citizen Journalism

• Next class big question:
  – ‘Opportunities in Pervasive Public Recording’

• Big concept
  – (Last week)
    • Understanding Camera Constraints
  – (This week)
    • What matters in photography: pixels (low-level cues) or low-dimensional features (mid-level cues)?
    • Decomposing pixels into meaningful values

Page 3: Camera Culture

Camera for ‘image search’

How can we augment the camera to best support 'image search'?

• 'Search' = segment/identify/recognize/transform/compare/archive

• Or more precisely, object matching across images.

• (For example, if we want to find a specific face image, we need a procedure to segment and identify (detect) the pixels likely to belong to a face, then recognize the candidate face by transforming it into a representation where we can match it with that specific face image. Currently, this is all performed in software using traditional cameras. Typically, the algorithms try to reduce the image to lower-dimensional 'features' and do the matching in this feature space. Unlike text search, where the search pipeline is simple thanks to an easy matching process, object matching in images is quite difficult. What additional data can we capture while recording pixels, and what new algorithms can exploit this augmented photo?)

• How can we make the scene ingredients machine readable so that we can easily perform the 'search'? Is this the key problem? 3D reconstruction (so that it is view independent)? Hardware and software solutions? Crowdsourcing (let people do marking/sorting/indexing for others)? Metadata tagging (tag high-level text labels rather than pixel-level tagging)?

• Do we need to capture a material index (where is all the wood in this image)? Segmentation boundaries (shape versus reflectance edges)? Repeatable view and illumination invariance (be able to recreate an image from a given view so it can be compared with another image, or create images that look the same independent of time of day)?

• Some ideas: (i) to locate all 'images' with faces, record the iris biometric, which validates whether a photo includes a human eye, and then we can search all images across an album with that face/eye/iris; (ii) embed an RFID tag (electronic bar-code) in every object and record the binary index with an RFID reader.
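The "reduce the image to lower-dimensional features and match in feature space" step mentioned above can be sketched roughly as follows. This is an illustration only: the three-number feature vectors, the file names, and the choice of cosine similarity are assumptions made for the example, not something the slides specify.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector matches nothing
    return dot / (norm_a * norm_b)

def best_match(query, album):
    """Return (name, score) of the album entry closest to the query vector."""
    scored = ((name, cosine_similarity(query, feat)) for name, feat in album.items())
    return max(scored, key=lambda pair: pair[1])

# Hypothetical album: each photo already reduced to a tiny feature vector.
album = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "portrait.jpg": [0.1, 0.8, 0.6],
    "forest.jpg": [0.0, 0.2, 0.9],
}
query = [0.2, 0.7, 0.5]  # features of the object we are searching for

name, score = best_match(query, album)
print(name, round(score, 3))
```

The hard part the slide points at is not this comparison but producing features that stay stable under changes in view and illumination, which is where extra data captured by the camera could help.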

Page 4: Camera Culture

Next Class

• Homework
  – What are the opportunities in pervasive recording of public spaces?
  – Pervasive public recording = surveillance/GoogleEarthLive/Subscription cameras
  – Technology:
    • See thru fog, time-lapse processing, day-nite/season/multi-modal fusion, how to consume these images, how to merge with static/dynamic content, merge with static/dynamic cameras, support object recognition, refine GPS coords, crowdsourcing, metadata (video frame) tagging
  – Society:
    • Commerce (real-estate, reviews, remote maintenance), Environment (earthquake-prediction-like opportunities), Politics (protests)

• Volunteer
  – Class notes: Lav (today), next ..
  – Select/read/present/paper
    • Visual Social Computing: Tom
    • Mobile Photography: Eugene
    • Beyond Visible Spectrum: Brandon
    • Emerging sensors: Matt
    • Developing Countries: Lav/Tilke
    • Sols for Visually Challenged: James

Page 5: Camera Culture

Today 3pm

Less is More: Coded Computational Photography

Speaker: Ramesh Raskar, MIT Media Lab
Date: Wednesday, February 20 2008
Time: 3:00PM to 4:00PM
Refreshments: 2:45PM
Location: Star Seminar Room (32-D463)

Page 6: Camera Culture

Topics

• Imaging Devices, Modern Optics and Lenses
• Emerging Sensor Technologies
• Mobile Photography
• Visual Social Computing and Citizen Journalism
• Imaging Beyond Visible Spectrum
• Computational Imaging in Sciences
• Trust in Visual Media
• Solutions for Visually Challenged
• Cameras in Developing Countries
• Future Products and Business Models

Page 7: Camera Culture

Feedback

• What are your questions about camera/technology/society?

• Your expectations from the course?

Page 8: Camera Culture

Topics

• Other courses
  – Art and Photography
  – CSAIL: Computational Photography
  – MechE: Optics

• Fall’2008
  – ‘Intro to Computational Camera and Photography’
  – I will teach course in Fall

• Current course
  – More emphasis on future cameras
  – Faster review of technology and then look at impact/applications/opportunities
  – Big ideas/technologies/applications
  – Understand rules-of-thumb and trade-offs
  – Ideal for thesis/projects/research papers/business models
  – Learn fun stuff before the nitty gritty

Page 9: Camera Culture

Photography: Full of Tradeoffs...

• Available light vs. exposure time vs. scene movement vs. field of view vs. focus depth vs. sensitivity vs. noise vs. color rendition vs. color gamut vs. contrast vs. visible detail vs. ….

No-flash Flash
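One of the trade-offs listed on this slide, available light versus exposure time, can be made concrete with the standard exposure-value relation EV = log2(N² / t), where N is the f-number and t the exposure time in seconds. The sketch below is a small worked illustration, not part of the original slides; the function names are hypothetical.

```python
import math

def exposure_value(f_number, exposure_time_s):
    """Standard exposure value: EV = log2(N^2 / t)."""
    return math.log2(f_number ** 2 / exposure_time_s)

def time_for_same_exposure(f_old, t_old, f_new):
    """Exposure time that keeps EV constant when the aperture changes."""
    return t_old * (f_new / f_old) ** 2

# Stopping down from f/4 to f/8 admits a quarter of the light,
# so the shutter must stay open four times longer.
t_new = time_for_same_exposure(4.0, 1 / 60, 8.0)
print(t_new)  # roughly 1/15 s
print(exposure_value(4.0, 1 / 60), exposure_value(8.0, t_new))
```

The longer exposure buys depth of field at the cost of motion blur, which is the kind of coupling between parameters the slide is pointing at.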

Page 10: Camera Culture

Available Light vs Parameter/Specs ‘box’

Aperture

Exposure

Focal Length (zoom)

Focus distance

Depth of field

Dynamic Range

Field of view

Resolution/Frame rate

Limited Parameters Limited Abilities

Page 11: Camera Culture

Goal: High Dynamic Range

Short Exposure

Long Exposure

Dynamic Range
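A minimal sketch of how a short and a long exposure might be fused into one high-dynamic-range estimate. It assumes a linear sensor, pixel values normalized to [0, 1], and a simple "hat" weighting that distrusts nearly black and nearly saturated pixels. These are illustrative assumptions, and the function names are hypothetical rather than taken from the slides.

```python
def hat_weight(z):
    """Trust mid-range values most; fully dark or saturated pixels get weight 0."""
    return max(0.0, 1.0 - abs(2.0 * z - 1.0))

def merge_exposures(exposures):
    """Estimate per-pixel radiance from (pixels, exposure_time) pairs.

    Each pixel value z is assumed to be radiance * time, clipped to [0, 1],
    so z / time is a radiance estimate wherever z is not clipped.
    """
    n = len(exposures[0][0])
    radiance = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for pixels, t in exposures:
            w = hat_weight(pixels[i])
            num += w * pixels[i] / t
            den += w
        if den == 0.0:
            # Every exposure is clipped here: fall back to the shortest one.
            pixels, t = min(exposures, key=lambda e: e[1])
            radiance.append(pixels[i] / t)
        else:
            radiance.append(num / den)
    return radiance

# Toy scene with radiances 0.5, 20 and 60 (arbitrary units).
short = ([0.005, 0.2, 0.6], 0.01)  # short exposure: dark pixel is near zero
long_ = ([0.5, 1.0, 1.0], 1.0)     # long exposure: bright pixels saturate
hdr = merge_exposures([short, long_])
print(hdr)
```

Neither photo alone covers the full range: the long exposure clips the two bright pixels, and the short exposure records the dark pixel at a value close to zero, where real sensor noise would dominate.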

Page 12: Camera Culture

Phase 1 of Better Photography

• Epsilon Photography
  – Low-level vision
    • Best pixel and pixel-features
  – Vary focus, exposure, polarization, illumination
  – Vary time, view
  – Better than any one photo (resolution/frame rate, fov, dynamic range etc)
    • Achieve effects via multi-photo fusion

• Create a Super-camera
  – Mimic human eye

Page 13: Camera Culture

Phase 1.1 of Better Photography

• Create a Super-camera
  – Mimic human eye

• What aspects of the human eye are critical/useless?
  – Eye: Feedback wrt brain, After-image/illusions
  – Camera: geometry/stereo pair, multispectral, uniform res, memory

• What are other parameters/Design/Features to improve?
  – Very small camera/thin camera ..
  – Tight loop with illumination
  – ..

Page 14: Camera Culture

The Eye’s Lens

Page 15: Camera Culture

Varioptic Liquid Lens: Electrowetting

Varioptic, Inc., 2007

Page 16: Camera Culture

Varioptic Liquid Lens

(Courtesy Varioptic Inc.)

Page 17: Camera Culture

Captured Video

(Courtesy Varioptic Inc.)

Page 18: Camera Culture

Conventional Compound Lens

Page 19: Camera Culture

“Origami Lens”: Thin Folded Optics (2007)

“Ultrathin Cameras Using Annular Folded Optics,” E. J. Tremblay, R. A. Stack, R. L. Morrison, J. E. Ford, Applied Optics, 2007 - OSA

Page 20: Camera Culture

Origami Lens

Conventional Lens

Origami Lens

Page 21: Camera Culture

Optical Performance

Conventional Lens Image

Origami Lens Image

Conventional

Origami

Scene

Page 22: Camera Culture

Compound Lens of Dragonfly

Page 23: Camera Culture

TOMBO: Thin Camera (2001)

“Thin observation module by bound optics (TOMBO),” J. Tanida, T. Kumagai, K. Yamada, S. Miyatake, Applied Optics, 2001

Page 24: Camera Culture

TOMBO: Thin Camera

Page 25: Camera Culture

Captured Image

TOMBO

Scene Captured Image

(Multiple low-resolution copies of the scene)

$g = Hf$

Image = Optics · Scene

Page 26: Camera Culture

Reconstructed Image

$g = Hf$ (captured image $g$, optics $H$, scene $f$): recover $f$ from $g$
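A heavily idealized 1D sketch of the capture and reconstruction relation, Image = Optics · Scene. It assumes each lens unit point-samples the scene on a coarse grid with a known integer offset and no blur or noise, in which case inverting the optics reduces to interleaving the low-resolution copies. The actual TOMBO reconstruction is not described in these slides, so this is an assumed toy model with hypothetical function names.

```python
def capture(scene, n_units):
    """Forward model g = H f: unit k keeps every n_units-th sample, starting at k."""
    return [scene[k::n_units] for k in range(n_units)]

def reconstruct(sub_images):
    """Invert the sampling by interleaving the unit images onto the fine grid."""
    n_units = len(sub_images)
    total = sum(len(sub) for sub in sub_images)
    scene = [None] * total
    for k, sub in enumerate(sub_images):
        scene[k::n_units] = sub  # put unit k's samples back at offset k
    return scene

f = [10, 20, 30, 40, 50, 60, 70, 80]  # fine-resolution scene
g = capture(f, 4)                     # four low-resolution copies
f_hat = reconstruct(g)
print(g)
print(f_hat)
```

With real optics, H also includes blur and sub-pixel shifts that are not exact, so reconstruction becomes a regularized linear inverse problem rather than a simple re-ordering.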

Page 27: Camera Culture

Phase 1 of Better Photography

• Epsilon Photography
  – Low-level vision
    • Best pixel and pixel-features
  – Vary focus, exposure, polarization, illumination
  – Vary time, view
  – Better than any one photo (resolution/frame rate, fov, dynamic range etc)
    • Achieve effects via multi-photo fusion

• Create a Super-camera
  – Mimic human eye

Page 28: Camera Culture

Phase 1.1 of Better Photography

• Create a Super-camera
  – Mimic human eye

• What aspects of the human eye are critical/useless?
  – ..

• What are other parameters/Design/Features to improve?
  – Very small camera/thin camera ..
  – Tight loop with illumination
  – ..

Page 29: Camera Culture

Phase 2 of Better Photography

• Coded Photography
  – Mid-level cues
  – Regions, shapes (depth), edges, motion, material-index (…)
  – Cartoons via Multi-flash camera (depth edges), Wavelength profile
    • Visual interface issue (human eye expects pixels)
  – Decompose pixel values (…)
    • Single or few photos

• Create a functionally super-camera
  – Don’t mimic human eye

Page 30: Camera Culture

Multiperspective Camera?

[ Jingyi Yu’ 2004 ]

Page 31: Camera Culture

Phase 3 of Better Photography

• Essence Photography
  – High-level cues
  – Inference, perception, cognition
  – Intent based (like biovision systems)
    • Not a ‘single-solution fits-all’
    • ? Single or few ‘photos’

• Beats ‘photography’
  – Don’t just mimic human eye, or record pixels/mid-level cues
  – Create a meaningful representation of visual experience
  – New art form, new commerce models

Page 32: Camera Culture

Visual Social Computing and Citizen Journalism

• What is VSC
  – Social Computing is well known, I made up VSC
    • My defn of SC: Online computation of the people, by the people, for the people (old world: govt, economy, epidemiology)
    • Subsets
      – Crowdsourcing (CAPTCHA) (by the people, but maybe for just one person)
      – Participatory sensing (of the people, but no active part by individuals, not for the people)
      – Recommendation systems (by the people and for the people)
      – Tagging (Digg) (all three)
    • Blogs, social networks, auctions, wikipedia, tags
    • 90% of all data will be ‘about people’
    • Example problem: Can we reduce distrust among Kenya’s groups?
  – Easy to predict certain trends ..
    • Just add dimensions
    • Text, audio/music, images, video, (what's next)
    • LP -> Cassette - VHS player -> CD player -> DVD player (ok, Blu-ray DVD player) -> (what's next)
    • Radio - TV - ..
    • Gopher -> Newsgroups -> Wikipedia -> (what's next)
    • Take anything text/audio based -> image/video
    • Take anything image based -> video (Flickr -> YouTube)

Page 33: Camera Culture

Today 3pm

Less is More: Coded Computational Photography

Speaker: Ramesh Raskar, MIT Media Lab
Date: Wednesday, February 20 2008
Time: 3:00PM to 4:00PM
Refreshments: 2:45PM
Location: Star Seminar Room (32-D463)

Page 34: Camera Culture

Next Class

• Homework
  – What are the opportunities in pervasive recording of public spaces?
  – Pervasive public recording = surveillance/GoogleEarthLive/Subscription cameras
  – Technology:
    • See thru fog, time-lapse processing, day-nite/season/multi-modal fusion, how to consume these images, how to merge with static/dynamic content, merge with static/dynamic cameras, support object recognition, refine GPS coords, crowdsourcing, metadata (video frame) tagging
  – Society:
    • Commerce (real-estate, reviews, remote maintenance), Environment (earthquake-prediction-like opportunities), Politics (protests)

• Volunteer
  – Class notes: Lav (today), next ..
  – Select/read/present/paper
    • Visual Social Computing: Tom
    • Beyond Visible Spectrum: Brandon
    • Mobile Photography
    • Emerging sensors
    • Developing Countries: Lav