A Personal Database for Everything Inspired by Memex www.MyLifeBits.com Gordon Bell, Jim Gemmell, Roger Lueder Original slides: http://research.microsoft.com/en-us/um/people/gbell/Bell_MyLifeBits_Talk_SIGMOD_050614_web.ppt

  • Slide 1
  • Slide 2
  • A Personal Database for Everything Inspired by Memex. www.MyLifeBits.com. Gordon Bell, Jim Gemmell, Roger Lueder. Original slides: http://research.microsoft.com/en-us/um/people/gbell/Bell_MyLifeBits_Talk_SIGMOD_050614_web.ppt
  • Slide 3
  • Outline How has the project evolved? How do we use MyLifeBits? How is it built? How large is the database? What is the vision? What is left and how can you help?
  • Slide 4
  • I am data
  • Slide 5
  • Ambience and Presence: Being there while being here Dining at home on the Orient Express
  • Slide 6
  • History: The remote worker re-discovers the PERSONAL computer
  • Slide 7
  • Oct 1998: Can we scan your books and put them online? Raj Reddy. Sure! Don't worry about copyright stuff. Microsoft has lots of lawyers.
  • Slide 8
  • 1999: Scanning starts in earnest; we start to scan
  • Slide 9
  • My docs and archive Self.. Biographical X- Employer Employer X-Employer Project Employer Library/file cab Active Employer Library/file cab
  • Slide 10
  • Jim, I don't need no stinkin' database! Gordon, you should be using a database.
  • Slide 11
  • Now that it's in cyberspace: How do you remember the 20,000+ file names? Or in which of 1,500 folders they live? What about a tool for finding stuff?
  • Slide 12
  • Jan 2001, CACM: "A Personal Digital Store". 16 GB; +2/yr. A good place to stop. Began search for search engines, especially for email. Jim suggests that we build a system that would be easier to use and have many more capabilities.
  • Slide 13
  • Re-discovery of Memex: "As We May Think," Vannevar Bush, 1945. "A memex is a device in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility." Full-text search, text & audio annotations, and hyperlinks.
  • Slide 14
  • 2001 Capture goes beyond paper
  • Slide 15
  • Even more capture Telephone calls, more video, all web pages visited, usage logging, radio, TV
  • Slide 16
  • 2003 - SenseCam
  • Slide 17
  • Steve Mann timeline
  • Slide 18
  • I sensed Clarkson and Pentland MIT 2001 Visually impaired UW 2004
  • Slide 19
  • MyLifeBits Software
  • Slide 20
  • Everything goes in a database. MyLifeBits needs all the features of a database (consistency, indexing, pivoting, queries, speed/scalability, backup, replication). If we didn't use one, we'll eventually create one! Files as blobs; sync with file system for legacy apps. We are part of Jim Gray's Bay Area Research Lab. SQL.
  • Slide 21
  • MyLifeBits Software: MyLifeBits store database Voice annotation tool Telephone capture tool TV capture tool TV EPG download tool Radio capture & EPG PocketPC transfer tool PocketRadio player Import files MyLifeBits Shell Browser tool Internet IM capture GPS import & Map display SenseCam Screen saver Text annotation tool MAPI interface Legacy email client Outlook interface files Legacy applications VIBE logging RoomCapture
  • Slide 22
  • MyLifeBits Schema (simplified): Images; Music; Phone calls; Items; Links; Link types; Entity types; Resource entities; Event types; Event log; Events; Tasks; People; Notes; Email; Saved searches; SenseCamData; GPS data; Window, key, mouse log; Web pages
  • Slide 23
  • Demo Clips & Screens
  • Slide 24
  • 747 Screen
  • Slide 25
  • Vue de jour
  • Slide 26
  • Reports
  • Slide 27
  • Add item to collection(s)
  • Slide 28
  • Refine email shell
  • Slide 29
  • Refine email shell2
  • Slide 30
  • Pivoting: contact > call > t > web page
  • Slide 31
  • Refine by classification--dentist
  • Slide 32
  • GPS Photo location
  • Slide 33
  • SenseCam
  • Slide 34
  • Timeline
  • Slide 35
  • Google??
  • Slide 36
  • The Shape & Size of Gordon's LifeBits
  • Slide 37
  • MyLifeBits, 3/26/2005: 206K items, 101 GB. By number of items.
  • Slide 38
  • MyLifeBits, 3/26/05: 101 GB, 206K items. By size (GB). Bell growth: 1 GB/month = 1 TB/lifetime. Size (MB) by type.
  • Slide 39
  • 1995-2004 of email (incl. attachments)
  • Slide 40
  • Year  Mpix  Manufacturer
    1997  0.25  Ricoh
    1999  1     Kodak
    2001  2     Canon
    2002  3     Sony
    2003  4     Sony
    2005  5     Panasonic
    15,000 photos
  • Slide 41
  • Monthly & Lifetime Storage Use
    Item                               Daily number  Total* (MB/month | GB/life)
    1 MB     Books | reports           0.1           3
    5 KB     Emails                    100           13
    100 KB   Image scans               5             13
    0.4 MB   Photos                    10            100
    75 KB    Web pages | docs          100           188
    100 MB   Music                     0.1           250
    1 KB/s   Listened audio, speech    40,000        1,000
    50 KB    Daily photos              1,000         1,250
    2 GB/hr  TV                        4             200,000
  • Slide 42
  • Observations about use(rs)
    1. On apps: Search is the killer app, pretty much as Bush described. Screen savers (memory refreshers) also provide ambience. Where did my day go?
    2. Users are unwilling to spend time managing their computers or data. User-input meta-data, e.g. Dublin Core: a naive librarian's dream. Meta-data, classification, etc. must be automatic. Great scheme for classification using facets. It requires work.
    3. Time is the most important meta-data. Photos: place (GPS), subject.
    4. Folders are a good and bad idea. Most users don't know what they are or how they work. If used, over time they become useless: too many, misfiled, etc.
    5. Users should put every information fragment into the system, e.g. to-dos, call-backs, business cards, numbers, attention events. It pays.
    6. The same information in multiple places always becomes obsolete.
  • Slide 43
  • Evolution: silo apps on isolated DB islands vs. cut & paste across apps
    • Contacts: email, instant messages, phone, correspondence
    • Family and organizational relationships
    • Location of people, organizations, etc.
    • Photo database: who, where, when, what
    • Money: payees, phone, etc.
    • Health: providers and caregivers
    • User-written apps in Excel or Access
  • Slide 44
  • Common ground with WinFS: Items, Links & Meta-data Annotates Caller in Phone Call Photo of Event
  • Slide 45
  • PhotoFinder - Shneiderman and Kang
  • Slide 46
  • Slide 47
  • Challenges
  • Slide 48
  • The "Dear Appy" problem. Dear Appy, How committed are you? Please come back to me. Forever yours truly, Lost and forgotten data. Who's responsible?
    • Media or 8-track cassette, 8" floppy
    • Evolving platform, file, and database
    • Evolving, incompatible standards & formats for legacy data that disregard ancestors
    • Evolving and/or disappearing apps
  • Slide 49
  • Automatic classification problem
    • XML on bills and imported content transactions
    • We need to download classifications rather than build them
    • Definitions & synonyms should help find what I want
    • Today it is too expensive to manually classify scanned paper. E.g. right time meta-data is critical!
    • We hope the system can classify papers and other documents, e.g. bills. Ideally, build Dublin Core
    • In 10 years we need all documents to appear electronically & classified with a little help from me
  • Slide 50
  • More challenges
    • Dear Appy: Monitoring and automatic migration of files that are unlikely to be understood on future platforms, as well as platform migration.
    • Get What I Need: Endless but evolutionary improvements in search: misspellings, stemming, synonyms
    • Endless frontier of schema and extensions to them for new applications, e.g. making org charts, family relationships.
    • Capture, Archival and Retrieval of Personal Experiences (CARPE): a whole new game!
    • Versioning is essential
    • Scaling: we don't know what happens at a terabyte
    • What can, should be, or will be in the cloud? Books, videos
    • Will we be allowed to use such systems? Copyright laws vary, e.g. ripping CDs, copy of anything, photos, conversations
  • Slide 51
  • Challenges
    • Data-types
    • Quantity expanding, i.e. info explosion
    • New capabilities, e.g. real time, create new data-types
    • Meta-data to increase value & provide pivots
    • Going beyond a PC to a distributed environment
    • Network environment, including media center
    • Into the cloud
    • Periphery: smart buildings, objects
    • Backup, migration, and caching for beyond a terabyte
    • Expanding network: PC > LANs > web > P2P
    • Schema sharing among disparate systems
    • CARPE (real time data capture): rooms, phone calls, SenseCam, health transducers, etc.
    • Security, privacy, forgetfulness, deniability, etc.
  • Slide 52
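
The totals on the "Monthly & Lifetime Storage Use" slide appear to follow from simple arithmetic: unit size times daily count times days per month. The Python sketch below reproduces them under two assumptions that are not stated in the transcript. First, 25 capture days per month, inferred only because it matches the slide's totals after rounding. Second, a lifetime of about 1,000 months, implied by "1 GB/month = 1 TB/lifetime", which is why a single number can be read as both MB/month and GB/life. All names are illustrative, and decimal units (1 GB = 1,000 MB) are assumed.

```python
# Sketch only: DAYS_PER_MONTH and LIFETIME_MONTHS are assumptions inferred
# from the slide's figures, not values given in the talk.
DAYS_PER_MONTH = 25
LIFETIME_MONTHS = 1_000  # implied by "1 GB/month = 1 TB/lifetime"

# (item, unit size in MB, units per day), as read from the slide.
ROWS = [
    ("Books/reports", 1.0, 0.1),
    ("Emails", 0.005, 100),
    ("Image scans", 0.1, 5),
    ("Photos", 0.4, 10),
    ("Web pages/docs", 0.075, 100),
    ("Music", 100.0, 0.1),
    ("Listened audio, speech", 0.001, 40_000),  # 1 KB/s for 40,000 s/day
    ("Daily photos", 0.05, 1_000),
    ("TV", 2_000.0, 4),  # 2 GB/hr for 4 hr/day
]


def monthly_mb(unit_mb, per_day, days=DAYS_PER_MONTH):
    """Storage added per month, in MB."""
    return unit_mb * per_day * days


def lifetime_gb(mb_per_month, months=LIFETIME_MONTHS):
    """Storage added over a lifetime, in GB (1 GB = 1,000 MB)."""
    return mb_per_month * months / 1_000


totals = {name: monthly_mb(size, count) for name, size, count in ROWS}

for name, mb in totals.items():
    print(f"{name:<24}{mb:>12,.1f} MB/month{lifetime_gb(mb):>12,.1f} GB/life")
```

With these assumptions, TV dominates everything else by two orders of magnitude, which matches the slide's ordering of items from smallest to largest contributor.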