Data Curation Issues and Challenges
ARL/CNI Fall Forum 2008
Sayeed Choudhury
[email protected]
Data Flow (Levels of Data)
• Pixel data collected by telescope
• Sent to Fermilab for processing
• Beowulf cluster produces catalog
• Loaded in a SQL database
Courtesy of Alex Szalay
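The levels of data on this slide can be illustrated with a minimal sketch. It is not the actual telescope pipeline: the function names, the "peak value" reduction, and the table layout are all hypothetical, and an in-memory SQLite database stands in for the SQL database named on the slide.

```python
import sqlite3

def process_pixels(pixel_frames):
    """Stand-in for the processing step: reduce raw pixel frames to catalog rows."""
    catalog = []
    for object_id, frame in enumerate(pixel_frames):
        # Hypothetical reduction: keep only the brightest pixel value per frame.
        catalog.append((object_id, max(frame)))
    return catalog

def load_catalog(conn, catalog):
    """Load the derived catalog rows into a SQL table (assumed schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS catalog "
        "(object_id INTEGER PRIMARY KEY, peak INTEGER)"
    )
    conn.executemany("INSERT INTO catalog VALUES (?, ?)", catalog)
    conn.commit()

# Level 1: raw pixel data (toy values standing in for telescope output).
raw_pixels = [[3, 9, 4], [7, 2, 5]]

# Level 2: derived catalog; Level 3: queryable database.
conn = sqlite3.connect(":memory:")
load_catalog(conn, process_pixels(raw_pixels))
rows = conn.execute(
    "SELECT object_id, peak FROM catalog ORDER BY object_id"
).fetchall()
print(rows)
```

Each level discards detail from the one before it, which is one reason a curation effort may need to decide which levels to keep.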
Key Considerations
• Work with existing scientific systems
• Consider gateways for these systems as part of infrastructure development
• Focus on both human and technical components of infrastructure
• Human interoperability is more difficult than technical interoperability
• Trust
Questions (1)
• How do we transfer principles into new practices, especially given scale and complexity?
• What are the fundamental differences between data and collections? Human readable vs. machine readable?
• What about the “cloud” or the “crowd”?
• Can Flickr help us with data curation?
Questions (2)
• How does a partnership audit data (and associated services) distributed across the network?
• Are audits about “completeness” or perhaps about transparency and reliability?
• Where are the existing data curators? Maybe we shouldn’t use the terms data librarian or data scientist or humanist.
Questions (3)
• What are the requirements? Are there common requirements, which may be the most appropriate area for libraries?
• Are there unifying concepts or themes? “One scientist’s noise is another scientist’s signal…”
• What are we trying to sustain? Data? Scholarship? Our organizations?