Unless otherwise noted, the slides in this presentation are licensed by Mark A. Parsons under a Creative Commons Attribution-Share Alike 3.0 License
The Research Data Alliance: Creating the culture and technology for an international data infrastructure
Mark A. Parsons, Secretary General
International Federation of Library Associations
Lyon, France
20 August 2014
All of society’s grand challenges require diverse
(often large) data to be shared and integrated
across cultures, scales, and technologies.
Research Data Alliance
Vision: Researchers and innovators openly share data across technologies, disciplines, and countries to address the grand challenges of society.
Mission: RDA builds the social and technical bridges that enable open sharing of data.
Dynamics of Infrastructure
Edwards et al. 2007. Understanding Infrastructure: Dynamics, Tensions, and Design.
• Infrastructures become “ubiquitous, accessible, reliable, and transparent” as they mature.
• Systems → Networks → Inter-networks
• “system-building, characterized by the deliberate and successful design of technology-based services.”
• “technology transfer across domains and locations results in variations on the original design, as well as the emergence of competing systems.”
• Finally, “a process of consolidation characterized by gateways that allow dissimilar systems to be linked into networks.”
Not what, but When is infrastructure?
Not what, but When and Who is infrastructure?
Bridges and Gateways
Gateways are often wrongly understood as “technologies,” i.e. hardware or software alone. A more accurate approach conceives them as combining a technical solution with a social choice, i.e. a standard, both of which must be integrated into existing users’ communities of practice. Because of this, gateways rarely perform perfectly. — Edwards et al. 2007
Ecology of Infrastructure
Figure derived from F. Millerand, based on S. L. Star & K. Ruhleder (1996)
"Data Deluge," Brett Ryder, The Economist, Feb. 2010
Data Blizzard?
© Mindy Veissid | Mindy Veissid Photography.
Diverse snow crystal photos by Kenneth G. Libbrecht snowcrystals.com
The long tail of science
Heidorn 2008
Distribution of NSF Awards by Dollar Value
© 2009 The Board of Trustees, University of Illinois
What libraries can do
• Help researchers describe their data
• Help them pick formats to store it in
• Provide guidance and services around choosing repositories and providing access
• Provide guidance on privacy, licensing, and sharing issues
• Ensure preservation of appropriate data for future research reuse
Slide courtesy Dean Krafft, Cornell University Library
What more libraries can do
• Collaborate at scale across institutions and disciplines
• Help link the data with its research context to make it more discoverable and reusable
• Help link it to publications about the research
• Provide collaborative tools and spaces for researchers to work with the data
• Provide the people and organisational support to help researchers to manage research data
Slide courtesy Dean Krafft, Cornell University Library
What libraries can’t do
• Handle the Data Deluge – “really big data”
• Fund this ourselves – we need a business model to support the costs
• Do work that doesn’t clearly benefit our own researchers and institutions
• Provide cyberinfrastructure to support analysis, simulation, and visualization
Slide courtesy Dean Krafft, Cornell University Library
Libraries must contribute to the local, regional, and global data/information/knowledge infrastructure!
Deliverables that make data work
“Create - Adopt - Use”
• Adopted code, policy, specifications, standards, or practices that enable data sharing
• “Harvestable” efforts for which 12-18 months of work can eliminate a roadblock
• Efforts that have substantive applicability to groups within the data community but may not apply to all
• Efforts that can start today
RDA Principles
• Openness
• Consensus
• Balance
• Harmonization
• Community Driven
• Non-profit
RDA Organisational Framework
Distribution of 2,164 Individual RDA Members in 86 Countries, 20 August 2014
Academia 65%
Government 17%
Private 12%
Other 6%
Map courtesy traveltip.org
Europe 49%
North America 38%
Austral-Pacific 5%
Asia 5%
Africa 2%
South America 1%
RDA Organisational Framework
• Council
  § Fran Berman (US), co-Chair
  § Patrick Cocquet (France)
  § Tony Hey (US)
  § Kaye Raseroka (Botswana)
  § Doris Wedlich (Germany)
  § Ross Wilkinson (Australia)
  § John Wood (UK), co-Chair
• Secretariat
  § Hilary Hanahoe
  § Fotis Karayannis
  § Kathy Fontaine
  § Mark Parsons, Sec Gen
  § Herman Stehouwer
• Organisational Assembly
  § Juan Bicarregui, co-Chair
  § Walter Stewart, co-Chair
• Technical Advisory Board
  § Bridget Almas
  § Simon Cox
  § Peter Fox
  § Francoise Genova
  § Bill Michener
  § Beth Plale, Chair
  § Susanna-Assunta Sansone
  § Jamie Shiers
  § Rainer Stotzka
  § Andrew Treloar, Chair
  § Peter Wittenburg
New RDA Leadership since Plenary 1
TAB Nominations open until 31 August
Organisational Partners—key linkages
• Organisations play an essential role as adopters!
• Organisational Assembly = Organisational Members and Affiliates.
• Organisational Advisory Board will represent Organisational Assembly to Council
• Organisational Members pay (modest) dues and have a special voice within RDA helping ensure RDA products stay relevant
Image courtesy anybots.com
Organisations Ready to Join
• Organisational Members
  § Alliance for Permanent Access
  § American University Library
  § Australian National Data Service
  § Barcelona Supercomputing Center - Centro Nacional de Supercomputación
  § Columbia University Library
  § CNRI
  § CSC
  § Digital Curation Center
  § EIROForum IT Working Group
  § eResearch Services and Scholarly Application Development Division of Information Services, Griffith University
  § European Data Infrastructure (EUDAT)
  § National Institute of Advanced Industrial Science and Technology (AIST), Japan
  § International Association of STM Publishers
  § Internet2
  § Microsoft Research
  § NZ eScience Infrastructure
  § Purdue University Libraries
  § Research Data Canada
  § Scholarly Publishing and Academic Resources Coalition (SPARC)
  § Washington University in St. Louis Libraries
  § Science and Technology Facilities Council
• Affiliates
  § CODATA
  § ICSU World Data System
  § ORCID
  § DataCite
  § CASRAI
  § Global Alliance for Genomics and Health
RDA Organisational Framework
• The group of government and non-profit science funding organisations that support the data and science communities to participate in RDA activities:
  • US Government (NSF and NIST)
  • European Commission
  • Australian Government
• Allows agencies the opportunity to share funding program plans that support data exchange, interoperability, and data infrastructures across the globe, and thereby amplify their impact.
• Related to but distinct from RDA. A parallel organisation.
RDA Colloquium—RDAC
RDA Organisational Framework
RDA Working Groups
• Brokering Governance*
• Data Citation WG
• Data Description Registry Interoperability
• Data Foundation and Terminology WG
• Data Type Registries WG
• Metadata Standards Directory Working Group
• PID Information Types WG
• Practical Policy WG
• RDA/CODATA Summer Schools in Data Science and Cloud Computing in the Developing World*
• RDA/WDS Publishing Data Bibliometrics WG
• RDA/WDS Publishing Data Services WG
• RDA/WDS Publishing Data Workflows WG
• Repository Audit and Certification DSA–WDS Partnership WG
• Standardisation of Data Categories and Codes WG
• The BioSharing Registry: connecting data policies, standards & databases in life sciences*
• Urban Quality of Life Indicators
• Wheat Data Interoperability WG
* in review
RDA Interest Groups
• Agricultural Data Interoperability IG
• Big Data Analytics IG
• Biodiversity Data Integration IG
• Brokering IG
• Community Capability Model IG
• Data Fabric IG*
• Data for Development
• Data in Context IG
• Development of cloud computing capacity and education in developing world research
• Digital Practices in History and Ethnography IG
• Domain Repositories Interest Group
• Education and Training on handling of research data
• ELIXIR Bridging Force IG*
• Engagement IG
• Ethics and Social Aspects of Data*
• Federated Identity Management
• Geospatial IG*
• Long tail of research data IG
• Marine Data Harmonization IG
• Metabolomics
• Metadata IG
• PID Interest Group
• Preservation e-Infrastructure IG
• RDA/CODATA Legal Interoperability IG
• RDA/CODATA Materials Data, Infrastructure & Interoperability IG
• RDA/WDS Certification of Digital Repositories IG
• RDA/WDS Publishing Data Cost Recovery for Data Centres
• RDA/WDS Publishing Data IG
• Research data needs of the Photon and Neutron Science community
• Research Data Provenance
• Service Management IG
• Structural Biology IG
• Toxicogenomics Interoperability IG
* in review
Get involved!
• Join RDA as an individual member supporting our principles at http://rd-alliance.org
• Join as an Organisational Member (nominal fee) or an Organisational Affiliate (jointly sponsored efforts).
• Initiate or join an Interest Group
• Propose or join a Working Group
• Attend the RDA Plenaries
Coming together is a beginning; keeping together is progress; working together is success.
—Henry Ford
Plenary 4, Amsterdam, 22-24 September 2014
©2013 Pecoff Studios Inc
Regional RDAs
• RDA/United States, Australian National Data Service, RDA/Europe
• Implement RDA deliverables locally and enhance adoption.
• Ensure regional or national issues are addressed globally.
• Support plenaries and support attendance at plenaries.
Working Group | Deliverables | Adopters/Users
Data Foundations and Terminology | Data Organisational Model; defined terminology in a registry | OIF, EUDAT, DASISH
Data Type Registries | Federation between data type registries | CNRI, EUDAT, IDF
Persistent Identifier Information Types | Core informational types; prototype protocol and API | DKRZ
Practical Policy | Example policy sets | EUDAT, Chapel Hill, DKRZ, (all participants)
Standardization of Data Categories and Codes | Better language codes | TLA, Paradisec, ISO
Metadata Standards Directory | Metadata Directory | Dublin Core, Dataone, MRC, Jisc, NEON, NIST, CLARIN, DDI, DPN, OGF
Data Citation: making data citable | Citation of dynamic data streams | EUDAT, etc.
DSA-WDS Certification | Merger of DSA and WDS certification | DSA, WDS
Wheat Data Interoperability Group | Wheat Linked data framework | INRA, FAO, CIMMYT
Data Description Registry Interoperability | Interoperability between registries (bilateral prototypes) | ANDS, DATA-PASS, Dryad, Thomson Reuters DCI, VIVO, CERN, DANS, DA|
RDA Working Group Outputs