Tools for Repositories: Microsoft Research & the Scholarly Information Ecosystem
Lee Dirks
Director, Education & Scholarly Communications
Microsoft External Research
Microsoft Corporation
• Organization within Microsoft Research that engages in strong partnerships with academia, industry, and government to advance computer science, education, and research in fields that rely heavily upon advanced computing
• Initiatives that focus on the research process and its role in the innovation ecosystem, including support for open access, open tools, open technology, and interoperability
• Developers of advanced technologies and services to support every stage of the research process
Microsoft External Research
Mission: Optimize and extend Microsoft software to meet the specific needs of the academic community
Our approach:
Conduct applied projects to enhance academic productivity by evolving Microsoft’s scholarly communication offerings
Microsoft External Research is uniquely positioned to drive this initiative across Microsoft
Transforming Scholarly Communication
• Interoperability is essential
– Actively lobby and drive for consensus around technical standards and standardized protocols proactively adopted by the community; enable broad community engagement
• Customers have told Microsoft that interoperability is OUR responsibility
• Leverage existing community protocols, practices, guidelines, etc.
– Example: metadata conventions / taxonomies / ontologies, a traditional strength for libraries and a critical component in enabling Web 2.0
• Optimize for data-driven research
– To both data (scientific) and to information (scholarly publications)
– Reproducible research + computational science
– Properly document / annotate scholarly output
• Data preservation (and provenance) should be baseline
– Documentation of the data’s provenance
– Preservation needs to be like “accessibility” features, i.e., assumed as required
• Semantic knowledge discovery & social networking
– Harnessing collective intelligence must be a consideration, since accessing research is a core step in the life-cycle. Enable knowledge discovery
– Optimize for Web 2.0 scenarios and allow end-users/experts to find things more easily
The Scholarly Communication Lifecycle
Stages:
• Data Collection, Research & Analysis
• Authoring
• Publication & Dissemination
• Storage, Archiving & Preservation
Supporting technologies (as grouped on the slide):
• Collaboration: SharePoint, LiveMeeting, Office Live
• Discoverability: Libra 2.0, “Bookweb”, SharePoint
• Office OpenXML, XPS Format, SQL Server & Entity Framework, Rights Management, Data Protection Manager
• Office 2007 (Word, PowerPoint, Excel, OneNote), Tablet PC/UMPC
• Word 2007 + PowerPoint 2007, WPF & Silverlight, “Sea Dragon” / “PhotoSynth” / “Deep Zoom”
• Excel 2007, Windows Server HPC, “Astoria” / “Pop Fly”
Scholarly Communications: Project Overview
• Current or Completed Projects
o Cornell – arXiv.org + Word 2007 (and repository interoperability via SWORD)
o MIT / Broad Institute – Authoring (Word 2007) + data for research reproducibility
o MSR – CMT++ interoperability with data + metadata transfer/exchange (conference management tool enhancements)
o LiveLabs – eJournal publishing online service (community publishing tool)
o UC San Diego / PLoS – Semantic mark-up of scholarly articles (+ submission)
o Chem4Word with Office & Cambridge University – Create add-in to Word 2007 to facilitate drawing of chemical compounds and equations
o Johns Hopkins University – Digital Archive for Astronomy/Astrophysics data (storage, preservation and access)
o Planets Project / EU (with MSR – Cambridge) – OpenXML and file format preservation + interoperability
o eChemistry Project (Cornell, Penn State, Indiana, Cambridge, Southampton) – ORE exemplar: access to compound chemical info objects (cross-repository access to open chemistry data)
o British Library – Researcher Information Centre (RIC) online workflow tool for scientists and researchers
o Creative Commons Add-in for Office 2007 – evolving the Word 2003 effort
o University of Southampton (UK) – Port ePrints Repository Software for installation on the Windows platform
o University of Manchester / “MyExperiment” Project – social networking for scientists
o ORE Acceleration Project (OAI – Object Reuse & Exchange) – Alpha spec development
o UK National Archives – Virtual PC / Emulation of legacy systems to facilitate preservation
o National Library of Medicine / NCBI – “PubMed Int’l” UK version of PubMed + NLM DTD
• Pipeline
o DRIVER 2 (EU) – Infrastructure integration across a network of European research repositories
• For Microsoft end-users, making it easier to use our software for all aspects of their research process
• For Microsoft developers, demonstrating the toolset and showing how our platform can be extended
• For non-Microsoft end-users, working to ensure the ability to interoperate with our software across all phases of the research process, as necessary
• For non-Microsoft developers, enabling transparency to our efforts in this space and encouraging a dialogue
Our goals for working in this community
Who’s here & why
Goals / Intentions
Approach
AGENDA
• 12:30 p.m. Welcome & Overview – Lee Dirks, Director, Education & Scholarly Communication, Microsoft Research
• 1:00 p.m. Zentity - Repository Platform – Alex Wade, Director, Scholarly Communication, Microsoft
• 2:00 p.m. Services for Repositories (RIC, Electronic Journals Service, Live Translator, Document Conversion Service) – Pablo Fernicola, Group Manager, Microsoft & Alex Wade
• 3:00 p.m. Break
• 3:15 p.m. Programming with Zentity – Savas Parastatidis, Software Philosopher, Microsoft
• 4:30 p.m. Tools for Authors (AA, Ontology, Creative Commons, ORE, Submission Wizard, etc.) – Pablo Fernicola & Alex Wade
• 5:30 p.m. Wrap-up & Futures Discussion
Questions?
Lee Dirks
Director, Education & Scholarly Communication
Microsoft External Research
ldirks@microsoft.com
URL – http://www.microsoft.com/scholarlycomm/
Zentity 1.0
Open Repositories ‘09 Workshop
Alex Wade
Director, Scholarly Communication
Microsoft External Research
Microsoft Corporation
Agenda
Ecosystem of Tools/Services
• Repositories
• User Environment: Search, Desktop Tools, ELNs, etc.
• Translation, Conversion, Peer-Review
• Authoring, Collaboration/VREs
• Visualization, Discovery, Entity Extraction, etc.
• Goals
• System Requirements
• Architectural Stack
• Installation
• Repository Demo
– UI
– Services
• Extensibility
Agenda
Zentity – Goals
• Quick
– Easy to install
– ‘Scholarly Works’ data model: Authors, Papers, Data, Videos, Code, Lectures, Books, etc.
– Default Web UI
• Extensible
– UI Toolkit
– Intuitive programming experience
– Extensible Data Model (entities, relationships)
– RDFs for new data models
• Interoperable
– BibTeX Import
– RSS/Atom Syndication
– METS support
– OAI-PMH Provider
– OAI-ORE
– Simple Search API
– Atom Publishing Protocol
– SWORD
• Free & Open
– Freely available
– Based on open standards
– SQL Server and Developer tools available via Dreamspark
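Because the repository is listed as an OAI-PMH provider, a harvester can talk to it with standard protocol requests. The following is a minimal Python sketch of such a client, not Zentity code. The verb `ListRecords` and the `metadataPrefix=oai_dc` argument come from the OAI-PMH specification; the endpoint path in `BASE_URL`, the helper names, and the sample response are assumptions made up for illustration.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical endpoint; the real path depends on how the repository is deployed.
BASE_URL = "http://myserver/oaipmh"


def build_oai_request(base_url, verb, **params):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    query = {"verb": verb}
    query.update(params)
    return base_url + "?" + urlencode(query)


def extract_titles(response_xml):
    """Collect Dublin Core titles from a ListRecords response."""
    root = ET.fromstring(response_xml)
    return [el.text for el in root.iter("{%s}title" % DC_NS)]


# Illustrative response body, shaped like an oai_dc ListRecords reply.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An illustrative record</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(build_oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
    print(extract_titles(SAMPLE_RESPONSE))
```

Since the request format is defined by the protocol rather than by the repository, the same client should work against any compliant provider once `BASE_URL` points at its endpoint.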
System Requirements
• Supported Processor Architectures
– x86 and x64
• Supported Operating Systems
– Microsoft Windows Server 2008 (x86 and x64)
– Microsoft Windows Vista SP1 (x86 and x64)
• Installation Requirements
– Microsoft .NET Framework 3.5
– Supported Microsoft SQL Server
  • Microsoft SQL Server 2008 Enterprise Edition
  • Microsoft SQL Express 2008 with Advanced Services
• User and Configuration Requirements
– Site Admin privileges are granted to the user installing Zentity
– The selected Microsoft SQL Server instance must have “Windows Authentication” enabled
– User running the installer must have ‘database creation’ permissions on the Microsoft SQL Server instance
Application Stack
SQL Server 2008(including Express edition)
ADO.NET 3.5 Entity Framework
Zentity.Core
Services
Web UI
Zentity.Security
Zentity.Search
UI.Toolkit
ScholarlyWorks Application
• A Semantic Computing platform
• A hybrid between a relational database and a triple store
Zentity - Store
Triple stores
- Evolution friendly
- Poor performance
- No need to model everything in advance
- Semantic interpretation at the application level
Relational schema
- Evolution not so easy
- Great opportunities for optimization
- Model everything in advance
Zentity Store
- Maintain a balance
- Try to model the frequently used entities in our app domain
- Try to capture the frequently used relationships
- Allow for extensibility (Relationships, Properties)
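The trade-off above can be sketched as a schema in which frequently used entities get ordinary columns while open-ended relationships and properties are stored as generic rows. The Python example below uses SQLite purely for illustration. It is not the Zentity.Core schema: the table names, column names, and sample data are all assumptions.

```python
import sqlite3


def create_store():
    """Create an in-memory store with one modeled table and two generic ones."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        -- Frequently used entity, modeled up front as ordinary columns.
        CREATE TABLE resource (
            id    INTEGER PRIMARY KEY,
            kind  TEXT NOT NULL,
            title TEXT NOT NULL
        );
        -- Open-ended relationships, stored as subject/predicate/object rows.
        CREATE TABLE relationship (
            subject_id INTEGER NOT NULL REFERENCES resource(id),
            predicate  TEXT NOT NULL,
            object_id  INTEGER NOT NULL REFERENCES resource(id)
        );
        -- Open-ended properties that were not modeled in advance.
        CREATE TABLE property (
            resource_id INTEGER NOT NULL REFERENCES resource(id),
            name  TEXT NOT NULL,
            value TEXT NOT NULL
        );
        CREATE INDEX idx_rel ON relationship (predicate, object_id);
    """)
    return db


def add_resource(db, kind, title):
    """Insert a modeled entity and return its id."""
    cur = db.execute("INSERT INTO resource (kind, title) VALUES (?, ?)", (kind, title))
    return cur.lastrowid


def relate(db, subject_id, predicate, object_id):
    """Record a relationship without changing the schema."""
    db.execute("INSERT INTO relationship VALUES (?, ?, ?)",
               (subject_id, predicate, object_id))


def subjects_of(db, predicate, object_id):
    """Titles of resources linked to object_id through predicate."""
    rows = db.execute(
        "SELECT r.title FROM relationship AS rel "
        "JOIN resource AS r ON r.id = rel.subject_id "
        "WHERE rel.predicate = ? AND rel.object_id = ? ORDER BY r.title",
        (predicate, object_id))
    return [title for (title,) in rows]


if __name__ == "__main__":
    db = create_store()
    person = add_resource(db, "Person", "Example Author")
    lecture = add_resource(db, "Lecture", "Example Lecture")
    slides = add_resource(db, "Presentation", "Example Slides")
    relate(db, lecture, "presented by", person)
    relate(db, slides, "authored by", person)
    db.execute("INSERT INTO property VALUES (?, ?, ?)", (lecture, "room", "Hall A"))
    print(subjects_of(db, "authored by", person))
```

In this sketch the modeled `resource` table can be indexed and optimized like any relational table, while a new predicate or property name needs no schema change, which is the balance the slide describes.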
Research Output Repository Platform
[Figure: example entity graph. Nodes: a PowerPoint presentation, a PDF file, a lecture on 2/19/2008, and people (tony; Elizabeth, Sebastien, Matthew, Norman, Brian, Sarah, George, Roy). Relationship labels: “authored by”, “presented by”, “organized by”, “is representation of”, “contains”.]
Installation
[Installer screenshots: EULA; database instance fields showing localhost\SQLExpress; FILESTREAM File Location; OAI-PMH database; Configure IIS; IIS App Pool]
ZENTITY DEMO
• Basic Search
• Search Filters
• Advanced Query Syntax (AQS)
– Field Support
• Advanced Search
Search
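To make the idea of field support concrete, here is a small Python sketch that splits a query into field filters and free-text keywords. It handles only the `field:value` and `field:(several words)` forms and is a simplified illustration, not the Zentity.Search parser or the full AQS grammar. The function name and the returned structure are assumptions.

```python
import re

# One token is field:(grouped words), field:word, or a bare keyword.
TOKEN = re.compile(r"(\w+):\s*\(([^)]*)\)|(\w+):\s*(\S+)|(\S+)")


def parse_query(query):
    """Split a query into a dict of field filters and a list of keywords."""
    fields, keywords = {}, []
    for m in TOKEN.finditer(query):
        if m.group(1):
            # field:(several words) form
            fields[m.group(1).lower()] = m.group(2).split()
        elif m.group(3):
            # field:word form (whitespace after the colon is tolerated)
            fields[m.group(3).lower()] = [m.group(4)]
        else:
            keywords.append(m.group(5))
    return fields, keywords


if __name__ == "__main__":
    print(parse_query("resourcetype: book author:(tony hey)"))
    print(parse_query("repository author:hey"))
```

A real parser would also need quoting, boolean operators, and range expressions, all of which this sketch leaves out.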
– http://<myserver>/Syndication/Syndication.ashx?resourcetype: book author:(tony hey)
Syndication
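The syndication URL above carries the search query after the `?`, and that query contains spaces, so a client may need to percent-encode it before sending the request. The Python sketch below shows one way to do that. The path `/Syndication/Syndication.ashx` is taken from the slide; the server name and the helper function are placeholders, and whether the service expects exactly this encoding is an assumption.

```python
from urllib.parse import quote


def syndication_url(server, query):
    """Build a syndication feed URL with the search query percent-encoded."""
    # Keep ':' and parentheses readable; encode spaces and other unsafe characters.
    encoded = quote(query, safe=":()")
    return "http://%s/Syndication/Syndication.ashx?%s" % (server, encoded)


if __name__ == "__main__":
    # "myserver" stands in for the host name of the actual deployment.
    print(syndication_url("myserver", "resourcetype: book author:(tony hey)"))
```

A feed reader could subscribe to the resulting URL and be notified when new resources match the query.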
• Web UI & UI Toolkit
– CSS
– ASP.NET Controls
• Services
• Search
• Security
• Data Model
Extensibility
• The site offers access to, and downloads of, relevant tools and resources for the worldwide academic research community. Examples include:
– Research Output Repository: building blocks, tools, and services for developers who are tasked with creating and maintaining an organization’s repository ecosystem. http://research.microsoft.com/zentity
– Tools and Services for Research Collaboration: http://research.microsoft.com/en-us/collaboration/tools/default.aspx
Further Information and Resources
http://research.microsoft.com