Andy Jenkinson, EBI
An Introduction to DAS
Summary of Topics
• What is Data Integration?
• Problems in Data Integration
• An architectural overview of DAS
• Brief History of DAS
What is Data Integration
All These are Data Integration
• Reading some papers so you can write a report
• Exploring some database websites so you can learn about a topic
• Downloading some data from different databases so you can analyse it
• Downloading some data from different databases so you can combine it with your own
Data Integration
• “Automatic” data integration
  • pulling in data from different locations
  • processing it
  • creating a resource derived from the data
  • done via computers, not humans
• e.g. creating/updating a data warehouse
[Figure: Warehouse model. Labels: PDB, Ensembl, UniProt, Warehouse]
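The warehouse steps listed above (pull, process, derive a resource) can be sketched in a few lines. This is only an illustration under stated assumptions: the two source functions, their field names and the `X0001`-style identifiers are made up for the example and do not reflect the real PDB, Ensembl or UniProt interfaces.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Record = Dict[str, object]


def fetch_structures() -> List[Record]:
    """Hypothetical source 1: uses the key 'acc' for identifiers."""
    return [{"acc": "X0001", "structures": 2}, {"acc": "X0002", "structures": 0}]


def fetch_sequences() -> List[Record]:
    """Hypothetical source 2: uses the key 'accession' instead."""
    return [{"accession": "X0001", "length": 120}]


def normalise(record: Record, id_key: str) -> Tuple[str, Record]:
    """Processing step: separate the source-specific id from the other fields."""
    fields = {k: v for k, v in record.items() if k != id_key}
    return str(record[id_key]), fields


def build_warehouse(
    sources: Iterable[Tuple[Callable[[], List[Record]], str]]
) -> Dict[str, Record]:
    """Pull every record from every source and merge them by identifier."""
    warehouse: Dict[str, Record] = {}
    for fetch, id_key in sources:
        for record in fetch():
            identifier, fields = normalise(record, id_key)
            warehouse.setdefault(identifier, {}).update(fields)
    return warehouse


warehouse = build_warehouse(
    [(fetch_structures, "acc"), (fetch_sequences, "accession")]
)
print(warehouse["X0001"])
```

Note that the warehouse has to copy everything and needs a `normalise` rule per source, which is where the problems on the following slides come from.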
Data Integration: like herding cats
Databases are all different
Databases evolve
Data ages
Databases are big
Distributed Annotation System
• Distributed
• Client-Server architecture
• Federation
• RESTful web services
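As a rough sketch of what "federated" and "RESTful" mean for a client: the same URL-shaped question is sent to several independent servers, and the data stays with each provider. The URL pattern below follows the DAS 1.x `features?segment=` convention as I understand it; the server addresses, the source names and the stub `fake_fetch` function are assumptions made for the example, and no network request is made.

```python
from typing import Callable, Dict
from urllib.parse import urlencode


def features_url(server: str, source: str, segment: str, start: int, stop: int) -> str:
    """Build a DAS-style 'features' request URL for one region of a segment."""
    # Keep ':' and ',' unescaped so the range reads as segment:start,stop.
    query = urlencode({"segment": f"{segment}:{start},{stop}"}, safe=":,")
    return f"{server.rstrip('/')}/das/{source}/features?{query}"


def federated_query(
    servers: Dict[str, str],
    segment: str,
    start: int,
    stop: int,
    fetch: Callable[[str], str],
) -> Dict[str, str]:
    """Send the same region query to every server and collect the raw replies."""
    return {
        source: fetch(features_url(base, source, segment, start, stop))
        for source, base in servers.items()
    }


# Hypothetical servers, keyed by the data source name each one hosts.
servers = {
    "genes": "http://das.example.org",
    "variants": "http://annotations.example.net/",
}


def fake_fetch(url: str) -> str:
    """Stand-in for an HTTP GET; a real client would download the URL here."""
    return f"<reply from {url}>"


responses = federated_query(servers, "1", 1000, 2000, fake_fetch)
for source, reply in responses.items():
    print(source, reply)
```

Passing `fetch` in as a parameter keeps the sketch runnable offline; swapping in a real HTTP call would not change the rest of the code.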
[Figure: Warehouse model compared with DAS model]
Architectural Overview
DAS
• Databases are all different
  • DAS is a uniform facet of a database – always the same
• Databases change their structure
  • when the database changes, DAS stays the same
• Databases are updated
  • DAS data comes directly from the provider so is always fresh
• Databases are big
  • DAS uses real-time targeted queries
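The "uniform facet" and "targeted queries" points can be illustrated with a small adapter sketch. Everything here is hypothetical: the `Feature` fields, the two backend classes and their internal layouts are invented for the example and are not taken from the DAS specification. The idea shown is only that very different storage can sit behind one unchanging query method, and that a client asks for a region instead of downloading the whole database.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol, Tuple


@dataclass(frozen=True)
class Feature:
    segment: str
    start: int
    end: int
    label: str


class AnnotationSource(Protocol):
    """The uniform facet: every source answers the same region query."""

    def features(self, segment: str, start: int, end: int) -> List[Feature]: ...


class RowBackedSource:
    """Hypothetical backend storing tuples (chrom, begin, finish, name)."""

    def __init__(self, rows: List[Tuple[str, int, int, str]]) -> None:
        self._rows = rows

    def features(self, segment: str, start: int, end: int) -> List[Feature]:
        return [
            Feature(chrom, begin, finish, name)
            for chrom, begin, finish, name in self._rows
            if chrom == segment and begin <= end and finish >= start
        ]


class DictBackedSource:
    """Hypothetical backend storing dicts with different field names."""

    def __init__(self, records: List[Dict[str, object]]) -> None:
        self._records = records

    def features(self, segment: str, start: int, end: int) -> List[Feature]:
        return [
            Feature(str(r["seq"]), int(r["from"]), int(r["to"]), str(r["id"]))
            for r in self._records
            if r["seq"] == segment and int(r["from"]) <= end and int(r["to"]) >= start
        ]


sources: List[AnnotationSource] = [
    RowBackedSource([("1", 100, 200, "geneA"), ("1", 5000, 6000, "geneB"),
                     ("2", 150, 250, "geneC")]),
    DictBackedSource([{"seq": "1", "from": 250, "to": 400, "id": "variantX"},
                      {"seq": "1", "from": 900, "to": 950, "id": "variantY"}]),
]

# Targeted query: only features overlapping segment 1, positions 150 to 300.
labels = [f.label for src in sources for f in src.features("1", 150, 300)]
print(labels)
```

If either backend changes its internal layout, only its own `features` method needs updating; the client code at the bottom stays the same.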
History
Developed circa 1999 for sharing genome annotations
Expanded 2004 onwards
• more data types
• better metadata
• addition of Registry
DAS/2 project
• split from DAS, not backwards compatible
• inspired some DAS developments
To Summarise…
The Distributed Annotation System is…
• A network of biological data sources
• An example of federation
• A collection of REST web services
The DAS Protocol is…
• An integration platform
• A client-server protocol
• An agreed standard
Image Credits
• Flickr/muir.ceardach
• Flickr/Horia Varlan
• Flickr/Alessandro Pinna
• Fotopedia/Jean-Marie Hullot
• listicles.com/?p=3485
• Google Earth/Cnes/Spot Image
• Olivier H. Beauchesne