Managing a Data Catalog
Promoting Data Reuse and Collaboration at an Academic Medical Center
Nicole Contaxis, Project Coordinator
Ian Lamb, Solutions Developer
Institutional Structure
Meeting User Needs
Heavy users of large, external datasets (e.g. Census, national health surveys, Medicare)
Department of Population Health
Lack of knowledge about institutional licenses
Difficulty accessing datasets
Difficulty working with datasets
NYU Data Catalog
NYU Data Catalog Home Page
NYU Data Catalog Home Page: Search
NYU Data Catalog Home Page: Filter
Record Details - External Datasets
Record Details - External Datasets: Local Experts
Record Details - External Datasets: Access Instructions
Record Details - External Datasets: PubMed Search
Record Details - Internal Datasets
Record Details - Internal Datasets: Authors
Record Details - Internal Datasets: Access Instructions
Record Details - Internal Datasets: Associated Publications
Record Details External Internal
Metadata: Strategy over Purity
Common Metadata Elements from Biomedical Repositories
General Metadata Schemas Consulted
DCAT
Matching External Efforts
Translating Form into Function
Our carefully selected metadata model needed to become a usable application
Goals
•Faithfully reproduce metadata schema specified by our librarians
•Enable easy maintenance of any items that will need to be updated often in the future
•Make sure all forms and user interfaces help rather than hinder the ongoing maintenance of a growing collection
The best way to meet these goals was not the easiest way…
Goals
•Faithfully reproduce metadata model specified by librarians
•Enable easy maintenance of any items that will need to be updated often in the future
•Make sure all forms and user interfaces help rather than hinder the ongoing maintenance of a growing collection
Help, don’t Hinder, the Maintainers
Ease of use = clean data
•Enable user to easily refer back to previous fields
•To avoid messy data, discourage us from adding items that may already exist (e.g. don’t let us add “J Doe” if “John Doe” is already in the system)
•If we do have to add a new metadata “entity,” we shouldn’t lose all the progress we’ve made entering this dataset record
Discouraging Duplicates
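One way a data-entry form could warn about likely duplicates is sketched below. This is a minimal illustration, not the catalog's actual implementation: the function names, the 0.85 similarity threshold, and the initial-matching rule are all assumptions made for the example. It flags an existing name when the two strings are nearly identical, or when the surnames match and one given name is just an initial of the other (the “J Doe” versus “John Doe” case).

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(cleaned.split())


def is_initial_match(candidate: str, existing: str) -> bool:
    """True when surnames match and one given name is an initial of the other."""
    c, e = normalize(candidate).split(), normalize(existing).split()
    if len(c) < 2 or len(e) < 2 or c[-1] != e[-1]:
        return False
    return c[0][0] == e[0][0] and (len(c[0]) == 1 or len(e[0]) == 1)


def find_possible_duplicates(candidate, existing_names, threshold=0.85):
    """Return existing names that look like the candidate (hypothetical helper).

    The threshold is an assumed value; a real system would tune it
    against its own data.
    """
    matches = []
    for name in existing_names:
        ratio = SequenceMatcher(None, normalize(candidate), normalize(name)).ratio()
        if ratio >= threshold or is_initial_match(candidate, name):
            matches.append(name)
    return matches


if __name__ == "__main__":
    people = ["John Doe", "Jane Smith"]
    # The form could show these suggestions before allowing a new entry.
    print(find_possible_duplicates("J Doe", people))
```

A form built this way would suggest the existing entry rather than block the new one outright, leaving the final decision to the person entering the record.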
Adding New Items Without Getting Lost
New metadata items can be added to the system without leaving this form that you’ve spent the last hour on
Ease of Use = Clean Data
If your system is difficult to use, no one will want to use it
Processing Internal & External Datasets
External Dataset
Internal Dataset
Make Your Own: Code and Documentation Availability
Code on GitHub
https://github.com/nyuhsl/data-catalog
Questions?
Contact: Data Services Team, NYU Langone Medical [email protected]