CASRAI ReConnect 2015 INTELLECTUAL PROPERTY AND SCIENCE STEVE REVUCKY, PRE-SALES SOLUTIONS SPECIALIST Tuesday, July 5, 2022

Data standardization within a research information system framework - Steve Revucky


Page 1: Data standardization within a research information system framework - Steve Revucky

CASRAI ReConnect 2015
INTELLECTUAL PROPERTY AND SCIENCE

STEVE REVUCKY, PRE-SALES SOLUTIONS SPECIALIST
May 3, 2023

Page 2: Data standardization within a research information system framework - Steve Revucky

FOLLOWING AN EXAMPLE
• Should be easy, right?

Page 3: Data standardization within a research information system framework - Steve Revucky

STANDARDIZATION IS THIS EASY

Page 4: Data standardization within a research information system framework - Steve Revucky

YOU SAY POTATO…
– Classifications and standardizations exist so that we can all understand each other and streamline communication
– Taxonomies and lexica designed by organizations such as CASRAI, among others, allow researchers, funders, partners, government authorities, and others to understand each other
– But what about balancing flexibility with standardization?

Page 5: Data standardization within a research information system framework - Steve Revucky

CONVERIS integrates with internal and external systems

-VIVO, etc.

Page 6: Data standardization within a research information system framework - Steve Revucky

Engagement in Industry Standards

CONVERIS

Page 7: Data standardization within a research information system framework - Steve Revucky

CONFIGURATION
• Workflow processes
• Labels
• Entities (create or adapt)
• Roles and rights

Page 8: Data standardization within a research information system framework - Steve Revucky
Page 9: Data standardization within a research information system framework - Steve Revucky

Integrating Systems with Converis

Page 10: Data standardization within a research information system framework - Steve Revucky

ETL Concept

• Extract data in a certain format (CSV, XML, JSON, etc.) from a source location

• Transform and apply business logic to data including aggregation, counting, concatenation, scripting, lookups, merging, push files, etc.

• Load data in a certain format (CSV, XML, JSON) to a destination location

Extract Transform Load
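
The three ETL stages described above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the tooling used in the talk; the column names (`id`, `first`, `last`, `title`) and the sample rows are made up for the example. It extracts rows from CSV text, transforms them by concatenating name parts and counting records per person, and loads the result as JSON.

```python
import csv
import io
import json


def extract(csv_text):
    """Extract: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def transform(rows):
    """Transform: concatenate name parts and count publications per person."""
    people = {}
    for row in rows:
        person = people.setdefault(
            row["id"],
            {
                "id": row["id"],
                "name": f'{row["first"]} {row["last"]}',  # concatenation
                "publications": 0,
            },
        )
        person["publications"] += 1  # counting / aggregation
    return list(people.values())


def load(records):
    """Load: serialize the records as JSON text for the destination."""
    return json.dumps(records, indent=2)


# Hypothetical sample data, invented for this sketch.
SAMPLE = """id,first,last,title
1,Alex,Example,First sample paper
1,Alex,Example,Second sample paper
2,Sam,Sample,Another sample paper
"""

if __name__ == "__main__":
    print(load(transform(extract(SAMPLE))))
```

In a real integration the source and destination would be files or systems rather than in-memory strings, but the separation into three independent steps stays the same.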

Page 11: Data standardization within a research information system framework - Steve Revucky

ETL and Converis
• General ETL (all output formats/steps allowed)

• Converis ETL (fixed output step)

Extract Transform Load

Extract Transform Load

Plugin is installed on your Converis server
It needs to be installed on your workstation too

Page 12: Data standardization within a research information system framework - Steve Revucky

Implementing Integrations
The requirements document covers three points:

File Handling
− Format (*.csv)
− Location (/dir/*)
− Frequency (e.g. nightly)

Data/Field Mapping
− “hrID” = “converisID”
− “surname” = “lastName”

Business Logic
− What records should be added?
− What updates/changes to data can/should be made in Converis?
− Bidirectional integration?

Sample Banner Req. Doc
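
As a rough illustration of how the field mapping and business-logic points fit together, the sketch below renames source fields using the two mappings from the slide and then decides which records are additions and which are updates. The function names, the sample records, and the "add if the ID is unknown, update if anything differs" rule are assumptions made for this example; they are not taken from Converis or from the sample requirements document.

```python
# Field mapping taken from the slide: source (HR) field -> target field.
FIELD_MAP = {"hrID": "converisID", "surname": "lastName"}


def map_record(hr_record, field_map=FIELD_MAP):
    """Rename mapped source fields; fields without a mapping are dropped."""
    return {
        target: hr_record[source]
        for source, target in field_map.items()
        if source in hr_record
    }


def plan_changes(hr_records, existing_by_id):
    """Split mapped records into additions and updates.

    Hypothetical rule: a record is added when its ID is not yet known,
    and updated when the mapped values differ from the stored ones.
    Assumes every source record carries an hrID.
    """
    to_add, to_update = [], []
    for hr_record in hr_records:
        mapped = map_record(hr_record)
        current = existing_by_id.get(mapped["converisID"])
        if current is None:
            to_add.append(mapped)
        elif current != mapped:
            to_update.append(mapped)
    return to_add, to_update


if __name__ == "__main__":
    # Made-up records for illustration only.
    hr = [
        {"hrID": "100", "surname": "Smith", "dept": "Physics"},
        {"hrID": "101", "surname": "Jones"},
    ]
    existing = {"100": {"converisID": "100", "lastName": "Smyth"}}
    print(plan_changes(hr, existing))
```

Keeping the mapping in a plain dictionary means the requirements document and the code can be checked against each other line by line.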

Page 13: Data standardization within a research information system framework - Steve Revucky

System architecture

[Architecture diagram; labels recovered from the slide:]
• CONVERIS (Java EE, GlassFish): Java Server Faces (JSF), Business logic (EJB), Mapping Engine
• API: REST Web services, OAI-PMH
• Database: PostgreSQL
• Search Engine: Apache Solr
• Research Analytics: Pentaho
• Data Integration: Kettle ETL
• Login Server
• Internal Data Sources: Fin-system, HR-system, …
• Institutional Repositories: DSpace, Fedora, EPrints
• External Data Sources: Scopus, WoS, PubMed, ORCID, …

CONVERIS is a Java EE application following the typical Java EE 3-tier architecture, with a modular design of user interface, business logic (i.e. functionality), and data management (i.e. data model)

Page 14: Data standardization within a research information system framework - Steve Revucky

CUSTOMIZATION (WITH LIMITS)
• XML templates
• Choicegroup modification
• Field formatting

Page 15: Data standardization within a research information system framework - Steve Revucky

RESEARCH AREA CLASSIFICATIONS
– Keyword classifications:

Page 16: Data standardization within a research information system framework - Steve Revucky

AUTHOR DISAMBIGUATION

Page 17: Data standardization within a research information system framework - Steve Revucky

TO THE WIDER WORLD

Page 18: Data standardization within a research information system framework - Steve Revucky

FUTURE POSSIBILITIES AND PLANS
• Any structured data can be ingested into Converis
• Fields can be mapped to existing or new fields
• Each implementation is customized, so potential exists to follow CASRAI guidelines:
  – CRediT – Contributor roles taxonomy
• Canadian Common CV (CCV) coming soon – early 2016
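
One way an implementation might follow the CRediT guideline is to normalize free-text contributor roles against the taxonomy's fourteen role names before loading them. The sketch below is an assumption about how that could look, not a description of Converis functionality; the helper names are hypothetical, and labels that do not match a CRediT role are returned as `None` so they can be reviewed by hand.

```python
# The fourteen CRediT contributor roles.
CREDIT_ROLES = (
    "Conceptualization",
    "Data curation",
    "Formal analysis",
    "Funding acquisition",
    "Investigation",
    "Methodology",
    "Project administration",
    "Resources",
    "Software",
    "Supervision",
    "Validation",
    "Visualization",
    "Writing – original draft",
    "Writing – review & editing",
)


def _key(label):
    """Build a comparison key: unify dashes, lowercase, collapse whitespace."""
    return " ".join(label.replace("\u2013", "-").lower().split())


# Lookup from comparison key to the canonical role name.
_LOOKUP = {_key(role): role for role in CREDIT_ROLES}


def normalize_role(label):
    """Return the canonical CRediT role for a label, or None if unmatched."""
    return _LOOKUP.get(_key(label))


if __name__ == "__main__":
    for raw in ["  data  CURATION ", "Writing - Original Draft", "Proofreading"]:
        print(repr(raw), "->", normalize_role(raw))
```

Matching is deliberately strict (case, spacing, and dash style only), which keeps the mapping predictable and leaves ambiguous labels for a person to resolve.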

Page 19: Data standardization within a research information system framework - Steve Revucky

Thank you

Steve Revucky
Pre-Sales Solutions Specialist
IP & Science
Thomson Reuters
Philadelphia, PA

Tel: +1 215 823 [email protected]