
User Group Meeting, June 6, 2006

CrossRef User Group Meeting

Crystal Gateway Marriott

Arlington, VA

June 6th, 2006

Chuck Koscher

Technology Director, CrossRef

[email protected]

Ed Pentz

Executive Director, CrossRef

[email protected]


Agenda *

2:30 - 3:00    CrossRef Overview & Update - Ed Pentz, Executive Director
3:00 - 3:20    System Update - Chuck Koscher, Technology Director
3:20 - 3:45    Data Quality Initiative - Chuck Koscher
3:45 - 4:00    Services: Forward Linking & Simple Text Query - Ed Pentz
4:00 - 4:15    Break
4:15 - 4:30    Multiple Resolution Pilot - Ed Pentz
4:30 - 5:15    CrossRef Web Services
                   Program overview & status - Ed Pentz
                   OpenURL, RSS, & OAI-PMH Interfaces - Chuck Koscher
5:15 - ?       Questions

* Please feel free to open discussion at any point on any topic.


System Update (status)

• System performance has remained acceptable, but loading has increased
  - Query response times remain under 1 second (typically 300-500 msec)
  - Stored query cycle time is ~4 weeks
  - Deposit times:

Deposit time           April              May
Less than 5 min        47285 (57 %)       40255 (46 %)
Less than 1 hr         14194 (17 %)       23384 (26 %)
Less than 6 hr         16616 (20 %)       16867 (19 %)
Less than 12 hr          926 (1 %)         4903 (5 %)
Less than 18 hr         1301 (1 %)          651 (0 %)
Less than 24 hr          895 (1 %)           13 (0 %)
More than 24 hr         1350 (1 %)          446 (0 %)

• Upgrades: move the Oracle DB to a 4-CPU dual-core x86 machine (from Sun SPARC)
  - Postpone the major re-architecture project until the effects of the new machine are known


System Update (new features)

• As of April we have been accepting deposits for extended content types
  - Technical reports / working papers
  - Dissertations / theses
  - Standards
  - Soon we will add support for deposit of DOIs for database records

• New query result format: UNIXREF
  - Returns the exact data the publisher deposited for the individual DOI
  - Returns <citation_list> to the DOI’s owner

http://doi.crossref.org/servlet/query?usr=X&pwd=X&format=unixref&qdata=|Journal of Neuroscience Research|Chen|66|4|612|2001|||

format=xsd_xml
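A minimal sketch of how a client might assemble the UNIXREF query shown above. The endpoint, the parameter names (usr, pwd, format, qdata) and the piped field order are copied from the slide's example; the function names and the meaning assigned to each pipe-delimited position are assumptions inferred from that single example, not a documented API.

```python
from urllib.parse import urlencode

# Endpoint and parameter names are taken from the slide's example URL.
QUERY_URL = "http://doi.crossref.org/servlet/query"


def build_qdata(title, author, volume, issue, page, year):
    """Build the pipe-delimited query string (hypothetical helper).

    The empty leading field and the three empty trailing fields mirror the
    slide's example; what those positions mean is not stated there.
    """
    fields = ["", title, author, volume, issue, page, year, "", "", ""]
    return "|".join(str(value) for value in fields)


def build_query_url(usr, pwd, qdata, result_format="unixref"):
    """Return a percent-encoded query URL for the given credentials and query."""
    params = {"usr": usr, "pwd": pwd, "format": result_format, "qdata": qdata}
    return QUERY_URL + "?" + urlencode(params)


qdata = build_qdata("Journal of Neuroscience Research", "Chen", 66, 4, 612, 2001)
url = build_query_url("X", "X", qdata)
```

Encoding the query with urlencode keeps the spaces and pipe characters from being misread by intermediaries; passing result_format="xsd_xml" would request the other format named on the slide.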


System Update (new features)

• Improved queue management

- Deposits and batch queries are uploaded as files to CrossRef. They are then processed out of a single queue of jobs.
- We run up to 12 processors (SPs) to work on these jobs.
- Control by limiting file size, specifying a user, or excluding a user:

1-2 SPs    < 20,000
2-3 SPs    < 50,000
2-3 SPs    < 200,000
2-3 SPs    no size limit

- If you are going to be submitting a large number of back files or other significant volumes please contact us to discuss creating a special user name for this activity.
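The size tiers and the per-user controls can be pictured with a small sketch. This is only an illustration of the idea, not CrossRef's implementation: the class, the group labels and the selection logic are hypothetical, and the slide does not state the unit of the size limits, so the numbers are used as-is.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ProcessorGroup:
    """A hypothetical group of submission processors with its admission rules."""
    label: str
    max_size: Optional[int] = None       # None means "no size limit"
    only_user: Optional[str] = None      # if set, serve this user exclusively
    excluded_users: FrozenSet[str] = frozenset()

    def accepts(self, user: str, size: int) -> bool:
        """Return True if a job of this size from this user may run here."""
        if self.max_size is not None and size >= self.max_size:
            return False
        if self.only_user is not None and user != self.only_user:
            return False
        return user not in self.excluded_users


# Limits copied from the slide; the unit is not given there.
GROUPS = [
    ProcessorGroup("small", max_size=20_000),
    ProcessorGroup("medium", max_size=50_000),
    ProcessorGroup("large", max_size=200_000),
    ProcessorGroup("unlimited"),
]


def eligible_groups(user: str, size: int, groups=GROUPS) -> List[str]:
    """List the labels of every group allowed to pick up the job."""
    return [group.label for group in groups if group.accepts(user, size)]
```

Under these assumptions a small file can be picked up by any group, while a very large one waits for the unlimited group, which is why a dedicated user name for bulk back-file loads keeps them from crowding out routine deposits.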


Data Quality Initiative

Metadata quality: initially, CrossRef was not intended to be a metadata distribution service; metadata simply had to be good enough to match DOIs. Now (primarily due to forward linking) CrossRef metadata is displayed on publishers’ web sites.

• Focus areas: 1) publication title & ISSN accuracy; 2) complete metadata records; 3) author name accuracy

Tactic: Make publishers aware of their data quality

• Link persistence: an unacceptable number of DOIs no longer work! This is primarily due to journals that move between publishers: the old publisher abandons its DOIs and the new publisher assigns new ones.

Tactic: ‘Ping’ test all journals on a regular basis. Notify publishers, publicize results.
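A ‘ping’ test of this kind could look roughly like the sketch below. It is an assumption-laden illustration, not the checker CrossRef runs: the resolver address https://doi.org/, the function names, and the rule that any final status from 200 to 399 counts as working are all choices made here for the example.

```python
import urllib.error
import urllib.request
from typing import Callable, Dict, Iterable
from urllib.parse import quote

# Assumption: DOIs are resolved through the public resolver at doi.org.
RESOLVER = "https://doi.org/"


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status for url, or 0 if no response was received."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except urllib.error.URLError:
        return 0


def ping_dois(dois: Iterable[str],
              fetch_status: Callable[[str], int] = http_status) -> Dict[str, bool]:
    """Map each DOI to True if it resolves to a page, False otherwise."""
    results = {}
    for doi in dois:
        status = fetch_status(RESOLVER + quote(doi, safe="/"))
        results[doi] = 200 <= status < 400
    return results
```

Passing the status function as a parameter lets the same loop run against a stub, so a per-journal report can be tested without touching the network.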


06:53:04 - Missing Conflict Checker for 6-JUN-2006 started.

Submission contained email: [email protected] journalciteids:2040

================================
Checking DOIs in file: 2040.xml
10.1016/S0169-5150(02)00072-5 : 10.1111/j.1574-0862.2002.tb00125.x : Titles match (The dynamics of land-cover change in western Honduras: exploring spatial and temporal complexity)
10.1016/S0169-5150(02)00073-7 : 10.1111/j.1574-0862.2002.tb00124.x : Titles match (Land use dynamics in the central highlands of Vietnam: a spatial model combining village survey data with satellite imagery interpretation)
10.1016/S0169-5150(02)00074-9 : 10.1111/j.1574-0862.2002.tb00123.x : Titles match (Temporal and spatial modelling of tropical deforestation: a survival analysis linking satellite and household survey data)
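Report lines of the form "DOI : DOI : verdict (title)" are regular enough to parse mechanically. The sketch below assumes that layout holds for every result line; the pattern, the class name and the field names are hypothetical, and the slide does not say which of the two DOIs is the older one.

```python
import re
from typing import NamedTuple, Optional

# Two whitespace-free DOIs, a verdict, then the title in trailing parentheses.
LINE_PATTERN = re.compile(r"^(\S+) : (\S+) : (.+?)\s*\((.*)\)$")


class ConflictLine(NamedTuple):
    first_doi: str
    second_doi: str
    verdict: str
    title: str


def parse_conflict_line(line: str) -> Optional[ConflictLine]:
    """Split one report line into its parts; return None for other lines."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    return ConflictLine(*match.groups())


sample = (
    "10.1016/S0169-5150(02)00072-5 : 10.1111/j.1574-0862.2002.tb00125.x : "
    "Titles match (The dynamics of land-cover change in western Honduras: "
    "exploring spatial and temporal complexity)"
)
parsed = parse_conflict_line(sample)
```

Header lines such as the timestamp or the "Checking DOIs in file" line do not match the pattern and come back as None, so a caller can filter them out.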


CrossRef Web Services: Interfaces OAI-PMH

• Current practice: local hosters (members or affiliates) receive one of three forms of XML data for use in internal linking systems and data clean-up. No redisplay permitted. No citation data made available.

• The OAI-PMH interface will support all user types and all data formats (replacing the current local hoster methods). It will allow selective delivery based on the publisher opt-in/opt-out profile and on the user agreement (available Q3 2006).

- HTTP-based protocol
- Defined set of ‘verbs’: Identify, GetRecord, ListIdentifiers, ListMetadataFormats, ListRecords, ListSets

http://www.crossref.org/OAI?verb=ListSets

- IP authentication (OAI-PMH standard method)

- Username authentication (added to allow each CrossRef publisher to retrieve its own data)
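As a rough illustration of the verb-based interface, the sketch below builds OAI-PMH request URLs against the base address shown on the slide. The six verbs and the argument name metadataPrefix come from the OAI-PMH protocol; the function name is hypothetical, and which metadata formats and sets the CrossRef service would actually offer is not stated in the slides.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

# Base URL as shown on the slide.
OAI_BASE_URL = "http://www.crossref.org/OAI"

# The six request verbs defined by OAI-PMH.
OAI_VERBS = frozenset({
    "Identify", "GetRecord", "ListIdentifiers",
    "ListMetadataFormats", "ListRecords", "ListSets",
})


def oai_request_url(verb: str, arguments: Optional[Dict[str, str]] = None) -> str:
    """Return the request URL for a verb plus optional protocol arguments."""
    if verb not in OAI_VERBS:
        raise ValueError("unknown OAI-PMH verb: " + verb)
    params = {"verb": verb}
    params.update(arguments or {})
    return OAI_BASE_URL + "?" + urlencode(params)


list_sets_url = oai_request_url("ListSets")
# oai_dc is the Dublin Core format every OAI-PMH repository must support;
# whether other formats are available here is an open question.
list_records_url = oai_request_url("ListRecords", {"metadataPrefix": "oai_dc"})
```

Arguments are passed as a dictionary rather than keyword arguments because the protocol's "from" argument is a reserved word in Python.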


CrossRef Web Services: Interfaces OpenURL

• CrossRef currently operates a NISO Z39.88-2004 compliant resolver at http://www.crossref.org/openurl

• Allows public query and resolution services (no login required)

http://www.crossref.org/openurl?url_ver=Z39.88-2004&rft_id=info:doi/10.1103/PhysRev.47.777

http://www.crossref.org/openurl?aulast=Maas%20LRM&title=JOURNAL%20OF%20PHYSICAL%20OCEANOGRAPHY&volume=32&issue=3&spage=870&date=2002

http://www.crossref.org/openurl?id=doi:10.1103/PhysRev.47.777&noredirect=true

http://www.crossref.org/openurl?issn=03770273&aulast=Walker&volume=54&spage=117&date=1983&noredirect
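The example requests above can be generated programmatically. The following is a minimal sketch: the resolver address and the parameter names (id, aulast, title, volume, issue, spage, date, noredirect) are those used in the slide's URLs, while the helper names are hypothetical.

```python
from typing import Dict
from urllib.parse import quote, urlencode

# Resolver address as given on the slide.
OPENURL_BASE = "http://www.crossref.org/openurl"


def openurl_query(params: Dict[str, object]) -> str:
    """Encode params as an OpenURL request, leaving ':' and '/' readable."""
    return OPENURL_BASE + "?" + urlencode(params, quote_via=quote, safe=":/")


def doi_lookup_url(doi: str, redirect: bool = True) -> str:
    """Build a lookup by DOI; redirect=False adds noredirect=true."""
    params: Dict[str, object] = {"id": "doi:" + doi}
    if not redirect:
        params["noredirect"] = "true"
    return openurl_query(params)


metadata_url = openurl_query({
    "aulast": "Maas LRM",
    "title": "JOURNAL OF PHYSICAL OCEANOGRAPHY",
    "volume": 32,
    "issue": 3,
    "spage": 870,
    "date": 2002,
})
doi_url = doi_lookup_url("10.1103/PhysRev.47.777", redirect=False)
```

Using quote rather than the default quote_plus encodes spaces as %20, which matches the form of the metadata query shown on the slide.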


CrossRef Web Services: Interfaces RSS

• RSS has been discussed as a method for distributing metadata

- Possibly a daily feed listing new DOIs (an alerting service?)

• Challenges

- RSS does not easily support large bulk distributions (great for daily changes and ‘newsy’ content)

- Does not have integral support for discovery (if you want only a subset of the data or don’t know exactly what is available)

RSS feed is just a URL to an XML file http://server.com/my_rss.xml

- What is the real business value of a CrossRef metadata feed using RSS (is it complementary to or in conflict with publisher feeds)?
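To make the "daily feed listing new DOIs" idea concrete, here is a sketch that renders such a feed as RSS 2.0. No feed of this kind is specified in the slides, so the layout, the function name, the channel description and the use of https://doi.org/ links are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple


def build_new_doi_feed(feed_title: str, feed_link: str,
                       entries: Iterable[Tuple[str, str]]) -> str:
    """Render (doi, title) pairs as an RSS 2.0 document, one item per DOI."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = feed_link
    ET.SubElement(channel, "description").text = "Newly registered DOIs"
    for doi, title in entries:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = title
        # Assumption: each item links to the DOI through the doi.org resolver.
        ET.SubElement(item, "link").text = "https://doi.org/" + doi
        # The DOI itself is the stable identifier, not a URL.
        ET.SubElement(item, "guid", isPermaLink="false").text = doi
    return ET.tostring(rss, encoding="unicode")


feed_xml = build_new_doi_feed(
    "New DOIs (hypothetical feed)",
    "http://server.com/my_rss.xml",
    [("10.1111/j.1574-0862.2002.tb00125.x",
      "The dynamics of land-cover change in western Honduras: "
      "exploring spatial and temporal complexity")],
)
```

The sketch also shows the limitation raised above: the feed is one flat document, so a subscriber who wants only a subset or a large back catalogue has nothing to query against.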


Questions?