User Group Meeting, June 6, 2006
CrossRef User Group Meeting
Crystal Gateway Marriott
Arlington, VA
June 6th, 2006
Chuck Koscher
Technology Director, CrossRef
Ed Pentz
Executive Director, CrossRef
Agenda *
2:30 - 3:00 CrossRef Overview & Update – Ed Pentz, Executive Director
3:00 - 3:20 System Update - Chuck Koscher, Technical Director
3:20 - 3:45 Data Quality Initiative – Chuck Koscher
3:45 - 4:00 Services: Forward linking & Simple Text Query - Ed Pentz
4:00 - 4:15 Break
4:15 - 4:30 Multiple Resolution Pilot - Ed Pentz
4:30 - 5:15 CrossRef Web Services Program overview & status - Ed Pentz
            OpenURL, RSS, & OAI-PMH Interfaces – Chuck Koscher
5:15 - ? Questions
* Please feel free to open discussion at any point on any topic.
System Update (status)
• System performance has remained acceptable, but load has increased
- Query response times remain under 1 second (typically 300-500 msec)
- Stored query cycle time is ~4 weeks
- Deposit times:

Deposit time        April            May
Less than 5 min     47285 (57 %)     40255 (46 %)
Less than 1 hr      14194 (17 %)     23384 (26 %)
Less than 6 hr      16616 (20 %)     16867 (19 %)
Less than 12 hr       926 (1 %)       4903 (5 %)
Less than 18 hr      1301 (1 %)        651 (0 %)
Less than 24 hr       895 (1 %)         13 (0 %)
More than 24 hr      1350 (1 %)        446 (0 %)

• Upgrades: move Oracle DB to a 4-CPU dual-core x86 machine (from Sun SPARC)
- Postpone major re-architecture project until effects of the new machine are known
System Update (new features)
• As of April we have been accepting deposits for extended content types
- Technical reports / working papers
- Dissertations / theses
- Standards
- Soon we will add support for deposit of DOIs for database records
• New query result format: UNIXREF
- Returns the exact data the publisher deposited for the individual DOI
- Returns <citation_list> to the DOI’s owner
http://doi.crossref.org/servlet/query?usr=X&pwd=X&format=unixref&qdata=|Journal of Neuroscience Research|Chen|66|4|612|2001|||
format=xsd_xml
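The query URL shown on this slide can be assembled programmatically. The following is a minimal sketch, assuming the parameter names visible in the example URL (`usr`, `pwd`, `format`, `qdata`) and treating the pipe-delimited `qdata` fields as an opaque, ordered list, since the slide does not name the individual positions. The function name `build_query_url` is hypothetical.

```python
from urllib.parse import urlencode

# Endpoint taken from the example URL on the slide.
QUERY_ENDPOINT = "http://doi.crossref.org/servlet/query"

# Field values copied from the slide's example; empty strings keep the
# positions of fields that were left blank in the pipe-delimited line.
EXAMPLE_FIELDS = ["", "Journal of Neuroscience Research", "Chen",
                  "66", "4", "612", "2001", "", "", ""]


def build_query_url(usr, pwd, fields, result_format="unixref"):
    """Return a query URL with the fields joined into one pipe-delimited line."""
    qdata = "|".join(fields)
    params = {"usr": usr, "pwd": pwd, "format": result_format, "qdata": qdata}
    # urlencode percent-encodes the pipes and spaces so the URL is safe to send.
    return QUERY_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_query_url("X", "X", EXAMPLE_FIELDS))
```

Passing `result_format="xsd_xml"` would request the other format named on the slide.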
System Update (new features)
• Improved queue management
- Deposits and batch queries are uploaded as files to CrossRef. They are then processed out of a single queue of jobs.
- We run up to 12 processors (SPs) to work on these jobs.
- Control by limiting file size, specifying a user, or excluding a user:
1-2 SPs < 20,000
2-3 SPs < 50,000
2-3 SPs < 200,000
2-3 SPs no size limit
- If you are going to be submitting a large number of back files or other significant volumes please contact us to discuss creating a special user name for this activity.
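The size-limited processor groups described above can be illustrated with a small sketch. It is not CrossRef's implementation: the pool names are hypothetical, the unit of "size" is not stated on the slide, and the rule that a processor only takes jobs strictly below its limit is an assumption read from the "<" signs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Pool:
    name: str                 # hypothetical label, not from the slide
    max_size: Optional[int]   # None means "no size limit"


# Limits copied from the slide; the number of SPs per pool is omitted here.
POOLS = [
    Pool("small", 20_000),
    Pool("medium", 50_000),
    Pool("large", 200_000),
    Pool("unlimited", None),
]


def eligible_pools(job_size: int) -> List[str]:
    """Return the pools allowed to pick up a job of the given size."""
    return [p.name for p in POOLS
            if p.max_size is None or job_size < p.max_size]


if __name__ == "__main__":
    for size in (10_000, 20_000, 500_000):
        print(size, eligible_pools(size))
```

Under these assumptions small files can be taken by any processor, while very large files wait for the unrestricted group, which keeps a bulk back-file load from blocking routine deposits.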
Data Quality Initiative
• Metadata quality: initially CrossRef was not intended to be a metadata distribution service; metadata (MD) simply had to be good enough to match DOIs. Now (primarily due to forward linking) CrossRef metadata is displayed on publishers’ web sites.
• Focus areas: 1) publication title & ISSN accuracy 2) complete metadata record 3) author name accuracy
Tactic: Make publishers aware of their data quality
• Link persistence: an unacceptable number of DOIs no longer work! This is primarily due to journals that move between publishers: the old publisher abandons its DOIs and the new publisher assigns new DOIs.
Tactic: ‘Ping’ test all journals on a regular basis. Notify publishers, publicize results.
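A 'ping' test of this kind could look roughly like the sketch below. The slide does not describe how the test is performed, so everything here is an assumption: the resolver base URL (`https://doi.org/`), the use of an HTTP HEAD request, and the rule that any final status from 200 to 399 counts as "working". The status-fetching function is injectable so the logic can be exercised without network access.

```python
from typing import Callable, Dict, Iterable
from urllib.error import HTTPError, URLError
from urllib.parse import quote
from urllib.request import Request, urlopen

RESOLVER = "https://doi.org/"  # assumed resolver base URL, not from the slide


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the final HTTP status for url (redirects followed), 0 on network failure."""
    request = Request(url, method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        return err.code
    except URLError:
        return 0


def ping_dois(dois: Iterable[str],
              fetch_status: Callable[[str], int] = http_status) -> Dict[str, bool]:
    """Map each DOI to True if it resolves to a reachable page, else False."""
    results = {}
    for doi in dois:
        status = fetch_status(RESOLVER + quote(doi))
        results[doi] = 200 <= status < 400
    return results
```

Some publisher sites reject HEAD requests, so a production check would likely need a GET fallback and rate limiting before results are reported to publishers.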
06:53:04 - Missing Conflict Checker for 6-JUN-2006 started.
Submission contained email: [email protected] journalciteids:2040
================================
Checking DOIs in file: 2040.xml
10.1016/S0169-5150(02)00072-5 : 10.1111/j.1574-0862.2002.tb00125.x : Titles match(The dynamics of land-cover change in western Honduras: exploring spatial and temporal complexity)
10.1016/S0169-5150(02)00073-7 : 10.1111/j.1574-0862.2002.tb00124.x : Titles match(Land use dynamics in the central highlands of Vietnam: a spatial model combining village survey data with satellite imagery interpretation)
10.1016/S0169-5150(02)00074-9 : 10.1111/j.1574-0862.2002.tb00123.x : Titles match(Temporal and spatial modelling of tropical deforestation: a survival analysis linking satellite and household survey data)
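The result lines in the report above follow a regular pattern of two DOIs followed by a matched title. A minimal parsing sketch is shown below; the line layout is inferred only from the three sample lines, and the names `first_doi` and `second_doi` are neutral placeholders because the slide does not say which publisher each DOI belongs to.

```python
import re
from typing import NamedTuple, Optional

# Pattern inferred from the sample report lines: "<doi> : <doi> : Titles match(<title>)"
RESULT_LINE = re.compile(r"^(\S+) : (\S+) : Titles match\s*\((.*)\)\s*$")


class TitleMatch(NamedTuple):
    first_doi: str
    second_doi: str
    title: str


def parse_result_line(line: str) -> Optional[TitleMatch]:
    """Return the parsed pair and title, or None if the line is not a result line."""
    match = RESULT_LINE.match(line.strip())
    if match is None:
        return None
    return TitleMatch(*match.groups())


if __name__ == "__main__":
    sample = ("10.1016/S0169-5150(02)00072-5 : 10.1111/j.1574-0862.2002.tb00125.x : "
              "Titles match(The dynamics of land-cover change in western Honduras: "
              "exploring spatial and temporal complexity)")
    print(parse_result_line(sample))
```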
CrossRef Web Services: Interfaces
• Current practice: Local Hosters (members or affiliates) receive one of three forms of XML data for use in internal linking systems and data clean-up. No redisplay permitted. No citation data made available.
• OAI-PMH interface will support all user types and all data formats (replace current local hoster methods). Allow selective delivery based on publisher opt-in/opt-out profile and on user agreement (available Q3 2006)
- HTTP-based protocol
- Defined set of ‘verbs’: Identify, GetRecord, ListIdentifiers, ListMetadataFormats, ListRecords, ListSets
http://www.crossref.org/OAI?verb=ListSets
- IP authentication (OAI-PMH standard method)
- Username authentication (added to allow each CR publisher to retrieve their own data)
OAI-PMH
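Because OAI-PMH requests are plain HTTP URLs with a `verb` parameter, they are easy to build in code. The sketch below uses the base URL from the slide and the six verb names as spelled in the OAI-PMH 2.0 specification. The helper name `oai_request_url` is hypothetical, and the `oai_dc` metadata prefix in the usage line is the format the protocol requires every repository to offer, not something the slide confirms for this service.

```python
from typing import Dict, Optional
from urllib.parse import urlencode

BASE_URL = "http://www.crossref.org/OAI"  # from the slide's example

# The six verbs defined by OAI-PMH 2.0.
VERBS = {"Identify", "GetRecord", "ListIdentifiers",
         "ListMetadataFormats", "ListRecords", "ListSets"}


def oai_request_url(verb: str, arguments: Optional[Dict[str, str]] = None) -> str:
    """Build an OAI-PMH request URL for the given verb and optional arguments."""
    if verb not in VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb!r}")
    params = {"verb": verb}
    params.update(arguments or {})
    return BASE_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(oai_request_url("ListSets"))
    print(oai_request_url("ListRecords", {"metadataPrefix": "oai_dc"}))
```

Arguments are passed as a dictionary rather than keyword arguments because the protocol's `from` argument is a reserved word in Python.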
CrossRef Web Services: Interfaces OpenURL
• CrossRef currently operates a NISO Z39.88-2004 compliant resolver at http://www.crossref.org/openurl
• Allows public query and resolution services (no login required)
http://www.crossref.org/openurl?url_ver=Z39.88-2004&rft_id=info:doi/10.1103/PhysRev.47.777
http://www.crossref.org/openurl?aulast=Maas%20LRM&title=JOURNAL%20OF%20PHYSICAL%20OCEANOGRAPHY&volume=32&issue=3&spage=870&date=2002
http://www.crossref.org/openurl?id=doi:10.1103/PhysRev.47.777&noredirect=true
http://www.crossref.org/openurl?issn=03770273&aulast=Walker&volume=54&spage=117&date=1983&noredirect
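The example URLs above can be generated with a few lines of code. This is a minimal sketch that reproduces two of the slide's examples using only the parameter names that appear in them (`id`, `noredirect`, `aulast`, `title`, `volume`, `issue`, `spage`, `date`); the function names are hypothetical.

```python
from urllib.parse import quote, urlencode

RESOLVER = "http://www.crossref.org/openurl"  # from the slide


def _encode(params):
    # quote_via=quote encodes spaces as %20 (as in the slide's examples);
    # ":" and "/" are left unescaped so DOIs stay readable.
    return urlencode(params, safe=":/", quote_via=quote)


def doi_lookup_url(doi: str, redirect: bool = True) -> str:
    """Build a resolver URL for a known DOI, optionally suppressing the redirect."""
    params = {"id": "doi:" + doi}
    if not redirect:
        params["noredirect"] = "true"
    return RESOLVER + "?" + _encode(params)


def metadata_lookup_url(**fields: str) -> str:
    """Build a resolver URL from bibliographic fields such as aulast or volume."""
    return RESOLVER + "?" + _encode(fields)


if __name__ == "__main__":
    print(doi_lookup_url("10.1103/PhysRev.47.777", redirect=False))
    print(metadata_lookup_url(aulast="Maas LRM",
                              title="JOURNAL OF PHYSICAL OCEANOGRAPHY",
                              volume="32", issue="3", spage="870", date="2002"))
```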
CrossRef Web Services: Interfaces RSS
• RSS has been discussed as a method for distributing metadata
- Possibly a daily feed listing new DOIs (an alerting service?)
• Challenges
- RSS does not easily support large bulk distributions (great for daily changes and ‘newsy’ content)
- Does not have integral support for discovery (if you want only a subset of the data or don’t know exactly what is available)
RSS feed is just a URL to an XML file http://server.com/my_rss.xml
- What is the real business value of a CrossRef metadata feed using RSS (is it complementary to or in conflict with publisher feeds)?
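To make the "daily feed listing new DOIs" idea concrete, here is a minimal sketch of generating such a feed as an RSS 2.0 document. No such feed is specified in the slides, so the channel fields, the item layout, and the `https://doi.org/` link prefix are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple

DOI_LINK_PREFIX = "https://doi.org/"  # assumed resolver prefix for item links


def build_doi_feed(title: str, link: str, description: str,
                   new_dois: Iterable[Tuple[str, str]]) -> str:
    """Return an RSS 2.0 document with one item per (doi, item_title) pair."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    for tag, text in (("title", title), ("link", link), ("description", description)):
        ET.SubElement(channel, tag).text = text
    for doi, item_title in new_dois:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = DOI_LINK_PREFIX + doi
        # The bare DOI is an identifier, not a URL, so mark it as non-permalink.
        ET.SubElement(item, "guid", isPermaLink="false").text = doi
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(build_doi_feed("New DOIs", "http://server.com/my_rss.xml",
                         "Hypothetical daily feed of newly deposited DOIs",
                         [("10.1103/PhysRev.47.777", "Example item")]))
```

The sketch also shows the limitation noted above: the feed is a single flat file, so a subscriber cannot ask for a subset or discover what else is available.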
Questions?