Developing Reliable Automatic Metadata Generation: Feedback from MatDL Pathway
NSDL Annual Meeting, Washington, DC, November 6-8, 2007
Advancing NSDL Networks
Cathy S. Lowe, Laura M. Bartolo, Kent State University
NSDL Annual Meeting 2007 Washington, DC
Outline
• MatDL Pathway
• iVia metadata generation for PDFs & test set
• Evolution of result set
  – description
  – title
  – keywords
  – author
• Next Steps
NSDL Materials Digital Library Pathway
NSF MS Initiatives (NIRTs, MRSECs, IMIs)
• Soft Matter Wiki
Teaching Resource Development
• MS Teaching Archive
Code Development
• Matforge
• NIST FiPy
• CMU
• DOE CMSN
Virtual Labs
• Intro to Solid State Chem
• Intro to Bio Physics
• Modern Chemistry
Stewardship
• MatDL Repository
http://matdl.org
http://teaching.matdl.org
http://matdlforge.org
http://matdl.org/virtuallabs
http://matdl.org/matdlwiki
iVia metadata generation & original test set
• Worked with iVia metadata generation only
• Test set
  – PDF format
  – 83 undergraduate research papers from Cornell Center for Materials Research (CCMR) REU program
Evolution of result set
• Metadata generation for PDFs not available (2005)
• Metadata generation for PDFs available (2006) – improving over time
  – description
  – title
  – keyword
  – author **
** recently available
Description generation
• Good accuracy for explicit “Abstract”
  – Correct: ~38%
  – Partially correct: ~33%
  – Incorrect/not generated: ~29%
Title generation
• Very good accuracy
  – Precision: 91.09%
  – Recall: 89.30%
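The slide does not say how precision and recall were computed for generated titles. One common approach, assumed here purely for illustration, compares the word tokens of a generated title with those of a hand-assigned reference title. The function name, the tokenizer, and the example titles below are hypothetical and are not taken from iVia or from the MatDL test set.

```python
from collections import Counter


def tokenize(title):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return title.lower().split()


def title_precision_recall(generated, reference):
    """Token-level precision and recall of a generated title.

    precision = shared tokens / tokens in the generated title
    recall    = shared tokens / tokens in the reference title
    """
    gen = Counter(tokenize(generated))
    ref = Counter(tokenize(reference))
    # Multiset intersection counts each shared token at most as often
    # as it appears in both titles.
    overlap = sum((gen & ref).values())
    precision = overlap / sum(gen.values()) if gen else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    return precision, recall


# Hypothetical example: the generator picked up extra words from the page header.
p, r = title_precision_recall(
    "Growth of Thin Films Cornell Center",
    "Growth of Thin Films",
)
print(f"precision {p:.2%}  recall {r:.2%}")
```

Under this assumed scheme, extra words in the generated title lower precision, while missing words lower recall. Corpus-level figures such as those on the slide would then be averages or pooled counts over all test documents.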
Keyword generation
• Manually rated 5 keyphrases per document – good accuracy
  – Highly descriptive: 39%
  – Acceptable: 41%
  – Unacceptable: 20%
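As a rough sketch of how manual ratings like these can be turned into percentage shares, the snippet below tallies a list of rating labels. The helper name and the sample ratings are made up for illustration; they are not the actual MatDL ratings, and the same tally would apply equally to the description and author categories.

```python
from collections import Counter


def rating_shares(ratings):
    """Return each rating label's share of all ratings, as a whole-number percent."""
    counts = Counter(ratings)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total) for label, n in counts.items()}


# Hypothetical ratings for two documents, five keyphrases each.
sample = (
    ["highly descriptive"] * 4
    + ["acceptable"] * 4
    + ["unacceptable"] * 2
)
print(rating_shares(sample))
```

Because each share is rounded independently, the percentages may not always sum to exactly 100.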
Author generation – new functionality
• Applied to original sample:
  – Correct: 45%
  – Partially correct: 27%
  – Incorrect/not generated: 28%
Next Steps
• Collaboration mutually beneficial for tool developers & NSDL community-based repositories
• Continue to work with tool as it improves
• Continue/expand working with MRSECs REU resources
Thank you & Questions?
[email protected]
NSDL Materials Digital Library Pathway is supported by the National Science Foundation (DUE-0532831). Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of NSF.