Designing a Semantic Web path to e-Science


Francesca Di Donato, Dipartimento di scienze della politica, Università di Pisa


SWAP 2005 - Semantic Web Applications and Perspectives, 2nd Italian Semantic Web Workshop - Trento, Faculty of Economics, 14-15-16 December 2005

Premises

Problems to solve…

Look! She thinks hypertextual!!

Yes… But she still publishes texts!

Science and the Web. Two models of selection

[Diagram of the two models, with the labels: Gatekeepers, Author, Reader]

Selection and the Semantic Web

How can I find what I’m looking for?

• Selection: to find what you are looking for

Selection is user-driven

A necessary condition for selecting information is having access to information...

Conditions

Open Access to Scientific Knowledge: OAI

• 1991: ArXiv

• 1999: Santa Fe Convention

• 2001: OAI-PMH (Protocol for Metadata Harvesting)

• 2005: More than 250 repositories are connected through OAI-PMH
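As a rough illustration of what harvesting through OAI-PMH looks like, the sketch below builds a `ListRecords` request URL and reads titles out of a Dublin Core response. The `ListRecords` verb, the `oai_dc` metadata prefix and the XML namespaces come from the OAI-PMH specification; the repository address, the sample record and the helper names `list_records_url` and `extract_titles` are invented for this example, and no network call is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"
DC_NS = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (hypothetical helper)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


# Invented response, shaped like an OAI-PMH ListRecords answer in Dublin Core.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repository.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An invented example title</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def extract_titles(xml_text):
    """Return (identifier, title) pairs for every record in the response."""
    root = ET.fromstring(xml_text)
    pairs = []
    for record in root.iter(f"{{{OAI_NS}}}record"):
        identifier = record.findtext(f"{{{OAI_NS}}}header/{{{OAI_NS}}}identifier")
        title = record.findtext(f".//{{{DC_NS}}}title")
        pairs.append((identifier, title))
    return pairs


# The base URL is a placeholder, not a real repository.
url = list_records_url("https://repository.example.org/oai")
titles = extract_titles(SAMPLE_RESPONSE)
```

A real harvester would also need to follow resumption tokens and handle error responses, which this sketch leaves out.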

Open Access to Scientific Knowledge: policies

Berlin 3, March 2005:

Agreed Recommendation:

"In order to implement the Berlin Declaration institutions should implement a policy to:

1. require their researchers to deposit a copy of all their published articles in an open access repository

and

2. encourage their researchers to publish their research articles in open access journals where a suitable journal exists (and provide the support to enable that to happen)."

Berlin Declaration

Signature:

Open Access to Scientific Knowledge: Numbers

• More than 150 institutions signed the Berlin Declaration

• More than 70 out of 77 Italian Rectors are among them

• More than 270 OAI-PMH-compliant repositories

• More than 1800 open access journals

A large amount of data and metadata, and a signed agreement on what to do with them, are available

A possible path to e-Science

1. Extracting Hidden Semantics

Dr. Peter Murray-Rust

Towards a Chemical Semantic Web:

Examples that a robot could do:

* Find: published molecules that obey "druggability" criteria

* reactions that create carbon-halogen bonds

* phase diagrams for lipid mixtures

...more ambitious...

* read J.Med.Chem and compute geometries and energies for all new molecules

* calculate binding to HIV protease

* order the chemicals required to synthesise them and check safety

* synthesise and test them

Semantic Web demos:

* WWMM: an open, non-centralized, peer-to-peer collection of molecules and properties

* Understanding data in "free-text" (OSCAR)

* NesC:

2. Combining multiple quality criteria

The importance of usage information!!!!

• Recorded in the present (usage), not 3-4 years after the fact (citation)

• Unlimited access, unlimited sample size

• Already recorded locally at many different information resources

• Reduced “social desirability bias”

• Recorded at all stages of the scholarly process

• Applies to all units of scholarly communication

J. Bollen, H. Van de Sompel, J. Smith, and R. Luce. Toward alternative metrics of journal impact: a comparison of download and citation data. Information Processing and Management, 41(6):1419-1440, 2005.

Prof. Johan Bollen

1) When a user downloads A and B, A and B may be related.

2) The co-download frequency corresponds to degree of relatedness (all docs)

Clickstream/data mining approach

1) When an author cites B from A, A and B are related

2) The frequency of citation corresponds to degree of relatedness (journals)

Citation
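The two relatedness rules above can be sketched as simple counting. This is a minimal illustration, not the method used in the cited paper: the session log, the journal names and the helper names `co_download_counts` and `citation_counts` are all invented, and a download "session" is assumed to be a list of document identifiers fetched by one user.

```python
from collections import Counter
from itertools import combinations


def co_download_counts(sessions):
    """Count how often each pair of documents appears in the same session."""
    counts = Counter()
    for docs in sessions:
        # sorted(set(...)) gives each unordered pair one canonical form
        # and ignores repeated downloads within a session.
        for a, b in combinations(sorted(set(docs)), 2):
            counts[(a, b)] += 1
    return counts


def citation_counts(citations):
    """Count how often each (citing journal, cited journal) pair occurs."""
    return Counter(citations)


# Invented usage log: one list of downloaded documents per user session.
sessions = [
    ["doc1", "doc2", "doc3"],
    ["doc1", "doc2"],
    ["doc2", "doc3"],
    ["doc1"],
]
usage_relatedness = co_download_counts(sessions)

# Invented citation data: (citing journal, cited journal) pairs.
citations = [("J1", "J2"), ("J1", "J2"), ("J3", "J2")]
citation_relatedness = citation_counts(citations)
```

In both cases a higher count is read as a stronger relation; the difference is only in where the pairs come from (usage logs for any document versus reference lists for journals).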

3. OA + Semantic Web: one example

Other measures for research impact:

- Degree centrality: the sum of the number of relationships pointing to and from an actor, i.e., their in- and out-degree, normalized by the total number of relationships in the social network

- Closeness centrality: the average shortest path distance of an actor to all other actors in the network.

- Betweenness centrality: the frequency by which an actor is part of the shortest path between any pair of agents in the network.
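The three measures can be computed on a small directed network with nothing more than breadth-first search. The sketch below follows the definitions as worded above (so closeness is an average distance, where lower means more central, and betweenness is left unnormalized). The three-actor network and the function names are invented for illustration, and the code assumes unweighted edges.

```python
from collections import deque


def bfs(adj, source):
    """Return hop distances and shortest-path counts from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma


def centralities(edges):
    """Compute degree, closeness and betweenness for a directed edge list."""
    nodes = sorted({n for edge in edges for n in edge})
    adj = {n: [] for n in nodes}
    degree = {n: 0 for n in nodes}
    for a, b in edges:
        adj[a].append(b)
        degree[a] += 1  # out-degree
        degree[b] += 1  # in-degree
    # Normalize by the total number of relationships, as in the definition.
    degree = {n: d / len(edges) for n, d in degree.items()}

    info = {n: bfs(adj, n) for n in nodes}

    closeness = {}
    for n in nodes:
        others = [d for m, d in info[n][0].items() if m != n]
        # Average distance to the actors that can be reached from n.
        closeness[n] = sum(others) / len(others) if others else float("inf")

    betweenness = {n: 0.0 for n in nodes}
    for s in nodes:
        dist_s, sigma_s = info[s]
        for t in nodes:
            if t == s or t not in dist_s:
                continue
            for v in nodes:
                if v in (s, t) or v not in dist_s:
                    continue
                dist_v, sigma_v = info[v]
                # v lies on a shortest s-t path when the distances add up.
                if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                    betweenness[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return degree, closeness, betweenness


# Invented network: A and C are both linked with B in both directions.
edges = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")]
degree, closeness, betweenness = centralities(edges)
```

In this toy network B scores highest on all three measures: it takes part in every relationship, is one step from everyone, and sits on the only shortest paths between A and C.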

Why not store metadata in RDF?

Then it’s easy to carry out bibliometric computations, such as...
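To make the idea concrete, here is a minimal sketch of article metadata held as RDF-style triples, serialized as N-Triples, and used for one simple bibliometric count. It uses only the Python standard library rather than an RDF toolkit; the `title` and `references` properties are taken from the Dublin Core terms vocabulary, while the article URIs, the titles and the helper names are invented, and this is not a description of how HyperJournal stores its data.

```python
from collections import Counter

DCTERMS = "http://purl.org/dc/terms/"
BASE = "http://example.org/article/"  # placeholder namespace


class Literal(str):
    """Marks a triple object as a plain string rather than a URI."""


# Invented metadata: (subject, predicate, object) triples.
triples = [
    (BASE + "a1", DCTERMS + "title", Literal("First invented article")),
    (BASE + "a2", DCTERMS + "title", Literal("Second invented article")),
    (BASE + "a3", DCTERMS + "title", Literal("Third invented article")),
    (BASE + "a2", DCTERMS + "references", BASE + "a1"),
    (BASE + "a3", DCTERMS + "references", BASE + "a1"),
    (BASE + "a3", DCTERMS + "references", BASE + "a2"),
]


def to_ntriples(triples):
    """Serialize the triples in N-Triples syntax, one statement per line."""
    lines = []
    for s, p, o in triples:
        if isinstance(o, Literal):
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        else:
            obj = f"<{o}>"
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


def times_cited(triples):
    """Count incoming dcterms:references links for each article."""
    return Counter(o for _, p, o in triples if p == DCTERMS + "references")


ntriples = to_ntriples(triples)
cited = times_cited(triples)
```

Because every citation is just another triple, the same data can feed other computations (for example a citation network) without changing how the metadata is stored.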

HyperJournal team
