Technology of Semantic Structuring of the Digital Library Content

I. Filozova, JINR LIT, Dubna

V International Conference "Distributed Computing and Grid-technologies in Science and Education", LIT JINR (Dubna), July 18, 2012


Contents

• Current Trends
• Problematic Situation
• Research Lines
• Realization Ideas
• QA-System on the Logic-Semantic Network Basis
• Summary

CURRENT TRENDS

• Transition from traditional publishing to a digital archive-based approach;
• Accumulation of large digital information arrays by the scientific community → content integration at the metadata level → common data and information spaces;
• Growth in the number of open-access institutional repositories.

Number of repositories: 2,900
Number of records: ~40,000,000

(according to ROAR statistics, http://roar.eprints.org)


HOW TO FIND

PROBLEMATIC SITUATION

CREATING EFFECTIVE MECHANISMS FOR SEARCHING FOR ANSWERS TO QUESTIONS IN DIGITAL INFORMATION FUNDS IS A PRESSING PROBLEM

[Diagram labels: FIND THE INFORMATION (INFORMATION SOURCE AND/OR INFORMATION ITSELF); QUESTION (V); ANSWERS SET (QV); METHODS AND MECHANISMS FOR EFFECTIVE SEARCH (SEARCH TECHNOLOGY); DIGITAL INFORMATION FUND (INFORMATION SOURCES); INFORMATION LAWS; ?]

PERTINENCE (P)

$Q_V = Q_V^{R} \cup Q_V^{N}$

P = Qv / QvU

Cognitive Function of the Question

Question: a thought query expressed as an interrogative sentence.

Answer: a realization of the cognitive function of the question as a newly obtained judgment.

[Diagram labels: Question; TO DEVELOP THE KNOWLEDGE (TO EXTEND, TO PRODUCE NEW); TO REFINE THE KNOWLEDGE; TO SUPPLEMENT THE KNOWLEDGE; Cognitive Indeterminacy; UNKNOWN; KNOWN]

Process of Asking Questions and Searching for Answers

[Diagram labels: Ask Question; Find Answer; Set Question-Answer Adequacy; Search Scope; Conformity Rules; Search Technology; Answer; Technology of Conformity Setting; Technology of Question Asking; The Object and Subject of Research; Question; Answer; Datum Question]

RESEARCH LINES

(1) Development of a method and mechanism for effectively searching for the set of relevant answers to questions.

(2) Development of a technology for creating and supporting a catalog service for the information fund, to provide efficient search for answers to questions.

(3) Software development: a cataloguer workstation for structuring the information fund.

REALIZATION IDEAS OF RESEARCH LINES

The method is based on describing scientific and technical information by a set of logic-semantic networks Question-Answer-Reaction (LSN QAR).

The search engine is based on:
• movement along the LSN, controlled by the user;
• choice of LSN nodes (questions or answers) based on an ontological model of the user's question.

The technology is based on describing the subject domain by a set of LSN QARs.

The mechanism of the technology is the workstation of the cataloguer (the LSN QAR developer).

Formal Structure of Question, Answer, Reaction

The logical structure of the question (Q):
QUESTION = {QUESTION THEME (QT), QUESTION CONTENT (QC), QUESTION VOLUME (QV)}

The logical structure of the answer (A):
ANSWER = {ANSWER THEME (AT), ANSWER CONTENT (AC), ANSWER VOLUME (AV)}

The logical structure of the reaction (R):
REACTION = {REACTION THEME (RT), REACTION CONTENT (RC), REACTION VOLUME (RV)}
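Since question, answer and reaction share the same theme/content/volume shape, they can be sketched as one record type. The Python below is only an illustration under assumptions, not the author's implementation: the names `Kind`, `Element` and `structure`, the sample theme "Java", and the choice to hold a volume as a tuple of strings are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Kind(Enum):
    """The three element types of a logical unit."""
    QUESTION = "Q"
    ANSWER = "A"
    REACTION = "R"


@dataclass(frozen=True)
class Element:
    """A question, answer or reaction described by theme, content and volume."""
    kind: Kind
    theme: str                    # QT, AT or RT
    content: str                  # QC, AC or RC
    volume: Tuple[str, ...] = ()  # QV, AV or RV (representation is an assumption)

    def structure(self) -> dict:
        """Return the triple keyed by the abbreviations used on the slide."""
        k = self.kind.value
        return {f"{k}T": self.theme, f"{k}C": self.content, f"{k}V": self.volume}


# Hypothetical sample values, chosen only to show the shape of the record.
q1 = Element(Kind.QUESTION, theme="Java", content="What is Java?")
print(q1.structure())
```

Using one type with a `kind` tag keeps the three structures visibly parallel; separate classes would work equally well if the elements later need different behaviour.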

Logic-Semantic Network Question-Answer-Reaction

Logic-semantic network: a set of questions, answers and the relationships between them, forming a uniform system.

Question: a query expressed as an interrogative sentence, aimed at developing, refining or supplementing knowledge.

Answer: a realization of the cognitive function of the question in the form of a newly obtained judgment. The answer must be built in accordance with the content and structure of the question asked. Only in this case is the answer regarded as relevant.

Reaction: a semantic description of the question and answer.

Types of reactions:
1. Question Reaction: a description of the datum question (to understand the environment and causes of the question and to establish semantic adequacy with the answer scope).
2. Answer Reaction: a description of the answer scope (to understand the question semantics and the relationship with the answer).
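The definitions above describe a network as a set of nodes plus typed relationships between them. A minimal Python sketch of that idea follows; the class `Network`, its methods, the relation names (`"answer"`, `"question_reaction"`, `"answer_reaction"`) and the shortened node texts are assumptions made for illustration, while the node ids follow the labelling pattern used in the deck's reaction example.

```python
from dataclasses import dataclass, field


@dataclass
class Network:
    """A logic-semantic network: questions, answers, reactions and their links."""
    nodes: dict = field(default_factory=dict)  # node id -> text
    links: list = field(default_factory=list)  # (source id, target id, relation)

    def add(self, node_id, text):
        self.nodes[node_id] = text

    def link(self, source, target, relation):
        # Refuse links whose ends are not part of the network yet.
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both ends of a link must already be in the network")
        self.links.append((source, target, relation))

    def related(self, node_id, relation):
        """Ids of nodes reachable from node_id by one link of the given relation."""
        return [t for s, t, r in self.links if s == node_id and r == relation]


lsn = Network()
lsn.add("Q1", "What is Java?")
lsn.add("QR11", "Notes on the pronunciation of the name")
lsn.add("A11", "An object-oriented programming language")
lsn.add("RA11", "Why the language is called Java")
lsn.link("Q1", "QR11", "question_reaction")  # describes the datum question
lsn.link("Q1", "A11", "answer")
lsn.link("A11", "RA11", "answer_reaction")   # describes the answer scope

print(lsn.related("Q1", "answer"))
print(lsn.related("A11", "answer_reaction"))
```

Keeping the relation as a plain label on each link makes the two reaction types distinguishable without separate node classes.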

Reaction Example (1)

Logical unit Question-Answer-Reaction:

Question 1 (Q1). What is Java?

Question 1 Reaction 1 (QR11). Two pronunciation standards have formed: one borrowed from English, /dʒɑːvə/, and the traditional «Ява» (in Russian), which corresponds to the traditional pronunciation of the name of the island of Java.

Question 1 Reaction 2 (QR12). Java (Indonesian: Jawa) is an island of Indonesia with a population of 135 million. Area: 132,000 km²…

Question 1 Reaction 3 (QR13). Slide show, photo collage with views of the island of Java.

Reaction Example (2)

Answer 1 to Question 1 (A11). Java is an object-oriented programming language developed by Sun Microsystems.

Reaction 1 of the Answer 1 to the Question 1 (RA11). Why is the language called Java? According to one version, the language got its name from the coffee grown on the island of the same name. As you know, this drink is hot, like some programmers. That is why a cup of steaming coffee is displayed on the logo.

Reaction Example (3)

Reaction 2 of the Answer 1 to the Question 1 (R2A11). Sun Microsystems, Inc. (now part of Oracle Corporation) is a U.S. company that produces software and hardware…

Answer 2 to Question 1. Java is not only the language itself, but also a platform for developing and running applications based on this language.

Graph LSN QAR

[Figure: LSN QAR graph with nodes Q10, R10, A21, A22, A23, R21, R23, Q31, Q32, Q33, Q34, A41, A42, A43, connected by edges numbered 1 to 17.]

Analysis Method of Scientific Texts

The document is studied by the expert in terms of:

1. Semantic matching of title and content;
2. A set of filters:
   Filter 1 (F1), General Part. F1 includes an analysis of the problem, its history, an overview, and its topicality.
   Filter 2 (F2), Author concept. F2 includes new terms introduced by the authors and traditional terms given the author's interpretation or narrowed semantics.
   Filter 3 (F3), Examples and illustrations. F3 serves to clarify difficult passages in the text and to reduce the text size under strict length limits.
   Filter 4 (F4), The idea of the author. F4 describes and explains the author's main idea.
3. Text markup (formulation of the basic questions, answers and reactions).

Navigation on LSN

№     Way (edges)
1     1, 4, 11
2     1, 4, 12
3     1, 5, 13
5     1, 5, 14
6     1, 6, 15
7     1, 6, 16
8     2, 7, 13
9     2, 7, 14
10    2, 8, 17
11    3, 9, 15
12    3, 9, 16
13    3, 10, 17
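The ways in the table can be reproduced by enumerating every path through the graph from the root question. The sketch below is an illustration under assumptions: the edge numbers and node names come from the slides, but which node each edge connects is inferred from the table (edges that share the same continuations are taken to enter the same node), and the targets of edges 11 to 17 in the A4x layer are purely hypothetical. The function name `enumerate_ways` is also invented here.

```python
from collections import defaultdict

# Edges as (edge number, source node, target node).
# Source/target assignment is inferred from the table; the A4x targets
# of edges 11-17 are hypothetical and do not affect the edge sequences.
EDGES = [
    (1, "Q10", "A21"), (2, "Q10", "A22"), (3, "Q10", "A23"),
    (4, "A21", "Q31"), (5, "A21", "Q32"), (6, "A21", "Q33"),
    (7, "A22", "Q32"), (8, "A22", "Q34"),
    (9, "A23", "Q33"), (10, "A23", "Q34"),
    (11, "Q31", "A41"), (12, "Q31", "A42"),
    (13, "Q32", "A41"), (14, "Q32", "A42"),
    (15, "Q33", "A42"), (16, "Q33", "A43"),
    (17, "Q34", "A43"),
]


def enumerate_ways(edges, start):
    """Depth-first listing of all edge sequences from start to a leaf node."""
    outgoing = defaultdict(list)
    for number, source, target in edges:
        outgoing[source].append((number, target))

    ways = []

    def walk(node, way):
        successors = sorted(outgoing.get(node, []))
        if not successors:          # leaf: the way is complete
            ways.append(tuple(way))
            return
        for number, target in successors:
            walk(target, way + [number])

    walk(start, [])
    return ways


ways = enumerate_ways(EDGES, "Q10")
for way in ways:
    print(way)
print(len(ways))  # 12
```

Under these assumptions the enumeration yields the same twelve edge sequences as the table, in the same order.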

[Figure: the LSN QAR graph with nodes Q10, R10, A21, A22, A23, R21, R23, Q31, Q32, Q33, Q34, A41, A42, A43 and edges numbered 1 to 17, shown next to the table.]

Multilayer Related Set of Graphs

[Figure: three graph layers, each with nodes numbered 1 to 14, labelled Subject Domain LSN, Theme LSN and Document LSN, together with a Thesaurus.]

Legend:
• Question + Question Reaction
• Answer + Answer Reaction
• Thesaurus term
• Navigation on LSN: available way; way selected by the user
• Navigation on the LSN layers
• Enter into document LSN

Motion up on LSN: generalization of knowledge
Motion down on LSN: refinement of knowledge
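One way to read the up/down motion is as a step between neighbouring layers. The short Python sketch below illustrates that reading only; the ordering of the layers from most general to most specific, the omission of the Thesaurus from the layer list, and the names `LAYERS` and `move` are all assumptions rather than details given on the slide.

```python
# Assumed ordering, from the most general layer to the most specific one.
LAYERS = ["Subject Domain LSN", "Theme LSN", "Document LSN"]


def move(layer, direction):
    """Return the neighbouring layer: 'up' generalizes, 'down' refines."""
    step = {"up": -1, "down": 1}[direction]
    index = LAYERS.index(layer) + step
    if not 0 <= index < len(LAYERS):
        return layer  # already at the top or bottom layer, stay in place
    return LAYERS[index]


print(move("Theme LSN", "down"))  # Document LSN
print(move("Theme LSN", "up"))    # Subject Domain LSN
```

Staying in place at the boundary layers is a design choice made for this sketch; raising an error would be an equally reasonable alternative.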

List of Available Questions and Card of Selected Question (fragment)

[Screenshot labels: Question; Question Reaction; Answer 1; Answer 2; Answer Reaction 1; Answer Reaction 2; Next Level Questions; "This is interesting …" (Это интересно …)]


Card of Question Reaction

LSN + Visualization

[Screenshot labels: Questions; Answers]

Summary

• Proposed: creation and support of a "Catalog Service" for the fund corpora, and creation of a Question-Answer Navigator that provides, during the search for an answer to a question:
  - the ability to refine and deepen the understanding of the question's meaning;
  - the ability to refine, deepen and expand knowledge, or to obtain new knowledge.

• Implementing such a "Catalog Service" and Navigator makes it possible to study the DL content in a mode natural for humans: refinement, generalization and obtaining new knowledge in question-answer mode.

• The main problem of the proposed question-answer system is maximal automation of the process of creating and supporting the fund's service catalog.


Even the most foolish idea can be implemented masterfully. Leszek Kumor