Intelligent Database Systems Presenter : YU-TING LU Authors : Hsin-Chang Yang, Han-Wei Hsiao, Chung-Hong Lee 2011. IPM Multilingual document mining and navigation using self-organizing maps



Intelligent Database Systems Lab

Presenter : YU-TING LU

Authors : Hsin-Chang Yang, Han-Wei Hsiao, Chung-Hong Lee

2011. IPM

Multilingual document mining and navigation using self-organizing maps


Outlines

• Motivation
• Objectives
• Methodology
• Experiments
• Conclusions
• Comments


Motivation

• Web directories are generally constructed manually and may suffer from narrow coverage and inconsistency.

• Most existing directories provide only monolingual hierarchies that organize Web pages in terms that a user may not be familiar with.


資料探勘 Data mining


Objectives

• This work proposes an approach that automatically arranges multilingual Web pages into a multilingual Web directory, breaking the language barrier in Web navigation.


Methodology


Methodology – Web directory generation

• Web page preprocessing and encoding
  – English
    • Word segmentation
    • Stop-word elimination
    • Stemming
    • Keyword selection
  – Chinese
    • Select only nouns as keywords
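The English preprocessing steps listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stop-word list, the crude suffix-stripping stemmer (a stand-in for a real stemmer such as Porter's), the choice to keep every surviving term as a keyword, and the binary encoding are all assumptions made for the example.

```python
import re

# Tiny illustrative stop-word list (an assumption, not the authors' list).
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "of", "the", "to"}

def naive_stem(word):
    """Crude suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def extract_keywords(text):
    """Segment text into words, drop stop words, then stem what is left."""
    words = re.findall(r"[a-z]+", text.lower())
    return [naive_stem(w) for w in words if w not in STOP_WORDS]

def encode(keywords, vocabulary):
    """Binary vector: 1 if the vocabulary term occurs in the page, else 0."""
    present = set(keywords)
    return [1 if term in present else 0 for term in vocabulary]

pages = ["Clustering Web documents", "Maps of the documents"]
keyword_lists = [extract_keywords(p) for p in pages]
vocabulary = sorted({k for ks in keyword_lists for k in ks})
vectors = [encode(ks, vocabulary) for ks in keyword_lists]
# vocabulary == ["cluster", "document", "map", "web"]
# vectors    == [[1, 1, 0, 1], [0, 1, 1, 0]]
```

Each Web page ends up as a fixed-length vector over the shared vocabulary, which is the form a self-organizing map expects as input.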


Methodology – Web directory generation

• Feature map generation
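To make the feature map step concrete, here is a hedged sketch of a standard self-organizing map training loop in plain Python. The map size, learning-rate schedule, neighborhood function, and the names `train_som` and `best_matching_unit` are assumptions for illustration; the slides do not state the authors' actual configuration.

```python
import math
import random

def best_matching_unit(weights, x):
    """Grid position of the neuron whose weight vector is closest to x."""
    return min(weights, key=lambda pos: math.dist(weights[pos], x))

def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols map with a linearly decaying rate and radius."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # One weight vector per map neuron, initialised with random values.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    radius0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = lr0 * frac
        radius = max(radius0 * frac, 0.5)
        for x in vectors:
            bmu = best_matching_unit(weights, x)
            for pos, w in weights.items():
                grid_dist = math.dist(pos, bmu)
                if grid_dist <= radius:
                    # Gaussian neighborhood: nearer neurons move more.
                    influence = math.exp(-(grid_dist ** 2) / (2 * radius ** 2))
                    for i in range(dim):
                        w[i] += lr * influence * (x[i] - w[i])
    return weights

docs = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
som = train_som(docs, rows=2, cols=2)

# Pages mapped to the same neuron form one cluster.
clusters = {}
for idx, vec in enumerate(docs):
    clusters.setdefault(best_matching_unit(som, vec), []).append(idx)
```

After training, each page is labeled with its best-matching neuron, so pages sharing a neuron can be treated as one document cluster.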


Methodology – Web directory generation

• Web directory generation
  – Super cluster construction
  – Determining dominating clusters
  – Constructing hierarchy
  – Parameter setting and discussions


Methodology – Web directory generation

• Evaluation of the quality of generated hierarchies


Methodology – Multilingual Web directory generation

• Alignment of monolingual Web directories
  – Calculating semantic similarity
  – Incorporating structural similarity
  – Overall similarity
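The slides do not give the similarity formulas, so the following is only a sketch of how a semantic score and a structural score might be combined into an overall score for aligning two monolingual hierarchies. Everything here is an assumption for illustration: the hypothetical bilingual `dictionary`, Jaccard overlap as the semantic measure, parent-category overlap as the structural measure, and the weight `alpha`.

```python
def jaccard(a, b):
    """Overlap of two keyword collections, in [0, 1]."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 0.0

def translate(keywords, dictionary):
    """Map keywords through a (hypothetical) bilingual dictionary."""
    return [dictionary[k] for k in keywords if k in dictionary]

def overall_similarity(zh_node, en_node, dictionary, alpha=0.5):
    """Weighted sum of semantic and structural similarity (assumed form)."""
    semantic = jaccard(translate(zh_node["keywords"], dictionary),
                       en_node["keywords"])
    structural = jaccard(translate(zh_node["parent_keywords"], dictionary),
                         en_node["parent_keywords"])
    return alpha * semantic + (1 - alpha) * structural

def align(zh_node, en_nodes, dictionary, alpha=0.5):
    """Pick the English category most similar to the Chinese category."""
    return max(en_nodes,
               key=lambda n: overall_similarity(zh_node, n, dictionary, alpha))

dictionary = {"資料": "data", "探勘": "mining", "網路": "web"}
zh = {"keywords": ["資料", "探勘"], "parent_keywords": ["網路"]}
en_nodes = [
    {"label": "data mining", "keywords": ["data", "mining"],
     "parent_keywords": ["web"]},
    {"label": "sports", "keywords": ["football", "team"],
     "parent_keywords": ["news"]},
]
best = align(zh, en_nodes, dictionary)
```

In this toy case the Chinese category aligns with the "data mining" category because both its own keywords and its parent's keywords match after translation.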



Methodology – Multilingual Web directory generation

• Multilingual Web directory generation


Experiments - SOM training



Experiments - Hierarchy generation



Experiments - Hierarchy alignment and Web directory generation


Conclusions

• The multilingual hierarchy alignment method is fully automated and requires no human intervention.

• It will be convenient for users to have a Web directory providing multilingual category labels and categorizing multilingual Web pages.


Comments

• Advantages
  – The development of a multilingual hierarchy alignment method
  – Fully automated

• Applications
  – SOM