My past works Kazunari Sugiyama 16 Apr., 2009 WING Group Meeting


Outline

I. Natural Language Processing (Disambiguation)
 I-1 Personal Name Disambiguation in Web Search Results
 I-2 Word Sense Disambiguation in Japanese Texts

II. Web Information Retrieval
 II-1 Characterizing Web Pages Using Hyperlinked Neighboring Pages
 II-2 Adaptive Web Search Based on User's Information Needs

I-1 Personal Name Disambiguation in Web Search Results

[Outline]
1. Introduction
2. Our Proposed Method
3. Experiments

[Figure: introduction example. Web search results for the ambiguous personal name "William Cohen" mix distinct entities: a politician, a professor of computer science, a consulting company, and even another person entirely (Robert M. Gates, not "William Cohen").]

2.1 Our Proposed Method

[Semi-supervised Clustering]

[Figure: search-result pages in the term space t_1, ..., t_n. Seed pages p_s1, p_s2 anchor clusters C_1 and C_2 with centroid vectors G_1 and G_2; when a page p is added to a cluster, its centroid G moves to G', and the update uses the distance D(G, w_p) to control the fluctuation of the centroid of the cluster.]

Notation:
– p: search-result Web page
– w_p: feature vector of a Web page p contained in a cluster that has centroid G
– G_i: centroid vector of cluster C_i
– p_si: seed page
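The "control the fluctuation of the centroid" idea can be sketched as follows. This is a hypothetical illustration, not the slide's exact formula: when a page joins a cluster, its contribution to the centroid update is damped according to how close it is to the current centroid, so outlying pages shift the centroid less.

```python
import numpy as np

def update_centroid(G, n, w_p):
    """Hypothetical sketch of controlling centroid fluctuation:
    when adding page vector w_p to a cluster of n pages with
    centroid G, weight the update by cosine similarity sim(G, w_p),
    so pages far from the centroid move it less."""
    sim = float(G @ w_p) / (np.linalg.norm(G) * np.linalg.norm(w_p) + 1e-12)
    sim = max(sim, 0.0)  # ignore vectors pointing away from the centroid
    return (n * G + sim * w_p) / (n + sim)
```

With a plain running mean, every page moves the centroid by the same amount; here a page orthogonal to the centroid leaves it unchanged.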

3. Experiments

3.1 Experimental Data
3.2 Evaluation Measure
3.3 Experimental Results

3.1 Experimental Data

WePS corpus
– Established for the "Web People Search Task" at SemEval-2007, held at the Association for Computational Linguistics (ACL) conference
– Web pages related to 79 personal names, sampled from:
 • participants in conferences on digital libraries and computational linguistics,
 • biographical articles in the English Wikipedia,
 • the U.S. Census
– The top 100 Yahoo search results (via its search API) for each personal-name query
– Training set: 49 names; test set: 30 names (7,900 Web pages in total)

Pre-processing for the WePS corpus
– Eliminate stopwords and perform stemming
– Determine the optimal parameter for merging similar clusters using the training set in the WePS corpus, then apply it to the test set

3.2 Evaluation Measure

(1) Purity
(2) Inverse purity
(3) F (harmonic mean of (1) and (2))

(These are standard evaluation measures employed in the “Web People search task.”)

[Hotho et al., GLDV Journal’05]
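These measures can be computed directly from the system clusters and the gold categories; a minimal sketch (function names are mine):

```python
def purity(clusters, gold):
    """Purity: each cluster votes for its best-matching gold category.
    clusters, gold: lists of sets of page ids over the same pages."""
    n = sum(len(c) for c in clusters)
    return sum(max(len(c & g) for g in gold) for c in clusters) / n

def inverse_purity(clusters, gold):
    """Inverse purity swaps the roles of clusters and gold categories."""
    return purity(gold, clusters)

def f_measure(clusters, gold):
    """Harmonic mean of purity and inverse purity."""
    p, ip = purity(clusters, gold), inverse_purity(clusters, gold)
    return 2 * p * ip / (p + ip)
```

Purity rewards many small pure clusters, inverse purity rewards few large clusters; F balances the two.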

3.3 Experimental Results

Team ID ("Web People Search Task")                          Purity  Inverse Purity  F
CU_COMSEM                                                   0.72    0.88            0.78
IRST-BP                                                     0.75    0.80            0.75
PSNUS                                                       0.73    0.82            0.75
UVA                                                         0.81    0.60            0.67
SHEF                                                        0.60    0.82            0.66
Our proposed method (2 and 3 sentences in the 5 Wikipedia
seed pages and the search-result Web page, respectively)    0.72    0.81            0.76

This result is comparable to the best F-score (0.78) among the top 5 participants in the "Web People Search Task."

We could acquire useful information from sentences that characterize a person entity, and thus disambiguate person entities effectively.

I-2 Word Sense Disambiguation in Japanese Texts

[Outline]
1. Introduction
2. Proposed Method
3. Experiments

1. Introduction (1/2)

Word Sense Disambiguation (WSD)
– Determining the meaning of an ambiguous word in its context

"run":
(1) Bob goes running every morning. → "to move fast by using one's feet"
(2) Mary runs a beauty parlor. → "to direct or control"

1. Introduction (2/2)

Our approach to WSD:
– basically, supervised WSD;
– applying semi-supervised clustering, by introducing sense-tagged instances ("seed instances"), to supervised WSD.

[Figure: system overview. (1) Extract "baseline features" for clustering and WSD from a raw corpus with no sense tags; (2) run semi-supervised clustering seeded with sense-tagged "seed instances"; (3) add sense-tagged instances from the clustering results; (4) extract features for WSD from the clustering results and feed them, together with the baseline features, to supervised WSD.]

2. Proposed Method

2.1 Semi-supervised Clustering
2.2 Features for WSD Obtained Using Clustering Results

2.1 Semi-supervised Clustering

2.1.1 Features for Clustering
2.1.2 Semi-supervised Clustering
2.1.3 Seed Instances and Constraints for Clustering

2.1.1 Features for Clustering and WSD ("baseline features")

Morphological features
– Bag-of-words (BOW), part-of-speech (POS), and detailed POS classification of the target word itself and the two words to its right and left.

Syntactic features
– If the POS of the target word is a noun, extract the verb in a grammatical dependency relation with the noun.
– If the POS of the target word is a verb, extract the noun in a grammatical dependency relation with the verb.

Figures in Bunrui-Goi-Hyou (a Japanese thesaurus)
– The 4- and 5-digit codes of the content words to the right and left of the target word.
 e.g., 地域 ("community"), 社会 ("society"): "1.1720,4,1,3" → 1172 (as 4 digits), 11720 (as 5 digits)

5 topics inferred on the basis of LDA
– Compute the log-likelihood of a word instance ("soft-tag" approach [Cai et al., EMNLP'07])
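The window part of these baseline features (the target word plus the two words to its right and left, with their POS tags) can be sketched as below; the dependency, thesaurus, and LDA features would need a parser, a Bunrui-Goi-Hyou lookup, and a topic model, so they are omitted here.

```python
def window_features(tokens, pos_tags, i, window=2):
    """Morphological window features for the target word at index i:
    surface form and POS of the target and of the two words to its
    right and left (a sketch of the 'baseline features')."""
    feats = {f"w0={tokens[i]}", f"p0={pos_tags[i]}"}
    for d in range(1, window + 1):
        if i - d >= 0:
            feats.update({f"w-{d}={tokens[i-d]}", f"p-{d}={pos_tags[i-d]}"})
        if i + d < len(tokens):
            feats.update({f"w+{d}={tokens[i+d]}", f"p+{d}={pos_tags[i+d]}"})
    return feats
```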

2.1.2 Semi-supervised Clustering [Proposed Method]

Refine the method of [Sugiyama and Okumura, ICADL'07] (developed for the Web People Search task) for word instances: control the fluctuation of the centroid of a cluster.

Notation:
– x: word instance
– x_si: seed instance
– G_i: centroid of a cluster
– D(G_C1, G_C2): adaptive Mahalanobis distance between the centroids of clusters C1 and C2

[Figure: word instances in the term space t_1, ..., t_n form clusters C_1, C_2, C_3 with centroids G_1, G_2 and a seed instance x_s1; merging two clusters yields a new cluster C_new with centroid G_new.]

2.1.3 Seed Instances and Constraints for Clustering (1/3)

[Method I]
Split the data set of word instances into a training data set and a data set for clustering; seed instances are drawn from the training data set.

Select initial seed instances:
(I-1) randomly,
(I-2) by "KKZ" [Katsavounidis et al., IEEE Signal Processing Letters, '94],
(I-3) as the centroid of a cluster generated by K-means (initial instances for K-means: randomly, or by "KKZ").
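The "KKZ" initialization in (I-2) picks seeds that are maximally spread out: start from the instance with the largest norm, then repeatedly add the instance farthest from its nearest chosen seed. A minimal sketch:

```python
import numpy as np

def kkz_seeds(X, k):
    """KKZ seed selection [Katsavounidis et al. '94]: greedy
    farthest-point initialization over the row vectors of X."""
    seeds = [int(np.argmax(np.linalg.norm(X, axis=1)))]
    while len(seeds) < k:
        # distance of every instance to its nearest chosen seed
        d = np.min([np.linalg.norm(X - X[s], axis=1) for s in seeds], axis=0)
        seeds.append(int(np.argmax(d)))
    return seeds
```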

2.1.3 Seed Instances and Constraints for Clustering (2/3)

[Method II]
Split the data set of word instances into a training data set and a data set for clustering (s1, s2, s3: word senses of the target word).

Select initial seed instances:
(II-1) by considering the frequency of word senses:
 (II-1-1) randomly,
 (II-1-2) "KKZ",
 (II-1-3) centroid of a cluster generated by K-means (initial instances for K-means: randomly, or "KKZ");
(II-2) in proportion to the frequency of word senses (D'Hondt method):
 (II-2-1) randomly,
 (II-2-2) "KKZ",
 (II-2-3) centroid of a cluster generated by K-means (initial instances for K-means: randomly, or "KKZ").

2.1.3 Seed Instances and Constraints for Clustering (3/3)

Constraints:
– "cannot-link" only,
– "must-link" only,
– both constraints,
– "cannot-link" and "must-link" without outliers.

["must-link" without outliers]
[Figure: a "must-link" constraint is put between clusters C_sv and C_sw when both D(G_new, G_Csv) < Th_dis and D(G_new, G_Csw) < Th_dis; when a cluster's centroid is farther than Th_dis from G_new, that cluster is an outlier, so no "must-link" constraint is added (Th_dis = 0.388).]

2.2 Features for WSD Obtained Using Clustering Results

(a) Inter-cluster information
 – TF in a cluster (TF), cluster ID (CID), sense frequency (SF)

(b) Context information regarding adjacent words w_i w_{i+1} (i = -2, ..., 1)
 – Mutual information (MI), t-score (T), chi-square (CHI2)

(c) Context information regarding the two words to the right and left of the target word (w_{-2} w_{-1} w_0 w_{+1} w_{+2})
 – Information gain (IG)

* We employ (b) and (c) to reflect the concept of "one sense per collocation." [Yarowsky, ACL'95]
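The mutual-information feature in (b) can be illustrated with pointwise mutual information over an adjacent word pair; this is a generic sketch, and the counts and smoothing in the actual system may differ.

```python
import math

def pmi(pair_count, count_w1, count_w2, n):
    """Pointwise mutual information of an adjacent word pair:
    log2( P(w1, w2) / (P(w1) * P(w2)) ), with probabilities
    estimated from counts over n positions in the corpus."""
    return math.log2((pair_count / n) / ((count_w1 / n) * (count_w2 / n)))
```

A pair that co-occurs more often than chance gets a positive score, which is what makes it a useful collocation cue.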

4. Experiments

4.1 Experimental Data
4.2 Semi-supervised Clustering
4.3 Word Sense Disambiguation

4.1 Experimental Data

RWC corpus from the "SENSEVAL-2 Japanese Dictionary Task"
– 3,000 Japanese newspaper articles issued in 1994
– Sense tags from the Japanese dictionary "Iwanami Kokugo Jiten" were manually assigned to 148,558 ambiguous words
– 100 target words: 50 nouns and 50 verbs

4.2 Semi-supervised Clustering

Comparison of distance-based approaches

[Observations]
– Our semi-supervised clustering approach outperforms the other distance-based approaches.
– Our method locally adjusts the centroid of a cluster.

4.3 Word Sense Disambiguation: Experimental Results

[Observations]
• The best accuracy is obtained when we add the features CID, MI, and IG from the clustering results to the baseline features.
• According to the results of OURS, TITECH, and NAIST, WSD accuracy is significantly improved by adding features computed from clustering results.

II-1 Characterizing Web Pages Using Hyperlinked Neighboring Pages

[Outline]
1. Introduction
2. Our Proposed Method
3. Experiments

1. Introduction

Transitions of Web search engines

The first-generation search engines:
• Only terms included in Web pages were utilized as indices of Web pages.
• Peculiar features of the Web, such as hyperlink structures, were not exploited.

The second-generation search engines:
• The hyperlink structures of Web pages are considered, e.g.,
 (1) "optimal document granularity"-based IR systems,
 (2) HITS (CLEVER project), Teoma (DiscoWeb),
 (3) PageRank (Google).

However, users are still not satisfied with
• the ease of use,
• the retrieval accuracy.

2. Our Proposed Methods

[Figure: a target Web page p_tgt has feature vector w_{p_tgt} in the term space t_1, ..., t_m; the feature vectors of the Web pages hyperlinked with p_tgt are used to refine w_{p_tgt} into the refined feature vector w'_{p_tgt} of p_tgt.]
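The refinement of w_{p_tgt} into w'_{p_tgt} can be sketched as mixing the target's vector with the centroid of its neighbors' vectors; the mixing weight alpha is a hypothetical parameter, not one from the slides, and the actual methods additionally weight by link level and cluster.

```python
import numpy as np

def refine_vector(w_tgt, neighbor_vecs, alpha=0.5):
    """Combine a target page's feature vector with the centroid of
    its hyperlinked neighbors' vectors (a sketch of w'_tgt)."""
    if not neighbor_vecs:
        return w_tgt
    centroid = np.mean(neighbor_vecs, axis=0)
    return alpha * w_tgt + (1 - alpha) * centroid
```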

Method Ⅰ

[Figure: in the Web space, the pages in-linked to p_tgt up to level L(in) and out-linked from p_tgt up to level L(out); in the vector space, their feature vectors are combined with w_{p_tgt} to produce w'_{p_tgt}.]

Method Ⅱ

[Figure: clusters are generated from the groups of Web pages at each level (up to L(in) and L(out)) from the target page, and the centroid vectors of the clusters are combined with w_{p_tgt} to produce w'_{p_tgt}.]

Method Ⅲ

[Figure: as in Method Ⅱ, clusters are generated from the groups of Web pages at each level (up to L(in) and L(out)) from the target page, and the centroid vectors of the clusters refine w_{p_tgt} into w'_{p_tgt}.]

Experimental Setup

Data set
– TREC WT10g test collection (10 GB, 1.69 million Web pages)

Workstation specification
– Sun Enterprise 450 (CPU: UltraSPARC-II 480 MHz; memory: 2 GB; OS: Solaris 8)

Web Pages Used in the Experiment

[Figure: the target page p_tgt and its hyperlinked neighboring pages used in the experiment.]

Experimental Results

Comparison of the best search accuracy obtained using each of Methods I, II, and III

[Figure: recall-precision curves for TF-IDF, (MI-a) with L(in)=3, (MII-a) with L(in)=1, K=2, and (MIII-a) with L(in)=2, K=3.]

The contents of a target Web page can be represented much better by focusing on the in-linked pages of the target page.

Method                              Average precision   % improvement
tfidf                               0.313               -
(MI-a), L(in)=3                     0.342               +2.9
(MII-a), L(in)=1, K=2               0.340               +2.7
(MIII-a), L(in)=2, K=3              0.345               +3.2

Method                              Average precision   % improvement
HITS                                0.032               -
modified HITS                       0.136               +10.4
modified HITS with weighted links   0.138               +10.6
(MI-a), L(in)=3                     0.342               +31.0
(MII-a), L(in)=1, K=2               0.340               +30.8
(MIII-a), L(in)=2, K=3              0.345               +31.3


II-2 Adaptive Web Search Based on User's Information Needs

[Outline]
1. Introduction
2. Our Proposed Method
3. Experiments

1. Introduction

The World Wide Web (WWW)
– It has become increasingly difficult for users to find relevant information.
– Web search engines help users find useful information on the WWW.

Web search engines
– Return the same results regardless of who submits the query.

In general, each user has different information needs for his/her query.
⇒ Web search results should adapt to users with different information needs.

2. Our Proposed Method

User profile construction based on
(1) pure browsing history,
(2) modified collaborative filtering.

(1) User Profile Construction Based on Pure Browsing History (1/3)

We assume that the preferences of each user consist of the following two aspects, and construct the user profile P accordingly:
(1) persistent (or long-term) preferences P_per,
(2) ephemeral (or short-term) preferences P_today.

(1) User Profile Construction Based on Pure Browsing History (2/3)

[Figure: browsing histories from N days ago up to today, with window sizes S_N, ..., S_2, S_1, S_0 over the browsed Web pages. Persistent preferences P_per are constructed from the browsing histories of past days; ephemeral preferences P_today are constructed from today's browsing histories: the r-th browsing history P^(r), the browsing histories before the current session P^(br), and the current session P^(cur).]

(1) User Profile Construction Based on Pure Browsing History (3/3)

The user profile P is finally constructed as follows:

 P_today = x P^(br) + y P^(cur),   (a)
 P = a P_per + b P_today,          (b)

where a and b are constants that satisfy a + b = 1, and x and y are constants that satisfy x + y = 1 (x = 0.5, y = 0.5).
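As a vector operation, this profile construction is a convex mixture of the preference vectors; a minimal sketch, where the grouping of coefficients is my reading of the partly garbled slide formula:

```python
import numpy as np

def user_profile(p_per, p_br, p_cur, a=0.5, x=0.5, y=0.5):
    """Sketch: P = a*P_per + (1 - a)*(x*P_br + y*P_cur), with the
    constraint x + y = 1 (coefficient grouping is an assumption)."""
    assert abs(x + y - 1.0) < 1e-9
    return a * p_per + (1 - a) * (x * p_br + y * p_cur)
```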

Overview of the Pure Collaborative Filtering Algorithm

[Figure: user-item ratings matrix for collaborative filtering, with rows user 1, ..., user a, ..., user U and columns item 1, ..., item i, ..., item I. The active user a has an unrated item i whose rating prediction is computed from the ratings of the other users.]
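The prediction step in the figure can be sketched with user-based collaborative filtering: find users similar to the active user on co-rated items and take a similarity-weighted mean of their ratings for the target item. Cosine similarity is an illustrative choice here; Pearson correlation is also common.

```python
import numpy as np

def predict_rating(R, active, item, k=2):
    """User-based CF on a ratings matrix R (0 = unrated): predict
    R[active, item] from the k most similar users who rated the item."""
    sims = []
    for u in range(R.shape[0]):
        if u == active or R[u, item] == 0:
            continue
        both = (R[active] > 0) & (R[u] > 0)  # co-rated items
        if not both.any():
            continue
        a, b = R[active][both], R[u][both]
        sim = float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
        sims.append((sim, u))
    top = sorted(sims, reverse=True)[:k]
    if not top:
        return 0.0
    return sum(s * R[u, item] for s, u in top) / sum(s for s, _ in top)
```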

(2) User Profile Construction Based on Modified Collaborative Filtering (1/2)

[Figure: user-term weight matrix for modified collaborative filtering, with rows user 1, ..., user a, ..., user U and columns term 1, ..., term i, ..., term T. When each user has browsed k Web pages, the term weights of the active user are predicted from the other users' weights; when each user has browsed k+1 Web pages, new columns term T+1, ..., term T+v are appended and predictions are computed again.]

User Profile Construction Based on the Modified Collaborative Filtering Algorithm

User profile construction based on
(1) a static number of users in the neighborhood,
(2) a dynamic number of users in the neighborhood.

Experiment (1/2)

Construction of the user profile
– Explicit method: (1) relevance feedback.
– Implicit methods: (2) method based on pure browsing history; (3) method based on modified collaborative filtering.

Observed browsing history
– 20 subjects, 30 days

Experiments (2/2)

Query
– 50 query topics employed as test topics in the TREC WT10g test collection

Evaluation
– Compute the similarity between the user profile and the feature vector of each Web page in the search results
– R-precision (R = 30)
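R-precision here is the fraction of relevant pages among the top R = 30 ranked results; a minimal sketch:

```python
def r_precision(ranked, relevant, R=30):
    """R-precision with fixed R (the slides use R = 30):
    |relevant pages in the top R| / R."""
    return sum(1 for d in ranked[:R] if d in relevant) / R
```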

User Profile Based on Pure Browsing History (Implicit Method)

Using P_per, P^(br), and P^(cur), the user profile P is defined as follows:

 P = a P_per + b (x P^(br) + y P^(cur)),

where a and b are constants that satisfy a + b = 1, and x and y are constants that satisfy x + y = 1.

Experimental Results

• A user profile that provides search results adaptive to a user can be constructed when a window size of about 15 days is used.
• This approach achieves about 3% higher precision than the relevance-feedback-based user profile.

⇒ The user's browsing history strongly reflects the user's preferences.

User Profile Based on Modified Collaborative Filtering (Implicit Method)

Using P_per, P^(br), and V^(pre), the user profile P is defined as follows:

 P = a P_per + b (x P^(br) + y V^(pre)),

where a and b are constants that satisfy a + b = 1, and x and y are constants that satisfy x + y = 1.

User profile construction based on
(1) a static number of users in the neighborhood,
(2) a dynamic number of users in the neighborhood.

Experimental Results (Dynamic Method)

• In all of our experimental results, the best precision is obtained with x = 0.129, y = 0.871.
• In this method, the neighborhood of each user is determined by the centroid vectors of clusters of users, and the number of clusters differs from user to user.
• This allows each user to perform a more fine-grained search than with the static method.

Thank you very much!
