A Probabilistic Model for Fine-Grained Expert Search

Shenghua Bao, Huizhong Duan, Qi Zhou, Miao Xiong, Yunbo Cao, Yong Yu

June 16-18, 2008, Columbus, Ohio

Schedule

1. Introduction

2. Fine-grained Expert Search

3. Experimental Results

4. Conclusion

Introduction: Expert Search

“Who is an expert on X?”

[Figure: User, Query, Search Engine, Experts]

Who are experts on Semantic Web Search Engine?

Introduction: Pioneering Expert Search Systems

Log data in software development: Kautz et al., 1996; Mockus and Herbsleb, 2002; McDonald and Ackerman, 1998; etc.

Email communications: Campbell et al., 2003; Dom et al., 2003; Sihn and Heeren, 2001; etc.

General documents: Yimam, 1996; Davenport and Prusak, 1998; Steer and Lochbaum, 1988; Mattox et al., 1999; Hertzum and Pejtersen, 2000; Craswell et al., 2001; etc.

Introduction: Expert Search at TREC

A new task at TREC 2005, 2006, 2007: Craswell et al., 2005; Soboroff et al., 2006; Bailey et al., 2007

Many approaches have been proposed:

• Two generative models (Balog et al., 2006)

• Prior distribution, relevance feedback (Fang et al., 2006)

• Hierarchical language model (Petkova and Croft, 2006)

• Voting and data fusion (Macdonald and Ounis, 2006)

• …

Introduction

Coarse-grained approach: expert search is carried out at the granularity of whole documents. Further improvements are hard to achieve.

Different blocks of electronic documents have different functions and qualities, and different impacts on expert search.

Windowed Section Relation

[Figure labels: queried topic; window; relevant; irrelevant]

Examples: Title-Author Relation

[Figure labels: Author; Query: Timed Text]

Examples: Reference Section Relation

Examples

[Figure label: Query: W3C Management Team]

Examples: Section Title-Body Relation


Fine-grained Evidence

Who are experts on Semantic Web Search Engine?

Fine-grained Expert Search -- Evidence Extraction

• Document-001:

“…a high-level plan of the architecture of the semantic web by Tim Berners-Lee…”

“…later, Berners-Lee describes a semantic web search engine experience…”

Evidence format: <topic, person, relation, document>

E1: <semantic web, Tim Berners-Lee, same-section, document-001>

E2: <semantic web search engine, Berners-Lee, same-section, document-001>
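The quadruple format can be illustrated with a minimal Python sketch. It assumes a deliberately naive extractor: sections are plain strings, topics and person names are matched by case-insensitive substring search, and the longest hit wins. The names `Evidence`, `longest_match`, and `extract_same_section` are hypothetical and are not taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One <topic, person, relation, document> quadruple."""
    topic: str
    person: str
    relation: str
    document: str


def longest_match(candidates, text):
    """Return the longest candidate string found in text, or None."""
    lowered = text.lower()
    hits = [c for c in candidates if c.lower() in lowered]
    return max(hits, key=len) if hits else None


def extract_same_section(sections, topics, person_names, doc_id):
    """Emit one quadruple per section that mentions both a topic and a person."""
    evidence = []
    for text in sections:
        topic = longest_match(topics, text)
        person = longest_match(person_names, text)
        if topic is not None and person is not None:
            evidence.append(Evidence(topic, person, "same-section", doc_id))
    return evidence


# The two snippets quoted for Document-001.
sections = [
    "a high-level plan of the architecture of the semantic web by Tim Berners-Lee",
    "later, Berners-Lee describes a semantic web search engine experience",
]
found = extract_same_section(
    sections,
    topics=["semantic web", "semantic web search engine"],
    person_names=["Tim Berners-Lee", "Berners-Lee"],
    doc_id="document-001",
)
for item in found:
    print(item)
```

Under these assumptions the two extracted quadruples correspond to E1 and E2. A real extractor would also need section segmentation and the other relation types.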

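Fine-grained Expert Search -- Search Model

Evidence: <topic, person, relation, document>, written (t, p, r, d)

Expert candidate: c

Query: q

$$P(c \mid q) = \sum_{e} P(c, e \mid q) = \sum_{e} P(c \mid e, q)\, P(e \mid q)$$

$$P(c \mid q) = \sum_{e} P(c \mid e)\, P(e \mid q) \quad \text{(assuming } c \text{ is independent of } q \text{ given } e\text{)}$$

Expert matching model: P(c | e)

Evidence matching model: P(e | q)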
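As a rough illustration of the search model, the sketch below sums P(c|e) P(e|q) over evidence and ranks candidates by the total. The function name, the dictionary layout, and all probability values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import defaultdict


def rank_candidates(p_e_given_q, p_c_given_e):
    """Score each candidate c by the sum over evidence e of P(c|e) * P(e|q).

    p_e_given_q: {evidence_id: P(e|q)}
    p_c_given_e: {evidence_id: {candidate: P(c|e)}}
    """
    totals = defaultdict(float)
    for evidence_id, p_eq in p_e_given_q.items():
        for candidate, p_ce in p_c_given_e.get(evidence_id, {}).items():
            totals[candidate] += p_ce * p_eq
    # Highest score first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical values: two pieces of evidence, two candidates.
p_e_given_q = {"E1": 0.25, "E2": 0.75}
p_c_given_e = {
    "E1": {"candidate_a": 1.0},
    "E2": {"candidate_a": 0.5, "candidate_b": 0.5},
}
ranking = rank_candidates(p_e_given_q, p_c_given_e)
print(ranking)
```

Here candidate_a collects 0.25 from E1 and 0.375 from E2, so it ranks above candidate_b, which is supported only by E2.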

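Fine-grained Expert Search -- Expert Matching

$$P(c \mid e) = P(c \mid p, r, d) = P(c \mid p)\, P(p \mid r, d)$$

$$P(c \mid p) = P_{type}(c, p)$$

$$P(p \mid r, d) \propto \mathrm{freq}(p, r, d)$$

$$P(p \mid r, d) = \mu\, P(p \mid r, d) + (1 - \mu)\, P(p \mid r, d') \quad \text{(smoothing, with weight } \mu\text{)}$$

| Mask | Sample |
| --- | --- |
| Full Name | Ritu Raj Tiwari |
| Email Name | rtiwari@nuance.com |
| Combined Name | Tiwari, Ritu R |
| Abbr. Name | Ritu Raj; Ritu |
| Short Name | RRT |
| Alias, new email | rtiwari@hotmail.com |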
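The following sketch shows one way the name masks and the expert matching score could fit together. It is an assumption-laden illustration: the mask-generation rules are inferred from the sample column only, the `TYPE_WEIGHT` values and the default smoothing weight are invented, and email and alias masks are left out because they cannot be derived from a name.

```python
def name_masks(full_name):
    """Derive name variants (masks) of a candidate from the full name."""
    parts = full_name.split()
    first, last, middle = parts[0], parts[-1], parts[1:-1]
    middle_initials = " ".join(m[0] for m in middle)
    return {
        "full": {full_name},
        "combined": {f"{last}, {first} {middle_initials}".strip()},
        "abbr": {" ".join(parts[:-1]), first},
        "short": {"".join(p[0] for p in parts).upper()},
    }


# Hypothetical P_type(c, p) weights, one per mask type.
TYPE_WEIGHT = {"full": 1.0, "combined": 0.75, "abbr": 0.5, "short": 0.25}


def p_candidate_given_person(candidate_name, mention):
    """P(c|p): weight of the first mask type under which the mention matches."""
    for mask_type, variants in name_masks(candidate_name).items():
        if mention in variants:
            return TYPE_WEIGHT[mask_type]
    return 0.0


def smoothed_person_prob(p_in_doc, p_background, mu=0.5):
    """Interpolate the in-document estimate of P(p|r,d) with a background one."""
    return mu * p_in_doc + (1 - mu) * p_background


def expert_match(candidate_name, mention, p_in_doc, p_background, mu=0.5):
    """P(c|e) = P(c|p) * P(p|r,d), using the smoothed person probability."""
    return p_candidate_given_person(candidate_name, mention) * smoothed_person_prob(
        p_in_doc, p_background, mu
    )


masks = name_masks("Ritu Raj Tiwari")
score = expert_match("Ritu Raj Tiwari", "Ritu", p_in_doc=0.5, p_background=0.25)
print(masks["combined"], masks["short"], score)
```

A weaker mask such as a bare first name lowers P(c|p), which is how ambiguous mentions contribute less to a candidate's score in this sketch.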

<topic, person, relation, document> (<t, p, r, d> for short)

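Fine-grained Expert Search -- Evidence Matching

$$P(e \mid q) = P(t, p, r, d \mid q) = P(t \mid q)\, P(p \mid q)\, P(r \mid q)\, P(d \mid q) \quad \text{(treating } t, p, r, d \text{ as independent given } q\text{)}$$

$$P(t \mid q) = P_{type}(t, q)$$

$$P(p \mid q) = P(p)$$

$$P(r \mid q) = P(type(r))$$

$$P(d \mid q) \propto P(q \mid d)\, P(d)$$

$$P(e \mid q) \propto P_{type}(t, q)\, P(p)\, P(type(r))\, P(q \mid d)\, P(d)$$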
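The final product can be sketched in a few lines of Python. The two weight tables are hypothetical placeholders for P_type(t, q) and P(type(r)); the talk does not give their values, and in practice they would be tuned on training topics.

```python
from math import prod

# Hypothetical weights standing in for P_type(t, q), keyed by query match type.
QUERY_MATCH_WEIGHT = {
    "phrase": 1.0, "bigram": 0.5, "proximity": 0.5, "fuzzy": 0.25, "stemmed": 0.25,
}

# Hypothetical weights standing in for P(type(r)), keyed by relation type.
RELATION_WEIGHT = {
    "same-section": 0.25, "windowed-section": 0.5, "reference-section": 0.5,
    "title-author": 1.0, "section-title-body": 1.0,
}


def evidence_score(match_type, relation, p_person, p_query_given_doc, p_doc):
    """Unnormalised P(e|q): P_type(t,q) * P(p) * P(type(r)) * P(q|d) * P(d)."""
    return prod([
        QUERY_MATCH_WEIGHT[match_type],
        p_person,
        RELATION_WEIGHT[relation],
        p_query_given_doc,
        p_doc,
    ])


# Hypothetical evidence: exact phrase match found through a title-author relation.
score = evidence_score(
    "phrase", "title-author", p_person=0.5, p_query_given_doc=0.5, p_doc=0.25
)
print(score)
```

Because the score is only proportional to P(e|q), it is meaningful for comparing pieces of evidence for the same query, not as an absolute probability.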

| Type | Sample |
| --- | --- |
| Query | Semantic Web Search Engine |
| Phrase | “Semantic Web Search Engine” |
| Bi-gram | “Semantic Web”, “Search Engine” |
| Proximity | “Semantic … Web Search Engine” |
| Fuzzy | “Samentic Web Saerch Engine” |
| Stemmed | “Semantic Web Search Engin” |
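A small classifier can make the match types concrete. This is a sketch under several assumptions: the order in which types are tested, the gap allowed for proximity matches, the use of all adjacent bi-grams, and the similarity threshold for fuzzy matches are all choices made here, not details from the talk. Stemmed matching is omitted because it needs a stemmer, which the Python standard library does not provide.

```python
import re
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def contains_sequence(haystack, needle):
    """True if the token list needle occurs contiguously in haystack."""
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))


def in_proximity(haystack, needle, max_gap):
    """True if needle's tokens occur in order with at most max_gap tokens between."""
    for start, token in enumerate(haystack):
        if token != needle[0]:
            continue
        pos = start
        for term in needle[1:]:
            window = haystack[pos + 1:pos + 2 + max_gap]
            if term not in window:
                break
            pos = pos + 1 + window.index(term)
        else:
            return True
    return False


def match_type(query, text, max_gap=2, fuzzy_threshold=0.8):
    """Return the strictest match type that holds between query and text."""
    q, t = tokenize(query), tokenize(text)
    if not q or not t:
        return None
    if contains_sequence(t, q):
        return "phrase"
    if in_proximity(t, q, max_gap):
        return "proximity"
    bigrams = [q[i:i + 2] for i in range(len(q) - 1)]
    if any(contains_sequence(t, b) for b in bigrams):
        return "bigram"
    if SequenceMatcher(None, " ".join(q), " ".join(t)).ratio() >= fuzzy_threshold:
        return "fuzzy"
    return None


query = "Semantic Web Search Engine"
for text in [
    "a Semantic Web Search Engine demo",
    "Semantic markup for Web Search Engine",
    "a search engine for images",
    "Samentic Web Saerch Engine",
]:
    print(text, "->", match_type(query, text))
```

Returning a single type per text keeps the example short; a fuller implementation might record every type that matches and let the weights decide.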

Relation Type

Same Section

Windowed Section

Reference Section

Title-Author

Section Title-Body

Quality Type

Dynamic Quality

Static Quality


Experimental Result: W3C Corpus

331,307 web pages

10 training topics of TREC 2005

50 test topics of TREC 2005

49 test topics of TREC 2006

Evaluation Metrics: mean average precision (MAP), R-precision (R-P), top N precision (P@N)
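The three metrics follow their standard information-retrieval definitions, which can be written out as a short Python sketch. The ranked list and relevance judgments below are hypothetical; the function names are chosen here for illustration.

```python
def precision_at_n(ranked, relevant, n):
    """P@N: fraction of the top n ranked candidates that are relevant."""
    return sum(1 for c in ranked[:n] if c in relevant) / n


def r_precision(ranked, relevant):
    """R-P: precision at rank R, where R is the number of relevant candidates."""
    r = len(relevant)
    return precision_at_n(ranked, relevant, r) if r else 0.0


def average_precision(ranked, relevant):
    """Mean of the precision values at the rank of each relevant candidate."""
    hits, total = 0, 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def mean_average_precision(runs):
    """MAP: average precision averaged over (ranked, relevant) pairs, one per topic."""
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical topic: four ranked candidates, two of them judged relevant.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c"}
print(precision_at_n(ranked, relevant, 2))
print(r_precision(ranked, relevant))
print(average_precision(ranked, relevant))
```

For this topic the relevant candidates sit at ranks 1 and 3, so average precision is (1/1 + 2/3) / 2.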

Experimental Result

Query Matching

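| Setting | TREC 2005 MAP | TREC 2005 R-P | TREC 2005 P@10 | TREC 2006 MAP | TREC 2006 R-P | TREC 2006 P@10 |
| --- | --- | --- | --- | --- | --- | --- |
| Baseline | 0.1840 | 0.2136 | 0.3060 | 0.3752 | 0.4585 | 0.5604 |
| + Bi-gram | 0.1957 | 0.2438 | 0.3320 | 0.4140 | 0.4910 | 0.5799 |
| + Proximity | 0.2024 | 0.2501 | 0.3360 | 0.4530 | 0.5137 | 0.5922 |
| + Fuzzy, Stemmed | 0.2030 | 0.2501 | 0.3360 | 0.4580 | 0.5112 | 0.5901 |
| Improv. | 10.33% | 17.09% | 9.80% | 22.07% | 11.49% | 5.30% |

T-test: 0.0084 (TREC 2005), 0.0000 (TREC 2006)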
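The Improv. row in the query matching results appears to be the relative gain of the last configuration over the baseline. Assuming that reading, the MAP figures can be reproduced with a short calculation; the function name is chosen here for illustration.

```python
def relative_improvement(baseline, improved):
    """Relative gain of improved over baseline, in percent."""
    return (improved - baseline) / baseline * 100.0


# MAP values from the query matching results: baseline vs. "+ Fuzzy, Stemmed".
map_2005 = relative_improvement(0.1840, 0.2030)
map_2006 = relative_improvement(0.3752, 0.4580)
print(f"TREC 2005 MAP: {map_2005:.2f}%")  # 10.33%
print(f"TREC 2006 MAP: {map_2006:.2f}%")  # 22.07%
```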

Experimental Result

Person Matching

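| Setting | TREC 2005 MAP | TREC 2005 R-P | TREC 2005 P@10 | TREC 2006 MAP | TREC 2006 R-P | TREC 2006 P@10 |
| --- | --- | --- | --- | --- | --- | --- |
| Baseline | 0.2030 | 0.2501 | 0.3360 | 0.4580 | 0.5112 | 0.5901 |
| + Combined Name | 0.2056 | 0.2539 | 0.3463 | 0.4709 | 0.5152 | 0.5931 |
| + Abbr. Name | 0.2106 | 0.2545 | 0.3400 | 0.5010 | 0.5181 | 0.6000 |
| + Short Name | 0.2111 | 0.2578 | 0.3400 | 0.5121 | 0.5192 | 0.6000 |
| + Alias, new email | 0.2156 | 0.2591 | 0.3400 | 0.5221 | 0.5212 | 0.6000 |
| Improv. | 6.21% | 3.60% | 1.19% | 14.00% | 1.96% | 1.68% |

T-test: 0.0064 (TREC 2005), 0.0057 (TREC 2006)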

Experimental Result

Multiple Relations

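| Setting | TREC 2005 MAP | TREC 2005 R-P | TREC 2005 P@10 | TREC 2006 MAP | TREC 2006 R-P | TREC 2006 P@10 |
| --- | --- | --- | --- | --- | --- | --- |
| Baseline | 0.2156 | 0.2591 | 0.3400 | 0.5221 | 0.5212 | 0.6000 |
| + Windowed Section | 0.2158 | 0.2633 | 0.3380 | 0.5255 | 0.5311 | 0.6082 |
| + Reference Section | 0.2160 | 0.2630 | 0.3380 | 0.5272 | 0.5314 | 0.6061 |
| + Title-Author | 0.2234 | 0.2634 | 0.3580 | 0.5354 | 0.5355 | 0.6245 |
| + Section Title-Body | 0.2586 | 0.3107 | 0.3740 | 0.5657 | 0.5669 | 0.6510 |
| Improv. | 19.94% | 19.91% | 10.00% | 8.35% | 8.77% | 8.50% |

T-test: 0.0013 (TREC 2005), 0.0043 (TREC 2006)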

Experimental Result

Evidence Quality

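| Setting | TREC 2005 MAP | TREC 2005 R-P | TREC 2005 P@10 | TREC 2006 MAP | TREC 2006 R-P | TREC 2006 P@10 |
| --- | --- | --- | --- | --- | --- | --- |
| Baseline | 0.2586 | 0.3107 | 0.3740 | 0.5657 | 0.5669 | 0.6510 |
| + Static quality | 0.2711 | 0.3188 | 0.3720 | 0.5900 | 0.5813 | 0.6796 |
| + Dynamic quality | 0.2755 | 0.3252 | 0.3880 | 0.5943 | 0.5877 | 0.7061 |
| Improv. | 6.13% | 4.67% | 3.74% | 2.86% | 3.67% | 8.61% |
| Rank 1 @TREC | 0.2749 | 0.3330 | 0.4520 | 0.5947 | 0.5783 | 0.7041 |

T-test: 0.0360 (TREC 2005), 0.0252 (TREC 2006)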


Conclusion

Fine-grained expert search

Probabilistic model and its implementation

Evaluation on the TREC data set
