Transcript
Page 1: Analysis of Similarity Measures between Short Text for the ...research.nii.ac.jp/ntcir/workshop/OnlineProceedings12/...Analysis of Similarity Measures between Short Text for the NTCIR-12

AnalysisofSimilarityMeasuresbetweenShortTextforthe NTCIR-12ShortTextConversationTask

KozoChikai andYukiAraseGraduateschoolofinformationscienceandtechnology,OsakaUniversity

IntroductionGoalGivenaninputpost,searchappropriaterepliesfromthepool.

Approachwecomparethestate-of-the-artmethodsforestimatingtextsimilaritiestoinvestigatetheirperformanceinhandlingshorttext,specially,underthescenarioofshorttextconversation.

SystemDesignWeimplementedthesefivemethodsasSimilarityCalculation.

① TF-IDF② WeightedTextMatrixFactorization③ Word2vec→ TF-IDF④ RandomForest→ TF-IDF⑤ RandomForest+ TF-IDF

Results

00.050.10.150.20.250.30.350.4

ndcg-1 nerr-5 accuracy2-1 accuracy2-5 accuracy12-1 accuracy12-5

Meanoflabels

EvaluationMethod

Resultofexperiments

Oni-J-R3(①TF-IDF) Oni-J-R5(②WTMF)Oni-J-R4(③Word2Vec→TF-IDF) Oni-J-R1(④RandomForest→TF-IDF)Oni-J-R2(⑤RandomForest+TF-IDF)

0.3

0.5

0.7

0.9

rank1 rank2 rank3 rank4 rank5

Meanoflabe

ls

Order

Meanoflabelsforeachrank

Oni-J-R3(①TF-IDF) Oni-J-R5(②WTMF)Oni-J-R4(③Word2Vec→TF-IDF) Oni-J-R1(④RandomForest→TF-IDF)Oni-J-R2(⑤RandomForest+TF-IDF)

Recommended