Identifying Intention Posts in Discussion Forums Zhiyuan (Brett) Chen Bing Liu Meichun Hsu Malu Castellanos Riddhiman Ghosh

zchen/presentations/NAACL2013...


  • Identifying Intention Posts in Discussion Forums

    Zhiyuan (Brett) Chen

    Bing Liu

    Meichun Hsu

    Malu Castellanos

    Riddhiman Ghosh

  • What is Intention?

  • Example

    Hello, I am going to buy a new high-end gaming laptop my budget is below than 1500$ ram should be more than 6gb,graphics card must be more than 2gb and the processor should be intel core i7 3rd generations or better.

  • Explicit Intention

    I plan to buy this book.

    I am looking for a new car.

    I am going to travel to Atlanta.

  • Implicit Intention

    Anyone knows the battery life of iPhone?

    What are the components in this laptop?

  • Identifying Intention

    A totally NEW problem

    Explicit intention

    Applications like advertisement

  • Problem Definition

  • Two-Class Post Classification

    Transfer Learning: Use labeled data from other domains (source domains) to classify posts in the target domain

  • The ways to express an intention are similar in different domains.

  • Motivation Examples

    I want to buy a car.

    I want to buy a camera.

    I want to buy the tickets.

  • Special Difficulties

  • Noise

    I read many reviews of two Canon models which I'm considering for purchase, the Canon PowerShot S2 IS and the Canon PowerShot S3 IS. This is my second digital camera, my first being a Kodak EasyShare from about 4 years ago. …

  • Imbalanced Shared Features

  • EM Algorithm with NB (Nigam et al., 2000)

    E-step

    M-step

    Naïve Bayes
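
The E-step / M-step loop named on this slide can be sketched as follows. This is a minimal illustration of EM with a multinomial Naïve Bayes classifier in the spirit of Nigam et al. (2000), not the authors' implementation. The function names (`train_nb`, `posterior`, `em_nb`), the Laplace smoothing constant `alpha`, and the fixed iteration count are assumptions made for the example.

```python
import math
from collections import defaultdict

def train_nb(docs, weights, vocab, alpha=1.0):
    """M-step: estimate NB parameters from docs with soft labels.

    docs: list of token lists; weights: list of [P(neg|d), P(pos|d)].
    alpha is an assumed Laplace smoothing constant.
    """
    class_mass = [alpha, alpha]
    word_mass = [defaultdict(float), defaultdict(float)]
    for tokens, w in zip(docs, weights):
        for c in (0, 1):
            class_mass[c] += w[c]
            for t in tokens:
                word_mass[c][t] += w[c]
    total = sum(class_mass)
    log_prior = [math.log(m / total) for m in class_mass]
    log_like = []
    for c in (0, 1):
        denom = sum(word_mass[c].values()) + alpha * len(vocab)
        log_like.append({t: math.log((word_mass[c][t] + alpha) / denom) for t in vocab})
    return log_prior, log_like

def posterior(tokens, model):
    """E-step for one document: return [P(neg|d), P(pos|d)]."""
    log_prior, log_like = model
    scores = [log_prior[c] + sum(log_like[c][t] for t in tokens if t in log_like[c])
              for c in (0, 1)]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for numerical stability
    z = sum(exps)
    return [e / z for e in exps]

def em_nb(labeled, labels, unlabeled, iterations=10):
    """Train on labeled docs, then alternate E-step and M-step over the unlabeled docs."""
    vocab = {t for d in labeled + unlabeled for t in d}
    hard = [[1.0 - y, float(y)] for y in labels]
    model = train_nb(labeled, hard, vocab)  # initial classifier: labeled data only
    for _ in range(iterations):
        soft = [posterior(d, model) for d in unlabeled]            # E-step
        model = train_nb(labeled + unlabeled, hard + soft, vocab)  # M-step
    return model
```

In the transfer setting of this talk, `labeled` would hold source-domain posts and `unlabeled` would hold target-domain posts; tokens never seen in either set are ignored at prediction time.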

  • Proposed Models

    FS-EM

    Co-Class

  • FS-EM

    Incorporates feature selection into EM iterations

    Selects features from both labeled source (domain) data and unlabeled target (domain) data
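
One way to read "feature selection inside the EM iterations" is sketched below. It is an illustration under stated assumptions, not the FS-EM algorithm as published: the scoring function in `select_features` (difference in per-class document frequency) is a hypothetical stand-in because the slides do not name the selection criterion, hard predicted labels are used instead of soft posteriors to keep the example short, and the parameters `k` and `iterations` are invented for the example.

```python
import math
from collections import Counter

def select_features(docs, labels, k):
    """Keep the k tokens whose document-frequency rates differ most between classes.

    Hypothetical stand-in score; the slides do not state the criterion used.
    """
    df = [Counter(), Counter()]
    n = [max(1, labels.count(0)), max(1, labels.count(1))]
    for tokens, y in zip(docs, labels):
        df[y].update(set(tokens))
    vocab = set(df[0]) | set(df[1])
    score = {t: abs(df[1][t] / n[1] - df[0][t] / n[0]) for t in vocab}
    return set(sorted(vocab, key=lambda t: (-score[t], t))[:k])

def train_nb(docs, labels, feats, alpha=1.0):
    """Multinomial NB restricted to the selected features (Laplace smoothing assumed)."""
    counts = [Counter(), Counter()]
    for tokens, y in zip(docs, labels):
        counts[y].update(t for t in tokens if t in feats)
    prior = [math.log((labels.count(c) + alpha) / (len(labels) + 2 * alpha)) for c in (0, 1)]
    like = []
    for c in (0, 1):
        denom = sum(counts[c].values()) + alpha * len(feats)
        like.append({t: math.log((counts[c][t] + alpha) / denom) for t in feats})
    return prior, like

def predict(tokens, model):
    """Return 1 (intention post) or 0; ties go to 0."""
    prior, like = model
    s = [prior[c] + sum(like[c][t] for t in tokens if t in like[c]) for c in (0, 1)]
    return 1 if s[1] > s[0] else 0

def fs_em(source, source_labels, target, k=4, iterations=5):
    """Re-select features in every iteration from source plus pseudo-labeled target data."""
    feats = select_features(source, source_labels, k)
    model = train_nb(source, source_labels, feats)
    for _ in range(iterations):
        predicted = [predict(d, model) for d in target]   # label the target posts
        docs, labels = source + target, source_labels + predicted
        feats = select_features(docs, labels, k)          # feature selection step
        model = train_nb(docs, labels, feats)             # retrain on the new features
    return model, feats
```

The point of re-selecting inside the loop is that target-domain evidence can change which features the classifier relies on, rather than fixing the feature set once from the source domain.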

  • Co-Class

    Builds two classifiers based on FS-EM

    Solves the imbalanced shared feature problem
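
A rough sketch of the two-classifier idea follows. The slide only says that Co-Class builds two classifiers on top of FS-EM, so everything specific here is an assumption: training one classifier on the labeled source data and the other on target data with predicted labels, selecting features on the target side, averaging the two posteriors, and stopping when the predicted labels no longer change. The helper names, the scoring function, `k`, and `iterations` are hypothetical.

```python
import math
from collections import Counter

def select_features(docs, labels, k):
    """Hypothetical stand-in criterion: largest gap in per-class document frequency."""
    df = [Counter(), Counter()]
    n = [max(1, labels.count(0)), max(1, labels.count(1))]
    for tokens, y in zip(docs, labels):
        df[y].update(set(tokens))
    vocab = set(df[0]) | set(df[1])
    score = {t: abs(df[1][t] / n[1] - df[0][t] / n[0]) for t in vocab}
    return set(sorted(vocab, key=lambda t: (-score[t], t))[:k])

def train_nb(docs, labels, feats, alpha=1.0):
    """Multinomial NB over the selected features (Laplace smoothing assumed)."""
    counts = [Counter(), Counter()]
    for tokens, y in zip(docs, labels):
        counts[y].update(t for t in tokens if t in feats)
    prior = [math.log((labels.count(c) + alpha) / (len(labels) + 2 * alpha)) for c in (0, 1)]
    like = []
    for c in (0, 1):
        denom = sum(counts[c].values()) + alpha * len(feats)
        like.append({t: math.log((counts[c][t] + alpha) / denom) for t in feats})
    return prior, like

def posterior(tokens, model):
    """Return P(intention | post) under one NB model."""
    prior, like = model
    s = [prior[c] + sum(like[c][t] for t in tokens if t in like[c]) for c in (0, 1)]
    top = max(s)
    e = [math.exp(x - top) for x in s]
    return e[1] / (e[0] + e[1])

def co_class(source, source_labels, target, k=4, iterations=5):
    """Label target posts by combining a source-trained and a target-trained classifier."""
    feats = select_features(source, source_labels, k)
    model = train_nb(source, source_labels, feats)
    predicted = [int(posterior(d, model) > 0.5) for d in target]
    for _ in range(iterations):
        feats = select_features(target, predicted, k)      # features chosen on target data
        h_source = train_nb(source, source_labels, feats)  # classifier 1: labeled source data
        h_target = train_nb(target, predicted, feats)      # classifier 2: pseudo-labeled target data
        new = [int((posterior(d, h_source) + posterior(d, h_target)) / 2 > 0.5)
               for d in target]
        if new == predicted:  # assumed stopping rule: labels are stable
            break
        predicted = new
    return predicted
```

Giving the target domain its own classifier is one plausible way to keep target-specific features from being drowned out by the much larger set of source-domain features, which is the imbalance this slide refers to.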

  • Features & Feature Selection

    N-grams
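
As a small illustration of n-gram features, the sketch below extracts word unigrams and bigrams from a post. The lowercasing, whitespace tokenization, and the `n_max` parameter are assumptions for the example; the slides do not specify the preprocessing used.

```python
def ngrams(text, n_max=2):
    """Return word n-grams of length 1..n_max from a lowercased, whitespace-split post."""
    tokens = text.lower().split()
    feats = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats.append(" ".join(tokens[i:i + n]))
    return feats

print(ngrams("I want to buy a camera"))
```

Bigrams such as "want to" and "to buy" are shared by the three motivation examples (car, camera, tickets), which is the kind of domain-independent evidence that transfer learning relies on.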

  • Evaluation

  • Datasets

    4 Domains from different forums (http://www.cs.uic.edu/~zchen/)

    Human annotation

    Cross-validation


  • Evaluation Measures

  • Supervised Learning (One Domain)

  • Model Comparisons

    3TR-1TE (Aue & Gamon, 2005)

    EM (Nigam et al., 2000)

    ANB (Tan et al., 2009)

    FS-EM1

    FS-EM2

    Co-Class

  • Model Comparisons

  • Effects of Source Domains

  • Conclusions

    Novel problem of identifying intention

    Suitable for transfer learning

    Two special difficulties

    Effectiveness of Co-Class

  • Future Directions

    Sentence-level classification

    Extract intention components

  • Q & A

  • Dataset Download Link:

    http://www.cs.uic.edu/~zchen/
