Reranking Bilingually Extracted Paraphrases Using Monolingual Distributional Similarity
Charley Chan, Chris Callison-Burch, Benjamin Van Durme
Center for Language and Speech Processing, HLTCOE, Johns Hopkins University
GEMS 2011


  • Reranking Bilingually Extracted Paraphrases Using Monolingual Distributional Similarity

  • Paraphrasing  

    •  Goal  To  use  huge  amount  of  text  to  iden

  • Our  Approach  

    Bilingual Paraphrase Extraction

  • Pros  and  Cons  

    Bilingual Paraphrase Extraction

  • Scoring  Methods  

    • MonoDS – Monolingual Distributional Similarity

  • MonoDS  

    •  Based  on  distribu

  • MonoDS  

    •  Limita

  • BiP - Bilingual Paraphrase Extraction

  • SyntBiP  –  Syntac

  • Bilingual Paraphrase Extraction

  • BiP  scores  (probabili

  • Paraphrasing  Pipeline  

       

    Monolingual N-gram corpus

    Context Vector Construction
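
The slide names only a monolingual N-gram corpus and a context vector construction step. As a minimal sketch of what such a step could look like, the code below builds a vector of left and right neighbor counts for each phrase from toy n-gram counts and compares two phrases by cosine similarity. The feature definition, the cosine measure, the function names, and the toy counts are all assumptions made for illustration, not details taken from the slides.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(ngram_counts, phrases):
    """Collect left/right neighbor counts for each phrase (assumed features)."""
    vectors = defaultdict(Counter)
    for ngram, count in ngram_counts.items():
        tokens = ngram.split()
        for phrase in phrases:
            words = phrase.split()
            n = len(words)
            for i in range(len(tokens) - n + 1):
                if tokens[i:i + n] != words:
                    continue
                if i > 0:  # word immediately to the left of the phrase
                    vectors[phrase]["L:" + tokens[i - 1]] += count
                if i + n < len(tokens):  # word immediately to the right
                    vectors[phrase]["R:" + tokens[i + n]] += count
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


# Hypothetical toy n-gram counts, invented for this sketch.
toy_counts = {
    "a huge amount of money": 40,
    "a huge amount of people": 10,
    "a large number of people": 50,
    "the large number of votes": 5,
}
vecs = context_vectors(toy_counts, ["huge amount of", "large number of"])
similarity = cosine(vecs["huge amount of"], vecs["large number of"])
```

Phrases that share many neighbors get a similarity close to 1, and phrases with no shared neighbors get 0.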

  • BiP  scores  (probabili

  • Example: paraphrases of "huge amount of"

    BiP                       SyntBiP                  BiP + MonoDS
    large number of, 0.33     large number of, 0.38    huge amount of, 1.0
    in large numbers, 0.11    great number of, 0.09    large quantity of, 0.98
    great number of, 0.08     huge amount of, 0.06     large number of, 0.98
    large numbers of, 0.06    vast number of, 0.06     great number of, 0.97
    vast number of, 0.06      -                        vast number of, 0.94
    huge amount of, 0.06      -                        in large numbers, 0.10
    large quantity of, 0.03   -                        large numbers of, 0.08

     

    • BiP-MonoDS – Less weight on gramma
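
The reranking in the "huge amount of" example can be sketched as a sort of the BiP candidates by their MonoDS score. The scores below are copied from that example. The function name `rerank` and the dictionary layout are assumptions for illustration only.

```python
# Scores copied from the "huge amount of" example in the slides.
bip_scores = {
    "large number of": 0.33,
    "in large numbers": 0.11,
    "great number of": 0.08,
    "large numbers of": 0.06,
    "vast number of": 0.06,
    "huge amount of": 0.06,
    "large quantity of": 0.03,
}
monods_scores = {
    "huge amount of": 1.0,
    "large quantity of": 0.98,
    "large number of": 0.98,
    "great number of": 0.97,
    "vast number of": 0.94,
    "in large numbers": 0.10,
    "large numbers of": 0.08,
}


def rerank(candidates, new_scores):
    """Reorder bilingually extracted candidates by a monolingual score."""
    return sorted(candidates, key=lambda phrase: new_scores[phrase], reverse=True)


reranked = rerank(bip_scores, monods_scores)
```

Under BiP alone, "in large numbers" ranks second. After reranking it falls to sixth place, next to last, while the five candidates with MonoDS scores of 0.94 or higher fill the top of the list.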

  • Human Evaluation

  • Human Evaluation

  • Human Evaluation

  • Correla

  • Thresholding by Score

    [Figure: panels BiP and MonoDS; series Meaning and Grammar; axis ticks from 2 to 4.5 and from 0.1 to 1.]

    Chan, Callison-Burch, Van Durme, GEMS 2011

    Meaning:  substan

  • Combining  MonoDS  and  BiP  Thresholds  

    • Joint Thresholding – Tighter threshold → higher paraphrase quality – Addi
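
Joint thresholding can be sketched as keeping only those candidates whose BiP and MonoDS scores both clear a cutoff. The scores reuse the "huge amount of" example from earlier in the deck. The threshold values and the function name are hypothetical, chosen only to show that a tighter pair of thresholds keeps fewer candidates.

```python
# (BiP score, MonoDS score) pairs from the "huge amount of" example.
candidates = {
    "large number of": (0.33, 0.98),
    "in large numbers": (0.11, 0.10),
    "great number of": (0.08, 0.97),
    "large numbers of": (0.06, 0.08),
    "vast number of": (0.06, 0.94),
    "huge amount of": (0.06, 1.0),
    "large quantity of": (0.03, 0.98),
}


def joint_threshold(scored, bip_min, monods_min):
    """Keep phrases whose BiP and MonoDS scores both meet their thresholds."""
    return [
        phrase
        for phrase, (bip, monods) in scored.items()
        if bip >= bip_min and monods >= monods_min
    ]


# Hypothetical threshold values, not taken from the slides.
loose = joint_threshold(candidates, bip_min=0.05, monods_min=0.90)
tight = joint_threshold(candidates, bip_min=0.08, monods_min=0.95)
```

With these illustrative cutoffs, the loose setting keeps four candidates and the tight setting keeps two.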

  • Effects  of  Bilingual  Training  Corpus  

    •  Bri

  • Context  Influence  

    •  {hauled,  delivered}  

     countries which do not comply with community legislation should be hauled before the court of justice …

     → MonoDS takes advantage of context information


    ↑ Meaning, Grammar

    ↑ BiP + MonoDS

    ↓ BiP

  • Context  Influence  

    •  {fiscal  burden,  taxes}      

     (1) … the member states can reduce the fiscal burden consisting of taxes and social contributions .

     (2) ... and , in addition , the fiscal burden in europe has reached an all-time high of 46 %

         


    ↓ Meaning, Grammar

    ↑ Meaning, Grammar

    → MonoDS is more useful in context that defines the context vector

    ↑ BiP   ↓ BiP + MonoDS

  • MonoDS  Limita

  • Summary  

    •  Presented  a  novel  paraphrase  acquisi

  • Summary  

    •  Future  direc

  • Thank  you      

    Ques