Research on Extractive Summarization
at UFPE/UFRPE Universities: Developed Approaches and Perspectives
Rinaldo Lima
Federal Rural University of Pernambuco, Recife/Brazil
Dec/2019
Agenda
• Presenting UFPE/UFRPE
• Automatic Text Summarization (ATS)
– Main Approaches
– Evaluation Methods, Metrics, and Datasets
• ATS Work at UFRPE/UFPE
• Current Trends in ATS
• Research Collaboration
Presenting:
UFRPE and UFPE
4
RECIFE (PE/Brazil)
Olinda
Porto de Galinhas Beach
Old Recife
One of Recife's bridges (downtown)
UFRPE/UFPE
UFRPE
• Headquartered on the Recife campus, with 5 other campuses in Pernambuco
• More than 1,000 teachers, 900 technicians, and 17,000 students
• UFRPE offers courses in Agricultural Sciences, Veterinary Medicine, Humanities, Biological Sciences, Computer Science, Physics, etc.
• The Computing Department conducts research on Computational Intelligence and Modelling, Software Engineering, Computational Systems, Text Mining, among others
5
UFPE
• The Informatics Center (CIn) at UFPE was established in 1974, 45 years ago
• Research at CIn encompasses the following areas: Databases, Computer Architecture, Software Engineering, Computational Intelligence, Distributed Systems, among others
• CIn is ranked among the top-quality academic centers in Latin America
• Its faculty consists of more than 80 PhDs
6
UFRPE/UFPE
Automatic Text Summarization
8
Automatic Text Summarization (ATS/TS)
Definitions
• A summary is a reductive transformation of a source text into a summary text by extraction or generation [Spärck-Jones and Sakai (2001)]
• A condensed version of a source document having a recognizable genre and a very specific purpose: to give the reader an exact and concise idea of the contents of the source [Saggion and Lapalme (2002)]
• The compression rate is the ratio between the length of the summary and the length of the source document, as written below
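A standard way to write the compression rate τ for a summary S produced from a document D (lengths measured in words or sentences; the exact form on the original slide is assumed, not preserved):

```latex
\tau \;=\; \frac{|S|}{|D|}
```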
Motivation to create summaries
• more and more text data
• less time to process or read it
Main goals in ATS:
• optimizing topic coverage (informativeness)
• optimizing readability
• obtaining cohesive and coherent summaries
• avoiding redundancy
9
Automatic Text Summarization (ATS/TS)
ATS Categorization
10
11
Automatic Text Summarization (ATS/TS)
Simplified General Summarization Process
TS Applications
• increasing the performance of traditional IR and IE systems (e.g., coupled with a Question-Answering system)
• opinion summarization
• news summarization
• blog, tweet, web page, email thread summarization
• report summarization for business men, politicians, researchers, etc.;
• meeting summarization;
• biographical extracts
• automatic extraction and generation of titles and papers abstracts
• domain-specific summarization (domains of medicine, chemistry, law, etc.)
• …
12
Automatic Text Summarization (ATS/TS)
13
Fields of research which have influenced the development of ATS
Automatic Text Summarization (ATS/TS)
Text Summarization
Main Approaches
15
• Supervised Content Selection
• Unsupervised Content Selection (topic-based)
• Graph-based (LexRank)
• Lexical Chains
• Deep NLP-based
Mono and Multidocument
Main Approaches to ATS
16
Supervised Content Selection
16
17
Supervised Content Selection
• Given: a labeled training set of good summaries for each document
• Align: the sentences in the document with the sentences in the summary
• Extract features:
– position (first sentence?)
– length of sentence
– word informativeness, cue phrases
• Train: a binary classifier (put sentence in summary? yes or no)
• Problems:
– hard to get labeled training data
– alignment is difficult
– performance not better than unsupervised algorithms
• So in practice: unsupervised content selection is more common
18
Unsupervised content selection
• Intuition dating back to Luhn (1958):
– Choose sentences that have salient or informative words
• Two approaches to defining salient words (see the sketch below):
1. tf-idf: weigh each word $w_i$ in document $j$ by its tf-idf:
$\mathrm{weight}(w_i) = tf_{ij} \times idf_i$
2. topic signature: choose a smaller set of salient words via
• mutual information
• log-likelihood ratio (LLR), Dunning (1993), Lin and Hovy (2000):
$\mathrm{weight}(w_i) = \begin{cases} 1 & \text{if } -2\log\lambda(w_i) > 10 \\ 0 & \text{otherwise} \end{cases}$
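A minimal Python sketch of the two weighting schemes above; the threshold of 10 follows the LLR cut-off on the slide, while the corpus, tokenization, and background LLR statistics are placeholder assumptions:

```python
import math
from collections import Counter

def tfidf_weights(doc_tokens, corpus):
    """Approach 1: weight each word of a document by tf-idf.
    `corpus` is a list of token lists (the document collection)."""
    df = Counter()
    for doc in corpus:
        df.update(set(doc))                    # document frequency
    tf = Counter(doc_tokens)                   # term frequency in this document
    n_docs = len(corpus)
    return {w: tf[w] * math.log(n_docs / df[w]) for w in tf if df[w]}

def topic_signature(llr_scores, cutoff=10.0):
    """Approach 2: keep only words whose -2 log lambda (LLR against a
    background corpus, computed elsewhere) exceeds the cut-off."""
    return {w: 1 for w, llr in llr_scores.items() if llr > cutoff}
```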
Graph-based (Ranking Methods)
19
The main idea is that one sentence "recommends" another one in the graph
N = number of vertices
d = damping factor
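Presumably the formula behind these definitions is the continuous LexRank score of Erkan and Radev (2004), where sim is the (idf-modified) cosine similarity between sentence vectors:

```latex
p(u) \;=\; \frac{d}{N} \;+\; (1-d)\sum_{v \in \mathrm{adj}(u)}
\frac{\mathrm{sim}(u,v)}{\sum_{z \in \mathrm{adj}(v)} \mathrm{sim}(z,v)}\, p(v)
```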
Graph-based Approach (Ranking idea)
• Sentences as vertices
• Similarity as edges
• Iterative ranking: LexRank, TextRank (inspired by PageRank); a small power-iteration sketch follows below
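A compact power-iteration sketch of this ranking idea; the similarity values, the damping factor d = 0.15, and the number of iterations are illustrative assumptions:

```python
import numpy as np

def graph_rank(sim, d=0.15, iters=50):
    """Sentences are vertices; the row-normalized similarity matrix plays the
    role of links, and scores are propagated PageRank-style."""
    n = len(sim)
    trans = sim / sim.sum(axis=1, keepdims=True)   # each row sums to 1
    scores = np.ones(n) / n
    for _ in range(iters):
        scores = d / n + (1 - d) * (trans.T @ scores)
    return scores

# Toy example with 3 sentences and made-up cosine similarities:
sim = np.array([[1.0, 0.3, 0.1],
                [0.3, 1.0, 0.5],
                [0.1, 0.5, 1.0]])
print(graph_rank(sim))   # the highest-scoring sentences go into the summary
```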
20
Graph-based ATS
Components in a Graph-based ATS system
Graph-based (LexRank Demo)
21
• https://yale-lily.github.io/demos/demos/lexrank/lexrankmead.html
• https://github.com/rrajasek95/lexrank-demo (Python code)
Lexical Chains
22
23
Lexical Chains (Example)
24
Lexical Chains
A sentence is extracted from strong chains
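A simplified greedy sketch of how such chains can be built with WordNet; the relatedness test (shared synsets or direct hypernymy) and the "strong chain = longest chain" criterion are illustrative assumptions, not the exact algorithm behind the slide:

```python
from nltk.corpus import wordnet as wn   # requires the NLTK WordNet data

def related(w1, w2):
    """Two nouns are related if they share a synset or one word's synsets
    contain a direct hypernym of the other's."""
    s1 = set(wn.synsets(w1, pos=wn.NOUN))
    s2 = set(wn.synsets(w2, pos=wn.NOUN))
    if s1 & s2:
        return True
    hypers1 = {h for s in s1 for h in s.hypernyms()}
    hypers2 = {h for s in s2 for h in s.hypernyms()}
    return bool((hypers1 & s2) or (hypers2 & s1))

def build_chains(nouns):
    """Greedily append each noun to the first chain containing a related word."""
    chains = []
    for w in nouns:
        for chain in chains:
            if any(related(w, member) for member in chain):
                chain.append(w)
                break
        else:
            chains.append([w])
    return chains

chains = build_chains(["virus", "infection", "illness", "treatment", "cough"])
strong = max(chains, key=len)   # sentences containing words of strong chains are extracted
```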
25
Rhetorical Structure Theory (RST) assumes that a text is divided into a hierarchical structure of Elementary Discourse Units (EDUs)
The EDUs are either satellites or nuclei
• A satellite needs a nucleus to be intelligible, but the opposite is not true
An algorithm is applied which weights and orders each EDU according to the tree structure of the discourse (see the sketch below)
• The higher the element in the structure, the more significant its weight
Rhetorical Analysis for ATS
In RST, there is a limited number of universal and fixed relations, including elaboration, condition, illustration, opposition, concession, justification, etc.
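A toy sketch of this depth-based weighting in the spirit of Marcu's algorithm; the tree encoding (nested dicts with one nucleus and one satellite child) and the scoring rule are simplifying assumptions:

```python
def promotion(node):
    """Promotion set of an RST subtree: a leaf promotes itself; an internal
    node promotes whatever its nucleus child promotes (satellites never are)."""
    if isinstance(node, int):          # leaf = EDU index
        return {node}
    return promotion(node["nucleus"])

def edu_depths(node, depth=0, best=None):
    """Record, for each EDU, the shallowest level at which it is promoted:
    EDUs promoted near the root are the most significant."""
    best = {} if best is None else best
    for edu in promotion(node):
        best[edu] = min(best.get(edu, depth), depth)
    if not isinstance(node, int):
        edu_depths(node["nucleus"], depth + 1, best)
        edu_depths(node["satellite"], depth + 1, best)
    return best

# Toy tree: EDU 0 is elaborated by (EDU 1 conditioned on EDU 2).
tree = {"nucleus": 0, "satellite": {"nucleus": 1, "satellite": 2}}
print(edu_depths(tree))   # {0: 0, 1: 1, 2: 2} -> EDU 0 is the most salient
```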
Text Summarization
Summary Evaluation
Summary Evaluation
• Challenging task, because human judges differ in the scores they assign to the same summaries
• There is no single "ideal summary"
– It depends on the domain and the user's intentions
• The DUC (2001-2007) and TAC (2008-2011) evaluation campaigns proposed many types of tasks and evaluation metrics
27
Evaluation Methods
Categorized according to the level of human effort required in the evaluation:
– Manual Evaluation
– Semi-automatic methods
• ROUGE
• Pyramid
– Automatic methods
• FRESA
28
Summary Evaluation
Summary Evaluation
29
ROUGE (Recall-Oriented Understudy for Gisting Evaluation) [Lin and Hovy, 2003]
ROUGE (Basic Idea)
30
Summary Evaluation
31
www.rxnlp.com/rouge-2-0
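The basic idea fits in a few lines; this sketch computes plain ROUGE-N recall over pre-tokenized text and ignores the stemming, stopword, and F-score options of the real ROUGE package:

```python
from collections import Counter

def rouge_n_recall(candidate, references, n=2):
    """N-gram recall of a candidate summary against human reference summaries.
    `candidate` is a token list; `references` is a list of token lists."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate)
    matched = total = 0
    for ref in references:
        ref_counts = ngrams(ref)
        total += sum(ref_counts.values())
        matched += sum(min(c, cand[g]) for g, c in ref_counts.items())
    return matched / total if total else 0.0

print(rouge_n_recall("the cat sat on the mat".split(),
                     ["the cat was on the mat".split()], n=1))
```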
FRESA – Framework for Evaluating Summaries Automatically [Torres-Moreno et al., 2010]
32
Summary Evaluation
It is an automatic method based on information theory, which evaluates summaries without using human references
Standard preprocessing of the documents (filtering and lemmatization) is applied before calculating the probability distributions of the summary and of the original document
https://github.com/fabrer/automatic-summarization/tree/master/EVALUATION
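A reference-free sketch in the same spirit: compare the word distribution of the summary with that of the source via Jensen-Shannon divergence. FRESA itself works over several n-gram distributions and divergence variants; the add-one smoothing here is an assumption:

```python
import math
from collections import Counter

def js_divergence(summary_tokens, source_tokens):
    """Jensen-Shannon divergence between the (smoothed) word distributions of
    a summary and its source; lower means the summary is closer to the source."""
    vocab = set(summary_tokens) | set(source_tokens)
    def dist(tokens):
        counts = Counter(tokens)
        total = sum(counts.values()) + len(vocab)      # add-one smoothing
        return {w: (counts[w] + 1) / total for w in vocab}
    p, q = dist(summary_tokens), dist(source_tokens)
    m = {w: 0.5 * (p[w] + q[w]) for w in vocab}
    def kl(a, b):
        return sum(a[w] * math.log2(a[w] / b[w]) for w in vocab)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```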
Text Summarization Research
at UFRPE/UFPE
Rafael Lins (Coordinator)
Eduardo Silva (Project Manager)
Bruno Ávila (Technical Leader)
George Cavalcanti (Research Fellow)
Fred Freitas (Research Fellow)
Gabriel Silva (Researcher)
Hilário Oliveira (Researcher)
Luciano Cabral (Researcher)
Rafael Mello (Researcher)
Rinaldo Lima (Researcher)
Functional Summarization – UFPE Team, funded by HP (2012-2016)
Ensemble Approach to ATS
35
Research Goal: proposing and evaluating a concept-based approach using ILP and regression for the tasks of mono/multi-document summarization of news articles
The architecture of the proposed solution is composed of two stages:
1. The generation of many candidate summaries, adopting a concept-based approach using ILP
2. The estimation and selection of the most informative summary, applying a regression algorithm
Research Questions:
1. Is it possible to increase the informativeness of the generated summaries by combining ILP and regression?
2. Can this approach also take into account both the redundancy and the cohesion of the generated summaries?
Automatic Text Summarization based on Concepts via Integer Linear Programming (PhD Thesis 2018)
36
ILP & Regression-based Approach (ensemble)
Ensemble Approach to ATS
37
Bigrams are used to represent the notion of concepts
• They showed better results in the preliminary experiments
• Bigrams formed only by stopwords are removed (Boudin et al., 2015)
Entity Graph
• We adopted the Entity Graph model (Guinaudeau & Strube, 2013) to estimate an approximate local cohesion score for the sentences in texts (see the sketch below)
The weight of a concept is estimated based on its position and sentence coverage
• It exploits the weights of its neighbors in the Concept Graph
• Edges denote adjacency relations between two concepts
• New weight distribution strategy (contribution)
Concept Graph of 4 sentences and its entities
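A minimal sketch of the Entity Graph intuition referenced above: sentences that share an entity are connected, and the average degree of that projection serves as a local coherence score (the unweighted variant; the toy entity sets are made up):

```python
def entity_graph_coherence(sentence_entities):
    """Average out-degree of the sentence projection graph: an edge exists
    between two sentences whenever they mention a common entity."""
    n = len(sentence_entities)
    edges = 0
    for i in range(n):
        for j in range(i + 1, n):
            if set(sentence_entities[i]) & set(sentence_entities[j]):
                edges += 1
    return edges / n if n else 0.0

# Example: 4 sentences and the entities they mention (toy data).
print(entity_graph_coherence([
    {"officials", "virus"}, {"virus", "patient"},
    {"children", "virus"}, {"treatment"},
]))
```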
Ensemble Approach to ATS
38
Cohesion constraints are included in the ILP model to minimize two typical problems in Extractive Summarization:
• dangling (unresolved) coreferences
• broken explicit discourse dependencies between a given pair of sentences
ILP-based Sentence Selection
Optimization Problem
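The exact model is not reproduced on the slide; the sketch below shows the classic concept-based ILP it builds on (Gillick & Favre style), where binary variables select sentences and concepts under a length budget. The solver (PuLP/CBC), variable names, and the 100-word budget are illustrative; the proposed approach additionally adds the cohesion constraints mentioned above.

```python
import pulp

def ilp_select(concept_weights, sent_concepts, sent_lengths, max_words=100):
    """Select sentences maximizing the total weight of covered concepts
    (bigrams) subject to a summary length budget."""
    prob = pulp.LpProblem("concept_summary", pulp.LpMaximize)
    c = {k: pulp.LpVariable(f"c_{i}", cat="Binary")
         for i, k in enumerate(concept_weights)}
    s = {j: pulp.LpVariable(f"s_{j}", cat="Binary")
         for j in range(len(sent_concepts))}
    # Objective: total weight of the concepts present in the summary.
    prob += pulp.lpSum(concept_weights[k] * c[k] for k in concept_weights)
    # Length budget over the selected sentences.
    prob += pulp.lpSum(sent_lengths[j] * s[j] for j in s) <= max_words
    for k in concept_weights:
        holders = [j for j in s if k in sent_concepts[j]]
        # A concept counts as covered only if a sentence containing it is chosen...
        prob += c[k] <= pulp.lpSum(s[j] for j in holders)
        # ...and choosing such a sentence forces the concept to be counted.
        for j in holders:
            prob += s[j] <= c[k]
    prob.solve()
    return [j for j in s if s[j].value() == 1]
```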
Ensemble Approach to ATS
39
Comparative Results on 2 datasets: Multidocument Scenario
Ensemble Approach to ATS
40
Comparative Results on 3 datasets: Multidocument Scenario
Ensemble Approach to ATS
Motivation
The extractive approach is the most studied approach in the literature.
However, most current extractive summarization systems usually produce summaries with 2 major issues:
– loose sentences lacking any relationship among them,
– dangling coreferences that break the natural discourse flow.
Adding coherence to extractive summarization is one of the main open problems in ATS.
Towards Coherent Single-Document Summarization
Towards coherent single-document summarization: an integer linear programming-based approach. Rodrigo Garcia, Rinaldo Lima, Bernard Espinasse, Hilario Oliveira. Proceedings of the 33rd Annual ACM SAC, 2018
Towards Coherent Single-Document Summarization
General Architecture of the proposed solution:
Preprocessing Task: annotation
43
Concept Extraction Task
44
Concept Scoring Task
45
Local coherence scoring task: Entity Graph
46
47
Local coherence scoring task: Entity Graph
Summary generation task: ILP optimization
48
The DUC 2001 and 2002 datasets: used for evaluating generic single-document summarizers; composed of news articles written in English; contain abstractive summaries of approximately 100 words each
The ROUGE-N evaluation measure, provided by the ROUGE package, consists of an n-gram recall between the summary generated by an ATS system and a set of gold-standard summaries produced by humans
49
DUC Datasets and ROUGE measure
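In the standard notation of Lin (2004), the ROUGE-N recall just described is:

```latex
\mathrm{ROUGE\text{-}N} \;=\;
\frac{\sum_{S \in \mathit{References}} \; \sum_{\mathit{gram}_n \in S} \mathrm{Count}_{\mathrm{match}}(\mathit{gram}_n)}
     {\sum_{S \in \mathit{References}} \; \sum_{\mathit{gram}_n \in S} \mathrm{Count}(\mathit{gram}_n)}
```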
Comparative Assessment: Informativeness
Selected systems in the comparison:
• the best summarizers participating in the DUC 2001 (System T) and DUC 2002 (System 28) conferences
• AutoSummarizer (2016)
• Classifier4J (Lothian, 2003)
• HP-UFPE FS (highest R-1 performance reported in Batista et al., 2015)
• TextRank (Mihalcea & Tarau, 2004)
• Parveen's summarizer (Parveen & Strube, 2015)
The proposed summarizer obtained very competitive results
Comparative Assessment: Coherence (1)
• ROUGE measures cannot take into account text properties such as readability and mainly coherence (Guinaudeau & Strube, 2013).
• We performed a preliminary human comparison of the coherence level of some summaries
Comparative Assessment: Coherence (2)
• Our finding: the longer the input documents, the more likely it is to obtain a less cohesive summary
• The summary generated by the proposed system has superior coherence among the sentences
Text Summarization Research
Proposed Datasets for Extractive TS
Rafael Dueire Lins, Hilario Oliveira, Luciano Cabral,
Jamilson Batista, Bruno Tenorio, Rafael Ferreira, Rinaldo Lima,
Gabriel Pereira e Silva, Steven J. Simske
The CNN-Corpus: A Large Textual Corpus for Single-Document Extractive Summarization
Recife
Fort Collins
55
56
The CNN-Corpus
Text:
1. Canadian health officials have confirmed three cases of Enterovirus D68 in British Columbia.
2. A fourth suspected case from a patient with severe respiratory illness is still under investigation.
3. Two of the confirmed cases are children between 5 and 9, said Dr. Danuta Skowronski, lead epidemiologist on emerging respiratory viruses at the British Columbia Centre for Disease Control.
4. The third is a teen between 15 and 19.
5. Enteroviruses are common, especially in the summer and fall months.
6. The U.S. Centers for Disease Control and Prevention estimates that 10 million to 15 million infections occur in the United States each year.
7. These viruses usually appear like the common cold; symptoms include sneezing, a runny nose and a cough.
8. Most people recover without any treatment.
9. …
Highlights:
1. Canada confirms three cases of Enterovirus D68 in British Columbia.
2. A fourth suspected case is still under investigation.
3. Enterovirus D68 worsens breathing problems for children who have asthma.
4. CDC has confirmed more than 100 cases in 12 U.S. states since mid-August.
Gold Standard: 1, 2, 13, 26
www.cnn.com
57
Overview statistics of the CNN-corpus
• Strict quality policies were enforced: every summary was agreed on by at least two experts
• This avoids human subjectivity and mistakes
58
• Rafael Dueire Lins, Rafael Ferreira , Steven J. Simske
DocEng’19 Competition on Extractive Text Summarization
Recife
Fort Collins
The CNN-Corpus was used in the DocEng'19 Competition on Extractive Text Summarization
The same development methodology was used for a Large Corpus for Extractive Text Summarization in the Spanish Language
Text Summarization Research
Trends in TS
Limitations of Extractive Summarization
• Abstraction is not as easy to do, since it requires semantic understanding of the text
• No coherence/cohesion treatment in most of the current ATS systems
• Redundancy problem
• Better (semi)automatic evaluation methods need to be devised particularly for multi-document systems
60
Human performance vs. system performance (TAC'09)
Neural-based Sentence Compression
• The next logical step is to eliminate redundant or less informative content from the extracted sentences
• A possible way to achieve this is by sentence compression
Current approaches
• Sentence Compression by Deletion
• Sentence Compression as a Sequence to Sequence Problem
61
Neural-based Sentence Compression
Neural-based Sentence Compression
62
The LSTM-based sentence compression model
• It uses a parallel corpus of 2 million sentence-compression instances (pairs) from a news corpus
• One-hot encoding or word embeddings of the input dictionary
Filippova et al. (2015). Sentence compression by deletion with LSTMs. In: Proc. of the 2015 Conference on EMNLP. ACL, Lisbon, Portugal.
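A minimal sketch of this kind of deletion model: a recurrent encoder reads the sentence and a per-token classifier decides keep vs. delete. The bidirectional LSTM, the dimensions, and the vocabulary size are illustrative assumptions, not the exact configuration of Filippova et al. (2015):

```python
import torch
import torch.nn as nn

class DeletionCompressor(nn.Module):
    """Per-token keep/delete tagger for sentence compression by deletion."""
    def __init__(self, vocab_size=50_000, emb=256, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True, bidirectional=True)
        self.classify = nn.Linear(2 * hidden, 2)     # logits for keep / delete

    def forward(self, token_ids):                    # (batch, seq_len)
        states, _ = self.lstm(self.embed(token_ids))
        return self.classify(states)                 # (batch, seq_len, 2)

# Training would minimize cross-entropy against gold keep/delete labels
# derived from the sentence-compression pairs mentioned above.
```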
Sequence to Sequence Model
63
Neural-based Sentence Compression
• The sentence encoder is responsible for sequentially reading the word representations and generating an intermediate sentence representation at each state
• The decoder module generates the output sentence one word at a time
• At a given step in the decoding process, the attention module (optional) assigns weights to each of the input steps; a compact sketch follows below
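A compact PyTorch sketch of that encoder/decoder layout with dot-product attention; the sizes, the single LSTM layer, and the teacher-forced decoder input are assumptions made to keep the example short:

```python
import torch
import torch.nn as nn

class Seq2SeqCompressor(nn.Module):
    def __init__(self, vocab_size=10_000, emb=128, hidden=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.encoder = nn.LSTM(emb, hidden, batch_first=True)
        self.decoder = nn.LSTM(emb, hidden, batch_first=True)
        self.project = nn.Linear(2 * hidden, vocab_size)

    def forward(self, src_ids, tgt_ids):
        # Encoder: read the source word representations state by state.
        enc_states, (h, c) = self.encoder(self.embed(src_ids))      # (B, S, H)
        # Decoder: generate hidden states for the output, one step at a time.
        dec_states, _ = self.decoder(self.embed(tgt_ids), (h, c))   # (B, T, H)
        # Attention: each decoder step weights every encoder step.
        scores = torch.bmm(dec_states, enc_states.transpose(1, 2))  # (B, T, S)
        context = torch.bmm(torch.softmax(scores, dim=-1), enc_states)
        return self.project(torch.cat([dec_states, context], dim=-1))  # (B, T, V)
```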
Text Summarization Research
Research Collaborations
Future Research Projects Proposal
• Doctoral and Master thesis co-supervision
• Funded bilateral project options:
• COFECUB
• CNPq/CAPES
• FACEPE
References
Most cited papers of our Team
67
• Assessing sentence scoring techniques for extractive text summarization. R Ferreira, L de Souza Cabral, RD Lins, GP e Silva, F Freitas, et al. Expert Systems with Applications 40 (14), 5755-5764, 2013. Cited by: 212
• A multi-document summarization system based on statistics and linguistic treatment. R Ferreira, L de Souza Cabral, F Freitas, RD Lins, G de França Silva, et al. Expert Systems with Applications 41 (13), 5780-5787, 2014. Cited by: 85
• A context based text summarization system. R Ferreira, F Freitas, L de Souza Cabral, RD Lins, R Lima, G França, et al. 2014 11th IAPR International Workshop on Document Analysis Systems, 66-70, 2014. Cited by: 50
• A four dimension graph model for automatic text summarization. R Ferreira, F Freitas, L de Souza Cabral, RD Lins, R Lima, G França, et al. 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI …), 2013. Cited by: 35
• Assessing sentence similarity through lexical, syntactic and semantic analysis. R Ferreira, RD Lins, SJ Simske, F Freitas, M Riss. Computer Speech & Language 39, 1-28, 2016. Cited by: 34
• Assessing shallow sentence scoring techniques and combinations for single and multi-document summarization. H Oliveira, R Ferreira, R Lima, RD Lins, F Freitas, M Riss, SJ Simske. Expert Systems with Applications 65, 68-86, 2016. Cited by: 29
• A new sentence similarity assessment measure based on a three-layer sentence representation. R Ferreira, RD Lins, F Freitas, SJ Simske, M Riss. Proceedings of the 2014 ACM Symposium on Document Engineering, 25-34, 2014. Cited by: 23
• A quantitative and qualitative assessment of automatic text summarization systems. J Batista, R Ferreira, H Tomaz, R Ferreira, R Dueire Lins, S Simske, et al. Proceedings of the 2015 ACM Symposium on Document Engineering, 65-68, 2015. Cited by: 14
• Automatic text document summarization based on machine learning. G Silva, R Ferreira, RD Lins, L Cabral, H Oliveira, SJ Simske, M Riss. Proceedings of the 2015 ACM Symposium on Document Engineering, 191-194, 2015. Cited by: 13
• Automatic Summarization of News Articles in Mobile Devices. L Cabral, R Lima, R Lins, M Neto, R Ferreira, S Simske, M Riss. 2015 Fourteenth Mexican International Conference on Artificial Intelligence …, 2015. Cited by: 6
• Appling Link Target Identification and Content Extraction to improve Web News Summarization. R Ferreira, R Ferreira, RD Lins, H Oliveira, M Riss, SJ Simske. Proceedings of the 2016 ACM Symposium on Document Engineering, 197-200, 2016. Cited by: 4
• DocEng'19 Competition on Extractive Text Summarization. RD Lins, RF Mello, S Simske. Proceedings of the ACM Symposium on Document Engineering 2019, 4, 2019. Cited by: 2
• The CNN-Corpus: A Large Textual Corpus for Single-Document Extractive Summarization. RD Lins, H Oliveira, L Cabral, J Batista, B Tenorio, R Ferreira, R Lima, et al. Proceedings of the ACM Symposium on Document Engineering 2019, 16, 2019. Cited by: 2
• Automatic Document Classification using Summarization Strategies. R Ferreira, RD Lins, L Cabral, F Freitas, SJ Simske, M Riss. Proceedings of the 2015 ACM Symposium on Document Engineering, 69-72, 2017. Cited by: 2
Recent Books on ATS
69
(Two books shown, published in 2014 and 2019)
Links to Survey Papers on ATS
70
https://www.cs.bgu.ac.il/~elhadad/nlp16/nenkova-mckeown.pdf
https://arxiv.org/pdf/1707.02268.pdf
https://www.cs.cmu.edu/~nasmith/LS2/das-martins.07.pdf
https://dergipark.org.tr/en/download/article-file/392456
https://link.springer.com/article/10.1007/s10462-016-9475-9
https://wanxiaojun.github.io/summ_survey_draft.pdf
http://jad.shahroodut.ac.ir/article_1189_28715967fcd8b7bfb463ab90aca5a9f7.pdf
Thank you for your attention
Any questions?