Intra-document and inter-document redundancy in multi-document summarization

Pabel Carrillo-Mendoza, Hiram Calvo, Alexander Gelbukh

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

1 Scopus citation

Abstract

Multi-document summarization differs from single-document summarization in the excessive redundancy of mentions of some events or ideas. We show how the amount of redundancy in a document collection can be used to assign importance to sentences in multi-document extractive summarization. For instance, an idea could be important because it is redundant across documents, which signals its popularity; on the other hand, an idea could be important because it is not redundant across documents, which signals its novelty. We propose an unsupervised graph-based technique that, based on proper similarity measures, allows us to experiment with intra-document and inter-document redundancy. Our experiments on DUC corpora show promising results.
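The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a PageRank-style ranking over a sentence graph, with two hypothetical parameters, `intra_weight` and `inter_weight`, that scale edges between sentences of the same document and of different documents. Bag-of-words cosine similarity stands in for the similarity measure (the keywords mention Doc2vec, which is not used here), and the function name `rank_sentences` is illustrative.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(docs, intra_weight=1.0, inter_weight=1.0,
                   damping=0.85, iters=50):
    """Score sentences with a weighted PageRank-style iteration.

    docs: list of documents, each a list of sentence strings.
    intra_weight / inter_weight: assumed knobs (not from the paper) that
    scale similarity edges within a document vs. across documents.
    Returns (doc_index, sentence, score) tuples, highest score first.
    """
    nodes = [(d, s) for d, doc in enumerate(docs) for s in doc]
    n = len(nodes)
    if n == 0:
        return []
    bows = [Counter(s.lower().split()) for _, s in nodes]

    # Symmetric weighted adjacency matrix; no self-loops.
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                same_doc = nodes[i][0] == nodes[j][0]
                scale = intra_weight if same_doc else inter_weight
                w[i][j] = scale * cosine(bows[i], bows[j])

    out = [sum(row) for row in w]  # total outgoing weight per sentence
    scores = [1.0 / n] * n
    for _ in range(iters):
        # Each sentence gets a baseline share plus weight-proportional
        # contributions from its neighbours' previous scores.
        scores = [
            (1 - damping) / n
            + damping * sum(scores[j] * w[j][i] / out[j]
                            for j in range(n) if out[j] > 0)
            for i in range(n)
        ]

    ranked = sorted(zip(nodes, scores), key=lambda p: p[1], reverse=True)
    return [(d, s, score) for (d, s), score in ranked]


docs = [
    ["the storm hit the coast", "officials opened shelters"],
    ["the storm hit the coast hard", "schools stayed closed"],
]
# Keep only cross-document edges, so cross-document redundancy drives the score.
for doc_index, sentence, score in rank_sentences(docs, intra_weight=0.0, inter_weight=1.0):
    print(doc_index, round(score, 3), sentence)
```

In this sketch, raising `inter_weight` relative to `intra_weight` favours ideas repeated across documents (the popularity reading). Favouring novelty would need a different scoring rule, for example penalizing sentences with high cross-document similarity, which is left out here.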

Original language: English
Title of host publication: Advances in Soft Computing - 15th Mexican International Conference on Artificial Intelligence, MICAI 2016, Proceedings
Editors: Oscar Herrera-Alcantara, Grigori Sidorov
Publisher: Springer Verlag
Pages: 105-115
Number of pages: 11
ISBN (Print): 9783319624334
DOIs
State: Published - 2017
Event: 15th Mexican International Conference on Artificial Intelligence, MICAI 2016 - Cancun, Mexico
Duration: 23 Oct 2016 to 28 Oct 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10061 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 15th Mexican International Conference on Artificial Intelligence, MICAI 2016
Country/Territory: Mexico
City: Cancun
Period: 23/10/16 to 28/10/16

Keywords

  • Cross-documents redundancy
  • Doc2vec
  • Graph-based methods
  • Inter-document redundancy
  • Intra-document redundancy
  • Multi-document summarization
  • Per-document redundancy
  • Unsupervised summarization
