EM clustering algorithm for automatic text summarization

Yulia Ledeneva, René García Hernández, Romyna Montiel Soto, Rafael Cruz Reyes, Alexander Gelbukh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

13 Scopus citations

Abstract

Automatic text summarization has emerged as a technique for accessing only useful information. To assess the quality of the automatic summaries produced by a system, DUC 2002 (Document Understanding Conference) developed a standard set of human summaries, called the gold collection, for 567 single news documents. At this conference, only five systems outperformed the baseline heuristic in the single-document extractive summarization task. So far, some approaches have obtained good results by combining different strategies with language-dependent knowledge. In this paper, we present a competitive method based on an EM clustering algorithm for improving the quality of automatic summaries using practically language-independent knowledge. A comparison of this method with three text models is also presented.
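
To illustrate the general idea of clustering-based extractive summarization named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes binary 1-gram sentence vectors, a mixture of spherical Gaussians with a fixed shared variance, and one representative sentence per cluster; the function names (`vectorize`, `em_cluster`, `summarize`) and all parameter values are hypothetical choices made for this example.

```python
import math
import re


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def vectorize(sentences):
    """Build L2-normalized binary bag-of-words (1-gram) vectors."""
    vocab = sorted({w for s in sentences for w in tokenize(s)})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for s in sentences:
        v = [0.0] * len(vocab)
        for w in tokenize(s):
            v[index[w]] = 1.0
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def em_cluster(vectors, k, iterations=25, variance=0.1):
    """EM for a mixture of k spherical Gaussians (assumed fixed variance)."""
    n, dim = len(vectors), len(vectors[0])
    # Deterministic start: evenly spaced sentences serve as initial means.
    means = [list(vectors[(j * n) // k]) for j in range(k)]
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iterations):
        # E-step: posterior probability of each cluster for each sentence.
        resp = []
        for v in vectors:
            logs = [math.log(weights[j]) - sq_dist(v, means[j]) / (2 * variance)
                    for j in range(k)]
            top = max(logs)
            exps = [math.exp(value - top) for value in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: re-estimate mixing weights and means from the posteriors.
        for j in range(k):
            mass = sum(r[j] for r in resp)
            if mass < 1e-12:
                continue  # leave an empty cluster unchanged
            weights[j] = max(mass / n, 1e-12)
            means[j] = [sum(resp[i][j] * vectors[i][d] for i in range(n)) / mass
                        for d in range(dim)]
    return means, resp


def summarize(sentences, k=2):
    """Pick, per cluster, the member sentence closest to the cluster mean."""
    vectors = vectorize(sentences)
    means, resp = em_cluster(vectors, k)
    chosen = set()
    for j in range(k):
        members = [i for i in range(len(sentences))
                   if max(range(k), key=lambda c: resp[i][c]) == j]
        if members:
            chosen.add(min(members, key=lambda i: sq_dist(vectors[i], means[j])))
    # Keep the selected sentences in their original document order.
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(sentences, k)` on a list of sentence strings returns at most `k` sentences in document order. Other text models, such as longer n-grams or maximal frequent sequences, would replace the feature construction in `vectorize`.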

Original language: English
Title of host publication: Advances in Artificial Intelligence - 10th Mexican International Conference on Artificial Intelligence, MICAI 2011, Proceedings
Pages: 305-315
Number of pages: 11
Edition: PART 1
State: Published - 2011
Event: 10th Mexican International Conference on Artificial Intelligence, MICAI 2011 - Puebla, Mexico
Duration: 26 Nov 2011 - 4 Dec 2011

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 7094 LNAI
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 10th Mexican International Conference on Artificial Intelligence, MICAI 2011
Country/Territory: Mexico
City: Puebla
Period: 26/11/11 - 4/12/11

Keywords

  • Automatic text summarization
  • EM clustering algorithm
  • extractive summarization
  • maximal frequent sequences
  • n-grams
  • text models
