TY - GEN
T1 - Text summarization by sentence extraction using unsupervised learning
AU - García-Hernández, René Arnulfo
AU - Montiel, Romyna
AU - Ledeneva, Yulia
AU - Rendón, Eréndira
AU - Gelbukh, Alexander
AU - Cruz, Rafael
PY - 2008
Y1 - 2008
N2 - The main problem in generating an extractive automatic text summary is detecting the most relevant information in the source document. Although some approaches claim to be domain- and language-independent, they rely on highly domain- or language-dependent knowledge such as key-phrases or golden samples for machine-learning approaches. In this work, we propose a language- and domain-independent automatic text summarization approach by sentence extraction using an unsupervised learning algorithm. Our hypothesis is that an unsupervised algorithm can help cluster similar ideas (sentences). Then, to compose the summary, the most representative sentence is selected from each cluster. Several experiments on the standard DUC-2002 collection show that the proposed method obtains more favorable results than other approaches.
UR - http://www.scopus.com/inward/record.url?scp=57049118061&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-88636-5_12
DO - 10.1007/978-3-540-88636-5_12
M3 - Conference contribution
SN - 3540886354
SN - 9783540886358
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 133
EP - 143
BT - MICAI 2008
T2 - 7th Mexican International Conference on Artificial Intelligence, MICAI 2008
Y2 - 27 October 2008 through 31 October 2008
ER -