Terms derived from frequent sequences for extractive text summarization

Yulia Ledeneva; Alexander Gelbukh; René Arnulfo García-Hernández

doi:10.1007/978-3-540-78135-6_51

Centro de Investigación en Computación (CIC)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

36 Citations (Scopus)

Abstract

Automatic text summarization helps the user to quickly understand large volumes of information. We present a language- and domain-independent statistical-based method for single-document extractive summarization, i.e., to produce a text summary by extracting some sentences from the given text. We show experimentally that words that are parts of bigrams that repeat more than once in the text are good terms to describe the text's contents, and so are also so-called maximal frequent sentences. We also show that the frequency of the term as term weight gives good results (while we only count the occurrences of a term in repeating bigrams).
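
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the sentence splitter, the tokenizer, the rule that a sentence's score is the sum of the weights of its distinct terms, and all function names (split_sentences, tokenize, term_weights, summarize) are assumptions made for illustration. Only the two ideas from the abstract are taken as given: terms are words that occur in bigrams repeating more than once, and a term's weight is the number of its occurrences inside such repeating bigrams.

```python
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased alphanumeric tokens (assumption: no stemming or stop-word removal).
    return re.findall(r"[a-z0-9]+", sentence.lower())


def term_weights(sentences):
    """Weight each word by how often it occurs inside a repeating bigram."""
    tokenized = [tokenize(s) for s in sentences]
    bigram_counts = Counter()
    for toks in tokenized:
        bigram_counts.update(zip(toks, toks[1:]))
    # A bigram "repeats" if it occurs more than once in the whole text.
    repeating = {bg for bg, n in bigram_counts.items() if n > 1}

    weights = Counter()
    for toks in tokenized:
        for i, tok in enumerate(toks):
            in_left = i > 0 and (toks[i - 1], tok) in repeating
            in_right = i + 1 < len(toks) and (tok, toks[i + 1]) in repeating
            if in_left or in_right:
                weights[tok] += 1  # count this occurrence once
    return weights


def summarize(text, n_sentences=1):
    """Return the n highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    weights = term_weights(sentences)
    # Sentence score (assumption): sum of the weights of its distinct terms.
    scores = [sum(weights[t] for t in set(tokenize(s))) for s in sentences]
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    return [sentences[i] for i in sorted(ranked[:n_sentences])]
```

For example, in a text where only the bigram "text summarization" occurs twice, the words "text" and "summarization" each receive weight 2, every other word receives weight 0, and the sentences containing that bigram are the ones extracted.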

Original language	English
Title of host publication	Computational Linguistics and Intelligent Text Processing - 9th International Conference, CICLing 2008, Proceedings
Pages	593-604
Number of pages	12
DOI	https://doi.org/10.1007/978-3-540-78135-6_51
Status	Published - 2008
Event	9th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2008 - Haifa, Israel. Duration: 17 Feb 2008 → 23 Feb 2008

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	4919 LNCS
ISSN (print)	0302-9743
ISSN (electronic)	1611-3349

Conference

Conference	9th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2008
Country/Territory	Israel
City	Haifa
Period	17/02/08 → 23/02/08

Access to document

10.1007/978-3-540-78135-6_51

Cite this

Ledeneva, Y., Gelbukh, A., & García-Hernández, R. A. (2008). Terms derived from frequent sequences for extractive text summarization. In Computational Linguistics and Intelligent Text Processing - 9th International Conference, CICLing 2008, Proceedings (pp. 593-604). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 4919 LNCS). https://doi.org/10.1007/978-3-540-78135-6_51

@inproceedings{3b9a60f174a34664890e11a531634d7a,
title = "Terms derived from frequent sequences for extractive text summarization",
abstract = "Automatic text summarization helps the user to quickly understand large volumes of information. We present a language- and domain-independent statistical-based method for single-document extractive summarization, i.e., to produce a text summary by extracting some sentences from the given text. We show experimentally that words that are parts of bigrams that repeat more than once in the text are good terms to describe the text's contents, and so are also so-called maximal frequent sentences. We also show that the frequency of the term as term weight gives good results (while we only count the occurrences of a term in repeating bigrams).",
author = "Yulia Ledeneva and Alexander Gelbukh and Garc{\'i}a-Hern{\'a}ndez, {Ren{\'e} Arnulfo}",
note = "Funding Information: Work done under partial support of Mexican Government (CONACyT, SNI, SIP-IPN, COTEPABE-IPN, COFAA-IPN). The authors thank Rada Mihalcea for useful discussion.; 9th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2008 ; Conference date: 17-02-2008 Through 23-02-2008",
year = "2008",
doi = "10.1007/978-3-540-78135-6_51",
language = "English",
isbn = "354078134X",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
pages = "593--604",
booktitle = "Computational Linguistics and Intelligent Text Processing - 9th International Conference, CICLing 2008, Proceedings",
}

TY - GEN
T1 - Terms derived from frequent sequences for extractive text summarization
AU - Ledeneva, Yulia
AU - Gelbukh, Alexander
AU - García-Hernández, René Arnulfo
N1 - Funding Information: Work done under partial support of Mexican Government (CONACyT, SNI, SIP-IPN, COTEPABE-IPN, COFAA-IPN). The authors thank Rada Mihalcea for useful discussion.
PY - 2008
Y1 - 2008
AB - Automatic text summarization helps the user to quickly understand large volumes of information. We present a language- and domain-independent statistical-based method for single-document extractive summarization, i.e., to produce a text summary by extracting some sentences from the given text. We show experimentally that words that are parts of bigrams that repeat more than once in the text are good terms to describe the text's contents, and so are also so-called maximal frequent sentences. We also show that the frequency of the term as term weight gives good results (while we only count the occurrences of a term in repeating bigrams).
UR - http://www.scopus.com/inward/record.url?scp=49949097893&partnerID=8YFLogxK
U2 - 10.1007/978-3-540-78135-6_51
DO - 10.1007/978-3-540-78135-6_51
M3 - Conference contribution
SN - 354078134X
SN - 9783540781349
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 593
EP - 604
BT - Computational Linguistics and Intelligent Text Processing - 9th International Conference, CICLing 2008, Proceedings
T2 - 9th International Conference on Computational Linguistics and Intelligent Text Processing, CICLing 2008
Y2 - 17 February 2008 through 23 February 2008
ER -
