Greedy Optimization Method for Extractive Summarization of Scientific Articles

Iskander Akhmetov, Alexander Gelbukh, Rustam Mussabayev

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

This work presents a method for summarizing scientific articles from the arXiv and PubMed datasets using a greedy extractive summarization algorithm. We used the approach together with Variable Neighborhood Search (VNS) to estimate the top line, that is, the best quality achievable by extractive text summarization in terms of ROUGE scores. The algorithm first selects for the summary the sentences that contain the largest number of words with the highest TF-IDF values, and it tunes the minimum document frequency parameter of the TF-IDF vectorization. The method achieves ROUGE-1/ROUGE-2 scores of 0.43/0.12 on arXiv and 0.40/0.13 on PubMed. These results are comparable to those of state-of-the-art models that rely on complex neural network architectures, substantial computational resources, and large amounts of training data. In contrast, our method uses a straightforward statistical inference methodology.
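
The greedy selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact scoring rule, so the code below assumes that each sentence is treated as a "document" for TF-IDF, that the top_k highest-weighted words form the target set, and that sentences are picked one at a time by how many still-uncovered target words they contain. The names tokenize, tfidf_weights, greedy_summary, top_k, and n_sentences are hypothetical; min_df mirrors the minimum document frequency parameter mentioned above. The VNS stage and ROUGE evaluation are not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_weights(sentences, min_df=1):
    """Return a TF-IDF weight per word (assumption: one sentence = one document).

    Words that occur in fewer than min_df sentences are dropped.
    """
    tokenized = [tokenize(s) for s in sentences]
    n = len(tokenized)
    df = Counter(w for toks in tokenized for w in set(toks))  # document frequency
    tf = Counter(w for toks in tokenized for w in toks)       # total term frequency
    return {
        w: tf[w] * (math.log((1 + n) / (1 + df[w])) + 1.0)    # smoothed IDF
        for w in tf
        if df[w] >= min_df
    }


def greedy_summary(sentences, n_sentences=2, top_k=10, min_df=1):
    """Greedily pick sentences covering the most uncovered high-TF-IDF words."""
    weights = tfidf_weights(sentences, min_df)
    # Rank by weight (descending), then alphabetically so ties are deterministic.
    ranked = sorted(weights, key=lambda w: (-weights[w], w))
    remaining = set(ranked[:top_k])
    token_sets = [set(tokenize(s)) for s in sentences]
    candidates = list(range(len(sentences)))
    chosen = []
    while candidates and remaining and len(chosen) < n_sentences:
        # Greedy step: the sentence with the largest number of uncovered target words.
        best = max(candidates, key=lambda i: len(token_sets[i] & remaining))
        if not token_sets[best] & remaining:
            break  # no candidate adds anything new
        chosen.append(best)
        candidates.remove(best)
        remaining -= token_sets[best]
    # Return the selected sentences in their original order.
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    article = [
        "Greedy selection picks sentences rich in informative words.",
        "The weather was pleasant.",
        "Informative words are those with high TF-IDF weights.",
    ]
    print(greedy_summary(article, n_sentences=2, top_k=8))
```

Raising min_df in this sketch discards words that appear in only a few sentences, which shrinks the target set and can change which sentences are selected; this is the kind of effect that tuning the parameter would explore.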

Original language: English
Pages (from-to): 168141-168153
Number of pages: 13
Journal: IEEE Access
Volume: 9
State: Published - 2021

Keywords

  • Extractive text summarization
  • Greedy algorithm
  • Variable neighborhood search
