A Novel ILP Framework For Summarizing Content With High Lexical Variety
Abstract
Summarizing content contributed by individuals can be challenging, because people make different lexical choices even when describing the same events. However, there remains a significant need to summarize such content. Examples include student responses to post-class reflective questions, product reviews, and news articles published by different news agencies about the same events. The high lexical diversity of these documents hinders a system's ability to identify salient content effectively and to reduce summary redundancy. In this paper, we overcome this issue by introducing an integer linear programming-based summarization framework. It incorporates a low-rank approximation to the sentence-word co-occurrence matrix to intrinsically group semantically similar lexical items. We conduct extensive experiments on datasets of student responses, product reviews, and news documents. Our approach compares favorably to a number of extractive baselines as well as a neural abstractive summarization system. Finally, the paper sheds light on when and why the proposed framework is effective at summarizing content with high lexical variety.
Publication Date
11-1-2018
Publication Title
Natural Language Engineering
Volume
24
Issue
6
Page Range
887-920
Document Type
Article
Personal Identifier
scopus
DOI Link
https://doi.org/10.1017/S1351324918000323
Copyright Status
Unknown
Scopus ID
85053198699 (Scopus)
Source API URL
https://api.elsevier.com/content/abstract/scopus_id/85053198699
STARS Citation
Luo, Wencan; Liu, Fei; Liu, Zitao; and Litman, Diane, "A Novel ILP Framework For Summarizing Content With High Lexical Variety" (2018). Scopus Export 2015-2019. 10402.
https://stars.library.ucf.edu/scopus2015/10402