EXTRACTIVE SUMMARIZATION AS A GRAPH-BASED PROBLEM

Van-Hau Nguyen; Van-Chien Nguyen; Minh-Tien Nguyen; Le Chi Ngoc

Van-Hau Nguyen, Hung Yen University of Technology and Education
Van-Chien Nguyen, Ha Noi University of Technology and Science
Minh-Tien Nguyen, Hung Yen University of Technology and Education
Le Chi Ngoc, Hanoi University of Science and Technology

Abstract

Graph-based approaches have been used successfully in many domains (e.g., computer vision, recommender systems, big data, and information extraction). In recent years, some studies have applied the graph-based approach to extractive summarization and obtained significant results compared with other approaches. In this paper, we first encode the extractive summarization task as a graph-based problem. Then, we evaluate several approaches on two well-known datasets: SoLSCSum and USAToday-CNN. Finally, we draw some insights that should be helpful for future research.
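To make the idea of a graph encoding concrete, the following is a minimal sketch and not the encoding used in this paper, which the abstract does not specify. It assumes a generic TextRank/LexRank-style formulation: sentences are nodes, edges are weighted by bag-of-words cosine similarity, and a PageRank-style power iteration scores the nodes. The function names, the damping factor of 0.85, and the fixed iteration count are all illustrative assumptions.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    # Lowercased alphanumeric tokens; a deliberately simple assumption.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    # Cosine similarity between two term-frequency Counters.
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_graph(sentences):
    # Symmetric weighted adjacency matrix: nodes are sentences,
    # edge weights are pairwise cosine similarities (no self-loops).
    vectors = [Counter(tokenize(s)) for s in sentences]
    n = len(sentences)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            weights[i][j] = weights[j][i] = cosine(vectors[i], vectors[j])
    return weights


def pagerank(weights, damping=0.85, iterations=50):
    # Weighted PageRank by power iteration; isolated nodes keep only
    # the teleport term (1 - damping) / n.
    n = len(weights)
    if n == 0:
        return []
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=2):
    # Select the k highest-scoring sentences, returned in document order.
    scores = pagerank(build_graph(sentences))
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Under these assumptions, `summarize(sentences, k=2)` returns the two most central sentences. A sentence that shares no vocabulary with the rest of the document has no edges and therefore receives the lowest score.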
