APPLYING LONG SHORT-TERM MEMORY (LSTM) DEEP LEARNING NETWORK TO DETECT OUT-OF-DISTRIBUTION IN VIETNAMESE

Dang Viet Hung; Bui Duc Tho; Dao Minh Tuan

Dang Viet Hung, Hung Yen University of Technology and Education
Bui Duc Tho, Hung Yen University of Technology and Education
Dao Minh Tuan, Hung Yen University of Technology and Education

Keywords: Abnormal data, Out-of-Distribution detection, Deep Learning, Natural Language Processing

Abstract

When machine learning models are deployed on real-world classification problems, they tend to fail when the training and test data distributions differ. Worse still, such models can make highly confident predictions while being seriously inaccurate. For example, a medical diagnostic model may keep classifying with high confidence even on difficult inputs that it should flag for human intervention, resulting in erroneous diagnoses. In this article, we apply a Long Short-Term Memory (LSTM) deep learning network to detect Out-of-Distribution (OOD) data in Vietnamese. We then evaluate the model using two metrics: AUROC (Area Under the Receiver Operating Characteristic curve) and AUPR (Area Under the Precision-Recall curve).
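To make the two evaluation metrics concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the OOD score is one minus the maximum softmax probability of a classifier's output, which is one common baseline and is not stated in the abstract. The logits, the function names, and the labeling convention (1 = OOD, 0 = in-distribution) are all hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ood_score(logits):
    """Assumed scoring rule: low classifier confidence means more likely OOD."""
    return 1.0 - max(softmax(logits))


def auroc(scores, labels):
    """Probability that a random positive (label 1) outscores a random
    negative (label 0); tied pairs count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def aupr(scores, labels):
    """Average precision, a step-wise estimate of the area under the
    precision-recall curve. Tied scores keep their input order."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at this positive
    return total / hits


# Hypothetical logits from a 3-class classifier (e.g. an LSTM's output layer).
in_dist_logits = [[4.0, 0.5, 0.2], [0.1, 3.5, 0.3], [1.2, 1.0, 0.9]]
ood_logits = [[0.4, 0.5, 0.6], [2.0, 0.1, 0.2], [0.8, 1.0, 1.2]]

scores = [ood_score(l) for l in in_dist_logits + ood_logits]
labels = [0] * len(in_dist_logits) + [1] * len(ood_logits)

example_auroc = auroc(scores, labels)
example_aupr = aupr(scores, labels)
print(f"AUROC = {example_auroc:.4f}, AUPR = {example_aupr:.4f}")
```

Both metrics are threshold-free: they summarize how well the score ranks OOD inputs above in-distribution ones, so no confidence cutoff needs to be chosen before evaluation. AUPR is more sensitive than AUROC to the ratio of OOD to in-distribution samples, which is one reason to report the two together.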
