• Dang Viet Hung, Hung Yen University of Technology and Education
• Bui Duc Tho, Hung Yen University of Technology and Education
• Dao Minh Tuan, Hung Yen University of Technology and Education
Keywords: Abnormal data, Out-of-Distribution detection, Deep Learning, Natural Language Processing


When machine learning models are used in real-world classification problems, they tend to fail when the training and test data distributions differ. Worse still, these classifiers can fail silently, making highly confident predictions while being seriously inaccurate. For example, a medical diagnostic model may consistently classify with high confidence even when it should flag difficult cases for human intervention, resulting in erroneous diagnoses. In this article, we apply a Long Short-Term Memory (LSTM) deep learning network to detect Out-of-Distribution (OOD) data in Vietnamese. We then evaluate the model using two metrics: AUROC (Area Under the Receiver Operating Characteristic curve) and AUPR (Area Under the Precision-Recall curve).
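To make the two evaluation metrics concrete, the following is a minimal sketch of how AUROC and AUPR can be computed from per-example OOD scores. It is an illustration under stated assumptions, not the authors' implementation: the score used here (one minus the maximum softmax probability) is a common baseline that the abstract does not confirm, AUPR is approximated by average precision, and the function names `auroc` and `aupr` and the toy numbers are hypothetical.

```python
def auroc(scores, labels):
    """AUROC as the probability that a random OOD example (label 1)
    scores higher than a random in-distribution example (label 0).
    Tied scores count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def aupr(scores, labels):
    """AUPR approximated by average precision: the mean of the precision
    values at each rank where an OOD example is retrieved.
    Simplification: tied scores are ranked in input order."""
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one positive example")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / n_pos


# Hypothetical toy data: maximum softmax probability per test sentence,
# and whether the sentence is truly OOD (1) or in-distribution (0).
max_softmax = [0.98, 0.95, 0.70, 0.60, 0.55, 0.85]
is_ood = [0, 0, 0, 1, 1, 1]

# Assumed scoring rule: lower classifier confidence means "more likely OOD".
ood_scores = [1.0 - p for p in max_softmax]

print(round(auroc(ood_scores, is_ood), 3))  # 0.889
print(round(aupr(ood_scores, is_ood), 3))   # 0.917
```

Both metrics are threshold-free: they summarize how well the scores rank OOD examples above in-distribution ones, so no confidence cutoff has to be fixed before comparing models.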


