I. Introduction
Machine recognition of human emotions has long been a research topic. Recently, many studies have improved emotion recognition using artificial intelligence (AI) methods. These studies train various emotion classification models on large amounts of emotion-labeled data using deep learning techniques such as convolutional neural networks [2], recurrent neural networks [3], and attention networks [3], [4]. As the techniques have advanced, the training data have also diversified to include image [2], video [2], [3], audio [4], and text [5]. Some studies [3], [5] combine classification models through multi-modal classification based on hybrid approaches. These studies report significant results for emotion recognition and recognize human emotions quite accurately.