Đào Thị Lệ Thủy, Trinh Van Loan, Nguyen Hong Quang


This paper presents the results of GMM-based recognition of four basic emotions in Vietnamese speech: neutral, sadness, anger and happiness. The characteristic parameters of these emotions are extracted from speech signals and divided into different parameter sets for experiments. The experiments cover four settings, combining speaker-dependent or speaker-independent recognition with content-dependent or content-independent recognition. The results show that the recognition scores are rather high when the full combination of parameters is used: MFCC and its first and second derivatives, fundamental frequency (F0), energy, formants and their corresponding bandwidths, spectral characteristics and F0 variants. On average, the recognition score is 89.21% for speaker-dependent and content-dependent recognition, 82.27% for speaker-dependent and content-independent recognition, 70.35% for speaker-independent and content-dependent recognition, and 66.99% for speaker-independent and content-independent recognition. Information on F0 significantly increased the recognition score.


GMM, recognition, emotion, Vietnamese, corpus, F0
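
The GMM-based recognition named in the abstract can be illustrated with a minimal sketch: one Gaussian mixture per emotion scores the feature frames of an utterance, and the emotion whose model gives the highest total log-likelihood is chosen. This is the generic decision rule for GMM classifiers, not necessarily the authors' exact setup. The function names, the two-dimensional features and all parameter values in `TOY_MODELS` are hypothetical and chosen only for illustration; in practice the mixtures would be trained (for example with EM) on the parameter sets described above.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of a sequence of frames under one GMM.

    `gmm` is a list of (weight, mean, var) components with weights summing to 1.
    """
    total = 0.0
    for x in frames:
        terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
        peak = max(terms)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total


def classify(frames, models):
    """Return the label whose GMM gives the highest log-likelihood."""
    return max(models, key=lambda label: gmm_log_likelihood(frames, models[label]))


# Hypothetical toy models over 2-D features (NOT values from the paper).
TOY_MODELS = {
    "neutral": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [0.5, 0.5], [1.0, 1.0])],
    "sadness": [(1.0, [-3.0, -3.0], [1.0, 1.0])],
    "anger": [(1.0, [3.0, 3.0], [1.0, 1.0])],
    "happiness": [(1.0, [3.0, -3.0], [1.0, 1.0])],
}

if __name__ == "__main__":
    utterance = [[2.9, 3.1], [3.2, 2.8]]  # made-up feature frames
    print(classify(utterance, TOY_MODELS))
```

Summing per-frame log-likelihoods treats frames as independent given the emotion, which is the usual simplifying assumption for this kind of classifier.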


Journal of Computer Science and Cybernetics ISSN: 1813-9663

Published by Vietnam Academy of Science and Technology