Applying Bottle Neck Feature for Vietnamese speech recognition

Nguyễn Văn Huy, Lương Chi Mai, Vũ Tất Thắng


This paper presents the basic idea of the Bottle Neck Feature (BNF) and the procedure for extracting it. We apply BNF to Vietnamese speech recognition using a five-layer MLP network, varying the size of the first hidden layer. The input features for BNF extraction are Perceptual Linear Prediction (PLP) and Mel Frequency Cepstral Coefficients (MFCC). Experiments are carried out on a VOV (Voice of Vietnam) data set. The results show that using BNF for Vietnamese speech recognition improves the word error rate (WER) by up to 6-7% compared with the baseline system, and that MFCC input features give better results than PLP.
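
As a rough illustration of what BNF extraction looks like, the sketch below runs a forward pass through a five-layer MLP (input, first hidden layer, bottleneck, third hidden layer, output) and reads out the activations of the narrow middle layer as the feature vector. It is a minimal sketch under stated assumptions, not the configuration used in the paper: the layer sizes, the sigmoid activation, the linear bottleneck, the random weights, and the names `init_mlp` and `extract_bnf` are all hypothetical. In practice the network would first be trained on a classification task before its bottleneck outputs are used.

```python
import numpy as np


def sigmoid(x):
    """Logistic activation, assumed here for the non-bottleneck hidden layers."""
    return 1.0 / (1.0 + np.exp(-x))


def init_mlp(layer_sizes, seed=0):
    """Create random (weight, bias) pairs for consecutive layers.

    Random weights stand in for a trained network; they are placeholders only.
    """
    rng = np.random.default_rng(seed)
    return [
        (rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    ]


def extract_bnf(frames, params, bottleneck_index=1):
    """Forward frames through the MLP and return the bottleneck activations.

    frames: array of shape (num_frames, input_dim), e.g. stacked MFCC or PLP frames.
    bottleneck_index: index of the weight matrix whose output is the bottleneck.
    Layers after the bottleneck are not needed at extraction time.
    """
    h = frames
    for i, (weights, bias) in enumerate(params):
        a = h @ weights + bias
        if i == bottleneck_index:
            return a  # linear bottleneck output, used as the BNF vector
        h = sigmoid(a)
    raise ValueError("bottleneck_index is out of range")


# Hypothetical sizes: 351 inputs (e.g. 9 stacked frames of 39 coefficients),
# a 1000-unit first hidden layer, a 39-unit bottleneck, a 1000-unit third
# hidden layer, and 120 output classes.
layer_sizes = [351, 1000, 39, 1000, 120]
params = init_mlp(layer_sizes)

# Four synthetic input frames in place of real PLP/MFCC features.
frames = np.random.default_rng(1).normal(size=(4, 351))
bnf = extract_bnf(frames, params)
print(bnf.shape)
```

Changing the second entry of `layer_sizes` corresponds to varying the size of the first hidden layer, while the bottleneck width fixes the dimensionality of the resulting feature vectors passed on to the recognizer.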


BNF, bottle neck feature, Vietnamese speech recognition, HMM-GMM


Journal of Computer Science and Cybernetics ISSN: 1813-9663

Published by Vietnam Academy of Science and Technology