
Introduction

There are two types of natural language models for speech recognition. One is the class of deterministic models, such as a network grammar or a context-free grammar[1]. The other is the class of statistical models, such as a bigram[2] or a trigram[3]. Among the many existing language models for speech recognition, a network grammar is often used because of its simplicity. However, this grammar has a high perplexity, which degrades speech recognition performance. Therefore, a stochastic network grammar (SNG), which adds probabilities to the network grammar, has been studied to resolve this problem.
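
As a rough illustration of why adding probabilities can lower perplexity, the sketch below computes the perplexity of a word sequence from the probability a model assigned to each word. The numbers are hypothetical and are not taken from our experiments. The unweighted network grammar is treated, as an assumption, as assigning equal probability to each of the words it permits next.

```python
import math

def perplexity(word_probs):
    """Perplexity of a sequence, given the probability the model gave each word."""
    log_sum = sum(math.log(p) for p in word_probs)
    return math.exp(-log_sum / len(word_probs))

# Hypothetical network grammar: 4 words are permitted at every step,
# and (by assumption) each is treated as equally likely.
uniform_probs = [0.25] * 6

# Hypothetical stochastic network grammar over the same network:
# the words that actually occurred were given more probability mass.
weighted_probs = [0.7, 0.5, 0.7, 0.4, 0.7, 0.5]

print("network grammar   :", perplexity(uniform_probs))
print("stochastic grammar:", perplexity(weighted_probs))
```

With equal probabilities the perplexity equals the number of permitted words at each step; it falls whenever the model concentrates probability on the words that actually occur.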

Meanwhile, Hidden Markov Models (HMMs) are popular in acoustic modeling for speech recognition[4]. One of the advantages of HMMs is that they can be trained automatically from training speech data through the Baum-Welch maximum likelihood estimation procedure. Among the various types of HMMs, the model in which all states are connected to one another is called an Ergodic HMM.

This Ergodic HMM has a structure similar to that of an SNG[5]. Therefore, an SNG can be obtained automatically by training an Ergodic HMM with the Baum-Welch procedure. The resulting state transition probabilities of the Ergodic HMM are interpreted as transition probabilities in the SNG, and the output probabilities of the HMM are interpreted as word output probabilities in the SNG. Usually, word output probabilities in an SNG depend on the part of speech (POS). The output probabilities of each HMM state likewise define a grouping of words, and this grouping is essentially very close to the concept of POS.
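
The correspondence described above can be sketched with a small, self-contained Baum-Welch trainer for a discrete Ergodic HMM over word IDs. This is a minimal illustration under stated assumptions, not the implementation used in our experiments: the toy corpus, the number of states, the random initialization, and the small smoothing floor in `normalize` are all hypothetical choices made for the example.

```python
import math
import random

def normalize(counts, floor=1e-6):
    """Turn counts into a distribution; the floor is an assumed smoothing term."""
    vals = [c + floor for c in counts]
    total = sum(vals)
    return [v / total for v in vals]

def baum_welch(seqs, n_states, n_words, n_iter=30, seed=0):
    """Train a fully connected discrete HMM on sequences of word IDs."""
    rng = random.Random(seed)
    S = range(n_states)
    pi = normalize([rng.random() for _ in S])
    A = [normalize([rng.random() for _ in S]) for _ in S]               # state transitions
    B = [normalize([rng.random() for _ in range(n_words)]) for _ in S]  # word outputs
    history = []
    for _ in range(n_iter):
        pi_acc = [0.0] * n_states
        A_acc = [[0.0] * n_states for _ in S]
        B_acc = [[0.0] * n_words for _ in S]
        loglik = 0.0
        for obs in seqs:
            T = len(obs)
            # Forward pass, rescaled at each step to avoid underflow.
            alpha, scale = [], []
            for t in range(T):
                if t == 0:
                    row = [pi[j] * B[j][obs[0]] for j in S]
                else:
                    row = [sum(alpha[t - 1][i] * A[i][j] for i in S) * B[j][obs[t]]
                           for j in S]
                c = sum(row)
                scale.append(c)
                alpha.append([x / c for x in row])
            loglik += sum(math.log(c) for c in scale)
            # Backward pass, using the same scale factors.
            beta = [[1.0] * n_states for _ in range(T)]
            for t in range(T - 2, -1, -1):
                beta[t] = [sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in S)
                           / scale[t + 1] for i in S]
            # Accumulate expected counts (state occupancies and transitions).
            for t in range(T):
                for i in S:
                    gamma = alpha[t][i] * beta[t][i]
                    if t == 0:
                        pi_acc[i] += gamma
                    B_acc[i][obs[t]] += gamma
                    if t + 1 < T:
                        for j in S:
                            A_acc[i][j] += (alpha[t][i] * A[i][j] * B[j][obs[t + 1]]
                                            * beta[t + 1][j] / scale[t + 1])
        # Re-estimate the parameters from the expected counts.
        pi = normalize(pi_acc)
        A = [normalize(row) for row in A_acc]
        B = [normalize(row) for row in B_acc]
        history.append(loglik)  # log-likelihood under the pre-update parameters
    return pi, A, B, history

# Hypothetical toy corpus; real training text would be far larger.
vocab = ["the", "a", "dog", "cat", "runs", "sleeps"]
sentences = ["the dog runs", "a cat sleeps", "the cat runs", "a dog sleeps"]
seqs = [[vocab.index(w) for w in s.split()] for s in sentences]

pi, A, B, history = baum_welch(seqs, n_states=3, n_words=len(vocab))

# Read the trained HMM as an SNG: A gives the network transition probabilities,
# and each row of B gives the word group emitted by one state.
for state, row in enumerate(B):
    ranked = sorted(zip(row, vocab), reverse=True)
    print(state, [(word, round(p, 2)) for p, word in ranked[:2]])
```

Each row of `A` plays the role of the transition probabilities out of one SNG node, and the highest-probability words in each row of `B` form the word group associated with that node. Because Baum-Welch converges only to a local optimum, the groups obtained depend on the initialization.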

We investigated the automatic acquisition of an SNG using an Ergodic HMM. The results showed that the word groupings in the HMM states bear a striking similarity to POSs. This means that an Ergodic HMM has the ability to automatically acquire both an SNG and the concept of POS simultaneously from training text data. Preliminary sentence speech recognition experiments using an Ergodic HMM were also carried out, and they showed that the Ergodic HMM outperformed the word bigram grammar on text-open data.


Jin'ichi Murakami, January 19, 2001