
$^1$ATR Interpreting Telecommunications Research Labs., 2-2 Hikaridai Seika-cho Soraku-gun Kyoto 619-02 Japan
$^2$NTT Human Interface Research Labs., 3-9-11 Midori-cho Musashino-shi Tokyo 180 Japan

The Possibility for Acquisition of Statistical Network Grammar using Ergodic HMM

Jin'ichi Murakami$^1$ Hiroki Yamatomo$^1$ Shigeki Sagayama$^2$

Abstract:

This paper describes the use of a discrete Ergodic Hidden Markov Model (HMM) for automatic acquisition of a language model from a large amount of text data and discusses the possibility of using the HMM to acquire the concept of POS (part-of-speech).

A discrete-output Ergodic HMM has a structure similar to a stochastic network grammar (SNG), so an SNG can be acquired automatically from a large amount of text data by training an Ergodic HMM with the Baum-Welch procedure. In this model, the HMM state transition probabilities can be interpreted as the SNG transition probabilities, and the HMM state output probabilities reflect the distributions of POSs (parts of speech).
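The training procedure described above can be sketched as follows. This is a minimal, generic implementation of scaled Baum-Welch re-estimation for a fully connected (ergodic) discrete-output HMM over word indices; the function name, random initialization, and iteration count are illustrative assumptions, not details taken from the paper's experiments.

```python
import numpy as np

def baum_welch(obs, n_states, n_symbols, n_iter=20, seed=0):
    """Baum-Welch training of a fully connected (ergodic) discrete HMM.

    obs: 1-D array of integer symbol (word) indices.
    Returns (pi, A, B): initial, transition, and output probabilities.
    """
    obs = np.asarray(obs)
    rng = np.random.default_rng(seed)
    # Random initialization; each row is normalized to sum to 1.
    A = rng.random((n_states, n_states)); A /= A.sum(1, keepdims=True)
    B = rng.random((n_states, n_symbols)); B /= B.sum(1, keepdims=True)
    pi = np.full(n_states, 1.0 / n_states)
    T = len(obs)
    for _ in range(n_iter):
        # Forward pass with per-step scaling to avoid numerical underflow.
        alpha = np.zeros((T, n_states)); scale = np.zeros(T)
        alpha[0] = pi * B[:, obs[0]]
        scale[0] = alpha[0].sum(); alpha[0] /= scale[0]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
            scale[t] = alpha[t].sum(); alpha[t] /= scale[t]
        # Backward pass reusing the same scaling factors.
        beta = np.zeros((T, n_states)); beta[-1] = 1.0
        for t in range(T - 2, -1, -1):
            beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]
        # gamma[t, i]: posterior probability of being in state i at time t.
        gamma = alpha * beta
        gamma /= gamma.sum(1, keepdims=True)
        # xi[i, j]: expected number of i -> j transitions over the sequence.
        xi = np.zeros((n_states, n_states))
        for t in range(T - 1):
            xi += (alpha[t][:, None] * A
                   * (B[:, obs[t + 1]] * beta[t + 1])[None, :]) / scale[t + 1]
        # M-step: re-estimate pi, A (transition grammar), B (POS-like output).
        pi = gamma[0]
        A = xi / gamma[:-1].sum(0)[:, None]
        B = np.zeros_like(B)
        for s in range(n_symbols):
            B[:, s] = gamma[obs == s].sum(0)
        B /= gamma.sum(0)[:, None]
    return pi, A, B
```

After convergence, the rows of `A` play the role of the SNG transition probabilities, while each row of `B` gives the word distribution of one state.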

The results of experiments on text data show a high similarity between the words given high output probability by each state and a POS class. This means that an Ergodic HMM can automatically acquire both an SNG and the concept of POS simultaneously from training text data.
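The POS-like behavior of the trained states can be checked by listing the highest-probability words in each state's output distribution. The matrix and vocabulary below are invented toy values for illustration, not results from the paper:

```python
import numpy as np

# Hypothetical trained output matrix B (states x vocabulary) and word list;
# the values are illustrative only, not taken from the paper's experiments.
vocab = ["the", "a", "dog", "cat", "runs", "sleeps"]
B = np.array([
    [0.45, 0.40, 0.05, 0.05, 0.03, 0.02],  # state 0: determiner-like
    [0.05, 0.05, 0.42, 0.38, 0.05, 0.05],  # state 1: noun-like
    [0.02, 0.03, 0.05, 0.05, 0.45, 0.40],  # state 2: verb-like
])

# For each state, print the two words with the highest output probability.
for s, row in enumerate(B):
    top = [vocab[i] for i in np.argsort(row)[::-1][:2]]
    print(f"state {s}: {top}")
```

If training has grouped words by syntactic role, each state's top words should cluster into one part of speech, as in the toy rows above.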

Experiments were also performed on sentence speech recognition. In these experiments, an Ergodic HMM outperformed the word bigram grammar on text-open (unseen) data.



Jin'ichi Murakami, January 19, 2001