Abstract
This paper describes new techniques for language modeling in speech recognition based on a discrete-density Ergodic Hidden Markov Model (HMM).
A discrete-output Ergodic HMM has a structure similar to that of a stochastic network language model (SNLM), so an SNLM can in principle be acquired automatically from a large amount of text data through the Baum-Welch algorithm. However, when the number of states in such an Ergodic HMM is large, a large amount of memory is required and the computational cost is high. Past studies have therefore limited the number of states, and consequently the perplexity of the resulting Ergodic HMM has been high, failing to match word bigram models.
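The abstract does not spell out the standard algorithm whose costs the paper attacks, so the following is a minimal sketch of conventional (unoptimized) Baum-Welch re-estimation for a discrete-output ergodic HMM with word IDs as output symbols. The function name, the scaled forward-backward formulation, and the single-training-sequence setup are illustrative assumptions, not the paper's implementation; note that the transition table alone is S x S for S states, and each iteration costs O(T S^2) for a training text of T words, which is why large state counts become impractical without the reductions the paper proposes.

```python
import numpy as np

def baum_welch(obs, n_states, n_symbols, n_iter=20, seed=0):
    """Plain Baum-Welch for a discrete ergodic HMM (all transitions allowed).

    obs : 1-D int array of output symbols (here, word IDs).
    Returns (pi, A, B): initial-state, transition, and emission probabilities.
    """
    rng = np.random.default_rng(seed)
    # Random ergodic initialization: every transition and emission is possible.
    A = rng.random((n_states, n_states)); A /= A.sum(axis=1, keepdims=True)
    B = rng.random((n_states, n_symbols)); B /= B.sum(axis=1, keepdims=True)
    pi = np.full(n_states, 1.0 / n_states)
    T = len(obs)

    for _ in range(n_iter):
        # Forward pass, scaled at each step to avoid underflow on long texts.
        alpha = np.zeros((T, n_states)); scale = np.zeros(T)
        alpha[0] = pi * B[:, obs[0]]
        scale[0] = alpha[0].sum(); alpha[0] /= scale[0]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
            scale[t] = alpha[t].sum(); alpha[t] /= scale[t]
        # Backward pass, reusing the forward scaling factors.
        beta = np.zeros((T, n_states)); beta[-1] = 1.0
        for t in range(T - 2, -1, -1):
            beta[t] = (A @ (B[:, obs[t + 1]] * beta[t + 1])) / scale[t + 1]
        # Posterior state occupancies (gamma) and transition counts (xi).
        gamma = alpha * beta
        gamma /= gamma.sum(axis=1, keepdims=True)
        xi = np.zeros((n_states, n_states))
        for t in range(T - 1):
            x = alpha[t][:, None] * A * (B[:, obs[t + 1]] * beta[t + 1])[None, :]
            xi += x / x.sum()
        # Re-estimation (the M step).
        pi = gamma[0]
        A = xi / gamma[:-1].sum(axis=0)[:, None]
        B = np.zeros_like(B)
        for t in range(T):
            B[:, obs[t]] += gamma[t]
        B /= gamma.sum(axis=0)[:, None]
    return pi, A, B

# Toy usage with a vocabulary of 3 word IDs; a real SNLM would involve
# thousands of words and far more states, hence the memory pressure.
pi, A, B = baum_welch(np.array([0, 1, 2, 1, 0, 2, 2, 1]), n_states=3, n_symbols=3)
```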
This paper proposes new techniques that reduce the memory requirements and computational costs of the Baum-Welch algorithm. The techniques were evaluated on their ability to automatically produce an SNLM for an international conference registration task. On both perplexity and continuous speech recognition results, the Ergodic HMM was found to outperform word bigram and trigram models, which indicates that the proposed techniques are effective.
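For reference, the perplexity used to compare the models is the standard test-set measure. The formulation below is the textbook definition, with the Ergodic HMM's sequence probability obtained by marginalizing over state paths; the notation (pi, a, b) is a conventional HMM parameterization assumed here, not quoted from the paper.

```latex
% Test-set perplexity of a language model P over a word sequence w_1 ... w_N:
PP = P(w_1,\ldots,w_N)^{-1/N}
   = \exp\!\Bigl( -\tfrac{1}{N} \ln P(w_1,\ldots,w_N) \Bigr)

% For the Ergodic HMM with initial probabilities \pi_i, transitions a_{ij},
% and word emissions b_j(w), the sequence probability sums over all state
% paths s_1 ... s_N (computed efficiently by the forward algorithm):
P(w_1,\ldots,w_N) = \sum_{s_1,\ldots,s_N} \pi_{s_1}\, b_{s_1}(w_1)
                    \prod_{t=2}^{N} a_{s_{t-1} s_t}\, b_{s_t}(w_t)
```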
Key words: stochastic network model, Baum-Welch algorithm, continuous speech recognition, language model, perplexity, Ergodic HMM
Summary
In a discrete HMM in which transitions between all states are permitted (an Ergodic HMM), if words are taken as the output symbols, its structure is formally similar to a stochastic network grammar. It may therefore be possible to acquire a stochastic network grammar automatically from a large amount of word-sequence data using the Baum-Welch training algorithm. However, as the number of states grows, the memory and computational requirements increase to the point where training becomes practically infeasible. Previous studies have therefore used small numbers of states, and their recognition performance and perplexity compare poorly with word bigram models. This paper proposes a Baum-Welch algorithm with reduced memory and computational requirements for training Ergodic HMMs with many states. It also reports experimental results of continuous speech recognition using the resulting Ergodic HMM as a language model.
Key words: stochastic network grammar, Baum-Welch algorithm, continuous speech recognition, language model, perplexity, Ergodic HMM