Automatic phonetic segmentation of Hindi speech using hidden Markov model

AI and Society 27 (4):543-549 (2012)
In this paper, we study the performance of baseline hidden Markov model (HMM) for segmentation of speech signals. It is applied on single-speaker segmentation task, using Hindi speech database. The automatic phoneme segmentation framework evolved imitates the human phoneme segmentation process. A set of 44 Hindi phonemes were chosen for the segmentation experiment, wherein we used continuous density hidden Markov model (CDHMM) with a mixture of Gaussian distribution. The left-to-right topology with no skip states has been selected as it is effective in speech recognition due to its consistency with the natural way of articulating the spoken words. This system accepts speech utterances along with their orthographic “transcriptions” and generates segmentation information of the speech. This corpus was used to develop context-independent hidden Markov models (HMMs) for each of the Hindi phonemes. The system was trained using numerous sentences that are relevant to provide information to the passengers of the Metro Rail. The system was validated against a few manually segmented speech utterances. The evaluation of the experiments shows that the best performance is obtained by using a combination of two Gaussians mixtures and five HMM states. A category-wise phoneme error analysis has been performed, and the performance of the phonetic segmentation has been reported. The modeling of HMMs has been implemented using Microsoft Visual Studio 2005 (C++), and the system is designed to work on Windows operating system. The goal of this study is automatic segmentation of speech at phonetic level.
Keywords Automatic phonetic segmentation  Hidden Markov models  Text to speech  Corpus-based speech synthesis Gaussian mixture models  Unit selection
Categories (categorize this paper)
DOI 10.1007/s00146-012-0386-2
 Save to my reading list
Follow the author(s)
My bibliography
Export citation
Find it on Scholar
Edit this record
Mark as duplicate
Revision history Request removal from index
Download options
PhilPapers Archive

Upload a copy of this paper     Check publisher's policy on self-archival     Papers currently archived: 16,661
External links
Setup an account with your affiliations in order to access resources via your University's proxy server
Configure custom proxy (use this if your affiliation does not provide a proxy)
Through your library
References found in this work BETA

No references found.

Add more references

Citations of this work BETA

No citations found.

Add more citations

Similar books and articles
Li Deng (1998). Locus Equation and Hidden Parameters of Speech. Behavioral and Brain Sciences 21 (2):263-264.
James P. Lund (1998). Is Speech Just Chewing the Fat? Behavioral and Brain Sciences 21 (4):522-522.

Monthly downloads

Added to index


Total downloads

11 ( #219,154 of 1,726,249 )

Recent downloads (6 months)

2 ( #289,836 of 1,726,249 )

How can I increase my downloads?

My notes
Sign in to use this feature

Start a new thread
There  are no threads in this forum
Nothing in this forum yet.