AI and Society 27 (4):543-549 (2012)
In this paper, we study the performance of a baseline hidden Markov model (HMM) for the segmentation of speech signals. It is applied to a single-speaker segmentation task using a Hindi speech database. The automatic phoneme segmentation framework developed here imitates the human phoneme segmentation process. A set of 44 Hindi phonemes was chosen for the segmentation experiment, in which we used a continuous density hidden Markov model (CDHMM) with a mixture of Gaussian distributions. The left-to-right topology with no skip states was selected because it is effective in speech recognition, being consistent with the natural way spoken words are articulated. The system accepts speech utterances along with their orthographic “transcriptions” and generates segmentation information for the speech. The corpus was used to develop context-independent hidden Markov models (HMMs) for each of the Hindi phonemes. The system was trained on numerous sentences of the kind used to provide information to passengers of the Metro Rail, and it was validated against a few manually segmented speech utterances. The evaluation of the experiments shows that the best performance is obtained with a combination of two Gaussian mixtures and five HMM states. A category-wise phoneme error analysis has been performed, and the performance of the phonetic segmentation is reported. The HMM modeling has been implemented in C++ using Microsoft Visual Studio 2005, and the system is designed to run on the Windows operating system. The goal of this study is the automatic segmentation of speech at the phonetic level.
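The alignment idea behind a left-to-right topology with no skip states can be illustrated with a minimal sketch. This is not the authors' C++ implementation. It makes several simplifying assumptions: each phoneme is reduced to a single state, each state emits from a single one-dimensional Gaussian rather than a mixture, and the function names (`log_gauss`, `force_align`, `boundaries`) and the equal stay/advance probabilities are hypothetical choices for illustration only.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian (a stand-in for a full mixture emission)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def force_align(frames, means, variances,
                log_stay=math.log(0.5), log_next=math.log(0.5)):
    """Viterbi alignment over a left-to-right chain with no skip transitions.

    State j can only be entered from state j (self-loop) or state j - 1.
    The path is forced to start in state 0 and end in the last state.
    Returns one state index per frame.
    """
    n_frames, n_states = len(frames), len(means)
    if n_frames < n_states:
        raise ValueError("need at least one frame per state")

    neg_inf = float("-inf")
    score = [[neg_inf] * n_states for _ in range(n_frames)]
    back = [[0] * n_states for _ in range(n_frames)]
    score[0][0] = log_gauss(frames[0], means[0], variances[0])

    for t in range(1, n_frames):
        for j in range(n_states):
            stay = score[t - 1][j] + log_stay
            move = score[t - 1][j - 1] + log_next if j > 0 else neg_inf
            if move > stay:
                best, back[t][j] = move, j - 1
            else:
                best, back[t][j] = stay, j
            if best > neg_inf:  # leave unreachable states at -inf
                score[t][j] = best + log_gauss(frames[t], means[j], variances[j])

    # Trace back from the final state at the last frame.
    path = [n_states - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


def boundaries(path):
    """Frame indices at which the aligned state changes."""
    return [t for t in range(1, len(path)) if path[t] != path[t - 1]]
```

Under these assumptions, calling `force_align` on a toy sequence of scalar "frames" with one Gaussian per transcribed unit yields a monotonic state path, and `boundaries` reports the frames where one unit ends and the next begins. A realistic system would replace the scalar frames with acoustic feature vectors and the single Gaussian with a trained mixture per state.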
Keywords: Automatic phonetic segmentation; Hidden Markov models; Text to speech; Corpus-based speech synthesis; Gaussian mixture models; Unit selection
Similar books and articles
The Feasibility of Segmentation of Protolanguage. Istvan Zachar - 2011 - Interaction Studies 12 (1):1-35.
The Effect of Sonority on Word Segmentation: Evidence for the Use of a Phonological Universal. Marc Ettlinger, Amy S. Finn & Carla L. Hudson Kam - 2011 - Cognitive Science 36 (4):655-673.
Learning Diphone-Based Segmentation. Robert Daland & Janet B. Pierrehumbert - 2011 - Cognitive Science 35 (1):119-155.
Locus Equation and Hidden Parameters of Speech. Li Deng - 1998 - Behavioral and Brain Sciences 21 (2):263-264.
How Many Mechanisms Are Needed to Analyze Speech? A Connectionist Simulation of Structural Rule Learning in Artificial Language Acquisition. Aarre Laakso & Paco Calvo - 2011 - Cognitive Science 35 (7):1243-1281.
Creating the Customer: The Influence of Advertising on Consumer Market Segments – Evidence and Ethics. [REVIEW] Agnes Nairn & Pierre Berthon - 2003 - Journal of Business Ethics 42 (1):83-99.
Hidden Markov Model Interpretations of Neural Networks. Ingmar Visser - 2000 - Behavioral and Brain Sciences 23 (4):494-495.
Identification of Rhetorical Roles for Segmentation and Summarization of a Legal Judgment. M. Saravanan & B. Ravindran - 2010 - Artificial Intelligence and Law 18 (1):45-76.
Merging Information Versus Speech Recognition. Irene Appelbaum - 2000 - Behavioral and Brain Sciences 23 (3):325-326.
A Single-Stage Approach to Learning Phonological Categories: Insights From Inuktitut. Brian Dillon, Ewan Dunbar & William Idsardi - 2013 - Cognitive Science 37 (2):344-377.
Added to index 2012-02-17