BY-NC-ND 3.0 license Open Access Published by De Gruyter November 8, 2011

Word Extraction and Character Segmentation from Text Lines of Unconstrained Handwritten Bangla Document Images

  • Ram Sarkar, Samir Malakar, Nibaran Das, Subhadip Basu, Mahantapas Kundu and Mita Nasipuri

Abstract

This paper reports a novel approach to word extraction and character segmentation from handwritten Bangla document images. First, a modified Run Length Smoothing Algorithm (RLSA), called the Spiral Run Length Smearing Algorithm (SRLSA), is applied to extract words from the text lines of unconstrained handwritten Bangla document images. This technique helps to overcome some of the drawbacks of the standard horizontal and vertical RLSA techniques. SRLSA has been applied to the Bangla handwritten document image database CMATERdb1.1.1, where the success rate of word extraction is found to be 86.01%. The second part of the work presents a solution to the problem of how best to segment word images of handwritten Bangla script into their constituent characters. The technique can also segment words with discontinuities in the Matra, a prominent feature of Bangla script. It optimizes the trade-off between under- and over-segmentation because the Matra region and the segmentation points are estimated more precisely. As a result, better word segmentation accuracy is achieved with minimal data loss. A success rate of 92.48% is observed on a dataset of 750 handwritten Bangla words, which is 3.35% higher than that of our earlier techniques.
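The abstract does not describe how SRLSA itself works, so the following is only a minimal sketch of the standard horizontal RLSA that SRLSA is said to modify: background runs no longer than a threshold, lying between two foreground pixels, are filled so that nearby strokes merge into word-level blobs. The function names, the 0/1 pixel encoding, and the threshold value are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Tuple


def horizontal_rlsa(row: List[int], threshold: int) -> List[int]:
    """Smear one binary row (1 = ink, 0 = background).

    Background runs of length <= threshold that are bounded on both
    sides by ink are set to 1. Leading and trailing background is kept.
    """
    out = list(row)
    last_ink = None  # column of the most recent ink pixel seen so far
    for col, px in enumerate(row):
        if px == 1:
            if last_ink is not None:
                gap = col - last_ink - 1
                if 0 < gap <= threshold:
                    for j in range(last_ink + 1, col):
                        out[j] = 1
            last_ink = col
    return out


def foreground_spans(row: List[int]) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) column spans of ink runs in a row."""
    spans = []
    start = None
    for col, px in enumerate(row):
        if px == 1 and start is None:
            start = col
        elif px == 0 and start is not None:
            spans.append((start, col - 1))
            start = None
    if start is not None:
        spans.append((start, len(row) - 1))
    return spans


# Hypothetical one-row "text line": two groups of strokes separated by a wide gap.
line = [1, 0, 1, 0, 0, 0, 0, 1, 0, 1]
smeared = horizontal_rlsa(line, threshold=2)
words = foreground_spans(smeared)
```

With a threshold of 2, the one-pixel gaps inside each group are filled while the four-pixel gap between the groups survives, so the row yields two candidate word spans. Choosing the threshold is the hard part on unconstrained handwriting, where gaps within a word and gaps between words can overlap in width.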

Received: 2011-06-25
Published Online: 2011-11-08
Published in Print: 2011-November

© de Gruyter 2011

This article is distributed under the terms of the Creative Commons Attribution Non-Commercial License, which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Downloaded on 18.4.2024 from https://www.degruyter.com/document/doi/10.1515/jisys.2011.013/html