Premaratne, Hemakumar Lalith
Publications (2 of 2)
Premaratne, H. L., Järpe, E. & Bigun, J. (2006). Lexicon and hidden Markov model-based optimisation of the recognised Sinhala script. Pattern Recognition Letters, 27(6), 696-705
2006 (English). In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 27, no. 6, p. 696-705. Article in journal (Refereed). Published
Abstract [en]

The Brahmi-descended Sinhala script is used by 75% of Sri Lanka's population of 18 million. To the best of our knowledge, none of the Brahmi-descended scripts, which are used by hundreds of millions of people in South Asia, has a commercial OCR product. In the course of implementing an OCR system for the printed Sinhala script that is easily adaptable to similar scripts [Premaratne, L., Assabie, Y., Bigun, J., 2004. Recognition of modification-based scripts using direction tensors. In: 4th Indian Conf. on Computer Vision, Graphics and Image Processing (ICVGIP2004), pp. 587–592], a segmentation-free recognition method using orientation features was proposed in [Premaratne, H.L., Bigun, J., 2004. A segmentation-free approach to recognise printed Sinhala script using linear symmetry. Pattern Recognition 37, 2081–2089]. Owing to limitations in image analysis techniques, the character-level accuracy of the results produced directly by that recognition algorithm saturates at 94%. False rejections from the recognition algorithm are initially identified only as ‘missing character positions’ or ‘blank characters’. Suitable substitutes must be found for such ‘missing character positions’ so that word accuracy reaches an acceptable level. This paper proposes a novel method that uses the lexicon together with hidden Markov models to improve the accuracy of the recognised script. With minor changes, the proposed method could be extended to other modification-based scripts that contain easily confused characters. The proposed optimisation algorithm improves the word-level accuracy from 81.5% to 88.5%.
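
The idea of substituting ‘missing character positions’ with the help of a lexicon can be illustrated with a minimal sketch. This is not the authors' algorithm: the paper combines the lexicon with hidden Markov models, whereas the sketch below swaps in a plain character-bigram model with add-one smoothing to rank lexicon words that fit the recognised pattern. The names `BLANK`, `train_bigrams`, `log_score`, `matches` and `fill_blanks` are hypothetical, and Latin letters stand in for Sinhala characters.

```python
import math
from collections import defaultdict

BLANK = "_"  # hypothetical marker for a 'missing character position'


def train_bigrams(lexicon):
    """Count character-to-character transitions, with ^ and $ as word boundaries."""
    counts = defaultdict(lambda: defaultdict(int))
    for word in lexicon:
        padded = "^" + word + "$"
        for a, b in zip(padded, padded[1:]):
            counts[a][b] += 1
    return counts


def log_score(word, counts, alphabet_size):
    """Log-probability of a word under the bigram model (add-one smoothing)."""
    padded = "^" + word + "$"
    total = 0.0
    for a, b in zip(padded, padded[1:]):
        row = counts[a]
        total += math.log((row[b] + 1) / (sum(row.values()) + alphabet_size))
    return total


def matches(pattern, word):
    """True if word has the same length and agrees with every non-blank position."""
    return len(pattern) == len(word) and all(
        p == BLANK or p == w for p, w in zip(pattern, word)
    )


def fill_blanks(pattern, lexicon, counts):
    """Return the best-scoring lexicon word that fits the pattern, or None."""
    # +2 accounts for the boundary symbols ^ and $
    alphabet_size = len({c for w in lexicon for c in w}) + 2
    candidates = [w for w in lexicon if matches(pattern, w)]
    if not candidates:
        return None
    return max(candidates, key=lambda w: log_score(w, counts, alphabet_size))


if __name__ == "__main__":
    lexicon = ["cat", "car", "can", "cot"]
    counts = train_bigrams(lexicon)
    print(fill_blanks("c_t", lexicon, counts))
```

A fuller treatment along the lines of the paper's keywords would presumably replace the bigram score with an HMM whose state transition matrix and confusion matrix are estimated from data; that detail is not given in this record.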

Place, publisher, year, edition, pages
Amsterdam: Elsevier, 2006
Keywords
Optical character recognition, Hidden Markov models, State transition matrix, Confusion matrix, Word optimisation
National Category
Engineering and Technology
urn:nbn:se:hh:diva-1316 (URN)
10.1016/j.patrec.2005.10.009 (DOI)
000236286700023 ()
2-s2.0-32844473524 (Scopus ID)
2082/1695 (Local ID)
2082/1695 (Archive number)
2082/1695 (OAI)
Available from: 2008-04-15. Created: 2008-04-15. Last updated: 2018-03-23. Bibliographically approved
Premaratne, H. L. (2005). Recognition of printed Sinhala characters by direction fields. (Doctoral dissertation). Göteborg: Chalmers tekniska högskola
2005 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Although substantial research has been carried out over the last 30 years on Optical Character Recognition (OCR) for various languages, in which a printed or handwritten document is read as an image and converted into editable text, the majority of Brahmi-descended South Asian scripts still lack a commercial OCR system.

Place, publisher, year, edition, pages
Göteborg: Chalmers tekniska högskola, 2005. p. 62
Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie, ISSN 0346-718X ; 2285
Keywords
Printed Sinhala characters, Electronics, Photonics
National Category
Engineering and Technology
urn:nbn:se:hh:diva-712 (URN)
2082/1061 (Local ID)
91-7291-603-6 (ISBN)
2082/1061 (Archive number)
2082/1061 (OAI)
Public defence
2005-05-20, Wigforssalen, Högskolan i Halmstad, Kristian IV:s väg 3, Halmstad, 13:15 (English)
Available from: 2007-06-04. Created: 2007-06-04. Last updated: 2018-03-23. Bibliographically approved
