hh.se Publications
  • 1.
    Assabie Lake, Yaregal
    Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), Halmstad Embedded and Intelligent Systems Research (EIS).
    Multifont recognition System for Ethiopic Script. 2006. Licentiate thesis, comprehensive summary (Other academic)
    Abstract [en]

    In this thesis, we present a general framework for a multi-font, multi-size and multi-style Ethiopic character recognition system. We propose structural and syntactic techniques for recognition of Ethiopic characters where the graphically complex characters are represented by less complex primitive structures and their spatial interrelationships. For each Ethiopic character, the primitive structures and their spatial interrelationships form a unique set of patterns.

    The interrelationships of primitives are represented by a special tree structure which resembles a binary search tree in the sense that it groups child nodes as left and right, and keeps the spatial position of primitives in an orderly manner. For better computational efficiency, the primitive tree is converted into a string pattern using in-order traversal, which generates a base of the alphabet that stores possibly occurring string patterns for each character. The recognition of characters is then achieved by matching the generated patterns with each pattern in a stored knowledge base of characters.

    Structural features are extracted using direction field tensor, which is also used for character segmentation. In general, the recognition system does not need size normalization, thinning or other preprocessing procedures. The only parameter that needs to be adjusted during the recognition process is the size of Gaussian window which should be chosen optimally in relation to font sizes. We also constructed an Ethiopic Document Image Database (EDIDB) from real life documents and the recognition system is tested with respect to variations in font type, size, style, document skewness and document type. Experimental results are reported.

  • 2.
    Assabie, Yaregal
    et al.
    Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), Halmstad Embedded and Intelligent Systems Research (EIS).
    Bigun, Josef
    Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), Halmstad Embedded and Intelligent Systems Research (EIS).
    A comprehensive Dataset for Ethiopic Handwriting Recognition. 2009. In: Proceedings SSBA '09: Symposium on Image Analysis, Halmstad University, Halmstad, March 18-20, 2009 / [ed] Josef Bigun & Antanas Verikas, Halmstad: Halmstad University, 2009, p. 41-43. Chapter in book (Other academic)
    Abstract [en]

    Ethiopic script is used by several languages in Ethiopia for writing. We present a comprehensive dataset of handwritten Ethiopic script called DEHR (Dataset for Ethiopic Handwriting Recognition) captured both offline and online. The offline dataset includes isolated characters, Ethiopian church documents and ordinary handwritten texts dealing with various real-life issues. The ordinary texts and isolated characters were freely written by several participants. The church documents are written in Geez and Amharic languages whereas the language for ordinary texts is Amharic only. The online dataset was collected by using two Digimemo devices of different sizes. For isolated characters and the online dataset, all the 265 character samples used by the Amharic language are included. The dataset is intended to set a benchmark for training and/or testing handwriting recognition, character and word segmentation, and text line detection. The dataset can be accessed by contacting the authors or via http://www.hh.se/staff/josef/.

  • 3.
    Assabie, Yaregal
    et al.
    Addis Ababa University, Department of Computer Science, Addis Ababa, Ethiopia.
    Bigun, Josef
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), Intelligent systems (IS-lab).
    A Hybrid System for Robust Recognition of Ethiopic Script. 2007. In: Ninth International Conference on Document Analysis and Recognition: proceedings : Curitiba, Paraná, Brazil, September 23-26, 2007 / [ed] IEEE Computer Society, Los Alamitos, Calif.: IEEE Computer Society, 2007, p. 556-560. Conference paper (Refereed)
    Abstract [en]

    In real life, documents contain several font types, styles, and sizes. However, many character recognition systems show good results for specific types of documents and fail to produce satisfactory results for others. Over the past decades, various pattern recognition techniques have been applied with the aim of developing recognition systems insensitive to variations in the characteristics of documents. In this paper, we present a robust recognition system for Ethiopic script using a hybrid of classifiers. The complex structures of Ethiopic characters are structurally and syntactically analyzed, and represented as a pattern of simpler graphical units called primitives. The pattern is used for classification of characters using similarity-based matching and a neural network classifier. The classification result is further refined by using template matching. A pair of directional filters is used for creating templates and extracting structural features. The recognition system is tested by real life documents and experimental results are reported.

  • 4.
    Assabie, Yaregal
    et al.
    Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), Halmstad Embedded and Intelligent Systems Research (EIS).
    Bigun, Josef
    Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), Halmstad Embedded and Intelligent Systems Research (EIS).
    A neural network approach for multifont and size-independent recognition of ethiopic characters. 2007. In: Progress in pattern recognition / [ed] Singh, S, Singh, M, Springer London, 2007, p. 129-137. Conference paper (Refereed)
    Abstract [en]

    Artificial neural networks are one of the most commonly used tools for character recognition problems, and usually they take gray values of 2D character images as inputs. In this paper, we propose a novel neural network classifier whose input is 1D string patterns generated from the spatial relationships of primitive structures of Ethiopic characters. The spatial relationships of primitives are modeled by a special tree structure from which a unique set of string patterns is generated for each character. Training the neural network with string patterns of different font types and styles enables the classifier to handle variations in font types, sizes, and styles. We use a pair of directional filters for extracting primitives and their spatial relationships. The robustness of the proposed recognition system is tested by real life documents and experimental results are reported.

  • 5.
    Assabie, Yaregal
    et al.
    Addis Ababa University, Department of Computer Science, Addis Ababa, Ethiopia.
    Bigun, Josef
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), Intelligent systems (IS-lab).
    Ethiopic Character Recognition Using Direction Field Tensor. 2006. In: The 18th International Conference on Pattern Recognition: proceedings : 20-24 August, 2006, Hong Kong, Los Alamitos, Calif.: IEEE Computer Society, 2006, p. 284-287. Conference paper (Refereed)
    Abstract [en]

    Many languages in Ethiopia use a unique alphabet called Ethiopic for writing. However, there is no OCR system developed to date. In an effort to develop automatic recognition of Ethiopic script, a novel system is designed by applying structural and syntactic techniques. The recognition system is developed by extracting primitive structural features and their spatial relationships. A special tree structure is used to represent the spatial relationship of primitive structures. For each character, a unique string pattern is generated from the tree and recognition is achieved by matching the string against a stored knowledge base of the alphabet. To implement the recognition system, we use direction field tensor as a tool for character segmentation, and extraction of structural features and their spatial relationships. Experimental results are reported.

  • 6.
    Assabie, Yaregal
    et al.
    Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), Halmstad Embedded and Intelligent Systems Research (EIS).
    Bigun, Josef
    Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), Halmstad Embedded and Intelligent Systems Research (EIS).
    Ethiopic Document Image Database for Testing Character Recognition Systems. 2006. Report (Other academic)
    Abstract [en]

    In this paper we describe the acquisition and content of a large database of Ethiopic documents for testing and evaluating character recognition systems. The Ethiopic Document Image Database (EDIDB) contains documents written in Amharic and Geez languages. The database was built from a variety of documents such as printouts, books, newspapers, and magazines. Documents written in various font types, sizes and styles were included in the database. Degraded and poor quality documents were also included in the database to represent the real life situation. A total of 1,204 pages were scanned at a resolution of 300 dpi and saved as grayscale images of JPEG format. We also describe an evaluation protocol for standardizing the comparison of recognition systems and their results. The database is made available to the research community through http://www.hh.se/staff/josef/.

  • 7.
    Assabie, Yaregal
    et al.
    Addis Ababa University, Department of Computer Science, Addis Ababa, Ethiopia.
    Bigun, Josef
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), Intelligent systems (IS-lab).
    HMM-Based Handwritten Amharic Word Recognition with Feature Concatenation. 2009. In: Proceedings of the International Conference on Document Analysis and Recognition, ICDAR, New York: IEEE Press, 2009, p. 961-965. Conference paper (Refereed)
    Abstract [en]

    Amharic is the official language of Ethiopia and uses Ethiopic script for writing. In this paper, we present writer-independent HMM-based Amharic word recognition for offline handwritten text. The underlying units of the recognition system are a set of primitive strokes whose combinations form handwritten Ethiopic characters. For each character, possibly occurring sequences of primitive strokes and their spatial relationships, collectively termed as primitive structural features, are stored as feature list. Hidden Markov models for Amharic words are trained with such sequences of structural features of characters constituting words. The recognition phase does not require segmentation of characters but only requires text line detection and extraction of structural features in each text line. Text lines and primitive structural features are extracted by making use of direction field tensor. The performance of the recognition system is tested by a database of unconstrained handwritten documents collected from various sources.

  • 8.
    Assabie, Yaregal
    et al.
    Addis Ababa University, Department of Computer Science, Addis Ababa, Ethiopia.
    Bigun, Josef
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), Intelligent systems (IS-lab).
    Lexicon-based Offline Recognition of Amharic Words in Unconstrained Handwritten Text. 2008. In: 19th International Conference on Pattern Recognition: (ICPR 2008) ; Tampa, Florida, USA 8-11 December 2008, New York: IEEE Computer Society, 2008, article id 4761145. Conference paper (Refereed)
    Abstract [en]

    This paper describes a lexicon-based offline handwriting recognition system for Amharic words. The system computes direction fields of scanned handwritten documents, from which pseudo-characters are segmented. The pseudo-characters are organized based on their proximity and direction to form text lines. Words are then segmented by analyzing the relative gap between subsequent pseudo-characters in text lines. For each segmented word image, the structural characteristics of pseudo-characters are syntactically analyzed to predict a set of plausible characters forming the word. The most likely word is finally selected among candidates by matching against the lexicon. The system is tested by a database of unconstrained handwritten Amharic documents collected from various sources. The lexicon is prepared from words appearing in the collected database.

  • 9.
    Assabie, Yaregal
    et al.
    Addis Ababa University, Department of Computer Science, Addis Ababa, Ethiopia.
    Bigun, Josef
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), Intelligent systems (IS-lab).
    Multifont size-resilient recognition system for Ethiopic script. 2007. In: International Journal on Document Analysis and Recognition, ISSN 1433-2833, E-ISSN 1433-2825, Vol. 10, no 2, p. 85-100. Article in journal (Refereed)
    Abstract [en]

    This paper presents a novel framework for recognition of Ethiopic characters using structural and syntactic techniques. Graphically complex characters are represented by the spatial relationships of less complex primitives which form a unique set of patterns for each character. The spatial relationship is represented by a special tree structure which is also used to generate string patterns of primitives. Recognition is then achieved by matching the generated string pattern against each pattern in the alphabet knowledge-base built for this purpose. The recognition system tolerates variations on the parameters of characters like font type, size and style. Direction field tensor is used as a tool to extract structural features.

  • 10.
    Assabie, Yaregal
    et al.
    Department of Computer Science, Addis Ababa University, Ethiopia.
    Bigun, Josef
    Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), Halmstad Embedded and Intelligent Systems Research (EIS), Intelligent Systems´ laboratory.
    Offline handwritten Amharic word recognition. 2011. In: Pattern Recognition Letters, ISSN 0167-8655, E-ISSN 1872-7344, Vol. 32, no 8, p. 1089-1099. Article in journal (Refereed)
    Abstract [en]

    This paper describes two approaches for Amharic word recognition in unconstrained handwritten text using HMMs. The first approach builds word models from concatenated features of constituent characters and in the second method HMMs of constituent characters are concatenated to form word model. In both cases, the features used for training and recognition are a set of primitive strokes and their spatial relationships. The recognition system does not require segmentation of characters but requires text line detection and extraction of structural features, which is done by making use of direction field tensor. The performance of the recognition system is tested by a dataset of unconstrained handwritten documents collected from various sources, and promising results are obtained. (C) 2011 Elsevier B.V. All rights reserved.

  • 11.
    Assabie, Yaregal
    et al.
    Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), Halmstad Embedded and Intelligent Systems Research (EIS).
    Bigun, Josef
    Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE), Halmstad Embedded and Intelligent Systems Research (EIS).
    Offline Handwritten Amharic Word Recognition Using HMMs. 2009. In: Proceedings SSBA '09: Symposium on Image Analysis, Halmstad University, Halmstad, March 18-20, 2009 / [ed] Josef Bigun & Antanas Verikas, Halmstad: Halmstad University, 2009, p. 89-92. Chapter in book (Other academic)
    Abstract [en]

    This paper describes two approaches for Amharic word recognition in unconstrained handwritten text using HMMs. The first approach builds word models from concatenated features of constituent characters, and in the second method HMMs of constituent characters are concatenated to form a word model. In both cases, the features used for training and recognition are primitive strokes and their spatial relationships. The recognition system does not require segmentation of characters but requires text line detection and extraction of structural features, which is done by making use of direction field tensor. The performance of the recognition system is tested on the DEHR dataset of unconstrained handwritten documents collected from various sources.

  • 12.
    Assabie, Yaregal
    et al.
    Addis Ababa University, Department of Computer Science, Addis Ababa, Ethiopia.
    Bigun, Josef
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), Intelligent systems (IS-lab).
    Online Handwriting Recognition of Ethiopic Script. 2008. In: Proceedings: Eleventh International Conference on Frontiers in Handwriting Recognition, Montréal, Québec - Canada, August 19-21, 2008 / [ed] Ching Y Suen, Montréal: CENPARMI, Concordia University, 2008, p. 153-158. Conference paper (Refereed)
    Abstract [en]

    Online recognition of handwritten characters is gaining renewed interest as it provides a natural way of data entry for a wide variety of handheld devices. In this paper, we present an online handwriting recognition system for Ethiopic script based on the structural and syntactical analysis of the strokes forming characters. The complex structures of characters are represented by the spatio-temporal relationships of simple-shaped strokes called primitives. A special tree structure is used to model spatio-temporal relationships of the strokes. The tree generates a unique set of primitive stroke sequences for each character, and for recognition each stroke sequence is matched against a stored knowledge base. Characters are also classified based on their structural similarity to select a plausible set of characters for an unknown input, which improves recognition and processing time. We also present a dataset collected for training and testing online recognition systems for Ethiopic script. The dataset is prepared in accordance with the international standard UNIPEN format. The recognition system is tested with the collected dataset and experimental results are reported.

  • 13.
    Assabie, Yaregal
    et al.
    Addis Ababa University, Department of Computer Science, Addis Ababa, Ethiopia.
    Bigun, Josef
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS), Intelligent systems (IS-lab).
    Structural and Syntactic Techniques for Recognition of Ethiopic Characters. 2006. In: Structural, syntactic, and statistical pattern recognition joint IAPR international workshops SSPR 2006 and SPR 2006, Hong Kong, China, August 17-19, 2006 : proceedings: Lecture Notes in Computer Sciences (Volume 4109/2006), Berlin: Springer Berlin/Heidelberg, 2006, p. 118-126. Conference paper (Refereed)
    Abstract [en]

    OCR technology of Latin scripts is well advanced in comparison to other scripts. However, the available results from Latin are not always sufficient to directly adopt them for other scripts such as the Ethiopic script. In this paper, we propose a novel approach that uses structural and syntactic techniques for recognition of Ethiopic characters. We reveal that primitive structures and their spatial relationships form a unique set of patterns for each character. The relationships of primitives are represented by a special tree structure, which is also used to generate a pattern. A knowledge base of the alphabet that stores possibly occurring patterns for each character is built. Recognition is then achieved by matching the generated pattern against each pattern in the knowledge base. Structural features are extracted using direction field tensor. Experimental results are reported, and the recognition system is insensitive to variations on font types, sizes and styles.

  • 14.
    Assabie, Yaregal
    et al.
    Addis Ababa University, Department of Computer Science, Addis Ababa, Ethiopia.
    Bigun, Josef
    Halmstad University, School of Information Technology, Halmstad Embedded and Intelligent Systems Research (EIS).
    Writer-independent Offline Recognition of Handwritten Ethiopic Characters. 2008. In: Proceedings: Eleventh International Conference on Frontiers in Handwriting Recognition, Montréal, Québec - Canada, August 19-21, 2008 / [ed] Ching Y Suen, Montréal: CENPARMI, Concordia University, 2008, p. 652-657. Conference paper (Refereed)
    Abstract [en]

    This paper presents writer-independent offline handwritten character recognition for Ethiopic script. The recognition is based on the characteristics of primitive strokes that make up characters. The spatial relationships of primitives whose combinations form complex structures of Ethiopic characters are used as a basis for recognition. Although this approach efficiently recognizes properly written characters, the recognition rate drops for characters where the spatial relationships of their primitives could not be drawn. This happens mostly when the connections between primitives are not properly written, which is a common case in handwriting. To complement the recognition, we classify characters based on the characteristics of their primitives, resulting in grouping of characters in a five-dimensional space. Once the type of characters is identified, recognition can be achieved with a minimal set of information from their spatial relationships. A comprehensive database is also developed to standardize the evaluation of research works on offline Ethiopic handwriting recognition systems. Our proposed system is tested with the database and experimental results are reported.

  • 15. Premaratne, L.
    et al.
    Assabie, Yaregal
    Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE).
    Bigun, J.
    Halmstad University, School of Information Science, Computer and Electrical Engineering (IDE).
    Recognition of Modification-based Scripts Using Direction Tensors. 2004. In: Proc. 4th Indian Conference on Computer Vision, Graphics and Image Processing, 2004, p. 587-592. Conference paper (Other academic)
    Abstract [en]

    The research on the OCR technology for the Latin-based scripts has been successful in achieving the status of image scanners with built-in OCR facility. But, a majority of modification-based scripts such as Brahmi descended South Asian or Ethiopic scripts are still progressing to achieve this status. This indicates the difficulties in adopting the recognition methods that have been proposed so far for the Latin-based scripts to modification-based scripts. In this paper we propose a novel method that can be adopted to recognise modification-based printed scripts consisting of a large character set, without the need for prior segmentation. The major strength of this method is that, the direction features that are used as the main principle for recognition, are further used in the separation of confusing characters, detection of skew angle, segmentation of script and graphic objects which substantially improves the computation efficiency. Algorithms developed initially for the Brahmi descended Sinhala script used in Sri Lanka, have been extended successfully for the Ethiopic script which has been evolved in a different geographical region, yielding consistently accurate results. Together, these two scripts are used by a population of ninety million.
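
Several abstracts in this list (for example entries 1, 5, 9 and 13) describe representing a character as a tree of primitives, turning the tree into a string pattern, and matching that string against a stored knowledge base. The Python sketch below illustrates only that general idea, using the in-order traversal mentioned in entry 1. The `PrimitiveNode` class, the primitive labels ("H", "V", "D"), the character names and the knowledge-base contents are all hypothetical and are not taken from the authors' system.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class PrimitiveNode:
    """One primitive structure with optional left and right neighbours (hypothetical layout)."""
    label: str
    left: Optional["PrimitiveNode"] = None
    right: Optional["PrimitiveNode"] = None


def in_order(node: Optional[PrimitiveNode]) -> str:
    """Flatten the primitive tree into a string: left subtree, node, right subtree."""
    if node is None:
        return ""
    return in_order(node.left) + node.label + in_order(node.right)


def recognize(tree: PrimitiveNode, knowledge_base: Dict[str, Set[str]]) -> List[str]:
    """Return every character whose stored patterns contain the generated pattern."""
    pattern = in_order(tree)
    return [char for char, patterns in knowledge_base.items() if pattern in patterns]


# Hypothetical example: a vertical primitive with one primitive on each side.
tree = PrimitiveNode("V", left=PrimitiveNode("H"), right=PrimitiveNode("D"))

# Hypothetical knowledge base: each character may be stored with several patterns.
knowledge_base = {
    "char_a": {"HVD", "HV"},
    "char_b": {"VD"},
}

candidates = recognize(tree, knowledge_base)
```

Storing a set of patterns per character mirrors the abstracts' statement that the knowledge base holds the possibly occurring patterns for each character. Exact string membership is a simplification chosen for this sketch.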
