Ethiopic Character Recognition Using Direction Field Tensor
Assabie, Yaregal
Addis Ababa University, Department of Computer Science, Addis Ababa, Ethiopia.
Bigun, Josef
Högskolan i Halmstad, Akademin för informationsteknologi, Halmstad Embedded and Intelligent Systems Research (EIS), Intelligenta system (IS-lab). ORCID iD: 0000-0002-4929-1262
2006 (English). In: The 18th International Conference on Pattern Recognition: proceedings: 20-24 August, 2006, Hong Kong, Los Alamitos, Calif.: IEEE Computer Society, 2006, p. 284-287. Conference paper, Published paper (Refereed)
Abstract [en]

Many languages in Ethiopia are written in a unique alphabet called Ethiopic. However, no OCR system has been developed for it to date. In an effort to develop automatic recognition of Ethiopic script, a novel system is designed by applying structural and syntactic techniques. The recognition system is developed by extracting primitive structural features and their spatial relationships. A special tree structure is used to represent the spatial relationships of the primitive structures. For each character, a unique string pattern is generated from the tree, and recognition is achieved by matching the string against a stored knowledge base of the alphabet. To implement the recognition system, we use the direction field tensor as a tool for character segmentation and for extraction of structural features and their spatial relationships. Experimental results are reported.

Place, publisher, year, edition, pages
Los Alamitos, Calif.: IEEE Computer Society, 2006. p. 284-287
Series
International Conference on Pattern Recognition. Proceedings, ISSN 1051-4651
Keywords [en]
character recognition, feature extraction, image segmentation, knowledge based systems, natural language interfaces, string matching, tensors
Identifiers
URN: urn:nbn:se:hh:diva-2123
DOI: 10.1109/ICPR.2006.507
ISI: 000240705600067
Scopus ID: 2-s2.0-34147145904
Local ID: 2082/2518
ISBN: 0-7695-2521-0 (print)
OAI: oai:DiVA.org:hh-2123
DiVA, id: diva2:239341
Conference
18th International Conference on Pattern Recognition, ICPR 2006, Hong Kong, 20 - 24 August, 2006
Note

©2006 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

Available from: 2008-11-11 Created: 2008-11-11 Last updated: 2018-03-23 Bibliographically approved
In thesis
1. Multifont recognition System for Ethiopic Script
2006 (English). Licentiate thesis, with articles (Other academic)
Abstract [en]

In this thesis, we present a general framework for a multi-font, multi-size and multi-style Ethiopic character recognition system. We propose structural and syntactic techniques for recognition of Ethiopic characters, where the graphically complex characters are represented by less complex primitive structures and their spatial interrelationships. For each Ethiopic character, the primitive structures and their spatial interrelationships form a unique set of patterns.

The interrelationships of primitives are represented by a special tree structure which resembles a binary search tree in the sense that it groups child nodes as left and right, and keeps the spatial position of primitives in an orderly manner. For better computational efficiency, the primitive tree is converted into a string pattern using in-order traversal, which generates a base of the alphabet that stores possibly occurring string patterns for each character. The recognition of characters is then achieved by matching the generated patterns with each pattern in a stored knowledge base of characters.
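The tree-to-string step described in this abstract can be sketched roughly as follows. This is only an illustration, not the thesis implementation: the node class, the single-letter primitive labels, the character names and the toy knowledge base are all hypothetical, and matching is reduced to exact string lookup.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass
class PrimitiveNode:
    """One primitive structure; `label` is a made-up symbol for its type."""
    label: str
    left: Optional["PrimitiveNode"] = None
    right: Optional["PrimitiveNode"] = None


def to_pattern(node: Optional[PrimitiveNode]) -> str:
    """Serialize the tree in-order: left subtree, node, then right subtree."""
    if node is None:
        return ""
    return to_pattern(node.left) + node.label + to_pattern(node.right)


def recognize(pattern: str, knowledge_base: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first character whose stored patterns contain `pattern`."""
    for character, patterns in knowledge_base.items():
        if pattern in patterns:
            return character
    return None


# Toy data: labels and character names are invented for illustration.
tree = PrimitiveNode("V", left=PrimitiveNode("h"), right=PrimitiveNode("d"))
knowledge_base = {"char_A": {"hVd", "hV"}, "char_B": {"dVh"}}

pattern = to_pattern(tree)
result = recognize(pattern, knowledge_base)
```

Storing several patterns per character mirrors the abstract's remark that the knowledge base holds the possibly occurring string patterns for each character.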

Structural features are extracted using the direction field tensor, which is also used for character segmentation. In general, the recognition system does not need size normalization, thinning or other preprocessing procedures. The only parameter that needs to be adjusted during the recognition process is the size of the Gaussian window, which should be chosen optimally in relation to font sizes. We also constructed an Ethiopic Document Image Database (EDIDB) from real-life documents, and the recognition system is tested with respect to variations in font type, size, style, document skewness and document type. Experimental results are reported.
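To give a feel for the role of the Gaussian window, here is a minimal pure-Python sketch of one common complex formulation of a direction tensor: the squared complex gradient (fx + i*fy)**2 and its magnitude are averaged under a Gaussian window of width `sigma`. The abstract does not give the formulas, so the function names, the central-difference gradient, the border handling and the default `sigma` are assumptions, not the authors' implementation.

```python
import cmath
import math


def gaussian_kernel(sigma, radius):
    """1-D Gaussian weights on [-radius, radius], normalized to sum to 1."""
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def direction_tensor(image, sigma=1.0):
    """Return (I20, I11) grids for a 2-D list of gray values.

    I20 is the Gaussian-weighted average of (fx + i*fy)**2 and I11 the
    average of |fx + i*fy|**2, with fx, fy from central differences.
    """
    rows, cols = len(image), len(image[0])
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel(sigma, radius)

    def pixel(r, c):
        # Replicate border pixels outside the image.
        return image[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    # Squared complex gradient at every pixel.
    squared = [[0j] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            fx = (pixel(r, c + 1) - pixel(r, c - 1)) / 2.0
            fy = (pixel(r + 1, c) - pixel(r - 1, c)) / 2.0
            squared[r][c] = complex(fx, fy) ** 2

    def smooth(grid):
        # Gaussian-weighted window average, clamping indices at the borders.
        out = [[0j] * cols for _ in range(rows)]
        for r in range(rows):
            for c in range(cols):
                acc = 0j
                for i, wi in enumerate(kernel):
                    rr = min(max(r + i - radius, 0), rows - 1)
                    for j, wj in enumerate(kernel):
                        cc = min(max(c + j - radius, 0), cols - 1)
                        acc += wi * wj * grid[rr][cc]
                out[r][c] = acc
        return out

    i20 = smooth(squared)
    # |z**2| equals |z|**2, so abs() of the squared gradient gives the energy.
    i11 = [[z.real for z in row]
           for row in smooth([[abs(z) for z in row] for row in squared])]
    return i20, i11


# Toy image: a vertical edge (dark left half, bright right half).
vertical_edge = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
i20, i11 = direction_tensor(vertical_edge, sigma=1.0)
# Half the argument of I20 is the dominant gradient direction in radians.
orientation = 0.5 * cmath.phase(i20[4][4])
```

In this formulation |I20| never exceeds I11, and the two are equal when all gradients in the window share one direction, as they do for the straight edge in the toy image. A larger `sigma` averages over a wider neighbourhood, which is why such a window would need tuning to the font size.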

Place, publisher, year, edition, pages
Göteborg: Department of Signals and Systems, Chalmers University of Technology, 2006. p. 46
Series
Technical report ; 2006:21
Keywords
Ethiopic character recognition, OCR, Multifont recognition, Amharic, Direction fields, Structural and syntactic pattern recognition
Identifiers
URN: urn:nbn:se:hh:diva-1978
Local ID: 2082/2373
Archive number: 2082/2373
OAI: 2082/2373
Presentation
(English)
Available from: 2008-09-29 Created: 2008-09-29 Last updated: 2018-03-23 Bibliographically approved

Person records

Assabie, Yaregal; Bigun, Josef
