Cognitive-inspired Post-processing of optical character recognition for Swedish addresses
Royal Institute Of Technology, Stockholm, Sweden.
Halmstad University, School of Information Technology. Royal Institute Of Technology, Stockholm, Sweden. ORCID iD: 0000-0002-8933-7894
Riphah International University, Islamabad, Pakistan.
2022 (English). In: Proceedings of 2022 IEEE 21st International Conference on Cognitive Informatics and Cognitive Computing: ICCI*CC 2022 / [ed] Y. Wang; K.N. Plataniotis; B. Widrow; W. Pedrycz; W. Kinsner; P. Spachos; S. Kwong, Piscataway, NJ: IEEE, 2022, p. 248-257. Conference paper, Published paper (Refereed)
Abstract [en]

Optical character recognition (OCR) has many applications, such as digitizing historical documents, automating processes, and helping visually impaired people read. However, extracting text from images into a digital format is not an easy problem to solve, and the output of OCR frameworks often contains errors. The complexity comes from the many variations in (digital) fonts, handwriting, lighting, etc. To tackle this problem, this study investigates two different methods for correcting errors in OCR output. The dataset used consists of Swedish addresses, so the methods are evaluated in the context of postal automation: reading addresses on parcels automatically with OCR to further automate postal work. The first method, the lexical implementation, uses a dataset of Swedish addresses that is assumed to contain every valid address (hence there is a known and limited vocabulary); a misspelled address is corrected to the address in the lexicon with the smallest Levenshtein distance. The second approach uses the same dataset, but with artificial errors, or artificial noise, added. The noisy addresses are paired with their correct spellings to train a machine learning model based on neural machine translation (NMT) to automatically correct errors in OCR-read addresses. The results of this study could help define the direction of future work on OCR and postal addresses. The lexical implementation outperformed the NMT model. However, more experiments including real data would be required to draw definitive conclusions as to how the methods would work in real-life applications. © 2022 IEEE.
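
The two methods described in the abstract can be illustrated with a short Python sketch. It is not the authors' implementation: the function names, the noise model (random substitution, deletion, and insertion at a fixed rate), the character alphabet, and the example addresses are all assumptions made for illustration. The first part corrects an OCR string to the lexicon entry with the smallest Levenshtein distance; the second part shows one possible way to add artificial noise to a clean address to build (noisy, clean) training pairs for an NMT-style model.

```python
import random
from typing import Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit cost for insertion, deletion, substitution."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute
        previous = current
    return previous[-1]


def correct_with_lexicon(ocr_text: str, lexicon: Sequence[str]) -> str:
    """Return the lexicon entry closest to ocr_text (first one wins on ties)."""
    return min(lexicon, key=lambda entry: levenshtein(ocr_text, entry))


# Assumed alphabet for the noise model; the paper's noise model is not given.
ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö0123456789 "


def add_noise(text: str, rng: random.Random, rate: float = 0.1) -> str:
    """Apply at most one random edit per character with probability `rate`."""
    out = []
    for ch in text:
        if rng.random() < rate:
            op = rng.choice(("substitute", "delete", "insert"))
            if op == "substitute":
                out.append(rng.choice(ALPHABET))
            elif op == "insert":
                out.append(ch)
                out.append(rng.choice(ALPHABET))
            # "delete": the character is simply skipped
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical lexicon entries, not taken from the paper's dataset.
LEXICON = [
    "Storgatan 1, 111 22 Stockholm",
    "Storgatan 11, 111 22 Stockholm",
    "Kungsgatan 5, 411 19 Göteborg",
]

if __name__ == "__main__":
    print(correct_with_lexicon("Storgatan l, 111 22 Stockho1m", LEXICON))
    rng = random.Random(0)
    clean = LEXICON[2]
    print((add_noise(clean, rng, rate=0.2), clean))  # one (noisy, clean) pair
```

A linear scan over the lexicon, as above, costs one distance computation per entry; for a full national address register an index (for example a BK-tree or a prefilter on postal code) would likely be needed, but that is outside what the abstract describes.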

Place, publisher, year, edition, pages
Piscataway, NJ: IEEE, 2022. p. 248-257
Keywords [en]
Lexical model, NMT, OCR, OCR post-correction, OCR post-processing
National Category
Language Technology (Computational Linguistics)
Identifiers
URN: urn:nbn:se:hh:diva-52168
DOI: 10.1109/ICCICC57084.2022.10101672
Scopus ID: 2-s2.0-85158896668
ISBN: 978-1-6654-9084-9 (print)
OAI: oai:DiVA.org:hh-52168
DiVA, id: diva2:1818785
Conference
21st IEEE International Conference on Cognitive Informatics and Cognitive Computing, ICCI*CC 2022, Toronto, Canada, 8-10 December 2022
Available from: 2023-12-12 Created: 2023-12-12 Last updated: 2023-12-12. Bibliographically approved

Authority records

Kanwal, Summrina