OCR for Greek polytonic (multi accent) historical printed documents: Development, optimization and quality control

Anna Maria Sichani, Panagiotis Kaddas, Georgios K. Mikros, Basilis Gatos

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Citations (Scopus)

Abstract

This paper presents the development and implementation of a robust OCR tool and a related comprehensive workflow for the recognition of printed Greek polytonic scripts. The project was initiated and developed by an interdisciplinary team with expertise in document image processing, character segmentation and recognition, machine learning, corpus creation and digital humanities. The paper describes the design and development of the workflow around this project, including data gathering and structuring, OCR tool development, user interface development, experiments on the training procedure of the tool, evaluation, post-correction and quality control of the results.

Original language: English
Title of host publication: 3rd International Conference on Digital Access to Textual Cultural Heritage, DATeCH 2019 - Conference Proceedings
Publisher: Association for Computing Machinery
Pages: 9-13
Number of pages: 5
ISBN (Electronic): 9781450371940
Publication status: Published - 8 May 2019
Externally published: Yes
Event: 3rd International Conference on Digital Access to Textual Cultural Heritage, DATeCH 2019 - Brussels, Belgium
Duration: 8 May 2019 - 10 May 2019

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 3rd International Conference on Digital Access to Textual Cultural Heritage, DATeCH 2019
Country/Territory: Belgium
City: Brussels
Period: 8/05/19 - 10/05/19

Keywords

  • Greek polytonic scripts
  • Historical printed documents
  • Image processing
  • Machine learning
  • Optical character recognition
  • Page segmentation
  • Post correction workflow
