Editors' preface to the monographic section of Iperstoria 13, titled "Digital Humanities: a cross-disciplinary approach to literature, language and education."
Sonia di Loreto, Annarita Taronna
doaj +1 more source
A REVIEW OF ARABIC TEXT RECOGNITION DATASET
Building a robust Optical Character Recognition (OCR) system for languages with cursive scripts, such as Arabic, has always been challenging. These challenges increase if the text contains diacritics of different sizes for characters and words.
Idris Saleh Al-Sheikh +2 more
doaj +1 more source
Recognition of Machine-Readable Zone in Identity Documents: A Review
Encoding personal identity document information into machine-readable formats is one of the most important approaches to automatic data processing. The Machine-Readable Zone (MRZ) of a document contains text data in the form of 2 or 3 long text lines of ...
Alexander V. Gayer +2 more
doaj +1 more source
In-depth analysis of the impact of OCR errors on named entity recognition and linking
Named entities (NEs) are among the most relevant types of information that can be used to properly index digital documents and thus easily retrieve them.
Ahmed Hamdi +4 more
semanticscholar +1 more source
Optical character recognition with neural networks
The 21st century is the age of global automation and digitization. There is high demand for optical recognition software, including character recognition. There are different approaches to solving the optical recognition problem.
Aidarbek Shalakhmetov, Sanzhar Aubakirov
doaj +1 more source
Protection System Coordination On 20 kV Distribution Network In Makassar City
This quantitative study aims to a) describe the coordination of protection systems under abnormal conditions in the distribution network and b) determine the selectivity of protection systems in isolating equipment in abnormal ...
Firdaus Firdaus +3 more
doaj +1 more source
Towards a Font Classification Model for Romanian Cyrillic Documents [PDF]
This paper presents a solution for classifying the fonts in 17th-century Romanian Cyrillic documents. The solution is based on a mix of unsupervised and supervised machine learning techniques.
Tudor Bumbu
doaj
In the KONDE project, which is funded by Hochschulraumstrukturmittel (higher-education structural funds), seven university partners and three further institutions examined theoretical and practical aspects of the digital edition from different perspectives. One outcome of the project is the Weißbuch (white paper), with over 200 articles on the topic of the digital edition ...
Fritze, Christiane, Mühlberger, Günter
openaire +1 more source
Assessing the Impact of OCR Quality on Downstream NLP Tasks
A growing volume of heritage data is being digitized and made available as text via optical character recognition (OCR). Scholars and libraries are increasingly using OCR-generated text for retrieval and analysis.
Daniel Alexander van Strien +5 more
semanticscholar +1 more source
BART for Post-Correction of OCR Newspaper Text
Optical character recognition (OCR) from newspaper page images is susceptible to noise due to degradation of old documents and variation in typesetting. In this report, we present a novel approach to OCR post-correction.
Elizabeth Soper, S. Fujimoto, Yen-Yun Yu
semanticscholar +1 more source

