Results 111 to 120 of about 29,591 (137)
Some of the following articles may not be open access.

CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy

arXiv.org
Large Multimodal Models (LMMs) have demonstrated impressive performance in recognizing document images with natural language instructions. However, it remains unclear to what extent capabilities in literacy with rich structure and fine-grained visual ...
Zhibo Yang   +13 more
semanticscholar   +1 more source

OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation

arXiv.org
Retrieval-augmented Generation (RAG) enhances Large Language Models (LLMs) by integrating external knowledge to reduce hallucinations and incorporate up-to-date information without retraining.
Junyuan Zhang   +8 more
semanticscholar   +1 more source

OCR Assisted Translator

2020 7th International Conference on Smart Structures and Systems (ICSSS), 2020
India is a multilingual country where the spoken language changes after every 50 kilometres. Therefore there is no single universal language. Globalization brought the need to have knowledge of English but only 5 percent of the population is familiar with English and the rest of the population uses different languages.
Nikhil Chigali   +3 more
openaire   +1 more source

Development of an Assamese OCR using Bangla OCR

Proceeding of the workshop on Document Analysis and Recognition, 2012
This paper refers to the development of an OCR for the Assamese language by modifying an existing OCR for the Bangla language. This modification is feasible because the Assamese script is similar, except for a few characters, to the Bangla script. The OCR incorporates a two stage recognizer using SVM classifier with no post-processing.
Subhankar Ghosh   +3 more
openaire   +1 more source

Correcting noisy OCR

Proceedings of the First International Conference on Digital Access to Textual Cultural Heritage, 2014
We describe a system for automatic post OCR text correction of digital collections of historical texts. Documents, such as old newspapers, are often degraded, so even the best OCR tools can yield garbled text. When keywords are corrupted, text is invisible to search tools. Manual correction is not feasible for large collections. Our non-interactive OCR
John Evershed, Kent Fitch
openaire   +1 more source

Using OpenMP Directives to Accelerate OCR with Tesseract OCR

COMPUTER AND INFORMATION SYSTEMS AND TECHNOLOGIES, 2021
This paper is devoted to methods for speeding up optical character recognition, which is used to transform a scanned image into an editable text format. Examples of the application of these methods are systems for the automated search of text fragments in the catalogues of electronic libraries, where as an input format both the entered
Olesia Barkovska, Ihor Ryzhov
openaire   +1 more source

Evaluating and mitigating the impact of OCR errors on information retrieval

International Journal on Digital Libraries, 2023
L. L. de Oliveira   +7 more
semanticscholar   +1 more source

Parallelizing OCR

[1991 Proceedings] Tenth Annual International Phoenix Conference on Computers and Communications, 2002
E. Montesinos, J. Kienhofer
openaire   +1 more source

Comparison of Tesseract OCR, Easy OCR, and Transformer OCR on Handwritten Image

2025 9th International Conference On Electrical, Electronics And Information Engineering (ICEEIE)
Kartika Candra Kirana   +5 more
openaire   +1 more source
