Some of the following articles may not be open access.
arXiv.org
Large Multimodal Models (LMMs) have demonstrated impressive performance in recognizing document images with natural language instructions. However, it remains unclear to what extent capabilities in literacy with rich structure and fine-grained visual ...
Zhibo Yang +13 more
semanticscholar +1 more source
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
arXiv.org
Retrieval-augmented Generation (RAG) enhances Large Language Models (LLMs) by integrating external knowledge to reduce hallucinations and incorporate up-to-date information without retraining.
Junyuan Zhang +8 more
semanticscholar +1 more source
2020 7th International Conference on Smart Structures and Systems (ICSSS), 2020
India is a multilingual country where the spoken language changes after every 50 kilometres. Therefore there is no single universal language. Globalization brought the need to have knowledge of English but only 5 percent of the population is familiar with English and the rest of the population uses different languages.
Nikhil Chigali +3 more
openaire +1 more source
Development of an Assamese OCR using Bangla OCR
Proceeding of the workshop on Document Analysis and Recognition, 2012
This paper refers to the development of an OCR for the Assamese language by modifying an existing OCR for the Bangla language. This modification is feasible because the Assamese script is similar, except for a few characters, to the Bangla script. The OCR incorporates a two stage recognizer using SVM classifier with no post-processing.
Subhankar Ghosh +3 more
openaire +1 more source
Proceedings of the First International Conference on Digital Access to Textual Cultural Heritage, 2014
We describe a system for automatic post OCR text correction of digital collections of historical texts. Documents, such as old newspapers, are often degraded, so even the best OCR tools can yield garbled text. When keywords are corrupted, text is invisible to search tools. Manual correction is not feasible for large collections. Our non-interactive OCR
John Evershed, Kent Fitch
openaire +1 more source
Using OpenMP Directives to Accelerate OCR with Tesseract OCR
COMPUTER AND INFORMATION SYSTEMS AND TECHNOLOGIES, 2021
This paper is devoted to methods for speeding up optical character recognition, which is used to transform a scanned image into an editable text format. Examples of the application of these methods are systems for the automated search of text fragments in the catalogues of electronic libraries, where as an entrance format both the entered
Barkovska, Olesia, Ryzhov, Ihor
openaire +1 more source
Evaluating and mitigating the impact of OCR errors on information retrieval
International Journal on Digital Libraries, 2023
L. L. de Oliveira +7 more
semanticscholar +1 more source
[1991 Proceedings] Tenth Annual International Phoenix Conference on Computers and Communications, 2002
E. Montesinos, J. Kienhofer
openaire +1 more source
Comparison of Tesseract OCR, Easy OCR, and Transformer OCR on Handwritten Image
2025 9th International Conference On Electrical, Electronics And Information Engineering (ICEEIE)
Kartika Candra Kirana +5 more
openaire +1 more source

