
Tesseract-OCR

Revue Cyber & Conformité, 2021
Sébastien Dupent

OCRBench: on the hidden mystery of OCR in large multimodal models

Science China Information Sciences, 2023
Large models have recently played a dominant role in natural language processing and multimodal vision-language learning. However, their effectiveness in text-related visual tasks remains relatively unexplored. In this paper, we conducted a comprehensive
Yuliang Liu   +9 more

OCR-Free Document Understanding Transformer

European Conference on Computer Vision, 2021
Understanding document images (e.g., invoices) is a core but challenging task since it requires complex functions such as reading text and a holistic understanding of the document. Current Visual Document Understanding (VDU) methods outsource the task of
Geewook Kim   +9 more

Exploring OCR Capabilities of GPT-4V(ision): A Quantitative and In-depth Evaluation

arXiv.org, 2023
This paper presents a comprehensive evaluation of the Optical Character Recognition (OCR) capabilities of the recently released GPT-4V(ision), a Large Multimodal Model (LMM).
Yongxin Shi   +7 more

OCR performance prediction using cross-OCR alignment

2015 13th International Conference on Document Analysis and Recognition (ICDAR), 2015
Since 2006, the National Library of France (BnF) has carried out many mass digitization projects on its collections. Digital documents on Gallica (the BnF's digital library) are indexed through their textual content, which is obtained from service providers that use Optical Character Recognition (OCR) software. The modern technologies of OCR
Ahmed Ben Salah   +3 more

OCR-VQA: Visual Question Answering by Reading Text in Images

IEEE International Conference on Document Analysis and Recognition, 2019
The problem of answering questions about an image is popularly known as visual question answering (VQA for short). It is a well-established problem in computer vision.
Anand Mishra   +3 more
