
Workshop on Document Intelligence Understanding [PDF]

open access: yes, arXiv, 2023
Document understanding and information extraction include different tasks to understand a document and extract valuable information automatically. Recently, there has been a rising demand for developing document understanding among different domains, including business, law, and medicine, to boost the efficiency of work that is associated with a large ...
arxiv  

Term-Specific Eigenvector-Centrality in Multi-Relation Networks [PDF]

open access: yes, 2011
Fuzzy matching and ranking are two information retrieval techniques widely used in web search. Their application to structured data, however, remains an open problem.
Bry, François   +3 more
core   +2 more sources

PDFVQA: A New Dataset for Real-World VQA on PDF Documents [PDF]

open access: yes, arXiv, 2023
Document-based Visual Question Answering examines the document understanding of document images in conditions of natural language questions. We proposed a new document-based VQA dataset, PDF-VQA, to comprehensively examine the document understanding from various aspects, including document element recognition, document layout structural understanding ...
arxiv  

Documenting software systems using types

open access: yes, Science of Computer Programming, 2006
We show how hypertext-based program understanding tools can achieve new levels of abstraction by using inferred type information for cases where the subject software system is written in a weakly typed language. We propose TypeExplorer, a tool for browsing Cobol legacy systems based on these types. The paper addresses (1) how types, an invented
Arie van Deursen, Leon Moonen
openaire   +1 more source

An overview of the question-response system in American English conversation [PDF]

open access: yes, 2010
This article, part of a 10 language comparative project on question–response sequences, discusses these sequences in American English conversation.
Stivers, T.
core   +2 more sources

Fourier Document Restoration for Robust Document Dewarping and Recognition [PDF]

open access: yes, arXiv, 2022
State-of-the-art document dewarping techniques learn to predict 3-dimensional information of documents which are prone to errors while dealing with documents with irregular distortions or large variations in depth. This paper presents FDRNet, a Fourier Document Restoration Network that can restore documents with different distortions and improve ...
arxiv  

Automata-based Static Analysis of XML Document Adaptation [PDF]

open access: yes, EPTCS 96, 2012, pp. 85-98
The structure of an XML document can be optionally specified by means of XML Schema, thus enabling the exploitation of structural information for efficient document handling. Upon schema evolution, or when exchanging documents among different collections exploiting related but not identical schemas, the need may arise of adapting a document, known to ...
arxiv   +1 more source

Analysis of the scientific output of the Universidad de Salamanca indexed in SCOPUS (2010-2015)

open access: yes, Información, Cultura y Sociedad
The aim of this article is to analyze the scientific output of the teaching and research staff of the Universidad de Salamanca during the period 2010-2015.
Alejandro Medina-González   +3 more
doaj   +1 more source

Towards Just-Enough Documentation for Agile Effort Estimation: What Information Should Be Documented? [PDF]

open access: yes, arXiv, 2021
Effort estimation is an integral part of activities planning in Agile iterative development. An Agile team estimates the effort of a task based on the available information which is usually conveyed through documentation. However, as documentation has a lower priority in Agile, little is known about how documentation effort can be optimized while ...
arxiv  

OCR with Tesseract, Amazon Textract, and Google Document AI: a benchmarking experiment

open access: yes, Journal of Computational Social Science, 2021
Optical Character Recognition (OCR) can open up understudied historical documents to computational analysis, but the accuracy of OCR software varies. This article reports a benchmarking experiment comparing the performance of Tesseract, Amazon Textract ...
Thomas Hegghammer
semanticscholar   +1 more source
