Digitized documents - Open Access .click

Results 1 to 10 of about 2,766,627 (107)

Lawformer: A pre-trained language model for Chinese legal long documents [PDF]

AI Open, 2021
Legal artificial intelligence (LegalAI) aims to benefit legal systems with the technology of artificial intelligence, especially natural language processing (NLP).
Chaojun Xiao +4 more
exaly +2 more sources

OBELISC: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents [PDF]

Neural Information Processing Systems, 2023
Large multimodal models trained on natural documents, which interleave images and text, outperform models trained on image-text pairs on various multimodal benchmarks.
Hugo Laurençon +11 more
semanticscholar +1 more source

Named Entity Recognition and Classification in Historical Documents: A Survey [PDF]

ACM Computing Surveys, 2021
After decades of massive digitisation, an unprecedented number of historical documents are available in digital format, along with their machine-readable texts. While this represents a major step forward with respect to preservation and accessibility, it
Maud Ehrmann +4 more
semanticscholar +1 more source

A Discourse-Aware Attention Model for Abstractive Summarization of Long Documents [PDF]

North American Chapter of the Association for Computational Linguistics, 2018
Neural abstractive summarization models have led to promising results in summarizing relatively short documents. We propose the first model for abstractive summarization of single, longer-form documents (e.g., research papers). Our approach consists of a
Arman Cohan +6 more
semanticscholar +1 more source

FUNSD: A Dataset for Form Understanding in Noisy Scanned Documents [PDF]

2019 International Conference on Document Analysis and Recognition Workshops (ICDARW), 2019
We present a new dataset for form understanding in noisy scanned documents (FUNSD) that aims at extracting and structuring the textual content of forms. The dataset comprises 199 real, fully annotated, scanned forms.
Guillaume Jaume, H. K. Ekenel, J. Thiran
semanticscholar +1 more source

SummaRuNNer: A Recurrent Neural Network Based Sequence Model for Extractive Summarization of Documents [PDF]

AAAI Conference on Artificial Intelligence, 2016
We present SummaRuNNer, a Recurrent Neural Network (RNN) based sequence model for extractive summarization of documents and show that it achieves performance better than or comparable to state-of-the-art.
Ramesh Nallapati, Feifei Zhai, Bowen Zhou +2 more
semanticscholar +1 more source

TLDR: Extreme Summarization of Scientific Documents [PDF]

Findings, 2020
We introduce TLDR generation, a new form of extreme summarization, for scientific papers. TLDR generation involves high source compression and requires expert background knowledge and understanding of complex domain-specific language. To facilitate study
Isabel Cachola, Kyle Lo, Arman Cohan, Daniel S. Weld +3 more
semanticscholar +1 more source

Key-Value Memory Networks for Directly Reading Documents [PDF]

Conference on Empirical Methods in Natural Language Processing, 2016
Directly reading documents and being able to answer questions from them is an unsolved challenge. To avoid its inherent difficulty, question answering (QA) has been directed towards using Knowledge Bases (KBs) instead, which has proven effective ...
Alexander H. Miller +5 more
semanticscholar +1 more source

Spanish digital heritage collections: collection policies and presentation of the collection

BiD: Textos Universitaris de Biblioteconomia i Documentació, 2010
Objectives. To analyse the existence and content of collection policy documents and selection criteria for Spanish digital heritage collections.
Assumpció Estivill Rius, Jesús Gascón, Andreu Sulé Duesa +2 more
doaj +1 more source

Constructing Datasets for Multi-hop Reading Comprehension Across Documents [PDF]

Transactions of the Association for Computational Linguistics, 2017
Most Reading Comprehension methods limit themselves to queries which can be answered using a single sentence, paragraph, or document. Enabling models to combine disjoint pieces of textual evidence would extend the scope of machine comprehension methods ...
Johannes Welbl, Pontus Stenetorp, Sebastian Riedel +2 more
semanticscholar +1 more source
