Rank diversity of languages: generic behavior in computational linguistics. [PDF]
Statistical studies of languages have focused on the rank-frequency distribution of words. Instead, we introduce here a measure of how word ranks change in time and call this distribution rank diversity.
Cocho G +4 more
europepmc +4 more sources
Inter-Coder Agreement for Computational Linguistics [PDF]
This article is a survey of methods for measuring agreement among corpus annotators. It exposes the mathematics and underlying assumptions of agreement coefficients, covering Krippendorff's alpha as well as Scott's pi and Cohen's kappa; discusses the ...
Atkins Sue +12 more
core +2 more sources
Computational linguistics and linguistics [PDF]
I will try to position the fields of Linguistics and Computational Linguistics by examining their objects of research, their objectives, approaches, and success criteria, drawing on the concepts shown in the text cloud below. This should give a clearer view of the commonalities, differences and potential synergies.
openaire +1 more source
Computational Sociolinguistics: A Survey [PDF]
Language is a social phenomenon and variation is inherent to its social nature. Recently, there has been a surge of interest within the computational linguistics (CL) community in the social dimension of language.
de Jong, Franciska +3 more
core +6 more sources
Linguistics in computational linguistics [PDF]
As my title suggests, this position paper focuses on the relevance of linguistics in NLP instead of asking the inverse question. Although the question about the role of computational linguistics in the study of language may theoretically be much more interesting than the selected topic, I feel that my choice is more appropriate for the purpose and ...
openaire +1 more source
The Bulgarian National Corpus: Theory and Practice in Corpus Design
The paper discusses several key concepts related to the development of corpora and reconsiders them in light of recent developments in NLP. On the basis of an overview of present-day corpora, we conclude that the dominant practices of corpus design do ...
Svetla Koeva +5 more
doaj +1 more source
Improving spaCy dependency annotation and PoS tagging web service using independent NER services [PDF]
Dependency parsing is often used as a component in many text analysis pipelines. However, performance, especially in specialized domains, suffers from the presence of complex terminology.
Nico Colic, Fabio Rinaldi
doaj +1 more source
Modeling the Paraphrase Detection Task over a Heterogeneous Graph Network with Data Augmentation
Paraphrase detection is a Natural-Language Processing (NLP) task that aims at automatically identifying whether two sentences convey the same meaning (even with different words). For the Portuguese language, most of the works model this task as a machine-
Rafael T. Anchiêta +2 more
doaj +1 more source
Empirical Risk Minimization for Probabilistic Grammars: Sample Complexity and Hardness of Learning [PDF]
Probabilistic grammars are generative statistical models that are useful for compositional and sequential structures. They are used ubiquitously in computational linguistics.
Cohen, S. B., Smith, N. A.
core +3 more sources
Linguistics in the digital humanities: (computational) corpus linguistics
Corpus linguistics has been closely intertwined with digital technology since the introduction of university computer mainframes in the 1960s. Making use of both digitized data in the form of the language corpus and computational methods of analysis ...
Kim Ebensgaard Jensen
doaj +3 more sources

