Bertinho: Galician BERT Representations [PDF]
This paper presents a monolingual BERT model for Galician. We follow the recent trend showing that it is feasible to build robust monolingual BERT models even for relatively low-resource languages, and that such models perform better than the well-known official multilingual BERT (mBERT). More specifically, we release two monolingual Galician BERT models, built
arxiv +1 more source
Unsupervised KPIs-Based Clustering of Jobs in HPC Data Centers [PDF]
Performance analysis is an essential task in High-Performance Computing (HPC) systems and it is applied for different purposes such as anomaly detection, optimal resource allocation, and budget planning. HPC monitoring tasks generate a huge number of Key Performance Indicators (KPIs) to supervise the status of the jobs running in these systems.
arxiv +1 more source
Automatic Census of Mussel Platforms Using Sentinel 2 Images [PDF]
Mussel platforms are large floating wooden structures (typically about 20x20 meters, or slightly larger) that are used for aquaculture, i.e., growing mussels in suitable marine waters. These structures are very typical of Galician estuaries.
arxiv +1 more source
Fraseoloxía e paremioloxía de Bergantiños (Cabana de Bergantiños, Carballo e Coristanco) / Phraseology and paroemiology of Bergantiños (Cabana de Bergantiños, Carballo and Coristanco) [PDF]
A compilation of Galician phraseology and paroemiology carried out at the turn of the millennium in the Terra de Bergantiños, in north-western Galicia. The result of the collaboration of grandfathers and grandmothers, fathers, mothers and children with the team of teachers and with the Centro Ramón Piñeiro para a Investigación
Evaristo Domínguez Rial
doaj
Fraseoloxía e paremioloxía de Sebil, 2 / Phraseology and paroemiology of Sebil, 2 [PDF]
Recadádiva de material fraseolóxico feita entre o 2008 e a actualidade en Sebil, aldea do concello de Cuntis (Pontevedra). // A miscellany of phraseological material compiled from 2008 until the present time in Sebil, a small village in the ...
M.ª Victoria Cerviño Ferrín
doaj
Conversations in Galician: a Large Language Model for an Underrepresented Language [PDF]
The recent proliferation of Large Conversation Language Models has highlighted the economic significance of widespread access to this type of AI technology in the current information age. Nevertheless, prevailing models have primarily been trained on corpora consisting of documents written in popular languages.
arxiv
Da roda para a piola: refráns e frases do sur de Galicia / ‘Da roda para a piola’: proverbs and phrases from Southern Galicia [PDF]
Edición de 87 refráns e mais 54 locucións e fórmulas de 40 concellos de Galicia máis unha comarca. Foron recollidos entre 1994 e 2012. // Compilation of 87 proverbs and 54 locutions and collocations from 40 Galician municipalities as well as one region
Miguel Rubinos Conde
doaj
Refraneiro de Grou (Lobios) recollido por Bieito Fernandes do Palheiro / Proverb collection of Grou (Lobios) compiled by Bieito Fernandes do Palheiro [PDF]
Edición dun manuscrito de 1935 que recolle as paremias máis usadas naquela data en Grou, unha aldea do suroeste de Galicia. // Edition of a manuscript dated 1935 that compiles the idioms most used at that time in Grou, a village located in the ...
Xesús Ferro Ruibal
doaj
A computational psycholinguistic evaluation of the syntactic abilities of Galician BERT models at the interface of dependency resolution and training time [PDF]
This paper explores the ability of Transformer models to capture subject-verb and noun-adjective agreement dependencies in Galician. We conduct a series of word prediction experiments in which we manipulate dependency length together with the presence of an attractor noun that acts as a lure.
arxiv
Computational Paremiology: Charting the temporal, ecological dynamics of proverb use in books, news articles, and tweets [PDF]
Proverbs are an essential component of language and culture, and though much attention has been paid to their history and currency, there has been comparatively little quantitative work on changes in the frequency with which they are used over time. With wider availability of large corpora reflecting many diverse genres of documents, it is now possible
arxiv