HoloDetect: Few-Shot Learning for Error Detection
We introduce a few-shot learning framework for error detection. We show that data augmentation (a form of weak supervision) is key to training high-quality, ML-based error detection models that require minimal human involvement. Our framework consists of
Heidari, Alireza +3 more
core +1 more source
OpenRefine presentation at CZI EOSS Kickoff Meeting
Presentation of the OpenRefine project during the CZI Essential Open Source Software for Science Kickoff meeting.
Delpeuch, Antonin +2 more
openaire +1 more source
Using OpenRefine's Reconciliation to Validate Local Authority Headings
In 2015, the Cataloging and Metadata Services department of Rice University's Fondren Library developed a process to reconcile four years of authority headings against an internally developed thesaurus. With a goal of immediate cleanup as well as an ongoing maintenance procedure, staff developed a “hack” of OpenRefine's normal Reconciliation ...
Carlson, Scott; Seely, Amber
openaire +2 more sources
Reconciling Conflicting Data Curation Actions: Transparency Through Argumentation
We propose a new approach for modeling and reconciling conflicting data cleaning actions. Such conflicts arise naturally in collaborative data curation settings where multiple experts work independently and then aim to put their efforts together to ...
Yilin Xia +3 more
doaj +1 more source
This review of the scientific literature on mangrove management systems using bibliometric methods aimed to identify research trends, key topics, and collaboration between researchers.
Muh Ainun Beddu +4 more
doaj +1 more source
Trends and evolution mapping of university library collaboration research
Introduction. Collaboration among libraries is essential for addressing resource and collection limitations, and supporting the optimization of research and technology-based services.
Sani Zulviah, Agus Rusmana, Yunus Winoto
doaj +1 more source
Can We Standardize Name Reconciliation via OpenRefine?
Scientific names in biodiversity represent one of the oldest identifiers used in science. As a result, a common repetitive task is being able to reconcile a list of scientific names against curated data sources. Reconciliation allows one to determine if names in a list are spelled correctly, whether they are currently accepted, and their nomenclatural ...
Dmitry Mozzherin +2 more
openaire +1 more source
CEDAR: The Dutch Historical Censuses as Linked Open Data
In this document we describe the CEDAR dataset, a five-star Linked Open Data representation of the Dutch historical censuses, conducted in the Netherlands once every 10 years from 1795 to 1971. We produce a linked dataset from a digitized sample of 2,288
Ashkpour, A. +3 more
core
Una mirada a la interdisciplina desde los Estudios Métricos de la Información y el Análisis de Redes Sociales : Estudio de caso: Centro Interdisciplinario en Nanotecnología y Física y Química de los Materiales (CINQUIFIMA) del Espacio Interdisciplinario
The Espacio Interdisciplinario (EI) was created in 2008 as a physical space and a conceptual environment that cuts across the entire university structure. It is made up of interconnected structures with their own identity to facilitate, promote, and legitimize
Aguirre-Ligüera, Natalia +2 more
core +1 more source
Learning Semantic Annotations for Tabular Data
The usefulness of tabular data such as web tables critically depends on understanding their semantics. This study focuses on column type prediction for tables without any metadata.
Chen, J. +3 more
core

