
Deduplication Methods Using Levenshtein Distance Algorithm

open access: yes, Journal of Electrical Systems
The study aimed to propose methods to improve the data integrity of relational databases such as MS SQL, MySQL, and PostgreSQL via duplicate record detection. The FODORS and ZAGAT Restaurant database benchmark datasets have been utilized to facilitate the processes involved in preparing and delivering high-quality data.
openaire   +1 more source
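
As a rough illustration of the kind of duplicate detection this abstract describes, the sketch below flags record pairs whose Levenshtein distance falls under a threshold. It is a minimal sketch, not the study's method: the function names `levenshtein` and `find_duplicates`, the lowercase/strip normalization, and the `max_distance=2` threshold are all assumptions made for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute; unit costs) via two-row DP."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def find_duplicates(records, max_distance=2):
    """Return index pairs (i, j), i < j, whose normalized values are close.

    Hypothetical helper: the normalization and threshold are illustrative
    assumptions, and the pairwise loop is quadratic in the number of records.
    """
    normalized = [r.strip().lower() for r in records]
    pairs = []
    for i in range(len(normalized)):
        for j in range(i + 1, len(normalized)):
            if levenshtein(normalized[i], normalized[j]) <= max_distance:
                pairs.append((i, j))
    return pairs
```

For example, `find_duplicates(["Cafe Luna", "cafe lunna", "Blue Door"])` pairs the first two names and leaves the third alone. On real tables the quadratic comparison is usually preceded by a blocking or filtering step.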

One-Gapped q-Gram Filters for Levenshtein Distance [PDF]

open access: yes, 2002
We have recently shown that q-gram filters based on gapped q-grams instead of the usual contiguous q-grams can provide orders of magnitude faster and/or more efficient filtering for the Hamming distance. In this paper, we extend the results for the Levenshtein distance, which is more problematic for gapped q-grams because an insertion or deletion in a
Burkhardt, S., Kärkkäinen, J.
openaire   +2 more sources
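
For orientation, the sketch below shows the classical count filter on contiguous q-grams, the baseline the abstract refers to as "the usual contiguous q-grams". It is not the gapped filter proposed in the paper. It relies on the standard q-gram lemma: one edit operation can destroy at most q q-grams, so two strings within Levenshtein distance k share at least max(|s|, |t|) - q + 1 - k*q q-grams, counted as a multiset. The names `qgrams` and `may_be_within` are hypothetical.

```python
from collections import Counter


def qgrams(s: str, q: int) -> Counter:
    """Multiset of contiguous substrings of length q (empty if len(s) < q)."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def may_be_within(s: str, t: str, k: int, q: int = 2) -> bool:
    """Return False only when Levenshtein distance(s, t) <= k is impossible.

    True means "candidate pair": the exact distance still has to be
    verified, because the filter admits false positives.
    """
    if abs(len(s) - len(t)) > k:
        return False  # length filter: k edits change the length by at most k
    required = max(len(s), len(t)) - q + 1 - k * q
    if required <= 0:
        return True  # the bound is vacuous, so the filter cannot decide
    shared = sum((qgrams(s, q) & qgrams(t, q)).values())  # multiset intersection
    return shared >= required
```

With q = 2 and k = 1, `may_be_within("levenshtein", "levenstein", 1)` keeps the pair as a candidate, while `may_be_within("abcdefgh", "zzzzzzzz", 1)` rejects it without running the full dynamic program.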

Matching health information seekers' queries to medical terms

open access: yes, BMC Bioinformatics, 2012
Background: The Internet is a major source of health information, but most seekers are not familiar with medical vocabularies. Hence, their searches fail due to bad query formulation.
Soualmia Lina F   +4 more
doaj   +1 more source

A Hybrid Approach to Typo Correction in Indonesian Documents Using Levenshtein Distance

open access: yes, Journal of Technology Informatics and Engineering
This study developed a typo correction system for the Indonesian language by integrating the Levenshtein Distance algorithm with empirical methods. The system is designed to improve the accuracy of typo detection and correction in Indonesian texts, which
Joseph Teguh Santoso, Song Yan
doaj   +1 more source

PLD2flex: Establishing the Phonological Levenshtein Distance for Pairs or Groups of (Pseudo)Words

open access: yes, Journal of Open Research Software
Establishing the phonological Levenshtein distance (PLD) of words and pseudowords is useful for various psycholinguistic research applications, such as generating stimuli for experiments on language processing or analysing the PLD between erroneous and ...
Helena Wedig   +3 more
doaj   +1 more source

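MEKANISME DETEKSI DAN PEMBLOKIRAN KONTEN PERJUDIAN DARING BERBASIS KATA KUNCI MENGGUNAKAN ALGORITMA LEVENSHTEIN DISTANCE [Keyword-Based Detection and Blocking Mechanism for Online Gambling Content Using the Levenshtein Distance Algorithm]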

open access: yes, Jurnal Komputer Terapan
Online gambling in Indonesia is increasingly widespread and has negative impacts, both in terms of socio-economic aspects and cybersecurity. One of the methods used by online gambling operators is inserting gambling backlinks into websites, particularly
Ismail Puji Saputra
doaj   +1 more source

A method and a tool for geocoding and record linkage [PDF]

open access: yes
For many years, researchers have presented the geocoding of postal addresses as a challenge. Several research works have been devoted to achieving the geocoding process.
CHARIF Omar   +4 more
core   +1 more source

Efficiency and Penalty Factors on Monoids of Strings [PDF]

open access: yes, Computer Science Journal of Moldova, 2018
In information theory, linguistics and computer science, metrics for measuring similarity between two given strings (sequences) are important. In this article we introduce efficiency, measure of similarity and penalty for given parallel decompositions ...
Mitrofan Choban, Ivan Budanaev
doaj  

Levenshtein Distance Algorithm in Javanese Character Translation Machine Based on Optical Character Recognition

open access: yes, JOIV: International Journal on Informatics Visualization
Indonesia has diverse art, cultures, and languages. Linguistically, Indonesia has many local languages, which makes it a diverse country, with Javanese being the regional language with the highest number of entries in the Kamus Besar Bahasa Indonesia ...
Musthofa Galih Pradana   +4 more
doaj   +1 more source

Difference Measure for Controlled Random Tests

open access: yes, Доклады Белорусского государственного университета информатики и радиоэлектроники
The task of constructing test sequences difference characteristics was studied. Its relevance for generating controlled random tests and complexity in finding difference measures for the case of symbolic tests were substantiated. The limitations of using
V. N. Yarmolik   +2 more
doaj   +1 more source
