3GOLD: optimized Levenshtein distance for clustering third-generation sequencing data [PDF]
Background: Third-generation sequencing offers some advantages over its next-generation sequencing predecessors, but with the caveat of a much higher error rate. Clustering related sequences is an essential task in modern biology.
Robert Logan +6 more
doaj +6 more sources
String correction using the Damerau-Levenshtein distance [PDF]
Background: In the string correction problem, we are to transform one string into another using a set of prescribed edit operations. In string correction using the Damerau-Levenshtein (DL) distance, the permissible edit operations are: substitution ...
Chunchun Zhao, Sartaj Sahni
doaj +7 more sources
Levenshtein Distance, Sequence Comparison and Biological Database Search [PDF]
Levenshtein edit distance has played a central role, both past and present, in sequence alignment in particular and biological database similarity search in general. We start our review with a history of dynamic programming algorithms for computing Levenshtein distance and sequence alignments. We then describe how those algorithms led to heuristics
Bonnie Berger, Yun William Yu
exaly +6 more sources
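As an illustration of the dynamic programming approach mentioned in the abstract above, here is a minimal Python sketch of the classic unit-cost Levenshtein recurrence. It is not taken from the reviewed paper: the function name `levenshtein` and the two-row layout are assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Unit-cost edit distance (insertion, deletion, substitution)."""
    # Hypothetical helper for illustration; not code from the cited review.
    # Keep the shorter string in the inner loop so the row buffer stays small.
    if len(a) < len(b):
        a, b = b, a
    # prev[j] is the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

The sketch runs in time proportional to the product of the two string lengths and keeps only two rows of the table in memory, which is enough when only the distance, not the alignment, is needed.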
A Levenshtein distance-based method for word segmentation in corpus augmentation of geoscience texts
For geoscience text, rich domain corpora have become the basis for improving model performance in word segmentation. However, the lack of annotated domain-specific corpora has become a major obstacle to professional information mining in
Jinqu Zhang
exaly +4 more sources
Linear space string correction algorithm using the Damerau-Levenshtein distance [PDF]
Background: The Damerau-Levenshtein (DL) distance metric has been widely used in the biological sciences. It tries to identify similar regions of DNA, RNA and protein sequences by transforming one sequence into another using substitution, insertion,
Chunchun Zhao, Sartaj Sahni
doaj +5 more sources
Indo-European languages tree by Levenshtein distance [PDF]
The evolution of languages closely resembles the evolution of haploid organisms. This similarity has been recently exploited [GA, GJ] to construct language trees. The key point is the definition of a distance among all pairs of languages which is the
Filippo Petroni, Maurizio Serva
core +4 more sources
GPU acceleration of Levenshtein distance computation between long strings
Computing edit distance for very long strings has been hampered by quadratic time complexity with respect to string length. The WFA algorithm reduces the time complexity to a quadratic factor with respect to the edit distance between the strings. This work presents a GPU implementation of the WFA algorithm and a new optimization that can halve the ...
David Castells-Rufas
semanticscholar +5 more sources
STEMMING BAHASA JAWA MENGGUNAKAN DAMERAU LEVENSHTEIN DISTANCE (DLD) [Javanese Stemming Using Damerau-Levenshtein Distance (DLD)] [PDF]
Stemming is one of the essential stages of text mining. This process removes prefixes and suffixes to produce root words in a text. This study uses a string matching algorithm, namely Damerau Levenshtein Distance (DLD), to find the basic word forms of ...
A. Wibawa, Muhammad Nu’man Hakim
semanticscholar +3 more sources
Deduplication Methods Using Levenshtein Distance Algorithm
The study aimed to propose methods to improve the data integrity of relational databases such as MS SQL, MySQL and PostgreSQL via duplicate record detection.
Eugene S. Valeriano
semanticscholar +2 more sources
DETEKSI PLAGIARISME MENGGUNAKAN ALGORITMA LEVENSHTEIN DISTANCE [Plagiarism Detection Using the Levenshtein Distance Algorithm]
Document similarity detection for plagiarism systems falls within Natural Language Processing research in the field of artificial intelligence. Plagiarism occurs frequently in documents in academic settings, including at PSMTS ULM. Plagiarism detection
Yuslena Sari +2 more
semanticscholar +3 more sources

