CDKAM: a taxonomic classification tool using discriminative k-mers and approximate matching strategies.
Current taxonomic classification tools use exact string-matching algorithms that are effective for handling data from next-generation sequencing technology.
Bui VK, Wei C.
europepmc +2 more sources
KmerGO: A Tool to Identify Group-Specific Sequences With k-mers.
Capturing group-specific sequences between two groups of genomic/metagenomic sequences is critical for the follow-up identification of single-nucleotide variants (SNVs), gene families, microbial species, or other elements associated with each group.
Wang Y, Chen Q, Deng C, Zheng Y, Sun F.
europepmc +2 more sources
BLight: efficient exact associative structure for k-mers
Motivation: A plethora of methods and applications share the fundamental need to associate information to words for high throughput sequence analysis. Doing so for billions of k-mers is commonly a scalability problem, as exact associative indexes can be ...
C. Marchet, Mael Kerbiriou, A. Limasset
semanticscholar +3 more sources
On the Maximal Independent Sets of k-mers with the Edit Distance.
Ma L, Chen K, Shao M.
europepmc +3 more sources
MetaCon: unsupervised clustering of metagenomic contigs with probabilistic k-mers statistics and coverage.
Sequencing technologies allow the sequencing of microbial communities directly from the environment without prior culturing. Because assembly typically produces only genome fragments, also known as contigs, it is crucial to group them into putative ...
Qian J, Comin M.
europepmc +2 more sources
CMIC: predicting DNA methylation inheritance of CpG islands with embedding vectors of variable-length k-mers
Background: Epigenetic modifications established in mammalian gametes are largely reprogrammed during early development but are partly inherited by the embryo to support its development.
Osamu Maruyama +5 more
doaj +2 more sources
Polymerase chain reaction and different barcoding methods commonly used for plant identification from metagenomic samples are based on the amplification of a limited number of pre-selected barcoding regions.
Kairi Raime, Maido Remm
doaj +2 more sources
A fast and agnostic method for bacterial genome-wide association studies: Bridging the gap between k-mers and genetic events.
Motivation: Genome-wide association study (GWAS) methods applied to bacterial genomes have shown promising results for genetic marker discovery or fine-assessment of marker effect.
Jaillard M +6 more
europepmc +2 more sources
Predicting bacterial resistance from whole-genome sequences using k-mers and stability selection.
Several studies demonstrated the feasibility of predicting bacterial antibiotic resistance phenotypes from whole-genome sequences, the prediction process usually amounting to detecting the presence of genes involved in antibiotic resistance mechanisms ...
Mahé P, Tournoud M.
europepmc +2 more sources
Scalable Genome Assembly through Parallel de Bruijn Graph Construction for Multiple k-mers.
Remarkable advancements in high-throughput gene sequencing technologies have led to an exponential growth in the number of sequenced genomes. However, the unavailability of highly parallel and scalable de novo assembly algorithms has hindered biologists ...
Mahadik K +4 more
europepmc +2 more sources

