Zimin patterns in genomes [PDF]
Zimin words are words that have the same prefix and suffix. They are unavoidable patterns: every sufficiently long string over a finite alphabet contains them. Here, we examine for the first time the presence of k-mers not containing any Zimin patterns, defined hereafter as Zimin avoidmers, in the human genome. We report that in the reference human genome all k-mers
arxiv
TwoPaCo: An efficient algorithm to build the compacted de Bruijn graph from many complete genomes [PDF]
Motivation: De Bruijn graphs have been proposed as a data structure to facilitate the analysis of related whole genome sequences, in both population and comparative genomics settings. However, current approaches do not scale well to many genomes of large size (such as mammalian genomes).
arxiv
A Conceptual Framework for Human-AI Collaborative Genome Annotation [PDF]
Genome annotation is essential for understanding the functional elements within genomes. While automated methods are indispensable for processing large-scale genomic data, they often face challenges in accurately predicting gene structures and functions.
arxiv
OmniGenBench: Automating Large-scale in-silico Benchmarking for Genomic Foundation Models [PDF]
The advancements in artificial intelligence in recent years, such as Large Language Models (LLMs), have fueled expectations for breakthroughs in genomic foundation models (GFMs). The code of nature, hidden in diverse genomes since the very beginning of life's evolution, holds immense potential for impacting humans and ecosystems through genome modeling.
arxiv
A Misclassification Network-Based Method for Comparative Genomic Analysis [PDF]
Classifying genome sequences based on metadata has been an active area of research in comparative genomics for decades with many important applications across the life sciences. Established methods for classifying genomes can be broadly grouped into sequence alignment-based and alignment-free models.
arxiv
Expression of concern for global biomedical research by the Human Genome Organization (HUGO). [PDF]
Hamosh A +12 more
europepmc +1 more source
Three-dimensional genome structures of single mammalian sperm. [PDF]
Xu H +12 more
europepmc +1 more source
A genome-wide One Health study of Klebsiella pneumoniae in Norway reveals overlapping populations but few recent transmission events across reservoirs. [PDF]
Hetland MAK +20 more
europepmc +1 more source
Predicting viral host codon fitness and path shifting through tree-based learning on codon usage biases and genomic characteristics. [PDF]
Su S +7 more
europepmc +1 more source
Comparative genomic analysis of food-animal-derived and human-derived Clostridium perfringens isolates from markets in Shandong, China. [PDF]
Zhu X +8 more
europepmc +1 more source