Compressive genomics for protein databases. [PDF]
Motivation: The exponential growth of protein sequence databases has increasingly made the fundamental question of searching for homologs a computational bottleneck.
Daniels NM +5 more
europepmc +5 more sources
An analysis of proteogenomics and how and when transcriptome-informed reduction of protein databases can enhance eukaryotic proteomics [PDF]
Background: Proteogenomics aims to identify variant or unknown proteins in bottom-up proteomics by searching transcriptome- or genome-derived custom protein databases.
Laura Fancello, Thomas Burger
doaj +3 more sources
mPies: a novel metaproteomics tool for the creation of relevant protein databases and automatized protein annotation [PDF]
Metaproteomics makes it possible to decipher the structure and functionality of microbial communities. Despite its rapid development, crucial steps such as the creation of standardized protein search databases and reliable protein annotation remain challenging.
Johannes Werner +3 more
doaj +3 more sources
Protein Databases Related to Liquid-Liquid Phase Separation. [PDF]
Liquid-liquid phase separation (LLPS) of biomolecules, which underlies the formation of membraneless organelles (MLOs) or biomolecular condensates, has been investigated intensively in recent years.
Li Q +6 more
europepmc +2 more sources
Tandem repeats lead to sequence assembly errors and impose multi-level challenges for genome and protein databases. [PDF]
The widespread occurrence of repetitive stretches of DNA in genomes of organisms across the tree of life imposes fundamental challenges for sequencing, genome assembly, and automated annotation of genes and proteins.
Tørresen OK +12 more
europepmc +2 more sources
TOPAZ: asymmetric suffix array neighbourhood search for massive protein databases [PDF]
Background: Protein homology search is an important, yet time-consuming, step in everything from protein annotation to metagenomics. Its application, however, has become increasingly challenging, due to the exponential growth of protein databases.
Alan Medlar, Liisa Holm
doaj +2 more sources
Large protein databases reveal structural complementarity and functional locality [PDF]
Recent breakthroughs in protein structure prediction have led to a surge in high-quality 3D models, highlighting the need for efficient computational solutions. In our work, we examine the structural clusters from the AlphaFold Protein Structure Database
Paweł Szczerbiak +5 more
doaj +2 more sources
Addressing statistical biases in nucleotide-derived protein databases for proteogenomic search strategies. [PDF]
Proteogenomics has the potential to advance genome annotation through high quality peptide identifications derived from mass spectrometry experiments, which demonstrate a given gene or isoform is expressed and translated at the protein level.
Blakeley P, Overton IM, Hubbard SJ.
europepmc +2 more sources
Mining Protein Databases using Machine Learning Techniques
With a large amount of information relating to proteins accumulating in databases widely available online, it is of interest to apply machine learning techniques that, by extracting underlying statistical regularities in the data, make predictions about ...
Camargo Renata da Silva +1 more
doaj +2 more sources
Evaluating deterministic motif significance measures in protein databases
Background: Assessing the outcome of motif mining algorithms is an essential task, as the number of reported motifs can be very large. Significance measures play a central role in automatically ranking those motifs, and therefore alleviating the analysis ...
Azevedo Paulo J, Ferreira Pedro
doaj +2 more sources

