Interpretable machine learning of amino acid patterns in proteins: a statistical ensemble approach [PDF]
Explainable and interpretable unsupervised machine learning helps understand the underlying structure of data. We introduce an ensemble analysis of machine learning models to consolidate their interpretation. Its application shows that restricted Boltzmann machines compress consistently into a few bits the information stored in a sequence of five amino
arxiv +1 more source
Predicting the functional effect of amino acid substitutions and indels. [PDF]
As next-generation sequencing projects generate massive genome-wide sequence variation data, bioinformatics tools are being developed to provide computational predictions on the functional effects of sequence variations and narrow down the search of ...
Yongwook Choi+4 more
doaj +1 more source
Molecular cloning and sequence analysis of the cDNA encoding the human acrosin-trypsin inhibitor (HUSI-II) [PDF]
A complete cDNA clone encoding the human acrosin-trypsin inhibitor HUSI-II has been isolated from a cDNA library of human testis and completely sequenced.
Fink, Edwin+2 more
core +1 more source
A family of cation ATPase-like molecules from Plasmodium falciparum [PDF]
We report the nucleotide and derived amino acid sequence of the ATPase 1 gene from Plasmodium falciparum. The amino acid sequence shares homology with the family of "P"-type cation translocating ATPases in conserved regions important for nucleotide
Cowan, G+5 more
core +2 more sources
STAR: predicting recombination sites from amino acid sequence
Background: Designing novel proteins with site-directed recombination has enormous prospects. By locating effective recombination sites for swapping sequence parts, the probability that hybrid sequences have the desired properties is increased ...
Thier Ricarda+3 more
doaj +1 more source
Correlation between nucleotide composition and folding energy of coding sequences with special attention to wobble bases [PDF]
Background: The secondary structure and complexity of mRNA influence its accessibility to regulatory molecules (proteins, micro-RNAs), its stability and its level of expression.
AA Komar+46 more
core +4 more sources
Can Power Laws Help Us Understand Gene and Proteome Information?
Proteins are biochemical entities consisting of one or more blocks typically folded in a 3D pattern. Each block (a polypeptide) is a single linear sequence of amino acids that are biochemically bonded together.
J. A. Tenreiro Machado+2 more
doaj +1 more source
Ancient properties of spider silks revealed by the complete gene sequence of the prey-wrapping silk protein (AcSp1). [PDF]
Spider silk fibers have impressive mechanical properties and are primarily composed of highly repetitive structural proteins (termed spidroins) encoded by a single gene family.
Ayoub, Nadia A+3 more
core +2 more sources
Discovery of a Novel Member of the Carlavirus Genus from Soybean (Glycine max L. Merr.)
A novel member of the Carlavirus genus, provisionally named soybean carlavirus 1 (SCV1), was discovered by RNA-seq analysis of randomly collected soybean leaves in Illinois, USA.
Thanuja Thekke-Veetil+6 more
doaj +1 more source
Characterization of Trypanosoma brucei gambiense variant surface glycoprotein LiTat 1.5 [PDF]
At present, all available diagnostic antibody detection tests for Trypanosoma brucei gambiense human African trypanosomiasis are based on predominant variant surface glycoproteins (VSGs), such as VSG LiTat 1.5. During investigations aiming at replacement
Büscher, P.+4 more
core +1 more source