Abstract
The writer T.S. Eliot once mused, “Where is the knowledge we have lost in information?” [1]. From a biological perspective, the answer to this profound question is today having far-reaching consequences for the future of biomedical research and, in particular, the drug discovery process.
References
Eliot TS, Choruses from The Rock
Altschul SF, Madden TL, Schaffer AA et al (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 25: 3389–3402
Pearson WR, Lipman DJ (1988) Improved tools for biological sequence comparison. Proc Natl Acad Sci USA 85: 2444–2448
Needleman SB, Wunsch CD (1970) A general method applicable to the search for similarities in the amino acid sequence of two proteins. J Mol Biol 48: 443–453
Sellers PH (1974) On the theory and computation of evolutionary distances. SIAM J Appl Math 26: 787–793
Smith TF, Waterman MS (1981) Comparison of bio-sequences. Adv Appl Math 2: 482–489
Dayhoff MO, Schwartz RM, Orcutt BC (1978) Atlas of protein sequence and structure. Nat Biomed Res Foundation, Washington D.C., USA 5, Suppl 3: 345–352
Jones DT, Taylor WR, Thornton JM (1992) A new approach to protein fold recognition. Nature 358: 86–89
Gonnet GH, Cohen MA, Benner SA (1992) Exhaustive matching of the entire protein-sequence database. Science 256: 1443–1445
Henikoff S, Henikoff JG (1993) Performance evaluation of amino acid substitution matrices. Proteins 17: 49–61
Zhang Z, Schaffer AA, Miller W et al (1998) Protein sequence similarity searches using patterns as seeds. Nucleic Acids Res 26: 3986–3990
Teichmann SA, Chothia C, Gerstein M (1999) Advances in structural genomics. Curr Opin Struct Biol 9: 390–399
Bairoch A (1991) PROSITE: a dictionary of sites and patterns in proteins. Nucleic Acids Res 19 Suppl: 2241–2245
Thompson JD, Higgins DG, Gibson TJ (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice. Nucleic Acids Res 22: 4673–4680
Barton GJ (1994) The AMPS package for multiple protein sequence alignment. Methods Mol Biol 25: 327–347
Gribskov M, McLachlan AD, Eisenberg D (1987) Profile analysis: Detection of distantly related proteins. Proc Natl Acad Sci USA 84: 4355–4358
Hughey R, Krogh A (1996) Hidden Markov models for sequence analysis: extension and analysis of the basic method. Comput Appl Biosci 12: 95–107
Neuwald AF, Liu JS, Lipman DJ et al (1997) Extracting protein alignment models from the sequence database. Nucleic Acids Res 25: 1665–1677
Grundy WN, Bailey TL, Elkan CP et al (1997) Meta-MEME: Motif-based hidden Markov models of protein families. Comput Appl Biosci 13: 397–406
Henikoff JG, Henikoff S, Pietrokovski S (1999) New features of the Blocks Database servers. Nucleic Acids Res 27: 226–228
Etzold T, Argos P (1993) SRS — an indexing and retrieval tool for flat file data libraries. Comput Appl Biosci 9: 49–57
Online Mendelian Inheritance in Man, OMIM (TM). McKusick-Nathans Institute for Genetic Medicine, Johns Hopkins University (Baltimore, MD) and National Center for Biotechnology Information, National Library of Medicine (Bethesda, MD), 2000. World Wide Web URL: http://www.ncbi.nlm.nih.gov/omim/
Discala C, Benigni X, Barillot E (2000) DBcat: a catalog of 500 biological databases. Nucleic Acids Res 28: 8–9
Lawton JR, Martinez FA, Burks C (1989) Overview of the LiMB database. Nucleic Acids Res 17: 5885–5899
Bairoch A, Apweiler R (1999) The SWISS-PROT protein sequence data bank and its supplement TrEMBL. Nucleic Acids Res 27: 49–54
Barker WC, Garavelli JS, Huang H et al (2000) The protein information resource (PIR). Nucleic Acids Res 28: 41–44
Walsh S, Anderson M, Cartinhour SW (1998) ACEDB: a database for genome information. Methods Biochem Anal 39: 299–318
Hofmann K, Bucher P, Falquet L et al (1999) The PROSITE database, its status in 1999. Nucleic Acids Res 27: 215–219
Attwood TK, Flower DR, Lewis AP et al (1999) PRINTS prepares for the new millennium. Nucleic Acids Res 27: 220–225
Sonnhammer EL, Eddy SR, Birney E et al (1998) Pfam: multiple sequence alignments and HMM-profiles of protein domains. Nucleic Acids Res 26: 320–322
Ponting CP, Schultz J, Milpetz F et al (1999) SMART: identification and annotation of domains from signalling and extracellular protein sequences. Nucleic Acids Res 27: 229–232
Laskowski RA (2001) PDBsum: summaries and analyses of PDB structures. Nucleic Acids Res 29: 221–222
Corpet F, Gouzy J, Kahn D (1998) The ProDom database of protein domain families. Nucleic Acids Res 26: 323–326
Gracy J, Argos P (1998) DOMO: a new database of aligned protein domains. Trends Biochem Sci 23: 495–497
Apweiler R, Attwood TK, Bairoch A et al (2000) InterPro: an integrated documentation resource for protein families, domains and functional sites. Bioinformatics 16: 1145–1150
Hubbard TJ, Ailey B, Brenner SE et al (1999) SCOP: a Structural Classification of Proteins database. Nucleic Acids Res 27: 254–256
Orengo CA, Pearl FM, Bray JE et al (1999) The CATH Database provides insights into protein structure/function relationships. Nucleic Acids Res 27: 275–279
Tatusov RL, Koonin EV, Lipman DJ (1997) A genomic perspective on protein families. Science 278: 631–637
Cooper DN, Ball EV, Krawczak M (1998) The human gene mutation database. Nucleic Acids Res 26: 285–287
Brookes AJ, Lehvaslaiho H, Siegfried M et al (2000) HGBASE: a database of SNPs and other variations in and around human genes. Nucleic Acids Res 28: 356–360
Sherry ST, Ward MH, Kholodov M et al (2001) dbSNP: the NCBI database of genetic variation. Nucleic Acids Res 29: 308–311
Borodovsky M, McIninch J (1993) GeneMark: Parallel Gene Recognition for both DNA Strands. Computers & Chemistry 17: 123–133
Xu Y, Einstein JR, Mural RJ et al (1994) An improved system for exon recognition and gene modeling in human DNA sequences. Proc Int Conf Intell Syst Mol Biol 2: 376–384
Thomas A, Skolnick M (1994) A probabilistic model for detecting coding regions in DNA sequences. IMA J Math Appl Med Biol 11: 149–160
Cole ST, Brosch R, Parkhill J et al (1998) Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 393: 537–544
Henderson J, Salzberg S, Fasman K (1997) Finding genes in DNA with a Hidden Markov Model. J Comput Biol 2: 127–141
Lukashin AV, Borodovsky M (1998) GeneMark.hmm: new solutions for gene finding. Nucleic Acids Res 26: 1107–1115
Gelfand MS, Mironov AA, Pevzner PA (1996) Gene recognition via spliced sequence alignment. Proc Natl Acad Sci USA 93: 9061–9066
Quandt K, Frech K, Karas H et al (1995) MatInd and MatInspector: new fast and versatile tools for detection of consensus matches in nucleotide sequence data. Nucleic Acids Res 23: 4878–4884
Heinemeyer T, Chen X, Karas H et al (1999) Expanding the TRANSFAC database towards an expert system of regulatory molecular mechanisms. Nucleic Acids Res 27: 318–322
Parsons JD (1995) Improved tools for DNA comparison and clustering. Comput Appl Biosci 11: 603–613
Pietu G, Eveno E, Soury-Segurens B (1999) The genexpress IMAGE knowledge base of the human muscle transcriptome: a resource of structural, functional, and positional candidate genes for muscle physiology and pathologies. Genome Res 9: 1313–1320
Williamson AR (1999) The Merck Gene Index project. Drug Discov Today 4: 115–122
Chou PY, Fasman GD (1974) Prediction of protein conformation. Biochemistry 13: 222–245
Garnier J, Osguthorpe DJ, Robson B (1978) Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins. J Mol Biol 120: 97–120
Rost B (1996) PHD: predicting one-dimensional protein structure by profile-based neural networks. Methods Enzymol 266: 525–539
Schneider R, Sander C (1996) The HSSP database of protein structure-sequence alignments. Nucleic Acids Res 24: 201–205
Salamov AA, Solovyev VV (1995) Prediction of protein secondary structure by combining nearest-neighbor algorithms and multiple sequence alignments. J Mol Biol 247: 11–15
King RD, Sternberg MJ (1996) Identification and application of the concepts important for accurate and reliable protein secondary structure prediction. Protein Sci 5: 2298–2310
Frishman D, Argos P (1995) Knowledge-based secondary structure assignment. Proteins 23: 566–579
Cuff JA, Clamp ME, Siddiqui AS et al (1998) JPred: a consensus secondary structure prediction server. Bioinformatics 14: 892–893
Sutcliffe MJ, Hayes FR, Blundell TL (1987) Knowledge based modelling of homologous proteins, Part II: Rules for the conformations of substituted sidechains. Protein Eng 1: 385–392
Sali A, Overington JP (1994) Derivation of rules for comparative protein modeling from a database of protein structure alignments. Protein Sci 3: 1582–1596
Jones DT, Tress M, Bryson K et al (1999) Successful recognition of protein folds using threading methods biased by sequence similarity and predicted secondary structure. Proteins 3: 104–111
Taylor WR (1997) Multiple sequence threading: an analysis of alignment quality and stability. J Mol Biol 269: 902–943
Rost B (1995) TOPITS: threading one-dimensional predictions into three-dimensional structures. Proc Int Conf Intell Syst Mol Biol 3: 314–321
Russell RB, Copley RR, Barton GJ (1996) Protein fold recognition by mapping predicted secondary structures. J Mol Biol 259: 349–365
Rice DW, Eisenberg D (1997) A 3D–1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence. J Mol Biol 267: 1026–1038
Aszodi A, Munro RE, Taylor WR (1997) Protein modeling by multiple sequence threading and distance geometry. Proteins Suppl 1: 38–42
Laskowski RA, MacArthur MW, Moss DS et al (1993) PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Cryst 26: 283–291
Lupas A (1996) Prediction and Analysis of Coiled-Coil Structures. Methods Enzymol 266: 513–525
Berger B, Wilson DB, Wolf E et al (1995) Predicting Coiled Coils by Use of Pairwise Residue Correlations. Proc Natl Acad Sci USA 92: 8259–8263
Lupas A (1997) Predicting coiled-coil regions in proteins. Curr Opin Struct Biol 7: 388–393
Hirst J, Vieth M, Skolnick J et al (1996) Predicting leucine zipper structures from sequence. Protein Eng 9: 657–662
Bornberg-Bauer E, Rivals E, Vingron M (1998) Computational approaches to identify leucine zippers. Nucleic Acids Res 26: 2740–2746
Claros MG, von Heijne G (1994) TopPred II: an improved software for membrane protein structure predictions. Comput Appl Biosci 10: 685–686
Rost B, Fariselli P, Casadio R (1994) Refining neural network predictions for helical transmembrane proteins by dynamic programming. Comput Appl Biosci 10: 685–686
Persson B, Argos P (1997) Prediction of membrane protein topology utilizing multiple sequence alignments. J Protein Chem 16: 453–457
Jones DT, Taylor WR, Thornton JM (1994) A model recognition approach to the prediction of all-helical membrane protein structure and topology. Biochemistry 33: 3038–3049
Sonnhammer EL, von Heijne G, Krogh A (1998) A hidden Markov model for predicting transmembrane helices in protein sequences. Proc Int Conf Intell Syst Mol Biol 6: 175–182
Cedano J, Aloy P, Perez-Pons JA et al (1997) Relation between amino acid composition and cellular location of proteins. J Mol Biol 266: 594–600
Nakai K, Horton P (1999) PSORT: a program for detecting sorting signals in proteins and predicting their subcellular localization. Trends Biochem Sci 24: 34–36
Nielsen H, Brunak S, von Heijne G (1999) Machine learning approaches for the prediction of signal peptides and other protein sorting signals. Protein Eng 12: 3–9
Felsenstein J (1989) PHYLIP - Phylogeny Inference Package (Version 3.2). Cladistics 5: 164–166; also see http://evolution.genetics.washington.edu/phylip.html
Wills C (1994) Phylogenetic analysis and molecular evolution. In: DW Smith (ed): Biocomputing: Informatics and Genome Projects. Academic Press, San Diego, 175–201
Setubal J, Meidanis J (eds) (1996) Introduction to Computational Molecular Biology. PWS Publishing Co., Boston
Huson DH, Vawter L, Warnow TJ (1999) Solving large scale phylogenetic problems using DCM2. In: Lengauer T, Schneider R (eds): Proceedings of the Seventh International Conference on Intelligent Systems for Molecular Biology. AAAI Press, Menlo Park (CA), 118–129
Strimmer K, von Haeseler A (1997) Likelihood-mapping: a simple method to visualize phylogenetic content of a sequence alignment. Proc Natl Acad Sci USA 94: 6815–6819
Diaconis PW, Holmes SP (1998) Matchings and phylogenetic trees. Proc Natl Acad Sci USA 95: 14600–14602
Karp PD, Riley M, Paley SM et al (1996) EcoCyc: an encyclopedia of Escherichia coli genes and metabolism. Nucleic Acids Res 24: 32–39; see also http://ecocyc.pangeasystems.com/ecocyc/
Bork P, Dandekar T, Diaz-Lazcoz Y et al (1998) Predicting function: from genes to genomes and back. J Mol Biol 283: 707–725
Ehlde M, Zacchi G (1995) MIST: a user-friendly metabolic simulator. Comput Appl Biosci 11: 201–207
Mendes P (1993) GEPASI: a software package for modelling the dynamics, steady states and control of biochemical and other systems. Comput Appl Biosci 9: 563–571
Tomita M, Hashimoto K, Takahashi K et al (1999) E-CELL: Software environment for whole cell simulation. Bioinformatics 15: 72–84; also see E-Cell Project http://www.e-cell.org/
Heidtke KR, Schulze-Kremer S (1998) BioSim – a new qualitative simulation environment for molecular biology. In: J Glasgow, T Littlejohn, F Major, R Lathrop, D Sankoff, C Sensen (eds): Proceedings of the Sixth International Conference on Intelligent Systems for Molecular Biology. AAAI Press, Menlo Park (CA), 85–94
D’haeseleer P, Liang S, Somogyi R (1999) Gene expression data analysis and modeling. Tutorial session at Pacific Symposium on Biocomputing, Hawaii, January: 4–9; also see http://www.cgl.ucsf.edu/psb/psb99/genetutorial.pdf
McAdams HH, Shapiro L (1995) Circuit simulation of genetic networks. Science 269: 650–656
Kanehisa M, Goto S (2000) KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res 28: 27–30
Scharf M, Schneider R, Casari G et al (1994) GeneQuiz: a workbench for sequence analysis. Proc Int Conf Intell Syst Mol Biol 2: 348–353
Frishman D, Mewes H-W (1997) PEDANTic genome analysis. Trends Genet 13: 415–416
Gaasterland T, Sensen CW (1996) MAGPIE: automated genome interpretation. Trends Genet 12: 76–78
Kabsch W, Sander C (1983) How good are predictions of protein secondary structure? FEBS Lett 155: 179–182
Levin JM, Pascarella S, Argos P et al (1993) Quantification of secondary structure prediction improvement using multiple alignment. Protein Eng 6: 849–854
Rost B, Sander C (1994) Combining evolutionary information and neural networks to predict protein secondary structure. Proteins 19: 55–72
Lim VI (1974) Structural Principles of the Globular Organization of Protein Chains. A Stereochemical Theory of Globular Protein Secondary Structure. J Mol Biol 88: 857–872
Schneider R (1989) Sekundärstrukturvorhersage von Proteinen unter Berücksichtigung von Tertiärstrukturaspekten. Diploma thesis, Universität Heidelberg, Germany
Ptitsyn OB, Finkelstein AV (1983) Theory of protein secondary structure and algorithm of its prediction. Biopolymers 22: 15–25
Gibrat J-F, Garnier J, Robson B (1987) Further developments of protein secondary structure prediction using information theory. New parameters and consideration of residue pairs. J Mol Biol 198: 425–443
Garnier J, Gibrat J-F, Robson B (1996) GOR method for predicting protein secondary structure from amino acid sequence. Meth Enzymol 266: 540–553
Kabsch W, Sander C (1983) Segment83. Unpublished
Brenner SE, Barken D, Levitt M (1999) The PRESAGE database for structural genomics. Nucleic Acids Res 27: 251–253
Copyright information
© 2003 Springer Basel AG
About this chapter
Cite this chapter
Jackson, D.B., Minch, E., Munro, R.E. (2003). Bioinformatics. In: Hillisch, A., Hilgenfeld, R. (eds) Modern Methods of Drug Discovery. EXS, vol 93. Birkhäuser, Basel. https://doi.org/10.1007/978-3-0348-7997-2_3
DOI: https://doi.org/10.1007/978-3-0348-7997-2_3
Publisher Name: Birkhäuser, Basel
Print ISBN: 978-3-0348-9397-8
Online ISBN: 978-3-0348-7997-2
eBook Packages: Springer Book Archive