Chemoinformatics : a textbook




Johann Gasteiger, Thomas Engel (eds.)

Wiley-VCH, c2003

University library holdings: 15



Includes bibliographical references and index



The first work devoted entirely to this increasingly important field, the "Textbook" provides an in-depth and comprehensive overview of this exciting new area. Edited by Johann Gasteiger and Thomas Engel, the book introduces the representation of molecular structures and reactions, data types and databases/data sources, search methods, and methods for data analysis, as well as applications such as structure elucidation, reaction simulation, synthesis planning and drug design. A hands-on approach with step-by-step tutorials and detailed descriptions of software tools and Internet resources allows easy access for newcomers, advanced users and lecturers alike. For a more detailed presentation, users are referred to the "Handbook of Chemoinformatics", which will be published separately. Johann Gasteiger received the 1991 Gmelin-Beilstein Medal of the German Chemical Society for Achievements in Computer Chemistry and, in 1997, the Herman Skolnik Award of the Division of Chemical Information of the American Chemical Society (ACS). Thomas Engel joined the research group headed by Johann Gasteiger at the University of Erlangen-Nuremberg and is a specialist in chemoinformatics.


Foreword. Preface. Addresses of the Authors.
1. Introduction. 1.1. The Domain of Chemistry. 1.2. A Chemist's Fundamental Questions. 1.3. The Scope of Chemoinformatics. 1.4. Learning in Chemoinformatics. 1.5. Major Tasks. 1.5.1. Representation of the Objects. 1.5.2. Data. 1.5.3. Learning. 1.6. History of Chemoinformatics. 1.6.1. Structure Databases. 1.6.2. Quantitative Structure-Activity Relationships. 1.6.3. Molecular Modeling. 1.6.4. Structure Elucidation. 1.6.5. Chemical Reactions and Synthesis Design. 1.7. The Scope of this Book. 1.8. Teaching Chemoinformatics.
2. Representation of Chemical Compounds. 2.1. Introduction. 2.2. Chemical Nomenclature. 2.2.1. Development of Chemical Nomenclature. 2.2.2. Representation of Chemical Elements. Characterization of Elements. 2.2.3. Representation of the Empirical Formulas of (Inorganic) Compounds. Present-Day Representation. 2.2.4. Representation of the Empirical Formulas of Organic Compounds. Present-Day Representation. 2.2.5. Systematic Nomenclature of Inorganic and Organic Compounds. 2.3. Line Notations. 2.3.1. Wiswesser Line Notation. Applications. 2.3.2. ROSDAL. Applications. 2.3.3. The SMILES Coding. Applications. 2.3.4. Sybyl Line Notation. Applications. 2.4. Coding the Constitution. 2.4.1. Graph Theory. Basics of Graph Theory. 2.4.2. Matrix Representations. Adjacency Matrix. Distance Matrix. Atom Connectivity Matrix. Incidence Matrix. Bond Matrix. 2.4.3. Connection Table. 2.4.4. Input and Output of Chemical Structures. 2.4.5. Standard Structure Exchange Formats. 2.4.6. Tutorial: Molfiles and SDfiles. Structure of a Molfile. Structure of an SDfile. Libraries and Toolkits. 2.5. Processing Constitutional Information. 2.5.1. Ring Perception. Minimum Number of Cycles. All Cycles. Smallest Fundamental Basis. 2.5.2. Unambiguous and Unique Representations. Structure Isomers and Isomorphism. Canonicalization. 2.5.3. The Morgan Algorithm. Tutorial: Morgan Algorithm. 2.6. Beyond a Connection Table. 2.6.1. Deficiencies in Representing Molecular Structures by a Connection Table. 2.6.2. Representation of Molecular Structures by Electron Systems. General Concepts. Simple Single and Double Bonds. Conjugation and Aromaticity. Orthogonality of π-Systems. Non-bonding Orbitals. Charged Species and Radicals. Ionized States. Electron-Deficient Compounds. Organometallic Compounds. 2.6.3. Generation of RAMSES from a VB Representation. 2.7. Special Notations of Chemical Structures. 2.7.1. Markush Structures. 2.7.2. Fragment Coding. Applications. 2.7.3. Fingerprints. Hashed Fingerprints. 2.7.4. Hash Codes. Applications. 2.8. Representation of Stereochemistry. 2.8.1. General Concepts. 2.8.2. Representation of Configuration Isomers and Molecular Chirality. Detection and Specification of Chirality. 2.8.3. Ordered Lists. 2.8.4. Rotational Lists. 2.8.5. Permutation Descriptors. 2.8.6. Stereochemistry in Molfile and SMILES. Stereochemistry in the Molfile. Stereochemistry in SMILES. 2.8.7. Tutorial: Handling of Stereochemistry by Permutation Groups. Stereochemistry at Tetrahedral Carbon Atoms. Stereochemistry at Double Bonds. 2.9. Representation of 3D Structures. 2.9.1. Walking through the Hierarchy of Chemical Structure Representation. 2.9.2. Representation of 3D Structures. 2.9.3. Obtaining 3D Structures and Why They are Needed. 2.9.4. Automatic 3D Structure Generation. 2.9.5. Obtaining an Ensemble of Conformations: What is Conformational Analysis? 2.9.6. Automatic Generation of Ensembles of Conformations. 2.9.7. Tutorial: 3D Structure Codes (PDB, STAR, CIF, mmCIF). Introduction. PDB File Format. STAR File Format and Dictionaries. CIF File Format (CCDC). mmCIF File Format. Software. 2.10. Molecular Surfaces. 2.10.1. van der Waals Surface. 2.10.2. Connolly Surface. 2.10.3. Solvent-Accessible Surface. 2.10.4. Solvent-Excluded Surface (SES). 2.10.5. Enzyme Cavity Surface (Union Surface). 2.10.6. Isovalue-Based Electron Density Surface. 2.10.7. Experimentally Determined Surfaces. 2.11. Visualization of Molecular Models. 2.11.1. Historical Review. 2.11.2. Structure Models. Wire Frame Model. Capped Sticks Model. Balls and Sticks Model. Space-Filling Model. 2.11.3. Models of Biological Macromolecules. Cylinder Model. Ribbon Model. Tube Model. 2.11.4. Crystallographic Models. 2.11.5. Visualization of Molecular Properties. Properties Based on Isosurfaces. 2.12. Tools: Chemical Structure Drawing Software - Molecule Editors and Viewers. 2.12.1. Introduction. 2.12.2. Molecule Editors. Stand-Alone Applications. Web-Based Applications. 2.12.3. Molecule Viewers. Stand-Alone Applications. Web-Based Applications. 2.13. Tools: 3D Structure Generation on the Web.
3. Representation of Chemical Reactions. 3.1. Introduction. 3.2. Reaction Types. 3.3. Reaction Center. 3.4. Chemical Reactivity. 3.4.1. Physicochemical Effects. Charge Distribution. Inductive Effect. Resonance Effect. Polarizability Effect. Steric Effect. Stereoelectronic Effects. 3.4.2. Simple Approaches to Quantifying Chemical Reactivity. Frontier Molecular Orbital Theory. Linear Free Energy Relationships (LFER). Empirical Reactivity Equations. 3.5. Reaction Classification. 3.5.1. Model-Driven Approaches. Hendrickson's Scheme. Ugi's Scheme. InfoChem's Reaction Classification. 3.5.2. Data-Driven Approaches. HORACE. Reaction Landscapes. 3.6. Stereochemistry of Reactions. 3.7. Tutorial: Stereochemistry of Reactions.
4. The Data. 4.1. Introduction. 4.1.1. Data, Information, Knowledge. 4.1.2. The Data Acquisition Pathway. 4.2. Data Acquisition. 4.2.1. Why Does the Quality of Data Matter? 4.2.2. Data Complexity. 4.2.3. Experimental Data. 4.2.4. Data Exchange. DAT files. JCAMP-DX. PMML. 4.2.5. Real-World Data and their Potential Drawbacks. 4.3. Data Pre-processing. 4.3.1. Mean-Centering, Scaling, and Autoscaling. 4.3.2. Advanced Methods. Fast Fourier Transformation. Wavelet Transformation. Singular Value Decomposition. 4.3.3. Variable Selection. Genetic Algorithm (GA)-Based Solutions. Orthogonalization-Based Solutions. Simulated Annealing (SA)-Based Solutions. PCA-Based Solutions. 4.3.4. Object Selection. 4.4. Preparation of Datasets for Validation of the Model Quality. 4.4.1. Training and Test Datasets. 4.4.2. Compilation of Test Sets.
5. Databases and Data Sources in Chemistry. 5.1. Introduction. 5.2. Basic Database Theory. 5.2.1. Databases in the Information System. 5.2.2. Search Engine. 5.2.3. Access to Databases. 5.2.4. Types of Database Systems. Hierarchical Database System. Network Model. Relational Model. Object-Based Model. 5.3. Classification of Databases. 5.3.1. Literature Databases. 5.3.2. Factual Databases. Numeric Databases. Catalogs of Chemical Compounds. Research Project Databases. Metadatabases. 5.3.3. Structure Databases. 5.3.4. Reaction Databases. 5.4. Literature Databases. 5.4.1. Chemical Abstracts File. 5.4.2. SCISEARCH. 5.4.3. Medline (Medical Literature, Analysis, and Retrieval System Online). 5.5. Tutorial: Using the Chemical Abstracts System. 5.5.1. Online Access. 5.5.2. Access to CAS with SciFinder Scholar 2002. Getting Started. Searching within Various Topics. 5.6. Property (Numeric) Databases. 5.6.1. Beilstein Database. 5.6.2. Gmelin. 5.6.3. DETHERM. 5.7. Tutorial: Searching in the Beilstein Database. 5.7.1. Example 1: Combined Structure and Fact Retrieval. 5.7.2. Example 2: Reaction Retrieval. 5.8. Spectroscopic Databases. 5.8.1. SpecInfo. 5.9. Crystallographic Databases. 5.9.1. Inorganic Crystal Structure Database (ICSD). 5.9.2. Cambridge Structural Database (CSD). 5.9.3. Protein Data Bank (PDB). 5.10. Molecular Biology Databases. 5.10.1. GenBank (Genetic Sequence Bank). 5.10.2. EMBL. 5.10.3. PIR (Protein Information Resource). 5.10.4. SWISS-PROT. 5.10.5. CAS Registry. 5.11. Structure Databases. 5.11.1. CAS Registry. 5.11.2. National Cancer Institute (NCI) Database. 5.12. Chemical Reaction Databases. 5.12.1. CASREACT. 5.12.2. ChemInform RX. 5.13. Tutorial: Searching in the ChemInform Reaction Database. 5.13.1. Introduction. 5.13.2. Example 1: Reaction Retrieval. 5.13.3. Example 2: Advanced Reaction Retrieval. 5.13.4. Classifying Reactions on a Hit List. 5.14. Patent Databases. 5.14.1. INPADOC. 5.14.2. World Patent Index (WPINDEX). 5.14.3. MARPAT. 5.15. Chemical Information on the Internet. 5.16. Tutorial: Searching the Internet for Chemical Information. 5.17. Tutorial: Searching Environmental Information in the Internet. 5.17.1. Introduction: Difficulties in Extracting Scientific Environmental Information from the Internet. 5.17.2. Ways of Searching for Environmental Information on the Internet. Metadatabases and Portals. Search Engines. Databases. 5.18. Tools: The Internet (Online Databases in Chemistry).
6. Searching Chemical Structures. 6.1. Introduction. 6.2. Full Structure Search. 6.3. Substructure Search. 6.3.1. Basic Ideas. 6.3.2. Backtracking Algorithm. 6.3.3. Optimization of the Backtracking Algorithm. 6.3.4. Screening. 6.4. Similarity Search. 6.4.1. Similarity Basics. 6.4.2. Similarity Measures. 6.4.3. The Similarity Search Process. Object Selection. Descriptor Selection and Encoding. Similarity Measure Selection. Query Object Specification. Similarity Scores. Application Areas. 6.5. Three-Dimensional Structure Search Methods.
7. Calculation of Physical and Chemical Data. 7.1. Empirical Approaches to the Calculation of Properties. 7.1.1. Introduction. 7.1.2. Additivity of Atomic Contributions. Hybridization States. 7.1.3. Additivity of Bond Contributions. 7.1.4. Additivity of Group Contributions. 7.1.5. Effects of Rings. 7.1.6. Drug-Receptor Binding Energies. 7.1.7. Attenuation Models. Calculation of Charge Distribution. Polarizability Effect. 7.2. Molecular Mechanics. 7.2.1. Introduction. 7.2.2. No Force Field Calculation Without Atom Types. 7.2.3. The Functional Form of Common Force Fields. Bond Stretching. Angle Bending. Torsional Terms. Out-of-Plane Bending. Electrostatic Interactions. Van der Waals Interactions. Cross-Terms. 7.2.4. Available Force Fields. Force Fields for Small Molecules. Force Fields for Biomolecules. 7.3. Molecular Dynamics. 7.3.1. Introduction. 7.3.2. The Continuous Movement of Molecules. 7.3.3. Methods. Algorithms. Ways to Speed up the Calculations. Solvent Effects. Periodic Boundary Conditions. 7.3.4. Constant Energy, Temperature, or Pressure? 7.3.5. Long-Range Forces. 7.3.6. Application of Molecular Dynamics Techniques. 7.4. Quantum Mechanics. 7.4.1. Hückel Molecular Orbital Theory. 7.4.2. Semi-empirical Molecular Orbital Theory. 7.4.3. Ab Initio Molecular Orbital Theory. 7.4.4. Density Functional Theory. 7.4.5. Properties from Quantum Mechanical Calculations. Net Atomic Charges. Dipole and Higher Multipole Moments. Polarizabilities. Orbital Energies. Surface Descriptors. Local Ionization Potential. 7.4.6. Quantum Mechanical Techniques for Very Large Molecules. Linear Scaling Methods. Hybrid QM/MM Calculations. 7.4.7. The Future of Quantum Mechanical Methods in Chemoinformatics.
8. Calculation of Structure Descriptors. 8.1. Introduction. 8.1.1. Definition of the Term Structure Descriptor. 8.1.2. Classification of Structure Descriptors. 8.2. Structure Keys and 1D Fingerprints. 8.2.1. Distance and Similarity Measures. 8.3. Topological Descriptors. 8.3.1. Some Fundamentals of Graph Theory. 8.3.2. The Adjacency Matrix. 8.3.3. The Laplacian Matrix. 8.3.4. The Distance Matrix. 8.3.5. The Wiener Index. 8.3.6. The Randić Connectivity Index. 8.3.7. Topological Autocorrelation Vectors. 8.3.8. Feature Trees. 8.3.9. Further Topological Descriptors. 8.4. 3D Descriptors. 8.4.1. 3D Structure Generation. 8.4.2. 3D Autocorrelation. Example: Xylene Isomers. 8.4.3. 3D Molecule Representation of Structures Based on Electron Diffraction Code (3D MoRSE Code). 8.4.4. Radial Distribution Function Code. 8.5. Chirality Descriptors. 8.5.1. Quantitative Descriptions of Chirality. 8.5.2. Continuous Chirality Measure (CCM). 8.5.3. Chirality Codes. 8.6. Tutorial: Conformation-Independent and Conformation-Dependent Chirality Codes. 8.6.1. Introduction. 8.6.2. Conformation-Independent Chirality Code (CICC). Preparatory Calculations. Neighborhoods of Atoms Bonded to the Chiral Center. Enumeration of Combinations. Characterization of Combinations. Generation of the Code. 8.6.3. Conformation-Dependent Chirality Code (CDCC). Overview. Enumeration of combinations. Ranking of the Four Atoms in a Combination. Characterization of Combinations. Generation of the Code. Example of an Application. 8.7. Further Descriptors. 8.7.1. Comparative Molecular Field Analysis (CoMFA). 8.7.2. BCUT Descriptors. 8.7.3. 4D-QSAR. 8.7.4. HYBOT Descriptors. 8.8. Descriptors that are not Structure-Based. 8.9. Properties of Structure Descriptors.
9. Methods for Data Analysis. 9.1. Introduction. 9.2. Machine Learning Techniques. 9.2.1. Machine Learning Process. 9.2.2. Unsupervised Learning. 9.2.3. Supervised Learning. 9.3. Decision Trees. 9.4. Chemometrics. 9.4.1. Multivariate Statistics. 9.4.2. Correlation. 9.4.3. Multiple Linear Regression Analysis (MLRA). 9.4.4. Principal Component Analysis (PCA). 9.4.5. Principal Component Regression (PCR). 9.4.6. Partial Least Squares Regression/Projection to Latent Structures (PLS). 9.4.7. Example: Ion Concentrations in Mineral Waters. 9.4.8. Tools: Electronic Data Analysis Service (ELECTRAS). 9.5. Neural Networks. 9.5.1. Modeling the Brain: Biological Neurons versus Artificial Neurons. 9.5.2. Networks. Training. Learning Strategies. 9.5.3. Kohonen Network. Architecture. Training. 9.5.4. Tutorial: Application of a Kohonen Network for the Classification of Olive Oils using ELECTRAS. 9.5.5. Counter-propagation Network. Architecture. Training. 9.5.6. Tools: SONNIA (Self-Organizing Neural Network for Information Analysis). 9.5.7. Back-propagation Network. Architecture. Training. 9.5.8. Tutorial: Neural Networks. 9.5.9. Tasks for Neural Networks and Selection of an Appropriate Neural Network Method. 9.6. Fuzzy Sets and Fuzzy Logic. 9.6.1. Some Concepts. 9.6.2. Application of Fuzzy Logic in Chemistry. 9.7. Genetic Algorithms. 9.7.1. Representation and Encoding of Chromosomes. 9.7.2. Initialization of Individuals. 9.7.3. Fitness and Objective Function. 9.7.4. Selection Functions. 9.7.5. Genetic Operators. 9.7.6. Tutorial: Selection of Relevant Descriptors in a Structure-Activity Study. Example: Drug Design. 9.8. Data Mining. 9.8.1. Classification. 9.8.2. Clustering and Detection of Similarities. 9.8.3. Prediction and Regression. 9.8.4. Association. 9.8.5. Detection of Descriptions. 9.8.6. Data Mining in Chemistry. 9.9. Visual Data Mining. 9.9.1. Advantages of Visual Data Mining Approaches. 9.9.2. Information Visualization Techniques. Data Types. Visualization Techniques. Interaction and Distortion Techniques. 9.10. Expert Systems. 9.10.1. Architecture of Expert Systems. 9.10.2. Tasks of Expert Systems. 9.10.3. Expert Systems in Chemistry. DENDRAL. EROS.
10. Applications. 10.1. Prediction of Properties of Compounds. 10.1.1. Introduction. 10.1.2. Linear Free Energy Relationships (LFER). 10.1.3. Quantitative Structure-Property Relationships (QSPR). Structure Representation. Descriptor Analysis. Model Building. 10.1.4. Estimation of Octanol/Water Partition Coefficient (logPOW). Other Substructure-Based Methods. QSPR Models. 10.1.5. Estimation of Aqueous Solubility (logS). Solubility Prediction Methods. Tutorial: Developing Models for Solubility Prediction with 18 Topological Descriptors. Models with 32 Radial Distribution Function Values and Eight Additional Descriptors. 10.1.6. Prediction of the Toxicity of Compounds. How to Quantify Toxicity. Modeling Toxicity. 10.1.7. Tutorial: Classifying Compounds into Different Modes of Action. 10.1.8. Conclusion and Future Outlook. 10.2. Structure-Spectra Correlations. 10.2.1. Introduction. 10.2.2. Molecular Descriptors. Fragment-Based Descriptors. Topological Structure Codes. Three-Dimensional Molecular Descriptors. 10.2.3. 13C NMR Spectra. 10.2.4. 1H NMR Spectra. Prediction of Chemical Shifts. Tools: Prediction of 1H NMR Chemical Shifts. 10.2.5. Infrared Spectra. Overview. Infrared Spectra Simulation. Tools: TeleSpec Online Service for the Simulation of Infrared Spectra. 10.2.6. Mass Spectra. 10.2.7. Computer-Assisted Structure Elucidation. 10.3. Chemical Reactions and Synthesis Design. 10.3.1. The Prediction of Chemical Reactions. Introduction. Knowledge Extraction from Reaction Databases. Tutorial: Prediction of the Regiochemistry in Pyrazole Synthesis. CAMEO. EROS. Tutorial: Modeling the Degradation of s-Triazine Herbicides in Soil. Biochemical Pathways. Tutorial: Multidimensional Searching in Biochemical Pathways. 10.3.2. Computer-Assisted Synthesis Design. Introduction. Basic Terms. Concepts for Computer-Assisted Organic Synthesis. Synthesis Design Systems. Tutorial: Synthesis Design with the WODCA Program. 10.4. Drug Design. 10.4.1. Introduction. 10.4.2. Some Economic Considerations Affecting Drug Design. 10.4.3. Definitions of some Terms in the Context of Drug Design. 10.4.4. The Drug Discovery Process. Target Identification and Validation. Lead Finding and Optimization. Preclinical and Clinical Trials. 10.4.5. Fields of Application of Chemoinformatics in Drug Design. Subset Selection and Similarity/Diversity Search. Analysis of HTS Data. Virtual Screening. Design of Combinatorial Libraries. Further Issues. 10.4.6. Ligand- and Structure-based Drug Design. Ligand-Based Drug Design. Structure-Based Drug Design. 10.4.7. Applications. Distinguishing Molecules of Different Biological Activities and Finding a New Lead Structure - An Example of Ligand-Based Drug Design. Examples of Structure-Based Drug Design. 10.4.8. Outlook - Future Perspectives.
11. Future Directions.
Appendix. A.1. Software Development. A.1.1. Programming Languages. A.1.2. Object-Oriented Programming. A.1.3. Universal Modeling Language (UML). A.1.4. Design Patterns. A.1.5. Graphical User Interface. A.1.6. Source Code Documentation. A.1.7. Version Control. A.2. Mathematical Excursion into Matrices and Determinants. A.2.1. Mathematical Example. A.2.2. Chemical Example of an Atom Connectivity Matrix.
Index.

From "Nielsen BookData"


  • ISBN
    • 9783527306817
  • LCCN
  • Country of publication code
  • Title language code
  • Text language code
  • Place of publication
  • Pages/volumes
    xxx, 649 p.
  • Size
    25 cm
  • Classification
  • Subject headings