Finding Patterns in Three Dimensional Graphs: Algorithms and Applications to Scientific Data Mining
IEEE Transactions on Knowledge and Data Engineering, 2002
Cited by 20 (3 self)
This paper presents a method for finding patterns in three dimensional (3D) graphs. Each node in a graph is an undecomposable or atomic unit and has a label. Edges are links between the atomic units. Patterns are rigid substructures that may occur in a graph after allowing for an arbitrary number of whole-structure rotations and translations as well as a small number (specified by the user) of edit operations in the patterns or in the graph. (When a pattern appears in a graph only after the graph has been modified, we call that appearance "approximate occurrence.") The edit operations include relabeling a node, deleting a node and inserting a node. The proposed method is based on the geometric hashing technique, which hashes node-triplets of the graphs into a 3D table and compresses the label-triplets in the table. To demonstrate the utility of our algorithms, we discuss two applications of them in scientific data mining.
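The hashing step mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the bin width, and the use of sorted pairwise distances as the table index are assumptions made for illustration, and edit operations (approximate occurrence) are not handled. The idea shown is only that a node-triplet's three pairwise distances do not change under rotation or translation, so they can serve as a three-component key into a table.

```python
from collections import defaultdict
from itertools import combinations
import math


def triplet_key(p, q, r, bin_width=0.5):
    """Quantize the sorted pairwise distances of three points.

    Distances are invariant under rotation and translation, so the
    resulting 3-tuple can index a 3D table. bin_width is an assumed
    tolerance, not a value taken from the paper.
    """
    dists = sorted([math.dist(p, q), math.dist(q, r), math.dist(p, r)])
    return tuple(round(d / bin_width) for d in dists)


def build_table(graphs, bin_width=0.5):
    """graphs maps a graph id to a list of (label, (x, y, z)) nodes."""
    table = defaultdict(list)
    for gid, nodes in graphs.items():
        for trip in combinations(range(len(nodes)), 3):
            pts = [nodes[i][1] for i in trip]
            # Simplification: labels are stored as a sorted tuple,
            # independent of which distance each node belongs to.
            labels = tuple(sorted(nodes[i][0] for i in trip))
            key = triplet_key(*pts, bin_width=bin_width)
            table[key].append((gid, labels))
    return table


def lookup(table, pattern_nodes, bin_width=0.5):
    """Count, per graph, the pattern triplets that hit a matching entry."""
    votes = defaultdict(int)
    for trip in combinations(range(len(pattern_nodes)), 3):
        pts = [pattern_nodes[i][1] for i in trip]
        labels = tuple(sorted(pattern_nodes[i][0] for i in trip))
        key = triplet_key(*pts, bin_width=bin_width)
        for gid, stored_labels in table.get(key, []):
            if stored_labels == labels:
                votes[gid] += 1
    return dict(votes)
```

Graphs that collect many votes would be candidates for a closer geometric check; a real implementation would also need to handle distances that fall near a bin boundary.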
Virtual screening of molecular databases using a support vector machine
J. Chem. Inf. Model., 2005
Cited by 18 (1 self)
The Support Vector Machine (SVM) is an algorithm that derives a model used for the classification of data into two categories and which has good generalization properties. This study applies the SVM algorithm to the problem of virtual screening for molecules with a desired activity. In contrast to typical applications of the SVM, we emphasize not classification but enrichment of actives by using a modified version of the standard SVM function to rank molecules. The method employs a simple and novel criterion for picking molecular descriptors and uses cross-validation to select SVM parameters. The resulting method is more effective at enriching for active compounds with novel chemistries than binary fingerprint-based methods such as binary kernel discrimination.
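To make the distinction between classification and enrichment concrete, the sketch below computes an enrichment factor for a ranked list. The abstract does not say which enrichment measure the authors use, so this common definition is an assumption, as are the function name and the `top_fraction` parameter. The `scores` argument stands in for whatever ranking function is used (for example, a real-valued SVM output); no SVM is trained here.

```python
def enrichment_factor(scores, is_active, top_fraction=0.1):
    """Ratio of the active rate in the top-ranked fraction to the
    active rate in the whole library.

    scores      -- one ranking score per molecule (higher = better)
    is_active   -- 1/0 (or True/False) flags, same order as scores
    top_fraction -- fraction of the ranked list to inspect (assumed default)
    """
    if len(scores) != len(is_active) or not scores:
        raise ValueError("scores and is_active must be non-empty and equal length")
    n_total = len(scores)
    n_actives = sum(1 for a in is_active if a)
    if n_actives == 0:
        raise ValueError("at least one active is required")
    n_top = max(1, round(n_total * top_fraction))
    # Rank molecules by descending score and count actives among the top n_top.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    hits = sum(1 for _, active in ranked[:n_top] if active)
    return (hits / n_top) / (n_actives / n_total)
```

A value of 1.0 corresponds to random ordering; a value above 1.0 means actives are concentrated near the top of the list, which is the property the abstract emphasizes over binary classification accuracy.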
BioSpider: a web server for automating metabolome annotations
Pacific Symposium on Biocomputing (PSB 2007), 2007
Frog: a FRee Online druG 3D conformation generator
2007
Cited by 7 (1 self)
In silico screening methods based on the 3D structures of the ligands or of the proteins have become an essential tool to facilitate the drug discovery process. To achieve such process, the 3D structures of the small chemical compounds have to be generated. In addition, for ligand-based screening computations or hierarchical structure-based screening projects involving a rigid-body docking step, it is necessary to generate multiconformer 3D models for each input ligand to increase the efficiency of the search. However, most academic or commercial compound collections are delivered in 1D SMILES (simplified molecular input line entry system) format or in 2D SDF (structure data file), highlighting the need for free 1D/2D to 3D structure generators. Frog is an online service aimed at generating 3D conformations for drug-like compounds starting from their 1D or 2D descriptions. Given the atomic constitution of the molecules and connectivity information, Frog can identify the different unambiguous isomers corresponding to each compound, and generate single or multiple low-to-medium energy 3D conformations, using an assembly process that does not presently consider ring flexibility. Tests show that Frog is able to generate bioactive conformations close to those observed in crystallographic complexes. Frog can be accessed at
Molecules in Silico: The Generation of Structural Formulae and its Applications
J. Comput. Chem. Jpn
Cited by 6 (5 self)
In information processing, in combinatorial chemistry, in structure elucidation, and in several other fields of chemistry, the computer-aided generation of all structures (constitutional formulae) within a defined structure space has become increasingly important. In this brief review the mathematical foundations of the classical molecular model and thus of the generation process are outlined, and the current state of structure generation as applied in software developed by the Bayreuth group is discussed.
MOLecular Structure GENeration with MOLGEN, new features and future developments
Fresenius J. Anal. Chem., 1997
Cited by 6 (4 self)
MOLGEN is a computer program system which is designed for generating molecular graphs fast, redundancy free and exhaustively. In the present paper we describe its basic features, new features of the current release MOLGEN 3.5, and future developments which provide considerable improvements and extensions.
1 Introduction
MOLGEN [17] is a generator for molecular graphs (=connectivity isomers or constitutional formulae) allowing to generate all isomers that correspond to a given molecular formula and (optional) further conditions like prescribed and forbidden substructures, ring sizes etc. The input consists of
- the empirical formula, together with
- an optional list of macroatoms, which means prescribed substructures that must not overlap,
- an optional goodlist, that consists of prescribed substructures which may overlap,
- an optional badlist, containing forbidden substructures,
- an optional interval for the minimal and maximal size of rings,
- an optional num...
Multiple Semi-flexible 3D Superposition of Drug-sized Molecules
Cited by 3 (1 self)
In this paper we describe a new algorithm for multiple semi-flexible superpositioning of drug-sized molecules. The algorithm identifies structural similarities of two or more molecules. When comparing a set of molecules on the basis of their three-dimensional structures, one is faced with two main problems. (1) Molecular structures are not fixed but flexible, i.e., a molecule adopts different forms. To address this problem, we consider a set of conformers per molecule. As conformers we use representatives of conformational ensembles, generated by the program ZIBMol. (2) The degree of similarity may vary considerably among the molecules. This problem is addressed by searching for similar substructures present in arbitrary subsets of the given set of molecules. The algorithm requires to preselect a reference molecule. All molecules are compared to this reference molecule. For this pairwise comparison we use a two-step approach. Clique detection on the correspondence graph of the molecular structures is used to generate start transformations, which are then iteratively improved to compute large common substructures. The results of the pairwise comparisons are efficiently merged using binary matching trees. All common substructures that were found, whether they are common to all or only a few molecules, are ranked according to different criteria, such as number of molecules containing the substructure, size of substructure, and geometric fit. For evaluating the geometric fit, we extend a known scoring function by introducing weights which allow to favor potential pharmacophore points. Despite considering the full atomic information for identifying multiple structural similarities, our algorithm is quite fast. Thus it is well suited as an interactive tool for the exploration of structural similarities of drug-sized molecules.
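The first step of the pairwise comparison, clique detection on a correspondence graph, can be sketched as follows. This is a generic textbook construction under stated assumptions, not the authors' implementation: atoms are given as hypothetical `(label, (x, y, z))` tuples, two atom pairings are considered compatible when their intra-molecular distances agree within an assumed tolerance `tol`, and the clique search is a plain Bron-Kerbosch recursion without pivoting. The refinement of start transformations and the binary matching trees are not shown.

```python
from itertools import combinations
import math


def correspondence_graph(mol_a, mol_b, tol=0.3):
    """Build the correspondence graph of two molecules.

    Each vertex (i, j) pairs atom i of mol_a with atom j of mol_b when
    their labels match. Two vertices are adjacent when the distance
    between the two atoms in mol_a agrees, within tol, with the distance
    between their partners in mol_b.
    """
    vertices = [(i, j)
                for i, (label_a, _) in enumerate(mol_a)
                for j, (label_b, _) in enumerate(mol_b)
                if label_a == label_b]
    adj = {v: set() for v in vertices}
    for (i, j), (k, l) in combinations(vertices, 2):
        if i == k or j == l:
            continue  # an atom may be paired only once
        dist_a = math.dist(mol_a[i][1], mol_a[k][1])
        dist_b = math.dist(mol_b[j][1], mol_b[l][1])
        if abs(dist_a - dist_b) <= tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj


def max_clique(adj):
    """Return one largest clique (sorted) via Bron-Kerbosch without pivoting."""
    best = []

    def expand(current, candidates, excluded):
        nonlocal best
        if not candidates and not excluded:
            if len(current) > len(best):
                best = list(current)
            return
        for v in list(candidates):
            expand(current + [v], candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand([], set(adj), set())
    return sorted(best)
```

Each clique is a set of mutually consistent atom pairings from which a rigid superposition could be fitted. Because only distances are compared, mirror-image matches are not excluded by this sketch.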
Fragment-Based de Novo Ligand Design by Multiobjective Evolutionary Optimization
J. Chem. Inf. Model., 2008
Cited by 2 (0 self)
GANDI (Genetic Algorithm-based de Novo Design of Inhibitors) is a computational tool for automatic fragment-based design of molecules within a protein binding site of known structure. A genetic algorithm and a tabu search act in concert to join predocked fragments with a user-supplied list of fragments. A novel feature of GANDI is the simultaneous optimization of force field energy and a term enforcing 2D-similarity to known inhibitor(s) or 3D-overlap to known binding mode(s). Scaffold hopping can be promoted by tuning the relative weights of these terms. The performance of GANDI is tested on cyclin-dependent kinase 2 (CDK2) using a library of about 14 000 fragments and the binding mode of a known oxindole inhibitor to bias the design. Top ranking GANDI molecules are involved in one to three hydrogen bonds with the backbone polar groups in the hinge region of CDK2, an interaction pattern observed in potent kinase inhibitors. Notably, a GANDI molecule with very favorable predicted binding affinity shares a 2-N-phenyl-1,3-thiazole-2,4-diamine moiety with a known nanomolar inhibitor of CDK2. Importantly, molecules with a favorable GANDI score are synthetically accessible. In fact, eight of the 1809 molecules designed by GANDI for CDK2 are found in the ZINC database of commercially available compounds, which also contains about 600 compounds with scaffolds identical to those in the top ranking GANDI molecules.
An Evolutionary Algorithm with Local Search and Classification for Conformational Searching
MATCH, 1998
Cited by 1 (0 self)
In this paper a software package is described that allows using the MM2 force field to compute the energy of three-dimensional conformations of molecules; this energy is minimized, the resulting structures are automatically classified. Thus we get an impression about the set of all low-energy three-dimensional conformations of a given molecule defined in terms of two dimensional connectivity information. The accuracy of the resulting information can be tuned by changing input parameters for the method presented. This is a hybrid which is built from a method for classification of three-dimensional molecules, from a conjugate gradient method for local minimization and from operators stemming from evolutionary algorithms. The latter have proven successful in the solution of difficult optimization tasks (not only) in mathematical chemistry. They are of special importance for the approximate solution of problems where global optima of multimodal functions in high dimensional spaces are sought.
[... and in Computer Chemistry, MATCH, no. 38, October 1998, pp. 137-159. This work was supported by the DFG under grant Ke 201/161. Email: clemens.frey@uni-bayreuth.de]
The paper is organized in three sections. The first one gives a formalization of the problem and shows in which way this problem was solved up to now. In the second section the evolutionary algorithm is introduced after a short presentation of a general formalization for this kind of algorithms. Each operator of our method is shown in a separate subsection. A definition of the function which determines the transition from one generation to the next generation concludes this section. The last section shows trials and corresponding results obtained with the presented method. The summary of trials is followed by list...