## MOLecular Structure GENeration with MOLGEN, new features and future developments (1997)

Venue: | Fresenius J. Anal. Chem |

Citations: | 6 - 4 self |

### BibTeX

@ARTICLE{Benecke97molecularstructure,

author = {C. Benecke and T. Grüner and A. Kerber and R. Laue and T. Wieland},

title = {MOLecular Structure GENeration with MOLGEN, new features and future developments},

journal = {Fresenius J. Anal. Chem},

year = {1997},

volume = {358}

}

### Abstract

MOLGEN is a computer program system designed for generating molecular graphs fast, redundancy-free and exhaustively. In the present paper we describe its basic features, new features of the current release MOLGEN 3.5, and future developments which provide considerable improvements and extensions.

1 Introduction

MOLGEN [1-7] is a generator for molecular graphs (= connectivity isomers or constitutional formulae) that generates all isomers corresponding to a given molecular formula and (optional) further conditions such as prescribed and forbidden substructures, ring sizes etc. The input consists of

- the empirical formula, together with
- an optional list of macroatoms, i.e. prescribed substructures that must not overlap,
- an optional goodlist, consisting of prescribed substructures which may overlap,
- an optional badlist, containing forbidden substructures,
- an optional interval for the minimal and maximal size of rings,
- an optional num...
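The input items listed in the abstract can be pictured as a small record. The following Python sketch is only an illustration under stated assumptions: the class name `GeneratorInput`, its field names, and the use of plain strings as substructure placeholders are hypothetical and are not MOLGEN's actual input format. Items that are cut off in the abstract are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class GeneratorInput:
    """Hypothetical container for the generator input described in the abstract."""

    formula: str                                          # empirical formula (required)
    macroatoms: List[str] = field(default_factory=list)   # prescribed, non-overlapping substructures
    goodlist: List[str] = field(default_factory=list)     # prescribed substructures that may overlap
    badlist: List[str] = field(default_factory=list)      # forbidden substructures
    ring_sizes: Optional[Tuple[int, int]] = None          # (minimal, maximal) ring size, if restricted

    def __post_init__(self) -> None:
        # Only the formula is mandatory; every other item is optional.
        if not self.formula:
            raise ValueError("an empirical formula is required")
        if self.ring_sizes is not None:
            smallest, largest = self.ring_sizes
            if smallest > largest:
                raise ValueError("minimal ring size exceeds maximal ring size")


# Example: a formula with a ring-size interval and no substructure restrictions.
query = GeneratorInput(formula="C6H11NO", ring_sizes=(3, 6))
```

Keeping the three substructure lists separate mirrors the distinction the abstract draws between macroatoms (no overlap), goodlist entries (overlap allowed) and badlist entries (forbidden).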

### Citations

351 |
Algorithm 97: Shortest Path
- Floyd
Citation Context ...ss of indices is based on the distance matrix $D$, where each entry $d_{i,j}$ denotes the length of the shortest path from vertex $i$ to vertex $j$. The distance matrix can be calculated by the Floyd algorithm [32]. We also used the Wiener Index [33], the Balaban Index [34], the mean square distance index [35], which is due to Balaban and Motoc, information-theoretic indices [20,28], and the mean information co... |
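The distance-based indices named in this context can be illustrated with a short sketch. The Python below is an illustration only, not MOLGEN code: it assumes a hydrogen-suppressed molecular graph given as an edge list with unit bond lengths, fills the distance matrix with the Floyd (Floyd-Warshall) recurrence, and takes the Wiener index as the sum of shortest-path lengths over all unordered vertex pairs. The function names `distance_matrix` and `wiener_index` are hypothetical.

```python
from itertools import product

INF = float("inf")


def distance_matrix(n, edges):
    """Shortest-path lengths between all vertex pairs of an undirected graph.

    Assumes vertices are numbered 0..n-1 and every edge has length 1.
    """
    d = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j in edges:
        d[i][j] = d[j][i] = 1
    # Floyd recurrence: allow vertex k as an intermediate stop.
    # product() varies its first component slowest, so k is the outer loop.
    for k, i, j in product(range(n), repeat=3):
        if d[i][k] + d[k][j] < d[i][j]:
            d[i][j] = d[i][k] + d[k][j]
    return d


def wiener_index(d):
    """Sum of the distances over all unordered vertex pairs."""
    n = len(d)
    return sum(d[i][j] for i in range(n) for j in range(i + 1, n))


# Carbon skeletons of n-butane (a path) and isobutane (a star).
n_butane = distance_matrix(4, [(0, 1), (1, 2), (2, 3)])
isobutane = distance_matrix(4, [(0, 1), (0, 2), (0, 3)])
```

For these two skeletons the sketch gives Wiener indices of 10 and 9 respectively, so the more compact, branched isomer receives the smaller value.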

116 |
Structural determination of paraffin boiling points
- Wiener
- 1947
Citation Context ...ce matrix $D$, where each entry $d_{i,j}$ denotes the length of the shortest path from vertex $i$ to vertex $j$. The distance matrix can be calculated by the Floyd algorithm [32]. We also used the Wiener Index [33], the Balaban Index [34], the mean square distance index [35], which is due to Balaban and Motoc, information-theoretic indices [20,28], and the mean information content of distances that was develope... |

55 |
Algebraic Combinatorics via Finite Group Actions
- Kerber
- 1991
Citation Context ...uced by the automorphism group $\mathrm{Aut}(\gamma; \beta)$ of the molecular graph. So $P_\gamma$ acts on the $k$ sites which shall be assigned with $n$ different ligands. Using the well-known theory of group actions (see e.g. [45]) one obtains: 3.2 Lemma The essentially different possibilities to attach $n$ ligand structures, which contain the corresponding subgraph of the given reaction scheme exactly once, to the $k$ different r... |

41 |
Molecular Connectivity in Structure-Activity Analysis
- Kier, Hall
- 1986
Citation Context ...to cover a broad variety with the screening -- without requiring too many single substances. So in connection with combinatorial chemistry the notion of similarity has come into focus [15], [16], [18]-[20]. There is a vast number of ways to define and determine similarity of chemical entities. It turned out that similarity -- unlike isomorphism, for example -- cannot be defined generally. It depends, in f... |

39 |
Highly Discriminating Distance-Based Topological Index
- Balaban
- 1982
Citation Context ...entry $d_{i,j}$ denotes the length of the shortest path from vertex $i$ to vertex $j$. The distance matrix can be calculated by the Floyd algorithm [32]. We also used the Wiener Index [33], the Balaban Index [34], the mean square distance index [35], which is due to Balaban and Motoc, information-theoretic indices [20,28], and the mean information content of distances that was developed by Bonchev and Trinajs... |

24 |
Information Theoretic Indices for Characterization of Chemical Structures
- Bonchev
- 1983
Citation Context ...2,26] and experiments or quantum chemical calculations are expensive for larger sets of compounds, a lot of interest lies currently in the use of topological indices (or graph invariants) [18], [23], [28]-[31] as discrimination criteria and prediction tools. We used connectivity indices ks and k b for k = 0, 1, 2 after [18] which are sums over all paths of length k in the graphs, varying by the use of ... |

23 |
From atoms and bonds to three-dimensional atomic coordinates: Automatic model builders. Chem. Rev.
- Sadowski, Gasteiger
- 1993
Citation Context ...0,51]. Here the library elements are first converted to three-dimensional structures by an appropriate method (like distance geometry programs [52], conformation analysis methods [53], expert systems [54,55] or force field calculations, e.g. [8]). The crucial feature such a program is required to have is that the computed conformation must be reasonable for the active site, as the usual software packages ... |

20 |
Information theory, distance matrix and molecular
- Bonchev, Trinajstić
- 1977
Citation Context ... mean square distance index [35], which is due to Balaban and Motoc, information-theoretic indices [20,28], and the mean information content of distances that was developed by Bonchev and Trinajstić [36]. We calculated the index values for the 20 natural amino acids (for the table of results and more details see [37]). Principal Component Analysis yields three factors explaining 93.2 % of the original variance. The results obtained show that structures which differ only slightly also have similar factors, e.g. as... |

17 |
Similarity and Clustering in Chemical Information Systems; Research Studies Press
- Willett
- 1987
Citation Context ...arch is one of the main algorithms for their determination. We used a procedure from [42]. Other properties were calculated by methods described in [6,5,7]. In the evaluation the Tanimoto coefficient [43] is employed, which measures the similarity of two bit strings as $T_{i,j} = \frac{2 C_{i,j}}{E_i + E_j}$, where $C_{i,j}$ is the number of properties that are common in the $i$-th and in the $j$-th structure, and $E_i$ is the ... |
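The bit-string similarity quoted in this context can be sketched in a few lines of Python. This is an illustration under two labeled assumptions: that the flattened expression in the excerpt is the fraction with numerator 2C and denominator E_i + E_j, and that E_i counts the properties present in structure i (the excerpt is cut off before defining it). In that form the measure coincides with what is usually called the Dice coefficient; the more common definition of the Tanimoto coefficient, C / (E_i + E_j - C), is included for comparison. All function names are hypothetical.

```python
def _counts(a, b):
    """Return (C, E_a, E_b) for two equal-length sequences of 0/1 flags."""
    if len(a) != len(b):
        raise ValueError("bit strings must have the same length")
    common = sum(1 for x, y in zip(a, b) if x and y)  # properties present in both
    return common, sum(1 for x in a if x), sum(1 for y in b if y)


def similarity_as_quoted(a, b):
    """2C / (E_a + E_b), the form read from the excerpt (an assumption)."""
    c, e_a, e_b = _counts(a, b)
    if e_a + e_b == 0:
        return 1.0  # assumed convention: two empty property lists count as identical
    return 2 * c / (e_a + e_b)


def tanimoto_common_form(a, b):
    """C / (E_a + E_b - C), the widely used Tanimoto definition."""
    c, e_a, e_b = _counts(a, b)
    if e_a + e_b - c == 0:
        return 1.0  # same assumed convention as above
    return c / (e_a + e_b - c)


# Two structures described by five binary properties each.
s1 = [1, 1, 0, 1, 0]
s2 = [1, 0, 0, 1, 1]
```

Here the two structures share two of three properties each, so the quoted form gives 2/3 and the common Tanimoto form gives 0.5; both range from 0 (no shared property) to 1 (identical property lists).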

14 |
Using a genetic algorithm to suggest combinatorial libraries
- Sheridan, Kearsley
- 1995
Citation Context ...nt programs that show beforehand what can be expected from a possible experiment using the powerful methods of combinatorial chemistry. There are several possibilities for selecting the building blocks [16,17,21]. Their application mainly depends on the objective that is sought by the combinatorial library. Here we present two methods based on graph theory in conjunction with statistical analysis. 3.1 Molecul... |

11 |
MOLGEN, ein Computeralgebra-System für die Konstruktion molekularer Graphen
- Grund, Kerber, et al.
- 1992
Citation Context ... sites in IV and V, the libraries with the first one are considerably smaller due to the higher symmetry of the core. 3 The 2D placements were automatically calculated by the drawing module of MOLGEN [1,5,6]. These pictures reveal the current inaccuracies of the employed placement algorithm [5,48] for combinatorial libraries; further work in this respect is in progress. 4. Classification of 3D-placements... |

11 |
Determining structural similarity of chemicals using graph-theoretic indices
- Basak, Magnuson, et al.
- 1988
Citation Context ...er a broad variety with the screening -- without requiring too many single substances. So in connection with combinatorial chemistry the notion of similarity has come into focus [15], [16], [18]-[20]. There is a vast number of ways to define and determine similarity of chemical entities. It turned out that similarity -- unlike isomorphism, for example -- cannot be defined generally. It depends, in fact, o... |

11 |
Construction of combinatorial objects – a tutorial. Bayreuther Mathem. Schr
- Laue
- 1993
Citation Context ... of the three cores only, the advantages of the mathematical concept behind algorithm 3.3 are obvious. The general Ansatz with an arbitrary permutation group and the efficient orderly generation (cf. [46,47]) allows a very rapid generation of the combinatorial libraries in all three cases. The computing speed is about 40 structures per second on a Pentium 90 MHz PC. Fig. 3 shows six molecules from each o... |

10 |
MOLGEN+, a generator of connectivity isomers and stereoisomers for molecular structure elucidation
- Benecke, Grund, et al.
- 1995
Citation Context ... it does not contain any of the elements of the badlist. Moreover, additional parts of MOLGEN make it possible to display the result of the generation and to compute a 3D placement (using a simplified MM2 energy model [5,8] together with numerical optimization, in a non-deterministic way, so that repeated calculations may arrive at different local minima). MOLGEN is also capable of generating all possible stereoisomers (... |

8 |
Measuring diversity: Experimental design of combinatorial libraries for drug discovery
- Martin, Blaney
- 1995
Citation Context ...sible to cover a broad variety with the screening -- without requiring too many single substances. So in connection with combinatorial chemistry the notion of similarity has come into focus [15], [16], [18]-[20]. There is a vast number of ways to define and determine similarity of chemical entities. It turned out that similarity -- unlike isomorphism, for example -- cannot be defined generally. It depends... |

8 |
Application of graph-theoretical parameters in quantifying molecular similarity and structure-activity studies
- Basak, Bertelsen, et al.
Citation Context ...out on the search for quantitative structure-activity relationships (QSAR), i.e. the search for empirical or theoretical parameters that are directly correlated to some biological response [18], [22]-[27]. Since empirical data are not always available [22,26] and experiments or quantum chemical calculations are expensive for larger sets of compounds, a lot of interest lies currently in the use of topo... |

8 |
Distance Geometry and Conformational Calculations
- Crippen
- 1981
Citation Context ...9,50] and comparative molecular field analysis (CoMFA) [50,51]. Here the library elements are first converted to three-dimensional structures by an appropriate method (like distance geometry programs [52], conformation analysis methods [53], expert systems [54,55] or force field calculations, e.g. [8]). The crucial feature such a program is required to have is that the computed conformation must be rea... |

7 |
Principles of the generation of constitutional and configurational isomers
- Wieland, Kerber, et al.
- 1996
Citation Context ...enerating all possible stereoisomers (configurational isomers) for a given constitutional formula, again exhaustively and redundancy-free (which, of course, also implies the consideration of symmetries) [6,9]. Spatial realizations of the stereoisomers constructed geometrically are displayed. MOLGEN can import and export files in MDL Molfile format and detect aromatic mesomers. 2 New features 2.1 The hydro... |

7 | Applications of Combinatorial Technologies to Drug Discovery. 1. Background and Peptide Combinatorial Libraries - Gallop, Barrett, et al. - 1994 |

7 | Applications of combinatorial technologies to drug discovery. 2. Combinatorial organic synthesis, library screening strategies, and future directions - Gordon, Barrett, et al. - 1994 |

6 |
Konstruktion molekularer Graphen mit gegebenen Hybridisierungen und überlappungsfreien Fragmenten
- Grund
- 1995
Citation Context ... result of NMR spectroscopy, i.e. to optionally input the hybridization states of carbon and hetero atoms, which also considerably reduces the number of isomers that must be generated and checked [2]. It will again be possible to give the exact distribution of the hybridization states or to enter just intervals. A draft of the input window is depicted in Fig. 2. 2. New features 3 The following tabl... |

6 | A QSAR investigation of the role of hydrophobicity in regulating mutagenicity in the Ames test: mutagenicity of aromatic and heterocyclic amines in Salmonella typhimurium TA98 - Debnath, Debnath, et al. - 1992 |

6 |
CONCORD: Rapid Generation of High Quality Approximate 3D Molecular Structures
- Pearlman
- 1987
Citation Context ...0,51]. Here the library elements are first converted to three-dimensional structures by an appropriate method (like distance geometry programs [52], conformation analysis methods [53], expert systems [54,55] or force field calculations, e.g. [8]). The crucial feature such a program is required to have is that the computed conformation must be reasonable for the active site, as the usual software packages ... |

5 |
Mode of action and the assessment of chemical hazards in the presence of limited data: use of structure-activity relationships (SAR) under TSCA, Section 5. Environ. Health Persp
- Auer, Nabholz, et al.
- 1990
Citation Context ...rried out on the search for quantitative structure-activity relationships (QSAR), i.e. the search for empirical or theoretical parameters that are directly correlated to some biological response [18], [22]-[27]. Since empirical data are not always available [22,26] and experiments or quantum chemical calculations are expensive for larger sets of compounds, a lot of interest lies currently in the use o... |

5 |
Use of molecular complexity indices in predictive pharmacology and toxicology: a QSAR approach
- Basak
- 1987
Citation Context ...ble [22,26] and experiments or quantum chemical calculations are expensive for larger sets of compounds, a lot of interest lies currently in the use of topological indices (or graph invariants) [18], [23], [28]-[31] as discrimination criteria and prediction tools. We used connectivity indices ks and k b for k = 0, 1, 2 after [18] which are sums over all paths of length k in the graphs, varying by the u... |

5 |
Should we have designs on topological indices
- Rouvray
- 1983
Citation Context ... shortest path from vertex $i$ to vertex $j$. The distance matrix can be calculated by the Floyd algorithm [32]. We also used the Wiener Index [33], the Balaban Index [34], the mean square distance index [35], which is due to Balaban and Motoc, information-theoretic indices [20,28], and the mean information content of distances that was developed by Bonchev and Trinajstić [36]. We calculated the index va... |

5 |
Heuristic approach for displaying chemical structures
- Shelley
- 1983
Citation Context ...her symmetry of the core. 3 The 2D placements were automatically calculated by the drawing module of MOLGEN [1,5,6]. These pictures reveal the current inaccuracies of the employed placement algorithm [5,48] for combinatorial libraries; further work in this respect is in progress. 4. Classification of 3D-placements 12 3.7 Screening One of our aims is to investigate the use of simulations in combinatorial... |

4 |
Abzählung und Konstruktion von Stereoisomeren
- Wieland
- 1994
Citation Context ...enerating all possible stereoisomers (configurational isomers) for a given constitutional formula, again exhaustively and redundancy-free (which, of course, also implies the consideration of symmetries) [6,9]. Spatial realizations of the stereoisomers constructed geometrically are displayed. MOLGEN can import and export files in MDL Molfile format and detect aromatic mesomers. 2 New features 2.1 The hydro... |

4 |
Konstruktionsalgorithmen bei molekularen Graphen und deren Anwendung
- Wieland
- 1997
Citation Context ...oms are most useful, it is reasonable to construct all possible overlaps of the goodlist entries and use these as macroatoms. This can be carried out by another module currently under development [10]. For example, the empirical formula C6H11NO has 13,982 isomers. If we prescribe the substructures C C NH 2 C C H OH C as goodlist entries, all these 13,982 isomers have to be generated in order t... |

4 |
New promise in combinatorial chemistry: synthesis, characterization, and screening of small-molecule libraries in solution
- Carell, Wintner, et al.
- 1995
Citation Context ....3 between step iii and step iv. In the laboratory, this restriction can be satisfied by an appropriate modification of the reaction conditions. As an example we consider the combinatorial libraries from [14]. The authors used as building blocks the twenty natural amino acids and as core structures some acid chlorides (structures IV, V and VI): a cubane derivative (s... |

4 |
Computer-assisted studies of chemical structure and biological function
- Stuper, Brugger, et al.
- 1979
Citation Context ... relationships (QSAR), i.e. the search for empirical or theoretical parameters that are directly correlated to some biological response [18], [22]-[27]. Since empirical data are not always available [22,26] and experiments or quantum chemical calculations are expensive for larger sets of compounds, a lot of interest lies currently in the use of topological indices (or graph invariants) [18], [23], [28]-... |

3 |
An algebraic model of constitutional chemistry as a basis for chemical computer programs
- Dugundji, Ugi
- 1973
Citation Context ...of the situation and is only used for a formalization of the construction problem discussed below. For more sophisticated purposes more comprehensive approaches like the algebra of be- and r-matrices of [44] are necessary. 3. Combinatorial chemistry 8 The condensation is represented by the mapping $\rho$ = ( 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 -1 -1 -1 -1 -1 ... |

2 |
Erkennung, Beschreibung und Visualisierung molekularer Strukturen
- Kerber, Laue, et al.
- 1996
Citation Context ... in terms of substructures, and so a substructure search is one of the main algorithms for their determination. We used a procedure from [42]. Other properties were calculated by methods described in [6,5,7]. In the evaluation the Tanimoto coefficient [43] is employed, which measures the similarity of two bit strings as $T_{i,j} = \frac{2 C_{i,j}}{E_i + E_j}$, where $C_{i,j}$ is the number of properties that are common in th... |

2 | Fast and permanent changes in preparative and pharmaceutical chemistry through multicomponent reactions and their 'libraries - Ugi - 1995 |

2 | Unique mathematical features of the substructure metric approach to quantitative molecular similarity analysis - M, Tsai - 1987 |

2 |
The use of structure generators in predictive pharmacology and toxicology
- Wieland
- 1996
Citation Context ... and experiments or quantum chemical calculations are expensive for larger sets of compounds, a lot of interest lies currently in the use of topological indices (or graph invariants) [18], [23], [28]-[31] as discrimination criteria and prediction tools. We used connectivity indices ks and k b for k = 0, 1, 2 after [18] which are sums over all paths of length k in the graphs, varying by the use of the a... |

2 |
Mathematical Simulations in Combinatorial Chemistry
- Wieland
- 1996
Citation Context ... mean information content of distances that was developed by Bonchev and Trinajstić [36]. We calculated the index values for the 20 natural amino acids (for the table of results and more details see [37]). Principal Component Analysis yields three factors explaining 93.2 % of the original variance. The results obtained show that structures which differ only slightly also have similar factors, e.g. as... |

2 |
Evaluation of low resolution mass spectra series. Max-Planck-Institut für Kohlenforschung, Mülheim/Ruhr
- MassLib
- 1992
Citation Context ... For the characterization of diversity, lists of binary properties can be considered, too. We took a subset of 120 descriptors from the structure codes of the mass spectra information system MassLib [38], the same as used in K. Varmuza's program ToSIM [39-41]. The following classes of properties are taken into account: aromatic compounds (e.g. substructure phenyl); branches in chains and ring... |

2 | Handbuch zu ToSIM (Software zur Untersuchung von topologischen Ähnlichkeiten in Molekülen). Technische Universität - Scsibrany, Varmuza - 1994 |

2 | Clusteranalyse isomerer chemischer Strukturen basierend auf binären Deskriptoren und der Hauptkomponentenanalyse - Varmuza, Scsibrany - 1994 |

2 | Computer-assisted structure elucidation of organic compounds, based on mass spectra classification and exhaustive isomer generation - Varmuza, Werther, et al. - 1996 |

2 |
Algorithmen zur Klassifizierung diskreter Strukturen
- Benecke
- 1996
Citation Context ...s (e.g. methyl ester). Many of these properties can be described in terms of substructures, and so a substructure search is one of the main algorithms for their determination. We used a procedure from [42]. Other properties were calculated by methods described in [6,5,7]. In the evaluation the Tanimoto coefficient [43] is employed, which measures the similarity of two bit strings as $T_{i,j} = 2 C_{i,j} / (E_i +$... |

2 |
The SCA program: An easy way for the conformational evaluation of polycyclic molecules
- Hoflack, Clercq
- 1988
Citation Context ...d analysis (CoMFA) [50,51]. Here the library elements are first converted to three-dimensional structures by an appropriate method (like distance geometry programs [52], conformation analysis methods [53], expert systems [54,55] or force field calculations, e.g. [8]). The crucial feature such a program is required to have is that the computed conformation must be reasonable for the active site, as the ... |

1 | MOLGEN, a Computer Algebra System for the Generation of Molecular Graphs - Benecke, Grund, et al. - 1995 |

1 |
A Hydrocarbon Force Field Utilizing V1 and V2 Torsional Terms
- Allinger
Citation Context ... it does not contain any of the elements of the badlist. Moreover, additional parts of MOLGEN make it possible to display the result of the generation and to compute a 3D placement (using a simplified MM2 energy model [5,8] together with numerical optimization, in a non-deterministic way, so that repeated calculations may arrive at different local minima). MOLGEN is also capable of generating all possible stereoisomers (... |

1 |
The chemical generation of molecular diversity
- Pavia
- 1995
Citation Context ...reactions make use of chemical, biological or biosynthetic procedures. The resulting set of molecules is called a combinatorial library. A crucial issue in combinatorial chemistry is diversity [15]-[17]. A large combinatorial library may fulfill the demand for making many compounds available; for an efficient analysis, however, it should be ensured that the elements of a library are not too simila... |

1 |
Applications for group actions applied to graph generation
- Grüner, T., et al.
- 1995
Citation Context ... of the three cores only, the advantages of the mathematical concept behind algorithm 3.3 are obvious. The general Ansatz with an arbitrary permutation group and the efficient orderly generation (cf. [46,47]) allows a very rapid generation of the combinatorial libraries in all three cases. The computing speed is about 40 structures per second on a Pentium 90 MHz PC. Fig. 3 shows six molecules from each o... |

1 |
Schlüssel zum Schloß. II. Hansch-Analyse, 3D-QSAR und De novo-Design. Pharmazie in unserer Zeit
- Kubinyi
- 1994
Citation Context ...by statistical methods. This correlation is afterwards employed for extrapolation to a larger set of compounds (like a combinatorial library, cf. [18,23,27,31]). More elaborate techniques are 3D QSAR [49,50] and comparative molecular field analysis (CoMFA) [50,51]. Here the library elements are first converted to three-dimensional structures by an appropriate method (like distance geometry programs [52],... |