Results 1 - 9 of 9
Scalable, Distributed Data Mining Using An Agent Based Architecture
- Proceedings of the Third International Conference on Knowledge Discovery and Data Mining, AAAI Press, Menlo Park, California, 1997
Cited by 44 (7 self)
Algorithm scalability and the distributed nature of both data and computation deserve serious attention in the context of data mining. This paper presents PADMA (PArallel Data Mining Agents), a parallel agent based system, that makes an effort to address these issues. PADMA contains modules for (1) parallel data accessing operations, (2) parallel hierarchical clustering, and (3) web-based data visualization. This paper describes the general architecture of PADMA and experimental results.
Hillol Kargupta, Ilker Hamzaoglu, Brian Stafford. Computational Science Methods Group, X Division, Los Alamos National Laboratory, P.O. Box 1663, MS F645, Los Alamos, NM 87545. LAUR-96-3491; shorter version published in the Proceedings of High Performance Computing'97 & Knowledge Discovery and Data Mining'97.
A binary linear programming formulation of the graph edit distance
- IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006
Cited by 8 (2 self)
A binary linear programming formulation of the graph edit distance for unweighted, undirected graphs with vertex attributes is derived and applied to a graph recognition problem. A general formulation for editing graphs is used to derive a graph edit distance that is proven to be a metric, provided the cost function for individual edit operations is a metric. Then, a binary linear program is developed for computing this graph edit distance, and polynomial time methods for determining upper and lower bounds on the solution of the binary program are derived by applying solution methods for standard linear programming and the assignment problem. A recognition problem of comparing a sample input graph to a database of known prototype graphs in the context of a chemical information system is presented as an application of the new method. The costs associated with various edit operations are chosen by using a minimum normalized variance criterion applied to pairwise distances between nearest neighbors in the database of prototypes. The new metric is shown to perform quite well in comparison to existing metrics when applied to a database of chemical graphs.
Index Terms: Graph algorithms, similarity measures, structural pattern recognition, graphs and networks, linear programming, continuation (homotopy) methods.
Stochastic algorithms for maximizing molecular diversity
- J. Chem. Inf. Comput. Sci., 1997
Cited by 7 (4 self)
A common problem in the emerging field of combinatorial drug design is the selection of an appropriate subset of compounds for chemical synthesis and biological evaluation. In this paper, we introduce a new family of selection algorithms that combine a stochastic search engine with a user-defined objective function that encodes any desirable selection criterion. The method is applied to the problem of maximizing molecular diversity, and the results are visualized using Sammon’s nonlinear mapping algorithm. By separating the search method from the performance metric, the method can be easily extended to perform complex multiobjective selections in advanced decision-support systems.
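The idea of separating the search method from the performance metric can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a plain random-swap hill climb as the stochastic search, and the names `stochastic_select` and `min_pairwise_distance` are hypothetical, introduced only for illustration. Compounds are stood in for by numeric descriptor tuples, and the objective is any callable that scores a candidate subset.

```python
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


def min_pairwise_distance(subset: Sequence[Point]) -> float:
    """Illustrative diversity objective (an assumption, not from the paper):
    the smallest Euclidean distance between any two members of the subset."""
    best = float("inf")
    for i in range(len(subset)):
        for j in range(i + 1, len(subset)):
            d = sum((a - b) ** 2 for a, b in zip(subset[i], subset[j])) ** 0.5
            best = min(best, d)
    return best


def stochastic_select(
    pool: Sequence[Point],
    k: int,
    objective: Callable[[Sequence[Point]], float],
    steps: int = 2000,
    seed: int = 0,
) -> Tuple[List[int], float]:
    """Random-swap hill climb: pick k items from pool that maximize objective.

    The search only calls objective(subset), so a different selection
    criterion can be plugged in without changing the search loop.
    """
    if not 0 < k < len(pool):
        raise ValueError("k must satisfy 0 < k < len(pool)")
    rng = random.Random(seed)
    chosen = rng.sample(range(len(pool)), k)
    chosen_set = set(chosen)
    unchosen = [i for i in range(len(pool)) if i not in chosen_set]
    score = objective([pool[i] for i in chosen])
    for _ in range(steps):
        ci = rng.randrange(k)
        ui = rng.randrange(len(unchosen))
        # Propose swapping one selected item with one unselected item.
        chosen[ci], unchosen[ui] = unchosen[ui], chosen[ci]
        new_score = objective([pool[i] for i in chosen])
        if new_score >= score:
            score = new_score  # keep the swap
        else:
            # Reject: undo the swap.
            chosen[ci], unchosen[ui] = unchosen[ui], chosen[ci]
    return sorted(chosen), score


if __name__ == "__main__":
    pool = [(0.0,), (1.0,), (2.0,), (10.0,)]
    indices, score = stochastic_select(pool, 2, min_pairwise_distance)
    print(indices, score)
```

Swapping `min_pairwise_distance` for another callable changes the selection criterion while the search loop stays untouched, which is the design point the abstract emphasizes.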
MOLecular Structure GENeration with MOLGEN, new features and future developments
- Fresenius J. Anal. Chem., 1997
Cited by 4 (4 self)
MOLGEN is a computer program system designed to generate molecular graphs quickly, exhaustively, and without redundancy. In the present paper we describe its basic features, new features of the current release MOLGEN 3.5, and future developments which provide considerable improvements and extensions.
1 Introduction
MOLGEN [1-7] is a generator for molecular graphs (= connectivity isomers or constitutional formulae) that generates all isomers corresponding to a given molecular formula and (optional) further conditions such as prescribed and forbidden substructures, ring sizes, etc. The input consists of
- the empirical formula, together with
- an optional list of macroatoms, which means prescribed substructures that must not overlap,
- an optional goodlist, that consists of prescribed substructures which may overlap,
- an optional badlist, containing forbidden substructures,
- an optional interval for the minimal and maximal size of rings,
- an optional num...
Introduction to Similarity Searching in Chemistry
- Institute of Organic Chemistry, Bulgarian Academy of Sciences, Sofia 1113, Bulgaria. MATCH Communications in Mathematical and in Computer Chemistry 51, 2004
Comparison of chemical clustering methods using graph- and fingerprint-based similarity measures
- J Mol Graph Model
Cited by 1 (0 self)
This paper compares several published methods for clustering chemical structures, using both fingerprint-based and graph-based similarity measures. The clusterings from each method were compared to determine the degree of cluster overlap. Each method was also evaluated on how well it grouped structures into clusters possessing a non-trivial substructural commonality. The methods which employ adjustable parameters were tested to determine the stability of each parameter for datasets of varying size and composition. Our experiments suggest that both fingerprint-based and graph-based similarity measures can be used effectively for generating chemical clusterings, and that the CAST method, recently proposed for the clustering of gene expression patterns, may also prove effective for the clustering of 2D chemical structures.
Exploiting Multiple Sources of Evidence of Document Relatedness in Hybrid Search Engines: A Unifying Model and Design Proposal
"... This is a draft of an article submitted (August 2001) for publication in ..."
Glossary of Terms Used in Computational Drug Design (IUPAC Recommendations 1997)
- Pure and Applied Chemistry; Chemistry and Human Health Division, Medicinal Chemistry Section, 1997
Membership of the Section during the period (1992-1995) when this report was prepared was as follows:

