From Boolean to Probabilistic Boolean Networks as Models of Genetic Regulatory Networks
- Proc. IEEE
, 2002
Cited by 45 (9 self)
Mathematical and computational modeling of genetic regulatory networks promises to uncover the fundamental principles governing biological systems in an integrative and holistic manner. It also paves the way toward the development of systematic approaches for effective therapeutic intervention in disease. The central theme in this paper is the Boolean formalism as a building block for modeling complex, large-scale, and dynamical networks of genetic interactions. We discuss the goals of modeling genetic networks as well as the data requirements. The Boolean formalism is justified from several points of view. We then introduce Boolean networks and discuss their relationships to nonlinear digital filters. The role of Boolean networks in understanding cell differentiation and cellular functional states is discussed. The inference of Boolean networks from real gene expression data is considered from the viewpoints of computational learning theory and nonlinear signal processing, touching on computational complexity of learning and robustness. Then, a discussion of the need to handle uncertainty in a probabilistic framework is presented, leading to an introduction of probabilistic Boolean networks and their relationships to Markov chains. Methods for quantifying the influence of genes on other genes are presented. The general question of the potential effect of individual genes on the global dynamical network behavior is considered using stochastic perturbation analysis. This discussion then leads into the problem of target identification for therapeutic intervention via the development of several computational tools based on first-passage times in Markov chains. Examples from biology are presented throughout the paper.
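As a rough illustration of the Boolean formalism this abstract describes, the sketch below updates a small Boolean network synchronously and follows a trajectory until a state repeats. The three-gene update rules and the names `RULES`, `step`, and `find_attractor` are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
from typing import Callable, Dict, List, Tuple

State = Tuple[int, ...]

# Hypothetical 3-gene network: each gene's next value is a Boolean
# function of the current state (x0, x1, x2).
RULES: List[Callable[[State], int]] = [
    lambda s: s[1] & s[2],  # gene 0 is on only if genes 1 and 2 are both on
    lambda s: s[0] | s[2],  # gene 1 is on if gene 0 or gene 2 is on
    lambda s: 1 - s[0],     # gene 2 is repressed by gene 0
]

def step(state: State) -> State:
    """Apply every update rule to the same current state (synchronous update)."""
    return tuple(rule(state) for rule in RULES)

def find_attractor(start: State) -> List[State]:
    """Iterate from `start` until a state repeats; return the cycle that was reached."""
    seen: Dict[State, int] = {}
    path: List[State] = []
    state = start
    while state not in seen:
        seen[state] = len(path)
        path.append(state)
        state = step(state)
    # Everything from the first visit of the repeated state onward is the cycle.
    return path[seen[state]:]
```

Because the state space is finite and the update is deterministic, every trajectory eventually enters such a cycle. A probabilistic Boolean network would, roughly speaking, choose among several candidate rule sets at each step, which is what ties it to Markov chains.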
An Implementation of Logical Analysis of Data
- IEEE Transactions on Knowledge and Data Engineering
, 2000
Cited by 41 (24 self)
The paper describes a new, logic-based methodology for analyzing observations. The key features of the Logical Analysis of Data (LAD) are the discovery of minimal sets of features necessary for explaining all observations and the detection of hidden patterns in the data capable of distinguishing observations describing positive outcome events from negative outcome events. Combinations of such patterns are used for developing general classification procedures. An implementation of this methodology is described in the paper along with the results of numerical experiments demonstrating the classification performance of LAD in comparison with the reported results of other procedures. In the final section, we describe three pilot studies on applications of LAD to oil exploration, psychometric testing, and the analysis of developments in the Chinese transitional economy. These pilot studies demonstrate not only the classification power of LAD, but also its flexibility and capability t...
A Continuous Approach to Inductive Inference
- Mathematical Programming
, 1992
Cited by 38 (2 self)
In this paper we describe an interior point mathematical programming approach to inductive inference. We list several versions of this problem and study in detail the formulation based on hidden Boolean logic. We consider the problem of identifying a hidden Boolean function $F : \{0,1\}^n \to \{0,1\}$ using outputs obtained by applying a limited number of random inputs to the hidden function. Given this input-output sample, we give a method to synthesize a Boolean function that describes the sample. We pose the Boolean Function Synthesis Problem as a particular type of Satisfiability Problem. The Satisfiability Problem is translated into an integer programming feasibility problem, that is solved with an interior point algorithm for integer programming. A similar integer programming implementation has been used in a previous study to solve randomly generated instances of the Satisfiability Problem. In this paper we introduce a new variant of this algorithm, where the Riemannian metric used...
Logical Analysis of Numerical Data
- Mathematical Programming
, 2000
Cited by 36 (12 self)
The "Logical Analysis of Data" (LAD) is a methodology developed since the late eighties, aimed at discovering hidden structural information in data sets. LAD was originally developed for analyzing binary data by using the theory of partially defined Boolean functions. An extension of LAD for the analysis of numerical data sets is achieved through the process of "binarization", which consists of replacing each numerical variable by binary "indicator" variables, each showing whether the value of the original variable is above or below a certain level. Binarization was successfully applied to the analysis of a variety of real-life data sets. This paper develops the theoretical foundations of the binarization process, studying the combinatorial optimization problems related to the minimization of the number of binary variables. To provide an algorithmic framework for the practical solution of such problems, we construct compact linear integer programming formulations of them. We develop...
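The binarization step described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's procedure: it assumes that candidate levels are the midpoints between consecutive distinct observed values, and it makes no attempt to minimize the number of binary variables. The function names `cut_points` and `binarize` are hypothetical.

```python
from typing import List, Sequence

def cut_points(values: Sequence[float]) -> List[float]:
    """Candidate levels: midpoints between consecutive distinct observed values.

    Assumption for illustration only; the paper studies how to choose a
    small set of levels, which this sketch does not do.
    """
    distinct = sorted(set(values))
    return [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]

def binarize(value: float, cuts: Sequence[float]) -> List[int]:
    """One indicator per level: 1 if the value is at or above it, else 0."""
    return [1 if value >= c else 0 for c in cuts]
```

For the observed values 1.0, 3.0, 3.0, and 7.0, the candidate levels are 2.0 and 5.0, so the value 3.0 is encoded by the indicator pair (1, 0).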
Error-Free and Best-Fit Extensions of Partially Defined Boolean Functions
, 1997
Cited by 33 (16 self)
In this paper, we address a fundamental problem related to the induction of Boolean logic: Given a set of data, represented as a set of binary "true n-vectors" (or "positive examples") and a set of "false n-vectors" (or "negative examples"), we establish a Boolean function (or an extension) f, so that f is true (resp., false) in every given true (resp., false) vector. We shall further require that such an extension belongs to a certain specified class of functions, e.g., class of positive functions, class of Horn functions and so on. The class of functions represents our a priori knowledge or hypothesis about the extension f, which may be obtained from experience or from the analysis of mechanisms that may or may not cause the phenomena under consideration. The real-world data may contain errors, e.g., measurement and classification errors might come in when obtaining data, or there may be some other influential factors not represented as variables in the vectors. In such situations,...
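A small sketch may help make the extension problem concrete. It relies on two standard facts rather than on anything specific to the paper's algorithms: an unrestricted extension exists exactly when no vector is labeled both true and false, and a positive (monotone nondecreasing) extension exists exactly when no false vector is componentwise greater than or equal to a true vector. The function names are hypothetical.

```python
from typing import Iterable, Sequence, Tuple

Vector = Tuple[int, ...]

def has_extension(true_vecs: Iterable[Vector], false_vecs: Iterable[Vector]) -> bool:
    """Some Boolean function agrees with the data iff no vector is labeled both ways."""
    return not (set(true_vecs) & set(false_vecs))

def leq(a: Sequence[int], b: Sequence[int]) -> bool:
    """Componentwise comparison a <= b."""
    return all(x <= y for x, y in zip(a, b))

def has_positive_extension(true_vecs: Sequence[Vector],
                           false_vecs: Sequence[Vector]) -> bool:
    """A positive extension exists iff no true vector a and false vector b satisfy a <= b."""
    return not any(leq(a, b) for a in true_vecs for b in false_vecs)
```

When the second check fails, no error-free positive extension exists, which is the kind of situation where a best-fit extension, one that misclassifies as few vectors as possible, becomes the natural question.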
On the Decomposition of Polychotomies Into Dichotomies
, 1996
Cited by 26 (5 self)
Many important classification problems are polychotomies, i.e. the data are organized into K classes with K > 2. Given an unknown function $F : \Omega \to \{1, \dots, K\}$ representing a polychotomy, an algorithm aimed at "learning" this polychotomy will produce an approximation of F, based on the knowledge of a set of pairs $\{(x_p, F(x_p))\}_{p=1}^{P}$. Although in the wide variety of learning tools there exist some learning algorithms capable of handling polychotomies, many of the interesting tools were designed by nature for dichotomies (K = 2). Therefore, many researchers are compelled to use techniques to decompose a polychotomy into a series of dichotomies in order to apply their favorite algorithms to the resolution of a general problem. A decomposition method based on error-correcting codes has recently been proposed and shown to be very efficient. However, this decomposition is designed only on the basis of K without taking the data into account. In this paper, we explore alter...
Improved Pairwise Coupling Classification with Correcting Classifiers
, 1997
Cited by 19 (3 self)
The benefits obtained from decomposing a classification task involving several classes into a set of smaller classification problems involving only two classes, usually called dichotomies, have been shown on various occasions. Among the multiple ways of applying this decomposition, Pairwise Coupling is one of the best known. Its principle is to separate a pair of classes in each binary subproblem, ignoring the remaining ones, resulting in a decomposition scheme containing as many subproblems as there are possible pairs of classes in the original task. Pairwise Coupling decomposition has so far been used in different applications. In this paper, various ways of recombining the outputs of all the classifiers solving the existing subproblems are explored, and an important handicap intrinsic to the scheme is exposed: the classification makes use of irrelevant information. A solution for this problem is suggested and it is shown how it can significa...
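The decomposition can be sketched in a few lines. The recombination rule used here, simple majority voting, is only one possible choice and is an assumption of this sketch, not the method proposed in the paper. The helper `nearest_center` is a hypothetical stand-in for a trained binary classifier.

```python
from collections import Counter
from itertools import combinations
from typing import Callable, Dict, Hashable, List, Sequence, Tuple

Pair = Tuple[Hashable, Hashable]

def pairwise_subproblems(classes: Sequence[Hashable]) -> List[Pair]:
    """One binary subproblem per unordered pair of classes: K*(K-1)/2 in total."""
    return list(combinations(classes, 2))

def nearest_center(centers: Dict[Hashable, float], a: Hashable,
                   b: Hashable) -> Callable[[float], Hashable]:
    """Hypothetical stand-in for a trained binary classifier separating a from b."""
    return lambda x: a if abs(x - centers[a]) <= abs(x - centers[b]) else b

def vote(sample: float, classes: Sequence[Hashable],
         classifiers: Dict[Pair, Callable[[float], Hashable]]) -> Hashable:
    """Recombine the binary outputs by simple majority voting (assumed rule)."""
    tally = Counter(classifiers[pair](sample)
                    for pair in pairwise_subproblems(classes))
    return tally.most_common(1)[0][0]

# Toy one-dimensional setup with three classes (illustrative values only).
centers = {"a": 0.0, "b": 5.0, "c": 10.0}
classes = list(centers)
classifiers = {(p, q): nearest_center(centers, p, q)
               for p, q in pairwise_subproblems(classes)}
```

In this toy setup a sample at 4.0 is assigned to class "b", yet the ("a", "c") classifier still casts a vote for "a" even though the sample belongs to neither of its classes. This seems to be the kind of irrelevant information the abstract refers to.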
Predicting Cause-Effect Relationships from Incomplete Discrete Observations
- SIAM Journal on Discrete Mathematics
, 1991
Cited by 18 (10 self)
We address a prediction problem that frequently occurs in practice. We wish to predict the value of a function on the basis of discrete observational data that are incomplete in two senses. Only certain arguments of the function are observed, and the function value is observed only for certain combinations of values of these arguments. We solve the problem under a monotonicity condition that is natural in many applications, and we discuss applications to tax auditing, medicine, and real estate valuation. In particular, we display a special class of problems for which the best monotone prediction can be found in polynomial time. 1 Introduction The problem of establishing cause-effect relationship based on incomplete observations was studied in [4]. In this paper we address the problem of finding a good approximation of an unknown discrete function on the basis of a set of observations, which is incomplete in two senses. We observe the values of only GSIA Working Paper 1991...
Coronary Risk Prediction by Logical Analysis of Data
, 2002
Cited by 16 (9 self)
The objective of this study was to distinguish, within a population of patients with known or suspected coronary artery disease, groups with high and low mortality rates. The study was based on Cleveland Clinic Foundation's dataset of 9454 patients, of whom 312 died during an observation period of 9 years. The Logical Analysis of Data method was adapted to handle the disproportionate sizes of the two groups of patients and the inseparable character of this dataset, which is characteristic of many medical problems. As a result of the study, we have identified a high-risk group of patients representing 1/5 of the population, with a mortality rate 4 times higher than the average, and including 3/4 of the patients who died. The low-risk group identified in the study, representing approximately 4/5 of the population, had a mortality rate 3 times lower than the average.
Accelerated Algorithm For Pattern Detection In Logical Analysis Of Data
, 2001
Cited by 13 (9 self)
Sets of "positive" and "negative" points (observations) in n-dimensional discrete space given along with their non-negative integer multiplicities are analyzed from the perspective of the Logical Analysis of Data (LAD). A set of observations satisfying upper and/or lower bounds imposed on certain components is called a positive pattern if it contains some positive observations and no negative one. The number of variables on which such restrictions are imposed is called the degree of the pattern. A total polynomial algorithm is proposed for the enumeration of all patterns of limited degree, and special efficient variants of it for the enumeration of all patterns with certain "sign" and "coverage" requirements are presented and evaluated on a publicly available collection of benchmark datasets. Acknowledgements: The partial support of the Office of Naval Research (Grant N00014-92-J-1375) and the Center for Discrete Mathematics and Computer Science are gratefully acknowledged. RRR 59-2001
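The definitions of a positive pattern and its degree given in this abstract can be checked directly, as in the sketch below. It only tests a single candidate pattern and ignores the multiplicities mentioned in the abstract; it does not implement the enumeration algorithm. The representation of bounds as a dictionary and all function names are assumptions made for illustration.

```python
from typing import Dict, Optional, Sequence, Tuple

Point = Tuple[int, ...]
# Maps a component index to (lower, upper); None means no bound on that side.
Bounds = Dict[int, Tuple[Optional[int], Optional[int]]]

def covers(bounds: Bounds, point: Point) -> bool:
    """True if the point satisfies every lower and upper bound."""
    for i, (lo, hi) in bounds.items():
        if lo is not None and point[i] < lo:
            return False
        if hi is not None and point[i] > hi:
            return False
    return True

def is_positive_pattern(bounds: Bounds, positives: Sequence[Point],
                        negatives: Sequence[Point]) -> bool:
    """Covers at least one positive observation and no negative one."""
    return (any(covers(bounds, p) for p in positives)
            and not any(covers(bounds, n) for n in negatives))

def degree(bounds: Bounds) -> int:
    """Number of components that carry at least one bound."""
    return sum(1 for lo, hi in bounds.values()
               if lo is not None or hi is not None)
```

For example, with positive points (3, 1) and (4, 0) and negative points (1, 1) and (3, 5), the bounds "component 0 at least 2, component 1 at most 2" form a positive pattern of degree 2, while dropping the second bound lets the negative point (3, 5) in.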

