Results 1 - 10
of
431
Clustering Gene Expression Patterns
, 1999
"... Recent advances in biotechnology allow researchers to measure expression levels for thousands of genes simultaneously, across different conditions and over time. Analysis of data produced by such experiments offers potential insight into gene function and regulatory mechanisms. A key step in the ana ..."
Abstract
-
Cited by 273 (10 self)
- Add to MetaCart
Recent advances in biotechnology allow researchers to measure expression levels for thousands of genes simultaneously, across different conditions and over time. Analysis of data produced by such experiments offers potential insight into gene function and regulatory mechanisms. A key step in the analysis of gene expression data is the detection of groups of genes that manifest similar expression patterns. The corresponding algorithmic problem is to cluster multi-condition gene expression patterns. In this paper we describe a novel clustering algorithm that was developed for analysis of gene expression data. We define an appropriate stochastic error model on the input, and prove that under the conditions of the model, the algorithm recovers the cluster structure with high probability. The running time of the algorithm on an n-gene dataset is O(n 2 (log(n)) c ). We also present a practical heuristic based on the same algorithmic ideas. The heuristic was implemented and its p...
Learning in the Presence of Malicious Errors
- SIAM Journal on Computing
, 1993
"... In this paper we study an extension of the distribution-free model of learning introduced by Valiant [23] (also known as the probably approximately correct or PAC model) that allows the presence of malicious errors in the examples given to a learning algorithm. Such errors are generated by an advers ..."
Abstract
-
Cited by 158 (12 self)
- Add to MetaCart
In this paper we study an extension of the distribution-free model of learning introduced by Valiant [23] (also known as the probably approximately correct or PAC model) that allows the presence of malicious errors in the examples given to a learning algorithm. Such errors are generated by an adversary with unbounded computational power and access to the entire history of the learning algorithm's computation. Thus, we study a worst-case model of errors. Our results include general methods for bounding the rate of error tolerable by any learning algorithm, efficient algorithms tolerating nontrivial rates of malicious errors, and equivalences between problems of learning with errors and standard combinatorial optimization problems. 1 Introduction In this paper, we study a practical extension to Valiant's distribution-free model of learning: the presence of errors (possibly maliciously generated by an adversary) in the sample data. The distribution-free model typically makes the idealize...
Approximation Algorithms for Disjoint Paths Problems
, 1996
"... The construction of disjoint paths in a network is a basic issue in combinatorial optimization: given a network, and specified pairs of nodes in it, we are interested in finding disjoint paths between as many of these pairs as possible. This leads to a variety of classical NP-complete problems for w ..."
Abstract
-
Cited by 122 (0 self)
- Add to MetaCart
The construction of disjoint paths in a network is a basic issue in combinatorial optimization: given a network, and specified pairs of nodes in it, we are interested in finding disjoint paths between as many of these pairs as possible. This leads to a variety of classical NP-complete problems for which very little is known from the point of view of approximation algorithms. It has recently been brought into focus in work on problems such as VLSI layout and routing in high-speed networks; in these settings, the current lack of understanding of the disjoint paths problem is often an obstacle to the design of practical heuristics.
On Hiding Information from an Oracle
, 1989
"... : We consider the problem of computing with encrypted data. Player A wishes to know the value f(x) for some x but lacks the power to compute it. Player B has the power to compute f and is willing to send f(y) to A if she sends him y, for any y. Informally, an encryption scheme for the problem f is a ..."
Abstract
-
Cited by 119 (15 self)
- Add to MetaCart
: We consider the problem of computing with encrypted data. Player A wishes to know the value f(x) for some x but lacks the power to compute it. Player B has the power to compute f and is willing to send f(y) to A if she sends him y, for any y. Informally, an encryption scheme for the problem f is a method by which A, using her inferior resources, can transform the cleartext instance x into an encrypted instance y, obtain f(y) from B, and infer f(x) from f(y) in such a way that B cannot infer x from y. When such an encryption scheme exists, we say that f is encryptable. The framework defined in this paper enables us to prove precise statements about what an encrypted instance hides and what it leaks, in an information-theoretic sense. Our definitions are cast in the language of probability theory and do not involve assumptions such as the intractability of factoring or the existence of one-way functions. We use our framework to describe encryption schemes for some well-known function...
A Lower Bound for Radio Broadcast
, 1991
"... A radio network is a synchronous network of processors that communicate by transmitting messages to their neighbors, where a processor receives a message in a given step if and only if it is silent in this step and precisely one of its neighbors transmits. In this paper we prove the existence of a f ..."
Abstract
-
Cited by 115 (4 self)
- Add to MetaCart
A radio network is a synchronous network of processors that communicate by transmitting messages to their neighbors, where a processor receives a message in a given step if and only if it is silent in this step and precisely one of its neighbors transmits. In this paper we prove the existence of a family of radius-2 networks on n vertices for which any broadcast schedule requires at least sZ(log * n) rounds of transmissions. This matches an upper bound of O(log * n) rounds for networks of radius 2 proved earlier by Bar-Yehuda, Goldreich, and Itai, in
Competitive auctions and digital goods
- In Proc. 12th Symp. on Discrete Alg
, 2001
"... Abstract We study a class of single round, sealed bid auctions for items in unlimited supply such as digital goods. We focus on auctions that are truthful and competitive. Truthful auctions encourage bidders to bid their utility; competitive auctions yield revenue within a constant factor of the rev ..."
Abstract
-
Cited by 113 (25 self)
- Add to MetaCart
Abstract We study a class of single round, sealed bid auctions for items in unlimited supply such as digital goods. We focus on auctions that are truthful and competitive. Truthful auctions encourage bidders to bid their utility; competitive auctions yield revenue within a constant factor of the revenue for optimal fixed pricing. We show that for any truthful auction, even a multi-price auction, the expected revenue does not exceed that for optimal fixed pricing. We also give a bound on how far the revenue for optimal fixed pricing can be from the total market utility. We show that several randomized auctions are truthful and competitive under certain assumptions, and that no truthful deterministic auction is competitive. We present simulation results which confirm that our auctions compare favorably to fixed pricing. Some of our results extend to bounded supply markets, for which we also get truthful and competitive auctions.
A Randomized Linear-Time Algorithm to Find Minimum Spanning Trees
, 1994
"... We present a randomized linear-time algorithm to find a minimum spanning tree in a connected graph with edge weights. The algorithm uses random sampling in combination with a recently discovered linear-time algorithm for verifying a minimum spanning tree. Our computational model is a unit-cost ra ..."
Abstract
-
Cited by 98 (7 self)
- Add to MetaCart
We present a randomized linear-time algorithm to find a minimum spanning tree in a connected graph with edge weights. The algorithm uses random sampling in combination with a recently discovered linear-time algorithm for verifying a minimum spanning tree. Our computational model is a unit-cost random-access machine with the restriction that the only operations allowed on edge weights are binary comparisons.
Chernoff-Hoeffding Bounds for Applications with Limited Independence
- SIAM J. Discrete Math
, 1993
"... Chernoff--Hoeffding bounds are fundamental tools used in bounding the tail probabilities of the sums of bounded and independent random variables. We present a simple technique which gives slightly better bounds than these, and which more importantly requires only limited independence among the rando ..."
Abstract
-
Cited by 88 (10 self)
- Add to MetaCart
Chernoff--Hoeffding bounds are fundamental tools used in bounding the tail probabilities of the sums of bounded and independent random variables. We present a simple technique which gives slightly better bounds than these, and which more importantly requires only limited independence among the random variables, thereby importing a variety of standard results to the case of limited independence for free. Additional methods are also presented, and the aggregate results are sharp and provide a better understanding of the proof techniques behind these bounds. They also yield improved bounds for various tail probability distributions and enable improved approximation algorithms for jobshop scheduling. The "limited independence" result implies that a reduced amount of randomness and weaker sources of randomness are sufficient for randomized algorithms whose analyses use the Chernoff--Hoeffding bounds, e.g., the analysis of randomized algorithms for random sampling and oblivious packet routi...
Practical Bayesian Density Estimation Using Mixtures Of Normals
- Journal of the American Statistical Association
, 1995
"... this paper, we propose some solutions to these problems. Our goal is to come up with a simple, practical method for estimating the density. This is an interesting problem in its own right, as well as a first step towards solving other inference problems, such as providing more flexible distributions ..."
Abstract
-
Cited by 88 (2 self)
- Add to MetaCart
this paper, we propose some solutions to these problems. Our goal is to come up with a simple, practical method for estimating the density. This is an interesting problem in its own right, as well as a first step towards solving other inference problems, such as providing more flexible distributions in hierarchical models. To see why the posterior is improper under the usual reference prior, we write the model in the following way. Let Z = (Z 1 ; : : : ; Z n ) and X = (X 1 ; : : : ; X n ). The Z
The Load, Capacity and Availability of Quorum Systems
, 1998
"... A quorum system is a collection of sets (quorums) every two of which intersect. Quorum systems have been used for many applications in the area of distributed systems, including mutual exclusion, data replication and dissemination of information Given a strategy to pick quorums, the load L(S) is th ..."
Abstract
-
Cited by 86 (12 self)
- Add to MetaCart
A quorum system is a collection of sets (quorums) every two of which intersect. Quorum systems have been used for many applications in the area of distributed systems, including mutual exclusion, data replication and dissemination of information Given a strategy to pick quorums, the load L(S) is the minimal access probability of the busiest element, minimizing over the strategies. The capacity Cap(S) is the highest quorum accesses rate that S can handle, so Cap(S) = 1=L(S).

