Learning the Kernel Matrix with Semi-Definite Programming
2002
Cited by 368 (16 self)
Kernel-based learning algorithms work by embedding the data into a Euclidean space, and then searching for linear relations among the embedded data points. The embedding is performed implicitly, by specifying the inner products between each pair of points in the embedding space. This information is contained in the so-called kernel matrix, a symmetric and positive definite matrix that encodes the relative positions of all points. Specifying this matrix amounts to specifying the geometry of the embedding space and inducing a notion of similarity in the input space---classical model selection problems in machine learning. In this paper we show how the kernel matrix can be learned from data via semi-definite programming (SDP) techniques. When applied
Selective sampling using the Query by Committee algorithm
Machine Learning, 1997
Cited by 256 (6 self)
We analyze the "query by committee" algorithm, a method for filtering informative queries from a random stream of inputs. We show that if the two-member committee algorithm achieves information gain with positive lower bound, then the prediction error decreases exponentially with the number of queries. We show that, in particular, this exponential decrease holds for query learning of perceptrons.
A Critical Point For Random Graphs With A Given Degree Sequence
2000
Cited by 209 (5 self)
Given a sequence of non-negative real numbers $\lambda_0, \lambda_1, \ldots$ which sum to 1, we consider random graphs having approximately $\lambda_i n$ vertices of degree $i$. Essentially, we show that if $\sum_i i(i-2)\lambda_i > 0$ then such graphs almost surely have a giant component, while if $\sum_i i(i-2)\lambda_i < 0$ then almost surely all components in such graphs are small. We can apply these results to $G_{n,p}$, $G_{n,M}$, and other well-known models of random graphs. There are also applications related to the chromatic number of sparse random graphs.
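As an illustration of the criterion in this abstract, the following minimal sketch evaluates the sum of i(i - 2) times the fraction of vertices of degree i, and reports its sign. The function name `molloy_reed_sum` and the two example distributions are assumptions made for illustration; they do not come from the paper, and the paper's theorem carries technical conditions (signalled by "essentially") that this sketch ignores.

```python
def molloy_reed_sum(fractions):
    """Return sum over i of i * (i - 2) * fractions[i].

    fractions[i] is the (assumed) fraction of vertices with degree i;
    the fractions must sum to 1.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("degree fractions must sum to 1")
    return sum(i * (i - 2) * p for i, p in enumerate(fractions))


# Hypothetical example distributions (not taken from the paper).
mostly_degree_one = [0.0, 0.8, 0.2]      # 80% degree 1, 20% degree 2
three_regular = [0.0, 0.0, 0.0, 1.0]     # every vertex has degree 3

for name, dist in [("mostly degree 1", mostly_degree_one),
                   ("3-regular", three_regular)]:
    value = molloy_reed_sum(dist)
    regime = "giant component regime" if value > 0 else "small components regime"
    print(f"{name}: sum = {value:+.2f} ({regime})")
```

Only degree-1 vertices contribute negatively to the sum, and degree-2 vertices contribute nothing, so a distribution needs enough vertices of degree 3 or more to push the sum above zero.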
A framework for learning predictive structures from multiple tasks and unlabeled data
Journal of Machine Learning Research, 2005
Cited by 202 (2 self)
One of the most important issues in machine learning is whether one can improve the performance of a supervised learning algorithm by including unlabeled data. Methods that use both labeled and unlabeled data are generally referred to as semi-supervised learning. Although a number of such methods are proposed, at the current stage, we still don’t have a complete understanding of their effectiveness. This paper investigates a closely related problem, which leads to a novel approach to semi-supervised learning. Specifically we consider learning predictive structures on hypothesis spaces (that is, what kind of classifiers have good predictive power) from multiple learning tasks. We present a general framework in which the structural learning problem can be formulated and analyzed theoretically, and relate it to learning with unlabeled data. Under this framework, algorithms for structural learning will be proposed, and computational issues will be investigated. Experiments will be given to demonstrate the effectiveness of the proposed algorithms in the semi-supervised learning setting.
Models of Random Regular Graphs
In Surveys in combinatorics, 1999
Cited by 137 (33 self)
In a previous paper we showed that a random 4-regular graph asymptotically almost surely (a.a.s.) has chromatic number 3. Here we extend the method to show that a random 6-regular graph a.a.s. has chromatic number 4 and that the chromatic number of a random d-regular graph for other d between 5 and 10 inclusive is a.a.s. restricted to a range of two integer values: {3, 4} for d = 5, {4, 5} for d = 7, 8, 9, and {5, 6} for d = 10. The proof uses efficient algorithms which a.a.s. colour these random graphs using the number of colours specified by the upper bound. These algorithms are analysed using the differential equation method, including an analysis of certain systems of differential equations with discontinuous right hand sides.
Sybilguard: Defending against sybil attacks via social networks
In ACM SIGCOMM ’06, 2006
Cited by 126 (5 self)
Peer-to-peer and other decentralized, distributed systems are known to be particularly vulnerable to sybil attacks. In a sybil attack, a malicious user obtains multiple fake identities and pretends to be multiple, distinct nodes in the system. By controlling a large fraction of the nodes in the system, the malicious user is able to “out vote” the honest users in collaborative tasks such as Byzantine failure defenses. This paper presents SybilGuard, a novel protocol for limiting the corruptive influences of sybil attacks. Our protocol is based on the “social network” among user identities, where an edge between two identities indicates a human-established trust relationship. Malicious users can create many identities but few trust relationships. Thus, there is a disproportionately small “cut” in the graph between the sybil nodes and the honest nodes. SybilGuard exploits this property to bound the number of identities a malicious user can create. We show the effectiveness of SybilGuard both analytically and experimentally.
Tail Bounds for Occupancy and the Satisfiability Threshold Conjecture
1995
Cited by 94 (1 self)
The classical occupancy problem is concerned with studying the number of empty bins resulting from a random allocation of m balls to n bins. We provide a series of tail bounds on the distribution of the number of empty bins. These tail bounds should find application in randomized algorithms and probabilistic analysis. Our motivating application is the following well-known conjecture on threshold phenomenon for the satisfiability problem. Consider random 3-SAT formulas with cn clauses over n variables, where each clause is chosen uniformly and independently from the space of all clauses of size 3. It has been conjectured that there is a sharp threshold for satisfiability at $c \approx 4.2$. We provide a strong upper bound on the value of c, showing that for $c > 4.758$ a random 3-SAT formula is unsatisfiable with high probability. This result is based on a structural property, possibly of independent interest, whose proof needs several applications of the occupancy tail bounds. Supporte...
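To make the occupancy setting in this abstract concrete, here is a minimal simulation sketch: it throws m balls into n bins uniformly at random and compares the average number of empty bins with the standard expectation n(1 - 1/n)^m. The function names, the seed, and the choice m = n = 1000 are assumptions for illustration only; the sketch does not reproduce the paper's tail bounds.

```python
import random


def count_empty_bins(m, n, rng):
    """Throw m balls into n bins uniformly at random; return the number of empty bins."""
    occupied = {rng.randrange(n) for _ in range(m)}
    return n - len(occupied)


def expected_empty_bins(m, n):
    """Exact expectation: each bin is missed by all m balls with probability (1 - 1/n)**m."""
    return n * (1 - 1 / n) ** m


# Hypothetical parameters chosen for illustration.
m, n, trials = 1000, 1000, 200
rng = random.Random(0)  # fixed seed so the run is repeatable

samples = [count_empty_bins(m, n, rng) for _ in range(trials)]
mean_empty = sum(samples) / trials

print(f"simulated mean of empty bins: {mean_empty:.1f}")
print(f"expected number of empty bins: {expected_empty_bins(m, n):.1f}")
print(f"observed range over {trials} trials: {min(samples)} to {max(samples)}")
```

The narrow observed range around the mean is the kind of concentration that tail bounds on the number of empty bins quantify.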
On Talagrand's Deviation Inequalities For Product Measures
1996
Cited by 69 (0 self)
We present a new and simple approach to some of the deviation inequalities for product measures deeply investigated by M.

