Results 1 - 10 of 16
Online Clustering of Processes
Cited by 12 (11 self)
The problem of online clustering is considered in the case where each data point is a sequence generated by a stationary ergodic process. Data arrive in an online fashion so that the sample received at every timestep is either a continuation of some previously received sequence or a new sequence. The dependence between the sequences can be arbitrary. No parametric or independence assumptions are made; the only assumption is that the marginal distribution of each sequence is stationary and ergodic. A novel, computationally efficient algorithm is proposed and is shown to be asymptotically consistent (under a natural notion of consistency). The performance of the proposed algorithm is evaluated on simulated data, as well as on real datasets (motion classification).
Asymptotically consistent estimation of the number of change points in highly dependent time series
The problem of change point estimation is considered in a general framework where the data are generated by arbitrary unknown stationary ergodic process distributions. This means that the data may have long-range dependencies of an arbitrary form. In this context the consistent estimation of the number of change points is provably impossible. A formulation is proposed which overcomes this obstacle: it is possible to find the correct number of change points at the expense of introducing the additional constraint that the correct number of process distributions that generate the data is provided. This additional parameter has a natural interpretation in many real-world applications. It turns out that in this formulation change point estimation can be reduced to time series clustering. Based on this reduction, an algorithm is proposed that finds the number of change points and locates the changes. This algorithm is shown to be asymptotically consistent. The theoretical results are complemented with empirical evaluations.
Online Clustering of Processes
Setup: We have a growing body of sequences of data. Each sequence is generated by one of k unknown discrete-time stochastic processes. The number k of distributions is known. Data are observed in an online fashion: new samples arrive at every timestep, and they are either continuations of previously received sequences or new sequences.
Goal: Cluster the sequences at every timestep.
CONSISTENCY: In general it is hard to give a precise definition of “correct clustering”. However, a natural notion of correct clustering exists in the considered setting: sequences generated by the same process distribution should be grouped together.
Asymptotic Consistency: A clustering algorithm is (asymptotically) consistent if, with probability 1, for each N ∈ ℕ, from some time on the first N observed sequences are clustered correctly.
ASSUMPTIONS ON DATA:
• Data are revealed in an arbitrary fashion.
• Our only assumption is that the distributions generating the data are stationary ergodic. The samples are allowed to be dependent, and the dependence can be arbitrary or even adversarial. No assumptions such as i.i.d. or Markov are made.
Remark: In the time-series literature, it is typically assumed that the distributions generating the data have a known form (e.g., Gaussian, HMMs) and that the samples are independent.
MAIN THEORETICAL RESULT: Theorem: There exists an online clustering algorithm that is asymptotically consistent provided that the distributions generating the data are stationary and ergodic.
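The notion of correct clustering stated in this abstract can be made concrete with a small check. The following is a minimal sketch, not the paper's algorithm: it only tests whether a proposed clustering of the first N observed sequences groups together exactly those sequences that share a generating process. The function names and the representation of a clustering as a list of labels are assumptions made for illustration.

```python
from collections import defaultdict


def partition(labels):
    """Return the grouping induced by labels as a set of index sets.

    Label names are ignored, so ["a", "a", "b"] and [1, 1, 0]
    describe the same partition.
    """
    groups = defaultdict(set)
    for index, label in enumerate(labels):
        groups[label].add(index)
    return {frozenset(group) for group in groups.values()}


def clusters_first_n_correctly(predicted, truth, n):
    """Check the target notion of correctness on the first n sequences.

    predicted: hypothetical cluster label per sequence (algorithm output).
    truth: hypothetical index of the generating process per sequence.
    """
    return partition(predicted[:n]) == partition(truth[:n])


if __name__ == "__main__":
    truth = [0, 0, 1]
    print(clusters_first_n_correctly(["x", "x", "y"], truth, 3))
    print(clusters_first_n_correctly(["x", "y", "y"], truth, 3))
```

Asymptotic consistency then asks that, for every fixed N, a check of this kind eventually succeeds at all later timesteps with probability 1.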
Locating Changes in Highly Dependent Data with Unknown Number of Change Points
Cited by 2 (1 self)
The problem of multiple change point estimation is considered for sequences with an unknown number of change points. A consistency framework is suggested that is suitable for highly dependent time series, and an asymptotically consistent algorithm is proposed. In order for the consistency to be established, the only assumption required is that the data are generated by stationary ergodic time-series distributions. No modeling, independence or parametric assumptions are made; the data are allowed to be dependent and the dependence can be of arbitrary form. The theoretical result is complemented with experimental evaluations.
PREDICTIVE DYNAMIC USER INTERFACES FOR INTERACTIVE VISUAL SEARCH
This paper proposes a method for designing user interfaces based on ideas rooted in data communication theory. It suggests that a visual user interface should be treated as a multi-transmitter, single-receiver communication system, where the total available bandwidth for transmission is limited. The proposed design entails the scaling of visual components that are displayed according to their degree of relevance to the user, or in other words, their probability of selection by the user.
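The scaling idea in this abstract can be illustrated with a short sketch. The abstract does not state the actual scaling rule, so the proportional allocation below is an assumption: a fixed display budget (standing in for the limited bandwidth) is split among components in proportion to their estimated selection probabilities. The function name, the component names, and the area units are all hypothetical.

```python
def scale_components(selection_probs, total_area):
    """Split total_area among components in proportion to their weights.

    selection_probs: mapping from component name to a non-negative
    estimate of how likely the user is to select it (assumed input).
    total_area: display budget shared by all components (assumed units).
    """
    if any(p < 0 for p in selection_probs.values()):
        raise ValueError("selection probabilities must be non-negative")
    total = sum(selection_probs.values())
    if total <= 0:
        raise ValueError("at least one probability must be positive")
    # Normalising by the total lets the weights be unnormalised scores.
    return {
        name: total_area * p / total
        for name, p in selection_probs.items()
    }


if __name__ == "__main__":
    probs = {"thumbnail_a": 0.5, "thumbnail_b": 0.25, "thumbnail_c": 0.25}
    print(scale_components(probs, 1000.0))
```

Under this assumed rule, a component that is twice as likely to be selected receives twice the display area, and the allocations always sum to the budget.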
Projective Space Codes for the Injection Metric
2009
Cited by 7 (1 self)
In the context of error control in random linear network coding, it is useful to construct codes that comprise well-separated collections of subspaces of a vector space over a finite field. In this paper, the metric used is the so-called “injection distance,” introduced by Silva and Kschischang. A Gilbert-Varshamov bound for such codes is derived. Using the code-construction framework of Etzion and Silberstein, new nonconstant-dimension codes are constructed; these codes contain more codewords than comparable codes designed for the subspace metric.
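For reference, the injection distance between subspaces U and V is d_I(U, V) = max(dim U, dim V) - dim(U ∩ V), while the subspace distance is d_S(U, V) = dim U + dim V - 2 dim(U ∩ V). The following is a minimal sketch of both over GF(2), using the identity dim(U ∩ V) = dim U + dim V - dim(U + V). Representing a subspace by a list of generators encoded as integer bitmasks, and the helper names, are assumptions made for illustration and are not taken from the paper.

```python
def gf2_rank(vectors):
    """Rank over GF(2) of vectors given as non-negative integer bitmasks."""
    basis = {}  # leading bit position -> basis vector with that leading bit
    for v in vectors:
        while v:
            lead = v.bit_length() - 1
            if lead not in basis:
                basis[lead] = v
                break
            # Cancel the leading bit; the new leading bit is strictly lower.
            v ^= basis[lead]
    return len(basis)


def injection_distance(gens_u, gens_v):
    """d_I(U, V) = dim(U + V) - min(dim U, dim V)."""
    dim_u, dim_v = gf2_rank(gens_u), gf2_rank(gens_v)
    dim_sum = gf2_rank(list(gens_u) + list(gens_v))
    return dim_sum - min(dim_u, dim_v)


def subspace_distance(gens_u, gens_v):
    """d_S(U, V) = 2 dim(U + V) - dim U - dim V."""
    dim_u, dim_v = gf2_rank(gens_u), gf2_rank(gens_v)
    dim_sum = gf2_rank(list(gens_u) + list(gens_v))
    return 2 * dim_sum - dim_u - dim_v


if __name__ == "__main__":
    u = [0b100, 0b010]  # a 2-dimensional subspace of GF(2)^3
    v = [0b001]         # a 1-dimensional subspace meeting u trivially
    print(injection_distance(u, v), subspace_distance(u, v))
```

The two distances coincide up to a factor of two when dim U = dim V, which is why the distinction matters only for nonconstant-dimension codes.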
Subspace codes
Lecture Notes in Computer Science, 2009
Cited by 7 (0 self)
This paper is a survey of bounds and constructions for subspace codes designed for the injection metric, a distance measure that arises in the context of correcting adversarial packet insertions in linear network coding. The construction of lifted rank-metric codes is reviewed, along with improved constructions leading to codes with strictly more codewords. Algorithms for encoding and decoding are also briefly described.
A Survey on Randomized Algorithms in Networking Design: Random Network Coding
This survey attempts to provide an overview of the distributed and randomized schemes used in networking. This area is very diverse, and the problems discussed in this paper are by no means exhaustive. We restrict our review to the class of randomized schemes which are based on the concept of random network coding. In particular, we describe the existing schemes for randomized coding in multicast. Furthermore, we review the practical challenges and the existing strategies to mitigate them in real networks. We also survey a decentralized random-gossip protocol which utilizes random network coding to simultaneously disseminate multiple messages to all the nodes in the network.