Results 11 - 20
of
65
Approximation of functions over redundant dictionaries using coherence
- Proc. of SODA
, 2003
"... ..."
Near-optimal sparse Fourier representations via sampling
- In STOC
, 2002
"... We give an algorithm for nding a Fourier representation R ofBterms for a given discrete signal A of lengthN, such thatkA,Rk 2 2 is within the factor (1 +) of best possible kA,Roptk 2 2. Our algorithm can access A by reading its values on a sample setT [0;N), chosen randomly from a (non-product) dist ..."
Abstract
-
Cited by 57 (14 self)
- Add to MetaCart
We give an algorithm for nding a Fourier representation R ofBterms for a given discrete signal A of lengthN, such thatkA,Rk 2 2 is within the factor (1 +) of best possible kA,Roptk 2 2. Our algorithm can access A by reading its values on a sample setT [0;N), chosen randomly from a (non-product) distribution of our choice, independent of A. That is, we sample non-adaptively. The total time cost of the algorithm is polynomial inB log(N) log(M) = (where M is the ratio of largest to smallest numerical quantity encountered), which implies a similar bound for the number of samples. 1.
What's New: Finding Significant Differences in Network Data Streams
- in Proc. of IEEE Infocom
, 2004
"... Monitoring and analyzing network traffic usage patterns is vital for managing IP Networks. An important problem is to provide network managers with information about changes in traffic, informing them about "what's new". Specifically, we focus on the challenge of finding significantly large differen ..."
Abstract
-
Cited by 51 (7 self)
- Add to MetaCart
Monitoring and analyzing network traffic usage patterns is vital for managing IP Networks. An important problem is to provide network managers with information about changes in traffic, informing them about "what's new". Specifically, we focus on the challenge of finding significantly large differences in traffic: over time, between interfaces and between routers. We introduce the idea of a deltoid: an item that has a large difference, whether the difference is absolute, relative or variational. We present novel...
Memory-Limited Execution of Windowed Stream Joins
, 2004
"... We address the problem of computing approximate answers to continuous sliding-window joins over data streams when the available memory may be insufficient to keep the entire join state. One approximation scenario is to provide a maximum subset of the result, with the objective of losing as few ..."
Abstract
-
Cited by 48 (0 self)
- Add to MetaCart
We address the problem of computing approximate answers to continuous sliding-window joins over data streams when the available memory may be insufficient to keep the entire join state. One approximation scenario is to provide a maximum subset of the result, with the objective of losing as few result tuples as possible. An alternative scenario is to provide a random sample of the join result, e.g., if the output of the join is being aggregated.
Combinatorial Algorithms for Compressed Sensing
- In Proc. of SIROCCO
, 2006
"... Abstract — In sparse approximation theory, the fundamental problem is to reconstruct a signal A ∈ R n from linear measurements 〈A, ψi 〉 with respect to a dictionary of ψi’s. Recently, there is focus on the novel direction of Compressed Sensing [1] where the reconstruction can be done with very few—O ..."
Abstract
-
Cited by 44 (1 self)
- Add to MetaCart
Abstract — In sparse approximation theory, the fundamental problem is to reconstruct a signal A ∈ R n from linear measurements 〈A, ψi 〉 with respect to a dictionary of ψi’s. Recently, there is focus on the novel direction of Compressed Sensing [1] where the reconstruction can be done with very few—O(k log n)— linear measurements over a modified dictionary if the signal is compressible, that is, its information is concentrated in k coefficients with the original dictionary. In particular, these results [1], [2], [3] prove that there exists a single O(k log n) × n measurement matrix such that any such signal can be reconstructed from these measurements, with error at most O(1) times the worst case error for the class of such signals. Compressed sensing has generated tremendous excitement both because of the sophisticated underlying Mathematics and because of its potential applications. In this paper, we address outstanding open problems in Compressed Sensing. Our main result is an explicit construction of a non-adaptive measurement matrix and the corresponding reconstruction algorithm so that with a number of measurements polynomial in k, log n, 1/ε, we can reconstruct compressible signals. This is the first known polynomial time explicit construction of any such measurement matrix. In addition, our result improves the error guarantee from O(1) to 1 + ε and improves the reconstruction time from poly(n) to poly(k log n). Our second result is a randomized construction of O(k polylog(n)) measurements that work for each signal with high probability and gives per-instance approximation guarantees rather than over the class of all signals. Previous work on Compressed Sensing does not provide such per-instance approximation guarantees; our result improves the best known number of measurements known from prior work in other areas including Learning Theory [4], [5], Streaming algorithms [6], [7], [8] and Complexity Theory [9] for this case. Our approach is combinatorial. In particular, we use two parallel sets of group tests, one to filter and the other to certify and estimate; the resulting algorithms are quite simple to implement. I.
COMBINING GEOMETRY AND COMBINATORICS: A UNIFIED APPROACH TO SPARSE SIGNAL RECOVERY
"... Abstract. There are two main algorithmic approaches to sparse signal recovery: geometric and combinatorial. The geometric approach starts with a geometric constraint on the measurement matrix Φ and then uses linear programming to decode information about x from Φx. The combinatorial approach constru ..."
Abstract
-
Cited by 42 (11 self)
- Add to MetaCart
Abstract. There are two main algorithmic approaches to sparse signal recovery: geometric and combinatorial. The geometric approach starts with a geometric constraint on the measurement matrix Φ and then uses linear programming to decode information about x from Φx. The combinatorial approach constructs Φ and a combinatorial decoding algorithm to match. We present a unified approach to these two classes of sparse signal recovery algorithms. The unifying elements are the adjacency matrices of high-quality unbalanced expanders. We generalize the notion of Restricted Isometry Property (RIP), crucial to compressed sensing results for signal recovery, from the Euclidean norm to the ℓp norm for p ≈ 1, and then show that unbalanced expanders are essentially equivalent to RIP-p matrices. From known deterministic constructions for such matrices, we obtain new deterministic measurement matrix constructions and algorithms for signal recovery which, compared to previous deterministic algorithms, are superior in either the number of measurements or in noise tolerance. 1.
On graph problems in a semi-streaming model
- In 31st International Colloquium on Automata, Languages and Programming
, 2004
"... Abstract. We formalize a potentially rich new streaming model, the semi-streaming model, that we believe is necessary for the fruitful study of efficient algorithms for solving problems on massive graphs whose edge sets cannot be stored in memory. In this model, the input graph, G = (V, E), is prese ..."
Abstract
-
Cited by 38 (4 self)
- Add to MetaCart
Abstract. We formalize a potentially rich new streaming model, the semi-streaming model, that we believe is necessary for the fruitful study of efficient algorithms for solving problems on massive graphs whose edge sets cannot be stored in memory. In this model, the input graph, G = (V, E), is presented as a stream of edges (in adversarial order), and the storage space of an algorithm is bounded by O(n · polylog n), where n = |V |. We are particularly interested in algorithms that use only one pass over the input, but, for problems where this is provably insufficient, we also look at algorithms using constant or, in some cases, logarithmically many passes. In the course of this general study, we give semi-streaming constant approximation algorithms for the unweighted and weighted matching problems, along with a further algorithm improvement for the bipartite case. We also exhibit log n / log log n semistreaming approximations to the diameter and the problem of computing the distance between specified vertices in a weighted graph. These are complemented by Ω(log (1−ɛ) n) lower bounds. 1
Graph distances in the streaming model: the value of space
- In ACM-SIAM Symposium on Discrete Algorithms
, 2005
"... We investigate the importance of space when solving problems based on graph distance in the streaming model. In this model, the input graph is presented as a stream of edges in an arbitrary order. The main computational restriction of the model is that we have limited space and therefore cannot stor ..."
Abstract
-
Cited by 38 (8 self)
- Add to MetaCart
We investigate the importance of space when solving problems based on graph distance in the streaming model. In this model, the input graph is presented as a stream of edges in an arbitrary order. The main computational restriction of the model is that we have limited space and therefore cannot store all the streamed data; we are forced to make space-efficient summaries of the data as we go along. For a graph of n vertices and m edges, we show that testing many graph properties, including connectivity (ergo any reasonable decision problem about distances) and bipartiteness, requires Ω(n) bits of space. Given this, we then investigate how the power of the model increases as we relax our space restriction. Our main result is an efficient randomized algorithm that constructs a (2t + 1)-spanner in one pass. With high probability, it uses O(t · n 1+1/t log 2 n) bits of space and processes each edge in the stream in O(t 2 · n 1/t log n) time. We find approximations to diameter and girth via the log n constructed spanner. For t = Ω (), the space log log n requirement of the algorithm is O(n·polylog n), and the per-edge processing time is O(polylog n). We also show a corresponding lower bound of t for the approximation ratio achievable when the space restriction is O(t · n1+1/t log 2 n). We then consider the scenario in which we are allowed multiple passes over the input stream. Here, we investigate whether allowing these extra passes will compensate for a given space restriction. We show that ∗This work was supported by the DoD University Research Initiative (URI) administered by the Office of Naval Research
Improved Time Bounds for Near-Optimal Sparse Fourier Representations
- in Proc. SPIE Wavelets XI
, 2003
"... We study the problem of finding a Fourier representation R of B terms for a given discrete signal A of length N . The Fast Fourier Transform (FFT) can find the optimal N-term representation in O(N log N) time, but our goal is to get sublinear algorithms for B ! N , typically, B N . Suppose kAk2 M ..."
Abstract
-
Cited by 36 (5 self)
- Add to MetaCart
We study the problem of finding a Fourier representation R of B terms for a given discrete signal A of length N . The Fast Fourier Transform (FFT) can find the optimal N-term representation in O(N log N) time, but our goal is to get sublinear algorithms for B ! N , typically, B N . Suppose kAk2 M kRoptk 2 , where Ropt is the optimal output. The previously best known algorithms output R such that kA \Gamma Rk poly(B; log(1=ffi); log N; log M; 1=ffl): Even though this is sublinear in the input size, the dominating term is the polynomial factor in B which is B . In our experience, this is a limitation in practice. Our main result is a significantly improved algorithm for this problem. Our algorithms output R such that kA \Gamma Rk B \Delta poly(log(1=ffi); log N; log M; 1=ffl): We also obtain improvements for higher dimensional Fourier transforms. We need two crucial ideas to achieve this bound: bulk sampling and estimation for multipoint polynomial evaluation using an unevenly-spaced Fourier tranform, and construction and use of arithmeticprogression independent random variables. Our improved algorithms are likely to find many applications. 1
Bloom histogram: Path selectivity estimation for xml data with updates
- In VLDB
, 2004
"... Cost-based XML query optimization calls for accurate estimation of the selectivity of path expressions. Some other interactive and internet applications can also benefit from such estimations. While there are a number of estimation techniques proposed in the literature, almost none of them has any g ..."
Abstract
-
Cited by 32 (0 self)
- Add to MetaCart
Cost-based XML query optimization calls for accurate estimation of the selectivity of path expressions. Some other interactive and internet applications can also benefit from such estimations. While there are a number of estimation techniques proposed in the literature, almost none of them has any guarantee on the estimation accuracy within a given space limit. In addition, most of them assume that the XML data are more or less static, i.e., with few updates. In this paper, we present a framework for XML path selectivity estimation in a dynamic context. Specifically, we propose a novel data structure, bloom histogram, to approximate XML path frequency distribution within a small space budget and to estimate the path selectivity accurately with the bloom histogram. We obtain the upper bound of its estimation error and discuss the trade-offs between the accuracy and the space limit. To support updates of bloom histograms efficiently when underlying XML data change, a dynamic summary layer is used to keep exact or more detailed XML path information. We demonstrate through our extensive experiments that the new solution can

