Results 1 - 10 of 122,026
Data Streams: Algorithms and Applications
, 2005
"... In the data stream scenario, input arrives very rapidly and there is limited memory to store the input. Algorithms have to work with one or few passes over the data, space less than linear in the input size or time significantly less than the input size. In the past few years, a new theory has emerg ..."
Cited by 533 (22 self)
Models and issues in data stream systems
- IN PODS
, 2002
"... In this overview paper we motivate the need for and research issues arising from a new model of data processing. In this model, data does not take the form of persistent relations, but rather arrives in multiple, continuous, rapid, time-varying data streams. In addition to reviewing past work releva ..."
Cited by 786 (19 self)
Consistency of spectral clustering
, 2004
"... Consistency is a key property of statistical algorithms, when the data is drawn from some underlying probability distribution. Surprisingly, despite decades of work, little is known about consistency of most clustering algorithms. In this paper we investigate consistency of a popular family of spe ..."
Cited by 572 (15 self)
Learning with local and global consistency.
- In NIPS
, 2003
"... We consider the general problem of learning from labeled and unlabeled data, which is often called semi-supervised learning or transductive inference. A principled approach to semi-supervised learning is to design a classifying function which is sufficiently smooth with respect to the intr ..."
Cited by 673 (21 self)
Globally Consistent Range Scan Alignment for Environment Mapping
- AUTONOMOUS ROBOTS
, 1997
"... A robot exploring an unknown environment may need to build a world model from sensor measurements. In order to integrate all the frames of sensor data, it is essential to align the data properly. An incremental approach has been typically used in the past, in which each local frame of data is alig ..."
Cited by 531 (8 self)
CoolStreaming/DONet: A Data-driven Overlay Network for Peer-to-Peer Live Media Streaming
- in IEEE Infocom
, 2005
"... This paper presents DONet, a Data-driven Overlay Network for live media streaming. The core operations in DONet are very simple: every node periodically exchanges data availability information with a set of partners, and retrieves unavailable data from one or more partners, or supplies available dat ..."
Cited by 475 (42 self)
Weighted Voting for Replicated Data
, 1979
"... In a new algorithm for maintaining replicated data, every copy of a replicated file is assigned some number of votes. Every transaction collects a read quorum of r votes to read a file, and a write quorum of w votes to write a file, such that r+w is greater than the total number of votes assi ..."
Cited by 598 (0 self)
Implementing data cubes efficiently
- In SIGMOD
, 1996
"... Decision support applications involve complex queries on very large databases. Since response times should be small, query optimization is critical. Users typically view the data as multidimensional data cubes. Each cell of the data cube is a view consisting of an aggregation of interest, like total ..."
Cited by 548 (1 self)
Power-law distributions in empirical data
- ISSN 00361445. doi: 10.1137/070710111. URL http://dx.doi.org/10.1137/070710111
, 2009
"... Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the empirical detection and characterization of power laws is made difficult by the large fluctuations that occur in the t ..."
Cited by 607 (7 self)
demonstrate these methods by applying them to twenty-four real-world data sets from a range of different disciplines. Each of the data sets has been conjectured previously to follow a power-law distribution. In some cases we find these conjectures to be consistent with the data while in others the power law
Missing data: Our view of the state of the art
- Psychological Methods
, 2002
"... Statistical procedures for missing data have vastly improved, yet misconception and unsound practice still abound. The authors frame the missing-data problem, review methods, offer advice, and raise issues that remain unresolved. They clear up common misunderstandings regarding the missing at random ..."
Cited by 739 (1 self)
developments are discussed, including some for dealing with missing data that are not MAR. Although not yet in the mainstream, these procedures may eventually extend the ML and MI methods that currently represent the state of the art. Why do missing data create such difficulty in scientific research? Because