Results 11 - 20 of 5,396
The Merge/Purge Problem for Large Databases
- In Proceedings of the 1995 ACM SIGMOD, 1995
"... Many commercial organizations routinely gather large numbers of databases for various marketing and business analysis functions. The task is to correlate information from different databases by identifying distinct individuals that appear in a number of different databases, typically in an inconsistent and often incorrect fashion. The problem we study here is the task of merging data from multiple sources in as efficient a manner as possible, while maximizing the accuracy of the result. We call this the merge/purge problem. In this paper we detail the sorted neighborhood method that is used ..."
Abstract - Cited by 359 (3 self)
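The sorted neighborhood method the abstract names can be sketched in a few lines: sort the records on a discriminating key, then compare only records that fall within a fixed-size sliding window rather than all O(n^2) pairs. The function and parameter names below are illustrative, not the paper's API:

```python
def sorted_neighborhood(records, key, window=3):
    """Sorted neighborhood sketch: sort by a discriminating key,
    then emit candidate pairs only within a sliding window, so the
    comparison cost is O(n * window) instead of O(n^2)."""
    ordered = sorted(records, key=key)
    candidate_pairs = set()
    for i in range(len(ordered)):
        # Compare record i against the next window-1 records only.
        for j in range(i + 1, min(i + window, len(ordered))):
            candidate_pairs.add((ordered[i], ordered[j]))
    return candidate_pairs

# Toy usage: near-duplicate names sort next to each other, so they
# land in the same window and become candidate matches.
names = ["smith, john", "smyth, john", "adams, ann", "zhu, li"]
pairs = sorted_neighborhood(names, key=lambda s: s[:4], window=2)
```

In practice the paper runs several passes with different keys, since a single sort key can separate true duplicates.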
Parity-Based Loss Recovery for Reliable Multicast Transmission
"... We investigate how FEC (Forward Error Correction) can be combined with ARQ (Automatic Repeat Request) to achieve scalable reliable multicast transmission. We consider the two scenarios where FEC is introduced as a transparent layer underneath a reliable multicast layer that uses ARQ, and where FEC and ARQ are both integrated into a single layer that uses the retransmission of parity data to recover from the loss of original data packets. To evaluate the performance improvements due to FEC, we consider different types of loss behaviors (spatially or temporally correlated loss, homogeneous ..."
Abstract - Cited by 335 (19 self)
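The core idea behind parity-based recovery is that one retransmitted parity packet can repair different losses at different receivers. A minimal XOR-parity sketch (illustrative only, not the paper's protocol):

```python
def parity_packet(packets):
    """XOR equal-length packets into one parity packet. Any single
    lost packet in the group can be rebuilt from the parity plus
    the surviving packets."""
    out = bytearray(len(packets[0]))
    for p in packets:
        for i, b in enumerate(p):
            out[i] ^= b
    return bytes(out)

# Sender computes one parity packet per group of data packets.
group = [b"pkt0", b"pkt1", b"pkt2"]
parity = parity_packet(group)

# A receiver that lost packet 1 XORs the parity with the survivors;
# the same parity packet repairs a receiver that lost packet 0 or 2.
recovered = parity_packet([parity, group[0], group[2]])
```

This is why parity retransmission scales for multicast: the sender need not know which packet each receiver lost.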
Feature selection for high-dimensional data: a fast correlation-based filter solution
- In Proceedings of the 20th International Conference on Machine Learning, 2003
"... Feature selection, as a preprocessing step to machine learning, is effective in reducing dimensionality, removing irrelevant data, increasing learning accuracy, and improving result comprehensibility. However, the recent increase of dimensionality of data poses a severe challenge to many existing feature selection methods with respect to efficiency and effectiveness. In this work, we introduce a novel concept, predominant correlation, and propose a fast filter method which can identify relevant features as well as redundancy among relevant features without pairwise correlation analysis ..."
Abstract - Cited by 276 (12 self)
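Fast correlation-based filters of this family rank features by symmetric uncertainty, a normalized mutual information in [0, 1]. A stdlib-only sketch of the measure (the general idea, not the paper's exact code):

```python
from collections import Counter
from math import log2

def entropy(xs):
    n = len(xs)
    return -sum(c / n * log2(c / n) for c in Counter(xs).values())

def symmetric_uncertainty(x, y):
    """SU(X, Y) = 2 * IG(X; Y) / (H(X) + H(Y)).
    1.0 means the feature determines the class; 0.0 means they
    are independent."""
    hx, hy = entropy(x), entropy(y)
    hxy = entropy(list(zip(x, y)))   # joint entropy H(X, Y)
    gain = hx + hy - hxy             # mutual information IG(X; Y)
    return 2 * gain / (hx + hy) if hx + hy else 0.0

# A feature identical to the class scores 1; an independent one
# scores 0, so thresholding SU gives a fast relevance filter.
cls = [0, 0, 1, 1]
su_same = symmetric_uncertainty([0, 0, 1, 1], cls)   # 1.0
su_weak = symmetric_uncertainty([0, 1, 0, 1], cls)   # 0.0
```

The paper's "predominant correlation" additionally compares feature-feature SU against feature-class SU to discard redundant features without a full pairwise analysis.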
Spectral hashing
, 2009
"... Semantic hashing [1] seeks compact binary codes of data-points so that the Hamming distance between codewords correlates with semantic similarity. In this paper, we show that the problem of finding a best code for a given dataset is closely related to the problem of graph partitioning and can be sho ..."
Abstract - Cited by 284 (4 self)
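Spectral hashing itself derives codes from a relaxed graph-partitioning problem; as a much simpler stand-in, random-hyperplane codes (sign-of-projection LSH, not the paper's method) illustrate the goal the abstract states, namely compact binary codes whose Hamming distance tracks similarity:

```python
import random

def hamming(a, b):
    return sum(x != y for x, y in zip(a, b))

def random_hyperplane_code(v, planes):
    """One bit per hyperplane: the sign of the dot product.
    NOTE: this is random-projection LSH, a simpler stand-in used
    only to show Hamming distance tracking similarity."""
    return tuple(int(sum(vi * pi for vi, pi in zip(v, p)) >= 0)
                 for p in planes)

random.seed(0)
planes = [[random.gauss(0, 1) for _ in range(3)] for _ in range(16)]
a = [1.0, 0.0, 0.0]
b = [0.9, 0.1, 0.0]     # close to a: few code bits should differ
c = [-1.0, 0.2, 0.5]    # far from a: many code bits should differ

def code(v):
    return random_hyperplane_code(v, planes)
```

Spectral hashing replaces the random hyperplanes with thresholded eigenfunctions chosen to spread points evenly across the code space.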
Data Association in Stochastic Mapping Using the Joint Compatibility Test
, 2001
"... In this paper, we address the problem of robust data association for simultaneous vehicle localization and map building. We show that the classical gated nearest neighbor approach, which considers each matching between sensor observations and features independently, ignores the fact that measurement ... criterion can be used to efficiently search for the best solution to data association. Unlike the nearest neighbor, this method provides a robust solution in complex situations, such as cluttered environments or when revisiting previously mapped regions."
Abstract - Cited by 252 (15 self)
Realized Variance and Market Microstructure Noise
, 2005
"... We study market microstructure noise in high-frequency data and analyze its implications for the realized variance (RV) under a general specification for the noise. We show that kernel-based estimators can unearth important characteristics of market microstructure noise and that a simple kernel-based estimator dominates the RV for the estimation of integrated variance (IV). An empirical analysis of the Dow Jones Industrial Average stocks reveals that market microstructure noise is time-dependent and correlated with increments in the efficient price. This has important implications for volatility ..."
Abstract - Cited by 263 (13 self)
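The realized variance the abstract refers to is simply the sum of squared log returns over an intraday sampling grid; microstructure noise inflates it at fine sampling, which motivates the kernel-based corrections. A minimal sketch with made-up prices:

```python
from math import log

def realized_variance(prices):
    """RV = sum of squared log returns over the sampling grid.
    Under microstructure noise, sampling more finely inflates this
    estimate of integrated variance; illustrative sketch only."""
    returns = [log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]
    return sum(r * r for r in returns)

# Toy intraday price path (hypothetical data).
prices = [100.0, 100.5, 100.2, 100.8, 100.6]
rv = realized_variance(prices)
```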
Efficient feature selection via analysis of relevance and redundancy
- Journal of Machine Learning Research, 2004
"... Feature selection is applied to reduce the number of features in many applications where data has hundreds or thousands of features. Existing feature selection methods mainly focus on finding relevant features. In this paper, we show that feature relevance alone is insufficient for efficient feature ..."
Abstract - Cited by 209 (3 self)
A study in two-handed input
- In Proceedings of CHI '86, 1986
"... Two experiments were run to investigate two-handed input. The experimental tasks were representative of those found in CAD and office information systems. Experiment one involved the performance of a compound selection/positioning task. The two sub-tasks were performed by different hands using separate transducers. Without prompting, novice subjects adopted strategies that involved performing the two sub-tasks simultaneously. We interpret this as a demonstration that, in the appropriate context, users are capable of simultaneously providing continuous data from two hands without significant ..."
Abstract - Cited by 239 (18 self)
A Sparse Signal Reconstruction Perspective for Source Localization With Sensor Arrays
, 2005
"... We present a source localization method based on a sparse representation of sensor measurements with an overcomplete basis composed of samples from the array manifold. We enforce sparsity by imposing penalties based on the 1-norm. A number of recent theoretical results on sparsifying properties of ... of advantages over other source localization techniques, including increased resolution, improved robustness to noise, limitations in data quantity, and correlation of the sources, as well as not requiring an accurate initialization."
Abstract - Cited by 231 (6 self)
RaceTrack: Efficient detection of data race conditions via adaptive tracking
- In SOSP, 2005
"... Bugs due to data races in multithreaded programs often exhibit non-deterministic symptoms and are notoriously difficult to find. This paper describes RaceTrack, a dynamic race detection tool that tracks the actions of a program and reports a warning whenever a suspicious pattern of activity has been ..."
Abstract - Cited by 168 (0 self)
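RaceTrack combines lockset and happens-before tracking; the lockset half of that idea (in the Eraser style, a simpler relative of RaceTrack's hybrid scheme) can be sketched as intersecting the set of locks held at each access to a shared variable:

```python
def lockset_check(accesses):
    """Eraser-style lockset refinement (a simpler relative of the
    hybrid scheme RaceTrack uses): for each shared variable, keep
    the intersection of locks held across its accesses; an empty
    set flags a potential race. Illustrative sketch only."""
    candidate = {}
    warnings = set()
    for var, held_locks in accesses:
        if var not in candidate:
            candidate[var] = set(held_locks)
        else:
            candidate[var] &= set(held_locks)
        if not candidate[var]:
            warnings.add(var)
    return warnings

# 'x' is always guarded by lock L; 'y' is accessed once with no
# lock held, so its candidate set empties and it is flagged.
trace = [("x", {"L"}), ("x", {"L"}), ("y", {"L"}), ("y", set())]
races = lockset_check(trace)
```

RaceTrack's contribution is adaptively switching between this kind of lockset state and vector-clock tracking to cut false positives and overhead.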