Results 1 - 10 of 14,790
Summarizing Data Sets for Classification
"... This paper describes our approach and experiences with implementing a data mining system using genetic algorithms in C++. In contrast with earlier classification algorithms that tended to "tile" the data sets using some pre-specified "shapes", the proposed system is based on Marm ..."
organizations today store enormous amounts of data, the data is useless unless it can be properly interpreted and summarized. Data mining tools help turn data into information, and eventually into knowledge [1]. The Genetic Rule and Classifier Construction Environment (GRaCCE) is a data mining tool developed
Summarizing Data Cubes Using Blocks
"... In the context of multidimensional data, OLAP tools are appropriate for the navigation in the data, aiming at discovering pertinent and abstract knowledge. However, due to the size of the data set, a systematic and exhaustive exploration is not feasible. Therefore, the problem is to design automat ..."
automatic tools to ease the navigation in the data and their visualization. In this paper, we present a novel approach that automatically builds blocks of similar values in a given data cube, which are meant to summarize the content of the cube. Our method is based on a levelwise algorithm (a la
Summarizing Data using Bottom-k Sketches
, 2007
"... A Bottom-k sketch is a summary of a set of items with nonnegative weights that supports approximate query processing. A sketch is obtained by associating with each item in a ground set an independent random rank drawn from a probability distribution that depends on the weight of the item and includ ..."
Cited by 30 (17 self)
A Bottom-k sketch is a summary of a set of items with nonnegative weights that supports approximate query processing. A sketch is obtained by associating with each item in a ground set an independent random rank drawn from a probability distribution that depends on the weight of the item and including the k items with smallest rank value. Bottom-k sketches are an alternative to k-mins sketches [9], which consist of the minimum ranked items in k independent rank assignments, and of min-hash [5] sketches, where hash functions replace random rank assignments. Sketches support approximate aggregations, including weight and selectivity of a subpopulation. Coordinated sketches of multiple subsets over the same ground set support subset-relation queries such as Jaccard similarity or
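The construction described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the exponential rank distribution with rate equal to the item weight, the SHA-256 hash used as a shared source of randomness, and the names `rank` and `bottom_k_sketch` are all assumptions made for illustration.

```python
import hashlib
import heapq
import math


def rank(key, weight):
    """Rank of an item: exponential with rate `weight` (assumed distribution).

    The uniform value is derived from a hash of the key (an assumption),
    so the same key gets the same rank in every subset. That shared
    randomness is what makes sketches of different subsets coordinated.
    """
    h = int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")
    u = (h + 1) / 2**64  # uniform value in (0, 1], never zero
    return -math.log(u) / weight


def bottom_k_sketch(items, k):
    """Return the k (rank, key) pairs with smallest rank, in ascending order.

    `items` maps each key to a positive weight; zero-weight items are skipped.
    """
    ranked = ((rank(key, w), key) for key, w in items.items() if w > 0)
    return heapq.nsmallest(k, ranked)
```

Because ranks depend only on the key and its weight, the sketch of a union of two subsets can be obtained by merging the two sketches and keeping the k smallest ranks, provided a shared key has the same weight in both subsets.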
Missing data: Our view of the state of the art
- Psychological Methods
, 2002
"... Statistical procedures for missing data have vastly improved, yet misconception and unsound practice still abound. The authors frame the missing-data problem, review methods, offer advice, and raise issues that remain unresolved. They clear up common misunderstandings regarding the missing at random ..."
Cited by 739 (1 self)
Statistical procedures for missing data have vastly improved, yet misconception and unsound practice still abound. The authors frame the missing-data problem, review methods, offer advice, and raise issues that remain unresolved. They clear up common misunderstandings regarding the missing
Tight results for clustering and summarizing data streams
- Proc. ICDT’09
, 2009
"... In this paper we investigate algorithms and lower bounds for summarization problems over a single pass data stream. In particular we focus on histogram construction and K-center clustering. We provide a simple framework that improves upon all previous algorithms on these problems in either ..."
Cited by 9 (0 self)
In this paper we investigate algorithms and lower bounds for summarization problems over a single pass data stream. In particular we focus on histogram construction and K-center clustering. We provide a simple framework that improves upon all previous algorithms on these problems
Summarizing Data using Bottom-k Sketches
"... A Bottom-k sketch is a summary of a set of items with nonnegative weights that supports approximate query processing. A sketch is obtained by associating with each item in a ground set an independent random rank drawn from a probability distribution that depends on the weight of the item and includi ..."
Summaries of Affymetrix GeneChip probe level data
- Nucleic Acids Res
, 2003
"... High density oligonucleotide array technology is widely used in many areas of biomedical research for quantitative and highly parallel measurements of gene expression. Affymetrix GeneChip arrays are the most popular. In this technology each gene is typically represented by a set of 11-20 pairs of pr ..."
Cited by 471 (21 self)
of probes. In order to obtain expression measures it is necessary to summarize the probe level data. Using two extensive spike-in studies and a dilution study, we developed a set of tools for assessing the effectiveness of expression measures. We found that the performance of the current version
Discovery of Grounded Theory
, 1967
"... This paper outlines my concerns with Qualitative Data Analysis' (QDA) numerous remodelings of Grounded Theory (GT) and the subsequent eroding impact. I cite several examples of the erosion and summarize essential elements of classic GT methodology. It is hoped that the article will clarif ..."
Cited by 2637 (13 self)
This paper outlines my concerns with Qualitative Data Analysis' (QDA) numerous remodelings of Grounded Theory (GT) and the subsequent eroding impact. I cite several examples of the erosion and summarize essential elements of classic GT methodology. It is hoped that the article
Knowledge acquisition via incremental conceptual clustering
- Machine Learning
, 1987
"... Conceptual clustering is an important way of summarizing and explaining data. However, the recent formulation of this paradigm has allowed little exploration of conceptual clustering as a means of improving performance. Furthermore, previous work in conceptual clustering has ..."
Cited by 765 (9 self)
Conceptual clustering is an important way of summarizing and explaining data. However, the recent formulation of this paradigm has allowed little exploration of conceptual clustering as a means of improving performance. Furthermore, previous work in conceptual clustering has
Foundations for the Study of Software Architecture
- ACM SIGSOFT SOFTWARE ENGINEERING NOTES
, 1992
"... The purpose of this paper is to build the foundation for software architecture. We first develop an intuition for software architecture by appealing to several well-established architectural disciplines. On the basis of this intuition, we present a model of software architecture that consists of th ..."
Cited by 812 (35 self)
of three components: elements, form, and rationale. Elements are either processing, data, or connecting elements. Form is defined in terms of the properties of, and the relationships among, the elements, that is, the constraints on the elements. The rationale provides the underlying basis