Clustering Association Rules
, 1997
Cited by 99 (0 self)
We consider the problem of clustering two-dimensional association rules in large databases. We present a geometric-based algorithm, BitOp, for performing the clustering, embedded within an association rule clustering system, ARCS. Association rule clustering is useful when the user desires to segment the data. We measure the quality of the segmentation generated by ARCS using the Minimum Description Length (MDL) principle of encoding the clusters on several databases including noise and errors. Scale-up experiments show that ARCS, using the BitOp algorithm, scales linearly with the amount of data.
1 Introduction
Data mining, or the efficient discovery of interesting patterns from large collections of data, has been recognized as an important area of database research. The most commonly sought patterns are association rules as introduced in [AIS93b]. Intuitively, an association rule identifies a frequently occurring pattern of information in a database. Consider a supermarket database w...
Dynamic Generation and Refinement of Concept Hierarchies for Knowledge Discovery in Databases
- In Proc. AAAI'94 Workshop on Knowledge Discovery in Databases (KDD'94)
, 1994
Cited by 57 (13 self)
Concept hierarchies organize data and concepts in hierarchical forms or in a certain partial order, which helps express knowledge and data relationships in databases in concise, high-level terms and thus plays an important role in knowledge discovery processes. Concept hierarchies can be provided by knowledge engineers, domain experts, or users, or be embedded in some data relations. However, it is sometimes desirable to generate concept hierarchies automatically or to adjust given hierarchies for particular learning tasks. In this paper, the issues of dynamic generation and refinement of concept hierarchies are studied. The study leads to algorithms for automatic generation of concept hierarchies for numerical attributes based on data distributions, and for dynamic refinement of a given or generated concept hierarchy based on a learning request, the relevant set of data, and database statistics. These algorithms have been implemented in the DBLearn knowledge discovery system and tested against large relational databases. The experimental results show that the algorithms are efficient and effective for knowledge discovery in large databases.
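The abstract does not describe the paper's algorithms, so the following is only a generic illustration of building a hierarchy for a numerical attribute from its data distribution. It assumes equal-frequency partitioning for the lowest level and pairwise merging of adjacent ranges for the next level up; the function names `equal_frequency_bins` and `merge_level` are hypothetical and are not taken from the paper or its system.

```python
def equal_frequency_bins(values, n_bins):
    """Split values into up to n_bins (low, high) ranges of roughly equal count.

    Assumption: equal-frequency partitioning stands in for whatever
    distribution-based method the paper actually uses.
    """
    ordered = sorted(values)
    if n_bins < 1 or not ordered:
        raise ValueError("need at least one bin and one value")
    bins = []
    for k in range(n_bins):
        start = k * len(ordered) // n_bins
        stop = (k + 1) * len(ordered) // n_bins
        chunk = ordered[start:stop]
        if chunk:  # skip empty chunks when there are more bins than values
            bins.append((chunk[0], chunk[-1]))
    return bins


def merge_level(bins):
    """Build the next, more general level by merging adjacent ranges in pairs."""
    merged = []
    for i in range(0, len(bins), 2):
        last = bins[min(i + 1, len(bins) - 1)]
        merged.append((bins[i][0], last[1]))
    return merged


ages = [1, 2, 3, 4, 5, 6, 7, 8]
level_0 = equal_frequency_bins(ages, 4)  # most specific ranges
level_1 = merge_level(level_0)           # more general ranges
```

Repeating `merge_level` until one range remains yields a small hierarchy from specific ranges up to the whole domain. A real system would also need to handle ties at range boundaries and attach meaningful labels to each range.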
Set-oriented mining for association rules in relational databases
- In 11th Intl. Conf. Data Engineering
, 1995
Cited by 52 (0 self)
We describe set-oriented algorithms for mining association rules. Such algorithms imply performing multiple joins and may appear to be inherently less efficient than special-purpose algorithms. We develop new algorithms that can be expressed as SQL queries, and discuss optimization of these algorithms. After analytical evaluation, an algorithm named SETM emerges as the algorithm of choice. Algorithm SETM uses only simple database primitives, viz., sorting and merge-scan join. Algorithm SETM is simple, fast, and stable over the range of parameter values. The major contribution of this paper is that it shows that at least some aspects of data mining can be carried out by using general query languages such as SQL, rather than by developing specialized black box algorithms. The set-oriented nature of Algorithm SETM facilitates the development of extensions.
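To make the set-oriented idea concrete, here is a minimal sketch in the spirit of the abstract, not the SETM algorithm itself. It assumes transactions are stored as a relation of (tid, item) rows, sorts that relation, and then pairs up items within each transaction, which plays the role of a self-join on the transaction id. The function name `frequent_pairs` and the sample data are hypothetical.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter


def frequent_pairs(rows, min_support):
    """Count item pairs that co-occur in at least min_support transactions.

    rows: iterable of (tid, item) tuples, viewed as a relation.
    Illustrative only: a sort plus per-tid pairing stands in for the
    sorting and merge-scan join mentioned in the abstract.
    """
    # Sort by (tid, item) and drop duplicate rows, like a sorted relation.
    ordered = sorted(set(rows))

    # Pass 1: count single items and keep those meeting min_support.
    item_counts = Counter(item for _, item in ordered)
    frequent = {item for item, n in item_counts.items() if n >= min_support}

    # Pass 2: pair items sharing a tid (equivalent to joining the relation
    # with itself on tid with a.item < b.item), then count each pair.
    pair_counts = Counter()
    for _, group in groupby(ordered, key=itemgetter(0)):
        items = [item for _, item in group if item in frequent]
        for i, a in enumerate(items):
            for b in items[i + 1:]:
                pair_counts[(a, b)] += 1

    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


sales = [
    (1, "bread"), (1, "milk"),
    (2, "bread"), (2, "milk"), (2, "eggs"),
    (3, "bread"), (3, "eggs"),
]
result = frequent_pairs(sales, min_support=2)
```

With this sample relation, the pairs (bread, milk) and (bread, eggs) each occur in two transactions and are kept, while (eggs, milk) occurs once and is filtered out. The same two passes could be written as SQL GROUP BY and self-join queries, which is the point the abstract makes about general query languages.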
A Framework for Knowledge Discovery and Evolution in Databases
, 1993
Cited by 8 (1 self)
Although knowledge discovery is increasingly important in databases, discovered knowledge is not always useful to users. This is mainly because the discovered knowledge does not fit the user's interests, or because it is redundant or inconsistent with a priori knowledge. Knowledge discovery in databases depends critically on how well a database is characterized and how consistently the existing and discovered knowledge is evolved. This paper describes a novel concept for knowledge discovery and evolution in databases. The key issues of this work include: using a database query to discover new rules; using not only positive examples (answers to a query) but also negative examples to discover new rules; and harmonizing existing rules with the new rules. The main contribution of this paper is the development of a new tool for (1) characterizing the exceptions in databases and (2) evolving knowledge as a database evolves. Keywords: Knowledge discovery, database mining, active database evolution, knowl...

