A Survey of Temporal Knowledge Discovery Paradigms and Methods
- IEEE Transactions on Knowledge and Data Engineering, 2002
- Cited by 55 (6 self)
With the increase in the size of data sets, data mining has recently become an important research topic and is receiving substantial interest from both academia and industry. At the same time, interest in temporal databases has been increasing and a growing number of both prototype and implemented systems are using an enhanced temporal understanding to explain aspects of behavior associated with the implicit time-varying nature of the universe. This paper investigates the confluence of these two areas, surveys the work to date, and explores the issues involved and the outstanding problems in temporal data mining. Index Terms: temporal data mining, time sequence mining, trend analysis, temporal rules, semantics of mined rules.
DEMON: Mining and Monitoring Evolving Data
- IEEE Transactions on Knowledge and Data Engineering, 2000
- Cited by 49 (1 self)
Data mining algorithms have been the focus of much research recently. In practice, the input data to a data mining process resides in a large data warehouse whose data is kept up-to-date through periodic or occasional addition and deletion of blocks of data. Most data mining algorithms have either assumed that the input data is static, or have been designed for arbitrary insertions and deletions of data records.
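The block-wise evolution described in this abstract can be illustrated with a minimal sketch. It is not the DEMON algorithm: the class name `BlockCounts` and its methods are hypothetical, and it tracks only single-item supports, assuming whole blocks of transactions are appended and the oldest block is retired.

```python
from collections import Counter, deque


class BlockCounts:
    """Hypothetical helper: per-item counts maintained block by block."""

    def __init__(self):
        self.blocks = deque()      # (per-block item counts, block size)
        self.totals = Counter()    # item counts over all live blocks
        self.n_transactions = 0

    def add_block(self, transactions):
        """Append a block (a list of transactions) and fold in its counts."""
        block = Counter()
        for t in transactions:
            block.update(set(t))   # count each item once per transaction
        self.blocks.append((block, len(transactions)))
        self.totals.update(block)
        self.n_transactions += len(transactions)

    def drop_oldest_block(self):
        """Retire the oldest block without rescanning the remaining data."""
        block, size = self.blocks.popleft()
        self.totals.subtract(block)
        self.n_transactions -= size

    def support(self, item):
        """Relative support of a single item over the live blocks."""
        if self.n_transactions == 0:
            return 0.0
        return self.totals[item] / self.n_transactions
```

Keeping one counter per block means a deletion only subtracts that block's contribution, so the cost of an update depends on the block size rather than on the size of the warehouse.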
A fast APRIORI implementation
- In Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations, 2003
- Cited by 37 (2 self)
The efficiency of frequent itemset mining algorithms is determined mainly by three factors: the way candidates are generated, the data structure that is used and the implementation details. Most papers focus on the first factor, some describe the underlying data structures, but implementation details are almost always neglected. In this paper we show that the effect of implementation can be more important than the selection of the algorithm. Ideas that seem to be quite promising, may turn out to be ineffective if we descend to the implementation level.
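For readers unfamiliar with the algorithm this abstract discusses, the following is a minimal textbook-style sketch of APRIORI. It is not the implementation from the paper: the function name `apriori` and the absolute `min_support` count are assumptions, and it uses plain dictionaries and a full scan per level rather than any optimized data structure.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset with count >= min_support."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Candidate generation: join frequent (k-1)-itemsets and keep a
        # k-itemset only if all of its (k-1)-subsets are frequent.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent
                for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Support counting: one pass over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result
```

The two commented steps correspond to the first two factors the abstract names: how candidates are generated and which data structure holds them during counting. The naive scan used here is the kind of implementation detail the abstract argues can dominate running time.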
Moment: Maintaining Closed Frequent Itemsets over a Stream Sliding Window
- In ICDM, 2004
- Cited by 37 (2 self)
This paper considers the problem of mining closed frequent itemsets over a sliding window using limited memory space. We design a synopsis data structure to monitor transactions in the sliding window so that we can output the current closed frequent itemsets at any time. Due to time and memory constraints, the synopsis data structure cannot monitor all possible itemsets. However, monitoring only frequent itemsets will make it impossible to detect new itemsets when they become frequent. In this paper, we introduce a compact data structure, the closed enumeration tree (CET), to maintain a dynamically selected set of itemsets over a sliding-window. The selected itemsets consist of a boundary between closed frequent itemsets and the rest of the itemsets. Concept drifts in a data stream are reflected by boundary movements in the CET. In other words, a status change of any itemset (e.g., from non-frequent to frequent) must occur through the boundary. Because the boundary is relatively stable, the cost of mining closed frequent itemsets over a sliding window is dramatically reduced to that of mining transactions that can possibly cause boundary movements in the CET. Our experiments show that our algorithm performs much better than previous approaches.
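To make the terms in this abstract concrete, here is a brute-force sketch of closed frequent itemsets over a sliding window. It deliberately does not implement the closed enumeration tree (CET). It recomputes everything from the current window, which is the cost the paper sets out to avoid. The names `closed_frequent_itemsets` and `SlidingWindow` are hypothetical.

```python
from collections import deque
from itertools import combinations


def closed_frequent_itemsets(window, min_support):
    """Return {itemset: count} for the closed frequent itemsets in `window`."""
    counts = {}
    for t in window:
        items = sorted(t)
        # Enumerate every non-empty subset (exponential; small inputs only).
        for r in range(1, len(items) + 1):
            for combo in combinations(items, r):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    # An itemset is closed if no proper superset has the same support.
    return {
        s: c
        for s, c in frequent.items()
        if not any(s < other and c == oc for other, oc in frequent.items())
    }


class SlidingWindow:
    """Keeps the most recent `size` transactions."""

    def __init__(self, size, min_support):
        self.window = deque(maxlen=size)  # the oldest transaction drops out
        self.min_support = min_support

    def add(self, transaction):
        self.window.append(frozenset(transaction))

    def closed(self):
        return closed_frequent_itemsets(self.window, self.min_support)
```

In the usage below, an itemset changes status as the window slides, which is the kind of change the abstract says must pass through the boundary maintained in the CET:

```python
w = SlidingWindow(size=3, min_support=2)
for t in [{"a", "b"}, {"a", "b"}, {"a", "c"}]:
    w.add(t)
print(w.closed())   # {a}: 3 and {a, b}: 2 are closed and frequent
w.add({"c"})        # the oldest transaction leaves the window
print(w.closed())   # now {a}: 2 and {c}: 2
```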
Maintenance of Discovered Association Rules: When to update?
- In Research Issues on Data Mining and Knowledge Discovery, 1997
- Cited by 20 (1 self)
In this paper, we devise an algorithm with which we can estimate the difference between the association rules in a database before and after it is updated. The estimated difference can be used to determine whether we update the mined association rules or not. If the estimated difference is large, then it is time to update the mined association rules in order to discover and learn the new rules and discard the old ones. If the estimated difference is small, then the rules in the original database are still a good approximation for those in the updated database, and we do not have to spend the resources to update the rules. We can accumulate more updates before actually updating the rules, thereby avoiding the overheads of updating the rules too frequently.
1 Introduction
Data mining has been attracting much attention from practitioners and researchers in recent years. Combining techniques from the fields of machine learning, statistics and databases, data mining enables us to find out usef...
Generating Frequent Itemsets Incrementally: Two Novel Approaches Based on Galois Lattice Theory
- 2002
- Cited by 13 (3 self)
Galois (concept) lattice theory has been successfully applied to the resolution of the association rule problem in data mining. In particular, structural results about lattices have been used in the design of efficient procedures for mining the frequent patterns (itemsets) in a transaction database.
Is Sampling Useful in Data Mining? A Case in the Maintenance of Discovered Association Rules.
- Data Mining and Knowledge Discovery, 1998
- Cited by 11 (0 self)
By nature, sampling is an appealing technique for data mining, because approximate solutions in most cases may already be of great satisfaction to the need of the users. We attempt to use sampling techniques to address the problem of maintaining discovered association rules. Some studies have been done on the problem of maintaining the discovered association rules when updates are made to the database. All proposed methods must examine not only the changed part but also the unchanged part in the original database, which is very large, and hence take much time. Worse yet, if the updates on the rules are performed frequently on the database but the underlying rule set has not changed much, then the effort could be mostly wasted. In this paper, we devise an algorithm which employs sampling techniques to estimate the difference between the association rules in a database before and after the database is updated. The estimated difference can be used to determine whether we should update the...
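The idea of sampling the updated database to decide whether a full re-mining is worthwhile can be sketched as follows. This is an illustrative simplification, not the estimator from the paper: the function names, the `drift` measure (the fraction of previously frequent itemsets that appear to have lost frequency) and the fixed `threshold` are all assumptions, and the sketch does not try to detect newly frequent itemsets.

```python
import random


def estimate_rule_drift(old_frequent, updated_db, min_support_ratio,
                        sample_size, seed=0):
    """Estimate the fraction of old frequent itemsets no longer frequent.

    old_frequent: iterable of frozensets mined before the update.
    updated_db:   list of transactions (sets) after the update.
    """
    old_frequent = list(old_frequent)
    sample_size = min(sample_size, len(updated_db))
    if not old_frequent or sample_size == 0:
        return 0.0
    rng = random.Random(seed)                    # seeded for repeatability
    sample = rng.sample(updated_db, sample_size) # without replacement
    dropped = 0
    for itemset in old_frequent:
        hits = sum(1 for t in sample if itemset <= t)
        if hits / sample_size < min_support_ratio:
            dropped += 1
    return dropped / len(old_frequent)


def should_update(drift, threshold):
    """Re-mine only when the estimated change exceeds the threshold."""
    return drift > threshold
```

Only the sample is scanned, so the check avoids reading the unchanged part of the original database, which is the cost the abstract attributes to earlier maintenance methods.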
Monitoring the Evolution of Web Usage Patterns
- Lecture Notes in Computer Science, 2004
- Cited by 9 (1 self)
With the ongoing shift from off-line to on-line business processes, the Web has become an important business platform, and for most companies it is crucial to have an on-line presence which can be used to gather information about their products and/or services. However, in many cases there is a difference between the intended and the effective usage of a web site and, presently, many web site operators analyze the usage of their sites to improve their usability. But especially in the context of the Internet, content and structure change rather quickly, and the way a web site is used may change often, either due to changing information needs of its visitors, or due to an evolving user group. Therefore, the discovered usage patterns need to be updated continuously to always reflect the current state. In this article, we introduce PAM, an automated Pattern Monitor, which can be used to observe changes to the behavior of a web site's visitors. It is based on a temporal representation of rules in which both the content
A Framework for Incremental Generation of Frequent Closed Itemsets
- Workshop on Discrete Mathematics & Data Mining, 2nd SIAM Conf. on Data Mining, 2002
- Cited by 8 (1 self)
Concept lattices provide a theoretical framework for the efficient resolution of the association rule problem. The paper describes an extension to the underlying approach as a contribution to the issue of incremental data mining. In particular, we propose an incremental algorithm for mining frequent closed itemsets (FCIs) that is based on our most recent work on lattice construction.
Efficient Algorithms for Incremental Update of Frequent Sequences
- In PAKDD, 2002
- Cited by 8 (0 self)
Agrawal and Srikant first put forward the problem of mining frequently occurring sequences from a customer database [1]. In their model, a customer database consists of a set of sequences. Each sequence is a chronologically ordered set of transactions (or itemsets).
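The data model described in this abstract can be sketched as a list of customer sequences, each a chronologically ordered list of itemsets. The sketch below shows only how the support of one candidate sequence would be counted under the usual containment definition. It is not the incremental update algorithm of the paper, and the names `contains` and `support` are hypothetical.

```python
def contains(sequence, pattern):
    """True if `pattern` occurs in `sequence` in order.

    Each element of `pattern` must be a subset of a distinct transaction
    of `sequence`, and the matched transactions must keep their order.
    """
    pos = 0
    for element in pattern:
        # Greedily advance to the earliest transaction covering this element.
        while pos < len(sequence) and not element <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next element must match a strictly later transaction
    return True


def support(database, pattern):
    """Number of customer sequences in `database` that contain `pattern`."""
    return sum(1 for seq in database if contains(seq, pattern))
```

A sequence is frequent when its support reaches a user-chosen minimum, so with three customers the usage looks like this:

```python
db = [
    [{"a"}, {"b", "c"}, {"d"}],
    [{"a", "b"}, {"d"}],
    [{"b"}, {"a"}],
]
print(support(db, [{"a"}, {"d"}]))  # 2
print(support(db, [{"a"}, {"b"}]))  # 1
```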

