Results 1 -
8 of
8
Multi-Dimensional Regression Analysis of Time-Series Data Streams
- PROC. VLDB 02
, 2002
"... Real-time production systems and other dynamic environments often generate tremendous (potentially infinite) amount of stream data; the volume of data is too huge to be stored on disks or scanned multiple times. Can we perform on-line, multi-dimensional analysis and data mining of such data to alert ..."
Abstract
-
Cited by 92 (22 self)
- Add to MetaCart
Real-time production systems and other dynamic environments often generate tremendous (potentially infinite) amount of stream data; the volume of data is too huge to be stored on disks or scanned multiple times. Can we perform on-line, multi-dimensional analysis and data mining of such data to alert people about dramatic changes of situations and to initiate timely, high-quality responses? This is a challenging task. In this paper,
Multidimensional Association Rules in Boolean Tensors ∗
"... Popular data mining methods support knowledge discovery from patterns that hold in binary relations. We study the generalization of association rule mining within arbitrary n-ary relations and thus Boolean tensorsinsteadofBooleanmatrices. Indeed, manydatasets of interest correspond to relations whos ..."
Abstract
-
Cited by 1 (0 self)
- Add to MetaCart
Popular data mining methods support knowledge discovery from patterns that hold in binary relations. We study the generalization of association rule mining within arbitrary n-ary relations and thus Boolean tensorsinsteadofBooleanmatrices. Indeed, manydatasets of interest correspond to relations whose number of dimensions is greater or equal to 3. However, just a few proposals deal with rule discovery when both the head and the body can involve subsets of any dimensions. A challenging problem is to provide a semantics to such generalized rules by means of objective interestingness measures that have to be carefully designed. Therefore, we discuss the need for different generalizations of the classical confidence measure. We also present the
OLEMAR: An Online Environment for Mining Association Rules in Multidimensional Data
, 2008
"... Data warehouses and OLAP (online analytical processing) provide tools to explore and navigate through data cubes in order to extract interesting information under different perspectives and levels of granularity. Nevertheless, OLAP techniques do not allow the identification of relationships, group ..."
Abstract
- Add to MetaCart
Data warehouses and OLAP (online analytical processing) provide tools to explore and navigate through data cubes in order to extract interesting information under different perspectives and levels of granularity. Nevertheless, OLAP techniques do not allow the identification of relationships, groupings, or exceptions that could hold in a data cube. To that end, we propose to enrich OLAP techniques with data mining facilities to benefit from the capabilities they offer. In this chapter, we propose an online environment for mining association rules in data cubes. Our environment called OLEMAR (online environment for mining association rules), is designed to extract associations from multidimensional data. It allows the extraction of inter-dimensional association rules from data cubes according to a sum-based aggregate measure, a more general indicator than aggregate values provided by the traditional COUNT measure. In our approach, OLAP users are able to drive a mining process guided by a meta-rule, which meets their analysis objectives. In
generation • Summary
, 2005
"... • Data in the real world is dirty – incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data • e.g., occupation=“” – noisy: containing errors or outliers • e.g., Salary=“-10” – inconsistent: containing discrepancies in codes or names • e.g., Age ..."
Abstract
- Add to MetaCart
• Data in the real world is dirty – incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data • e.g., occupation=“” – noisy: containing errors or outliers • e.g., Salary=“-10” – inconsistent: containing discrepancies in codes or names • e.g., Age=“42 ” Birthday=“03/07/1997” • e.g., Was rating “1,2,3”, now rating “A, B, C” • e.g., discrepancy between duplicate records
DOI: 10.1007/s10619-005-3296-1 Stream Cube: An Architecture for Multi-Dimensional Analysis of Data Streams
, 2005
"... Abstract. Real-time surveillance systems, telecommunication systems, and other dynamic environments often generate tremendous (potentially infinite) volume of stream data: the volume is too huge to be scanned multiple times. Much of such data resides at rather low level of abstraction, whereas most ..."
Abstract
- Add to MetaCart
Abstract. Real-time surveillance systems, telecommunication systems, and other dynamic environments often generate tremendous (potentially infinite) volume of stream data: the volume is too huge to be scanned multiple times. Much of such data resides at rather low level of abstraction, whereas most analysts are interested in relatively high-level dynamic changes (such as trends and outliers). To discover such high-level characteristics, one may need to perform on-line multi-level, multi-dimensional analytical processing of stream data. In this paper, we propose an architecture, called stream cube, to facilitate on-line, multi-dimensional, multi-level analysis of stream data. For fast online multi-dimensional analysis of stream data, three important techniques are proposed for efficient and effective computation of stream cubes. First, a tilted time frame model is proposed as a multi-resolution model to register time-related data: the more recent data are registered at finer resolution, whereas the more distant data are registered at coarser resolution. This design reduces the overall storage of time-related data and adapts nicely to the data analysis tasks commonly encountered in practice. Second, instead of materializing cuboids at all levels, we propose to maintain a small number of critical layers. Flexible analysis can be efficiently performed based on the concept of observation layer and minimal interesting layer. Third, an efficient stream data cubing algorithm is developed which computes only the layers (cuboids) along a popular path and leaves the other cuboids for query-driven, on-line computation. Based on this design methodology, stream data cube can be constructed and
Discovering Descriptive Rules in Relational Dynamic Graphs
"... Graph mining methods have become quite popular and a timely challenge is to discover dynamic properties in evolving graphs or networks. We consider the so-called relational dynamic oriented graphs that can be encoded as n-ary relations with n ≥ 3 and thus represented by Boolean tensors. Two dimensio ..."
Abstract
- Add to MetaCart
Graph mining methods have become quite popular and a timely challenge is to discover dynamic properties in evolving graphs or networks. We consider the so-called relational dynamic oriented graphs that can be encoded as n-ary relations with n ≥ 3 and thus represented by Boolean tensors. Two dimensions are used to encode the graph adjacency matrices and at least one other denotes time. We design the pattern domain of multi-dimensional association rules, i.e., non trivial extensions of the popular association rules that may involve subsets of any dimensions in their antecedents and their consequents. First, we design new objective interestingness measures for such rules and it leads to different approaches for measuring the rule confidence. Second, we must compute collections of a priori interesting rules. It is considered here as a post-processing of the closed patterns that can be extracted efficiently from Boolean tensors. We propose optimizations to support both rule extraction scalability and non redundancy. We illustrate the added-value of this new data mining task to discover patterns from a real-life relational dynamic graph.

