Music Similarity Measures: What's The Use?
, 2002
Cited by 87 (5 self)
Electronic Music Distribution (EMD) requires robust, automatically extracted music descriptors. We introduce a timbral similarity measure for comparing music titles. This measure is based on a Gaussian model of cepstrum coefficients. We describe the timbre extractor and the corresponding timbral similarity relation. We describe experiments assessing the quality of the similarity relation, and show that the measure is able to yield interesting similarity relations, in particular when used in conjunction with other similarity relations. We illustrate the use of the descriptor in several EMD applications developed in the context of the Cuidado European project.
Web Usage Mining: Discovery and Application of Interesting Patterns from Web Data
, 2000
Cited by 57 (0 self)
Web Usage Mining is the application of data mining techniques to Web clickstream data in order to extract usage patterns. As Web sites continue to grow in size and complexity, the results of Web Usage Mining have become critical for a number of applications such as Web site design, business and marketing decision support, personalization, usability studies, and network traffic analysis. The two major challenges involved in Web Usage Mining are preprocessing the raw data to provide an accurate picture of how a site is being used, and filtering the results of the various data mining algorithms in order to present only the rules and patterns that are potentially interesting. This thesis develops and tests an architecture and algorithms for performing Web Usage Mining. An evidence combination framework referred to as the information filter is developed to compare and combine usage, content, and structure information about a Web site. The information filter automatically identifies the discovered ...
Unsupervised Link Discovery in Multi-relational Data via Rarity Analysis
- IEEE International Conference on Data Mining
, 2003
Cited by 24 (1 self)
A significant portion of knowledge discovery and data mining research focuses on finding patterns of interest in data. Once a pattern is found, it can be used to recognize satisfying instances. The new area of link discovery requires a complementary approach, since patterns of interest might not yet be known or might have too few examples to be learnable. This paper presents an unsupervised link discovery method aimed at discovering unusual, interestingly linked entities in multi-relational datasets. Various notions of rarity are introduced to measure the "interestingness" of sets of paths and entities. These measurements have been implemented and applied to a real-world bibliographic dataset where they give very promising results.
Interestingness of Frequent Itemsets Using Bayesian Networks as Background Knowledge
- In Proceedings of the SIGKDD Conference on Knowledge Discovery and Data Mining
, 2004
Pruning Redundant Association Rules Using Maximum Entropy Principle
- In Advances in Knowledge Discovery and Data Mining, 6th Pacific-Asia Conference, PAKDD’02
, 2002
Cited by 14 (3 self)
Data mining algorithms produce huge sets of rules, practically impossible to analyze manually. It is thus important to develop methods for removing redundant rules from those sets. We present a solution to the problem using the Maximum Entropy approach. The problem of efficiency of Maximum Entropy computations is addressed by using closed form solutions for the most frequent cases. Analytical and experimental evaluation of the proposed technique indicates that it efficiently produces small sets of interesting association rules.
Unsupervised Temporal Rule Mining with Genetic Programming and Specialized Hardware
, 2003
Cited by 7 (1 self)
Rule mining is the practice of discovering interesting and unexpected rules from large data sets. Depending on the exact problem formulation, this may be a very complicated problem. Existing methods typically make strong simplifying assumptions about the form of the rules, and limit the measure of rule quality to simple properties, such as confidence. Because confidence in itself is not a good indicator of how interesting a rule is to the user, the mined rules are typically sorted according to some secondary interestingness measure. In this paper we present a rule mining method that is based on genetic programming. Because we use specialized pattern matching hardware to evaluate each rule, our method supports a very wide range of rule formats, and can use any reasonable fitness measure. We develop a fitness measure that is well-suited for our method, and give empirical results of applying the method to synthetic and real-world data sets.
Curious Negotiator
- In proceedings Third International Workshop on Negotiations in electronic markets - beyond price discovery — e-Negotiations 2002, September 2002, Aix-en-Provence
, 2002
Cited by 6 (5 self)
In negotiation the exchange of contextual information is as important as the exchange of specific offers. The curious negotiator is a multiagent system with three types of agent. Two negotiation agents, each representing an individual, develop consecutive offers, supported by information, whilst requesting information from their opponent. A mediator agent, with experience of prior negotiations, suggests how the negotiation may develop. A failed negotiation is a missed opportunity. An observer agent analyses failures looking for new opportunities. The integration of negotiation theory and data mining enables the curious negotiator to discover and exploit negotiation opportunities. Trials will be conducted in electronic business.
Constraint relaxations for discovering unknown sequential patterns
- In Proceedings of the Third International Workshop on Knowledge Discovery in Inductive Databases
, 2005
Cited by 5 (2 self)
The main drawbacks of sequential pattern mining have been its lack of focus on user expectations and the high number of discovered patterns. However, the commonly accepted solution, the use of constraints, reduces the mining process to verifying which of the specified patterns are frequent, rather than discovering unknown and unexpected patterns. In this paper, we propose a new methodology to mine sequential patterns, keeping the focus on user expectations, without compromising the discovery of unknown patterns. Our methodology is based on the use of constraint relaxations, and it consists of using them to filter accepted patterns during the mining process. We propose a hierarchy of relaxations, applied to constraints expressed as context-free languages, classifying the existing relaxations (legal, valid and naïve, proposed in SPIRIT [3]), and proposing several new classes of relaxations, ranging from the approx and non-accepted, to the composition of different types of relaxations, like the approx-legal or the non-prefix-valid relaxations. Finally, we present a case study that shows the results achieved with the application of this methodology on the analysis of the curricular sequences of computer science students.
Mining Interesting Rules in Bank Loans Data
, 2001
Cited by 4 (2 self)
Most of the data mining algorithms produce a long list of rules in which it is up to
Principles for Mining Summaries Using Objective Measures of Interestingness
- In Proceedings of the 12th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'00)
, 2000
Cited by 4 (3 self)
An important problem in the area of data mining is the development of effective measures of interestingness for ranking discovered knowledge. In this paper, we propose five principles that any measure must satisfy to be considered useful for ranking the interestingness of summaries generated from databases. We investigate the problem within the context of summarizing a single dataset which can be generalized in many different ways and to many levels of granularity. We perform a comparative sensitivity analysis of fifteen well-known diversity measures to identify those which satisfy the proposed principles. The fifteen diversity measures have previously been utilized in various disciplines, such as information theory, statistics, ecology, and economics. Their use as objective measures of interestingness for ranking summaries generated from databases is novel. The objective of this work is to gain some insight into the behaviour that can be expected from each of the diversity measures in practice, and to begin to develop a theory of interestingness against which the utility of new measures can be assessed.

