Results 1–10 of 12
Formal Concept Analysis in Information Science
Annual Review of Information Science and Technology, 1996
Extracting Decision Trees from Interval Pattern Concept Lattices
Cited by 2 (0 self)
Abstract. Formal Concept Analysis (FCA) and concept lattices have shown their effectiveness for binary clustering and concept learning. Moreover, several links between FCA and unsupervised data mining tasks, such as itemset mining and association rule extraction, have been emphasized. Several works have also studied FCA in a supervised framework, showing that popular machine learning tools such as decision trees can be extracted from concept lattices. In this paper, we investigate the links between FCA and decision trees with numerical data. Recent works showed the efficiency of “pattern structures” for handling numerical data in FCA, compared to traditional discretization methods such as conceptual scaling.
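The derivation operators at the heart of FCA, referenced throughout these abstracts, can be illustrated with a minimal sketch; the toy binary context and the object/attribute names below are hypothetical, chosen only for illustration:

```python
# Minimal Formal Concept Analysis sketch on a toy binary context
# (objects x attributes). The context itself is hypothetical.
context = {
    "o1": {"a", "b"},
    "o2": {"a", "c"},
    "o3": {"a", "b", "c"},
}
attributes = set().union(*context.values())

def intent(objects):
    """Attributes shared by all given objects (the derivation operator ')."""
    return set.intersection(*(context[o] for o in objects)) if objects else set(attributes)

def extent(attrs):
    """Objects possessing all given attributes (the dual derivation operator ')."""
    return {o for o, have in context.items() if attrs <= have}

# A formal concept is a pair (extent, intent) closed under both operators.
ext = extent({"a", "b"})
print(sorted(ext), sorted(intent(ext)))  # the concept ({o1, o3}, {a, b})
```

Enumerating all such closed pairs and ordering them by extent inclusion yields the concept lattice that the papers above mine and navigate.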
Concept-Based Data Mining with Scaled Labeled Graphs
 Proc. 12th Int. Conf. on Conceptual Structures, ICCS’04. Volume 3127 of Lecture
Cited by 1 (1 self)
Abstract. Graphs with labeled vertices and edges play an important role in various applications, including chemistry. A model of learning from positive and negative examples, naturally described in terms of Formal Concept Analysis (FCA), is used here to generate hypotheses about the biological activity of chemical compounds. A standard FCA technique is used to reduce labeled graphs to an object-attribute representation. The major challenge is the construction of the context, which can involve tens of thousands of attributes. The method is tested against a standard dataset from an ongoing international competition called the Predictive Toxicology Challenge (PTC).
A Parameterized Algorithm for Exploring Concept Lattices
Cited by 1 (1 self)
Abstract. Kuznetsov shows that Formal Concept Analysis (FCA) is a natural framework for learning from positive and negative examples. Indeed, the results of learning from positive examples (respectively, negative examples) are sets of frequent concepts with respect to a minimal support, whose extents contain only positive examples (respectively, negative examples). In terms of association rules, this learning can be seen as searching for the premises of exact rules whose consequence is fixed. When augmented with statistical indicators such as confidence and support, it is possible to extract various kinds of concept-based rules that take exceptions into account. FCA considers attributes as an unordered set. When the attributes of the context are ordered, conceptual scaling allows the related taxonomy to be taken into account by producing a context completed with all attributes deduced from the taxonomy. The drawback of that method is that concept intents contain redundant information. In a previous work, we proposed an algorithm based on Bordat's algorithm to find frequent concepts in a context with a taxonomy. In that algorithm, the taxonomy is taken into account during the computation so as to remove all redundancy from intents. In this article, we propose a parameterized generalization of that algorithm for learning rules in the presence of a taxonomy. By simply changing one component, the parameterized algorithm can compute various kinds of concept-based rules. We present applications of the parameterized algorithm to finding positive and negative rules.
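The learning scheme described in this abstract, keeping frequent concepts whose extent contains only positive examples, can be sketched in a few lines; the candidate extents, example labels, and support threshold below are hypothetical stand-ins for whatever a lattice builder would produce:

```python
# Sketch of Kuznetsov-style hypothesis extraction: retain concept extents
# that meet a minimal support and contain no negative examples.
# All data here are hypothetical, for illustration only.
positives = {"g1", "g2", "g3"}

# Candidate concept extents, as produced by some concept lattice builder.
extents = [
    {"g1", "g2"},
    {"g1", "g2", "g3"},
    {"g2", "g4"},   # mixed: contains a negative example g4
    {"g5"},         # purely negative, and below the support threshold
]

def positive_hypotheses(extents, positives, min_support):
    """Extents meeting the support threshold and free of negative examples."""
    return [e for e in extents if len(e) >= min_support and e <= positives]

for e in positive_hypotheses(extents, positives, min_support=2):
    print(sorted(e))
```

Swapping the filter predicate (e.g. testing `e <= negatives` instead) yields negative hypotheses, which mirrors the paper's idea of changing one component of a parameterized algorithm to obtain different kinds of rules.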
Mining Generalized Graph Patterns based on User Examples
Cited by 1 (0 self)
There has been a lot of recent interest in mining patterns from graphs. Often, the exact structure of the patterns of interest is not known. This happens, for example, when molecular structures are mined to discover fragments useful as features in a chemical compound classification task, or when web sites are mined to discover sets of web pages representing logical documents. Such patterns are often generated from a few small subgraphs (cores), according to certain generalization rules (GRs). We call such patterns “generalized patterns” (GPs). While being structurally different, GPs often perform the same function in the network. Previously proposed approaches to mining GPs either assumed that the cores and the GRs are given, or that all interesting GPs are frequent. These are strong assumptions, which often do not hold in practical applications. In this paper, we propose an approach to mining GPs that is free from the above assumptions. Given a small number of GPs selected by the user, our algorithm discovers all GPs similar to the user examples. First, a machine-learning-style approach is used to find the cores. Second, generalizations of the cores in the graph are computed to identify GPs. Evaluation on synthetic data, generated using real cores and GRs from biological and web domains, demonstrates the effectiveness of our approach.
Some Links Between Decision Tree and Dichotomic Lattice
Cited by 1 (1 self)
Abstract. There are two types of classification methods using a Galois lattice: while most of them rely on selection, recent research focuses on navigation-based approaches. In navigation-oriented methods, classification is performed by navigating through the complete lattice, similarly to a decision tree. When defined from binary attributes obtained after a discretization preprocessing step, and more generally when a non-empty set of complementary attributes can be associated with each binary attribute, lattices are called “dichotomic lattices”. The Navigala approach is a navigation-based classification method that relies on the use of a dichotomic lattice. It was initially proposed for symbol recognition in the field of technical document image analysis. In this paper, we define the structural links between decision trees and dichotomic lattices built from the same table of data described by binary attributes. Under this condition, we prove both that every decision tree is included in the dichotomic lattice and that the dichotomic lattice is the merger of all the decision trees that can be constructed from the same binary data table.
Concept Lattice Mining for Unsupervised Named Entity Annotation
Abstract. We present an unsupervised method for named entity annotation, based on concept lattice mining. We perform a formal concept analysis on relations between named entities and their syntactic dependencies observed in a training corpus. The resulting lattice contains concepts which are considered as labels for named entities and context annotation. Our approach is validated through a cascade evaluation, which shows that supervised named entity classification is improved by using the annotation produced by our unsupervised disambiguation system.
Preprocessing input data for machine learning by FCA
Abstract. The paper presents a use of formal concept analysis in preprocessing input data for machine learning. Two preprocessing methods are presented. The first consists of extending the set of attributes describing objects in the input data table with new attributes, and the second consists of replacing the attributes with new attributes. In both methods, the new attributes are defined by certain formal concepts computed from the input data table. The selected formal concepts are so-called factor concepts obtained by Boolean factor analysis, recently described via FCA. The machine learning method used to demonstrate the ideas is decision tree induction. An experimental evaluation and comparison of the performance of decision trees induced from the original and the preprocessed input data is performed with the standard decision tree induction algorithms ID3 and C4.5 on several benchmark datasets.
Classification by Selecting Plausible Formal Concepts in a Concept Lattice
Abstract. We propose a classification method using a concept lattice and apply it to thesaurus extension. In natural language processing, solving a practical task by extending many thesauri with a corpus is time-consuming. The task can be represented as classifying a set of test data for each of many sets of training data. The method enables us to decrease the time cost by avoiding feature selection, which is generally performed for each pair of a set of test data and a set of training data. More precisely, a concept lattice is generated from only a set of test data, and then each formal concept is given a score using a set of training data. The score represents plausibility as a neighbor of an unknown object, and the unknown object is classified into the classes to which its neighbors belong. Therefore, once we build the lattice, we can classify test data for each set of training data by scoring alone, which has a small computational cost. Through experiments using practical thesauri and corpora, we show that our method classifies more accurately than the k-nearest-neighbor algorithm.
Mining Closed Patterns in Relational, Graph and Network Data
Abstract. Recent theoretical insights have led to the introduction of efficient algorithms for mining closed itemsets. This paper investigates potential generalizations of this paradigm to mine closed patterns in relational, graph and network databases. Several semantics and associated definitions for closed patterns in relational data have been introduced in previous work, but the differences among these and the implications of the choice of semantics were not clear. The paper investigates these implications in the context of generalizing the LCM algorithm, an algorithm for enumerating closed itemsets. LCM is attractive since its run time is linear in the number of closed patterns and since it does not need to store the output patterns in order to avoid duplicates, further reducing its memory footprint and run time. Our investigation shows that the choice of semantics has a dramatic effect on the properties of closed patterns and, as a result, in some settings a generalization of the LCM algorithm is not possible. On the other hand, we provide a full generalization of LCM for the semantic setting previously used by the Claudien system.
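The closure operator underlying closed-itemset miners such as LCM, the starting point this abstract generalizes from, can be sketched briefly; the transaction database below is hypothetical, for illustration only:

```python
# Sketch of the itemset closure operator behind closed-pattern miners
# like LCM: the closure of an itemset is the intersection of all
# transactions that contain it. The database is hypothetical.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
]

def support_set(itemset):
    """Indices of the transactions containing the itemset."""
    return {i for i, t in enumerate(transactions) if itemset <= t}

def closure(itemset):
    """Largest itemset with the same support set as the input."""
    tids = support_set(itemset)
    return set.intersection(*(transactions[i] for i in tids)) if tids else set()

# {"b"} only ever occurs together with "a", so its closure is {"a", "b"}.
print(sorted(closure({"b"})))  # ['a', 'b']
```

An itemset is closed exactly when it equals its own closure; enumerating one representative per closure class is what keeps the run time proportional to the number of closed patterns.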