Results 1-10 of 22
Revisiting Numerical Pattern Mining with Formal Concept Analysis
Cited by 12 (4 self)
We propose a definition of interval patterns for numerical data. Intuitively, each object of a numerical dataset corre... [inria00584371]
Mining Biclusters of Similar Values with Triadic Concept Analysis
Cited by 4 (0 self)
Abstract. Biclustering numerical data became a popular data mining task in the early 2000s, especially for analysing gene expression data. A bicluster reflects a strong association between a subset of objects and a subset of attributes in a numerical object/attribute data table. So-called biclusters of similar values can be thought of as maximal subtables with close values. Only a few methods address a complete, correct and non-redundant enumeration of such patterns, which is a well-known intractable problem, while no formal framework exists. In this paper, we introduce important links between biclustering and formal concept analysis. More specifically, we show for the first time that Triadic Concept Analysis (TCA) provides a nice mathematical framework for biclustering. Interestingly, existing TCA algorithms, which usually apply to binary data, can be used (directly or with slight modifications) after a preprocessing step to extract maximal biclusters of similar values.
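To make the notion of "maximal subtables with close values" concrete, here is a minimal sketch. It assumes that "close" means all values in the subtable differ by at most a tolerance `theta`; the abstract gives no such parameter, so `theta`, the function names, and the toy table are all illustrative. The sketch only checks a candidate bicluster and does not reproduce the TCA-based enumeration the paper describes.

```python
def is_similar(table, rows, cols, theta):
    """True if all values in the subtable rows x cols lie within theta of each other.

    Assumes rows and cols are non-empty collections of indices.
    """
    values = [table[r][c] for r in rows for c in cols]
    return max(values) - min(values) <= theta


def is_maximal(table, rows, cols, theta):
    """True if (rows, cols) is similar and no single row or column can be added."""
    rows, cols = set(rows), set(cols)
    if not is_similar(table, rows, cols, theta):
        return False
    other_rows = set(range(len(table))) - rows
    other_cols = set(range(len(table[0]))) - cols
    # Any similar superset contains a similar one-step extension,
    # so testing single additions is enough.
    if any(is_similar(table, rows | {r}, cols, theta) for r in other_rows):
        return False
    if any(is_similar(table, rows, cols | {c}, theta) for c in other_cols):
        return False
    return True


# Hypothetical 3x3 numerical table (objects as rows, attributes as columns).
toy_table = [
    [1, 2, 9],
    [2, 1, 8],
    [7, 6, 1],
]
```

With `theta = 1`, the subtable on rows {0, 1} and columns {0, 1} is a maximal bicluster under this assumed definition, whereas row {0} alone with the same columns is similar but not maximal.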
Extracting Decision Trees from Interval Pattern Concept Lattices
Cited by 3 (0 self)
Abstract. Formal Concept Analysis (FCA) and concept lattices have shown their effectiveness for binary clustering and concept learning. Moreover, several links between FCA and unsupervised data mining tasks such as itemset mining and association rule extraction have been emphasized. Several works have also studied FCA in a supervised framework, showing that popular machine learning tools such as decision trees can be extracted from concept lattices. In this paper, we investigate the links between FCA and decision trees with numerical data. Recent works showed the efficiency of "pattern structures" for handling numerical data in FCA, compared to traditional discretization methods such as conceptual scaling.
Quantitative Concept Analysis
 In Florent Domenach, Dmitry
, 2012
Cited by 2 (2 self)
Abstract. Formal Concept Analysis (FCA) begins from a context, given as a binary relation between some objects and some attributes, and derives a lattice of concepts, where each concept is given as a set of objects and a set of attributes, such that the first set consists of all objects that satisfy all attributes in the second, and vice versa. Many applications, though, provide contexts with quantitative information, telling not just whether an object satisfies an attribute, but also quantifying this satisfaction. Contexts in this form arise as rating matrices in recommender systems, as occurrence matrices in text analysis, as pixel intensity matrices in digital image processing, etc. Such applications have attracted a lot of attention, and several numeric extensions of FCA have been proposed. We propose the framework of proximity sets (proxets), which subsume partially ordered sets (posets) as well as metric spaces. One feature of this approach is that it extracts from quantified contexts quantified concepts, and thus allows full use of the available information. Another feature is that the categorical approach allows analyzing any universal properties that the classical FCA and the new versions may have, and thus provides structural guidance for aligning and combining the approaches.
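The classical (binary) construction described at the start of this abstract can be sketched as follows. This is a brute-force illustration only, assuming a tiny context stored as a dictionary from objects to attribute sets; the function names and the toy context are hypothetical, and the enumeration is exponential in the number of objects. It does not cover the proxet framework the paper proposes.

```python
from itertools import combinations


def common_attributes(objects, context, all_attributes):
    """Attributes satisfied by every object in `objects` (all attributes if empty)."""
    result = set(all_attributes)
    for obj in objects:
        result &= context[obj]
    return result


def objects_having(attributes, context):
    """Objects that satisfy every attribute in `attributes`."""
    return {obj for obj, attrs in context.items() if attributes <= attrs}


def concepts(context):
    """Enumerate all formal concepts (extent, intent) by closing every object subset."""
    all_attributes = set().union(*context.values())
    found = set()
    objs = sorted(context)
    for size in range(len(objs) + 1):
        for subset in combinations(objs, size):
            intent = common_attributes(subset, context, all_attributes)
            extent = objects_having(intent, context)
            found.add((frozenset(extent), frozenset(intent)))
    return found


# Hypothetical binary context: object -> set of attributes it satisfies.
toy_context = {
    "duck": {"flies", "swims"},
    "eagle": {"flies"},
    "trout": {"swims"},
}
lattice = concepts(toy_context)
```

Each pair in `lattice` satisfies the condition stated in the abstract: the extent is exactly the set of objects having all attributes of the intent, and the intent is exactly the set of attributes shared by all objects of the extent.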
Mining Definitions from RDF Annotations Using Formal Concept Analysis
Cited by 1 (1 self)
The popularization and quick growth of Linked Open Data (LOD) has led to challenging aspects regarding quality assessment and data exploration of the RDF triples that shape the LOD cloud. Particularly, we are interested in the completeness of data and its potential to provide concept definitions in terms of necessary and sufficient conditions. In this work we propose a novel technique based on Formal Concept Analysis which organizes RDF data into a concept lattice. This allows data exploration as well as the discovery of implications, which are used to automatically detect missing information and then to complete RDF data. Moreover, this is a way of reconciling syntax and semantics in the LOD cloud. Finally, experiments on the DBpedia knowledge base show that the approach is well-founded and effective.
A One-Pass Triclustering Approach: Is There any Room for Big Data?
Cited by 1 (1 self)
Abstract. An efficient one-pass online algorithm for triclustering of binary data (triadic formal contexts) is proposed. This algorithm is a modified version of the basic algorithm for the OAC-triclustering approach, but it has linear time and memory complexities with respect to the cardinality of the underlying ternary relation and can be easily parallelized in order to be applied to the analysis of big datasets. The results of computer experiments show the efficiency of the proposed algorithm.
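As a rough illustration of how a single pass over a ternary relation can collect tricluster components, consider the sketch below. It assumes a prime-operator style of OAC-triclustering, in which the tricluster generated by a triple (g, m, b) consists of the objects related to (m, b), the attributes related to (g, b), and the conditions related to (g, m). The abstract does not spell out these details, so the data layout, names, and toy data are assumptions rather than the authors' implementation.

```python
from collections import defaultdict


def oac_prime_triclusters(triples):
    """Collect prime-based triclusters from (object, attribute, condition) triples."""
    objs_of = defaultdict(set)   # (attribute, condition) -> objects
    attrs_of = defaultdict(set)  # (object, condition) -> attributes
    conds_of = defaultdict(set)  # (object, attribute) -> conditions
    seen = []

    # Single pass: each triple triggers three constant-time set updates.
    for g, m, b in triples:
        objs_of[(m, b)].add(g)
        attrs_of[(g, b)].add(m)
        conds_of[(g, m)].add(b)
        seen.append((g, m, b))

    # Materialize each generated tricluster; copying the sets here is for
    # readability and duplicate removal, not part of the one-pass bookkeeping.
    return {
        (
            frozenset(objs_of[(m, b)]),
            frozenset(attrs_of[(g, b)]),
            frozenset(conds_of[(g, m)]),
        )
        for g, m, b in seen
    }


# Hypothetical folksonomy-like data: (user, tag, resource).
toy_triples = [
    ("u1", "t1", "r1"),
    ("u2", "t1", "r1"),
    ("u1", "t2", "r1"),
]
triclusters = oac_prime_triclusters(toy_triples)
```

The dictionaries grow by at most one entry per component for each incoming triple, which is consistent with the linear time and memory behaviour claimed in the abstract, although the final materialization step in this sketch is not optimized.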
Semi-Supervised Learning on Closed Set Lattices
We propose a new approach for semi-supervised learning using closed set lattices, which have recently been used for frequent pattern mining within the framework of the data analysis technique of Formal Concept Analysis (FCA). We present a learning algorithm, called SELF (SEmi-supervised Learning via FCA), which performs as a multiclass classifier and a label ranker for mixed-type data containing both discrete and continuous variables, whereas only a few learning algorithms, such as the decision tree-based classifier, can directly handle mixed-type data. From both labeled and unlabeled data, SELF constructs a closed set lattice, which is a partially ordered set of data clusters with respect to subset inclusion, via FCA together with discretizing continuous variables, followed by learning classification rules through finding maximal clusters on the lattice. Moreover, it can weight each classification rule using the lattice, which gives a partial order of preference over class labels. We illustrate experimentally the competitive performance of SELF in classification and ranking compared to other learning algorithms using UCI datasets.
Textual Information Extraction in Document Images Guided by a Concept Lattice
Abstract. Text Information Extraction in images is concerned with extracting the relevant text data from a collection of document images. It consists of localizing (determining the location) and recognizing (transforming into plain text) text contained in document images. In this work we present a textual information extraction model consisting of a set of prototype regions along with pathways for browsing through these prototype regions. The proposed model is constructed in four steps: (1) produce synthetic invoice data containing the textual information of interest, along with their spatial positions; (2) partition the produced data; (3) derive the prototype regions from the obtained partition clusters; (4) build the concept lattice of a formal context derived from the prototype regions. Experimental results on a corpus of 1000 real-world scanned invoices show that the proposed model significantly improves the extraction rate of an Optical Character Recognition (OCR) engine.
Formal Concept Analysis of Disease Similarity
, 2012
Previous work shows that gene associations and network properties common between pairs of diseases can provide molecular evidence of comorbidity, but relationships among diseases may extend to larger groups. Formal concept analysis allows the study of multiple diseases based on a concept lattice whose structure indicates gene set commonality. We use the concept lattice for gene associations to evaluate the complexity of the relationships among diseases, and to identify concepts whose gene sets are candidates for further functional analysis. For this, we define a heuristic on the lattice structure that allows the identification of concepts whose gene sets indicate strong relationships among the included diseases, which are distinguished from other diseases in the family. Applying this approach to a family of renal diseases, we demonstrate that it finds gene sets that may be promising for studying common (and differing) mechanisms among a family of comorbid or phenotypically related diseases.
Putting OAC-triclustering on MapReduce
Abstract. In our previous work an efficient one-pass online algorithm for triclustering of binary data (triadic formal contexts) was proposed. This algorithm is a modified version of the basic algorithm for the OAC-triclustering approach; it has linear time and memory complexities. In this paper we parallelise it via the MapReduce framework in order to make it suitable for big datasets. The results of computer experiments show the efficiency of the proposed algorithm; for example, it outperforms the online counterpart on the Bibsonomy dataset with ≈ 800,000 triples.