GREW—A Scalable Frequent Subgraph Discovery Algorithm
 in Fourth IEEE International Conference on Data Mining (ICDM 2004). 2004
, 2003
Cited by 21 (0 self)
Existing algorithms that mine graph datasets to discover patterns corresponding to frequently occurring subgraphs can operate efficiently on graphs that are sparse, contain a large number of relatively small connected components, have vertices with low and bounded degrees, and contain well-labeled vertices and edges. However, a number of applications lead to graphs that do not share these characteristics, and on such graphs these algorithms become highly unscalable. In this paper we propose a heuristic algorithm called GREW to overcome the limitations of existing complete or heuristic frequent subgraph discovery algorithms. GREW is designed to operate on a large graph and to find patterns corresponding to connected subgraphs that have a large number of vertex-disjoint embeddings. Our experimental evaluation shows that GREW is efficient, can scale to very large graphs, and finds nontrivial patterns that cover large portions of the input graph and the lattice of frequent patterns.
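To make the notion of vertex-disjoint embeddings concrete, here is a minimal Python sketch. It is not the GREW algorithm itself: it assumes the embeddings of one pattern have already been found and are given as tuples of vertex ids, and it keeps a disjoint subset with a simple first-fit heuristic. The function name `greedy_disjoint_embeddings` and the sample data are hypothetical.

```python
from typing import Hashable, Iterable, List, Set, Tuple

Embedding = Tuple[Hashable, ...]


def greedy_disjoint_embeddings(embeddings: Iterable[Embedding]) -> List[Embedding]:
    """Greedily keep embeddings that share no vertex with those already kept.

    Illustrative assumption: a first-fit heuristic. The result is a maximal
    (not necessarily maximum) set of vertex-disjoint embeddings.
    """
    used: Set[Hashable] = set()
    chosen: List[Embedding] = []
    for emb in embeddings:
        if used.isdisjoint(emb):
            chosen.append(emb)
            used.update(emb)
    return chosen


# Hypothetical embeddings of a single pattern in one large graph.
found = [(1, 2, 3), (3, 4, 5), (6, 7, 8), (8, 9, 10), (11, 12, 13)]
disjoint = greedy_disjoint_embeddings(found)
print(len(disjoint), disjoint)
```

Counting only disjoint embeddings means that heavily overlapping occurrences of a pattern do not inflate its frequency, which matters when the input is a single large graph rather than many small ones.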
Constructing a decision tree for graph structured data
 IN: PROC. MGTS 2003, HTTP://WWW.AR.SANKEN.OSAKAU.AC.JP/MGTS2003CFP.HTML
, 2003
Cited by 5 (1 self)
Decision Tree Graph-Based Induction (DT-GBI) is proposed, which constructs a decision tree for graph-structured data. Substructures (patterns) are extracted at each node of the decision tree by stepwise pair expansion (pairwise chunking) in GBI and used as attributes for testing. Since attributes (features) are constructed while the classifier is being constructed, DT-GBI can be regarded as a method for feature construction. The predictive accuracy of a decision tree is affected by which attributes (patterns) are used and how they are constructed. A beam search is employed to extract sufficiently discriminative patterns within the greedy search framework. Pessimistic pruning is incorporated to avoid overfitting to the training data. Experiments using a DNA dataset were conducted to examine the effect of the beam width, the number of chunking steps at each node of the decision tree, and the pruning. The results indicate that DT-GBI, which does not use any prior domain knowledge, can construct a decision tree that is comparable to other classifiers constructed using domain knowledge.
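The idea of stepwise pair expansion can be illustrated with a single chunking step. The sketch below is a simplification under stated assumptions, not the DT-GBI procedure: it picks the pair purely by frequency (no beam search, no discriminative criterion), treats the graph as undirected, and does not record which inner node of a chunk an outside edge attaches to. The names `most_frequent_pair` and `chunk_pair` and the toy graph are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

Labels = Dict[int, str]
Edges = List[Tuple[int, int]]


def most_frequent_pair(labels: Labels, edges: Edges) -> Tuple[str, str]:
    """Return the (sorted) label pair that occurs on the most edges."""
    counts = Counter(tuple(sorted((labels[u], labels[v]))) for u, v in edges)
    return counts.most_common(1)[0][0]


def chunk_pair(labels: Labels, edges: Edges, pair: Tuple[str, str]) -> Tuple[Labels, Edges]:
    """Merge each non-overlapping occurrence of `pair` into one new node."""
    labels = dict(labels)
    merged: Dict[int, int] = {}  # old node id -> id of the chunk that absorbed it
    next_id = max(labels) + 1
    for u, v in edges:
        if u in merged or v in merged:
            continue  # one endpoint was already consumed by an earlier chunk
        if tuple(sorted((labels[u], labels[v]))) == pair:
            labels[next_id] = f"({pair[0]}-{pair[1]})"
            merged[u] = merged[v] = next_id
            next_id += 1
    for old in merged:
        del labels[old]
    # Rewire edges to the new chunk nodes, dropping self-loops and duplicates.
    new_edges: Edges = []
    for u, v in edges:
        a, b = merged.get(u, u), merged.get(v, v)
        if a != b and (a, b) not in new_edges and (b, a) not in new_edges:
            new_edges.append((a, b))
    return labels, new_edges


# Toy chain C-O-C-O-N (hypothetical data).
labels = {1: "C", 2: "O", 3: "C", 4: "O", 5: "N"}
edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
pair = most_frequent_pair(labels, edges)
new_labels, new_edges = chunk_pair(labels, edges, pair)
print(pair, new_labels, new_edges)
```

Repeating this step on the rewritten graph grows larger patterns one pair at a time. Because merged nodes are consumed, occurrences that overlap an earlier chunk are skipped.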
Constructing Decision Trees for Graph-Structured Data by Chunkingless Graph-Based Induction
 Advances in Knowledge Discovery and Data Mining, Lecture Notes in Computer Science, Volume 3918
, 2006
Cited by 4 (1 self)
Abstract. A decision tree is an effective means of data classification from which one can obtain rules that are easy to understand. However, decision trees cannot be constructed in the conventional way for data that are not explicitly expressed as attribute-value pairs, such as graph-structured data. We have proposed a novel algorithm, named Chunkingless Graph-Based Induction (Cl-GBI), for extracting typical patterns from graph-structured data. Cl-GBI is an improved version of Graph-Based Induction (GBI), which employs stepwise pair expansion (pairwise chunking) to extract typical patterns from graph-structured data; unlike GBI, Cl-GBI can find overlapping patterns. In this paper, we further propose an algorithm for constructing decision trees for graph-structured data using Cl-GBI. This decision tree construction algorithm, called Decision Tree Chunkingless Graph-Based Induction (DT-ClGBI), can construct a decision tree from a graph-structured dataset while simultaneously constructing attributes useful for classification, using Cl-GBI internally. Since patterns (subgraphs) extracted by Cl-GBI are treated as attributes of a graph, and their existence or nonexistence is used as the attribute value in DT-ClGBI, DT-ClGBI can be regarded as a tree generator equipped with feature construction capability. Experiments conducted on both synthetic and real-world graph-structured datasets show the usefulness and effectiveness of the algorithm.
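The use of pattern existence as a binary attribute can be sketched as follows. This is an illustration under assumptions rather than the DT-ClGBI algorithm: pattern extraction and subgraph matching are taken as already done (each graph is reduced to the set of hypothetical pattern ids found in it), and information gain is assumed as the criterion for choosing the test at a tree node, which the abstract does not specify.

```python
import math
from collections import Counter
from typing import List, Sequence, Set


def entropy(classes: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(classes)
    return -sum((n / total) * math.log2(n / total) for n in Counter(classes).values())


def information_gain(has_pattern: Sequence[bool], classes: Sequence[str]) -> float:
    """Gain from splitting the graphs on presence/absence of one pattern."""
    yes = [c for h, c in zip(has_pattern, classes) if h]
    no = [c for h, c in zip(has_pattern, classes) if not h]
    remainder = sum(len(part) / len(classes) * entropy(part) for part in (yes, no) if part)
    return entropy(classes) - remainder


# Hypothetical data: patterns found in each graph, and each graph's class.
graph_patterns: List[Set[str]] = [{"p1", "p2"}, {"p1"}, {"p2"}, set()]
classes = ["pos", "pos", "neg", "neg"]

candidates = sorted(set().union(*graph_patterns))
gains = {
    p: information_gain([p in found for found in graph_patterns], classes)
    for p in candidates
}
best = max(candidates, key=lambda p: gains[p])
print(best, gains)
```

Here `p1` separates the two classes perfectly while `p2` carries no information, so `p1` would be chosen as the test at this node. The graphs would then be partitioned by whether they contain `p1`, and the process repeated on each part.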
Mining Discriminative Patterns from Graph Structured Data with Constrained Search
Cited by 3 (0 self)
Abstract. A graph mining method, Chunkingless Graph-Based Induction (Cl-GBI), finds typical patterns that appear in graph-structured data by an operation called chunkingless pairwise expansion, which generates pseudo-nodes from selected pairs of nodes in the data. Cl-GBI can extract overlapping subgraphs, but at a higher cost in time and space. As a result, Cl-GBI may fail to extract patterns large enough to describe the characteristics of the data within limited time and computational resources. In such a case, the extracted patterns may be of little interest to domain experts. To mine more discriminative patterns that cannot be extracted by the current Cl-GBI, we introduce a search algorithm guided by domain knowledge or the interests of domain experts. We further show experimentally, using both synthetic and real-world datasets, that the proposed method can efficiently extract more discriminative patterns.
A Survey on Assorted Approaches to Graph Data Mining
Cited by 3 (0 self)
Graph mining has become a popular area of research in recent years because of its numerous applications in a wide variety of practical fields, including computational biology, sociology, software bug localization, keyword search, and computer networking. Different applications result in graphs of different sizes and complexities. Graph mining is an important tool for transforming graph data into useful information. We investigate recurring patterns in real-world graphs to gain a deeper understanding of their structure. We can extract normal and abnormal subgraphs, thereby detecting suspicious nodes and outliers in existing graphs. In this paper we present a survey of various approaches to mining graphs. These approaches are used to extract patterns, trends, classes, and clusters from graphs.
Faster Computation of the Direct Product Kernel for Graph Classification
Abstract. The direct product kernel, introduced by Gärtner et al. for graph classification, is based on defining a feature for every possible label sequence in a labelled graph and counting how many label sequences in two given graphs are identical. Although the direct product kernel has achieved promising results in terms of accuracy, the kernel computation is not feasible for large graphs. This is because computing the direct product kernel for two graphs essentially requires either inverting or diagonalizing the adjacency matrix of the direct product of these two graphs. For two graphs with adjacency matrices of sizes m and n, the adjacency matrix of their direct product graph can be of size mn in the worst case. As both matrix inversion and matrix diagonalization are O(n^3) in the general case, computing the direct product kernel is O((mn)^3). Our survey of data sets in graph classification indicates that most graphs have adjacency matrices
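The construction behind the direct product kernel can be sketched in a few lines of pure Python. This is an illustration under assumptions, not the faster method the paper proposes: vertices are paired when their labels match, edge labels are ignored, and the kernel is approximated by a truncated weighted sum over walk lengths instead of the matrix inversion or diagonalization mentioned in the abstract. The names `direct_product`, `walk_kernel`, `gamma`, and `max_len` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# A graph is (vertex labels, undirected edges).
Graph = Tuple[Dict[int, str], Set[Tuple[int, int]]]


def direct_product(g1: Graph, g2: Graph) -> List[List[int]]:
    """Adjacency matrix of the direct product graph of two labelled graphs."""
    (lab1, e1), (lab2, e2) = g1, g2
    # Product vertices: pairs of vertices that carry the same label.
    nodes = [(u, v) for u in sorted(lab1) for v in sorted(lab2) if lab1[u] == lab2[v]]

    def adjacent(edges: Set[Tuple[int, int]], a: int, b: int) -> bool:
        return (a, b) in edges or (b, a) in edges

    # Two product vertices are adjacent when both components are adjacent.
    return [[1 if adjacent(e1, u, u2) and adjacent(e2, v, v2) else 0
             for (u2, v2) in nodes] for (u, v) in nodes]


def walk_kernel(adj: List[List[int]], gamma: float, max_len: int) -> float:
    """Sum of gamma**n * (number of walks of length n) for n = 0..max_len."""
    vec = [1.0] * len(adj)  # number of length-0 walks starting at each vertex
    total, weight = 0.0, 1.0
    for _ in range(max_len + 1):
        total += weight * sum(vec)
        vec = [sum(a * x for a, x in zip(row, vec)) for row in adj]
        weight *= gamma
    return total


# Two small labelled graphs (hypothetical data): A-B and A-B-A.
g1: Graph = ({0: "A", 1: "B"}, {(0, 1)})
g2: Graph = ({0: "A", 1: "B", 2: "A"}, {(0, 1), (1, 2)})
adj = direct_product(g1, g2)
value = walk_kernel(adj, gamma=0.5, max_len=2)
print(adj, value)
```

Each walk in the product graph corresponds to a pair of walks with the same label sequence, one in each input graph. When all labels match, the product graph has mn vertices, which is the worst case the abstract refers to.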
Graph Clustering based on Structural Similarity of Fragments
Abstract. Resources available over the Web are often used in combination to meet a specific need of a user. Since resource combinations can be represented as graphs in terms of the relations among the resources, locating desirable resource combinations can be formulated as locating the corresponding graph. This paper describes a graph clustering method based on the structural similarity of fragments (currently, connected subgraphs are considered) in graph-structured data. A fragment is characterized by the connectivity (degree) of a node in the fragment. A fragment spectrum of a graph is created from the frequency distribution of its fragments. Thus, the representation of a graph is transformed into a fragment spectrum in terms of the properties of the fragments in the graph. Graphs are then clustered with respect to the transformed spectra by applying a standard clustering method. We also devise a criterion to determine the number of clusters by defining a pseudo-entropy of a cluster. Preliminary experiments with synthesized data were conducted, and the results are reported.
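As a rough illustration of turning a graph into a frequency distribution over degree-characterized fragments, consider the sketch below. The abstract does not define fragments precisely, so the choices here are assumptions: each edge is treated as a fragment, it is characterized by the sorted degrees of its two endpoints, and spectra are compared with an L1 distance. The names `fragment_spectrum` and `spectrum_distance` are hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple

Edge = Tuple[int, int]


def fragment_spectrum(edges: Iterable[Edge]) -> Counter:
    """Histogram of edge fragments keyed by the sorted endpoint degrees."""
    edges = list(edges)
    degree: Counter = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return Counter(tuple(sorted((degree[u], degree[v]))) for u, v in edges)


def spectrum_distance(a: Counter, b: Counter) -> int:
    """L1 distance between two spectra (missing keys count as zero)."""
    return sum(abs(a[k] - b[k]) for k in set(a) | set(b))


# Hypothetical graphs: two paths on four nodes and one star with three leaves.
path_a = fragment_spectrum([(0, 1), (1, 2), (2, 3)])
path_b = fragment_spectrum([(10, 11), (11, 12), (12, 13)])
star = fragment_spectrum([(0, 1), (0, 2), (0, 3)])
print(path_a, star, spectrum_distance(path_a, star))
```

Under these assumptions the two paths get identical spectra regardless of node ids, while the star differs from both. The resulting distances (or the spectra as vectors) could then be passed to any standard clustering method.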