### Citations

3925 | Emergence of scaling in random networks
- Barabási, Albert
- 1999
Citation Context ...rties of interest. Barabási-Albert model The observation that a number of networks do not match the predictions of the ER-model for the degree distribution has been reported by Barabási and Albert in =-=[6]-=-. The authors suggest that two important processes occurring in real-world large-scale networks - network growth and preferential attachment to existing nodes - are not being accounted for and propose ... |

3321 | Collective dynamics of ‘small-world’ networks
- Watts, Strogatz
- 1998
Citation Context ...the clustering coefficient as a measure of “cliquishness of a friendship circle”, which they define as the average over the rate of existing to possible edges among the direct neighbors of every node =-=[80]-=-. The clustering coefficient has a direct relation to the number of triangles in the graph, a topic which has been researched in the data mining community as we discuss below. While many more statistics ov... |

3051 | On the evolution of random graphs
- Erdős, Rényi
- 1961
Citation Context ...troduced by Gilbert in [35] for which the ER-model is often confused [10]; interestingly though, many properties are common for graphs from either graph probability space [9]). In a series of papers (=-=[28, 29]-=-) they study this model thoroughly and provide asymptotic bounds on many of its properties (degree distribution, probability of all the nodes to belong to the giant ... |

2384 | On generalized graphs
- Bollobás
- 1965
Citation Context ... the Bernoulli-model first introduced by Gilbert in [35] for which the ER-model is often confused [10]; interestingly though, many properties are common for graphs from either graph probability space =-=[9]-=-). In a series of papers ([28, 29]) they study this model thoroughly and provide asymptotic bounds on many of its properties (degree distribution, probability of all the nodes to belong to the gia... |

1747 | Mining frequent patterns without candidate generation
- Han
- 2004
Citation Context ... specific children one has only to check these transactions. Indeed, transactions not matched by a parent node will not be matched by its children. One example of a depth-first algorithm is FP-growth =-=[38]-=-. 1.3.3 Graph mining settings There are several significant factors influencing the characteristics of a pattern mining problem. A first factor is the graph class. One can either consider the fully ge... |

1529 | Spectral Graph Theory
- Chung
- 1997
Citation Context ...on large networks is based on properties of the adjacency matrix. One example of this is [69]. The field investigating the properties of the adjacency matrix of graphs is called spectral graph theory =-=[19]-=-. 1.7 Glossary Association rule: a rule representing a correlation between two patterns, the antecedent and the consequent. Asymptotic complexity: an expression indicating how the cost of ... |

1315 | Introduction to Graph Theory
- West
- 2001
Citation Context ...he fields of statistical relational learning [34]. The field of graph mining is related to many other fields in the literature which deserve reading. First, there are the fields of graph theory [26], =-=[81]-=- and algorithmic graph theory [52]. Many, often old, results provide excellent inspiration for improving graph mining algorithms. A work providing references classified by problem type is ([36]). Seco... |

1195 | Graph Theory
- Diestel
- 2005
Citation Context ...ude and provide some pointers for further reading. 1.2 Basic concepts We will first briefly review some basic terminology. A more in-depth introduction to graph theory and terminology can be found in =-=[25]-=-. Definition 1 (directed graph) A directed graph is a tuple (V,E, λ) where V is a finite set of vertices (also called nodes), E ⊆ V × V is a set of arcs, and λ : V ∪... |

988 | Matrix multiplication via arithmetic progressions
- Coppersmith, Winograd
- 1990
Citation Context ...ms for counting triangles are based on matrix multiplication. The asymptotic time complexity of the fastest existing (theoretical) method for matrix multiplication takes $O(n^{2.37})$ time and $\Theta(n^2)$ space =-=[21]-=-. For sparse graphs, in [3], an $O(m^{1.41})$ time and $\Theta(n^2)$ space algorithm, NodeIterator, is proposed. It computes for each node its neighborhood and then checks how many edges exist among its neighbors.... |

786 | Network motifs: simple building blocks of complex networks
- Milo, Shen-Orr, et al.
- 2002
Citation Context ...unction in determining gene expression. Similarly, [58] presented a model of signaling pathways in hippocampal CA1 neurons, and found a high fraction of positive and negative feedback loop motifs. In =-=[62]-=-, the authors found network motifs in networks from biochemistry, neurobiology, ecology, and engineering. They saw that the motifs shared by ecological food webs were distinct from the motifs shared b... |

646 | gSpan: Graph-based substructure pattern mining
- Yan, Han
- 2002
Citation Context ... number of new candidate patterns to consider. AcGM [44] extends AGM by allowing for both induced subgraph isomorphism and normal subgraph isomorphism, and by considering hierarchies of labels. gSpan =-=[82]-=- is one of the most popular graph mining systems. It uses a depth-first algorithm. This system is also based on an orderly generation of patterns using a canonical form, called depth-first search (DFS... |

606 | Network motifs in the transcriptional regulation network of Escherichia coli
- Shen-Orr, Milo, et al.
- 2002
Citation Context ...f triangles in the graph when the input is provided as a stream which is too large to store. Network motifs Network motifs are subgraphs which occur much more often than they occur in random networks =-=[74]-=-. Several researchers have studied in moderately large application domains motifs which are sufficiently small to match without significant computational challenges. The authors of [74] found that muc... |

545 | A note on inductive generalization.
- Plotkin
- 1969
Citation Context ...ns under homomorphism is their product graph. In fact, this least upper bound is well-known in the field of Inductive Logic Programming as the least general generalization of two logical conjunctions =-=[68]-=-. Unfortunately, the size (as well as the treewidth) of the least upper bound of a set of graph patterns under homomorphism is in general not bounded by a polynomial in their total size. As a conseque... |

488 | Topological graph theory
- Gross, Tucker
- 1987
Citation Context ... [26], [81] and algorithmic graph theory [52]. Many, often old, results provide excellent inspiration for improving graph mining algorithms. A work providing references classified by problem type is (=-=[36]-=-). Second, there is a large literature on techniques investigating large graphs and networks. Recent work in this direction can easily be found in the proceedings of recent editions of major data mini... |

354 | Evolution of the social network of scientific collaborations
- Barabási, Jeong, et al.
- 2002
Citation Context ...er of networks from a wide range of domains: biology, technological networks, citation networks, networks of chemical interactions etc. A detailed analysis of the network of scientific collaborations =-=[7]-=- has prompted the authors to introduce further refinements to this model, namely that “internal links” creation is also governed by the “rich get richer” scheme. 1.4.2 Pattern matching in ... |

346 | PrefixSpan: Mining sequential patterns efficiently by prefix-projected pattern growth
- Pei, Han, et al.
- 2001
Citation Context ...s sequences. Finding frequent subgraph isomorphic patterns is equivalent to finding frequent subsequences. Many subsequence mining algorithms have been proposed in the literature, amongst others [2], =-=[67]-=- and [4]. • An undirected cycle is a graph C with V (C) = {v1, v2, . . . , vn} and E = {{v1, v2}, . . . , {vn−1, vn}, {vn, v1}}. A tree is a graph that does not contain a cycle as a subgraph. In the d... |

345 | An algorithm for subgraph isomorphism
- Ullmann
- 1976
Citation Context ...n in the database graph, node by node, backtracking when no solution can be found. A classical search heuristic is to first map the node for which the number of available alternatives is the smallest =-=[76]-=-. Prioritizing the more important patterns Even though it does not influence the total cost, delivering the most important patterns first may be more desirable than treating all patterns equally. This... |

339 | Introduction to Statistical Relational Learning
- Getoor, Taskar
- 2007
Citation Context ... This chapter focused on graph pattern mining. [70] surveys prediction problems in graphs. Graphs from a probabilistic model point of view are studied in the fields of statistical relational learning =-=[34]-=-. The field of graph mining is related to many other fields in the literature which deserve reading. First, there are the fields of graph theory [26], [81] and algorithmic graph theory [52]. Many, oft... |

338 | Fast discovery of association rules.
- Agrawal, Mannila, et al.
- 1996
Citation Context ...o this central one. When considering star graphs where leaf vertices are labeled with items from some set I, mining frequent star graph patterns of at least 2 vertices is equivalent to itemset mining =-=[1]-=- where the set of items is I and the transactions are the sets of leaf labels of individual star graphs. Even though problems only involving itemset mining can be handled more efficiently with a speci... |

238 | Thirty years of graph matching in pattern recognition
- Conte, Foggia, et al.
Citation Context ...lution by relaxation [49]. Other strategies include continuous optimization, spectral methods, artificial neural networks, relaxation labeling and so on. We do not list all the algorithms here. See =-=[20]-=- for a comprehensive survey of pattern matching. 1.4.3 Pattern mining support measures Pattern matching, discussed in the previous section, allows one to list all embeddings (or images) of a pattern P... |

215 | On generating all maximal independent sets
- Johnson, Yannakakis, et al.
- 1988
Citation Context ...e label values. As a consequence, one can not hope to obtain a polynomial time algorithm. To perform a more refined analysis, the following complexity classes are usually considered in the literature =-=[48]-=-. They are based on the idea that a listing algorithm (here a pattern mining algorithm) lists its solutions (frequent patterns) one by one, and that one can study the delay between outputting two cons... |

182 | A (sub)graph isomorphism algorithm for matching large graphs
- Cordella, Foggia, et al.
Citation Context ...ut still widely used such backtracking algorithm is described in [76]. In [57] the search method exploits a heuristic derived from constraint satisfaction to reduce the cost. VF and its successor VF2 =-=[22]-=- are two search algorithms implementing a number of efficiently computable heuristics, and have been used in several transactional graph mining systems. There are algorithms using other strategies, e.... |

182 | A distance measure between attributed relational graphs for pattern recognition
- Sanfeliu, Fu
- 1983
Citation Context ...ave noise, and 2) patterns may have several variations. The tree search based strategy is also widely used in inexact matching and most of the algorithms are based on the edit distance (e.g. [14] and =-=[71]-=-). In order to compute the edit distance of two graphs, we first have to define a set of allowed operations with different costs on graphs. Usually, insertions, deletions and substitutions of vertices... |

159 | A quickstart in frequent structure mining can make a difference
- Nijssen, Kok
- 2004
Citation Context ...to be terminated before outputting all answers. Also, when the time needed to find patterns diverges strongly, finding the easier-to-find patterns first may be called for. One example is described in =-=[65]-=- where paths are mined first, then trees and only at the end, cyclic graphs. Hierarchical approaches In some cases, sets of nodes can be grouped together into larger entities, and working with such gr... |

141 | Random graphs
- Gilbert
- 1959
Citation Context ... random-graphs model, attributed to Erdős and Rényi, in which all missing edges have the same (uniform) probability of appearing (in contrast with the Bernoulli-model first introduced by Gilbert in =-=[35]-=- for which the ER-model is often confused [10]; interestingly though, many properties are common for graphs from either graph probability space [9]). In a series of papers ([28, 29]) they study this m... |

128 | Finding frequent patterns in a large sparse graph
- Kuramochi, Karypis
- 2005
Citation Context ...rt measures. We cannot take the MIS support measure into practice directly because it is NP-hard to compute, and remains so even when the degree of the overlap graph is bounded. Kuramochi and Karypis =-=[54]-=- designed two practical mining algorithms using the MIS support measure. In their algorithms, the MIS support is computed approximately. Bringmann and Nijssen [12] examined the expense of computing th... |

122 | Finding and counting given length cycles
- Alon, Yuster, et al.
- 1994
Citation Context ...e based on matrix multiplication. The asymptotic time complexity of the fastest existing (theoretical) method for matrix multiplication takes $O(n^{2.37})$ time and $\Theta(n^2)$ space [21]. For sparse graphs, in =-=[3]-=-, an $O(m^{1.41})$ time and $\Theta(n^2)$ space algorithm, NodeIterator, is proposed. It computes for each node its neighborhood and then checks how many edges exist among its neighbors. In [46], Itai and Rodeh ex... |

120 | Contraction hierarchies: Faster and simpler hierarchical routing in road networks.
- Geisberger, Sanders, et al.
- 2008
Citation Context ...in functional groups such as rings and chains [41], [11], [23]. Also, in traffic networks, streets may be grouped into districts or cities, an approach which is typically used in routing applications =-=[33]-=-. 1.3.5 Condensed representations When performing pattern mining, one is usually only interested in the most valuable patterns. E.g., patterns may be processed afterwards, either by a human or an algo... |

114 | Finding a minimum circuit in a graph.
- Itai, Rodeh
- 1978
Citation Context ...r sparse graphs, in [3], an $O(m^{1.41})$ time and $\Theta(n^2)$ space algorithm, NodeIterator, is proposed. It computes for each node its neighborhood and then checks how many edges exist among its neighbors. In =-=[46]-=-, Itai and Rodeh exploited the NodeIterator idea to present an algorithm that counts the number of triangles in $O(m^{3/2})$ time. This algorithm computes spanning trees ... |

106 | Inexact graph matching for structural pattern recognition
- Bunke, Allermann
- 1983
Citation Context ...ata may have noise, and 2) patterns may have several variations. The tree search based strategy is also widely used in inexact matching and most of the algorithms are based on the edit distance (e.g. =-=[14]-=- and [71]). In order to compute the edit distance of two graphs, we first have to define a set of allowed operations with different costs on graphs. Usually, insertions, deletions and substitutions of... |

100 | Complete mining of frequent patterns from graphs: Mining graph data.
- Inokuchi, Washio, et al.
- 2003
Citation Context ...For example, [83] describes a method to prune when searching for patterns correlating with the target attribute. 1.3.6 Transactional graph mining systems One of the first graph mining systems was AGM =-=[45]-=-, mining induced subgraphs, and its variants such as AcGM. This system used a matrix-based canonical form, and introduced operations to join the canonical form of smaller patterns into the canonical f... |

99 | A machine learning approach to building domain-specific search engines.
- McCallum, Nigam, et al.
- 1999
Citation Context ...escribing the analysis of large networks in a wide range of application domains. Here we only give a few examples: • In co-author networks, two people are connected if they published a paper together =-=[60]-=-. • In citation networks, articles are represented with vertices and citations with edges. One example of such network is DBLP1. • In molecule interaction networks, two molecules are connected if they... |

77 | A comparison of the Delsarte and Lovász bounds
- Schrijver
- 1979
Citation Context ...n be computed with semidefinite programming, and hence in polynomial time. However, existing methods are still very expensive to compute ϑ [51, 47, 43]. We point out that the Schrijver theta value ϑ′ =-=[73]-=- can be also used as a normalized anti-monotonic support measure. It is very similar to the ϑ, and we always have MIS ≤ ϑ′ ≤ ϑ ≤ MCP. Wang and Ramon [78] observed that those images which share a commo... |

69 | Efficient semi-streaming algorithms for local triangle counting in massive graphs
- Becchetti, Boldi, et al.
Citation Context ... problem, and is related to the calculation of the clustering coefficient. Furthermore, many interesting graph mining tasks are based on counting the number of triangles in the graph. For example, in =-=[8]-=- a method based on triangles was proposed for detecting spamming. Both exact and approximate triangle counting algorithms have been proposed. In this section, we assume that G is an undirected loop-fr... |

68 | Cliques in random graphs
- Bollobás, Erdős
- 1976
Citation Context ... Rényi, in which all missing edges have the same (uniform) probability of appearing (in contrast with the Bernoulli-model first introduced by Gilbert in [35] for which the ER-model is often confused =-=[10]-=-; interestingly though, many properties are common for graphs from either graph probability space [9]). In a series of papers ([28, 29]) they study this model thoroughly and provide asymptotic bounds ... |

68 | Counting triangles in data streams.
- Buriol, Frahling, et al.
- 2006
Citation Context ...les in the resulting sparser graph, and divides the result by $p^3$. If p is not too small and triangles are sufficiently uniformly distributed, one can show that the result is a close approximation. In =-=[15]-=-, the authors present estimators for the number of triangles in the graph when the input is provided as a stream which is too large to store. Network motifs Network motifs are subgraphs which occur mu... |

52 | Frequent Subtree Mining - An Overview.
- Chi, Muntz, et al.
- 2004
Citation Context ...fied on its children. An attribute tree is a rooted tree for which for each vertex all its children have a distinct label. A huge amount of tree mining algorithms have been proposed in the literature =-=[18]-=-. Common applications include mining parse trees (in programming languages or natural language processing), XML and many other tree-structured file formats. • Some approaches consider specifically mol... |

50 | Constraint satisfaction algorithms for graph pattern matching.
- Larrosa, Valiente
- 2002
Citation Context ...hown to be #P-complete in general, and almost all practical testing procedures are based on search with backtracking. An old but still widely used such backtracking algorithm is described in [76]. In =-=[57]-=- the search method exploits a heuristic derived from constraint satisfaction to reduce the cost. VF and its successer VF2 [22] are two search algorithms implementing a number of efficiently computable... |

48 | Efficient Subgraph Isomorphism Detection: A Decomposition Approach
- Messmer, Bunke
- 2000
Citation Context ...putable heuristics, and have been used in several transactional graph mining systems. There are algorithms using other strategies, e.g., decision tree based techniques [66]. The reader is referred to =-=[61]-=- for a short review of subgraph isomorphism algorithms. When the database graph is large and has a high average degree, these matching algorithms become intractable even for reasonably small patterns.... |

47 | Faster Algebraic Algorithm for Path and Packing Problems.
- Koutis
- 2008
Citation Context ...trees in large networks. This method finds all the homomorphisms first, and then with high probability removes those which are not isomorphisms exploiting properties of a specific algebraic structure =-=[53]-=-. This algorithm can mine all frequent rooted trees with delay linear in the size of the network and only mildly exponential in the size of the patterns. 1.4.2.4 Algorithms for approximate pattern mat... |

41 | What is frequent in a single graph
- Bringmann, Nijssen
- 2008
Citation Context ...ph is bounded. Kuramochi and Karypis [54] designed two practical mining algorithms using the MIS support measure. In their algorithms, the MIS support is computed approximately. Bringmann and Nijssen =-=[12]-=- examined the expense of computing the MIS of overlap graphs, and then described another support measure minimum image based support which does not use overlap graphs. Given a pattern P = (V (P ), E(P... |

39 | Frequent subgraph mining in outerplanar graphs
- Horváth, Ramon, et al.
- 2006
Citation Context ...many applications, one may have prior knowledge on the structure of the data. This may include hard rules (e.g. molecules have bounded valency, mRNA molecules can be represented as outerplanar graphs =-=[41]-=-, ...) or general knowledge to which there are exceptions (for example, 96% of the molecules are outerplanar graphs [41], traffic networks are almost planar graphs). Second, one may feel that only pat... |

38 | Discovering Frequent Geometric Subgraphs
- Kuramochi, Karypis
- 2004
Citation Context ...ents, an approach which turns out to be useful in chemo-informatics applications [72]. Some researchers consider special purpose graph mining systems. For example, =-=[55]-=- describes gFSG, aiming at mining geometric patterns, i.e. patterns where vertices have spatial coordinates. 1.4 Single network mining Up to now, we considered the transactional setting of graph min... |

32 | On the complexity of finding iso- and other morphisms for partial k-trees
- Matoušek, Thomas
- 1992
Citation Context ...subgraph of G in polynomial time (the constant w appearing in the exponent). However, in general subgraph isomorphism between connected graphs of treewidth at most w (for w at least 2) is NP-complete =-=[59]-=-. Despite this negative result, it can be shown that frequent connected subgraphs of treewidth at most w can be listed in incremental polynomial time [42]. The homo... |

30 | Formation of Regulatory Patterns During Signal Propagation in a Mammalian Cellular Network. Science 309
- Ma’ayan
- 2005
Citation Context ... gene regulation network is composed of repeated appearances of three highly significant motifs. They showed that each network motif has a specific function in determining gene expression. Similarly, =-=[58]-=- presented a model of signaling pathways in hippocampal CA1 neurons, and found a high fraction of positive and negative feedback loop motifs. In [62], the authors found network motifs in networks from... |

27 | Don't be afraid of simpler patterns
- Bringmann, Zimmermann, et al.
Citation Context ...s a database D for (Ld, Lp,≤) and a minimal frequency threshold t, and the task is to list all elements P ∈ Lp for which freq(D,P ) ≥ t holds. Other interestingness predicates can be considered, e.g. =-=[13]-=- studies the constraint that patterns should have a minimal correlation with a given target attribute. The pattern mining problem is also related to combinatorial enumeration problems such ... |

27 | A binary linear programming formulation of the graph edit distance.
- Justice, Hero
- 2006
Citation Context ...distance of two graphs is the minimum cost path, and in most applications, to compute the optimal solution is extremely expensive. Practical approaches just find the suboptimal solution by relaxation =-=[49]-=-. Other strategies include continuous optimization, spectral methods, artificial neural networks, relaxation labeling and so on. We do not list all the algorithms here. See [20] for a comprehensive ... |

26 | Efficient approximation algorithms for semidefinite programs arising from MAX CUT and COLORING
- Klein, Lu
- 1996
Citation Context ...ard to compute. Another is the Lovász theta value ϑ which can be computed with semidefinite programming, and hence in polynomial time. However, existing methods are still very expensive to compute ϑ =-=[51, 47, 43]-=-. We point out that the Schrijver theta value ϑ′ [73] can be also used as a normalized anti-monotonic support measure. It is very similar to the ϑ, and we always have MIS ≤ ϑ′ ≤ ϑ ≤ MCP. Wang and Ramo... |

25 | Support computation for mining frequent subgraphs in a single graph
- Fiedler, Borgelt
- 2007
Citation Context ... vertex (vertex-overlap). Vanetik et al. [77] introduced the maximum independent set support measure (MIS), measuring the size of the maximum independent set of the overlap graph. Fiedler and Borgelt =-=[30]-=- proved that the MIS measure is antimonotonic and claim that some cases of overlap can be ignored without affecting the anti-monotonicity of resulting support measures. Given a pattern P = (V (P ), E(... |

23 | A learnability model for universal representations
- Muggleton, Page
- 1994
Citation Context ... preferred to another. The most common graph matching operator is the subgraph isomorphism operator. The homomorphism operator has been studied extensively in the field of Inductive Logic Programming =-=[63]-=-, where it is known as the θ-subsumption operator. Other graph matching operators exist, such as the subgraph homeomorphism operator [56], but these are less frequently used in the graph mining litera... |

22 | Mining generalized substructures from a set of labeled graphs
- Inokuchi
- 2004
Citation Context ...so exploited this idea of only generating new patterns from the combination of two smaller patterns which differ only at one vertex. This limits the number of new candidate patterns to consider. AcGM =-=[44]-=- extends AGM by allowing for both induced subgraph isomorphism and normal subgraph isomorphism, and by considering hierarchies of labels. gSpan [82] is one of the most popular graph mining systems. It... |

21 | An Output-Polynomial Time Algorithm for Mining Frequent Closed Attribute Trees.
- Arimura, Uno
- 2005
Citation Context ... a closure operator cl, under reasonably weak assumptions, it is possible to mine all cl-closed patterns in output polynomial time. For itemsets and a number of other settings such as attribute trees =-=[5]-=-, there is a closure operator such that the corresponding closed patterns coincide with the definition of f-closed patterns above. For more complex graph classes however, the situation is ... |

20 | The subgraph homeomorphism problem
- LaPaugh, Rivest
- 1980
Citation Context ...tudied extensively in the field of Inductive Logic Programming [63], where it is known as the θ-subsumption operator. Other graph matching operators exist, such as the subgraph homeomorphism operator =-=[56]-=-, but these are less frequently used in the graph mining literature. So when to use isomorphism and when to use homomorphism? It often happens that vertices represent objects in the application and ed... |

19 | Condensed representations for inductive logic programming
- Raedt, Ramon
- 2004
Citation Context ...an embedding of a smaller pattern; • association rules already discovered, since these allow one to derive in certain cases the frequency of patterns from the frequency of smaller patterns =-=[24]-=-, Heuristics Solving hard problems may be unavoidable, and in that case intelligent search can help. Consider for example the problem of pattern matching. Subgraph isomorphism checking is the most exp... |

18 | CorClass: Correlated Association Rule Mining for Classification
- Zimmermann, Raedt
- 2004
Citation Context ...direction is to select patterns or association rules on the basis of some useful quality criteria. These criteria include: • correlation of the pattern or association rules with some target attribute =-=[83]-=-, [13] • association rule quality (e.g., lift, confidence, leverage), • additional information provided by the pattern and its frequency compared to a set of already selected patterns. Typically, one ... |

17 | Triangle sparsifiers
- Tsourakakis, Kolountzakis, et al.
Citation Context ...me. This algorithm computes spanning trees of the graph and removes edges while checking that every triangle is listed exactly once. Approximate triangle counting. =-=[75]-=- proposes a sampling approach. In particular, the proposed algorithm samples edges independently with probability p, counts the triangles in the resulting sparser graph, and divides the result by $p^3$. ... |

16 | Generalization of pattern-growth methods for sequential pattern mining with gap constraints
- Antunes, Oliveira
- 2003
Citation Context ...es. Finding frequent subgraph isomorphic patterns is equivalent to finding frequent subsequences. Many subsequence mining algorithms have been proposed in the literature, amongst others [2], [67] and =-=[4]-=-. • An undirected cycle is a graph C with V (C) = {v1, v2, . . . , vn} and E = {{v1, v2}, . . . , {vn−1, vn}, {vn, v1}}. A tree is a graph that does not contain a cycle as a subgraph. In the data mini... |

15 | EigenSpokes: Surprising Patterns and Scalable Community Chipping in Large Graphs, PAKDD
- Prakash, Sridharan, et al.
- 2010
Citation Context ...ceedings of recent editions of major data mining conferences such as KDD, ICDM, SDM and PKDD. Part of the work on large networks is based on properties of the adjacency matrix. One example of this is =-=[69]-=-. The field investigating the properties of the adjacency matrix of graphs is called spectral graph theory [19]. 1.7 Glossary Association rule: a rule representing a correlation between tw... |

13 | Large Human Communication Networks: Patterns and a Utility-Driven Generator
- Du
Citation Context ...es a protein interaction network. • In communication networks, two people are connected if they had at least one contact during a certain observation period. This can concern phone calls, SMS messages =-=[27]-=-, emails, etc. Several other types of networks can be found in the SNAP repository2. Often such networks are represented with undirected graphs, even though in many cases (e.g. when sending email from... |

11 | An SDP primal-dual algorithm for approximating the Lovász-theta function
- Chan, Chang, et al.
- 2009
Citation Context ...ard to compute. Another is the Lovász theta value ϑ which can be computed with semidefinite programming, and hence in polynomial time. However, existing methods are still very expensive to compute ϑ =-=[51, 47, 43]-=-. We point out that the Schrijver theta value ϑ′ [73] can be also used as a normalized anti-monotonic support measure. It is very similar to the ϑ, and we always have MIS ≤ ϑ′ ≤ ϑ ≤ MCP. Wang and Ramo... |

10 | Transforming graph data for statistical relational learning
- Rossi, McDowell, et al.
Citation Context ...hanging context. We anticipate that the exploration of expert advice and causal inference may be interesting paths in future work. 1.6 Additional reading This chapter focused on graph pattern mining. =-=[70]-=- surveys prediction problems in graphs. Graphs from a probabilistic model point of view are studied in the field of statistical relational learning [34]. The field of graph mining is related to many ... |

9 |
Foundations of Inductive Logic Programming, volume 1228
- Nienhuys-Cheng, Wolf
- 1997
Citation Context ...valence relation. Subgraph isomorphism and homomorphism are sometimes also called OI-subsumption (subsumption under object identity) and θ-subsumption, respectively (e.g. in the field of inductive logic programming =-=[64]-=-). Table 1.1 summarizes the terminology. Every subgraph isomorphism mapping is a homomorphism, but not every homomorphism is a subgraph isomorphism mapping. Figure 1.1 gives an example of (a) a subgra... |

8 |
Mining generalized association rules
- Srikant, Agrawal
- 1995
Citation Context ...een as sequences. Finding frequent subgraph isomorphic patterns is equivalent to finding frequent subsequences. Many subsequence mining algorithms have been proposed in the literature, amongst others =-=[2]-=-, [67] and [4]. • An undirected cycle is a graph C with V (C) = {v1, v2, . . . , vn} and E = {{v1, v2}, . . . , {vn−1, vn}, {vn, v1}}. A tree is a graph that does not contain a cycle as a subgraph. In... |

8 |
Support measures for graph data
- Vanetik, Shimony, et al.
- 2006
Citation Context ...rn P in the database graph D, and two nodes are adjacent if the corresponding images overlap, i.e., they share at least one common edge (edge-overlap) or one common vertex (vertex-overlap). Vanetik et al. =-=[77]-=- introduced the maximum independent set support measure (MIS), measuring the size of the maximum independent set of the overlap graph. Fiedler and Borgelt [30] proved that the MIS measure is antimonot... |

7 |
Graph Algorithms and Optimization
- Kocay, Kreher
- 2004
Citation Context ...l learning [34]. The field of graph mining is related to many other fields in the literature which deserve reading. First, there are the fields of graph theory [26], [81] and algorithmic graph theory =-=[52]-=-. Many, often old, results provide excellent inspiration for improving graph mining algorithms. A work providing references classified by problem type is ([36]). Second, there is a large literature on... |

6 | Effective feature construction by maximum common subgraph sampling
- Schietgat, Costa, et al.
- 2011
Citation Context ...les are the MoSS system [39] and the FOG system [41], both of which treat cyclic fragments separately from linear fragments, an approach which turns out to be useful in chemo-informatics applications =-=[72]-=-. Some researchers consider special purpose graph mining systems. For example, [55] describes gFSG, aimed at mining geometric patterns, i.e. patterns where verti... |

5 |
Nemofinder: Dissecting genome wide protein-protein interactions with repeated and unique network motifs
- Chen, Hsu, et al.
- 2006
Citation Context ...ed with vertices and citations with edges. One example of such a network is DBLP. • In molecule interaction networks, two molecules are connected if they interact during at least one experiment. E.g., =-=[17]-=- describes a protein interaction network. • In communication networks, two people are connected if they had at least one contact during a certain observation period. This can concern phone calls, SMS m... |

4 | Combining Ring Extensions and Canonical Form Pruning
- Borgelt
Citation Context ...ies, and working with such groups is more efficient than working with individual nodes. For example, in molecules, atoms are often grouped together in functional groups such as rings and chains [41], =-=[11]-=-, [23]. Also, in traffic networks, streets may be grouped into districts or cities, an approach which is typically used in routing applications [33]. 1.3.5 Condensed representations When performing pa... |

4 | Efficient Frequent Connected Subgraph Mining in Graphs of Bounded Treewidth
- Horváth, Ramon
- 2008
Citation Context ...al graph patterns under subgraph isomorphism, one can use the following simple argument to show that (unless P=NP) no algorithm exists to list all frequent subgraph patterns in output polynomial time =-=[40]-=-. Consider a database with two transactions. The first one is a cycle with length n, and the second transaction is an arbitrary graph on n vertices. Let the minimal frequency threshold be 2. There are... |

3 |
On mining closed sets in multi-relational data
- Garriga, Khardon, et al.
- 2007
Citation Context ...re operator such that the corresponding closed patterns coincide with the definition of f-closed patterns above. For more complex graph classes however, the situation is more complicated. =-=[32]-=- showed that when every pair of patterns in Lp has a unique least upper bound, a closure operator exists and one can mine all f-closed patterns efficiently. This holds in particular for graph patterns... |

2 | All normalized anti-monotonic overlap graph measures are bounded. Data Mining and Knowledge Discovery
- Calders, Ramon, et al.
- 2011
Citation Context ...there are only two independent observations (if we consider embeddings which do not overlap as independent observations of some phenomenon). Therefore, the notion of independence has some advantages. =-=[16]-=- generalized the conditions for anti-monotonicity of overlap graph based support measures. They showed that the conditions can be used whenever the matching operator is isomorphism, homomorphism or he... |

2 | Molecular graph augmentation with rings and functional groups
- Grave, Costa
- 2010
Citation Context ...nd working with such groups is more efficient than working with individual nodes. For example, in molecules, atoms are often grouped together in functional groups such as rings and chains [41], [11], =-=[23]-=-. Also, in traffic networks, streets may be grouped into districts or cities, an approach which is typically used in routing applications [33]. 1.3.5 Condensed representations When performing pattern ... |

2 | Approximately counting embeddings into random graphs
- Fürer, Shiva
- 2008
Citation Context ...orithms. One option is to exploit statistical regularity in the graph. E.g. for certain pattern classes, there exist efficient algorithms which provide good approximations on almost all random graphs =-=[31]-=-. On the other hand, recent work on fixed parameter tractability has shown that there are algorithms, often randomized ones, whose asymptotic complexity is exponential in the pattern size ... |

2 |
Approximating semidefinite packing problems
- Iyengar, Phillips, et al.
- 2009
Citation Context ...ard to compute. Another is the Lovász theta value ϑ which can be computed with semidefinite programming, and hence in polynomial time. However, existing methods for computing ϑ are still very expensive =-=[51, 47, 43]-=-. We point out that the Schrijver theta value ϑ′ [73] can also be used as a normalized anti-monotonic support measure. It is very similar to ϑ, and we always have MIS ≤ ϑ′ ≤ ϑ ≤ MCP. Wang and Ramo... |

2 |
Nearly exact mining of frequent trees in large networks
- Kibriya, Ramon
- 2012
Citation Context ...lly small, in large networks. For instance if the patterns are trees, the subgraph isomorphism problem still remains #P-complete. However, based on recent advances in parameterized complexity theory, =-=[50]-=- proposed a randomized algorithm for mining rooted trees in large networks. This method finds all the homomorphisms first, and then with high probability removes those which are not isomorphisms explo... |

2 |
A Decision Tree Approach for Design Patterns Detection by Subgraph Isomorphism
- Pande, Gupta, et al.
- 2011
Citation Context ...ting a number of efficiently computable heuristics, and have been used in several transactional graph mining systems. There are algorithms using other strategies, e.g., decision tree based techniques =-=[66]-=-. The reader is referred to [61] for a short review of subgraph isomorphism algorithms. When the database graph is large and has a high average degree, these matching algorithms become intractable eve... |

2 | An efficiently computable support measure for frequent subgraph pattern mining
- Wang, Ramon
- 2012
Citation Context ...We point out that the Schrijver theta value ϑ′ [73] can also be used as a normalized anti-monotonic support measure. It is very similar to ϑ, and we always have MIS ≤ ϑ′ ≤ ϑ ≤ MCP. Wang and Ramon =-=[78]-=- observed that those images which share a common vertex (vertex-overlap) or a common edge (edge-overlap) form a clique in the overlap graph, and proposed the overlap hypergraph whose nodes are the im... |

1 |
Unified generation of conformations, conformers and stereoisomers: a discrete mathematics approach
- Gugisch, Rucker
Citation Context ...with a given target attribute. The pattern mining problem is also related to combinatorial enumeration problems such as the enumeration of all molecules satisfying some specific properties =-=[37]-=-. A useful property for interestingness predicates is anti-monotonicity. Most standard pattern mining algorithms rely on this property to prune their search. Definition 12 (anti-monotone predicates... |

1 |
Large-scale mining of molecular fragments with wildcards
- Hofer, Borgelt, et al.
- 2003
Citation Context ...ent patterns are acyclic. Gaston also employs a depth-first search algorithm. Some approaches add more than one vertex at a time, making larger steps in the search space. Examples are the MoSS system =-=[39]-=- and the FOG system [41], both of which treat cyclic fragments separately from linear fragments, an approach which turns out to be useful in chemo-informatics applications [72]. Some researchers consi... |

1 |
An efficiently computable and statistically motivated subgraph pattern support measure. Data Mining and Knowledge Discovery
- Wang, Ramon, et al.
- 2013
Citation Context ... (usually sparse) linear program which can be solved very efficiently using recent interior-point methods. More recently, they showed that this measure also has a natural statistical interpretation =-=[79]-=-. 1.4.4 Applications The literature contains a large number of articles describing the analysis of large networks in a wide range of application domains. Here we only give a few examples: • In co-author net... |