Results 1 - 10
of
90
A path following algorithm for the graph matching problem
, 2009
"... We propose a convex-concave programming approach for the labeled weighted graph matching problem. The convexconcave programming formulation is obtained by rewriting the weighted graph matching problem as a least-square problem on the set of permutation matrices and relaxing it to two different opti ..."
Abstract
-
Cited by 43 (4 self)
- Add to MetaCart
(Show Context)
We propose a convex-concave programming approach for the labeled weighted graph matching problem. The convexconcave programming formulation is obtained by rewriting the weighted graph matching problem as a least-square problem on the set of permutation matrices and relaxing it to two different optimization problems: a quadratic convex and a quadratic concave optimization problem on the set of doubly stochastic matrices. The concave relaxation has the same global minimum as the initial graph matching problem, but the search for its global minimum is also a hard combinatorial problem. We, therefore, construct an approximation of the concave problem solution by following a solution path of a convex-concave problem obtained by linear interpolation of the convex and concave formulations, starting from the convex relaxation. This method allows to easily integrate the information on graph label similarities into the optimization problem, and therefore, perform labeled weighted graph matching. The algorithm is compared with some of the best performing graph matching methods on four data sets: simulated graphs, QAPLib, retina vessel images, and handwritten Chinese characters. In all cases, the results are competitive with the state of the art.
Integrative network alignment reveals large regions of global network similarity in yeast and human
- Bioinformatics
, 2011
"... Motivation: High-throughput methods for detecting molecular interactions have produced large sets of biological network data with much more yet to come. Analogous to sequence alignment, efficient and reliable network alignment methods are expected to improve our understanding of biological systems. ..."
Abstract
-
Cited by 40 (0 self)
- Add to MetaCart
(Show Context)
Motivation: High-throughput methods for detecting molecular interactions have produced large sets of biological network data with much more yet to come. Analogous to sequence alignment, efficient and reliable network alignment methods are expected to improve our understanding of biological systems. Unlike sequence alignment, network alignment is computationally intractable. Hence, devising efficient network alignment heuristics is currently a foremost challenge in computational biology. Results: We introduce a novel network alignment algorithm, called Matching-based Integrative GRAph ALigner (MI-GRAAL), which can integrate any number and type of similarity measures between network nodes (e.g., proteins), including, but not limited to, any topological network similarity measure, sequence similarity, functional similarity, and structural similarity. Hence, we resolve the ties in
GADDI: Distance index based subgraph matching in biological networks
- In Proceedings of the 12th international conference on extending database technology (EDBT’09
, 2009
"... Currently, a huge amount of biological data can be naturally rep-resented by graphs, e.g., protein interaction networks, gene reg-ulatory networks, etc. The need for indexing large graphs is an urgent research problem of great practical importance. The main challenge is size. Each graph may contain ..."
Abstract
-
Cited by 25 (2 self)
- Add to MetaCart
(Show Context)
Currently, a huge amount of biological data can be naturally rep-resented by graphs, e.g., protein interaction networks, gene reg-ulatory networks, etc. The need for indexing large graphs is an urgent research problem of great practical importance. The main challenge is size. Each graph may contain thousands (or more) ver-tices. Most of the previous work focuses on indexing a set of small or medium sized database graphs (with only tens of vertices) and finding whether a query graph occurs in any of these. In this paper, we are interested in finding all the matches of a query graph in a given large graph of thousands of vertices, which is a very impor-tant task in many biological applications. This increases the com-plexity significantly. We propose a novel distance measurement which reintroduces the idea of frequent substructures in a single large graph. We devise the novel structure distance based approach (GADDI) to efficiently find matches of the query graph. GADDI is further optimized by the use of a dynamic matching scheme to minimize redundant calculations. Last but not least, a number of real and synthetic data sets are used to evaluate the efficiency and scalability of our proposed method. 1.
Optimal network alignment with graphlet degree vectors
- Cancer Informatics
"... Full open access to this and thousands of other papers at ..."
Abstract
-
Cited by 24 (0 self)
- Add to MetaCart
Full open access to this and thousands of other papers at
Algorithms for large, sparse network alignment problems,” arXiv
"... Abstract—We propose a new distributed algorithm for sparse variants of the network alignment problem, which occurs in a variety of data mining areas including systems biology, database matching, and computer vision. Our algorithm uses a belief propagation heuristic and provides near optimal solution ..."
Abstract
-
Cited by 17 (3 self)
- Add to MetaCart
(Show Context)
Abstract—We propose a new distributed algorithm for sparse variants of the network alignment problem, which occurs in a variety of data mining areas including systems biology, database matching, and computer vision. Our algorithm uses a belief propagation heuristic and provides near optimal solutions for this NP-hard combinatorial optimization problem. We show that our algorithm is faster and outperforms or ties existing algorithms on synthetic problems, a problem in bioinformatics, and a problem in ontology matching. We also provide a unified framework for studying and comparing all network alignment solvers. Keywords-network alignment; belief propagation; graph matching; message-passing I.
C: Network archaeology: uncovering ancient networks from present-day interactions
- PLoS Comput Biol
"... What proteins interacted in a long-extinct ancestor of yeast? How have different members of a protein complex assembled together over time? Our ability to answer such questions has been limited by the unavailability of ancestral protein-protein interaction (PPI) networks. To overcome this limitation ..."
Abstract
-
Cited by 13 (5 self)
- Add to MetaCart
(Show Context)
What proteins interacted in a long-extinct ancestor of yeast? How have different members of a protein complex assembled together over time? Our ability to answer such questions has been limited by the unavailability of ancestral protein-protein interaction (PPI) networks. To overcome this limitation, we propose several novel algorithms to reconstruct the growth history of a present-day network. Our likelihood-based method finds a probable previous state of the graph by applying an assumed growth model backwards in time. This approach retains node identities so that the history of individual nodes can be tracked. Using this methodology, we estimate protein ages in the yeast PPI network that are in good agreement with sequence-based estimates of age and with structural features of protein complexes. Further, by comparing the quality of the inferred histories for several different growth models (duplication-mutation with complementarity, forest fire, and preferential attachment), we provide additional evidence that a duplication-based model captures many features of PPI network growth better than models designed to mimic social network growth. From the reconstructed history, we model the arrival time of extant and ancestral interactions and predict that complexes have significantly re-wired over time and that new edges tend to form within existing complexes. We also hypothesize a distribution of per-protein duplication rates, track the change of the network’s clustering coefficient, and predict paralogous relationships between extant proteins that are likely to be complementary to the relationships inferred using sequence alone. Finally, we infer plausible parameters for
Inferring Anchor Links across Multiple Heterogeneous Social Networks
"... Online social networks can often be represented as heterogeneous information networks containing abundant information about: who, where, when and what. Nowadays, people are usually involved in multiple social networks simultaneously. The multiple accounts of the same user in different networks are m ..."
Abstract
-
Cited by 12 (4 self)
- Add to MetaCart
(Show Context)
Online social networks can often be represented as heterogeneous information networks containing abundant information about: who, where, when and what. Nowadays, people are usually involved in multiple social networks simultaneously. The multiple accounts of the same user in different networks are mostly isolated from each other without any connection between them. Discovering the correspondence of these accounts across multiple social networks is a crucial prerequisite for many interesting inter-network applications, such as link recommendation and community analysis using information from multiple networks. In this paper, we study the problem of anchor link prediction across multiple heterogeneous social networks, i.e., discovering the correspondence among different accounts of the same user. Unlike most prior work on link prediction and network alignment, we assume that the anchor links are one-to-one relationships (i.e., no two edges share a common endpoint) between the accounts in two social networks, and a small number of anchor links are known beforehand. We propose to extract heterogeneous features from multiple heterogeneous networks for anchor link prediction, including user’s social, spatial, temporal and text information. Then we formulate the inference problem for anchor links as a stable matching problem between the two sets of user accounts in two different networks. An effective solution, Mna (Multi-Network Anchoring), is derived to infer anchor links w.r.t. the one-to-one constraint. Extensive experiments on two real-world heterogeneous social networks show that our Mna model consistently outperform other commonly-used baselines on anchor link prediction.
On the Performance of Percolation Graph Matching
"... Graph matching is a generalization of the classic graph isomorphism problem. By using only their structures a graph-matching algorithm finds a map between the vertex sets of two similar graphs. This has applications in the deanonymization of social and information networks and, more generally, in th ..."
Abstract
-
Cited by 10 (2 self)
- Add to MetaCart
(Show Context)
Graph matching is a generalization of the classic graph isomorphism problem. By using only their structures a graph-matching algorithm finds a map between the vertex sets of two similar graphs. This has applications in the deanonymization of social and information networks and, more generally, in the merging of structural data from different domains. One class of graph-matching algorithms starts with a known seed set of matched node pairs. Despite the success of these algorithms in practical applications, their performance has been observed to be very sensitive to the size of the seed set. The lack of a rigorous understanding of parameters and performance makes it difficult to design systems and predict their behavior. In this paper, we propose and analyze a very simple percolation-based graph matching algorithm that incrementally maps every pair of nodes (i, j) with at least r neighboring mapped pairs. The simplicity of this algorithm makes possible a rigorous analysis that relies on recent advances in bootstrap percolation theory for the G(n, p) random graph. We prove conditions on the model parameters in which percolation graph matching succeeds, and we establish a phase transition in the size of the seed set. We also confirm through experiments that the performance of percolation graph matching is surprisingly good, both for synthetic graphs and real social-network data.