Binet-Cauchy kernels on dynamical systems and its application to the analysis of dynamic scenes
 International Journal of Computer Vision
, 2005
Cited by 41 (15 self)
Abstract. We derive a family of kernels on dynamical systems by applying the Binet-Cauchy theorem to trajectories of states. Our derivation provides a unifying framework for all kernels on dynamical systems currently used in machine learning, including kernels derived from the behavioral framework, diffusion processes, marginalized kernels, kernels on graphs, and the kernels on sets arising from the subspace angle approach. In the case of linear time-invariant systems, we derive explicit formulae for computing the proposed Binet-Cauchy kernels by solving Sylvester equations, and relate the proposed kernels to existing kernels based on cepstrum coefficients and subspace angles. Besides their theoretical appeal, these kernels can be used efficiently in the comparison of video sequences of dynamic scenes that can be modeled as the output of a linear time-invariant dynamical system. One advantage of our kernels is that they take the initial conditions of the dynamical systems into account. As a first example, we use our kernels to compare video sequences of dynamic textures. As a second example, we apply our kernels to the problem of clustering short clips of a movie. Experimental evidence shows superior performance of our kernels. Keywords: Binet-Cauchy theorem, ARMA models and dynamical systems, Sylvester
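As a rough illustration of how a Sylvester-type equation can produce a kernel between two linear time-invariant systems, the sketch below assumes noise-free outputs y_t = C A^t x_0 and an exponentially discounted sum of output inner products. Under that assumption the kernel equals x_0^T M x_0', where M solves M = e^{-λ} A^T M A' + C^T C'. The specific equation, the discount λ, and the function names are illustrative assumptions; the abstract does not state the formulae.

```python
import numpy as np

def solve_discounted_sylvester(A1, C1, A2, C2, lam):
    """Solve M = exp(-lam) * A1.T @ M @ A2 + C1.T @ C2 (assumed form)."""
    n1, n2 = A1.shape[0], A2.shape[0]
    Q = C1.T @ C2
    # With column-major vec: vec(A1.T M A2) = (A2.T kron A1.T) vec(M).
    K = np.eye(n1 * n2) - np.exp(-lam) * np.kron(A2.T, A1.T)
    vec_M = np.linalg.solve(K, Q.reshape(-1, order="F"))
    return vec_M.reshape((n1, n2), order="F")

def lti_kernel(A1, C1, x1, A2, C2, x2, lam=0.5):
    """Discounted output-similarity kernel between two LTI systems.

    The initial states x1 and x2 enter the value directly.
    """
    M = solve_discounted_sylvester(A1, C1, A2, C2, lam)
    return float(x1 @ M @ x2)
```

The Kronecker-product solve is only practical for small state dimensions; a dedicated Sylvester solver would be the usual choice for larger systems.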
Weisfeiler-Lehman Graph Kernels
, 2011
Cited by 32 (3 self)
In this article, we propose a family of efficient kernels for large graphs with discrete node labels. Key to our method is a rapid feature extraction scheme based on the Weisfeiler-Lehman test of isomorphism on graphs. It maps the original graph to a sequence of graphs, whose node attributes capture topological and label information. A family of kernels can be defined based on this Weisfeiler-Lehman sequence of graphs, including a highly efficient kernel comparing subtree-like patterns. Its runtime scales only linearly in the number of edges of the graphs and the length of the Weisfeiler-Lehman graph sequence. In our experimental evaluation, our kernels outperform state-of-the-art graph kernels on several graph classification benchmark data sets in terms of accuracy and runtime. Our kernels open the door to large-scale applications of graph kernels in various disciplines such as computational biology and social network analysis.
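The relabeling idea can be sketched in a few lines. The code below is a minimal illustration, not the authors' implementation: each round replaces a node's label with the pair (own label, sorted neighbour labels), and the kernel is the dot product of label counts over all rounds. The graph representation (adjacency dict plus label dict) and the function names are assumptions. Nested tuples stand in for the compressed labels a practical implementation would use, so the linear runtime claimed in the abstract does not apply to this sketch.

```python
from collections import Counter

def wl_subtree_features(adj, labels, h):
    """Count node labels over h Weisfeiler-Lehman relabeling rounds.

    adj: dict mapping node -> list of neighbours (undirected graph).
    labels: dict mapping node -> hashable, sortable label.
    """
    current = dict(labels)
    counts = Counter((0, lab) for lab in current.values())
    for it in range(1, h + 1):
        # New label = own label plus the sorted multiset of neighbour labels.
        current = {
            v: (current[v], tuple(sorted(current[u] for u in adj[v])))
            for v in adj
        }
        counts.update((it, lab) for lab in current.values())
    return counts

def wl_subtree_kernel(g1, g2, h=2):
    """Dot product of WL label counts; g1 and g2 are (adj, labels) pairs."""
    f1 = wl_subtree_features(g1[0], g1[1], h)
    f2 = wl_subtree_features(g2[0], g2[1], h)
    return sum(c * f2[key] for key, c in f1.items())
```

For example, a triangle and a three-node path with identical node labels agree fully at round 0 and only partially at round 1, so the kernel value between them is lower than either graph's value with itself.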
Subgraph Frequencies: Mapping the Empirical and Extremal Geography of Large Graph Collections
Cited by 17 (1 self)
A growing set of online applications are generating data that can be viewed as very large collections of small, dense social graphs; these range from sets of social groups, events, or collaboration projects to the vast collection of graph neighborhoods in large social networks. A natural question is how to usefully define a domain-independent ‘coordinate system’ for such a collection of graphs, so that the set of possible structures can be compactly represented and understood within a common space. In this work, we draw on the theory of graph homomorphisms to formulate and analyze such a representation, based on computing the frequencies of small induced subgraphs within each graph. We find that the space of subgraph frequencies is governed both by its combinatorial properties (based on extremal results that constrain all graphs) and by its empirical properties (manifested in the way that real social graphs appear to lie near a simple one-dimensional curve through this space). We develop flexible frameworks for studying each of these aspects. For capturing empirical properties, we characterize a simple stochastic generative model, a single-parameter extension of Erdős-Rényi random graphs, whose stationary distribution over subgraphs closely tracks the one-dimensional concentration of the real social graph families. For the extremal properties, we develop a tractable linear program for bounding the feasible space of subgraph frequencies by harnessing a toolkit of known extremal graph theory. Together, these two complementary frameworks shed light on a fundamental question pertaining to social graphs: what properties of social graphs are ‘social’ properties and what properties are ‘graph’ properties? We conclude with a brief demonstration of how the coordinate system we examine can also be used to perform classification tasks, distinguishing between structures arising from different types of social graphs.
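To make the 'coordinate system' concrete, the sketch below computes the frequencies of induced three-node subgraphs in a small graph. It is an illustrative assumption about the simplest case, not the paper's code: for three nodes, the number of edges present (0, 1, 2, or 3) identifies the induced subgraph up to isomorphism, so four counts suffice. The function name and the input format are hypothetical.

```python
from itertools import combinations

def triad_frequencies(nodes, edges):
    """Return frequencies of induced 3-node subgraphs with 0, 1, 2, 3 edges."""
    edge_set = {frozenset(e) for e in edges}
    counts = [0, 0, 0, 0]
    for trio in combinations(nodes, 3):
        # Count how many of the three possible edges are present.
        k = sum(frozenset(pair) in edge_set for pair in combinations(trio, 2))
        counts[k] += 1
    total = sum(counts)
    if total == 0:
        return [0.0, 0.0, 0.0, 0.0]
    return [c / total for c in counts]
```

Each graph then maps to a point in a four-dimensional simplex; larger subgraph sizes would need an explicit isomorphism check rather than an edge count.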
DeepWalk: Online learning of social representations. arXiv preprint arXiv:1403.6652
, 2014
Cited by 14 (1 self)
We present DeepWalk, a novel approach for learning latent representations of vertices in a network. These latent representations encode social relations in a continuous vector space, which is easily exploited by statistical models. DeepWalk generalizes recent advancements in language modeling and unsupervised feature learning (or deep learning) from sequences of words to graphs. DeepWalk uses local information obtained from truncated random walks to learn latent representations by treating walks as the equivalent of sentences. We demonstrate DeepWalk’s latent representations on several multi-label network classification tasks for social networks such as BlogCatalog, Flickr, and YouTube. Our results show that DeepWalk outperforms challenging baselines which are allowed a global view of the network, especially in the presence of missing information. DeepWalk’s representations can provide F1 scores up to 10% higher than competing methods when labeled data is sparse. In some experiments, DeepWalk’s representations are able to outperform all baseline methods while using 60% less training data. DeepWalk is also scalable. It is an online learning algorithm which builds useful incremental results, and is trivially parallelizable. These qualities make it suitable for a broad class of real-world applications such as network classification and anomaly detection.
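The walk-generation step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the graph is an adjacency dict, neighbours are chosen uniformly at random, and the function and parameter names are hypothetical. The resulting walks play the role of sentences and would be passed to a language model to learn the vertex representations; that training step is not shown.

```python
import random

def truncated_random_walks(adj, walks_per_node, walk_length, seed=0):
    """Generate fixed-length random walks starting from every node.

    adj: dict mapping node -> list of neighbours.
    Returns a list of walks, each a list of nodes of length <= walk_length.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # visit start nodes in a fresh random order
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adj[walk[-1]]
                if not neighbours:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks
```

Because each walk depends only on local neighbourhoods, the outer loop can be split across workers, which is consistent with the abstract's remark on parallelizability.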
An experimental investigation of kernels on graphs for collaborative . . .
 NEURAL NETWORKS
, 2012
Multi-label feature selection for graph classification
 In Proceedings of the 10th IEEE International Conference on Data Mining
, 2010
Cited by 11 (4 self)
Abstract. The classification of graph data has become an important and active research topic in the last decade, with a wide variety of real-world applications, e.g. drug activity prediction and kinase inhibitor discovery. Current research on graph classification focuses on single-label settings. However, in many applications, each graph can be assigned a set of multiple labels simultaneously. Extracting good features using multiple labels of the graphs becomes an important step before graph classification. In this paper, we study the problem of multi-label feature selection for graph classification and propose a novel solution, called gMLC, to efficiently search for optimal subgraph features for graph objects with multiple labels. Different from existing feature selection methods in vector spaces, which assume the feature set is given, we perform multi-label feature selection for graph data in a progressive way together with the subgraph feature mining process. We derive an evaluation criterion, named gHSIC, to estimate the dependence between subgraph features and multiple labels of graphs. Then a branch-and-bound algorithm is proposed to efficiently search for optimal subgraph features by judiciously pruning the subgraph search space using multiple labels. Empirical studies on real-world tasks demonstrate that our feature selection approach can effectively boost multi-label graph classification performance and is more efficient by pruning the subgraph search space using multiple labels. Keywords: feature selection; graph classification; multi-label learning.
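The abstract does not define gHSIC, so the sketch below only illustrates the general idea of scoring one subgraph feature against several labels with a dependence measure. It uses the standard biased empirical Hilbert-Schmidt Independence Criterion, tr(KHLH)/(n-1)^2, with linear kernels; treating this as a stand-in for gHSIC is an assumption, as are the function name and input layout.

```python
import numpy as np

def empirical_hsic(feature, labels):
    """Biased empirical HSIC between one feature and a multi-label matrix.

    feature: length-n sequence, e.g. 1 if a subgraph occurs in graph i, else 0.
    labels: n x q array, one column per label.
    """
    x = np.asarray(feature, dtype=float).reshape(-1, 1)
    Y = np.asarray(labels, dtype=float)
    n = x.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    K = x @ x.T                           # linear kernel on the feature
    L = Y @ Y.T                           # linear kernel on the label vectors
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2
```

A feature that occurs in every graph, or one that is balanced across all label groups, scores zero, while a feature aligned with a label scores higher; a selection procedure would rank candidate subgraphs by such a score.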
Transforming Graph Data for Statistical Relational Learning
, 2012
Cited by 9 (4 self)
Relational data representations have become an increasingly important topic due to the recent proliferation of network datasets (e.g., social, biological, information networks) and a corresponding increase in the application of Statistical Relational Learning (SRL) algorithms to these domains. In this article, we examine and categorize techniques for transforming graph-based relational data to improve SRL algorithms. In particular, appropriate transformations of the nodes, links, and/or features of the data can dramatically affect the capabilities and results of SRL algorithms. We introduce an intuitive taxonomy for data representation transformations in relational domains that incorporates link transformation and node transformation as symmetric representation tasks. More specifically, the transformation tasks for both nodes and links include (i) predicting their existence, (ii) predicting their label or type, (iii) estimating their weight or importance, and (iv) systematically constructing their relevant features. We motivate our taxonomy through detailed examples and use it to survey competing approaches for each of these tasks. We also discuss general conditions for transforming links, nodes, and features. Finally, we highlight challenges that remain to be addressed.
A Novel Low-Complexity HMM Similarity Measure
, 2010
Cited by 7 (1 self)
In this letter, we propose a novel similarity measure for comparing Hidden Markov models (HMMs) and an efficient scheme for its computation. In the proposed approach, we probabilistically evaluate the correspondence, or goodness of match, between every pair of states in the respective HMMs, based on the concept of semi-Markov random walk. We show that this correspondence score reflects the contribution of a given state pair to the overall similarity between the two HMMs. For similar HMMs, each state in one HMM is expected to have only a few matching states in the other HMM, resulting in a sparse state correspondence score matrix. This allows us to measure the similarity between HMMs by evaluating the sparsity of the state correspondence matrix. Estimation of the proposed similarity score does not require time-consuming Monte Carlo simulations, hence it can be computed much more efficiently than the Kullback-Leibler divergence (KLD) that has been widely used. We demonstrate the effectiveness of the proposed measure through several examples.
Graph Classification via Topological and Label Attributes
Cited by 7 (0 self)
Graph classification is an important data mining task, and various graph kernel methods have been proposed recently for this task. These methods have proven to be effective, but they tend to have high computational overhead. In this paper, we propose an alternative approach to graph classification that is based on feature vectors constructed from different global topological attributes, as well as global label features. The main idea here is that the graphs from the same class should have similar topological and label attributes. Our method is simple and easy to implement, and via a detailed comparison on real benchmark datasets, we show that our topological and label feature-based approach delivers better or competitive classification accuracy, and is also substantially faster than other graph kernels. It is the most effective method for large unlabeled graphs.
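A feature vector of global topological attributes might be built as in the sketch below. The abstract does not list which attributes the authors use, so the five chosen here (node count, edge count, average degree, maximum degree, density) are illustrative assumptions, as is the function name. The graph is assumed to be simple and undirected, with nodes numbered from 0.

```python
def global_graph_features(n_nodes, edges):
    """Return [nodes, edges, average degree, max degree, density] for a graph.

    n_nodes: number of nodes, labelled 0 .. n_nodes - 1.
    edges: iterable of (u, v) pairs, each undirected edge listed once.
    """
    edges = list(edges)
    degree = [0] * n_nodes
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    m = len(edges)
    avg_degree = 2 * m / n_nodes if n_nodes else 0.0
    density = 2 * m / (n_nodes * (n_nodes - 1)) if n_nodes > 1 else 0.0
    return [n_nodes, m, avg_degree, max(degree, default=0), density]
```

Vectors like this can be fed to any standard classifier, which avoids computing a pairwise kernel matrix over all graphs.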