Modeling relationship strength in online social networks
- In Proc. WWW '10
Cited by 16 (0 self)
Previous work analyzing social networks has mainly focused on binary friendship relations. However, in online social networks the low cost of link formation can lead to networks with heterogeneous relationship strengths (e.g., acquaintances and best friends mixed together). In this case, the binary friendship indicator provides only a coarse representation of relationship information. In this work, we develop an unsupervised model to estimate relationship strength from interaction activity (e.g., communication, tagging) and user similarity. More specifically, we formulate a link-based latent variable model, along with a coordinate ascent optimization procedure for the inference. We evaluate our approach on real-world data from Facebook, showing that the estimated link weights result in higher autocorrelation and lead to improved classification accuracy.
Randomization Tests for Distinguishing Social Influence and Homophily Effects
Cited by 10 (0 self)
Relational autocorrelation is ubiquitous in relational domains. This observed correlation between class labels of linked instances in a network (e.g., two friends are more likely to share political beliefs than two randomly selected people) can be due to the effects of two different social processes. If social influence effects are present, instances are likely to change their attributes to conform to their neighbors' values. If homophily effects are present, instances are likely to link to other individuals with similar attribute values. Both of these effects will result in autocorrelated attribute values. When analyzing static relational networks, it is impossible to determine how much of the observed correlation is due to each of these factors. However, the recent surge of interest in social networks has increased the availability of dynamic network data. In this paper, we present a randomization technique for temporal network data where the attributes and links change over time. Given data from two time steps, we measure the gain in correlation and assess whether a significant portion of this gain is due to influence and/or homophily. We demonstrate the efficacy of our method on semi-synthetic data and then apply the method to a real-world social network dataset, showing the impact of both influence and homophily effects.
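To make the idea of testing a gain in correlation between two time steps concrete, here is a minimal sketch of a generic permutation test. It is not the paper's procedure, which the abstract does not spell out. The agreement measure (fraction of edges whose endpoints share a label), the function names and the choice to shuffle labels at the second time step are all assumptions made for illustration.

```python
import random

def edge_agreement(edges, labels):
    """Fraction of edges whose two endpoints carry the same label."""
    return sum(labels[u] == labels[v] for u, v in edges) / len(edges)

def gain_p_value(edges_t, labels_t, edges_t1, labels_t1, trials=1000, seed=0):
    """One-sided permutation p-value for the observed gain in edge agreement.

    Assumed null model: labels at the second time step are assigned to
    nodes at random, with the link structure held fixed.
    """
    rng = random.Random(seed)
    base = edge_agreement(edges_t, labels_t)
    observed = edge_agreement(edges_t1, labels_t1) - base
    nodes = list(labels_t1)
    values = [labels_t1[n] for n in nodes]
    hits = 0
    for _ in range(trials):
        rng.shuffle(values)
        shuffled = dict(zip(nodes, values))
        if edge_agreement(edges_t1, shuffled) - base >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (trials + 1)
```

Separating influence from homophily, as the paper does, would need distinct null models for attribute changes and for link changes; this sketch only checks whether the overall gain exceeds what random labeling would produce.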
Link Prediction on Evolving Data using Matrix and Tensor Factorizations
- IEEE International Conference on Data Mining Workshops, 2009
Cited by 5 (1 self)
The data in many disciplines such as social networks, web analysis, etc. is link-based, and the link structure can be exploited for many different data mining tasks. In this paper, we consider the problem of temporal link prediction: Given link data for time periods 1 through T, can we predict the links in time period T + 1? Specifically, we look at bipartite graphs changing over time and consider matrix- and tensor-based methods for predicting links. We present a weight-based method for collapsing multi-year data into a single matrix. We show how the well-known Katz method for link prediction can be extended to bipartite graphs and, moreover, approximated in a scalable way using a truncated singular value decomposition. Using a CANDECOMP/PARAFAC tensor decomposition of the data, we illustrate the usefulness of exploiting the natural three-dimensional structure of temporal link data. Through several numerical experiments, we demonstrate that both matrix- and tensor-based techniques are effective for temporal link prediction despite the inherent difficulty of the problem.
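The matrix-based pipeline in this abstract can be sketched roughly as follows. The decay weight `(1 - theta) ** (T - 1 - t)`, the parameter names `theta`, `beta` and `k`, and both function names are assumptions for illustration; the abstract does not give the paper's exact weighting. The Katz part follows from summing the geometric series over odd-length paths between the two node sets of a bipartite graph, which turns each singular value s into beta*s / (1 - (beta*s)^2).

```python
import numpy as np

def collapse(slices, theta=0.2):
    """Weighted sum of T bipartite adjacency matrices.

    Recent slices count more (assumed exponential decay).
    """
    T = len(slices)
    return sum((1.0 - theta) ** (T - 1 - t) * Z for t, Z in enumerate(slices))

def katz_bipartite(X, beta, k):
    """Rank-k Katz scores between the two node sets of a bipartite graph.

    X is the (rows x columns) bipartite adjacency matrix. Only odd-length
    paths connect a row node to a column node, so each singular value s
    is mapped to beta*s / (1 - (beta*s)**2).
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U, s, Vt = U[:, :k], s[:k], Vt[:k, :]
    if beta * s[0] >= 1.0:
        raise ValueError("beta must be smaller than 1 / largest singular value")
    psi = beta * s / (1.0 - (beta * s) ** 2)
    return (U * psi) @ Vt
```

A typical use would be `scores = katz_bipartite(collapse(slices), beta, k)`, ranking currently unlinked pairs by their score. For large sparse data a sparse truncated SVD routine would replace the dense `np.linalg.svd` call.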
Understanding Actor Loyalty to Event-Based Groups in Affiliation Networks
- 2010
Cited by 3 (1 self)
In this paper, we introduce a method for analyzing the temporal dynamics of affiliation networks. We define affiliation groups, which describe temporally related subsets of actors, and describe an approach for exploring changing memberships in these affiliation groups over time. To model the dynamic behavior in these networks, we consider the concept of loyalty and introduce a measure that captures an actor’s loyalty to an affiliation group as the degree of ‘commitment’ an actor shows to the group over time. We evaluate our measure using three real-world affiliation networks: a publication network, a Senate bill cosponsorship network and a dolphin network. The results show the utility of our measure for analyzing the dynamic behavior of actors and quantifying their loyalty to different time-varying affiliation groups.
Understanding severe weather processes through spatiotemporal relational random forests
- In Proceedings of the 2010 NASA Conference on Intelligent Data Understanding, 2010
Cited by 1 (1 self)
Major severe weather events can cause a significant loss of life and property. We seek to revolutionize our understanding of and ability to predict such events through the mining of severe weather data. Because weather is inherently a spatiotemporal phenomenon, mining such data requires a model capable of representing and reasoning about complex spatiotemporal dynamics, including temporally and spatially varying attributes and relationships. We introduce an augmented version of the Spatiotemporal Relational Random Forest, which is a Random Forest that learns with spatiotemporally varying relational data. Our algorithm maintains the strength and performance of Random Forests but extends their applicability, including the estimation of variable importance, to complex spatiotemporal relational domains. We apply the augmented Spatiotemporal Relational Random Forest to three severe weather data sets. These are: predicting atmospheric turbulence across the continental United States, examining the formation of tornadoes near strong frontal boundaries, and understanding the translation of drought across the southern plains of the United States. The results on such a wide variety of real-world domains demonstrate the extensive applicability of the Spatiotemporal Relational Random Forest. Our long-term goal is to significantly improve the ability to predict and warn about severe weather events.
General
Cited by 1 (0 self)
The data in many disciplines such as social networks, Web analysis, etc. is link-based, and the link structure can be exploited for many different data mining tasks. In this article, we consider the problem of temporal link prediction: Given link data for times 1 through T, can we predict the links at time T + 1? If our data has underlying periodic structure, can we predict out even further in time, i.e., links at time T + 2, T + 3, etc.? In this article, we consider bipartite graphs that evolve over time and consider matrix- and tensor-based methods for predicting future links. We present a weight-based method for collapsing multiyear data into a single matrix. We show how the well-known Katz method for link prediction can be extended to bipartite graphs and, moreover, approximated in a scalable way using a truncated singular value decomposition. Using a CANDECOMP/PARAFAC tensor decomposition of the data, we illustrate the usefulness of exploiting the natural three-dimensional structure of temporal link data. Through several numerical experiments, we demonstrate that both matrix- and tensor-based techniques are effective for temporal link prediction despite the inherent difficulty of the problem. Additionally, we show that tensor-based
USING SPATIOTEMPORAL RELATIONAL RANDOM FORESTS TO IMPROVE OUR UNDERSTANDING OF SEVERE WEATHER PROCESSES
Cited by 1 (1 self)
Major severe weather events can cause a significant loss of life and property. We seek to revolutionize our understanding of and our ability to predict such events through the mining of severe weather data. Because weather is inherently a spatiotemporal phenomenon, mining such data requires a model capable of representing and reasoning about complex spatiotemporal dynamics, including temporally and spatially varying attributes and relationships. We introduce an augmented version of the Spatiotemporal Relational Random Forest, which is a Random Forest that learns with spatiotemporally varying relational data. Our algorithm maintains the strength and performance of Random Forests but extends their applicability, including the estimation of variable importance, to complex spatiotemporal relational domains. We apply the augmented Spatiotemporal Relational Random Forest to three severe weather data sets. These are: predicting atmospheric turbulence across the continental United States, examining the formation of tornadoes near strong frontal boundaries, and understanding the spatial evolution of drought across the southern plains of the United States. The results on such a wide variety of real-world domains demonstrate the extensive applicability of the Spatiotemporal Relational Random Forest. Our long-term goal is to significantly improve the ability to predict and warn about severe weather events. We expect that the tools and techniques we develop will be applicable to a wide range of complex spatiotemporal phenomena. Keywords: spatiotemporal data mining, statistical relational learning, severe weather, random forests
DAPA-V10: Discovery and Analysis of Patterns and Anomalies in Volatile Time-Evolving Networks
We address the problems of finding patterns and detecting anomalous activities in volatile time-evolving networks such as communication networks (as opposed to slowly evolving networks like co-authorship graphs). Our approach, DAPA-V10, utilizes a simple compact graph representation that assigns weights to edges in a way that captures the frequency, duration, and recency of edges. Given this weighted “cumulative” graph, DAPA-V10 finds
Modeling the Evolution of Discussion Topics and Communication to Improve Relational Classification
Textual analysis is one means by which to assess communication type and moderate the influence of network structure in predictive models of individual behavior. However, there are few methods available to incorporate textual content into time-evolving network models. In particular, modeling both the evolution of network topology and textual content change in time-varying communication data poses a difficult challenge. In this work, we propose a Temporally-Evolving Network Classifier (TENC) to incorporate the influence of time-varying edges and temporally-evolving attributes in relational classification models. To facilitate this, we use an evolutionary latent topic approach to automatically discover and label communications between individuals in a network with their corresponding latent topic. The topics of the messages are incorporated into the TENC along with time-varying relationships and temporally-evolving attributes, using weighted, exponential kernel summarization. We evaluate the utility of the TENC on a real-world classification task, where the aim is to predict the effectiveness of a developer in the Python open-source developer network. We take advantage of the textual content in developer emails and bug communications, which both evolve over time. The TENC paired with the latent topics significantly improves performance over the baseline classifiers that only take into account the static properties of the topics and communications. The results show that the TENC can be used to accurately model the complete set of temporal dynamics in time-evolving communication networks.
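As a small illustration of exponential kernel summarization, the sketch below folds a sequence of per-time-step edge (or attribute) weights into one value that favors recent activity. The recursive form and the names `exponential_summary` and `theta` are assumptions; the abstract does not state the kernel's exact definition.

```python
def exponential_summary(weights, theta=0.5):
    """Exponentially weighted summary of a time-ordered weight sequence.

    Assumed recursion: s_t = (1 - theta) * s_{t-1} + theta * w_t,
    starting from s = 0, so older observations decay geometrically.
    """
    s = 0.0
    for w in weights:
        s = (1.0 - theta) * s + theta * w
    return s
```

With this form, a larger `theta` forgets history faster, and an interaction in the latest time step contributes more to the summary than the same interaction several steps earlier.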

