Crawling Facebook for Social Network Analysis Purposes
Cited by 24 (6 self)
We describe our work in the collection and analysis of massive data describing the connections between participants in online social networks. Alternative approaches to social network data collection are defined and evaluated in practice against the popular Facebook Web site. Thanks to our ad hoc, privacy-compliant crawlers, two large samples, comprising millions of connections, have been collected; the data is anonymous and organized as an undirected graph. We describe a set of tools that we developed to analyze specific properties of such social-network graphs, including degree distribution, centrality measures, scaling laws and distribution of friendship.
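As an illustration of the degree-distribution analysis this abstract mentions, here is a minimal sketch that is not taken from the paper. It assumes the crawled sample is available as a deduplicated list of undirected `(u, v)` pairs; the function name `degree_distribution` and the toy edge list are hypothetical.

```python
from collections import Counter


def degree_distribution(edges):
    """Return {degree: fraction of nodes with that degree}.

    Assumes `edges` is a deduplicated undirected edge list (an assumption
    of this sketch, not a statement about the paper's data format).
    """
    degree = Counter()
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        degree[u] += 1
        degree[v] += 1
    counts = Counter(degree.values())  # degree value -> number of nodes
    n = len(degree)
    return {k: c / n for k, c in sorted(counts.items())}


# Toy graph: node 0 has degree 3, nodes 1 and 2 have degree 2, node 3 has degree 1.
edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
dist = degree_distribution(edges)
```

On real samples the distribution is usually inspected on log-log axes, which is where the scaling laws mentioned above would show up.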
GraphPrism: Compact visualization of network structure
In AVI 2012: Advanced Visual Interfaces, ACM, 2012
Cited by 4 (1 self)
Visual methods for supporting the characterization, comparison, and classification of large networks remain an open challenge. Ideally, such techniques should surface useful structural features, such as effective diameter, small-world properties, and structural holes, that are not always apparent from either summary statistics or typical network visualizations. In this paper, we present GraphPrism, a technique for visually summarizing arbitrarily large graphs through combinations of ‘facets’, each corresponding to a single node- or edge-specific metric (e.g., transitivity). We describe a generalized approach for constructing facets by calculating distributions of graph metrics over increasingly large local neighborhoods and representing these as a stacked multiscale histogram. Evaluation with paper prototypes shows that, with minimal training, static GraphPrism diagrams can aid network analysis experts in performing basic analysis tasks with network data. Finally, we contribute the design of an interactive system using linked selection between GraphPrism overviews and node-link detail views. Using a case study of data from a co-authorship network, we illustrate how GraphPrism facilitates interactive exploration of network data.
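The facet construction described in this abstract (metric distributions over increasingly large neighborhoods, stacked into a multiscale histogram) can be sketched roughly as follows. This is an assumption-laden illustration, not GraphPrism's implementation: the metric chosen here (number of nodes reachable within r hops) and the names `neighborhood_sizes` and `multiscale_histogram` are hypothetical.

```python
from collections import Counter, deque


def neighborhood_sizes(adj, source, max_radius):
    """Number of nodes within r hops of `source` (excluding it), r = 1..max_radius."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_radius:
            continue  # do not expand beyond the largest radius
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    per_hop = Counter(dist.values())
    sizes, total = [], 0
    for r in range(1, max_radius + 1):
        total += per_hop.get(r, 0)
        sizes.append(total)
    return sizes


def multiscale_histogram(adj, max_radius):
    """One histogram (metric value -> node count) per radius; the rows stack into a facet."""
    rows = [Counter() for _ in range(max_radius)]
    for node in adj:
        for r, size in enumerate(neighborhood_sizes(adj, node, max_radius)):
            rows[r][size] += 1
    return [dict(row) for row in rows]


# Path graph 0-1-2-3 as an adjacency dict.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
facet = multiscale_histogram(adj, 2)
```

Each row of `facet` corresponds to one neighborhood radius; rendering the rows on top of each other gives the stacked view.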
Wikiwatchdog: Anomaly detection in Wikipedia through a distributional lens
In Proc. of IEEE/ACM Web Intelligence, 2011
Cited by 2 (2 self)
Wikipedia has become a standard source of reference online, and many people (some unknowingly) now trust this corpus of knowledge as an authority to fulfil their information requirements. In doing so they task the human contributors of Wikipedia with maintaining the accuracy of articles, a job that these contributors have been performing admirably. We study the problem of monitoring the Wikipedia corpus with the goal of automated, online anomaly detection. We present Wikiwatchdog, an efficient distribution-based methodology that monitors distributions of revision activity for changes. We show that using our methods it is possible to detect the activity of bots, flash events, and outages as they occur. Our methods are proposed to support the monitoring of the contributors. They are useful to speed up anomaly detection and to identify events that are hard to detect manually. We show the efficacy and the low false-positive rate of our methods by experiments on the revision history of Wikipedia. Our results show that distribution-based anomaly detection has a higher detection rate than traditional methods based on either volume or entropy alone. Unlike previous work on anomaly detection in information networks that worked with a static network graph, our methods consider the network as it evolves and monitor properties of the network for changes. Although our methodology is developed and evaluated on Wikipedia, we believe it is an effective generic anomaly detection framework in its own right.
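To make the idea of monitoring a distribution for changes concrete, the sketch below compares the distribution of revisions per editor in consecutive time windows and flags large shifts. The abstract does not say which distance measure or threshold Wikiwatchdog uses, so the Jensen-Shannon divergence, the value `0.3`, and the function names are assumptions chosen only for illustration.

```python
import math
from collections import Counter


def normalize(counts):
    """Turn raw counts into a probability distribution (assumes a non-empty window)."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits (range 0..1) between two discrete distributions."""
    keys = set(p) | set(q)
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def flag_anomalies(windows, threshold=0.3):
    """Return indices of windows whose distribution differs sharply from the previous one."""
    dists = [normalize(Counter(w)) for w in windows]
    return [i for i in range(1, len(dists))
            if js_divergence(dists[i - 1], dists[i]) > threshold]


# Each window lists the (hypothetical) editor of every revision in that interval.
windows = [
    ["a", "b", "c", "a"],
    ["a", "b", "c", "b"],
    ["bot", "bot", "bot", "bot"],  # sudden takeover by a single account
]
flagged = flag_anomalies(windows)
```

Note that total volume is identical in all three windows, so a purely volume-based detector would see nothing unusual; only the shape of the distribution changes.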
Distance-dependent Kronecker Graphs for Modeling Social Networks
2009
Cited by 2 (1 self)
This paper focuses on a generalization of stochastic Kronecker graphs, introducing a Kronecker-like operator and defining a family of generator matrices H dependent on distances between nodes in a specified graph embedding. We prove that any lattice-based network model with sufficiently small distance-dependent connection probability will have a Poisson degree distribution and provide a general framework to prove searchability for such a network. Using this framework, we focus on a specific example of an expanding hypercube and discuss the similarities and differences of such a model with recently proposed network models based on a hidden metric space. We also prove that a greedy forwarding algorithm can find very short paths of length O((log log n)²) on the hypercube with n nodes, demonstrating that distance-dependent Kronecker graphs can generate searchable network models.
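For readers unfamiliar with the baseline being generalized, here is a minimal sketch of an ordinary stochastic Kronecker graph: the probability of edge (u, v) is the matching entry of the k-th Kronecker power of a small initiator matrix. It does not implement the distance-dependent generator matrices H from the paper; the initiator values and function names are illustrative assumptions.

```python
import random


def kronecker_edge_probability(initiator, k, u, v):
    """Entry (u, v) of the k-th Kronecker power of `initiator`.

    Computed digit by digit in base b = len(initiator), so the full
    b**k x b**k matrix is never materialized.
    """
    b = len(initiator)
    p = 1.0
    for _ in range(k):
        p *= initiator[u % b][v % b]
        u //= b
        v //= b
    return p


def sample_kronecker_graph(initiator, k, seed=0):
    """Sample directed edges independently with the Kronecker-power probabilities."""
    rng = random.Random(seed)
    n = len(initiator) ** k
    return [(u, v) for u in range(n) for v in range(n)
            if rng.random() < kronecker_edge_probability(initiator, k, u, v)]


initiator = [[0.9, 0.5], [0.5, 0.1]]  # hypothetical 2x2 initiator
p_00 = kronecker_edge_probability(initiator, 2, 0, 0)  # 0.9 * 0.9
p_03 = kronecker_edge_probability(initiator, 2, 0, 3)  # 0.5 * 0.5
```

The quadratic loop in `sample_kronecker_graph` is only practical for small k; it is kept simple here to show the definition rather than an efficient sampler.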
Generalizing Kronecker graphs in order to model searchable networks
In Proc. Forty-Seventh Annual Allerton Conference, 2009
Cited by 2 (0 self)
This paper describes an extension to stochastic Kronecker graphs that provides the special structure required for searchability, by defining a “distance”-dependent Kronecker operator. We show how this extension of Kronecker graphs can generate several existing social network models, such as the Watts-Strogatz small-world model and Kleinberg’s lattice-based model. We focus on a specific example of an expanding hypercube, reminiscent of recently proposed social network models based on a hidden hyperbolic metric space, and prove that a greedy forwarding algorithm can find very short paths of length O((log log n)²) for graphs with n nodes.
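Greedy forwarding itself is simple to state: each node hands the message to whichever neighbor is closest to the target. The sketch below runs it on a plain hypercube with Hamming distance, where the path length equals the Hamming distance between source and target. It does not build the paper's expanding hypercube; the optional `extra_links` argument is a hypothetical stand-in for the additional long-range edges such a model would supply.

```python
def hamming(a, b):
    """Number of bit positions in which node labels a and b differ."""
    return bin(a ^ b).count("1")


def greedy_route(source, target, dim, extra_links=None):
    """Greedy forwarding on a `dim`-dimensional hypercube.

    `extra_links` (hypothetical) maps a node to additional neighbors.
    A hypercube neighbor strictly closer to the target always exists,
    so the loop terminates.
    """
    extra_links = extra_links or {}
    path = [source]
    current = source
    while current != target:
        neighbors = [current ^ (1 << i) for i in range(dim)]
        neighbors += extra_links.get(current, [])
        current = min(neighbors, key=lambda x: hamming(x, target))
        path.append(current)
    return path


plain = greedy_route(0b0000, 0b1011, 4)
shortcut = greedy_route(0b0000, 0b1011, 4, extra_links={0b0000: [0b1011]})
```

Comparing `plain` with `shortcut` shows why well-placed long-range links matter for searchability: the same greedy rule finds a much shorter path when such links exist.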
Craftsmen Versus Designers: The Difference of In-Depth Cognitive Levels at the Early Stage of Idea Generation
In ICoRD'13, 2013
Cited by 1 (1 self)
This paper investigates the in-depth cognitive levels at the early stage of idea generation for craftsmen and designers. Examining this early stage may explain the fundamental thoughts in observing and defining design problems. We conducted an experiment using a think-aloud protocol, where verbalized thoughts were analyzed using a concept network method based on associative concept analysis. Furthermore, we identified semantic relationships based on Factor Analysis. The findings showed that craftsmen tended to activate low-weighted associative concepts at the in-depth cognitive level with a smaller number of polysemous features, thus explaining their concerns about tangible-related issues, such as proportion and shape. Designers, however, activated highly weighted associative concepts with more polysemous features, and they were typically concerned with intangible issues, such as surrounding context (i.e., eating culture) and users’ affective preferences (i.e., companion, appeal).
A General and Scalable Approach to Mixed Membership Clustering
Cited by 1 (0 self)
Spectral clustering methods are elegant and effective graph-based node clustering methods, but they do not allow mixed membership clustering. We describe an approach that first transforms the data from a node-centric representation to an edge-centric one, and then uses this representation to define a scalable and competitive mixed membership alternative to spectral clustering methods. Experimental results show the proposed approach improves substantially in mixed membership clustering tasks over node clustering methods. Keywords: clustering; scalable methods; unsupervised learning; large scale learning; mixed membership clustering
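One way to picture the node-centric to edge-centric transformation is sketched below: edges become the items to be clustered, and a node's mixed membership is read off as the share of its incident edges that fall in each cluster. The abstract does not give the actual construction, so this is an assumed simplification; the edge labels are supplied by hand here in place of a real clustering step, and both function names are hypothetical.

```python
from collections import defaultdict


def edge_centric_view(edges):
    """Map each edge index to the indices of edges that share an endpoint with it."""
    incident = defaultdict(list)  # node -> indices of incident edges
    for i, (u, v) in enumerate(edges):
        incident[u].append(i)
        incident[v].append(i)
    neighbors = {i: set() for i in range(len(edges))}
    for idxs in incident.values():
        for i in idxs:
            neighbors[i].update(j for j in idxs if j != i)
    return neighbors


def node_memberships(edges, edge_labels):
    """Mixed membership of each node: fraction of its incident edges per cluster."""
    counts = defaultdict(lambda: defaultdict(int))
    for (u, v), label in zip(edges, edge_labels):
        counts[u][label] += 1
        counts[v][label] += 1
    return {node: {lab: c / sum(labs.values()) for lab, c in labs.items()}
            for node, labs in counts.items()}


# Two triangles sharing node 2; labels stand in for the output of an edge clustering.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (2, 4)]
edge_labels = ["A", "A", "A", "B", "B", "B"]
line_graph = edge_centric_view(edges)
memberships = node_memberships(edges, edge_labels)
```

Every edge belongs to exactly one cluster, yet the shared node 2 ends up half in "A" and half in "B", which is the mixed membership a hard node clustering cannot express.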
COMPUTATIONAL METHODS FOR LEARNING AND INFERENCE ON DYNAMIC NETWORKS
2012
Cited by 1 (1 self)
ACKNOWLEDGEMENTS First and foremost, I would like to thank my advisor, Professor Alfred Hero, for his guidance and mentorship. I have learned a tremendous amount about statistics, signal processing, and the research process from working as a research assistant in his group. My interactions with him have undoubtedly helped me develop and mature as a researcher. I also extend thanks to my other committee members, Professor George Michailidis, Professor Mark Newman, and Professor Rajesh Rao Nadakuditi, for their valuable input to this dissertation. I am grateful to have worked alongside such a talented group of graduate students and postdoctoral fellows in Professor Hero’s group. I would particularly like to acknowledge Dr. Mark Kliger; I had the pleasure of working with him on what became Chapters II–
Distributed-Memory Parallel Algorithms for Generating Massive Scale-free Networks Using Preferential Attachment Model
Cited by 1 (0 self)
Various random networks are widely used in modeling and analyzing complex systems. As complex systems grow larger, generating random networks with billions of nodes or more has become a necessity. Generation of such massive networks requires efficient and parallel algorithms. Naive parallelization of the sequential algorithms for generating random networks may not work due to the dependencies among the edges and the possibility of creating duplicate (parallel) edges. In this paper, we present MPI-based distributed-memory parallel algorithms for generating random scale-free networks using the preferential-attachment model. Our algorithms scale very well to a large number of processors and provide almost linear speedups. Our algorithms can generate scale-free networks with 50 billion edges in 123 seconds using 768 processors.
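The dependency and duplicate-edge problems mentioned in this abstract are easiest to see in a sequential generator. The following is a minimal single-process sketch of preferential attachment, not the paper's MPI algorithm; the seed clique of m + 1 nodes and the name `preferential_attachment` are assumptions of this example.

```python
import random


def preferential_attachment(n, m, seed=0):
    """Sequentially grow a graph where each new node links to m existing nodes.

    Targets are drawn with probability proportional to current degree.
    Assumes m >= 1 and n >= m + 1.
    """
    rng = random.Random(seed)
    # Seed graph: a clique on nodes 0..m (an assumption of this sketch).
    edges = [(u, v) for u in range(m + 1) for v in range(u)]
    # Every node appears in `endpoints` once per incident edge, so a uniform
    # draw from this list is a degree-proportional draw over nodes.
    endpoints = [x for e in edges for x in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:  # redraw until m distinct targets: no duplicate edges
            targets.add(rng.choice(endpoints))
        for t in targets:
            edges.append((new, t))
            endpoints.extend((new, t))
    return edges


graph = preferential_attachment(100, 3)
```

Each draw reads the `endpoints` list that all earlier nodes have extended, and the `targets` set rejects repeats; these are the two things a naive split of the loop across processors would break.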
Structural Analysis of Large Networks: Observations and Applications
2010
Cited by 1 (0 self)
Network data (also referred to as relational data, social network data, real graph data) has become ubiquitous, and understanding patterns in this data has become an important research problem. We investigate how interactions in social networks are formed and how these interactions facilitate diffusion, model these behaviors, and apply these findings to real-world problems. We examined graphs of size up to 16 million nodes, across many domains from academic citation networks, to campaign contributions and actor-movie networks. We also performed several case studies in online social networks such as blogs and message board communities. Our major contributions are the following: (a) We discover several surprising patterns in network topology and interactions, such as Popularity Decay power law (inlinks to a blog post decay with a power law with −1.5 exponent) and the oscillating