Results 1 - 8 of 8
Link Mining: A Survey
SigKDD Explorations Special Issue on Link Mining, 2005
Cited by 31 (0 self)
Many datasets of interest today are best described as a linked collection of interrelated objects. These may represent homogeneous networks, in which there is a single-object type and link type, or richer, heterogeneous networks, in which there may be multiple object and link types (and possibly other semantic information). Examples of homogeneous networks include single-mode social networks, such as people connected by friendship links, or the WWW, a collection of linked web pages. Examples of heterogeneous networks include those in medical domains describing patients, diseases, treatments and contacts, or in bibliographic domains describing publications, authors, and venues. Link mining refers to data mining techniques that explicitly consider these links when building predictive or descriptive models of the linked data. Commonly addressed link mining tasks include object ranking, group detection, collective classification, link prediction and subgraph discovery. While network analysis has been studied in depth in particular areas such as social network analysis, hypertext mining, and web analysis, only recently has there been a cross-fertilization of ideas among these different communities. This is an exciting, rapidly expanding area. In this article, we review some of the common emerging themes.
Information survival threshold in sensor and P2P networks
Proceedings of 26th Annual IEEE ICC, 2007
Cited by 6 (3 self)
Consider a network of, say, sensors, or P2P nodes, or Bluetooth-enabled cell-phones, where nodes transmit information to each other and where links and nodes can go up or down. Consider also a ‘datum’, that is, a piece of information, like a report of an emergency condition in a sensor network, a national traditional song, or a mobile phone virus. How often should nodes transmit the datum to each other, so that the datum can survive (or, in the virus case, under what conditions will the virus die out)? Clearly, the link and node fault probabilities are important, but what else is needed to ascertain the survivability of the datum? We propose and solve the problem using non-linear dynamical systems and fixed point stability theorems. We provide a closed-form formula that, surprisingly, depends on only one additional parameter, the largest eigenvalue of the connectivity matrix. We illustrate the accuracy of our analysis on realistic and real settings, like mote sensor networks from Intel and MIT, as well as Gnutella and P2P networks.
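The abstract above names the largest eigenvalue of the connectivity matrix as the one extra parameter in its formula, but the listing does not give the formula itself. The sketch below therefore only shows one common way to estimate that eigenvalue, shifted power iteration, and does not attempt to reproduce the paper's threshold. It assumes an undirected network stored as a symmetric 0/1 adjacency matrix; the function name `largest_eigenvalue` and its parameters are hypothetical and chosen for illustration.

```python
import math


def largest_eigenvalue(adj, iters=1000, tol=1e-12):
    """Estimate the largest eigenvalue of a symmetric, non-negative
    adjacency matrix (list of lists) by power iteration.

    Assumption: the graph is undirected, so adj is symmetric.
    The iteration multiplies by (A + I) instead of A so that bipartite
    graphs, whose spectrum contains both +lambda and -lambda, still converge.
    """
    n = len(adj)
    if n == 0:
        return 0.0
    v = [1.0 / math.sqrt(n)] * n  # positive start vector
    lam = 0.0
    for _ in range(iters):
        # w = (A + I) v
        w = [v[i] + sum(adj[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        w = [x / norm for x in w]
        # Rayleigh quotient w^T A w gives the eigenvalue estimate for A itself
        aw = [sum(adj[i][j] * w[j] for j in range(n)) for i in range(n)]
        new_lam = sum(w[i] * aw[i] for i in range(n))
        if abs(new_lam - lam) < tol:
            return new_lam
        lam = new_lam
        v = w
    return lam


if __name__ == "__main__":
    # Triangle (complete graph on three nodes)
    triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
    print(largest_eigenvalue(triangle))
```

For large sparse networks a sparse eigensolver would normally replace the dense loops used here; the dense version is kept only because it is short and dependency-free.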
CMU-ML-10-111 Structural Analysis of Large Networks: Observations and Applications
2010
Keywords: Social networks, data mining, network diffusion, anomaly detection. Network data (also referred to as relational data, social network data, real graph data) has become ubiquitous, and understanding patterns in this data has become an important research problem. We investigate how interactions in social networks are formed and how these interactions facilitate diffusion, model these behaviors, and apply these findings to real-world problems. We examined graphs of up to 16 million nodes, across many domains, from academic citation networks to campaign contributions and actor-movie networks. We also performed several case studies in online social networks such as blogs and message board communities. Our major contributions are the following: (a) We discover several surprising patterns in network topology and interactions, such as Popularity Decay power law (in-links to a blog post decay with a power law with −1.5 exponent) and the oscillating
Fast Algorithms for Querying and Mining Large Graphs
2009
Graphs appear in a wide range of settings and have posed a wealth of fascinating problems. In this thesis, we focus on two types of tasks according to the interaction with users: (1) querying (e.g., given a social network, how to measure the closeness between two persons? how to track it over time?) and (2) mining (e.g., how to identify abnormal behaviors of computer networks? In the case of virus attacks, which nodes are the best to immunize?). The task of querying includes three sub-tasks. In the first one, we found that many complex user-specific patterns on large graphs can be answered by means of proximity measurement. In other words, proximity allows us to query large graphs on the atomic level. We support our claim by conducting three case studies (connection subgraphs, user feedback, and gateway), all of which (despite their diversity) rely on the proximity
Magrathea: Building and Analyzing Ubiquitous and Social Systems
Ubiquitous systems are rapidly becoming a more and more commonplace part of our everyday life. These systems may contain different classes of very heterogeneous components that have to function seamlessly together. A prime example of a class of ubiquitous components is given by the personal mobile devices. They are all pervasive and emerge in many forms: mobile handsets, PDAs, etc. Their features and computational powers make them a very capable platform. We present a pervasive agent- and sensing platform Magrathea that can be run on different kinds of computational devices. Magrathea can be used to build complex pervasive systems. As a practical example of the usage of this platform, we use it on top of personal mobile devices to investigate the structure of social networks of different individuals and to simulate viral behavior of agents. We also discuss analytical tools to further investigate, model and simulate the data obtained through our platform.
TAGs: Scalable Threshold-Based Algorithms for Proximity Computation in Graphs
A fundamental and very useful operation in graphs is the computation of the proximity between nodes, i.e., the degree of dissimilarity (or similarity) between two nodes v and u. This is an important tool both in graph databases and graph mining applications, because it provides the base to support more complex tasks such as graph partitioning, clustering, and classification, to name a few. All methods proposed in the literature assume that proximity is computed on a single graph by using a single distance measure. In addition, most of them focus on the proximity between node pairs. In this work, we present, for the first time, scalable algorithms that: (i) support proximity computation in multiple graph instances, (ii) enable the utilization of several distance measures, (iii) support proximity queries around a source node without limiting to node pairs and (iv) support extensions for metric-based and skyline query processing. The main result of our work is the design of Threshold Algorithms for Graphs (denoted as TAGs), which are studied and evaluated experimentally by using real-life as well as synthetic graphs, based on both the G(n, p) Erdős-Rényi model and power law degree distributions.
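To make the idea of a proximity query around a source node concrete, here is a minimal sketch that returns every node within a given shortest-path distance of a source, on a single graph and with a single distance measure. It is plain Dijkstra with a cutoff, not the TAGs algorithms, whose details the abstract does not give. The function name `nodes_within`, the adjacency-list layout and the example graph are assumptions made for illustration.

```python
import heapq


def nodes_within(graph, source, radius):
    """Return {node: distance} for all nodes whose shortest-path distance
    from `source` is at most `radius`.

    Assumption: `graph` maps each node to a list of (neighbor, weight)
    pairs with non-negative weights. This is ordinary Dijkstra with a
    distance cutoff, used here only as a stand-in proximity measure.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry, a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            # Only keep candidates inside the radius that improve on the best known distance
            if nd <= radius and nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


if __name__ == "__main__":
    # Hypothetical example graph
    g = {
        "a": [("b", 1), ("c", 4)],
        "b": [("c", 1), ("d", 5)],
        "c": [("d", 1)],
        "d": [],
    }
    print(nodes_within(g, "a", 2))
```

Pruning at the radius keeps the search local to the source, which is the property that makes queries of this kind cheaper than computing all pairwise distances.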

