Results 1 - 10 of 65
A Brief History of Generative Models for Power Law and Lognormal Distributions
- INTERNET MATHEMATICS
Cited by 192 (7 self)
Recently, I became interested in a current debate over whether file size distributions are best modelled by a power law distribution or a lognormal distribution. In trying
I Tube, You Tube, Everybody Tubes: Analyzing the World’s Largest User Generated Content Video System
- In Proceedings of the 5th ACM/USENIX Internet Measurement Conference (IMC’07)
, 2007
Cited by 109 (5 self)
User Generated Content (UGC) is re-shaping the way people watch video and TV, with millions of video producers and consumers. In particular, UGC sites are creating new viewing patterns and social interactions, empowering users to be more creative, and developing new business opportunities. To better understand the impact of UGC systems, we have analyzed YouTube, the world’s largest UGC VoD system. Based on a large amount of data collected, we provide an in-depth study of YouTube and other similar UGC systems. In particular, we study the popularity life-cycle of videos, the intrinsic statistical properties of requests and their relationship with video age, and the level of content aliasing or of illegal content in the system. We also provide insights on the potential for more efficient UGC VoD systems (e.g. utilizing P2P techniques or making better use of caching). Finally, we discuss the opportunities to leverage the latent demand for niche videos that are not reached today due to information filtering effects or other system scarcity distortions. Overall, we believe that the results presented in this paper are crucial in understanding UGC systems and can provide valuable information to ISPs, site administrators, and content owners with major commercial and technical implications.
A General Model of Web Graphs
, 2003
Cited by 72 (6 self)
We describe a very general model of a random graph process whose proportional degree sequence obeys a power law. Such laws have recently been observed in graphs associated with the world wide web.
Stochastic Models and Descriptive Statistics for Phylogenetic Trees, from Yule to Today
- STATIST. SCI
, 2001
Cited by 39 (3 self)
Yule (1924) observed that distributions of number of species per genus were typically long-tailed, and proposed a stochastic model to fit this data. Modern taxonomists often prefer to represent relationships between species via phylogenetic trees; the counterpart to Yule's observation is that actual reconstructed trees look surprisingly unbalanced. The imbalance can readily be seen via a scatter diagram of the sizes of clades involved in the splits of published large phylogenetic trees. Attempting stochastic modeling leads to two puzzles. First, two somewhat opposite possible biological descriptions of what dominates the macroevolutionary process (adaptive radiation; "neutral" evolution) lead to exactly the same mathematical model (Markov or Yule or coalescent). Second, neither this nor any other simple stochastic model predicts the observed pattern of imbalance. This essay represents a probabilist's musings on these puzzles, complementing the more detailed survey of biol...
Fast and Accurate Phylogeny Reconstruction Algorithms Based on the Minimum-Evolution Principle
- JOURNAL OF COMPUTATIONAL BIOLOGY
, 2002
Cited by 39 (3 self)
The Minimum Evolution (ME) approach to phylogeny estimation has been shown to be statistically consistent when it is used in conjunction with ordinary least-squares (OLS) fitting of a metric to a tree structure. The traditional approach to using ME has been to start with the Neighbor Joining (NJ) topology for a given matrix and then do a topological search from that starting point. The first stage requires O(n³) time, where n is the number of taxa, while the current implementations of the second are in O(p n³) or more, where p is the number of swaps performed by the program. In this paper, we examine a greedy approach to minimum evolution which produces a starting topology in O(n²) time. Moreover, we provide an algorithm that searches for the best topology using nearest neighbor interchanges (NNIs), where the cost of doing p NNIs is O(n² + p n), i.e., O(n²) in practice because p is always much smaller than n. The Greedy Minimum Evolution (GME) algorithm, when used in combination with NNIs, produces trees which are fairly close to NJ trees in terms of topological accuracy. We also examine ME under a balanced weighting scheme, where sibling subtrees have equal weight, as opposed to the standard “unweighted” OLS, where
The economics of social networks
- PROCEEDINGS OF THE 9TH WORLD CONGRESS OF THE ECONOMETRIC SOCIETY
, 2005
Cited by 31 (2 self)
The science of social networks is a central field of sociological study, a major application of random graph theory, and an emerging area of study by economists, statistical physicists and computer scientists. While these literatures are (slowly) becoming aware of each other, and on occasion drawing from one another, they are still largely distinct in their methods, interests, and goals. Here, my aim is to provide some perspective on the research from these literatures, with a focus on the formal modeling of social networks and the two major types of models: those based on random graphs and those based on game theoretic reasoning. I highlight some of the strengths, weaknesses, and potential synergies between these two network modeling approaches.
A Geometric Preferential Attachment Model of Networks
- In Algorithms and Models for the Web-Graph: Third International Workshop, WAW 2004
, 2004
Cited by 24 (1 self)
We study a random graph $G_n$ that combines certain aspects of geometric random graphs and preferential attachment graphs. This model yields a graph with power-law degree distribution where the expansion property depends on a tunable parameter of the model. The vertices of $G_n$ are $n$ sequentially generated points $x_1, x_2, \ldots, x_n$ chosen uniformly at random from the unit sphere in $\mathbb{R}^3$. After generating $x_t$, we randomly connect it to $m$ points from those points in $x_1, x_2, \ldots, x_{t-1}$.
Counting graph homomorphisms
- In: Topics in Discrete Math
, 2006
Cited by 23 (6 self)
Counting homomorphisms between graphs (often with weights) comes up in a wide variety of areas, including extremal graph theory, properties of graph products, partition functions in statistical physics and property testing of large graphs. In this paper we survey recent developments in the study of homomorphism numbers, including the characterization of the homomorphism numbers in terms of the semidefiniteness of “connection matrices”, and some applications of this fact in extremal graph theory. We define a distance of two graphs in terms of similarity of their global structure, which also reflects the closeness of (appropriately scaled) homomorphism numbers into the two graphs. We use homomorphism numbers to define convergence of a sequence of graphs, and show that a graph sequence is convergent if and only if it is Cauchy in this distance. Every convergent graph sequence has a limit in the form of a symmetric measurable function in two variables. We use these notions of distance and graph limits to give a general theory for parameter testing. The convergence can also be characterized in terms of mappings of the graphs into fixed small graphs, which is strongly connected to important parameters like ground state energy in statistical physics, and to weighted maximum cut problems in computer science.
Nonequilibrium phase transition in the coevolution of networks and opinions
- Growth dynamics of the World-Wide Web,” Nature: VOL 401: 9 SEPTEMBER
, 2006
Cited by 16 (1 self)
Models of the convergence of opinion in social systems have been the subject of a considerable amount of recent attention in the physics literature. These models divide into two classes, those in which individuals form their beliefs based on the opinions of their neighbors in a social network of personal acquaintances, and those in which, conversely, network connections form between individuals of similar beliefs. While both of these processes can give rise to realistic levels of agreement between acquaintances, practical experience suggests that opinion formation in the real world is not a result of one process or the other, but a combination of the two. Here we present a simple model of this combination, with a single parameter controlling the balance of the two processes. We find that the model undergoes a continuous phase transition as this parameter is varied, from a regime in which opinions are arbitrarily diverse to one in which most individuals hold the same opinion. We characterize the static and dynamical properties of this transition.
Random Deletion In A Scale Free Random Graph Process
, 2004
Cited by 14 (3 self)
We study a dynamically evolving random graph which adds vertices and edges using preferential attachment and deletes vertices randomly. At time $t$, with probability $\alpha_1 > 0$ we add a new vertex $u_t$ and $m$ random edges incident with $u_t$. The neighbours of $u_t$ are chosen with probability proportional to degree. With probability $\alpha - \alpha_1 \ge 0$ we add $m$ random edges to existing vertices where the endpoints are chosen with probability proportional to degree. With probability $1 - \alpha - \alpha_0$ we delete a random vertex, if there are vertices left to delete, and with probability $\alpha_0$ we delete $m$ random edges. Assuming that $\alpha + \alpha_1 + \alpha_0 > 1$ and $\alpha_0$ is sufficiently small, we show that for large $k, t$, the expected number of vertices of degree $k$ is approximately $d_k t$ where, as $k \to \infty$, $d_k \approx C k^{-1-\beta}$ with $\beta = \frac{2(\alpha - \alpha_0)}{3\alpha - 1 - \alpha_1 - \alpha_0}$ and $C > 0$ a constant. Note that $\beta$ can take any value greater than 1.
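The process in this abstract can be tried out with a small simulation. The following is only a rough sketch, not the authors' code. It assumes the four step probabilities are parameterized as alpha1 (add a vertex), alpha - alpha1 (add edges), 1 - alpha - alpha0 (delete a vertex) and alpha0 (delete edges), and that the exponent formula reads beta = 2(alpha - alpha0) / (3 alpha - 1 - alpha1 - alpha0). The function names, the two-vertex seed graph, and the uniform fallback when no edges remain are all illustrative choices.

```python
import random
from collections import Counter


def simulate(steps, m, alpha, alpha1, alpha0, seed=0):
    """Toy simulation of preferential attachment with random deletion.

    Returns (vertices, edges). The parameterization of the step
    probabilities is an assumption made for this sketch.
    """
    if not (0 < alpha1 <= alpha and alpha0 >= 0 and alpha + alpha0 <= 1):
        raise ValueError("need 0 < alpha1 <= alpha and alpha + alpha0 <= 1")
    rng = random.Random(seed)
    vertices = {0, 1}   # hypothetical seed graph: one edge between two vertices
    edges = [(0, 1)]    # multigraph stored as a list of endpoint pairs
    next_id = 2

    def pick_by_degree():
        # A uniformly random endpoint of a uniformly random edge is a
        # vertex chosen with probability proportional to its degree.
        if edges:
            return rng.choice(rng.choice(edges))
        # Assumption: with no edges left, fall back to a uniform choice.
        return rng.choice(sorted(vertices))

    for _ in range(steps):
        r = rng.random()
        if r < alpha1:
            # Add a new vertex with m edges to degree-biased neighbours.
            targets = [pick_by_degree() for _ in range(m)] if vertices else []
            u, next_id = next_id, next_id + 1
            vertices.add(u)
            edges.extend((u, v) for v in targets)
        elif r < alpha:
            # Add m edges whose endpoints are both degree-biased.
            if vertices:
                edges.extend((pick_by_degree(), pick_by_degree())
                             for _ in range(m))
        elif r < 1 - alpha0:
            # Delete a uniformly random vertex and its incident edges.
            if vertices:
                v = rng.choice(sorted(vertices))
                vertices.remove(v)
                edges[:] = [e for e in edges if v not in e]
        else:
            # Delete m uniformly random edges (or all, if fewer remain).
            for _ in range(min(m, len(edges))):
                edges.pop(rng.randrange(len(edges)))
    return vertices, edges


def degree_counts(vertices, edges):
    """Map each degree k to the number of vertices having that degree."""
    deg = Counter({v: 0 for v in vertices})
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return Counter(deg.values())


def predicted_beta(alpha, alpha1, alpha0):
    """Exponent beta from the formula as read from the abstract."""
    return 2 * (alpha - alpha0) / (3 * alpha - 1 - alpha1 - alpha0)
```

For example, `degree_counts(*simulate(20000, 2, 0.8, 0.6, 0.1))` gives an empirical degree histogram whose tail can be compared with the slope `-1 - predicted_beta(0.8, 0.6, 0.1)` on a log-log plot. Vertex deletion here scans the whole edge list, so the sketch is only suitable for small runs.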

