A Brief History of Generative Models for Power Law and Lognormal Distributions
 INTERNET MATHEMATICS
Cited by 420 (8 self)
Recently, I became interested in a current debate over whether file size distributions are best modelled by a power law distribution or a lognormal distribution. In trying
Power laws, Pareto distributions and Zipf’s law
Cited by 392 (0 self)
Many of the things that scientists measure have a typical size or “scale”—a typical value around which individual measurements are centred. A simple example would be the heights of human beings. Most adult human beings are about 180cm tall. There is some variation around this figure, notably depending on sex, but we never see people who are 10cm tall, or 500cm. To make this observation more quantitative, one can plot a histogram of people’s heights, as I have done in Fig. 1a. The figure shows the heights in centimetres of adult men in the United States measured between 1959 and 1962, and indeed the distribution is relatively narrow and peaked around 180cm. Another telling observation is the ratio of the heights of the tallest and shortest people.
I Tube, You Tube, Everybody Tubes: Analyzing the World’s Largest User Generated Content Video System
 In Proceedings of the 5th ACM/USENIX Internet Measurement Conference (IMC’07)
, 2007
Cited by 367 (6 self)
User Generated Content (UGC) is reshaping the way people watch video and TV, with millions of video producers and consumers. In particular, UGC sites are creating new viewing patterns and social interactions, empowering users to be more creative, and developing new business opportunities. To better understand the impact of UGC systems, we have analyzed YouTube, the world’s largest UGC VoD system. Based on a large amount of data collected, we provide an in-depth study of YouTube and other similar UGC systems. In particular, we study the popularity lifecycle of videos, the intrinsic statistical properties of requests and their relationship with video age, and the level of content aliasing or of illegal content in the system. We also provide insights on the potential for more efficient UGC VoD systems (e.g., utilizing P2P techniques or making better use of caching). Finally, we discuss the opportunities to leverage the latent demand for niche videos that are not reached today due to information filtering effects or other system scarcity distortions. Overall, we believe that the results presented in this paper are crucial in understanding UGC systems and can provide valuable information to ISPs, site administrators, and content owners with major commercial and technical implications.
A General Model of Web Graphs
, 2003
Cited by 117 (6 self)
We describe a very general model of a random graph process whose proportional degree sequence obeys a power law. Such laws have recently been observed in graphs associated with the world wide web.
The economics of social networks
 PROCEEDINGS OF THE 9TH WORLD CONGRESS OF THE ECONOMETRIC SOCIETY
, 2005
Cited by 112 (3 self)
The science of social networks is a central field of sociological study, a major application of random graph theory, and an emerging area of study by economists, statistical physicists and computer scientists. While these literatures are (slowly) becoming aware of each other, and on occasion drawing from one another, they are still largely distinct in their methods, interests, and goals. Here, my aim is to provide some perspective on the research from these literatures, with a focus on the formal modeling of social networks and the two major types of models: those based on random graphs and those based on game theoretic reasoning. I highlight some of the strengths, weaknesses, and potential synergies between these two network modeling approaches.
Fast and Accurate Phylogeny Reconstruction Algorithms Based on the Minimum-Evolution Principle
 JOURNAL OF COMPUTATIONAL BIOLOGY
, 2002
Cited by 99 (8 self)
The Minimum Evolution (ME) approach to phylogeny estimation has been shown to be statistically consistent when it is used in conjunction with ordinary least-squares (OLS) fitting of a metric to a tree structure. The traditional approach to using ME has been to start with the Neighbor Joining (NJ) topology for a given matrix and then do a topological search from that starting point. The first stage requires O(n³) time, where n is the number of taxa, while the current implementations of the second are in O(pn³) or more, where p is the number of swaps performed by the program. In this paper, we examine a greedy approach to minimum evolution which produces a starting topology in O(n²) time. Moreover, we provide an algorithm that searches for the best topology using nearest neighbor interchanges (NNIs), where the cost of doing p NNIs is O(n² + pn), i.e., O(n²) in practice because p is always much smaller than n. The Greedy Minimum Evolution (GME) algorithm, when used in combination with NNIs, produces trees which are fairly close to NJ trees in terms of topological accuracy. We also examine ME under a balanced weighting scheme, where sibling subtrees have equal weight, as opposed to the standard “unweighted” OLS, where
Stochastic Models and Descriptive Statistics for Phylogenetic Trees, from Yule to Today
 STATIST. SCI
, 2001
Cited by 96 (3 self)
Yule (1924) observed that distributions of number of species per genus were typically long-tailed, and proposed a stochastic model to fit this data. Modern taxonomists often prefer to represent relationships between species via phylogenetic trees; the counterpart to Yule's observation is that actual reconstructed trees look surprisingly unbalanced. The imbalance can readily be seen via a scatter diagram of the sizes of clades involved in the splits of published large phylogenetic trees. Attempting stochastic modeling leads to two puzzles. First, two somewhat opposite possible biological descriptions of what dominates the macroevolutionary process (adaptive radiation; "neutral" evolution) lead to exactly the same mathematical model (Markov or Yule or coalescent). Second, neither this nor any other simple stochastic model predicts the observed pattern of imbalance. This essay represents a probabilist's musings on these puzzles, complementing the more detailed survey of biol...
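As a rough illustration of the kind of model this abstract refers to, the sketch below grows a tree shape under a pure-birth (Yule-type) rule, in which every current leaf is equally likely to be the next to split, and reports the two clade sizes at the root. The function name `yule_root_split` and its parameters are hypothetical, chosen for this example only; this is not the essay's own code or notation. Under this rule the smaller root clade can take any size from 1 up to half the tree, so comparing simulated splits with splits from published trees is one simple way to look at imbalance.

```python
import random


def yule_root_split(n_leaves, rng):
    """Return the two clade sizes at the root of a simulated pure-birth tree.

    Assumption for this sketch: the tree grows by repeatedly splitting a
    leaf chosen uniformly at random. Only the root split is tracked, so
    it is enough to count how many leaves lie on each side of the root.
    """
    if n_leaves < 2:
        raise ValueError("need at least two leaves to have a root split")
    sizes = [1, 1]  # the first split creates one leaf on each side
    for _ in range(n_leaves - 2):
        # A uniformly chosen leaf lies on side 0 with probability
        # sizes[0] / (total number of leaves so far).
        total = sizes[0] + sizes[1]
        side = 0 if rng.random() < sizes[0] / total else 1
        sizes[side] += 1
    return tuple(sizes)


if __name__ == "__main__":
    rng = random.Random(1)
    # Print the smaller root clade for a few simulated 100-leaf trees.
    for _ in range(5):
        left, right = yule_root_split(100, rng)
        print(min(left, right), max(left, right))
```

Tracking only the root keeps the example short; a fuller simulation would store the whole tree and record the clade sizes at every internal split.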
The conditioned reconstructed process
 J. Theor. Biol
Cited by 78 (3 self)
We investigate a neutral model for speciation and extinction, the constant rate birth-death process. We condition the process to have n extant species today and look at the distribution of the reconstructed trees, i.e., the trees without the extinct species. Whereas the tree shape distribution is well-known and actually the same as under the pure birth process, no analytic results for the speciation times were known. We provide the distribution for the speciation times and calculate the expectations analytically. This characterizes the reconstructed trees completely. We will show how the results can be used to date phylogenies.
A Geometric Preferential Attachment Model of Networks
 In Algorithms and Models for the Web-Graph: Third International Workshop, WAW 2004
, 2004
Cited by 64 (4 self)
We study a random graph Gn that combines certain aspects of geometric random graphs and preferential attachment graphs. This model yields a graph with power-law degree distribution where the expansion property depends on a tunable parameter of the model. The vertices of Gn are n sequentially generated points x1, x2, ..., xn chosen uniformly at random from the unit sphere in R^3. After generating xt, we randomly connect it to m points from those points in x1, x2, ..., xt−1.