Results

### Basic models and questions in statistical network analysis

Abstract: Extracting information from large graphs has become an important statistical problem since network data is now common in various fields. In this minicourse we will investigate the most natural statistical questions for three canonical probabilistic models of networks: (i) community detection in the stochastic block model, (ii) finding the embedding of a random geometric graph, and (iii) finding the original vertex in a preferential attachment tree. Along the way we will cover many interesting topics in probability theory such as Pólya urns, large deviation theory, concentration of measure in high dimension, entropic central limit theorems, and more.

Outline:
- Lecture 1: A primer on exact recovery in the general stochastic block model.
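As an illustration of the third model named in the abstract, here is a minimal sketch of how a preferential attachment tree can be grown. It assumes the common convention that the tree starts from a single edge between vertices 0 and 1 and that each new vertex attaches to an existing vertex chosen with probability proportional to its current degree. The function names and the seed convention are hypothetical and are not taken from the minicourse.

```python
import random


def preferential_attachment_tree(n, seed=None):
    """Grow a preferential attachment tree on vertices 0..n-1.

    Assumption: the tree starts from the single edge {0, 1}. Returns a dict
    mapping each vertex v >= 1 to the vertex it attached to.
    """
    if n < 2:
        raise ValueError("need at least two vertices")
    rng = random.Random(seed)
    parent = {1: 0}
    # Each vertex appears in `endpoints` once per incident edge, so a uniform
    # draw from this list is a degree-proportional draw over vertices.
    endpoints = [0, 1]
    for v in range(2, n):
        u = rng.choice(endpoints)
        parent[v] = u
        endpoints.extend((u, v))
    return parent


def degrees(parent, n):
    """Degree sequence of the tree described by `parent`."""
    deg = [0] * n
    for child, par in parent.items():
        deg[child] += 1
        deg[par] += 1
    return deg
```

Keeping one list entry per edge endpoint avoids recomputing degree weights at every step. Recovering the original vertex from the unlabeled tree, the question the abstract raises, is a separate estimation problem that this sketch does not address.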

### Testing for high-dimensional geometry in random graphs

Abstract: We study the problem of detecting the presence of an underlying high-dimensional geometric structure in a random graph. Under the null hypothesis, the observed graph is a realization of an Erdős-Rényi random graph G(n, p). Under the alternative, the graph is generated from the G(n, p, d) model, where each vertex corresponds to a latent independent random vector uniformly distributed on the sphere S^{d−1}, and two vertices are connected if the corresponding latent vectors are close enough. In the dense regime (i.e., p is a constant), we propose a near-optimal and computationally efficient testing procedure based on a new quantity which we call signed triangles. The proof of the detection lower bound is based on a new bound on the total variation distance between a Wishart matrix and an appropriately normalized GOE matrix. In the sparse regime, we make a conjecture for the optimal detection boundary. We conclude the paper with some preliminary steps on the problem of estimating the dimension in G(n, p, d).
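The abstract does not define the signed triangle statistic, so the sketch below rests on an assumption: it takes the statistic to be the sum over vertex triples {i, j, k} of (A_ij − p)(A_ik − p)(A_jk − p), where A is the adjacency matrix. To keep the geometric sampler simple it also fixes p = 1/2, where "close enough" can be taken as a non-negative inner product between latent vectors. All function names are hypothetical.

```python
import numpy as np


def sample_geometric_graph(n, d, rng):
    """Sample G(n, 1/2, d): latent points uniform on the sphere in R^d,
    with an edge whenever the inner product is non-negative (assumed rule)."""
    x = rng.standard_normal((n, d))
    # Normalised Gaussian vectors are uniformly distributed on the sphere.
    x /= np.linalg.norm(x, axis=1, keepdims=True)
    adj = (x @ x.T >= 0.0).astype(float)
    np.fill_diagonal(adj, 0.0)
    return adj


def sample_erdos_renyi(n, p, rng):
    """Sample the adjacency matrix of G(n, p)."""
    upper = np.triu(rng.random((n, n)) < p, k=1).astype(float)
    return upper + upper.T


def signed_triangles(adj, p):
    """Sum over triples {i, j, k} of (A_ij - p)(A_ik - p)(A_jk - p)."""
    centered = adj - p
    np.fill_diagonal(centered, 0.0)
    # With a zero diagonal, trace(B^3) counts each unordered triple 6 times.
    return np.trace(centered @ centered @ centered) / 6.0


rng = np.random.default_rng(0)
geo_stat = signed_triangles(sample_geometric_graph(100, 3, rng), 0.5)
er_stat = signed_triangles(sample_erdos_renyi(100, 0.5, rng), 0.5)
```

In low dimension the statistic should be much larger for the geometric graph than for the Erdős-Rényi graph, because the centered edge indicators have mean-zero products under the null. Calibrating a rejection threshold, and handling general p, are left out of this sketch.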