Network Coding for Distributed Storage Systems
, 2008
Cited by 327 (13 self)
Distributed storage systems provide reliable access to data through redundancy spread over individually unreliable nodes. Application scenarios include data centers, peer-to-peer storage systems, and storage in wireless networks. Storing data using an erasure code, in fragments spread across nodes, requires less redundancy than simple replication for the same level of reliability. However, since fragments must be periodically replaced as nodes fail, a key question is how to generate encoded fragments in a distributed way while transferring as little data as possible across the network. For an erasure coded system, a common practice to repair from a node failure is for a new node to download subsets of data stored at a number of surviving nodes, reconstruct a lost coded block using the downloaded data, and store it at the new node. We show that this procedure is suboptimal. We introduce the notion of regenerating codes, which allow a new node to download functions of the stored data from the surviving nodes. We show that regenerating codes can significantly reduce the repair bandwidth. Further, we show that there is a fundamental tradeoff between storage and repair bandwidth which we theoretically characterize using flow arguments on an appropriately constructed graph. By invoking constructive results in network coding, we introduce regenerating codes that can achieve any point in this optimal tradeoff.
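The abstract above does not state the tradeoff itself. As a rough illustration, the sketch below uses the two extreme points usually quoted from the body of that paper (minimum-storage and minimum-bandwidth regenerating codes) and checks them against a cut-set condition. The symbols are assumptions taken from the literature, not from this listing: `M` is the file size, any `k` nodes must suffice to recover the file, a new node contacts `d` surviving nodes, `alpha` is the storage per node, and `gamma` is the total repair download. All function names are hypothetical.

```python
from fractions import Fraction


def msr_point(M, k, d):
    """Minimum-storage point: (alpha, gamma) with alpha as small as possible."""
    alpha = Fraction(M, k)
    gamma = Fraction(M * d, k * (d - k + 1))
    return alpha, gamma


def mbr_point(M, k, d):
    """Minimum-bandwidth point: storage and repair download coincide."""
    value = Fraction(2 * M * d, 2 * k * d - k * k + k)
    return value, value


def cut_set_capacity(alpha, gamma, k, d):
    """Smallest flow to a data collector: sum over i < k of min(alpha, (d - i) * beta)."""
    beta = gamma / d  # amount downloaded from each of the d helper nodes
    return sum(min(alpha, (d - i) * beta) for i in range(k))


if __name__ == "__main__":
    M, k, d = 1, 5, 9
    for name, (alpha, gamma) in (("MSR", msr_point(M, k, d)),
                                 ("MBR", mbr_point(M, k, d))):
        print(name, "alpha =", alpha, "gamma =", gamma,
              "capacity =", cut_set_capacity(alpha, gamma, k, d))
```

For comparison, the conventional repair described in the abstract downloads enough fragments to rebuild the file, which is a total of `M` under these assumptions, so any `gamma` below `M` is a saving.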
Identification of peer effects through social networks
 Journal of Econometrics
, 2009
Cited by 149 (17 self)
We provide new results regarding the identification of peer effects. We consider an extended version of the linear-in-means model where each individual has his own specific reference group. Interactions are thus structured through a social network. We assume that correlated unobservables are either absent, or treated as fixed effects at the component level. In both cases, we provide easy-to-check necessary and sufficient conditions for identification. We show that endogenous and exogenous effects are generally identified under network interaction, although identification may fail for some particular structures. Monte Carlo simulations provide an analysis of the effects of some crucial characteristics of a network (i.e., density, intransitivity) on the estimates of social effects. Our approach generalizes a number of previous results due to Manski (1993), Moffitt (2001), and Lee (2007).
Necessary and sufficient graphical conditions for formation control of unicycles
, 2005
Cited by 126 (5 self)
This paper studies the feasibility of achieving a specified formation among a group of autonomous unicycles by local distributed control. The directed graph defined by the information flow plays a key role. It is proved that formation stabilization to a point is feasible if and only if the sensor digraph has a globally reachable node. A similar result is given for formation stabilization to a line and to more general geometric arrangements.
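The graphical condition in this abstract can be tested mechanically. The following is a minimal sketch, assuming the usual definition that a node is globally reachable when every other node has a directed path to it. The edge orientation (`(u, v)` meaning an edge from `u` to `v`) and the function name are assumptions for illustration; the abstract does not fix a convention.

```python
from collections import deque


def globally_reachable_nodes(n, edges):
    """Return the nodes that every other node can reach by a directed path.

    n: number of nodes, labelled 0..n-1
    edges: iterable of (u, v) pairs, each a directed edge u -> v
    """
    # Reversed adjacency list: searching backwards from a target visits
    # exactly the nodes that can reach it in the original digraph.
    reverse_adj = [[] for _ in range(n)]
    for u, v in edges:
        reverse_adj[v].append(u)

    result = []
    for target in range(n):
        seen = {target}
        queue = deque([target])
        while queue:
            node = queue.popleft()
            for pred in reverse_adj[node]:
                if pred not in seen:
                    seen.add(pred)
                    queue.append(pred)
        # The target is globally reachable when the backward search covers all nodes.
        if len(seen) == n:
            result.append(target)
    return result


if __name__ == "__main__":
    print(globally_reachable_nodes(3, [(0, 1), (1, 2)]))   # chain 0 -> 1 -> 2
    print(globally_reachable_nodes(4, [(0, 1), (2, 3)]))   # two disconnected pairs
```

Under the stated result, a sensor digraph for which this function returns an empty list would not admit formation stabilization to a point. The check runs one breadth-first search per node, which is adequate for small groups.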
Subexponential parameterized algorithms on graphs of bounded genus and H-minor-free graphs
Cited by 61 (20 self)
... Building on these results, we develop subexponential fixed-parameter algorithms for dominating set, vertex cover, and set cover in any class of graphs excluding a fixed graph H as a minor. In particular, this general category of graphs includes planar graphs, bounded-genus graphs, single-crossing-minor-free graphs, and any class of graphs that is closed under taking minors. Specifically, the running time is 2^{O(√k)} n^h, where h is a constant depending only on H, which is polynomial for k = O(log² n). We introduce a general approach for developing algorithms on H-minor-free graphs, based on structural results about H-minor-free graphs at the
Deterministic regenerating codes for distributed storage
 IN ALLERTON CONFERENCE ON CONTROL, COMPUTING, AND COMMUNICATION (URBANA-CHAMPAIGN, IL)
, 2007
Cited by 51 (7 self)
It is well known that erasure coding can be used in storage systems to efficiently store data while protecting against failures. Conventionally, the design of erasure codes has focused on the tradeoff between redundancy and reliability; under this criterion, a Maximum Distance Separable (MDS) code is optimal. However, practical storage systems call for additional considerations. In particular, the codes must be properly maintained to recover from node failures. Previous work by Dimakis et al. studied the problem of properly maintaining erasure codes to reduce the incurred network bandwidth, established fundamental bounds on the minimum repair bandwidth for maintaining MDS codes, and showed that the repair bandwidth can be reduced further at the cost of higher storage. In this paper we present techniques for constructing codes that achieve the optimal tradeoffs between storage efficiency and repair bandwidth.
A Framework for Representing Reticulate Evolution
 ANNALS OF COMBINATORICS
, 2004
Cited by 48 (5 self)
Acyclic directed graphs (ADGs) are increasingly being viewed as more appropriate for representing certain evolutionary relationships, particularly in biology, than rooted trees. In this paper, we develop a framework for the analysis of these graphs which we call hybrid phylogenies. We are particularly interested in the problem whereby one is given a set of phylogenetic trees and wishes to determine a hybrid phylogeny that ‘embeds’ each of these trees and which requires the smallest number of hybridisation events. We show that this quantity can be greatly reduced if additional species are involved, and investigate other combinatorial aspects of this and related questions.
Almost 2-SAT is fixed-parameter tractable
 Journal of Computer and System Sciences
Cited by 40 (5 self)
We consider the following problem. Given a 2-CNF formula, is it possible to remove at most k clauses so that the resulting 2-CNF formula is satisfiable? This problem is known to different research communities in Theoretical Computer Science under the names 'Almost 2-SAT', 'All-but-k 2-SAT', '2-CNF deletion', and '2-SAT deletion'. The status of fixed-parameter tractability of this problem is a long-standing open question in the area of Parameterized Complexity. We resolve this open question by proposing an algorithm which solves this problem in O(15^k · k · m^3) time, and thus we show that this problem is fixed-parameter tractable.
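To make the problem statement concrete, here is a small sketch that decides the question by brute force: it tries every set of at most k clauses to delete and tests the remainder with a standard 2-SAT check on the implication graph. This is not the fixed-parameter algorithm of the paper, and its running time grows with the number of clause subsets. The literal encoding (`+v` for a variable, `-v` for its negation) and both function names are assumptions for illustration.

```python
from itertools import combinations


def is_2sat_satisfiable(num_vars, clauses):
    """Decide 2-SAT via strongly connected components (Kosaraju's algorithm).

    Variables are 1..num_vars; clauses is a list of (a, b) literal pairs meaning (a OR b).
    """
    def index(lit):
        # Node 2*(v-1) stands for variable v, node 2*(v-1)+1 for its negation.
        return 2 * (abs(lit) - 1) + (1 if lit < 0 else 0)

    n = 2 * num_vars
    adj = [[] for _ in range(n)]
    radj = [[] for _ in range(n)]
    for a, b in clauses:
        # (a OR b) is equivalent to (NOT a -> b) and (NOT b -> a).
        for src, dst in ((index(-a), index(b)), (index(-b), index(a))):
            adj[src].append(dst)
            radj[dst].append(src)

    # Pass 1: iterative depth-first search recording finishing order.
    order, visited = [], [False] * n
    for start in range(n):
        if visited[start]:
            continue
        visited[start] = True
        stack = [(start, 0)]
        while stack:
            node, i = stack.pop()
            if i < len(adj[node]):
                stack.append((node, i + 1))
                nxt = adj[node][i]
                if not visited[nxt]:
                    visited[nxt] = True
                    stack.append((nxt, 0))
            else:
                order.append(node)

    # Pass 2: label components on the reversed graph, latest finisher first.
    comp = [-1] * n
    label = 0
    for start in reversed(order):
        if comp[start] != -1:
            continue
        comp[start] = label
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in radj[node]:
                if comp[nxt] == -1:
                    comp[nxt] = label
                    stack.append(nxt)
        label += 1

    # Satisfiable exactly when no variable shares a component with its negation.
    return all(comp[2 * v] != comp[2 * v + 1] for v in range(num_vars))


def almost_2sat_bruteforce(num_vars, clauses, k):
    """Return True if deleting at most k clauses leaves a satisfiable formula."""
    for r in range(min(k, len(clauses)) + 1):
        for removed in combinations(range(len(clauses)), r):
            removed_set = set(removed)
            kept = [c for i, c in enumerate(clauses) if i not in removed_set]
            if is_2sat_satisfiable(num_vars, kept):
                return True
    return False
```

For example, the four clauses `(1, 2), (-1, 2), (1, -2), (-1, -2)` over two variables are jointly unsatisfiable, so the answer is no for k = 0 and yes for k = 1.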
An Extended Class of Instrumental Variables for the Estimation of Causal Effects
 UCSD DEPT. OF ECONOMICS DISCUSSION PAPER
, 1996
Cited by 38 (15 self)
This paper builds on the structural equations, treatment effect, and machine learning literatures to provide a causal framework that permits the identification and estimation of causal effects from observational studies. We begin by providing a causal interpretation for standard exogenous regressors and standard “valid” and “relevant” instrumental variables. We then build on this interpretation to characterize extended instrumental variables (EIV) methods, that is, methods that make use of variables that need not be valid instruments in the standard sense, but that are nevertheless instrumental in the recovery of causal effects of interest. After examining special cases of single and double EIV methods, we provide necessary and sufficient conditions for the identification of causal effects by means of EIV and provide consistent and asymptotically normal estimators for the effects of interest.