Results 11-20 of 26
Diversity Coloring for Distributed Storage in Mobile Networks
, 2001
Cited by 3 (0 self)
Abstract: Storing multiple copies of files is crucial for ensuring quality of service for data storage in mobile networks. This paper proposes a new scheme, called the K-out-of-N file distribution scheme, for the placement of files. In this scheme files are split, and Reed-Solomon codes or other maximum distance separable (MDS) codes are used to produce file segments containing parity information. Multiple copies of the file segments are stored on gateways in the network in such a way that every gateway can retrieve enough file segments from itself and its neighbors within a certain number of hops to reconstruct the original files. The goal is to minimize the maximum number of hops it takes for any gateway to get enough file segments for the file reconstruction. We formulate the K-out-of-N file distribution scheme as a coloring problem we call diversity coloring. A diversity coloring is defined to be optimal if it uses the smallest number of colors. Upper and lower bounds on the performance of diversity coloring for general graphs are studied. Diversity coloring algorithms for several special classes of graphs (trees, rings and tori) are presented, all of which have linear time complexity. Both the algorithm for trees and the algorithm for rings output optimal diversity colorings. The algorithm for tori is guaranteed to output an optimal diversity coloring when the tori are sufficiently large. Index Terms: Data storage, diversity coloring, file assignment problem (FAP), graph coloring, K-out-of-N scheme, maximum distance separable (MDS) codes, mobile computing, quality of service
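The hop-count objective described in this abstract can be illustrated with a small checker. This is only a sketch under assumptions, not the paper's algorithm: each node is assumed to hold one segment type (its "color"), the graph is an adjacency dictionary, and the function names `hops_to_collect` and `worst_case_hops` are hypothetical. The paper's formal definition of diversity coloring may differ in detail.

```python
from collections import deque


def hops_to_collect(adj, color, start, k):
    """Smallest hop radius at which `start` reaches k distinct colors (None if impossible)."""
    seen = {start}
    colors = {color[start]}
    if len(colors) >= k:
        return 0
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adj[node]:
            if neighbor in seen:
                continue
            seen.add(neighbor)
            colors.add(color[neighbor])
            # BFS discovers nodes in order of distance, so the first time
            # k colors are seen gives the minimal radius.
            if len(colors) >= k:
                return dist + 1
            queue.append((neighbor, dist + 1))
    return None


def worst_case_hops(adj, color, k):
    """Maximum over all nodes of the radius needed to collect k distinct colors."""
    radii = [hops_to_collect(adj, color, v, k) for v in adj]
    return None if any(r is None for r in radii) else max(radii)


# Illustrative data: a ring of 6 nodes with two different 3-colorings.
ring = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
alternating = {i: i % 3 for i in range(6)}   # colors 0,1,2,0,1,2
blocked = {i: i // 2 for i in range(6)}      # colors 0,0,1,1,2,2
```

On this ring with K = 3, the alternating coloring lets every node collect three distinct segments within one hop, while the blocked coloring needs two hops, which is the kind of difference the placement problem tries to minimize.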
Multi-Cluster Interleaving on Linear Arrays and Rings
Cited by 3 (3 self)
Interleaving codewords is an important method not only for combating burst errors, but also for flexible data retrieval. This paper defines the Multi-Cluster Interleaving (MCI) problem, an interleaving problem for parallel data retrieval. The MCI problems on linear arrays and rings are studied. The following problem is completely solved: how to interleave integers on a linear array or ring such that any m (m ≥ 2) non-overlapping segments of length 2 in the array or ring have at least 3 distinct integers. We then present a scheme using a 'hierarchical-chain structure' to solve the following more general problem for linear arrays: how to interleave integers on a linear array such that any m (m ≥ 2) non-overlapping segments of length L (L ≥ 2) in the array have at least L + 1 distinct integers. It is shown that the scheme using the 'hierarchical-chain structure' solves the second interleaving problem for arrays that are asymptotically as long as the longest array on which an MCI exists, and clearly, for shorter arrays as well.
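The interleaving condition stated in this abstract can be checked by brute force on small linear arrays. The sketch below is an illustration only, not the paper's construction; the function name `is_mci` is hypothetical, and it covers linear arrays rather than rings.

```python
from itertools import combinations


def is_mci(array, m, seg_len):
    """True if every choice of m non-overlapping length-seg_len segments
    of `array` contains at least seg_len + 1 distinct integers."""
    starts = range(len(array) - seg_len + 1)
    for combo in combinations(starts, m):
        # combinations() yields sorted start indices; skip overlapping choices.
        if any(b - a < seg_len for a, b in zip(combo, combo[1:])):
            continue
        values = set()
        for s in combo:
            values.update(array[s:s + seg_len])
        if len(values) < seg_len + 1:
            return False
    # Vacuously true when the array is too short to hold m segments.
    return True
```

For example, with m = 2 and segments of length 2, the array `[1, 2, 3, 1]` satisfies the condition, while `[1, 2, 3, 1, 2, 3]` does not, because the two segments `(1, 2)` at positions 0 and 3 together contain only two distinct integers.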
Symmetric Allocations for Distributed Storage
Cited by 2 (1 self)
We consider the problem of optimally allocating a given total storage budget in a distributed storage system. A source has a data object which it can code and store over a set of storage nodes; it is allowed to store any amount of coded data in each node, as long as the total amount of storage used does not exceed the given budget. A data collector subsequently attempts to recover the original data object by accessing each of the nodes independently with some constant probability. By using an appropriate code, successful recovery occurs when the total amount of data in the accessed nodes is at least the size of the original data object. The goal is to find an optimal storage allocation that maximizes the probability of successful recovery. This optimization problem is challenging because of its discrete nature and nonconvexity, despite its simple formulation. Symmetric allocations (in which all nonempty nodes store the same amount of data), though intuitive, may be suboptimal; the problem is nontrivial even if we optimize over only symmetric allocations. Our main result shows that the symmetric allocation that spreads the budget maximally over all nodes is asymptotically optimal in a regime of interest. Specifically, we derive an upper bound for the suboptimality of this allocation and show that the performance gap vanishes asymptotically in the specified regime. Further, we explicitly find the optimal symmetric allocation for a variety of cases. Our results can be applied to distributed storage systems and other problems dealing with reliability under uncertainty, including delay-tolerant networks (DTNs) and content delivery networks (CDNs).
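The recovery probability of a symmetric allocation follows directly from the setup in this abstract, and a short calculation makes the non-monotone behavior visible. The sketch below assumes the data object has unit size, the budget is expressed in those units, and the budget is spread evenly over m nodes, each accessed independently with probability p. The function names are hypothetical and the code is an illustration, not the paper's analysis.

```python
from fractions import Fraction
from math import ceil, comb


def recovery_probability(m, budget, p):
    """P(success) when `budget` is split evenly over m nodes.

    Each node stores budget/m, so recovery needs at least
    ceil(m / budget) accessed nodes out of the m nonempty ones.
    """
    need = ceil(Fraction(m) / Fraction(budget))
    return sum(comb(m, r) * p**r * (1 - p) ** (m - r) for r in range(need, m + 1))


def best_symmetric_spread(n, budget, p):
    """Number of nonempty nodes (1..n) that maximizes the recovery probability."""
    return max(range(1, n + 1), key=lambda m: recovery_probability(m, budget, p))
```

With n = 4 nodes, a budget of 2 and p = 0.5, spreading over 1, 2, 3 and 4 nodes gives probabilities 0.5, 0.75, 0.5 and 0.6875, so in this small case the best symmetric allocation uses two nodes rather than all four.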
Server Placements, Roman Domination and Other Dominating Set Variants
Cited by 2 (0 self)
Dominating sets in their many variations model a wealth of optimization problems like facility location or distributed file sharing. For instance, when a request can occur at any node in a graph and requires a server at that node, a minimum dominating set represents a minimum set of servers that serve an arbitrary single request by moving a server along at most one edge. This paper studies domination problems for two requests. For the problem of placing a minimum number of servers such that two requests at different nodes can be served with two different servers (called win-win), we present a logarithmic approximation, and we prove that nothing better is possible. We show that the same is true for Roman domination, the well-studied problem variant that asks for each vertex to either possess its own server or to have a neighbor with two servers. Still the same is true if each idle server can move along one edge while the first of both requests is being served. For planar graphs, we propose a PTAS for Roman domination (and show that nothing better exists), and we get a constant approximation for win-win.
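The Roman domination condition quoted in this abstract (every vertex either has its own server or has a neighbor with two servers) is easy to verify for a given placement. The sketch below is a feasibility checker only, with a hypothetical function name; it does not implement the approximation algorithms the paper describes.

```python
def is_roman_dominating(adj, servers):
    """Check the Roman domination condition for a server placement.

    adj: dict mapping each vertex to a list of its neighbors.
    servers: dict mapping vertices to server counts (missing means 0).
    """
    for v in adj:
        if servers.get(v, 0) >= 1:
            continue  # v has its own server
        if not any(servers.get(u, 0) >= 2 for u in adj[v]):
            return False  # v has no server and no neighbor with two servers
    return True


# Illustrative data: a star with center 0 and leaves 1, 2, 3.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
```

On this star, placing two servers at the center is feasible with a total of two servers, whereas a single server at the center leaves every leaf unprotected.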
Fundamental Limits for Information Retrieval
, 1999
Cited by 2 (1 self)
The fundamental limits of performance for a general model of information retrieval from databases are studied. In the scenarios considered, a large quantity of information is to be stored on some physical storage device. Requests for information are modeled as a randomly generated sequence with a known distribution. The requests are assumed to be "context-dependent", i.e., to vary according to the sequence of previous requests. The state of the physical storage device is also assumed to depend on the history of previous requests. In general the logical structure of the information to be stored does not match the physical structure of the storage device, and consequently there are nontrivial limits on the minimum achievable average access times, where the average is over the possible sequences of user requests. The paper applies basic information-theoretic methods to establish these limits and demonstrates constructive procedures that approach them, for a wide class of systems. Allowing ...
Large matchings in uniform hypergraphs and the conjectures of Erdős and Samuels
Cited by 2 (0 self)
In this paper we study conditions which guarantee the existence of perfect matchings and perfect fractional matchings in uniform hypergraphs. We reduce this problem to an old conjecture by Erdős on estimating the maximum number of edges in a hypergraph when the (fractional) matching number is given, which we are able to solve in some special cases using probabilistic techniques. Based on these results, we obtain some general theorems on the minimum d-degree ensuring the existence of perfect (fractional) matchings. In particular, we asymptotically determine the minimum vertex degree which guarantees a perfect matching in 4-uniform and 5-uniform hypergraphs. We also discuss an application to a problem of finding an optimal data allocation in a distributed storage system.
Distributed Storage Allocations
, 2010
Cited by 1 (0 self)
We examine the problem of allocating a given total storage budget in a distributed storage system for maximum reliability. A source has a single data object that is to be coded and stored over a set of storage nodes; it is allowed to store any amount of coded data in each node, as long as the total amount of storage used does not exceed the given budget. A data collector subsequently attempts to recover the original data object by accessing only the data stored in a random subset of the nodes. By using an appropriate code, successful recovery can be achieved whenever the total amount of data accessed is at least the size of the original data object. The goal is to find an optimal storage allocation that maximizes the probability of successful recovery. This optimization problem is challenging in general because of its combinatorial nature, despite its simple formulation. We study several variations of the problem, assuming different allocation models and access models. The optimal allocation and the optimal symmetric allocation (in which all nonempty nodes store the same amount of data) are determined for a variety of cases. Our results indicate that the optimal allocations often have nonintuitive structure and are difficult to specify. We also show that depending on the circumstances, coding may or may not be beneficial for reliable storage.
Speedup Of Data Access using . . .
ISIT 2003, Yokohama, Japan, June 29 - July 4, 2003
, 2003
Peer-to-peer networks are networks of heterogeneous computers sharing files or services. This paper proposes a data storage scheme based on maximum distance separable codes to optimize the dissemination of data in the network and thereby improve data access globally.
1 General field of research Research Statement
My research interest is in the general field of information networks. My study and research are in the areas of algorithms, combinatorial and convex optimization, distributed systems and information theory. So far my research has focused on two fields: file storage in networks, and wireless ad hoc communication and sensor networks. I plan to use my research experience and knowledge to explore broader aspects of information networks, including overlay storage/distribution networks, sensor networks and many other forms, all essential for pervasive computing. Two key components shared by different kinds of information networks are data storage/sharing and network structure design/utilization. The first component, data storage/sharing, requires optimized placement of data for efficient access, even when the users of the data are extensively distributed, mobile or have very different communication and computing capabilities. Information theory can be applied to help both the storage and the retrieval of data to achieve an optimal performance/redundancy tradeoff. Examples include the storage of shared files in networks using erasure codes for high availability, rate allocation for nodes collecting data in sensor networks, fractional cascading of information for fast data detection and location, multicast based on network coding, etc. The second component, network structure design/utilization, is on the design of real or overlay network