Integrating content-based access mechanisms with hierarchical file systems (1999)

by Burra Gopal, Udi Manber

Results 1 - 10 of 48

Distributed Object Location in a Dynamic Network

by Kirsten Hildrum, John D. Kubiatowicz, Satish Rao, Ben Y. Zhao, 2004
Cited by 155 (16 self)
Modern networking applications replicate data and services widely, leading to a need for location-independent routing---the ability to route queries to objects using names independent of the objects' physical locations. Two important properties of such a routing infrastructure are routing locality and rapid adaptation to arriving and departing nodes. We show how these two properties can be efficiently achieved for certain network topologies. To do this, we present a new distributed algorithm that can solve the nearest-neighbor problem for these networks. We describe our solution in the context of Tapestry, an overlay network infrastructure that employs techniques proposed by Plaxton et al. [24].

File System Support for Delta Compression

by Joshua P. MacDonald, 2000
Cited by 53 (0 self)
Delta compression, which consists of compactly encoding one file version as the result of changes to another, can improve efficiency in the use of network and disk resources. Delta compression techniques are readily available and can result in compression factors of five to ten on typical data. Managing delta-compressed storage, however, is a difficult task. I will present a system that attempts to isolate the complexity of delta-compressed storage management by separating the task of version labeling from performance issues. I will show how the system integrates delta-compressed transport with delta-compressed storage. Existing tools for managing delta-compressed storage suffer from weak file system support. Lack of transaction support is responsible for inefficient application behavior. The only atomic operation in the traditional file system forces unnecessary disk activity due to copying costs. I will demonstrate that transaction support can improve application performance and extensibility wit...

Connections: using context to enhance file search

by Craig A. N. Soules, Gregory R. Ganger - In Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP ’05), 2005
Cited by 43 (3 self)
Connections is a file system search tool that combines traditional content-based search with context information gathered from user activity. By tracing file system calls, Connections can identify temporal relationships between files and use them to expand and reorder traditional content search results. Doing so improves both recall (reducing false-positives) and precision (reducing false-negatives). For example, Connections improves the average recall (from 13% to 22%) and precision (from 23% to 29%) on the first ten results. When averaged across all recall levels, Connections improves precision from 17% to 28%. Connections provides these benefits with only modest increases in average query time (2 seconds), indexing time (23 seconds daily), and index size (under 1% of the user’s data set).

Secure Attribute-Based Systems

by Matthew Pirretti, Patrick Traynor, Patrick McDaniel - In ACM Conference on Computer and Communications Security (CCS ’06), 2006
Cited by 30 (4 self)
Attributes define, classify, or annotate the datum to which they are assigned. However, traditional attribute architectures and cryptosystems are ill-equipped to provide security in the face of diverse access requirements and environments. In this paper, we introduce a novel secure information management architecture based on emerging attribute-based encryption (ABE) primitives. A policy system that meets the needs of complex policies is defined and illustrated. Based on the needs of those policies, we propose cryptographic optimizations that vastly improve enforcement efficiency. We further explore the use of such policies in two example applications: a HIPAA-compliant distributed file system and a social network. A performance analysis of our ABE system and example applications demonstrates the ability to reduce cryptographic costs by as much as 98% over previously proposed constructions. Through this, we demonstrate that our attribute system is an efficient solution for securely managing information in large, loosely-coupled, distributed systems.

A nine year study of file system and storage benchmarking

by Avishay Traeger, Erez Zadok, Nikolai Joukov, Charles P. Wright - ACM Transactions on Storage, 2008
Cited by 20 (4 self)
Benchmarking is critical when evaluating performance, but is especially difficult for file and storage systems. Complex interactions between I/O devices, caches, kernel daemons, and other OS components result in behavior that is rather difficult to analyze. Moreover, systems have different features and optimizations, so no single benchmark is always suitable. The large variety of workloads that these systems experience in the real world also adds to this difficulty. In this article we survey 415 file system and storage benchmarks from 106 recent papers. We found that most popular benchmarks are flawed and many research papers do not provide a clear indication of true performance. We provide guidelines that we hope will improve future performance evaluations. To show how some widely used benchmarks can conceal or overemphasize overheads, we conducted a set of experiments. As a specific example, slowing down read operations on ext2 by a factor of 32 resulted in only a 2–5% wall-clock slowdown in a popular compile benchmark. Finally, we discuss future work to improve file system and storage benchmarking.

Exploiting the potential of concept lattices for information retrieval with CREDO

by Claudio Carpineto, Giovanni Romano - Journal of Universal Computer Science, 2004
Cited by 19 (2 self)
The recent advances in Formal Concept Analysis (FCA) together with the major changes faced by modern Information Retrieval (IR) provide new unprecedented challenges and opportunities for FCA-based IR applications. The main advantage of FCA for IR is the possibility of creating a conceptual representation of a given document collection in the form of a document lattice, which may be used both to improve the retrieval of specific items and to drive the mining of the collection’s contents. In this paper, we will examine the best features of FCA for solving IR tasks that could not be easily addressed by conventional systems, as well as the most critical aspects for building FCA-based IR applications. These observations have led to the development of CREDO, a system that allows the user to query Web documents and see retrieval results organized in a browsable concept lattice. This is the second major focus of the paper. We will show that CREDO is especially useful for quickly locating the documents corresponding to the meaning of interest among those retrieved in response to an ambiguous query, or for mining the contents of the documents that reference a given entity. An on-line version of the system is available for testing at

A File System Based on Concept Analysis

by Sébastien Ferré, Olivier Ridoux - Int. Conf. Rules and Objects in Databases, LNCS 1861, 2000
Cited by 18 (9 self)
We present the design of a file system whose organization is based on Concept Analysis "à la Wille-Ganter". The aim is to combine querying and navigation facilities in one formalism. The file system is supposed to offer a standard interface but the interpretation of common notions like directories is new. The contents of a file system is interpreted as a Formal Context, directories as Formal Concepts, and the sub-directory relation as Formal Concepts inclusion. We present an organization that allows for an efficient implementation of such a Conceptual File System. 1 Introduction: Querying vs. Navigation. Information retrieval includes representation, storage, organization, and access to information. Two information retrieval methods are widely adopted and applied. The first method is hierarchical classification, which is frequently found in computer tools: e.g., file systems, bookmarks, or menus. In this model, searches are done by navigating in a classification structure t...

Distributed Data Location in a Dynamic Network

by Kirsten Hildrum, John D. Kubiatowicz, Satish Rao, Ben Y. Zhao - In Proc. of ACM SPAA, 2002
Cited by 18 (5 self)
Modern networking applications replicate data and services widely, leading to a need for location-independent routing -- the ability to route queries directly to objects using names that are independent of the objects' physical locations. Two important properties of a routing infrastructure are routing locality and rapid adaptation to arriving and departing nodes. We show how these two properties can be achieved with an efficient solution to the nearest-neighbor problem. We present a new distributed algorithm that can solve the nearest-neighbor problem for a restricted metric space. We describe our solution in the context of Tapestry, an overlay network infrastructure that employs techniques proposed by Plaxton, Rajaraman, and Richa [16].

Why can't I find my files? New methods for automating attribute assignment

by Craig A. N. Soules, Gregory R. Ganger - Proceedings of the Ninth Workshop on Hot Topics in Operating Systems, 2003
Cited by 16 (2 self)
Attribute-based naming enables powerful search and organization tools for ever-increasing user data sets. However, such tools are only useful in combination with accurate attribute assignment. Existing systems rely on user input and content analysis, but they have enjoyed minimal success. This paper discusses new approaches to automatically assigning attributes to files, including several forms of context analysis, which has been highly successful in the Google web search engine. With extensions like application hints (e.g., web links for downloaded files) and inter-file relationships, it should be possible to infer useful attributes for many files, making attribute-based search tools more effective.

Spyglass: Fast, Scalable Metadata Search for Large-Scale Storage Systems

by Andrew W. Leung, Minglong Shao, Tim Bisson, Shankar Pasupathy, Ethan L. Miller, 2008
Cited by 16 (5 self)
As storage systems reach the petabyte scale, it has become increasingly difficult for users and storage administrators to understand and manage their data. File metadata, such as inode and extended attributes, are a valuable source of information that can aid in locating and identifying files, and can also facilitate administrative tasks, such as storage provisioning and recovery from backups. Unfortunately, most storage systems have no way to quickly and easily search file metadata at large scale. To address these issues, we developed Spyglass, an indexing system that efficiently gathers, indexes and queries file metadata in large-scale storage systems. Our analysis of file metadata from real-world workloads showed that metadata has spatial locality in the storage namespace and that the distribution of metadata is highly skewed. Based on these findings, we designed Spyglass to use index partitioning and signature files to quickly prune the file search space. We also developed techniques to efficiently handle index versioning, facilitating both fast update and queries across historical indexes. Experiments on systems with up to 300 million files show that the Spyglass prototype is as much as several thousand times faster than current database solutions while requiring only a fraction of the space.