Results 1 - 10 of 4,020
The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Datasets
- JOURNAL OF NETWORK AND COMPUTER APPLICATIONS
, 1999
"... In an increasing number of scientific disciplines, large data collections are emerging as important community resources. In this paper, we introduce design principles for a data management architecture called the Data Grid. We describe two basic services that we believe are fundamental to the des ..."
Cited by 471 (41 self)
Data Mining: Concepts and Techniques
, 2000
"... Our capabilities of both generating and collecting data have been increasing rapidly in the last several decades. Contributing factors include the widespread use of bar codes for most commercial products, the computerization of many business, scientific and government transactions and managements, a ..."
Cited by 3142 (23 self)
Informed Prefetching and Caching
- In Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles
, 1995
"... The underutilization of disk parallelism and file cache buffers by traditional file systems induces I/O stall time that degrades the performance of modern microprocessor-based systems. In this paper, we present aggressive mechanisms that tailor file system resource management to the needs of I/O-int ..."
Cited by 402 (10 self)
Scientific workflow management and the Kepler system
- CONCURR. COMPUT.: PRACT. EXP
, 2006
"... Many scientific disciplines are now data and information driven, and new scientific knowledge is often gained by scientists putting together data analysis and knowledge discovery “pipelines”. A related trend is that more and more scientific communities realize the benefits of sharing their data and ..."
Cited by 280 (19 self)
A survey of data provenance in e-science
- SIGMOD Record
, 2005
"... Data management is growing in complexity as large-scale applications take advantage of the loosely coupled resources brought together by grid middleware and by abundant storage capacity. Metadata describing the data products used in and generated by these applications is essential to disambiguate the ..."
Cited by 296 (21 self)
Ceph: A scalable, high-performance distributed file system
- In OSDI
, 2006
"... We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scalability. Ceph maximizes the separation between data and metadata management by replacing allocation tables with a pseudo-random data distribution function (CRUSH) designed for hetero ..."
Cited by 275 (32 self)
Does code decay? Assessing the evidence from change management data
- IN IEEE TRANSACTIONS ON SOFTWARE ENGINEERING
, 2001
"... A central feature of the evolution of large software systems is that change (which is necessary to add new functionality, accommodate new hardware, and repair faults) becomes increasingly difficult over time. In this paper, we approach this phenomenon, which we term code decay, scientifically and sta ..."
Cited by 218 (18 self)
A taxonomy of workflow management systems for grid computing
, 2005
"... With the advent of Grid and application technologies, scientists and engineers are building more and more complex applications to manage and process large data sets, and execute scientific experiments on distributed resources. Such application scenarios require means for composing and executing comp ..."
Cited by 229 (11 self)
Data Management and Transfer in High-Performance Computational Grid Environments
- Parallel Computing Journal
, 2001
"... An emerging class of data-intensive applications involve the geographically dispersed extraction of complex scientific information from very large collections of measured or computed data. Such applications arise, for example, in experimental physics, where the data in question is generated by accel ..."
Cited by 206 (13 self)
Amoeba: a distributed operating system for the 1990s
- IEEE Computer
, 1990
"... Amoeba is the distributed system developed at the Free University (VU) and Centre for Mathematics and Computer Science (CWI), both in Amsterdam. Throughout the project’s ten-year history, a major concern of the designers has been to combine the research themes of distributed systems, such as high av ..."
Cited by 204 (11 self)
performance. We are working hard to achieve this goal: Amoeba is already one of the fastest distributed systems (on its class of hardware) reported so far in the scientific literature. The Amoeba software is based on objects. An object is a piece of data on which well-defined operations may be performed