Results 1 -
3 of
3
Lineage retrieval for scientific data processing: a survey
- ACM Computing Surveys
, 2005
"... Scientific research relies as much on the dissemination and exchange of data sets as on the publication of conclusions. Accurately tracking the lineage (origin and subsequent processing history) of scientific data sets is thus imperative for the complete documentation of scientific work. Researchers ..."
Abstract
-
Cited by 91 (1 self)
- Add to MetaCart
Scientific research relies as much on the dissemination and exchange of data sets as on the publication of conclusions. Accurately tracking the lineage (origin and subsequent processing history) of scientific data sets is thus imperative for the complete documentation of scientific work. Researchers are effectively prevented from
Provenance-Aware Sensor Data Storage
- In Proc. Workshop on Networking Meets Databases (NetDB
, 2005
"... Sensor network data has both historical and realtime value. Making historical sensor data useful, in particular, requires storage, naming, and indexing. Sensor data presents new challenges in these areas. Such data is location-specific but also distributed; it is collected in a particular physical l ..."
Abstract
-
Cited by 6 (0 self)
- Add to MetaCart
Sensor network data has both historical and realtime value. Making historical sensor data useful, in particular, requires storage, naming, and indexing. Sensor data presents new challenges in these areas. Such data is location-specific but also distributed; it is collected in a particular physical location and may be most useful there, but it has additional value when combined with other sensor data collections in a larger distributed system. Thus, arranging location-sensitive peer-to-peer storage is one challenge. Sensor data sets do not have obvious names, so naming them in a globally useful fashion is another challenge. The last challenge arises from the need to index these sensor data sets to make them searchable. The key to sensor data identity is provenance, the full history or lineage of the data. We show how provenance addresses the naming and indexing issues and then present a research agenda for constructing distributed, indexed repositories of sensor data.
Enhanced Product Generation at NASA Data Centers Through Grid Technology
"... This paper describes how grid technology can support the ability of NASA data centers to provide customized data products. A combination of grid technology and commodity processors are proposed to provide the bandwidth necessary to perform customized processing of data, with customized data subsetti ..."
Abstract
- Add to MetaCart
This paper describes how grid technology can support the ability of NASA data centers to provide customized data products. A combination of grid technology and commodity processors are proposed to provide the bandwidth necessary to perform customized processing of data, with customized data subsetting providing the initial example. This customized subsetting engine can be used to support a new type of subsetting, called phenomena-based subsetting, where data is subsetted based on its association with some phenomena, such as mesoscale convective systems or hurricanes. This concept is expanded to allow the phenomena to be detected in one type of data, with the subsetting requirements transmitted to the subsetting engine to subset a different type of data. The subsetting requirements are generated by a data mining system and transmitted to the subsetter in the form of an XML feature index that describes the spatial and temporal extent of the phenomena. For this work, a grid-based mining system called the Grid Miner is used to identify the phenomena and generate the feature index. This paper discusses the value of grid technology in facilitating the development of a high performance customized product processing and the coupling of a grid mining system to support phenomena-based subsetting.

