Results 1 - 2 of 2
Data Like This: Ranked Search of Genomic Data Vision Paper
- in Proceedings of the Second International Workshop on Databases and the Web, 2015
Abstract - Cited by 1 (0 self)
High-throughput genetic sequencing produces the ultimate “big data”: a human genome sequence contains more than 3B base pairs, and more and more characteristics, or annotations, are being recorded at the base-pair level. Locating areas of interest within the genome is a challenge for researchers, limiting their investigations. We describe our vision of adapting “big data” ranked search to the problem of searching the genome. Our goal is to make searching for data as easy for scientists as searching the Internet.
Challenges for Dataset Search
Abstract
Ranked search of datasets has emerged as a need as shared scientific archives grow in size and variety. Our own investigations have shown that IR-style, feature-based relevance scoring can be an effective tool for data discovery in scientific archives. However, maintaining interactive response times as archives scale will be a challenge. We report here on our exploration of performance techniques for Data Near Here, a dataset search service. We present a sample of results evaluating filter-restart techniques in our system, including two variations, adaptive relaxation and contraction. We then outline further directions for research in this domain.