• Documents
  • Authors
  • Tables
  • Log in
  • Sign up
  • MetaCart
  • DMCA
  • Donate

CiteSeerX logo

Advanced Search Include Citations

Tools

Sorted by:
Try your query at:
Semantic Scholar Scholar Academic
Google Bing DBLP
Results 1 - 10 of 139,102
Next 10 →

Distributed Database Systems

by M. Tamer Özsu
"... this article, we discuss the fundamentals of distributed DBMS technology. We address the data distribution and architectural design issues as well as the algorithms that need to be implemented to provide the basic DBMS functions such as query processing, concurrency control, reliability, and replica ..."
Abstract - Cited by 588 (26 self) - Add to MetaCart
this article, we discuss the fundamentals of distributed DBMS technology. We address the data distribution and architectural design issues as well as the algorithms that need to be implemented to provide the basic DBMS functions such as query processing, concurrency control, reliability

Power-law distributions in empirical data

by Aaron Clauset, Cosma Rohilla Shalizi, M. E. J. Newman - ISSN 00361445. doi: 10.1137/ 070710111. URL http://dx.doi.org/10.1137/070710111 , 2009
"... Power-law distributions occur in many situations of scientific interest and have significant consequences for our understanding of natural and man-made phenomena. Unfortunately, the empirical detection and characterization of power laws is made difficult by the large fluctuations that occur in the t ..."
Abstract - Cited by 607 (7 self) - Add to MetaCart
estimates for power-law data, based on maximum likelihood methods and the Kolmogorov-Smirnov statistic. We also show how to tell whether the data follow a power-law distribution at all, defining quantitative measures that indicate when the power law is a reasonable fit to the data and when it is not. We

The calculation of posterior distributions by data augmentation.

by Martin A Tanner , ; Wing , Hung Wong - Journal of the American Statistical Association , 1987
"... ..."
Abstract - Cited by 926 (10 self) - Add to MetaCart
Abstract not found

Bigtable: A distributed storage system for structured data

by Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert E. Gruber - IN PROCEEDINGS OF THE 7TH CONFERENCE ON USENIX SYMPOSIUM ON OPERATING SYSTEMS DESIGN AND IMPLEMENTATION - VOLUME 7 , 2006
"... Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications ..."
Abstract - Cited by 1028 (4 self) - Add to MetaCart
Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google Finance. These applications

Automatic Subspace Clustering of High Dimensional Data

by Rakesh Agrawal, Johannes Gehrke, Dimitrios Gunopulos, Prabhakar Raghavan - Data Mining and Knowledge Discovery , 2005
"... Data mining applications place special requirements on clustering algorithms including: the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the or ..."
Abstract - Cited by 724 (12 self) - Add to MetaCart
Data mining applications place special requirements on clustering algorithms including: the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity

Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks

by Michael Isard, Mihai Budiu, Yuan Yu, Andrew Birrell, Dennis Fetterly - In EuroSys , 2007
"... Dryad is a general-purpose distributed execution engine for coarse-grain data-parallel applications. A Dryad applica-tion combines computational “vertices ” with communica-tion “channels ” to form a dataflow graph. Dryad runs the application by executing the vertices of this graph on a set of availa ..."
Abstract - Cited by 762 (27 self) - Add to MetaCart
Dryad is a general-purpose distributed execution engine for coarse-grain data-parallel applications. A Dryad applica-tion combines computational “vertices ” with communica-tion “channels ” to form a dataflow graph. Dryad runs the application by executing the vertices of this graph on a set

Distributional Clustering Of English Words

by Fernando Pereira, Naftali Tishby, Lillian Lee - In Proceedings of the 31st Annual Meeting of the Association for Computational Linguistics , 1993
"... We describe and evaluate experimentally a method for clustering words according to their dis- tribution in particular syntactic contexts. Words are represented by the relative frequency distributions of contexts in which they appear, and relative entropy between those distributions is used as the si ..."
Abstract - Cited by 629 (27 self) - Add to MetaCart
We describe and evaluate experimentally a method for clustering words according to their dis- tribution in particular syntactic contexts. Words are represented by the relative frequency distributions of contexts in which they appear, and relative entropy between those distributions is used

Privacy-Preserving Data Mining

by Rakesh Agrawal , Ramakrishnan Srikant , 2000
"... A fruitful direction for future data mining research will be the development of techniques that incorporate privacy concerns. Specifically, we address the following question. Since the primary task in data mining is the development of models about aggregated data, can we develop accurate models with ..."
Abstract - Cited by 844 (3 self) - Add to MetaCart
and the distribution of data values is also very different from the original distribution. While it is not possible to accurately estimate original values in individual data records, we propose a-novel reconstruction procedure to accurately estimate the distribution of original data values. By using

A Digital Fountain Approach to Reliable Distribution of Bulk Data

by Michael Luby, et al. , 1998
"... The proliferation of applications that must reliably distribute bulk data to a large number of autonomous clients motivates the design of new multicast and broadcast protocols. We describe an ideal, fully scalable protocol for these applications that we call a digital fountain. A digital fountain a ..."
Abstract - Cited by 492 (19 self) - Add to MetaCart
The proliferation of applications that must reliably distribute bulk data to a large number of autonomous clients motivates the design of new multicast and broadcast protocols. We describe an ideal, fully scalable protocol for these applications that we call a digital fountain. A digital fountain

Bayesian Data Analysis

by Andrew Gelman, Christian Robert, Nicolas Chopin, Judith Rousseau , 1995
"... I actually own a copy of Harold Jeffreys’s Theory of Probability but have only read small bits of it, most recently over a decade ago to confirm that, indeed, Jeffreys was not too proud to use a classical chi-squared p-value when he wanted to check the misfit of a model to data (Gelman, Meng and Ste ..."
Abstract - Cited by 2194 (63 self) - Add to MetaCart
I actually own a copy of Harold Jeffreys’s Theory of Probability but have only read small bits of it, most recently over a decade ago to confirm that, indeed, Jeffreys was not too proud to use a classical chi-squared p-value when he wanted to check the misfit of a model to data (Gelman, Meng
Next 10 →
Results 1 - 10 of 139,102
Powered by: Apache Solr
  • About CiteSeerX
  • Submit and Index Documents
  • Privacy Policy
  • Help
  • Data
  • Source
  • Contact Us

Developed at and hosted by The College of Information Sciences and Technology

© 2007-2019 The Pennsylvania State University