Interval Estimation for a Binomial Proportion
- Statistical Science
, 2001
Cited by 48 (2 self)
We revisit the problem of interval estimation of a binomial proportion. The erratic behavior of the coverage probability of the standard Wald confidence interval has previously been remarked on in the literature (Blyth & Still (1983), Agresti & Coull (1998), Santner (1998), and others). We begin by showing that the chaotic coverage properties of the Wald interval are far more persistent than is appreciated. Furthermore, common textbook prescriptions regarding its safety are misleading and defective in several respects and cannot be trusted. This leads us to consideration of alternative intervals. A number of natural alternatives are presented, each with its motivation and context. Each interval is examined as regards its coverage probability and its length. Based on this analysis, we recommend the Wilson interval (Wilson (1927)) or the equal-tailed Jeffreys prior interval for small n, and the interval suggested in Agresti and Coull (1998) for larger n. We also provide an addi...
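As a rough illustration of the comparison this abstract describes, the sketch below computes the Wald, Wilson, and Agresti-Coull intervals from their standard textbook formulas and evaluates exact coverage probability by summing binomial probabilities. The function names, the 95% default level, and the coverage helper are assumptions made for illustration; this is not code from the paper, and the Jeffreys prior interval is omitted.

```python
import math
from statistics import NormalDist


def z_value(conf=0.95):
    """Two-sided standard normal quantile for the given confidence level."""
    return NormalDist().inv_cdf(1 - (1 - conf) / 2)


def wald_interval(x, n, conf=0.95):
    """Standard Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = z_value(conf)
    p_hat = x / n
    half = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half, p_hat + half


def wilson_interval(x, n, conf=0.95):
    """Wilson (score) interval."""
    z = z_value(conf)
    p_hat = x / n
    denom = 1 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def agresti_coull_interval(x, n, conf=0.95):
    """Agresti-Coull interval: Wald form applied to adjusted counts."""
    z = z_value(conf)
    n_adj = n + z * z
    p_adj = (x + z * z / 2) / n_adj
    half = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return p_adj - half, p_adj + half


def coverage(interval, n, p, conf=0.95):
    """Exact coverage: total binomial probability of the x whose interval contains p."""
    total = 0.0
    for x in range(n + 1):
        lo, hi = interval(x, n, conf)
        if lo <= p <= hi:
            total += math.comb(n, x) * p**x * (1 - p) ** (n - x)
    return total


if __name__ == "__main__":
    # Hypothetical setting chosen for illustration: n = 10, p = 0.05.
    for name, fn in [("Wald", wald_interval),
                     ("Wilson", wilson_interval),
                     ("Agresti-Coull", agresti_coull_interval)]:
        print(name, round(coverage(fn, 10, 0.05), 4))
```

Running the script prints the exact coverage of each nominal 95% interval at one small-sample setting, which is one way to see how far the Wald interval can fall below its nominal level.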
Profit Mining: From Patterns to Actions
- In EDBT
, 2002
Cited by 21 (3 self)
A major obstacle in data mining applications is the gap between statistic-based pattern extraction and value-based decision making. We present a profit mining approach to reduce this gap. In profit mining, we are given a set of past transactions and pre-selected target items, and we would like to build a model for recommending target items and promotion strategies to new customers, with the goal of maximizing the net profit. We identify several issues in profit mining and propose solutions.
A Comparison of Approximate Interval Estimators for the Bernoulli Parameter
- The American Statistician
, 1996
Cited by 13 (3 self)
... this article. Ghosh (1979) compared two confidence intervals for the Bernoulli parameter based on the normal approximation to the binomial distribution.
2. CONFIDENCE INTERVAL ESTIMATORS FOR p
Two-sided confidence interval estimators for p can be determined with the aid of numerical methods. One-sided confidence interval estimators are analogous. Let ...
Confidence Curves and Improved Exact Confidence Intervals for Discrete Distributions
, 2000
Cited by 9 (0 self)
The author describes a method for improving standard "exact" confidence intervals in discrete distributions with respect to size while retaining correct level. The binomial, negative binomial, hypergeometric and Poisson distributions are considered explicitly. Contrary to other existing methods, the author's solution possesses a natural nesting condition: if $\alpha < \alpha'$, the $1 - \alpha'$ confidence interval is included in the $1 - \alpha$ interval. Nonparametric confidence intervals for a quantile are also considered.
Exponential Language Models, Logistic Regression, and Semantic Coherence
- In Proceedings of the NIST/DARPA Speech Transcription Workshop
, 2000
Cited by 7 (4 self)
In this paper, we modify the traditional trigram model by using utterance-level semantic coherence features in an exponential model. The semantic coherence features are collected by measuring the correlations among content-word pairs occurring in sentences of two corpora, the real corpus and a corpus generated by the baseline trigram model. The measure we use for estimating the semantic association of content word pairs is Yule's Q statistic. For our preliminary analysis, we have further simplified the modeling task by extracting a small set of statistics from each sentence-based Q statistics and applying them as features to the exponential model. We also simplified the process of obtaining the MLE solutions of the exponential models by approximating it with a logistic regression model. We account for the uncertainty in the estimates of Q by constructing confidence intervals. The new model results in a slight reduction in test-set perplexity. We also discuss and compare alternative mea...
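For readers unfamiliar with the association measure named in this abstract, here is a minimal sketch of Yule's Q for a 2x2 contingency table, together with a large-sample confidence interval. The cell layout, the function names, and the delta-method standard error are assumptions for illustration; the abstract does not say how the authors constructed their intervals, so this should not be read as their procedure.

```python
import math
from statistics import NormalDist


def yules_q(a, b, c, d):
    """Yule's Q for the 2x2 table [[a, b], [c, d]]: (ad - bc) / (ad + bc)."""
    num = a * d - b * c
    den = a * d + b * c
    if den == 0:
        raise ValueError("Q is undefined when ad + bc == 0")
    return num / den


def yules_q_interval(a, b, c, d, conf=0.95):
    """Approximate interval for Q using the usual large-sample standard error.

    Assumes all four cell counts are positive; this is a generic
    approximation, not necessarily the construction used in the paper.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    q = yules_q(a, b, c, d)
    se = 0.5 * (1 - q * q) * math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return q - z * se, q + z * se


if __name__ == "__main__":
    # Hypothetical counts: a = both words present, b and c = only one
    # of the two words present, d = neither word present.
    print(yules_q(20, 10, 10, 20))
    print(yules_q_interval(20, 10, 10, 20))
```

Q ranges from -1 to 1, with 0 indicating no association between the two words; the interval gives a sense of how uncertain the estimate is when counts are small.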
Mining changes of classification by correspondence tracing
- In Proceedings of the 2003 SIAM International Conference on Data Mining (SDM 2003)
, 2003
Cited by 4 (1 self)
We study the problem of mining changes of classification characteristics as the data changes. Available are an old classifier, representing previous knowledge about classification characteristics, and new data. We want to find the changes of classification characteristics in the new data. An example of such changes is “members with a large family no longer shop frequently, but they used to”. Finding these kinds of changes holds the key for the organization to adapt to the changed environment and stay ahead of competitors. The challenge is that it is difficult to see what has really changed by comparing the old and new classifiers, which could be very large and different. In this paper, we propose a technique to identify such changes. The idea is to trace the characteristics, in the old and new classifiers, that correspond to each other by classifying the same examples. We describe several ways to present changes so that the user can focus on a small number of important ones. We evaluate the proposed method on real-life data sets.
On small-sample confidence intervals for parameters in discrete distributions
- Biometrics
, 2001
Cited by 4 (1 self)
Elements of a Computational Model for Multi-Party Discourse: The Turn-Taking Behavior of Supreme Court Justices
, 2008
Cited by 4 (0 self)
This paper explores computational models of multi-party discourse, using transcripts from U.S. Supreme Court oral arguments. The turn-taking behavior of participants is treated as a supervised sequence labeling problem and modeled using first- and second-order Conditional Random Fields. We specifically explore the hypothesis that discourse markers and personal references provide important features in such models. Results from a sequence prediction experiment demonstrate that incorporating these two types of features yields significant improvements in performance. This work is couched in the broader context of developing tools to support legal scholarship, although we see other NLP applications as well.
Improved confidence intervals for the difference between binomial proportions based on paired data
- Statistics in Medicine 17
, 1998
Cited by 2 (0 self)
Existing methods for setting confidence intervals for the difference $\theta$ between binomial proportions based on paired data perform inadequately. The asymptotic method can produce limits outside the range of validity. The ‘exact’ conditional method can yield an interval which is effectively only one-sided. Both these methods also have poor coverage properties. Better methods are described, based on the profile likelihood obtained by conditionally maximizing the proportion of discordant pairs. A refinement (methods 5 and 6) which aligns $1 - \alpha$ with an aggregate of tail areas produces appropriate coverage properties. A computationally simpler method based on the score interval for the single proportion also performs well (method 10). © 1998 John Wiley & Sons, Ltd.
Reducing conservatism of exact small-sample methods of inference for discrete data
- TH SYMPOSIUM OF THE IASC, ROME 28 AUGUST - 1
, 2006

