Results 1–10 of 81
Nonparametric Permutation Tests for Functional Neuroimaging: A Primer with Examples
Human Brain Mapping, 2001
Cited by 145 (6 self)
The statistical analyses of functional mapping experiments usually proceed at the voxel level, involving the formation and assessment of a statistic image: at each voxel a statistic indicating evidence of the experimental effect of interest, at that voxel, is computed, giving an image of statistics ...
Learning relational probability trees
In Proceedings of the ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD), 2003
Cited by 117 (33 self)
Classification trees are widely used in the machine learning and data mining communities for modeling propositional data. Recent work has extended this basic paradigm to probability estimation trees. Traditional tree learning algorithms assume that instances in the training data are homogeneous and independently distributed. Relational probability trees (RPTs) extend standard probability estimation trees to a relational setting in which data instances are heterogeneous and interdependent. Our algorithm for learning the structure and parameters of an RPT searches over a space of relational features that use aggregation functions (e.g., AVERAGE, MODE, COUNT) to dynamically propositionalize relational data and create binary splits within the RPT. Previous work has identified a number of statistical biases due to characteristics of relational data such as autocorrelation and degree disparity. The RPT algorithm uses a novel form of randomization test to adjust for these biases. On a variety of relational learning tasks, RPTs built using randomization tests are significantly smaller than other models and achieve equivalent, or better, performance.
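The abstract does not spell out the test itself, but the general shape of a randomization test for a feature score can be sketched in a few lines of numpy. The |Pearson correlation| score and the synthetic data below are illustrative assumptions, not taken from the paper, which applies the idea to aggregated relational features.

```python
import numpy as np

rng = np.random.default_rng(0)

def randomization_test(feature, target, n_perm=2000):
    """Score a feature against the target, then compare the observed
    score with scores obtained after randomly permuting the target.
    Score = |Pearson correlation|, chosen purely for illustration."""
    score = lambda f, t: abs(np.corrcoef(f, t)[0, 1])
    observed = score(feature, target)
    null = np.array([score(feature, rng.permutation(target))
                     for _ in range(n_perm)])
    # p-value: fraction of permuted scores at least as large,
    # with the conventional +1 correction
    return (1 + np.sum(null >= observed)) / (1 + n_perm)

x = np.linspace(0.0, 1.0, 100)
y = (x + rng.normal(0.0, 0.2, 100) > 0.5).astype(float)
p_real = randomization_test(x, y)                      # genuine association
p_noise = randomization_test(rng.normal(size=100), y)  # pure noise
```

A feature with a real association yields a tiny p-value, while a noise feature's p-value is roughly uniform; selecting splits only when p is small is what keeps the resulting trees compact.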
The role of Occam’s Razor in knowledge discovery
Data Mining and Knowledge Discovery, 1999
Cited by 78 (3 self)
Many KDD systems incorporate an implicit or explicit preference for simpler models, but this use of “Occam’s razor” has been strongly criticized by several authors (e.g., Schaffer, 1993; Webb, 1996). This controversy arises partly because Occam’s razor has been interpreted in two quite different ways. The first interpretation (simplicity is a goal in itself) is essentially correct, but is at heart a preference for more comprehensible models. The second interpretation (simplicity leads to greater accuracy) is much more problematic. A critical review of the theoretical arguments for and against it shows that it is unfounded as a universal principle, and demonstrably false. A review of empirical evidence shows that it also fails as a practical heuristic. This article argues that its continued use in KDD risks causing significant opportunities to be missed, and should therefore be restricted to the comparatively few applications where it is appropriate. The article proposes and reviews the use of domain constraints as an alternative for avoiding overfitting, and examines possible methods for handling the accuracy–comprehensibility tradeoff.
Spatial Pattern Analysis of Functional Brain Images Using Partial Least Squares
Neuroimage, 1996
Cited by 76 (11 self)
This paper introduces a new tool for functional neuroimage analysis: partial least squares (PLS). It is unique as a multivariate method in its choice of emphasis for analysis, that being the covariance between brain images and exogenous blocks representing either the experiment design or some behavioral measure. What emerges are spatial patterns of brain activity that represent the optimal association between the images and either of the blocks. This process differs substantially from other multivariate methods in that rather than attempting to predict the individual values of the image pixels, PLS attempts to explain the relation between image pixels and task or behavior. Data from a face encoding and recognition PET rCBF study are used to illustrate two types of PLS analysis: an activation analysis of task with images and a brain-behavior analysis. The commonalities across the two analyses are suggestive of a general face memory network differentially engaged during encoding and recognition. PLS thus serves as an important extension by extracting new information from imaging data that is not accessible through other currently used univariate and multivariate image analysis tools. © 1996 Academic Press, Inc.
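As a rough illustration of the decomposition the abstract describes, one common formulation of PLS takes the SVD of the cross-covariance between the centered image block and the centered design block. The toy data sizes and variable names below are assumptions for illustration, not values from the study.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(20, 500))  # image block: 20 scans x 500 "voxels" (synthetic)
Y = rng.normal(size=(20, 2))    # design/behavior block (synthetic)

# Center both blocks, then decompose their cross-covariance.
Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)
U, s, Vt = np.linalg.svd(Yc.T @ Xc, full_matrices=False)

saliences = Vt      # rows: spatial patterns, one per latent variable
scores = Xc @ Vt.T  # projection of each scan onto those patterns
# s[k]**2 / (s**2).sum() gives the fraction of cross-block
# covariance carried by latent variable k.
```

Note the contrast with regression-style methods: nothing here predicts individual voxel values; the singular vectors only characterize how the image block covaries with the design block.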
Global, voxel, and cluster tests, by theory and permutation, for a difference between two groups of structural MR images of the brain
IEEE Transactions on Medical Imaging, 1999
Cited by 58 (7 self)
We describe almost entirely automated procedures for estimation of global, voxel, and cluster-level statistics to test the null hypothesis of zero neuroanatomical difference between two groups of structural magnetic resonance imaging (MRI) data. Theoretical distributions under the null hypothesis are available for 1) global tissue class volumes; 2) standardized linear model [analysis of variance (ANOVA and ANCOVA)] coefficients estimated at each voxel; and 3) the area of spatially connected clusters generated by applying an arbitrary threshold to a two-dimensional (2D) map of normal statistics at voxel level. We describe novel methods for economically ascertaining probability distributions under the null hypothesis, with fewer assumptions, by permutation of the observed data. Nominal Type I error control by permutation testing is generally excellent, whereas theoretical distributions may be overconservative. Permutation has the additional advantage that it can be used to test any statistic of interest, such as the sum of suprathreshold voxel statistics in a cluster (or cluster mass), regardless of its theoretical tractability under the null hypothesis. These issues are illustrated by application to MRI data acquired from 18 adolescents with hyperkinetic disorder and 16 control subjects matched for age and gender. Index Terms—Brain, imaging/mapping, probability distributions, statistics.
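A minimal one-dimensional analogue of the permutation procedure the abstract describes is to relabel subjects between the two groups and recompute the statistic. The group sizes echo the 18/16 design mentioned above, but the data and the plain difference-of-means statistic are synthetic illustrations, not the paper's voxel or cluster statistics.

```python
import numpy as np

rng = np.random.default_rng(1)

def permutation_p(a, b, n_perm=2000):
    """Two-sided permutation p-value for a difference of means:
    pool the observations, randomly re-split them into groups of
    the original sizes, and compare against the observed difference."""
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    null = np.empty(n_perm)
    for i in range(n_perm):
        perm = rng.permutation(pooled)
        null[i] = perm[:a.size].mean() - perm[a.size:].mean()
    return (1 + np.sum(np.abs(null) >= abs(observed))) / (1 + n_perm)

cases = rng.normal(1.5, 1.0, 18)     # 18 "cases" (synthetic)
controls = rng.normal(0.0, 1.0, 16)  # 16 "controls" (synthetic)
p = permutation_p(cases, controls)
```

The same loop works unchanged for any statistic computed from the relabeled groups (e.g. a cluster-mass sum), which is the flexibility the abstract highlights.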
A Relational View of Information Seeking and Learning in Social Networks
2003
Cited by 57 (2 self)
Research in organizational learning has demonstrated processes and occasionally performance implications of acquisition of declarative (know-what) and procedural (know-how) knowledge. However, considerably less attention has been paid to learned characteristics of relationships that affect the decision to seek information from other people. Based on a review of the social network, information processing, and organizational learning literatures, along with the results of a previous qualitative study, we propose a formal model of information seeking in which the probability of seeking information from another person is a function of (1) knowing what that person knows; (2) valuing what that person knows; (3) being able to gain timely access to that person’s thinking; and (4) perceiving that seeking information from that person would not be too costly. We also hypothesize that the knowing, access, and cost variables mediate the relationship between physical proximity and information seeking. The model is tested using two separate research sites to provide replication. The results indicate strong support for the model and the mediation hypothesis (with the exception of the cost variable). Implications are drawn for the study of both transactive memory and organizational learning, as well as for management practice.
A Comparison of Statistical Significance Tests for Information Retrieval Evaluation
2007
Cited by 52 (6 self)
Information retrieval (IR) researchers commonly use three tests of statistical significance: the Student’s paired t-test, the Wilcoxon signed rank test, and the sign test. Other researchers have previously proposed using both the bootstrap and Fisher’s randomization (permutation) test as nonparametric significance tests for IR, but these tests have seen little use. For each of these five tests, we took the ad hoc retrieval runs submitted to TRECs 3 and 5–8, and for each pair of runs, we measured the statistical significance of the difference in their mean average precision. We discovered that there is little practical difference between the randomization, bootstrap, and t-tests. Both the Wilcoxon and sign test have a poor ability to detect significance and have the potential to lead to false detections of significance. The Wilcoxon and sign tests are simplified variants of the randomization test and their use should be discontinued for measuring the significance of a difference between means.
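The tests the abstract names can all be run on a pair of per-topic score vectors. The sketch below uses synthetic scores (not TREC data) and scipy's standard implementations for the t-test and Wilcoxon test, writing the sign and randomization tests out explicitly; the bootstrap is omitted for brevity.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Hypothetical per-topic average precision for two runs over 50 topics;
# run_b is constructed to be slightly better on average.
run_a = rng.beta(2.0, 5.0, 50)
run_b = np.clip(run_a + rng.normal(0.05, 0.05, 50), 0.0, 1.0)
diff = run_b - run_a

# Student's paired t-test
p_t = stats.ttest_rel(run_b, run_a).pvalue
# Wilcoxon signed rank test
p_w = stats.wilcoxon(run_b, run_a).pvalue
# Sign test: binomial test on how often run_b beats run_a
p_s = stats.binomtest(int((diff > 0).sum()), n=int((diff != 0).sum())).pvalue
# Fisher's randomization test: under the null, each per-topic
# difference is equally likely to carry either sign.
signs = rng.choice([-1.0, 1.0], size=(10000, diff.size))
null = (signs * diff).mean(axis=1)
p_r = (1 + np.sum(np.abs(null) >= abs(diff.mean()))) / (1 + 10000)
```

The sign test discards magnitudes entirely and the Wilcoxon test keeps only ranks, which is exactly the information loss behind the abstract's recommendation to prefer the randomization, bootstrap, or t-test.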
Large Datasets Lead to Overly Complex Models: An Explanation and a Solution
1998
Cited by 44 (4 self)
This paper explores unexpected results that lie at the intersection of two common themes in the KDD community: large datasets and the goal of building compact models. Experiments with many different datasets and several model construction algorithms (including tree learning algorithms such as C4.5 with three different pruning methods, and rule learning algorithms such as C4.5rules and RIPPER) show that increasing the amount of data used to build a model often results in a linear increase in model size, even when that additional complexity results in no significant increase in model accuracy. Despite the promise of better parameter estimation held out by large datasets, as a practical matter, models built with large amounts of data are often needlessly complex and cumbersome. In the case of decision trees, the cause of this pathology is identified as a bias inherent in several common pruning techniques. Pruning errors made low in the tree, where there is insufficient data to make accurate parameter estimates, are propagated and magnified higher in the tree, working against the accurate parameter estimates that are made possible there by abundant data. We propose a general solution to this problem based on a statistical technique known as randomization testing, and empirically evaluate its utility.
Framework for the statistical shape analysis of brain structures using SPHARM-PDM
In Insight Journal, Special Edition on the Open Science Workshop at MICCAI, 2006
Cited by 32 (4 self)
Shape analysis has become of increasing interest to the neuroimaging community due to its potential to precisely locate morphological changes between healthy and pathological structures. This manuscript presents a comprehensive set of tools for the computation of 3D structural statistical shape analysis. It has been applied in several studies on brain morphometry, but can potentially be employed in other 3D shape problems. Its main limitation is the necessity of spherical topology. The input of the proposed shape analysis is a set of binary segmentations of a single brain structure, such as the hippocampus or caudate. These segmentations are converted into a corresponding spherical harmonic description (SPHARM), which is then sampled into triangulated surfaces (SPHARM-PDM). After alignment, differences between groups of surfaces are computed using the Hotelling T^2 two-sample metric. Statistical p-values, both raw and corrected for multiple comparisons, result in significance maps. Additional visualization of the group tests is provided via mean difference magnitude and vector maps, as well as maps of the group covariance information. The correction for multiple comparisons is performed via two separate methods that each have a distinct view of the problem. The first aims to control the family-wise error rate (FWER), or false positives, via the extrema histogram of nonparametric permutations. The second method controls the false discovery rate and results in a less conservative estimate of the false negatives.
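The Hotelling T^2 two-sample metric mentioned above has a standard closed form. The sketch below applies it to synthetic 3-D displacement vectors at a single correspondence point; it is an assumption-laden illustration of the metric, not code from the SPHARM-PDM toolkit, whose implementation may differ in detail.

```python
import numpy as np

def hotelling_t2(x, y):
    """Two-sample Hotelling T^2 for (n, 3) and (m, 3) arrays of
    3-D vectors: squared Mahalanobis distance between the group
    means under the pooled covariance, scaled by the sample sizes."""
    n, m = len(x), len(y)
    d = x.mean(axis=0) - y.mean(axis=0)
    s_pooled = ((n - 1) * np.cov(x, rowvar=False) +
                (m - 1) * np.cov(y, rowvar=False)) / (n + m - 2)
    return (n * m) / (n + m) * float(d @ np.linalg.solve(s_pooled, d))

rng = np.random.default_rng(5)
group_a = rng.normal(size=(25, 3))
group_b = rng.normal(size=(25, 3))             # same distribution as a
group_c = group_b + np.array([2.0, 0.0, 0.0])  # shifted along one axis
t2_same = hotelling_t2(group_a, group_b)
t2_diff = hotelling_t2(group_a, group_c)
```

Computing this statistic at every surface point and permuting group labels to build the null distribution of its extrema is one way to obtain the FWER-corrected significance maps the abstract describes.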
Avoiding bias when aggregating relational data with degree disparity
In Proceedings of the 20th International Conference on Machine Learning, 2003
Cited by 25 (15 self)
A common characteristic of relational data sets—degree disparity—can lead relational learning algorithms to discover misleading correlations. Degree disparity occurs when the frequency of a relation is correlated with the values of the target variable. In such cases, aggregation functions used by many relational learning algorithms will result in misleading correlations and added complexity in models. We examine this problem through a combination of simulations and experiments. We show how two novel hypothesis testing procedures can adjust for the effects of using aggregation functions in the presence of degree disparity.