Results 1-10 of 189
Neural Networks and Statistical Models
, 1994
Abstract

Cited by 99 (1 self)
There has been much publicity about the ability of artificial neural networks to learn and generalize. In fact, the most commonly used artificial neural networks, called multilayer perceptrons, are nothing more than nonlinear regression and discriminant models that can be implemented with standard statistical software. This paper explains what neural networks are, translates neural network jargon into statistical jargon, and shows the relationships between neural networks and statistical models such as generalized linear models, maximum redundancy analysis, projection pursuit, and cluster analysis.
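The abstract's claim that a multilayer perceptron is a nonlinear regression model can be illustrated with a minimal sketch. The function names, the tanh activation, and the toy parameter values below are assumptions made for illustration, not taken from the paper: a one-hidden-layer network is written as an ordinary regression function, and "training" corresponds to minimizing the usual sum-of-squares criterion.

```python
import math


def mlp_predict(x, hidden, output_bias):
    """One-hidden-layer perceptron viewed as a nonlinear regression function.

    hidden is a list of (a, w, v) triples: hidden-unit bias, input weight,
    and output weight (hypothetical names). The prediction is
        output_bias + sum(v * tanh(a + w * x)).
    """
    return output_bias + sum(v * math.tanh(a + w * x) for a, w, v in hidden)


def sum_of_squares(xs, ys, hidden, output_bias):
    """Least-squares criterion, the quantity that network training minimizes."""
    return sum((y - mlp_predict(x, hidden, output_bias)) ** 2
               for x, y in zip(xs, ys))


# Toy check: data generated by the model itself is fitted with zero error.
hidden_units = [(0.0, 1.0, 2.0), (0.5, -1.0, 0.0)]
xs = [-1.0, 0.0, 1.0]
ys = [mlp_predict(x, hidden_units, 0.3) for x in xs]
print(sum_of_squares(xs, ys, hidden_units, 0.3))
```

In statistical terms, the weights are regression parameters and the hidden units are basis functions whose shapes are estimated from the data.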
Toward a method of selecting among computational models of cognition
 Psychological Review
, 2002
Abstract

Cited by 74 (4 self)
The question of how one should decide among competing explanations of data is at the heart of the scientific enterprise. Computational models of cognition are increasingly being advanced as explanations of behavior. The success of this line of inquiry depends on the development of robust methods to guide the evaluation and selection of these models. This article introduces a method of selecting among mathematical models of cognition known as minimum description length, which provides an intuitive and theoretically well-grounded understanding of why one model should be chosen. A central but elusive concept in model selection, complexity, can also be derived with the method. The adequacy of the method is demonstrated in 3 areas of cognitive modeling: psychophysics, information integration, and categorization. How should one choose among competing theoretical explanations of data? This question is at the heart of the scientific enterprise, regardless of whether verbal models are being tested in an experimental setting or computational models are being evaluated in simulations. A number of criteria have been proposed to assist in this endeavor, summarized nicely by Jacobs and Grainger
Adaptive Markov Chain Monte Carlo through Regeneration
, 1998
Abstract

Cited by 71 (4 self)
... this paper is organized as follows. In Section 2 we introduce the concept of regeneration and adaptation at regeneration, and provide theoretical support. In Section 3, the splitting techniques required for adaptation are reviewed. Section 4 contains four illustrations of adaptive MCMC. Some of the proofs from Sections 2 and 3 are placed in the Appendix. 2 Regeneration: A Framework for Adaptation
Predictive Model Selection
 Journal of the Royal Statistical Society, Ser. B
, 1995
Abstract

Cited by 61 (4 self)
... In this article we propose three criteria that can be used to address model selection. These emphasize observables rather than parameters and are based on a certain Bayesian predictive density. They have a unifying basis that is simple and interpretable, are free of asymptotic definitions, and allow the incorporation of prior information. Moreover, two of these criteria are readily calibrated.
Nonlinear regressions with integrated time series
 Econometrica
, 2001
Abstract

Cited by 59 (19 self)
An asymptotic theory is developed for nonlinear regression with integrated processes. The models allow for nonlinear effects from unit root time series and therefore deal with the case of parametric nonlinear cointegration. The theory covers integrable and asymptotically homogeneous functions. Sufficient conditions for weak consistency are given and a limit distribution theory is provided. The rates of convergence depend on the properties of the nonlinear regression function, and are shown to be as slow as $n^{1/4}$ for integrable functions, and to be generally polynomial in $n^{1/2}$ for homogeneous functions. For regressions with integrable functions, the limiting distribution theory is mixed normal with mixing variates that depend on the sojourn time of the limiting Brownian motion of the integrated process.
Approximations to the Loglikelihood Function in the Nonlinear Mixed Effects Model
 Journal of Computational and Graphical Statistics
, 1995
Abstract

Cited by 55 (4 self)
Introduction. Several different nonlinear mixed effects models and estimation methods for their parameters have been proposed in recent years (Sheiner and Beal, 1980; Mallet, Mentre, Steimer and Lokiek, 1988; Lindstrom and Bates, 1990; Vonesh and Carter, 1992; Davidian and Gallant, 1992; Wakefield, Smith, Racine-Poon and Gelfand, 1994). We consider here a slightly modified version of the model proposed in Lindstrom and Bates (1990). This model can be viewed as a hierarchical model that in some ways generalizes both the linear mixed effects model of Laird and Ware (1982) and the usual nonlinear model for independent data (Bates and Watts, 1988). In the first stage the jth observation on the ith cluster is modeled as $y_{ij} = f(\phi_{ij}, x_{ij}) + \epsilon_{ij}$, $i = 1, \ldots, M$, $j = 1, \ldots, n_i$
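The first-stage model in the abstract above, in which the jth observation on the ith cluster equals a nonlinear function of cluster-level parameters and a covariate plus an error term, can be sketched as a small simulation. Everything specific here is an assumption for illustration only: the logistic form of f, the parameter values, the normal errors, and the simplification that the parameter vector is constant within a cluster.

```python
import math
import random


def logistic_growth(phi, x):
    """Hypothetical choice of f: logistic curve, phi = (asymptote, midpoint, scale)."""
    asym, mid, scale = phi
    return asym / (1.0 + math.exp(-(x - mid) / scale))


def simulate_cluster(phi, xs, sigma, rng):
    """First stage: y_ij = f(phi_i, x_ij) + eps_ij with eps_ij ~ N(0, sigma^2)."""
    return [logistic_growth(phi, x) + rng.gauss(0.0, sigma) for x in xs]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
xs = [0.0, 1.0, 2.0, 3.0, 4.0]

# M = 3 clusters; only the asymptote varies between clusters (an assumption).
phis = [(asym, 2.0, 0.5) for asym in (9.0, 10.0, 11.0)]
data = [simulate_cluster(phi, xs, sigma=0.1, rng=rng) for phi in phis]

for i, ys in enumerate(data, start=1):
    print(i, [round(y, 2) for y in ys])
```

In a full mixed effects model, the cluster-to-cluster variation in the parameters would itself be given a distribution in a second stage; the sketch fixes those values by hand.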
The multiscale structure of nondifferentiable image manifolds
 in Proc. Wavelets XI at SPIE Optics and Photonics
, 2005
Abstract

Cited by 41 (20 self)
In this paper, we study families of images generated by varying a parameter that controls the appearance of the object/scene in each image. Each image is viewed as a point in high-dimensional space; the family of images forms a low-dimensional submanifold that we call an image appearance manifold (IAM). We conduct a detailed study of some representative IAMs generated by translations/rotations of simple objects in the plane and by rotations of objects in 3D space. Our central, somewhat surprising, finding is that IAMs generated by images with sharp edges are nowhere differentiable. Moreover, IAMs have an inherent multiscale structure in that approximate tangent planes fitted to ɛ-neighborhoods continually twist off into new dimensions as the scale parameter ɛ varies. We explore and explain this phenomenon. An additional, more exotic kind of local nondifferentiability happens at some exceptional parameter points where occlusions cause image edges to disappear. These nondifferentiabilities help to understand some key phenomena in image processing. They imply that Newton’s method will not work in general for image registration, but that a multiscale Newton’s method will work. Such a multiscale Newton’s method is similar to existing coarse-to-fine differential estimation algorithms for image registration; the manifold perspective offers a well-founded theoretical motivation for the multiscale approach and allows quantitative study of convergence and approximation. The manifold viewpoint is also generalizable to other image understanding problems.
An Investigation of Linguistic Features and Clustering Algorithms for Topical Document Clustering
 In Proceedings of the 23rd ACM SIGIR Conference on Research and Development in Information Retrieval
, 2000
Abstract

Cited by 38 (5 self)
We investigate four hierarchical clustering methods (single-link, complete-link, groupwise-average, and single-pass) and two linguistically motivated text features (noun phrase heads and proper names) in the context of document clustering. A statistical model for combining similarity information from multiple sources is described and applied to DARPA's Topic Detection and Tracking phase 2 (TDT2) data. This model, based on log-linear regression, alleviates the need for extensive search in order to determine optimal weights for combining input features. Through an extensive series of experiments with more than 40,000 documents from multiple news sources and modalities, we establish that both the choice of clustering algorithm and the introduction of the additional features have an impact on clustering performance. We apply our optimal combination of features to the TDT2 test data, obtaining partitions of the documents that compare favorably with the results obtained by participants in th...
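To make the difference between two of the clustering methods named above concrete, here is a minimal sketch of agglomerative clustering with single-link and complete-link merging. It is not the authors' system: the function name is hypothetical, and toy one-dimensional points with absolute difference as the distance stand in for documents and their similarity scores.

```python
from itertools import combinations


def agglomerate(points, num_clusters, linkage="single"):
    """Merge the two closest clusters until num_clusters remain.

    single-link:   cluster distance = smallest pairwise distance
    complete-link: cluster distance = largest pairwise distance
    """
    pick = min if linkage == "single" else max

    def cluster_distance(c1, c2):
        return pick(abs(a - b) for a in c1 for b in c2)

    clusters = [[p] for p in points]
    while len(clusters) > max(num_clusters, 1):
        # combinations yields index pairs with i < j, so deleting j is safe.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: cluster_distance(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i].extend(clusters[j])
        del clusters[j]
    return clusters


points = [0.0, 2.0, 4.5, 7.5, 8.0]
print(agglomerate(points, 2, "single"))    # [[0.0, 2.0, 4.5], [7.5, 8.0]]
print(agglomerate(points, 2, "complete"))  # [[0.0, 2.0], [4.5, 7.5, 8.0]]
```

The middle point joins different neighbors under the two rules, which is the kind of sensitivity to the choice of algorithm that the abstract reports for document clustering.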
Categorizing Web Queries According to Geographical Locality
, 2003
Abstract

Cited by 36 (0 self)
... according to their geographical locality. For example, a web page with general information about wildflowers could be considered a global page, likely to be of interest to a geographically broad audience. In contrast, a web page with listings on houses for sale in a specific city could be regarded as a local page, likely to be of interest only to an audience in a relatively narrow region. Similarly, some search engine queries (implicitly) target global pages, while other queries are after local pages. For example, the best results for query [wildflowers] are probably global pages about wildflowers such as the one discussed above. However, local pages that are relevant to, say, San Francisco are likely to be good matches for a query [houses for sale] that was issued by a San Francisco resident or by somebody moving to that city. Unfortunately, search engines do not analyze the geographical locality of queries and users, and hence often produce suboptimal results. Thus query [wildflowers] might return pages that discuss wildflowers in specific U.S. states (and not general information about wildflowers), while query [houses for sale] might return pages with real estate listings for locations other than that of interest to the person who issued the query. Deciding whether an unseen query should produce mostly local or global pages, without placing this burden on the search engine users, is an important and challenging problem, because queries are often ambiguous or underspecify the information they are after. In this paper, we address this problem by first defining how to categorize queries according to their (often implicit) geographical locality. We then introduce several alternatives for automatically and efficiently categorizing queries in our scheme, using a variety...
Parameter estimation for differential equations: A generalized smoothing approach
 Journal of the Royal Statistical Society, Series B
, 2007
Abstract

Cited by 35 (5 self)
We propose a new method for estimating parameters in nonlinear differential equations. These models represent change in a system by linking the behavior of a derivative of a process to the behavior of the process itself. Current methods for estimating parameters in differential equations from noisy data are computationally intensive and often poorly suited to statistical techniques such as inference and interval estimation. This paper describes a new method that uses noisy data to estimate the parameters defining a system of nonlinear differential equations. The approach is based on a modification of data smoothing methods along with a generalization of profiled estimation. We derive interval estimates and show that these have good coverage properties on data simulated from chemical engineering and neurobiology. The method is demonstrated using real-world data from chemistry and from the progress of the autoimmune disease lupus.