Results 1-10 of 74
Identifying periodically expressed transcripts in microarray time series data
Bioinformatics, 2004
Cited by 76 (1 self)
Motivation: Microarray experiments are now routinely used to collect large-scale time series data, for example to monitor gene expression during the cell cycle. Statistical analysis of this data poses many challenges, one being that it is hard to correctly identify the subset of genes with a clear periodic signature. This has led to a controversy over the suitability of both the available methods and current microarray data. Methods: We introduce two simple but efficient statistical methods for signal detection and gene selection in gene expression time series data. First, we suggest the average periodogram as an exploratory device for graphical assessment of the presence of periodic transcripts in the data. Second, we describe an exact statistical test to identify periodically
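As a rough illustration of the average periodogram mentioned in the abstract above, the following sketch computes a periodogram for each expression profile and averages the ordinates across genes. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the scaling by 1/n, and the toy profiles are all hypothetical, and equally spaced time points are assumed.

```python
import cmath
import math


def periodogram(series):
    """Periodogram ordinates at Fourier frequencies k = 1..floor(n/2).

    Assumes equally spaced observations; the 1/n scaling is one common
    convention and is an assumption of this sketch.
    """
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    ordinates = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        ordinates.append(abs(coeff) ** 2 / n)
    return ordinates


def average_periodogram(profiles):
    """Average the per-gene periodograms frequency by frequency."""
    per_gene = [periodogram(p) for p in profiles]
    n_genes = len(per_gene)
    return [sum(column) / n_genes for column in zip(*per_gene)]


# Toy data (hypothetical): three profiles of length 16 that each complete
# two cycles, with different amplitudes and phases.
n = 16
profiles = [
    [amp * math.cos(2 * math.pi * 2 * t / n + phase) for t in range(n)]
    for amp, phase in [(1.0, 0.0), (0.5, 1.0), (2.0, 2.5)]
]
avg = average_periodogram(profiles)
peak_k = max(range(len(avg)), key=avg.__getitem__) + 1
print("dominant Fourier frequency index:", peak_k)
```

A pronounced peak in the averaged ordinates suggests that many profiles share a common period; plotting `avg` against frequency is the kind of graphical assessment the abstract describes.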
Detecting novel associations in large data sets
Science, 2011
Cited by 68 (1 self)
Comparing Bootstrap and Posterior Probability Values in the Four-Taxon Case
2003
Cited by 59 (4 self)
Assessment of the reliability of a given phylogenetic hypothesis is an important step in phylogenetic analysis. Historically, the nonparametric bootstrap procedure has been the most frequently used method for assessing the support for specific phylogenetic relationships. The recent employment of Bayesian methods for phylogenetic inference problems has resulted in clade support being expressed in terms of posterior probabilities. We used simulated data and the four-taxon case to explore the relationship between nonparametric bootstrap values (as inferred by maximum likelihood) and posterior probabilities (as inferred by Bayesian analysis). The results suggest a complex association between the two measures. Three general regions of tree space can be identified: (1) the neutral zone, where differences between mean bootstrap and mean posterior probability values are not significant, (2) near the two-branch corner, and (3) deep in the two-branch corner. In the last two regions, significant differences occur between mean bootstrap and mean posterior probability values. Whether bootstrap or posterior probability values are higher depends on the data in support of alternative topologies. Examination of star topologies revealed that both bootstrap and posterior probability values differ significantly from theoretical expectations;
Test of significance when data are curves
Journal of the American Statistical Association, 1998
Cited by 56 (1 self)
With modern technology, massive data can easily be collected in the form of multiple sets of curves. A new statistical challenge is to test whether there is any statistically significant difference among these sets of curves. In this paper, we propose some new tests for comparing two groups of curves based on the adaptive Neyman test and the wavelet thresholding techniques introduced in Fan (1996). We demonstrate that these tests inherit the properties outlined in Fan (1996) and that they are simple and powerful for detecting differences between two sets of curves. We then further generalize the idea to compare multiple sets of curves, resulting in an adaptive high-dimensional analysis of variance, called HANOVA. These newly developed techniques are illustrated by using a dataset on pizza commercial where observations are curves and an analysis of cornea topography in ophthalmology where images of individuals are observed. A simulation example is also presented to illustrate the power of the adaptive Neyman test.
Identifying MMORPG bots: A traffic analysis approach
ACE2006 (Los Angeles 14th-16th, 2006
Cited by 32 (9 self)
MMORPGs have become extremely popular among network gamers. Despite their success, one of MMORPGs' greatest challenges is the increasing use of game bots, i.e., auto-playing game clients. The use of game bots is considered unsportsmanlike and is therefore forbidden. To keep games in order, game police, played by actual human players, often patrol game zones and question suspicious players. This practice, however, is labor-intensive and ineffective. To address this problem, we analyze the traffic generated by human players vs. game bots and propose solutions to automatically identify game bots. Taking Ragnarok Online, one of the most popular MMOGs, as our subject, we study the traffic generated by mainstream game bots and human players. We find that their traffic is distinguishable by: 1) the regularity in the release time
Nonparametric estimation of a periodic function
Biometrika, 2000
Cited by 19 (2 self)
ABSTRACT. Motivated by applications to brightness data on periodic variable stars, we study nonparametric methods for estimating both the period and the amplitude function from noisy observations of a periodic function made at irregularly spaced times. It is shown that nonparametric estimators of period converge at parametric rates and attain a semiparametric lower bound which is the same if the shape of the periodic function is unknown as if it were known. Also, first-order properties of nonparametric estimators of the amplitude function are identical to those that would obtain if the period were known. Numerical simulations and applications to real data show the method to work well in practice. KEY WORDS AND PHRASES. frequency estimation, nonparametric regression, semiparametric estimation, Nadaraya-Watson estimator, MACHO project, variable star data. SHORT TITLE. Estimation of a periodic function
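To make the idea of estimating a period from irregularly spaced observations concrete, here is a minimal sketch of one plausible approach: for each candidate period, fold the observation times into phases, smooth the folded data with a Nadaraya-Watson estimator, and keep the period with the smallest residual sum of squares. This is an illustration under stated assumptions, not the procedure of the paper: the Gaussian kernel, the bandwidth of 0.05, the candidate grid, and the simulated data are all hypothetical choices.

```python
import math
import random


def nw_fit(phases, values, bandwidth):
    """Nadaraya-Watson fit at each observed phase.

    Uses a Gaussian kernel on circular distance, since phases live on a
    circle of length 1. Kernel and bandwidth are assumptions of this sketch.
    """
    fitted = []
    for p in phases:
        num = den = 0.0
        for q, y in zip(phases, values):
            d = abs(p - q)
            d = min(d, 1.0 - d)  # wrap-around distance between phases
            w = math.exp(-0.5 * (d / bandwidth) ** 2)
            num += w * y
            den += w
        fitted.append(num / den)
    return fitted


def estimate_period(times, values, candidates, bandwidth=0.05):
    """Return the candidate period with the smallest residual sum of squares."""
    best_period, best_rss = None, float("inf")
    for period in candidates:
        phases = [(t / period) % 1.0 for t in times]
        fitted = nw_fit(phases, values, bandwidth)
        rss = sum((y - f) ** 2 for y, f in zip(values, fitted))
        if rss < best_rss:
            best_period, best_rss = period, rss
    return best_period


# Simulated data (hypothetical): a sine wave with period 3.0, observed at
# 100 irregular times with a small amount of Gaussian noise.
rng = random.Random(0)
times = sorted(rng.uniform(0.0, 60.0) for _ in range(100))
values = [math.sin(2 * math.pi * t / 3.0) + rng.gauss(0.0, 0.1) for t in times]
candidates = [2.0 + 0.01 * i for i in range(201)]  # grid from 2.00 to 4.00
estimated = estimate_period(times, values, candidates)
print("estimated period:", estimated)
```

The candidate grid deliberately excludes integer multiples and simple fractions of the true period, because folding at those values can also produce a smooth curve and a small residual.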
Assessing Nonstationary Time Series Using Wavelets
1998
Cited by 13 (4 self)
The discrete wavelet transform has been used extensively in the field of Statistics, mostly in the area of "denoising signals" or nonparametric regression. This thesis provides a new application for the discrete wavelet transform: assessing nonstationary events in time series, especially long memory processes. Long memory processes are those which exhibit substantial correlations between events separated by a long period of time. Departures from stationarity in these heavily autocorrelated time series, such as an abrupt change in the variance at an unknown location or "bursts" of increased variability, can be detected and accurately located using discrete wavelet transforms, both orthogonal and overcomplete. A cumulative sum of squares method, utilizing a Kolmogorov-Smirnov-type
Robust Full Bayesian Learning for Neural Networks
1999
Cited by 12 (9 self)
In this paper, we propose a hierarchical full Bayesian model for neural networks. This model treats the model dimension (number of neurons), model parameters, regularisation parameters and noise parameters as random variables that need to be estimated. We develop a reversible jump Markov chain Monte Carlo (MCMC) method to perform the necessary computations. We find that the results obtained using this method are not only better than the ones reported previously, but also appear to be robust with respect to the prior specification. In addition, we propose a novel and computationally efficient reversible jump MCMC simulated annealing algorithm to optimise neural networks. This algorithm enables us to maximise the joint posterior distribution of the network parameters and the number of basis functions. It performs a global search in the joint space of the parameters and number of parameters, thereby surmounting the problem of local minima. We show that by calibrating the full hierarchical ...
Bayesian Methods for Neural Networks
1999
Cited by 12 (0 self)
Summary: The application of the Bayesian learning paradigm to neural networks results in a flexible and powerful nonlinear modelling framework that can be used for regression, density estimation, prediction and classification. Within this framework, all sources of uncertainty are expressed and measured by probabilities. This formulation allows for a probabilistic treatment of our a priori knowledge, domain-specific knowledge, model selection schemes, parameter estimation methods and noise estimation techniques. Many researchers have contributed towards the development of the Bayesian learning approach for neural networks. This thesis advances this research by proposing several novel extensions in the areas of sequential learning, model selection, optimisation and convergence assessment. The first contribution is a regularisation strategy for sequential learning based on extended Kalman filtering and noise estimation via evidence maximisation. Using the expectation maximisation (EM) algorithm, a similar algorithm is derived for batch learning. Much of the thesis is, however, devoted to Monte Carlo simulation methods. A robust Bayesian method is proposed to estimate,
Nonparametric Bayesian inference on bivariate extremes
Journal of the Royal Statistical Society, Series B (Statistical Methodology), 2011
Cited by 7 (1 self)
The tail of a bivariate distribution function in the domain of attraction of a bivariate extreme-value distribution may be approximated by that of its extreme-value attractor. The extreme-value attractor has margins that belong to a three-parameter family and a dependence structure which is characterised by a spectral measure, that is, a probability measure on the unit interval with mean equal to one half. As an alternative to parametric modelling of the spectral measure, we propose an infinite-dimensional model which is at the same time manageable and still dense within the class of spectral measures. Inference is done in a Bayesian framework, using the censored-likelihood approach. In particular, we construct a prior distribution on the class of spectral measures and develop a trans-dimensional Markov chain Monte Carlo algorithm for numerical computations. The method provides a bivariate predictive density which can be used for predicting the extreme outcomes of the bivariate distribution. From a practical perspective, this is useful for computing rare event probabilities and extreme conditional quantiles. The methodology is validated by simulations and applied to a dataset of Danish fire insurance claims.