Results 1–10 of 27
Robust forecasting of mortality and fertility rates: a functional data approach
2006
Cited by 70 (17 self)
A new method is proposed for forecasting age-specific mortality and fertility rates observed over time. This approach allows for smooth functions of age, is robust to outlying years due to wars and epidemics, and provides a modelling framework that is easily adapted to allow for constraints and other information. Ideas from functional data analysis, nonparametric smoothing and robust statistics are combined to form a methodology that is widely applicable to any functional time series data observed discretely and possibly with error. The model is a generalization of the Lee-Carter model commonly used in mortality and fertility forecasting. The methodology is applied to French mortality data and Australian fertility data, and the forecasts obtained are shown to be superior to those from the Lee-Carter method and several of its variants.
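The Lee-Carter structure that this functional approach generalizes can be sketched in a few lines of NumPy: log mortality is modelled as log m(x,t) = a_x + b_x k_t, fitted by a rank-1 SVD of the centred log rates, with the period index k_t then forecast by a time-series model. The synthetic data, dimensions, and random-walk-with-drift forecast below are illustrative choices, not the paper's.

```python
import numpy as np

# Synthetic log-mortality surface obeying the Lee-Carter structure plus noise.
rng = np.random.default_rng(0)
ages, years = 20, 40
a = np.linspace(-6.0, -2.0, ages)                 # baseline log-mortality by age
b = np.full(ages, 1.0 / ages)                     # age response, sums to 1
k = np.linspace(2.0, -2.0, years)                 # declining period index, sums to 0
log_m = a[:, None] + np.outer(b, k) + 0.01 * rng.standard_normal((ages, years))

a_hat = log_m.mean(axis=1)                        # estimate a_x by row means
U, s, Vt = np.linalg.svd(log_m - a_hat[:, None])  # rank-1 fit of the residual
b_hat, k_hat = U[:, 0], s[0] * Vt[0]
b_hat, k_hat = b_hat / b_hat.sum(), k_hat * b_hat.sum()  # usual identifiability scaling

# k_t is then forecast with a time-series model, e.g. a random walk with drift.
drift = (k_hat[-1] - k_hat[0]) / (years - 1)
k_forecast = k_hat[-1] + drift * np.arange(1, 11)
```

The paper's contribution is to make this pipeline robust (so wars and epidemics do not distort b_x and k_t) and to smooth over age, rather than fitting the raw matrix as here.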
Influence Function and Efficiency of the Minimum Covariance Determinant Scatter Matrix Estimator
Journal of Multivariate Analysis, 1998
Cited by 56 (18 self)
The Minimum Covariance Determinant (MCD) scatter estimator is a highly robust estimator for the dispersion matrix of a multivariate, elliptically symmetric distribution. It is relatively fast to compute and intuitively appealing. In this note we derive its influence function and compute the asymptotic variances of its elements. A comparison with the one-step reweighted MCD and with S-estimators is made. Finite-sample results are also reported.
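The concentration ("C-") step at the heart of fast approximate MCD algorithms can be sketched as follows; the subset size, number of restarts, and synthetic data are illustrative choices, not the paper's.

```python
import numpy as np

def c_step(X, idx, h):
    """One concentration step: refit mean/covariance on the subset, then keep
    the h points with smallest Mahalanobis distance to that fit."""
    mu = X[idx].mean(axis=0)
    S = np.cov(X[idx], rowvar=False)
    diff = X - mu
    d2 = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(S), diff)
    return np.argsort(d2)[:h]

def mcd_sketch(X, h, n_starts=20, n_steps=15, seed=0):
    """Approximate MCD: many random small starts, iterate C-steps (which never
    increase det(S)), keep the subset with smallest covariance determinant."""
    rng = np.random.default_rng(seed)
    best_idx, best_det = None, np.inf
    for _ in range(n_starts):
        idx = rng.choice(len(X), size=X.shape[1] + 8, replace=False)
        for _ in range(n_steps):
            idx = c_step(X, idx, h)
        det = np.linalg.det(np.cov(X[idx], rowvar=False))
        if det < best_det:
            best_det, best_idx = det, idx
    return X[best_idx].mean(axis=0), np.cov(X[best_idx], rowvar=False)

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 2))
X[:20] += 10.0                       # 10% contamination far from the bulk
loc, scatter = mcd_sketch(X, h=150)  # robust location stays near the origin
```

The influence function derived in the paper describes how such an estimate reacts to an infinitesimal amount of contamination at a given point.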
High breakdown estimators for principal components: the projection-pursuit approach revisited
Journal of Multivariate Analysis, 2005
Cited by 49 (3 self)
Li and Chen (1985) proposed a method for principal components using projection-pursuit techniques. In classical principal components one searches for directions with maximal variance, and their approach consists of replacing this variance by a robust scale measure. Li and Chen showed that this estimator is consistent, qualitatively robust and inherits the breakdown point of the robust scale estimator. We complete their study by deriving the influence function of the estimators for the eigenvectors, eigenvalues and the associated dispersion matrix. Corresponding Gaussian efficiencies are presented as well. Asymptotic normality of the estimators has been treated in a paper of Cui, He and Ng (2003), complementing the results of this paper. Furthermore, a simple explicit version of the projection-pursuit based estimator is proposed and shown to be fast to compute, orthogonally equivariant, and to have the maximal finite-sample breakdown point property. We illustrate the method with a real data example.
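The projection-pursuit idea of Li and Chen can be sketched as follows: each component is the direction maximizing a robust scale of the projected data, here the MAD over a random grid of candidate directions, with deflation between components. The grid search, MAD choice, and data are illustrative, not the estimator studied in the paper.

```python
import numpy as np

def mad(x):
    """Median absolute deviation, scaled for consistency at the normal."""
    return 1.4826 * np.median(np.abs(x - np.median(x)))

def pp_robust_pca(X, n_components, n_dirs=2000, seed=0):
    """Projection pursuit: each component maximizes a robust scale (MAD)
    of the projections; found directions are deflated out."""
    rng = np.random.default_rng(seed)
    Xc = X - np.median(X, axis=0)               # robust centring
    components, scales = [], []
    for _ in range(n_components):
        D = rng.standard_normal((n_dirs, Xc.shape[1]))
        D /= np.linalg.norm(D, axis=1, keepdims=True)
        s = np.array([mad(Xc @ d) for d in D])  # robust scale per direction
        best = D[np.argmax(s)]
        components.append(best)
        scales.append(s.max())
        Xc = Xc - np.outer(Xc @ best, best)     # deflate the found direction
    return np.array(components), np.array(scales)

rng = np.random.default_rng(2)
X = rng.standard_normal((300, 5)) * np.array([5.0, 1, 1, 1, 1])
X[:15, 1] = 50.0                                # outliers along a minor axis
comps, scales = pp_robust_pca(X, n_components=2)
```

Classical PCA would be pulled toward the contaminated second axis, whose sample variance the outliers inflate; the MAD-based search still recovers the genuine high-scale first axis.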
ANTIDOTE: Understanding and Defending against Poisoning of Anomaly Detectors
Cited by 31 (5 self)
Statistical machine learning techniques have recently garnered increased popularity as a means to improve network design and security. For intrusion detection, such methods build a model of normal behavior from training data and detect attacks as deviations from that model. This process invites adversaries to manipulate the training data so that the learned model fails to detect subsequent attacks. We evaluate poisoning techniques and develop a defense in the context of a particular anomaly detector, namely the PCA-subspace method for detecting anomalies in backbone networks. For three poisoning schemes, we show how attackers can substantially increase their chance of successfully evading detection by adding only moderate amounts of poisoned data. Moreover, such poisoning throws off the balance between false positives and false negatives, thereby dramatically reducing the efficacy of the detector. To combat these poisoning activities, we propose an antidote based on techniques from robust statistics and present a new robust PCA-based detector. Poisoning has little effect on the robust model, whereas it significantly distorts the model produced by the original PCA method. Our technique substantially reduces the effectiveness of poisoning for a variety of scenarios and indeed maintains a significantly better balance between false positives and false negatives than the original method when under attack.
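The PCA-subspace detector that the paper attacks and hardens can be sketched as follows: project each traffic observation onto the top-k principal subspace learned from training data, and flag points with large residual ("out-of-subspace") energy. The synthetic low-rank traffic and the simple empirical-quantile threshold below are illustrative; practical deployments use a chi-square-based control limit.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, k = 500, 12, 3
# Normal traffic: low-rank structure plus noise.
factors = rng.standard_normal((n, k)) @ rng.standard_normal((k, p))
X = factors + 0.1 * rng.standard_normal((n, p))
anomaly = X[0] + 5.0 * rng.standard_normal(p)    # a volume-anomaly-like point

mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
P = Vt[:k].T @ Vt[:k]                            # projector onto the normal subspace

def spe(x):
    """Squared prediction error: energy of x outside the modelled subspace."""
    r = (x - mean) @ (np.eye(p) - P)
    return float(r @ r)

threshold = np.quantile([spe(x) for x in X], 0.99)
```

Poisoning works by injecting crafted training points that rotate the learned subspace toward the attacker's flow; the paper's antidote replaces the SVD step with a robust subspace estimate so that such points carry little weight.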
Location adjustment for the minimum volume ellipsoid estimator
Statistics and Computing, 2002
Cited by 8 (1 self)
Estimating multivariate location and scatter with both affine equivariance and positive breakdown has always been difficult. A well-known estimator which satisfies both properties is the Minimum Volume Ellipsoid (MVE) estimator. Computing the exact MVE is often not feasible, so one usually resorts to an approximate algorithm. In the regression setup, algorithms for positive-breakdown estimators like Least Median of Squares typically recompute the intercept at each step to improve the result. This approach is called intercept adjustment. In this paper we show that a similar technique, called location adjustment, can be applied to the MVE. For this purpose we use the Minimum Volume Ball (MVB) in order to lower the MVE objective function. An exact algorithm for calculating the MVB is presented. As an alternative to MVB location adjustment we propose L1 location adjustment, which does not necessarily lower the MVE objective function but yields more efficient estimates for the location part. Simulations compare the two types of location adjustment. We also obtain the max-bias curves of both L1 and the MVB in the multivariate setting, revealing the superiority of L1.
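The L1 location (spatial median) behind L1 location adjustment can be computed with Weiszfeld's iteration, the point minimizing the sum of Euclidean distances to the observations. The starting value, tolerance, and data below are illustrative choices.

```python
import numpy as np

def spatial_median(X, tol=1e-8, max_iter=500):
    """Weiszfeld's algorithm for the L1 (spatial) median."""
    y = np.median(X, axis=0)                    # coordinatewise median as a start
    for _ in range(max_iter):
        d = np.linalg.norm(X - y, axis=1)
        d = np.where(d < 1e-12, 1e-12, d)       # guard against landing on a data point
        w = 1.0 / d
        y_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            return y_new
        y = y_new
    return y

rng = np.random.default_rng(4)
X = rng.standard_normal((200, 3))
X[:20] += 20.0                                  # 10% gross outliers
m = spatial_median(X)                           # barely moved by the contamination
```

Unlike the sample mean, which the 10% of gross outliers drag far from the origin, the spatial median stays close to the centre of the clean bulk, which is what makes it attractive for adjusting the location part of a high-breakdown fit.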
Nonparametric time series forecasting with dynamic updating
Cited by 4 (3 self)
We present a nonparametric method to forecast a seasonal time series, and propose four dynamic updating methods to improve point forecast accuracy. Our forecasting and dynamic updating methods are data-driven and computationally fast, and are thus feasible to apply in practice. We demonstrate the effectiveness of these methods using monthly El Niño time series from 1950 to 2008.
Robust multivariate methods: The projection pursuit approach
In From Data and Information Analysis to Knowledge Engineering, Spiliopoulou, 2006
Cited by 2 (1 self)
Projection pursuit was originally introduced to identify structures in multivariate data clouds (Huber, 1985). The idea of projecting data onto a low-dimensional subspace can also be applied to multivariate statistical methods. The robustness of the methods can be achieved by applying robust estimators in the lower-dimensional space. Robust estimation in high dimensions can thus be avoided, which usually results in faster computation. Moreover, flat data sets, where the number of variables is much higher than the number of observations, can more easily be analyzed in a robust way. We focus on the projection-pursuit approach for robust continuum regression (Serneels et al., 2005). A new algorithm is introduced and compared with the reference algorithm as well as with classical continuum regression.
DETECTING INFLUENTIAL OBSERVATIONS IN KERNEL PCA
2008
Cited by 1 (0 self)
Individual observations can be very influential when performing classical Principal Component Analysis in a Euclidean space. Robust PCA algorithms detect and neutralize such dominating data points. This paper studies robustness issues for PCA in a kernel-induced feature space. The sensitivity of Kernel PCA is characterized by calculating the influence function. A robust Kernel PCA method is proposed by incorporating kernels into the Spherical PCA algorithm. Using the scores from Spherical Kernel PCA, a graphical diagnostic is proposed to detect points that are influential for ordinary Kernel PCA.
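The Spherical PCA step that the paper kernelizes can be sketched in the plain (non-kernel) case: observations are centred at the spatial median and projected onto the unit sphere before an ordinary eigendecomposition, so a point's influence no longer grows with its distance from the centre. The fixed Weiszfeld iteration count and the data below are illustrative, and this linear version is only the starting point for the kernel method in the paper.

```python
import numpy as np

def spherical_pca(X, n_components, n_iter=200):
    """Spherical PCA: centre at an approximate spatial median, map points to
    the unit sphere, then eigendecompose the covariance of the sphered data."""
    mu = np.median(X, axis=0)
    for _ in range(n_iter):                       # Weiszfeld steps for the L1 median
        d = np.maximum(np.linalg.norm(X - mu, axis=1), 1e-12)
        mu = (X / d[:, None]).sum(axis=0) / (1.0 / d).sum()
    Z = X - mu
    Z /= np.maximum(np.linalg.norm(Z, axis=1, keepdims=True), 1e-12)
    vals, vecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(vals)[::-1]
    return vecs[:, order[:n_components]], mu

rng = np.random.default_rng(6)
X = rng.standard_normal((400, 3)) * np.array([4.0, 1.0, 1.0])
X[:8, 2] = 100.0                                  # a few wild points off the main axis
V, centre = spherical_pca(X, n_components=1)
```

Because every observation contributes a unit vector, the eight wild points carry no more weight than any other point, and the leading direction of the clean bulk is recovered.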
Behavior of Machine Learning Algorithms in Adversarial Environments
2010
The BACON Approach for Rank-Deficient Data
Rank-deficient data are not uncommon in practice. They result from highly collinear variables and/or high-dimensional data. A special case of the latter occurs when the number of recorded variables exceeds the number of observations. The use of the BACON algorithm for outlier detection in multivariate data is extended here to include rank-deficient data. We present two approaches to identifying outliers in rank-deficient data based on the original BACON algorithm. The first algorithm projects the data onto a robust subspace of reduced dimension, while the second employs a ridge-type regularization on the covariance matrix. Both algorithms are tested on real as well as simulated data sets, with good results in terms of their effectiveness in outlier detection. They are also examined in terms of computational efficiency and found to be very fast, with particularly good scaling properties for increasing dimension.
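The forward-search idea behind the original (full-rank) BACON algorithm can be sketched as follows; the rank-deficient extensions in the paper first project or regularize so that the covariance step is well defined. The initial subset size, cutoff level, and data here are illustrative choices, and SciPy is used only for the chi-square quantile.

```python
import numpy as np
from scipy.stats import chi2

def bacon_sketch(X, m=None, alpha=0.01, max_iter=50):
    """Forward search: start from the points closest to the coordinatewise
    median, then repeatedly refit mean/covariance on the basic subset and admit
    every point whose Mahalanobis distance falls below a chi-square cutoff,
    until the subset stabilises. Non-members are flagged as outliers."""
    n, p = X.shape
    m = m or 4 * p                                   # initial basic-subset size
    subset = np.argsort(np.linalg.norm(X - np.median(X, axis=0), axis=1))[:m]
    cutoff = chi2.ppf(1 - alpha / n, df=p)           # Bonferroni-style cutoff
    for _ in range(max_iter):
        mu = X[subset].mean(axis=0)
        S = np.cov(X[subset], rowvar=False)
        diff = X - mu
        d2 = np.einsum('ij,jk,ik->i', diff, np.linalg.inv(S), diff)
        new = np.flatnonzero(d2 < cutoff)
        if np.array_equal(new, np.sort(subset)):
            break
        subset = new
    return np.flatnonzero(d2 >= cutoff)              # indices flagged as outliers

rng = np.random.default_rng(5)
X = rng.standard_normal((300, 4))
X[:10] += 8.0                                        # 10 clear outliers
outliers = bacon_sketch(X)
```

When the variables are collinear or p exceeds n, the covariance matrix above is singular and `inv(S)` fails; the paper's two algorithms (robust subspace projection, ridge regularization) are precisely what make this loop usable in that setting.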