Results 1-10 of 15
The local bootstrap for Markov processes
J. Statist. Plann. Inference, 2002
Cited by 13 (2 self)
A nonparametric bootstrap procedure is proposed for stochastic processes which follow a general autoregressive structure. The procedure generates bootstrap replicates by locally resampling the original set of observations, reproducing automatically its dependence properties. It avoids an initial nonparametric estimation of process characteristics in order to generate the pseudo-time series, and the bootstrap replicates mimic several of the properties of the original process. Applications of the procedure in nonlinear time series analysis are considered and theoretically justified; some simulated and real data examples are discussed.
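The local resampling idea in this abstract can be illustrated with a minimal Python sketch. It is not the paper's procedure: the first-order dependence, the k-nearest window, the uniform choice among neighbours, and the names `local_bootstrap` and `k` are all assumptions made for illustration.

```python
import random

def local_bootstrap(series, length, k=5, seed=None):
    """Generate one pseudo-series by locally resampling observed successors.

    Assumptions (not from the abstract): first-order dependence, a
    k-nearest window, and uniform sampling among the neighbours.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    if length < 1:
        raise ValueError("length must be positive")
    rng = random.Random(seed)
    n = len(series)
    # Only observations that have an observed successor can be donors.
    donors = range(n - 1)
    k = min(k, n - 1)
    current = series[rng.randrange(n - 1)]  # random starting state
    replicate = [current]
    while len(replicate) < length:
        # Observed states closest to the current bootstrap state.
        neighbours = sorted(donors, key=lambda s: abs(series[s] - current))[:k]
        s = rng.choice(neighbours)
        current = series[s + 1]  # copy the successor of the chosen state
        replicate.append(current)
    return replicate

observed = [0.1, 0.5, 0.2, 0.9, 0.4, 0.7, 0.3, 0.8]
pseudo_series = local_bootstrap(observed, length=20, k=3, seed=1)
```

Every value in the pseudo-series is an observed value, so no model for the transition mechanism has to be fitted first.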
Survey of stochastic models for wind and sea-state time series
2005
Cited by 4 (3 self)
Knowledge of sea state and wind conditions is of central importance for many offshore or nearshore operations. In this paper, we make a complete survey of stochastic models for sea state and wind time series. We begin the presentation with methods based on Gaussian processes; nonparametric resampling methods for time series are then introduced, followed by various parametric models. Finally, we propose an original statistical method, based on Monte Carlo goodness-of-fit tests, for model validation and comparison. The use of this method is illustrated by an example on wind speed data in the North Atlantic.
Evaluation of three Simple Imputation Methods for Enhancing Preprocessing of Data with Missing Values
Cited by 4 (1 self)
One of the important stages of data mining is preprocessing, where the data is prepared for different mining tasks. Real-world data often tends to be incomplete, noisy, and inconsistent. It is very common that the data are not obtainable for every observation of every variable, so missing values are to be expected in the data set. An important task when preprocessing the data is therefore to fill in missing values, smooth out noise, and correct inconsistencies. This paper presents the missing value problem in data mining and evaluates some of the methods generally used for missing value imputation. In this work, three simple missing value imputation methods are implemented, namely (1) constant substitution, (2) mean attribute value substitution, and (3) random attribute value substitution. The performance of the three missing value imputation algorithms was measured with respect to different rates (percentages) of missing values in the data set by using some known clustering methods. To evaluate the performance, the standard WDBC data set has been used.
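The three substitution rules named in this abstract can be sketched in a few lines of Python. The function names, the use of `None` as the missing-value marker, and the default constant of 0.0 are assumptions for illustration, not details taken from the paper.

```python
import random

def impute_constant(column, constant=0.0):
    """(1) Constant substitution: every missing entry gets the same value."""
    return [constant if v is None else v for v in column]

def impute_mean(column):
    """(2) Mean attribute value substitution, using observed entries only."""
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

def impute_random(column, seed=None):
    """(3) Random attribute value substitution: draw from observed entries."""
    rng = random.Random(seed)
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    return [rng.choice(observed) if v is None else v for v in column]

column = [1.0, None, 3.0]          # toy attribute with one missing entry
print(impute_constant(column))     # missing entry replaced by 0.0
print(impute_mean(column))         # missing entry replaced by the mean, 2.0
print(impute_random(column, seed=0))
```

Each function works on one attribute at a time, so a full data set would be imputed column by column.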
On Robustness of Model-Based Bootstrap Schemes in Nonparametric Time Series Analysis
1997
Cited by 1 (1 self)
Theory in time series analysis is often developed in the context of finite-dimensional models for the data generating process. Whereas corresponding estimators such as those of a conditional mean function are reasonable even if the true dependence mechanism is of a more complex structure, it is usually necessary to capture the whole dependence structure asymptotically for the bootstrap to be valid. However, certain model-based bootstrap methods remain valid for some interesting quantities arising in nonparametric statistics. We generalize the well-known "whitening by windowing" principle to joint distributions of nonparametric estimators of the autoregression function. As a consequence, we obtain that model-based nonparametric bootstrap schemes remain valid for supremum-type functionals as long as they mimic the corresponding finite-dimensional joint distributions consistently. As an example, we investigate a finite order Markov chain bootstrap in the context of a general stationary ...
Missing Value Imputation Based on Data Clustering
Cited by 1 (0 self)
We propose an efficient nonparametric missing value imputation method based on clustering, called CMI (Clustering-based Missing value Imputation), for dealing with missing values in target attributes. In our approach, we impute the missing values of an instance A with plausible values that are generated, using a kernel-based method, from the data in the instances which do not contain missing values and are most similar to the instance A. Specifically, we first divide the dataset (including the instances with missing values) into clusters. Next, missing values of an instance A are patched up with the plausible values generated from A’s cluster. Extensive experiments show the effectiveness of the proposed method in the missing value imputation task.
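The two steps described in this abstract (cluster first, then impute from the instance's own cluster with a kernel) can be sketched as follows. This is a rough illustration under stated assumptions, not the CMI algorithm itself: the tiny k-means with first-k initialisation, clustering on fully observed feature attributes only, the Gaussian kernel, and the names `kmeans`, `cluster_impute` and `bandwidth` are all hypothetical choices.

```python
import math

def kmeans(points, k, iterations=20):
    """Tiny k-means; centroids start at the first k points (a simplification)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_impute(features, targets, k=2, bandwidth=1.0):
    """Fill missing targets (None) with a Gaussian-kernel weighted mean of
    the observed targets in the same cluster (assumed kernel and weights)."""
    labels = kmeans(features, k)
    complete = [j for j, t in enumerate(targets) if t is not None]
    if not complete:
        raise ValueError("no observed target values to impute from")
    imputed = list(targets)
    for i, y in enumerate(targets):
        if y is not None:
            continue
        # Fall back to all complete instances if the cluster has none.
        donors = [j for j in complete if labels[j] == labels[i]] or complete
        weights = [math.exp(-0.5 * (math.dist(features[i], features[j]) / bandwidth) ** 2)
                   for j in donors]
        total = sum(weights)
        if total == 0.0:  # all donors too far for the kernel: plain mean
            weights, total = [1.0] * len(donors), float(len(donors))
        imputed[i] = sum(w * targets[j] for w, j in zip(weights, donors)) / total
    return imputed

# Toy data: two well-separated groups, one missing target in each.
features = [(0.0,), (0.1,), (0.2,), (10.0,), (10.1,), (10.2,)]
targets = [1.0, None, 1.0, 5.0, None, 5.0]
filled = cluster_impute(features, targets, k=2)
```

Restricting donors to the instance's own cluster keeps distant, dissimilar instances from influencing the imputed value.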
Use of the Nonparametric Nearest Neighbor Approach to Estimate Soil Hydraulic Properties
2006
Nonparametric approaches are being used in various fields to address classification type problems, as well as to estimate continuous variables. One type of nonparametric lazy learning algorithm, a k-nearest neighbor (kNN) algorithm, has been applied to estimate water retention at -33 and -1500 kPa matric potentials. Performance of the algorithm has subsequently been tested against estimations made by a neural network (NNet) model, developed using the same data and input soil attributes. We used a hierarchical set of inputs using soil texture, bulk density (Db), and organic matter (OM) content to avoid possible bias toward one set of inputs, and varied the size of the data set used to develop the NNet models and to run the kNN estimation algorithms. Different ‘design-parameter’ settings, analogous to model parameters, have been optimized. The kNN technique showed little sensitivity to potential suboptimal settings
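A kNN estimate of a continuous variable, as described in this abstract, can be sketched in Python as the average response of the k most similar reference samples. The unweighted mean, the Euclidean distance on raw attributes, and the name `knn_estimate` are assumptions; the paper's own weighting and attribute scaling are not given in the abstract.

```python
import math

def knn_estimate(train_x, train_y, query, k=3):
    """Estimate a continuous value as the mean response of the k nearest
    reference samples (unweighted mean and Euclidean distance assumed)."""
    if not train_x or len(train_x) != len(train_y):
        raise ValueError("train_x and train_y must be non-empty and aligned")
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)

# Toy reference set (made-up numbers, not from the paper): one input
# attribute per sample and a response value for each.
train_x = [(0.0,), (1.0,), (2.0,), (10.0,)]
train_y = [0.1, 0.2, 0.3, 0.9]
estimate = knn_estimate(train_x, train_y, query=(1.1,), k=2)
```

Being a lazy learner, the sketch has no training step: adding reference samples only means appending to `train_x` and `train_y`.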
Space-Time Disaggregation of Streamflow Data Using K-Nearest Neighbor Patterns and Optimization
2000
Disaggregated sequences that are statistically similar to observed streamflow records are very useful for analyzing multi-reservoir operation policies and river basin management. There is renewed interest in disaggregation methods as climate-related issues (regional ENSO forecasts or downscaling of climate change scenarios) have come to the fore. Disaggregated streamflow should preserve statistical attributes of time series across multiple sites and time scales. A new algorithm for simultaneously disaggregating monthly to weekly or daily flows at a number of sites on a drainage network is presented in this paper. The continuity of flow in time across months at each site as well as the inter-site flow pattern is preserved. The disaggregated daily flows at the multiple sites are conditioned on the spatial (across site) pattern of monthly flows at the same sites. The probability distribution of the vector of disaggregated flows conditional on the multi-site monthly flows is approximated nonparametrically using the k-nearest neighbors of the monthly spatial flow pattern. A constrained optimization problem is solved to adaptively estimate the disaggregated flows in space and time for each such neighborhood. An application to data from a tributary of the Colorado River is used to illustrate the modeling process. The daily streamflow data available at the index site was disaggregated to obtain the streamflow data at four upstream sites conditioned on monthly data available at those sites.
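The neighbor-selection step mentioned in this abstract can be sketched in Python as below. Only that step is shown; the constrained optimization is not. The 1/rank sampling weights are a common choice in kNN resampling and are an assumption here, as are the Euclidean distance and the name `sample_neighbour`.

```python
import math
import random

def sample_neighbour(history, query, k=3, seed=None):
    """Pick the index of one historical multi-site monthly flow pattern
    from the k nearest to `query`, favouring closer patterns.

    The 1/rank weights and Euclidean distance are assumptions, not
    details taken from the abstract.
    """
    if not history:
        raise ValueError("history must not be empty")
    rng = random.Random(seed)
    nearest = sorted(range(len(history)),
                     key=lambda i: math.dist(history[i], query))[:k]
    weights = [1.0 / rank for rank in range(1, len(nearest) + 1)]
    return rng.choices(nearest, weights=weights, k=1)[0]

# Toy monthly flow patterns at two sites (made-up numbers).
history = [(1.0, 2.0), (1.1, 2.1), (5.0, 6.0), (9.0, 9.0)]
chosen = sample_neighbour(history, query=(1.0, 2.0), k=2, seed=0)
```

The daily flows recorded for the chosen historical month would then serve as the conditional sample for that neighborhood.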
Stepwise Nonparametric Disaggregation for Daily Streamflow Generation Conditional on Hydrologic and Large-Scale Climatic Signals
2010
A stepwise nonparametric stochastic disaggregation framework to produce synthetic scenarios of daily streamflow conditional on volumes of spring runoff and large-scale ocean-atmosphere oscillations is presented. This thesis examines statistical links (i.e., teleconnections) between decadal/interannual climatic variations in the Pacific Ocean and hydrologic variability in the US Northwest region, and includes a spectral analysis of climate signals to detect coherences of their behavior in the frequency domain. We explore the use of such teleconnections of selected signals (e.g., North Pacific Gyre Oscillation, Southern Oscillation, and Pacific Decadal Oscillation indices) in the proposed data-driven framework by means of a cross-validation-based combinatorial approach, with the aim of simulating improved streamflow sequences when compared with disaggregated series generated from flows alone. A nearest neighbor time series bootstrapping approach is integrated with principal component analysis to resample from the empirical multivariate distribution. A volume-dependent scaling transformation is implemented to guarantee the summability condition. The downscaling process includes a two-level cascade scheme: seasonal-to-monthly disaggregation first, followed by monthly-to-daily disaggregation. Although the stepwise procedure may lead to a lack of preservation of the historical correlation between flows of the last day of a month and flows of the first day of the following month, we present a new and simple algorithm, based on nonparametric resampling, that overcomes this limitation. The downscaling framework presented here is parsimonious in parameters and model assumptions, does not generate negative values, and preserves very well the statistical characteristics, temporal dependences, and distributional properties of historical flows.
We also show that both including conditional information of climatic teleconnection signals and developing the downscaling in cascades significantly decrease the mean error between synthetic and observed flow traces. The downscaling framework is tested with data from the Payette River Basin in Idaho.
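The summability condition mentioned in this abstract requires the disaggregated flows to add up to the aggregate volume. A minimal Python sketch is plain proportional rescaling of a resampled pattern. This is an assumed stand-in: the thesis's volume-dependent transformation is not described in the abstract, and the name `rescale_to_total` is hypothetical.

```python
def rescale_to_total(pattern, target_total):
    """Scale a resampled flow pattern so that it sums to `target_total`.

    Plain proportional scaling, used here only to illustrate the
    summability condition; it also keeps non-negative flows non-negative.
    """
    base = sum(pattern)
    if base <= 0:
        raise ValueError("pattern must have a positive total")
    factor = target_total / base
    return [value * factor for value in pattern]

# Toy example: a resampled three-day pattern forced to a volume of 16.
daily = rescale_to_total([1.0, 3.0, 4.0], target_total=16.0)
```

Because every value is multiplied by the same positive factor, the shape of the resampled pattern is preserved while its total matches the conditioning volume.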
CHAPTER 7 Ensemble Streamflow Forecasting: Methods & Applications
The chapter is organized as follows. The theme of the chapter is introduced in Section 7.1. Section 7.2 presents a background on large-scale climate and its impacts on the western US hydroclimatology. The basins studied and data used are described in Sections 7.3 and 7.4, respectively. This is followed by the climate