## Interactions and Outliers in the Two-Way Analysis of Variance (1998)

Venue: Annals of Statistics

Citations: 6 (3 self)

### BibTeX

@ARTICLE{Terbeck98interactionsand,

author = {W. Terbeck and P. L. Davies},

title = {Interactions and Outliers in the Two-Way Analysis of Variance},

journal = {Annals of Statistics},

year = {1998},

volume = {26},

pages = {1279--1305}

}

### Abstract

The two-way analysis of variance with interactions is a well-established and integral part of statistics. In spite of its long standing, it is shown that the standard definition of interactions is counterintuitive and obfuscates rather than clarifies. A different definition of interaction is given which, amongst other advantages, allows the detection of interactions even in the case of one observation per cell. A characterization of unconditionally identifiable interaction patterns is given and it is proved that such patterns can be identified by the $L_1$-functional. The unconditionally identifiable interaction patterns describe the optimal breakdown behaviour of any equivariant location functional, from which it follows that the $L_1$-functional has optimal breakdown behaviour. Possible lack of uniqueness of the $L_1$-functional can be overcome using an M-functional with an external scale derived independently from the observations. The resulting procedures are applied to some data ...
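To make the abstract's point about the standard definition concrete, here is a minimal sketch, not taken from the paper. It uses a hypothetical 3 × 3 table with one observation per cell and a single nonzero cell. The function names `least_squares_residuals` and `l1_norm` are illustrative assumptions. The usual least squares interaction estimates spread the single effect over all nine cells, whereas the single-cell pattern itself has a smaller sum of absolute values, which is the criterion the $L_1$-functional minimizes.

```python
from fractions import Fraction


def least_squares_residuals(table):
    """Classical interaction estimates: x_ij - row mean - column mean + grand mean."""
    rows, cols = len(table), len(table[0])
    row_means = [Fraction(sum(row), cols) for row in table]
    col_means = [Fraction(sum(table[i][j] for i in range(rows)), rows)
                 for j in range(cols)]
    grand_mean = Fraction(sum(sum(row) for row in table), rows * cols)
    return [[table[i][j] - row_means[i] - col_means[j] + grand_mean
             for j in range(cols)]
            for i in range(rows)]


def l1_norm(residuals):
    """Sum of absolute residuals over all cells."""
    return sum(abs(value) for row in residuals for value in row)


# Hypothetical table: no row or column effects, one interaction in cell (1, 1).
table = [[1, 0, 0],
         [0, 0, 0],
         [0, 0, 0]]

ls = least_squares_residuals(table)
for row in ls:
    print([str(value) for value in row])

# Residuals of the fit with all row and column effects equal to zero are the
# table itself, so its L1 norm can be compared directly with the LS residuals.
print("L1 norm of least squares residuals:", l1_norm(ls))
print("L1 norm of single-cell pattern:   ", l1_norm(table))
```

Exact fractions are used so that the smeared pattern is visible without rounding. The comparison only shows that the single-cell pattern beats the least squares residuals in the $L_1$ sense for this table; it is not a general $L_1$ solver.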

### Citations

2241 |
Robust statistics
- Huber
- 1981
Citation Context ...ss of M-functionals defined by minimization of a strictly convex $\rho$-function with a given scale. Another possibility is to define location and scale as the zeros of the $\psi$- and $\chi$-functions (see Huber [21]). Although the empirical evidence is good in that we have not found any data sets where this fails, there are no proofs of uniqueness or theoretical results on breakdown behaviour. This is worthy of ... |

1826 |
A Simplex Method for Function Minimization
- Nelder, Mead
- 1965
Citation Context ...y this choice of A leads to numerical problems. Steepest descent turns out to be very unstable because of inaccuracies in the calculation of the gradient. The Nelder-Mead algorithm (Nelder and Mead [27]) does work but may have to be restarted several times due to degeneracy of the simplex. The following method proved satisfactory. Step 1: Iterate median polish 10 times. Step 2: Calculate robust scale f... |

1382 |
Robust Regression and Outlier Detection
- Rousseeuw, Leroy
- 1987
Citation Context ... residuals but rather by minimizing the $h$th order statistic where $h = [n/2] + [(p+1)/2]$ and $p - 1$ denotes the maximum number of points on a lower dimensional plane. We refer to Rousseeuw and Leroy [31] page 125 and Davies [10] page 1851 with the correction that the maximum number of points on a lower dimensional plane is $p - 1$ and not $p$. In the case of the $5 \times 5$ table "least median of squ... |

1212 | Exploratory Data Analysis - Tukey - 1977 |

479 | Least Median of Squares Regression
- Rousseeuw
- 1984
Citation Context ...haviour. Theorem 3.1 shows that the $L_1$-functional has the optimal breakdown behaviour. Let us examine the breakdown behaviour of the Hampel-Rousseeuw least median of squares (Hampel [17], Rousseeuw [29]). Firstly we note that the optimal breakdown point is not obtained by minimizing the median of the absolute residuals but rather by minimizing the $h$th order statistic where $h = [n/2] + [(p+1)/2]$ and $p$ ... |

148 | Multivariate estimation with high breakdown points - Rousseeuw |

105 | Alternatives to the median absolute deviation - Rousseeuw, Croux - 1993 |

98 |
The notion of breakdown point
- Donoho, Huber
- 1983
Citation Context ...l for the noise may be obtained from the residuals as described in Theorem 3.3 below. For the statement of the theorem we require the definition of the finite sample breakdown point (Donoho and Huber [12]). Given a scale functional $S$ and a data set $X$ we define $\varepsilon^*(S; X) = \min\{ k/(IJ) : \sup_{Y \in \mathcal{Y}_k} S(Y) = \infty \}$ where $\mathcal{Y}_k = \{ (y_{ij}) : \#\{(i,j) : y_{ij} \neq x_{ij}\} = k \}$. Theorem 3.3 For a given data set $X$ and ... |

92 |
Statist
- Rousseeuw, Van
Citation Context ...ample with the same structure is Table 5.6 on page 164 of Cochran and Cox [6]. It has been analysed by Daniel [7] Sections 4.3 and 4.7 and Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., Stahel, W.A. [18] Section 1.1d. We note that in the context of this paper there is no difference between an interaction and an outlier and we shall use both terminologies. Tukey [35] calls such observations exotic. Fa... |

37 |
Exploring Data Tables, Trends, and Shapes
- Hoaglin, Mosteller, et al.
- 1985
Citation Context ...$\Delta_{kj} = \operatorname{med}_i \{x_{ij} - x_{ik}\}$ for the differences $b_j - b_k$, and $\delta_{st} = \operatorname{med}_j \{x_{tj} - x_{sj}\}$ for the differences $a_t - a_s$ have been developed by Hoaglin, Mosteller and Tukey [20] page 45. In general these methods find all interaction patterns that satisfy either Corollary 2.7 or Corollary 2.8. They do not however find all unconditionally identifiable interaction patterns as d... |

33 |
Applications of Statistics to Industrial Experimentation
- Daniel
- 1976
Citation Context ...ith respect to AR, AC and IRC. Terbeck [33] contains more information on this topic. 1.3 Previous work The definition of interaction we give in Section 2.1 below is not new and may be found in Daniel [8]. Daniel shows how the least squares residuals may be used to detect certain patterns of outliers. In many cases Tukey's median polish correctly identifies outliers in the two-way table but it cannot ... |

32 |
Aspects of Robust Linear regression
- Davies
- 1993
Citation Context ...minimizing the $h$th order statistic where $h = [n/2] + [(p+1)/2]$ and $p - 1$ denotes the maximum number of points on a lower dimensional plane. We refer to Rousseeuw and Leroy [31] page 125 and Davies [10] page 1851 with the correction that the maximum number of points on a lower dimensional plane is $p - 1$ and not $p$. In the case of the $5 \times 5$ table "least median of squares" therefore has the h... |

27 |
Beyond Location Parameters: Robust Concepts and Methods
- Hampel, R
- 1975
Citation Context ...mal breakdown behaviour. Theorem 3.1 shows that the $L_1$-functional has the optimal breakdown behaviour. Let us examine the breakdown behaviour of the Hampel-Rousseeuw least median of squares (Hampel [17], Rousseeuw [29]). Firstly we note that the optimal breakdown point is not obtained by minimizing the median of the absolute residuals but rather by minimizing the $h$th order statistic where $h = [n/2]$+... |

19 |
Tail behavior of regression estimators and their breakdown points
- He, Jureckova, et al.
- 1990
Citation Context ... $L_1$-functional has the highest possible breakdown point. A discussion of the problem of outliers in the analysis of variance is to be found in Hampel et al. [18]. He, Jurečková, Koenker and Portnoy [19], Bradu [4] and Ellis and Morgenthaler [14] consider the breakdown behaviour of the $L_1$-functional for fixed regressors. Their work leads to necessary and sufficient conditions for a subset of the re... |

17 |
Data features
- Davies
- 1995
Citation Context ...hat it is the most difficult case and that the general case can be reduced to it by the simple expedient of replacing the observations in each cell by their median. The example (4) is given in Davies [11] as are, without proof, Corollary 2.4 and Theorem 2.11 below. 1.2 Group invariance and equivariance The model (1) is clearly invariant with respect to the following group of operations: PR permute the r... |

6 |
An algorithm for $l_1$-norm minimization with application to nonlinear $l_1$-approximation
- El-Attar, Vidyasagar, et al.
- 1979
Citation Context ... of the function $F(a_1, \ldots, a_I, b_1, \ldots, b_J) = \sum_{i,j} |c_{ij}(a_i, b_j)|$ (13). A characterization of solutions of such a minimization problem is given by El-Attar, Vidyasagar and Dutta [13], Lemma 2.1. In our situation all the functions $c_{ij}$ are linear and the gradient of $c_{ij}$ is an $(I+J)$-vector with $i$-th and $(I+j)$-th component $-1$ and 0 elsewhere. This leads to the following pro... |

6 |
Leverage and breakdown in $L_1$ regression
- Ellis, Morgenthaler
- 1992
Citation Context ...reakdown point. A discussion of the problem of outliers in the analysis of variance is to be found in Hampel et al. [18]. He, Jurečková, Koenker and Portnoy [19], Bradu [4] and Ellis and Morgenthaler [14] consider the breakdown behaviour of the $L_1$-functional for fixed regressors. Their work leads to necessary and sufficient conditions for a subset of the regressors to be safe for the $L_1$-functional... |

6 |
Evaluation and Control of Measurements
- Mandel
- 1991
Citation Context ...-way table. The results we give may be applied to the multiplicative model by taking logarithms. Many other structures can be constructed such as the row- and column-linear models developed by Mandel [25], [26]. They face the same problems and as yet have not been robustified. The linear model has the great advantage of simplicity and even if it is not an adequate model the residuals from a robust fit... |

3 |
Locating Outliers in Factorial Experiments
- Daniel
- 1960
Citation Context ...$\begin{pmatrix} \ldots & -2/9 & -2/9 \\ -2/9 & 1/9 & 1/9 \\ -2/9 & 1/9 & 1/9 \end{pmatrix}$ (5). We now have interactions everywhere and the original clear and simple interpretation has been lost. This example was also known to Daniel [7] and is of itself sufficient to discredit the usual definition of interaction. A practical example with the same structure is Table 5.6 on page 164 of Cochran and Cox [6]. It has been analysed by Dani... |

3 |
Analysis of Two-Way Layouts
- Mandel
- 1995
Citation Context ...able. The results we give may be applied to the multiplicative model by taking logarithms. Many other structures can be constructed such as the row- and column-linear models developed by Mandel [25], [26]. They face the same problems and as yet have not been robustified. The linear model has the great advantage of simplicity and even if it is not an adequate model the residuals from a robust fit provi... |

3 |
Interaktionen in der Zwei-Faktoren-Varianzanalyse
- Terbeck
- 1996
Citation Context ...p operations. Most but not all methods suggested in the literature are equivariant. An exception is Tukey's median polish (Tukey [34]) which is not equivariant with respect to AR, AC and IRC. Terbeck [33] contains more information on this topic. 1.3 Previous work The definition of interaction we give in Section 2.1 below is not new and may be found in Daniel [8]. Daniel shows how the least squares res... |

3 |
Exploratory analysis of variance as providing examples of strategic choices
- Tukey
- 1993
Citation Context ....M., Rousseeuw, P.J., Stahel, W.A. [18] Section 1.1d. We note that in the context of this paper there is no difference between an interaction and an outlier and we shall use both terminologies. Tukey [35] calls such observations exotic. Failure to detect interactions is equivalent to failure to detect outliers and in the context of outliers such a failure is referred to as breakdown. When we theref... |

1 |
The calculation of least absolute value estimators for two-way-tables
- Armstrong, Frome
- 1976
Citation Context ...y identifiable interaction pattern $P$ then $C$ is a solution of the least absolute deviation problem. In general, solutions to the $L_1$-problem in the two-way layout are not unique (Armstrong and Frome [1]). The following theorem is therefore stronger than Lemma 2.10: Theorem 2.11 Let $X$ be a matrix and $C$ and $C'$ be two residual matrices in $C(X)$ such that $C$ has an unconditionally identifiable interactio... |

1 |
Least-absolute-values-estimators for one-way and two-way tables
- Armstrong, Frome
- 1979
Citation Context ...ction pattern and $C'$ minimizes the least absolute deviation. Then $C = C'$. An adaptation of general simplex methods to the special case of a two-way table has been given by Armstrong and Frome [1], [2]. Theorem 2.11 is no longer valid for a matrix $C$ whose interaction pattern is not unconditionally identifiable as the following example shows: $\begin{pmatrix} 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{pmatrix}$ whose unique $L_1$ ... |

1 |
E.D.V. in Biologie und Medizin
- Bradu
- 1975
Citation Context ...tisfy either Corollary 2.7 or Corollary 2.8. They do not however find all unconditionally identifiable interaction patterns as defined below. More detailed information is given in Terbeck [33]. Bradu [3], Bradu and Hawkins [5], Gentleman and Wilk [15] and [16] have also considered the problem of identifying multiple outliers in the two-way analysis of variance. Hubert [23] has treated the correspondin... |

1 |
Identification of outliers by means of $L_1$-regression: Safe and unsafe configurations. Computational Statistics and Data Analysis 24
- Bradu
- 1997
Citation Context ...onal has the highest possible breakdown point. A discussion of the problem of outliers in the analysis of variance is to be found in Hampel et al. [18]. He, Jurečková, Koenker and Portnoy [19], Bradu [4] and Ellis and Morgenthaler [14] consider the breakdown behaviour of the $L_1$-functional for fixed regressors. Their work leads to necessary and sufficient conditions for a subset of the regressors to... |

1 |
Location of multiple outliers in two-way tables, using tetrads
- Bradu, Hawkins
- 1982
Citation Context ...2.7 or Corollary 2.8. They do not however find all unconditionally identifiable interaction patterns as defined below. More detailed information is given in Terbeck [33]. Bradu [3], Bradu and Hawkins [5], Gentleman and Wilk [15] and [16] have also considered the problem of identifying multiple outliers in the two-way analysis of variance. Hubert [23] has treated the corresponding problem for two-way c... |

1 |
Experimental Designs, 2nd edition
- Cochran, Cox
- 1957
Citation Context ...mple was also known to Daniel [7] and is of itself sufficient to discredit the usual definition of interaction. A practical example with the same structure is Table 5.6 on page 164 of Cochran and Cox [6]. It has been analysed by Daniel [7] Sections 4.3 and 4.7 and Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., Stahel, W.A. [18] Section 1.1d. We note that in the context of this paper there is no diff... |

1 |
Detecting outliers in a two-way table; I. Statistical behavior of residuals
- Gentleman, Wilk
- 1975
Citation Context ...ey do not however find all unconditionally identifiable interaction patterns as defined below. More detailed information is given in Terbeck [33]. Bradu [3], Bradu and Hawkins [5], Gentleman and Wilk [15] and [16] have also considered the problem of identifying multiple outliers in the two-way analysis of variance. Hubert [23] has treated the corresponding problem for two-way contingency tables. She sh... |

1 |
Detecting outliers in a two-way table; II. Supplementing the direct analysis of residuals
- Gentleman, Wilk
- 1975
Citation Context ... however find all unconditionally identifiable interaction patterns as defined below. More detailed information is given in Terbeck [33]. Bradu [3], Bradu and Hawkins [5], Gentleman and Wilk [15] and [16] have also considered the problem of identifying multiple outliers in the two-way analysis of variance. Hubert [23] has treated the corresponding problem for two-way contingency tables. She shows that ... |

1 |
Robustness: Where are we now?
- Huber
- 1995
Citation Context ...d. The linear model has the great advantage of simplicity and even if it is not an adequate model the residuals from a robust fit provide a good starting point for developing an improved model. Huber [22] states: "Embarrassingly, the robustification of the statistics of two-way tables is still wide open." We hope this paper reduces the embarrassment. 1.4 Contents In Section 2 we consider the no-noise mod... |

1 |
The breakdown value of the $L_1$ estimator in contingency tables. Statistics and Probability Letters 33
- Hubert
- 1996
Citation Context ... given in Terbeck [33]. Bradu [3], Bradu and Hawkins [5], Gentleman and Wilk [15] and [16] have also considered the problem of identifying multiple outliers in the two-way analysis of variance. Hubert [23] has treated the corresponding problem for two-way contingency tables. She shows that in this situation the $L_1$-functional has the highest possible breakdown point. A discussion of the problem of out... |

1 |
Ringversuche zur Bestimmung des Qualitätsstandards von Laboratorien, Seminar der Region Oesterreich-Schweiz der Internationalen Biometrischen Gesellschaft
- Lischer
- 1993
Citation Context ...h as error scale being dependent on the level of concentration. More work remains to be done on this topic and so we do not attempt to give a full analysis of the data below first analysed by Lischer [24]. It consists of ten samples of sewage sludge which were sent to 21 laboratories each of which had to report the lead concentration of each sample. The data are ... |

1 |
Hearing Levels of Adults, Table 4, p
- Roberts, Cohrssen
- 1968
Citation Context ...1 is a bias effect caused by the outliers. 5.2 Daniel's example The second example is taken from Daniel [9]. The data are the results of ear tests and were originally published by Roberts and Cohrssen [28]. They have also been analysed by Bradu [3] and by Bradu and Hawkins [5]. The rows correspond to sound frequencies and the columns represent different occupational groups. The data are ... |
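
Several of the citation contexts above refer to Tukey's median polish, and one lists "Iterate median polish 10 times" as the first step of the authors' numerical method. As a rough illustration only, the following sketch shows what such a step might look like for a table with one observation per cell. The function name `median_polish`, the `sweeps` parameter and the example table are assumptions made for this sketch; the later steps of the authors' method (robust scale, M-functional) are not reproduced.

```python
from statistics import median


def median_polish(table, sweeps=10):
    """Alternately remove row medians and column medians from the residuals.

    Returns (row_effects, col_effects, residuals) such that
    table[i][j] == row_effects[i] + col_effects[j] + residuals[i][j].
    """
    rows, cols = len(table), len(table[0])
    residuals = [list(row) for row in table]
    row_effects = [0] * rows
    col_effects = [0] * cols
    for _ in range(sweeps):
        # Row sweep: move each row's median into the row effect.
        for i in range(rows):
            m = median(residuals[i])
            row_effects[i] += m
            residuals[i] = [value - m for value in residuals[i]]
        # Column sweep: move each column's median into the column effect.
        for j in range(cols):
            m = median([residuals[i][j] for i in range(rows)])
            col_effects[j] += m
            for i in range(rows):
                residuals[i][j] -= m
    return row_effects, col_effects, residuals


# Hypothetical additive table (row effects 0, 1, 2; column effects 0, 10, 20)
# with one cell shifted by 100 to act as an interaction/outlier.
table = [[100, 10, 20],
         [1, 11, 21],
         [2, 12, 22]]

row_effects, col_effects, residuals = median_polish(table)
for row in residuals:
    print(row)
```

In this simple case the shifted cell is the only nonzero residual. As the contexts above note, median polish does not find every unconditionally identifiable interaction pattern, so this sketch should not be read as a substitute for the $L_1$-functional.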