Results 1 – 5 of 5
Can One Use Cohen’s Kappa to Examine Disagreement
Methodology, 2005
Abstract

Cited by 3 (2 self)
Abstract. This research discusses the use of Cohen’s κ (kappa), Brennan and Prediger’s κn, and the coefficient of raw agreement for the examination of disagreement. Three scenarios are considered. The first involves all disagreement cells in a rater × rater cross-tabulation. The second involves one of the triangles of disagreement cells. The third involves the cells that indicate disagreement by one (ordinal) scale unit. For each of these three scenarios, coefficients of disagreement in the form of κ equivalents are derived. The behavior of the coefficients of disagreement in the three situations is studied. The first and the third case pose no particular problem. The κ equivalents and the other coefficients can be interpreted as usual. In the second case, problems arise such that the range of disagreement κs is restricted because the tables are incomplete. Thus, the standard log-frequency model of rater independence is no longer applicable. When the more general models of quasi-independence are used, negative degrees of freedom can result for smaller tables. Simulation results illustrate the characteristics of the coefficients of disagreement for each of the three scenarios. Empirical data examples are given. Cohen’s (1960) κ (kappa) is the most popular coefficient of rater agreement. κ indicates the degree to which two raters agree beyond chance. It is a summary measure of agreement in a rater × rater cross-classification. Researchers often ask additional questions concerning
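To make the coefficients named in the abstract concrete, here is a minimal sketch computing raw agreement, Cohen’s κ, and Brennan and Prediger’s κn from a rater × rater cross-tabulation. The 3×3 table of counts is invented for illustration; the formulas are the standard ones (κn replaces the chance term with 1/k for k categories).

```python
import numpy as np

# Hypothetical 3x3 rater-by-rater cross-tabulation (cell counts);
# rows = rater A's category, columns = rater B's category.
table = np.array([
    [20,  5,  1],
    [ 4, 15,  3],
    [ 2,  3, 12],
])

n = table.sum()
p_obs = np.trace(table) / n              # coefficient of raw agreement
row = table.sum(axis=1) / n              # rater A's marginal proportions
col = table.sum(axis=0) / n              # rater B's marginal proportions
p_chance = (row * col).sum()             # agreement expected by chance

# Cohen's kappa: agreement beyond chance, scaled by its maximum.
kappa = (p_obs - p_chance) / (1 - p_chance)

# Brennan and Prediger's kappa_n: chance agreement fixed at 1/k.
k = table.shape[0]
kappa_n = (p_obs - 1 / k) / (1 - 1 / k)
```

With these invented counts, κ and κn differ only modestly because the marginals are fairly uniform; the two coefficients diverge more as the marginal distributions become skewed.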
Many-Facet Rasch Measurement
Abstract

Cited by 1 (0 self)
This chapter provides an introductory overview of many-facet Rasch measurement (MFRM). Broadly speaking, MFRM refers to a class of measurement models that extend the basic Rasch model by incorporating more variables (or facets) than the two that are typically included in a test (i.e., examinees and items), such as raters, scoring criteria, and tasks. Throughout the chapter, a sample of rating data taken from a writing performance assessment is used to illustrate the rationale of the MFRM approach and to describe the general methodological steps typically involved. These steps refer to identifying facets that are likely to be relevant in a particular assessment context, specifying a measurement model that is suited to incorporate each of these facets, and applying the model in order to account for each facet in the best possible way. The chapter focuses on the rater facet and on ways to deal with the perennial problem of rater variability. More specifically, the MFRM analysis of the sample data shows how to measure the severity (or leniency) of raters, to assess the degree of rater consistency, to correct examinee scores for rater severity differences, to examine the functioning of the rating scale, and to detect potential interactions between facets. Relevant statistical indicators are successively introduced as the sample data analysis proceeds. The final section deals with issues concerning the choice of an
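As a sketch of the model class the abstract describes, the following implements the category probabilities of a rating-scale many-facet Rasch model with three facets (examinee ability θ, item difficulty β, rater severity α) and shared thresholds τ. The function name and the particular parameter values are illustrative assumptions, not taken from the chapter; the log-odds structure (θ − β − α − τk for adjacent categories) is the standard rating-scale formulation.

```python
import math

def mfrm_category_probs(theta, beta, alpha, taus):
    """Probability of each score category 0..m under a rating-scale
    many-facet Rasch model: the log-odds of category k over k-1 is
    theta - beta - alpha - taus[k-1]."""
    # Cumulative logits for categories 0..m (category 0 anchored at 0).
    logits = [0.0]
    for tau in taus:
        logits.append(logits[-1] + (theta - beta - alpha - tau))
    exps = [math.exp(lgt) for lgt in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative values: an able examinee rated by a moderately severe rater.
probs = mfrm_category_probs(theta=1.0, beta=0.0, alpha=0.5,
                            taus=[-1.0, 0.0, 1.0])
```

This directly mirrors the rater-severity correction the abstract mentions: increasing α shifts probability mass toward lower categories, so estimating α per rater lets the model adjust examinee scores for differences in rater severity.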
Reference Supplement to the Manual for Relating Language Examinations to the Common European Framework of Reference for Languages: Learning, Teaching, Assessment, 2009
Examination Using Benchmark Essays
"... Jonathan R. Manalo is an assistant research scientist at ..."