For the comparison of the two agreement measures, the point estimates for all variables of interest did not differ significantly between Fleiss' K and Krippendorff's alpha, regardless of the observed agreement or the number of categories (Table 2). As our simulation study shows, the confidence intervals for Fleiss' K were narrower with the asymptotic approach than with the bootstrap approach. The relative difference between the two approaches became smaller when the observed agreement was weak. There was no significant difference between the bootstrap confidence intervals for Fleiss' K and Krippendorff's alpha.
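
To make the comparison concrete, the following Python sketch shows one way to compute both coefficients together with percentile-bootstrap confidence intervals obtained by resampling subjects. It is not the code used in the study: the data are simulated, and it assumes the third-party statsmodels and krippendorff packages are available.

```python
# Sketch: point estimates and bootstrap CIs for Fleiss' K and Krippendorff's alpha.
# Assumes the third-party packages `statsmodels` and `krippendorff` (PyPI);
# the toy data below are illustrative, not the study data.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa
import krippendorff

rng = np.random.default_rng(1)
ratings = rng.integers(0, 2, size=(50, 4))      # 50 subjects rated by 4 raters, 2 categories

def fleiss_k(subj_by_rater):
    counts, _ = aggregate_raters(subj_by_rater)  # subjects x categories count table
    return fleiss_kappa(counts, method="fleiss")

def kripp_alpha(subj_by_rater):
    # the krippendorff package expects raters in rows and units (subjects) in columns
    return krippendorff.alpha(reliability_data=subj_by_rater.T,
                              level_of_measurement="nominal")

def bootstrap_ci(stat, subj_by_rater, n_boot=1000, level=0.95):
    # Percentile bootstrap: resample subjects with replacement
    n = subj_by_rater.shape[0]
    reps = [stat(subj_by_rater[rng.integers(0, n, size=n)]) for _ in range(n_boot)]
    lo, hi = np.quantile(reps, [(1 - level) / 2, 1 - (1 - level) / 2])
    return lo, hi

print("Fleiss K:", fleiss_k(ratings), bootstrap_ci(fleiss_k, ratings))
print("Krippendorff alpha:", kripp_alpha(ratings), bootstrap_ci(kripp_alpha, ratings))
```

The asymptotic interval for Fleiss' K discussed above is not reproduced here; statsmodels' fleiss_kappa returns only the point estimate, so the sketch restricts itself to the bootstrap approach for both coefficients.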

From a technical point of view, our conclusions apply only to the simulation scenarios examined, although we varied these very widely and generally. We did not specifically examine whether the results of this study can be transferred to the assessment of intra-rater agreement, but we are convinced that they are also valid for this field of application of Krippendorff's alpha and Fleiss' K, as there is no systematic difference in how the parameters are assessed. In addition, the simulation results for the analysis of missing data are only valid under the MCAR (missing completely at random) condition, as we did not examine scenarios in which values were missing at random or missing not at random. However, in many real-world reliability studies the MCAR assumption may hold, because values are in fact missing completely at random, for example because each subject is rated by only a random subset of raters due to temporal, ethical, or technical constraints. For the variables with more than two categories, we also assessed how using an ordinal instead of a nominal scale affects the estimated reliability. Since Fleiss' K does not offer the possibility of ordinal scaling, we performed this analysis only for Krippendorff's alpha. The alpha estimates increased by 15-50% when an ordinal scale was used instead of a nominal scale. However, using an ordinal scale provides correct estimates of alpha for these variables, as the data are in fact ordinally scaled.
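
As a hypothetical illustration of this scale effect, and of MCAR-style missing values, the sketch below computes Krippendorff's alpha for the same simulated ordinal ratings once at the nominal and once at the ordinal level of measurement. It again assumes the krippendorff package; the data, the error model, and the 20% deletion rate are invented purely for illustration.

```python
# Sketch: effect of the measurement scale on Krippendorff's alpha, with MCAR-style
# missing values. Assumes the `krippendorff` package; data are illustrative only.
import numpy as np
import krippendorff

rng = np.random.default_rng(2)
truth = rng.integers(0, 4, size=100)                        # 4 ordered categories per subject
# 3 raters whose errors land in adjacent categories, so disagreements are "close"
raters = np.clip(truth + rng.integers(-1, 2, size=(3, 100)), 0, 3).astype(float)

# Delete ~20% of ratings completely at random (MCAR): every cell has the same
# deletion probability, independent of subject, rater, and value.
raters[rng.random(raters.shape) < 0.2] = np.nan

for level in ("nominal", "ordinal"):
    a = krippendorff.alpha(reliability_data=raters, level_of_measurement=level)
    print(level, round(float(a), 3))
```

Because the simulated disagreements fall into neighbouring categories, the ordinal alpha is typically higher than the nominal alpha, in line with the direction of the 15-50% increase reported above.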

Here we obtained point estimates from 0.70 (HER2 score) to 0.88 (estrogen receptor status), indicating substantial agreement among the raters. Kappa is an index that relates the observed agreement to a baseline agreement. However, researchers should carefully consider whether kappa's baseline agreement is relevant to the particular research question. Kappa's baseline is often described as chance agreement, which is only partially correct. Kappa's baseline agreement is the agreement that would be expected from random allocation, given the quantities specified by the marginal totals of the square contingency table. Thus, kappa = 0 if the observed allocation is apparently random, regardless of the quantity disagreement implied by the marginal totals. However, for many applications, researchers should be more interested in the quantity disagreement between the marginal totals than in the allocation information described by the off-diagonal cells of the square contingency table. Therefore, kappa's baseline is more distracting than informative for many applications. Consider, for example, the interrater agreement study on a risk-of-bias assessment tool for prevalence studies by Hoy et al. (J Clin Epidemiol. 2012;65(9):934–9).
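
A small numerical sketch may make the baseline issue clearer. In the two-rater (Cohen-type) case, with an invented 2x2 table whose cells equal the product of the marginals, the observed allocation is "random" in kappa's sense, so kappa is zero even though the two raters' marginal distributions, and hence the quantity disagreement, differ considerably.

```python
# Sketch of kappa's baseline: expected agreement under random allocation given the
# observed marginal totals of a square (2x2) contingency table. Numbers are invented
# purely to illustrate the point made above.
import numpy as np

# Joint proportions for rater A (rows) vs rater B (columns), chosen so that each cell
# equals the product of the marginals, i.e. the allocation looks random.
p_a = np.array([0.8, 0.2])            # rater A's marginal distribution
p_b = np.array([0.5, 0.5])            # rater B's marginal distribution
table = np.outer(p_a, p_b)            # independence: "random" allocation

p_o = np.trace(table)                          # observed agreement (0.5)
p_e = float(p_a @ p_b)                         # baseline agreement from the marginals (0.5)
kappa = (p_o - p_e) / (1 - p_e)                # 0.0, despite the marginal imbalance
quantity_disagreement = 0.5 * np.abs(p_a - p_b).sum()   # 0.3

print(kappa, quantity_disagreement)
```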

Another approach to reliability assessment, instead of using agreement coefficients, would be to model the association pattern between the observers' ratings. Three classes of models can be used for this: latent class models, simple quasi-symmetric agreement models, and mixture models (e.g., [37, 38]). However, these modelling approaches require a higher level of statistical expertise, so for most users it is considerably easier to estimate and, in particular, to interpret agreement coefficients. Here, reporting the quantity and allocation disagreement separately is informative, whereas kappa obscures this information. In addition, kappa poses some challenges in calculation and interpretation because it is a ratio: the kappa ratio can take an undefined value due to a zero denominator, and a ratio does not reveal its numerator or denominator.
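
The degenerate case is easy to reproduce: if both raters place every subject in the same single category, the baseline agreement equals 1, the denominator 1 - p_e is zero, and kappa is undefined despite perfect observed agreement. The following sketch uses invented values purely to illustrate this.

```python
# Sketch of the ratio problem noted above: if the baseline agreement p_e equals 1,
# kappa's denominator (1 - p_e) is zero and the coefficient is undefined, even though
# the observed agreement is perfect. Values are illustrative.
import numpy as np

table = np.array([[1.0, 0.0],          # both raters assign every subject to category 1
                  [0.0, 0.0]])
p_o = np.trace(table)                                   # 1.0: perfect observed agreement
p_e = table.sum(axis=1) @ table.sum(axis=0)             # 1.0: baseline from the marginals

with np.errstate(invalid="ignore"):
    kappa = np.divide(p_o - p_e, 1.0 - p_e)             # 0/0 -> nan: the ratio is undefined

print(p_o, p_e, kappa)                                  # 1.0 1.0 nan
```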