Sample Size For Interobserver Agreement

Studies that measure interobserver agreement (reliability) are common in clinical practice, yet discussion of appropriate sample size estimation methods is minimal compared with that for clinical trials. The authors propose a sample size estimation technique that achieves a prespecified lower and upper limit for a confidence interval for the kappa (κ) coefficient in studies of interobserver agreement. The proposed technique can be used to design a study that measures agreement among any number of raters over any number of outcome categories. Potential application areas include pathology, psychiatry, dentistry, and physiotherapy. The method should be useful in the planning stages of an interobserver agreement study in which the investigator wishes to obtain a prespecified level of precision in the estimate of κ. An R package, kappaSize (R: R Foundation for Statistical Computing, Vienna, Austria), is also available to implement the method. The technique is illustrated with two examples. The first is a pilot study in oral radiology whose authors examined the reliability of the mandibular cortical index as measured by three dentists. The second examines the level of interobserver agreement among four nurses with respect to the five triage levels used in the Canadian Triage and Acuity Scale.
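
To make concrete the kind of quantity such a study estimates, the following is a minimal Python sketch of Fleiss' kappa, one common agreement statistic for several raters and several categories. It is an illustration only: the text above does not say which kappa estimator the authors use, the sketch does not perform their sample size calculation, and it does not reproduce the kappaSize interface. The function name `fleiss_kappa` and the ratings shown are hypothetical.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a subjects-by-categories table of rating counts.

    counts[i][j] is the number of raters who placed subject i in category j.
    Every subject must be rated by the same number of raters (at least 2).
    """
    if not counts:
        raise ValueError("at least one subject is required")
    n_subjects = len(counts)
    n_raters = sum(counts[0])
    n_categories = len(counts[0])
    if n_raters < 2:
        raise ValueError("at least two raters are required")
    if any(len(row) != n_categories or sum(row) != n_raters for row in counts):
        raise ValueError("each subject needs the same categories and rater count")

    # Share of all ratings that fall in each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_subjects * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement per subject: fraction of rater pairs that agree.
    p_subj = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_obs = sum(p_subj) / n_subjects       # mean observed agreement
    p_exp = sum(p * p for p in p_cat)      # agreement expected by chance
    if p_exp == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_obs - p_exp) / (1.0 - p_exp)


# Hypothetical data: 4 subjects, 3 raters, 3 categories (not from the study).
ratings = [
    [3, 0, 0],
    [0, 3, 0],
    [1, 2, 0],
    [0, 1, 2],
]
print(round(fleiss_kappa(ratings), 3))
```

A confidence-interval approach to sample size, as described above, asks how many subjects are needed so that an interval around an estimate of this kind is no wider than the limits fixed in advance.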