Interobserver Variability in Applying a Radiographic Definition for ARDS: ALI-ARDS screening

There was more agreement on the analog radiographs than on those taken with computed radiography. There are several possible explanations for this unexpected observation. All of the analog radiographs were selected from patients who were actually enrolled in a clinical trial and therefore reflect only a subset of the patients considered for inclusion. Thus, difficult radiographs may be underrepresented. If this is true, then our κ statistic estimate of 0.55 overestimates the agreement that would be seen in a sample of radiographs evaluated in ALI-ARDS screening. It is possible that the digital radiographs were of poorer quality than the analog radiographs or that the smaller size of the digital radiographs made them more difficult to interpret. Participants may have lacked experience in evaluating digital radiographs. However, the seven readers from the ARDS Network reviewed digital radiographs from patients with ALI-ARDS at several planning sessions, and their agreement, as a group, on radiograph interpretations was no different from that of the other readers. It is not possible to distinguish among these possibilities without studying the interpretation of digital and analog radiographs from the same patients.
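For readers unfamiliar with the kappa statistic cited above, the following is a minimal sketch of how a chance-corrected agreement measure for several readers can be computed. It uses Fleiss' formulation as an assumption; the text here does not specify which multi-reader variant was used. The function name `fleiss_kappa` and the example counts are hypothetical illustrations, not study data.

```python
def fleiss_kappa(counts):
    """Chance-corrected agreement among multiple readers (Fleiss' kappa).

    counts: one row per radiograph; each row gives the number of readers
    who chose each category, e.g. [meets_definition, does_not_meet].
    Every radiograph must be rated by the same number of readers.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    n_categories = len(counts[0])
    # Overall proportion of all ratings that fall in each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Per-item agreement: fraction of reader pairs that agree on that item.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings are in one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical illustration: 3 readers, 4 radiographs, complete agreement.
perfect = [[3, 0], [0, 3], [3, 0], [0, 3]]
# Hypothetical illustration: 3 readers split 2-to-1 on both radiographs.
split = [[2, 1], [1, 2]]

kappa_perfect = fleiss_kappa(perfect)
kappa_split = fleiss_kappa(split)
```

Because kappa compares observed agreement with the agreement expected by chance, a sample that omits difficult radiographs tends to raise the observed agreement and, with it, the estimate.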

There are several potential limitations to this study. We did not study the readers’ accuracy in diagnosing ALI-ARDS using the entire AECC definition. This would have required a presentation of clinical data including onset, history, physical examination, and laboratory tests in the form of a vignette. Had we used vignettes with this information, the resulting agreement on the diagnosis of ALI-ARDS would have reflected our skills, or lack thereof, in abstracting clinical information and writing vignettes. Because we were interested specifically in the readers’ agreement with each other in applying the radiographic definition, and because there is no “gold standard” with which to compare the readers’ decisions, accuracy was not particularly relevant. The question was not whether the readers were right or wrong in some objective sense, but whether they agreed with each other in applying a standard radiographic definition.

