|dc.description.abstract||CONTEXT: Multi-source feedback (MSF) provides a window into complex areas of performance in real workplace settings. However, because MSF elicits subjective judgements, many respondents are needed to achieve a reliable assessment. Optimising the consistency with which questions are interpreted should therefore improve reliability.
METHODS: We compared two parallel forms of an MSF instrument with identical wording and administration procedures. The original instrument contained 10 compound performance items and was used 12,540 times to assess 977 doctors, including 112 general practitioners (GPs). The modified instrument contained the same wording in 21 non-compound items, each of which asked about a single aspect of performance, and was used 2,789 times to assess 205 doctors, all of whom were GPs. Generalisability analysis evaluated questionnaire reliability. The reliability of the original instrument was evaluated for both the whole group and the GP subgroup.
RESULTS: The two instruments provided similar numbers of responses per doctor, but the modified instrument generated more reliable scores. The whole-group comparison examined precision, measured as the standard error of measurement (SEM): seven respondents were sufficient to achieve a 95% confidence interval of 0.25 (on a 4-point scale) with the modified instrument, compared with 10 respondents using the original instrument. The subgroup comparison examined the generalisability coefficient: 15 responses provided a reliability of 0.72 using the modified instrument versus 0.58 using the original instrument.
CONCLUSIONS: Non-compound questions improved the consistency of scores. We recommend that compound questions be avoided in assessment instrument design.||en