Original Article
Inter-rater reliability in performance status assessment among health care professionals: a systematic review
Abstract
Background: Studies have reported that performance status (PS) is a good prognostic indicator in patients with advanced cancer. However, different health care professionals (HCPs) may grade PS differently. The purpose of this review is to examine agreement between PS scores assigned by different HCPs, as reported in the literature.
Methods: A literature search was conducted in Ovid MEDLINE and OLDMEDLINE (1946 to July 5, 2015), Embase Classic and Embase (1947 to 2015 Week 26), and the Cochrane Central Register of Controlled Trials (up to May 2015). The primary information of interest was whether PS assessments differed between HCPs. Statistical measures of agreement, such as Cohen’s kappa coefficient, Krippendorff’s alpha coefficient, the Spearman rank correlation coefficient, and Kendall’s correlation coefficient, were also noted.
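As an illustration only (not part of the review protocol), the sketch below shows how Cohen’s kappa, the most commonly reported statistic among the measures above, quantifies chance-corrected agreement between two raters. The rater names and ECOG PS ratings are hypothetical.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: chance-corrected agreement between two raters.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    proportion of agreement and p_e is the agreement expected by
    chance from each rater's marginal rating frequencies.
    """
    assert len(rater_a) == len(rater_b)
    n = len(rater_a)

    # Observed agreement: fraction of patients given the same score by both raters.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement, computed from each rater's marginal rating frequencies.
    freq_a = Counter(rater_a)
    freq_b = Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    p_e = sum((freq_a[c] / n) * (freq_b[c] / n) for c in categories)

    return (p_o - p_e) / (1 - p_e)

# Hypothetical ECOG PS ratings (0-4) for ten patients, scored by two HCPs.
oncologist = [0, 1, 1, 2, 2, 3, 1, 0, 2, 4]
nurse      = [0, 1, 2, 2, 3, 3, 1, 1, 2, 4]

print(f"kappa = {cohens_kappa(oncologist, nurse):.2f}")  # ~0.61 for these data
```

In this hypothetical example the two raters agree on 7 of 10 patients, but because some of that agreement is expected by chance, kappa is lower than the raw agreement proportion.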
Results: Fifteen articles were included. Eleven compared PS assessments between HCPs of different disciplines, one between an attending and a resident physician, two between similarly specialized physicians, and one between two physicians of unspecified specialty. Three studies reported a lack of agreement (kappa = 0.19–0.26; Krippendorff’s alpha = 0.61–0.63), four reported moderate inter-rater reliability (kappa = 0.31–0.72), two reported mixed reliability, and six reported strong reliability (kappa = 0.91–0.92; Spearman rank correlation = 0.6–1.0; Kendall’s correlation = 0.75–0.82). Four studies reported that the Karnofsky Performance Status (KPS) had better inter-rater reliability than both the Eastern Cooperative Oncology Group Performance Status (ECOG PS) and the Palliative Performance Scale (PPS).
Conclusions: The existing literature reports both good and poor inter-rater reliability of PS scores. It is difficult to conclude which HCPs’ PS assessments are the most accurate.