PS1-34: Sensitivity of Patient-reported Physician Percentile Rankings to Inter-physician Variability and Patient Sample Size

  • Clinical Medicine & Research, September 2014, 12 (1-2) 108-109; DOI: https://doi.org/10.3121/cmr.2014.1250.ps1-34

Abstract

Background/Aims Patient satisfaction is increasingly being recognized as a desirable measure of physician quality and is used for quality-based financial incentives. Patient satisfaction surveys such as the CAHPS, however, typically exhibit ‘ceiling effects’ where most patients report maximal satisfaction, and so physicians are often ranked based on their percentage of maximum-satisfaction responses (“percentile top box scores,” 0–100%) rather than on raw scores. Even so, physicians express concern that low response rates or tight clustering of underlying scores can have unknown effects on rankings and detrimental consequences. This study used simulation to report the effect of inter-physician variability and sample size on survey-based physician rankings.

Methods Assuming 9 different underlying distributions of “true” physician scores (means of 73%, 88%, and 95% satisfaction, each with a standard deviation of log odds of 0.5, 1.0, or 1.5), we simulated 5,000 physicians and assigned each a true score and rank within these distributions. We then tested various patient sample sizes (N) from 10 to 100 and ran 1,000 simulations under each scenario to calculate 95% inner ranges of observed ratings and ranks for each physician. True and observed ranks were compared and examined as a function of underlying distribution and N.
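The simulation design described above can be illustrated with a minimal sketch. It is not the authors' code: the abstract does not state how true scores were drawn, how ties in observed scores were ranked, or which software was used. The sketch assumes that true scores are normally distributed on the log-odds scale around the log odds of the stated mean, that each patient response is an independent top-box/not-top-box draw, and that tied scores share a mid-rank percentile. All function and parameter names (`simulate_rank_ranges`, `percentile_ranks`, `mean_width`) are hypothetical.

```python
import math
import random
from bisect import bisect_left, bisect_right


def logit(p):
    """Convert a proportion to log odds."""
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Convert log odds back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def percentile_ranks(scores):
    """Mid-rank percentile (0-100) of each score; tied scores share a rank."""
    ordered = sorted(scores)
    n = len(scores)
    return [
        100.0 * (bisect_left(ordered, s) + bisect_right(ordered, s)) / (2.0 * n)
        for s in scores
    ]


def simulate_rank_ranges(mean_p, sd_log_odds, n_physicians, n_patients,
                         n_sims, seed=0):
    """Return (true_rank, low, high) per physician, where low and high
    bound the central 95% of that physician's observed percentile ranks."""
    rng = random.Random(seed)

    # Assumption: true scores are normal on the log-odds scale,
    # centered on the log odds of the stated mean satisfaction.
    center = logit(mean_p)
    true_p = [inv_logit(rng.gauss(center, sd_log_odds))
              for _ in range(n_physicians)]
    true_rank = percentile_ranks(true_p)

    observed = [[] for _ in range(n_physicians)]
    for _ in range(n_sims):
        # Each patient gives a top-box response with probability true_p.
        top_box = [
            sum(rng.random() < p for _ in range(n_patients)) / n_patients
            for p in true_p
        ]
        for i, rank in enumerate(percentile_ranks(top_box)):
            observed[i].append(rank)

    results = []
    for i in range(n_physicians):
        ranks = sorted(observed[i])
        low = ranks[int(0.025 * (n_sims - 1))]
        high = ranks[math.ceil(0.975 * (n_sims - 1))]
        results.append((true_rank[i], low, high))
    return results


def mean_width(results):
    """Average width of the 95% inner range of observed percentile rank."""
    return sum(high - low for _, low, high in results) / len(results)
```

Calling `simulate_rank_ranges(0.73, 1.5, n_physicians=100, n_patients=10, n_sims=100)` and then repeating the call with `n_patients=100` gives two sets of ranges whose `mean_width` values can be compared. The physician and simulation counts here are deliberately far smaller than the 5,000 physicians and 1,000 simulations reported in the abstract, so that the pure-Python loop runs quickly.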

Results The precision of an individual physician’s percentile rank increases dramatically with 3 factors: a larger N, greater variance in true physician scores, and a lower overall mean physician score. Precision is also greatest for the best and worst physicians and lowest in the midrange. In the best-case scenario tested (mean 73%, log odds SD = 1.5, N = 100), physicians with an observed rank as low as the 68th percentile were likely to be equivalent to physicians with an observed rank above the 90th percentile. In the worst-case scenario (mean 95%, log odds SD = 0.5, N = 10), physicians with a “true” 90th percentile rank were likely to receive a ranking as low as the 18th percentile.

Conclusions If physician rankings based on patient satisfaction scores are used to measure and incentivize quality, rankings should either be based on a measure with maximal variation among physicians, or incentives should be based on ranges that reflect uncertainty rather than absolute rankings.
