Orthopedics This Week

The False Promise of Physician Quality Stats

July 5, 2017

Jason Shafrin, writing in his blog Healthcare Economist on June 20, 2017, called attention to an unexpected problem with physician quality statistics: the scores may not be doing a good job of measuring care quality.

Physician quality scores are usually based on single composite statistical measures. Medicare’s Value-based Payment Modifier, Shafrin points out, boils several individual quality metrics down to a single quality score.

Statisticians are quick to spot the flaw in this methodology.

Writing in the journal Health Services Research, authors Martsolf, Carle, and Scanlon (2017) found that a single global measure of quality did not accurately predict quality care.

The researchers used health insurance claims data (October 2007 to 2010) for 134 physician practices in Seattle, Washington. They applied confirmatory and exploratory factor analysis to develop theory-driven and empirically driven, internally valid composite measures based on 19 quality indicators.

Martsolf et al. found that their results did not support a single global measure built from the entire set of quality indicators. They did, however, identify an acceptable multidimensional model (RMSEA = 0.059; CFI = 0.934; TLI = 0.910). The model's four dimensions were diabetes, depression, preventive care, and generic drug prescribing.
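To see why a single global measure can fail while a multidimensional one fits, here is a minimal sketch on simulated data. It is not the authors' analysis: the latent dimensions, loadings, and noise level are all assumptions, and the eigenvalues of the correlation matrix stand in as a rough check for the confirmatory factor analysis used in the study. Only the count of 134 practices comes from the article.

```python
import numpy as np

rng = np.random.default_rng(0)
n_practices = 134  # same count as the study; every value below is simulated

# Two independent latent quality dimensions (hypothetical),
# e.g. diabetes care and preventive care.
latent = rng.standard_normal((n_practices, 2))

# Three noisy indicators per dimension; loadings are assumed, not estimated.
loadings = np.array([
    [0.8, 0.0],
    [0.8, 0.0],
    [0.8, 0.0],
    [0.0, 0.8],
    [0.0, 0.8],
    [0.0, 0.8],
])
noise = 0.6 * rng.standard_normal((n_practices, 6))
indicators = latent @ loadings.T + noise

# Correlation matrix of the six indicators and its eigenvalues, largest first.
corr = np.corrcoef(indicators, rowvar=False)
eigenvalues = np.linalg.eigvalsh(corr)[::-1]

# Share of total variance that one global dimension could account for.
share_first = eigenvalues[0] / eigenvalues.sum()

print("eigenvalues:", np.round(eigenvalues, 2))
print("share explained by first dimension:", round(float(share_first), 2))
```

When the indicators really come from two unrelated dimensions, a second eigenvalue well above 1 remains after the first, which is one informal sign that a single score would discard real structure.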

Martsolf et al. Conclusions

Finally, the authors concluded that while commonly used process indicators can be used to create a small set of useful composite measures, the lack of an internally valid single unidimensional global measure has important implications for policy approaches meant to improve quality by rewarding “high-quality physicians.”

Real World Ramifications

As Shafrin noted in his blog, the global composite physician quality measures that are increasingly used to pay physicians are not without risk.

What are those risks?

Shafrin explains with this simple example: “Physician A could be excellent at diagnosing a condition but poor at treatment and Physician B could be excellent at treatment but poor at diagnosis. If this information were known to patients, and all patients went to Physician A for diagnosis and Physician B for treatment, they would both be excellent at treating the patients they do even though a composite score could rank both physicians as average. This example captures cases where quality is multidimensional. Quality metrics also must be reliable as well and accurately capture underlying physician quality when measured across a reasonable sample size of patients.”
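Shafrin's example can be put into numbers. The sketch below uses made-up scores on a 0 to 100 scale and an unweighted mean as the composite; both are assumptions for illustration, and real programs such as the Value-based Payment Modifier combine their metrics differently.

```python
# Hypothetical scores on a 0-100 scale; none of these numbers come from the article.
scores = {
    "Physician A": {"diagnosis": 95, "treatment": 45},
    "Physician B": {"diagnosis": 45, "treatment": 95},
    "Physician C": {"diagnosis": 70, "treatment": 70},
}


def composite(dimension_scores):
    """Unweighted mean across dimensions, one simple way to form a composite."""
    return sum(dimension_scores.values()) / len(dimension_scores)


# One number per physician: the specialists and the generalist look identical.
composites = {name: composite(s) for name, s in scores.items()}

# Per-dimension view: who is actually best at each task.
best_by_dimension = {
    dim: max(scores, key=lambda name: scores[name][dim])
    for dim in ("diagnosis", "treatment")
}

for name, value in composites.items():
    print(name, "composite:", value)
print("best by dimension:", best_by_dimension)
```

All three physicians receive the same composite even though A and B each outperform C on one dimension, which is the information a patient choosing between them would want.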

In short, as Shafrin writes, “When indicators measuring unrelated constructs are included in a single score, the high score on some indicators could ‘hide’ low scores on other indicators or vice versa. In this case, the composite measure does not provide a clear quality signal. Inclusion of invalid composite measures could actually hurt quality reporting by leading to physician practice misclassification.”

And, finally, Shafrin concludes: “When multiple indicators measuring distinct aspects of quality are inappropriately combined into a single measure, the resulting composite measure is not useful or even completely uninterpretable.”

Yup.

A publication of RRY Publications, LLC
