Background

The GAMUT QI Collaborative reports show how participant programs perform on individual quality metrics compared to other participants. The desired level of performance for each metric is referred to as a ‘benchmark’, which can be used as a target or goal for quality improvement initiatives and to demonstrate that a program is continuously providing high-quality care.

There are a number of approaches to calculating benchmarks in healthcare, such as “significantly better than average”, “Leading 10%”, or expert consensus (i.e., predetermined targets such as 100% or 0%). While these benchmark methods are intuitive and relatively simple to implement, they have limitations. “Significantly better than average” does not establish the best possible performance level. The “Leading 10%” method is most suitable when the group sizes are fairly similar. Expert consensus may be appropriate for some metrics but not for others where it is unclear what the best performance level should be.

The GAMUT QI Collaborative uses the Achievable Benchmarks of Care™ (ABC) method (Kiefe et al., 1998). The ABC™ method establishes the performance level consistently attained by the best-performing participants that together account for at least 10% of the overall population.

The ABC™ is designed to compare performance between groups of varying sizes, as is the case with the GAMUT QI Collaborative, which includes small and large programs. It has been used to establish quality performance benchmarks for hospitals (Parikh et al., 2014), surgeons (Hatfield, Ashton, Bass, & Shirkey, 2016), and other healthcare providers (Gardner, Taylor, & Gordon, 2014).

How it works

Let’s compare the “Leading 10%” and the ABC™ methods. Table 1 shows data from a hypothetical list of 16 transport programs of varying sizes. Some programs have contributed data for a full 12 months and others have not.

Table 1: Simulated data for illustration purposes.
                           Metric*
Program          Months    Numerator   Denominator   Rate
Endeavor Air     12        670         1000          0.67
American         12        704         800           0.88
Alaska           12        249         300           0.83
JetBlue          12        469         700           0.67
Delta            12        196         200           0.98
ExpressJet       9         750         820           0.915
Frontier         9         472         500           0.944
AirTran          9         300         320           0.938
Hawaiian         9         78          100           0.78
Envoy            6         193         200           0.965
SkyWest          6         125         130           0.962
United           6         55          80            0.688
US               6         35          40            0.875
Virgin America   2         29          30            0.967
Southwest        1         12          12            1
Mesa             1         5           5             1
* The sums of all months with valid data.

Leading 10%

Also referred to as ‘Top Decile’, this method simply sorts the programs by rate, from highest to lowest. The benchmark value is interpolated as the 90th percentile of the rates.

Table 2: Programs ranked by rate
                           Metric
Program          Months    Numerator   Denominator   Rate
Southwest        1         12          12            1
Mesa             1         5           5             1
Delta            12        196         200           0.98
Virgin America   2         29          30            0.967
Envoy            6         193         200           0.965
SkyWest          6         125         130           0.962
Frontier         9         472         500           0.944
AirTran          9         300         320           0.938
ExpressJet       9         750         820           0.915
American         12        704         800           0.88
US               6         35          40            0.875
Alaska           12        249         300           0.83
Hawaiian         9         78          100           0.78
United           6         55          80            0.688
Endeavor Air     12        670         1000          0.67
JetBlue          12        469         700           0.67

In Table 2, the 90th percentile is interpolated to be 0.99. Note that this value is pulled up by two programs that each reported just one month of data and have relatively few eligible cases.
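The “Leading 10%” value can be reproduced with a short sketch. The text does not say which interpolation rule is used, so the linear rule below is an assumption; it does reproduce the 0.99 reported for the simulated data. The names `programs` and `percentile` are illustrative only.

```python
# Simulated (numerator, denominator) pairs copied from Table 1.
programs = {
    "Endeavor Air": (670, 1000), "American": (704, 800),
    "Alaska": (249, 300), "JetBlue": (469, 700),
    "Delta": (196, 200), "ExpressJet": (750, 820),
    "Frontier": (472, 500), "AirTran": (300, 320),
    "Hawaiian": (78, 100), "Envoy": (193, 200),
    "SkyWest": (125, 130), "United": (55, 80),
    "US": (35, 40), "Virgin America": (29, 30),
    "Southwest": (12, 12), "Mesa": (5, 5),
}


def percentile(values, q):
    """Percentile by linear interpolation between the two nearest ranks.

    Assumption: the source does not name its interpolation rule; this is
    one common choice.
    """
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


# Program size plays no role here: only the raw rates are ranked.
rates = [num / den for num, den in programs.values()]
leading_10 = percentile(rates, 0.90)
print(round(leading_10, 2))  # 0.99
```

With 16 programs, the 90th percentile falls halfway between the 14th and 15th ranked rates (0.98 and 1), so a program with only 5 or 12 eligible cases carries as much weight as one with 1000.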

ABC™ method

The ABC™ method is slightly more complicated, as it takes into account each program’s number of eligible cases and the overall population, with an adjustment for programs with low denominators.

Using the same hypothetical data, we first calculate the Adjusted Performance Fraction (APF), which adds 1 to the numerator and 2 to the denominator for each program.

\[APF = \frac{(num + 1)}{(den + 2)}\]
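To see what the adjustment does, the sketch below applies the formula to one small and one large program from Table 1. The function name `apf` is illustrative.

```python
def apf(numerator, denominator):
    """APF as defined above: add 1 to the numerator, 2 to the denominator."""
    return (numerator + 1) / (denominator + 2)


# Mesa: 5 of 5 eligible cases, raw rate 1.0
print(round(apf(5, 5), 3))      # 0.857
# Delta: 196 of 200 eligible cases, raw rate 0.98
print(round(apf(196, 200), 3))  # 0.975
```

The adjustment barely moves a program with a large denominator but pulls a perfect score based on a handful of cases well below 1, so the small program no longer outranks the large one.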

Next, the programs are sorted by APF in descending order, and the cumulative percentage of the total population is calculated. The top-ranked programs that together account for at least 10% of the overall population form the benchmark group. The ABC™ benchmark is the sum of the numerators divided by the sum of the denominators in the benchmark group.

Table 3: Programs ranked by APF
                           Metric                                    Population
Program          Months    Numerator   Denominator   Rate    APF†    Cumulative sum   Cumulative proportion
Benchmark group
Delta            12        196         200           0.98    0.975   200              0.038
Envoy            6         193         200           0.965   0.96    400              0.076
SkyWest          6         125         130           0.962   0.955   530              0.101

Frontier         9         472         500           0.944   0.942   1030             0.197
Virgin America   2         29          30            0.967   0.938   1060             0.202
AirTran          9         300         320           0.938   0.935   1380             0.264
Southwest        1         12          12            1       0.929   1392             0.266
ExpressJet       9         750         820           0.915   0.914   2212             0.422
American         12        704         800           0.88    0.879   3012             0.575
US               6         35          40            0.875   0.857   3052             0.583
Mesa             1         5           5             1       0.857   3057             0.584
Alaska           12        249         300           0.83    0.828   3357             0.641
Hawaiian         9         78          100           0.78    0.775   3457             0.66
United           6         55          80            0.688   0.683   3537             0.675
Endeavor Air     12        670         1000          0.67    0.67    4537             0.866
JetBlue          12        469         700           0.67    0.67    5237             1
† Used only for ranking purposes.

In Table 3, the total population is 5237. With 530 eligible cases combined, the top three programs are the smallest top-ranked group that accounts for at least 10% of the overall population (530/5237 ≈ 10.1%). The pooled rate among those programs is 0.97 (514/530), which becomes the ABC™ benchmark.
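The full calculation can be sketched in a few lines. This is a minimal illustration of the steps described above, applied to the simulated data, and not the code behind the GAMUT dashboards; the names `programs`, `abc_benchmark`, and `min_share` are assumptions made for the example.

```python
# Simulated (program, numerator, denominator) rows copied from Table 1.
programs = [
    ("Endeavor Air", 670, 1000), ("American", 704, 800),
    ("Alaska", 249, 300), ("JetBlue", 469, 700),
    ("Delta", 196, 200), ("ExpressJet", 750, 820),
    ("Frontier", 472, 500), ("AirTran", 300, 320),
    ("Hawaiian", 78, 100), ("Envoy", 193, 200),
    ("SkyWest", 125, 130), ("United", 55, 80),
    ("US", 35, 40), ("Virgin America", 29, 30),
    ("Southwest", 12, 12), ("Mesa", 5, 5),
]


def abc_benchmark(rows, min_share=0.10):
    """Return (benchmark group names, pooled benchmark rate)."""
    total = sum(den for _, _, den in rows)

    # Rank by APF, best first. The APF is used for ranking only.
    ranked = sorted(rows, key=lambda r: (r[1] + 1) / (r[2] + 2), reverse=True)

    # Take top-ranked programs until they cover min_share of the population.
    group, covered = [], 0
    for name, num, den in ranked:
        group.append((name, num, den))
        covered += den
        if covered >= min_share * total:
            break

    # Pool the raw counts of the benchmark group (not the average of rates).
    pooled_num = sum(num for _, num, _ in group)
    pooled_den = sum(den for _, _, den in group)
    return [name for name, _, _ in group], pooled_num / pooled_den


group, benchmark = abc_benchmark(programs)
print(group)                # ['Delta', 'Envoy', 'SkyWest']
print(round(benchmark, 3))  # 0.97
```

Because the benchmark pools raw numerators and denominators, larger programs in the benchmark group contribute proportionally more to the final value than smaller ones.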

For the GAMUT QI reports, the ABC™ benchmarks are calculated from the data available at the time the dashboards are viewed. They cover a rolling 12-month period that ends two months before the current date, which gives more programs time to submit their data before the benchmarks are calculated.
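As a rough illustration of the reporting window, the sketch below works at whole-month granularity. That granularity is an assumption, as are the names `shift_month`, `reporting_window`, `lag_months`, and `window_months`; the dashboards may define the boundaries differently.

```python
from datetime import date


def shift_month(year, month, offset):
    """Move a (year, month) pair by a number of months (offset may be negative)."""
    index = year * 12 + (month - 1) + offset
    return index // 12, index % 12 + 1


def reporting_window(today, lag_months=2, window_months=12):
    """Return ((start_year, start_month), (end_year, end_month)), inclusive.

    Assumption: the window ends in the calendar month that lies
    `lag_months` before the month of `today`.
    """
    end = shift_month(today.year, today.month, -lag_months)
    start = shift_month(end[0], end[1], -(window_months - 1))
    return start, end


# Viewed in June 2024, the window would run from May 2023 through April 2024.
print(reporting_window(date(2024, 6, 15)))  # ((2023, 5), (2024, 4))
```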

Resources

Gardner, T. B., Taylor, D. J., & Gordon, S. R. (2014). Reported findings on endoscopic ultrasound examinations for chronic pancreatitis: Toward establishing an endoscopic ultrasound quality benchmark. Pancreas, 43(1), 37–40. doi:10.1097/MPA.0b013e3182a85e1e

Hatfield, M. D., Ashton, C. M., Bass, B. L., & Shirkey, B. A. (2016). Surgeon-Specific Reports in General Surgery: Establishing Benchmarks for Peer Comparison Within a Single Hospital. Journal of the American College of Surgeons, 222(2), 113–121. doi:10.1016/j.jamcollsurg.2015.10.017

Kiefe, C. I., Weissman, N. W., Allison, J. J., Farmer, R., Weaver, M., & Williams, O. D. (1998). Identifying achievable benchmarks of care: Concepts and methodology. International Journal for Quality in Health Care, 10(5), 443–447. doi:10.1093/intqhc/10.5.443

Parikh, K., Hall, M., Mittal, V., Montalbano, A., Mussman, G. M., Morse, R. B., … Shah, S. S. (2014). Establishing Benchmarks for the Hospitalized Care of Children With Asthma, Bronchiolitis, and Pneumonia. Pediatrics, peds.2014–1052. doi:10.1542/peds.2014-1052


Creation of the GAMUT database was made possible through the generosity of the Air Medical Physicians Association and the American Academy of Pediatrics Section on Transport Medicine, and grants from the Medevac, Akron Children’s, Cincinnati Children’s and Laerdal foundations.