Tag Archives: Doug Bates

Analytic basics: Completeness and outlier episode flags

Analysts working with episode of care groupers for the first time often have questions about how to use the various value-added flags assigned to episodes. Episode of care groupers link together all of the claims that pertain to the treatment of a particular condition for a particular patient, to create a powerful unit of analysis. For example, a patient with a condition such as diabetes may receive multiple types of services from multiple providers and provider types for the treatment of their diabetes. An episode of care grouper will combine all of the individual claims from the different providers so that the full cost of treatment can be assessed.

Two of the value-added flags commonly assigned to episodes include completeness flags and outlier flags. Both of these flags enable analysts to filter out, or include, types of episodes to optimize their reporting. How to apply filters using these flags depends on the analysis being performed. A brief summary of these flags, and their use, is described below.

Completeness flags
Many episodes in a data set will not be complete, meaning there are still outstanding claims related to those episodes that were not available when the data were grouped. Typically, episodes that start toward the end of your grouping period are more likely to be incomplete. For instance, if you are grouping data incurred from January 2011 through December 2013 and an episode begins on December 19, 2013, there is a lower probability that all claims for this episode will be available in the data than if the episode had started in January 2013.

Episode groupers use different logic when assessing completeness for acute and chronic conditions. For acute conditions, most groupers determine that an episode is complete if there are no incurred professional claims for that condition for a predefined number of days. Chronic conditions such as diabetes are never cured, so technically those episodes never end, but in order to support analyses, chronic conditions are often divided into annual periods and may be defined as complete when a full year of data is available for the members with those episodes.
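The acute-condition rule described above can be sketched in a few lines. This is only an illustration, not any particular grouper's logic: the 60-day claim-free window and the function name are assumptions, and real groupers typically set the window separately for each episode class.

```python
from datetime import date

# Hypothetical value; real groupers set the claim-free window per episode class.
CLEAN_PERIOD_DAYS = 60


def is_acute_episode_complete(last_claim_date, data_end_date,
                              clean_period_days=CLEAN_PERIOD_DAYS):
    """True if the data extend at least clean_period_days past the last related claim."""
    return (data_end_date - last_claim_date).days >= clean_period_days


# Data grouped through December 2013, as in the example above.
data_end = date(2013, 12, 31)

# Last related claim on December 19, 2013: only 12 claim-free days are observable.
late_episode = is_acute_episode_complete(date(2013, 12, 19), data_end)

# Last related claim in January 2013: the claim-free window is fully observed.
early_episode = is_acute_episode_complete(date(2013, 1, 25), data_end)

print(late_episode, early_episode)
```

Under these assumptions the December episode is treated as incomplete and the January episode as complete.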

When comparing costs with benchmarks, exclude incomplete episodes, because the cost benchmarks themselves exclude incomplete episodes.

If you are comparing the average length (days) or average cost of episodes across various populations or provider groups, then you should also exclude incomplete episodes. Average cost per episode cannot be assessed accurately unless every claim for every episode is included.

If the purpose of your analysis is to evaluate the prevalence of episode conditions, then include all episodes (complete and incomplete) in your reports.

Table 1 displays the distribution of complete and incomplete diabetes episodes from a sample data set. The average cost for incomplete episodes is usually lower than the average cost for complete episodes.

Table 1

Outlier flags
Episodes that have atypically high or atypically low costs compared with other episodes within the same class are flagged as high or low outliers. There are multiple methodologies for defining outlier episodes, but commonly the flags are based on statistical variance (i.e., a number of standard deviations from the mean). In and of themselves, outlier flags are not a measure of efficiency or quality, but the magnitude of the variance in their cost indicates there is something atypical about these cases.
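As a rough illustration of the standard deviation approach, the sketch below flags episodes within a single episode class. The two standard deviation cutoff, the function name, and the cost values are all assumptions made for the example; actual groupers may use different thresholds or transform costs before applying them.

```python
from statistics import mean, stdev


def flag_outliers(costs, n_sd=2.0):
    """Label each cost 'high', 'low', or 'typical' relative to its episode class.

    n_sd is a hypothetical cutoff (number of standard deviations from the mean).
    """
    mu = mean(costs)
    sd = stdev(costs)
    flags = []
    for cost in costs:
        if cost > mu + n_sd * sd:
            flags.append("high")
        elif cost < mu - n_sd * sd:
            flags.append("low")
        else:
            flags.append("typical")
    return flags


# Made-up episode costs for one episode class.
episode_costs = [900, 1000, 1100, 950, 1050, 1000, 980, 1020, 9000, 1010]
flags = flag_outliers(episode_costs)
print(flags)
```

In this made-up sample only the 9,000 episode is flagged as a high outlier. Because one extreme value inflates both the mean and the standard deviation, a simple cutoff like this rarely flags low outliers in skewed cost data.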

When comparing with benchmarks, outliers should be excluded because most benchmarks will exclude outliers for consistency.

When comparing average costs across populations or provider groups, many analysts may choose to exclude all outliers, because a few outliers for a given group may skew their results. That being said, it is also important to assess if any given population of patients has significantly more episodes flagged as high outliers compared with others. A higher percentage of high outliers might warrant the need for further investigation.

For many episode classes, all that is needed to start an episode is a professional encounter with a primary diagnosis relevant to that episode class. In some cases, very short episodes may represent visits to rule out a specific diagnosis or other situations that don’t really represent full treatment for a condition. Excluding low outliers can help remove those types of episodes from your analysis.

Table 2 displays a sample of diabetes episodes by outlier status.

Table 2

Episode completeness and outlier flags can, of course, be used together. For most comparative analyses (to benchmarks or across populations), only complete non-outlier episodes are included. Table 3 displays the distribution of diabetes episodes when both flags are used as report dimensions.

Note that an analysis based solely on complete non-outlier episodes from these sample data would reduce the number of episodes from 57,193 to 34,250, removing 40% of the episodes from the analysis. When analyzing episode classes with a limited number of episodes, applying these filters may reduce your sample size to volumes that are too small to produce statistically significant results, so it is important to assess how many episodes are in your sample before you begin.
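The filtering step and the sample-size check can be sketched as follows. The field names, the five sample records, and the minimum of 30 episodes are hypothetical; only the totals 57,193 and 34,250 come from the sample data described above.

```python
# Hypothetical floor for a usable sample; choose one suited to your analysis.
MIN_EPISODES = 30


def keep_complete_non_outliers(episodes):
    """Keep only episodes flagged complete and not flagged as high or low outliers."""
    return [e for e in episodes if e["complete"] and e["outlier"] == "none"]


def share_removed(total, kept):
    """Fraction of episodes dropped by the filters."""
    return (total - kept) / total


# Made-up episode records with both flags attached.
sample = [
    {"episode_id": 1, "complete": True, "outlier": "none"},
    {"episode_id": 2, "complete": True, "outlier": "high"},
    {"episode_id": 3, "complete": False, "outlier": "none"},
    {"episode_id": 4, "complete": True, "outlier": "low"},
    {"episode_id": 5, "complete": True, "outlier": "none"},
]

kept = keep_complete_non_outliers(sample)
too_small = len(kept) < MIN_EPISODES

# Totals quoted in the text: 57,193 episodes reduced to 34,250.
removed_share = share_removed(57193, 34250)
print(len(kept), too_small, round(removed_share, 3))
```

Running the check before building reports makes it easier to spot episode classes where the filters leave too few episodes to support a comparison.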

Table 3

Episodes of care provide a useful unit of analysis for evaluating healthcare utilization and cost. The episode completeness and outlier flags allow users to include, or exclude, types of episodes to further refine their analysis.

This article first appeared at Milliman MedInsight.

Preventive care: Colonoscopy screening and comparing costs

According to the 2008 U.S. Preventive Services Task Force (USPSTF) recommendation statement on screening for colorectal cancer, colorectal cancer is the second-leading cause of cancer death in the United States and appropriate screening could save thousands of lives a year.

The USPSTF recommends colorectal screening for everyone between 50 and 75 years of age. There are several screening tests currently available and modeling conducted by the USPSTF suggests that any of three screening programs would be “equally effective in life-years gained, assuming 100% adherence to the same regimen for that period”:

1. Annual high-sensitivity fecal occult blood testing
2. Sigmoidoscopy every five years combined with high-sensitivity fecal occult blood testing every three years
3. Screening colonoscopy at intervals of 10 years

Other screening programs are less expensive and less invasive, and the effectiveness of colonoscopy depends on the experience and expertise of those performing the procedure. Even so, well-performed colonoscopies were assessed to have higher sensitivity and specificity for detecting colon cancer. This finding, along with increased public awareness and the no-cost-sharing mandates for colonoscopies under Medicare and the Patient Protection and Affordable Care Act (ACA), has greatly increased the number of colonoscopies performed each year.

As the number of colonoscopies performed has increased, so has the variance in total cost for the procedure. Allowed charges can vary by thousands of dollars depending on the provider, place of service, and other variables. Monitoring utilization and evaluating the charges for these procedures has become increasingly important for health plans striving to improve health while managing costs.

The 2013 version of the Milliman Health Cost Guidelines™ (HCG) grouper includes separate detail lines to track the utilization and cost of both the facility and the professional services associated with preventive colonoscopy.

Using illustrative data from three health plans, allowable charges and utilization counts for facility and professional services associated with preventive colonoscopy are shown in Figures 1 and 2 below. These data include claims for patients between the ages of 50 and 75. HCG 051b represents outpatient facility services and HCG P40b represents professional services for a preventive colonoscopy. Note that there could be related services submitted on separate claims that are not captured in these totals.

Figure 1

Utilization units are counted separately for the facility and professional services. The utilization count associated with professional services represents the total number of preventive colonoscopies because some procedures will be provided in an office setting and will not have a separate facility record.

To compare utilization rates and average allowable charges for preventive colonoscopies across the three plans, sum the allowable charges for both HCG detail lines but use only the professional unit counts to avoid double-counting of procedures, as shown in Figure 2.
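A minimal sketch of that calculation for a single plan is shown here. The dollar amounts, unit counts, member count, and field names are invented for illustration; the HCG line labels are the ones named in the text.

```python
# Hypothetical figures for one plan; HCG line labels are taken from the text.
plan_lines = [
    {"hcg": "051b", "service": "facility", "allowed": 600000.0, "units": 400},
    {"hcg": "P40b", "service": "professional", "allowed": 250000.0, "units": 500},
]
members = 20000  # hypothetical count of members aged 50 to 75


def summarize(lines, member_count):
    """Sum allowed charges over both lines, but count procedures from the
    professional line only, so that a procedure is not counted twice."""
    total_allowed = sum(line["allowed"] for line in lines)
    procedures = sum(line["units"] for line in lines
                     if line["service"] == "professional")
    return {
        "procedures": procedures,
        "allowed_per_procedure": total_allowed / procedures,
        "procedures_per_1000": procedures * 1000 / member_count,
    }


summary = summarize(plan_lines, members)
print(summary)
```

Adding the facility units to the professional units here would have produced 900 procedures and understated the average allowed charge per procedure.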

Figure 2

This simple analysis compares the cost and utilization of preventive colonoscopies across three plans, but additional analyses comparing costs across places of service (e.g., office, ambulatory surgery center, and outpatient hospital) provide further insights into cost drivers associated with these procedures.

This article first appeared at Milliman MedInsight.

Bundled payment claims analytics

As healthcare reform progresses, there will be increasing pressure on the healthcare system to reduce costs while improving the quality of care. In order to meet these new challenges, many experts are looking for opportunities to reduce the fragmentation of care in the current system by better aligning providers. If facility and professional providers share accountability for the total cost of care, not just for the component of care they provide, many believe this will lead to more standardized care, lower overall cost, and improved quality outcomes.

One current payment reform concept is bundled payments. Under a bundled payment, a single payment is made by a payor for the total cost of care for a defined episode of care. Most commonly, these episodes include a hospital admission and all relevant healthcare services provided 30, 60, or 90 days after the admission. The bundled payment represents the total reimbursement for all providers involved in the patient’s care during the defined episode. The various providers need to agree on how to split and share that single payment. This requires that all providers—acute care facilities, professional facilities, and rehab facilities—understand their own costs, as well as the costs of the other providers, to ensure adequate profitability. Medical complications increase costs, so all providers are incentivized financially to ensure that the highest quality of care is provided for the patient.

Entering into a bundled payment contract requires a tremendous amount of planning and data analysis. Facilities and physicians need to come together to analyze costs across all diagnosis-related groups (DRGs), including services incurred after discharge.

Accurate data are critical for a sound analysis. To begin the process, data must be grouped into bundled events, or episodes. Using inpatient admissions as the starting point, group all claims for the same patient throughout the course of admission and for at least 90 days post-discharge (a preadmission time period may also be considered). Organize the claims into time periods (e.g., inpatient, 30-day post-discharge, etc.) and maintain detailed claim information such as procedure codes, diagnosis codes, and provider IDs to enable a thorough analysis of the drivers of cost throughout the course of treatment.
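One way to organize claims into time periods around an admission is sketched below. The period labels, the 30/60/90-day breakpoints, and the function name are assumptions for illustration, and a preadmission window is left out to keep the example short.

```python
from datetime import date


def assign_period(service_date, admit_date, discharge_date):
    """Place a claim into a time period relative to the anchor admission.

    Returns None when the claim falls outside the bundled event window
    (before admission or more than 90 days after discharge).
    """
    if admit_date <= service_date <= discharge_date:
        return "inpatient"
    days_after = (service_date - discharge_date).days
    if 1 <= days_after <= 30:
        return "post_01_30"
    if 31 <= days_after <= 60:
        return "post_31_60"
    if 61 <= days_after <= 90:
        return "post_61_90"
    return None


# Hypothetical admission and service dates for one patient.
admit, discharge = date(2013, 3, 1), date(2013, 3, 4)
periods = [
    assign_period(date(2013, 3, 2), admit, discharge),   # during the stay
    assign_period(date(2013, 3, 20), admit, discharge),  # 16 days after discharge
    assign_period(date(2013, 5, 1), admit, discharge),   # 58 days after discharge
    assign_period(date(2013, 7, 1), admit, discharge),   # beyond 90 days
]
print(periods)
```

Keeping the period label on each claim line, next to the procedure codes, diagnosis codes, and provider IDs, allows costs to be summarized by period without regrouping the data.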

Once the data are grouped, review the results by admission type to identify the higher-volume and higher-average-cost events as displayed in the report below. Higher frequency and higher-cost events may represent the greatest opportunity for savings, but this is just the beginning of the analysis. Beyond the average cost per event, it is important to drill into admission types of interest and assess the variance in cost across similar bundled events. Care for event types with greater variance in cost may be more difficult to standardize and pose more financial risk for providers under a bundled payment contract.
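To compare cost variance across event types, one simple option is the coefficient of variation (standard deviation divided by mean) within each admission type. The sketch below uses invented costs, and "DRG_B" is a placeholder label rather than a real DRG; other dispersion measures could serve equally well.

```python
from statistics import mean, stdev

# Made-up total costs per bundled event, keyed by admission type.
event_costs = {
    "470": [24000.0, 25000.0, 26000.0],
    "DRG_B": [10000.0, 25000.0, 40000.0],  # placeholder label, not a real DRG
}


def cost_profile(costs):
    """Return event count, average cost, and coefficient of variation."""
    avg = mean(costs)
    return {"events": len(costs), "avg_cost": avg, "cv": stdev(costs) / avg}


profiles = {drg: cost_profile(costs) for drg, costs in event_costs.items()}
for drg, profile in profiles.items():
    print(drg, profile)
```

Both event types in this example have the same average cost, but the second has a much higher coefficient of variation, which is the kind of difference that average cost alone would hide.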

Image 1

In the exhibit above, DRG 470, “major joint replacement or reattach lower extremity without major complication,” represents a higher-frequency, higher-average-cost event type. Opportunities to reduce inpatient expenses, such as length of stay and device costs, all need to be considered, but the analysis cannot be limited to the inpatient setting. The types of services and the timing of those services provided after discharge also need to be considered. Readmission rates, utilization of skilled nursing facilities/rehab facilities, ongoing professional visits, and therapies all need to be evaluated as displayed in the chart below.

Image 2

In summary, to prepare for bundled payment contracting, organizations must build actuarial models to establish baseline costs, identify areas where there is variance within the treatment patterns, and gain support from all stakeholders regarding treatment guidelines. A robust analytic environment with complete, accurate data, along with relevant benchmarks, is essential for the planning process.

This article first appeared at Milliman MedInsight.

Using administrative claims data for quality reporting

Developing healthcare quality metrics based on administrative claims data has become increasingly common over the past several years. The National Committee for Quality Assurance’s Healthcare Effectiveness Data and Information Set (HEDIS) measures have been a standard for health plan quality reporting for over two decades, and more recently, newer programs such as the Centers for Medicare & Medicaid Services (CMS) Pioneer Accountable Care Organization (ACO) program and Oregon Coordinated Care Organization program have included claims-based quality measures as requirements for program participation.

Most claims-based measures are process-based, evaluating if appropriate services are provided for specified groups of patients, or identifying potential overutilization of services, but claims data are not the sole source of quality measurement. Survey data are often used for patient satisfaction and operational measures, and there is increasing use of lab results and electronic health record (EHR) data to expand the clinical components of quality that can be measured—a topic for another posting.

Despite the expansion of claims-based quality measures, some still question their merit. Those citing concerns point out known limitations associated with analyzing claims data, including:

• Potential errors or inconsistencies in coding.
• Availability of required data sources may be constrained if components of benefits are administered by multiple sources.
• Lack of complete clinical information.
• No diagnostic coding for blood pressure, laboratory results, or pathology results.
• Clinical information is limited to conditions for which the patient was treated and submitted a claim. A noncompliant diabetic may have no claim history of the disease.
• Timeliness of data is impacted by claim lag.

However, the advantages of analyzing claims data greatly outweigh the limitations noted above. The advantages include:

• Data are commonly available and relatively inexpensive to analyze.
• Data are available for very large populations, allowing for more robust sample sizes.
• Coding accuracy has improved dramatically over the past 20 years.
• For some types of measures, claims may produce a more accurate picture than even chart reviews.

An example of this last point would be measures focusing on patient compliance with medications. A physician may regularly write refill prescriptions for a patient’s hypertension medication, and those refills may be well documented in the patient’s chart, but those data provide no real evidence that the patient filled those prescriptions. Tracking actual claims for prescription refills is a much better measure. Granted, submitting a claim for a hypertension medication does not prove that the patient actually took the medication at the appropriate frequency, but a regular, ongoing refill pattern is a better proxy of medication adherence than chart review information.

Days supplied is commonly available on claims data, making it easy to calculate “possession ratios” to monitor patient compliance from pharmacy claims. A simplistic way (additional conditions can be added to the calculation) to measure possession ratios is demonstrated in Table 1 below. For patients continuously enrolled during a 180-day period and previously diagnosed with hypertension, the possession ratio for each patient is the sum of all days supplied on their prescriptions during the study period, divided by 180 days.
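The simplistic possession ratio described above can be sketched as follows. The patient identifiers and fill histories are made up, and the sketch assumes the patients have already been limited to those continuously enrolled and previously diagnosed with hypertension.

```python
STUDY_PERIOD_DAYS = 180  # study period length from the text


def possession_ratio(days_supplied, period_days=STUDY_PERIOD_DAYS):
    """Sum of days supplied on all fills in the study period, divided by its length."""
    return sum(days_supplied) / period_days


# Made-up days-supplied values from pharmacy claims, one list per patient.
fills = {
    "patient_a": [30, 30, 30, 30, 30, 30],
    "patient_b": [30, 30, 30],
    "patient_c": [],
}

ratios = {patient: possession_ratio(days) for patient, days in fills.items()}
print(ratios)
```

A ratio can exceed 1.0 when refills overlap or extend past the end of the study period. Capping the ratio at 1.0 is one of the additional conditions an analyst might choose to add.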

Although claims data are not perfect for clinical reporting, they will continue to be a valuable and important source of data for quality reporting for a selected set of metrics.

This article first appeared at Milliman MedInsight.

Identifying potentially overutilized medications

The high number of prescriptions for opioid analgesics (or narcotic analgesics) often goes unnoticed in insured populations. These drugs are prescribed for pain relief for a wide range of conditions, and the potential for abuse of these medications has grown over the past several years.

High utilization patterns of these prescription drugs do not always attract attention because, compared with other classes of medications, they are relatively inexpensive. As seen in Figures 1 and 2 below, based on a 50,000-patient commercial dataset, opioid analgesics rank 13th when therapeutic classes are sorted by total allowed charges, but jump to 2nd when the classes are sorted by number of prescriptions.
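The two rankings come from sorting the same summary on different columns. The sketch below uses four invented therapeutic classes with invented totals, so the resulting ranks differ from the ones quoted above; only the sorting logic is the point.

```python
# Invented totals by therapeutic class; only "Opioid analgesics" is named in the text.
classes = [
    {"drug_class": "Hypothetical class A", "allowed": 900000.0, "scripts": 4000},
    {"drug_class": "Hypothetical class B", "allowed": 700000.0, "scripts": 9000},
    {"drug_class": "Opioid analgesics", "allowed": 150000.0, "scripts": 8000},
    {"drug_class": "Hypothetical class C", "allowed": 400000.0, "scripts": 2000},
]


def rank_of(name, rows, key):
    """1-based rank of a drug class when rows are sorted descending by key."""
    ordered = sorted(rows, key=lambda row: row[key], reverse=True)
    return [row["drug_class"] for row in ordered].index(name) + 1


rank_by_cost = rank_of("Opioid analgesics", classes, "allowed")
rank_by_scripts = rank_of("Opioid analgesics", classes, "scripts")
print(rank_by_cost, rank_by_scripts)
```

Reporting both ranks side by side helps surface classes that are inexpensive per prescription but heavily used.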

There are several methods for analyzing utilization of drug classes to identify opportunities for intervention. For a broad class of drugs, such as opioid analgesics, it is useful to drill further into the therapeutic classes. In Table 1, we see that the highest number of prescriptions were filled for hydrocodone combinations (including drugs like Vicodin), followed by opioid agonists (a category of very strong analgesics including morphine and OxyContin) and codeine combinations (including drugs such as Percocet and Percodan).

Analysts may want to analyze utilization by network or geographic areas to determine if specific markets have higher utilization rates compared to others. Table 2 displays prescription utilization by plan, revealing that Plan 3 had the highest utilization rate for these drugs.
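Comparing plans of different sizes requires a rate rather than a raw count. The sketch below expresses utilization as prescriptions per 1,000 members, with invented prescription and membership counts chosen so that Plan 3 ranks highest, as in the text.

```python
# Invented prescription and membership counts by plan.
plans = {
    "Plan 1": {"scripts": 1200, "members": 20000},
    "Plan 2": {"scripts": 900, "members": 18000},
    "Plan 3": {"scripts": 1500, "members": 12000},
}


def scripts_per_1000(scripts, members):
    """Prescriptions per 1,000 members for the reporting period."""
    return scripts * 1000 / members


rates = {name: scripts_per_1000(p["scripts"], p["members"])
         for name, p in plans.items()}
highest = max(rates, key=rates.get)
print(rates, highest)
```

The same calculation can be repeated by network or geographic area, provided membership is available at that level.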

Identifying possible cases of abuse typically involves drilling down to the provider or patient level. Table 3 illustrates an example analyzing utilization by primary care physician (PCP).

A complete analysis would include additional reports to better understand the prescribing physician specialties, the types of conditions they treat (chronic use of pain medications for periods of time may be appropriate for some conditions such as cancer), days supplied, and refill rates. At the patient level, it may also be important to quantify how many different providers have prescribed these drugs, as one physician is not likely to know what other physicians have prescribed for that patient, if the patient has not disclosed that information.
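Counting distinct prescribers per patient is straightforward once pharmacy claims carry a prescriber identifier. In the sketch below the identifiers are made up and the threshold of three prescribers is a hypothetical review trigger, not a clinical standard.

```python
from collections import defaultdict

# Hypothetical review trigger: number of distinct prescribers per patient.
PRESCRIBER_THRESHOLD = 3

# Made-up (patient_id, prescriber_id) pairs from opioid analgesic claims.
opioid_claims = [
    ("pat_1", "dr_a"), ("pat_1", "dr_a"), ("pat_1", "dr_b"),
    ("pat_2", "dr_a"), ("pat_2", "dr_c"), ("pat_2", "dr_d"), ("pat_2", "dr_e"),
    ("pat_3", "dr_b"),
]

prescribers_by_patient = defaultdict(set)
for patient_id, prescriber_id in opioid_claims:
    prescribers_by_patient[patient_id].add(prescriber_id)

prescriber_counts = {p: len(s) for p, s in prescribers_by_patient.items()}
for_review = sorted(p for p, n in prescriber_counts.items()
                    if n >= PRESCRIBER_THRESHOLD)
print(prescriber_counts, for_review)
```

A high count is only a prompt for further review, since multiple prescribers may be appropriate for patients with conditions such as cancer.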

Analgesics are not the only class of drugs that have the potential for abuse. Genetic Engineering & Biotechnology News recently “put together a list of 14 top abused prescription drugs, as listed by the [Centers for Disease Control and Prevention, the Food and Drug Administration], and nongovernment nonprofit sources on public websites.”

Their list is as follows (listed by drug brand name):

1. Oxycontin
2. Concerta
3. Ambien
4. Zoloft
5. Ritalin/Focalin
6. Adderall XR
7. Lunesta
8. Opana ER
9. Xanax XR
10. Vicodin
11. Fentora
12. Percocet
13. Valium
14. Ativan

This article first appeared at Milliman MedInsight.