Article

Medicare Advantage Plans Brace for Sweeping 2025 CMS Audit and Payment Rule Changes

June 11, 2025

CMS Tightens Oversight of Medicare Advantage Plans

In the coming year, the nation’s Medicare Advantage insurers – which cover over 31 million Americans – face an unprecedented wave of regulatory changes and scrutiny. The Centers for Medicare & Medicaid Services (CMS) has quietly ushered in a more aggressive audit regime for Medicare Advantage (MA) plans, alongside significant updates to how these plans are paid for the health risks of their enrollees.

Health plan CEOs, whose organizations collectively received about $455 billion in Medicare payments last year, are now grappling with what these changes mean operationally and financially. Many are preparing for a future in which annual federal audits become a routine part of doing business and risk adjustment rules are rewritten to curb excess payments.

Oversight Intensifies: RADV Audits Expand in 2025

Late this spring, CMS announced a dramatic expansion of its Risk Adjustment Data Validation (RADV) audits – the primary tool for verifying that MA plan payments are justified by members’ documented health status. Historically, CMS audited only a small sample (around 60) of MA contracts each year, targeting plans suspected of excessive billing. That is changing effective immediately: CMS will audit all eligible Medicare Advantage contracts annually (approximately 550 plans in total) [1]. In addition, the agency is fast-tracking a backlog of past years’ audits, pledging to complete all outstanding audits for payment years 2018 through 2024 by early 2026. This means health plans could be hit with multiple audit findings in quick succession, condensing what might have been a decade of scrutiny into a much shorter window.

“We are committed to crushing fraud, waste and abuse across all federal healthcare programs,” Dr. Mehmet Oz, the CMS Administrator, said in a statement announcing the new audit strategy. While emphasizing the value of Medicare Advantage, Oz underscored that CMS must ensure that plans are billing the government accurately [2].

The RADV audits themselves will also become more intensive. CMS is increasing the sample size of medical records it reviews for each plan from about 35 records to as many as 200 records per plan annually [1]. By reviewing a larger slice of each plan’s claims, CMS aims to make any identified error rates more credible for extrapolation – a process of projecting the sample’s error rate onto the plan’s entire member population [1]. CMS finalized a rule in 2023 that, for the first time, allows auditors to extrapolate overpayment findings starting with audits of 2018 claims onward. In the past, if an audit uncovered (for example) $100,000 in improper payments in the sample, the plan would repay that amount; now CMS can multiply that figure across all similar cases in the year – a change that could turn modest audit findings into multimillion-dollar liabilities for plans.
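
As a rough illustration of why extrapolation matters, the sketch below projects a sample's overpayment onto a plan's full membership. The function name, the uniform per-member averaging, and every figure other than the $100,000 example are assumptions for illustration; CMS's actual extrapolation methodology is more involved and is not described here.

```python
def extrapolate_overpayment(sample_overpayment: float,
                            sample_size: int,
                            population_size: int) -> float:
    """Project a sample's overpayment onto the whole population.

    Simplifying assumption: the average overpayment per sampled
    record applies uniformly to every member of the plan.
    """
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    per_member = sample_overpayment / sample_size
    return per_member * population_size


# Hypothetical figures: $100,000 found in a 200-record sample,
# projected onto a plan with 50,000 enrollees.
sample_only = 100_000.0
projected = extrapolate_overpayment(sample_only, 200, 50_000)
print(f"Repayment without extrapolation: ${sample_only:,.0f}")
print(f"Repayment with extrapolation:    ${projected:,.0f}")
```

Under these made-up inputs the same audit finding grows from a six-figure repayment to an eight-figure one, which is the dynamic plans are worried about.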

To support this ambitious oversight agenda, CMS is bolstering its audit arsenal. The agency will deploy “enhanced technology” – including advanced data analytics and potentially artificial intelligence – to flag suspect diagnoses in billing data [1]. It is also undertaking a massive workforce expansion, increasing its team of medical coders from just 40 to roughly 2,000 by September 2025 to manually review records and confirm unsupported codes [2]. This 50-fold staffing surge underscores the scale of CMS’s commitment. All Medicare Advantage plans can now expect an audit each year, a stark departure from an era when many insurers never faced a RADV audit at all [1].

For health plans, the immediate implication is a significant operational burden. Insurers will need to respond to ongoing documentation requests, often under tight deadlines, and may find themselves in perpetual audit preparation mode. Some plans are already ramping up their own internal audit teams and processes to mirror CMS’s efforts, aiming to catch and correct errors proactively before federal auditors arrive.

A Revamped Risk Adjustment Model and Policy Changes

Behind the audit crackdown is a broader effort to refine how risk adjustment – the system that pays more for sicker patients – is administered. In 2024, CMS began phasing in a new risk adjustment model (known as “V28”) for Medicare Advantage, the first major overhaul in years. This updated model recalibrates which diagnoses count toward a patient’s risk score and how much they raise payments. Notably, CMS removed over 2,000 diagnosis codes from the model that it deemed prone to being “up-coded” – the practice of documenting extra or more severe conditions to inflate payments [3]. The goal is to target codes most likely to be abused and ensure that payments better reflect genuine health status.

The transition to the new model is occurring gradually to mitigate disruption. For payment year 2024, risk scores were calculated with a blend (33% new model, 67% old model). By 2025, the balance flips to 67% new model (V28) and 33% old [4], and by 2026 the new model will be fully in place. The V28 model introduces 115 condition categories (up from 86 in the previous model) but with a more selective set of diagnosis codes – 7,770 codes mapping to those categories, versus 9,797 codes in the old model [4]. In practical terms, some diagnoses that used to boost payments will no longer do so, or will do so to a lesser degree. Chronic conditions like diabetes, depression, or vascular disease are among those seeing coding criteria tightened or subdivided to prevent overstating a patient’s illness burden, according to policy analysts.
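
The phase-in can be pictured as a weighted average of two scores. The sketch below uses the blend percentages quoted above; the function name and the member's two scores are hypothetical, and the real calculation involves further adjustments (such as normalization) that are left out here.

```python
# Weight given to the new (V28) model by payment year, per the phase-in
# schedule described in the article.
V28_WEIGHT = {2024: 0.33, 2025: 0.67, 2026: 1.00}


def blended_risk_score(old_score: float, v28_score: float, year: int) -> float:
    """Weighted average of the old-model and V28 risk scores."""
    w = V28_WEIGHT[year]
    return w * v28_score + (1 - w) * old_score


# Hypothetical member: 1.20 under the old model, 1.05 under V28.
for year in (2024, 2025, 2026):
    print(year, round(blended_risk_score(1.20, 1.05, year), 4))
```

For a member whose V28 score is lower than the old-model score, the blended score steps down each year until it equals the V28 score in 2026.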

CMS argues these changes will improve payment accuracy and curb excess spending. Agency officials noted that Medicare Advantage plans have been paid billions more than it would cost to cover similar patients in traditional Medicare, partly due to aggressive coding practices. Indeed, CMS now estimates MA plans overbill the government by about $17 billion a year through unsupported diagnoses, with some estimates as high as $43 billion. The new risk model, coupled with stepped-up audits, is designed to rein in this overspending. MedPAC, a congressional advisory body, has reported that payments to MA plans in 2024 were on track to be roughly $83 billion higher than they would have been in fee-for-service Medicare for the same enrollees – a gap these policies seek to narrow.

Health plans and providers, however, have voiced concern about the speed and impact of these changes. The industry pushed back hard when the new model was proposed, prompting CMS to adopt the three-year phase-in rather than an immediate switch [3]. Many insurers and health systems fear the model’s stricter coding could reduce payments for vulnerable patients, potentially affecting benefit offerings. CMS’s own projections suggested that despite the model changes, average plan payments per enrollee would still rise in 2024 and 2025, due to other adjustments. But those increases may be smaller than plans are used to, and impacts will vary by plan [3].

The American Medical Group Association, representing provider organizations, cautiously noted that the phase-in gives CMS “an opportunity to refine the plan” if unintended consequences emerge by 2026. In essence, while regulators see the new model as a needed course correction, the industry sees a potential budget cut in disguise, to be fought or at least closely watched.

Operational and Compliance Challenges for Health Plans

For health plan executives, the confluence of comprehensive audits and new risk scoring rules translates into a daunting compliance agenda. Operationally, plans must strengthen their documentation practices and IT systems immediately. Every diagnosis code submitted for payment must be backed by proper medical record evidence – not just to withstand a CMS audit, but to ensure the plan isn’t overstating its risk scores under the refined model. Many insurers are conducting internal RADV-style audits on 2018–2022 data right now, essentially red-flagging any diagnosis in their system that might not hold up to scrutiny. By performing these self-audits and deleting or correcting unsupported codes in CMS’s database, plans can mitigate future penalties [4]. This proactive approach, encouraged by consultants, aims to “reduce and manage RADV financial exposure” by addressing issues before the government does.
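
A self-audit of this kind boils down to comparing what was submitted against what a chart review could support. The following is a minimal sketch under that assumption; the `Submission` record, the function name, and the sample data are all hypothetical and do not reflect any particular plan's systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Submission:
    member_id: str
    diagnosis_code: str


def unsupported_submissions(submitted, documented):
    """Return submissions with no matching entry in the reviewed charts.

    `documented` maps member_id -> set of diagnosis codes that a
    chart review found supported in the medical record.
    """
    return [
        s for s in submitted
        if s.diagnosis_code not in documented.get(s.member_id, set())
    ]


# Hypothetical data for illustration only.
submitted = [
    Submission("M001", "E11.9"),
    Submission("M001", "I10"),
    Submission("M002", "F32.9"),
]
documented = {"M001": {"E11.9"}, "M002": {"F32.9"}}

for s in unsupported_submissions(submitted, documented):
    print(f"Review or delete: member {s.member_id}, code {s.diagnosis_code}")
```

Anything the function returns is a candidate for correction or deletion before an auditor samples it.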

Provider engagement is another critical piece. Medicare Advantage insurers often rely on networks of physicians and hospitals to document diagnoses, and historically some have incentivized providers to code comprehensively. Now the dynamic is shifting: plans are implementing new provider training and education on the V28 coding changes, stressing that only accurate, supported diagnoses should be submitted. Some plans are also revisiting their contracts with providers. Those that share risk with providers (through value-based arrangements or bonus incentives) may insert clauses making providers financially liable for coding errors that lead to audit recoveries. If a CMS extrapolated audit claws back millions of dollars from a plan, the plan doesn’t want to shoulder that alone – it may seek to recover portions from the physician groups whose documentation was found lacking. This is a delicate conversation, but it reflects how seriously plans are treating the new audit risk.

Internally, compliance and audit departments at MA organizations are bracing for a heavier lift. Plan CEOs are evaluating whether their teams have the bandwidth and expertise to handle continuous audit requests, or if they need to enlist outside help (such as specialized auditing firms or consulting partners). The administrative load of responding to RADV audits – pulling hundreds of medical records from archives, coding them, and submitting rebuttal evidence – is significant, especially for smaller regional plans. Plans must also keep pace with evolving guidance: CMS recently issued updated RADV audit dispute and appeal instructions (effective January 2025), clarifying how plans can challenge audit findings through a reconsideration process [2]. Ensuring the legal team is ready to navigate these appeals, especially when extrapolated sums are on the line, will be crucial.

Finally, IT systems need updates to accommodate the 2025 risk model blend and forthcoming full model transition. Claims and billing software must incorporate the new HCC definitions so that as of January 1, 2025, incoming claims are evaluated under the correct risk adjustment logic. Misalignments here could directly affect revenue projections and compliance. Some plans have had to reconfigure analytics dashboards and retrain their coders and coding vendors on the model’s nuances – for example, which codes no longer map to an HCC (and thus no longer increase payments) [4]. This system work is technical, but vital to avoid errors in submissions that could trigger audits or payment shortfalls.
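
One simple check such a system update might include is a diff of the two code-to-category mappings. The sketch below assumes each model can be represented as a dictionary; the codes and category labels are placeholders, not real ICD-10 or HCC values.

```python
# Hypothetical mappings: diagnosis code -> condition category under each model.
# The codes and categories are placeholders, not real ICD-10 or HCC values.
OLD_MODEL = {"DX-A": "HCC-1", "DX-B": "HCC-2", "DX-C": "HCC-3"}
NEW_MODEL = {"DX-A": "HCC-10", "DX-C": "HCC-30", "DX-D": "HCC-40"}


def dropped_codes(old: dict, new: dict) -> list:
    """Codes that mapped to a category before but no longer do."""
    return sorted(code for code in old if code not in new)


print(dropped_codes(OLD_MODEL, NEW_MODEL))
```

A list like this could feed coder retraining materials and dashboard filters, so that staff know which diagnoses no longer affect payment.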

Financial Stakes and Industry Response

The financial implications of CMS’s 2025 changes are multifaceted. On one hand, Medicare Advantage insurers might see lower revenue growth per patient as risk scores level off under the tighter model. On the other hand, they face the possibility of paying back substantial sums if audits uncover past overpayments. Even a small error rate can translate into a large liability when extrapolated across tens or hundreds of thousands of members. Past RADV audits (2011–2013) found overpayments in the range of 5% to 8% [2]. If a similar error rate were found today and extrapolated, a mid-sized plan with $1 billion in annual revenue might have to refund $50–$80 million for a single year – a heavy hit to earnings.
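
The arithmetic behind that estimate is straightforward, and a finance team could rerun it with its own revenue figure. This is only a sketch; the function name is hypothetical, and it assumes the error rate applies to the plan's entire annual revenue.

```python
def refund_range(annual_revenue: float, low_pct: float, high_pct: float):
    """Return (low, high) refund estimates for one payment year.

    Rates are given in percent and applied to total annual revenue,
    which is a simplifying assumption.
    """
    return (annual_revenue * low_pct / 100, annual_revenue * high_pct / 100)


# Figures from the text: $1 billion in revenue, 5% to 8% error rate.
low, high = refund_range(1_000_000_000, 5, 8)
print(f"${low / 1e6:,.0f} million to ${high / 1e6:,.0f} million")
```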

Compounding the concern, CMS’s decision to finalize audits from 2018 through 2024 in one burst means some plans could be writing checks for multiple years’ worth of overpayments almost at once. Financial officers are reviewing reserves and worst-case scenarios now. “If CMS identifies and extrapolates overpayments for those years, financial losses due to recoupment will be concentrated over a much shorter time period than under the prior timetable,” the Ropes & Gray analysis cautioned [1]. In other words, what might have been staggered as a series of smaller repayments over a decade could become a tidal wave of obligations around 2025–2026. This has implications for plan budgeting, dividend plans, and even market valuations – indeed, stock analysts have begun asking public MA insurers about their audit exposures in earnings calls.

Preparing for Change: Mitigation Strategies for Plans

In response to these challenges, savvy health plans are taking a multi-pronged approach to mitigate risk. One key strategy is investing in advanced analytics to identify coding outliers. Plans are leveraging data algorithms to scan claims for patterns – for example, providers who code unusually high rates of certain lucrative diagnoses – and then conducting targeted chart reviews to verify those cases. By doing so, plans can either validate the codes with proper documentation or proactively “unlock” and remove unsupported diagnoses from their submissions, thereby inoculating against future audit findings. This kind of internal cleanup, though potentially reducing payments in the short term, can save a plan from a costly claw-back down the road. Several large insurers have created special RADV task forces for this purpose, blending expertise from compliance, IT, and clinical coding teams.
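
One way to picture the outlier screen is a z-score comparison of each provider's coding rate against the network average. The sketch below is an assumption about how such a screen might work, not a description of any plan's actual analytics; the threshold, provider IDs, and rates are invented.

```python
from statistics import mean, pstdev


def flag_outlier_providers(rates: dict, z_threshold: float = 2.0) -> list:
    """Return providers whose coding rate is far above the peer average.

    `rates` maps provider ID -> share of that provider's patients
    coded with a given diagnosis. The z-score cutoff is an arbitrary
    illustrative choice.
    """
    values = list(rates.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # every provider codes at the same rate
    return sorted(p for p, r in rates.items() if (r - mu) / sigma > z_threshold)


# Hypothetical rates for one diagnosis across a small network.
rates = {"P1": 0.10, "P2": 0.12, "P3": 0.11, "P4": 0.09,
         "P5": 0.10, "P6": 0.11, "P7": 0.45}
print(flag_outlier_providers(rates))
```

A flag is only a prompt for a targeted chart review. A high rate may reflect a sicker patient panel rather than unsupported coding.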

Education and training are also front and center. Health plan leaders are doubling down on provider education programs to reinforce documentation standards. For example, physicians are being reminded that every chronic condition must be explicitly documented each year in the medical record to count for risk adjustment – and if they add a diagnosis, it should be one actively managed or treated, not just noted in passing. Plans are updating provider handbooks to reflect diagnoses that no longer risk-adjust under the new model, so clinicians don’t waste effort coding conditions that won’t contribute to funding. Some plans are even offering or requiring “documentation integrity” training sessions for network providers, knowing that many audit issues can be prevented at the point of care through better record-keeping.

Another defensive measure is incorporating more stringent audit clauses in vendor contracts. Many health plans use third-party vendors for chart reviews or in-home assessments to help identify additional diagnoses. In the wake of the RADV rule, plans are making sure those vendors attest to the accuracy of codes they submit on the plan’s behalf – and assume liability if codes don’t hold up in an audit. Similarly, plans in risk-sharing arrangements with providers are clarifying how any recovered payments will be handled, as noted earlier. The overarching aim is to align incentives so that everyone – plan, provider, vendor – has “skin in the game” to only report truthful, supportable diagnoses.

From a financial planning perspective, some insurers are bolstering reserves or reinsurance coverage to cushion against possible repayments. Just as importantly, they are scenario-testing the impact of lower risk scores. CFOs are running models on 2025 revenue under various coding intensity assumptions (for instance, if certain common diagnoses drop out of HCC scoring) to guide bids and benefit design for the upcoming plan year. In extreme cases, a few plans have hinted they might need to trim benefits or adjust premiums if the new model significantly undercuts their payments – a move that would likely invite member and political backlash. For now, most are taking a wait-and-see approach, hoping that improved documentation and coding accuracy can blunt the negative financial impacts.

Navigating the Changes with Technology and Support

As Medicare Advantage organizations brace for this new regulatory landscape, many are turning to technology and specialized support services to adapt more effectively. Digital operations platforms and analytics tools are emerging as essential aids in ensuring compliance without overwhelming internal teams. For example, some health plans are deploying AI-driven software to automatically review medical records for any discrepancies between documented conditions and submitted diagnosis codes. These tools can flag potential unsupported diagnoses in real time, allowing plans to correct errors before they are picked up in a CMS audit. Enhanced reporting systems also help plans continuously monitor their risk score trends under the new model and identify areas where scores are dropping due to the V28 changes – insight that can inform provider outreach and member care programs.

Mizzeto’s healthcare digital operations suite is designed to streamline back-office processes for payers, which now include the heavy compliance workloads. For instance, Mizzeto provides audit and compliance assistance, conducting transactional audits to ensure policy compliance and quality control. Such services can take on the labor-intensive task of reviewing claims and medical records for accuracy, effectively augmenting a health plan’s internal audit department. Mizzeto also specializes in claims processing automation and data management, which helps plans keep their billing accurate and up-to-date with the latest rules. By automating routine claims checks and integrating the new risk adjustment logic into claims workflows, these technologies reduce the chance of human error that could lead to audit findings.

Another area where external partners prove valuable is in financial reconciliation and provider recovery efforts. If a plan does end up owing money back to CMS or identifies overpayments made to providers, Mizzeto’s services include analyzing overpayment situations and even helping to recoup excess payments from providers in the plan’s network. This kind of support is critical when plans are processing the results of an audit or adjusting payments post-review. It ensures that once a compliance issue is identified, the plan can resolve it swiftly on the financial side – whether that means correcting claims, retrieving funds, or crediting CMS – all with minimal disruption to operations.

Crucially, these solutions are not about replacing human expertise but augmenting it. Health plan executives remain at the helm in setting strategy (such as how to respond to CMS rule changes or when to self-audit), but they are leveraging technology and trusted partners to execute those strategies at scale. The result can be a more resilient organization: one that can handle an uptick in audits and shifting payment formulas without sacrificing focus on member care.

Looking ahead, Medicare Advantage plans will continue to refine their approach as real-world data from 2025 rolls in. Early audit results and the first full year of the new risk score model will provide feedback, showing where coding patterns need improvement or which compliance investments yield the best returns. Health plan CEOs are keenly aware that the stakes are high – both in terms of dollar amounts and public trust. Yet, with thorough preparation, the right expertise, and strategic use of technology, plans can navigate these reforms. The overarching goal is aligning Medicare Advantage’s impressive growth with robust accountability. And while the 2025 CMS audit changes pose undeniable challenges, they also present an opportunity: for health plans to demonstrate their commitment to accuracy and quality, strengthening the partnership between the government and private insurers that millions of seniors rely on every day.

[1] CMS Announces Significant Changes to RADV Auditing Efforts: Considerations and Next Steps for the Medicare Advantage Industry

[2] CMS Rolls Out Aggressive Strategy to Enhance and Accelerate Medicare Advantage Audits

[3] Providers, payers press CMS to get rid of Medicare Advantage risk adjustment changes entirely

[4] Key Areas of Focus for Risk Adjustment as the Calendar Turns to 2025

Latest News

Article

AI Data Governance - Mizzeto Collaborates with Fortune 25 Payer

AI Data Governance

The rapid acceleration of AI in healthcare has created an unprecedented challenge for payers. Many healthcare organizations are uncertain about how to deploy AI technologies effectively, often fearing unintended ripple effects across their ecosystems. Recognizing this, Mizzeto recently collaborated with a Fortune 25 payer to design comprehensive AI data governance frameworks—helping streamline internal systems and guide third-party vendor selection.

This urgency is backed by industry trends. According to a survey by Define Ventures, over 50% of health plan and health system executives identify AI as an immediate priority, and 73% have already established governance committees. 

Define Ventures, Payer and Provider Vision for AI Survey

However, many healthcare organizations struggle to establish clear ownership and accountability for their AI initiatives. When different departments implement AI solutions independently and without coordination, the organization becomes fragmented and leaves itself open to data breaches, compliance risks, and massive regulatory fines.

Principles of AI Data Governance  

AI Data Governance in healthcare, at its core, is a structured approach to managing how AI systems interact with sensitive data, ensuring these powerful tools operate within regulatory boundaries while delivering value.  

For payers wrestling with multiple AI implementations across claims processing, member services, and provider data management, proper governance provides the guardrails needed to safely deploy AI. Without it, organizations risk not only regulatory exposure but also the potential for PHI data leakage—leading to hefty fines, reputational damage, and a loss of trust that can take years to rebuild. 

Healthcare AI Governance can be boiled down into 3 key principles:  

  1. Protect People – Ensuring member data privacy, security, and regulatory compliance (HIPAA, GDPR, etc.).
  2. Prioritize Equity – Mitigating algorithmic bias and ensuring AI models serve diverse populations fairly.
  3. Promote Health Value – Aligning AI-driven decisions with better member outcomes and cost efficiencies.

Protect People – Safeguarding Member Data 

For payers, protecting member data isn’t just about ticking compliance boxes—it’s about earning trust, keeping it, and staying ahead of costly breaches. When AI systems handle Protected Health Information (PHI), security needs to be baked into every layer, leaving no room for gaps.

To start, payers can double down on essentials like end-to-end encryption and role-based access controls (RBAC) to keep unauthorized users at bay. But that’s just the foundation. Real-time anomaly detection and automated audit logs are game-changers, flagging suspicious access patterns before they spiral into full-blown breaches. Meanwhile, differential privacy techniques ensure AI models generate valuable insights without ever exposing individual member identities.

Enter risk tiering—a strategy that categorizes data based on its sensitivity and potential fallout if compromised. This laser-focused approach allows payers to channel their security efforts where they’ll have the biggest impact, tightening defenses where it matters most.
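
A risk-tiering rule can be as simple as a function from a dataset's attributes to a tier. The sketch below is illustrative only: the three tiers, the two attributes, and the dataset names are assumptions, and a real scheme would weigh many more factors.

```python
from enum import IntEnum


class Tier(IntEnum):
    LOW = 1
    MODERATE = 2
    HIGH = 3


def classify(has_health_data: bool, identifiable: bool) -> Tier:
    """Assign a sensitivity tier from two yes/no attributes (hypothetical rule)."""
    if has_health_data and identifiable:
        return Tier.HIGH
    if has_health_data or identifiable:
        return Tier.MODERATE
    return Tier.LOW


# Hypothetical datasets: name -> (has_health_data, identifiable).
datasets = {
    "claims_with_member_ids": (True, True),
    "deidentified_claims_extract": (True, False),
    "public_provider_directory": (False, False),
}

# Review the most sensitive datasets first.
ranked = sorted(datasets, key=lambda name: classify(*datasets[name]), reverse=True)
for name in ranked:
    print(f"{classify(*datasets[name]).name:<8} {name}")
```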

On top of that, data minimization strategies work to reduce unnecessary PHI usage, and automated consent management tools put members in the driver’s seat, letting them control how their data is used in AI-powered processes. Without these layers of protection, payers risk not only regulatory crackdowns but also a devastating hit to their reputation—and worse, a loss of member trust they may never recover.

Prioritize Equity – Building Fair and Unbiased AI Models 

AI should break down barriers to care, not build new ones. Yet, biased datasets can quietly drive inequities in claims processing, prior authorizations, and risk stratification, leaving certain member groups at a disadvantage. To address this, payers must start with diverse, representative datasets and implement bias detection algorithms that monitor outcomes across all demographics. Synthetic data augmentation can fill demographic gaps, while explainable AI (XAI) tools ensure transparency by showing how decisions are made.
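
One basic form of bias monitoring is to compare approval rates across groups, a measure sometimes called the demographic parity difference. The sketch below assumes decisions are available as (group, approved) pairs; the group labels, counts, and alert threshold are invented for illustration.

```python
from collections import defaultdict


def approval_rates(decisions):
    """Map each group to its share of approved decisions.

    `decisions` is an iterable of (group, approved) pairs.
    """
    totals, approved = defaultdict(int), defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        approved[group] += int(ok)
    return {g: approved[g] / totals[g] for g in totals}


def max_rate_gap(rates):
    """Largest difference in approval rate between any two groups."""
    return max(rates.values()) - min(rates.values())


# Hypothetical prior-authorization outcomes for two member groups.
decisions = ([("A", True)] * 80 + [("A", False)] * 20
             + [("B", True)] * 60 + [("B", False)] * 40)

rates = approval_rates(decisions)
gap = max_rate_gap(rates)
if gap > 0.10:  # arbitrary illustrative threshold
    print(f"Approval-rate gap of {gap:.0%} exceeds threshold; review the model")
```

A gap alone does not prove unfair treatment, since groups can differ in clinical need, but it tells an ethics committee where to look first.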

But technology alone isn’t enough. AI Ethics Committees should oversee model development to ensure fairness is embedded from day one. Adversarial testing—where diverse teams push AI systems to their limits—can uncover hidden biases before they become systemic issues. By prioritizing equity, payers can transform AI from a potential liability into a force for inclusion, ensuring decisions support all members fairly. This approach doesn’t just reduce compliance risks—it strengthens trust, improves engagement, and reaffirms the commitment to accessible care for everyone.

Promote Health Value – Aligning AI with Better Member Outcomes 

AI should go beyond automating workflows—it should reshape healthcare by improving outcomes and optimizing costs. To achieve this, payers must integrate real-time clinical data feeds into AI models, ensuring decisions account for current member needs rather than outdated claims data. Furthermore, predictive analytics can identify at-risk members earlier, paving the way for proactive interventions that enhance health and reduce expenses.

Equally important are closed-loop feedback systems, which validate AI recommendations against real-world results, continuously refining accuracy and effectiveness. At the same time, FHIR-based interoperability enables AI to seamlessly access EHR and provider data, offering a more comprehensive view of member health.

To measure the full impact, payers need robust dashboards tracking key metrics such as cost savings, operational efficiency, and member outcomes. When implemented thoughtfully, AI becomes much more than a tool for automation—it transforms into a driver of personalized, smarter, and more transparent care.

Integrated artificial intelligence compliance
FTI Technology

Importance of an AI Governance Committee

An AI Governance Committee is a necessity for payers focused on deploying AI technologies in their organization. As artificial intelligence becomes embedded in critical functions like claims adjudication, prior authorizations, and member engagement, its influence touches nearly every corner of the organization. Without a central body to oversee these efforts, payers risk a patchwork of disconnected AI initiatives, where decisions made in one department can have unintended ripple effects across others. The stakes are high: fragmented implementation doesn’t just open the door to compliance violations—it undermines member trust, operational efficiency, and the very purpose of deploying AI in healthcare.

To be effective, the committee must bring together expertise from across the organization. Compliance officers ensure alignment with HIPAA and other regulations, while IT and data leaders manage technical integration and security. Clinical and operational stakeholders ensure AI supports better member outcomes, and legal advisors address regulatory risks and vendor agreements. This collective expertise serves as a compass, helping payers harness AI’s transformative potential while protecting their broader healthcare ecosystem.

Mizzeto’s Collaboration with a Fortune 25 Payer

At Mizzeto, we’ve partnered with a Fortune 25 payer to design and implement advanced AI Data Governance frameworks, addressing both internal systems and third-party vendor selection. Throughout this journey, we’ve found that the key to unlocking the full potential of AI lies in three core principles: Protect People, Prioritize Equity, and Promote Health Value. These principles aren’t just aspirational—they’re the bedrock for creating impactful AI solutions while maintaining the trust of your members.

If your organization is looking to harness the power of AI while ensuring safety, compliance, and meaningful results, let’s connect. At Mizzeto, we’re committed to helping payers navigate the complexities of AI with smarter, safer, and more transformative strategies. Reach out today to see how we can support your journey.

February 14, 2025

5 min read

Feb 21, 2024 (2 min read)

Article

CMS Isn't Auditing Decisions — It’s Auditing Proof

Why utilization management may determine who clears the coming audit wave—and who doesn’t.

CMS doesn’t usually announce a philosophical shift. It signals it. And over the past year, the signals have grown louder: tougher scrutiny of utilization management, more rigorous document reviews, and an expectation that payers show—not simply assert—how they operate. The 2026 audit cycle will be the first real test of this new posture.

For health plans, the question is no longer whether they can survive an audit. It’s whether their operations can withstand a level of transparency CMS is poised to demand.

What CMS Is Really Asking for in 2026

Behind every audit protocol lies a single question: Does this plan operate in a way that reliably protects members? Historically, payers could answer that question through narrative explanation—clinical notes, supplemental files, post-hoc clarifications. Those days are ending. CMS wants documentation that stands on its own, without interpretation. Decisions must speak for themselves.

That shift lands hardest in utilization management. A UM case is a dense intersection of clinical judgment, policy interpretation, and regulatory timing. A single inconsistency—a rationale that doesn’t match criteria, a letter that doesn’t reflect the case file, a clock mismanaged by a manual workflow—can overshadow an otherwise correct decision.

The emerging audit philosophy is clear: If the documentation doesn’t prove the decision, CMS assumes the decision cannot be trusted.

Where the System Breaks: UM as the Audit Pressure Point

Auditors are increasingly zeroing in on UM because it sits at the exact point where member impact is felt: the determination of whether care moves forward. And yet the UM environment inside most plans is astonishingly fragile.

Case files exist across platforms. Reviewer notes vary widely in depth and style. Criteria are applied consistently in theory but documented inconsistently in practice. Timeframes live in spreadsheets or side systems. Letter templates multiply to meet state and line-of-business requirements, and each variation introduces new chances for error.

Delegated entities add another degree of variation. AI tools introduce sophistication—but also opacity. And UM letters, already the last mile, turn into the site of the most findings. The audit findings from recent years reveal the same weak points over and over: documentation mismatches, missing citations, unclear rationales, inadequate notice language, or timing failures that stem not from malice but from operational drift.

CMS sees all of this as symptomatic of one problem: fragmentation.

Why CMS’s New Expectations Make Sense—Even If They Hurt

To CMS, consistency is fairness. If two reviewers evaluating the same procedure cannot produce the same rationale, use the same criteria, or generate the same clarity in their letters, then members cannot rely on the decisions they receive. From the regulator’s perspective, this isn’t about paperwork—it’s about equity. Documentation is the proof that similar members receive similar decisions under similar circumstances.

Health plans know this in theory. But the internal pressures—volume, staffing variability, outdated systems, multiple point solutions, off-platform decisions, peer-to-peer nuances—make uniformity nearly impossible. CMS’s response is simple: Technical difficulty is not an excuse. Variation is a governance failure.

This is why the agency is preparing to scrutinize AI tools with the same rigor as human reviewers. Automation that produces variable results, or outputs that do not exactly match the case file, is no different from human inconsistency.

CMS is not anti-AI. It is anti-opaque-AI.

What an Audit-Ready UM Operation Actually Looks Like

Plans that will succeed in 2026 are building something different: a coherent operating system that eliminates guesswork. In these models, the case file becomes a single source of truth. Clinical summaries, criteria references, rationales, and letter text are drawn from the same structured data—so the letter is a natural extension of the decision, not a separate narrative created afterward.

Delegated entities operate under unified templates, shared quality rules, and real-time oversight rather than annual check-ins. AI is governed like a medical policy: with defined behavior, monitoring, version control, and auditable outputs. And timeframes are treated with claims-like precision, not as deadlines managed by human vigilance.
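To make "claims-like precision" for timeframes concrete, here is a minimal Python sketch in which the deadline is computed from the case record rather than tracked in a spreadsheet. The `UmCase` fields, the `TURNAROUND_HOURS` values, and the 75% warning threshold are all hypothetical placeholders; a real plan would load its own regulatory matrix.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical turnaround windows in hours. Placeholders for illustration
# only; real values come from the plan's own regulatory matrix.
TURNAROUND_HOURS = {"standard": 168, "expedited": 72}


@dataclass(frozen=True)
class UmCase:
    case_id: str
    received_at: datetime
    priority: str  # "standard" or "expedited"


def due_at(case: UmCase) -> datetime:
    """Derive the notice deadline from the case record itself."""
    return case.received_at + timedelta(hours=TURNAROUND_HOURS[case.priority])


def status(case: UmCase, now: datetime, warn_fraction: float = 0.75) -> str:
    """Return 'on_track', 'at_risk', or 'late' based on elapsed time."""
    window = due_at(case) - case.received_at
    elapsed = now - case.received_at
    if elapsed > window:
        return "late"
    if elapsed >= window * warn_fraction:
        return "at_risk"
    return "on_track"
```

Because the deadline is derived rather than typed in, every dashboard, letter, and audit extract that calls `due_at` reports the same answer.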

This is not just modernization—it is a philosophical shift. A move from “reviewers record what happened” to “the system records what is true.”

Preparing for 2026 Starts in 2025

The path forward isn’t mysterious; it’s disciplined. Plans need to invest the next year in cleaning up documentation, consolidating UM data flows, reducing template drift, tightening delegation oversight, and putting governance around every automated tool in the UM pipeline. The plans that do this will walk into audits with confidence. The plans that don’t will rely on explanations CMS is increasingly unwilling to accept.

The Bottom Line

The 2026 CMS audit cycle isn’t a compliance event—it’s an operational reckoning. CMS is asking payers to demonstrate integrity, not describe it. And utilization management will be the proving ground. The strongest plans are already acting. The others will be forced to.

At Mizzeto, we help health plans build the documentation, automation, and governance foundation needed for a world where every UM decision must be instantly explainable. Because in the next audit cycle, clarity isn’t optional—it’s compliance.

December 5, 2025

2 min read

Article

Why UM Letters Still Slow Down Health Plans

In the age of AI-driven utilization management (UM), one paper trail still refuses to move at the speed of automation: the UM letter.

Whether it’s an approval, denial, or request for additional information, these letters remain the last mile of every UM decision, and too often, the slowest. Despite sophisticated review platforms and integrated medical policy engines, many health plans still rely on legacy templates, fragmented data sources, and manual QA loops to generate what regulators consider a fundamental compliance artifact. UM letters are not just a formality; they are a legal requirement. Under CMS rules, plans must issue timely, adequate notice of adverse benefit determinations, explaining both the rationale and appeal rights to members.

The irony is hard to miss: while decisions are made in seconds, the documentation that justifies them can take days.

The Real Question Behind the Delay

The issue isn’t simply that UM letters take time. It’s why they take time, and what that delay reveals about deeper system inefficiencies.

For health plans, the question isn’t “How can we make letters faster?” It’s “Why are they so hard to get right in the first place?”

A single UM letter must synthesize clinical reasoning, regulatory precision, and plain-language clarity, all aligned with CMS, NCQA, and state-specific notice requirements. The challenge is not in the writing, but in orchestrating inputs from multiple systems: clinical review notes, policy citations, benefit text, and provider data.

When those inputs don’t talk to each other, letter generation becomes a bottleneck that slows down turnaround times, increases error risk, and erodes member trust.

Why Templates Must Meet More Than Just Style

UM letter templates are not just administrative artifacts; they are regulatory documents. Under Centers for Medicare & Medicaid Services (CMS) rules, letters providing notice of adverse benefit determinations must meet detailed content and timing standards. For example, the Medicaid managed care regulation at 42 CFR § 438.404 mandates that notices be in writing and explain the reasons for denial, reference the medical necessity criteria or other processes used, provide the enrollee’s rights to copies of evidence and appeal, and outline procedures for expedited review.1

In practice, this means letter templates must include:

  • A clear description of the decision and the specific denial reason,
  • The criteria or protocol relied upon (with member access to it free of charge),
  • Instructions on how to appeal (standard and expedited),
  • Rights to benefits continuation pending appeal under defined circumstances.2
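One way to treat the list above as a compliance checkpoint is to block any notice that is missing an element before it is sent. The Python sketch below is illustrative only: the `AdverseNotice` class and its field names are hypothetical and simply mirror the four elements listed above.

```python
from dataclasses import dataclass, fields


@dataclass
class AdverseNotice:
    # Hypothetical field names mirroring the four elements listed above.
    decision_and_reason: str = ""
    criteria_relied_upon: str = ""
    appeal_instructions: str = ""
    continuation_rights: str = ""


def missing_elements(notice: AdverseNotice) -> list[str]:
    """Return the names of required elements that are blank."""
    return [f.name for f in fields(notice) if not getattr(notice, f.name).strip()]


draft = AdverseNotice(
    decision_and_reason="Request denied: criteria for the service were not met.",
    criteria_relied_upon="Plan medical policy (available free of charge on request).",
    continuation_rights="Benefits may continue during appeal in defined circumstances.",
)
print(missing_elements(draft))  # the draft has no appeal instructions
```

A check like this only proves that each element is present; whether the wording is adequate still needs compliance review.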

Failure to incorporate these elements or to issue the notice within required timeframes can expose plans to audit findings, grievances, and regulatory penalties. The tighter the regulatory lens becomes, the less room there is for “good enough” templates. Each health plan must view letter-generation not as a clerical task but as a compliance checkpoint. And beyond the regulatory content itself, many programs require that UM notices be written in plain, accessible language at the 6th-8th grade level, to ensure members can understand their rights and the basis for a decision.
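A reading-level target can be screened automatically before a letter reaches human QA. The sketch below computes the Flesch-Kincaid grade level in Python. The syllable counter is a rough vowel-group heuristic (an assumption made for brevity, not a validated linguistic tool), so treat the score as a screening signal rather than a precise measurement.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: count groups of consecutive vowels, minimum one."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fk_grade(text: str) -> float:
    """Flesch-Kincaid grade level using the heuristic syllable counter."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


plain = "We did not approve this request. You can ask us to look again."
dense = ("The authorization determination necessitates comprehensive "
         "documentation substantiating medical necessity.")
print(fk_grade(plain) < fk_grade(dense))
```

Short sentences and short words pull the score down, which is why the plain version lands well under the dense one.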

Five Friction Points Inside UM Letter Workflows

Every health plan faces variations of the same problem, but the underlying breakdowns tend to cluster around five recurring fault lines:

  1. Fragmented Data Sources
    Critical information lives in multiple systems: UM platforms, claims engines, and policy libraries. Each transfer adds latency and the potential for mismatch.
  2. Template Explosion
    Over time, teams accumulate hundreds of letter templates to meet overlapping state and product requirements. Maintaining these manually makes even minor updates a compliance risk.
  3. Human Review Dependency
    Because UM letters must be clinically and legally precise, most organizations rely on multiple layers of human QA. That review process, while necessary, often adds 24–48 hours to turnaround.
  4. Regulatory Complexity
    CMS and state requirements around adverse determination language, appeal rights, and timing create constant moving targets. Even small wording deviations can trigger audit findings.3
  5. Technology Gaps
    Many UM systems weren’t designed for dynamic document assembly. Integrating clinical rationale, structured data, and plain-language output requires middleware or manual intervention.

Each of these friction points compounds the next, creating a cycle of rework, delay, and compliance exposure even in otherwise modernized UM environments.

Connecting the Dots: What the Delay Really Costs

The operational burden of slow UM letters goes far beyond staff productivity. It directly affects regulatory performance, provider satisfaction, and member experience.

Delayed or inconsistent notices can:

  • Violate CMS and NCQA timeliness standards, exposing plans to corrective action.4
  • Create confusion for providers awaiting determinations, delaying care coordination.
  • Generate avoidable grievances and appeals, further burdening UM teams.

The cost is not just administrative; it’s reputational. Every late or unclear letter represents a breakdown in transparency at the very point where payers are most visible to members and regulators alike.5

Building a Smarter Letter Ecosystem

Leading plans are tackling the problem not with more templates, but with smarter orchestration.

The most effective UM letter modernization strategies share three principles:

  • Structured Input, Dynamic Output: Capture decision data in structured fields early in the UM process so letters can be assembled automatically with consistent language and logic.
  • Governance-Driven Templates: Centralize letter libraries under compliance governance, ensuring real-time updates to regulatory text and benefit language.
  • Human-in-the-Loop Automation: Use AI-assisted generation to draft letters but retain clinical reviewer oversight for rationale and tone.

The goal isn’t to remove people; it’s to remove friction. Automation should serve precision, not replace it.

When designed correctly, next-generation letter systems can cut turnaround time by 50–70%, reduce rework, and strengthen audit readiness while making communications clearer for both providers and members.
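The "Structured Input, Dynamic Output" principle can be sketched in a few lines of Python using the standard library's `string.Template`. The template wording and the field names (`member_name`, `service`, `decision`, `rationale`, `criteria_ref`) are hypothetical and are not approved notice language; the point is only that the letter is filled from the same record that holds the decision.

```python
from string import Template

# Hypothetical governed template. The wording is illustrative only and is
# not approved notice language.
DENIAL_TEMPLATE = Template(
    "Dear $member_name,\n"
    "We reviewed the request for $service. The request was $decision.\n"
    "Reason: $rationale\n"
    "Criteria used: $criteria_ref\n"
)


def assemble_letter(case: dict) -> str:
    """Fill the template from the case record; fail loudly on missing fields."""
    # Template.substitute raises KeyError when a placeholder has no value,
    # so an incomplete case record cannot silently produce a letter.
    return DENIAL_TEMPLATE.substitute(case)


case_record = {
    "member_name": "Jordan Smith",
    "service": "MRI of the knee",
    "decision": "denied",
    "rationale": "Conservative treatment was not documented.",
    "criteria_ref": "Plan imaging policy, section 2",
}
print(assemble_letter(case_record))
```

Failing on a missing field is a deliberate choice here: a letter that cannot be completed from the case file is a documentation gap worth surfacing before the notice goes out.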

The Bottom Line

UM letters may seem administrative, but they are where compliance, communication, and care converge. If denials are the visible output of your UM program, letters are the proof of its integrity.

For payers, the question isn’t whether letters can be automated; it’s whether they can be governed with the same rigor as the decisions they document.

At Mizzeto, we help health plans modernize UM letter workflows, integrating automation, policy governance, and compliance intelligence into one seamless ecosystem.  

SOURCES

  1. 42 CFR § 438.404 - Timely and Adequate Notice of Adverse Benefit Determination
  2. Medicaid Managed Care State Guide
  3. CMS Coverage Appeals Job Aid
  4. Utilization Management Accreditation - A Quality Improvement Framework
  5. Denials & Appeals in Medicaid Managed Care

November 18, 2025

2 min read

Article

Appeals as a Mirror: What Overturned Denials Reveal About Broken UM Processes

In utilization management (UM), few metrics speak louder—or cut deeper—than overturn rates. When a significant share of denied claims are later approved on appeal, it’s rarely just about an individual decision. It’s a reflection of something bigger: inconsistent policy interpretation, reviewer variability, documentation breakdowns, or outdated clinical criteria.

Regulators have taken notice. CMS and NCQA increasingly treat appeal outcomes as a diagnostic lens into whether a payer’s UM program is both fair and clinically grounded.1 High overturn rates now raise questions not just about accuracy, but about governance.

In Medicare Advantage alone, more than 80% of appealed denials were overturned in 2023, a statistic that underscores how often first-pass decisions fail to hold up under scrutiny.2 The smartest health plans have started to listen. They’re treating appeals not as administrative noise but as signals.

What Overturned Denials Are Really Saying

Every overturned denial tells a story. It asks, implicitly: Was the original UM decision appropriate, consistent, and well-supported?

Patterns in appeal outcomes can expose weaknesses that internal audits often miss. For example:

  • Repeated overturns for a single service category often signal misaligned or outdated policies.
  • Overturns concentrated among certain reviewers may point to training or workflow inconsistencies.
  • Successful appeals after peer-to-peer discussions often reveal documentation or communication gaps between provider and plan.

These trends mirror national data showing that many initial denials are overturned once additional clinical details are provided, highlighting communication—not medical necessity—as the core failure.3 The takeaway is simple but powerful: Appeal data is feedback—from providers, from regulators, and from your own operations—about how well your UM program is working in the real world.

The Systemic Signals Behind High Overturn Rates

When you look beyond the surface, overturned denials trace back to five systemic fault lines common across payer organizations:

  1. Policy Rigor vs. Flexibility
    Medical necessity criteria must balance evidence-based precision with real-world adaptability. Policies written without clinical nuance, or not updated frequently enough, tend to generate denials that can’t stand up under appeal.
  2. Reviewer Variability
    Even with clear policies, human interpretation introduces inconsistency. Differences in specialty expertise, decision fatigue, or tool usage can lead to unpredictable outcomes.
  3. Provider Documentation Gaps
    Many initial denials are simply the result of incomplete records. When appeals are approved after additional information surfaces, the problem isn’t inappropriate care; it’s communication failure.
  4. Operational Friction
    Lag times between intake, review, and notification can distort first-pass decisions. Data fragmentation between UM, claims, and provider portals compounds the issue.
  5. Weak Feedback Governance
    Too often, appeal outcomes are logged but not analyzed. Mature UM programs close the loop, using overturned denials to retrain reviewers, refine policies, and target provider outreach.

Federal oversight agencies have long flagged this issue: an OIG review found that Medicare Advantage plans overturned roughly three-quarters of their own prior authorization denials on appeal, suggesting systemic review flaws and weak first-pass decision integrity.4

Turning Appeals into a Feedback Engine

Leading payers are reframing appeals from a reactive function to a proactive improvement system. They’re building analytics that transform overturn data into actionable intelligence:

  • Policy Calibration: Tracking which criteria most often lead to successful appeals reveals where policies may be too restrictive or outdated.
  • Reviewer Performance: Overlaying overturn trends with reviewer data helps identify where training or peer review support is needed.
  • Provider Partnership: By sharing de-identified appeal insights, plans can help provider groups strengthen documentation and pre-service submissions.
  • Regulatory Readiness: Demonstrating a closed-loop feedback process strengthens NCQA compliance and positions the plan as an adaptive, learning organization.

This approach turns what was once a compliance burden into a continuous-learning advantage.
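As a rough illustration of the analytics described above, the Python sketch below groups appeal outcomes by any field and reports the share overturned. The record layout (`service`, `reviewer`, `overturned`) and the sample rows are hypothetical, made up purely to show the calculation.

```python
from collections import defaultdict


def overturn_rates(appeals: list[dict], key: str) -> dict[str, float]:
    """Share of appeals overturned, grouped by the given field."""
    totals: dict[str, int] = defaultdict(int)
    overturned: dict[str, int] = defaultdict(int)
    for appeal in appeals:
        totals[appeal[key]] += 1
        if appeal["overturned"]:
            overturned[appeal[key]] += 1
    return {group: overturned[group] / totals[group] for group in totals}


# Hypothetical sample records, for illustration only.
appeals = [
    {"service": "imaging", "reviewer": "R1", "overturned": True},
    {"service": "imaging", "reviewer": "R2", "overturned": True},
    {"service": "imaging", "reviewer": "R1", "overturned": False},
    {"service": "dme", "reviewer": "R2", "overturned": False},
]
by_service = overturn_rates(appeals, "service")
by_reviewer = overturn_rates(appeals, "reviewer")
```

Running the same function over `service` and then `reviewer` supports both the policy-calibration view and the reviewer-performance view without a second pipeline. With real data, small groups would need a minimum-volume threshold before anyone acts on the rate.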

From Reversal to Reform

High overturn rates are not just a symptom—they’re an opportunity. Each reversed denial offers a data point that, aggregated and analyzed, can make UM programs more consistent, more transparent, and more clinically aligned.

The goal isn’t to eliminate appeals. It’s to make sure every appeal teaches the organization something useful—about process integrity, provider behavior, and the evolution of clinical practice.

When health plans start to see appeals as mirrors rather than metrics, UM stops being a gatekeeping exercise and becomes a governance discipline.

The Bottom Line

Overturned denials aren’t administrative noise—they’re operational intelligence. They show where your policies, people, and processes are misaligned, and where trust between payer and provider is breaking down.

For forward-thinking plans, this is the moment to reimagine UM as a learning system.

At Mizzeto, we help health plans turn appeal data into strategic insight, linking overturned-denial analytics to reviewer training, policy governance, and compliance reporting. Because in utilization management, every reversal has a lesson, and the best programs are the ones that listen.

SOURCES

  1. National Committee for Quality Assurance (NCQA). Overview of Proposed Updates to Utilization Management Accreditation 2026
  2. Kaiser Family Foundation (KFF). “Nearly 50 Million Prior Authorization Requests Were Sent to Medicare Advantage Insurers in 2023”
  3. American Medical Association (AMA). “Prior Authorization Denials Up Big in Medicare Advantage”
  4. U.S. Department of Health & Human Services, Office of Inspector General (OIG). Some Medicare Advantage Organization Denials of Prior Authorization Requests Raise Concerns About Beneficiary Access to Medically Necessary Care

November 4, 2025

2 min read

Article

Which LLMs Are Best for Healthcare Use?

Not all intelligence is created equal. As health plans race to integrate large language models (LLMs) into clinical documentation, prior authorization, and member servicing, a deceptively simple question looms: Which model actually works best for healthcare?

The answer isn’t about which LLM is newest or largest; it’s about which one is most aligned to the realities of regulated, data-sensitive environments. For payers and providers, the right model must do more than generate text. It must reason within rules, protect privacy, and perform reliably under the weight of medical nuance.

Understanding the Core Question

For payers and providers alike, the decision isn’t simply “which LLM performs best,” but “which model can operate safely within healthcare’s regulatory, ethical, and operational constraints.”

Healthcare data is complex — part clinical, part administrative, and deeply contextual. General-purpose LLMs like GPT-4, Claude 3, and Gemini Ultra excel in reasoning and summarization, but their performance on domain-specific medical content still requires rigorous evaluation.1 Meanwhile, emerging healthcare-trained models such as Med-PaLM 2, LLaMA-Med, and BioGPT promise higher clinical accuracy — yet raise questions about transparency, dataset provenance, and deployment control.

Analyzing the Factors That Matter

Evaluating an LLM for healthcare use comes down to five dimensions:

  1. Data Security and Privacy: Models must support on-premise or private cloud deployment, with PHI never leaving the payer-controlled environment.
  2. Domain Adaptation: Can the model be fine-tuned or context-trained on medical ontologies, payer workflows, or prior authorization rules?
  3. Explainability: Does it provide confidence scores, citations, or audit logs for generated content, all of which are essential for regulatory defense and trust?
  4. Integration Readiness: Can it interact with existing data ecosystems like QNXT, HealthEdge, or Epic via APIs or orchestration layers?
  5. Cost and Scalability: Beyond performance, can it operate efficiently at enterprise scale without prohibitive inference costs?
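The five dimensions above can be turned into a simple weighted rubric so that candidate models are compared on the same scale. The Python sketch below is one possible approach, not an established scoring standard: the dimension keys, the weights, and the 1-5 scale are all assumptions a plan would set for itself.

```python
# Hypothetical weights for the five dimensions above; they must sum to 1.0.
WEIGHTS = {
    "security": 0.30,
    "domain_adaptation": 0.20,
    "explainability": 0.20,
    "integration": 0.15,
    "cost": 0.15,
}


def weighted_score(scores: dict[str, float]) -> float:
    """Combine per-dimension scores (assumed 1-5 scale) into one number."""
    if set(scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the five dimensions")
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)


# Made-up scores for an unnamed candidate model, for illustration only.
candidate = {
    "security": 5,
    "domain_adaptation": 3,
    "explainability": 4,
    "integration": 4,
    "cost": 2,
}
print(round(weighted_score(candidate), 2))
```

Requiring a score for every dimension keeps a model from looking strong simply because a weak area was never evaluated.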

The Case for General-Purpose Models

Models like OpenAI’s GPT-4 and Anthropic’s Claude 3 dominate enterprise use because of their versatility, mature APIs, and strong compliance track records. GPT-4, for instance, underpins several FDA-compliant tools for clinical documentation and prior authorization automation.2

Advantages include:

  • Maturity and security: Vendors offer HIPAA-aligned enterprise environments, audit trails, and SOC 2 compliance.
  • Cross-domain adaptability: They integrate easily across payer workflows — intake, summarization, or correspondence.
  • Rapid iteration: Frequent updates and strong partner ecosystems reduce implementation lag.

But there are caveats. General models sometimes “hallucinate” clinical or regulatory facts, especially when interpreting EHR data. Without domain fine-tuning or strong prompt governance, output quality can drift.

The Case for Healthcare-Specific LLMs

A growing ecosystem of medical-domain LLMs is changing the landscape. Google’s Med-PaLM 2 demonstrated near-clinician accuracy on the MedQA benchmark, outperforming GPT-4 in structured reasoning about medical questions. Open-source options like BioGPT (Microsoft) and Clinical Camel are being tested for biomedical text mining and claims coding support.

Advantages include:

  • Higher clinical grounding: Trained on PubMed, clinical guidelines, and biomedical literature.
  • Explainability: Some models provide citation-based reasoning or evidence chains.
  • On-premise deployability: Open-source variants allow PHI-safe environments.

Yet, the trade-offs are real:

  • Limited generalization: These models can underperform on administrative or financial text.
  • Resource demands: Fine-tuning and maintenance require specialized infrastructure and talent.
  • Regulatory uncertainty: Validation for real-world payer use remains early-stage.

Synthesizing the Middle Ground

The emerging consensus is hybridization. Many payers and health systems are adopting dual-model architectures:

  • A general-purpose model (e.g., GPT or Claude) for summarization, knowledge extraction, and conversational interfaces.3
  • A domain-specific, internally governed model (often LLaMA- or Mistral-based) for compliance-sensitive tasks involving PHI, clinical logic, or audit documentation.

This “governed ensemble” strategy balances innovation and oversight — leveraging the cognitive power of frontier models while preserving control where it matters most.
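A dual-model architecture needs an explicit, auditable routing rule. The Python sketch below shows one minimal way to express it: the `Task` fields, the two route labels, and the in-memory `AUDIT_LOG` are hypothetical stand-ins, and no real model API is called.

```python
from dataclasses import dataclass

# Hypothetical in-memory audit trail; a real system would persist this.
AUDIT_LOG: list[tuple[str, str]] = []


@dataclass(frozen=True)
class Task:
    name: str
    contains_phi: bool
    compliance_sensitive: bool


def route(task: Task) -> str:
    """Send PHI or compliance-sensitive work to the internally governed model."""
    if task.contains_phi or task.compliance_sensitive:
        target = "internal_model"
    else:
        target = "general_model"
    AUDIT_LOG.append((task.name, target))  # record every routing decision
    return target


faq = Task("summarize public benefit FAQ", contains_phi=False, compliance_sensitive=False)
denial = Task("draft denial rationale", contains_phi=True, compliance_sensitive=True)
print(route(faq), route(denial))
```

The rule defaults to the internal model whenever either flag is set, so a mislabeled task errs toward the more controlled path.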

The key isn’t picking a single best model. It’s building the right model governance stack — version control, prompt audit trails, human-in-the-loop review, and strict access controls. Healthcare’s best LLM is not the one that knows the most, but the one that knows its limits.

The Bottom Line

Choosing an LLM for healthcare isn’t a procurement exercise — it’s a governance decision. Plans should evaluate models the way they would evaluate clinical interventions: by evidence, reliability, and risk tolerance.

The best LLMs for healthcare are those that combine precision, provenance, and privacy — not those that simply perform best in general benchmarks. Success lies in orchestrating intelligence responsibly, not in adopting it blindly.

At Mizzeto, we help payers design AI ecosystems that strike this balance. Our frameworks support multi-model orchestration, secure deployment, and audit-ready oversight — enabling health plans to innovate confidently without compromising compliance or control. Because in healthcare, intelligence isn’t just about what a model can say — it’s about what a plan can trust.

SOURCES

  1. Assessing the use of the novel tool Claude 3 in comparison to ChatGPT 4.0
  2. Use of GPT-4 to analyze medical records of patients with extensive investigations and delayed diagnosis
  3. Benefits, Limits, and Risks of GPT-4 as an AI Chatbot for Medicine

October 24, 2025

2 min read