Article

Most grievances start as calls that never got reviewed

  • June 12, 2026

How call intelligence catches member complaints before they reach your grievance and appeals team

Every health plan has a grievance and appeals operation. Staff, timelines, case management workflows, regulatory tracking. For Medicare Advantage plans, CMS audits it. For Medicaid plans, states audit it. The compliance burden is real. The operational cost is real.

What most plans have not built is anything upstream of it.

A formal grievance is not where the problem starts. It is where the problem ends up after several earlier opportunities to fix it were missed. A member's prior authorization status was communicated incorrectly. A benefit change was explained poorly and the member hung up confused. An agent gave a formulary answer that turned out to be wrong. None of those calls showed up in the QA report, because the QA program reviewed 2 to 5 percent of call volume[1] and these particular calls were not in the sample. Weeks later, the member called back. Then again. Then filed.

This is the upstream problem that most health plan grievance operations are not resourced to see. The tools needed to catch it simply have not been part of how call center quality has historically worked.

What grievances are actually made of

CMS defines a grievance as a complaint about a plan's delivery of service. That definition covers a wide range of situations: an agent who was dismissive, a hold time that felt unreasonable, a coverage question that went unanswered. The operational category that generates the most preventable grievances is simpler than the regulatory definition suggests.

Most formal grievances start as an unresolved phone call.

SQM Group's benchmarking data puts the healthcare insurance call center first call resolution rate at approximately 72 percent[2], meaning roughly 28 percent of member calls do not get resolved on first contact. Some of those members call back once. Some call back twice. When the calls involve a coverage dispute, a prior authorization denial, or a benefit question with real financial consequences for the member, the escalation path eventually leads to a formal grievance.

The more telling finding is about complaint calls specifically. The FCR rate for interactions where a member is already expressing dissatisfaction is only 47 percent.[3] Fewer than half of the calls most likely to become a grievance get resolved on the first contact. Plans are sending a continuous stream of unresolved member issues toward the back-end operations that cost the most to run.

There is a second finding worth flagging here. SQM's research estimates that approximately 14 percent of callers describe their call as a complaint call. Most contact centers believe that number is under 5 percent.[4] Complaint volume is substantially underreported. Plans are not just missing the resolution. They are missing the signal that a problem exists at all.
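To make the scale concrete, the two SQM figures above can be combined in a back-of-the-envelope calculation. The sketch below is illustrative only: the monthly volume of 100,000 calls is a hypothetical number, and applying the 47 percent complaint FCR rate uniformly to every complaint call is a simplifying assumption.

```python
def unresolved_complaint_calls(total_calls, complaint_share_pct, complaint_fcr_pct):
    """Return (complaint_calls, unresolved_complaint_calls) using integer math."""
    complaint_calls = total_calls * complaint_share_pct // 100
    resolved_first_contact = complaint_calls * complaint_fcr_pct // 100
    return complaint_calls, complaint_calls - resolved_first_contact


monthly_calls = 100_000  # hypothetical plan volume, not from the source

# What SQM's research suggests (about 14 percent complaint calls, 47 percent FCR)
actual = unresolved_complaint_calls(monthly_calls, 14, 47)

# What a contact center assuming "under 5 percent" would expect to see
assumed = unresolved_complaint_calls(monthly_calls, 5, 47)

print("complaint calls (actual vs assumed):", actual[0], assumed[0])
print("unresolved after first contact:", actual[1], assumed[1])
```

Under these assumptions, the plan would have roughly 7,400 unresolved complaint calls a month while planning for fewer than 2,700.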

The enforcement environment has changed

In January 2026, California's Department of Managed Health Care fined Anthem Blue Cross $15 million for what it described as longstanding and widespread deficiencies in handling member grievances spanning more than 15 years.[5] The action included a requirement for an independent auditor to oversee grievance system corrections for up to four years. This was not the first enforcement action for the same pattern. Prior survey findings had not produced sustained correction.

The pattern regulators are reacting to is not primarily about whether a plan has a compliant grievance process on paper. It is about whether grievances are actually resolved. A well documented compliance framework that leaves a significant share of complaints unresolved is still a regulatory liability.

CMS also removed the Complaints about the Health Plan and Complaints about the Drug Plan measures from the Star Ratings formula, effective with the 2029 Star Ratings.[6] The instinctive response to that change is to treat complaint monitoring as less important. That is the wrong conclusion. CMS retains these as display measures, complaint trends are widely regarded in payer operations as leading indicators of CAHPS deterioration, and enforcement authority over grievance handling is completely independent of Star Ratings. The scorecard accountability signal is gone. The underlying compliance exposure is not.

What the call reveals that the grievance does not

By the time a formal grievance reaches the G&A team, the plan already has a documentation obligation, a response deadline, and a case that may be reviewed by a regulator. What the plan usually does not have is insight into the original interaction.

Not because the call was not recorded. Most plans record calls. But because the call was not analyzed. Nobody reviewed what the agent communicated about the prior authorization decision, whether the member left with an accurate understanding of their coverage, or whether the issue raised on that call had appeared in 200 other interactions in the previous 30 days.

That missing analysis is the upstream gap. A grievance intake form tells the plan what the member is upset about now. The original call tells the plan what actually happened, who was responsible, whether it was an agent accuracy issue or a systemic script failure, and whether the same pattern is playing out across hundreds of calls currently in queue.
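One way to picture closing that gap is to join a grievance record back to the member's recent calls. The following is a minimal sketch, not a description of any particular product: the record types, field names, the 90-day lookback, and the idea that calls and grievances share a consistent topic label are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class CallRecord:          # hypothetical shape of an analyzed call
    call_id: str
    member_id: str
    call_date: date
    topic: str


@dataclass(frozen=True)
class Grievance:           # hypothetical shape of a G&A intake record
    grievance_id: str
    member_id: str
    filed_date: date
    topic: str


def candidate_originating_calls(grievance, calls, lookback_days=90):
    """Return the member's same-topic calls in the lookback window, oldest first."""
    window_start = grievance.filed_date - timedelta(days=lookback_days)
    matches = [
        c for c in calls
        if c.member_id == grievance.member_id
        and c.topic == grievance.topic
        and window_start <= c.call_date <= grievance.filed_date
    ]
    return sorted(matches, key=lambda c: c.call_date)


calls = [
    CallRecord("C1", "M1", date(2026, 2, 1), "prior_auth"),
    CallRecord("C2", "M1", date(2026, 2, 20), "prior_auth"),
    CallRecord("C3", "M1", date(2025, 9, 1), "prior_auth"),   # outside the window
    CallRecord("C4", "M2", date(2026, 2, 10), "prior_auth"),  # different member
    CallRecord("C5", "M1", date(2026, 2, 15), "formulary"),   # different topic
]
grievance = Grievance("G1", "M1", date(2026, 3, 5), "prior_auth")
linked = candidate_originating_calls(grievance, calls)
print([c.call_id for c in linked])
```

The earliest call in the result is the natural starting point for root cause review, since it is the first interaction where the issue could have been resolved.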

When plans close that gap, the economics look different. Processing a formal grievance costs real money in staff time, documentation, and in some cases regulatory engagement. Catching the underlying issue in the call costs a fraction of that. The intervention point is upstream, and it is the cheaper one.

The call types most likely to generate a formal grievance follow a recognizable pattern. Each one shares the same underlying failure: an interaction that ended without resolution, with no mechanism to catch it before the member decided to escalate.

Call types, failure modes, and what changes with full interaction monitoring

Call type | Legacy operating model | What changes with full monitoring
Prior authorization status calls | Agent improvises, no QA on resolution, member leaves without a clear answer | AI flags unresolved PA inquiries, repeat callers on the same PA number identified same day
Formulary and benefit change calls | Sampling misses misinformation, member gets the wrong coverage explanation | Every benefit change call scored, agents with low quality responses flagged before the next open enrollment
Appeals and grievance inquiries | Member learns to escalate, formal grievance filed, G&A team receives intake weeks after the call | Call content linked to G&A filings, root cause traced to the originating interaction, coach before it escalates
Non-English member calls | Interpreter routed, resolution unverified, member satisfaction surveys capture dissatisfaction 12 to 18 months later | Native language QA scores every interaction, LEP member resolution tracked like any other call
Repeat callers on the same unresolved issue | Repeat rate not systematically tracked, QA sample shows high compliance while the problem builds | Repeat call pattern detected across full volume, systemic issues surfaced before a G&A filing
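The repeat caller detection described in the table can be sketched in a few lines once every call carries a member identifier and an issue key. This is an illustrative sketch under stated assumptions: the `Call` fields, the 30-day window, and the two-call threshold are hypothetical choices, not values from the source.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Call:
    member_id: str
    issue_key: str   # e.g. a PA number or benefit topic (hypothetical field)
    call_date: date
    resolved: bool


def find_repeat_issues(calls, window_days=30, min_calls=2):
    """Flag (member, issue) pairs with at least min_calls unresolved calls
    falling inside any window of window_days."""
    unresolved = defaultdict(list)
    for call in calls:
        if not call.resolved:
            unresolved[(call.member_id, call.issue_key)].append(call.call_date)

    window = timedelta(days=window_days)
    flagged = {}
    for key, dates in unresolved.items():
        dates.sort()
        # Compare each date with the one min_calls - 1 positions later.
        for i in range(len(dates) - min_calls + 1):
            if dates[i + min_calls - 1] - dates[i] <= window:
                flagged[key] = dates
                break
    return flagged


calls = [
    Call("M1", "PA-1001", date(2026, 3, 1), False),
    Call("M1", "PA-1001", date(2026, 3, 9), False),    # repeat within 30 days
    Call("M2", "PA-2002", date(2026, 3, 2), True),     # resolved, ignored
    Call("M3", "formulary", date(2026, 1, 5), False),
    Call("M3", "formulary", date(2026, 3, 20), False), # too far apart
]
flagged = find_repeat_issues(calls)
print(list(flagged))
```

The check only works across full call volume: run against a 2 to 5 percent sample, most of the paired calls would simply not be in the data.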

The language access dimension

Non-English member calls carry a disproportionate share of the upstream grievance risk. When a limited English proficient member reaches an agent who cannot serve them without an interpreter, resolution quality is harder to verify, the member is less likely to push back on an unclear answer, and the interaction is almost never scored in a traditional QA program.

The CY2027 Final Rule removed the Call Center Foreign Language Interpreter and TTY Availability measure from Star Ratings, effective with the 2029 Star Ratings.[7] Plans may read that as reduced pressure on language access. That reading misses how the bar has shifted. The old measure rewarded having an interpreter line available. LEP member dissatisfaction with service quality now surfaces in overall CAHPS scores, and those scores carry direct financial consequences for Medicare Advantage plans. That dissatisfaction shows up 12 to 18 months after the interaction.

Plans that do not score non-English calls with the same rigor as English calls have a quality gap that will surface in member satisfaction surveys and, for Medicare Advantage plans, directly in CAHPS scores. It will just arrive on a delay.

What to look for in a solution

If you are evaluating call intelligence for this purpose, these are the capabilities that actually matter:

100 percent call coverage, not sampling. The interactions that generate formal grievances are rarely in a 2 to 5 percent random sample. Full coverage is what makes upstream intervention possible.

Root cause analytics, not just QA scores. An agent compliance score tells you whether the script was followed. It does not tell you why members are calling back about the same issue or whether a benefit change was communicated incorrectly across an entire team. Systemic failure identification is what separates a grievance prevention tool from a QA scorecard.

Linkage between call data and G&A intake. If a formal grievance arrives and the underlying call is retrievable, analyzable, and traceable to a root cause, that is a materially different case than one built from member description alone. The connection between call intelligence and grievance case management is where the prevention value is.

Native language quality monitoring. Non-English calls should be scored in the language they were conducted, not translated after the fact and then graded. Post hoc translation loses the nuances that determine whether a member actually understood the answer they received.

Plan ownership of the underlying data. If call recordings, transcripts, and scoring models belong to a vendor rather than the plan, the plan cannot connect call intelligence to its own member satisfaction data, grievance analytics, and retention tracking. That intelligence needs to stay with the plan.
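The case for the first capability, full coverage rather than sampling, can be checked with simple probability. If QA reviews a uniform random sample at rate p and a member makes n calls before filing, the chance that at least one of those calls was reviewed is 1 - (1 - p)^n. The sketch below assumes independent, uniform sampling and a member who calls three times before filing; both are assumptions, and real QA programs may sample differently.

```python
def chance_any_call_reviewed(sample_rate, calls_before_filing):
    """Probability that at least one of the member's calls landed in the QA sample."""
    return 1 - (1 - sample_rate) ** calls_before_filing


for rate in (0.02, 0.05):
    p = chance_any_call_reviewed(rate, 3)
    print(f"sample rate {rate:.0%}: {p:.1%} chance any of 3 calls was reviewed")
```

Even at the top of the 2 to 5 percent range, the odds are about 14 percent that QA saw any of the three calls, and about 6 percent at the bottom of the range.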

How Claro by Mizzeto approaches this

Claro by Mizzeto was purpose built for health plans and analyzes 100 percent of member interactions across languages, scoring every call against six dimensions: CMS guidelines compliance, HIPAA compliance, resolution rate, CMS accuracy, member sentiment, and communication and empathy. Automated post call translation of non-English interactions is the enabling capability that makes scoring in any language possible. Call data stays with the plan. For related context, see Are Health Plans Really Listening to All Their Members? and our guide to how payers can fix their call centers.

The bottom line

Grievance volume is a lagging indicator. By the time a formal complaint reaches the G&A team, the call that caused it has already happened, the member's patience has already worn out, and the cheap intervention window has already closed. The plans that reduce preventable grievances are not the ones that build bigger G&A teams. They are the ones that get visibility into every call before the member decides to escalate.

The data is already in the call recordings. The question is whether the plan is actually looking at it.

References

  1. SQM Group. Published guidance on manual QA sampling rates, noting that most contact centers review 2 to 5 percent of call volume. https://www.sqmgroup.com/software
  2. SQM Group. Call Center FCR Benchmark Results by Industry. Healthcare insurance industry first call resolution rate cited at approximately 72 percent. https://www.sqmgroup.com/resources/library/blog/call-center-fcr-benchmark-2024-results-by-industry
  3. SQM Group. Top 10 Reasons for Repeat Call Complaints. FCR rate for complaint calls cited at 47 percent. https://www.sqmgroup.com/resources/library/blog/customer-complaint-calls
  4. SQM Group. Top 10 Reasons for Repeat Call Complaints. Discrepancy between actual complaint call volume (approximately 14 percent) and the typical contact center estimate (under 5 percent). https://www.sqmgroup.com/resources/library/blog/customer-complaint-calls
  5. California Department of Managed Health Care. Press release: DMHC fines Anthem Blue Cross $15 million for longstanding and widespread failures with member complaints. January 30, 2026. https://www.dmhc.ca.gov/Resources/Newsroom/PressReleases/January30,2026.aspx
  6. Centers for Medicare and Medicaid Services. Contract Year 2027 Medicare Advantage and Part D Final Rule, April 2026. Complaints about the Health Plan and Complaints about the Drug Plan measures removed from Star Ratings, effective with the 2029 Star Ratings.
  7. Centers for Medicare and Medicaid Services. Contract Year 2027 Medicare Advantage and Part D Final Rule, April 2026. Call Center Foreign Language Interpreter and TTY Availability measure removed from Star Ratings, effective with the 2029 Star Ratings.


AI Data Governance - Mizzeto Collaborates with Fortune 25 Payer

AI Data Governance

The rapid acceleration of AI in healthcare has created an unprecedented challenge for payers. Many healthcare organizations are uncertain about how to deploy AI technologies effectively, often fearing unintended ripple effects across their ecosystems. Recognizing this, Mizzeto recently collaborated with a Fortune 25 payer to design comprehensive AI data governance frameworks—helping streamline internal systems and guide third-party vendor selection.

This urgency is backed by industry trends. According to a survey by Define Ventures, over 50% of health plan and health system executives identify AI as an immediate priority, and 73% have already established governance committees. 

Define Ventures, Payer and Provider Vision for AI Survey

However, many healthcare organizations struggle to establish clear ownership and accountability for their AI initiatives. When different departments implement AI solutions independently and without coordination, the organization becomes fragmented and exposes itself to data breaches, compliance risks, and substantial regulatory fines.

Principles of AI Data Governance  

AI Data Governance in healthcare, at its core, is a structured approach to managing how AI systems interact with sensitive data, ensuring these powerful tools operate within regulatory boundaries while delivering value.  

For payers wrestling with multiple AI implementations across claims processing, member services, and provider data management, proper governance provides the guardrails needed to safely deploy AI. Without it, organizations risk not only regulatory exposure but also the potential for PHI data leakage—leading to hefty fines, reputational damage, and a loss of trust that can take years to rebuild. 

Healthcare AI Governance can be boiled down into 3 key principles:  

  1. Protect People - Ensuring member data privacy, security, and regulatory compliance (HIPAA, GDPR, etc.).
  2. Prioritize Equity - Mitigating algorithmic bias and ensuring AI models serve diverse populations fairly.
  3. Promote Health Value - Aligning AI-driven decisions with better member outcomes and cost efficiencies.

Protect People – Safeguarding Member Data 

For payers, protecting member data isn’t just about ticking compliance boxes—it’s about earning trust, keeping it, and staying ahead of costly breaches. When AI systems handle Protected Health Information (PHI), security needs to be baked into every layer, leaving no room for gaps.

To start, payers can double down on essentials like end-to-end encryption and role-based access controls (RBAC) to keep unauthorized users at bay. But that’s just the foundation. Real-time anomaly detection and automated audit logs are game-changers, flagging suspicious access patterns before they spiral into full-blown breaches. Meanwhile, differential privacy techniques ensure AI models generate valuable insights without ever exposing individual member identities.

Enter risk tiering—a strategy that categorizes data based on its sensitivity and potential fallout if compromised. This laser-focused approach allows payers to channel their security efforts where they’ll have the biggest impact, tightening defenses where it matters most.
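Risk tiering can be expressed as a small classification rule. The following is a hypothetical sketch: the three inputs, the 1 to 3 impact score, and the tier boundaries are assumptions chosen for illustration, not a framework described in this article or required by any regulation.

```python
from enum import IntEnum


class Tier(IntEnum):
    LOW = 1
    MODERATE = 2
    HIGH = 3


def risk_tier(contains_phi: bool, directly_identifying: bool, breach_impact: int) -> Tier:
    """Assign a tier from sensitivity and potential fallout.

    breach_impact is a hypothetical 1-3 score set by the governance team.
    """
    if not 1 <= breach_impact <= 3:
        raise ValueError("breach_impact must be 1, 2, or 3")
    if contains_phi and (directly_identifying or breach_impact == 3):
        return Tier.HIGH
    if contains_phi or breach_impact >= 2:
        return Tier.MODERATE
    return Tier.LOW


# Hypothetical datasets: (contains_phi, directly_identifying, breach_impact)
datasets = {
    "claims_with_member_ids": (True, True, 3),
    "aggregate_utilization_stats": (False, False, 2),
    "public_provider_directory": (False, False, 1),
}
tiers = {name: risk_tier(*attrs) for name, attrs in datasets.items()}
for name, tier in tiers.items():
    print(f"{name}: {tier.name}")
```

Controls such as encryption requirements, access reviews, and audit frequency can then be keyed to the tier instead of being decided dataset by dataset.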

On top of that, data minimization strategies work to reduce unnecessary PHI usage, and automated consent management tools put members in the driver’s seat, letting them control how their data is used in AI-powered processes. Without these layers of protection, payers risk not only regulatory crackdowns but also a devastating hit to their reputation—and worse, a loss of member trust they may never recover.

Prioritize Equity – Building Fair and Unbiased AI Models 

AI should break down barriers to care, not build new ones. Yet, biased datasets can quietly drive inequities in claims processing, prior authorizations, and risk stratification, leaving certain member groups at a disadvantage. To address this, payers must start with diverse, representative datasets and implement bias detection algorithms that monitor outcomes across all demographics. Synthetic data augmentation can fill demographic gaps, while explainable AI (XAI) tools ensure transparency by showing how decisions are made.
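Monitoring outcomes across demographics can start with something as simple as comparing approval rates by group. The sketch below computes a demographic parity gap, which is only one of several fairness measures; the group labels, the sample decisions, and the 0.10 review threshold are all hypothetical.

```python
from collections import defaultdict


def approval_rates(decisions):
    """decisions: iterable of (group, approved) pairs. Returns rate per group."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, was_approved in decisions:
        totals[group] += 1
        if was_approved:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


def parity_gap(rates):
    """Difference between the highest and lowest group approval rates."""
    return max(rates.values()) - min(rates.values())


# Hypothetical prior authorization decisions for two member groups
decisions = (
    [("group_a", True)] * 8 + [("group_a", False)] * 2
    + [("group_b", True)] * 6 + [("group_b", False)] * 4
)
rates = approval_rates(decisions)
gap = parity_gap(rates)

REVIEW_THRESHOLD = 0.10  # hypothetical trigger for human review
needs_review = gap > REVIEW_THRESHOLD
print(rates, round(gap, 2), needs_review)
```

A gap above the threshold does not prove bias on its own, since groups can differ in case mix, but it tells the governance committee where to look first.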

But technology alone isn’t enough. AI Ethics Committees should oversee model development to ensure fairness is embedded from day one. Adversarial testing—where diverse teams push AI systems to their limits—can uncover hidden biases before they become systemic issues. By prioritizing equity, payers can transform AI from a potential liability into a force for inclusion, ensuring decisions support all members fairly. This approach doesn’t just reduce compliance risks—it strengthens trust, improves engagement, and reaffirms the commitment to accessible care for everyone.

Promote Health Value – Aligning AI with Better Member Outcomes 

AI should go beyond automating workflows—it should reshape healthcare by improving outcomes and optimizing costs. To achieve this, payers must integrate real-time clinical data feeds into AI models, ensuring decisions account for current member needs rather than outdated claims data. Furthermore, predictive analytics can identify at-risk members earlier, paving the way for proactive interventions that enhance health and reduce expenses.

Equally important are closed-loop feedback systems, which validate AI recommendations against real-world results, continuously refining accuracy and effectiveness. At the same time, FHIR-based interoperability enables AI to seamlessly access EHR and provider data, offering a more comprehensive view of member health.

To measure the full impact, payers need robust dashboards tracking key metrics such as cost savings, operational efficiency, and member outcomes. When implemented thoughtfully, AI becomes much more than a tool for automation—it transforms into a driver of personalized, smarter, and more transparent care.

Figure: Integrated artificial intelligence compliance (source: FTI Technology)

Importance of an AI Governance Committee

An AI Governance Committee is a necessity for payers focused on deploying AI technologies in their organization. As artificial intelligence becomes embedded in critical functions like claims adjudication, prior authorizations, and member engagement, its influence touches nearly every corner of the organization. Without a central body to oversee these efforts, payers risk a patchwork of disconnected AI initiatives, where decisions made in one department can have unintended ripple effects across others. The stakes are high: fragmented implementation doesn’t just open the door to compliance violations—it undermines member trust, operational efficiency, and the very purpose of deploying AI in healthcare.

To be effective, the committee must bring together expertise from across the organization. Compliance officers ensure alignment with HIPAA and other regulations, while IT and data leaders manage technical integration and security. Clinical and operational stakeholders ensure AI supports better member outcomes, and legal advisors address regulatory risks and vendor agreements. This collective expertise serves as a compass, helping payers harness AI’s transformative potential while protecting their broader healthcare ecosystem.

Mizzeto’s Collaboration with a Fortune 25 Payer

At Mizzeto, we’ve partnered with a Fortune 25 payer to design and implement advanced AI Data Governance frameworks, addressing both internal systems and third-party vendor selection. Throughout this journey, we’ve found that the key to unlocking the full potential of AI lies in three core principles: Protect People, Prioritize Equity, and Promote Health Value. These principles aren’t just aspirational—they’re the bedrock for creating impactful AI solutions while maintaining the trust of your members.

If your organization is looking to harness the power of AI while ensuring safety, compliance, and meaningful results, let’s connect. At Mizzeto, we’re committed to helping payers navigate the complexities of AI with smarter, safer, and more transformative strategies. Reach out today to see how we can support your journey.

February 14, 2025


Article

Most grievances start as calls that never got reviewed

How call intelligence catches member complaints before they reach your grievance and appeals team

Every health plan has a grievance and appeals operation. Staff, timelines, case management workflows, regulatory tracking. For Medicare Advantage plans, CMS audits it. For Medicaid plans, states audit it. The compliance burden is real. The operational cost is real.

What most plans have not built is anything upstream of it.

A formal grievance is not where the problem starts. It is where the problem ends up after several earlier opportunities to fix it were missed. A member's prior authorization status was communicated incorrectly. A benefit change was explained poorly and the member hung up confused. An agent gave a formulary answer that turned out to be wrong. None of those calls showed up in the QA report, because the QA program reviewed 2 to 5 percent of call volume[1] and these particular calls were not in the sample. Weeks later, the member called back. Then again. Then filed.

This is the upstream problem that most health plan grievance operations are not resourced to see. The tools needed to catch it simply have not been part of how call center quality has historically worked.

What grievances are actually made of

CMS defines a grievance as a complaint about a plan's delivery of service. That definition covers a wide range of situations: an agent who was dismissive, a hold time that felt unreasonable, a coverage question that went unanswered. The operational category that generates the most preventable grievances is simpler than the regulatory definition suggests.

Most formal grievances start as an unresolved phone call.

SQM Group's benchmarking data puts the healthcare insurance call center first call resolution rate at approximately 72 percent[2], meaning roughly 28 percent of member calls do not get resolved on first contact. Some of those members call back once. Some call back twice. When the calls involve a coverage dispute, a prior authorization denial, or a benefit question with real financial consequences for the member, the escalation path eventually leads to a formal grievance.

The more telling finding is about complaint calls specifically. The FCR rate for interactions where a member is already expressing dissatisfaction is only 47 percent.[3] Less than half of the calls most likely to become a grievance get resolved on the first contact. Plans are sending a continuous stream of unresolved member issues toward the back end operations that cost the most to run.

There is a second finding worth flagging here. SQM's research estimates that approximately 14 percent of callers describe their call as a complaint call. Most contact centers believe that number is under 5 percent.[4] Complaint volume is substantially underreported. Plans are not just missing the resolution. They are missing the signal that a problem exists at all.

The enforcement environment has changed

In January 2026, California's Department of Managed Health Care fined Anthem Blue Cross $15 million for what it described as longstanding and widespread deficiencies in handling member grievances spanning more than 15 years.[5] The action included a requirement for an independent auditor to oversee grievance system corrections for up to four years. This was not the first enforcement action for the same pattern. Prior survey findings had not produced sustained correction.

The pattern regulators are reacting to is not primarily about whether a plan has a compliant grievance process on paper. It is about whether grievances are actually resolved. A well documented compliance framework that leaves a significant share of complaints unresolved is still a regulatory liability.

CMS also removed the Complaints about the Health Plan and Complaints about the Drug Plan measures from the Star Ratings formula, effective with the 2029 Star Ratings.[6] The instinctive response to that change is to treat complaint monitoring as less important. That is the wrong conclusion. CMS retains these as display measures, complaint trends are widely regarded in payer operations as leading indicators of CAHPS deterioration, and enforcement authority over grievance handling is completely independent of Star Ratings. The scorecard accountability signal is gone. The underlying compliance exposure is not.

What the call reveals that the grievance does not

By the time a formal grievance reaches the G&A team, the plan already has a documentation obligation, a response deadline, and a case that may be reviewed by a regulator. What the plan usually does not have is insight into the original interaction.

Not because the call was not recorded. Most plans record calls. But because the call was not analyzed. Nobody reviewed what the agent communicated about the prior authorization decision, whether the member left with an accurate understanding of their coverage, or whether the issue raised on that call had appeared on 200 other interactions in the previous 30 days.

That missing analysis is the upstream gap. A grievance intake form tells the plan what the member is upset about now. The original call tells the plan what actually happened, who was responsible, whether it was an agent accuracy issue or a systemic script failure, and whether the same pattern is playing out across hundreds of calls currently in queue.

When plans close that gap, the economics look different. Processing a formal grievance costs real money in staff time, documentation, and in some cases regulatory engagement. Catching the underlying issue in the call costs a fraction of that. The intervention point is upstream, and it is the cheaper one.

The call types most likely to generate a formal grievance follow a recognizable pattern. Each one shares the same underlying failure: an interaction that ended without resolution, with no mechanism to catch it before the member decided to escalate.

Call types, failure modes, and what changes with full interaction monitoring

Call volume source Legacy operating model What changes with full monitoring
Prior authorization status calls Agent improvises, no QA on resolution, member leaves without a clear answer AI flags unresolved PA inquiries, repeat callers on the same PA number identified same day
Formulary and benefit change calls Sampling misses misinformation, member gets the wrong coverage explanation Every benefit change call scored, agents with low quality responses flagged before the next open enrollment
Appeals and grievance inquiries Member learns to escalate, formal grievance filed, G&A team receives intake weeks after the call Call content linked to G&A filings, root cause traced to the originating interaction, coach before it escalates
Non-English member calls Interpreter routed, resolution unverified, member satisfaction surveys capture dissatisfaction 12 to 18 months later Native language QA scores every interaction, LEP member resolution tracked like any other call
Repeat callers on the same unresolved issue Repeat rate not systematically tracked, QA sample shows high compliance while the problem builds Repeat call pattern detected across full volume, systemic issues surfaced before a G&A filing

The language access dimension

Non-English member calls carry a disproportionate share of the upstream grievance risk. When a limited English proficient member reaches an agent who cannot serve them without an interpreter, resolution quality is harder to verify, the member is less likely to push back on an unclear answer, and the interaction is almost never scored in a traditional QA program.

The CY2027 Final Rule removed the Call Center Foreign Language Interpreter and TTY Availability measure from Star Ratings, effective with the 2029 Star Ratings.[7] Plans may read that as reduced pressure on language access. The old measure rewarded having an interpreter line available. The bar has shifted. LEP member dissatisfaction with service quality now surfaces in overall CAHPS scores, and those scores carry direct financial consequences for Medicare Advantage plans. It shows up 12 to 18 months after the interaction.

Plans that do not score non-English calls with the same rigor as English calls have a quality gap that will surface in member satisfaction surveys and, for Medicare Advantage plans, directly in CAHPS scores. It will just arrive on a delay.

What to look for in a solution

If you are evaluating call intelligence for this purpose, these are the capabilities that actually matter:

100 percent call coverage, not sampling. The interactions that generate formal grievances are rarely in a 2 to 5 percent random sample. Full coverage is what makes upstream intervention possible.

Root cause analytics, not just QA scores. An agent compliance score tells you whether the script was followed. It does not tell you why members are calling back about the same issue or whether a benefit change was communicated incorrectly across an entire team. Systemic failure identification is what separates a grievance prevention tool from a QA scorecard.

Linkage between call data and G&A intake. If a formal grievance arrives and the underlying call is retrievable, analyzable, and traceable to a root cause, that is a materially different case than one built from member description alone. The connection between call intelligence and grievance case management is where the prevention value is.

Native language quality monitoring. Non-English calls should be scored in the language they were conducted, not translated after the fact and then graded. Post hoc translation loses the nuances that determine whether a member actually understood the answer they received.

Plan ownership of the underlying data. If call recordings, transcripts, and scoring models belong to a vendor rather than the plan, the plan cannot connect call intelligence to its own member satisfaction data, grievance analytics, and retention tracking. That intelligence needs to stay with the plan.

How Claro by Mizzeto approaches this

Claro by Mizzeto was purpose built for health plans and analyzes 100 percent of member interactions across languages, scoring every call against six dimensions: CMS guidelines compliance, HIPAA compliance, resolution rate, CMS accuracy, member sentiment, and communication and empathy. Automated post call translation of non-English interactions is the enabling capability that makes scoring in any language possible. Call data stays with the plan. For related context, see Are Health Plans Really Listening to All Their Members? and our guide to how payers can fix their call centers.

The bottom line

Grievance volume is a lagging indicator. By the time a formal complaint reaches the G&A team, the call that caused it has already happened, the member's patience has already worn out, and the cheap intervention window has already closed. The plans that reduce preventable grievances are not the ones that build bigger G&A teams. They are the ones that get visibility into every call before the member decides to escalate.

The data is already in the call recordings. The question is whether the plan is actually looking at it.

References

  1. SQM Group. Published guidance on manual QA sampling rates, noting that most contact centers review 2 to 5 percent of call volume. https://www.sqmgroup.com/software
  2. SQM Group. Call Center FCR Benchmark Results by Industry. Healthcare insurance industry first call resolution rate cited at approximately 72 percent. https://www.sqmgroup.com/resources/library/blog/call-center-fcr-benchmark-2024-results-by-industry
  3. SQM Group. Top 10 Reasons for Repeat Call Complaints. FCR rate for complaint calls cited at 47 percent. https://www.sqmgroup.com/resources/library/blog/customer-complaint-calls
  4. SQM Group. Top 10 Reasons for Repeat Call Complaints. Discrepancy between actual complaint call volume (approximately 14 percent) and the typical contact center estimate (under 5 percent). https://www.sqmgroup.com/resources/library/blog/customer-complaint-calls
  5. California Department of Managed Health Care. Press release: DMHC fines Anthem Blue Cross $15 million for longstanding and widespread failures with member complaints. January 30, 2026. https://www.dmhc.ca.gov/Resources/Newsroom/PressReleases/January30,2026.aspx
  6. Centers for Medicare and Medicaid Services. Contract Year 2027 Medicare Advantage and Part D Final Rule, April 2026. Complaints about the Health Plan and Complaints about the Drug Plan measures removed from Star Ratings, effective with the 2029 Star Ratings.
  7. Centers for Medicare and Medicaid Services. Contract Year 2027 Medicare Advantage and Part D Final Rule, April 2026. Call Center Foreign Language Interpreter and TTY Availability measure removed from Star Ratings, effective with the 2029 Star Ratings.


June 12, 2026

2 min read

Article

Are Health Plans Really Listening to All Their Members?

The honest answer for most health plans is no. Most audit only 2 to 5 percent of their member service call volume,[1] whether the calls are handled in-house, outsourced, or run on a hybrid model. The other 95 percent or more, including the calls that drive complaint patterns, CAHPS results for Medicare Advantage plans, NCQA accreditation outcomes, and grievance filings, is operationally invisible. This is not because plans are negligent. It is because manual quality assurance, at any scale a health plan would consider economically viable, was never going to review more than a fraction of total volume.

The model was defensible when human review of every call was infeasible. A quality assurance team, almost always internal to the plan, listens to a random sample of recordings, grades each against a rubric covering greeting, identity verification, hold etiquette, closing, and compliance disclosures, and rolls the scores up into a monthly quality report. That constraint has now changed, and the math of what the sampled program does not see has gotten more expensive.

Why a 2 percent sample cannot tell you what you need to know

Three things break the sampling model in the real operating environment of a health plan. First, the sample is too small to surface the outliers that drive dissatisfaction: the prior authorization status miscommunication, the Spanish-speaking caller transferred three times, the appeals question a non-clinical agent improvised on. At 2 percent of a million calls a year, you are reviewing 20,000 calls, and the handful that mattered are statistically invisible.
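The arithmetic behind that claim can be sketched in a few lines. The call volume and sample rate come from the paragraph above; the count of 250 problem calls is a purely hypothetical figure chosen for illustration, and the function name is likewise an assumption.

```python
from fractions import Fraction


def expected_reviewed(total_calls: int, sample_rate: Fraction, problem_calls: int) -> tuple[int, Fraction]:
    """Return (sample size, expected number of problem calls that land in the sample)."""
    sample_size = int(total_calls * sample_rate)
    # Under simple random sampling, each call has the same chance of being reviewed.
    expected_problem_calls = problem_calls * sample_rate
    return sample_size, expected_problem_calls


# 1,000,000 calls and a 2 percent sample are from the text; 250 problem calls is hypothetical.
sample_size, expected_hits = expected_reviewed(1_000_000, Fraction(2, 100), 250)
print(f"Calls reviewed: {sample_size:,}")
print(f"Problem calls expected in the sample: {expected_hits} of 250")
```

Under these assumptions, reviewers would expect to hear about five of the 250 problem calls, which is rarely enough to recognize a pattern.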

Second, the scoring rubric measures script compliance, not member outcomes. An agent can hit every checkbox on a quality assurance form and leave the member with the wrong answer. Conversely, an agent can solve a complex eligibility question with empathy and skip half the script. The first call scores well, the second scores poorly, and member experience is the inverse.

Third, and most consequentially, the quality assurance score is structurally disconnected from the outcomes the plan actually cares about. A plan can run a compliance program for three years and still watch scores drift downward, because the program is measuring what is easy to grade, not what determines plan performance.

The financial stakes vary by line of business but point in the same direction. For Medicare Advantage plans, CAHPS measures were quadruple-weighted in the 2023 through 2025 Star Ratings and remain double-weighted from the 2026 Star Ratings forward,[2] and QBP eligibility is binary at the four-star threshold. A half-star drop from 4.0 to 3.5 eliminates the entire 5 percent Quality Bonus Payment on the benchmark, which represents several million dollars in annual revenue exposure on a mid-sized MA plan. For Medicaid managed care plans, the same call patterns drive access audit findings and capitation negotiations. For commercial and ACA plans, they drive NCQA accreditation outcomes and employer client retention.
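A rough sketch of the binary threshold for an MA plan follows. It is deliberately simplified: actual QBP mechanics involve more than a flat percentage, and the membership and monthly benchmark figures are hypothetical inputs, not data from any plan. Only the four-star threshold and the 5 percent figure come from the text.

```python
QBP_STAR_THRESHOLD = 4.0  # from the text: eligibility is binary at four stars
QBP_PERCENT = 5           # from the text: 5 percent bonus on the benchmark


def annual_qbp(star_rating: float, members: int, monthly_benchmark_dollars: int) -> float:
    """Simplified annual bonus: all of it at or above the threshold, none below."""
    if star_rating < QBP_STAR_THRESHOLD:
        return 0.0
    annual_benchmark = members * monthly_benchmark_dollars * 12
    return annual_benchmark * QBP_PERCENT / 100


# Hypothetical plan: 10,000 members and a $1,000 monthly benchmark per member.
at_four = annual_qbp(4.0, members=10_000, monthly_benchmark_dollars=1_000)
at_three_half = annual_qbp(3.5, members=10_000, monthly_benchmark_dollars=1_000)
exposure = at_four - at_three_half
print(f"Revenue exposure from a half-star drop: ${exposure:,.0f}")
```

The point of the sketch is the cliff: nothing is lost gradually between 4.0 and 3.5, because the whole bonus disappears at once.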

What changes when you measure every call

Multilingual speech-to-text, conversation analysis, and intent classification now run reliably at full call volume across the languages CMS requires plans to support for LEP populations. When Member Experience monitoring covers 100 percent of member calls rather than a sample:

  • You see the patterns that drive member experience scores before the next survey cycle comes back. Repeated transfers on prior authorization inquiries. Unresolved questions about formulary changes. Disconnects on appeals calls.
  • You can connect call content to outcomes. A member who called three times in a month about the same issue is not a satisfied member, regardless of how each individual call scored. The pattern is only visible across the full population.
  • You can detect compliance risk while it is still recoverable. A miscommunication about coverage that surfaces in week one of a transition can be corrected before it becomes a grievance, an appeal, or a CMS audit finding.
  • You can act on multilingual experience as a measured outcome rather than a checkbox. Surveys rarely capture whether a Spanish-speaking member actually understood the answer they got. Conversation-level analysis can.
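As one illustration of a pattern that only appears across the full call population, the repeat-caller signal described above could be computed along these lines. This is a minimal sketch: the record layout of member ID, issue label, and call date is a hypothetical structure, and the 30-day window and three-call threshold simply mirror the example in the text.

```python
from collections import defaultdict
from datetime import date, timedelta


def flag_repeat_callers(calls, window_days=30, min_calls=3):
    """Return (member_id, issue) pairs with at least min_calls calls inside any window_days span."""
    dates_by_key = defaultdict(list)
    for member_id, issue, call_date in calls:
        dates_by_key[(member_id, issue)].append(call_date)

    flagged = set()
    window = timedelta(days=window_days)
    for key, dates in dates_by_key.items():
        dates.sort()
        # Check each run of min_calls consecutive calls and see whether it fits in the window.
        for i in range(len(dates) - min_calls + 1):
            if dates[i + min_calls - 1] - dates[i] <= window:
                flagged.add(key)
                break
    return flagged


# Hypothetical call log entries: (member_id, issue, call_date).
calls = [
    ("M001", "prior_auth_status", date(2026, 3, 2)),
    ("M001", "prior_auth_status", date(2026, 3, 9)),
    ("M001", "prior_auth_status", date(2026, 3, 20)),
    ("M002", "formulary_change", date(2026, 3, 5)),
    ("M002", "formulary_change", date(2026, 5, 1)),
]
repeat_callers = flag_repeat_callers(calls)
print(repeat_callers)
```

A sampled program would need all three of a member's calls to land in the sample before it could see this; with full coverage the grouping is trivial.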

The relationship between surveys and conversation analysis is not a replacement. It is a stack. Member experience surveys tell the plan what the member experienced and how they felt about it. Call content tells the plan why. Surveys are the outcome layer that determines QBP eligibility for MA plans and accreditation status across other lines. Conversations are the operational layer that surfaces the specific behaviors, scripts, transfers, and miscommunications driving the outcome.

Six tests for a defensible call center Member Experience program

Member Services leaders evaluating how to upgrade their Member Experience program in 2026 can apply six tests. Each isolates a structural property that determines whether the program will actually measure member experience or merely produce a number. Most arrangements on the market today, including in-house teams running on legacy infrastructure and vendor-managed programs alike, will fail at least three of these tests.

Evaluation Framework: Six Tests for a Defensible Member Experience Program

Test | What to ask | Why it matters for payers
Coverage | Does the system review 100% of member calls, or does it sample? | The events that destroy CAHPS scores are outliers. A 2% sample makes them statistically invisible.
Independence | Is the member experience score connected to outcome measures the plan actually cares about, or only to a static internal rubric? | A score that moves independently of CAHPS, grievances, and retention is measuring compliance with the rubric, not member experience.
Data ownership | Who controls the call recordings, transcripts, scoring models, and historical trend lines? | Even when the plan runs member experience monitoring in-house, the underlying audio, transcription, and analytical infrastructure often live in vendor systems. Without portability, the plan cannot move providers without losing its operational history.
Language coverage | Are Spanish, Mandarin, Vietnamese, Tagalog, and other CMS-required languages scored natively? | CMS expects meaningful access for LEP members. Translation bolted onto an English-first model fails the standard.
Configurability | Can the plan modify the rubric in days, or does each change require a vendor amendment? | Speed of iteration is a measure of who actually owns the program. Slow change is vendor control.
Outcome linkage | Does the member experience score correlate with CAHPS, grievance volume, appeal overturns, and member retention? | If the score moves independently of every outcome it claims to predict, it is measuring something else. Find out what.

Source: Mizzeto analysis of payer call center practices, 2026.

CMS is already doing this. The question is whether you are.

Every year, CMS places test calls into the prospective beneficiary and current enrollee call centers of every Medicare Advantage and Part D plan in the country. The Accuracy and Accessibility Study measures whether plan representatives provide correct benefit information, whether qualified foreign-language interpreters are reachable for LEP callers, and whether TTY services function for deaf and hard-of-hearing members. Results feed directly into the Customer Service domain of the Star Ratings calculation, and low performance can trigger compliance action.[3]

CMS is, in other words, already running an independent Member Experience program on every MA plan's member-facing call center, because plan-administered quality scores alone have not been considered sufficient for regulatory measurement. But the agency's sample is tiny relative to total volume, and the protocol only measures what its test scenarios cover. Real-world failure modes that fall outside the protocol, including the prior authorization status miscommunication, the formulary question handled by an undertrained agent, and the appeals call that ends in a disconnect, are not in the data CMS sees.

What Mizzeto's Multilingual Member Experience Solution does

Mizzeto's Multilingual Member Experience Solution is built for health plans, not retrofitted from a generic CX platform, and is designed to pass all six tests above by default. Concretely:

  • 100 percent analysis of member interactions, with AI-driven transcription and scoring that runs at full call volume rather than a sample.
  • Native multilingual analysis across English, Spanish, Mandarin, Vietnamese, Tagalog, and the other languages CMS requires plans to support for LEP populations.
  • Payer-configurable scoring rubrics the plan can update in days, with Member Experience scores linked directly to CAHPS measures, grievance volumes, appeal overturns, and disenrollment patterns.
  • Plan ownership of the underlying conversation data, scoring models, and historical trend lines. Whether call handling, telephony, or transcription providers change, the member experience monitoring infrastructure stays with the plan.
  • Real-time alerting and direct conversation-level access for Member Services and Compliance teams, so when a grievance trend or audit finding requires explanation, the evidence is already in the plan's possession.

The bottom line

Member Experience measurement for health plans is at an inflection point. The sampling model that defined the last twenty years cannot survive the combination of heavily weighted member experience measures in the MA Star Ratings formula, parallel quality reporting requirements for Medicaid and ACA plans, and the simple fact that the technology to measure every call exists and is being deployed by competitors. If CMS itself has concluded plan-administered scores need an independent measurement layer, the rational response is to install one at scale, on the 95 percent of calls CMS does not sample, before CMS surfaces a finding the plan has to explain. Apply the six tests above to your current arrangement, and see how Mizzeto's Multilingual Member Experience Solution measures against them.

References

  1. SQM Group. mySQM Auto QA published guidance on manual QA sampling rates. www.sqmgroup.com/software
  2. Centers for Medicare and Medicaid Services. 2026 Medicare Part C and D Star Ratings Technical Notes. www.cms.gov/files/document/2026-star-ratings-technical-notes.pdf. CMS Medicare Advantage and Part D Final Rule reducing CAHPS and administrative measure weights from 4x to 2x effective 2026 Star Ratings.
  3. Centers for Medicare and Medicaid Services. Part C and Part D Call Center Monitoring, Timeliness and Accuracy and Accessibility Studies. Annual guidance memoranda and HPMS performance reports. www.cms.gov


May 19, 2026

2 min read

Article

The CMS 2027 Final Rule Just Rewrote the Star Ratings Playbook. Is Your Member Services Operation Ready?

On April 2, 2026, CMS finalized the Contract Year 2027 Medicare Advantage and Part D rule,[1] and the changes will reshape how health plans earn, protect, and lose revenue for years to come. CMS finalized the removal or retirement of 11 Star Ratings measures, declined to implement the Health Equity Index reward, added a new Depression Screening and Follow-Up measure, codified the Inflation Reduction Act's Part D benefit redesign, tightened supplemental benefit oversight, updated call recording retention requirements, and scaled back several health equity provisions. The agency received over 42,000 public comments on the proposed rule.[2]

The financial stakes are significant. According to CMS projections published in the final rule, the Star Ratings changes are estimated to have a net impact of approximately $18.6 billion on the Medicare Trust Fund over the 2027 to 2036 period.[3] Independent actuarial estimates, including analyses from Milliman and Wakely Consulting, suggest roughly 25% of contracts could lose half a Star, with at least 42 contracts potentially falling below the 4.0 threshold that determines quality bonus payments.[4] Industry analyses from Press Ganey and others estimate that by 2029, CAHPS and HOS survey measures could account for nearly 40% of total Star weight,[5] meaning the administrative measures that padded most plans' ratings are gone, and member experience is now the primary financial driver.

What CMS Changed: The Complete Picture

The rule touches nearly every operational function in a health plan. The Star Ratings overhaul is the headline: CMS finalized the removal or retirement of 11 measures, many of which were considered topped-out administrative or process measures with little performance variation across plans. CMS retained the Diabetes Care Eye Exam measure after comment period feedback, and added a new Depression Screening and Follow-Up measure for the 2027 measurement year (reflected in 2029 Stars). CMS also declined to implement the Health Equity Index reward, retaining the historical reward factor instead.

Beyond Stars, CMS codified the IRA's Part D benefit redesign into permanent regulation: the coverage gap is eliminated, a $2,000 annual out-of-pocket cap is in place, and catastrophic phase cost sharing is zero. CMS strengthened supplemental benefit oversight, including requiring that debit cards used for SSBCI be electronically linked to plan-covered items through an identification mechanism at the point of sale. CMS also updated call recording retention requirements, reducing the overall retention period for marketing and sales calls from 10 to 6 years with a tiered structure. Separately, documentation supporting coverage determinations must be retained in original format, including audio files; CMS has indicated that failure to produce original-format documentation may result in adverse audit findings, including potential PDE record adjustments.[6] CMS also loosened marketing rules (eliminating the 48-hour SOA waiting period and the 12-hour gap between educational and marketing events), scaled back or deferred several health equity provisions for QI programs and UM Committees, and rescinded the mid-year supplemental benefit notice mandate.

Table 1: CY2027 Star Ratings Measure Changes

Measure | Part(s) Affected | Effective | Status
Call Center: Foreign Language Interpreter and TTY Availability | Part C and Part D | 2028 Stars | Removed
Statin Therapy for Patients with Cardiovascular Disease | Part C | 2028 Stars | Removed
Plan Makes Timely Decisions about Appeals | Part C | 2029 Stars | Removed
Reviewing Appeals Decisions | Part C | 2029 Stars | Removed
Complaints about the Health Plan | Part C | 2029 Stars | Removed
Complaints about the Drug Plan | Part D | 2029 Stars | Removed
Members Choosing to Leave the Plan | Part C and Part D | 2029 Stars | Removed
SNP Care Management | Part C | 2029 Stars | Removed
Medicare Plan Finder Price Accuracy | Part D | 2029 Stars | Removed
Depression Screening and Follow-Up | Part C | 2029 Stars | Added
Diabetes Care: Eye Exam (retained after comment period) | Part C | N/A | Kept

Source: CMS CY2027 Final Rule (CMS-4208-F3/CMS-4212-F), April 2026. Note: Call Center and Members Choosing to Leave measures each apply to both Part C and Part D, accounting for 11 individual measure removals across the two programs.

Operational Impact: What This Means for Member Services and Payer Operations

CAHPS Now Drives the Revenue Equation

With industry estimates projecting survey measures could approach 40% of total Star weight by 2029, every member interaction that feeds a CAHPS response carries direct financial consequence. CAHPS measures getting needed care, getting appointments quickly, customer service quality, and health plan information. Each maps directly to call center performance: how quickly a member reaches a knowledgeable agent, whether the issue was resolved on the first call, and whether the member felt the plan gave them the information they needed.

Plans that treated member experience as secondary to clinical gap closure need to reverse that calculus. The AHA raised concerns during the comment period that CAHPS is high-level and lagged,[7] which underscores the operational problem: by the time CAHPS data reveals an issue, the damage spans months of interactions. Plans need real-time quality intelligence, not annual survey results, to manage at the speed CMS now requires.

The Language Access Measure Is Gone. The Requirement Is Not.

CMS removed the Call Center Foreign Language Interpreter and TTY Availability measure from Stars, effective 2028. But CMS will continue enforcing language access through compliance mechanisms, and member experience with language access will be captured through CAHPS survey questions.[8] When language access was a binary, pass-fail administrative measure, plans met the standard by having an interpreter line available. Now it is measured through member experience surveys. The bar shifts from availability to quality. Did the Spanish-speaking member feel heard? Was the Mandarin-speaking member's question actually resolved? Multilingual quality monitoring becomes more important under this rule, not less.

Complaints, Retention, and the Signals You Are About to Lose

The Complaints about the Health Plan and Drug Plan measures are both removed, as is Members Choosing to Leave the Plan. Plans used these as governance signals for grievance operations and retention. Their removal from Stars does not mean CMS stops watching; these will likely continue as display measures and compliance enforcement tools. Complaint trends remain among the strongest leading indicators of CAHPS deterioration. And every lost member still represents lost premium revenue. The difference now is that plans lose the early warning signals. The QA system that monitors member interactions must compensate by surfacing complaint trends and churn risk from interaction data, feeding quality insights directly into retention strategy.

Depression Screening Creates a Member Services Coordination Challenge

The new Depression Screening and Follow-Up measure evaluates two rates: the percentage of eligible members screened, and the percentage who receive follow-up care within 30 days of a positive screen. The screening rate depends on clinical workflows. The follow-up rate depends on member services infrastructure: outreach, appointment scheduling, and confirmation. Plans that silo this as a purely clinical initiative will underperform.

Part D Benefit Changes Will Hit the Phones

The codified three-phase benefit structure (deductible, initial coverage, catastrophic) replaces the four-phase model members have known for years. The $2,000 out-of-pocket cap is the most significant Part D financial protection in a generation, but it requires agents to explain new cost-sharing mechanics accurately. Members will call about why their coverage gap disappeared, what counts toward TrOOP, and what happens at the OOP threshold. Every call center needs updated knowledge base content, retrained agents, and revised IVR scripts before October 2026. Benefit misinformation during AEP is one of the fastest paths to CAHPS degradation.

Debit Card Declines Will Become Call Volume

CMS strengthened SSBCI debit card requirements, including that cards be electronically linked to plan-covered items through an identification mechanism at the point of sale. In practice, this means tighter verification when members use flex cards. When a member's card is declined at a store because a specific item does not qualify, the next action is a phone call. Member services teams should anticipate a new category of inbound inquiries. Plans that do not prepare agent scripts and escalation workflows will see resolution times spike and CAHPS-relevant frustration increase.

Appeals Measures Gone: BPO Accountability Gap Widens

Two appeals measures are removed: Plan Makes Timely Decisions about Appeals and Reviewing Appeals Decisions. For plans outsourcing appeals to BPO vendors, this eliminates one of the few externally visible accountability signals on that process. Plans must build internal SLA monitoring to ensure outsourced operations maintain standards. The risk is not a Star Rating drop; it is a CMS compliance finding.

Health Equity Provisions Scaled Back: The Mandate Is Gone. CAHPS Is Not.

CMS scaled back or deferred several health equity provisions in this rule: the HEI reward was not implemented, QI program disparity reduction requirements were removed, UM Committee equity expert and analysis mandates were eliminated, and the supplemental benefit notice was rescinded. For plans serving significant dual-eligible or LEP populations, the regulatory pressure is reduced but the operational reality is unchanged. Experience disparities still surface in CAHPS. Voluntarily maintaining equity-focused quality monitoring, particularly multilingual QA, is not compliance theater. It is CAHPS protection.

Operational Impact Matrix: Every Change, Every Action

Table 2: CY2027 Final Rule Operational Impact Matrix

Rule Change | Operations Affected | Action Required
Star Ratings: 11 measures removed or retired, CAHPS weight rising significantly | Member services, call center QA, quality improvement | Retool QA scorecards to mirror CAHPS dimensions; deploy 100% interaction monitoring; shift from administrative compliance to experience optimization
Call Center Language Access measure removed from Stars | Call center operations, multilingual QA, compliance | Maintain full multilingual QA; CMS still enforces via compliance and CAHPS; quality of LEP interactions now measured by member perception, not binary availability
Depression Screening and Follow-Up measure added | Care management, member outreach, call center | Build member services workflows for follow-up scheduling; connect clinical screening data to outreach systems; track 30-day follow-up completion
Complaints measures removed (Part C and Part D) | Grievances and appeals, member services governance | Do not deprioritize complaint tracking; CMS retains as display measures; complaint trends remain leading indicators of CAHPS deterioration
Appeals measures removed (Timely Decisions, Reviewing Appeals) | Utilization management, appeals processing | Maintain internal SLA tracking; removal from Stars does not reduce CMS audit scrutiny; plans outsourcing appeals lose a public accountability signal
Members Choosing to Leave the Plan removed | Retention, member engagement, CX strategy | Disenrollment still drives revenue loss; monitor churn through interaction data; connect QA insights to retention strategy
Part D benefit redesign codified ($2,000 OOP cap, no coverage gap) | Member services training, call center knowledge base | Retrain agents on three-phase benefit structure; update IVR and knowledge base; anticipate high call volume around OOP threshold
SSBCI debit card oversight strengthened (POS identification mechanism) | Supplemental benefits, member services | Prepare for calls when cards are declined at POS; train agents on eligibility rules; update escalation workflows
Call recording retention updated (reduced to 6-year period) | Call center IT, compliance, legal | Review and update retention policies per new tiered requirements; audit current storage infrastructure
Documentation retention in original format | Part D operations, pharmacy, compliance | Preserve all coverage determination documentation in original format including audio; non-compliance may result in adverse audit findings including potential PDE adjustments
Marketing deregulation (SOA, agent contact rules) | Marketing, enrollment, call center surge planning | Anticipate higher AEP contact volume; scale QA for enrollment surge; monitor for complaint spikes
Health equity provisions scaled back or deferred | Quality improvement, UM governance | Voluntary continuation recommended for high dual-eligible/LEP plans; disparities surface in CAHPS regardless of mandates

Share this matrix with your leadership team to assign ownership and timelines for each action item.

What to Look for in a Member Experience Quality Solution

When evaluating solutions, health plan leaders should look for these capabilities:

100% interaction monitoring, not sampling. With CAHPS and survey measures carrying increasingly dominant weight in Star Ratings, sampling-based QA cannot identify the systemic patterns that drive survey responses.

Multilingual quality scoring at native-language fidelity. With language access measurement shifting to CAHPS, multilingual QA must be integrated into the same framework applied to English interactions.

Real-time coaching signals. CAHPS is lagged. Quality intelligence must surface coaching opportunities within hours, not quarters.

CAHPS-aligned scoring frameworks. The QA scorecard must mirror what CMS measures: getting needed care, customer service, getting appointments, and health plan information.

Complaint and churn early warning. With complaint and disenrollment measures removed, the QA platform must surface these signals from interaction data.

Plan-owned data and analytics. If your quality data lives inside a vendor's platform, you do not own your operational intelligence. That intelligence must belong to the plan.

How Mizzeto Supports This Shift

Mizzeto's Multilingual QA Solution was built for this inflection point: AI-powered quality monitoring across 100% of member interactions, in multiple languages, with CAHPS-aligned scoring, real-time coaching signals, and complaint and churn analytics. All data stays in the plan's hands. For more on connecting these capabilities to call center performance, see our guides on improving call center performance and modernizing call center operations.

The Window Is Open. Here Is How Long You Have.

The rule is effective June 1, 2026. Marketing begins October 1. Coverage starts January 1, 2027. The 2027 measurement year, which feeds 2029 Star Ratings, will be the first scored under the new CAHPS-heavy measure set. Plans that retool now have time. Plans that wait for 2029 ratings to reveal a problem will discover it started in 2027.

References

1. CMS, 'Contract Year 2027 Medicare Advantage and Part D Final Rule' Fact Sheet, April 2, 2026.

2. Federal Register, CMS-4208-F3/CMS-4212-F, published April 6, 2026.

3. CMS Final Rule financial projections; Becker's Hospital Review, April 2, 2026.

4. Milliman, 'Falling Star Rating Trajectory,' December 2025. Healthcare Labyrinth, April 2026.

5. Press Ganey, December 2025. Upward Growth, April 2026. Healthcare Labyrinth corroborates.

6. AArete, 'Reading the Signals,' December 2025.

7. American Hospital Association, Comment Letter on CY 2027 Proposed Rule, January 26, 2026.

8. Holland & Knight, April 2026. Crowell & Moring, December 2025.


April 15, 2026

2 min read

Article

The Hidden Cost of a Transferred Call: What Health Plan Call Center Inefficiency Really Costs You

Most health plan operations leaders can tell you their average handle time and their cost per call. Very few can tell you what a single transferred call actually costs when you follow it all the way through the system.

That transferred call triggers a second interaction at $4.90 or more.[1] It resets the resolution clock. It inflates the member’s frustration, which shows up months later in a CAHPS survey the plan cannot retroactively fix. And if the plan is running a Medicare Advantage contract, that CAHPS score is tied directly to Star Ratings, which determine quality bonus payments worth tens of millions in annual revenue.[2]

The real problem is not the cost of one bad call. It is that the way most health plans measure call center quality today was designed for a different era, and it is structurally incapable of seeing how many bad calls are happening, or why.

The FCR Gap Nobody Talks About

First call resolution is the most important metric in any health plan contact center. SQM Group’s benchmarking across more than 100 leading North American healthcare call centers puts the industry average FCR rate at 71%. Only 4% of those centers reach the world-class threshold of 80% or higher.[3]

That means roughly 29% of member calls require a callback, transfer, or follow-up. In some studies, the number is far worse: one analysis found the average healthcare FCR rate sitting at 52%, meaning more than half of all member inquiries go unresolved on first contact.[4]

Each of those unresolved calls carries a compounding cost. SQM Group’s research shows that a 1% improvement in FCR translates to approximately $286,000 in annual operational savings for a typical midsize call center.[5] That is not a theoretical model. That is reduced repeat volume, shorter queues, and lower agent workload.
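To put that figure in context, here is a minimal sketch that scales the cited savings per FCR point across a larger improvement. The assumption that savings scale linearly with each additional point is ours, not SQM Group's, and the function name is hypothetical.

```python
SAVINGS_PER_FCR_POINT = 286_000  # dollars per 1-point FCR gain, the figure cited above


def projected_annual_savings(current_fcr_pct: int, target_fcr_pct: int) -> int:
    """Linear extrapolation of annual savings; an assumption, not a benchmark result."""
    points_gained = max(target_fcr_pct - current_fcr_pct, 0)
    return points_gained * SAVINGS_PER_FCR_POINT


# Moving from the 71% industry average to the 80% world-class threshold.
savings = projected_annual_savings(71, 80)
print(f"Projected annual savings: ${savings:,}")
```

Treat the output as an order-of-magnitude estimate for a typical midsize center rather than a forecast for any specific operation.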

Now consider the member experience side. Satisfaction drops roughly 15% every time a member has to call back about the same issue.[6] The call that started as a routine benefits question becomes, by the third attempt, a complaint. And complaints have an FCR rate of just 47%.[7]

Transfers, Mis-Routes, and the Cost Multiplier

Healthcare call centers face transfer rates as high as 19%.[8] Each transfer does three expensive things simultaneously.

First, it adds direct cost. A transferred call requires a second agent, a second set of minutes, and often a longer total handle time than a single well-routed interaction. With average handle times running 6.6 minutes and average costs at $4.90 per call, a transferred call effectively doubles the expense of that member interaction.
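A back-of-the-envelope version of that direct cost, assuming (as the paragraph does) that the transferred leg costs about as much as a full call. The annual volume of 800,000 calls is an illustrative input, and the 19% rate is the upper-end figure quoted earlier, so the result is closer to a ceiling than an average.

```python
COST_PER_CALL_CENTS = 490  # $4.90 per call, kept in cents to avoid float rounding
TRANSFER_RATE_PCT = 19     # upper-end healthcare transfer rate cited above


def added_transfer_cost_dollars(annual_calls: int) -> float:
    """Extra annual spend if every transfer adds one more call's worth of cost."""
    transferred_calls = annual_calls * TRANSFER_RATE_PCT // 100
    return transferred_calls * COST_PER_CALL_CENTS / 100


# Illustrative volume only; substitute the plan's own annual call count.
added_cost = added_transfer_cost_dollars(800_000)
print(f"Added annual cost from transfers: ${added_cost:,.2f}")
```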

Second, it destroys member confidence. Talkdesk’s survey of 330 health plan members found that 78% described their experience with their insurers as less than seamless. The leading cause was not claims denials or billing errors. It was poor customer service, cited by 31% of respondents.[9] Being transferred between departments and repeating the same information is the archetype of that frustration.

Third, and most overlooked, transfers create data fragmentation. When a call moves fromone agent to another, the wrap-up codes, disposition notes, and resolution status become inconsistent. The first agent may mark the call as resolved because they transferred it. The second agent may not log the original call reason. The result is that the plan’s reporting shows two “handled” calls instead of one unresolved member issue.

Many of these transfers are not agent errors. They are routing failures: an IVR that sends a prior authorization status call to a general benefits queue, or a system that cannot identify a member’s preferred language and routes them to an English-only agent by default. These are infrastructure and configuration problems that compound silently across thousands of calls.
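The fragmentation described above can be made concrete with a small sketch. It assumes each call leg carries a shared interaction ID and that legs arrive in chronological order; the `CallLeg` fields and disposition values are hypothetical and not taken from any particular contact center platform.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallLeg:
    interaction_id: str      # hypothetical ID shared by every leg of one member contact
    agent: str
    disposition: str         # hypothetical values: "resolved" or "unresolved"
    reason: Optional[str]    # call reason; often missing on the second leg


def summarize_interactions(legs):
    """Collapse call legs into one record per member interaction."""
    grouped = defaultdict(list)
    for leg in legs:  # assumes legs are already in chronological order
        grouped[leg.interaction_id].append(leg)

    summaries = {}
    for interaction_id, group in grouped.items():
        # Carry the first logged reason forward so a transfer does not lose it.
        reason = next((leg.reason for leg in group if leg.reason), None)
        # Only the final leg says whether the member's issue was actually resolved.
        resolved = group[-1].disposition == "resolved"
        summaries[interaction_id] = {
            "legs": len(group),
            "reason": reason,
            "resolved": resolved,
        }
    return summaries


legs = [
    # First agent marks the call resolved because it was transferred away.
    CallLeg("c-1", "agent_a", "resolved", "prior_auth_status"),
    # Second agent does not log the reason and cannot resolve the issue.
    CallLeg("c-1", "agent_b", "unresolved", None),
    CallLeg("c-2", "agent_c", "resolved", "id_card"),
]

summary = summarize_interactions(legs)
handled_legs = len(legs)
unresolved_issues = sum(1 for s in summary.values() if not s["resolved"])
```

Leg-level reporting here counts three handled calls. Grouped by interaction, the same data shows two member issues, one of which (a prior authorization status question) is still open.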

Why Legacy QA Cannot See This Problem

Here is where the structural problem becomes clear.

The traditional approach to call center quality assurance, whether run in-house or through an outsourced partner, reviews between 2% and 5% of total interactions. In many operations, the number sits closer to 2%.[10] That means 95% or more of member calls are never evaluated by anyone.

The math alone makes the approach statistically indefensible. A 3% random sample of 800,000 annual calls captures 24,000 interactions. If 232,000 of those calls are repeat contacts (29% of volume at a 71% FCR rate), only about 7,000 of them land in the sample, scattered among unrelated calls, and the sample will almost never reveal the systemic patterns that cause them.
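A short sketch of the sampling arithmetic shows why a random sample struggles with repeat contacts. It assumes a simple random sample in which each call is selected independently, and it treats a repeat contact as a pair (the original call plus one callback); both are simplifying assumptions for illustration.

```python
from fractions import Fraction

TOTAL_CALLS = 800_000
REPEAT_CALLS = 232_000
SAMPLE_PERCENT = 3

# Calls a 3% random sample is expected to contain.
sample_size = TOTAL_CALLS * SAMPLE_PERCENT // 100
expected_repeats_in_sample = REPEAT_CALLS * SAMPLE_PERCENT // 100

# A reviewer can only recognize a repeat contact if BOTH the original call
# and the callback were sampled. Under independent sampling that is p * p.
p = Fraction(SAMPLE_PERCENT, 100)
pair_probability = p * p

print(f"Sample size: {sample_size:,}")
print(f"Repeat calls expected in sample: {expected_repeats_in_sample:,}")
print(f"Chance both calls of a pair are sampled: {float(pair_probability):.2%}")
```

Under these assumptions, fewer than one in a thousand original-plus-callback pairs would appear together in the sample, so each sampled repeat call is reviewed as if it were a one-off.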

The deeper issue is not just sample size. It is what the QA program is designed to measure. Most legacy QA scorecards evaluate whether an agent followed a script, greeted the member properly, and used compliant language. They do not measure whether the member’s issue was actually resolved, whether the call could have been prevented by better routing, or whether the same question has been asked 500 times this month because a benefit change was poorly communicated.

When quality measurement is limited to agent-level compliance on a tiny sample, the operational problems that drive repeat calls, unnecessary transfers, and member dissatisfaction remain invisible. QA scores can look strong while member experience deteriorates, because the scorecard and the member’s reality are measuring different things.

Legacy vs. Modern Call Center Operations: The Visibility Gap

| Metric | Legacy Operating Model | Modern Payer Operations |
| --- | --- | --- |
| QA Coverage | 2–5% sample; 95%+ of calls unreviewed | 100% of interactions analyzed by AI |
| FCR Visibility | Vendor self-reports; no external validation | Independent, plan-owned FCR measurement |
| Repeat Call Rate | Not systematically tracked or reported | AI detects repeat patterns across full volume |
| Transfer Rate | Reported as "within SLA" | Root-cause analysis on every mis-route |
| Non-English QA | Interpreter available; calls rarely scored | Native-language QA on every interaction |
| CAHPS Linkage | QA and Stars treated as separate workstreams | Interaction data feeds directly into Stars strategy |
| Data Ownership | Quality data lives in vendor's system | Plan owns all operational intelligence |

Sources: SQM Group, DialogHealth, CAQH Index, industry QA benchmarks. Figures represent industry averages for mid-size health plan contact centers.

The Star Ratings Revenue Connection

For Medicare Advantage plans, this is not just an operational inconvenience. It is a revenue problem measured in tens of millions.

CAHPS survey results have historically carried a 4x weight in CMS Star Ratings calculations. While the weighting shifted to 2x for Star Year 2026, CAHPS measures remain a significant driver of overall ratings. CMS’s proposed rules for 2027 and beyond signal that member experience will become an even larger share of the total score, with CAHPS and HOS projected to make up nearly 40% of total Star weight by 2029.[11]

The financial stakes are hard to overstate. The gap between a 3.5-star and a 4+ star plan can translate to tens of millions of dollars in annual quality bonus payments. In 2026, only about 40% of MA-PD contracts achieved 4 stars or higher, the lowest proportion in over five years.[12]

Every repeat call, every unnecessary transfer, every escalation that leaves a member frustrated is a data point that can move CAHPS scores. A plan cannot fix a bad call center experience with a follow-up mailer.

What This Looks Like in Practice

Consider a mid-size Medicaid managed care plan handling 800,000 member calls per year. At a 71% FCR rate, roughly 232,000 of those calls require a repeat contact. At $4.90 per call, the repeat volume alone represents more than $1.1 million in direct costs annually, and that does not account for the extended handle times, supervisor escalations, or member complaints those calls generate.

Now suppose the plan’s QA program reviews 3% of calls. That is 24,000 calls reviewed out of 800,000. The 232,000 repeat interactions? They are almost entirely invisible, because repeat calls do not cluster conveniently in a random 3% sample.

The plan sees a QA dashboard that shows 90%+ compliance scores. The quality team reports stable performance. Meanwhile, CAHPS scores are flat or declining, member complaints are rising, and the CX team cannot pinpoint why.

This is not a failure of the people doing the work. It is a failure of the measurement infrastructure. The plan is making decisions based on what 3% of its interactions reveal, while the other 97% contain the signals that actually explain member experience.
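The figures in the worked example above can be reproduced with a few lines of arithmetic. This is a minimal sketch that follows the article's simplification: every call not resolved on first contact is assumed to generate exactly one repeat contact at the same average per-call cost. The function name is illustrative.

```python
from decimal import Decimal


def repeat_contact_cost(annual_calls, fcr_rate, cost_per_call):
    """Return (unresolved_calls, direct_cost) under a one-callback-per-miss assumption."""
    unresolved_calls = int(annual_calls * (1 - fcr_rate))
    return unresolved_calls, unresolved_calls * cost_per_call


ANNUAL_CALLS = 800_000
unresolved_calls, repeat_cost = repeat_contact_cost(
    ANNUAL_CALLS,
    fcr_rate=Decimal("0.71"),
    cost_per_call=Decimal("4.90"),
)

# QA coverage at a 3% sample.
reviewed_calls = ANNUAL_CALLS * 3 // 100
unreviewed_calls = ANNUAL_CALLS - reviewed_calls

print(f"Calls needing a repeat contact: {unresolved_calls:,}")
print(f"Direct cost of repeat volume: ${repeat_cost:,.2f}")
print(f"Calls never reviewed by QA: {unreviewed_calls:,}")
```

Swapping in a plan's own call volume, FCR rate, and per-call cost gives a first-order estimate only; it leaves out the escalations and complaints the example mentions.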

Language Access: The Hidden Multiplier

One of the most overlooked drivers of call center inefficiency in health plans is language access. Medicaid and dual-eligible populations frequently include members whose primary language is not English. When these members reach an agent who cannot serve them in their preferred language, the result is almost always a transfer, extended hold time, or an unresolved interaction.

CMS requires that Medicare Advantage and Medicaid managed care plans provide meaningful language access. But compliance is often measured at the policy level, not the interaction level. A plan may have interpreter services available, but if the routing logic does not match members to bilingual agents and QA does not evaluate non-English interactions, language-related service failures become invisible in aggregate metrics.

This matters because the members most affected are often the most vulnerable: elderly, disabled, low-income, or limited-English-proficiency populations whose CAHPS responses carry the same weight as every other member’s. A plan that underserves this segment is not just creating an equity gap. It is creating a Star Ratings exposure that shows up 12 to 18 months later in the measurement cycle.

What Modern Call Center Operations Should Look Like

The answer is not to bring everything in-house or to stop working with operational partners. The answer is to modernize how quality is measured, who owns the data, and what the plan can actually see. Whether your call center is in-house, outsourced, or hybrid, these capabilities separate plans that manage costs from plans that manage outcomes.

100% interaction monitoring, not sampling. Any quality program that evaluates only a fraction of calls will always miss the patterns that drive repeat contacts and member dissatisfaction. AI-powered monitoring across voice, chat, and digital channels is now operationally viable and should be the baseline expectation.

Multilingual QA that matches the member population. If your plan serves Medicaid or Medicare Advantage populations, quality monitoring must cover non-English interactions with the same rigor as English calls. This means native-language evaluation, not post-hoc translation of transcripts.

Plan-owned quality measurement. Regardless of who operates the call center, the plan should own the quality data. When quality measurement is controlled entirely by the team handling the calls, there is no independent check on whether reported performance matches member reality.

Root-cause analytics, not just scorecards. A QA score tells you whether an agent followed a script. It does not tell you why members are calling back, which call types drive the most transfers, or where routing logic is failing. Modern QA surfaces the operational signals behind the numbers.

Direct linkage to CAHPS and Star Ratings strategy. Call center performance and Star Ratings are not separate workstreams. Quality data from member interactions should feed directly into Stars strategy, giving plans the ability to intervene before CAHPS surveys go into the field.

Operational intelligence, not just compliance reporting. The goal is not a cleaner scorecard. It is the ability to see which processes are broken, which member segments are at risk, and which changes will move the metrics that matter.
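As one illustration of the root-cause analytics described above, the sketch below counts repeat contacts by call reason across a full call log rather than a sample. The record layout, the reason labels, and the 7-day window are all assumptions made for the example; a real program would pick a window that matches its own definition of a repeat contact.

```python
from collections import Counter
from datetime import date, timedelta


def count_repeat_contacts(calls, window_days=7):
    """Count repeat contacts per reason.

    calls: iterable of (member_id, call_date, reason) tuples.
    A call is a repeat if the same member called about the same reason
    within window_days of their previous call on that reason.
    """
    window = timedelta(days=window_days)
    last_seen = {}
    repeats = Counter()
    for member_id, call_date, reason in sorted(calls, key=lambda c: c[1]):
        key = (member_id, reason)
        previous = last_seen.get(key)
        if previous is not None and call_date - previous <= window:
            repeats[reason] += 1
        last_seen[key] = call_date
    return repeats


calls = [
    ("m1", date(2026, 3, 2), "formulary"),
    ("m1", date(2026, 3, 5), "formulary"),    # repeat: 3 days after the first call
    ("m1", date(2026, 3, 30), "formulary"),   # not a repeat: outside the window
    ("m2", date(2026, 3, 3), "prior_auth"),
    ("m2", date(2026, 3, 4), "prior_auth"),   # repeat
    ("m2", date(2026, 3, 6), "prior_auth"),   # repeat
    ("m3", date(2026, 3, 3), "id_card"),
]

repeats = count_repeat_contacts(calls)
for reason, count in repeats.most_common():
    print(reason, count)
```

Ranking reasons by repeat count is what points a plan toward the process behind the callbacks, such as a poorly communicated benefit change, rather than toward an individual agent's score.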

How Mizzeto Approaches This

Mizzeto’s Multilingual QA Solution was built to give health plans 100% visibility into call center quality across every language their members speak. Rather than relying on sampling or siloed scorecards, the platform uses AI to monitor and score every member interaction, surfacing the compliance risks, service failures, and repeat-call drivers that legacy QA methods cannot detect. Whether your call center is in-house, outsourced, or a combination, Mizzeto puts quality oversight and operational intelligence back in the hands of the plan.

The Cost of Not Knowing

The most expensive call in your contact center is not the one that takes 12 minutes. It is the one that generates three more calls, a formal complaint, and a CAHPS response that pulls your Star Rating below the bonus threshold.

Health plans have spent years optimizing the visible costs: average handle time, headcount, per-call rates. The invisible costs, the ones hiding in the 95% of calls nobody reviews, are where the real money is. The plans that figure this out first will not just run more efficient call centers. They will have a structural advantage in Star Ratings, member retention, and the ability to make operational decisions based on what is actually happening.

The call center is not a cost center to be minimized. It is an intelligence asset to be owned.

SOURCES

[1] DialogHealth, “Latest Healthcare Call Center Statistics,” 2025.

[2] Ameridial, “Health Plan Member Services Outsourcing for Star Ratings,” 2026.

[3] SQM Group, “Why FCR Matters to Healthcare Insurance Call Centers.”

[4] Physicians Angels, “Healthcare Call Center Statistics To Know,” 2025.

[5] Talkdesk, “How Payers Can Improve Member Experience with Modern Contact Centers.”

[6] TheAIQMS, “AI QMS for BPO: Scaling Contact Center Quality Without Expanding QA Teams,” 2025.

[7] Enthu.ai, “Call Center Quality Assurance,” 2026.

[8] Press Ganey, “CMS Just Ignited the Biggest Stars Shake-Up in a Decade,” December 2025.

[9] Oliver Wyman, “How Plans Can Win as Medicare Advantage Star Ratings Change,” 2025.

[10] CAQH, “2025 CAQH Index: U.S. Healthcare Avoided $258 Billion,” February 2026.

[11] CMS, “2026 Star Ratings Fact Sheet,” November 2025.
