AI can reduce EHR rollout risk by forecasting costs, staffing needs, and stability signals—especially for batched deployments like the VA’s 2026 plan.

AI Can Keep EHR Rollouts on Budget and Safer
$37 billion is the kind of number that changes how Congress reads every project update. That’s where the Department of Veterans Affairs’ Oracle Health electronic health record (EHR) modernization sits right now—at least by the latest lifecycle estimate discussed on Capitol Hill. Lawmakers aren’t just reacting to sticker shock. They’re reacting to uncertainty: what’s included, what’s missing, and what happens if the next round of deployments goes sideways.
The VA plans to resume go-lives after a long operational pause, starting with four Michigan medical facilities going live near-simultaneously in April 2026. That “batched deployment” approach is meant to scale faster. It also concentrates risk. If you’ve spent any time around large public-sector IT programs, you already know the pattern: cost growth plus compressed timelines plus operational complexity is where avoidable failures pile up.
This post is part of our AI in Government & Public Sector series, and I’m going to take a clear stance: EHR modernization doesn’t fail because government can’t modernize. It fails because programs run blind—without real-time cost, readiness, and safety signals. This is exactly where AI (used responsibly) can help agencies control costs, reduce deployment risk, and prove readiness with evidence rather than optimism.
What Congress is really signaling about the VA EHR
Answer first: The hearing signals that Congress is less worried about “technology” and more worried about governance, cost transparency, and operational risk—especially when multiple sites go live together.
The VA’s EHR modernization has a long history: a major contract started in 2018 (initially around $10B, later revised upward), an acquisition that brought Cerner under Oracle in 2022, and a pause in April 2023 driven by technical, usability, and patient-safety concerns. Despite years of work, the system has been deployed at only six of 170 VA medical centers.
Now the VA is moving out of the pause and planning to expand again—first to Michigan, then to additional sites in 2026, with Oracle discussing far larger numbers for 2027. That’s the tension lawmakers zeroed in on:
- Cost control: a lifecycle estimate around $37B being discussed alongside earlier analysis that projected a total as high as $50B.
- Cost clarity: lawmakers asking for breakdowns (program office, consulting, infrastructure, ongoing maintenance) and whether key oversight bodies received updated numbers.
- Batched go-lives: four facilities “turning on” close together means staffing, training, ticket resolution, contingency planning, and patient safety monitoring all have to work at surge capacity.
A modernization program doesn’t earn trust by promising speed. It earns trust by showing it can detect problems early—and stop them before patient care is affected.
Why batched deployments raise risk (and how AI can lower it)
Answer first: Batched deployments increase risk because they multiply support load and compress learning cycles; AI can lower the risk by predicting failure points, triaging incidents, and measuring readiness continuously.
A single-site EHR go-live is already stressful: clinical workflows change, documentation patterns shift, help-desk tickets spike, integrations get tested in the real world, and “unknown unknowns” show up at 2 a.m. A multi-site go-live creates a different category of problem: resource contention.
The operational reality of a multi-site go-live
When four facilities go live together, you need the same things—just more of them, all at once:
- Trainers and “floor walkers” covering units and shifts
- Integration engineers monitoring interfaces (labs, pharmacy, radiology, identity)
- Command center staffing to manage ticket volume
- Cybersecurity monitoring for new system behaviors
- Clinical safety monitoring and rapid escalation paths
GAO’s concern in the hearing—whether that surge is sustainable—is exactly the right question. “Can we do it once?” isn’t the bar. “Can we do it repeatedly while scaling?” is.
Where AI helps without touching clinical decision-making
Not every AI use case in a hospital setting needs to be clinical. In fact, the highest-ROI, lowest-controversy uses are often operational:
- Ticket surge prediction
  - Use historical go-live ticket data plus facility attributes (size, complexity, specialty mix) to forecast ticket volume by category and time of day.
  - Output: staffing plans that are evidence-based instead of hopeful.
- Incident triage and routing
  - Natural language models can categorize tickets (login/identity, order entry, interface errors, device integration) and route them to the right resolver group; a classifier sketch follows this list.
  - Output: shorter time-to-resolution and fewer “bounces.”
- Early-warning signals for workflow breakdowns
  - Monitor anonymized metadata patterns (order turnaround times, documentation delays, error-rate spikes in certain modules).
  - Output: rapid attention to the unit or workflow that’s failing before it becomes a patient-safety event.
- Go-live readiness scoring
  - Combine training completion, system performance tests, interface validation, role-mapping completeness, and outstanding defects into a single readiness score; a scoring sketch appears at the end of this section.
  - Output: leadership gets a hard number and a list of blockers—not a slideshow.
Used this way, AI is not “making care decisions.” It’s helping the program run like a modern service operation.
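For the readiness-scoring idea above, here is a minimal sketch of how those inputs could roll up into one number plus an explicit blocker list. The weights, thresholds, and field names are illustrative assumptions, not the VA’s actual readiness criteria.

```python
# Minimal sketch: combine readiness inputs into one score plus a blocker list.
# Weights, thresholds, and field names are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class SiteReadiness:
    training_completion: float       # 0-1, share of staff trained for their role
    interface_tests_passed: float    # 0-1, validated interfaces / total interfaces
    performance_tests_passed: float  # 0-1
    role_mapping_complete: float     # 0-1
    open_critical_defects: int       # count of unresolved severity-1/2 defects

WEIGHTS = {
    "training_completion": 0.25,
    "interface_tests_passed": 0.25,
    "performance_tests_passed": 0.20,
    "role_mapping_complete": 0.15,
}

def readiness_score(site: SiteReadiness) -> tuple[float, list[str]]:
    """Return a 0-100 score and the list of blockers leadership must clear."""
    score = sum(getattr(site, field) * weight for field, weight in WEIGHTS.items())
    # The remaining 15% of the score is earned only with zero critical defects open.
    score += 0.15 if site.open_critical_defects == 0 else 0.0

    blockers = [field for field in WEIGHTS if getattr(site, field) < 0.9]
    if site.open_critical_defects > 0:
        blockers.append(f"{site.open_critical_defects} open critical defects")
    return round(score * 100, 1), blockers

score, blockers = readiness_score(SiteReadiness(0.92, 0.88, 0.95, 1.0, 3))
print(score, blockers)  # 79.0, ['interface_tests_passed', '3 open critical defects']
```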
Cost overruns aren’t mysterious—they’re measurable
Answer first: Most EHR program cost growth comes from predictable drivers—change orders, rework, prolonged dual operations, and staffing spikes—and AI can forecast these earlier and more accurately.
When lawmakers say they don’t trust the bottom-line number, they’re describing a common problem: lifecycle estimates can hide more than they reveal if they aren’t decomposed into drivers that can be tracked monthly.
Here’s what tends to inflate EHR modernization costs in government health systems:
1) Rework from “fixed later” defects
Defects that slip into production don’t just cost engineering time. They cost:
- clinician time (workarounds)
- training time (retraining)
- help-desk labor
- patient scheduling throughput
AI contribution: defect-pattern detection across logs, tickets, and user feedback to identify the modules causing repeated rework.
2) Extended dual-running of legacy systems
Every month you keep parallel workflows alive, you pay for:
- extra integration
- extra support
- extra cybersecurity surface area
- extra data reconciliation
AI contribution: predictive modeling for cutover timing based on readiness and defect burn-down trends, reducing “we’re not sure, so we’ll wait” delays.
3) Consulting and surge staffing as a permanent crutch
Go-live surges are normal. Permanent surges are not. If every deployment requires heroics, the model doesn’t scale.
AI contribution: staffing demand forecasting and skill-mix optimization so the program can build internal capacity rather than buying it repeatedly.
4) Underestimated infrastructure and performance needs
EHR performance issues don’t always show up in pre-production testing. Latency under real load can create cascading clinical slowdowns.
AI contribution: anomaly detection on performance telemetry and capacity forecasting tied to actual usage patterns.
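As one illustration of that telemetry idea, here is a minimal sketch that flags latency anomalies with a rolling z-score against a trailing 24-hour baseline. The file, column names, window, and threshold are assumptions for the sketch.

```python
# Minimal sketch: flag unusual latency spikes in EHR performance telemetry
# using a rolling z-score. File, column names, window, and threshold are
# illustrative assumptions.
import pandas as pd

telemetry = pd.read_csv("order_entry_latency.csv", parse_dates=["timestamp"])
telemetry = telemetry.set_index("timestamp").sort_index()

# Compare each measurement against the trailing 24-hour baseline.
rolling = telemetry["latency_ms"].rolling("24h")
zscore = (telemetry["latency_ms"] - rolling.mean()) / rolling.std()

# Flag measurements more than three standard deviations above the baseline.
anomalies = telemetry[zscore > 3]
print(anomalies.head())
```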
A hard opinion: If an EHR program can’t produce a transparent, driver-based cost forecast each quarter, it’s not managing cost—it’s reporting cost.
A practical AI playbook for safer government EHR deployments
Answer first: Start with operational AI that improves visibility and control, then expand under a governance model aligned to public-sector accountability.
If you’re an agency leader, program executive, or systems integrator working in public-sector health IT, here’s a realistic sequence that works.
Step 1: Build a single “go-live truth” dataset
You can’t model what you can’t measure. Bring together:
- help-desk tickets (categories, timestamps, resolution time)
- system telemetry (latency, error rates, downtime)
- training and role-mapping status
- interface test outcomes
- deployment defect lists (severity, age)
- operational KPIs (appointment throughput, documentation time patterns)
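To make that concrete, here is a minimal sketch of assembling those sources into a single per-site, per-day table. File names and columns are illustrative assumptions about what the program’s systems export.

```python
# Minimal sketch: assemble a per-site, per-day "go-live truth" table from the
# sources listed above. File names and columns are illustrative assumptions.
import pandas as pd

tickets = pd.read_csv("helpdesk_tickets.csv", parse_dates=["opened_at"])
# columns: ticket_id, site_id, opened_at, resolution_hours
telemetry = pd.read_csv("system_telemetry.csv", parse_dates=["date"])
# columns: site_id, date, median_latency_ms, error_rate, downtime_minutes
training = pd.read_csv("training_status.csv")   # columns: site_id, pct_trained
defects = pd.read_csv("open_defects.csv")       # columns: site_id, severity, age_days

# Roll tickets up to one row per site per day.
daily_tickets = (
    tickets.assign(date=tickets["opened_at"].dt.normalize())
    .groupby(["site_id", "date"])
    .agg(ticket_count=("ticket_id", "count"),
         median_resolution_hrs=("resolution_hours", "median"))
    .reset_index()
)

# Count unresolved severity-1/2 defects per site.
critical_defects = (
    defects[defects["severity"] <= 2]
    .groupby("site_id").size().rename("open_critical_defects").reset_index()
)

truth = (
    daily_tickets
    .merge(telemetry, on=["site_id", "date"], how="left")
    .merge(training, on="site_id", how="left")
    .merge(critical_defects, on="site_id", how="left")
)
truth.to_csv("golive_truth.csv", index=False)
```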
Step 2: Create two dashboards that leadership will actually use
Not twenty dashboards. Two.
- Readiness dashboard (is the site prepared?)
- Stability dashboard (is the site safe and stable post go-live?)
AI should sit behind these dashboards to surface what matters: trends, forecasts, and anomalies.
Step 3: Use AI for triage, forecasting, and anomaly detection
This is where you get fast wins:
- classify tickets automatically
- predict daily peak support needs for the first 30 days post go-live
- detect unusual spikes in errors tied to a specific workflow or integration
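For the forecasting piece, here is a minimal sketch that fits a Poisson regression on prior go-live history to project daily ticket volume for a new site’s first 30 days. The training file, features, and new-site values are illustrative assumptions.

```python
# Minimal sketch: forecast daily ticket volume for the first 30 days after
# go-live from facility attributes and prior go-live history. The training
# file, feature names, and new-site values are illustrative assumptions.
import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.linear_model import PoissonRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

history = pd.read_csv("prior_golive_daily_tickets.csv")
# columns: facility_size, specialty_mix, day_since_golive, day_of_week, ticket_count

features = ["facility_size", "specialty_mix", "day_since_golive", "day_of_week"]
prep = make_column_transformer(
    (OneHotEncoder(handle_unknown="ignore"), ["specialty_mix", "day_of_week"]),
    remainder="passthrough",
)
# Poisson regression is a natural fit for count data like daily ticket volume.
model = make_pipeline(prep, PoissonRegressor(alpha=1.0, max_iter=300))
model.fit(history[features], history["ticket_count"])

# Project the first 30 days for a hypothetical new site.
new_site = pd.DataFrame({
    "facility_size": [420] * 30,            # beds, illustrative
    "specialty_mix": ["tertiary"] * 30,
    "day_since_golive": range(1, 31),
    "day_of_week": [d % 7 for d in range(1, 31)],
})
new_site["predicted_tickets"] = model.predict(new_site[features])
print(new_site[["day_since_golive", "predicted_tickets"]].head())
```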
Step 4: Add governance that matches the risk
Public-sector AI governance is often treated as paperwork. It shouldn’t be.
A workable approach for EHR modernization:
- Keep AI focused on operations, not diagnosis/treatment
- Require human review for escalations that affect clinical workflow
- Log model outputs and actions for auditability
- Define “stop conditions” (what triggers a pause, rollback, or safety huddle)
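To show what auditability and stop conditions can look like in practice, here is a minimal sketch that logs every model output and checks stop conditions before any escalation. The thresholds, metric names, and log destination are illustrative assumptions.

```python
# Minimal sketch: record every model output to an audit log and check stop
# conditions before acting. Thresholds, metric names, and the log path are
# illustrative assumptions.
import json
import logging
from datetime import datetime, timezone

logging.basicConfig(filename="ai_audit.log", level=logging.INFO)

STOP_CONDITIONS = {
    "median_resolution_hrs": 8,    # sustained slow ticket resolution
    "open_critical_defects": 5,    # too many unresolved severity-1/2 defects
    "order_error_rate": 0.02,      # spike in failed or errored orders
}

def audited_output(model_name: str, inputs: dict, output) -> None:
    """Record a model output so decisions can be reconstructed later."""
    logging.info(json.dumps({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model_name,
        "inputs": inputs,
        "output": output,
        "human_review_required": True,  # operational AI informs; humans decide
    }))

def check_stop_conditions(metrics: dict) -> list[str]:
    """Return the stop conditions breached; any breach triggers a safety huddle."""
    return [name for name, limit in STOP_CONDITIONS.items()
            if metrics.get(name, 0) > limit]

breaches = check_stop_conditions({"median_resolution_hrs": 11, "order_error_rate": 0.01})
if breaches:
    audited_output("stability_monitor", {"site": "example-site"}, breaches)
    print("Pause criteria met:", breaches)
```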
Step 5: Prove scalability before you claim it
Batched deployments only make sense if you can demonstrate:
- stable ticket resolution times at volume
- sustainable staffing ratios
- defect burn-down speed across sites
- no repeated patient-safety patterns
If you can’t prove that, batching becomes an acceleration of risk, not delivery.
What leaders should ask before the April 2026 go-lives
Answer first: The best questions are specific and measurable—focused on contingency, capacity, and safety—not optimism.
If you want to evaluate whether a batched EHR deployment is ready, ask these questions (and insist on numbers):
- What’s the projected ticket volume per site per day for weeks 1–4, and what’s the staffing plan by shift?
- What’s the contractual SLA for ticket resolution, and what happened to SLA performance at the last go-live?
- What’s the cutover contingency plan—pause criteria, rollback criteria, and who has authority at 2 a.m.?
- How are we monitoring patient-safety risks during go-live week (and what signals trigger escalation)?
- What’s the full lifecycle cost model, broken into drivers, updated quarterly?
Those are not “gotcha” questions. They’re the baseline for responsible modernization.
Where this fits in the AI in Government & Public Sector story
Government is under pressure to modernize services faster—especially health services—while proving stewardship of public funds. The VA EHR program is a high-stakes example because it touches care delivery, privacy, cybersecurity, and operational resilience.
Here’s the bigger point I keep coming back to: AI is most useful in government modernization when it strengthens accountability. Better forecasting, earlier warnings, faster triage, and clearer readiness signals make it easier to protect patients and taxpayers at the same time.
If you’re planning an EHR rollout, a benefits platform migration, or any large mission system modernization, you don’t need more optimistic status updates. You need instrumentation, discipline, and decision support that can stand up in a hearing room.
What would change in your program if every go-live decision had to be backed by a readiness score, a cost forecast with drivers, and a contingency plan tested under realistic load?