Learn how GAO’s high-risk oversight model maps to AI governance, auditability, and accountability—and what public sector leaders should do next.

AI Oversight Lessons from GAO’s High-Risk Playbook
GAO doesn’t run programs. It tells the hard truth about the programs everyone else runs.
That’s why agency leaders dread the High-Risk List, the Government Accountability Office’s catalog of federal missions most vulnerable to waste, fraud, abuse, and mismanagement, or in urgent need of transformation. And it’s why Gene Dodaro’s retirement this month matters beyond a changing nameplate. Under Dodaro, GAO leaned into real-time auditing during national emergencies and sharpened its role in science and technology, including artificial intelligence oversight.
Here’s the connection government leaders shouldn’t miss: AI in government will either become a force multiplier for oversight or a new class of “high-risk” problems. The difference comes down to how seriously we treat accountability, skills, and decision transparency.
What Dodaro’s GAO got right: accountability that moves at the speed of events
Answer first: GAO’s most durable lesson is that oversight works best when it’s timely, measurable, and built into execution—not bolted on after the headlines.
Dodaro points to major national disruptions where GAO didn’t just publish retrospective reports; it helped Congress and agencies create mechanisms for tracking what was happening while dollars were moving.
- During the 2008 financial crisis, GAO pushed for stronger transparency requirements around the $700 billion Troubled Asset Relief Program and then reported publicly on it every 60 days. The program ultimately spent about $400 billion and ended with a net cost of roughly $31 billion, driven largely by mortgage assistance; banks repaid their funds with interest.
- During the pandemic, GAO tracked $4.6 trillion in emergency funding and issued more than 200 reports and 484 recommendations.
This matters for AI governance because AI systems also “move money” and “move outcomes” quickly—sometimes faster than a program office can explain what happened. If you’re deploying AI for eligibility screening, fraud detection, cybersecurity, or public safety operations, the oversight model can’t be annual.
A practical translation for AI programs: “auditability by design”
If you run digital transformation in the public sector, build audit hooks the way GAO would want them:
- Decision traceability: For every model-influenced decision, store the input features, model version, policy rule applied, and the human override (if any).
- Time-boxed reporting: Establish a monthly or quarterly “AI performance and risk” memo that includes error rates, drift indicators, and appeal outcomes.
- Outcome integrity checks: Track whether the AI changes real-world results (benefits paid accurately, investigations resolved, incidents prevented)—not just model accuracy.
A snippet-worthy rule: If you can’t explain an AI-driven decision path to an auditor, you’re not ready to scale it.
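To show what “auditability by design” could look like in practice, here is a minimal sketch of a decision-trace record. The dataclass format, field names, and example values are illustrative assumptions, not a mandated schema.

```python
# A minimal sketch of a decision-trace record for a model-influenced decision.
# Field names and structure are illustrative assumptions, not a required schema.
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
from typing import Optional
import json

@dataclass
class DecisionTrace:
    case_id: str                 # the case or transaction the decision applies to
    model_name: str              # which model influenced the decision
    model_version: str           # exact version, so the decision can be reproduced
    policy_rule: str             # the policy or statutory rule applied
    input_features: dict         # inputs as the model saw them
    model_output: dict           # score, classification, or recommendation
    human_override: Optional[str] = None   # who overrode, and why, if anyone did
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def log_decision(trace: DecisionTrace) -> str:
    """Serialize the trace so it can be written to an append-only audit store."""
    return json.dumps(asdict(trace), sort_keys=True)

# Example: an eligibility screening decision with a human override recorded.
trace = DecisionTrace(
    case_id="2026-000123",
    model_name="eligibility-screen",
    model_version="1.4.2",
    policy_rule="[cited program rule, illustrative]",
    input_features={"income_band": "B", "household_size": 3},
    model_output={"recommendation": "deny", "score": 0.41},
    human_override="caseworker approved after document review",
)
print(log_decision(trace))
```

A record like this is what lets you answer the auditor’s question: which model, which version, which rule, which inputs, and who overrode it.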
The High-Risk List is really a skills list—AI won’t fix that by itself
Answer first: Many federal programs land on GAO’s High-Risk List partly because the government lacks the people and skills to run them well; AI increases the urgency of closing those gaps.
Dodaro noted that of the 38 areas on GAO’s High-Risk List, at least 22 include skill gaps and shortages as a contributing factor. That pattern mirrors what most public sector leaders see on the ground: procurement teams stretched thin, cyber talent hard to retain, data governance inconsistent, and program offices asked to modernize with yesterday’s staffing models.
GAO’s own approach is telling:
- Roughly 3,500 staff with multidisciplinary expertise (financial auditing is only about 10% of their work).
- A robust internship pipeline (~200 interns per year).
- Retention around the mid-90% range (GAO reported 95% in 2024 with retirements and 97% without).
What AI readiness looks like in a high-risk environment
You don’t need every agency to become a machine learning lab. You do need a minimum viable talent stack:
- AI product owner (mission-side): translates policy intent into measurable outcomes and guardrails.
- Data steward: accountable for definitions, quality, and lawful use of data.
- Model risk lead: runs bias testing, drift monitoring, and validation.
- Security engineer: treats models and data pipelines as attack surfaces.
- Procurement/contracting specialist: writes performance-based requirements and audit clauses.
If your program can’t name who plays these roles (even part-time), you’re building an AI system that will become an oversight incident.
Impoundments, transparency, and AI: the same fight in different clothing
Answer first: The impoundment debate is about who controls decisions and how those decisions are justified; AI raises the stakes because it can quietly change how policy is executed.
In the interview, Dodaro discusses GAO’s conflict with the Trump administration over impoundments—when the executive branch delays or withholds congressionally approved funding. He also highlights GAO’s evenhanded approach: across recent impoundment decisions, GAO found violations in some cases and no violations in others, based on facts and law.
Here’s the parallel: AI can function like a “soft impoundment” of services if it changes operational outcomes in ways Congress, oversight bodies, and the public didn’t authorize.
Examples that show up in real agencies:
- An eligibility model tightens approvals, so fewer people receive a benefit even though the statute didn’t change.
- A fraud model increases investigative holds, delaying payments and effectively changing program delivery.
- A triage algorithm shifts which cases get attention first, altering enforcement intensity across communities.
None of these require a budget hold to feel like a policy change.
The governance fix: make “policy intent” explicit in AI requirements
Strong AI governance in government starts with a sentence you can defend:
“This model exists to support X statutory purpose, using Y authorized data, and it will not be used to make Z determinations without human review.”
Then back it up with:
- Model cards and system documentation written for oversight audiences (not just engineers).
- Appeals and redress pathways for people affected by automated decisions.
- Change control so model updates don’t quietly become policy updates.
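To make that defensible sentence auditable, an agency could capture it as structured documentation kept next to the model card. Here is a minimal sketch, assuming a simple dataclass format; the fields and example values are illustrative, not a required format.

```python
# A minimal sketch of a "policy intent" record kept alongside model cards and
# system documentation. Fields and values are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class PolicyIntent:
    statutory_purpose: str             # the X the model exists to support
    authorized_data_sources: list      # the Y data it may lawfully use
    prohibited_determinations: list    # the Z decisions it must not make without human review
    human_review_required: bool        # whether a person signs off on outcomes
    change_control_owner: str          # who approves model updates

intent = PolicyIntent(
    statutory_purpose="Support eligibility screening under [cited statute]",
    authorized_data_sources=["application form", "verified income records"],
    prohibited_determinations=["final denial of benefits"],
    human_review_required=True,
    change_control_owner="AI product owner, mission side",
)

def requires_escalation(intent: PolicyIntent, proposed_determination: str) -> bool:
    """Flag determinations that exceed the documented policy intent."""
    return proposed_determination in intent.prohibited_determinations

print(requires_escalation(intent, "final denial of benefits"))  # True: needs human review
```

Writing the intent down in a structured form is also what makes change control enforceable: if a model update would expand the authorized uses, the record has to change too, and someone has to own that change.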
GAO is expanding tech oversight—agencies should treat that as a preview
Answer first: GAO’s growing science and technology capability is a signal that oversight expectations for AI and digital government will get more technical and more specific.
Dodaro describes building an expanded role for GAO in science, technology, and engineering to help Congress oversee areas like artificial intelligence, quantum computing, regenerative medicine, and more.
From an agency perspective, this is a forecast:
- Oversight bodies will increasingly ask for evaluation-grade evidence (not vendor claims).
- The bar for cybersecurity, privacy, and critical infrastructure protection will rise as AI systems touch more operational surfaces.
- “Trust us” won’t survive contact with auditors.
A stance worth taking: If you’re buying AI with public money, you should be able to show public-value receipts. That means quantified outcomes, documented risks, and clear accountability for failures.
A simple scorecard leaders can use before scaling AI
I’ve found that AI programs fail most often on basics, not math. Use this five-part scorecard before you scale beyond a pilot:
- Mission fit: What decision or workflow changes—and how do you measure success?
- Data rights and quality: Do you have legal authority, consent posture, and data definitions nailed down?
- Operational controls: Who can override the model, and how is that tracked?
- Monitoring: What’s your drift threshold, bias check cadence, and incident response plan? (See the drift-check sketch after this list.)
- Procurement accountability: Are performance metrics, audit access, and documentation deliverables in the contract?
If you can’t answer one of these in plain language, you’re not behind—you’re early. Fix it now.
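To make the monitoring item concrete, here is a minimal sketch of a drift check, assuming the population stability index (PSI) as the drift measure and 0.2 as the alert threshold. Both choices are illustrative; an agency would pick its own metric, threshold, and cadence.

```python
# A minimal sketch of a drift check: compare a feature's current distribution
# to its baseline using the population stability index (PSI). The 0.2 threshold
# is a common rule of thumb, used here as an illustrative assumption.
import numpy as np

def population_stability_index(baseline, current, bins: int = 10) -> float:
    """PSI between a baseline sample and a current sample of one feature."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    expected, _ = np.histogram(baseline, bins=edges)
    actual, _ = np.histogram(current, bins=edges)
    # Convert counts to proportions, avoiding zeros that break the log term.
    expected_pct = np.clip(expected / expected.sum(), 1e-6, None)
    actual_pct = np.clip(actual / actual.sum(), 1e-6, None)
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

def drift_alert(baseline, current, threshold: float = 0.2) -> bool:
    """True if drift exceeds the threshold and should trigger review."""
    return population_stability_index(baseline, current) > threshold

# Example with synthetic data standing in for one model input feature.
rng = np.random.default_rng(0)
baseline = rng.normal(0, 1, 5_000)
current = rng.normal(0.5, 1, 5_000)   # a shifted distribution
print(drift_alert(baseline, current))  # True for a shift this large
```

The point isn’t the specific statistic; it’s that “monitoring” means a defined measure, a defined threshold, and a defined action when the threshold is crossed.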
Leadership transition is an AI moment, not just an org chart moment
Answer first: GAO’s leadership handoff highlights a broader reality: continuity in government increasingly depends on institutional memory captured in systems—and AI can help if you deploy it responsibly.
Dodaro emphasizes relationship building, nonpartisan trust, and managing a workforce of “trained critics.” GAO’s credibility rests on consistency across political cycles.
For agencies, the same continuity problem shows up every January, every budget cycle, and every reorg: knowledge is scattered across inboxes, contractors, SharePoint sites, and people’s heads.
Done right, AI can strengthen continuity:
- Summarize prior decisions and rationale from authoritative sources.
- Map open recommendations to owners, deadlines, and evidence.
- Flag policy or data changes likely to affect performance.
Done wrong, AI becomes a misinformation amplifier inside the enterprise.
The safe path: retrieval over improvisation
If you want AI assistants for oversight or leadership continuity, prioritize systems that:
- Answer from approved internal documents (retrieval-augmented generation), rather than “freewriting.”
- Cite the originating document internally (even if you don’t display citations publicly).
- Respect access controls and keep sensitive data segmented.
A one-liner worth keeping: In government, an AI assistant is only as trustworthy as its sources and its access rules.
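Here is a minimal sketch of that retrieval pattern. The `vector_store`, `generate`, and `user_can_read` components are placeholders for whatever approved search index, model endpoint, and access-control check an agency actually uses; none of them refer to a specific product.

```python
# A minimal sketch of retrieval-augmented generation with access control and
# internal citations. The injected components are illustrative placeholders.

def answer_with_sources(question: str, user, vector_store, generate, user_can_read):
    # 1. Retrieve candidate passages from approved internal documents only.
    candidates = vector_store.search(question, top_k=8)

    # 2. Enforce access controls before the model ever sees the text.
    readable = [doc for doc in candidates if user_can_read(user, doc)]
    if not readable:
        return {"answer": "No authorized sources found.", "sources": []}

    # 3. Ground the prompt in retrieved text rather than letting the model freewrite.
    context = "\n\n".join(f"[{doc.doc_id}] {doc.text}" for doc in readable)
    prompt = (
        "Answer using only the numbered excerpts below. "
        "If the excerpts do not contain the answer, say so.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
    answer = generate(prompt)

    # 4. Keep the originating documents with the answer, even if they are not
    #    displayed publicly, so an auditor can trace the response.
    return {"answer": answer, "sources": [doc.doc_id for doc in readable]}
```

The design choice that matters is step 2: filtering by access rights before generation, so the assistant can never summarize a document the user couldn’t have opened.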
What public sector leaders should do in Q1 2026
Answer first: The next 90 days are enough time to put real structure around AI oversight—without stalling delivery.
If your agency is building or buying AI now, set up these actions early in the year while budgets, plans, and performance goals are being finalized:
- Inventory live and planned AI systems, including “embedded AI” in vendor tools (a sample inventory entry is sketched at the end of this section).
- Assign model accountability: one senior owner per system, in writing.
- Standardize documentation: require a short model/system brief, risk assessment, and monitoring plan.
- Create an oversight cadence: monthly metrics for high-impact systems; quarterly for lower-risk systems.
- Run an audit drill: pick one AI workflow and simulate answering GAO-style questions (data lineage, decision trace, performance, and redress).
This isn’t bureaucracy for its own sake. It’s how you avoid front-page failures.
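Here is that sample inventory entry: a minimal sketch of what one record might capture, tying together the owner, documentation, cadence, and audit-drill items above. The fields and values are illustrative assumptions, not a prescribed format.

```python
# A minimal sketch of one entry in an AI system inventory. Fields and values
# are illustrative assumptions, not a mandated template.
inventory_entry = {
    "system_name": "Fraud triage assistant (illustrative)",
    "status": "live",                      # live, pilot, or planned
    "embedded_in_vendor_tool": True,       # "embedded AI" still counts
    "accountable_owner": "Deputy program director, named in writing",
    "documentation": {
        "system_brief": "on file",
        "risk_assessment": "on file",
        "monitoring_plan": "on file",
    },
    "impact_tier": "high",                 # drives monthly vs. quarterly review
    "oversight_cadence": "monthly",
    "last_audit_drill": "2026-01-15",
}
```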
Where this goes next for AI in Government & Public Sector
GAO’s story under Gene Dodaro is a reminder that oversight is a service to mission delivery, not an obstacle to it. The strongest programs bake in transparency, measure outcomes, and hire for the skills the mission actually needs.
If you’re leading AI in government, treat GAO’s playbook as your early warning system. Build auditability into systems, define policy intent upfront, and monitor outcomes like you mean it. Then you can scale with confidence—even when scrutiny increases.
What would change in your AI roadmap if you assumed every high-impact model will eventually face a GAO-style review?