Agentic AI moves government from chat to completed tasks. See what State’s approach teaches defense teams about workflow automation, risk controls, and adoption.

Agentic AI in Government: From Chat to Action
Federal AI pilots are entering a new phase: systems that don’t just answer questions—they do the work. The State Department’s CIO, Kelly Fletcher, described the shift plainly: it’s not enough for a chatbot to tell an employee how much leave they have. The next step is an assistant that can file the leave request in the separate HR system—correctly, with an audit trail, and with the right approvals.
That idea sounds mundane until you map it to defense and national security reality. The same “do the work across systems” capability that submits a leave slip can also route a cybersecurity alert, open an incident ticket, pull logs, draft a report, and request authority-to-operate evidence—all while keeping humans in the loop where policy demands it.
This post is part of our AI in Government & Public Sector series, and I’m taking a clear stance: agentic AI will be most valuable in national security when it’s treated as workflow engineering plus governance—not as a smarter chatbot. The agencies that win won’t be the ones with the flashiest model. They’ll be the ones that control actions, permissions, and accountability.
What “agentic AI” really changes (and why State’s example matters)
Agentic AI changes the unit of value from “answers” to “completed tasks.” A traditional generative AI tool helps users search, summarize, translate, and draft. An agent goes further: it can plan steps, call tools, move across applications, and execute actions—often with checkpoints.
State’s StateChat rollout shows why this matters. A large enterprise chatbot that can summarize policy and translate content already reduces administrative friction. Fletcher noted that StateChat grew from 3,000 beta users to roughly 45,000–50,000 users out of about 80,000 employees—but only after significant training and cultural work. That adoption curve is the real story.
A chatbot saves minutes. An agent saves cycles. In national security organizations, cycles translate into operational tempo.
From “find the policy” to “complete the workflow”
The State Department example—helping an employee understand rules for relocating a pet to an overseas post—looks small. It isn’t. It’s a perfect illustration of a high-volume pattern in government:
- The user needs an answer and the authoritative citation
- The user then needs to take follow-on actions in other systems
- The process is gated by policy, approvals, and records retention
Agentic AI is, at its core, an attempt to make those follow-on actions automatic and consistent.
Where agentic AI fits in defense and national security operations
The best early use cases in defense and national security are “bounded autonomy” workflows—repeatable processes with clear rules, permissions, and human sign-off points.
The bridge from State to DoD and the IC is straightforward: most mission outcomes depend on a long chain of admin, cyber, logistics, and planning work that still happens in brittle ticketing systems, spreadsheets, PDFs, and email.
1) Cybersecurity: triage, tickets, and evidence
State leadership referenced using AI to prioritize cybersecurity alerts. That’s exactly where an agent can add value quickly—if it’s constrained.
A realistic “security operations agent” can:
- Ingest an alert (SIEM/EDR)
- Pull context (asset inventory, identity logs, threat intel notes)
- Open or update a case in the ticketing system
- Draft an incident summary with cited telemetry
- Propose response actions for analyst approval
The human still decides. The agent handles the glue work. In practice, that means fewer missed handoffs and faster time-to-triage.
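To make the pattern concrete, here’s a minimal Python sketch of that bounded triage loop. The connector objects (siem, inventory, tickets, llm) are hypothetical stand-ins, not any specific product’s API; the shape is what matters: gather context, act through constrained tools, and return drafts for a human to approve.

```python
# A minimal sketch of a bounded triage agent. All connector objects are
# hypothetical stand-ins, not a real vendor API.
from dataclasses import dataclass, field

@dataclass
class TriageDraft:
    case_id: str
    summary: str
    proposed_actions: list          # analyst must approve; the agent never executes
    evidence: list = field(default_factory=list)

def triage_alert(alert, siem, inventory, tickets, llm):
    # 1. Gather the context an analyst would otherwise collect by hand.
    asset = inventory.lookup(alert["host"])
    telemetry = siem.query(host=alert["host"], window_hours=24)

    # 2. Open or update a case via a deterministic tool call.
    case_id = tickets.upsert_case(alert_id=alert["id"], asset=asset)

    # 3. The model drafts a summary; telemetry is attached verbatim, not paraphrased away.
    summary = llm.draft(
        "Summarize this alert, citing the attached telemetry.",
        context={"alert": alert, "asset": asset, "telemetry": telemetry},
    )

    # 4. Propose (never execute) response actions for analyst review.
    actions = llm.draft(
        "Propose containment steps for analyst approval.",
        context={"alert": alert, "asset": asset},
    )
    return TriageDraft(case_id, summary, actions, evidence=telemetry)
```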
2) Mission planning and intelligence support: packaging, not guessing
Agentic AI shouldn’t be used to “decide” targeting or assessments. But it can be excellent at packaging inputs for the people who do.
Examples that don’t require magical autonomy:
- Build a briefing shell and populate it with approved sources and last-known updates
- Convert commander’s intent into a checklist of required annex updates
- Track collection requirements and notify owners when dependencies aren’t met
This is where agentic AI supports decision-making indirectly: better preparation, fewer stale artifacts, tighter staff work.
3) Workforce and readiness: scaling capacity during staffing gaps
Federal leadership has openly discussed using AI to mitigate workforce losses. Whether you agree with the policy drivers or not, the operational reality is simple: many organizations are being asked to do more with fewer people.
Agentic AI is attractive because it targets “background tasks” that consume expert time:
- routing forms
- gathering evidence
- assembling packets
- initiating standard requests
- following up on approvals
That’s not replacing expertise. It’s reducing administrative drag so expertise shows up where it matters.
Adoption is the hard part: what State’s rollout teaches every agency
Model performance isn’t the blocker. Behavior change is. Fletcher said it took “a huge amount of education and training” to move adoption from a controlled beta to tens of thousands of users—and even recently she was still answering, “Is it allowable for me to use it?”
That lesson generalizes across national security environments where the stakes are higher and the rules are stricter.
The adoption checklist most programs skip
If you want agentic AI to stick, treat it like an enterprise product launch, not a tech demo:
- Clear allow-list guidance: what data types are permitted, and where
- Role-based training: analysts, HR staff, cyber defenders, executives don’t need the same playbook
- “What it’s for” examples: short scenarios that map to real tasks (not generic prompts)
- Escalation paths: when the agent is wrong or blocked, users need a fast way to report it
- Measured outcomes: cycle time, backlog reduction, rework rate, and user satisfaction—tracked monthly
Here’s what works in practice: start with one high-volume workflow, instrument it heavily, and publish internal success metrics so teams can see the win.
The real risks of agentic AI—and how to control them
Agentic AI risk is mostly “action risk.” When a system can take steps across applications, mistakes become incidents: wrong record updates, unauthorized access, accidental disclosure, broken audit trails.
The good news: these risks are manageable if you design the agent like you’d design any privileged automation.
Guardrails that actually hold up in government environments
- Permissioning like a service account, not a super-user: give the agent the minimum access required per workflow.
- Human-in-the-loop at policy boundaries: actions like submitting, approving, releasing, or transmitting should require explicit confirmation.
- Action logs that are readable by auditors: every step records what the agent saw, what tools it invoked, what it changed, and why.
- Deterministic tool use for critical steps: let the model generate a plan, but execute via constrained tools (forms, APIs, templates).
- Testing beyond accuracy, covering failure modes and abuse cases: run scenarios for prompt injection, data leakage, privilege escalation, and “confidently wrong” actions.
If an agent can act, it needs the same discipline as a privileged admin script—plus better logging.
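As a sketch of what that discipline can look like in code: the wrapper below enforces a per-workflow allow-list, blocks policy-boundary actions without explicit human confirmation, and writes an append-only audit record for every call. The tool names and registry are assumptions for illustration, not a real framework.

```python
# Illustrative guardrail wrapper: least-privilege allow-list, confirmation at
# policy boundaries, and an auditor-readable JSONL action log.
import json
import time

TOOL_REGISTRY = {  # stand-in tools; a real deployment wraps forms/APIs/templates
    "read_balance": lambda employee_id: {"employee": employee_id, "days": 12},
    "draft_request": lambda employee_id, days: {"draft": f"{days}d leave for {employee_id}"},
    "submit_request": lambda draft: {"status": "submitted", "draft": draft},
}

ALLOWED_TOOLS = {"hr_leave": {"read_balance", "draft_request", "submit_request"}}
CONFIRM_REQUIRED = {"submit_request", "approve", "release", "transmit"}

def run_tool(workflow, tool, args, actor, confirmed=False, log_path="agent_audit.jsonl"):
    # Least privilege: the agent only sees tools allow-listed for this workflow.
    if tool not in ALLOWED_TOOLS.get(workflow, set()):
        raise PermissionError(f"{tool} is not allow-listed for workflow {workflow!r}")
    # Policy boundary: submit/approve/release/transmit need a human's explicit OK.
    if tool in CONFIRM_REQUIRED and not confirmed:
        raise RuntimeError(f"{tool} crosses a policy boundary; human confirmation required")

    result = TOOL_REGISTRY[tool](**args)    # deterministic, constrained execution

    # Auditor-readable record: what ran, who approved it, inputs, and outcome.
    with open(log_path, "a") as f:
        f.write(json.dumps({
            "ts": time.time(), "workflow": workflow, "tool": tool,
            "actor": actor, "confirmed": confirmed,
            "args": args, "result": str(result)[:200],
        }) + "\n")
    return result

# read_balance runs unattended; submit_request raises until confirmed=True.
run_tool("hr_leave", "read_balance", {"employee_id": "E123"}, actor="leave-agent")
```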
Job displacement vs. job redesign
Yes, agentic AI can reduce demand for some transactional work. But in defense and national security organizations, the bigger trend I see is job redesign: fewer people doing copy/paste coordination, more people doing oversight, QA, exception handling, and mission-specific judgment.
If leadership doesn’t plan for that shift—training, new performance metrics, and clearer accountability—the tech will create friction instead of relief.
A practical roadmap to deploy agentic AI responsibly (90 days to value)
The fastest path to value is a “single agent, single workflow” approach with strong governance.
Step 1: Pick a workflow with clear boundaries
Good candidates share three traits:
- high volume (used weekly or daily)
- rule-based steps (policy or SOP exists)
- cross-system friction (two or more systems involved)
Examples: leave requests, travel authorizations, access requests, incident intake, compliance evidence collection.
Step 2: Map actions and define stop points
Document the workflow as:
- inputs (what data the user provides)
- tools (what systems the agent can touch)
- decisions (where a human must confirm)
- outputs (what “done” looks like)
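One way to make that map enforceable is to write it down as data the agent runtime can check. The schema below is a hypothetical example for a leave-request workflow; the field names are illustrative, not any product’s format.

```python
# A hypothetical, machine-readable version of the Step 2 map.
LEAVE_REQUEST_WORKFLOW = {
    "inputs": ["employee_id", "leave_type", "start_date", "end_date"],
    "tools": {  # systems the agent may touch, scoped read/write
        "hr_system": {"read": ["leave_balance"], "write": ["leave_request"]},
        "calendar": {"read": ["availability"], "write": []},
    },
    "decisions": [  # hard stop points where a human must confirm
        {"step": "submit_request", "approver_role": "employee"},
        {"step": "approve_request", "approver_role": "supervisor"},
    ],
    "outputs": ["request_id", "approval_record"],  # what "done" looks like
}
```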
Step 3: Implement “bounded autonomy”
Start with:
- read-only + draft mode (agent prepares, human submits)
- then submit-with-confirmation
- then partial automation for low-risk steps
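A minimal sketch of how those three stages can become an explicit runtime gate rather than an informal practice, so autonomy expands by configuration, not by accident (the step names and low-risk allow-list are assumptions for illustration):

```python
from enum import Enum

class Mode(Enum):
    DRAFT_ONLY = 1       # agent prepares everything; a human submits
    CONFIRM_SUBMIT = 2   # agent submits only after explicit confirmation
    PARTIAL_AUTO = 3     # vetted low-risk steps run unattended

LOW_RISK_STEPS = {"read_balance", "draft_request"}  # assumed allow-list

def may_execute(step: str, mode: Mode, human_confirmed: bool) -> bool:
    if mode is Mode.DRAFT_ONLY:
        return False                                   # nothing executes itself
    if mode is Mode.PARTIAL_AUTO and step in LOW_RISK_STEPS:
        return True                                    # unattended, but only vetted steps
    return human_confirmed                             # everything else needs a person
```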
Step 4: Measure outcomes that leadership cares about
Track these four metrics from day one:
- cycle time (request created → completed)
- rework rate (how often humans must fix agent output)
- policy compliance (missing fields, wrong routing, exceptions)
- user adoption (weekly active users / target population)
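For teams instrumenting from day one, here’s a sketch of how those four metrics fall out of a simple per-request event log. The record fields are assumptions; map them to whatever your ticketing system actually emits.

```python
from datetime import datetime

def launch_metrics(requests, weekly_active_users, target_population):
    """Compute the four launch metrics from per-request records."""
    done = [r for r in requests if r.get("completed_at")]
    cycle_hours = [
        (r["completed_at"] - r["created_at"]).total_seconds() / 3600 for r in done
    ]
    return {
        "avg_cycle_hours": sum(cycle_hours) / len(cycle_hours) if cycle_hours else None,
        "rework_rate": sum(r.get("human_fixed", False) for r in done) / max(len(done), 1),
        "policy_exceptions": sum(len(r.get("exceptions", [])) for r in requests),
        "adoption": weekly_active_users / target_population,
    }

# Example record -- field names are assumptions, not a standard schema.
example = [{
    "created_at": datetime(2026, 1, 5, 9), "completed_at": datetime(2026, 1, 5, 13),
    "human_fixed": False, "exceptions": [],
}]
print(launch_metrics(example, weekly_active_users=430, target_population=1200))
```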
State’s adoption numbers show the north star: scale happens after you make people confident it’s allowed, useful, and safe.
What to watch in 2026: consolidation and “AI embedded everywhere”
Fletcher’s forward view—AI embedded in “just about everything”—matches what’s happening across the public sector: pilots are moving from standalone chat tools to embedded assistants inside the systems people already use.
By late 2026, the differentiator won’t be “Do you have an AI assistant?” It’ll be:
- Can your assistant take approved actions across systems?
- Can you prove what it did and why?
- Can you restrict it tightly enough for national security environments?
The agencies that treat agentic AI as a disciplined workflow layer—connected to identity, audit, records, and risk—will get durable gains in mission efficiency.
If you’re evaluating agentic AI for defense or national security operations, focus your next conversation on one thing: what actions are permitted, under what controls, with what evidence. Everything else is theater.
Where do you see the highest-value “bounded autonomy” workflow in your organization right now—cyber triage, staff actions, logistics, or something else?