State’s CIO wants agentic AI that completes tasks—not just answers. See what it means for secure, mission-critical government operations.

Agentic AI at State: From Chatbots to Task Automation
45,000–50,000 people don’t adopt a new tool at the State Department by accident. That’s roughly how many employees are now using “StateChat,” the department’s enterprise generative-AI chatbot—out of a total workforce of about 80,000. For a government organization with global operations, stringent security requirements, and deeply ingrained processes, that usage number tells you something: internal AI is crossing the line from “interesting pilot” to “daily utility.”
Now the State Department’s CIO, Kelly Fletcher, is aiming for the next step: agentic AI—systems that don’t just answer questions, but take action across multiple systems. The example she gave is refreshingly practical: don’t only tell me my leave balance; submit the leave request in the separate HR system.
That sounds mundane, but it’s exactly the point. In national security and defense-adjacent environments, the path to meaningful AI impact often starts with boring, high-volume work—because that’s where administrative friction quietly drains mission time.
Agentic AI is “do the work,” not “answer the question”
Agentic AI matters because it turns assistance into execution. A chatbot summarizes policy; an agent completes the workflow.
Generative AI is already useful for:
- Drafting emails and memos
- Translating text
- Summarizing long policy documents
- Answering “where is the rule for X?” questions
Agentic AI builds on that with a different promise: give the system an objective, constraints, and permissions, and it completes multi-step tasks—often spanning identity, data access, approvals, and logging.
In State’s context, Fletcher described the desire for agents that can act on behalf of employees across tools and platforms. That’s not a nice-to-have feature; it’s what makes AI operationally relevant in large institutions where work is fragmented across:
- HR systems
- Financial and travel systems
- Case management tools
- Knowledge bases and policy libraries
- Cybersecurity platforms and ticketing systems
A concise way to frame it:
Chatbots reduce cognitive load. Agents reduce cycle time.
And cycle time is what leaders notice when staffing is tight and mission tempo doesn’t slow down.
Why administrative automation is a national security issue (not just “IT modernization”)
Reducing administrative toil increases mission capacity without increasing headcount. That’s the strategic connection that often gets missed.
This story sits squarely inside a broader shift across government: using AI to offset workforce pressure and speed up internal operations. Federal leadership has publicly discussed using AI to mitigate staffing losses, and State has seen its own changes, including the departure of its former chief data officer and AI officer earlier this year.
Here’s the defense and national security angle: every hour reclaimed from low-value administrative work is an hour returned to planning, analysis, diplomacy, cyber defense, and response coordination.
This is especially relevant heading into late 2025 and early 2026 budgeting cycles, when agencies face competing demands:
- Modernize legacy systems
- Improve cybersecurity readiness
- Support overseas operations
- Deliver faster services with constrained staffing
Agentic AI becomes attractive because it targets the “hidden tax” in government work: time lost to navigating systems, interpreting policies, and completing repetitive steps.
The practical example that explains everything: “moving your cat to Conakry”
Fletcher’s example—asking StateChat how to move a cat to Conakry—sounds quirky until you’ve lived in a bureaucracy.
It’s a perfect illustration of internal AI value:
- The policy exists, but it’s buried
- People spend time searching, emailing, or guessing
- The outcome is often inconsistent
StateChat can point employees to the relevant policy passages, summarize them, and let them verify the primary text. That’s already helpful.
Agentic AI would go further: once the employee confirms the plan, the agent could:
- Pre-fill the right forms
- Open the correct case/ticket
- Route approvals
- Attach required documentation
- Record the action for audit
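To make that concrete, here is a minimal Python sketch of the pattern, not State's actual architecture: a proposed plan of tool calls, each tied to a policy reference, that only runs after the employee confirms it and that logs every step. All tool names and fields are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AgentAction:
    """One step the agent proposes, tied to the policy it relies on."""
    tool: str          # hypothetical tool name, e.g. "forms.prefill"
    payload: dict
    policy_ref: str    # the policy passage this step is based on


@dataclass
class AuditLog:
    """Append-only record of everything the agent actually did."""
    entries: list = field(default_factory=list)

    def record(self, user: str, action: AgentAction, result: str) -> None:
        self.entries.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "tool": action.tool,
            "policy_ref": action.policy_ref,
            "result": result,
        })


def execute_tool(action: AgentAction) -> str:
    # Placeholder: a real deployment would call the HR, ticketing, or case system here
    return f"{action.tool} completed"


def run_plan(user: str, plan: list[AgentAction], confirmed: bool, log: AuditLog) -> None:
    """Execute the multi-step plan only after the employee has confirmed it."""
    if not confirmed:
        raise PermissionError("Plan must be confirmed by the employee before execution")
    for action in plan:
        result = execute_tool(action)
        log.record(user, action, result)  # every action lands in the audit trail
```

The shape matters more than the details: the plan is explicit, the human confirms it, and the record exists before anyone asks for it.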
That’s where “AI in government & public sector” stops being about experimentation and starts becoming digital operations.
Adoption is the hard part—and State’s numbers prove it
The biggest barrier to enterprise AI isn’t model quality; it’s employee confidence and habit change.
State rolled StateChat out to 3,000 beta testers, then expanded to tens of thousands of users. Fletcher also noted that adoption required “a huge amount of education and training,” including answering a basic question that stalls many deployments: Is it even allowable for me to use it?
That line should be printed on every AI program plan in the public sector.
If your agency or defense organization is building an AI assistant or agent program, assume you’ll need:
- Clear policy on acceptable use (what data, what tasks, what tools)
- Role-based training, not generic “AI 101”
- Reinforcement through champions in each bureau/unit
- Simple “what it’s for” guidance
Here’s what works in practice (and I’ve found this applies in almost every high-compliance environment): train people on job stories, not features.
Job story examples:
- “When I’m preparing a country briefing, I want policy references in 2 minutes, so I can validate my recommendation faster.”
- “When I’m submitting leave, I want the process handled end-to-end, so I don’t lose 20 minutes to system hopping.”
- “When I’m triaging security alerts, I want the top five most actionable items, so I don’t miss the one that matters.”
Features don’t change behavior. Outcomes do.
The real risks of agentic AI—and how to contain them
Agentic AI introduces new failure modes because it can act, not just speak. For defense and national security organizations, this is where programs succeed or get shut down.
The source article flags several legitimate concerns: oversight challenges, testing and evaluation difficulty, and job displacement. Let’s translate those into concrete program risks and controls.
1) Oversight: “Who approved that action?”
If an agent can submit a leave request, it can also submit something you didn’t intend. Oversight isn’t optional; it’s a design requirement.
Controls that scale:
- Human-in-the-loop checkpoints for high-impact actions (submit, approve, transmit)
- Permissioned tool access (least privilege, role-based)
- Full audit logs of agent actions, prompts, and system calls
- Policy-bound execution (agents must cite the policy/rule they’re following)
A strong stance: agents should be auditable by default, not as an add-on.
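One hedged way to make that default real is to route every agent action through a single gate that enforces least privilege, refuses to act without a policy citation, and logs the outcome either way. A minimal sketch, with illustrative role and tool names:

```python
import logging

logging.basicConfig(level=logging.INFO)
audit = logging.getLogger("agent.audit")

# Illustrative least-privilege map: which roles may invoke which tools
ROLE_TOOLS = {
    "employee": {"leave.read", "leave.submit"},
    "hr_specialist": {"leave.read", "leave.submit", "leave.approve"},
}

HIGH_IMPACT = {"leave.submit", "leave.approve"}  # actions that need a human checkpoint


def call_tool(role: str, tool: str, policy_ref: str, human_approved: bool = False) -> str:
    """Single choke point for agent actions: permissions, policy citation, HITL, logging."""
    if tool not in ROLE_TOOLS.get(role, set()):
        audit.warning("DENIED role=%s tool=%s (not permitted)", role, tool)
        raise PermissionError(f"{role} is not permitted to call {tool}")
    if not policy_ref:
        raise ValueError("Agent must cite the policy or rule it is following")
    if tool in HIGH_IMPACT and not human_approved:
        raise PermissionError(f"{tool} requires explicit human approval")
    audit.info("ALLOWED role=%s tool=%s policy=%s", role, tool, policy_ref)
    return f"{tool} executed"  # placeholder for the real system call
```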
2) Testing: you can’t QA an agent like a normal app
Agentic workflows are messy because inputs vary, systems change, and edge cases are infinite.
What actually works:
- Scenario-based evaluations (50–200 real workflows, measured repeatedly)
- Red-team testing focused on unsafe actions and permission escalation
- Synthetic “canary” tasks that detect drift (if success rate drops, pause)
- Rollback and graceful failure (if step 4 fails, don’t corrupt the record)
If your organization is serious about agentic AI in mission-critical operations, build a testing harness early. Don’t wait for production incidents to teach you what to measure.
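A testing harness does not need to be elaborate to be useful. Here is a minimal sketch, assuming you can replay recorded workflows against the agent and score each run as pass or fail; the threshold and field names are invented for the example:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Scenario:
    name: str
    run: Callable[[], bool]   # replays one recorded workflow; True if the agent succeeded
    is_canary: bool = False   # canaries are stable tasks used to detect drift


def evaluate(scenarios: list[Scenario], canary_floor: float = 0.95) -> dict:
    """Run every scenario, report success rates, and flag a pause if canaries drift."""
    results = {s.name: s.run() for s in scenarios}
    canaries = [results[s.name] for s in scenarios if s.is_canary]
    canary_rate = sum(canaries) / len(canaries) if canaries else 1.0
    return {
        "overall_success": sum(results.values()) / len(results),
        "canary_success": canary_rate,
        "pause_agent": canary_rate < canary_floor,  # below the floor, stop executing
        "failures": [name for name, ok in results.items() if not ok],
    }
```

Run it on a schedule. If the canary rate drops below the floor, pause execution permissions and investigate before a production incident does the measuring for you.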
3) Job impact: the goal is capacity, not churn
Agentic AI will change roles. Pretending otherwise creates internal resistance.
The best programs are explicit:
- Automate the toil, not the judgment
- Reinvest time into higher-value analysis, oversight, and mission planning
- Train staff to supervise agents (new “operator” and “auditor” skills)
A memorable rule: if you can’t explain how the workforce benefits, the workforce will assume it won’t.
Where agentic AI fits in defense and national security workflows
The most defensible early use cases are constrained, high-volume workflows with clear rules. State’s admin focus is a smart template for defense-adjacent organizations.
Here are agentic AI use cases that map naturally to defense and national security environments—without starting with the most sensitive mission actions:
1) Security operations triage (SOC / CSOC)
Fletcher mentioned prioritizing cybersecurity alerts. This is a strong fit for agents when implemented carefully.
Agent tasks:
- Correlate alerts across tools
- Enrich indicators with internal context
- Open tickets with recommended next steps
- Escalate only when confidence and impact thresholds are met
Guardrails:
- Agents recommend; humans authorize containment actions
- Tight access controls to endpoint tools
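As a rough illustration of "agents recommend; humans authorize," the escalation decision can be reduced to explicit confidence and impact thresholds. The scoring fields and cutoffs below are invented for the example:

```python
from dataclasses import dataclass, field


@dataclass
class EnrichedAlert:
    alert_id: str
    confidence: float                    # 0-1: how sure the agent is the alert is real
    impact: float                        # 0-1: estimated mission or system impact
    recommended_steps: list[str] = field(default_factory=list)


def triage(alert: EnrichedAlert, conf_min: float = 0.8, impact_min: float = 0.6) -> dict:
    """The agent enriches, recommends, and escalates; it never executes containment."""
    escalate = alert.confidence >= conf_min and alert.impact >= impact_min
    return {
        "alert_id": alert.alert_id,
        "action": "escalate_to_analyst" if escalate else "queue_for_review",
        "recommended_steps": alert.recommended_steps,  # suggestions only, human-authorized
    }
```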
2) Policy and compliance navigation
Chatbots already help. Agents can do the next step.
Agent tasks:
- Identify applicable policy
- Generate a compliant checklist for the user’s role
- Pre-fill forms and route approvals
- Attach required artifacts and create an audit trail
3) Travel, logistics, and workforce workflows
This is the “leave slip” pattern scaled up.
Agent tasks:
- Assemble travel packets
- Validate per diem rules
- Submit requests across systems
- Track approvals and notify stakeholders
4) Intelligence support (carefully scoped)
For intelligence analysis, agents should start with process support, not analytic conclusions.
Agent tasks:
- Pull relevant internal documents
- Build a source map and citation list
- Draft structured summaries with confidence notes
- Flag gaps that require human follow-up
The point is not to remove analysts. It’s to remove the time sink that keeps analysts from analyzing.
A practical rollout plan for agentic AI (that won’t implode)
The safest way to deploy agentic AI in government is to start narrow, prove reliability, then expand permissions.
A five-step approach that fits high-stakes environments:
1. Pick one workflow with clear success criteria
   - Example: leave requests, password resets, travel pre-approvals
   - Define success as: completion rate, time saved, error rate, user satisfaction
2. Integrate with identity and logging before adding capability
   - If you can't log actions end-to-end, you're not ready for agents
3. Use "confirm then execute" as the default interaction
   - Show the steps the agent will take
   - Require explicit confirmation for submissions
4. Create a permissions ladder (a minimal sketch follows this list)
   - Read-only → draft actions → submit actions → limited approvals
5. Measure adoption like a product, not a mandate
   - Weekly active users, repeat usage, workflow completion rate
   - Training completion isn't adoption; behavior is adoption
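The permissions ladder can be written down very literally, which is part of its value: everyone can see what the agent is allowed to do today. A minimal sketch, with the tier names matching the ladder above and everything else illustrative:

```python
from enum import IntEnum


class AgentTier(IntEnum):
    READ_ONLY = 1        # look things up, nothing else
    DRAFT = 2            # prepare forms and tickets, never submit
    SUBMIT = 3           # submit on the employee's behalf after confirmation
    LIMITED_APPROVE = 4  # approve narrowly scoped, low-risk items


# Illustrative mapping from actions to the minimum tier required
REQUIRED_TIER = {
    "lookup_policy": AgentTier.READ_ONLY,
    "prefill_form": AgentTier.DRAFT,
    "submit_leave_request": AgentTier.SUBMIT,
}


def allowed(action: str, agent_tier: AgentTier, user_confirmed: bool) -> bool:
    """Confirm-then-execute: anything at SUBMIT or above needs explicit confirmation."""
    needed = REQUIRED_TIER.get(action)
    if needed is None or agent_tier < needed:
        return False
    return user_confirmed if needed >= AgentTier.SUBMIT else True
```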
State’s experience—needing extensive training just to normalize use—should be taken as a warning and a roadmap.
What the State Department’s approach signals for AI in government & public sector
The trend line is clear: AI is shifting from “chat” to “operations.” State’s CIO is describing the same transition many defense and national security organizations are now making: embedding AI into the fabric of day-to-day work, then carefully granting it the ability to act.
That’s the right order. Build trust with assistive use cases. Put governance and logging in place. Then move to agents.
If you’re leading AI adoption in a defense, intelligence, or public sector organization, the most practical next step is to audit your workflows and identify where an agent could safely reduce cycle time without introducing unacceptable risk.
If you could deploy one agent in the next 90 days—one that employees would actually use—what would it be, and what would you require to trust it?