State’s CIO wants agentic AI that completes tasks—not just answers. See what it means for secure, mission-critical government operations.

Agentic AI at State: From Chatbots to Task Automation
45,000–50,000 people don’t adopt a new tool at the State Department by accident. That’s roughly how many employees are now using “StateChat,” the department’s enterprise generative-AI chatbot—out of a total workforce of about 80,000. For a government organization with global operations, stringent security requirements, and deeply ingrained processes, that usage number tells you something: internal AI is crossing the line from “interesting pilot” to “daily utility.”
Now the State Department’s CIO, Kelly Fletcher, is aiming for the next step: agentic AI—systems that don’t just answer questions, but take action across multiple systems. The example she gave is refreshingly practical: don’t only tell me my leave balance; submit the leave request in the separate HR system.
That sounds mundane, but it’s exactly the point. In national security and defense-adjacent environments, the path to meaningful AI impact often starts with boring, high-volume work—because that’s where administrative friction quietly drains mission time.
Agentic AI is “do the work,” not “answer the question”
Agentic AI matters because it turns assistance into execution. A chatbot summarizes policy; an agent completes the workflow.
Generative AI is already useful for:
- Drafting emails and memos
- Translating text
- Summarizing long policy documents
- Answering “where is the rule for X?” questions
Agentic AI builds on that with a different promise: give the system an objective, constraints, and permissions, and it completes multi-step tasks—often spanning identity, data access, approvals, and logging.
In State’s context, Fletcher described the desire for agents that can act on behalf of employees across tools and platforms. That’s not a nice-to-have feature; it’s what makes AI operationally relevant in large institutions where work is fragmented across:
- HR systems
- Financial and travel systems
- Case management tools
- Knowledge bases and policy libraries
- Cybersecurity platforms and ticketing systems
A concise way to frame it:
Chatbots reduce cognitive load. Agents reduce cycle time.
And cycle time is what leaders notice when staffing is tight and mission tempo doesn’t slow down.
Why administrative automation is a national security issue (not just “IT modernization”)
Reducing administrative toil increases mission capacity without increasing headcount. That’s the strategic connection that often gets missed.
This story sits squarely inside a broader shift across government: using AI to offset workforce pressure and speed up internal operations. Federal leadership has publicly discussed using AI to mitigate staffing losses, and State has seen its own changes, including the departure of its former chief data officer and AI officer earlier this year.
Here’s the defense and national security angle: every hour reclaimed from low-value administrative work is an hour returned to planning, analysis, diplomacy, cyber defense, and response coordination.
This is especially relevant heading into late 2025 and early 2026 budgeting cycles, when agencies face competing demands:
- Modernize legacy systems
- Improve cybersecurity readiness
- Support overseas operations
- Deliver faster services with constrained staffing
Agentic AI becomes attractive because it targets the “hidden tax” in government work: time lost to navigating systems, interpreting policies, and completing repetitive steps.
The practical example that explains everything: “moving your cat to Conakry”
Fletcher’s example—asking StateChat how to move a cat to Conakry—sounds quirky until you’ve lived in a bureaucracy.
It’s a perfect illustration of internal AI value:
- The policy exists, but it’s buried
- People spend time searching, emailing, or guessing
- The outcome is often inconsistent
StateChat can point employees to the relevant policy passages, summarize them, and let them verify the primary text. That’s already helpful.
Agentic AI would go further: once the employee confirms the plan, the agent could:
- Pre-fill the right forms
- Open the correct case/ticket
- Route approvals
- Attach required documentation
- Record the action for audit
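To make that concrete, here is a minimal Python sketch of the pattern, not State's actual architecture: a proposed plan of tool calls, each tied to a policy reference, that only runs after the employee confirms it and that logs every step. All tool names and fields are hypothetical placeholders.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AgentAction:
    """One step the agent proposes, tied to the policy it relies on."""
    tool: str          # hypothetical tool name, e.g. "forms.prefill"
    payload: dict
    policy_ref: str    # the policy passage this step is based on


@dataclass
class AuditLog:
    """Append-only record of everything the agent actually did."""
    entries: list = field(default_factory=list)

    def record(self, user: str, action: AgentAction, result: str) -> None:
        self.entries.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "tool": action.tool,
            "policy_ref": action.policy_ref,
            "result": result,
        })


def execute_tool(action: AgentAction) -> str:
    # Placeholder: a real deployment would call the HR, ticketing, or case system here
    return f"{action.tool} completed"


def run_plan(user: str, plan: list[AgentAction], confirmed: bool, log: AuditLog) -> None:
    """Execute the multi-step plan only after the employee has confirmed it."""
    if not confirmed:
        raise PermissionError("Plan must be confirmed by the employee before execution")
    for action in plan:
        result = execute_tool(action)
        log.record(user, action, result)  # every action lands in the audit trail
```

The shape matters more than the details: the plan is explicit, the human confirms it, and the record exists before anyone asks for it.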
That’s where “AI in government & public sector” stops being about experimentation and starts becoming digital operations.
Adoption is the hard part—and State’s numbers prove it
The biggest barrier to enterprise AI isn’t model quality; it’s employee confidence and habit change.
State rolled StateChat out to 3,000 beta testers, then expanded to tens of thousands of users. Fletcher also noted that adoption required “a huge amount of education and training,” including answering a basic question that stalls many deployments: Is it even allowable for me to use it?
That line should be printed on every AI program plan in the public sector.
If your agency or defense organization is building an AI assistant or agent program, assume you’ll need:
- Clear policy on acceptable use (what data, what tasks, what tools)
- Role-based training, not generic “AI 101”
- Reinforcement through champions in each bureau/unit
- Simple “what it’s for” guidance
Here’s what works in practice (and I’ve found this applies in almost every high-compliance environment): train people on job stories, not features.
Job story examples:
- “When I’m preparing a country briefing, I want policy references in 2 minutes, so I can validate my recommendation faster.”
- “When I’m submitting leave, I want the process handled end-to-end, so I don’t lose 20 minutes to system hopping.”
- “When I’m triaging security alerts, I want the top five most actionable items, so I don’t miss the one that matters.”
Features don’t change behavior. Outcomes do.
The real risks of agentic AI—and how to contain them
Agentic AI introduces new failure modes because it can act, not just speak. For defense and national security organizations, this is where programs succeed or get shut down.
The source article flags several legitimate concerns: oversight challenges, testing and evaluation difficulty, and job displacement. Let’s translate those into concrete program risks and controls.
1) Oversight: “Who approved that action?”
If an agent can submit a leave request, it can also submit something you didn’t intend. Oversight isn’t optional; it’s a design requirement.
Controls that scale:
- Human-in-the-loop checkpoints for high-impact actions (submit, approve, transmit)
- Permissioned tool access (least privilege, role-based)
- Full audit logs of agent actions, prompts, and system calls
- Policy-bound execution (agents must cite the policy/rule they’re following)
A strong stance: agents should be auditable by default, not as an add-on.
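One hedged way to make that default real is to route every agent action through a single gate that enforces least privilege, refuses to act without a policy citation, and logs the outcome either way. A minimal sketch, with illustrative role and tool names:

```python
import logging

logging.basicConfig(level=logging.INFO)
audit = logging.getLogger("agent.audit")

# Illustrative least-privilege map: which roles may invoke which tools
ROLE_TOOLS = {
    "employee": {"leave.read", "leave.submit"},
    "hr_specialist": {"leave.read", "leave.submit", "leave.approve"},
}

HIGH_IMPACT = {"leave.submit", "leave.approve"}  # actions that need a human checkpoint


def call_tool(role: str, tool: str, policy_ref: str, human_approved: bool = False) -> str:
    """Single choke point for agent actions: permissions, policy citation, HITL, logging."""
    if tool not in ROLE_TOOLS.get(role, set()):
        audit.warning("DENIED role=%s tool=%s (not permitted)", role, tool)
        raise PermissionError(f"{role} is not permitted to call {tool}")
    if not policy_ref:
        raise ValueError("Agent must cite the policy or rule it is following")
    if tool in HIGH_IMPACT and not human_approved:
        raise PermissionError(f"{tool} requires explicit human approval")
    audit.info("ALLOWED role=%s tool=%s policy=%s", role, tool, policy_ref)
    return f"{tool} executed"  # placeholder for the real system call
```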
2) Testing: you can’t QA an agent like a normal app
Agentic workflows are messy because inputs vary, systems change, and edge cases are infinite.
What actually works:
- Scenario-based evaluations (50–200 real workflows, measured repeatedly)
- Red-team testing focused on unsafe actions and permission escalation
- Synthetic “canary” tasks that detect drift (if success rate drops, pause)
- Rollback and graceful failure (if step 4 fails, don’t corrupt the record)
If your organization is serious about agentic AI in mission-critical operations, build a testing harness early. Don’t wait for production incidents to teach you what to measure.
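A testing harness does not need to be elaborate to be useful. Here is a minimal sketch, assuming you can replay recorded workflows against the agent and score each run as pass or fail; the threshold and field names are invented for the example:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Scenario:
    name: str
    run: Callable[[], bool]   # replays one recorded workflow; True if the agent succeeded
    is_canary: bool = False   # canaries are stable tasks used to detect drift


def evaluate(scenarios: list[Scenario], canary_floor: float = 0.95) -> dict:
    """Run every scenario, report success rates, and flag a pause if canaries drift."""
    results = {s.name: s.run() for s in scenarios}
    canaries = [results[s.name] for s in scenarios if s.is_canary]
    canary_rate = sum(canaries) / len(canaries) if canaries else 1.0
    return {
        "overall_success": sum(results.values()) / len(results),
        "canary_success": canary_rate,
        "pause_agent": canary_rate < canary_floor,  # below the floor, stop executing
        "failures": [name for name, ok in results.items() if not ok],
    }
```

Run it on a schedule. If the canary rate drops below the floor, pause execution permissions and investigate before a production incident does the measuring for you.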
3) Job impact: the goal is capacity, not churn
Agentic AI will change roles. Pretending otherwise creates internal resistance.
The best programs are explicit:
- Automate the toil, not the judgment
- Reinvest time into higher-value analysis, oversight, and mission planning
- Train staff to supervise agents (new “operator” and “auditor” skills)
A memorable rule: if you can’t explain how the workforce benefits, the workforce will assume it won’t.
Where agentic AI fits in defense and national security workflows
The most defensible early use cases are constrained, high-volume workflows with clear rules. State’s admin focus is a smart template for defense-adjacent organizations.
Here are agentic AI use cases that map naturally to defense and national security environments—without starting with the most sensitive mission actions:
1) Security operations triage (SOC / CSOC)
Fletcher mentioned prioritizing cybersecurity alerts. This is a strong fit for agents when implemented carefully.
Agent tasks:
- Correlate alerts across tools
- Enrich indicators with internal context
- Open tickets with recommended next steps
- Escalate only when confidence and impact thresholds are met
Guardrails:
- Agents recommend; humans authorize containment actions
- Tight access controls to endpoint tools
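As a rough illustration of "agents recommend; humans authorize," the escalation decision can be reduced to explicit confidence and impact thresholds. The scoring fields and cutoffs below are invented for the example:

```python
from dataclasses import dataclass, field


@dataclass
class EnrichedAlert:
    alert_id: str
    confidence: float                    # 0-1: how sure the agent is the alert is real
    impact: float                        # 0-1: estimated mission or system impact
    recommended_steps: list[str] = field(default_factory=list)


def triage(alert: EnrichedAlert, conf_min: float = 0.8, impact_min: float = 0.6) -> dict:
    """The agent enriches, recommends, and escalates; it never executes containment."""
    escalate = alert.confidence >= conf_min and alert.impact >= impact_min
    return {
        "alert_id": alert.alert_id,
        "action": "escalate_to_analyst" if escalate else "queue_for_review",
        "recommended_steps": alert.recommended_steps,  # suggestions only, human-authorized
    }
```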
2) Policy and compliance navigation
Chatbots already help. Agents can do the next step.
Agent tasks:
- Identify applicable policy
- Generate a compliant checklist for the user’s role
- Pre-fill forms and route approvals
- Attach required artifacts and create an audit trail
3) Travel, logistics, and workforce workflows
This is the “leave slip” pattern scaled up.
Agent tasks:
- Assemble travel packets
- Validate per diem rules
- Submit requests across systems
- Track approvals and notify stakeholders
4) Intelligence support (carefully scoped)
For intelligence analysis, agents should start with process support, not analytic conclusions.
Agent tasks:
- Pull relevant internal documents
- Build a source map and citation list
- Draft structured summaries with confidence notes
- Flag gaps that require human follow-up
The point is not to remove analysts. It’s to remove the time sink that keeps analysts from analyzing.
A practical rollout plan for agentic AI (that won’t implode)
The safest way to deploy agentic AI in government is to start narrow, prove reliability, then expand permissions.
A five-step approach that fits high-stakes environments:
1. Pick one workflow with clear success criteria
   - Example: leave requests, password resets, travel pre-approvals
   - Define success as: completion rate, time saved, error rate, user satisfaction
2. Integrate with identity and logging before adding capability
   - If you can't log actions end-to-end, you're not ready for agents
3. Use "confirm then execute" as the default interaction
   - Show the steps the agent will take
   - Require explicit confirmation for submissions
4. Create a permissions ladder (a minimal sketch follows this list)
   - Read-only → draft actions → submit actions → limited approvals
5. Measure adoption like a product, not a mandate
   - Weekly active users, repeat usage, workflow completion rate
   - Training completion isn't adoption; behavior is adoption
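The permissions ladder can be written down very literally, which is part of its value: everyone can see what the agent is allowed to do today. A minimal sketch, with the tier names matching the ladder above and everything else illustrative:

```python
from enum import IntEnum


class AgentTier(IntEnum):
    READ_ONLY = 1        # look things up, nothing else
    DRAFT = 2            # prepare forms and tickets, never submit
    SUBMIT = 3           # submit on the employee's behalf after confirmation
    LIMITED_APPROVE = 4  # approve narrowly scoped, low-risk items


# Illustrative mapping from actions to the minimum tier required
REQUIRED_TIER = {
    "lookup_policy": AgentTier.READ_ONLY,
    "prefill_form": AgentTier.DRAFT,
    "submit_leave_request": AgentTier.SUBMIT,
}


def allowed(action: str, agent_tier: AgentTier, user_confirmed: bool) -> bool:
    """Confirm-then-execute: anything at SUBMIT or above needs explicit confirmation."""
    needed = REQUIRED_TIER.get(action)
    if needed is None or agent_tier < needed:
        return False
    return user_confirmed if needed >= AgentTier.SUBMIT else True
```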
State’s experience—needing extensive training just to normalize use—should be taken as a warning and a roadmap.
What the State Department’s approach signals for AI in government & public sector
The trend line is clear: AI is shifting from “chat” to “operations.” State’s CIO is describing the same transition many defense and national security organizations are now making: embedding AI into the fabric of day-to-day work, then carefully granting it the ability to act.
That’s the right order. Build trust with assistive use cases. Put governance and logging in place. Then move to agents.
If you’re leading AI adoption in a defense, intelligence, or public sector organization, the most practical next step is to audit your workflows and identify where an agent could safely reduce cycle time without introducing unacceptable risk.
If you could deploy one agent in the next 90 days—one that employees would actually use—what would it be, and what would you require to trust it?