Treasury’s “Great Gatsby” AI job test sparked attention. Here’s what public-sector AI hiring should measure instead—and how to build fair, job-relevant assessments.

Treasury AI Jobs: What Hiring Tests Should Measure
A Treasury Department AI job posting recently asked applicants to do something most federal technologists didn’t see coming: write a 10-page, citation-heavy analysis of metaphors in The Great Gatsby, then produce an executive summary, translate it into Spanish and Mandarin, compare themes across other novels, and rewrite it as a scientific paper.
That requirement is so unusual it became the story. But the more useful conversation isn’t whether the assignment is quirky—it’s what it reveals about how government is trying (and sometimes struggling) to hire AI talent at the exact moment agencies need practical AI capability for service delivery, policy implementation, public safety, and responsible modernization.
Here’s my take: government shouldn’t be embarrassed that it’s experimenting with hiring screens. It should be embarrassed when those screens don’t measure the job. If the role is about secure, ethical AI deployment across Treasury, then the assessment should test secure, ethical AI deployment across Treasury.
The “Great Gatsby” test is a symptom, not the root problem
The immediate issue is fit: a literary analysis assignment mainly tests writing stamina, formatting discipline, and perhaps the applicant’s ability to instruct a generative AI tool to produce structured outputs. Those are real skills—but they’re not the core skills you need to set AI standards, architectures, and governance across a cabinet-level department.
The deeper issue is more common across the public sector: agencies want AI capability, but they don’t always have hiring signals that reliably predict performance in AI governance and delivery. So they reach for proxies—long essays, generic technical questionnaires, or vendor-style architecture prompts—because they’re familiar, easy to score, and feel “rigorous.”
Rigorous isn’t the same as relevant.
What an AI strategist at Treasury actually does
A role described as “formulating technical strategies, standards and architectures that advance the secure and ethical deployment of AI” is typically a mix of:
- AI governance and risk management (model risk, data risk, operational risk, compliance)
- Security engineering (threat modeling, secure MLOps, identity, logging, incident response)
- Architecture and integration (how models connect to systems, data flows, APIs, vendor tools)
- Policy-to-practice translation (turning principles into controls, checklists, and repeatable processes)
- Stakeholder navigation (legal, privacy, procurement, program teams, unions, oversight bodies)
If your selection task doesn’t test these, you’re likely selecting for something else—usually “who can write the longest thing under deadline.”
The right goal: test applied AI literacy, not prompt gymnastics
A lot of hiring managers are quietly trying to answer a tricky question: How do we evaluate AI skills when tools change every quarter?
The answer is to test applied AI literacy—skills that persist even as models and platforms evolve:
- Can the candidate define an AI use case in a way that’s measurable and legally defensible?
- Can they identify data dependencies, privacy constraints, and data quality risks?
- Can they build a risk register for an AI system and propose mitigations? (See the sketch after this list.)
- Can they design human oversight and accountability into a workflow?
- Can they show how to monitor drift, errors, and downstream impacts?
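The risk-register and drift bullets above are concrete enough to probe directly in an exercise. Below is a minimal sketch of what a workable answer could look like, assuming a hypothetical internal knowledge-search assistant; the field names, categories, and tolerance threshold are illustrative choices, not an agency standard.

```python
# Minimal, hypothetical sketch: a risk-register entry plus a crude drift check.
# Nothing here is a Treasury requirement; the structure is what matters.
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class RiskEntry:
    risk_id: str
    description: str               # what can go wrong
    category: str                  # e.g. "data", "model", "operational", "compliance"
    likelihood: str                # "low" | "medium" | "high"
    impact: str                    # "low" | "medium" | "high"
    mitigations: list[str] = field(default_factory=list)
    owner: str = "unassigned"      # accountability: who closes this risk out


def drift_alert(baseline_scores: list[float], recent_scores: list[float],
                tolerance: float = 0.05) -> bool:
    """Flag the system for review if average model confidence shifts
    more than `tolerance` away from the baseline window."""
    return abs(mean(recent_scores) - mean(baseline_scores)) > tolerance


register = [
    RiskEntry(
        risk_id="R-001",
        description="Assistant surfaces documents the requester is not cleared to see",
        category="data",
        likelihood="medium",
        impact="high",
        mitigations=[
            "enforce document-level access control at retrieval time",
            "log every query and returned document for audit",
        ],
        owner="system owner",
    ),
]
```

Even a sketch this small gives an interviewer something to probe: why that tolerance, who owns the entry, and what happens operationally when the alert fires.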
Prompting matters, sure. But “prompt engineering” is a small tool in a much bigger toolbox. If the job is about enterprise AI at Treasury, judging candidates on prompt writing is like judging a cybersecurity engineer by how quickly they can reset passwords.
A better screening question than “analyze Gatsby”
If you want a writing-based test (which is valid for strategy roles), make it job-shaped:
Write a 2–3 page memo recommending whether to deploy a generative AI assistant for internal knowledge search across Treasury. Include: data sources, access controls, privacy risks, model options, procurement approach, and a rollout plan.
That single prompt reveals more about real readiness than a 10-page literary analysis.
What a job-relevant AI hiring assessment looks like (and how to score it)
A good assessment for a government AI role should be scenario-based, bounded in time, and scorable with a rubric. Here’s a structure I’ve found works in the public sector because it balances realism with fairness.
Step 1: A 90-minute case exercise (with a clear artifact)
Give candidates a scenario and ask for one concrete deliverable:
- Option A: AI architecture one-pager (system diagram + data flows + controls)
- Option B: AI risk and compliance brief (risks, mitigations, acceptance criteria)
- Option C: Implementation plan (phases, owners, dependencies, success metrics)
Keep it short. Time-boxing matters because it tests prioritization.
Step 2: A structured follow-up interview
Use the candidate’s artifact as the agenda:
- “Walk me through your assumptions.”
- “What would change if the data contains PII?”
- “What’s your plan for model monitoring and incident response?”
- “How would you explain this to counsel and procurement?”
This is where you learn whether the deliverable reflects real understanding or surface-level fluency.
Step 3: A scoring rubric that matches the role
Score against competencies, not vibes (a simple weighted-scoring sketch follows the list):
- Security and privacy (access control, logging, threat awareness)
- Governance (accountability, documentation, oversight, testing)
- Architecture realism (integration, data flows, operational constraints)
- Risk thinking (failure modes, mitigations, monitoring)
- Communication (clarity, decision-ready writing)
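If it helps to make that concrete, here is a small sketch of weighted scoring against those competencies. The weights and the 1-to-5 scale are assumptions chosen for illustration; the real point is that every dimension gets rated and none can be skipped silently.

```python
# Hypothetical sketch: turning the competencies above into one weighted score.
# The weights and the 1-5 scale are illustrative, not a mandate.
RUBRIC_WEIGHTS = {
    "security_and_privacy": 0.25,
    "governance": 0.25,
    "architecture_realism": 0.20,
    "risk_thinking": 0.20,
    "communication": 0.10,
}


def score_candidate(ratings: dict[str, int]) -> float:
    """Combine per-competency ratings (1-5) into a single weighted score.

    Raising on a missing competency keeps evaluators from silently
    skipping a dimension.
    """
    missing = set(RUBRIC_WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return round(sum(RUBRIC_WEIGHTS[c] * ratings[c] for c in RUBRIC_WEIGHTS), 2)


print(score_candidate({
    "security_and_privacy": 4,
    "governance": 3,
    "architecture_realism": 4,
    "risk_thinking": 5,
    "communication": 3,
}))  # 3.85
```

Sharing the weights and scale with candidates up front also makes the process easier to explain and defend.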
If you can’t explain your rubric to a candidate, it’s probably not fair—or not defensible.
Why this matters more in 2026 than it did in 2023
Government AI hiring is happening in a tense environment: agencies are under pressure to modernize while also facing workforce churn, shifting telework rules, and new centralized recruiting efforts like short-term tech placements. Meanwhile, AI capabilities are moving from “pilot projects” to production systems that touch benefits, tax administration, fraud detection, cybersecurity, and customer experience.
When AI moves into production, the failure modes change:
- A weak pilot wastes time.
- A weak production deployment can create security exposure, civil liberties harm, and headline risk.
That’s why AI talent development in government can’t just be “hire smart people and hope.” Agencies need hiring processes that select for professionals who can build systems that survive audits, oversight, procurement constraints, and real-world adversaries.
The translation requirement: useful idea, odd execution
The posting also required translation into Spanish and Mandarin. On the surface, that reads as bizarre for a Treasury AI strategy role.
But there’s a legitimate kernel here: public-sector AI work is increasingly multilingual, especially when you consider accessibility, public-facing communications, and workforce enablement across diverse teams. Multilingual testing could be job-relevant in roles tied to customer communications or call-center modernization.
For an enterprise AI strategist, though, translation is better tested differently:
- Can the candidate produce plain-language explanations of AI decisions?
- Can they write public-trust-ready communications when an AI system changes a process?
- Can they help teams communicate limitations (what the model will not do) clearly?
If the goal is communication clarity, test communication clarity.
People also ask: what skills actually help you land an AI job in government?
The skills that consistently show up in successful government AI hires are practical, not trendy. Here’s what agencies tend to reward when they’re hiring well.
Do I need to be a machine learning engineer?
No. Many government AI roles sit in AI governance, AI product, data strategy, or AI security. You should understand how models work and fail, but you may not need to train models day to day.
What should be on my resume for a public-sector AI role?
Focus on outcomes and controls:
- Systems you deployed and how you reduced risk
- How you handled privacy, security, and compliance
- Your role in procurement, vendor management, or architecture reviews
- Monitoring, evaluation, and incident response for automated systems
What does “ethical AI” mean in government work?
In practice, it means you can translate values into operations (one concrete sketch follows this list):
- Documented purpose and limits
- Data governance and privacy controls
- Testing for performance and harmful errors
- Human oversight and appeals when decisions affect people
- Ongoing monitoring and accountability
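One way to make that translation tangible is to require a short system record before anything ships. The sketch below uses an invented example system and made-up field names; treat it as a shape to adapt, not a template any agency mandates.

```python
# Hypothetical sketch: the bullets above captured as a record a governance
# process could require before an AI system goes live. The example system
# and every field name are invented for illustration.
AI_SYSTEM_RECORD = {
    "system_name": "benefits-correspondence-drafter",  # made-up example
    "documented_purpose": "Draft first-pass replies to routine benefit inquiries",
    "explicit_limits": [
        "Never sends a reply without human review",
        "Not used for eligibility or adverse decisions",
    ],
    "data_governance": {
        "pii_present": True,
        "retention": "per agency records schedule",
        "access_controls": "role-based, logged",
    },
    "testing": ["accuracy on sampled historical inquiries", "harmful-error review"],
    "human_oversight": "caseworker approves every outgoing reply",
    "appeals_path": "existing agency appeals process, unchanged",
    "monitoring": "monthly error-rate and complaint review, with a named owner",
}
```

A candidate who can fill in a record like this for a real use case is demonstrating exactly the translation skill these bullets describe.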
If you’re an agency leader: fix hiring before you scale AI
If you lead AI, data, or digital transformation in government, here’s a blunt point: your hiring process is part of your AI governance. If you select the wrong talent, you’ll get the wrong systems.
Here’s a practical checklist for improving AI hiring assessments quickly:
- Start with a real use case your agency is considering (even if anonymized).
- Test one artifact the job actually produces (memo, architecture, risk brief).
- Time-box the work and provide evaluation criteria up front.
- Score with a rubric tied to security, governance, architecture, and communication.
- Include an oversight lens: audits, records retention, procurement constraints, FOIA realities.
Do this, and you’ll hire people who can deliver secure, trustworthy AI—not just people who can produce long documents.
A better way to think about “AI talent” in the public sector
The Gatsby assignment went viral because it’s weird. But it points to a serious need: government has to professionalize how it evaluates AI capability, the same way it professionalized cybersecurity hiring after a decade of incidents.
In the broader shift toward AI in government and the public sector, the agencies that win won’t be the ones that talk the most about AI. They’ll be the ones that:
- Hire for applied competence
- Build repeatable AI governance
- Ship services that earn trust under scrutiny
If you’re trying to build or join an AI team in government, here’s the practical question worth asking: Are we selecting for people who can produce impressive documents—or people who can operate AI responsibly in the real world?