Treasury's Gatsby-style AI hiring test shows agencies are prioritizing AI literacy, sometimes awkwardly. Here's how to assess and hire for AI roles better.

Treasury's "Great Gatsby" AI Test: What It Signals
A federal AI job posting asked applicants to write a 10-page, citation-heavy analysis of metaphors in The Great Gatsby, then compress it into a 200-word executive summary, translate both into Spanish and Mandarin, compare themes across other novels in a table, and finally rewrite the essay like a scientific paper.
It's easy to laugh. But I don't think the bigger story is "government is weird." The bigger story is that agencies are struggling to hire for AI roles because the labor market has moved faster than government hiring playbooks. The Gatsby prompt isn't just quirky: it's a case study in how public sector employers are trying (sometimes awkwardly) to measure AI literacy, communication skill, and policy fluency in one shot.
This post is part of our AI in Government & Public Sector series, where we track how AI is changing government work: procurement, oversight, service delivery, and yes, hiring.
What the Gatsby assignment is really testing (and why that's risky)
Answer first: The Gatsby assignment is likely trying to test whether applicants can use generative AI tools to produce, compress, translate, and reformat content: skills that look like "prompting" but are really workflow design.
On paper, the tasks map neatly to what people do with modern AI systems:
- Long-form synthesis with citations (can you structure an argument and support it?)
- Executive summarization (can you communicate to leadership without losing substance?)
- Translation (can you serve multilingual stakeholders and spot nuance errors?)
- Comparative analysis table (can you structure information for decision-making?)
- Scientific rewrite (can you adapt voice and format for different audiences?)
Those are not random outputs. They mirror common government realities: memos to leadership, public-facing comms, cross-agency briefings, and documentation for oversight.
The risk: testing "prompt performance" instead of job performance
Answer first: If the role is about AI strategy, standards, architecture, and governance, then a literature analysis, even an AI-assisted one, can be a poor proxy for the real work.
A credible AI role in government typically involves tasks like:
- Evaluating model risk (privacy, security, bias, misuse)
- Designing human review and audit processes
- Writing policy and technical controls for AI deployment
- Working with procurement and legal teams on vendor claims
- Implementing monitoring for drift, quality, and incidents
A Gatsby-based prompt doesnât measure those directly. It mainly measures whether someone can produce polished content under constraints.
A public critique in the source story made a strong point: measuring prompt engineering alone is unlikely to match the day-to-day duties of a senior AI specialist. I agree with that critique, and I'd go further: government should avoid turning AI hiring into a content-generation contest.
Why agencies are raising the bar on AI literacy right now
Answer first: The public sector is treating AI literacy as a baseline skill, not a specialist hobby, because AI has moved into core operations: benefits, fraud detection, call centers, policy analysis, and cybersecurity.
As of late 2025, agencies are under pressure from three directions at once:
- Demand is exploding. More programs want AI-assisted analysis, triage, and decision support.
- Trust requirements are higher. Government can't ship "move fast and break things" workflows into eligibility determinations or enforcement.
- Workforces are constrained. Hiring freezes, attrition, and relocation limits mean agencies have to be selective (and creative) about assessment.
That context makes the Gatsby assignment feel less like a prank and more like an overcorrection: "We need people who can work with AI outputs, fast, and explain them to leadership."
A hiring reality agencies don't say out loud
Answer first: Many AI roles in government are "translation roles," bridging policy, tech, and operations, so writing and synthesis matter as much as coding.
If you've spent time around AI governance programs, you'll recognize the pattern:
- Leadership asks, "Is this safe? Is it legal? What's the ROI? What's the failure mode?"
- Program teams ask, "How do we use this without slowing down?"
- Security asks, "Where does data go? Who can access logs? What's retained?"
- Oversight asks, "Show me evidence. Show me controls. Show me accountability."
The person who succeeds isn't always the best model builder. It's often the one who can write clearly, quantify risk, and turn messy constraints into workable standards.
So yes, communication tests make sense. The problem is how they're designed.
Better ways to assess AI candidates than a 10-page literary essay
Answer first: Government should assess AI candidates with job-realistic work samples: policy memos, risk assessments, architecture reviews, and incident response simulations.
Here are four assessment formats that map cleanly to public sector AI work and are harder to "game" with generic prompting.
1) The AI procurement reality check
Give candidates a one-page vendor claim: "Our model is unbiased, secure, and explainable." Ask them to produce:
- A list of verification questions (data provenance, evaluation, red-team results)
- A shortlist of contract language they'd request (audit rights, logging, security controls)
- A risk-rated decision: approve, approve with conditions, or reject
This tests AI literacy where it matters: skepticism, specificity, and governance instinct.
2) A policy-to-implementation translation exercise
Provide a short AI policy requirement (e.g., "human-in-the-loop required for adverse decisions"). Ask candidates to translate it into:
- A process diagram of review steps
- A minimal audit log schema (what must be recorded; see the sketch below)
- A set of acceptance tests
If someone can do that well, they can probably operate in a real agency environment.
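To make this concrete, here is a minimal sketch, in Python, of the kind of audit log schema and acceptance checks a strong candidate might produce for the hypothetical "human-in-the-loop required for adverse decisions" requirement. The field names, the escalation rule, and the names AdverseDecisionReview and acceptance_check_failures are illustrative assumptions for this post, not an agency standard.

```python
# Minimal sketch of an audit log entry for a "human-in-the-loop required for
# adverse decisions" policy. Field names and the escalation rule are
# illustrative assumptions, not an official schema.
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from typing import Optional


@dataclass
class AdverseDecisionReview:
    case_id: str                 # the case or application the decision affects
    model_version: str           # which model or prompt version produced the output
    model_recommendation: str    # e.g., "deny", "flag", "approve"
    reviewer_id: str             # the human who reviewed the recommendation
    reviewer_decision: str       # the final, human-made decision
    reviewer_rationale: str      # short justification kept for the record
    reviewed_at: str             # ISO 8601 timestamp of the human review
    override: bool               # did the human disagree with the model?
    escalated_to: Optional[str] = None  # supervisor, if escalation was required


def acceptance_check_failures(entry: AdverseDecisionReview) -> list[str]:
    """Return a list of failed checks; an empty list means the entry passes."""
    failures = []
    if not entry.reviewer_id:
        failures.append("no human reviewer recorded")
    if not entry.reviewer_rationale.strip():
        failures.append("missing rationale for the adverse decision")
    if entry.override and entry.escalated_to is None:
        failures.append("override recorded without supervisor escalation")
    return failures


if __name__ == "__main__":
    entry = AdverseDecisionReview(
        case_id="2025-PILOT-0042",
        model_version="eligibility-screener-v3",
        model_recommendation="deny",
        reviewer_id="analyst.jsmith",
        reviewer_decision="approve",
        reviewer_rationale="OCR misread the income documents; applicant qualifies.",
        reviewed_at=datetime.now(timezone.utc).isoformat(),
        override=True,
        escalated_to="supervisor.alee",
    )
    print(asdict(entry))
    print("Failed checks:", acceptance_check_failures(entry) or "none")
```

The exact fields matter less than whether the candidate thinks to record who decided, on what basis, and when, and whether the acceptance checks can actually fail.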
3) An AI incident tabletop
Give a scenario: "Model quality drops 20% after a policy change," or "Sensitive data appears in a prompt log." Ask:
- Who needs to be notified (security, privacy, program, leadership)
- What systems should be paused
- What evidence is collected
- How the public communication is handled
This reveals maturity. It also reveals whether the candidate understands government accountability.
4) A constrained generative AI writing test (done right)
If you want to measure AI-assisted writing, do it with government-shaped artifacts:
- A two-page policy memo with an executive summary
- A public FAQ written at an 8th-grade reading level
- A bilingual notice with a review plan to validate translation accuracy
This keeps the "AI literacy" signal while staying relevant.
A strong hiring test doesn't ask, "Can you produce a lot of text?" It asks, "Can you produce defensible decisions under real constraints?"
What job seekers should learn from this (especially in late 2025)
Answer first: If you're applying for AI jobs in government, assume you'll be evaluated on communication, governance thinking, and practical AI operations, not just model knowledge.
Whether the Gatsby assignment was intentional, experimental, or simply misguided, it reflects a broader shift: AI roles are blending technical and administrative skill sets.
Here's what I've found works for candidates targeting public sector AI roles:
- Bring a portfolio of "government-style" writing. One-page briefings, risk notes, decision memos.
- Demonstrate evaluation literacy. Know how to talk about hallucinations, bias testing, and measurement in plain language.
- Show you understand constraints. Data access, privacy rules, procurement cycles, and security reviews aren't footnotes; they're the job.
- Be explicit about human review. Agencies want candidates who assume AI outputs must be verified, logged, and governed.
And if an application asks for a sprawling writing exercise: treat it like a signal. It may indicate the team values communication, or it may indicate the team doesn't yet know how to assess AI work. Either way, you'll want clarity during interviews.
What government leaders should take from the Gatsby moment
Answer first: AI hiring needs modernization: clearer role definitions, better assessments, and stronger alignment between job announcements and actual responsibilities.
If you're leading AI adoption in an agency, hiring is now a delivery risk. A few practical moves help quickly:
Align the assessment with the mission
If the job is "AI strategy and ethical deployment," then the assessment should include:
- governance scenarios
- risk controls
- documentation and auditability
A creative prompt can be fine. But it must connect to the actual work.
Make AI literacy measurable, not theatrical
Agencies can evaluate AI literacy with small, sharp tasks:
- detect model errors in a sample output
- propose a review workflow
- draft a minimal acceptable use policy for staff
These are 60- to 90-minute exercises, not multi-day essays.
Respect candidate time (especially for senior roles)
High-friction applications reduce your applicant pool, often filtering out exactly the people you want: those currently leading programs, managing teams, or already working in industry.
In a tight talent market, an overly burdensome process doesn't "raise the bar." It narrows the funnel.
Remember the December hiring window reality
Late December is a tough time to recruit. People are on leave, budgets are closing, and family schedules dominate. If your application closes quickly, you'll mainly reach candidates already watching job boards daily.
If you want broader, more diverse applicants, keep windows open longer and provide clearer work sample expectations.
The bigger trend: AI is becoming a core government competency
Answer first: The Gatsby assignment is a symptom of a larger shift: agencies are moving AI from pilots into operations, and they need people who can govern it responsibly.
This is the real thread connecting the story to the broader digital government transformation agenda. AI is no longer a side experiment. It's influencing:
- how agencies write and review policy
- how they deliver services at scale
- how they manage risk and oversight
- how they communicate with the public
That's why AI literacy keeps showing up in job descriptions, even when hiring teams are still learning how to test for it.
If you're building an AI program in the public sector, this is the moment to tighten the basics: clear standards, measurable assessments, and repeatable governance workflows. If you're applying for these roles, prepare to prove that you can do more than generate text: you can make AI accountable.
Where does your agency sit right now: still experimenting with AI tools, or already building the controls to run AI in production?