AI-driven anomaly detection can flag IAM credential abuse fast—before attackers spin up ECS/EC2 crypto miners and your AWS bill explodes.

AI Stops IAM Credential Abuse Before Crypto Bills Hit
A crypto-mining incident in AWS doesn’t start with malware. It starts with someone signing in successfully.
That’s the uncomfortable lesson from a large AWS cryptomining campaign observed in late 2025: attackers used compromised IAM credentials with admin-like privileges, spun up compute across ECS and EC2, and had miners running within about 10 minutes of initial access. No exotic cloud vulnerability required—just valid access and speed.
For teams building an AI in Cybersecurity program, this is a near-perfect case study. The attacker’s workflow is scripted, repetitive, and anomaly-rich. If you’re collecting the right signals, AI-driven anomaly detection can catch these patterns early—often before the first expensive instance ever reaches “running.”
What happened: credential theft, fast enumeration, faster mining
This campaign is straightforward in motivation (steal compute for profit) and sophisticated in execution (avoid detection and slow down remediation).
The observed sequence looks like this:
- Initial access: attacker logs in using compromised IAM user credentials, typically over-privileged.
- Discovery and permission probing: attacker enumerates resources and checks EC2 quotas.
- “Test without spending”: attacker uses `RunInstances` with `DryRun` to validate permissions without launching anything.
- Deployment at scale:
  - Creates dozens of ECS clusters (reports include 50+ clusters in a single environment)
  - Registers ECS task definitions referencing a malicious container image
  - Creates autoscaling groups designed to explode compute usage (ranges observed from 20 to 999 instances)
- Persistence and response friction:
  - Uses `ModifyInstanceAttribute` to set `disableApiTermination = True`, blocking termination until defenders explicitly re-enable it
  - Creates additional IAM roles and a Lambda function configuration that broadens access
The practical impact is simple: a sudden, massive cloud bill, noisy infrastructure sprawl, and an incident response team wrestling with controls designed to slow them down.
Why this attack works so often (and why most companies get it wrong)
This isn’t “cloud is insecure.” This is “identity is the new perimeter” playing out exactly as advertised.
Over-privileged IAM turns one stolen credential into full account control
If a compromised IAM principal can create roles, attach policies, create clusters, and scale fleets, you’ve essentially handed an attacker a cloud console with a corporate card.
I’ve found that many orgs still treat cloud identity as a setup task (“get access working”) rather than a continuously managed risk surface. The result:
- Long-lived access keys that don’t expire
- Wildcard permissions justified by “we need it for automation”
- IAM sprawl where nobody can answer “what does this user actually need?”
Attackers optimize for the first 10 minutes
The detail that should change your operating model: miners were operational within ~10 minutes.
That speed collapses the window where “someone will notice later” is a viable strategy. Detection has to be near-real-time, and response needs to be automated or at least pre-authorized.
“DryRun” is a tell—if you’re watching
The attacker’s use of DryRun is clever: it’s a low-cost way to test permissions while minimizing obvious impact.
Defenders often ignore “failed” or “non-executing” actions because they don’t change state. That’s a mistake. In cloud environments, permission probing is the reconnaissance phase, and it’s one of the cleanest signals you’ll get.
Snippet-worthy truth: In AWS, the earliest signs of compromise are often API calls that don’t create resources—because the attacker is mapping what they’re allowed to do.
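If you ship CloudTrail events into even a small analysis job, this signal is cheap to extract. A minimal sketch, assuming records have already been parsed into dicts with the standard `errorCode` and `userIdentity` fields; the error-code string and the threshold of three probes are assumptions to verify against your own logs:

```python
from collections import defaultdict

# CloudTrail records a successful DryRun check as an "error" on the API
# call. Counting those per principal surfaces permission probing.
DRY_RUN_ERROR = "Client.DryRunOperation"  # assumed error code; verify in your logs

def find_dry_run_probes(records, threshold=3):
    """Return {principal_arn: probe_count} for principals at or above threshold."""
    counts = defaultdict(int)
    for rec in records:
        if rec.get("errorCode") == DRY_RUN_ERROR:
            counts[rec.get("userIdentity", {}).get("arn", "unknown")] += 1
    return {arn: n for arn, n in counts.items() if n >= threshold}
```

In practice you would run this over a short rolling window (minutes, not days), so that a burst of probes from one principal stands out immediately.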
Where AI-powered anomaly detection fits (and what it should catch)
AI security for cloud isn’t about replacing CloudTrail or GuardDuty. It’s about making those signals actionable faster, especially when attackers blend into valid sessions.
Here are the AI detections that map tightly to this campaign.
1) Behavioral analytics for IAM: “This user never does that”
A solid AI-driven UEBA (user and entity behavior analytics) model can score risk when an identity deviates from its baseline.
High-signal examples from this campaign:
- An IAM user that historically performs read-only actions suddenly calling `CreateRole`, `AttachRolePolicy`, `CreateServiceLinkedRole`, `RegisterTaskDefinition`, or `CreateCluster`/`CreateService`
- A user performing rapid-fire enumeration across multiple services
- A new principal (or newly active one) initiating multi-service changes within minutes
This matters because compromised credentials often look “legitimate” to traditional access controls. AI helps by turning legitimacy into a probability: valid login, suspicious behavior.
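You don't need a deep model to get started on "this user never does that." A minimal sketch of per-identity rarity scoring, using add-one smoothing so an API call the identity has never made scores highest; this is an illustrative baseline, not a production UEBA:

```python
import math
from collections import Counter

def rarity_score(history, event_name):
    """Higher score = rarer call for this identity.

    history: list of past API event names for one principal.
    Returns -log of the smoothed probability of event_name.
    """
    counts = Counter(history)
    # Add-one smoothing: unseen events get a small nonzero probability.
    total = sum(counts.values()) + len(counts) + 1
    p = (counts.get(event_name, 0) + 1) / total
    return -math.log(p)
```

A real deployment would combine this with context (time of day, source IP, session age), but even raw rarity separates "read-only user suddenly calls `CreateRole`" from business as usual.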
2) Sequence detection: spotting the “cryptomining playbook”
This campaign isn’t random; it’s a repeatable workflow. AI models that evaluate event sequences (not isolated alerts) can detect the playbook earlier.
A practical sequence rule (human-readable version) could be:
- `RunInstances` with `DryRun`
- Then `CreateRole` or `AttachRolePolicy`
- Then ECS cluster/service creation and task registration
- Then autoscaling configured with unusually high max capacity
When those events happen within a tight time window (minutes, not hours), the confidence should spike.
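That rule can be sketched as a small state machine over a principal's event stream. The stage labels are placeholders for whatever normalized event names your pipeline emits, and the 15-minute window is an assumption, not a standard:

```python
from datetime import datetime, timedelta

# Ordered stages of the hypothetical cryptomining playbook.
PLAYBOOK = ["RunInstances(DryRun)", "CreateRole",
            "CreateCluster", "CreateAutoScalingGroup"]

def matches_playbook(events, window=timedelta(minutes=15)):
    """events: time-sorted list of (timestamp, event_name) for one principal.

    Returns True if all playbook stages appear in order within the window.
    """
    stage, start = 0, None
    for ts, name in events:
        # Window expired since the first matched stage: discard the partial match.
        if stage > 0 and ts - start > window:
            stage, start = 0, None
        if name == PLAYBOOK[stage]:
            if stage == 0:
                start = ts
            stage += 1
            if stage == len(PLAYBOOK):
                return True
    return False
```

Note what this buys you: each event on its own might be legitimate, but the ordered combination inside a tight window is what deserves a high-confidence alert.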
3) Cost-and-capacity anomalies: the attack’s “blast radius metric”
Cryptomining campaigns are loud in resource terms, even when they’re careful in IAM terms.
AI can flag:
- Sudden increases in requested vCPU/memory for ECS task definitions
- Abnormal GPU/ML instance launches for accounts that don’t run ML workloads
- Autoscaling maxima that are wildly inconsistent with historical usage (e.g., max 999)
- Bursty provisioning across multiple regions or AZs
The best setups tie this to automated containment (more on that below). Detecting is good; stopping the spend is better.
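For the autoscaling signal specifically, even a crude baseline comparison earns its keep. A sketch, where the 5x multiplier is an illustrative threshold you would tune per account:

```python
def capacity_anomaly(historical_max_sizes, requested_max, factor=5):
    """Flag an autoscaling max that dwarfs anything this account has used.

    historical_max_sizes: past max-capacity settings for this account/service.
    factor: how many times the historical peak counts as anomalous (assumed).
    """
    baseline = max(historical_max_sizes, default=1)
    return requested_max > factor * baseline
```

An account whose fleets historically cap at 10 instances requesting a max of 999 trips this instantly, which is exactly the shape observed in this campaign.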
4) Persistence friction: termination protection used as an evasion signal
The technique of setting `disableApiTermination = True` is not common in most environments. When it appears during an incident pattern, it’s a strong intent signal.
AI should treat the combination as high severity:
- Termination protection enabled
- On newly launched instances
- Coupled with scaling activity or new roles
If your tooling only flags “termination failed,” you’re late. The useful alert is “termination protection was enabled by an unusual principal.”
The defensive playbook: controls that actually reduce risk
The goal isn’t to “monitor more.” It’s to make credential abuse expensive for attackers and cheap for defenders.
Start with identity hygiene (because AI can’t fix admin-everything)
If you want AI to be effective, reduce the noise:
- Replace long-term access keys with temporary credentials wherever possible
- Enforce MFA for all human users (especially console access)
- Apply least privilege with role-based access and scoped policies
- Constrain role assumption (who can assume what, from where, under what conditions)
A hard stance that saves real money: if a user can create roles and attach policies, treat that identity as production-critical and protect it like you would a domain admin.
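On the MFA point, the well-known "deny most actions without MFA" IAM policy pattern is worth having on hand. A minimal sketch as a Python dict ready for `json.dumps`; the exempted actions below are a trimmed, illustrative subset, so expand them before attaching this to real users:

```python
import json

# Deny everything except MFA self-service when no MFA is present.
# BoolIfExists avoids locking out service principals that never carry the key.
DENY_WITHOUT_MFA = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyAllExceptMfaSetupIfNoMfa",
        "Effect": "Deny",
        "NotAction": [
            "iam:ChangePassword",
            "iam:GetUser",
            "iam:EnableMFADevice",
            "iam:ListMFADevices",
            "iam:ResyncMFADevice",
            "sts:GetSessionToken",
        ],
        "Resource": "*",
        "Condition": {"BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}},
    }],
}

policy_json = json.dumps(DENY_WITHOUT_MFA, indent=2)
```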
Add “guardrails” that block common cryptomining moves
These are high-ROI preventative controls:
- Permission boundaries or SCP-style restrictions that limit:
  - Who can create/attach IAM policies
  - Which instance families can be launched
  - Maximum autoscaling sizes
  - Where compute can be deployed (regions/accounts)
- Container controls that:
  - Restrict untrusted registries
  - Scan images and block known-bad or unsigned images
The theme: don’t rely on detection if you can safely prevent.
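Two of those guardrails translate directly into SCP statements. A sketch, again as a Python dict; the instance families, regions, and global-service exemptions are placeholders to adapt, not recommendations:

```python
# Deny instance families outside an allow-list, and deny activity outside
# approved regions (exempting a few global services so they keep working).
GUARDRAIL_SCP = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyUnapprovedInstanceFamilies",
            "Effect": "Deny",
            "Action": "ec2:RunInstances",
            "Resource": "arn:aws:ec2:*:*:instance/*",
            "Condition": {"StringNotLike": {"ec2:InstanceType": ["t3.*", "m5.*"]}},
        },
        {
            "Sid": "DenyUnapprovedRegions",
            "Effect": "Deny",
            "NotAction": ["iam:*", "sts:*", "support:*"],
            "Resource": "*",
            "Condition": {"StringNotEquals": {"aws:RequestedRegion": ["us-east-1", "eu-west-1"]}},
        },
    ],
}
```

A guardrail like the first one would have blunted this campaign on its own: GPU and large instance families simply refuse to launch, no detection required.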
Turn AI detections into fast containment actions
This is the operational heart of “AI in cybersecurity” for cloud: detection must connect to response.
Containment actions that map well to this campaign:
- Quarantine the principal:
  - Disable access keys / revoke sessions
  - Require step-up authentication
  - Block role assumption temporarily
- Stop the spend:
  - Suspend or cap autoscaling
  - Deny new instance launches via emergency policy
  - Kill ECS services/tasks that match suspicious definitions
- Remove response friction:
  - Automatically alert when termination protection is enabled
  - Run a remediation workflow that re-enables API termination (when approved)
If your team worries about accidental disruption, use a staged approach:
- Stage 1: alert + ticket + require human approval
- Stage 2: auto-contain only on high-confidence sequences
- Stage 3: auto-contain by default, with break-glass overrides
“Could AI have stopped this?” A realistic answer
Yes—if it’s aimed at the right problem.
AI wouldn’t “detect cryptomining” by reading minds. It would detect credential abuse and abnormal control-plane behavior early, because the attacker had to:
- Probe permissions using `DryRun`
- Create roles and attach policies
- Spin up many ECS clusters and services
- Push autoscaling limits
- Enable termination protection as an evasion move
Those are not subtle actions in most AWS accounts. The reason they succeed is that many organizations don’t baseline behavior, don’t correlate sequences, and don’t connect alerts to containment.
Another snippet-worthy truth: Cloud attacks are often control-plane attacks. If you can’t see and score API behavior, you’re blind at the most important layer.
Practical checklist: what to review this week
If you’re reading this and thinking “we should sanity-check our environment,” you’re right. Here’s a focused list that doesn’t require a six-month project.
- Inventory IAM users with long-lived access keys and prioritize removing them.
- Identify principals with admin-like privileges and confirm they truly need them.
- Review CloudTrail coverage across accounts/regions and ensure logs are protected from tampering.
- Create detections for these high-signal events:
  - `RunInstances` with `DryRun`
  - `CreateRole`, `AttachRolePolicy`, `CreateServiceLinkedRole`
  - ECS cluster/service bursts
  - Autoscaling max capacity spikes
  - `ModifyInstanceAttribute` enabling termination protection
- Decide your containment policy (who can approve, what gets auto-blocked, and when).
If you do only one thing: treat DryRun + rapid role changes + sudden compute provisioning as a critical incident until proven otherwise.
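The detection bullet in the checklist can start life as a single EventBridge rule. A sketch of the event pattern as a Python dict; the `detail-type` is the standard one EventBridge uses for CloudTrail-backed API calls, and the `eventName` list is the high-signal set above, which you should trim or extend for your environment:

```python
# Event pattern routing high-signal CloudTrail calls to a detection target
# (Lambda, SQS, a SIEM ingest, etc.) via an EventBridge rule.
HIGH_SIGNAL_PATTERN = {
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventName": [
            "RunInstances", "CreateRole", "AttachRolePolicy",
            "CreateServiceLinkedRole", "RegisterTaskDefinition",
            "CreateCluster", "CreateAutoScalingGroup",
            "ModifyInstanceAttribute",
        ]
    },
}
```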
Where this fits in the AI in Cybersecurity series
This campaign is a clean reminder that modern defense is less about signature-matching and more about behavioral analytics, anomaly detection, and automated response—the exact areas where AI earns its keep.
The next step is maturity: not just “we have alerts,” but “we can contain credential abuse in minutes.” If attackers can monetize your cloud in 10 minutes, your detection and response loop needs to be faster than their deployment script.
If you’re building an AI-driven cloud security program, what’s your current time-to-containment for suspicious IAM activity—minutes, hours, or days?