
Humanoid Olympics: The Real Test for Useful Robots
A humanoid robot that can punch, kick, or do a flashy “sports demo” makes for good video. But it’s not the test that matters.
The real test is whether a robot can open a self-closing door, fix an inside-out sleeve, handle wet messes, and use tools without dropping them—the kind of unglamorous work that shows up in hospitals, warehouses, hotels, labs, and commercial kitchens. That’s why Benjie Holson’s “Humanoid Olympics” challenge hits a nerve: it’s a practical scoreboard for what AI-powered robotics can actually do today, and what it still can’t.
This post unpacks the challenge through the lens of our series Artificial Intelligence & Robotics: Transforming Industries Worldwide. It looks at what these events reveal about the current state of robotic manipulation, why progress has stalled on certain "simple" tasks, and how businesses can translate these milestones into a realistic roadmap for automation.
The uncomfortable truth: “Laundry-folding” doesn’t mean “general intelligence”
Robots folding towels look like magic because the object is flexible, chaotic, and hard to model. Yet towel folding sits in a sweet spot for modern robot learning: it’s repeatable, the environment can be controlled, and the task tolerates some slop.
Holson points to a critical misconception: if our AI techniques can do laundry, people assume they can do almost anything. They can’t.
Here’s why this matters to industry leaders evaluating humanoid robots or mobile manipulators:
- Many impressive demos are narrow policies trained on a constrained setup.
- The difference between a demo and deployment is usually one missing capability (force control, touch sensing, precision, robustness to variation).
- A task that looks “easy” to humans—like using a key—can be wildly harder than something that looks “hard,” like folding cloth.
A useful way to think about the Humanoid Olympics is as a checklist of missing capabilities that prevent robots from being dependable co-workers.
What’s working now in robot learning (and where it breaks)
Most of today’s real-world manipulation progress comes from learning from demonstration: a human teleoperates a robot (often via a “puppeteering” setup or VR controllers), records many short demonstrations, and trains a neural network to imitate them.
This approach is strong when:
- The task is 10–30 seconds long.
- The environment is similar across trials.
- “Close enough” outcomes are acceptable.
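The core loop can be sketched as a toy imitation policy. Real systems train neural networks on thousands of teleoperated demonstrations; this minimal nearest-neighbor version stands in for the same idea, mapping the current observation to the action a demonstrator took in the most similar recorded state. All names, observations, and actions here are illustrative, not from any real system.

```python
# Toy "learning from demonstration": look up the demonstrated action
# whose recorded observation is closest to what the robot sees now.
from math import dist

# Each demonstration step pairs an observation (e.g. gripper pose
# features) with the teleoperator's action (hypothetical labels).
demos = [
    ((0.10, 0.20), "move_left"),
    ((0.50, 0.55), "close_gripper"),
    ((0.90, 0.40), "lift"),
]

def imitation_policy(observation):
    """Return the demonstrated action from the nearest recorded state."""
    _, action = min(demos, key=lambda step: dist(step[0], observation))
    return action

print(imitation_policy((0.48, 0.60)))  # near the "close_gripper" demo
```

Note what this toy makes obvious: the policy can only be as good as the demonstrations, and it has no notion of states it never saw, which is exactly where the limitations below bite.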
Holson also names the limitations that show up again and again in robotics labs and pilots:
1) Weak force feedback and wrist-level control
Teleoperators often don’t feel what the robot feels. Without good force feedback, the human demonstration itself can be noisy—and the learned policy inherits that weakness. In industrial terms, this is why robots struggle with tasks that require high torque plus finesse, like stubborn doors or stuck jars.
2) Limited finger dexterity
Even advanced hands with many degrees of freedom are hard to control reliably. In practice, many systems revert to “open/close” behaviors because consistent fingertip-level control is difficult to teleoperate and difficult to learn.
3) No practical, high-resolution touch
Human hands have dense tactile sensing. Most robot hands don’t. That gap matters most for tasks where vision is ambiguous—like separating a thin plastic bag opening or aligning a key in a lock.
4) Medium precision (often centimeter-level)
Many demos operate at roughly 1–3 cm precision. That’s fine for wiping a counter. It’s not fine for buttons, keys, or delicate medical supplies.
A good robotics roadmap starts by asking: “Is this task vision-dominant, force-dominant, or touch-dominant?” Most failures are predictable once you label the task correctly.
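That triage question can be made concrete with a small helper. This is a hypothetical sketch, not a real taxonomy: the keyword lists are illustrative placeholders for the judgment call an automation planner actually makes.

```python
# Hypothetical triage: label a task as vision-, force-, or
# touch-dominant before planning automation around it.
def dominant_modality(task: str) -> str:
    task = task.lower()
    # Contact details that cameras cannot resolve -> touch-dominant.
    if any(w in task for w in ("insert", "peel", "separate", "key", "bag")):
        return "touch"
    # Torque and compliance dominate -> force-dominant.
    if any(w in task for w in ("door", "jar", "scrub", "wipe", "push")):
        return "force"
    # Default: locating and tracking objects -> vision-dominant.
    return "vision"

for t in ("open a self-closing door", "align a key in a lock", "sort parcels"):
    print(t, "->", dominant_modality(t))
```

Even this crude labeling predicts the failure modes discussed above: touch-dominant tasks fail without tactile sensing, force-dominant tasks fail without compliance and force feedback.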
The Humanoid Olympics events: what each one really tests
Holson’s five event categories are clever because each is an “ability cluster,” not a gimmick. If a robot can do the gold medal version, it’s not just completing a household chore—it’s demonstrating a capability that transfers to high-value industrial work.
Event 1: Doors (mobility + force + whole-body coordination)
Answer first: Doors are hard because they combine asymmetric forces, precision grasping, and whole-body motion.
A push door with a round knob already requires coordinated torque and grip. Add self-closing force and you get an interaction where pulling in the wrong direction causes the grasp to slip. Add a pull door and it becomes a full-body planning problem: you may need a second limb to hold the door open, or to move through quickly enough to exploit the door's momentum.
Industry translation:
- Hospitals: robots moving between patient rooms, supply closets, and sterile areas
- Warehouses: back-of-house doors, emergency exits, security doors
- Retail: stockroom access
Doors are the “you can’t avoid the real world” benchmark. If a robot can’t pass through common doors reliably, you’ll end up redesigning your facility around it—and most businesses won’t.
Event 2: Laundry (deformables + in-hand work + long-horizon reliability)
Answer first: Laundry tasks are less about folding and more about handling deformable objects across many steps.
An inside-out T-shirt adds a long-horizon sequence: grasp, pull-through, regrasp, orient, then fold. Turning a sock inside-out introduces insertion plus pinch-and-pull actions that are hard without touch and fingertip precision. Hanging a dress shirt is the real stress test: sleeves, orientation, hanger alignment, and then the dreaded part—buttons.
Industry translation:
- Hotels and elder care: linens, gowns, and bedding (similar but never identical items, in constantly varying condition)
- Clean rooms and labs: garment handling and packaging
- Apparel logistics: returns processing and sorting
I’ll take a stance here: apparel handling is one of the biggest hidden automation markets, and it won’t be solved by “better vision” alone. It will be solved by tactile sensing, better hands, and policies that recover gracefully from partial failures.
Event 3: Tools (strength + stable grasps + regrasping)
Answer first: Tool use forces robots to move beyond pinching objects and into stable, high-force grasps.
Spraying window cleaner requires isolating a finger and applying enough force on a trigger while holding the bottle. Making peanut butter sandwiches adds a strong tool grasp plus scooping/spreading with controlled contact. Using a key is the capstone: select the right key from a ring, align, insert, and turn—all without putting the keys down.
Industry translation:
- Maintenance: spray bottles, scrapers, handheld scanners
- Food service: utensils, containers, repetitive prep
- Healthcare: handling instruments, dispensers, packaging
Tool use is the gateway to flexibility. A robot that can use tools doesn’t need a custom end-effector for every job.
Event 4: Fingertip manipulation (the “small motions” problem)
Answer first: Fingertip manipulation is where many systems hit a wall because vision can’t see contact well enough.
Rolling socks is doable but still demands coordinated in-hand motion. Opening a dog poop bag is sneakily hard: separating a thin plastic opening is a high-friction, high-precision, touch-dominant task. Peeling an orange adds high force and delicate control, with unpredictable tearing.
Industry translation:
- Medical: opening packaging, handling soft materials, small disposables
- Electronics: cable management, connectors, delicate parts
- E-commerce: packing, bagging, and returns
This is where tactile sensing becomes non-negotiable. If your automation plan includes “open, separate, peel, or tear,” you’re shopping for touch.
Event 5: Wet manipulation (robustness + materials + safety)
Answer first: Wet work is a deployment requirement, not an edge case.
Wiping counters introduces water risk and slipping objects. Cleaning peanut butter off the manipulator is a real-world necessity: robots that handle food or waste must clean themselves or be cleanable. Washing grease off a pan in a sink combines water, soap, grease, variable friction, and unpleasantness—exactly the sort of task that humans don’t want and businesses pay for.
Industry translation:
- Commercial kitchens: dishwashing and station cleaning
- Facilities: janitorial work, restrooms, surface sanitation
- Healthcare: infection control routines
A lot of robotics roadmaps ignore wet tasks until the end. That’s backwards. If the robot can’t survive splashes and messy contacts, it won’t last a week in the environments that most need automation.
Why these benchmarks matter for businesses buying robotics in 2026
Holson’s rules are strict—autonomous, real-time video, no cuts, time limits—and that’s exactly why the Olympics framing works. It filters out “demo theater.”
For buyers and operators, the real value is using these events as an evaluation template. If you’re considering humanoid robots, mobile manipulators, or AI-powered robotics for operations, ask vendors questions mapped to these capabilities:
- Force and compliance: Can the robot maintain grip under changing loads? How does it detect slip?
- Tactile sensing: Does it have fingertip touch? If not, what tasks does it explicitly avoid?
- Recovery behaviors: What happens after a fumble—does it regrasp and continue, or does it fail?
- Variance tolerance: How does performance change with different door types, different fabrics, different soaps?
- Cleanability and uptime: Can it be washed down? How are electronics sealed? What’s the maintenance cycle?
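The questions above can be turned into a simple scorecard for comparing vendors side by side. This is an illustrative rubric under assumed categories and an equal-weight average, not an industry standard.

```python
# Hypothetical vendor scorecard: map pass/partial/fail answers on the
# five capability questions to a single comparable score.
SCORES = {"pass": 1.0, "partial": 0.5, "fail": 0.0}

CRITERIA = [
    "force_and_compliance",
    "tactile_sensing",
    "recovery_behaviors",
    "variance_tolerance",
    "cleanability_and_uptime",
]

def evaluate(vendor_answers: dict) -> float:
    """Average score across criteria; unanswered criteria count as fail."""
    total = sum(SCORES[vendor_answers.get(c, "fail")] for c in CRITERIA)
    return total / len(CRITERIA)

demo = {
    "force_and_compliance": "pass",
    "tactile_sensing": "fail",
    "recovery_behaviors": "partial",
    "variance_tolerance": "partial",
    "cleanability_and_uptime": "pass",
}
print(evaluate(demo))  # (1 + 0 + 0.5 + 0.5 + 1) / 5 = 0.6
```

The point of the rubric is not the number; it is that unanswered questions count as failures, which mirrors how gaps show up in deployment.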
A procurement shortcut that works: ask for one uncut run on a task you care about, in your environment, with your objects. If they can’t do that, the rest is noise.
A practical roadmap: how “Humanoid Olympics” becomes an innovation catalyst
If you’re building robotics programs (or betting on the sector), these events point to the R&D priorities that will shape the next wave of industry transformation:
- Better hands: not necessarily more degrees of freedom, but more controllable, durable, and serviceable designs
- Tactile-first learning: policies that use touch as a primary signal, not an afterthought
- Force-aware teleoperation: higher-quality demonstrations, less brittle imitation
- Foundation models that plan + recover: moving from “repeat the demo” to “achieve the goal even when things go wrong”
- Wet-ready hardware: sealed joints, corrosion resistance, and materials designed for cleaning
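The "plan + recover" priority in the list above is a control pattern, not just a model property: instead of replaying a fixed demonstration, the controller checks the outcome after each attempt and falls back to a recovery action on failure. A minimal sketch, with a deliberately flaky placeholder gripper standing in for real perception and control code:

```python
# Minimal "plan + recover" loop: retry with a recovery step instead of
# failing on the first fumble. FlakyGripper is a hypothetical stand-in
# that fails its first two grasps, to exercise the recovery path.
class FlakyGripper:
    """Hypothetical gripper whose grasp fails the first two tries."""
    def __init__(self):
        self.tries = 0

    def attempt_grasp(self) -> bool:
        self.tries += 1
        return self.tries >= 3  # succeeds on the third attempt

def grasp_with_recovery(gripper, max_attempts=5):
    """Return the attempt number that succeeded, or -1 if all failed."""
    for attempt in range(1, max_attempts + 1):
        if gripper.attempt_grasp():
            return attempt
        # Recovery step: a real system would regrasp from a new pose,
        # re-perceive the object, or ask a planner for an alternative.
    return -1

print(grasp_with_recovery(FlakyGripper()))  # succeeds on attempt 3
```

This is the behavioral difference between "repeat the demo" and "achieve the goal": the loop treats a fumble as a state to recover from, not a terminal failure.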
My opinion: the teams that win these benchmarks won’t be the ones with the fanciest model alone. They’ll be the teams that treat hardware, data collection, and policy training as one system.
What to do next if you’re exploring AI-powered robotics
If your goal is near-term operational value (not just experimentation), pick one capability cluster from the Olympics and pilot around it.
- If you run a hospital or hotel: start with doors + wet manipulation (mobility and cleaning are the day-one constraints).
- If you run apparel or returns logistics: prioritize deformables + fingertip manipulation.
- If you run food service or facilities: focus on tool use + cleanability.
The broader theme of this series—Artificial Intelligence & Robotics transforming industries worldwide—isn’t about humanoids replacing everyone. It’s about robots absorbing the jobs that are repetitive, messy, high-friction, and staffing-sensitive, while humans move to supervision, exception handling, and higher-trust work.
The question worth asking as we head into 2026: Which “gold medal” task will become the first boring, dependable feature in commercial robots—and which industry will benefit first?