Teaching Metal to Move
A Conversation with Flexion CEO Nikita Rudin
One of the central challenges in building AI is accounting for the idiosyncrasies of human behavior.
LLMs require massive datasets because language is so varied and context-specific. Coding agents have to navigate the individual quirks of how engineers build software. Autonomous cars must learn to anticipate human errors and driving patterns.
That’s why synthetic environments only get you so far in fine-tuning LLMs. When human behavior is a key variable in the system, it’s hard to simulate reality.
Robotics is different. The hardest problems are not about humans but about physics. How a leg balances against gravity, a hand grips a wire cable, a body recovers from a stumble.
Unlike human behavior, physics is dictated by laws. Gravity, friction, and momentum can all be modeled and derived from equations.
That invariance is why simulated environments can go much further in training robots than in training LLMs.
Nikita Rudin is building a company off this insight.
Nikita is the CEO and co-founder at Flexion, a young startup building AI software for humanoid[1] robots. His team is developing a general-purpose “brain” that can be fine-tuned and customized in simulated environments, then deployed across different robot types in their customer network.
Before Flexion, Nikita and his co-founder David received their PhDs in robotics at ETH Zurich and worked together at NVIDIA.
Nikita also co-authored the original Isaac Gym[2] paper, the breakthrough research from NVIDIA that established GPU-accelerated simulation[3] as a viable way to train robots at scale.
Nikita and I talked about how robots are being built today, his thoughts on how the market will mature, and why the underlying tools in robotics are advancing so quickly.
He has a plasticity in his thinking that I love seeing in founders. Technical depth paired with creativity. A clear point of view earned by real experience, but held with genuine curiosity.
He even offered a prediction for when we will start to see lots more robots in the wild. (Sorry Melania, the robots are not here – at least not yet!)
01 | Days versus years
EO: You’ve been in robotics a while, before all the current hype. I’d love to hear your story and how that connects to the thesis behind Flexion.
NR: I was doing my PhD at ETH Zurich in reinforcement learning (RL) for robotics and also working part-time at NVIDIA. This was before AI took off, back when NVIDIA’s GPUs were mostly powering video games.
At NVIDIA, my co-founder David and I were on the team building Isaac Gym, Isaac Sim, and Isaac Lab.[4] These are NVIDIA’s open-source simulation tools, which have since become foundational in the field of robot learning.
During this time, we saw how fast the technology was moving.
In 2020, getting a quadruped, a four-legged robot, to take a few steps without falling was a big deal. By 2024, the same quadruped could jump and climb through collapsed buildings and piles of gravel, terrain I’d struggle to walk across myself.
Impressive, in four years…
The progress was amazing.
Over time, I got involved in some humanoid projects. If you remember Jensen’s NVIDIA keynote two years ago, where he first showcased robots on stage, David and I were heavily involved in everything shown then.
Working on humanoids taught us two things.
First, the same RL approach works across different robot form factors, whether you’re working with a quadruped or humanoid.
Second, and more importantly, policies[5] learned on one robot transfer to others very efficiently.
What makes that transfer so efficient? I’d have assumed, incorrectly it sounds like, that in robotics, hardware and software are too tightly coupled for this methodology to transfer so easily.
Here’s how our pipeline works.
Each robot enters our digital simulation environment as a spec file, describing the hardware and its kinematics[6] in a standardized format.
Inside the simulation, we train the robot’s control policy[7] with reinforcement learning. The robot attempts a task many times, gets rewarded for progress, and the neural network gradually learns how to control the body.
Once trained, we deploy the policy to the physical robot, where it sends commands to the motors to navigate the real world.
So essentially, you’re feeding the simulator a description of the body, training the brain in a simulated environment to operate that body, and then dropping the brain into a physical system.
That’s right. And because the initial spec files are standardized, plugging in a new robot form factor is relatively straightforward. Also, everything around training the “brain” – the simulator, the general RL methods, our pipelines – is reusable.
The motor interfaces, which let us deploy a trained policy back into a robot, aren’t fully standardized across the industry yet. But they’re close enough that we can usually deploy without much friction.
In other words, the standards and system boundaries in robotics are pretty well defined. And that lets us deploy our training methods across many different robot types.
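To make that pipeline concrete, here is a minimal sketch of the spec-to-deployment loop: parse a spec file, train a control policy in a toy simulator, and export the result for the robot. Everything here – the spec format, the one-joint dynamics, and the random-search stand-in for RL – is an illustrative assumption, not Flexion’s actual stack, which would sit on a simulator like Isaac Lab and a more serious RL algorithm.

```python
# A minimal, illustrative sketch of the spec -> simulate -> train -> deploy pipeline.
# The spec format, toy physics, and hill-climbing "RL" are stand-ins, not Flexion's stack.
import json
import numpy as np

ROBOT_SPEC = json.loads("""
{
  "name": "toy_arm",
  "joints": [{"name": "elbow", "torque_limit": 2.0}],
  "mass": 1.0
}
""")

def simulate_episode(policy_weights, spec, steps=200, dt=0.02):
    """Roll out one episode in a toy 1-DoF simulator and return total reward."""
    pos, vel, target = 0.0, 0.0, 1.0          # joint angle, velocity, goal angle
    limit = spec["joints"][0]["torque_limit"]
    reward = 0.0
    for _ in range(steps):
        obs = np.array([pos - target, vel])
        torque = float(np.clip(policy_weights @ obs, -limit, limit))
        vel += (torque / spec["mass"]) * dt    # toy dynamics: F = m * a
        pos += vel * dt
        reward += -abs(pos - target)           # reward progress toward the target
    return reward

def train(spec, iterations=300, noise=0.1, seed=0):
    """Crude random-search RL: keep perturbations of the policy that score better."""
    rng = np.random.default_rng(seed)
    weights = np.zeros(2)
    best = simulate_episode(weights, spec)
    for _ in range(iterations):
        candidate = weights + noise * rng.standard_normal(2)
        score = simulate_episode(candidate, spec)
        if score > best:
            weights, best = candidate, score
    return weights

if __name__ == "__main__":
    policy = train(ROBOT_SPEC)
    np.save("policy.npy", policy)  # "deployment": export weights for the onboard controller
    print("trained gains:", policy, "reward:", simulate_episode(policy, ROBOT_SPEC))
```

The structure mirrors the separation Nikita describes: the spec file is the only robot-specific input, while the simulator, the reward, and the training loop are reused across form factors.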
So you were observing the generality and scalability of these methods. What gave you that extra kick to start something from scratch?
Very quickly, we saw that we could run our software on other people’s robots, sometimes without ever seeing those robots in person. And we could often do so more efficiently than their internal teams.
That gave us the conviction to build a company around this approach.
Can you quantify how much more efficiently?
Days versus years.
Okay, well that’s a compelling reason to start something!
Yeah. We developed this belief that simulation together with reinforcement learning should be used as much as possible.
I finished my PhD, we left NVIDIA, and started Flexion.
Give us a snapshot of the company today. Where are you in the R&D cycle?
We’re just a little over a year old. We have 45 people in Zurich, and just opened our SF office. We have multiple customers. Some are robotics companies integrating our software. Some are big industrial players deploying robots in their factories and warehouses.
I’d say we are as deployed as it gets in robotics, which, to be honest, is still very early.
02 | Becoming Android for robotics
One assumption behind Flexion is that you can be competitive as the “Android for robotics” – a horizontal, hardware-agnostic system that can be trained to run on any humanoid.
A lot of smart people are making the opposite bet. Tesla and Figure, for example, are both vertically integrated. Do you think they’re missing something?
I don’t think they’re wrong. Vertical integration lets you control your own destiny. But only if you can do it more efficiently than partnering.
The market will probably support one, maybe two, vertical players. Tesla will probably be one of them.
But beyond that, you’ll have a huge ecosystem of others that will own their section of the supply chain. Because combining hardware and software, being excellent at every part of the stack, and then scaling it to thousands of robots, is extremely hard. Being world-class in one part of the chain is easier.
The self-driving car market is a good analog. Early on, everyone was vertically integrated. Most died. A few big companies survived. And now, traditional OEMs are increasingly buying third-party software from companies like Wayve and Mobileye to power their vehicles.
We’re seeing the same thing in robotics. Everyone started wanting to be vertically integrated. Now the ecosystem is shifting away from that.
That’s what our customers are doing. And you see companies like Boston Dynamics and Apptronik now partnering with DeepMind.
It makes sense. Otherwise, everyone has to relearn the same hard lessons of building AI software for robots from scratch.
I see. So unless you have extraordinary scale and capital, it’s too hard to own everything.
Absolutely. I think vertical integration is a story investors like to hear. Because it’s easy to envision the moat. But technically, in the majority of cases, it makes more sense to go horizontal.
Let’s assume, for a second, you have unlimited capital. Is there a technical case for why vertical integration will usually underperform?
Well, talent is still a limited resource. There are maybe 200 people in the world that can do this kind of work. Throwing money at it will not solve the problem.
It rarely does. But fair – let’s assume unlimited resources. Capital, talent, time.
I’m trying to isolate if there’s a technical rationale for this horizontal strategy. An edge you, your suppliers, and your customers have that vertical players can’t recreate.
Technically speaking, the limiting factor in robotics is data, and specifically, a diversity of data. You need to constantly collect new types of data and train on it to maintain a performance edge.
If you’re vertically integrated, you only learn from your own network. By partnering with other providers, you can benefit from a much wider network of robots and partners, all collecting more data and learning about what is working.
What’s a tangible way you’ve seen that play out?
There are three main types of actuators and gearboxes.[8] These are the hardware components that move a robot’s joints. Every robotics company has to bet on one of the three types for their product.
Because our software runs on all three, we know the real trade-offs in production. We have the data, and can advise our customers on the best option for their specific use case.
03 | Most people get simulation wrong
At Flexion, you’ve made a big bet on simulation. What do you think about the sim-to-real gap, the argument that training in simulated environments won’t get us anywhere near commercial deployment?
We believe the point where simulation hits diminishing returns is much further away than most people think.
Throughout my PhD, everyone said simulators were limited. But we kept pushing the frontier with them, proving them wrong. We’ll keep doing that in the future.
Why is simulation so effective? I spend most of my time in software, where LLMs tend to improve fastest when they’re tuned on real production data. Synthetic environments only take you so far. That’s not the same in robotics?
Robotics is different. People make this same false comparison between robotics and self-driving cars too, but the analogy doesn’t hold.
In software applications and self-driving cars, anomalous behavior comes from humans.
In robotics, we’re solving actual, physical interactions between robots and objects – holding a certain weight, precise placement, things like that. Those are easier to simulate than human behavior.
At some point, simulated training will get advanced enough that we’ll need real-world data to push the frontier. But today, that’s not the bottleneck.
At what point do the “pessimists” think simulation will hit a wall?
Right now, there is a narrative that simulated RL can’t work for manipulation, how a robot handles and interacts with objects.
I agree manipulation is harder to train than locomotion.[9] You’re asking the robot to interact with exotic materials – soft objects, cables, and liquids.
But simulation as a field is advancing very quickly. You can now train on those materials in a simulated environment in a way you couldn’t just a few years ago.
Will simulation continue to advance, or will progress start to decay?
It will keep advancing.
First off, simulation isn’t really a data problem. Many AI advances hit diminishing returns as datasets grow larger and harder to collect. But GPU-accelerated simulators aren’t trained on data — they’re just math and physics, encoded to run efficiently on GPUs. Companies like Google and NVIDIA are investing heavily to make these simulators run faster and capture the physics of the real world more accurately. That investment from Big Tech will push things forward.
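As a hedged illustration of that first point – the physics update is just batched math – here is a toy sketch of stepping thousands of pendulum environments in one vectorized operation. The dynamics, environment count, and use of PyTorch are assumptions for illustration; real GPU simulators such as Isaac Gym and Isaac Lab implement full rigid-body and contact dynamics.

```python
# Illustrative only: stepping thousands of toy pendulum environments in one batched update.
# Real GPU simulators (e.g., Isaac Gym / Isaac Lab) model full rigid-body contact dynamics.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
num_envs = 4096                      # thousands of virtual robots in parallel
dt, g, length = 0.01, 9.81, 1.0

# State for every environment lives in one big tensor: angle and angular velocity.
theta = torch.rand(num_envs, device=device) * 0.2 - 0.1
omega = torch.zeros(num_envs, device=device)

def step(torque):
    """Semi-implicit Euler update applied to all environments at once."""
    global theta, omega
    alpha = torque - (g / length) * torch.sin(theta)   # angular acceleration
    omega = omega + alpha * dt
    theta = theta + omega * dt

# One training iteration's worth of physics: 1,000 steps across every environment.
torques = torch.zeros(num_envs, device=device)
for _ in range(1000):
    step(torques)

print(f"simulated {num_envs * 1000:,} environment-steps on {device}")
```

Because every environment lives in the same tensors, adding more robots mostly means adding rows, which is why trial-and-error training that would take years in the real world can be compressed into hours.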
Second, coding agents are a big accelerant. Good simulations need many intricate scenarios, with lots of variation. Building those used to be manual. Now, Claude Code can build whole simulation assets from scratch, interpolating from just a few examples.
That’s pretty amazing.
And third, there’s a lot of algorithmic progress happening, which is where we spend a lot of time at Flexion.
For example, RL works by trial and error. The robot tries random actions, stumbles onto solutions, and gets rewarded when they work. But manipulation requires a very precise sequence of actions. The space of “right” sequences is so narrow that you can burn a lot of calories in training and never land on one that works.
We solve this by bootstrapping[10] the RL process with human demonstrations – either a person remote-controlling the robot through the task a few times, or video of a human performing the task from their own point of view.
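One common way to implement that kind of bootstrapping, sketched here under my own assumptions rather than as Flexion’s exact method, is to pre-train the policy with behavior cloning on a handful of demonstrations, then start RL from that warm start so exploration begins near a working action sequence.

```python
# Sketch of demonstration bootstrapping: behavior cloning on a few demos before RL fine-tuning.
# The network sizes, data, and loop structure are illustrative placeholders.
import torch
import torch.nn as nn

obs_dim, act_dim = 8, 3

policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, act_dim))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

# A "few high-quality examples": (observation, action) pairs from teleoperation or video.
# Random tensors stand in for recorded demonstrations here.
demo_obs = torch.randn(50, obs_dim)
demo_act = torch.randn(50, act_dim)

# Stage 1: behavior cloning – regress the policy onto the demonstrated actions.
for _ in range(200):
    loss = nn.functional.mse_loss(policy(demo_obs), demo_act)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Stage 2: RL fine-tuning (e.g., PPO in a simulator, omitted here) would start from this
# warm-started policy instead of random weights, so exploration begins near the narrow
# band of "right" action sequences.
print("behavior-cloning loss after pre-training:", float(loss))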
In tuning LLMs, data quality often trumps data quantity. From what you’re saying, it seems like the same is true in robotics.
Yes. In coding and software, a few high quality examples plus RL beats large quantities of mediocre data. That lesson carries over into robotics.
04 | NVIDIA’s positioning
You’re pretty deep in the NVIDIA ecosystem. You and your co-founder worked there. They invested in Flexion. How do you think NVIDIA is positioning itself within robotics?
I can’t speak directly for the company. But remember, NVIDIA’s business is selling GPUs.
Their strategy is to create new ecosystems that need GPUs. They open-source just enough tooling to seed many companies on top, but not enough to solve the whole problem, otherwise they’d cannibalize demand for their own chips.
They ran this strategy successfully with LLMs. They’re doing it with self-driving cars. And now they’re starting something similar in robotics.
So far, they haven’t pushed toward actually deploying robots themselves.
So NVIDIA is seeding the ecosystem and pushing developers onto their chips. What role do you want Flexion to fill in the ecosystem?
Our goal is to standardize everything that goes into helping hardware manufacturers bring stable products to market.
We want to do that by bridging the gap between all the open-source research and tooling in the space, and actually deploying robots that do real work. NVIDIA is one player building open-source tools, but there’s an amazing ecosystem of researchers and academics pushing new methods all the time.
In practice, we end up building a lot of our own tooling in-house to support our customers, but we do try to leverage as much public work as we can.
What’s the biggest gap in the research right now? Is there a tool or capability that would remove a major R&D bottleneck?
The creation of what we call digital twins. Not a random simulated environment, but one that resembles a specific place. For example, if we need to train robots to operate in a specific factory, we need to recreate that exact factory inside the simulator.
You can pay someone to build these assets today, but they’re very expensive. Coding agents are starting to build them for us, but it’s not zero-effort yet. Getting there will take time.
Where specifically do coding agents still struggle?
It’s hard to pinpoint. But wherever there’s more complexity and specificity.
Doesn’t that contradict the sim-to-real conversation we were having earlier? Isn’t the implication that simulation actually isn’t enough, that what we really need is more real-world data to get these robots ready for prime time?
That’s interesting. The way I think about it is – we should be using real-world data to improve the simulation environment, not to train the robot directly. Because if we can get the learning to happen in simulation, that is orders of magnitude more efficient and scalable than conducting training in the real world.
Simulation capabilities are advancing quickly enough that the cost-benefit makes it clear that’s where we should focus.
05 | To adapt, or not to adapt?
One of the challenges in robotics is real-time adaptation. The real world is messy. Should humanoids be programmed to course-correct on their own when they make a mistake? How do you think about that problem?
I have a controversial take. I’m not a fan of robots adapting live without human supervision.
When you deploy robots commercially, you have to think about risk. You have to certify these systems are safe, that they are statistically stable. If robots are adapting on their own, that’s very hard to guarantee. I don’t like the idea of robots changing their brain in the wild.
For a while, I expect robots will still require manual retraining around failure cases. We can recreate those failure cases in simulation, retrain on a combination of synthetic and real data, and run validation before deploying the update to thousands of robots.
But the humanoids you work with are general-purpose. They’re not specialized to run one task on repeat. I assume you can’t pre-program every permutation of reality a robot will encounter, even in a contained environment.
Doesn’t that imply a greater need to allow for adaptation?
That’s a good and hard question to answer.
Over time, as coding agents and LLMs improve, the whole loop – deploy, see what fails, have an engineer diagnose it, change your training code, retrain, redeploy – will become more automatic.
A human supervising that automated loop is probably safer and ultimately more efficient than letting robots adapt on their own. But only if we can get it more automated.
There’s also nuance around which part of the system you let adapt.
Our software has three layers – high-level understanding, local motion planning, and actual execution. I wouldn’t let a robot independently change the execution layer. That could get very unpredictable.
But letting it adapt the high-level strategy, like how it opens a door or moves around an object after one failed attempt, is much easier and safer.
For example, if a robot receives the task “make me a coffee,” the robot then has to plan – go to the kitchen, find the coffee machine. If it sees there’s no coffee machine, it needs to reason its way to a new plan. That kind of adaptation is safe.
What we don’t want is the robot independently changing how it physically moves.
So different parts of the stack get different levels of autonomy.
Exactly. And the high-level layer is primarily trained on real-world and human data, more like an LLM-style reasoning system. So we sometimes train that layer in simulation, but we don’t always have to.
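A rough sketch of that split, with hypothetical layer names and a single flag marking which layers are allowed to adapt in the field:

```python
# Illustrative sketch of the three-layer split and which layers may adapt in deployment.
# Class, method, and layer names are hypothetical, not Flexion's actual interfaces.
from dataclasses import dataclass

@dataclass
class LayerPolicy:
    name: str
    adapt_online: bool   # may this layer change its behavior in the field?

STACK = [
    LayerPolicy("high_level_understanding", adapt_online=True),   # re-plan: "no coffee machine here"
    LayerPolicy("local_motion_planning",    adapt_online=False),  # treated as offline-retrained (an assumption)
    LayerPolicy("execution",                adapt_online=False),  # motor commands stay frozen in the wild
]

def request_adaptation(layer: LayerPolicy) -> str:
    """Describe how a failure at this layer would be handled."""
    if layer.adapt_online:
        return f"{layer.name}: adapt live, re-plan with existing skills"
    return f"{layer.name}: log the failure, retrain in simulation, validate, then redeploy"

for layer in STACK:
    print(request_adaptation(layer))
```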
06 | The year of the robot
When do humanoids get deployed at scale? Two years out, five, ten? I assume we’ll have commercial deployment before consumer…
Yes, I think commercial will come first.
Right now, deployment isn’t economical. But by the end of the year, I expect Flexion and others will prove that robots can deliver commercial ROI on specific tasks.
By 2027, you’ll see robots doing real work. They’ll be constrained to one specific task, trained to operate in one specific setting. But they will be fully deployed.
How do customers think about ROI?
It will be benchmarked to salary. Take a task done by three people across multiple shifts, tally what you’d pay them over a year for that work, and compare that to the amortized cost of a robot doing the same level of output that year.
Today, you can’t just bring a robot in a box and have it work. You need five engineers looking over its shoulder. We have to strip out the engineering oversight for robots to make economic sense for customers.
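As a back-of-the-envelope version of that benchmark, with entirely made-up numbers:

```python
# Back-of-the-envelope ROI benchmark described above. All figures are made-up placeholders.
workers_replaced = 3          # people covering the task across shifts
annual_salary = 45_000        # assumed fully loaded cost per worker, per year
human_cost = workers_replaced * annual_salary

robot_price = 90_000          # assumed purchase price
amortization_years = 3
robot_annual_cost = robot_price / amortization_years + 20_000   # plus assumed yearly service/energy

# The hidden cost Nikita points to: engineers supervising the robot.
oversight_engineers = 5
engineer_cost = oversight_engineers * 120_000                   # assumed annual cost per engineer

print("human cost / year:        ", human_cost)                          # 135,000
print("robot alone / year:       ", robot_annual_cost)                   # 50,000  -> economical
print("robot + oversight / year: ", robot_annual_cost + engineer_cost)   # 650,000 -> not economical
```

The gap between the last two lines is the engineering-oversight cost that has to be stripped out before the economics work.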
OK, so 2027 – The Year of the Robot. That’s a good prediction to end on.
[1] A humanoid is a robot built in the general shape of a human, designed to operate in environments and with tools made for people.
[2] Isaac Gym was NVIDIA’s original, open-source reinforcement learning environment for robots. Released in 2021, it was the first framework to train complex robot policies entirely on a single GPU.
[3] GPU-accelerated simulation is a technique that uses GPU chips to run thousands of virtual robots in parallel, each in its own simulated physical environment. By running many simulations at once, researchers can compress what would otherwise be years of trial-and-error training into hours.
[4] Isaac Sim is a photorealistic robotics simulator, meaning it renders virtual worlds with realistic lighting, textures, and physics so that what a robot experiences in simulation matches reality. Isaac Lab is the open-source framework that replaced Isaac Gym as the standard environment for training.
[5] A policy is the robot’s decision-making engine, typically a neural network that decides what the robot should do and how it should move, given its sensory inputs.
[6] Kinematics describes the robot’s physical structure, how its joints and links move in space, which forms the basis for motion planning and control.
[7] In reinforcement learning, the control policy refers to the learned rules for translating sensor inputs into motor commands.
[8] Actuators are the motors that drive a robot’s joints. Gearboxes pair with them to convert fast, weak motor rotation into slow, powerful motion for walking and lifting.
[9] Manipulation is how a robot interacts with objects – grasping, lifting, placing, assembling. Locomotion is how a robot moves through the world – walking, running, climbing stairs, balancing.
[10] Bootstrapping means kickstarting the reinforcement learning process with a small amount of high-quality data – in this case, human demonstrations.


