A world model for home robotics.
Arkona builds the predictive core that lets a robot understand a physical space, imagine the consequences of its actions, and act with the dexterity a home demands. We are teaching machines to plan in the real world before they touch it.
Perception · Prediction · Planning · Dexterous manipulation — unified in one learned model.
We learn how the physical world responds — then we act.
A world model is a learned, predictive simulator of the robot's environment. Instead of reacting frame by frame, Arkona's robot imagines the outcome of each candidate action and chooses the one most likely to succeed. This is what makes reliable behaviour possible in the messy, unscripted setting of a real home.
Predict before acting
Every motion is rehearsed inside the model first. The robot commits only to plans it expects to succeed, then corrects in real time when reality diverges from the prediction.
Generalises across tasks
Because the model captures physics — contact, weight, friction, occlusion — skills transfer between builds, objects and rooms instead of being hand-scripted one by one.
Recovers from error
When a brick slips or a step fails, the loop detects the mismatch and re-plans — the same way a person notices a mistake and tries again.
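To make the predict, act and re-plan loop concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Arkona's implementation: `ToyWorldModel`, `choose_action` and `reach` are hypothetical names, the "world" is a single gripper coordinate, and the model is deliberately wrong about how far each action moves the gripper, so that mismatches occur.

```python
from dataclasses import dataclass


@dataclass
class ToyWorldModel:
    """Hypothetical stand-in for a learned world model (1-D gripper position)."""
    gain: float = 1.0  # distance the model expects per unit of action

    def predict(self, position: float, action: float) -> float:
        return position + self.gain * action


def choose_action(model, position, target, candidates):
    """Rehearse each candidate in the model; keep the one predicted to land closest."""
    return min(candidates, key=lambda a: abs(model.predict(position, a) - target))


def reach(model, real_gain, start, target, tolerance=0.05, max_steps=20):
    """Plan, act, compare prediction with observation, then re-plan from the observation."""
    candidates = [i / 10 for i in range(-10, 11)]  # actions from -1.0 to 1.0
    position, mismatches = start, 0
    for _ in range(max_steps):
        if abs(position - target) <= tolerance:
            break
        action = choose_action(model, position, target, candidates)
        predicted = model.predict(position, action)
        position += real_gain * action  # the "real" world, which the model gets slightly wrong
        if abs(position - predicted) > 0.01:
            mismatches += 1  # reality diverged; the next pass re-plans from what was observed
    return position, mismatches


# The model assumes a gain of 1.0, but the simulated world only moves 0.8 per unit.
final_position, mismatches = reach(ToyWorldModel(gain=1.0), real_gain=0.8, start=0.0, target=2.0)
```

Because every pass re-plans from the observed position rather than the predicted one, the loop still reaches the target even though the model's physics is imperfect.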
A robot that builds LEGO sets from the instructions.
We chose LEGO assembly as Arkona's flagship benchmark because it compresses the hardest problems in home robotics into one tractable task: reading a visual instruction, finding the right part, grasping it precisely, and placing it with sub-millimetre alignment — step after step, recovering when something goes wrong.
The skills a LEGO manual demands are the core skills behind loading a dishwasher, tidying a shelf, or assembling flat-pack furniture.
The assembly pipeline, end to end
Parse the instruction page
A vision model reads each printed step, identifying the parts called out and the target sub-assembly.
Decompose into actions
The step is broken into an ordered sequence of pick-and-place sub-goals with explicit success criteria.
Locate & grasp the brick
Perception finds the correct part in the bin; the world model selects a stable grasp and approach.
Predict the placement
The model imagines candidate placements and picks the one that aligns studs and clears collisions.
Place & seat the brick
Force-aware control presses the brick home, sensing the click of a successful connection.
Verify, then continue or retry
The result is checked against the instruction. A mismatch triggers re-planning before the next step.
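The six steps above amount to an ordered list of sub-goals, each with a success check and a retry path. The sketch below shows that control flow only. It assumes a hypothetical `SubGoal` record and caller-supplied `place` and `verify` functions. The simulated cell at the bottom is invented for illustration and stands in for real perception and control.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class SubGoal:
    """One pick-and-place sub-goal (hypothetical structure)."""
    part: str                     # e.g. "2x4 red brick"
    target: Tuple[int, int, int]  # stud-grid position (x, y, layer), illustrative only


def assemble(sub_goals, place, verify, max_retries=3):
    """Run sub-goals in order; retry a placement until `verify` accepts it."""
    attempts = {}
    for goal in sub_goals:
        for attempt in range(1, max_retries + 1):
            place(goal)
            if verify(goal):
                attempts[goal.part] = attempt
                break
        else:
            raise RuntimeError(f"could not place {goal.part} after {max_retries} attempts")
    return attempts


# Simulated cell: the first attempt at the blue plate fails to seat.
built = set()
calls = {"count": 0}


def fake_place(goal):
    calls["count"] += 1
    if goal.part == "1x2 blue plate" and calls["count"] == 2:
        return  # simulated slip: nothing is added to the build
    built.add(goal)


def fake_verify(goal):
    return goal in built  # stands in for checking the build against the instruction


steps = [SubGoal("2x4 red brick", (0, 0, 0)), SubGoal("1x2 blue plate", (1, 0, 1))]
result = assemble(steps, fake_place, fake_verify)
```

Here `result` records how many attempts each part needed. The retry cap turns a step that never verifies into an explicit failure rather than an endless loop.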
Four capabilities, one model.
Arkona's stack is built so that perception, prediction, planning and control share a single learned representation of the world — not four brittle systems stitched together.
Multimodal perception
Fuses colour, depth and touch into a coherent 3-D understanding of the scene and the objects in it.
Predictive world model
Rolls out imagined futures so the robot can weigh actions against their likely physical outcomes.
Instruction grounding
Connects human instructions — printed steps, language, diagrams — to concrete actions in the world.
Dexterous control
Force-aware manipulation that grasps, aligns and seats parts with the precision a home demands.
Meet the testbed: Arkona P-1.
Our first-generation research cell pairs a 6-axis manipulator with overhead and wrist cameras above an instrumented build surface — the platform where the world model meets real bricks.
The same world model that masters a LEGO build is what lets the robot work calmly and predictably around people — sensing its surroundings continuously and stopping the instant something unexpected enters its space.
AI-first — and safe, secure, reliable by design.
Modern AI is the foundation everything else stands on. On top of that base, three principles govern how the robot behaves in your home — not bolted on afterwards, but part of how the system perceives, predicts and acts.
AI foundation
Large multimodal models give the robot broad commonsense knowledge and language understanding: the bedrock on which its perception, world model and control are built and continually improved.
Safety
Force-limited, collision-aware motion with hardware e-stops. The world model predicts contact before it happens, so the robot slows and stops around people and pets.
Security
Perception runs on-device — your home isn't streamed to the cloud. What does leave is minimal and end-to-end encrypted, with privacy built into the data model from day one.
Reliability
Redundant sensing and continuous self-checks mean the robot knows when it's unsure. It cross-validates before acting, recovers from errors, and behaves predictably every time.
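One simple way to picture "slows and stops" is a speed limit that shrinks as the predicted gap to a person closes. The function below is a minimal sketch of that idea, not a description of the P-1's controller. The name `speed_scale` and both distance thresholds are assumptions chosen for the example, and a real system would rely on certified safety hardware as well.

```python
def speed_scale(predicted_gap_m: float, stop_gap_m: float = 0.10, slow_gap_m: float = 0.50) -> float:
    """Return a speed multiplier in [0, 1] from the predicted gap to the nearest person.

    Both thresholds are illustrative assumptions, not measured or certified values.
    """
    if predicted_gap_m <= stop_gap_m:
        return 0.0  # predicted contact zone: stop
    if predicted_gap_m >= slow_gap_m:
        return 1.0  # clear space: full programmed speed
    # In between, ramp the speed linearly with the remaining gap.
    return (predicted_gap_m - stop_gap_m) / (slow_gap_m - stop_gap_m)
```

Feeding the function a predicted gap, rather than the currently measured one, is what lets the robot begin slowing before anything is actually close.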
From a single brick to the whole home.
Single-step placement
Reliable grasp-and-place of individual bricks under the world model.
Full set assembly
Complete a small LEGO set end to end from its printed manual.
Unseen sets
Generalise to manuals and parts the robot has never encountered before.
Everyday home tasks
Carry the same world model to tidying, loading and assembling around the house.
Building the future of home robotics?
We are talking to researchers, hardware partners and early collaborators who want to help teach robots to understand the physical world.
Get in touch