Robotics knowledge
A self-contained primer on the robotics concepts that the Midcore app is designed around, plus a working tour of every screen you’ll touch when you build, train, evaluate, and operate a robot through it. Written for engineers and product leads who’d rather skim a paper than a marketing deck.
No prerequisite robotics background is assumed. Every term that matters is defined in the glossary and recapped in context.
What this section covers
- Foundations. Rigid-body kinematics, coordinate frames, common sensors and actuators, the taxonomy of manipulator and mobile platforms.
- World models. What a world model is, why robotics is converging on them, and what the τ₀-WM architecture (which Midcore integrates) actually does.
- Manipulation policies. Vision-language-action models, action chunking, confidence signals (RCS), and corrective rectification (LAR).
- Datasets & training. The LeRobot capture format, the three open data sources τ₀-WM was pre-trained on, and what a fine-tune actually costs.
- Using the app. A surface-by-surface tour: Designer, Brain, Command, Intake, Training, Telemetry, Safety, Simulation, Twin.
Suggested reading order
1. Foundations – if you haven’t worked with manipulators before.
2. World models – the conceptual core; everything else builds on this.
3. Manipulation policies – how a model emits actions a robot can execute.
4. Datasets & training – how to feed the model better evidence of your task.
5. Using the app – map every concept above to a real screen.
What this section is not
We don’t document Midcore’s internal storage layout, cryptographic schemes, or proprietary heuristics here. Architecture-level material is available to authenticated customers under /docs/architecture/robotics. This section is the open knowledge layer: the part anyone evaluating the platform should be able to read on the public web.
Reference upstream sources
Most of the theoretical scaffolding in this section ultimately traces back to public, peer-reviewed or open-source work. The primary references we lean on:
- τ₀-WM: A Unified Video-Action World Model for Robotic Manipulation (Shanghai Innovation Institute & AGIBOT Finch, May 2026), the world model Midcore integrates. Apache-2.0 weights and code at huggingface.co/sii-research/tau-0-wm.
- OpenPI policy protocol (Physical Intelligence), the WebSocket contract used industry-wide for robot policy inference.
- LeRobot 0.5.1 dataset format (Hugging Face), the Parquet-on-disk layout we capture into.
- UMI (Universal Manipulation Interface), Egodex, Egoverse, Xperience-10M: the human-video and handheld-gripper data sources that round out τ₀-WM’s training corpus.
- Standard robotics references: Lynch & Park, Modern Robotics (2017); Siciliano et al., Robotics: Modelling, Planning and Control (2009); Sutton & Barto, Reinforcement Learning: An Introduction (2nd ed., 2018).
If you only have 10 minutes
Read World models and the proposal – evaluation – revision section of the app tour. Together they explain what makes a Midcore-orchestrated robot actually robust: the model proposes N action chunks, scores them with its own learned consistency signal, and corrects the chosen chunk against an imagined future before committing to physical execution. Everything else is plumbing.
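The propose, score, correct cycle described above can be made concrete with a toy sketch. This is not Midcore’s or τ₀-WM’s implementation: every name here (`propose_evaluate_revise`, `propose`, `score`, `imagine`, `gain`) is hypothetical, the action chunk is reduced to a list of 1-D position deltas, the scoring function is a hand-written smoothness heuristic standing in for a learned consistency signal, and the "imagined future" is a one-line sum rather than a video-action world model.

```python
import random
from typing import Callable, List, Tuple

# Toy action chunk: a short sequence of 1-D position deltas (hypothetical representation).
Chunk = List[float]


def propose_evaluate_revise(
    propose: Callable[[], Chunk],
    score: Callable[[Chunk], float],
    imagine: Callable[[Chunk], float],
    goal: float,
    n_candidates: int = 8,
    gain: float = 0.5,
) -> Tuple[Chunk, Chunk]:
    """Return (chosen, revised) for one proposal, evaluation, revision cycle."""
    # 1. Proposal: sample N candidate chunks from the policy.
    candidates = [propose() for _ in range(n_candidates)]
    # 2. Evaluation: keep the candidate with the highest score.
    chosen = max(candidates, key=score)
    # 3. Revision: roll the chunk forward in imagination, then spread a
    #    fraction (gain) of the predicted goal error evenly over its steps.
    error = goal - imagine(chosen)
    correction = gain * error / len(chosen)
    revised = [delta + correction for delta in chosen]
    return chosen, revised


def demo(seed: int = 0) -> Tuple[Chunk, Chunk]:
    rng = random.Random(seed)
    start, goal, horizon = 0.0, 1.0, 5

    def propose() -> Chunk:
        # Stand-in policy: random deltas (assumption, not a real VLA model).
        return [rng.uniform(-0.1, 0.4) for _ in range(horizon)]

    def score(chunk: Chunk) -> float:
        # Hand-written smoothness heuristic standing in for a learned signal.
        return -sum(abs(b - a) for a, b in zip(chunk, chunk[1:]))

    def imagine(chunk: Chunk) -> float:
        # Trivial "world model": final position is start plus summed deltas.
        return start + sum(chunk)

    return propose_evaluate_revise(propose, score, imagine, goal)


if __name__ == "__main__":
    chosen, revised = demo()
    print("goal error before revision:", abs(1.0 - sum(chosen)))
    print("goal error after revision: ", abs(1.0 - sum(revised)))
```

With `gain=0.5` the imagined goal error of the revised chunk is half that of the chosen chunk; only the revised chunk would be handed to the robot. In a real system the three callables would be replaced by the policy, its confidence signal, and the world model’s rollout.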