No jailbreak, no Mac required, no iPhone modifications. The agent runs on a host (Mac / Linux / cloud); the iPhone is a stock production device.
Numbers are p50 (median) ranges measured on physical iPhones over the local network. The Simulator is faster on every axis.
Everything except the live voice commands. Real iPhone 15 Pro, real iOS 26.4, real App Store apps, real ClankDriver taps and swipes, single-take recording. The Dynamic Island is rendered by SpringBoard — iOS, not the app — so it can't be composited in post.
The voice was for presentation. At the start of the video I fed the agent the same list of tasks as a typed prompt; the spoken voiceover is just so a viewer can follow along. Replace my voice with a chat input, an API call, or a webhook payload and the rest runs identically.
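To make the "any input channel" point concrete, here is a minimal sketch of how a typed chat message, an API request body, and a webhook payload could all be reduced to the same task list before the agent sees it. Everything in it is an assumption for illustration: `TaskPrompt`, `from_chat`, `from_api`, `from_webhook`, the JSON shapes, and the sample task strings are hypothetical and are not the agent's actual interface.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskPrompt:
    """Hypothetical channel-independent prompt: an ordered tuple of task strings."""
    tasks: tuple


def _clean(lines):
    # Strip whitespace and drop empty entries so every channel normalizes identically.
    return tuple(s.strip() for s in lines if s and s.strip())


def from_chat(text: str) -> TaskPrompt:
    # Chat input: one task per line of typed text.
    return TaskPrompt(_clean(text.splitlines()))


def from_api(body: str) -> TaskPrompt:
    # API call: assumed JSON body of the form {"tasks": ["...", "..."]}.
    return TaskPrompt(_clean(json.loads(body)["tasks"]))


def from_webhook(payload: dict) -> TaskPrompt:
    # Webhook: assumed envelope {"event": ..., "data": {"prompt": "task\ntask"}}.
    return from_chat(payload["data"]["prompt"])


if __name__ == "__main__":
    # Illustrative task strings only; not the tasks from the video.
    chat = from_chat("Open Settings\nTurn on Airplane Mode")
    api = from_api('{"tasks": ["Open Settings", "Turn on Airplane Mode"]}')
    hook = from_webhook({"event": "run", "data": {"prompt": "Open Settings\nTurn on Airplane Mode"}})
    print(chat == api == hook)
```

Because each adapter returns the same `TaskPrompt` value, whatever consumes it downstream cannot tell which channel the tasks arrived on, which is the property the paragraph above describes.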
The iOS-internals stack, the training pipeline + trace corpus, the synthetic-iOS substrate (ClankGym) and its sim2real fidelity measurements, the tether path and signing infrastructure, the runtime that closes the latency gap to milliseconds, and the Y2 hardware + multi-device-class roadmap.