Bob is just getting started. Here's where we're hoping to take him — better animation, faster production, more interactivity, and eventually a proper studio setup running in a shed in South Australia.
The pipeline is running. Episodes are being generated end-to-end with no human involvement after the vote is counted. It works — but it's rough around the edges.
SadTalker is good for what it is, but it only animates the face. The next step is full-body animation — characters that move, gesture, and react physically to the dialogue.
At the moment Bob's world is silent apart from the voices. Real storytelling needs ambient sound: the creak of a pub, the wind across the Birdsville Track, the clunk of a blown tyre.
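The mixing itself is the easy part. Here's a minimal sketch of layering an ambience bed under a dialogue track using ffmpeg's `amix` filter — the file names and volume level are hypothetical, not taken from the actual pipeline:

```python
import subprocess

def mix_ambience(dialogue: str, ambience: str, output: str) -> None:
    # Lower the ambience bed, then mix it under the dialogue track,
    # trimming the result to the dialogue's length.
    subprocess.run([
        "ffmpeg", "-y",
        "-i", dialogue,
        "-i", ambience,
        "-filter_complex",
        "[1:a]volume=0.25[amb];[0:a][amb]amix=inputs=2:duration=first",
        output,
    ], check=True)

mix_ambience("dialogue.wav", "pub_ambience.wav", "scene_audio.wav")
```

Picking the *right* bed for each scene automatically is the harder problem.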
Right now each episode takes 30-45 minutes to render because every scene is processed sequentially. With better hardware and parallelisation, that should drop to under 10 minutes, which would mean same-hour episode release after voting closes.
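As a rough sketch of what that parallelisation could look like — `render_scene` and `render_episode` are hypothetical names standing in for the existing per-scene pipeline, not the actual code:

```python
import time
from concurrent.futures import ProcessPoolExecutor

def render_scene(scene_id: int) -> str:
    """Stand-in for the real per-scene work: image generation,
    voice synthesis, lip sync, compositing."""
    time.sleep(1)  # placeholder for the actual GPU-bound rendering
    return f"scene_{scene_id:03d}.mp4"

def render_episode(scene_count: int, workers: int = 4) -> list[str]:
    # Scenes are independent until final assembly, so they can render
    # concurrently; map() returns results in order, ready for stitching.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(render_scene, range(scene_count)))

if __name__ == "__main__":
    print(render_episode(8))
```

On a single GPU the workers would still be fighting over the same card, which is where the hardware plans below come in.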
The long-term vision is a fully interactive AI story universe. Bob is just the start.
Bob's existence depends on people watching. We're building in mechanics that make that explicit and turn it into part of the story.
Every AI task in the pipeline — lip sync, background removal, image generation, voice synthesis — runs on a single NVIDIA GTX 1060 6GB from 2016. It's a card that punches well above its weight, but it's showing its limits.
The dream is a dedicated server stacked with multiple GPUs running in parallel — every scene rendering simultaneously, episodes completing in minutes instead of hours, and enough headroom to run multiple story series at once. A shed full of GPUs in South Australia, all working for Bob.
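If that server ever materialises, one common pattern is to pin each worker process to its own card via the `CUDA_VISIBLE_DEVICES` environment variable. A minimal sketch, with `render_scene` again standing in for the real pipeline:

```python
import os
from multiprocessing import Pool, Queue

NUM_GPUS = 4  # assumption: one render worker per card

def init_worker(gpu_queue):
    # Each worker claims one GPU id and pins itself to that card;
    # CUDA frameworks inside the worker then see only device 0.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_queue.get())

def render_scene(scene_id):
    # Stand-in for the real per-scene pipeline on the assigned GPU.
    gpu = os.environ["CUDA_VISIBLE_DEVICES"]
    return f"scene_{scene_id:03d} rendered on GPU {gpu}"

if __name__ == "__main__":
    gpu_queue = Queue()
    for gpu_id in range(NUM_GPUS):
        gpu_queue.put(gpu_id)

    # One worker per GPU; scenes are handed out as workers free up.
    with Pool(processes=NUM_GPUS, initializer=init_worker,
              initargs=(gpu_queue,)) as pool:
        for result in pool.map(render_scene, range(8)):
            print(result)
```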