Simon Willison shows how he ran OpenAI’s Codex CLI on his Mac against a gpt-oss 120B model running on his NVIDIA DGX Spark at home — with the two machines connected through Tailscale.
In a new post, Willison explains how he wired up his Mac to securely access a GPU-powered AI model running on his home lab hardware. After setting up Tailscale on both machines, he configured Ollama on the DGX Spark to listen on all network interfaces instead of localhost. With that done, the Mac could point its OLLAMA_HOST to the Spark’s Tailscale IP and interact with massive local models like gpt-oss:120b.
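The wiring described above comes down to two environment changes. A sketch, assuming Ollama’s default port (11434) and a placeholder Tailscale IP — substitute your own device’s address:

```shell
# On the DGX Spark: have Ollama listen on all interfaces, not just localhost.
# (For a systemd-managed install, set this in a service override instead.)
OLLAMA_HOST=0.0.0.0 ollama serve

# On the Mac: point the Ollama client at the Spark's Tailscale IP.
# 100.64.1.2 is a placeholder, not the address from the post.
export OLLAMA_HOST=100.64.1.2:11434
ollama run gpt-oss:120b
```

A quick connectivity check from the Mac is `curl http://100.64.1.2:11434/api/tags`, which lists the models installed on the remote Ollama instance.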
The real trick was getting OpenAI’s Codex CLI to talk to that remote model. By setting CODEX_OSS_BASE_URL to the Spark’s Ollama API URL and specifying --model gpt-oss:120b, Codex started streaming completions from the 120B open-weight model instead of OpenAI’s servers.
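That redirect is one environment variable plus a model flag. A sketch with the same placeholder IP; the exact URL shape (Ollama’s port plus the OpenAI-compatible /v1 path) is an assumption about what Codex’s open-model mode expects:

```shell
# Placeholder Tailscale IP; Ollama exposes an OpenAI-compatible API under /v1.
export CODEX_OSS_BASE_URL="http://100.64.1.2:11434/v1"

# Run Codex against the open-weight model instead of OpenAI's hosted ones.
codex --model gpt-oss:120b
```

With that in place, everything Codex sends and receives stays on the tailnet rather than going to OpenAI.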
To stress-test the setup, Willison asked Codex to build a Space Invaders game from scratch — HTML, Git repo and all — which worked, albeit more slowly and less capably than frontier models like GPT-5 or Claude Sonnet 4.5. Still, he says, it’s pretty neat to have a private Codex-style workflow, powered by his own hardware, available anywhere in the world.
Source: Running Codex CLI against gpt-oss:120b on the DGX Spark via Tailscale

