In a post on X, Hugging Face cofounder Julien Chaumond claimed that running Qwen3.6 27B inside the Pi coding agent via llama.cpp on a MacBook Pro felt “pretty magical,” and that, for non-trivial tasks on Hugging Face codebases, it came “very, very close” to the latest Opus in Claude Code while operating in “full airplane mode.”
Chaumond went on to describe this as part of what he called the “second revolution of AI,” centered on “powerful local models for efficiency, security, privacy, sovereignty.” The claim drew a wave of replies from users questioning the laptop setup, RAM requirements, quantization choice, battery life, and throughput. One commenter joked that the battery would be “drained in 20min max,” while others asked which MacBook Pro configuration was being used.
Several replies also pointed to the trade-offs that still appear to limit local coding models. One user mentioned that a 27B model felt slow even on a 128GB M4 Max Studio, though it might be acceptable on a plane without network access. Another noted that local models can be useful for quick iterations, but that sustained multi-file refactors still hit context limits quickly. A different commenter reported roughly 7 tokens per second on a 32GB M4 Mac Mini, calling it usable if time is not an issue.
The thread suggests renewed interest in local AI setups and optimism about their medium-term prospects, but it also shows that performance, battery life, and session length remain the practical open questions, according to the users discussing Chaumond’s post.
Source: Julien Chaumond on X