Javlon Baxtiyorov
← Writing

Running Frontier-Class Models Locally on a Mini PC

The largest open models now reportedly fit on a mini PC, making local coding assistants like Gemma 4 26B or a small Qwen coder viable so code never leaves the machine.

Running Frontier-Class Models Locally on a Mini PC
Photo by Juanjo Jaramillo on Unsplash

Something I've wanted for a while is quietly becoming practical: running the largest open models locally now reportedly fits on a mini PC. For local coding specifically, the recommendations point at something like Gemma 4 26B, or a smaller Qwen coder variant if you want lighter weight — the payoff being that your code never leaves the machine through an external API.

That last part is the whole reason I care. I work on payments, auth, and session security. Every time I paste a snippet into a hosted assistant, I'm making a quiet decision to ship proprietary, sometimes sensitive code to someone else's server and trust their retention policy. A model that runs on a box under my desk turns that decision into a non-decision.

Privacy stops being a policy and becomes a property

The difference between "the vendor promises not to train on your code" and "your code physically never left the machine" is the difference between a policy and a property. Policies change with the terms of service. A property is enforced by the network not having a destination to send to.

For the kind of work I do, that matters in concrete ways:

  • No third-party data-retention question to answer in a security review, because there's no third party.
  • No exfiltration path through the assistant, because the assistant has nowhere to exfiltrate to.
  • It works on a plane, in a locked-down network, or under a client contract that forbids sending code off-site.
  • The capability doesn't evaporate when a vendor deprecates a model or changes its pricing overnight.

That resilience is underrated. A local model is a tool I own the availability of. It won't get rate-limited at 2am, won't get sunset in a product reshuffle, and won't suddenly cost triple because someone upstream changed a price.

The honest trade-offs

I'm not going to pretend a 26B model on a mini PC matches the largest hosted frontier models. It won't, and for the hardest reasoning I'll still reach for something bigger. But the gap between "good enough to be genuinely useful for daily coding" and "the absolute best" has narrowed to where, for most of what I actually do, local wins on the criteria I weigh:

  • Portability: it runs on hardware I control, with no account and no dependency on anyone's uptime.
  • Reversibility: the weights are a file. If a better open model lands next month, I swap it in.
  • Cost: a one-time machine instead of a metered API bill that scales with how much I lean on it.
  • Simplicity: no key rotation, no rate limits, no terms of service to re-read every quarter.

The pitch isn't that local AI is strictly better — it's that for code I'd rather not hand to anyone, "runs entirely on my machine" is a feature no hosted service can match. That it now fits on a mini PC is the part that turns the principle into a Tuesday-afternoon setup. I'll keep a hosted model around for the genuinely hard problems, and I'll do the everyday work on a box I own, where the code stays put.


Sources: Open Source LLMs, Open Source AI Models 2026.


← All writing Get in touch →