Blog
Notes on infrastructure, AI, and engineering at scale.
May 5, 2026
Gemma 4 26B on a consumer-grade RTX 5070 Ti GPU
A week running Google's Gemma 4 26B as a daily local agent on a single RTX 5070 Ti. Build saga, throughput numbers, first published BFCL accuracy figures, and an honest read on what consumer hardware can now do for tool-using agentic work.
Local AI
llama.cpp
BFCL
Benchmarks