We’re all familiar with the massive, powerful language models that run on vast server farms. What if the next big breakthrough in AI isn’t about being bigger, but smaller?
Over the weekend I fine-tuned Gemma 3 (270M) end-to-end (LoRA → merge → GGUF → Ollama) and ran it locally. It wasn't perfect (to be honest, it was more of a learning exercise to understand the process), but it was fast, inexpensive, and genuinely useful for narrow, domain-specific tasks. Here's what tiny models are, why they matter to business, and how to get started without boiling the ocean.
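To make the pipeline concrete, here's a rough sketch of the last three steps (merged checkpoint → GGUF → Ollama). Paths and model names are placeholders I made up; it assumes llama.cpp and Ollama are installed and the LoRA adapter has already been merged back into the base weights:

```shell
# 1. Convert the merged Hugging Face checkpoint to GGUF
#    (convert_hf_to_gguf.py ships with llama.cpp)
python convert_hf_to_gguf.py ./gemma3-270m-merged \
  --outfile gemma3-270m-ft.gguf --outtype f16

# 2. Point a Modelfile at the GGUF so Ollama can load it
cat > Modelfile <<'EOF'
FROM ./gemma3-270m-ft.gguf
EOF

# 3. Register the model with Ollama and chat with it locally
ollama create gemma3-270m-ft -f Modelfile
ollama run gemma3-270m-ft "Summarize this support ticket: ..."
```

The details vary by tooling version, but that's the whole loop: one conversion script, a two-line Modelfile, and two Ollama commands.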
