Tech

Google releases Gemma 4 12B, targeting local AI deployment on consumer hardware

The tech giant’s latest release offers native audio processing and performance nearing its 26B MoE model, requiring only 16GB of RAM for local execution.

Author
Owen Mercer
Markets and Finance Editor
Published
Draft
Source: Hacker News · original
New encoder-free model bridges gap between edge and large-scale architectures

Google has released Gemma 4 12B, a unified multimodal artificial intelligence model designed for local execution on consumer laptops equipped with at least 16GB of RAM. The release positions the model as a bridge between smaller, edge-friendly architectures and larger Mixture of Experts (MoE) systems, aiming to deliver high-performance capabilities within a reduced memory footprint.

According to Google, the model delivers performance on standard benchmarks that nears its larger 26B MoE counterpart, while utilising less than half the total memory. This efficiency allows developers to run powerful multimodal and agentic experiences directly on everyday hardware without relying on cloud-based infrastructure.
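
The memory claims can be sanity-checked with rough arithmetic. The sketch below is an illustration, not Google's methodology: it assumes that weight storage dominates memory use, that the "12B" and "26B" in the model names are total parameter counts, and that a fixed 4GB is set aside for the operating system and runtime. The function names, the bit widths and the headroom figure are all hypothetical choices, and none of them come from the announcement.

```python
# Back-of-envelope estimate of weight memory at different precisions.
# Assumptions (not from Google's announcement): memory is dominated by
# weights, and the model names give total parameter counts in billions.

def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    total_bytes = params_billions * 1e9 * bits_per_param / 8
    return total_bytes / 1e9


def fits_in_ram(params_billions: float, bits_per_param: int,
                ram_gb: float, headroom_gb: float = 4.0) -> bool:
    """True if the weights plus an assumed headroom fit in ram_gb.

    headroom_gb is a hypothetical allowance for the OS, the runtime and
    activation or cache memory; real overhead varies by setup.
    """
    return weight_memory_gb(params_billions, bits_per_param) + headroom_gb <= ram_gb


if __name__ == "__main__":
    for bits in (16, 8, 4):
        dense = weight_memory_gb(12, bits)
        moe = weight_memory_gb(26, bits)
        verdict = "fits" if fits_in_ram(12, bits, ram_gb=16) else "does not fit"
        print(f"{bits}-bit: 12B ~{dense:.1f} GB, 26B ~{moe:.1f} GB, "
              f"12B {verdict} in 16 GB with 4 GB headroom")
```

Under these assumptions, the 12B model's weights would take about 24GB at 16-bit precision, 12GB at 8-bit and 6GB at 4-bit, so a 16GB laptop would need a quantised build. At any equal precision, 12 billion parameters occupy roughly 46% of the space of 26 billion, which is consistent with the "less than half" figure.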

A key differentiator for Gemma 4 12B is its encoder-free architecture, and it is Google’s first mid-sized model to feature native audio inputs. Traditional multimodal models typically use separate encoders to translate visual and audio data into representations before passing them to the language model, a step that adds latency and increases memory consumption. By handling these inputs directly, the new model aims to streamline processing and reduce resource overhead.

The broader Gemma model family has now surpassed 150 million cumulative downloads, driven by developer adoption across various sectors. Google cited community projects ranging from wearable robotic arms designed for physical assistance to enterprise-grade AI security applications, highlighting the versatility of the open model family.

While Google describes the model as possessing advanced reasoning capabilities, that description reflects the company’s marketing positioning rather than independently verified technical specifications. The source material does not provide specific benchmark scores or say how closely the model's performance matches that of the 26B MoE model, nor does it specify hardware configurations beyond the 16GB RAM minimum.
