Tech

Google releases Gemma 4 12B, targeting local AI deployment on consumer hardware

The tech giant’s latest release offers native audio processing and performance nearing that of its 26B MoE model, while requiring only 16GB of RAM for local execution.

Author

Owen Mercer

Markets and Finance Editor

Published

Draft

Source: Hacker News · original

Artificial Intelligence Research

New encoder-free model bridges gap between edge and large-scale architectures

Google has released Gemma 4 12B, a unified multimodal artificial intelligence model designed for local execution on consumer laptops equipped with at least 16GB of RAM. The release positions the model as a bridge between smaller, edge-friendly architectures and larger Mixture of Experts (MoE) systems, aiming to deliver high-performance capabilities within a reduced memory footprint.

According to Google, the model’s performance on standard benchmarks approaches that of its larger 26B MoE counterpart while using less than half the total memory. This efficiency, the company says, allows developers to run multimodal and agentic workloads directly on everyday hardware without relying on cloud-based infrastructure.
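The memory claim can be sanity-checked with simple arithmetic. The sketch below is an illustration rather than anything Google has published: it assumes the nominal parameter counts implied by the model names (12 billion and 26 billion), assumes an MoE model keeps all of its experts resident in memory, and uses hypothetical weight precisions of 16, 8 and 4 bits. It counts weights only and ignores the KV cache, activations and runtime overhead.

```python
# Back-of-the-envelope estimate of weight memory.
# Assumptions (not from Google): nominal parameter counts taken from the
# model names, and illustrative precisions of 16, 8 and 4 bits per weight.
BYTES_PER_GIB = 1024 ** 3


def weight_memory_gib(params_billions: float, bits_per_weight: int) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / BYTES_PER_GIB


for bits in (16, 8, 4):
    dense_12b = weight_memory_gib(12, bits)
    moe_26b = weight_memory_gib(26, bits)
    ratio = dense_12b / moe_26b
    print(f"{bits:>2}-bit: 12B ~ {dense_12b:.1f} GiB, "
          f"26B ~ {moe_26b:.1f} GiB, ratio {ratio:.2f}")
```

Under these assumptions, 16-bit weights for 12 billion parameters come to roughly 22 GiB, which would not fit in 16GB of RAM, so the stated minimum presumably relies on reduced-precision weights. At any shared precision the 12B-to-26B ratio is about 0.46, which is consistent with the "less than half" figure.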

A key differentiator for Gemma 4 12B is its encoder-free architecture, and it is Google’s first mid-sized model to feature native audio inputs. Traditional multimodal models typically use separate encoders to convert visual and audio data into representations before passing them to the language model, a step that adds latency and increases memory consumption. By ingesting these inputs directly, the new model aims to streamline processing and reduce resource overhead.

The broader Gemma model family has now surpassed 150 million cumulative downloads, driven by developer adoption across various sectors. Google cited community projects ranging from wearable robotic arms designed for physical assistance to enterprise-grade AI security applications, highlighting the versatility of the open model family.

While Google describes the model as possessing advanced reasoning capabilities, these descriptors reflect the company’s marketing positioning rather than independently verified technical specifications. The source material does not provide specific benchmark scores or detail the exact percentage of performance parity with the 26B MoE model, nor does it specify hardware configurations beyond the 16GB RAM minimum.
