
Thinking Machines unveils 'interaction models' aiming to replicate natural conversation speeds

The company states its TML-Interaction-Small model responds in 0.40 seconds, significantly outpacing current offerings from major rivals, though the technology remains in a research preview phase.

Author

Owen Mercer

Markets and Finance Editor

Source: TechCrunch · original



Thinking Machines wants to build an AI that actually listens while it talks

New 'full duplex' architecture from the former OpenAI CTO's startup claims to eliminate traditional turn-based latency

Thinking Machines Lab has announced interaction models, a new artificial intelligence architecture designed to process user input and generate responses simultaneously. This full-duplex approach aims to replicate the natural flow of a phone conversation, moving away from the traditional turn-based protocol in which the AI must finish listening before it speaks. The company says its model, TML-Interaction-Small, responds in 0.40 seconds, which it describes as comparable to the pace of natural human conversation and significantly faster than current models from competitors such as OpenAI and Google.

The startup, founded last year by former OpenAI CTO Mira Murati, notes that existing AI systems generally operate on a strict sequential basis. Under the current standard, the user speaks, the AI listens, the AI responds, and the user listens again. This new approach departs from that method by allowing the system to handle input and output at the same time. The industry standard has historically prioritised accuracy and safety over simultaneous output, resulting in longer perceived latency for users.

The technology is being presented as a research preview and is not yet a commercial product. Limited access to the preview is expected within the next few months, with a wider release planned for later in the year. The company describes its benchmark results as impressive and argues that interactivity should be native to a model, but the public has not yet tested the system in real-world use.

Whether the experience lives up to the technical claims will not be known until people can actually use the system. The 0.40-second latency claim relies on internal benchmarks that have not yet been independently verified or published in detail. It also remains unclear whether the full-duplex approach will introduce new latency problems during complex processing tasks, compared with the sequential models that currently dominate the market.
