Agents require deterministic control flow rather than complex prompting
A new argument published on bearblog.dev contends that moving logic out of prose and into runtime is essential for building dependable artificial intelligence systems
Prompt engineering is running into its limits as the workflows handed to AI agents grow more complex. An article published on bearblog.dev argues that reliable agents tackling complex tasks require deterministic control flow encoded in software, rather than increasingly elaborate and non-deterministic prompt chains. The author contends that as task complexity grows, prompt-based systems suffer from reliability collapse, whereas software-based logic provides predictable behaviour, verifiable state transitions, and the ability to treat large language models (LLMs) as components within a larger system rather than the system itself.
This thesis challenges the prevailing industry habit of steering agent behaviour with instructions that the prompt marks as mandatory. The author identifies a specific failure mode in which an LLM reports success while hallucinating. Because a reported success can no longer be trusted, it becomes impossible to reason about what the system actually did, and reliability collapses as complexity increases. The article likens this to a programming language in which statements are merely suggestions and functions can return success despite executing incorrectly. Without programmatic verification, it asserts, systems are left with insufficient options to prevent silent failures, which is why it stresses aggressive error detection.
The proposed solution involves moving logic out of prose and into runtime, implementing explicit state transitions and validation checkpoints. This approach ensures that the Large Language Model is treated as a component within a larger architecture rather than the system itself. By adopting this structure, developers can achieve recursive composability, a property currently lacking in prompt-based AI systems but fundamental to traditional software scaling through libraries and modules. The argument posits that code all the way down exposes predictable behaviour, enabling local reasoning that prose-based prompts cannot replicate.
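To make the idea of explicit state transitions and validation checkpoints concrete, here is a minimal sketch. It is not taken from the article: the states, the invoice-total task, the `is_valid_amount` check, and the `model` callable (a stand-in for any LLM call) are all hypothetical names chosen for illustration. The point it tries to show is that the code, not the prompt, decides which step runs next and whether a model output is accepted.

```python
import re
from enum import Enum, auto
from typing import Callable


class State(Enum):
    EXTRACT = auto()
    VALIDATE = auto()
    DONE = auto()
    FAILED = auto()


def is_valid_amount(text: str) -> bool:
    """Hypothetical checkpoint: accept only plain decimals such as '42' or '42.50'."""
    return re.fullmatch(r"\d+(\.\d{1,2})?", text.strip()) is not None


def run_workflow(
    document: str,
    model: Callable[[str], str],  # assumed stand-in for an LLM call
    max_attempts: int = 3,
) -> tuple[State, str, list[str]]:
    """Run an extract-then-validate loop whose transitions are owned by code."""
    state = State.EXTRACT
    attempts = 0
    candidate = ""
    trace: list[str] = []

    while state not in (State.DONE, State.FAILED):
        trace.append(state.name)
        if state is State.EXTRACT:
            attempts += 1
            # The model is one component; it only proposes a value.
            candidate = model(f"Extract the invoice total from: {document}")
            state = State.VALIDATE
        elif state is State.VALIDATE:
            # Validation checkpoint: the runtime accepts, retries, or fails.
            if is_valid_amount(candidate):
                state = State.DONE
            elif attempts < max_attempts:
                state = State.EXTRACT
            else:
                state = State.FAILED

    trace.append(state.name)
    return state, candidate, trace
```

Because every transition is recorded in `trace`, a run can be inspected after the fact, and a workflow like this can itself be wrapped as a single step inside a larger one, which is one way to read the composability claim.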
Prompts, while useful for narrow tasks, are described as non-deterministic, weakly specified, and difficult to verify when applied to broader operational contexts. The discussion sits at the intersection of software engineering and artificial intelligence and focuses on the limits of current prompting techniques for complex workflows. The author draws a sharp distinction between the reliability of software logic and the inherent uncertainty of relying on elaborate prompt chains to manage complex state changes.
The article notes that deterministic orchestration is only half the battle in building robust systems. In a system prone to silent failure, an agent without aggressive error detection is just a fast way to reach the wrong conclusion. The text emphasises that without explicit, programmatic validation checkpoints, errors go unnoticed until they cause significant downstream issues.
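One way to picture aggressive error detection is a check that refuses to take a reported success at face value. The sketch below is an assumption-laden illustration, not the author's method: `StepReport`, `VerificationError`, and `verify_report` are hypothetical names, and the scenario (an agent step that claims to have written a file) is invented for the example. The code inspects the real side effect instead of trusting the status string.

```python
from dataclasses import dataclass
from pathlib import Path


class VerificationError(RuntimeError):
    """Raised when a reported result does not match what is actually on disk."""


@dataclass
class StepReport:
    # Hypothetical shape of what an agent step claims about itself.
    status: str
    output_path: Path


def verify_report(report: StepReport, required_text: str) -> Path:
    """Fail loudly unless the claimed output exists and contains the required text."""
    if report.status != "success":
        raise VerificationError(f"step reported status {report.status!r}")
    if not report.output_path.is_file():
        raise VerificationError("step reported success but wrote no file")
    content = report.output_path.read_text(encoding="utf-8")
    if required_text not in content:
        raise VerificationError("output file is missing required content")
    return report.output_path
```

A caller would run `verify_report` immediately after each step, so that a hallucinated success stops the workflow at the point of failure rather than surfacing several steps later.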
The article refers to three options that remain when programmatic verification is absent, but their specific nature is not detailed in the material available for this summary; the urgency of the argument is clear regardless. The article does not provide empirical data or case studies to substantiate the claim that reliability collapses at a particular threshold of complexity, so this remains a theoretical assertion by the author. Nor does it specify which existing software architectures or frameworks are being used to implement these deterministic scaffolds, leaving the implementation details open for further industry exploration.


