R&D This Week
Apple Silicon is finally paying off — and the MLX team deserves the credit. Our deterministic MLX implementation, posted over at the Probiotic Farmer GitHub, wouldn’t have been possible without the foundational work from Thinking Machines Lab and the ever-brilliant Ms. Murati.
Meanwhile, Qwen3 Next has been quietly setting benchmarks of its own, with Alibaba seemingly behind half of the most interesting models released in the last quarter. The velocity of open research right now is astonishing.
This week, we're doing something wild. Compute at the edge has crossed a threshold: it's no longer just viable, it's genuinely powerful. The architecture we've been developing is built to take advantage of that shift.
Here’s the essay.
---
A Blueprint for a Brain: Synthesizing Architecture and Neuroscience in the Next Generation of AI
The AI landscape in 2025 is a crucible of architectural experimentation. Every month brings another breakthrough model—each claiming to be faster, smarter, or more “agentic” than the last. Yet, this acceleration often outpaces understanding. The field risks repeating an old cycle: dazzling demonstrations built atop opaque mechanisms.
A case in point is the “Dragon Hatchling” architecture—a hybrid design coupling a Transformer-based planner with a swarm of specialized State Space Model (SSM) executors. On paper, it’s brilliant: a high-level planner delegating to lightweight specialists, offering distributed intelligence that could thrive on edge devices. But its initial results, while impressive, raise familiar questions: Which component drives performance? What trade-offs hide beneath the surface?
Rather than treating Dragon Hatchling as an endpoint, we can view it as a neural archetype—a bridge between biological intelligence and synthetic systems. Seen through that lens, it becomes part of a larger blueprint: an architecture that learns, plans, and decides with deliberation rather than reflex.
1. From Modality Islands to a Shared Semantic Space
One of the clearest insights comes from research on multimodal representation learning, such as Google’s Speech-to-Retrieval (S2R) project. Instead of transcribing audio into text, S2R uses dual encoders to map speech and text directly into a shared joint embedding space—a domain where meanings, not modalities, align.
This approach mirrors the association cortices of the human brain. Evolution didn’t build separate intelligence modules for sight, sound, and touch; it built specialized sensors that all feed into shared conceptual areas. These are where the brain learns that lightning and thunder, though different in sensory modality, are one event.
For AI, this principle reframes the architecture: sensory encoders—whether visual, auditory, or linguistic—should project into a unified latent space where concepts, not data types, interact. This is the substrate on which any genuine planner must operate.
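To make that concrete, here is a minimal dual-encoder sketch in PyTorch. It is not Google's S2R code; the tower architecture, the dimensions, and the InfoNCE-style alignment loss are illustrative assumptions. The point is the shape of the idea: two modality-specific encoders projecting into one normalized latent space, trained so that paired inputs land close together.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ModalityEncoder(nn.Module):
    """Stand-in for a modality-specific tower (audio, text, vision)."""
    def __init__(self, input_dim: int, embed_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_dim, 512), nn.GELU(), nn.Linear(512, embed_dim)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Project into the shared space and L2-normalize, so similarity
        # becomes a pure cosine comparison of meanings, not modalities.
        return F.normalize(self.net(x), dim=-1)

def contrastive_alignment_loss(za, zb, temperature: float = 0.07):
    """InfoNCE-style loss: row i of za should match row i of zb."""
    logits = za @ zb.t() / temperature                      # (batch, batch) similarities
    targets = torch.arange(za.size(0), device=za.device)    # diagonal = positive pairs
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))

# Hypothetical usage: paired speech features and pooled text features.
speech_enc = ModalityEncoder(input_dim=80)    # e.g. pooled mel-filterbank frames
text_enc = ModalityEncoder(input_dim=300)     # e.g. pooled token embeddings
speech, text = torch.randn(32, 80), torch.randn(32, 300)
loss = contrastive_alignment_loss(speech_enc(speech), text_enc(text))
```

Because both towers emit normalized vectors, a dot product in this space compares concepts rather than data types; adding a vision tower requires no change to the loss.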
2. Prediction as the Core Learning Mechanism
But a shared space alone isn’t enough. To make representations meaningful, the system must learn to predict, not just associate. Hebbian learning (“neurons that fire together wire together”) captures correlation—but cognition demands causality.
Yann LeCun’s Joint Embedding Predictive Architecture (JEPA) provides a model for this. By training systems to predict future representations from present ones, JEPA forces models to internalize structure: not just that lightning and thunder co-occur, but that one precedes the other.
Embedding predictive objectives within each encoder transforms the joint space into a world model—one grounded not in static similarity, but in dynamic understanding.
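A minimal sketch of that predictive objective might look like the following. It paraphrases the JEPA idea rather than reproducing LeCun's reference implementation, and it leans on two standard assumptions: a slow-moving target encoder to keep training stable, and a mean-squared error computed in latent space rather than on raw inputs.

```python
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class JEPAStyleObjective(nn.Module):
    """Predict the *representation* of the future, not its raw pixels or samples."""
    def __init__(self, encoder: nn.Module, embed_dim: int = 256):
        super().__init__()
        self.encoder = encoder                  # online encoder (trained by backprop)
        self.target = copy.deepcopy(encoder)    # EMA target (never backpropagated)
        for p in self.target.parameters():
            p.requires_grad = False
        self.predictor = nn.Sequential(         # latent-to-latent dynamics model
            nn.Linear(embed_dim, 512), nn.GELU(), nn.Linear(512, embed_dim)
        )

    @torch.no_grad()
    def update_target(self, tau: float = 0.996):
        # Slow exponential moving average keeps targets stable and resists collapse.
        for p, tp in zip(self.encoder.parameters(), self.target.parameters()):
            tp.mul_(tau).add_(p, alpha=1 - tau)

    def forward(self, x_now: torch.Tensor, x_next: torch.Tensor) -> torch.Tensor:
        z_now = self.encoder(x_now)
        z_pred = self.predictor(z_now)           # "what comes next, abstractly?"
        with torch.no_grad():
            z_next = self.target(x_next)         # ground truth lives in latent space
        return F.mse_loss(z_pred, z_next)        # prediction error on representations
```

Wrapping the ModalityEncoder from the previous sketch in this objective is exactly the move the essay proposes: the joint space stops being a similarity lookup and starts encoding how states evolve.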
3. The Missing Piece: Deliberation through Inhibition
The brain’s final secret isn’t perception or prediction—it’s restraint. The basal ganglia, long misunderstood as a mere motor relay, act as a universal gatekeeper. The cortex continuously proposes actions, but execution occurs only when the basal ganglia disinhibit a single plan. Intelligence, in this sense, is as much about not acting as about acting well.
Translating this into architecture suggests a control layer: a gating system that evaluates candidate plans against context and goals, releasing only the optimal one for execution. This selective disinhibition is what turns planning into deliberation.
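Here is one hypothetical shape such a gate could take, continuing the PyTorch sketches above. Every name, dimension, and threshold is an assumption; the load-bearing detail is that inhibition is the default and release is the exception.

```python
import torch
import torch.nn as nn

class BasalGangliaGate(nn.Module):
    """Score candidate plans against context; release at most one."""
    def __init__(self, plan_dim: int = 256, ctx_dim: int = 256):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(plan_dim + ctx_dim, 128), nn.GELU(), nn.Linear(128, 1)
        )
        self.threshold = 0.0   # logit a plan must clear; below it, keep inhibiting

    def forward(self, plans: torch.Tensor, context: torch.Tensor):
        # plans: (num_candidates, plan_dim); context: (ctx_dim,)
        ctx = context.expand(plans.size(0), -1)
        logits = self.score(torch.cat([plans, ctx], dim=-1)).squeeze(-1)
        best = torch.argmax(logits)
        if logits[best] < self.threshold:
            return None        # "not acting" is a first-class outcome
        return plans[best]     # selective disinhibition: one plan goes through
```

Returning None is not a failure mode here; it is the behavioral equivalent of the basal ganglia holding every motor program in check until one plan earns release.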
4. Toward a Deliberative AI Architecture
Integrating these principles suggests a coherent model for agentic AI (a schematic sketch in code follows this list):
Predictive encoders (JEPA-like) learn high-level abstractions across sensory modalities.
Their outputs populate a joint embedding space, forming a shared world model.
A planning module (the “Dragon”) operates within this manifold, proposing candidate actions.
A gating module—analogous to the basal ganglia—evaluates and disinhibits a single plan, delegating execution to a swarm of efficient Hatchling specialists.
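Wiring the earlier sketches together gives a schematic of the whole loop. The planner and the Hatchling executors are stubs here, since the essay treats them abstractly, and the fusion rule and the routing of a released plan to a specialist are placeholders as well.

```python
import torch
import torch.nn as nn

class DeliberativeAgent(nn.Module):
    """Schematic wiring: encode -> propose -> gate -> delegate (unbatched for clarity)."""
    def __init__(self, encoders: dict[str, nn.Module],
                 planner: nn.Module, gate: nn.Module,
                 executors: list[nn.Module]):
        super().__init__()
        self.encoders = nn.ModuleDict(encoders)    # JEPA-trained, one per modality
        self.planner = planner                     # the "Dragon": proposes, never executes
        self.gate = gate                           # basal-ganglia analogue (see above)
        self.executors = nn.ModuleList(executors)  # lightweight "Hatchling" specialists

    def step(self, observations: dict[str, torch.Tensor]):
        # 1. All senses land in the same latent space.
        latents = [self.encoders[m](x) for m, x in observations.items()]
        world_state = torch.stack(latents).mean(dim=0)   # crude fusion, for the sketch
        # 2. The planner proposes candidate actions in that manifold.
        candidates = self.planner(world_state)           # (num_candidates, plan_dim)
        # 3. Default is inhibition; the gate may release a single plan.
        plan = self.gate(candidates, world_state)
        if plan is None:
            return None                                  # deliberate inaction
        # 4. Delegate to a specialist (toy routing: hash the plan to an executor).
        idx = int(torch.argmax(plan)) % len(self.executors)
        return self.executors[idx](plan)
```

The control flow, not any particular module, is the claim: perception converges, planning proposes, the gate deliberates, and execution is delegated only after release.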
This design moves beyond today’s reactive agent loops. It sketches a path toward AI systems that perceive, infer, and decide with coherence and intention—drawing on the deep logic of biology rather than the surface patterns of data.
Conclusion: Evolution as an Engineering Manual
Nature’s designs are not relics; they’re field-tested blueprints. By grounding our architectures in the mechanisms that make brains both efficient and adaptive—predictive coding, multimodal convergence, and inhibitory control—we can steer AI toward systems that think less like machines and more like minds.
If Dragon Hatchling and its kin are the first sparks of this evolution, then what comes next may not just simulate intelligence—it may instantiate it.