24 Comments
flxhot:

Thank you very much for this detailed and technically rich overview.

I usually explore Active Inference and the Free Energy Principle through podcast interviews with Karl Friston, which tend to focus on the more philosophical and abstract dimensions of the topic. That’s why I find it genuinely fascinating to see that these abstract and often hard-to-grasp explanations of what mind and intelligence are might soon be brought to life through technology.

If this truly turns out to be the case, then what Verses is working on might not just be another advancement; it could mark the beginning of an entirely new era in our understanding and development of intelligence.

Devansh:

I'm very interested in seeing how this will change the way we interact with AI.

flxhot:

It seems to me that the main challenge in Friston’s approach lies in the fact that the theoretical foundations it builds upon are already highly complex and difficult to grasp. His framework seeks to explain intelligence from first principles by drawing on insights from physics, biology, and philosophy, and it requires a deep interdisciplinary understanding for any meaningful implementation.

In contrast, the success of Large Language Models appears to stem less from a genuine understanding of intelligence and more from the technological feasibility of modeling statistical patterns. What makes LLMs impressive is not that they reflect a true theory of mind, but that they produce behavior which appears intelligent.

While experts in neuroscience, cognitive science, and philosophy have long recognized the limitations of such data-driven approaches, Friston’s model stands out for its ambition to ground intelligence in a principled and unified theoretical framework.

John Michael Thomas:

Thanks for this - fantastic breakdown.

It's so nice to finally see an approach that better models actual intelligence by having an internal world-model rather than just probabilistic generation. The implications and potential are pretty massive.

For example, if you were to apply the AXIOM approach to language (which seems possible - words and sentences as objects, grammar as verbs, etc.), it might mostly or completely eliminate hallucinations. But it might not work as well as LLMs for many genAI tasks, because the goal of minimizing surprise means it would likely not have much variation/creativity. It might, however, provide a valid alternative for RAG, where creativity is often not the desired result.

Anyway, speculation aside, it appears there's a current limitation in how AXIOM learns:

"Our work is limited by the fact that the core priors are themselves engineered rather than discovered autonomously. Future work will focus on developing methods to automatically infer such core priors from data, which should allow our approach to be applied to more complex domains."

So, before AXIOM can begin learning about the world, you have to build a base world model which acts as the prior, and then it builds on that.
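To make that concrete, here's a minimal toy sketch of the "prior first, data second" flow, using a simple Beta-Bernoulli model in Python. It's purely illustrative: AXIOM's actual core priors are structured and object-centric, and none of these names or numbers come from Verses' code.

```python
# Purely illustrative: an engineered "core prior" as a Beta distribution
# over a binary event, refined by observations via a conjugate Bayesian
# update. AXIOM's real priors are far richer; only the prior-then-refine
# flow is analogous.

def update_beta(alpha, beta, successes, failures):
    """Conjugate update: Beta(a, b) plus data gives Beta(a + s, b + f)."""
    return alpha + successes, beta + failures

# Engineered prior: we assert the event is likely before seeing any data.
alpha, beta = 8.0, 2.0
print("Prior mean:", alpha / (alpha + beta))      # 0.8

# Observed data then refines the belief at scale.
alpha, beta = update_beta(alpha, beta, successes=3, failures=9)
print("Posterior mean:", alpha / (alpha + beta))  # 0.5
```

The engineered prior dominates early behavior and data gradually overrides it, which is exactly why hand-engineering the prior is the bottleneck the authors call out.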

It will be extremely interesting to see what kinds of novel and bespoke applications open up when it can begin learning completely from scratch.

Devansh:

Good point about needing core priors. That's somewhere I think we can combine generalized models (build priors) + Human in the Loop (high-level refine/guide) + a system like this (which takes data to refine at scale).

It's essentially a way to let Domain Experts test their ideas and validate them with compute, as opposed to replacing those experts.

Karen Doore:

Thank you for breaking down the complex math of this publication! Your diagrams and descriptions are spectacular! These ideas can help mortal humans learn how to become more intelligent: one can cultivate habits for appreciating the challenges of adverse experiences (like reading and designing diagrams to explain complex publications). Wisdom is context-sensitive, achieved through diverse learning of the intrinsic rewards of such feedback loops; over a lifetime of diverse experiences, one's subconscious embodied simulations can help prepare one to value surprise and cultivate hope for navigating high-risk environments in the real world. Absolutely excellent!

Devansh:

Wisdom is context-sensitive. Well said!

RDM:

Devansh - knocked it out of the park. Great disassembly. Kudos.

What were your sources for the architecture breakdown?

Cheers

Devansh:

They should be linked. It's mostly from their publications and research papers, with some speculation by me on the "whys".

John Dorsey:

I've been following Verses for a long time, and I'm stunned that it has remained under the radar for so long. I really hope what you have written here reaches enough people that Verses starts to get covered by major media outlets like the Wall Street Journal. Given how everyone is talking about the limitations of LLMs, I believe Verses is the business story of the year, if not the century!

Devansh:

I think there's a lot of great work out there, all competing to be the next paradigm. Verses is impressive enough to be on that list, but I think it's still too early to be mainstream.

John Dorsey:

Verses has released a commercial product. If it is as good as they say it is, then why would it be too early for them to be mainstream?

Devansh:

It's still a very niche technology. It requires more testing to become mainstream, the same way it took a while for GPT to really gain attention.

John Dorsey:

I don't know, I would argue that LLMs are the niche technology given their limitations. Verses' technology has already been verified by a qualified third party. I suspect that to become mainstream it simply needs to get attention. I'm baffled as to why so many media publications and YouTube channels that cover AI have been silent on Verses. I've contacted many of them asking them to take a look at Verses, yet I have gotten almost no response.

NV:

Awesome article, @devansh! Verses' approach is refreshing, and you've done a great job of explaining it. Btw, any idea why Verses' revenue has been dropping continuously for the past 3 years, from $2.7M to just $155k?

Devansh:

Honestly, no idea. I covered this from a research perspective, the same way I'd cover work from an academic lab. That's why I didn't make any recommendations on whether you should invest in them or not.

Mike L:

Finally, someone who can explain it. Thanks, Devansh.

Shaman02:

Devansh, well done!! I wanted to make a quick observation and ask if you could verify its importance: “I noticed the demonstration didn’t just involve the robot avoiding a couch, but also a coffee table that was in its path. This is crucial because it suggests the agent can recognize and navigate around multiple, varied obstacles in a previously unseen environment, demonstrating robust, domain-agnostic spatial reasoning.”

Andrew Lucas:

LLM + RL does provide a clear simulation of understanding and agency. RL trains goal-directed behavior AND it’s scalable.

The scalability of LLMs is what makes them useful, despite their sample inefficiency.

This approach (active inference by free energy minimization) would be interesting if it could perform general code generation from a text prompt without domain-specific scaffolding.
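For readers unfamiliar with the mechanics, here is a toy sketch of action selection by expected free energy minimization in a discrete setting. It's illustrative only: the action names, distributions, and fixed ambiguity terms are invented, not taken from Verses' implementation.

```python
import numpy as np

def kl(p, q):
    """KL divergence between two discrete distributions."""
    return float(np.sum(p * np.log(p / q)))

# Predicted observation distributions under each candidate action.
predicted = {
    "left":  np.array([0.7, 0.3]),
    "right": np.array([0.4, 0.6]),
}

# The agent's goal, expressed as a prior over preferred observations.
preferred = np.array([0.9, 0.1])

# Ambiguity per action (fixed here for simplicity; in a full model it is
# the expected entropy of observations under the posterior over states).
ambiguity = {"left": 0.2, "right": 0.05}

# Expected free energy ~ risk (KL from preferred outcomes) + ambiguity.
efe = {a: kl(p, preferred) + ambiguity[a] for a, p in predicted.items()}
best_action = min(efe, key=efe.get)
print(efe, "->", best_action)  # "left" wins despite higher ambiguity
```

The point is that the agent scores actions by how well predicted outcomes match its preferences (risk) plus how uncertain those outcomes are (ambiguity), rather than by a learned reward signal.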

Devansh:

I disagree re: the scalability of RL. I think it's very simplistic, which is why RL bots, even trained at scale, tend to falter in weird edge cases.

The way I see it: LLMs build the scaffolds, which get augmented by experts. Then a specific system like this refines as we go.

Eckhard Umann:

In German psychopathology, this is called "Räsonierendes Denken" (roughly, aimless theorizing). It sounds great and ends in nothing. Could the author please give an example from real life?

Devansh:

What do you mean? The research results are linked, and they do have a few enterprise clients.

Eckhard Umann:

Physics creates the conditions for biology, biology for psychology, and psychology for society. This is where the conditions for intelligence arise. You copy its simplest forms.

Eckhard Umann:

There is no biological definition of intelligence, and there is no physical definition. Intelligence is a psychological/sociological phenomenon and is the field of psychology; its disturbances are the field of psychiatry.
