Content Recommendations- 5/13/2025 [Updates]
What you should know in AI, Software, Business, and Tech
It takes time to create work that’s clear, independent, and genuinely useful. If you’ve found value in this newsletter, consider becoming a paid subscriber. It helps me dive deeper into research, reach more people, stay free from ads/hidden agendas, and supports my crippling chocolate milk addiction. We run on a “pay what you can” model—so if you believe in the mission, there’s likely a plan that fits (over here).
Every subscription helps me stay independent, avoid clickbait, and focus on depth over noise, and I deeply appreciate everyone who chooses to support our cult.
PS – Supporting this work doesn’t have to come out of your pocket. If you read this as part of your professional development, you can use this email template to request reimbursement for your subscription.
Every month, the Chocolate Milk Cult reaches over a million Builders, Investors, Policy Makers, Leaders, and more. If you’d like to meet other members of our community, please fill out this contact form here (I will never sell your data nor will I make intros w/o your explicit permission)- https://forms.gle/Pi1pGLuS1FmzXoLr6
A lot of people reach out to me for reading recommendations. I figured I’d start sharing whatever AI papers/publications, interesting books, videos, etc. I came across each week. Some will be technical, others not really. I will add whatever content I found really informative (and remembered throughout the week). These won’t always be the most recent publications- just the ones I’m paying attention to this week. Without further ado, here are interesting readings/viewings for 5/13/2025. If you missed last time’s readings, you can find them here.
Reminder- We started an AI Made Simple Subreddit. Come join us over here- https://www.reddit.com/r/AIMadeSimple/. If you’d like to stay on top of community events and updates, join the discord for our cult here: https://discord.com/invite/EgrVtXSjYf. Lastly, if you’d like to get involved in our many fun discussions, you should join the Substack Group Chat Over here.
Community Spotlight: Sahar Mor
Sahar Mor is an amazing thinker in AI. I recommend his publication- AI Tidbits- on Substack, but his insights are much richer than that. He regularly shares really helpful libraries and open source projects on his social media, and his posts have saved me a ton of time in digging for libraries. If you’re a builder in this space, following him is a non-negotiable to stay ahead of the game. He even open sourced a very interesting tool- Cursor View- over here-
“Cursor View is a local tool to view, search, and export all your Cursor AI chat histories in one place. It works by scanning your local Cursor application data directories and extracting chat data from the SQLite databases.
Privacy Note: All data processing happens locally on your machine. No data is sent to any external servers.”
If you’re doing interesting work and would like to be featured in the spotlight section, just drop your introduction in the comments/by reaching out to me. There are no rules- you could talk about a paper you’ve written, an interesting project you’ve worked on, some personal challenge you’re working on, ask me to promote your company/product, or anything else you consider important. The goal is to get to know you better, and possibly connect you with interesting people in our chocolate milk cult. No costs/obligations are attached.
Previews
Curious about what articles I’m working on? Here are the previews for the next planned articles-
How Amazon uses AI to crush Labor Movements.
I provide various consulting and advisory services. If you’d like to explore how we can work together, reach out to me through any of my socials over here or reply to this email.
Highly Recommended
These are pieces that I feel are particularly well done or important. If you don’t have much time, make sure you at least catch these works.
I was featured on this podcast episode to talk about how to deploy AI, how organizations can keep up with AI, why companies engage in unethical practices, and much more. It was a very fun conversation, so check it out.
The True Cost of Churn💥, Venture Capital 3.0 🤖, How Founders Should Think About Runway💸
Rubén Domínguez Ibar and Chris Tottman are two of the best minds in Venture Capital, and they share some of the best insights for investors and founders. Strongly recommend their work.
If you’re building, investing, or just trying to stay ahead of the curve, you’re in the right place. Every week, we break down the latest insights, funding news, and founder-friendly gems — no fluff, just what matters. Plus, we track the freshest VC funds deploying capital so you know where the money’s moving.
Jon Krohn is probably the best AI podcaster in the space for builders, and he always brings quality. This interview was no exception.
Jon Krohn speaks to John Roese about the promise of multi-agent teams for business, the benefits of agentic AI systems that can identify and complete tasks independently, and how these systems demand new authentication, authorization, security and knowledge-sharing standards. They also discuss how to use AI to refine project ideas down to a core business need, as well as the new and emerging careers in the tech industry and beyond, all thanks to AI.
Stripe’s New Foundation Model for Payments
Very good overview of the innovation by Andreas Horn
For years, Stripe used traditional ML — separate models for fraud, disputes, and authorizations. Each one relied on handpicked features like BIN codes, ZIP codes, email addresses, and payment methods. That worked — but it was narrow, manually intensive, couldn’t scale, and, most importantly, missed the bigger picture.
So Stripe trained a transformer, just like GPT — but instead of learning language, it learned from billions of transactions. Each payment — from a coffee in Paris to a subscription in Tokyo — was turned into a dense vector: a numerical fingerprint capturing its behavior and context.
𝗧𝗵𝗲 𝗼𝘂𝘁𝗰𝗼𝗺𝗲?
➜ Transactions with similar behavior cluster naturally — by issuer, merchant, location, or risk
➜ Suspicious patterns emerge organically — without handcrafted rules
➜ Fraud becomes easier to detect — not because it was labeled, but because it’s “understood”
This foundation model now captures the structure and relationships between transactions — in real time — the way GPT models understand the flow of words in a sentence. Stripe no longer needs a different model for every use case. They’ve built one that generalizes across many — and keeps learning.
𝗧𝗵𝗲 𝗿𝗲𝘀𝘂𝗹𝘁?
They tested it on one of the hardest problems in the space: Card testing attacks that hide in legitimate traffic.
➜ Traditional ML: 59% detection
➜ Transformer-based model: 97% — overnight
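To make the embed-and-compare mechanic concrete, here’s a toy sketch. Everything in it is invented for illustration (the random “encoder,” the feature vectors, the scoring rule); Stripe’s actual system is a transformer trained on billions of real payments, not a random projection. The point is just the shape of the idea: similar transactions land near each other in embedding space, so a burst of card-testing attempts scores as close to known-bad behavior without any handcrafted rule.

```python
import numpy as np

rng = np.random.default_rng(0)
PROJECTION = rng.standard_normal((4, 8))  # fixed random stand-in for learned encoder weights

def embed(features):
    """Stand-in for a learned encoder: map raw transaction features to a unit vector."""
    v = np.asarray(features, dtype=float) @ PROJECTION
    return v / np.linalg.norm(v)

def risk_score(candidate, known_fraud_embeddings):
    """Max cosine similarity to any known-fraud embedding (all vectors are unit-norm)."""
    return max(float(candidate @ f) for f in known_fraud_embeddings)

# A cluster of near-identical "card-testing" attempts vs. one unrelated normal purchase.
fraud = [embed([1.0, 0.01 * i, 0.0, 0.0]) for i in range(5)]
normal = embed([0.0, 0.0, 1.0, 1.0])
suspect = embed([1.0, 0.06, 0.0, 0.0])

print(round(risk_score(suspect, fraud), 3), round(risk_score(normal, fraud), 3))
```

In a real system, the encoder is learned, not random, which is exactly what lets the clusters correspond to issuer, merchant, location, or risk rather than arbitrary geometry.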
The Voice Agents Toolkit for Builders
While text-based agents have dominated the early wave of AI applications, voice represents the next frontier in human-AI interaction. It’s not just another interface — it’s the most intuitive and accessible way for humans to interact with AI systems. This intersection of voice technology and AI agents creates unprecedented opportunities for developers, so I’m excited to share this comprehensive guide to the voice AI ecosystem.
After decades of frustrating experiences with scoped voice assistants that cannot be interrupted and follow a narrow rule-based script, we’re witnessing a fundamental shift in what’s possible. Three key developments drive this shift:
Breakthrough in speech-native models — the release of OpenAI’s Realtime API last October and Google’s Gemini 2.0 Realtime Multimodal API last week mark a transition from traditional “cascading architectures” (where speech is converted to text, processed, and converted back) to speech-native models that can process audio directly with unprecedented quality. With OpenAI’s recent 60% Realtime API price reduction and the hiring of WebRTC’s founder, we’re seeing a clear industry push toward making real-time voice interactions accessible and affordable.
Dramatic reduction in complexity — what previously required hundreds of data scientists can now be achieved by small teams of AI engineers. We’re seeing companies reach substantial ARR with lean teams by building specialized voice agents for specific verticals — from restaurant order-taking to lead qualification for sales teams.
Infrastructure maturity — the emergence of robust developer platforms and middleware solutions has dramatically simplified voice agent development. These tools handle complex challenges like latency optimization, error handling, and conversation management, allowing developers to focus on building unique user experiences.
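To make the architectural contrast above concrete, here’s a minimal sketch of the “cascading” pipeline that speech-native models replace. All three stage functions are hypothetical stand-ins (no real ASR, LLM, or TTS API is called); the sketch only shows the shape of the pipeline and why each hop matters.

```python
# Toy sketch of the "cascading" voice-agent architecture:
# speech -> text -> LLM -> text -> speech.
# Every stage below is a stand-in, not a real API.

def speech_to_text(audio: bytes) -> str:
    # Stand-in for an ASR model; here we pretend the audio "contains" its transcript.
    return audio.decode("utf-8")

def run_llm(prompt: str) -> str:
    # Stand-in for the language model that decides what to say back.
    return f"Echoing: {prompt}"

def text_to_speech(text: str) -> bytes:
    # Stand-in for a TTS model.
    return text.encode("utf-8")

def cascaded_voice_agent(audio_in: bytes) -> bytes:
    # Each hop adds latency and drops paralinguistic cues (tone, pauses,
    # interruptions), which is why models that process audio directly
    # are such a shift.
    transcript = speech_to_text(audio_in)
    reply_text = run_llm(transcript)
    return text_to_speech(reply_text)

print(cascaded_voice_agent(b"book a table for two"))
```

A speech-native model collapses all three stages into one, which is where the latency and quality gains come from.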
This convergence creates a unique opportunity for builders. For the first time in human history, we have god-like AI systems that converse like humans. The era of capable voice AI has arrived, opening up vast opportunities for innovators and developers alike.
Unlike web or mobile app development, where patterns are well-established, voice AI is still in its formative stage. The winners in this space will be those who can combine technical capability with a deep understanding of specific industry needs.
In this post, I’ll provide a well-curated overview of the open-source and commercial tools available for developers building voice agents. While VCs segment the market based on investment opportunities, I’ll map the ecosystem based on what matters to developers: APIs, SDKs, and tools you can actually use today. What is the go-to model for speech-to-text? The API for speech synthesis? Which tools do other builders rely on to develop voice agents? With the holiday season upon us, there’s no better time to build your voice agent, turn it into a company, or automate a personal workflow.
Come Back to the Body, Come Back to the Breath
Tyler Corderman is one of the most inspiring and coolest people I’ve spoken to. His approach to life and learning is genuinely bad-ass, and I think you guys should check him out. Super grateful to have had a conversation with him- it pushes me to improve my articles every time.
When Eric Flaningam first told me he was looking into ads for LLMs, I didn’t buy it. There were too many issues, and rec-sys based ads (which are the engine behind other platforms) are too different from LLMs to pull off the same kinds of results and performance. But since then, I’ve become a believer, largely due to Eric’s many well-thought-out arguments. He’s undeniably a first-rate intelligence who’s knocking on the door of some world-changing ideas.
Advertising could be the answer to monetizing those users, which I’ve started to lay out in my previous two articles:
This article will be a bookend on that series, covering:
The Incentives of Advertising
The Challenges with Advertising
The Timing of AI Advertising
How will ads be implemented?
The impacts of AI & Advertising
Let’s get to it.
Disclaimer: ‘The Siren Song’ refers to the allure of advertising, but not necessarily the detrimental nature of the Sirens in classical Greek Mythology. Analogies can’t be perfect!
This is an absolute banger by Meg McNulty. This is a space I want to look into more, but her observations seem to line up with other things that I’ve observed — such as the heavy trend toward edge that we covered in our April Market Research.
This week, two headlines quietly pointed toward a much bigger shift in how, and where, AI will run.
First, VSORA, a French chip startup, raised $46 million to bring a new kind of inference chip to market — one that beats GPUs on power and performance without requiring a data center. Their goal is to run smaller, smarter models everywhere: phones, cars, sensors.
Second, EdgeRunner AI has raised $17.5M to build air-gapped, on-device AI agents for the military, focusing on environments with limited connectivity or critical security. EdgeRunner is positioning itself at the forefront of sovereign edge AI, offering a secure, hyper-local alternative to cloud-based models.
Put together, these stories point toward the same conclusion: inference is heading to the edge. The problem? Cloud isn’t necessarily ready to scale with it.
Great list by the iconic Exponential View. There were many good trends; my personal favorite is the increased AI infra spending. I’m pretty interested in optimizing the electrical grid, and there is some good research on how new evolutions of AI can improve this.
Software engineering ↓
AI infrastructure spending ↑
Remote work & startups ↑
Logan Thorneloe gives very good career advice for AI Engineers, primarily because he works as one. This lets him stay in touch with the field and give meaningful insights in a way that influencers who sell career advice just can’t. This is another banger.
As AI progresses, it breaks down the barrier that separated a 10x engineer from the rest of the pack. Technical concepts and fast coding are becoming more accessible to everyone. The defining factor in the age of AI is taking action. This is what defines an agentic engineer and what puts them ahead of their peers: Their ability to know what actions to take and the drive to execute them.
What you’ll see all over social media is everyone telling you to be an agentic engineer. What you don’t see anyone telling you is how. Why is this? Being an agentic engineer is difficult and is rooted outside of technical ability.
It’s something I’ve struggled with and I think about a lot, so in this article I’ll simplify the three things that set an agentic engineer apart from the rest and how you can become one.
New RL technique makes diffusion LLMs viable contenders for reasoning tasks
Not gloating, but I specifically earmarked diffusion models as the next frontier for reasoning and generation in the April AI Market Research. A few days later, this research dropped. Great summary of the research by Ben Dickson.
Diffusion LLMs outperform autoregressive models in simple inference. Now they’re getting reasoning abilities.
Other Content
What happened with the latest update?
People aren’t happy with the way Gemini 2.5 is being handled right now. In a desperate bid to improve their amazing model, Google might have introduced many strange tendencies into their system.
I have been using Gemini 2.5 Pro Experimental pretty much since release. I primarily used it for co-reading research papers in my field (physics). It really sped up the rate at which I could learn from these papers, and occasionally I would work through problems with it. While it wasn’t perfect, I found it to be a great tool, and better than competitors like o3.
I don’t know what happened with the most recent update, but Gemini 2.5 Pro Preview now seems much worse, and it is… lazy? I don’t know how else to describe it. I’ve seen the new benchmarks, which makes things confusing.
SoundCloud Changes Policy to Use User Content for AI Training
I think using music to train early versions of audio models might do more harm than good, given how different music is from speech. Although I could be wrong, and maybe mumble-rapper ChatGPT is the new unlock that the world needs.
I have to read more into it, but if this meets the hype, then it’s a HUGE deal. A lot of hyped research doesn’t, so I’m not ranking it super highly yet.
Neurons in brains use timing and synchronization in the way that they compute, but this is largely ignored in modern neural nets. We believe neural timing is key for the flexibility and adaptability of biological intelligence.
We propose a new neural architecture, “Continuous Thought Machines” (CTMs), which is built from the ground up to use neural dynamics as a core representation for intelligence. By using neural dynamics as a first-class representational citizen, CTMs naturally perform adaptive computation.
Many emergent, interesting behaviors arise as a result: CTMs solve mazes by observing a raw maze image and producing step-by-step instructions directly from their neural dynamics. When tasked with image recognition, the CTM naturally takes multiple steps to examine different parts of the image before making its decision. This step-by-step approach not only makes its behavior more interpretable but also improves accuracy: the longer it “thinks,” the more accurate its answers become.
We also found that this allows the CTM to decide to spend less time thinking on simpler images, thus saving energy. When identifying a gorilla, for example, the CTM’s attention moves from eyes to nose to mouth in a pattern remarkably similar to human visual attention.
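As a loose illustration of the adaptive-computation idea in the excerpt above (this is not the CTM architecture; the update rule, threshold, and “difficulty” knob are all invented), here’s a toy model that “thinks” in steps and stops once it’s confident, so easier inputs use fewer steps:

```python
# Toy illustration of adaptive computation: keep taking "thinking" steps
# until a confidence threshold is reached. All numbers are invented.

def adaptive_classify(difficulty: float, threshold: float = 0.9, max_steps: int = 50):
    confidence = 0.5
    for step in range(1, max_steps + 1):
        # Each extra step adds less evidence; harder inputs accumulate it more slowly.
        confidence += (1.0 - confidence) * (0.5 / difficulty)
        if confidence >= threshold:
            return step, confidence
    return max_steps, confidence

easy_steps, _ = adaptive_classify(difficulty=1.0)
hard_steps, _ = adaptive_classify(difficulty=5.0)
print(easy_steps, hard_steps)  # the harder input takes more steps
```

The CTM realizes this kind of behavior with learned neural dynamics rather than a hand-set rule, which is what makes the energy savings and the human-like attention patterns emerge rather than being programmed in.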
If you liked this article and wish to share it, please refer to the following guidelines.
Reach out to me
Use the links below to check out my other content, learn more about tutoring, reach out to me about projects, or just to say hi.
Small Snippets about Tech, AI and Machine Learning over here
My grandma’s favorite Tech Newsletter-
My (imaginary) sister’s favorite MLOps Podcast-
Check out my other articles on Medium: https://rb.gy/zn1aiu
My YouTube: https://rb.gy/88iwdd
Reach out to me on LinkedIn. Let’s connect: https://rb.gy/m5ok2y
My Instagram: https://rb.gy/gmvuy9
My Twitter: https://twitter.com/Machine01776819