Intelligent Automation Newsletter #194

May 16, 2025

We are honored to count you among the 1+ MILLION readers of our weekly newsletter. Please help grow our community by inviting your friends to subscribe.

If you’re new here, we celebrate the ways Artificial Intelligence is making our world more Human. Make sure you check out my new book and community.

This week’s 5 top stories you can't miss:

1️⃣ Google’s AlphaEvolve discovers math breakthroughs

Google just debuted AlphaEvolve, a coding agent that harnesses Gemini and evolutionary strategies to craft algorithms for scientific and computational challenges, driving efficiency inside Google and solving historic math problems.

The details:

AlphaEvolve uses a mix of Gemini models (Flash for idea generation, Pro for analysis) to create code, which is tested by evaluators and evolved iteratively.
The system has already made several mathematical discoveries, including a way to multiply 4x4 complex-valued matrices using 48 scalar multiplications, the first improvement on Strassen's 1969 algorithm in that setting.
It is also boosting efficiency for Google, optimizing data center scheduling, improving AI training (including its own), and helping with chip design.
When tested on 50+ open math problems, it matched the best known solutions in about 75% of cases and discovered new, improved solutions in another 20%.
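
For readers who want to picture the "generate, evaluate, evolve" loop described above, here is a minimal sketch. It is not AlphaEvolve's actual implementation: the `propose` function is a hypothetical stand-in that randomly perturbs a list of numbers where the real system would ask Gemini models to rewrite code, and `evaluate` is a toy scoring function invented for illustration.

```python
import random


def evaluate(candidate):
    """Toy automated evaluator (an assumption): higher is better, peak at 3.0."""
    return -sum((x - 3.0) ** 2 for x in candidate)


def propose(parent, rng, step=0.5):
    """Hypothetical stand-in for the model's proposal step: perturb a parent."""
    return [x + rng.gauss(0.0, step) for x in parent]


def evolve(generations=200, population_size=20, survivors=5, seed=0):
    rng = random.Random(seed)
    population = [
        [rng.uniform(-10.0, 10.0) for _ in range(2)]
        for _ in range(population_size)
    ]
    history = []  # best score seen at the start of each generation
    for _ in range(generations):
        # Rank every candidate with the evaluator and keep the best few.
        population.sort(key=evaluate, reverse=True)
        parents = population[:survivors]
        history.append(evaluate(parents[0]))
        # Refill the population with new variations of the survivors.
        children = [
            propose(rng.choice(parents), rng)
            for _ in range(population_size - survivors)
        ]
        population = parents + children
    best = max(population, key=evaluate)
    return best, history


best, history = evolve()
print("best candidate:", best, "score:", evaluate(best))
```

Because the top candidates are carried over unchanged into each new generation, the best score can never get worse from one round to the next. That is the basic property that lets this kind of search grind steadily toward better solutions.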

My take:

A few days ago, we had OpenAI’s Jakub Pachocki saying AI has shown “significant evidence” of being capable of novel insights, and today Google has taken that a step further. Math plays a role in nearly every aspect of life, and AI’s pattern and algorithmic strengths look ready to uncover a whole new world of scientific discovery.

🔷 [Sponsored] Boomi Agentstudio Is Now Generally Available

AI agents are moving from experimentation to enterprise deployment, and Boomi Agentstudio helps organizations manage this transition. The platform lets companies design, govern, and orchestrate AI agents at scale within a secure, no-code environment.

The details:

Universal Governance: With centralized agent registration, management, monitoring, and observability, Agent Control Tower inside Boomi Agentstudio ensures all AI agents are properly governed to provide full visibility and help organizations stay compliant.
New Integration Capabilities: Boomi is now an approved AWS Data Processor and integrates with Amazon Q Business, making it easier for teams to deploy AI agents within AWS environments.
New Boomi AI Agents: Over 33,000 embedded AI agents have been helping developers automate everything from API design to process documentation, accelerating time-to-value.
Model Context Protocol Support: With native MCP support, Boomi Agentstudio enables AI agents to securely access tools across environments, setting the foundation for scalable, multi-agent systems.

My take:

Boomi's Agentstudio represents a significant advancement in enterprise AI deployment, offering organizations the control they need while accelerating implementation. By providing a comprehensive solution for managing the full lifecycle of AI agents, Boomi addresses the critical governance challenges that have prevented many companies from scaling their AI initiatives.

2️⃣ AI teaches itself with 'Absolute Zero'

Researchers from Tsinghua University and BIGAI introduced “Absolute Zero,” a new AI training method where models learn and master complex reasoning tasks on their own through self-play — without needing any human-provided data.

The details:

The Absolute Zero Reasoner (AZR) autonomously generates its own tasks, solves them, and improves through self-play with no external datasets required.
The system achieved SOTA results on coding and math benchmarks, surpassing models trained on tens of thousands of expert-labeled examples.
AZR uses three reasoning modes (deduction, abduction, and induction) to generate progressively harder challenges for itself to learn from.
Researchers noted an "uh-oh moment" when Llama-3.1 produced chains of thought about "outsmarting intelligent machines," raising safety concerns.
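
To make the three reasoning modes more concrete, here is a small sketch of how a code executor could check each kind of self-generated task. This is an illustration under my own assumptions, not the researchers' code: the function names and the example program `square_plus_one` are hypothetical. The idea is that a single (program, input, output) triple can be turned into three different puzzles depending on which piece is hidden.

```python
def square_plus_one(x):
    """Hypothetical example program that a model might propose as a task."""
    return x * x + 1


def check_deduction(program, program_input, predicted_output):
    """Deduction: given the program and an input, predict the output."""
    return program(program_input) == predicted_output


def check_abduction(program, expected_output, predicted_input):
    """Abduction: given the program and an output, find an input that produces it."""
    return program(predicted_input) == expected_output


def check_induction(examples, predicted_program):
    """Induction: given input/output pairs, write a program that reproduces them."""
    return all(predicted_program(i) == o for i, o in examples)


print(check_deduction(square_plus_one, 3, 10))
print(check_abduction(square_plus_one, 10, -3))
print(check_induction([(1, 2), (2, 5)], lambda x: x * x + 1))
```

In each case the answer is verified by running code instead of consulting a human-labeled dataset, which is what allows this kind of training to proceed without external data.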

My take:

A technique that allows AI to self-train could eliminate the development barrier of massive, costly human datasets. Given that we are already running out of quality data and systems are already moving beyond human intelligence, this may be a necessity to continue scaling learning.

3️⃣ Anthropic set to launch new Sonnet, Opus models

Anthropic is reportedly preparing to launch advanced versions of Claude’s Sonnet and Opus models in the “upcoming weeks,” featuring hybrid thinking and expanded tool use capabilities.

The details:

The models are reportedly capable of alternating between reasoning and tool use, and can self-correct by stepping back to examine what went wrong.
For coding, the models can test their generated code, identify errors, troubleshoot with reasoning, and make corrections without requiring human intervention.
An Anthropic model, codenamed Neptune, is undergoing safety testing, with some believing the name hints at a 3.8 (8th planet from the sun) release.
The news coincides with Anthropic launching a new bug bounty program focused on stress-testing Claude’s safety measures.

My take:

While Anthropic has been in the mix with Google and OpenAI for the top model in the industry, the company has been much slower to bring new ones to market, with 3.7 Sonnet in February marking its only release in 2025. With both rivals also likely releasing upgrades soon, we could be in for a wild few months.

4️⃣ OpenAI’s HealthBench to evaluate healthcare AI

OpenAI released HealthBench, a benchmark created with 262 physicians to evaluate how AI systems perform in health conversations — and establish a new standard for measuring AI’s safety and effectiveness in medical contexts.

The details:

The benchmark tests models across several themes (like emergency referrals and global health) and behaviors (accuracy, communication quality, etc.).
Recent models performed much better on the benchmark, with OpenAI's o3 scoring 60% compared to GPT-3.5 Turbo's 16%.
The results also revealed that smaller models are now much more capable, with GPT-4.1 Nano outperforming older options while also being 25x cheaper.
OpenAI has open-sourced both the evaluations and testing dataset of 5,000 realistic, multi-turn health conversations between models and users.

My take:

There is an overwhelming amount of evidence that AI can provide serious improvements across the board in healthcare settings, and having physician-validated benchmarks is an important step for both measuring each model’s performance in medical contexts and deciding when and how to deploy them.

5️⃣ Pope Leo XIV targets AI as 'critical challenge'

Newly elected Pope Leo XIV identified artificial intelligence as one of humanity’s most pressing challenges in his first major address, continuing his predecessor's focus on the ethical implications of the technology.

The details:

The first American Pope highlighted AI as posing "new challenges for the defence of human dignity, justice and labour."
He also drew parallels between the AI and Industrial Revolutions, saying the Church must lead in confronting AI's threats to workers and human dignity.
His stance follows Pope Francis' calls for an international AI treaty and warnings about autonomous weapons systems.

My take:

The Vatican’s continued concerns over AI show that the tech’s advancement is moving from niche tech discussions to the forefront of global (and political, as we’ve seen over the past week) concern. With over 1B Catholics worldwide, the Pope’s voice could play a role in helping shape both discourse and policy on AI.

🧊[SPONSORED] JOIN ME and top business leaders at SAP's flagship event to explore their bold vision, breakthrough innovations, and real-world solutions transforming industries.

Register here for the Sapphire Virtual session or in-person in Orlando and Madrid.

Hear from industry leaders using SAP to tackle today’s biggest challenges
Get a first look at the latest updates to SAP Business Suite and Joule Agents
Dive into real product demos, advanced analytics, and intelligent apps

The two posts you can't miss this week:

👉 One rope. One technique. A life saved.

👉 From 14 seconds to 1.8. That’s not a pit stop—it’s a masterclass in innovation.

Let us join hands to make our world more human! — Pascal

#artificialintelligence #intelligentautomation #futureofwork #AI #automation #management #technology #innovation
