ChatGPT agent enters the action era

PLUS: A humanoid robot capable of swapping its own battery

Barsee
22 Jul

Howdy. It’s Barsee again.

Happy Tuesday, AI family, and welcome back to AI Valley.

Today’s climb through the Valley reveals:

Google and OpenAI's AI models win gold at math Olympiad
A humanoid robot capable of swapping its own battery
AI falls short as human coder takes the crown
ChatGPT agent enters the action era
And much more

Let’s dive into the Valley of AI…

PEAK OF THE DAY

Google and OpenAI's AI models win gold at math Olympiad

Google DeepMind has reached a watershed moment in AI: its Gemini model has officially earned a gold medal at the 2025 International Mathematical Olympiad (IMO), matching the performance of the top human contestants.

Here's everything you need to know:

DeepMind collaborated with the IMO to test Gemini under the exact same conditions as human competitors: 4.5-hour time limit, original problem statements.
The AI solved 5 out of 6 problems, scoring 35 of 42 points, enough to reach the gold-medal threshold.
Last year, DeepMind earned silver with systems that needed the problems translated into formal, domain-specific languages. This time, Gemini solved the problems entirely in natural language, without any manual conversion.

How does this compare to OpenAI?

OpenAI also claimed a 35/42 score with an unnamed model, but did not work directly with the IMO (its answers were graded by former medalists).
Google’s results, however, were officially verified and certified by IMO coordinators using the same grading criteria as human contestants.

Why this is a big leap forward

Only 67 out of 630 human contestants earned gold medals this year, highlighting the extreme difficulty of the competition.
Unlike last year’s system (which required experts to translate problems into formal languages like Lean), this year’s Gemini Deep Think worked fully autonomously, producing rigorous proofs directly from natural language problem descriptions.
The model was trained using advanced reinforcement learning and parallel thinking techniques, enabling multi-step reasoning without human intervention.
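
For readers who like to check the math, here is a quick sketch of the numbers above. It assumes the standard IMO scoring of 7 points per problem (which matches the 42-point maximum) and full marks on every solved problem; the function names are just for illustration.

```python
# Figures quoted in this issue. 7 points per problem is an assumption
# based on standard IMO scoring (6 problems x 7 points = 42).
POINTS_PER_PROBLEM = 7
TOTAL_PROBLEMS = 6


def imo_score(problems_solved: int) -> int:
    """Score assuming full marks on solved problems and zero on the rest."""
    return problems_solved * POINTS_PER_PROBLEM


def gold_share(gold_medalists: int, contestants: int) -> float:
    """Fraction of contestants who earned a gold medal."""
    return gold_medalists / contestants


max_score = imo_score(TOTAL_PROBLEMS)  # 42
gemini_score = imo_score(5)            # 35
share = gold_share(67, 630)

print(f"{gemini_score}/{max_score} points")          # 35/42 points
print(f"{share:.1%} of human contestants got gold")  # 10.6% of human contestants got gold
```

In other words, roughly one in ten human contestants reached the level Gemini was graded at.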

Why it matters:

The achievement suggests AI is less than a year away from being used by mathematicians to crack unsolved research problems at the frontier of the field.

GROWTH SCHOOL

Become an AI Generalist that makes $100K

Not using AI to automate your work yet? You might be falling behind, but don’t worry:

Join the world’s first 16-hour live AI upskilling sprint for professionals, founders, consultants & business owners like you. Register Now (Only 500 free seats)

Date: Saturday and Sunday, 10 AM - 7 PM.

Rated 4.9/5 by global learners – this will truly make you an AI Generalist who can build, solve & work on anything with AI.

In just 16 hours & 5 sessions, you will:

Learn the basics of LLMs and how they work.
Master prompt engineering for precise AI outputs.
Build custom GPT bots and AI agents that save you 20+ hours weekly.
Create high-quality images and videos for content, marketing, and branding.
Automate tasks and turn your AI skills into a profitable career or business.

All by global experts from companies like Amazon, Microsoft, SamurAI, and more.

And, $5100+ worth of AI tools across 3 days. Day 1: 3000+ Prompt Bible. Day 2: Roadmap to make $10K/month with AI. Additional bonus: Your Personal AI Toolkit Builder.

*This is sponsored

THROUGH THE VALLEY

1/ Walker S2, a humanoid robot capable of swapping its own battery

China’s UBTech has unveiled the Walker S2, a humanoid robot that can autonomously swap its own battery in just three minutes, eliminating the need for human intervention. Built for 24/7 industrial use, it combines bipedal locomotion with a hot-swappable battery system, ensuring uninterrupted operation. A video shows Walker S2 detaching its depleted battery and installing a fresh one before resuming work. UBTech’s dual-battery system adds a backup power layer for critical tasks. This innovation follows UBTech’s earlier breakthrough with “BrainNet,” where multiple humanoids collaborated seamlessly in factories, signaling a leap toward intelligent, self-sustaining automation.

2/ DuckDuckGo now lets you hide AI-generated images in search results

DuckDuckGo is adding a new feature that lets users filter out AI-generated images from search results. The update comes after feedback from users who said AI images often clutter their searches. The filter can be found under the Images tab as a drop-down menu labeled “AI images” with options to “show” or “hide” AI content. It can also be turned on in search settings. The feature uses curated open-source blocklists to filter results, though it may not catch everything. DuckDuckGo says more filters are coming as AI-generated content continues to flood the internet.

3/ AI falls short as human coder takes the crown

Polish programmer Przemysław Dębiak, known as “Psyho,” outlasted OpenAI’s custom AI model in a grueling 10-hour coding marathon, winning the AtCoder World Tour Finals by a 9.5% margin. The event marked the first head-to-head between AI and top human coders at a world championship. The AI dominated early, but Dębiak, a former OpenAI engineer, pulled ahead in the final hour with a creative solution. His win highlights a shrinking gap: while humans still excel at problem-solving under uncertainty, AI models now iterate 40x faster and are rapidly catching up in competitive coding challenges.

TRENDING TOOLS

Lambda 1-Click Clusters - Pre-validated, pre-configured NVIDIA GPU clusters with no hidden fees that you can deploy in minutes *
ChatGPT Agent - ChatGPT can now perform tasks on your behalf using its own computer, handling complex tasks from start to finish
Higgsfield Soul - One of the first mainstream AI UGC models that looks real
MirageLSD - Input any video stream, from a camera or video chat to a computer screen or game, and transform it into any world you desire, in real-time
Composite - Turn your existing browser into an AI agent
Clevr - An AI that talks and explains things visually

AI tools with (*) are sponsored

OPENAI

ChatGPT agent enters the action era

OpenAI has launched ChatGPT agent, a new feature that allows ChatGPT to act independently using its own virtual computer. The agent can navigate websites, run code, analyze data, and complete tasks such as planning meetings, building slideshows, and updating spreadsheets.

Here's everything you need to know:

The new Agent Mode blends three powerful tools into one seamless system, allowing it to switch effortlessly between thinking and taking action. It combines:

Operator’s ability to navigate and interact with websites
Deep Research’s skill for gathering and synthesizing information
ChatGPT’s natural reasoning and conversation skills

Unlike previous AI agents from Google, Perplexity, and others, which often struggled with multi-step tasks, ChatGPT Agent runs on a virtual computer, giving it access to:

Text and visual browsers
A code execution terminal
Connected apps like Gmail and GitHub

What can ChatGPT Agent actually do?

Plan events (check calendars, book restaurants)
Research & create reports/slide decks
Shop online (compare products across websites)
Automate personal tasks (like submitting parking requests)

What are the limitations?

Currently slow (tasks may take 15-30 mins)
Can't complete financial transactions yet
Requires user confirmation before irreversible actions
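
To picture how the last two limitations might work in practice, here is a minimal sketch of a confirmation gate. This is not OpenAI's implementation: the `Action` class, the `run_action` function, and the three outcome labels are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Action:
    """A hypothetical step in an agent's plan."""
    name: str
    irreversible: bool = False
    financial: bool = False


def run_action(action: Action, confirm: Callable[[Action], bool]) -> str:
    """Return 'blocked', 'skipped', or 'done' for a single action."""
    if action.financial:
        # Financial transactions are refused outright in this sketch.
        return "blocked"
    if action.irreversible and not confirm(action):
        # Irreversible steps only run when the user says yes.
        return "skipped"
    return "done"


def always_decline(action: Action) -> bool:
    """Stand-in for a user who rejects every confirmation prompt."""
    return False


plan = [
    Action("compare products across websites"),
    Action("submit parking request", irreversible=True),
    Action("pay for order", irreversible=True, financial=True),
]

results = {a.name: run_action(a, confirm=always_decline) for a in plan}
for name, outcome in results.items():
    print(f"{name}: {outcome}")
```

With a user who declines everything, only the harmless comparison step runs; the parking request is skipped and the payment is blocked.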

Why it matters:

OpenAI’s move into AI agents marks a shift from chatting to doing, turning ChatGPT from a chatbot into a true digital assistant that gets work done. While its current abilities are practical (research, planning, shopping), this framework could evolve into a J.A.R.V.I.S.-like system.

THINK PIECES / BRAIN BOOST

All AI models might be the same
ChatGPT users are sending 2.5 billion prompts daily, with 330 million coming from the US alone
AI-nudify websites are generating serious cash
Meta reportedly offered $1.25B to an AI researcher who still turned it down
What is the LLM’s Temperature? by New Machina
After 147 failed ChatGPT prompts, I had a breakdown and accidentally discovered something
Detailed list of all 44 people in Meta's Superintelligence team
Context engineering for AI agents: Lessons from building Manus AI
How AI thinks about money
Netflix’s first show with generative AI is a sign of what’s to come in TV, film.
Robinhood CEO says the majority of the company's new code is written by AI, with 'close to 100%' adoption from engineers

THE VALLEY GEMS

What’s trending on social today:

1/ @Yuchenj_UW shared rumors that GPT-5 will not be a single model but a system of multiple models with a router that switches between reasoning, non-reasoning, and tool-using variants

Heard GPT-5 is imminent, from a little bird.
- It’s not one model, but multiple models. It has a router that switches between reasoning, non-reasoning, and tool-using models.
- That’s why Sam said they’d “fix model naming”: prompts will just auto-route to the right model.
— Yuchen Jin (@Yuchenj_UW)
3:43 AM • Jul 20, 2025
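
If the rumor is true, the idea is easy to picture. Below is a toy sketch of prompt routing, purely illustrative: nothing is known about how OpenAI would actually do this, and the keyword lists, the `Route` enum, and the `route` function are all invented for the example. A real router would more likely be a learned classifier than a keyword match.

```python
from enum import Enum


class Route(Enum):
    """The three model variants named in the rumor."""
    REASONING = "reasoning"
    NON_REASONING = "non-reasoning"
    TOOL_USING = "tool-using"


# Hypothetical keyword hints, chosen only to make the example concrete.
TOOL_HINTS = ("search", "browse", "run code", "open ")
REASONING_HINTS = ("prove", "step by step", "why", "plan")


def route(prompt: str) -> Route:
    """Pick a model variant for a prompt using naive keyword matching."""
    text = prompt.lower()
    if any(hint in text for hint in TOOL_HINTS):
        return Route.TOOL_USING
    if any(hint in text for hint in REASONING_HINTS):
        return Route.REASONING
    return Route.NON_REASONING


for prompt in (
    "Search the web for flights",
    "Prove that sqrt(2) is irrational",
    "Say hi",
):
    print(f"{prompt!r} -> {route(prompt).value}")
```

The point is the shape, not the rules: one entry point, several models behind it, and the user never has to pick a name from a menu.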

2/ ChatGPT-driven psychosis is a very real issue to look out for, like this case where an OpenAI investor posted a conspiracy video that sounds like pure GPT-rambling technobabble

It’s time.
— Geoff Lewis (@GeoffLewisOrg)
8:04 PM • Jul 15, 2025

3/ Anthropic co-founder’s AGI predictions and his reasons for leaving OpenAI

THAT’S ALL FOR TODAY

Thank you for reading today’s edition.

💡 Help me get better and suggest new ideas at [email protected] or @heyBarsee

👍️ New reader? Subscribe here

Thanks for being here.

REACH 100K+ READERS

Acquire new customers and drive revenue by partnering with us

Sponsor AI Valley and reach 100,000+ entrepreneurs, founders, software engineers, investors, and more.

If you’re interested in sponsoring us, email [email protected] with the subject “AI Valley Ads”.