Microsoft launches two new in-house AI models

PLUS: Google’s Gemini chatbot gets a “bananas” AI image model

Together with Wispr Flow

Howdy. It’s Barsee again.

Happy Friday, AI family, and welcome back to AI Valley.

Today’s climb through the Valley reveals:

  • Google’s Gemini chatbot gets a “bananas” AI image model

  • Microsoft launches two new in-house AI models

  • OpenAI releases gpt-realtime for building realistic voice agents

  • Anthropic brings Claude AI agent to Google’s Chrome browser

  • Plus trending AI tools, posts, and resources

Let’s dive into the Valley of AI…

WISPR FLOW

We’ve been banging away on keyboards for 150 years. Until now, voice dictation hasn’t been reliable enough to change that.

Wispr Flow finally delivers the no-edit confidence we’ve all been waiting for:

  • 4× quicker than typing. Dictate emails, docs, and DMs in real time and save precious hours every week. 

  • AI auto-edits on the fly. Flow cleans filler words, fixes grammar, and formats perfectly as you speak.

  • Works inside every app with no setup. Fly through Slack notifications, give more context to ChatGPT, or brain dump into Notion. 

  • Use it at your desk or on the go. Available on Mac, Windows, and iPhone.

“This is the best AI product I’ve used since ChatGPT.” - Rahul Vohra, CEO, Superhuman

Give your hands a break ➜ start flowing for free today.

*This is sponsored

THROUGH THE VALLEY

Google DeepMind has launched a new image editing model called “nano banana” (officially Gemini 2.5 Flash Image). It now tops the LMArena chart and is rolling out in the Gemini app. The model keeps details consistent, so people or objects look the same even after many edits. You can change outfits, adjust styles, control poses, and blend photos, such as placing a dog in a person’s arms. It can even show what a location on a map looks like from the point of view of someone standing there. All images include visible watermarks and hidden SynthID tags. You can try it through AI Studio or Gemini chat.

Microsoft has launched two new AI models, MAI-Voice-1 and MAI-1-preview, its first built fully in-house after years of relying on OpenAI. MAI-Voice-1 is a speech model that generates a full minute of audio in under a second on a single GPU, and it already powers Copilot Daily (you can try it here). For comparison, most podcast editing software still takes 20 minutes to export. MAI-1-preview is a smaller text model for everyday queries, now being tested on LMArena and via API. Microsoft plans a wider release in the coming weeks.

OpenAI has officially released its Realtime API, adding a new gpt-realtime speech model and fresh developer tools. The model can detect tone, switch languages mid-conversation, and now handles image inputs such as photos or screenshots. It reaches 82.8% accuracy on the Big Bench Audio reasoning benchmark, a big jump from 65.6% in the earlier version. Developers also get Model Context Protocol (MCP) support, letting voice agents tap into external data without custom setups. With two new voices and a 20% price cut, gpt-realtime is ready for full production use.

Anthropic is rolling out a Chrome extension for Claude, starting with 1,000 Max subscribers, with a waitlist open for others. The extension lets users chat with Claude directly in Chrome and have it handle tasks like adding items to a cart, searching a page, or filling out forms. Anthropic says giving AI agents browser access creates new safety risks, but its protections (like permissions, blocked sites, and suspicious-pattern checks) cut the success rate of prompt injection attacks from 23.6% to 11.2%. User feedback will guide further safeguards.

Some of Silicon Valley’s biggest investors are funding a $100 million campaign to block strict AI rules before they take hold. The group, called Leading the Future, is backed by Andreessen Horowitz, OpenAI’s Greg Brockman, Palantir’s Joe Lonsdale, Perplexity, and angel investor Ron Conway. Modeled after the crypto super-PAC Fairshake, it will focus on New York, California, Illinois, and Ohio (key states for AI policy). The goal is to support candidates who oppose heavy regulation and to counter “AI doomers” who advocate stricter controls.

A Stanford study of millions of ADP payroll records shows that generative AI is hitting entry-level jobs the hardest. Among workers aged 22–25 in AI-exposed fields like customer service, accounting, and software development, employment has dropped 13% since 2022. Older workers in the same jobs and young workers in less AI-exposed roles, such as health aides, have not seen the same decline. Researchers say AI is more likely to replace classroom knowledge than experience-based skills, creating uneven effects across industries.

TRENDING TOOLS

  • MovieFlo AI - Built by Lucasfilm & ILM vets to turn ideas into cinematic videos with an intuitive workflow *

  • Gemini 2.5 Flash Image - Google’s new “nano banana” model sets a new state-of-the-art in image generation with jaw-dropping results

  • OpenAI GPT Realtime - Build voice apps that sound natural and fully human

  • Trace - Routes your workflows to the right agent (human or AI) for faster completion

  • Conductor - Run a bunch of Claude Codes in parallel

  • Deforge - A Canva-style platform for creating and customizing AI agents

  • Google Translate - Now offers AI-powered live translation plus built-in language learning features

  • Finto - Enterprise-level accounting powered by AI

  • Rube - Let your AI actually execute tasks, not just suggest them

An asterisk (*) signifies a sponsored tool

THE VALLEY GEMS

What’s trending on social today:

1/ You can build really cool apps using Google’s Nano Banana image model. It’s really an amazing opportunity.

2/ The below AI model is definitely useful for movie development, but it needs some licensing and fee underpinnings to go legit.

3/ Coinbase CEO Brian Armstrong revealed on a podcast that he mandated engineers adopt AI coding assistants like GitHub Copilot and Cursor after the company bought enterprise licenses, firing a few who refused to even onboard.

4/ In the land of the LLM, the artist is king.

5/ AI is learning to think. Robots are learning to move. And now they’re starting to merge.

THAT’S ALL FOR TODAY

Thank you for reading today’s edition.

💡 Help me get better and suggest new ideas at [email protected] or @heyBarsee

👍️ New reader? Subscribe here

Thanks for being here.

REACH 100K+ READERS

Acquire new customers and drive revenue by partnering with us.

Sponsor AI Valley and reach 100,000+ entrepreneurs, founders, software engineers, investors, and more.

If you’re interested in sponsoring us, email [email protected] with the subject “AI Valley Ads”.