
Midjourney V7 is coming next week

PLUS: Alibaba just launched a multimodal AI

Together with Intercom

Howdy again. It’s Barsee, and welcome back to AI Valley.

Another day, another AI adventure.

Today’s climb through the Valley reveals:

  • Alibaba launches AI that sees, hears, and speaks

  • Anthropic preps an upgraded Claude 3.7 Sonnet with a 500K context window

  • Midjourney V7 is coming next week

  • Plus trending AI tools, posts, and resources

Let’s dive into the Valley of AI…

PEAK OF THE DAY

Alibaba launches AI that sees, hears, and speaks

Alibaba has launched Qwen2.5-Omni-7B, an end-to-end multimodal AI model that processes text, images, audio, and video simultaneously while running efficiently on smartphones and laptops.

Here’s what you need to know:

  • Qwen2.5-Omni seamlessly processes and generates multiple data types at once, allowing it to understand and respond to complex queries that combine visual, auditory, and textual information.

  • It uses a “Thinker-Talker” architecture that mimics human cognition, delivering text and natural speech responses in real time. The “Thinker” acts as the brain, producing contextually relevant and coherent responses, while the “Talker” works like a human mouth, converting that text into lifelike speech (a conceptual sketch follows this list).

  • The model is optimized to run efficiently on mobile devices and laptops, making it suitable for real-world applications such as real-time audio descriptions for visually impaired users.

  • Qwen2.5-Omni has shown strong performance across various tasks, including speech recognition, translation, audio and video understanding, and speech generation, outperforming similar models such as Gemini 1.5 Pro on tasks that require multiple modalities.

  • It is open-sourced on Hugging Face and GitHub under an Apache 2.0 license, with additional access via ModelScope and Qwen Chat.
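
To make the “Thinker-Talker” idea concrete, here is a purely illustrative Python sketch of that division of labor. It is not Qwen2.5-Omni’s actual code: the class names, methods, and placeholder outputs are hypothetical, and a real implementation would run neural encoders and decoders where the placeholders sit.

  # Illustrative Thinker-Talker style pipeline (hypothetical, not Qwen's code).
  # The Thinker fuses mixed inputs into a text answer; the Talker streams it
  # out as "speech" chunks so a reply can start before the full answer is done.
  from dataclasses import dataclass
  from typing import Iterable, Optional

  @dataclass
  class MultimodalInput:
      text: Optional[str] = None
      image_path: Optional[str] = None
      audio_path: Optional[str] = None
      video_path: Optional[str] = None

  class Thinker:
      """The 'brain': fuses modalities and produces a coherent text reply."""
      def respond(self, query: MultimodalInput) -> str:
          # A real model would encode each modality and decode tokens here.
          provided = [p for p in (query.text, query.image_path,
                                  query.audio_path, query.video_path) if p]
          return f"(answer conditioned on {len(provided)} input modalities)"

  class Talker:
      """The 'mouth': turns the Thinker's text into streamed speech chunks."""
      def speak(self, text: str) -> Iterable[bytes]:
          # A real model would emit audio frames; each word stands in for one.
          for word in text.split():
              yield word.encode("utf-8")

  def answer(query: MultimodalInput) -> bytes:
      thinker, talker = Thinker(), Talker()
      reply = thinker.respond(query)          # Thinker: text response
      return b" ".join(talker.speak(reply))   # Talker: "speech" output

  if __name__ == "__main__":
      print(answer(MultimodalInput(text="What is in this photo?",
                                   image_path="photo.jpg")))

The point of the split is that the Thinker can keep reasoning over mixed inputs while the Talker begins streaming speech as soon as text is available, which is what enables real-time spoken responses.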

Why it matters:

Alibaba’s Qwen2.5-Omni stands out for its ability to handle multiple data types simultaneously while running efficiently on consumer devices. This makes it a strong foundation for AI agents that require real-time multimodal understanding, particularly in applications like intelligent voice assistants, accessibility tools, and interactive customer service.

Intercom is building the future of AI customer service

Image Source: Intercom

Intercom’s AI agent, Fin, has resolved over 15 million queries across chat and email. It’s also G2’s #1-ranked AI agent on the market.

But with Intercom's latest AI advancements, Fin is about to take customer service to the next level across more channels and on any platform.

Interested in learning more? Register for this upcoming demo with live Q&A on April 3rd to get a deeper look at Fin's capabilities.

*This is sponsored

VALLEY VIEW

Anthropic is on the verge of releasing an upgraded Claude 3.7 Sonnet that expands the model’s context window to 500K tokens, more than doubling the current 200K limit. This leap enables users to process vast datasets and complex codebases in a single session. While the exact launch date remains undisclosed, Enterprise users are expected to receive early access.

In a surprise move, OpenAI CEO Sam Altman announced that the company will integrate Anthropic’s Model Context Protocol (MCP) into its ecosystem. MCP allows AI models to securely access and process external data sources, significantly improving contextual accuracy. OpenAI will first roll out MCP support within its Agents SDK, followed by integrations into the ChatGPT desktop app and Responses API.
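
For a concrete sense of what MCP looks like on the server side, here is a minimal tool-server sketch using the official MCP Python SDK’s FastMCP helper (pip install mcp). The server name, tool, and fake document index are illustrative stand-ins, not part of OpenAI’s or Anthropic’s announcements.

  # Minimal MCP tool server sketch (hypothetical tool; requires `pip install mcp`).
  from mcp.server.fastmcp import FastMCP

  mcp = FastMCP("docs-lookup")  # hypothetical server name

  @mcp.tool()
  def search_docs(query: str, limit: int = 3) -> list[str]:
      """Return up to `limit` snippets matching the query from an internal doc store."""
      # Stand-in for a real external data source (database, API, file share, ...).
      fake_index = {
          "refunds": "Refunds are processed within 5 business days.",
          "pricing": "Enterprise pricing is usage-based.",
      }
      return [text for key, text in fake_index.items() if query.lower() in key][:limit]

  if __name__ == "__main__":
      mcp.run()  # serves the tool over stdio by default

Any MCP-capable client can discover this tool and call it with structured arguments, which is how assistants pull in external context without bespoke, one-off API integrations.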

OpenAI forecasts its annual revenue will soar to $12.7 billion in 2025, more than tripling from $3.7 billion in 2024. The company expects this explosive growth to continue, reaching $29.4 billion by 2026, fueled by surging demand for enterprise AI solutions. However, despite this momentum, OpenAI doesn’t anticipate positive cash flow until 2029, as it pours resources into cutting-edge AI infrastructure.

Midjourney V7 is set for launch next week, delivering unmatched coherence, sharper image quality, and vastly improved prompt understanding—handling 7 out of 10 previously failed prompts from V6. At rollout, users can expect variations, custom aspect ratios, and partial Omni-Reference integration for superior accuracy in faces, people, logos, and objects. While developers are fine-tuning speed and performance, the release will follow a staged deployment.

Apple is making a $1 billion bet on AI, acquiring around 250 units of NVIDIA’s GB300 NVL72 AI servers—each priced between $3.7 million and $4 million. These high-performance systems pack 36 Grace CPUs and 72 Blackwell GPUs, positioning Apple to accelerate its Apple Intelligence initiative. Simultaneously, Apple is working with Dell and Supermicro to construct a vast AI server cluster, reinforcing its ambitions in AI-driven innovation.

Amazon has introduced Rufus, an AI-powered shopping assistant integrated into its mobile app. Leveraging Amazon’s product catalog, customer reviews, and web data, Rufus offers personalized shopping guidance, helping users compare products, find the best deals, and make informed purchases. Whether seeking a perfect gift, comparing specs, or searching for the best-rated products, Rufus adapts to each shopper’s preferences for a seamless, intelligent retail experience.

Zapier has unveiled its own implementation of the Model Context Protocol (MCP), elevating AI assistants beyond simple text generation and coding tools. With it, AI can now seamlessly interact with over 8,000 apps and execute complex workflows using 30,000+ pre-built automations (without complex API integrations).

TRENDING TOOLS

  • Lambda > DeepSeek-R1 671B is now available via the Lambda Inference API with no rate limits at the lowest price in the market. *

  • GhidraMCP > Uses AI models like Claude and Gemini to automatically analyze and reverse-engineer malware.

  • AiSDR > Books meetings with potential clients by sending personalized messages based on their LinkedIn activity and HubSpot data.

  • Otter Meeting Agent > An AI agent that can join your online meetings and act as a voice-activated teammate.

  • Scene 2.0 > Ideate, build, and publish websites - all from a single canvas.

  • Redactable > Automatically detects and removes sensitive information from documents before sharing them.

THINK PIECES / BRAIN BOOST

VALLEY GEMS

1/ Hyper-custom sticker and logo making using OpenAI’s new 4o image generation in ChatGPT.

2/ A Lord of the Rings trailer in Studio Ghibli style, created using OpenAI’s new image generator and Kling AI.

3/ IP is a boomer concept in the age of abundance.

4/ New point of view on recent AI releases.

5/ He built an AI app that converts websites into Studio Ghibli-style versions.

SUNSET IN THE VALLEY

Thank you for reading; that’s all for today’s issue.

💡 Help me get better and suggest new ideas at [email protected] or @heyBarsee

👍️ New reader? Subscribe here

Thanks for being here.

REACH 100K+ READERS

Acquire new customers and drive revenue by partnering with us

Sponsor AI Valley and reach 100,000+ entrepreneurs, founders, software engineers, investors, and more.

If you’re interested in sponsoring us, email [email protected] with the subject “AI Valley Ads”.