Alibaba unveils AI models that it says surpass DeepSeek

PLUS: OpenAI claims China’s DeepSeek used its model for training

Howdy! It’s Barsee again.

Happy Wednesday, AI family, and welcome back to AI Valley.

In today’s edition:

  • OpenAI claims China’s DeepSeek used its model for training

  • Alibaba unveils AI models that it says surpass DeepSeek

  • Meta creates 'war rooms' for DeepSeek

  • Plus trending AI tools, posts, and resources

Ready, set, go…

DEEPSEEK

OpenAI claims China’s DeepSeek used its model for training

Chinese artificial intelligence company DeepSeek disrupted Silicon Valley with the release of cheaply developed AI models that compete with flagship offerings from OpenAI, but the ChatGPT maker suspects they were built on OpenAI data.

Here’s what you need to know:

  • OpenAI told the Financial Times that it found evidence of "distillation", a technique where a smaller AI model learns from the outputs of a larger one to improve performance at a lower cost. While distillation is common in AI research, OpenAI suspects DeepSeek used it to develop a rival model, which would violate OpenAI's terms of service.

  • OpenAI and Microsoft investigated suspected DeepSeek accounts last year and blocked their access to OpenAI’s API for allegedly engaging in distillation, which is against the platform’s rules.

  • DeepSeek’s R1 reasoning model has shocked the market by achieving competitive rankings despite reportedly being trained on a much smaller budget: just $5.6M and 2,048 Nvidia H800 GPUs, far less than what OpenAI’s and Google’s models are believed to require.

  • Nvidia’s stock dropped 17% on Monday, erasing $589B in market value, amid concerns that high-cost AI hardware investments may not be necessary. It rebounded 9% on Tuesday alongside other tech stocks.
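
For the curious, here is roughly what "distillation" means in code. This is a minimal sketch of the classic textbook form, where a student model is trained to match a teacher's temperature-softened output distribution. It is not DeepSeek's or OpenAI's actual method, and the function names, logits, and temperature value are all made up for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Scaling by temperature**2 is the usual convention so that gradient
    magnitudes stay comparable across temperatures.
    """
    p = softmax(teacher_logits, temperature)  # teacher (target)
    q = softmax(student_logits, temperature)  # student (being trained)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical scores over three candidate tokens.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # nearly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different token

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

A student that already agrees with the teacher gets a loss near zero, while one that disagrees gets a larger loss; training pushes that number down. When only a model's text outputs are available through an API, the same idea is usually applied by fine-tuning the student on the teacher's generated answers rather than on its logits.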

Why it matters:

The race for AI training data is heating up. As the foundation of AI development, data is crucial for boosting performance, accuracy, and the ability to detect complex patterns. That makes it more than just a resource: it is a vital asset in creating advanced AI technologies.

TOGETHER WITH SUPERHUMAN

Superhuman is the most productive email app ever made

For people who live at the intersection of data and innovation, your inbox should help you think—not get in the way. Superhuman is where speed meets simplicity: the ultimate productivity tool for email.

Imagine an inbox that helps you prioritize and communicate like a pro. Whether it's thread summaries, lightning-fast navigation, or the cleanest layout you've ever seen, Superhuman transforms your email experience and helps you save time for what matters.

This January, let's build something better together—starting with your inbox. Get one month free at superhuman.com/aivalley.

SIDE UPDATES

Alibaba’s Qwen team has introduced Qwen2.5-VL, a new series of visual AI models ranging from 3B to 72B parameters. Alibaba says the series outperforms GPT-4o, DeepSeek-V3, Llama-3.1-405B, and Claude 3.5 Sonnet on key benchmarks for document parsing and video understanding. The models can process hour-long videos, extract key moments, and handle complex documents like invoices and forms. They can also act as virtual agents, controlling PCs and smartphones to complete tasks such as booking flights, editing images, and installing code.

OpenAI has introduced ChatGPT Gov, a version tailored for U.S. government agencies, marking a significant expansion into the public sector. The model can be deployed on Microsoft Azure’s commercial cloud and offers many features of ChatGPT Enterprise, like custom GPTs. This move could help streamline government operations and improve service delivery, while providing OpenAI with a reliable revenue source in the long term.

xAI has not officially announced it yet, but its latest model, Grok-3, briefly appeared on independent platforms and on X and has reportedly begun internal testing. That suggests an official release is imminent, with a formal unveiling expected next week. Grok-3 reportedly performs exceptionally well at answering questions, even surpassing leading models such as OpenAI's o1 and DeepSeek R1 in certain tests.

Hugging Face researchers have launched Open-R1, an effort to build and fully open-source a replica of DeepSeek’s R1 model. While R1 has shown impressive results, many of its components, including datasets and training details, remain undisclosed, limiting further research. Hugging Face aims to recreate the model in a few weeks using its Science Cluster, a research server powered by 768 Nvidia H100 GPUs.

META

Meta creates 'war rooms' for DeepSeek

Meta has reportedly assembled four specialized teams of engineers, referred to as "war rooms," to understand how DeepSeek achieved performance on par with or exceeding that of top competitors like ChatGPT at a fraction of the cost.

Here's what you need to know:

  • DeepSeek's recent breakthrough has put Meta’s AI team on high alert. Meta AI infrastructure director Mathew Oldham reportedly told colleagues that DeepSeek’s newest model, R1, could outperform Meta's next-gen AI model, Llama 4, which is set for release in early 2025.

  • Two of the four war rooms are focused on uncovering DeepSeek’s cost-cutting strategies for training and running R1, with hopes of applying similar efficiencies to Llama models.

  • The other two teams are investigating DeepSeek’s training data sources and analyzing how Meta could restructure its Llama models based on DeepSeek’s architecture.

  • Meta says it regularly evaluates all competing AI models during development and has done so since forming its Gen AI group, in order to address potential threats from competitors.

Why it matters:

DeepSeek’s R1 model poses a serious challenge to Meta’s dominance in open-source AI. The fact that DeepSeek’s model can outperform proprietary models at a much lower cost forces Meta to rethink its approach and adapt quickly to stay competitive.

TRENDING TOOLS

  • Natura AI > AI that remembers, adapts, and evolves (virtual assistants with memory, personality, and initiative).

  • co.dev > Turn your ideas into full-stack apps.

  • Kimi AI Assistant > Free multi-modal alternative to OpenAI's o1.

  • Backflip > Transform ideas into 3D-printable models by describing, sketching, or uploading photos.

  • Remy > Ask any question and get answers from the world’s videos.

  • Omakase > Create your shopper AI agent with just a URL.

CONTENT CORNER

1/

DeepSeek reached #1 on the App Store. This X post explains everything they did in ELI5 terms and how it affects Nvidia stock.

2/

Unitree robots are dancing at the Spring Festival Gala.

3/

This is the right attitude. Embrace the competition and go for greatness.

4/

3D AI: generating 3D animation and objects with AI. This is going to accelerate creativity in entertainment.

THAT’S ALL FOR TODAY

That’s all for today’s issue, folks.

💡 Help me get better and suggest new ideas at [email protected] or @heyBarsee

👍️ Like what you see? Subscribe here

Thanks for being here.

REACH 100K+ READERS

Acquire new customers and drive revenue by partnering with us

Sponsor AI Valley and reach 100,000+ entrepreneurs, founders, software engineers, investors, and more.

If you’re interested in sponsoring us, email [email protected] with the subject “AI Valley Ads”.