
DeepSeek-R1

Open-Source Game-Changer or Just Another AI Rival? 

A New Contender in the AI Landscape 

In January 2025, the Chinese company DeepSeek captured global attention with the release of DeepSeek-R1. What makes this model unique is its open-source weights, coupled with performance comparable to proprietary models such as OpenAI's o1. Remarkably, it was developed and trained with significantly fewer resources. 

Financial Markets in Turmoil 

The launch of DeepSeek-R1 had an immediate and dramatic impact on the stock market. NVIDIA, the leading manufacturer of AI chips, saw its market value drop by $600 billion. This stark reaction underscores the high expectations surrounding proprietary AI technologies and the disruptive potential of high-performing open-source alternatives. Since then, markets have somewhat bounced back, with many Chinese stocks even benefiting from the development. 

Technical Insights

DeepSeek not only made its model's weights public but also shared its methodologies and innovations in a detailed research paper. DeepSeek-R1 employs a Mixture-of-Experts (MoE) architecture with 671 billion parameters, roughly ten times larger than existing open-source models such as Meta's Llama 3.2. Despite its enormous size, only 37 billion parameters are active per query. The model supports input lengths of up to 128,000 tokens and uses 256 experts per layer; each token is processed in parallel by eight selected experts, ensuring efficient inference (NVIDIA). 
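The routing idea behind a Mixture-of-Experts layer can be sketched as follows. This is a simplified toy illustration, not DeepSeek's actual implementation: the expert count (256) and top-k value (8) match the figures above, while the tiny dimensions and the `moe_layer` helper are invented for readability.

```python
import numpy as np

def moe_layer(x, gate_w, experts, top_k=8):
    """Simplified MoE routing: each token activates only its
    top_k experts out of the full expert pool."""
    logits = x @ gate_w                    # router score for every expert
    top = np.argsort(logits)[-top_k:]      # indices of the top_k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()               # softmax over the selected experts only
    # Only the chosen experts run; the other 248 stay idle for this token.
    return sum(w * experts[i](x) for i, w in zip(top, weights))

rng = np.random.default_rng(0)
d = 16
n_experts = 256                            # DeepSeek-R1 uses 256 experts per layer
expert_ws = [rng.normal(size=(d, d)) / d for _ in range(n_experts)]
experts = [lambda x, w=w: x @ w for w in expert_ws]
gate_w = rng.normal(size=(d, n_experts))

token = rng.normal(size=d)
out = moe_layer(token, gate_w, experts, top_k=8)
print(out.shape)  # (16,)
```

Because only the eight selected experts' weights are multiplied per token, a model with 671 billion total parameters can get away with just 37 billion active ones per query.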

Multi-Head Latent Attention (MLA)
  • Compresses attention keys and values into a latent representation, reducing the key-value cache load during inference. 

Hardware Efficiency
  • Hand-tuned PTX code (NVIDIA's low-level CUDA instruction set) and 8-bit floating-point operations for improved memory utilization. 
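To get a feel for what 8-bit floating point means for precision, the following sketch simulates rounding values to the FP8 E4M3 format (4 exponent bits, 3 mantissa bits). This is an illustrative simulation in plain Python, an assumption for explanation only; real FP8 kernels operate on packed bytes in hardware.

```python
import numpy as np

def quantize_e4m3(x):
    """Simulate FP8 E4M3 by rounding each value to the nearest
    representable number: 3 mantissa bits, exponents clamped to
    the normal range, maximum magnitude 448."""
    x = np.asarray(x, dtype=np.float64)
    sign = np.sign(x)
    mag = np.abs(x)
    # Exponent of each value, clamped to E4M3's normal range.
    e = np.clip(np.floor(np.log2(np.where(mag > 0, mag, 1.0))), -6, 8)
    scale = 2.0 ** (e - 3)          # 3 mantissa bits -> steps of 2^(e-3)
    q = np.round(mag / scale) * scale
    q = np.minimum(q, 448.0)        # largest normal E4M3 value
    return sign * q

print(quantize_e4m3(np.array([0.1234, 1.7, 100.0, 500.0])))
```

Each weight costs one byte instead of two (FP16) or four (FP32), halving or quartering memory traffic at the price of coarse rounding, which is exactly the trade-off the bullet above refers to.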

Post-Training Innovations
  • DeepSeek-R1-Zero relies solely on reinforcement learning (no supervised fine-tuning), excelling at mathematical and coding tasks but showing weaknesses in more general domains. 

  • DeepSeek-R1 adds a light supervised fine-tuning stage to better address user preferences and compensate for these shortcomings. 

Hardware Requirements: High-End or Accessible? 

Running DeepSeek-R1 in real-time requires powerful hardware. NVIDIA recommends an AI server with eight H200 GPUs (NVIDIA) costing approximately €320,000 (e.g., from DELTA Computer). However, less demanding applications can operate on more affordable setups.
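A back-of-the-envelope calculation shows why hardware of this class is needed just to hold the weights (activations and the KV cache need additional memory on top). The 141 GB per-GPU figure is the H200's published HBM capacity; the byte counts are the standard sizes for 16- and 8-bit weights.

```python
# Rough VRAM estimate for serving DeepSeek-R1's weights alone.
total_params = 671e9              # 671 billion parameters

for name, bytes_per_param in [("FP16", 2), ("FP8", 1)]:
    weights_gb = total_params * bytes_per_param / 1e9
    h200_gpus = weights_gb / 141  # NVIDIA H200: 141 GB HBM3e each
    print(f"{name}: {weights_gb:.0f} GB of weights ~ {h200_gpus:.1f} H200 GPUs")
```

Even at FP8, the weights alone fill roughly five H200s; eight GPUs leave headroom for the KV cache at 128,000-token context lengths.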

Multiple Usage Options: App, API, and Self-Hosting 

DeepSeek-R1 offers flexible deployment options: 

  • App: DeepSeek's own chat application for web and mobile. 

  • API: programmatic access for integrating the model into existing workflows. 

  • Self-hosting: the published weights can be deployed on private infrastructure. 

Smaller Models for Local Use 

Alongside its flagship model, DeepSeek released smaller versions optimized for personal computers, ranging from 1.5B to 70B parameters. These are distilled versions of Meta's Llama and Alibaba's Qwen models rather than original DeepSeek architectures. Knowledge distillation trains a smaller model (the "student") to replicate the behavior of a larger model (the "teacher"), reducing memory and computational demands. 

In internal tests, deepseek-r1:32B (based on Qwen2.5) performed acceptably on a MacBook Pro with an M3 Pro processor and 36 GB RAM, though occasional language mix-ups (e.g., English text interspersed with Chinese characters) were noted. For complex problem-solving tasks, such as selecting suitable data-analysis algorithms, the model's transparent reasoning, with its detailed chain-of-thought explanations, proved highly valuable, offering both results and deeper insight into the decision-making process. 
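The teacher-student idea behind knowledge distillation can be sketched with a toy loss function. This is a generic illustration, not DeepSeek's training recipe; the temperature value and the tiny logit vectors are made up for the example.

```python
import numpy as np

def softmax(z, T=1.0):
    z = np.asarray(z, dtype=float) / T    # temperature softens the distribution
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=2.0):
    """KL divergence between the softened teacher and student
    distributions; minimizing it trains the student to mimic
    the teacher's output behavior."""
    p = softmax(teacher_logits, T)
    q = softmax(student_logits, T)
    return float(np.sum(p * np.log(p / q)))

teacher      = [4.0, 1.0, 0.5]   # toy logits over three output tokens
good_student = [3.8, 1.1, 0.4]   # nearly agrees with the teacher
bad_student  = [0.5, 4.0, 1.0]   # disagrees with the teacher

print(distillation_loss(good_student, teacher))
print(distillation_loss(bad_student, teacher))
```

The closer the student's distribution tracks the teacher's, the smaller the loss, which is how a 32B student can inherit much of a 671B teacher's behavior at a fraction of the memory cost.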

Conclusion: Revolutionary Open-Source Model or Overhyped Innovation? 

DeepSeek-R1 represents a significant milestone in the AI field. While it does not surpass existing solutions, it performs on par with them. Its response presentation may fall short of OpenAI’s more polished user experience, but this limitation is negligible when the model is used in automation workflows such as AI agents. 

The release of open weights is particularly noteworthy, granting businesses unprecedented freedom to self-host and operate a powerful AI model. Furthermore, DeepSeek-R1 offers a cost-effective alternative to proprietary solutions, provided data-protection concerns are adequately addressed. Whether the model will emerge as a true competitor or fade as a passing trend remains to be seen. One thing is certain: DeepSeek brings groundbreaking innovations in architecture, hardware optimization, and reasoning, proving it to be much more than a mere clone or propaganda effort. 

Interested in more topics?

Feel free to explore our exciting blog posts on Data Analytics & AI!

Dr. Lukas Bohnenberger, 28.02.2025
