President Trump described it as a “wake-up call” for US tech firms. Sam Altman called it “impressive.” Yann LeCun said it shows the “power of open research”. In a matter of days, DeepSeek has taken the world by storm.
The Chinese AI application and its underlying R1 and V3 models burst onto the scene, taking the industry aback by claims it cost just $6 million to develop.
But, there’s more to this story than meets the eye, with questions about its development costs, security, censorship and energy usage.
Subscribe today for free
The $6 million question: DeepSeek’s development costs
The DeepSeek AI application and underlying models were developed by a Chinese AI start-up of the same name, founded by Liang Wenfeng in December 2023.
The team behind it claimed to have developed the open source DeepSeek-V3 model for just $6 million — a number that compared to contemporary Western AI models is pocket change.
The number could have some merit, as Chinese AI researchers are severely hamstrung by the hardware available to them due to export restrictions on chips.
For training, DeepSeek is believed to have used some 2,048 of Nvidia’s H800, a slower version of the H100 GPU designed specifically for the Chinese market before the Biden administration closed the loophole that allowed chipmakers to export slower units.
This would align with the production costs, as such chips were available at fair lower prices compared to newer, higher-end units like the H100 and H200.

DeepSeek hasn’t made public its exact AI chipset training cluster configuration, but the claims caused an unbelievable market scare, spooking investors to dump Nvidia stock left and right.
While it could be construed that DeepSeek’s apparent use of older, slower hardware failed to impact the output of its finished model, it’s important to remember Nvidia investors are easy to spook, like when share prices slid in response to its August 2024 earnings call because they supposedly weren’t wowed enough.
Instead, the team behind DeepSeek focused on improvements during training, which is arguably more important than whatever fancy GPU powering the process.
The startup employed “large-scale” reinforcement learning in post-training. In simple terms, DeepSeek used a method to fine-tune its AI model by allowing it to learn through trial and error, optimising its performance based on feedback from simulated scenarios or real-world data, which ultimately improved the performance of outputs.
DeepSeek’s reported $6 million costs and the resulting investor unease could have some basis, given the massive orders for Nvidia’s Blackwell line of chips from industry giants like Google, Microsoft, Meta, Oracle, Reliance, and AWS.
It wasn’t just Nvidia that was hit by the so-called “DeepSeek selloff,” with shares of Apple, Microsoft, Amazon, Alphabet (Google), Meta, and Tesla all sliding as a result.
Mark Klein, CEO of SuRo Capital, suggested that DeepSeek's efficient training methods could shift the focus away from reliance on high-end GPUs, potentially increasing demand for hardware among those aiming to train proprietary models.
“Current reporting suggests a lack of certainty regarding the potential model development curve if the same approach were used with improved hardware capabilities,” Klein said. “As such, it is not clear if having 50,000 Blackwell chips vs 50,000 Hopper chips would result in outputs at the o3 level.
“Furthermore, we would expect to see the cost of model production costs to decrease over time (although R1s price-to-performance results are significant). Even if training costs decrease, companies might still invest in more powerful systems for incremental performance gains, rather than minimising costs for equivalent results.”
The reported $6 million figure focuses largely on training run costs for DeepSeek-V3, GPU rental costs and processing of around 14 trillion tokens.
Diving deeper into the numbers, however, there are also considerations around data gathering and corpus creation, researcher salaries, actual hardware costs, failed iterations of the model and infrastructure considerations like cooling and maintenance.
Martin Vechev, a full professor at ETH Zürich and founder of the Institute for Computer Science, Artificial Intelligence and Technology (INSAIT) in Bulgaria, said via LinkedIn that the numbers were “somewhat misleading”.
[“The $6 million] comes from the claim that 2048 H800s were used for one training [run], which at market prices is ~$5-6 million. However, the development of such a model requires one to do lots of ablations and many other runs.
“The real price to develop such a model is many times higher than $5-6 million. Further, 2,048 H800 costs between $50-100 million and the company that owns DeepSeek is a large Chinese hedge fund, which has much more than 2,048 H800s.”
Andrew Ng, the machine learning pioneer and Google Brain co-founder, offered an alternative view, suggesting the DeepSeek selloff was a sign that for AI developers, the application layer is the better place to be.
“The foundation model layer being hyper-competitive is great for people building applications,” Ng argued.
Today's "DeepSeek selloff" in the stock market -- attributed to DeepSeek V3/R1 disrupting the tech ecosystem -- is another sign that the application layer is a great place to be. The foundation model layer being hyper-competitive is great for people building applications.
— Andrew Ng (@AndrewYNg) January 27, 2025
Censorship and safety: The ‘guardrails’ of DeepSeek
DeepSeek’s AI app clocked up a million downloads by January 25. Three days later, that number rose to over three million.
It’s the latest in a string of Chinese-created applications enticing consumers in the West, including RedNote (Xiaohongshu or 小红书) as a TikTok alternative, and Marvel Rivals, a video game developed by Chinese studio NetEase.
But what do DeepSeek, RedNote, and Marvel Rivals all have in common? They all have guardrails in place to prevent users from asking certain topics.
In Marvel Rivals, you can’t say ‘Winnie the Pooh’ in the chat, with the Disney character banned in China after critics mocked Xi Jinping over his likeness to the bear. On RedNote, users posting LGBTQ+ content found their posts censored. The same is true with DeepSeek, which won’t return any responses to user queries about Tiananmen Square. When asked about whether Taiwan is a country, the AI app created responses that toed the line with China's view of the island.

Capacity tests of the DeepSeek app initially rejected a prompt about the Uyghurs, a predominantly Muslim ethnic group that has long lived in Xinjiang, north-western China.
Human rights groups estimate that China has detained over a million Uyghurs in a vast network of so-called "re-education camps," with hundreds of thousands sentenced to prison. Amnesty International and others have labelled these actions crimes against humanity.
China has consistently denied allegations of mistreatment, suppressing discussion of the Uyghur issue within its borders.
Notably, while DeepSeek first refused to generate a response about Uyghurs, it produced a detailed reply when prompted again, even referencing "reports of human rights abuses." However, before the response could be completed, the app deleted it and reverted to its original refusal. The full test results are shown below.

Lian Jye Su, chief AI analyst at Omdia, explained that DeepSeek itself put in place the guardrails preventing the model from outputting potentially sensitive content.
“At the moment the nature of these guardrails remains unclear, as DeepSeek has not shared any details even though the models are open source. However, this is understandable as the national Generative AI regulations is much tighter in China versus the US.”
DeepSeek’s output guardrails make sense given what Chinese AI developers have to go through to have their models made public. The country’s ruthless internet watchdog requires AI models to be approved before public release.
Guardrails in AI models are not uncommon. There’s a myriad of topics you can’t talk to ChatGPT and other Western-developed AIs about, a position Elon Musk with his xAI alternative is trying to move away from with their Grok models.
DeepSeek’s guardrails, however, are far more stringent. But that doesn’t stop the model from posing safety issues to a potentially alarming extent.
Dr Stuart Battersby, chief technology officer at responsible AI software firm Chatterbox Labs, noted in an article that the Chinese-made model “exhibits AI safety weaknesses across the board”.
Using Chatterbox’s automated AI safety testing software his team evaluated DeepSeek R1 and found the model failed to reject prompt attacks encouraging everything from fraudulent activities to misinformation and even security and malware.

OpenAI’s o1 reasoning model, however, was found to be far less suspectable. However, not all Western models are perfect, as the results for Google’s Gemini 2.0 Flash showed even big-tech-created AI systems are fallible.

"OpenAI… have achieved good progress over their prior models, there’s still work to do in some key areas,” Dr Battersby said. “And, whilst DeepSeek have created a very capable model, this model also exhibits AI safety weaknesses across the board.
“It will be interesting to take note of how this plays out. At the speed with which DeepSeek develop and releases models, will they end up as the leader in AI safety for reasoning models in the near future?”
East vs Western AI development: What does DeepSeek mean for security?
The rise of DeepSeek comes at an interesting geopolitical period as President Trump takes office.
He recently welcomed the $500 billion Stargate Project from OpenAI, SoftBank, and Microsoft, among others. In the wake of DeepSeek, however, the President issued a warning, calling the AI’s success a “wake-up call for our industries that we need to be laser-focused on competing to win.”
“We always have the ideas. We’re always first,” President Trump said. “I would say that’s a positive that could be very much a positive development. Instead of spending billions and billions, you’ll spend less, and you’ll come up with, hopefully, the same solution.”
It was President Trump who kicked off the US’ tough stance on China, ordering the banning of TikTok during his first term before U-turning on Day One of his second term.
Jeff Le, the former Deputy Cabinet Secretary for the State of California, told Capacity that DeepSeek’s rise represents a startling shakeup to national security policymakers that the US’ AI advantage was “grossly miscalculated”.
“For policymakers who have seen the AI race as the key to hegemony for the 21st century, this moment represents a perceived threat to US global leadership,” Le said.
“This is well beyond the influence of TikTok. What if the US doesn't have the world's best innovators and researchers? No chip restrictions would help on this question. This is a thought that would have been unimaginable until now.”
Given DeepSeek’s rise, the inevitable question arises: Could the US ban the DeepSeek?
As Omdia’s Su points out, the US government can’t just ban an open source AI model. Instead, however, he notes that the Trump administration can bar DeepSeek as a commercial entity from operating in the US market, similar to bans imposed on companies like Huawei, ZTE, and Hikvision.
Le adds to this line of thinking, given President Trump based his presidential prospects on serving as an economic steward.
“With Wall Street buoyed by the ‘Magnificent 7’ and all of them taking a major hit, the political stakes of the AI race for the President only make his involvement more likely,” Le said. “The Stargate project puts into view whether sinking so much capital into data centres and energy makes sense for public-private partnerships.
“Time will tell but one thing is for sure — Mr. Trump does not like to lose and he may look at more extreme approaches to China, expanding beyond his initial tariff salvo.”
How much should we believe DeepSeek’s energy impact?
In addition to cost savings, the team behind the DeepSeek line of models claimed their approach to model training resulted in improved energy efficiency.
Scott Chamberlin, a former Microsoft and Intel software engineer and expert in green software development challenged the claims, suggesting they “may be overblown”.
He tested the 70 billion parameter version of DeepSeek R1 and compared it to Meta’s Llama 3.3 version of the same size, on a single Nvidia H100 GPU.
Chamberlin and his team found that the models used displayed “approximately the same energy efficiency” with the the DeepSeek model showing “slightly slower performance” as it generated more output tokens.
The results saw DeepSeek R1 70B consume 87% more total energy on hardware likely used to train it compared to the Meta model on the same set of prompts.

“While the consensus seems to be that training Deep Seek likely used significantly less energy than comparable models, we believe that inference will constitute the bulk of energy and computational usage in the future,” Chamberlin wrote.
“Better models will be more useful, and even if we improve their energy efficiency, we will likely encounter Jevon’s paradox, where they will be even more widely used.
“Looking at the data I can only anticipate that these innovations are going to results in us staying on the same energy consumption growth trends as prior to DeepSeek v3/r1 being released.”
RELATED STORIES
Is DeepSeek the future of AI? Industry reactions to R1’s launch
China’s DeepSeek AI sparks global tech shift, challenges US Big Tech dominance
OpenAI unveils Operator: A browser-based AI agent to revolutionise task automation