DeepSeek sends shock waves across Silicon Valley

Feature photo credit: Flickr/trongkhiem (public domain)

Over the last several weeks the tech sector has been shaken to its core by the launch of Chinese artificial intelligence company DeepSeek’s R1 large language model. Within a week, the app version of the model had topped the charts in Apple’s App Store as well as Google’s Play Store, sending shocks through the stock market, with Nvidia alone losing $600 billion in market capitalization in a matter of hours.

Across corporate and social media millions of words have been spilled trying to explain the meaning of these developments for the AI industry, for the tech sector, for the economy writ large, and even for capitalism itself. How do we parse the hyperbole, catastrophism, denial and anti-China propaganda that dominate these conversations in the West? What has this DeepSeek saga revealed about the ways we produce and deploy technology?

What makes DeepSeek so disruptive?

DeepSeek is a spinoff of High-Flyer, a private Chinese quantitative investment management company (essentially a hedge fund) founded in 2016 that has focused on algorithmic trading, that is, automating stock market transactions to maximize returns. It was in this context that DeepSeek was created to develop large language models for applications like chatbots. DeepSeek split and became independent from High-Flyer in 2023, though it seemingly is still largely funded by the hedge fund.

Most of what the media refers to as artificial intelligence, or AI, consists of large language models. These LLMs have become household names over the last year or so: OpenAI’s ChatGPT, Google’s Gemini, Meta’s Llama, and Anthropic’s Claude, for example. These models are by and large a specific type of artificial neural network known as GPTs (generative pre-trained transformers). If we cut through the hype around these models, the primary innovation here is a qualitative advance in users’ ability to interact with computers using natural language, as they would talk to a human.

This background matters because, in terms of design philosophy, functionality, and shortcomings, DeepSeek’s R1 is no different from the models coming out of Silicon Valley. Its applications and use cases in language processing, finance, science, programming, and industrial sectors are comparable. It is competitive with ChatGPT and Claude and their derivatives. It is also subject to the same issues and weaknesses that LLMs generally demonstrate, including “hallucinations,” the term used to describe the generation of information that is not true, as well as an ambiguous path toward profitability.

DeepSeek’s R1 is disruptive not in that it presents a different vision of technology than that of Silicon Valley, but in how the company achieved its competitive edge. While not all of the numbers are publicly available, even conservative estimates put the development costs of R1 at a fraction of the investment sunk into comparable models coming from Silicon Valley. The reported cost of training R1 has been put at around $6 million, compared to the $100 million OpenAI invested in training GPT-4.
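To put the two reported figures side by side, here is a minimal sketch of the arithmetic. It uses only the numbers cited above ($6 million for R1, $100 million for GPT-4); the variable and function names are illustrative, and the figures themselves are reported estimates rather than audited costs.

```python
# Reported training costs in US dollars, as cited in the article.
# Both are estimates; neither company has published audited figures.
R1_TRAINING_COST = 6_000_000
GPT4_TRAINING_COST = 100_000_000


def cost_ratio(expensive: float, cheap: float) -> float:
    """Return how many times larger the first cost is than the second."""
    return expensive / cheap


ratio = cost_ratio(GPT4_TRAINING_COST, R1_TRAINING_COST)
share = R1_TRAINING_COST / GPT4_TRAINING_COST

print(f"GPT-4's reported training cost is about {ratio:.1f} times R1's")
print(f"R1's reported cost is about {share:.0%} of GPT-4's")
```

On these reported numbers, R1 was trained for roughly one-seventeenth of the cost of GPT-4.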

DeepSeek’s R1 is also notable in that it is open source. The term “open source” can mean many things and, in DeepSeek’s case, it means that the company released R1 under the very permissive MIT license that allows any person unrestricted use of the software “including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software.” This does not mean that DeepSeek has made public all of its training data, but that the source code for the model is available for derivative uses by anyone in the world. This is not unprecedented: OpenAI’s GPT-1 and GPT-2 were released under the MIT license several years ago. But all of the current bleeding edge models besides R1 are entirely proprietary and paywalled.

In addition to being open source, R1 is significantly less resource intensive to run than the offerings from Silicon Valley. Many users have been able to get R1 running on consumer grade computers, which has unprecedented ramifications for applications across the world. What this means is that any individual, organization, company, or country in the world now has the option of deploying and tweaking DeepSeek’s R1 to fit any necessary or desired use case, opening up this entire field of technology to actors that Silicon Valley monopolies — and U.S. imperialism — want to keep restricted.

Busting the myth of capital intensive ‘technological innovation’

While there are numerous industrial, scientific, and consumer applications for large language models, the much-hyped models coming out of Silicon Valley are unprofitable and unsustainable. In 2024, OpenAI lost nearly $5 billion in its operations despite bringing in $3.7 billion in revenue. Anthropic, the developer of the LLM Claude, burned “more than $2.7 billion in cash” in 2024. Last October, Bloomberg reported that estimated capital expenditure by big tech firms could breach the $200 billion mark as “generative-AI demand booms.”
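The scale of the OpenAI figures is easier to see with the arithmetic written out. The sketch below rests on two assumptions that are not in the reporting: that “nearly $5 billion” can be rounded to $5 billion, and that total spending is simply revenue plus the loss. The names are illustrative.

```python
# OpenAI's 2024 figures as cited in the article, in US dollars.
REVENUE = 3.7e9
LOSS = 5.0e9  # assumption: "nearly $5 billion" rounded up to $5 billion


def implied_spending(revenue: float, loss: float) -> float:
    """Assume spending equals revenue plus the reported loss."""
    return revenue + loss


spending = implied_spending(REVENUE, LOSS)
spend_per_revenue_dollar = spending / REVENUE

print(f"Implied spending: ${spending / 1e9:.1f} billion")
print(f"Spent per dollar of revenue: ${spend_per_revenue_dollar:.2f}")
```

Under those assumptions, the company spent well over two dollars for every dollar it brought in.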

In any normal capitalist industry, losses in the billions like this would be a death sentence to any firm, but the AI sector has grown out of a particular confluence of state, financial, and ideological forces that insists that the only way to advance technological innovation is through highly unsustainable, capital intensive forms of development. This includes massive investments in energy grid infrastructure, semiconductor manufacturing, and a massive expansion of energy intensive and ecologically disastrous data centers that ostensibly are necessary for innovation in the AI sector.

Last year, chipset manufacturer Nvidia briefly peaked as the world’s highest valued company. As reported in Liberation News at the time, this was due to its near monopoly, a staggering 88% market share, on the graphics processing units used in training AI models. Restricting access to Nvidia GPUs has been a key pillar of the U.S. war drive against China, meaning that these cutting edge chips were not available to Chinese firms in the quantities used in the data centers that trained GPT-4. Nvidia’s stint as the world’s highest valued company, and its massive crash (a loss of over $600 billion in market capitalization), cannot be explained simply by its monopoly hold on GPUs.

Tech critic Ed Zitron describes the pre-DeepSeek status quo as defined by several axioms: 

  • These models were incredibly expensive to train — $100 million in the middle of 2024, and as high as $1 billion for future models
  • These models had to be large, because making them large — pumping them full of training data and throwing masses of compute about them — would unlock new features
  • These models were incredibly expensive to run, but it was worth it, because making these models powerful was more important than making them efficient
  • As a result of this need to make bigger, huger models, the most powerful ones, big, beautiful models, we would of course need to keep buying bigger, more powerful GPUs
  • By following this roadmap, “everybody” wins — the hyperscalers get the justification they needed to create more sprawling data centers and spend massive amounts of money, OpenAI and their ilk continue to do the work to “build powerful models,” and NVIDIA continues to make money selling GPUs. It’s a kind of capitalist death cult that ran on plagiarism and hubris, the assumption being that at some point all of this would make sense.

DeepSeek was developed and trained for a fraction of the cost, without access to cutting edge hardware or vast, overbuilt data centers — in large part due to U.S. sanctions on China’s ability to import Nvidia hardware, part of a trade war that had the specific intention of arresting exactly this sort of indigenous technological development on the part of China, or any nation or company outside the club of Silicon Valley oligarchs and U.S. imperialists.

DeepSeek blows this status quo out of the water and reveals as fundamentally bankrupt the idea that AI development necessitates a capital intensive, hyperscaling approach. Many analysts have been pointing out the inconsistencies in the business model of the AI sector for years — but many assumed that it was impossible to make this technology more efficient because, ostensibly, efficiency is the name of the game. Efficiency is the mantra repeated by tech startups ad nauseam, and so it follows that if it were possible to make LLMs more efficient, companies at the bleeding edge of development would seek to do so.

But as socialists, we understand that capitalism is actually the enemy of innovation and efficiency. The goal of monopolists is to maintain their monopoly, not to compete to make the most efficient, innovative, and cost effective solutions and products that might undermine that very monopoly power that is the source of their superprofits. DeepSeek shows that AI is inefficient not because of a fundamental scientific reality, but because there are no economic incentives for Silicon Valley to do more with less. 

Silicon Valley doubles down 

In the face of this seemingly fundamental shock to the AI sector and the basic principles of capitalist technological development, the response from the U.S. government and the old cast of Silicon Valley characters is to double down on capital intensive, hyperscale infrastructure buildout. On Jan. 21, 2025, the day after the launch of DeepSeek’s R1, U.S. President Donald Trump appeared flanked by OpenAI CEO Sam Altman, SoftBank CEO Masayoshi Son, and Oracle Chairman Larry Ellison to announce The Stargate Project.

The project is a joint venture between some of the largest financial and technology companies in the country pledging $500 billion towards developing infrastructure for artificial intelligence development in the United States. In addition to OpenAI, SoftBank, and Oracle, another key partner in the Stargate Project is MGX, a technological investment fund backed by the government of the United Arab Emirates. Additionally, semiconductor giant Nvidia is named as an “initial technology partner” along with Microsoft and British CPU manufacturer Arm.

According to OpenAI’s website, the Stargate Project, in addition to securing “American leadership in AI” and creating ostensibly hundreds of thousands of jobs, will also “support the re-industrialization of the United States” and “provide a strategic capability to protect the national security of America and its allies.” These bold claims echo earlier epochs of capitalism. During the Gilded Age at the turn of the 20th century, northern industrial capitalists competed to leverage monopoly power in their quest to supplant the old southern slavocracy, in large part through overbuilding, even “hyperscaling,” railroad infrastructure.

It is a comparable dynamic today as tech capitalists seek to supplant the old industrial giants as the most dynamic and central sector of U.S. capitalism. While material, economic forces at play here are dominant, the strange, eclectic, and deeply anti-human ideological elements of tech capital cannot be ignored. Liberation News has previously reported on the man at the center of many of these trends — Peter Thiel — but these tendencies extend far beyond the proclivities of a sole oligarch.

The name of this new initiative, the Stargate Project, references two things. The first and most popularly known is the Stargate science fiction series, in which the titular stargates are portals to other worlds. The second, lesser known Stargate Project was a secret U.S. Army unit established by the Defense Intelligence Agency and the Stanford Research Institute and stationed at Fort Meade, Maryland. Declassified in 1995, the project was dedicated to studying the potential of weaponizing so-called parapsychological phenomena such as remote viewing and telepathy.

This sort of New Age infused, militarized ideology is not some esoteric digression or historical footnote confined to classified programs from the 1970s. At the core of the push for hyperscale AI, led by the likes of Marc Andreessen (venture capitalist and author of the Techno-Optimist Manifesto), OpenAI CEO Sam Altman, Peter Thiel and Elon Musk, is the belief that this process will eventually create AGI, or Artificial General Intelligence. AGI is scientifically dubious; it is essentially the idea that AI will inevitably become “smarter,” exponentially and qualitatively, to the point of replicating or even exceeding human consciousness.

The goal of AGI is one piece in a bundle of ideologies that define the Silicon Valley elite, described by critical AI scholars Émile Torres and Timnit Gebru as “TESCREAL,” which stands for “transhumanism, Extropianism, singularitarianism, (modern) cosmism, Rationalism, Effective Altruism, and longtermism.” Torres and Gebru go on to unpack how the unquestioned goal of AGI is fundamentally an outgrowth of “the Anglo-American eugenics tradition of the twentieth century.” Many of the oligarchs at the center of the news cycle, and taking key roles in the new Trump administration, are outspoken, self-conscious proponents of these ideologies.

Any fight back against the billionaire class needs to understand what and how they think. The decision to continue with the Stargate Project, even in the face of the major disruptions posed by DeepSeek, demonstrates an ideological motivation that goes beyond a simple grift. But technology is anything but stable, and there will doubtless be more disruptions with greater and more wide reaching impact than that of DeepSeek. What disruptions like this provide us, the working class, is a view inside the actual workings of a sector of the economy shrouded in so much speculative hype and fanciful mythology.
