DeepSeek on the Horizon: Distillation, Defiance, and Clusters in Inner Mongolia

February 28, 2026

The industry is sitting on the edge of its collective seat as it awaits the release of DeepSeek V4 – which some are expecting next week. DeepSeek, you may remember, was the AI model that debuted a little over a year ago, rocking the AI world and tanking the entire market (not to mention giving NVIDIA the dubious distinction of being the company that lost more market cap in a single day than any other company in history).

What a coincidence, then, that two stories came out this past week painting DeepSeek, and the Chinese government, in a negative and nefarious light (I don’t think the accusations are spurious, but the timing is impeccable). The first story broke on Monday and involved “distillation.” The second appeared the following day, involving GPU access and the shutting out of American chip manufacturers.

Distillation disturbance

On Monday, Anthropic claimed that DeepSeek and two other Chinese AI companies used fake accounts to distill Claude. Distillation, as eWeek describes it, is “a common AI technique where one model learns from another model’s outputs, often to produce a smaller or cheaper system that behaves more like a stronger one.”
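To make the technique concrete, here is a minimal sketch of the core idea in Python: a hypothetical "student" distribution is nudged toward a "teacher" distribution by descending the KL divergence between the two. The logits, temperature, and learning rate are all illustrative assumptions on my part, not anything from DeepSeek's or Anthropic's actual pipelines (which operate on full neural networks, not toy distributions).

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q): how far the student q is from the teacher p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

TEMP = 2.0  # softening temperature, a common distillation trick

# Hypothetical "teacher" output distribution over 4 next-token candidates.
teacher_probs = softmax([2.0, 1.0, 0.5, -1.0], TEMP)

# Student starts from random logits and learns only from the teacher's outputs.
random.seed(0)
student_logits = [random.uniform(-1, 1) for _ in range(4)]

before = kl_divergence(teacher_probs, softmax(student_logits, TEMP))
lr = 0.5
for _ in range(2000):
    student_probs = softmax(student_logits, TEMP)
    # Gradient of KL(teacher || student) w.r.t. student logits: (q - p) / T.
    grad = [(s - t) / TEMP for s, t in zip(student_probs, teacher_probs)]
    student_logits = [z - lr * g for z, g in zip(student_logits, grad)]
after = kl_divergence(teacher_probs, softmax(student_logits, TEMP))
```

After training, the student reproduces the teacher's distribution without ever seeing the teacher's weights or training data, which is exactly why access to a model's outputs is enough to copy much of its behavior.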

The danger, Anthropic asserted, is that models created through illicit distillation are likely to lose the safety guardrails built into American AI systems, such as protections that prevent misuse for bioweapons, cyberattacks, mass surveillance, and more. On top of that, Anthropic argued that distillation also lets Chinese labs bypass U.S. chip export controls by copying U.S. models directly, making their progress look like innovation when it might rely on stolen capabilities.

This leads us to the second DeepSeek-related story of the week, this one involving illicit GPU access and more.

NVIDIA and AMD iced out of early optimization opportunity

The following day Reuters posted an exclusive, “DeepSeek withholds latest AI model from US chipmakers including Nvidia.” DeepSeek, it seems, broke convention by not prioritizing Nvidia and AMD for early software optimization. Instead, Chinese chipmakers — including Huawei — were given a head start. Thanks to advances in AI coding tools and the ability to quickly optimize, however, this early access doesn’t provide the advantage it once did. That being said, it does send a clear message Stateside.

Clusters in Inner Mongolia

On top of this snub, the article quoted US officials as saying that, despite an export ban to China, DeepSeek had been trained on illegally obtained Blackwell chips clustered at a data center in Inner Mongolia. For reference, Blackwell is NVIDIA’s newest and most powerful GPU, whereas the chips previously allowed for export, the NVIDIA H20 and AMD MI308, are restricted, scaled-down versions meant to comply with export controls. (Slotting between the H20 and Blackwell is the H200, whose shipments to China have been stalled over approval guardrails.)

Of course, whether China’s access to the latest GPU technology is a good or bad thing with regard to US AI competitiveness depends on who you ask. Anthropic’s CEO Dario Amodei strongly supports tighter export controls and restricting chip sales to China, whereas Nvidia’s CEO Jensen Huang argues that selling advanced chips to China can slow domestic competitors like Huawei by keeping them dependent.

The LLM that came in from the cold

This is intrigue worthy of a John le Carré spy thriller. Both stories point to growing brinkmanship between the two superpowers over AI hegemony, characterized by cloak-and-dagger tactics, national security fears, and sharply opposing views within the same camp. This space keeps getting more and more interesting.

Pau for now…


How DeepSeek R1 Triggered a $1 Trillion NASDAQ Drop and Caught the AI Industry by Surprise

April 16, 2025

Note (April 2025):
This post was originally written in early February, just after DeepSeek’s R1 model sent shockwaves through the AI world. Since then, the model has been downloaded thousands of times, sparked forks and spinoffs, and raised serious questions about the future of proprietary AI. The original post has been lightly updated for clarity and SEO.


Surprise!

Last Sunday’s market chaos didn’t come out of nowhere. It had been quietly building throughout the week—until it suddenly exploded. The trigger? A little-noticed announcement from a small Chinese firm. That firm, DeepSeek, had just unveiled its R1 large language model, claiming it was built for less than $6 million.

At first, most shrugged. But over the following days, analysts began to dig in. And what they found changed everything.

By the end of the week, prominent news outlets like The Wall Street Journal were sounding alarms with headlines like “China’s DeepSeek AI Sparks U.S. Tech Sell-Off Fears.” The market opened Monday with investors on edge—and by the close, the NASDAQ had lost over $1 trillion in value, while Nvidia, the world’s most valuable company, shed $589 billion in market value. It was the biggest one-day drop for a company in U.S. stock market history.

Let that sink in: An MIT-licensed, open-source model posted on Hugging Face had just upended the most important tech arms race in decades.

Industry Myopia

It’s staggering to think that the world’s most powerful innovation engines—stacked with elite technical talent and billions in R&D—were blindsided by an open-source release.

This wasn’t a stealthy attack on a niche. It happened in arguably the most watched and capital-intensive domain of the last 20 years. People have compared generative AI to the internet, the printing press, even the invention of sliced bread. Yet the industry failed to anticipate this kind of disruption.

The mindset in the AI world had become singular: whoever could hoard the most NVIDIA GPUs and scale up the largest data centers would win. While the major players doubled down on that narrative, a quant research firm in China quietly remixed open models using reinforcement learning, bundled the result under a permissive license, and published it for anyone to use.

And just like that, the rules changed.

Open Source Has Disrupted Before—But Never This Fast

Open source is no stranger to disruption. We’ve seen it again and again with Linux, MySQL, Kubernetes, PyTorch—technologies that slowly but surely redefined their markets.

But those shifts took years.

  • Linux: Nearly a decade to gain serious enterprise traction
  • MySQL: Several years to replace proprietary databases
  • Kubernetes/PyTorch: 4–5 years to reshape containers and machine learning

DeepSeek’s impact? Days.

A Historic Turning Point for AI

I can’t think of the last time an emerging technology blindsided the entire industry and shattered consensus thinking overnight.

Yes, you can argue about DeepSeek’s actual development cost. You can debate the sustainability of open models. You can even pull out Jevons Paradox and talk about GPU demand skyrocketing anyway. But none of that erases this simple fact:

An open-source model triggered a $1 trillion market correction.

Whatever happens next, DeepSeek R1 has already earned a place in the history books—and quite possibly a future HBS case study.

Join the Conversation

Was this a short-term overreaction, or the first real crack in the foundation of proprietary AI? Let me know what you think in the comments.

Pau for now…


Savtira streams media and apps from the cloud with beefy PowerEdge C combo

April 18, 2011

Savtira Corporation, which provides outsourced Cloud Commerce solutions, has chosen Dell DCS’s PowerEdge C line of servers and solutions to deliver streamed media and apps from the cloud. Dell’s gear will help power the Savtira Cloud Commerce platform and Entertainment Distribution Network (EDN).

With a little help from PowerEdge C, businesses will now be able to use EDN to stream all digital media (business apps, games, music, movies, audio/ebooks) from the cloud to any device. One particularly cool feature: since the state and configuration are cloud-based, consumers can switch between devices and pick up exactly where they pushed pause on the last one.

Talk about supercharging

To power Savtira’s EDN data center, the company picked PowerEdge C410xs packed with NVIDIA Tesla M2070 GPUs and driven by PowerEdge C6145s. If you think GPUs are just for rendering first-person shooters, think again. GPUs can also cost-effectively supercharge compute-intensive solutions by offloading much of the processing from the main CPUs. According to NVIDIA, GPUs deliver the same performance as CPUs at 1/10 the cost and 1/20 the power consumption.

To help you get an idea of the muscle behind this solution, the PowerEdge C410x PCIe expansion chassis holds up to 16 Tesla M2070 GPUs, each packing more than 400 cores. Two fully populated C410xs are in turn driven by one PowerEdge C6145 for a combined total of 33 teraflops in just 7U.
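As a quick sanity check on that 33-teraflop figure, the arithmetic works out as below. This is a back-of-the-envelope sketch: the ~1.03 TFLOPS single-precision peak per M2070 is an assumed spec on my part, not a number from this post.

```python
# Back-of-the-envelope check on the "33 Teraflops in 7U" claim.
GPUS_PER_C410X = 16       # PCIe slots in one PowerEdge C410x chassis
CHASSIS_COUNT = 2         # two fully populated C410x units per C6145
TFLOPS_PER_M2070 = 1.03   # assumed single-precision peak per Tesla M2070

total_gpus = GPUS_PER_C410X * CHASSIS_COUNT
total_tflops = total_gpus * TFLOPS_PER_M2070
print(f"{total_gpus} GPUs -> ~{total_tflops:.0f} TFLOPS")  # ~33 TFLOPS
```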

Talk about a lot of power in a little space 🙂

Extra-credit reading

  • PowerEdge C6145 — Dell DCS unveils its 4th HPC offering in 12 months, and it’s a beefy one
  • PowerEdge C410x — Say hello to my little friend — packing up to 16 GPGPUs
  • NVIDIA: from gaming graphics to High Performance Computing

Pau for now…


El Reg gives DCS props for HPC innovation

October 5, 2010

The week before last, a crew from Dell was out at NVIDIA’s GPU tech conference, showing our latest and greatest offerings in the HPC space. It looks like our PowerEdge C410x expansion chassis caught the eye of Register HPC blog writer Dan Olds.

Below are some excerpts from Dan’s article, “Dell gets busy with GPUs,” followed by the video he shot. I love the video theme music and the fact that it’s a “BPV (Bad Production Values) presentation.” [BTW we’ll have to give Dan the full Data Center Solutions (DCS) rundown at some point so that he can see that when it comes to design and innovation, the C410x is not an outlier 🙂 ]

From Dan’s Article:

Okay, let’s put it on the table: when the conversation turns to cutting-edge x86 server design and innovation, the name “Dell” doesn’t come up all that often. Their reputation was made on delivering decent products quickly at a low cost. I see that opinion in all of our x86 customer-based research – it’s even something that Dell employees will cop to.

That said, two of the most innovative and cutting-edge designs on the GPU Tech Conference show floor were sitting in the Dell booth, and that’s the topic of this video blog….

It’s the second product that really captured my interest. Their PowerEdge C410x is a 3U PCIe expansion chassis that can hold up to 16 PCIe devices and connect up to eight servers with Gen2 x16 PCIe cables. Customers can use it to host NVIDIA Fermi GPU cards, SSDs, Infiniband, or any other PCIe device their heart desires. What made my motor run was the possibility of cramming it full of Fermi cards and then using it as an enterprise shared device – NAC: Network Attached Compute.

…Dell deserves kudos for putting out this box. It’s a step ahead of what HP and IBM are currently offering, and it moves the ball forward toward an NAC future.


Pau for now…


NVIDIA: from gaming graphics to High Performance Computing

September 22, 2010

A few weeks ago a group from NVIDIA was out visiting Dell. Its Tesla series of GPU cards is the primary card used in our newly announced C410x expansion chassis. Filling the C410x with NVIDIA cards and attaching it to a server can bring about ginormous increases in compute performance, helping to make HPC and scaled-out deployments wicked fast.

So how did NVIDIA get from rendering graphics for first-person shooters to creating GPUs that accelerate modeling, simulation, imaging, signal processing, etc.? Listen to the interview below with Geoff Ballew of NVIDIA’s Tesla unit and learn. 🙂

Some of the ground Geoff covers:

  • NVIDIA’s not just for gaming any more
  • How, a few years back, NVIDIA found its graphics chips were packing a lot of raw math horsepower, so it added a few extra features to the chips and built a suite of software so the cards could be used for general computation
  • How hard was it to convince HPC customers to take NVIDIA seriously in the compute arena?
  • What kind of performance gains are they seeing?
  • The accompanying software development tools and ecosystem of partners
  • The shift in NVIDIA’s workforce and culture as they’ve gotten into general compute processing – united by their love for GPUs 🙂


Pau for now…