On January 20, 2025, Chinese AI company DeepSeek released its reasoning model, DeepSeek-R1. One week later, the value of AI chipmaker Nvidia plummeted by $589 billion, the largest single-day market capitalization loss for any company in history.
Why? DeepSeek made its new chatbot for less — way less. While Nvidia customer OpenAI spent $100 million to create ChatGPT, DeepSeek claims to have developed its platform for a paltry $5.6 million.
How? To date, the answer to that question remains unclear. However, OpenAI has publicly acknowledged ongoing investigations into whether DeepSeek “inappropriately distilled” its models to produce a competing chatbot at a fraction of the cost.
So, does OpenAI have a case against DeepSeek? According to Lecturer on Law Louis Tompros ’03, it depends. Harvard Law Today spoke with Tompros about the state of the AI industry, the laws that apply, and what the world can expect now that the first shots of the AI wars have been fired.

Harvard Law Today: What is the current state of affairs among the major players in AI? Who are they, how were they situated before the emergence of DeepSeek, and what has changed?
Louis Tompros: AI products generally fall into three categories: text-based AI, visual-based AI, and video-based AI. At the moment, major players in the industry are developing models for all three of those functions. OpenAI is the developer of ChatGPT, DALL-E, and Sora. Google, Microsoft, Meta, and Apple are all offering consumer-facing systems as well. Then there are companies like Nvidia, IBM, and Intel that sell the AI hardware used to power systems and train models. I think it’s notable that these are all big, U.S.-based companies. Even on the hardware side, these are the exact Silicon Valley companies anyone would expect. Until recently, there was an industry-wide assumption that AI systems need the high-powered technology these hardware companies produce in order to train models.
The emergence of DeepSeek was such a surprise precisely because of this industry-wide consensus regarding hardware demands and high entry costs, reinforced by relatively aggressive U.S. export controls. For instance, it is illegal to export most of the high-powered U.S. chips driving these models to China. That’s why DeepSeek made such an impact when it was released: It shattered the common assumption that systems with this level of functionality were not possible in China given the constraints on hardware access. DeepSeek created a product with capabilities apparently similar to the most sophisticated domestic generative AI systems without access to the technology everyone assumed was a basic necessity.
HLT: Do we know how DeepSeek bypassed these assumed requirements? Why do observers believe that DeepSeek used ChatGPT or OpenAI systems to develop its platform? What is “distillation” and has it occurred here?
Tompros: So, we know that DeepSeek has produced a chatbot that can do things that look a lot like what ChatGPT and other chatbots can do. What we do not know is exactly how that happened. Companies are not required to disclose trade secrets, including how they have trained their models. The prevailing consensus is that DeepSeek was probably trained, at least in part, using a distillation process.
“Distillation” is a generic AI industry term that refers to training one model using another. It originally just meant simplifying a model to reduce the amount of work needed and make it more efficient. OpenAI and other developers are continuously distilling their own products in an effort to reach “optimal brain damage”; that is, the amount a system can be reduced while still producing acceptable results.
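In machine-learning terms, the classic distillation Tompros describes trains a smaller “student” model to match the output probabilities of a larger “teacher” model. A minimal sketch of the core idea, with all numbers illustrative and no connection to DeepSeek’s or OpenAI’s actual methods:

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw model scores into probabilities; a higher temperature
    softens the distribution, exposing more of the teacher's 'knowledge'."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions,
    the quantity a student model minimizes during distillation training."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student whose scores mirror the teacher's incurs a much smaller loss
# than one that diverges; gradient descent pushes the student toward zero.
teacher = [3.0, 1.0, 0.2]
print(distillation_loss(teacher, [3.1, 0.9, 0.3]) <
      distillation_loss(teacher, [0.2, 1.0, 3.0]))  # True
```

The student never sees the teacher’s weights or training data, only its outputs, which is why distillation can work even without insider access to the original model.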
But apart from their obvious functional similarities, a major reason for the assumption that DeepSeek used OpenAI comes from the DeepSeek chatbot’s own statements. There have been instances where folks have asked the DeepSeek chatbot how it was created, and it admits — albeit vaguely — that OpenAI played a role. When asked about its underlying processes, the DeepSeek chatbot has directed people to OpenAI’s application programming interfaces. So, at least to some degree, DeepSeek definitely seems to have relied on ChatGPT or some output of OpenAI.
HLT: If that is true, how did DeepSeek pull that off?
Tompros: There are a few theories. The first is classic distillation, that there was improper access to the ChatGPT model by DeepSeek through corporate espionage or some other surreptitious activity. Another possibility is that ChatGPT was accessed during the process of training DeepSeek using rapid queries against the ChatGPT system. Doing so wouldn’t constitute espionage or theft of trade secrets; however, it could still provide a basis for legal action. The third possibility is that DeepSeek was trained on bodies of information generated by ChatGPT, essentially data dumps that are openly available on the internet. So, the question of whether OpenAI has recourse depends on the details of how this all happened and the degree of distillation that took place.
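The second theory, training on the outputs of rapid automated queries, would in the abstract look like a harvesting loop: send prompts to the target model’s interface and record prompt/response pairs as training data. A purely hypothetical sketch with a stubbed-out client (`query_model` is a stand-in; no real API is called, and this is not a description of what DeepSeek actually did):

```python
import json

def query_model(prompt: str) -> str:
    """Hypothetical stand-in for a call to a hosted chatbot API."""
    return f"canned response to: {prompt}"

def harvest(prompts, out_path="distill_data.jsonl"):
    """Collect prompt/response pairs as JSONL, a common format for
    fine-tuning data. Platform terms of service typically forbid exactly
    this: using a model's outputs to build a competing model."""
    records = []
    with open(out_path, "w") as f:
        for prompt in prompts:
            record = {"prompt": prompt, "response": query_model(prompt)}
            f.write(json.dumps(record) + "\n")
            records.append(record)
    return records

data = harvest(["Explain fair use.", "What is distillation?"])
print(len(data))  # 2
```

Note that nothing in this process requires copying code or weights, which is why, as discussed below, the stronger legal theories sound in contract rather than copyright.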
“We know that DeepSeek has produced a chatbot that can do things that look a lot like what ChatGPT and other chatbots can do. What we do not know is exactly how that happened.”
HLT: Are there any copyright-related challenges OpenAI could mount against DeepSeek?
Tompros: In the event DeepSeek trained on either rapid OpenAI queries or OpenAI data dumps, OpenAI probably does not have any recourse under copyright law. At the moment, copyright law only protects things humans have created and does not apply to material generated by artificial intelligence. Even setting aside that aspect of the law, it’s also very likely those activities would constitute fair use. Fair use is an exception to the exclusive rights copyright holders have over their works when they are used for certain purposes like commentary, criticism, news reporting, and research. At the very least, fair use is the same justification OpenAI developers have relied on to defend the legality of their own model training process. There is a conceivable argument that fair use would apply to OpenAI and not DeepSeek if OpenAI’s use of the data was found to be “transformative,” or different enough to negate infringement, and DeepSeek’s use of ChatGPT was not. Although that fair use argument has yet to be definitively addressed, it’s immaterial at the moment because copyright law currently only applies to human creations. So, at least under copyright law, it’s hard to see how OpenAI would have recourse against DeepSeek.
HLT: Are there other challenges developers could bring against DeepSeek on the basis of intellectual property law?
Tompros: One place you might expect there to be some enforceable IP rights would be patent law. Unlike a copyright, which protects original creative expression, a patent protects new and useful inventions. Unsurprisingly, there has been a huge spike in patent applications within the AI space. Patents, however, typically take a very long time to vet and grant. OpenAI, for example, probably has more patent applications at the moment than actual patents. While it’s certainly possible something was done in the development of DeepSeek that infringed on a patent for AI training, that’s wholly unclear. We would first have to know exactly how DeepSeek was trained, and we don’t. It’s also very possible that DeepSeek infringed an existing patent in China, which would be the most likely forum considering it is the country of origin and the sheer volume of patent applications in the Chinese system. There’s also the potential for a claim against DeepSeek based on trade secrets in the event that theft or improper access occurred. If DeepSeek went beyond using rapid queries and ChatGPT data dumps, and somebody actually stole something, that would fall under trade secret law.
The last basis to consider would be contract law, since virtually all AI systems including OpenAI have terms of service — those long, complicated contracts that your average user just clicks through without reading. AI platform terms of service typically include a provision that explicitly prohibits using their model to create a competing model. While there are outstanding questions about which parts of these contracts are binding, it wouldn’t surprise me if a court ultimately found these terms to be enforceable. So, if DeepSeek used ChatGPT to run its own queries and train a model in violation of the terms of service, that would constitute a breach of its contract with OpenAI.
When it comes to challenging DeepSeek on the basis of a terms of service violation, there are some major obstacles to enforcement. A U.S. court might be reasonably quick to enforce a U.S. company’s U.S.-based license agreement, but it is much less likely that a court in China is going to find a foreign license enforceable against a company from its own country. U.S. license agreements have historically not been easy to enforce against Chinese companies. That will be true for any company that creates an AI model and sees an entity from China, or elsewhere, create its own version. Even if the aggrieved U.S. company has license rights, it will be challenging to enforce them based on a click-through license.
HLT: If OpenAI did bring a breach of contract lawsuit against DeepSeek, what happens next? How are international lawsuits between tech companies typically adjudicated? Do you think arbitration is an adequate process for settling these kinds of disputes?
Tompros: What happens next depends on the terms of service themselves. Most terms of service contracts contain some form of an arbitration provision that spells out a specific venue. At least recently, though, companies have started including a lot of carve-outs in those provisions in an effort to ensure they remain enforceable. So, while arbitration requirements in general are relatively common, I cannot speculate as to whether intellectual property violations or specific terms of service violations are included. Assuming the arbitration clause is either excluded or found unenforceable, the developer acting as a plaintiff has discretion to file the lawsuit in any forum that satisfies the basic civil procedure requirements for jurisdiction.
To the broader question about its adequacy as a venue for AI disputes, I think arbitration is well-designed to settle cases involving large companies. It’s also quite possible that an international arbitration ruling would be more likely to be enforced across borders. In fact, depending on the specific forum, arbitration could very well mitigate the enforceability issue that court orders from one particular country would likely encounter. Courts in China, the EU, and the U.S. would be much more likely to treat an international arbitration decision the same way.
HLT: In the financial world, the release of DeepSeek was a massive revelation to say the least. The fact that AI systems can be developed at drastically lower costs than previously believed sent shockwaves through Wall Street. Do you anticipate a torrent of corporate lawsuits in the fallout?
Tompros: We certainly could see an increase in shareholder suits. Investors in U.S. and EU AI companies that lost value as a result of DeepSeek certainly could have actionable claims if they had been given the impression DeepSeek wasn’t a threat. Anytime a company’s stock price decreases, you can probably expect to see an increase in shareholder lawsuits. I suspect the guidance that companies would be getting now is to make sure that they are not ignoring the risk of competition from Chinese companies given that DeepSeek made such a big splash. I could also see DeepSeek being a target for the same kind of copyright litigation that the existing AI companies have faced brought by the owners of the copyrighted works used for training. The New York Times, for instance, has famously sued OpenAI for copyright infringement because their platforms allegedly trained on their news data.
HLT: Is that underlying lawsuit by the New York Times against OpenAI still pending? What is the likely outcome of basic copyright claims against AI developers?
Tompros: The New York Times lawsuit is one of many still pending. There are currently about 25 to 30 copyright infringement cases in the AI space, and they are all still in either the motion-to-dismiss phase or the discovery phase. As we discussed earlier, the fundamental question that needs to get resolved by some combination of these suits is whether training AI models is or is not fair use.
“It wouldn’t shock me if any of the [25-30] pending copyright infringement cases went up to the Supreme Court to provide a definitive answer on fair use, which has happened in the past following the emergence of new technology.”
One very interesting recent ruling came on February 11th in the context of a lawsuit between Thomson Reuters and ROSS Intelligence. It doesn’t involve generative AI, but it involves a type of AI system alleged to have copied Westlaw’s headnotes and organizational system. In that case, the district court found that using the headnotes to train the system was not fair use, because the system was being trained to compete with Westlaw and the use was not transformative. The court did distinguish the case from those involving generative AI, but, at some point, a decision about whether training a generative AI system constitutes fair use will be hugely impactful. That issue will be heard by multiple district courts over the next year or so, and then we’ll see it revisited by appellate courts. It wouldn’t shock me if one of the pending cases went up to the Supreme Court to provide a definitive answer on fair use, which has happened in the past following the emergence of new technology, as it did, for example, with Sony Betamax.
HLT: The U.S. government has recently undertaken efforts to restrict access to Chinese technology on the basis of national security. Do these same concerns apply to DeepSeek?
Tompros: The U.S. government has a long-standing history of adopting national security policies that ensure the most sensitive technologies do not fall into the hands of foreign governments. There are export control restrictions prohibiting the most powerful computer processors, for instance, from being sent to certain Chinese entities. DeepSeek certainly concedes it is owned by Chinese individuals, but claims that it is not owned at all by the Chinese government. From a national security standpoint, there’s inherent concern that the Chinese government could see strategic value and exert control. At least as of right now, there’s no indication that applies to DeepSeek, but we don’t know and it could change.
Legislation prohibiting DeepSeek has been introduced, and I think there’s a chance prohibitions based on national security concerns will come to fruition. However, if there are genuine concerns about Chinese AI companies posing national security risks or economic harm to the U.S., I think the most likely avenue for some restriction would probably come via executive action. So, legislation or executive action seems much more likely to have an impact on DeepSeek’s future than litigation.