Benjamin Sobel

Lecturer on Law

Spring 2022

Biography

Ben Sobel is a scholar of information law. His work considers how the exercise of private rights shapes the public distribution of information, wealth, and cultural expression. In particular, he examines the way digital media, artificial intelligence, and networked devices influence the law of tangible and intellectual property, privacy, competition, and expression. Ben’s scholarship has been cited in briefs submitted to the Supreme Court of the United States and has been published by the Lewis & Clark Law Review, the Columbia Journal of Law & the Arts, and Oxford University Press. His recent research analyzes the relationship between web scraping and common-law privacy and property causes of action, as well as the application of copyright’s fair use doctrine to machine learning technology. Ben is a graduate of Harvard College and Harvard Law School, and he currently serves as a law clerk on the United States Court of Appeals for the First Circuit. His publications are available on his homepage, bensobel.org.

Areas of Interest

Benjamin Sobel, A Taxonomy of Training Data: Disentangling the Mismatched Rights, Remedies, and Rationales for Restricting Machine Learning, in Artificial Intelligence & Intellectual Property (Reto Hilty, Jyh-An Lee & Kung-Chung Liu eds., 2021)
Categories: Technology & Law
Sub-Categories: Intellectual Property Law, Cyberlaw
Type: Book
Abstract
This chapter addresses a crucial problem in artificial intelligence: many applications of machine learning depend on unauthorized uses of copyrighted data. Scholars and lawmakers often articulate this problem as a deficiency in copyright’s exceptions and limitations, reasoning that legal uncertainties surrounding today’s AI stem from the lack of a clear exception or limitation, and that such an exception or limitation could resolve the current predicament. In fact, the current predicament is a product of two systemic features of the copyright regime—the absence of formalities and the low threshold of copyrightable originality—combined with a technological environment that turns routine activities into acts of authorship. Equilibrating the economy for human expression in the AI age requires a solution that focuses not only on exceptions to existing copyrights, but also on the aforementioned doctrinal features that determine the ownership and scope of copyright entitlements at their inception.
Benjamin L. W. Sobel, A New Common Law of Web Scraping, 25 Lewis & Clark L. Rev. 147 (2021)
Categories: Technology & Law
Sub-Categories: Cyberlaw
Type: Article
Abstract
The Clearview AI facial recognition scandal is a monumental breach of privacy that arrived at a particularly inopportune time. A shadowy company reportedly scraped billions of publicly-available images from social media platforms and compiled them into a facial recognition database that it made available to law enforcement and private industry. To make matters worse, the scandal came to light just months after the Ninth Circuit’s decision in hiQ v. LinkedIn, which held that scraping the public web probably does not violate the Computer Fraud and Abuse Act (CFAA). Before hiQ, the CFAA would have seemed like the surest route to redress against Clearview. This Article analyzes the implications of the hiQ decision, situates the Clearview outrage in historical context, explains why existing legal remedies give aggrieved plaintiffs little to no recourse, and proposes a narrow tort to empower ordinary Internet users to take action against gross breaches of privacy by actors like Clearview: the tort of bad faith breach of terms of service. Section II argues that the Ninth Circuit’s hiQ decision marks, at least for the time being, the reascension of common law causes of action in a field that had been dominated by the CFAA. Section III shows that the tangle of possible common law theories that courts must now adapt to cyberspace resembles the strained property and contract concepts that jurists and privacy plaintiffs reckoned with at the turn of the twentieth century. It suggests that modern courts, following the example some of their predecessors set over a century ago, may properly recognize some common law remedies for present-day misconduct. Section IV catalogs familiar common law claims to argue that no established property, tort, or contract claim fully captures the relational harm that conduct like Clearview’s wreaks on individual Internet users. 
Section V proposes a new tort, bad faith breach of terms of service, that can provide aggrieved plaintiffs with a proper remedy without sacrificing doctrinal fidelity or theoretical coherence.
Benjamin L. W. Sobel, Artificial Intelligence’s Fair Use Crisis, 41 Colum. J.L. & Arts 45 (2017)
Categories: Technology & Law
Sub-Categories: Cyberlaw, Intellectual Property Law
Type: Article
Abstract
As automation supplants more forms of labor, creative expression still seems like a distinctly human enterprise. This may someday change: by ingesting works of authorship as “training data,” computer programs can teach themselves to write natural prose, compose music, and generate movies. Machine learning is an artificial intelligence (“AI”) technology with immense potential and a commensurate appetite for copyrighted works. In the United States, the copyright law mechanism most likely to facilitate machine learning’s uses of protected data is the fair use doctrine. However, current fair use doctrine threatens either to derail the progress of machine learning or to disenfranchise the human creators whose work makes it possible. This Article addresses the problem in three Parts: using popular machine learning datasets and research as case studies, Part I describes how programs “learn” from corpora of copyrighted works and catalogs the legal risks of this practice. It concludes that fair use may not protect expressive machine learning applications, including the burgeoning field of natural language generation. Part II explains that applying today’s fair use doctrine to expressive machine learning will yield one of two undesirable outcomes: if U.S. courts reject the fair use defense for machine learning, valuable innovation may move to another jurisdiction or halt entirely; alternatively, if courts find the technology to be fair use, sophisticated software may divert rightful earnings from the authors of input data. This dilemma shows that fair use may no longer serve its historical purpose. Traditionally, fair use is understood to benefit the public by fostering expressive activity. Today, the doctrine increasingly serves the economic interests of powerful firms at the expense of disempowered individual rights holders. Finally, in Part III, this Article contemplates changes in doctrine and policy that could address these problems. 
It concludes that the United States’ interest in avoiding both prongs of AI’s fair use dilemma offers a novel justification for redistributive measures that could promote social equity alongside technological progress.
