
    Scholarship on the phenomena of big data and algorithmically driven digital environments has largely studied these technological and economic phenomena as monolithic practices, with little interest in the varied quality of contributions by data subjects and data processors. Taking a pragmatic, industry-inspired approach to measuring the quality of contributions, this work finds evidence for a wide range of relative value contributions by data subjects. In some cases, a very small proportion of data from a few data subjects is sufficient to achieve the same performance on a given task as would be achieved with a much larger data set. Likewise, algorithmic models generated by different data processors for the same task and with the same data resources show a wide range in quality of contribution, even in highly performance-incentivized conditions. In short, contrary to the trope of data as the new oil, data subjects, and indeed individual data points within the same data set, are neither equal nor fungible. Moreover, the role of talent and skill in algorithmic development is significant, as with other forms of innovation. Both of these observations have received little, if any, attention in discussions of data governance. In this essay, I present evidence that both data subjects and data processors exhibit significant variations in the measured value of their contributions to the standard Big Data pipeline. I then establish that such variations are worth considering in crafting appropriate law for the Big Data economy: heterogeneity in value contribution is undertheorized in tech law scholarship, with implications for privacy law, competition policy, and innovation.
The work concludes by highlighting some of these implications and posing an empirical research agenda to fill in the information needed to realize policies sensitive to the wide range of talent and skill exhibited by data subjects and data processors alike.


    US cities are regulating private use of technology more actively than the federal government, but the likely effects of this phenomenon are unclear. City lawmaking could make up for national regulatory shortfalls, but only if cities can thread the needle of special interests and partisanship.


    Decades after data-driven consumer surveillance and targeted advertising emerged as the economic engine of the internet, data commodification remains controversial. The latest manifestation of its contested status comes in the form of a recent wave of more than a dozen state data protection statutes with a striking point of uniformity: a newly created right to opt out of data sales. But data sales as such aren’t economically important to businesses; further, property-like remedies to privacy problems have long and repeatedly been debunked by legal scholars, just as the likelihood of efficient privacy markets has been undercut by an array of experimental findings from behavioral economics. So, why are data sales a dominant point of focus in recent state legislation? This work proposes a cultural hypothesis for the recent statutory and political focus on data sales, and explores this hypothesis with an experimental approach. Inspired by the taboo trade-offs literature, a branch of experimental psychology that examines how people handle morally uncomfortable transactions, this work describes two experiments that explore reactions to data commodification. The experimental results show that selling data is far more contested than selling a traditional commodity good, suggesting that selling data fits within the domain of a taboo transaction. Further, various potential modifications to a data sale are tested, but in each case the initial resistance to the taboo transaction remains. The experimental results show a robust resistance to data commodification, suggesting that newly enacted state-level sales opt-out rights provide a culturally powerful balm to consumers. The results also suggest a new framework for analyzing economic measurements of privacy preferences, one that interprets those findings in light of the tabooness of data commodification.
More broadly, the results suggest a need for culturally responsive privacy reform, while remaining alert to the possibility that taboos distort technology policy in ways that ultimately fail to serve consumer protection interests.


    Despite strong scholarly interest in explainable AI (XAI), there is little experimental work gauging the effect of XAI on human-AI cooperation in legal tasks. We study the effect of textual highlighting as an XAI feature used in tandem with a machine learning (ML)-generated summary of a legal complaint. In a randomized controlled study, we find that the XAI feature has no effect on the proportion of time participants devote to different sections of a legal document, but we identify potential signs of its influence on the reading process. XAI attention-based highlighting may change the spatio-temporal distribution of attention allocation, a result not anticipated by previous studies. Future work on XAI in legal tasks should measure process as well as outcomes to better gauge its effects in legal applications.


    Political discourse and survey research both suggest that many Americans believe constitutional protections for free expression extend more broadly than what is reflected in the black letter law. A notable example of this has been the claim--sometimes explicitly constitutionalized--that content moderation undertaken by digital platforms infringes on users' legally protected freedom of expression. Such claims have proven both rhetorically powerful and politically durable. This suggests that laypeople's beliefs about the law--distinct from what the state of the law actually is--could prove important in determining whether content moderation policies are democratically and economically successful. This Article presents the results of an experiment conducted on a large, representative sample of Americans to address questions raised by the phenomenon of constitutionalized rhetoric about digital platforms and content moderation. The experimental results show that commonly held but inaccurately broad beliefs about the scope of First Amendment restrictions are linked to lower support for content moderation. These results highlight an undertheorized difficulty of developing widely acceptable content moderation regimes, while also demonstrating a surprising outcome when correcting misperceptions about the law.


    The recording, aggregation, and exchange of personal data is necessary to the development of socially relevant machine learning applications. However, anecdotal and survey evidence shows that ordinary people feel discontent and even anger regarding data collection practices that are currently typical and legal. This suggests that personal data markets in their current form do not adhere to the norms applied by ordinary people. The present study experimentally probes whether market transactions in a typical online scenario are accepted when evaluated by laypeople. The results show that a high percentage of study participants refused to participate in a data pricing exercise, even in a commercial context where market rules would typically be expected to apply. For those participants who did price the data, the median price was an order of magnitude higher than the market price. These results call into question the notice and consent market paradigm that is used by technology firms and government regulators when evaluating data flows. The results also point to a conceptual mismatch between cultural and legal expectations regarding the use of personal data.


    What does fairness mean when it comes to code? This practical book covers basic concerns related to data security and privacy to help data and AI professionals use code that's fair and free of bias.


    Rationale: An increasing number of automated and artificially intelligent (AI) systems make medical treatment recommendations, including “personalized” recommendations, which can deviate from standard care. Legal scholars argue that following such nonstandard treatment recommendations will increase liability in medical malpractice, undermining the use of potentially beneficial medical AI. However, such liability depends in part on lay judgments by jurors: when physicians use AI systems, in which circumstances would jurors hold physicians liable?
    Methods: To determine potential jurors’ judgments of liability, we conducted an online experimental study of a nationally representative sample of 2,000 U.S. adults. Each participant read one of four scenarios in which an AI system provides a treatment recommendation to a physician. The scenarios varied the AI recommendation (standard or nonstandard care) and the physician’s decision (to accept or reject that recommendation). Subsequently, the physician’s decision caused a harm. Participants then assessed the physician’s liability.
    Results: Our results indicate that physicians who receive advice from an AI system to provide standard care can reduce the risk of liability by accepting, rather than rejecting, that advice, all else equal. However, when an AI system recommends nonstandard care, there is no similar shielding effect of rejecting that advice and so providing standard care.
    Conclusion: The tort law system is unlikely to undermine the use of AI precision medicine tools and may even encourage the use of these tools.


    Time series data analysis is increasingly important due to the massive production of such data through the internet of things, the digitalization of healthcare, and the rise of smart cities. As continuous monitoring and data collection become more common, the need for competent time series analysis with both statistical and machine learning techniques will increase. Covering innovations in time series data analysis and use cases from the real world, this practical guide will help you solve the most common data engineering and analysis challenges in time series, using both traditional statistical and modern machine learning techniques. Author Aileen Nielsen offers an accessible, well-rounded introduction to time series in both R and Python that will have data scientists, software engineers, and researchers up and running quickly.
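To give a flavor of the kind of traditional statistical technique such a guide covers, here is a minimal Python sketch (not drawn from the book itself; the synthetic series and window length are illustrative assumptions): smoothing a noisy daily series with a rolling mean.

```python
import numpy as np
import pandas as pd

# Illustrative synthetic data: a 60-day series with a linear trend plus noise.
rng = np.random.default_rng(0)
idx = pd.date_range("2024-01-01", periods=60, freq="D")
series = pd.Series(np.linspace(0, 10, 60) + rng.normal(0, 0.5, 60), index=idx)

# A 7-day rolling mean dampens day-to-day noise while preserving the trend.
smoothed = series.rolling(window=7).mean()

# The first 6 values are NaN because the window is not yet full
# (min_periods defaults to the window size for fixed windows).
print(int(smoothed.isna().sum()))  # 6
```

The same smoothing idea underlies many of the classical decomposition and forecasting methods that precede the machine learning approaches.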