As AI continues to dominate public attention and private investment, there is more regulatory scrutiny than ever on every point in the “AI stack”: from supply chain and computing needs, to model infrastructure, to downstream applications. Of particular interest to policymakers is the data that defines many AI products and serves as a key differentiating factor from other computing technologies. In the modeling pipeline, data plays a key role in determining the worldview of the models, defining model performance claims and tasks in evaluations, shaping standards development, capturing human preferences in feedback, and communicating key details about the system.

Data, in particular, poses a range of policy challenges spanning competition (Is data becoming an unavoidable barrier to entry?); privacy (What are the unique privacy risks posed by LLMs? Are they solvable?); labor (What are the working conditions for the human workers training and improving AI systems?); discrimination (Do increasingly large datasets invariably worsen concerns with bias and discrimination?); transparency (Should access to training data be a regulatory prerequisite to ensure accountability?); copyright; and more. It also raises more fundamental questions, such as the sustainability of the “bigger is better” paradigm in AI, especially when that paradigm incentivizes the reckless and often invasive collection of data about people and communities (Bender et al. 2021, Buolamwini 2023, Crawford 2021, Kak and Myers West 2023, Raji et al. 2020).

The Social Science Research Council (SSRC) invites applications for its research development workshop on Drilling Down to the Data: Navigating Data Politics at the Heart of AI Policy on July 22–23, 2024. This interdisciplinary workshop is part of the Data Fluencies Project, which aims through dissertation grants, public-facing publications (Just Tech and MediaWell), and research workshops to counter the impacts of discriminatory algorithms and online misinformation and to foster more just and equitable futures.

The workshop will synthesize perspectives across different knowledge communities, including university researchers, legal scholars, policymakers, community organizers, technologists, and artists, to support grounded, policy-relevant research.

Topics of interest related to the broad theme include, but are not limited to:

  • Pathologies of large-scale AI (environmental, labor impacts, competition, and resources)
  • Data documentation best practices
  • Data curation
  • Reinventing data minimization beyond data protection
  • Data representation
  • Data workers/work (labeling, content moderation, etc.)
  • Access and competition (access to datasets as a barrier to entry)


Amba Kak and Deborah Raji will lead this workshop. This workshop aims to support research with a clear policy impact, contributing to evidence-based AI regulation discussions.

We particularly encourage applications from early-career and underrepresented scholars. We welcome applications from a wide range of fields and practices, including the arts, the social sciences, the humanities, legal studies, journalism, community organizing, and data and computer science.

Proposals will be evaluated based on (1) relevance to the workshop’s theme, (2) whether they are at a stage of development where feedback from peers will make the most impact, and (3) complementarity with other research topics selected for the workshop.

Applicants should submit a 300-word abstract and a current 2-page CV via the application portal. The last day to submit application materials is April 23, 2024, at 11:59 p.m. (EDT).


Three weeks before the workshop, participants will share in-development manuscripts with one another and with the two chairs, who are responsible for guiding the discussion. Participants are expected to submit a 5-page reflection and read each other’s work before the workshop so that the in-person time can be used for discussion and feedback.

SSRC and the workshop chairs plan to publish a public-facing summary of the workshop’s discussions and outcomes; workshop attendees will be expected to come prepared to discuss the state of the field of AI research and its implications for AI policy.

Travel and Accommodations

The SSRC will cover travel costs to New York City, hotel accommodations, and meals for selected participants.

Workshop Venue

SSRC Offices, 300 Cadman Plaza West, Brooklyn, NY

This workshop is part of the Data Fluencies Consortium led by the Digital Democracies Institute (Simon Fraser University) and generously funded by the Mellon Foundation.