CIND 2025 Workshop, April 14-15

The Impact of AI on (Mis)Information

At its 2nd annual workshop, the Center for Information Networks and Democracy will convene a standout lineup of scholars to discuss how AI is transforming the information environment and how we can analyze (and anticipate) the consequences of that transformation.


 

Schedule


Monday, April 14

Tuesday, April 15

 

Talks

 
"Replication for Language Models: Problems, Principles, and Best Practice for Political Science", by Alexis Palmer (joint work with Christopher Barrie and Arthur Spirling)

Excitement about Large Language Models (LMs) abounds. These tools require minimal researcher input and yet make it possible to annotate and generate large quantities of data. While LMs are promising, there has been almost no systematic research into the reproducibility of research using them. This is a potential problem for scientific integrity. We give a theoretical framework for replication in the discipline and show that much LM work is wanting. We demonstrate the problem empirically using a rolling iterated replication design in which we compare crowdsourcing and LMs on multiple repeated tasks, over many months. We find that LMs can be (very) accurate, but the observed variance in performance is often unacceptably high. In many cases the LM findings cannot be re-run, let alone replicated. This affects "downstream" results. We conclude with recommendations for best practice, including the use of locally versioned "open source" LMs. [Paper]
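
As a hedged, toy illustration of the kind of run-to-run stability check the talk recommends (the function names and the simulated annotator below are hypothetical, not the authors' code), one might repeat the same labeling task several times and report how often labels agree across runs:

```python
# Toy sketch of a replication/stability check for LLM annotation (hypothetical
# code, not the authors'). `annotate` stands in for a call to a labeling model,
# ideally a locally versioned open-weights LLM so the run can be re-executed later.
import random

def annotate(text: str, seed: int) -> str:
    """Stand-in for an LLM labeling call; real runs would query a model."""
    rng = random.Random(hash(text) + seed)
    base = "positive" if "good" in text.lower() else "negative"
    # Mimic run-to-run variance: flip the label 10% of the time.
    if rng.random() < 0.1:
        return "negative" if base == "positive" else "positive"
    return base

def stability(texts: list[str], n_runs: int = 5) -> float:
    """Share of documents that receive the same label in every one of n_runs runs."""
    runs = [[annotate(t, seed) for t in texts] for seed in range(n_runs)]
    stable = sum(1 for labels in zip(*runs) if len(set(labels)) == 1)
    return stable / len(texts)

docs = ["The economy is doing good.", "Turnout was disappointing.", "A good, civil debate."]
print(f"{stability(docs):.0%} of documents received the same label in every run")
```

Reporting agreement statistics like this, alongside pinned model versions and prompts, is one concrete way to meet the replication standards the authors propose.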

 
"Observed correction: How we can all respond to misinformation on social media", by Emily Vraga

People often criticize social media for facilitating the spread of misinformation. Observed correction, which occurs when direct public corrections of misinformation are witnessed by others, is one important way to combat misinformation because it gives people a more accurate understanding of the topic, especially when they remember the corrections. However, many people—social media users, public health experts, and fact checkers among them—are conflicted or constrained correctors; they think correction is valuable and want to do it well, even as they raise real concerns about the risks and downsides of doing so. Fortunately, simple messages addressing these concerns can make people more willing to respond to misinformation, although addressing other concerns will require changes to the structure of social media and to social norms. Experts, platforms, users, and policymakers all have a role to play to enhance the value of observed correction, which can be an important tool in the fight against misinformation if more people are willing to do it. [Book]

 
"Propaganda is Already Influencing Large Language Models: Evidence from Training Data, Audits, and Real-World usage", by Hannah Waight (joint work with Eddie Yang, Yin Yuan, Solomon Messing, Molly Roberts, Brandon Stewart and Joshua Tucker)

We report on a concerning phenomenon in generative AI systems: coordinated propaganda from political institutions influences the output of large language models (LLMs) via the data on which these models are trained. We present a series of five studies that together provide evidence consistent with the argument that LLMs are already being influenced by state propaganda, in the context of Chinese state media. First, we demonstrate that material originating from China's Publicity Department appears in large quantities in Chinese-language open-source training datasets. Second, we connect this to commercial LLMs by showing not only that they have memorized sequences distinctive of propaganda, but also that propaganda phrases are memorized at much higher rates than phrases from other documents. Third, we conduct additional training on an LLM with openly available weights to show that training on Chinese state propaganda generates more positive answers to prompts about Chinese political institutions and leaders, evidence that propaganda itself, and not mere differences in culture and language, can be a causal factor behind this phenomenon. Fourth, we document an implication in commercial models: querying in Chinese generates more positive responses about China's institutions and leaders than the same queries in English. Fifth, we show that this language difference holds in prompts related to Chinese politics created by actual Chinese-speaking users of LLMs. Our results point to a troubling conclusion: as generative AI becomes more ubiquitous, states and other actors may face strategic incentives to increase the prevalence of propaganda.
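
One simple way to probe the verbatim memorization described in the second study is to feed a causal language model the first part of a distinctive phrase and check whether greedy decoding reproduces the rest. The sketch below assumes the Hugging Face `transformers` library; the model name and phrases are placeholders rather than the materials used in the paper:

```python
# Hedged sketch of a memorization probe (placeholder model and phrases, not the
# paper's audit pipeline): does greedy decoding from a phrase's prefix reproduce
# its continuation verbatim? Comparing completion rates on distinctive propaganda
# phrases vs. matched control phrases is the logic of the memorization test.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; substitute the model being audited
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

def completes_verbatim(prefix: str, continuation: str) -> bool:
    """True if greedy decoding from `prefix` reproduces `continuation` exactly."""
    inputs = tok(prefix, return_tensors="pt")
    target_len = len(tok(continuation, add_special_tokens=False)["input_ids"])
    out = model.generate(
        **inputs,
        max_new_tokens=target_len,
        do_sample=False,                  # greedy: the model's most likely continuation
        pad_token_id=tok.eos_token_id,
    )
    generated = tok.decode(out[0, inputs["input_ids"].shape[1]:])
    return generated.strip() == continuation.strip()

print(completes_verbatim("Four score and seven years", "ago our fathers brought forth"))
```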

 
"Quantifying the Impact of Misinformation and Vaccine-Skeptical Content on Facebook”, by Jennifer Allen

Low uptake of the COVID-19 vaccine in the US has been widely attributed to social media misinformation. To evaluate this claim, we introduce a framework combining lab experiments (total N = 18,725), crowdsourcing, and machine learning to estimate the causal effect of 13,206 vaccine-related URLs on the vaccination intentions of US Facebook users (N ≈ 233 million). We estimate that the impact of unflagged content that nonetheless encouraged vaccine skepticism was 46-fold greater than that of misinformation flagged by fact-checkers. Although misinformation reduced predicted vaccination intentions significantly more than unflagged vaccine content when viewed, Facebook users’ exposure to flagged content was limited. In contrast, mainstream media stories highlighting rare deaths after vaccination were not flagged by fact-checkers, but were among Facebook’s most-viewed stories. Our work emphasizes the need to scrutinize factually accurate but potentially misleading content in addition to outright falsehoods. Additionally, we show that fact-checking has only limited efficacy in preventing misinformed decision-making and introduce a novel methodology incorporating crowdsourcing and machine learning to better identify misinforming content at scale. [Paper]
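
The headline comparison follows from a simple accounting identity: a URL's aggregate impact is its per-exposure effect on vaccination intentions multiplied by how many times it was viewed. The numbers below are hypothetical, chosen only to illustrate how rarely viewed but highly persuasive flagged misinformation can be dwarfed by widely viewed, mildly persuasive unflagged content:

```python
# Back-of-the-envelope illustration with hypothetical numbers (not the paper's
# estimates): aggregate impact = per-exposure effect on intentions x exposures.
flagged_misinfo   = {"effect_per_view": -0.05, "views": 8.7e6}  # persuasive, rarely seen
unflagged_skeptic = {"effect_per_view": -0.01, "views": 2.0e9}  # milder, seen far more

impact_flagged   = flagged_misinfo["effect_per_view"] * flagged_misinfo["views"]
impact_unflagged = unflagged_skeptic["effect_per_view"] * unflagged_skeptic["views"]

print(f"flagged misinformation:           {impact_flagged:,.0f}")
print(f"unflagged vaccine-skeptical news: {impact_unflagged:,.0f}")
print(f"ratio: {impact_unflagged / impact_flagged:.0f}x")  # ~46x with these toy numbers
```

In the paper, the per-exposure effects come from the lab experiments, are extrapolated to the full set of 13,206 URLs with crowdsourcing and machine learning, and are then weighted by observed Facebook exposures.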

 
"Election Misinformation and AI Discourse in Alt-Tech Platforms: Opportunities and Challenges for Research", by Jo Lukito

False information about elections, including the sharing of incorrect voting details and misinformation about "election fraud", remains a persistent (albeit niche) issue, especially on platforms with weaker moderation policies and strong partisan community beliefs. In these digital spaces, generative AI (genAI) tools may undermine information integrity in two ways: (1) genAI tools can be used to produce misinformation that is perceived as real, and (2) genAI can be blamed for content even when the information is true. Using a sample of data from three alt-tech platforms (Telegram, Patriot.Win, and Truth Social) in the months before the 2024 U.S. Presidential election, we explore the extent to which genAI misinformation or incorrect genAI attribution is pervasive in these digital spaces. In conducting this work, we also highlight and consider research-infrastructural limitations for continued research on political misinformation and AI.

 
"AI Transforming the Information Ecosystem: The Good, the Bad, and the Ugly", by Kaicheng Yang

The rise of generative AI technologies is reshaping the information ecosystem across production, dissemination, and consumption. Ensuring that online platforms remain safe, fair, and trustworthy requires addressing the challenges and opportunities that emerge during this transformation. In this talk, I will present my latest research into these dynamics. For production, I focus on malicious AI-powered social bots that generate human-like information and engage with others automatically (Bad), discussing their behaviors and methods for detection. Regarding dissemination, I analyze the capabilities and biases of large language models (LLMs) as information curators in the AI era (Ugly), focusing on their judgments of information source credibility. On the consumption side, I explore how LLMs can be leveraged to detect misleading textual and visual content and provide fact-checking support (Good). Finally, I will reflect on the broader implications of advancing AI models for the reliability and trustworthiness of the information ecosystem and conclude with future directions. [Paper]
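
As a minimal sketch of the "LLMs as information curators" audit (hypothetical prompt, model, and domains; not the speaker's pipeline), one could ask a chat model to rate the credibility of news sources and compare the ratings against expert benchmarks. The example assumes the `openai` Python client:

```python
# Hedged sketch of auditing an LLM's source-credibility judgments (placeholder
# model and domains). The resulting ratings could be compared with expert
# assessments to measure accuracy and systematic bias.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
domains = ["example-broadsheet.com", "example-tabloid.net"]  # placeholder domains

for domain in domains:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        temperature=0,
        messages=[{
            "role": "user",
            "content": (
                f"Rate the credibility of the news source {domain} on a scale "
                "from 0 (not credible) to 100 (highly credible). "
                "Reply with a single number."
            ),
        }],
    )
    print(domain, resp.choices[0].message.content.strip())
```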

 
"Misinformation on WhatsApp: Insights from a large data donation program", by Kiran Garimella

This research presents the first comprehensive analysis of problematic content circulation on WhatsApp, focusing on private group messages during the national election in India. Through a large-scale data donation program, we obtained a representative sample of users from Uttar Pradesh, India's most populous state with over 200 million inhabitants. This extensive dataset allowed us to examine the prevalence of misinformation, political propaganda, AI-generated content, and hate speech across thousands of users. Our findings reveal a significant presence of political content, with two concerning trends emerging: widespread circulation of previously debunked misinformation and targeted hate speech against Muslim communities. While AI-generated content was minimal, the persistence of debunked misinformation suggests serious limitations in the reach and effectiveness of fact-checking efforts within these private groups. This study makes several key contributions. First, it provides unprecedented quantitative insights into everyday WhatsApp usage patterns and content sharing behaviors. Second, it highlights unique challenges in moderating end-to-end encrypted platforms. Third, it introduces innovative data donation methodologies and tools for collecting representative samples from traditionally inaccessible platforms. The implications of our research extend beyond WhatsApp, offering valuable insights for developing effective content moderation policies across encrypted communication channels. Our data collection approach can be adapted for studying other platforms, which is particularly crucial in an environment where API access is increasingly restricted.

 
"Anticipating AI’s Impact on Future Information Ecosystems", by Nick Diakopoulos

The media and information ecosystems that we all inhabit are rapidly evolving in light of new applications of (generative) AI. Drawing on hundreds of written scenarios envisioning the future of media and information around the world in light of these technological changes, in this talk I will first scope out the space of anticipated impact. From highly personalized content and political manipulation, to efficiencies in content production, new experiences for consumers, and evolving jobs and ethics for professionals, there is a wide array of implications for individuals, organizations, the media system, and society more broadly. From this overview I will then elaborate in more depth on a few areas where we are designing sociotechnical applications to advance benefits, simulating policy interventions to mitigate risks, and empirically studying the evolution and shape of media as it adjusts to generative AI technologies.

 

 

Logistics

When?

The workshop will take place on April 14-15, 2025.

Where?

Room 500 at the Annenberg School for Communication (please use the Walnut Street entrance, where you will be directed to the room).

Travel and Accommodation

We have reserved rooms at The Inn at Penn, across the street from Annenberg. If you need assistance with your travel arrangements, please contact Luisa Jacobsen.

 
