Summaries of Student Presentations / Summer Activities

Many of our Delta Lab students presented their research in May 2021 at the Wesleyan Media Project’s Political Advertising Workshop. Others conducted research over the summer and presented posters. Summaries of these projects are shared below.


Who’s Watching? A Tone Analysis of Geographically Targeted Senate Facebook Ads

By Noah Cohen ’22

U.S. Senate campaigns have become increasingly expensive and increasingly national, and, with social media enabling candidates to target advertisements to specific audiences based on geography, candidates have begun targeting ads to voters outside their home states. Given the contentiousness and hostility that pervade modern American political advertising, there is concern that candidates soliciting funds from out-of-state voters (who cannot vote for or against them) face less risk of losing votes over a distasteful ad, and therefore may have a stronger incentive to run negative ads. 

I examined whether the tone of out-of-state-targeted Senate ads differs from that of other Senate ads. I analyzed this relationship using two different classifiers modeling two distinct measures of tone: 1) a reference-based model (based on the standard political science tone paradigm) and 2) a sentiment-based model trained on human coders’ subjective judgments of various ads’ positivity or negativity. I found that out-of-state-targeted ads are less likely to be classified as “attack” ads by the reference-based model; however, they are also more likely to be classified as “negative” by the sentiment model. This contradictory result suggests a need to assess more deeply the relationship between the normative political science definition of tone and subjective human experiences of tone, and to incorporate the nuances of this relationship into future political communication research.
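
As a rough illustration of the second approach, a sentiment-style classifier trained on human-coded labels might be sketched as follows; the ad texts, labels, and scikit-learn pipeline here are hypothetical stand-ins, not the models actually used in this study:

```python
# Minimal sketch of a sentiment-based tone classifier trained on
# human-coded labels. All training examples below are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical ad creative text with human-coded tone labels.
ads = [
    "My opponent voted to raise your taxes again and again.",
    "Together we can build a brighter future for our families.",
    "He sold out our state to special interests.",
    "Proud to fight for affordable health care for every family.",
]
labels = ["negative", "positive", "negative", "positive"]

# TF-IDF text features feeding a simple linear classifier.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(ads, labels)

print(model.predict(["She will fight for working families."]))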


Social Networking: An Exploration of References in Senate Candidates’ Political Advertising

By Julia Crainic ’23

My project looks at references made by Senate candidates in their political advertising, which I visualized as a network. My guiding question centers on how candidates interact not only with their direct opponents but also with the other Senate and presidential candidates they reference.

The candidates on the outer circle of the network mentioned Biden or Trump in their Facebook ads during this period, while those outside the circle did not. One of the major visual takeaways is that Trump’s dot is much larger than Biden’s, which makes sense given that Trump was the incumbent in the presidential race. Also notable is the size of McConnell’s dot: it is possible that the then-incumbent Senate Majority Leader was bearing blame broadly for the Republican Party.
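
As an illustrative sketch of how such a reference network can be constructed, the snippet below builds a directed graph with networkx, using an invented edge list and node sizes scaled by how often each politician is referenced:

```python
# Illustrative reference network: a directed edge from a Senate
# candidate to each politician they mention, with node size scaled by
# in-degree (how often a politician is referenced). Edges are invented.
import networkx as nx
import matplotlib.pyplot as plt

mentions = [
    ("Candidate A", "Trump"),
    ("Candidate B", "Trump"),
    ("Candidate B", "Biden"),
    ("Candidate C", "McConnell"),
    ("Candidate C", "Trump"),
]

G = nx.DiGraph()
G.add_edges_from(mentions)

# Frequently referenced politicians get larger dots.
sizes = [300 + 600 * G.in_degree(n) for n in G.nodes()]

nx.draw(G, with_labels=True, node_size=sizes, node_color="lightsteelblue")
plt.show()
```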

These are just preliminary findings, but one extension of this project that I am interested in pursuing is examining more specifically how Senate candidates reference politicians from their own party and the opposing party. Instead of making a general assumption by party, I would look at the ad content itself; one avenue for that would be ad tone.


BLM in the Battleground: An Analysis of Racial Justice-Related Election Ads in Georgia

By Brianna Mebane ’22

My research analyzed ads run by politicians in the state of Georgia, which played a pivotal role in the 2020 election season by both electing Joe Biden as president and shifting control of the Senate from Republicans to Democrats. More specifically, I analyzed ads run by the U.S. presidential (Joe Biden & Donald Trump) and Georgia senatorial candidates (Rev. Raphael Warnock, Kelly Loeffler, Jon Ossoff, and David Perdue) during the fall of 2020 to discern how they addressed issues of race through digital advertising.

For this study, I applied structural topic modeling (STM) and keyword searches to the creative text of more than 90,000 distinct Facebook ads. The STM model grouped words from the ads’ creative text into clusters based on the topics most frequently discussed within the ads, and the keyword search covered terms of interest including “black lives matter”, “racism”, and “racial justice”.
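
As a rough sketch of the keyword-search step only (the STM itself is typically fit with dedicated tooling, such as the stm package in R, and is not shown here), a Python version with invented ad texts might look like this:

```python
# Minimal keyword search over ad creative text. The ad texts are
# invented and the term list is abbreviated from the terms named above;
# the study's actual implementation may differ.
import re

TERMS = ["black lives matter", "racism", "racial justice", "antifa", "kkk"]
pattern = re.compile("|".join(re.escape(t) for t in TERMS), re.IGNORECASE)

ads = [
    "We must confront racism and deliver racial justice for every Georgian.",
    "Chip in $5 before tonight's fundraising deadline!",
    "Antifa and other violent groups must be held accountable.",
]

# Keep only ads whose creative text matches at least one term of interest.
hits = [(i, pattern.findall(text)) for i, text in enumerate(ads) if pattern.search(text)]
print(hits)  # [(0, ['racism', 'racial justice']), (2, ['Antifa'])]
```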

After the structural topic model detected the use of the terms “antifa” and “kkk” within the dataset, a manual keyword search revealed that only Republican politicians (i.e., Trump, Loeffler & Perdue) mentioned groups like the anti-fascist, anti-racist Antifa movement and the Ku Klux Klan (KKK) in their ads, in both cases disparaging the groups as terrorist organizations. Conversely, Democratic candidates (Biden, Warnock & Ossoff) were more likely to mention Black Lives Matter and race-related issues like Black voting rights to establish and convey solidarity with Black constituents in Georgia.

Since most of the ads in my dataset were requests for campaign donations that did not engage with developments from that time period, I hope to widen the timeframe of interest to capture ads that ran closer to the racial justice protests that occurred earlier in 2020. More broadly, I would also like to analyze ads by politicians who campaigned in other battleground states during the 2020 election to further gauge the conversation surrounding racial justice in these politically pivotal locations.


Candidate Name Recognition

By Roshaan Siddiqui ’22 and Oliver Diamond ’23

When conducting analysis on political ads, we are often interested in whether a candidate’s opponent or any particular politician is mentioned in an advertisement. However, speech-to-text software such as the Google Speech-to-Text API often fails to correctly transcribe candidate names from an advertisement’s audio, especially for lesser-known candidates. When this occurs, names are typically transcribed as a phonetically similar word; for example, the name “Amy” could be mistranscribed as “I’m.” Our task is to construct a machine learning model capable of detecting names directly from the audio of an advertisement in cases where we suspect a name has been mistranscribed as another word. We plan to accomplish this using a convolutional neural network that detects patterns in a spectrogram of the audio data. 
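
As an illustration of this plan, a small PyTorch network over a spectrogram might be sketched as below; the architecture, input shapes, and random stand-in spectrogram are assumptions for illustration, not the model the project will ultimately use:

```python
# Sketch of a small CNN that takes a (mel) spectrogram of an audio
# segment and scores it against a set of candidate names. Shapes and
# layer sizes are illustrative guesses.
import torch
import torch.nn as nn

class NameDetector(nn.Module):
    def __init__(self, n_names: int):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_names)
        )

    def forward(self, spec: torch.Tensor) -> torch.Tensor:
        # spec: (batch, 1, freq_bins, time_frames)
        return self.head(self.conv(spec))

# Random stand-in for a 64-bin mel spectrogram of a short audio segment.
spec = torch.randn(1, 1, 64, 100)
logits = NameDetector(n_names=10)(spec)  # one logit per tracked name
```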

The project presents two main challenges. The first is that the model described above must be capable of detecting particular names regardless of the speaker: the patterns in the audio features that the model uses to make predictions must be consistent across multiple speakers. The second is that the main use case for the model is uncommon names (we have observed that these are mistranscribed more frequently than the names of prominent politicians such as Joe Biden and Donald Trump); however, it naturally follows that we have the least training data for precisely those names of interest.

Our dataset includes audio files from political advertisements and their respective Google Speech-to-Text transcriptions with word level timestamps. Our methodology for locating mistranscribed names is the following:

  1. Search the dataset using a text similarity matching algorithm to identify words that are potentially mistranscribed names, i.e., words phonetically similar to any of a defined set of names our model has been trained to identify (a sketch of this step appears after the list).
  2. Run the model on the audio segment associated with the identified word and determine if it is actually a name.
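
One possible implementation of step 1 is sketched below, assuming phonetic codes (here via the jellyfish library’s metaphone) compared with a small edit-distance threshold; the tracked names and transcript are invented:

```python
# Flag transcript words whose phonetic encoding is close to a tracked
# candidate name. This is one possible approach, not necessarily the
# matching algorithm the project uses.
import re
import jellyfish

TRACKED_NAMES = ["amy", "raphael", "thom"]

def phonetic_key(word: str) -> str:
    # Strip apostrophes/punctuation so "I'm" encodes like "im".
    return jellyfish.metaphone(re.sub(r"[^a-z]", "", word.lower()))

NAME_KEYS = {name: phonetic_key(name) for name in TRACKED_NAMES}

def candidate_mistranscriptions(transcript_words, max_dist=1):
    """Yield (word, name) pairs where the word may be a mistranscribed name."""
    for word in transcript_words:
        key = phonetic_key(word)
        for name, name_key in NAME_KEYS.items():
            if jellyfish.levenshtein_distance(key, name_key) <= max_dist:
                yield word, name

# "I'm" encodes close to "Amy", so it is flagged for the audio model to check.
print(list(candidate_mistranscriptions(["I'm", "voting", "for", "change"])))
```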

Using this approach, we seek to reduce errors resulting from mistranscription in our automated coding of politicians named in political advertisements.


Entity Linking for Political Facebook Ads in the 2020 U.S. Election

By Natalie Appel ’23, Dale Ross ’22, Lexie Silverman ’23, Latonya Smith ’24

Our student group collaborated on a handful of projects this summer related to the Delta Lab’s larger goal of creating a successful entity linking algorithm that can recognize, read, and classify political Facebook ads from the 2020 U.S. election. Our first project involved developing and testing a number of classifier models focused on varying elements of an ad. For instance, Random Forest and SVM models were trained on an ad’s main text to classify ads by their content and sentiment. A DistilBERT neural network model was also used to predict the intended goal of an ad and was trained on either the ad’s main text or a combination of the ad’s audio transcription and its text. For a more detailed explanation of these models, see our poster from the Summer Research Poster Session.
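
As a rough illustration of the classical text classifiers (the DistilBERT model is not shown, and the example ads, goal labels, and scikit-learn setup are invented stand-ins rather than our actual training data):

```python
# TF-IDF features from an ad's main text feeding a Random Forest and an
# SVM. Training examples are invented for illustration.
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

ads = [
    "Donate $10 today to keep our campaign on the air!",
    "My plan will lower prescription drug costs for seniors.",
    "Rush a contribution before the FEC deadline.",
    "I will protect Social Security and Medicare.",
]
goals = ["fundraising", "persuasion", "fundraising", "persuasion"]

for clf in (RandomForestClassifier(n_estimators=100), LinearSVC()):
    model = make_pipeline(TfidfVectorizer(), clf)
    model.fit(ads, goals)
    print(type(clf).__name__, model.predict(["Chip in $5 now!"]))
```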

In addition to these classifiers, we looked at several collections of ads and analyzed their characteristics using different data analysis methods. For example, we used text similarity analysis to investigate ads whose entity’s known party affiliation or ideology did not match the value predicted by the classifier. By comparing text from ads with known entities to the text of the target ad, we were able to identify some mistakes and shortcomings in our classifier and improve the model. Similarly, we attempted to discern whether the funding entity of an ad was more likely to be a candidate or a group. To do so, we employed a weighted keyword search and designed a scoring algorithm to rank sponsors by their likelihood of being either a candidate or a group. We first looked at the collection of ads that mentioned Biden and/or Trump, and then repeated the process for ads that mentioned neither. From this we were able to distinguish which presidential candidate each sponsor supported and infer their party affiliation.
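
A toy version of such a weighted keyword scoring scheme is sketched below; the keywords, weights, and sponsor names are invented for illustration, and the actual algorithm was developed and tuned against the lab’s data:

```python
# Rank sponsors as likely candidates vs. groups via weighted keywords.
# A real implementation would tokenize rather than substring-match.
CANDIDATE_KEYWORDS = {"for senate": 2.0, "for congress": 2.0, "elect": 1.0}
GROUP_KEYWORDS = {"pac": 2.0, "committee": 1.5, "coalition": 1.0, "fund": 1.0}

def sponsor_score(name: str) -> float:
    """Positive scores lean 'candidate'; negative scores lean 'group'."""
    text = name.lower()
    score = sum(w for kw, w in CANDIDATE_KEYWORDS.items() if kw in text)
    score -= sum(w for kw, w in GROUP_KEYWORDS.items() if kw in text)
    return score

sponsors = ["Jane Doe for Senate", "Liberty Action PAC", "Voters Coalition Fund"]
for s in sorted(sponsors, key=sponsor_score, reverse=True):
    print(f"{sponsor_score(s):+.1f}  {s}")
```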

While our work this summer contributed to the broader goal of a fully functioning entity-linking algorithm, there is still much work to do in order to accomplish this large task. Future work with the Delta Lab will hopefully continue to improve the models we worked on and build upon this progress.