Research

My research investigates how computational methods can enhance our understanding of cultural phenomena, with particular focus on developing novel approaches for analyzing media, texts, and visual culture at scale. I examine these methods critically, questioning their inherent biases and limitations, especially regarding race, gender, and class. My work spans four interconnected areas addressing contemporary challenges in information science.

computer vision

Computer Vision and Image Clustering

Exploring how computer vision and machine learning techniques can transform our understanding of visual culture and media history. I am developing tools and methods to make computational image analysis accessible to humanities scholars.

Tinylens — An R package for image-first analysis targeting digital humanities and film studies. Provides a tidy, pipeable API for extracting color, texture, composition, and video features. Includes shot detection, visual complexity metrics, and optional LLM vision integration.
ImageSpace — A modern image visualization tool using CLIP embeddings and dimensionality reduction (UMAP/t-SNE). Features 8 visualization modes including 2D/3D scatter, timeline, hotspots, and similarity search. Generates static HTML output deployable to GitHub Pages.
National Photo Company Archive — Analyzing ~35,000 photographs documenting early 20th-century Washington, D.C. Using Microsoft's Florence-2 multimodal model and BERTopic modeling to identify 527 thematic clusters, uncovering gender disparities and hierarchies of political visibility in Progressive Era America.
Filmic Flow — Exploring how objects and materials within film frames form networks of interactions. Early work published in the Journal of Open Humanities Data examines Hitchcock's films as memetic images.
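The similarity search mentioned for ImageSpace can be illustrated with a small sketch. It assumes each image has already been reduced to an embedding vector and ranks the other images by cosine similarity to a query image. The file names and 4-dimensional vectors are hypothetical toy values (real CLIP embeddings have hundreds of dimensions), and the function names are illustrative only, not ImageSpace's actual API.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query_id, embeddings, top_k=3):
    """Rank every other image by cosine similarity to the query image."""
    query = embeddings[query_id]
    scored = [
        (other_id, cosine_similarity(query, vector))
        for other_id, vector in embeddings.items()
        if other_id != query_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings standing in for real image embeddings.
toy_embeddings = {
    "parade.jpg": [0.9, 0.1, 0.0, 0.2],
    "march.jpg": [0.8, 0.2, 0.1, 0.3],
    "portrait.jpg": [0.1, 0.9, 0.3, 0.0],
    "street.jpg": [0.5, 0.4, 0.2, 0.1],
}

results = most_similar("parade.jpg", toy_embeddings, top_k=2)
for image_id, score in results:
    print(f"{image_id}: {score:.3f}")
```

With a large archive, a brute-force scan like this would typically be replaced by an approximate nearest-neighbor index, but the ranking principle is the same.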
machine learning

Machine Learning and Statistical Methodologies in the Humanities

Examining how machine learning techniques can transform cultural scholarship, providing new ways to understand patterns and phenomena at unprecedented scales.

Cultural Analytics in R: A Tidy Approach

Published by Springer (2025). A framework for managing large-scale cultural data using network analysis, multivariate regression, natural language processing, and neural networks. Datasets intentionally focus on marginalized histories: gender and racial representation in art history textbooks, early Black film networks, American empire and cinema, and domestic space in nineteenth-century literature.

SteamTags — Network analysis of Steam's user-generated tagging folksonomy using backbone extraction methods. Examines how users collectively organize and navigate the platform's catalog, revealing patterns in digital genre formation and platform taxonomy. An analysis of ~100,000 games shows a split between gaming and software communities, with "Indie" as the primary bridge tag.
Published "Minimal Research Compendiums" in the International Journal of Digital Humanities, an article that addresses gaps in statistical training in the humanities and proposes approaches emphasizing reproducibility and transparency.
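To make "backbone extraction" concrete, here is a minimal sketch of one widely used method, the disparity filter, which keeps an edge only when its weight is unexpectedly large relative to the other edges at one of its endpoints. The text above does not say which backbone method SteamTags uses, so the choice of the disparity filter is an assumption, and the tag co-occurrence weights below are hypothetical.

```python
from collections import defaultdict


def disparity_backbone(edges, alpha=0.05):
    """Keep edges that are statistically significant for at least one endpoint.

    edges: iterable of (node_u, node_v, weight) for an undirected graph.
    alpha: significance threshold; smaller values give a sparser backbone.
    """
    edges = list(edges)  # allow generators, since we iterate twice
    strength = defaultdict(float)  # total edge weight at each node
    degree = defaultdict(int)      # number of edges at each node
    for u, v, w in edges:
        for node in (u, v):
            strength[node] += w
            degree[node] += 1

    def p_value(node, w):
        # Probability that a node spreading its strength at random over
        # its k edges would give one edge a weight share this large.
        k = degree[node]
        if k < 2:
            return 1.0  # a single edge cannot be judged against others
        return (1.0 - w / strength[node]) ** (k - 1)

    return [
        (u, v, w)
        for u, v, w in edges
        if min(p_value(u, w), p_value(v, w)) < alpha
    ]


# Hypothetical tag co-occurrence counts (how often two tags share a game).
toy_edges = [
    ("Indie", "Action", 100),
    ("Indie", "Utilities", 1),
    ("Indie", "Puzzle", 1),
    ("Indie", "Racing", 1),
    ("Action", "Utilities", 1),
]

backbone = disparity_backbone(toy_edges, alpha=0.05)
print(backbone)
```

In this toy network only the heavily weighted edge between "Indie" and "Action" survives, while the weak links are pruned.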
media history

History and Culture of Digital Media

My second monograph, The Computer Goes Home: A Failed Revolution, explores the domestication of computers in America during the 1970s and 1980s. This project blends traditional historical methods with computational techniques, combining archival research and data mining.

Unlike journalistic accounts focusing on Silicon Valley "visionaries," I utilize newspaper coverage, hobbyist magazines, popular media, and advertisements to uncover overlooked voices. Through natural language processing and topic modeling, I identify patterns across thousands of primary sources to reveal hidden stories about computing's social impact and how the device helped establish a media infrastructure responding to the political radicalism of the 1960s.

geospatial + llm

GIS and Large Language Models

Using geospatial analysis and large language models to reveal systemic inequalities and challenge entrenched historical narratives. Integrating critical theory with digital tools for social justice research while acknowledging their limitations and potential biases.

"Are We There Yet?" — First Runner-up, 2021 Digital Humanities Award for Best Exploration of DH Failure/Limitations. Combines computational linguistics, geocoding, and data journalism to demonstrate how academic conferences serve as hubs of power concentration and exclusion.
Police Violence Mapping — Developing methods that use hierarchical hexagonal spatial indexing (H3) to analyze police violence against African Americans, enabling like-for-like comparisons across grid cells of comparable size and uncovering previously hidden patterns of systemic racism.
Deep Mapping the WPA Slave Narratives — Using large language models to uncover hidden geographical details in historical stories, revealing patterns of movement and displacement among formerly enslaved people and challenging "Lost Cause" mythology.
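The idea behind hexagonal indexing in the police violence mapping project can be sketched by assigning each point to a hexagonal cell and counting events per cell. The example below is a simplified stand-in, not H3: it uses a flat, single-resolution hexagon grid over already-projected planar coordinates, whereas H3 indexes the globe at multiple resolutions. The coordinates and cell size are hypothetical.

```python
import math
from collections import Counter


def hex_cell(x, y, size=1.0):
    """Return the axial (q, r) coordinates of the pointy-top hexagon
    containing the planar point (x, y); size is the center-to-corner distance."""
    # Convert the point to fractional axial coordinates.
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size

    # Round in cube coordinates (cx + cy + cz == 0) to the nearest hexagon.
    cx, cz = q, r
    cy = -cx - cz
    rx, ry, rz = round(cx), round(cy), round(cz)
    dx, dy, dz = abs(rx - cx), abs(ry - cy), abs(rz - cz)

    # Recompute the component with the largest rounding error so the
    # three components still sum to zero.
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return (rx, rz)


def count_by_cell(points, size=1.0):
    """Count how many points fall into each hexagonal cell."""
    return Counter(hex_cell(x, y, size) for x, y in points)


# Hypothetical event locations in projected units (for example, kilometers).
toy_points = [
    (0.0, 0.0),
    (0.1, 0.1),
    (1.7, 0.1),
    (1.732, 0.0),
    (0.8, 1.5),
]

counts = count_by_cell(toy_points, size=1.0)
for cell, n in sorted(counts.items()):
    print(cell, n)
```

Because every cell in such a grid has the same shape and area, counts can be compared directly between cells, which is harder with irregular units such as counties or precincts.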