Historical Inquiry in Digital Spaces
- Sarah Whitwell

- Aug 14
Using Relational Databases and Data Visualizations To Advance the Study of Racialized Violence in the Late Antebellum and Postemancipation South
Writer: Sarah Whitwell, PhD
You are about to embark on an exciting new project. You have read the secondary literature, identified your research question(s), and now it is time to enter the archive. In the past, you have faced challenges when your historical subjects are not well represented. But this time, you have a different kind of challenge: you have to navigate thousands of documents that all directly relate to your exciting new project. How do you make sense of it all? How do you make sure that, in reading these documents, you do not lose sight of the bigger picture? This is the challenge I faced when studying racialized violence in the late antebellum and postemancipation South [1].
My research reconstructs how the newly freed Black population experienced racialized violence in the United States during the transition from slavery to freedom and in the decade immediately following emancipation. I analyze primary source collections that chronicle this transitional period between slavery and freedom: the records of the Bureau of Refugees, Freedmen, and Abandoned Lands (1865-1872); the first-person testimony recorded by the Joint Select Committee to Inquire into the Condition of Affairs in the Late Insurrectionary States (1872); and the interviews with formerly enslaved people compiled in the Slave Narrative Collection of the Federal Writers’ Project (1936-1938). Even after narrowing the parameters of my work to focus on Georgia, Mississippi, South Carolina, and Texas, I found myself staring at thousands of pages of documents [2]. To manage my sources effectively and, more importantly, to understand the ways in which Black men and women resisted racialized violence, I employed digital humanities techniques to support and enhance my research.
With so much data at my fingertips, the concern was that I might fail to see the forest for the trees. The traditional close-reading methodology that I had been trained to use, while still critically important, felt inadequate for identifying broader thematic trends. So, I learned how to create, manage, and query my own relational database. While still engaging in a close reading of each primary source, I extracted data on individual incidents of violence – the victims and perpetrators, geographic locations, forms of violence, methods of resistance, and more – to better identify thematic trends. I wanted to see any relationships between specific forms of violence and the methods of resistance employed in response. For example, was physical violence more likely to be met with physical resistance? To do this, I needed to put my sources in conversation with one another.
For a previous project on Black resistance to lynching in the postemancipation South, I created a rudimentary database of interviews from the Slave Narrative Collection using Microsoft Excel [3]. This database, composed of independent and unrelated tables, served little purpose beyond its record-keeping functions; it had limited ability to identify relationships across multiple documents because each document was entered without consideration for those around it. Therefore, any analysis had to be done manually. With only a few hundred documents, this was doable. A relational database, however, is more complex and powerful. Composed of multiple interconnected tables, a relational database can identify relationships across multiple tables by matching common data. In other words, a relational database can identify patterns, relationships, and connections between documents. It is perfect for managing a large body of sources.
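To make "matching common data" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and rows are hypothetical placeholders invented for illustration; they are not the schema or the data of my actual database.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the actual design of the research database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE documents (
    doc_id     INTEGER PRIMARY KEY,
    collection TEXT NOT NULL
);
CREATE TABLE incidents (
    incident_id INTEGER PRIMARY KEY,
    doc_id      INTEGER NOT NULL REFERENCES documents(doc_id),
    state       TEXT NOT NULL,
    violence    TEXT NOT NULL,
    resistance  TEXT NOT NULL
);
""")

# Placeholder rows invented for the example; they are not research data.
conn.executemany("INSERT INTO documents VALUES (?, ?)", [
    (1, "Slave Narrative Collection"),
    (2, "Freedmen's Bureau"),
])
conn.executemany("INSERT INTO incidents VALUES (?, ?, ?, ?, ?)", [
    (1, 1, "Mississippi", "Lynching", "White Guardianship"),
    (2, 1, "Mississippi", "Lynching", "Physical Retaliation/Self Defense"),
    (3, 2, "Georgia", "Physical Assault", "Legal"),
])

# The JOIN matches the shared doc_id column, which is what lets a single
# query draw on both tables at once.
rows = conn.execute("""
    SELECT d.collection, COUNT(*) AS incident_count
    FROM incidents AS i
    JOIN documents AS d ON d.doc_id = i.doc_id
    GROUP BY d.collection
    ORDER BY d.collection
""").fetchall()
print(rows)
```

Because each incident row carries the identifier of the document it came from, one document can contribute several incidents without its bibliographic details being retyped, which is the step a set of unrelated spreadsheet tables cannot do on its own.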
Although there is no substitute for the traditional close-reading methodology when considering the lived experiences of marginalized peoples, a relational database can serve as a valuable methodological tool to analyze primary source documents. Because I extracted multiple points of data on incidents of racialized violence in the postemancipation South, I was able to query the database to reveal a number of patterns, relationships, and connections. For example, I could query my database to return all incidents where violence was perpetrated by the Ku Klux Klan. These results could then be further refined on the basis of geography, the type of resistance employed in response, or even by the primary source collection where the testimony originated, offering insight into how violence unfolded on the ground and how Black communities responded.
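A query of this kind might look like the following sketch, again using Python's sqlite3 module. The incidents table, its columns, the find_incidents helper, and the sample rows are all assumptions made up for the example rather than a record of my own queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical single-table layout; column names are illustrative only.
conn.execute("""
    CREATE TABLE incidents (
        incident_id INTEGER PRIMARY KEY,
        perpetrator TEXT,
        state       TEXT,
        resistance  TEXT,
        collection  TEXT
    )
""")
# Placeholder rows invented for the example; they are not research data.
conn.executemany("INSERT INTO incidents VALUES (?, ?, ?, ?, ?)", [
    (1, "Ku Klux Klan", "Georgia", "Testimony", "Ku Klux Klan Hearings"),
    (2, "Ku Klux Klan", "Mississippi", "Flight", "Slave Narrative Collection"),
    (3, "Employer", "Georgia", "Legal", "Freedmen's Bureau"),
])

ALLOWED_FILTERS = {"state", "resistance", "collection"}


def find_incidents(conn, perpetrator, **filters):
    """Return incident ids for one perpetrator, optionally narrowed further."""
    sql = "SELECT incident_id FROM incidents WHERE perpetrator = ?"
    params = [perpetrator]
    for column, value in filters.items():
        # Only known column names are interpolated; values stay parameterized.
        if column not in ALLOWED_FILTERS:
            raise ValueError(f"unknown filter: {column}")
        sql += f" AND {column} = ?"
        params.append(value)
    sql += " ORDER BY incident_id"
    return [row[0] for row in conn.execute(sql, params)]


klan = find_incidents(conn, "Ku Klux Klan")
klan_georgia = find_incidents(conn, "Ku Klux Klan", state="Georgia")
print(klan, klan_georgia)
```

Each additional keyword argument adds one more condition to the WHERE clause, mirroring the way a broad result set can be narrowed step by step by geography, resistance type, or source collection.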
A database can be used to ask a wide variety of questions [4]. With numerous data points on individual incidents of violence – the victims and perpetrators, geographic locations, forms of violence, methods of resistance – it is limited, to some extent, only by the questions that the user can invent. Beyond its ability to answer questions, a database is a useful methodological tool because it encourages specificity. Indeed, all decisions must be documented and justified [5]. To create the schema for my database, I had to make important decisions about what data to extract from my archival documents. Some data – bibliographic information, dates, geographic locations – do not require significant forethought. Other data, however, require clearly defined keywords and a rigid workflow. When inputting data on incidents of racialized violence described in my primary sources, for example, I had to decide how to code types of violence. What types of violence should be included? How would I define those types of violence? How would I handle situations where certain types of violence overlap? These are not always easy questions to answer, but in trying to answer them I found that I needed to develop clear definitions to underpin my research. In many ways, then, it was creating a database that led me to challenge how scholars have traditionally talked about violence and resistance.
As scholars, we regularly make decisions regarding what sources to include, what geographic regions to sample, and what information to highlight. This mediation, however, is often not transparent. The creation of a relational database, in many ways, ameliorates this problem of transparency. It is not possible to create a successful database without documenting all rationale. To create my database, I had to think critically about how I understand violence and resistance. What criteria, for example, must be met for an incident to qualify for inclusion in the database?
Simply defining violence and resistance was insufficient. To capture how Black men and women experienced and responded to racialized violence in the postemancipation South, it was also necessary to delineate a list of keywords to identify different types of violence (e.g. physical assault, sexual harassment, verbal abuse) and different methods of resistance (e.g. discursive insubordination, migration, self-defense).
I created a typology of violence intended to represent a wide variety of incidents:
Nightriding – nocturnal acts of violence committed by disguised men
Lynching – acts committed by a group of two or more individuals which deprive any person of his/her life without regard to law in the service of justice, tradition, or race
Verbal Abuse – acts of forceful criticism, insults, or denunciation
General – non-specific references to violence
Deprivation/Neglect – acts that deny an individual or group their rights/freedoms
Slavery – acts of violence committed during slavery (prior to emancipation) that have a lingering effect on the individual
Rioting – acts of public disturbance committed by a crowd
Destruction of Property – acts that damage or destroy property committed by someone who is not the owner
Sexual Assault – acts of unwanted sexual contact
Physical Assault – acts resulting in physical harm
Intimidation – acts intended to frighten or coerce the victim without causing physical trauma
Silencing – acts that limit, alter, or distort the personal recollections of an individual
Confinement – acts that restrict an individual within certain limits of space
Humiliation – acts intended to shame or embarrass the victim, or to reduce the victim to a lower position in society
Similarly, I created a typology of resistance:
Occupation – the physical occupation of space (a form of protest)
Physical Retaliation/Self Defense – the defense of one’s person or interests through the use of physical force (sometimes with weapons)
Boycott – the refusal to buy a product or take part in an activity as a way of expressing disapproval
White Guardianship – the reliance on white people to ensure safety from violence
Migration – the movement of people to a new area in order to escape violence (permanent)
Discursive Insubordination – the expression of discontent through verbal confrontations (insults, humour, music, taunts)
Theft – the theft of another’s property as a way to retaliate or compensate for acts of violence
Congregation – the gathering of people for support against oppression (often in a religious context)
Sabotage – the deliberate destruction or obstruction of something as a way to undermine efforts at subjugation
Legal – the utilization of government officials (municipal, state, federal) to halt violence or to seek redress for violence
Testimony – the act of giving a written or formal statement on racialized violence and its impact
Burial Rites – the reclamation of deceased victims of violence for the purpose of ensuring proper burial
Education (Racial Uplift) – the advancement of Black rights through education (either formal or informal)
Flight – the movement of people away from a place or situation of danger (temporary)
Mischief/Pranks – the act of causing the perpetrator of violence to become the subject of humiliation or mockery
Protection – the protection of a targeted victim from their attacker
Protest – the physical or verbal rejection of an act of violence
Voting – the act of casting a ballot in spite of efforts to prevent political participation
Isolation – the refusal to interact with another group for self-preservation
Investigation – the informal investigation of the perpetrator(s) of an act of violence with the goal of bringing them to justice
Arson – the act of deliberately setting fire to property as a means of retaliation for acts of violence
These definitions ensured that I coded my data consistently. If the Ku Klux Klan raided the house of a Black politician at night, I knew to code the violence as nightriding. If a Black woman lied about the whereabouts of her husband when confronted by the Ku Klux Klan, I knew to code the resistance as protection. The significance of these definitions can be seen across my work, as my typologies of violence and resistance irrevocably shaped how I understood the ways in which Black communities experienced and responded to racialized violence.
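One way to enforce this kind of consistency at the point of data entry is to check every label against the fixed keyword list. The sketch below uses the violence keywords from the typology above; the code_violence function and its normalization rules are hypothetical and stand in for whatever validation a given workflow uses.

```python
# The keyword list comes from the typology of violence; the function name
# and the whitespace/case normalization are assumptions for illustration.
VIOLENCE_KEYWORDS = frozenset({
    "Nightriding", "Lynching", "Verbal Abuse", "General",
    "Deprivation/Neglect", "Slavery", "Rioting", "Destruction of Property",
    "Sexual Assault", "Physical Assault", "Intimidation", "Silencing",
    "Confinement", "Humiliation",
})

# Map a case-insensitive form of each keyword back to its canonical spelling.
_CANONICAL = {keyword.casefold(): keyword for keyword in VIOLENCE_KEYWORDS}


def code_violence(label: str) -> str:
    """Return the canonical keyword, or raise if the label is not defined."""
    key = " ".join(label.split()).casefold()  # collapse stray spaces, ignore case
    if key not in _CANONICAL:
        raise ValueError(f"{label!r} is not a defined type of violence")
    return _CANONICAL[key]


print(code_violence("  nightriding "))
```

Rejecting an undefined label outright, rather than storing it silently, forces a decision: either the incident fits an existing definition or the typology itself needs to be revised and documented.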
As a methodological tool, relational databases open a new world of possibilities for historical inquiry that operates in tandem with a traditional close-reading methodology. I still read each source, analyzing language and structure to understand its meaning. But when the data is inputted into a relational database, it becomes searchable. It can be (re)organized to understand and clarify the relationships between specific types of violence and the methods of resistance employed in response. Still, a database alone is perhaps not the most useful if we want our work to extend beyond the ivory tower of academia. The database that I created, while critical to my work, is virtually unreadable for those not familiar with Structured Query Language (SQL), the standard programming language used to communicate with relational databases. There is a technological barrier that, in my opinion, is the main drawback to utilizing a database for historical inquiry. But it is not an insurmountable problem, and perhaps no more challenging than shifting from a jargon-laden article for academics to a plain language text for the public. To address this, I opted to create data visualizations that communicate key information from my relational database and make its patterns visible to broader audiences.
There are lots of different types of data visualizations – graphs, heatmaps, charts – that can share information in engaging and unique ways. Because I am most interested in the relationship between types of violence and methods of resistance, I created a series of visualizations using Gephi, a network analysis and visualization software package that maps the relationships between people, places, and ideas. In keeping with the theory of social network analysis, each node represents a piece of data, and those nodes can be linked to other nodes based on the relationships between them [6]. For those who need a little help visualizing what I am describing here, take a look at Figure 1. This network shows every incident of violence and resistance described in the records of the Freedmen’s Bureau, the Ku Klux Klan Hearings, and the Slave Narrative Collection from Georgia, Mississippi, South Carolina, and Texas. There are 2,780 unique incidents culled from 1,497 documents [7]. The orange nodes represent the keywords used to denote violence, while the blue nodes represent the keywords used to denote resistance [8].
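For readers curious how coded incidents become a network, the sketch below turns violence and resistance pairs into a weighted edge list in CSV form, using the Source, Target, and Weight column headers that Gephi's spreadsheet import recognizes. The pairs are invented placeholders, and this is only one plausible route from a database to Gephi, not a record of the exact steps I followed.

```python
import csv
import io
from collections import Counter

# Placeholder (violence, resistance) pairs invented for the example;
# they are not research data.
coded_incidents = [
    ("Nightriding", "Flight"),
    ("Nightriding", "Protection"),
    ("Nightriding", "Flight"),
    ("Physical Assault", "Legal"),
]

# Each distinct pair becomes one edge; its weight is how often it occurs.
edge_weights = Counter(coded_incidents)

# Write the edge list to an in-memory buffer; swap in open("edges.csv", "w",
# newline="") to produce a file for import.
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["Source", "Target", "Weight"])
for (violence, resistance), weight in sorted(edge_weights.items()):
    writer.writerow([violence, resistance, weight])

edge_csv = buffer.getvalue()
print(edge_csv)
```

In this layout every keyword is a node, and an edge links a type of violence to a method of resistance whenever the two were coded for the same incident.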

What is most useful about this visualization is its ability to quickly communicate a number of ideas. The size of the nodes, for example, indicates the relative weight of a particular keyword. In other words, the larger the node, the more prevalent that particular type of violence or resistance. Almost immediately it is possible to see that manifestations of physical violence – nightriding, physical assault/murder, lynching – are the most common types of violence described in the primary source documents. There are two possible explanations for this: 1) physical violence was widespread in the postemancipation South, particularly as a holdover from slavery; or 2) the apparent concreteness and immediacy of physical injuries heighten their visibility and ease of observation. Deprivation/neglect, however, also emerges as a prevalent type of violence, confirming the importance of broadening our definition of violence to include those acts that do not necessarily result in immediate physical trauma, but that threaten or result in incidental injury or cause psychological trauma.
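A node's weight can be thought of as a simple tally of how often its keyword was coded. The short sketch below shows that tally on invented placeholder labels; the counts are illustrative only and say nothing about the actual distribution in my sources.

```python
from collections import Counter

# Placeholder labels invented for the example; they are not research data.
coded_violence = [
    "Nightriding", "Physical Assault", "Nightriding",
    "Deprivation/Neglect", "Nightriding", "Lynching",
]

# One count per keyword: a larger count would mean a larger node.
node_weights = Counter(coded_violence)
most_prevalent = node_weights.most_common(1)[0]
print(most_prevalent)
```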
Because I extracted a broad spectrum of data when creating my database, it is possible to create visualizations dealing with more specific questions. For example, we can narrow the visualization of resistance to showcase only those examples where the person resisting was female (Figure 2). There are 855 unique incidents of female resistance found across my three primary source collections. Not included are those incidents where both men and women resisted together. This visualization suggests that Black women were particularly drawn to non-violent methods of resistance, such as testimony, seeking support from government officials (legal), protest, and discursive insubordination.
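The exclusion of mixed-gender incidents can be expressed as a grouped query. In the sketch below, which again uses Python's sqlite3 module, the resisters table, its columns, and its rows are hypothetical; the point is only to show how an incident qualifies when every recorded resister is female.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical layout: one row per person who resisted in a given incident.
conn.execute(
    "CREATE TABLE resisters (incident_id INTEGER, gender TEXT, resistance TEXT)"
)
# Placeholder rows invented for the example; they are not research data.
conn.executemany("INSERT INTO resisters VALUES (?, ?, ?)", [
    (1, "female", "Testimony"),
    (2, "female", "Protest"),
    (2, "male", "Protest"),   # mixed incident: should be excluded
    (3, "male", "Flight"),
    (4, "female", "Legal"),
    (4, "female", "Legal"),
])

# Keep an incident only when the number of female resisters equals the
# total number of resisters recorded for it.
female_only = [row[0] for row in conn.execute("""
    SELECT incident_id
    FROM resisters
    GROUP BY incident_id
    HAVING COUNT(*) = SUM(CASE WHEN gender = 'female' THEN 1 ELSE 0 END)
    ORDER BY incident_id
""")]
print(female_only)
```

Incident 2 drops out because a man and a woman resisted together, which matches the rule that such cases are not counted as female resistance.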

Creating a relational database and crafting data visualizations is not an easy task and will not be suitable for every project, especially for those who have no prior experience in digital humanities. There is a higher degree of upfront labour, as well as technical skills that can take time to master. Yet if digital humanities techniques, such as database creation and data visualization, can help historians draw connections that might otherwise go unnoticed, the effort is a worthy endeavour that can revolutionize the ways in which we approach historical inquiry. By integrating digital humanities techniques with a traditional close-reading methodology, we can move forward with the study of history in new and exciting ways.
Endnotes
[1] This reflection is based upon my doctoral research. See Sarah Whitwell, “‘They Will Have to Protect Themselves’: African American Resistance to Racialized Violence in the Southern United States” (PhD diss., McMaster University, 2021).
[2] The Bureau of Refugees, Freedmen, and Abandoned Lands, more commonly known as the Freedmen’s Bureau, was created by Congress to smooth the transition from slavery to freedom. With offices in every former Confederate state, it was responsible for supervising the newly freed Black population. When white-controlled local and state governments refused to recognize the rights of African Americans, the Freedmen’s Bureau was often the only place to seek redress. As a result, it logged thousands of complaints related to racialized violence, creating a robust set of records for historians to explore. The Joint Select Committee to Inquire into the Condition of Affairs in the Late Insurrectionary States, popularly known as the Ku Klux Klan (KKK) hearings, similarly produced massive volumes of text when it solicited testimony on violence perpetrated by the KKK from public officials, army officials, and the victims of violence. The Federal Writers’ Project of the Works Progress Administration resulted in over 10,000 typed pages of interviews with formerly enslaved men and women, which included biographic information, personal recollections of slavery, and attitudes towards prominent white and Black men.
[3] Sarah Whitwell, “Rejecting Notions of Passivity: African American Resistance to Lynching in the Southern United States,” Past Tense Graduate Review of History 5, no. 1 (Spring 2017), 71-95.
[4] On the value of digital humanities projects for posing research questions, see Stephen Robertson, “Putting Harlem on the Map,” in Writing History in the Digital Age, eds. Jack Dougherty and Kristen Nawrotzki (Ann Arbor: University of Michigan Press, 2013), 186-197.
[5] Matthew E. Davis, “The Database as a Methodological Tool,” Digital Medievalist, 10 August 2017, https://digitalmedievalist.wordpress.com/2017/08/10/the-database-as-a-methodological-tool/ (accessed 26 June 2025).
[6] Social network analysis is the process of investigating social structures using network and graph theory. It analyzes networked structures in terms of individual objects within the network and the relationships between them. On social network analysis, see John Scott, Social Network Analysis, 4th Edition (Los Angeles: Sage Publications, 2017).
[7] The disparity between these numbers is because each incident must be coded with a unique identification number. If a single narrative mentions multiple incidents, then it will appear multiple times in the database. Because violence was so widespread, it was not uncommon for the same person to relate multiple incidents in their interviews. Similarly, if multiple people resisted a violent act in different ways, then the incident would have to be coded into the database multiple times. Sam McAllum’s interview, for example, is represented twice. One night a group of African Americans were hosting a party when the KKK carried off Miler Hampton and killed him for doing “somethin’ bad.” The next day McAllum, along with several other Black men, went to the local whites for help. Then, they bought up all the ammunition they could afford in order to defend themselves at the next party. The KKK never bothered this particular group again. Although there is only one incident of violence described – the murder of Hampton – there are two distinct acts of resistance occurring here: 1) the act of requesting assistance from white Southerners against violence; and 2) the use of weapons to defend their interests against violence.
[8] There are 36 unique nodes represented in this diagram.




