As larger archives of human cultural output are accumulated, historians face a deluge of information. Where scarcity of information was once a common frustration, historians now face the opposite problem. Amidst veritable haystacks, historians must locate the needles and, presumably, use them to stitch together a valid historical interpretation. To manage this information overload, historians are beginning to employ digital techniques. Indeed, a wide range of computational tools and methods now enable historians to conduct research at a scale once thought impossible. For example, Micki Kaufman, a doctoral candidate in US History at the Graduate Center of the City University of New York (CUNY), employs computational text analysis techniques to study the Digital National Security Archive’s Kissinger Collection. This collection comprises approximately 17,500 meeting memoranda (memcons) and teleconference transcripts (telcons) detailing Henry Kissinger’s correspondence between 1969 and 1977.
The Kissinger Collection, as a large-scale online resource, presents both an opportunity and a challenge for historians. Having this large volume of information available online is undoubtedly valuable, but the restrictions of the web-based search interface render it of limited use for historians. The application of more sophisticated computational techniques, then, permits a comprehensive analysis of the Kissinger Collection and facilitates meaningful historical interpretation. In order to understand the benefits and pitfalls of Digital Humanities, I recently interviewed Kaufman about her research, her advice for new digital humanists, and her views on the future of the field.
To read more about Kaufman’s research, please visit her blog: “‘Everything on Paper Will Be Used Against Me’: Quantifying Kissinger.”
SW: Increasingly, archives are making (portions of) their collections available online. This enables scholars to use digital techniques to ask new questions of large-scale historical data, like the Kissinger Collection. Your research emphasizes text analysis – word frequency and collocation, topic modeling, influence mapping, and sentiment analysis – and visualization – force directed graphs, line and bar graphs, and area and stream graphs. Why did you decide to utilize digital techniques for analyzing the Kissinger Collection, and how did you develop your methodology?
MK: Most fundamentally, the problem I encountered in the analysis was one of scale. The amount of material generated during Henry Kissinger’s White House years (1969-1977) is vast. This posed lots of problems for this project – even confining my particular research to his official correspondence materials curated by the National Security Archive (NSArchive) at George Washington University, I had a dizzying 18,600+ documents to analyze, involving thousands of individuals, organizations, and subjects in the discussions. At the same time, the lack of available material was also a problem – the material at the National Security Archive is a declassified subset of the total amount of correspondence generated and collected. So, in light of the problems of scale, I chose to confront the technological challenges of distant reading, rather than the more typical problem of how (and where) to choose a subjective starting point for a conventional close reading.
In addition to scale, any research based upon Henry Kissinger’s publicly available correspondence is complicated by a dense and contentious historiography generated by a host of historians in the past 40 years. Kissinger and his geopolitical impact are hotly debated, not merely for the controversies of his tenure but for the controversies of his character. He is a man whose history is defined by a paradoxical blend of policy, personality, celebrity, and secrecy. To study a man of such complexity and impact on the basis of selected, cherry-picked items of evidence is therefore to run the same risk former Nixon Domestic Policy Advisor John Ehrlichman warned about, that a few snippets of ‘tape’ would only create an oversimplified, and therefore wrong, impression. Only by studying ‘all’ the tapes, and ‘all the archives’ could one form a picture that would properly reflect Kissinger’s deeply complex, and internally contradictory qualities.
The methodology chosen for the project reflects the nature of the primary sources under analysis. In this case, to represent such a vast and complex archive, and to do so from the ‘top-down,’ required an approach along the lines of what Jo Guldi and David Armitage described in “The History Manifesto.” A study of the archive en masse recommended a computational analysis of the text, data and metadata that the archive comprises. The results of such data sets were best understood with data visualization – and the choices for how those visualizations would be designed and deployed were intimately based upon the kinds of patterns and questions they evoked and revealed.
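For readers unfamiliar with the techniques named above, the following is a minimal sketch of word frequency and a simple form of collocation (counting adjacent word pairs), using only the Python standard library. It is an illustration under stated assumptions, not Kaufman's actual pipeline: the two sample sentences and the function names (`tokenize`, `word_frequencies`, `bigram_frequencies`) are invented here, and a real study would load the archive's documents and likely use more careful tokenization.

```python
import re
from collections import Counter

# Hypothetical sample sentences, invented for illustration only;
# they are not drawn from the Kissinger Collection.
documents = [
    "The secretary called the ambassador before the meeting.",
    "The ambassador called the secretary after the meeting.",
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def word_frequencies(docs):
    """Count how often each word occurs across all documents."""
    counts = Counter()
    for doc in docs:
        counts.update(tokenize(doc))
    return counts

def bigram_frequencies(docs):
    """Count adjacent word pairs within each document (a simple collocation measure)."""
    counts = Counter()
    for doc in docs:
        tokens = tokenize(doc)
        counts.update(zip(tokens, tokens[1:]))
    return counts

freqs = word_frequencies(documents)
bigrams = bigram_frequencies(documents)
print(freqs.most_common(3))
print(bigrams.most_common(3))
```

Counts like these are typically the raw material for the line, bar, and stream graphs mentioned in the interview; filtering out very common words such as "the" would be a usual next step.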
SW: For scholars just starting out in the Digital Humanities, the field can seem somewhat daunting. What skills would you recommend as a starting point to help navigate the growing overlap between the humanities and technology? What skills did you find most useful when you were beginning your own research?
MK: More than anything else, it requires that the scholar view the computer as they would anything else in their research environment – subject to inquiry and modification. My view of Digital Humanities is that it reflects an inbuilt ethic of willing (even brave) deconstruction and reconstruction, recreation and reinterpretation – and this includes the Digital as much as the Humanistic. As with any other aspect of the research process, the tools and methods need to be interrogated, analyzed, criticized, modified and/or replaced in order to understand whether the scholarly interpretation of the results they generate and demonstrate is of any lasting value. So, learn what the computer is, how it works. What is an operating system? What is the difference between Mac, PC and others? Why are they different? How are they the same? What is a file? What is the difference between a text file and an Excel file, for example? How are they the same? These may seem like basic questions, but they bear immense relevance to being able to solve scholarly questions without jettisoning the scholarly mindset.
Once you feel pretty good about the computer and how information is saved, stored, processed, and displayed in the course of your everyday use, you can then begin to ask about how the computer is deployed in the scholarship. Which digital scholars use what software or operating system? What files and what tools are they using? Why? It may seem very basic, but the most useful skill I found, more than anything else at this point, was how to connect various tools and technologies using these basic common denominators (file types, etc.) – since there was (and there remains) no single ‘One Size Fits All’ tool or platform for digital scholarship. Such a basic understanding of these components helps one to transform or repurpose the information from one process or tool into the form compatible with whatever tool is necessary to take the research process to the next step.
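As a small illustration of the kind of tool-to-tool conversion described above, here is a hedged sketch that turns spreadsheet-style CSV text into JSON, a format many visualization tools accept. The column names, rows, and the helper `csv_to_records` are hypothetical and invented for this example; they do not describe the Kissinger Collection's actual metadata or any specific tool Kaufman used.

```python
import csv
import io
import json

# Hypothetical spreadsheet export; the columns and rows are invented for illustration.
csv_text = """id,date,type
1,1973-01-05,memcon
2,1973-01-06,telcon
"""

def csv_to_records(text):
    """Parse CSV text into a list of dictionaries keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))

records = csv_to_records(csv_text)

# Serialize the same records as JSON for a tool that expects that format.
json_text = json.dumps(records, indent=2)
print(json_text)
```

The point is less the specific formats than the habit: once the data is understood as rows of named fields, moving it between tools becomes a matter of reading one common format and writing another.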
SW: As those working in the Digital Humanities likely know, it is not uncommon to encounter roadblocks during the course of a project. Some roadblocks require only a minor detour, while others necessitate an entirely new route. Have you ever encountered such a roadblock? How did you deal with it, and what advice can you offer to others going through the same experience?
MK: My research is all about roadblocks. I would consider a large part of the work of the digital historian to be the creation of ‘road(un)blocks.’ The good ones are the ones you can share, and genericize enough as best practices that can be made available to others. The advice I would have to those confronting such seemingly irreconcilable obstacles is to consider them opportunities for learning wrapped in frustration. In those moments, step away from the tech and sketch what you see in your head (literally, or otherwise). Focus on the bits and pieces and weave them together in your mind. In this kind of exercise, one can often find that a core aspect of what the ‘right brain’ serves up contains the germ of a new approach. At the very least, document and share the roadblock.
I have found that about 75% of the limitations I have encountered in Digital Humanities were my own limitations, either conceptually or practically, and thus represented opportunities to grow my skills and abilities. Another 20% represented obstacles posed by absent or unavailable technology – machines and code that either didn’t exist yet or didn’t exist in the form I needed (some of which were also within my power to create or learn). The last 5% or so was stuff I had to document and abandon, and that is often some of the most provocative stuff (stuff that I either couldn’t afford, couldn’t approximate, or wasn’t out yet). Most of all, don’t ever let obstacles like these dilute your passion for the ideas or the work – for when you are confronted by these obstacles, you may well be at the threshold of great things. And as an historian, it is a great honor and privilege to struggle with such questions in the effort to expand knowledge.
SW: Overall, how has your work with text analysis and visualization shaped your dissertation? The field of Digital Humanities is growing rapidly; however, there are some who question the validity of the field. How can grad students incorporate digital techniques into their dissertations, theses, and major research projects? Do you envision the future of Digital Humanities as solely a set of tools for supporting traditional research (e.g. close-reading), or do you think it can stand as a final product on its own?
MK: The methodological choices I have made throughout have had a huge impact on the course of the research for a number of reasons. First of all, given the complexities posed by the denseness and scale of the material and its selective declassification, the tools were arguably one of the only ways in which the work could have been approached at all comprehensively on the timetable of a doctoral dissertation. Second, the use of these techniques is an existential necessity for the dissertation to avoid ‘cherry picking’ and facilitate new interpretations and demonstrations of some of the unique benefits of Digital Humanities methods. Last, the use of visualization, in particular, has affected how I construct my arguments – while text contains narrative arguments articulated linearly, visualizations are non-linear, revealing patterns and trends that narrative argumentation can sometimes struggle to adequately articulate.
Digital Humanities is far beyond tools, in my opinion. Historical interpretation using digital tools is a different process than traditional close reading, but today’s Google-enabled traditional close reading is a different process than the archival practice of days past. Someday soon, Digital Humanities will once again be Humanities. Our use of text analysis, network and data visualization, geospatial mapping and other approaches will become an assumed and comfortable part of the research landscape. As always, scholars who brave the journey must be ready with a pioneering spirit to overcome the obstacles that come with the territory, digital or otherwise. If scholars can sustain and nourish their historical empathy while overcoming the challenges of a constantly evolving technological landscape, Digital Humanities (and digital history) will continue to thrive.
Micki Kaufman is a doctoral candidate in US History at the Graduate Center of the City University of New York (CUNY). Her dissertation, “‘Everything on Paper Will Be Used Against Me’: Quantifying Kissinger,” is a seven-time winner of the CUNY Graduate Center’s Provost’s Digital Innovation Grant. She is a co-author of “General, I Have Fought Just As Many Nuclear Wars As You Have,” published in the December 2012 American Historical Review. In 2015, Kaufman was awarded the ACH and ADHO’s Lisa Lena and Paul Fortier Prizes for best Digital Humanities paper worldwide by an emerging scholar. From 2015 to 2017 she served as a Virtual Fellow with the Office of the Historian at the US State Department. She is currently a Biography Dissertation Fellow at the Leon Levy Center and serves as an elected member of the Executive Council of the Association for Computers and the Humanities (ACH).