Digital archives and text mining perhaps offer a new way into the record that may allow us to discover many more (and many previously unknown) histories of reprinting. Histories of reprinting can also be thought of as histories of popularity, and thus are useful windows into the priorities of the period. I've worked with a colleague in computer science to begin automatically uncovering new histories of reprinting in the Library of Congress' Chronicling America collection. The results of this text mining look something like this:
Each row lists two publications, the publication date of each, a link to where each can be found within the archive, and the text they seem to share. After crawling only a fraction of the archive, we've already uncovered hundreds of potential reprinted texts, most of which I've never encountered as a scholar of the period. What can be done to make sense of so much new data?
Network theory perhaps offers a solution. Here's a network graph of all Edgar Allan Poe's fiction in periodicals; again I find myself working with data we as scholars already have on hand. In this graph, the nodes (the circles) represent individual publications. The edges (the lines between the circles) represent texts shared between publications. An edge is thicker the more reprints its pair of publications share, and a node is larger the more other nodes connect to it.
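For readers curious how a graph like this is put together, here is a minimal sketch in Python of the construction just described: publications as nodes, an edge wherever two publications printed the same text, edge weight as the number of shared texts, and node size as the number of connected publications. It is only an illustration under stated assumptions, not the tool used to draw the graph. The input format, the `build_reprint_network` function, and the placeholder titles ("Journal A" and so on) are all hypothetical and stand in for real reprint records.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical input: each text maps to the publications that printed it.
# The text IDs and journal names are placeholders, not real reprint data.
reprints = {
    "text_a": ["Journal A", "Journal B", "Journal C"],
    "text_b": ["Journal A", "Journal B"],
    "text_c": ["Journal B", "Journal D"],
}


def build_reprint_network(reprints):
    """Return (edge_weights, degree) for a publication-sharing network.

    edge_weights maps a sorted pair of publications to the number of
    texts both printed (drawn as edge thickness).
    degree maps each publication to the number of distinct publications
    it shares at least one text with (drawn as node size).
    """
    edge_weights = defaultdict(int)
    for publications in reprints.values():
        # Every pair of publications that printed this text shares it.
        for pair in combinations(sorted(set(publications)), 2):
            edge_weights[pair] += 1

    neighbors = defaultdict(set)
    for a, b in edge_weights:
        neighbors[a].add(b)
        neighbors[b].add(a)

    degree = {pub: len(others) for pub, others in neighbors.items()}
    return dict(edge_weights), degree


edges, degree = build_reprint_network(reprints)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b}: {weight} shared text(s)")
for pub, d in sorted(degree.items()):
    print(f"{pub}: connected to {d} publication(s)")
```

In this toy input, "Journal A" and "Journal B" share two texts, so their edge would be the thickest, and "Journal B" connects to three other publications, so it would be the largest node. The resulting pairs and weights could be handed to any graph-drawing tool for layout.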
Such a graph offers a large-scale model of how Poe's fiction moved around the country. Close-knit communities of textual sharing cluster together on the graph, and the central publications that shaped Poe's national reputation clearly emerge from the graph.
While it might not shock Poe scholars to see the <i>Broadway Journal</i> or <i>Southern Literary Messenger</i> at the center of this graph, a broader graph that visualizes connections among thousands of reprinted texts might point to larger, national trends. Who most influenced the nineteenth-century print network? Which publications shared texts most frequently, or to the greatest effect? These are broad versions of the questions I hope network models will help me begin asking of nineteenth-century periodicals in the coming months.
--Ryan Cordell, Northeastern University, r.cordell@neu.edu, @ryancordell
I'm wondering-- given the periodization of the Chronicling America collection-- do you see a gradual shift over time as traditional newspaper sharing declines while syndicates and wire services increase? Do the networks at the beginning and end of the period look radically different? And how much can they be linked to the technological and institutional shifts in how knowledge was shared in newspapers during that period?
(It's interesting that the Chronicling America collection coincides so well with the periodization of Richard John's "Network Nation"...)
Also, does the way you're working with the data allow for analysis of the relative *speed* of news spread and reprint?