A Storm of Carbon and Attributes: Similarities Between Docs and Databases

Databases are extremely exciting for my future work in Digital Humanities. They are the infrastructure for the type of inquiry that I would like to do: tracing overlapping discursive lexicons. They offer opportunities to track the minutiae that sometimes get overlooked in an interdisciplinary project. The major difficulty at this stage of inquiry, however, is finesse. I know that for my projects the appearance of an entity (a word) is not determined by the text. The rhetorical play in scientific non-fiction during the sixteenth and seventeenth centuries is much more like literary play than a contemporary reader might expect. This at least seems to present a problem for classifying words with regard to genre. A sphere is a metaphor in a poem and a mathematical construct in a scientific monograph, for example. The overlapping rhetorical strategies and cultural assumptions, as well as the overlapping lexicon, may mean that the questions “databased” (I had to) inquiry foregrounds are ones like “how do I get from here to there?” or “how do I find ‘x’ and ‘y’ data, accounting for complexities ‘a’ and ‘b’?”

The slide shows that we read for today pointed to a difference between traditional document-centric inquiry and databased inquiry. One of the authors claimed that one foregrounds questions and the other pushes them into the background. I do wonder what happens when we explore the similarities between documents and databases. It seems that all words (as entities) have several attributes, some contextual and some grammatical. The pronoun “she” has the grammatical attribute “pronoun” as well as the contextual attribute of referring to an agent that is itself a collection of data entities, each with a string of attributes. It is difficult to remember at times that Lady Macbeth and Hamlet are mere words on a page: entities with attributes.
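To make the “word as entity” idea a little more concrete, here is a minimal sketch in Python. The class name `WordEntity` and its fields are my own invention for illustration, not anything taken from the readings, and a real model would need far more nuance than this.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WordEntity:
    """A word treated as a database-style entity (hypothetical model)."""
    surface: str                                # the word as it appears on the page
    grammatical: str                            # grammatical attribute, e.g. "pronoun"
    refers_to: Optional["WordEntity"] = None    # contextual attribute: the agent referred to
    attributes: List[str] = field(default_factory=list)  # any further attributes


def describe(word: WordEntity) -> str:
    """Follow the chain of reference from a word to the entity it points at."""
    text = f"{word.surface} [{word.grammatical}]"
    if word.refers_to is not None:
        text += " -> " + describe(word.refers_to)
    return text


# The agent is itself an entity with its own string of attributes.
lady_macbeth = WordEntity("Lady Macbeth", "proper noun", attributes=["character", "agent"])
# The pronoun carries a grammatical attribute and a contextual one.
she = WordEntity("she", "pronoun", refers_to=lady_macbeth)

print(describe(she))
```

The point of the sketch is only that the pronoun's contextual attribute is a pointer to another bundle of attributes, which is roughly how a reference between two records in a database behaves.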

Hamlet is a man that wears black, not green.

It is tempting to view “Hamlet” as an entity, but the concept of “Hamlet” is more of a table that includes “black” but excludes “green,” which in turn excludes “pronoun.” Character, then, seems to be a series of inclusions and exclusions, one layered on another. The data that makes up “Hamlet” is defined by some sort of linguistic data made of “is” and “is not” (I am going to stop this train of thought because I am colliding with Derrida’s trace).
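The “is” and “is not” table can be sketched with Python's built-in `sqlite3` module. The table name `character_traits`, its columns, and the helper `traits` are hypothetical, chosen only to illustrate the idea; I am assuming a simple flag column is enough to stand in for inclusion and exclusion.

```python
import sqlite3

# An in-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE character_traits (name TEXT, trait TEXT, included INTEGER)"
)

# included = 1 means "is"; included = 0 means "is not".
conn.executemany(
    "INSERT INTO character_traits VALUES (?, ?, ?)",
    [
        ("Hamlet", "man", 1),
        ("Hamlet", "black", 1),
        ("Hamlet", "green", 0),
    ],
)


def traits(name, included):
    """Return the traits a character includes (True) or excludes (False)."""
    rows = conn.execute(
        "SELECT trait FROM character_traits "
        "WHERE name = ? AND included = ? ORDER BY trait",
        (name, int(included)),
    ).fetchall()
    return [row[0] for row in rows]


hamlet_is = traits("Hamlet", True)
hamlet_is_not = traits("Hamlet", False)
print("Hamlet is:", hamlet_is)
print("Hamlet is not:", hamlet_is_not)
```

In this toy version “Hamlet” is never stored as a single thing. The character exists only as whatever the query over inclusions and exclusions returns.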

Defining both textual information and data in this way suggests to me that document data and database data do behave in similar ways. One of the chief differences is actually in labor. Narrative is a dominant mode perhaps because of the labor that it requires to extract the data, and labor requires time. Extracting data from a database (depending on the complexity and the power of the machine) is the labor of waiting while the machine crunches the data. (I am aware that I am, for the moment, excluding database construction, which may be a kind of authorship, and query formation.) The experiences of reading and waiting are different, especially if you are productive while waiting, perhaps using that time to do some light reading, and our experience of time is altered as a result. I am unable here to address the similarities between database design and authorship, but I think there are connections there as well. My conclusions after thinking and writing about databases are fuzzy at best, but the exercise still feels enlightening and exciting. My understanding of the nature of databases reaffirms my approach to texts, but perhaps that is partly because of how I am sorting the data.

Whatever we understand, we understand according to our own nature.