My dissertation’s title was The Open Book: Digital Form in the Making. It took the case of “mass digitization” as a lens onto the vast array of issues that “the book in the digital age” includes. For it, I conducted fieldwork from 2008 to 2011, primarily at the Internet Archive in San Francisco, CA. The period of my fieldwork coincided with the national debate around the Google Book Search Settlement, a controversy in which I became a participant observer. At the center of that debate was a lawsuit involving copyright, Authors Guild et al. v. Google, and the specific concerns that those attempting to create free and open digital libraries on the Web have with copyright law.

In the dissertation, I tried to weave together the following: concerns for the future of the book as a medium; the introduction of technology entrepreneurship to the social field of the book; and the changing infrastructure for the literary system (focused primarily on copyright). Overall, I hoped to create a framework for thinking about the changes going on in the book when understood as a material infrastructure (and not a literary text).

The book I am writing now is significantly different from the dissertation.

Here I provide a brief summary of the dissertation’s four substantive chapters.

The first chapter (Pathways to Digitization) concerns the “pre-digital history of digitization” and focuses on the application of microphotography to books in the first part of the 20th century (roughly 1900-1940) as a means to overcome the form’s limitations. I establish some startling commonalities between the enthusiasm for microfilm in the early 20th century and the enthusiasm for digital technologies in the 21st. I also trace the interest of American research libraries in reformatting print collections with microfilm and how this led up to Google’s Library Project.

The second chapter (The Matter of the Book) provides a detailed description of the digitized book as a digital object. What sort of digital object is it? How is it made? The chapter compares the digitization practices of the Internet Archive with those of Google’s Library Project. In describing my own “encounter” with the digitized book while scanning books and performing other tasks for the Archive, the chapter teases out the differences between the digital librarian’s view of books and my own as a scholar, reader, and former book publisher.

The third and fourth chapters shift to the contentious social terrain around digitization.

Chapter Three (Books as Data) extends the second chapter by considering more deeply the materiality of the digitized book, specifically the notion that, after digitization, books become “data.” Over the course of the chapter, I follow the circuitous route that “books as data” take from “dirty OCR” to public popularity in Google’s Ngram Viewer, against the backdrop of Authors Guild et al. v. Google. Some of this chapter appears in a short essay I wrote for Anthropology Today.

The fourth chapter (Books as Orphans) explores the metaphor of the “orphaned” book and how it captures the central conflict around literary property (copyright) in and around mass digitization. Tracing its emergence in the early 1990s and 2000s, I examine how the digital librarians at the Internet Archive attempted to deploy the metaphor to establish books as common property. The themes of this chapter were taken up in an essay I published in the journal Current Anthropology.

I am willing to share dissertation chapters with those who have an interest. If you do, please contact me by email. In the three years since I completed it, history has overtaken some of its details. I have also conducted further research, and my thinking has changed on a number of matters.