Ed Finn

Friday, March 18, 2011

Day of Ed Finn

That is the rather hubristic automated title my temporary new blog was given by the Day of Digital Humanities folks. Maybe I should change that. In any case, come check it out! I'll be blogging all day about what exactly it is I do.

Wednesday, February 16, 2011

The Abiding Importance of Pie

Valentine's Day Massacre, originally uploaded by edfinn.

That is what happened to this pie soon after the photo was taken. It is a banana peanut butter cream pie with a chocolate cookie crust. And a little whipped cream on top. I MADE THIS PIE. For my wife. That's right.

I will accept your adulation now.

Sunday, February 13, 2011

Papers, papers

In all the excitement of the holidays, MLA and then a trip to Egypt (!), I didn't have a chance to post about an exciting update from the publication front. Since then I also had some good conference news, so here's the skinny.

I'm really delighted to be participating in an awesome book project co-edited by Lee Konstantinou and Sam Cohen considering the impact of David Foster Wallace. The collection is under contract with Iowa and it got a great writeup in The Chronicle of Higher Education. Very exciting! I'm working on revisions to my chapter right now. My piece will explore how different groups of readers are defining Wallace's legacy through book reviews and literary consumption.

I had a paper accepted to Digital Humanities 2011! This may not seem like a big deal until you start reading the comments on Twitter from people who didn't get in: the acceptance rate was only 31% for panel proposals. I've really enjoyed my previous two DH conferences, and I'm looking forward to presenting with fellow LitLabbers Zephyr Frank and Rhiannon Lewis, with Franco Moretti as moderator. The panel is titled "Networks, Literature, Culture" and it's going to be fantastic. I'll save you a seat.

Wednesday, January 26, 2011

Critical Code Studies

I've been traveling for the past two weeks, heading off immediately after the end of the MLA conference in early January. So I've largely missed out on the storm of Twitter and blogosphere discussion about an emerging critical discourse that I'm excited to be involved in: Critical Code Studies. Fortunately, the leaders of this new community have done an amazing job of generating wider awareness, building from a conference last summer and an ongoing collaboration with the electronic book review (which is where I come in) to several panels at MLA and a thriving new forum over at HASTAC. Most recently, the CCS folks have published the proceedings of the conference last summer in collaboration with Vectors.

I'm excited to dive into the HASTAC conversation and to start thinking about how CCS connects to my own work. A lot of the research I'm doing on the literary marketplace explores how new computational algorithms are changing cultural systems (e.g. the seasons of book production, which operate a little like Hollywood's formula of summer blockbusters and winter Oscar bait). But what I want to dwell on briefly here is how we are all learning to "read" algorithms ourselves on the front end. That's one of the basic sources of challenge in videogames, for instance. An example from my dissertation work might be the way we reverse-engineer recommendation systems (to figure out why something was suggested to us).

A still better example is Slate's Facebook parodies, which at their best adapt the functionality and rhetoric of the site's algorithms for political satire. For instance, in "100 Days of Barack Obama's Facebook news feed," the authors mimic Facebook's social media tracking for comedic effect:

Source: http://www.slate.com/id/2217225/

Here we have humans 'faking' algorithms for their own purposes, and I think the satire effectively skewers Facebook as well as politics. Ultimately, Slate's pieces work because they ask us if American politics is turning into a stylized, algorithmically deterministic system, a sadly unwitting self-parody. Or, as Aaron Sorkin put it, whether "socializing on the Internet is to socializing, what reality TV is to reality."

Of course, the CCS people would point out that there's no real code here, but I guess my point is that we're all involved in interpreting algorithms in various ways, whether or not we're coders. Perhaps my contribution to the HASTAC forum will be some of my own Perl code that I no longer understand!

Tuesday, December 28, 2010

Culturomics: Not Quite Yet

Well, it’s time to stick my oar in on the Google Ngrams discussion. While a number of computational linguistics scholars have pointed out the pitfalls of Google’s latest toy, I think I have a unique perspective to offer on the issue. I understand what the Ngrams creators were trying to do, because I’m trying to do exactly the same thing: get some things cooking. My research on contemporary literary reception is not exhaustive or dependent on highly complex statistical models. That’s because literary reception is a huge, multiply mediated field ranging from café conversations to book reviews, and my access to data is limited. But where I have adopted a “core sample” model, choosing a few accessible data sources to make some robust but limited generalizations about readers and reading culture, Google has gone for the moon shot. By creating an opaque front-end to their 5 million book archive, they offer the illusion of a truly global Ngram search—and they emphasize the scale of their ambition by claiming their tool isn’t merely a corpus search mechanism but the portal to a new science of “culturomics.”

As my colleague Matthew Jockers noted in his own oar-insertion post, “To call these charts representations of ‘culture’ is, I think, a dangerous move.” He goes on to suggest it “may be,” but I have to go a bit farther and say “definitely not.” Here’s the problem: we can’t get reasonable, arguable claims about things like culture or literary history unless the limitations of the corpus are acknowledged and dealt with from the outset. Typically, projects like this limit themselves either by going too small or too big, and Google has gone way big. Let me explain what I mean.

Too Small:

The opposite example would be a research project on a small, meticulously tended patch of texts. Classic humanities research, really, but of limited usefulness for making grounded claims about larger literary-historical or cultural issues (at least until enough such small projects emerge with commensurable results that we can begin to construct some causal chains). Traditional humanities as a whole is full of projects that are “too small” for making broad cultural claims because they are limited to a small data footprint. The walled garden of closely tended results is fascinating and lovely to explore, but it’s difficult or impossible to compare the work to anything outside.

Too big:

Google, by contrast, flies off the macro end of the scale by trying to do too much and claim too much. The corpus is amazing, but nevertheless limited and contingent in many ways. As others have pointed out, the OCR is problematic; the metadata is sloppy; the text distribution almost certainly has a number of biases (how could it not? What is the gender, historical and language distribution of the world’s universal library supposed to be anyway?). By choosing to obscure these limitations instead of illuminating them, Google turns “culturomics” into a toy, not a tool.

Fortunately, the data is all there, and these problems can be fixed. Google loves a good algorithm and will presumably figure out solutions to the various technical problems. With luck (and the persistence of its academic research partners) the Ngrams team will also come to acknowledge and reveal the limitations on its data. Once that happens, we can really get cooking and make a clear case for when this vast corpus really does reveal broad cultural trends.

For now, Ngrams is a blunt object but it still has some value as a tool. I’ll post some examples next time.

Friday, December 10, 2010

Stanford Dissertation Browser

While I've had the dissertation specter floating before me for several years now, it has never looked so beautiful. Created by two Stanford graduate students in Computer Science, the Stanford Dissertation Browser uses topic modeling to graph recent dissertations by their disciplinary affiliation. The visualization was created with Flare, successor to Prefuse, which I was using for my own visualizations for a while (this being Stanford, the guy who created all of these visualization tools, Jeffrey Heer, is advising the project).

I'm looking forward to adding my dissertation to the mix next June. I wonder where it will line up?

Friday, October 15, 2010

Map Marathon

I received an email about a wonderful new exhibit/collaboration "Map Marathon" organized by the Serpentine Gallery in London and those intrepid thinkers at Edge. The whole online gallery is fascinating, but what really caught my fancy was this image, apparently submitted by Bruce Sterling. It's a map of writers who are associated with Sterling, and therefore it has a lot in common with my research.

After some investigation it looks like the map was generated with Gnod, or Gnooks to be exact: "a self-adapting community system based on the gnod engine." I'm intrigued--it seems like the site's connections are based on user input to its adaptive learning system. I'd love to compare these networks to my own data.