by Chuck Rybak/@chuckrybak
A Beginner’s Guide to Gephi and Character Networks
Last fall, inspired by Franco Moretti’s must-read pamphlet over at the Stanford Literary Lab, I decided to turn my students loose on character-network assignments created in Gephi, a free, open-source visualization tool. As with most technology, I was initially hesitant, fearing that curricular objectives would be sacrificed on the altar of technical difficulties. Yet positive results put these fears to rest, and this assignment has definitely earned another go in my classroom.
How high is the learning curve? If your needs are as basic as mine were, you’ll hit the ground running in Gephi. Free to download and easy to install, Gephi includes an invaluable Quick Start Guide that students can utilize independently (there are also these handy tutorials).
The important thing is that we started small. The first student group to take on this project created a character network for Dr. Jekyll and Mr. Hyde, a novel compact enough to produce a fairly contained network, as seen in the modest results. In creating this small network, students tracked only the type and number of character interactions. In the Gephi visualization, the characters became nodes and their interactions became weighted edges between those nodes.
What the class saw in the Jekyll and Hyde visualization did not expand much beyond what we gleaned from reading the book, but it prompted us to think about the questions we ask of texts and how to frame those questions in order to create more interpretive, informative, and usable networks. For example, the group that worked on Dr. Jekyll felt they might gain more meaningful results by reconfiguring their data to reflect indoor versus outdoor interactions in the novel, an adjustment that is easy to handle in Gephi’s user-friendly interface.
When thinking of my undergraduate students, my first question was “Is it difficult to manage data in Gephi?” Answer: No. Here are a few things worth considering if you’re looking to experiment with this software for the first time:
- Importing spreadsheet data into Gephi is fast and easy—whether just importing nodes, or both nodes and edges—all of which is covered in the introductory tutorials.
- My preference at the time was to have students work manually in Gephi’s data tables (or “Data Laboratory”) rather than in exterior spreadsheets. This is simple and helps new users warm up to the interface. The work involved was basic, consisting of entering the nodes (character names), linking appropriate nodes via edges with simple dropdown menus, and manually assigning edge weights to indicate the frequency of the interaction. (Note: I’ve since been introduced to the wonders of NodeXL, and would now probably use it as the main site for data entry.)
- Once the data was entered, students followed the easy steps in the Quick Start Guide, with the added bonus of a “wow” effect at the end. Without exception, students indicated that tracking character interactions and compiling their data was far more time consuming and challenging than using Gephi—in other words, the presence of the software was appropriately backgrounded and minimized.
- Gephi allows for easy weighting of edges or links between nodes, which is especially useful in character-network assignments when trying to assess the strength of relationships, gauge the importance of specific characters, and identify communities (“community detection”) that might exist within the larger work.
- If you’re truly ambitious, you can also designate interactions between nodes as “directed,” which allows you to further characterize the nature of the interactions (for example, who initiated each one).
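For anyone who would rather prepare the spreadsheet outside Gephi, here is a minimal Python sketch of the tallying step. It is an illustration under assumptions, not the workflow my students used: the interaction log is placeholder data, the function names `build_tables` and `write_csv` are invented for this example, and the column headers (`Id`, `Label`, `Source`, `Target`, `Type`, `Weight`) are the ones I understand Gephi's spreadsheet importer to recognize, so check them against the importer in your version.

```python
import csv
from collections import Counter

# Placeholder interaction log, NOT counted from the novel:
# one (initiator, recipient) pair per interaction a student records.
interactions = [
    ("Jonathan Harker", "Dracula"),
    ("Dracula", "Jonathan Harker"),
    ("Mina", "Jonathan Harker"),
    ("Dr. Seward", "Mina"),
    ("Mina", "Jonathan Harker"),
]


def build_tables(interactions, directed=False):
    """Tally interactions into node rows and weighted edge rows."""
    weights = Counter()
    for source, target in interactions:
        # In an undirected network, A-B and B-A are the same edge,
        # so sort the pair before counting it.
        key = (source, target) if directed else tuple(sorted((source, target)))
        weights[key] += 1

    names = sorted({name for pair in weights for name in pair})
    node_rows = [{"Id": name, "Label": name} for name in names]

    edge_type = "Directed" if directed else "Undirected"
    edge_rows = [
        {"Source": s, "Target": t, "Type": edge_type, "Weight": w}
        for (s, t), w in sorted(weights.items())
    ]
    return node_rows, edge_rows


def write_csv(path, rows, fieldnames):
    """Write rows to a CSV file with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


if __name__ == "__main__":
    nodes, edges = build_tables(interactions, directed=False)
    write_csv("nodes.csv", nodes, ["Id", "Label"])
    write_csv("edges.csv", edges, ["Source", "Target", "Type", "Weight"])
```

Passing `directed=True` keeps "who initiated the interaction" intact, so Jonathan-to-Dracula and Dracula-to-Jonathan are counted as separate edges rather than merged into one.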
With an initial test run complete, the class was ready for more challenging data, which arrived quickly in the course’s next novel: Dracula.
A novel like Bram Stoker’s Dracula fits this assignment well for any number of reasons, but of chief interest for my class was the presence of multiple narrators speaking through a variety of media. Students working on the Dracula character network were assigned a specific subtext in which to track relationships: one student focused on Jonathan Harker’s journals, while another concentrated solely on Mina’s letters. Unlike when the students worked with Dr. Jekyll, Dracula’s complexity proved ripe for visualization in Gephi, and you can view the final product here.
Did Gephi contribute to the learning process in our classroom? Undoubtedly. What immediately stood out was how male-authored texts, such as those of Dr. Seward and Jonathan Harker, produced more circular networks than those built around the novel’s female characters. The visualization thus prompted us to ask “why?” Students wanted to know what this might indicate, finally hypothesizing that the men operated in more evolved, professional social networks than the text’s isolated women (for example, Mina Murray’s network is much smaller than Mina Harker’s, which expands once she adopts her married name). In other words, the character-network assignment not only allowed us to begin with a research question (What do the networks of the different subtexts communicate?), but also produced subsequent and unexpected research questions (If we removed Mina from all of the networks, how would they change?). This allowed class discussion to progress based on ongoing student work rather than rigid syllabus construction.
If you use Gephi in class and want to step your game up beyond what I’ve described so far, Clement Levallois has some fantastic tutorials. If you want to set the bar even higher, Nathan Hensley has a student-created Gephi visualization of Middlemarch, complete with dazzling YouTube videos.
I am an admitted novice when it comes to using Gephi and am just beginning to glean the potential for such network visualizations in my classroom. However, this past summer I had the pleasure of attending DHSI 2014 (Digital Humanities Summer Institute), where I was a student in the Networks class with guru Scott Weingart. The week I spent in the course was incredibly valuable and has expanded the teaching possibilities for me to a much more advanced level, which I hope to write about sometime soon. A quick preview, however, would include a lot more discussion of network interpretation rather than creation, and this would involve making more robust use of things like “Degree,” “Closeness Centrality,” “Betweenness Centrality,” etc. I would caution against over-reading the networks you produce, and I would encourage discussing with students not only what is revealed, but also what is not accounted for by such a visualization; this is just one tool that provides a very particular look at a text, so it’s worth keeping that in mind. But as a teaser, here is a peek at the full network of Dracula that I created under Scott’s tutelage at DHSI. It’s dope.
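If terms like “Degree” and “Closeness Centrality” feel opaque, it can help to compute them by hand on a toy network before trusting the numbers Gephi reports. The Python sketch below is only an illustration: the edge list is placeholder data rather than anything counted from Dracula, the function names are invented for this example, and closeness is calculated with one common definition (the number of reachable characters divided by the sum of shortest-path distances to them), which may be scaled differently from the figure Gephi shows.

```python
from collections import deque

# Placeholder edges, NOT taken from the novel: each pair is one relationship.
edges = [
    ("Mina", "Jonathan Harker"),
    ("Mina", "Dr. Seward"),
    ("Jonathan Harker", "Dracula"),
]


def adjacency(edges):
    """Build an undirected adjacency map: character -> set of neighbors."""
    graph = {}
    for source, target in edges:
        graph.setdefault(source, set()).add(target)
        graph.setdefault(target, set()).add(source)
    return graph


def degree(graph):
    """Degree: how many other characters each character is linked to."""
    return {name: len(neighbors) for name, neighbors in graph.items()}


def closeness(graph, start):
    """Closeness: reachable characters divided by total distance to them."""
    distances = {start: 0}
    queue = deque([start])
    while queue:  # breadth-first search gives shortest paths by hop count
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in distances:
                distances[neighbor] = distances[current] + 1
                queue.append(neighbor)
    total = sum(distances.values())
    return (len(distances) - 1) / total if total else 0.0


if __name__ == "__main__":
    graph = adjacency(edges)
    print(degree(graph))
    for name in sorted(graph):
        print(name, round(closeness(graph, name), 2))
```

In this toy network Mina scores 0.75 and Dracula 0.5 on closeness, which simply reflects that the placeholder edges put Mina nearer to everyone else; it says nothing about the novel itself, which is exactly the kind of over-reading worth discussing with students.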
And while I’m writing as someone who teaches literature classes, this would obviously be effective in other disciplines. In an Introduction to Digital and Public Humanities course that I team-taught with historian Dr. Caroline Boswell last spring, students used Gephi to produce networks based on The vvonderfull discouerie of witches in the countie of Lancaster (1613).
So there is toe dip #2. Do you have experience using Gephi in the classroom? Feel free to discuss your experiences in the comments.