Call for Participation

Call for Participation: Archives Unleashed 3.0: Web Archive Datathon
Internet Archive, San Francisco CA
23 – 24 February 2017
We expect to be able to offer travel grants for US-based graduate students
Applications due 20 January 2017
**This event is a follow-up to the Archives Unleashed datathon series – the first was held in March 2016 in Toronto and the second was held in June 2016 in Washington, DC. We’re continuing the datathon program in 2017, and are excited to bring this program to the Internet Archive.**
The World Wide Web has a profound impact on how we research and understand the past. The sheer amount of cultural information that is generated and, crucially, preserved every day in electronic form presents exciting new opportunities for researchers. Much of this information is captured within web archives.
Web archives often contain hundreds of billions of web pages, ranging from individual homepages and social media posts to institutional websites. These archives offer tremendous potential for social scientists and humanists, and the questions researchers may pose stretch across a multitude of fields. Scholars broaching topics dating back to the mid-1990s will find their projects enhanced by web data. Moreover, scholars hoping to study the evolution of cultural and societal phenomena will find a treasure trove of data in web archives. In short, web archives offer the ability to reconstruct large-scale traces of the relatively recent past.
While there has been considerable discussion about web archive tools and datasets, few forums or mechanisms for coordinated, mutually informing development efforts have been created. Our series of datathons presents an opportunity to collaboratively unleash our web collections, exploring cutting-edge research tools while fostering a broad-based consensus on future directions in web archive analysis.
This event will bring together a small group of 20-30 participants to collaboratively develop new open-source tools and approaches to web archives, and to kick off collaboratively inspired research projects. Researchers should be comfortable with command line interactions, and knowledge of a scripting language (such as but not limited to Python) is strongly desired. By bringing together a group of like-minded scholars and programmers, we hope to begin building a unified analytic production effort and to continue coalescing this nascent research community.
At this event, we hope to continue to converge on a shared vision of future directions in the use of web archives for inquiry in the humanities and social sciences in order to build a community of practice around various web archive analytics platforms and tools.
***Please also save the date! Archives Unleashed 4.0 will be held at the British Library in London, UK, June 11 – 13. Check back here for more details.***
Thanks to the generous support of the National Science Foundation, the Social Sciences and Humanities Research Council of Canada, the University of Waterloo’s Department of History, the David R. Cheriton School of Computer Science and the University of Waterloo, and the School of Communication and Information at Rutgers University, we will cover lunches and refreshments for attendees. We are also providing sample datasets for participants to work on during the datathon, and participants are welcome to bring their own. Included datasets are:
* the .gov web archive covering the American government domain
* the End of Term Web Archives (.gov/.mil), from 2008, 2012, and 2016
* social media collections from the 2016 archive
* Canadian Political Parties and Political Interest Groups collection
* and other datasets to be announced
We will also have datasets from the Internet Archive’s recent event available, some noted above and the rest available at
Those interested in participating should send a 250-word expression of interest and a CV to Matthew Weber ( by 20 January 2017 with “Archives Unleashed” in the subject line. This expression of interest should address the scholarly questions that you will be bringing to the datathon, and what datasets you might be interested in either working with or bringing to the event. Applicants will be notified by 25 January 2017.
We expect to be able to issue a limited number of travel grants for US-based graduate students; preference will be given to those who have not participated in the Archives Unleashed program in the past, although we welcome returning participants. These grants can cover up to $750 in expenses. If you are eligible, please indicate in your statement of interest that you would like to be considered for a travel grant.
On behalf of the organizers,
Matthew Weber (Rutgers University), Ian Milligan (University of Waterloo), Jimmy Lin (University of Waterloo)