Columbia University

CLIR Postdoctoral Fellowship for Data Curation in Medieval Studies

Columbia University Libraries and the Department of History seek a Medieval Data Curation Fellow to support ongoing research and initiatives in medieval studies. The successful candidate will work principally on developing ChartEx, a tool for the analysis of large data sets of medieval legal documents. The candidate will also continue work underway in the Digital Humanities Center on curating digitized collections of medieval sources, particularly legal documents and canon law treatises.

Projects

The ChartEx (Charter Excavator) project, funded initially in 2011 by the second round of the multinational Digging into Data Challenge, brought together researchers from six universities in four countries and four different disciplines to develop a tool to help historians analyze large collections of medieval legal records known as charters. Harnessing natural language processing (NLP), data mining, and human-computer interaction techniques, the team successfully demonstrated the feasibility of the tool, which is designed, unlike previous interfaces for working with medieval charters, to be independent of any particular data set. Original team members are continuing work on different parts of the project in new combinations. Prof. Adam Kosto (History) is part of a group focusing on the problem of preparing new data sets on the basis of printed and previously digitized material, and retro-converting previously existing data sets. The goal of the current stage of the project is to refine the ChartEx tool to the point where it can work reliably with large, curated corpora of Latin-language documents. Initial work on ChartEx was carried out in cooperation with the Columbia University Libraries, with Bob Scott, Digital Humanities Librarian, serving as a member of the advisory board.

The Digital Humanities Center has over two decades of experience in the curation of digitized sources, and has long given special emphasis to medieval source digitization and textual analysis projects. These projects draw on the Columbia University Libraries’ extraordinary collections of 19th- and early 20th-century printed editions of medieval materials, including cartularies, with special strength in France, Spain, Germany, England, Italy, and the Netherlands. A current intern is working under the guidance of Bob Scott on a digital edition of a canon law manuscript in the Libraries’ Special Collections. Scott himself has been working in recent years on a marked-up corpus of medieval sources for Poland and Lithuania that includes not only the kinds of material ChartEx addressed in its earlier phase but also some related royal account records and registers of ecclesiastical benefices. In addition to working on ChartEx, the Medieval Data Curation Fellow will support and develop this broader program of the Digital Humanities Center.

Desired Skills & Expertise

The successful candidate will hold a doctorate in a relevant field of medieval studies, ideally working with documentary/archival materials. Familiarity with digital humanities methods such as TEI textual markup, textual analysis using tools such as Natural Language Toolkit and Python, network and geospatial analysis tools such as Gephi and ArcGIS, visualization, and digitization of print materials is desirable though not required. The Fellow will be given the opportunity to learn and strengthen her/his skills and knowledge in all areas listed above as a part of the fellowship. Substantial teaching and organizational experience is preferred.

Roles & Duties

The Medieval Data Curation Fellow will help the ChartEx team identify key collections to be used in the next phase of the project and explore existing and new automated techniques for data preparation. In particular, the Fellow will: 1) assist in the identification and preparation of at least three additional corpora of documents appropriate for use with the ChartEx project, participating in all stages of the digitization, markup, analysis, and presentation of these research materials; 2) collaborate with partners on the refinement of the Latin-language NLP module; and 3) collaborate with partners on the testing and evaluation of the data mining and user feedback modules. The Fellow will be expected to present on the project to library staff and scholars at Columbia University and beyond. The Fellow will participate in educational workshops on the tools and methods used in the ChartEx project.

In addition to work on ChartEx and on new Digital Humanities Center projects developed during the course of the fellowship, the Medieval Data Curation Fellow will contribute to the History in Action program, Columbia’s portion of the Mellon Foundation / American Historical Association Career Diversity Initiative. One of the aims of History in Action is to equip History PhD students with skills useful for careers both within and outside the academy, with digital skills high on the list. The Fellow will develop workshops on digital data curation for students in History and other departments. The Fellow will also assist in course development for a Digital Humanities seminar in the Department of History or the Program in Medieval and Renaissance Studies.

The Fellow will also have substantial time reserved for the development of his/her own independent research, publishing, and teaching program, which should involve a significant digital humanities element.

Guidance & Professional Development

The Fellow will be part of a team of Humanities & History librarians engaged in a training program designed to enhance their ability to partner with faculty and researchers on digital humanities projects. The Developing Librarian training program is a series of bi-weekly workshops on digital humanities tools and methods including topics necessary for ChartEx work. The Fellow will also join a cohort of interdisciplinary Digital Center Interns located within the library with the goal of advancing digital scholarship within the university and supporting the growing skill set necessary for Columbia graduate students to advance in their careers beyond Columbia.

The Fellow will also benefit from the activities at the Studio@Butler, particularly the weekly openLabs that bring together practitioners from across campus to share insights and projects in a co-working environment. The Libraries are also hiring a Digital Humanities Developer in the coming months who will help support the ChartEx project, among other digital humanities projects on campus. The Fellow will have the opportunity to work with this developer and with our Digital Scholarship Coordinator on a variety of digital humanities efforts.

The Fellow will be mentored and guided by Bob Scott, who has led the Digital Humanities Center (originally the Electronic Text Service) since 1995.

Resources

Columbia University is one of the premier centers for the study of medieval history in North America. The Department of History alone has five specialists in the field (Adam Kosto, Neslihan Senocak, Robert Somerville, Joel Kaye, Martha Howell), and overall there are some two dozen medievalists at the University, who maintain an Interdepartmental Program in Medieval and Renaissance Studies. The University Seminar in Medieval Studies plays an important role in bringing together an even broader group of specialists from the New York area, and Columbia enjoys ties to medieval programs at Princeton, New York University, and elsewhere through an Interuniversity Consortium.

Columbia University Libraries is one of the top five academic research libraries in North America, with particularly strong holdings in the medieval field. In addition, Columbia’s Rare Book and Manuscript Library contains a substantial collection of original medieval manuscripts and documents. More broadly, the Libraries offer extensive print and electronic resources, discipline-based digital centers, and a team of expert staff providing innovative services to support instruction and scholarship. The collections include over 12 million volumes, over 160,000 journals and serials, as well as extensive electronic resources, manuscripts, rare books, microfilms, maps, and graphic and audio-visual materials. The Libraries are committed to advancing scholarship using the new tools and methods afforded by the digital humanities.