Presenter(s): Matt Teichman, Obi Obetta and Nischchay Karle


Email increasingly forms a core portion of many archives, and while tools are available for basic preservation of email messages, less consideration has been given to archival preservation and control of email attachments. It is essential to future research that these important records are preserved in the context of their associated messages.

With the generous support of an “Email Archives: Building Capacity and Community” grant from the University of Illinois, the University of Chicago Digital Library Development Center has created a free and open-source tool called Attachment Converter. Attachment Converter utilizes file format conversion utilities already installed on a user’s system to batch-convert common email attachments to formats recommended for archival preservation and access, retaining the connection between the migrated attachments and their associated emails.

In this workshop, iPres attendees will learn the basic backend anatomy of an email, how to convert email from PST to MBOX, and how to inspect and analyze the contents for common attachments. We will demonstrate how to migrate attachments to preservation formats using Attachment Converter and discuss how archivists without UNIX/Linux experience can collaborate with their institution’s IT staff to implement the tool.


The workshop will have two parts. First, we will explore the basic technical components of email and share techniques for working with them. The full email specification is too complicated for a 90-minute workshop, but we will provide everything that an archivist needs to know for the purpose of working with email attachments. The goal is for all attendees, regardless of their technical background, to walk away with the ability to convert email from Outlook PST to MBOX format, open the MBOX in a text editor, and inspect the raw contents of the mailbox. Second, we will teach attendees how to set up and use Attachment Converter.
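As a rough illustration of what "raw contents" means here, the following minimal sketch uses Python's standard `email` library to build a message with one attachment and print it as it would appear in a text editor. The addresses, subject, file name, and placeholder bytes are all invented for the example; it is not taken from the workshop materials or from Attachment Converter.

```python
from email.message import EmailMessage


def raw_message_with_attachment():
    """Build a small multipart message and return its raw text form."""
    msg = EmailMessage()
    # All header values below are made up for illustration.
    msg["From"] = "sender@example.org"
    msg["To"] = "archive@example.org"
    msg["Subject"] = "Quarterly report"
    msg.set_content("Report attached.")
    # The attachment becomes its own MIME part with its own headers;
    # binary data is base64-encoded so the message stays plain text.
    msg.add_attachment(
        b"%PDF-1.4 placeholder bytes",
        maintype="application",
        subtype="pdf",
        filename="report.pdf",
    )
    return msg.as_string()


raw = raw_message_with_attachment()
print(raw)
```

In the printed text, the attachment is the part whose headers include `Content-Disposition: attachment`, followed by a block of base64 characters. In an MBOX file, each message of this shape is preceded by a separator line beginning with `From `.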

The workshop is designed to be interactive, and attendees are encouraged to follow along on their own computers using a sample mailbox that we will provide. The sample emails will contain attachments in file formats archivists may commonly wish to preserve: JPEG, GIF, PCX, PDF, DOC, DOCX, RTF, XLS, and XLSX. We will show participants how to open a mailbox in a text editor, how to identify where the attachments are in an email, and how to guess the file format of an attachment; we will also explain why that has to be a guess. We will then demonstrate how to convert all the attachments in a single email to archivally stable formats using Attachment Converter. Following that, we will show how to batch-convert attachments in an entire mailbox.
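The inspection step described above can be sketched with Python's standard `mailbox` module. This is a minimal sketch, not the method Attachment Converter uses: the function names, the short signature table, and the demo mailbox are all assumptions made for illustration. It also hints at why format identification is a guess: the declared content type and file name are supplied by the sender and may be generic or wrong, and leading bytes often identify only a container (DOCX and XLSX both begin like ZIP files).

```python
import mailbox
from email.message import EmailMessage

# Leading bytes for a few of the formats named above (illustrative, not exhaustive).
MAGIC = [
    (b"%PDF-", "PDF"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"GIF87a", "GIF"),
    (b"GIF89a", "GIF"),
    (b"{\\rtf", "RTF"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 container (DOC or XLS)"),
    (b"PK\x03\x04", "ZIP container (DOCX or XLSX)"),
]


def guess_format(data):
    """Guess a format from the first bytes of an attachment."""
    for prefix, label in MAGIC:
        if data.startswith(prefix):
            return label
    return "unknown"


def list_attachments(mbox_path):
    """Return (subject, filename, declared type, guessed format) per attachment."""
    rows = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            for part in msg.walk():
                if part.get_content_disposition() != "attachment":
                    continue
                data = part.get_payload(decode=True) or b""
                rows.append((msg["Subject"], part.get_filename(),
                             part.get_content_type(), guess_format(data)))
    finally:
        box.close()
    return rows


def write_demo_mbox(mbox_path):
    """Hypothetical helper: write a one-message mailbox for trying the sketch."""
    msg = EmailMessage()
    msg["From"] = "sender@example.org"
    msg["Subject"] = "Demo"
    msg.set_content("See attached.")
    # Deliberately generic declared type and extension: only the bytes reveal a PDF.
    msg.add_attachment(b"%PDF-1.4 placeholder", maintype="application",
                       subtype="octet-stream", filename="scan.bin")
    box = mailbox.mbox(mbox_path)
    box.add(msg)
    box.flush()
    box.close()
```

Calling `write_demo_mbox("demo.mbox")` and then `list_attachments("demo.mbox")` yields one row in which the declared type is `application/octet-stream` while the byte-level guess is PDF, which is the kind of mismatch that makes a declared type unreliable on its own.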

Finally, we will explain how to configure Attachment Converter to use a conversion utility of the user’s choice. Because this process requires more technical expertise, we will not go into full detail, but we will explain the feature at a high level and stress that any archivist whose institution has IT staff familiar with UNIX/Linux system administration should be able to make use of it.

Event Timeslots (1)

Tuesday, September 19