An Introduction to Mass Digitization and the Brown Brothers Collection

By NYPL Staff
June 12, 2019
Stephen A. Schwarzman Building
A ledger page from the Brown Brothers New Orleans Office, 1845 - 1849

An Introduction to Vendor Digitization at the New York Public Library

Digitization at The New York Public Library can occasionally be a complicated process. The NYPL Digital Imaging Unit (DIU) is an amazing in-house team, filled with talented photographers who handle the majority of our digitization efforts. However, some collections require a different approach. Because of the scope of the collection, the condition of the items, a desire for quicker digitization, or a combination of these and other factors, a collection can become a candidate for vendor digitization.

This means that, instead of sending the items to our in-house lab, the collection is sent to a vendor who specializes in digitizing documents. Through a process that involves curatorial and technical services teams working closely with the vendor, the objects are shipped out, digitized, and then sent back, along with all the digitized files. After quality checking, creating metadata, and ingesting the files into our repository, the files are made available on the NYPL Digital Collections for anyone to see!

Thanks to support from a CLIR Hidden Collections grant, the records of Brown Brothers & Company were digitized. Brown Brothers & Co. is a significant American banking house with a long history, and the bank’s records at NYPL include ledgers, journals, custom house entries, consignments, and records of sale for the company between 1825 and 1880.

Mostly covering the career of James Brown, the creator of the New York affiliate of Brown Brothers & Co., the 176 volumes contained in this collection document the growth and change of the company, including the business it conducted in its New Orleans and Havana Offices.

With various conditions to contend with, the great size of the physical volumes to consider, and more than 110,000 pages in total to photograph, this collection was a prime candidate for vendor digitization.

Off to the Vendor!

When it’s been decided that a collection will be digitized by a vendor, the project head will work closely with the Mass Digitization Coordinator, who works with the vendor to set up file specifications, naming conventions, and shipping schedules. 

Working with the Registrar’s Office, the vendor and the Mass Digitization Coordinator schedule shipping or pick-up/drop-off dates for the physical objects. In the case of the Brown Brothers collection, the number and size of the volumes meant they had to be sent out in two batches. This can make the Registrar’s job even more complicated, as it’s important to keep track of exactly where all objects are. Once the items have been sent to the vendor, digitization can begin!

Digitization

As the vendor works to digitize the items, it’s important to establish a schedule for the delivery of digital files, especially when there are a lot to deal with. When managing large collections, it’s a good idea to know roughly how many files will be created. There’s the front and back matter of the volumes to consider, and both sides of every page. With large volumes, the number of files created can quickly add up.

Because of the volume of images we knew Brown Brothers would produce, we decided to have files sent back in batches of about 15,000 images each. These batches arrive on large hard drives (8 TB of storage space) and are copied onto a server owned by NYPL for safekeeping. In total, the Brown Brothers Collection involved 13 of these hard drives.
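
The kind of rough estimate described above can be sketched in a few lines of Python. This is only an illustration: the function names, the allowance for covers and end matter, and the sample leaf counts are all hypothetical and are not figures from the Brown Brothers project.

```python
import math


def estimate_image_count(leaves, extra_images=0):
    """Estimate the images needed for one volume.

    leaves: number of physical leaves (each has a front and a back).
    extra_images: assumed allowance for covers, front matter, and back matter.
    """
    return leaves * 2 + extra_images


def estimate_batches(total_images, batch_size):
    """Number of delivery batches, rounding up for a final partial batch."""
    return math.ceil(total_images / batch_size)


# Hypothetical volumes: (leaves, extra images for covers and end matter)
volumes = [(300, 6), (450, 6), (275, 4)]

total = sum(estimate_image_count(leaves, extra) for leaves, extra in volumes)
print(total)                          # 2066
print(estimate_batches(total, 1000))  # 3
```

Even a rough figure like this helps when agreeing on a delivery schedule and judging how much storage a batch is likely to need.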

Quality Checking 

After we’ve received the files, the images don’t immediately go into our digital collections. At NYPL, we have image specifications that all digital images must meet to be accepted. This quality checking process can take a while, particularly when there are more than 15,000 images in a batch.

Here are some quality checks that happened with the Brown Brothers collection:

  • Making sure the images are exposed correctly. If images are over- or under-exposed, information can be lost and is hard to recover. Poor exposure also makes the images look bad and makes the information harder for the end user to parse. We want the pages to be easy to read.
     
  • Making sure all pages of the volume have made it into the final collection. Sometimes, pages can be missed during digitization, resulting in images not making it onto the hard drive, or sometimes the page is actually missing from the volume. No matter the issue, it’s important to make sure the entire volume has made it to digital form. 
     
  • Making sure all the files have made it is also important. We ask for two copies of every image. One is uncropped and unedited, known as the Preservation Master, and this is the version from which all other copies are made. The other is a Service Copy, which is cropped and slightly smaller in size. This is usually the version you see when looking at the item on the NYPL Digital Collections. The smaller size makes it easier to serve to the public.
     
  • There should be the same number of Preservation Master files and Service Copies, but sometimes there are not. When this happens, it’s important to track down where the difference is coming from. Sometimes, there’s an extra copy of an image; other times, a file is missing entirely.
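
The file-count check in the last item above can be sketched as follows. This is a minimal example, not the tool NYPL uses: it assumes that a Preservation Master and its Service Copy share the same file name apart from the extension, and the file names and formats shown are hypothetical.

```python
from collections import Counter
from pathlib import PurePath


def reconcile(master_names, service_names):
    """Compare two lists of file names by stem (the name without extension).

    Returns the stems with no Service Copy, the stems with no Preservation
    Master, and the stems that appear more than once in either list.
    """
    masters = Counter(PurePath(name).stem for name in master_names)
    services = Counter(PurePath(name).stem for name in service_names)
    duplicates = {
        stem
        for counts in (masters, services)
        for stem, count in counts.items()
        if count > 1
    }
    return {
        "missing_service": sorted(set(masters) - set(services)),
        "missing_master": sorted(set(services) - set(masters)),
        "duplicates": sorted(duplicates),
    }


# Hypothetical names; real naming conventions are agreed with the vendor.
masters = ["vol001_0001.tif", "vol001_0002.tif", "vol001_0002.tif", "vol001_0003.tif"]
services = ["vol001_0001.jpg", "vol001_0002.jpg"]

report = reconcile(masters, services)
print(report["missing_service"])  # ['vol001_0003']
print(report["missing_master"])   # []
print(report["duplicates"])       # ['vol001_0002']
```

Comparing names rather than only totals matters because an extra copy of one image and a missing copy of another can cancel out, leaving the two counts equal even though something is wrong.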

For the Brown Brothers project, a lot of images came back that didn’t meet our specifications, which can happen when dealing with vendors. While our in-house digitization team does great work, and often exceeds the specifications we give to vendors, when such a large amount of work is done in a short amount of time, mistakes can happen.

If the issues are small, such as a missing cropped file or color that is a bit off, they can be fixed in house. However, if the issues are larger, such as missing uncropped files, significantly under- or over-exposed images, or parts of a page being cut off, the vendor needs to redo the digitization.

Into The Repository

Brown Brothers collection thumbnails on NYPL Digital Collections

After being quality checked, the files are ready to be added to our repository and uploaded to our Digital Collections. This is a whole new process in the life of a digital file.

For a file to be added to our repository, it’s important to make sure its metadata is in place, which means working with the Metadata Services Unit. The MSU collaborates with NYPL archivists, Digital Curatorial Assistants, and collection specialists to import, remediate, and create metadata for our digital objects.

Metadata is essential for any digital object; it includes details like description, title, volume number, creator, and rights information. Without metadata, it would be almost impossible to find the objects online.
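
To make the idea concrete, here is a small sketch of a metadata record and a completeness check in Python. The class, its field names, and the sample values are hypothetical and only mirror the details listed above; this is not NYPL’s actual metadata schema.

```python
from dataclasses import dataclass, fields


@dataclass
class ItemMetadata:
    """Hypothetical record holding the kinds of details mentioned above."""
    title: str = ""
    description: str = ""
    volume_number: str = ""
    creator: str = ""
    rights: str = ""


def missing_fields(record):
    """Return the names of fields that are empty or whitespace-only."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Placeholder values for illustration only.
record = ItemMetadata(
    title="Ledger (placeholder title)",
    creator="Brown Brothers & Company",
    volume_number="1",
)
print(missing_fields(record))  # ['description', 'rights']
```

A check like this would flag records that still need attention before the files they describe are made public.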

The Brown Brothers collection already had a finding aid on the NYPL archives portal, which made metadata creation a little bit easier. However, some metadata, like exactly how many images or pages a volume has, can’t be added until digitization is complete. By creating as much metadata as possible beforehand, and then working with the Mass Digitization Coordinator to create on-the-spot metadata when needed, the MSU makes adding the files to the repository as seamless as possible.

Once the metadata is in place, the files have been confirmed as meeting our specifications, and all file counts have been double-checked, the files can be added to our repository. Through a process of assigning IDs to all the images and using a custom-built in-house tool, the Mass Digitization Coordinator transfers the files from the server into our repository.

When this transfer is done, the MSU is informed. As soon as the final approval is added for the metadata, the items show up on Digital Collections and can be viewed by the public. Take a look at the Brown Brothers collection!