43 Million More Images Uploaded to the National Archives Catalog Since June 2021

NARA’s “Record Group Explorer” page at https://www.archives.gov/findingaid/record-group-explorer is a good place to get information on the number of digital images available in NARA’s online Catalog at https://catalog.archives.gov/ as well as the immense quantities of textual records that exist. As of July 2022, there are 179,271,436 images in the National Archives Catalog – or approximately 1.541% of 11.6 billion textual records. (And that’s only textual records: that count does not include motion pictures, audio recordings, or data files.)

Just over a year ago, in June 2021, there were 135,404,569 images in the National Archives Catalog, or about 1.175% of an estimated 11,524,683,948 textual records. That’s an increase of over 43 million digital images in a little over a year! Progress! (Back in June 2020, those numbers stood at 109,384,656 images, or 0.95% of an estimated 11,509,956,576 textual records.)
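For readers who want to check the arithmetic, here is a minimal Python sketch that recomputes the percentages and the year-over-year increase from the figures quoted above. The variable and function names are my own, chosen for illustration; nothing here comes from NARA’s website or any NARA tool.

```python
# Figures quoted in this post, taken from Record Group Explorer snapshots.
# Each entry is (images online, estimated total textual records).
snapshots = {
    "June 2020": (109_384_656, 11_509_956_576),
    "June 2021": (135_404_569, 11_524_683_948),
}
images_july_2022 = 179_271_436


def percent_digitized(images, total_records):
    """Return the share of textual records available as images, in percent."""
    return 100 * images / total_records


for label, (images, total_records) in snapshots.items():
    print(f"{label}: {percent_digitized(images, total_records):.3f}%")

# Increase between June 2021 and July 2022.
increase = images_july_2022 - snapshots["June 2021"][0]
print(f"Added since June 2021: {increase:,}")
```

The July 2022 percentage is left out of the loop because the post gives only a rounded total (11.6 billion) for that month, so the result would not match the quoted 1.541% exactly.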

Want to follow along and see what’s added? The “What’s New in the National Archives Catalog” page at https://www.archives.gov/research/catalog/whats-new links to record series to which digital images have been added – and may also highlight a few interesting items.

Digitization is a slow process. Records are typically one-of-a-kind items that may be fragile, bound into volumes, or otherwise unsuitable for “high speed” automatic sheet-feeding imaging systems. Records may require unfolding; removal of staples, pins, clips, and other fasteners; repair by trained records conservators; and other preparation for imaging, such as arrangement and new or improved description. Just consider the handling care required for the bundle of records shown below, which was just a small part of a small series, Records of Clerks, Wagonmasters, and Printers Employed at Various Posts, 1865-66 (National Archives Identifier 4707062), from Record Group 92, Records of the Office of the Quartermaster General.

Bundle of records from “Records of Clerks, Wagonmasters, and Printers Employed at Various Posts, 1865-66” (National Archives Identifier 4707062), prior to digitization.

This small series is fully digitized. Each individual file unit is now described in the National Archives Catalog by the name of the commanding officer and his geographic location, which vastly improves discoverability by researchers: see the screenshot below, which shows a few of the 389 file titles.

What records will you discover online in the National Archives Catalog?

Dr. Colleen Shogan Nominated to be the 11th Archivist of the United States

President Biden has nominated Dr. Colleen Shogan to be the 11th Archivist of the United States. Tenth Archivist David S. Ferriero retired in April 2022.

Dr. Shogan is the Senior Vice President and Director of the David Rubenstein Center for White House History at the White House Historical Association.

She previously worked for over a decade at the Library of Congress, serving as the Assistant Deputy Librarian for Collections and Services, the Deputy Director of the Congressional Research Service, and the Deputy Director of National and International Outreach. Prior to joining the Library, Dr. Shogan was a policy staffer in the Senate, handling matters on defense, appropriations, transportation, small business, and science and technology.

Dr. Shogan was the Vice Chair of the Women’s Suffrage Centennial Commission and now serves as the Chair of the Board of Directors at the Women’s Suffrage National Monument Foundation, designated by Congress to build the first Washington, D.C. memorial dedicated to the early movement for women’s equality. She is an Adjunct Professor of Government at Georgetown University, and a member of the United States Capitol Historical Society Council of Scholars.

A native of Pittsburgh, Dr. Shogan holds a B.A. in Political Science from Boston College and a Ph.D. in American Politics from Yale University, where she was a National Science Foundation Graduate Fellow. Prior to working in Congress, Dr. Shogan was an Assistant Professor of Government and Politics at George Mason University. In addition to scholarly publications, Dr. Shogan is a mystery writer and has published seven novels.

The nomination now goes to the Senate for confirmation.

Memorial Day Remembrance: Beneath His Shirt Sleeves

On this Memorial Day, as we remember the fallen heroes who sacrificed their lives to defend our freedoms and preserve one United States of America, I respectfully direct your attention to an excellent two-part article by archives specialist Jackie Budell entitled “Beneath His Shirt Sleeves: Evidence of Injury” with Part I here and Part II here. This article highlights the sacrifice and stories of eight Union Civil War veterans who lost most – or part – of an arm during their war service.

1950 Census Website Improvement:  Transcribed Names are Now Shown in Search Results!

On May 17, 2022, NARA’s 1950 Census website development team made a wonderful improvement to the name search feature.  Names transcribed by humans are now shown in the search results above and below the census page image.  What does this mean?  Let’s look at an example.

Let’s search for Mildred Lauska in Ohio.  Fortunately, some human transcribed her name.

Here’s the search result showing both OCR (optical character recognition) results AND human transcription results above the census page image in the upper right under “Matched Name(s).”  (Click on the image for a bigger view.)

Mildred Lauska, ED 92-47, with search result above the census page image

Here’s the same search result showing both OCR (optical character recognition) results AND human transcription results below the census page.  (Click on the image for a bigger view.)

  • The OCR results generated by “Machine Learning (AI) Extracted Names” are shown first:  Only Mildred’s husband, “Lauska melvins” is boldfaced because OCR had not transcribed Mildred or their daughters Joanne and Judith.
  • The “User Contributed Transcriptions” are shown second:  All persons with the Lauska surname are shown in bold:  Melvin Lauska, Mildred Lauska, Joanne Lauska, and Judith Lauska.

Mildred Lauska, ED 92-47, with search result below the census page image

Important Takeaways:

  • Thank you for your transcriptions!  They matter!  They significantly improve the search results!  In the Lauska family example, all four members of the household can easily be found instead of just one.
  • Now You Can See Everyone’s Transcriptions at Work!  Yay!
  • Narrowing your name search by state and county is always better if the name was significantly misread by the OCR and has not been transcribed, or if it contains common names (John, Smith, and so forth)!
  • Thank you for your suggestions for website improvements!

David S. Ferriero, 10th Archivist of the United States, Retired on April 30, 2022

David S. Ferriero, 10th Archivist of the United States, retired on April 30, 2022, after 12 years at the helm of the National Archives and Records Administration. A final interview conducted by staff member Victoria Malachi is available on YouTube. Debra Steidel Wall will serve as Acting Archivist until the next Archivist is nominated by the President and confirmed by the U.S. Senate.

1950 Census Transcriptions on NARA’s Official 1950 Census Website

Your transcriptions on NARA’s Official 1950 Census Website at https://1950census.archives.gov/ are generally indexed within 24 hours of submission. By transcribing names, you help to push those spellings – and therefore those records – to the top of search results when other users search for those names. However, your transcription will not appear in the list of “Machine Learning (AI) Extracted Names” at this time.  NARA is working on an enhancement to add the transcribed names to the search results display.

1950 Census: Let’s Understand a Few Important Things

Posted 1 April 2022. Lightly updated 30 April 2023.

The 1950 Census release launched today at https://1950census.archives.gov. It includes a partial name index: primarily first names plus surnames for heads of households and persons in the household with a different surname. The index has a lot of inaccuracies due to optical character recognition (OCR) attempting to decipher the handwriting of 140,000 census enumerators. However, having at least a partial index online on Day 1 is wonderful. Most of the people I was interested in looking for I could find using the name index and some common sense, and if that didn’t work, I knew where to find ED descriptions and maps. I also spent time using the transcription tool to improve the discoverability of whole pages in order to help other researchers; I didn’t limit myself to just the people in whom I was interested. Please help NARA improve the index by transcribing. Your transcriptions become discoverable by others about 24 hours after you input them.

Having spent virtually all day answering reference questions from Twitter, emails, and posts on the History Hub, it’s clear that there are a few things that the researcher community needs to understand a little better. Among these, in no particular order:

(1) Census schedules exist for overseas American military and civilian personnel in Alaska, Hawaii, American Samoa, Canton [Kanton] Island, Guam, Johnston Island, Midway Island, Panama Canal Zone, Puerto Rico, U.S. Virgin Islands, and Wake Island. That’s it. Not in Germany, not in Japan, not anywhere else.

(2) If the census schedule says the “Not at home” household is on page 71 or higher, do not give up because there are “only 20” pages (or whatever) for the Enumeration District. Go to the last image and look at the Sheet Number in the upper right corner of the page. See if it is Sheet 71 or higher, and then work your way back a few pages until you get to page 71 or whatever page your “Not at home” household is on. Read my blog post for more information on Page 71 and up.

(3) Yes, you can download individual images. Look for the three dots below the blue box that says “Help Us Transcribe Names” and click on those three dots. You will then be given an opportunity to download: Click on the word download. Choose the level of quality you want (more pixels are better).

(4) You can share a link to an entire ED. You can also share a link to a single page if you’ve searched by name. Being old school, I copy and paste the link my browser shows, but there is a share feature that can do that, too.

(5) Learn by doing. If you’re not sure what one of the features in https://1950census.archives.gov does, click on it. Words in blue or highlighted in blue are clickable links. Gray features are also often clickable. Play with it.

(6) https://stevemorse.org has a lot of great tools for census research – and more. Become familiar with them. Use them.

(7) NARA has provided a lot of useful resources with identical content at https://1950census.archives.gov/howto and https://www.archives.gov/research/census/1950. Read pertinent 1950 census blog posts: https://twelvekey.com/blog-posts-on-other-sites.

(8) Only one side of the census form was microfilmed in 1952. The original paper records were destroyed in 1961-63. Side 2 – which contained the housing schedules – does not exist.

Ok, thanks for reading! Let’s go transcribe some more!

ICYMI: 50 Million Images Added to NARA’s Catalog Since August 2020

With all the excitement and preparation for the 1950 census over the past several months, you may have missed it: Millions of images of textual records keep being added to NARA’s online Catalog.

According to NARA’s “Record Group Explorer” webpage, as of March 2022 there are 161,492,780 scans online, representing 1.393% of the estimated total of about 11.5 billion textual pages in the custody of the National Archives and Records Administration.

One month earlier, in February 2022, that number was 159,188,420 images: so in just one month, 2,304,360 images were added!

Back in August 2020, there were 111,114,108 images in the Catalog, so in roughly 19 months, 50,378,672 images were added.
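The two differences quoted above can be reproduced with a few lines of Python. This is only a sketch for double-checking the subtraction; the variable names are mine, not anything used by NARA.

```python
# Image counts quoted in this post, from Record Group Explorer snapshots.
august_2020 = 111_114_108
february_2022 = 159_188_420
march_2022 = 161_492_780

# Images added between February 2022 and March 2022.
one_month_gain = march_2022 - february_2022

# Images added between August 2020 and March 2022.
gain_since_august_2020 = march_2022 - august_2020

print(f"Added in one month: {one_month_gain:,}")
print(f"Added since August 2020: {gain_since_august_2020:,}")
```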

Fifty million, that’s a pretty big number. Considering that this growth happened during a pandemic that limited staff access to the buildings – and to the records – that’s pretty impressive.