The whole thing, AI-created
It was already a game-changer when, on the first of April, the 1950 U.S. census images were released by the U.S. National Archives (NARA): using handwriting recognition software, NARA had produced a basic down-and-dirty index of that census that researchers could use on the NARA 1950 website.
The Legal Genealogist was impressed.
As I wrote at the time, the census launch was smooth and the index — limited though it was by the ability of machines to read handwriting and by the fact that it was a names-only index — was a huge help in locating family members on the census.1
I figured, however, that we wouldn’t get any better indexing for months, while Ancestry first used its own proprietary handwriting recognition software on the images, then sent the machine-read index over to FamilySearch, then FamilySearch volunteers verified and corrected the entries, and then Ancestry released the reviewed index state by state.
And that’s what looked like it was going to happen, with the first releases announced for small-population states that were verified first: places like Delaware, Wyoming, Vermont and South Dakota.
And that’s certainly what Ancestry expected to happen. Back in January, when it announced it was going to be using handwriting recognition technology to produce a first-pass index, it said: “Ancestry anticipates the indexing of the 1950 Census to be completed and available on Ancestry.com this summer, with states released in real time upon completion.”2
Yesterday, just thirty-six days after the release of the census, just thirty-six days after it first got its hands on those images, Ancestry decided that it had enough confidence in the overall value of its machine-created index to release that index for general use.
That release, announced yesterday, means that there is a basic first-pass machine-created index for just about all of the United States for the 1950 census that genealogists can use right now on Ancestry, augmenting and supporting the index on the NARA website.
Thirty-six days after the release of the images.
In other words, handwriting recognition software is a total game-changer.
Now… Remember that, like the NARA index, the Ancestry machine-created index is a first step towards a full final verified index.
• Ancestry’s machine-created index isn’t perfect. Like NARA’s version, it indexed my Cottrell relatives in Virginia as Cattrell — though it did get most of the first names right.3 But it got my parents right as Hugo and Hazel Geissler,4 rather than NARA’s Hugar and Hegel Guisler.
• The index released yesterday doesn’t include all of the eventual search fields from the census — it’s basically names, ages, birthplaces and residence locations for now.
• It doesn’t include a few counties and enumeration districts in Ohio and Michigan where the Census Bureau tested out the use of self-reporting forms, rather than enumerator-collected data.
• And, of course, it doesn’t include the verifications and corrections that haven’t been done yet by the army of volunteers working to improve the index at FamilySearch. (Which means we all need to keep volunteering and keep working to verify and improve the index.)
So it’s only the first step towards making the census fully searchable.
But what a big first step…
I realize that handwriting recognition technology isn’t going to work on every record set. The census is a fillable form, the form was known in advance so the software could be calibrated to the fields it included, yadda yadda yadda. There are lots of reasons why this works best with this kind of record.
But a genealogist can dream, no?
Dream of the day when handwriting recognition technology can be used on probate records and court records and…
For now, you’ll have to excuse me.
I still need to find cousin Willy in 1950…
Cite/link to this post: Judy G. Russell, “Ancestry releases 1950 index,” The Legal Genealogist (https://www.legalgenealogist.com/blog : posted 5 May 2022).
1. See Judy G. Russell, “Time travel with NARA,” The Legal Genealogist, posted 1 Apr 2022 (https://www.legalgenealogist.com/blog : accessed 5 May 2022).
2. “Ancestry® to Apply Handwriting Recognition Artificial Intelligence to Create a Searchable Index of the 1950 U.S. Census,” Ancestry Corporate Blog, posted 27 Jan 2022 (https://www.ancestry.com/corporate/blog/ : accessed 5 May 2022).
3. 1950 U.S. census, Fluvanna County, Virginia, population schedule, enumeration district (ED) 33-3, sheet 13, dwelling 134, Clay R. Cottrell household; digital image, Ancestry.com (https://www.ancestry.com : accessed 4 May 2022).
4. Ibid., 1950 U.S. census, Golden, Jefferson County, Colorado, population schedule, enumeration district (ED) 30-17, sheet 8, dwelling 93, Hugo H. Geissler household.