The whole thing, AI-created
It was already a game-changer when, on the first of April, the 1950 U.S. census images were released by the U.S. National Archives (NARA): using handwriting recognition software, NARA had produced a basic down-and-dirty index of that census that researchers could use on the NARA 1950 website.
The Legal Genealogist was impressed.
Seriously impressed.
As I wrote at the time, the census launch was smooth and the index — limited though it was by the ability of machines to read handwriting and by the fact that it was a names-only index — was a huge help in locating family members on the census.1
I figured, however, that we wouldn’t get any better indexing for months, while Ancestry first used its own proprietary handwriting recognition software on the images, then sent the machine-read index over to FamilySearch, then FamilySearch volunteers verified and corrected the entries, and then Ancestry released the reviewed index state by state.
And that’s what it looked like was going to happen, with releases announced for states with small populations that got verified first: places like Delaware, Wyoming, Vermont, South Dakota.
And that’s certainly what Ancestry expected to happen. Back in January, when it announced it was going to be using handwriting recognition technology to produce a first-pass index, it said: “Ancestry anticipates the indexing of the 1950 Census to be completed and available on Ancestry.com this summer, with states released in real time upon completion.”2
Yesterday, just thirty-six days after the release of the census, just thirty-six days after it first got its hands on those images, Ancestry decided that it had enough confidence in the overall value of its machine-created index to release that index for general use.
That release, announced yesterday, means that there is a basic first-pass machine-created index for just about all of the United States for the 1950 census that genealogists can use right now on Ancestry, augmenting and supporting the index on the NARA website.
Thirty-six days after the release of the images.
In other words, handwriting recognition software is a total game-changer.
Now… Remember that, like the NARA index, the Ancestry machine-created index is a first step towards a full final verified index.
• Ancestry’s machine-created index isn’t perfect. Like NARA’s version, it indexed my Cottrell relatives in Virginia as Cattrell — though it did get most of the first names right.3 But it got my parents right as Hugo and Hazel Geissler,4 rather than NARA’s Hugar and Hegel Guisler.
• The index released yesterday doesn’t include all of the eventual search fields from the census — it’s basically names, ages, birthplaces and residence locations for now.
• It doesn’t include a few counties and enumeration districts in Ohio and Michigan where the Census Bureau tested out the use of self-reporting forms, rather than enumerator-collected data.
• And, of course, it doesn’t include the verifications and corrections that haven’t been done yet by the army of volunteers working to improve the index at FamilySearch. (Which means we all need to keep volunteering and keep working to verify and improve the index.)
So it’s only the first step towards making the census fully searchable.
But what a big first step…
I realize that handwriting recognition technology isn’t going to work on every record set. The census is a fillable form, the form was known in advance so the software could be calibrated to the fields it included, yadda yadda yadda. There are lots of reasons why this works best with this kind of record.
But a genealogist can dream, no?
Dream of the day when handwriting recognition technology can be used on probate records and court records and…
For now, you’ll have to excuse me.
I still need to find cousin Willy in 1950…
Cite/link to this post: Judy G. Russell, “Ancestry releases 1950 index,” The Legal Genealogist (https://www.legalgenealogist.com/blog : posted 5 May 2022).
SOURCES
1. See Judy G. Russell, “Time travel with NARA,” The Legal Genealogist, posted 1 Apr 2022 (https://www.legalgenealogist.com/blog : accessed 5 May 2022).
2. “Ancestry® to Apply Handwriting Recognition Artificial Intelligence to Create a Searchable Index of the 1950 U.S. Census,” Ancestry Corporate Blog, posted 27 Jan 2022 (https://www.ancestry.com/corporate/blog/ : accessed 5 May 2022).
3. 1950 U.S. census, Fluvanna County, Virginia, population schedule, enumeration district (ED) 33-3, sheet 13, dwelling 134, Clay R. Cottrell household; digital image, Ancestry.com (https://www.ancestry.com : accessed 4 May 2022).
4. Ibid., 1950 U.S. census, Golden, Jefferson County, Colorado, population schedule, enumeration district (ED) 30-17, sheet 8, dwelling 93, Hugo H. Geissler household.
I have started getting Ancestry hints from the 1950 U.S. Census.
FamilySearch is already using its new Get Involved system to do handwriting recognition. At RootsTech they said they are planning to use it on all of the probate records and land deeds for the United States, and they are working on it now. So perhaps you will get your wish and they will eventually do other court records as well. I sure hope so!
I saw a presentation on this some years back. Still waiting.
The Ancestry index is amazing. I was not able to find my parents, who were still a year away from meeting. Dad wasn’t at home with his parents in Wisconsin. My grandmother was newly widowed and no longer at the address on Grandpa’s death certificate. Mom would have been with her, but neither of their names came up in the NARA index. I was sure they’d moved in with cousins, but which ones? Grandma had a lot of cousins. St. Louis was too big for me to manually search. Figured I’d be in for a long wait for the FamilySearch index. But as soon as I heard about Ancestry’s, I found my Dad easily in a state I would have never guessed (no family there), and Mom and Grandma were indeed with cousins, but ones I never knew lived in St. Louis. As a bonus, I also found Mom’s sister, who might have also been with Mom and Grandma but wasn’t.
Like you, I am very impressed with the AI. I’m one of the volunteers working to review the data on FamilySearch (an easy-peasy way to contribute) and it’s getting it right much of the time.
Still having trouble with Volga German names in Nebraska, so still having to go to NARA to find the ED, as most are still living at the same place as in the 1940 census. I have made a tag, “1950 Census,” on my tree so I can go back and make certain that the verification process has found them and, if not, correct them and have it stick.
I look forward to trying Ancestry’s indexing. I have been hunting for ancestors where I knew their 1940 census address, hoping that I would find that they were still there; it has been fifty-fifty, I would say.
On the other hand, I found a 1950 census entry yesterday that makes me wonder about the enumerator. The head of the household is shown with an incorrect surname, though not that far off from the actual one. Then the mother-in-law is shown correctly, and the daughter too. But then there are two children shown with the last name of the married daughter, yet their relationship is given as son and daughter of the head of household. But he is 64, his wife is 49, the daughter is 25, and the children are 1 year (shown in the index as 2) and 7 weeks (shown in the index as seven years).
To quote Desi Arnaz, somebody’s got some explaining to do. I have entered corrective comments.