New ethnicity estimates
AncestryDNA rolled out its new Ethnicity Estimate Preview to some 6,000 of its autosomal DNA customers for a first look yesterday, with a broader rollout to all customers expected within the next month or so.
There is quite a bit to like about the new estimates:
As you can see in this graphic, they are far more detailed, with many more regions being reported separately than in the existing estimates. A total of 26 regions — nine in Africa, three in Asia, two in the Pacific Islands, two in West Asia, one for Native American ancestry in North and South America, and nine in Europe — are included in the new estimates.
One regional breakdown of particular use to African Americans is the division of West Africa into six regions: Senegal, Mali, Nigeria, Ivory Coast/Ghana, Benin/Togo and Cameroon/Congo. Another genealogically useful breakdown is that of the British Isles, which now separately reports Irish ethnicity. Yet another is the division of Southern Europe to separate the Iberian peninsula from Italy and Greece.
AncestryDNA has moved away from relying on public databases for its reference populations for each region — those persons found in the region now whose genetic makeup is expected to be close to the historical DNA for the area.
It is now relying primarily on data for reference populations drawn from the samples collected by the Sorenson Molecular Genealogy Foundation, whose DNA assets were acquired by Ancestry in 2012.
Some 3,000 reference samples worldwide from the 26 regions were used for purposes of comparison after being carefully vetted for genealogical accuracy. Each region is represented by anywhere from a low of 16 to a high of 645 people, all of whom were born in the region and have all four grandparents who were born in the region.
More of the overall test sample — roughly 10 times as much — is being analyzed against the reference populations, and analyzed repeatedly to arrive at an average, which is the number reported for that region.
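For readers who like to see the arithmetic, here is a minimal sketch of the idea of averaging repeated analyses for a single region. It is an illustration only, not AncestryDNA's actual method: the function name summarize_runs and the five per-run percentages are hypothetical, chosen so that the average and the spread are easy to follow.

```python
from statistics import mean


def summarize_runs(run_percentages):
    """Return (average, low, high) for repeated estimates of one region.

    run_percentages: the percentage assigned to the region in each
    repeated analysis (hypothetical input, for illustration only).
    """
    if not run_percentages:
        raise ValueError("need at least one run")
    average = round(mean(run_percentages))
    return average, min(run_percentages), max(run_percentages)


# Hypothetical per-run values, not real AncestryDNA output.
runs = [22, 35, 49, 63, 76]
average, low, high = summarize_runs(runs)
print(f"{average}% (range {low}%-{high}%)")
```

The single number is the average across the runs, while the lowest and highest runs show how much the individual analyses can disagree with one another.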
There is an enormous amount of background information available in contextual help, where you can find out more without ever leaving the page with your results. The help ranges from simple explanations of what’s being presented all the way up to a highly technical whitepaper that would make the head of a statistician spin.
And although AncestryDNA continues to present its results on the opening page as straight percentages, it’s taking a couple of important additional steps to try to drive home to its customers that the percentages are just estimates, not carved in stone.
First, as you can see here on the right, when you click on each specific result — here my 49% Great Britain result — the box opens up to display the actual range of results seen in various parts of your DNA when compared to the reference population. And what the range says is that, even though I average out at 49%, I could actually have as little as 22% of my DNA in common with the British reference population — or as much as 76%.
Second, as you can see below, although AncestryDNA reports that its Great Britain results are found in persons “primarily located in England, Scotland and Wales,” it makes it clear that the same DNA signature can be found in lower concentration in Ireland, France and Germany, plus Belgium and the Netherlands, and (if you follow the circles all the way out) in lower concentrations still in Switzerland, and parts of Italy, Austria, the western part of the Czech Republic and Denmark.
Overall, the combination in the new estimates of more regions, more carefully selected reference populations and more analysis tends, in most cases, to bring the results pretty much in line with the Ancestry Composition results of 23andMe, considered to be — at the moment — the industry leader in these sorts of ethnicity estimates.
That’s the good news.
The bad news is, that’s only true in most cases. For others, the new estimates are — at best — a mixed bag. As with anything that relies principally on extrapolating data through statistical analysis, some results are going to change in ways that are jarring compared to the paper trail even as perhaps the majority of results move closer to the paper trail.
Particularly for people with large percentages of German or French ancestry, the update is a real disappointment. From what I can see, the reference populations simply don’t distinguish those populations well at all, and it appears that many results formerly reported as Central European are now being lumped in with either Great Britain or with Scandinavia.
AncestryDNA’s vice president for genomics and bioinformatics, Dr. Catherine Ball, reported in a conference call yesterday that the most difficult populations to differentiate were the French, German and English populations because of centuries of wars and intermingling between the continent and the British Isles. She also noted that there wasn’t enough information available yet to confidently detect differences between German and French ancestry.
Dr. Ball also said it wasn’t possible given today’s scientific know-how to say just how far back in our individual history our ancestors may have been British or Scandinavian or Irish. The comparisons are being made against persons alive today whose own ancestry can be traced back to great-grandparents living in the region from which the reference population was chosen. But a mass invasion 500 years ago would inevitably skew the results.
And AncestryDNA’s Dr. Ken Chahine added that the word “estimate” for these results had been chosen very deliberately. “We have a lot more we want to learn, and areas where we know we can continue to improve,” he said.