When you purchase a DNA test from Ancestry, your results include a list of genetic relatives and a breakdown of your Ancestry ethnicity estimates.
I won’t get too caught up with terminology in this article. Just be aware that there are many definitions of “ethnicity”, and many calculations as to the number of ethnic groupings in the world.
Ancestry’s definition is based on a list of regions and communities that it identifies as sharing a common DNA pattern. The Ancestry DNA ethnicity test compares your DNA against its reference database to calculate the regions your ancestors most likely came from.
Reading Your Ancestry Ethnicity Results
Your top-level ethnicity estimate is shown on the main DNA page under the “DNA Story” heading.
Click into your DNA Story to see your full breakdown. You are presented with a map on the left, and your breakdown on the right. Let’s focus first on the percentage breakdown, and understand how to interpret your results.
What Are Ancestry Regions and Communities?
The latest ethnicity white paper from Ancestry says that they have 70 possible regions that they can assign to your DNA. But look at the bottom of your ethnicity breakdown, where they say they have 1,100+ regions.
Confused? It’s not you, it’s them. Ancestry uses “region” to mean two different things. Let’s get this sorted right now.
There are 70 global regions in the Ancestry world. They may or may not have labels that correspond to current political boundaries. So, Ireland and France are regions. But so are “Eastern Bantu Peoples” and “Indigenous Americas – North”. Their “Eastern Bantu” region stretches across Uganda and Kenya.
So, how does Ancestry arrive at a figure of “1,100+”? It’s including sublayers of smaller groupings of DNA patterns within those global regions. It also calls these “genetic communities”. I wish they’d stick to the two different terms consistently!
A Sample Breakdown of Regions and Communities
Take a look at a shortened version of my Ancestry ethnicity estimate:
There are three regions in the picture. Ancestry uses a solid colored circle to show a global region. And it assigns your percentage breakdown at this level.
Notice that there are two levels beneath the Ireland region. “Central Ireland” is a genetic community that groups me and a set of Ancestry DNA testers to a smaller region. And within Central Ireland, Ancestry has defined at least three sub-groups. One of these is an area labeled “Fermanagh & Cavan”. If you’re not familiar with Irish geography, these are two neighboring counties.
Ancestry uses a dotted circle to distinguish communities from regions.
Notice how the other two regions do not have subgroups? Yet the geographical area covered by “Eastern Bantu Peoples” is considerably greater than little ol’ Ireland.
And that 1% region of “Cameroon, Congo & Western Bantu Peoples”? I chuckle when I look at the ethnicity map because that “region” covers a third of the continent of Africa.
Why do some regions have communities, and some don’t? Well, this comes down to Ancestry’s DNA reference database, and how Ancestry goes about calculating ethnicity estimates. Before we jump into the technicalities, let’s take a look at the Ancestry ethnicity map.
The Ancestry Ethnicity Map
Ancestry has partnered with MapBox to produce its interactive visualization map. Zoom in from a high-level view to get a detailed color map that represents the ethnicity estimate on the right of the screen.
How does it work? There’s a great explanation on the MapBox blog.
Basically, Ancestry divides the world map into squares and colors them as a heat map of ethnicities. You will see that some areas “bleed” into others. Take Uganda as a recently formed political entity. At least two Ancestry regions overlap within Uganda’s political borders.
If you click a community or region in your Ethnicity Estimate, the map will zoom in on that geographical area. This is a great way to familiarize yourself with places that are far-flung (from your perspective).
How Does Ancestry Define Ethnic Regions and Communities?
When Ancestry processes your kit, it compares your DNA to samples that it hopes represent different and distinct ethnicities of your ancestors.
For a close-to-perfect accuracy, Ancestry would need access to historic DNA samples dating back centuries. That may become possible with future extraction technology, but it’s not feasible right now. Therefore, your ethnicity test is a compromise scenario. Bear that in mind for when we discuss how accurate these tests can be.
So, Ancestry assembles a reference panel of DNA from current sources. That could include you and me (your consent is required). Here’s my somewhat crude infographic to reduce some of the complexity:
Box 1: DNA Sources
Three public DNA projects are used by Ancestry. These are academic projects to research genomes in specific areas around the world. These are the three:
We Ancestry testers are another source, but Ancestry only includes members who meet certain criteria. Usually, it is looking for testers whose four grandparents come from the same area.
The third source is constructed in the labs. These are synthetic DNA samples that represent specific genetic profiles.
Boxes 2 and 3: Cleansing and Removing Samples
The pile of samples is examined, and some are discarded to try to increase the accuracy of representation. One step looks for DNA testers who share more than 20 cM. The extra “relatives” are discarded, to avoid skewing the results. That’s quite a simple test.
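To make the relatives step concrete, here is a minimal sketch in Python. It assumes we already have a lookup of shared centimorgans between pairs of samples. The function name, the data layout, and the greedy keep-the-first approach are my own illustration, not Ancestry's published procedure.

```python
def drop_close_relatives(sample_ids, shared_cm, threshold_cm=20.0):
    """Keep a subset of samples in which no kept pair shares more than threshold_cm.

    shared_cm maps frozenset({id_a, id_b}) -> shared centimorgans.
    Pairs missing from the lookup are treated as sharing 0 cM.
    Greedy: a sample is kept only if it is unrelated to everything kept so far.
    """
    kept = []
    for candidate in sample_ids:
        related = any(
            shared_cm.get(frozenset((candidate, other)), 0.0) > threshold_cm
            for other in kept
        )
        if not related:
            kept.append(candidate)
    return kept


# Made-up sample IDs and cM values, purely for illustration.
samples = ["s1", "s2", "s3", "s4"]
shared = {
    frozenset(("s1", "s2")): 850.0,  # close relatives
    frozenset(("s2", "s3")): 12.0,   # below the threshold
    frozenset(("s3", "s4")): 45.0,   # above the threshold
}
print(drop_close_relatives(samples, shared))
```

Which member of a related pair gets dropped depends on the order of the list here. A real pipeline would presumably have a better rule for choosing.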
A more complicated process looks for statistical outliers. Let’s say a tester’s family tree shows four grandparents from Ireland but his DNA is matching primarily to samples from Eastern Europe. This sample will be set aside. How could that happen? It could be a case of adoption.
Box 4: Reference Panel
The outcome of this acquisition and evaluation of samples is a panel that represents distinct groupings of DNA. It is this panel to which your DNA is compared.
How Does Ancestry Calculate Ethnicity?
The latest Ancestry white paper on ethnicity goes into the calculations in detail. I’ll try to give the 101 here, while also touching on my take on the issue of accuracy. I’ll delve more into accuracy in a later section.
Let’s say your four grandparents hail from Ireland, France, Mali, and Southern China. For the aim of simplicity, we’ll assume that the ancestral lines stayed within those regions for many generations. Your father’s ethnicity is primarily Ireland/France, and your mother is primarily Mali/Southern China.
Ancestry chops up your DNA into 1001 small pieces. Each piece has DNA contributed by both your parents, but Ancestry cannot determine which strand comes from whom. That makes things even more difficult!
The size of these pieces is chosen to be so small that it will only have one region from each parent. In a particular window, your father’s DNA is either Ireland OR France. And your mother’s contribution is either Mali OR Southern China. There’s not enough room in the window to fit two consecutive different pieces.
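If it helps to picture the chopping, here is a rough Python sketch that splits an ordered list of DNA markers into 1001 contiguous windows. The marker count of 700,000 and the equal-size split are assumptions for illustration only. I don't know how Ancestry actually sets its window boundaries.

```python
def split_into_windows(markers, n_windows=1001):
    """Split an ordered list of markers into n_windows contiguous chunks.

    Chunk sizes differ by at most one marker: the first few chunks
    absorb the remainder when the split is not exact.
    """
    base, extra = divmod(len(markers), n_windows)
    windows = []
    start = 0
    for i in range(n_windows):
        size = base + (1 if i < extra else 0)
        windows.append(markers[start:start + size])
        start += size
    return windows


# Stand-in for marker positions along the genome (illustrative count).
markers = list(range(700_000))
windows = split_into_windows(markers)
print(len(windows), len(windows[0]), len(windows[-1]))
```

Each window holds only a few hundred markers in this sketch, which is the point: the piece is short enough that each parent's contribution should come from a single region.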
Now for the Calculations
Let’s take a step back and remember that Ancestry doesn’t know your ethnic ancestry before it starts the calculations.
The process takes the first window with contributions (or “alleles”) from your Ma and Pa. Let’s say we can label this pair of alleles as A and T at position N.
Now, the Reference Panel comes into play. Ancestry has already performed statistical analysis of alleles at position N across every sample. It turns out that most samples with A and T at position N come from Southern China (I’m making that up). So, window #1 is labeled as most likely to represent the Southern China region.
Then we move on to window #2, and so on. At the end of the process, we should have near enough to a 25% breakdown labeled Ireland, France, Mali, and Southern China. The percentages won’t divide equally, due to the random nature of inheritance – but they’ll be ballpark quarters.
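Here is a toy version of that labeling and tallying in Python. Everything in it is made up for illustration: the allele counts, the region names in the panel, and the simple pick-the-most-frequent rule. The white paper describes a more involved statistical model, so treat this as a sketch of the idea rather than Ancestry's calculation.

```python
from collections import Counter


def label_window(observed_pair, panel_counts):
    """Return the region whose reference samples most often show observed_pair.

    panel_counts maps region -> Counter of allele pairs seen in that
    region's reference samples for this one window.
    """
    def frequency(region):
        counts = panel_counts[region]
        total = sum(counts.values())
        return counts[observed_pair] / total if total else 0.0

    return max(panel_counts, key=frequency)


def estimate_percentages(window_labels):
    """Turn a list of per-window region labels into percentages."""
    tally = Counter(window_labels)
    total = len(window_labels)
    return {region: round(100 * n / total, 1) for region, n in tally.items()}


# Window #1: invented reference counts for the allele pair at position N.
panel_window_1 = {
    "Southern China": Counter({"AT": 80, "AA": 20}),
    "Ireland": Counter({"AT": 10, "AA": 90}),
}
print(label_window("AT", panel_window_1))

# Pretend all 1001 windows have been labeled already.
labels = (["Ireland"] * 260 + ["France"] * 240
          + ["Mali"] * 251 + ["Southern China"] * 250)
print(estimate_percentages(labels))
```

The second part shows why the result is ballpark quarters rather than exactly 25% each: the tally depends on how many windows happen to land in each region.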
Ancestry DNA Ethnicity Range
Using statistics and probabilities means that there is an element of uncertainty with these estimates. Ancestry also tells us how confident they are about the round number percentage delivered in the main display. Click on the percentage to see the range.
So how confident is Ancestry about my being 50% Irish? Here’s the range:
So, that’s pretty darn confident. It’s probably 50, but it could be 49 or 51. Potaytos, Potahtos.
And what about my 29% “Eastern Bantu Peoples”? I mentioned that this region covers a rather wide geographic area and population.
And yet, it’s a rather wide range.
Ancestry Genetic Communities
Every tester gets a breakdown of global regions. But some testers don’t get any genetic communities. I have several communities under my Ireland region and none beneath my African regions.
So, what are genetic communities? If you’ve worked with your shared matches, you’ll understand quickly how these communities are put together. They are a group of highly interconnected testers who Ancestry has clustered along geographic and historic lines.
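To show what “highly interconnected” can mean, here is a bare-bones Python sketch that groups testers into clusters based on who matches whom. It uses plain connected components, which is the simplest stand-in I can think of. Ancestry's real clustering is surely more sophisticated, and the names below are invented.

```python
from collections import defaultdict, deque


def find_clusters(matches):
    """Group testers into connected clusters given (tester_a, tester_b) match pairs."""
    graph = defaultdict(set)
    for a, b in matches:
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    clusters = []
    for start in sorted(graph):
        if start in seen:
            continue
        # Breadth-first walk collects everyone reachable from this tester.
        queue = deque([start])
        seen.add(start)
        members = []
        while queue:
            node = queue.popleft()
            members.append(node)
            for neighbor in sorted(graph[node]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        clusters.append(sorted(members))
    return clusters


# Invented DNA matches between testers.
matches = [
    ("Aoife", "Brendan"),
    ("Brendan", "Ciara"),
    ("Aoife", "Ciara"),
    ("Dmitri", "Elena"),
]
print(find_clusters(matches))
```

In this toy data, the three testers who all match each other fall into one cluster and the other pair forms a second one. The geographic and historic labels would come from a separate step, such as the family trees attached to the testers.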
Getting communities depends on who is also DNA testing with Ancestry. You’re most likely to see communities in regions where genealogy and ethnicity research is a popular hobby.
It’s worth pointing out here that genetic communities are also displayed (if you have them) at the bottom of the Ethnicity tab on DNA Match profiles. Because the ethnicity map takes up a bit of space, it’s easy to forget to scroll down the page to see what’s available.
What happened to the Communities filter?
We used to be able to filter our DNA matches by genetic communities. But Ancestry giveth and sometimes they taketh away. Currently, we get to see three (!) “featured” DNA matches within a genetic community. This selection is kinda useless, in my opinion.
I haven’t found any mention by Ancestry as to why the filtering was dropped. Nor if this feature will ever return.
How Accurate is Ancestry DNA Ethnicity?
The ethnicity estimates are just that: estimates based on statistical comparisons to a reference database.
The accuracy depends on the composition and quality of the reference panel. Western regions have the highest number of DNA testers, so estimates for those regions tend to be more accurate than estimates for other regions.
Several factors serve to reduce accuracy. These include:
- Accuracy of the reference panel
- Fewer testers from less developed countries
- Challenge of DNA analysis
- Challenge of statistical analysis
We’ll address these factors in turn.
Accuracy of the Reference Panel
I mentioned that the reference panel is already a step down from using centuries-old DNA.
Aside from that, Ancestry is relying on the reported genealogy of testers to be accurate.
For their best-case scenarios, they look for DNA testers who have established that all four grandparents are from the same region. But genealogy research can be wrong, even when documented evidence is available. I’m talking about non-parental events (usually the father is misattributed).
The Challenge of Less Developed Countries
There are many countries where there isn’t a documented trail. For example, birth registration is a fairly recent phenomenon in Uganda. In these areas, verbal testimony is necessary to establish genealogy.
With verbal testimony, the best-case scenario is that testers can inform Ancestry as to where their grandparents were born. But consider the United States: a survey by Ancestry reported that about 21% of Americans don’t know where their grandparents were born. Now think about countries and regions that suffered civil wars within the last hundred years.
So, let’s say we’ve whittled down Ancestry testers to those who affirm that their grandparents were born within a region. The problem is that this leaves Ancestry with regions where the number of testers is too small to be statistically reliable. How does Ancestry deal with that? I’ll quote their white paper:
“Ideally, we’d use people with all of their grandparents from the same country, but due to low numbers for some countries we sometimes use parents”.
I’d like Ancestry to be clear on which countries use the lower genealogical standard. But I can’t find a list.
The Challenge of DNA Analysis
When Ancestry is comparing your many tiny windows of DNA to the reference panel, they must label which piece is from your mother and which is from your father. That is a complex process called phasing, and it has its own levels of error.
So, the ethnicity process piles on more statistical analysis to compensate for misassignment. Estimates upon estimates can only serve to reduce accuracy.
The Challenge of Statistical Analysis
I’ll simply mention here that Ancestry has chosen one mix of statistical modeling techniques over others. Have they got it right? And what are the other DNA companies choosing? That’s for another article.
My Experience with Ancestry Ethnicity Estimates
My maternal grandparents and great-grandparents were born in the same small county in Ireland.
In my experience, the accuracy of my maternal ethnicity estimate has improved with each updated version. The regions have become smaller and more precise, and the genetic communities correspond with my family tree.
Ireland can be divided into four quadrants, or provinces. My genetic communities don’t just nail the correct province – the other three are excluded. That is impressive!
I’m less impressed with the other half of my estimate, which covers vast swathes of the African continent. And the recent updates to this estimate do not inspire confidence.
In 2019, over a quarter of my estimate (28%) was “Southern & Eastern Africa Hunter-Gatherers”. In 2020, that region disappeared completely from my estimate. My Eastern Bantu region has doubled to 29%, and a new African region seems to have gobbled the remaining percentage of the defunct grouping.
The genealogy forums were awash in 2020 with people complaining about massive jumps in their Scottish percentages. Ancestry went to the trouble of putting out a dedicated article to explain how so many customers could suddenly claim a tartan (I jest). I’d like to see similar explanations for other regions.
When Will Ancestry Ethnicity Be Updated Again?
Ancestry rolled out a major update to ethnicity estimates through September 2020. They’ve done this periodically over the years. The reasons may be one or all of:
- More regions due to changes to the Reference Panel
- Changes to their statistical algorithms
- Changes to DNA extraction and analysis
The 2020 ethnicity update included nine additional global regions. This pushed the total number up from 61 to 70. I’ve mentioned one of the most talked-about changes: the split of the “Scotland & Ireland” region into two.
The previous update was round about April 2019. And the one before that was around September 2018. So, the schedule isn’t predictable. If you want to keep up with forthcoming releases, keep an eye on blogs during and after the big genealogy conferences. RootsTech is often chosen by the DNA and genealogy companies to tease upcoming announcements and events.
Why Hasn’t My Ancestry Ethnicity Updated?
Your Twitter feed is full of people posting their updated ethnicities. Your Facebook feed is all pictures of kilts, sporrans, and highland jigs. And you’re left looking at unchanged percentages. What’s going on?
Ethnicity updates are not released simultaneously for all Ancestry customers. You just need to be patient. It looks to me like North America gets processed first, as my changes were weeks after I saw mention of updates on social media forums.
You will receive an email when your results have been updated. Ancestry will also display a “coming soon” message on the DNA page when you log into your account. There’s no way to trigger these updates, so just sit and wait for good things to happen.
Does Ancestry Test for Native American Ethnicity?
This is a common question from new testers who are surprised that their ethnicity does not match verbal family history.
Ancestry’s regions are listed in their white paper. They include several labeled as “Indigenous Americas”.
Why does Ancestry DNA not show my Native American heritage?
But if you don’t see one of these regions, it doesn’t mean that you don’t have Native American ancestors. Due to the random nature of inheritance, you don’t inherit DNA from every one of your more distant ancestors. It’s possible that an ethnic region has been reduced to amounts that cannot be detected by a DNA test.
Of course, the same logic applies to any ethnic region you may feel you are missing.
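As a rough illustration of how quickly one ancestor's contribution shrinks, here is the halving rule in Python. On average, you carry half of a parent's DNA, a quarter of a grandparent's, and so on. These are averages only. The actual amount from any one ancestor varies, and the function name is just mine.

```python
def expected_share(generations_back):
    """Average percentage of your DNA expected from one ancestor
    who is the given number of generations back (1 = parent)."""
    return 100 / (2 ** generations_back)


for generation in range(1, 9):
    print(generation, round(expected_share(generation), 2))
```

By eight generations back, the average share from a single ancestor is well under 1%, which is the territory where a region can slip below what a test can pick up.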
Low Percentage Ethnicity Estimates
What about those one-percenters? Can you trust them? Here’s one of mine:
And my friend has an even lower percentage:
If you delve into the description, you’ll see that the confidence range starts from 0%. These percentages are low enough to be statistical noise. At this low level, the regions may disappear and reappear with successive Ancestry updates.
They may align with your family history, but I would tend to ignore them. Unless you are disappointed with your 99% region and enjoy the thought of a little bit more variety.
Why Does My Parent Have Ethnicities that I Don’t?
Your mother has tested, and her results show 2% Finland. But you have none.
You take a quick glance at the top of your DNA matches – yep, she’s definitely your mother. Does this prove ethnicity estimates to be hokum? No, not at these small percentages.
The random nature of inheritance comes into play. You don’t get a complete representation of your mother’s DNA in the half that you inherit. The small portion that Ancestry identified as Finland may not have been passed on to you. However, it may be present in the DNA of your sibling!
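Here is a small Python sketch of the odds. It assumes your mother's Finland DNA sits in a handful of separate segments and that each segment has an independent 1-in-2 chance of being passed on whole. That is a simplification (real inheritance can split a segment), and the segment counts are invented.

```python
import random


def chance_child_gets_none(n_segments):
    """Probability that none of n independent segments is passed on,
    when each segment has a 1-in-2 chance of being inherited."""
    return 0.5 ** n_segments


def simulate(n_segments, trials=20_000, seed=1):
    """Estimate the same probability by simulating many children."""
    rng = random.Random(seed)
    none_passed = 0
    for _ in range(trials):
        inherited = [rng.random() < 0.5 for _ in range(n_segments)]
        if not any(inherited):
            none_passed += 1
    return none_passed / trials


for n in (1, 2, 3):
    print(n, chance_child_gets_none(n), simulate(n))
```

Under these assumptions, a region carried in two segments misses a given child a quarter of the time, so a parent showing a small region that a child lacks is not surprising.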
Why Does My Sibling Have Ethnicities that I Don’t?
This is the same explanation as in the previous section. You and your siblings do not inherit the exact same portions of DNA from your parents. A small percentage, such as 2% Scottish, may appear in your brother’s ethnicity estimate and be nowhere to be seen in your own.
The 2020 ethnicity updates generated a lot of chat and comment in the genealogy forums. I was somewhat alarmed to see one chap remark that he and his brother now had different ethnicities…and there was going to have to be a serious discussion at the family dinner table. Thankfully, it turned out he was joking around – and understood the variability in genetic inheritance.
Transfer Your DNA To Get Alternate Ethnicity Estimates
Other major companies also provide ethnicity estimates. You don’t necessarily have to pay to see additional reports. You can download your raw DNA from Ancestry and upload it for free to some other sites.
This article is a full walkthrough of transferring your DNA to MyHeritage, which has its own ethnicity reports.
You can also upload your Ancestry DNA for free to GedMatch. Each site has its own version of ethnicity estimates, and you may see considerable variance when you compare several reports.
Be sure to understand the privacy options and conditions of any DNA site where you intend to upload your data. There are differences particularly relating to how the sites interact with law enforcement agencies.