Previously I discussed an analysis of the availability of trees across all my DNA matches. The headline figures from May 2018 were:
- 40% of my matches had a public linked tree
- 27% had no linked tree but had at least one public unlinked tree
- 7% had a private tree
- 26% had no tree
Adding linked and unlinked together, 67% of my matches have at least one public tree, which at first glance would seem to be a very positive outcome.
To analyse usefulness I drilled down further into the actual content of those trees. Specifically, my May analysis included a download of the Direct Ancestor Surname lists provided by Ancestry on the “Pedigree and Surname” page for each match where a tree is either displayed by default (the linked tree) or has been selected from a list of unlinked trees. I use my own utility, but the DNAGedcom Client will also download surname lists to a spreadsheet for analysis. I don’t know if it grabs unlinked trees; mine does to a limited degree (just one per match).
It was immediately clear that many matches had a tree, but had no available surname list. The reason was simple: the tree was public but all entries had been set to a status of living, so their details were hidden:
This was a significant 11% of my matches with a public linked or unlinked tree, or a rounded 8% of my total matches, as reflected in the revised usefulness proportion:
Single Surname Trees
Aside from the I-see-no-dead-people trees, the other happiness-killers are the loner trees with but a single visible entry:
I also include in this category those trees with multiple generations of a single surname and no visible spouses. These single surname trees made up a very similar (slightly lower) percentage compared to the “Everybody Lives” category. The usefulness proportion is again reduced:
Variety Is The Spice of Life
My grandmother was a Smith, and so it seems were all the neighbours. But research-wise I could have had it worse, like this guy:
Aside from Smith, the two other names I’ve blanked for privacy are common Irish surnames. If you’re thinking Murphy and Reilly, you’re half right. With my current research I find that the most useful trees are those with a good variety of ancestral surnames, certainly more than the three in this example. A high variety will indicate a higher number of ancestral generations represented in the tree.
I titled this post as “The Usefulness of Trees”, but usefulness is in the eye of the beholder. If I were searching for living relatives (e.g. as an adoptee) and this was a close match, a tree like this might be gold, provided the match is prepared to reply to messages.
Measured By Distinct Surnames
For now, I will define the potential usefulness of a tree by the number of distinct surnames in the ancestral list. That won’t be the case for everyone, and it may change for me if the focus of my research changes. That being said, I now want to get a measure of how many of my matches have potentially useful trees.
The match with the highest count has 272 distinct direct-ancestor surnames. That tree goes back to the early 1600s and has over 12K people. The rest of my matches are spread across a wide range of counts.
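As a rough illustration of the measure, here is a minimal Python sketch that counts distinct surnames per match. It assumes the downloaded surname lists have been flattened into (match, surname) pairs; the function name, the match identifiers and the sample rows are all made up for the example, and this is not the utility I actually use.

```python
from collections import defaultdict

def distinct_surname_counts(rows):
    """Count distinct surnames per match.

    rows: iterable of (match_id, surname) pairs (an assumed layout,
    not Ancestry's or DNAGedcom's actual export format).
    """
    surnames = defaultdict(set)
    for match_id, surname in rows:
        surnames[match_id]  # register the match even if the surname is blank
        name = surname.strip().casefold()  # treat "Smith" and "smith " as one name
        if name:
            surnames[match_id].add(name)
    return {match_id: len(names) for match_id, names in surnames.items()}

# Hypothetical sample data
rows = [
    ("match_a", "Smith"), ("match_a", "smith "), ("match_a", "Murphy"),
    ("match_b", "Smith"),
    ("match_c", ""),  # public tree, but nothing visible
]
counts = distinct_surname_counts(rows)
print(counts)  # {'match_a': 2, 'match_b': 1, 'match_c': 0}
```

Normalising case and whitespace is a design choice: it avoids counting the same name twice, but it will not merge spelling variants such as Reilly and O'Reilly.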
For illustrative purposes, the next figure shows where my matches fall into bands of “number of distinct ancestral surnames”. I’ve already noted the lowest two bands: zero and the single surname crowd.
Above zero or one surnames is the band of low variety: I’ve fairly arbitrarily put this as 2, 3 or 4 distinct surnames. This is where I also think little immediate value is to be had. I’d probably need to build out a research tree, which is why I say there is little *immediate* value to me. This band is about 30% of my public linked/unlinked trees. That eats into my usefulness like so:
That leaves me with about 32% of matches with “useful” trees for my current purposes. For me personally, that’s currently at least 2,870 trees.
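The banding described above can be sketched in a few lines of Python. This is only an illustration: the band labels and the sample counts are invented, and the "useful" band is simply taken to be everything above the low-variety band (5 or more distinct surnames).

```python
from collections import Counter

BANDS = ["zero", "single", "low", "useful"]

def band(n):
    """Map a distinct-surname count to one of the bands used in this post."""
    if n == 0:
        return "zero"    # public tree, but no visible surnames
    if n == 1:
        return "single"  # single surname trees
    if n <= 4:
        return "low"     # 2, 3 or 4 distinct surnames
    return "useful"      # assumed: anything above the low-variety band

def band_shares(counts):
    """Return the percentage of trees falling in each band."""
    tally = Counter(band(n) for n in counts)
    total = len(counts)
    return {label: round(100 * tally[label] / total, 1) for label in BANDS}

# Hypothetical distinct-surname counts for eight trees
sample = [0, 1, 3, 4, 12, 272, 2, 7]
shares = band_shares(sample)
print(shares)  # {'zero': 12.5, 'single': 12.5, 'low': 37.5, 'useful': 37.5}
```

Note that the shares here are relative to whatever list of trees is passed in; the percentages in this post mix two bases (public trees and total matches), so they would need converting before comparing.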