In May 2018 I downloaded the direct line ancestral surnames of all my DNA matches at Ancestry. I discussed some statistical analysis on the numbers in a previous post.
I then conducted some exploratory data mining of the distribution of surnames across my matches. My paternal line is African (not African-American), and I have very few paternal matches on Ancestry. My maternal line is Irish, and I expected the usual suspects of popular Irish surnames to appear in my own top 10 list. I’m talking Murphy, Ryan, Kelly and the like. I could imagine a pattern where my emigrant ancestors landed in the traditional “Irish” enclaves of New York or Boston, and married exclusively within the cohort of people they met from the old country. If their American descendants followed suit, then the distribution of surnames across my matches should skew Irish.
Data mining is about asking questions of the data, so here is the Q and A.
Question: What are the Top 10 Surnames across my Matches?
Before I crunched the numbers, my educated guess for Number 1 was Smith, which can be of Anglo-Saxon origin or of Irish origin. My maternal great-grandfather was a Smith. At least one of his sisters married a Smith. And it seems that I see Smiths everywhere I look among my match trees. Well, I wasn’t wrong on Number One, but the next 9 surprised me. Here is the distribution of the top 10 surnames across my Ancestry matches.
I do not recognize any of the next nine surnames from my known direct lineage.
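For anyone who wants to run the same count on their own matches, here is a minimal sketch of the tally. It assumes the downloaded data can be arranged as a mapping from a match ID to the list of direct-line surnames in that match's tree; the function name `top_surnames`, the match IDs and the sample data are all made up for illustration and are not my actual export.

```python
from collections import Counter

def top_surnames(matches, n=10):
    """Return the n surnames that appear in the most match trees."""
    counts = Counter()
    for surnames in matches.values():
        # Upper-case and trim each name, and count it once per match,
        # so a surname repeated within one tree is not over-counted.
        cleaned = {s.strip().upper() for s in surnames if s.strip()}
        counts.update(cleaned)
    return counts.most_common(n)

# Hypothetical sample: match ID -> direct-line surnames in that match's tree.
sample_matches = {
    "match_a": ["Smith", "Murphy", "smith "],
    "match_b": ["Smith", "Kelly"],
    "match_c": ["Smith", "Murphy"],
    "match_d": ["Ryan"],
}

for surname, count in top_surnames(sample_matches, n=3):
    print(f"{surname}: {count}")
```

Counting each surname once per match is a choice made here, on the assumption that the question is "how many matches carry this name" rather than "how many times does the name occur".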
Question: Is my top 10 Surname Distribution typical of Ireland (north and south)?
So I compared my distribution to a paper published by Sean J Murphy titled “A Survey of Irish Surnames 1992-97”. Murphy presents the top surnames on the entire island of Ireland (i.e. including Northern Ireland) based on data he gathered from 1992-1997. It’s a great read for anyone interested in Irish lineage. Because my Irish ancestors are predominantly from Ulster, I’m working with Murphy’s numbers instead of recent data from the Irish Central Statistics Office, as theirs does not include Northern Ireland. Murphy provides raw numbers in tabular format, but I’ve plotted the distribution to allow a broad side-by-side comparison to my own:
Clearly the answer is no, the distribution across my matches doesn’t look similar to the distribution of surnames on the island of Ireland.
Question: Is my top 10 Surname Distribution typical of British surnames in Ireland?
I wouldn’t have thought of this without reading Murphy’s paper which has a section on “British Surnames in Ireland”. I have not established a genealogical link of my ancestors to British origin. But my distribution sure does look more similar to this breakdown than to the Irish one.
Before I draw any inferences, I’ll ask another question.
Question: Is my top 10 Surname Distribution typical of the United States?
I then compared my distribution to the USA census figures. I chose the 2000 census because it covers a similar time frame to the Irish numbers. It might be better to use earlier census figures: because living people are private, the names available in trees are likely to represent an earlier time frame. But it’s all very approximate, and I’m only interested in the broad distribution, which is certainly closer to mine than the Irish distribution.
The names “Garcia” and “Rodriguez” jump out at me because I’m not familiar with them. Checking my data, I have a sum total of 17 matches with a direct ancestry to Garcia or Rodriguez. My understanding of American demographics is that the last few decades will have pushed Hispanic surnames upward in frequency. So I narrowed the census numbers to filter on respondents who identified as white European lineage.
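A check like the Garcia and Rodriguez count can be sketched as follows. This again assumes a mapping from match ID to direct-line surnames; the helper `matches_with_any` and the sample data are hypothetical.

```python
def matches_with_any(matches, targets):
    """Return the IDs of matches whose tree lists any of the target surnames."""
    wanted = {t.strip().upper() for t in targets}
    return [
        match_id
        for match_id, surnames in matches.items()
        # Set intersection is non-empty when at least one target is present.
        if wanted & {s.strip().upper() for s in surnames}
    ]

# Hypothetical sample data, not real matches.
sample_matches = {
    "match_a": ["Smith", "Garcia"],
    "match_b": ["Rodriguez", "Garcia"],
    "match_c": ["Smith", "Murphy"],
}

hits = matches_with_any(sample_matches, ["Garcia", "Rodriguez"])
print(len(hits), "matches:", hits)
```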
So after all that, I can see that the distribution of surnames across my Ancestry matches is closest to the USA white population.
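The comparisons above were made by eye from the plots. One rough way to put a number on "closest" is to count how many names two top 10 lists share. The sketch below is an assumption about how that could be done, not the method used for this post, and the lists are short placeholders rather than real rankings.

```python
def top_n_overlap(mine, reference):
    """Return the shared surnames and the fraction of the reference list they cover."""
    mine_set = {s.strip().upper() for s in mine}
    ref_set = {s.strip().upper() for s in reference}
    shared = mine_set & ref_set
    # Guard against an empty reference list.
    share = len(shared) / len(ref_set) if ref_set else 0.0
    return sorted(shared), share

# Placeholder lists for illustration only.
my_top = ["Smith", "Surname B", "Surname C"]
reference_a = ["Murphy", "Kelly", "Ryan"]
reference_b = ["Smith", "Garcia", "Rodriguez"]

for label, ref in [("reference_a", reference_a), ("reference_b", reference_b)]:
    shared, share = top_n_overlap(my_top, ref)
    print(f"{label}: {len(shared)} shared ({share:.0%}) {shared}")
```

This only measures which names overlap and ignores how the counts are spread across them, so it is a coarse complement to the plots rather than a replacement.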
Of course the vast majority of Ancestry customers are American. However, if there were a very strong tendency among my own emigrant ancestral relatives to keep themselves to themselves and to marry only within the typical Irish communities, then my top 10 distribution would surely be closer to the top 10 Irish surnames within the USA census. (I’m using the Ancestry blog as my source for these next numbers. They are using the 2000 census, but I’m not sure how they arrived at this particular filter.)
So I don’t have any of the Irish American top 10 names in my own top 10. To be fair, I only have to go to #13 in my own distribution to hit “Murphy”, and I have “Kelly” at #18. But those two names are the only “Irish-American” names in my top 30. (Smith is the awkward outlier. It’s my personal #1 and I’m sure it’s being excluded, but it can also be of Irish origin.)
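Finding where particular names sit in a ranked list, as with Murphy and Kelly here, is a simple lookup. A minimal sketch, assuming the surnames are already ordered from most to least common; `ranks_of` and the sample ranking are hypothetical.

```python
def ranks_of(ranked_surnames, targets):
    """Map each target surname to its 1-based rank, or None if it is absent."""
    position = {
        name.strip().upper(): rank
        for rank, name in enumerate(ranked_surnames, start=1)
    }
    return {t: position.get(t.strip().upper()) for t in targets}

# Placeholder ranking for illustration only.
ranked = ["Smith", "Surname B", "Murphy", "Surname D", "Kelly"]

print(ranks_of(ranked, ["Murphy", "Kelly", "Ryan"]))
```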
Before I did this analysis I assumed that I’d have a higher proportion of “Irish” names across my matches. Instead, my Irish emigrant ancestors appear not to have married only other people of Irish descent; they tended to marry within the wider local population of European heritage.