Common Ancestors Can Be Very Far Back

So far I am finding that the common ancestors Dad shares with his DNA matches at both 23andMe and FamilyTreeDNA are much further back than predicted. We have found the most recent common ancestor (MRCA) only for those distant cousins with good paper trails, and perhaps even a tree at GENI like we have.

Most of these matches are only one or two segments, and the longer the segment, the more likely it is to be a real match with a discoverable common ancestor. I actually found a new 5th cousin of mine through DNA, Dad’s 4th cousin once removed. She has a one-segment match of 17.14 cM and 2849 SNPs in common with Dad, and our common ancestors are in the 1700s at farm Fatland in Etne, Hordaland, Norway (online resources for Etne research are listed at familysearch.org).

23andMe shows you all your matches of 7 cM and larger, but many genetic genealogists think anything less than 10 cM is suspect. My view is that if Dad’s match is also a match with either me or my brother (n.b. the match in the next generation is frequently for fewer SNPs and cM), then it is real, even at 6 cM. As you can see in the chart, we have found many common ancestors with matches smaller than 10 cM. GEDmatch lets you look at even smaller segment matches with specific people, as does FamilyTreeDNA.

Here is a summary of the most recent common ancestors in Norway that I found for Dad with some of his DNA matches:

| cM | SNPs | MRCA | Relationship |
|---|---|---|---|
| 8.4 | 1467 | Ingeborg Djupesland (Bårdsdatter) b 1650 | 6th cousin * |
| 5.7 | 1308 | Ola Narvesen Glaim 1621-1714 | 7th cousin |
| 9.9 | 2129 | Gunnar Olafsen Gangså 1570-1639 | 10th cousin twice removed |
| 6 | 1197 | Gunnar Olafsen Gangså 1570-1639 | 10th cousin twice removed |
| 6 | 1202 | Ingeborg Djupesland (Bårdsdatter) b 1650 | 7th cousin once removed |
| 6.4 | 1246 | Ingeborg Djupesland (Bårdsdatter) b 1650 | 7th cousin |
| 6.8 | 951 | Knut Pedersen Åmot 1786-1851 | 4th cousin twice removed |
| 5.4 | 954 | Amund Jonson Seim (Holter) 1414-1480 | 13th cousin twice removed |
| 10 | 1722 | Ola Narvesen Glaim 1621-1714 | 7th cousin |
| 10.29 | 2096 | Ola Narvesen Glaim 1621-1714 | 7th cousin |
| 9.1 | 1442 | Ola Narvesen Glaim 1621-1714 | 6th cousin * |
| 17.14 | 2849 | Bjorn Ve 1725-1792 | 4th cousin once removed |
| 9.1 | 1543 | Amund Jonson Seim (Holter) 1414-1480 | 14th cousin once removed |
| 8.99 | 1900 | Nils Anderson Eig Øvrebø 1619-1683 | 7th cousin |

* 6th cousin is the same person and has two significant match segments
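
For anyone who wants to check how many of these matches fall under the usual cutoffs, here is a minimal Python sketch. The segment sizes are copied from the chart; the variable names are just for illustration, and the 10 cM and 7 cM cutoffs are the ones mentioned earlier in this post.

```python
# Segment sizes in cM, copied row by row from the chart of Dad's matches.
matches_cm = [8.4, 5.7, 9.9, 6, 6, 6.4, 6.8, 5.4, 10, 10.29, 9.1, 17.14, 9.1, 8.99]

# Matches many genetic genealogists would call suspect (under 10 cM).
below_10 = [cm for cm in matches_cm if cm < 10]

# Matches under the 7 cM size that 23andMe shows by default.
below_7 = [cm for cm in matches_cm if cm < 7]

print(f"{len(matches_cm)} segments with a known MRCA")   # 14
print(f"{len(below_10)} of them are under 10 cM")        # 11
print(f"{len(below_7)} of them are under 7 cM")          # 6
```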

 

Yes, there might be a closer connection for our 13th and 14th cousins, but we have not found it yet!

3 thoughts on “Common Ancestors Can Be Very Far Back”

  1. The 7th cousin with a two-segment match turns out to be doubly related to us. Once his parents’ tests came in, we discovered that they BOTH matched my Dad. So while we know we are 7th cousins on his Dad’s line, there may well be a closer relationship on his Mom’s line (the 10 cM match). It has not been found yet, but both families worked the Konnerud mines near Drammen, Norway, in the 1700s.

  2. Noticing that ‘predicted’ matches often turn out to be more distant than the genetic testing company implies, I asked myself why, and I realized they are engaging in a certain amount of trickery.

    To take a simplified example: suppose you have a match of a certain strength, say one typical of a 4th cousin, a segment of a certain length with a certain number of cM. The scientists can determine, let’s say, that a match of that strength would be produced 50% of the time by a 4th cousin, 25% of the time by a 3rd cousin and 25% of the time by a 5th cousin. So they can say with a relatively clear conscience that the predicted relationship is between 3rd and 5th cousin, with 4th cousin the most likely.

    But this is based upon a hidden assumption, which is that the pool being drawn from has equally many cousins at each distance: as many 2nd cousins as 3rd, 4th, 5th, etc.

    But the number of cousins we have at each level of distance is definitely not equal. For this I applied a rule of 4. Given the size of American families for the past couple of hundred years, one could predict that on average each of us has about 4 aunts and uncles, 4 times that many 1st cousins, or 16, 4 times that many 2nd cousins, or 64, etc. Actually, given the size of the American family until recently, this is probably conservative. So in the general population, from which we are drawing our matches, we have 4 times as many 4th cousins as 3rd cousins and 16 times as many 5th cousins as 3rd cousins.

    So, applying this to the simplified example above, instead of predicting that a given match at the 4th cousin level of strength is 25% likely to be a 3rd cousin, 50% a 4th cousin and 25% a 5th cousin, the actual chances for us, taking into consideration how many of each level of cousin we have, would be: 4% 3rd cousin, 32% 4th cousin and 64% 5th cousin.

    Now, in the real world, even for matches whose strength is predicted to be 3rd to 5th cousin, a certain small percentage of cousins beyond 5th would actually create the same pattern, and because the number of cousins at each level of distance is growing exponentially, we can get considerably more distant cousins even though the company has “predicted” a 4th cousin match. For instance, if in the above example the scientific prediction was that 24% of 3rd cousins would create that pattern, 48% of 4th cousins, 24% of 5th cousins and only 4% of 6th cousins, the actual probability for us of getting a 6th cousin would be about 30%, or nearly 1/3.
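
    The reweighting in the two examples above can be sketched in Python. This is only an illustration: it treats each percentage as the relative chance that any one cousin of that degree produces the match, and it assumes the rule of 4 holds exactly. The function name `reweight_by_cousin_counts` and the `multiplier` parameter are made up for this sketch.

```python
def reweight_by_cousin_counts(match_rates, multiplier=4):
    """Weight each cousin degree by how many such cousins we have.

    match_rates maps cousin degree (3 = 3rd cousin) to the share quoted
    in the simplified example. Under the assumed rule of 4, there are
    multiplier ** (degree + 1) cousins of a given degree.
    """
    weighted = {
        degree: rate * multiplier ** (degree + 1)
        for degree, rate in match_rates.items()
    }
    total = sum(weighted.values())
    return {degree: w / total for degree, w in weighted.items()}


# First example: 25% / 50% / 25% for 3rd, 4th and 5th cousins.
three_way = reweight_by_cousin_counts({3: 0.25, 4: 0.50, 5: 0.25})
for degree, share in three_way.items():
    print(f"cousin degree {degree}: {share:.0%}")   # 4%, 32%, 64%

# Second example: 24% / 48% / 24% / 4% for 3rd through 6th cousins.
four_way = reweight_by_cousin_counts({3: 0.24, 4: 0.48, 5: 0.24, 6: 0.04})
print(f"6th cousin share: {four_way[6]:.1%}")
```

    Only the ratio between levels matters here, so it makes no difference whether the count starts from 4 aunts and uncles or from some other base number. The second call gives roughly 30% for the 6th cousin.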

    Again, to give an example, I ran the rule of 4 out to 9th cousin and realized I might have as many as a million or more 9th cousins. So let’s say that only one out of every 1,000 9th cousins created a pattern typical of a 4th cousin, let’s say one long segment with so many cM. Then in the general population there would be at least 1,000 9th cousins who could create this level of match, but I would also have only about 1,000 4th cousins on average. So if the share of 4th cousins that created that strength of match was less than 100%, a near certainty, then a given match of that strength would more likely be a 9th cousin than a 4th cousin.
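
    The cousin counts behind this example are easy to check. Below is a minimal Python sketch, again assuming the rule of 4 holds exactly; the helper names `cousins_at_degree` and `expected_lookalikes` are hypothetical, and the per-cousin rate is left as a parameter because the real figure is not known.

```python
def cousins_at_degree(degree, multiplier=4):
    """Cousins of a given degree under the assumed rule of 4:
    4 aunts and uncles, 16 first cousins, 64 second cousins, and so on."""
    return multiplier ** (degree + 1)


def expected_lookalikes(degree, rate, multiplier=4):
    """Expected number of cousins of this degree whose match looks
    like a 4th cousin match, given a per-cousin rate."""
    return cousins_at_degree(degree, multiplier) * rate


print(cousins_at_degree(4))   # 1024 fourth cousins
print(cousins_at_degree(9))   # 1048576 ninth cousins

# Try two hypothetical per-cousin rates for 9th cousins.
for one_in in (1_000, 10_000):
    n = expected_lookalikes(9, 1 / one_in)
    print(f"1 in {one_in:,}: about {n:.0f} ninth cousins look like 4th cousins")
```

    At a rate of 1 in 1,000, the look-alike 9th cousins (about 1,049) already outnumber all 1,024 4th cousins. At 1 in 10,000 there are only about 105, so the 9th cousins win only if fewer than roughly 10% of 4th cousins produce that pattern.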

    Here I criticize the genetic testing companies for some dishonesty. Genetic scientists are not mathematical dummies, but their marketing departments have allowed this double meaning of the word “predicted” to pass through. The hidden assumption behind a predicted match is that the pool drawn from has equal chances of being any given distance of cousin, whereas for those of us who are actually getting matches from the general public, the number of cousins at each level is not equal at all. One is reminded of Mark Twain’s remark about lies, damned lies and statistics.

    Now, the actual multiplier per generation probably is not exactly 4, but it would be interesting to find out what it is, and also to find out how fast the strength of match falls off per generation, which the scientists may very well have the data for. If these two factors were determined, one could come up with a realistic estimate, for a given strength of match, of the likelihood of each cousin distance.
