Many people have the illusion that if their testing company says a person is a 3rd to 5th cousin they really will be. That is not the case.
The testing companies are just making the best guess they can from the data they have. They do not seem to take segment sizes into account; rather, they primarily use total shared DNA measured in centimorgans (cMs) for their relatedness estimates, usually the sum of all matching segments of 5 cM or larger. Close relatives will always share larger chunks with each other, so size does matter here.
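To make that total concrete, here is a minimal sketch of summing matching segments at or above a cutoff, alongside the largest segment, which the totals alone do not show. The function name and the segment lengths are hypothetical, made up purely for illustration; the 5 cM cutoff is the one mentioned above, and each company's real algorithm will differ.

```python
def total_shared_cm(segments_cm, min_cm=5.0):
    """Sum the matching segments that are at least min_cm centimorgans long."""
    return sum(s for s in segments_cm if s >= min_cm)

# Hypothetical segment lengths (in cM) for a single match; not real data.
example_segments = [42.3, 18.0, 7.5, 4.2, 3.1]

total = total_shared_cm(example_segments)   # ignores the 4.2 and 3.1 cM segments
largest = max(example_segments)             # the size of the biggest chunk

print(f"total shared: {total:.1f} cM, largest segment: {largest:.1f} cM")
```

Two matches can have the same total but very different largest segments, which is why looking at both is useful.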
Recently I have received numerous questions from people trying to figure out if a new match is a half sibling or a niece or a grandchild. These are hard to tell apart without testing more relatives as they all share about 25% of their DNA with each other. So I decided to collect some detailed statistics on those specific relationships with a Google form (click here) that includes total segments and segment sizes for a future blog post. [UPDATE as of Sept 2017: First round results are written up at https://blog.kittycooper.com/2017/09/the-25-relationship-a-first-look-at-the-data/ ]
The companies predict reasonably well for close family but it is just not possible to be accurate beyond that due to the randomness of DNA inheritance.
For example, here is a picture from the new 23andme of some of the DNA I share with Dick, a 2nd cousin on Dad’s paternal side (shown in blue), and John, a 2nd cousin on Dad’s maternal side (shown in red).
I share a third again as much DNA with John as I do with Dick, even excluding the 14 cM on the X. The expected amount for a 2nd cousin is 3.125% which is 212.50 cM, right in the middle between these two.
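The arithmetic behind that expected amount can be sketched in a few lines. This is only an illustration: the 6800 cM total is an assumption implied by the figures above (3.125% working out to 212.50 cM), the function names are my own, and the halving formula is the usual average for full cousins, not a prediction for any one pair. The two observed values are my brother's totals with John and Dick.

```python
TOTAL_CM = 6800.0  # assumption: the total implied by 3.125% = 212.50 cM

def expected_shared_fraction(cousin_degree):
    """Average fraction shared by full Nth cousins: 1/8, 1/32, 1/128, ..."""
    return 0.5 ** (2 * cousin_degree + 1)

def expected_shared_cm(cousin_degree, total_cm=TOTAL_CM):
    """Average shared cM for full Nth cousins under the assumed total."""
    return expected_shared_fraction(cousin_degree) * total_cm

expected = expected_shared_cm(2)  # 2nd cousins

# My brother's totals with each 2nd cousin, in cM.
observed = {"John": 282.0, "Dick": 185.0}
for name, cm in observed.items():
    ratio = cm / expected
    print(f"{name}: {cm:.0f} cM, {ratio:.2f} times the expected {expected:.2f} cM")
```

Both cousins have exactly the same relationship on paper, yet one lands well above the average and the other well below it.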
Checking my brother, I see the same effect – he has 282 cM with John versus 185 with Dick. Not surprisingly, when I look at Dad I find that he shares almost twice as much with John as with Dick. Clearly he just inherited more of the same DNA as John’s mother from their common grandparents. Conversely, he inherited less DNA shared with Dick’s mother from his other grandparents.
On the left is a comparison of my first cousin Henry with both Dick and John. The amount he shares with each 2nd cousin is practically identical, as long as you subtract the 40 cM that he shares with John on the X from the total shown by 23andme. Amazing how variable DNA inheritance can be among 2nd cousins.
Click here for the ISOGG wiki article on Autosomal DNA statistics which usually includes the current chart from Blaine Bettinger’s shared centimorgan DNA project.
Personally I have his chart (shown below, click it for a larger version) bookmarked for easy reference. I rely on it heavily.
Warning: Ancestry.com DNA testing will show a smaller number of matching cMs and a larger number of segments because their algorithm removes population-specific segments.
The DNA adoption site has a relationship calculator that can help figure out closer relationships. It is discussed in the article at Roberta’s blog called Demystifying Ancestry’s Relationship Predictions Inspires New Relationship Estimator Tool.
Autosomal DNA matching is not cut and dried due to the randomness of DNA inheritance and is even more confusing if you are from an endogamous population because your parents will likely share some DNA due to ancestral cousin marriages. Thus a match could be related on both sides! There is a function on the GEDmatch site that lets you check if the parents of a specific kit are related because they have passed along matching DNA segments.
I have sometimes found that someone predicted to be a 3rd/4th cousin based on total cMs is much more distant. This has happened when there are two good sized matching segments but each segment is from a different ancestral couple. Thus the relationship is much further back, for example, a double 6th cousin.
Another issue is the fact that the testing companies cannot tell which of the two paired chromosomes a match is on. So when you have a match that neither parent has, it is a false match created from small bits from each parent by the computer program (see my IBC article). This is why I prefer to look at matches that are “phased,” that is to say, a child and a parent have the same match.
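If both parents have tested, that check can be sketched as a simple comparison of match lists. This is a simplified illustration, not how any company does it: the function and the match names are hypothetical, and it compares whole matches by name instead of comparing the actual segments.

```python
def phased_matches(child_matches, mother_matches, father_matches):
    """Split a child's matches into those a parent also has and those neither has."""
    parents = set(mother_matches) | set(father_matches)
    phased = sorted(m for m in child_matches if m in parents)
    unphased = sorted(m for m in child_matches if m not in parents)
    return phased, unphased

# Hypothetical match names, for illustration only.
child = {"match_A", "match_B", "match_C"}
mother = {"match_A", "match_X"}
father = {"match_C"}

phased, unphased = phased_matches(child, mother, father)
print("phased:", phased)
print("possibly false:", unphased)
```

A match that lands in the second list is worth treating with suspicion, especially if it is a small one.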
If you only match someone on a single good sized segment (greater than 10 cM for most, more than 20 cM for the endogamous) your DNA relative can be anywhere from a 4th to a 14th cousin. See http://ongenetics.blogspot.com/2011/02/genetic-genealogy-and-single-segment.html?m=1 for a further discussion of that.
UPDATE 10/17/2017: There is now an easy to use online calculator based on Blaine Bettinger’s latest chart at https://dnapainter.com/tools/sharedcm that will show you all the possibilities for the shared cMs. Then you can use the ages of the testers plus test more relatives to try to figure it out.
Last but not least, here is my data collection form, which you can fill out right from this blog post (use the slider on the right to scroll down it to answer all questions).