Half sibling or Nibling? A first look at the 25% relationship data

A question I often get is “Can you tell if this DNA match is my uncle or my half-brother?”  Why this question? Because it is very difficult to tell the difference between a half sibling and a nibling (an aunt/uncle/niece/nephew) relationship from the amount of matching DNA. Like grandparents, they all share 25% with you, or about 1750 centimorgans (cMs) give or take several hundred. Unlike grandparents, the age difference can’t usually be used to tell them apart. The testing companies might call him “close family”, “first cousin,” “uncle,” or the more descriptive “1st Cousin, Half Siblings, Grandparent/ Grandchild, Aunt/ Niece” from Family Tree DNA, which by the way is my Dad’s actual great-niece’s designation. Then they show you the amount of shared DNA in centimorgans and maybe a percentage, but really they are just making an educated guess about the relationship.

I wanted to find a way to help adoptees figure out more accurately which relationship a new 25% match was likely to be, so I collected detailed statistics using a Google form for about a year, getting some 2400 responses. These were self-reported by people who read my blog or are members of groups on Facebook where I publicized this. I am still collecting, so feel free to add yours to my form (click here) to get included in the next report. GEDmatch numbers preferred.

My experience from helping people understand their DNA results had led me to suspect that segment sizes were the key to telling these relationships apart. I had noticed that the sizes of the four largest segments would usually be much larger for half siblings than for niblings. However, now that I have these crowd-sourced numbers, I can see that much of my personal experience came from helping with paternal side cases. There the segments are consistently much larger.

Can you tell a nibling from a half sib by the shared number of segments and centimorgans?

The collected wisdom of the many adoption search angels is that the number of segments can indicate the difference. While this usually works for nibling versus grandparent, half siblings too often seem to fall in the range of one or the other.

The DNA adoption folk have a chart which shows the number of segments expected for each relationship (click here for that PDF) which is very useful, just not enough for determining half siblings. They carefully separate the AncestryDNA results, which can have more segments and fewer total centimorgans due to the removal of some matching data deemed less significant. DNA adoption also has an automated relationship estimator based on that data.

Leah Larkin recently wrote a fascinating post – Escape from the Overlap Zone – which showed simulations for these relationships which indicate that grandparents can easily be told from niblings since they have far fewer total segments. However again, the simulations show that half sibs and niblings have considerable overlap.

So how does the collected data compare?

Here is a scatter diagram graphing total centimorgans (X axis) versus number of segments (Y axis) for just grandparent and nibling relationships. This used only the GEDmatch data for consistency. The niblings are the lavender color and the dark results show where they overlap with the beige-pink colored grandparents. This is not far from the predictions, although there is more overlap than expected. But look what happens when I add the half-sibling data below right.

The graph is hard to read now since those results overlap both categories. The colors are semi-transparent so the darker areas are created by having multiple colors in that area. There seems to be considerably more overlap than the simulations predicted.

It looks like using just shared centimorgans and number of segments will not produce a clear answer as to whether a 25% relationship is a half sibling.

Notice the funny shape of the half sibling blues. There is a bunch on top with the niblings and another group with the grandparents. Fortunately someone had suggested that I include a question asking which side a relationship is on, paternal or maternal.

So I decided to look at the scatter graph for maternal (reddish) versus paternal (blue) half siblings. The difference was quite startling. Maternal half siblings looked like niblings while paternal half siblings looked like grandparents in the comparisons of centimorgans versus the number of segments. However there are only about 100 data points in each set, so I will look again in a few months and see if this difference holds up. When I looked at the maternal versus paternal for the other relationships the paternal were only very slightly larger.

What else can we use to tell niblings from half siblings?

What I have seen in the many results that I have looked at over the years is that if all other indicators are ambiguous, the sizes of the four largest segments can indicate which relationship it is: half siblings will usually share between two and four segments over 100 cM. While this works almost all the time for paternal half siblings, it is a closer call for maternal half siblings versus niblings. Yes, most of my cases have been paternal.
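
The four-largest-segments rule of thumb can be written down as a small check. This Python sketch is only an illustration: the function names, the default parameters, and the sample segment lists are hypothetical, and the test it applies (at least two of the four largest segments over 100 cM) is the rough heuristic described here, not a validated classifier. As noted, it is least reliable for maternal half siblings.

```python
def count_large_segments(segments_cm, threshold_cm=100.0, top_n=4):
    """Count how many of the top_n largest segments exceed threshold_cm."""
    largest = sorted(segments_cm, reverse=True)[:top_n]
    return sum(1 for cm in largest if cm > threshold_cm)


def looks_like_half_sibling(segments_cm):
    """Rough heuristic from the text: two to four of the four largest
    segments are over 100 cM. A closer call for maternal half siblings."""
    return count_large_segments(segments_cm) >= 2


# Made-up segment sizes in cM, for illustration only (not real data).
example_a = [142.0, 118.5, 104.2, 96.0, 71.3, 55.0]
example_b = [88.0, 76.5, 70.1, 64.0, 52.3, 40.0]

print(looks_like_half_sibling(example_a))  # three segments over 100 cM
print(looks_like_half_sibling(example_b))  # no segments over 100 cM
```

A result from a check like this should only ever be one indicator among several, alongside the total centimorgans, the number of segments, and which side the match is on.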

Here is a visual. I used the median rather than the average, although those numbers are practically identical, and I used the entire data set, not just the GEDmatch numbers, since the largest segment sizes will not vary between companies. The third and fourth largest segments were the most consistent when I looked at the range of results (not yet published).

In my experience, another indicator of a half sibling is having one or two fully shared chromosomes whereas a nibling relationship rarely has any. This seemed to fit the small amount of valid data I got. However the question on my form must have been confusing since I got answers like 21, 22, and 23. I have now changed it to a checkbox – 0, 1, 2, or more. Maybe that will work better for the next analysis of this data a few months from now.

The other problem I saw was that I had six outlier answers for niblings on the very low side, about half as many centimorgans as expected. Blaine Bettinger explained that he saw the same thing and the likely answer is that they are actually half relationships, unexpected and previously unknown, so I did not include that data. By the way, he has just published a new version of the shared centimorgan project, so click here for his latest results.

Any other indicators? I am collecting X data to find out

Is there anything else? Yes, the X chromosome can help. Half sisters sharing a father will share an entire X chromosome (click here for that blog post), which is exceedingly rare in a nibling relationship. There are cases of maternal half sisters sharing a full X, not many though, and this is more likely in endogamous populations.

Please help me collect X statistics for close family members to see if that can help in these cases as well. Click here to add yours to my form.

21 thoughts on “Half sibling or Nibling? A first look at the 25% relationship data”

    • Anna –
      My results are from hundreds of data points. There are always outliers. If yours differ are they just your family or from many cases? I sent you an email.

  1. Wouldn’t the results of the last scatter graph (maternal vs paternal half-siblings) be expected? The production of a human sperm cell averages 27 recombinations (per ISOGG page on recombination), while the production of an egg averages 41 recombinations. As a result, wouldn’t we expect fewer and longer segments from paternal grandparents, and more but shorter segments from maternal grandparents? This would result in paternal half-siblings having fewer but longer matching segments than maternal half-siblings as shown in your diagram. I imagine a scatter graph between 1st cousins whose fathers are brothers vs 1st cousins whose mothers are sisters would also show a similar difference, with 1st cousins whose parents are brother/sister somewhere in between.

  2. Expanding on my hypothesis of paternal vs maternal half-siblings: from a quick calculation (which ignores the X and Y chromosomes and the varying rates of recombination on each autosomal chromosome), I would expect to see on average (41+41+22)/2 = 52 matching segments for maternal half-siblings and (27+27+22)/2 = 38 matching segments for paternal half-siblings. This is pretty close to what your scatter graph shows.

  3. Interesting about the half sisters possibly sharing a full X but more likely endogamous related. When I was trying to determine if my mom & her sister (now confirmed maternal half-sister) shared the same father, I was confused by the fact that their X was nearly identical, except for a few tiny pieces missing. They are maternal half-sisters, not sharing the same father at all. But maybe their fathers were distantly related as they did have ties to the same island.

    I also saw distant relatives, a brother & 2 sisters all having the same mother & different fathers shared an entire X.

    I’m still trying to get my paternal half-brother and my maternal half-sister to get tested. I tested 2 maternal half-brothers, and have a full brother out there whom I know will not get tested.

  4. Kitty,
    Thank you for sharing this and for your ongoing work. Here’s my scenario. My father is deceased and he had no siblings or first cousins. My brother and I have both tested as I am trying to trace our paternal line back and find the parents of my paternal 2nd great grandfather.
    My father was married prior to his marriage to my mother. That union produced 2 daughters, my half sisters. Both are deceased. However, those half sisters did have children. Here’s my question. Would testing my half nieces be worthwhile in trying to create more paternal DNA?
    I have created a paternal phased kit on GedMatch for my brother so we know which matches for him are from which side.
    Thanks for any advice you can offer.
    Diane

    • Yes, test your half nieces; the more data you have the better. They will likely have some DNA from him that you two do not have.

      You may be able to make a Lazarus kit at Gedmatch from the four of you

      Also it might be worthwhile to do a 37 marker Y test on your brother

  5. Hi Kitty,
    I had an aunt that always believed she really belonged to her sister, my mother, because of things my grandfather said. My grandfather was far from reliable.
    I had her dna done before she died, she wanted to know whether it was true or not but of course I couldn’t tell. I’ve done my own, my half sister, this aunt and two cousins, dna on Ancestry and uploaded them all to gedmatch. I keep going back to her dna and looking for signs. It just hit me the other day, she seems closer to one of my cousins than to me. I’m wondering if she really belongs to my aunt.
    Thanks for all your help.
    Barbara

  6. Not sure my problem fits in here, but here goes:

    I have a brother who for the past 50+ years of his life has maintained that he is not our father’s son. Once I found that he has the same haplogroup as our father, he decided that perhaps his father is our father’s brother. Below is a write-up of the data I have:

    My head has quite a bit of trouble understanding and interpreting and I am hoping to get some clarification (before you drink your glass of wine, LOL).
    Here is my puzzle: I have tested three people at 23andMe (09/2013):
    Erika (me), Hans and Massimo (both male). They all have a common MRCA, Henry, deceased. Erika and Hans are siblings, Massimo, I think, is a first cousin once removed of both, i.e. the next generation. 23andMe says they are first cousins.
    1. Henry had two sons, Max and Val (all deceased)
    2. Max had Erika and Hans
    3. Val had a son (deceased) who is the father of Massimo (therefore my 1C1R?)
    The question is whether it is possible that Hans is not the son of Erika’s father Max, but of Erika’s father’s brother Val, and if so, what kind of relationship would it be between Hans and Massimo (e.g. half nephew/half uncle).
    Also, how confident can you be, based on these numbers?

    GEDMatch numbers:
    Largest Segments:
    Erika – Hans 113.945
    Erika – Massimo 106.64
    Hans – Massimo 57.5992

    Total cMs:
    Erika – Hans 2527.97
    Erika – Massimo 700.77
    Hans – Massimo 747.605
    I cannot figure out the difference between the two men’s segments and cMs and Erika’s, i.e. why Hans and Massimo’s largest segment is smaller, yet their total cMs larger, than Erika and Massimo’s.

    In addition, I think that the ethnicity percentages may be of interest (23andMe):
                                Erika    Hans     Massimo
    Southern European           19.7%    16.1%    57.2%
    Balkan                      8.1%     3.0%     2.1%
    Italian                     5.5%     1.7%     28.8%
    Broadly Southern European   6.1%     11.4%    25.6%

    I am a real beginner and enjoy reading your articles and the blog, they are so well written and helpful, but right now I am stymied.

    Many thanks for looking at this and guiding me. If this post is inappropriate here, where else could I go?
    E

  7. Erika –
    The key to the full versus half question is the fully identical regions (FIRs), aka completely identical segments. So look at those.
    That is an area where you each got the same DNA from both parents thus it is identical on both chromosomes in the pair for a stretch along the same location. When you are 3/4 siblings or double first cousins you will also have some FIRs, just not as many.
    You can see this at 23andme or at GEDmatch. At 23andme you have to get to the full DNA comparison to see it which is not intuitively obvious. Easiest is to look at your match to Massimo then scroll to the bottom of the page and click on the yes next to your brother. This feature is not on a smartphone though.
    You would expect about 25% of your DNA to be fully identical with a full sibling, so about 900 cM. You have inspired me to collect some statistics on this so stay tuned for my next blog post. I share 824 cM with my own full sibling by the way.
    I do not look at ancestry composition for this unless it is wildly different.
    Massimo looks to be your first cousin. When I have more than one comparison I average the totals and check it at https://dnapainter.com/tools/sharedcmv4

    • Wow, Kitty, I did not expect a reply so quickly – thank you so much. Now, however, comes another hard part – doing what you told me to do. Let me see what I can come up with, and then I may have to ask you again, at least to corroborate.

      Thanks again, e

      • I also sent you an email for you to send me the kit numbers at GEDmatch or GENESIS but obviously better if you do it yourself!

    • The number of matches is not that meaningful. DNA inheritance is random and you and your cousin just share more DNA with other people who have tested…

  8. Kitty: I have learned so much in the last few days, mainly thanks to reading your writings and your suggested links. Fantastic (but lots more to go if I want to do more). I have written up what I understand and I hope you will critique it, both positively and negatively. I have written it so I can send it on to the people involved in this quest.

    The original question was: Is it possible that Uncle Val might be Hans’ father? Val is our father’s brother.

    I think that the data show that Hans is Erika’s full sibling. They share
    • acc. to 23andMe 44.8% DNA with 2545 cM, 40 segments (785 cMs, 32 segments fully identical)
    • GEDMatch: Total Half-Match segments (HIR) = 2525.8 cM (70.4 Pct)
    o 57 shared segments found for this comparison,
    o 74.1 Pct SNPs are full identical
    o 1.3 Generations

    THEREFORE

    Massimo is our cousin*:

    Erika-Massimo:
    • acc. to 23andMe – 9.36% DNA with 697 cM, 19 segments
    • GEDMatch: Total Half-Match segments (HIR) =699.9 cM (19.5 Pct)
    o 22 shared segments found for this comparison,
    o 60.9 Pct SNPs are full identical
    o 2.2 Generations

    Hans-Massimo:
    • acc. to 23andMe – 10.2% DNA with 758 cM, 24 segments
    • GEDMatch: Total Half-Match segments (HIR) = 746.4 cM (20.8 Pct)
    o 27 shared segments found for this comparison,
    o 61.1 Pct SNPs are full identical
    o 2.1 Generations

    The cMs for Erika and Massimo are 697 and for Hans and Massimo 758 (23andMe numbers).

    According to https://dnapainter.com/tools/sharedcm:
    697 cMs makes Erika either a first cousin 1C, or a first cousin once removed 1C1R
    758 cMs makes Hans either a first cousin 1C, or a first cousin once removed 1C1R

    It seems to me that the fact that these numbers also fall into the categories of half-aunt/uncle and half first cousin is irrelevant because there is no doubt (based on their fully and half identical overlapping segments and cMs) that Hans is Erika’s full sibling and not her half-brother.

    However, the numbers are rather low for their categories if compared to the table from the Aug.2017 version of the Shared cM Project, which may be interesting in and of itself but probably has no significance in the case at hand.

    I’d like to know whether you can do anything with our numbers as you intimated in your previous reply.

    Many thanks for looking at this. And I will email you the GEDMatch kit numbers in case you want to see them.

    Erika

    • Your logic is excellent Erika, yes they are full siblings. However I am tied up until Monday so cannot look at the rest of this until then

      • Dear Kitty:
        I’d like to refer back to my post of January 20 and your reply; I still have questions.

        You agreed that Erika and Hans are siblings (acc. to 23andMe 44.8%, 2535 cMs in 40 segments).

        The original problem was to ascertain whether there was a possibility that Hans was (instead) the son of Val, our father’s brother and Massimo’s grandfather. We know now that that is not the case, but I am confused about the numbers for the two gentlemen.
        Massimo is a 1C1R (proven), I remember him as a baby.
        Actual Centimorgans for both are way above average, but well within range, 700 for Erika and Massimo, 747 for Hans and Massimo.
        Percentages of shared DNA are well above the expected 1C1R percentage of 6.25%: 9.36% with 19 segments for E and M, and 10.2% with 24 segments for H and M.

        Am I on track, do I understand this, do you think? Can you say that these are just outliers? Or is there something I don’t know? I looked at all the longer segments on all chromosomes. Hans and Massimo have good overlaps on chr. 5, 10, 16, and 19, ranging from 55 to 105 (rounded). Erika and Massimo overlap on chr. 2, 4, 5, 10, 14, ranging from 50 to 107.

        My next project is to try to find some info on some 18th century cousins, because I may have a DNA Relative with 0.22%, one segment on chr. 18 That makes me suspect or hope that I may find something. She is a fourth cousin supposedly, which would fit. There were two young men who ‘were lost in the Levante’ ( I guess as mercenaries?). Would that not be a find? Lots to learn, even if I end up empty handed.

        Many thanks for all your help and interest, e

          • After close family the amount of shared DNA has a wider and wider range due to the randomness of DNA recombination. When people are at the high end, sometimes they are related more than once, particularly if they are from an endogamous group.
            But yes you are doing fine with all this

  9. Kitty,

    This information is so very helpful. Without your expertise and that of so many others, situations like Erika’s could not be solved. That leads me to my own situation/conundrum. I have struggled with reaching out for help, but I just can’t seem to reconcile all the data to fit or disprove what we assume is a half-sibling relationship.

    I see above that you are quite busy. I would be quite grateful if and when you have time, you could shed some insight. Of course, I am more than happy to share here whatever info might be helpful.

    Thank you so much,

    Brooke
