Archives

New DNA Tools and Blog from a Scientist at Cornell

Much of the work to build tools and write articles to help testers with their DNA results has been done by citizen scientists, bloggers, computer programmers, and scientists from other fields like Andrew Millard (behind the WATO math). In an exciting development, Amy Williams, a computational biologist at Cornell University, has built a few DNA tools, with more to come, and started a blog at https://hapi-dna.org/blog/

Her blog article titled “How often do two relatives share DNA” is particularly interesting. It includes the beautiful chart shown below which is created from simulations. Click it to go to the actual page where you can mouse over the columns to get the detailed numeric breakdowns.

Chart of How Often 2 Relatives Share DNA from https://hapi-dna.org/2020/11/how-often-do-two-relatives-share-dna-2/

The other article on her blog has a detailed explanation of what a centiMorgan is, the measurement used for DNA segment sizes (click here). I usually recommend not worrying about the exact definition since it is a measure of the frequency of recombination rather than a physical length. It is important just to know that there is not a one-to-one relationship between the cM and the sizes shown in the chromosome browsers. On those charts, the same cM amount looks smaller at the ends of chromosomes than it does in the middle because recombination is more active at the ends.

The final statement of that article is: “In an upcoming post, we’ll talk more about cM lengths of DNA and how recombination leads more distant relatives to share fewer segments that are also on average smaller than those that close relatives share.” Something to look forward to!

Now to take a look at the tools that are available there so far.

Of particular interest to adoptees is the maternal versus paternal predictor for half siblings or grandparents. I tried it out on a number of half sibling pairs who I have helped in the past.

Here is the prediction created for a brother and sister who share the same father but have different mothers, using the comparison of their segment data from 23andMe:

However, I discovered a number of minor usage issues when trying to use data from the different DNA testing sites.

Super Large Numbers Do Not Work in my Ahnentafel to GEDCOM Tool

Alert, there is a bug in my tool to convert text files to GEDCOMs: very large Ahnentafel numbers like “46406041600” will cause it to hang.

I will add code to ignore large numbers by May (end of this week). If you are a regular user of this tool, check this post for the update when the new feature is released.

Something must have changed on the collection of trees because I had three emails in the last week complaining that my tool hung and did not complete the conversion. In all cases, the Ahnentafel went up to extremely large numbers, so eliminating those last few lines fixed the problem.

Here is an example of the last several lines of a file that did not work:

Here are the last few lines of the same file after removing the lines with very high numbers. This version worked.

Do you really need these people born in the 1200s? What is the probability that they even are actually your ancestors?

This appears to be some sort of limitation in either the storage space for the program or the number sizes. Thus I propose to modify the code to ignore Ahnentafel numbers with more than seven digits and to have it tell you that it did that.
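For anyone who wants to trim a file themselves before running the conversion, here is a minimal sketch of that idea in Python. It is not the code in my tool. It assumes each person's line starts with the Ahnentafel number, and the function names are made up for illustration. It relies on the standard Ahnentafel numbering, where the root person is 1 and each generation back doubles the numbers, so a number's bit length tells you how many generations back that ancestor is.

```python
import re

MAX_DIGITS = 7  # the cutoff proposed above

# Assumption: a person's line begins with the Ahnentafel number.
# Lines that do not start with a number are passed through untouched.
LEADING_NUMBER = re.compile(r"^\s*(\d+)\b")


def filter_ahnentafel(lines, max_digits=MAX_DIGITS):
    """Split lines into those to keep and the Ahnentafel numbers skipped."""
    limit = 10 ** max_digits  # smallest number with more than max_digits digits
    kept, skipped = [], []
    for line in lines:
        match = LEADING_NUMBER.match(line)
        if match and int(match.group(1)) >= limit:
            skipped.append(int(match.group(1)))
        else:
            kept.append(line)
    return kept, skipped


def generations_back(ahnentafel_number):
    """Generations between the root person (number 1) and this ancestor."""
    return ahnentafel_number.bit_length() - 1


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real file.
    sample = [
        "1. Root Person",
        "9999999. Distant Ancestor",
        "46406041600. Very Distant Ancestor",
    ]
    kept, skipped = filter_ahnentafel(sample)
    for number in skipped:
        print(f"Ignored {number} ({generations_back(number)} generations back)")
```

A seven digit cutoff still keeps everyone up to 23 generations back, since 9999999 falls between 2 to the 23rd power and 2 to the 24th. The number 46406041600 from the bug report is 35 generations back.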

Any other ideas out there? Remember I make almost no money on this, just the occasional small thank you donation, so I am not looking for a solution that will take lots of my time.

New Numbers for the Shared cM Project

All of us genetic genealogists are extremely grateful to Blaine Bettinger for collecting statistics on the actual amounts of DNA shared for known family relationships. He just updated his numbers for that project this past March. Click here for the details on his blog.

This update is also included in the wonderful online calculator at DNApainter where you can input either the cM or the percentage shared and dynamically see the probabilities of various relationships. Click here to read what the programmer, Jonny Perl, has explained about its new features on his blog.

I refer people to this calculator all the time so that they can see the full range of possible relationships for a specific amount of DNA. In the above screen shot of that tool I have used red arrows to show where you would put the number of cM or the % shared.

An exciting new feature is that if you click on any of the colored boxes it shows you a histogram of the frequencies within the range for that relationship.

Let me demonstrate using this calculator by comparing a few of my family members.


Automation to Find the Common Ancestors in the Trees of your DNA Matches

Recently I gave a presentation on many of the great new DNA tools that have come out this year. The talk focused on how both Ancestry and MyHeritage figure out the likely ways in which you are related to a DNA match from the other trees on their sites (click here for my slides). This left very little time to go into the details of my favorite third party tools that can do similar magic, so I promised the attendees a blog post…

The three tools I use the most for finding common ancestors are:

They all have their strengths but none are free. Think of the endless hours professional programmers have spent making these tools and be grateful.

Mainly I use them for unknown parentage cases where the tree is not yet known. However, they are also useful for your own genealogy. For example, Genetic Affairs will look at the trees on Family Tree DNA and show you any common ancestors it finds with the path to your matches:

My Dad shares DNA with many descendants of this couple as shown by Genetic Affairs from his FtDNA results

There are many other tools I could not do without, like the online relationship calculator at DNApainter (click here), but this article is about automation to find common ancestors. Read on for my summary of the strengths and weaknesses of each of the tools that do that.


Talking About Many New DNA Tools

Tuesday I will be presenting the latest version of my talk on solving unknown parentage cases in a virtual conference hosted by the Utah genealogical society (click here for more information). In the past, I relied heavily on the tools at DNAgedcom, but now there are several new tools that are even more exciting.

The basic methodology for unknown parentage searches is to DNA test everywhere. Then look through the trees of your matches to see what ancestors are in common. Build trees down from those common ancestors looking for where the different families meet in a marriage. Then find a child of that marriage who was in the right place at the right time to be the missing parent or grandparent or …
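The "what ancestors are in common" step is the part that lends itself to automation. Here is a rough sketch in Python of the idea, not how any of the actual tools work. It assumes you already have a list of ancestors for each match, and that the same ancestor is written identically in every tree. Real trees rarely cooperate, so real tools have to do fuzzy matching on names and dates. The function name and the sample people are invented for illustration.

```python
from collections import defaultdict


def common_ancestors(match_trees, min_matches=2):
    """Find ancestors that appear in the trees of several DNA matches.

    match_trees maps a match's name to the ancestors listed in their tree.
    Returns (ancestor, [matches]) pairs, most widely shared first.
    """
    seen = defaultdict(set)
    for match, ancestors in match_trees.items():
        for ancestor in ancestors:
            seen[ancestor].add(match)

    shared = [
        (ancestor, sorted(matches))
        for ancestor, matches in seen.items()
        if len(matches) >= min_matches
    ]
    # Most matches first, then alphabetical so the order is predictable.
    shared.sort(key=lambda item: (-len(item[1]), item[0]))
    return shared


if __name__ == "__main__":
    # Hypothetical matches and ancestors, for illustration only.
    trees = {
        "Match A": ["John Doe b.1850", "Mary Roe b.1852"],
        "Match B": ["John Doe b.1850", "Ann Poe b.1860"],
        "Match C": ["John Doe b.1850", "Ann Poe b.1860"],
    }
    for ancestor, matches in common_ancestors(trees):
        print(ancestor, "appears in the trees of", ", ".join(matches))
```

The ancestors at the top of that list are the ones to build trees down from, looking for where two of those families meet in a marriage.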

A major difficulty is that many people test DNA without providing a tree. Usually you have to try to build trees for them. Another problem is that building trees down from those common ancestors is incredibly labor intensive when the families are large and the matches are distant. My latest strategy for difficult cases is to recruit several search angels to build the different trees.

There are now tools that automate building trees!