Much of the work to build tools and write articles to help testers with their DNA results has been done by citizen scientists, bloggers, computer programmers, and scientists from other fields like Andrew Millard (behind the WATO math). In an exciting development, Amy Williams, a computational biologist at Cornell University, has built a few DNA tools, with more to come, and started a blog at https://hapi-dna.org/blog/
Her blog article titled “How often do two relatives share DNA” is particularly interesting. It includes the beautiful chart shown below, which was created from simulations. Click it to go to the actual page, where you can mouse over the columns to get the detailed numeric breakdowns.
The other article on her blog has a detailed explanation of what a centiMorgan is, the measurement used for DNA segment sizes (click here). I usually recommend not worrying about the exact definition since it is a measure of the frequency of recombination rather than a physical length. It is important just to know that there is not a one-to-one relationship between the cM and the sizes shown in the chromosome browsers. On those charts, the same cM amount looks smaller at the ends of chromosomes than it does in the middle because recombination is more active at the ends.
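For readers who like to see the arithmetic, here is a minimal sketch of why the same cM amount can cover different physical lengths. The recombination rates are made-up illustrative numbers, not measured values, and the function name `physical_length_mb` is just something invented for this example. It assumes a constant recombination rate across the segment, which real chromosomes do not have.

```python
# Assumed, illustrative recombination rates in cM per megabase (Mb).
# Real rates vary along each chromosome and differ between genetic maps.
RATE_CM_PER_MB = {
    "middle of chromosome": 1.0,
    "near chromosome end": 2.5,
}


def physical_length_mb(segment_cm: float, rate_cm_per_mb: float) -> float:
    """Approximate physical length (Mb) of a segment of the given cM size,
    assuming a constant recombination rate over the whole segment."""
    if rate_cm_per_mb <= 0:
        raise ValueError("recombination rate must be positive")
    return segment_cm / rate_cm_per_mb


if __name__ == "__main__":
    # The same 20 cM segment is drawn shorter where recombination is more active.
    for region, rate in RATE_CM_PER_MB.items():
        length = physical_length_mb(20, rate)
        print(f"20 cM in the {region}: about {length:.1f} Mb")
```

With these assumed rates, a 20 cM segment spans fewer megabases near the end of a chromosome than in the middle, which is why it is drawn smaller in a chromosome browser.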
The final statement of that article is: “In an upcoming post, we’ll talk more about cM lengths of DNA and how recombination leads more distant relatives to share fewer segments that are also on average smaller than those that close relatives share.” Something to look forward to!
Now to take a look at the tools that are available there so far.
Of particular interest to adoptees is the maternal versus paternal predictor for half siblings or grandparents. I tried it out on a number of half sibling pairs who I have helped in the past.
Here is the prediction created for a brother and sister who share the same father but have different mothers, using the comparison of their segment data from 23andMe:
However, I discovered a number of minor usage issues when trying to use data from the different DNA testing sites.