The current technology for personal genome testing cannot tell you which chromosome of a pair, maternal or paternal, an allele comes from. It can tell you that there is an AG at a specific position and a CT at the next position, but not whether the A came from your mother or your father. This leads to much confusion about DNA segment matching.
Kitty and Shipley; siblings sharing 47% of their DNA
The matches that these testing companies find are for stretches of DNA that are half identical regions (HIRs). This is because a relative who shares a DNA segment from a common ancestor with you will match you only along the chromosome you got from the parent who is descended from that ancestor. Thus your new relative will match you for half the alleles in those positions. Only a sibling will share fully identical regions of DNA. Click here for a page that has a picture of the DNA I share with my brother Shipley.
For example, if my Dad gave me AAAAAAAAAAA and my Mom gave me CCCCCCCCCCC then I would seem to match absolutely everyone on that segment because every position has both an A and a C. So an ACACCAACCAC or a CCAACCCACAA looks like a match, but only those with an AAAAAAAAAAA or a CCCCCCCCCCC would be real matches. This is simplistic, and the segment runs used for matching are much longer than this to try to avoid that sort of false matching. Also note that when your testing company shows an AC it is really an AT and a CG, but only one base of each known pairing is shown for brevity.
The term for a real match is IBD, which is an abbreviation for Identical By Descent. The term IBS means Identical by State which would apply to any false match. So in our example, the CCAACCCACAA match would be considered IBS.
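Here is a small Python sketch of that example, in case seeing it as code helps. It is only an illustration of the idea, not how any testing company actually does its matching. The function names (`unphased`, `half_identical`, `shares_haplotype`) are made up for this post, and the 11-letter strings are the toy segments from above, far shorter than anything used for real matching.

```python
def unphased(hap1, hap2):
    """Combine two chromosome copies into what a test reports:
    an unordered pair of alleles at each position."""
    if len(hap1) != len(hap2):
        raise ValueError("both copies must cover the same positions")
    return [frozenset(pair) for pair in zip(hap1, hap2)]


def half_identical(genotypes_a, genotypes_b):
    """True if the two people share at least one allele at every position."""
    if len(genotypes_a) != len(genotypes_b):
        raise ValueError("both people must cover the same positions")
    return all(a & b for a, b in zip(genotypes_a, genotypes_b))


def shares_haplotype(haps_a, haps_b):
    """True if one whole chromosome copy is the same in both people.
    This needs phased data, which the tests do not provide."""
    return any(a == b for a in haps_a for b in haps_b)


# Dad's copy and Mom's copy from the example above.
my_haps = ("AAAAAAAAAAA", "CCCCCCCCCCC")
# A stranger carrying the two mixed-up strings from the example.
stranger_haps = ("CCAACCCACAA", "ACACCAACCAC")
# A hypothetical relative who inherited the same copy my Dad gave me.
relative_haps = ("AAAAAAAAAAA", "ACACCAACCAC")

me = unphased(*my_haps)
stranger = unphased(*stranger_haps)
relative = unphased(*relative_haps)

print(half_identical(me, stranger))              # True: looks like a match
print(shares_haplotype(my_haps, stranger_haps))  # False: so it is only IBS
print(half_identical(me, relative))              # True
print(shares_haplotype(my_haps, relative_haps))  # True: a real (IBD) match
```

The half identical test is the only one that can be run on raw data as delivered, which is why a match that passes it can still turn out to be IBS.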
I previously blogged about a wonderful set of comic strips for us genealogists (and genetic genealogists) called Geneapalooza by Esto Frigus. Here is one that matched my current frustrations nicely:
This image is used by permission of the creator, Esto Frigus. Thanks Esto! Keep up the good work.
There are four exciting new utilities at GEDmatch.com which I plan to cover in depth over the next several days. These are only available to people who have donated at least $10 (every additional $10 gets you these for another month). This is a good way for GEDmatch to pay for their extra server costs. The rest of the site will remain free. The utilities are:
- A Matching Segment Search – Get a list of all your segment matches suitable for cutting and pasting into a spreadsheet
- A Relationship Tree projection – calculates probable relationship paths based on Autosomal and X-DNA Genetic Distances. It is experimental, so try it and give them feedback
- Lazarus – Construct a kit to represent a close ancestor, wow!
- Triangulation – takes your top 300 matches and finds which ones match each other with details. The format can be copied to a spreadsheet
Finally a way to give our Ancestry.com cousins a chromosome browser! If you have not been able to convince them to use GEDmatch, perhaps it will be easier to convince them to transfer their data to Family Tree DNA – that wonderful and very reputable company which started the personal genome testing revolution. This is a more private way to compare data than at GEDmatch since only your DNA matches can see your information and compare where they match you. There is a free transfer which gives you an account with just your first 20 matches. Or for $39 you can transfer to a full featured account there with all your matches and ancestry composition (called MyOrigins).
This transfer is possible since you already have the raw data from the DNA test.
To download the raw data from Ancestry, click on the settings button next to the person whose data you want on your DNA homepage. You can get the raw data for kits you manage or that have been shared with you as an “editor.”
I think the switch from Ancestry.com DNA test results, where your tree gets searched for you and using DNA with genealogy is easy, to GEDmatch, where you have to figure out how to use the data yourself, is quite difficult. So this post is an attempt to help my cousins who have tested at Ancestry and uploaded to GEDmatch. By walking through that process, it might also help others new to GEDmatch who want to see where they match a [possible] cousin.
Sample from the GEDmatch one-to-one comparison
- First make sure that you understand current DNA basics (click here for my page on that). Genetics has advanced greatly since my high school biology class and perhaps since yours too.
- Next realize that the raw data from your test is only a small part of your genome, a sampling. It is the SNPs that are currently considered the most interesting. They represent the most likely spots where we are different from each other. If a contiguous sequence of those SNPs is the same in two people for about 10 centimorgans (cMs) or more then they are expected to share a common ancestor. With a match of 7-10 cMs it is likely but not a sure thing. There is a good article in the ISOGG wiki on the likelihood of a match at different segment sizes.
- In order to see where your DNA matches someone else’s, you need your kit number and his kit number. Then you can use the one-to-one comparison to see on which chromosome(s) you match each other. Your kit number shows on your GEDmatch homepage. You can find the kit numbers of other possible relatives in the one-to-many display or perhaps your new cousin has sent you his kit number.
- I recommend that you keep a spreadsheet with the information on your matches, sorted by chromosome and start point, so you can see who else a new match might match. I have a number of posts on this blog about using spreadsheets and a template in my downloads area. Many people like to use the genomemate tool to organize their data.
The image above is from a recent new match to my Dad uploaded from ancestry.com. The blue rectangle shows where there is a DNA match. The numbers in the box are what I cut and paste into my master spreadsheet for Dad.
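For those who would rather script their bookkeeping than use a spreadsheet, here is a rough Python sketch of the steps above: keep the segments sorted by chromosome and start point, label each one using the 10 cM and 7 cM guidelines mentioned earlier, and list the existing matches whose segments overlap a new one. Everything in it is an assumption for illustration: the `Segment` fields, the function names, and the sample rows (invented names and positions, not real kits).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    match_name: str   # who the segment is shared with
    chromosome: int
    start: int        # start position on the chromosome
    end: int          # end position on the chromosome
    cm: float         # segment size in centimorgans


def confidence(cm):
    """Label a segment using the rough guidelines from this post."""
    if cm >= 10:
        return "common ancestor expected"
    if cm >= 7:
        return "likely, but not a sure thing"
    return "smaller than the sizes discussed here"


def sort_segments(segments):
    """Order rows by chromosome and start point, like the spreadsheet."""
    return sorted(segments, key=lambda s: (s.chromosome, s.start))


def overlapping(new, segments):
    """Existing segments on the same chromosome that overlap the new one."""
    return [
        s for s in segments
        if s.chromosome == new.chromosome
        and s.start < new.end
        and new.start < s.end
    ]


# Invented sample rows, for illustration only.
rows = [
    Segment("Cousin B", 7, 50_000_000, 62_000_000, 12.4),
    Segment("Cousin A", 3, 10_000_000, 18_000_000, 8.1),
    Segment("Cousin C", 7, 20_000_000, 30_000_000, 6.0),
]
new_match = Segment("New match", 7, 55_000_000, 70_000_000, 15.0)

for seg in sort_segments(rows):
    print(seg.chromosome, seg.start, seg.match_name, confidence(seg.cm))

for seg in overlapping(new_match, rows):
    print("Compare", new_match.match_name, "with", seg.match_name)
```

An overlap here only flags someone worth checking. Because the matches are half identical, two people who overlap with you in the same place may be related through different parents, so you still need to compare them against each other.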