There are four exciting new utilities at GEDmatch.com which I plan to cover in depth over the next several days. These are only available to people who have donated at least $10 (every additional $10 gets you these for another month), which is a good way for GEDmatch to pay for their extra server costs. The rest of the site will remain free. The utilities are:
- A Matching Segment Search – Get a list of all your segment matches suitable for cutting and pasting into a spreadsheet
- A Relationship Tree projection – calculates probable relationship paths based on Autosomal and X-DNA Genetic Distances. It is experimental, so try it and give them feedback
- Lazarus – Construct a kit to represent a close ancestor, wow!
- Triangulation – takes your top 300 matches and finds which ones match each other with details. The format can be copied to a spreadsheet
Finally, a way to give our Ancestry.com cousins a chromosome browser! If you have not been able to convince them to use GEDmatch, perhaps it will be easier to convince them to transfer their data to Family Tree DNA, that wonderful and very reputable company which started the personal genome testing revolution. This is a more private way to compare data than at GEDmatch, since only your DNA matches can see your information and compare where they match you. There is a free transfer which gives you an account with just your first 20 matches. Or for $39 you can transfer to a full-featured account there with all your matches and ancestry composition (called MyOrigins).
This transfer is possible since you already have the raw data from the DNA test; that is to say, the processing of your spit is already done. The transfer only works for data from Ancestry.com or the V3 kit from 23andMe (before December 2013).
To download the raw data from ancestry, you need to click on the settings button next to the person whose data you want on your DNA homepage. You can only get the raw data for kits you manage, of course. If you have not previously downloaded your raw data from Ancestry.com you may need to wait a few days since they seem to be having some technical difficulties at the moment.
I think the switch from Ancestry.com to GEDmatch is quite difficult. At Ancestry.com your tree gets searched for you, which makes using DNA with genealogy easy, while at GEDmatch you have to figure out how to use the data yourself. So this post is an attempt to help my cousins who have tested at Ancestry.com and uploaded to GEDmatch. By walking through the process, it might also help others new to GEDmatch who want to look at where they match a [possible] cousin.
Sample from the GEDmatch one-to-one comparison
- First make sure that you understand current DNA basics (click here for my page on that). Genetics has advanced greatly since my high school biology class and perhaps since yours too.
- Next realize that the raw data from your test is only a small part of your genome, a sampling. It is the SNPs that are currently considered the most interesting. They represent the most likely spots where we are different from each other. If a contiguous sequence of those SNPs is the same in two people for about 10 centimorgans (cMs) or more then they are expected to share a common ancestor. With a match of 7-10 cMs it is likely but not a sure thing. There is a good article in the ISOGG wiki on the likelihood of a match at different segment sizes.
- In order to see where your DNA matches someone else’s, you need your kit number and his kit number. Then you can use the one-to-one comparison to see on which chromosome(s) you match each other. Your kit number shows on your GEDmatch homepage. You can find the kit numbers of other possible relatives in the one-to-many display or perhaps your new cousin has sent you his kit number.
- I recommend that you keep a spreadsheet with the information on your matches, sorted by chromosome and start point, so you can see who else a new match might match. I have a number of posts on this blog about using spreadsheets and a template in my downloads area. Many people like to use the genomemate tool to organize their data.
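For readers who like to see things spelled out, the rule of thumb and the sorting from the list above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the segment rows, the "Cousin" labels, and the function name `likelihood` are made up for the example, and the cutoffs are simply the 7 and 10 cM figures mentioned above, not anything GEDmatch itself calculates.

```python
from typing import NamedTuple


class Segment(NamedTuple):
    match_name: str   # hypothetical label for the other kit
    chromosome: int
    start: int        # start position on the chromosome
    end: int          # end position on the chromosome
    cm: float         # segment length in centimorgans


def likelihood(cm: float) -> str:
    """Rough rule of thumb from the text, not a GEDmatch calculation."""
    if cm >= 10:
        return "expected common ancestor"
    if cm >= 7:
        return "likely, but not certain"
    return "below the range discussed here"


# Made-up rows standing in for what you would paste into a spreadsheet.
segments = [
    Segment("Cousin B", 7, 52_000_000, 61_000_000, 8.2),
    Segment("Cousin A", 2, 10_500_000, 24_000_000, 14.6),
    Segment("Cousin C", 2, 3_000_000, 9_000_000, 5.1),
]

# Sort by chromosome, then start point, as in the master spreadsheet.
ordered = sorted(segments, key=lambda s: (s.chromosome, s.start))

for s in ordered:
    print(f"chr{s.chromosome:>2} {s.start:>11,} {s.cm:>5.1f} cM  "
          f"{s.match_name}: {likelihood(s.cm)}")
```

Sorting on the pair (chromosome, start) puts neighboring segments next to each other, which is what makes it easy to spot who else a new match might match.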
The image above is from a recent new match to my Dad whose data was uploaded from ancestry.com. The blue rectangle shows where there is a DNA match. The numbers in the box are what I cut and paste into my master spreadsheet for Dad.
Many of my matches at ancestry.com are afraid to upload their raw data to GEDmatch because of their fears about DNA privacy. Here is what I want to say to all of them.
These personal genome tests are not your full genome, just a sampling of the places you are likely to be different from the next person. Remember that we all share more than 99% of our DNA with every other human being.
There is not enough information in these tests for some future mad scientist to make a clone of you.
The GINA law (the U.S. Genetic Information Nondiscrimination Act) protects you from health insurance companies or employers using your DNA information to discriminate against you or deny you health coverage.
So are you afraid that someone will know your blood type or eye color? What about unusual medical conditions? They can only figure something out about you if they know the kit number of someone with your same traits. All they get to see is where the DNA overlaps with another kit, not the raw data itself. And they would need far more knowledge about DNA than the average tester has, to use those overlaps to figure out anything about you.
A prominent genetic genealogist with a PhD in biology, Blaine Bettinger, has so little fear about people seeing his DNA data that he posted it all online for anyone to download and look at!
Your identity cannot be stolen from this data sampling of your DNA. It is like a giant fingerprint, not a credit card number.
On the other hand, if you have any criminals in your family it is just barely possible that your DNA could help track them down. It is not a good idea to do DNA testing if you are a criminal yourself, although the FBI uses different markers than what these tests look at.
Most of my matches at ancestry don’t see why they should upload their data to GEDmatch. I send them the URL of my slide presentation and extol the delights of the fun ancestry composition (admix) tools but it is hard to explain why I like to see where my DNA matches someone else’s. Curiosity? It’s fun? I love making these spreadsheets? Possibly it is because I am very interested in how DNA inheritance works and love to see which grandparent gave me which piece of DNA (n.b. it takes a lot of work to get to that point).
When I know the common ancestor for a specific segment, a new match sometimes fits immediately into a family line. The best example of that is finding my previously unknown 3rd cousin Katy. When I saw where she overlapped, I emailed her that it looked like she was related on my WOLD line, to which she responded that her grandmother was a Wold. She has since sent me many wonderful family pictures that I had not seen before.
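The kind of lookup I do by eye in the spreadsheet can be sketched roughly as follows. Everything here is an assumption for illustration: the positions are invented, the second and third family lines are placeholders, and `overlapping_lines` is just a name I picked for the example.

```python
from typing import NamedTuple


class KnownSegment(NamedTuple):
    line: str         # family line already worked out for this segment
    chromosome: int
    start: int
    end: int


def overlapping_lines(known, chromosome, start, end):
    """Return the family lines whose known segments overlap the new match."""
    return [
        k.line
        for k in known
        # Two ranges overlap when each one starts before the other ends.
        if k.chromosome == chromosome and k.start < end and start < k.end
    ]


# Invented positions for illustration only.
known = [
    KnownSegment("WOLD", 5, 20_000_000, 45_000_000),
    KnownSegment("Line B (hypothetical)", 5, 60_000_000, 80_000_000),
    KnownSegment("Line C (hypothetical)", 11, 20_000_000, 45_000_000),
]

# A new match on chromosome 5 from 30M to 50M overlaps only the WOLD segment.
print(overlapping_lines(known, 5, 30_000_000, 50_000_000))
```

An overlap like this is only a hint. You have two copies of each chromosome, one from each parent, so a new match in the same spot could be on the other parent's side. That is why I still email the match and ask.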
Today I got an email from someone who had tested at ancestry and uploaded to GEDmatch. She wanted to know how to use my tools with her GEDmatch data. However, my tools require a CSV file of overlapping segment data, which cannot be downloaded in one fell swoop from GEDmatch, unlike at 23andMe or Family Tree DNA.
Personally I built my many CSV files (one per person tested) slowly, as I compared each individual’s DNA results, contacted that match, and then cut and pasted the overlap information into my spreadsheets. Jim Bartlett did a great guest blog here on the process of building these DNA spreadsheets.
But I can understand the desire to see a quick picture of your matching DNA. GEDmatch does have a chromosome browser where you can see the overlaps, although the presentation is somewhat different from other sites. A little-known secret is that you can massage that function's table output into a spreadsheet (see end of this post for the technique).
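If you are comfortable with a little Python, one possible way to turn a copied table into a CSV file is sketched below. It assumes the table pastes as tab-separated text, which is how many browsers copy tables but is not guaranteed, and the column headings and values are placeholders, not GEDmatch's actual output. It is not necessarily the same technique described at the end of the post.

```python
import csv
import io


def pasted_table_to_csv(pasted: str) -> str:
    """Convert tab-separated text copied from a web page table into CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in pasted.splitlines():
        if not line.strip():
            continue  # skip blank lines picked up while copying
        writer.writerow(cell.strip() for cell in line.split("\t"))
    return out.getvalue()


# Placeholder headings and made-up values; the real table may differ.
pasted = (
    "Chr\tStart\tEnd\tcM\n"
    "2\t10500000\t24000000\t14.6\n"
    "\n"
    "7\t52000000\t61000000\t8.2\n"
)
print(pasted_table_to_csv(pasted))
```

Saving the returned text to a file ending in .csv gives you something any spreadsheet program can open.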