Perhaps this post needs the subtitle “My Perfect Cousin Goes to GEDmatch.”
Most of us can keep track of information in spreadsheets. So how to do that with DNA? Well, the idea is to keep a list of matching DNA segments so that a new match can be compared to your known family members. That way you may be able to see where they fit in.
If you have tested at 23andme, MyHeritage, or Family Tree DNA, you can download your list of matches with their matching DNA segments either directly from your testing company or by using the tools at DNAgedcom. However, AncestryDNA does not provide a list of matching segments.
Why would you want those? The short answer is to figure out which line a new DNA cousin belongs to. For the long answer, read on. For more posts about DNA spreadsheets click here or in the tag cloud, lower right hand column.
AncestryDNA testers can make a DNA segment spreadsheet by using any of a number of utilities at the GEDmatch web site. Start by uploading your raw DNA data (click here for that “how to” post). Your results will usually be ready for full comparisons the next day. Then buy the Tier 1 utilities for at least one month ($10).
My preference for making a first spreadsheet is to use the Tier 1 GEDmatch Matching Segment Search. Then I go through the top matches from the ‘One-to-many’ matches report with that spreadsheet as a reference. I add notes on what I discover to my new spreadsheet.
First I uploaded J.M.’s raw DNA results to GEDmatch under my own account and used a pseudonym for her name – Kittys2nd1rJM – which makes it clear to me who she is. Plus anyone who matches her will understand that she is my 2nd cousin once removed. Then I waited until the next day.
On that next day I clicked on the Tier 1 function “Matching Segment Search” and got a form where I had to fill in her kit number (your kit numbers are shown in the third box in the left hand column on your GEDmatch home page). I used the defaults for the SNPs and the cMs. Next I checked the No box next to the “Show graphic bar for Chromosome?” option in order to make it easier to copy the results to a spreadsheet. Then I clicked submit and went to the kitchen and made myself a salad. I still had a bit of a wait on my return.
Once the matching segment report completed, I used control-A to select the entire page, then control-C to copy it to my “clipboard.” I opened my spreadsheet program, OpenCalc from OpenOffice, and pasted (control-V) that data into a new sheet. I deleted a bunch of extra lines at the top as well as two extra lines at the bottom.
Next I moved the name column to before the chromosome column and changed the column sizes as needed. I added an initial column for “side” where I will put M, P, or I for maternal, paternal, or IBC (false). Then I added several columns before the email column: one for the company I got the match from, one for the most recent common ancestor(s) (MRCA), one for the relationship, and another for notes. These days I mainly use notes for who else matches here. Here is how that looked.
If you DO NOT have other known relatives at GEDmatch, you can skip this next step… I sorted the spreadsheet by the match email address and name and searched for my own email address. Then I added the known MRCA, side, and relationship for all the kits I manage. I also added the same for other known Munson cousins on her list. Then I bolded the names of all the known relatives.
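For readers who prefer a script to hand-editing, the column layout and the known-relative annotation described above can be sketched in Python. This is only an illustration under assumptions: the header names, the sample rows, and the `KNOWN` lookup are all hypothetical, and the real GEDmatch report may label its columns differently.

```python
import csv
import io

# Hypothetical sample of a pasted segment report; real headers may differ.
RAW = """Name,Chr,Start,End,cM,SNPs,Email
Known Cousin,19,1000000,9000000,17.2,2100,Cousin@example.com
New Match,19,2000000,8500000,17.0,2000,newmatch@example.com
"""

# Known relatives keyed by lower-cased email: (side, MRCA, relationship).
# These values are made up for illustration.
KNOWN = {"cousin@example.com": ("M", "Example ancestors", "2C")}

# Layout from the post: side first, name before chromosome,
# and the new columns placed just before the email column.
NEW_ORDER = ["Side", "Name", "Chr", "Start", "End", "cM", "SNPs",
             "Company", "MRCA", "Relationship", "Notes", "Email"]


def build_master(raw_text, known):
    """Return rows in the master layout, filling in known relatives by email."""
    rows = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        side, mrca, rel = known.get(row["Email"].lower(), ("", "", ""))
        row.update({"Side": side, "MRCA": mrca, "Relationship": rel,
                    "Company": "", "Notes": ""})
        rows.append({col: row[col] for col in NEW_ORDER})
    return rows


master = build_master(RAW, KNOWN)
for r in master:
    print(r["Side"] or "?", r["Name"], r["Chr"], r["Start"], r["End"])
```

Rows for unknown matches keep empty side, MRCA, and relationship cells, ready to be filled in as you identify them.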
Next I ran a one-to-many report for her kit and looked through her top matches. My family members were matches 1-5, but at position 6 there was a new match, Naomi, whose kit number starts with M, meaning she tested at 23andme. In the top ten I also spotted a group of the 3rd cousins descended from Neils and Martha Simensen whom I had previously found at Ancestry; I recognized them because the username in the email address was the same as the username at Ancestry. So I searched J.M.’s master spreadsheet for that email and added the common ancestors and presumed relationship (3C) to all those kits.
I also checked the other matches that had a gen of 4.5 or less and were kits from Ancestry (kit numbers starting with A) by searching Ancestry for the name listed, the nickname, or the username in the email address. I found several of them at Ancestry and added those relationships where known.
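If your one-to-many list is long, the shortlist described above (gen of 4.5 or less, kit number starting with A) could be pulled out with a few lines of Python. This is a rough sketch: the dictionary keys and the sample kits are invented for illustration, and you would still have to look each person up at Ancestry by hand.

```python
# Hypothetical one-to-many rows; kit numbers and names are made up.
matches = [
    {"kit": "A000001", "name": "Example One", "gen": 3.9},
    {"kit": "M000002", "name": "Example Two", "gen": 3.2},
    {"kit": "A000003", "name": "Example Three", "gen": 5.1},
]


def ancestry_candidates(rows, max_gen=4.5):
    """Keep kits whose number starts with A and whose gen is at or below max_gen."""
    return [m for m in rows
            if m["kit"].startswith("A") and m["gen"] <= max_gen]


for m in ancestry_candidates(matches):
    print(m["kit"], m["name"], m["gen"])
```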
Finally it was time to put the master sheet to work on that Naomi match. I sorted the sheet by chromosome and start position. Then I searched through the sheet for Naomi to see if there was anywhere that she overlapped with known relatives. Look what I found.
She overlaps both my 3rd cousin DM (J.M.’s 2nd cousin) and the Simensen group for a really solid 17 cM match at chromosome 19. Since DM is from J.M.’s mother’s side and the Simensens are from her father’s side, Naomi should only match one group or the other. A quick one-to-one with DM and one of the Simensen crowd settled the matter: Simensen. I checked Naomi’s kit number at GEDmatch to see if she had uploaded a tree (click here for that post) but no luck, so I sent her an email with my findings.
I will be continuing to use this process with J.M.’s other close matches.
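The sort-and-scan step can also be sketched in Python. The segment records below are invented, and the field names (`name`, `side`, `chr`, `start`, `end`) are my own assumptions, not anything GEDmatch exports. Note that an overlap only tells you where to look; as in the example above, a new match can overlap relatives from both sides, so a one-to-one comparison is still needed to decide which side is real.

```python
def overlaps(a, b):
    """True when two segments share any stretch of the same chromosome."""
    return a["chr"] == b["chr"] and a["start"] < b["end"] and b["start"] < a["end"]


def find_overlaps(segments, new_name):
    """List where new_name's segments overlap relatives with a known side."""
    ordered = sorted(segments, key=lambda s: (s["chr"], s["start"]))
    hits = []
    for n in (s for s in ordered if s["name"] == new_name):
        for s in ordered:
            if s["name"] != new_name and s["side"] and overlaps(n, s):
                hits.append((n["chr"], s["name"], s["side"],
                             max(n["start"], s["start"]),
                             min(n["end"], s["end"])))
    return hits


# Made-up data: chromosomes are plain integers here (X would need special handling).
segments = [
    {"name": "Cousin M", "side": "M", "chr": 19, "start": 1_000_000, "end": 9_000_000},
    {"name": "Cousin P", "side": "P", "chr": 19, "start": 3_000_000, "end": 12_000_000},
    {"name": "New Match", "side": "", "chr": 19, "start": 2_000_000, "end": 10_000_000},
    {"name": "Cousin M", "side": "M", "chr": 2, "start": 2_000_000, "end": 10_000_000},
]

hits = find_overlaps(segments, "New Match")
for chrom, who, side, start, end in hits:
    print(f"chr {chrom}: overlaps {who} ({side}) from {start} to {end}")
```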
For those of you who are coming to the i4gg genetic genealogy conference at the end of October in San Diego, this technique will be part of my talk there …
By the way, I have a template for DNA segment matches that includes the centromere locations in my downloads area.
Another thing you can do with master spreadsheets is to map the DNA of an ancestor. See the one I did for my Wold ancestors. Hoping to do the Munson side soon!
Thank you, J.M. Your results are a delight to work with.