It was only a matter of time before the methodologies and technologies that have been developed to break genealogical brick walls and find unknown birth parents were used to identify victims and criminals. The use of DNA and genealogy to solve the horrific Golden State killer case has been sensationalized in the media for several days now. I even got a few calls from reporters as a DNA and GEDmatch expert. Also, just two weeks ago an unknown murder victim from 30 years ago, found in Florida, was finally identified from a DNA cousin match on a genealogy site.
Some of my friends and cousins are worried about the possible invasion of their DNA test privacy. Most just want to understand how this can be done, so I will try to explain that in this post. At the end of this post I will include links to other genetic genealogy blog posts that have wrestled with the issues raised.
Although I have sympathy with the concerns of people who fear false identification using DNA techniques, this is not my fear. The methodology used gets to a pool of possibles whose actual DNA is then collected and compared. I have confidence in that technology. My fear is that my cousins will stop testing their DNA to help my family projects or stop uploading their tests to my favorite tools site, GEDmatch, where the DNA test results from different companies can be compared.
Click here for an article at the LA Times which went into more of the technical details of the Golden State killer case for us genetic genealogists and here for a lengthy video interview with investigator Paul Holes on how it was done.
Let me start my article by reminding all of you that every human’s DNA is about 99% the same as every other human’s and about 98.5% the same as a chimpanzee’s. The companies that test your personal genome only test a small sample of that differing 1%. To put it in numbers, our genomes have about 3 billion base pairs and the tests cover about 700,000 of those, which comes out to about 0.02% of your genome. Not enough to clone you or worry about, in my opinion.
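For anyone who wants to check that percentage, here is the arithmetic as a tiny Python snippet. The two figures are the rounded ones quoted above (about 3 billion base pairs, about 700,000 tested locations), so the result is only a ballpark; the variable names are just mine for illustration.

```python
# Rounded figures from the paragraph above; both are approximations.
GENOME_BASE_PAIRS = 3_000_000_000   # about 3 billion base pairs
TESTED_LOCATIONS = 700_000          # about 700,000 locations on a typical test

fraction_tested = TESTED_LOCATIONS / GENOME_BASE_PAIRS
percent_tested = fraction_tested * 100

print(f"Roughly {percent_tested:.2f}% of the genome is tested")
```

In other words, a little over two hundredths of one percent.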
Next let me remind you that uploading your DNA results from Ancestry or 23andMe or wherever you tested to GEDmatch does not expose even that little bit of your DNA to the public. What happens is that your “DNA cousins” will match long sections of your data, called segments, and they can see which locations on which chromosome(s) are the same between the two of you. Therefore they know what your actual DNA code is only on those pieces they share with you. When they match you in the GEDmatch database, they can see your email address, name or pseudonym, and your kit number. With that kit number they can see what color your eyes are, what ethnicities various calculators give, and who else you match. If you have connected a family tree to your DNA they can also see the non-living people in your tree. But they have to match your DNA significantly to see any of that! Click here for an article I wrote addressing privacy worries at GEDmatch.
So how do you get from there to a killer?
You start by putting a DNA results data file of your suspect on GEDmatch that looks like a kit from one of the main testing companies.
The methodology involves building endless trees. This is much easier to do on Ancestry or MyHeritage, which have good family trees and good DNA to tree matching tools. There is also WikiTree, which connects its one world tree to GEDmatch if its users have entered their kit numbers. Finally, GEDmatch itself also has a family tree (GEDCOM) upload and compare facility.
To start tree building, you need to find some people who match the DNA. Predicted second or third cousins are best, but it can be done from fourth cousins, as some have suggested may have been done in this case; it just takes longer. Next you build the trees of these cousins back to about 1800, looking for an ancestor or couple who appears in more than one tree: the common ancestor(s). Sometimes the trees are already built. The next step is to build the tree of that common ancestor’s descendants down to the present day, looking for someone of the right age in the right place. There are tools that can help with comparing trees to get to that couple or couples, but there are no shortcuts to building the tree of their descendants, unless some of them have already built large public trees.
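For the programmers among my readers, the logic of those steps can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions, not how any genealogy site or investigator actually does it: every name, date, and place is made up, the trees are assumed to be already built and typed in by hand, and the function names are hypothetical.

```python
from collections import Counter, deque

def common_ancestors(trees):
    """Return ancestors who appear in more than one match's tree."""
    counts = Counter()
    for ancestors in trees.values():
        counts.update(set(ancestors))
    return {name for name, n in counts.items() if n > 1}

def descendants(children, root):
    """Walk down from a common ancestor, generation by generation."""
    found, seen, queue = [], {root}, deque([root])
    while queue:
        person = queue.popleft()
        for child in children.get(person, []):
            if child not in seen:
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found

def candidates(people, names, born_between, place):
    """Keep only descendants of the right age in the right place."""
    earliest, latest = born_between
    return [
        n for n in names
        if n in people
        and earliest <= people[n]["born"] <= latest
        and people[n]["place"] == place
    ]

# Made-up data: each DNA match's tree as a set of ancestors.
trees = {
    "match_A": {"Ancestor X", "Ancestor Y"},
    "match_B": {"Ancestor X", "Ancestor Z"},
    "match_C": {"Ancestor W"},
}
# Made-up descendant tree for the shared ancestor.
children = {
    "Ancestor X": ["Child 1", "Child 2"],
    "Child 1": ["Descendant 1"],
    "Child 2": ["Descendant 2", "Descendant 3"],
}
people = {
    "Descendant 1": {"born": 1946, "place": "California"},
    "Descendant 2": {"born": 1950, "place": "Ohio"},
    "Descendant 3": {"born": 1920, "place": "California"},
}

shared = common_ancestors(trees)
pool = []
for ancestor in sorted(shared):
    pool += candidates(people, descendants(children, ancestor),
                       born_between=(1940, 1955), place="California")
print(shared, pool)
```

The code is the easy part. The real work is the research needed to fill in `children` and `people`, which is why building the descendant tree takes so long.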
I read that in the Golden State case they got down to a pool of 100 people over a four-month period using this technique. I saw an article that said that another suspect’s DNA was tested a year ago and was negative. So this methodology was not a panacea. It got them to a pool of people whom they had to investigate using standard police work and direct DNA matching. Anyone who has ever watched the TV show Bones knows that DNA can be extracted from chewing gum and drink cans…
When I do unknown parentage work, this tree building methodology can get me to grandparents or great grandparents fairly quickly, provided enough relatives are tested (easiest for Americans). I will be giving a presentation on this technique at the SCGS Jamboree in Burbank at the end of May. Here are some of my blog posts that describe the methodology:
Here are some DNA success stories I wrote up:
These articles quote me:
Here is what the Legal Genealogist, Judy Russell has to say about the issues raised:
Here is what Leah Larkin, the DNA geek has to offer:
And finally, be sure to watch what top genetic genealogist CeCe Moore has to say about all this; she taped segments for 20/20 and Good Morning America.
UPDATE 1-May-2018: The clip of CeCe was tiny yesterday (Monday morning) on GMA – it’s about 1/3 of the way in when they discussed the Golden State killer case. Click here for a good summary of the case from the Washington Post.