Clustering has changed the way many of us work on genealogy mysteries and unknown parentage cases. Genetic Affairs was just one of the sites offering automated clustering (click here for my first clustering post), but then they added tree building. That’s right, they make tree diagrams for each cluster that has at least two people with trees that can be matched up. They even include a GEDCOM for those trees in the zip file they send.
One way I use these diagrams is to show cousins how we are related. Another way I use this feature is to solve unknown parentage cases. I use both DNA2tree and Genetic Affairs and then go with whichever seems to have the more relevant looking trees. The advantage of Genetic Affairs (GA) is that it will look at the unlinked trees and at your ThruLines. Also, the output is easy to glance over to see what is worth pursuing, once you are used to the format. Click here for my recent post on automated tree-building tools.
Above is the diagram GA built for the descendants of my gg-grandparents who are in the lonely box on the far left. Click on the image for a larger image in a new tab. My great grandparents lived on farm Skjold in Etne, Hordaland, Norway and had eight children, four of whom, plus the child of another, emigrated to the USA and have many tested descendants at Ancestry.
Here is a key to what you are seeing. The green box on the bottom line is me. The mustard yellow box means that the match’s unlinked tree was used from that person on down. The people in pink in the middle were determined from my ThruLines. Living people are shown as just id numbers, except for your matches who are shown by the name they have chosen to be seen as. All DNA matches are on the far right and are also colored pink with the source and the amount shared listed. Clicking on a match gets a little box to pop up in the lower right corner (as shown) with the name of their family tree, clickable to their Ancestry tree.
UPDATE 24 Jun 2020: Clustering on Ancestry is no longer available as they issued a cease and desist order to Genetic Affairs and many other 3rd party sites. Please click here and send a suggestion to Ancestry that they implement clustering on their site.
The purple box with the word ANCESTRY indicates the source of the tree information. Another GA feature is the ability to cluster both your Family Tree DNA matches and your Ancestry matches together.
When names are listed differently in other trees they will be shown in these diagrams as separate people. Notice that in the second column from the left, the software could not tell that the A. Skjold who married L. Stephenson is the same person as the A. Halvorsdtr skjold who married L. Stephenson Fjaere. Norwegians did not have fixed surnames so we usually use the farm name as a surname in our trees. Upon arriving in this country they often chose to use the patronymic, so Stephenson rather than Fjaere (click here for more on Norwegian naming). However the other Anna Halvorsdtr Skjold listed between those two really is a different person and she married a Thompson. Reusing first names is another bane of the Norwegian genealogist.
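For the curious, here is a minimal Python sketch of why this is hard for software. It is not how Genetic Affairs compares names (I do not know their method); it is only an illustration, and the function names and the matching rule are my own assumptions. The rule: treat two tree entries as probably the same person when the first initial and the farm name agree, ignoring capitalization, and the spouses share at least one full name.

```python
def tokens(name):
    """Split a name into lowercase words with trailing periods removed."""
    return [part.strip(".").lower() for part in name.split()]


def likely_same_person(person_a, spouse_a, person_b, spouse_b):
    """Rough guess at whether two tree entries describe one person.

    Assumed rule (illustrative only): same first initial, same last
    word (the farm name), and the spouses share at least one name that
    is longer than a single-letter initial. Names must not be empty.
    """
    a, b = tokens(person_a), tokens(person_b)
    same_initial = a[0][0] == b[0][0]
    same_farm = a[-1] == b[-1]
    spouse_names_a = {t for t in tokens(spouse_a) if len(t) > 1}
    spouse_names_b = {t for t in tokens(spouse_b) if len(t) > 1}
    spouses_overlap = bool(spouse_names_a & spouse_names_b)
    return same_initial and same_farm and spouses_overlap


# The three entries from the diagram described above.
same = likely_same_person(
    "A. Skjold", "L. Stephenson",
    "A. Halvorsdtr skjold", "L. Stephenson Fjaere",
)
different = likely_same_person(
    "A. Skjold", "L. Stephenson",
    "Anna Halvorsdtr Skjold", "Thompson",
)
print(same, different)
```

Even a rule like this only works here because the spouses differ. Two cousins with the same first name and farm name who both married a Stephenson would fool it, which is why a human still has to review the diagram.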
This tree building capability from Genetic Affairs recently helped me solve an unknown father mystery.
When “Amy” discovered her brother was only a half brother by doing an Ancestry DNA test, she was very surprised. She had heard that her mother was pregnant with her when marrying her late father, but everyone knew he was her Dad, or so she thought. Her mother was not willing to discuss this, so she asked for my help to figure out her biological father from the DNA.
With a first cousin match, “Gina,” it seemed as if finding the birth father of “Amy” (all names as always are pseudonyms) would be easy. Soon, however, it was clear that Gina’s father was not biological either! The two women shared a mystery second cousin match “Marvin” who had not responded to their queries but at least he had a great tree.
Rather than build a mirror tree, a now disfavored technique, I ran a tree-building rule-based analysis at Genetic Affairs in order to find their common ancestors with Marvin. Note that both people have to share their DNA results with the account you are using (mine in this case). Here are the details of what I did.
Skip to step 6 below if you are already proficient with the Genetic Affairs site.
1. Go to GeneticAffairs.com and log in or register if you have not yet. You will have to let the site save your username and password at any site you want to use for analysis like Ancestry. I have a different password for all my DNA sites that I do not use anywhere else, so I never worry about a data breach finding those passwords (and I trust the site owner to use proper precautions).
2. Click the big green button RUN AN ANALYSIS which takes you to a page that lists the websites you have registered at this site.
3. Click on the little icon with green dots and lines next to the site you want. I chose my Ancestry account.
4. Now I got the far too long list of DNA profiles shared with me at Ancestry.
5. When working with a new client, typically I have to scroll to the bottom and click the blue button saying RETRIEVE/UPDATE NEW PROFILES FOR THIS WEBSITE in order to update my list of people.
6. After updating, I found Amy and clicked the blue icon with a down arrow and little lines. That is the rule-based analysis indicator.
7. The next page showed explanations of the rules available: the logical “OR”, “AND”, and “NOT” functions. Since we wanted the in common with (ICW) matches, that is an AND operation. I wanted every match that both Amy AND Gina shared.
8. First though we needed to select a sensible cM range for this search. Those selection boxes are below the explanations and above the list of matches. When there are not many matches, I usually go down to 20 cM or even 15 cM and set the maximum to about 400 to catch 2nd cousins along with more distant relatives. It is important to remember to do this as the defaults are not best for unknown parentage.
10. Finally I scrolled to the bottom of the page and clicked the blue button PERFORM ANALYSIS.
11. A box came up letting us know how many credits this would use. I clicked OK and waited for the email.
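If it helps to see the logic behind the AND rule and the cM range in steps 7 and 8, here is a small Python sketch. It is not Genetic Affairs code, just my illustration of the idea. The `Match` class, the function name, and all of the cM numbers are invented for the example, and I am assuming the cM range is applied to the amounts shared with the first person.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    name: str
    shared_cm: float


def in_common_matches(matches_a, matches_b, min_cm=20.0, max_cm=400.0):
    """Names on both match lists (logical AND), keeping only those
    whose shared cM with person A falls inside min_cm..max_cm."""
    in_range_a = {m.name for m in matches_a if min_cm <= m.shared_cm <= max_cm}
    names_b = {m.name for m in matches_b}
    return sorted(in_range_a & names_b)


# Invented match lists; the cM values are placeholders, not real data.
amy = [
    Match("Gina", 850),      # above the 400 cM maximum, so excluded
    Match("Marvin", 240),
    Match("Distant1", 12),   # below the 20 cM minimum
    Match("OnlyAmy", 90),    # not on Gina's list
]
gina = [
    Match("Marvin", 210),
    Match("Distant1", 25),
    Match("OnlyGina", 60),
]

print(in_common_matches(amy, gina))
```

An OR rule would be a set union (`|`) instead of the intersection (`&`), and a NOT rule would be a set difference (`-`).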
This is really pretty easy and intuitive once you are used to it. When the email came, I saved the attached zip file and unzipped it. In the extracted folder, I clicked on the folder called AutoCluster. In that folder there was an html file called AutoClust_Ancestry_Amysname_etc. I clicked on that to see the pretty clustering display in my browser.
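If you would rather not hunt through the folders by hand, a few lines of Python can list every html report inside the zip. This is only a sketch using Python's standard `zipfile` module; the function name is my own, and the exact file names inside your zip will depend on your analysis.

```python
import zipfile
from pathlib import PurePosixPath


def list_html_reports(zip_path):
    """Return the sorted paths of all .html files inside the zip file."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name
            for name in archive.namelist()
            if PurePosixPath(name).suffix.lower() == ".html"
        )
```

Calling `list_html_reports("my_analysis.zip")` (a made-up file name) returns the html files found in every folder of the zip, so you can see at a glance which reports you received.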
However what I was interested in was the trees, so rather than click to them from the cluster display, I went straight to the folder called tree and clicked on the html file called treecombined and saw this:
Looking at the first tree which had Marvin, the second cousin match, highlighted in a darker pink, I could see that Percy Cornelius was his great grandfather shared with at least one other DNA match. So that was where to start since Percy was likely to also be the great grandad of Amy and Gina. Using the save to tree in the tools menu on Ancestry from Percy’s facts page in Marvin’s tree, I saved that couple to Amy’s test tree. Using hints and records, I found that they had only five children. One was the ancestor of Marvin and another never married, so only three children to build the families of. One daughter had two sons who were the right ages and in the right places.
Next I built the tree of the husband of that daughter back to the mid 1800s and attached one of their sons to Amy’s tree as her father. Two days later the hints came in. The unexplained matches with trees now showed as having common ancestors on the presumed father’s paternal and maternal lines, all with appropriate cM amounts! Armed with his name, Amy was now able to confirm who her biological dad was when she next spoke to her mother.
Total time to solve this case was only 3 hours of my time (three days of elapsed time). Thank you Genetic Affairs! And thank you for donating those extra credits so I could create this post!