Tuesday, January 22, 2019

My Hot 100



The other day, Blaine Bettinger looked at his top 50 matches across four testing companies. He was careful to note that his analysis included only individuals whom he hadn’t personally tested. The results are posted in the Facebook group Genetic Genealogy Tips & Techniques. I was impressed with this tactic.

Taking advantage of a snowy day, I decided to look at my top 100 matches, excluding anyone I had tested or influenced to test. More than 60 family members fall into that excluded group across all platforms, with some having their results uploaded to FTDNA and MyHeritage. Since I was involved with the music business early in my career, I named these matches my "Hot 100" as an homage to Billboard.

I decided to look at these 100 matches two ways: by the testing company and by the possible connection via one of my grandparents.  Match sizes ranged from 43cM to 315cM. 

BY COMPANY

Ancestry

Ancestry’s large customer base is probably the reason the bulk of my matches were found in their database. Seventy-four of my Hot 100 tested with Ancestry, with 70 of those being unique to their database. Both my mother and I were Ancestry beta testers in 2012. I added my wife, who had tested previously at 23andMe. Because she was adopted, I was hoping to find a relative of her birth father, as we knew her birth mother. We found her paternal first cousin in 2018 via an Ancestry match. We also found several maternal first cousins. In addition to the three kits that I manage, there are five close relatives I’ve influenced to test who are on Ancestry.

23andMe

Fifteen of my Hot 100 were found at 23andMe, with 12 being unique to this company. I began testing with 23andMe in 2010 and have 27 kits on this platform. Additionally, there are three other customers whom I’ve influenced to test. Up until 2013, 23andMe was my autosomal testing company of choice; however, the subscription pricing model, which they adopted several years ago and later dropped, was my reason for moving to FTDNA as my primary testing source.

MyHeritage

The newest autosomal company in the mix, MyHeritage, produced 13 matches with 10 being unique to this company. I have one test and several transfers from my surname lineage at MyHeritage. Several of these unique participants are related to me twice. I will further address this below.

FamilyTreeDNA (FTDNA)

FamilyTreeDNA produced five matches, with only two being unique to the FTDNA database. I often test family members with FTDNA, and the bulk of my new participants test there. Because I often test the males of my surname for Y-DNA, it is important to take care of both the autosomal and Y-DNA testing with one kit. All my 23andMe tests and one Ancestry test have been transferred to FTDNA, making a total of 58 kits from my family on FTDNA.

I had someone ask me the other day why there were so few of my matches from FamilyTreeDNA. I was surprised by the low number as well. My guess is that both Ancestry and 23andMe do a considerable amount of advertising on TV, radio, and online and this has a profound influence on consumer behavior.  MyHeritage has done some online advertising as well. I can’t say that I’ve ever seen or heard an ad for FTDNA.  

Multiple Testing Companies/GEDMatch

Like Blaine, I didn’t find many individuals in my Hot 100 who tested at multiple companies: only six in total.  One tested at 23andMe and FTDNA, one tested at Ancestry and MyHeritage, one tested at Ancestry and 23andMe, one tested at 23andMe and MyHeritage, and two tested at three companies: one with Ancestry, MyHeritage, and FTDNA and the other with Ancestry, 23andMe, and FTDNA. 
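
For anyone who wants to repeat this tally, the total, unique, and multi-company counts can be worked out from a simple list of matches. The sketch below is only an illustration: the match IDs and their company sets are invented and are not my actual Hot 100.

```python
from collections import Counter

# Hypothetical matches: ID -> set of companies where that person tested.
matches = {
    "match_01": {"Ancestry"},
    "match_02": {"Ancestry", "MyHeritage"},
    "match_03": {"23andMe"},
    "match_04": {"23andMe", "FTDNA"},
    "match_05": {"Ancestry", "23andMe", "FTDNA"},
    "match_06": {"MyHeritage"},
}

totals = Counter()  # matches found at each company
unique = Counter()  # matches found at that company and nowhere else

for companies in matches.values():
    totals.update(companies)
    if len(companies) == 1:
        unique.update(companies)

# Anyone appearing at more than one company is a multi-company tester.
multi = sum(1 for companies in matches.values() if len(companies) > 1)

for company in sorted(totals):
    print(f"{company}: {totals[company]} total, {unique[company]} unique")
print(f"Tested at multiple companies: {multi}")
```

Keeping each match as a set of companies means a person who tested in three places is counted once per company in the totals but never as unique.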

As far as those uploading to GEDMatch, I have five individuals I haven’t tested who match at 43cM or higher. Unfortunately, I cannot identify three of these individuals due to the aliases being used.

BY CONNECTION

In this exercise, I decided also to determine the common ancestor of my Hot 100 matches. While this was not always possible, I was able, in most cases, to identify the grandparent through which the connection was likely to have occurred.  These assumptions are not conclusive, as the matches may be through a completely different line.


Unknowns

Twelve percent of my matches were identified as unknown paternal (3%) or unknown maternal (9%). Further delineation of the relationships was impossible because the subjects did not match others in my family. Whether these individuals were classed as maternal or paternal was determined by whether or not they matched my mother. My father died decades before the advent of commercial DNA testing.

All 12 matched me and at least one of my brothers. Because they do not match my known second cousins and half-cousins (or anyone else closely related to me and my siblings), it appears that the common ancestor might be further back in time than for others in the Hot 100.

Paternal Grandfather

The connections to my paternal grandfather, George Hood Owston (1879-1924), are the smallest known group overall. While I’ve concentrated my research with targeted testing on my surname lineage, very few (10%) who are related through my grandfather have tested on their own. Four of those are descended from my grandfather’s brother, Ovington French Owston. 

My paternal grandfather: George H. Owston, circa 1905

Paternal Grandmother

It doesn’t surprise me that the majority of the Hot 100 are related to me through my paternal grandmother, Lora Gardner Day (1874-1953).  I realized this with my first test through 23andMe in 2010 and chalked it up to her Colonial New England ancestry. Many of these matches may be related to me through several different colonial lines. 

In addition, seven of these matches are through my grandmother’s first marriage which produced three daughters who lived to the age of majority. My father was from her second marriage.  Most of these matches are descended from my father’s half-sister Ruth while one each is through his half-sisters Nathalie and Blanche.  These rank at 1, 2, 4, 6, 17, and 84. 

My father with his mother, Lora Day Owston, circa 1924
 
Because of the age difference between my dad and his sisters, and with me being the youngest grandchild of my grandmother (born after her death), I only met six of my 11 half-cousins, who were born between 1917 and 1943. Only three are still living. Facebook has opened the possibility of knowing my cousins’ children and grandchildren from this side of the family.

Mixed Paternal Grandparents

Some of my closest matches come from a unique relationship that connects to three of my great-grandparents. My grandmother’s sister, Susie Eva Day (1871-1946), was married to my grandfather’s uncle, John Freemont Merriman (1862-1941), the brother of Mary Emma Merriman Owston (1856-1895). This couple influenced my grandparents’ marriage in 1911. John and Susie’s 12 children were first cousins to my father via his mother and first cousins, once removed via his father. These individuals are among my strongest matches and many have tested at MyHeritage. Descendants of John and Susie Merriman ranked at 3, 7, 16, 18, 20, 52, and 62.

Maternal Grandfather

Fifteen percent of the total are connected to my maternal grandfather, John Alva Brakeall (1883-1957). Many of the distant matches may share larger amounts of DNA with me because they are related to me through multiple lines, as my great-grandparents were second cousins. Most of these are more closely related through my great-grandfather, but two recent testers are more closely related to me via my great-grandmother’s brother, John Staley.

Me and my maternal grandfather, Alva Brakeall, 1957

Maternal Grandmother

Finally, the biggest surprise is the number of matches through my maternal grandmother, Rose Pauline Schad (1885-1976). Up until recently, there were no matches that could be connected through her lineage, as her family were our most recent immigrants to North America. Additionally, my grandmother was 7/8 German and 1/8 French Waldensian, descended from Waldenses who settled in Württemberg in the 1690s and who didn’t intermarry with local Germans until the early 19th century. Most of these matches are descended from the sisters of my grandmother or sisters of my great-grandfather. Only one can be traced to our Waldensian connection.

Me and my maternal grandmother, Rose Schad Brakeall, 1974

CONCLUSION

This was an interesting exercise that I hadn’t attempted in the past, and it opened my eyes to the number of individuals who share DNA with me and to our connections. It was also helpful to see the importance of testing at Ancestry, as the bulk of my matches came from this company; however, 26% of my matches did not test at Ancestry. Given this, it is important to test at all autosomal companies so that you don’t miss any matches. Of the 19 individuals that matched at 100cM or higher, three tested only at MyHeritage and one tested at 23andMe. If you are only testing at one company, you may be missing important matches.

Saturday, December 1, 2018

Understanding FTDNA's New Big Y-500 Differences Column


During this past week, Family Tree DNA has added a column to the Y-DNA Matches feature called “Big Y-500 STR Differences.”  There has been much said about this column, and there is a great deal of confusion as to what it means. I’ve seen a few argue a point that is different than the one I espouse. Hopefully, by the end of this post, we can agree on this new set of data.  

Background

The Owston/Ouston DNA project has a total of 33 Y-DNA participants, with 16 having taken a Big Y-500 test. The Big Y-500 participants range in relationship from a second cousin pair to an estimated 13th cousin, twice removed pair. Most relationships fall between eighth cousins and ninth cousins, once removed. Our family charts can be found at http://www.owston.com/family/owston/Owston_Family_Charts.pdf.

A truncated version of my personal Y-111 report with Big Y-500 data appears below and shows the genetic distance at 111 markers.  I have removed all duplicate information and personal identifications. 


The STR Differences Column

The Big Y-500 STR Differences column is spurring all the recent interest. The higher number is obviously the number of markers beyond 111 that can be compared. These are the markers where neither compared participant has a no-call.  It is the smaller number, however, that is generating a bit of disagreement.

Some believe this smaller number is the genetic distance for the markers beyond 111; however, it is not. When looking at the raw data for all matches in a project, one can deduce that this number is not the genetic distance. 

The belief that this is genetic distance arises because the number equals the genetic distance whenever every mismatched marker differs by only one step. In effect, the column counts each mismatch as one, as the infinite alleles model does, no matter how many steps apart the values are. This is what is causing the confusion. Just because it looks like a duck and waddles like a duck, it might be a goose.

What is it then? The column simply gives the user the opportunity to see how many markers are comparable (the larger number) and how many of those markers differ (the smaller number).
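
To make the distinction concrete, here is a minimal sketch of the two counts. It assumes a simple stepwise calculation of genetic distance for every marker, which is a simplification and not necessarily how FTDNA computes it. The function name, the repeat values, and the EXAMPLE_NOCALL marker are all invented for illustration.

```python
from typing import Dict, Optional, Tuple

# Marker name -> repeat count; None marks a no-call. All values are invented.
Haplotype = Dict[str, Optional[int]]

def compare_strs(a: Haplotype, b: Haplotype) -> Tuple[int, int, int]:
    """Return (compared, differences, stepwise_distance) for two kits."""
    compared = differences = distance = 0
    for marker in a.keys() & b.keys():
        x, y = a[marker], b[marker]
        if x is None or y is None:
            continue  # skip markers where either kit has a no-call
        compared += 1
        if x != y:
            differences += 1        # one per mismatched marker
            distance += abs(x - y)  # one per repeat of difference
    return compared, differences, distance

kit_a = {"DYS631": 10, "FTY510": 9, "DYS489": 12, "EXAMPLE_NOCALL": None}
kit_b = {"DYS631": 11, "FTY510": 9, "DYS489": 14, "EXAMPLE_NOCALL": 12}

compared, differences, distance = compare_strs(kit_a, kit_b)
print(f"{differences} of {compared}; stepwise distance {distance}")
# prints "2 of 3; stepwise distance 3"
```

The two-step difference on DYS489 counts once toward the differences but twice toward the distance, which is why the two numbers part company.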

When I compare the actual genetic distance with the number in the Big Y-500 STR Differences column for all 120 relationships, only 52 have the same number for genetic distance beyond Y-111 and for the Big Y-500 STR Differences. The remaining 68 (56.7%) have a genetic distance larger than the number in the column.
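
For those wondering where the 120 comes from, it is simply every distinct pair among the 16 Big Y-500 testers. A quick check of the arithmetic, with variable names of my own choosing:

```python
from math import comb

participants = 16              # Big Y-500 testers in the project
pairs = comb(participants, 2)  # every distinct pair of testers
same = 52                      # pairs where both numbers agree
larger = pairs - same          # pairs where genetic distance is larger

print(pairs, larger, f"{larger / pairs:.1%}")
# prints "120 68 56.7%"
```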

I have provided the data in a PDF file on my website. The rows in lavender are those where the post 111 marker genetic distances and the STR differences columns do not match.

What about Genetic Distance?

When combining the genetic distance from both sets of markers (Y-111 and markers 112-561), the results are all over the road. I’ve seen this at 37, 67, and 111 markers as well. The greatest GD across both sets of markers occurs for a pair of seventh cousins and a pair of eighth cousins. Both pairs exhibit a GD of 21. A GD of 8 has a relationship range of second cousins, once removed to thirteenth cousins, once removed and everything in between. Genetic Distance is a poor indicator of relationship, as mutations occur randomly.

Compared Numbers

As far as the compared markers (the larger number in this column), our project has a range of 364 (ninth cousins) to two pairs with 444 (8C1R and an estimated 13C1R) out of the possible 450 additional markers. The mean number of usable markers is 418, while the median is 427 and the mode is 435.
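
If you want to produce the same summary for your own project, Python’s standard statistics module will do it. The list below is a handful of made-up compared-marker counts, not the 120 values from our project, so the results will not match the figures above.

```python
from statistics import mean, median, mode

# Hypothetical compared-marker counts for eight pairs (not project data).
compared_counts = [364, 410, 427, 435, 435, 444, 444, 435]

print("range: ", min(compared_counts), "to", max(compared_counts))
print("mean:  ", mean(compared_counts))
print("median:", median(compared_counts))
print("mode:  ", mode(compared_counts))
```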

The Overall Importance of this Data

How important is this data? This remains for you to discover in your own family project. As for me, the additional STRs have not provided much additional detail for our family. Of all the additional 450 markers, only three are line specific. 

DYS631=11 (modal 10) is indicative of the Cobourg line; however, so is DYS643=11 (modal 10) found in the first 111, as well as the A10921 and A10923 SNPs.  

FTY510=10 (modal 9) is a signature marker for the Thornholme family, but so is DYS481=25 (modal 26) found in the first 67 markers, as well as the I-A15739 and I-A15740 SNPs. 

DYS489=13 (modal 12) is probably the most valuable of the three, as it is a defining STR marker for the Ganton Branch. While there are two line-specific markers for the Rillington Builders Line in the first 111, there are no other Ganton Branch-specific STRs besides DYS489. The Ganton Branch is also identified by the I-A10208 SNP.

Of the 450 markers, 147 exhibit no-calls. There are 260 no-calls in total in our project. Twenty-four of the markers have at least one mutation present. For sixteen of those, only one participant shows a mutation.

The Real Value of the Big Y-500

As I said earlier in the year, the greater value of the Big Y-500 lies in the SNPs. For our family, the Big Y-500 cleared up three issues:

  • It provided additional evidence that a spurious male was descended from a specific progenitor.
  • It allowed us to determine which of two men with the same name was the ancestral father for a line of descent.
  • It aided in correcting a mistake in our own genealogical research that occurred thirty years ago. It helped us revisit the documentation of a family in question, and in doing so, this documentation provided the same answers as were found among four matching SNPs.

My experiences may be different from yours, and I am hoping that you will find the additional STRs helpful. Remember, the Big Y-500 STR Differences column is not a record of genetic distance; rather, it is the number of markers where a mismatch occurs.

 

Addendum


I was alerted by a reader that Family Tree DNA had already posted an explanation of this column.  Their explanation, which agrees with the above, is found below: 


"In the matches section, the Big Y-500 STR Differences column is now displayed between Genetic Distance and Name columns.
Understanding the Big Y-500 STR Differences Column
This column displays the mismatch number and the number of comparable Big Y-500 STR markers between the kit and a match.
Let us say that for a match 2 of 395 is displayed in this column:
• 395 is the number of comparable markers between the kit and the match. In other words, both the kit and the match have STR values on 395 of the same Big Y-500 STRs. Note: On the CSV file, this value is displayed in the Big Y-500 STRs Compared column.
• 2 is the mismatch number. In other words, out of the 395 Big Y-500 STRs on which the kit and the match have values, there are 2 markers for which the kit and the match has a different value. Note: On the CSV file, this value is displayed in the Big Y-500 STR Differences column."
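
For readers who work from the CSV download, a short sketch of reading the two columns named in the explanation may help. The sample rows and the Name column are my own assumptions for illustration; a real export has many more columns.

```python
import csv
import io

# Hypothetical two-row export; a real file has many more columns.
sample = io.StringIO(
    "Name,Big Y-500 STR Differences,Big Y-500 STRs Compared\n"
    "Match A,2,395\n"
    "Match B,0,427\n"
)

rows = []
for row in csv.DictReader(sample):
    differences = int(row["Big Y-500 STR Differences"])
    compared = int(row["Big Y-500 STRs Compared"])
    rows.append((row["Name"], differences, compared))
    # Rebuild the "2 of 395" form shown on the matches page.
    print(f"{row['Name']}: {differences} of {compared}")
```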