During this past week, Family Tree DNA has added a column to
the Y-DNA Matches feature called “Big Y-500 STR Differences.” There has been much said about this column,
and there is a great deal of confusion as to what it means. I’ve seen a few
argue a point that is different than the one I espouse. Hopefully, by the end
of this post, we can agree on this new set of data.
Background
The Owston/Ouston DNA project has a total of 33 Y-DNA participants
with 16 having taken a Big Y-500 test. The Big Y-500 participants range in relationship
from a second cousin pair to an estimated 13th cousin, twice removed
pair. Most relationships fall between eighth cousins and ninth cousins, once
removed. Our family charts can be found
at http://www.owston.com/family/owston/Owston_Family_Charts.pdf.
A truncated version of my personal Y-111 report with Big Y-500
data appears below and shows the genetic distance at 111 markers. I have removed all duplicate information and
personal identifications.
The STR Differences Column
The Big Y-500 STR Differences column is spurring all the recent interest. The
higher number is obviously the number of markers beyond 111 that can be
compared. These are the markers where neither compared participant has a
no-call. It is the smaller number,
however, that is generating a bit of disagreement.
Some believe this smaller number is the genetic distance for
the markers beyond 111; however, it is not. When looking at the raw data for all matches in a project,
one can deduce that this number is not the genetic distance.
The belief that this is genetic distance arises because the
two numbers coincide whenever every mismatched marker differs by only one
step. This is what is causing the confusion. Just
because it looks like a duck and waddles like a duck, it might be a goose.
What is it then? The
column simply shows how many markers are comparable
(the larger number) and how many of those markers differ (the smaller
number).
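To make the distinction concrete, here is a minimal Python sketch that counts comparable markers, mismatched markers, and a step-based distance for one pair of kits. The function name `compare_strs`, the marker values, and the placeholder marker `STR_X` are all hypothetical, and the distance here assumes a simple count of repeat differences, which may not match how Family Tree DNA scores every marker.

```python
from typing import Dict, Optional, Tuple

# Hypothetical results for two kits; None marks a no-call.
Kit = Dict[str, Optional[int]]


def compare_strs(kit_a: Kit, kit_b: Kit) -> Tuple[int, int, int]:
    """Return (compared, differences, stepwise_distance) for two kits."""
    compared = 0      # markers where both kits have a value (larger number)
    differences = 0   # markers whose values differ (smaller number)
    stepwise = 0      # sum of repeat differences (assumed distance model)
    for marker, a in kit_a.items():
        b = kit_b.get(marker)
        if a is None or b is None:
            continue  # a no-call in either kit makes the marker unusable
        compared += 1
        if a != b:
            differences += 1
            stepwise += abs(a - b)
    return compared, differences, stepwise


# Invented values: one one-step mismatch, one two-step mismatch, one no-call.
kit_a = {"DYS631": 10, "FTY510": 9, "DYS489": 12, "STR_X": None}
kit_b = {"DYS631": 11, "FTY510": 9, "DYS489": 14, "STR_X": 15}

print(compare_strs(kit_a, kit_b))  # (3, 2, 3)
```

In this invented pair the column would read "2 of 3", while the step-based distance is 3, because the two-step mismatch counts once as a difference but twice as distance.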
When I compare the actual genetic distance with the number in the Big Y-500 STR Differences column for all 120 relationships, only 52 have the same number for genetic distance beyond Y-111 and for the Big Y-500 STR Differences. For the remaining 68 (56.7%), the genetic distance is the larger of the two numbers.
I have provided the data in a PDF file on my website. The rows in lavender are those where the post 111 marker genetic distances and the STR differences columns do not match.
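The figures above can be checked with a few lines of Python. This is only a sketch of the arithmetic; it assumes the 120 relationships are every possible pairing of the 16 Big Y-500 participants, which is my reading of the numbers rather than something spelled out in the data file.

```python
from math import comb

participants = 16                   # Big Y-500 testers in the project
pairs = comb(participants, 2)       # every possible pairing of two testers
matching = 52                       # pairs where both numbers agree
mismatched = pairs - matching       # pairs where the numbers disagree
share = round(100 * mismatched / pairs, 1)

print(pairs, mismatched, share)  # 120 68 56.7
```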
What about Genetic Distance?
When the genetic distances from both sets of markers
(Y-111 and markers 112-561) are combined, the results are all over the road. I’ve seen this at 37, 67, and 111 markers as
well. The greatest combined GD occurs for a pair of seventh cousins and
a pair of eighth cousins. Both pairs exhibit a GD of 21. A GD of 8 has a relationship range of second
cousins, once removed to thirteenth cousins, once removed and everything in
between. Genetic distance is a poor
indicator of relationship, as mutations occur randomly.
Compared Numbers
As far as the compared markers (larger number in this
column), our project has a range of 364 (ninth cousins) to two pairs with 444 (8C1R
and an estimated 13C1R) for the possible 450 additional markers. The mean number
of usable markers is 418, while the median is 427 and the mode is 435.
The Overall Importance of this Data
How important is this data? This remains for you to discover
in your own family project. As for me, the additional STRs have not provided much
additional detail for our family. Of all the additional 450 markers, only three
are line specific.
- DYS631=11 (modal 10) is indicative of the Cobourg line; however, so is DYS643=11 (modal 10) found in the first 111, as well as the A10921 and A10923 SNPs.
- FTY510=10 (modal 9) is a signature marker for the Thornholme family, but so is DYS481=25 (modal 26) found in the first 67 markers, as well as the I-A15739 and I-A15740 SNPs.
- DYS489=13 (modal 12) is probably the most valuable of the three, as it is a defining STR marker for the Ganton Branch. While there are two line-specific markers for the Rillington Builders Line in the first 111, there are no other Ganton Branch-specific STRs besides DYS489. The Ganton Branch is also identified by the I-A10208 SNP.
Of the 450 markers, 147 exhibit no-calls. There are 260
no-calls in total in our project. Twenty-four of
the markers have at least one mutation present. For sixteen of these, only one person
in the project carries a mutation.
The Real Value of the Big Y-500
As I said earlier in the year, the greater value of the Big
Y-500 lies in the SNPs. For our family, the Big Y-500 cleared up three issues:
- It provided additional evidence that a spurious male was descended from a specific progenitor.
- It allowed us to determine which of two men with the same name was the ancestral father for a line of descent.
- It aided in correcting a mistake in our own genealogical research that occurred thirty years ago. It helped us revisit the documentation of a family in question, and in doing so, this documentation provided the same answers as were found among four matching SNPs.
My experiences may be different from yours, and I am hoping
that you will find the additional STRs helpful. Remember, the Big Y-500 STR Differences column is not a record of genetic distance; rather, it is the number of markers where a
mismatch occurs.
Addendum
I was alerted by a reader that Family Tree DNA had already posted an explanation of this column. Their explanation, which agrees with the above, is found below:
"In the matches section, the Big Y-500 STR Differences column is now displayed between Genetic Distance and Name columns.
Understanding the Big Y-500 STR Differences Column
This column displays the mismatch number and the number of comparable Big Y-500 STR markers between the kit and a match.
Let us say that for a match 2 of 395 is displayed in this column:
• 395 is the number of comparable markers between the kit and the match. In other words, both the kit and the match have STR values on 395 of the same Big Y-500 STRs. Note: On the CSV file, this value is displayed in the Big Y-500 STRs Compared column.
• 2 is the mismatch number. In other words, out of the 395 Big Y-500 STRs on which the kit and the match have values, there are 2 markers for which the kit and the match has a different value. Note: On the CSV file, this value is displayed in the Big Y-500 STR Differences column."
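For anyone working with these values outside the website, a short Python sketch can split a cell such as "2 of 395" into its two numbers. The function name `parse_str_differences` is hypothetical, and the sketch assumes the cell always follows the "N of M" pattern shown in the example above.

```python
from typing import Tuple


def parse_str_differences(cell: str) -> Tuple[int, int]:
    """Split a cell like '2 of 395' into (differences, compared)."""
    differences, compared = cell.strip().split(" of ")
    return int(differences), int(compared)


differences, compared = parse_str_differences("2 of 395")
print(differences, compared)  # 2 395
```

Note that the first value is only a count of mismatched markers; it says nothing about how many steps apart each mismatch is.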
Thank you for your work, and for presenting it so well that even a relative novice can understand it.
Glad to be of help and thanks for your positive comments, Mike.
Jim
I have one with 7 differences out of 432 STR markers - what does that mean, if anything, about the genetic closeness between my self and that individual - at 67 we show a GD of 5?
I wouldn't worry about it. I don't find a lot of use for the newer markers. What was your match at 111?