Thursday, July 28, 2016

Statistical DNA percentages vs. real life

You get 50% of your DNA from each of your parents*, which in turn means you get 25% from each of your grandparents, 12.5% from each of your great-grandparents, 6.25% from each of your 2nd great-grandparents, and so on. However, these are only simple mathematical calculations. DNA is much more complicated than that, and real life doesn’t always match the math. For example:

You get 50% of your DNA from your mother and 50% from your father, but which 50% of their 100% you get is a crapshoot. Your parents also got 50% of their DNA from each parent but again, which 50% was a crapshoot. Because the DNA is mixed and diluted every time it is passed down, you can’t use a simple mathematical equation to calculate how much DNA you got from a certain person. The only time the percentages will be on the money is the 50% you got from mom and the 50% you got from dad, because you inherit one chromosome of each pair from each of them.
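To make the crapshoot a little more concrete, here is a rough Python sketch. It is a toy, not how any testing company or ISOGG calculates anything. The names `expected_share` and `simulate_share` and the `blocks` setting are made up for this example, and the big assumption is that DNA comes down in 100 equal, independent pieces. Real DNA is passed down in fewer, longer, linked segments, so the real spread is likely wider than what this shows.

```python
import random


def expected_share(generations_back):
    """Simple-math share from one ancestor: 50%, 25%, 12.5%, ..."""
    return 0.5 ** generations_back


def simulate_share(generations_back, blocks=100, rng=random):
    """Toy model: one random draw of your share from a single ancestor.

    ASSUMPTION: the genome is `blocks` equal pieces inherited independently.
    Real DNA is passed down in far fewer, longer, linked segments.
    """
    if generations_back < 1:
        raise ValueError("generations_back must be at least 1")
    # One copy of every block comes from the parent on this line,
    # so a parent always accounts for exactly half.
    kept = blocks
    # Each generation further back, a block has a 50/50 chance of
    # having come from the ancestor on this line.
    for _ in range(generations_back - 1):
        kept = sum(1 for _ in range(kept) if rng.random() < 0.5)
    return 0.5 * kept / blocks


if __name__ == "__main__":
    rng = random.Random(2016)  # fixed seed so the run is repeatable
    for n in (1, 2, 5):
        draws = [simulate_share(n, rng=rng) for _ in range(1000)]
        # generations back, simple-math share, lowest and highest simulated share
        print(n, expected_share(n), min(draws), max(draws))
```

For a parent (1 generation back) every draw comes out at exactly 0.5. Further back, the draws scatter around the simple-math figure, which is the whole point of this post.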

Here is a great chart from ISOGG that gives more of a real life expectation of what you might or might not actually see. Even if you should mathematically inherit X amount of DNA from an ancestor, that doesn’t mean you will.

[Chart: Ancestor relationships. Source: Cousin Statistics, ISOGG Wiki page]


The chart shows you the chance of not inheriting any DNA from a particular ancestor. What this chart doesn’t show you is how much DNA you could inherit between that zero and the mathematical calculation. In other words, the mathematical percentage is an average, not a guarantee: what you actually inherit from a distant ancestor may well be lower, and it can also be higher.

Here is a real life example and one that many people are pursuing, Native American (NA) ancestry. My 3rd great-grandmother was a Choctaw Indian (I have a paper trail). Simple math would say that I would inherit 3.125% of her DNA and my uncle, who has also tested, would get 6.25%. At this level I only have a slight chance of not inheriting any DNA from her and my uncle has a 0% chance. So far so good. However, with the way that DNA mixes and dilutes as it comes down the line, I can have anything from around the expected 3.125% down to 0%, and my uncle can have anything from around 6.25% down to just above 0%.

My uncle has 0.62%
I have 0.57%

Is this still reasonable? Yes it is. What is interesting is that I have almost as much as my uncle has. I wish I could have tested my dad because I would bet he got a bigger chunk than my uncle did. I also wish I could test my other living uncle but he isn’t interested in testing. I would like to see how much he ended up with. The uncle that won’t test has a granddaughter that did test. Mathematically she could have 1.5625% of her 4th great-grandmother’s DNA. She has a little over a 1/2% chance of inheriting no DNA. Her number should fall between these two. She has 0.19%.
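For anyone who wants to line up their own numbers the same way, here is a small Python sketch that compares the three test results above with the simple-math figures. The function names are made up for this example. Keep in mind that the observed numbers are admixture estimates, not a direct measurement of one ancestor's DNA, so this is only a rough comparison.

```python
def expected_percent(generations_back):
    """Simple-math percentage from one ancestor, e.g. 5 generations -> 3.125."""
    return 100 * 0.5 ** generations_back


def fraction_of_expected(observed_percent, generations_back):
    """How much of the simple-math figure actually showed up (1.0 = all of it)."""
    return observed_percent / expected_percent(generations_back)


# (tester, generations back to the Choctaw ancestor, observed Amerindian %)
# Observed values are the ones quoted in this post.
testers = [
    ("my uncle", 4, 0.62),
    ("me", 5, 0.57),
    ("uncle's granddaughter", 6, 0.19),
]

for name, generations, observed in testers:
    print(
        f"{name}: expected {expected_percent(generations)}%, "
        f"observed {observed}%, "
        f"ratio {fraction_of_expected(observed, generations):.2f}"
    )
```

All three of us come in at roughly 10% to 18% of the simple-math figure, which is one reason I say the numbers are consistent with each other.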

I am waiting for DNA from a first cousin to add to my NA pool as well as the DNA from my stepmother and her brother who both descend from my 3rd great-grandmother’s brother.  This is pretty exciting because I will have DNA from a different line to compare to.  I still need to map out the exact segment matches but I am off to a good start.  There is always a chance of a false positive but I don’t think this will be the case.

For genealogists working with autosomal DNA this next chart from ISOGG might be of more interest. This will explain why you don’t share as much DNA (or you share no DNA) with someone you have a paper trail for as a cousin match.

[Chart: Cousin relationships]


Here are the mathematical calculations for familial matches. This time it will be expressed in centimorgans (cM) along with the percentages. The chart is too big for the blog so go to ISOGG's Autosomal DNA Matches and scroll down to the Table, “Average autosomal DNA shared by pairs of relatives, in percentages and centiMorgans”


Now compare those mathematical calculations to what Blaine Bettinger actually found using real life data. Notice that in Blaine’s data there are averages and ranges.

[Chart: Shared cM Project. Source: Update to the Shared cM Project]

Blaine updates his chart as he gets additional data in.  The more data, the more accurate. 


Nutshell version – You can’t rely on statistical calculations to rule someone in or someone out as a DNA match at a particular relationship. 

* For practical purposes it is a 50/50 split, but there are slight variations due to the Y chromosome being smaller than the X chromosome and due to the possibility of endogamy, where your parents have a common ancestor somewhere down the line and actually share some DNA with each other.

NOTE: Gedmatch’s Dodecad World9 Admixture algorithm was used to give the percentage of Amerindian in the DNA testers. Algorithms are updated periodically and all of these numbers could change.


Copyright © 2016 Michèle Simmons Lewis


  1. Michele. I'm reading this on my iPhone and can't wait to take a look at it on a larger screen. You've given a very easy to understand narrative and great visuals. I keep attending DNA classes and reading and I think some of it is finally sinking in. Thank you for this post.

  2. Wonderful post with wonderful explanations. Thanks!

  3. Thanks Michele,
    Very informative. How best to choose which test? Ancestry (because of the $ and familiarity) and then import raw data into ? OR ?
    I am just starting to seriously explore this with only my mother's generation left...better hurry up :)
    Christine Gray

    1. Which test you take depends on what your goals are. The three main tests, yDNA, mtDNA and atDNA (autosomal) have three different purposes. The one most people are interested in (and the only test that Ancestry offers since that is the company you mentioned) is atDNA. In a nutshell, atDNA will show you who you are related to from within a pool of other testers. atDNA goes across all lines. Males and females can take it. Its reach is about 6 generations back in time (give or take). It will also give your ethnicity percentages.

      If you do take an atDNA test (no matter if it is with Ancestry, 23andMe or FTDNA) I highly recommend you upload the raw data to Gedmatch. You will have a much bigger pool of testers there (from all three companies, not just one) and you will have more tools to play with.

  4. Michele, you've done it again: hit us right where we need it, with information that is understandable and useful! Thank you. This will be a great help for my cousins and me, who share a well-documented paper trail, but seemingly no measurable DNA. :)

    I'm curious how you were able to arrive at actual percentages of DNA you share with your 3rd-great grandmother (0.57%) and others? Was this through a test other than autosomal? How could I calculate percentages like this?

    1. Great question! I don't know that this came from my 3rd great-grandmother, at least not yet. Since it is a low percentage it is possible it is just "noise" or a false positive but I don't think it is because the numbers are consistent with what I should be seeing and so far everyone tested has consistent numbers. However, that is only half the battle. There are two more things I am doing. I have tested two people that descend from the brother of my 3rd great-grandmother (I have the results of one, waiting on the other one). Here's the rub, I have always suspected that my 3rd great-grandmother and this other man were brother and sister but I have never been able to prove it so I am working on a double brick wall. I am trying to prove that the two people were siblings AND that is where our NA blood comes from.

      The DNA that has come back from one of the descendants is a 42 cM match to one of my cousins but not to me or my uncle. I am hoping that this tester's brother will match me or my uncle. Right now I can't say that the match between the tester and my first cousin isn't on his MOTHER's side. Still working on that.

      The other thing I am working on is chromosome mapping. Since this is such a small percentage I really need to show that everyone is matching in the same place. Even though the tester matches my cousin, she could still match me and my uncle, only on a segment small enough that it isn't being picked up. I can run some tests through Gedmatch looking for those smaller segments. Anyway, once I have all my DNA samples back I will be working on seeing where the NA DNA is on the chromosomes and whether we all match on those segments.

  5. In my case, I'm amazed that I ended up with as much Native DNA as I did. I have one great-great-grandma who was Metis (French & Native, in her case Cree & Chipewyan). Her parents and grandparents were all about 1/2. Splitting it perfectly in 1/2 from her 1/2 down to me makes me a little more than 3% Native American on paper. I've tested my parents & myself at Ancestry. They show me as 3% Native, my dad as 4%. (I am 100% positive none is from my mom's side as she is of German descent 2 & 3 gens back). GedMatch (Dodecad) shows my Amerindian as 2.61% along with 1.83% Siberian (my guess is this is the Chipewyan as Ancestry's little write-up on Native DNA said there were 3 waves of migrations and Chipewyans were part of the 2nd wave). My dad shows as 4.53% Amerindian and 2.03% Siberian. I am amazed that I got about 1/2 of that small amount of Native DNA and yet I know I don't have much if any DNA left from a Lockman 4th great-grandfather as I don't match any of his descendants yet my dad & great-aunt do. My husband keeps showing up & then disappearing again from certain DNA Circles on Ancestry. It took me the longest time to figure out that he was matching many descendants but it was too many generations back to put him in their Circle. And that was all him, no other family members tested on that line. Amazing when you can figure out exactly which ancestors you still carry DNA from!!