After some further consideration, I think it's quite clear that the only probability mass function evaluated in the computation of $\mathcal{F}_{\mathrm{XEB}}$ is that of the classically computed ideal distribution, denoted $P(x_i)$ in the main paper.
This leads me to the conclusion that the phrasing of the following excerpt from section IV.C of the Supplemental Information (and especially the part underlined in red) is a bit unfortunate/misleading:
Just because the empirically measured bitstrings are coming from the uniform distribution doesn't mean that $P(x_i)$ is suddenly $1/2^n$ for all $i$. $P(x_i)$, as it goes into the calculation of $\mathcal{F}_{\mathrm{XEB}}$, is still the probability of sampling bitstring $x_i$ from the classically computed ideal distribution. This is in general not $1/2^n$.
The correct reasoning is that $\langle P(x_i) \rangle_i$ being $1/2^n$ (and hence $\mathcal{F}_{\mathrm{XEB}} = 0$) when the bitstrings $x_i$ are sampled from the uniform distribution follows from the definitions of expectation and probability mass function:
The definition of expected value is the following sum:

$$\langle P(x_i) \rangle_i = \sum_{x} P_{\mathrm{emp}}(x)\, P(x)$$

where $P(x)$ is the probability of bitstring $x$ being sampled from the classically computed ideal quantum circuit, $P_{\mathrm{emp}}(x)$ is the probability of $x$ being sampled from the non-ideal empirical distribution, and the sum runs over all $2^n$ possible bitstrings.
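To make the sum concrete, here is a minimal sketch in Python. The function name `expected_ideal_probability`, the list names `p_ideal` and `p_emp`, and the toy 2-qubit numbers are all made up for illustration; they are not taken from the paper.

```python
def expected_ideal_probability(p_ideal, p_emp):
    """Return sum over x of p_emp[x] * p_ideal[x].

    p_ideal[x]: probability of bitstring x under the ideal distribution.
    p_emp[x]:   probability of bitstring x under the empirical distribution.
    Both lists are indexed by the integer value of the bitstring.
    """
    if len(p_ideal) != len(p_emp):
        raise ValueError("both distributions must cover the same bitstrings")
    return sum(q * p for q, p in zip(p_emp, p_ideal))


# Hypothetical 2-qubit example (4 bitstrings); the numbers are illustrative only.
p_ideal = [0.5, 0.25, 0.125, 0.125]
p_emp = [0.4, 0.3, 0.2, 0.1]  # assumed noisy device distribution

mean_p = expected_ideal_probability(p_ideal, p_emp)
mean_p_perfect = expected_ideal_probability(p_ideal, p_ideal)
```

Note that only `p_ideal` is ever *evaluated* at a sampled bitstring; `p_emp` only decides how often each bitstring shows up.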
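When bitstrings are coming from the uniform distribution, the empirical probability $P_{\mathrm{emp}}(x)$ will always be $1/2^n$, so it can be broken out of the sum:

$$\langle P(x_i) \rangle_i = \sum_{x} \frac{1}{2^n} P(x) = \frac{1}{2^n} \sum_{x} P(x)$$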
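When you sum any probability mass function (of which $P(x)$ is one example) over all the possible outcomes, you by definition get 1, and thus:

$$\langle P(x_i) \rangle_i = \frac{1}{2^n}, \qquad \mathcal{F}_{\mathrm{XEB}} = 2^n \langle P(x_i) \rangle_i - 1 = 0$$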
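The same conclusion can be checked numerically. The sketch below is only a toy simulation under stated assumptions: the "ideal" distribution is built from normalized exponential random weights as a rough stand-in for a random circuit's output distribution (it is not computed from any actual circuit), and the helper names are hypothetical. It uses the estimator $\mathcal{F}_{\mathrm{XEB}} = 2^n \langle P(x_i) \rangle_i - 1$.

```python
import random


def random_ideal_distribution(n_qubits, rng):
    # Assumption: normalized exponential weights as a toy "ideal" distribution.
    weights = [rng.expovariate(1.0) for _ in range(2 ** n_qubits)]
    total = sum(weights)
    return [w / total for w in weights]


def xeb_fidelity(p_ideal, samples, n_qubits):
    # Average the IDEAL probability over the sampled bitstrings,
    # then apply F_XEB = 2^n * <P(x_i)> - 1.
    mean_p = sum(p_ideal[x] for x in samples) / len(samples)
    return 2 ** n_qubits * mean_p - 1


rng = random.Random(0)
n_qubits = 10
n_samples = 200_000
p_ideal = random_ideal_distribution(n_qubits, rng)

# Bitstrings drawn uniformly at random (integers 0 .. 2^n - 1).
uniform_samples = [rng.randrange(2 ** n_qubits) for _ in range(n_samples)]
# Bitstrings drawn from the ideal distribution itself, for comparison.
ideal_samples = rng.choices(range(2 ** n_qubits), weights=p_ideal, k=n_samples)

f_uniform = xeb_fidelity(p_ideal, uniform_samples, n_qubits)
f_ideal = xeb_fidelity(p_ideal, ideal_samples, n_qubits)
print(f"uniform sampling: F_XEB = {f_uniform:.4f}")
print(f"ideal sampling:   F_XEB = {f_ideal:.4f}")
```

With uniform sampling the estimate should land close to 0 even though `p_ideal` itself is far from uniform, which is exactly the point of the argument above. Sampling from `p_ideal` gives a clearly positive value.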