Reply to Brenner
Mark D. Shriver, Michael W. Smith, Li Jin
DOI: https://doi.org/10.1086/301894
1998-01-01
Abstract: To the Editor:

In response to the letter by Dr. Brenner (1998 [in this issue]), there are a number of issues open for discussion with regard to both our previously published article (Shriver et al. 1997) and, more generally, methods for estimation of biological ancestry. Dr. Brenner has identified some specific concerns with regard to our methods and results, which we address below. However, we remain confident of the main conclusions of our study: (1) the reliable estimation of ethnic affiliation by use of population-specific alleles (PSAs) is possible; and (2) many of the loci we identified will be useful markers for this effort.

We have examined the computer program that was used to calculate average single-locus log-likelihood levels and have found that Dr. Brenner is correct in his determination that alleles that were not observed were assigned a frequency of 1/(4n+1), instead of 1/(2n+1), where n is the number of individuals in the sample. The effect of this error was to inflate, to a small degree, the average single-locus and multilocus log-likelihood estimates. Since the same program was used to screen all the allele-frequency data sets, it is reasonable to conclude that the 40 loci with the highest log-likelihood levels, which we presented in tables 1 and 2 of our article (Shriver et al.
1997), are still good candidates for high performers among the loci tested.

Dr. Brenner is correct to recognize that our method for determining average single-locus log-likelihood ratios (LLRs) and multilocus ethnic-affiliation estimates is appropriate only when accurate allele-frequency data are available. We expect that, in the determination of biological ancestry, care will be taken to determine with precision the allele frequencies of potential contributing populations. If accurate allele frequencies are available (e.g., n > 200 individuals), no adjustment of the formula we presented will be needed. In cases for which frequency data are available only from small samples, the addition of one to the total allele count for each allele is a reasonable adjustment.

Dr. Brenner concludes that the differences in allele frequency that we observed between loci were largely due to bias resulting from small sample size. He bases this conclusion on a computer simulation in which he evidently resampled 1,000 times from frequency data on four short tandem-repeat identity markers. He then compared his results with the data in table 1 of our article (Shriver et al. 1997). We have two concerns with this approach. First, the 17 microsatellite PSAs that we presented in table 1 were culled from ∼350 loci (1,000 loci/population combinations were tested in the work that we reported). Second, the range of variation in the frequency differential used in Dr. Brenner's model was very limited and, with only four loci (LLR of .08–.4), could not have reflected naturally observed levels of variation in the allele-frequency differential.
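
To give a sense of scale for the frequency-floor correction discussed above, the following is a minimal sketch and not the program used in the study. It assumes that the contribution of a single allele to the log-likelihood ratio is the base-10 logarithm of the ratio of its frequencies in two populations; the function names `floor_frequency` and `allele_llr`, the sample size of 50 individuals, and the frequency of 0.30 are all hypothetical values chosen for illustration.

```python
import math


def floor_frequency(n_individuals, alleles_per_individual=2):
    """Frequency assigned to an allele that was not observed in the sample.

    With 2 alleles per individual this is 1/(2n+1), the intended value.
    """
    return 1.0 / (alleles_per_individual * n_individuals + 1)


def allele_llr(freq_a, freq_b):
    """Assumed per-allele log-likelihood ratio: log10 of the frequency ratio."""
    return math.log10(freq_a / freq_b)


n = 50            # hypothetical number of individuals sampled in population B
freq_in_a = 0.30  # hypothetical frequency of the allele in population A

intended = floor_frequency(n)   # 1/(2n+1)
erroneous = 1.0 / (4 * n + 1)   # 1/(4n+1), the value the program actually used

llr_intended = allele_llr(freq_in_a, intended)
llr_erroneous = allele_llr(freq_in_a, erroneous)

# The inflation does not depend on freq_in_a: it equals log10((4n+1)/(2n+1)).
inflation = llr_erroneous - llr_intended

print(f"LLR with 1/(2n+1): {llr_intended:.3f}")
print(f"LLR with 1/(4n+1): {llr_erroneous:.3f}")
print(f"Inflation per unobserved allele: {inflation:.3f}")
```

Under these assumptions the inflation for each unobserved allele is log10((4n+1)/(2n+1)), which stays below log10(2) for any sample size. How much of it carries through to a locus average depends on how alleles are weighted, which this sketch does not model.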
We are well aware of the bias resulting from small sample sizes, which is why we presented a list of 20 loci in table 1 and not just the best 10. In fact, we stated, “It should be noted that the markers on this list need to be typed in larger samples from different parts of the country, both to have more accurate allele-frequency estimates and to identify the most efficient set for EAE [ethnic-affiliation estimation]” (Shriver et al. 1997, p. 963). Recently, we typed nine dimorphic autosomal PSAs in large samples from >20 ethnographically defined populations, including 12 African-American population samples, and indeed found these markers to be useful for the estimation of ethnic affiliation and admixture (Parra et al. 1997; E. J. Parra, A. Marcini, L. Jin, J. Akey, M. Batzer, R. Cooper, T. Forrester, et al., unpublished data). Overall, and in view of Dr. Brenner's concerns, we still feel that this is a viable approach for the estimation of the biological ancestry of a person and that we have provided an important list of putative PSAs for this purpose.

Finally, in responding to Dr. Brenner's comments, we would like to suggest an alternative phrase that more accurately describes what is being estimated by means of the markers and methods that we, Dr. Brenner, and others have described. Ethnicity is a term that directly refers to the culture of a person or people and that encompasses their language, traditions, and national identity. Ethnicity is often related to biological ancestry but not always.
In the United States, awkward terms that combine both ethnicity and biological ancestry are sometimes used—for example, “non-Hispanic whites,” “black Hispanics,” and “non-Hispanic blacks.” Modern populations are highly complex, and the classification of genetic differences among individuals and populations is a potentially sensitive issue. We therefore propose and intend to use the term “estimation of biological ancestry,” rather than “ethnic-affiliation estimation,” to describe the methods that we have presented.

References

Brenner, CH. Difficulties in the estimation of ethnic affiliation. Am J Hum Genet. 1998; 62: 1558–1560
Parra, E, Marcini, A, Akey, J, Ferrell, RE, and Shriver, MD. A systematic study of African-American admixture using population-specific alleles. Am J Hum Genet. 1997: A17
Shriver, MD, Smith, MW, Jin, L, Marcini, A, Akey, JM, Deka, R, and Ferrell, RE. Ethnic-affiliation estimation by use of population-specific DNA markers. Am J Hum Genet. 1997; 60: 957–964