Abstract: To the Editor: We would like to comment on the Schork and Greenwood (2004) article dealing with the inherent “bias” toward the null hypothesis in the context of nonparametric linkage analysis. The authors point out that, in certain situations, a loss of evidence for linkage can result from the practice of assigning expected allele-sharing values to affected relative pairs that are uninformative for their identity-by-descent (IBD) status. They explained this by setting up a likelihood function and studying its properties by simulation, clearly illustrating the negative impact of using expected IBD values for uninformative pairs. However, we would like to point out that their likelihood does not reflect how the majority of nonparametric linkage analysis programs compute statistics in practice. Indeed, the “problem” has been known and well discussed for years. Some of the concerns we discuss here have also been raised by Cordell (2004).

Schork and Greenwood (2004) set up the likelihood formulation as follows. Let $n_i$ be the number of sib pairs sharing $i$ alleles IBD ($i$ = 0, 1, or 2). If all families had unambiguous IBD sharing, then the LOD score evaluated at the sharing vector $(p_0, p_1, p_2)$ is calculated as

$$\mathrm{LOD} = n_0 \log_{10}\frac{p_0}{0.25} + n_1 \log_{10}\frac{p_1}{0.5} + n_2 \log_{10}\frac{p_2}{0.25}. \tag{1}$$

In their model, Schork and Greenwood (2004) said that fully uninformative sibling pairs contribute 0.25, 0.50, and 0.25, respectively, to the counts $n_0$, $n_1$, and $n_2$ used in equation (1). If so, then the presence of uninformative sib pairs can lower the LOD score.

However, in most software implementations, expected allele-sharing values are not used to compute nonparametric LOD scores. For example, consider the maximum LOD score (MLS) statistic proposed by Risch (1990). Let $w_i$ be the probability of the observed marker phenotypes of the pair, given that they share $i$ alleles IBD ($i$ = 0, 1, or 2).
Then, the likelihood of the observed marker data for the pair is given by

$$L = \sum_{i=0}^{2} p_i w_i,$$

where $p_i$ is the posterior probability that the pair shares $i$ alleles IBD, given that both members of the pair are affected. Suppose, in addition, that we know that $n_{2,1}$ pairs share either 2 or 1 alleles, $n_{2,0}$ pairs share either 2 or 0 alleles, $n_{1,0}$ pairs share either 1 or 0 alleles, and $n_{\mathrm{un}}$ is the number of pairs that are fully uninformative. According to Risch (1990), the LOD score can be written as

$$\mathrm{LOD} = n_0 \log_{10}\frac{p_0}{0.25} + n_1 \log_{10}\frac{p_1}{0.5} + n_2 \log_{10}\frac{p_2}{0.25} + n_{2,1} \log_{10}\frac{p_2 + p_1}{0.75} + n_{2,0} \log_{10}\frac{p_2 + p_0}{0.5} + n_{1,0} \log_{10}\frac{p_1 + p_0}{0.75} + n_{\mathrm{un}} \log_{10}(1).$$

Because the final term is zero, fully uninformative pairs leave this LOD score unchanged. Maximizing this likelihood gives consistent and asymptotically unbiased estimates of the IBD-sharing probabilities. Cordell (2004) confirms this by simulation.

To verify that most implementations of nonparametric linkage statistics are not altered by uninformative families, we used FastSLINK (Ott 1989; Weeks et al. 1990; Cottingham et al. 1993) to simulate 200 fully genotyped affected–sib-pair families under disease model 1 of Schork and Greenwood (2004). The disease locus was completely linked to a two-allele marker with equally frequent alleles. We then used a variety of programs to compute linkage statistics on two data sets: (1) all 200 families and (2) the 147 families that remained after removal of the fully uninformative families. As shown in table 1, the majority of the linkage statistics, as implemented in widely used software, are exactly the same for the two data sets.

Table 1. Comparison of Linkage Statistics: Analyses Using All 200 Families and Using Only the 147 Informative Families

There are two statistics in table 1 that are less significant when all 200 families are used than when the uninformative families are removed: the GeneHunter NPL Sall Z score and the SIBPAL mean test Z value. In both of these cases, the reduction in evidence for linkage is caused by the use of the “perfect data approximation” to compute the variance of the statistics.
The “perfect data approximation” performs well if most of the families are informative for IBD sharing, but, as the proportion of uninformative families increases, it becomes increasingly conservative, leading to a loss of power (Kruglyak et al. 1996). In fact, the loss of power due to “bias” that Schork and Greenwood (2004) identify is, mathematically, exactly the same thing as the loss of power due to the “perfect data approximation.”

The negative effects of the “perfect data approximation” can be illustrated by a simple example. Consider the sib-pair IBD-sharing statistic

$$Z = \frac{\sum_{i=1}^{N} \left(\pi_i - \tfrac{1}{2}\right)}{\sqrt{\operatorname{Var}\left(\sum_{i=1}^{N} \pi_i\right)}},$$

where $\pi_i$ is the estimated proportion of alleles shared IBD for the $i$th affected sib pair. Suppose we have two data sets: (1) 50 fully informative affected–sib-pair families and (2) 50 fully informative and 50 uninformative families. Suppose $\pi_i$ in our fully informative families takes on the values 0, 1/2, and 1, with probabilities 1/8, 1/2, and 3/8, respectively, whereas $\pi_i$ is 1/2 in our uninformative families. The numerator of the statistic is identical for both data sets, with expected value 50 × 1/8 = 6.25, because each uninformative family contributes exactly zero. However, different approaches to computing the variance in the denominator can lead to different statistic values for the two data sets. The “perfect data approximation” assigns every family the null variance of a fully informative pair, 1/8, so the denominator is √(50/8) = 2.50 for the first data set and √(100/8) ≈ 3.54 for the second. The value of the statistic is therefore 2.50 for the first data set and 1.77 for the second data set, an undesirable reduction in the evidence for linkage. Use of the correct variance (given that the number of uninformative families remains constant) leads to statistic values of 2.50 for both data sets. Another option is to use the empirical variance, which reflects the alternative hypothesis rather than the null hypothesis and can be quite powerful; the empirical variance gives an expected IBD-sharing statistic of 2.50 for both example data sets. A score test using empirical variances was one of the best statistics in a recent evaluation of methods for QTL mapping using selected sibling pairs (T.Cuenco et al. 2003).
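The arithmetic of this example can be reproduced with a short Python sketch. It works with expected values rather than a simulated sample, and it assumes that the empirical variance is taken as the sum of squared deviations of $\pi_i$ from the null mean 1/2; that assumption is ours, chosen because it reproduces the value 2.50 quoted above. The function and variable names are illustrative only.

```python
from math import sqrt

NULL_MEAN = 0.5        # expected proportion of alleles shared IBD under no linkage
NULL_VAR_PAIR = 1 / 8  # null variance of pi for one fully informative sib pair

# Sharing distribution in the fully informative families, taken from the text.
PI_DIST = {0.0: 1 / 8, 0.5: 1 / 2, 1.0: 3 / 8}


def expected_statistics(n_informative, n_uninformative):
    """Expected value of the sharing statistic under three variance choices."""
    n_total = n_informative + n_uninformative

    # Uninformative families have pi = 1/2, so they add nothing to the numerator.
    mean_excess = sum(prob * (pi - NULL_MEAN) for pi, prob in PI_DIST.items())
    numerator = n_informative * mean_excess

    # Perfect data approximation: every family gets the informative null variance.
    perfect = numerator / sqrt(n_total * NULL_VAR_PAIR)

    # Correct null variance: uninformative families contribute zero variance.
    correct = numerator / sqrt(n_informative * NULL_VAR_PAIR)

    # Empirical variance (assumed form): expected sum of squared deviations
    # from the null mean; uninformative families again contribute zero.
    mean_sq_dev = sum(prob * (pi - NULL_MEAN) ** 2 for pi, prob in PI_DIST.items())
    empirical = numerator / sqrt(n_informative * mean_sq_dev)

    return {"perfect": perfect, "correct": correct, "empirical": empirical}


for label, (n_inf, n_un) in {"data set 1": (50, 0), "data set 2": (50, 50)}.items():
    stats = expected_statistics(n_inf, n_un)
    print(label, {name: round(value, 2) for name, value in stats.items()})
```

Only the “perfect data” value changes between the two data sets, which is the point of the example: the numerator is untouched by uninformative families, so any loss of evidence comes entirely from the variance used in the denominator.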
To avoid the negative consequences of using the “perfect data approximation,” Kong and Cox (1997) proposed a nonparametric statistic that performs much better in the presence of uninformative families. This statistic has been implemented in GeneHunter-Plus (Kong and Cox 1997), Allegro (Gudbjartsson et al. 2000), and Merlin (Abecasis et al. 2002) and, as illustrated by our simple simulation experiment in table 1, is insensitive to the presence of fully uninformative families. Similarly, in the context of the Haseman-Elston (HE) test (Haseman and Elston 1972), in which trait values are regressed on IBD sharing, the problem of using estimated IBD sharing has long been recognized. For example, Kruglyak and Lander (1995) developed a missing-value regression approach to compute a modified HE test that has much better behavior in the presence of uninformative families than the original test.

Although it is always useful to remind the scientific community that proper statistical analysis of linkage data requires deep insight into the potential weaknesses of the chosen methodology and software implementation, we feel that Schork and Greenwood’s concerns are overstated. Indeed, as we have shown, not only has this potential problem been known since at least the mid-1990s, but, in addition, the majority of implementations of linkage statistics in commonly used software do not suffer from this “bias” toward the null hypothesis in the presence of uninformative families. Furthermore, the use of highly informative markers in a multipoint analysis will result in very few families being fully uninformative for IBD sharing.
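The contrast drawn in this letter between the two likelihood formulations can also be sketched numerically. The Python example below evaluates both LOD scores at a fixed sharing vector: one enters fully uninformative pairs as fractional counts of 0.25, 0.50, and 0.25, as in the Schork and Greenwood (2004) model, and the other lets each such pair contribute a likelihood ratio of 1. The counts and the sharing vector are hypothetical values chosen for illustration, the function names are ours, and partially informative pairs are left out for brevity.

```python
from math import log10

NULL_SHARING = (0.25, 0.5, 0.25)  # null probabilities of sharing 0, 1, 2 alleles IBD


def lod_expected_counts(counts, p, n_uninformative=0):
    """LOD with uninformative pairs added to the counts as expected values."""
    adjusted = [counts[i] + n_uninformative * NULL_SHARING[i] for i in range(3)]
    return sum(adjusted[i] * log10(p[i] / NULL_SHARING[i]) for i in range(3))


def lod_likelihood(counts, p, n_uninformative=0):
    """LOD in which each uninformative pair has likelihood ratio 1."""
    informative = sum(counts[i] * log10(p[i] / NULL_SHARING[i]) for i in range(3))
    return informative + n_uninformative * log10(1.0)


# Hypothetical data: 100 informative pairs and a sharing vector away from the null.
counts = (10, 50, 40)
p = (0.125, 0.5, 0.375)

for n_un in (0, 50):
    print(
        n_un,
        round(lod_expected_counts(counts, p, n_un), 3),
        round(lod_likelihood(counts, p, n_un), 3),
    )
```

With no uninformative pairs the two functions agree. Adding 50 uninformative pairs lowers the expected-count version, because the term each such pair adds is non-positive at any sharing vector, while the likelihood version is unchanged.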

No "bias" toward the null hypothesis in most conventional multipoint nonparametric linkage analyses.
