Individuals from less distinct reporting groups are more likely to be removed from the analysis because they fractionate their posterior probability among multiple potential source populations, and their maximum assignment probability is more likely to fall below the threshold. Thus, the contributions of reporting groups that are less distinct can be underestimated in the mixture. Whether this bias affects estimates of population-specific traits depends on whether the trait is negatively or positively correlated with genetic similarity among reporting groups. In the positive case, trait estimates from IA are expected to be unbiased. Even though we might underestimate the true number of fish from genetically similar reporting groups, we are unlikely to end up with a set of individuals that is biased with respect to the traits they express. Precision might suffer from the reduced sample size, but trait estimates should not be biased in the same way that estimates of population mixture composition can be.
Although bias was not necessarily expected in comparing IA and BMM, our results underscore a classical problem in biological sampling: the need to balance the accuracy of IA against sample size and therefore statistical power. On one hand, IA with a high MAP-rule threshold will better ensure that individuals included in the analysis are accurately classified. On the other hand, if mixture samples are small to begin with, as they often are in ecological studies, then the number of individuals with assignment probabilities above the MAP threshold can be quite small. If one lowers the threshold to include more individuals, assignment error and therefore trait estimation bias can increase. This is less of a problem if the trait distribution is positively correlated with the genetic structure of reporting groups, but if genetically similar reporting groups have very different trait values, then misassignments will bias estimates of population-specific traits. We saw signs of such bias in our simulation studies, in which we created trait distributions having both positive and negative correlation with genetic distance.
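The trade-off between the MAP threshold, the number of retained individuals, and trait bias can be illustrated with a toy simulation. The sketch below is not the simulation used in this study. It assumes three hypothetical reporting groups (A and B genetically similar, C distinct), a single one-dimensional "genetic score" with unit-variance normal likelihoods in place of multilocus genotypes, equal priors, and invented trait means. The names `GENETIC_MEANS`, `CONCORDANT`, `DISCORDANT`, `posterior`, and `simulate` are all illustrative.

```python
import math
import random

# Hypothetical group centres on a 1-D "genetic score" axis (an assumption):
# A and B are genetically similar, C is distinct.
GENETIC_MEANS = {"A": 0.0, "B": 1.0, "C": 6.0}

# Invented trait means. CONCORDANT: similar groups have similar traits.
# DISCORDANT: the genetically similar groups A and B differ strongly in the trait.
CONCORDANT = {"A": 10.0, "B": 10.5, "C": 20.0}
DISCORDANT = {"A": 10.0, "B": 20.0, "C": 15.0}


def posterior(x):
    """Posterior group probabilities for score x, assuming equal priors."""
    weights = {g: math.exp(-0.5 * (x - mu) ** 2) for g, mu in GENETIC_MEANS.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}


def simulate(trait_means, threshold, n_per_group=2000, seed=1):
    """Return (number of individuals assigned to A, bias of A's mean trait)."""
    rng = random.Random(seed)
    traits_assigned_to_a = []
    for true_group, mu in GENETIC_MEANS.items():
        for _ in range(n_per_group):
            score = rng.gauss(mu, 1.0)
            trait = rng.gauss(trait_means[true_group], 1.0)
            post = posterior(score)
            best = max(post, key=post.get)
            # MAP rule: keep the individual only if its largest posterior
            # probability reaches the threshold.
            if best == "A" and post[best] >= threshold:
                traits_assigned_to_a.append(trait)
    n = len(traits_assigned_to_a)
    if n == 0:
        return 0, float("nan")
    return n, sum(traits_assigned_to_a) / n - trait_means["A"]


for label, trait_means in (("concordant", CONCORDANT), ("discordant", DISCORDANT)):
    for threshold in (0.5, 0.7, 0.9):
        n, bias = simulate(trait_means, threshold)
        print(f"{label:10s} threshold={threshold:.1f} retained={n:5d} bias={bias:+.2f}")
```

Under these assumptions, raising the threshold reduces the number of individuals assigned to group A, and the bias in A's estimated mean trait is larger when the genetically similar groups differ in the trait than when they resemble each other.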
The genetic data we used were based on a regional subset of the GAPS-Chinook microsatellite baseline. The 13 GAPS loci are highly variable, with nearly 500 alleles coastwide and highly significant allele frequency differences among populations and among regions. This diversity provided substantial power to accurately assign individual Chinook salmon to their putative population of origin. The difference in performance between IA and BMM would be greater in applications with less powerful baselines, which could result from fewer or less variable loci, low differentiation among populations, or degraded mixture samples that provide high-quality genotypes for only a subset of loci. In many ecological genetic studies of the type we consider here, power is often limited by small mixture samples. Genetic reporting groups that contribute little to a mixture may provide very few individuals from which to estimate trait parameters.
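A back-of-the-envelope calculation shows how quickly precision erodes for minor reporting groups. The sketch below assumes a simple random sample, a fixed fraction of a group's individuals passing the MAP threshold, and the usual standard error of a mean (standard deviation divided by the square root of the sample size). The function name `expected_trait_se` and all input values are hypothetical and are not taken from this study.

```python
import math


def expected_trait_se(mixture_size, group_proportion, retained_fraction, trait_sd):
    """Expected retained sample size for one reporting group and the
    approximate standard error of its mean trait (illustrative only)."""
    n_retained = mixture_size * group_proportion * retained_fraction
    if n_retained <= 0:
        return n_retained, float("inf")
    return n_retained, trait_sd / math.sqrt(n_retained)


# Hypothetical mixture of 200 fish, 70% of a group's fish passing the threshold,
# and a trait standard deviation of 1 (arbitrary units).
for proportion in (0.40, 0.10, 0.02):
    n, se = expected_trait_se(200, proportion, 0.7, 1.0)
    print(f"proportion={proportion:.2f} expected_n={n:6.1f} approx_SE={se:.2f}")
```

Because the standard error scales with the inverse square root of the retained sample size, a group contributing a few percent of a small mixture yields a much less precise trait estimate than a dominant group, even before any assignment error is considered.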