Institute for Plant Protection, NARO

Development of a statistical model for efficient estimation of allele frequencies in field populations

-Accounting for individual differences in the amount of DNA when samples are extracted from multiple individuals at once-

The National Agriculture and Food Research Organization (NARO), in collaboration with Utsunomiya University and Kyoto University, has developed a statistical model that estimates allele frequencies in field populations. The model treats the amount of DNA that each individual contributes to a DNA sample extracted from multiple organisms at once as a random variable. Applying this model to genetic diagnostic techniques such as quantitative polymerase chain reaction (PCR) and quantitative DNA sequencing makes it possible to accurately assess the prevalence of drug-resistant pests and alien species with fewer tests than individual diagnosis requires.


Conventionally, individual genetic diagnosis has been used in field populations to determine the proportion of alien species that may hybridize with native species, or of pesticide-resistant strains of insect pests. In individual genetic diagnosis, DNA is extracted from each individual and the genotype is determined by PCR. However, if individuals carrying the allele of interest are rare in the population, dozens to hundreds of individuals must be diagnosed to maintain estimation accuracy. To reduce the number of experimental operations, researchers have therefore sought a method for estimating allele frequencies from a "bulk sample" (mixed DNA solution), in which DNA is extracted from several individuals at once.
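The sample sizes mentioned above can be illustrated with a short calculation. The following Python sketch assumes, purely for illustration, that each diagnosed individual carries the allele independently with probability p. The function names are hypothetical and are not part of any published tool.

```python
import math

def prob_detect_at_least_one(p: float, n: int) -> float:
    """Probability that n independently sampled individuals include at least one carrier."""
    return 1.0 - (1.0 - p) ** n

def individuals_needed(p: float, target: float = 0.95) -> int:
    """Smallest n for which the detection probability reaches the target."""
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - p))

def relative_standard_error(p: float, n: int) -> float:
    """Binomial standard error of the estimated frequency, relative to p itself."""
    return math.sqrt(p * (1.0 - p) / n) / p

for p in (0.10, 0.01):
    n = individuals_needed(p)
    print(f"p = {p}: {n} individuals needed, "
          f"relative standard error at that n = {relative_standard_error(p, n):.2f}")
```

Under this assumption, the rarer the allele, the more individuals must be diagnosed merely to observe it once, and the estimate remains imprecise even then.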

The DNA content of a bulk sample and the ratio of alleles in it can be measured by techniques such as quantitative PCR and quantitative DNA sequencing. If every individual contributed the same amount of DNA, the proportion of DNA in the bulk sample would directly reflect the genetic constitution of the individuals it contains. In reality, however, the amount of DNA varies with the number of cells that make up the body of each individual, and it also decreases through degradation after death. Therefore, when a bulk sample is prepared from individuals caught in traps and the ratio of DNA content is measured, that ratio may deviate greatly from the abundance ratio of individuals in the field.
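A small simulation shows the size of this deviation. The Python sketch below is illustrative only: it assumes that each individual's DNA yield follows a gamma distribution, and the shape values, bulk size, and function name are arbitrary choices, not figures from the study.

```python
import random
import statistics

def bulk_allele_fraction(n_carriers, n_total, shape, scale, rng):
    """Fraction of the DNA in one bulk sample that comes from carriers.

    Each individual's DNA yield is drawn independently from Gamma(shape, scale).
    """
    yields = [rng.gammavariate(shape, scale) for _ in range(n_total)]
    return sum(yields[:n_carriers]) / sum(yields)

rng = random.Random(1)
results = {}
# 2 carriers among 10 individuals: the true abundance ratio is 0.2.
for shape in (100.0, 1.0):  # large shape = nearly equal yields, small shape = very unequal
    fractions = [bulk_allele_fraction(2, 10, shape, 1.0, rng) for _ in range(2000)]
    results[shape] = (statistics.mean(fractions), statistics.stdev(fractions))
    print(f"shape = {shape}: mean fraction = {results[shape][0]:.3f}, "
          f"standard deviation = {results[shape][1]:.3f}")
```

When yields are nearly equal, the measured DNA fraction stays close to 0.2 in every bulk sample. When yields vary strongly, individual bulk samples scatter widely around that value even though the number of carriers is unchanged.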

NARO, in collaboration with Utsunomiya University and Kyoto University, has developed a statistical model that approximates the variation in the amount of DNA obtained from each individual with a probability distribution called the gamma distribution. The model estimates the proportion of a specific allele in a population together with a confidence interval, which indicates the uncertainty of the estimate. It can be applied when multiple bulk samples are prepared and the number of individuals in each sample is known.
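The idea can be sketched as a simplified maximum-likelihood estimation in Python. This is not the published model or the freqpcr implementation. It assumes haploid individuals, a gamma shape parameter k that is already known, and an allele DNA fraction measured without error, whereas the published freqpcr model works on qPCR ΔΔCq measures. Under these assumptions, if m of the n individuals in a bulk sample carry the allele, the carrier DNA fraction follows a Beta(m·k, (n−m)·k) distribution. All function names are hypothetical, and the interval is a likelihood-ratio interval computed on a grid, which may differ from the interval the package reports.

```python
import math
import random

def log_beta_pdf(y, a, b):
    """Log density of Beta(a, b) at 0 < y < 1."""
    return ((a - 1) * math.log(y) + (b - 1) * math.log(1 - y)
            + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))

def bulk_log_likelihood(p, y, n, k):
    """Log-likelihood of carrier DNA fraction y in one bulk of n >= 2 haploid individuals.

    The carrier count m is Binomial(n, p). Given m, the fraction is exactly 0 or 1
    when m is 0 or n, and Beta(m*k, (n-m)*k) otherwise (assumed gamma yields, shape k).
    """
    if y <= 0.0:
        return n * math.log(1 - p)
    if y >= 1.0:
        return n * math.log(p)
    terms = []
    for m in range(1, n):
        log_binom = (math.lgamma(n + 1) - math.lgamma(m + 1) - math.lgamma(n - m + 1)
                     + m * math.log(p) + (n - m) * math.log(1 - p))
        terms.append(log_binom + log_beta_pdf(y, m * k, (n - m) * k))
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms))

def estimate_frequency(samples, k, grid_size=200):
    """Grid-search estimate and 95% likelihood-ratio interval from (y, n) pairs."""
    grid = [i / grid_size for i in range(1, grid_size)]
    loglik = [sum(bulk_log_likelihood(p, y, n, k) for y, n in samples) for p in grid]
    best = max(loglik)
    estimate = grid[loglik.index(best)]
    # 3.841 is the 95% quantile of the chi-square distribution with 1 degree of freedom.
    inside = [p for p, ll in zip(grid, loglik) if 2 * (best - ll) <= 3.841]
    return estimate, min(inside), max(inside)

def simulate_bulk(p, n, k, rng):
    """Simulate the carrier DNA fraction of one bulk sample (illustrative data only)."""
    carrier = [rng.random() < p for _ in range(n)]
    yields = [rng.gammavariate(k, 1.0) for _ in range(n)]
    m = sum(carrier)
    if m == 0:
        return 0.0
    if m == n:
        return 1.0
    return sum(d for d, c in zip(yields, carrier) if c) / sum(yields)

rng = random.Random(42)
# 100 simulated bulk samples of 8 individuals each, true frequency 0.1, shape k = 2.
samples = [(simulate_bulk(0.1, 8, 2.0, rng), 8) for _ in range(100)]
estimate, lower, upper = estimate_frequency(samples, k=2.0)
print(f"estimated frequency = {estimate:.3f}, 95% interval = [{lower:.3f}, {upper:.3f}]")
```

The sketch needs the number of individuals in each bulk sample because the likelihood sums over every possible number of carriers in that sample.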

To apply this model to quantitative PCR analysis and make it easy to obtain allele ratios and their confidence intervals, we have developed "freqpcr", a package for R, a free statistical analysis environment, and distribute it on the official website. The package has already been used to analyze the regional distribution pattern of acaricide-resistance alleles in the citrus red mite, and to estimate with high accuracy, using fewer tests, the frequency of alleles that are rare in the field. It is useful not only for agricultural pests but also for monitoring aimed at conserving rare species and preventing the invasion of alien species and strains.


  • Sudo M, Yamamura K, Sonoda S, Yamanaka T (2021). Estimating the proportion of resistance alleles from bulk Sanger sequencing, circumventing the variability of individual DNA. Journal of Pesticide Science 46(2): 1-8.
  • Sudo M, Osakabe M (2022). freqpcr: estimation of population allele frequency using qPCR ΔΔCq measures from bulk samples. Molecular Ecology Resources 22(4): 1380-1393.
