Reconstruction of the Genealogical Trees of Isolated Populations

Author: M. B. Malyutov

INTERFACULTY LABORATORY OF STATISTICAL METHODS

M. B. MALYUTOV, V. P. PASSEKOV

ON THE RECONSTRUCTION OF THE GENEALOGICAL TREE OF ISOLATED POPULATIONS

Preprint No. 19

Moscow University Press, 1971

SUMMARY

1. Let $p_k$, $k = 1, \dots, q$, denote the frequencies of the various types (alleles) $A_k$ of the gene $A$. Under the hypothesis of random mating, the genotype of any offspring is determined by a random sample of two genes from the practically infinite pool of alleles $A_k$, drawn with probabilities $p_k$. As a result, the gene frequencies $p'_k$ of the alleles in a population of $N$ offspring have the multinomial distribution

$$P\{p'_1 = n_1/2N, \dots, p'_q = n_q/2N\} = \frac{(2N)!}{n_1! \cdots n_q!}\, p_1^{n_1} \cdots p_q^{n_q}. \qquad (1)$$

If we restrict ourselves to the model of non-overlapping generations with a constant population size $N$, then the vector of gene frequencies is assumed to change as a Markov chain with the transition probabilities (1). (In real calculations one takes the so-called effective size $N$ instead of the total population number; for examples of its evaluation see [2Ц ].) This Markov chain is called "genetic drift". The same name is given to the diffusion process $p(t)$ that describes the evolution of the gene frequencies. Its probability density at time $t$ at the point $p$ changes according to the forward equation of A. N. Kolmogorov (time is measured in number of generations). It is known that in the reduced time scale $T = t(2N)^{-1}$ the continuous process approximates the discrete one well.

Theorem 1. If $tN^{-1} \to 0$, then the distribution of the suitably normalized random variable $\theta^2$, where $\theta$ is the angle between the vectors $\sqrt{p(0)}$ and $\sqrt{p(t)}$, tends to $\chi^2$ with $q - 1$ degrees of freedom ($p_k(0) \neq 0$, $\sum_k p_k = 1$).

The essential point is that the distribution of $\theta^2$ asymptotically does not depend on the initial frequencies $p(0)$. The main idea of the proof is the transformation $x_k = \sqrt{p_k}$, $k = 1, \dots, q$, which turns genetic drift into a spherical Brownian motion with a trend that is negligible as $tN^{-1} \to 0$. We then approximate the spherical Brownian motion by a Brownian motion on the tangent plane to the sphere $\sum_k x_k^2 = 1$ at the initial point of the process.

2. Suppose the population has split into two isolated populations, and assume that the effective sizes of these populations have remained equal to $N_1$ and $N_2$ respectively. At present we know the frequencies $p_{ik}$ and $q_{ik}$ of the $i$-th allele of the $k$-th of the independently inherited loci, $k = 1, \dots, m$, $i = 1, \dots, q_k$. Let $\theta_k = \arccos \sum_i \sqrt{p_{ik} q_{ik}}$ and $\nu = \sum_k (q_k - 1)$. An immediate consequence of Theorem 1 is

Theorem 2. The distribution of the suitably normalized sum $\sum_k \theta_k^2$ tends to the $\chi^2$ distribution with $\nu$ degrees of freedom as $tN^{-1} \to 0$.

Thus the correspondingly normalized ratio $\sum_k \theta_k^2 / \nu$ is a consistent estimate of $t$ when $tN^{-1}$ is small and the number of loci $m \to \infty$, by the law of large numbers.

3. Suppose we are given the present-day allele frequencies in several populations, as well as the effective sizes of the present-day and earlier populations. To estimate the genealogical tree, we reconstruct its nodes successively, in order of their remoteness from the present. Suppose the estimates of the nearest nodes are known. As the descendants of the next split we take the pair of nodes of the reconstructed tree for which the criterion is minimal (the expression is illegible in the source). Here $t_k$, $t_m$ (for definiteness $t_k > t_m$) are the estimates of the times for the nodes, the variance matrices of the vectors x.
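The estimate of the divergence time described in section 2 can be sketched in Python. This is only an illustration under stated assumptions, not the authors' formula: the normalizing constant is not legible in the summary, so the sketch assumes the standard Wright-Fisher result that each tangent-plane coordinate of $\sqrt{p}$ gains variance $1/(8N)$ per generation, which gives the factor $(1/N_1 + 1/N_2)/8$ for two independently drifting populations. The function names `angular_distance` and `estimate_split_time` are hypothetical.

```python
import math


def angular_distance(p, q):
    """Angle theta = arccos(sum_i sqrt(p_i * q_i)) between two allele-frequency
    vectors at one locus, after the transformation x_i = sqrt(p_i)."""
    if len(p) != len(q):
        raise ValueError("frequency vectors must have the same length")
    cos_theta = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def estimate_split_time(loci, n1, n2):
    """Estimate the number of generations since two populations split.

    loci   -- list of (p, q) pairs, one pair of frequency vectors per locus
    n1, n2 -- effective sizes of the two populations

    ASSUMPTION: the per-generation variance of each tangent coordinate is
    (1/n1 + 1/n2) / 8; this constant is not taken from the source text.
    """
    sum_sq = sum(angular_distance(p, q) ** 2 for p, q in loci)
    nu = sum(len(p) - 1 for p, _ in loci)  # degrees of freedom
    if nu == 0:
        raise ValueError("need at least one locus with two or more alleles")
    var_per_generation = (1.0 / n1 + 1.0 / n2) / 8.0
    return sum_sq / (nu * var_per_generation)


if __name__ == "__main__":
    # Two loci with made-up frequencies, purely for illustration.
    loci = [
        ([0.60, 0.40], [0.55, 0.45]),
        ([0.20, 0.30, 0.50], [0.25, 0.30, 0.45]),
    ]
    print(estimate_split_time(loci, n1=500, n2=800))
```

The approximation behind this sketch is meant for small $t/N$, so the result should be read as a rough guide when the populations are close, not as a calibrated date.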