Quasispecies is a model of the evolution of informational
sequences [1,2]. The evolving population is a set {S_{k}}
of n sequences, k = 1, ..., n. Each
sequence is a string of N symbols S_{ki},
i = 1, ..., N. The symbols are taken from an
alphabet containing l letters.
For example, we can consider a two-letter alphabet (l = 2, S_{ki}
= 1, -1 or S_{ki} = G, C)
or a four-letter alphabet (l = 4,
S_{ki} = G, C, A, U).
The sequence length N and the population size n are
assumed to be large: N, n >> 1.
Sequences are the model "organisms"; they have
certain (nonnegative) selective values f_{k}
= f(S_{k}).
We assume here that there is a master sequence S_{m} having the maximal
selective value. The selective value of any sequence depends only
on the Hamming distance (the number of positions at which the
symbols differ) between the given sequence S and the
master sequence S_{m}:
f(S) = f(r(S,S_{m})); the smaller the
distance r, the greater
the selective value f. For simplicity we assume here
that the values f do not exceed 1.
The evolution process consists of consecutive generations. A new
generation {S_{k}(t+1)}
is obtained from the old one {S_{k}(t)}
by selection and mutation of the sequences S_{k}(t);
here t is the generation number.
The model evolution process can be described formally in the
following computer-program-like manner.
Step 0. (Formation of an initial population {S_{k}(0)})
For every k = 1, ..., n and
every i = 1, ..., N, set the symbol
S_{ki} to a symbol chosen at random
from the given alphabet.
Step 1. (Selection) 

Substep 1.1. (Selection of a particular sequence).
Choose a sequence number k* at random, and
copy the sequence S_{k*}(t)
(without removing it from the old population) into the
new population {S_{k}(t+1)}
with probability f_{k*}
= f(S_{k*}(t)).

Substep 1.2. (Iteration of the sequence selection,
control of the population size). Repeat substep 1.1
until the number of sequences in the new population
reaches the value n.

Step 2. (Mutations) For every k = 1, ..., n
and every i = 1, ..., N,
change the symbol S_{ki}(t+1)
with probability P to an
arbitrary other symbol of the alphabet.

Step 3. (Organization of the iterative evolution).
Repeat steps 1 and 2 for t = 0, 1, 2, ...
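Steps 0 to 3 can be sketched in Python as follows. This is a minimal illustration, not the original simulation code. The text only requires that f decrease with the Hamming distance r and not exceed 1; the exponential form f = exp(-beta * r), the parameter name beta, the choice of master sequence, and all function names are assumptions made for this sketch.

```python
import math
import random


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def selective_value(seq, master, beta=1.0):
    # ASSUMPTION: f = exp(-beta * r). Any f <= 1 that decreases
    # with the distance r to the master sequence would also fit.
    return math.exp(-beta * hamming(seq, master))


def initial_population(n, N, alphabet, rng):
    # Step 0: every symbol is drawn at random from the alphabet.
    return [[rng.choice(alphabet) for _ in range(N)] for _ in range(n)]


def select(population, master, beta, rng):
    # Step 1: pick a random sequence and copy it into the new
    # population with probability f; repeat until n are collected.
    new_population = []
    while len(new_population) < len(population):
        candidate = rng.choice(population)
        if rng.random() < selective_value(candidate, master, beta):
            new_population.append(list(candidate))
    return new_population


def mutate(population, P, alphabet, rng):
    # Step 2: each symbol changes with probability P to another symbol.
    for seq in population:
        for i, symbol in enumerate(seq):
            if rng.random() < P:
                seq[i] = rng.choice([a for a in alphabet if a != symbol])
    return population


def evolve(n, N, P, generations, alphabet=(1, -1), beta=1.0, seed=0):
    # Step 3: iterate selection and mutation.
    rng = random.Random(seed)
    master = [alphabet[0]] * N  # ASSUMPTION: an arbitrary fixed master
    population = initial_population(n, N, alphabet, rng)
    history = []  # mean Hamming distance to the master, per generation
    for _ in range(generations):
        history.append(sum(hamming(s, master) for s in population) / n)
        selected = select(population, master, beta, rng)
        population = mutate(selected, P, alphabet, rng)
    return population, history
```

A call such as evolve(n=20, N=10, P=0.05, generations=5, beta=0.5) returns the final population and the mean distance to the master sequence recorded at the start of each generation. Small beta keeps the rejection loop in select fast; with large beta and a distant initial population, most candidates are rejected and selection becomes slow.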
The character of the evolution depends strongly on the population
size n. If n is very large (n >> l^{N}), every sequence is present in
the population in large numbers, and the evolution can be treated as a
deterministic process. In this case the population dynamics can
be described by ordinary differential equations and
analyzed by well-known methods. The main conclusions of such an
analysis [1-4] are the following: 1) the evolution
process always converges, and 2) the final population is a quasispecies,
that is, a distribution of sequences in the neighborhood of
the master sequence S_{m}.
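For orientation, the deterministic limit is commonly written in the following form. This is a standard continuous-time formulation of the quasispecies equations, given here as an illustration and not taken from this text; the symbols x_{k} (fraction of sequence S_{k} in the population), Q_{kj} (probability that a copy of S_{j} mutates into S_{k}), r_{kj} (Hamming distance between S_{k} and S_{j}), and \bar{f} (mean selective value) are notation assumed for this sketch.

```latex
\frac{dx_k}{dt} = \sum_{j} Q_{kj}\, f_j\, x_j \;-\; \bar{f}\, x_k ,
\qquad
\bar{f} = \sum_{j} f_j\, x_j ,
\qquad
Q_{kj} = \left(\frac{P}{l-1}\right)^{r_{kj}} (1-P)^{\,N - r_{kj}} .
```

With this choice of Q_{kj}, each symbol mutates independently with probability P to one of the l - 1 other letters, as in Step 2, and the fractions x_{k} keep summing to 1.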
In the opposite case (l^{N} >> n), the
evolution process is essentially stochastic, and computer
simulations together with reasonable quantitative estimates can be
used to characterize the main features of the evolution [1,2,5]. For
large sequence lengths N (N > 50) this is
the case for any realistic population size.
The main features of the evolution and the estimates for the
stochastic case with a two-letter alphabet (l = 2; S_{ki}
= 1, -1) are described in the child node Estimation of the evolution rate. It is
shown there that the total number of generations T needed to
converge to a quasispecies at sufficiently large selection
intensity can be estimated by formula (1),
where P is the mutation intensity. This estimate
assumes a sufficiently large population size, given by condition (2),
at which the effect of neutral selection [6] can be
neglected (see Estimation of the evolution
rate and Neutral evolution game for
details).
It is interesting to estimate how efficient evolution can be
as a search algorithm. Namely, what is the minimal
total number of participants n_{total} = nT
needed to find the master sequence in the evolution
process? According to (1), (2), to minimize n_{total}
we should maximize the mutation intensity P. But at
large P, "good" sequences that have already been found
could be lost. The "optimal" mutation intensity
P ~ N^{-1} corresponds approximately to one mutation
per sequence per generation. Consequently, we can conclude
that an "optimal" evolution process should involve on
the order of
n_{total} = nT ~ N^{2}     (3)
participants to find the master sequence.
This value can be compared with the number of participants in
deterministic and purely random search methods. A simple
deterministic (sequential) search (for the considered
Hamming-distance-type selective value and two-letter alphabet,
S_{i} = 1, -1) can be constructed
as follows: 1) start with an arbitrary sequence S;
2) change its symbols one after another, S_{1}
-> -S_{1}, S_{2}
-> -S_{2}, ...,
keeping only those changes that increase the
selective value of the sequence. The total number of sequences that
must be tested to find the master sequence S_{m} in this manner is equal
to N: n_{total} = N. In a
purely random search, finding S_{m} requires inspecting on the
order of 2^{N} sequences: n_{total}
~ 2^{N}.
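The sequential search described above can be sketched as follows, assuming a two-letter alphabet of 1 and -1. The function names are hypothetical, and the Hamming distance to the master sequence stands in for the selective value (a smaller distance means a larger f); the search itself only uses the comparison of two such values.

```python
import random


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def sequential_search(master, rng):
    """Flip symbols one by one, keeping only improving flips.

    Returns the sequence found and the number of trial sequences tested.
    """
    N = len(master)
    # 1) start with an arbitrary sequence
    current = [rng.choice((1, -1)) for _ in range(N)]
    best = hamming(current, master)
    tested = 0
    # 2) try S_i -> -S_i for each position in turn
    for i in range(N):
        trial = list(current)
        trial[i] = -trial[i]
        tested += 1
        r = hamming(trial, master)
        if r < best:  # the flip increased the selective value
            current, best = trial, r
    return current, tested
```

Each flip either corrects a wrong symbol (and is kept) or spoils a correct one (and is discarded), so after N trial sequences every position matches the master sequence.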
So, we have the following estimates:

Deterministic search     n_{total} = N
Evolutionary search      n_{total} ~ N^{2}
Random search            n_{total} ~ 2^{N}

Thus, under simple assumptions (Hamming-distance-type selective
value and two-letter alphabet), the evolutionary method of search is
substantially more efficient than the random one, but
somewhat less efficient than the deterministic search.
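To make the gap between the three estimates concrete, the short sketch below evaluates them for a few sequence lengths. The function name search_costs is hypothetical, and the evolutionary and random entries are order-of-magnitude values with constant factors ignored.

```python
def search_costs(N):
    """Order-of-magnitude number of tested sequences for length N."""
    return {
        "deterministic": N,      # n_total = N
        "evolutionary": N ** 2,  # n_total ~ N^2
        "random": 2 ** N,        # n_total ~ 2^N
    }


for N in (10, 50, 100):
    print(N, search_costs(N))
```

Already at N = 50 the random-search estimate exceeds 10^15 sequences, while the evolutionary estimate is 2500.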
The Hamming-distance-type model implies that the selective value
has a unique maximum. This is a strong restriction.
Using the spin-glass concept (see Spin-glass
model of evolution), it is possible to construct a similar
model of informational sequence evolution for the case of a very
large number of local maxima of the selective value. The
evolution rate, the restriction on the population size, and the total
number of evolution participants in that model can also be roughly
estimated by formulas (1)-(3). But unlike the Hamming-distance
model, the spin-glass-type evolution converges to one of the
local maxima of the selective value, and which one is reached
depends on the particular realization of the evolution.
Conclusion. The quasispecies model describes
quantitatively a simple evolution of informational sequences in terms
of sequence length, population size, and mutation and selection
intensities. This model can be used to characterize roughly the
hypothetical prebiotic evolution of polynucleotide sequences and to
illustrate mathematically the general features of biological
evolution.
References:
1. M. Eigen. Naturwissenschaften. 1971. Vol. 58. P. 465.
2. M. Eigen, P. Schuster. "The hypercycle: A principle of
natural self-organization". Springer Verlag: Berlin etc. 1979.
3. C.J. Thompson, J.L. McBride. Math. Biosci. 1974. Vol. 21.
P. 127.
4. B.L. Jones, R.H. Enns, S.S. Rangnekar. Bull. Math. Biol.
1976. Vol. 38. N. 1. P. 15.
5. V.G. Red'ko. Biofizika. 1986. Vol. 31. N. 3. P. 511.
V.G. Red'ko. Biofizika. 1990. Vol. 35. N. 5. P. 831
(In Russian).
6. M. Kimura. "The neutral theory of molecular evolution".
Cambridge University Press. 1983.
Copyright © 1998 Principia Cybernetica