Estimation of Recombination Frequencies and Construction of RFLP Linkage Maps in Plants From Crosses Between Heterozygous Parents
E. Ritter, C. Gebhardt and F. Salamini

Max-Planck-Institut für Züchtungsforschung, D-5000 Köln 30, West Germany
Manuscript received January 5, 1990
Accepted for publication March 26, 1990

ABSTRACT

The construction of a restriction fragment length polymorphism (RFLP) linkage map is based on the estimation of recombination frequencies between genetic loci and on the determination of the linear order of loci in linkage groups. RFLP loci can be identified as segregations of singular or allelic DNA restriction fragments. From crosses between heterozygous individuals several allele (fragment) configurations are possible, and this leads to a set of formulas for the evaluation of $p$, the recombination frequency between two loci. Tables and figures are presented illustrating a general outline of gene mapping using heterozygous populations. The method encompasses as special cases the mapping of loci from segregating populations of pure lines. Formulas for deriving the recombination frequencies and information functions are given for different fragment configurations. Information functions derived for relevant configurations are also compared. A procedure for map construction is presented, as it has been applied to RFLP mapping in an allogamous crop.
Genetics 125: 645-654 (July, 1990)

With the discovery of a new marker class termed “restriction fragment length polymorphisms” (RFLP), marker-based selection is currently receiving attention and support in crop breeding (reviewed in BECKMANN and SOLLER 1986). RFLP linkage maps have been constructed for several crop species including maize, tomato, lettuce, rice and potato (HELENTJARIS et al. 1986; BERNATZKY and TANKSLEY 1986; HELENTJARIS 1987; LANDRY et al. 1987; ZAMIR and TANKSLEY 1988; MCCOUCH et al. 1988; BONIERBALE, PLAISTED and TANKSLEY 1988; GEBHARDT et al. 1989). Moreover, RFLP markers are virtually unlimited in number, the only restriction to the efficiency of this technique being the DNA sequence divergence between the genotypes tested.

Restriction fragments of nuclear DNA varying in length between parental genotypes are detected by Southern blot hybridization to cloned homologous sequences as probes (SOUTHERN 1975). Any single polymorphic restriction fragment segregates as a codominant Mendelian marker in the progeny from parents heterozygous for that fragment. The distance on the linkage map between any two RFLP markers is determined by measuring the recombination frequency. Linked markers are aggregated in linkage groups. The linear order of markers within each linkage group is deduced from the genetic distances relative to each other in two-, three- or multiple-point estimates. The number of linkage groups is equivalent to the chromosome number of the species.

Most RFLP maps in plants have been obtained from segregating populations, F2 and/or backcrosses, derived from homozygous inbred lines (e.g., HELENTJARIS et al. 1986; BERNATZKY and TANKSLEY 1986). We have recently produced an RFLP map for the potato (Solanum tuberosum ssp. tuberosum) (GEBHARDT et al. 1989). In the diploid state, potato clones are self-incompatible and characterized by a high genetic load. Both conditions preclude the possibility of obtaining pure lines. In the case of our map, two highly heterozygous parents were crossed to obtain a segregating offspring. In this paper we describe the theoretical background for RFLP linkage analysis from any type of F1 population, including those from heterozygous individuals, which encompasses as special cases mapping in F2 and backcross populations from homozygous inbred lines.
METHODS AND RESULTS
Calculation of recombination frequencies between loci defined by single restriction fragments: This situation is considered separately from the case of loci defined by allelic restriction fragments (see later). For a locus defined only by a single fragment A, no attempt is made to identify possible fragments allelic to A at the same locus: the presence of A is scored versus its absence in a progeny segregating for A. Allelic states to A are therefore scored as null (0 = no alternative fragment). Genotypes having the same phenotype (A present) may be homozygous (AA) or heterozygous (A0), so that A behaves as a dominant marker. In an F1 cross of the type A0 × 00, the segregation ratio is 1:1 (presence vs. absence), while the F1 of a cross A0 × A0 will segregate 3:1.

TABLE 1
Derivation of recombination frequencies

A. Single fragment loci

Fragment configuration of parents: AB/00 × 00/00, AB/00 type (coupling)

Mating table (cells give the phenotype of the progeny)

  GT    GF            00          00          00      00
                      (1 - p)/2   (1 - p)/2   p/2     p/2
  AB    (1 - p')/2    AB          AB          AB      AB
  00    (1 - p')/2    00          00          00      00
  A0    p'/2          A0          A0          A0      A0
  0B    p'/2          0B          0B          0B      0B

  Distribution of phenotypes   AB : A0 : 0B : 00
  Absolute linkage              8 :  0 :  0 :  8
  Absence of linkage            4 :  4 :  4 :  4

Calculation table (p = p')

  Phenotype   p_j          dp_j/dp   (dp_j/dp)/p_j   (dp_j/dp)^2/p_j        z_j
  AB          (1 - p)/2    -1/2      -1/(1 - p)      1/[2(1 - p)]           a
  A0          p/2           1/2       1/p            1/(2p)                 b
  0B          p/2           1/2       1/p            1/(2p)                 c
  00          (1 - p)/2    -1/2      -1/(1 - p)      1/[2(1 - p)]           d
  Sum         1             0                        i_p = 1/[p(1 - p)]     n

Maximum likelihood equation

$$ -\frac{a}{1-p} + \frac{b}{p} + \frac{c}{p} - \frac{d}{1-p} = 0, \qquad \hat{p} = \frac{b+c}{n} $$
Segregation at a second RFLP locus B can be defined accordingly. Presence and absence of fragments A and B can be arranged in $2^8$ configurations over the four locus positions (two loci on two homologous chromosomes) of each of the two diploid parents, including cases of homozygosity besides those of heterozygosity. The best estimate $\hat{p}$ for the recombination frequency $p$ between A and B can be obtained by use of the maximum likelihood method of FISHER (1921) and requires a specific treatment for each of the parental fragment configurations.

As an example, the derivation of $\hat{p}$ for the configuration AB/00 × 00/00 (fragments A and B are both heterozygous and present on the same chromosome in only one parent) is shown in Table 1A. The mating table shows the gamete types (GT), their expected frequencies (GF) as functions of $p$, and the phenotypes of the progeny resulting from crossing the parent AB/00 with 00/00. In the case of absolute linkage between A and B ($p = 0$), the two parental phenotypes are expected in the F1 progeny with a frequency of 50% each, whereas in the absence of linkage ($p = 0.5$) four phenotypes (two parental, two recombinant) appear with a frequency of 25% each. Recombination frequencies between A and B can only be estimated on the basis of the difference between the phenotypic frequencies expected for absolute linkage and for independent segregation of the two markers. The chi-square test (MATHER 1938) will establish whether the observed numbers of phenotypes ($z_j$) deviate significantly from those expected in the case of independent segregation. If A and B are supposed to be linked, the recombination frequency $p$ can be estimated by solving the maximum likelihood equation

$$ \sum_j \frac{z_j}{p_j}\,\frac{\partial p_j}{\partial p} = 0 \qquad (1) $$

(FISHER 1921), where $p_j$ are the expected frequencies and $z_j$ the observed numbers of phenotypes. Here and in the following Equation 2, the terms needed for the solution are calculated as exemplified in the calculation table (Table 1A). The information function $I_p$, which measures the quality of the estimate $\hat{p}$, is given by

$$ I_p = n \sum_j \frac{1}{p_j}\left(\frac{\partial p_j}{\partial p}\right)^2 \qquad (2) $$

(MATHER 1938), where $n$ is the sample size (= number of offspring). The variance of $\hat{p}$ is then given by

$$ V(\hat{p}) = 1/I_p \qquad (3) $$

and the standard deviation by $\sigma_{\hat{p}} = \sqrt{V(\hat{p})}$. The maximum likelihood estimator is a minimum variance unbiased estimator of the recombination frequency $p$ (RAO 1952).

TABLE 1 (continued)

B. Loci with allelic fragments

Fragment configuration of parents: A1B1/A2B2 × A1B1/A2B2, αβ/αβ type (coupling)

Mating table (cells give the phenotype of the progeny)

  GT      GF            A1B1        A2B2        A1B2        A2B1
                        (1 - p)/2   (1 - p)/2   p/2         p/2
  A1B1    (1 - p')/2    A1B1        A1A2B1B2    A1B1B2      A1A2B1
  A2B2    (1 - p')/2    A1A2B1B2    A2B2        A1A2B2      A2B1B2
  A1B2    p'/2          A1B1B2      A1A2B2      A1B2        A1A2B1B2
  A2B1    p'/2          A1A2B1      A2B1B2      A1A2B1B2    A2B1

  Distribution of phenotypes   A1B1 : A1B2 : A1B1B2 : A2B1 : A2B2 : A2B1B2 : A1A2B1 : A1A2B2 : A1A2B1B2
  Absolute linkage               4  :   0  :    0   :   0  :   4  :    0   :    0   :    0   :     8
  Absence of linkage             1  :   1  :    2   :   1  :   1  :    2   :    2   :    2   :     4

Calculation table (p = p')

  Phenotype   p_j                  dp_j/dp       (dp_j/dp)/p_j                  (dp_j/dp)^2/p_j                 z_j
  A1B1        (1 - p)^2/4          -(1 - p)/2    -2/(1 - p)                     1                               a1
  A1B2        p^2/4                 p/2           2/p                           1                               a2
  A1B1B2      p(1 - p)/2           (1 - 2p)/2    (1 - 2p)/[p(1 - p)]            (1 - 2p)^2/[2p(1 - p)]          a3
  A2B1        p^2/4                 p/2           2/p                           1                               a4
  A2B2        (1 - p)^2/4          -(1 - p)/2    -2/(1 - p)                     1                               a5
  A2B1B2      p(1 - p)/2           (1 - 2p)/2    (1 - 2p)/[p(1 - p)]            (1 - 2p)^2/[2p(1 - p)]          a6
  A1A2B1      p(1 - p)/2           (1 - 2p)/2    (1 - 2p)/[p(1 - p)]            (1 - 2p)^2/[2p(1 - p)]          a7
  A1A2B2      p(1 - p)/2           (1 - 2p)/2    (1 - 2p)/[p(1 - p)]            (1 - 2p)^2/[2p(1 - p)]          a8
  A1A2B1B2    (1 - 2p + 2p^2)/2    -(1 - 2p)     -2(1 - 2p)/(1 - 2p + 2p^2)     2(1 - 2p)^2/(1 - 2p + 2p^2)     a9
  Sum         1                    0                                            i_p (see below)                 n

  i_p = 2(1 - 3p + 3p^2) / [p(1 - p)(1 - 2p + 2p^2)]

Maximum likelihood equation

$$ -\frac{2(a_1 + a_5)}{1-p} + \frac{2(a_2 + a_4)}{p} + \frac{(1-2p)(a_3 + a_6 + a_7 + a_8)}{p(1-p)} - \frac{2(1-2p)\,a_9}{1 - 2p + 2p^2} = 0 $$

GF = gametic frequency; GT = gamete type; p, p' = recombination frequency of male and female gametes; z_j = observed numbers of phenotypes (see text).
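The chi-square screening step described in the text for the AB/00 × 00/00 cross can be sketched as follows. This is only an illustration: it uses the standard orthogonal partition of the chi-square for a 1:1:1:1 expectation into segregation at A, segregation at B and linkage (1 d.f. each); the text does not state which exact form the authors used. The function name and the example counts are hypothetical.

```python
def chi_square_partition(a, b, c, d):
    """Partition of chi-square for a 1:1:1:1 expectation.

    a, b, c, d: observed numbers of phenotypes AB, A0, 0B, 00.
    Returns (chi2_A, chi2_B, chi2_linkage), each with 1 d.f.
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("no offspring scored")
    chi2_a = (a + b - c - d) ** 2 / n        # A present vs. A absent
    chi2_b = (a - b + c - d) ** 2 / n        # B present vs. B absent
    chi2_linkage = (a - b - c + d) ** 2 / n  # parental vs. recombinant classes
    return chi2_a, chi2_b, chi2_linkage


CRITICAL_5PCT_1DF = 3.841  # chi-square critical value, 1 d.f., 5% level

# Hypothetical counts, not data from the paper.
chi2_a, chi2_b, chi2_l = chi_square_partition(45, 5, 5, 45)
linked = chi2_l > CRITICAL_5PCT_1DF
```

The three components add up to the total chi-square with 3 d.f., so a significant linkage component can be separated from distorted segregation at either single locus.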
In Table 1A the expected frequencies $p_j$ (first column) are obtained by multiplying the gamete frequencies giving rise to a specific phenotype and summing up the products over the mating table. For example

$$ p_{AB} = \frac{1-p}{2} \qquad (4) $$

with the male frequency of recombinant gametes ($p$) equal to that of the female ($p'$). The other terms (columns 2, 3, and 4) are derived from $p_j$. Using the calculation table, Equation 1 is formulated as

$$ -\frac{a}{1-p} + \frac{b}{p} + \frac{c}{p} - \frac{d}{1-p} = 0 $$

and solving for $p$ gives the estimate

$$ \hat{p} = \frac{b+c}{a+b+c+d} = \frac{b+c}{n} $$

with

$$ V(\hat{p}) = \frac{p(1-p)}{n} $$

from Equations 2 and 3.
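A minimal sketch of this estimate for the AB/00 × 00/00 coupling configuration is given below. The function name and the example counts are hypothetical, and the variance is evaluated by plugging $\hat{p}$ into $p(1-p)/n$, which is an assumption of this sketch.

```python
import math


def estimate_ab00_coupling(a, b, c, d):
    """ML estimate of p for AB/00 x 00/00 (coupling).

    a, b, c, d: observed numbers of phenotypes AB, A0, 0B, 00.
    Returns (p_hat, variance, standard deviation).
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("no offspring scored")
    p_hat = (b + c) / n                   # recombinant classes over total
    variance = p_hat * (1 - p_hat) / n    # V = 1/I_p with I_p = n / [p(1 - p)]
    return p_hat, variance, math.sqrt(variance)


# Hypothetical counts, not data from the paper.
p_hat, variance, sd = estimate_ab00_coupling(45, 5, 5, 45)
```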
If in a cross only four phenotypes are present, as is the case with single fragment loci, it may be convenient (see below) to estimate $p$ with the product formula of FISHER and BALMUKAND (1928), which is easy to calculate (IMMER 1930). Thus $p$ is estimated by $\hat{p}$, solving the equation

$$ \frac{p_{AB}\,p_{00}}{p_{A0}\,p_{0B}} = \frac{z_{AB}\,z_{00}}{z_{A0}\,z_{0B}} = \frac{ad}{bc} \qquad (5) $$

with $p_j$ as expected frequencies and $z_j$ as observed numbers of phenotypes. If the variance is the same as with the maximum likelihood method, then the product formula gives a fully efficient estimate of $p$ (BAILEY 1961).
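As a worked illustration (derived here from the expected frequencies of Table 1A, not copied from the original tables), the product formula can be applied to the AB/00 × 00/00 coupling configuration, where $p_{AB} = p_{00} = (1-p)/2$ and $p_{A0} = p_{0B} = p/2$:

```latex
\frac{p_{AB}\,p_{00}}{p_{A0}\,p_{0B}}
  = \frac{(1-p)^2}{p^2}
  = \frac{ad}{bc}
\quad\Longrightarrow\quad
\hat{p} = \frac{1}{1 + \sqrt{ad/bc}}
```

For balanced counts ($a = d$, $b = c$) this reduces to $b/(a+b)$, which coincides with the maximum likelihood estimate $(b+c)/n$.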
As shown in Table 1A, mating tables can be assembled for all the $2^8$ possible fragment configurations at the loci A and B of two diploid parents. In crosses these fragment configurations give rise to a maximum of four phenotypes because the homozygous or heterozygous states for a fragment cannot be distinguished. However, out of the 256 configurations, only a few have expected phenotypic frequencies differing between absolutely linked and unlinked fragments A and B, and only these are useful for linkage analysis. They are combined in three types:

1. The AB/00 type with the configurations AB/00 × 00/00 (coupling, see Table 1A) and A0/0B × 00/00 (repulsion), characterized by the presence of both fragments A and B in one parent and their absence in the other;
2. The AB/A0 type with the configurations AB/00 × A0/00 (coupling) and A0/0B × A0/00 (repulsion), in which one fragment is present in both parents and the other only in one;
3. The AB/AB type with the configurations AB/00 × AB/00 (coupling), A0/0B × A0/0B (repulsion) and AB/00 × A0/0B (coupling/repulsion), with both fragments shared by the parents.
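The screening of the 256 configurations described above can be sketched as a brute-force enumeration. This is an illustration under the stated criterion only (a configuration counts as informative when its expected phenotypic frequencies differ between $p = 0$ and $p = 0.5$); all function names are hypothetical, and the raw count it prints still includes configurations that are equivalent to one another when the parents or the homologous chromosomes are exchanged.

```python
from fractions import Fraction
from itertools import product


def gametes(hap1, hap2, p):
    """Gamete frequencies of a parent with haplotypes hap1/hap2.

    A haplotype is (a, b): 1 = fragment present, 0 = null allele.
    """
    (a1, b1), (a2, b2) = hap1, hap2
    freqs = {}
    for gamete, f in (
        ((a1, b1), (1 - p) / 2),  # parental
        ((a2, b2), (1 - p) / 2),  # parental
        ((a1, b2), p / 2),        # recombinant
        ((a2, b1), p / 2),        # recombinant
    ):
        freqs[gamete] = freqs.get(gamete, 0) + f
    return freqs


def phenotype_freqs(parent1, parent2, p):
    """Expected frequencies of the (A present, B present) phenotypes."""
    out = {}
    for g1, f1 in gametes(*parent1, p).items():
        for g2, f2 in gametes(*parent2, p).items():
            pheno = (g1[0] | g2[0], g1[1] | g2[1])  # dominant scoring
            out[pheno] = out.get(pheno, 0) + f1 * f2
    return out


def is_informative(parent1, parent2):
    """True if phenotype frequencies differ between p = 0 and p = 1/2."""
    linked = phenotype_freqs(parent1, parent2, Fraction(0))
    unlinked = phenotype_freqs(parent1, parent2, Fraction(1, 2))
    return any(
        linked.get(k, 0) != unlinked.get(k, 0)
        for k in set(linked) | set(unlinked)
    )


configs = [
    (((c[0], c[1]), (c[2], c[3])), ((c[4], c[5]), (c[6], c[7])))
    for c in product((0, 1), repeat=8)
]
informative = [cfg for cfg in configs if is_informative(*cfg)]
print(len(configs), "configurations,", len(informative), "informative")
```

Exact fractions are used so that the comparison between the linked and unlinked distributions is not affected by floating-point rounding.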
For each informative fragment configuration as defined above, a calculation table can be developed by expressing the expected phenotypic frequency $p_j$ as a function of $p$, obtained from the mating table as exemplified in Equation 4, and by calculating the partial derivatives and the other terms necessary for solving the maximum likelihood Equation 1. In doing this, three assumptions are made:

1. The recombination frequency during gamete formation is the same in both parents ($p = p'$);
2. Reciprocal crosses result in the same phenotypic frequencies (P1 × P2 = P2 × P1);
3. The phenotypic frequencies are identical and independent of which homologous chromosomes are paired (e.g., AB/00 = 00/AB).

Table 2A summarizes the formulas necessary to calculate the recombination frequency estimators $\hat{p}$ and $V(\hat{p})$ for the seven usable fragment configurations of two single fragment loci. The formulas for the AB/00 and AB/AB types were derived by solving Equation 1, while for the AB/A0 type Equation 5 was used because of its lower computational complexity. The solution of the maximum likelihood equation

$$ -\frac{a}{2-p} + \frac{b}{1+p} + \frac{c}{p} - \frac{d}{1-p} = 0 \qquad \text{(AB/A0 coupling)} $$
would in fact lead in this case to a polynomial of third order, while the application of the product formula gives a quadratic equation, the variance being the same in both cases. Analogous results can be obtained for repulsion. With the product formula the value of $X$, as defined in Table 2A, is always larger than or equal to one. The formula is not defined for $X = 1$ ($ad = bc$) or when the denominator equals zero [$bc = 0$ (coupling), $ad = 0$ (repulsion)]. If, however, $X$ approaches one or the denominator approaches zero, then the estimate of $p$ converges upon 0.5 and zero, respectively. Similar conclusions can be drawn for the other cases in which the product formula is applied.

DISTORTED SEGREGATION RATIOS

In the F1, a deviation from the segregation ratio of 1:1 for fragments contributed by only one parent, and from 3:1 for fragments present in both parents, may result from a reduced viability of some of the resulting phenotypes (reduced viability of certain gametes is not considered here). Significant deviations from the normal ratios are detected with the chi-square test (summarized in MATHER 1938). If the “skewing factor” $u$ (ratio of the phenotypes with and without a fragment A) is considered, the phenotypic frequencies in the mating table of Table 1A can be expressed as

$$ p_{AB} = \frac{u(1-p)}{u+1}, \quad p_{00} = \frac{1-p}{u+1}, \quad p_{A0} = \frac{up}{u+1} \quad \text{and} \quad p_{0B} = \frac{p}{u+1}, $$

summing to 1 (BAILEY 1961). When only one fragment shows distorted segregation, $u$ disappears in subsequent calculations and the estimate for $p$ is the same as with segregating fragments without distortion. Nevertheless the variance must be specifically calculated because it differs from the case without distortion. If both fragments are distorted and the “skewing factor” for B is $v$, then the phenotypic frequencies are expressed as

$$ p_{AB} = \frac{uv(1-p)}{D}, \quad p_{00} = \frac{1-p}{D}, \quad p_{A0} = \frac{up}{D} \quad \text{and} \quad p_{0B} = \frac{vp}{D}, \quad \text{with} \quad D = uv(1-p) + p(u+v) + 1 - p. $$

The estimate for $p$ results in complex maximum likelihood equations, but using the product formula as suggested by BAILEY (1961), solutions can be found for the AB/00 type and the AB/AB type (see * and ** in Table 2A). For the AB/A0 type, the estimation formula for $p$ is always the same whether distorted segregation ratios are observed or not, since the product formula is used in all cases.
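The product-formula estimate for the AB/A0 type in coupling can be sketched as follows. This is an illustration, not the formula of Table 2A (which is not reproduced here): the expected frequencies $(2-p)/4$, $(1+p)/4$, $p/4$ and $(1-p)/4$ for AB, A0, 0B and 00 are inferred from the denominators of the maximum likelihood equation for AB/A0 coupling, and $X = ad/bc$ is inferred from the condition "$X = 1$ ($ad = bc$)" in the text. The function name and counts are hypothetical.

```python
import math


def product_formula_ab_a0_coupling(a, b, c, d):
    """Product-formula estimate of p for AB/00 x A0/00 (coupling).

    a, b, c, d: observed numbers of phenotypes AB, A0, 0B, 00.
    Assumed expected frequencies: (2-p)/4, (1+p)/4, p/4, (1-p)/4.
    """
    if b * c == 0:
        return 0.0               # limit as the denominator bc approaches zero
    x = (a * d) / (b * c)        # X = ad/bc
    if math.isclose(x, 1.0):
        return 0.5               # limit as X approaches one
    # (2 - p)(1 - p) = X p (1 + p)  ->  (X - 1) p^2 + (X + 3) p - 2 = 0
    discriminant = x * x + 14 * x + 1
    return (math.sqrt(discriminant) - (x + 3)) / (2 * (x - 1))


# Hypothetical counts equal to the expectations for p = 0.2 and n = 400.
p_hat = product_formula_ab_a0_coupling(180, 120, 20, 80)
```

Only a quadratic has to be solved, which is the computational advantage over the cubic maximum likelihood equation mentioned in the text. In this sketch, sampling fluctuations that give $X < 1$ produce estimates above 0.5.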
CALCULATION OF RECOMBINATION FREQUENCIES BETWEEN LOCI DEFINED BY ALLELIC RESTRICTION FRAGMENTS
If two fragments A1 and A2 are detected with the same probe and if they are linked 100% in repulsion ($p = 0$), they can be treated as allelic fragments (although they do not have to be so in the molecular sense). In a progeny of heterozygous parents, a locus may therefore be represented by up to four codominant allelic fragments in the combinations A1A3, A1A4, A2A3, A2A4 if A1 and A2 are the alleles of P1 and A3 and A4 those of P2. If only two alleles are present in both parents (both parents have, for instance, A1A2) and these alleles are known from their electrophoretic pattern, the homozygous or heterozygous state of a locus can be deduced. Recombination frequencies between two such loci are derived by the procedure described for single fragment loci. As an example, the fragment configuration A1B1/A2B2 × A1B1/A2B2, with A1, A2 and B1, B2 being allelic fragments of two loci A and B, is shown in Table 1B. As seen in the mating table, nine phenotypes can be distinguished and their frequencies vary according to the linkage intensity between the loci A and B. The terms necessary to formulate Equations 1 and 2 are given in the calculation table. The maximum likelihood equation is a polynomial of higher order that can be solved iteratively using, for example, Newton's approximation method.

Table 2B summarizes the formulas and maximum likelihood equations (where a universal solution is not possible) for the estimation of $p$ between two loci with allelic fragments (Nos. 1-3) or for mixed situations where linkages between a single fragment locus and a locus with allelic fragments are considered (Nos. 4-6). Allelic configurations at loci with allelic fragments are here indicated by introducing the additional letters α and β (α = A1/A2, α' = A1/A3, β = B1/B2, β' = B1/B3), and their configurations in a cross are defined in terms of allelic states in Table 3. The configurations αβ/00, αB/00 (Nos. 1a and 4b) and αβ/α0, αB/α0, αB/B0 (Nos. 1b, 4a and 5) have similar solutions for $\hat{p}$ as AB/00 and AB/A0, respectively (Table 2A). For the configurations αβ/αβ, αβ/α'β' and αB/αB (Nos. 2, 3 and 6), respectively, the three cases of coupling, repulsion, and coupling/repulsion have been considered. In the three-allelic configuration αβ/α'β' (No. 3) the sixteen genotypes can be distinguished, allowing a very precise estimate of $p$ (see also Figure 1). A configuration where four different fragments are found at a locus is treated as the αβ/α'β' configuration by assigning the corresponding genotypes. In a similar way as described in Table 1, further mating and calculation tables could be set up considering three and more loci, where several parameters have to be estimated.
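The iterative solution mentioned above for the A1B1/A2B2 × A1B1/A2B2 configuration can be sketched as follows. Note that this sketch uses Fisher scoring, a Newton-type iteration in which the second derivative is replaced by the expected information of Equation 2, rather than a plain Newton step; the score and information expressions are those of the Table 1B calculation table with $p = p'$. Function names, the starting value, the clamping bounds and the example counts are assumptions of this illustration.

```python
def score(p, counts):
    """Left-hand side of the ML equation for Table 1B.

    counts = [a1, ..., a9] in the phenotype order A1B1, A1B2, A1B1B2,
    A2B1, A2B2, A2B1B2, A1A2B1, A1A2B2, A1A2B1B2.
    """
    a1, a2, a3, a4, a5, a6, a7, a8, a9 = counts
    s = 1 - 2 * p
    q = 1 - 2 * p + 2 * p * p
    return (-2 * (a1 + a5) / (1 - p)
            + 2 * (a2 + a4) / p
            + s * (a3 + a6 + a7 + a8) / (p * (1 - p))
            - 2 * s * a9 / q)


def info_per_individual(p):
    """i_p from the sum row of the Table 1B calculation table."""
    q = 1 - 2 * p + 2 * p * p
    return 2 * (1 - 3 * p + 3 * p * p) / (p * (1 - p) * q)


def estimate_p(counts, p0=0.25, tol=1e-10, max_iter=100):
    """Fisher-scoring estimate of p, kept inside (0, 0.5]."""
    n = sum(counts)
    p = p0
    for _ in range(max_iter):
        step = score(p, counts) / (n * info_per_individual(p))
        p_new = min(max(p + step, 1e-6), 0.5)  # clamp to the admissible range
        if abs(p_new - p) < tol:
            return p_new
        p = p_new
    return p


# Hypothetical counts equal to the expectations for p = 0.1 and n = 2000.
counts = [405, 5, 90, 5, 405, 90, 90, 90, 820]
p_hat = estimate_p(counts)
variance = 1 / (sum(counts) * info_per_individual(p_hat))  # Equation 3
```

Using the expected information keeps each step well scaled and yields the variance of Equation 3 as a by-product of the last iteration.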
INFORMATIVITY OF $\hat{p}$ DEPENDS ON THE FRAGMENT CONFIGURATION

The information function $I_p$ or its reciprocal, the variance $V(\hat{p})$ (Equations 2 and 3), is a measure of the precision of the estimated recombination frequency $\hat{p}$ (MATHER 1938). Table
lists the information func