United States Patent Application: 20110008793
Kind Code: A1
Butruille; David; et al.
January 13, 2011
Maize Polymorphisms and Methods of Genotyping
Abstract
Polymorphic maize DNA loci useful for genotyping between at least two
varieties of maize. Sequences of the loci are useful for designing
primers and probe oligonucleotides for detecting polymorphisms in maize
DNA. Polymorphisms are useful for genotyping applications in maize. The
polymorphic markers are useful to establish marker/trait associations,
e.g. in linkage disequilibrium mapping and association studies,
positional cloning and transgenic applications, marker-aided breeding and
marker-assisted selection, hybrid prediction and identity by descent
studies. The polymorphic markers are also useful in mapping libraries of
DNA clones, e.g. for maize QTLs and genes linked to polymorphisms.
Inventors:
Butruille; David (Des Moines, IA)
Laurie; Cathy (Saratoga, CA)
Gupta; Anju (Ankeny, IA)
Johnson; Dick (Urbana, IL)
Eathington; Sam (Ames, IA)
Bull; Jason (Wildwood, MO)
Edwards; Marlin (Davis, CA)
Correspondence Address: THOMPSON COBURN, LLP, ONE US BANK PLAZA, ST. LOUIS, MO 63101, US
Assignee: MONSANTO TECHNOLOGY LLC, St. Louis, MO
Family ID: 39082626
Appl. No.: 12/885990
Filed: September 20, 2010
Related U.S. Patent Documents
| Application Number | Filing Date | Patent Number |
|---|---|---|
| 11504538 | Aug 14, 2006 | |
| 12885990 | | |
Current U.S. Class: 435/6.13; 435/6.18; 536/23.1
Current CPC Class: C12Q 1/6895 20130101; C12Q 2600/13 20130101; C12Q 2600/156 20130101; C12Q 2600/172 20130101
Class at Publication: 435/6; 536/23.1
International Class: C12Q 1/68 20060101 C12Q001/68; C12N 15/29 20060101 C12N015/29
Claims
1. A polymorphic maize DNA locus which is useful for genotyping between
at least two varieties of maize; wherein said locus comprises at least 20
consecutive nucleotides which include or are adjacent to a polymorphism
identified in Table 1; and wherein the sequence of said at least 20
consecutive nucleotides is at least 90% identical to the sequence of the
same number of nucleotides in either strand of a segment of maize DNA
which includes or is adjacent to said polymorphism.
2-10. (canceled)
11. A method of investigating a maize allele comprising determining the
presence of a polymorphism in the nucleic acid sequence of nucleic acid
molecules isolated from one or more maize plants wherein said
polymorphism is linked to a locus of claim 1.
12. A method of mapping maize genomic sequence comprising identifying the
presence of a mapped polymorphism in said sequence, wherein said mapped
polymorphism is linked to a locus of claim 1.
13. A method according to claim 12 wherein said mapped polymorphism is
identified in Table 3.
14-17. (canceled)
18. A method identifying genes associated with a trait of interest
comprising identifying linkage of at least one polymorphism to said trait
of interest, wherein said polymorphism is linked to a locus of claim 1,
identifying a genomic clone containing said locus and identifying genes
linked to said locus.
19. A method for improving heterosis in hybrid maize comprising (a)
developing associations between a plurality of polymorphisms and traits
in more than two inbred lines of maize, (b) selecting for breeding two of
said inbred lines having complementary heterotic groups which are
predicted to improve heterosis wherein said polymorphisms are linked to
loci of claim 1.
20-29. (canceled)
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application is a divisional of U.S. patent application Ser.
No. 11/504,538, filed Aug. 14, 2006, which is incorporated herein by
reference in its entirety.
INCORPORATION OF SEQUENCE LISTING
[0002] A text file of the Sequence Listing contained in the file named
"pa_00358C.txt" which is 11,105,843 bytes (measured in
MS-Windows®) and identical to the computer readable copy of the
sequence listing filed in the parent application U.S. patent application
Ser. No. 11/504,538, filed Aug. 14, 2006, is electronically filed
herewith and incorporated by reference.
INCORPORATION OF TABLES
[0003] Table 1 and Table 3 contained in the files named "Table 1.txt" and
"Table 3.pdf" which are 3,101,937 bytes and 373,032 bytes, respectively
(measured in MS-Windows®), and identical in content to the files
containing Table 1 and Table 3 filed in the parent application U.S.
patent application Ser. No. 11/504,538, filed Aug. 14, 2006, are
electronically filed herewith and incorporated by reference.
LENGTHY TABLES
The patent application contains a lengthy table section. A copy of the
table is available in electronic form from the USPTO web site
(http://seqdata.uspto.gov/?pageRequest=docDetail&DocID=US20110008793A1).
An electronic copy of the table will also be available from the USPTO
upon request and payment of the fee set forth in 37 CFR 1.19(b)(3).
FIELD OF THE INVENTION
[0004] Disclosed herein are maize polymorphisms, nucleic acid molecules
related to such polymorphisms and methods of using such polymorphisms and
molecules, e.g. in genotyping.
BACKGROUND
[0005] Polymorphisms are useful as genetic markers for genotyping
applications in the agriculture field, e.g. in plant genetic studies and
commercial breeding. See for instance U.S. Pat. Nos. 5,385,835;
5,437,697; 5,385,835; 5,492,547; 5,746,023; 5,962,764; 5,981,832 and
6,100,030, the disclosures of all of which are incorporated herein by
reference. The highly conserved nature of DNA combined with the rare
occurrences of stable polymorphisms provides genetic markers which are
both predictable and discerning of different genotypes. Among the classes
of existing genetic markers are a variety of polymorphisms indicating
genetic variation including restriction-fragment-length polymorphisms
(RFLPs), amplified fragment-length polymorphisms (AFLPs), simple sequence
repeats (SSRs), single nucleotide polymorphisms (SNPs) and
insertion/deletion polymorphisms (Indels). Because the number of genetic
markers for a plant species is limited, the discovery of additional
genetic markers will facilitate genotyping applications including
marker-trait association studies, gene mapping, gene discovery,
marker-assisted selection and marker-assisted breeding. Evolving
technologies make certain genetic markers more amenable for rapid, large
scale use. For instance, technologies for SNP detection indicate that
SNPs may be preferred genetic markers.
SUMMARY OF THE INVENTION
[0006] This invention provides a large number of genetic markers for
maize. These genetic markers comprise maize DNA loci which are useful for
genotyping applications between at least two varieties of maize. A
polymorphic maize locus of this invention comprises at least 20
consecutive nucleotides which include or are adjacent to a polymorphism
which is identified herein, e.g. in Table 1. More particularly, a
polymorphic maize locus of this invention has a nucleic acid sequence
which is at least 90%, preferably at least 95%, identical to the sequence
of the same number of nucleotides in either strand of a segment of maize
DNA which includes or is adjacent to the polymorphism. As indicated in
Table 1 the nucleic acid sequences of SEQ ID NO: 1 through SEQ ID NO:
10373 comprise one or more polymorphisms, e.g. single nucleotide
polymorphisms (SNPs) and insertion/deletion polymorphisms (Indels).
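As a rough illustration of the 90% identity criterion described above, the following sketch scores a candidate sequence against every same-length window of a reference segment on both strands. It assumes an ungapped, position-by-position comparison, which is a simplification of an optimal alignment. The function names and the example sequences are hypothetical and are not taken from the Sequence Listing.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def best_identity(candidate: str, segment: str) -> float:
    """Best ungapped percent identity of `candidate` to any same-length
    window of `segment`, checking both strands of the segment."""
    candidate = candidate.upper()
    n = len(candidate)
    best = 0.0
    for strand in (segment.upper(), reverse_complement(segment)):
        for start in range(len(strand) - n + 1):
            window = strand[start:start + n]
            matches = sum(a == b for a, b in zip(candidate, window))
            best = max(best, 100.0 * matches / n)
    return best


# Hypothetical 30-nt reference segment and a 20-nt candidate that differs
# from segment[4:24] at a single position (assumed data, for illustration).
segment = "ATGCCGTACGTTAGCCTAGGATCCGATTAC"
candidate = "CGTACGTTAGCCTAGGTTCC"
identity = best_identity(candidate, segment)
print(f"{identity:.1f}% identical; meets 90% threshold: {identity >= 90.0}")
```

With one mismatch in 20 positions the candidate scores 95%, so it would satisfy both the 90% and the preferred 95% thresholds under this simplified comparison.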
[0007] In one aspect of the invention the polymorphic maize loci are
provided in one or more data sets of DNA sequences, i.e. data sets
comprising up to a finite number of distinct sequences of polymorphic
loci. The finite number of polymorphic loci in a data set can be as few
as 2 or up to 1000 or more, e.g. 5, 10, 25, 40, 75, 100 or 500 loci. Such
data sets are useful for genotyping applications of a large scale or
involving large numbers of plants. In a useful aspect of the invention
the data set of polymorphic maize loci is recorded on a computer readable
medium.
[0008] In another aspect of the invention the polymorphisms in the loci of
the invention are mapped onto the maize genome, e.g. as a genetic map of
the maize genome comprising map positions of two or more polymorphisms,
as indicated in Table 1, more preferably as indicated in Table 3. Such a
genetic map is illustrated in FIG. 1. The genetic map data can also be
recorded on computer readable medium. Preferred embodiments of the
invention provide genetic maps of polymorphisms at high densities, e.g.
at least 150 or more, say at least 500 or 1000, polymorphisms across a
map of the maize genome. Especially useful genetic maps comprise
polymorphisms at an average distance of not more than 10 centiMorgans
(cM) on a linkage group.
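The density criterion above can be checked with a short calculation. The sketch below computes the average distance between adjacent markers on one linkage group and compares it to the 10 cM figure; the map positions are invented for illustration and do not come from Table 3.

```python
def average_spacing_cm(positions_cm: list[float]) -> float:
    """Average distance in cM between adjacent markers on one linkage group."""
    if len(positions_cm) < 2:
        raise ValueError("need at least two mapped markers")
    ordered = sorted(positions_cm)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)


# Hypothetical map positions (cM) for six markers on one linkage group.
linkage_group = [0.0, 6.5, 14.0, 21.5, 33.0, 40.0]
spacing = average_spacing_cm(linkage_group)
print(f"average spacing: {spacing:.1f} cM; within 10 cM: {spacing <= 10.0}")
```

Note that an average can hide individual gaps above 10 cM (here the gap between 21.5 and 33.0 cM), so a stricter check might also inspect the largest gap.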
[0009] This invention also provides nucleic acid molecules for identifying
the polymorphisms; such molecules are preferably oligonucleotides which
are useful as PCR primers for amplifying a segment of a maize genome,
e.g. a polymorphic locus, and hybridization probes for use in assays to
identify in maize DNA the presence or absence of particular
polymorphisms.
[0010] Nucleic acid molecules useful as PCR primers are typically provided
in pairs for amplifying a segment of maize DNA comprising at least one
polymorphism, where each molecule comprises at least 15 nucleotide bases.
The nucleotide sequence of one of the primer molecules is preferably at
least 90 percent identical to a sequence of the same number of
consecutive nucleotides in one strand of a segment of maize DNA in a
polymorphic locus and the sequence of the other of the primer molecules
is at least 90 percent identical to a sequence of the same number of
consecutive nucleotides in the other strand of said segment of maize DNA
in the polymorphic locus. Preferably the primers are capable of
hybridizing under high stringency conditions to the strands of DNA in the
polymorphic locus. Preferably such primers are provided and used in pairs
which flank at least one polymorphism in the segment of maize DNA in a
polymorphic locus.
[0011] Nucleic acid molecules useful as hybridization probes for detecting
a polymorphism in maize DNA can be designed for a variety of assays. For
assays, where the probe is intended to hybridize to a segment including
the polymorphism, such molecules can comprise at least 12 nucleotide
bases and a detectable label. The sequence of the nucleotide bases is
preferably at least 90 percent, more preferably at least 95%, identical
to a sequence of the same number of consecutive nucleotides in either
strand of a segment of maize DNA in a polymorphic locus of this
invention. In preferred aspects of the invention the detectable label is
a dye at one end of the molecule. In more preferred aspects the molecule
comprises a dye and dye quencher at the ends thereof. For SNP assays it
is useful to provide such molecules in pairs, e.g. where each molecule
has a distinct fluorescent dye at the 5' end and has identical nucleotide
sequence except for a single nucleotide polymorphism.
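The paired probes described above can be sketched as two sequences that are identical except at the SNP position. The example below is only a sequence-level illustration: the function name, the locus sequence and the SNP position are hypothetical, the SNP is placed near the middle of the probe as an assumption, and the dye and quencher chemistry is not modeled.

```python
def allele_probe_pair(locus: str, snp_index: int, alleles: tuple[str, str],
                      length: int = 15) -> tuple[str, str]:
    """Return two probes of `length` nt spanning the SNP that are identical
    except for the base at the SNP position."""
    if length < 12:
        raise ValueError("probes are described as comprising at least 12 bases")
    left = snp_index - length // 2
    right = left + length
    if left < 0 or right > len(locus):
        raise ValueError("SNP is too close to the end of the locus")
    offset = snp_index - left  # position of the SNP inside the probe
    probes = []
    for allele in alleles:
        window = list(locus[left:right].upper())
        window[offset] = allele.upper()
        probes.append("".join(window))
    return probes[0], probes[1]


# Hypothetical locus with a C/T SNP at index 15 (assumed data).
locus = "GATTACAGGCTTAACCGTAGCTAGGCTAAT"
probe_c, probe_t = allele_probe_pair(locus, snp_index=15, alleles=("C", "T"))
print(probe_c)
print(probe_t)
```

In an actual assay each probe of the pair would carry its own distinct fluorescent dye, so that the signal indicates which allele is present.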
[0012] For assays where the molecule is designed to hybridize adjacent to
a polymorphism which is detected by single base extension, e.g. of a
labeled dideoxynucleotide, such molecules can comprise at least 15, more
preferably at least 16 or 17, nucleotide bases in a sequence which is at
least 90 percent, preferably at least 95%, identical to a sequence of the
same number of consecutive nucleotides in either strand of a segment of
polymorphic maize DNA.
[0013] Another aspect of the invention is a complex of hybridization probe
and a fragment of maize genomic DNA.
[0014] Still another aspect of this invention provides a set of
oligonucleotides comprising a pair of nucleic acid molecule primers for
PCR amplification of a segment of polymorphic maize DNA and at least one
detector nucleic acid molecule for detecting a polymorphism in the
segment. Such sets can be provided in collections of at least 2 or up to
1000 or more, e.g. up to 5, 10, 25, 40, 75, 100 or 500 sets of primer
pairs and hybridization probes.
[0015] Another aspect of this invention provides methods for determining
polymorphisms which are likely to be useful as markers for genotyping
applications in eukaryotic genomes. Such method comprises the
construction of reduced representation libraries by separating repetitive
sequence from fragments of genomic DNA of at least two varieties of a
species, fractionating the separated genomic DNA fragments based on size
of nucleotide sequence and comparing the sequences of fragments in a
fraction to determine polymorphisms. More particularly, the method of
identifying polymorphisms in genomic DNA comprises digesting total
genomic DNA from at least two variants of a eukaryotic species with a
methylation sensitive endonuclease to provide a pool of digested DNA
fragments. The average nucleotide length of fragments is smaller for DNA
regions characterized by a lower percent of 5-methylated cytosine. Such
fragments are separable, e.g. by gel electrophoresis, based on nucleotide
length. A fraction of DNA with less than average nucleotide length is
separated from the pool of digested DNA. Sequences of the DNA in a
fraction are compared to identify polymorphisms. As compared to coding
sequence, repetitive sequence is more likely to comprise 5-methylated
cytosine, e.g. in -CG- and -CNG- sequence segments. In a preferred
aspect of the method genomic DNA from at least two different inbred
varieties of a crop plant is digested with a methylation sensitive
endonuclease selected from the group consisting of Aci I, Apa I, Age I,
Bsr F I, BssH II, Eag I, Eae I, Hha I, HinP1 I, Hpa II, Msp I, MspM II,
Nar I, Not I, Pst I, Pvu I, Sac II, Sma I, Stu I and Xho I to provide a
pool of digested DNA which is physically separated, e.g. by gel
electrophoresis. Comparable size fractions of DNA are obtained from
digested DNA of each of said varieties. DNA molecules from the comparable
fractions are inserted into vectors to construct reduced representation
libraries of genomic DNA clones which are sequenced and compared to
identify polymorphisms.
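The effect of a methylation sensitive digest on fragment size can be illustrated in silico. The sketch below is modeled loosely on Hpa II, which recognizes CCGG; the cut offset, the rule that methylation of the internal cytosine blocks cutting, the toy sequence and the size limits are all assumptions made for illustration, and real fractions would be in the hundreds to thousands of base pairs.

```python
def digest(seq: str, methylated: set[int], site: str = "CCGG",
           cut_offset: int = 1, blocked_offset: int = 1) -> list[str]:
    """Cut `seq` at each occurrence of `site` unless the cytosine at
    `blocked_offset` within that site is listed in `methylated`."""
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        if (start + blocked_offset) not in methylated:
            cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


def size_fraction(fragments: list[str], min_len: int, max_len: int) -> list[str]:
    """Keep fragments whose length falls within the selected size range."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


# Toy sequence with three CCGG sites; the third site is methylated at its
# internal cytosine (index 29), so it is not cut (assumed data).
seq = "AAAACCGGTTTTTTCCGGAAAAAAAAAACCGGTT"
unmethylated = digest(seq, methylated=set())
partly_methylated = digest(seq, methylated={29})
print([len(f) for f in unmethylated])
print([len(f) for f in partly_methylated])
print(size_fraction(partly_methylated, min_len=8, max_len=12))
```

The methylated site leaves a longer fragment, which is why selecting a fraction of shorter fragments tends to enrich for regions with less 5-methylated cytosine.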
[0016] In an alternative method polymorphisms in genomic DNA are
identified by digesting total genomic DNA from at least two variants of a
eukaryotic species with endonuclease to provide a pool of digested DNA
fragments. The digested DNA fragments are segregated in an array on a
substrate and contacted with one or more labeled oligonucleotides having
repetitive sequence elements which are characteristic of DNA in the
species. Hybridization identifies DNA fragments characterized by
repetitive sequence. The sequence of DNA fragments which do not hybridize
to repetitive sequence oligonucleotides is compared for polymorphisms.
Such methods provide segments of reduced representation genomic DNA from
a plant which has genomic DNA comprising regions of DNA with relatively
higher levels of methylated cytosine and regions of DNA with relatively
lower levels of methylated cytosine. The reduced representation segments
of this invention comprise genomic DNA from a region of DNA with
relatively lower levels of methylated cytosine and are provided in
fractions characterized by nucleotide size of said segments, e.g. in the
range of 500 to 3000 bp. This invention also provides methods of using
the loci and polymorphism of this invention, e.g. in genotyping and
related applications. One aspect of this invention provides methods of
finding polymorphisms in maize DNA by comparing DNA sequence in at least
two maize lines where the sequence is selected by using a segment of
polymorphic maize DNA locus. The DNA sequence for comparison is
preferably selected as being at least 80% identical to sequence of a
polymorphic locus. More preferably such sequence is selected as being
linked to a polymorphic locus.
[0017] This invention also provides methods of genotyping by assaying DNA
or mRNA from tissue of at least one maize line to identify the presence
of a nucleic acid polymorphism linked to a polymorphic locus of this
invention. In preferred aspects of the invention genotyping uses a
polymorphism identified in the genetic map of FIG. 1 as amplified by
Table 3. In another preferred aspect of the invention genotyping
comprises identifying one or more phenotypic traits for at least two
maize lines and determining associations between traits and
polymorphisms, e.g. lines with complementary traits are identified and
selected for breeding to improve heterosis. Assays for such genotyping
can employ sufficient nucleic acid molecules to identify the presence of
at least 2 and up to 5000 or more distinct polymorphisms, e.g. where the
number of distinct polymorphisms is 5, 10, 25, 40, 75, 100, 500, 1000,
2000, 3000 or 4000.
[0018] This invention also provides methods of investigating a maize
allele by determining the presence of a polymorphism in the nucleic acid
sequence of nucleic acid molecules isolated from one or more maize plants
where the polymorphism is linked to a polymorphic locus of the invention.
[0019] This invention also provides methods of mapping maize genomic
sequence by identifying the presence of a mapped polymorphism in the
genomic sequence where the mapped polymorphism is linked to a polymorphic
locus of the invention, e.g. a mapped polymorphism on a genetic map of
this invention.
[0020] This invention also provides methods of breeding maize by selecting
a maize line having a polymorphism associated by linkage disequilibrium
to a trait of interest where the polymorphism is linked to a polymorphic
locus of the invention.
[0021] This invention also provides methods of associating a phenotype
trait to a genotype in maize plants by identifying a set of one or more
distinct phenotypic traits characterizing the maize plants. DNA or mRNA
in tissue from at least two maize plants having allelic DNA is assayed to
identify the presence or absence of a set of distinct polymorphisms.
Associations between the set of polymorphisms and set of phenotypic
traits are identified where the set of polymorphisms comprises at least
one, more preferably at least 10, polymorphisms linked to a polymorphic
locus of the invention, e.g. at least 10 polymorphisms linked to mapped
polymorphisms, e.g. as identified in Table 3. In a more preferred aspect
traits are associated to genotypes in a segregating population of maize
plants having allelic DNA in loci of a chromosome which confers a
phenotypic effect on a trait of interest and where a polymorphism is
located in such loci and where the degree of association among the
polymorphisms and between the polymorphisms and the traits permits
determination of a linear order of the polymorphism and the trait loci.
In such methods at least 5 polymorphisms are linked to loci permitting
disequilibrium mapping of the loci.
[0022] This invention also provides methods of identifying genes
associated with a trait of interest by identifying linkage of at least
one polymorphism to a trait of interest where the polymorphism is linked
to a polymorphic locus of the invention, identifying a genomic clone
containing the locus and identifying genes linked to the locus. In
preferred aspects of the invention such association is useful in marker
assisted breeding and/or marker assisted selection.
[0023] This invention further provides methods for improving heterosis in
hybrid maize. In such methods associations are developed between a
plurality of polymorphisms which are linked to polymorphic loci of the
invention and traits in more than two inbred lines of maize. Two of such
inbred lines having complementary heterotic groups which are predicted to
improve heterosis are selected for breeding.
[0024] This invention also provides methods to screen for traits by
interrogating a collection of SNPs at an average density of less than 10
cM on a genetic map of maize. The presence or absence of a SNP linked to
a polymorphic locus of the invention is correlated with such traits. In
another aspect of the invention the polymorphisms are used to identify
haplotypes which are allelic segments of genomic DNA characterized by at
least two polymorphisms in linkage disequilibrium and wherein said
polymorphisms are in a genomic window of not more than 10 centimorgans
in length, e.g. not more than about 8 centimorgans or smaller windows,
e.g. in the range of say 1 to 5 centimorgans. Especially useful methods
of the invention use such polymorphisms to identify a plurality of
haplotypes in a series of adjacent genomic windows in each corn
chromosome, e.g. providing essentially full genome coverage with such
windows. With a sufficiently large and diverse breeding population of
corn plants, it is possible to identify a large number of haplotypes in each
window, thus providing allelic DNA that can be associated with one or
more traits to allow focused marker assisted breeding. Thus, an aspect of
the corn analysis of this invention further comprises the steps of
characterizing one or more traits for said population of corn plants and
associating said traits with said allelic SNP or Indel polymorphisms,
preferably organized to define haplotypes. Such traits include yield,
lodging, maturity, plant height and disease resistance, e.g. resistance
to corn cyst nematode, corn rust, brown stem rot, sudden death syndrome
and the like. To facilitate breeding it is useful to compute a value for
each trait or a value for a combination of traits, e.g. a multiple trait
index. The weight allocated to various traits in a multiple trait index
can vary depending on the objectives of breeding. For instance, if yield
is a key objective, the yield value may be weighted at 50 to 80%, while
maturity, lodging, plant height or disease resistance may be weighted at
lower percentages in a multiple trait index.
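A multiple trait index of the kind described above can be sketched as a weighted sum. The example assumes each trait value has already been scaled to a common 0 to 1 range in which higher is better; that scaling, the trait names and the particular weights are hypothetical choices, with yield weighted at 60% to fall inside the 50 to 80% range mentioned in the text.

```python
def multiple_trait_index(trait_values: dict[str, float],
                         weights: dict[str, float]) -> float:
    """Weighted sum of scaled trait values; the weights must sum to 1."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[trait] * trait_values[trait] for trait in weights)


# Hypothetical weights (yield emphasized) and scaled trait values for one line.
weights = {"yield": 0.6, "maturity": 0.2, "lodging": 0.1,
           "disease_resistance": 0.1}
line_scores = {"yield": 0.8, "maturity": 0.5, "lodging": 0.7,
               "disease_resistance": 0.9}
index = multiple_trait_index(line_scores, weights)
print(f"multiple trait index: {index:.2f}")
```

Changing the weights changes the ranking of lines, which is how the index can follow different breeding objectives.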
[0025] Another aspect of this invention provides a method of genotyping
further comprising identifying one or more phenotypic traits for at least
two corn lines and determining associations between said traits and
polymorphisms.
[0026] Still another aspect of this invention is directed to the use of a
selected set of polymorphic corn DNA sequences in corn breeding, e.g. by
selecting a corn line on the basis of its genotype at a polymorphic locus
that has a sequence within the selected set of polymorphic corn DNA sequences.
[0027] Another aspect of this invention provides a method of breeding corn
plants comprising the steps of:
[0028] (a) identifying trait values for at least two haplotypes in at
least two genomic windows of up to 10 centimorgans for a breeding
population of at least two corn plants;
[0029] (b) breeding two corn plants in said breeding population to produce
a population of progeny seed;
[0030] (c) identifying the allelic state of polymorphisms in each of said
windows in said progeny seed to determine the presence of said
haplotypes; and
[0031] (d) selecting progeny seed having the higher trait values
identified for determined haplotypes in said progeny seed.
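The selection step of the breeding method above can be sketched as a ranking of progeny seed by the trait values of the haplotypes they carry. This is a simplified illustration only: it assumes one haplotype per window per seed (diploidy is ignored), sums trait values across windows, and uses invented window names, haplotype labels and values.

```python
def score_seed(seed_haplotypes: dict[str, str],
               trait_values: dict[str, dict[str, float]]) -> float:
    """Sum the trait value of the haplotype a seed carries in each window."""
    return sum(trait_values[window][hap]
               for window, hap in seed_haplotypes.items())


def select_progeny(progeny: dict[str, dict[str, str]],
                   trait_values: dict[str, dict[str, float]],
                   keep: int) -> list[str]:
    """Return the names of the `keep` highest-scoring progeny seeds."""
    ranked = sorted(progeny,
                    key=lambda name: score_seed(progeny[name], trait_values),
                    reverse=True)
    return ranked[:keep]


# Hypothetical trait values for two haplotypes in each of two genomic windows.
trait_values = {"window_1": {"h1": 2.0, "h2": -1.0},
                "window_2": {"h1": 0.5, "h2": 1.5}}
# Hypothetical haplotypes determined from polymorphisms in each progeny seed.
progeny = {"seed_1": {"window_1": "h1", "window_2": "h1"},
           "seed_2": {"window_1": "h2", "window_2": "h2"},
           "seed_3": {"window_1": "h1", "window_2": "h2"}}
print(select_progeny(progeny, trait_values, keep=2))
```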
[0032] In aspects of the breeding method trait values are identified for
at least two haplotypes in each adjacent genomic window over essentially
the entirety of each chromosome. In another useful aspect of the method
progeny seed is selected for a higher trait value for yield for a
haplotype in a genomic window of up to 10 centimorgans in each
chromosome. In another aspect of the invention, the breeding method is
directed to increased yield, where the trait value is for the yield
trait, where trait values are ranked for haplotypes in each window, and
where a progeny seed is selected which has a trait value for yield in a
window that is higher than the mean trait value for yield in said window.
In certain aspects of the breeding methods the haplotypes are defined
using the polymorphisms identified in Table 1 or are defined as being in
the set of DNA sequences that comprises all of the DNA sequences of SEQ
ID NO: 1 through SEQ ID NO:10,373, or as being in linkage disequilibrium
with one of those polymorphisms.
[0033] The methods of this invention characterized by marker
identification can be carried out using oligonucleotide primers and
oligonucleotide detectors. Thus, another aspect of the invention is
directed to such oligonucleotides, e.g. sets of oligonucleotides
functional with a marker. More particularly, this invention provides a
pair of isolated nucleic acid molecules comprising oligonucleotide
primers for amplifying corn DNA to identify the presence of a
polymorphism in the DNA, e.g. oligonucleotides comprising at least 12
consecutive nucleotides which are at least 90% identical to ends of a
segment of DNA of the same number of nucleotides in opposite strands of a
polymorphic corn DNA locus having a sequence which is at least 90%
identical to a sequence in a subset of polymorphic corn DNA sequences
disclosed herein (or a complement thereof). More preferably such a pair
of oligonucleotides comprise at least 15 consecutive nucleotides, or
more, e.g. at least 20 consecutive nucleotides. More particularly, when
hybridization to a SNP is contemplated for marker assay for identifying a
polymorphism in corn DNA, a set will comprise four oligonucleotides, e.g.
a pair of isolated nucleic acid molecules for amplifying DNA which can
hybridize to DNA which flanks a polymorphism and a pair of detector
nucleic acid molecules which are useful for detecting each nucleotide in
a single nucleotide polymorphism in a segment of the amplified DNA. In
preferred aspects of the invention such detector nucleic acid molecules
comprise at least 12 nucleotide bases and a detectable label, or at least
15 nucleotide bases, and the sequence of the detector nucleic acid
molecules is identical except for the nucleotide polymorphism (e.g. SNP
or Indel) and is at least 95 percent identical to a sequence of the same
number of consecutive nucleotides in either strand of the segment of
polymorphic corn DNA locus containing the polymorphism.
BRIEF DESCRIPTION OF THE DRAWINGS
[0034] FIG. 1 is a genetic map of maize showing the density of mapped
polymorphisms of this invention.
[0035] FIG. 2 is an allelogram illustrating results of a genotyping assay.
DEFINITIONS
[0036] As used herein certain terms are defined as follows.
[0037] An "allele" means an alternative sequence at a particular locus;
the length of an allele can be as small as 1 nucleotide base, but is
typically larger. Allelic sequence can be amino acid sequence or nucleic
acid sequence. A "locus" is a short sequence that is usually unique and
usually found at one particular location in the genome by a point of
reference, e.g. a short DNA sequence that is a gene, or part of a gene or
intergenic region. A locus of this invention can be a unique PCR product
at a particular location in the genome. The loci of this invention
comprise one or more polymorphisms i.e. alternative alleles present in
some individuals. "Genotype" means the specification of an allelic
composition at one or more loci within an individual organism. In the
case of diploid organisms, there are two alleles at each locus; a diploid
genotype is said to be homozygous when the alleles are the same, and
heterozygous when the alleles are different. "Haplotype" means an allelic
segment of genomic DNA that tends to be inherited as a unit; such
haplotypes can be characterized by two or more polymorphisms and can be
defined by a size of not greater than 10 centimorgans, e.g. not greater than 8
centimorgans. With higher precision, from higher density of
polymorphisms, haplotypes can be characterized by genomic windows in the
range of 1-5 centimorgans.
[0038] "Consensus sequence" means a constructed DNA sequence which
identifies SNP and Indel polymorphisms in alleles at a locus. Consensus
sequence can be based on either strand of DNA at the locus and states the
nucleotide base of either one of each SNP in the locus and the nucleotide
bases of all Indels in the locus. Thus, although a consensus sequence may
not be a copy of an actual DNA sequence, a consensus sequence is useful
for precisely designing primers and probes for actual polymorphisms in
the locus.
[0039] "Phenotype" means the detectable characteristics of a cell or
organism which are a manifestation of gene expression.
[0040] "Marker" means a polymorphic sequence. A "polymorphism" is a variation
among individuals in sequence, particularly in DNA sequence. Useful
polymorphisms include single nucleotide polymorphisms (SNPs),
insertions or deletions in DNA sequence (Indels) and simple sequence
repeats of DNA sequence (SSRs).
[0041] "Marker Assay" means a method for detecting a polymorphism at a
particular locus using a particular method, e.g. phenotype (such as seed
color, flower color, or other visually detectable trait), restriction
fragment length polymorphism (RFLP), single base extension,
electrophoresis, sequence alignment, allelic specific oligonucleotide
hybridization (ASO), RAPD, etc. Preferred marker assays include single
base extension as disclosed in U.S. Pat. No. 6,013,431 and allelic
discrimination where endonuclease activity releases a reporter dye from a
hybridization probe as disclosed in U.S. Pat. No. 5,538,848 the
disclosures of both of which are incorporated herein by reference.
[0042] "Linkage" refers to the relative frequency at which types of gametes
are produced in a cross. For example, if locus A has genes "A" or "a" and
locus B has genes "B" or "b", a cross between parent I with AABB and
parent II with aabb will produce four possible gametes where the genes are
segregated into AB, Ab, aB and ab. The null expectation is that there
will be independent equal segregation into each of the four possible
genotypes, i.e. with no linkage 1/4 of the gametes will be of each genotype.
Segregation of gametes into genotypes at frequencies differing from 1/4 is
attributed to linkage.
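The deviation from the 1/4 expectation can be made concrete with gamete counts. The sketch below computes gamete frequencies and the recombination fraction, a standard genetics measure that the text does not name explicitly; the counts are invented, and AB and ab are assumed to be the parental types because the parents in the example are AABB and aabb.

```python
def gamete_frequencies(counts: dict[str, int]) -> dict[str, float]:
    """Relative frequency of each gamete type."""
    total = sum(counts.values())
    return {gamete: n / total for gamete, n in counts.items()}


def recombination_fraction(counts: dict[str, int],
                           parental: tuple[str, str] = ("AB", "ab")) -> float:
    """Fraction of gametes that are not of a parental type."""
    total = sum(counts.values())
    recombinant = sum(n for gamete, n in counts.items()
                      if gamete not in parental)
    return recombinant / total


# Hypothetical gamete counts showing an excess of parental types.
counts = {"AB": 420, "Ab": 80, "aB": 80, "ab": 420}
print(gamete_frequencies(counts))
print(f"recombination fraction: {recombination_fraction(counts):.2f}")
```

With no linkage each frequency would be 0.25 and the recombination fraction 0.5; here the parental types are in excess and the fraction is well below 0.5, which is attributed to linkage.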
[0043] "Linkage disequilibrium" is defined in the context of the relative
frequency of gamete types in a population of many individuals in a single
generation. If the frequency of allele A is p, a is p', B is q and b is
q', then the expected frequency (with no linkage disequilibrium) of
genotype AB is pq, Ab is pq', aB is p'q and ab is p'q'. Any deviation
from the expected frequency is called linkage disequilibrium. Two loci
are said to be "genetically linked" when they are in linkage
disequilibrium.
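One common way to quantify the deviation described above is the coefficient D, the observed frequency of AB minus the product pq. The text does not name this statistic, so treating it as the measure of linkage disequilibrium is an assumption of this sketch, as are the example frequencies.

```python
def linkage_disequilibrium(hap_freq: dict[str, float]) -> float:
    """Return D = f(AB) - p*q, where p and q are the frequencies of
    alleles A and B computed from the four gamete-type frequencies."""
    p = hap_freq["AB"] + hap_freq["Ab"]  # frequency of allele A
    q = hap_freq["AB"] + hap_freq["aB"]  # frequency of allele B
    return hap_freq["AB"] - p * q


# Hypothetical population frequencies of the four gamete types.
observed = {"AB": 0.5, "Ab": 0.1, "aB": 0.1, "ab": 0.3}
# Frequencies expected with the same p and q and no disequilibrium.
equilibrium = {"AB": 0.36, "Ab": 0.24, "aB": 0.24, "ab": 0.16}
print(f"D (observed):    {linkage_disequilibrium(observed):.2f}")
print(f"D (equilibrium): {abs(linkage_disequilibrium(equilibrium)):.2f}")
```

A value of D near zero matches the expected frequencies pq, pq', p'q and p'q'; a nonzero value indicates linkage disequilibrium between the two loci.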
[0044] "Quantitative Trait Locus (QTL)" means a locus that controls to
some degree numerically representable traits that are usually
continuously distributed.
[0045] Nucleic acid molecules or fragments thereof of the present
invention are capable of hybridizing to other nucleic acid molecules
under certain circumstances. As used herein, two nucleic acid molecules
are said to be capable of hybridizing to one another if the two molecules
are capable of forming an anti-parallel, double-stranded nucleic acid
structure. A nucleic acid molecule is said to be the "complement" of
another nucleic acid molecule if they exhibit "complete complementarity"
i.e. each nucleotide in one sequence is complementary to its base pairing
partner nucleotide in another sequence. Two molecules are said to be
"minimally complementary" if they can hybridize to one another with
sufficient stability to permit them to remain annealed to one another
under at least conventional "low-stringency" conditions. Similarly, the
molecules are said to be "complementary" if they can hybridize to one
another with sufficient stability to permit them to remain annealed to
one another under conventional "high-stringency" conditions. Nucleic acid
molecules which hybridize to other nucleic acid molecules, e.g. at least
under low stringency conditions are said to be "hybridizable cognates" of
the other nucleic acid molecules. Conventional stringency conditions are
described by Sambrook et al., Molecular Cloning, A Laboratory Manual, 2nd
Ed., Cold Spring Harbor Press, Cold Spring Harbor, N.Y. (1989) and by
Haymes et al., Nucleic Acid Hybridization, A Practical Approach, IRL
Press, Washington, D.C. (1985), each of which is incorporated herein by
reference. Departures from complete complementarity are therefore
permissible, as long as such departures do not completely preclude the
capacity of the molecules to form a double-stranded structure. Thus, in
order for a nucleic acid molecule to serve as a primer or probe it need
only be sufficiently complementary in sequence to be able to form a
stable double-stranded structure under the particular solvent and salt
concentrations employed.
[0046] Appropriate stringency conditions which promote DNA hybridization,
for example, 6.0× sodium chloride/sodium citrate (SSC) at about
45° C., followed by a wash of 2.0× SSC at 50° C., are
known to those skilled in the art or can be found in Current Protocols in
Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6,
incorporated herein by reference. For example, the salt concentration in
the wash step can be selected from a low stringency of about
2.0× SSC at 50° C. to a high stringency of about
0.2× SSC at 50° C. In addition, the temperature in the wash
step can be increased from low stringency conditions at room temperature,
about 22° C., to high stringency conditions at about 65° C.
Both temperature and salt may be varied, or either the temperature or the
salt concentration may be held constant while the other variable is
changed.
[0047] In a preferred embodiment, a nucleic acid molecule of the present
invention will specifically hybridize to one strand of a segment of maize
DNA having a nucleic acid sequence as set forth in SEQ ID NO: 1 through
SEQ ID NO: 10373 under moderately stringent conditions, for example at
about 2.0× SSC and about 65° C., more preferably under high
stringency conditions such as 0.2× SSC and about 65° C.
[0048] As used herein "sequence identity" refers to the extent to which
two optimally aligned polynucleotide or peptide sequences are invariant
throughout a window of alignment of components, e.g. nucleotides or amino
acids. An "identity fraction" for aligned segments of a test sequence and
a reference sequence is the number of identical components which are
shared by the two aligned sequences divided by the total number of
components in the reference sequence segment, i.e. the entire reference
sequence or a smaller defined part of the reference sequence. "Percent
identity" is the identity fraction times 100.
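The identity fraction defined above can be illustrated with a short sketch. It assumes the two segments have already been optimally aligned and padded to equal length, with gaps written as "-"; the gap convention and the function name are assumptions made for illustration only.

```python
def percent_identity(test_segment, reference_segment):
    """Percent identity of two pre-aligned, equal-length segments.

    The denominator is the number of components in the reference segment.
    Gap characters ('-', an assumed convention) are not counted as components.
    """
    if len(test_segment) != len(reference_segment):
        raise ValueError("aligned segments must have equal length")
    reference_length = sum(1 for r in reference_segment if r != "-")
    if reference_length == 0:
        raise ValueError("reference segment contains no components")
    identical = sum(
        1
        for t, r in zip(test_segment, reference_segment)
        if t == r and r != "-"
    )
    return 100.0 * identical / reference_length
```

A test segment that matches 9 of 10 reference nucleotides has an identity fraction of 0.9 and a percent identity of 90.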
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
A. Nucleic Acid Molecules
Loci, Primers and Probes
[0049] The maize loci of this invention comprise DNA sequence which
comprises at least 20 consecutive nucleotides and includes or is adjacent
to one or more polymorphisms identified in Table 1. Such maize loci have
a nucleic acid sequence having at least 90% sequence identity, more
preferably at least 95% or even more preferably for some alleles at least
98% and in many cases at least 99% sequence identity, to the sequence of
the same number of nucleotides in either strand of a segment of maize DNA
which includes or is adjacent to the polymorphism. The nucleotide
sequence of one strand of such a segment of maize DNA may be found in a
sequence in the group consisting of SEQ ID NO: 1 through SEQ ID NO:
10373. It is understood by the very nature of polymorphisms that for at
least some alleles there will be no identity to the polymorphism, per se.
Thus, sequence identity can be determined for sequence that is exclusive
of the polymorphism sequence. The polymorphisms in each locus are
identified more particularly in Table 1.
[0050] For many genotyping applications it is useful to employ as markers
polymorphisms from more than one locus. Thus, one aspect of the invention
provides a collection of different loci. The number of loci in such a
collection can vary but will be a finite number, e.g. as few as 2 or 5 or
10 or 25 loci or more, for instance up to 40 or 75 or 100 or more loci.
[0051] Another aspect of the invention provides nucleic acid molecules
which are capable of hybridizing to the polymorphic maize loci of this
invention. In certain embodiments of the invention, e.g. which provide
PCR primers, such molecules comprise at least 15 nucleotide bases.
Molecules useful as primers can hybridize under high stringency
conditions to one of the strands of a segment of DNA in a polymorphic
locus of this invention. Primers for amplifying DNA are provided in
pairs, i.e. a forward primer and a reverse primer. One primer will be
complementary to one strand of DNA in the locus and the other primer will
be complementary to the other strand of DNA in the locus, i.e. the
sequence of a primer is preferably at least 90%, more preferably at least
95%, identical to a sequence of the same number of nucleotides in one of
the strands. It is understood that such primers can hybridize to sequence
in the locus which is distant from the polymorphism, e.g. at least 5, 10,
20, 50 or up to about 100 nucleotide bases away from the polymorphism.
Design of a primer of this invention will depend on factors well known in
the art, e.g. avoidance of repetitive sequence.
[0052] Another aspect of the nucleic acid molecules of this invention are
hybridization probes for polymorphism assays. In one aspect of the
invention such probes are oligonucleotides comprising at least 12
nucleotide bases and a detectable label. The purpose of such a molecule
is to hybridize, e.g. under high stringency conditions, to one strand of
DNA in a segment of nucleotide bases which includes or is adjacent to the
polymorphism of interest in an amplified part of a polymorphic locus.
Such oligonucleotides are preferably at least 90%, more preferably at
least 95%, identical to the sequence of a segment of the same number of
nucleotides in one strand of maize DNA in a polymorphic locus. The
detectable label can be a radioactive element or a dye. In preferred
aspects of the invention, the hybridization probe further comprises a
fluorescent label and a quencher, e.g. for use in hybridization probe assays
of the type known as Taqman assays, available from AB Biosystems.
[0053] For assays where the molecule is designed to hybridize adjacent to
a polymorphism which is detected by single base extension, e.g. of a
labeled dideoxynucleotide, such molecules can comprise at least 15, more
preferably at least 16 or 17, nucleotide bases in a sequence which is at
least 90 percent, preferably at least 95%, identical to a sequence of the
same number of consecutive nucleotides in either strand of a segment of
polymorphic maize DNA. Oligonucleotides for single base extension assays
are available from Orchid Biosystems.
[0054] Such primer and probe molecules are generally provided in groups of
two primers and one or more probes for use in genotyping assays.
Moreover, it is often desirable to conduct a plurality of genotyping
assays for a plurality of polymorphisms. Thus, this invention also
provides collections of nucleic acid molecules, e.g. in sets which
characterize a plurality of polymorphisms.
B. Identifying Polymorphisms
[0055] Polymorphisms in a genome can be determined by comparing cDNA
sequence from different lines. While the detection of polymorphisms by
comparing cDNA sequence is relatively convenient, evaluation of cDNA
sequence provides no information about the position of introns in the
corresponding genomic DNA. Moreover, polymorphisms in non-coding sequence
cannot be identified from cDNA. This can be a disadvantage, e.g. when
using cDNA-derived polymorphisms as markers for genotyping of genomic
DNA. More efficient genotyping assays can be designed if the scope of
polymorphisms includes those present in non-coding unique sequence.
[0056] Genomic DNA sequence is more useful than cDNA for identifying and
detecting polymorphisms. Polymorphisms in a genome can be determined by
comparing genomic DNA sequence from different lines. However, the genomic
DNA of higher eukaryotes typically contains a large fraction of repetitive
sequence and transposons. Genomic DNA can be more efficiently sequenced
if the coding/unique fraction is enriched by subtracting or eliminating
the repetitive sequence.
[0057] There are a number of strategies that can be employed to enrich for
coding/unique sequence. Examples of these include the use of enzymes
which are sensitive to cytosine methylation, the use of the McrBC
endonuclease to cleave repetitive sequence, and the printing of
microarrays of genomic libraries which are then hybridized with
repetitive sequence probes.
a. Methylated Cytosine Sensitive Enzymes:
[0058] The DNA of higher eukaryotes tends to be very heavily methylated;
however, it is not uniformly methylated. In fact, repetitive sequence is
much more highly methylated than coding sequence. Coding/unique sequence
can therefore be enriched by exploiting this difference in methylation
pattern. See U.S. Pat. No. 6,017,704 for methods of mapping and
assessment of DNA methylation patterns in CG islands. Some restriction
endonucleases are sensitive to the presence of methylated cytosine
residues in their recognition site. Such methylation sensitive
restriction endonucleases may not cleave at their recognition site if the
cytosine residue in either an overlapping 5'-CG-3' or an overlapping
5'-CNG-3' is methylated. Methylation sensitive restriction endonucleases
include the 4 base cutters: Aci I, Hha I, HinP1 I, HpaII and Msp I, the 6
base cutters: Apa I, Age I, Bsr F I, BssH II, Eag I, Eae I, MspM II, Nar
I, Pst I, Pvu I, Sac II, Sma I, Stu I and Xho I and the 8 base cutter:
Not I. For example, DNA cleavage at the site CTGCAG by Pst I is inhibited
when the C residues are methylated. In order to enrich for coding/unique
sequence, maize libraries can be constructed from genomic DNA digested
with Pst I (or other methylation sensitive enzymes), and size
fractionated by agarose gel electrophoresis. Regions of the genome which
are heavily methylated (i.e., regions with a high fraction of repetitive
sequences) have a higher number of Pst I sites that are methylated.
Therefore, most of the Pst I sites in repetitive DNA will not be cleaved
during Pst I digestion, and the repetitive sequence will tend to consist
mostly of high molecular weight, uncleaved DNA. In contrast, regions of
the genome that are not heavily methylated (i.e. regions containing a
large fraction of coding/unique sequence) should contain a large fraction
of unmethylated Pst I sites which will be cleaved during digestion,
producing relatively smaller fragments. When digested DNA is
electrophoresed through agarose, relatively larger fragments from heavily
methylated, non-coding DNA regions are separated from relatively smaller
fragments derived from coding/unique sequence. Coding region-enriched DNA
fragments (commonly between 500-3000 bp) can be excised from the gel,
purified and ligated into a Pst I digested vector, e.g. pUC18. The
ligation products are transformed by electroporation into a plurality of
suitable bacterial hosts, e.g. DH10B, to produce a library of clones
enriched for coding/unique sequence. Individual clones can be sequenced
to provide the sequence of the inserted coding region DNA.
[0059] In order to reduce the sequence complexity of any particular
library, the DNA in the range 500 to 10,000 bp can be further
size-fractionated by incrementally excising fragments from the gel.
Useful ranges of size-fractionated fragments include 500-600 bp, 600-700
bp, 700-800 bp, 800-900 bp, 900-1100 bp, 1100-1500 bp, 1500-2000 bp,
2000-2500 bp and 2500-3000 bp. A series of size-fractionated reduced
representation libraries are constructed by ligating purified DNA from
each size fraction separately to the vector. A small sample of clones
from each library (for example about 400 clones) is sequenced to
determine the fraction of repetitive sequence present in each particular
library. Comparison of reduced representation libraries prepared from a
variety of different maize lines indicates that many fractions contain
less than 10% repetitive sequence and some fractions contain more than
20% repetitive sequence. Preferred reduced representation libraries
contain less than 20% repetitive sequence, more preferably less than 15%
repetitive sequence and even more preferably less than 10% repetitive
sequence. By determining the fraction of repetitive sequence throughout
the whole series of size fractionated reduced representation libraries,
the libraries with the smallest fraction of repetitive sequence can be
selected for deep sequencing (usually 10,000-20,000 clones). Since the
purpose of obtaining sequence is for polymorphism detection, the
equivalent libraries representing the same size fraction for both maize
strains are sequenced. Another advantage of using reduced representation
libraries for polymorphism detection is that it increases the probability
of recovering the equivalent sequences from both maize lines.
Polymorphisms can only be detected if the equivalent sequence is
available from both lines.
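The library selection step described above can be sketched as follows. The input layout (a mapping from size-fraction label to counts from the small sequenced sample), the function name and the example counts are hypothetical; the default cut-off of 10% repetitive sequence follows the most preferred value stated above.

```python
def select_libraries(sample_counts, max_repeat_fraction=0.10):
    """Rank size-fractionated libraries by sampled repetitive fraction.

    sample_counts maps a size-fraction label to a tuple
    (repetitive_clones, clones_sequenced) from the small test sample.
    Returns labels whose repetitive fraction is below the cut-off,
    ordered from lowest to highest repetitive fraction.
    """
    fractions = {
        label: repetitive / total
        for label, (repetitive, total) in sample_counts.items()
        if total > 0
    }
    chosen = [label for label, f in fractions.items() if f < max_repeat_fraction]
    return sorted(chosen, key=lambda label: fractions[label])


# Hypothetical counts from samples of about 400 clones per library.
example = {
    "500-600 bp": (32, 400),
    "600-700 bp": (90, 400),
    "700-800 bp": (20, 400),
}
```

With these made-up counts, the 700-800 bp and 500-600 bp libraries would be selected for deep sequencing, and the same size fractions would then be sequenced for both maize lines.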
b. McrBC Endonuclease
[0060] An alternative method for enriching coding region DNA sequence
uses McrBC endonuclease restriction. As a defense against
invading foreign DNA from phage/viruses, E. coli contain endonucleases,
e.g. McrBC endonuclease, which cleave methylated cytosine-containing DNA.
This feature can be exploited to enrich DNA with regions of the genome
which are not heavily methylated, e.g. the presumed coding region DNA.
Reduced representation libraries can be constructed using genomic DNA
fragments which are cleaved by physical shearing or digestion with any
restriction enzyme. DNA fragments are transformed into an E. coli host
that contains an McrBC endonuclease, e.g. E. coli strain JM107 or DH5α.
When the bacterial host is transformed with a DNA fragment which contains
a methylated DNA region, the McrBC endonuclease will cleave the inserted
DNA and the plasmid will not be propagated. When the bacterial host is
transformed with a DNA fragment that is not methylated, the plasmid will
be propagated, and a colony will grow on the agar plate allowing the
clone to be sequenced. A small sample of clones from libraries generated
in this manner is sequenced, and the fraction of repetitive sequence is
determined. McrBC endonuclease can also be used with methylated cytosine
sensitive endonuclease to further reduce the fraction of repetitive
sequence in libraries that are not suitable for sequencing, e.g.
libraries that contain more than 15% repetitive sequence.
c. Microarraying Reduced Representation Libraries
[0061] Another method to enrich for coding/unique sequence is to construct
reduced representation libraries (using methylation sensitive or
non-methylation sensitive enzymes), print microarrays of the library on
nylon membrane, and hybridize with probes made from repetitive elements
known to be present in the library. The repetitive sequence elements are
identified, and the library is re-arrayed by picking only the negative
clones. This process is performed by randomly picking clones from a
reduced representation library into 384-well plates and culturing them.
Micro-arrays can be prepared by printing clone DNA from the collection of
384-well plates in determined patterns on supports, such as glass
supports or nylon membranes. The fabrication of microarrays comprising
thousands of distinct clones, e.g. up to about 25,000 clones or more, is
well known in the art. See for instance, U.S. Pat. No. 5,807,522 for
methods for fabricating microarrays of spotted polynucleotides at high
density. A small sample of clones from the reduced representation
library, e.g. about 400 clones, can be sequenced to identify repetitive
sequence elements. Clones containing the repetitive sequences are
retrieved, and the clones used to make radioactive probes which are
hybridized on the nylon arrays. Radioactive isotope label elements
include ³²P, ³³P, ³⁵S, ¹²⁵I, and the like with
³³P being especially preferred. The arrays are analyzed for
hybridization by detecting radiation, e.g. using a Fuji Phosphoimager™
imaging screen. After an appropriate exposure time the array image is
read as a digital file representing the hybridization intensity from each
array element which is proportional to the amount of labeled repeat sequence.
This radiation image identifies all the clones on the array which
correspond to repetitive sequence clones, and also identifies the
384-well plate and well location of each repetitive sequence clone.
[0062] With this information, all the non-repetitive sequence clones can
be picked from the original plates and relocated onto a new set of plates
which do not contain repetitive sequence clones. This method can be used
to lower the fraction of repetitive sequence in reduced representation
libraries from approximately 25% to about 1-2%.
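The re-arraying step can be sketched as follows. The sketch assumes that each array element is keyed by its source plate number and a 0-based well index in row-major order, and that a clone is treated as repetitive when its hybridization intensity meets a user-chosen threshold; the data layout, names and threshold are illustrative assumptions.

```python
import string

ROWS = string.ascii_uppercase[:16]  # rows A..P of a 384-well plate
COLUMNS = 24                        # columns 1..24
WELLS = len(ROWS) * COLUMNS         # 384


def well_name(index):
    """Convert a 0-based, row-major well index (assumed layout) to e.g. 'A1'."""
    if not 0 <= index < WELLS:
        raise ValueError("well index must be in the range 0..383")
    row, column = divmod(index, COLUMNS)
    return f"{ROWS[row]}{column + 1}"


def pick_non_repetitive(intensities, threshold):
    """Assign new plate positions to clones with signal below the threshold.

    intensities maps (plate, well_index) to hybridization intensity.
    Returns a list of ((old_plate, old_well), (new_plate, new_well)) pairs.
    """
    negatives = sorted(key for key, signal in intensities.items() if signal < threshold)
    rearrayed = []
    for new_index, (plate, well) in enumerate(negatives):
        new_plate, new_well = divmod(new_index, WELLS)
        rearrayed.append(((plate, well_name(well)), (new_plate + 1, well_name(new_well))))
    return rearrayed
```

Clones whose signal meets the threshold are left behind on the original plates, so the new plates contain only clones that did not hybridize to the repeat probes.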
C. Detecting Polymorphisms
[0063] Polymorphisms in DNA sequences can be detected by a variety of
effective methods well known in the art including those disclosed in U.S.
Pat. Nos. 5,468,613; 5,217,863; 5,210,015; 5,876,930; 6,030,787;
6,004,744; 6,013,431; 5,595,890; 5,762,876; 5,945,283;
6,090,558; 5,800,944 and 5,616,464, all of which are incorporated herein
by reference in their entireties. For instance, polymorphisms in DNA
sequences can be detected by hybridization to allele-specific
oligonucleotide (ASO) probes as disclosed in U.S. Pat. Nos. 5,468,613 and
5,217,863. The nucleotide sequence of an ASO probe is designed to form
either a perfectly matched hybrid or to contain a mismatched base pair at
the site of the variable nucleotide residues. The distinction between a
matched and a mismatched hybrid is based on differences in the thermal
stability of the hybrids in the conditions used during hybridization or
washing, differences in the stability of the hybrids analyzed by
denaturing gradient electrophoresis or chemical cleavage at the site of
the mismatch.
[0064] U.S. Pat. No. 5,468,613 discloses allele specific oligonucleotide
hybridizations where single or multiple nucleotide variations in nucleic
acid sequence can be detected in nucleic acids by a process in which the
sequence containing the nucleotide variation is amplified, spotted on a
membrane and treated with a labeled sequence-specific oligonucleotide
probe.
[0065] Length variation in DNA nucleotide sequence repeats such as
microsatellites, simple sequence repeats (SSRs) and short tandem repeats
(STRs) can be detected by mass spectroscopy methods as disclosed in U.S.
Pat. No. 6,090,558. The advantages of using mass spectrometry include a
dramatic increase in both the speed of analysis (a few seconds per
sample) and the accuracy of direct mass measurements.
[0066] Target nucleic acid sequence can also be detected by probe ligation
methods as disclosed in U.S. Pat. No. 5,800,944 where sequence of
interest is amplified and hybridized to probes followed by ligation to
detect a labeled part of the probe.
[0067] Target nucleic acid sequence can also be detected by probe linking
methods as disclosed in U.S. Pat. No. 5,616,464 employing at least one
pair of probes having sequences homologous to adjacent portions of the
target nucleic acid sequence and having side chains which non-covalently
bind to form a stem upon base pairing of said probes to said target
nucleic acid sequence. At least one of the side chains has a
photoactivatable group which can form a covalent cross-link with the
other side chain member of the stem.
a. Primer Base Extension Assay
[0068] A preferred method for detecting SNPs and Indels is a labeled base
extension method as disclosed in U.S. Pat. Nos. 6,004,744; 6,013,431;
5,595,890; 5,762,876; and 5,945,283. These methods are based on primer
extension and incorporation of detectable nucleoside triphosphates. The
primer is designed to anneal to the sequence immediately adjacent to the
variable nucleotide which can be detected after incorporation of
as few as one labeled nucleoside triphosphate. The method uses three
synthetic oligonucleotides. Two of the oligonucleotides serve as PCR
primers and are complementary to sequence of the locus of maize genomic
DNA which flanks a region containing the polymorphism to be assayed.
Using maize genomic DNA as a template the primer oligonucleotides are
used in PCR to produce sufficient copies of the region of the locus
containing the polymorphisms so that allelic discrimination can be
conducted. Following amplification of the region of the maize genome
containing the polymorphism, the PCR product is mixed with the third
oligonucleotide (called an extension primer) which is designed to
hybridize to the amplified DNA immediately adjacent to the polymorphism
in the presence of DNA polymerase and two differentially labeled
dideoxynucleosidetriphosphates. If the polymorphism is present on the
template, one of the labeled dideoxynucleosidetriphosphates can be added
to the primer in a single base chain extension. The allele present is
then inferred by determining which of the two differential labels was
added to the extension primer. Homozygous samples will result in only one
of the two labeled bases being incorporated and thus only one of the two
labels will be detected. Heterozygous samples have both alleles present,
and will thus direct incorporation of both labels (into different
molecules of the extension primer) and thus both labels will be detected.
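The inference of genotype from the two differential labels can be sketched as follows. The background and minimum-signal values, the function name and the returned strings are illustrative assumptions; suitable thresholds would depend on the labels and detection reagents used.

```python
def call_genotype(signal_1, signal_2, background=0.0, min_signal=1.0):
    """Call a genotype from the signals of two differentially labeled ddNTPs.

    A label is treated as detected when its background-corrected signal
    reaches min_signal (both values are assumed, assay-specific settings).
    """
    detected_1 = (signal_1 - background) >= min_signal
    detected_2 = (signal_2 - background) >= min_signal
    if detected_1 and detected_2:
        return "heterozygous"            # both labels incorporated
    if detected_1:
        return "homozygous allele 1"     # only label 1 incorporated
    if detected_2:
        return "homozygous allele 2"     # only label 2 incorporated
    return "no call"                     # neither label detected
```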
[0069] To design primers for maize polymorphism detection by single base
extension the sequence of the locus is first masked to prevent design of
any of the three primers to sites that match known maize repetitive
elements (e.g., transposons) or are of very low sequence complexity (di-
or tri-nucleotide repeat sequences). Design of primers to such repetitive
elements will result in assays of low specificity, through amplification
of multiple loci or annealing of the extension primer to multiple sites.
[0070] PCR primers are preferably designed (a) to have an optimal
annealing temperature for PCR in the range of 55 to 60° C., (b) to
have lengths in the range of 18 to 25 bases, and (c) to produce a product
in the size range 75 to 200 base pairs with the polymorphism to be
assayed located at least 25 bases from the 3' end of each primer. The
PCR primers must be chosen to contain minimal self- or inter-primer
complementarity, or the efficiency and/or specificity of the PCR reaction
will be reduced.
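Criteria (b) and (c) above can be checked with a short sketch such as the following. The coordinate convention (0-based positions on the top strand of the locus sequence) and all names are assumptions made for illustration; criterion (a), the annealing temperature, and primer complementarity are not evaluated here.

```python
def check_pcr_primer_layout(fwd_start, fwd_len, rev_start, rev_len, snp_pos):
    """Return a list of rule violations (empty if the layout passes).

    Positions are 0-based on the top strand (an assumed convention). The
    forward primer covers [fwd_start, fwd_start + fwd_len). The reverse
    primer anneals over [rev_start, rev_start + rev_len) on the opposite
    strand, so its 3' end lies at rev_start.
    """
    problems = []
    for name, length in (("forward", fwd_len), ("reverse", rev_len)):
        if not 18 <= length <= 25:
            problems.append(f"{name} primer length {length} is outside 18-25 bases")
    product_size = rev_start + rev_len - fwd_start
    if not 75 <= product_size <= 200:
        problems.append(f"product size {product_size} is outside 75-200 base pairs")
    fwd_three_prime_end = fwd_start + fwd_len - 1
    if snp_pos - fwd_three_prime_end < 25:
        problems.append("polymorphism is less than 25 bases from the forward primer 3' end")
    if rev_start - snp_pos < 25:
        problems.append("polymorphism is less than 25 bases from the reverse primer 3' end")
    return problems
```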
[0071] The extension primer is designed to anneal immediately adjacent to
the polymorphism, such that the 3' end of the annealed extension primer
immediately abuts the polymorphic site. The extension primer can lie
either to the 5' or 3' side of the polymorphism; however, if it is
designed to lie on the 3' side, then the sequence of the extension primer
must match the reverse complement of the sequence adjacent to the
polymorphism. The extension primer must contain no self-complementarity
that will enable self-annealing, or the incorporation of the labeled
ddNTPs may result from self-priming of the extension primer, obscuring
the results of polymorphism-directed incorporation. If the nature of the
sequence adjacent to the polymorphic site makes it impossible to design
an extension primer that is fully non-self-complementary, the extent of
self-annealing may be limited by replacing one or two bases of the
extension primer with abasic sites, as long as the abasic sites are not
introduced into the three 3' most positions.
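The placement of the extension primer on either side of the polymorphism can be sketched as follows. The sketch assumes a single-nucleotide polymorphism at a 0-based position in the top-strand locus sequence; the function names are hypothetical, and the check for self-complementarity described above is not included.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Reverse complement of a DNA sequence written with A, C, G and T."""
    return sequence.translate(_COMPLEMENT)[::-1]


def extension_primer(locus, snp_pos, length, side):
    """Return an extension primer whose 3' end abuts the polymorphic site.

    locus   : top-strand sequence of the locus
    snp_pos : 0-based position of the single polymorphic base (assumed)
    side    : "5'" for a primer lying 5' of the polymorphism, or "3'" for
              one lying 3' of it (reverse complement of the adjacent sequence)
    """
    if side == "5'":
        start = snp_pos - length
        if start < 0:
            raise ValueError("not enough sequence 5' of the polymorphism")
        return locus[start:snp_pos]
    if side == "3'":
        end = snp_pos + 1 + length
        if end > len(locus):
            raise ValueError("not enough sequence 3' of the polymorphism")
        return reverse_complement(locus[snp_pos + 1:end])
    raise ValueError("side must be \"5'\" or \"3'\"")
```

In both orientations the last base of the returned primer pairs with the nucleotide immediately next to the polymorphic site, so the first base added by the polymerase is the one opposite the polymorphism.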
[0072] The labeled ddNTPs chosen for inclusion in the reaction are
determined by the nature of the polymorphism and by whether the extension
primer lies 5' or 3' of the polymorphism. If the extension
primer is located 5' of the polymorphism, then the ddNTPs are those of
the polymorphism itself. For example, in the case of an A/G polymorphism,
the ddNTPs would be ddATP-label(1) and ddGTP-label(2). If the extension
primer lies 3' of the polymorphic site, then the ddNTPs are the
complements of the bases involved in the polymorphism; in the present
example, ddTTP-label(1) and ddCTP-label(2). Labels can be chosen from
among a wide variety of chemical moieties, including affinity or
immunological labels, fluorescent dyes and mass tags. In the most common
embodiment of the process, affinity and immunological labels are used,
followed by appropriate detection reagents. In the present example,
ddATP-FITC and ddGTP-biotin might be employed, followed by incubation
with anti-FITC-antibody conjugated to the enzyme horseradish peroxidase
(HRP-anti-FITC), and streptavidin conjugated to the enzyme alkaline
phosphatase (AP-streptavidin).
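The choice of ddNTPs according to primer orientation can be sketched as follows. The function name and the label(1)/label(2) output strings are illustrative only; they mirror the notation of the example above.

```python
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def ddntps_for_assay(alleles, primer_side):
    """List the labeled ddNTPs for a polymorphism and primer orientation.

    alleles     : the polymorphic bases as read on the locus sequence
    primer_side : "5'" if the extension primer lies 5' of the polymorphism,
                  "3'" if it lies 3' of it (the complements are then used)
    """
    if primer_side == "5'":
        bases = list(alleles)
    elif primer_side == "3'":
        bases = [_COMPLEMENT[base] for base in alleles]
    else:
        raise ValueError("primer_side must be \"5'\" or \"3'\"")
    return [f"dd{base}TP-label({i})" for i, base in enumerate(bases, start=1)]
```

For the A/G example, a primer lying 5' of the polymorphism gives ddATP-label(1) and ddGTP-label(2), and a primer lying 3' gives ddTTP-label(1) and ddCTP-label(2).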
b. Labeled Probe Degradation Assay
[0073] In another preferred method for detecting polymorphisms, SNPs and
Indels can be detected by methods disclosed in U.S. Pat. Nos. 5,210,015;
5,876,930 and 6,030,787, which employ an oligonucleotide probe having a
fluorescent reporter dye and a quencher dye covalently linked to the
5' and 3' ends of the probe, respectively. When the probe is intact, the proximity of
the reporter dye to the quencher dye results in the suppression of the
reporter fluorescence, e.g. by Förster-type energy transfer. During PCR
forward and reverse primers hybridize to a specific sequence of the
target DNA flanking a polymorphism. The hybridization probe hybridizes to
polymorphism-containing sequence within the amplified PCR product. In the
subsequent PCR cycle DNA polymerase with 5'→3' exonuclease
activity cleaves the probe and separates the reporter dye from the
quencher dye resulting in increased fluorescence of the reporter. A
useful assay is available from AB Biosystems as the Taqman® assay
which employs four synthetic oligonucleotides in a single reaction that
concurrently amplifies the maize genomic DNA, discriminates between the
alleles present, and directly provides a signal for discrimination and
detection. Two of the four oligonucleotides serve as PCR primers and
generate a PCR product encompassing the polymorphism to be detected. Two
others are allele-specific fluorescence-resonance-energy-transfer (FRET)
probes. FRET probes incorporate a fluorophore and a quencher molecule in
close proximity so that the fluorescence of the fluorophore is quenched.
The signal from a FRET probe is generated by degradation of the FRET
oligonucleotide, so that the fluorophore is released from proximity to
the quencher, and is thus able to emit light when excited at an
appropriate wavelength. In the assay, two FRET probes bearing different
fluorescent reporter dyes are used, where a unique dye is incorporated
into an oligonucleotide that can anneal with high specificity to only one
of the two alleles. Useful reporter dyes include
6-carboxy-4,7,2',7'-tetrachlorofluorescein (TET), VIC and
6-carboxyfluorescein phosphoramidite (FAM). A useful quencher is
6-carboxy-N,N,N',N'-tetramethylrhodamine (TAMRA). Additionally, the 3'
end of each FRET probe is chemically blocked so that it cannot act as a
PCR primer. During the assay, maize genomic DNA is added to a buffer
containing the two PCR primers and two FRET probes. Also present is a
third fluorophore used as a passive reference, e.g., rhodamine X (ROX) to
aid in later normalization of the relevant fluorescence values
(correcting for volumetric errors in reaction assembly). Amplification of
the genomic DNA is initiated. During each cycle of the PCR, the FRET
probes anneal in an allele-specific manner to the template DNA molecules.
Annealed (but not non-annealed) FRET probes are degraded by Taq DNA
polymerase as the enzyme encounters the 5' end of the annealed probe,
thus releasing the fluorophore from proximity to its quencher. Following
the PCR reaction, the fluorescence of each of the two fluorescers, as
well as that of the passive reference, is determined fluorometrically.
The normalized intensity of fluorescence for each of the two dyes will be
proportional to the amounts of each allele initially present in the
sample, and thus the genotype of the sample can be inferred.
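The normalization and genotype inference described above can be sketched as follows. Each reporter signal is divided by the passive reference signal, and the genotype is called from the share of the normalized signal contributed by the first dye. The cut-off values and names are illustrative assumptions rather than parameters of any commercial assay.

```python
def call_taqman_genotype(dye_1, dye_2, reference,
                         min_total=0.5, homozygous_fraction=0.8):
    """Call a genotype from two reporter signals and a passive reference.

    dye_1, dye_2        : end-point fluorescence of the two reporter dyes
    reference           : fluorescence of the passive reference (e.g. ROX)
    min_total           : assumed minimum normalized signal for a call
    homozygous_fraction : assumed share of signal required to call a
                          sample homozygous for one allele
    """
    if reference <= 0:
        raise ValueError("passive reference signal must be positive")
    normalized_1 = dye_1 / reference
    normalized_2 = dye_2 / reference
    total = normalized_1 + normalized_2
    if total < min_total:
        return "no call"                 # amplification too weak to interpret
    fraction_1 = normalized_1 / total
    if fraction_1 >= homozygous_fraction:
        return "homozygous allele 1"
    if fraction_1 <= 1.0 - homozygous_fraction:
        return "homozygous allele 2"
    return "heterozygous"
```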
[0074] To design primers and probes for the assay the locus sequence is
first masked to prevent design of any of the primers or probes to sites that
match known maize repetitive elements (e.g., transposons) or are of very
low sequence complexity (di- or tri-nucleotide repeat sequences). Design
of primers to such repetitive elements will result in assays of low
specificity, through amplification of multiple loci or annealing of the
FRET probes to multiple sites.
[0075] PCR primers are designed (a) to have a length in the size range of
18 to 25 bases and matching sequences in the polymorphic locus, (b) to
have a calculated melting temperature in the range of 57 to 60°
C., e.g. corresponding to an optimal PCR annealing temperature of 52 to
55° C., and (c) to produce a product which includes the polymorphic site and
has a length in the size range of 75 to 250 base pairs. The PCR primers
are preferably located on the locus so that the polymorphic site is at
least one base away from the 3' end of each PCR primer. The PCR primers
must not contain regions that are extensively self- or
inter-complementary.
[0076] FRET probes are designed to span the sequence of the polymorphic
site, preferably with the polymorphism located in the 3' most 2/3 of the
oligonucleotide. In the preferred embodiment, the FRET probes will have
incorporated at their 3' end a chemical moiety which, when the probe is
annealed to the template DNA, binds to the minor groove of the DNA, thus
enhancing the stability of the probe-template complex. The probes should
have a length in the range of 12 to 17 bases and, with the 3' minor
groove binder (MGB), have a calculated melting temperature of 5 to 7° C.
above that of the PCR
primers. Probe design is disclosed in U.S. Pat. Nos. 5,538,848; 6,084,102
and 6,127,121.
D. Use of Polymorphisms to Establish Marker/Trait Associations
[0077] The polymorphisms in the loci of this invention can be used in
marker/trait associations which are inferred from statistical analysis of
genotypes and phenotypes of the members of a population. These members
may be individual organisms, e.g. maize, families of closely related
individuals, inbred lines, dihaploids or other groups of closely related
individuals. Such maize groups are referred to as "lines", indicating
line of descent. The population may be descended from a single cross
between two individuals or two lines (e.g. a mapping population) or it
may consist of individuals with many lines of descent. Each individual or
line is characterized by a single or average trait phenotype and by the
genotypes at one or more marker loci.
[0078] Several types of statistical analysis can be used to infer
marker/trait association from the phenotype/genotype data, but a basic
idea is to detect markers, i.e. polymorphisms, for which alternative
genotypes have significantly different average phenotypes. For example,
if a given marker locus A has three alternative genotypes (AA, Aa and
aa), and if those three classes of individuals have significantly
different phenotypes, then one infers that locus A is associated with the
trait. The significance of differences in phenotype may be tested by
several types of standard statistical tests such as linear regression of
marker genotypes on phenotype or analysis of variance (ANOVA).
Commercially available, statistical software packages commonly used to do
this type of analysis include SAS Enterprise Miner (SAS Institute Inc.,
Cary, N.C.) and Splus (Insightful Corporation, Cambridge, Mass.). When
many markers are tested simultaneously, an adjustment such as Bonferroni
correction is made in the level of significance required to declare an
association.
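The test for different average phenotypes among genotype classes can be sketched with a one-way ANOVA F statistic and a Bonferroni-adjusted significance level. This is a plain illustration that uses only the Python standard library; converting F to a p-value requires an F distribution from a statistics package and is omitted. The names and example phenotype values are hypothetical.

```python
def one_way_anova_f(groups):
    """F statistic for phenotype values grouped by marker genotype.

    groups is a list of lists, one per genotype class (e.g. AA, Aa, aa).
    """
    groups = [list(g) for g in groups if len(g) > 0]
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or n <= k:
        raise ValueError("need at least two classes and more values than classes")
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-class variation; F is undefined")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def bonferroni_threshold(alpha, n_tests):
    """Per-marker significance level when n_tests markers are tested."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


# Hypothetical phenotypes for genotype classes AA, Aa and aa.
f_statistic = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
```

A large F statistic indicates that the class means differ by more than the within-class variation would suggest. With 100 markers tested at an overall level of 0.05, each marker would need a p-value below 0.0005 to be declared associated.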
[0079] Often the goal of an association study is not simply to detect
marker/trait associations, but to estimate the location of genes
affecting the trait directly (i.e. QTLs) relative to the marker
locations. In a simple approach to this goal, one makes a comparison
among marker loci of the magnitude of difference among alternative
genotypes or the level of significance of that difference. Trait genes
are inferred to be located nearest the marker(s) that have the greatest
associated genotypic difference. In a more complex analysis, such as
interval mapping (Lander and Botstein, Genetics 121:185-199 (1989)), each
of many positions along the genetic map (say at 1 cM intervals) is tested
for the likelihood that a QTL is located at that position. The
genotype/phenotype data are used to calculate for each test position a
LOD score (log of likelihood ratio). When the LOD score exceeds a
critical threshold value, there is significant evidence for the location
of a QTL at that position on the genetic map (which will fall between two
particular marker loci).
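A LOD score can be illustrated with a simplified calculation. The sketch
below is not the interval mapping method of Lander and Botstein, which
evaluates positions between markers; it assumes the test position
coincides with a marker and that phenotypes are normally distributed
within each genotype class, in which case the log10 likelihood ratio
reduces to (n/2) log10(RSS_null/RSS_qtl). The function name and the
data are hypothetical.

```python
import math
from statistics import mean


def single_marker_lod(phenotypes_by_genotype):
    """LOD score for a QTL at a marker, under a normal-error model."""
    groups = list(phenotypes_by_genotype.values())
    values = [v for g in groups for v in g]
    n = len(values)
    grand = mean(values)
    # Residual sum of squares with no QTL: one common mean.
    rss_null = sum((v - grand) ** 2 for v in values)
    # Residual sum of squares with a QTL: one mean per genotype class.
    # (A perfect fit would give zero here; this sketch does not handle it.)
    rss_qtl = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (n / 2.0) * math.log10(rss_null / rss_qtl)


# Hypothetical phenotypes grouped by genotype at the test position.
lod = single_marker_lod({
    "AA": [10.0, 11.0, 12.0],
    "Aa": [13.0, 14.0, 15.0],
    "aa": [16.0, 17.0, 18.0],
})
```

The resulting score would then be compared against whatever critical
threshold has been chosen for the study.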
a. Linkage Disequilibrium Mapping and Association Studies
[0080] Another approach to determining trait gene location is to analyze
trait-marker associations in a population within which individuals differ
at both trait and marker loci. Certain marker alleles may be associated
with certain trait locus alleles in this population due to population
genetic processes such as the unique origin of mutations, founder events,
random drift and population structure. This association is referred to as
linkage disequilibrium. In linkage disequilibrium mapping, one compares
the trait values of individuals with different genotypes at a marker
locus. Typically, a significant trait difference indicates close
proximity between marker locus and one or more trait loci. If the marker
density is appropriately high and the linkage disequilibrium occurs only
between very closely linked sites on a chromosome, the location of trait
loci can be very precise.
[0081] A specific type of linkage disequilibrium mapping is known as
association studies. This approach makes use of markers within candidate
genes, which are genes that are thought to be functionally involved in
development of the trait because of information such as biochemistry,
physiology, transcriptional profiling and reverse genetic experiments in
model organisms. In association studies, markers within candidate genes
are tested for association with trait variation. If linkage
disequilibrium in the study population is restricted to very closely
linked sites (i.e. within a gene or between adjacent genes), a positive
association provides nearly conclusive evidence that the candidate gene
is a trait gene.
b. Positional Cloning and Transgenic Applications
[0082] Traditional linkage mapping typically localizes a trait gene to an
interval between two genetic markers (referred to as flanking markers).
When this interval is relatively small (say less than 1 Mb), it becomes
feasible to precisely identify the trait gene by a positional cloning
procedure. A high marker density is required to narrow down the interval
length sufficiently. This procedure requires a library of large insert
genomic clones (such as a BAC library), where the inserts are pieces
(usually 100-150 kb in length) of genomic DNA from the species of
interest. The library is screened by probe hybridization or PCR to
identify clones that contain the flanking marker sequences. Then a series
of partially overlapping clones that connects the two flanking clones (a
"contig") is built up through physical mapping procedures. These
procedures include fingerprinting, STS content mapping and
sequence-tagged connector methodologies. Once the physical contig is
constructed and sequenced, the sequence is searched for all
transcriptional units. The transcriptional unit that corresponds to the
trait gene can be determined by comparing sequences between mutant and
wild type strains, by additional fine-scale genetic mapping, and/or by
functional testing through plant transformation. Trait genes identified
in this way become leads for transgenic product development. Similarly,
trait genes identified by association studies with candidate genes become
leads for transgenic product development.
c. Marker-Aided Breeding and Marker-Assisted Selection
[0083] When a trait gene has been localized in the vicinity of genetic
markers, those markers can be used to select for improved values of the
trait without the need for phenotypic analysis at each cycle of
selection. In marker-aided breeding and marker-assisted selection,
associations between trait genes and markers are established initially
through genetic mapping analysis (as in A.1 or A.2). In the same process,
one determines which marker alleles are linked to favorable trait gene
alleles. Subsequently, marker alleles associated with favorable trait
gene alleles are selected in the population. This procedure will improve
the value of the trait provided that there is sufficiently close linkage
between markers and trait genes. The degree of linkage required depends
upon the number of generations of selection because, at each generation,
there is opportunity for breakdown of the association through
recombination.
Prediction of Crosses for New Inbred Line Development
[0084] The associations between specific marker alleles and favorable
trait gene alleles also can be used to predict what types of progeny may
segregate from a given cross. This prediction may allow selection of
appropriate parents to generate populations from which new combinations
of favorable trait gene alleles are assembled to produce a new inbred
line. For example, if line A has marker alleles previously known to be
associated with favorable trait alleles at loci 1, 20 and 31, while line
B has marker alleles associated with favorable effects at loci 15, 27 and
29, then a new line could be developed by crossing A.times.B and
selecting progeny that have favorable alleles at all 6 trait loci.
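The line A by line B example can be expressed as a short calculation.
This is a sketch under stated assumptions: the six loci are treated as
unlinked, and each fully inbred (or doubled haploid) progeny line is
assumed to fix the allele of either parent at each locus with equal
probability. The names used are hypothetical.

```python
# Loci at which each parent carries the favorable marker allele.
line_a_loci = {1, 20, 31}
line_b_loci = {15, 27, 29}
target_loci = line_a_loci | line_b_loci


def has_all_favorable(progeny_favorable_loci, targets):
    """True if a progeny line carries the favorable allele at every target."""
    return targets <= set(progeny_favorable_loci)


# Assumption: unlinked loci, each fixed for either parent with chance 1/2.
expected_fraction = 0.5 ** len(target_loci)
```

Under these assumptions about 1 progeny line in 64 would carry all six
favorable alleles, which gives a rough idea of the population size
needed before selection.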
d. Hybrid Prediction
[0085] Commercial corn seed is produced by making hybrids between two
elite inbred lines that belong to different "heterotic groups". These
groups are sufficiently distinct genetically that hybrids between them
show high levels of heterosis or hybrid vigor (i.e. increased performance
relative to the parental lines). By analyzing the marker constitution of
good hybrids, one can identify sets of alleles at different loci in both
male and female lines that combine well to produce heterosis.
Understanding these patterns, and knowing the marker constitution of
different inbred lines, can allow prediction of the level of heterosis
between different pairs of lines. These predictions can narrow down the
possibilities of which line(s) of opposite heterotic group should be used
to test the performance of a new inbred line.
e. Identity by Descent
[0086] One theory of heterosis predicts that regions of identity by
descent (IBD) between the male and female lines used to produce a hybrid
will reduce hybrid performance. Identity by descent can be inferred from
patterns of marker alleles in different lines. An identical string of
markers at a series of adjacent loci may be considered identical by
descent if it is unlikely to occur independently by chance. Analysis of
marker fingerprints in male and female lines can identify regions of IBD.
Knowledge of these regions can inform the choice of hybrid parents, since
avoiding IBD in hybrids is likely to improve performance. This knowledge
may also inform breeding programs in that crosses could be designed to
produce pairs of inbred lines (one male and one female) that show little
or no IBD.
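One way to look for candidate IBD regions is to scan two fingerprints
for runs of identical alleles at adjacent loci. The sketch below assumes
each fingerprint is a sequence of allele calls ordered along a
chromosome; the minimum run length is a hypothetical stand-in for the
requirement that the match be unlikely to occur by chance, and the
function name is illustrative only.

```python
def shared_runs(fingerprint_a, fingerprint_b, min_run):
    """Return inclusive (start, end) index pairs of runs of identical
    alleles at adjacent loci that are at least min_run loci long."""
    runs, start = [], None
    n = min(len(fingerprint_a), len(fingerprint_b))
    for i in range(n):
        if fingerprint_a[i] == fingerprint_b[i]:
            if start is None:
                start = i          # a new run of matches begins here
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the last locus.
    if start is not None and n - start >= min_run:
        runs.append((start, n - 1))
    return runs


# Hypothetical allele calls at ten adjacent marker loci.
male = "AACCGGTTAC"
female = "AACCGGATAG"
candidate_ibd = shared_runs(male, female, min_run=5)
```

In practice the run-length threshold would depend on marker density and
allele frequencies rather than on a fixed count.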
[0087] A fingerprint of an inbred line is the combination of alleles at a
set of marker loci. High density fingerprints can be used to establish
and trace the identity of germplasm, which has utility in germplasm
ownership protection.
[0088] Genetic markers are used to accelerate introgression of transgenes
into new genetic backgrounds (i.e. into a diverse range of germplasm).
Simple introgression involves crossing a transgenic line to an elite
inbred line and then backcrossing the hybrid repeatedly to the elite
(recurrent) parent, while selecting for maintenance of the transgene.
Over multiple backcross generations, the genetic background of the
original transgenic line is replaced gradually by the genetic background
of the elite inbred through recombination and segregation. This process
can be accelerated by selection on marker alleles that derive from the
recurrent parent.
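The gradual replacement of the donor background can be quantified with
the standard expectation for backcrossing without marker selection:
after n backcrosses the expected fraction of the genome derived from the
recurrent parent is 1 - (1/2)^(n+1). The sketch below only evaluates
that expectation; the function name is hypothetical, and marker
selection is expected to reach a given fraction in fewer generations.

```python
def expected_recurrent_fraction(n_backcrosses):
    """Expected recurrent-parent genome fraction without marker selection.

    n_backcrosses = 0 corresponds to the initial hybrid (F1).
    """
    return 1.0 - 0.5 ** (n_backcrosses + 1)


# Expected recovery for the F1 and the first six backcross generations.
recovery = [expected_recurrent_fraction(n) for n in range(7)]
```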
E. Use of Polymorphism Assay for Mapping a Library of DNA Clones
[0089] The polymorphisms and loci of this invention are useful for
identifying and mapping DNA sequence of QTLs and genes linked to the
polymorphisms. For instance, BAC or YAC clone libraries can be queried
using polymorphisms linked to a trait to find a clone containing specific
QTLs and genes associated with the trait. For instance, QTLs and genes in
a plurality, e.g. hundreds or thousands, of large, multi-gene sequences
can be identified by hybridization with an oligonucleotide probe which
hybridizes to a mapped and/or linked polymorphism. Such hybridization
screening can be improved by providing clone sequence in a high density
array. The screening method is more preferably enhanced by employing a
pooling strategy to significantly reduce the number of hybridizations
required to identify a clone containing the polymorphism. When the
polymorphisms are mapped, the screening effectively maps the clones.
[0090] For instance, in a case where thousands of clones are arranged in a
defined array, e.g. in 96 well plates, the plates can be arbitrarily
arranged in three-dimensionally arrayed stacks of wells, each comprising
a unique DNA clone. The wells in each stack can be represented as
discrete elements in a three dimensional array of rows, columns and
plates. In one aspect of the invention the number of stacks and plates in
a stack are about equal to minimize the number of assays. The stacks of
plates allow the construction of pools of cloned DNA.
[0091] For a three-dimensionally arrayed stack, pools of cloned DNA can be
created for (a) all of the elements in each row, (b) all of the elements
of each column, and (c) all of the elements of each plate. Hybridization
screening of the pools with an oligonucleotide probe which hybridizes to
a polymorphism unique to one of the clones will provide a positive
indication for one column pool, one row pool and one plate pool, thereby
indicating the well element containing the target clone.
[0092] In the case of multiple stacks, additional pools of all of the
clone DNA in each stack allow indication of the stack having the
row-column-plate coordinates of the target clone. For instance, a 4608
clone set can be disposed in 48 96-well plates. The 48 plates can be
arranged in 8 stacks of 6 plates each, providing 6.times.12.times.8
three-dimensional arrays of elements, i.e. each stack comprises 6 plates
of 8 rows and 12 columns. For the entire clone set there are 34 pools,
i.e. 6 plate pools, 8 row pools, 12 column pools and 8 stack pools. Thus,
a maximum of 34 hybridization reactions is required to find the clone
harboring QTLs or genes associated or linked to each mapped polymorphism.
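The pooling scheme for the 4608-clone example can be sketched as
follows. This is a minimal illustration, assuming the clone set is laid
out as 8 stacks of 6 plates with 8 rows and 12 columns per plate, and
that each plate-level, row and column pool is shared across all stacks.
The function names and the target coordinate are hypothetical.

```python
STACKS, PLATES, ROWS, COLS = 8, 6, 8, 12


def pools_for(stack, plate, row, col):
    """Pools that contain the clone at the given well coordinate."""
    return {("stack", stack), ("plate", plate), ("row", row), ("col", col)}


def decode(positive_pools):
    """Recover (stack, plate, row, col) from the pools that hybridize.

    Assumes exactly one clone in the set carries the polymorphism, so
    exactly one pool of each kind is positive.
    """
    coords = dict(positive_pools)
    return coords["stack"], coords["plate"], coords["row"], coords["col"]


n_clones = STACKS * PLATES * ROWS * COLS   # one clone per well
n_pools = STACKS + PLATES + ROWS + COLS    # one hybridization per pool

# Hypothetical target clone: stack 3, plate 5, row 2, column 11.
target = (3, 5, 2, 11)
located = decode(pools_for(*target))
```

The number of hybridizations grows with the sum of the array dimensions
while the number of clones grows with their product, which is why
keeping the dimensions roughly equal minimizes the number of assays.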
[0093] Once a clone is identified, oligonucleotide primers designed from
the locus of the polymorphism can be used for positional cloning of the
linked QTL and/or genes.
F. Computer Readable Media and Databases
[0094] The sequences of nucleic acid molecules of this invention can be
"provided" in a variety of mediums to facilitate use, e.g. a database or
computer readable medium, which can also contain descriptive annotations
in a form that allows a skilled artisan to examine or query the sequences
and obtain useful information. In one embodiment of the invention,
computer readable media may be prepared that comprise at least 10%, e.g.
at least 25%, or even at least 50% or more, of the sequences of the loci
and nucleic acid molecules of this invention. For instance, such database
or computer readable medium
may comprise sets of the loci of this invention or sets of primers and
probes useful for assaying the polymorphisms of this invention. In
addition such database or computer readable medium may comprise a figure
or table of the mapped or unmapped polymorphisms of this invention and
genetic maps.
[0095] As used herein "database" refers to any representation of
retrievable collected data including computer files such as text files,
database files, spreadsheet files and image files, printed tabulations
and graphical representations and combinations of digital and image data
collections. In a preferred aspect of the invention, "database" means a
memory system that can store computer searchable information. Currently,
preferred database applications include those provided by DB2, Sybase and
Oracle.
[0096] As used herein, "computer readable media" refers to any medium that
can be read and accessed directly by a computer. Such media include, but
are not limited to: magnetic storage media, such as floppy discs, hard
disc storage medium and magnetic tape; optical storage media such as
CD-ROM; electrical storage media such as RAM and ROM; and hybrids of
these categories such as magnetic/optical storage media. A skilled
artisan can readily appreciate how any of the presently known computer
readable mediums can be used to create a manufacture comprising computer
readable medium having recorded thereon a nucleotide sequence of the
present invention.
[0097] As used herein, "recorded" refers to the result of a process for
storing information in a retrievable database or computer readable
medium. For instance, a skilled artisan can readily adopt any of the
presently known methods for recording information on computer readable
medium to generate media comprising the mapped polymorphisms and other
nucleotide sequence information of the present invention. A variety of
data storage structures are available to a skilled artisan for creating a
computer readable medium where the choice of the data storage structure
will generally be based on the means chosen to access the stored
information. In addition, a variety of data processor programs and
formats can be used to store the polymorphisms and nucleotide sequence
information of the present invention on computer readable medium.
[0098] Computer software is publicly available which allows a skilled
artisan to access sequence information provided in a computer readable
medium. The examples which follow demonstrate how software which
implements a search algorithm such as the BLAST algorithm (Altschul et
al., J. Mol. Biol. 215:403-410 (1990), incorporated herein by reference)
and the BLAZE algorithm (Brutlag et al., Comp. Chem. 17:203-207 (1993),
incorporated herein by reference) on a Sybase system can be used to
identify DNA sequence which is homologous to the sequence of loci of this
invention with a high level of identity. Sequence of high identity can be
compared to find polymorphic markers useful with maize varieties.
[0099] The present invention further provides systems, particularly
computer-based systems, which contain the sequence information described
herein. Such systems are designed to identify commercially important
sequence segments of the nucleic acid molecules of this invention. As
used herein, "a computer-based system" refers to the hardware, software
and memory used to analyze the nucleotide sequence information. A skilled
artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention.
[0100] As indicated above, the computer-based systems of the present
invention comprise a database having stored therein polymorphic markers,
genetic maps, and/or the sequence of nucleic acid molecules of the
present invention and the necessary hardware and software for supporting
and implementing genotyping applications.
Example 1
[0101] This example illustrates the preparation of reduced representation
libraries using enzymes which are sensitive to methylated cytosine
residues in order to enrich for unique/coding-sequence genomic DNA.
[0102] There are general methods for preparing genomic DNA from maize (or
other plants) that are suitable for use in construction of reduced
representation libraries. There are commercially available kits, for
example the "DNeasy Plant Maxi Kit" from Qiagen (Valencia, Calif.). The
preferred method, however, which maximizes both yield and convenience, is to
extract DNA using "Plant DNAzol Reagent" from Life Technologies (Grand
Island, N.Y.). Briefly, frozen leaf tissue is ground in liquid nitrogen
in a mortar and pestle. The ground tissue is then extracted with DNAzol
reagent. This removes cellular proteins, cell wall material and other
debris. Following extraction with this reagent, the DNA is precipitated,
washed, resuspended, and treated with RNAse to remove RNA. The DNA is
precipitated again, and resuspended in a suitable volume of TE (so that
concentration is 1 .mu.g/.mu.l). The genomic DNA is ready to use in
library construction.
[0103] Genomic DNA from two maize lines which are to be compared for
polymorphism detection is digested separately with Pst I restriction
endonuclease, which leaves the DNA fragments with sticky ends that can
ligate into a plasmid cut at the same restriction site. For
instance, 100 units of Pst I is added to 20 .mu.g of DNA and incubated at
37.degree. C. for 8 hours. The digested DNA product is separated by
electrophoresis on a 1% low-melting-temperature-agarose gel to separate
the DNA fragments by size. The digested DNA from the two maize lines is
loaded side by side on the gel (with one lane in between as a spacer).
Both a 1 KB DNA ladder marker and a 100 bp DNA ladder marker are loaded
on each side of the two maize DNA lanes. These markers act as a guide for
size fractionation of the digested maize DNA. Fragments in the range of
500 to 3000 bp are excised incrementally from the gel in size fractions
of 500-600 bp, 600-700 bp, 700-800 bp, 800-900 bp, 900-1100 bp, 1100-1500
bp, 1500-2000 bp, 2000-2500 bp and 2500-3000 bp. DNA in each fraction is
purified using .beta.-agarase and ligated into the Pst I cloning site of
pUC18. The plasmid ligation products are transformed by electroporation
into DH10B E. coli bacterial hosts to produce reduced representation
libraries. For instance, about 500 nanograms of the size-selected DNA is
ligated to 50 ng dephosphorylated pUC18 vector.
[0104] Transformation is carried out by electroporation and the
transformation efficiency for reduced representation Pst I libraries is
approximately 50,000-300,000 transformants from one microliter of
ligation product or 1000 to 6000 transformants/ng DNA.
[0105] Basic tests to evaluate the quality include the average insert
size, chloroplast/mitochondrial DNA content, and the fraction of
repetitive sequence.
[0106] The average insert size of the library is
assessed during library construction. Every ligation is tested to
determine the average insert size by assaying 10-20 clones per ligation.
DNA is isolated from recombinant clones using a standard mini preparation
protocol, digested with Pst I to free the insert from the vector and then
sized using 1% agarose gel electrophoresis (Maule, Molecular
Biotechnology 9:107-126 (1998), the entirety of which is herein
incorporated by reference).
[0107] The chloroplast/mitochondrial DNA content and the percentage of
repetitive sequence in the library are estimated by sequencing a small
sample of clones (400), and cross checking the sequence obtained against
various sequence databases. Some repetitive elements are not present in
the databases, but can nevertheless often be identified by the large
number of copies of the same sequence. For instance, after sequencing a
set of 400 clones any sequence that is not filtered by the repetitive
element database, but yet is present more than 10 times in the sample is
considered a repetitive element.
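The copy-number rule for flagging unlisted repetitive elements can be
sketched as below. The sketch assumes, for simplicity, that "the same
sequence" means an exact string match; a real pipeline would group
reads by sequence similarity. The function name and the toy sequences
are hypothetical.

```python
from collections import Counter


def flag_repeats(clone_sequences, known_repeats, max_copies=10):
    """Sequences not caught by the repeat database but seen more than
    max_copies times in the sample."""
    # Drop reads already filtered by the repetitive element database.
    unfiltered = [s for s in clone_sequences if s not in known_repeats]
    counts = Counter(unfiltered)
    return {seq for seq, n in counts.items() if n > max_copies}


# Toy sample standing in for a set of sequenced clones.
sample = ["ACGT"] * 11 + ["TTAA"] * 10 + ["GGCC"] * 20
new_repeats = flag_repeats(sample, known_repeats={"GGCC"})
```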
[0108] Maize reduced representation libraries of the present invention are
constructed by inserting coding region enriched DNA obtained from the
following maize lines: B73, Mo17, LH82 and 5CM1.
Example 2
[0109] This example illustrates the determination of maize genomic DNA
sequence from clones in reduced representation libraries prepared in
Example 1. Two basic methods can be used for DNA sequencing, the chain
termination method of Sanger et al., Proc. Natl. Acad. Sci. USA
74:5463-5467 (1977) and the chemical degradation method of Maxam and
Gilbert, Proc. Natl. Acad. Sci. USA 74:560-564 (1977). Automation and
advances in technology such as the replacement of radioisotopes with
fluorescence-based sequencing have reduced the effort required to
sequence DNA (Craxton, Methods, 2:20-26 (1991), Ju et al., Proc. Natl.
Acad. Sci. USA 92:4347-4351 (1995) and Tabor and Richardson, Proc. Natl.
Acad. Sci. USA 92:6339-6343 (1995)). Automated sequencers are available
from, for example, Applied Biosystems, Foster City, Calif. (ABI
Prism.RTM. systems); Pharmacia Biotech, Inc., Piscataway, N.J. (Pharmacia
ALF), LI-COR, Inc., Lincoln, Nebr. (LI-COR 4,000) and Millipore, Bedford,
Mass. (Millipore BaseStation).
[0110] In addition, advances in capillary gel electrophoresis have also
reduced the effort required to sequence DNA and such advances provide a
rapid high resolution approach for sequencing DNA samples (Swerdlow and
Gesteland, Nucleic Acids Res. 18:1415-1419 (1990); Smith, Nature
349:812-813 (1991); Luckey et al., Methods Enzymol. 218:154-172 (1993);
Lu et al., J. Chromatog. A. 680:497-501 (1994); Carson et al., Anal.
Chem. 65:3219-3226 (1993); Huang et al., Anal. Chem. 64:2149-2154 (1992);
Kheterpal et al., Electrophoresis 17:1852-1859 (1996); Quesada and Zhang,
Electrophoresis 17:1841-1851 (1996); Baba, Yakugaku Zasshi 117:265-281
(1997)).
[0111] A number of sequencing techniques are known in the art, including
fluorescence-based sequencing methodologies. These methods have the
detection, automation and instrumentation capability necessary for the
analysis of large volumes of sequence data. An ABI Prism.RTM.377 DNA
Sequencer (Applied Biosystems, Foster City, Calif.) allows rapid
electrophoresis and data collection. With these types of automated
systems, fluorescent dye-labeled sequence reaction products are detected
and data entered directly into the computer, producing a chromatogram
that is subsequently viewed, stored, and analyzed using the corresponding
software programs. These methods are known to those of skill in the art
and have been described and reviewed (Birren et al., Genome Analysis:
Analyzing DNA, 1, Cold Spring Harbor, N.Y. (1999)).
[0112] Base calls and quality scores are assigned from trace files by
PHRED, which is available from CodonCode Corporation, Dedham,
Mass. and is described by Brent Ewing, et al. "Base-calling of automated
sequencer traces using phred", 1998, Genome Research, Vol. 8, pages
175-185 and 186-194, incorporated herein by reference.
[0113] After the base calling is completed, sequence quality is improved
by cutting poor quality end sequence. If the resulting sequence is less
than 50 bp, it is deleted. Sequence with an overall quality of less than
12.5 is deleted. And, contaminating sequence, e.g. E. coli BAC and vector
sequences and sub-cloning vector, are removed. Contigs are assembled
using Pangea Clustering and Alignment Tools which is available from
DoubleTwist Inc., Oakland, Calif. by comparing pairs of sequences for
overlapping bases. The overlap is determined using the following high
stringency parameters: word size=8; window size=60; and identity is 93%.
The clusters are reassembled using PHRAP fragment assembly program which
is available from CodonCode Corporation using a "repeat stringency"
parameter of 0.5 or lower. The final assembly output contains a
collection of sequences including contig sequences which represent the
consensus sequence of overlapping clustered sequences (contigs) and
singleton sequences which are not present in any cluster of related
sequences (singletons). Collectively, the contigs and singletons
resulting from a DNA assembly are referred to as islands.
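The read clean-up steps that precede assembly can be sketched as
follows. This is an illustration under stated assumptions: the quality
cutoff used to trim read ends is hypothetical (the text does not give
one), "overall quality" is interpreted as the mean per-base quality
score, and contaminant removal and the assembly itself are not shown.
The function names are illustrative only.

```python
def trim_ends(bases, quals, end_cutoff=20):
    """Trim bases from both ends while quality is below end_cutoff.

    end_cutoff is a hypothetical value chosen for illustration.
    """
    start, end = 0, len(bases)
    while start < end and quals[start] < end_cutoff:
        start += 1
    while end > start and quals[end - 1] < end_cutoff:
        end -= 1
    return bases[start:end], quals[start:end]


def passes_filters(bases, quals, min_length=50, min_mean_quality=12.5):
    """Apply the length (50 bp) and overall quality (12.5) criteria."""
    if len(bases) < min_length or not quals:
        return False
    # Assumption: overall quality means the mean per-base quality.
    return sum(quals) / len(quals) >= min_mean_quality


# Hypothetical 60-base read with poor-quality ends.
read, read_quals = trim_ends("A" * 60, [5] * 5 + [30] * 52 + [5] * 3)
keep = passes_filters(read, read_quals)
```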
Example 3
[0114] This example illustrates identification of SNP and Indel
polymorphisms by comparing alignments of the sequences of contigs and
singletons from at least two separate maize lines prepared as in
Example 2. Sequence from multiple maize lines is assembled into loci
having one or more polymorphisms, i.e. SNPs and/or Indels. Candidate
polymorphisms are qualified by the following parameters:
[0115] (a) The minimum length of a contig or singleton for a consensus
alignment is 200 bases.
[0116] (b) The percentage identity of observed bases in a region of 15
bases on each side of a candidate SNP is 75%.
[0117] (c) The minimum BLAST quality in each contig at a polymorphism
site is 35.
[0118] (d) The minimum BLAST quality in a region of 15 bases on each
side of the polymorphism site is 20.
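The four qualification parameters can be applied as a simple filter.
The sketch below assumes that the 75% identity figure is a minimum and
that the flank quality criterion applies to the lowest-quality base in
each 15-base flank; the record layout and names are hypothetical and
are not taken from any software named in this document.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    contig_lengths: List[int]       # length of each aligned contig/singleton
    flank_identity: float           # percent identity, 15 bases each side
    site_qualities: List[int]       # quality at the site, one per contig
    flank_min_qualities: List[int]  # lowest flank quality, one per contig


def is_qualified(c, min_length=200, min_identity=75.0,
                 min_site_quality=35, min_flank_quality=20):
    """Apply parameters (a) through (d) to one candidate polymorphism."""
    return (min(c.contig_lengths) >= min_length                   # (a)
            and c.flank_identity >= min_identity                  # (b)
            and min(c.site_qualities) >= min_site_quality         # (c)
            and min(c.flank_min_qualities) >= min_flank_quality)  # (d)


# Hypothetical candidate observed in two aligned contigs.
good = Candidate([450, 312], 93.3, [40, 38], [25, 22])
weak_site = Candidate([450, 312], 93.3, [40, 30], [25, 22])
```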
[0119] A plurality of loci having qualified polymorphisms are identified
as having consensus sequence as reported as SEQ ID NO: 1 through SEQ ID
NO: 10373. The qualified SNP and Indel polymorphisms in each locus are
identified in Table 1. More particularly, Table 1 identifies the type and
location of the polymorphisms as follows:
[0120] SEQ_NUM refers to the sequence number of the polymorphic maize DNA
locus, e.g. a SEQ ID NO.
[0121] SEQ_ID refers to an arbitrary identifying name for the polymorphic
maize DNA locus.
[0122] MUTATION_ID refers to an arbitrary identifying name for each
polymorphism.
[0123] START_POS refers to the position in the nucleotide sequence of the
polymorphic maize DNA locus where the polymorphism begins.
[0124] END_POS refers to the position in the nucleotide sequence of the
polymorphic maize DNA locus where the polymorphism ends; for SNPs the
START_POS and END_POS are common.
[0125] TYPE refers to the identification of the polymorphism as an SNP or
IND (Indel).
[0126] ALLELEn and STRAINn refer to the nucleotide sequence of a
polymorphism in a specific allelic maize variety.
[0127] CHROMOSOME refers to the chromosome for a mapped polymorphism.
[0128] POSITION refers to the distance of a mapped polymorphism measured
in cM from the 5' end of the chromosome.
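The fields defined above map naturally onto a record type. The sketch
below is one hypothetical representation: it assumes a
whitespace-separated row in which each allele is written together with
its strain as ALLELE/STRAIN, as in Table 2 of Example 4, from which the
sample row is taken. The class and function names are illustrative.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Polymorphism:
    seq_num: int                        # SEQ_NUM
    mutation_id: int                    # MUTATION_ID
    start_pos: int                      # START_POS
    end_pos: int                        # END_POS
    kind: str                           # TYPE: "SNP" or "IND"
    alleles: Dict[str, str]             # STRAINn -> ALLELEn
    seq_id: Optional[str] = None        # SEQ_ID, if provided
    chromosome: Optional[int] = None    # CHROMOSOME, if mapped
    position_cm: Optional[float] = None # POSITION in cM, if mapped


def parse_row(line):
    """Parse one row of the assumed whitespace-separated layout."""
    seq_num, mutation_id, start, end, kind, *pairs = line.split()
    alleles = {}
    for pair in pairs:
        allele, strain = pair.split("/")
        alleles[strain] = allele
    return Polymorphism(int(seq_num), int(mutation_id),
                        int(start), int(end), kind, alleles)


snp = parse_row("5738 3972 126 126 SNP A/mo17 G/b73")
```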
Example 4
[0129] This example illustrates the use of primer base extension for
detecting a SNP polymorphism, i.e. with Mutation ID 3972 in the maize
locus of SEQ ID NO: 5738 which is described more particularly in the
following Table 2.
TABLE 2
SEQ NUM  MUTATION ID  START POS  END POS  TYPE  ALLELE1/STRAIN1  ALLELE2/STRAIN2
5738     3971         66         66       SNP   A/b73            C/mo17
5738     3972         126        126      SNP   A/mo17           G/b73
5738     3973         149        150      IND   **/mo17          TG/b73
5738     3974         338        338      SNP   A/b73            G/mo17
[0130] A small quantity of maize genomic DNA (e.g. about 10 ng) is
amplified using the forward and reverse PCR primers, i.e. SEQ ID NO:
10379 and SEQ ID NO: 10378, respectively, which are designed to have an
annealing temperature of 55.degree. C. to template in the locus of SEQ ID
NO: 5738 around polymorphism of Mutation ID 3972 which is an A/G SNP. The
PCR product is added to a new plate in which the extension primer SEQ ID
NO: 10380 is covalently bound to the surface of the reaction wells in a
GBA plate. Extension mix containing DNA polymerase, the two
differentially labeled ddNTPs, and extension buffer is added. The GBA
plate is incubated at 42.degree. C. for 15 min to allow extension. The
reaction mix is removed from the wells by washing with a suitable buffer.
The two labels are detected by sequential incubation with primary and
secondary detection reagents for each of the labels. In the present
example, incorporation of ddATP-FITC is measured by incubation with
HRP-anti-FITC, followed by washing the wells, followed by incubation in a
buffer containing a chromogenic substrate for HRP. The extent of the
reaction is determined spectrophotometrically for each well at the
wavelength appropriate for the product of the HRP reaction. The wells are
washed again, and the procedure is repeated with AP-streptavidin,
followed by a chromogenic substrate for AP, and spectrophotometry at the
wavelength appropriate for the AP reaction product.
[0131] Analysis of Results.
[0132] The extent of incorporation of each labeled ddNTP is inferred from
the absorbance measured for the reaction products of the detection steps
specific to that label, and the genotype of the sample is inferred from
the ratios of these absorbances as compared to standards of known genotype
and no-template control reactions. In the most common practice, the
absorbances observed for each data point are plotted against each other
in a scatter plot, producing an "allelogram". A successful genotyping
assay using the single base extension assay of this example provides an
allelogram as illustrated in FIG. 2 where the data points are grouped
into four clusters: Homozygote 1 (e.g., the A allele), homozygote 2
(e.g., the G allele), heterozygotes (each sample containing both
alleles), and a "no signal" cluster resulting from no-template controls,
or failed amplification or detection.
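Assigning a sample to one of the four allelogram clusters can be
sketched with a simple rule based on signal strength and on the angle of
each point from the first-allele axis. The thresholds below are
hypothetical; in practice cluster boundaries would be set from the
standards of known genotype and the no-template controls. The function
name is illustrative only.

```python
import math


def call_genotype(signal_1, signal_2, min_signal=0.2, homo_angle=30.0):
    """Classify one sample from two background-corrected absorbances.

    min_signal and homo_angle (degrees) are hypothetical thresholds.
    """
    # Points near the origin carry too little signal to call.
    if math.hypot(signal_1, signal_2) < min_signal:
        return "no signal"
    angle = math.degrees(math.atan2(signal_2, signal_1))
    if angle < homo_angle:
        return "homozygote 1"      # signal mostly from the first label
    if angle > 90.0 - homo_angle:
        return "homozygote 2"      # signal mostly from the second label
    return "heterozygote"          # comparable signal from both labels


# Hypothetical (label 1, label 2) absorbance pairs.
calls = [call_genotype(a, b) for a, b in
         [(1.0, 0.05), (0.05, 1.0), (0.8, 0.9), (0.05, 0.05)]]
```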
Example 5
[0133] This example illustrates the use of a labeled probe degradation
assay for detecting the SNP polymorphism assayed in Example 4, i.e. the
polymorphism of Mutation ID 3972 in the locus of SEQ ID NO: 5738. A
quantity of maize genomic template DNA (e.g. about 2-20 ng) is mixed in 5
.mu.l total volume with four oligonucleotides, i.e. forward primer SEQ ID
NO: 10376, reverse primer SEQ ID NO:10377 and hybridization probe having
a VIC reporter attached to the 5' end designed as VIC-TGTGTGAGCTGCTG
where the oligonucleotide segment of the probe has SEQ ID NO: 10374 and
hybridization probe having a FAM reporter attached to the 5' end designed
as FAM-TTGTGTGGGCTGCT where the oligonucleotide segment of the probe has
SEQ ID NO:10375 as well as PCR reaction buffer containing the passive
reference dye ROX. The PCR reaction is conducted for 35 cycles using a
60.degree. C. annealing-extension temperature. Following the reaction,
the fluorescence of each fluorophore as well as that of the passive
reference is determined in a fluorimeter. The fluorescence value for each
fluorophore is normalized to the fluorescence value of the passive
reference. The normalized values are plotted against each other for each
sample to produce an allelogram. A successful genotyping assay using the
primers and hybridization probes of this example provides an allelogram
with data points in clearly separable clusters as illustrated in FIG. 2.
[0134] To confirm that an assay produces accurate results, each new assay
is performed on a number of replicates of samples of known genotypic
identity representing each of the three possible genotypes, i.e. the two
homozygous genotypes and the heterozygous genotype. To be valid and
useful, an assay must produce clearly separable clusters of data points, such
that one of the three genotypes can be assigned for at least 90% of the
data points, and the assignment is observed to be correct for at least
98% of the data points. Subsequent to this validation step, the assay is
applied to progeny of a cross between two highly inbred individuals to
obtain segregation data, which are then used to calculate a genetic map
position for the polymorphic locus.
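The two validation criteria can be checked as follows. This sketch
assumes that the 98% accuracy requirement is measured over the data
points that received a genotype assignment, which the text does not
state explicitly; uncalled samples are represented by None. The function
name and the sample data are hypothetical.

```python
def assay_is_valid(calls, truths, min_call_rate=0.90, min_accuracy=0.98):
    """Check call rate and accuracy against samples of known genotype."""
    called = [(c, t) for c, t in zip(calls, truths) if c is not None]
    if not called:
        return False
    call_rate = len(called) / len(truths)
    # Assumption: accuracy is computed over assigned data points only.
    accuracy = sum(c == t for c, t in called) / len(called)
    return call_rate >= min_call_rate and accuracy >= min_accuracy


# Hypothetical validation set: 100 replicates of known genotype.
truths = ["AA"] * 40 + ["AG"] * 30 + ["GG"] * 30
calls = [None] * 5 + truths[5:]       # 95 called, all correct
valid = assay_is_valid(calls, truths)
```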
Example 6
[0135] This example illustrates the genetic mapping of polymorphisms in
loci of this invention based on the genotypes of over 1000 SNPs for 78
recombinant inbred lines (RILs) originating from the cross of maize lines
B73 and Mo17. The genotypes are combined with genotypes for about 80
public core SSR and RFLP markers scored on 203 RILs. Before mapping, any
loci showing distorted segregation (P<0.01 for a Chi-square test of a
1:1 segregation ratio) are removed. These loci can be added to the map
later but without allowing them to change marker order.
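The segregation filter can be sketched with a one-degree-of-freedom
Chi-square test. The sketch assumes that only the two homozygous
genotype classes are counted for each locus and that missing or
heterozygous calls are ignored. For one degree of freedom the P value
equals erfc(sqrt(chi2/2)), so no statistics package is needed. The
function names and counts are hypothetical.

```python
import math


def segregation_p_value(n_parent_1, n_parent_2):
    """P value of a Chi-square test (1 d.f.) against a 1:1 ratio."""
    expected = (n_parent_1 + n_parent_2) / 2.0
    chi2 = ((n_parent_1 - expected) ** 2
            + (n_parent_2 - expected) ** 2) / expected
    # Survival function of the Chi-square distribution with 1 d.f.
    return math.erfc(math.sqrt(chi2 / 2.0))


def is_distorted(n_parent_1, n_parent_2, alpha=0.01):
    """True if the locus would be removed before mapping."""
    return segregation_p_value(n_parent_1, n_parent_2) < alpha


# Hypothetical allele counts among 78 recombinant inbred lines.
balanced = is_distorted(45, 33)
skewed = is_distorted(55, 23)
```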
[0136] A map is constructed using the JoinMap version 2.0 software which
is described by Stam, P., "Construction of integrated genetic linkage maps
by means of a new computer package: JoinMap", The Plant Journal, 3:
739-744 (1993); Stam, P. and van Ooijen, J. W., "JoinMap version 2.0:
Software for the calculation of genetic linkage maps" (1995), CPRO-DLO,
Wageningen. JoinMap implements a weighted-least squares approach to
multipoint mapping in which information from all pairs of linked loci
(adjacent or not) is incorporated. Linkage groups are formed using a LOD
threshold of 5.0. The SSR and RFLP public markers are used to assign
linkage groups to chromosomes. Linkage groups are merged within
chromosomes before map construction.
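To illustrate how a LOD threshold separates linked from unlinked pairs of loci, the following sketch computes a simple two-point LOD score. It is not the JoinMap calculation: it assumes, for illustration only, that the observed fraction of lines with non-parental allele combinations is used directly as the recombination fraction, without any correction specific to recombinant inbred lines. The function name is hypothetical.

```python
import math


def pairwise_lod(n_recombinant, n_total):
    """Two-point LOD: log10 likelihood at the estimated r versus r = 0.5."""
    if n_total <= 0 or not 0 <= n_recombinant <= n_total:
        raise ValueError("counts must satisfy 0 <= n_recombinant <= n_total, n_total > 0")
    r = n_recombinant / n_total
    if r >= 0.5:
        return 0.0  # no evidence for linkage
    n_parental = n_total - n_recombinant
    log_likelihood = n_parental * math.log10(1.0 - r)
    if n_recombinant:
        log_likelihood += n_recombinant * math.log10(r)
    return log_likelihood - n_total * math.log10(0.5)


LOD_THRESHOLD = 5.0  # grouping threshold stated in this example
linked = pairwise_lod(5, 78) >= LOD_THRESHOLD
```

Under these simplifying assumptions, 5 recombinant lines out of 78 give a LOD well above 5.0, while 30 out of 78 give a LOD below 1.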
[0137] Haldane's mapping function is used to convert recombination
fractions to map distances. Lenient criteria are applied for excluding
pairwise linkage data; only data with a LOD not greater than 0.001 or a
recombination fraction not less than 0.499 are excluded. For ordering
loci, a jump threshold of 5.0, a triplet threshold of 7.0 and a ripple
value of 3 are used. About 38% of the loci (424 of 1108) are ordered in two
rounds of map construction with a jump threshold of 5.0, which prevents
the addition of a locus to the map if such addition results in a jump of
more than 5.0 in a goodness-of-fit criterion. The remaining loci are
added to the map without application of such a jump threshold. Addition
of these loci has a negligible effect on the map order and distances for
the initial 424 loci. Mapped SNP polymorphisms are identified in Table 3
where "Chromosome" and "Position" identify the distance measured in cM
from the 5' end of a maize chromosome for the SNP identified by "Mutation
ID". "Public Name" provides the published name of reference public
markers which are not part of this invention. For certain of the mapped
polymorphic markers listed in Table 3, the Mutation ID is listed more
than once which indicates that the mapping was conducted based on
multiple genotyping assays. The map locations for multiple genotyping
assays generally serve to confirm map location except in the case where
map locations are divergent, e.g. due to error in the design or practice
of an assay. The density and distribution of the mapped polymorphisms are
shown in FIG. 1.
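Haldane's mapping function, referred to above, converts a recombination fraction r into a map distance d = -50 ln(1 - 2r) in cM. A minimal sketch follows; the function names are hypothetical.

```python
import math


def haldane_cm(r):
    """Map distance in cM for a recombination fraction r, with 0 <= r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return -50.0 * math.log(1.0 - 2.0 * r)


def haldane_r(distance_cm):
    """Inverse of haldane_cm: recombination fraction for a distance in cM."""
    if distance_cm < 0:
        raise ValueError("map distance must be non-negative")
    return 0.5 * (1.0 - math.exp(-distance_cm / 50.0))
```

For example, a recombination fraction of 0.10 corresponds to about 11.16 cM. As r approaches 0.5 the distance grows without bound, which is why pairs with a recombination fraction of 0.499 or more carry essentially no distance information.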
[0138] An alternative approach for linkage map construction based on
finding a locus order to minimize the total number of recombination
events is disclosed by Jansen, J. et al. "Constructing dense genetic
linkage maps", Theor Appl Genet. (in press). This approach yields under
many conditions a close approximation to a maximum-likelihood map. A map
estimated by this approach agrees quite closely with the map obtained
using JoinMap 2.0.
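The idea of choosing a locus order that minimizes the total number of recombination events can be illustrated on a toy data set. The sketch below is not the method of Jansen et al.; it is an exhaustive search that is practical only for a handful of loci, it assumes no missing genotypes, and the locus names and genotype strings are hypothetical.

```python
from itertools import permutations


def count_recombinations(order, genotypes):
    """Sum, over adjacent loci in the order, the lines whose alleles differ."""
    total = 0
    for left, right in zip(order, order[1:]):
        total += sum(1 for a, b in zip(genotypes[left], genotypes[right]) if a != b)
    return total


def best_order(genotypes):
    """Exhaustively find an order with the fewest recombination events."""
    loci = sorted(genotypes)
    return min(permutations(loci), key=lambda o: count_recombinations(o, genotypes))


# Parental allele ("A" or "B") carried by each of six hypothetical lines.
toy_genotypes = {
    "snpA": "ABBBBB",
    "snpB": "AAABBB",
    "snpC": "AABBBB",
}
order = best_order(toy_genotypes)
```

Here the search places snpC between snpA and snpB, requiring two recombination events in total. An order and its reverse always score the same, so only one of the two is reported.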
Example 7
[0139] This example illustrates methods of the invention using
polymorphisms disclosed in Table 1 and in the DNA sequences of SEQ ID
NO:1-10,373.
[0140] A breeding population of corn with diverse heritage is analyzed
using primer pairs and probe pairs prepared as indicated in Example 5 for
each of the polymorphisms identified in Table 1 based on sequences of SEQ
ID NO:1-10,373. Closely linked polymorphisms are identified as
characterizing haplotypes in adjacent genomic windows of about 8
centimorgans across the corn genome. Haplotypes representing at least 4%
of the population are associated with trait values identified for each
member of the corn population including the trait values for yield,
maturity, lodging, plant height, rust resistance, drought tolerance and
cold germination. The trait values for each haplotype are ranked in each
8 centimorgan window. Progeny seed from randomly-mated members of the
population are analyzed for the identity of haplotypes in each window.
Progeny seed are selected for planting based on high trait values for
haplotypes identified in said seeds.
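The windowing and ranking steps of this example can be sketched as follows. Several simplifications are assumed for illustration: each individual is treated as carrying a single haplotype per window, the trait value of a haplotype is summarized as the mean over the individuals carrying it, and higher trait values are treated as more desirable. All names and numbers are hypothetical.

```python
from collections import defaultdict


def window_index(position_cm, window_cm=8.0):
    """Index of the genomic window (about 8 cM wide) containing a map position."""
    return int(position_cm // window_cm)


def rank_haplotypes(haplotypes, trait_values, min_frequency=0.04):
    """Rank haplotypes in one window by mean trait value, highest first.

    haplotypes   -- haplotype label of each individual in this window
    trait_values -- trait value of each individual, in the same order
    Haplotypes carried by fewer than min_frequency of individuals are dropped.
    """
    if not haplotypes or len(haplotypes) != len(trait_values):
        raise ValueError("inputs must be non-empty and equal in length")
    groups = defaultdict(list)
    for label, value in zip(haplotypes, trait_values):
        groups[label].append(value)
    n = len(haplotypes)
    ranked = [
        (label, sum(values) / len(values))
        for label, values in groups.items()
        if len(values) / n >= min_frequency
    ]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked
```

Progeny carrying the top-ranked haplotype in a window would then be favored for planting; how rankings from different windows and traits are combined is left open here.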
TABLE 3
Chromosome Position (cM) Mutation ID Public Name
1 0 NA tub1
1 4.3 111829
1 5.4 24027
1 8.9 77353
1 8.9 2640
1 8.9 18242
1 8.9 48301
1 9 21713
1 9 21554
1 9 3850
1 9 77353
1 11.3 111443
1 13 19086
1 13.2 33261
1 14.2 44003
1 16.3 43992
1 16.3 43992
1 17.3 37039
1 17.3 5358
1 18.7 39299
1 18.7 39299
1 19.5 66743
1 19.5 70875
1 21.1 43185
1 21.8 78736
1 22 36628
1 22 1369
1 22 68027
1 22 78736
1 22 36628
1 22.3 1369
1 22.4 68027
1 23.9 110473
1 24.8 26572
1 24.8 25418
1 24.8 31299
1 25.1 25418
1 26.2 83894
1 30.7 28164
1 32.5 105051
1 32.9 NA umc157
1 33 108303
1 33.2 43230
1 33.2 43223
1 35.5 107227
1 36.1 113465
1 37 2521
1 37.5 42000
1 37.6 3563
1 37.6 36721
1 37.6 35426
1 37.7 29201
1 41.3 43600
1 41.9 33770
1 41.9 58681
1 44.5 106004
1 44.5 4984
1 44.5 52815
1 44.5 108007
1 44.5 52815
1 44.6 20607
1 47 3040
1 47 36685
1 47 36685
1 47 51228
1 47.3 110880
1 48.9 29694
1 48.9 29694
1 50.1 52741
1 50.1 49734
1 50.1 104827
1 50.1 52741
1 51.2 4409
1 51.2 9213
1 51.2 56126
1 51.2 56126
1 51.2 41877
1 51.2 37716
1 51.9 9213
1 51.9 41877
1 52.1 524
1 52.4 41481
1 52.4 NA umc76
1 52.4 37716
1 52.4 38720
1 52.4 38720
1 56.6 80697
1 56.6 78549
1 56.6 28700
1 56.6 80697
1 56.6 78549
1 58.2 42173
1 58.2 105856
1 58.2 105076
1 58.2 42173
1 58.3 35435
1 58.4 108891
1 58.4 113273
1 59 449
1 59.2 43266
1 59.8 25695
1 59.8 77750
1 59.8 4330
1 59.8 32681
1 59.8 29904
1 59.8 77750
1 59.8 20625
1 59.8 28646
1 60.8 4442
1 61 57022
1 61 15124
1 61 57022
1 61 11957
1 61 14299
1 61 38788
1 61.1 116
1 61.3 39840
1 61.3 113502
1 61.3 33816
1 61.3 39840
1 61.4 9159
1 61.4 39205
1 61.4 9678
1 61.4 10832
1 61.4 29329
1 61.4 116
1 62 38788
1 62.4 9578
1 62.4 16876
1 62.4 39067
1 62.4 39812
1 62.4 16823
1 62.4 16876
1 62.4 38212
1 62.4 57225
1 62.4 782
1 62.4 56612
1 62.4 2271
1 62.4 31873
1 62.4 39814
1 62.4 39067
1 62.4 43266
1 66.8 43554
1 66.8 509
1 66.8 36506
1 66.8 4387
1 66.8 43614
1 66.8 509
1 66.8 4387
1 66.8 36506
1 66.8 38234
1 69.1 29053
1 69.1 4287
1 69.1 3656
1 69.1 4287
1 69.1 4903
1 72 104670
1 72 105022
1 73.9 113462
1 74.9 40801
1 76 112443
1 76.4 29053
1 76.5 36229
1 78.7 66981
1 78.7 43901
1 78.7 43698
1 78.7 43698
1 78.7 369
1 78.7 28351
1 78.7 34513
1 78.8 38741
1 78.8 5215
1 78.8 41660
1 78.8 8984
1 78.8 14644
1 78.8 66809
1 78.8 68435
1 78.8 20510
1 78.8 3401
1 78.8 68435
1 78.8 33970
1 78.8 38741
1 79 78350
1 79.5 53351
1 79.7 34880
1 79.7 11522
1 79.7 5280
1 79.7 37482
1 79.9 32745
1 79.9 NA csu3
1 79.9 33746
1 79.9 35579
1 79.9 4023
1 79.9 23745
1 79.9 33746
1 79.9 4340
1 79.9 35579
1 82.5 109095
1 83 58745
1 83 15205
1 83 19256
1 83 25863
1 83 34205
1 83 43789
1 83 68281
1 83 69524
1 83 72083
1 83 106077
1 83 69188
1 83 68281
1 83 25863
1 83 43789
1 83 72083
1 84.6 110353
1 84.6 41836
1 84.6 66809
1 84.6 11669
1 85 107044
1 87.2 107701
1 87.9 4176
1 87.9 111365
1 87.9 50366
1 87.9 34525
1 87.9 4176
1 88.1 60430
1 88.1 33728
1 88.1 106144
1 88.2 80733
1 88.2 516
1 88.2 NA umc67
1 88.2 111828
1 88.2 8901
1 88.2 29506
1 88.2 2688
1 88.2 43819
1 88.2 53309
1 88.2 53983
1 88.2 541
1 88.2 2688
1 88.2 60283
1 88.2 53287
1 88.2 80732
1 88.2 43819
1 88.2 36121
1 88.2 34941
1 88.2 67414
1 89.3 30935
1 89.3 35132
1 89.3 35132
1 89.3 30935
1 89.3 41219
1 89.8 50366
1 90 29568
1 90 43522
1 91.3 104474
1 93.5 4771
1 93.5 40655
1 93.5 48983
1 93.5 11598
1 93.5 36620
1 93.5 48983
1 93.5 40655
1 93.5 32288
1 93.5 111780
1 95.4 77922
1 95.4 NA ncr
1 95.4 107077
1 95.4 8716
1 95.6 111987
1 95.9 108768
1 99.1 16755
1 99.1 68400
1 99.1 16755
1 99.1 36863
1 99.1 38247
1 99.1 69137
1 105.8 107621
1 106.8 111052
1 107 109328
1 108.4 70305
1 108.4 33373
1 108.4 70305
1 108.5 70702
1 108.5 70702
1 108.6 33373
1 109.2 37706
1 109.2 37706
1 109.2 4313
1 109.3 42754
1 109.3 42754
1 109.6 13351
1 109.6 23034
1 110 41280
1 110 50718
1 110 41280
1 110.1 113254
1 110.5 24524
1 110.5 3065
1 110.7 61571
1 110.7 51736
1 110.7 70005
1 111.6 35048
1 111.6 35048
1 111.9 106514
1 113 38475
1 113 38475
1 115.9 69344
1 115.9 4449
1 115.9 NA umc128
1 115.9 14555
1 115.9 72095
1 115.9 9626
1 115.9 9628
1 115.9 66417
1 116.3 72095
1 116.5 69565
1 116.5 84242
1 116.5 69565
1 116.8 67728
1 116.8 4981
1 116.8 67728
1 117.2 4981
1 117.7 108030
1 118.1 109882
1 118.4 57976
1 124.8 NA csu164a
1 124.8 12824
1 124.8 14058
1 124.8 28759
1 124.8 4619
1 124.8 4909
1 124.8 5098
1 124.8 3104
1 124.8 14058
1 124.8 78015
1 124.8 77537
1 124.8 38552
1 124.8 41284
1 124.8 28759
1 126.7 38991
1 126.8 105775
1 129.2 111289
1 135.4 107477
1 135.4 113311
1 138.6 NA umc107a
1 138.6 8982
1 138.6 33427
1 138.6 79
1 138.6 1169
1 138.6 58842
1 138.6 55580
1 138.6 34333
1 138.6 33427
1 138.8 111792
1 139 35891
1 139.1 33995
1 139.4 58891
1 139.4 9148
1 139.4 9701
1 139.4 13584
1 139.4 14038
1 139.4 15021
1 139.4 16059
1 139.4 31264
1 139.4 40427
1 139.4 55580
1 139.4 66464
1 139.4 18335
1 139.4 108862
1 139.4 40427
1 139.4 2619
1 139.4 9148
1 139.4 48910
1 139.4 3764
1 139.4 19159
1 139.4 10978
1 139.4 41554
1 139.4 3264
1 139.4 3982
1 139.4 66464
1 139.4 31264
1 139.4 71589
1 139.7 39486
1 150.2 28335
1 152.8 79742
1 152.8 NA umc161a
1 152.8 3837
1 152.8 15344
1 155.4 15344
1 172.5 3691
1 181.9 496
1 181.9 30840
1 182.4 30840
1 183.5 13490
1 183.5 501
1 191.6 14253
1 193.8 NA bnl6.32
1 195.9 18460
1 199 2635
1 199 2635
1 202.1 16137
1 202.1 5177
1 203 5177
2 0 5444
2 1.4 NA bnl8.45a
2 3.3 30230
2 3.3 30230
2 6.7 9866
2 6.7 9867
2 15.2 110959
2 15.4 NA umc53a
2 16.2 106352
2 16.8 106295
2 17 33786
2 17 9766
2 17 33786
2 17 5133
2 21.4 107221
2 21.6 82235
2 21.6 82235
2 22.5 57663
2 24.3 58964
2 25.2 4071
2 27.3 2814
2 27.7 28836
2 27.7 8941
2 27.7 28836
2 28.3 76912
2 28.3 3388
2 28.3 76912
2 28.5 110482
2 29.9 24116
2 29.9 24116
2 31.5 59463
2 31.5 14461
2 31.5 59463
2 32.2 2945
2 32.2 13859
2 32.2 2945
2 35.1 80035
2 35.1 80035
2 38 13691
2 38 8673
2 38 9706
2 38 13691
2 38 4265
2 42.1 33017
2 44.1 107479
2 44.9 12585
2 46.2 109140
2 46.2 2630
2 46.2 12259
2 46.2 27262
2 48.4 14220
2 48.4 13275
2 48.4 48553
2 48.4 78243
2 48.9 51472
2 49.3 30393
2 50.4 77411
2 52.2 NA umc44b
2 58.9 106391
2 61 2822
2 62.1 80704
2 62.1 80704
2 64.1 109323
2 65.5 NA umc34
2 67.2 105448
2 68.2 37306
2 69.1 19110
2 69.5 4697
2 69.9 4128
2 69.9 9623
2 69.9 19110
2 69.9 37306
2 69.9 42242
2 69.9 4697
2 71.1 41923
2 71.3 15022
2 71.3 15022
2 71.8 526
2 71.8 30616
2 72 40931
2 72.3 15548
2 72.7 3984
2 72.9 30616
2 73.5 42561
2 73.5 28858
2 73.5 36323
2 74.4 36323
2 75.3 37846
2 78 81866
2 78 110133
2 78 11466
2 78 16297
2 78 79073
2 78 49430
2 78 105002
2 78 108493
2 78 111617
2 78 104946
2 78 79073
2 78.2 16297
2 78.2 24771
2 78.8 59093
2 79.3 49430
2 80.4 104479
2 82.8 2805
2 82.8 53463
2 83.2 NA isu89
2 83.4 105696
2 83.4 23442
2 83.5 57604
2 83.7 5467
2 83.7 5467
2 83.8 61699
2 83.8 21092
2 83.8 57604
2 84.2 82777
2 84.2 23442
2 84.2 82768
2 84.5 79826
2 84.5 13347
2 84.5 19874
2 84.5 2468
2 84.6 16933
2 84.6 19874
2 84.6 66
2 84.8 16128
2 84.8 339
2 85.6 66
2 85.6 108305
2 85.6 60879
2 85.6 107948
2 87.2 106407
2 87.6 112229
2 87.8 112226
2 88.9 108607
2 89.2 3177
2 89.2 3177
2 89.8 NA nc003
2 90.3 551
2 90.3 551
2 90.6 107736
2 91.2 30667
2 91.2 30667
2 91.2 57210
2 91.2 59782
2 91.2 366
2 91.4 23121
2 91.5 366
2 91.7 4914
2 92.3 57211
2 92.9 32979
2 96.7 395
2 98 2307
2 99 NA umc36b
2 99.7 3067
2 99.7 82458
2 99.8 52225
2 100 31289
2 100 31289
2 101 56954
2 101.3 21597
2 101.3 2307
2 101.4 108013
2 101.4 44080
2 101.4 44080
2 101.8 105556
2 101.8 5263
2 101.8 59751
2 102 111247
2 102.1 107850
2 102.1 111475
2 102.1 104954
2 102.1 82265
2 102.1 109393
2 102.2 84632
2 102.2 69
2 102.2 69
2 102.2 29138
2 102.2 82265
2 108 41850
2 108 41850
2 108.5 NA asg20
2 108.5 9102
2 110 3843
2 111.9 4901
2 112.5 9639
2 112.5 109207
2 112.5 2878
2 115 38436
2 115.9 104694
2 116.2 3023
2 116.2 5088
2 119.5 35297
2 119.5 35297
2 120 19267
2 121.3 44031
2 123.7 34166
2 123.8 38185
2 124.4 69373
2 124.4 69373
2 132.1 NA umc49a
2 137.2 3241
2 137.2 14467
2 137.2 21114
2 137.2 29041
2 139.5 68964
2 139.5 68964
2 141.6 23748
2 143.5 4367
2 143.5 11740
2 144.2 42627
2 144.3 17136
2 144.3 66033
2 144.3 76792
2 144.7 33476
2 144.7 33476
2 145 108877
2 145.3 735
2 145.5 NA php20581b
2 146 77782
2 146 35238
2 146 35238
2 159 107149
3 0 106276
3 0 8911
3 0 4964
3 0 32997
3 0 51614
3 0.1 51614
3 1.4 16670
3 1.4 NA e8
3 1.4 3679
3 6.4 20971
3 6.9 106389
3 6.9 31902
3 11.1 10667
3 15.7 48700
3 15.7 3536
3 15.7 48700
3 28.2 4821
3 28.2 NA asg24
3 28.2 4821
3 28.9 2531
3 29 27564
3 35.6 19963
3 36.7 51616
3 36.7 51616
3 42 77220
3 42 33033
3 42 33053
3 42 22667
3 50.3 423
3 50.3 423
3 50.3 60188
3 85.6 NA asg48
3 85.6 27063
3 85.6 49293
3 85.6 49293
3 107.7 3631
3 107.7 82160
3 107.9 82160
3 108.2 108727
3 110.3 8900
3 110.6 106769
3 110.9 53088
3 111.5 NA phi053
3 112.3 16729
3 112.3 37908
3 112.6 110326
3 112.7 12017
3 112.7 827
3 112.7 810
3 112.7 9358
3 112.7 27482
3 112.7 40183
3 112.9 9358
3 112.9 109722
3 113 106515
3 113.1 21154
3 113.1 10220
3 113.2 2207
3 113.2 9468
3 113.2 2207
3 113.2 21190
3 113.2 4599
3 113.2 10220
3 113.3 27101
3 113.4 20499
3 113.5 67185
3 113.5 33635
3 113.6 31647
3 116.1 107784
3 116.9 39003
3 116.9 51964
3 116.9 39003
3 117.4 10933
3 117.4 10933
3 118.8 107671
3 118.8 9739
3 118.8 22590
3 119 9144
3 119 29709
3 119.2 9739
3 120.1 104504
3 120.3 22590
3 121.4 13092
3 121.4 13092
3 122.7 108089
3 123.3 107469
3 124 4735
3 124.1 12133
3 124.1 12777
3 125.6 NA bnl5.37a
3 125.6 4886
3 125.6 55896
3 125.6 55896
3 125.6 56809
3 129.8 79081
3 129.8 79082
3 133.5 23890
3 133.5 23890
3 137.4 11320
3 138 2905
3 138.1 23828
3 138.1 399
3 138.3 9173
3 138.3 11320
3 138.3 510
3 138.3 2905
3 139.2 106349
3 142.7 8922
3 144.1 18713
3 146.7 77118
3 146.7 41040
3 146.7 77118
3 146.9 108109
3 146.9 41040
3 146.9 28069
3 147.7 36694
3 147.7 NA bnl6.16a
3 147.7 2765
3 147.7 4013
3 147.7 49015
3 148 4013
3 150 29390
3 150 43810
3 150 39210
3 150.7 54742
3 150.7 21772
3 151 109717
3 151.5 105966
3 152.2 111204
3 153 71496
3 153.7 4157
3 153.7 71496
3 153.7 108630
3 153.7 69529
3 153.7 19641
3 153.7 4371
3 153.7 69529
3 154 4371
3 156.4 9473
3 156.4 20205
3 156.4 38504
3 157.8 NA umc17a
3 157.8 10862
3 157.8 320
3 157.8 321
3 157.8 52735
3 157.8 10862
3 161.5 21603
3 161.5 13288
3 162.2 105852
3 162.6 110780
3 162.6 10402
3 162.8 55817
3 162.8 55817
3 163.1 106901
3 164.5 78788
3 164.7 30587
3 165.7 112644
3 166.8 112487
3 167.8 56939
3 168.3 32026
3 172.4 NA umc63a
3 172.4 26945
3 173.1 5371
3 173.1 25814
3 178.4 57856
3 178.4 32427
3 178.4 10232
3 178.4 10232
3 179.8 18073
3 180.8 55
3 180.8 55
3 184.3 19414
3 189.4 3970
3 189.4 48063
3 199 77802
3 199 59538
3 199 77802
3 199.8 14041
4 0 107122
4 1.1 NA agrr115
4 1.1 9523
4 1.1 12340
4 1.1 4362
4 3.1 104957
4 3.2 33138
4 3.2 18275
4 3.2 55502
4 11.3 29468
4 11.3 2739
4 14.8 41419
4 14.8 41419
4 21.6 38293
4 21.6 38293
4 26.9 110069
4 28.9 1127
4 29 69223
4 29 9057
4 34.5 28603
4 34.5 28603
4 34.6 NA umc31a
4 36 54512
4 36 19003
4 36 28338
4 37.7 43900
4 40.1 24647
4 40.1 24647
4 41.3 38472
4 41.6 77263
4 41.6 3972
4 41.6 77263
4 41.7 58697
4 41.9 3972
4 42.4 34130
4 44.6 67577
4 44.6 67577
4 46.3 1122
4 46.7 785
4 46.7 1122
4 48.6 30507
4 49.5 NA umc49d
4 51 9327
4 51.5 10671
4 51.5 10671
4 51.8 34322
4 52.3 28441
4 52.5 NA zp1
4 52.5 34462
4 52.8 36730
4 52.8 28441
4 52.8 34462
4 52.8 42575
4 52.8 69795
4 52.8 105263
4 52.9 42575
4 53.1 70728
4 55 35683
4 55.1 10305
4 55.1 8936
4 55.1 10305
4 55.1 33483
4 55.1 38900
4 55.1 5451
4 55.5 31791
4 55.5 38900
4 55.5 31791
4 55.5 108120
4 56 31718
4 56 20472
4 56 20481
4 56.4 34464
4 56.4 15096
4 56.6 13103
4 56.6 15574
4 56.6 15574
4 56.9 2585
4 56.9 34464
4 57 13815
4 57.5 35244
4 57.6 20374
4 57.7 4385
4 57.7 55791
4 58.9 27345
4 58.9 27345
4 58.9 104906
4 59.1 84527
4 59.1 106099
4 59.1 415
4 59.1 107276
4 59.2 68131
4 59.2 24678
4 59.2 78135
4 59.2 109551
4 59.2 67365
4 59.2 77444
4 59.2 78135
4 59.2 80778
4 59.3 80475
4 59.3 111345
4 59.3 52478
4 59.3 37473
4 59.3 18036
4 59.3 66430
4 59.3 36245
4 59.3 50107
4 59.3 9021
4 59.3 29788
4 59.3 37473
4 59.3 39640
4 59.3 3351
4 59.3 3532
4 59.3 3533
4 59.3 5021
4 59.3 66430
4 59.3 80475
4 59.3 3528
4 59.3 106797
4 59.4 35275
4 59.4 41869
4 59.4 17775
4 59.4 104785
4 59.4 64
4 59.6 415
4 59.6 104667
4 59.9 80782
4 60 77444
4 61.2 38999
4 61.2 38999
4 61.5 39743
4 61.5 39743
4 63.8 81484
4 63.8 3657
4 63.8 111228
4 64.5 3657
4 65.7 22725
4 65.7 22725
4 68.3 106845
4 68.3 105197
4 68.3 105550
4 68.6 37405
4 68.6 37405
4 68.6 38087
4 68.6 40744
4 68.6 2474
4 68.6 5018
4 69 79924
4 69.2 38087
4 69.2 12191
4 69.2 32557
4 69.2 69570
4 69.4 57766
4 69.4 32557
4 69.4 79926
4 73.9 17281
4 73.9 35625
4 74 35625
4 74.4 28748
4 74.9 37540
4 74.9 13363
4 74.9 77408
4 74.9 107840
4 75 61713
4 75.3 77407
4 75.4 70043
4 75.5 39138
4 75.5 70043
4 75.6 40534
4 75.6 40117
4 75.6 40534
4 75.6 3274
4 75.6 3964
4 75.9 3274
4 76.7 81
4 76.8 NA umc66a
4 85.8 36240
4 85.8 9187
4 89.2 108028
4 93 43502
4 93.8 69747
4 95.3 71462
4 95.6 36147
4 95.6 36147
4 96.5 106491
4 97.9 50958
4 97.9 8979
4 97.9 15073
4 97.9 28933
4 97.9 29886
4 97.9 50947
4 98.1 NA umc158
4 98.4 29886
4 98.4 NA npi570
4 98.6 18782
4 98.9 10497
4 99.3 56052
4 99.6 81355
4 99.7 39796
4 99.9 78134
4 99.9 70533
4 99.9 78134
4 101.2 67246
4 101.2 37073
4 101.2 36646
4 101.2 5295
4 101.2 48771
4 101.2 84088
4 101.4 84088
4 101.5 36646
4 101.5 48771
4 102.1 71156
4 102.1 71156
4 102.4 67159
4 102.4 29435
4 102.4 67159
4 102.6 31964
4 102.6 30877
4 102.6 31964
4 102.9 110764
4 103.2 71447
4 103.3 42473
4 103.3 9484
4 103.3 3835
4 103.3 4170
4 104.2 104975
4 104.5 104901
4 104.7 420
4 104.7 24549
4 104.7 38447
4 104.8 156
4 104.8 42522
4 104.8 109767
4 104.9 42522
4 105.1 40950
4 105.1 79199
4 105.3 111505
4 105.8 NA umc52
4 105.8 38232
4 105.8 12711
4 105.8 17828
4 105.8 18439
4 105.8 20934
4 105.8 36534
4 105.8 38053
4 105.8 48567
4 106.1 48567
4 106.1 40950
4 106.1 54601
4 106.1 2435
4 106.3 17704
4 106.3 68443
4 106.3 2445
4 106.3 2435
4 106.5 36534
4 106.9 38053
4 108 27877
4 108 27877
4 108 29194
4 110.3 30576
4 110.3 30576
4 111 69709
4 111 107293
4 112.7 452
4 113.5 29336
4 114.4 31931
4 114.4 28579
4 114.4 31931
4 114.4 34250
4 115.2 20080
4 122.4 33110
4 122.4 8855
4 122.6 2755
4 122.6 2755
4 123.7 34767
4 123.8 24368
4 123.8 112744
4 123.8 24368
4 123.8 60681
4 126.1 34767
4 127 110955
4 127 110455
4 131.4 NA php20608a
4 133.3 48218
4 133.3 9398
4 133.8 3224
4 133.8 3224
4 135 30195
4 135.4 3152
4 135.4 30195
4 135.4 3152
4 135.4 4445
4 135.6 36378
4 137.5 36635
4 137.5 43011
4 137.5 36635
4 145.3 30211
4 145.3 30211
4 145.8 69919
4 145.8 50788
4 145.8 50788
4 146.3 43794
4 146.7 NA bnl8.23a
4 147.1 112943
4 147.8 43121
4 147.8 43121
4 149.8 40159
4 150.4 105560
4 151.5 10790
4 151.9 59119
5 0 23752
5 1.8 NA isu62
5 1.9 4969
5 2.4 23752
5 3.1 19740
5 6 9459
5 6 14864
5 6.1 5120
5 10.2 57383
5 11.1 17772
5 11.1 14633
5 11.1 17772
5 13.4 26746
5 13.4 4808
5 13.9 NA npi409
5 16.4 69590
5 16.5 107640
5 18.5 104988
5 19.4 105613
5 20 107858
5 20.2 19501
5 20.2 57137
5 20.5 57137
5 23.5 13052
5 23.8 20894
5 24.8 60322
5 25.8 58282
5 25.8 58282
5 28.4 33977
5 28.4 33977
5 30.8 91
5 30.8 91
5 30.8 4215
5 32.1 79065
5 32.1 55976
5 32.1 79065
5 33.5 42669
5 33.5 42669
5 34.2 10779
5 34.2 20668
5 34.2 5275
5 34.8 67791
5 36.1 67802
5 40.3 NA rab15
5 40.3 16527
5 42.3 35524
5 42.8 25224
5 42.8 109403
5 42.8 12935
5 44.2 79943
5 44.6 36696
5 44.7 NA umc107b
5 44.7 38726
5 45.4 30912
5 46.5 68926
5 51 37030
5 51.4 14946
5 52.8 58930
5 54.2 59999
5 54.5 79341
5 54.5 16767
5 55.2 77038
5 55.2 72062
5 56.7 79519
5 56.9 83715
5 57.3 79573
5 57.3 54720
5 57.6 3854
5 58.5 18546
5 58.5 18546
5 58.5 79519
5 61 31044
5 61.1 11588
5 61.1 9668
5 61.5 28655
5 62.2 5019
5 62.3 110554
5 62.3 16234
5 62.5 3932
5 62.5 113139
5 62.5 52081
5 62.5 77545
5 62.5 107549
5 62.5 106441
5 62.5 110919
5 62.5 111388
5 62.6 52081
5 62.7 77545
5 62.8 107061
5 63.5 106912
5 63.8 109411
5 63.9 26930
5 64.1 2319
5 64.1 56874
5 64.2 57086
5 64.2 4955
5 64.4 108957
5 64.4 51419
5 64.4 19187
5 64.4 51419
5 64.9 483
5 64.9 28807
5 65 111398
5 66.1 82147
5 66.1 30670
5 66.1 32272
5 66.1 82146
5 66.3 32272
5 67.3 40571
5 67.5 80028
5 67.5 80028
5 67.7 NA bnl4.36
5 68.4 78124
5 68.4 4605
5 68.4 14488
5 68.4 4605
5 68.7 3214
5 69 48328
5 69 18230
5 69 27874
5 69 48328
5 69 105854
5 69.1 27874
5 69.3 78223
5 69.4 24709
5 69.4 19329
5 69.4 40366
5 69.4 78535
5 69.5 19329
5 69.7 18230
5 70 10262
5 70.7 10859
5 70.7 10859
5 70.9 4032
5 70.9 110854
5 70.9 29820
5 71.2 13657
5 75.9 18153
5 75.9 18153
5 75.9 2775
5 75.9 51711
5 76 25501
5 76 43040
5 76.1 51711
5 76.1 9276
5 76.1 107309
5 76.2 36637
5 76.3 NA bnl5.71a
5 76.3 36637
5 76.4 111999
5 76.5 48616
5 76.5 108101
5 77 36425
5 77 36425
5 77.4 12480
5 77.4 10658
5 77.4 12480
5 81.8 27863
5 82.2 17678
5 82.2 17678
5 83.1 3009
5 83.1 9297
5 83.6 513
5 85.8 78477
5 86 108
5 86 3021
5 86 3338
5 86 78477
5 86.5 106000
5 86.5 106300
5 86.7 12281
5 86.7 36778
5 86.7 3054
5 88.2 106344
5 90.6 108679
5 90.9 5480
5 90.9 8807
5 90.9 5480
5 91.5 528
5 95 10139
5 95 10139
5 95.8 NA bnl5.40
5 95.8 17125
5 97.4 83876
5 97.4 79526
5 97.4 19634
5 97.4 83876
5 97.4 106716
5 98.5 4507
5 99.1 21138
5 99.1 21138
5 99.5 5503
5 102.1 9068
5 102.9 10131
5 102.9 10131
5 104.5 35377
5 104.5 35377
5 107.2 105970
5 107.2 107877
5 109.6 80644
5 109.6 31346
5 109.6 42059
5 109.6 53779
5 110.4 NA umc108
5 112.7 42956
5 112.8 8793
5 112.8 58375
5 112.8 81212
5 113.1 2353
5 113.2 41824
5 113.2 41824
5 113.5 50972
5 121.4 107947
5 123.8 104963
5 124.5 28721
5 126.6 390
5 126.6 28721
5 126.6 390
5 131.4 104717
5 132.2 105546
5 134.2 109853
5 135 12914
5 135 12417
5 135 15
5 135.2 15
5 141 32564
5 141.5 111504
5 142.1 NA php10017
5 142.1 31084
5 142.2 28767
5 146.2 113237
6 0 28234
6 0.5 27615
6 0.6 NA bnlg238
6 0.6 27615
6 0.6 36921
6 0.9 14417
6 3.8 110700
6 3.8 105014
6 5.1 77806
6 7.8 NA umc85a
6 8.2 66735
6 8.2 66735
6 8.9 79529
6 8.9 79529
6 9.8 69630
6 10.2 105714
6 10.2 69630
6 10.2 33924
6 10.2 2397
6 11.2 439
6 11.2 43610
6 11.2 53338
6 11.4 107639
6 11.7 107287
6 11.7 104510
6 11.7 2870
6 11.7 70447
6 11.7 37282
6 11.7 70447
6 11.7 42090
6 11.7 68941
6 11.8 30875
6 11.8 29780
6 12 107748
6 12.1 15304
6 12.1 16944
6 12.1 110607
6 12.2 22009
6 12.2 3210
6 12.2 15304
6 12.2 31248
6 12.2 3210
6 12.3 111142
6 12.4 20696
6 12.4 25201
6 12.4 66094
6 12.4 82438
6 12.5 13985
6 12.5 105718
6 12.5 13985
6 12.5 68941
6 12.5 25657
6 12.6 77756
6 12.6 2870
6 12.6 69532
6 12.8 42090
6 12.9 71287
6 13 77413
6 13 37929
6 13.4 82439
6 13.8 33924
6 14.6 110850
6 18.2 70260
6 19 57919
6 19 70260
6 19 37981
6 19.4 37981
6 19.4 4126
6 19.4 37812
6 24.6 29331
6 28 15488
6 28 15488
6 30.1 43377
6 30.1 43377
6 30.6 67323
6 30.6 67323
6 32.1 34054
6 32.1 106121
6 32.1 106527
6 32.1 108212
6 33.3 NA umc65a
6 37.1 60751
6 37.1 512
6 37.1 60751
6 37.1 34948
6 37.2 34948
6 38.8 57758
6 38.8 29612
6 39.3 69868
6 39.3 29612
6 39.3 37720
6 41.2 11591
6 41.2 448
6 41.6 448
6 42.6 78226
6 43.2 59008
6 43.2 9134
6 43.4 105586
6 44.5 105497
6 45.3 456
6 45.3 NA phi129
6 45.3 8833
6 45.3 8833
6 45.3 16087
6 45.3 81427
6 45.7 3277
6 45.7 3277
6 46.9 30942
6 50.6 109389
6 51.2 12874
6 51.2 8838
6 51.2 20410
6 51.2 20410
6 51.9 14694
6 51.9 3913
6 51.9 14694
6 53 19518
6 53.5 5081
6 53.5 108196
6 54.1 82021
6 54.2 557
6 54.2 14128
6 56 66737
6 56 205
6 56 70996
6 56 67505
6 57 13638
6 57 20450
6 57.7 113381
6 57.8 37517
6 59.1 4030
6 59.1 28203
6 59.1 2347
6 59.1 4030
6 62.1 15070
6 62.1 19772
6 62.1 15070
6 62.1 109862
6 64.2 110972
6 65.3 81445
6 65.5 NA umc38a
6 65.5 81445
6 66 30771
6 66.7 107703
6 66.8 17860
6 66.8 17860
6 67.2 81121
6 67.2 81120
6 68.2 67075
6 68.2 5313
6 68.2 67075
6 68.7 37947
6 68.7 19588
6 68.7 22200
6 68.7 56222
6 68.7 37947
6 70.8 667
6 73.7 29924
6 73.9 29676
6 73.9 29676
6 74.7 2629
6 74.7 2629
6 75.6 21284
6 76.1 54780
6 76.1 54780
6 78.4 31684
6 80.5 31026
6 80.5 107449
6 80.5 31026
6 81.6 33492
6 81.9 23358
6 81.9 23358
6 81.9 31920
6 82.2 NA umc132a
6 82.2 16982
6 85.9 5265
6 85.9 5265
6 86 3201
6 86.1 21418
6 86.5 16017
6 86.5 28185
6 86.5 16017
6 86.5 40264
6 89.4 60514
6 89.4 13445
6 89.4 60514
6 89.8 58630
6 91.6 2782
6 91.6 2782
6 91.6 70983
6 95.1 53636
6 95.1 17395
6 97.8 9667
6 103.5 37790
6 103.5 21734
6 103.6 21734
6 103.6 9439
6 103.8 37555
6 103.8 43724
6 104.7 42370
7 0 58637
7 0 58637
7 0 2314
7 5.9 NA php20581a
7 12.9 110968
7 15.3 NA npi394
7 16 68954
7 16.7 35408
7 16.7 48425
7 16.7 8800
7 16.7 35408
7 16.7 48425
7 16.7 12463
7 17.9 16644
7 17.9 12477
7 17.9 68954
7 18 5051
7 18 2475
7 18 37767
7 21.1 20649
7 21.1 38796
7 21.2 34688
7 21.2 68426
7 21.2 30317
7 21.8 2225
7 21.8 66143
7 21.8 36748
7 22.1 30674
7 22.5 68426
7 22.5 66807
7 22.5 66807
7 23 28653
7 23 4882
7 23 41043
7 23 51404
7 23 34688
7 23 40386
7 23.3 457
7 23.3 4299
7 23.3 32268
7 23.3 34731
7 23.3 33469
7 23.3 38707
7 23.3 49507
7 23.3 78294
7 23.7 78783
7 23.9 33469
7 23.9 19507
7 23.9 42930
7 24.2 33507
7 24.2 36486
7 24.2 15035
7 24.3 5273
7 24.3 36486
7 24.3 33507
7 24.3 5273
7 24.4 558
7 24.4 50490
7 24.4 61500
7 24.4 57209
7 24.4 30511
7 24.4 69202
7 24.4 39064
7 24.5 34121
7 24.5 80469
7 24.6 34215
7 24.6 367
7 24.6 558
7 24.6 38766
7 24.6 3233
7 24.6 8806
7 24.7 28094
7 24.7 28094
7 24.7 69202
7 24.7 30511
7 24.7 50490
7 24.7 39064
7 24.7 57209
7 24.7 78294
7 24.7 34215
7 24.7 19507
7 25.1 35631
7 25.3 9073
7 25.5 30317
7 26 56253
7 26 33116
7 26 56253
7 26 41610
7 26.4 29362
7 26.4 33509
7 26.4 68434
7 26.4 32790
7 26.4 37827
7 26.4 33509
7 26.4 41557
7 26.9 27428
7 26.9 41610
7 26.9 50359
7 26.9 108168
7 26.9 32918
7 27.6 33755
7 27.6 40172
7 27.6 42164
7 27.6 2225
7 29 9304
7 29 66143
7 29 50359
7 29 80061
7 29.7 84006
7 29.7 20087
7 29.7 60906
7 29.7 60906
7 29.7 84010
7 29.7 32039
7 30.6 3922
7 30.6 29927
7 30.6 29927
7 30.6 30154
7 31.8 NA asg34a
7 35.3 15184
7 35.3 15161
7 35.3 38733
7 35.3 40282
7 35.4 38733
7 35.4 18284
7 35.4 21944
7 37.1 68523
7 37.2 38653
7 37.2 4093
7 37.2 4229
7 37.2 38653
7 37.2 42153
7 37.4 15995
7 37.4 16008
7 37.4 69388
7 37.9 70392
7 37.9 81460
7 37.9 70392
7 38 28932
7 38.1 71642
7 38.1 15314
7 38.1 NA umc254
7 38.1 81460
7 38.1 33952
7 38.2 71001
7 38.2 68149
7 38.2 68149
7 38.2 71652
7 38.2 71001
7 38.2 43980
7 45.5 30029
7 45.5 30029
7 45.5 79300
7 52.1 31502
7 52.1 31502
7 52.5 3218
7 52.5 489
7 52.5 21967
7 58.1 30872
7 58.1 30872
7 59.4 17039
7 59.4 17039
7 63.8 4953
7 63.8 4953
7 71.6 15974
7 76.3 19704
7 76.3 4142
7 76.3 19704
7 76.4 4142
7 83.5 30970
7 83.5 2710
7 83.5 30970
7 85.2 NA phi069
7 85.2 18935
7 85.2 11659
7 85.2 11664
7 85.2 9843
7 87 38498
7 87 NA phi116
7 87 20454
7 87 28596
7 87 38317
7 87 66634
7 87 78091
7 87 106258
7 87 38317
7 87 28596
7 87 78091
7 87 66634
7 87 42623
8 0 24672
8 0 2237
8 0 24672
8 0 39991
8 0 10310
8 3.3 40320
8 3.3 26837
8 3.3 40320
8 3.3 2312
8 5 19198
8 7.5 38724
8 7.5 38724
8 12.1 40299
8 12.6 3792
8 12.6 39677
8 17.3 78792
8 21.5 35790
8 24.4 NA umc92b
8 25.4 34552
8 25.4 34552
8 25.4 207
8 25.5 38939
8 33.3 5266
8 33.3 58392
8 33.3 21477
8 33.4 58392
8 37.4 30011
8 37.4 2322
8 39.3 79080
8 39.3 79080
8 39.3 53899
8 39.3 31255
8 39.3 77693
8 39.6 29635
8 39.6 NA phi125
8 39.6 22765
8 39.6 29015
8 40.3 10347
8 40.3 29015
8 40.3 107396
8 40.3 37392
8 40.3 26720
8 40.3 26720
8 40.3 37392
8 40.3 29693
8 40.4 109056
8 40.6 79084
8 40.6 104862
8 40.8 53899
8 40.8 15819
8 40.9 81269
8 41.1 51919
8 41.1 10347
8 41.1 81269
8 41.1 79096
8 41.1 9659
8 41.1 82612
8 41.1 79096
8 48.8 26775
8 48.8 26775
8 50.5 NA umc89a
8 50.5 12023
8 50.5 58047
8 50.5 56049
8 50.5 51064
8 50.6 5077
8 50.7 22382
8 50.7 22382
8 50.8 8818
8 51 20912
8 51 9254
8 51 11760
8 51 27361
8 51 77568
8 51 48947
8 51 61592
8 51 27361
8 51 16491
8 51.2 9835
8 51.2 9835
8 51.4 10123
8 51.4 NA phi121
8 51.9 107937
8 54.8 4254
8 54.8 4504
8 54.8 58047
8 55.3 82295
8 55.3 82295
8 55.7 32337
8 55.7 32339
8 55.8 56860
8 55.8 108315
8 55.8 108631
8 55.8 110331
8 55.8 110378
8 55.8 104858
8 55.8 3016
8 55.8 56860
8 55.8 12264
8 56.4 9969
8 56.4 20514
8 68.7 53505
8 68.7 27300
8 68.7 53505
8 68.8 82386
8 69 27300
8 69 82386
8 70.4 32642
8 70.8 107641
8 72.8 32993
8 72.9 12656
8 73 10392
8 73 10392
8 73 112497
8 73.2 8831
8 73.7 9759
8 74.9 20537
8 74.9 9802
8 80.1 4587
8 80.1 31630
8 80.1 4587
8 80.1 31630
8 82.7 3008
8 82.7 NA tpi5
8 82.7 5592
8 82.7 53045
8 82.7 4866
8 92.8 107286
8 94 NA npi414
8 94 27810
8 94 27810
8 95.4 4041
8 95.4 16520
8 95.4 503
8 98.9 22728
8 98.9 2891
8 98.9 60696
8 103.6 4171
8 103.6 35106
8 103.6 69226
8 107.8 31448
8 107.8 14353
8 108.8 14545
8 112 50507
8 112.6 NA gst1
8 112.6 8760
8 112.6 10257
8 118.4 561
8 119.1 561
8 119.1 25294
8 119.2 25294
8 120.6 60573
8 120.6 60573
8 120.6 40239
8 120.9 53085
9 0 11026
9 0.1 14476
9 0.1 14479
9 8.3 81558
9 9.2 NA umc109
9 12.2 30508
9 12.2 30508
9 14.2 32715
9 16.3 20781
9 17 2735
9 17 2735
9 22.2 49557
9 22.2 49557
9 32.3 NA bz1
9 33.4 12830
9 34.8 4308
9 34.8 4308
9 45.9 29745
9 45.9 29745
9 45.9 66106
9 47.1 41796
9 47.6 29583
9 47.6 41796
9 50.3 29436
9 50.3 29436
9 56.5 12557
9 56.5 12557
9 57.1 28095
9 57.8 NA wx1
9 58.1 10643
9 59.3 4407
9 59.3 2833
9 59.3 2730
9 59.6 2876
9 60.1 80632
9 60.1 80633
9 60.2 28527
9 60.2 28527
9 60.3 80382
9 60.3 15632
9 60.3 55370
9 60.3 55370
9 60.6 113113
9 60.7 29744
9 60.7 29744
9 60.9 18302
9 60.9 20857
9 60.9 8935
9 60.9 55759
9 60.9 9249
9 61.3 4049
9 61.7 8935
9 65 31039
9 66.2 13096
9 66.2 9397
9 66.2 29567
9 70.2 28354
9 70.2 20048
9 70.2 20872
9 70.2 21430
9 70.2 29595
9 70.2 2611
9 70.2 1647
9 70.2 4284
9 70.2 33088
9 70.2 9242
9 70.2 57608
9 70.2 28354
9 73.1 3231
9 73.1 21306
9 74.4 111177
9 74.4 61623
9 75.9 31482
9 75.9 NA sus1
9 75.9 NA umc95
9 75.9 110125
9 75.9 8937
9 75.9 14826
9 75.9 3425
9 75.9 4123
9 75.9 78438
9 75.9 5186
9 75.9 49561
9 75.9 78437
9 75.9 4123
9 81 18417
9 81 18417
9 81.9 14240
9 81.9 30512
9 82.4 59423
9 82.7 42348
9 82.7 42348
9 83.9 108275
9 84.3 81074
9 84.3 20286
9 84.3 2591
9 84.3 81074
9 84.7 13086
9 84.7 38548
9 84.7 61433
9 84.7 61433
9 84.7 60349
9 84.7 13086
9 85.7 32501
9 86.9 4890
9 86.9 57098
9 86.9 32869
9 86.9 28613
9 87.3 56
9 87.3 4921
9 87.3 4890
9 87.3 57098
9 90 66389
9 90 66389
9 94.3 36022
9 94.3 36022
9 96.5 NA csu93a
9 97.8 35380
9 97.8 51922
9 97.8 59132
9 97.8 35380
9 97.8 59320
9 98.3 18446
9 99.7 20368
9 101.6 69775
9 101.6 69775
9 102 5023
9 102 56641
9 105.6 NA asg12
9 107.1 52938
9 108.7 35729
9 108.7 35729
9 110.3 4863
9 110.3 4862
9 110.3 4863
9 111.1 42929
9 111.1 10264
9 111.1 42929
9 116.1 9407
9 119.5 83647
9 123.9 77194
9 128.5 49286
10 0 NA phi041
10 13.6 NA npi285a
10 13.6 20502
10 23.3 NA umc130
10 24 18509
10 24 8956
10 25.3 28604
10 25.3 16045
10 25.3 28604
10 25.7 29123
10 25.7 2290
10 25.7 111212
10 25.7 531
10 25.7 15355
10 25.7 25218
10 25.7 29123
10 25.7 4887
10 25.7 5406
10 25.7 53602
10 25.7 109866
10 25.8 58389
10 25.8 58389
10 25.9 53602
10 26.1 79078
10 26.2 10927
10 26.3 531
10 26.3 4887
10 26.3 5255
10 28.1 51974
10 28.1 109648
10 28.1 51974
10 28.7 NA umc64a
10 28.9 12984
10 28.9 12984
10 28.9 105175
10 28.9 48637
10 28.9 5140
10 28.9 111673
10 28.9 81142
10 28.9 8840
10 28.9 21894
10 28.9 41739
10 28.9 43776
10 28.9 5020
10 28.9 104512
10 29 111682
10 29.3 41739
10 29.3 21894
10 29.3 109004
10 29.4 43776
10 29.5 107599
10 29.5 20088
10 29.6 21292
10 29.6 22541
10 29.7 9350
10 29.8 5020
10 29.8 9755
10 29.9 22541
10 30.4 9587
10 30.4 33137
10 31 35898
10 31.1 56723
10 31.5 54661
10 31.5 32428
10 31.5 53110
10 31.5 54661
10 31.9 32428
10 32.1 83
10 32.2 53110
10 32.7 52544
10 32.7 3206
10 32.7 3640
10 33.7 13733
10 33.7 109090
10 33.7 13733
10 33.7 2940
10 33.7 5324
10 34.3 84194
10 34.3 84196
10 34.3 16730
10 34.5 107941
10 34.5 49445
10 34.7 16730
10 35.1 3133
10 35.9 11123
10 40.9 81776
10 40.9 18392
10 40.9 81776
10 43.4 43412
10 43.9 22296
10 44.6 30134
10 44.6 13745
10 44.6 30134
10 44.7 13745
10 44.7 48402
10 45.4 43391
10 45.4 43391
10 46.5 10183
10 46.5 10183
10 46.5 33664
10 48.3 NA umc44a
10 51.4 21962
10 51.8 8790
10 51.8 11115
10 52.8 70905
10 52.8 70905
10 53.2 36251
10 53.2 36251
10 55.3 25816
10 58.1 55498
10 58.1 67173
10 58.9 40431
10 59.4 67173
10 59.8 106742
10 60.1 106406
10 61.9 113140
10 64.7 9486
10 64.7 NA bnl7.49a
10 65.7 109723
10 67.6 109666
10 67.9 107333
10 74.5 8643
10 78.6 111488
10 78.9 8756
SEQUENCE LISTING
The patent application contains a lengthy "Sequence Listing" section. A
copy of the "Sequence Listing" is available in electronic form from the
USPTO web site
(http://seqdata.uspto.gov/?pageRequest=docDetail&DocID=US20110008793A1).
An electronic copy of the "Sequence Listing" will also be available from
the USPTO upon request and payment of the fee set forth in 37 CFR
1.19(b)(3).
* * * * *