| United States Patent Application | 20210206813 |
| Kind Code | A1 |
| Fox; Jerome; et al. | July 8, 2021 |
GENETICALLY ENCODED SYSTEM FOR CONSTRUCTING AND DETECTING BIOLOGICALLY
ACTIVE AGENTS
Abstract
This invention relates to the field of genetic engineering. Specifically,
the invention relates to the construction of operons to produce
biologically active agents. For example, operons may be constructed to
produce agents that control the function of biochemical pathway proteins
(e.g., protein phosphatases, kinases and/or proteases). Such agents may
include inhibitors and modulators that may be used in studying or
controlling phosphatase function associated with abnormalities in a
phosphatase pathway or expression level. Fusion proteins, such as light
activated protein phosphatases, may be genetically encoded and expressed
as photoswitchable phosphatases. Systems are provided for use in
controlling phosphatase function within living cells or in identifying
small molecule inhibitor/activator/modulator molecules of protein
phosphatases associated with cell signaling.
| Inventors: | Fox; Jerome (Boulder, CO); Sarkar; Ankur (Boulder, CO); Hongdusit; Akarawin (Boulder, CO); Kim; Edward (Oakland, CA) |
| Applicant: | The Regents of the University of Colorado, a Body Corporate (Denver, CO, US) |
| Assignee: | The Regents of the University of Colorado, a Body Corporate (Denver, CO) |
| Family ID: | 69060332 |
| Appl. No.: | 17/141321 |
| Filed: | January 5, 2021 |
Related U.S. Patent Documents
| Application Number | Filing Date | Patent Number |
|---|---|---|
| PCT/US2019/040896 | Jul 8, 2019 | |
| 17141321 | | |
| 62694838 | Jul 6, 2018 | |
| Current U.S. Class: | 1/1 |
| Current CPC Class: | C12Y 205/01001 20130101; C07K 2319/00 20130101; C12N 1/20 20130101; C12Y 402/03024 20130101; C12Y 207/01036 20130101; C07K 14/005 20130101; C07K 2319/80 20130101; C07K 14/47 20130101; C07K 2319/60 20130101; C12N 9/1205 20130101; C12N 9/16 20130101; C12Y 301/01048 20130101; C12Y 401/01033 20130101; C12Y 207/04002 20130101; C12Y 503/03002 20130101; C12Y 402/03017 20130101; C12Y 402/03018 20130101; C12Y 205/0101 20130101; C12Y 402/03056 20130101 |
| International Class: | C07K 14/005 20060101 C07K014/005; C07K 14/47 20060101 C07K014/47; C12N 1/20 20060101 C12N001/20 |
Government Interests
STATEMENT OF GOVERNMENT SUPPORT
[0002] This invention was made with government support under grant numbers
1750244 and 1804897 awarded by the National Science Foundation. The
government has certain rights in the invention.
Claims
1. A method of using a genetically encoded detection operon system,
comprising:
  a. providing:
    i. an inhibitor detection operon, comprising:
      Part A: a first region of DNA in operable combination comprising:
        1. a region of DNA encoding a first promoter;
        2. a first gene encoding a first fusion protein comprising a
           substrate recognition domain linked to a DNA-binding protein;
        3. a second gene encoding a second fusion protein comprising a
           substrate domain linked to a protein capable of recruiting RNA
           polymerase to DNA;
        4. a region of DNA encoding a second promoter;
        5. a third gene for a protein kinase;
        6. a fourth gene for a molecular chaperone;
        7. a fifth gene for a protein phosphatase;
      Part B: a second region of DNA in operable combination under control
      of a third promoter comprising:
        8. a first DNA sequence encoding an operator for said DNA-binding
           protein;
        9. a second DNA sequence encoding a binding site for RNA
           polymerase; and
        10. at least one first gene of interest (GOI);
    ii. a mevalonate-dependent isoprenoid pathway operon not containing a
        terpene synthase gene, under control of a fourth promoter;
    iii. a third DNA sequence under control of a fifth promoter comprising
         a second gene of interest; and
    iv. a plurality of bacteria; and
  b. transfecting said bacteria with said inhibitor detection operon for
     expressing said first gene of interest;
  c. transfecting said bacteria with said mevalonate-dependent isoprenoid
     pathway operon;
  d. transfecting said bacteria with said third DNA sequence for
     expressing said second gene of interest;
  e. growing said bacteria expressing said genes of interest, wherein
     inhibitor compounds inhibiting said protein phosphatase are produced
     by said bacteria.
2. The method of claim 1, further comprising f. isolating said protein
phosphatase inhibitor compounds and providing a mammalian cell culture,
and g. treating said mammalian cell culture with said inhibitor compounds
for reducing activity of said protein phosphatase.
3. The method of claim 2, wherein reducing activity of said protein
phosphatase reduces growth of said mammalian cells.
4. The method of claim 1, wherein said protein phosphatase is human
PTP1B.
5. The method of claim 1, wherein said protein phosphatase is wild-type.
6. The method of claim 1, wherein said protein phosphatase has at least
one mutation.
7. The method of claim 1, wherein said mevalonate-dependent isoprenoid
pathway operon comprises genes for expressing mevalonate kinase (ERG12),
phosphomevalonate kinase (ERG8), mevalonate pyrophosphate decarboxylase
(MVD1), isopentenyl pyrophosphate isomerase (IDI gene), and farnesyl
pyrophosphate (FPP) synthase (ispA).
8. The method of claim 1, wherein said second gene of interest is a gene
for a terpene synthase.
9. The method of claim 8, wherein said terpene synthase is selected from
the group consisting of amorphadiene synthase (ADS) and γ-humulene
synthase (GHS).
10. The method of claim 8, wherein said third DNA sequence further
comprises a geranylgeranyl diphosphate synthase (GPPS) and said terpene
synthase is selected from the group consisting of abietadiene synthase
(ABS) and taxadiene synthase (TXS).
11. The method of claim 8, wherein said terpene synthase is wild-type.
12. The method of claim 8, wherein said terpene synthase has at least one
mutation.
13. The method of claim 12, wherein said inhibitor compounds are
terpenoid compounds or structural variants of terpenoid compounds.
14. The method of claim 1, wherein said at least one first gene of
interest are one or more antibiotic genes.
15. The method of claim 14, wherein said at least one first gene of
interest are each different antibiotic genes.
16. The method of claim 1, wherein the fourth promoter is an inducible
promoter; further comprising providing an induction molecule for said
inducible promoter, and contacting said inducible promoter with said
induction molecule.
17. A genetically encoded detection operon system, comprising:
  Part A: a first region of DNA in operable combination comprising:
    i. a region of DNA encoding a first promoter;
    ii. a first gene encoding a first fusion protein comprising a
        substrate recognition domain linked to a DNA-binding protein;
    iii. a second gene encoding a second fusion protein comprising a
         substrate domain linked to a protein capable of recruiting RNA
         polymerase to DNA;
    iv. a region of DNA encoding a second promoter;
    v. a third gene for a protein kinase;
    vi. a fourth gene for a molecular chaperone;
    vii. a fifth gene for a protein phosphatase;
  Part B: a second region of DNA in operable combination under control of
  a third promoter comprising:
    i. a first DNA sequence encoding an operator for said DNA-binding
       protein;
    ii. a second DNA sequence encoding a binding site for RNA polymerase;
        and
    iii. at least one gene of interest (GOI).
18. The operon of claim 17, wherein said substrate recognition domain is
a Src homology 2 (SH2) domain.
19. The operon of claim 17, wherein said DNA-binding protein is the 434
phage cI repressor.
20. The operon of claim 17, wherein said substrate domain is a peptide
substrate of both said kinase and said phosphatase.
Description
RELATED APPLICATIONS
[0001] This application is a continuation of international application
PCT/US2019/040896, filed Jul. 8, 2019, which claims the benefit under 35
U.S.C. § 119(e) of U.S. provisional application Ser. No. 62/694,838,
filed Jul. 6, 2018, the disclosures of which are incorporated by
reference herein in their entireties.
FIELD OF THE INVENTION
[0003] This invention relates to the field of genetic engineering.
Specifically, the invention relates to the construction of operons to
produce biologically active agents. For example, operons may be
constructed to produce agents that control the function of biochemical
pathway proteins (e.g., protein phosphatases, kinases and/or proteases).
Such agents may include inhibitors and modulators that may be used in
studying or controlling phosphatase function associated with
abnormalities in a phosphatase pathway or expression level. Fusion
proteins, such as light activated protein phosphatases, may be
genetically encoded and expressed as photoswitchable phosphatases.
Systems are provided for use in controlling phosphatase function within
living cells or in identifying small molecule
inhibitor/activator/modulator molecules of protein phosphatases
associated with cell signaling.
BACKGROUND
[0004] Protein phosphorylation is involved in cell signaling; in part, it
controls the location and timing of cellular differentiation, movement,
proliferation, and death¹⁻⁴. Its misregulation is implicated in cancer,
diabetes, obesity, and Alzheimer's disease, among other disorders⁵⁻⁹.
Optical tools to exert spatiotemporal control over the activity of
phosphorylation-regulating enzymes in living cells could elucidate the
mechanisms by which cells transmit, filter, and integrate chemical
signals¹⁰,¹¹, reveal links between seemingly disparate physiological
processes (e.g., memory¹² and metabolism¹³), and facilitate the
identification of new targets for phosphorylation-modulating therapeutics
(a class of pharmaceuticals¹⁴). Therefore, there is a need for developing
tools to control, reduce, or enhance the activity of
phosphorylation-regulating enzymes in living cells.
SUMMARY OF THE INVENTION
[0005] This invention relates to the field of genetic engineering.
Specifically, the invention relates to the construction of operons to
produce biologically active agents. For example, operons may be
constructed to produce agents that control the function of biochemical
pathway proteins (e.g., protein phosphatases, kinases and/or proteases).
Such agents may include inhibitors and modulators that may be used in
studying or controlling phosphatase function associated with
abnormalities in a phosphatase pathway or expression level. Fusion
proteins, such as light activated protein phosphatases, may be
genetically encoded and expressed as photoswitchable phosphatases.
Systems are provided for use in controlling phosphatase function within
living cells or in identifying small molecule
inhibitor/activator/modulator molecules of protein phosphatases
associated with cell signaling.
[0006] In one embodiment, the present invention contemplates a genetic
operon comprising: a) providing; i) a first gene encoding a first fusion
protein, the first fusion protein comprising a substrate recognition
domain and either a DNA-binding domain or an anchoring unit for RNA
polymerase; ii) a second gene encoding a second fusion protein, the
second fusion protein comprising an enzyme substrate domain and either an
anchoring unit for RNA polymerase or a DNA binding domain; iii) a first
DNA sequence comprising a binding site for said DNA-binding domain; iv) a
second DNA sequence comprising a binding site, proximal to the first, for
said anchoring unit and for said RNA polymerase; v) a third gene encoding
a first enzyme, wherein said first enzyme is capable of modifying said
substrate domain, thereby changing the affinity of said substrate
recognition domain; vi) a fourth gene encoding a second enzyme, wherein
said second enzyme is capable of unmodifying said substrate domain; vii) a
reporter gene encoding at least one protein capable of having a detectable
output when said RNA polymerase and said anchoring unit bind to said second DNA
sequence binding site after association of the two fusion proteins. In
one embodiment, said substrate domain is a peptide substrate of a protein
kinase. In one embodiment, said substrate domain is a peptide substrate
of a protein tyrosine kinase. In one embodiment, said substrate domain is
a peptide substrate of Src kinase (a protein tyrosine kinase). In one
embodiment, said substrate recognition domain is capable of binding to
said substrate domain in its phosphorylated state. In one embodiment,
said substrate recognition domain is capable of binding to said substrate
domain in its unphosphorylated state. In one embodiment, said DNA-binding
domain is the 434 cI repressor and said DNA binding site is the binding
sequence for that repressor. In one embodiment, said anchoring unit is
the omega subunit of RNA polymerase and said second DNA binding site is
the binding site for RNA polymerase. In one embodiment,
said operon further comprises a system of proteins. In one embodiment,
said first enzyme is a protein phosphatase. In one embodiment, said first
enzyme is a protein tyrosine phosphatase. In one embodiment, said first
enzyme is protein tyrosine phosphatase 1B. In one embodiment, said second
enzyme is a protein kinase. In one embodiment, said second enzyme is a
protein tyrosine kinase. In one embodiment, said second enzyme is Src
kinase. In one embodiment, said reporter protein yields a detectable
output. In one embodiment, said reporter protein that yields a detectable
output is a LuxAB bioreporter (e.g., the output is luminescence). In one
embodiment, said reporter protein that yields a detectable output is a
fluorescent protein. In one embodiment, said reporter protein that yields
a detectable output is mClover. In one embodiment, said reporter protein
that yields a detectable output confers antibiotic resistance. In one
embodiment, said antibiotic resistance is to spectinomycin. In one
embodiment, said operon further comprises a gene encoding a decoy protein
fusion comprising: (i) a second enzyme substrate domain that is different
from the first enzyme substrate domain and (ii) a protein that does
not bind specifically to DNA and/or to RNA polymerase, and a fifth gene
encoding a third enzyme, wherein said third enzyme is capable of being
active on the decoy substrate domain. In one embodiment, both said first
enzyme substrate domain (of the base system) and said second enzyme
substrate domain (of the decoy) are substrates of a protein kinase. In
one embodiment, both said first enzyme substrate domain (of the base
system) and said second enzyme substrate domain (of the decoy) are
substrates of a protein tyrosine kinase. In one embodiment, both said
first enzyme substrate domain (of the base system) and said second enzyme
substrate domain (of the decoy) are substrates of Src kinase. In one
embodiment, both said first enzyme substrate domain (of the base system)
and said second substrate domain (of the decoy) are substrates of a
protein phosphatase. In one embodiment, both said first enzyme substrate
domain (of the base system) and said second substrate domain (of the
decoy) are substrates of a protein tyrosine phosphatase. In one
embodiment, both said first enzyme substrate domain (of the base system)
and said second substrate domain (of the decoy) are substrates of protein
tyrosine phosphatase 1B. In one embodiment, said first enzyme is a light
modulated enzyme. In one embodiment, said first enzyme is a protein-LOV2
chimera. In one embodiment, said first enzyme is a PTP1B-LOV2 chimera. In
one embodiment, said proteins that yield a detectable output include a
protein that generates a toxic product in the presence of a non-essential
substrate. In one embodiment, said additional protein is SacB, which
converts sucrose to a nonstructural polysaccharide that is toxic in E.
coli. In one embodiment, said operon further comprises an expression
vector and a bacterial cell.
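
By way of non-limiting illustration, the switching logic of the operon described in paragraph [0006] may be sketched as a toy calculation. The sketch assumes the embodiment in which a kinase modifies the substrate domain, a phosphatase reverses that modification, and the substrate recognition domain binds only the phosphorylated substrate. The function names, the first-order rate parameters, and the steady-state formula are illustrative assumptions and are not taken from the disclosure.

```python
def phosphorylated_fraction(kinase_rate: float, phosphatase_rate: float) -> float:
    """Steady-state fraction of substrate domain that is phosphorylated.

    Hypothetical first-order model: d[pS]/dt = k_kin*(1 - pS) - k_ptp*pS = 0,
    which gives pS = k_kin / (k_kin + k_ptp).
    """
    if kinase_rate < 0 or phosphatase_rate < 0:
        raise ValueError("rates must be non-negative")
    total = kinase_rate + phosphatase_rate
    if total == 0:
        return 0.0  # no enzyme activity at all: substrate stays unmodified
    return kinase_rate / total


def reporter_output(kinase_rate: float, phosphatase_rate: float,
                    inhibition: float = 0.0) -> float:
    """Relative reporter signal, assumed proportional to phosphorylated substrate.

    `inhibition` is the assumed fraction (0 to 1) of phosphatase activity
    blocked by an inhibitor.
    """
    if not 0.0 <= inhibition <= 1.0:
        raise ValueError("inhibition must lie between 0 and 1")
    effective_phosphatase = phosphatase_rate * (1.0 - inhibition)
    return phosphorylated_fraction(kinase_rate, effective_phosphatase)
```

Under these assumed rates, `reporter_output(1.0, 9.0)` evaluates to 0.1 and rises to 1.0 when `inhibition=1.0`, which mirrors the intended behavior that inhibiting the phosphatase increases the detectable output.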
[0007] In one embodiment, the present invention contemplates a system for
detecting inhibitors of an enzyme, comprising: a) providing; i) an operon
comprising a gene encoding an enzyme; ii) a bacterial cell; iii) a small
molecule test compound; and b) contacting said bacterium with said operon
such that said contacted bacterium is capable of producing a detectable
output; c) growing said contacted bacterium in the presence of said test
compound under conditions allowing said detectable output; and d)
assessing the influence of the test compound on said detectable output.
In one embodiment, said enzyme is a protein phosphatase. In one
embodiment, said enzyme is a protein tyrosine phosphatase. In one
embodiment, said enzyme is protein tyrosine phosphatase 1B.
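
By way of non-limiting illustration, step d) of the system above (assessing the influence of a test compound on the detectable output) might be scored as a fold-change relative to an untreated control. The compound names, the readings, and the two-fold threshold used below are hypothetical and are not taken from the disclosure.

```python
from statistics import mean


def fold_change(treated: list[float], control: list[float]) -> float:
    """Ratio of mean reporter output with compound to mean output without it."""
    baseline = mean(control)
    if baseline <= 0:
        raise ValueError("control output must be positive")
    return mean(treated) / baseline


def call_hits(readings: dict[str, list[float]], control: list[float],
              threshold: float = 2.0) -> list[str]:
    """Return compound names whose fold-change meets the assumed threshold."""
    return sorted(
        name for name, values in readings.items()
        if fold_change(values, control) >= threshold
    )


# Hypothetical replicate readings (arbitrary luminescence units).
example_control = [100.0, 100.0]
example_readings = {"compound_A": [300.0, 320.0], "compound_B": [100.0, 110.0]}
example_hits = call_hits(example_readings, example_control)
```

With these made-up readings only `compound_A` exceeds the threshold; in practice the threshold would be chosen from the observed variation of the control.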
[0008] In one embodiment, the present invention contemplates a method for
evolving inhibitors of an enzyme, comprising: a) providing: i) an operon
comprising a gene encoding an enzyme; ii) a library of bacterial cells,
wherein each of said bacterial cells has at least one mutated metabolic
pathway; b) growing said library of bacterial cells; and c) screening said
library of bacterial cells for a detectable output. In one embodiment,
said operon further comprises an expression vector.
[0009] In one embodiment, the present invention contemplates a method for
detecting selective inhibitors of a first enzyme over a second enzyme,
comprising: a) providing; i) a system as described above comprising a
library of bacterial cells; and ii) a small molecule test compound; b)
growing said library of bacterial cells in the presence of the test
compound; and c) assessing an influence of the test compound on a
detectable output. In one embodiment, the system further provides an
operon comprising a gene encoding a decoy fusion protein, said decoy
fusion protein comprising; (i) a second enzyme substrate domain that is
different from the first enzyme substrate domain and (ii) a protein that
does not bind specifically to DNA and/or RNA polymerase. In one
embodiment, said operon further comprises an expression vector.
[0010] In one embodiment, the present invention contemplates a method for
evolving selective inhibitors of a first enzyme over a second enzyme,
comprising; a) providing; a system as described herein comprising a
library of bacterial cells having mutated metabolic pathways; b) growing
said bacterial cell library; and c) screening the bacterial cell library
for a detectable output. In one embodiment, the method further provides
an operon comprising a gene encoding a decoy fusion protein, the decoy
fusion protein comprising; (i) a second enzyme substrate domain that is
different from the first enzyme substrate domain and (ii) a protein that
does not bind specifically to DNA and/or RNA polymerase. In one
embodiment, said operon further comprises an expression vector.
[0011] In one embodiment, the present invention contemplates a method for
evolving photoswitchable enzymes, comprising; a) providing; i) a system
as described herein comprising a bacterial cell library having mutated
photoswitchable enzymes; b) growing the bacterial cell library under at
least two different light conditions; and c) comparing differences in
detectable output for each cell between each of said two different light
conditions. In one embodiment, said operon further comprises an
expression vector.
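
By way of non-limiting illustration, step c) of the method above (comparing the detectable output of each cell under two light conditions) might be reduced to a dynamic-range ranking. The variant names and the output values are hypothetical, and the ratio of the higher to the lower output is only one assumed way of expressing the difference.

```python
def dynamic_range(output_condition_1: float, output_condition_2: float) -> float:
    """Ratio of the higher to the lower reporter output across two light conditions."""
    low, high = sorted((output_condition_1, output_condition_2))
    if low <= 0:
        raise ValueError("outputs must be positive")
    return high / low


def rank_variants(outputs: dict[str, tuple[float, float]]) -> list[str]:
    """Order variants from the largest to the smallest light-dependent change."""
    return sorted(outputs, key=lambda name: dynamic_range(*outputs[name]),
                  reverse=True)


# Hypothetical outputs as (first light condition, second light condition).
example_outputs = {
    "variant_1": (100.0, 50.0),
    "variant_2": (100.0, 10.0),
    "variant_3": (80.0, 80.0),
}
example_ranking = rank_variants(example_outputs)
```

Here `variant_2` ranks first because its output changes ten-fold between conditions, while `variant_3` shows no light dependence and ranks last.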
[0012] In one embodiment, the present invention contemplates a method for
evolving photoswitchable enzymes, comprising: a) providing; i) a system
as described herein comprising a library of bacterial cells having mutated
photoswitchable enzymes; b) growing the library of bacterial cells under
a first light source in which activity is desired; c) subsequently
growing the library of bacterial cells from step b) in the presence of:
(i) a non-essential substrate; and (ii) a second light source in which
activity is not desired; d) subsequently screening survivors of step c)
for a mutant bacterial cell; and e) examining the mutant bacterial cell
for activity under the first light source and the second light source. In
one embodiment, the method further comprises an operon comprising a gene
encoding a decoy fusion protein, the decoy fusion protein comprising; (i)
a second enzyme substrate domain that is different from the first enzyme
substrate domain; and (ii) a protein that does not bind specifically
to DNA and/or RNA polymerase. In one embodiment, said operon further
comprises an expression vector.
[0013] In one embodiment, the present invention contemplates a method for
evolving selective mutants of an enzyme, comprising: a) providing; a
system as described above comprising a library of bacterial cells having
a mutant enzyme; b) growing the library of bacterial cells; and c)
comparing a detectable output between the cells to identify the mutant
enzyme. In one embodiment, the method further comprises an operon
comprising a gene encoding a decoy fusion protein, the decoy fusion
protein comprising; (i) a second enzyme substrate domain that is
different from the first enzyme substrate domain; and (ii) a protein that
does not bind specifically to DNA and/or RNA polymerase. In one
embodiment, said operon further comprises an expression vector.
[0014] In one embodiment, the present invention contemplates a method for
evolving substrate domains selective for an enzyme, comprising: a)
providing; a method as described above comprising a library of bacterial
cells comprising substrate domains fused to DNA binding domains; b)
growing the library of bacterial cells in the presence of an inducer for
a first enzyme and a non-essential substrate; c) subsequently growing the
library of bacterial cells from step b) in the presence of an inducer for
a second enzyme; and d) subsequently screening for survivor bacterial
cells, thereby identifying substrates that bind to the first enzyme but
not to the second enzyme. In one embodiment, said system comprises a
reporter protein that yields a detectable output. In one embodiment, the
reporter protein generates a toxic product in the presence of a
non-essential substrate. In one embodiment, the system further comprises
an operon comprising a gene selected from the group consisting of a first
inducible promoter for a first enzyme and a second inducible promoter for
a second enzyme, wherein the second enzyme has a similar activity to the
first enzyme.
[0015] In one embodiment, the present invention contemplates a method of
using a microbial biosensor comprising an operon, wherein said operon
comprises; a) providing a reporter gene and a sensor fusion protein gene;
and b) expressing said sensor fusion protein with a post-translational
modification and the reporter gene. In one embodiment, said expressed
sensor fusion protein has a protein tyrosine phosphatase substrate domain
and is capable of binding to said DNA binding sequences in the presence
of at least one expressible sensor fusion protein as a recognition domain
(SH2) for said protein tyrosine phosphatase substrate domain attached to
a phosphate molecule. In one embodiment, said operon further comprises
gene segments encoding: i) a first expressible sensor fusion protein as a
protein tyrosine phosphatase substrate domain capable of attaching to
said phosphate molecule, said first expressible sensor fusion protein is
in an operable combination with a DNA-binding protein; and ii) a second
expressible sensor fusion protein as a recognition domain (SH2) for said
protein tyrosine phosphatase substrate domain when attached to a
phosphate molecule, said second expressible sensor fusion protein is in
operable combination with a subunit of an RNA polymerase; and iii)
individual expressible fragments including, but not limited to, a Src
kinase protein; a protein tyrosine phosphatase 1B (PTP1B) and conjugated
to said transcriptionally active binding sequences capable of binding to
said DNA-binding protein of sensor fusion protein and said subunit of an
RNA polymerase in operable combination with said reporter gene.
[0016] In one embodiment, the present invention contemplates a method of
using a microbial biosensor comprising; a) providing; i) an operon,
wherein said operon comprises a reporter gene and a sensor fusion protein
gene; ii) a living bacterium; and iii) a test small molecule inhibitor of
said protein tyrosine phosphatase enzyme; b) expressing said sensor
fusion protein with a post-translational modification and a reporter
gene; c) contacting said bacterium with said test small molecule; and d)
determining whether said test small molecule is an inhibitor for said
protein phosphatase enzyme by expression of said reporter gene. In one
embodiment, said expressed sensor fusion protein has a protein tyrosine
phosphatase substrate domain that is capable of binding to a DNA binding
sequence in the presence of at least one expressible sensor fusion
protein as a recognition domain (SH2) for said protein tyrosine
phosphatase substrate domain attached to a phosphate molecule. In one
embodiment, said expressed sensor fusion protein has a protein tyrosine
phosphatase 1B substrate domain that is capable of binding to said DNA
binding sequences in the presence of at least one expressible sensor
fusion protein as a recognition domain (SH2) for said protein tyrosine
phosphatase substrate domain attached to a phosphate molecule. In one
embodiment, said operon further comprises gene segments encoding: i) said
first expressible sensor fusion protein as said protein tyrosine
phosphatase substrate domain capable of attaching to said phosphate
molecule that is in operable combination with a DNA-binding protein; and ii) said
second expressible sensor fusion protein as a recognition domain (SH2)
for said protein tyrosine phosphatase substrate domain when attached to a
phosphate molecule that is in operable combination with a subunit of an
RNA polymerase; and iii) individual expressible fragments including but
not limited to, a Src kinase protein; a protein tyrosine phosphatase 1B
(PTP1B) and conjugated to said transcriptionally active binding sequences
capable of binding to said DNA-binding protein of sensor fusion protein
and said subunit of an RNA polymerase in operable combination with said
reporter gene. In one embodiment, said biosensor further comprises an
operon component for expressing a second gene. In one embodiment, said
biosensor further comprises an operon component for expressing a second
PTP that is different from the first PTP for identifying an inhibitor
selective for one of the PTP enzymes. In one embodiment, said test small
molecule inhibitor includes, but is not limited to, abietane-type
diterpenes, abietic acid (AA), dihydroabietic acid and structural
variants thereof.
[0017] In one embodiment, the present invention contemplates a method of
using a microbial biosensor, comprising: a) providing; i) an operon,
wherein said operon comprises a reporter gene and a sensor fusion protein
gene; ii) a living bacterium; and iii) a test small molecule inhibitor of
said protein tyrosine phosphatase enzyme; b) expressing said sensor
fusion protein with a post-translational modification and the reporter
gene; c) expressing said expressible sensor fusion proteins in said
bacterium; d) contacting said bacterium with said test small molecule;
and e) determining whether said test small molecule is an inhibitor for
said protein phosphatase enzyme by expression of said reporter gene. In
one embodiment, said expressed sensor fusion protein has a protein
tyrosine phosphatase substrate domain and is capable of binding to said
DNA binding sequences in the presence of at least one expressible sensor
fusion protein as a recognition domain (SH2) for said protein tyrosine
phosphatase substrate domain attached to a phosphate molecule. In one
embodiment, the expressed sensor fusion protein has a protein tyrosine
phosphatase 1B substrate domain and is capable of binding to said DNA
binding sequences in the presence of at least one expressible sensor
fusion protein as a recognition domain (SH2) for said protein tyrosine
phosphatase substrate domain attached to a phosphate molecule, and an
individual expressible fragment for a photoswitchable protein tyrosine
phosphatase 1B. In one embodiment, said operon comprises gene segments
encoding: i) said first expressible sensor fusion protein as said protein
tyrosine phosphatase substrate domain that is capable of attaching to
said phosphate molecule in operable combination with a DNA-binding
protein; ii) said second expressible sensor fusion protein as a
recognition domain (SH2) for said protein tyrosine phosphatase substrate
domain when attached to a phosphate molecule that is in operable
combination with a subunit of an RNA polymerase; and iii) individual
expressible fragments including, but not limited to, a Src kinase
protein; a protein tyrosine phosphatase 1B (PTP1B) and conjugated to said
transcriptionally active binding sequences capable of binding to said
DNA-binding protein of sensor fusion protein and said subunit of an RNA
polymerase in operable combination with said reporter gene.
[0018] In one embodiment, the present invention contemplates a method for
providing variants of chemical structures for use as a potential
therapeutic, comprising: a) providing; i) an E. coli bacterium comprising
a metabolic terpenoid chemical structure-producing pathway providing an
altered chemical structure, wherein said metabolic pathway comprises a
synthetic enzyme, wherein said E. coli further comprises a microbial
biosensor operon for detecting PTP inhibition; and ii) a mutated
synthetic enzyme or system of enzymes; b) introducing said mutated
synthetic enzyme or system of enzymes; c) expressing said mutated
synthetic enzyme under conditions wherein said mutated synthetic enzyme
or system of enzymes alters/alter the chemical structure of said
terpenoid chemical structure; and d) determining whether said altered
chemical structure is an inhibitor for said PTP as a test inhibitor for
use as a potential therapeutic. In one embodiment, said metabolic pathway
comprises synthetic enzymes including, but not limited to, terpene
synthases, cytochrome P450s, halogenases, methyl transferases, or
terpenoid-functionalizing enzymes. In one embodiment, said terpenoid
includes, but is not limited to, labdane-related diterpenoids. In one
embodiment, said terpenoid includes but is not limited to, abietane-type
diterpenoids. In one embodiment, said terpenoid is abietic acid.
[0019] In one embodiment, the present invention contemplates a fusion
protein DNA construct, comprising a protein phosphatase gene and a
protein light switch gene conjugated within said phosphatase gene,
wherein said protein phosphatase gene encodes a protein with a C-terminal
domain and said protein light switch gene encodes a protein with an
N-terminal alpha helical region such that said C-terminal domain is
conjugated to said N-terminal alpha helical region. In one embodiment,
said construct further comprises an expression vector and a living cell.
In one embodiment, said protein phosphatase is a protein tyrosine
phosphatase. In one embodiment, said protein phosphatase is protein
tyrosine phosphatase 1B (PTP1B). In one embodiment, said C-terminal
domain encodes an α7 helix of PTP1B. In one embodiment, said
construct encodes PTP1B_PS-A. In one embodiment, said construct
encodes PTP1B_PS-B. In one embodiment, said protein phosphatase is
T-Cell protein tyrosine phosphatase (TC-PTP). In one embodiment, said
protein light switch is a light-oxygen-voltage (LOV) domain. In one
embodiment, said protein light switch is the LOV2 domain of phototropin 1
from Avena sativa. In one embodiment, said LOV2 domain comprises an A'α
helix of LOV2. In one embodiment, said LOV2 has at least one mutation
resulting in an amino acid mutation. It is not meant to limit such
mutations. In fact, a mutation may include but is not limited to a
nucleotide substitution, the addition of a nucleotide, and the deletion
of a nucleotide from said gene. In one embodiment, said mutation is a
substitution of a nucleotide. In one embodiment, said A'a helix of LOV2
has a T406A mutation. In one embodiment, said protein light switch is a
phytochrome protein. In one embodiment, said phytochrome protein is a
bacterial phytochrome protein. In one embodiment, said bacterial
phytochrome protein is a bacterial phytochrome protein 1 (BphP1) from
Rhodopseudomonas palustris. In one embodiment, said protein light switch
is a light-oxygen-voltage (LOV) domain with an artificial chromophore. In
one embodiment, said protein light switch is a phytochrome protein with
an artificial chromophore.
[0020] In one embodiment, the present invention contemplates a fusion
protein, comprising a protein phosphatase and a protein light switch
conjugated within said phosphatase, wherein said protein phosphatase has
a C-terminal domain and said protein light switch has a N-terminal alpha
helical region such that said C-terminal domain is conjugated to said
N-terminal alpha helical region. In one embodiment, said fusion protein
further comprises an expression vector and a living cell. In one
embodiment, said protein phosphatase is a protein tyrosine phosphatase.
In one embodiment, said protein phosphatase is protein tyrosine
phosphatase 1B (PTP1B). In one embodiment, said C-terminal domain encodes
an .alpha.7 helix. In one embodiment, said fusion protein is
PTP1B.sub.PS-A. In one embodiment, said fusion protein is PTP1B.sub.PS-B.
In one embodiment, said protein phosphatase is T-cell protein tyrosine
phosphatase (TC-PTP). In one embodiment, said protein light switch is a
light-oxygen-voltage (LOV) domain. In one embodiment, said protein light
switch is the LOV2 domain of phototropin 1 from Avena sativa. In one
embodiment, said LOV2 domain comprises an A'a helix of LOV2. In one
embodiment, said A'a helix of LOV2 has a T406A mutation. In one
embodiment, said protein light switch is a light-oxygen-voltage (LOV)
domain with an artificial chromophore. In one embodiment, said protein
light switch is a phytochrome protein with an artificial chromophore. In
one embodiment, said protein light switch is a phytochrome protein. In
one embodiment, said phytochrome protein is a bacterial phytochrome
protein. In one embodiment, said bacterial phytochrome protein is a
bacterial phytochrome protein 1 (BphP1) from Rhodopseudomonas palustris.
[0021] In one embodiment, the present invention contemplates a method of
using a fusion protein, comprising: a) providing: i) a fusion protein;
ii) a protein phosphatase, and iii) a living cell; and b) introducing
said fusion protein into said living cell such that illumination of said
light switch alters a feature in said living cell. In one embodiment,
said feature includes but is not limited to controlling cell movement,
morphology, controlling cell signaling and having a modulatory effect. In
one embodiment, said modulatory effect includes but is not limited to
inactivation, activation, reversible inactivation and reversible
activation. In one embodiment, said modulatory effect is dose dependent.
In one embodiment, said illumination is light within the range of 450-500
nm. In one embodiment, said illumination is light within the range of
600-800 nm. In one embodiment, said protein light switch undergoes
light-induced conformational change and said protein phosphatase has
allosterically modulated catalytic activity that is altered by said
conformational change. In one embodiment, said altering is enhanced or
reduced. In one embodiment, said protein light switch is a
light-oxygen-voltage (LOV) domain with an artificial chromophore. In one
embodiment, said protein light switch is a phytochrome protein with an
artificial chromophore. In one embodiment, said living cell has an
activity. In one embodiment, said living cell is in vivo. In one
embodiment, said method further comprises a step of controlling said
cellular activity in vivo.
[0022] In one embodiment, the present invention contemplates a method for
detecting a small molecule modulator of a protein phosphatase,
comprising: a) providing; i) a fusion protein comprising a protein
phosphatase and protein light switch; ii) a visual readout for
phosphatase activity; iii) an optical source, wherein said source is
capable of emitting light radiation; iv) a living cell; and v) a small
molecule test compound; b) expressing said fusion protein in said living
cell; c) contacting said living cell with said small molecule test
compound; d) illuminating said fusion protein within said cell with said
optical source; e) measuring a visual readout for a change in phosphatase
activity for identifying said small molecule test compound as a modulator
of said activity of said phosphatase; and f) using said modulatory small
molecule test compound for treating a patient exhibiting at least one
symptom of a disease associated with said phosphatase. In one embodiment,
said method further comprises identifying said small molecule test
compound as an inhibitor of the activity of said phosphatase. In one
embodiment, said method further comprises identifying said small molecule
test compound as an activator of the activity of said phosphatase. In one
embodiment, said disease includes but is not limited to diabetes,
obesity, cancer, anxiety, autoimmunity, or neurodegenerative diseases. In
one embodiment, said protein light switch is a light-oxygen-voltage (LOV)
domain with an artificial chromophore. In one embodiment, said protein
light switch is a phytochrome protein with an artificial chromophore. In
one embodiment, said method further provides a fluorescence-based
biosensor, and comprises a step of introducing said fluorescence-based
biosensor into said cell. In one embodiment, said method further
comprises a step of controlling said cellular activity in vivo. In one
embodiment, said visual readout for phosphatase activity is selected from
the group consisting of a fluorescence-based biosensor; changes in cell
morphology; and changes in cell motility.
[0023] In one embodiment, the present invention contemplates a
photoswitchable protein tyrosine phosphatase enzyme construct comprising
an N-terminal alpha helix of a protein light switch conjugated to a
C-terminal allosteric domain region. In one embodiment, said protein
tyrosine phosphatase enzyme is protein tyrosine phosphatase 1B (PTP1B).
In one embodiment, said protein light switch is a LOV2 domain of
phototropin 1 derived from Avena sativa (wild oats). In one embodiment,
said enzyme construct further comprises an expression vector. In one
embodiment, the present invention contemplates a biosensor for enzyme
activity, comprising; a) a substrate domain as described above; b) a
substrate recognition domain; c) a first fluorescent protein; and d) a
second fluorescent protein.
[0024] In one embodiment, the invention provides a genetically encoded
system for detecting small molecules that modulate enzyme activity,
comprising, a. a first region in operable combination comprising: i. a
first promoter; ii. a first gene encoding a first fusion protein
comprising a substrate recognition domain linked to a DNA-binding
protein; iii. a second gene encoding a second fusion protein comprising a
substrate domain linked to a protein capable of recruiting RNA polymerase
to DNA; iv. a second promoter; v. a third gene for a protein kinase; vi.
a fourth gene for a molecular chaperone; vii. a fifth gene for a protein
phosphatase; b. a second region in operable combination comprising: i. a
first DNA sequence encoding an operator for said DNA-binding protein; ii.
a second DNA sequence encoding a binding site for RNA polymerase; and
iii. one or more genes of interest (GOI). In one embodiment, said first
promoter is Pro1. In one embodiment, said substrate recognition domain is
a Src homology 2 (SH2) domain from H. sapiens. In one embodiment,
said DNA-binding protein is the 434 phage cI repressor. In one
embodiment, said substrate domain is a peptide substrate of both said
kinase and said phosphatase. In one embodiment, said second promoter is
ProD. In one embodiment, said protein capable of recruiting RNA
polymerase to DNA is the omega subunit of RNA polymerase (i.e., RpoZ or
RP.sub..omega.). In one embodiment, said protein kinase is Src kinase
from H. sapiens. In one embodiment, said molecular chaperone is CDC37
(i.e., the Hsp90 co-chaperone) from H. sapiens. In one embodiment, said
protein phosphatase is protein tyrosine phosphatase 1B (PTP1B) from H.
sapiens. In one embodiment, said operator is the operator for 434 phage
cI repressor. In one embodiment, said binding site for RNA polymerase is
the -35 to -10 region of the lacZ promoter. In one embodiment, said gene
of interest is SpecR, a gene that confers resistance to spectinomycin. In
one embodiment, said genes of interest are LuxA and LuxB, two genes that
yield a luminescent output. In one embodiment, said gene of interest is a
gene that confers resistance to an antibiotic. In one embodiment, said
protein phosphatase is PTPN6 from H. sapiens. In one embodiment, said
protein phosphatase is a protein tyrosine phosphatase (PTP). In one
embodiment, said protein phosphatase is the catalytic domain of a PTP. In
one embodiment, an alignment of the X-ray crystal structures of (i) the
catalytic domain of said protein phosphatase and (ii) the catalytic
domain of PTP1B yields a root-mean-square deviation (RMSD) of less than
or equal to 0.95 .ANG. (as defined by a function similar to the PyMOL
function align). In one embodiment, said catalytic domain of said protein
phosphatase has at least 34.1% sequence identity with the catalytic
domain of PTP1B. In one embodiment, said catalytic domain of said
phosphatase has at least 53.5% sequence similarity with the catalytic
domain of PTP1B. In one embodiment, said protein kinase is a protein
tyrosine kinase (PTK). In one embodiment, said protein kinase is the
catalytic domain of a PTK. In one embodiment, said first promoter is a
constitutive promoter. In one embodiment, said second promoter is a
constitutive promoter. In one embodiment, said first promoter is an
inducible promoter. In one embodiment, said second promoter is an
inducible promoter. In one embodiment, said binding site for RNA
polymerase comprises part of a third promoter. In one embodiment, said
first region lacks a gene for a molecular chaperone. In one embodiment,
said first fusion protein consists of a substrate recognition domain
linked to a protein capable of recruiting RNA polymerase to DNA, and said
second fusion protein consists of a substrate domain linked to a
DNA-binding protein. In one embodiment, said first region further
contains a third fusion protein (i.e., a "decoy") comprising a second
substrate domain, which is distinct from the first substrate domain,
linked to a protein that is incapable of recruiting RNA polymerase to
DNA. In one embodiment, said substrate domain of said third fusion
protein is a peptide substrate of both said kinase and said phosphatase.
In one embodiment, said substrate domain of said third fusion protein is
a peptide substrate of said kinase but is a poor substrate of said
phosphatase. In one embodiment, said first region further contains a
sixth gene for a second protein phosphatase, which is distinct from the
first protein phosphatase and which acts on said substrate domain of said
third fusion protein.
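
The structural and sequence cutoffs recited above (an RMSD of at most 0.95 .ANG. and a sequence identity of at least 34.1% relative to the catalytic domain of PTP1B) can be illustrated with a minimal sketch. The sketch makes several assumptions that are not stated in the specification: the sequences are already aligned with "-" marking gaps, identity is counted only over columns where neither sequence has a gap, and the coordinates are already paired and superposed. It does not reproduce the PyMOL align function, which performs its own superposition and refinement. All function names are hypothetical.

```python
import math

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: the denominator is the number of gap-free columns; other
    conventions (e.g., full alignment length) give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)

def rmsd(coords_a, coords_b) -> float:
    """Root-mean-square deviation between paired, already-superposed 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

def meets_identity_cutoff(identity_pct: float, cutoff: float = 34.1) -> bool:
    """True if identity is at least the recited 34.1% cutoff."""
    return identity_pct >= cutoff

def within_rmsd_cutoff(rmsd_angstrom: float, cutoff: float = 0.95) -> bool:
    """True if RMSD is less than or equal to the recited 0.95 angstrom cutoff."""
    return rmsd_angstrom <= cutoff
```

The two cutoffs are kept as separate checks because the specification recites them as separate embodiments rather than as a combined requirement.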
[0025] In one embodiment, the invention provides a method for using both
(i) a genetically encoded system for detecting small molecules that
modulate enzyme activity and (ii) a genetically encoded pathway for
terpenoid biosynthesis to identify and/or build terpenoids that modulate
enzyme activity, comprising, a. providing, i. a genetically encoded
system for detecting small molecules that modulate enzyme activity,
comprising, 1. a first region in operable combination comprising: a. a
first promoter; b. a first gene encoding a first fusion protein
comprising a substrate recognition domain linked to a DNA-binding
protein; c. a second gene encoding a second fusion protein comprising a
substrate domain linked to a protein capable of recruiting RNA polymerase
to DNA; d. a second promoter; e. a third gene for a protein kinase; f. a
fourth gene for a molecular chaperone; g. a fifth gene for a protein
phosphatase; 2. a second region in operable combination comprising: a. a
first DNA sequence encoding an operator for said DNA-binding protein; b.
a second DNA sequence encoding a binding site for RNA polymerase; c. one
or more genes of interest (GOI); ii. a genetically encoded pathway for
terpenoid biosynthesis comprising: 1. a pathway that generates linear
isoprenoid precursors; 2. a gene for a terpene synthase (TS); 3. a
plurality of E. coli bacteria; b. transforming said bacteria with both
(i) said genetically encoded system for detecting small molecules and
(ii) said genetically encoded pathway for terpenoid biosynthesis, and
allowing said transformed bacteria to replicate; c. observing the
expression of a gene of interest through a measurable output. In one
embodiment, said pathway that generates linear isoprenoid precursors
generates farnesyl pyrophosphate (FPP). In one embodiment, said pathway
that generates linear isoprenoid precursors is all or part of the
mevalonate-dependent isoprenoid pathway of S. cerevisiae. In one
embodiment, said pathway that generates linear isoprenoid precursors is
carried by the plasmid pMBIS. In one embodiment, said gene of interest is
SpecR, a gene that confers resistance to spectinomycin. In one
embodiment, said TS gene is carried on a separate plasmid (pTS) from the
rest of the terpenoid pathway. In one embodiment, said TS gene encodes
for amorphadiene synthase (ADS) from Artemisia annua. In one embodiment,
said TS gene encodes for .gamma.-humulene synthase (GHS) from Abies
grandis. In one embodiment, said TS gene encodes for abietadiene synthase
(ABS) from Abies grandis, and this gene is carried in operable
combination with a gene for geranylgeranyl diphosphate synthase (GGPPS).
In one embodiment, said TS gene encodes for taxadiene synthase (TXS) from
Taxus brevifolia, and this gene is carried in operable combination with a
gene for GGPPS. In one embodiment, the method further comprises, d.
extracting terpenoids that enable the highest measurable output (e.g.,
growth at the highest concentration of spectinomycin); e. identifying
said terpenoids; f. purifying said terpenoids. In one embodiment, the
method further comprises, providing, g. a mammalian cell culture, h.
treating said cell cultures with purified terpenoids, i. measuring a
biochemical effect that results from changes in the activity of a protein
phosphatase or protein kinase. In one embodiment, the method further
comprises, j. providing, a purified enzyme target, k. measuring the
modulatory effect of purified terpenoids on the enzyme target, l.
quantifying that modulatory effect (e.g., by calculating an IC.sub.50).
In one embodiment, said TS gene has at least one mutation. In one
embodiment, said TS gene is in operable combination with a gene for an
enzyme that functionalizes terpenoids. In one embodiment, said TS gene is
in operable combination with a gene for a cytochrome P450. In one
embodiment, said TS gene is in operable combination with a gene for
cytochrome P450 BM3 from Bacillus megaterium. In one embodiment, said TS
gene is in operable combination with a gene for a halogenase. In one
embodiment, said TS gene is in operable combination with a gene for
6-halogenase (SttH) from Streptomyces toxytricini. In one embodiment,
said TS gene is in operable combination with a gene for vanadium
haloperoxidase (VHPO) from Acaryochloris marina. In one embodiment, said
mammalian cell is a HepG2, HeLa, HEK293T, MCF-7, and/or CHO-hIR cell. In
one embodiment, said cells are BT474, SKBR3, or MCF-7 and MDA-MB-231
cells. In one embodiment, said biochemical effect is insulin receptor
phosphorylation, which can be measured by a western blot or enzyme-linked
immunosorbent assay (ELISA). In one embodiment, said cells are triple
negative (TN) cell lines. In one embodiment, said cells are TN cells from
the American Type Culture Collection (ATCC). In one embodiment, said
cells are TN cells from ATCC TCP-1002. In one embodiment, said
biochemical effect is cellular migration. In one embodiment, said
biochemical effect is cellular viability. In one embodiment, said
biochemical effect is cellular proliferation. In one embodiment, said
protein phosphatase is PTP1B from H. sapiens. In one embodiment, said
protein kinase is Src kinase from H. sapiens. In one embodiment, said
gene of interest confers resistance to an antibiotic. In one embodiment,
said gene of interest is SacB, a gene that confers sensitivity to
sucrose. In one embodiment, said gene of interest confers conditional
toxicity (i.e., toxicity in the presence of an exogenously added
molecule). In one embodiment, said genes of interest are SpecR and SacB.
In one embodiment, said protein phosphatase is the wild-type enzyme. In
one embodiment, said protein phosphatase has at least one mutation. In
one embodiment, said protein phosphatase has at least one mutation that
reduces its sensitivity to a small molecule that modulates the activity
of the wild-type protein phosphatase. In one embodiment, said protein
kinase is the wild-type enzyme. In one embodiment, said protein kinase
has at least one mutation. In one embodiment, said protein kinase has at
least one mutation that reduces its sensitivity to a small molecule that
modulates the activity of the wild-type protein kinase. In one
embodiment, said at least one of said terpenoids inhibits a protein
phosphatase. In one embodiment, said at least one of said terpenoids
inhibits a PTP. In one embodiment, said at least one of said terpenoids
inhibits PTP1B. In one embodiment, said at least one of said terpenoids
activates a protein phosphatase. In one embodiment, said at least one of
said terpenoids activates a PTP. In one embodiment, said at least one of
said terpenoids activates protein tyrosine phosphatase non-receptor type
12 (PTPN12). In one embodiment, said at least one of said terpenoids
inhibits a protein kinase. In one embodiment, said at least one of said
terpenoids inhibits a PTK. In one embodiment, said at least one of said
terpenoids inhibits Src kinase. In one embodiment, said at least one of
said terpenoids activates a protein kinase. In one embodiment, said at
least one of said terpenoids activates a PTK. In one embodiment, said
genetically encoded system for detecting small molecules further contains
both (i) a third fusion protein comprising a second substrate domain,
which is distinct from the first substrate domain, linked to a protein
that is incapable of recruiting RNA polymerase to DNA and (ii) a sixth
gene for a second protein phosphatase, which is distinct from the first
protein phosphatase. In one embodiment, said genetically encoded system
for detecting small molecules further contains both (i) a third fusion
protein comprising a second substrate domain, which is distinct from the
first substrate domain, linked to a protein that is incapable of
recruiting RNA polymerase to DNA and (ii) a sixth gene for a second
protein kinase, which is distinct from the first protein kinase. In one
embodiment, said genetically encoded pathway for terpenoid biosynthesis
comprises, instead, a library of pathways that differ in the identity of
the TS gene such that upon transformation, the majority of cells contain
a distinct TS gene (i.e., a gene that differs by at least one mutation).
In one embodiment, said genetically encoded pathway for terpenoid
biosynthesis comprises, instead, a library of pathways that differ in the
identity of a gene that functionalizes terpenoids (e.g., a cytochrome
P450 or halogenase), in operable combination with the TS gene, such that
upon transformation, the majority of cells contain a distinct gene that
functionalizes terpenoids (i.e., a gene that differs by at least one
mutation). In one embodiment, said genetically encoded pathway for
terpenoid biosynthesis comprises, instead, a library of pathways in which
the TS gene has been replaced by a component of a eukaryotic
complementary DNA (cDNA) library such that upon transformation, the
majority of cells contain a distinct gene in place of the TS gene. In one
embodiment, said genetically encoded pathway for terpenoid biosynthesis
comprises, instead, a library of pathways in which the TS gene is
accompanied by a component of a eukaryotic complementary DNA (cDNA)
library such that upon transformation, the majority of cells contain a
distinct gene in operable combination with the TS gene (e.g., a gene that
may encode for a terpenoid-functionalizing enzyme). In one embodiment,
said genetically encoded system for detecting small molecules comprises,
instead, a library of such systems that differ in the identity of the
protein phosphatase gene such that upon transformation, the majority of
cells contain a distinct protein phosphatase gene (i.e., a gene that
differs by at least one mutation). In one embodiment, said genetically
encoded pathway for terpenoid biosynthesis generates a terpenoid that
modulates the activity of the wild-type form of said protein phosphatase,
thereby enabling the growth study to isolate a mutant of said protein
phosphatase that is less sensitive to the modulatory effect of the small
molecule. In one embodiment, said genetically encoded system for
detecting small molecules comprises, instead, a library of such systems
that differ in the identity of the protein kinase gene, such that upon
transformation, the majority of cells contain a separate protein kinase
gene (i.e., a gene that differs by at least one mutation). In one
embodiment, said genetically encoded pathway for terpenoid biosynthesis
generates a terpenoid that modulates the activity of the wild-type form
of said protein kinase, thereby enabling the growth study to isolate a
mutant of said protein kinase that is less sensitive to the modulatory
effect of the small molecule. In one embodiment, said at least one of
said terpenoids modulates the activity of the wild-type form of said
protein phosphatase, but not a mutated form of said protein phosphatase.
In one embodiment, said at least one of said terpenoids modulates the
activity of the said first protein phosphatase, but not the activity of
said second protein phosphatase. In one embodiment, said at least one of
said terpenoids modulates the activity of the wild-type form of said
protein kinase, but not a mutated form of said protein kinase. In one
embodiment, said at least one of said terpenoids modulates the activity
of said first protein kinase, but not the activity of said second protein
kinase.
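
The step of quantifying a modulatory effect "by calculating an IC.sub.50" can be sketched as follows. This is only one simple way to estimate an IC.sub.50 and is not taken from the specification: it interpolates linearly on a log.sub.10 concentration axis between the two measured points that bracket 50% residual activity. A fitted dose-response model (for example, a four-parameter logistic curve) would be a common alternative. The function name and the input format (fractional activity relative to an uninhibited control) are assumptions made for illustration.

```python
import math

def estimate_ic50(concentrations, activities):
    """Estimate IC50 by log-linear interpolation between bracketing points.

    concentrations: positive inhibitor concentrations (any consistent unit).
    activities: fractional enzyme activity at each concentration, where 1.0
        is the uninhibited control (assumed normalization).
    Returns the interpolated concentration at 50% activity, or None if the
    data never cross 50%.
    """
    if len(concentrations) != len(activities):
        raise ValueError("inputs must have equal length")
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive for a log axis")
    points = sorted(zip(concentrations, activities))
    for (c_lo, a_lo), (c_hi, a_hi) in zip(points, points[1:]):
        # Look for the first adjacent pair that brackets 50% activity.
        if a_lo >= 0.5 >= a_hi and a_lo != a_hi:
            frac = (a_lo - 0.5) / (a_lo - a_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None
```

For example, a call such as estimate_ic50([1, 100], [0.75, 0.25]) places the estimate midway between the two concentrations on the log axis.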
[0026] In one embodiment, the invention provides an inhibitor detection
operon comprising, A: a first region in operable combination under
control of a first promoter including: i. a first DNA sequence encoding a
first fusion protein comprising a substrate recognition homology 2 domain
(SH2) and a repressor; ii. a second DNA sequence encoding a second fusion
protein comprising a phosphate molecule binding domain of a substrate
recognition domain, said substrate recognition domain and an omega
subunit of RNA polymerase (RpoZ or RP.sub..omega.); iii. a third DNA
sequence encoding a Cell Division Cycle 37 protein (CDC37); iv. a protein
phosphatase; and B: a second region in operable combination under control
of a second promoter comprising: i. an operator comprising a repressor
binding domain binding said repressor, ii. a ribosome binding site (RB); and iii.
a gene of interest (GOI). In one embodiment, said SH2 domain is a
substrate recognition domain of said protein phosphatase. In one
embodiment, said repressor is a 434 phage cI repressor. In one
embodiment, said substrate recognition domain binds said protein
phosphatase. In one embodiment, said decoy substrate domain is a Src
kinase gene. In one embodiment, said operator is a 434cI operator. In one
embodiment, said gene of interest encodes an antibiotic protein. In one
embodiment, said protein phosphatase is a protein tyrosine phosphatase.
In one embodiment, said first promoter is a constitutive promoter. In one
embodiment, said second promoter is an inducible promoter.
[0027] In one embodiment, the invention provides a method of using an
inhibitor detection operon, comprising, a. providing, i. a detection
operon, comprising A: a first region in operable combination under
control of a first promoter including: 1. a first DNA sequence encoding a
first fusion protein comprising a protein phosphatase enzyme's substrate
recognition homology 2 domain (SH2) and a repressor binding domain; 2. a
second DNA sequence encoding a second fusion protein comprising a
phosphate molecule binding domain of a protein phosphatase enzyme's
substrate recognition domain, said protein phosphatase enzyme's substrate
recognition domain and an omega subunit of RNA polymerase (RpoZ or
RP.sub..omega.); 4. a third DNA sequence encoding a Cell Division Cycle 37
(CDC37)
protein; 5. a protein phosphatase enzyme; and B: a second region in
operable combination under control of a second promoter comprising: 6. an
operator comprising a repressor binding domain binding said repressor, 7.
a ribosome binding site (RB); and 8. a gene of interest (GOI); and ii. a
mevalonate pathway operon having a missing gene, such that said pathway
operon does not contain at least one gene in said pathway for producing
said terpenoid compound, under control of a third promoter comprising a
second gene of interest for producing a terpenoid compound, iii. a fourth
DNA sequence under control of a fourth promoter comprising said missing
gene from said mevalonate pathway operon and a third gene of interest;
and iv. a plurality of E. coli bacteria, and b. transfecting said E. coli
bacteria with said first operon for expressing said first gene of
interest; c. transfecting said E. coli bacteria with said mevalonate
pathway operon for expressing said first and said second gene of
interest; d. transfecting said E. coli bacteria with said fourth DNA
sequence for expressing said first and said second and said third gene of
interest; e. growing said cells wherein said inhibitor terpenoid
compounds for protein phosphatase enzymes are produced by said cells. In
one embodiment, said method further comprising step e. isolating said
protein phosphatase inhibitor molecules and providing a mammalian cell
culture for step f. treating said cell cultures for reducing activity of
said protein phosphatase enzyme. In one embodiment, said method further
providing an inducer compound for inducing said inducible promoter and a
step of contacting said bacteria with said compound. In one embodiment,
said method wherein reducing activity of said protein phosphatase enzyme
reduces growth of said mammalian cells. In one embodiment, said protein
phosphatase enzyme is human PTP1B. In one embodiment, said protein
phosphatase enzyme is wild-type. In one embodiment, said protein
phosphatase enzyme has at least one mutation. In one embodiment, said
missing enzyme is a terpene synthase enzyme. In one embodiment, said
terpene synthase enzyme is selected from the group consisting of
amorphadiene synthase (ADS) and .gamma.-humulene synthase (GHS). In one
embodiment, said fourth DNA sequence further comprises a geranylgeranyl
diphosphate synthase (GGPPS) and said missing enzyme is selected from the
group consisting of abietadiene synthase (ABS) and taxadiene synthase
(TXS). In one embodiment, said terpene synthase enzyme is wild-type. In
one embodiment, said terpene synthase enzyme has at least one mutation.
In one embodiment, said terpenoid compounds are structural variants of
terpenoid compounds. In one embodiment, said genes of interest are
antibiotic genes. In one embodiment, said genes of interest are each
different antibiotic genes.
[0028] In one embodiment, said genetically encoded detection operon
system, comprising; Part A: a first region of DNA in operable combination
comprising: a region of DNA encoding a first promoter; a first gene
encoding a first fusion protein comprising a substrate recognition domain
linked to a DNA-binding protein; a second gene encoding a second fusion
protein comprising a substrate domain linked to a protein capable of
recruiting RNA polymerase to DNA; a region of DNA encoding a second
promoter; a third gene for a protein kinase; a fourth gene for a
molecular chaperone; a fifth gene for a protein phosphatase; Part B: a
second region of DNA in operable combination under control of a second
promoter comprising: a first DNA sequence encoding an operator for said
DNA-binding protein; a second DNA sequence encoding a binding site for
RNA polymerase; and at least one gene of interest (GOI). In one
embodiment, said substrate recognition domain is a Src homology 2
(SH2) domain. In one embodiment, said DNA-binding protein is the 434
phage cI repressor. In one embodiment, said substrate domain is a peptide
substrate of both said kinase and said phosphatase. In one embodiment,
said protein capable of recruiting RNA polymerase to DNA is the omega
subunit of RNA polymerase (RP.sub..omega.). In one embodiment, said gene
for a kinase is a Src kinase gene. In one embodiment, said molecular
chaperone is CDC37. In one embodiment, said molecular chaperone is the
Hsp90 co-chaperone from H. sapiens. In one embodiment, said operator is
a 434 phage cI operator. In one embodiment, said gene of interest is a
gene for antibiotic resistance. In one embodiment, said gene for
antibiotic resistance produces an enzyme that allows the bacteria to
degrade an antibiotic protein. In one embodiment, said protein
phosphatase enzyme is protein tyrosine phosphatase 1B. In one embodiment,
said first and second promoters of part A are constitutive promoters. In
one embodiment, said second promoter of Part B is an inducible promoter.
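
The way the parts of such a detection system could fit together can be sketched as a toy Boolean model. The logic below is an interpretive assumption, not language from this paragraph: the kinase phosphorylates the substrate domain, the phosphatase reverses that modification, the substrate recognition domain binds only the phosphorylated substrate, and that binding brings the RNA polymerase recruiting protein to the operator so that the gene of interest is expressed. The model also assumes that an active, uninhibited phosphatase fully outcompetes the kinase. All names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class CellState:
    kinase_active: bool        # e.g., a protein tyrosine kinase is expressed and folded
    phosphatase_active: bool   # e.g., a protein tyrosine phosphatase is expressed
    inhibitor_present: bool    # a small molecule that inhibits the phosphatase

def substrate_phosphorylated(state: CellState) -> bool:
    """Assumed steady state: an uninhibited phosphatase removes the phosphate."""
    phosphatase_effective = state.phosphatase_active and not state.inhibitor_present
    return state.kinase_active and not phosphatase_effective

def goi_expressed(state: CellState) -> bool:
    """Assumed readout: the gene of interest is transcribed only when the
    substrate recognition domain can bind phosphorylated substrate."""
    return substrate_phosphorylated(state)

# Under these assumptions, only cells containing a phosphatase inhibitor
# (or lacking phosphatase activity) express the gene of interest.
no_inhibitor = CellState(kinase_active=True, phosphatase_active=True, inhibitor_present=False)
with_inhibitor = CellState(kinase_active=True, phosphatase_active=True, inhibitor_present=True)
```

When the gene of interest confers antibiotic resistance, this assumed logic would link survival on antibiotic to the presence of a phosphatase inhibitor.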
[0029] In one embodiment, the invention provides a method of using a
genetically encoded detection operon system, comprising, a. providing, i.
an inhibitor detection operon, comprising Part A: a first region of DNA
in operable combination comprising: 1. a region of DNA encoding a first
promoter; 2. a first gene encoding a first fusion protein comprising a
substrate recognition domain linked to a DNA-binding protein; 3. a second
gene encoding a second fusion protein comprising a substrate domain
linked to a protein capable of recruiting RNA polymerase to DNA; 4. a
region of DNA encoding a second promoter; 5. a third gene for a protein
kinase; 6. a fourth gene for a molecular chaperone; 7. a fifth gene for a
protein phosphatase; Part B: a second region of DNA in operable
combination under control of a second promoter comprising: 8. a first DNA
sequence encoding an operator for said DNA-binding protein; 9. a second
DNA sequence encoding a binding site for RNA polymerase; and 10. at least
one gene of interest (GOI). ii. a mevalonate-terpene pathway operon not
containing a terpene synthase gene, under control of a fourth promoter
comprising a second gene of interest for producing a terpenoid compound,
iii. a fourth DNA sequence under control of a fifth promoter comprising
said terpene synthase gene and a third gene of interest; and iv. a
plurality of bacteria, and b. transfecting said bacteria with said
inhibitor detection operon for expressing said first gene of interest; c.
transfecting said bacteria with said mevalonate pathway operon for
expressing said second gene of interest; d. transfecting said bacteria
with said fourth DNA sequence for expressing said third gene of interest;
e. growing said bacteria cells expressing said three genes of interest
wherein said inhibitor terpenoid compounds are produced by said bacteria
cells inhibiting said protein phosphatase enzyme. In one embodiment, said
method further comprising step e. isolating said protein phosphatase
inhibitor molecules and providing a mammalian cell culture for step f.
treating said cell cultures for reducing activity of said protein
phosphatase enzyme. In one embodiment, said method wherein reducing
activity of said protein phosphatase enzyme reduces growth of said
mammalian cells. In one embodiment, said protein phosphatase enzyme is
human PTP1B. In one embodiment, said protein phosphatase enzyme is
wild-type. In one embodiment, said protein phosphatase enzyme has at
least one mutation. In one embodiment, said mevalonate pathway operon
comprises genes for expressing mevalonate kinase (ERG12),
phosphomevalonate kinase (ERG8), mevalonate pyrophosphate decarboxylase
(MVD1), Isopentenyl pyrophosphate isomerase (IDI gene), and Farnesyl
pyrophosphate (FPP) synthase (ispA). In one embodiment, said missing
enzyme is a terpene synthase enzyme. In one embodiment, said terpene
synthase enzyme is selected from the group consisting of amorphadiene
synthase (ADS) and .gamma.-humulene synthase (GHS). In one embodiment,
said fourth DNA sequence further comprises a geranylgeranyl diphosphate
synthase (GPPS) and said terpene synthase is selected from the group
consisting of abietadiene synthase (ABS) and taxadiene synthase (TXS). In
one embodiment, said terpene synthase enzyme is wild-type. In one
embodiment, said terpene synthase enzyme has at least one mutation. In
one embodiment, said terpenoid compounds are structural variants of
terpenoid compounds. In one embodiment, said genes of interest are
antibiotic genes. In one embodiment, said genes of interest are each
different antibiotic genes. In one embodiment, said method further
provides an inducer compound for inducing said inducible promoter and a
step of contacting said bacteria with said compound.
[0030] In one embodiment, the invention provides a method for using both
(i) a genetically encoded system for detecting small molecules that
modulate enzyme activity and (ii) a genetically encoded pathway for
polyketide biosynthesis to identify and/or build polyketides that
modulate enzyme activity, comprising, providing, a genetically encoded
system for detecting small molecules that modulate enzyme activity,
comprising, a first region in operable combination comprising: a first
promoter; a first gene encoding a first fusion protein comprising a
substrate recognition domain linked to a DNA-binding protein; a second
gene encoding a second fusion protein comprising a substrate domain
linked to a protein capable of recruiting RNA polymerase to DNA; a second
promoter; a third gene for a protein kinase; a fourth gene for a
molecular chaperone; a fifth gene for a protein phosphatase; a second
region in operable combination comprising: a first DNA sequence encoding
an operator for said DNA-binding protein; a second DNA sequence encoding
a binding site for RNA polymerase; one or more genes of interest (GOI); a
genetically encoded pathway for polyketide biosynthesis comprising; a
gene for a polyketide synthase; a plurality of E. coli bacteria. In one
embodiment, said polyketide synthase is 6-deoxyerythronolide B synthase
(DEBS). In one embodiment, said polyketide synthase (PKS) is a modular
combination of different PKS components.
[0031] In one embodiment, the invention provides a method for using both
(i) a genetically encoded system for detecting small molecules that
modulate enzyme activity and (ii) a genetically encoded pathway for
polyketide biosynthesis to identify and/or build alkaloids that modulate
enzyme activity, comprising, a. providing, a genetically encoded system
for detecting small molecules that modulate enzyme activity, comprising,
a first region in operable combination comprising: a first promoter; a
first gene encoding a first fusion protein comprising a substrate
recognition domain linked to a DNA-binding protein; a second gene
encoding a second fusion protein comprising a substrate domain linked to
a protein capable of recruiting RNA polymerase to DNA; a second promoter;
a third gene for a protein kinase; a fourth gene for a molecular
chaperone; a fifth gene for a protein phosphatase; a second region in
operable combination comprising: a first DNA sequence encoding an
operator for said DNA-binding protein; a second DNA sequence encoding a
binding site for RNA polymerase; one or more genes of interest (GOI); a
genetically encoded pathway for polyketide biosynthesis comprising, a
pathway for alkaloid biosynthesis; a plurality of E. coli bacteria. In
one embodiment, said pathway for alkaloid biosynthesis is described herein.
[0032] In one embodiment, the invention provides an engineered bacteria
cell line comprising expression plasmid 1, plasmid 2, plasmid 3 and
plasmid 4.
[0033] In one embodiment, the invention provides a phosphatase inhibitor
molecule produced by a bacterium expressing a plasmid 1 in contact with
an inducer molecule for inducing a promoter expressing a terpenoid
synthesis pathway operon in plasmid 2 and a terpene synthase enzyme in
plasmid 3, wherein said plasmid 2 and plasmid 3 are coexpressed in said
bacteria with plasmid 1. In one embodiment, said plasmid 2 and said
plasmid 3 are under control of an inducible promoter. In one embodiment,
said bacterium is contacted by an inducible molecule for inducing said
promoter.
[0034] In one embodiment, the invention provides a bacteria strain
producing a phosphatase inhibitor molecule. In one embodiment, said
inhibitor is a terpenoid molecule.
BRIEF DESCRIPTION OF THE DRAWINGS
[0035] The patent or application file contains at least one drawing
executed in color. Copies of this patent or patent application
publication with color drawing(s) will be provided by the Office upon
request and payment of the necessary fee.
[0036] FIG. 1A-G illustrates embodiments and shows exemplary results of
developing a photoswitchable phosphatase, e.g. PTP1B.sub.PS.
[0037] FIG. 1A illustrates one embodiment of a design of PTP1B.sub.PS:
Light-induced unwinding of the A'.alpha. helix of LOV2 destabilizes the
.alpha.7 helix of PTP1B and, thus, inhibits catalysis. FIG. 1B
illustrates one embodiment of Elaboration: In the competitively inhibited
structure of PTP1B (orange), the .alpha.7 helix (SEQ ID NO: 1) is stable,
and the WPD loop (black) adopts a closed, catalytically competent
conformation. In the apo structure (yellow), the .alpha.7 helix is
disordered, and the WPD loop (blue) adopts an open, inactive
conformation. We attached the C-terminal .alpha.7 helix of PTP1B to the
N-terminal A'.alpha. helix of LOV2 (SEQ ID NO: 2) at homologous crossover
points (1-7) to create a chimera for which the photoresponse of LOV2
destabilizes the .alpha.7 helix. FIG. 1C shows exemplary results of
optimization of one embodiment: Construct 7 exhibited the largest dynamic
range of the crossover variants; 7.1 had an improved activity over 7,
while 7.1(T406A) had an improved dynamic range over 7.1. FIG. 1D shows an
exemplary analysis of the activity of PTP1B.sub.PS on pNPP, which indicates that
light affects k.sub.cat, but not K.sub.m. FIG. 1E shows the dynamic range
of PTP1B.sub.PS is similar for substrates of different sizes. FIG. 1F
shows exemplary illustrations of two small molecules:
p-nitrophenyl-phosphate (or pNPP), and 4-methylumbelliferyl phosphate (or
4MU) and a peptide domain from EGFR. FIG. 1G shows exemplary activity of
PTP1B-LOV2 chimeras that differ in (A-D) crossover location and (E-E4)
linker composition in the presence and absence of 455 nm light.
Substrate: 4-methylumbelliferyl phosphate.
[0038] FIG. 2A-J shows exemplary biophysical characterizations of
PTP1B.sub.PS.
[0039] FIG. 2A shows exemplary mutations that (i) prevent the formation of
the cysteine adduct in LOV2 (C450M), (ii) destabilize the A'.alpha. and
J.alpha. helices of LOV2 (I532E, I539E, and .DELTA.J.alpha.), or (iii)
disrupt the allosteric network of PTP1B (Y152A/Y153A) reduced the
photosensitivity of 7.1 and, with the exception of I532E and C450M,
lowered its specific activity. FIG. 2B shows exemplary exposure of
PTP1B.sub.PS to 455 nm light reduces its .alpha.-helical content
(CD.sub.222 nm). FIG. 2C shows exemplary optical modulation of
.alpha.-helical content (i.e.,
.delta..sub.222=CD.sub.222-dark-CD.sub.222-light) is necessary, but not
sufficient for optical modulation of catalytic activity. The dashed line
denotes .delta..sub.222 for an equimolar solution of PTP1B.sub.WT and
LOV2.sub.WT. FIG. 2D shows exemplary fluorescence of six tryptophan
residues in the catalytic domain of PTP1B which enables optical
monitoring of its conformational state. FIG. 2E-F shows exemplary thermal
recovery of (FIG. 2E) .alpha.-helical content and (FIG. 2F) tryptophan
fluorescence of PTP1B.sub.PS. FIG. 2G shows exemplary kinetic constants
for thermal resetting are larger for .alpha.-helical content than for
tryptophan fluorescence, suggesting that LOV2 resets more quickly than
the PTP1B domain. This discrepancy is smallest for the most
photosensitive variant: 7.1(T406A). FIG. 2H shows exemplary alignments of
the crystal structures of PTP1B.sub.PS (blue) and apo PTP1B.sub.WT
(orange) indicate that LOV2 does not distort the structure of the
catalytic domain. The LOV2 domain of PTP1B.sub.PS could not be resolved;
a flexible loop at the beginning of the .alpha.7 helix likely causes LOV2
to adopt variable orientations in the crystal lattice. The .alpha.6 and
.alpha.7 helices of an inhibited structure of PTP1B (yellow) are shown
for reference. FIG. 2I shows an exemplary gap in the crystal structure of
PTP1B.sub.PS that can accommodate LOV2. FIG. 2J shows where exemplary
crystals of a PTP1B-LOV2 fusion are green and turn clear when illuminated
with 455 nm light; LOV2 is, thus, unequivocally present. Error bars for
A, C, and G denote standard error (n>3). Note: PTP1B.sub.PS
corresponds to construct 7.1(T406A) from FIG. 1.
[0040] FIG. 3A-D demonstrates exemplary Fluorescence-based Biosensors
having PTP1B activity.
[0041] FIG. 3A shows one embodiment of a sensor for PTP1B activity. This
sensor consists of a kinase substrate domain, a short flexible linker,
and a phosphorylation recognition domain, sandwiched between two
fluorescent proteins (e.g., a cyan fluorescent protein and a yellow
fluorescent protein). When the sensor is in its unphosphorylated state,
Forster resonance energy transfer (FRET) between the two fluorophores
causes a decrease in CFP fluorescence and an increase in YFP
fluorescence; when the sensor is in its phosphorylated state, the absence
of FRET causes the opposite effect. FIG. 3B shows an exemplary increase
in the ratio of donor fluorescence (CFP) to acceptor fluorescence (YPet)
evidences the presence of Src kinase (i.e., a tyrosine kinase). When
either (i) EDTA, which chelates a metal cofactor of Src, or (ii) PTP1B,
which dephosphorylates the substrate domain, are additionally added, this
increase does not occur. FIG. 3C shows one embodiment as another variety
of the FRET sensor for A; this one uses mClover3 and mRuby3. The
excitation and emission wavelengths of these proteins make them
compatible with LOV2-based imaging experiments. FIG. 3D shows an
exemplary repeat of the experiment from B with the sensor from C.
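As a non-limiting illustration of the readout described for FIG. 3B, the following minimal Python sketch computes the ratio of donor fluorescence to acceptor fluorescence and the fractional change in that ratio between two conditions. The function names and the intensity values are hypothetical and are not taken from the experiments described herein.

```python
def fret_ratio(donor, acceptor):
    """Ratio of donor emission (e.g., CFP) to acceptor emission (e.g., YPet)."""
    if acceptor <= 0:
        raise ValueError("acceptor intensity must be positive")
    return donor / acceptor


def fractional_change(ratio_before, ratio_after):
    """Change in ratio relative to the starting ratio."""
    return (ratio_after - ratio_before) / ratio_before


# Hypothetical intensities (arbitrary units), not measured data.
ratio_baseline = fret_ratio(donor=400.0, acceptor=800.0)
ratio_with_kinase = fret_ratio(donor=600.0, acceptor=600.0)
change = fractional_change(ratio_baseline, ratio_with_kinase)
```

In this sketch, an increase in the donor-to-acceptor ratio corresponds to a loss of FRET, which is the signature of sensor phosphorylation described above. The same two functions could be applied per cell or per subcellular region when analyzing images.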
[0042] FIG. 4A-H demonstrates exemplary Evidence of phosphatase activity
within living cells using photoconstructs and fluorescent tags.
[0043] FIG. 4A-C shows embodiments of three constructs expressed in
Cos-7 cells: (FIG. 4A) GFP-PTP1B.sub.PS, (FIG. 4B) GFP-PTP1B.sub.PS-A,
and (FIG. 4C) GFP-PTP1B.sub.PS-B. Here, GFP-PTP1B.sub.PS is a fusion of
green fluorescent protein (GFP) and the N-terminus of 7.1(T406A) from
FIG. 1B-C (without the histidine tag); GFP-PTP1B.sub.PS-A is a fusion of
GFP-PTP1B.sub.PS and the C-terminal domain of full-length PTP1B; and
GFP-PTP1B.sub.PS-B is a fusion of GFP-PTP1B.sub.PS and the C-terminal
endoplasmic reticulum (ER) anchor of full-length PTP1B (see below).
GFP-PTP1B.sub.PS localizes to the cytosol and nucleus, while
GFP-PTP1B.sub.PS-A and GFP-PTP1B.sub.PS-B localize to the ER. FIG. 4D-H
shows exemplary results of cell-based studies of PTP1B.sub.PS. We
transformed Cos-7 cells with a plasmid containing (i) the FRET sensor
from FIGS. 3C-3D and (ii) PTP1B.sub.PS or PTP1B.sub.PS/C450M (a
light-insensitive mutant). In this experiment, we illuminated individual
cells with 447 nm light and immediately imaged them with 561 nm light.
Light-modulated changes in FRET ratio (as defined in FIG. 3) allowed us
to detect light-modulated changes in PTP1B activity. FIG. 4D-E shows an
exemplary Cos-7 cell transformed with PTP1B.sub.PS at two time points:
(FIG. 4D) immediately after excitation with 447 nm light and (FIG. 4E)
after 1 min. A slight increase in FRET ratio (dark green to lighter
green) evidences photoactivation of PTP1B. FIG. 4F-G shows an exemplary Cos-7 cell
transformed with PTP1B.sub.PS (C450M) at two time points: (FIG. 4F)
immediately after excitation with 447 nm light and (FIG. 4G) after 1 min. The
absence of a detectable change in FRET-ratio indicates that the change
observed in D-E results from light-induced changes in PTP1B activity.
FIG. 4H shows an exemplary average fractional change in FRET ratio
observed in the nucleus (nuc) and cytosol (cyt) after 1 min and 2.67 min.
The change is higher for PTP1B.sub.PS than for PTP1B.sub.PS(C450M), the
light-insensitive mutant. Error bars indicate standard error.
[0044] FIG. 5A-C illustrates embodiments of drug discovery.
[0045] FIG. 5A shows an exemplary use of a phosphatase, i.e. drug target
(upper left depiction of PTP1B) for identifying a synthetic enzyme (lower
right depiction) where the enzyme is then used for providing an inhibitor
or modulatory molecule for the phosphatase, thus showing a general
framework for using enzymes to build inhibitors of chosen protein
targets. FIG. 5B shows an exemplary analysis of structural relationships
between binding pockets. A matrix compares individual properties (e.g.,
volume) between binding pocket 1 and all other binding pockets (2 to n)
capable of functionalizing (e.g., P450) or binding to (e.g., PTP1B)
ligands synthesized within pocket 1. FIG. 5C shows an exemplary
comparison of the ability of binding pockets in a biosynthetic pathway to
bind to intermediates.
[0046] FIG. 6 illustrates an exemplary overlay of allosterically
inhibited (green) and competitively inhibited (orange) structures of
PTP1B (PDB entries 1t4j and 2f71, respectively) that shows activity-modulating
conformational changes: Unwinding of the .alpha.7 helix of LOV2 (blue)
causes its catalytically essential WPD loop (right) to adopt an open,
catalytically compromised conformation. Competitive (red) and allosteric
(yellow) inhibitors highlight the active site and allosteric site,
respectively.
[0047] FIG. 7A-B shows an exemplary analysis of binding affinity. FIG. 7A
shows embodiments of two binding partners of PTP1B: LMO4 and Stat3. FIG.
7B shows an exemplary binding isotherm based on binding-induced changes
in the tryptophan fluorescence of PTP1B (the ligand is TCS 401, a
competitive inhibitor).
[0048] FIG. 8A-B illustrates exemplary structures. FIG. 8A shows an exemplary
structural alignment of PTP1B (light blue) and STEP (orange), which have only
31% sequence identity yet show remarkable structural similarity. FIG. 8B illustrates an
exemplary structure of PTK6. Both STEP and PTK6 possess a C-terminal
alpha-helix that is compatible with actuation by the N-terminal helix of
LOV2 (i.e., a photomodulatory architecture similar to that depicted in
FIG. 1).
[0049] FIG. 9A-B illustrates an exemplary framework for building an enzyme
modulated by red light. We will attach the C-terminal .alpha.-helix of
PTP1B to the N-terminal .alpha.-helix of BphP1.
[0050] FIG. 10A-B illustrates an exemplary operon for screening
photoswitchable variants of PTP1B. FIG. 10A shows an exemplary
illustration where in its active state (here the far-red state), PTP1B
dephosphorylates the substrate domain, prevents substrate-SH2
association, and, thus, prevents transcription. FIG. 10B shows an
exemplary illustration where in its inactive state (here, the red state),
the phosphorylated substrate domain binds SH2, permitting transcription
of a gene for antibiotic resistance.
[0051] FIG. 11A-B illustrates an exemplary strategy for evolution of
photoswitchable proteins. FIG. 11A illustrates where we will compare the
growth of colonies on replicate plates exposed to red and infrared light
and select colonies that exhibit differential growth. FIG. 11B
illustrates where we will further characterize the photosensitivity of
top hits in liquid culture.
[0052] FIG. 12A-B illustrates an exemplary FRET-based sensor developed for
measuring intracellular phosphatase or kinase activity. Binding of the
substrate and SH2 domain either (FIG. 12A) enhance or (FIG. 12B) reduce
FRET, depending on architecture.
[0053] FIG. 13 shows a cartoon of imaging experiments. We will inactivate
PTP1B.sub.PS within subcellular regions (1-10 .mu.m) containing different
amounts of plasma membrane, ER, and cytosol, and we will use fluorescence
lifetime imaging to examine the phosphorylation state of our FRET-based
sensor (from FIG. 12) throughout the cell.
[0054] FIG. 14A-D illustrates an exemplary starting point for lead drug
design and discovery.
[0055] FIG. 14A illustrates Abietic acid. FIG. 14B demonstrates inhibition
of PTP1B by abietic acid at concentrations (dark to light) of 0-400 uM.
Analysis of different fits suggests noncompetitive or mixed-type
inhibition. FIG. 14C illustrates Abietic acid (green) docked in the
allosteric site of PTP1B. We have since shown that abietic acid binds to
the active site of PTP1B. Inset highlights active site in black. FIG. 14D
shows an exemplary X-ray crystal structure of a known allosteric
inhibitor (blue).
[0056] FIG. 15A-D illustrates exemplary synthesis of terpenoids. FIG. 15A
shows an exemplary pathway for the synthesis of terpenoids (mevalonate can be
synthesized through pMevT or added to the media). FIG. 15B shows exemplary
abietadiene titers generated by E. coli DH5a transformed with the plasmids
from A (with no P450). FIG. 15C-D shows exemplary GC-MS analysis of products
of (FIG. 15C) an abietadiene-producing strain and (FIG. 15D) an
abietic-acid-producing strain: (1) abietadiene, (2) levopimaradiene, and
(3) abietic acid (ion counts in 10,000 for FIG. 15C and 1,000 for FIG. 15D).
Note: E. coli DH5a, which avoids protein overexpression, is commonly used in
metabolic engineering.sup.4.
[0057] FIG. 16A-B illustrates exemplary terpenoids showing differences in
stereochemistry, shape, size, and chemical functionality. FIG. 16A
illustrates clockwise from abietic acid (1), neoabietic acid (2),
levopimaric acid (3), dihydroabietic acid (4). FIG. 16B shows exemplary
initial rates in PTP1B on 10 mM of p-NP phosphate in the presence of 200
uM inhibitor. No inhibitor (C). Error bars=standard error (n>5).
[0058] FIG. 17A-C shows results from exemplary studies. FIG. 17A shows
exemplary N-HSQC spectra of PTP1B (red) and PTP1B bound to abietic acid
(blue). Inset: Crystal structure of PTP1B. FIG. 17B-C shows exemplary
tryptophan (W) fluorescence of PTP1B in the presence of (FIG. 17B) culture
extract of control (ABS.sub.X), abietadiene-producing (ABS), and
abietic-acid-producing (ABS/BM3) strains and (FIG. 17C) various concentrations of
abietic acid and 25 uM of known allosteric inhibitor (BBR). Error bars
represent standard error (n>5).
[0059] FIG. 18A-B illustrates exemplary terpenoids that differ in (FIG. 18A)
stereochemistry and (FIG. 18B) shape. Inset: residues targeted for
mutagenesis in class I site of ABS.
[0060] FIG. 19A-E illustrates exemplary (FIG. 19A) carboxylated,
(FIG. 19B) hydroxylated, and (FIG. 19C) halogenated diterpenoids. FIG. 19D-E
shows exemplary residues targeted for mutagenesis in (FIG. 19D) P450 BM3
and (FIG. 19E) SttH.
[0061] FIG. 20 illustrates an exemplary WaterMap analysis of UPPS. Colors
of water molecules correspond to free energies, relative to bulk water.
[0062] FIG. 21A-E illustrates exemplary high-throughput screens for
PTP1B inhibitors.
[0063] FIG. 21A shows a growth-coupled screen (i.e., selection; strategy 1).
FIG. 21B shows a FRET sensor for PTP1B activity (strategy 2). FIG. 21C shows a
FRET sensor and FIG. 21D shows tryptophan fluorescence for changes in PTP1B
conformation (strategies 3 and 4). FIG. 21E shows results for an operon similar to that
shown in FIG. 21A, where Amp is replaced with Lux. Error bars=SD
(n.gtoreq.3).
[0064] FIG. 22A-D illustrates exemplary inhibition of PTP1B. Error bars in
FIG. 22C denote SE (n.gtoreq.3 independent reactions).
[0065] FIG. 22A shows exemplary alignments of the backbone of PTP1B in
competitively inhibited (yellow and orange, PDB entry 2F71) and
allosterically inhibited (gray and black, PDB entry 1T4J) poses. The
binding of substrates and competitive inhibitors to the active site
causes the WPD loop to adopt a closed (orange) conformation that
stabilizes the C-terminal alpha7 helix through an allosteric network;
this helix is unresolvable in allosterically inhibited, noncompetitively
inhibited, and uninhibited structures, which exhibit WPD-open
conformations (black). FIG. 22B shows an exemplary illustration of a
chemical structure of abietic acid (AA). FIG. 22C shows exemplary initial
rates of PTP1B-catalyzed hydrolysis of pNPP in the presence of increasing
concentrations of AA. Lines show a fit to a model for mixed inhibition.
FIG. 22D shows an exemplary illustration of this model, where the inhibitor (I)
binds to the enzyme (E) and enzyme-substrate complex (ES) with different
affinities.
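As a non-limiting illustration of the model of FIG. 22D, the following minimal Python sketch evaluates one common textbook rate law for mixed inhibition, v = V.sub.max[S]/(K.sub.M(1+[I]/K.sub.i)+[S](1+[I]/K.sub.i')). This form is an assumption and is not necessarily the exact expression fit in FIG. 22C. The function name, the parameter names ki_e and ki_es (hypothetical dissociation constants for inhibitor binding to E and ES, respectively), and all numerical values are likewise assumptions chosen only for illustration.

```python
def mixed_inhibition_rate(s, i, vmax, km, ki_e, ki_es):
    """Initial rate under a textbook mixed-inhibition model.

    s, i   : substrate and inhibitor concentrations (same units as km, ki_*)
    ki_e   : assumed dissociation constant for inhibitor binding to free enzyme (E)
    ki_es  : assumed dissociation constant for inhibitor binding to the ES complex
    """
    alpha = 1.0 + i / ki_e         # scales the apparent Km term
    alpha_prime = 1.0 + i / ki_es  # scales the substrate term
    return vmax * s / (alpha * km + alpha_prime * s)


# Hypothetical parameters (arbitrary units), not fitted values.
VMAX, KM, KI_E, KI_ES = 1.0, 2.0, 100.0, 400.0

rate_uninhibited = mixed_inhibition_rate(s=KM, i=0.0, vmax=VMAX, km=KM,
                                         ki_e=KI_E, ki_es=KI_ES)
rate_inhibited = mixed_inhibition_rate(s=KM, i=200.0, vmax=VMAX, km=KM,
                                       ki_e=KI_E, ki_es=KI_ES)
```

With no inhibitor the expression reduces to the Michaelis-Menten equation. Setting ki_e equal to ki_es gives pure noncompetitive inhibition, and letting ki_es grow very large approaches competitive inhibition.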
[0066] FIG. 23A-C illustrates an exemplary NMR analysis of PTP1B-AA
association.
[0067] FIG. 23A shows exemplary weighted differences in chemical shifts
(.DELTA..delta.) between .sup.1H-.sup.15N-HSQC spectra collected in the
absence and presence of AA (PTP1B:AA of 10:1). The dashed red line
delineates the threshold for values of .DELTA..delta. larger than
two standard deviations (.sigma.) above the mean; gray bars mark residues for
which chemical shifts broadened beyond recognition. FIG. 23B illustrates
an exemplary crystal structure of PTP1B (PDB entry 3A5J, gray) that highlights
the locations of assigned residues (blue); inhibitors in the allosteric
site (PDB entry 1T4J, green) and active site (PDB entry 3EB1, yellow) are
overlaid for reference. Residues with significant CSPs (i.e.,
.DELTA..delta.>.DELTA..delta..sub.mean+2.sigma.) are distributed
across the protein (red) and, with the exception of two residues in the
WPD loop, outside of known binding sites. FIG. 23C illustrates an
exemplary detail of the active site (upper panel) and known allosteric
site (lower panel) with inhibitors from (FIG. 23B) overlaid.
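As a non-limiting illustration of the significance threshold described for FIG. 23A-B (values more than two standard deviations above the mean), the following minimal Python sketch flags residues whose weighted chemical shift differences exceed that threshold. The function name, the residue labels, and the shift values are hypothetical, and the use of the population standard deviation is an assumption, since the estimator is not specified herein.

```python
from statistics import mean, pstdev


def significant_csps(shifts):
    """Return (threshold, residues) where threshold = mean + 2 * standard deviation.

    shifts maps a residue label to its weighted chemical shift difference.
    The population standard deviation is an assumption of this sketch.
    """
    values = list(shifts.values())
    threshold = mean(values) + 2 * pstdev(values)
    hits = [residue for residue, value in shifts.items() if value > threshold]
    return threshold, hits


# Hypothetical data: nine small perturbations and one large perturbation.
shifts = {f"res{n}": 0.01 for n in range(1, 10)}
shifts["res10"] = 0.20

threshold, hits = significant_csps(shifts)
```

Residues whose peaks broadened beyond recognition have no measurable shift difference and would need to be handled separately, for example by omitting them from the input.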
[0068] FIG. 24A-C illustrates an exemplary mutational analysis of the AA
binding site.
[0069] FIG. 24A illustrates an exemplary crystal structure of PTP1B (gray,
PDB entry 3A5J) that shows the location of mutations introduced at five sites:
the active site (red), the allosteric site (green), site 1 (orange), site
2 (yellow), and the L11 loop (blue). The bound configurations of BBR
(allosteric site, PDB entry 1T4J) and TCS401 (active site, PDB entry
1C83) are overlaid for reference. FIG. 24B illustrates exemplary
disruptive mutations introduced at each site. Mutations were designed to
alter the size and/or polarity of targeted residues. The mutation denoted
"YAYA" (Y152A/Y153A), which was identified in a previous study,
attenuates allosteric communication between the C-terminus and the WPD
loop. FIG. 24C illustrates exemplary fractional change in inhibition (F
in Eq. 1) caused by the mutations from (B). Five mutations distributed
across the protein reduced inhibition by AA and TCS401, but had
negligible effect on inhibition by BBR. The similar effects of most
mutations on AA and TCS401 suggest that both inhibitors bind to the
active site. Error bars denote SE (propagated from n.gtoreq.9 independent
measurements of each V in Eq. 1).
[0070] FIG. 25A-D illustrates exemplary computational analysis of AA
binding.
[0071] FIG. 25A-B illustrates exemplary results of molecular dynamics
simulations: backbone traces of PTP1B in (FIG. 25A) AA-free and (FIG. 25B)
AA-bound states. The thickness of
traces indicates the amplitude and direction of local motions (Methods).
The binding of AA increases the flexibility of the WPD, E, and L10 loops.
The WPD and L10 loops contain residues with significant CSPs (red),
suggesting consistency between the results of MD and NMR analyses. FIG.
25C illustrates an exemplary representative bound conformation of AA
(green). Upon binding to the active site, AA (i) forms a hydrogen bond
with R221 that weakens a bond between R221 and E115 and (ii) prevents the
formation of a hydrogen bond (red) between W179 and R221 that forms when
the WPD loop closes. Both effects enhance the conformational dynamics of
the WPD loop. FIG. 25D shows exemplary results of docking calculations
that are consistent with mixed-type inhibition: the binding of AA prevents the
WPD loop from closing and disrupts, but does not preclude, the binding of
pNPP (blue spheres).
[0072] FIG. 26A-C illustrates exemplary terpenoids showing differences in
stereochemistry, shape, size, and chemical functionality. FIG. 26A shows
structural analogues of abietic acid (AA): continentalic acid (CA),
isopimaric acid (IA), dehydroabietic acid (DeAA), and dihydroabietic acid
(DiAA). FIG. 26B shows that differences in degree of saturation yield pronounced
differences in potency (i.e., IC.sub.50), but not selectivity. Error bars
represent 95% confidence intervals. FIG. 26C shows binding of three of
the analogues depicted in FIG. 26A.
[0073] FIG. 27A-C illustrates an exemplary analysis of pathologically relevant mutations.
[0074] FIG. 27A illustrates an exemplary Histogram of kinetically
characterized mutations. All mutations proximal (<4 .ANG.) to five or more
network residues were "influential" (i.e., they altered k.sub.cat or
K.sub.M by >50% or had a detectable influence on inhibition);
non-consequential mutations, by contrast, had fewer neighboring network
residues. FIG. 27B illustrates an exemplary crystal structure of PTP1B
(gray, PDB entry 3A5J) that highlights the locations of influential mutations
on network residues; colors indicate whether they were introduced in
biophysical studies or found in diseases. FIG. 27C illustrates two
exemplary cumulative distribution functions that describe numbers of
network residues proximal to (i) mutations identified in diseases and
(ii) a random selection of sites. The two distributions are
indistinguishable from one another (P<0.05), suggesting that
disease-associated mutations do not occur preferentially near the
allosteric network.
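As a non-limiting illustration of the proximity criterion described for FIG. 27A (sites within 4 angstroms of five or more network residues), the following minimal Python sketch counts the network residues near a given site. It assumes that each residue is represented by a single three-dimensional coordinate, which is a simplification, since the manner of measuring distance is not specified herein. The function names and all coordinates are hypothetical.

```python
import math


def count_proximal(site, network_coords, cutoff=4.0):
    """Count network residues closer than cutoff (angstroms) to site."""
    return sum(1 for coord in network_coords if math.dist(site, coord) < cutoff)


def is_near_network(site, network_coords, cutoff=4.0, min_neighbors=5):
    """True when the site has at least min_neighbors network residues within cutoff."""
    return count_proximal(site, network_coords, cutoff) >= min_neighbors


# Hypothetical coordinates (angstroms), not taken from any crystal structure.
mutation_site = (0.0, 0.0, 0.0)
network = [(1.0, 0.0, 0.0), (0.0, 3.9, 0.0), (4.0, 0.0, 0.0), (10.0, 0.0, 0.0)]

n_neighbors = count_proximal(mutation_site, network)
near_network = is_near_network(mutation_site, network)
```

Applying count_proximal to disease-associated sites and to randomly selected sites yields the two sets of counts whose cumulative distributions are compared in FIG. 27C.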
[0075] FIG. 28A-D illustrates and shows exemplary data using a genetic
operon linking PTP activity to the output of a gene of interest (GOI).
[0076] FIG. 28A shows embodiments of Operon A. An example of the operon.
S, tyrosine substrate; P, phosphate group; cI, the 434 phage cI
repressor; RpoZ and RP.sub..omega., the omega subunit of RNA polymerase;
cI OP, the binding sequence for the 434 phage cI repressor; and RB, the
binding site for RNA polymerase (RNAP). Phosphorylation of the tyrosine
substrate (by c-Src kinase) causes binding of the substrate-RP.sub..omega. fusion to
the SH2-cI fusion; this binding event, in turn, localizes the RNA
polymerase to RB, triggering transcription of the GOI. PTP1B
dephosphorylates the substrate domain, preventing the association of
the substrate-RP.sub..omega. fusion and the SH2-cI fusion, thereby halting transcription of
the GOI. Inactivation of PTP1B, in turn, re-enables transcription of the
GOI. FIG. 28B illustrates one embodiment of a proposed medium-throughput
screen for membrane-permeable inhibitors: A strain of E. coli is
transformed with the operon and grown in the presence of small molecules;
small-molecule inhibitors of PTP1B modulate transcription of the GOI
(e.g., a gene for luminescence, fluorescence, or antibiotic resistance)
in a dose-dependent manner. The bar graph shows a predicted trend in
data. FIG. 28C shows embodiments of Operon B. An operon that enables
screens for selective inhibitors. This operon comprises operon A with (i)
a second substrate-protein fusion (red), a "decoy", that can bind to the
SH2-cI fusion but not specifically to DNA or RNA polymerase, and (ii) a
second PTP (e.g., TC-PTP) that is active on the substrate domain of the
decoy. Because complexes between the decoy and SH2-cI do not trigger
transcription, the decoy inhibits transcription by competing with
cI-substrate for binding sites. Accordingly, molecules that inhibit
PTP1B, but not TC-PTP (which dephosphorylates the decoy)--that is,
selective inhibitors--cause the greatest transcriptional activation.
Molecules that inhibit both enzymes, by contrast, cause less activation.
FIG. 28D shows embodiments of Operon C. This operon enables screens for
photoswitchable enzymes. This operon comprises a version of operon A in
which PTP1B has been replaced with a photoswitchable version of PTP1B. In
this case, transcription of the GOI is different (e.g., higher or lower)
under different sources of light. In the example shown, light inhibits
the activity of a PTP1B-LOV2 chimera and, thus, enhances transcription of
the GOI.
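As a non-limiting illustration of the qualitative logic of Operon A and Operon C described above, the following minimal Python sketch treats each component as either active or inactive. This Boolean treatment is an assumption made for clarity: the transcriptional response described herein is dose-dependent rather than all-or-nothing. The function names and scenario labels are hypothetical.

```python
def ptp_is_active(ptp_present, inhibitor_present=False,
                  photoswitchable=False, light_on=False):
    """Qualitative phosphatase state (Boolean simplification)."""
    if not ptp_present or inhibitor_present:
        return False
    if photoswitchable and light_on:
        # In the example shown for Operon C, light inhibits the PTP1B-LOV2 chimera.
        return False
    return True


def goi_transcribed(kinase_active, ptp_active):
    """The GOI is transcribed only while the substrate domain stays phosphorylated."""
    substrate_phosphorylated = kinase_active and not ptp_active
    # The phosphorylated substrate fusion binds the SH2-cI fusion and
    # localizes RNA polymerase to its binding site upstream of the GOI.
    return substrate_phosphorylated


scenarios = {
    "Operon A, no inhibitor": goi_transcribed(True, ptp_is_active(True)),
    "Operon A, inhibitor": goi_transcribed(
        True, ptp_is_active(True, inhibitor_present=True)),
    "Operon C, dark": goi_transcribed(
        True, ptp_is_active(True, photoswitchable=True, light_on=False)),
    "Operon C, light": goi_transcribed(
        True, ptp_is_active(True, photoswitchable=True, light_on=True)),
}
```

In this simplified picture, any condition that inactivates the phosphatase (a small-molecule inhibitor, or light in the case of the PTP1B-LOV2 chimera) switches the GOI on, which is the basis for coupling inhibition to luminescence, fluorescence, or antibiotic resistance.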
[0077] FIG. 29A-B shows exemplary Preliminary results showing
phosphorylation-dependent expression of a GOI. FIG. 29A shows one
embodiment of Operon A in which the GOI is a bacterial luciferase
(LuxAB). PTP1B inhibits luminescence (i.e., reduces transcription of the
GOI), while a catalytically inactive version of PTP1B (a mimic for an
inhibited version of PTP1B) enhances luminescence. FIG. 29B shows one
embodiment of Operon A in which the GOI is a gene for spectinomycin
resistance (SpecR). PTP1B inhibits growth on spectinomycin, while a
catalytically inactive version of PTP1B (a mimic for an inhibited version
of PTP1B) enhances growth. The MidT substrate is used herein.
[0078] FIG. 30A-B illustrates Optimization of operon.
[0079] FIG. 30A shows one embodiment of operon from A in which the GOI is
a bacterial luciferase (LuxAB), the PTP1B is missing, and the substrate
is a peptide from Kras, midT, ShcA, or EGFR. Although all substrates can
be phosphorylated by Src kinase, only two substrates bind to the SH2
domain tightly enough to enable significant luminescence over background
(0% arabinose). FIG. 30B shows one embodiment of operon from A (here,
contained on a single plasmid) in which the GOI is a bacterial luciferase
(LuxAB) and PTP1B is missing. The Y/F mutation on the substrate domain
(blue) prevents it from being phosphorylated. The RBS sites toggle
expression of the Src kinase.
[0080] FIG. 31A-D illustrates Applications of operons.
[0081] FIG. 31A illustrates an exemplary conceptualization of a screen for
microbially synthesizable inhibitors of PTP1B. When transformed with one
embodiment of Operon A (or operon B), a cell capable of synthesizing
PTP1B-inhibiting metabolites will produce a different GOI output than a
cell that does not produce such metabolites. Because abietane-type
diterpenoids can both (i) inhibit PTP1B and (ii) be synthesized in E.
coli, we believe that a strain of E. coli that contains both Operon A and
a pathway for building abietane-type diterpenoids could be "evolved" to
build inhibitors of PTP1B. Here, the GOI could be a gene for luminescence
or fluorescence (low throughput) or antibiotic resistance (high
throughput). FIG. 31B illustrates an exemplary conceptualization of a
screen for photoswitchable enzymes. Consider a fusion of PTP1B to LOV2 or
BphP1 (here, the highlighted helices show N-terminal connection points on
these two proteins). For this example, illumination of the PTP1B-LOV2
with 455 nm light reduces its activity; illumination of the PTP1B-BphP1
fusion with 650 nm light reduces its activity, while illumination of the
PTP1B-BphP1 fusion with 750 nm light enhances its activity. When
transformed with operon C (which would contain one of these fusions), a
cell will produce a different GOI output under different illumination
conditions. FIG. 31C illustrates an exemplary conceptualization of a
screen for selective mutants of enzymes. When transformed with a version
of operon B where (i) PTP1B is also active on the decoy and (ii) the
second PTP (TC-PTP in our example) is missing, a cell containing a mutant
of PTP1B will most effectively transcribe the GOI when PTP1B is only
active on the decoy substrate. FIG. 31D illustrates an exemplary
conceptualization of a screen for selective substrates. When transformed
with a version of operon B where (i) the decoy is missing, (ii) the first
enzyme (PTP1B in our example) is under an inducible promoter, (iii) a
second PTP (TC-PTP in our example) is under a second inducible promoter,
and (iv) the GOI includes a gene for antibiotic resistance and a gene
that produces a toxic product in the presence of a non-essential
substrate, a cell containing a mutated substrate domain will grow under
both condition 1 (inducer of PTP1B and non-essential substrate) and
condition 2 (inducer of TC-PTP), when it binds to PTP1B, but not to
TC-PTP.
[0082] FIG. 32A-B presents exemplary evidence of an evolutionarily
conserved allosteric network.
[0083] FIG. 32A refers to exemplary results of a statistical coupling
analysis. The orange and blue clusters represent two groups of
interconnected residues, termed "sectors", that exhibit strong intragroup
correlations in nonrandom distributions of amino acids. The allosteric
site (green inhibitor, PDB entry 1T4J), WPD loop (purple spheres), and
active site (red inhibitor, 3EB1) are highlighted for reference. FIG. 32B
refers to an exemplary analysis of crosstalk between pockets of PTP1B
modeled with MD simulations. Pockets are represented as spheres, colored
according to their persistency along the MD trajectory; the size of each
sphere indicates its average volume in MD simulations. Links have
thicknesses proportional to the frequency of inter-pocket merging and
splitting events (i.e., communication). Two independent sets of
interconnected pockets map closely to the sectors identified in SCA and,
thus, suggest that these two sectors represent distinct domains of an
evolutionarily conserved allosteric network. In the PTP1B-LOV2 fusions of
FIG. 1, LOV2 modulates the activity of PTP1B by tapping into the
allosteric network defined by sector A. Identification of sector A with a
statistical coupling analysis of the PTP family thus indicates that the
architecture for photocontrol described in FIG. 1 is broadly applicable
to all protein tyrosine phosphatases.
[0084] FIG. 33A-E illustrates an embodiment of a genetically encoded
system that links the activity of an enzyme to the expression of a gene
of interest (GOI). Error bars in FIG. 33B-E denote standard deviation
with n=3 biological replicates.
[0085] FIG. 33A illustrates an embodiment of a bacterial two-hybrid system
that detects phosphorylation-dependent protein-protein interactions.
Components include (i) a substrate domain fused to the omega subunit of
RNA polymerase (yellow), (ii) an SH2 domain fused to the 434 phage cI
repressor (light blue), (iii) an operator for 434cI (dark green), (iv) a
binding site for RNA polymerase (purple), (v) Src kinase, and (vi) PTP1B.
Src-catalyzed phosphorylation of the substrate domain enables a
substrate-SH2 interaction that activates transcription of a gene of
interest (GOI, black). PTP1B-catalyzed dephosphorylation of the substrate
domain prevents that interaction; inhibition of PTP1B re-enables it. FIG.
33B refers to an embodiment of the two-hybrid system from FIG. 33A that
(i) lacks PTP1B and (ii) contains luxAB as the GOI. We used an inducible
plasmid to increase expression of specific components; overexpression of
Src enhanced luminescence. FIG. 33C refers to an embodiment of the
two-hybrid system from FIG. 33A that (i) lacks both PTP1B and Src and
(ii) includes a "superbinder" SH2 domain (SH2*, i.e., an SH2 domain with
mutations that enhance its affinity for phosphopeptides), a variable
substrate domain, and LuxAB as the GOI. We used an inducible plasmid to
increase expression of Src; luminescence increased most prominently for
p130cas and MidT, suggesting that Src acts on both substrate domains.
FIG. 33D refers to an embodiment of a two-hybrid system from FIG. 33C
with one of two substrates: p130cas or MidT. We used a second plasmid to
overexpress either (i) Src and PTP1B or (ii) Src and an inactive variant
of PTP1B (C215S). The difference in luminescence between systems
containing PTP1B or PTP1B (C215S) was greatest for MidT, suggesting that
PTP1B acts on this substrate. Right: An optimized version of the
two-hybrid system (with bb030 as the RBS for PTP1B) appears for
reference. FIG. 33E displays the results of an exemplary growth-coupled
assay performed using an optimized B2H including SH2*, a midT substrate,
optimized promoters and ribosome binding sites (bb034 for PTP1B), and
SpecR as the GOI. This system is illustrated at the top of the figure.
Exemplary growth results demonstrate that inactivation of PTP1B enables a
strain of E. coli harboring this system to survive at high concentrations
of spectinomycin (>250 .mu.g/ml).
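The switching behavior described for FIG. 33A can be summarized, in a highly idealized form, as Boolean logic. The following Python sketch is illustrative only: the function name and its arguments are hypothetical, and it assumes all-or-nothing behavior, whereas the luminescence and growth responses described above are graded.

```python
def goi_transcribed(src_present: bool, ptp1b_present: bool,
                    ptp1b_inhibited: bool = False) -> bool:
    """Idealized on/off model of the bacterial two-hybrid (B2H) system.

    Assumption: each component either acts completely or not at all.
    """
    # PTP1B dephosphorylates the substrate only if it is present and not inhibited.
    phosphatase_acting = ptp1b_present and not ptp1b_inhibited
    # The substrate stays phosphorylated when Src acts and PTP1B does not.
    substrate_phosphorylated = src_present and not phosphatase_acting
    # A phosphorylated substrate binds the SH2 domain and activates the GOI.
    return substrate_phosphorylated


if __name__ == "__main__":
    for src in (False, True):
        for ptp in (False, True):
            for inhibited in (False, True):
                print(src, ptp, inhibited, goi_transcribed(src, ptp, inhibited))
```

In this simplified view, the GOI is transcribed when Src is present and PTP1B is either absent or inhibited, which matches the qualitative description of FIG. 33A.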
[0086] FIG. 34 illustrates exemplary experiments used to optimize the B2H
system depicted in FIG. 33.
[0087] FIG. 35 illustrates exemplary experiments used to optimize the B2H
system depicted in FIG. 33 for growth-coupled assays.
[0088] FIG. 36A-C depicts an exemplary metabolic pathway for the
biosynthesis of terpenoids.
[0089] FIG. 36A depicts a plasmid-borne pathway for terpenoid
biosynthesis: (i) pMBIS, which harbors the mevalonate-dependent
isoprenoid pathway of S. cerevisiae, converts mevalonate to isopentenyl
pyrophosphate (IPP) and farnesyl pyrophosphate (FPP). (ii) pTS, which
encodes a terpene synthase (TS) and, when necessary, a geranylgeranyl
diphosphate synthase (GGPPS), converts IPP and FPP to sesquiterpenes
and/or diterpenes.
[0090] FIG. 36B depicts exemplary terpene synthases: amorphadiene synthase
(ADS) from Artemisia annua, .gamma.-humulene synthase (GHS) from Abies
grandis, abietadiene synthase (ABS) from Abies grandis, and taxadiene
synthase (TXS) from Taxus brevifolia.
[0091] FIG. 36C shows the results of an exemplary growth-coupled assay of a
strain of E. coli that contains both (i) an embodiment of the optimized
bacterial two-hybrid (B2H) system (i.e., the B2H system from FIG. 33E)
and (ii) an embodiment of a pathway for terpenoid biosynthesis (i.e., the
pathway from FIG. 36A).
[0092] FIG. 37A-C provides an exemplary analysis of the inhibitory effects
of terpenoids generated by different strains of E. coli.
[0093] FIG. 37A depicts the results of our analysis of the inhibitory
effect of DMSO containing (i) no inhibitor and (ii) extracted compounds
from the culture broth of the ADS-containing strain. FIG. 37B depicts the
results of our analysis of the inhibitory effect of DMSO containing (i)
extracted compounds from the culture broth of the GHS-containing strain
(gHUM) or (ii) extracted compounds from the culture broth of the strain
including the L450Y mutant of GHS. FIG. 37C depicts the results of our
analysis of the inhibitory effect of DMSO containing (i) no inhibitor,
(ii) extracted compounds from the culture broth of the ABS-containing
strain, (iii) extracted compounds from the culture broth of the
TXS-containing strain, and (iv) extracted compounds from the culture
broth of the strain containing a catalytically inactive variant of
ABS.
[0094] FIG. 38 shows exemplary analysis of the product profiles of mutants
of GHS that enabled growth in the presence of spectinomycin.
[0095] FIG. 39 shows an analysis of exemplary B2H systems that link the
inhibition of other PTPs to cell survival.
[0096] FIG. 40A-E depicts exemplary embodiments of genetically encoded
systems that link the activity of an enzyme to the expression of a gene
of interest, and the application of those embodiments to (i) the
prediction of resistance mutations, (ii) the construction of inhibitors
that combat resistance mutations, and (iii) the evolution of inhibitors of
kinases.
[0097] FIG. 40A depicts an exemplary first step in examining potential
resistance mutations. We will evolve a metabolic pathway to produce
molecules that inhibit a known drug target (e.g., PTP1B); these molecules
will permit expression of a gene of interest (GOI) that confers survival
in the presence of a selection pressure (e.g., the presence of
spectinomycin, an antibiotic). FIG. 40B depicts an exemplary second step
in examining potential resistance mutations. In a second strain of E.
coli, we will replace the original gene of interest with a second (GOI2)
that confers conditional toxicity (e.g., SacB, which converts sucrose to
levan, a toxic product); we will evolve the drug target to become
resistant to the endogenous inhibitors, while still retaining its
activity. This mutant will prevent expression of the toxic gene. FIG. 40C
depicts an exemplary third step in combating resistance mutations. In a
third strain of E. coli, we will evolve a metabolic pathway that produces
molecules that inhibit the mutated drug target. In this way, we will both
predict--and, through our second evolved pathway, address--mutations that
might cause resistance to terpenoid-based drugs. FIG. 40D depicts an
exemplary genetically encoded system that detects inhibitors of an Src
kinase. In brief, Src activity enables expression of a toxic gene (GOI2);
inhibition of Src, in turn, would confer survival. FIG. 40E demonstrates
one embodiment of a proof of principle for the B2H system described in FIG.
40B. The system shown here includes two GOIs: SpecR and SacB. Expression
of the GOIs confers survival in the presence of spectinomycin; expression
of the GOIs causes toxicity in the presence of sucrose. The images depict
the results of a growth-coupled assay performed on a strain of E. coli in
the presence of various concentrations of sucrose. The strain harboring
an active form of PTP1B (WT) grows better at high sucrose concentrations
than the strain harboring an inactive form of PTP1B (C215S).
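The dual selection described for FIG. 40E can be sketched as simple logic. This Python example is a hypothetical, all-or-nothing illustration (the function names are not part of the disclosure); it assumes that Src is present, that active PTP1B fully blocks expression of the GOIs, and that growth is binary rather than graded.

```python
def gois_expressed(ptp1b_active: bool) -> bool:
    """Assumption: with Src present, active PTP1B dephosphorylates the
    substrate and prevents expression of the GOIs (SpecR and SacB)."""
    return not ptp1b_active


def survives(ptp1b_active: bool, spectinomycin: bool, sucrose: bool) -> bool:
    """Idealized survival under the two selection pressures."""
    expressed = gois_expressed(ptp1b_active)
    # SpecR is required for survival in the presence of spectinomycin.
    if spectinomycin and not expressed:
        return False
    # SacB converts sucrose to a toxic product when expressed.
    if sucrose and expressed:
        return False
    return True


if __name__ == "__main__":
    # Wild-type (active) PTP1B versus the inactive C215S variant on sucrose.
    print("WT on sucrose:", survives(True, spectinomycin=False, sucrose=True))
    print("C215S on sucrose:", survives(False, spectinomycin=False, sucrose=True))
```

Under these assumptions, the strain with active PTP1B tolerates sucrose while the strain with inactive PTP1B does not, consistent with the qualitative trend described for FIG. 40E.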
[0098] FIG. 41A-D depicts an exemplary strategy for the evolution of
inhibitors of PTP1B.
[0099] FIG. 41A depicts an exemplary structural analysis used to identify
targets for mutagenesis in the active sites of terpene synthases. It
shows an alignment of the class I active site of ABS (gray, PDB entry
3s9v) and TXS (blue, PDB entry 3p5r) with the locations of sites targeted
for site-saturation mutagenesis (SSM) highlighted on ABS (red). A
substrate analogue (yellow) of TXS appears for reference. FIG. 41B
depicts an exemplary strategy for introducing diversity into libraries of
metabolic pathways: An iterative combination of SSM of key sites on a
terpene synthase (as in FIG. 41A), error-prone PCR (ePCR) of the entire terpene
synthase gene, SSM of sites on a terpene-functionalizing enzyme (e.g.,
P450), and ePCR of the entire terpene-functionalizing enzyme. FIG. 41C
depicts an exemplary quantification of the total terpenoids present in DMSO
samples with extracts of various TS-containing strains. In brief, we
performed site-saturation mutagenesis of six sites on ADS (analogous to
the sites shown in FIG. 41A); we plated the SSM library on agar plates
containing different concentrations of spectinomycin; we picked colonies
that grew on a plate containing a high concentration (800 .mu.g/ml) of
spectinomycin and used each colony to inoculate a separate culture; we
used a hexane overlay to extract the terpenoids secreted into each
culture broth; we dried the hexane extract in a rotary evaporator and
re-suspended the solid in DMSO; and we used a GC-MS to quantify the total
amount of terpenoids present in the DMSO. FIG. 41D depicts an exemplary
analysis of the inhibitory effect of various extracts on PTP1B. In brief,
the figure shows initial rates of PTP1B-catalyzed hydrolysis of
p-nitrophenyl phosphate (pNPP) in the presence of terpenoids quantified
in FIG. 41C. Two mutants of ADS (G439A and G400L) generate particularly
potent inhibitors of PTP1B.
[0100] FIG. 42 depicts an exemplary analysis of the link between B2H
activation and cell survival. An exemplary strain of E. coli that
contains both (i) the optimized bacterial two-hybrid (B2H) system (FIG.
33E) and (ii) the terpenoid pathway depicted in FIG. 36A. Note: pTS
includes GGPPS only when ABS or TXS are present; the "Y/F" operon
corresponds to a B2H system in which the substrate domain cannot be
phosphorylated. Survival at high concentrations of spectinomycin requires
activation of the B2H system (i.e., phosphorylation of the substrate
domain, a process facilitated by inhibition of PTP1B).
[0101] FIG. 43 provides exemplary product profiles of strains of E. coli
harboring various terpene synthases. For this figure, the strain of E.
coli harbored (i) the optimized B2H system (FIG. 33E) and (ii) the
terpenoid pathway (FIG. 36A). The pathways corresponding to each profile
differ only in the composition of the pTS plasmid, which contains TXS
(taxadiene synthase from Taxus brevifolia and a geranylgeranyl
diphosphate synthase from Taxus canadensis); GHS (.gamma.-humulene
synthase from Abies grandis); ADS (amorphadiene synthase from Artemisia
annua); ABS (abietadiene synthase from Abies grandis and a geranylgeranyl
diphosphate synthase from Taxus canadensis); G400A (the G400A mutant of
amorphadiene synthase from Artemisia annua); and G439L (the G439L mutant
of amorphadiene synthase from Artemisia annua). Note that the two mutants
of ADS yield different product profiles than the wild-type enzyme (ADS);
our results indicate that products generated by these two mutants are more
inhibitory than those generated by the wild-type enzyme (FIG. 41E).
[0102] FIG. 44A-D provides exemplary structural and sequence-based
evidence that supports the extension of the B2H system to other protein
tyrosine phosphatases (PTPs).
[0103] FIG. 44A provides an exemplary structural alignment of PTP1B and
PTPN6, two PTPs that are compatible with the B2H system (see FIGS. 1e and
7 of Update A for evidence of compatibility). We used the align function
of PyMol to align each structure of PTPN6 with either (i) the ligand-free
(3A5J) or (ii) ligand-bound (2F71) structure of the catalytic domain of
PTP1B. The align function carries out a sequence alignment followed by a
structural superposition and, thus, effectively aligns the catalytic
domains of both proteins. FIG. 44B provides an exemplary structural
comparison of PTP1B and PTPN6; the root-mean-square deviations (RMSD) of
aligned structures of PTP1B and PTPN6 range from 0.75 to 0.94 .ANG.. FIG.
44C provides an exemplary sequence alignment of the catalytic domains of
PTP1B and PTPN6 (EMBOSS Needle.sup.1) (SEQ ID NOS: 3 and 4,
respectively). FIG. 44D provides an exemplary sequence comparison of the
catalytic domains of PTP1B and PTPN6. The sequences share 34.1% sequence
identity and 53.5% sequence similarity. In summary, the results of this
figure indicate that our B2H system can be readily extended to PTPs that
possess catalytic domains that are (i) structurally similar to the
catalytic domain of PTP1B (here, we define structural similarity as two
structures that, when aligned, have an RMSD of .ltoreq.0.94 .ANG.
using a framework similar to the one used by the align function of
PyMol) and/or (ii) sequence similar to the catalytic domain of PTP1B
(here, we define sequence similarity as .gtoreq.34% sequence identity or
.gtoreq.53.5% sequence similarity as defined by the EMBOSS Needle
algorithm).
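The similarity criteria stated above can be expressed as a short check. The following Python sketch is illustrative only: the function name, argument names, and constants are hypothetical, and the RMSD, identity, and similarity values are assumed to have been computed beforehand (e.g., with the align function of PyMol and with EMBOSS Needle); the sketch does not perform any alignment itself.

```python
from typing import Optional

# Thresholds taken from the criteria stated in the text above.
RMSD_MAX_ANGSTROM = 0.94
IDENTITY_MIN_PERCENT = 34.0
SIMILARITY_MIN_PERCENT = 53.5


def meets_extension_criteria(rmsd: Optional[float] = None,
                             identity_pct: Optional[float] = None,
                             similarity_pct: Optional[float] = None) -> bool:
    """Return True if a PTP catalytic domain satisfies criterion (i) and/or (ii).

    Values left as None are treated as unavailable and cannot satisfy a criterion.
    """
    # (i) Structural similarity: RMSD at or below the threshold.
    structural = rmsd is not None and rmsd <= RMSD_MAX_ANGSTROM
    # (ii) Sequence similarity: identity OR similarity at or above its threshold.
    by_identity = identity_pct is not None and identity_pct >= IDENTITY_MIN_PERCENT
    by_similarity = (similarity_pct is not None
                     and similarity_pct >= SIMILARITY_MIN_PERCENT)
    return structural or by_identity or by_similarity


if __name__ == "__main__":
    # Values reported above for PTPN6 relative to PTP1B.
    print(meets_extension_criteria(rmsd=0.94, identity_pct=34.1,
                                   similarity_pct=53.5))
```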
DEFINITIONS
[0104] As used herein, the use of the term "operon" may refer to a cluster
of genes under the control of a single promoter (as in a classical
definition of an operon) and may also refer to a genetically encoded
system comprising multiple operons (e.g., the bacterial two-hybrid
system).
[0105] As used herein, "phosphorylation-regulating enzymes" refer to
proteins that regulate phosphorylation.
[0106] As used herein, "phosphorylation" refers to a biochemical process
that involves the addition of phosphate to an organic compound.
[0107] As used herein, "optogenetic actuator" refers to a genetically
encodable protein that undergoes light-induced changes in conformation.
[0108] As used herein, "dynamic range" refers to the ratio of activity in
dark and light state (i.e., the initial rate in the dark/the initial rate
in the presence of 455 nm light).
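As a minimal illustration of the definition of dynamic range given above, the following Python sketch computes the ratio of two initial rates. The function name and the example values are hypothetical and are not measured data; both rates are assumed to be expressed in the same units.

```python
def dynamic_range(initial_rate_dark: float, initial_rate_light: float) -> float:
    """Ratio of the initial rate in the dark to the initial rate under 455 nm light."""
    if initial_rate_light <= 0:
        raise ValueError("initial rate under illumination must be positive")
    return initial_rate_dark / initial_rate_light


if __name__ == "__main__":
    # Hypothetical rates: activity in the dark is twice the activity in the light.
    print(dynamic_range(2.0, 1.0))
```

A value greater than 1 under this definition corresponds to an enzyme that is less active under 455 nm light than in the dark.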
[0109] As used herein, "operon" refers to a unit made up of multiple genes
that regulate other genes responsible for protein synthesis.
[0110] As used herein, "operably linked" refers to one or more genes (i.e.
DNA sequences) suitably positioned and oriented in a DNA molecule for
transcription to be initiated from the same promoter. DNA sequences that
are operably linked to a promoter means that expression of the DNA
sequence(s) is under transcriptional initiation regulation of the
promoter.
[0111] As used herein, "construct" refers to an engineered molecule, e.g.
ligated pieces of DNA as a DNA construct; an RNA construct as one
contiguous sequence resulting from the expression of a DNA construct.
[0112] As used herein, "fusion" refers to an expressed product of an
engineered construct i.e. a combination of several ligated sequences as
one molecule or a single gene that encodes for a protein-protein fusion
originally encoded by two genes.
[0113] As used herein, "expression vector" or "expression construct"
refers to an operon, plasmid or virus designed for DNA expression of a
construct in host cells, typically containing a promoter sequence
operable within the host cell.
[0114] As used herein, "promoter" refers to a region of DNA that initiates
transcription of a particular DNA sequence. Promoters are located near
the transcription start sites of, towards the 5' region of the sense
strand. Promoters may be constitutive promoters, such as cytomegalovirus
(CMV) promoters in mammalian cells, or inducible promoters, such as
tetracycline-inducible promoters in mammalian cells.
[0115] As used herein, "transformation" refers to a foreign nucleic acid
sequence or plasmid delivery into a prokaryotic host cell, for example,
an expression plasmid (e.g. a plasmid expression construct) inserted into
or taken up by a host cell.
[0116] As used herein, "transfection" refers to the insertion of a nucleic
acid sequence into a eukaryotic cell.
[0117] Transformation and transfection may be transient, such that the
nucleic acid sequence or plasmid introduced into the host cell is not
permanently incorporated into the cellular genome. A stable
transformation and transfection refers to a host cell retaining the
foreign nucleic acid sequence or plasmid for multiple generations
regardless of whether the nucleic acid or plasmid was integrated into the
genome of the host cell.
[0118] As used herein, "host" in reference to a cell refers to a cell
intended for receiving a nucleic acid sequence or plasmid or already
harboring a nucleic acid sequence or plasmid, e.g., a bacterium.
[0119] As used herein, "conjugate" refers to a covalent attachment of at
least two compounds, for example, a photosensing element attached to a
phosphatase protein.
[0120] As used herein, "decoy" refers to a protein construct that
cannot bind to DNA and/or RNA polymerase.
DETAILED DESCRIPTION OF INVENTION
[0121] This invention relates to the field of genetic engineering.
Specifically, the invention relates to the construction of operons to
produce biologically active agents. For example, operons may be
constructed to produce agents that control the function of biochemical
pathway proteins (e.g., protein phosphatases, kinases and/or proteases).
Such agents may include inhibitors and modulators that may be used in
studying or controlling phosphatase function associated with
abnormalities in a phosphatase pathway or expression level. Fusion
proteins, such as light activated protein phosphatases, may be
genetically encoded and expressed as photoswitchable phosphatases.
Systems are provided for use in controlling phosphatase function within
living cells or in identifying small molecule
inhibitors/activator/modulator molecules of protein phosphatases
associated with cell signaling.
[0122] The invention also relates to the assembly of genetically encoded
systems (e.g., one or more operons) for detecting and/or constructing
biologically active agents. For example, systems may be assembled in
order to accomplish one or more goals, e.g. (i) to detect and/or
synthesize small molecules that affect the activity of regulatory enzymes
(e.g., protein phosphatases, kinases, and/or proteases); (ii) to detect
and/or evolve regulatory enzymes modulated by light (e.g.,
light-responsive protein phosphatases, kinases, or proteases), etc. Small
molecule modulators may include inhibitors of phosphatases known to be
associated with human diseases or implicated with causing or perpetuating
human diseases; activators of phosphatases implicated or known to be
associated in human diseases (e.g., diabetes, obesity, and cancer); such
small molecules may serve as chemical probes in studies of cell
signaling; as structural starting points (i.e., leads); etc., for the
development of pharmaceutical compounds for use in treating a human
disease. Light-sensitive enzymes may include protein tyrosine
phosphatases fused to optogenetic actuators (e.g., a LOV domain of
phototropin 1). Such fusions could serve as tools for exerting
spatiotemporal control over protein tyrosine phosphorylation in living
cells.
[0123] Further, microbial operons are provided that are designed for use
in identifying either small molecule inhibitors, activators, or modulator
molecules, photoswitchable enzymes, or biological components, including
intracellularly expressed molecules, including, for example, operons
having components for use in whole cell microbial screening assay
systems. Inhibitors/modulator molecules discovered using compositions,
systems and methods described herein are contemplated for use in treating
diseases such as diabetes, type II diabetes, obesity, cancer, and
Alzheimer's disease, among other disorders associated with protein
phosphatase enzymes.
[0124] In one embodiment, the present invention relates to a Protein
tyrosine phosphatase 1B (PTP1B). PTP1B represents a valuable starting
point for this study for four reasons: (i) It is implicated in
diabetes.sup.5, obesity.sup.6, cancer.sup.30, anxiety.sup.31,
inflammation.sup.32, the immune response, and neural specification in
embryonic stem cells.sup.33, (ii) The mechanisms underlying its
subcellular localization are well understood (a short C-terminal anchor
connects it to the ER; proteolysis of this anchor releases it to the
cytosol).sup.29,34. (iii) It can be expressed, purified, and assayed with
ease.sup.35, (iv) It is a member of a class of structurally similar
enzymes (PTPs) that could facilitate the rapid extension of architectures
for making it photoswitchable. PTP1B represents both an experimentally
tractable model system for testing strategies for optical control, and an
enzyme for which optical modulation is contemplated to permit detailed
analyses of a wide range of diseases and physiological processes.
[0125] Specifically related to exemplary Figures: FIGS. 1, 2, 3, 4, 8, 9,
12, and 13 describe optogenetic and imaging technologies (i.e.,
light-sensitive enzymes and genetically encodable biosensors) that could
be evolved, improved, or optimized with the operon; FIGS. 10 and 11
describe strategies for using the operon to evolve, improve, or optimize
light-sensitive enzymes; FIGS. 5, 6, 14, 15, 16, 17, 18, 19, 20, 28, 29,
30, and 31 support both (i) the development of an operon for detecting
and/or evolving small molecules that inhibit known drug targets and (ii)
the subsequent characterization of those molecules; FIGS. 22, 23, 24, 25,
26, 27, and 32 provide examples of kinetic and biophysical
characterizations of a microbially synthesizable molecule that inhibits
PTP1B.
I. Protein Tyrosine Phosphatases (PTPs) and Protein Tyrosine Kinases
(PTKs) in Relation to Disease.
[0126] Protein tyrosine phosphatases (PTPs) and protein tyrosine kinases
(PTKs) are two classes of enzymes contributing to anomalous signaling
events in a wide range of diseases (e.g., diabetes, cancer,
atherosclerosis, and Alzheimer's disease, among others) and understanding
disease progression.sup.14,36. Further, they are involved with regulating
memory, fear, appetite, energy expenditure, and metabolism, thus use of
such phosphorylation regulating enzymes may reveal links between
seemingly disparate physiological processes.sup.14,22,13.
[0127] Embodiments of photoswitchable constructs that use light to
control PTPs and PTKs are described herein. Accordingly, examples of
photoswitchable constructs of PTPs and PTKs developed as described
herein, should be broadly useful to biomedical researchers interested in
understanding how healthy and diseased cells process chemical signals in
addition to use for identifying specific alleles of PTPs and/or PTKs
(i.e. gene sequences or proteins)--or other enzymes that they
regulate--linked to specific diseases, such as diabetes, etc., including
subtypes of diseases, i.e. early onset, late onset, etc., and specific
types of cancer, and for screening and testing molecules, including small
molecules, for treating diseases associated with these alleles.
[0128] Although other references describe photocontrol of proteins,
including using LOV2 conjugates, these references do not mention using
phosphatases. Fan, et al., "Optical Control Of Biological Processes By
Light-Switchable Proteins." Wiley Interdiscip Rev Dev Biol. 4(5):
545-554. 2015. This reference describes blue light-oxygen-voltage-sensing
(LOV) domains including the LOV2 C-terminal .alpha.-helix, termed
J.alpha., from Avena sativa phototropin. Linkage to the LOV domain can
cage a protein of interest (POI), while light-induced conformational
change in the LOV domain results in its uncaging. As one example, peptide
kinase inhibitors can be caged by fusion to the C-terminus of LOV2.
Exposure to light results in uncaging of the inhibitors for light
modulating protein kinase activities in cells. WO2011133493. "Allosteric
regulation of kinase activity." Published Oct. 27, 2011. This reference
describes fusion proteins comprising a kinase, including as examples, a
tyrosine kinase (Src), a serine/threonine kinase (p38), and a ligand
binding domain, e.g. a light-regulated LOV domain (where illumination is
considered "ligand binding), inserted in the N-terminal and/or C-terminal
end or near the catalytic domain to produce allosteric regulation using a
light-dependent kinase. Further, a LOV domain includes a LOV2 domain
and/or J.alpha. domain from A. sativa phototropin I. WO2012111772 (A1) In
Japanese with an English abstract. This reference abstract describes a
polypeptide for the optical control of calcium signaling comprising an
amino acid sequence including: a LOV2 domain composed of SEQ ID NO: 1 or
an amino acid sequence having at least 80% sequence identity with SEQ ID
NO: 1. The construct has a LOV2 domain followed by a LOV2-Jalpha optical
switch at the N terminus of the construct. U.S. Pat. No. 8,859,232.
"Genetically encoded photomanipulation of protein and peptide activity."
Issued 10-14-2014. This reference describes fusion proteins comprising
protein light switches and methods of photomanipulating the activity of
the fusion proteins to study protein function and analyze subcellular
activity, as well as diagnostic and therapeutic methods. More
specifically, a fusion protein comprising a protein of interest fused to
a protein light switch comprising a light, oxygen or voltage (LOV2)
domain of Avena sativa (oat) phototropin 1, wherein illumination of the
fusion protein activates or inactivates the protein of interest. The
protein of interest is a functional domain of a human protein. As an
example, a LOV2-J.alpha. sequence of phototropin1 (404-547) was fused to
the N-terminus of Rac1 so that the LOV domain in its closed conformation
would reversibly block the binding of effectors to Rac1.
[0129] A. Protein Tyrosine Phosphatases (PTPs).
[0130] Protein tyrosine phosphatases (PTPs) are a class of regulatory
enzymes that exhibit aberrant activities in a wide range of diseases. A
detailed mapping of allosteric communication in these enzymes could,
thus, reveal the structural basis of physiologically relevant--and,
perhaps, therapeutically informative--perturbations (i.e., mutations,
post-translational modifications, or binding events) that influence their
catalytic states. This study combines detailed biophysical studies of
protein tyrosine phosphatase 1B (PTP1B) with large-scale bioinformatic
analyses to examine allosteric communication in PTPs. Results of X-ray
crystallography, molecular dynamics simulations, and sequence-based
statistical analyses indicate that PTP1B possesses a broadly distributed
allosteric network that is evolutionarily conserved across the PTP
family, and findings from kinetic studies show that this network is
functionally intact in sequence-diverse PTPs. The allosteric network
resolved in this study reveals new sites for targeting allosteric
inhibitors of PTPs and helps explain the functional influence of a
diverse set of disease-associated mutations.
[0131] In one embodiment, a tyrosine phosphatase and photosensitive
protein as described herein may be attached to a drug for use in medical
treatments. This is in contrast to EP2116263, "Reversibly light-switchable
drug-conjugates," published Nov. 11, 2009, which does not mention tyrosine
phosphatase and which describes photoswitchable conjugates of the protein
phosphatase calcineurin attached to a photoisomerizable group B and also
attached to a drug for use in medical treatments (neither of these groups
is genetically encodable). As one example in EP2116263, tumor growth is
suppressed by inhibition of the protein phosphatase calcineurin. A
photoisomerizable group B, for near UV (e.g. 370 nm) or near IR (e.g. 740
nm) induced activity, does not include a light responsive plant protein
phototropin 1 LOV2 N-terminal alpha helix.
[0132] Receptor PTPs are contemplated for conjugation to light sensing
proteins, as described herein. In contrast, Karunarathne, et al.,
"Subcellular optogenetics--controlling signaling and single-cell
behavior." J Cell Sci. 128(1):15-25, 2015, describes photosensitive
domains, such as bacterial light-oxygen-voltage-sensing (LOV and LOV2)
domains including a C-terminal helical J.alpha. region, tagged to
receptor tyrosine kinases (RTKs); however, there were no specific
examples, and there was no mention of a tyrosine phosphatase or a plant
phototropin 1 LOV2
N-terminal alpha helix. Optical activation of an inositol 5-phosphatase
was shown, but inositol 5-phosphatase is not a protein phosphatase.
[0133] B. Enzymatic Phosphorylation of Tyrosine Residues.
[0134] Enzymatic phosphorylation of tyrosine residues has a role in
cellular function and is anomalously regulated in an enormous range of
diseases (e.g., diabetes, cancer, autoimmune disorders, and Noonan
syndrome). It is controlled by the concerted action of two classes of
structurally flexible--and dynamically regulatable--enzymes: protein
tyrosine kinases (PTKs), which catalyze the ATP-dependent phosphorylation
of tyrosine residues, and protein tyrosine phosphatases (PTPs), which
catalyze the hydrolytic dephosphorylation of phosphotyrosines (5, 6). A
detailed understanding of the mechanisms by which these enzymes respond
to activity-modulating structural perturbations (i.e., mutations,
post-translational modifications, or binding events) can, thus,
illuminate their contributions to various diseases and facilitate the
design of new PTK- or PTP-targeted therapeutics.
[0135] Over the last several decades, many biophysical studies have
dissected the catalytic mechanisms and regulatory functions of PTKs (7,
8), which are common targets of pharmaceuticals.(9) Detailed analyses of
PTPs, by contrast, have lagged behind.(10) These enzymes represent an
underdeveloped source of biomedical insight and therapeutic potential (no
inhibitors of PTPs have cleared clinical trials); they are, thus, the
focus of this study.
[0136] PTPs use two loops to dephosphorylate tyrosine residues. The
eight-residue P-loop binds phosphate moieties through a positively
charged arginine, which enables nucleophilic attack by a nearby cysteine,
and the ten-residue WPD loop contains a general acid catalyst--an
aspartate--that protonates the tyrosine leaving group and hydrolyzes the
phosphoenzyme intermediate.(11-13) During catalysis, the P-loop remains
fixed, while the WPD loop moves .about.10 angstroms between open and closed
states; nuclear magnetic resonance (NMR) analyses suggest this movement
controls the rate of catalysis.(14)
[0137] Recent analyses of protein tyrosine phosphatase 1B (PTP1B), a drug
target for the treatment of diabetes, obesity, and breast cancer,
indicate that motions of its WPD loop are regulated by an allosteric
network that extends to its C-terminus (FIG. 1B) (15, 16). This network
is susceptible to modulation by both (i) inhibitors that displace its
C-terminal .alpha.7 helix (17, 18) and (ii) mutations that disrupt
communication between the .alpha.7 helix and the WPD loop (15); the
specific collection of residues that enable allosteric communication in
PTP1B and other PTPs has yet to be fully resolved.
[0138] Protein tyrosine phosphatase 1B (PTP1B). PTP1B represents a
valuable tool for use in identifying potential therapeutics for at least
four reasons: (i) It is implicated in diabetes.sup.5, obesity.sup.6,
cancer.sup.30, anxiety.sup.31, inflammation.sup.32, the immune response,
and neural specification in embryonic stem cells.sup.33. (ii) The
mechanisms underlying its subcellular localization are well understood (a
short C-terminal anchor connects it to the ER; proteolysis of this anchor
releases it to the cytosol).sup.29,34. (iii) It can be expressed,
purified, and assayed with ease.sup.35. (iv) It is a member of a class of
structurally similar enzymes (PTPs) that could facilitate the rapid
extension of architectures for making it photoswitchable. PTP1B, thus,
represents both an experimentally tractable model system for testing
strategies for optical control, and an enzyme for which optical
modulation will permit detailed analyses of a wide range of diseases and
physiological processes.
[0139] Spatial regulation and intracellular signaling. PTP1B demonstrates,
by example, the value of photoswitchable enzymes for studying spatial
regulation in intracellular signaling. It is hypothesized to inactivate
receptor tyrosine kinases through (i) contacts between endosomes and the
ER.sup.37,38, (ii) contacts between the plasma membrane and extended
regions of the ER.sup.39, and (iii) direct protein-protein interactions
enabled by its partial proteolysis and release into the cytosol.sup.34.
The role of different mechanisms (or locations) of PTP1B-substrate
interaction in determining the outcomes of those interactions is poorly
understood. Evidence suggesting a relationship between the location of
PTP1B and its role in signaling has arisen in studies of tumorigenesis.
Inhibition of PTP1B can suppress tumor growth and metastasis in
breast.sup.30,40, lung.sup.3,41, colorectal.sup.9, and prostate
cancers.sup.42,43, while its upregulation has similar effects in
lymphoma.sup.3,44. Recent evidence suggests that the former effect may
result from inhibition of cytosolic PTP1B.sup.45; the cause of the latter
is unclear. At present, there are no tools to investigate the
differential influence of spatially distinct subpopulations of PTP1B on
tumor-associated signaling events within the same cell. Photoswitchable
variants of PTP1B represent such a tool.
[0140] Network biology. Signaling networks are often represented as nodes
(proteins) connected by lines (interactions).sup.46. Such maps capture
the connectivity of biochemical relay systems, but obscure spatial
context--the ability of a single interaction to occur in multiple
locations and, perhaps, to stimulate multiple signaling outcomes. This
study develops a set of tools that will enable detailed studies of the
role of spatial context in guiding the propagation of signals through
biochemical networks; such an examination contributes to understanding
the role of PTP1B in cell signaling (and processes associated with
tumorigenesis), and is generally relevant to the study of any enzyme that
exists in spatially distinct subpopulations within the cell.
II. Optogenetic Actuators.
[0141] Optogenetic actuators (genetically encodable proteins that undergo
light-induced changes in conformation) provide a convenient means of
placing biochemical events under optical control. Alone, or when fused to
other proteins, they have enabled optical manipulation of biomolecular
transport, binding, and catalysis with millisecond and submicron
resolution in living cells. Our approach addresses two major deficiencies
in existing technologies: Observational interference and illuminating
half the story. Existing strategies to control the activity of enzymes
with light interfere with native patterns of protein production,
localization, and interaction (often by design) and, thus, make direct
interrogation and/or control of those patterns--which determine how
biochemical signals are processed--difficult. There are several methods
to control protein kinases with light, but no analogous methods for
controlling protein phosphatases. As signaling networks are regulated by
the concerted action of both classes of enzyme, comprehensive control
and/or detailed dissections of those networks require methods for
controlling both.
[0142] Embodiments described herein comprise (i) an approach for
controlling the activity of proteins with light without disrupting their
wild-type activities and (ii) a demonstration of this approach on a
protein of particular importance: protein tyrosine phosphatase 1B
(PTP1B), a regulator of cell signaling and a validated drug target for
the treatment of diabetes, obesity, and cancer. There are no known
photoswitchable protein tyrosine phosphatases. The PTP1B-LOV2 construct
reported in this filing is the first. The N-terminal alpha helix of
LOV2 is ignored in most studies (even reviews of optical switches) and
has not been used as an exclusive connection point for optical modulation
of enzymes.
[0143] We have developed a photoswitchable version of PTP1B by fusing the
C-terminal allosteric domain of this enzyme to the N-terminal alpha helix
of a protein light switch (i.e., the LOV2 domain of phototropin 1 from
Avena sativa). We present evidence that this general architecture--which
is unique in the placement of LOV2 away from the active site of PTP1B
(minimally disruptive)--can be extended to other PTPs and, perhaps, PTKs.
For example, we used a statistical coupling analysis to show that the
allosteric network exploited in our PTP1B design is preserved across the
PTP family.
[0144] Alone, or when fused to other proteins, optogenetic actuators have
enabled optical manipulation of biomolecular transport, binding, and
catalysis with millisecond and submicron resolution.sup.15,16. At least
three deficiencies limit their use in detailed studies of signaling
networks: Observational interference. Existing strategies to control the
activity of enzymes with light interfere with native patterns of protein
production, localization, and interaction.sup.16,17 (often by design)
and, thus, make direct interrogation of those patterns--which determine
how biochemical signals are processed.sup.10--difficult. Illuminating half
the story. There are several methods to control protein kinases with
light.sup.18,19, but no analogous methods for controlling protein
phosphatases. As signaling networks are regulated by the concerted action
of both classes of enzyme, detailed dissections of those networks require
methods for controlling both. A limited palette of actuators. Optogenetic
actuators that enable subcellular control of enzyme activity require the
use of blue or green light.sup.15. These wavelengths exhibit significant
phototoxicity.sup.20, suffer from short biological penetration
depths.sup.21, and, as a result of their spectral similarity, limit
actuation to individual signaling events, rather than multiple events
simultaneously.
[0145] A. Photoswitchable Constructs: Advantages Over Other Exemplary
Technologies.
[0146] As described herein, a photoswitch describes a protein-protein
architecture (e.g., a PTP1B-LOV2 fusion) that is optically active in its
monomeric form. A reference, WO2013016693, "Near-infrared light-activated
proteins," Publication Date Jan. 31, 2013, relies on homodimerization. In
contrast, optical control as described herein is over a larger range of
proteins, including both those that require homodimerization and those
that do not, unlike in WO2013016693. Further, this reference describes
types of photosensory modules including blue light-sensitive
flavoproteins found in plants; photoreceptors of blue-light using flavin
adenine dinucleotide (BLUF); Light, Oxygen, or Voltage sensing (LOV)
types, which includes plant and bacterial photoreceptors; and
plant/microbe phytochromes, sensitive to light, i.e. light-induced helix
rotation in the red-to-NIR region. More specifically described with
examples are bacteriophytochrome (Bph)-based photoactivated fusion
proteins, using light-responsive alpha helixes from Rhodobacter
sphaeroides (BphG) fused to proteins such as protein phosphatases,
protein kinases, membrane receptors, etc. E. coli are modified so as to
exhibit the level of photoactivity of these expressed fusion proteins,
i.e. in the presence or absence of red-to-NIR light. Although blue color
changes in E. coli expressing fusion proteins are described in response
to light, these blue bacteria are the result of using far-red/NIR-light
for photoactivating a fusion protein that in turn activates lacZ
expression in the presence of X-gal, not a photoresponse to exposure to
blue light. However, there is no specific mention of a tyrosine
phosphatase or a plant phototropin 1 LOV2 N-terminal alpha helix. In
fact, reviews on optogenetics tend to depict LOV2 as having one terminal
helix: The C-terminal Jalpha helix. While there are studies/patents
indicating that simple insertion of the LOV2 domain enables photocontrol,
they rely on the underlying assumption that the Jalpha helix is
unwinding to produce the controlling effect, not the A'alpha helix as
described herein.
[0147] B. A "Cage-Free" Approach to Control Protein Tyrosine Phosphatases
and Protein Tyrosine Kinases with Light.
[0148] Current strategies for using light to control the activity of
enzymes (as opposed to their concentration or location) rely on
cage-based systems: a light-responsive protein, when fused to an enzyme
of interest, controls access to its active site.sup.16,47. Unfortunately,
such architectures can alter the affinity of enzymes for binding partners
and change their susceptibility to activity modulating modifications
(e.g., phosphorylation).sup.16,18. These effects complicate the use of
optogenetics to study signaling. This study will develop a "cage-free",
allostery-based approach for optical control that minimizes interference
between enzymes and their substrates (and other binding partners). This
approach will help preserve native patterns of protein localization,
interaction, and post-translational modification and, thus, facilitate
studies of the influence of those patterns on intracellular signaling.
[0149] 2. A genetically encoded photoswitchable phosphatase. There are no
genetically encodable photoswitchable phosphatases; the chimeras
developed in this proposal will be the first. Photoswitchable variants of
PTP1B will enable detailed studies of a wide range of interesting
PTP1B-regulated processes (e.g., insulin, endocannabinoid, and epidermal
growth factor signaling.sup.49,51, and cell adhesion and
migration.sup.52). Photoswitchable phosphatases, in general, will provide
a useful class of tools for studying cell biology (particularly in
concert with photoswitchable kinases, which could enable complementation
experiments).
[0150] Hypothesis: The catalytic domains of PTPs and PTKs possess
C-terminal .alpha.-helices that are distal to their active sites, yet capable
of modulating their catalytic activities (for at least a subset of
enzymes--the generality of this function is not known).sup.23,24. We
hypothesize that the fusion of this helix to the N-terminal .alpha.-helix
of the light-oxygen voltage 2 (LOV2) domain of phototropin 1 from Avena
sativa--a photosensory domain with terminal helices that unwind in
response to blue light.sup.25,26--will yield enzyme-LOV2 chimeras that
exhibit light-dependent catalytic activities, yet retain their native
substrate specificities and binding affinities.
[0151] Experimental approach: We will attach the C-terminal .alpha.-helix
of PTP1B to the N-terminal .alpha.-helix of LOV2 at homologous crossover
points, and we will assess the influence of photoactivation on the
catalytic activity of the resulting chimeras. This effort will involve
the use of (i) kinetic assays and binding studies to characterize the
substrate specificities and binding affinities of photoswitchable
constructs and (ii) crystallographic and spectroscopic analyses to
examine the structural basis of photocontrol. Informed by these studies,
we will extend our approach to striatal-enriched protein tyrosine
phosphatase (STEP) and protein tyrosine kinase 6 (PTK6), enzymes
implicated in Alzheimer's disease and triple-negative breast cancer,
respectively.
[0152] We will combine sophisticated biophysical studies, synthetic
biology, and fluorescence microscopy to (i) develop protein architectures
that enable optical control of protein tyrosine phosphatases (PTPs) and
protein tyrosine kinases (PTKs) without interfering with their wild-type
activities or binding specificities, (ii) evolve PTPs and PTKs modulated
by red light, and (iii) develop an imaging methodology to study spatially
localized signaling events in living cells.
[0153] We will begin our study with PTP1B, a validated drug target for the
treatment of diabetes, obesity, and breast cancer, and an enzyme for
which optogenetic tools will be particularly useful to address current
gaps in knowledge (e.g., the role of spatially distinct subpopulations of
PTP1B in promoting or suppressing the growth of tumors.sup.22). Using it
as a model, we will establish the generality of our methods by extending
them to other PTPs and PTKs.
[0154] C. A Photoswitchable Variant of PTP1B.
[0155] Our first objective seeks to use LOV2, a protein with terminal
helices that unwind in response to blue light, to control the activity of
PTP1B, an enzyme for which unwinding of the C-terminal .alpha.-helix
disrupts activity by distorting its catalytically essential WPD loop
(FIG. 1AB, FIG. 6). To assess the feasibility of this goal, we
constructed five PTP1B-LOV2 chimeras (joined at homologous crossover
points): three chimeras showed light-dependent catalytic activity on
4-methylumbelliferyl phosphate (4M) (FIG. 1G). A subsequent mutational
analysis of one chimera indicated that mutations in the .alpha.-helix
that links PTP1B to LOV2 can improve catalytic activity and dynamic range
(DR, the ratio of dark/light activities; FIG. 1G). Our ability to
build--and begin optimizing--a photoswitchable PTP1B-LOV2 chimera by
screening a small number of constructs suggests that rational design will
allow us to build a chimera sufficient for intracellular signaling
studies. We note: Our most photoswitchable chimera has a DR of 2.2;
previous imaging studies suggest that a DR of 3-10 is sufficient to
control intracellular signaling.sup.2,18,19.
[0156] More specifically, FIG. 1C demonstrates some of the differences over
other types of optical control. The y-axis of the top plot indicates the
activity of each construct in the dark (i.e., the initial rate of
PTP1B-catalyzed hydrolysis of p-nitrophenyl phosphate); the y-axis of the
bottom plot indicates the ratio of activity in dark and light state
(i.e., the initial rate in the dark/the initial rate in the presence of
455 nm light), i.e. dynamic range.
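The dynamic range defined above (initial rate in the dark divided by initial rate under 455 nm light) can be illustrated with a minimal Python sketch. The function name and the example rates are hypothetical and chosen only for illustration; they are not measurements from FIG. 1C.

```python
def dynamic_range(rate_dark: float, rate_light: float) -> float:
    """Return the dark/light ratio of initial rates (the dynamic range, DR)."""
    if rate_light <= 0:
        raise ValueError("rate_light must be positive")
    return rate_dark / rate_light


# Hypothetical initial rates in arbitrary units (illustrative only).
example_rates = {
    "not switchable": (1.0, 1.0),   # same rate in dark and light
    "switchable": (0.9, 0.5),       # lower rate under illumination
}

for label, (dark, light) in example_rates.items():
    print(label, dynamic_range(dark, light))
```

A construct with DR close to 1 is not photoswitchable under this definition; larger values indicate stronger inhibition by light.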
[0157] Black bars show the activity and dynamic range for a set of eight
initial constructs that differ in the crossover point (see the bottom of
FIG. 1B). Some of these constructs are photoswitchable, and some are not.
Version 7 shows the greatest photoswitchability--the dynamic range is
approximately 1.8.
[0158] More specifically, colors are associated with different types of
constructs. Black: different crossover point (see FIG. 1B for crossover
points); Gray: different partitioning of the linker (see, Linker section
below); Light blue: the Jalpha helix--this is at the C-terminus of the
LOV2 domain; Dark blue: the A'alpha helix--this is at the N-terminus of
the LOV2 domain and, thus, on the region that links it to PTP1B; Yellow:
the alpha7 helix of PTP1B--this is at the C-terminus of PTP1B and, thus,
on the region that links PTP1B to LOV2; Orange: combination: a
combination of sites from the previous colors, see below for additional
information.
[0159] These results were surprising, in part, because a recent review on
optogenetics shows that photocontrol of activity requires the
J.alpha. helix of LOV2, where J.alpha. is a C-terminal helix which
resides in a folded state against the LOV domain core, to be attached to
a protein of interest, see Repina, N. A., Rosenbloom, A., Mukherjee, A.,
Schaffer, D. V. & Kane, R. S. At Light Speed: Advances in Optogenetic
Systems for Regulating Cell Signaling and Behavior. Annu. Rev. Chem.
Biomol. Eng. 8, 13-39 (2017). Photoactivation with blue light converts
the noncovalent interaction between the LOV core and its bound flavin
chromophore, FMN, into a covalent one through a conserved cysteine
residue. The accompanying light-induced conformational change displaces
the J.alpha. helix away from the protein core, leading to uncaging of a
fused effector domain (e.g., the kinase domain of phot1). J.alpha. helix
reverts to its dark-state caged conformation within minutes owing to
spontaneous decay of the protein-cofactor bond.
[0160] Several limitations of the native AsLOV2 domain have motivated
efforts to engineer improved variants. First, when fused to foreign
protein domains, spontaneous undocking of the J.alpha. helix can lead to
a relatively high dark-state activity, resulting in a low dynamic range
upon AsLOV2 uncaging (26). For example, the light-inducible DNA-binding
system LovTAP has only a fivefold change in DNA affinity between the dark
and illuminated states (27). To address this issue, Strickland et al.
(26) used rational design to introduce four mutations into AsLOV2 that
stabilized the docking of J.alpha. to the LOV core. This increased the
dynamic range of LovTAP from 5-fold to 70-fold, an approach that can be
applied to other LOV domain optogenetic systems to reduce dark-state
activity. AsLOV2 fusions are also particularly sensitive to linker
lengths and the size and structure of attached domains (28, 29), and as a
result, each new fusion protein switch requires optimization to achieve
low dark-state and high light-state activity in mammalian cells.
[0161] In contrast to the J.alpha. helix-protein chimeras, as shown herein,
the A'.alpha. helix, not the J.alpha. helix, is attached to the protein of
interest to form photoswitchable constructs, e.g. PTP1B.
[0162] Exemplary Linkers.
[0163] Gray bars of FIG. 1C show the activity and dynamic range of mutants
of version 7 in which the linker has been re-partitioned. In other words,
version 7 has the following linker region: LSHEDLATTL (SEQ ID NO: 5),
where the underlined region "LSHED" (SEQ ID NO: 6) corresponds to the
C-terminus of PTP1B, and the region "LATTL" (SEQ ID NO: 7) corresponds to
the N-terminus of LOV2. Version 7.1 has sequence LSHEDATTL (SEQ ID NO:
8); version 7.2 has sequence LSHEDTTL (SEQ ID NO: 9), and so on. Here, we
find that version 7.1 has the same dynamic range as version 7, but a
higher activity. We, thus, used version 7.1 for further optimization.
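The re-partitioning described above can be sketched in Python as successive trimming of residues from the LOV2 side of the linker. The two sequence fragments come from the text; the function name and the assumption that each later version removes one further N-terminal LOV2 residue are illustrative, and only versions 7, 7.1, and 7.2 are stated explicitly in the source.

```python
PTP1B_SIDE = "LSHED"  # C-terminal residues of PTP1B (SEQ ID NO: 6)
LOV2_SIDE = "LATTL"   # N-terminal residues of LOV2 (SEQ ID NO: 7)


def linker_variants(ptp1b_side: str, lov2_side: str, max_trim: int) -> list:
    """Build linkers by removing 0..max_trim residues from the LOV2 N-terminus."""
    if not 0 <= max_trim <= len(lov2_side):
        raise ValueError("max_trim must be between 0 and len(lov2_side)")
    return [ptp1b_side + lov2_side[i:] for i in range(max_trim + 1)]


# Versions 7, 7.1, and 7.2 as listed in the text.
for version, seq in zip(("7", "7.1", "7.2"),
                        linker_variants(PTP1B_SIDE, LOV2_SIDE, 2)):
    print(version, seq)
```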
[0164] Exemplary Mutations.
[0165] Light blue bars show the activity and dynamic range of mutants of
version 7.1 in which the J.alpha. helix contains helix-stabilizing
mutations. Curiously, these improve the activity of 7.1, but do not
improve its dynamic range.
[0166] Dark blue bars show the activity and dynamic range for mutants of
version 7.1 in which the A'.alpha. helix contains helix-stabilizing
mutations. One of these mutations (T406A) improves dynamic range; we used
this version for further studies.
[0167] Yellow bars show the activity and dynamic range of mutants of
version 7.1 in which the .alpha.7 of PTP1B has helix-stabilizing
mutations; the orange bars show the activity and dynamic range for
mutants of version 7.1 in which the multiple mutations are combined.
Neither of the constructs associated with yellow and orange bars shows
characteristics improved over those of 7.1 (T406A).
[0168] A minimally disruptive approach. Two kinetic studies indicate that
our architecture for photocontrol does not interfere with the native
substrate specificity or binding behavior of PTP1B: (i) An analysis of
the activity of chimera E3 (from FIG. 1D) on p-nitrophenyl phosphate (pN)
indicates that light affects k.sub.cat, but not K.sub.m (FIGS. 2K and L).
(ii) An analysis of activities on three substrates of different sizes
(4M, pN, and a peptide) shows that DR is the same for all three (FIG.
2L-K). The results of both studies are consistent with our hypothesized
mechanism of photocontrol: LOV2-induced unwinding of the C-terminal
.alpha.-helix of PTP1B disrupts the movement of its catalytically
essential WPD loop, which controls the rate of catalysis, but has little
influence on substrate binding affinity.
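One way to see why a change in k.sub.cat alone is consistent with a substrate-independent DR is a short Michaelis-Menten sketch in Python. All parameter values below are hypothetical and are not taken from FIG. 2; the comparison case in which light changes K.sub.m instead is included only as a contrast.

```python
def mm_rate(s: float, kcat: float, km: float, e_total: float = 1.0) -> float:
    """Michaelis-Menten initial rate: v = kcat * [E] * [S] / (Km + [S])."""
    return kcat * e_total * s / (km + s)


# Hypothetical kinetic parameters (arbitrary units, illustrative only).
KM = 2.0
KCAT_DARK = 10.0
KCAT_LIGHT = 5.0


def dr_kcat_only(s: float) -> float:
    """DR when light lowers kcat and leaves Km unchanged."""
    return mm_rate(s, KCAT_DARK, KM) / mm_rate(s, KCAT_LIGHT, KM)


def dr_km_only(s: float, km_light: float = 8.0) -> float:
    """DR when light raises Km and leaves kcat unchanged (contrast case)."""
    return mm_rate(s, KCAT_DARK, KM) / mm_rate(s, KCAT_DARK, km_light)


for s in (0.1, 1.0, 10.0, 100.0):
    print(s, dr_kcat_only(s), dr_km_only(s))
```

In the k.sub.cat-only case the (K.sub.m + [S]) terms cancel, so DR equals the ratio of the two k.sub.cat values at every substrate concentration; in the contrast case DR shrinks toward 1 as substrate approaches saturation.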
[0169] Biophysical studies. Photoswitchable chimeras express at titers
(.about.100 mg/L) sufficient to carry out detailed biophysical analyses. We
performed a preliminary set of these analyses on chimera E3. (i) We used
circular dichroism (CD) to examine the influence of photoactivation on
its secondary structure; spectral measurements indicate that
photoactivation reduces .alpha.-helical content (222 nm; FIG. 2B). (ii)
We used the amplitude at 222 nm to measure a post-activation recovery
time for .alpha.-helical content: T.sub.r.about.30 s (FIG. 2E). This value is
similar to the recovery times of previously developed LOV2-based
photoswitchable constructs. (iii) We used tryptophan fluorescence to
measure a post-activation recovery time of tryptophan residues:
T.sub.r.about.50 s (FIG. 2F). Tryptophan fluorescence is a rough metric for the
conformation of PTP1B (which has seven tryptophan residues, compared to
one in LOV2); this slower recovery time, thus, suggests that PTP1B takes
longer than LOV2 to refold. (iv) We identified a set of crystallization
conditions (those previously used to crystallize PTP1B.sub.WT) to grow
crystals of E3 (FIG. 2F). (v) We collected a two-dimensional
.sup.1H-.sup.15N HSQC spectrum of PTP1B.sub.WT, and assigned .about.65% of
non-proline peaks. These NMR experiments, which are recent, have yet to
include PTP1B-LOV2 chimeras; but the ease with which we carried them out
(a single try) suggests that similar analyses of chimeras will be
straightforward. The experimental tractability of PTP1B-LOV2 chimeras
will enable a comprehensive biophysical analysis of variants with
different photophysical properties.
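A recovery time such as T.sub.r can be estimated from a post-activation time trace. The Python sketch below assumes a single-exponential return to a known dark-state plateau, which is an assumption made here for illustration; the source does not state the fitting model used. The trace is synthetic and the function name is hypothetical.

```python
import math


def recovery_time(times, signal, plateau):
    """Estimate tau for signal(t) = plateau - A * exp(-t / tau).

    Uses a least-squares line through ln(plateau - signal) versus t,
    whose slope equals -1 / tau. Requires signal < plateau at every point.
    """
    ys = [math.log(plateau - s) for s in signal]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(ys) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, ys))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return -sxx / sxy


# Synthetic, noise-free trace with a 30 s time constant (illustrative only).
TRUE_TAU = 30.0
times = [10.0 * i for i in range(10)]
trace = [1.0 - 0.4 * math.exp(-t / TRUE_TAU) for t in times]

print(recovery_time(times, trace, plateau=1.0))
```

With real data the plateau is usually not known exactly, so a nonlinear fit of all three parameters would be the more robust choice.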
[0170] Example 1. To develop a "cage-free" approach to control protein
tyrosine phosphatases and kinases with light. This section develops an
approach for placing enzymes under optical control without disrupting
their native interactions. We will demonstrate this approach with PTP1B
and, then, extend it to STEP and PTK6. We will know that we are
successful when we have a PTP1B-LOV2 chimera that exhibits a three- to
ten-fold change in activity between light and dark states, and when we
have identified structure-based design rules that facilitate fine-tuning
of the photophysical properties of photoswitchable variants of PTP1B,
STEP, and PTK6.
[0171] D. Development of a Photoswitchable Variant of PTP1B.
[0172] The efforts in this section assume--and with crystallographic,
kinetic, and binding studies, attempt to confirm--that optogenetic
actuation systems located far from active sites are less likely to
disrupt wild-type behaviors than actuation systems located nearby.
Kinetic studies of preliminary PTP1B-LOV2 chimeras (i.e., chimeras in
which the C-terminal helix of PTP1B is connected to the N-terminal helix
of the LOV2 domain of phototropin 1 from Avena sativa) support this
hypothesis: light inhibits their activity by affecting k.sub.cat, not
K.sub.m, and they show wild-type kinetics on 4-methylumbelliferyl
phosphate (4M), a model substrate (FIG. 1G and FIG. 2K). Photomodulation
of k.sub.cat, but not K.sub.m suggests that LOV2 exploits an allosteric
network to distort the WPD loop (FIG. 6).
[0173] Our initial constructs, which represent the first reported examples
of photoswitchable protein phosphatases, will facilitate a systematic study
of the functional advantages of different chimera architectures. We are
particularly interested in understanding how (i) the length of the linker
that connects PTP1B and LOV2 and (ii) the stability of the terminal
helices of LOV2 affect catalytic activity and dynamic range. We will
study these relationships by combining spectroscopic analyses with kinetic
studies. Spectroscopic analyses will show how different PTP1B-LOV2
chimeras rearrange under illumination (e.g., we will use CD and
fluorescence spectroscopy to measure photomodulation of .alpha.-helical
content and tryptophan fluorescence), and kinetic studies will reveal the
influence of those rearrangements on catalytic activity and dynamic
range.
[0174] The results of our biophysical analyses will facilitate the
optimization of our chimera for in vitro cell studies. We will target a
chimera--hereafter, referred to as PTP1B.sub.PS--with the following
properties: a dynamic range (DR) of 3-10, a recovery time of
T.sub.r.about.15-60 s, and wild-type activity (in its activated state).
Previous optogenetic studies suggest that these attributes enable optical
control of cell signaling.sup.2,18,19. We note: Biophysical studies of
PTP1B indicate that the removal of its C-terminal .alpha.-helix can
reduce its activity by a factor of four.sup.57; accordingly, we believe
that LOV2 can modulate the activity of PTP1B by at least fourfold (of
course, LOV2 may trigger structural distortions more pronounced than
those of a simple truncation).
[0175] E. Characterization of PTP1B-Substrate and PTP1B-Protein
Interactions.
[0176] We will assess the influence of LOV2 on the substrate specificity
of PTP1B by using kinetic analyses. Specifically, we will compare the
activities of PTP1B.sub.WT and PTP1B.sub.PS on three substrates: (i)
p-nitrophenyl phosphate, a general substrate for tyrosine phosphatases,
(ii) ETGTEEpYMKMDLG (SEQ ID NO: 10), a substrate of PTPs closely related
to PTP1B, and (iii) RRLIEDAEpYAARG (SEQ ID NO: 11), a substrate specific
to PTP1B. A comparison of values of k.sub.cat and K.sub.m on these
substrates (FIG. 2K shows an example kinetic study) will reveal
differences in the catalytic activities and specificities of PTP1B.sub.WT
and PTP1B.sub.PS. These studies will also allow us to assess the
substrate-dependence of photoswitchability (i.e., DR). Photomodulation is
often assumed to be independent of substrate; there is, however, no
biochemical basis for this assumption (particularly in cage-based
systems, where substrates may bind with different affinities and, thus,
have different abilities to compete with the caging protein). We will
test it.
[0177] We will assess the ability of PTP1B.sub.PS to engage in the same
protein-protein interactions as PTP1B.sub.WT by measuring the affinity of
both enzymes for two native binding partners of PTP1B.sub.WT: LMO4 and
Stat3. Binding isotherms based on changes in tryptophan fluorescence of
PTP1B will facilitate this study (FIG. 7).
[0178] Our biochemical comparison of PTP1B.sub.WT and PTP1B.sub.PS may
seem tedious, but we believe that this analysis is necessary to establish
the relevance of future optogenetic observations to wild-type processes.
[0179] Biostructural characterization. We will investigate the structural
basis of photocontrol in PTP1B.sub.PS by using X-ray crystallography and
NMR spectroscopy. X-ray crystal structures will show how LOV2 affects the
structure of PTP1B (and vice versa); NMR spectroscopy will show how LOV2
modulates catalytic activity. For crystallographic studies, we will
crystallize PTP1B.sub.PS in its dark state (we will use the C450S
mutation, which prevents formation of the cysteine adduct.sup.2,26) by
screening crystallization conditions previously used for LOV2, PTP1B, and
LOV2-protein chimeras (all of which have crystal structures.sup.2,35,58);
preliminary results suggest that those used to grow crystals of
PTP1B.sub.WT also yield crystals of PTP1B-LOV2 chimeras (FIG. 2J). For
NMR studies, we will use heteronuclear single quantum coherence (HSQC)
spectroscopy and transverse relaxation-optimized spectroscopy (TROSY) to
monitor changes in the conformation and backbone dynamics of
PTP1B.sub.PS before and after illumination. (We note: Backbone .sup.1H,
.sup.13C, and .sup.15N chemical shifts have been assigned for PTP1B and
LOV2.sup.59,60).
[0180] G. Exemplary Imaging Methodology to Study Subcellular Signaling
Events in Living Cells.
[0181] This section uses PTP1B.sub.PS (a PTP1B-LOV2 chimera) to develop an
approach for using confocal microscopy to probe--and study--subcellular
signaling events. We will know that this objective is successful when we
can inactivate PTP1B.sub.PS within subcellular regions, monitor the effect of that
inactivation with a FRET-based sensor, and isolate the contributions of
different subpopulations of PTP1B (e.g., ER-bound and cytosolic) to
sensor phosphorylation.
[0182] Hypothesis. The subcellular localization of PTPs and PTKs is
controlled by domains proximal to their catalytic cores.sup.23,24. We
hypothesize that the attachment of these domains to photoswitchable
chimeras will give them wild-type localization patterns, and enable the
use of confocal microscopy to study the contribution of spatially
distinct subpopulations of PTPs and PTKs to cell signaling. Experimental
approach: Within the cell, PTP1B exists in two spatially distinct
subpopulations: attached to the cytosolic face of the endoplasmic
reticulum, and free in the cytosol--a result of proteolysis of its short
(.about.80 residue) C-terminal ER anchor.sup.29. We will (i) attach the ER
anchor of wild-type PTP1B (PTP1B.sub.WT) to our PTP1B-LOV2 chimera, (ii)
compare the subcellular localization of the resulting chimera with that
of PTP1B.sub.WT, (iii) use confocal microscopy--in conjunction with a
FRET-based sensor for phosphatase activity--to control and monitor PTP1B
activity within the cell, and (iv) develop a reaction-diffusion model to
assess the contributions of spatially distinct subpopulations of PTP1B to
changes in sensor phosphorylation over time and space. This work will
yield a general approach for studying spatially localized signaling
events in living cells.
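A reaction-diffusion model of the kind mentioned above can be sketched in one dimension. The Python example below is a minimal illustration under stated assumptions, not the model contemplated in the source: the sensor's phosphorylated fraction p(x, t) diffuses, is phosphorylated at a uniform rate, and is dephosphorylated at a position-dependent rate that stands in for a spatially restricted PTP1B subpopulation. All names and parameter values are hypothetical.

```python
def simulate(k_ptp, k_kin=0.1, diff=1.0, dx=1.0, dt=0.1, steps=2000):
    """Explicit finite-difference solution of
        dp/dt = diff * d2p/dx2 + k_kin * (1 - p) - k_ptp(x) * p
    with no-flux boundaries, starting from p = 0 everywhere.
    Stable for the defaults because diff * dt / dx**2 = 0.1 <= 0.5.
    """
    n = len(k_ptp)
    p = [0.0] * n
    for _ in range(steps):
        new = p[:]
        for i in range(n):
            left = p[i - 1] if i > 0 else p[i]        # mirror at left edge
            right = p[i + 1] if i < n - 1 else p[i]   # mirror at right edge
            laplacian = (left - 2.0 * p[i] + right) / dx ** 2
            reaction = k_kin * (1.0 - p[i]) - k_ptp[i] * p[i]
            new[i] = p[i] + dt * (diff * laplacian + reaction)
        p = new
    return p


# Hypothetical geometry: phosphatase active only in the first 10 of 50 cells.
N_CELLS = 50
k_ptp_profile = [1.0 if i < 10 else 0.0 for i in range(N_CELLS)]
profile = simulate(k_ptp_profile)

print(profile[0], profile[-1])
```

In this sketch, sensor phosphorylation stays low where the phosphatase is active and approaches saturation far from it, with a transition zone whose width depends on the diffusion coefficient and the reaction rates.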
[0183] Localization of PTP1B.sub.PS.
[0184] To examine the localization of PTP1B.sub.PS in living cells, we
will express three variants in COS-7 cells: (i) PTP1B.sub.PS_C450S, (ii)
PTP1B.sub.PS_C450S attached to a short segment (.about.20 amino acids.sup.29)
of the C-terminal ER anchor of PTP1B.sub.WT that contains only the
transmembrane domain (but not the proteolysis site), and (iii)
PTP1B.sub.PS_C450S attached to the full C-terminal ER anchor of PTP1B.sub.WT
(.about.80 amino acids.sup.29). We hypothesize that these constructs will
have (i) cytosolic, (ii) ER-bound, and (iii) both cytosolic and ER-bound
(i.e., wild-type) localization patterns, respectively. Using confocal
microscopy, we will test this hypothesis by using the fluorescence of
LOV2 to locate each chimera.sup.70. (In these studies, we will locate the
ER with fluorescently-labeled SEC61B, an ER-associated transport
complex.sup.71. The C450S mutation, which locks LOV2 in its fluorescent
state, will prevent photoactivation during imaging).
[0185] COS-7 cells, fibroblast-like cells derived from the kidney tissue
of the African green monkey, are particularly compatible with the
aforementioned analysis for three reasons: (i) They are large and flat
and, thus, facilitate imaging of spatially segregated subcellular
regions.sup.72, (ii) They are compatible with commercially available
transfection reagents.sup.73, (iii) Methods for inducing
endocytosis.sup.71 and calpain expression.sup.74, two processes that
influence the subcellular activity and localization of PTP1B, are well
developed for these cells.
[0186] Control of PTP1B.sub.PS in living cells. We will examine the
activity of PTP1B within subcellular regions by pairing confocal
microscopy with a FRET-based sensor for protein phosphorylation
(developed by the Umezawa group.sup.54; FIG. 12). This sensor will
consist of a kinase substrate domain, a short flexible linker, and a
phosphorylation recognition domain--all sandwiched between two
fluorescent proteins (Clover, a green fluorescent protein, and mRuby2, a
red fluorescent protein). Phosphorylation of the substrate domain will
cause it to bind to the recognition domain, modulating (i.e., enhancing
or reducing) FRET between the two fluorescent proteins. Our preliminary
sensor, which uses substrate and SH2 domains compatible with PTP1B and
src.sup.23,55, exhibits a 20% change in FRET in response to
phosphorylation. We will attempt to optimize our sensor further by
screening different substrate domains, SH2 domains, and linker lengths.
Ouyang et al. built a FRET sensor for Src kinase activity that exhibits a
.about.120% change in FRET when phosphorylated.sup.55; we will use the
architecture of this sensor--or, perhaps, the sensor itself--to inform our
designs.
[0187] In our imaging experiments, we will use a 455-nm laser to
inactivate PTP1B within subcellular regions (1-10 .mu.m circles) and
fluorescence lifetime imaging microscopy (FLIM) to monitor changes in
sensor phosphorylation that result from that inactivation (FIG. 13). For
these experiments, we will use siRNA to deplete PTP1B.sub.WT and
fluorescently labeled SEC61B to label the ER. The output will be a series of images in which the
intensity of a pixel is proportional to the fluorescence lifetime of
Clover (and, thus, the extent of sensor phosphorylation).
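As an illustration of how such a lifetime image might be converted into a FRET readout, the following Python sketch applies the standard relation E = 1 - tau_DA/tau_D to each pixel. The donor-only lifetime and the pixel values are hypothetical placeholders, not measurements from this work, and the function names are ours.

```python
def fret_efficiency(tau_da, tau_d):
    """FRET efficiency from the donor lifetime with acceptor (tau_da)
    and the donor-only reference lifetime (tau_d), same units."""
    if tau_d <= 0:
        raise ValueError("donor-only lifetime must be positive")
    return 1.0 - tau_da / tau_d


def efficiency_map(lifetime_image, tau_d):
    """Apply fret_efficiency to every pixel of a 2D list of donor lifetimes."""
    return [[fret_efficiency(t, tau_d) for t in row] for row in lifetime_image]


# Hypothetical values (ns); a real donor-only lifetime would be measured.
TAU_D_NS = 3.0
image = [[3.0, 2.4], [2.7, 1.5]]
emap = efficiency_map(image, TAU_D_NS)
```

A shorter donor lifetime in a pixel corresponds to a higher FRET efficiency; whether that indicates more or less sensor phosphorylation depends on the sensor design.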
[0188] With this study, we are particularly interested in examining
relationships between (i) the location of PTP1B.sub.PS
activation/inactivation, (ii) the size of the region of
activation/inactivation, and (iii) the location and timing of changes in
the phosphorylation state of the sensor. We will investigate these
relationships by using a reaction-diffusion model. Equation 1 provides a
simple example of a governing equation:
$$\frac{\partial S_P(r,t)}{\partial t} = D_S \nabla^2 S_P + k_{cat}^{K}[KS] - k_{cat}^{P}[PS_P] - k_{on}^{K}[P][S_P] \qquad \text{(Eq. 1)}$$
for the phosphorylated sensor (S.sub.P). Here, D.sub.S is the diffusion
coefficient for the sensor; KS is the concentration of tyrosine kinase
bound to unphosphorylated sensor; PS.sub.P is the concentration of PTP1B
bound to phosphorylated sensor; P and S.sub.P are the concentrations of
free PTP1B and free phosphorylated sensor, respectively;
k.sub.cat.sup.K and k.sub.cat.sup.P are the catalytic
constants for the tyrosine kinase and PTP1B, respectively; and k.sub.on.sup.K
is the kinetic constant for sensor-PTP1B association. The kinase and
phosphatase are assumed to bind only weakly with their products (an
assumption that can be easily re-examined later). We may also supplement
this model with tools such as BioNetGen, a web-based platform for
generating biochemical reaction networks from user-specified rules for
the mechanisms and locations of biomolecular interactions.sup.75; such a
tool, which can accommodate cellular heterogeneity (e.g., organelles and
other compartments), will help to support and expand our kinetic model.
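To make the structure of Eq. 1 concrete, the following Python sketch advances the phosphorylated-sensor concentration by explicit finite differences on a one-dimensional grid. It is a minimal sketch under stated assumptions, not the model we will ultimately fit: the concentrations of KS, PS.sub.P, and P are held fixed as inputs, the boundaries are no-flux, and every parameter value and function name is hypothetical.

```python
def laplacian_1d(u, dx):
    """Discrete second derivative with no-flux (reflecting) boundaries."""
    n = len(u)
    lap = [0.0] * n
    for i in range(n):
        left = u[i - 1] if i > 0 else u[i]
        right = u[i + 1] if i < n - 1 else u[i]
        lap[i] = (left - 2.0 * u[i] + right) / (dx * dx)
    return lap


def step_sp(sp, ks, psp, p, d_s, kcat_k, kcat_p, k_on, dx, dt):
    """One explicit Euler step of Eq. 1 for S_P (KS, PS_P and P held fixed)."""
    if d_s * dt / (dx * dx) > 0.5:
        raise ValueError("time step too large for explicit diffusion")
    lap = laplacian_1d(sp, dx)
    return [
        sp[i] + dt * (d_s * lap[i] + kcat_k * ks[i]
                      - kcat_p * psp[i] - k_on * p[i] * sp[i])
        for i in range(len(sp))
    ]


# Hypothetical setup: 50 grid points; phosphatase inactivated in points 20-29.
n = 50
sp = [0.0] * n
ks = [0.1] * n
psp = [0.0] * n  # assumption: negligible bound complex in this toy example
p = [0.0 if 20 <= i < 30 else 1.0 for i in range(n)]
for _ in range(200):
    sp = step_sp(sp, ks, psp, p, d_s=1.0, kcat_k=1.0, kcat_p=1.0,
                 k_on=0.5, dx=1.0, dt=0.2)
```

In this toy setup, phosphorylated sensor accumulates inside the inactivated region and decays toward the surrounding level over a distance set by the ratio of diffusion to phosphatase binding. Setting d_s to zero for the phosphatase-bound species would be one way to mimic an ER-anchored enzyme.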
[0189] We hypothesize that a version of our kinetic model in which the
phosphatase diffuses freely will more accurately capture the
phosphorylation state of the sensor (at a specified time and position
from the irradiation region) in the presence of cytosolic PTP1B.sub.PS.
By contrast, a version of the model in which phosphatase does not diffuse
freely will more accurately capture the behavior of sensors in the
presence of ER-bound PTP1B.sub.PS. Regression of either model against
imaging data will enable estimation of the extent to which cytosolic and
ER-bound PTP1B contribute to changes in sensor phosphorylation over time
and space.
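One simple way to express such an estimate is sketched below in Python. It assumes, purely for illustration, that the measured signal is a linear mixture of the predictions of the freely diffusing (cytosolic) and non-diffusing (ER-bound) models, and solves for the mixing fraction by least squares. The linear-mixing assumption, the function name, and the numbers are hypothetical.

```python
def mixing_fraction(data, cyto_model, er_model):
    """Least-squares f in data ~ f*cyto_model + (1 - f)*er_model,
    clamped to the physically meaningful range [0, 1]."""
    num = sum((d - e) * (c - e)
              for d, c, e in zip(data, cyto_model, er_model))
    den = sum((c - e) ** 2 for c, e in zip(cyto_model, er_model))
    if den == 0.0:
        raise ValueError("models are identical; fraction is not identifiable")
    return min(1.0, max(0.0, num / den))


# Hypothetical model predictions at four time points, and synthetic "data".
cyto = [0.10, 0.30, 0.50, 0.60]
er = [0.10, 0.15, 0.20, 0.22]
data = [0.3 * c + 0.7 * e for c, e in zip(cyto, er)]
f_cyto = mixing_fraction(data, cyto, er)
```

With real imaging data, the residuals of the two models would also need to be compared before any mixing fraction is interpreted.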
[0190] Image analysis. The ER exists as a vesicular network that is spread
throughout the cell; inactivation of subcellular regions that are
entirely ER or entirely cytosol is difficult. To enable analysis of
spatially distinct subpopulations of PTP1B, we must, thus, estimate the
amount of ER in different regions of irradiation. The discrepancy in
length scales of ER heterogeneity (.about.20-100 .mu.m) and irradiation (.about.1-10 .mu.m)
will permit such an estimation. We will work with two metrics: (i) the
total fluorescence of labeled ER, and (ii) the anisotropy of labeled ER.
Both metrics, by facilitating estimates of the populations of cytosolic
and ER-bound PTP1B in an illuminated region, will help us to assess the
contributions of those populations to changes in sensor phosphorylation.
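The first metric can be sketched in Python as follows: the function sums the labeled-ER signal inside a circular region of irradiation. The image, region center, and radius are hypothetical, and the anisotropy metric is not covered by this sketch.

```python
def roi_total_fluorescence(image, cx, cy, radius):
    """Sum pixel intensities inside a circle centered at (cx, cy).

    image is a 2D list indexed as image[y][x]; returns (total, pixel_count).
    """
    total, count = 0.0, 0
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                total += value
                count += 1
    return total, count


# Hypothetical 5x5 image of labeled-ER fluorescence (arbitrary units).
er_image = [
    [0, 0, 1, 0, 0],
    [0, 1, 2, 1, 0],
    [1, 2, 4, 2, 1],
    [0, 1, 2, 1, 0],
    [0, 0, 1, 0, 0],
]
total, count = roi_total_fluorescence(er_image, cx=2, cy=2, radius=1)
mean_intensity = total / count
```

Dividing the total by the pixel count gives a mean intensity that can be compared across regions of different size.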
[0191] Spatial Regulation and Intracellular Signaling.
[0192] PTP1B demonstrates, by example, the value of photoswitchable
enzymes for studying spatial regulation in intracellular signaling. It is
hypothesized to inactivate receptor tyrosine kinases through (i) contacts
between endosomes and the ER.sup.37,38, (ii) contacts between the plasma
membrane and extended regions of the ER.sup.39, and (iii) direct
protein-protein interactions enabled by its partial proteolysis and
release into the cytosol.sup.34. The role of different mechanisms (or
locations) of PTP1B-substrate interaction in determining the outcomes of
those interactions is poorly understood. Evidence suggesting a
relationship between the location of PTP1B and its role in signaling has
arisen in studies of tumorigenesis. Inhibition of PTP1B can suppress
tumor growth and metastasis in breast.sup.30,40, lung.sup.3,41,
colorectal.sup.9, and prostate cancers.sup.42,43, while its upregulation
has similar effects in lymphoma.sup.3,44. Recent evidence suggests that
the former effect may result from inhibition of cytosolic PTP1B.sup.45;
the cause of the latter is unclear. At present, there are no tools to
investigate the differential influence of spatially distinct
subpopulations of PTP1B on tumor-associated signaling events within the
same cell. Photoswitchable variants of PTP1B represent such a tool.
[0193] Network biology. Signaling networks are often represented as nodes
(proteins) connected by lines (interactions).sup.46. Such maps capture
the connectivity of biochemical relay systems, but obscure spatial
context--the ability of a single interaction to occur in multiple
locations and, perhaps, to stimulate multiple signaling outcomes. This
study develops a set of tools that will enable detailed studies of the
role of spatial context in guiding the propagation of signals through
biochemical networks. These tools will aid in understanding the role of PTP1B in cell
signaling (and processes associated with tumorigenesis) and are generally
relevant to the study of any enzyme that exists in spatially distinct
subpopulations within cells.
[0194] Generalization of Approach to Protein Tyrosine Phosphatases and
Kinases.
[0195] Two observations suggest that our architecture for photocontrol
(i.e., attachment of the N-terminus of LOV2 to the C-terminal
.alpha.-helix of an enzyme) is broadly applicable to PTPs and PTKs. (i)
Structural alignments show that all PTPs possess--or, with a few
mutations, can possess--the same allosteric communication network as
PTP1B (FIG. 8A).sup.23. (ii) PTKs contain a C-terminal .alpha.-helix that
is distal to their active sites, yet capable of modulating their
catalytic activities (FIG. 8B).sup.61.
[0196] We will assess the generalizability of our approach by building
photoswitchable variants of striatal-enriched protein tyrosine
phosphatase (STEP) and protein tyrosine kinase 6 (PTK6; FIG. 8A). STEP is
a neuron-specific phosphatase that is overactive in several neurological
disorders, prominently Alzheimer's disease, schizophrenia, and drug
addiction.sup.62,63. PTK6, which may function orthogonally to PTP1B in
some signaling pathways, is expressed in approximately 70% of
triple-negative breast cancers and promotes metastasis.sup.50,64.
Photoswitchable variants of STEP and PTK6, both of which exist in
multiple spatially distinct subpopulations within cells.sup.50,62, will
enable detailed studies of their intracellular signaling roles, which
remain poorly characterized.
[0197] For STEP and PTK6, we will develop--and measure the substrate
specificities of--photoswitchable chimeras by using several kinetic
assays. For STEP, we will use assays analogous to those employed with
PTP1B. For PTK6, we will use the ADP-Glo kit developed by Promega,
Inc..sup.65. This assay, which is compatible with any peptide substrate,
converts ADP produced by PTK-catalyzed peptide phosphorylation to a
luminescent signal. For both enzymes, we will collect crystal structures
of optimal chimeras.
[0198] Exemplary photoswitch construct sequences for expression in
mammalian cells or within an operon in microbial cells. In some
embodiments, the sequences may be optimized for microbial expression.
TABLE-US-00001
PTP1B-LOV2, version 7.1(T406A): DNA sequence: SEQ ID NO: 12:
ATGGAGATGGAAAAGGAGTTCGAGCAGATCGACAAGTCCGGGAGCTGGGCGGCC
ATTTACCAGGATATCCGACATGAAGCCAGTGACTTCCCATGTAGAGTGGCCAAGCT
TCCTAAGAACAAAAACCGAAATAGGTACAGAGACGTCAGTCCCTTTGACCATAGTC
GGATTAAACTACATCAAGAAGATAATGACTATATCAACGCTAGTTTGATAAAAATGG
AAGAAGCCCAAAGGAGTTACATTCTTACCCAGGGCCCTTTGCCTAACACATGCGGT
CACTTTTGGGAGATGGTGTGGGAGCAGAAAAGCAGGGGTGTCGTCATGCTCAACA
GAGTGATGGAGAAAGGTTCGTTAAAATGCGCACAATACTGGCCACAAAAAGAAGAA
AAAGAGATGATCTTTGAAGACACAAATTTGAAATTAACATTGATCTCTGAAGATATC
AAGTCATATTATACAGTGCGACAGCTAGAATTGGAAAACCTTACAACCCAAGAAAC
TCGAGAGATCTTACATTTCCACTATACCACATGGCCTGACTTTGGAGTCCCTGAAT
CACCAGCCTCATTCTTGAACTTTCTTTTCAAAGTCCGAGAGTCAGGGTCACTCAGC
CCGGAGCACGGGCCCGTTGTGGTGCACTGCAGTGCAGGCATCGGCAGGTCTGGA
ACCTTCTGTCTGGCTGATACCTGCCTCTTGCTGATGGACAAGAGGAAAGACCCTTC
TTCCGTTGATATCAAGAAAGTGCTGTTAGAAATGAGGAAGTTTCGGATGGGGCTGA
TCCAGACAGCCGACCAGCTGCGCTTCTCCTACCTGGCTGTGATCGAAGGTGCCAA
ATTCATCATGGGGGACTCTTCCGTGCAGGATCAGTGGAAGGAGCTTTCCCACGAG
GACGCTGCTACACTTGAACGTATTGAGAAGAACTTTGTCATTACTGACCCAAGGTT
GCCAGATAATCCCATTATATTCGCGTCCGATAGTTTCTTGCAGTTGACAGAATATAG
CCGTGAAGAAATTTTGGGAAGAAACTGCAGGTTTCTACAAGGTCCTGAAACTGATC
GCGCGACAGTGAGAAAAATTAGAGATGCCATAGATAACCAAACAGAGGTCACTGTT
CAGCTGATTAATTATACAAAGAGTGGTAAAAAGTTCTGGAACCTCTTTCACTTGCAG
CCTATGCGAGATCAGAAGGGAGATGTCCAGTACTTTATTGGGGTTCAGTTGGATG
GAACTGAGCATGTCCGAGATGCTGCCGAGAGAGAGGGAGTCATGCTGATTAAGAA
AACTGCAGAAAATATTGATGAGGCGGCAAAAGAACTTCTCGAGCACCACCACCAC
CACCACTGA
Protein sequence: SEQ ID NO: 13:
MEMEKEFEQIDKSGSWAAIYQDIRHEASDFPCRVAKLPKNKNRNRYRDVSPFDHSRIKL
HQEDNDYINASLIKMEEAQRSYILTQGPLPNTCGHFWEMVWEQKSRGVVMLNRVMEK
GSLKCAQYWPQKEEKEMIFEDTNLKLTLISEDIKSYYTVRQLELENLTTQETREILHFHY
TTWPDFGVPESPASFLNFLFKVRESGSLSPEHGPVVVHCSAGIGRSGTFCLADTCLLLMD
KRKDPSSVDIKKVLLEMRKFRMGLIQTADQLRFSYLAVIEGAKFIMGDSSVQDQWKELS
HEDAATLERIEKNFVITDPRLPDNPIIFASDSFLQLTEYSREEILGRNCRFLQGPETDRATV
RKIRDAIDNQTEVTVQLINYTKSGKKFWNLFHLQPMRDQKGDVQYFIGVQLDGTEHVR
DAAEREGVMLIKKTAENIDEAAKELLEHHHHHH
PTP1B-LOV2, version 7.1(S286A): DNA sequence: SEQ ID NO: 14:
ATGGAGATGGAAAAGGAGTTCGAGCAGATCGACAAGTCCGGGAGCTGGGCGGCCAT
TTACCAGGATATCCGACATGAAGCCAGTGACTTCCCATGTAGAGTGGCCAAGCTTCC
TAAGAACAAAAACCGAAATAGGTACAGAGACGTCAGTCCCTTTGACCATAGTCGGA
TTAAACTACATCAAGAAGATAATGACTATATCAACGCTAGTTTGATAAAAATGGAA
GAAGCCCAAAGGAGTTACATTCTTACCCAGGGCCCTTTGCCTAACACATGCGGTCAC
TTTTGGGAGATGGTGTGGGAGCAGAAAAGCAGGGGTGTCGTCATGCTCAACAGAGT
GATGGAGAAAGGTTCGTTAAAATGCGCACAATACTGGCCACAAAAAGAAGAAAAA
GAGATGATCTTTGAAGACACAAATTTGAAATTAACATTGATCTCTGAAGATATCAAG
TCATATTATACAGTGCGACAGCTAGAATTGGAAAACCTTACAACCCAAGAAACTCG
AGAGATCTTACATTTCCACTATACCACATGGCCTGACTTTGGAGTCCCTGAATCACC
AGCCTCATTCTTGAACTTTCTTTTCAAAGTCCGAGAGTCAGGGTCACTCAGCCCGGA
GCACGGGCCCGTTGTGGTGCACTGCAGTGCAGGCATCGGCAGGTCTGGAACCTTCTG
TCTGGCTGATACCTGCCTCTTGCTGATGGACAAGAGGAAAGACCCTTCTTCCGTTGA
TATCAAGAAAGTGCTGTTAGAAATGAGGAAGTTTCGGATGGGGCTGATCCAGACAG
CCGACCAGCTGCGCTTCTCCTACCTGGCTGTGATCGAAGGTGCCAAATTCATCATGG
GGGACTCTGCCGTGCAGGATCAGTGGAAGGAGCTTTCCCACGAGGACGCTACTACA
CTTGAACGTATTGAGAAGAACTTTGTCATTACTGACCCAAGGTTGCCAGATAATCCC
ATTATATTCGCGTCCGATAGTTTCTTGCAGTTGACAGAATATAGCCGTGAAGAAATT
TTGGGAAGAAACTGCAGGTTTCTACAAGGTCCTGAAACTGATCGCGCGACAGTGAG
AAAAATTAGAGATGCCATAGATAACCAAACAGAGGTCACTGTTCAGCTGATTAATT
ATACAAAGAGTGGTAAAAAGTTCTGGAACCTCTTTCACTTGCAGCCTATGCGAGATC
AGAAGGGAGATGTCCAGTACTTTATTGGGGTTCAGTTGGATGGAACTGAGCATGTCC
GAGATGCTGCCGAGAGAGAGGGAGTCATGCTGATTAAGAAAACTGCAGAAAATATT
GATGAGGCGGCAAAAGAACTTCTCGAGCACCACCACCACCACCACTGA
Protein sequence: SEQ ID NO: 15:
MEMEKEFEQIDKSGSWAAIYQDIRHEASDFPCRVAKLPKNKNRNRYRDVSPFDHSRIKL
HQEDNDYINASLIKMEEAQRSYILTQGPLPNTCGHFWEMVWEQKSRGVVMLNRVMEK
GSLKCAQYWPQKEEKEMIFEDTNLKLTLISEDIKSYYTVRQLELENLTTQETREILHFHY
TTWPDFGVPESPASFLNFLFKVRESGSLSPEHGPVVVHCSAGIGRSGTFCLADTCLLLMD
KRKDPSSVDIKKVLLEMRKFRMGLIQTADQLRFSYLAVIEGAKFIMGDSAVQDQWKELS
HEDATTLERIEKNFVITDPRLPDNPIIFASDSFLQLTEYSREEILGRNCRFLQGPETDRATV
RKIRDAIDNQTEVTVQLINYTKSGKKFWNLFHLQPMRDQKGDVQYFIGVQLDGTEHVR
DAAEREGVMLIKKTAENIDEAAKELLEHHHHHH
TCPTP-LOV2, best version:
DNA sequence: SEQ ID NO: 16:
ATGCCCACCACCATCGAGCGGGAGTTCGAAGAGTTGGATACTCAGCGTCGCTGGCA
GCCGCTGTACTTGGAAATTCGAAATGAGTCCCATGACTATCCTCATAGAGTGGCCAA
GTTTCCAGAAAACAGAAATCGAAACAGATACAGAGATGTAAGCCCATATGATCACA
GTCGTGTTAAACTGCAAAATGCTGAGAATGATTATATTAATGCCAGTTTAGTTGACA
TAGAAGAGGCACAAAGGAGTTACATCTTAACACAGGGTCCACTTCCTAACACATGC
TGCCATTTCTGGCTTATGGTTTGGCAGCAGAAGACCAAAGCAGTTGTCATGCTGAAC
CGCGTGATGGAGAAAGGTTCGTTAAAATGTGCACAGTACTGGCCAACAGATGACCA
AGAGATGCTGTTTAAAGAAACAGGATTCAGTGTGAAGCTCTTGTCAGAAGATGTGA
AGTCGTATTATACAGTACATCTACTACAATTAGAAAATATCAATAGTGGTGAAACCA
GAACAATATCTCACTTTCATTATACTACCTGGCCAGATTTTGGAGTCCCTGAATCACC
AGCTTCATTTCTCAATTTCTTGTTTAAAGTGAGAGAATCTGGCTCCTTGAACCCTGAC
CATGGGCCTGCGGTGATCCACTGTAGTGCAGGCATTGGGCGCTCTGGCACCTTCTCT
CTGGTAGACACTTGTCTTTTGCTGATGGACAAGAGGAAAGACCCTTCTTCCGTTG
ATATCAAGAAAGTGCTGTTAGAAATGAGGAAGTTTCGGATGGGGCTGATCCAG
ACAGCCGACCAGCTGCGCTTCTCCTACCTGGCTGTGATCGAAGGTGCCAAATT
CATCATGGGGGACTCTTCCGTGCAGGATCAGTGGAAGGAGCTTTCCCACGAGG
ACGCTGCTACACTTGAACGTATTGAGAAGAACTTTGTCATTACTGACCCAAGGTTGC
CAGATAATCCCATTATATTCGCGTCCGATAGTTTCTTGCAGTTGACAGAATATAGCC
GTGAAGAAATTTTGGGAAGAAACTGCAGGTTTCTACAAGGTCCTGAAACTGATCGC
GCGACAGTGAGAAAAATTAGAGATGCCATAGATAACCAAACAGAGGTCACTGTTCA
GCTGATTAATTATACAAAGAGTGGTAAAAAGTTCTGGAACCTCTTTCACTTGCAGCC
TATGCGAGATCAGAAGGGAGATGTCCAGTACTTTATTGGGGTTCAGTTGGATGGAAC
TGAGCATGTCCGAGATGCTGCCGAGAGAGAGGGAGTCATGCTGATTAAGAAAACTG
CAGAAAATATTGATGAGGCGGCAAAAGAACTTCTCGAGCACCACCACCACCACCA
CTGA
The underlined letters indicate sequence from PTP1B. Protein sequence: SEQ
ID NO: 17:
MPTTIEREFEELDTQRRWQPLYLEIRNESHDYPHRVAKFPENRNRNRYRDVSPYDHSRV
KLQNAENDYINASLVDIEEAQRSYILTQGPLPNTCCHFWLMVWQQKTKAVVMLNRVM
EKGSLKCAQYWPTDDQEMLFKETGFSVKLLSEDVKSYYTVHLLQLENINSGETRTISHF
HYTTWPDFGVPESPASFLNFLFKVRESGSLNPDHGPAVIHCSAGIGRSGTFSLVDTCLLL
MDKRKDPSSVDIKKVLLEMRKFRMGLIQTADQLRFSYLAVIEGAKFIMGDSSVQDQWK
ELSHEDAATLERIEKNFVITDPRLPDNPIIFASDSFLQLTEYSREEILGRNCRFLQGPETDR
ATVRKIRDAIDNQTEVTVQLINYTKSGKKFWNLFHLQPMRDQKGDVQYFIGVQLDGTE
HVRDAAEREGVMLIKKTAENIDEAAKELLEHHHHHH
TCPTP-LOV2 V2: DNA sequence: SEQ ID NO: 18:
ATGCCCACCACCATCGAGCGGGAGTTCGAAGAGTTGGATACTCAGCGTCGCTGGCA
GCCGCTGTACTTGGAAATTCGAAATGAGTCCCATGACTATCCTCATAGAGTGGCCAA
GTTTCCAGAAAACAGAAATCGAAACAGATACAGAGATGTAAGCCCATATGATCACA
GTCGTGTTAAACTGCAAAATGCTGAGAATGATTATATTAATGCCAGTTTAGTTGACA
TAGAAGAGGCACAAAGGAGTTACATCTTAACACAGGGTCCACTTCCTAACACATGC
TGCCATTTCTGGCTTATGGTTTGGCAGCAGAAGACCAAAGCAGTTGTCATGCTGAAC
CGCATTGTGGAGAAAGAATCGGTTAAATGTGCACAGTACTGGCCAACAGATGACCA
AGAGATGCTGTTTAAAGAAACAGGATTCAGTGTGAAGCTCTTGTCAGAAGATGTGA
AGTCGTATTATACAGTACATCTACTACAATTAGAAAATATCAATAGTGGTGAAACCA
GAACAATATCTCACTTTCATTATACTACCTGGCCAGATTTTGGAGTCCCTGAATCACC
AGCTTCATTTCTCAATTTCTTGTTTAAAGTGAGAGAATCTGGCTCCTTGAACCCTGAC
CATGGGCCTGCGGTGATCCACTGTAGTGCAGGCATTGGGCGCTCTGGCACCTTCTCT
CTGGTAGACACTTGTCTTTTGCTGATGGACAAGAGGAAAGACCCTTCTTCCGTTGAT
ATCAAGAAAGTGCTGTTAGAAATGAGGAAGTTTCGGATGGGGCTGATCCAGACAGC
CGACCAGCTGCGCTTCTCCTACCTGGCTGTGATCGAAGGTGCCAAATTCATCATGGG
GGACTCTTCCGTGCAGGATCAGTGGAAGGAGCTTTCCCACGAGGACGCTGCTACACT
TGAACGTATTGAGAAGAACTTTGTCATTACTGACCCAAGGTTGCCAGATAATCCCAT
TATATTCGCGTCCGATAGTTTCTTGCAGTTGACAGAATATAGCCGTGAAGAAATTTT
GGGAAGAAACTGCAGGTTTCTACAAGGTCCTGAAACTGATCGCGCGACAGTGAGAA
AAATTAGAGATGCCATAGATAACCAAACAGAGGTCACTGTTCAGCTGATTAATTATA
CAAAGAGTGGTAAAAAGTTCTGGAACCTCTTTCACTTGCAGCCTATGCGAGATCAGA
AGGGAGATGTCCAGTACTTTATTGGGGTTCAGTTGGATGGAACTGAGCATGTCCGAG
ATGCTGCCGAGAGAGAGGGAGTCATGCTGATTAAGAAAACTGCAGAAAATATTGAT
GAGGCGGCAAAAGAACTTCTCGAGCACCACCACCACCACCACTGA
Protein sequence: SEQ ID NO: 19:
MPTTIEREFEELDTQRRWQPLYLEIRNESHDYPHRVAKFPENRNRNRYRDVSPYDHSRV
KLQNAENDYINASLVDIEEAQRSYILTQGPLPNTCCHFWLMVWQQKTKAVVMLNRIVE
KESVKCAQYWPTDDQEMLFKETGFSVKLLSEDVKSYYTVHLLQLENINSGETRTISHFH
YTTWPDFGVPESPASFLNFLFKVRESGSLNPDHGPAVIHCSAGIGRSGTFSLVDTCLLLM
DKRKDPSSVDIKKVLLEMRKFRMGLIQTADQLRFSYLAVIEGAKFIMGDSSVQDQWKEL
SHEDAATLERIEKNFVITDPRLPDNPIIFASDSFLQLTEYSREEILGRNCRFLQGPETDRAT
VRKIRDAIDNQTEVTVQLINYTKSGKKFWNLFHLQPMRDQKGDVQYFIGVQLDGTEHV
RDAAEREGVMLIKKTAENIDEAAKELLEHHHHHH
FRET sensors. Förster resonance energy transfer (FRET) is contemplated
for use in monitoring the activity of PTP1B in living cells. Our preliminary sensor exhibits
a 20% reduction in FRET signal when treated with Src kinase (FIG. 21B).
Previous imaging studies indicate that a 20% change in FRET is sufficient
to monitor intracellular kinase activity.sup.54-56. To enhance spatial
resolution in imaging studies, we will attempt to optimize our sensor
further (and use it to measure the activity of PTP1B in vitro). Exemplary
FRET sensors: underlined mClover3-SH2-Linker-Bold Substrate--underlined
and Bold mRuby3.
TABLE-US-00002
mClover3-mRuby3: DNA sequence: SEQ ID NO: 20:
ATGCATCATCATCATCATCATGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGG
TGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTCCGC
GGCGAGGGCGAGGGCGATGCCACCAACGGCAAGCTGACCCTGAAGTTCATCTGCAC
CACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCTTCGGCTACGGCGT
GGCCTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGC
CATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTCTTTCAAGGACGACGGTACCT
ACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAG
CTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTA
CAACTTCAACAGCCACTACGTCTATATCACGGCCGACAAGCAGAAGAACTGCATCA
AGGCTAACTTCAAGATCCGCCACAACGTTGAGGACGGCAGCGTGCAGCTCGCCGAC
CACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCA
CTACCTGAGCCATCAGTCCAAGCTGAGCAAAGACCCCAACGAGAAGCGCGATCACA
TGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATTACACATGGCATGGACGAGCTGT
ACAAGTGGTATTTTGGGAAGATCACTCGTCGGGAGTCCGAGCGGCTGCTGCTCAACC
CCGAAAACCCCCGGGGAACCTTCTTGGTCCGGGAGAGCGAGACGACAAAAGGTGCC
TATTGCCTCTCCGTTTCTGACTTTGACAACGCCAAGGGGCTCAATGTGAAGCACTAC
AAGATCCGCAAGCTGGACAGCGGCGGCTTCTACATCACCTCACGCACACAGTTCAG
CAGCCTGCAGCAGCTGGTGGCCTACTACTCCAAACATGCTGATGGCTTGTGCCACCG
CCTGACTAACGTCTGTGGGTCTACATCTGGATCTGGGAAGCCGGGTTCTGGTGAGGG
TTCTTGGATGGAGGACTATGACTACGTCCACCTACAGGGGGAGCTCGTGTCTAA
GGGCGAAGAGCTGATCAAGGAAAATATGCGTATGAAGGTGGTCATGGAAGGTT
CGGTCAACGGCCACCAATTCAAATGCACAGGTGAAGGAGAAGGCAGACCGTAC
GAGGGAACTCAAACCATGAGGATCAAAGTCATCGAGGGAGGACCCCTGCCATT
TGCCTTTGACATTCTTGCCACGTCGTTCATGTATGGCAGCCGTACTTTTATCAA
GTACCCGGCCGACATCCCTGATTTCTTTAAACAGTCCTTTCCTGAGGGTTTTAC
TTGGGAAAGAGTTACGAGATACGAAGATGGTGGAGTCGTCACCGTCACGCAGG
ACACCAGCCTTGAGGATGGCGAGCTCGTCTACAACGTCAAGGTCAGAGGGGTA
AACTTTCCCTCCAATGGTCCCGTGATGCAGAAGAAGACCAAGGGTTGGGAGCC
TAATACAGAGATGATGTATCCAGCAGATGGTGGTCTGAGAGGATACACTGACA
TCGCACTGAAAGTTGATGGTGGTGGCCATCTGCACTGCAACTTCGTGACAACTT
ACAGGTCAAAAAAGACCGTCGGGAACATCAAGATGCCCGGTGTCCATGCCGTT
GATCACCGCCTGGAAAGGATCGAGGAGAGTGACAATGAAACCTACGTAGTGCA
ACGCGAAGTGGCAGTTGCCAAATACAGCAACCTTGGTGGTGGCATGGACGAGC
TGTACAAGTAA
Protein sequence: SEQ ID NO: 21:
MHHHHHHVSKGEELFTGVVPILVELDGDVNGHKFSVRGEGEGDATNGKLTLKFICTTG
KLPVPWPTLVTTFGYGVACFSRYPDHMKQHDFFKSAMPEGYVQERTISFKDDGTYKTR
AEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNFNSHYVYITADKQKNCIKANFKIRH
NVEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSHQSKLSKDPNEKRDHMVLLEFVTAA
GITHGMDELYKWYFGKITRRESERLLLNPENPRGTFLVRESETTKGAYCLSVSDFDNAK
GLNVKHYKIRKLDSGGFYITSRTQFSSLQQLVAYYSKHADGLCHRLTNVCGSTSGSGKP
GSGEGSWMEDYDYVHLQGELVSKGEELIKENMRMKVVMEGSVNGHQFKCTGEGE
GRPYEGTQTMRIKVIEGGPLPFAFDILATSFMYGSRTFIKYPADIPDFFKQSFPEGFT
WERVTRYEDGGVVTVTQDTSLEDGELVYNVKVRGVNFPSNGPVMQKKTKGWEP
NTEMMYPADGGLRGYTDIALKVDGGGHLHCNFVTTYRSKKTVGNIKMPGVHAVD
HRLERIEESDNETYVVQREVAVAKYSNLGGGMDELYK
Exemplary mammalian expression vector(s) for expressing a photoswitch
construct in a mammalian cell. For insertion into a mammalian expression
vector, e.g. lentiviral vector, pAcGFP1-C1 (Clontech): PTP1B-LOV2
(above); a promoter, e.g. CMV:
GCAAATGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTGGTTTAGTGAACCGTCAGATC
(SEQ ID NO: 22); an RBS, e.g. Kozak consensus translation initiation site:
GCCACCATG; an intergenic spacer, e.g. P2A: DNA sequence:
GGCAGCGGCGCCACCAACTTCTCCCTGCTGAAGCAGGCCGGCGACGTGGAGGAGAACCCCGGCCCC (SEQ
ID NO: 23); protein sequence: GSGATNFSLLKQAGDVEENPGP (SEQ ID NO: 24);
etc. An exemplary FRET sensor included: a promoter: same as above; an RBS:
same as above; etc.
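The correspondence between a listed DNA sequence and its protein sequence can be checked by translation with the standard genetic code. The Python sketch below does this for the P2A spacer (SEQ ID NO: 23 and SEQ ID NO: 24); the helper name translate is ours and is not part of any cited tool.

```python
# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA coding sequence, stopping at the first stop codon.

    Whitespace (e.g., from line wrapping) is removed before translation.
    """
    seq = "".join(dna.split()).upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    protein = []
    for i in range(0, len(seq), 3):
        residue = CODON_TABLE[seq[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


P2A_DNA = ("GGCAGCGGCGCCACCAACTTCTCCCTGCTGAAGCAGGCCGGCGACGTGGAGGAGAA"
           "CCCCGGCCCC")
p2a_protein = translate(P2A_DNA)
```

The same function can be applied to the longer construct sequences above, which end in a TGA stop codon after the hexahistidine tag.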
[0199] Exemplary FRET sensors are contemplated to avoid overlap between
the excitation/emission wavelengths of LOV2 (455/495 nm; we note that LOV2
is only weakly fluorescent.sup.70) and our FRET pair (505/515 nm for Clover
and 560/605 nm for mRuby2). While we still expect to see some crosstalk
during imaging, previous three-color imaging studies.sup.71 suggest that
it will not interfere with our ability to carry out the experiments
described in this section.
[0200] Contemplative Embodiments Include but are not Limited to Invadopodia
Formation and EGFR Regulation.
[0201] A photoswitchable variant of PTP1B is contemplated to determine if
cytosolic PTP1B, released from the ER by proteolysis, is exclusively
responsible for regulating the formation of invadopodia, or if ER-bound
PTP1B can function similarly. Cancer cell invasion and metastasis are
facilitated by the formation of invadopodia, actin-rich protrusions that
enable matrix degradation.sup.45.
[0202] Both PTP1B and PTK6 regulate epidermal growth factor receptor
(EGFR), a regulator of cell proliferation and migration that exhibits
aberrant activity in numerous cancers and inflammatory
diseases.sup.51,76. We will use a variant of PTP1B stimulated by red
light and a variant of PTK6 stimulated by blue light (or vice versa) to
carry out a combinatorial analysis of the cooperative contribution of
PTP1B and PTK6 to EGFR regulation within different regions of the cell.
REFERENCES FOR SECTIONS I, II, AND V ARE LISTED BELOW AND HEREIN
INCORPORATED BY REFERENCE
[0203] 1. Wray, J., Kalkan, T., Gomez-Lopez, S., Eckardt, D., Cook, A.,
Kemler, R. & Smith, A. Inhibition of glycogen synthase kinase-3
alleviates Tcf3 repression of the pluripotency network and increases
embryonic stem cell resistance to differentiation. Nat. Cell Biol. 13,
838-45 (2011). [0204] 2. Wu, Y. I., Frey, D., Lungu, O. I., Jaehrig, A.,
Schlichting, I., Kuhlman, B. & Hahn, K. M. A genetically encoded
photoactivatable Rac controls the motility of living cells. Nature 461,
104-108 (2009). [0205] 3. Liu, H., Wu, Y., Zhu, S., Liang, W., Wang, Z.,
Wang, Y., Lv, T., Yao, Y., Yuan, D. & Song, Y. PTP1B promotes cell
proliferation and metastasis through activating src and ERK1/2 in
non-small cell lung cancer. Cancer Lett. 359, 218-225 (2015). [0206] 4.
Danial, N. N. & Korsmeyer, S. J. Cell Death: Critical Control Points.
Cell 116, 205-219 (2004). [0207] 5. Johnson, T. O., Ermolieff, J. &
Jirousek, M. R. Protein tyrosine phosphatase 1B inhibitors for diabetes.
Nat. Rev. Drug Discov. 1, 696-709 (2002). [0208] 6. Koren, S. & Fantus,
I. G. Inhibition of the protein tyrosine phosphatase PTP1B: potential
therapy for obesity, insulin resistance and type-2 diabetes mellitus.
Best Pract. Res. Clin. Endocrinol. Metab. 21, 621-640 (2007). [0209] 7.
Pike, K. A., Hutchins, A. P., Vinette, V., Theberge, J.-F., Sabbagh, L.,
Tremblay, M. L. & Miranda-Saavedra, D. Protein tyrosine phosphatase 1B is
a regulator of the interleukin-10-induced transcriptional program in
macrophages. Sci. Signal. 7, ra43 (2014). [0210] 8. Rhee, I. & Veillette,
A. Protein tyrosine phosphatases in lymphocyte activation and
autoimmunity. Nat. Immunol. 13, 439-447 (2012). [0211] 9. Zhu, S.,
Bjorge, J. D. & Fujita, D. J. PTP1B contributes to the oncogenic
properties of colon cancer cells through Src activation. Cancer Res. 67,
10129-10137 (2007). [0212] 10. Volinsky, N. & Kholodenko, B. N.
Complexity of receptor tyrosine kinase signal processing. Cold Spring
Harb. Perspect. Biol. 5, (2013). [0213] 11. Kennedy, M. B.
Signal-Processing Machines at the Postsynaptic Density. Science. 290,
750-754 (2000). [0214] 12. Lee, H. K., Takamiya, K., Han, J. S., Man, H.,
Kim, C. H., Rumbaugh, G., Yu, S., Ding, L., He, C., Petralia, R. S.,
Wenthold, R. J., Gallagher, M. & Huganir, R. L. Phosphorylation of the
AMPA receptor GluR1 subunit is required for synaptic plasticity and
retention of spatial memory. Cell 112, 631-643 (2003). [0215] 13. Bence,
K. K., Delibegovic, M., Xue, B., Gorgun, C. Z., Hotamisligil, G. S.,
Neel, B. G. & Kahn, B. B. Neuronal PTP1B regulates body weight, adiposity
and leptin action. Nat. Med. 12, 917-24 (2006). [0216] 14. Wu, P.,
Nielsen, T. E. & Clausen, M. H. FDA-approved small-molecule kinase
inhibitors. Trends Pharmacol. Sci. 36, 422-439 (2015). [0217] 15. Repina,
N. A., Rosenbloom, A., Mukherjee, A., Schaffer, D. V. & Kane, R. S. At
Light Speed: Advances in Optogenetic Systems for Regulating Cell
Signaling and Behavior. Annu. Rev. Chem. Biomol. Eng. 8, 13-39 (2017).
[0218] 16. Gautier, A., Gauron, C., Volovitch, M., Bensimon, D., Jullien,
L. & Vriz, S. How to control proteins with light in living systems. Nat.
Chem. Biol. 10, 533-41 (2014). [0219] 17. Krauss, U., Lee, J., Benkovic,
S. J. & Jaeger, K. E. LOVely enzymes--Towards engineering
light-controllable biocatalysts. Microb. Biotechnol. 3, 15-23 (2010).
[0220] 18. Dagliyan, O., Tarnawski, M., Chu, P.-H., Shirvanyants, D.,
Schlichting, I., Dokholyan, N. V. & Hahn, K. M. Engineering extrinsic
disorder to control protein activity in living cells. Science. 354,
1441-1444 (2016). [0221] 19. Zhou, X. X., Fan, L. Z., Li, P., Shen, K. &
Lin, M. Z. Optical control of cell signaling by single-chain
photoswitchable kinases. Science. 355, 836-842 (2017). [0222] 20.
Lukyanov, K. A., Chudakov, D. M., Lukyanov, S. & Verkhusha, V. V.
Photoactivatable fluorescent proteins. Nat. Rev. Mol. Cell Biol. 6,
885-890 (2005). [0223] 21. Rodriguez, E. A., Campbell, R. E., Lin, J. Y.,
Lin, M. Z., Miyawaki, A., Palmer, A. E., Shu, X., Zhang, J. & Tsien, R.
Y. The Growing and Glowing Toolbox of Fluorescent and Photoactive
Proteins. Trends Biochem. Sci. 42, 111-129 (2017). [0224] 22. Lessard,
L., Stuible, M. & Tremblay, M. L. The two faces of PTP1B in cancer.
Biochim. Biophys. Acta-Proteins Proteomics 1804, 613-619 (2010). [0225]
23. Barr, A. J., Ugochukwu, E., Lee, W. H., King, O. N. F.,
Filippakopoulos, P., Alfano, I., Savitsky, P., Burgess-Brown, N. A.,
Muller, S. & Knapp, S. Large-Scale Structural Analysis of the Classical
Human Protein Tyrosine Phosphatome. Cell 136, 352-363 (2009). [0226] 24.
Hubbard, S. R. & Till, J. H. Protein tyrosine kinase structure and
function. Annu. Rev. Biochem. 69, 373-398 (2000). [0227] 25. Zayner, J.
P., Antoniou, C. & Sosnick, T. R. The amino-terminal helix modulates
light-activated conformational changes in AsLOV2. J. Mol. Biol. 419,
61-74 (2012). [0228] 26. Peter, E., Dick, B. & Baeurle, S. A. Mechanism
of signal transduction of the LOV2-J.alpha. photosensor from Avena sativa. Nat.
Commun. 1, 122 (2010). [0229] 27. Kaberniuk, A. A., Shemetov, A. A. &
Verkhusha, V. V. A bacterial phytochrome-based optogenetic system
controllable with near-infrared light. Nat. Methods 13, 1-15 (2016).
[0230] 28. Auldridge, M. E. & Forest, K. T. Bacterial phytochromes: more
than meets the light. Crit. Rev. Biochem. Mol. Biol. 46, 67-88 (2011).
[0231] 29. Anderie, I., Schulz, I. & Schmid, A. Characterization of the
C-terminal ER membrane anchor of PTP1B. Exp. Cell Res. 313, 3189-3197
(2007). [0232] 30. Tonks, N. K. & Muthuswamy, S. K. A Brake Becomes an
Accelerator: PTP1B-A New Therapeutic Target for Breast Cancer. Cancer
Cell 11, 214-216 (2007). [0233] 31. Krishnan, N. & Tonks, N. K. Anxious
moments for the protein tyrosine phosphatase PTP1B. Trends Neurosci. 38,
462-465 (2015). [0234] 32. Traves, P. G., Pardo, V.,
Pimentel-Santillana, M., Gonzalez-Rodriguez, A., Mojena, M., Rico, D.,
Montenegro, Y., Cales, C., Martin-Sanz, P., Valverde, A. M. & Bosca, L.
Pivotal role of protein tyrosine phosphatase 1B (PTP1B) in the macrophage
response to proinflammatory and anti-inflammatory challenge. Cell Death
Dis. 5, e1125 (2014). [0235] 33. Matulka, K., Lin, H. H., Hribkova, H.,
Uwanogho, D., Dvorak, P. & Sun, Y. M. PTP1B is an effector of activin
signaling and regulates neural specification of embryonic stem cells.
Cell Stem Cell 13, 706-719 (2013). [0236] 34. Cortesio, C. L., Chan, K.
T., Perrin, B. J., Burton, N. O., Zhang, S., Zhang, Z. Y. & Huttenlocher,
A. Calpain 2 and PTP1B function in a novel pathway with Src to regulate
invadopodia dynamics and breast cancer cell invasion. J. Cell Biol. 180,
957-971 (2008). [0237] 35. Wiesmann, C., Barr, K. J., Kung, J., Zhu, J.,
Erlanson, D. A., Shen, W., Fahr, B. J., Zhong, M., Taylor, L., Randal,
M., McDowell, R. S. & Hansen, S. K. Allosteric inhibition of protein
tyrosine phosphatase 1B. Nat. Struct. Mol. Biol. 11, 730-737 (2004).
[0238] 36. Alonso, A., Sasin, J., Bottini, N., Friedberg, I., Friedberg,
I., Osterman, A., Godzik, A., Hunter, T., Dixon, J. & Mustelin, T.
Protein tyrosine phosphatases in the human genome. Cell 117, 699-711
(2004). [0239] 37. Haj, F. G., Verveer, P. J., Squire, A., Neel, B. G. &
Bastiaens, P. I. H. Imaging sites of receptor dephosphorylation by PTP1B
on the surface of the endoplasmic reticulum. Science 295, 1708-1711
(2002). [0240] 38. Romsicki, Y., Reece, M., Gauthier, J. Y.,
Asante-Appiah, E. & Kennedy, B. P. Protein Tyrosine Phosphatase-1B
Dephosphorylation of the Insulin Receptor Occurs in a Perinuclear
Endosome Compartment in Human Embryonic Kidney 293 Cells. J. Biol. Chem.
279, 12868-12875 (2004). [0241] 39. Haj, F. G., Sabet, O., Kinkhabwala,
A., Wimmer-Kleikamp, S., Roukos, V., Han, H. M., Grabenbauer, M.,
Bierbaum, M., Antony, C., Neel, B. G. & Bastiaens, P. I. Regulation of
signaling at regions of cell-cell contact by endoplasmic reticulum-bound
protein-tyrosine phosphatase 1B. PLoS One 7, (2012). [0242] 40. Soysal,
S., Obermann, E. C., Gao, F., Oertli, D., Gillanders, W. E., Viehl, C. T.
& Muenst, S. PTP1B expression is an independent positive prognostic
factor in human breast cancer. Breast Cancer Res. Treat. 137, 637-644
(2013). [0243] 41. Zhang, S. & Zhang, Z. Y. PTP1B as a drug target:
recent developments in PTP1B inhibitor discovery. Drug Discov. Today 12,
373-381 (2007). [0244] 42. Wu, C., Zhang, L., Bourne, P. A., Reeder, J. E.,
Di Sant'Agnese, P. A., Yao, J. L., Na, Y. & Huang, J. Protein
tyrosine phosphatase PTP1B is involved in neuroendocrine differentiation
of prostate cancer. Prostate 66, 1124-1135 (2006). [0245] 43. Lessard, L.,
Labbe, D. P., Deblois, G., Begin, L., Hardy, S., Mes-Masson, A., Saad, F.,
Trotman, L., Giguere, V. & Tremblay, M. PTP1B is an androgen
receptor-regulated phosphatase that promotes the progression of prostate
cancer. Cancer Res. 72, 1529-1537 (2012). [0246] 44. Dube, N., Bourdeau,
A., Heinonen, K. M., Cheng, A., Loy, A. L. & Tremblay, M. L. Genetic
ablation of protein tyrosine phosphatase 1B accelerates lymphomagenesis
of p53-null mice through the regulation of B-cell development. Cancer
Res. 65, 10088-10095 (2005). [0247] 45. Weaver, A. M. Invadopodia:
Specialized cell structures for cancer invasion. Clin. Exp. Metastasis
23, 97-105 (2006). [0248] 46. Cui, Q., Ma, Y., Jaramillo, M., Bari, H.,
Awan, A., Yang, S., Zhang, S., Liu, L., Lu, M., O'Connor-McCourt, M.,
Purisima, E. O. & Wang, E. A map of human cancer signaling. Mol. Syst.
Biol. 3, 152 (2007). [0249] 47. Repina, N. A., Rosenbloom, A., Mukherjee,
A., Schaffer, D. V. & Kane, R. S. At Light Speed: Advances in Optogenetic
Systems for Regulating Cell Signaling and Behavior. Annu. Rev. Chem.
Biomol. Eng. 8, 13-39 (2017). [0250] 48. Lee, J., Natarajan, M., Nashine,
V. C., Socolich, M., Vo, T., Russ, W. P., Benkovic, S. J. & Ranganathan,
R. Surface sites for engineering allosteric control in proteins. Science
322, 438-442 (2008). [0251] 49. Qin, Z., Zhou, X., Pandey, N. R.,
Vecchiarelli, H. A., Stewart, C. A., Zhang, X., Lagace, D. C., Brunel, J.
M., Beique, J. C., Stewart, A. F. R., Hill, M. N. & Chen, H. H. Chronic
Stress Induces Anxiety via an Amygdalar Intracellular Cascade that
Impairs Endocannabinoid Signaling. Neuron 85, 1319-1331 (2015). [0252]
50. Fan, G., Lin, G., Lucito, R. & Tonks, N. K. Protein-tyrosine
phosphatase 1B antagonized signaling by insulin-like growth factor-1
receptor and kinase BRK/PTK6 in ovarian cancer cells. J. Biol. Chem. 288,
24923-34 (2013). [0253] 51. Eden, E. R., White, I. J., Tsapara, A. &
Futter, C. E. Membrane contacts between endosomes and ER provide sites
for PTP1B-epidermal growth factor receptor interaction. Nat. Cell Biol.
12, 267-72 (2010). [0254] 52. Arregui, C. O., Gonzalez, A., Burdisso, J.
E. & Gonzalez Wusener, A. E. Protein tyrosine phosphatase PTP1B in cell
adhesion and migration. Cell Adh. Migr. 7, 418-423 (2013). [0255] 53.
Badran, A. H., Guzov, V. M., Huai, Q., Kemp, M. M., Vishwanath, P., Kain,
W., Nance, A. M., Evdokimov, A., Moshiri, F., Turner, K. H., Wang, P.,
Malvar, T. & Liu, D. R. Continuous evolution of Bacillus thuringiensis
toxins overcomes insect resistance. Nature 533, 58-63 (2016). [0256] 54.
Sato, M. & Umezawa, Y. in Cell Biol. Four-Volume Set 2, 325-328 (2006).
[0257] 55. Ouyang, M., Sun, J., Chien, S. & Wang, Y. Determination of
hierarchical relationship of Src and Rac at subcellular locations with
FRET biosensors. Proc Natl Acad Sci USA 105, 14353-14358 (2008). [0258]
56. Ting, A. Y., Kain, K. H., Klemke, R. L. & Tsien, R. Y. Genetically
encoded fluorescent reporters of protein tyrosine kinase activities in
living cells. Proc. Natl. Acad. Sci. U.S.A 98, 15003-15008 (2001). [0259]
57. Wiesmann, C., Barr, K. J., Kung, J., Zhu, J., Erlanson, D. A., Shen,
W., Fahr, B. J., Zhong, M., Taylor, L., Randal, M., McDowell, R. S. &
Hansen, S. K. Allosteric inhibition of protein tyrosine phosphatase 1B.
Nat. Struct. Mol. Biol. 11, 730-737 (2004). [0260] 58. Halavaty, A. S. &
Moffat, K. N- and C-terminal flanking regions modulate light-induced
signal transduction in the LOV2 domain of the blue light sensor
phototropin 1 from Avena sativa. Biochemistry 46, 14001-14009 (2007).
[0261] 59. Yao, X., Rosen, M. K. & Gardner, K. H. Estimation of the
available free energy in a LOV2-Jα photoswitch. Nat. Chem. Biol. 4,
491-497 (2008). [0262] 60. Choy, M. S., Li, Y., Machado, L. E. S. F.,
Kunze, M. B. A., Connors, C. R., Wei, X., Lindorff-Larsen, K., Page, R. &
Peti, W. Conformational Rigidity and Protein Dynamics at Distinct
Timescales Regulate PTP1B Activity and Allostery. Mol. Cell 65, 644-658
(2017). [0263] 61. Hubbard, S. R. & Till, J. H. Protein tyrosine kinase
structure and function. Annu. Rev. Biochem. 69, 373-398 (2000). [0264]
62. Zhang, Y., Kurup, P., Xu, J., Carty, N., Fernandez, S. M., Nygaard,
H. B., Pittenger, C., Greengard, P., Strittmatter, S. M., Nairn, A. C. &
Lombroso, P. J. Genetic reduction of striatal-enriched tyrosine
phosphatase (STEP) reverses cognitive and cellular deficits in an
Alzheimer's disease mouse model. Proc. Natl. Acad. Sci. 107, 19014-19019
(2010). [0265] 63. He, R., Yu, Z., Zhang, R. & Zhang, Z. Protein tyrosine
phosphatases as potential therapeutic targets. Acta Pharmacol. Sin. 35,
1227-1246 (2014). [0266] 64. Ito, K., Park, S. H., Nayak, A., Byerly, J.
H. & Irie, H. Y. PTK6 inhibition suppresses metastases of triple-negative
breast cancer via SNAIL-dependent E-cadherin regulation. Cancer Res. 76,
4406-4417 (2016). [0267] 65. Zegzouti, H., Zdanovskaia, M., Hsiao, K. &
Goueli, S. A. ADP-Glo: A Bioluminescent and Homogeneous ADP Monitoring
Assay for Kinases. Assay Drug Dev. Technol. 7, 560-572 (2009). [0268] 66.
Strickland, D., Yao, X., Gawlak, G., Rosen, M. K., Gardner, K. H. &
Sosnick, T. R. Rationally improving LOV domain-based photoswitches. Nat.
Methods 7, 623-6 (2010). [0269] 67. Cosentino, C., Alberio, L.,
Gazzarrini, S., Aquila, M., Romano, E., Cermenati, S., Zuccolini, P.,
Petersen, J., Beltrame, M., Van Etten, J. L., Christie, J. M., Thiel, G.
& Moroni, A. Engineering of a light-gated potassium channel. Science.
348, 707-710 (2015). [0270] 68. Krishnan, N., Koveal, D., Miller, D. H.,
Xue, B., Akshinthala, S. D., Kragelj, J., Jensen, M. R., Gauss, C.-M.,
Page, R., Blackledge, M., Muthuswamy, S. K., Peti, W. & Tonks, N. K.
Targeting the disordered C terminus of PTP1B with an allosteric
inhibitor. Nat. Chem. Biol. 10, 558-566 (2014). [0271] 69. Piserchio, A.,
Cowburn, D. & Ghose, R. Expression and purification of Src-family kinases
for solution NMR studies. Methods Mol. Biol. 831, 111-131 (2012). [0272]
70. Van Stokkum, I. H. M., Gauden, M., Crosson, S., Van Grondelle, R.,
Moffat, K. & Kennis, J. T. M. The primary photophysics of the Avena
sativa phototropin 1 LOV2 domain observed with time-resolved emission
spectroscopy. Photochem. Photobiol. 87, 534-541 (2011). [0273] 71.
Rowland, A. A., Chitwood, P. J., Phillips, M. J. & Voeltz, G. K. ER
contact sites define the position and timing of endosome fission.
Cell 159, 1027-1041 (2014). [0274] 72. Fehr, M., Lalonde, S., Lager, I.,
Wolff, M. W.
& Frommer, W. B. In vivo imaging of the dynamics of glucose uptake in the
cytosol of COS-7 cells by fluorescent nanosensors. J. Biol. Chem. 278,
19127-19133 (2003). [0275] 73. Felgner, P. L., Gadek, T. R., Holm, M.,
Roman, R., Chan, H. W., Wenz, M., Northrop, J. P., Ringold, G. M. &
Danielsen, M. Lipofection: a highly efficient, lipid-mediated
DNA-transfection procedure. Proc. Natl. Acad. Sci. U.S.A 84, 7413-7
(1987). [0276] 74. Gil-Parrado, S., Fernandez-Montalvan, A.,
Assfalg-Machleidt, I., Popp, O., Bestvater, F., Holloschi, A., Knoch, T.
A., Auerswald, E. A., Welsh, K., Reed, J. C., Fritz, H., Fuentes-Prior,
P., Spiess, E., Salvesen, G. S. & Machleidt, W. Ionomycin-activated
calpain triggers apoptosis. A probable role for Bcl-2 family members. J.
Biol. Chem. 277, 27217-27226 (2002). [0277] 75. Faeder, J. R., Blinov, M.
L. & Hlavacek, W. S. Rule-based modeling of biochemical systems with
BioNetGen. Methods Mol. Biol. 500, 113-167 (2009). [0278] 76. Tiganis,
T., Bennett, A. M., Ravichandran, K. S. & Tonks, N. K. Epidermal growth
factor receptor and the adaptor protein p52Shc are specific substrates of
T-cell protein tyrosine phosphatase. Mol. Cell. Biol. 18, 1622-34 (1998).
III. Genetically Encoded System for Constructing and Detecting
Biologically Active Agents: Microbial Inhibitor Screening Systems.
[0279] Several types of operons were developed as described herein, each
for a specific purpose, including but not limited to testing small
molecules for their ability to inhibit, activate, or otherwise modulate a
chosen PTP and/or PTK; operons for testing intracellularly provided small
molecules for inhibiting, activating, or modulating effects on a chosen
PTP and/or PTK; and evolving one or more proteins or small molecules of
interest. More specifically, genetic operons were contemplated for
insertion, using transfection and breeding techniques well known in the
art, for providing microbial cells wherein the activity of an enzyme of
interest (e.g., protein tyrosine phosphatase 1B, a drug target for the
treatment of diabetes, obesity, and cancer) is linked to (i) cellular
luminescence, (ii) cellular fluorescence, or (iii) cellular growth. In
some embodiments, such operons are modified for use in detecting and/or
evolving biologically active metabolites. When modified and/or induced to
build various metabolites, the cell will be used for detection of
metabolites that inhibit/activate a protein of interest (e.g., PTP1B).
[0280] These operons allow operon-containing microbial cells to be used to
carry out the following tasks: Detecting biologically active molecules
and non-native biologically active metabolites. When grown in the
presence of a biologically active molecule, such as a small molecule that
is both (i) cell permeable and (ii) capable of inhibiting a protein of
interest (e.g., PTP1B), the cell will enable detection of that molecule.
That is, it will exhibit a concentration-dependent response in
luminescence, fluorescence, or growth. Many non-native biologically
active metabolites have useful pharmaceutical properties. Examples
include paclitaxel and artemisinin, plant-derived terpenoids that are
used to treat cancer and malaria, respectively. When the metabolic
pathways responsible for making such natural metabolites are installed
into microbial cells that also contain our operon, those cells will
enable detection of interesting metabolite-based biological activities
(e.g., the ability to inhibit PTP1B).
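As an illustration of the concentration-dependent response described
above, the following minimal sketch models reporter output (luminescence,
fluorescence, or growth) as a Hill-type function of inhibitor
concentration. The Hill form, the function name, and every parameter value
are assumptions chosen for illustration; none is taken from the operons
described herein.

```python
def reporter_signal(conc_um, baseline, max_signal, ec50_um, hill=1.0):
    """Hypothetical Hill-type dose-response for a reporter strain.

    conc_um    -- inhibitor concentration in the medium (micromolar)
    baseline   -- reporter output with no inhibitor present
    max_signal -- reporter output at saturating inhibitor
    ec50_um    -- concentration giving a half-maximal response (assumed)
    hill       -- Hill coefficient (assumed; 1.0 means non-cooperative)
    """
    if conc_um < 0:
        raise ValueError("concentration must be non-negative")
    # Fraction of the maximal response reached at this concentration.
    fraction = conc_um ** hill / (ec50_um ** hill + conc_um ** hill)
    return baseline + (max_signal - baseline) * fraction


if __name__ == "__main__":
    # Illustrative values only: arbitrary signal units, EC50 of 10 uM.
    for conc in (0.0, 1.0, 10.0, 100.0, 1000.0):
        print(conc, round(reporter_signal(conc, 100.0, 1000.0, 10.0), 1))
```

In practice, a curve of this kind would be fitted to measured signals to
estimate the half-maximal concentration; the sketch only shows the shape
of the expected relationship.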
[0281] Genetic operons that, when installed into microbial cells, link the
activity of an enzyme of interest (e.g., protein tyrosine phosphatase 1B,
a drug target for the treatment of diabetes, obesity, and cancer) to (i)
cellular luminescence, (ii) cellular fluorescence, or (iii) cellular
growth.
[0282] Detect and/or evolve biologically active metabolites. When modified
and/or induced to build various metabolites, the cell will enable
detection of metabolites that inhibit/activate a protein of interest
(e.g., PTP1B).
[0284] In some embodiments, methods of evolving molecules may be modified
from Moses, et al., "Bioengineering of plant (tri)terpenoids: from
metabolic engineering of plants to synthetic biology in vivo and in
vitro." New Phytologist, Volume 200, Issue 1, where this reference
describes synthesis of artemisinic acid, the precursor of the
antimalarial drug artemisinin, as diterpenoids expressed in E. coli.
Further, enzyme engineering or directed evolution of terpenoid
biosynthetic enzymes, e.g., engineering enzymes to accept unnatural
substrates and to catalyze regio- and stereospecific reactions with an
efficiency comparable with that of the natural enzymes, is described,
along with discussions on enhancing the production of terpenoids in
Escherichia coli. In some embodiments, methods of evolving molecules may
be modified from Badran, et al., "Continuous evolution of Bacillus
thuringiensis toxins overcomes insect resistance". Nature, Vol 533:58,
2016, where this reference describes a phage-assisted continuous
evolution selection that rapidly evolves high-affinity protein-protein
interactions, and applies this system to evolve variants of the Bt toxin
Cry1Ac that bind a cadherin-like receptor from the insect pest
Trichoplusia ni (TnCAD) that is not natively bound by wild-type Cry1Ac.
[0285] A. Protein Evolving Systems and Evolving Biologically Active
Metabolites.
[0286] In some embodiments, methods of evolving molecules may be used to
construct drug leads that can be readily synthesized in microbial hosts.
This approach addresses a longstanding challenge--the development of low-cost
pharmaceuticals--by using a sophisticated set of biophysical tools and
analytical methodologies to narrow the molecular search space in lead
discovery, and by explicitly considering the biosynthetic accessibility
of therapeutic molecules. The approach, which departs from contemporary
efforts to use microbial systems for the synthesis of clinically approved
drugs and their precursors, is unique in its focus on using biology for
the systematic construction of new molecules. It will accelerate the
rate--and lower the cost--of pharmaceutical development.
[0287] The development of a drug requires optimization of many of its
pharmacological properties--affinity, absorption, distribution,
metabolism, excretion, toxicology, pharmacokinetics, and
pharmacodynamics.sup.1. The first of these properties--protein-ligand
binding affinity--generally determines whether the others are worth
measuring or enhancing, and, thus, represents a property of drug
leads.sup.2. Despite advances in computational chemistry and structural
biology, the rational design of ligands that bind tightly to
proteins--ligands, henceforth, referred to collectively as
inhibitors--remains exceptionally difficult.sup.3; as a result, the
development of drugs often begins with screens of large libraries of
molecules.sup.4. An inhibitor, once discovered, must be synthesized in
quantities sufficient for subsequent analysis, optimization, formulation,
and clinical evaluation.
[0288] The difficulties associated with developing protein inhibitors are
particularly problematic for natural products. These molecules, which
account for over 50% of clinically approved drugs, tend to have favorable
pharmacological properties (e.g., membrane permeability).sup.5.
Unfortunately, their low natural titers--which hamper the extraction of
testable quantities from natural sources--and their chemical
complexity--which complicates chemical synthesis--make the preparation of
quantities sufficient for post-screen analyses time-consuming and
expensive.sup.6.
[0289] In some embodiments, enzymes are contemplated for use to construct
terpenoid inhibitors that can be synthesized in Escherichia coli; such an
approach takes advantage of the chemical diversity (and generally
favorable pharmacological properties) of natural products without the
constraints of their natural scarcity. In some embodiments, a detailed
biophysical study of the molecular-level origin and thermodynamic basis
of affinity and activity in protein-terpenoid interactions is included
for the rapid construction of high-affinity inhibitors. In some
embodiments, development of selective inhibitors of protein tyrosine
phosphatase 1B (PTP1B), a target for the treatment of diabetes, obesity,
and cancer, is contemplated, in part, as a use of enzymes to evolve
readily synthesizable drug leads.
[0290] Structurally Varied Terpenoids with Different Affinities for the
Allosteric Binding Pocket of Protein Tyrosine Phosphatase 1B (PTP1B).
[0291] Hypothesis. Results indicate that abietic acid, a mono-carboxylated
variant of abietadiene, is an allosteric inhibitor of PTP1B. Derivatives
or structural analogs of abietadiene that differ in stereochemistry,
shape, size, and/or chemical functionality (including carboxylation
position) are likely to have different affinities for the allosteric
binding pocket of PTP1B.
[0292] In some embodiments, the following are contemplated for use: (i)
mutants of abietadiene synthase, cytochrome P450s, and halogenases to make
structural variants of abietadiene, (ii) GC/MS to identify those
variants, (iii) preparative HPLC and flash chromatography to isolate
them, and (iv) isothermal titration calorimetry to determine their free
energies, enthalpies, and entropies of binding. In some embodiments, a
set of structurally varied inhibitors with (i) affinities that differ by
100-fold and/or (ii) enthalpies and entropies of binding that suggest
alternative binding geometries is contemplated.
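The thermodynamic quantities named above are related by two standard
expressions: dG = RT ln(Kd) for the free energy of binding, and
dG = dH - T dS for its enthalpic and entropic parts. The sketch below
applies them; it suggests that a 100-fold difference in affinity
corresponds to roughly 2.7 kcal/mol at 25 degrees C. The function names,
the 1 M standard state, and the example Kd are illustrative assumptions,
not measured values.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temp_k=298.15):
    """Standard free energy of binding (kcal/mol) from a dissociation
    constant, dG = RT ln(Kd), with Kd expressed relative to 1 M."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temp_k * math.log(kd_molar)


def entropic_term(delta_g, delta_h):
    """Return -T*dS (kcal/mol) from dG = dH - T*dS."""
    return delta_g - delta_h


def fold_change_to_ddg(fold, temp_k=298.15):
    """Free-energy difference (kcal/mol) between two ligands whose Kd
    values differ by the given fold."""
    return R_KCAL * temp_k * math.log(fold)


if __name__ == "__main__":
    # Hypothetical 1 micromolar inhibitor at 25 degrees C.
    print(round(binding_free_energy(1e-6), 2))
    print(round(fold_change_to_ddg(100.0), 2))
```

Isothermal titration calorimetry yields Kd and dH directly, so the
entropic term follows by difference, as in entropic_term above.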
[0293] To Examine the Molecular Basis and Thermodynamic Origin of Affinity
and Activity in Enzyme-Terpenoid Interactions.
[0294] Hypothesis. Enzymes that bind, functionalize, and/or synthesize
terpenoids possess large nonpolar binding pockets. We hypothesize that
both (i) the affinity of an enzyme for terpenoids and (ii) the activity
of an enzyme on terpenoids are determined by the general shape and
hydration structure of its binding pocket, not the position of specific
protein-terpenoid contacts.
[0295] In some embodiments, a sophisticated set of biophysical tools
(isothermal titration calorimetry, X-ray crystallography, molecular
dynamics (MD) simulations, and NMR spectroscopy) is contemplated for use
to (i) determine how protein-ligand contacts, rearrangements of water,
and conformational constraints contribute to differences in affinity
between terpenoid inhibitors and to (ii) develop a set of empirical
relationships that predict how mutations in terpene synthases and
terpene-functionalizing enzymes influence general attributes (e.g.,
shape) of their products.
[0296] To Evolve High-Affinity Terpenoid Inhibitors of PTP1B.
[0297] Hypothesis. Enzymes of secondary metabolism (e.g., terpene
synthases, cytochrome P450s, and halogenases) are highly promiscuous; a
single mutation in or near their active sites can dramatically alter
their product profiles. Mutagenesis of a small number (i.e., 2-4) of such
enzymes, selected for their ability to synthesize and/or functionalize
diterpenoids, will enable the development of inhibitors of PTP1B with
sub-micromolar affinities.
[0298] In some embodiments, the evolution of high-affinity inhibitors of
PTP1B by pairing (i) high-throughput methods for detecting inhibitors
with (ii) site-saturation and random mutagenesis is contemplated. For (i),
we will develop four alternative fluorescence or growth-coupled assays to
screen libraries of mutated pathways (and their respective products). For
(ii), we will use biostructural analyses and sequence alignments to
identify residues whose mutation is likely to yield enzymes with
favorable product profiles.
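To give a sense of scale for pairing site-saturation mutagenesis with a
high-throughput screen, the sketch below estimates library size and the
number of clones to screen. It assumes, purely for illustration, that each
targeted residue is randomized with an NNK degenerate codon (32 codons)
and that all codon variants are equally likely; neither assumption comes
from the methods described herein, and the function names are
hypothetical.

```python
import math


def library_size(n_sites, codons_per_site=32):
    """Number of distinct codon variants when n_sites positions are
    saturated simultaneously (32 codons per site for an assumed NNK)."""
    return codons_per_site ** n_sites


def clones_for_coverage(n_variants, coverage=0.95):
    """Clones to screen so that any given variant is sampled at least
    once with probability `coverage`, assuming equiprobable variants:
    N = ln(1 - coverage) / ln(1 - 1/V)."""
    if n_variants < 2:
        return 1
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be between 0 and 1")
    return math.ceil(math.log(1.0 - coverage) / math.log(1.0 - 1.0 / n_variants))


if __name__ == "__main__":
    for sites in (1, 2, 3):
        variants = library_size(sites)
        print(sites, variants, clones_for_coverage(variants))
```

Under these assumptions the screening burden grows roughly 32-fold with
each additional saturated site, which is one reason structural analyses
are used to limit the number of residues targeted.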
[0299] To Identify Structure-Activity Relationships that Enable the
Evolution of Terpenoid Inhibitors of Arbitrary Protein Targets.
[0300] Hypothesis. Proteins that interact with similar classes of
molecules (through binding or catalysis) have structurally similar
binding pockets. Methods for evaluating these structural
similarities--and their implications for enzyme activity--may enable the
identification of enzymes capable of synthesizing inhibitors of any
specified protein.
[0301] In some embodiments, a biophysical framework for using the crystal
structure of a protein as a starting point to identify enzymes capable of
synthesizing inhibitors of that protein is contemplated. We will examine
(and formalize) structural relationships between (i) the active sites of
enzymes used to synthesize allosteric inhibitors of PTP1B and (ii) the
allosteric binding pocket of PTP1B, and we will validate these
relationships by using them to identify--and, then, test--new enzymes
capable of synthesizing inhibitors of PTP1B and (separately) undecaprenyl
diphosphate synthase, a target for the treatment of antibiotic-resistant
bacterial infections.
[0302] Diabetes, Obesity, and Cancer.
[0303] Protein tyrosine phosphatase 1B (PTP1B) contributes to insulin
resistance in type 2 diabetes.sup.7, leptin resistance in obesity.sup.8,
and tumor growth in breast, colorectal, and lung cancers.sup.9,11. To
date, the development of selective, tight-binding inhibitors of PTP1B
(i.e., treatments for diabetes, obesity, and cancer) has been hindered by
the structure of its active site, where polar residues limit tight
binding to charged, membrane-impermeable molecules, and where structural
similarities to the active sites of other protein tyrosine phosphatases
(PTPs) lead to off-target interactions.sup.12,14. In this proposal, we
will construct selective inhibitors of PTP1B that bind to its C-terminal
allosteric site, a largely nonpolar region that is not conserved across
phosphatases.sup.15. Previous screens of large molecular libraries have
identified several ligands that bind to this site, but have yet to yield
clinically approved drugs.sup.16,13. The identification of new molecular
alternatives--a feat tackled in this proposal--remains a goal in efforts
to develop selective PTP1B-inhibiting therapeutics.
[0304] Development of pharmaceuticals. The development of enzyme
inhibitors--or leads--represents an expensive part of drug development;
for each successful drug, lead identification and optimization take an
average of 3 years and $250M to complete (~20-30% of the total time and
cost to bring a drug to market).sup.17. By narrowing the molecular search
space in lead discovery, by enabling rapid construction of
structurally-varied leads (often referred to as "backups".sup.18), and by
facilitating scale-up of molecular synthesis, the technology developed in
this proposal could accelerate the rate--and lower the cost--of
pharmaceutical development.
[0305] Molecular recognition. The hydrophobic effect--the free
energetically favorable association of nonpolar species in aqueous
solution--is, on average, responsible for ~75% of the free energy of
protein-ligand association.sup.19. Unfortunately, hydrophobic
interactions between ligands and proteins--which differ dramatically in
rigidity, topography, chemical functionality, and hydration
structure--remain difficult to predict.sup.20. This study uses detailed
biophysical analyses and explicit-water calculations to examine the
thermodynamic basis of hydrophobic interactions between terpenoids and
protein binding pockets. It will develop a model system--and corresponding
conceptual framework--for studying the hydrophobic effect in the context
of structurally varied protein-ligand complexes, for accounting for that
effect in the design of biosynthetic pathways, and for exploiting it in
the construction of new drug leads.
[0306] Biosynthesis of New Natural Products.
[0307] Synthetic biology offers a promising route to the discovery and
production of natural products. When the metabolic machinery of one
organism is installed into a genetically tractable production host (e.g.,
S. cerevisiae or E. coli), it enables the synthesis of complex compounds
at high titers (relative to the native host). This approach has enabled
the efficient production of pharmaceutically relevant metabolites from
unculturable or low-yielding organisms.sup.21,22, but, unfortunately,
requires large investments of time and resources in pathway discovery and
optimization; its use, as a result, is generally limited to the
low-throughput characterization of newly discovered gene clusters or to
the production of known, pharmaceutically relevant molecules (e.g.,
paclitaxel, artemisinin, or opioids).sup.22,24.
[0308] In some embodiments, a strategy for using synthetic biology to
build new molecular function is contemplated. It begins with a
pathologically relevant protein target and engineers pathway enzymes to
produce molecules that selectively inhibit that target. This approach
will yield molecules that can be produced in microbial hosts without
extensive pathway optimization (it relies on enzymes that are expressible
by default); it will, thus, expand the use of synthetic biology to the
production of leads and backups. It is not a replacement for conventional
approaches to the synthesis of complex natural products, but rather, a
complementary strategy for constructing new compounds that will enhance
the efficiency with which pharmaceuticals are developed.
[0309] In the presence of mutated metabolic pathways (e.g., a version of a
plant-based terpenoid-producing pathway in which the terpene synthase has
been mutated), our operon will enable screens of large numbers of
metabolites for their ability to inhibit our protein of interest (e.g.,
PTP1B). Such a platform could be used to evolve metabolites with specific
biological activities.
[0310] Detect and/or evolve highly selective molecules. We have conceived
a version of our operon that detects molecules that inhibit one protein
over a highly similar protein. Screens for molecular selectivity remain,
at present, very difficult.
[0311] Advantages of methods and systems described herein over some other
systems for detecting small molecule inhibitors include, but are not
limited to, enabling the detection of molecules that modulate or change
the catalytic activity of an enzyme. Moreover, some embodiments of the
systems described herein allow for the detection of test molecules that
change the activity of an enzyme by binding anywhere on its surface. As
one example, detection of an inhibitor is contemplated that inactivates
PTP1B by binding to its C-terminal allosteric site; this binding event,
which distorts catalytically essential motions of the WPD loop, would not
necessarily prevent enzyme-substrate association. U.S. Pat. No.
6,428,951, herein incorporated by reference in its entirety, in contrast,
enables the detection of molecules that prevent enzyme-substrate binding
by competing for substrate binding sites (i.e., the active site). As
another example, detection of molecules that activate an enzyme of
interest is contemplated as an embodiment. U.S. Pat. No. 6,428,951,
herein incorporated by reference in its entirety, in contrast, has
methods that merely detect molecules that prevent enzymes from binding to
their substrates, or that otherwise change the affinity of enzymes for
their substrates. As another example, detection of molecules that do not
require an enzyme and substrate to interact with any particular affinity,
orientation, or half-life is contemplated as an embodiment. U.S. Pat. No.
6,428,951, herein incorporated by reference in its entirety, in contrast,
requires an enzyme and substrate to bind one another with an affinity and
orientation that enable assembly of a split reporter. As a result, it may
require modifications to the enzyme; in contrast, the inventors use a
"substrate trapping" mutant of PTP1B to improve its affinity for a
substrate domain.
[0312] As another example, some embodiments enable the detection of
inhibitors of wild-type enzymes. Tu S., U.S. Pat. No. 6,428,951, herein
incorporated by reference in its entirety, in contrast, requires enzymes
to be fused to one-half of a split reporter.
[0313] Further, the following two publications are examples of methods
for detecting molecules that merely disrupt the binding of an enzyme to a
substrate. This characteristic, among others, is in contrast to the
systems described herein. U.S. Pat. No. 6,428,951, "Protein fragment
complementation assays for the detection of biological or drug
interactions," Pub. Date: Jan. 31, 2008, herein incorporated by reference
in its entirety, describes high-throughput, bacteria-based
protein-fragment complementation assays (PCAs) wherein
concentration-dependent colony growth was observed when two protein
fragments derived from the enzyme dihydrofolate reductase (DHFR),
coexpressed as fusion molecules in Escherichia coli, interact in the
absence of an inhibitor. This reference states that PCA can be adapted to
detecting interactions of proteins with small molecules and provides
examples, including complementary fragment fusions and a
bait-fused fragment. In fact, protein tyrosine phosphatase PTP1B was
provided in an example for detecting enzyme substrate interactions and an
example of survival assay for detecting protein substrate interactions
using aminoglycoside kinase (AK), an example of antibiotic resistance
marker used for dominant selection of an E. coli-based PCA. Further, a
PCA is described as being applied to identify small molecule inhibitors
of enzymes; natural products or small molecules from compound libraries
of potential therapeutic value; may be used as survival assay for library
screening; for detecting endogenous DHFR inhibitors, e.g. rapamycin; and
for protein-drug interactions. Expression of PCA complementary fragments
and fused cDNA libraries/target genes can be assembled on single plasmids
as individual operons under the control of separate inducible or
constitutive promoters with interceding region sequences, e.g. derived
from a mel operon, or have polycistronic expression. The PCA can be
adapted to detecting interactions of proteins with small molecules. In
this conception, two proteins are fused to PCA complementary fragments,
but the two proteins do not interact with each other. The interaction
must be triggered by a third entity, which can be any molecule that will
simultaneously bind to the two proteins or induce an interaction between
the two proteins by causing a conformational change in one or both of the
partners. Moreover, exemplary applications of the PCA strategy in
bacteria to protein engineering/evolution are described for generating
peptides or proteins with novel binding properties that may have
therapeutic value using phage display technology. One example of
evolution produced novel
zipper sequences; other examples of evolutions were described to produce
endogenous toxins.
[0314] WO2004048549. Dep-1 Receptor Protein Tyrosine Phosphatase
Interacting Proteins And Related Methods. Published Jun. 10, 2004, herein
incorporated by reference in its entirety, describes screening assays for
inhibitors that alter the interaction between a PTP and a tyrosine
phosphorylated protein that is a substrate of the PTP, e.g.
dephosphorylation by Density Enhanced Phosphatase-1 (DEP-1) of a DEP-1
substrate. DEP-1 polypeptides can be expressed in bacteria cells,
including E. coli, under the control of appropriate promoters, e.g. E.
coli arabinose operon (P.sub.BAD or P.sub.ARA). This reference is
similarly limited in focus to U.S. Pat. No. 6,428,951, herein
incorporated by reference in its entirety; it enables the detection of
molecules that disrupt the binding of a substrate to an enzyme, rather
than the detection of molecules that modulate (i.e., enhance or reduce)
the activity of an enzyme.
[0315] Advantages of methods and systems described herein over some other
systems for detecting small molecule inhibitors include, but are not
limited to, enabling the evolution of metabolites that change the
catalytic activity of an enzyme. The technology described in Badran, et
al., "Continuous evolution of Bacillus thuringiensis toxins overcomes
insect resistance". Nature, Vol 533:58, 2016, herein incorporated by
reference in its entirety, and the platform of continuous evolution in
general, has been used to evolve proteins with different affinities for
other proteins/peptide substrates. It has not, however, been used to
evolve enzymes that produce small molecules (i.e., metabolites) that
alter the activities of enzymes or the strength of protein-protein
interactions.
[0316] Another advantage of methods and systems described herein, over
some other systems for detecting small molecule inhibitors includes the
discovery of metabolites with targeted biological activities but unknown
structures (e.g., the ability to inhibit protein tyrosine phosphatase
1B). There are many inventions relevant to the production of terpenoids
in E. coli or S. cerevisiae (e.g., Moses, et al., "Bioengineering of
plant (tri)terpenoids: from metabolic engineering of plants to synthetic
biology in vivo and in vitro." New Phytologist, Volume 200, Issue 1).
This reference describes synthesis of artemisinic acid, the
precursor of the antimalarial drug artemisinin, as diterpenoids expressed
in E. coli. It further describes enzyme engineering or directed evolution
of terpenoid biosynthetic enzymes (e.g., engineering enzymes to accept
unnatural substrates and to catalyze regio- and stereospecific reactions
with an efficiency comparable with that of the natural enzymes), along
with discussions on enhancing the production of terpenoids in
Escherichia coli. In many cases, the metabolic pathways responsible for
making these terpenoids are mutated to improve production levels.
However, the use of biosensors (i.e., constructs that report on the
concentrations of various metabolites) has focused on the detection of
specific intermediates (e.g., farnesyl pyrophosphate, a precursor to
terpenoids), not on combining (i) mutagenesis of a metabolic pathway and
(ii) a biosensor for a specific biological activity (e.g., the ability to
inhibit PTP1B) for the discovery of new biologically active molecules
(which may possess unknown structures).
[0317] High-Throughput Metabolic Engineering.
[0318] Microbial pathways are most efficiently optimized with
high-throughput screens. Unfortunately, at present, such screens are
sparse, and those available rely on signals (e.g., absorbance or
fluorescence, association with a product-specific transcription factor,
or growth permitted by an essential metabolite) that are difficult to
extend to broad classes of molecules (e.g., those without distinct
optical or metabolic properties).sup.27. The proposed work develops
high-throughput screens for terpenoids with a targeted activity--the
ability to inhibit PTP1B--rather than a targeted structure; these
activity-focused screens could be broadly useful for building (i.e.,
evolving) new biologically active small molecules.
[0319] Identification of new inhibitors: a starting point. We recently
discovered that abietic acid, the primary component of resin acid, is an
allosteric inhibitor of PTP1B (FIG. 22). We will use abietadiene, the
immediate precursor of this molecule, as a starting point for the design
of inhibitors. Abietadiene has several attributes that make it
particularly compatible with our approach: (i) It can be synthesized in
E. coli at titers (200 mg/L) that permit purification, NMR analysis, and
calorimetric studies.sup.28, (ii) Mutants of its associated terpene
synthase--abietadiene synthase--yield a range of hydrocarbon scaffolds
that differ in stereochemistry, shape, and size (and that still enable
analysis and purification).sup.29-31, (iii) It--and similar
molecules--can be functionalized by cytochrome P450s and
halogenases.sup.32-34.
[0320] Metabolic engineering. We have engineered a strain of E. coli to
produce abietadiene at titers (>150 mg/L) sufficient to permit analysis
by GC/MS, ITC, and NMR. Our biosynthetic pathway
has two requisite operons: MBIS, which converts (R)-mevalonate to
farnesyl pyrophosphate (FPP), and TS, which converts farnesyl
pyrophosphate to abietadiene. One optional operon--MevT, which converts
acetyl-CoA to (R)-mevalonate--is necessary when mevalonate is not
included in the media.sup.35. The plasmids pMevT and pMBIS were developed
by the Keasling Laboratory.sup.36. The plasmid pTS, which contains
abietadiene synthase (ABS) from Abies grandis, was developed as in
Morrone.sup.28 with a gene for geranylgeranyl diphosphate synthase from
Ajikumar.sup.37.
[0321] Improved inhibitors. We assessed the ability of minor structural
perturbations of abietadiene derivatives to yield improved inhibitors by
comparing four structurally related (and commercially available)
molecules (FIG. 16): abietic, neoabietic, levopimaric, and dihydroabietic
acid. Surprisingly, dihydroabietic acid was ten times more inhibitory
than abietic acid (K.sub.i .about.25 uM vs. 250 uM). Our ability to find an
improved inhibitor in this small screen suggests that the kinds of
structural variations explored in this study--and the sizes of molecular
libraries generated--are likely to yield improved inhibitors.
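The tenfold difference reported above can be restated as a difference in binding free energy. The following is a minimal sketch, assuming that the two inhibition constants (about 25 uM and 250 uM) can be treated as dissociation constants at 298.15 K; the function names and the temperature are illustrative assumptions and are not taken from the study.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fold_improvement(ki_reference_um, ki_variant_um):
    """Ratio of inhibition constants; values above 1 mean the variant is more potent."""
    return ki_reference_um / ki_variant_um


def ddg_kcal_per_mol(ki_reference_um, ki_variant_um, temperature_k=298.15):
    """Difference in binding free energy (variant minus reference).

    Assumes the inhibition constants behave as dissociation constants.
    A negative value means the variant binds more tightly.
    """
    return R_KCAL * temperature_k * math.log(ki_variant_um / ki_reference_um)


# Values quoted in the text: abietic acid ~250 uM, dihydroabietic acid ~25 uM.
fold = fold_improvement(250.0, 25.0)
ddg = ddg_kcal_per_mol(250.0, 25.0)
print(f"fold improvement: {fold:.1f}")
print(f"ddG: {ddg:.2f} kcal/mol")
```

Under these assumptions, a tenfold gain in potency corresponds to roughly 1.4 kcal/mol of additional binding free energy, which is comparable to the contribution often attributed to a single favorable contact.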
[0322] Functionalization of abietadiene. We assessed the ability of
mutants of cytochrome P450.sub.bm3 to functionalize abietadiene-like
molecules by installing five readily available mutants (G3, KSA-4,
9-10A, 139-3, and J, which were engineered for activity on
amorphadiene.sup.38 and steroids.sup.39) into our heterologous pathway;
three mutants yielded hydroxylated and/or carboxylated products,
generating up to 0.3 mg/L of abietic acid (FIG. 17). The
abietadiene-functionalizing activity of mutants originally engineered for
other targets suggests that we will be able to develop mutants of
P450.sub.bm3 with even higher activities on abietadiene-like molecules.
[0323] Biostructural analyses. We have crystallized PTP1B in our lab,
collected X-ray diffraction data in collaboration with Peter Zwart at
Lawrence Berkeley National Lab (LBNL), and solved its crystal structure
(FIG. 17A inset). We have also co-crystallized PTP1B with abietic acid;
we will analyze these crystals in late July (first available beam time).
[0324] Recently, we expressed .sup.15N-labeled PTP1B and used it to
collect two-dimensional .sup.1H-.sup.15N HSQC spectra in collaboration
with Haribabu Arthanari at Harvard Medical School (FIG. 17A main). The
spectra include PTP1B bound (separately) to abietic acid and known
inhibitors; at present, we are processing the data. Preliminary results
(X-ray and NMR) suggest that biostructural studies of PTP1B bound to
different inhibitors will be straightforward.
[0325] High-throughput screens. Upon binding to inhibitors (both
competitive and allosteric), PTP1B exhibits changes in conformation that
quench its tryptophan fluorescence (the basis of one of our four
high-throughput screens). FIG. 17B indicates that such quenching can be
used to distinguish between inhibitory extract (i.e., a hexane overlay)
from an abietadiene-producing strain of E. coli and non-inhibitory
extract from a control strain (i.e., one with a catalytically inactive
ABS). FIG. 17C indicates that such changes can also be used to detect 50
uM (15 mg/L) of abietic acid. Our ability to detect (i) abietadiene in
culture extract and (ii) abietic acid at low concentrations (i.e.,
tenfold lower than our titers of abietadiene) suggests that we will be
able to detect improved inhibitors of PTP1B, even if they are accompanied
by reductions in titer.
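The concentration figures in the preceding paragraph can be checked with a short unit conversion. This is a minimal sketch; the molecular weight of abietic acid (about 302.45 g/mol, C20H30O2) is supplied here as an assumption, and the 150 mg/L titer is the lower bound quoted earlier for abietadiene.

```python
ABIETIC_ACID_MW = 302.45  # g/mol (C20H30O2); supplied here, not stated in the text


def micromolar_to_mg_per_l(conc_um, mw_g_per_mol):
    """Convert a micromolar concentration to mg/L."""
    mol_per_l = conc_um * 1e-6
    g_per_l = mol_per_l * mw_g_per_mol
    return g_per_l * 1e3


detection_limit_mg_l = micromolar_to_mg_per_l(50.0, ABIETIC_ACID_MW)
titer_mg_l = 150.0  # lower bound on the abietadiene titer quoted in the text
print(f"50 uM abietic acid = {detection_limit_mg_l:.1f} mg/L")
print(f"titer / detection level = {titer_mg_l / detection_limit_mg_l:.1f}")
```

The conversion gives about 15 mg/L, consistent with the stated value, and the ratio of titer to detectable concentration is close to ten.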
[0326] Providing structurally varied terpenoids with different affinities
for the allosteric binding pocket. This section describes developing a
set of inhibitors with incremental differences in affinity that result
from systematic differences in structure. The goal (metric for success):
a minimum of .about.15 structurally varied inhibitors with (i) affinities for
PTP1B that differ by 100-fold and/or (ii) enthalpies and entropies of
binding that suggest alternative binding geometries.
[0327] Research plan. In the sections that follow, we use enzymes to build
selective terpenoid inhibitors of PTP1B. This enzyme is the initial focus
of our work because it is a therapeutic target for diabetes, obesity, and
cancer, and it can be expressed, crystallized, and assayed with
ease.sup.15. It, thus, serves as a pharmaceutically relevant model system
with which to develop a general approach for the enzymatic construction
of drug leads.
[0328] Hypothesis for structural changes. In this section, we use
promiscuous enzymes to construct terpenoids that differ in
stereochemistry, shape, size, and chemical functionality. We believe that
these modifications will affect the affinity of ligands for PTP1B by
altering (i) their ability to engage in van der Waals interactions with
nonpolar residues (e.g., F280, L192, and F196) in the allosteric binding
pocket, (ii) their ability to engage in direct or water-mediated hydrogen
bonds with proximal polar residues (e.g., N193, E200, and E276), (iii)
their ability to engage in halogen bonds with either set of residues,
(iv) their influence on molecular conformational constraints, and, (v)
their ability to reorganize water during binding. This hypothesis (which
is supported, in part, by FIG. 16) motivates the synthetic strategy
described herein.
[0329] Stereochemistry, shape, and size. We will begin by using mutants of
ABS to generate diterpenoids that differ in stereochemistry and shape
(FIG. 18A). ABS uses two active sites to catalyze sequential class II
(protonation-dependent) and class I (ionization-dependent) cyclization of
geranylgeranyl pyrophosphate (GGPP, C.sub.20) into abietadiene.sup.29.
Previous studies indicate that amino acid substitutions in its active
sites can alter the stereochemistry or shape of its products.sup.29,31.
We will use mutations (new and previously identified) that affect the
position of deprotonation, intramolecular proton transfer, or
carbocation stability (FIG. 8B). After installing these mutants into E.
coli, we will use GC/MS to search for new products (fragmentation tools
such as MetFrag.sup.40 or ACD/MS Fragmenter.sup.41 will facilitate
identification of novel compounds).
[0330] We will generate terpenoids that differ in size by using mutations
that increase/decrease the volume of the active sites of ABS. Previous
attempts to change the substrate specificities of terpene
synthases.sup.42,43 suggest that such mutations could enable enhanced
activity on farnesyl pyrophosphate (FPP, C.sub.15) and farnesylgeranyl
pyrophosphate (FGPP, C.sub.25). To synthesize FGPP, we will incorporate
an FGPP synthase previously expressed in E. coli.sup.44.
[0331] We will isolate a subset of new terpenoids with particularly high
titers by using flash chromatography and HPLC (a task for which
feasibility has been established in several studies.sup.28,31,45), and we
will use ITC to measure the free energy (.DELTA.G.degree..sub.bind), enthalpy
(.DELTA.H.degree..sub.bind), and entropy (-T.DELTA.S.degree..sub.bind) of binding to
PTP1B. Differences in .DELTA.G.degree..sub.bind between ligands will reveal how
structural changes affect the strength of binding; differences in
.DELTA.H.degree..sub.bind and -T.DELTA.S.degree..sub.bind will reveal their influence
on binding geometry.sup.46,47.
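The relationship among the three quantities measured by ITC can be sketched as follows. This is an illustrative calculation only: it assumes a 1 M standard state and 298.15 K, and the dissociation constant and enthalpy used in the example are hypothetical inputs, not measured values.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard free energy of binding from a dissociation constant (1 M standard state)."""
    return R_KCAL * temperature_k * math.log(kd_molar)


def entropic_term(delta_g, delta_h):
    """Return -T*dS, using dG = dH - T*dS."""
    return delta_g - delta_h


# Hypothetical ligand: Kd = 25 uM, measured enthalpy of -5.0 kcal/mol.
dg = binding_free_energy(25e-6)
minus_tds = entropic_term(dg, -5.0)
print(f"dG = {dg:.2f} kcal/mol, dH = -5.00 kcal/mol, -TdS = {minus_tds:.2f} kcal/mol")
```

Two ligands with similar free energies but different splits between the enthalpic and entropic terms would be candidates for the alternative binding geometries mentioned above.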
[0332] Hydroxylation and halogenation. For each of the three ligands
selected in 6.1.2, we will use mutants of cytochrome P450 BM3
(P450.sub.bm3) from Bacillus megaterium and/or CYP720B4 (P450.sub.720)
from Picea sitchensis to construct five variants with hydroxyl or
carboxyl groups at different positions (FIGS. 19A and 19B). P450.sub.bm3
can hydroxylate a wide range of substrates, including terpenoids.sup.48;
P450.sub.720 can carboxylate over 20 diterpenoids, including
abietadiene.sup.49. Both enzymes can be expressed in E. coli.sup.4A.
[0333] We will work with several sets of mutations: For P450.sub.bm3, we
will use (i) three (V78A, F87A, and A328L) that permit the
stereoselective hydroxylation of sesquiterpenes and diterpenes.sup.50,
(ii) five (L75A, M177A, L181A, and L437A) that enable hydroxylation of
alkaloids and steroids.sup.51, and (iii) two (F87V and A82F) that permit
carboxylation of heteroaromatics (FIG. 18D).sup.52. For P450.sub.720, we
will examine .about.10 similar mutations likely to alter the position of
oxidation. We will, again, screen each mutant in E. coli, isolate
interesting products, and use ITC to analyze them.
[0334] For each of two high-affinity oxygenated ligands, we will construct
six variants with bromide or iodide at different positions (FIG. 18C).
These two halogens can engage in halogen bonds with oxygen, nitrogen, or
sulfur acceptors in proteins.sup.53, and can bind small nonpolar
declivities on their surfaces.sup.54. The energetic contribution
associated with both interactions tends to increase from Br to
I.sup.54,55 and, thus, lends itself to systematic analysis (i.e., a
physical organic approach). To generate halogenated ligands, we will use
mutants of tryptophan 6-halogenase (SttH) from Streptomyces toxytricini
and vanadium haloperoxidase (VHPO) from Acaryochloris marina. These
enzymes can introduce halogens (chloride, bromide, or iodide) into
sp.sup.2-hybridized carbons of alkaloids or terpenoids (before or after
cyclization).sup.56,57. For each enzyme, we will examine several
mutations known to change regioselectivity (e.g., L460F, P461E, and P452T
for SttH.sup.56) and 5-10 mutations likely to change the orientation of
bound ligands (FIG. 19E). We will, again, screen each mutant in E. coli
and use ITC to analyze interesting products.
FIG. 19 Examples: (FIG. 19A) carboxylated, (FIG. 19B) hydroxylated, and
(FIG. 19C) halogenated diterpenoids. (FIG. 19D-E) Residues targeted for
mutagenesis in (FIG. 19D) P450.sub.bm3 and (FIG. 19E) SttH.
IV. Evolving High-Affinity Terpenoid Inhibitors of PTP1B.
[0335] This section develops four high-throughput screens for rapidly
evaluating the strength of PTP1B inhibitors, and it uses those methods,
in conjunction with site-saturation and random mutagenesis, to evolve new
inhibitors. The goal: a set of evolved inhibitors with particularly high
affinities (K.sub.D.ltoreq.1 uM) and/or unpredictable
structures (i.e., structures inconsistent with rational design).
[0336] Biological selection. A selection method (i.e., a growth-coupled
screen) in which the survival of E. coli is linked to inhibitor potency
will enable rapid screening of extremely large libraries of molecules
(10.sup.10).sup.66. In this section, we develop such a method.
[0337] PTP1B catalyzes the dephosphorylation--and inactivation--of several
cell surface receptors. We will use the tyrosine-containing regions of
these receptors to build an operon that links inhibition of PTP1B to cell
growth. This operon will require six components (FIG. 21A): (i) a
substrate domain (the tyrosine-containing region of a receptor) tethered
to a DNA-binding protein, (ii) a substrate recognition domain (a protein
that binds the tyrosine-containing region after its phosphorylation)
tethered to the .omega. subunit of an RNA polymerase, (iii) a tyrosine kinase,
(iv) PTP1B, (v) a gene for antibiotic resistance, and (vi) an operator
for that gene. With this system, inhibitors of PTP1B will enable binding
of the substrate and substrate recognition domains, recruitment of RNA
polymerase to the DNA, and transcription of the gene for antibiotic
resistance. Previous groups have used similar operons to evolve
protein-protein binding partners; here, we take the additional steps of
(i) using a protein-protein interaction mediated by enzymes (PTP1B and a
kinase) and of (ii) screening that interaction in the presence of
potential inhibitors of one of those enzymes.
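The logic of the selection operon described above can be summarized as a truth table. The following is an idealized, all-or-nothing sketch: it assumes that active PTP1B fully dephosphorylates the substrate domain and that transcription of the resistance gene requires the phosphorylated substrate. Real cells would show graded responses, and all function names here are hypothetical.

```python
from itertools import product


def substrate_is_phosphorylated(kinase_present, ptp1b_present, inhibitor_present):
    """Idealized steady state: the kinase wins only if PTP1B is absent or inhibited."""
    ptp1b_active = ptp1b_present and not inhibitor_present
    return kinase_present and not ptp1b_active


def resistance_gene_is_transcribed(kinase_present, ptp1b_present, inhibitor_present):
    """The phosphorylated substrate domain binds the recognition domain,
    which recruits RNA polymerase to the operator."""
    return substrate_is_phosphorylated(kinase_present, ptp1b_present, inhibitor_present)


def cell_survives(antibiotic_present, kinase_present, ptp1b_present, inhibitor_present):
    """Growth under selection requires expression of the resistance gene."""
    if not antibiotic_present:
        return True
    return resistance_gene_is_transcribed(kinase_present, ptp1b_present, inhibitor_present)


print("kinase  PTP1B  inhibitor  survives_with_antibiotic")
for kinase, ptp1b, inhibitor in product([True, False], repeat=3):
    outcome = cell_survives(True, kinase, ptp1b, inhibitor)
    print(f"{kinase!s:7} {ptp1b!s:6} {inhibitor!s:10} {outcome}")
```

In this simplified picture, a cell carrying both the kinase and PTP1B survives antibiotic exposure only when its pathway produces an inhibitor, which is the property the selection is meant to enrich.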
[0338] We will develop our operon by starting with a luminescence-based
system, and we will add an antibiotic resistance gene as a final step. In
our preliminary work with a system optimized by Liu et al..sup.67, we
obtained a tenfold difference in Lux-based luminescence between a strain
expressing two binding partners and a strain expressing one (FIG. 21E;
arabinose induces expression of the second partner). We now plan to
introduce--and test--different substrate domains, recognition domains,
and kinases (EGFR and Src).
[0339] A FRET sensor for PTP1B activity. A high-throughput screen in which
inhibition of PTP1B is linked to cell fluorescence will enable rapid
screening via fluorescence-activated cell sorting (FACS). This technique
tends to produce more false positives than selection and limits libraries
to sizes of 10.sup.7-10.sup.8, but it requires fewer heterologous
genes.sup.27,66.
[0340] For this strategy, we will make use of FRET (Forster resonance
energy transfer) sensors commonly used to monitor kinase and phosphatase
activity in mammalian cells.sup.68,69. These sensors consist of a kinase
substrate domain, a short flexible linker, and a phosphorylation
recognition domain--all sandwiched between two fluorescent proteins.
Phosphorylation of the substrate domain causes it to bind to the
recognition domain, inducing FRET between the two fluorescent proteins.
In a PTP1B-compatible sensor, inhibitors of PTP1B will increase FRET
(FIG. 21B). We have begun to develop such a sensor by trying different
combinations of substrate domains, recognition domains, and kinases.
(Note: FACS enables FRET-based screens.sup.70,71).
[0341] A FRET sensor for changes in the conformation of PTP1B. A
FACS-based screen in which changes in cell fluorescence result from
binding-induced changes in the conformation of PTP1B would be less
generalizable than strategies 2 and 3 (which could be used for any kinase
or phosphatase), but would require only one heterologous gene.
[0342] For this strategy, we will make use of a FRET experiment carried
out by the Tonks Group.sup.13. These researchers sought to show that the
binding of trodusquemine to PTP1B caused the protein to become more
compact. To do so, they attached members of a FRET pair to each terminus
of the PTP1B (FIG. 21C); upon protein-ligand association, an increase in
FRET signal indicated that its termini approached one another. We
hypothesize that this construct could be used as a sensor for identifying
other molecules that bind to the allosteric site of PTP1B. We will begin
by testing it with a variety of known inhibitors (a step the Tonks group
did not take).
[0343] Binding-induced changes in the tryptophan fluorescence of PTP1B. A
screen in which inhibition of PTP1B is linked to changes in tryptophan
fluorescence (FIG. 21D) will enable rapid screening of moderately sized
libraries (10.sup.3-10.sup.4).sup.27 in microtiter plates. Our use of
binding-induced changes in tryptophan fluorescence is described in 5.6.
In future work, we plan to extend this approach to other protein tyrosine
phosphatases, many of which are allosteric and possess many tryptophans
(e.g., SHP-2, a target for Noonan syndrome.sup.72).
[0344] Mutagenesis. To use our high-throughput screens to evolve
inhibitors of PTP1B, we will build libraries of mutated terpenoid
pathways by using (i) site-saturation mutagenesis (SSM; we will target
binary combinations of sites) and (ii) error-prone PCR (ep-PCR).
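The size of a site-saturation library built from binary combinations of sites can be estimated and compared with the screening capacities quoted in this section. This is a sketch under stated assumptions: NNK degenerate codons (32 codons per site) and the usual equal-probability sampling estimate for coverage are assumptions introduced here, and the number of candidate sites is hypothetical.

```python
import math

NNK_CODONS = 32  # assumption: NNK degenerate codons, 32 codons per randomized site

# Upper bounds on library size for each screen, as quoted in the text.
SCREEN_CAPACITY = {
    "microtiter (tryptophan fluorescence)": 1e4,
    "FACS (FRET sensors)": 1e8,
    "selection (growth-coupled)": 1e10,
}


def pairwise_site_combinations(n_sites):
    """Number of binary (pairwise) combinations of candidate sites."""
    return math.comb(n_sites, 2)


def codon_variants_per_pair(codons_per_site=NNK_CODONS):
    """Distinct codon combinations when two sites are randomized together."""
    return codons_per_site ** 2


def clones_for_coverage(n_variants, coverage=0.95):
    """Clones to sample so that a given variant is present with the stated
    probability, assuming all variants are equally likely."""
    return math.ceil(-n_variants * math.log(1.0 - coverage))


def library_size(n_sites, coverage=0.95):
    """Total clones needed to cover every pairwise library at the given coverage."""
    per_pair = clones_for_coverage(codon_variants_per_pair(), coverage)
    return pairwise_site_combinations(n_sites) * per_pair


def screens_that_fit(total_clones):
    """Names of screens whose quoted capacity can accommodate the library."""
    return [name for name, cap in SCREEN_CAPACITY.items() if total_clones <= cap]


total = library_size(10)  # hypothetical: 10 candidate "plastic" residues
print(f"clones needed: {total}")
print("compatible screens:", screens_that_fit(total))
```

Under these assumptions, pairwise saturation of even a modest number of sites exceeds the capacity of a microtiter-plate screen, which illustrates why the higher-throughput FACS and selection methods are relevant.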
[0345] For SSM, we will identify "plastic" residues likely to accommodate
useful mutations by developing functions similar to Eq. 1. This function
scores residues based on their ability to accommodate mutations that
influence the volume and hydration structure of an active site:
S=.sigma..sub.V.sup.2/N.sub.V+.sigma..sub.HW.sup.2/N.sub.HW (Eq. 1)
where S is a metric for the propensity of a residue to permit mutations,
.sigma..sub.V.sup.2 is the variance in volume of similarly positioned
residues in the active sites of other enzymes, .sigma..sub.HW.sup.2 is the
variance in hydrophilicity of those residues, and N.sub.V and N.sub.HW
are normalization factors. In our
preliminary analysis of ABS, we successfully used Eq. 1 (and
structure/sequence information from taxadiene, .gamma.-humulene,
.delta.-selinene, and epi-isozizaene synthases) to identify residues for
which mutations
are known to yield new products (e.g., H348 of ABS).sup.31. We note:
Previous attempts to identify plastic residues have scanned each site
near the bound substrate.sup.73; our approach will be unique in its
inclusion of biophysical considerations from (i) our study of optimal
ligand attributes (6.2.1) and (ii) our study of the types of mutations
that bring them about (6.2.2). For library construction, we will explore
mutating our pathway (i) enzyme-by-enzyme (e.g., ABS, then P450.sub.bm3,
and then SttH) or (ii) at random. The second approach could give us
access to structures that might be difficult to find with conventional
approaches to lead design.
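A plasticity score of the kind defined for Eq. 1 might be computed as sketched below. The functional form used here (the variance in side-chain volume and the variance in hydrophilicity, each divided by its normalization factor, then summed) is an assumed reading of Eq. 1 based on the stated definitions. The residue data, site labels, and normalization factors are hypothetical.

```python
from statistics import pvariance


def plasticity_score(volumes, hydrophilicities, n_v, n_hw):
    """Assumed form of Eq. 1: S = var(volume)/N_V + var(hydrophilicity)/N_HW.

    `volumes` and `hydrophilicities` describe the residues found at one aligned
    position across the active sites of related enzymes.
    """
    return pvariance(volumes) / n_v + pvariance(hydrophilicities) / n_hw


# Hypothetical aligned positions across three related terpene synthases.
aligned_sites = {
    "site_A": {"volumes": [100.0, 140.0, 120.0], "hydrophilicities": [-1.0, 1.0, 0.0]},
    "site_B": {"volumes": [120.0, 120.0, 120.0], "hydrophilicities": [0.5, 0.5, 0.5]},
}
N_V, N_HW = 1000.0, 10.0  # hypothetical normalization factors

scores = {
    name: plasticity_score(d["volumes"], d["hydrophilicities"], N_V, N_HW)
    for name, d in aligned_sites.items()
}
for name, s in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: S = {s:.3f}")
```

Positions that vary widely among related enzymes score higher and would be prioritized for site-saturation mutagenesis; fully conserved positions score zero.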
[0346] To identify structure-activity relationships that enable the
evolution of terpenoid inhibitors of arbitrary protein targets. This
section develops a biophysical framework for using a crystal structure of
a protein to identify enzymes capable of making inhibitors of that
protein. The goal: the use of that framework to identify--and, then,
test--enzymes capable of synthesizing new inhibitors of PTP1B and
(separately) undecaprenyl diphosphate synthase (UPPS), a target for
antibiotic-resistant bacterial infections.
[0347] Relationships between binding pockets. We will begin by determining
how similarities in specific properties of binding pockets (e.g., volume,
polarity, and shape) enable enzymes to synthesize, functionalize, and/or
bind similar molecules. This effort will involve comparisons of the
allosteric binding pocket of PTP1B with the binding pockets (i.e., active
sites) of enzymes involved in inhibitor synthesis. For these comparisons,
we will construct two matrices: matrix A, in which each element (a.sub.ij)
represents the similarity of a specific property between binding pockets
i and j (0<a.sub.ij<1, where 1 is highly similar), and matrix B, in which
each element (b.sub.ij) describes the ability of binding pockets i and j to
bind similar molecules (0<b.sub.ij<1, where 1 represents identical
binding specificities). The rank of the matrix formed by the product of
these two matrices (AB) will suggest the number of independent variables
(i.e., active site attributes) necessary to determine the functional
compatibility of enzymes in a metabolic pathway; its eigenvalues will
suggest the relative importance of the property under study (described by
matrix A).
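The matrix analysis described above can be illustrated with a small example. This is a sketch only: the three-pocket matrices are hypothetical, the diagonal entries are set to 1 on the assumption that each pocket is identical to itself, and the rank and dominant eigenvalue are computed with simple textbook routines (Gaussian elimination and power iteration) rather than any method named in the text.

```python
def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def rank(matrix, tol=1e-9):
    """Matrix rank by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    rows, cols = len(m), len(m[0])
    r = 0
    for c in range(cols):
        pivot = max(range(r, rows), key=lambda i: abs(m[i][c]))
        if abs(m[pivot][c]) < tol:
            continue  # no usable pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            factor = m[i][c] / m[r][c]
            for j in range(c, cols):
                m[i][j] -= factor * m[r][j]
        r += 1
        if r == rows:
            break
    return r


def dominant_eigenvalue(matrix, iterations=200):
    """Largest-magnitude eigenvalue by power iteration (suited to positive matrices)."""
    n = len(matrix)
    v = [1.0] * n
    eig = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        eig = max(abs(x) for x in w)
        v = [x / eig for x in w]
    return eig


# Hypothetical similarity (A) and shared-binding (B) matrices for three pockets.
SIMILARITY_A = [[1.0, 0.8, 0.2], [0.8, 1.0, 0.3], [0.2, 0.3, 1.0]]
BINDING_B = [[1.0, 0.7, 0.1], [0.7, 1.0, 0.2], [0.1, 0.2, 1.0]]

product_ab = matmul(SIMILARITY_A, BINDING_B)
print("rank(AB):", rank(product_ab))
print("dominant eigenvalue of AB:", round(dominant_eigenvalue(product_ab), 3))
```

In practice, a numerical library would likely be used for larger matrices; the pure-Python routines are shown only to keep the example self-contained.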
[0348] We will construct matrix A with PyMol- and MD-based analyses of
protein crystal structures. We will construct matrix B by examining the
binding of functionalized terpenoids and their precursors to each enzyme
involved in terpenoid synthesis. Binding affinities for some of these
ligand/protein combinations will be measured with ITC; most will be
estimated with docking calculations (OEDocking.sup.78).
[0349] The result of this section will be an equation similar to Eq. 2:
[0350] J=w.sub.vV+w.sub.pP+w.sub.lL+w.sub.wW (Eq. 2)
where J is a metric for an active site's ability to synthesize terpenoids
that bind a particular binding pocket; V, P, L, and W represent specific
properties of that active site (volume, polarity, longest diameter, and
shortest diameter); and w's represent weighting factors. The final number of
variables--and their respective weights--will be determined through the
above analysis. In parameterizing the equation, we plan to examine
different metrics for properties of binding pockets (e.g. shape) and to
explore/develop different matrix manipulations.
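A weighted score of the form given in Eq. 2 could be used to rank candidate active sites as sketched below. The weights, the normalized property values, and the enzyme labels are all hypothetical, and the third weight is taken to be the weighting factor for L.

```python
from typing import Dict, NamedTuple


class ActiveSite(NamedTuple):
    volume: float             # V, normalized to [0, 1]
    polarity: float           # P, normalized to [0, 1]
    longest_diameter: float   # L, normalized to [0, 1]
    shortest_diameter: float  # W, normalized to [0, 1]


class Weights(NamedTuple):
    w_v: float
    w_p: float
    w_l: float
    w_w: float


def compatibility_score(site: ActiveSite, w: Weights) -> float:
    """Eq. 2: J = w_v*V + w_p*P + w_l*L + w_w*W."""
    return (w.w_v * site.volume + w.w_p * site.polarity
            + w.w_l * site.longest_diameter + w.w_w * site.shortest_diameter)


# Hypothetical candidates and weights (not derived from any real structure).
WEIGHTS = Weights(0.4, 0.3, 0.2, 0.1)
CANDIDATES: Dict[str, ActiveSite] = {
    "enzyme_1": ActiveSite(0.8, 0.5, 0.6, 0.4),
    "enzyme_2": ActiveSite(0.3, 0.9, 0.2, 0.7),
}

ranked = sorted(CANDIDATES, key=lambda n: compatibility_score(CANDIDATES[n], WEIGHTS),
                reverse=True)
for name in ranked:
    print(f"{name}: J = {compatibility_score(CANDIDATES[name], WEIGHTS):.2f}")
```

The weights would, per the text, be determined by the matrix analysis; the values used here only demonstrate the ranking step.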
[0351] Validation and Extension.
[0352] The identification of promising active site motifs for inhibitor
synthesis will require a search of available protein structural data. We
will perform such a search by using PROBIS (probis.nih.gov.sup.79), an
alignment-based platform that uses a specified binding site to find
similar binding sites on other proteins in the Protein Data Bank. PROBIS
can identify similarly shaped binding pockets, even when the protein
folds that surround those pockets are different (i.e., it detects similar
constellations of amino acids).
[0353] To begin, we will use a PROBIS-based search to identify enzymes
with active sites that have some level of structural similarity (we will
explore different thresholds) to either (i) the allosteric binding site
of PTP1B or (ii) the active sites of enzymes capable of synthesizing
inhibitors of PTP1B. Using Eq. 2, we will select enzymes with the most
favorable active sites and test them with our platform for inhibitor
development.
[0354] We will assess the generalizability of our approach by attempting
to construct inhibitors of UPPS, a protein known to bind terpenoids and
polycyclic molecules.sup.80. Structure-based searches will use two
starting points: (i) UPPS and (ii) mutants of ABS, P450.sub.bm3, or
similar enzymes that our biophysical analyses suggest might yield UPPS
inhibitors. We will, again, select a subset of enzymes to test with our
platform.
REFERENCES FOR SECTIONS III-IV
[0355] 1. Koh, H.-L., Yau, W.-P., Ong, P.-S. & Hegde, A. Current trends
in modern pharmaceutical analysis for drug discovery. Drug Discov. Today
8, 889-897 (2003). [0356] 2. Whitesides, G. M. & Krishnamurthy, V. M.
Designing ligands to bind proteins. Q. Rev. Biophys. 38, 385-395 (2005).
[0357] 3. Olsson, T. S. G., Williams, M. A., Pitt, W. R. & Ladbury, J. E.
The Thermodynamics of Protein-Ligand Interaction and Solvation: Insights
for Ligand Design. J. Mol. Biol. 384, 1002-1017 (2008). [0358] 4. Welsch,
M. E., Snyder, S. A. & Stockwell, B. R. Privileged scaffolds for library
design and drug discovery. Curr. Opin. Chem. Biol. 14, 347-361 (2010).
[0359] 5. Gershenzon, J. & Dudareva, N. The function of terpene natural
products in the natural world. Nat. Chem. Biol. 3, 408-414 (2007). [0360]
6. Chang, M. C. Y. & Keasling, J. D. Production of isoprenoid
pharmaceuticals by engineered microbes. Nat. Chem. Biol. 2, 674-681
(2006). [0361] 7. Johnson, T. O., Ermolieff, J. & Jirousek, M. R. Protein
tyrosine phosphatase 1B inhibitors for diabetes. Nat. Rev. Drug Discov.
1, 696-709 (2002). [0362] 8. Koren, S. & Fantus, I. G. Inhibition of the
protein tyrosine phosphatase PTP1B: potential therapy for obesity,
insulin resistance and type-2 diabetes mellitus. Best Pract. Res. Clin.
Endocrinol. Metab. 21, 621-640 (2007). [0363] 9. Soysal, S., Obermann, E.
C., Gao, F., Oertli, D., Gillanders, W. E., Viehl, C. T. & Muenst, S.
PTP1B expression is an independent positive prognostic factor in human
breast cancer. Breast Cancer Res. Treat. 137, 637-644 (2013). [0364] 10.
Tonks, N. K. & Muthuswamy, S. K. A Brake Becomes an Accelerator: PTP1B--A
New Therapeutic Target for Breast Cancer. Cancer Cell 11, 214-216 (2007).
[0365] 11. Lessard, L., Stuible, M. & Tremblay, M. L. The two faces of
PTP1B in cancer. Biochim. Biophys. Acta-Proteins Proteomics 1804, 613-619
(2010). [0366] 12. Zhang, S. & Zhang, Z. Y. PTP1B as a drug target:
recent developments in PTP1B inhibitor discovery. Drug Discov. Today 12,
373-381 (2007). [0367] 13. Krishnan, N., Koveal, D., Miller, D. H., Xue,
B., Akshinthala, S. D., Kragelj, J., Jensen, M. R., Gauss, C.-M., Page,
R., Blackledge, M., Muthuswamy, S. K., Peti, W. & Tonks, N. K. Targeting
the disordered C terminus of PTP1B with an allosteric inhibitor. Nat.
Chem. Biol. 10, 558-566 (2014). [0368] 14. Sun, J. P., Fedorov, A. A.,
Lee, S. Y., Guo, X. L., Shen, K., Lawrence, D. S., Almo, S. C. & Zhang, Z.
Y. Crystal structure of PTP1B complexed with a potent and selective
bidentate inhibitor. J. Biol. Chem. 278, 12406-12414 (2003). [0369] 15.
Wiesmann, C., Barr, K. J., Kung, J., Zhu, J., Erlanson, D. A., Shen, W.,
Fahr, B. J., Zhong, M., Taylor, L., Randal, M., McDowell, R. S. & Hansen,
S. K. Allosteric inhibition of protein tyrosine phosphatase 1B. Nat.
Struct. Mol. Biol. 11, 730-737 (2004). [0370] 16. Krishnan, N. & Tonks,
N. K. Anxious moments for the protein tyrosine phosphatase PTP1B. Trends
Neurosci. 38, 462-465 (2015). [0371] 17. Hughes, J. P., Rees, S. S.,
Kalindjian, S. B. & Philpott, K. L. Principles of early drug discovery.
Br. J. Pharmacol. 162, 1239-1249 (2011). [0372] 18. Kennedy, T. Managing
the drug discovery/development interface. Drug Discov. Today 2, 436-444
(1997). [0373] 19. Snyder, P. W., Lockett, M. R., Moustakas, D. T. &
Whitesides, G. M. Is it the shape of the cavity, or the shape of the
water in the cavity? Eur. Phys. J. Spec. Top. 223, 853-891 (2014). [0374]
20. Klebe, G. Applying thermodynamic profiling in lead finding and
optimization. Nat. Rev. Drug Discov. 14, 95-110 (2015). [0375] 21. Chang,
M. C. Y. & Keasling, J. D. Production of isoprenoid pharmaceuticals by
engineered microbes. Nat. Chem. Biol. 2, 674-681 (2006). [0376] 22.
George, K. W., Alonso-Gutierrez, J., Keasling, J. D. & Lee, T. S.
Isoprenoid Drugs, Biofuels, and Chemicals-Artemisinin, Farnesene, and
Beyond. Adv Biochem Eng Biotechnol 148, 355-389 (2014). [0377] 23. Cragg,
G. M. & Newman, D. J. Natural products: A continuing source of novel drug
leads. Biochim. Biophys. Acta-Gen. Subj. 1830, 3670-3695 (2013). [0378]
24. Galanie, S., Thodey, K., Trenchard, I. J., Filsinger Interrante, M. &
Smolke, C. D. Complete biosynthesis of opioids in yeast. Science
349, 1095-1100 (2015). [0379] 25. Govindarajan, S., Recabarren, R. &
Goldstein, R. A. Estimating the total number of protein folds. Proteins
35, 408-414 (1999). [0380] 26. Atanasov, A. G., Waltenberger, B.,
Pferschy-Wenzig, E. M., Linder, T., Wawrosch, C., Uhrin, P., Temml, V.,
Wang, L., Schwaiger, S., Heiss, E. H., Rollinger, J. M., Schuster, D.,
Breuss, J. M., Bochkov, V., Mihovilovic, M. D., Kopp, B., Bauer, R.,
Dirsch, V. M. & Stuppner, H. Discovery and resupply of pharmacologically
active plant-derived natural products: A review. Biotechnol. Adv. 33,
1582-1614 (2015). [0381] 27. Lauchli, R., Rabe, K. S., Kalbarczyk, K. Z.,
Tata, A., Heel, T., Kitto, R. Z. & Arnold, F. H. High-throughput
screening for terpene-synthase-cyclization activity and directed
evolution of a terpene synthase. Angew. Chemie--Int. Ed. 52, 5571-5574
(2013). [0382] 28. Morrone, D., Lowry, L., Determan, M. K., Hershey, D.
M., Xu, M. & Peters, R. J. Increasing diterpene yield with a modular
metabolic engineering system in E. coli: Comparison of MEV and MEP
isoprenoid precursor pathway engineering. Appl. Microbiol. Biotechnol.
85, 1893-1906 (2010). [0383] 29. Peters, R. J. & Croteau, R. B.
Abietadiene synthase catalysis: mutational analysis of a prenyl
diphosphate ionization-initiated cyclization and rearrangement. Proc.
Natl. Acad. Sci. U.S.A. 99, 580-584 (2002). [0384] 30. Wilderman, P. R. &
Peters, R. J. A single residue switch converts abietadiene synthase into
a pimaradiene specific cyclase. J. Am. Chem. Soc. 129, 15736-15737
(2007). [0385] 31. Criswell, J., Potter, K., Shephard, F., Beale, M. H. &
Peters, R. J. A single residue change leads to a hydroxylated product
from the class II diterpene cyclization catalyzed by abietadiene
synthase. Org. Lett. 14, 5828-5831 (2012). [0386] 32. Fasan, R. Tuning
P450 enzymes as oxidation catalysts. ACS Catal. 2, 647-666 (2012). [0387]
33. Hamberger, B. B., Ohnishi, T., Hamberger, B. B., Seguin, A. &
Bohlmann, J. Evolution of diterpene metabolism: Sitka spruce CYP720B4
catalyzes multiple oxidations in resin acid biosynthesis of conifer
defense against insects. Plant Physiol. 157, 1677-95 (2011). [0388] 34.
Fujimori, D. G. & Walsh, C. T. What's new in enzymatic halogenations.
Curr. Opin. Chem. Biol. 11, 553-560 (2007). [0389] 35. Martin, V. J. J.,
Pitera, D. J., Withers, S. T., Newman, J. D. & Keasling, J. D.
Engineering a mevalonate pathway in Escherichia coli for production of
terpenoids. Nat. Biotechnol. 21, 796-802 (2003). [0390] 36. Zhang, F. &
Keasling, J. Biosensors and their applications in microbial metabolic
engineering. Trends Microbiol. 19, 323-329 (2011). [0391] 37. Ajikumar,
P. K., Xiao, W.-H., Tyo, K. E. J., Wang, Y., Simeon, F., Leonard, E.,
Mucha, O., Phon, T. H., Pfeifer, B. & Stephanopoulos, G. Isoprenoid
pathway optimization for Taxol precursor overproduction in Escherichia
coli. Science 330, 70-74 (2010). [0392] 38. Dietrich, J. A., Yoshikuni,
Y., Fisher, K. J., Woolard, F. X., Ockey, D., McPhee, D. J., Renninger,
N. S., Chang, M. C. Y., Baker, D. & Keasling, J. D. A novel
semi-biosynthetic route for artemisinin production using engineered
substrate-promiscuous P450BM3. ACS Chem. Biol. 4, 261-267 (2009). [0393]
39. Zhang, K., El Damaty, S. & Fasan, R. P450 fingerprinting method for
rapid discovery of terpene hydroxylating P450 catalysts with diversified
regioselectivity. J. Am. Chem. Soc. 133, 3242-3245 (2011). [0394] 40.
Ruttkies, C., Schymanski, E. L., Wolf, S., Hollender, J. & Neumann, S.
MetFrag relaunched: Incorporating strategies beyond in silico
fragmentation. J. Cheminform. 8, (2016). [0395] 41. Pelander, A., Tyrkko,
E. & Ojanpera, I. In silico methods for predicting metabolism and mass
fragmentation applied to quetiapine in liquid
chromatography/time-of-flight mass spectrometry urine drug screening.
Rapid Commun. Mass Spectrom. 23, 506-514 (2009). [0396] 42. Kampranis, S.
C., Ioannidis, D., Purvis, A., Mahrez, W., Ninga, E., Katerelos, N. A.,
Anssour, S., Dunwell, J. M., Degenhardt, J., Makris, A. M., Goodenough,
P. W. & Johnson, C. B. Rational conversion of substrate and product
specificity in a Salvia monoterpene synthase: structural insights into
the evolution of terpene synthase function. Plant Cell 19, 1994-2005
(2007). [0397] 43. Huang, Q., Williams, H. J., Roessner, C. A. & Scott,
A. I. Sesquiterpenes produced by truncated taxadiene synthase.
Tetrahedron Lett. 41, 9701-9704 (2000). [0398] 44. Tachibana, A., Yano,
Y., Otani, S., Nomura, N., Sako, Y. & Taniguchi, M. Novel
prenyltransferase gene encoding farnesylgeranyl diphosphate synthase from
a hyperthermophilic archaeon, Aeropyrum pernix. Molecular evolution with
alteration in product specificity. Eur. J. Biochem. 267, 321-328 (2000).
[0399] 45. Jia, M., Potter, K. C. & Peters, R. J. Extreme promiscuity of
a bacterial and a plant diterpene synthase enables combinatorial
biosynthesis. Metab. Eng. 37, 24-34 (2016). [0400] 46. Fox, J. M., Kang,
K. K., Sastry, M., Sherman, W., Sankaran, B., Zwart, P. H. & Whitesides,
G. M. Water-Restructuring Mutations Can Reverse the Thermodynamic
Signature of Ligand Binding to Human Carbonic Anhydrase. Angew. Chemie
Int. Ed. 56, 3833-3837 (2017). [0401] 47. Krimmer, S. G., Betz, M.,
Heine, A. & Klebe, G. Methyl, ethyl, propyl, butyl: Futile but not for
water, as the correlation of structure and thermodynamic signature shows
in a congeneric series of thermolysin inhibitors. ChemMedChem 9, 833-846
(2014). [0402] 48. Jung, S. T., Lauchli, R. & Arnold, F. H. Cytochrome
P450: Taming a wild type enzyme. Curr. Opin. Biotechnol. 22, 809-817
(2011). [0403] 49. Hamberger, B. B., Ohnishi, T., Hamberger, B. B.,
Seguin, A. & Bohlmann, J. Evolution of diterpene metabolism: Sitka spruce
CYP720B4 catalyzes multiple oxidations in resin acid biosynthesis of
conifer defense against insects. Plant Physiol. 157, 1677-95 (2011).
[0404] 50. Seifert, A., Vomund, S., Grohmann, K., Kriening, S., Urlacher,
V. B., Laschat, S. & Pleiss, J. Rational design of a minimal and highly
enriched CYP102A1 mutant library with improved regio-, stereo- and
chemoselectivity. ChemBioChem 10, 853-861 (2009). [0405] 51. Lewis, J. C.,
Mantovani, S. M., Fu, Y., Snow, C. D., Komor, R. S., Wong, C. H. &
Arnold, F. H. Combinatorial alanine substitution enables rapid
optimization of cytochrome P450BM3 for selective hydroxylation of large
substrates. ChemBioChem 11, 2502-2505 [0406] 52. Butler, C. F., Peet, C.,
Mason, A. E., Voice, M. W., Leys, D. & Munro, A. W. Key mutations alter
the cytochrome P450 BM3 conformational landscape and remove inherent
substrate bias. J. Biol. Chem. 288, 25387-25399 (2013). [0407] 53.
Auffinger, P., Hays, F. A., Westhof, E. & Ho, P. S. Halogen bonds in
biological molecules. Proc. Natl. Acad. Sci. U.S.A 101, 16789-16794
(2004). [0408] 54. Fox, J., Kang, K., Sherman, W., Heroux, A., Sastry,
G., Baghbanzadeh, M., Lockett, M. & Whitesides, G. Interactions between
Hofmeister anions and the binding pocket of a protein. J. Am. Chem. Soc.
137, 3859-3866 (2015). [0409] 55. Carter, M., Voth, A. R., Scholfield, M.
R., Rummel, B., Sowers, L. C. & Ho, P. S. Enthalpy-entropy compensation
in biomolecular halogen bonds measured in DNA junctions. Biochemistry 52,
4891-4903 (2013). [0410] 56. Shepherd, S. A., Menon, B. R. K., Fisk, H.,
Struck, A.-W., Levey, C., Leyes, D. & Micklefield, J. A Structure-Guided
Switch in the Regioselectivity of a Tryptophan Halogenase. ChemBioChem
17, 821-824 (2016). [0411] 57. Carter-Franklin, J. N., Parrish, J. D.,
Tschirret-Guth, R. A., Little, R. D. & Butler, A. Vanadium
haloperoxidase-catalyzed bromination and cyclization of terpenes. J. Am.
Chem. Soc. 125, 3688-3689 (2003). [0412] 58. Li, R., Chou, W. K. W.,
Himmelberger, J. A., Litwin, K. M., Harris, G. G., Cane, D. E. &
Christianson, D. W. Reprogramming the chemodiversity of terpenoid
cyclization by remolding the active site contour of epi-isozizaene
synthase. Biochemistry 53, 1155-1168 (2014). [0413] 59. Brown, S. &
O'Connor, S. E. Halogenase Engineering for the Generation of New Natural
Product Analogues. ChemBioChem 16, 2129-2135 (2015). [0414] 60. Steele,
C. L., Crock, J., Bohlmann, J. & Croteau, R. Sesquiterpene Synthases from
Grand Fir (Abies grandis). J. Biol. Chem. 273, 2078-2089 (1998). [0415]
61. Lu, Y. & Mei, L. Co-expression of P450 BM3 and glucose dehydrogenase
by recombinant Escherichia coli and its application in an NADPH-dependent
indigo production system. J. Ind. Microbiol. Biotechnol. 34, 247-253
(2007). [0416] 62. Tzeng, S.-R. & Kalodimos, C. G. Protein activity
regulation by conformational entropy. Nature 488, 236-240 (2012). [0417]
63. Aramini, J. M., Vorobiev, S. M., Tuberty, L. M., Janjua, H.,
Campbell, E. T., Seetharaman, J., Su, M., Huang, Y. J., Acton, T. B.,
Xiao, R., Tong, L. & Montelione, G. T. The RAS-Binding Domain of Human
BRAF Protein Serine/Threonine Kinase Exhibits Allosteric Conformational
Changes upon Binding HRAS. Structure 23, 1382-1393 (2015). [0418] 64.
Christianson, D. W. Structural biology and chemistry of the terpenoid
cyclases. Chem. Rev. 106, 3412-3442 (2006). [0419] 65. O'Maille, P. E.,
Malone, A., Delias, N., Andes Hess, B., Smentek, L., Sheehan, I.,
Greenhagen, B. T., Chappell, J., Manning, G. & Noel, J. P. Quantitative
exploration of the catalytic landscape separating divergent plant
sesquiterpene synthases. Nat. Chem. Biol. 4, 617-623 (2008). [0420] 66.
Packer, M. S. & Liu, D. R. Methods for the directed evolution of
proteins. Nat. Rev. Genet. 16, 379-394 (2015). [0421] 67. Badran, A. H.,
Guzov, V. M., Huai, Q., Kemp, M. M., Vishwanath, P., Kain, W., Nance, A.
M., Evdokimov, A., Moshiri, F., Turner, K. H., Wang, P., Malvar, T. &
Liu, D. R. Continuous evolution of Bacillus thuringiensis toxins
overcomes insect resistance. Nature 533, 58-63 (2016). [0422] 68. Ting, A.
Y., Kain, K. H., Klemke, R. L. & Tsien, R. Y. Genetically encoded
fluorescent reporters of protein tyrosine kinase activities in living
cells. Proc. Natl. Acad. Sci. U.S.A 98, 15003-15008 (2001). [0423] 69.
Sato, M. & Umezawa, Y. in Cell Biol. Four-Volume Set 2, 325-328 (2006).
[0424] 70. Dye, B. T. Flow cytometric analysis of CFP-YFP FRET as a
marker for in vivo protein-protein interaction. Clin. Appl. Immunol. Rev.
5, 307-324 (2005). [0425] 71. Vereb, G., Nagy, P. & Szollosi, J. Flow
cytometric FRET analysis of protein interaction. Methods Mol. Biol. 699,
371-92 (2011). [0426] 72. Chen, Y.-N. P., LaMarche, M. J., Chan, H. M.,
Fekkes, P., Garcia-Fortanet, J., Acker, M. G., Antonakos, B., Chen, C.
H.-T., Chen, Z., Cooke, V. G., Dobson, J. R., Deng, Z., Fei, F.,
Firestone, B., Fodor, M., Fridrich, C., Gao, H., Grunenfelder, D., Hao,
H.-X., Jacob, J., Ho, S., Hsiao, K., Kang, Z. B., Karki, R., Kato, M.,
Larrow, J., La Bonte, L. R., Lenoir, F., Liu, G., Liu, S., Majumdar, D.,
Meyer, M. J., Palermo, M., Perez, L., Pu, M., Price, E., Quinn, C.,
Shakya, S., Shultz, M. D., Slisz, J., Venkatesan, K., Wang, P., Warmuth,
M., Williams, S., Yang, G., Yuan, J., Zhang, J.-H., Zhu, P., Ramsey, T.,
Keen, N. J., Sellers, W. R., Stams, T.
& Fortin, P. D. Allosteric inhibition of SHP2 phosphatase inhibits cancers
driven by receptor tyrosine kinases. Nature 535, 148-52 (2016). [0427]
73. Yoshikuni, Y., Ferrin, T. E. & Keasling, J. D. Designed divergent
evolution of enzyme function. Nature 440, 1078-1082 (2006). [0428] 74.
Gruet, A., Longhi, S. & Bignon, C. One-step generation of error-prone PCR
libraries using Gateway.RTM. technology. Microb. Cell Fact. 11, 14
(2012). [0429] 75. Dietrich, J. A., McKee, A. E. & Keasling, J. D.
High-Throughput Metabolic Engineering: Advances in Small-Molecule
Screening and Selection. Annu. Rev. Biochem. 79, 563-590 (2010). [0430]
76. Esvelt, K. M., Carlson, J. C. & Liu, D. R. A system for the
continuous directed evolution of biomolecules. Nature 472, 499-503
(2011). [0431] 77. Feiler, C., Fisher, A. C., Boock, J. T., Marrichi, M.
J., Wright, L., Schmidpeter, P. A. M., Blankenfeldt, W., Pavelka, M. &
DeLisa, M. P. Directed Evolution of Mycobacterium tuberculosis
.beta.-Lactamase Reveals Gatekeeper Residue That Regulates Antibiotic
Resistance and Catalytic Efficiency. PLoS One 8, (2013). [0432] 78.
Murphy, R. B., Repasky, M. P., Greenwood, J. R., Tubert-Brohman, I.,
Jerome, S., Annabhimoju, R., Boyles, N. A., Schmitz, C. D., Abel, R.,
Farid, R. & Friesner, R. A. WScore: A flexible and accurate treatment of
explicit water molecules in ligand-receptor docking. J. Med. Chem.
acs.jmedchem.6b00131 (2016). doi:10.1021/acs.jmedchem.6b00131 [0433] 79.
Konc, J., Miller, B. T., Stular, T., Lesnik, S., Woodcock, H. L., Brooks,
B. R. & Janezic, D. ProBiS-CHARMMing: Web Interface for Prediction and
Optimization of Ligands in Protein Binding Sites. J. Chem. Inf. Model.
55, 2308-2314 (2015). [0434] 80. Guo, R.-T., Cao, R., Liang, P.-H., Ko,
T.-P., Chang, T.-H., Hudock, M. P., Jeng, W.-Y., Chen, C. K.-M., Zhang,
Y., Song, Y., Kuo, C.-J., Yin, F., Oldfield, E. & Wang, A. H.-J.
Bisphosphonates target multiple sites in both cis- and
trans-prenyltransferases. Proc. Natl. Acad. Sci. U.S.A 104, 10022-10027
(2007). [0435] 81. Teng, K. H. & Liang, P. H. Structures, mechanisms and
inhibitors of undecaprenyl diphosphate synthase: A cis-prenyltransferase
for bacterial peptidoglycan biosynthesis. Bioorg. Chem. 43, 51-57 (2012).
[0436] 82. Zhu, W., Zhang, Y., Sinko, W., Hensler, M. E., Olson, J.,
Molohon, K. J., Lindert, S., Cao, R., Li, K., Wang, K., Wang, Y., Liu,
Y.-L., Sankovsky, A., de Oliveira, C. A. F., Mitchell, D. A., Nizet, V.,
McCammon, J. A. & Oldfield, E. Antibacterial drug leads targeting
isoprenoid biosynthesis. Proc. Natl. Acad. Sci. U.S.A 110, 123-8 (2013).
[0437] 83. Leonard, E., Ajikumar, P. K., Thayer, K., Xiao, W.-H., Mo, J.
D., Tidor, B., Stephanopoulos, G. & Prather, K. L. J. Combining
metabolic and protein engineering of a terpenoid biosynthetic pathway for
overproduction and selectivity control. Proc. Natl. Acad. Sci. U.S.A.
107, 13654-13659 (2010). [0439] 84. Ro, D.-K., Paradise, E. M., Ouellet,
M., Fisher, K. J., Newman, K. L., Ndungu, J. M., Ho, K. A., Eachus, R.
A., Ham, T. S., Kirby, J., Chang, M. C. Y., Withers, S. T., Shiba, Y.,
Sarpong, R. & Keasling, J. D. Production of the antimalarial drug
precursor artemisinic acid in engineered yeast. Nature 440, 940-943
(2006).
V. Specific Embodiments of Bacterial Systems for Identifying Small
Molecules that Modulate the Activity of Enzymes.
[0440] As described herein, a strain of Escherichia coli was developed
comprising both (i) a genetically encoded system (i.e., a "bacterial
two-hybrid" or B2H system) that links cell survival to the modulation
(e.g., inhibition) of a pathologically relevant enzyme from Homo sapiens (i.e., a
drug target) and (ii) a pathway for metabolite biosynthesis. The
genetically encoded system described herein contains more genetic
elements than would traditionally constitute a single operon (e.g. it has
more than one promoter), but it is sometimes referred to as an operon.
[0441] More specifically, as described herein, host organisms, e.g.
Escherichia (E.) coli, were transformed with up to four plasmids: a
first plasmid (plasmid 1), an expression plasmid comprising a genetically
encoded system that links the inhibition of a target enzyme to cell
survival, wherein the target enzyme may be chosen for the purpose of
identifying molecules that inhibit that specific enzyme; a second plasmid
(plasmid 2), an expression plasmid comprising an operon for expressing at
least some of the genes necessary to synthesize products of a metabolic
pathway, e.g. a mevalonate-dependent pathway for terpenoid biosynthesis
derived from Saccharomyces cerevisiae, for providing terpenoid product
compounds; a third plasmid (plasmid 3), an expression plasmid comprising
at least one additional gene not present in plasmid 2, e.g. a terpene
synthase such as ADS, GHS, ABS, or TXS, for providing desired products,
e.g. terpenoid products, such that when the host bacterium expresses only
plasmids 1 and 2, desired products are not produced until the host
bacterium also expresses plasmid 3, which completes the pathway for the
desired compounds; and a fourth plasmid (plasmid 4) comprising additional
genetic components specific to the strain of E. coli, e.g., the F-plasmid
of S1030 (Addgene 105063).
[0442] Examples of plasmid 1 embodiments are shown in FIG. 33A, 33B, 33D,
33E, FIG. 34, FIG. 35, FIG. 40A, 40B, 40C, 40D, etc.
[0443] In some embodiments, a strain of E. coli used as a host for
transformation possesses the .DELTA.rpoZ mutation, which enables the
system encoded by plasmid 1 to control the expression of a gene for
antibiotic resistance.
[0444] In some embodiments, plasmids 2 and/or 3 constitute a pathway for
terpenoid biosynthesis. In some embodiments, plasmids 2 and/or 3
constitute a pathway for alkaloid biosynthesis. In some embodiments,
plasmids 2 and/or 3 constitute a pathway for polyketide biosynthesis.
[0445] In some embodiments, plasmid 3 further comprises a GGPPS gene in
combination with either ABS or TXS. GGPPS genes encode enzymes that
provide substrates for terpene synthases, i.e. ABS or TXS. In some
embodiments, terpene synthase genes are wild-type genes. In some
preferred embodiments, terpene synthase genes contain mutations for
producing variants of terpenoid products, as described and shown herein.
In some embodiments, plasmid 3 further comprises a gene for terpenoid
functionalizing enzymes, e.g., cytochromes P450.
[0446] In some preferred embodiments, plasmid 1 is under control of
constitutive promoters. Thus, in some preferred embodiments, at least
some of the genes that are part of the operon in plasmid 1 are
constitutively expressed. In some preferred embodiments, at least some of
the genes that are part of the operon in plasmid 1 are expressed when
contacted with an inducer, i.e. they are under control of an inducible
promoter, such as a lacZ promoter turned on by an inducer such as IPTG.
[0447] In some preferred embodiments, plasmids 2 and 3 are under control
of inducible promoters. Thus, in some preferred embodiments, at least
some of, and in some cases the entire set of, the genes contained in a
metabolic pathway operon in plasmid 2 are expressed when contacted with
an inducer. In some preferred embodiments, some genes
expressed in plasmid 3 are under inducible control.
[0448] In some preferred embodiments, plasmid 4 is under the control of
constitutive promoters. Thus, in some embodiments, at least one gene in
plasmid 4 is under control of a constitutive promoter. In some
embodiments, at least one gene in plasmid 4 is under control of an
inducible promoter.
[0449] In some preferred embodiments, a host bacterium undergoes at least
2 rounds of transformation, e.g. first to transform plasmids 1 and 2
simultaneously into a strain that already harbors plasmid 4 (e.g., an
S1030 strain which already comprises this accessory plasmid), followed by
transformation with plasmid 3. In some preferred embodiments, a host
bacterium undergoes at least 3 rounds of transformation, e.g. first
transformation with plasmid 1, then transformation with plasmid 2,
followed by transformation with plasmid 3.
[0450] In some preferred embodiments, each plasmid has an antibiotic
resistance gene (or other type of selective gene) for identifying
successfully transformed bacteria for that plasmid, i.e. antibiotic
resistance genes may be different for each plasmid. Thus, when an
antibiotic resistance gene is expressed, a bacterium in contact with the
corresponding antibiotic is not stopped from normal replication; instead,
it is resistant and so is able to replicate at normal or near normal rates.
[0451] Thus, as described herein, laboratory strains of E. coli were
engineered to comprise up to three types of expression plasmids by first
transforming with plasmid 1, then selecting for transformants (growing
colonies) on/in antibiotic-containing media wherein nontransformants do
not grow, then transforming the transformants with plasmid 2 and
selecting for double transformants, e.g. on media containing antibiotics
that allow the growth of double transformants, then transforming double
transformants with plasmid 3 and selecting for triple transformants, e.g.
on media containing antibiotics that allow the growth of triple
transformants. In one embodiment, triple transformants are grown in media
containing an inducer(s) for the inducible plasmids (2 and 3) in
combination with the three antibiotics for producing products having at
least some inhibitory activity for the chosen enzyme of plasmid 1, made
by the combination of enzymes expressed by plasmids 2 and 3.
[0452] Further, as described herein, laboratory strains of E. coli were
engineered to comprise up to four types of expression plasmids by first
transforming host cells with plasmids 1 and 2, simultaneously, into a
strain that already harbors plasmid 4, then selecting for triple
transformants (growing colonies) on/in antibiotic containing media
wherein non-transformants do not grow, then further transforming
successful triple transformants with plasmid 3 and selecting for
quadruple transformants, e.g. media containing antibiotics that allow for
the growth of quadruple transformants. In one embodiment, quadruple
transformants are grown in media containing (i) an inducer(s) for the
inducible plasmids (2 and 3), (ii) a metabolic precursor for metabolite
biosynthesis, e.g., mevalonate, and (iii) five antibiotics (i.e., one for
each plasmid and one under control of the genetically encoded system in
plasmid 1) for producing products having at least some inhibitory
activity on the chosen enzyme of plasmid 1, made by the combination of
enzymes expressed by plasmids 2 and 3.
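The four-plasmid workflow above can be summarized as a short bookkeeping sketch. The Python below is illustrative only: the function names and the placeholder antibiotic labels are hypothetical (the text does not name the antibiotic assigned to each plasmid). The sketch tracks which plasmids are present after each round of transformation and how many antibiotics the corresponding selective media would contain.

```python
def selection_antibiotics(plasmids_present, include_b2h_gated=False):
    """List one placeholder antibiotic per plasmid currently in the host.

    The labels are hypothetical; the text only states that each plasmid
    may carry its own selectable marker.
    """
    antibiotics = [f"antibiotic_for_plasmid_{p}" for p in sorted(plasmids_present)]
    if include_b2h_gated:
        # Resistance to this antibiotic is under control of the genetically
        # encoded system in plasmid 1, so it reports on target inhibition.
        antibiotics.append("antibiotic_gated_by_plasmid_1_system")
    return antibiotics


def four_plasmid_workflow():
    """Return (plasmids added, selective antibiotics) for each transformation round."""
    present = {4}  # the starting strain already harbors plasmid 4
    rounds = []
    for added in ({1, 2}, {3}):  # round 1: plasmids 1 and 2; round 2: plasmid 3
        present = present | added
        rounds.append((sorted(added), selection_antibiotics(present)))
    return rounds


if __name__ == "__main__":
    for added, antibiotics in four_plasmid_workflow():
        print(f"added plasmids {added}: select on {len(antibiotics)} antibiotics")
    final = selection_antibiotics({1, 2, 3, 4}, include_b2h_gated=True)
    print(f"growth media for quadruple transformants: {len(final)} antibiotics")
```

Under these assumptions the final count is five antibiotics, matching the "one for each plasmid and one under control of the genetically encoded system" described above.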
[0453] In some embodiments, a terpenoid operon pathway intended for
insertion into, or already within, plasmid 2 may be altered by swapping in
a different gene for a terpene synthase (i.e., in each row of FIG. 36, the
metabolic pathway differs in the identity of the gene for a terpene
synthase; when ABS or TXS is present, GGPPS is also present).
[0454] In FIGS. 41A, 41B, 41C, and 41D, for example, we mutate (rather
than swap) a single gene of a metabolic pathway: e.g. we induce at least
one mutation in a gene encoding amorphadiene synthase. After doing so, we
show that a metabolic pathway can be mutated to generate a library of
pathways, and that these pathways can be screened to identify pathways
that generate more potent inhibitors of PTP1B than the unmutated parent
pathway.
[0455] To summarize, we provided a demonstration that (i) the B2H system
(detection operon) and (ii) a metabolic pathway for terpenoid
biosynthesis can be combined within a host organism to identify, and to
evolve, genes involved in the production of small molecules, including
genes that enable the microbial synthesis of PTP1B inhibitors.
[0456] In preferred embodiments, small-molecule products are derived from
one general metabolic pathway (the mevalonate-dependent pathway for
terpenoid biosynthesis from Saccharomyces cerevisiae), and one host
organism (Escherichia coli). These small-molecule products, produced as
described herein, are contemplated for use as treatments of type 2
diabetes, obesity, and breast cancer, among other diseases.
[0457] Without being bound by theory, when a genetically encoded system
for detecting the activity of a specified test enzyme is located within a
host bacterium, a constitutive promoter expresses part A of the detection
system (e.g. detection operon). An expressed kinase enzyme, e.g. Src
kinase, attaches a phosphate (P) group to the expressed second fusion
protein, which comprises a substrate recognition domain (S) attached to a
protein capable of recruiting RNA polymerase to DNA (e.g., the
RP.sub..omega. subunit of RNA polymerase). So long as the phosphatase (or
other test enzyme) expressed by part A is active, the phosphatase removes
that phosphate group, so that few molecules of phosphorylated fusion
protein 2 stay bound to fusion protein 1 and, thus, few complexes between
fusion proteins 2 and 1 form to initiate transcription of a gene of
interest (GOI).
[0458] Thus, transcription of part B is off and the expression of a GOI is
low, e.g. as observed when a GOI is a luminescent protein, so long as the
placZ inducible promoter is not being induced. In this embodiment of an
operon, the placZ inducible promoter is induced in order to allow the
expression of a gene of interest in the absence of an inhibitor when not
testing for inhibitor molecules.
[0459] However, in the presence of a small molecule that inhibits the
phosphatase, whether made endogenously by a metabolic pathway harbored by
plasmids 2 and 3 or added to the growth media, an excess of
phosphorylated fusion protein 2 accumulates and attaches to the substrate
recognition domain of fusion protein 1. When both are bound at the
operator and the RB binding site, the GOI is expressed, indicating the
presence of a phosphatase inhibitor.
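The on/off behavior described in the preceding paragraphs can be captured as a small truth-table sketch. The Python below is a deliberately simplified Boolean model, not a description of the actual construct: the class and function names are hypothetical, expression is treated as all-or-nothing rather than graded, and the separate inducible promoter mentioned above is ignored.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class CellState:
    kinase_active: bool       # e.g., Src kinase is expressed and active
    phosphatase_active: bool  # e.g., PTP1B is expressed and catalytically active
    inhibitor_present: bool   # a small molecule that inhibits the phosphatase


def substrate_phosphorylated(state: CellState) -> bool:
    """The substrate domain stays phosphorylated only if the kinase acts
    and the phosphatase cannot remove the phosphate group."""
    phosphatase_works = state.phosphatase_active and not state.inhibitor_present
    return state.kinase_active and not phosphatase_works


def goi_expressed(state: CellState) -> bool:
    """In this simplified model, the two fusion proteins form a complex
    (and transcription of the GOI starts) only when the substrate domain
    is phosphorylated."""
    return substrate_phosphorylated(state)


if __name__ == "__main__":
    for kinase, phosphatase, inhibitor in product([True, False], repeat=3):
        state = CellState(kinase, phosphatase, inhibitor)
        print(state, "-> GOI on" if goi_expressed(state) else "-> GOI off")
```

In this sketch, an inactive phosphatase (for example, a catalytically inactive variant) and an inhibited phosphatase give the same output, which is the sense in which GOI expression reports on inhibition.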
[0460] For practical purposes, it does not matter which fusion protein
possesses a DNA-binding protein and which possesses a protein capable of
recruiting RNA polymerase to DNA, so long as the DNA-binding protein
constitutes part of one fusion protein and the protein that recruits RNA
polymerase constitutes part of the other fusion protein, see FIG. 40,
FIG. 10, for examples.
[0461] E. coli DH10B was used for molecular cloning and for preliminary
analyses of terpenoid production; E. coli S1030.sup.1 was used for
luminescence studies and for experiments involving terpenoid-mediated
selection (e.g., molecular evolution); and E. coli BL21 was used for
experiments involving the heterologous expression and subsequent
purification of proteins. However, it is not intended to limit the host
bacterial strain to these E. coli strains. Indeed, any bacterial strain
that supports the expression of the operons, DNA sequences and plasmids
as described herein may be used as a host bacterial strain.
[0463] A. Bacterial Two-Hybrid (B2H) Systems (Operons) for the
Identification of Microbially Synthesizable Inhibitors of PTP1B.
[0464] One embodiment is an application of the B2H system to the
evolution of genes that enable the microbial synthesis of molecules that
(i) inhibit PTP1B and (ii) may be identified (i.e., structurally
characterized) with standard analytical methods. In brief, the B2H system
links the inactivation of PTP1B to the expression of a gene for
antibiotic resistance. Accordingly, when a strain of E. coli (or other
host bacterium) harbors both (i) the B2H system and (ii) a metabolic
pathway for terpenoid biosynthesis, it will survive in the presence of
antibiotics when it produces terpenoids that inhibit PTP1B.
[0465] A bacterial two-hybrid (B2H) system as described herein comprises
one embodiment of an operon as described herein. The data displayed on the
left side of the plot in FIG. 33D (i.e., p130cas [also called liras] and
MidT substrates) are the same data displayed in FIG. 29A, presented with
more details of the B2H system in light of its development.
[0466] We propose to use directed evolution to evolve new inhibitors; that
is, we will manually introduce mutations into specific genes (or sets of
genes) within a metabolic pathway to generate a library of metabolic
pathways that can be screened alongside the B2H system. FIG. 41A
describes a general approach to introduce mutations; Example C provides a
very specific approach represented by FIG. 41A. To screen our library, we
transform it into B2H-containing cells, and we grow them on plates
containing various concentrations of spectinomycin; colonies that form on
plates with high concentrations of spectinomycin contain a pathway
capable of generating molecules that activate the B2H system (i.e.,
inhibit PTP1B). This pathway will not naturally evolve on its own. We
can, thus, remove it from the first host cell, and transform it into
another strain of E. coli to make high concentrations of inhibitors.
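The plate-based screen described above amounts to asking which library members tolerate more spectinomycin than the unmutated parent pathway. The Python below is a minimal sketch under stated assumptions: the function names are hypothetical, growth is recorded as a simple yes/no per concentration, and any concentrations or variant labels supplied to it are placeholders rather than measured data.

```python
def max_tolerated(growth_by_conc):
    """Return the highest antibiotic concentration at which colonies formed.

    growth_by_conc maps a concentration (e.g., in ug/ml) to True/False.
    Returns None if no growth was observed at any tested concentration.
    """
    tolerated = [conc for conc, grew in growth_by_conc.items() if grew]
    return max(tolerated) if tolerated else None


def _score(growth_by_conc):
    # Treat "no growth anywhere" as lower than any real concentration.
    best = max_tolerated(growth_by_conc)
    return float("-inf") if best is None else best


def select_hits(library, parent_key):
    """Return, sorted by name, the library members that grow at a higher
    antibiotic concentration than the parent pathway."""
    parent_score = _score(library[parent_key])
    return sorted(
        name
        for name, growth in library.items()
        if name != parent_key and _score(growth) > parent_score
    )


if __name__ == "__main__":
    # Placeholder observations for illustration only (not experimental data).
    library = {
        "parent": {0: True, 100: True, 250: False},
        "variant_a": {0: True, 100: True, 250: True},
        "variant_b": {0: True, 100: False, 250: False},
    }
    print(select_hits(library, "parent"))
```

Hits selected this way are only candidates; as the text notes, the pathway is then moved to another strain to produce the inhibitors for further characterization.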
[0467] Embodiments of the system described herein enable the rapid
identification of drug leads that can be readily synthesized in microbial
hosts. They allow for a simultaneous solution to two problems encountered
during pharmaceutical development that are often examined separately: 1)
the identification of leads and 2) subsequent synthesis of those leads
identified in 1).
[0468] Systems described herein have at least four uses:
[0469] 1. Enables the identification of genes for proteins that generate
inhibitors of the drug target. In brief, when the pathway for terpenoid
biosynthesis generates target-inhibiting molecules, the cell survives at
high antibiotic concentrations. By swapping out genes for terpene
synthesizing and/or functionalizing enzymes, we can identify genes for
enzymes that build such inhibitors.
[0470] 2. Enables the construction of novel (and, perhaps, unnatural)
inhibitors. By mutating the pathway for terpenoid biosynthesis, we can
generate pathways that confer survival at high antibiotic concentrations.
These pathways contain mutated (i.e., unnatural) genes and, thus, can
generate inhibitor molecules not found in Nature.
[0471] 3. Enables the construction of inhibitors that overcome drug
resistance. Briefly, after building a strain that generates a
target-inhibiting molecule, we can carry out two steps: (i) We can mutate
the drug target until it becomes resistant to that inhibitor. (ii) We can
mutate the metabolic pathway until it generates an inhibitor of the
mutated drug target. In this way, we can both (i) predict drug-resistance
mutations and (ii) address those mutations by generating new inhibitors
that overcome them.
[0472] 4. Enables the construction of inhibitors of protein tyrosine
kinases. Using a selection strategy similar to that described in 3.ii, we
can mutate a metabolic pathway until it generates an inhibitor of Src
kinase.
[0473] B. A Genetically Encoded System that Links the Inhibition of a
Protein Tyrosine Phosphatase to Cell Survival.
[0474] In one preferred embodiment, a genetically encoded system was
developed and used, as described herein, for detecting the presence of a
small-molecule inhibitor of the catalytic domain of a chosen enzyme, e.g.
a drug target enzyme, while allowing the survival of a host cell in the
presence of selective growth media. In other words, when the
genetically encoded system is part of an expression plasmid in E. coli,
the host cell survives in selective growth media when such an inhibitor
is present.
[0475] In one embodiment, an exemplary drug target enzyme was chosen, e.g.
a protein tyrosine phosphatase enzyme, protein tyrosine phosphatase 1B
(PTP1B).
[0476] In one embodiment, the genetically encoded system is part of an
expression plasmid. In one embodiment, the sensing operon is operably
linked to a constitutive promoter for expression in E. coli.
FIG. 33A-E illustrates an embodiment of a genetically encoded system that
links the activity of an enzyme to the expression of a gene of interest
(GOI). Error bars in FIG. 33B-E denote standard deviation with n=3
biological replicates. FIG. 33A illustrates an embodiment of a bacterial
two-hybrid system that detects phosphorylation-dependent protein-protein
interactions. Components include (i) a substrate domain fused to the
omega subunit of RNA polymerase (yellow), (ii) an SH2 domain fused to the
434 phage cI repressor (light blue), (iii) an operator for 434cI (dark
green), (iv) a binding site for RNA polymerase (purple), (v) Src kinase,
and (vi) PTP1B. Src-catalyzed phosphorylation of the substrate domain
enables a substrate-SH2 interaction that activates transcription of a
gene of interest (GOI, black). PTP1B-catalyzed dephosphorylation of the
substrate domain prevents that interaction; inhibition of PTP1B
re-enables it. FIG. 33B refers to an embodiment of the two-hybrid system
from FIG. 33A that (i) lacks PTP1B and (ii) contains luxAB as the GOI. We
used an inducible plasmid to increase expression of specific components;
overexpression of Src enhanced luminescence. FIG. 33C refers to an
embodiment of the two-hybrid system from FIG. 33A that (i) lacks both
PTP1B and Src and (ii) includes a "superbinder" SH2 domain (SH2*, i.e.,
an SH2 domain with mutations that enhance its affinity for
phosphopeptides), a variable substrate domain, and LuxAB as the GOI. We
used an inducible plasmid to increase expression of Src; luminescence
increased most prominently for p130cas and MidT, suggesting that Src acts
on both substrate domains. FIG. 33D refers to an embodiment of a
two-hybrid system from FIG. 33C with one of two substrates: p130cas or
MidT. We used a second plasmid to overexpress either (i) Src and PTP1B or
(ii) Src and an inactive variant of PTP1B (C215S). The difference in
luminescence between systems containing PTP1B or PTP1B(C215S) was
greatest for MidT, suggesting that PTP1B acts on this substrate. Right:
An optimized version of the two-hybrid system (with bb030 as the RBS for
PTP1B) appears for reference. FIG. 33E displays the results of an
exemplary growth-coupled assay performed using an optimized B2H including
SH2*, a midT substrate, optimized promoters and ribosome binding sites
(bb034 for PTP1B), and SpecR as the GOI. This system is illustrated at
the top of the figure. Exemplary growth results demonstrate that
inactivation of PTP1B enables a strain of E. coli harboring this system to
survive at high concentrations of spectinomycin (>250 .mu.g/ml).
[0477] 1. Sequential Optimization of a Two-Hybrid System with LuxAB as the
GOI.
[0478] Phase 1: We examined two different promoters for Src in a system
that lacked PTP1B. Phase 2: We examined two different ribosome binding
sites (RBSs) for Src in a system that lacked PTP1B. Phase 3: We examined
two different RBSs for PTP1B in a complete system. Note: In phases 1 and
2, the operon contains wild-type (WT) or non-phosphorylatable (mutant,
Y/F) versions of the substrate domain. In phase 3, the operon contains
wild-type (WT) or catalytically inactive (mutant, C215S) versions of
PTP1B. See FIG. 34.
FIG. 34 illustrates exemplary experiments used to optimize the B2H system
depicted in FIG. 33.
[0479] 2. Comparing RB Sites.
[0480] We grew strains of E. coli harboring versions of the bacterial
two-hybrid that contained different RBSs for PTP1B (bb034 or bb030) on
various concentrations of spectinomycin (left to right) and plated them
on various concentrations of spectinomycin (top to bottom). We used bb034
for one embodiment of an "optimized" two-hybrid system shown in FIG. 33E.
See, FIG. 35.
FIG. 35 illustrates exemplary experiments used to optimize the B2H
system depicted in FIG. 33 for growth-coupled assays. [0481] Rice, P.,
Longden, L. & Bleasby, A. EMBOSS: The European Molecular Biology Open
Software Suite. Trends Genet. 16, 276-277 (2000).
[0482] C. Biosynthesis of PTP1B-Inhibiting Terpenoids Enables Cell
Survival.
[0483] When pTS contains ADS or GHS, it does not contain GGPPS; when pTS
contains ABS or TXS, it also contains GGPPS; ABS.sub.D404A/D621A refers
to a catalytically inactive variant of ABS; and B2H* contains
PTP1B(C215S). ADS and, marginally, ABS enabled survival in the presence
of spectinomycin, a result suggestive of the ability of these two terpene
synthases to generate inhibitors of PTP1B.
FIG. 36A-C depicts an exemplary metabolic pathway for the biosynthesis of
terpenoids (FIG. 36A), exemplary terpene synthases (FIG. 36B), and
exemplary results showing that the biosynthesis of PTP1B-inhibiting
terpenoids enables cell survival (FIG. 36C). FIG. 36A depicts a
plasmid-borne pathway for terpenoid biosynthesis: (i) pMBIS, which
harbors the mevalonate-dependent isoprenoid pathway of S. cerevisiae,
converts mevalonate to isopentenyl pyrophosphate (IPP) and farnesyl
pyrophosphate (FPP). (ii) pTS, which encodes a terpene synthase (TS) and,
when necessary, a geranylgeranyl diphosphate synthase (GGPPS), converts
IPP and FPP to sesquiterpenes and/or diterpenes. FIG. 36B depicts
exemplary terpene synthases: amorphadiene synthase (ADS) from Artemisia
annua, .gamma.-humulene synthase (GHS) from Abies grandis, abietadiene
synthase (ABS) from Abies grandis, and taxadiene synthase (TXS) from
Taxus brevifolia. FIG. 36C shows the results of an exemplary
growth-coupled assay of a strain of E. coli that contains both (i) an
embodiment of the optimized bacterial two-hybrid (B2H) system (i.e., the
B2H system from FIG. 33E) and (ii) an embodiment of a pathway for
terpenoid biosynthesis (i.e., the pathway from FIG. 36A).
[0484] Briefly, we grew strains of E. coli that harbored (i) the same
pathway for producing linear isoprenoid precursors and (ii) a different
plasmid encoding a terpene synthase (pTS). The pTS plasmid contained one
of the following: (i) amorphadiene synthase (ADS) from Artemisia annua,
(ii) .gamma.-humulene synthase (GHS) from Abies grandis, (iii)
abietadiene synthase (ABS) from Abies grandis in operable combination
with a geranylgeranyl diphosphate synthase (GGPPS), (iv) taxadiene
synthase (TXS) from Taxus brevifolia in operable combination with a
GGPPS, (v) an inactive variant of ABS (i.e., ABS.sub.xx, which
corresponds to ABS.sub.D404A/D621A), or (vi) the L450Y mutant of GHS.
After growing these strains, we compared the ability of their products to
inhibit PTP1B by carrying out the following steps: (i) we used a hexane
overlay to extract hydrophobic products (e.g., terpene-like products)
from each culture; (ii) we dried the products in a rotary evaporator;
(iii) we dissolved the dried extract in dimethyl sulfoxide (DMSO); and
(iv) we measured PTP1B-catalyzed hydrolysis of p-nitrophenyl phosphate
(pNPP) in the presence and absence of extract-containing DMSO. We note:
The L450Y mutant of GHS was included in our analysis because the
wild-type form of GHS does not permit B2H-mediated growth in the presence
of an antibiotic, but our preliminary data indicate that the L450Y mutant
of GHS does permit such growth. Accordingly, we hypothesized that this
mutant produced a molecule that is a stronger inhibitor of PTP1B than the
molecules generated by wild-type GHS. See, FIG. 37A-C, a demonstration of
differential inhibition by structurally distinct terpenoids.
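By way of non-limiting illustration, the comparison made in the pNPP assay described above reduces to comparing initial rates measured with and without extract. The following Python sketch assumes absorbance readings taken at fixed time points within the linear phase of the reaction; the function names, the readout, and the sample numbers are hypothetical and are not data from the experiments described herein.

```python
def initial_rate(times, absorbances):
    """Least-squares slope of absorbance vs. time (linear phase assumed)."""
    n = len(times)
    if n < 2 or n != len(absorbances):
        raise ValueError("need at least two paired readings")
    mean_t = sum(times) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times, absorbances))
    return sxy / sxx


def percent_inhibition(rate_control, rate_with_extract):
    """Reduction in initial rate relative to the DMSO-only control."""
    return 100.0 * (1.0 - rate_with_extract / rate_control)


# Hypothetical readings (time in seconds, absorbance in arbitrary units).
times = [0, 30, 60, 90, 120]
dmso_only = [0.10, 0.16, 0.22, 0.28, 0.34]
with_extract = [0.10, 0.13, 0.16, 0.19, 0.22]

v0 = initial_rate(times, dmso_only)
vi = initial_rate(times, with_extract)
inhibition = percent_inhibition(v0, vi)
```

A linear fit is used here only because initial-rate measurements are conventionally taken before substrate depletion; other fitting choices are equally compatible with the description above.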
[0485] In examining FIG. 37A-C, we observed a trend: Extracts from strains
containing terpene synthases that confer resistance to high
concentrations of antibiotic (see FIG. 36), such as ADS and
GHS.sub.L450Y, were more inhibitory than extracts from strains containing
terpene synthases that did not confer resistance, e.g., TXS and
ABS.sub.xx. We note: strains containing ADS and GHS also included the
optimized bacterial two-hybrid (B2H) system, but selection was not
performed in the experiments used to produce terpenoids for the
experiments described by these figures.
FIG. 37A-C provides an exemplary analysis of the inhibitory effects of
terpenoids generated by different strains of E. coli. FIG. 37A depicts
the results of our analysis of the inhibitory effect of DMSO containing
(i) no inhibitor and (ii) extracted compounds from the culture broth of
the ADS-containing strain. FIG. 37B depicts the results of our analysis
of the inhibitory effect of DMSO containing (i) extracted compounds from
the culture broth of the GHS-containing strain (gHUM) or (ii) extracted
compounds from the culture broth of the strain including the L450Y mutant
of GHS. FIG. 37C depicts the results of our analysis of the inhibitory
effect of DMSO containing (i) no inhibitor, (ii) extracted compounds from
the culture broth of the ABS-containing strain, (iii) extracted compounds
from the culture broth of the TXS-containing strain, and (iv) extracted
compounds from the culture broth of the strain containing a
catalytically inactive variant of ABS.
[0486] Briefly, we grew strains of E. coli containing both (i) the
optimized bacterial two-hybrid system and (ii) a terpenoid pathway with
mutants of .gamma.-humulene synthase (GHS; 1 mutant/cell) on varying
concentrations of spectinomycin. Above: product profiles of strains with
GHS mutants that conferred survival at high antibiotic concentrations.
See, FIG. 38.
FIG. 38 shows exemplary analysis of the product profiles of mutants of
GHS that enabled growth in the presence of spectinomycin.
[0487] In brief, we constructed versions of the bacterial two-hybrid
system that include SH2*, the midT substrate, optimized promoters and
ribosome binding sites, SpecR, and alternative PTPs: the catalytic domain
of PTPN6 (e.g., SHP-1) and PTP1B405 (the full-length version of PTP1B).
Note: these systems are identical to the B2H system depicted in FIG. 33E,
except they possess only one of the following PTP genes: PTP1B (as in
FIG. 33E), PTPN6 (different from FIG. 33E), or full-length PTP1B.
Inactivation of the catalytic domain of both PTPN6 and the full-length
PTP1B enabled strains of E. coli harboring corresponding operons to
survive at high concentrations of spectinomycin (>400 .mu.g/ml). To
extend our operon to other PTPs, we plan on modifying the substrate, SH2,
and/or kinase domains. See, FIG. 39.
FIG. 39 An analysis of exemplary B2H systems that link the inhibition of
other PTPs to cell survival.
[0488] We also generated versions of the bacterial two-hybrid system that
include SH2*, the midT substrate, optimized promoters and ribosome
binding sites, SpecR, and alternative PTPs: the catalytic domain of PTPN6
(e.g., SHP-1) and PTP1B405 (the full-length version of PTP1B).
Inactivation of the catalytic domain of PTPN6 and the full-length PTP1B
enabled strains of E. coli harboring corresponding operons to survive at
high concentrations of spectinomycin (>400 .mu.g/ml). To extend our
operon to other PTPs, we plan on modifying the substrate, SH2, and/or
kinase domains.
FIG. 40A-E depicts exemplary embodiments of genetically encoded systems
that link the activity of an enzyme to the expression of a gene of
interest, and the application of those embodiments to (i) the prediction
of resistance mutations, (ii) the construction of inhibitors that combat
resistance mutations, and (iii) the evolution of inhibitors of kinases.
FIG. 40A depicts an exemplary first step in examining potential
resistance mutations. We will evolve a metabolic pathway to produce
molecules that inhibit a known drug target (e.g., PTP1B); these molecules
will permit expression of a gene of interest (GOI) that confers survival
in the presence of a selection pressure (e.g., the presence of
spectinomycin, an antibiotic). FIG. 40B depicts an exemplary second step
in examining potential resistance mutations. In a second strain of E.
coli, we will replace the original gene of interest with a second (GOI2)
that confers conditional toxicity (e.g., SacB, which converts sucrose to
levan, a toxic product); we will evolve the drug target to become
resistant to the endogenous inhibitors, while still retaining its
activity. This mutant will prevent expression of the toxic gene. FIG. 40C
depicts an exemplary third step in combating resistance mutations. In a
third strain of E. coli, we will evolve a metabolic pathway that produces
molecules that inhibit the mutated drug target. In this way, we will both
predict--and, through our second evolved pathway, address--mutations that
might cause resistance to terpenoid-based drugs. We note: FIG. 40A-40C
describe the use of our genetically encoded system to evolve inhibitors,
but the steps 2 and 3 could be used to predict mutations that permit
resistance to endogenously supplied inhibitors and, subsequently, to
identify new endogenously supplied inhibitors that might combat that
resistance. FIG. 40D depicts an exemplary genetically encoded system that
detects inhibitors of an Src kinase. In brief, Src activity enables
expression of a toxic gene (GOI2); inhibition of Src, in turn, would
confer survival.
[0489] One embodiment is a configuration of the B2H architecture that
enables survival when PTP1B is active, that is, when the activity of Src
kinase is successfully canceled out. In the absence of PTP1B, this
configuration could be used to evolve inhibitors of Src kinase; such an
inhibitor would act similarly to PTP1B by preventing the phosphorylation
of the substrate domain (as shown in FIG. 40E). Src kinase is a validated
drug target; tyrosine kinases are targets of over 40 FDA-approved drugs.
[0490] FIG. 40E demonstrates one embodiment of a proof of principle for
the B2H system described in FIG. 40B. The system shown here includes two
GOIs: SpecR and SacB. Expression of the GOIs confers survival in the
presence of spectinomycin; expression of the GOIs causes toxicity in the
presence of sucrose. The images depict the results of a growth-coupled
assay performed on a strain of E. coli in the presence of various
concentrations of sucrose. The strain harboring an active form of PTP1B
(WT) grows better at high sucrose concentrations than the strain
harboring an inactive form of PTP1B (C215S).
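By way of non-limiting illustration, the selection logic described for FIG. 40A-E can be summarized as a small truth table. The following Python sketch is a deliberately simplified Boolean model; it assumes that the GOIs are expressed only when the substrate domain remains phosphorylated (Src active, PTP1B inactive or inhibited), it ignores expression levels and growth rates, and its function names are hypothetical.

```python
def goi_expressed(src_active, ptp1b_active, ptp1b_inhibited=False):
    """GOIs are transcribed only when the substrate stays phosphorylated."""
    phosphatase_working = ptp1b_active and not ptp1b_inhibited
    return src_active and not phosphatase_working


def survives(expressed, spectinomycin=False, sucrose=False):
    """SpecR rescues cells from spectinomycin; SacB is toxic on sucrose."""
    if spectinomycin and not expressed:
        return False
    if sucrose and expressed:
        return False
    return True


# Positive selection: an inhibitor of PTP1B permits growth on spectinomycin.
inhibited = goi_expressed(True, True, ptp1b_inhibited=True)
# Negative selection: active PTP1B permits growth on sucrose.
wild_type = goi_expressed(True, True)
c215s = goi_expressed(True, False)
```

In this model, `survives(wild_type, sucrose=True)` is true while `survives(c215s, sucrose=True)` is false, which mirrors the qualitative result described for FIG. 40E.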
FIG. 41A depicts an exemplary strategy for the evolution of inhibitors of
PTP1B. FIG. 41A depicts an exemplary structural analysis used to identify
targets for mutagenesis in the active sites of terpene synthases. It
shows an alignment of the class I active site of ABS (gray, PDB entry
3s9v) and TXS (blue, PDB entry 3p5r) with the locations of sites targeted
for site-saturation mutagenesis (SSM) highlighted on ABS (red). A
substrate analogue (yellow) of TXS appears for reference. FIG. 41B
depicts an exemplary strategy for introducing diversity into libraries of
metabolic pathways: An iterative combination of SSM of key sites on a
terpene synthase (as in a), error-prone PCR (ePCR) of the entire terpene
synthase gene, SSM of key sites on a terpene-functionalizing enzyme
(e.g., P450), and ePCR of the entire terpene-functionalizing enzyme. FIG.
41C depicts an exemplary quantification of the total terpenoids present in
DMSO samples with extracts of various TS-containing strains. In brief, we
performed site-saturation mutagenesis of six sites on ADS (analogous to
the sites shown in a); we plated the SSM library on agar plates
containing different concentrations of spectinomycin; we picked colonies
that grew on a plate containing a high concentration (800 .mu.g/ml) of
spectinomycin and used each colony to inoculate a separate culture; we
used a hexane overlay to extract the terpenoids secreted into each
culture broth; we dried the hexane extract in a rotary evaporator and
re-suspended the solid in DMSO; and we used a GC-MS to quantify the total
amount of terpenoids present in the DMSO.
[0491] "ADS WT", "ADS F514E", "ADS F370L", "ADS G400A", "ADS G439A", and
"ADS G400L" describe mixtures of molecules generated by strains of E.
coli harboring mutants of amorphadiene synthase (ADS). The labels
describe the mutant: "G439A" corresponds to a mutant of amorphadiene
synthase in which glycine 439 has been mutated to alanine, and so on. In
future work, we plan on (i) purifying different terpenoids from these
mixtures, (ii) assessing their inhibitory effect on PTP1B in vitro,
(iii) assaying their inhibitory effect on other PTPs (notably TC-PTP and
PTPN11) in vitro, and (iv) assaying their influence on mammalian cells.
See, FIG. 41D.
FIG. 41D depicts an exemplary analysis of the inhibitory effect of
various extracts on PTP1B. In brief, the figure shows initial rates of
PTP1B-catalyzed hydrolysis of p-nitrophenyl phosphate (pNPP) in the
presence of terpenoids quantified in FIG. 41C. Two mutants of ADS (G439A
and G400L) generate particularly potent inhibitors of PTP1B. FIG. 42
depicts an exemplary analysis of the link between B2H activation and cell
survival. An exemplary strain of E. coli that contains both (i) the
optimized bacterial two-hybrid (B2H) system (FIG. 33E) and (ii) the
terpenoid pathway depicted in FIG. 36A. Note: pTS includes GGPPS only
when ABS or TXS are present; the "Y/F" operon corresponds to a B2H system
in which the substrate domain cannot be phosphorylated. Survival at high
concentrations of spectinomycin requires activation of the B2H system
(i.e., phosphorylation of the substrate domain, a process facilitated by
inhibition of PTP1B). FIG. 43 provides exemplary product profiles of
strains of E. coli harboring various terpene synthases. For this figure,
the strain of E. coli harbored (i) the optimized B2H system (FIG. 33E)
and (ii) the terpenoid pathway (FIG. 36A). The pathways corresponding to
each profile differ only in the composition of the pTS plasmid, which
contains TXS (taxadiene synthase from Taxus brevifolia and a
geranylgeranyl diphosphate synthase from Taxus Canadensis); GHS
(.gamma.-humulene synthase from Abies grandis); ADS (amorphadiene
synthase from Artemisia annua); ABS (abietadiene synthase from Abies
grandis and a geranylgeranyl diphosphate synthase from Taxus Canadensis);
G400A (the G400A mutant of amorphadiene synthase from Artemisia annua);
and G439L (the G439L mutant of amorphadiene synthase from Artemisia
annua). Note that the two mutants of ADS yield different product profiles
than the wild-type enzyme (ADS); our results indicate that products
generated by these two mutants are more inhibitory than those generated
by the wild-type enzyme (FIG. 41E).
[0492] D. Identification of Sites for Site Saturation Mutagenesis (SSM).
[0493] The active sites of terpene synthases and cytochrome P450s contain
constellations of amino acids that guide catalysis in two ways: (i) They
control the conformation space available to reacting substrates, and (ii)
they alter the organization of water that surrounds substrates.sup.8-10.
We identified "plastic" residues likely to modulate these attributes in
the class I active sites of terpene synthase by carrying out the
following steps: (i) We aligned the crystal structure of ABS with the
crystal structure of TXS. (ii) We selected all residues within 8 angstroms
of the substrate analog (2-fluoro-geranylgeranyl diphosphate) of the
class I active site of TXS, and we identified a subset of sites that
differed between ABS and TXS. (iii) We aligned the sequences of ABS,
GHS, delta-selinene synthase (DSS), and epi-isozizaene synthase (EIS).
(iv) We used Eq. 51 to score each site based on its variability in size
and hydrophilicity across the five enzymes analyzed:
$$S = \frac{\sigma_V^2}{n_V} + \frac{\sigma_{HW}^2}{n_{HW}} \qquad \text{(Eq. 51)}$$
In this equation, .sigma..sub.V.sup.2 is the variance in volume,
.sigma..sub.HW.sup.2 is the variance in Hopp-Woods index, and n.sub.V and
n.sub.HW are normalization factors (based on the highest variances
measured in this study). (v) We ranked each site according to S and
selected the six highest-scoring sites. We note: For this analysis, we
chose ABS and TXS
because they are structurally similar enzymes (i.e., both possess
.alpha., .beta., and .gamma. domains) with crystal structures; we chose
GHS, DSS, and EIS because they have been shown to exhibit
mutation-responsive product profiles. FIG. 44A-D provides exemplary
structural and sequence-based evidence that supports the extension of the
B2H system to other protein tyrosine phosphatases (PTPs). FIG. 44A
provides an exemplary structural alignment of PTP1B and PTPN6, two PTPs
that are compatible with the B2H system (see FIGS. 1e and 7 of Update A
for evidence of compatibility). We used the align function of PyMol to
align each structure of PTPN6 with either (i) the ligand-free (3A5J) or
(ii) ligand-bound (2F71) structure of the catalytic domain of PTP1B. The
align function carries out a sequence alignment followed by a structural
superposition and, thus, effectively aligns the catalytic domains of both
proteins. FIG. 44B provides an exemplary structural comparison of PTP1B
and PTPN6; the root-mean-square deviations (RMSD) of aligned structures
of PTP1B and PTPN6 range from 0.75 to 0.94 .ANG.. FIG. 44C provides an
exemplary sequence alignment of the catalytic domains of PTP1B and PTPN6
(EMBOSS Needle.sup.1). FIG. 44D provides an exemplary sequence comparison
of the catalytic domains of PTP1B and PTPN6. The sequences share 34.1%
sequence identity and 53.5% sequence similarity. In summary, the results
of this figure indicate that our B2H system can be readily extended to
PTPs that possess catalytic domains that are (i) structurally similar to
the catalytic domain of PTP1B (here, we define structural similarity as
two structures that, when aligned with a framework similar to the one
used by the align function of PyMol, have an RMSD of .ltoreq.0.94 .ANG.)
and/or (ii) similar in sequence to the catalytic domain of PTP1B (here,
we define sequence similarity as .gtoreq.34% sequence identity or
.gtoreq.53.5% sequence similarity as defined by the EMBOSS Needle
algorithm).
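By way of non-limiting illustration, the similarity criteria stated above can be expressed as a simple predicate. The following Python sketch uses the thresholds given in the text; the function names are hypothetical, and the percent-identity helper assumes that identity is counted as identical columns divided by the total alignment length (gap columns included), which is our understanding of how global-alignment tools commonly report it rather than something stated herein.

```python
RMSD_MAX = 0.94        # angstroms, from the definition above
IDENTITY_MIN = 34.0    # percent sequence identity
SIMILARITY_MIN = 53.5  # percent sequence similarity


def percent_identity(aligned_a, aligned_b):
    """Identical positions over alignment length (gaps count as columns)."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("sequences must be aligned to equal, nonzero length")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def meets_criteria(rmsd=None, identity=None, similarity=None):
    """True if any supplied measurement satisfies a stated threshold."""
    structural = rmsd is not None and rmsd <= RMSD_MAX
    by_identity = identity is not None and identity >= IDENTITY_MIN
    by_similarity = similarity is not None and similarity >= SIMILARITY_MIN
    return structural or by_identity or by_similarity
```

For example, `meets_criteria(rmsd=0.94, identity=34.1, similarity=53.5)` evaluates to true for the PTP1B/PTPN6 values reported for FIG. 44. The "and/or" wording of the criteria is modeled here as a logical OR.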
[0494] To identify "plastic" residues capable of adjusting the activity of
P450.sub.BM3, we carried out an approach similar to that described above:
(i) We used the mutant database.sup.11 (http://www.MuteinDB.org) to
identify the 25 most commonly mutated sites in functional variants of
P450.sub.BM3. (ii) We used Eq. 51 to score each site based on its
variability in size and hydrophobicity across different mutants. (iii) We
ranked each site according to S and selected the 7 highest-scoring sites.
Site S1024 scored highly based on S but was omitted due to its location
on the P450 reductase domain.
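By way of non-limiting illustration, the site-scoring procedure based on Eq. 51 can be sketched in Python as follows. The sketch assumes the Zamyatnin residue-volume scale (the text does not state which volume scale was used), the published Hopp-Woods hydrophilicity values, population variance, and normalization by the highest variance found among the scored sites; the function names and example columns are hypothetical.

```python
from statistics import pvariance

# Residue volumes in cubic angstroms (Zamyatnin scale; an assumed choice).
VOLUME = {
    "A": 88.6, "R": 173.4, "N": 114.1, "D": 111.1, "C": 108.5,
    "Q": 143.8, "E": 138.4, "G": 60.1, "H": 153.2, "I": 166.7,
    "L": 166.7, "K": 168.6, "M": 162.9, "F": 189.9, "P": 112.7,
    "S": 89.0, "T": 116.1, "W": 227.8, "Y": 193.6, "V": 140.0,
}
# Hopp-Woods hydrophilicity index.
HOPP_WOODS = {
    "A": -0.5, "R": 3.0, "N": 0.2, "D": 3.0, "C": -1.0,
    "Q": 0.2, "E": 3.0, "G": 0.0, "H": -0.5, "I": -1.8,
    "L": -1.8, "K": 3.0, "M": -1.3, "F": -2.5, "P": 0.0,
    "S": 0.3, "T": -0.4, "W": -3.4, "Y": -2.3, "V": -1.5,
}


def score_sites(columns):
    """Return S for each site; each column lists that site's residues."""
    var_v = [pvariance([VOLUME[r] for r in col]) for col in columns]
    var_hw = [pvariance([HOPP_WOODS[r] for r in col]) for col in columns]
    n_v = max(var_v) or 1.0    # normalize by the highest variance observed
    n_hw = max(var_hw) or 1.0  # fall back to 1.0 if every site is invariant
    return [v / n_v + h / n_hw for v, h in zip(var_v, var_hw)]


def top_sites(columns, k):
    """Indices of the k highest-scoring sites, best first."""
    scores = score_sites(columns)
    ranked = sorted(range(len(columns)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


# Hypothetical alignment columns, one string of five residues per site.
columns = ["GGGGG", "GWGWG", "DDDDD", "ALAVA"]
scores = score_sites(columns)
```

With this normalization, each term of S lies between 0 and 1, so an invariant site scores 0 and a site that is maximally variable in both size and hydrophilicity scores 2.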
[0495] E. Exemplary Purification of Products.
[0496] See section relating to flash chromatography and HPLC.sup.1-3.
[0497] F. Exemplary Concentration Range for Testing Products.
[0498] We plan on incubating mammalian cells with 1-400 .mu.M of
inhibitors; we will assess the biochemical influence of those inhibitors
by using the assays described below.
[0499] G. Exemplary Cell-Based Assays.
[0500] We will characterize the biological activity of newly developed
inhibitors in at least two ways:
1. We will assay the influence of inhibitors on insulin receptor
phosphorylation. In brief, we will expose HepG2, HeLa, HEK293T, MCF-7,
and/or CHO-hIR cells to insulin shock in the presence and absence of
inhibitors, and we will use a western blot and/or an enzyme-linked
immunosorbent assay (ELISA) to measure the influence of the inhibitors on
insulin receptor phosphorylation. In some embodiments we may use
cell-permeable inhibitors of PTP1B to enhance insulin receptor
phosphorylation.
2. We will examine the morphological and/or growth effects of inhibitors
identified in a system described herein on cellular models of HER2(+) and
TN breast cancer. In brief, we will examine the relevance of inhibitors
to HER2(+) breast cancer by evaluating their ability to inhibit the
migration of BT474 and SKBR3 cells, which are HER2(+), but not MCF-7 and
MDA-MB-231 cells, which are HER2(-). We will examine the relevance of
inhibitors to triple negative breast cancer, in turn, by carrying out
viability and proliferation assays on panels of TN cell lines (e.g., ATCC
TCP-1002). All cell lines are available from the ATCC (ATCC.org) and have
been used previously to characterize potential therapeutics for HER2(+)
and TN subtypes.sup.4,5.
[0501] It is not meant to limit a pathway to terpenoid synthesis. Indeed,
an alkaloid biosynthesis pathway is contemplated for use to identify,
[0502] An exemplary pathway for alkaloid biosynthesis consists of three
modules (Nakagawa, A. et al. A bacterial platform for fermentative
production of plant alkaloids. Nat. Commun. (2011).
doi:10.1038/ncomms1327, herein incorporated by reference) (i) the first
enables the overexpression of four enzymes for L-tyrosine overproduction:
TKT, PEPS, fbr-DAHPS, and fbr-CM/PDH; (ii) the second enables the
expression of three enzymes necessary for the construction of dopamine
and 3,4-DHPAA: TYR, DODC, and MAO; and (iii) the third enables the
expression of four enzymes for the construction of (S)-reticuline from
3,4-DHPAA and dopamine: NCS, 6OMT, CNMT, and 4'OMT. Enzymes are as
follows: TKT, transketolase (tktA, GenBank accession number X68025);
PEPS, phosphoenolpyruvate (PEP) synthetase (ppsA, GenBank accession
number X59381); fbr-DAHPS, feedback-inhibition resistant
3-deoxy-D-arabino-heptulosonate-7-phosphate synthase (aroGfbr, GenBank
accession number J01591); fbr-CM/PDH, feedback-inhibition resistant
chorismate mutase/prephenate dehydrogenase (tyrAfbr, GenBank accession
number M10431); TYR, tyrosinase of Streptomyces castaneoglobisporus
(ScTYR containing tyrosinase and its adaptor protein, ORF378, GenBank
accession numbers AY254101 and AY254102); DODC, DOPA decarboxylase of
Pseudomonas putida (GenBank accession number AE015451); MAO, monoamine
oxidase of Micrococcus luteus (GenBank accession number AB010716); NCS,
norcoclaurine synthetase of C. japonica (GenBank accession number
AB267399); 6OMT, norcoclaurine 6-O-methyltransferase of C. japonica
(GenBank accession number D29811); CNMT, coclaurine-N-methyltransferase
of Coptis japonica (GenBank accession number AB061863); 4'OMT,
3'-hydroxy-N-methylcoclaurine 4'-O-methyltransferase of C. japonica
(GenBank accession number D29812). We note: these three modules may be
encoded by two plasmids.
REFERENCES FOR SECTION V, HEREIN INCORPORATED BY REFERENCE IN THEIR
ENTIRETY
[0503] 1. Jia, M., Potter, K. C. & Peters, R. J. Extreme promiscuity of
a bacterial and a plant diterpene synthase enables combinatorial
biosynthesis. Metab. Eng. 37, 24-34 (2016).
[0504] 2. Criswell, J., Potter, K., Shephard, F., Beale, M. H. & Peters,
R. J. A single residue change leads to a hydroxylated product from the
class II diterpene cyclization catalyzed by abietadiene synthase. Org.
Lett. 14, 5828-5831 (2012).
[0505] 3. Morrone, D. et al. Increasing diterpene yield with a modular
metabolic engineering system in E. coli: Comparison of MEV and MEP
isoprenoid precursor pathway engineering. Appl. Microbiol. Biotechnol.
85, 1893-1906 (2010).
[0506] 4. Dagliyan, O. et al. Engineering extrinsic disorder to control
protein activity in living cells. Science 354, 1441-1444 (2016).
[0507] 5. Lehmann, B. D. et al. Identification of human triple-negative
breast cancer subtypes and preclinical models for selection of targeted
therapies. J. Clin. Invest. (2011). doi:10.1172/JCI45014
[0508] 6. Dempke, W. C. M., Uciechowski, P., Fenchel, K. & Chevassut, T.
Targeting SHP-1, 2 and SHIP Pathways: A novel strategy for cancer
treatment? Oncology (Switzerland) (2018). doi:10.1159/000490106
[0509] 7. Nakagawa, A. et al. A bacterial platform for fermentative
production of plant alkaloids. Nat. Commun. (2011).
doi:10.1038/ncomms1327
[0510] 8. Christianson, D. W. Structural biology and chemistry of the
terpenoid cyclases. Chem. Rev. 106, 3412-3442 (2006).
[0511] 9. Fasan, R. Tuning P450 enzymes as oxidation catalysts. ACS
Catalysis 2, 647-666 (2012).
[0512] 10. Jung, S. T., Lauchli, R. & Arnold, F. H. Cytochrome P450:
Taming a wild type enzyme. Current Opinion in Biotechnology 22, 809-817
(2011).
[0513] 11. Braun, A. et al. MuteinDB: The mutein database linking
substrates, products and enzymatic reactions directly with genetic
variants of enzymes. Database (2012). doi:10.1093/database/bas028
VI. Evolving Optogenetic Actuators: Photoswitchable Constructs.
[0514] A. Optical Control with Red and Infrared Light.
[0515] Contemporary efforts for using light to control enzyme activity
have relied on at least two optogenetic actuators: LOV2, which has
terminal helices that are destabilized by blue light (.about.450
nm).sup.2,18,48, and Dronpa, which switches from a dimer to a monomer in
response to green light (.about.500 nm).sup.19. Unfortunately, blue and green
light suffer from problems of phototoxicity, penetration depth, and
spectral similarity that limit their use in signaling studies.sup.21.
Thus, in one embodiment, photoswitchable enzymes stimulated by red or
infrared light are contemplated for development. These wavelengths have
lower phototoxicities and greater penetration depths than blue and green
light.sup.20,21, and will permit multi-color actuation
alongside blue or green light.
[0516] B. An Operon to Evolve Photoswitchable Constructs.
[0517] In one embodiment, an operon that links the activity of PTP1B to
cell growth is contemplated. In brief, this operon is based on the
following control strategy (some additional details in FIG. 10): A kinase
stimulates the binding of two proteins, which in turn, promote
transcription of an essential gene; PTP1B suppresses the binding of these
two proteins and, thus, inhibits transcription. This operon allows cells
in possession of photoswitchable variants of PTP1B to grow faster in the
presence of one light source than in the presence of another (e.g., 750 nm
vs. 650 nm). The difference in growth rates enables the identification of
functional chimeras. Initial experiments with an operon based on
Lux-based luminescence (based on a system developed by Liu and
colleagues.sup.53) show a 20-fold difference in luminescence between a
strain expressing two model binding partners and a strain expressing one
(FIG. 3E). We will continue to develop this operon by adding a
protein-protein interaction that is modulated by a PTP and PTK (see
below).
[0518] This operon allows cells in possession of photoswitchable variants
of PTP1B to grow faster in the presence of one light source than in the
presence of another (e.g., 750 nm vs. 650 nm). The difference in growth
rates enables the identification of functional chimeras. Initial
experiments with an operon based on Lux-based luminescence (based on a
system developed by Liu and colleagues.sup.53) show a 20-fold difference in
luminescence between a strain expressing two model binding partners and a
strain expressing one (FIG. 3E). We will continue to develop this operon
by adding a protein-protein interaction that is modulated by a PTP and
PTK.
[0519] FRET sensors. We will use Forster resonance energy transfer (FRET)
to monitor the activity of PTP1B in living cells. Our preliminary sensor
exhibits a 20% reduction in FRET signal when treated with Src kinase
(FIG. 3F). Previous imaging studies indicate that a 20% change in FRET is
sufficient to monitor intracellular kinase activity.sup.54-56. To enhance
spatial resolution in imaging studies, we will attempt to optimize our
sensor further (and use it to measure the activity of PTP1B in vitro).
[0520] 1. To Evolve Phosphatases and Kinases Modulated by Red and Infrared
Light.
[0521] This section uses directed evolution to build enzymes that can be
turned "on" and "off" with red and infrared light. We will know that we
are successful when we have (i) built a genetic operon that links the
activity of PTP1B to antibiotic resistance, (ii) used that operon to
build a PTP1B-phytochrome chimera that exhibits a three- to ten-fold
change in activity in response to red and infrared light, and (iii) built
similar phytochrome chimeras of STEP and PTK6.
[0522] Hypothesis. Phytochrome proteins exhibit global conformational
changes when exposed to red and infrared light.sup.27,28, but to date,
have eluded rational integration into photoswitchable enzymes. We
hypothesize that a genetic operon that links PTP or PTK activity to cell
growth will enable the evolution of PTP- or PTK-phytochrome chimeras
stimulated by red or infrared light.
[0523] Experimental approach: We will build an operon that links PTP1B
inhibition to antibiotic resistance, and we will use that operon to
evolve photoswitchable PTP1B-phytochrome chimeras. This effort will
involve (i) the construction of a library of PTP1B-phytochrome chimeras
that differ in linker composition and/or linker length, (ii) the use of
our operon to screen that library for functional mutants, (iii) a kinetic
and biostructural characterization of the most photoswitchable mutants,
and (iv) the extension of this approach to STEP and PTK6. This effort has
two major goals: a variant of PTP1B modulated by red and/or infrared
light, and a general approach for using directed evolution to extend
optical control to new enzymes and different wavelengths of light.
[0524] 2. Development of a Synthetic Operon for Evolving
PTP1B-Phytochrome Chimeras.
[0525] We will build a variant of PTP1B that can be modulated by red and
infrared light by attaching its C-terminal .alpha.-helix to the N-terminal
.alpha.-helix of bacterial phytochrome protein 1 (BphP1) from Rhodopseudomonas
palustris (FIG. 9); this protein undergoes a reversible conformational
change when exposed to 650 nm and 750 nm light. Phytochromes such as
BphP1 are valuable for photocontrol because they can be actively toggled
between conformations (i.e., turned "on" and "off"). Their structures,
however, are not compatible with cage-based actuation (they do not
undergo large-scale "unwinding"); they have, thus, been overlooked in
previous efforts to develop photoswitchable enzymes.
[0526] We will evolve photoswitchable PTP1B-BphP1 chimeras by using a
genetic operon that links PTP1B activity to antibiotic resistance. This
operon will consist of six components (FIG. 10A-B): (i) a PTP1B substrate
domain tethered to a DNA-binding protein, (ii) a substrate recognition
domain (i.e., a Src homology 2 domain, or SH2) tethered to the
subunit of an RNA polymerase, (iii) an Src kinase (a kinase capable of
phosphorylating a wide range of substrates), (iv) PTP1B (or a potentially
photoswitchable variant of PTP1B), (v) a gene for antibiotic resistance,
and (vi) an operator for that gene.
[0527] With this system, light-induced inactivation of PTP1B will enable
transcription of the gene for antibiotic resistance. Previous groups have
used similar operons to evolve protein-protein binding partners (our
system is based on an operon used by Liu et al. to evolve insecticidal
proteins.sup.53); here, we take the additional (new) steps of (i) using a
protein-protein interaction mediated by enzymes (phosphatases and
kinases) and (ii) screening that interaction in the presence and absence
of light.
[0528] We have begun to develop our operon by using a Lux-based
luminescence as an output. Preliminary results show that model
protein-protein binding partners can elicit a 20-fold change in
luminescence (FIG. 3E). We plan to swap out these binding partners with
substrate and SH2 domains, and test the new system alongside
simultaneously expressed PTP1B and Src kinase (which have some
complementary activities, and can be expressed in E. coli.sup.68,69).
[0529] Advantages of using operons expressing photosensitive phosphatases
include, but are not limited to, enabling high-throughput screens of
mutants of photoswitchable enzymes and providing a method for screening
the libraries of enzymes that they motivate; see, for example, FIG. 5A.
It has been shown that mutagenesis of a photoswitchable enzyme can adjust
(i.e., improve) its dynamic range (i.e., the ratio of dark-state activity
to light-state activity), while some published studies, such as
WO2011002977 ("Genetically Encoded Photomanipulation Of Protein And
Peptide Activity," published Jan. 6, 2011), have proposed, but not shown,
that mutagenesis of protein light switches might enable spectral tuning
of photoswitchable enzymes. WO2011002977 provides a list of sites that
could be mutated to modify the flavin-binding pocket of LOV2 to accept
flavins that absorb light at alternative wavelengths. However, their
construct is described as a LOV2 domain of Avena sativa (oat) phototropin
1 (404-546), including the C-terminal helical extension J alpha, where
J alpha unwinds instead of the A alpha helix described herein. Nonetheless,
there is no available method for carrying out high-throughput screens of
mutants with modified binding pockets; the invention described herein
provides a platform for doing so. Further, in contrast to WO2013016693
("Near-infrared light-activated proteins," published Jan. 31, 2013), the
inventions described herein provide a platform for screening potentially
improved or modified variants of photoswitchable proteins, such as a plant
phototropin 1 LOV2.
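The dynamic range defined above (ratio of dark-state activity to light-state activity) can be computed and used to rank variants. The following Python sketch is illustrative only: the function name `dynamic_range`, the variant names, and all activity values are hypothetical and are not measurements from this disclosure.

```python
def dynamic_range(dark_activity: float, light_activity: float) -> float:
    """Ratio of dark-state activity to light-state activity."""
    if light_activity <= 0:
        raise ValueError("light-state activity must be positive")
    return dark_activity / light_activity


# Hypothetical (dark, light) activities in arbitrary units, for illustration only.
variants = {
    "parent": (10.0, 5.0),
    "mutant_a": (12.0, 3.0),
    "mutant_b": (8.0, 8.0),
}

# Rank variants from largest to smallest dynamic range.
ranked = sorted(
    variants,
    key=lambda name: dynamic_range(*variants[name]),
    reverse=True,
)
```

A variant whose ratio is close to 1 (such as the hypothetical `mutant_b`) shows essentially no photoswitching under this definition.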
[0530] Additionally, methods for screening the libraries of enzymes enable
the detection of (i) molecules or (ii) photoswitchable domains that
change the activity of any enzyme that, in turn, can modulate the
affinity of, or outcome associated with, a protein-protein interaction;
protein tyrosine phosphatases (PTPs) and protein tyrosine kinases (PTKs)
are demonstrated. Moreover, proteases are contemplated as proteins to add
to this system.
[0531] C. Directed Evolution.
[0532] We will build libraries of PTP1B-BphP1 chimeras by pairing overlap
extension PCR (oePCR) with error-prone PCR (epPCR). Specifically, we will
use oePCR to build chimeras that differ in linker length (here, we define
the linker as the .about.20-residue region composed of the C-terminal
alpha-helix of PTP1B and the N-terminal alpha-helix of BphP1), and we will
use epPCR to vary linker composition. Depending on the results of this
initial library, we may extend error-prone PCR into the BphP1 gene, but we
will not mutate PTP1B beyond its C-terminal alpha-helix.
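The length-varying half of this library design can be enumerated in silico before primers are designed. The following Python sketch is illustrative only: the function name `linker_length_variants`, the trimming scheme (removing residues on either side of the junction), and the placeholder fragments are assumptions and are not sequences or procedures from this disclosure.

```python
from itertools import product


def linker_length_variants(n_term: str, c_term: str, max_trim: int) -> dict:
    """Join two fragments after trimming 0..max_trim residues on each side
    of the junction; keys are (trim_n, trim_c) and values are the fusions."""
    variants = {}
    for trim_n, trim_c in product(range(max_trim + 1), repeat=2):
        left = n_term[: len(n_term) - trim_n]   # shorten the C-terminal end
        right = c_term[trim_c:]                 # shorten the N-terminal end
        variants[(trim_n, trim_c)] = left + right
    return variants


# Placeholder fragments standing in for the two helices (hypothetical).
ptp_helix = "AAAAKKKK"
bphp_helix = "EEEELLLL"

library = linker_length_variants(ptp_helix, bphp_helix, max_trim=2)
```

Each key records how many residues were removed from each helix, so a hit from a later screen can be traced back to its junction.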
[0533] In the presence of a small amount of antibiotic (i.e., an amount
that impedes the growth of E. coli), our genetic operon will cause cells
that contain functional PTP1B-BphP1 chimeras to exhibit different growth
rates under red and infrared light. We will exploit these differences to
identify cells that harbor photoswitchable constructs. In brief, we will
(i) generate two replicate plates of cell colonies, (ii) grow one under
red light and one under infrared light (FIG. 11A), and (iii) select a
subset of colonies (top hits) that show differential growth. We will
further characterize our top hits by growing them in small-scale liquid
cultures (e.g., 96-well plates with .about.1 ml/well; FIG. 11B) under red
and infrared light, and by sequencing the PTP1B-BphP1 genes of colonies
that show the greatest difference in growth rates.
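Step (iii) above, selecting colonies that grow differently on the two replicate plates, can be sketched as a simple ranking. The following Python sketch is illustrative only: the scoring metric (absolute log2 ratio of growth under red versus infrared light), the function names, and the colony measurements are assumptions and are not taken from this disclosure.

```python
import math


def differential_growth(red: float, infrared: float) -> float:
    """Absolute log2 ratio of growth under red vs. infrared light."""
    return abs(math.log2(red / infrared))


def top_hits(colonies: dict, n: int) -> list:
    """Return the n colony IDs with the largest differential growth."""
    scores = {cid: differential_growth(r, ir) for cid, (r, ir) in colonies.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Hypothetical (red, infrared) colony sizes in arbitrary units.
colonies = {
    "c1": (1.0, 1.0),
    "c2": (4.0, 1.0),
    "c3": (1.0, 2.0),
    "c4": (0.5, 4.0),
}

hits = top_hits(colonies, 2)
```

Using the absolute value keeps constructs that grow faster under either wavelength, since the direction of switching is not known in advance.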
[0534] We will attempt to build enzyme-phytochrome chimeras of STEP and
PTK6 by pursuing two strategies: (i) we will replace PTP1B in our final
PTP1B-BphP1 chimera with STEP or PTK6; this strategy will allow us to
assess the modularity of our final design; and (ii) we will use our
operon-based approach to evolve functional STEP-BphP1 and PTK6-BphP1
chimeras; this strategy will allow us to assess the generalizability of
our approach to evolution.
[0535] Operons for evolving STEP-BphP1 and PTK6-BphP1 chimeras will
closely resemble the PTP1B-specific operon. For STEP, we will use a
STEP-specific substrate and SH2 domain (Src kinase, which has a broad
substrate specificity, is likely to have complementary activities on a
subset of STEP substrates); for PTK6, we will use a recognition process
that is inhibited--not activated--by phosphorylation (here, we can use
PTP1B.sub.WT as the complementary enzyme).
[0536] D. Extension of Approach.
[0537] We will attempt to build enzyme-phytochrome chimeras of STEP and
PTK6 by pursuing two strategies: (i) we will replace PTP1B in our final
PTP1B-BphP1 chimera with STEP or PTK6; this strategy will allow us to
assess the modularity of our final design; and (ii) we will use our
operon-based approach to evolve functional STEP-BphP1 and PTK6-BphP1
chimeras; this strategy will allow us to assess the generalizability of
our approach to evolution.
[0538] Operons for evolving STEP-BphP1 and PTK6-BphP1 chimeras will
closely resemble the PTP1B-specific operon. For STEP, we will use a
STEP-specific substrate and SH2 domain (Src kinase, which has a broad
substrate specificity, is likely to have complementary activities on a
subset of STEP substrates); for PTK6, we will use a recognition process
that is inhibited--not activated--by phosphorylation (here, we can use
PTP1B.sub.WT as the complementary enzyme).
[0539] F. Exemplary Contemplated Characterization: Biophysical
Characterization of Enzyme-Phytochrome Chimeras.
We will examine the structural basis of photocontrol in the most
photoswitchable chimeras by using a subset of crystallographic and
kinetic analyses. X-ray crystal structures will show how BphP1 affects
the structures of PTP1B, STEP, and PTK6. Kinetic studies will show how
BphP1 affects substrate specificity and binding affinity (or more
specifically, K.sub.m, which is affected by binding affinity).
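One conventional way to obtain Km from initial-rate data is to fit the Michaelis-Menten equation, v = Vmax[S]/(Km + [S]). The following Python sketch is illustrative only: it uses a Lineweaver-Burk (double-reciprocal) linear fit chosen for brevity, the function names are hypothetical, and the data are synthetic and noise-free rather than measurements from this disclosure.

```python
def michaelis_menten(s: float, vmax: float, km: float) -> float:
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)


def fit_lineweaver_burk(substrate, rates):
    """Least-squares line through (1/[S], 1/v); returns (vmax, km).

    1/v = (Km/Vmax) * (1/[S]) + 1/Vmax, so slope = Km/Vmax and
    intercept = 1/Vmax.
    """
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


# Synthetic, noise-free data generated with assumed Vmax = 2.0 and Km = 0.5.
substrate = [0.1, 0.25, 0.5, 1.0, 2.0, 4.0]
rates = [michaelis_menten(s, vmax=2.0, km=0.5) for s in substrate]

vmax_fit, km_fit = fit_lineweaver_burk(substrate, rates)
```

With real, noisy data the double-reciprocal transform over-weights low-substrate points, so a nonlinear fit of the untransformed equation is generally preferable; the linear form is used here only to keep the sketch dependency-free.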
TABLE-US-00003
TABLE 1
Exemplary Promoters.
Name  DNA Sequence  SEQ ID NO.
Pro1 TTCTAGAGCACAGCTAACACCACGTCGTCCCTATCTGCTGCCCTAGGTCTA SEQ ID NO:
TGAGTGGTTGCTGGATAACTTTACGGGCATGCATAAGGCTCGGTATCTATA 25
TTCAGGGAGACCACAACGGTTTCCCTCTACAAATAATTTTGTTTAACTTTT
ACTAGAG
ProD GCACAGCTAACACCACGTCGTCCCTATCTGCTGCCCTAGGTCTATGAGTGG SEQ ID NO:
TTGCTGGATAACTTTACGGGCATGCATAAGGCTCGTATAATATATTCAGGG 26
AGACCACAACGGTTTCCCTCTACAAATAATTTTGTTTAACTTTTACTAGAG
pBAD AGAAACCAATTGTCCATATTGCATCAGACATTGCCGTCACTGCGTCTTTTA SEQ ID NO:
CTGGCTCTTCTCGCTAACCAAACCGGTAACCCCGCTTATTAAAAGCATTCT 27
GTAACAAAGCGGGACCAAAGCCATGACAAAAACGCGTAACAAAAGTGTC
TATAATCACGGCAGAAAAGTCCACATTGATTATTTGCACGGCGTCACACTT
TGCTATGCCATAGCATTTTTATCCATAAGATTAGCGGATCCTACCTGACGC
TTTTTATCGCAACTCTCTACTGTTTCTCCATACCCGTTTTTTTGGGCTAGC
pLacZOpt (operator bolded) ACAAGAAAGTTTGTTCATTAGGCACCCCGGGCTTTACTCGTAAAGCTTCC SEQ ID NO:
GGCGCGTATGTTGTGTCGACCG 28
pTrc CGACTGCACGGTGCACCAATGCTTCTGGCGTCAGGCAGCCATCGGAAGCT SEQ ID NO:
GTGGTATGGCTGTGCAGGTCGTAAATCACTGCATAATTCGTGTCGCTCAAG 29
GCGCACTCCCGTTCTGGATAATGTTTTTTGCGCCGACATCATAACGGTTCT
GGCAAATATTCTGAAATGAGCTGTTGACAATTAATCATCCGGCTCGTATAA
TGTGTGGAATTGTGAGCGGATAACAATTTCACACAGGAAACAG
T7 CCTATAGTGAGTCGTATTA SEQ ID NO:
30
TABLE-US-00004
TABLE 2
Exemplary Ribosome Binding Sites.
Name  DNA Sequence  SEQ ID NO.
proRBS TTAAAGAGGAGAAAGGTC SEQ ID
NO: 31
Sal28 RBS CGAAAAAAAGTAAGGCGGTAATCC SEQ ID
NO: 32
bb034 RBS TGCAGAAAGAGGAGAAATACTAG SEQ ID
NO: 33
bb030 ATTAAAGAGGAGAAATACTAG SEQ ID
NO: 34
RBS for GOI in B2H GTGCAGTAAGGAGGAAAAAAAA SEQ ID
NO: 35
bbAH GCTAGCTTTAAGAAGGAGATATACC SEQ ID
NO: 36
TABLE-US-00005
TABLE 3
Exemplary Protein Sequences (includes truncations).
Name  Amino Acid Sequence  SEQ ID NO.
RpoZ (linker bolded) MARVTVQDAVEKIGNRFDLVLVAARRARQMQVGGKDPLVPEENDKTTV SEQ ID
IALREIEEGLINNQILDVRERQEQQEQEAAELQAVTAIAEGRRAAA NO: 37
cI (linker bolded) MSISSRVKSKRIQLGLNQAELAQKVGTTQQSIEQLENGKTKRPRFLPELAS SEQ ID
ALGVSVDWLLNGTSDSNVRFVGHVEPKGKYPLISMVRARSWCEACEPYD NO: 38
IKDIDEWYDSDVNLLGNGFWLKVEGDSMTSPVGQSIPEGHMVLVDTGRE
PVNGSLVVAKLTDANEATFKKLVIDGGQKYLKGLNPSWPMTPINGNCKII
GVVVEARVKFVDYKDDDDK
SH2 WYFGKITRRESERLLLNPENPRGTFLVRESETVKGAYALSVSDFDNAKGL SEQ ID
NVKHYLIRKLDSGGFYITSRTQFSSLQQLVAYYSKHADGLCHRLTNVC NO: 39
Kras Substrate WMEDYDYVHLQG SEQ ID
NO: 40
MidT Substrate EPQYEEIPIYL SEQ ID
NO: 41
ShcA Substrate DHQYYNDFPG SEQ ID
NO: 42
EGFR Substrate PQRYLVIQGD SEQ ID
NO: 43
Src MSKPQTQGLAKDAWEIPRESLRLEVKLGQGCFGEVWMGTWNGTTRVAI SEQ ID
KTLKPGTMSPEAFLQEAQVMKKLRHEKLVQLYAVVSEEPIYIVTEYMSKG NO: 44
SLLDFLKGETGKYLRLPQLVDMAAQIASGMAYVERMNYVHRDLRAANIL
VGENLVCKVADFGLARLIEDNEYTARQGAKFPIKWTAPEAALYGRFTIKS
DVWSFGILLTELTTKGRVPYPGMVNREVLDQVERGYRMPCPPECPESLHD
LMCQCWRKEPEERPTFEYLQAFLEDYFTSTEPQYQPGENL
CDC37 MVDYSVWDHIEVSDDEDETHPNIDTASLFRWRHQARVERMEQFQKEKEE SEQ ID
LDRGCRECKRKVAECQRKLKELEVAEGGKAELERLQAEAQQLRKEERSW NO: 45
EQKLEEMRKKEKSMPWNVDTLSKDGFSKSMVNTKPEKTEEDSEEVREQK
HKTFVEKYEKQIKHFGMLRRWDDSQKYLSDNVHLVCEETANYLVIWCID
LEVEEKCALMEQVAHQTIVMQFILELAKSLKVDPRACFRQFFTKIKTADR
QYMEGFNDELEAFKERVRGRAKLRIEKAMKEYEEEERKKRLGPGGLDPV
EVYESLPEELQKCFDVKDVQMLQDAISKMDPTDAKYHMQRCIDSGLWVP
NSKASEAKEGEEAGPGDPLLEAVPKTGDEKDVSV
PTP1B MEMEKEFEQIDKSGSWAAIYQDIRHEASDFPCRVAKLPKNKNRNRYRDV SEQ ID
SPFDHSRIKLHQEDNDYINASLIKMEEAQRSYILTQGPLPNTCGHFWEMV NO: 46
WEQKSRGVVMLNRVMEKGSLKCAQYWPQKEEKEMIFEDTNLKLTLISED
IKSYYTVRQLELENLTTQETREILHFHYTTWPDFGVPESPASFLNFLFKVRE
SGSLSPEHGPVVVHSSAGIGRSGTFCLADTCLLLMDKRKDPSSVDIKKVLL
EMRKFRMGLIQTADQLRFSYLAVIEGAKFIMGDSSVQDQWKELSHEDLEP
PPEHIPPPPRPPKRILEPHN
MBP MKIEEGKLVIWINGDKGYNGLAEVGKKFEKDTGIKVTVEHPDKLEEKFP SEQ ID
QVAATGDGPDIIFWAHDRFGGYAQSGLLAEITPDKAFQDKLYPFTWDA NO: 47
VRYNGKLIAYPIAVEALSLIYNKDLLPNPPKTWEEIPALDKELKAKGKS
ALMFNLQEPYFTWPLIAADGGYAFKYENGKYDIKDVGVDNAGAKAGL
TFLVDLIKNKHMNADTDYSIAEAAFNKGETAMTINGPWAWSNIDTSKV
NYGVTVLPTFKGQPSKPFVGVLSAGINAASPNKELAKEFLENYLLTDEG
LEAVNKDKPLGAVALKSYEEELAKDPRIAATMENAQKGEIMPNIPQMS
AFWYAVRTAVINAASGRQTVDEALKDAQTRITK
LuxAB MKFGNFLLTYQPPQFSQTEVMKRLVKLGRISEECGFDTVWLLEHHFTEF SEQ ID
GLLGNPYVAAAYLLGATKKLNVGTAAIVLPTAHPVRQLEDVNLLDQM NO: 48
SKGRFRFGICRGLYNKDFRVFGTDMNNSRALAECWYGLIKNGMTEGYM
EADNEHIKFHKVKVNPAAYSRGGAPVYVVAESASTTEWAAQFGLPMIL
SWIINTNEKKAQLELYNEVAQEYGHDIHNIDHCLSYITSVDHDSIKAKEIC
RKFLGHWYDSYVNATTIFDDSDQTRGYDFNKGQWRDFVLKGHKDTNRR
IDYSYEINPVGTPQECIDIIQKDIDATGISNICCGFEANGTVDEIIASMKLFQ
SDVMPFLKEKQRSLLYYGGGGSGGGGSGGGGSGGGGSKFGLFFLNFINS
TTVQEQSIVRMQEITEYVDKLNFEQILVYENHFSDNGVVGAPLTVSGFLL
GLTEKIKIGSLNHIITTHHPVRIAEEACLLDQLSEGRFILGFSDCEKKDEMH
FFNRPVEYQQQLFEECYEIINDALTTGYCNPDNDFYSFPKISVNPHAYTPG
GPRKYVTATSHHIVEWAAKKGIPLIFKWDDSNDVRYEYAERYKAVADKY
DVDLSEIDHQLMILVNYNEDSNKAKQETRAFISDYVLEMHPNENFENKLE
EIIAENAVGNYTECITAAKLAIEKCGAKSVLLSFEPMNDLMSQKNVINIV
DDNIKKYHTEYT
SpecR MREAVIAEVSTQLSEVVGVIERHLEPTLLAVHLYGSAVDGGLKPHSDIDL SEQ ID
LVTVTVRLDETTRRALINDLLETSASPGESEILRAVEVTIVVHDDIIPWRY NO: 49
PAKRELQFGEWQRNDILAGIFEPATIDIDLAILLTKAREHSVALVGPAAE
ELFDPVPEQDLFEALNETLTLWNSPPDWAGDERNVVLTLSRIWYSAVTG
KIAPKDVAADWAMERLPAQYQPVILEARQAYLGQEEDRLASRADQLEE
FVHYVKGEITKVVGK
AgAs MVKREFPPGFWKDDLIDSLTSSHKVAASDEKRIETLISEIKNMFRCMGY SEQ ID
GETNPSAYDTAWVARIPAVDGSDNPHFPETVEWILQNQLKDGSWGEG NO: 50
FYFLAYDRILATLACIITLTLWRTGETQVQKGIEFFRTQAGKMEDEADSH
RPSGFEIVFPAMLKEAKILGLDLPYDLPFLKQIIEKREAKLKRIPTDVLYA
LPTTLLYSLEGLQEIVDWQKIMKLQSKDGSFLSSPASTAAVFMRTGNKKC
LDFLNFVLKKFGNHVPCHYPLDLFERLWAVDTVERLGIDRHFKEEIKEAL
DYVYSHWDERGIGWARENPVPDIDDTAMGLRILRLHGYNVSSDVLKTFR
DENGEFFCFLGQTQRGVTDMLNVNRCSHVSFPGETIMEEAKLCTERYLRN
ALENVDAFDKWAFKKNIRGEVEYALKYPWHKSMPRLEARSYIENYGPDD
VWLGKTVYMMPYISNEKYLELAKLDFNKVQSIHQTELQDLRRWWKSSGF
TDLNFTRERVTEIYFSPASFIFEPEFSKCREVYTKTSNFTVILDDLYDAHGSL
DDLKLFTESVKRWDLSLVDQMPQQMKICFVGFYNTFNDIAKEGRERQGR
DVLGYIQNVWKVQLEAYTKEAEWSEAKYVPSFNEYIENASVSIALGTVVL
ISALFTGEVLTDEVLSKIDRESRFLQLMGLTGRLVNDTKTYQAERGQGEV
ASAIQCYMKDHPKISEEEALQHVYSVMENALEELNREFVNNKIPDIYKRL
VFETARIMQLFYMQGDGLTLSHDMEIKEHVKNCLFQPVA
GGPPS MFDFNEYMKSKAVAVDAALDKAIPLEYPEKIHESMRYSLLAGGKRVRPA SEQ ID
LCIAACELVGGSQDLAMPTACAMEMIHTMSLIHDDLPCMDNDDFRRGKP NO: 51
TNHKVFGEDTAVLAGDALLSFAFEHIAVATSKTVPSDRTLRVISELGKTIG
SQGLVGGQVVDITSEGDANVDLKTLEWIHIHKTAVLLECSVVSGGILGGA
TEDEIARIRRYARCVGLLFQVVDDILDVTKSSEELGKTAGKDLLTDKATYP
KLMGLEKAKEFAAELATRAKEELSSFDQIKAAPLLGLADYIAFRQN
P450 MTIKEMPQPKTFGELKNLPLLNTDKPVQALMKIADELGEIFKFEAPGRVTR SEQ ID
YLSSQRLIKEACDESRFDKNLSQALKFVRDFAGDGLFTSWTHEKNWKKA NO: 52
HNILLPSFSQQAMKGYHAMMVDIAVQLVQKWERLNADEHIEVPEDMTRL
TLDTIGLCGFNYRFNSFYRDQPHPFITSMVRALDEAMNKLQRANPDDPAY
DENKRQFQEDIKVMNDLVDKIIADRKASGEQSDDLLTHMLNGKDPETGEP
LDDENIRYQIITFLIAGHETTSGLLSFALYFLVKNPHVLQKAAEEAARVLV
DPVPSYKQVKQLKYVGMVLNEALRLWPTAPAFSLYAKEDTVLGGEYPLE
KGDELMVLIPQLHRDKTIWGDDVEEFRPERFENPSAIPQHAFKPFGNGQR
ACIGQQFALHEATLVLGMMLKHFDFEDHTNYELDIKETLTLKPEGFVVKA
KSKKIPLGGIPSPSTEQSAKKVRKKAENAHNTPLLVLYGSNMGTAEGTAR
DLADIAMSKGFAPQVATLDSHAGNLPREGAVLIVTASYNGHPPDNAKQF
VDWLDQASADEVKGVRYSVFGCGDKNWATTYQKVPAFIDETLAAKGAE
NIADRGEADASDDFEGTYEEWREHMWSDVAAYFNLDIENSEDNKSTLSL
QFVDSAADMPLAKMHGAFSTNVVASKELQQPGSARSTRHLEIELPKEASY
QEGDHLGVIPRNYEGIVNRVTARFGLDASQQIRLEAEEEKLAHLPLAKTVS
VEELLQYVELQDPVTRTQLRAMAAKTVCPPHKVELEALLEKQAYKEQVL
AKRLTMLELLEKYPACEMKFSEFIALLPSIRPRYYSISSSPRVDEKQASITVS
VVSGEAWSGYGEYKGIASNYLAELQEGDTITCFISTPQSEFTLPKDPETPLI
MVGPGTGVAPFRGFVQARKQLKEQGQSLGEAHLYFGCRSPHEDYLYQEE
LENAQSEGIITLHTAFSRMPNQPKTYVQHVMEQDGKKLIELLDQGAHFYI
CGDGSQMAPAVEATLMKSYADVHQVSEADARLWLQQLEEKGRYAKDV
WAG
LOV2 AATLERIEKNFVITDPRLPDNPIIFASDSFLQLTEYSREEILGRNCRFLQGPET SEQ ID
DRATVRKIRDAIDNQTEVTVQLINYTKSGKKFWNLFHLQPMRDQKGDVQ NO: 53
YFIGVQLDGTEHVRDAAEREGVMLIKKTAENIDEAAKEL
BphP1 MASVAGHASGSPAFGTADLSNCEREEIHLAGSIQPHGALLVVSEPDHRIIQ SEQ ID
ASANAAEFLNLGSVLGVPLAEIDGDLLIKILPHLDPTAEGMPVAVRCRIGN NO: 54
PSTEYDGLMHRPPEGGLIIELERAGPPIDLSGTLAPALERIRTAGSLRALCD
DTALLFQQCTGYDRVMVYRFDEQGHGEVFSERHVPGLESYFGNRYPSSDI
PQMARRLYERQRVRVLVDVSYQPVPLEPRLSPLTGRDLDMSGCFLRSMSP
IHLQYLKNMGVRATLVVSLVVGGKLWGLVACHHYLPRFMHFELRAICEL
LAEAIATRITALESFAQSQSELFVQRLEQRMIEAITREGDWRAAIFDTSQSIL
QPLHAAGCALVYEDQIRTIGDVPSTQDVREIAGWLDRQPRAAVTSTASLG
LDVPELAHLTRMASGVVAAPISDHRGEFLMWFRPERVHTVTWGGDPKKP
FTMGDTPADLSPRRSFAKWHQVVEGTSDPWTAADLAAARTIGQTVADIV
LQFRAVRTLIAREQYEQFSSQVHASMQPVLITDAEGRILLMNDSFRDMLP
AGSPSAVHLDDLAGFFVESNDFLRNVAELIDHGRGWRGEVLLRGAGNRP
LPLAVRADPVTRTEDQSLGFVLIFSDATDRRTADAARTRFQEGILASARPG
VRLDSKSDLLHEKLLSALVENAQLAALEITYGVETGRIAELLEGVRQSML
RTAEVLGHLVQHAARTAGSDSSSNGSQNKKEFDSAGSAGSAGTS
TC-PTP MGMPTTIEREFEELDTQRRWQPLYLEIRNESHDYPHRVAKFPENRNRNRY SEQ ID
RDVSPYDHSRVKLQNAENDYINASLVDIEEAQRSYILTQGPLPNTCCHFW NO: 55
LMVWQQKTKAVVMLNRIVEKESVKCAQYWPTDDQEMLFKETGFSVKLL
SEDVKSYYTVHLLQLENINSGETRTISHFHYTTWPDFGVPESPASFLNFLFK
VRESGSLNPDHGPAVIHCSAGIGRSGTFSLVDTCLVLMEKGDDINIKQVLL
NMRKYRMGLIQTPDQLRFSYMAIIEGAKCIKGDSSIQKRWKELS
PTP1B.sub.1-435 MEMEKEFEQIDKSGSWAAIYQDIRHEASDFPCRVAKLPKNKNRNRYRDV SEQ ID
SPFDHSRIKLHQEDNDYINASLIKMEEAQRSYILTQGPLPNTCGHFWEMV NO: 56
WEQKSRGVVMLNRVMEKGSLKCAQYWPQKEEKEMIFEDTNLKLTLISED
IKSYYTVRQLELENLTTQETREILHFHYTTWPDFGVPESPASFLNFLFKVRE
SGSLSPEHGPVVVHCSAGIGRSGTFCLADTCLLLMDKRKDPSSVDIKKVLL
EMRKFRMGLIQTADQLRFSYLAVIEGAKFIMGDSSVQDQWKELSHEDLEP
PPEHIPPPPRPPKRILEPHNGKCREFFPNHQWVKEETQEDKDCPIKEEKGSP
LNAAPYGIESMSQDTEVRSRVVGGSLRGAQAASPAKGEPSLPEKDEDHAL
SYWKPFLVNMCVATVLTAGAYLCYRFLFNSNT
SacB MNIKKFAKQATVLTFTTALLAGGATQAFAKETNQKPYKETYGISHITRHD SEQ ID
MLQIPEQQKNEKYQVPEFDSSTIKNISSAKGLDVWDSWPLQNADGTVAN NO: 57
YHGYHIVFALAGDPKNADDTSIYMFYQKVGETSIDSWKNAGRVFKDSDK
FDANDSILKDQTQEWSGSATFTSDGKIRLFYTDFSGKHYGKQTLTTAQVN
VSASDSSLNINGVEDYKSIFDGDGKTYQNVQQFIDEGNYSSGDNHTLRDP
HYVEDKGHKYLVFEANTGTEDGYQGEESLFNKAYYGKSTSFFRQESQKL
LQSDKKRTAELANGALGMIELNDDYTLKKVMKPLIASNTVTDEIERANVF
KMNGKWYLFTDSRGSKMTIDGITSNDIYMLGYVSNSLTGPYKPLNKTGL
VLKMDLDPNDVTFTYSHFAVPQAKGNNVVITSYMTNRGFYADKQSTFAP
SFLLNIKGKKTSVVKDSILEQGQLTVNK
GalK MSLKEKTQSLFANAFGYPATHTIQAPGRVNLIGEHTDYNDGFVLPCAIDY SEQ ID
QTVISCAPRDDRKVRVMAADYENQLDEFSLDAPIVAHENYQWANYVRG NO: 58
VVKHLQLRNNSFGGVDMVISGNVPQGAGLSSSASLEVAVGTVLQQLYHL
PLDGAQIALNGQEAENQFVGCNCGIMDQLISALGKKDHALLIDCRSLGTK
AVSMPKGVAVVIINSNFKRTLVGSEYNTRREQCETGARFFQQPALRDVTIE
EFNAVAHELDPIVAKRVRHILTENARTVEAASALEQGDLKRMGELMAES
HASMRDDFEITVPQIDTLVEIVKAVIGDKGGVRMTGGGFGGCIVALIPEEL
VPAVQQAVAEQYEAKTGIKETFYVCKPSQGAGQC
GHS MAQISESVSPSTDLKSTESSITSNRHGNMWEDDRIQSLNSPYGAPAYQERS SEQ ID
EKLIEEIKLLFLSDMDDSCNDSDRDLIKRLEIVDTVECLGIDRHFQPEIKLAL NO: 59
DYVYRCWNERGIGEGSRDSLKKDLNATALGFRALRLHRYNVSSGVLENF
RDDNGQFFCGSTVEEEGAEAYNKHVRCMLSLSRASNILFPGEKVMEEAK
AFTTNYLKKVLAGREATHVDESLLGEVKYALEFPWHCSVQRWEARSFIEI
FGQIDSELKSNLSKKMLELAKLDFNILQCTHQKELQIISRWFADSSIASLNF
YRKCYVEFYFWMAAAISEPEFSGSRVAFTKIAILMTMLDDLYDTHGTLDQ
LKIFTEGVRRWDVSLVEGLPDFMKIAFEFWLKTSNELIAEAVKAQGQDMA
AYIRKNAWERYLEAYLQDAEWIATGHVPTFDEYLNNGTPNTGMCVLNLI
PLLLMGEHLPIDILEQIFLPSRFHHLIELASRLVDDARDFQAEKDHGDLSCIE
CYLKDHPESTVEDALNHVNGLLGNCLLEMNWKFLKKQDSVPLSCKKYSF
HVLARSIQFMYNQGDGFSISNKVIKDQVQKVLIVPVPI*
ADS MALTEEKPIRPIANFPPSIWGDQFLIYEKQVEQGVEQIVNDLKKEVRQLLK SEQ ID
EALDIPMKHANLLKLIDEIQRLGIPYHFEREIDHALQCIYETYGDNWNGDR NO: 60
SSLWFRLMRKQGYYVTCDVFNNYKDKNGAFKQSLANDVEGLLELYEAT
SMRVPGEIILEDALGFTRSRLSIMTKDAFSTNPALFTEIQRALKQPLWKRLP
RIEAAQYIPFYQQQDSHNKTLLKLAKLEFNLLQSLHKEELSHVCKWWKAF
DIKKNAPCLRDRIVECYFWGLGSGYEPQYSRARVFFTKAVAVITLIDDTYD
AYGTYEELKIFTEAVERWSITCLDTLPEYMKPIYKLFMDTYTEMEEFLAKE
GRTDLFNCGKEFVKEFVRNLMVEAKWANEGHIPTTEEHDPVVIITGGANL
LTTTCYLGMSDIFTKESVEWAVSAPPLFRYSGILGRRLNDLMTHKAEQER
KHSSSSLESYMKEYNVNEEYAQTLIYKEVEDVWKDINREYLTTKNIPRPLL
MAVIYLCQFLEVQYAGKDNFTRMGDEYKHLIKSLLVYPMSI*
TXS MSSSTGTSKVVSETSSTIVDDIPRLSANYHGDLWHHNVIQTLETPFRESSTY SEQ ID
QERADELVVKIKDMFNALGDGDISPSAYDTAWVARLATISSDGSEKPRFP NO: 61
QALNWVFNNQLQDGSWGIESHFSLCDRLLNTTNSVIALSVWKTGHSQVQ
QGAEFIAENLRLLNEEDELSPDFQIIFPALLQKAKALGINLPYDLPFIKYLST
TREARLTDVSAAADNIPANMLNALEGLEEVIDWNKIMRFQSKDGSFLSSP
ASTACVLMNTGDEKCFTFLNNLLDKFGGCVPCMYSIDLLERLSLVDNIEH
LGIGRHFKQEIKGALDYVYRHWSERGIGWGRDSLVPDLNTTALGLRTLR
MHGYNVSSDVLNNFKDENGRFFSSAGQTHVELRSVVNLFRASDLAFPDE
RAMDDARKFAEPYLREALATKISTNTKLFKEIEYVVEYPWHMSIPRLEAR
SYIDSYDDNYVWQRKTLYRMPSLSNSKCLELAKLDFNIVQSLHQEELKLL
TRWWKESGMADINFTRHRVAEVYFSSATFEPEYSATRIAFTKIGCLQVLFD
DMADIFATLDELKSFTEGVKRWDTSLLHEIPECMQTCFKVWFKLMEEVN
NDVVKVQGRDMLAHIRKPWELYFNCYVQEREWLEAGYIPTFEEYLKTYA
ISVGLGPCTLQPILLMGELVKDDVVEKVHYPSNMFELVSLSWRLTNDTKT
YQAEKARGQQASGIACYMKDNPGATEEDAIKHICRVVDRALKEASFEYF
KPSNDIPMGCKSFIFNLRLCVQIFYKFIDGYGIANEEIKDYIRKVYIDPIQV*
TC-PTP MPTTIEREFEELDTQRRWQPLYLEIRNESHDYPHRVAKFPENRNRNRYRD SEQ ID
VSPYDHSRVKLQNAENDYINASLVDIEEAQRSYILTQGPLPNTCCHFWLM NO: 62
VWQQKTKAVVMLNRIVEKESVKCAQYWPTDDQEMLFKETGFSVKLLSE
DVKSYYTVHLLQLENINSGETRTISHFHYTTWPDFGVPESPASFLNFLFKV
RESGSLNPDHGPAVIHCSAGIGRSGTFSLVDTCLVLMEKGDDINIKQVLLN
MRKYRMGLIQTPDQLRFSYMAIIEGAKCIKGDSSIQKRWKELSKEDLSPAF
DHSPNKIMTEKYNGNR
PTPN5 MSSGVDLGTENLYFQSMSRVLQAEELHEKALDPFLLQAEFFEIPMNFVDP SEQ ID
KEYDIPGLVRKNRYKTILPNPHSRVCLTSPDPDDPLSSYINANYIRGYGGEE NO: 63
KVYIATQGPIVSTVADFWRMVWQEHTPIIVMITNIEEMNEKCTEYWPEEQ
VAYDGVEITVQKVIHTEDYRLRLISLKSGTEERGLKHYWFTSWPDQKTPD
RAPPLLHLVREVEEAAQQEGPHCAPIIVHCSAGIGRTGCFIATSICCQQLRQ
EGVVDILKTTCQLRQDRGGMIQTCEQYQFVHHVMSLYEKQLSHQS*
PTPN6 MVRWFHRDLSGLDAETLLKGRGVHGSFLARPSRKNQGDFSLSVRVGDQV SEQ ID
THIRIQNSGDFYDLYGGEKFATLTELVEYYTQQQGVVQDRDGTIIHLKYPL NO: 64
NCSDPTSERWYHGHMSGGQAETLLQAKGEPWTFLVRESLSQPGDFVLSV
LSDQPKAGPGSPLRVTHIKVMCEGGRYTVGGLETFDSLTDLVEHFKKTGI
EEASGAFVYLRQPYYATRVNAADIENRVLELNKKQESEDTAKAGFWEEF
ESLQKQEVKNLHQRLEGQRPENKGKNRYKNILPFDHSRVILQGRDSNIPGS
DYINANYIKNQLLGPDENAKTYIASQGCLEATVNDFWQMAWQENSRVIV
MTTREVEKGRNKCVPYWPEVGMQRAYGPYSVTNCGEHDTTEYKLRTLQ
VSPLDNGDLIREIWHYQYLSWPDHGVPSEPGGVLSFLDQINQRQESLPHA
GPIIVHCSAGIGRTGTIIVIDMLMENISTKGLDCDIDIQKTIQMVRAQRSGM
VQTEAQYKFIYVAIAQFIETTKKKLEVLQSQKGQESEYGNITYPPAMKNA
HAKASRTSSKHKEDVYENLHTKNKREEKVKKQRSADKEKSKGSLKRK*
PTPN11 MTSRRWFHPNITGVEAENLLLTRGVDGSFLARPSKSNPGDFTLSVRRNGA SEQ ID
VTHIKIQNTGDYYDLYGGEKFATLAELVQYYMEHHGQLKEKNGDVIELK NO: 65
YPLNCADPTSERWFHGHLSGKEAEKLLTEKGKHGSFLVRESQSHPGDFVL
SVRTGDDKGESNDGKSKVTHVMIRCQELKYDVGGGERFDSLTDLVEHYK
KNPMVETLGTVLQLKQPLNTTRINAAEIESRVRELSKLAETTDKVKQGFW
EEFETLQQQECKLLYSRKEGQRQENKNKNRYKNILPFDHTRVVLHDGDP
NEPVSDYINANIIMPEFETKCNNSKPKKSYIATQGCLQNTVNDFWRMVFQ
ENSRVIVMTTKEVERGKSKCVKYWPDEYALKEYGVMRVRNVKESAAHD
YTLRELKLSKVGQGNTERTVWQYHFRTWPDHGVPSDPGGVLDFLEEVHH
KQESIMDAGPVVVHCSAGIGRTGTFIVIDILIDIIREKGVDCDIDVPKTIQMV
RSQRSGMVQTEAQYRFIYMAVQHYIETLQRRIEEEQKSKRKGHEYTNIKY
SLADQTSGDQSPLPPCTPTPPCAEMREDSARVYENVGLMQQQKSFR*
PTPN12 MEQVEILRKFIQRVQAMKSPDHNGEDNFARDFMRLRRLSTKYRTEKIYPT SEQ ID
ATGEKEENVKKNRYKDILPFDHSRVKLTLKTPSQDSDYINANFIKGVYGP NO: 66
KAYVATQGPLANTVIDFWRMVWEYNVVIIVMACREFEMGRKKCERYWP
LYGEDPITFAPFKISCEDEQARTDYFIRTLLLEFQNESRRLYQFHYVNWPD
HDVPSSFDSILDMISLMRKYQEHEDVPICIHCSAGCGRTGAICAIDYTWNL
LKAGKIPEEFNVFNLIQEMRTQRHSAVQTKEQYELVHRAIAQLFEKQLQL
YEIHGAQKIADGVNEINTENMVSSIEPEKQDSPPPKPPRTRSCLVEGDAKEE
ILQPPEPHPVPPILTPSPPSAFPTVTTVWQDNDRYHPKPVLQWFHQNNIQQT
STETIVNQQNFQGKMNQQLNR
PTPN22 MDQREILQKFLDEAQSKKITKEEFANEFLKLKRQSTKYKADKTYPTTVAE SEQ ID
KPKNIKKNRYKDILPYDYSRVELSLITSDEDSSYINANFIKGVYGPKAYIAT NO: 67
QGPLSTTLLDFWRMIWEYSVLIIVMACMEYEMGKKKCERYWAEPGEMQ
LEFGPFSVSCEAEKRKSDYIIRTLKVKFNSETRTIYQFHYKNWPDHDVPSSI
DPILELIWDVRCYQEDDSVPICIHCSAGCGRTGVICAIDYTWMLLKDGIIPE
NFSVFSLIREMRTQRPSLVQTQEQYELVYNAVLELFKRQMDVIRD
sfGFP MRKGEELFTGVVPILVELDGDVNGHKFSVRGEGEGDATNGKLTLKFICTT SEQ ID
GKLPVPWPTLVTTLTYGVQCFARYPDHMKQHDFFKSAMPEGYVQERTIS NO: 68
FKDDGTYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNFNSHN
VYITADKQKNGIKANFKIRHNVEDGSVQLADHYQQNTPIGDGPVLLPDNH
YLSTQSVLSKDPNEKRDHMVLLEFVTAAGITHGMDELYK*
mClover MHHHHHHVSKGEELFTGVVPILVELDGDVNGHKFSVRGEGEGDATNGKL SEQ ID
TLKFICTTGKLPVPWPTLVTTFGYGVACFSRYPDHMKQHDFFKSAMPEGY NO: 69
VQERTISFKDDGTYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEY
NFNSHNVYITADKQKNGIKANFKIRHNVEDGSVQLADHYQQNTPIGDGPV
LLPDNHYLSHQSALSKDPNEKRDHMVLLEFVTAAGITHGMDELYK
TABLE-US-00006
TABLE 4
Exemplary Terminators.
Name  DNA Sequence  SEQ ID NO.
T7 ATCCGGATATAGTTCCTCCTTTCAGCAAAAAACCCCTCAAGACCCGTTTAGAGG SEQ ID
CCCCAAGGGGTTATGCTAGTTATTGCTCAGCGGTGGCAGCAGCCAACTCAGCTT NO: 70
CCTTTCGGGCTTTGTTAGCAG
rrnB T1/T2 GGCTGTTTTGGCGGATGAGAGAAGATTTTCAGCCTGATACAGATTAAATCAGAA SEQ ID
CGCAGAAGCGGTCTGATAAAACAGAATTTGCCTGGCGGCAGTAGCGCGGTGGT NO: 71
CCCACCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTA
GTGTGGGGTCACCCCATGCGAGAGTAGGGAACTGCCAGGCATCAAATAAAACG
AAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAA
CGCTCTCCTGAGTAGGACAAATCCGCCGGGAGCGGATTTGAACGTTGCGAAGC
AACGGCCCGGAGGGTGGCGGGCAGGACGCCCGCCATAAACTGCCAGGCATCAA
ATTAAGCAGAAGGCCATCCTGACGGATGGCCTTTTTGCGTTTCTACAAACTCT
TrrnB TGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGACCCCATGCCGAACTCAGAA SEQ ID
GTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGG NO: 72
GAACTGCCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTT
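As a consistency check between the tables, the short substrate coding sequences listed in Table 5 can be translated and compared with the substrate peptides listed in Table 3. The following Python sketch assumes the standard genetic code and reading frame 1; the helper name `translate` and the dictionary `SUBSTRATES` are hypothetical, while the sequences themselves are copied from Tables 3 and 5.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        # Codons containing ambiguous bases (e.g., N) are reported as X.
        residue = CODON_TABLE.get(dna[i:i + 3], "X")
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# name: (DNA from Table 5, peptide from Table 3)
SUBSTRATES = {
    "p130cas (or Kras)": ("TGGATGGAGGACTATGACTACGTCCACCTACAGGGG", "WMEDYDYVHLQG"),
    "MidT": ("GAACCGCAGTATGAAGAAATTCCGATTTATCTG", "EPQYEEIPIYL"),
    "EGFR": ("CCGCAGCGCTATCTGGTGATTCAGGGCGAT", "PQRYLVIQGD"),
    "ShcA": ("GATCATCAGTATTATAACGATTTTCCGGGC", "DHQYYNDFPG"),
}

# True for each substrate whose translated DNA equals the listed peptide.
matches = {
    name: translate(dna) == peptide
    for name, (dna, peptide) in SUBSTRATES.items()
}
```

The same helper can be applied to the longer entries, keeping in mind that some Table 3 sequences include linkers or tags that are not present in the corresponding Table 5 coding sequence.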
TABLE-US-00007
TABLE 5
Exemplary DNA Sequences (includes truncations):
Comp.  Organism  DNA Sequence  SEQ ID NO.
Src H. sapiens ATGGGCTCCAAGCCGCAGACTCAGGGCCTGGCCAAGGATGCCTGGGA SEQ
GATCCCTCGGGAGTCGCTGCGGCTGGAGGTCAAGCTGGGCCAGGGCT ID
GCTTTGGCGAGGTGTGGATGGGGACCTGGAACGGTACCACCAGGGTG NO:
GCCATCAAAACCCTGAAGCCTGGCACGATGTCTCCAGAGGCCTTCCTG 73
CAGGAGGCCCAGGTCATGAAGAAGCTGAGGCATGAGAAGCTGGTGCA
GTTGTATGCTGTGGTTTCAGAGGAGCCCATTTACATCGTCACGGAGTA
CATGAGCAAGGGGAGTTTGCTGGACTTTCTCAAGGGGGAGACAGGCA
AGTACCTGCGGCTGCCTCAGCTGGTGGACATGGCTGCTCAGATCGCCT
CAGGCATGGCGTACGTGGAGCGGATGAACTACGTCCACCGGGACCTTC
GTGCAGCCAACATCCTGGTGGGAGAGAACCTGGTGTGCAAAGTGGCC
GACTTTGGGCTGGCTCGGCTCATTGAAGACAATGAGTACACGGCGCGG
CAAGGTGCCAAATTCCCCATCAAGTGGACGGCTCCAGAAGCTGCCCTC
TATGGCCGCTTCACCATCAAGTCGGACGTGTGGTCCTTCGGGATCCTG
CTGACTGAGCTCACCACAAAGGGACGGGTGCCCTACCCTGGGATGGTG
AACCGCGAGGTGCTGGACCAGGTGGAGCGGGGCTACCGGATGCCCTG
CCCGCCGGAGTGTCCCGAGTCCCTGCACGACCTCATGTGCCAGTGCTG
GCGGAAGGAGCCTGAGGAGCGGCCCACCTTCGAGTACCTGCAGGCCT
TCCTGGAGGACTACTTCACGTCCACCGAGCCCCAGTACCAGCCCGGGG
AGAACCTCTAA
CDC37 H. sapiens ATGGTGGACTACAGCGTGTGGGACCACATTGAGGTGTCTGATGATGAA SEQ
GACGAGACGCACCCCAACATCGACACGGCCAGTCTCTTCCGCTGGCGG ID
CATCAGGCCCGGGTGGAACGCATGGAGCAGTTCCAGAAGGAGAAGGA NO:
GGAACTGGACAGGGGCTGCCGCGAGTGCAAGCGCAAGGTGGCCGAGT 74
GCCAGAGGAAACTGAAGGAGCTGGAGGTGGCCGAGGGCGGCAAGGC
AGAGCTGGAGCGCCTGCAGGCCGAGGCACAGCAGCTGCGCAAGGAGG
AGCGGAGCTGGGAGCAGAAGCTGGAGGAGATGCGCAAGAAGGAGAA
GAGCATGCCCTGGAACGTGGACACGCTCAGCAAAGACGGCTTCAGCA
AGAGCATGGTAAATACCAAGCCCGAGAAGACGGAGGAGGACTCAGAG
GAGGTGAGGGAGCAGAAACACAAGACCTTCGTGGAAAAATACGAGAA
ACAGATCAAGCACTTTGGCATGCTTCGCCGCTGGGATGACAGCCAAAA
GTACCTGTCAGACAACGTCCACCTGGTGTGCGAGGAGACAGCCAATTA
CCTGGTCATTTGGTGCATTGACCTAGAGGTGGAGGAGAAATGTGCACT
CATGGAGCAGGTGGCCCACCAGACAATCGTCATGCAATTTATCCTGGA
GCTGGCCAAGAGCCTAAAGGTGGACCCCCGGGCCTGCTTCCGGCAGTT
CTTCACTAAGATTAAGACAGCCGATCGCCAGTACATGGAGGGCTTCAA
CGACGAGCTGGAAGCCTTCAAGGAGCGTGTGCGGGGCCGTGCCAAGC
TGCGCATCGAGAAGGCCATGAAGGAGTACGAGGAGGAGGAGCGCAA
GAAGCGGCTCGGCCCCGGCGGCCTGGACCCCGTCGAGGTCTACGAGTC
CCTCCCTGAGGAACTCCAGAAGTGCTTCGATGTGAAGGACGTGCAGAT
GCTGCAGGACGCCATCAGCAAGATGGACCCCACCGACGCAAAGTACC
ACATGCAGCGCTGCATTGACTCTGGCCTCTGGGTCCCCAACTCTAAGG
CCAGCGAGGCCAAGGAGGGAGAGGAGGCAGGTCCTGGGGACCCATTA
CTGGAAGCTGTTCCCAAGACGGGCGATGAGAAGGATGTCAGTGTGTA
A
PTP1B.sub.1-435 H. sapiens ATGGAGATGGAAAAGGAGTTCGAGCAGATCGACAAGTCCGGGAGCTG SEQ
GGCGGCCATTTACCAGGATATCCGACATGAAGCCAGTGACTTCCCATG ID
TAGAGTGGCCAAGCTTCCTAAGAACAAAAACCGAAATAGGTACAGAG NO:
ACGTCAGTCCCTTTGACCATAGTCGGATTAAACTACATCAAGAAGATA 75
ATGACTATATCAACGCTAGTTTGATAAAAATGGAAGAAGCCCAAAGG
AGTTACATTCTTACCCAGGGCCCTTTGCCTAACACATGCGGTCACTTTT
GGGAGATGGTGTGGGAGCAGAAAAGCAGGGGTGTCGTCATGCTCAAC
AGAGTGATGGAGAAAGGTTCGTTAAAATGCGCACAATACTGGCCACA
AAAAGAAGAAAAAGAGATGATCTTTGAAGACACAAATTTGAAATTAA
CATTGATCTCTGAAGATATCAAGTCATATTATACAGTGCGACAGCTAG
AATTGGAAAACCTTACAACCCAAGAAACTCGAGAGATCTTACATTTCC
ACTATACCACATGGCCTGACTTTGGAGTCCCTGAATCACCAGCCTCAT
TCTTGAACTTTCTTTTCAAAGTCCGAGAGTCAGGGTCACTCAGCCCGG
AGCACGGGCCCGTTGTGGTGCACTGCAGTGCAGGCATCGGCAGGTCTG
GAACCTTCTGTCTGGCTGATACCTGCCTCTTGCTGATGGACAAGAGGA
AAGACCCTTCTTCCGTTGATATCAAGAAAGTGCTGTTAGAAATGAGGA
AGTTTCGGATGGGGCTGATCCAGACAGCCGACCAGCTGCGCTTCTCCT
ACCTGGCTGTGATCGAAGGTGCCAAATTCATCATGGGGGACTCTTCCG
TGCAGGATCAGTGGAAGGAGCTTTCCCACGAGGACCTGGAGCCCCCA
CCCGAGCATATCCCCCCACCTCCCCGGCCACCCAAACGAATCCTGGAG
CCACACAATGGGAAATGCAGGGAGTTCTTCCCAAATCACCAGTGGGTG
AAGGAAGAGACCCAGGAGGATAAAGACTGCCCCATCAAGGAAGAAA
AAGGAAGCCCCTTAAATGCCGCACCCTACGGCATCGAAAGCATGAGT
CAAGACACTGAAGTTAGAAGTCGGGTCGTGGGGGGAAGTCTTCGAGG
TGCCCAGGCTGCCTCCCCAGCCAAAGGGGAGCCGTCACTGCCCGAGA
AGGACGAGGACCATGCACTGAGTTACTGGAAGCCCTTCCTGGTCAACA
TGTGCGTGGCTACGGTCCTCACGGCCGGCGCTTACCTCTGCTACAGGT
TCCTGTTCAACAGCAACACATAG
LuxAB Bacterial ATGAAATTTGGAAACTTTTTGCTTACATACCAACCTCCCCAATTTTCCC SEQ
AAACAGAGGTAATGAAACGTTTGGTTAAATTAGGTCGCATCTCTGAGG ID
AGTGTGGTTTTGATACCGTATGGTTACTGGAGCATCATTTCACGGAGTT NO:
TGGTTTGCTTGGTAACCCTTATGTCGCTGCTGCATATTTACTTGGCGCG 76
ACTAAAAAATTGAATGTAGGAACTGCCGCTATTGTTCTTCCCACAGCC
CATCCAGTACGCCAACTTGAAGATGTGAATTTATTGGATCAAATGTCA
AAAGGACGATTTCGGTTTGGTATTTGCCGAGGGCTTTACAACAAGGAC
TTTCGCGTATTCGGCACAGATATGAATAACAGTCGCGCCTTAGCGGAA
TGCTGGTACGGGCTGATAAAGAATGGCATGACAGAGGGATATATGGA
AGCTGATAATGAACATATCAAGTTCCATAAGGTAAAAGTAAACCCCGC
GGCGTATAGCAGAGGTGGCGCACCGGTTTATGTGGTGGCTGAATCAGC
TTCGACGACTGAGTGGGCTGCTCAATTTGGCCTACCGATGATATTAAG
TTGGATTATAAATACTAACGAAAAGAAAGCACAACTTGAGCTTTATAA
TGAAGTGGCTCAAGAATATGGGCACGATATTCATAATATCGACCATTG
CTTATCATATATAACATCTGTAGATCATGACTCAATTAAAGCGAAAGA
GATTTGCCGGAAATTTCTGGGGCATTGGTATGATTCTTATGTGAATGCT
ACGACTATTTTTGATGATTCAGACCAAACAAGAGGTTATGATTTCAAT
AAAGGGCAGTGGCGTGACTTTGTATTAAAAGGACATAAAGATACTAA
TCGCCGTATTGATTACAGTTACGAAATCAATCCCGTGGGAACGCCGCA
GGAATGTATTGACATAATTCAAAAAGACATTGATGCTACAGGAATATC
AAATATTTGTTGTGGATTTGAAGCTAATGGAACAGTAGACGAAATTAT
TGCTTCCATGAAGCTCTTCCAGTCTGATGTCATGCCATTTCTTAAAGAA
AAACAACGTTCGCTATTATATTATTAA
LuxB V. fischeri ATGAGCAAATTTGGATTGTTCTTCCTTAACTTCATCAATTCAACAACTG SEQ
TTCAAGAACAGAGTATAGTTCGCATGCAGGAAATAACGGAGTATGTTG ID
ATAAGTTGAATTTTGAACAGATTTTAGTGTATGAAAATCATTTTTCAGA NO:
TAATGGTGTTGTCGGCGCTCCTCTGACTGTTTCTGGTTTTCTGCTCGGT 77
TTAACAGAGAAAATTAAAATTGGTTCATTAAATCACATCATTACAACT
CATCATCCTGTCCGCATAGCGGAGGAAGCTTGCTTATTGGATCAGTTA
AGTGAAGGGAGATTTATTTTAGGGTTTAGTGATTGCGAAAAAAAAGAT
GAAATGCATTTTTTTAATCGCCCGGTTGAATATCAACAGCAACTATTTG
AAGAGTGTTATGAAATCATTAACGATGCTTTAACAACAGGCTATTGTA
ATCCAGATAACGATTTTTATAGCTTCCCTAAAATATCTGTAAATCCCCA
TGCTTATACGCCAGGCGGACCTCGGAAATATGTAACAGCAACCAGTCA
TCATATTGTTGAGTGGGCGGCCAAAAAAGGTATTCCTCTCATCTTTAA
GTGGGATGATTCTAATGATGTTAGATATGAATATGCTGAAAGATATAA
AGCCGTTGCGGATAAATATGACGTTGACCTATCAGAGATAGACCATCA
GTTAATGATATTAGTTAACTATAACGAAGATAGTAATAAAGCTAAACA
AGAAACGCGTGCATTTATTAGTGATTATGTTCTTGAAATGCACCCTAA
TGAAAATTTCGAAAATAAACTTGAAGAAATAATTGCAGAAAACGCTG
TCGGAAATTATACGGAGTGTATAACTGCGGCTAAGTTGGCAATTGAAA
AGTGTGGTGCGAAAAGTGTATTGCTGTCCTTTGAACCAATGAATGATT
TGATGAGCCAAAAAAATGTAATCAATATTGTTGATGATAATATTAAGA
AGTACCACACGGAATATACCTAA
RpoZ Escherichia coli ATGGCACGCGTAACTGTTCAGGACGCTGTAGAGAAAATTGGTAACCGT SEQ
TTTGACCTGGTACTGGTCGCCGCGCGTCGCGCTCGTCAGATGCAGGTA ID
GGCGGAAAGGATCCGCTGGTACCGGAAGAAAACGATAAAACCACTGT NO:
AATCGCGCTGCGCGAAATCGAAGAAGGTCTGATCAACAACCAGATCC 78
TCGACGTTCGCGAACGCCAGGAACAGCAAGAGCAGGAAGCCGCTGAA
TTACAAGCCGTTACCGCTATTGCTGAAGGTCGTCGTTAA
cI Lambda bacteriophage ATGAGTATCAGCAGCAGGGTAAAAAGCAAAAGAATTCAGCTTGGACT SEQ
TAACCAGGCTGAACTTGCTCAAAAGGTGGGGACTACCCAGCAGTCTAT ID
AGAGCAGCTCGAAAACGGTAAAACTAAGCGACCACGCTTTTTACCAG NO:
AACTTGCGTCAGCTCTTGGCGTAAGTGTTGACTGGCTGCTCAATGGCA 79
CCTCTGATTCGAATGTTAGATTTGTTGGGCACGTTGAGCCCAAAGGGA
AATATCCATTGATTAGCATGGTTAGAGCTCGTTCGTGGTGTGAAGCTT
GTGAACCCTACGATATCAAGGACATTGATGAATGGTATGACAGTGACG
TTAACTTATTAGGCAATGGATTCTGGCTGAAGGTTGAAGGTGATTCCA
TGACCTCACCTGTAGGTCAAAGCATCCCTGAAGGTCATATGGTGTTAG
TAGATACTGGACGGGAGCCAGTGAATGGAAGCCTTGTTGTAGCCAAA
CTGACTGACGCGAACGAAGCAACATTCAAGAAACTGGTCATAGATGG
CGGTCAGAAGTACCTGAAAGGCCTGAATCCTTCATGGCCTATGACTCC
TATCAACGGAAACTGCAAGATTATCGGTGTTGTCGTGGAAGCGAGGGT
AAAATTCGTAGACTAA
SH2 Rous sarcoma virus ATGTGGTATTTTGGGAAGATCACTCGTCGGGAGTCCGAGCGGCTGCTG SEQ
CTCAACCCCGAAAACCCCCGGGGAACCTTCTTGGTCCGGGAGAGCGA ID
GACGGTAAAAGGTGCCTATGCCCTCTCCGTTTCTGACTTTGACAACGC NO:
CAAGGGGCTCAATGTGAAACACTACCTGATCCGCAAGCTGGACAGCG 80
GCGGCTTCTACATCACCTCACGCACACAGTTCAGCAGCCTGCAGCAGC
TGGTGGCCTACTACTCCAAACATGCTGATGGCTTGTGCCACCGCCTGA
CCAACGTCTGCTAA
MBP E. coli ATGAAAATCGAAGAAGGTAAACTGGTAATCTGGATTAACGGCGATAA SEQ
AGGCTATAACGGTCTCGCTGAAGTCGGTAAGAAATTCGAGAAAGATA ID
CCGGAATTAAAGTCACCGTTGAGCATCCGGATAAACTGGAAGAGAAA NO:
TTCCCACAGGTTGCGGCAACTGGCGATGGCCCTGACATTATCTTCTGG 81
GCACACGACCGCTTTGGTGGCTACGCTCAATCTGGCCTGTTGGCTGAA
ATCACCCCGGACAAAGCGTTCCAGGACAAGCTGTATCCGTTTACCTGG
GATGCCGTACGTTACAACGGCAAGCTGATTGCTTACCCGATCGCTGTT
GAAGCGTTATCGCTGATTTATAACAAAGATCTGCTGCCGAACCCGCCA
AAAACCTGGGAAGAGATCCCGGCGCTGGATAAAGAACTGAAAGCGAA
AGGTAAGAGCGCGCTGATGTTCAACCTGCAAGAACCGTACTTCACCTG
GCCGCTGATTGCTGCTGACGGGGGTTATGCGTTCAAGTATGAAAACGG
CAAGTACGACATTAAAGACGTGGGCGTGGATAACGCTGGCGCGAAAG
CGGGTCTGACCTTCCTGGTTGACCTGATTAAAAACAAACACATGAATG
CAGACACCGATTACTCCATCGCAGAAGCTGCCTTTAATAAAGGCGAAA
CAGCGATGACCATCAACGGCCCGTGGGCATGGTCCAACATCGACACC
AGCAAAGTGAATTATGGTGTAACGGTACTGCCGACCTTCAAGGGTCAA
CCATCCAAACCGTTCGTTGGCGTGCTGAGCGCAGGTATTAACGCCGCC
AGTCCGAACAAAGAGCTGGCGAAAGAGTTCCTCGAAAACTATCTGCT
GACTGATGAAGGTCTGGAAGCGGTTAATAAAGACAAACCGCTGGGTG
CCGTAGCGCTGAAGTCTTACGAGGAAGAGTTGGCGAAAGATCCACGT
ATTGCCGCCACCATGGAAAACGCCCAGAAAGGTGAAATCATGCCGAA
CATCCCGCAGATGTCCGCTTTCTGGTATGCCGTGCGTACTGCGGTGATC
AACGCCGCCAGCGGTCGTCAGACTGTCGATGAAGCCCTGAAAGACGC
GCAGACTCGTATCACCAAGTAA
p130cas (or Kras) substrate H. sapiens TGGATGGAGGACTATGACTACGTCCACCTACAGGGG SEQ ID NO: 82
MidT substrate Hamster polyoma virus GAACCGCAGTATGAAGAAATTCCGATTTATCTG SEQ ID NO: 83
EGFR substrate H. sapiens CCGCAGCGCTATCTGGTGATTCAGGGCGAT SEQ ID NO: 84
ShcA substrate H. sapiens GATCATCAGTATTATAACGATTTTCCGGGC SEQ ID NO: 85
MBIS S. cerevisiae (from pMBIS Addgene#: 17817) ATGTCATTACCGTTCTTAACTTCTGCACCGGGAAAGGTTATTATTTTTG SEQ
GTGAACACTCTGCTGTGTACAACAAGCCTGCCGTCGCTGCTAGTGTGT ID
CTGCGTTGAGAACCTACCTGCTAATAAGCGAGTCATCTGCACCAGATA NO:
CTATTGAATTGGACTTCCCGGACATTAGCTTTAATCATAAGTGGTCCAT 86
CAATGATTTCAATGCCATCACCGAGGATCAAGTAAACTCCCAAAAATT
GGCCAAGGCTCAACAAGCCACCGATGGCTTGTCTCAGGAACTCGTTAG
TCTTTTGGATCCGTTGTTAGCTCAACTATCCGAATCCTTCCACTACCAT
GCAGCGTTTTGTTTCCTGTATATGTTTGTTTGCCTATGCCCCCATGCCA
AGAATATTAAGTTTTCTTTAAAGTCTACTTTACCCATCGGTGCTGGGTT
GGGCTCAAGCGCCTCTATTTCTGTATCACTGGCCTTAGCTATGGCCTAC
TTGGGGGGGTTAATAGGATCTAATGACTTGGAAAAGCTGTCAGAAAA
CGATAAGCATATAGTGAATCAATGGGCCTTCATAGGTGAAAAGTGTAT
TCACGGTACCCCTTCAGGAATAGATAACGCTGTGGCCACTTATGGTAA
TGCCCTGCTATTTGAAAAAGACTCACATAATGGAACAATAAACACAAA
CAATTTTAAGTTCTTAGATGATTTCCCAGCCATTCCAATGATCCTAACC
TATACTAGAATTCCAAGGTCTACAAAAGATCTTGTTGCTCGCGTTCGT
GTGTTGGTCACCGAGAAATTTCCTGAAGTTATGAAGCCAATTCTAGAT
GCCATGGGTGAATGTGCCCTACAAGGCTTAGAGATCATGACTAAGTTA
AGTAAATGTAAAGGCACCGATGACGAGGCTGTAGAAACTAATAATGA
ACTGTATGAACAACTATTGGAATTGATAAGAATAAATCATGGACTGCT
TGTCTCAATCGGTGTTTCTCATCCTGGATTAGAACTTATTAAAAATCTG
AGCGATGATTTGAGAATTGGCTCCACAAAACTTACCGGTGCTGGTGGC
GGCGGTTGCTCTTTGACTTTGTTACGAAGAGACATTACTCAAGAGCAA
ATTGACAGCTTCAAAAAGAAATTGCAAGATGATTTTAGTTACGAGACA
TTTGAAACAGACTTGGGTGGGACTGGCTGCTGTTTGTTAAGCGCAAAA
AATTTGAATAAAGATCTTAAAATCAAATCCCTAGTATTCCAATTATTTG
AAAATAAAACTACCACAAAGCAACAAATTGACGATCTATTATTGCCAG
GAAACACGAATTTNCCATGGACTTCATAGGAGGCAGATCAAATGTCA
GAGTTGAGAGCCTTCAGTGCCCCAGGGAAAGCGTTACTAGCTGGTGGA
TATTTAGTTTTAGATACAAAATATGAAGCATTTGTAGTCGGATTATCG
GCAAGAATGCATGCTGTAGCCCATCCTTACGGTTCATTGCAAGGGTCT
GATAAGTTTGAAGTGCGTGTGAAAAGTAAACAATTTAAAGATGGGGA
GTGGCTGTACCATATAAGTCCTAAAAGTGGCTTCATTCCTGTTTCGATA
GGCGGATCTAAGAACCCTTTCATTGAAAAAGTTATCGCTAACGTATTT
AGCTACTTTAAACCTAACATGGACGACTACTGCAATAGAAACTTGTTC
GTTATTGATATTTTCTCTGATGATGCCTACCATTCTCAGGAGGATAGCG
TTACCGAACATCGTGGCAACAGAAGATTGAGTTTTCATTCGCACAGAA
TTGAAGAAGTTCCCAAAACAGGGCTGGGCTCCTCGGCAGGTTTAGTCA
CAGTTTTAACTACAGCTTTGGCCTCCTTTTTTGTATCGGACCTGGAAAA
TAATGTAGACAAATATAGAGAAGTTATTCATAATTTAGCACAAGTTGC
TCATTGTCAAGCTCAGGGTAAAATTGGAAGCGGGTTTGATGTAGCGGC
GGCAGCATATGGATCTATCAGATATAGAAGATTCCCACCCGCATTAAT
CTCTAATTTGCCAGATATTGGAAGTGCTACTTACGGCAGTAAACTGGC
GCATTTGGTTGATGAAGAAGACTGGAATATTACGATTAAAAGTAACCA
TTTACCTTCGGGATTAACTTTATGGATGGGCGATATTAAGAATGGTTC
AGAAACAGTAAAACTGGTCCAGAAGGTAAAAAATTGGTATGATTCGC
ATATGCCAGAAAGCTTGAAAATATATACAGAACTCGATCATGCAAATT
CTAGATTTATGGATGGACTATCTAAACTAGATCGCTTACACGAGACTC
ATGACGATTACAGCGATCAGATATTTGAGTCTCTTGAGAGGAATGACT
GTACCTGTCAAAAGTATCCTGAAATCACAGAAGTTAGAGATGCAGTTG
CCACAATTAGACGTTCCTTTAGAAAAATAACTAAAGAATCTGGTGCCG
ATATCGAACCTCCCGTACAAACTAGCTTATTGGATGATTGCCAGACCT
TAAAAGGAGTTCTTACTTGCTTAATACCTGGTGCTGGTGGTTATGACG
CCATTGCAGTGATTACTAAGCAAGATGTTGATCTTAGGGCTCAAACCG
CTAATGACAAAAGATTTTCTAAGGTTCAATGGCTGGATGTAACTCAGG
CTGACTGGGGTGTTAGGAAAGAAAAAGATCCGGAAACTTATCTTGATA
AATAGGAGGTAATACTCATGACCGTTTACACAGCATCCGTTACCGCAC
CCGTCAACATCGCAACCCTTAAGTATTGGGGGAAAAGGGACACGAAG
TTGAATCTGCCCACCAATTCGTCCATATCAGTGACTTTATCGCAAGATG
ACCTCAGAACGTTGACCTCTGCGGCTACTGCACCTGAGTTTGAACGCG
ACACTTTGTGGTTAAATGGAGAACCACACAGCATCGACAATGAAAGA
ACTCAAAATTGTCTGCGCGACCTACGCCAATTAAGAAAGGAAATGGA
ATCGAAGGACGCCTCATTGCCCACATTATCTCAATGGAAACTCCACAT
TGTCTCCGAAAATAACTTTCCTACAGCAGCTGGTTTAGCTTCCTCCGCT
GCTGGCTTTGCTGCATTGGTCTCTGCAATTGCTAAGTTATACCAATTAC
CACAGTCAACTTCAGAAATATCTAGAATAGCAAGAAAGGGGTCTGGTT
CAGCTTGTAGATCGTTGTTTGGCGGATACGTGGCCTGGGAAATGGGAA
AAGCTGAAGATGGTCATGATTCCATGGCAGTACAAATCGCAGACAGCT
CTGACTGGCCTCAGATGAAAGCTTGTGTCCTAGTTGTCAGCGATATTA
AAAAGGATGTGAGTTCCACTCAGGGTATGCAATTGACCGTGGCAACCT
CCGAACTATTTAAAGAAAGAATTGAACATGTCGTACCAAAGAGATTTG
AAGTCATGCGTAAAGCCATTGTTGAAAAAGATTTCGCCACCTTTGCAA
AGGAAACAATGATGGATTCCAACTCTTTCCATGCCACATGTTTGGACT
CTTTCCCTCCAATATTCTACATGAATGACACTTCCAAGCGTATCATCAG
TTGGTGCCACACCATTAATCAGTTTTACGGAGAAACAATCGTTGCATA
CACGTTTGATGCAGGTCCAAATGCTGTGTTGTACTACTTAGCTGAAAA
TGAGTCGAAACTCTTTGCATTTATCTATAAATTGTTTGGCTCTGTTCCT
GGATGGGACAAGAAATTTACTACTGAGCAGCTTGAGGCTTTCAACCAT
CAATTTGAATCATCTAACTTTACTGCACGTGAATTGGATCTTGAGTTGC
AAAAGGATGTTGCCAGAGTGATTTTAACTCAAGTCGGTTCAGGCCCAC
AAGAAACAAACGAATCTTTGATTGACGCAAAGACTGGTCTACCAAAG
GAATAACTGCAGCCCGGGAGGAGGATTACTATATGCAAACGGAACAC
GTCATTTTATTGAATGCACAGGGAGTTCCCACGGGTACGCTGGAAAAG
TATGCCGCACACACGGCAGACACCCGCTTACATCTCGCGTTCTCCAGT
TGGCTGTTTAATGCCAAAGGACAATTATTAGTTACCCGCCGCGCACTG
AGCAAAAAAGCATGGCCTGGCGTGTGGACTAACTCGGTTTGTGGGCAC
CCACAACTGGGAGAAAGCAACGAAGACGCAGTGATCCGCCGTTGCCG
TTATGAGCTTGGCGTGGAAATTACGCCTCCTGAATCTATCTATCCTGAC
TTTCGCTACCGCGCCACCGATCCGAGTGGCATTGTGGAAAATGAAGTG
TGTCCGGTATTTGCCGCACGCACCACTAGTGCGTTACAGATCAATGAT
GATGAAGTGATGGATTATCAATGGTGTGATTTAGCAGATGTATTACAC
GGTATTGATGCCACGCCGTGGGCGTTCAGTCCGTGGATGGTGATGCAG
GCGACAAATCGCGAAGCCAGAAAACGATTATCTGCATTTACCCAGCTT
AAATAACCCGGGGGATCCACTAGTTCTAGAGCGGCCGCCACCGCGGA
GGAGGAATGAGTAATGGACTTTCCGCAGCAACTCGAAGCCTGCGTTAA
GCAGGCCAACCAGGCGCTGAGCCGTTTTATCGCCCCACTGCCCTTTCA
GAACACTCCCGTGGTCGAAACCATGCAGTATGGCGCATTATTAGGTGG
TAAGCGCCTGCGACCTTTCCTGGTTTATGCCACCGGTCATATGTTCGGC
GTTAGCACAAACACGCTGGACGCACCCGCTGCCGCCGTTGAGTGTATC
CACGCTTACTCATTAATTCATGATGATTTACCGGCAATGGATGATGAC
GATCTGCGTCGCGGTTTGCCAACCTGCCATGTGAAGTTTGGCGAAGCA
AACGCGATTCTCGCTGGCGACGCTTTACAAACGCTGGCGTTCTCGATT
TTAAGCGATGCCGATATGCCGGAAGTGTCGGACCGCGACAGAATTTCG
ATGATTTCTGAACTGGCGAGCGCCAGTGGTATTGCCGGAATGTGCGGT
GGTCAGGCATTAGATTTAGACGCGGAAGGCAAACACGTACCTCTGGA
CGCGCTTGAGCGTATTCATCGTCATAAAACCGGCGCATTGATTCGCGC
CGCCGTTCGCCTTGGTGCATTAAGCGCCGGAGATAAAGGACGTCGTGC
TCTGCCGGTACTCGACAAGTATGCAGAGAGCATCGGCCTTGCCTTCCA
GGTTCAGGATGACATCCTGGATGTGGTGGGAGATACTGCAACGTTGGG
AAAACGCCAGGGTGCCGACCAGCAACTTGGTAAAAGTACCTACCCTG
CACTTCTGGGTCTTGAGCAAGCCCGGAAGAAAGCCCGGGATCTGATCG
ACGATGCCCGTCAGTCGCTGAAACAACTGGCTGAACAGTCACTCGATA
CCTCGGCACTGGAAGCGCTAGCGGACTACATCATCCAGCGTAATAAAT
AA
ADS Artemisia GCCCTGACCGAAGAGAAACCGATCCGCCCGATCGCTAACTTCCCGCCG SEQ
annua TCTATCTGGGGTGACCAGTTCCTGATCTACGAAAAGCAGGTTGAGCAG ID
GGTGTTGAACAGATCGTAAACGACCTGAAGAAAGAAGTTCGTCAGCT NO:
GCTGAAAGAAGCTCTGGACATCCCGATGAAACACGCTAACCTGTTGAA 87
GCTGATCGACGAGATCCAGCGTCTGGGTATCCCGTACCACTTCGAACG
CGAAATCGACCACGCACTGCAGTGCATCTACGAAACCTACGGCGACA
ACTGGAACGGCGACCGTTCTTCTCTGTGGTTTCGTCTGATGCGTAAAC
AGGGCTACTACGTTACCTGTGACGTTTTTAACAACTACAAGGACAAGA
ACGGTGCTTTCAAACAGTCTCTGGCTAACGACGTTGAAGGCCTGCTGG
AACTGTACGAAGCGACCTCCATGCGTGTACCGGGTGAAATCATCCTGG
AGGACGCGCTGGGTTTCACCCGTTCTCGTCTGTCCATTATGACTAAAG
ACGCTTTCTCTACTAACCCGGCTCTGTTCACCGAAATCCAGCGTGCTCT
GAAACAGCCGCTGTGGAAACGTCTGCCGCGTATCGAAGCAGCACAGT
ACATTCCGTTTTACCAGCAGCAGGACTCTCACAACAAGACCCTGCTGA
AACTGGCTAAGCTGGAATTCAACCTGCTGCAGTCTCTGCACAAAGAAG
AACTGTCTCACGTTTGTAAGTGGTGGAAGGCATTTGACATCAAGAAAA
ACGCGCCGTGCCTGCGTGACCGTATCGTTGAATGTTACTTCTGGGGTCT
GGGTTCTGGTTATGAACCACAGTACTCCCGTGCACGTGTGTTCTTCACT
AAAGCTGTAGCTGTTATCACCCTGATCGATGACACTTACGATGCTTAC
GGCACCTACGAAGAACTGAAGATCTTTACTGAAGCTGTAGAACGCTGG
TCTATCACTTGCCTGGACACTCTGCCGGAGTACATGAAACCGATCTAC
AAACTGTTCATGGATACCTACACCGAAATGGAGGAATTCCTGGCAAAA
GAAGGCCGTACCGACCTGTTCAACTGCGGTAAAGAGTTTGTTAAAGAA
TTCGTACGTAACCTGATGGTTGAAGCTAAATGGGCTAACGAAGGCCAT
ATCCCGACTACCGAAGAACATGACCCGGTTGTTATCATCACCGGCGGT
GCAAACCTGCTGACCACCACTTGCTATCTGGGTATGTCCGACATCTTTA
CCAAGGAATCTGTTGAATGGGCTGTTTCTGCACCGCCGCTGTTCCGTTA
CTCCGGTATTCTGGGTCGTCGTCTGAACGACCTGATGACCCACAAAGC
AGAGCAGGAACGTAAACACTCTTCCTCCTCTCTGGAATCCTACATGAA
GGAATATAACGTTAACGAGGAGTACGCACAGACTCTGATCTATAAAG
AAGTTGAAGACGTATGGAAAGACATCAACCGTGAATACCTGACTACT
AAAAACATCCCGCGCCCGCTGCTGATGGCAGTAATCTACCTGTGCCAG
TTCCTGGAAGTACAGTACGCTGGTAAAGATAACTTCACTCGCATGGGC
GACGAATACAAACACCTGATCAAATCCCTGCTGGTTTACCCGATGTCC
ATCTGA
GHS Abies ATGGCTCAAATCAGCGAATCAGTGTCTCCAAGCACCGACCTTAAAAGC SEQ
grandis ACGGAATCTTCTATTACCAGCAACCGCCACGGTAACATGTGGGAAGAT ID
GACCGCATTCAGAGCTTAAACAGCCCATATGGCGCACCCGCTTATCAG NO:
GAACGTAGCGAAAAATTGATTGAAGAAATTAAGCTCCTGTTTCTGTCC 88
GATATGGACGATAGTTGCAATGATTCGGATCGCGACTTGATCAAACGC
CTGGAGATCGTAGATACGGTTGAGTGTCTGGGCATTGATCGTCATTTC
CAACCTGAAATTAAGCTGGCGCTGGATTACGTGTACCGTTGCTGGAAT
GAGCGTGGCATCGGAGAAGGTAGCCGTGATAGCTTAAAAAAGGACCT
GAATGCGACCGCCTTGGGCTTTCGGGCTTTACGCTTACACCGTTATAAT
GTAAGCTCAGGAGTGCTGGAGAACTTCCGTGATGACAATGGTCAATTC
TTTTGCGGTTCTACTGTGGAGGAGGAAGGCGCGGAGGCCTACAATAAA
CATGTACGTTGCATGCTGTCCCTGTCCCGCGCTTCCAATATTTTATTCC
CGGGCGAGAAAGTGATGGAAGAAGCGAAGGCGTTTACGACCAACTAT
CTTAAGAAAGTCCTGGCGGGTCGTGAAGCAACTCATGTCGACGAGAGT
CTCCTTGGAGAGGTCAAGTATGCACTAGAATTTCCGTGGCATTGTTCC
GTGCAGCGCTGGGAGGCACGTTCTTTTATCGAAATTTTCGGTCAGATT
GATAGTGAACTGAAAAGCAACCTCTCTAAAAAAATGCTCGAACTCGC
AAAACTTGATTTTAACATACTCCAGTGTACGCATCAAAAAGAGCTCCA
GATCATTAGTCGATGGTTCGCCGATTCAAGTATCGCAAGTCTGAACTT
TTACCGTAAATGCTATGTGGAATTTTACTTCTGGATGGCCGCGGCAATT
TCAGAACCAGAATTTAGTGGCTCTCGCGTGGCATTCACTAAAATTGCG
ATCTTGATGACAATGTTAGATGACTTATACGACACGCATGGGACGCTG
GATCAATTGAAAATATTTACCGAAGGTGTGCGCAGGTGGGACGTGTCG
CTGGTGGAGGGCCTGCCGGATTTCATGAAAATTGCCTTTGAGTTCTGG
TTAAAGACCTCCAACGAACTGATTGCGGAGGCGGTTAAGGCCCAAGG
CCAGGATATGGCGGCCTATATCCGCAAAAACGCTTGGGAACGCTATCT
GGAAGCGTATTTGCAGGATGCCGAATGGATCGCCACCGGTCACGTTCC
GACATTCGATGAATATCTGAACAATGGCACCCCCAACACCGGTATGTG
TGTACTTAATCTGATCCCGTTGCTGCTTATGGGCGAACACTTGCCGATC
GATATTCTTGAACAGATCTTTCTGCCGAGCCGGTTCCACCATCTGATTG
AACTGGCTAGCCGACTGGTCGATGATGCGAGAGATTTTCAAGCCGAAA
AAGATCATGGTGATTTATCCTGCATCGAATGCTACCTGAAAGACCATC
CGGAATCAACAGTTGAAGACGCCCTGAATCACGTCAACGGCCTGCTGG
GGAATTGTTTGCTGGAAATGAATTGGAAATTTCTGAAAAAACAGGACT
CGGTACCTCTGTCGTGTAAAAAATACTCATTCCACGTCCTGGCGCGGT
CGATTCAGTTTATGTATAACCAGGGGGACGGGTTTTCGATTTCGAACA
AAGTTATTAAAGACCAGGTCCAGAAAGTTCTAATCGTTCCGGTTCCTA
TATAA
ABS Abies TGAAACGAGAATTTCCTCCAGGATTTTGGAAGGATGATCTTATCGATT SEQ
grandis CTCTAACGTCATCTCACAAGGTTGCAGCATCAGACGAGAAGCGTATCG ID
AGACATTAATATCCGAGATTAAGAATATGTTTAGATGTATGGGCTATG NO:
GCGAAACGAATCCCTCTGCATATGACACTGCTTGGGTAGCAAGGATTC 89
CAGCAGTTGATGGCTCTGACAACCCTCACTTTCCTGAGACGGTTGAAT
GGATTCTTCAAAATCAGTTGAAAGATGGGTCTTGGGGTGAAGGATTCT
ACTTCTTGGCATATGACAGAATACTGGCTACACTTGCATGTATTATTAC
CCTTACCCTCTGGCGTACTGGGGAGACACAAGTACAGAAAGGTATTGA
ATTCTTCAGGACACAAGCTGGAAAGATGGAAGATGAAGCTGATAGTC
ATAGGCCAAGTGGATTTGAAATAGTATTTCCTGCAATGCTAAAGGAAG
CTAAAATCTTAGGCTTGGATCTGCCTTACGATTTGCCATTCCTGAAACA
AATCATCGAAAAGCGGGAGGCTAAGCTTAAAAGGATTCCC
ACTGATGTTCTCTATGCCCTTCCAACAACGTTATTGTATTCTTTGGAAG
GTTTACAAGAAATAGTAGACTGGCAGAAAATAATGAAACTTCAATCC
AAGGATGGATCATTTCTCAGCTCTCCGGCATCTACAGCGGCTGTATTC
ATGCGTACAGGGAACAAAAAGTGCTTGGATTTCTTGAACTTTGTCTTG
AAGAAATTCGGAAACCATGTGCCTTGTCACTATCCGCTTGATCTATTTG
AACGTTTGTGGGCGGTTGATACAGTTGAGCGGCTAGGTATCGATCGTC
ATTTCAAAGAGGAGATCAAGGAAGCATTGGATTATGTTTACAGCCATT
GGGACGAAAGAGGCATTGGATGGGCGAGAGAGAATCCTGTTCCTGAT
ATTGATGATACAGCCATGGGCCTTCGAATCTTGAGATTACATGGATAC
AATGTATCCTCAGATGTTTTAAAAACATTTAGAGATGAGAATGGGGAG
TTCTTTTGCTTCTTGGGTCAAACACAGAGAGGAGTTACAGACATGTTA
AACGTCAATCGTTGTTCACATGTTTCATTTCCGGGAGAAACGATCATG
GAAGAAGCAAAACTCTGTACCGAAAGGTATCTGAGGAATGCTCTGGA
AAATGTGGATGCCTTTGACAAATGGGCTTTTAAAAAGAATATTCGGGG
AGAGGTAGAGTATGCACTCAAATATCCCTGGCATAAGAGTATGCCAA
GGTTGGAGGCTAGAAGCTATATTGAAAACTATGGGCCAGATGATGTGT
GGCTTGGAAAAACTGTATATATGATGCCATACATTTCGAATGAAAAGT
ATTTAGAACTAGCGAAACTGGACTTCAATAAGGTGCAGTCTATACACC
AAACAGAGCTTCAAGATCTTCGAAGGTGGTGGAAATCATCCGGTTTCA
CGGATCTGAATTTCACTCGTGAGCGTGTGACGGAAATATATTTCTCAC
CGGCATCCTTTATCTTTGAGCCCGAGTTTTCTAAGTGCAGAGAGGTTTA
TACAAAAACTTCCAATTTCACTGTTATTTTAGATGATCTTTATGACGCC
CATGGATCTTTAGACGATCTTAAGTTGTTCACAGAATCAGTCAAAAGA
TGGGATCTATCACTAGTGGACCAAATGCCACAACAAATGAAAATATGT
TTTGTGGGTTTCTACAATACTTTTAATGATATAGCAAAAGAAGGACGT
GAGAGGCAAGGGCGCGATGTGCTAGGCTACATTCAAAATGTTTGGAA
AGTCCAACTTGAAGCTTACACGAAAGAAGCAGAATGGTCTGAAGCTA
AATATGTGCCATCCTTCAATGAATACATAGAGAATGCGAGTGTGTCAA
TAGCATTGGGAACAGTCGTTCTCATTAGTGCTCTTTTCACTGGGGAGGT
TCTTACAGATGAAGTACTCTCCAAAATTGATCGCGAATCTAGATTTCTT
CAACTCATGGGCTTAACAGGGCGTTTGGTGAATGACACCAAAACTTAT
CAGGCAGAGAGAGGTCAAGGTGAGGTGGCTTCTGCCATACAATGTTAT
ATGAAGGACCATCCTAAAATCTCTGAAGAAGAAGCTCTACAACATGTC
TATAGTGTCATGGAAAATGCCCTCGAAGAGTTGAATAGGGAGTTTGTG
AATAACAAAATACCGGATATTTACAAAAGACTGGTTTTTGAAACTGCA
AGAATAATGCAACTCTTTTATATGCAAGGGGATGGTTTGACACTATCA
CATGATATGGAAATTAAAGAGCATGTCAAAAATTGCCTCTTCCAACCA
GTTGCC
TXS Taxus ATGAGCAGCAGCACTGGCACTAGCAAGGTGGTTTCCGAGACTTCCAGT SEQ
brevifolia ACCATTGTGGATGATATCCCTCGACTCTCCGCCAATTATCATGGCGATC ID
TGTGGCACCACAATGTTATACAAACTCTGGAGACACCGTTTCGTGAGA NO:
GTTCTACTTACCAAGAACGGGCAGATGAGCTGGTTGTGAAAATTAAAG 90
ATATGTTCAATGCGCTCGGAGACGGAGATATCAGTCCGTCTGCATACG
ACACTGCGTGGGTGGCGAGGCTGGCGACCATTTCCTCTGATGGATCTG
AGAAGCCACGGTTTCCTCAGGCCCTCAACTGGGTTTTCAACAACCAGC
TCCAGGATGGATCGTGGGGTATCGAATCGCACTTTAGTTTATGCGATC
GATTGCTTAACACGACCAATTCTGTTATCGCCCTCTCGGTTTGGAAAAC
AGGGCACAGCCAAGTACAACAAGGTGCTGAGTTTATTGCAGAGAATC
TAAGATTACTCAATGAGGAAGATGAGTTGTCCCCGGATTTCCAAATAA
TCTTTCCTGCTCTGCTGCAAAAGGCAAAAGCGTTGGGGATCAATCTTC
CTTACGATCTTCCATTTATCAAATATTTGTCGACAACACGGGAAGCCA
GGCTTACAGATGTTTCTGCGGCAGCAGACAATATTCCAGCCAACATGT
TGAATGCGTTGGAAGGACTCGAGGAAGTTATTGACTGGAACAAGATT
ATGAGGTTTCAAAGTAAAGATGGATCTTTCCTGAGCTCCCCTGCCTCC
ACTGCCTGTGTACTGATGAATACAGGGGACGAAAAATGTTTCACTTTT
CTCAACAATCTGCTCGACAAATTCGGCGGCTGCGTGCCCTGTATGTAT
TCCATCGATCTGCTGGAACGCCTTTCGCTGGTTGATAACATTGAGCATC
TCGGAATCGGTCGCCATTTCAAACAAGAAATCAAAGGAGCTCTTGATT
ATGTCTACAGACATTGGAGTGAAAGGGGCATCGGTTGGGGCAGAGAC
AGCCTTGTTCCAGATCTCAACACCACAGCCCTCGGCCTGCGAACTCTT
CGCATGCACGGATACAATGTTTCTTCAGACGTTTTGAATAATTTCAAA
GATGAAAACGGGCGGTTCTTCTCCTCTGCGGGCCAAACCCATGTCGAA
TTGAGAAGCGTGGTGAATCTTTTCAGAGCTTCCGACCTTGCATTTCCTG
ACGAAAGAGCTATGGACGATGCTAGAAAATTTGCAGAACCATATCTTA
GAGAGGCACTTGCAACGAAAATCTCAACCAATACAAAACTATTCAAA
GAGATTGAGTACGTGGTGGAGTACCCTTGGCACATGAGTATCCCACGC
TTAGAAGCCAGAAGTTATATTGATTCATATGACGACAATTATGTATGG
CAGAGGAAGACTCTATATAGAATGCCATCTTTGAGTAATTCAAAATGT
TTAGAATTGGCAAAATTGGACTTCAATATCGTACAATCTTTGCATCAA
GAGGAGTTGAAGCTTCTAACAAGATGGTGGAAGGAATCCGGCATGGC
AGATATAAATTTCACTCGACACCGAGTGGCGGAGGTTTATTTTTCATC
AGCTACATTTGAACCCGAATATTCTGCCACTAGAATTGCCTTCACAAA
AATTGGTTGTTTACAAGTCCTTTTTGATGATATGGCTGACATCTTTGCA
ACACTAGATGAATTGAAAAGTTTCACTGAGGGAGTAAAGAGATGGGA
TACATCTTTGCTACATGAGATTCCAGAGTGTATGCAAACTTGCTTTAAA
GTTTGGTTCAAATTAATGGAAGAAGTAAATAATGATGTGGTTAAGGTA
CAAGGACGTGACATGCTCGCTCACATAAGAAAACCCTGGGAGTTGTAC
TTCAATTGTTATGTACAAGAAAGGGAGTGGCTTGAAGCCGGGTATATA
CCAACTTTTGAAGAGTACTTAAAGACTTATGCTATATCAGTAGGCCTT
GGACCGTGTACCCTACAACCAATACTACTAATGGGTGAGCTTGTGAAA
GATGATGTTGTTGAGAAAGTGCACTATCCCTCAAATATGTTTGAGCTT
GTATCCTTGAGCTGGCGACTAACAAACGACACCAAAACATATCAGGCT
GAAAAGGCTCGAGGACAACAAGCCTCAGGCATAGCATGCTATATGAA
GGATAATCCAGGAGCAACTGAGGAAGATGCCATTAAGCACATATGTC
GTGTTGTTGATCGGGCCTTGAAAGAAGCAAGCTTTGAATATTTCAAAC
CATCCAATGATATCCCAATGGGTTGCAAGTCCTTTATTTTTAACCTTAG
ATTGTGTGTCCAAATCTTTTACAAGTTTATAGATGGGTACGGAATCGC
CAATGAGGAGATTAA
GGACTATATAAGAAAAGTTTATATTGATCCAATTCAAGTATGA
GGPPS Taxus ATGTTTGATTTCAATGAATATATGAAAAGTAAGGCTGTTGCGGTAGAC SEQ
canadensis GCGGCTCTGGATAAAGCGATTCCGCTGGAATATCCCGAGAAGATTCAC ID
GAATCGATGCGCTACTCCCTGTTAGCAGGAGGGAAACGCGTTCGTCCG NO:
GCATTATGCATCGCGGCCTGTGAACTCGTCGGCGGTTCACAGGACTTA 91
GCAATGCCAACTGCTTGCGCAATGGAAATGATTCACACAATGAGCCTG
ATTCATGATGATTTGCCTTGCATGGACAACGATGACTTTCGGCGCGGT
AAACCTACTAATCATAAGGTTTTTGGCGAAGATACTGCAGTGCTGGCG
GGCGATGCGCTGCTGTCGTTTGCCTTCGAACATATCGCCGTCGCGACC
TCGAAAACCGTCCCGTCGGACCGTACGCTTCGCGTGATTTCCGAGCTG
GGAAAGACCATCGGCTCTCAAGGACTCGTGGGTGGTCAGGTAGTTGAT
ATCACGTCTGAGGGTGACGCGAACGTGGACCTGAAAACCCTGGAGTG
GATCCATATTCACAAAACGGCCGTGCTGCTGGAATGTAGCGTGGTGTC
AGGGGGGATCTTGGGGGGCGCCACGGAGGATGAAATCGCGCGTATTC
GTCGTTATGCCCGCTGTGTTGGACTGTTATTTCAGGTGGTGGATGACAT
CCTGGATGTCACAAAATCCAGCGAAGAGCTTGGCAAGACCGCGGGCA
AAGACCTTCTGACGGATAAGGCTACATACCCGAAATTGATGGGCTTGG
AGAAAGCCAAGGAGTTCGCAGCTGAACTTGCCACGCGGGCGAAGGAA
GAACTCTCTTCTTTCGATCAAATCAAAGCCGCGCCACTGCTGGGCCTC
GCCGATTACATTGCGTTTCGTCAGAACTGA
P450.sub.BM3 Bacillus ATGACAATTAAAGAAATGCCTCAGCCAAAAACGTTTGGAGAGCTTAA SEQ
megaterium AAATTTACCGTTATTAAACACAGATAAACCGGTTCAAGCTTTGATGAA ID
AATTGCGGATGAATTAGGAGAAATCTTTAAATTCGAGGCGCCTGGTCG NO:
TGTAACGCGCTACTTATCAAGTCAGCGTCTAATTAAAGAAGCATGCGA 92
TGAATCACGCTTTGATAAAAACTTAAGTCAAGCGCTTAAATTTGTACG
TGATTTTGCAGGAGACGGGTTATTTACAAGCTGGACGCATGAAAAAAA
TTGGAAAAAAGCGCATAATATCTTACTTCCAAGCTTCAGTCAGCAGGC
AATGAAAGGCTATCATGCGATGATGGTCGATATCGCCGTGCAGCTTGT
TCAAAAGTGGGAGCGTCTAAATGCAGATGAGCATATTGAAGTACCGG
AAGACATGACACGTTTAACGCTTGATACAATTGGTCTTTGCGGCTTTA
ACTATCGCTTTAACAGCTTTTACCGAGATCAGCCTCATCCATTTATTAC
AAGTATGGTCCGTGCACTGGATGAAGCAATGAACAAGCTGCAGCGAG
CAAATCCAGACGACCCAGCTTATGATGAAAACAAGCGCCAGTTTCAA
GAAGATATCAAGGTGATGAACGACCTAGTAGATAAAATTATTGCAGA
TCGCAAAGCAAGCGGTGAACAAAGCGATGATTTATTAACGCATATGCT
AAACGGAAAAGATCCAGAAACGGGTGAGCCGCTTGATGACGAGAACA
TTCGCTATCAAATTATTACATTCTTAATTGCGGGACACGAAACAACAA
GTGGTCTTTTATCATTTGCGCTGTATTTCTTAGTGAAAAATCCACATGT
ATTACAAAAAGCAGCAGAAGAAGCAGCACGAGTTCTAGTAGATCCTG
TTCCAAGCTACAAACAAGTCAAACAGCTTAAATATGTCGGCATGGTCT
TAAACGAAGCGCTGCGCTTATGGCCAACTGCTCCTGCGTTTTCCCTATA
TGCAAAAGAAGATACGGTGCTTGGAGGAGAATATCCTTTAGAAAAAG
GCGACGAACTAATGGTTCTGATTCCTCAGCTTCACCGTGATAAAACAA
TTTGGGGAGACGATGTGGAAGAGTTCCGTCCAGAGCGTTTTGAAAATC
CAAGTGCGATTCCGCAGCATGCGTTTAAACCGTTTGGAAACGGTCAGC
GTGCGTGTATCGGTCAGCAGTTCGCTCTTCATGAAGCAACGCTGGTAC
TTGGTATGATGCTAAAACACTTTGACTTTGAAGATCATACAAACTACG
AGCTGGATATTAAAGAAACTTTAACGTTAAAACCTGAAGGCTTTGTGG
TAAAAGCAAAATCGAAAAAAATTCCGCTTGGCGGTATTCCTTCACCTA
GCACTGAACAGTCTGCTAAAAAAGTACGCAAAAAGGCAGAAAACGCT
CATAATACGCCGCTGCTTGTGCTATACGGTTCAAATATGGGAACAGCT
GAAGGAACGGCGCGTGATTTAGCAGATATTGCAATGAGCAAAGGATT
TGCACCGCAGGTCGCAACGCTTGATTCACACGCCGGAAATCTTCCGCG
CGAAGGAGCTGTATTAATTGTAACGGCGTCTTATAACGGTCATCCGCC
TGATAACGCAAAGCAATTTGTCGACTGGTTAGACCAAGCGTCTGCTGA
TGAAGTAAAAGGCGTTCGCTACTCCGTATTTGGATGCGGCGATAAAAA
CTGGGCTACTACGTATCAAAAAGTGCCTGCTTTTATCGATGAAACGCT
TGCCGCTAAAGGGGCAGAAAACATCGCTGACCGCGGTGAAGCAGATG
CAAGCGACGACTTTGAAGGCACATATGAAGAATGGCGTGAACATATG
TGGAGTGACGTAGCAGCCTACTTTAACCTCGACATTGAAAACAGTGAA
GATAATAAATCTACTCTTTCACTTCAATTTGTCGACAGCGCCGCGGAT
ATGCCGCTTGCGAAAATGCACGGTGCGTTTTCAACGAACGTCGTAGCA
AGCAAAGAACTTCAACAGCCAGGCAGTGCACGAAGCACGCGACATCT
TGAAATTGAACTTCCAAAAGAAGCTTCTTATCAAGAAGGAGATCATTT
AGGTGTTATTCCTCGCAACTATGAAGGAATAGTAAACCGTGTAACAGC
AAGGTTCGGCCTAGATGCATCACAGCAAATCCGTCTGGAAGCAGAAG
AAGAAAAATTAGCTCATTTGCCACTCGCTAAAACAGTATCCGTAGAAG
AGCTTCTGCAATACGTGGAGCTTCAAGATCCTGTTACGCGCACGCAGC
TTCGCGCAATGGCTGCTAAAACGGTCTGCCCGCCGCATAAAGTAGAGC
TTGAAGCCTTGCTTGAAAAGCAAGCCTACAAAGAACAAGTGCTGGCA
AAACGTTTAACAATGCTTGAACTGCTTGAAAAATACCCGGCGTGTGAA
ATGAAATTCAGCGAATTTATCGCCCTTCTGCCAAGCATACGCCCGCGC
TATTACTCGATTTCTTCATCACCTCGTGTCGATGAAAAACAAGCAAGC
ATCACGGTCAGCGTTGTCTCAGGAGAAGCGTGGAGCGGATATGGAGA
ATATAAAGGAATTGCGTCGAACTATCTTGCCGAGCTGCAAGAAGGAG
ATACGATTACGTGCTTTATTTCCACACCGCAGTCAGAATTTACGCTGCC
AAAAGACCCTGAAACGCCGCTTATCATGGTCGGACCGGGAACAGGCG
TCGCGCCGTTTAGAGGCTTTGTGCAGGCGCGCAAACAGCTAAAAGAAC
AAGGACAGTCACTTGGAGAAGCACATTTATACTTCGGCTGCCGTTCAC
CTCATGAAGACTATCTGTATCAAGAAGAGCTTGAAAACGCCCAAAGC
GAAGGCATCATTACGCTTCATACCGCTTTTTCTCGCATGCCAAATCAGC
CGAAAACATACGTTCAGCACGTAATGGAACAAGACGGCAAGAAATTG
ATTGAACTTCTTGATCAAGGAGCGCACTTCTATATTTGCGGAGACGGA
AGCCAAATGGCACCTGCCGTTGAAGCAACGCTTATGAAAAGCTATGCT
GACGTTCACCAAGTGAGTGAAGCAGACGCTCGCTTATGGCTGCAGCAG
CTAGAAGAAAAAGGCCGATACGCAAAAGACGTGTGGGCTGGGTAA
SpecR Bacterial ATGAGGGAAGCGGTGATCGCCGAAGTATCGACTCAACTATCAGAGGT SEQ
AGTTGGCGTCATCGAGCGCCATCTCGAACCGACGTTGCTGGCCGTACA ID
TTTGTACGGCTCCGCAGTGGATGGCGGCCTGAAGCCACACAGTGATAT NO:
TGATTTGCTGGTTACGGTGACCGTAAGGCTTGATGAAACAACGCGGCG 93
AGCTTTGATCAACGACCTTTTGGAAACTTCGGCTTCCCCTGGAGAGAG
CGAGATTCTCCGCGCTGTAGAAGTCACCATTGTTGTGCACGACGACAT
CATTCCGTGGCGTTATCCAGCTAAGCGCGAACTGCAATTTGGAGAATG
GCAGCGCAATGACATTCTTGCAGGTATCTTCGAGCCAGCCACGATCGA
CATTGATCTGGCTATCTTGCTGACAAAAGCAAGAGAACATAGCGTTGC
CTTGGTAGGTCCAGCGGCGGAGGAACTCTTTGATCCGGTTCCTGAACA
GGATCTATTTGAGGCGCTAAATGAAACCTTAACGCTATGGAACTCGCC
GCCCGACTGGGCTGGCGATGAGCGAAATGTAGTGCTTACGTTGTCCCG
CATTTGGTACAGCGCAGTAACCGGCAAAATCGCGCCGAAGGATGTCG
CTGCCGACTGGGCAATGGAGCGCCTGCCGGCCCAGTATCAGCCCGTCA
TACTTGAAGCTAGACAGGCTTATCTTGGACAAGAAGAAGATCGCTTGG
CCTCGCGCGCAGATCAGTTGGAAGAATTTGTCCACTACGTGAAAGGCG
AGATCACCAAGGTAGTCGGCAAA
LOV2 Avena TTGGCTACTACACTTGAACGTATTGAGAAGAACTTTGTCATTACTGAC SEQ
sativa CCAAGGTTGCCAGATAATCCCATTATATTCGCGTCCGATAGTTTCTTGC ID
AGTTGACAGAATATAGCCGTGAAGAAATTTTGGGAAGAAACTGCAGG NO:
TTTCTACAAGGTCCTGAAACTGATCGCGCGACAGTGAGAAAAATTAGA 94
GATGCCATAGATAACCAAACAGAGGTCACTGTTCAGCTGATTAATTAT
ACAAAGAGTGGTAAAAAGTTCTGGAACCTCTTTCACTTGCAGCCTATG
CGAGATCAGAAGGGAGATGTCCAGTACTTTATTGGGGTTCAGTTGGAT
GGAACTGAGCATGTCCGAGATGCTGCCGAGAGAGAGGGAGTCATGCT
GATTAAGAAAACTGCAGAAAATATTGATGAGGCGGCAAAAGAACTTC
CA
BphP1 Rhodopseudo- ATGGCTAGCGTGGCAGGTCATGCCTCTGGCAGCCCCGCATTCGGGACC SEQ
monas GCCGATCTTTCGAATTGCGAACGTGAAGAGATCCACCTCGCCGGCTCG ID
palustris ATCCAGCCGCATGGCGCGCTTCTGGTCGTCAGCGAGCCGGATCATCGC NO:
ATCATCCAGGCCAGCGCCAACGCCGCGGAATTTCTGAATCTCGGAAGC 95
GTGCTCGGCGTTCCGCTCGCCGAGATCGACGGCGATCTGTTGATCAAG
ATCCTGCCGCATCTCGATCCCACCGCCGAAGGCATGCCGGTCGCGGTG
CGCTGCCGGATCGGCAATCCCTCCACGGAGTACGACGGTCTGATGCAT
CGGCCTCCGGAAGGCGGGCTGATCATCGAGCTCGAACGTGCCGGCCC
GCCGATCGATCTGTCCGGCACGCTGGCGCCGGCGCTGGAGCGGATCCG
CACGGCGGGCTCGCTGCGCGCGCTGTGCGATGACACCGCGCTGCTGTT
TCAGCAGTGCACCGGCTACGACCGGGTGATGGTGTATCGCTTCGACGA
GCAGGGCCACGGCGAAGTGTTCTCCGAGCGCCACGTGCCCGGGCTCG
AATCCTATTTCGGCAACCGCTATCCGTCGTCGGACATTCCGCAGATGG
CGCGGCGGCTGTACGAGCGGCAGCGCGTCCGCGTGCTGGTCGACGTCA
GCTATCAGCCGGTGCCGCTGGAGCCGCGGCTGTCGCCGCTGACCGGGC
GCGATCTCGACATGTCGGGCTGCTTCCTGCGCTCGATGTCGCCGATCC
ATCTGCAGTACCTGAAGAACATGGGCGTGCGCGCCACCCTGGTGGTGT
CGCTGGTGGTCGGCGGCAAGCTGTGGGGCCTGGTTGCCTGTCATCATT
ATCTGCCGCGCTTCATGCATTTCGAGCTGCGGGCGATCTGCGAACTGC
TCGCCGAAGCGATCGCGACGCGGATCACCGCGCTTGAGAGCTTCGCGC
AGAGCCAGTCGGAGCTGTTCGTGCAGCGGCTCGAACAGCGCATGATC
GAAGCGATTACCCGTGAAGGCGATTGGCGCGCAGCGATTTTCGACACC
AGCCAATCGATCCTGCAGCCGCTGCACGCCGCCGGTTGCGCGCTGGTG
TACGAAGACCAGATCAGGACCATCGGCGACGTGCCTTCCACGCAGGA
TGTGCGCGAGATCGCCGGGTGGCTCGATCGCCAGCCGCGCGCGGCGGT
GACCTCGACCGCGTCGCTCGGTCTCGACGTGCCGGAGCTCGCGCATCT
GACGCGGATGGCGAGCGGCGTGGTCGCGGCGCCGATTTCGGATCATC
GCGGCGAGTTTCTGATGTGGTTCCGCCCCGAGCGCGTCCACACCGTTA
CCTGGGGCGGCGATCCGAAGAAGCCGTTCACGATGGGCGATACACCG
GCGGATCTGTCGCCGCGGCGCTCCTTCGCCAAATGGCATCAGGTTGTC
GAAGGCACGTCCGATCCGTGGACGGCCGCCGATCTCGCCGCGGCTCGC
ACCATCGGTCAGACCGTCGCCGACATCGTGCTGCAATTCCGCGCGGTG
CGGACACTGATCGCCCGCGAACAGTACGAACAGTTTTCGTCCCAGGTG
CACGCTTCGATGCAGCCGGTGCTGATCACCGACGCCGAAGGCCGCATC
CTGCTGATGAACGACTCGTTCCGCGACATGTTGCCGGCGGGTTCGCCA
TCCGCCGTCCATCTCGACGATCTCGCCGGGTTCTTCGTCGAATCGAAC
GATTTCCTGCGCAACGTCGCCGAACTGATCGATCACGGCCGCGGGTGG
CGCGGCGAAGTTCTGCTGCGCGGCGCAGGCAACCGCCCGTTGCCGCTG
GCAGTGCGCGCCGATCCGGTGACGCGCACGGAGGACCAGTCGCTCGG
CTTCGTGCTGATCTTCAGCGACGCTACCGATCGTCGCACCGCAGATGC
CGCACGCACGCGTTTCCAGGAAGGCATTCTTGCCAGCGCACGTCCCGG
CGTGCGGCTCGACTCCAAGTCCGACCTGTTGCACGAGAAGCTGCTGTC
CGCGCTGGTCGAGAACGCGCAGCTTGCCGCATTGGAAATCACTTACGG
CGTCGAGACCGGACGCATCGCCGAGCTGCTCGAAGGCGTCCGCCAGTC
GATGCTGCGCACCGCCGAAGTGCTCGGCCATCTGGTGCAGCACGCGGC
GCGCACGGCCGGCAGCGACAGCTCGAGCAATGGCTCGCAGAACAAGA
AGGAATTCGATAGTGCTGGTAGTGCTGGTAGTGCTGGTACTAGT
PTP1B.sub.1-435 H. ATGGAGATGGAAAAGGAGTTCGAGCAGATCGACAAGTCCGGGAGCTG SEQ
sapiens GGCGGCCATTTACCAGGATATCCGACATGAAGCCAGTGACTTCCCATG ID
TAGAGTGGCCAAGCTTCCTAAGAACAAAAACCGAAATAGGTACAGAG NO:
ACGTCAGTCCCTTTGACCATAGTCGGATTAAACTACATCAAGAAGATA 96
ATGACTATATCAACGCTAGTTTGATAAAAATGGAAGAAGCCCAAAGG
AGTTACATTCTTACCCAGGGCCCTTTGCCTAACACATGCGGTCACTTTT
GGGAGATGGTGTGGGAGCAGAAAAGCAGGGGTGTCGTCATGCTCAAC
AGAGTGATGGAGAAAGGTTCGTTAAAATGCGCACAATACTGGCCACA
AAAAGAAGAAAAAGAGATGATCTTTGAAGACACAAATTTGAAATTAA
CATTGATCTCTGAAGATATCAAGTCATATTATACAGTGCGACAGCTAG
AATTGGAAAACCTTACAACCCAAGAAACTCGAGAGATCTTACATTTCC
ACTATACCACATGGCCTGACTTTGGAGTCCCTGAATCACCAGCCTCAT
TCTTGAACTTTCTTTTCAAAGTCCGAGAGTCAGGGTCACTCAGCCCGG
AGCACGGGCCCGTTGTGGTGCACTGCAGTGCAGGCATCGGCAGGTCTG
GAACCTTCTGTCTGGCTGATACCTGCCTCTTGCTGATGGACAAGAGGA
AAGACCCTTCTTCCGTTGATATCAAGAAAGTGCTGTTAGAAATGAGGA
AGTTTCGGATGGGGCTGATCCAGACAGCCGACCAGCTGCGCTTCTCCT
ACCTGGCTGTGATCGAAGGTGCCAAATTCATCATGGGGGACTCTTCCG
TGCAGGATCAGTGGAAGGAGCTTTCCCACGAGGACCTGGAGCCCCCA
CCCGAGCATATCCCCCCACCTCCCCGGCCACCCAAACGAATCCTGGAG
CCACACAATGGGAAATGCAGGGAGTTCTTCCCAAATCACCAGTGGGTG
AAGGAAGAGACCCAGGAGGATAAAGACTGCCCCATCAAGGAAGAAA
AAGGAAGCCCCTTAAATGCCGCACCCTACGGCATCGAAAGCATGAGT
CAAGACACTGAAGTTAGAAGTCGGGTCGTGGGGGGAAGTCTTCGAGG
TGCCCAGGCTGCCTCCCCAGCCAAAGGGGAGCCGTCACTGCCCGAGA
AGGACGAGGACCATGCACTGAGTTACTGGAAGCCCTTCCTGGTCAACA
TGTGCGTGGCTACGGTCCTCACGGCCGGCGCTTACCTCTGCTACAGGT
TCCTGTTCAACAGCAACACATAG
TC- H. ATGCCCACCACCATCGAGCGGGAGTTCGAAGAGTTGGATACTCAGCGT SEQ
PTP sapiens CGCTGGCAGCCGCTGTACTTGGAAATTCGAAATGAGTCCCATGACTAT ID
(full) CCTCATAGAGTGGCCAAGTTTCCAGAAAACAGAAATCGAAACAGATA NO:
CAGAGATGTAAGCCCATATGATCACAGTCGTGTTAAACTGCAAAATGC 97
TGAGAATGATTATATTAATGCCAGTTTAGTTGACATAGAAGAGGCACA
AAGGAGTTACATCTTAACACAGGGTCCACTTCCTAACACATGCTGCCA
TTTCTGGCTTATGGTTTGGCAGCAGAAGACCAAAGCAGTTGTCATGCT
GAACCGCATTGTGGAGAAAGAATCGGTTAAATGTGCACAGTACTGGC
CAACAGATGACCAAGAGATGCTGTTTAAAGAAACAGGATTCAGTGTG
AAGCTCTTGTCAGAAGATGTGAAGTCGTATTATACAGTACATCTACTA
CAATTAGAAAATATCAATAGTGGTGAAACCAGAACAATATCTCACTTT
CATTATACTACCTGGCCAGATTTTGGAGTCCCTGAATCACCAGCTTCAT
TTCTCAATTTCTTGTTTAAAGTGAGAGAATCTGGCTCCTTGAACCCTGA
CCATGGGCCTGCGGTGATCCACTGTAGTGCAGGCATTGGGCGCTCTGG
CACCTTCTCTCTGGTAGACACTTGTCTTGTTTTGATGGAAAAAGGAGAT
GATATTAACATAAAACAAGTGTTACTGAACATGAGAAAATACCGAAT
GGGTCTTATTCAGACCCCAGATCAACTGAGATTCTCATACATGGCTAT
AATAGAAGGAGCAAAATGTATAAAGGGAGATTCTAGTATACAGAAAC
GATGGAAAGAACTTTCTAAGGAAGACTTATCTCCTGCCTTTGATCATT
CACCAAACAAAATAATGACTGAAAAATACAATGGGAACAGA
PTPN5 H. sapiens ATGTCTTCTGGTGTAGATCTGGGTACCGAGAACCTGTACTTCCAATCC SEQ
ATGTCCCGTGTCCTCCAAGCAGAAGAGCTTCATGAAAAGGCCCTGGAC ID
CCTTTCCTGCTGCAGGCGGAATTCTTTGAAATCCCCATGAACTTTGTGG NO:
ATCCGAAAGAGTACGACATCCCTGGGCTGGTGCGGAAGAACCGGTAC 98
AAAACCATACTTCCCAACCCTCACAGCAGAGTGTGTCTGACCTCACCA
GACCCTGACGACCCTCTGAGTTCCTACATCAATGCCAACTACATCCGG
GGCTATGGTGGGGAGGAGAAGGTGTACATCGCCACTCAGGGACCCAT
CGTCAGCACGGTCGCCGACTTCTGGCGCATGGTGTGGCAGGAGCACAC
GCCCATCATTGTCATGATCACCAACATCGAGGAGATGAACGAGAAAT
GCACCGAGTATTGGCCGGAGGAGCAGGTGGCGTACGACGGTGTTGAG
ATCACTGTGCAGAAAGTCATTCACACGGAGGATTACCGGCTGCGACTC
ATCTCCCTCAAGAGTGGGACTGAGGAGCGAGGCCTGAAGCATTACTG
GTTCACATCCTGGCCCGACCAGAAGACCCCAGACCGGGCCCCCCCACT
CCTGCACCTGGTGCGGGAGGTGGAGGAGGCAGCCCAGCAGGAGGGGC
CCCACTGTGCCCCCATCATCGTCCACTGCAGTGCAGGGATTGGGAGGA
CCGGCTGCTTCATTGCCACCAGCATCTGCTGCCAGCAGCTGCGGCAGG
AGGGTGTAGTGGACATCCTGAAGACCACGTGCCAGCTCCGTCAGGAC
AGGGGCGGCATGATCCAGACATGCGAGCAGTACCAGTTTGTGCACCA
CGTCATGAGCCTCTACGAAAAGCAGCTGTCCCACCAGTCCTGA
PTPN6 H. sapiens ATGGTGAGGTGGTTTCACCGAGACCTCAGTGGGCTGGATGCAGAGACC SEQ
CTGCTCAAGGGCCGAGGTGTCCACGGTAGCTTCCTGGCTCGGCCCAGT ID
CGCAAGAACCAGGGTGACTTCTCGCTCTCCGTCAGGGTGGGGGATCAG NO:
GTGACCCATATTCGGATCCAGAACTCAGGGGATTTCTATGACCTGTAT 99
GGAGGGGAGAAGTTTGCGACTCTGACAGAGCTGGTGGAGTACTACAC
TCAGCAGCAGGGTGTGGTGCAGGACCGCGACGGCACCATCATCCACCT
CAAGTACCCGCTGAACTGCTCCGATCCCACTAGTGAGAGGTGGTACCA
TGGCCACATGTCTGGCGGGCAGGCAGAGACGCTGCTGCAGGCCAAGG
GCGAGCCCTGGACGTTTCTTGTGCGTGAGAGCCTCAGCCAGCCTGGAG
ACTTCGTGCTTTCTGTGCTCAGTGACCAGCCCAAGGCTGGCCCAGGCT
CCCCGCTCAGGGTCACCCACATCAAGGTCATGTGCGAGGGTGGACGCT
ACACAGTGGGTGGTTTGGAGACCTTCGACAGCCTCACGGACCTGGTGG
AGCATTTCAAGAAGACGGGGATTGAGGAGGCCTCAGGCGCCTTTGTCT
ACCTGCGGCAGCCGTACTATGCCACGAGGGTGAATGCGGCTGACATTG
AGAACCGAGTGTTGGAACTGAACAAGAAGCAGGAGTCCGAGGATACA
GCCAAGGCTGGCTTCTGGGAGGAGTTTGAGAGTTTGCAGAAGCAGGA
GGTGAAGAACTTGCACCAGCGTCTGGAAGGGCAACGGCCAGAGAACA
AGGGCAAGAACCGCTACAAGAACATTCTCCCCTTTGACCACAGCCGAG
TGATCCTGCAGGGACGGGACAGTAACATCCCCGGGTCCGACTACATCA
ATGCCAACTACATCAAGAACCAGCTGCTAGGCCCTGATGAGAACGCTA
AGACCTACATCGCCAGCCAGGGCTGTCTGGAGGCCACGGTCAATGACT
TCTGGCAGATGGCGTGGCAGGAGAACAGCCGTGTCATCGTCATGACCA
CCCGAGAGGTGGAGAAAGGCCGGAACAAATGCGTCCCATACTGGCCC
GAGGTGGGCATGCAGCGTGCTTATGGGCCCTACTCTGTGACCAACTGC
GGGGAGCATGACACAACCGAATACAAACTCCGTACCTTACAGGTCTCC
CCGCTGGACAATGGAGACCTGATTCGGGAGATCTGGCATTACCAGTAC
CTGAGCTGGCCCGACCATGGGGTCCCCAGTGAGCCTGGGGGTGTCCTC
AGCTTCCTGGACCAGATCAACCAGCGGCAGGAAAGTCTGCCTCACGCA
GGGCCCATCATCGTGCACTGCAGCGCCGGCATCGGCCGCACAGGCACC
ATCATTGTCATCGACATGCTCATGGAGAACATCTCCACCAAGGGCCTG
GACTGTGACATTGACATCCAGAAGACCATCCAGATGGTGCGGGCGCA
GCGCTCGGGCATGGTGCAGACGGAGGCGCAGTACAAGTTCATCTACGT
GGCCATCGCCCAGTTCATTGAAACCACTAAGAAGAAGCTGGAGGTCCT
GCAGTCGCAGAAGGGCCAGGAGTCGGAGTACGGGAACATCACCTATC
CCCCAGCCATGAAGAATGCCCATGCCAAGGCCTCCCGCACCTCGTCCA
AACACAAGGAGGATGTGTATGAGAACCTGCACACTAAGAACAAGAGG
GAGGAGAAAGTGAAGAAGCAGCGGTCAGCAGACAAGGAGAAGAGCA
AGGGTTCCCTCAAGAGGAAGTGA
PTPN11 H. sapiens ATGACATCGCGGAGATGGTTTCACCCAAATATCACTGGTGTGGAGGCA SEQ
GAAAACCTACTGTTGACAAGAGGAGTTGATGGCAGTTTTTTGGCAAGG ID
CCTAGTAAAAGTAACCCTGGAGACTTCACACTTTCCGTTAGAAGAAAT NO:
GGAGCTGTCACCCACATCAAGATTCAGAACACTGGTGATTACTATGAC 100
CTGTATGGAGGGGAGAAATTTGCCACTTTGGCTGAGTTGGTCCAGTAT
TACATGGAACATCACGGGCAATTAAAAGAGAAGAATGGAGATGTCAT
TGAGCTTAAATATCCTCTGAACTGTGCAGATCCTACCTCTGAAAGGTG
GTTTCATGGACATCTCTCTGGGAAAGAAGCAGAGAAATTATTAACTGA
AAAAGGAAAACATGGTAGTTTTCTTGTACGAGAGAGCCAGAGCCACC
CTGGAGATTTTGTTCTTTCTGTGCGCACTGGTGATGACAAAGGGGAGA
GCAATGACGGCAAGTCTAAAGTGACCCATGTTATGATTCGCTGTCAGG
AACTGAAATACGACGTTGGTGGAGGAGAACGGTTTGATTCTTTGACAG
ATCTTGTGGAACATTATAAGAAGAATCCTATGGTGGAAACATTGGGTA
CAGTACTACAACTCAAGCAGCCCCTTAACACGACTCGTATAAATGCTG
CTGAAATAGAAAGCAGAGTTCGAGAACTAAGCAAATTAGCTGAGACC
ACAGATAAAGTCAAACAAGGCTTTTGGGAAGAATTTGAGACACTACA
ACAACAGGAGTGCAAACTTCTCTACAGCCGAAAAGAGGGTCAAAGGC
AAGAAAACAAAAACAAAAATAGATATAAAAACATCCTGCCCTTTGAT
CATACCAGGGTTGTCCTACACGATGGTGATCCCAATGAGCCTGTTTCA
GATTACATCAATGCAAATATCATCATGCCTGAATTTGAAACCAAGTGC
AACAATTCAAAGCCCAAAAAGAGTTACATTGCCACACAAGGCTGCCT
GCAAAACACGGTGAATGACTTTTGGCGGATGGTGTTCCAAGAAAACTC
CCGAGTGATTGTCATGACAACGAAAGAAGTGGAGAGAGGAAAGAGTA
AATGTGTCAAATACTGGCCTGATGAGTATGCTCTAAAAGAATATGGCG
TCATGCGTGTTAGGAACGTCAAAGAAAGCGCCGCTCATGACTATACGC
TAAGAGAACTTAAACTTTCAAAGGTTGGACAAGGGAATACGGAGAGA
ACGGTCTGGCAATACCACTTTCGGACCTGGCCGGACCACGGCGTGCCC
AGCGACCCTGGGGGCGTGCTGGACTTCCTGGAGGAGGTGCACCATAA
GCAGGAGAGCATCATGGATGCAGGGCCGGTCGTGGTGCACTGCAGTG
CTGGAATTGGCCGGACAGGGACGTTCATTGTGATTGATATTCTTATTG
ACATCATCAGAGAGAAAGGTGTTGACTGCGATATTGACGTTCCCAAAA
CCATCCAGATGGTGCGGTCTCAGAGGTCAGGGATGGTCCAGACAGAA
GCACAGTACCGATTTATCTATATGGCGGTCCAGCATTATATTGAAACA
CTACAGCGCAGGATTGAAGAAGAGCAGAAAAGCAAGAGGAAAGGGC
ACGAATATACAAATATTAAGTATTCTCTAGCGGACCAGACGAGTGGAG
ATCAGAGCCCTCTCCCGCCTTGTACTCCAACGCCACCCTGTGCAGAAA
TGAGAGAAGACAGTGCTAGAGTCTATGAAAACGTGGGCCTGATGCAA
CAGCAGAAAAGTTTCAGATGA
PTPN12 H. sapiens ATGGAGCAAGTGGAGATCCTGAGGAAATTCATCCAGAGGGTCCAGGC SEQ
CATGAAGAGTCCTGACCACAATGGGGAGGACAACTTCGCCCGGGACT ID
TCATGCGGTTAAGAAGATTGTCTACCAAATATAGAACAGAAAAGATAT NO:
ATCCCACAGCCACTGGAGAAAAAGAAGAAAATGTTAAAAAGAACAGA 101
TACAAGGACATACTGCCATTTGATCACAGCCGAGTTAAATTGACATTA
AAGACTCCTTCACAAGATTCAGACTATATCAATGCAAATTTTATAAAG
GGCGTCTATGGGCCAAAAGCATATGTAGCAACTCAAGGACCTTTAGCA
AATACAGTAATAGATTTTTGGAGGATGGTATGGGAGTATAATGTTGTG
ATCATTGTAATGGCCTGCCGAGAATTTGAGATGGGAAGGAAAAAATG
TGAGCGCTATTGGCCTTTGTATGGAGAAGACCCCATAACGTTTGCACC
ATTTAAAATTTCTTGTGAGGATGAACAAGCAAGAACAGACTACTTCAT
CAGGACACTCTTACTTGAATTTCAAAATGAATCTCGTAGGCTGTATCA
GTTTCATTATGTGAACTGGCCAGACCATGATGTTCCTTCATCATTTGAT
TCTATTCTGGACATGATAAGCTTAATGAGGAAATATCAAGAACATGAA
GATGTTCCTATTTGTATTCATTGCAGTGCAGGCTGTGGAAGAACAGGT
GCCATTTGTGCCATAGATTATACGTGGAATTTACTAAAAGCTGGGAAA
ATACCAGAGGAATTTAATGTATTTAATTTAATACAAGAAATGAGAACA
CAAAGGCATTCTGCAGTACAAACAAAGGAGCAATATGAACTTGTTCAT
AGAGCTATTGCCCAACTGTTTGAAAAACAGCTACAACTATATGAAATT
CATGGAGCTCAGAAAATTGCTGATGGAGTGAATGAAATTAACACTGA
AAACATGGTCAGCTCCATAGAGCCTGAAAAACAAGATTCTCCTCCTCC
AAAACCACCAAGGACCCGCAGTTGCCTTGTTGAAGGGGATGCTAAAG
AAGAAATACTGCAGCCACCGGAACCTCATCCAGTGCCACCCATCTTGA
CACCTTCTCCCCCTTCAGCTTTTCCAACAGTCACTACTGTGTGGCAGGA
CAATGATAGATACCATCCAAAGCCAGTGTTGCAATGGTTTCATCAGAA
CAACATTCAGCAGACCTCAACAGAAACTATAGTAAATCAACAGAACTT
CCAGGGAAAAATGAATCAACAATTGAACAGA
PTPN22 H. sapiens ATGGACCAAAGAGAAATTCTGCAGAAGTTCCTGGATGAGGCCCAAAG SEQ
CAAGAAAATTACTAAAGAGGAGTTTGCCAATGAATTTCTGAAGCTGAA ID
AAGGCAATCTACCAAGTACAAGGCAGACAAAACCTATCCTACAACTG NO:
TGGCTGAGAAGCCCAAGAATATCAAGAAAAACAGATATAAGGATATT 102
TTGCCCTATGATTATAGCCGGGTAGAACTATCCCTGATAACCTCTGAT
GAGGATTCCAGCTACATCAATGCCAACTTCATTAAGGGAGTTTATGGA
CCCAAGGCTTATATTGCCACCCAGGGTCCTTTATCTACAACCCTCCTGG
ACTTCTGGAGGATGATTTGGGAATATAGTGTCCTTATCATTGTTATGGC
ATGCATGGAGTATGAAATGGGAAAGAAAAAGTGTGAGCGCTACTGGG
CTGAGCCAGGAGAGATGCAGCTGGAATTTGGCCCTTTCTCTGTATCCT
GTGAAGCTGAAAAAAGGAAATCTGATTATATAATCAGGACTCTAAAA
GTTAAGTTCAATAGTGAAACTCGAACTATCTACCAGTTTCATTACAAG
AATTGGCCAGACCATGATGTACCTTCATCTATAGACCCTATTCTTGAGC
TCATCTGGGATGTACGTTGTTACCAAGAGGATGACAGTGTTCCCATAT
GCATTCACTGCAGTGCTGGCTGTGGAAGGACTGGTGTTATTTGTGCTA
TTGATTATACATGGATGTTGCTAAAAGATGGGATAATTCCTGAGAACT
TCAGTGTTTTCAGTTTGATCCGGGAAATGCGGACACAGAGGCCTTCAT
TAGTTCAAACGCAGGAACAATATGAACTGGTCTACAATGCTGTATTAG
AACTATTTAAGAGACAGATGGATGTTATCAGAGATAA
GalK Escherichia ATGAGTCTGAAAGAAAAAACACAATCTCTGTTTGCCAACGCATTTGGC SEQ
coli TACCCTGCCACTCACACCATTCAGGCGCCTGGCCGCGTGAATTTGATT ID
GGTGAACACACCGACTACAACGACGGTTTCGTTCTGCCCTGCGCGATT NO:
GATTATCAAACCGTGATCAGTTGTGCACCACGCGATGACCGTAAAGTT 103
CGCGTGATGGCAGCCGATTATGAAAATCAGCTCGACGAGTTTTCCCTC
GATGCGCCCATTGTCGCACATGAAAACTATCAATGGGCTAACTACGTT
CGTGGCGTGGTGAAACATCTGCAACTGCGTAACAACAGCTTCGGCGGC
GTGGACATGGTGATCAGCGGCAATGTGCCGCAGGGTGCCGGGTTAAG
TTCTTCCGCTTCACTGGAAGTCGCGGTCGGAACCGTATTGCAGCAGCT
TTATCATCTGCCGCTGGACGGCGCACAAATCGCGCTTAACGGTCAGGA
AGCAGAAAACCAGTTTGTAGGCTGTAACTGCGGGATCATGGATCAGCT
AATTTCCGCGCTCGGCAAGAAAGATCATGCCTTGCTGATCGATTGCCG
CTCACTGGGGACCAAAGCAGTTTCCATGCCCAAAGGTGTGGCTGTCGT
CATCATCAACAGTAACTTCAAACGTACCCTGGTTGGCAGCGAATACAA
CACCCGTCGTGAACAGTGCGAAACCGGTGCGCGTTTCTTCCAGCAGCC
AGCCCTGCGTGATGTCACCATTGAAGAGTTCAACGCTGTTGCGCATGA
ACTGGACCCGATCGTGGCAAAACGCGTGCGTCATATACTGACTGAAAA
CGCCCGCACCGTTGAAGCTGCCAGCGCGCTGGAGCAAGGCGACCTGA
AACGTATGGGCGAGTTGATGGCGGAGTCTCATGCCTCTATGCGCGATG
ATTTCGAAATCACCGTGCCGCAAATTGACACTCTGGTAGAAATCGTCA
AAGCTGTGATTGGCGACAAAGGTGGCGTACGCATGACCGGCGGCGGA
TTTGGCGGCTGTATCGTCGCGCTGATCCCGGAAGAGCTGGTGCCTGCC
GTACAGCAAGCTGTCGCTGAACAATATGAAGCAAAAACAGGTATTAA
AGAGACTTTTTACGTTTGTAAACCATCACAAGGAGCAGGACAGTGCTG
A
SacB Bacillus ATGAACATCAAAAAGTTTGCAAAACAAGCAACAGTATTAACCTTTACT SEQ
subtilis ACCGCACTGCTGGCAGGAGGCGCAACTCAAGCGTTTGCGAAAGAAAC ID
GAACCAAAAGCCATATAAGGAAACATACGGCATTTCCCATATTACACG NO:
CCATGATATGCTGCAAATCCCTGAACAGCAAAAAAATGAAAAATATC 104
AAGTTCCTGAATTCGATTCGTCCACAATTAAAAATATCTCTTCTGCAAA
AGGCCTGGACGTTTGGGACAGCTGGCCATTACAAAACGCTGACGGCA
CTGTCGCAAACTATCACGGCTACCACATCGTCTTTGCATTAGCCGGAG
ATCCTAAAAATGCGGATGACACATCGATTTACATGTTCTATCAAAAAG
TCGGCGAAACTTCTATTGACAGCTGGAAAAACGCTGGCCGCGTCTTTA
AAGACAGCGACAAATTCGATGCAAATGATTCTATCCTAAAAGACCAA
ACACAAGAATGGTCAGGTTCAGCCACATTTACATCTGACGGAAAAATC
CGTTTATTCTACACTGATTTCTCCGGTAAACATTACGGCAAACAAACA
CTGACAACTGCACAAGTTAACGTATCAGCATCAGACAGCTCTTTGAAC
ATCAACGGTGTAGAGGATTATAAATCAATCTTTGACGGTGACGGAAAA
ACGTATCAAAATGTACAGCAGTTCATCGATGAAGGCAACTACAGCTCA
GGCGACAACCATACGCTGAGAGATCCTCACTACGTAGAAGATAAAGG
CCACAAATACTTAGTATTTGAAGCAAACACTGGAACTGAAGATGGCTA
CCAAGGCGAAGAATCTTTATTTAACAAAGCATACTATGGCAAAAGCAC
ATCATTCTTCCGTCAAGAAAGTCAAAAACTTCTGCAAAGCGATAAAAA
ACGCACGGCTGAGTTAGCAAACGGCGCTCTCGGTATGATTGAGCTAAA
CGATGATTACACACTGAAAAAAGTGATGAAACCGCTGATTGCATCTAA
CACAGTAACAGATGAAATTGAACGCGCGAACGTCTTTAAAATGAACG
GCAAATGGTACCTGTTCACTGACTCCCGCGGATCAAAAATGACGATTG
ACGGCATTACGTCTAACGATATTTACATGCTTGGTTATGTTTCTAATTC
TTTAACTGGCCCATACAAGCCGCTGAACAAAACTGGCCTTGTGTTAAA
AATGGATCTTGATCCTAACGATGTAACCTTTACTTACTCACACTTCGCT
GTACCTCAAGCGAAAGGAAACAATGTCGTGATTACAAGCTATATGAC
AAACAGAGGATTCTACGCAGACAAACAATCAACGTTTGCGCCAAGCTT
CCTGCTGAACATCAAAGGCAAGAAAACATCTGTTGTCAAAGACAGCA
TCCTTGAACAAGGACAATTAACAGTTAACAAATAA
ABBREVIATIONS
[0540] PTP1B, protein tyrosine phosphatase 1B; TC-PTP, T-cell protein
tyrosine phosphatase; SHP2, protein tyrosine phosphatase non-receptor
type 11; BBR,
3-(3,5-Dibromo-4-hydroxy-benzoyl)-2-ethyl-benzofuran-6-sulfonic
acid-(4-(thiazol-2-ylsulfamyl)-phenyl)-amide; TCS401,
2-[(Carboxycarbonyl)amino]-4,5,6,7-tetrahydro-thieno[2,3-c]pyridine-3-carboxylic
acid hydrochloride; AA, abietic acid; SCA, statistical coupling
analysis; PTP1B.sub.1-435, protein tyrosine phosphatase 1B (full-length);
SacB, levansucrase; GHS, .gamma.-humulene synthase; ADS, amorphadiene
synthase; ABS (or AgAs), abietadiene synthase; TXS, taxadiene synthase;
PTPN5, protein tyrosine phosphatase non-receptor type 5; PTPN6, protein
tyrosine phosphatase non-receptor type 6; PTPN11, protein tyrosine
phosphatase non-receptor type 11; PTPN12, protein tyrosine phosphatase
non-receptor type 12; PTPN22, protein tyrosine phosphatase non-receptor
type 22; RpoZ, omega subunit of RNA polymerase; cI (or c1434), cI
repressor protein from lambda phage; Kras (or p130cas), p130cas
phosphotyrosine substrate; MidT, phosphotyrosine substrate from hamster
polyoma virus; EGFR substrate, phosphotyrosine substrate from epidermal
growth factor receptor; Src, Src kinase; CDC37, Hsp90 co-chaperone
Cdc37; MBP, maltose-binding protein; LuxAB, bacterial luciferase modules
A and B; SpecR, spectinomycin resistance gene; GGPPS, geranylgeranyl
diphosphate synthase; P450 (or P450.sub.BM3), cytochrome P450; LOV2,
light-oxygen-voltage domain 2 from phototropin 1; BphP1, bacterial
phytochrome; GalK, galactokinase.
Examples
[0541] The following examples are offered to illustrate various
embodiments of the invention, but should not be viewed as limiting the
scope of the invention.
Statistical Analysis of Kinetic Models. We evaluated four kinetic models
of inhibition as described previously (19). In brief, we used an F-test
to compare a two-parameter mixed model to several single-parameter
models, and we used Akaike's Information Criterion (AIC, or
.DELTA..sub.i) to compare the single-parameter models to one another.
Mixed models with p<0.05 are superior to all single-parameter models,
and single-parameter models with .DELTA..sub.i>10 are inferior to the
reference (i.e., "best fit") model.
Exemplary Estimation of IC50. We estimated the half maximal inhibitory
concentration (IC50) of BBR by using kinetic models to estimate the
concentration of inhibitor required to reduce initial rates of
PTP-catalyzed hydrolysis of 20 mM of pNPP by 50%, and we used the MATLAB
function "nlparci" to determine the confidence intervals on those
estimates (19).
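By way of non-limiting illustration, the model comparison and IC50
estimate described above may be sketched in Python as follows. The
sketch is not the procedure of reference (19). It assumes least-squares
fits with Gaussian errors (so that AIC may be written in terms of the
residual sum of squares), uses a competitive inhibition rate law as a
stand-in kinetic model, and locates the IC50 by bisection. All function
names and parameter values are hypothetical and are not taken from the
specification.

```python
import math


def f_statistic(rss_single, k_single, rss_mixed, k_mixed, n_points):
    """F statistic for a nested comparison of a single-parameter model
    against the mixed model, from their residual sums of squares (RSS)."""
    df_num = k_mixed - k_single        # extra parameters in the mixed model
    df_den = n_points - k_mixed        # residual degrees of freedom
    return ((rss_single - rss_mixed) / df_num) / (rss_mixed / df_den)


def aic_least_squares(rss, k, n_points):
    """AIC for a least-squares fit (assumed form: n*ln(RSS/n) + 2k)."""
    return n_points * math.log(rss / n_points) + 2 * k


def aic_deltas(aic_by_model):
    """Difference between each model's AIC and the lowest ("best fit") AIC."""
    best = min(aic_by_model.values())
    return {name: aic - best for name, aic in aic_by_model.items()}


def competitive_rate(s, i, vmax, km, ki):
    """Initial rate under competitive inhibition (illustrative model only)."""
    return vmax * s / (km * (1.0 + i / ki) + s)


def ic50_by_bisection(rate, s, upper=1e6, steps=200):
    """Inhibitor concentration at which rate(s, i) is half of rate(s, 0).

    Assumes the rate decreases monotonically with inhibitor concentration
    and that `upper` is large enough to inhibit by more than 50%.
    """
    target = 0.5 * rate(s, 0.0)
    lo, hi = 0.0, upper
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if rate(s, mid) > target:
            lo = mid                   # still less than 50% inhibited
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical fit results: 12 data points, RSS values chosen for clarity.
    f_value = f_statistic(rss_single=10.0, k_single=1,
                          rss_mixed=5.0, k_mixed=2, n_points=12)
    print("F =", f_value)
    # A p-value would follow from the F distribution with (1, 10) degrees
    # of freedom, e.g. scipy.stats.f.sf(f_value, 1, 10) if SciPy is available.

    aics = {name: aic_least_squares(rss, 1, 12)
            for name, rss in {"competitive": 6.0, "uncompetitive": 30.0}.items()}
    print(aic_deltas(aics))

    # Hypothetical constants in arbitrary units, with substrate at 20.
    def model(s, i):
        return competitive_rate(s, i, vmax=1.0, km=1.0, ki=2.0)

    print("IC50 =", ic50_by_bisection(model, s=20.0))
```

For the competitive model the bisection result can be checked against
the closed form IC50 = K.sub.i(1 + [S]/K.sub.m). Confidence intervals on
the estimate, which the source obtains with "nlparci", are outside the
scope of this sketch.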
[0542] All publications and patents mentioned in the above specification
are herein incorporated by reference. Various modifications and
variations of the described methods and system of the invention will be
apparent to those skilled in the art without departing from the scope and
spirit of the invention. Although the invention has been described in
connection with specific preferred embodiments, it should be understood
that the invention as claimed should not be unduly limited to such
specific embodiments. Indeed, various modifications of the described
modes for carrying out the invention that are obvious to those skilled in
medicine, molecular biology, cell biology, genetics, statistics or
related fields are intended to be within the scope of the following
claims.
Sequence CWU
1
1
104127PRTArtificial SequenceSynthetic Polypeptide 1Met Gly Asp Ser Ser Val
Gln Asp Gln Trp Lys Glu Leu Ser His Glu1 5
10 15Asp Leu Glu Pro Pro Pro Glu His Ile Pro Pro
20 25227PRTArtificial SequenceSynthetic Polypeptide
2Glu Ser Phe Asp Asp Glu Leu Arg Arg Lys Glu Met Arg Arg Gly Ile1
5 10 15Asp Leu Ala Thr Thr Leu
Glu Arg Ile Glu Lys 20 253298PRTArtificial
SequenceSynthetic Polypeptide 3Met Glu Met Glu Lys Glu Phe Glu Gln Ile
Asp Lys Ser Gly Ser Trp1 5 10
15Ala Ala Ile Tyr Gln Asp Ile Arg His Glu Ala Ser Asp Phe Pro Cys
20 25 30Arg Val Ala Lys Leu Pro
Lys Asn Lys Asn Arg Asn Arg Tyr Arg Asp 35 40
45Val Ser Pro Phe Asp His Ser Arg Ile Lys Leu His Gln Glu
Asp Asn 50 55 60Asp Tyr Ile Asn Ala
Ser Leu Ile Lys Met Glu Glu Ala Gln Arg Ser65 70
75 80Tyr Ile Leu Thr Gln Gly Pro Leu Pro Asn
Thr Cys Gly His Phe Trp 85 90
95Glu Met Val Trp Glu Gln Lys Ser Arg Gly Val Val Met Leu Asn Arg
100 105 110Val Met Glu Lys Gly
Ser Leu Lys Cys Ala Gln Tyr Trp Pro Gln Lys 115
120 125Glu Glu Lys Glu Met Ile Phe Glu Asp Thr Asn Leu
Lys Leu Thr Leu 130 135 140Ile Ser Glu
Asp Ile Lys Ser Tyr Tyr Thr Val Arg Gln Leu Glu Leu145
150 155 160Glu Asn Leu Thr Thr Gln Glu
Thr Arg Glu Ile Leu His Phe His Tyr 165
170 175Thr Thr Trp Pro Asp Phe Gly Val Pro Glu Ser Pro
Ala Ser Phe Leu 180 185 190Asn
Phe Leu Phe Lys Val Arg Glu Ser Gly Ser Leu Ser Pro Glu His 195
200 205Gly Pro Val Val Val His Cys Ser Ala
Gly Ile Gly Arg Ser Gly Thr 210 215
220Phe Cys Leu Ala Asp Thr Cys Leu Leu Leu Met Asp Lys Arg Lys Asp225
230 235 240Pro Ser Ser Val
Asp Ile Lys Lys Val Leu Leu Glu Met Arg Lys Phe 245
250 255Arg Met Gly Leu Ile Gln Thr Ala Asp Gln
Leu Arg Phe Ser Tyr Leu 260 265
270Ala Val Ile Glu Gly Ala Lys Phe Ile Met Gly Asp Ser Ser Val Gln
275 280 285Asp Gln Trp Lys Glu Leu Ser
His Glu Asp 290 2954303PRTArtificial SequenceSynthetic
Polypeptide 4Leu Glu Leu Asn Lys Lys Gln Glu Ser Glu Asp Thr Ala Lys Ala
Gly1 5 10 15Phe Trp Glu
Glu Phe Glu Ser Leu Gln Lys Gln Glu Val Lys Asn Leu 20
25 30His Gln Arg Leu Glu Gly Gln Arg Pro Glu
Asn Lys Gly Lys Asn Arg 35 40
45Tyr Lys Asn Ile Leu Pro Phe Asp His Ser Arg Val Ile Leu Gln Gly 50
55 60Arg Asp Ser Asn Ile Pro Gly Ser Asp
Tyr Ile Asn Ala Asn Tyr Ile65 70 75
80Lys Asn Gln Leu Leu Gly Pro Asp Glu Asn Ala Lys Thr Tyr
Ile Ala 85 90 95Ser Gln
Gly Cys Leu Glu Ala Thr Val Asn Asp Phe Trp Gln Met Ala 100
105 110Trp Gln Glu Asn Ser Arg Val Ile Val
Met Thr Thr Arg Glu Val Glu 115 120
125Lys Gly Arg Asn Lys Cys Val Pro Tyr Trp Pro Glu Val Gly Met Gln
130 135 140Arg Ala Tyr Gly Pro Tyr Ser
Val Thr Asn Cys Gly Glu His Asp Thr145 150
155 160Thr Glu Tyr Lys Leu Arg Thr Leu Gln Val Ser Pro
Leu Asp Asn Gly 165 170
175Asp Leu Ile Arg Glu Ile Trp His Tyr Gln Tyr Leu Ser Trp Pro Asp
180 185 190His Gly Val Pro Ser Glu
Pro Gly Gly Val Leu Ser Phe Leu Asp Gln 195 200
205Ile Asn Gln Arg Gln Glu Ser Leu Pro His Ala Gly Pro Ile
Ile Val 210 215 220His Cys Ser Ala Gly
Ile Gly Arg Thr Gly Thr Ile Ile Val Ile Asp225 230
235 240Met Leu Met Glu Asn Ile Ser Thr Lys Gly
Leu Asp Cys Asp Ile Asp 245 250
255Ile Gln Lys Thr Ile Gln Met Val Arg Ala Gln Arg Ser Gly Met Val
260 265 270Gln Thr Glu Ala Gln
Tyr Lys Phe Ile Tyr Val Ala Ile Ala Gln Phe 275
280 285Ile Glu Thr Thr Lys Lys Lys Leu Glu Val Leu Gln
Ser Gln Lys 290 295
300510PRTArtificial SequenceSynthetic Polypeptide 5Leu Ser His Glu Asp
Leu Ala Thr Thr Leu1 5 1065PRTArtificial
SequenceSynthetic Polypeptide 6Leu Ser His Glu Asp1
575PRTArtificial SequenceSynthetic Polypeptide 7Leu Ala Thr Thr Leu1
589PRTArtificial SequenceSynthetic Polypeptide 8Leu Ser His Glu
Asp Ala Thr Thr Leu1 598PRTArtificial SequenceSynthetic
Polypeptide 9Leu Ser His Glu Asp Thr Thr Leu1
51013PRTArtificial SequenceSynthetic
PolypeptideMOD_RES(7)..(7)phosphorylated 10Glu Thr Gly Thr Glu Glu Tyr
Met Lys Met Asp Leu Gly1 5
101113PRTArtificial SequenceSynthetic
PolypeptideMOD_RES(9)..(9)phosphorylated 11Arg Arg Leu Ile Glu Asp Ala
Glu Tyr Ala Ala Arg Gly1 5
10121347DNAArtificial SequenceSynthetic Polynucleotide 12atggagatgg
aaaaggagtt cgagcagatc gacaagtccg ggagctgggc ggccatttac 60caggatatcc
gacatgaagc cagtgacttc ccatgtagag tggccaagct tcctaagaac 120aaaaaccgaa
ataggtacag agacgtcagt ccctttgacc atagtcggat taaactacat 180caagaagata
atgactatat caacgctagt ttgataaaaa tggaagaagc ccaaaggagt 240tacattctta
cccagggccc tttgcctaac acatgcggtc acttttggga gatggtgtgg 300gagcagaaaa
gcaggggtgt cgtcatgctc aacagagtga tggagaaagg ttcgttaaaa 360tgcgcacaat
actggccaca aaaagaagaa aaagagatga tctttgaaga cacaaatttg 420aaattaacat
tgatctctga agatatcaag tcatattata cagtgcgaca gctagaattg 480gaaaacctta
caacccaaga aactcgagag atcttacatt tccactatac cacatggcct 540gactttggag
tccctgaatc accagcctca ttcttgaact ttcttttcaa agtccgagag 600tcagggtcac
tcagcccgga gcacgggccc gttgtggtgc actgcagtgc aggcatcggc 660aggtctggaa
ccttctgtct ggctgatacc tgcctcttgc tgatggacaa gaggaaagac 720ccttcttccg
ttgatatcaa gaaagtgctg ttagaaatga ggaagtttcg gatggggctg 780atccagacag
ccgaccagct gcgcttctcc tacctggctg tgatcgaagg tgccaaattc 840atcatggggg
actcttccgt gcaggatcag tggaaggagc tttcccacga ggacgctgct 900acacttgaac
gtattgagaa gaactttgtc attactgacc caaggttgcc agataatccc 960attatattcg
cgtccgatag tttcttgcag ttgacagaat atagccgtga agaaattttg 1020ggaagaaact
gcaggtttct acaaggtcct gaaactgatc gcgcgacagt gagaaaaatt 1080agagatgcca
tagataacca aacagaggtc actgttcagc tgattaatta tacaaagagt 1140ggtaaaaagt
tctggaacct ctttcacttg cagcctatgc gagatcagaa gggagatgtc 1200cagtacttta
ttggggttca gttggatgga actgagcatg tccgagatgc tgccgagaga 1260gagggagtca
tgctgattaa gaaaactgca gaaaatattg atgaggcggc aaaagaactt 1320ctcgagcacc
accaccacca ccactga
134713448PRTArtificial SequenceSynthetic Polypeptide 13Met Glu Met Glu
Lys Glu Phe Glu Gln Ile Asp Lys Ser Gly Ser Trp1 5
10 15Ala Ala Ile Tyr Gln Asp Ile Arg His Glu
Ala Ser Asp Phe Pro Cys 20 25
30Arg Val Ala Lys Leu Pro Lys Asn Lys Asn Arg Asn Arg Tyr Arg Asp
35 40 45Val Ser Pro Phe Asp His Ser Arg
Ile Lys Leu His Gln Glu Asp Asn 50 55
60Asp Tyr Ile Asn Ala Ser Leu Ile Lys Met Glu Glu Ala Gln Arg Ser65
70 75 80Tyr Ile Leu Thr Gln
Gly Pro Leu Pro Asn Thr Cys Gly His Phe Trp 85
90 95Glu Met Val Trp Glu Gln Lys Ser Arg Gly Val
Val Met Leu Asn Arg 100 105
110Val Met Glu Lys Gly Ser Leu Lys Cys Ala Gln Tyr Trp Pro Gln Lys
115 120 125Glu Glu Lys Glu Met Ile Phe
Glu Asp Thr Asn Leu Lys Leu Thr Leu 130 135
140Ile Ser Glu Asp Ile Lys Ser Tyr Tyr Thr Val Arg Gln Leu Glu
Leu145 150 155 160Glu Asn
Leu Thr Thr Gln Glu Thr Arg Glu Ile Leu His Phe His Tyr
165 170 175Thr Thr Trp Pro Asp Phe Gly
Val Pro Glu Ser Pro Ala Ser Phe Leu 180 185
190Asn Phe Leu Phe Lys Val Arg Glu Ser Gly Ser Leu Ser Pro
Glu His 195 200 205Gly Pro Val Val
Val His Cys Ser Ala Gly Ile Gly Arg Ser Gly Thr 210
215 220Phe Cys Leu Ala Asp Thr Cys Leu Leu Leu Met Asp
Lys Arg Lys Asp225 230 235
240Pro Ser Ser Val Asp Ile Lys Lys Val Leu Leu Glu Met Arg Lys Phe
245 250 255Arg Met Gly Leu Ile
Gln Thr Ala Asp Gln Leu Arg Phe Ser Tyr Leu 260
265 270Ala Val Ile Glu Gly Ala Lys Phe Ile Met Gly Asp
Ser Ser Val Gln 275 280 285Asp Gln
Trp Lys Glu Leu Ser His Glu Asp Ala Ala Thr Leu Glu Arg 290
295 300Ile Glu Lys Asn Phe Val Ile Thr Asp Pro Arg
Leu Pro Asp Asn Pro305 310 315
320Ile Ile Phe Ala Ser Asp Ser Phe Leu Gln Leu Thr Glu Tyr Ser Arg
325 330 335Glu Glu Ile Leu
Gly Arg Asn Cys Arg Phe Leu Gln Gly Pro Glu Thr 340
345 350Asp Arg Ala Thr Val Arg Lys Ile Arg Asp Ala
Ile Asp Asn Gln Thr 355 360 365Glu
Val Thr Val Gln Leu Ile Asn Tyr Thr Lys Ser Gly Lys Lys Phe 370
375 380Trp Asn Leu Phe His Leu Gln Pro Met Arg
Asp Gln Lys Gly Asp Val385 390 395
400Gln Tyr Phe Ile Gly Val Gln Leu Asp Gly Thr Glu His Val Arg
Asp 405 410 415Ala Ala Glu
Arg Glu Gly Val Met Leu Ile Lys Lys Thr Ala Glu Asn 420
425 430Ile Asp Glu Ala Ala Lys Glu Leu Leu Glu
His His His His His His 435 440
445141347DNAArtificial SequenceSynthetic Polynucleotide 14atggagatgg
aaaaggagtt cgagcagatc gacaagtccg ggagctgggc ggccatttac 60caggatatcc
gacatgaagc cagtgacttc ccatgtagag tggccaagct tcctaagaac 120aaaaaccgaa
ataggtacag agacgtcagt ccctttgacc atagtcggat taaactacat 180caagaagata
atgactatat caacgctagt ttgataaaaa tggaagaagc ccaaaggagt 240tacattctta
cccagggccc tttgcctaac acatgcggtc acttttggga gatggtgtgg 300gagcagaaaa
gcaggggtgt cgtcatgctc aacagagtga tggagaaagg ttcgttaaaa 360tgcgcacaat
actggccaca aaaagaagaa aaagagatga tctttgaaga cacaaatttg 420aaattaacat
tgatctctga agatatcaag tcatattata cagtgcgaca gctagaattg 480gaaaacctta
caacccaaga aactcgagag atcttacatt tccactatac cacatggcct 540gactttggag
tccctgaatc accagcctca ttcttgaact ttcttttcaa agtccgagag 600tcagggtcac
tcagcccgga gcacgggccc gttgtggtgc actgcagtgc aggcatcggc 660aggtctggaa
ccttctgtct ggctgatacc tgcctcttgc tgatggacaa gaggaaagac 720ccttcttccg
ttgatatcaa gaaagtgctg ttagaaatga ggaagtttcg gatggggctg 780atccagacag
ccgaccagct gcgcttctcc tacctggctg tgatcgaagg tgccaaattc 840atcatggggg
actctgccgt gcaggatcag tggaaggagc tttcccacga ggacgctact 900acacttgaac
gtattgagaa gaactttgtc attactgacc caaggttgcc agataatccc 960attatattcg
cgtccgatag tttcttgcag ttgacagaat atagccgtga agaaattttg 1020ggaagaaact
gcaggtttct acaaggtcct gaaactgatc gcgcgacagt gagaaaaatt 1080agagatgcca
tagataacca aacagaggtc actgttcagc tgattaatta tacaaagagt 1140ggtaaaaagt
tctggaacct ctttcacttg cagcctatgc gagatcagaa gggagatgtc 1200cagtacttta
ttggggttca gttggatgga actgagcatg tccgagatgc tgccgagaga 1260gagggagtca
tgctgattaa gaaaactgca gaaaatattg atgaggcggc aaaagaactt 1320ctcgagcacc
accaccacca ccactga
134715448PRTArtificial SequenceSynthetic Polypeptide 15Met Glu Met Glu
Lys Glu Phe Glu Gln Ile Asp Lys Ser Gly Ser Trp1 5
10 15Ala Ala Ile Tyr Gln Asp Ile Arg His Glu
Ala Ser Asp Phe Pro Cys 20 25
30Arg Val Ala Lys Leu Pro Lys Asn Lys Asn Arg Asn Arg Tyr Arg Asp
35 40 45Val Ser Pro Phe Asp His Ser Arg
Ile Lys Leu His Gln Glu Asp Asn 50 55
60Asp Tyr Ile Asn Ala Ser Leu Ile Lys Met Glu Glu Ala Gln Arg Ser65
70 75 80Tyr Ile Leu Thr Gln
Gly Pro Leu Pro Asn Thr Cys Gly His Phe Trp 85
90 95Glu Met Val Trp Glu Gln Lys Ser Arg Gly Val
Val Met Leu Asn Arg 100 105
110Val Met Glu Lys Gly Ser Leu Lys Cys Ala Gln Tyr Trp Pro Gln Lys
115 120 125Glu Glu Lys Glu Met Ile Phe
Glu Asp Thr Asn Leu Lys Leu Thr Leu 130 135
140Ile Ser Glu Asp Ile Lys Ser Tyr Tyr Thr Val Arg Gln Leu Glu
Leu145 150 155 160Glu Asn
Leu Thr Thr Gln Glu Thr Arg Glu Ile Leu His Phe His Tyr
165 170 175Thr Thr Trp Pro Asp Phe Gly
Val Pro Glu Ser Pro Ala Ser Phe Leu 180 185
190Asn Phe Leu Phe Lys Val Arg Glu Ser Gly Ser Leu Ser Pro
Glu His 195 200 205Gly Pro Val Val
Val His Cys Ser Ala Gly Ile Gly Arg Ser Gly Thr 210
215 220Phe Cys Leu Ala Asp Thr Cys Leu Leu Leu Met Asp
Lys Arg Lys Asp225 230 235
240Pro Ser Ser Val Asp Ile Lys Lys Val Leu Leu Glu Met Arg Lys Phe
245 250 255Arg Met Gly Leu Ile
Gln Thr Ala Asp Gln Leu Arg Phe Ser Tyr Leu 260
265 270Ala Val Ile Glu Gly Ala Lys Phe Ile Met Gly Asp
Ser Ala Val Gln 275 280 285Asp Gln
Trp Lys Glu Leu Ser His Glu Asp Ala Thr Thr Leu Glu Arg 290
295 300Ile Glu Lys Asn Phe Val Ile Thr Asp Pro Arg
Leu Pro Asp Asn Pro305 310 315
320Ile Ile Phe Ala Ser Asp Ser Phe Leu Gln Leu Thr Glu Tyr Ser Arg
325 330 335Glu Glu Ile Leu
Gly Arg Asn Cys Arg Phe Leu Gln Gly Pro Glu Thr 340
345 350Asp Arg Ala Thr Val Arg Lys Ile Arg Asp Ala
Ile Asp Asn Gln Thr 355 360 365Glu
Val Thr Val Gln Leu Ile Asn Tyr Thr Lys Ser Gly Lys Lys Phe 370
375 380Trp Asn Leu Phe His Leu Gln Pro Met Arg
Asp Gln Lys Gly Asp Val385 390 395
400Gln Tyr Phe Ile Gly Val Gln Leu Asp Gly Thr Glu His Val Arg
Asp 405 410 415Ala Ala Glu
Arg Glu Gly Val Met Leu Ile Lys Lys Thr Ala Glu Asn 420
425 430Ile Asp Glu Ala Ala Lys Glu Leu Leu Glu
His His His His His His 435 440
445161350DNAArtificial SequenceSynthetic Polynucleotide 16atgcccacca
ccatcgagcg ggagttcgaa gagttggata ctcagcgtcg ctggcagccg 60ctgtacttgg
aaattcgaaa tgagtcccat gactatcctc atagagtggc caagtttcca 120gaaaacagaa
atcgaaacag atacagagat gtaagcccat atgatcacag tcgtgttaaa 180ctgcaaaatg
ctgagaatga ttatattaat gccagtttag ttgacataga agaggcacaa 240aggagttaca
tcttaacaca gggtccactt cctaacacat gctgccattt ctggcttatg 300gtttggcagc
agaagaccaa agcagttgtc atgctgaacc gcgtgatgga gaaaggttcg 360ttaaaatgtg
cacagtactg gccaacagat gaccaagaga tgctgtttaa agaaacagga 420ttcagtgtga
agctcttgtc agaagatgtg aagtcgtatt atacagtaca tctactacaa 480ttagaaaata
tcaatagtgg tgaaaccaga acaatatctc actttcatta tactacctgg 540ccagattttg
gagtccctga atcaccagct tcatttctca atttcttgtt taaagtgaga 600gaatctggct
ccttgaaccc tgaccatggg cctgcggtga tccactgtag tgcaggcatt 660gggcgctctg
gcaccttctc tctggtagac acttgtcttt tgctgatgga caagaggaaa 720gacccttctt
ccgttgatat caagaaagtg ctgttagaaa tgaggaagtt tcggatgggg 780ctgatccaga
cagccgacca gctgcgcttc tcctacctgg ctgtgatcga aggtgccaaa 840ttcatcatgg
gggactcttc cgtgcaggat cagtggaagg agctttccca cgaggacgct 900gctacacttg
aacgtattga gaagaacttt gtcattactg acccaaggtt gccagataat 960cccattatat
tcgcgtccga tagtttcttg cagttgacag aatatagccg tgaagaaatt 1020ttgggaagaa
actgcaggtt tctacaaggt cctgaaactg atcgcgcgac agtgagaaaa 1080attagagatg
ccatagataa ccaaacagag gtcactgttc agctgattaa ttatacaaag 1140agtggtaaaa
agttctggaa cctctttcac ttgcagccta tgcgagatca gaagggagat 1200gtccagtact
ttattggggt tcagttggat ggaactgagc atgtccgaga tgctgccgag 1260agagagggag
tcatgctgat taagaaaact gcagaaaata ttgatgaggc ggcaaaagaa 1320cttctcgagc
accaccacca ccaccactga
135017449PRTArtificial SequenceSynthetic Polypeptide 17Met Pro Thr Thr
Ile Glu Arg Glu Phe Glu Glu Leu Asp Thr Gln Arg1 5
10 15Arg Trp Gln Pro Leu Tyr Leu Glu Ile Arg
Asn Glu Ser His Asp Tyr 20 25
30Pro His Arg Val Ala Lys Phe Pro Glu Asn Arg Asn Arg Asn Arg Tyr
35 40 45Arg Asp Val Ser Pro Tyr Asp His
Ser Arg Val Lys Leu Gln Asn Ala 50 55
60Glu Asn Asp Tyr Ile Asn Ala Ser Leu Val Asp Ile Glu Glu Ala Gln65
70 75 80Arg Ser Tyr Ile Leu
Thr Gln Gly Pro Leu Pro Asn Thr Cys Cys His 85
90 95Phe Trp Leu Met Val Trp Gln Gln Lys Thr Lys
Ala Val Val Met Leu 100 105
110Asn Arg Val Met Glu Lys Gly Ser Leu Lys Cys Ala Gln Tyr Trp Pro
115 120 125Thr Asp Asp Gln Glu Met Leu
Phe Lys Glu Thr Gly Phe Ser Val Lys 130 135
140Leu Leu Ser Glu Asp Val Lys Ser Tyr Tyr Thr Val His Leu Leu
Gln145 150 155 160Leu Glu
Asn Ile Asn Ser Gly Glu Thr Arg Thr Ile Ser His Phe His
165 170 175Tyr Thr Thr Trp Pro Asp Phe
Gly Val Pro Glu Ser Pro Ala Ser Phe 180 185
190Leu Asn Phe Leu Phe Lys Val Arg Glu Ser Gly Ser Leu Asn
Pro Asp 195 200 205His Gly Pro Ala
Val Ile His Cys Ser Ala Gly Ile Gly Arg Ser Gly 210
215 220Thr Phe Ser Leu Val Asp Thr Cys Leu Leu Leu Met
Asp Lys Arg Lys225 230 235
240Asp Pro Ser Ser Val Asp Ile Lys Lys Val Leu Leu Glu Met Arg Lys
245 250 255Phe Arg Met Gly Leu
Ile Gln Thr Ala Asp Gln Leu Arg Phe Ser Tyr 260
265 270Leu Ala Val Ile Glu Gly Ala Lys Phe Ile Met Gly
Asp Ser Ser Val 275 280 285Gln Asp
Gln Trp Lys Glu Leu Ser His Glu Asp Ala Ala Thr Leu Glu 290
295 300Arg Ile Glu Lys Asn Phe Val Ile Thr Asp Pro
Arg Leu Pro Asp Asn305 310 315
320Pro Ile Ile Phe Ala Ser Asp Ser Phe Leu Gln Leu Thr Glu Tyr Ser
325 330 335Arg Glu Glu Ile
Leu Gly Arg Asn Cys Arg Phe Leu Gln Gly Pro Glu 340
345 350Thr Asp Arg Ala Thr Val Arg Lys Ile Arg Asp
Ala Ile Asp Asn Gln 355 360 365Thr
Glu Val Thr Val Gln Leu Ile Asn Tyr Thr Lys Ser Gly Lys Lys 370
375 380Phe Trp Asn Leu Phe His Leu Gln Pro Met
Arg Asp Gln Lys Gly Asp385 390 395
400Val Gln Tyr Phe Ile Gly Val Gln Leu Asp Gly Thr Glu His Val
Arg 405 410 415Asp Ala Ala
Glu Arg Glu Gly Val Met Leu Ile Lys Lys Thr Ala Glu 420
425 430Asn Ile Asp Glu Ala Ala Lys Glu Leu Leu
Glu His His His His His 435 440
445His181350DNAArtificial SequenceSynthetic Polynucleotide 18atgcccacca
ccatcgagcg ggagttcgaa gagttggata ctcagcgtcg ctggcagccg 60ctgtacttgg
aaattcgaaa tgagtcccat gactatcctc atagagtggc caagtttcca 120gaaaacagaa
atcgaaacag atacagagat gtaagcccat atgatcacag tcgtgttaaa 180ctgcaaaatg
ctgagaatga ttatattaat gccagtttag ttgacataga agaggcacaa 240aggagttaca
tcttaacaca gggtccactt cctaacacat gctgccattt ctggcttatg 300gtttggcagc
agaagaccaa agcagttgtc atgctgaacc gcattgtgga gaaagaatcg 360gttaaatgtg
cacagtactg gccaacagat gaccaagaga tgctgtttaa agaaacagga 420ttcagtgtga
agctcttgtc agaagatgtg aagtcgtatt atacagtaca tctactacaa 480ttagaaaata
tcaatagtgg tgaaaccaga acaatatctc actttcatta tactacctgg 540ccagattttg
gagtccctga atcaccagct tcatttctca atttcttgtt taaagtgaga 600gaatctggct
ccttgaaccc tgaccatggg cctgcggtga tccactgtag tgcaggcatt 660gggcgctctg
gcaccttctc tctggtagac acttgtcttt tgctgatgga caagaggaaa 720gacccttctt
ccgttgatat caagaaagtg ctgttagaaa tgaggaagtt tcggatgggg 780ctgatccaga
cagccgacca gctgcgcttc tcctacctgg ctgtgatcga aggtgccaaa 840ttcatcatgg
gggactcttc cgtgcaggat cagtggaagg agctttccca cgaggacgct 900gctacacttg
aacgtattga gaagaacttt gtcattactg acccaaggtt gccagataat 960cccattatat
tcgcgtccga tagtttcttg cagttgacag aatatagccg tgaagaaatt 1020ttgggaagaa
actgcaggtt tctacaaggt cctgaaactg atcgcgcgac agtgagaaaa 1080attagagatg
ccatagataa ccaaacagag gtcactgttc agctgattaa ttatacaaag 1140agtggtaaaa
agttctggaa cctctttcac ttgcagccta tgcgagatca gaagggagat 1200gtccagtact
ttattggggt tcagttggat ggaactgagc atgtccgaga tgctgccgag 1260agagagggag
tcatgctgat taagaaaact gcagaaaata ttgatgaggc ggcaaaagaa 1320cttctcgagc
accaccacca ccaccactga
135019449PRTArtificial SequenceSynthetic Polypeptide 19Met Pro Thr Thr
Ile Glu Arg Glu Phe Glu Glu Leu Asp Thr Gln Arg1 5
10 15Arg Trp Gln Pro Leu Tyr Leu Glu Ile Arg
Asn Glu Ser His Asp Tyr 20 25
30Pro His Arg Val Ala Lys Phe Pro Glu Asn Arg Asn Arg Asn Arg Tyr
35 40 45Arg Asp Val Ser Pro Tyr Asp His
Ser Arg Val Lys Leu Gln Asn Ala 50 55
60Glu Asn Asp Tyr Ile Asn Ala Ser Leu Val Asp Ile Glu Glu Ala Gln65
70 75 80Arg Ser Tyr Ile Leu
Thr Gln Gly Pro Leu Pro Asn Thr Cys Cys His 85
90 95Phe Trp Leu Met Val Trp Gln Gln Lys Thr Lys
Ala Val Val Met Leu 100 105
110Asn Arg Ile Val Glu Lys Glu Ser Val Lys Cys Ala Gln Tyr Trp Pro
115 120 125Thr Asp Asp Gln Glu Met Leu
Phe Lys Glu Thr Gly Phe Ser Val Lys 130 135
140Leu Leu Ser Glu Asp Val Lys Ser Tyr Tyr Thr Val His Leu Leu
Gln145 150 155 160Leu Glu
Asn Ile Asn Ser Gly Glu Thr Arg Thr Ile Ser His Phe His
165 170 175Tyr Thr Thr Trp Pro Asp Phe
Gly Val Pro Glu Ser Pro Ala Ser Phe 180 185
190Leu Asn Phe Leu Phe Lys Val Arg Glu Ser Gly Ser Leu Asn
Pro Asp 195 200 205His Gly Pro Ala
Val Ile His Cys Ser Ala Gly Ile Gly Arg Ser Gly 210
215 220Thr Phe Ser Leu Val Asp Thr Cys Leu Leu Leu Met
Asp Lys Arg Lys225 230 235
240Asp Pro Ser Ser Val Asp Ile Lys Lys Val Leu Leu Glu Met Arg Lys
245 250 255Phe Arg Met Gly Leu
Ile Gln Thr Ala Asp Gln Leu Arg Phe Ser Tyr 260
265 270Leu Ala Val Ile Glu Gly Ala Lys Phe Ile Met Gly
Asp Ser Ser Val 275 280 285Gln Asp
Gln Trp Lys Glu Leu Ser His Glu Asp Ala Ala Thr Leu Glu 290
295 300Arg Ile Glu Lys Asn Phe Val Ile Thr Asp Pro
Arg Leu Pro Asp Asn305 310 315
320Pro Ile Ile Phe Ala Ser Asp Ser Phe Leu Gln Leu Thr Glu Tyr Ser
325 330 335Arg Glu Glu Ile
Leu Gly Arg Asn Cys Arg Phe Leu Gln Gly Pro Glu 340
345 350Thr Asp Arg Ala Thr Val Arg Lys Ile Arg Asp
Ala Ile Asp Asn Gln 355 360 365Thr
Glu Val Thr Val Gln Leu Ile Asn Tyr Thr Lys Ser Gly Lys Lys 370
375 380Phe Trp Asn Leu Phe His Leu Gln Pro Met
Arg Asp Gln Lys Gly Asp385 390 395
400Val Gln Tyr Phe Ile Gly Val Gln Leu Asp Gly Thr Glu His Val
Arg 405 410 415Asp Ala Ala
Glu Arg Glu Gly Val Met Leu Ile Lys Lys Thr Ala Glu 420
425 430Asn Ile Asp Glu Ala Ala Lys Glu Leu Leu
Glu His His His His His 435 440
445His201827DNAArtificial SequenceSynthetic Polynucleotide 20atgcatcatc
atcatcatca tgtgagcaag ggcgaggagc tgttcaccgg ggtggtgccc 60atcctggtcg
agctggacgg cgacgtaaac ggccacaagt tcagcgtccg cggcgagggc 120gagggcgatg
ccaccaacgg caagctgacc ctgaagttca tctgcaccac cggcaagctg 180cccgtgccct
ggcccaccct cgtgaccacc ttcggctacg gcgtggcctg cttcagccgc 240taccccgacc
acatgaagca gcacgacttc ttcaagtccg ccatgcccga aggctacgtc 300caggagcgca
ccatctcttt caaggacgac ggtacctaca agacccgcgc cgaggtgaag 360ttcgagggcg
acaccctggt gaaccgcatc gagctgaagg gcatcgactt caaggaggac 420ggcaacatcc
tggggcacaa gctggagtac aacttcaaca gccactacgt ctatatcacg 480gccgacaagc
agaagaactg catcaaggct aacttcaaga tccgccacaa cgttgaggac 540ggcagcgtgc
agctcgccga ccactaccag cagaacaccc ccatcggcga cggccccgtg 600ctgctgcccg
acaaccacta cctgagccat cagtccaagc tgagcaaaga ccccaacgag 660aagcgcgatc
acatggtcct gctggagttc gtgaccgccg ccgggattac acatggcatg 720gacgagctgt
acaagtggta ttttgggaag atcactcgtc gggagtccga gcggctgctg 780ctcaaccccg
aaaacccccg gggaaccttc ttggtccggg agagcgagac gacaaaaggt 840gcctattgcc
tctccgtttc tgactttgac aacgccaagg ggctcaatgt gaagcactac 900aagatccgca
agctggacag cggcggcttc tacatcacct cacgcacaca gttcagcagc 960ctgcagcagc
tggtggccta ctactccaaa catgctgatg gcttgtgcca ccgcctgact 1020aacgtctgtg
ggtctacatc tggatctggg aagccgggtt ctggtgaggg ttcttggatg 1080gaggactatg
actacgtcca cctacagggg gagctcgtgt ctaagggcga agagctgatc 1140aaggaaaata
tgcgtatgaa ggtggtcatg gaaggttcgg tcaacggcca ccaattcaaa 1200tgcacaggtg
aaggagaagg cagaccgtac gagggaactc aaaccatgag gatcaaagtc 1260atcgagggag
gacccctgcc atttgccttt gacattcttg ccacgtcgtt catgtatggc 1320agccgtactt
ttatcaagta cccggccgac atccctgatt tctttaaaca gtcctttcct 1380gagggtttta
cttgggaaag agttacgaga tacgaagatg gtggagtcgt caccgtcacg 1440caggacacca
gccttgagga tggcgagctc gtctacaacg tcaaggtcag aggggtaaac 1500tttccctcca
atggtcccgt gatgcagaag aagaccaagg gttgggagcc taatacagag 1560atgatgtatc
cagcagatgg tggtctgaga ggatacactg acatcgcact gaaagttgat 1620ggtggtggcc
atctgcactg caacttcgtg acaacttaca ggtcaaaaaa gaccgtcggg 1680aacatcaaga
tgcccggtgt ccatgccgtt gatcaccgcc tggaaaggat cgaggagagt 1740gacaatgaaa
cctacgtagt gcaacgcgaa gtggcagttg ccaaatacag caaccttggt 1800ggtggcatgg
acgagctgta caagtaa
182721608PRTArtificial SequenceSynthetic Polypeptide 21Met His His His
His His His Val Ser Lys Gly Glu Glu Leu Phe Thr1 5
10 15Gly Val Val Pro Ile Leu Val Glu Leu Asp
Gly Asp Val Asn Gly His 20 25
30Lys Phe Ser Val Arg Gly Glu Gly Glu Gly Asp Ala Thr Asn Gly Lys
35 40 45Leu Thr Leu Lys Phe Ile Cys Thr
Thr Gly Lys Leu Pro Val Pro Trp 50 55
60Pro Thr Leu Val Thr Thr Phe Gly Tyr Gly Val Ala Cys Phe Ser Arg65
70 75 80Tyr Pro Asp His Met
Lys Gln His Asp Phe Phe Lys Ser Ala Met Pro 85
90 95Glu Gly Tyr Val Gln Glu Arg Thr Ile Ser Phe
Lys Asp Asp Gly Thr 100 105
110Tyr Lys Thr Arg Ala Glu Val Lys Phe Glu Gly Asp Thr Leu Val Asn
115 120 125Arg Ile Glu Leu Lys Gly Ile
Asp Phe Lys Glu Asp Gly Asn Ile Leu 130 135
140Gly His Lys Leu Glu Tyr Asn Phe Asn Ser His Tyr Val Tyr Ile
Thr145 150 155 160Ala Asp
Lys Gln Lys Asn Cys Ile Lys Ala Asn Phe Lys Ile Arg His
165 170 175Asn Val Glu Asp Gly Ser Val
Gln Leu Ala Asp His Tyr Gln Gln Asn 180 185
190Thr Pro Ile Gly Asp Gly Pro Val Leu Leu Pro Asp Asn His
Tyr Leu 195 200 205Ser His Gln Ser
Lys Leu Ser Lys Asp Pro Asn Glu Lys Arg Asp His 210
215 220Met Val Leu Leu Glu Phe Val Thr Ala Ala Gly Ile
Thr His Gly Met225 230 235
240Asp Glu Leu Tyr Lys Trp Tyr Phe Gly Lys Ile Thr Arg Arg Glu Ser
245 250 255Glu Arg Leu Leu Leu
Asn Pro Glu Asn Pro Arg Gly Thr Phe Leu Val 260
265 270Arg Glu Ser Glu Thr Thr Lys Gly Ala Tyr Cys Leu
Ser Val Ser Asp 275 280 285Phe Asp
Asn Ala Lys Gly Leu Asn Val Lys His Tyr Lys Ile Arg Lys 290
295 300Leu Asp Ser Gly Gly Phe Tyr Ile Thr Ser Arg
Thr Gln Phe Ser Ser305 310 315
320Leu Gln Gln Leu Val Ala Tyr Tyr Ser Lys His Ala Asp Gly Leu Cys
325 330 335His Arg Leu Thr
Asn Val Cys Gly Ser Thr Ser Gly Ser Gly Lys Pro 340
345 350Gly Ser Gly Glu Gly Ser Trp Met Glu Asp Tyr
Asp Tyr Val His Leu 355 360 365Gln
Gly Glu Leu Val Ser Lys Gly Glu Glu Leu Ile Lys Glu Asn Met 370
375 380Arg Met Lys Val Val Met Glu Gly Ser Val
Asn Gly His Gln Phe Lys385 390 395
400Cys Thr Gly Glu Gly Glu Gly Arg Pro Tyr Glu Gly Thr Gln Thr
Met 405 410 415Arg Ile Lys
Val Ile Glu Gly Gly Pro Leu Pro Phe Ala Phe Asp Ile 420
425 430Leu Ala Thr Ser Phe Met Tyr Gly Ser Arg
Thr Phe Ile Lys Tyr Pro 435 440
445Ala Asp Ile Pro Asp Phe Phe Lys Gln Ser Phe Pro Glu Gly Phe Thr 450
455 460Trp Glu Arg Val Thr Arg Tyr Glu
Asp Gly Gly Val Val Thr Val Thr465 470
475 480Gln Asp Thr Ser Leu Glu Asp Gly Glu Leu Val Tyr
Asn Val Lys Val 485 490
495Arg Gly Val Asn Phe Pro Ser Asn Gly Pro Val Met Gln Lys Lys Thr
500 505 510Lys Gly Trp Glu Pro Asn
Thr Glu Met Met Tyr Pro Ala Asp Gly Gly 515 520
525Leu Arg Gly Tyr Thr Asp Ile Ala Leu Lys Val Asp Gly Gly
Gly His 530 535 540Leu His Cys Asn Phe
Val Thr Thr Tyr Arg Ser Lys Lys Thr Val Gly545 550
555 560Asn Ile Lys Met Pro Gly Val His Ala Val
Asp His Arg Leu Glu Arg 565 570
575Ile Glu Glu Ser Asp Asn Glu Thr Tyr Val Val Gln Arg Glu Val Ala
580 585 590Val Ala Lys Tyr Ser
Asn Leu Gly Gly Gly Met Asp Glu Leu Tyr Lys 595
600 6052270DNAArtificial SequenceSynthetic Polynucleotide
22gcaaatgggc ggtaggcgtg tacggtggga ggtctatata agcagagctg gtttagtgaa
60ccgtcagatc
702366DNAArtificial SequenceSynthetic Polynucleotide 23ggcagcggcg
ccaccaactt ctccctgctg aagcaggccg gcgacgtgga ggagaacccc 60ggcccc
662422PRTArtificial SequenceSynthetic Polypeptide 24Gly Ser Gly Ala Thr
Asn Phe Ser Leu Leu Lys Gln Ala Gly Asp Val1 5
10 15Glu Glu Asn Pro Gly Pro
2025160DNAArtificial SequenceSynthetic Polynucleotide 25ttctagagca
cagctaacac cacgtcgtcc ctatctgctg ccctaggtct atgagtggtt 60gctggataac
tttacgggca tgcataaggc tcggtatcta tattcaggga gaccacaacg 120gtttccctct
acaaataatt ttgtttaact tttactagag
16026153DNAArtificial SequenceSynthetic Polynucleotide 26gcacagctaa
caccacgtcg tccctatctg ctgccctagg tctatgagtg gttgctggat 60aactttacgg
gcatgcataa ggctcgtata atatattcag ggagaccaca acggtttccc 120tctacaaata
attttgttta acttttacta gag
15327304DNAArtificial SequenceSynthetic Polynucleotide 27agaaaccaat
tgtccatatt gcatcagaca ttgccgtcac tgcgtctttt actggctctt 60ctcgctaacc
aaaccggtaa ccccgcttat taaaagcatt ctgtaacaaa gcgggaccaa 120agccatgaca
aaaacgcgta acaaaagtgt ctataatcac ggcagaaaag tccacattga 180ttatttgcac
ggcgtcacac tttgctatgc catagcattt ttatccataa gattagcgga 240tcctacctga
cgctttttat cgcaactctc tactgtttct ccatacccgt ttttttgggc 300tagc
3042872DNAArtificial SequenceSynthetic Polynucleotide 28acaagaaagt
ttgttcatta ggcaccccgg gctttactcg taaagcttcc ggcgcgtatg 60ttgtgtcgac
cg
7229246DNAArtificial SequenceSynthetic Polynucleotide 29cgactgcacg
gtgcaccaat gcttctggcg tcaggcagcc atcggaagct gtggtatggc 60tgtgcaggtc
gtaaatcact gcataattcg tgtcgctcaa ggcgcactcc cgttctggat 120aatgtttttt
gcgccgacat cataacggtt ctggcaaata ttctgaaatg agctgttgac 180aattaatcat
ccggctcgta taatgtgtgg aattgtgagc ggataacaat ttcacacagg 240aaacag
2463019DNAArtificial SequenceSynthetic Polynucleotide 30cctatagtga
gtcgtatta
193118DNAArtificial SequenceSynthetic Polynucleotide 31ttaaagagga
gaaaggtc
183224DNAArtificial SequenceSynthetic Polynucleotide 32cgaaaaaaag
taaggcggta atcc
243323DNAArtificial SequenceSynthetic Polynucleotide 33tgcagaaaga
ggagaaatac tag
233421DNAArtificial SequenceSynthetic Polynucleotide 34attaaagagg
agaaatacta g
213522DNAArtificial SequenceSynthetic Polynucleotide 35gtgcagtaag
gaggaaaaaa aa
223625DNAArtificial SequenceSynthetic Polynucleotide 36gctagcttta
agaaggagat atacc
253794PRTArtificial SequenceSynthetic Polypeptide 37Met Ala Arg Val Thr
Val Gln Asp Ala Val Glu Lys Ile Gly Asn Arg1 5
10 15Phe Asp Leu Val Leu Val Ala Ala Arg Arg Ala
Arg Gln Met Gln Val 20 25
30Gly Gly Lys Asp Pro Leu Val Pro Glu Glu Asn Asp Lys Thr Thr Val
35 40 45Ile Ala Leu Arg Glu Ile Glu Glu
Gly Leu Ile Asn Asn Gln Ile Leu 50 55
60Asp Val Arg Glu Arg Gln Glu Gln Gln Glu Gln Glu Ala Ala Glu Leu65
70 75 80Gln Ala Val Thr Ala
Ile Ala Glu Gly Arg Arg Ala Ala Ala 85
9038218PRTArtificial SequenceSynthetic Polypeptide 38Met Ser Ile Ser Ser
Arg Val Lys Ser Lys Arg Ile Gln Leu Gly Leu1 5
10 15Asn Gln Ala Glu Leu Ala Gln Lys Val Gly Thr
Thr Gln Gln Ser Ile 20 25
30Glu Gln Leu Glu Asn Gly Lys Thr Lys Arg Pro Arg Phe Leu Pro Glu
35 40 45Leu Ala Ser Ala Leu Gly Val Ser
Val Asp Trp Leu Leu Asn Gly Thr 50 55
60Ser Asp Ser Asn Val Arg Phe Val Gly His Val Glu Pro Lys Gly Lys65
70 75 80Tyr Pro Leu Ile Ser
Met Val Arg Ala Arg Ser Trp Cys Glu Ala Cys 85
90 95Glu Pro Tyr Asp Ile Lys Asp Ile Asp Glu Trp
Tyr Asp Ser Asp Val 100 105
110Asn Leu Leu Gly Asn Gly Phe Trp Leu Lys Val Glu Gly Asp Ser Met
115 120 125Thr Ser Pro Val Gly Gln Ser
Ile Pro Glu Gly His Met Val Leu Val 130 135
140Asp Thr Gly Arg Glu Pro Val Asn Gly Ser Leu Val Val Ala Lys
Leu145 150 155 160Thr Asp
Ala Asn Glu Ala Thr Phe Lys Lys Leu Val Ile Asp Gly Gly
165 170 175Gln Lys Tyr Leu Lys Gly Leu
Asn Pro Ser Trp Pro Met Thr Pro Ile 180 185
190Asn Gly Asn Cys Lys Ile Ile Gly Val Val Val Glu Ala Arg
Val Lys 195 200 205Phe Val Asp Tyr
Lys Asp Asp Asp Asp Lys 210 2153998PRTArtificial
SequenceSynthetic Polypeptide 39Trp Tyr Phe Gly Lys Ile Thr Arg Arg Glu
Ser Glu Arg Leu Leu Leu1 5 10
15Asn Pro Glu Asn Pro Arg Gly Thr Phe Leu Val Arg Glu Ser Glu Thr
20 25 30Val Lys Gly Ala Tyr Ala
Leu Ser Val Ser Asp Phe Asp Asn Ala Lys 35 40
45Gly Leu Asn Val Lys His Tyr Leu Ile Arg Lys Leu Asp Ser
Gly Gly 50 55 60Phe Tyr Ile Thr Ser
Arg Thr Gln Phe Ser Ser Leu Gln Gln Leu Val65 70
75 80Ala Tyr Tyr Ser Lys His Ala Asp Gly Leu
Cys His Arg Leu Thr Asn 85 90
95Val Cys4012PRTArtificial SequenceSynthetic Polypeptide 40Trp Met
Glu Asp Tyr Asp Tyr Val His Leu Gln Gly1 5
104111PRTArtificial SequenceSynthetic Polypeptide 41Glu Pro Gln Tyr Glu
Glu Ile Pro Ile Tyr Leu1 5
104210PRTArtificial SequenceSynthetic Polypeptide 42Asp His Gln Tyr Tyr
Asn Asp Phe Pro Gly1 5
104310PRTArtificial SequenceSynthetic Polypeptide 43Pro Gln Arg Tyr Leu
Val Ile Gln Gly Asp1 5
1044287PRTArtificial SequenceSynthetic Polypeptide 44Met Ser Lys Pro Gln
Thr Gln Gly Leu Ala Lys Asp Ala Trp Glu Ile1 5
10 15Pro Arg Glu Ser Leu Arg Leu Glu Val Lys Leu
Gly Gln Gly Cys Phe 20 25
30Gly Glu Val Trp Met Gly Thr Trp Asn Gly Thr Thr Arg Val Ala Ile
35 40 45Lys Thr Leu Lys Pro Gly Thr Met
Ser Pro Glu Ala Phe Leu Gln Glu 50 55
60Ala Gln Val Met Lys Lys Leu Arg His Glu Lys Leu Val Gln Leu Tyr65
70 75 80Ala Val Val Ser Glu
Glu Pro Ile Tyr Ile Val Thr Glu Tyr Met Ser 85
90 95Lys Gly Ser Leu Leu Asp Phe Leu Lys Gly Glu
Thr Gly Lys Tyr Leu 100 105
110Arg Leu Pro Gln Leu Val Asp Met Ala Ala Gln Ile Ala Ser Gly Met
115 120 125Ala Tyr Val Glu Arg Met Asn
Tyr Val His Arg Asp Leu Arg Ala Ala 130 135
140Asn Ile Leu Val Gly Glu Asn Leu Val Cys Lys Val Ala Asp Phe
Gly145 150 155 160Leu Ala
Arg Leu Ile Glu Asp Asn Glu Tyr Thr Ala Arg Gln Gly Ala
165 170 175Lys Phe Pro Ile Lys Trp Thr
Ala Pro Glu Ala Ala Leu Tyr Gly Arg 180 185
190Phe Thr Ile Lys Ser Asp Val Trp Ser Phe Gly Ile Leu Leu
Thr Glu 195 200 205Leu Thr Thr Lys
Gly Arg Val Pro Tyr Pro Gly Met Val Asn Arg Glu 210
215 220Val Leu Asp Gln Val Glu Arg Gly Tyr Arg Met Pro
Cys Pro Pro Glu225 230 235
240Cys Pro Glu Ser Leu His Asp Leu Met Cys Gln Cys Trp Arg Lys Glu
245 250 255Pro Glu Glu Arg Pro
Thr Phe Glu Tyr Leu Gln Ala Phe Leu Glu Asp 260
265 270Tyr Phe Thr Ser Thr Glu Pro Gln Tyr Gln Pro Gly
Glu Asn Leu 275 280
285

SEQ ID NO: 45; Length: 378; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic Polypeptide; Sequence: 45
Met Val Asp Tyr Ser
Val Trp Asp His Ile Glu Val Ser Asp Asp Glu1 5
10 15Asp Glu Thr His Pro Asn Ile Asp Thr Ala Ser
Leu Phe Arg Trp Arg 20 25
30His Gln Ala Arg Val Glu Arg Met Glu Gln Phe Gln Lys Glu Lys Glu
35 40 45Glu Leu Asp Arg Gly Cys Arg Glu
Cys Lys Arg Lys Val Ala Glu Cys 50 55
60Gln Arg Lys Leu Lys Glu Leu Glu Val Ala Glu Gly Gly Lys Ala Glu65
70 75 80Leu Glu Arg Leu Gln
Ala Glu Ala Gln Gln Leu Arg Lys Glu Glu Arg 85
90 95Ser Trp Glu Gln Lys Leu Glu Glu Met Arg Lys
Lys Glu Lys Ser Met 100 105
110Pro Trp Asn Val Asp Thr Leu Ser Lys Asp Gly Phe Ser Lys Ser Met
115 120 125Val Asn Thr Lys Pro Glu Lys
Thr Glu Glu Asp Ser Glu Glu Val Arg 130 135
140Glu Gln Lys His Lys Thr Phe Val Glu Lys Tyr Glu Lys Gln Ile
Lys145 150 155 160His Phe
Gly Met Leu Arg Arg Trp Asp Asp Ser Gln Lys Tyr Leu Ser
165 170 175Asp Asn Val His Leu Val Cys
Glu Glu Thr Ala Asn Tyr Leu Val Ile 180 185
190Trp Cys Ile Asp Leu Glu Val Glu Glu Lys Cys Ala Leu Met
Glu Gln 195 200 205Val Ala His Gln
Thr Ile Val Met Gln Phe Ile Leu Glu Leu Ala Lys 210
215 220Ser Leu Lys Val Asp Pro Arg Ala Cys Phe Arg Gln
Phe Phe Thr Lys225 230 235
240Ile Lys Thr Ala Asp Arg Gln Tyr Met Glu Gly Phe Asn Asp Glu Leu
245 250 255Glu Ala Phe Lys Glu
Arg Val Arg Gly Arg Ala Lys Leu Arg Ile Glu 260
265 270Lys Ala Met Lys Glu Tyr Glu Glu Glu Glu Arg Lys
Lys Arg Leu Gly 275 280 285Pro Gly
Gly Leu Asp Pro Val Glu Val Tyr Glu Ser Leu Pro Glu Glu 290
295 300Leu Gln Lys Cys Phe Asp Val Lys Asp Val Gln
Met Leu Gln Asp Ala305 310 315
320Ile Ser Lys Met Asp Pro Thr Asp Ala Lys Tyr His Met Gln Arg Cys
325 330 335Ile Asp Ser Gly
Leu Trp Val Pro Asn Ser Lys Ala Ser Glu Ala Lys 340
345 350Glu Gly Glu Glu Ala Gly Pro Gly Asp Pro Leu
Leu Glu Ala Val Pro 355 360 365Lys
Thr Gly Asp Glu Lys Asp Val Ser Val 370
375

SEQ ID NO: 46; Length: 321; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic Polypeptide; Sequence: 46
Met Glu Met Glu Lys
Glu Phe Glu Gln Ile Asp Lys Ser Gly Ser Trp1 5
10 15Ala Ala Ile Tyr Gln Asp Ile Arg His Glu Ala
Ser Asp Phe Pro Cys 20 25
30Arg Val Ala Lys Leu Pro Lys Asn Lys Asn Arg Asn Arg Tyr Arg Asp
35 40 45Val Ser Pro Phe Asp His Ser Arg
Ile Lys Leu His Gln Glu Asp Asn 50 55
60Asp Tyr Ile Asn Ala Ser Leu Ile Lys Met Glu Glu Ala Gln Arg Ser65
70 75 80Tyr Ile Leu Thr Gln
Gly Pro Leu Pro Asn Thr Cys Gly His Phe Trp 85
90 95Glu Met Val Trp Glu Gln Lys Ser Arg Gly Val
Val Met Leu Asn Arg 100 105
110Val Met Glu Lys Gly Ser Leu Lys Cys Ala Gln Tyr Trp Pro Gln Lys
115 120 125Glu Glu Lys Glu Met Ile Phe
Glu Asp Thr Asn Leu Lys Leu Thr Leu 130 135
140Ile Ser Glu Asp Ile Lys Ser Tyr Tyr Thr Val Arg Gln Leu Glu
Leu145 150 155 160Glu Asn
Leu Thr Thr Gln Glu Thr Arg Glu Ile Leu His Phe His Tyr
165 170 175Thr Thr Trp Pro Asp Phe Gly
Val Pro Glu Ser Pro Ala Ser Phe Leu 180 185
190Asn Phe Leu Phe Lys Val Arg Glu Ser Gly Ser Leu Ser Pro
Glu His 195 200 205Gly Pro Val Val
Val His Ser Ser Ala Gly Ile Gly Arg Ser Gly Thr 210
215 220Phe Cys Leu Ala Asp Thr Cys Leu Leu Leu Met Asp
Lys Arg Lys Asp225 230 235
240Pro Ser Ser Val Asp Ile Lys Lys Val Leu Leu Glu Met Arg Lys Phe
245 250 255Arg Met Gly Leu Ile
Gln Thr Ala Asp Gln Leu Arg Phe Ser Tyr Leu 260
265 270Ala Val Ile Glu Gly Ala Lys Phe Ile Met Gly Asp
Ser Ser Val Gln 275 280 285Asp Gln
Trp Lys Glu Leu Ser His Glu Asp Leu Glu Pro Pro Pro Glu 290
295 300His Ile Pro Pro Pro Pro Arg Pro Pro Lys Arg
Ile Leu Glu Pro His305 310 315
320
Asn

SEQ ID NO: 47; Length: 371; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic Polypeptide; Sequence: 47
Met Lys Ile
Glu Glu Gly Lys Leu Val Ile Trp Ile Asn Gly Asp Lys1 5
10 15Gly Tyr Asn Gly Leu Ala Glu Val Gly
Lys Lys Phe Glu Lys Asp Thr 20 25
30Gly Ile Lys Val Thr Val Glu His Pro Asp Lys Leu Glu Glu Lys Phe
35 40 45Pro Gln Val Ala Ala Thr Gly
Asp Gly Pro Asp Ile Ile Phe Trp Ala 50 55
60His Asp Arg Phe Gly Gly Tyr Ala Gln Ser Gly Leu Leu Ala Glu Ile65
70 75 80Thr Pro Asp Lys
Ala Phe Gln Asp Lys Leu Tyr Pro Phe Thr Trp Asp 85
90 95Ala Val Arg Tyr Asn Gly Lys Leu Ile Ala
Tyr Pro Ile Ala Val Glu 100 105
110Ala Leu Ser Leu Ile Tyr Asn Lys Asp Leu Leu Pro Asn Pro Pro Lys
115 120 125Thr Trp Glu Glu Ile Pro Ala
Leu Asp Lys Glu Leu Lys Ala Lys Gly 130 135
140Lys Ser Ala Leu Met Phe Asn Leu Gln Glu Pro Tyr Phe Thr Trp
Pro145 150 155 160Leu Ile
Ala Ala Asp Gly Gly Tyr Ala Phe Lys Tyr Glu Asn Gly Lys
165 170 175Tyr Asp Ile Lys Asp Val Gly
Val Asp Asn Ala Gly Ala Lys Ala Gly 180 185
190Leu Thr Phe Leu Val Asp Leu Ile Lys Asn Lys His Met Asn
Ala Asp 195 200 205Thr Asp Tyr Ser
Ile Ala Glu Ala Ala Phe Asn Lys Gly Glu Thr Ala 210
215 220Met Thr Ile Asn Gly Pro Trp Ala Trp Ser Asn Ile
Asp Thr Ser Lys225 230 235
240Val Asn Tyr Gly Val Thr Val Leu Pro Thr Phe Lys Gly Gln Pro Ser
245 250 255Lys Pro Phe Val Gly
Val Leu Ser Ala Gly Ile Asn Ala Ala Ser Pro 260
265 270Asn Lys Glu Leu Ala Lys Glu Phe Leu Glu Asn Tyr
Leu Leu Thr Asp 275 280 285Glu Gly
Leu Glu Ala Val Asn Lys Asp Lys Pro Leu Gly Ala Val Ala 290
295 300Leu Lys Ser Tyr Glu Glu Glu Leu Ala Lys Asp
Pro Arg Ile Ala Ala305 310 315
320Thr Met Glu Asn Ala Gln Lys Gly Glu Ile Met Pro Asn Ile Pro Gln
325 330 335Met Ser Ala Phe
Trp Tyr Ala Val Arg Thr Ala Val Ile Asn Ala Ala 340
345 350Ser Gly Arg Gln Thr Val Asp Glu Ala Leu Lys
Asp Ala Gln Thr Arg 355 360 365Ile
Thr Lys
370

SEQ ID NO: 48; Length: 707; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic Polypeptide; Sequence: 48
Met Lys
Phe Gly Asn Phe Leu Leu Thr Tyr Gln Pro Pro Gln Phe Ser1 5
10 15Gln Thr Glu Val Met Lys Arg Leu
Val Lys Leu Gly Arg Ile Ser Glu 20 25
30Glu Cys Gly Phe Asp Thr Val Trp Leu Leu Glu His His Phe Thr
Glu 35 40 45Phe Gly Leu Leu Gly
Asn Pro Tyr Val Ala Ala Ala Tyr Leu Leu Gly 50 55
60Ala Thr Lys Lys Leu Asn Val Gly Thr Ala Ala Ile Val Leu
Pro Thr65 70 75 80Ala
His Pro Val Arg Gln Leu Glu Asp Val Asn Leu Leu Asp Gln Met
85 90 95Ser Lys Gly Arg Phe Arg Phe
Gly Ile Cys Arg Gly Leu Tyr Asn Lys 100 105
110Asp Phe Arg Val Phe Gly Thr Asp Met Asn Asn Ser Arg Ala
Leu Ala 115 120 125Glu Cys Trp Tyr
Gly Leu Ile Lys Asn Gly Met Thr Glu Gly Tyr Met 130
135 140Glu Ala Asp Asn Glu His Ile Lys Phe His Lys Val
Lys Val Asn Pro145 150 155
160Ala Ala Tyr Ser Arg Gly Gly Ala Pro Val Tyr Val Val Ala Glu Ser
165 170 175Ala Ser Thr Thr Glu
Trp Ala Ala Gln Phe Gly Leu Pro Met Ile Leu 180
185 190Ser Trp Ile Ile Asn Thr Asn Glu Lys Lys Ala Gln
Leu Glu Leu Tyr 195 200 205Asn Glu
Val Ala Gln Glu Tyr Gly His Asp Ile His Asn Ile Asp His 210
215 220Cys Leu Ser Tyr Ile Thr Ser Val Asp His Asp
Ser Ile Lys Ala Lys225 230 235
240Glu Ile Cys Arg Lys Phe Leu Gly His Trp Tyr Asp Ser Tyr Val Asn
245 250 255Ala Thr Thr Ile
Phe Asp Asp Ser Asp Gln Thr Arg Gly Tyr Asp Phe 260
265 270Asn Lys Gly Gln Trp Arg Asp Phe Val Leu Lys
Gly His Lys Asp Thr 275 280 285Asn
Arg Arg Ile Asp Tyr Ser Tyr Glu Ile Asn Pro Val Gly Thr Pro 290
295 300Gln Glu Cys Ile Asp Ile Ile Gln Lys Asp
Ile Asp Ala Thr Gly Ile305 310 315
320Ser Asn Ile Cys Cys Gly Phe Glu Ala Asn Gly Thr Val Asp Glu
Ile 325 330 335Ile Ala Ser
Met Lys Leu Phe Gln Ser Asp Val Met Pro Phe Leu Lys 340
345 350Glu Lys Gln Arg Ser Leu Leu Tyr Tyr Gly
Gly Gly Gly Ser Gly Gly 355 360
365Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Lys Phe Gly 370
375 380Leu Phe Phe Leu Asn Phe Ile Asn
Ser Thr Thr Val Gln Glu Gln Ser385 390
395 400Ile Val Arg Met Gln Glu Ile Thr Glu Tyr Val Asp
Lys Leu Asn Phe 405 410
415Glu Gln Ile Leu Val Tyr Glu Asn His Phe Ser Asp Asn Gly Val Val
420 425 430Gly Ala Pro Leu Thr Val
Ser Gly Phe Leu Leu Gly Leu Thr Glu Lys 435 440
445Ile Lys Ile Gly Ser Leu Asn His Ile Ile Thr Thr His His
Pro Val 450 455 460Arg Ile Ala Glu Glu
Ala Cys Leu Leu Asp Gln Leu Ser Glu Gly Arg465 470
475 480Phe Ile Leu Gly Phe Ser Asp Cys Glu Lys
Lys Asp Glu Met His Phe 485 490
495Phe Asn Arg Pro Val Glu Tyr Gln Gln Gln Leu Phe Glu Glu Cys Tyr
500 505 510Glu Ile Ile Asn Asp
Ala Leu Thr Thr Gly Tyr Cys Asn Pro Asp Asn 515
520 525Asp Phe Tyr Ser Phe Pro Lys Ile Ser Val Asn Pro
His Ala Tyr Thr 530 535 540Pro Gly Gly
Pro Arg Lys Tyr Val Thr Ala Thr Ser His His Ile Val545
550 555 560Glu Trp Ala Ala Lys Lys Gly
Ile Pro Leu Ile Phe Lys Trp Asp Asp 565
570 575Ser Asn Asp Val Arg Tyr Glu Tyr Ala Glu Arg Tyr
Lys Ala Val Ala 580 585 590Asp
Lys Tyr Asp Val Asp Leu Ser Glu Ile Asp His Gln Leu Met Ile 595
600 605Leu Val Asn Tyr Asn Glu Asp Ser Asn
Lys Ala Lys Gln Glu Thr Arg 610 615
620Ala Phe Ile Ser Asp Tyr Val Leu Glu Met His Pro Asn Glu Asn Phe625
630 635 640Glu Asn Lys Leu
Glu Glu Ile Ile Ala Glu Asn Ala Val Gly Asn Tyr 645
650 655Thr Glu Cys Ile Thr Ala Ala Lys Leu Ala
Ile Glu Lys Cys Gly Ala 660 665
670Lys Ser Val Leu Leu Ser Phe Glu Pro Met Asn Asp Leu Met Ser Gln
675 680 685Lys Asn Val Ile Asn Ile Val
Asp Asp Asn Ile Lys Lys Tyr His Thr 690 695
700
Glu Tyr Thr
705

SEQ ID NO: 49; Length: 263; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic Polypeptide; Sequence: 49
Met Arg Glu Ala Val Ile Ala Glu Val Ser Thr Gln Leu Ser Glu Val
1
5 10 15Val Gly Val Ile Glu Arg
His Leu Glu Pro Thr Leu Leu Ala Val His 20 25
30Leu Tyr Gly Ser Ala Val Asp Gly Gly Leu Lys Pro His
Ser Asp Ile 35 40 45Asp Leu Leu
Val Thr Val Thr Val Arg Leu Asp Glu Thr Thr Arg Arg 50
55 60Ala Leu Ile Asn Asp Leu Leu Glu Thr Ser Ala Ser
Pro Gly Glu Ser65 70 75
80Glu Ile Leu Arg Ala Val Glu Val Thr Ile Val Val His Asp Asp Ile
85 90 95Ile Pro Trp Arg Tyr Pro
Ala Lys Arg Glu Leu Gln Phe Gly Glu Trp 100
105 110Gln Arg Asn Asp Ile Leu Ala Gly Ile Phe Glu Pro
Ala Thr Ile Asp 115 120 125Ile Asp
Leu Ala Ile Leu Leu Thr Lys Ala Arg Glu His Ser Val Ala 130
135 140Leu Val Gly Pro Ala Ala Glu Glu Leu Phe Asp
Pro Val Pro Glu Gln145 150 155
160Asp Leu Phe Glu Ala Leu Asn Glu Thr Leu Thr Leu Trp Asn Ser Pro
165 170 175Pro Asp Trp Ala
Gly Asp Glu Arg Asn Val Val Leu Thr Leu Ser Arg 180
185 190Ile Trp Tyr Ser Ala Val Thr Gly Lys Ile Ala
Pro Lys Asp Val Ala 195 200 205Ala
Asp Trp Ala Met Glu Arg Leu Pro Ala Gln Tyr Gln Pro Val Ile 210
215 220Leu Glu Ala Arg Gln Ala Tyr Leu Gly Gln
Glu Glu Asp Arg Leu Ala225 230 235
240Ser Arg Ala Asp Gln Leu Glu Glu Phe Val His Tyr Val Lys Gly
Glu 245 250 255Ile Thr Lys
Val Val Gly Lys
260

SEQ ID NO: 50; Length: 785; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic Polypeptide; Sequence: 50
Met Val Lys Arg Glu Phe Pro Pro Gly Phe Trp Lys Asp Asp Leu
Ile1 5 10 15Asp Ser Leu
Thr Ser Ser His Lys Val Ala Ala Ser Asp Glu Lys Arg 20
25 30Ile Glu Thr Leu Ile Ser Glu Ile Lys Asn
Met Phe Arg Cys Met Gly 35 40
45Tyr Gly Glu Thr Asn Pro Ser Ala Tyr Asp Thr Ala Trp Val Ala Arg 50
55 60Ile Pro Ala Val Asp Gly Ser Asp Asn
Pro His Phe Pro Glu Thr Val65 70 75
80Glu Trp Ile Leu Gln Asn Gln Leu Lys Asp Gly Ser Trp Gly
Glu Gly 85 90 95Phe Tyr
Phe Leu Ala Tyr Asp Arg Ile Leu Ala Thr Leu Ala Cys Ile 100
105 110Ile Thr Leu Thr Leu Trp Arg Thr Gly
Glu Thr Gln Val Gln Lys Gly 115 120
125Ile Glu Phe Phe Arg Thr Gln Ala Gly Lys Met Glu Asp Glu Ala Asp
130 135 140Ser His Arg Pro Ser Gly Phe
Glu Ile Val Phe Pro Ala Met Leu Lys145 150
155 160Glu Ala Lys Ile Leu Gly Leu Asp Leu Pro Tyr Asp
Leu Pro Phe Leu 165 170
175Lys Gln Ile Ile Glu Lys Arg Glu Ala Lys Leu Lys Arg Ile Pro Thr
180 185 190Asp Val Leu Tyr Ala Leu
Pro Thr Thr Leu Leu Tyr Ser Leu Glu Gly 195 200
205Leu Gln Glu Ile Val Asp Trp Gln Lys Ile Met Lys Leu Gln
Ser Lys 210 215 220Asp Gly Ser Phe Leu
Ser Ser Pro Ala Ser Thr Ala Ala Val Phe Met225 230
235 240Arg Thr Gly Asn Lys Lys Cys Leu Asp Phe
Leu Asn Phe Val Leu Lys 245 250
255Lys Phe Gly Asn His Val Pro Cys His Tyr Pro Leu Asp Leu Phe Glu
260 265 270Arg Leu Trp Ala Val
Asp Thr Val Glu Arg Leu Gly Ile Asp Arg His 275
280 285Phe Lys Glu Glu Ile Lys Glu Ala Leu Asp Tyr Val
Tyr Ser His Trp 290 295 300Asp Glu Arg
Gly Ile Gly Trp Ala Arg Glu Asn Pro Val Pro Asp Ile305
310 315 320Asp Asp Thr Ala Met Gly Leu
Arg Ile Leu Arg Leu His Gly Tyr Asn 325
330 335Val Ser Ser Asp Val Leu Lys Thr Phe Arg Asp Glu
Asn Gly Glu Phe 340 345 350Phe
Cys Phe Leu Gly Gln Thr Gln Arg Gly Val Thr Asp Met Leu Asn 355
360 365Val Asn Arg Cys Ser His Val Ser Phe
Pro Gly Glu Thr Ile Met Glu 370 375
380Glu Ala Lys Leu Cys Thr Glu Arg Tyr Leu Arg Asn Ala Leu Glu Asn385
390 395 400Val Asp Ala Phe
Asp Lys Trp Ala Phe Lys Lys Asn Ile Arg Gly Glu 405
410 415Val Glu Tyr Ala Leu Lys Tyr Pro Trp His
Lys Ser Met Pro Arg Leu 420 425
430Glu Ala Arg Ser Tyr Ile Glu Asn Tyr Gly Pro Asp Asp Val Trp Leu
435 440 445Gly Lys Thr Val Tyr Met Met
Pro Tyr Ile Ser Asn Glu Lys Tyr Leu 450 455
460Glu Leu Ala Lys Leu Asp Phe Asn Lys Val Gln Ser Ile His Gln
Thr465 470 475 480Glu Leu
Gln Asp Leu Arg Arg Trp Trp Lys Ser Ser Gly Phe Thr Asp
485 490 495Leu Asn Phe Thr Arg Glu Arg
Val Thr Glu Ile Tyr Phe Ser Pro Ala 500 505
510Ser Phe Ile Phe Glu Pro Glu Phe Ser Lys Cys Arg Glu Val
Tyr Thr 515 520 525Lys Thr Ser Asn
Phe Thr Val Ile Leu Asp Asp Leu Tyr Asp Ala His 530
535 540Gly Ser Leu Asp Asp Leu Lys Leu Phe Thr Glu Ser
Val Lys Arg Trp545 550 555
560Asp Leu Ser Leu Val Asp Gln Met Pro Gln Gln Met Lys Ile Cys Phe
565 570 575Val Gly Phe Tyr Asn
Thr Phe Asn Asp Ile Ala Lys Glu Gly Arg Glu 580
585 590Arg Gln Gly Arg Asp Val Leu Gly Tyr Ile Gln Asn
Val Trp Lys Val 595 600 605Gln Leu
Glu Ala Tyr Thr Lys Glu Ala Glu Trp Ser Glu Ala Lys Tyr 610
615 620Val Pro Ser Phe Asn Glu Tyr Ile Glu Asn Ala
Ser Val Ser Ile Ala625 630 635
640Leu Gly Thr Val Val Leu Ile Ser Ala Leu Phe Thr Gly Glu Val Leu
645 650 655Thr Asp Glu Val
Leu Ser Lys Ile Asp Arg Glu Ser Arg Phe Leu Gln 660
665 670Leu Met Gly Leu Thr Gly Arg Leu Val Asn Asp
Thr Lys Thr Tyr Gln 675 680 685Ala
Glu Arg Gly Gln Gly Glu Val Ala Ser Ala Ile Gln Cys Tyr Met 690
695 700Lys Asp His Pro Lys Ile Ser Glu Glu Glu
Ala Leu Gln His Val Tyr705 710 715
720Ser Val Met Glu Asn Ala Leu Glu Glu Leu Asn Arg Glu Phe Val
Asn 725 730 735Asn Lys Ile
Pro Asp Ile Tyr Lys Arg Leu Val Phe Glu Thr Ala Arg 740
745 750Ile Met Gln Leu Phe Tyr Met Gln Gly Asp
Gly Leu Thr Leu Ser His 755 760
765Asp Met Glu Ile Lys Glu His Val Lys Asn Cys Leu Phe Gln Pro Val 770
775 780
Ala
785

SEQ ID NO: 51; Length: 296; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic Polypeptide; Sequence: 51
Met Phe Asp Phe Asn Glu Tyr Met Lys Ser
Lys Ala Val Ala Val Asp1 5 10
15Ala Ala Leu Asp Lys Ala Ile Pro Leu Glu Tyr Pro Glu Lys Ile His
20 25 30Glu Ser Met Arg Tyr Ser
Leu Leu Ala Gly Gly Lys Arg Val Arg Pro 35 40
45Ala Leu Cys Ile Ala Ala Cys Glu Leu Val Gly Gly Ser Gln
Asp Leu 50 55 60Ala Met Pro Thr Ala
Cys Ala Met Glu Met Ile His Thr Met Ser Leu65 70
75 80Ile His Asp Asp Leu Pro Cys Met Asp Asn
Asp Asp Phe Arg Arg Gly 85 90
95Lys Pro Thr Asn His Lys Val Phe Gly Glu Asp Thr Ala Val Leu Ala
100 105 110Gly Asp Ala Leu Leu
Ser Phe Ala Phe Glu His Ile Ala Val Ala Thr 115
120 125Ser Lys Thr Val Pro Ser Asp Arg Thr Leu Arg Val
Ile Ser Glu Leu 130 135 140Gly Lys Thr
Ile Gly Ser Gln Gly Leu Val Gly Gly Gln Val Val Asp145
150 155 160Ile Thr Ser Glu Gly Asp Ala
Asn Val Asp Leu Lys Thr Leu Glu Trp 165
170 175Ile His Ile His Lys Thr Ala Val Leu Leu Glu Cys
Ser Val Val Ser 180 185 190Gly
Gly Ile Leu Gly Gly Ala Thr Glu Asp Glu Ile Ala Arg Ile Arg 195
200 205Arg Tyr Ala Arg Cys Val Gly Leu Leu
Phe Gln Val Val Asp Asp Ile 210 215
220Leu Asp Val Thr Lys Ser Ser Glu Glu Leu Gly Lys Thr Ala Gly Lys225
230 235 240Asp Leu Leu Thr
Asp Lys Ala Thr Tyr Pro Lys Leu Met Gly Leu Glu 245
250 255Lys Ala Lys Glu Phe Ala Ala Glu Leu Ala
Thr Arg Ala Lys Glu Glu 260 265
270Leu Ser Ser Phe Asp Gln Ile Lys Ala Ala Pro Leu Leu Gly Leu Ala
275 280 285Asp Tyr Ile Ala Phe Arg Gln
Asn
290 295

SEQ ID NO: 52; Length: 1049; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic Polypeptide; Sequence: 52
Met Thr Ile Lys Glu Met Pro Gln Pro Lys Thr Phe Gly Glu Leu
Lys1 5 10 15Asn Leu Pro
Leu Leu Asn Thr Asp Lys Pro Val Gln Ala Leu Met Lys 20
25 30Ile Ala Asp Glu Leu Gly Glu Ile Phe Lys
Phe Glu Ala Pro Gly Arg 35 40
45Val Thr Arg Tyr Leu Ser Ser Gln Arg Leu Ile Lys Glu Ala Cys Asp 50
55 60Glu Ser Arg Phe Asp Lys Asn Leu Ser
Gln Ala Leu Lys Phe Val Arg65 70 75
80Asp Phe Ala Gly Asp Gly Leu Phe Thr Ser Trp Thr His Glu
Lys Asn 85 90 95Trp Lys
Lys Ala His Asn Ile Leu Leu Pro Ser Phe Ser Gln Gln Ala 100
105 110Met Lys Gly Tyr His Ala Met Met Val
Asp Ile Ala Val Gln Leu Val 115 120
125Gln Lys Trp Glu Arg Leu Asn Ala Asp Glu His Ile Glu Val Pro Glu
130 135 140Asp Met Thr Arg Leu Thr Leu
Asp Thr Ile Gly Leu Cys Gly Phe Asn145 150
155 160Tyr Arg Phe Asn Ser Phe Tyr Arg Asp Gln Pro His
Pro Phe Ile Thr 165 170
175Ser Met Val Arg Ala Leu Asp Glu Ala Met Asn Lys Leu Gln Arg Ala
180 185 190Asn Pro Asp Asp Pro Ala
Tyr Asp Glu Asn Lys Arg Gln Phe Gln Glu 195 200
205Asp Ile Lys Val Met Asn Asp Leu Val Asp Lys Ile Ile Ala
Asp Arg 210 215 220Lys Ala Ser Gly Glu
Gln Ser Asp Asp Leu Leu Thr His Met Leu Asn225 230
235 240Gly Lys Asp Pro Glu Thr Gly Glu Pro Leu
Asp Asp Glu Asn Ile Arg 245 250
255Tyr Gln Ile Ile Thr Phe Leu Ile Ala Gly His Glu Thr Thr Ser Gly
260 265 270Leu Leu Ser Phe Ala
Leu Tyr Phe Leu Val Lys Asn Pro His Val Leu 275
280 285Gln Lys Ala Ala Glu Glu Ala Ala Arg Val Leu Val
Asp Pro Val Pro 290 295 300Ser Tyr Lys
Gln Val Lys Gln Leu Lys Tyr Val Gly Met Val Leu Asn305
310 315 320Glu Ala Leu Arg Leu Trp Pro
Thr Ala Pro Ala Phe Ser Leu Tyr Ala 325
330 335Lys Glu Asp Thr Val Leu Gly Gly Glu Tyr Pro Leu
Glu Lys Gly Asp 340 345 350Glu
Leu Met Val Leu Ile Pro Gln Leu His Arg Asp Lys Thr Ile Trp 355
360 365Gly Asp Asp Val Glu Glu Phe Arg Pro
Glu Arg Phe Glu Asn Pro Ser 370 375
380Ala Ile Pro Gln His Ala Phe Lys Pro Phe Gly Asn Gly Gln Arg Ala385
390 395 400Cys Ile Gly Gln
Gln Phe Ala Leu His Glu Ala Thr Leu Val Leu Gly 405
410 415Met Met Leu Lys His Phe Asp Phe Glu Asp
His Thr Asn Tyr Glu Leu 420 425
430Asp Ile Lys Glu Thr Leu Thr Leu Lys Pro Glu Gly Phe Val Val Lys
435 440 445Ala Lys Ser Lys Lys Ile Pro
Leu Gly Gly Ile Pro Ser Pro Ser Thr 450 455
460Glu Gln Ser Ala Lys Lys Val Arg Lys Lys Ala Glu Asn Ala His
Asn465 470 475 480Thr Pro
Leu Leu Val Leu Tyr Gly Ser Asn Met Gly Thr Ala Glu Gly
485 490 495Thr Ala Arg Asp Leu Ala Asp
Ile Ala Met Ser Lys Gly Phe Ala Pro 500 505
510Gln Val Ala Thr Leu Asp Ser His Ala Gly Asn Leu Pro Arg
Glu Gly 515 520 525Ala Val Leu Ile
Val Thr Ala Ser Tyr Asn Gly His Pro Pro Asp Asn 530
535 540Ala Lys Gln Phe Val Asp Trp Leu Asp Gln Ala Ser
Ala Asp Glu Val545 550 555
560Lys Gly Val Arg Tyr Ser Val Phe Gly Cys Gly Asp Lys Asn Trp Ala
565 570 575Thr Thr Tyr Gln Lys
Val Pro Ala Phe Ile Asp Glu Thr Leu Ala Ala 580
585 590Lys Gly Ala Glu Asn Ile Ala Asp Arg Gly Glu Ala
Asp Ala Ser Asp 595 600 605Asp Phe
Glu Gly Thr Tyr Glu Glu Trp Arg Glu His Met Trp Ser Asp 610
615 620Val Ala Ala Tyr Phe Asn Leu Asp Ile Glu Asn
Ser Glu Asp Asn Lys625 630 635
640Ser Thr Leu Ser Leu Gln Phe Val Asp Ser Ala Ala Asp Met Pro Leu
645 650 655Ala Lys Met His
Gly Ala Phe Ser Thr Asn Val Val Ala Ser Lys Glu 660
665 670Leu Gln Gln Pro Gly Ser Ala Arg Ser Thr Arg
His Leu Glu Ile Glu 675 680 685Leu
Pro Lys Glu Ala Ser Tyr Gln Glu Gly Asp His Leu Gly Val Ile 690
695 700Pro Arg Asn Tyr Glu Gly Ile Val Asn Arg
Val Thr Ala Arg Phe Gly705 710 715
720Leu Asp Ala Ser Gln Gln Ile Arg Leu Glu Ala Glu Glu Glu Lys
Leu 725 730 735Ala His Leu
Pro Leu Ala Lys Thr Val Ser Val Glu Glu Leu Leu Gln 740
745 750Tyr Val Glu Leu Gln Asp Pro Val Thr Arg
Thr Gln Leu Arg Ala Met 755 760
765Ala Ala Lys Thr Val Cys Pro Pro His Lys Val Glu Leu Glu Ala Leu 770
775 780Leu Glu Lys Gln Ala Tyr Lys Glu
Gln Val Leu Ala Lys Arg Leu Thr785 790
795 800Met Leu Glu Leu Leu Glu Lys Tyr Pro Ala Cys Glu
Met Lys Phe Ser 805 810
815Glu Phe Ile Ala Leu Leu Pro Ser Ile Arg Pro Arg Tyr Tyr Ser Ile
820 825 830Ser Ser Ser Pro Arg Val
Asp Glu Lys Gln Ala Ser Ile Thr Val Ser 835 840
845Val Val Ser Gly Glu Ala Trp Ser Gly Tyr Gly Glu Tyr Lys
Gly Ile 850 855 860Ala Ser Asn Tyr Leu
Ala Glu Leu Gln Glu Gly Asp Thr Ile Thr Cys865 870
875 880Phe Ile Ser Thr Pro Gln Ser Glu Phe Thr
Leu Pro Lys Asp Pro Glu 885 890
895Thr Pro Leu Ile Met Val Gly Pro Gly Thr Gly Val Ala Pro Phe Arg
900 905 910Gly Phe Val Gln Ala
Arg Lys Gln Leu Lys Glu Gln Gly Gln Ser Leu 915
920 925Gly Glu Ala His Leu Tyr Phe Gly Cys Arg Ser Pro
His Glu Asp Tyr 930 935 940Leu Tyr Gln
Glu Glu Leu Glu Asn Ala Gln Ser Glu Gly Ile Ile Thr945
950 955 960Leu His Thr Ala Phe Ser Arg
Met Pro Asn Gln Pro Lys Thr Tyr Val 965
970 975Gln His Val Met Glu Gln Asp Gly Lys Lys Leu Ile
Glu Leu Leu Asp 980 985 990Gln
Gly Ala His Phe Tyr Ile Cys Gly Asp Gly Ser Gln Met Ala Pro 995
1000 1005Ala Val Glu Ala Thr Leu Met Lys
Ser Tyr Ala Asp Val His Gln 1010 1015
1020Val Ser Glu Ala Asp Ala Arg Leu Trp Leu Gln Gln Leu Glu Glu
1025 1030 1035Lys Gly Arg Tyr Ala Lys
Asp Val Trp Ala Gly
1040 1045

SEQ ID NO: 53; Length: 142; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic Polypeptide; Sequence: 53
Ala Ala Thr Leu Glu Arg Ile Glu Lys Asn
Phe Val Ile Thr Asp Pro1 5 10
15Arg Leu Pro Asp Asn Pro Ile Ile Phe Ala Ser Asp Ser Phe Leu Gln
20 25 30Leu Thr Glu Tyr Ser Arg
Glu Glu Ile Leu Gly Arg Asn Cys Arg Phe 35 40
45Leu Gln Gly Pro Glu Thr Asp Arg Ala Thr Val Arg Lys Ile
Arg Asp 50 55 60Ala Ile Asp Asn Gln
Thr Glu Val Thr Val Gln Leu Ile Asn Tyr Thr65 70
75 80Lys Ser Gly Lys Lys Phe Trp Asn Leu Phe
His Leu Gln Pro Met Arg 85 90
95Asp Gln Lys Gly Asp Val Gln Tyr Phe Ile Gly Val Gln Leu Asp Gly
100 105 110Thr Glu His Val Arg
Asp Ala Ala Glu Arg Glu Gly Val Met Leu Ile 115
120 125Lys Lys Thr Ala Glu Asn Ile Asp Glu Ala Ala Lys
Glu Leu
130 135 140

SEQ ID NO: 54; Length: 748; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic Polypeptide; Sequence: 54
Met Ala Ser Val Ala Gly His Ala Ser Gly
Ser Pro Ala Phe Gly Thr1 5 10
15Ala Asp Leu Ser Asn Cys Glu Arg Glu Glu Ile His Leu Ala Gly Ser
20 25 30Ile Gln Pro His Gly Ala
Leu Leu Val Val Ser Glu Pro Asp His Arg 35 40
45Ile Ile Gln Ala Ser Ala Asn Ala Ala Glu Phe Leu Asn Leu
Gly Ser 50 55 60Val Leu Gly Val Pro
Leu Ala Glu Ile Asp Gly Asp Leu Leu Ile Lys65 70
75 80Ile Leu Pro His Leu Asp Pro Thr Ala Glu
Gly Met Pro Val Ala Val 85 90
95Arg Cys Arg Ile Gly Asn Pro Ser Thr Glu Tyr Asp Gly Leu Met His
100 105 110Arg Pro Pro Glu Gly
Gly Leu Ile Ile Glu Leu Glu Arg Ala Gly Pro 115
120 125Pro Ile Asp Leu Ser Gly Thr Leu Ala Pro Ala Leu
Glu Arg Ile Arg 130 135 140Thr Ala Gly
Ser Leu Arg Ala Leu Cys Asp Asp Thr Ala Leu Leu Phe145
150 155 160Gln Gln Cys Thr Gly Tyr Asp
Arg Val Met Val Tyr Arg Phe Asp Glu 165
170 175Gln Gly His Gly Glu Val Phe Ser Glu Arg His Val
Pro Gly Leu Glu 180 185 190Ser
Tyr Phe Gly Asn Arg Tyr Pro Ser Ser Asp Ile Pro Gln Met Ala 195
200 205Arg Arg Leu Tyr Glu Arg Gln Arg Val
Arg Val Leu Val Asp Val Ser 210 215
220Tyr Gln Pro Val Pro Leu Glu Pro Arg Leu Ser Pro Leu Thr Gly Arg225
230 235 240Asp Leu Asp Met
Ser Gly Cys Phe Leu Arg Ser Met Ser Pro Ile His 245
250 255Leu Gln Tyr Leu Lys Asn Met Gly Val Arg
Ala Thr Leu Val Val Ser 260 265
270Leu Val Val Gly Gly Lys Leu Trp Gly Leu Val Ala Cys His His Tyr
275 280 285Leu Pro Arg Phe Met His Phe
Glu Leu Arg Ala Ile Cys Glu Leu Leu 290 295
300Ala Glu Ala Ile Ala Thr Arg Ile Thr Ala Leu Glu Ser Phe Ala
Gln305 310 315 320Ser Gln
Ser Glu Leu Phe Val Gln Arg Leu Glu Gln Arg Met Ile Glu
325 330 335Ala Ile Thr Arg Glu Gly Asp
Trp Arg Ala Ala Ile Phe Asp Thr Ser 340 345
350Gln Ser Ile Leu Gln Pro Leu His Ala Ala Gly Cys Ala Leu
Val Tyr 355 360 365Glu Asp Gln Ile
Arg Thr Ile Gly Asp Val Pro Ser Thr Gln Asp Val 370
375 380Arg Glu Ile Ala Gly Trp Leu Asp Arg Gln Pro Arg
Ala Ala Val Thr385 390 395
400Ser Thr Ala Ser Leu Gly Leu Asp Val Pro Glu Leu Ala His Leu Thr
405 410 415Arg Met Ala Ser Gly
Val Val Ala Ala Pro Ile Ser Asp His Arg Gly 420
425 430Glu Phe Leu Met Trp Phe Arg Pro Glu Arg Val His
Thr Val Thr Trp 435 440 445Gly Gly
Asp Pro Lys Lys Pro Phe Thr Met Gly Asp Thr Pro Ala Asp 450
455 460Leu Ser Pro Arg Arg Ser Phe Ala Lys Trp His
Gln Val Val Glu Gly465 470 475
480Thr Ser Asp Pro Trp Thr Ala Ala Asp Leu Ala Ala Ala Arg Thr Ile
485 490 495Gly Gln Thr Val
Ala Asp Ile Val Leu Gln Phe Arg Ala Val Arg Thr 500
505 510Leu Ile Ala Arg Glu Gln Tyr Glu Gln Phe Ser
Ser Gln Val His Ala 515 520 525Ser
Met Gln Pro Val Leu Ile Thr Asp Ala Glu Gly Arg Ile Leu Leu 530
535 540Met Asn Asp Ser Phe Arg Asp Met Leu Pro
Ala Gly Ser Pro Ser Ala545 550 555
560Val His Leu Asp Asp Leu Ala Gly Phe Phe Val Glu Ser Asn Asp
Phe 565 570 575Leu Arg Asn
Val Ala Glu Leu Ile Asp His Gly Arg Gly Trp Arg Gly 580
585 590Glu Val Leu Leu Arg Gly Ala Gly Asn Arg
Pro Leu Pro Leu Ala Val 595 600
605Arg Ala Asp Pro Val Thr Arg Thr Glu Asp Gln Ser Leu Gly Phe Val 610
615 620Leu Ile Phe Ser Asp Ala Thr Asp
Arg Arg Thr Ala Asp Ala Ala Arg625 630
635 640Thr Arg Phe Gln Glu Gly Ile Leu Ala Ser Ala Arg
Pro Gly Val Arg 645 650
655Leu Asp Ser Lys Ser Asp Leu Leu His Glu Lys Leu Leu Ser Ala Leu
660 665 670Val Glu Asn Ala Gln Leu
Ala Ala Leu Glu Ile Thr Tyr Gly Val Glu 675 680
685Thr Gly Arg Ile Ala Glu Leu Leu Glu Gly Val Arg Gln Ser
Met Leu 690 695 700Arg Thr Ala Glu Val
Leu Gly His Leu Val Gln His Ala Ala Arg Thr705 710
715 720Ala Gly Ser Asp Ser Ser Ser Asn Gly Ser
Gln Asn Lys Lys Glu Phe 725 730
735Asp Ser Ala Gly Ser Ala Gly Ser Ala Gly Thr Ser 740
745

SEQ ID NO: 55; Length: 295; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic Polypeptide; Sequence: 55
Met
Gly Met Pro Thr Thr Ile Glu Arg Glu Phe Glu Glu Leu Asp Thr1
5 10 15Gln Arg Arg Trp Gln Pro Leu
Tyr Leu Glu Ile Arg Asn Glu Ser His 20 25
30Asp Tyr Pro His Arg Val Ala Lys Phe Pro Glu Asn Arg Asn
Arg Asn 35 40 45Arg Tyr Arg Asp
Val Ser Pro Tyr Asp His Ser Arg Val Lys Leu Gln 50 55
60Asn Ala Glu Asn Asp Tyr Ile Asn Ala Ser Leu Val Asp
Ile Glu Glu65 70 75
80Ala Gln Arg Ser Tyr Ile Leu Thr Gln Gly Pro Leu Pro Asn Thr Cys
85 90 95Cys His Phe Trp Leu Met
Val Trp Gln Gln Lys Thr Lys Ala Val Val 100
105 110Met Leu Asn Arg Ile Val Glu Lys Glu Ser Val Lys
Cys Ala Gln Tyr 115 120 125Trp Pro
Thr Asp Asp Gln Glu Met Leu Phe Lys Glu Thr Gly Phe Ser 130
135 140Val Lys Leu Leu Ser Glu Asp Val Lys Ser Tyr
Tyr Thr Val His Leu145 150 155
160Leu Gln Leu Glu Asn Ile Asn Ser Gly Glu Thr Arg Thr Ile Ser His
165 170 175Phe His Tyr Thr
Thr Trp Pro Asp Phe Gly Val Pro Glu Ser Pro Ala 180
185 190Ser Phe Leu Asn Phe Leu Phe Lys Val Arg Glu
Ser Gly Ser Leu Asn 195 200 205Pro
Asp His Gly Pro Ala Val Ile His Cys Ser Ala Gly Ile Gly Arg 210
215 220Ser Gly Thr Phe Ser Leu Val Asp Thr Cys
Leu Val Leu Met Glu Lys225 230 235
240Gly Asp Asp Ile Asn Ile Lys Gln Val Leu Leu Asn Met Arg Lys
Tyr 245 250 255Arg Met Gly
Leu Ile Gln Thr Pro Asp Gln Leu Arg Phe Ser Tyr Met 260
265 270Ala Ile Ile Glu Gly Ala Lys Cys Ile Lys
Gly Asp Ser Ser Ile Gln 275 280
285Lys Arg Trp Lys Glu Leu Ser 290
295

SEQ ID NO: 56; Length: 435; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic Polypeptide; Sequence: 56
Met Glu Met Glu Lys
Glu Phe Glu Gln Ile Asp Lys Ser Gly Ser Trp1 5
10 15Ala Ala Ile Tyr Gln Asp Ile Arg His Glu Ala
Ser Asp Phe Pro Cys 20 25
30Arg Val Ala Lys Leu Pro Lys Asn Lys Asn Arg Asn Arg Tyr Arg Asp
35 40 45Val Ser Pro Phe Asp His Ser Arg
Ile Lys Leu His Gln Glu Asp Asn 50 55
60Asp Tyr Ile Asn Ala Ser Leu Ile Lys Met Glu Glu Ala Gln Arg Ser65
70 75 80Tyr Ile Leu Thr Gln
Gly Pro Leu Pro Asn Thr Cys Gly His Phe Trp 85
90 95Glu Met Val Trp Glu Gln Lys Ser Arg Gly Val
Val Met Leu Asn Arg 100 105
110Val Met Glu Lys Gly Ser Leu Lys Cys Ala Gln Tyr Trp Pro Gln Lys
115 120 125Glu Glu Lys Glu Met Ile Phe
Glu Asp Thr Asn Leu Lys Leu Thr Leu 130 135
140Ile Ser Glu Asp Ile Lys Ser Tyr Tyr Thr Val Arg Gln Leu Glu
Leu145 150 155 160Glu Asn
Leu Thr Thr Gln Glu Thr Arg Glu Ile Leu His Phe His Tyr
165 170 175Thr Thr Trp Pro Asp Phe Gly
Val Pro Glu Ser Pro Ala Ser Phe Leu 180 185
190Asn Phe Leu Phe Lys Val Arg Glu Ser Gly Ser Leu Ser Pro
Glu His 195 200 205Gly Pro Val Val
Val His Cys Ser Ala Gly Ile Gly Arg Ser Gly Thr 210
215 220Phe Cys Leu Ala Asp Thr Cys Leu Leu Leu Met Asp
Lys Arg Lys Asp225 230 235
240Pro Ser Ser Val Asp Ile Lys Lys Val Leu Leu Glu Met Arg Lys Phe
245 250 255Arg Met Gly Leu Ile
Gln Thr Ala Asp Gln Leu Arg Phe Ser Tyr Leu 260
265 270Ala Val Ile Glu Gly Ala Lys Phe Ile Met Gly Asp
Ser Ser Val Gln 275 280 285Asp Gln
Trp Lys Glu Leu Ser His Glu Asp Leu Glu Pro Pro Pro Glu 290
295 300His Ile Pro Pro Pro Pro Arg Pro Pro Lys Arg
Ile Leu Glu Pro His305 310 315
320Asn Gly Lys Cys Arg Glu Phe Phe Pro Asn His Gln Trp Val Lys Glu
325 330 335Glu Thr Gln Glu
Asp Lys Asp Cys Pro Ile Lys Glu Glu Lys Gly Ser 340
345 350Pro Leu Asn Ala Ala Pro Tyr Gly Ile Glu Ser
Met Ser Gln Asp Thr 355 360 365Glu
Val Arg Ser Arg Val Val Gly Gly Ser Leu Arg Gly Ala Gln Ala 370
375 380Ala Ser Pro Ala Lys Gly Glu Pro Ser Leu
Pro Glu Lys Asp Glu Asp385 390 395
400His Ala Leu Ser Tyr Trp Lys Pro Phe Leu Val Asn Met Cys Val
Ala 405 410 415Thr Val Leu
Thr Ala Gly Ala Tyr Leu Cys Tyr Arg Phe Leu Phe Asn 420
425 430
Ser Asn Thr
435

SEQ ID NO: 57; Length: 473; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic Polypeptide; Sequence: 57
Met Asn Ile Lys Lys Phe Ala Lys Gln Ala
Thr Val Leu Thr Phe Thr1 5 10
15Thr Ala Leu Leu Ala Gly Gly Ala Thr Gln Ala Phe Ala Lys Glu Thr
20 25 30Asn Gln Lys Pro Tyr Lys
Glu Thr Tyr Gly Ile Ser His Ile Thr Arg 35 40
45His Asp Met Leu Gln Ile Pro Glu Gln Gln Lys Asn Glu Lys
Tyr Gln 50 55 60Val Pro Glu Phe Asp
Ser Ser Thr Ile Lys Asn Ile Ser Ser Ala Lys65 70
75 80Gly Leu Asp Val Trp Asp Ser Trp Pro Leu
Gln Asn Ala Asp Gly Thr 85 90
95Val Ala Asn Tyr His Gly Tyr His Ile Val Phe Ala Leu Ala Gly Asp
100 105 110Pro Lys Asn Ala Asp
Asp Thr Ser Ile Tyr Met Phe Tyr Gln Lys Val 115
120 125Gly Glu Thr Ser Ile Asp Ser Trp Lys Asn Ala Gly
Arg Val Phe Lys 130 135 140Asp Ser Asp
Lys Phe Asp Ala Asn Asp Ser Ile Leu Lys Asp Gln Thr145
150 155 160Gln Glu Trp Ser Gly Ser Ala
Thr Phe Thr Ser Asp Gly Lys Ile Arg 165
170 175Leu Phe Tyr Thr Asp Phe Ser Gly Lys His Tyr Gly
Lys Gln Thr Leu 180 185 190Thr
Thr Ala Gln Val Asn Val Ser Ala Ser Asp Ser Ser Leu Asn Ile 195
200 205Asn Gly Val Glu Asp Tyr Lys Ser Ile
Phe Asp Gly Asp Gly Lys Thr 210 215
220Tyr Gln Asn Val Gln Gln Phe Ile Asp Glu Gly Asn Tyr Ser Ser Gly225
230 235 240Asp Asn His Thr
Leu Arg Asp Pro His Tyr Val Glu Asp Lys Gly His 245
250 255Lys Tyr Leu Val Phe Glu Ala Asn Thr Gly
Thr Glu Asp Gly Tyr Gln 260 265
270Gly Glu Glu Ser Leu Phe Asn Lys Ala Tyr Tyr Gly Lys Ser Thr Ser
275 280 285Phe Phe Arg Gln Glu Ser Gln
Lys Leu Leu Gln Ser Asp Lys Lys Arg 290 295
300Thr Ala Glu Leu Ala Asn Gly Ala Leu Gly Met Ile Glu Leu Asn
Asp305 310 315 320Asp Tyr
Thr Leu Lys Lys Val Met Lys Pro Leu Ile Ala Ser Asn Thr
325 330 335Val Thr Asp Glu Ile Glu Arg
Ala Asn Val Phe Lys Met Asn Gly Lys 340 345
350Trp Tyr Leu Phe Thr Asp Ser Arg Gly Ser Lys Met Thr Ile
Asp Gly 355 360 365Ile Thr Ser Asn
Asp Ile Tyr Met Leu Gly Tyr Val Ser Asn Ser Leu 370
375 380Thr Gly Pro Tyr Lys Pro Leu Asn Lys Thr Gly Leu
Val Leu Lys Met385 390 395
400Asp Leu Asp Pro Asn Asp Val Thr Phe Thr Tyr Ser His Phe Ala Val
405 410 415Pro Gln Ala Lys Gly
Asn Asn Val Val Ile Thr Ser Tyr Met Thr Asn 420
425 430Arg Gly Phe Tyr Ala Asp Lys Gln Ser Thr Phe Ala
Pro Ser Phe Leu 435 440 445Leu Asn
Ile Lys Gly Lys Lys Thr Ser Val Val Lys Asp Ser Ile Leu 450
455 460Glu Gln Gly Gln Leu Thr Val Asn Lys465
470

SEQ ID NO: 58; Length: 382; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic Polypeptide; Sequence: 58
Met Ser
Leu Lys Glu Lys Thr Gln Ser Leu Phe Ala Asn Ala Phe Gly1 5
10 15Tyr Pro Ala Thr His Thr Ile Gln
Ala Pro Gly Arg Val Asn Leu Ile 20 25
30Gly Glu His Thr Asp Tyr Asn Asp Gly Phe Val Leu Pro Cys Ala
Ile 35 40 45Asp Tyr Gln Thr Val
Ile Ser Cys Ala Pro Arg Asp Asp Arg Lys Val 50 55
60Arg Val Met Ala Ala Asp Tyr Glu Asn Gln Leu Asp Glu Phe
Ser Leu65 70 75 80Asp
Ala Pro Ile Val Ala His Glu Asn Tyr Gln Trp Ala Asn Tyr Val
85 90 95Arg Gly Val Val Lys His Leu
Gln Leu Arg Asn Asn Ser Phe Gly Gly 100 105
110Val Asp Met Val Ile Ser Gly Asn Val Pro Gln Gly Ala Gly
Leu Ser 115 120 125Ser Ser Ala Ser
Leu Glu Val Ala Val Gly Thr Val Leu Gln Gln Leu 130
135 140Tyr His Leu Pro Leu Asp Gly Ala Gln Ile Ala Leu
Asn Gly Gln Glu145 150 155
160Ala Glu Asn Gln Phe Val Gly Cys Asn Cys Gly Ile Met Asp Gln Leu
165 170 175Ile Ser Ala Leu Gly
Lys Lys Asp His Ala Leu Leu Ile Asp Cys Arg 180
185 190Ser Leu Gly Thr Lys Ala Val Ser Met Pro Lys Gly
Val Ala Val Val 195 200 205Ile Ile
Asn Ser Asn Phe Lys Arg Thr Leu Val Gly Ser Glu Tyr Asn 210
215 220Thr Arg Arg Glu Gln Cys Glu Thr Gly Ala Arg
Phe Phe Gln Gln Pro225 230 235
240Ala Leu Arg Asp Val Thr Ile Glu Glu Phe Asn Ala Val Ala His Glu
245 250 255Leu Asp Pro Ile
Val Ala Lys Arg Val Arg His Ile Leu Thr Glu Asn 260
265 270Ala Arg Thr Val Glu Ala Ala Ser Ala Leu Glu
Gln Gly Asp Leu Lys 275 280 285Arg
Met Gly Glu Leu Met Ala Glu Ser His Ala Ser Met Arg Asp Asp 290
295 300Phe Glu Ile Thr Val Pro Gln Ile Asp Thr
Leu Val Glu Ile Val Lys305 310 315
320Ala Val Ile Gly Asp Lys Gly Gly Val Arg Met Thr Gly Gly Gly
Phe 325 330 335Gly Gly Cys
Ile Val Ala Leu Ile Pro Glu Glu Leu Val Pro Ala Val 340
345 350Gln Gln Ala Val Ala Glu Gln Tyr Glu Ala
Lys Thr Gly Ile Lys Glu 355 360
365Thr Phe Tyr Val Cys Lys Pro Ser Gln Gly Ala Gly Gln Cys 370
375 380

SEQ ID NO: 59; Length: 593; Type: PRT; Organism: Artificial Sequence; Other information: Synthetic Polypeptide; Sequence: 59
Met Ala Gln Ile Ser Glu Ser Val Ser Pro Ser Thr Asp Leu Lys
Ser1 5 10 15Thr Glu Ser
Ser Ile Thr Ser Asn Arg His Gly Asn Met Trp Glu Asp 20
25 30Asp Arg Ile Gln Ser Leu Asn Ser Pro Tyr
Gly Ala Pro Ala Tyr Gln 35 40
45Glu Arg Ser Glu Lys Leu Ile Glu Glu Ile Lys Leu Leu Phe Leu Ser 50
55 60Asp Met Asp Asp Ser Cys Asn Asp Ser
Asp Arg Asp Leu Ile Lys Arg65 70 75
80Leu Glu Ile Val Asp Thr Val Glu Cys Leu Gly Ile Asp Arg
His Phe 85 90 95Gln Pro
Glu Ile Lys Leu Ala Leu Asp Tyr Val Tyr Arg Cys Trp Asn 100
105 110Glu Arg Gly Ile Gly Glu Gly Ser Arg
Asp Ser Leu Lys Lys Asp Leu 115 120
125Asn Ala Thr Ala Leu Gly Phe Arg Ala Leu Arg Leu His Arg Tyr Asn
130 135 140Val Ser Ser Gly Val Leu Glu
Asn Phe Arg Asp Asp Asn Gly Gln Phe145 150
155 160Phe Cys Gly Ser Thr Val Glu Glu Glu Gly Ala Glu
Ala Tyr Asn Lys 165 170
175His Val Arg Cys Met Leu Ser Leu Ser Arg Ala Ser Asn Ile Leu Phe
180 185 190Pro Gly Glu Lys Val Met
Glu Glu Ala Lys Ala Phe Thr Thr Asn Tyr 195 200
205Leu Lys Lys Val Leu Ala Gly Arg Glu Ala Thr His Val Asp
Glu Ser 210 215 220Leu Leu Gly Glu Val
Lys Tyr Ala Leu Glu Phe Pro Trp His Cys Ser225 230
235 240Val Gln Arg Trp Glu Ala Arg Ser Phe Ile
Glu Ile Phe Gly Gln Ile 245 250
255Asp Ser Glu Leu Lys Ser Asn Leu Ser Lys Lys Met Leu Glu Leu Ala
260 265 270Lys Leu Asp Phe Asn
Ile Leu Gln Cys Thr His Gln Lys Glu Leu Gln 275
280 285Ile Ile Ser Arg Trp Phe Ala Asp Ser Ser Ile Ala
Ser Leu Asn Phe 290 295 300Tyr Arg Lys
Cys Tyr Val Glu Phe Tyr Phe Trp Met Ala Ala Ala Ile305
310 315 320Ser Glu Pro Glu Phe Ser Gly
Ser Arg Val Ala Phe Thr Lys Ile Ala 325
330 335Ile Leu Met Thr Met Leu Asp Asp Leu Tyr Asp Thr
His Gly Thr Leu 340 345 350Asp
Gln Leu Lys Ile Phe Thr Glu Gly Val Arg Arg Trp Asp Val Ser 355
360 365Leu Val Glu Gly Leu Pro Asp Phe Met
Lys Ile Ala Phe Glu Phe Trp 370 375
380Leu Lys Thr Ser Asn Glu Leu Ile Ala Glu Ala Val Lys Ala Gln Gly385
390 395 400Gln Asp Met Ala
Ala Tyr Ile Arg Lys Asn Ala Trp Glu Arg Tyr Leu 405
410 415Glu Ala Tyr Leu Gln Asp Ala Glu Trp Ile
Ala Thr Gly His Val Pro 420 425
430Thr Phe Asp Glu Tyr Leu Asn Asn Gly Thr Pro Asn Thr Gly Met Cys
435 440 445Val Leu Asn Leu Ile Pro Leu
Leu Leu Met Gly Glu His Leu Pro Ile 450 455
460Asp Ile Leu Glu Gln Ile Phe Leu Pro Ser Arg Phe His His Leu
Ile465 470 475 480Glu Leu
Ala Ser Arg Leu Val Asp Asp Ala Arg Asp Phe Gln Ala Glu
485 490 495Lys Asp His Gly Asp Leu Ser
Cys Ile Glu Cys Tyr Leu Lys Asp His 500 505
510Pro Glu Ser Thr Val Glu Asp Ala Leu Asn His Val Asn Gly
Leu Leu 515 520 525Gly Asn Cys Leu
Leu Glu Met Asn Trp Lys Phe Leu Lys Lys Gln Asp 530
535 540Ser Val Pro Leu Ser Cys Lys Lys Tyr Ser Phe His
Val Leu Ala Arg545 550 555
560Ser Ile Gln Phe Met Tyr Asn Gln Gly Asp Gly Phe Ser Ile Ser Asn
565 570 575Lys Val Ile Lys Asp
Gln Val Gln Lys Val Leu Ile Val Pro Val Pro 580
585 590 Ile
60 546 PRT Artificial Sequence Synthetic Polypeptide 60
Met Ala Leu Thr Glu Glu Lys Pro Ile Arg Pro Ile Ala Asn Phe
Pro1 5 10 15Pro Ser Ile
Trp Gly Asp Gln Phe Leu Ile Tyr Glu Lys Gln Val Glu 20
25 30Gln Gly Val Glu Gln Ile Val Asn Asp Leu
Lys Lys Glu Val Arg Gln 35 40
45Leu Leu Lys Glu Ala Leu Asp Ile Pro Met Lys His Ala Asn Leu Leu 50
55 60Lys Leu Ile Asp Glu Ile Gln Arg Leu
Gly Ile Pro Tyr His Phe Glu65 70 75
80Arg Glu Ile Asp His Ala Leu Gln Cys Ile Tyr Glu Thr Tyr
Gly Asp 85 90 95Asn Trp
Asn Gly Asp Arg Ser Ser Leu Trp Phe Arg Leu Met Arg Lys 100
105 110Gln Gly Tyr Tyr Val Thr Cys Asp Val
Phe Asn Asn Tyr Lys Asp Lys 115 120
125Asn Gly Ala Phe Lys Gln Ser Leu Ala Asn Asp Val Glu Gly Leu Leu
130 135 140Glu Leu Tyr Glu Ala Thr Ser
Met Arg Val Pro Gly Glu Ile Ile Leu145 150
155 160Glu Asp Ala Leu Gly Phe Thr Arg Ser Arg Leu Ser
Ile Met Thr Lys 165 170
175Asp Ala Phe Ser Thr Asn Pro Ala Leu Phe Thr Glu Ile Gln Arg Ala
180 185 190Leu Lys Gln Pro Leu Trp
Lys Arg Leu Pro Arg Ile Glu Ala Ala Gln 195 200
205Tyr Ile Pro Phe Tyr Gln Gln Gln Asp Ser His Asn Lys Thr
Leu Leu 210 215 220Lys Leu Ala Lys Leu
Glu Phe Asn Leu Leu Gln Ser Leu His Lys Glu225 230
235 240Glu Leu Ser His Val Cys Lys Trp Trp Lys
Ala Phe Asp Ile Lys Lys 245 250
255Asn Ala Pro Cys Leu Arg Asp Arg Ile Val Glu Cys Tyr Phe Trp Gly
260 265 270Leu Gly Ser Gly Tyr
Glu Pro Gln Tyr Ser Arg Ala Arg Val Phe Phe 275
280 285Thr Lys Ala Val Ala Val Ile Thr Leu Ile Asp Asp
Thr Tyr Asp Ala 290 295 300Tyr Gly Thr
Tyr Glu Glu Leu Lys Ile Phe Thr Glu Ala Val Glu Arg305
310 315 320Trp Ser Ile Thr Cys Leu Asp
Thr Leu Pro Glu Tyr Met Lys Pro Ile 325
330 335Tyr Lys Leu Phe Met Asp Thr Tyr Thr Glu Met Glu
Glu Phe Leu Ala 340 345 350Lys
Glu Gly Arg Thr Asp Leu Phe Asn Cys Gly Lys Glu Phe Val Lys 355
360 365Glu Phe Val Arg Asn Leu Met Val Glu
Ala Lys Trp Ala Asn Glu Gly 370 375
380His Ile Pro Thr Thr Glu Glu His Asp Pro Val Val Ile Ile Thr Gly385
390 395 400Gly Ala Asn Leu
Leu Thr Thr Thr Cys Tyr Leu Gly Met Ser Asp Ile 405
410 415Phe Thr Lys Glu Ser Val Glu Trp Ala Val
Ser Ala Pro Pro Leu Phe 420 425
430Arg Tyr Ser Gly Ile Leu Gly Arg Arg Leu Asn Asp Leu Met Thr His
435 440 445Lys Ala Glu Gln Glu Arg Lys
His Ser Ser Ser Ser Leu Glu Ser Tyr 450 455
460Met Lys Glu Tyr Asn Val Asn Glu Glu Tyr Ala Gln Thr Leu Ile
Tyr465 470 475 480Lys Glu
Val Glu Asp Val Trp Lys Asp Ile Asn Arg Glu Tyr Leu Thr
485 490 495Thr Lys Asn Ile Pro Arg Pro
Leu Leu Met Ala Val Ile Tyr Leu Cys 500 505
510Gln Phe Leu Glu Val Gln Tyr Ala Gly Lys Asp Asn Phe Thr
Arg Met 515 520 525Gly Asp Glu Tyr
Lys His Leu Ile Lys Ser Leu Leu Val Tyr Pro Met 530
535 540 Ser Ile 545
61 802 PRT Artificial Sequence Synthetic Polypeptide 61
Met Ser Ser Ser Thr Gly Thr Ser Lys Val Val Ser Glu Thr Ser
Ser1 5 10 15Thr Ile Val
Asp Asp Ile Pro Arg Leu Ser Ala Asn Tyr His Gly Asp 20
25 30Leu Trp His His Asn Val Ile Gln Thr Leu
Glu Thr Pro Phe Arg Glu 35 40
45Ser Ser Thr Tyr Gln Glu Arg Ala Asp Glu Leu Val Val Lys Ile Lys 50
55 60Asp Met Phe Asn Ala Leu Gly Asp Gly
Asp Ile Ser Pro Ser Ala Tyr65 70 75
80Asp Thr Ala Trp Val Ala Arg Leu Ala Thr Ile Ser Ser Asp
Gly Ser 85 90 95Glu Lys
Pro Arg Phe Pro Gln Ala Leu Asn Trp Val Phe Asn Asn Gln 100
105 110Leu Gln Asp Gly Ser Trp Gly Ile Glu
Ser His Phe Ser Leu Cys Asp 115 120
125Arg Leu Leu Asn Thr Thr Asn Ser Val Ile Ala Leu Ser Val Trp Lys
130 135 140Thr Gly His Ser Gln Val Gln
Gln Gly Ala Glu Phe Ile Ala Glu Asn145 150
155 160Leu Arg Leu Leu Asn Glu Glu Asp Glu Leu Ser Pro
Asp Phe Gln Ile 165 170
175Ile Phe Pro Ala Leu Leu Gln Lys Ala Lys Ala Leu Gly Ile Asn Leu
180 185 190Pro Tyr Asp Leu Pro Phe
Ile Lys Tyr Leu Ser Thr Thr Arg Glu Ala 195 200
205Arg Leu Thr Asp Val Ser Ala Ala Ala Asp Asn Ile Pro Ala
Asn Met 210 215 220Leu Asn Ala Leu Glu
Gly Leu Glu Glu Val Ile Asp Trp Asn Lys Ile225 230
235 240Met Arg Phe Gln Ser Lys Asp Gly Ser Phe
Leu Ser Ser Pro Ala Ser 245 250
255Thr Ala Cys Val Leu Met Asn Thr Gly Asp Glu Lys Cys Phe Thr Phe
260 265 270Leu Asn Asn Leu Leu
Asp Lys Phe Gly Gly Cys Val Pro Cys Met Tyr 275
280 285Ser Ile Asp Leu Leu Glu Arg Leu Ser Leu Val Asp
Asn Ile Glu His 290 295 300Leu Gly Ile
Gly Arg His Phe Lys Gln Glu Ile Lys Gly Ala Leu Asp305
310 315 320Tyr Val Tyr Arg His Trp Ser
Glu Arg Gly Ile Gly Trp Gly Arg Asp 325
330 335Ser Leu Val Pro Asp Leu Asn Thr Thr Ala Leu Gly
Leu Arg Thr Leu 340 345 350Arg
Met His Gly Tyr Asn Val Ser Ser Asp Val Leu Asn Asn Phe Lys 355
360 365Asp Glu Asn Gly Arg Phe Phe Ser Ser
Ala Gly Gln Thr His Val Glu 370 375
380Leu Arg Ser Val Val Asn Leu Phe Arg Ala Ser Asp Leu Ala Phe Pro385
390 395 400Asp Glu Arg Ala
Met Asp Asp Ala Arg Lys Phe Ala Glu Pro Tyr Leu 405
410 415Arg Glu Ala Leu Ala Thr Lys Ile Ser Thr
Asn Thr Lys Leu Phe Lys 420 425
430Glu Ile Glu Tyr Val Val Glu Tyr Pro Trp His Met Ser Ile Pro Arg
435 440 445Leu Glu Ala Arg Ser Tyr Ile
Asp Ser Tyr Asp Asp Asn Tyr Val Trp 450 455
460Gln Arg Lys Thr Leu Tyr Arg Met Pro Ser Leu Ser Asn Ser Lys
Cys465 470 475 480Leu Glu
Leu Ala Lys Leu Asp Phe Asn Ile Val Gln Ser Leu His Gln
485 490 495Glu Glu Leu Lys Leu Leu Thr
Arg Trp Trp Lys Glu Ser Gly Met Ala 500 505
510Asp Ile Asn Phe Thr Arg His Arg Val Ala Glu Val Tyr Phe
Ser Ser 515 520 525Ala Thr Phe Glu
Pro Glu Tyr Ser Ala Thr Arg Ile Ala Phe Thr Lys 530
535 540Ile Gly Cys Leu Gln Val Leu Phe Asp Asp Met Ala
Asp Ile Phe Ala545 550 555
560Thr Leu Asp Glu Leu Lys Ser Phe Thr Glu Gly Val Lys Arg Trp Asp
565 570 575Thr Ser Leu Leu His
Glu Ile Pro Glu Cys Met Gln Thr Cys Phe Lys 580
585 590Val Trp Phe Lys Leu Met Glu Glu Val Asn Asn Asp
Val Val Lys Val 595 600 605Gln Gly
Arg Asp Met Leu Ala His Ile Arg Lys Pro Trp Glu Leu Tyr 610
615 620Phe Asn Cys Tyr Val Gln Glu Arg Glu Trp Leu
Glu Ala Gly Tyr Ile625 630 635
640Pro Thr Phe Glu Glu Tyr Leu Lys Thr Tyr Ala Ile Ser Val Gly Leu
645 650 655Gly Pro Cys Thr
Leu Gln Pro Ile Leu Leu Met Gly Glu Leu Val Lys 660
665 670Asp Asp Val Val Glu Lys Val His Tyr Pro Ser
Asn Met Phe Glu Leu 675 680 685Val
Ser Leu Ser Trp Arg Leu Thr Asn Asp Thr Lys Thr Tyr Gln Ala 690
695 700Glu Lys Ala Arg Gly Gln Gln Ala Ser Gly
Ile Ala Cys Tyr Met Lys705 710 715
720Asp Asn Pro Gly Ala Thr Glu Glu Asp Ala Ile Lys His Ile Cys
Arg 725 730 735Val Val Asp
Arg Ala Leu Lys Glu Ala Ser Phe Glu Tyr Phe Lys Pro 740
745 750Ser Asn Asp Ile Pro Met Gly Cys Lys Ser
Phe Ile Phe Asn Leu Arg 755 760
765Leu Cys Val Gln Ile Phe Tyr Lys Phe Ile Asp Gly Tyr Gly Ile Ala 770
775 780Asn Glu Glu Ile Lys Asp Tyr Ile
Arg Lys Val Tyr Ile Asp Pro Ile785 790
795 800 Gln Val
62 317 PRT Artificial Sequence Synthetic Polypeptide 62
Met Pro Thr Thr Ile Glu Arg Glu Phe Glu Glu Leu Asp Thr Gln
Arg1 5 10 15Arg Trp Gln
Pro Leu Tyr Leu Glu Ile Arg Asn Glu Ser His Asp Tyr 20
25 30Pro His Arg Val Ala Lys Phe Pro Glu Asn
Arg Asn Arg Asn Arg Tyr 35 40
45Arg Asp Val Ser Pro Tyr Asp His Ser Arg Val Lys Leu Gln Asn Ala 50
55 60Glu Asn Asp Tyr Ile Asn Ala Ser Leu
Val Asp Ile Glu Glu Ala Gln65 70 75
80Arg Ser Tyr Ile Leu Thr Gln Gly Pro Leu Pro Asn Thr Cys
Cys His 85 90 95Phe Trp
Leu Met Val Trp Gln Gln Lys Thr Lys Ala Val Val Met Leu 100
105 110Asn Arg Ile Val Glu Lys Glu Ser Val
Lys Cys Ala Gln Tyr Trp Pro 115 120
125Thr Asp Asp Gln Glu Met Leu Phe Lys Glu Thr Gly Phe Ser Val Lys
130 135 140Leu Leu Ser Glu Asp Val Lys
Ser Tyr Tyr Thr Val His Leu Leu Gln145 150
155 160Leu Glu Asn Ile Asn Ser Gly Glu Thr Arg Thr Ile
Ser His Phe His 165 170
175Tyr Thr Thr Trp Pro Asp Phe Gly Val Pro Glu Ser Pro Ala Ser Phe
180 185 190Leu Asn Phe Leu Phe Lys
Val Arg Glu Ser Gly Ser Leu Asn Pro Asp 195 200
205His Gly Pro Ala Val Ile His Cys Ser Ala Gly Ile Gly Arg
Ser Gly 210 215 220Thr Phe Ser Leu Val
Asp Thr Cys Leu Val Leu Met Glu Lys Gly Asp225 230
235 240Asp Ile Asn Ile Lys Gln Val Leu Leu Asn
Met Arg Lys Tyr Arg Met 245 250
255Gly Leu Ile Gln Thr Pro Asp Gln Leu Arg Phe Ser Tyr Met Ala Ile
260 265 270Ile Glu Gly Ala Lys
Cys Ile Lys Gly Asp Ser Ser Ile Gln Lys Arg 275
280 285Trp Lys Glu Leu Ser Lys Glu Asp Leu Ser Pro Ala
Phe Asp His Ser 290 295 300Pro Asn Lys
Ile Met Thr Glu Lys Tyr Asn Gly Asn Arg305 310
315
63 299 PRT Artificial Sequence Synthetic Polypeptide 63
Met Ser Ser
Gly Val Asp Leu Gly Thr Glu Asn Leu Tyr Phe Gln Ser1 5
10 15Met Ser Arg Val Leu Gln Ala Glu Glu
Leu His Glu Lys Ala Leu Asp 20 25
30Pro Phe Leu Leu Gln Ala Glu Phe Phe Glu Ile Pro Met Asn Phe Val
35 40 45Asp Pro Lys Glu Tyr Asp Ile
Pro Gly Leu Val Arg Lys Asn Arg Tyr 50 55
60Lys Thr Ile Leu Pro Asn Pro His Ser Arg Val Cys Leu Thr Ser Pro65
70 75 80Asp Pro Asp Asp
Pro Leu Ser Ser Tyr Ile Asn Ala Asn Tyr Ile Arg 85
90 95Gly Tyr Gly Gly Glu Glu Lys Val Tyr Ile
Ala Thr Gln Gly Pro Ile 100 105
110Val Ser Thr Val Ala Asp Phe Trp Arg Met Val Trp Gln Glu His Thr
115 120 125Pro Ile Ile Val Met Ile Thr
Asn Ile Glu Glu Met Asn Glu Lys Cys 130 135
140Thr Glu Tyr Trp Pro Glu Glu Gln Val Ala Tyr Asp Gly Val Glu
Ile145 150 155 160Thr Val
Gln Lys Val Ile His Thr Glu Asp Tyr Arg Leu Arg Leu Ile
165 170 175Ser Leu Lys Ser Gly Thr Glu
Glu Arg Gly Leu Lys His Tyr Trp Phe 180 185
190Thr Ser Trp Pro Asp Gln Lys Thr Pro Asp Arg Ala Pro Pro
Leu Leu 195 200 205His Leu Val Arg
Glu Val Glu Glu Ala Ala Gln Gln Glu Gly Pro His 210
215 220Cys Ala Pro Ile Ile Val His Cys Ser Ala Gly Ile
Gly Arg Thr Gly225 230 235
240Cys Phe Ile Ala Thr Ser Ile Cys Cys Gln Gln Leu Arg Gln Glu Gly
245 250 255Val Val Asp Ile Leu
Lys Thr Thr Cys Gln Leu Arg Gln Asp Arg Gly 260
265 270Gly Met Ile Gln Thr Cys Glu Gln Tyr Gln Phe Val
His His Val Met 275 280 285Ser Leu
Tyr Glu Lys Gln Leu Ser His Gln Ser 290
295
64 595 PRT Artificial Sequence Synthetic Polypeptide 64
Met Val Arg Trp Phe
His Arg Asp Leu Ser Gly Leu Asp Ala Glu Thr1 5
10 15Leu Leu Lys Gly Arg Gly Val His Gly Ser Phe
Leu Ala Arg Pro Ser 20 25
30Arg Lys Asn Gln Gly Asp Phe Ser Leu Ser Val Arg Val Gly Asp Gln
35 40 45Val Thr His Ile Arg Ile Gln Asn
Ser Gly Asp Phe Tyr Asp Leu Tyr 50 55
60Gly Gly Glu Lys Phe Ala Thr Leu Thr Glu Leu Val Glu Tyr Tyr Thr65
70 75 80Gln Gln Gln Gly Val
Val Gln Asp Arg Asp Gly Thr Ile Ile His Leu 85
90 95Lys Tyr Pro Leu Asn Cys Ser Asp Pro Thr Ser
Glu Arg Trp Tyr His 100 105
110Gly His Met Ser Gly Gly Gln Ala Glu Thr Leu Leu Gln Ala Lys Gly
115 120 125Glu Pro Trp Thr Phe Leu Val
Arg Glu Ser Leu Ser Gln Pro Gly Asp 130 135
140Phe Val Leu Ser Val Leu Ser Asp Gln Pro Lys Ala Gly Pro Gly
Ser145 150 155 160Pro Leu
Arg Val Thr His Ile Lys Val Met Cys Glu Gly Gly Arg Tyr
165 170 175Thr Val Gly Gly Leu Glu Thr
Phe Asp Ser Leu Thr Asp Leu Val Glu 180 185
190His Phe Lys Lys Thr Gly Ile Glu Glu Ala Ser Gly Ala Phe
Val Tyr 195 200 205Leu Arg Gln Pro
Tyr Tyr Ala Thr Arg Val Asn Ala Ala Asp Ile Glu 210
215 220Asn Arg Val Leu Glu Leu Asn Lys Lys Gln Glu Ser
Glu Asp Thr Ala225 230 235
240Lys Ala Gly Phe Trp Glu Glu Phe Glu Ser Leu Gln Lys Gln Glu Val
245 250 255Lys Asn Leu His Gln
Arg Leu Glu Gly Gln Arg Pro Glu Asn Lys Gly 260
265 270Lys Asn Arg Tyr Lys Asn Ile Leu Pro Phe Asp His
Ser Arg Val Ile 275 280 285Leu Gln
Gly Arg Asp Ser Asn Ile Pro Gly Ser Asp Tyr Ile Asn Ala 290
295 300Asn Tyr Ile Lys Asn Gln Leu Leu Gly Pro Asp
Glu Asn Ala Lys Thr305 310 315
320Tyr Ile Ala Ser Gln Gly Cys Leu Glu Ala Thr Val Asn Asp Phe Trp
325 330 335Gln Met Ala Trp
Gln Glu Asn Ser Arg Val Ile Val Met Thr Thr Arg 340
345 350Glu Val Glu Lys Gly Arg Asn Lys Cys Val Pro
Tyr Trp Pro Glu Val 355 360 365Gly
Met Gln Arg Ala Tyr Gly Pro Tyr Ser Val Thr Asn Cys Gly Glu 370
375 380His Asp Thr Thr Glu Tyr Lys Leu Arg Thr
Leu Gln Val Ser Pro Leu385 390 395
400Asp Asn Gly Asp Leu Ile Arg Glu Ile Trp His Tyr Gln Tyr Leu
Ser 405 410 415Trp Pro Asp
His Gly Val Pro Ser Glu Pro Gly Gly Val Leu Ser Phe 420
425 430Leu Asp Gln Ile Asn Gln Arg Gln Glu Ser
Leu Pro His Ala Gly Pro 435 440
445Ile Ile Val His Cys Ser Ala Gly Ile Gly Arg Thr Gly Thr Ile Ile 450
455 460Val Ile Asp Met Leu Met Glu Asn
Ile Ser Thr Lys Gly Leu Asp Cys465 470
475 480Asp Ile Asp Ile Gln Lys Thr Ile Gln Met Val Arg
Ala Gln Arg Ser 485 490
495Gly Met Val Gln Thr Glu Ala Gln Tyr Lys Phe Ile Tyr Val Ala Ile
500 505 510Ala Gln Phe Ile Glu Thr
Thr Lys Lys Lys Leu Glu Val Leu Gln Ser 515 520
525Gln Lys Gly Gln Glu Ser Glu Tyr Gly Asn Ile Thr Tyr Pro
Pro Ala 530 535 540Met Lys Asn Ala His
Ala Lys Ala Ser Arg Thr Ser Ser Lys His Lys545 550
555 560Glu Asp Val Tyr Glu Asn Leu His Thr Lys
Asn Lys Arg Glu Glu Lys 565 570
575Val Lys Lys Gln Arg Ser Ala Asp Lys Glu Lys Ser Lys Gly Ser Leu
580 585 590Lys Arg Lys
595
65 593 PRT Artificial Sequence Synthetic Polypeptide 65
Met Thr Ser Arg Arg
Trp Phe His Pro Asn Ile Thr Gly Val Glu Ala1 5
10 15Glu Asn Leu Leu Leu Thr Arg Gly Val Asp Gly
Ser Phe Leu Ala Arg 20 25
30Pro Ser Lys Ser Asn Pro Gly Asp Phe Thr Leu Ser Val Arg Arg Asn
35 40 45Gly Ala Val Thr His Ile Lys Ile
Gln Asn Thr Gly Asp Tyr Tyr Asp 50 55
60Leu Tyr Gly Gly Glu Lys Phe Ala Thr Leu Ala Glu Leu Val Gln Tyr65
70 75 80Tyr Met Glu His His
Gly Gln Leu Lys Glu Lys Asn Gly Asp Val Ile 85
90 95Glu Leu Lys Tyr Pro Leu Asn Cys Ala Asp Pro
Thr Ser Glu Arg Trp 100 105
110Phe His Gly His Leu Ser Gly Lys Glu Ala Glu Lys Leu Leu Thr Glu
115 120 125Lys Gly Lys His Gly Ser Phe
Leu Val Arg Glu Ser Gln Ser His Pro 130 135
140Gly Asp Phe Val Leu Ser Val Arg Thr Gly Asp Asp Lys Gly Glu
Ser145 150 155 160Asn Asp
Gly Lys Ser Lys Val Thr His Val Met Ile Arg Cys Gln Glu
165 170 175Leu Lys Tyr Asp Val Gly Gly
Gly Glu Arg Phe Asp Ser Leu Thr Asp 180 185
190Leu Val Glu His Tyr Lys Lys Asn Pro Met Val Glu Thr Leu
Gly Thr 195 200 205Val Leu Gln Leu
Lys Gln Pro Leu Asn Thr Thr Arg Ile Asn Ala Ala 210
215 220Glu Ile Glu Ser Arg Val Arg Glu Leu Ser Lys Leu
Ala Glu Thr Thr225 230 235
240Asp Lys Val Lys Gln Gly Phe Trp Glu Glu Phe Glu Thr Leu Gln Gln
245 250 255Gln Glu Cys Lys Leu
Leu Tyr Ser Arg Lys Glu Gly Gln Arg Gln Glu 260
265 270Asn Lys Asn Lys Asn Arg Tyr Lys Asn Ile Leu Pro
Phe Asp His Thr 275 280 285Arg Val
Val Leu His Asp Gly Asp Pro Asn Glu Pro Val Ser Asp Tyr 290
295 300Ile Asn Ala Asn Ile Ile Met Pro Glu Phe Glu
Thr Lys Cys Asn Asn305 310 315
320Ser Lys Pro Lys Lys Ser Tyr Ile Ala Thr Gln Gly Cys Leu Gln Asn
325 330 335Thr Val Asn Asp
Phe Trp Arg Met Val Phe Gln Glu Asn Ser Arg Val 340
345 350Ile Val Met Thr Thr Lys Glu Val Glu Arg Gly
Lys Ser Lys Cys Val 355 360 365Lys
Tyr Trp Pro Asp Glu Tyr Ala Leu Lys Glu Tyr Gly Val Met Arg 370
375 380Val Arg Asn Val Lys Glu Ser Ala Ala His
Asp Tyr Thr Leu Arg Glu385 390 395
400Leu Lys Leu Ser Lys Val Gly Gln Gly Asn Thr Glu Arg Thr Val
Trp 405 410 415Gln Tyr His
Phe Arg Thr Trp Pro Asp His Gly Val Pro Ser Asp Pro 420
425 430Gly Gly Val Leu Asp Phe Leu Glu Glu Val
His His Lys Gln Glu Ser 435 440
445Ile Met Asp Ala Gly Pro Val Val Val His Cys Ser Ala Gly Ile Gly 450
455 460Arg Thr Gly Thr Phe Ile Val Ile
Asp Ile Leu Ile Asp Ile Ile Arg465 470
475 480Glu Lys Gly Val Asp Cys Asp Ile Asp Val Pro Lys
Thr Ile Gln Met 485 490
495Val Arg Ser Gln Arg Ser Gly Met Val Gln Thr Glu Ala Gln Tyr Arg
500 505 510Phe Ile Tyr Met Ala Val
Gln His Tyr Ile Glu Thr Leu Gln Arg Arg 515 520
525Ile Glu Glu Glu Gln Lys Ser Lys Arg Lys Gly His Glu Tyr
Thr Asn 530 535 540Ile Lys Tyr Ser Leu
Ala Asp Gln Thr Ser Gly Asp Gln Ser Pro Leu545 550
555 560Pro Pro Cys Thr Pro Thr Pro Pro Cys Ala
Glu Met Arg Glu Asp Ser 565 570
575Ala Arg Val Tyr Glu Asn Val Gly Leu Met Gln Gln Gln Lys Ser Phe
580 585 590 Arg
66 425 PRT Artificial Sequence Synthetic Polypeptide 66
Met Glu Gln Val Glu Ile Leu Arg Lys Phe
Ile Gln Arg Val Gln Ala1 5 10
15Met Lys Ser Pro Asp His Asn Gly Glu Asp Asn Phe Ala Arg Asp Phe
20 25 30Met Arg Leu Arg Arg Leu
Ser Thr Lys Tyr Arg Thr Glu Lys Ile Tyr 35 40
45Pro Thr Ala Thr Gly Glu Lys Glu Glu Asn Val Lys Lys Asn
Arg Tyr 50 55 60Lys Asp Ile Leu Pro
Phe Asp His Ser Arg Val Lys Leu Thr Leu Lys65 70
75 80Thr Pro Ser Gln Asp Ser Asp Tyr Ile Asn
Ala Asn Phe Ile Lys Gly 85 90
95Val Tyr Gly Pro Lys Ala Tyr Val Ala Thr Gln Gly Pro Leu Ala Asn
100 105 110Thr Val Ile Asp Phe
Trp Arg Met Val Trp Glu Tyr Asn Val Val Ile 115
120 125Ile Val Met Ala Cys Arg Glu Phe Glu Met Gly Arg
Lys Lys Cys Glu 130 135 140Arg Tyr Trp
Pro Leu Tyr Gly Glu Asp Pro Ile Thr Phe Ala Pro Phe145
150 155 160Lys Ile Ser Cys Glu Asp Glu
Gln Ala Arg Thr Asp Tyr Phe Ile Arg 165
170 175Thr Leu Leu Leu Glu Phe Gln Asn Glu Ser Arg Arg
Leu Tyr Gln Phe 180 185 190His
Tyr Val Asn Trp Pro Asp His Asp Val Pro Ser Ser Phe Asp Ser 195
200 205Ile Leu Asp Met Ile Ser Leu Met Arg
Lys Tyr Gln Glu His Glu Asp 210 215
220Val Pro Ile Cys Ile His Cys Ser Ala Gly Cys Gly Arg Thr Gly Ala225
230 235 240Ile Cys Ala Ile
Asp Tyr Thr Trp Asn Leu Leu Lys Ala Gly Lys Ile 245
250 255Pro Glu Glu Phe Asn Val Phe Asn Leu Ile
Gln Glu Met Arg Thr Gln 260 265
270Arg His Ser Ala Val Gln Thr Lys Glu Gln Tyr Glu Leu Val His Arg
275 280 285Ala Ile Ala Gln Leu Phe Glu
Lys Gln Leu Gln Leu Tyr Glu Ile His 290 295
300Gly Ala Gln Lys Ile Ala Asp Gly Val Asn Glu Ile Asn Thr Glu
Asn305 310 315 320Met Val
Ser Ser Ile Glu Pro Glu Lys Gln Asp Ser Pro Pro Pro Lys
325 330 335Pro Pro Arg Thr Arg Ser Cys
Leu Val Glu Gly Asp Ala Lys Glu Glu 340 345
350Ile Leu Gln Pro Pro Glu Pro His Pro Val Pro Pro Ile Leu
Thr Pro 355 360 365Ser Pro Pro Ser
Ala Phe Pro Thr Val Thr Thr Val Trp Gln Asp Asn 370
375 380Asp Arg Tyr His Pro Lys Pro Val Leu Gln Trp Phe
His Gln Asn Asn385 390 395
400Ile Gln Gln Thr Ser Thr Glu Thr Ile Val Asn Gln Gln Asn Phe Gln
405 410 415Gly Lys Met Asn Gln
Gln Leu Asn Arg 420 425
67 299 PRT Artificial Sequence Synthetic Polypeptide 67
Met Asp Gln Arg Glu Ile Leu Gln Lys Phe
Leu Asp Glu Ala Gln Ser1 5 10
15Lys Lys Ile Thr Lys Glu Glu Phe Ala Asn Glu Phe Leu Lys Leu Lys
20 25 30Arg Gln Ser Thr Lys Tyr
Lys Ala Asp Lys Thr Tyr Pro Thr Thr Val 35 40
45Ala Glu Lys Pro Lys Asn Ile Lys Lys Asn Arg Tyr Lys Asp
Ile Leu 50 55 60Pro Tyr Asp Tyr Ser
Arg Val Glu Leu Ser Leu Ile Thr Ser Asp Glu65 70
75 80Asp Ser Ser Tyr Ile Asn Ala Asn Phe Ile
Lys Gly Val Tyr Gly Pro 85 90
95Lys Ala Tyr Ile Ala Thr Gln Gly Pro Leu Ser Thr Thr Leu Leu Asp
100 105 110Phe Trp Arg Met Ile
Trp Glu Tyr Ser Val Leu Ile Ile Val Met Ala 115
120 125Cys Met Glu Tyr Glu Met Gly Lys Lys Lys Cys Glu
Arg Tyr Trp Ala 130 135 140Glu Pro Gly
Glu Met Gln Leu Glu Phe Gly Pro Phe Ser Val Ser Cys145
150 155 160Glu Ala Glu Lys Arg Lys Ser
Asp Tyr Ile Ile Arg Thr Leu Lys Val 165
170 175Lys Phe Asn Ser Glu Thr Arg Thr Ile Tyr Gln Phe
His Tyr Lys Asn 180 185 190Trp
Pro Asp His Asp Val Pro Ser Ser Ile Asp Pro Ile Leu Glu Leu 195
200 205Ile Trp Asp Val Arg Cys Tyr Gln Glu
Asp Asp Ser Val Pro Ile Cys 210 215
220Ile His Cys Ser Ala Gly Cys Gly Arg Thr Gly Val Ile Cys Ala Ile225
230 235 240Asp Tyr Thr Trp
Met Leu Leu Lys Asp Gly Ile Ile Pro Glu Asn Phe 245
250 255Ser Val Phe Ser Leu Ile Arg Glu Met Arg
Thr Gln Arg Pro Ser Leu 260 265
270Val Gln Thr Gln Glu Gln Tyr Glu Leu Val Tyr Asn Ala Val Leu Glu
275 280 285Leu Phe Lys Arg Gln Met Asp
Val Ile Arg Asp 290 295
68 238 PRT Artificial Sequence Synthetic Polypeptide 68
Met Arg Lys Gly Glu Glu Leu Phe Thr Gly
Val Val Pro Ile Leu Val1 5 10
15Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Arg Gly Glu
20 25 30Gly Glu Gly Asp Ala Thr
Asn Gly Lys Leu Thr Leu Lys Phe Ile Cys 35 40
45Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr
Thr Leu 50 55 60Thr Tyr Gly Val Gln
Cys Phe Ala Arg Tyr Pro Asp His Met Lys Gln65 70
75 80His Asp Phe Phe Lys Ser Ala Met Pro Glu
Gly Tyr Val Gln Glu Arg 85 90
95Thr Ile Ser Phe Lys Asp Asp Gly Thr Tyr Lys Thr Arg Ala Glu Val
100 105 110Lys Phe Glu Gly Asp
Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile 115
120 125Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys
Leu Glu Tyr Asn 130 135 140Phe Asn Ser
His Asn Val Tyr Ile Thr Ala Asp Lys Gln Lys Asn Gly145
150 155 160Ile Lys Ala Asn Phe Lys Ile
Arg His Asn Val Glu Asp Gly Ser Val 165
170 175Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile
Gly Asp Gly Pro 180 185 190Val
Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Val Leu Ser 195
200 205Lys Asp Pro Asn Glu Lys Arg Asp His
Met Val Leu Leu Glu Phe Val 210 215
220Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys225
230 235
69 245 PRT Artificial Sequence Synthetic Polypeptide 69
Met His His His His His His Val Ser Lys Gly Glu Glu Leu Phe
Thr1 5 10 15Gly Val Val
Pro Ile Leu Val Glu Leu Asp Gly Asp Val Asn Gly His 20
25 30Lys Phe Ser Val Arg Gly Glu Gly Glu Gly
Asp Ala Thr Asn Gly Lys 35 40
45Leu Thr Leu Lys Phe Ile Cys Thr Thr Gly Lys Leu Pro Val Pro Trp 50
55 60Pro Thr Leu Val Thr Thr Phe Gly Tyr
Gly Val Ala Cys Phe Ser Arg65 70 75
80Tyr Pro Asp His Met Lys Gln His Asp Phe Phe Lys Ser Ala
Met Pro 85 90 95Glu Gly
Tyr Val Gln Glu Arg Thr Ile Ser Phe Lys Asp Asp Gly Thr 100
105 110Tyr Lys Thr Arg Ala Glu Val Lys Phe
Glu Gly Asp Thr Leu Val Asn 115 120
125Arg Ile Glu Leu Lys Gly Ile Asp Phe Lys Glu Asp Gly Asn Ile Leu
130 135 140Gly His Lys Leu Glu Tyr Asn
Phe Asn Ser His Asn Val Tyr Ile Thr145 150
155 160Ala Asp Lys Gln Lys Asn Gly Ile Lys Ala Asn Phe
Lys Ile Arg His 165 170
175Asn Val Glu Asp Gly Ser Val Gln Leu Ala Asp His Tyr Gln Gln Asn
180 185 190Thr Pro Ile Gly Asp Gly
Pro Val Leu Leu Pro Asp Asn His Tyr Leu 195 200
205Ser His Gln Ser Ala Leu Ser Lys Asp Pro Asn Glu Lys Arg
Asp His 210 215 220Met Val Leu Leu Glu
Phe Val Thr Ala Ala Gly Ile Thr His Gly Met225 230
235 240Asp Glu Leu Tyr Lys
245
70 129 DNA Artificial Sequence Synthetic Polynucleotide 70
atccggatat
agttcctcct ttcagcaaaa aacccctcaa gacccgttta gaggccccaa 60ggggttatgc
tagttattgc tcagcggtgg cagcagccaa ctcagcttcc tttcgggctt 120tgttagcag
129
71 426 DNA Artificial Sequence Synthetic Polynucleotide 71
ggctgttttg
gcggatgaga gaagattttc agcctgatac agattaaatc agaacgcaga 60agcggtctga
taaaacagaa tttgcctggc ggcagtagcg cggtggtccc acctgacccc 120atgccgaact
cagaagtgaa acgccgtagc gccgatggta gtgtggggtc accccatgcg 180agagtaggga
actgccaggc atcaaataaa acgaaaggct cagtcgaaag actgggcctt 240tcgttttatc
tgttgtttgt cggtgaacgc tctcctgagt aggacaaatc cgccgggagc 300ggatttgaac
gttgcgaagc aacggcccgg agggtggcgg gcaggacgcc cgccataaac 360tgccaggcat
caaattaagc agaaggccat cctgacggat ggcctttttg cgtttctaca 420aactct
426
72 158 DNA Artificial Sequence Synthetic Polynucleotide 72
tgcctggcgg
cagtagcgcg gtggtcccac ctgaccccat gccgaactca gaagtgaaac 60gccgtagcgc
cgatggtagt gtggggtctc cccatgcgag agtagggaac tgccaggcat 120caaataaaac
gaaaggctca gtcgaaagac tgggcctt
158
73 867 DNA Artificial Sequence Synthetic Polynucleotide 73
atgggctcca
agccgcagac tcagggcctg gccaaggatg cctgggagat ccctcgggag 60tcgctgcggc
tggaggtcaa gctgggccag ggctgctttg gcgaggtgtg gatggggacc 120tggaacggta
ccaccagggt ggccatcaaa accctgaagc ctggcacgat gtctccagag 180gccttcctgc
aggaggccca ggtcatgaag aagctgaggc atgagaagct ggtgcagttg 240tatgctgtgg
tttcagagga gcccatttac atcgtcacgg agtacatgag caaggggagt 300ttgctggact
ttctcaaggg ggagacaggc aagtacctgc ggctgcctca gctggtggac 360atggctgctc
agatcgcctc aggcatggcg tacgtggagc ggatgaacta cgtccaccgg 420gaccttcgtg
cagccaacat cctggtggga gagaacctgg tgtgcaaagt ggccgacttt 480gggctggctc
ggctcattga agacaatgag tacacggcgc ggcaaggtgc caaattcccc 540atcaagtgga
cggctccaga agctgccctc tatggccgct tcaccatcaa gtcggacgtg 600tggtccttcg
ggatcctgct gactgagctc accacaaagg gacgggtgcc ctaccctggg 660atggtgaacc
gcgaggtgct ggaccaggtg gagcggggct accggatgcc ctgcccgccg 720gagtgtcccg
agtccctgca cgacctcatg tgccagtgct ggcggaagga gcctgaggag 780cggcccacct
tcgagtacct gcaggccttc ctggaggact acttcacgtc caccgagccc 840cagtaccagc
ccggggagaa cctctaa
867
74 1137 DNA Artificial Sequence Synthetic Polynucleotide 74
atggtggact
acagcgtgtg ggaccacatt gaggtgtctg atgatgaaga cgagacgcac 60cccaacatcg
acacggccag tctcttccgc tggcggcatc aggcccgggt ggaacgcatg 120gagcagttcc
agaaggagaa ggaggaactg gacaggggct gccgcgagtg caagcgcaag 180gtggccgagt
gccagaggaa actgaaggag ctggaggtgg ccgagggcgg caaggcagag 240ctggagcgcc
tgcaggccga ggcacagcag ctgcgcaagg aggagcggag ctgggagcag 300aagctggagg
agatgcgcaa gaaggagaag agcatgccct ggaacgtgga cacgctcagc 360aaagacggct
tcagcaagag catggtaaat accaagcccg agaagacgga ggaggactca 420gaggaggtga
gggagcagaa acacaagacc ttcgtggaaa aatacgagaa acagatcaag 480cactttggca
tgcttcgccg ctgggatgac agccaaaagt acctgtcaga caacgtccac 540ctggtgtgcg
aggagacagc caattacctg gtcatttggt gcattgacct agaggtggag 600gagaaatgtg
cactcatgga gcaggtggcc caccagacaa tcgtcatgca atttatcctg 660gagctggcca
agagcctaaa ggtggacccc cgggcctgct tccggcagtt cttcactaag 720attaagacag
ccgatcgcca gtacatggag ggcttcaacg acgagctgga agccttcaag 780gagcgtgtgc
ggggccgtgc caagctgcgc atcgagaagg ccatgaagga gtacgaggag 840gaggagcgca
agaagcggct cggccccggc ggcctggacc ccgtcgaggt ctacgagtcc 900ctccctgagg
aactccagaa gtgcttcgat gtgaaggacg tgcagatgct gcaggacgcc 960atcagcaaga
tggaccccac cgacgcaaag taccacatgc agcgctgcat tgactctggc 1020ctctgggtcc
ccaactctaa ggccagcgag gccaaggagg gagaggaggc aggtcctggg 1080gacccattac
tggaagctgt tcccaagacg ggcgatgaga aggatgtcag tgtgtaa
1137
75 1308 DNA Artificial Sequence Synthetic Polynucleotide 75
atggagatgg
aaaaggagtt cgagcagatc gacaagtccg ggagctgggc ggccatttac 60caggatatcc
gacatgaagc cagtgacttc ccatgtagag tggccaagct tcctaagaac 120aaaaaccgaa
ataggtacag agacgtcagt ccctttgacc atagtcggat taaactacat 180caagaagata
atgactatat caacgctagt ttgataaaaa tggaagaagc ccaaaggagt 240tacattctta
cccagggccc tttgcctaac acatgcggtc acttttggga gatggtgtgg 300gagcagaaaa
gcaggggtgt cgtcatgctc aacagagtga tggagaaagg ttcgttaaaa 360tgcgcacaat
actggccaca aaaagaagaa aaagagatga tctttgaaga cacaaatttg 420aaattaacat
tgatctctga agatatcaag tcatattata cagtgcgaca gctagaattg 480gaaaacctta
caacccaaga aactcgagag atcttacatt tccactatac cacatggcct 540gactttggag
tccctgaatc accagcctca ttcttgaact ttcttttcaa agtccgagag 600tcagggtcac
tcagcccgga gcacgggccc gttgtggtgc actgcagtgc aggcatcggc 660aggtctggaa
ccttctgtct ggctgatacc tgcctcttgc tgatggacaa gaggaaagac 720ccttcttccg
ttgatatcaa gaaagtgctg ttagaaatga ggaagtttcg gatggggctg 780atccagacag
ccgaccagct gcgcttctcc tacctggctg tgatcgaagg tgccaaattc 840atcatggggg
actcttccgt gcaggatcag tggaaggagc tttcccacga ggacctggag 900cccccacccg
agcatatccc cccacctccc cggccaccca aacgaatcct ggagccacac 960aatgggaaat
gcagggagtt cttcccaaat caccagtggg tgaaggaaga gacccaggag 1020gataaagact
gccccatcaa ggaagaaaaa ggaagcccct taaatgccgc accctacggc 1080atcgaaagca
tgagtcaaga cactgaagtt agaagtcggg tcgtgggggg aagtcttcga 1140ggtgcccagg
ctgcctcccc agccaaaggg gagccgtcac tgcccgagaa ggacgaggac 1200catgcactga
gttactggaa gcccttcctg gtcaacatgt gcgtggctac ggtcctcacg 1260gccggcgctt
acctctgcta caggttcctg ttcaacagca acacatag
1308
76 1086 DNA Artificial Sequence Synthetic Polynucleotide 76
atgaaatttg
gaaacttttt gcttacatac caacctcccc aattttccca aacagaggta 60atgaaacgtt
tggttaaatt aggtcgcatc tctgaggagt gtggttttga taccgtatgg 120ttactggagc
atcatttcac ggagtttggt ttgcttggta acccttatgt cgctgctgca 180tatttacttg
gcgcgactaa aaaattgaat gtaggaactg ccgctattgt tcttcccaca 240gcccatccag
tacgccaact tgaagatgtg aatttattgg atcaaatgtc aaaaggacga 300tttcggtttg
gtatttgccg agggctttac aacaaggact ttcgcgtatt cggcacagat 360atgaataaca
gtcgcgcctt agcggaatgc tggtacgggc tgataaagaa tggcatgaca 420gagggatata
tggaagctga taatgaacat atcaagttcc ataaggtaaa agtaaacccc 480gcggcgtata
gcagaggtgg cgcaccggtt tatgtggtgg ctgaatcagc ttcgacgact 540gagtgggctg
ctcaatttgg cctaccgatg atattaagtt ggattataaa tactaacgaa 600aagaaagcac
aacttgagct ttataatgaa gtggctcaag aatatgggca cgatattcat 660aatatcgacc
attgcttatc atatataaca tctgtagatc atgactcaat taaagcgaaa 720gagatttgcc
ggaaatttct ggggcattgg tatgattctt atgtgaatgc tacgactatt 780tttgatgatt
cagaccaaac aagaggttat gatttcaata aagggcagtg gcgtgacttt 840gtattaaaag
gacataaaga tactaatcgc cgtattgatt acagttacga aatcaatccc 900gtgggaacgc
cgcaggaatg tattgacata attcaaaaag acattgatgc tacaggaata 960tcaaatattt
gttgtggatt tgaagctaat ggaacagtag acgaaattat tgcttccatg 1020aagctcttcc
agtctgatgt catgccattt cttaaagaaa aacaacgttc gctattatat 1080tattaa
1086
77 987 DNA Artificial Sequence Synthetic Polynucleotide 77
atgagcaaat
ttggattgtt cttccttaac ttcatcaatt caacaactgt tcaagaacag 60agtatagttc
gcatgcagga aataacggag tatgttgata agttgaattt tgaacagatt 120ttagtgtatg
aaaatcattt ttcagataat ggtgttgtcg gcgctcctct gactgtttct 180ggttttctgc
tcggtttaac agagaaaatt aaaattggtt cattaaatca catcattaca 240actcatcatc
ctgtccgcat agcggaggaa gcttgcttat tggatcagtt aagtgaaggg 300agatttattt
tagggtttag tgattgcgaa aaaaaagatg aaatgcattt ttttaatcgc 360ccggttgaat
atcaacagca actatttgaa gagtgttatg aaatcattaa cgatgcttta 420acaacaggct
attgtaatcc agataacgat ttttatagct tccctaaaat atctgtaaat 480ccccatgctt
atacgccagg cggacctcgg aaatatgtaa cagcaaccag tcatcatatt 540gttgagtggg
cggccaaaaa aggtattcct ctcatcttta agtgggatga ttctaatgat 600gttagatatg
aatatgctga aagatataaa gccgttgcgg ataaatatga cgttgaccta 660tcagagatag
accatcagtt aatgatatta gttaactata acgaagatag taataaagct 720aaacaagaaa
cgcgtgcatt tattagtgat tatgttcttg aaatgcaccc taatgaaaat 780ttcgaaaata
aacttgaaga aataattgca gaaaacgctg tcggaaatta tacggagtgt 840ataactgcgg
ctaagttggc aattgaaaag tgtggtgcga aaagtgtatt gctgtccttt 900gaaccaatga
atgatttgat gagccaaaaa aatgtaatca atattgttga tgataatatt 960aagaagtacc
acacggaata tacctaa
987
78 276 DNA Artificial Sequence Synthetic Polynucleotide 78
atggcacgcg
taactgttca ggacgctgta gagaaaattg gtaaccgttt tgacctggta 60ctggtcgccg
cgcgtcgcgc tcgtcagatg caggtaggcg gaaaggatcc gctggtaccg 120gaagaaaacg
ataaaaccac tgtaatcgcg ctgcgcgaaa tcgaagaagg tctgatcaac 180aaccagatcc
tcgacgttcg cgaacgccag gaacagcaag agcaggaagc cgctgaatta 240caagccgtta
ccgctattgc tgaaggtcgt cgttaa
276
79 636 DNA Artificial Sequence Synthetic Polynucleotide 79
atgagtatca
gcagcagggt aaaaagcaaa agaattcagc ttggacttaa ccaggctgaa 60cttgctcaaa
aggtggggac tacccagcag tctatagagc agctcgaaaa cggtaaaact 120aagcgaccac
gctttttacc agaacttgcg tcagctcttg gcgtaagtgt tgactggctg 180ctcaatggca
cctctgattc gaatgttaga tttgttgggc acgttgagcc caaagggaaa 240tatccattga
ttagcatggt tagagctcgt tcgtggtgtg aagcttgtga accctacgat 300atcaaggaca
ttgatgaatg gtatgacagt gacgttaact tattaggcaa tggattctgg 360ctgaaggttg
aaggtgattc catgacctca cctgtaggtc aaagcatccc tgaaggtcat 420atggtgttag
tagatactgg acgggagcca gtgaatggaa gccttgttgt agccaaactg 480actgacgcga
acgaagcaac attcaagaaa ctggtcatag atggcggtca gaagtacctg 540aaaggcctga
atccttcatg gcctatgact cctatcaacg gaaactgcaa gattatcggt 600gttgtcgtgg
aagcgagggt aaaattcgta gactaa
636
80 300 DNA Artificial Sequence Synthetic Polynucleotide 80
atgtggtatt
ttgggaagat cactcgtcgg gagtccgagc ggctgctgct caaccccgaa 60aacccccggg
gaaccttctt ggtccgggag agcgagacgg taaaaggtgc ctatgccctc 120tccgtttctg
actttgacaa cgccaagggg ctcaatgtga aacactacct gatccgcaag 180ctggacagcg
gcggcttcta catcacctca cgcacacagt tcagcagcct gcagcagctg 240gtggcctact
actccaaaca tgctgatggc ttgtgccacc gcctgaccaa cgtctgctaa
300
81 1116 DNA Artificial Sequence Synthetic Polynucleotide 81
atgaaaatcg
aagaaggtaa actggtaatc tggattaacg gcgataaagg ctataacggt 60ctcgctgaag
tcggtaagaa attcgagaaa gataccggaa ttaaagtcac cgttgagcat 120ccggataaac
tggaagagaa attcccacag gttgcggcaa ctggcgatgg ccctgacatt 180atcttctggg
cacacgaccg ctttggtggc tacgctcaat ctggcctgtt ggctgaaatc 240accccggaca
aagcgttcca ggacaagctg tatccgttta cctgggatgc cgtacgttac 300aacggcaagc
tgattgctta cccgatcgct gttgaagcgt tatcgctgat ttataacaaa 360gatctgctgc
cgaacccgcc aaaaacctgg gaagagatcc cggcgctgga taaagaactg 420aaagcgaaag
gtaagagcgc gctgatgttc aacctgcaag aaccgtactt cacctggccg 480ctgattgctg
ctgacggggg ttatgcgttc aagtatgaaa acggcaagta cgacattaaa 540gacgtgggcg
tggataacgc tggcgcgaaa gcgggtctga ccttcctggt tgacctgatt 600aaaaacaaac
acatgaatgc agacaccgat tactccatcg cagaagctgc ctttaataaa 660ggcgaaacag
cgatgaccat caacggcccg tgggcatggt ccaacatcga caccagcaaa 720gtgaattatg
gtgtaacggt actgccgacc ttcaagggtc aaccatccaa accgttcgtt 780ggcgtgctga
gcgcaggtat taacgccgcc agtccgaaca aagagctggc gaaagagttc 840ctcgaaaact
atctgctgac tgatgaaggt ctggaagcgg ttaataaaga caaaccgctg 900ggtgccgtag
cgctgaagtc ttacgaggaa gagttggcga aagatccacg tattgccgcc 960accatggaaa
acgcccagaa aggtgaaatc atgccgaaca tcccgcagat gtccgctttc 1020tggtatgccg
tgcgtactgc ggtgatcaac gccgccagcg gtcgtcagac tgtcgatgaa 1080gccctgaaag
acgcgcagac tcgtatcacc aagtaa
1116
82 36 DNA Artificial Sequence Synthetic Polynucleotide 82
tggatggagg
actatgacta cgtccaccta cagggg
36
83 33 DNA Artificial Sequence Synthetic Polynucleotide 83
gaaccgcagt
atgaagaaat tccgatttat ctg
33
84 30 DNA Artificial Sequence Synthetic Polynucleotide 84
ccgcagcgct
atctggtgat tcagggcgat
30
85 30 DNA Artificial Sequence Synthetic Polynucleotide 85
gatcatcagt
attataacga ttttccgggc
30
86 5432 DNA Artificial Sequence Synthetic Polynucleotide
misc_feature (1317)..(1317) n is a, c, g, or t
86
atgtcattac
cgttcttaac ttctgcaccg ggaaaggtta ttatttttgg tgaacactct 60gctgtgtaca
acaagcctgc cgtcgctgct agtgtgtctg cgttgagaac ctacctgcta 120ataagcgagt
catctgcacc agatactatt gaattggact tcccggacat tagctttaat 180cataagtggt
ccatcaatga tttcaatgcc atcaccgagg atcaagtaaa ctcccaaaaa 240ttggccaagg
ctcaacaagc caccgatggc ttgtctcagg aactcgttag tcttttggat 300ccgttgttag
ctcaactatc cgaatccttc cactaccatg cagcgttttg tttcctgtat 360atgtttgttt
gcctatgccc ccatgccaag aatattaagt tttctttaaa gtctacttta 420cccatcggtg
ctgggttggg ctcaagcgcc tctatttctg tatcactggc cttagctatg 480gcctacttgg
gggggttaat aggatctaat gacttggaaa agctgtcaga aaacgataag 540catatagtga
atcaatgggc cttcataggt gaaaagtgta ttcacggtac cccttcagga 600atagataacg
ctgtggccac ttatggtaat gccctgctat ttgaaaaaga ctcacataat 660ggaacaataa
acacaaacaa ttttaagttc ttagatgatt tcccagccat tccaatgatc 720ctaacctata
ctagaattcc aaggtctaca aaagatcttg ttgctcgcgt tcgtgtgttg 780gtcaccgaga
aatttcctga agttatgaag ccaattctag atgccatggg tgaatgtgcc 840ctacaaggct
tagagatcat gactaagtta agtaaatgta aaggcaccga tgacgaggct 900gtagaaacta
ataatgaact gtatgaacaa ctattggaat tgataagaat aaatcatgga 960ctgcttgtct
caatcggtgt ttctcatcct ggattagaac ttattaaaaa tctgagcgat 1020gatttgagaa
ttggctccac aaaacttacc ggtgctggtg gcggcggttg ctctttgact 1080ttgttacgaa
gagacattac tcaagagcaa attgacagct tcaaaaagaa attgcaagat 1140gattttagtt
acgagacatt tgaaacagac ttgggtggga ctggctgctg tttgttaagc 1200gcaaaaaatt
tgaataaaga tcttaaaatc aaatccctag tattccaatt atttgaaaat 1260aaaactacca
caaagcaaca aattgacgat ctattattgc caggaaacac gaatttncca 1320tggacttcat
aggaggcaga tcaaatgtca gagttgagag ccttcagtgc cccagggaaa 1380gcgttactag
ctggtggata tttagtttta gatacaaaat atgaagcatt tgtagtcgga 1440ttatcggcaa
gaatgcatgc tgtagcccat ccttacggtt cattgcaagg gtctgataag 1500tttgaagtgc
gtgtgaaaag taaacaattt aaagatgggg agtggctgta ccatataagt 1560cctaaaagtg
gcttcattcc tgtttcgata ggcggatcta agaacccttt cattgaaaaa 1620gttatcgcta
acgtatttag ctactttaaa cctaacatgg acgactactg caatagaaac 1680ttgttcgtta
ttgatatttt ctctgatgat gcctaccatt ctcaggagga tagcgttacc 1740gaacatcgtg
gcaacagaag attgagtttt cattcgcaca gaattgaaga agttcccaaa 1800acagggctgg
gctcctcggc aggtttagtc acagttttaa ctacagcttt ggcctccttt 1860tttgtatcgg
acctggaaaa taatgtagac aaatatagag aagttattca taatttagca 1920caagttgctc
attgtcaagc tcagggtaaa attggaagcg ggtttgatgt agcggcggca 1980gcatatggat
ctatcagata tagaagattc ccacccgcat taatctctaa tttgccagat 2040attggaagtg
ctacttacgg cagtaaactg gcgcatttgg ttgatgaaga agactggaat 2100attacgatta
aaagtaacca tttaccttcg ggattaactt tatggatggg cgatattaag 2160aatggttcag
aaacagtaaa actggtccag aaggtaaaaa attggtatga ttcgcatatg 2220ccagaaagct
tgaaaatata tacagaactc gatcatgcaa attctagatt tatggatgga 2280ctatctaaac
tagatcgctt acacgagact catgacgatt acagcgatca gatatttgag 2340tctcttgaga
ggaatgactg tacctgtcaa aagtatcctg aaatcacaga agttagagat 2400gcagttgcca
caattagacg ttcctttaga aaaataacta aagaatctgg tgccgatatc 2460gaacctcccg
tacaaactag cttattggat gattgccaga ccttaaaagg agttcttact 2520tgcttaatac
ctggtgctgg tggttatgac gccattgcag tgattactaa gcaagatgtt 2580gatcttaggg
ctcaaaccgc taatgacaaa agattttcta aggttcaatg gctggatgta 2640actcaggctg
actggggtgt taggaaagaa aaagatccgg aaacttatct tgataaatag 2700gaggtaatac
tcatgaccgt ttacacagca tccgttaccg cacccgtcaa catcgcaacc 2760cttaagtatt
gggggaaaag ggacacgaag ttgaatctgc ccaccaattc gtccatatca 2820gtgactttat
cgcaagatga cctcagaacg ttgacctctg cggctactgc acctgagttt 2880gaacgcgaca
ctttgtggtt aaatggagaa ccacacagca tcgacaatga aagaactcaa 2940aattgtctgc
gcgacctacg ccaattaaga aaggaaatgg aatcgaagga cgcctcattg 3000cccacattat
ctcaatggaa actccacatt gtctccgaaa ataactttcc tacagcagct 3060ggtttagctt
cctccgctgc tggctttgct gcattggtct ctgcaattgc taagttatac 3120caattaccac
agtcaacttc agaaatatct agaatagcaa gaaaggggtc tggttcagct 3180tgtagatcgt
tgtttggcgg atacgtggcc tgggaaatgg gaaaagctga agatggtcat 3240gattccatgg
cagtacaaat cgcagacagc tctgactggc ctcagatgaa agcttgtgtc 3300ctagttgtca
gcgatattaa aaaggatgtg agttccactc agggtatgca attgaccgtg 3360gcaacctccg
aactatttaa agaaagaatt gaacatgtcg taccaaagag atttgaagtc 3420atgcgtaaag
ccattgttga aaaagatttc gccacctttg caaaggaaac aatgatggat 3480tccaactctt
tccatgccac atgtttggac tctttccctc caatattcta catgaatgac 3540acttccaagc
gtatcatcag ttggtgccac accattaatc agttttacgg agaaacaatc 3600gttgcataca
cgtttgatgc aggtccaaat gctgtgttgt actacttagc tgaaaatgag 3660tcgaaactct
ttgcatttat ctataaattg tttggctctg ttcctggatg ggacaagaaa 3720tttactactg
agcagcttga ggctttcaac catcaatttg aatcatctaa ctttactgca 3780cgtgaattgg
atcttgagtt gcaaaaggat gttgccagag tgattttaac tcaagtcggt 3840tcaggcccac
aagaaacaaa cgaatctttg attgacgcaa agactggtct accaaaggaa 3900taactgcagc
ccgggaggag gattactata tgcaaacgga acacgtcatt ttattgaatg 3960cacagggagt
tcccacgggt acgctggaaa agtatgccgc acacacggca gacacccgct 4020tacatctcgc
gttctccagt tggctgttta atgccaaagg acaattatta gttacccgcc 4080gcgcactgag
caaaaaagca tggcctggcg tgtggactaa ctcggtttgt gggcacccac 4140aactgggaga
aagcaacgaa gacgcagtga tccgccgttg ccgttatgag cttggcgtgg 4200aaattacgcc
tcctgaatct atctatcctg actttcgcta ccgcgccacc gatccgagtg 4260gcattgtgga
aaatgaagtg tgtccggtat ttgccgcacg caccactagt gcgttacaga 4320tcaatgatga
tgaagtgatg gattatcaat ggtgtgattt agcagatgta ttacacggta 4380ttgatgccac
gccgtgggcg ttcagtccgt ggatggtgat gcaggcgaca aatcgcgaag 4440ccagaaaacg
attatctgca tttacccagc ttaaataacc cgggggatcc actagttcta 4500gagcggccgc
caccgcggag gaggaatgag taatggactt tccgcagcaa ctcgaagcct 4560gcgttaagca
ggccaaccag gcgctgagcc gttttatcgc cccactgccc tttcagaaca 4620ctcccgtggt
cgaaaccatg cagtatggcg cattattagg tggtaagcgc ctgcgacctt 4680tcctggttta
tgccaccggt catatgttcg gcgttagcac aaacacgctg gacgcacccg 4740ctgccgccgt
tgagtgtatc cacgcttact cattaattca tgatgattta ccggcaatgg 4800atgatgacga
tctgcgtcgc ggtttgccaa cctgccatgt gaagtttggc gaagcaaacg 4860cgattctcgc
tggcgacgct ttacaaacgc tggcgttctc gattttaagc gatgccgata 4920tgccggaagt
gtcggaccgc gacagaattt cgatgatttc tgaactggcg agcgccagtg 4980gtattgccgg
aatgtgcggt ggtcaggcat tagatttaga cgcggaaggc aaacacgtac 5040ctctggacgc
gcttgagcgt attcatcgtc ataaaaccgg cgcattgatt cgcgccgccg 5100ttcgccttgg
tgcattaagc gccggagata aaggacgtcg tgctctgccg gtactcgaca 5160agtatgcaga
gagcatcggc cttgccttcc aggttcagga tgacatcctg gatgtggtgg 5220gagatactgc
aacgttggga aaacgccagg gtgccgacca gcaacttggt aaaagtacct 5280accctgcact
tctgggtctt gagcaagccc ggaagaaagc ccgggatctg atcgacgatg 5340cccgtcagtc
gctgaaacaa ctggctgaac agtcactcga tacctcggca ctggaagcgc 5400tagcggacta
catcatccag cgtaataaat aa
5432
87 1638 DNA Artificial Sequence Synthetic Polynucleotide 87
gccctgaccg
aagagaaacc gatccgcccg atcgctaact tcccgccgtc tatctggggt 60gaccagttcc
tgatctacga aaagcaggtt gagcagggtg ttgaacagat cgtaaacgac 120ctgaagaaag
aagttcgtca gctgctgaaa gaagctctgg acatcccgat gaaacacgct 180aacctgttga
agctgatcga cgagatccag cgtctgggta tcccgtacca cttcgaacgc 240gaaatcgacc
acgcactgca gtgcatctac gaaacctacg gcgacaactg gaacggcgac 300cgttcttctc
tgtggtttcg tctgatgcgt aaacagggct actacgttac ctgtgacgtt 360tttaacaact
acaaggacaa gaacggtgct ttcaaacagt ctctggctaa cgacgttgaa 420ggcctgctgg
aactgtacga agcgacctcc atgcgtgtac cgggtgaaat catcctggag 480gacgcgctgg
gtttcacccg ttctcgtctg tccattatga ctaaagacgc tttctctact 540aacccggctc
tgttcaccga aatccagcgt gctctgaaac agccgctgtg gaaacgtctg 600ccgcgtatcg
aagcagcaca gtacattccg ttttaccagc agcaggactc tcacaacaag 660accctgctga
aactggctaa gctggaattc aacctgctgc agtctctgca caaagaagaa 720ctgtctcacg
tttgtaagtg gtggaaggca tttgacatca agaaaaacgc gccgtgcctg 780cgtgaccgta
tcgttgaatg ttacttctgg ggtctgggtt ctggttatga accacagtac 840tcccgtgcac
gtgtgttctt cactaaagct gtagctgtta tcaccctgat cgatgacact 900tacgatgctt
acggcaccta cgaagaactg aagatcttta ctgaagctgt agaacgctgg 960tctatcactt
gcctggacac tctgccggag tacatgaaac cgatctacaa actgttcatg 1020gatacctaca
ccgaaatgga ggaattcctg gcaaaagaag gccgtaccga cctgttcaac 1080tgcggtaaag
agtttgttaa agaattcgta cgtaacctga tggttgaagc taaatgggct 1140aacgaaggcc
atatcccgac taccgaagaa catgacccgg ttgttatcat caccggcggt 1200gcaaacctgc
tgaccaccac ttgctatctg ggtatgtccg acatctttac caaggaatct 1260gttgaatggg
ctgtttctgc accgccgctg ttccgttact ccggtattct gggtcgtcgt 1320ctgaacgacc
tgatgaccca caaagcagag caggaacgta aacactcttc ctcctctctg 1380gaatcctaca
tgaaggaata taacgttaac gaggagtacg cacagactct gatctataaa 1440gaagttgaag
acgtatggaa agacatcaac cgtgaatacc tgactactaa aaacatcccg 1500cgcccgctgc
tgatggcagt aatctacctg tgccagttcc tggaagtaca gtacgctggt 1560aaagataact
tcactcgcat gggcgacgaa tacaaacacc tgatcaaatc cctgctggtt 1620tacccgatgt
ccatctga
1638
88 1782 DNA Artificial Sequence Synthetic Polynucleotide 88
atggctcaaa
tcagcgaatc agtgtctcca agcaccgacc ttaaaagcac ggaatcttct 60attaccagca
accgccacgg taacatgtgg gaagatgacc gcattcagag cttaaacagc 120ccatatggcg
cacccgctta tcaggaacgt agcgaaaaat tgattgaaga aattaagctc 180ctgtttctgt
ccgatatgga cgatagttgc aatgattcgg atcgcgactt gatcaaacgc 240ctggagatcg
tagatacggt tgagtgtctg ggcattgatc gtcatttcca acctgaaatt 300aagctggcgc
tggattacgt gtaccgttgc tggaatgagc gtggcatcgg agaaggtagc 360cgtgatagct
taaaaaagga cctgaatgcg accgccttgg gctttcgggc tttacgctta 420caccgttata
atgtaagctc aggagtgctg gagaacttcc gtgatgacaa tggtcaattc 480ttttgcggtt
ctactgtgga ggaggaaggc gcggaggcct acaataaaca tgtacgttgc 540atgctgtccc
tgtcccgcgc ttccaatatt ttattcccgg gcgagaaagt gatggaagaa 600gcgaaggcgt
ttacgaccaa ctatcttaag aaagtcctgg cgggtcgtga agcaactcat 660gtcgacgaga
gtctccttgg agaggtcaag tatgcactag aatttccgtg gcattgttcc 720gtgcagcgct
gggaggcacg ttcttttatc gaaattttcg gtcagattga tagtgaactg 780aaaagcaacc
tctctaaaaa aatgctcgaa ctcgcaaaac ttgattttaa catactccag 840tgtacgcatc
aaaaagagct ccagatcatt agtcgatggt tcgccgattc aagtatcgca 900agtctgaact
tttaccgtaa atgctatgtg gaattttact tctggatggc cgcggcaatt 960tcagaaccag
aatttagtgg ctctcgcgtg gcattcacta aaattgcgat cttgatgaca 1020atgttagatg
acttatacga cacgcatggg acgctggatc aattgaaaat atttaccgaa 1080ggtgtgcgca
ggtgggacgt gtcgctggtg gagggcctgc cggatttcat gaaaattgcc 1140tttgagttct
ggttaaagac ctccaacgaa ctgattgcgg aggcggttaa ggcccaaggc 1200caggatatgg
cggcctatat ccgcaaaaac gcttgggaac gctatctgga agcgtatttg 1260caggatgccg
aatggatcgc caccggtcac gttccgacat tcgatgaata tctgaacaat 1320ggcaccccca
acaccggtat gtgtgtactt aatctgatcc cgttgctgct tatgggcgaa 1380cacttgccga
tcgatattct tgaacagatc tttctgccga gccggttcca ccatctgatt 1440gaactggcta
gccgactggt cgatgatgcg agagattttc aagccgaaaa agatcatggt 1500gatttatcct
gcatcgaatg ctacctgaaa gaccatccgg aatcaacagt tgaagacgcc 1560ctgaatcacg
tcaacggcct gctggggaat tgtttgctgg aaatgaattg gaaatttctg 1620aaaaaacagg
actcggtacc tctgtcgtgt aaaaaatact cattccacgt cctggcgcgg 1680tcgattcagt
ttatgtataa ccagggggac gggttttcga tttcgaacaa agttattaaa 1740gaccaggtcc
agaaagttct aatcgttccg gttcctatat aa
1782
89 2351 DNA Artificial Sequence Synthetic Polynucleotide 89
tgaaacgaga
atttcctcca ggattttgga aggatgatct tatcgattct ctaacgtcat 60ctcacaaggt
tgcagcatca gacgagaagc gtatcgagac attaatatcc gagattaaga 120atatgtttag
atgtatgggc tatggcgaaa cgaatccctc tgcatatgac actgcttggg 180tagcaaggat
tccagcagtt gatggctctg acaaccctca ctttcctgag acggttgaat 240ggattcttca
aaatcagttg aaagatgggt cttggggtga aggattctac ttcttggcat 300atgacagaat
actggctaca cttgcatgta ttattaccct taccctctgg cgtactgggg 360agacacaagt
acagaaaggt attgaattct tcaggacaca agctggaaag atggaagatg 420aagctgatag
tcataggcca agtggatttg aaatagtatt tcctgcaatg ctaaaggaag 480ctaaaatctt
aggcttggat ctgccttacg atttgccatt cctgaaacaa atcatcgaaa 540agcgggaggc
taagcttaaa aggattccca ctgatgttct ctatgccctt ccaacaacgt 600tattgtattc
tttggaaggt ttacaagaaa tagtagactg gcagaaaata atgaaacttc 660aatccaagga
tggatcattt ctcagctctc cggcatctac agcggctgta ttcatgcgta 720cagggaacaa
aaagtgcttg gatttcttga actttgtctt gaagaaattc ggaaaccatg 780tgccttgtca
ctatccgctt gatctatttg aacgtttgtg ggcggttgat acagttgagc 840ggctaggtat
cgatcgtcat ttcaaagagg agatcaagga agcattggat tatgtttaca 900gccattggga
cgaaagaggc attggatggg cgagagagaa tcctgttcct gatattgatg 960atacagccat
gggccttcga atcttgagat tacatggata caatgtatcc tcagatgttt 1020taaaaacatt
tagagatgag aatggggagt tcttttgctt cttgggtcaa acacagagag 1080gagttacaga
catgttaaac gtcaatcgtt gttcacatgt ttcatttccg ggagaaacga 1140tcatggaaga
agcaaaactc tgtaccgaaa ggtatctgag gaatgctctg gaaaatgtgg 1200atgcctttga
caaatgggct tttaaaaaga atattcgggg agaggtagag tatgcactca 1260aatatccctg
gcataagagt atgccaaggt tggaggctag aagctatatt gaaaactatg 1320ggccagatga
tgtgtggctt ggaaaaactg tatatatgat gccatacatt tcgaatgaaa 1380agtatttaga
actagcgaaa ctggacttca ataaggtgca gtctatacac caaacagagc 1440ttcaagatct
tcgaaggtgg tggaaatcat ccggtttcac ggatctgaat ttcactcgtg 1500agcgtgtgac
ggaaatatat ttctcaccgg catcctttat ctttgagccc gagttttcta 1560agtgcagaga
ggtttataca aaaacttcca atttcactgt tattttagat gatctttatg 1620acgcccatgg
atctttagac gatcttaagt tgttcacaga atcagtcaaa agatgggatc 1680tatcactagt
ggaccaaatg ccacaacaaa tgaaaatatg ttttgtgggt ttctacaata 1740cttttaatga
tatagcaaaa gaaggacgtg agaggcaagg gcgcgatgtg ctaggctaca 1800ttcaaaatgt
ttggaaagtc caacttgaag cttacacgaa agaagcagaa tggtctgaag 1860ctaaatatgt
gccatccttc aatgaataca tagagaatgc gagtgtgtca atagcattgg 1920gaacagtcgt
tctcattagt gctcttttca ctggggaggt tcttacagat gaagtactct 1980ccaaaattga
tcgcgaatct agatttcttc aactcatggg cttaacaggg cgtttggtga 2040atgacaccaa
aacttatcag gcagagagag gtcaaggtga ggtggcttct gccatacaat 2100gttatatgaa
ggaccatcct aaaatctctg aagaagaagc tctacaacat gtctatagtg 2160tcatggaaaa
tgccctcgaa gagttgaata gggagtttgt gaataacaaa ataccggata 2220tttacaaaag
actggttttt gaaactgcaa gaataatgca actcttttat atgcaagggg 2280atggtttgac
actatcacat gatatggaaa ttaaagagca tgtcaaaaat tgcctcttcc 2340aaccagttgc c
2351
90 2409 DNA Artificial Sequence Synthetic Polynucleotide 90
atgagcagca
gcactggcac tagcaaggtg gtttccgaga cttccagtac cattgtggat 60gatatccctc
gactctccgc caattatcat ggcgatctgt ggcaccacaa tgttatacaa 120actctggaga
caccgtttcg tgagagttct acttaccaag aacgggcaga tgagctggtt 180gtgaaaatta
aagatatgtt caatgcgctc ggagacggag atatcagtcc gtctgcatac 240gacactgcgt
gggtggcgag gctggcgacc atttcctctg atggatctga gaagccacgg 300tttcctcagg
ccctcaactg ggttttcaac aaccagctcc aggatggatc gtggggtatc 360gaatcgcact
ttagtttatg cgatcgattg cttaacacga ccaattctgt tatcgccctc 420tcggtttgga
aaacagggca cagccaagta caacaaggtg ctgagtttat tgcagagaat 480ctaagattac
tcaatgagga agatgagttg tccccggatt tccaaataat ctttcctgct 540ctgctgcaaa
aggcaaaagc gttggggatc aatcttcctt acgatcttcc atttatcaaa 600tatttgtcga
caacacggga agccaggctt acagatgttt ctgcggcagc agacaatatt 660ccagccaaca
tgttgaatgc gttggaagga ctcgaggaag ttattgactg gaacaagatt 720atgaggtttc
aaagtaaaga tggatctttc ctgagctccc ctgcctccac tgcctgtgta 780ctgatgaata
caggggacga aaaatgtttc acttttctca acaatctgct cgacaaattc 840ggcggctgcg
tgccctgtat gtattccatc gatctgctgg aacgcctttc gctggttgat 900aacattgagc
atctcggaat cggtcgccat ttcaaacaag aaatcaaagg agctcttgat 960tatgtctaca
gacattggag tgaaaggggc atcggttggg gcagagacag ccttgttcca 1020gatctcaaca
ccacagccct cggcctgcga actcttcgca tgcacggata caatgtttct 1080tcagacgttt
tgaataattt caaagatgaa aacgggcggt tcttctcctc tgcgggccaa 1140acccatgtcg
aattgagaag cgtggtgaat cttttcagag cttccgacct tgcatttcct 1200gacgaaagag
ctatggacga tgctagaaaa tttgcagaac catatcttag agaggcactt 1260gcaacgaaaa
tctcaaccaa tacaaaacta ttcaaagaga ttgagtacgt ggtggagtac 1320ccttggcaca
tgagtatccc acgcttagaa gccagaagtt atattgattc atatgacgac 1380aattatgtat
ggcagaggaa gactctatat agaatgccat ctttgagtaa ttcaaaatgt 1440ttagaattgg
caaaattgga cttcaatatc gtacaatctt tgcatcaaga ggagttgaag 1500cttctaacaa
gatggtggaa ggaatccggc atggcagata taaatttcac tcgacaccga 1560gtggcggagg
tttatttttc atcagctaca tttgaacccg aatattctgc cactagaatt 1620gccttcacaa
aaattggttg tttacaagtc ctttttgatg atatggctga catctttgca 1680acactagatg
aattgaaaag tttcactgag ggagtaaaga gatgggatac atctttgcta 1740catgagattc
cagagtgtat gcaaacttgc tttaaagttt ggttcaaatt aatggaagaa 1800gtaaataatg
atgtggttaa ggtacaagga cgtgacatgc tcgctcacat aagaaaaccc 1860tgggagttgt
acttcaattg ttatgtacaa gaaagggagt ggcttgaagc cgggtatata 1920ccaacttttg
aagagtactt aaagacttat gctatatcag taggccttgg accgtgtacc 1980ctacaaccaa
tactactaat gggtgagctt gtgaaagatg atgttgttga gaaagtgcac 2040tatccctcaa
atatgtttga gcttgtatcc ttgagctggc gactaacaaa cgacaccaaa 2100acatatcagg
ctgaaaaggc tcgaggacaa caagcctcag gcatagcatg ctatatgaag 2160gataatccag
gagcaactga ggaagatgcc attaagcaca tatgtcgtgt tgttgatcgg 2220gccttgaaag
aagcaagctt tgaatatttc aaaccatcca atgatatccc aatgggttgc 2280aagtccttta
tttttaacct tagattgtgt gtccaaatct tttacaagtt tatagatggg 2340tacggaatcg
ccaatgagga gattaaggac tatataagaa aagtttatat tgatccaatt 2400caagtatga
2409
91 891 DNA Artificial Sequence Synthetic Polynucleotide 91
atgtttgatt
tcaatgaata tatgaaaagt aaggctgttg cggtagacgc ggctctggat 60aaagcgattc
cgctggaata tcccgagaag attcacgaat cgatgcgcta ctccctgtta 120gcaggaggga
aacgcgttcg tccggcatta tgcatcgcgg cctgtgaact cgtcggcggt 180tcacaggact
tagcaatgcc aactgcttgc gcaatggaaa tgattcacac aatgagcctg 240attcatgatg
atttgccttg catggacaac gatgactttc ggcgcggtaa acctactaat 300cataaggttt
ttggcgaaga tactgcagtg ctggcgggcg atgcgctgct gtcgtttgcc 360ttcgaacata
tcgccgtcgc gacctcgaaa accgtcccgt cggaccgtac gcttcgcgtg 420atttccgagc
tgggaaagac catcggctct caaggactcg tgggtggtca ggtagttgat 480atcacgtctg
agggtgacgc gaacgtggac ctgaaaaccc tggagtggat ccatattcac 540aaaacggccg
tgctgctgga atgtagcgtg gtgtcagggg ggatcttggg gggcgccacg 600gaggatgaaa
tcgcgcgtat tcgtcgttat gcccgctgtg ttggactgtt atttcaggtg 660gtggatgaca
tcctggatgt cacaaaatcc agcgaagagc ttggcaagac cgcgggcaaa 720gaccttctga
cggataaggc tacatacccg aaattgatgg gcttggagaa agccaaggag 780ttcgcagctg
aacttgccac gcgggcgaag gaagaactct cttctttcga tcaaatcaaa 840gccgcgccac
tgctgggcct cgccgattac attgcgtttc gtcagaactg a
891
92 3150 DNA Artificial Sequence Synthetic Polynucleotide 92
atgacaatta
aagaaatgcc tcagccaaaa acgtttggag agcttaaaaa tttaccgtta 60ttaaacacag
ataaaccggt tcaagctttg atgaaaattg cggatgaatt aggagaaatc 120tttaaattcg
aggcgcctgg tcgtgtaacg cgctacttat caagtcagcg tctaattaaa 180gaagcatgcg
atgaatcacg ctttgataaa aacttaagtc aagcgcttaa atttgtacgt 240gattttgcag
gagacgggtt atttacaagc tggacgcatg aaaaaaattg gaaaaaagcg 300cataatatct
tacttccaag cttcagtcag caggcaatga aaggctatca tgcgatgatg 360gtcgatatcg
ccgtgcagct tgttcaaaag tgggagcgtc taaatgcaga tgagcatatt 420gaagtaccgg
aagacatgac acgtttaacg cttgatacaa ttggtctttg cggctttaac 480tatcgcttta
acagctttta ccgagatcag cctcatccat ttattacaag tatggtccgt 540gcactggatg
aagcaatgaa caagctgcag cgagcaaatc cagacgaccc agcttatgat 600gaaaacaagc
gccagtttca agaagatatc aaggtgatga acgacctagt agataaaatt 660attgcagatc
gcaaagcaag cggtgaacaa agcgatgatt tattaacgca tatgctaaac 720ggaaaagatc
cagaaacggg tgagccgctt gatgacgaga acattcgcta tcaaattatt 780acattcttaa
ttgcgggaca cgaaacaaca agtggtcttt tatcatttgc gctgtatttc 840ttagtgaaaa
atccacatgt attacaaaaa gcagcagaag aagcagcacg agttctagta 900gatcctgttc
caagctacaa acaagtcaaa cagcttaaat atgtcggcat ggtcttaaac 960gaagcgctgc
gcttatggcc aactgctcct gcgttttccc tatatgcaaa agaagatacg 1020gtgcttggag
gagaatatcc tttagaaaaa ggcgacgaac taatggttct gattcctcag 1080cttcaccgtg
ataaaacaat ttggggagac gatgtggaag agttccgtcc agagcgtttt 1140gaaaatccaa
gtgcgattcc gcagcatgcg tttaaaccgt ttggaaacgg tcagcgtgcg 1200tgtatcggtc
agcagttcgc tcttcatgaa gcaacgctgg tacttggtat gatgctaaaa 1260cactttgact
ttgaagatca tacaaactac gagctggata ttaaagaaac tttaacgtta 1320aaacctgaag
gctttgtggt aaaagcaaaa tcgaaaaaaa ttccgcttgg cggtattcct 1380tcacctagca
ctgaacagtc tgctaaaaaa gtacgcaaaa aggcagaaaa cgctcataat 1440acgccgctgc
ttgtgctata cggttcaaat atgggaacag ctgaaggaac ggcgcgtgat 1500ttagcagata
ttgcaatgag caaaggattt gcaccgcagg tcgcaacgct tgattcacac 1560gccggaaatc
ttccgcgcga aggagctgta ttaattgtaa cggcgtctta taacggtcat 1620ccgcctgata
acgcaaagca atttgtcgac tggttagacc aagcgtctgc tgatgaagta 1680aaaggcgttc
gctactccgt atttggatgc ggcgataaaa actgggctac tacgtatcaa 1740aaagtgcctg
cttttatcga tgaaacgctt gccgctaaag gggcagaaaa catcgctgac 1800cgcggtgaag
cagatgcaag cgacgacttt gaaggcacat atgaagaatg gcgtgaacat 1860atgtggagtg
acgtagcagc ctactttaac ctcgacattg aaaacagtga agataataaa 1920tctactcttt
cacttcaatt tgtcgacagc gccgcggata tgccgcttgc gaaaatgcac 1980ggtgcgtttt
caacgaacgt cgtagcaagc aaagaacttc aacagccagg cagtgcacga 2040agcacgcgac
atcttgaaat tgaacttcca aaagaagctt cttatcaaga aggagatcat 2100ttaggtgtta
ttcctcgcaa ctatgaagga atagtaaacc gtgtaacagc aaggttcggc 2160ctagatgcat
cacagcaaat ccgtctggaa gcagaagaag aaaaattagc tcatttgcca 2220ctcgctaaaa
cagtatccgt agaagagctt ctgcaatacg tggagcttca agatcctgtt 2280acgcgcacgc
agcttcgcgc aatggctgct aaaacggtct gcccgccgca taaagtagag 2340cttgaagcct
tgcttgaaaa gcaagcctac aaagaacaag tgctggcaaa acgtttaaca 2400atgcttgaac
tgcttgaaaa atacccggcg tgtgaaatga aattcagcga atttatcgcc 2460cttctgccaa
gcatacgccc gcgctattac tcgatttctt catcacctcg tgtcgatgaa 2520aaacaagcaa
gcatcacggt cagcgttgtc tcaggagaag cgtggagcgg atatggagaa 2580tataaaggaa
ttgcgtcgaa ctatcttgcc gagctgcaag aaggagatac gattacgtgc 2640tttatttcca
caccgcagtc agaatttacg ctgccaaaag accctgaaac gccgcttatc 2700atggtcggac
cgggaacagg cgtcgcgccg tttagaggct ttgtgcaggc gcgcaaacag 2760ctaaaagaac
aaggacagtc acttggagaa gcacatttat acttcggctg ccgttcacct 2820catgaagact
atctgtatca agaagagctt gaaaacgccc aaagcgaagg catcattacg 2880cttcataccg
ctttttctcg catgccaaat cagccgaaaa catacgttca gcacgtaatg 2940gaacaagacg
gcaagaaatt gattgaactt cttgatcaag gagcgcactt ctatatttgc 3000ggagacggaa
gccaaatggc acctgccgtt gaagcaacgc ttatgaaaag ctatgctgac 3060gttcaccaag
tgagtgaagc agacgctcgc ttatggctgc agcagctaga agaaaaaggc 3120cgatacgcaa
aagacgtgtg ggctgggtaa
3150
93 789 DNA Artificial Sequence Synthetic Polynucleotide 93
atgagggaag
cggtgatcgc cgaagtatcg actcaactat cagaggtagt tggcgtcatc 60gagcgccatc
tcgaaccgac gttgctggcc gtacatttgt acggctccgc agtggatggc 120ggcctgaagc
cacacagtga tattgatttg ctggttacgg tgaccgtaag gcttgatgaa 180acaacgcggc
gagctttgat caacgacctt ttggaaactt cggcttcccc tggagagagc 240gagattctcc
gcgctgtaga agtcaccatt gttgtgcacg acgacatcat tccgtggcgt 300tatccagcta
agcgcgaact gcaatttgga gaatggcagc gcaatgacat tcttgcaggt 360atcttcgagc
cagccacgat cgacattgat ctggctatct tgctgacaaa agcaagagaa 420catagcgttg
ccttggtagg tccagcggcg gaggaactct ttgatccggt tcctgaacag 480gatctatttg
aggcgctaaa tgaaacctta acgctatgga actcgccgcc cgactgggct 540ggcgatgagc
gaaatgtagt gcttacgttg tcccgcattt ggtacagcgc agtaaccggc 600aaaatcgcgc
cgaaggatgt cgctgccgac tgggcaatgg agcgcctgcc ggcccagtat 660cagcccgtca
tacttgaagc tagacaggct tatcttggac aagaagaaga tcgcttggcc 720tcgcgcgcag
atcagttgga agaatttgtc cactacgtga aaggcgagat caccaaggta 780gtcggcaaa
789
94 432 DNA Artificial Sequence Synthetic Polynucleotide 94
ttggctacta
cacttgaacg tattgagaag aactttgtca ttactgaccc aaggttgcca 60gataatccca
ttatattcgc gtccgatagt ttcttgcagt tgacagaata tagccgtgaa 120gaaattttgg
gaagaaactg caggtttcta caaggtcctg aaactgatcg cgcgacagtg 180agaaaaatta
gagatgccat agataaccaa acagaggtca ctgttcagct gattaattat 240acaaagagtg
gtaaaaagtt ctggaacctc tttcacttgc agcctatgcg agatcagaag 300ggagatgtcc
agtactttat tggggttcag ttggatggaa ctgagcatgt ccgagatgct 360gccgagagag
agggagtcat gctgattaag aaaactgcag aaaatattga tgaggcggca 420aaagaacttc
ca
432
95 2244 DNA Artificial Sequence Synthetic Polynucleotide 95
atggctagcg
tggcaggtca tgcctctggc agccccgcat tcgggaccgc cgatctttcg 60aattgcgaac
gtgaagagat ccacctcgcc ggctcgatcc agccgcatgg cgcgcttctg 120gtcgtcagcg
agccggatca tcgcatcatc caggccagcg ccaacgccgc ggaatttctg 180aatctcggaa
gcgtgctcgg cgttccgctc gccgagatcg acggcgatct gttgatcaag 240atcctgccgc
atctcgatcc caccgccgaa ggcatgccgg tcgcggtgcg ctgccggatc 300ggcaatccct
ccacggagta cgacggtctg atgcatcggc ctccggaagg cgggctgatc 360atcgagctcg
aacgtgccgg cccgccgatc gatctgtccg gcacgctggc gccggcgctg 420gagcggatcc
gcacggcggg ctcgctgcgc gcgctgtgcg atgacaccgc gctgctgttt 480cagcagtgca
ccggctacga ccgggtgatg gtgtatcgct tcgacgagca gggccacggc 540gaagtgttct
ccgagcgcca cgtgcccggg ctcgaatcct atttcggcaa ccgctatccg 600tcgtcggaca
ttccgcagat ggcgcggcgg ctgtacgagc ggcagcgcgt ccgcgtgctg 660gtcgacgtca
gctatcagcc ggtgccgctg gagccgcggc tgtcgccgct gaccgggcgc 720gatctcgaca
tgtcgggctg cttcctgcgc tcgatgtcgc cgatccatct gcagtacctg 780aagaacatgg
gcgtgcgcgc caccctggtg gtgtcgctgg tggtcggcgg caagctgtgg 840ggcctggttg
cctgtcatca ttatctgccg cgcttcatgc atttcgagct gcgggcgatc 900tgcgaactgc
tcgccgaagc gatcgcgacg cggatcaccg cgcttgagag cttcgcgcag 960agccagtcgg
agctgttcgt gcagcggctc gaacagcgca tgatcgaagc gattacccgt 1020gaaggcgatt
ggcgcgcagc gattttcgac accagccaat cgatcctgca gccgctgcac 1080gccgccggtt
gcgcgctggt gtacgaagac cagatcagga ccatcggcga cgtgccttcc 1140acgcaggatg
tgcgcgagat cgccgggtgg ctcgatcgcc agccgcgcgc ggcggtgacc 1200tcgaccgcgt
cgctcggtct cgacgtgccg gagctcgcgc atctgacgcg gatggcgagc 1260ggcgtggtcg
cggcgccgat ttcggatcat cgcggcgagt ttctgatgtg gttccgcccc 1320gagcgcgtcc
acaccgttac ctggggcggc gatccgaaga agccgttcac gatgggcgat 1380acaccggcgg
atctgtcgcc gcggcgctcc ttcgccaaat ggcatcaggt tgtcgaaggc 1440acgtccgatc
cgtggacggc cgccgatctc gccgcggctc gcaccatcgg tcagaccgtc 1500gccgacatcg
tgctgcaatt ccgcgcggtg cggacactga tcgcccgcga acagtacgaa 1560cagttttcgt
cccaggtgca cgcttcgatg cagccggtgc tgatcaccga cgccgaaggc 1620cgcatcctgc
tgatgaacga ctcgttccgc gacatgttgc cggcgggttc gccatccgcc 1680gtccatctcg
acgatctcgc cgggttcttc gtcgaatcga acgatttcct gcgcaacgtc 1740gccgaactga
tcgatcacgg ccgcgggtgg cgcggcgaag ttctgctgcg cggcgcaggc 1800aaccgcccgt
tgccgctggc agtgcgcgcc gatccggtga cgcgcacgga ggaccagtcg 1860ctcggcttcg
tgctgatctt cagcgacgct accgatcgtc gcaccgcaga tgccgcacgc 1920acgcgtttcc
aggaaggcat tcttgccagc gcacgtcccg gcgtgcggct cgactccaag 1980tccgacctgt
tgcacgagaa gctgctgtcc gcgctggtcg agaacgcgca gcttgccgca 2040ttggaaatca
cttacggcgt cgagaccgga cgcatcgccg agctgctcga aggcgtccgc 2100cagtcgatgc
tgcgcaccgc cgaagtgctc ggccatctgg tgcagcacgc ggcgcgcacg 2160gccggcagcg
acagctcgag caatggctcg cagaacaaga aggaattcga tagtgctggt 2220agtgctggta
gtgctggtac tagt
2244
96 1308 DNA Artificial Sequence Synthetic Polynucleotide 96
atggagatgg
aaaaggagtt cgagcagatc gacaagtccg ggagctgggc ggccatttac 60caggatatcc
gacatgaagc cagtgacttc ccatgtagag tggccaagct tcctaagaac 120aaaaaccgaa
ataggtacag agacgtcagt ccctttgacc atagtcggat taaactacat 180caagaagata
atgactatat caacgctagt ttgataaaaa tggaagaagc ccaaaggagt 240tacattctta
cccagggccc tttgcctaac acatgcggtc acttttggga gatggtgtgg 300gagcagaaaa
gcaggggtgt cgtcatgctc aacagagtga tggagaaagg ttcgttaaaa 360tgcgcacaat
actggccaca aaaagaagaa aaagagatga tctttgaaga cacaaatttg 420aaattaacat
tgatctctga agatatcaag tcatattata cagtgcgaca gctagaattg 480gaaaacctta
caacccaaga aactcgagag atcttacatt tccactatac cacatggcct 540gactttggag
tccctgaatc accagcctca ttcttgaact ttcttttcaa agtccgagag 600tcagggtcac
tcagcccgga gcacgggccc gttgtggtgc actgcagtgc aggcatcggc 660aggtctggaa
ccttctgtct ggctgatacc tgcctcttgc tgatggacaa gaggaaagac 720ccttcttccg
ttgatatcaa gaaagtgctg ttagaaatga ggaagtttcg gatggggctg 780atccagacag
ccgaccagct gcgcttctcc tacctggctg tgatcgaagg tgccaaattc 840atcatggggg
actcttccgt gcaggatcag tggaaggagc tttcccacga ggacctggag 900cccccacccg
agcatatccc cccacctccc cggccaccca aacgaatcct ggagccacac 960aatgggaaat
gcagggagtt cttcccaaat caccagtggg tgaaggaaga gacccaggag 1020gataaagact
gccccatcaa ggaagaaaaa ggaagcccct taaatgccgc accctacggc 1080atcgaaagca
tgagtcaaga cactgaagtt agaagtcggg tcgtgggggg aagtcttcga 1140ggtgcccagg
ctgcctcccc agccaaaggg gagccgtcac tgcccgagaa ggacgaggac 1200catgcactga
gttactggaa gcccttcctg gtcaacatgt gcgtggctac ggtcctcacg 1260gccggcgctt
acctctgcta caggttcctg ttcaacagca acacatag
1308
97 951 DNA Artificial Sequence Synthetic Polynucleotide 97
atgcccacca
ccatcgagcg ggagttcgaa gagttggata ctcagcgtcg ctggcagccg 60ctgtacttgg
aaattcgaaa tgagtcccat gactatcctc atagagtggc caagtttcca 120gaaaacagaa
atcgaaacag atacagagat gtaagcccat atgatcacag tcgtgttaaa 180ctgcaaaatg
ctgagaatga ttatattaat gccagtttag ttgacataga agaggcacaa 240aggagttaca
tcttaacaca gggtccactt cctaacacat gctgccattt ctggcttatg 300gtttggcagc
agaagaccaa agcagttgtc atgctgaacc gcattgtgga gaaagaatcg 360gttaaatgtg
cacagtactg gccaacagat gaccaagaga tgctgtttaa agaaacagga 420ttcagtgtga
agctcttgtc agaagatgtg aagtcgtatt atacagtaca tctactacaa 480ttagaaaata
tcaatagtgg tgaaaccaga acaatatctc actttcatta tactacctgg 540ccagattttg
gagtccctga atcaccagct tcatttctca atttcttgtt taaagtgaga 600gaatctggct
ccttgaaccc tgaccatggg cctgcggtga tccactgtag tgcaggcatt 660gggcgctctg
gcaccttctc tctggtagac acttgtcttg ttttgatgga aaaaggagat 720gatattaaca
taaaacaagt gttactgaac atgagaaaat accgaatggg tcttattcag 780accccagatc
aactgagatt ctcatacatg gctataatag aaggagcaaa atgtataaag 840ggagattcta
gtatacagaa acgatggaaa gaactttcta aggaagactt atctcctgcc 900tttgatcatt
caccaaacaa aataatgact gaaaaataca atgggaacag a
951
98 900 DNA Artificial Sequence Synthetic Polynucleotide 98
atgtcttctg
gtgtagatct gggtaccgag aacctgtact tccaatccat gtcccgtgtc 60ctccaagcag
aagagcttca tgaaaaggcc ctggaccctt tcctgctgca ggcggaattc 120tttgaaatcc
ccatgaactt tgtggatccg aaagagtacg acatccctgg gctggtgcgg 180aagaaccggt
acaaaaccat acttcccaac cctcacagca gagtgtgtct gacctcacca 240gaccctgacg
accctctgag ttcctacatc aatgccaact acatccgggg ctatggtggg 300gaggagaagg
tgtacatcgc cactcaggga cccatcgtca gcacggtcgc cgacttctgg 360cgcatggtgt
ggcaggagca cacgcccatc attgtcatga tcaccaacat cgaggagatg 420aacgagaaat
gcaccgagta ttggccggag gagcaggtgg cgtacgacgg tgttgagatc 480actgtgcaga
aagtcattca cacggaggat taccggctgc gactcatctc cctcaagagt 540gggactgagg
agcgaggcct gaagcattac tggttcacat cctggcccga ccagaagacc 600ccagaccggg
cccccccact cctgcacctg gtgcgggagg tggaggaggc agcccagcag 660gaggggcccc
actgtgcccc catcatcgtc cactgcagtg cagggattgg gaggaccggc 720tgcttcattg
ccaccagcat ctgctgccag cagctgcggc aggagggtgt agtggacatc 780ctgaagacca
cgtgccagct ccgtcaggac aggggcggca tgatccagac atgcgagcag 840taccagtttg
tgcaccacgt catgagcctc tacgaaaagc agctgtccca ccagtcctga
900
99 1788 DNA Artificial Sequence Synthetic Polynucleotide 99
atggtgaggt
ggtttcaccg agacctcagt gggctggatg cagagaccct gctcaagggc 60cgaggtgtcc
acggtagctt cctggctcgg cccagtcgca agaaccaggg tgacttctcg 120ctctccgtca
gggtggggga tcaggtgacc catattcgga tccagaactc aggggatttc 180tatgacctgt
atggagggga gaagtttgcg actctgacag agctggtgga gtactacact 240cagcagcagg
gtgtggtgca ggaccgcgac ggcaccatca tccacctcaa gtacccgctg 300aactgctccg
atcccactag tgagaggtgg taccatggcc acatgtctgg cgggcaggca 360gagacgctgc
tgcaggccaa gggcgagccc tggacgtttc ttgtgcgtga gagcctcagc 420cagcctggag
acttcgtgct ttctgtgctc agtgaccagc ccaaggctgg cccaggctcc 480ccgctcaggg
tcacccacat caaggtcatg tgcgagggtg gacgctacac agtgggtggt 540ttggagacct
tcgacagcct cacggacctg gtggagcatt tcaagaagac ggggattgag 600gaggcctcag
gcgcctttgt ctacctgcgg cagccgtact atgccacgag ggtgaatgcg 660gctgacattg
agaaccgagt gttggaactg aacaagaagc aggagtccga ggatacagcc 720aaggctggct
tctgggagga gtttgagagt ttgcagaagc aggaggtgaa gaacttgcac 780cagcgtctgg
aagggcaacg gccagagaac aagggcaaga accgctacaa gaacattctc 840ccctttgacc
acagccgagt gatcctgcag ggacgggaca gtaacatccc cgggtccgac 900tacatcaatg
ccaactacat caagaaccag ctgctaggcc ctgatgagaa cgctaagacc 960tacatcgcca
gccagggctg tctggaggcc acggtcaatg acttctggca gatggcgtgg 1020caggagaaca
gccgtgtcat cgtcatgacc acccgagagg tggagaaagg ccggaacaaa 1080tgcgtcccat
actggcccga ggtgggcatg cagcgtgctt atgggcccta ctctgtgacc 1140aactgcgggg
agcatgacac aaccgaatac aaactccgta ccttacaggt ctccccgctg 1200gacaatggag
acctgattcg ggagatctgg cattaccagt acctgagctg gcccgaccat 1260ggggtcccca
gtgagcctgg gggtgtcctc agcttcctgg accagatcaa ccagcggcag 1320gaaagtctgc
ctcacgcagg gcccatcatc gtgcactgca gcgccggcat cggccgcaca 1380ggcaccatca
ttgtcatcga catgctcatg gagaacatct ccaccaaggg cctggactgt 1440gacattgaca
tccagaagac catccagatg gtgcgggcgc agcgctcggg catggtgcag 1500acggaggcgc
agtacaagtt catctacgtg gccatcgccc agttcattga aaccactaag 1560aagaagctgg
aggtcctgca gtcgcagaag ggccaggagt cggagtacgg gaacatcacc 1620tatcccccag
ccatgaagaa tgcccatgcc aaggcctccc gcacctcgtc caaacacaag 1680gaggatgtgt
atgagaacct gcacactaag aacaagaggg aggagaaagt gaagaagcag 1740cggtcagcag
acaaggagaa gagcaagggt tccctcaaga ggaagtga
1788
100 1782 DNA Artificial Sequence Synthetic Polynucleotide
100 atgacatcgc
ggagatggtt tcacccaaat atcactggtg tggaggcaga aaacctactg 60ttgacaagag
gagttgatgg cagttttttg gcaaggccta gtaaaagtaa ccctggagac 120ttcacacttt
ccgttagaag aaatggagct gtcacccaca tcaagattca gaacactggt 180gattactatg
acctgtatgg aggggagaaa tttgccactt tggctgagtt ggtccagtat 240tacatggaac
atcacgggca attaaaagag aagaatggag atgtcattga gcttaaatat 300cctctgaact
gtgcagatcc tacctctgaa aggtggtttc atggacatct ctctgggaaa 360gaagcagaga
aattattaac tgaaaaagga aaacatggta gttttcttgt acgagagagc 420cagagccacc
ctggagattt tgttctttct gtgcgcactg gtgatgacaa aggggagagc 480aatgacggca
agtctaaagt gacccatgtt atgattcgct gtcaggaact gaaatacgac 540gttggtggag
gagaacggtt tgattctttg acagatcttg tggaacatta taagaagaat 600cctatggtgg
aaacattggg tacagtacta caactcaagc agccccttaa cacgactcgt 660ataaatgctg
ctgaaataga aagcagagtt cgagaactaa gcaaattagc tgagaccaca 720gataaagtca
aacaaggctt ttgggaagaa tttgagacac tacaacaaca ggagtgcaaa 780cttctctaca
gccgaaaaga gggtcaaagg caagaaaaca aaaacaaaaa tagatataaa 840aacatcctgc
cctttgatca taccagggtt gtcctacacg atggtgatcc caatgagcct 900gtttcagatt
acatcaatgc aaatatcatc atgcctgaat ttgaaaccaa gtgcaacaat 960tcaaagccca
aaaagagtta cattgccaca caaggctgcc tgcaaaacac ggtgaatgac 1020ttttggcgga
tggtgttcca agaaaactcc cgagtgattg tcatgacaac gaaagaagtg 1080gagagaggaa
agagtaaatg tgtcaaatac tggcctgatg agtatgctct aaaagaatat 1140ggcgtcatgc
gtgttaggaa cgtcaaagaa agcgccgctc atgactatac gctaagagaa 1200cttaaacttt
caaaggttgg acaagggaat acggagagaa cggtctggca ataccacttt 1260cggacctggc
cggaccacgg cgtgcccagc gaccctgggg gcgtgctgga cttcctggag 1320gaggtgcacc
ataagcagga gagcatcatg gatgcagggc cggtcgtggt gcactgcagt 1380gctggaattg
gccggacagg gacgttcatt gtgattgata ttcttattga catcatcaga 1440gagaaaggtg
ttgactgcga tattgacgtt cccaaaacca tccagatggt gcggtctcag 1500aggtcaggga
tggtccagac agaagcacag taccgattta tctatatggc ggtccagcat 1560tatattgaaa
cactacagcg caggattgaa gaagagcaga aaagcaagag gaaagggcac 1620gaatatacaa
atattaagta ttctctagcg gaccagacga gtggagatca gagccctctc 1680ccgccttgta
ctccaacgcc accctgtgca gaaatgagag aagacagtgc tagagtctat 1740gaaaacgtgg
gcctgatgca acagcagaaa agtttcagat ga
1782
101 1275 DNA Artificial Sequence Synthetic Polynucleotide
101 atggagcaag
tggagatcct gaggaaattc atccagaggg tccaggccat gaagagtcct 60gaccacaatg
gggaggacaa cttcgcccgg gacttcatgc ggttaagaag attgtctacc 120aaatatagaa
cagaaaagat atatcccaca gccactggag aaaaagaaga aaatgttaaa 180aagaacagat
acaaggacat actgccattt gatcacagcc gagttaaatt gacattaaag 240actccttcac
aagattcaga ctatatcaat gcaaatttta taaagggcgt ctatgggcca 300aaagcatatg
tagcaactca aggaccttta gcaaatacag taatagattt ttggaggatg 360gtatgggagt
ataatgttgt gatcattgta atggcctgcc gagaatttga gatgggaagg 420aaaaaatgtg
agcgctattg gcctttgtat ggagaagacc ccataacgtt tgcaccattt 480aaaatttctt
gtgaggatga acaagcaaga acagactact tcatcaggac actcttactt 540gaatttcaaa
atgaatctcg taggctgtat cagtttcatt atgtgaactg gccagaccat 600gatgttcctt
catcatttga ttctattctg gacatgataa gcttaatgag gaaatatcaa 660gaacatgaag
atgttcctat ttgtattcat tgcagtgcag gctgtggaag aacaggtgcc 720atttgtgcca
tagattatac gtggaattta ctaaaagctg ggaaaatacc agaggaattt 780aatgtattta
atttaataca agaaatgaga acacaaaggc attctgcagt acaaacaaag 840gagcaatatg
aacttgttca tagagctatt gcccaactgt ttgaaaaaca gctacaacta 900tatgaaattc
atggagctca gaaaattgct gatggagtga atgaaattaa cactgaaaac 960atggtcagct
ccatagagcc tgaaaaacaa gattctcctc ctccaaaacc accaaggacc 1020cgcagttgcc
ttgttgaagg ggatgctaaa gaagaaatac tgcagccacc ggaacctcat 1080ccagtgccac
ccatcttgac accttctccc ccttcagctt ttccaacagt cactactgtg 1140tggcaggaca
atgatagata ccatccaaag ccagtgttgc aatggtttca tcagaacaac 1200attcagcaga
cctcaacaga aactatagta aatcaacaga acttccaggg aaaaatgaat 1260caacaattga
acaga
1275
102 899 DNA Artificial Sequence Synthetic Polynucleotide
102 atggaccaaa
gagaaattct gcagaagttc ctggatgagg cccaaagcaa gaaaattact 60aaagaggagt
ttgccaatga atttctgaag ctgaaaaggc aatctaccaa gtacaaggca 120gacaaaacct
atcctacaac tgtggctgag aagcccaaga atatcaagaa aaacagatat 180aaggatattt
tgccctatga ttatagccgg gtagaactat ccctgataac ctctgatgag 240gattccagct
acatcaatgc caacttcatt aagggagttt atggacccaa ggcttatatt 300gccacccagg
gtcctttatc tacaaccctc ctggacttct ggaggatgat ttgggaatat 360agtgtcctta
tcattgttat ggcatgcatg gagtatgaaa tgggaaagaa aaagtgtgag 420cgctactggg
ctgagccagg agagatgcag ctggaatttg gccctttctc tgtatcctgt 480gaagctgaaa
aaaggaaatc tgattatata atcaggactc taaaagttaa gttcaatagt 540gaaactcgaa
ctatctacca gtttcattac aagaattggc cagaccatga tgtaccttca 600tctatagacc
ctattcttga gctcatctgg gatgtacgtt gttaccaaga ggatgacagt 660gttcccatat
gcattcactg cagtgctggc tgtggaagga ctggtgttat ttgtgctatt 720gattatacat
ggatgttgct aaaagatggg ataattcctg agaacttcag tgttttcagt 780ttgatccggg
aaatgcggac acagaggcct tcattagttc aaacgcagga acaatatgaa 840ctggtctaca
atgctgtatt agaactattt aagagacaga tggatgttat cagagataa
899
103 1149 DNA Artificial Sequence Synthetic Polynucleotide
103 atgagtctga
aagaaaaaac acaatctctg tttgccaacg catttggcta ccctgccact 60cacaccattc
aggcgcctgg ccgcgtgaat ttgattggtg aacacaccga ctacaacgac 120ggtttcgttc
tgccctgcgc gattgattat caaaccgtga tcagttgtgc accacgcgat 180gaccgtaaag
ttcgcgtgat ggcagccgat tatgaaaatc agctcgacga gttttccctc 240gatgcgccca
ttgtcgcaca tgaaaactat caatgggcta actacgttcg tggcgtggtg 300aaacatctgc
aactgcgtaa caacagcttc ggcggcgtgg acatggtgat cagcggcaat 360gtgccgcagg
gtgccgggtt aagttcttcc gcttcactgg aagtcgcggt cggaaccgta 420ttgcagcagc
tttatcatct gccgctggac ggcgcacaaa tcgcgcttaa cggtcaggaa 480gcagaaaacc
agtttgtagg ctgtaactgc gggatcatgg atcagctaat ttccgcgctc 540ggcaagaaag
atcatgcctt gctgatcgat tgccgctcac tggggaccaa agcagtttcc 600atgcccaaag
gtgtggctgt cgtcatcatc aacagtaact tcaaacgtac cctggttggc 660agcgaataca
acacccgtcg tgaacagtgc gaaaccggtg cgcgtttctt ccagcagcca 720gccctgcgtg
atgtcaccat tgaagagttc aacgctgttg cgcatgaact ggacccgatc 780gtggcaaaac
gcgtgcgtca tatactgact gaaaacgccc gcaccgttga agctgccagc 840gcgctggagc
aaggcgacct gaaacgtatg ggcgagttga tggcggagtc tcatgcctct 900atgcgcgatg
atttcgaaat caccgtgccg caaattgaca ctctggtaga aatcgtcaaa 960gctgtgattg
gcgacaaagg tggcgtacgc atgaccggcg gcggatttgg cggctgtatc 1020gtcgcgctga
tcccggaaga gctggtgcct gccgtacagc aagctgtcgc tgaacaatat 1080gaagcaaaaa
caggtattaa agagactttt tacgtttgta aaccatcaca aggagcagga 1140cagtgctga
1149
104 1422 DNA Artificial Sequence Synthetic Polynucleotide
104 atgaacatca
aaaagtttgc aaaacaagca acagtattaa cctttactac cgcactgctg 60gcaggaggcg
caactcaagc gtttgcgaaa gaaacgaacc aaaagccata taaggaaaca 120tacggcattt
cccatattac acgccatgat atgctgcaaa tccctgaaca gcaaaaaaat 180gaaaaatatc
aagttcctga attcgattcg tccacaatta aaaatatctc ttctgcaaaa 240ggcctggacg
tttgggacag ctggccatta caaaacgctg acggcactgt cgcaaactat 300cacggctacc
acatcgtctt tgcattagcc ggagatccta aaaatgcgga tgacacatcg 360atttacatgt
tctatcaaaa agtcggcgaa acttctattg acagctggaa aaacgctggc 420cgcgtcttta
aagacagcga caaattcgat gcaaatgatt ctatcctaaa agaccaaaca 480caagaatggt
caggttcagc cacatttaca tctgacggaa aaatccgttt attctacact 540gatttctccg
gtaaacatta cggcaaacaa acactgacaa ctgcacaagt taacgtatca 600gcatcagaca
gctctttgaa catcaacggt gtagaggatt ataaatcaat ctttgacggt 660gacggaaaaa
cgtatcaaaa tgtacagcag ttcatcgatg aaggcaacta cagctcaggc 720gacaaccata
cgctgagaga tcctcactac gtagaagata aaggccacaa atacttagta 780tttgaagcaa
acactggaac tgaagatggc taccaaggcg aagaatcttt atttaacaaa 840gcatactatg
gcaaaagcac atcattcttc cgtcaagaaa gtcaaaaact tctgcaaagc 900gataaaaaac
gcacggctga gttagcaaac ggcgctctcg gtatgattga gctaaacgat 960gattacacac
tgaaaaaagt gatgaaaccg ctgattgcat ctaacacagt aacagatgaa 1020attgaacgcg
cgaacgtctt taaaatgaac ggcaaatggt acctgttcac tgactcccgc 1080ggatcaaaaa
tgacgattga cggcattacg tctaacgata tttacatgct tggttatgtt 1140tctaattctt
taactggccc atacaagccg ctgaacaaaa ctggccttgt gttaaaaatg 1200gatcttgatc
ctaacgatgt aacctttact tactcacact tcgctgtacc tcaagcgaaa 1260ggaaacaatg
tcgtgattac aagctatatg acaaacagag gattctacgc agacaaacaa 1320tcaacgtttg
cgccaagctt cctgctgaac atcaaaggca agaaaacatc tgttgtcaaa 1380gacagcatcc
ttgaacaagg acaattaaca gttaacaaat aa 1422
* * * * *