Practical tips for cloning, expressing and purifying proteins

Number of slides: 56

Practical tips for cloning, expressing and purifying proteins for structural biology
Aled Edwards
Banting and Best Department of Medical Research, University of Toronto, Canada (aled.edwards@utoronto.ca)
Affinium Pharmaceuticals, Toronto, Canada (aedwards@afnm.com)

Molecular biological approaches to structural biology
An excellent structural sample usually has the following properties:
• Lack of conformational heterogeneity
• Soluble at high concentrations
• Pure
Molecular biology is probably the fastest way to transform a “poor” sample into an “excellent” one.

Outline
• Historical perspective on engineering proteins for structural biology
• Practical advice for cloning/purification of structural samples
• Ancillary benefits of high-throughput studies

RNA polymerase II: from 15 Å to 3 Å by eliminating heterogeneity

Another source of sample heterogeneity
Eukaryotic proteins comprise multiple domains
• Conformational heterogeneity lowers the probability of crystallization
• Protein domains
  - Are resistant to proteolysis
  - Fold autonomously
  - Can usually be expressed in bacteria
  - Are between 15 and 30 kDa (NMR or X-ray size)
  - Are the fundamental unit of protein function
• Domains are often the only tractable targets for HTP crystallography

EBNA 1 DNA-binding domain (No sequence homologue in database)

RPA domain structure: a collection of OB-folds (figure: RPA70, RPA32 and RPA14, with domains A and B labeled)

RPA crystallization
• Start with full-length protein purified using baculovirus (Wold)
• Identify domain (aa 1-442) soluble in E. coli (Wold)
• Crystallize domain (7 Å)
• Use limited proteolysis to define smaller domain (aa 161-442) (3.5 Å, and same cell as the 7 Å crystal)
• Create many constructs varying N- and C-termini to identify final construct (aa 181-422) (2.2 Å, solve structure)
Final tally: 15 different constructs

RPA70 domains A and B: two OB-folds bound to DNA (figure labels: domains A and B, L12 loops, L45 loops)

How does one map domains?

Domain mapping using limited proteolysis (figure: TFIIS treated with protease; Integrative Proteomics)

TFIIS domain structure (figure: residue boundaries 1, 124, 131, 240, 264, 309)
• Domain I: binds holoenzyme; similar to elongin, CRSP70
• Domain II: RNA polymerase binding
• Domain III: transcript cleavage and read-through (nucleic acid binding?)

DomainHunter™: industrialized domain mapping
• Partial proteolysis in 96-well plates
• Optimized set of proteases
• Low protein requirement
• No SDS-PAGE
• No N-terminal sequencing
• Direct identification of domains by mass spectrometry

DomainHunter™ protease titration (figure: mass spectra across a protease titration; axes r.i. vs. m/z, 23000-33000; labeled masses 20507, 21612, 21952, 23332, 25360, 31650, 33318, 35057)

DomainHunter applied to an NMR sample
(figure: residue-number map, 20-140, marking V8 cleavage sites, chymotrypsin sites and fragments A-D)
Fragment mass   Matching sequence
10324.0         G[44-133]R
12352.0         G[44-150]D
9131.0          I[55-133]R
11159.0         I[55-150]D
(expression and solubility columns: +++, no; ++, ++)
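The idea of identifying proteolytic fragments directly from their masses can be sketched in a few lines. This is only an illustration, not the DomainHunter software: the function names, the 2 Da tolerance and the example sequence are assumptions, and the residue masses are standard average values.

```python
# Average residue masses in Da; a free peptide adds one water (18.02 Da).
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}
WATER = 18.02


def fragment_mass(seq):
    """Average mass of a peptide with free N- and C-termini."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def match_fragments(protein, observed_mass, tolerance=2.0):
    """Return (start, end, mass) for every contiguous fragment whose
    calculated mass is within `tolerance` Da of the observed mass.
    Residue numbers are 1-based and inclusive."""
    # Prefix sums make each fragment mass a single subtraction.
    prefix = [0.0]
    for aa in protein:
        prefix.append(prefix[-1] + RESIDUE_MASS[aa])

    hits = []
    for i in range(len(protein)):
        for j in range(i + 1, len(protein) + 1):
            mass = prefix[j] - prefix[i] + WATER
            if mass > observed_mass + tolerance:
                break  # longer fragments from this start are only heavier
            if abs(mass - observed_mass) <= tolerance:
                hits.append((i + 1, j, round(mass, 2)))
    return hits


# Hypothetical sequence, used only to show the call.
EXAMPLE = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQAPILSRVGDGTQDNLSGAEKAVQVKVKALPDAQFEVV"
observed = fragment_mass(EXAMPLE[9:40])      # pretend this came from the spectrum
candidates = match_fragments(EXAMPLE, observed)
```

Several candidate ranges can share a mass; restricting start and end positions to known protease cleavage sites, as in the slide above, narrows the list.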

Structural proteomics (Nat. Struct. Biol., Oct/Nov 2000): MTH40, MTH1615, MTH152, MTH1184, MTH1175, MTH538, MTH150, MTH1790, MTH129, MTH1699, MTH1048

5 more done; 3 more soon

Molecular biology for crystallization and for large-scale studies
1. Basic steps in creating expression vectors for E. coli
2. Practical tips for making fewer mistakes
3. Application of methods to higher throughput
4. Alternate expression systems
5. Some results

E. coli is the first choice. Why?
• Cost effective
• Easy to grow
• Abundance of expertise and reagents
• Easy to incorporate selenomethionine
• High yield
• Rapid doubling time and rapid scale-up

Factors involved in successful expression of recombinant proteins in Escherichia coli cytoplasm
Expression vector
  - Copy number (gene dosage; sometimes less is better than more)
  - Promoter choice (T7, Ptac, Plac, Para)
  - Little or no expression before induction
  - Reliable and adjustable expression
  - mRNA stability (RNase E- mutant)
Translation
  - Consensus SD sequence
  - Proper spacing and sequence before the initiation codon
  - Possible mRNA secondary structures that block ribosome binding, or internal ribosome binding sites
  - Codon bias
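Codon bias can be checked before cloning with a quick scan of the ORF. The sketch below is an assumption-laden illustration: the rare-codon set is a commonly quoted short list for E. coli (AGG, AGA, CGA, CTA, ATA, CCC), not a definitive table, and the function name is hypothetical.

```python
# Commonly quoted rare codons in E. coli (assumed list; adjust for the host strain).
RARE_CODONS = {"AGG", "AGA", "CGA", "CTA", "ATA", "CCC"}


def rare_codon_report(orf):
    """Report which codons of an in-frame ORF are in the rare set.
    Codon positions are 1-based."""
    orf = orf.upper().replace("U", "T")
    if not orf or len(orf) % 3 != 0:
        raise ValueError("ORF must be non-empty and a multiple of 3 nt long")
    codons = [orf[i:i + 3] for i in range(0, len(orf), 3)]
    rare = [i + 1 for i, codon in enumerate(codons) if codon in RARE_CODONS]
    return {
        "n_codons": len(codons),
        "rare_positions": rare,
        "fraction_rare": len(rare) / len(codons),
    }


report = rare_codon_report("ATGAGGAAACTATAA")  # toy ORF: ATG AGG AAA CTA TAA
```

Clusters of rare codons, especially near the start of the gene, are the usual reason to consider a strain supplying extra tRNAs or a recoded gene.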

But which E. coli?
BL21(DE3): F- ompT hsdSB (rB-, mB-), gal, dcm, (DE3)
BL21-Star(DE3): F- ompT hsdSB (rB-, mB-), gal, dcm, rne131, (DE3)
BL21-Gold(DE3): F- ompT hsdS (rB- mB-) dcm+ Tetr gal endA (DE3)
Tuner(DE3): F- ompT hsdSB (rB- mB-) gal dcm lacY1 (DE3)

Conventional cloning approach
1. Select vector of choice
2. Restriction digest the vector
3. PCR the insert
4. Restriction digest the insert
5. Ligate the vector and insert
6. Transform and plate
7. Pick colonies and screen for insert
8. Screen positive clones for protein expression
9. Sequence positive clones

Which vector/tag?
1. T7 RNA polymerase-based systems are the overwhelming choice
  - Highly specific
  - High yields
  - Exquisitely controlled
2. Choice of vector
  - Restriction sites (are there internal sites in the gene?)
  - Are there many possible sites?
  - Are the enzymes commonly available?
  - Do the enzymes cut near the ends of DNA fragments?
3. Which tag?
  - Relatively little data on which generates the best proteins for crystallization
  - His-tag, GST and MBP are all effective for purification
  - His-tag offers the advantage of being able to screen +/- tag for crystals (double bang for the buck)
  - Make sure there is a protease site to remove the tag

Practical issues with cloning
1. Choice of protease?
  - Thrombin (more difficult to get but highly effective)
  - TEV, recombinant with His-tag, stable mutant with less autoproteolysis activity (Waugh), needs calcium, finicky
  - Factor X, enterokinase: avoid

“I can’t use thrombin, it digests my protein”

Practical issues with cloning
Restrict the plasmid
  - Double digestion often leaves one end undigested, which in turn results in high background due to re-ligation
  - Phosphatase treatment and gel purification of a large prep make life much easier in the long run
  - Optimize the system to get no background

Practical issues with cloning
PCR the insert
  - For HTP studies, need to optimize conditions for genome or clone
  - Order primers from a reputable supplier (the most common problem is in deprotecting oligos)
  - Have someone else double-check the primer sequence
  - Order primers with the requisite overhang (be over-cautious)
  - Use an error-correcting polymerase
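Writing primers out programmatically makes the double-check easier. The following is a minimal sketch, not a validated design tool: the function names, the 4-nt "GCGC" overhang and the 20-nt annealing length are assumptions chosen for illustration, and no melting-temperature or reading-frame check is done.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(orf, site_5prime, site_3prime, anneal_len=20, overhang="GCGC"):
    """Build forward and reverse primers as: overhang + restriction site + annealing region.
    The extra 5' bases are there because many enzymes cut poorly at the very end of a fragment."""
    orf = orf.upper()
    forward = overhang + site_5prime + orf[:anneal_len]
    reverse = overhang + site_3prime + reverse_complement(orf[-anneal_len:])
    return forward, reverse


# Toy ORF with BamHI (GGATCC) and XhoI (CTCGAG) sites; the short annealing length is for display only.
fwd, rev = design_primers("ATGAAACCCGGGTTTAAATAG", "GGATCC", "CTCGAG", anneal_len=9)
```

Both primers are written 5' to 3', which is how they would be ordered.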

Practical issues with cloning
Digest the PCR insert
  - Make sure that there are no internal sites
  - Purify the restricted product
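Checking for internal sites is easy to automate. A minimal sketch, assuming a small hand-typed table of common enzymes (all six recognition sequences are palindromic, so scanning one strand is enough); the function name is hypothetical.

```python
# Recognition sequences of a few common cloning enzymes.
SITES = {
    "NdeI": "CATATG",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
    "NcoI": "CCATGG",
}


def internal_sites(insert, enzymes):
    """Return {enzyme: [1-based positions]} for every enzyme that cuts inside the insert."""
    insert = insert.upper()
    found = {}
    for name in enzymes:
        site = SITES[name]
        positions = []
        start = insert.find(site)
        while start != -1:
            positions.append(start + 1)
            start = insert.find(site, start + 1)
        if positions:
            found[name] = positions
    return found


hits = internal_sites("ATGGGATCCAAAGAATTCTAA", ["BamHI", "EcoRI", "XhoI"])
```

An empty result for the two enzymes chosen for cloning means the gene itself will survive the digest.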

Practical issues with cloning
Ligation and transformation
  - If the vector control background is low and the PCR product is purified, there should be no problem
  - Use highly competent cells

Practical issues with cloning
Screen for positive clones
  - PCR screen from colony
  - Screen by protein expression
  - Make note of expression, as well as solubility

Cloning (conventional method)
(figure: gene cloned into four vector layouts, labeled T7/6His/TEV/STOP; T7/TEV/6His/STOP; T7/6His/MBP/TEV/STOP; T7/6His/TRX/TEV/STOP)
Screening for inserts by PCR (figure: clones)

TOPO cloning

GATEWAY™ Cloning System Technology: λ phage
(figure: λ attP and E. coli attB recombine to give attL and attR in the E. coli lysogen; attB + attP to attL + attR requires IHF and Int; attL + attR to attB + attP requires IHF, Int and Xis)

GATEWAY™ Cloning System Technology: λ phage
(figure: engineered site pairs attP1/attP2, attB1/attB2, attL1/attL2 and attR1/attR2, with IHF, Int and Xis; recombination pairs shown: attB1 x attP1, attR1 x attL1, attB2 x attP2, attR2 x attL2)

“Gateway type” cloning

Cloning and test expression (workflow figure)
1. Ligate, transform, pick clones (x 96)
2. PCR (x 96)
3. 300 µl cultures (Kan, Amp) (x 96)
4. 24 x 3 ml LB cultures (Kan, Amp) at 37 °C; induce at OD600; grow O/N at 15 °C or 20 °C
5. Spin; dissolve pellet in SDS
6. Spin, freeze, lyse with BugBuster™, spin again; supernatant (x 96)
7. SDS-PAGE

1750 clones (bar chart, scale 0-100: cloned, expressed, soluble)

Expression systems for eukaryotic proteins
• Baculovirus infection of insect cells
  - Simple, relatively cost effective, selenomethionine-compatible, not fully able to replicate human post-translational modifications
• Viral infection of human cells
  - Viruses not as easy to work with, high yield, proper modification
• Stable transformation of human cells
  - Usually lower expression. After selection, transcription sometimes goes away. Low throughput due to selection process
• Transfection of human cells
  - High expression in few cells, uses up lots of DNA

Protein Purification

Parallel purification of proteins (figure: samples 1-5 and 1’-5’)

ProteoMax: Automated Protein Purification and Concentration System (Affinium Pharmaceuticals)

A few observations from our work

Structure determination strategy
• < 20 kDa: 15N-labeled, then 15N/13C-labeled; 3-5 weeks of NMR data collection
• > 20 kDa: Se-methionine labeled; synchrotron data
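The size cutoff above can be applied to a target list with a rough mass estimate. This sketch assumes an average of about 110 Da per residue, treats exactly 20 kDa as the crystallography side (the slide does not say), and uses hypothetical function names.

```python
def estimate_mass_kda(n_residues, avg_residue_da=110.0):
    """Rough protein mass from length, assuming ~110 Da per residue."""
    return n_residues * avg_residue_da / 1000.0


def choose_method(mass_kda, cutoff_kda=20.0):
    """Route a target to NMR or crystallography by size."""
    if mass_kda < cutoff_kda:
        return {"method": "NMR", "labeling": ["15N", "15N/13C"]}
    return {"method": "X-ray crystallography", "labeling": ["Se-methionine"]}


small = choose_method(estimate_mass_kda(150))   # about 16.5 kDa
large = choose_method(estimate_mass_kda(300))   # about 33 kDa
```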

Orthologues: 68 from Escherichia coli, 68 from Thermotoga maritima
                              T. maritima     E. coli
Optimal growth temperature    80 °C           37 °C
Genome size                   1,860,725 bp    4,639,221 bp
ORFs                          1,877           4,288
Expressed & soluble           62              48
Concentratable to > 2 mg/ml   50              44
(Venn diagram: T. maritima only 15, both 35, E. coli only 9)
9 proteins could not be purified from either species

Total crystals (30): T. maritima only 11, both species 3, E. coli only 13
Total good/promising NMR spectra (14): T. maritima only 4, both species 4, E. coli only 2

NMR & crystallography: complementary!
24 small proteins for which both crystal trials and NMR data were collected (Venn diagram, good/promising HSQC vs. crystals: 10, 3, 6)
Of 32 proteins that gave poor HSQCs, 7 have crystallized

Data storage and mining: defined vocabulary
Property                       Vocabulary
Expression level               0-5 (no expression to high expression)
Solubility (test expression)   0-5 (insoluble to highly soluble)
Concentratability              0-5 (or mg/ml)
Crystal trials                 clear, precipitate, crystal
Initial HSQC NMR               good, promising, poor
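A defined vocabulary is only useful if records are rejected when they stray from it. A minimal sketch of such a record, assuming integer 0-5 scores; the class and field names are hypothetical and not taken from any actual database schema.

```python
from dataclasses import dataclass
from typing import Optional

CRYSTAL_TRIAL_TERMS = {"clear", "precipitate", "crystal"}
HSQC_TERMS = {"good", "promising", "poor"}


@dataclass
class TargetRecord:
    target_id: str
    expression: int                       # 0 (no expression) to 5 (high expression)
    solubility: int                       # 0 (insoluble) to 5 (highly soluble)
    concentratability: Optional[int] = None
    crystal_trial: Optional[str] = None
    hsqc: Optional[str] = None

    def __post_init__(self):
        # Scores must be integers on the 0-5 scale; unset optional fields are allowed.
        for name in ("expression", "solubility", "concentratability"):
            value = getattr(self, name)
            if value is not None and not (isinstance(value, int) and 0 <= value <= 5):
                raise ValueError(f"{name} must be an integer from 0 to 5, got {value!r}")
        if self.crystal_trial is not None and self.crystal_trial not in CRYSTAL_TRIAL_TERMS:
            raise ValueError(f"crystal_trial must be one of {sorted(CRYSTAL_TRIAL_TERMS)}")
        if self.hsqc is not None and self.hsqc not in HSQC_TERMS:
            raise ValueError(f"hsqc must be one of {sorted(HSQC_TERMS)}")


record = TargetRecord("target-001", expression=4, solubility=3, crystal_trial="precipitate")
```

Free-text entries such as "some crystals?" fail at entry time, which keeps the data minable later.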

Expression/solubility testing (figure: examples scored on the 0-5 scale)

Empirical bioinformatics: solubility tree based on 58 sequence properties (Kluger & Gerstein); branches labeled from mostly insoluble to mostly soluble

Crystallization conditions: efficiency through mining crystal screens (figure: crystallization conditions vs. different proteins, scored clear drop, precipitate or crystal; Affinium Pharmaceuticals)

Crystal trials: diminishing returns

Collaborators on structural proteomics
Lawrence McIntosh (UBC) C. Mackereth, G. Lee
Thomas Szyperski* (SUNY Buffalo)
Mike Kennedy* (PNNL) J. Cort, T. Ramelot
Mark Gerstein* (Yale) Yuval Kluger, Ning Lan
Kalle Gehring (McGill) I. Ekiel, G. Kozlov
Dave Wishart (U. Alberta) S. Bhattacharyya
Sherry Mowbray (Sweden)
Liang Tong* (Columbia)
John Hunt* (Columbia)
Andrzej Joachimiak* (ANL)
Weontae Lee (Yonsei U.)
Guy Montelione* (Rutgers)
Emil Pai (U. Toronto) V. Saridakis, N. Wu
*Northeast Structural Genomics Consortium
*Midwest Structural Genomics Consortium