
- Number of slides: 23
Native-Source Structural Proteomics Nathaniel Echols*, Monica Totir*, Andrew May#, Chloe Zubieta*, Alisa Moskaleva*, Tom Alber* (* UC Berkeley; # Fluidigm Corporation) Protein Structure Initiative Bottlenecks Workshop, April 15th, 2008
Native-source structural proteomics • Native sources provide access to samples that may be difficult to obtain by recombinant methods • Project goal: obtain structures of complexes and low-abundance proteins 1. Scale up purification (>100 g protein) 2. Scale down crystallization (picoliter reactions) • No cloning, no overexpression.
Experimental approach • Use E. coli as a model system to develop the purification protocol needed to go from grams of starting material to 100 μg fractions • Screen the final samples at a concentration of >10 mg/ml in Fluidigm Topaz chips and identify the crystallizable fractions • Identify samples by mass spectrometry • Set the selected samples in diffraction-capable chips or nanodrop crystallization trays for X-ray data collection
Proof-of-concept: the E. coli proteome • Small, well-studied proteome, but still some novelty: 4243 predicted proteins (manageable number of molecular species); 860 membrane proteins; 1000 proteins with >90% sequence identity to known structures; 1250 with >50% sequence identity; 2000 with >30% sequence identity; nearly 1400 uncharacterized non-membrane proteins • Existing structures allow us to validate the approach • Easy to grow in massive quantities • Lysis and clarification are relatively simple
Proteome component sizes Cellular protein content is dominated by large assemblies
Purification scheme A new philosophy (keep everything) required new strategies • Lyse at pH 7-8 • Cross-flow size fractionation (500 kDa TFF) • Proteins/complexes bigger than 500 kDa: Sucrose gradients, Size exclusion chromatography, MonoQ • Proteins/complexes smaller than 500 kDa, steps: Capto Q, Phenyl, MonoQ/MonoS, SP Sepharose, Superdex 200, Phenyl, MonoQ/MonoS • Scalable, gentle purification scheme
Purification scheme (continued) Proteins/complexes smaller than 500 kDa • Steps: Heparin, Capto MMC, Blue, Superdex 200, Phenyl, MonoQ/MonoS • Column size / approx. protein quantity: 1-2 L / 50 g; 300 mL / 10 g; 20-50 mL / 1 g; 1-8 mL / 10-100 mg • [Figure: typical anion exchange chromatogram of the final samples]
The first large-scale prep • 200 g of E. coli cells grown in M9 minimal medium and lysed • Purification scheme: Capto Q, Phenyl, MonoQ/MonoS • 272 fractions analyzed in 96-well Caliper electrophoresis robot and selected for crystallization [Figure: Caliper “gel”]
Crystallization pipeline Purity checked by Caliper gel Microfluidic crystallization with the Fluidigm TOPAZ system (8.96 chips) Promising chip crystals Sub-optimal chip crystals MS identification Diffraction-capable chips 96-well sitting drop for further optimization X-ray data collection
Microfluidic crystallization • 272 samples set in Fluidigm TOPAZ 8.96 chips with Index screen • Automated inspection and scoring required to find crystals efficiently • 190/272 (70%) produced crystals or microcrystals in chips (high redundancy in crystal forms) • 50 unique crystal forms by visual inspection • High-quality crystals possible even in very impure samples http://www.fluidigm.com/topaz.htm
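The counts on this slide can be sanity-checked with a few lines of Python. This is only an illustrative sketch: the function names are made up here, and the chip count assumes that the "8.96" designation means 8 samples screened against 96 conditions per chip, which the slide does not state.

```python
import math


def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


def chips_needed(n_samples: int, samples_per_chip: int = 8) -> int:
    """Number of chips required to screen n_samples.

    ASSUMPTION: 8 samples per chip, inferred from the "8.96" chip name.
    """
    return math.ceil(n_samples / samples_per_chip)


if __name__ == "__main__":
    # 190 of 272 samples gave crystals or microcrystals (slide value).
    print(f"chip hit rate: {percent(190, 272):.1f}%")
    print(f"chips for 272 samples: {chips_needed(272)}")
```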
Crystal optimization • 66 samples picked for optimization in nanodrop vapor diffusion trays (using Mosquito robot) • Protocol: sample 40%-100% precipitant concentration with different protein:well ratios (1:3, 1:1, 3:1) • 50 of the hits (76%) were reproducible by this method
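The optimization protocol above amounts to a small grid of drops per hit. A minimal sketch of how such a grid could be enumerated follows; the 10% step between 40% and 100% is an assumption made for illustration (the slide gives only the range), and the percentages are read here as fractions of the original hit condition.

```python
from fractions import Fraction
from itertools import product

# Slide values: precipitant from 40% to 100%, protein:well ratios 1:3, 1:1, 3:1.
# ASSUMPTION: the 10% increment is illustrative; the slide gives only the range.
PRECIPITANT_PERCENT = range(40, 101, 10)
PROTEIN_WELL_RATIOS = [(1, 3), (1, 1), (3, 1)]


def optimization_grid():
    """Return one dict per drop: precipitant level, ratio, protein fraction."""
    grid = []
    for pct, (protein, well) in product(PRECIPITANT_PERCENT, PROTEIN_WELL_RATIOS):
        grid.append({
            "precipitant_percent": pct,
            "ratio": f"{protein}:{well}",
            # Fraction of the drop volume contributed by the protein solution.
            "protein_fraction": Fraction(protein, protein + well),
        })
    return grid


if __name__ == "__main__":
    for drop in optimization_grid():
        print(drop["precipitant_percent"], drop["ratio"],
              float(drop["protein_fraction"]))
```

With these assumed steps the grid has 7 precipitant levels times 3 ratios, so 21 drops per hit.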
Diffraction-capable microfluidic chips “Hands-Free” data collection Reagents Samples 10 nL sample chambers ALS Beamline 8.3.1
Structure determination • MS identification of unique crystals should be the first step • 25 unique native datasets collected at ALS 8.3.1/12.3.1 • 15 already published structures identified • 3 structures novel in E. coli, phased by MR • Robotics and automation software used for data collection and processing whenever possible
Rapid structure identification by MR • Concept: identify protein from “anonymous” diffraction data (no mass spec info) • Search set of every PDB structure homologous to an E. coli protein (~10,000 models) • Molecular replacement rotation function run using each model • Identical structures are usually high-scoring • Homologous proteins may still score better than average • Potential solutions can be verified by full MR
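The brute-force search described above can be sketched as a rank-and-filter loop: score every model, then pass only the models that stand out from the bulk on to full MR. The sketch below is not the authors' software. `rotation_score` is a hypothetical callable standing in for a real rotation-function run, and the z-score cutoff is an assumed way of expressing "better than average".

```python
from statistics import mean, pstdev
from typing import Callable, Dict, List, Tuple


def rank_models(
    model_ids: List[str],
    rotation_score: Callable[[str], float],
    z_cutoff: float = 3.0,
) -> List[Tuple[str, float]]:
    """Score every model and return those that stand out, best first.

    rotation_score is a HYPOTHETICAL stand-in for a rotation-function run;
    it is not the API of any molecular replacement program.
    """
    scores: Dict[str, float] = {m: rotation_score(m) for m in model_ids}
    mu = mean(scores.values())
    sigma = pstdev(scores.values())
    if sigma == 0:
        return []  # every model scored the same, so nothing stands out
    hits = [
        (model, (score - mu) / sigma)
        for model, score in scores.items()
        if (score - mu) / sigma >= z_cutoff
    ]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Toy scores: twenty unrelated models and one that matches the crystal.
    toy_scores = {f"model_{i:02d}": 1.0 for i in range(20)}
    toy_scores["matching_model"] = 10.0
    for model, z in rank_models(list(toy_scores), toy_scores.__getitem__):
        print(model, round(z, 2))  # candidates to verify by full MR
```

Only the candidates returned here would go on to the more expensive full MR step, which is what keeps a search over roughly 10,000 models tractable.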
Experimental phasing • The largest bottleneck: much more manual labor required • Cryoprotectants contain heavy monovalent ions (Br-, Rb+) • Metal quick-soaks (0.5-5 mM): ethyl mercury phosphate/thimerosal; HgCl2 or PCMBS (p-chloromercuribenzenesulfonic acid); SmCl3; PtCl4, PtCl6
Current structures, new and old (Structures labelled in red were identified by brute-force search.) • New (% identity to PDB): Methylglyoxal reductase (37%); pGlucose isomerase (65%); ß-glucosidase (?) (bglA) (33%) • Old: ycaC; Arginosuccinate lyase; pSer aminotransferase; Dihydrodipicolinate synthase; Molybdopterin biosynthesis prot. B; PPIase; Catalase HPII (also in truncated form); Citrate synthase; Lysyl-tRNA synthetase; Cystathionine -synthase; Transhydrogenase domain I; Pyruvate kinase; Hsp31 chaperone; 5-keto-4-deoxyuronate isomerase
Purity of crystallized samples
Summary • Macro-to-micro strategy tested with E. coli • Large-scale fractionation pipeline: new approaches and equipment (TFF, larger columns, Caliper CE robot) needed to scale up and keep everything; currently 464 fractions isolated for crystallization • Small-scale crystallization: >50% of fractions crystallized in Topaz microfluidic format; many impure fractions yielded starting crystals; optimization in sitting drops and new diffraction chips was efficient • Structure determination: 25 data sets collected, 18 structures phased, all oligomeric; 3 structures novel to E. coli; brute-force molecular replacement was used in most cases
Future directions • Continue improvements to purification methods • Pathogenic organisms (e.g. Mycobacteria) • Plant/mammalian proteomes: diploid, much larger and more complex • Smaller sets of related proteins: protease-resistant domains; serum proteins; ATP-binding proteins; metalloproteins; large complexes
Acknowledgements • Tom Alber, Monica Totir, Chloe Zubieta, Alisa Moskaleva • Andy May (Fluidigm) • Scott Gradia, James Berger (UCB) • James Holton (ALS) • George Meigs, Jane Tanamatchi (ALS) • ALS beamlines 8.3.1, 12.3.1 • Tony Iavarone (QB3 MS facility) • Scripps Center for Mass Spectrometry • W. M. Keck Foundation • Millipore Corporation • Funded in part by UC Discovery/Fluidigm Corporation and NIGMS grant GM71326-02
Second large-scale prep: a better purification scheme • 1000 g of E. coli cells grown in M9 minimal medium and lysed • Lysate at pH 7 • Cross-flow size fractionation (500 kDa TFF) • Proteins/complexes bigger than 500 kDa: Sucrose gradients, Size exclusion chromatography, MonoQ • Proteins/complexes smaller than 500 kDa: SP Sepharose, Blue, Heparin, Capto MMC, Superdex 200, Phenyl, MonoQ/MonoS • 192 unique final samples to be screened in 8.96 chips and subsequently set up in diffraction-capable chips
Apparently rare proteins accessible [Figure placeholder: # genes vs. abundance (# transcripts). Author note on slide: "I will have to look this up. Or do we have something like this?"]