
- Number of slides: 59
Lect 7: DNA, chromosomes, transcription, translation. 6th Ed: Ch 6-2, 6-3, 12-1, 12-2, 8-3, 8-4, 10-1, 16-1, 17-2
DNA base pairs. 2 strands. Base pairing. Anti-parallel strands, 5’ to 3’.
What is a gene? There are 23 chromosome pairs in humans. In G1, each chromosome has one double-stranded DNA molecule. Each chromosome has many genes (genotype). A gene often (but not always) produces a protein. Protein activity gives rise to a phenotype. There are approximately 25,000 genes in almost every cell of the human body. There are two copies of each gene in each of these cells (humans are diploid; exceptions include egg and sperm). Each gene can have many forms (alleles), BUT there are only 2 alleles of a gene in any one human. Different alleles are caused by changes in the DNA sequence of the gene (mutations). Some of these changes lead to the appearance of a new phenotype.
Genes on DNA. Hypothetical chromosome with many genes (vermilion eye, cut wing, crossveinless, echinus eye). What is wild type? The most common phenotype in a population.
Chromatin. Every human cell contains about 2 m of DNA (1 m per haploid genome). The human body consists of approximately 10^13 cells and therefore contains a total of about 2 × 10^13 m of DNA. The distance from the Earth to the Sun is 1.5 × 10^11 m, so the DNA in your body could stretch to the Sun and back more than 60 times. The diameter of a nucleus is about 5 × 10^-6 m (5 µm), and the DNA has to be packaged into this small volume (4/3 πR^3). DNA in the nucleus is not naked. How is the DNA packaged? Chromatin = DNA + histones + other proteins, in roughly equal amounts by mass (1 g : 1 g : 1 g).
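The arithmetic on this slide can be checked with a short calculation. The sketch below only uses the round numbers quoted above (2 m of DNA per cell, 10^13 cells, 1.5 × 10^11 m to the Sun, a 5 µm nucleus); the variable names are illustrative. With these inputs the result comes out at roughly 67 round trips, so the figure quoted in the lecture should be read as an order-of-magnitude estimate.

```python
import math

# Round numbers taken from the slide; all are approximations.
DNA_PER_CELL_M = 2.0        # metres of DNA in one diploid human cell
CELLS_IN_BODY = 1e13        # approximate number of cells in the body
EARTH_SUN_M = 1.5e11        # Earth-Sun distance in metres
NUCLEUS_DIAMETER_M = 5e-6   # diameter of a nucleus in metres

# Total DNA length in the body and how many Earth-Sun round trips it spans.
total_dna_m = DNA_PER_CELL_M * CELLS_IN_BODY
round_trips = total_dna_m / (2 * EARTH_SUN_M)

# Volume of a spherical nucleus, V = 4/3 * pi * R^3.
radius_m = NUCLEUS_DIAMETER_M / 2
nucleus_volume_m3 = 4 / 3 * math.pi * radius_m ** 3

print(f"total DNA length: {total_dna_m:.1e} m")
print(f"round trips to the Sun: {round_trips:.0f}")
print(f"nucleus volume: {nucleus_volume_m3:.2e} m^3")
```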
DNA to chromosomes (figure). Levels of packaging by diameter: 2 nm (DNA double helix, length 1 m), 10 nm, 300 nm, 700 nm, 1400 nm (chromosome, length 1 µm).
Nucleosomes. Four core histone proteins: H2A, H2B, H3, H4. They are very highly conserved. There are two molecules of each core histone in a nucleosome, and DNA is wrapped around the outside of the core histone octamer. Per ~200 bp of DNA: 2 mol each of H2A, H2B, H3 and H4, and 1 mol of H1. 166 bp of DNA wraps around the core histone octamer. Linker DNA (~30 bp) connects nucleosomes. There is 1 mol of linker histone H1.
Chromatin Loop Domains Chromosome Territories 8
Genetics and development. n + n → 2n, then differentiation. Same DNA content, > 200 cell types.
Eukaryotic Information Transfer: gene to phenotype = DNA → RNA → Protein
Step:      DNA → RNA         RNA → Protein
Process:   Transcription     Translation
Enzyme:    RNA polymerase    Ribosome
Location:  Nucleus           Cytoplasm
RNA serves as the intermediary between DNA and proteins. Genes are in the nucleus; proteins are made in the cytoplasm. RNA and DNA are structurally analogous, but there are three major differences:
DNA: four bases A T G C; double stranded; deoxyribose sugar backbone; most DNA is nuclear.
RNA: four bases A U G C; single stranded; ribose sugar backbone; most RNA is cytoplasmic.
DNA = gene = 5’ to 3’. Each single strand of DNA has polarity (5’ to 3’), and the two strands run in opposite directions. Base pairing is specific. DNA sequence is written 5’ to 3’. [Figure: gene sequences drawn as arrows pointing in either direction along the DNA, separated by intergenic sequence.]
Two anti-parallel strands of DNA. Each DNA strand is separate and distinct. [Figure: GeneA and GeneB lie on opposite strands.]
5’-A-G-T-A-C-G-3’
3’-T-C-A-T-G-C-5’
Genes code in the 5’ to 3’ direction, THEREFORE both strands of DNA can code for genes!
GeneA sequence: 5’-A-G-T-A-C-G-3’ = Ser-Thr
GeneB sequence: 5’-C-G-T-A-C-T-3’ = Arg-Thr
Transcription. mRNA is a copy of a gene that is exported to the cytoplasm. The synthesis of RNA by the enzyme RNA polymerase using DNA as the template is called transcription. SYNTHESIS OCCURS IN THE 5’ TO 3’ DIRECTION. For each gene, only one of the two strands of DNA is used as a template. The transcribed DNA strand (the strand RNA polymerase copies) is called the template strand. Its complementary strand is called the non-template strand.
Non-template (RNA-like/coding/sense)   5’ ATGCCCGGGAAATAA 3’
Template (noncoding/antisense)         3’ TACGGGCCCTTTATT 5’
mRNA                                   5’ AUGCCCGGGAAAUAA 3’
Protein                                   Met Pro Gly Lys Stop
Notice that the mRNA has the same sequence as the non-template strand, with U in place of T.
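The relationship between the two DNA strands and the mRNA can be sketched in a few lines of code. This is a minimal illustration using the sequences from this slide; the function name `transcribe` is hypothetical and the template is assumed to be written 3’ to 5’, as on the slide.

```python
# Pairing rule used by RNA polymerase: each template base specifies one RNA base.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3to5):
    """Return the mRNA (written 5'->3') for a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)

nontemplate = "ATGCCCGGGAAATAA"  # 5'->3', coding/sense strand from the slide
template = "TACGGGCCCTTTATT"     # 3'->5', template/antisense strand from the slide

mrna = transcribe(template)
print(mrna)
# The mRNA matches the non-template strand once T is replaced by U.
print(mrna == nontemplate.replace("T", "U"))
```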
RNA polymerase catalyzes the synthesis of RNA using DNA as a template. RNA polymerase is a multi-protein complex; in bacteria (E. coli) it consists of four different proteins. A GENE is a defined region of DNA: it has a start, a body and an end. Transcription involves THREE distinct processes: 1) transcription initiation, 2) transcription elongation, 3) transcription termination. Here is a DNA sequence: ATTGTGTCTCACTGCATGCGTACATGGTACGTAATGCCGA. Where does the polymerase start? How does it know where the start of the gene is?
Which house can PAUL enter? Rule = PAUL only enters a house with a pink staircase 16
Initiation of Transcription. Initiation involves RNA polymerase recognizing and binding to a specific sequence on the DNA and then beginning transcription from near that sequence. The recognition sequence is called a PROMOTER.
Non-template (sense/coding/RNA-like) strand on top, template (antisense/non-coding) strand below:
5’ acctTTGACATgcgagccggtacgcgTATAAAgccgacggcgATgg//gcgATG CCC GGG TTT TAA NN 3’
3’ tggaAACTGTAcgctcggccatgcgcATATTTcggctgccgcTAcc//cgcTAC GGG CCC AAA ATT NN 5’
Upper-case blocks, left to right: -35 sequence, 15-17 bp spacer, -10 sequence, start of the RNA, protein-coding region.
Very similar sequences (TTGACAT and TATAAA here; the textbook E. coli consensus is TTGACA at -35 and TATAAT at -10) are present in the promoters of most E. coli genes. These sequences are conserved and are critical for proper functioning of the promoter. What do we mean by a conserved sequence? Regions of DNA (gene or non-gene) or protein that share a similar sequence. Sequence rules for promoters are not hard and fast.
RNA polymerase. Bacterial RNA polymerase is a complex of four different proteins. Core enzyme: four polypeptides, alpha (α), beta (β), beta’ (β’) and omega (ω). Stoichiometry: 2 α : 1 β : 1 β’ : 1 ω. Core RNA polymerase can bind to DNA and catalyzes the synthesis of RNA, but it has no specificity: it will begin transcription at random from any position on the DNA. So how does a polymerase know where the start of the gene is?
Consensus sequences of promoters:
----TTGACAT--------TATAAA-----AT----ATG CCC
Left to right: -35 sequence, 15-17 bp spacer, -10 sequence, +1.
Which house can Paul enter? Rule = Paul only enters a house with a pink staircase. But if Paul is unable to see color, he might enter any house. He will need a person who is not color blind to guide him to the pink staircase.
Sigma factor guides polymerase to the start of a gene. Holoenzyme: the RNA polymerase holoenzyme contains an additional subunit, sigma (σ). The sigma subunit does two things: it reduces the affinity of polymerase for non-specific DNA, and it greatly increases the affinity of polymerase for promoters. Sigma binds specifically to the -35 promoter sequence and helps target the polymerase to the -10 promoter sequence. The critical step in regulation of transcription of most bacterial genes is the binding of RNA polymerase to the promoter.
Consensus sequences of promoters:
----TTGACAT--------TATAAA-----AT----ATG CCC
Left to right: -35 sequence, 15-17 bp spacer, -10 sequence, +1.
SIGMA factor is a TRANSCRIPTION ACTIVATOR.
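What sigma does biochemically, locating a -35 sequence followed 15-17 bp later by a -10 sequence, can be imitated with a pattern search. The sketch below is only an analogy: it uses the exact motifs drawn on these slides (TTGACAT and TATAAA), whereas real promoters merely approximate a consensus, and the function name `find_promoters` is hypothetical.

```python
import re

# Motifs exactly as drawn on the slides, separated by a 15-17 bp spacer.
# Real promoters tolerate mismatches; this exact-match pattern is a simplification.
PROMOTER = re.compile(r"TTGACAT([ACGT]{15,17})TATAAA")

def find_promoters(dna):
    """Return (index of the -35 sequence, spacer length) for each match on this strand."""
    return [(m.start(), len(m.group(1))) for m in PROMOTER.finditer(dna.upper())]

# Non-template strand from the transcription initiation slide (promoter region only).
seq = "acctTTGACATgcgagccggtacgcgTATAAAgccgacggcgATgg"
print(find_promoters(seq))

# A spacer that is too short is not recognized.
print(find_promoters("TTGACAT" + "A" * 10 + "TATAAA"))
```

Because the pattern is not symmetrical, running the same search on the reverse complement would find promoters that direct transcription of the other strand.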
RNA polymerase. Sigma binds and then recruits polymerase. RNA polymerase binds the promoter and unwinds the DNA. RNA polymerase synthesizes RNA. [Figure: the two DNA strands with their 5’ and 3’ ends labelled and the growing RNA.]
Promoter asymmetry and direction of transcription. The orientation of the promoter defines which of the two DNA strands will be transcribed. RNA chain length ranges from ~70 to 10,000 nucleotides. The promoter sequence is asymmetrical and orients the binding of the polymerase: the TATA sequence has an orientation, and the -35 (sigma) sequence has an orientation with respect to the TATA box.
OK:
5’--TTGACAT--------TATAAA-----AT--//-ATG CCC GGG TAA 3’
3’--AACTGTA--------ATATTT-----TA--//-TAC GGG CCC ATT 5’ (template)
NOT (TATA sequence reversed):
5’--TTGACAT--------AAATAT-----AT--//-ATG CCC GGG
  --AACTGTA--------TTTATA-----TA--//-TAC GGG CCC
NOT (order of the two sequences swapped):
5’--TATAAA--------TTGACAT-----AT--//-ATG CCC GGG
  --ATATTT--------AACTGTA-----TA--//-TAC GGG CCC
RNA polymers are synthesized in the 5’ to 3’ direction.
DNA    5’ --TTGACAT-----TATAAA----ATGGGGATG CCC GGG TAA 3’
DNA    3’ --AACTGTA-----ATATTT----TACCCCTAC GGG CCC ATT 5’ (template)
mRNA   5’ UGGGGAUG CCC GGG UAA 3’
Once the polymerase orientation is established only one DNA strand is read! The mRNA has the same sequence as the non-template strand (with U in place of T). RNA chains are ONLY made in the 5’ to 3’ direction. The template DNA strand is read in the opposite direction (3’ to 5’).
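Promoters can be found in different relative orientations. You can make RNA from both DNA strands.
5’ CAT------TTTATA-------TATAAA------ATG 3’
3’ GTA------AAATAT-------ATATTT------TAC 5’
[Two promoters facing in opposite directions: the right-hand one directs transcription to the right, the left-hand one to the left.]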
Gene orientations. For each gene, RNA is transcribed from ONLY ONE DNA strand (the template strand). However, different genes may use different DNA strands. Over the entire chromosome, different regions of both DNA strands will be transcribed. The orientation of a gene is the direction in which it is transcribed.
Transcription termination. Termination of transcription at some genes requires the protein Rho, which associates with the RNA polymerase, recognizes and binds a sequence in the mRNA, and terminates transcription by pulling the RNA away from the polymerase. This causes the polymerase to first pause and then dissociate from the DNA strand. Upon termination, the RNA is released from the DNA. Most terminators contain a region rich in GC bases followed by a poly-U tract. The GC-rich region adopts a hairpin structure in the RNA.
Eukaryotic RNA polymerase. Prokaryotes have a single RNA polymerase; this enzyme synthesizes mRNA, tRNA and rRNA. Eukaryotes have three RNA polymerases: RNA polymerase I (rRNA), RNA polymerase II (mRNA), RNA polymerase III (tRNA). RNA is synthesized in the nucleus. Eukaryotic RNA polymerases also use transcription activators to initiate from a specific site.
Prokaryote vs eukaryote (figure). Prokaryote: promoter with TTGACA at -35 and TATAA at -10. Eukaryote: enhancer upstream, promoter with TATAA at -25, gene downstream.
General transcription factors bind the TATA box and recruit RNA polymerase. [Figure: distal enhancer, TATAAA, Inr.] General factors: TFIIB, TFIID, TFIIE, TFIIF, TFIIH. These factors bind around the TATA box and recruit the RNA polymerase to the TATA box.
Regulation of Transcription by Enhancers. [Figure, inside the nuclear envelope: general factors and RNA Pol at the proximal promoter (TATA); tissue-specific transcription activators at the enhancer.] The enhancer functions to activate genes. The enhancer is made up of specific sequences that bind TISSUE SPECIFIC factors. The binding of these factors induces gene activation 100 fold!
Proteins that bind enhancers. Transcription activators (proteins) bind the enhancer sequences (like sigma factor). There are different activators in different eukaryotic cells. Some activators are cell type specific and some are gene specific. The activation domain helps recruit the general factors/RNA polymerase. Different activators have different DNA binding domains that recognize different DNA sequences. [Figure: distal enhancer, TATA, Inr, gene, polymerase.] Transcription activators recruit general factors to the TATA box. The general factors then recruit the RNA polymerase to the TATA box. They cooperate to activate transcription.
Cell specific expression. Different enhancers bind different tissue- and cell-specific transcription activator proteins, and this enables specific gene activation in specific cells. [Figure: genes Blue1, Blue2, Red1 and Red2 shown in two cell types. In the blue cell the BBB activator is present; it recognizes CGCGCG. In the grey cell the RRR activator is present; it recognizes AACGAA.]
Primary Transcript Processing. The primary transcript is processed before being transported to the cytoplasm. A 5’ cap of 7-methylguanosine is added. A 3’ poly-A tail is added, usually about 150-200 nucleotides long. The RNA is spliced.
Primary transcript:          UGGGGGGAUGGGGGGCG
Capped transcript:           7-methyl-G-UGGGGGGAUGGGGGGCG
Capped and polyadenylated:   7-methyl-G-UGGGGGGAUGGGGGGCGAAAA
Processing (figure). DNA → primary transcript → capped (m7Gppp) and polyadenylated (AAAA) transcript → splicing.
Splicing. Internal portions of the primary transcript are removed; this is called splicing. Regions of a gene that code for a protein (exons) are interrupted by regions called intervening sequences (introns). Primary transcripts are a mosaic of exons and introns. [Figure: the ovalbumin gene, 7700 bp long, with segments numbered 1 to 7; primary transcript → mRNA.]
Splicing. Short sequences dictate where splicing occurs:
Exon 1 - PuGUPuPu (splice donor) ------- Py(12-14) AG (splice acceptor) - Exon 2
After splicing: Exon 1 - Exon 2
Splicing requires an enzyme complex called a spliceosome, which consists of several small RNAs (snRNAs) complexed with ~50 proteins. The snRNAs base-pair with the splice donor and acceptor sites and are important for holding the two exons together during splicing.
Translation. Translation is the production of a polypeptide whose amino acid sequence is derived from the nucleotide sequence of the mRNA. mRNA is a simple linear molecule made of an array of FOUR different nucleotides. Proteins are complex three-dimensional structures made of arrays of 20 amino acids. How do simple mRNA molecules specify complex proteins? Two components were isolated from cells. tRNA: cells were fed radioactive amino acids and the radioactive macromolecules were purified, giving RNA complexed with a single amino acid (tRNA!). Ribosomes: RNA + protein, a complex enzyme that catalyzes the joining of amino acids to form polypeptides.
Genes, RNA, proteins. Genes are transcribed into RNAs that are translated into proteins. Genes also encode RNAs that are NOT translated into proteins. Two major classes of non-protein-coding RNA: tRNA = transfer RNA, rRNA = ribosomal RNA.
Adaptor hypothesis.
tRNA (Proline) anticodon     3’ GGC 5’
mRNA                     5’ AAA CCG GGG 3’
tRNA molecules act as adaptors that translate the nucleotide sequence into protein sequence. Each tRNA has two functional sites. Each tRNA includes a specific loop (the ANTICODON loop) that is used to read the mRNA. Each tRNA is covalently linked to one of the 20 amino acids (a tRNA with the anticodon GGG specifically carries the amino acid proline and will read the codon CCC in the mRNA).
Charged tRNA. tRNAs are synthesized from genes as RNA. AFTER a tRNA is made, a specific amino acid is covalently attached to the 3’ end of a specific tRNA by a specific enzyme called an aminoacyl-tRNA synthetase (AA-tRNA synthetase, the true translators). There are 20 different AA-tRNA synthetases for the 20 different amino acids. A tRNA carrying an amino acid is called a charged tRNA (for example Pro-tRNA).
tRNA genes, tRNA and charged tRNA.
Gene           tRNA anticodon   AA-tRNA synthetase     Charged tRNA
tRNA gene 1    5’ CAU 3’        Met-tRNA synthetase    Met-tRNA
tRNA gene 2    5’ GAA 3’        Phe-tRNA synthetase    Phe-tRNA
tRNA gene 3    5’ CUU 3’        Lys-tRNA synthetase    Lys-tRNA
mRNA         5’ AUG UUC AAG UAA 3’
anticodons   3’ UAC AAG UUC     5’
protein         Met Phe Lys STP
Codon-anticodon. The mRNA sequence complementary to the tRNA anticodon is called a codon. The sequence of amino acids along a protein is specified by the anticodon-codon alignment. Alignment is anti-parallel: if the mRNA codon is 5’ GGA 3’, the complementary tRNA anticodon is 3’ CCU 5’.
mRNA         5’ --- AUG GGG AAG CCG UAA --- 3’
anticodons   3’     UAC CCC UUC GGC         5’
amino acids         Met Gly Lys Pro
PROTEIN: Met-Gly-Lys-Pro. tRNA adaptors translate the sequence of nucleotides present in the mRNA into a sequence of amino acids in the protein.
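The anti-parallel pairing rule is easy to get wrong by hand, so here is a minimal sketch that derives an anticodon from a codon and writes it in both directions. The function names are illustrative, and only standard Watson-Crick pairs are used (no wobble pairing).

```python
# Watson-Crick pairing between RNA bases.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_3to5(codon_5to3):
    """Anticodon written 3'->5', so each base sits under its partner in the codon."""
    return "".join(RNA_PAIR[base] for base in codon_5to3)

def anticodon_5to3(codon_5to3):
    """The same anticodon written in the conventional 5'->3' direction."""
    return anticodon_3to5(codon_5to3)[::-1]

# Codons from the slides: GGA from the rule above, then the example mRNA.
for codon in ["GGA", "AUG", "GGG", "AAG", "CCG"]:
    print(codon, "3'-" + anticodon_3to5(codon) + "-5'", "5'-" + anticodon_5to3(codon) + "-3'")
```

For the codon 5’ AUG 3’ this gives 3’ UAC 5’, which is the same molecule as the 5’ CAU 3’ anticodon listed for the Met-tRNA.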
The genetic code has 3 letters 5’ 3’ TTT AAA TAG CCC GGG CTA TTT AAA ATG CAT 3’ 5’ 5’ 3’ AUG UUU AAG UAG CCC 3 U A C 5 A A A 3 U U C 5 STP Met Phe Lys 45
Protein synthesis is a stepwise process. [Figure: amino acids aa1, aa2 and aa3 are joined one at a time as the mRNA is read 5’ to 3’.]
Enzymes are required for protein synthesis. Mixing mRNA with charged tRNAs does not lead to protein synthesis. The enzyme necessary for catalysis of protein synthesis is the RIBOSOME. Ribosomes are complex enzymes made of more than 50 proteins and 3 RNA molecules. The RNA molecules in ribosomes are called ribosomal RNA (rRNA). The ribosome has several functional sites: peptidyl transferase, tRNA binding sites (E, P, A) and an mRNA binding site.
Translation Initiation. What about the first amino acid? Does the ribosome start synthesis at the start of the mRNA? NO. Translation of an mRNA by the ribosome always initiates at the INITIATION codon, AUG. AUG is normally recognized by a tRNA charged with the amino acid methionine. When an AUG occurs near the 5’ end of the mRNA (at a special initiation position), it is recognized by a tRNA charged with Met.
Special initiation position.
rRNA   3’       UCCUCCA                        5’
mRNA   5’ AnnnnnAGGAGGUnnnnnnnAUGUCUAUUACCnnnn 3’
What is the special initiation position? Upstream (5’) of the start codon AUG is a sequence in the mRNA that is complementary to a sequence in one of the ribosomal RNAs. This sequence is called the ribosome binding sequence (RBS). Once the ribosome binds the mRNA via the ribosome binding site it tracks along the mRNA until it encounters an AUG. It initiates protein synthesis at the AUG.
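The rule described here (bind at the RBS, then track along the mRNA to the first AUG) can be written as a two-step search. This is a sketch that assumes an exact match to the AGGAGGU sequence shown on the slide; the function name is hypothetical, and the example mRNA replaces the slide's unspecified "n" positions with C purely as filler.

```python
RBS = "AGGAGGU"  # ribosome binding sequence as drawn on the slide

def find_start_codon(mrna):
    """Return the index of the first AUG downstream of the RBS, or None."""
    rbs_at = mrna.find(RBS)
    if rbs_at == -1:
        return None  # no ribosome binding sequence: no initiation in this model
    start = mrna.find("AUG", rbs_at + len(RBS))
    return None if start == -1 else start

# Slide example with the 'n' positions filled with C (an arbitrary choice).
mrna = "ACCCCCAGGAGGUCCCCCCCAUGUCUAUUACC"
start = find_start_codon(mrna)
print(start, mrna[start:start + 3])

# An AUG without an upstream RBS is not used as a start in this model.
print(find_start_codon("AUGUCUAUUACC"))
```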
STEPS in Elongation. [Figure: the mRNA 5’-----UUCUGGUUU---3’ is read one codon at a time. tRNAs with anticodons AAG (Phe), ACC (Trp) and AAA (Phe) pair with the codons UUC, UGG and UUU in turn, and the growing chain Met-Ala-Leu is extended by Phe and then Trp.]
Translation termination. The growing polypeptide chain is released when a stop codon is reached. There are three stop codons: UAA, UAG, UGA. These codons are not recognized by a tRNA; they are recognized by a protein, the release factor. The release factor binds to the STOP codon, and this causes the ribosome to release the mRNA and the newly synthesized polypeptide. [Figure: the chain Met-Ala-Leu-Phe-Trp on a tRNA with anticodon ACC, at 5’-----UGGUAA-----3’ (mRNA).]
The Genetic Code. Properties of the genetic code:
1 - The code is written in a linear form using the nucleotides that comprise the mRNA
2 - The code is a triplet: THREE nucleotides specify ONE amino acid
3 - The code is degenerate: more than one triplet specifies a given amino acid
4 - The code is unambiguous: each triplet specifies only ONE amino acid
5 - The code contains stop signs: there are three different stops
6 - The code is commaless
7 - The code is non-overlapping
8 - The code is universal: the same “dictionary” is used by viruses, prokaryotes, invertebrates and vertebrates
The GENETIC CODE. Codons in mRNA (5’ to 3’); anticodons in tRNA will be the reverse complement.
First    Second letter                                    Third
letter   U          C          A           G              letter
U        UUU Phe    UCU Ser    UAU Tyr     UGU Cys        U
U        UUC Phe    UCC Ser    UAC Tyr     UGC Cys        C
U        UUA Leu    UCA Ser    UAA STOP    UGA STOP       A
U        UUG Leu    UCG Ser    UAG STOP    UGG Trp        G
C        CUU Leu    CCU Pro    CAU His     CGU Arg        U
C        CUC Leu    CCC Pro    CAC His     CGC Arg        C
C        CUA Leu    CCA Pro    CAA Gln     CGA Arg        A
C        CUG Leu    CCG Pro    CAG Gln     CGG Arg        G
A        AUU Ile    ACU Thr    AAU Asn     AGU Ser        U
A        AUC Ile    ACC Thr    AAC Asn     AGC Ser        C
A        AUA Ile    ACA Thr    AAA Lys     AGA Arg        A
A        AUG Met    ACG Thr    AAG Lys     AGG Arg        G
G        GUU Val    GCU Ala    GAU Asp     GGU Gly        U
G        GUC Val    GCC Ala    GAC Asp     GGC Gly        C
G        GUA Val    GCA Ala    GAA Glu     GGA Gly        A
G        GUG Val    GCG Ala    GAG Glu     GGG Gly        G
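The triplet, commaless and non-overlapping properties are what make translation a simple table lookup. The sketch below builds the standard genetic code as a dictionary (one-letter amino acid codes, `*` for STOP) and reads an mRNA three bases at a time from the first AUG until a stop codon. The function name and the decision to start at the first AUG are simplifying assumptions; a real ribosome also needs the initiation signals described earlier.

```python
from itertools import product

# Standard genetic code, codons ordered U, C, A, G at each position.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(mrna):
    """Translate from the first AUG, one codon at a time, stopping at a stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG or UGA: release the chain
            break
        peptide.append(amino_acid)
    return "".join(peptide)

print(translate("AUGGGGAAGCCGUAA"))  # mRNA from the codon-anticodon slide
print(translate("AUGCCCGGGAAAUAA"))  # mRNA from the transcription slide
```

The two results correspond to Met-Gly-Lys-Pro and Met-Pro-Gly-Lys in three-letter notation.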
The code.
3 amino acids are specified by 6 different codons
5 amino acids are specified by 4 different codons
1 amino acid is specified by 3 different codons
9 amino acids are specified by 2 different codons
2 amino acids are specified by 1 codon
The degeneracy arises because more than one tRNA specifies a given amino acid, and because a single tRNA can base-pair with more than one codon. tRNAs do not normally pair with STOP codons. [Figure: the serine codons UCC, UCA and AGC in an mRNA (5’ to 3’), read by Ser-tRNAs with anticodons AGG, AGU and UCG; a single Ser-tRNA (anticodon AGG) shown against more than one serine codon.]
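The counts in this list can be reproduced directly from the standard codon table. The following sketch rebuilds the table (one-letter amino acid codes, `*` for STOP) and tallies how many amino acids have 6, 4, 3, 2 or 1 codons; the variable names are illustrative.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered U, C, A, G at each position.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of codons for each amino acid (stop codons excluded).
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

# How many amino acids share each codon count.
degeneracy = Counter(codons_per_amino_acid.values())

for n_codons in sorted(degeneracy, reverse=True):
    print(f"{degeneracy[n_codons]} amino acid(s) specified by {n_codons} codon(s)")

stop_codons = sorted(codon for codon, aa in CODON_TABLE.items() if aa == "*")
print("stop codons:", stop_codons)
```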
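Predicting Genes. If you sequence a large region of DNA, how do you determine if the region codes for a protein or not? The two most important characteristics of a gene: start codon = ATG; stop codon = TAA, TAG, TGA (UAA, UAG, UGA in the mRNA). Start/stop method: in each of the six reading frames (three per strand), look for an ATG followed in the same frame by a stop codon.
5’ ATG GCC TAT GAG AAT TAA TGA CCC GGG        (first ATG: reaches the stop TAA)
5’ ATG GCC T ATG AGA ATT AAT GAC CCG GG--     (a second ATG, in a different frame)
[Figure: map of positions 0 to 400 showing reading frames 1 to 6.]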
Predicting Genes. The first amino acid in all proteins is always Met (ATG). The end of a protein is specified by the stop codons TAA, TAG, TGA. BOTH strands of double-stranded DNA CAN code for proteins.
Top strand:      5’ TTATATGGATGAATGACATA 3’
Bottom strand:   5’ TATGTCATTCATCCATATAA 3’
Possible proteins for the top strand, one per ATG:
TTAT ATG GAT GAA TGA CATA          Met-Asp-Glu-Stp
TTATATGG ATG AAT GAC ATA           Met-Asn-Asp-Ile (no stop within this fragment)
TTATATGGATGA ATG ACA TA            Met-Thr (no stop within this fragment)
After rotation of the DNA (bottom strand read 5’ to 3’):
T ATG TCA TTC ATC CAT ATA A        Met-Ser-Phe-Ile-His-Ile (no stop within this fragment)
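The start/stop method on this slide is mechanical enough to automate. The sketch below scans a DNA strand for every ATG, translates in that frame until a stop codon or the end of the fragment, and then repeats the scan on the reverse complement ("after rotation"). Function names are hypothetical, peptides use one-letter amino acid codes, and this is a toy version of what real gene-finding software does.

```python
from itertools import product

# Standard genetic code with DNA letters; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the other strand, written 5'->3'."""
    return dna.translate(COMPLEMENT)[::-1]

def find_orfs(dna):
    """For each ATG return (start index, peptide, whether a stop codon was reached)."""
    orfs = []
    start = dna.find("ATG")
    while start != -1:
        peptide, has_stop = [], False
        for i in range(start, len(dna) - 2, 3):
            amino_acid = CODON_TABLE[dna[i:i + 3]]
            if amino_acid == "*":
                has_stop = True
                break
            peptide.append(amino_acid)
        orfs.append((start, "".join(peptide), has_stop))
        start = dna.find("ATG", start + 1)
    return orfs

top = "TTATATGGATGAATGACATA"  # top strand from the slide
for strand in (top, reverse_complement(top)):
    print(strand, find_orfs(strand))
```

Only the first ATG on the top strand yields a complete open reading frame (Met-Asp-Glu followed by a stop); the other candidates run off the end of this short fragment.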
Genes also require promoters and ribosome binding sites. Is there a ribosome binding site upstream of the ATG? Is there a promoter upstream of the ribosome binding site?
Prokaryotic gene (PROMOTER with -35 and -10 sequences, then RIBOSOME BINDING SITE, then coding region):
DNA       5’ ---TTGACAT------TATAAT-------AT-/-AGGAGGT-/-ATG CCC CTT TTG TGA 3’
DNA       3’ ---AACTGTA------ATATTA-------TA-/-TCCTCCA-/-TAC GGG GAA AAC ACT 5’
mRNA      5’ U-/-AGGAGGU-/-AUG CCC CUU UUG UGA 3’
protein   Met Pro Leu Leu Stp
When ALL OF THESE RULES ARE SATISFIED, THEN AND ONLY THEN WILL A PIECE OF DNA GENERATE A PROTEIN. EUKARYOTES ARE EVEN MORE COMPLICATED.
Molecular Memory. What is so important about speculations on the origins of life? The mechanism of molecular memory. Macromolecules ultimately decay and decompose (ashes to ashes, dust to dust). Therefore macromolecules that have a means of reproducing have an evolutionary future. What is required to replicate a molecule? A template and an enzyme.
           Templating ability (replicability)    Catalytic activity
DNA        XXX
RNA        XXX                                   XXX
Protein                                          XXX
RNA has both abilities: RNA WORLD.
Speculations on replicating entities. Speculation on the nature of the first replicating entity produces four possibilities: 1) Proteins evolved and early proteins somehow replicated directly; later in evolution nucleic acids evolved. 2) Nucleic acids evolved and early nucleic acids somehow replicated directly. 3) Nucleic acids and proteins co-evolved. 4) The first life was unrelated to nucleic acids and proteins; later in evolution nucleic acids and proteins evolved. The discovery of RNA with catalytic activity has increased support for the RNA world. RNA is also the genetic material of some viruses! [Figure: enzymatic activity from the beginning of life (RNA based) to the present day (protein based). Ribosomes, snRNPs and other enzymes containing RNA + proteins: ancient evolutionary intermediates?]