
ee7f60a5db1ab1b634e5f07a69c50c8b.ppt
- Number of slides: 132
Too many matches… http://www.genomatix.de - http://www.genomatix-software.com © 2007 Genomatix Software GmbH
Too many matches… A typical question: What are the potential TF sites involved in the regulation of my gene of interest? A typical approach: “Let's run MatInspector over the promoter region of my gene.”
Too many matches… A typical question: Where do I get my input promoter DNA sequence from? A typical approach: “Let's extract it from NCBI, 3 kb upstream of the TSS, to be sure to have the promoter…”
Too many matches… A typical result: Which of those matches are relevant? How do I get rid of all those “false positives”?
TF binding sites… Important facts to consider:
- There is not a single false positive match: MatInspector gives you all physical TF binding sites.
- A physical TFBS is found every 10 to 15 bp throughout the genome.
- A single isolated TF binding site carries no function.
- TFs work through complexes, which are represented at the sequence level by sets of TF binding sites in a certain distance relationship and orientation -> promoter frameworks.
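
The density quoted above (one physical TFBS every 10 to 15 bp) already explains why a scan over a long sequence returns so many matches. The following is only a back-of-the-envelope sketch; the function name and the example lengths are illustrative assumptions and not part of any Genomatix tool.

```python
def expected_site_range(length_bp, spacing_bp=(10, 15)):
    """Rough range of physical TFBS matches expected in a sequence.

    spacing_bp is the (densest, sparsest) spacing between sites in bp,
    taken from the "every 10 to 15 bp" rule of thumb.
    """
    densest, sparsest = spacing_bp
    return length_bp // sparsest, length_bp // densest


# Example lengths are arbitrary; 3000 bp is the "3 kb upstream" case.
for length in (600, 3000, 10000):
    low, high = expected_site_range(length)
    print(f"{length} bp: roughly {low} to {high} physical TF binding sites")
```

Under this rule of thumb, the "3 kb upstream" extract alone would be expected to yield a few hundred matches, none of which is a false positive in the physical sense.
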
TF binding sites… Okay, so what is a physical TF binding site? What is a functional TF binding site?
False positives? A physical binding site is invariable
- A physical binding site is a fixed part of the genome, described by a weight matrix or an IUPAC string.
- Physical binding sites can be detected by MatInspector.
- This DNA sequence can usually bind its cognate protein(s).
- Physical binding sites have no function in transcription on their own.
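
To make the idea of a physical binding site as an IUPAC string concrete, here is a minimal sketch that scans both strands of a sequence for such a string. It is not how MatInspector works (MatInspector uses weight matrices); the motif `TATAWA` and the test sequences are illustrative assumptions.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a DNA or IUPAC string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(sequence, iupac):
    """Return sorted (start, strand) tuples, 0-based, for every match.

    Minus-strand matches are found by scanning the plus strand with the
    reverse complement of the motif. Palindromic motifs are therefore
    reported on both strands.
    """
    sequence = sequence.upper()
    hits = set()
    for strand, motif in (("+", iupac), ("-", reverse_complement(iupac))):
        # The lookahead lets overlapping matches be reported as well.
        pattern = "(?=" + "".join(IUPAC[c] for c in motif.upper()) + ")"
        for match in re.finditer(pattern, sequence):
            hits.add((match.start(), strand))
    return sorted(hits)


print(find_sites("GGTATAAAGGCC", "TATAWA"))
print(find_sites("CCTTTATACC", "TATAWA"))
```

Every hit returned this way is a physical site in the sense used above; whether it is functional cannot be decided from the sequence match alone.
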
Physical vs functional TFBS: A functional binding site depends on context!
- A functional binding site requires a cellular context. One binding site, five cell types... but the binding proteins are present in only 2 cell types -> no functional binding site in the other 3 cell types!
- A functional binding site requires a genomic context. Even when the binding proteins are present, biological function may require additional binding sites (a module)!
- Transcriptional function is defined by the cellular and genomic context.
Transcriptional modules: A transcriptional module is the smallest functional unit
- A transcriptional module consists of two or more TFBSs.
- Strand orientation, relative order and distance of the TFBSs are important.
- A module also has a strand orientation and can shift within a promoter.
- Transcriptional modules are present in promoters and enhancers.
- The core promoter is just another module. [Figure labels: TATA box, INR box; F1 (+), F2 (-), F3 (+/-).]
- Transcriptional modules integrate signals via the interacting TFs.
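
The three properties listed above (strand orientation, relative order and distance) can be sketched as a simple search over a list of already predicted sites. This is only an illustration for a two-element module on one strand; the `Site` type, the factor names `F1`/`F2` and all positions are hypothetical, and the sketch is not the ModelInspector algorithm.

```python
from typing import NamedTuple


class Site(NamedTuple):
    factor: str    # TF family name, e.g. "F1" (hypothetical)
    position: int  # start position within the promoter
    strand: str    # "+" or "-"


def find_modules(sites, first, second, min_dist, max_dist):
    """Find pairs where `first` lies upstream of `second` within a distance range.

    `first` and `second` are (factor, strand) tuples, so both the order of
    the two elements and their strand orientation are enforced.
    """
    hits = []
    for a in sites:
        if (a.factor, a.strand) != first:
            continue
        for b in sites:
            if (b.factor, b.strand) != second:
                continue
            distance = b.position - a.position
            if min_dist <= distance <= max_dist:
                hits.append((a, b, distance))
    return hits


sites = [
    Site("F1", 20, "+"), Site("F2", 45, "-"),
    Site("F2", 160, "-"), Site("F1", 300, "-"), Site("F2", 310, "+"),
]
# Module definition: F1 on "+" followed by F2 on "-", 10 to 50 bp apart.
for a, b, distance in find_modules(sites, ("F1", "+"), ("F2", "-"), 10, 50):
    print(f"{a.factor}({a.strand}) at {a.position} -> "
          f"{b.factor}({b.strand}) at {b.position}, distance {distance} bp")
```

In this toy input, the F1/F2 pair at positions 300 and 310 is not reported because both strands are inverted relative to the definition. A fuller implementation would also search for the whole module on the opposite strand, since a module itself has a strand orientation.
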
Why does nature use modules? [Figure: promoters carrying the binding sites A, B and C in different arrangements.] No common organization? Common modules!
Transcriptional modules: Promoter modules can work in three different ways
- Antagonistic: “composite elements”.
- Synergistic: “short range module”, distance ≤ 50 bp; binding affinity: high/low is possible.
- Synergistic: “looping module”, distance up to 300 bp; binding affinity: high/high only.
Transcriptional modules define target genes of pathways: NFkappaB regulates a number of “target genes” and is involved in the regulation of target genes of several pathways. [Figure labels: modules combining NFkB with CREB, C/EBP and IRF-1; genes shown: ELAM-1, IL-2, ICAM-1, IRF-1, IL-8, IL-6, SAA-1, SAA-2, IFN-ß, G-CSF, IP-10, HLA-A, HLA-B, E-Selectin, IL-1; “Induced by 2 pathways!”] Modules are the basic elements of regulatory pathways and networks.
Transcriptional modules: Key-lock principle. [Figure labels: protein complex binding; “DNA-looping”; distal promoter/enhancer; proximal promoter; core promoter (TATA, INR); TBP, TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH, RNA polymerase II; transcription factor binding sites.]
Transcriptional modules: Transcription regulation mechanism. [Figure labels: promoter; exon; primary transcript; protein complex; gene A, transcript n; gene B, transcript p; gene C, transcript m.] Transcription regulation implies a regulatory network.
Transcriptional modules: Context dependent expression by different protein complexes. [Figure labels: TATA, INR, TBP, TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH.] Same lock, different keys: same gene, different biological context.
Transcriptional modules: Context specific transcription regulation. Example: analysis of the RANTES promoter in different cell lines. Experimentally verified evidence that TFBSs from modules which are crucial for regulation in one biological context (cell type) are totally irrelevant in another! Fessele, S., Maier, H., Zischek, C., Nelson, P. J., Werner, T. (2002) "Regulatory context is a crucial part of gene function" Trends in Genetics 18, 60-63 (MEDLINE 1181130)
Transcriptional modules: Modules contribute strongly to functional promoter analysis
- Modules are usually linked to at least one known biological function.
- A module match in a promoter makes this gene a good candidate.
- A module match in a promoter does not prove the gene to be a target.
- Additional independent evidence is required to prove the target.
- A module match immediately suggests experimental verification.
- Module matches reduce experimental efforts by orders of magnitude.
Promoter sequences: Very interesting, but how does all this help me with my original question? The question still is: What are the potential TF sites involved in the regulation of my gene of interest?
Promoter sequences: More things to consider before asking that question! There was another one: Where do I get my input promoter DNA sequence from? “Let's extract it from NCBI, 3 kb upstream of the TSS, to be sure to have the promoter…”
Promoter sequences: More things to consider…
- 3 kb is too large for a meaningful analysis.
- Even going 10 kb upstream of the TSS is no guarantee of having the relevant promoter sequence.
- Multiple promoters are the rule, not an exception.
- The non-coding first exon is always part of the promoter.
Huh? What does this mean? Where do I get this damn promoter now?
Alternative transcripts/promoters: Which promoter? One gene = one promoter? Gene A? Genes usually have alternative transcripts with alternative promoters.
Alternative transcripts/promoters: Context dependent expression via different promoters. Example: glucokinase. [Figure labels: coding exons; hepatic promoter; pancreatic promoter.] Tanizawa, Y., Matsutani, A., Chiu, K. C., Permutt, M. A. (1992) "Human glucokinase gene: isolation, structural characterization, and identification of a microsatellite repeat polymorphism" Mol. Endocrinol. 6, 1070-1081
Alternative transcripts/promoters: Comparative genomic map of glucokinase (GCK). [Figure labels: Promoter set 1; Promoter set 2; pancreatic promoter; hepatic promoter.] Data from ElDorado.
Alternative transcripts/promoters: Important facts to consider: Alternative promoter usage is often tied to the regulation of tissue specific gene expression. Alternative promoter usage is of very high biological relevance. There are several examples where aberrant regulation of the identical primary transcript leads to severe biological effects.
Alternative transcripts/promoters: Aromatase: a switch in promoter usage is associated with disease. [Figure: aromatase gene with alternative first exons (labeled 1.1, 1.4, 1.f, 1.6, 1.3), downstream exons labeled up to X, and the AATAAA signal; promoter usage in normal breast vs. breast cancer.] The gene product is absolutely identical. The only difference is in the alternative promoter usage. At the transcript level this can be seen only in the non-coding first exon.
Promoter analysis: The aim of in silico promoter analysis (summary)
1. Identification of the promoter sequence
2. Prediction of physical transcription factor binding sites
3. Functional context (context 1, context 2, context 3, … context n)
4. Context dependent functional transcription factor binding sites
ElDorado promoter sequence retrieval… Yes! I know all of this! I just wanted to know where I can get my promoter sequence(s) easily! If you don't have one already, sign up for a free evaluation account first... then log in here: www.genomatix.de
ElDorado promoter sequence retrieval
ElDorado promoter sequence retrieval: Choose the organism. Either enter the locus ID or the gene name here… an accession number… or the exact contig position… or choose a sequence file from your directory… or upload a file from your local disk… or copy & paste a raw sequence here. It can be cDNA or whatever you have. It will be mapped exactly to the genomes within seconds.
ElDorado promoter sequence retrieval: the three input sections
- Gene name or keyword: Input in this section delivers results based on gene name or keyword search (HMGCS1, for example). Over a million names, synonyms and gene IDs help you find what you want, fast!
- IMPORTANT! Affymetrix probe-set-ID input: Our annotation is NOT based on the Affymetrix NetAffx assignment! It is rather based on the genomic mapping of each single probe. A transcript will be retrieved if at least one probe of the set (usually 11 probes) matches. For mixed probe sets (cross-hybridisation), all relevant transcripts will be retrieved, which might lead to a result with transcripts from different loci.
- Sequence: Input in this section delivers results based on ultra fast sequence mapping. Copy and paste raw sequence data here (min. 15 nucleotides) or enter an accession number. In contrast to the entry of an accession number above, here the sequence is actually retrieved from the database and mapped onto the genome(s). NOTE: many EST based accession numbers have poor sequence homology and deliver no result.
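
The probe-set rule described above (a transcript is retrieved as soon as one probe of the set maps to it) can be sketched as a plain set union. The probe IDs, transcript IDs and locus names below are made up for illustration, and the functions are assumptions rather than an ElDorado interface.

```python
def transcripts_for_probe_set(probe_hits):
    """Union of all transcripts hit by at least one probe of the set.

    probe_hits maps a probe ID to the set of transcript IDs that the
    single probe maps to on the genome (empty set if it maps nowhere).
    """
    transcripts = set()
    for hits in probe_hits.values():
        transcripts |= hits
    return transcripts


def loci_for_transcripts(transcripts, locus_of):
    """Loci covered by the retrieved transcripts."""
    return {locus_of[t] for t in transcripts}


# Hypothetical mapping result for one probe set.
probe_hits = {
    "probe_01": {"T1"},
    "probe_02": {"T1", "T2"},
    "probe_03": set(),
    "probe_04": {"T9"},   # cross-hybridising probe
}
locus_of = {"T1": "locus_A", "T2": "locus_A", "T9": "locus_B"}

transcripts = transcripts_for_probe_set(probe_hits)
loci = loci_for_transcripts(transcripts, locus_of)
print(sorted(transcripts), sorted(loci))
if len(loci) > 1:
    print("Mixed probe set: transcripts from different loci were retrieved")
```

The single cross-hybridising probe is enough to pull in a transcript from a second locus, which is exactly the mixed-result case the slide warns about.
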
ElDorado promoter sequence retrieval: …here you can choose which chip's probes to see… licensed customers can add their own sequence data.
ElDorado promoter sequence retrieval: This gives you an interactive graphical representation of the genomic context of your gene.
ElDorado promoter sequence retrieval: Switch the display of components on and off. See the mapping positions of the Affymetrix single probes! Select regions of the retrieved graphics and save them into a file. Scale/slide the genomic "window". Here you can scale the view. Orange indicates your input, in this case a gene name. It is very informative when your query is based on sequence data: then you see the mapping positions. Everything is clickable, just play around!
ElDorado promoter sequence retrieval: Now we have zoomed into the promoter region. Clicking on this transcriptional start region (TSR)... displays this hyperlink to...
ElDorado promoter sequence retrieval: ...this profile of the different experimentally verified TSSs (CAGE tags) in the different tissue types.
ElDorado promoter sequence retrieval: This is a table-like representation of all annotated elements. It is especially useful for quick and easy retrieval of the DNA sequence(s) of interest.
ElDorado promoter sequence retrieval: Tick/un-tick the boxes of what you would like to see, and then...
ElDorado promoter sequence retrieval: This, for instance... tells you that this SNP deletes three potential TF binding sites and creates a new one. A potentially regulatory active SNP...
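
The kind of statement made above ("this SNP deletes sites and creates a new one") comes from comparing the predicted sites on the reference allele with those on the variant allele. Below is a minimal sketch of that comparison on one strand; the motif names, the IUPAC strings and the sequence are invented for illustration, and real tools use weight matrices rather than plain strings.

```python
import re

# Subset of IUPAC codes, expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "W": "[AT]", "S": "[CG]", "N": "[ACGT]",
}


def scan(sequence, motifs):
    """Return the set of (motif name, 0-based start) matches on the given strand."""
    found = set()
    for name, iupac in motifs.items():
        pattern = "(?=" + "".join(IUPAC[c] for c in iupac) + ")"
        for match in re.finditer(pattern, sequence):
            found.add((name, match.start()))
    return found


def snp_effect(reference, position, alt_base, motifs):
    """Return (lost sites, gained sites) for a single-base substitution."""
    variant = reference[:position] + alt_base + reference[position + 1:]
    before = scan(reference, motifs)
    after = scan(variant, motifs)
    return sorted(before - after), sorted(after - before)


# Hypothetical motifs and sequence; the SNP is C -> T at 0-based position 8.
motifs = {"SITE_A": "GGGRC", "SITE_B": "GATT"}
reference = "ACGTGGGACTTAGC"
lost, gained = snp_effect(reference, 8, "T", motifs)
print("lost:", lost)
print("gained:", gained)
```

As with any physical site, a lost or gained match only marks the SNP as a candidate; whether the change matters depends on the cellular and genomic context.
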
ElDorado promoter sequence retrieval: From here you can directly run a MatInspector analysis for this promoter... Again, play around with the interactive graphics... Click the symbols and jump right into MatBase, the TF knowledge base...
ElDorado promoter sequence retrieval: Now, finally, the first way to extract a promoter sequence... and/or any other element displayed in the list below. Choose your desired length. Unless you have good reason to change the length of the proximal promoter, leave the defaults!
ElDorado promoter sequence retrieval: This shows you all annotated alternative transcripts, plus all Affymetrix probe set single probe mappings, plus another way to extract your promoter sequence(s).
ElDorado promoter sequence retrieval: You know this already... Three different known transcripts for this locus... and four distinct promoters! How this comes about, I'll tell you in a minute.
ElDorado promoter sequence retrieval: Tick the promoter of your interest... choose the format... and extract the sequence. Or submit the promoter directly to MatInspector for graphical analysis. It works on a single sequence, too. Or submit the promoter sequences directly to one of those tasks. But they make sense only with multiple sequences. More on that later!
ElDorado promoter sequence retrieval: But why do I have four promoters here? And two don't even have a transcript assigned, as it is written here! And what's all that CompGen thing about? The multiple promoter thing I showed you before. Remember the GCK example, liver and pancreas? Now to the CompGen promoters. They are derived by a proprietary comparative genomics approach.
ElDorado promoter sequence retrieval
ElDorado promoter sequence retrieval: For our example we have a homologous locus assigned in chimp, macaca, human, rat, dog, cow, opossum, chicken, and zebrafish. The tick-boxes you know already... Note the Promoter Set number! We need them for later promoter retrieval. Exhaustive cross-mapping of all transcripts to all genomes of all organisms in ElDorado generates our homology groups.
ElDorado promoter sequence retrieval: Get a feeling for the degree of phylogenetic conservation of the respective promoter. See how much experimental evidence supports this promoter.
ElDorado promoter sequence retrieval: A Promoter Set represents phylogenetically conserved promoters. With these tick-boxes you can switch the display of the different Promoter Sets on and off. You should be familiar with this view by now. Here, orange indicates a promoter belonging to a Promoter Set.
ElDorado promoter sequence retrieval: Don't waste my time here! How do I get my promoter sequence now? And which one of all those promoters should I take? Well, which one? If you do not have any other information (experimental or from the literature), I would recommend that you consider all available alternative promoters for further analysis.
ElDorado promoter sequence retrieval: How do I get my promoter sequence now? I showed you two easy ways of promoter sequence retrieval by two mouse clicks some minutes ago. There are more... oh... you cannot access these options?
ElDorado promoter sequence retrieval: You should license GenomatixSuite with at least the 10-fold evaluation account upgrade. Otherwise it is slightly more cumbersome: use one of the options I showed you before to get the contig and positional information... and use that for sequence retrieval from your favorite system, e.g. NCBI. Hint: If you are interested in the TF results rather than the sequence, use the “search for common transcription factor binding sites” option as shown before.
From physical to functional TF site: Quite interesting… But I am not a single step closer to the answer to my real question: What are the potential TF sites involved in the regulation of my gene of interest? Well, I think you are. The essential first step is to analyze the right sequence at a length that allows for meaningful results. Now that you have the real promoter sequence(s), let's see how to go on from here...
From physical to functional TF site: The ideal situation for determining potential functional TF sites would be to have a set of genes apparently being co-regulated in a given cellular and experimental context, e.g. from a microarray experiment. A comparative promoter analysis with FrameWorker would very likely give you a pattern of involved TFs, as shown in numerous publications (see our web site at “About us -> Publications”). But I have only a single gene. And that's the one I am interested in! Then we have to look for additional evidence that some of the physical binding sites might be functional ones. Best would be to go for a ChromatinIP experiment. However, for that you would need some hints about which TF to make or buy antibodies for. Further computer analysis is required anyhow! There are three different roads to go...
From physical to functional TF site: the three roads
1. We talked about promoter modules before. Search your sequence for promoter modules with ModelInspector. Our Promoter Module Library contains over 550 promoter modules, each of them experimentally verified to carry transcriptional regulatory activity. A module match increases the probability that an involved TF site is functional.
2. Look for phylogenetically conserved patterns of TF sites in a comparative genomics promoter set with FrameWorker. TFs that are part of such phylogenetically conserved frameworks carry a higher probability of being functional.
3. Do extensive literature data mining with BiblioSpherePE for known TF correlations, pathway analysis and gene set creation for comparative promoter analysis. TFs showing biological activity in another experimental context are functional (at least in that context).
Okay, how do I do this? Let's go!
ElDorado promoter sequence retrieval: Let's start with an analysis for promoter modules...
Search for promoter modules: If you are licensed, you can have a quick look at the promoter module library. Each module is experimentally verified to carry regulatory activity.
Search for promoter modules: Choose a sequence file from your directory, or copy & paste a raw sequence here, or… you know the rest! Don't click anything below, unless you want to scan an entire database!
Search for promoter modules: Go for vertebrate modules... Click here! You can wait for the result…
Search for promoter modules
Search for promoter modules
Search for promoter modules: Now we have narrowed things down to 21 very interesting positions in this promoter, with modules that are composed of a total of 11 different transcription factor binding sites. Our arbitrarily chosen example HMGCS1 belongs to the cholesterol biosynthesis pathway. Some of the promoter modules found do have a proven function in sterol regulation! Wow! That's impressive! … But that example is a mock-up, isn't it? Not really. It is a nice example to show this approach. Very frequently one finds functionally related modules. However, there is no guarantee… It just adds another line of evidence.
Phylogenetically conserved frameworks: Okay, how does the other thing help? What did you call it, phylogenetically conserved frameworks? That's right. For this approach you first need a set of phylogenetically conserved promoters. Remember, several slides back?
ElDorado promoter sequence retrieval: Inspect and choose your Promoter Set... and tick the promoters of one set... scroll to the top of the page... In this example I choose Promoter Set 3 for human, rat, dog and cow.
Phylogenetically conserved frameworks: ...scroll down... From here you can have a look at the TF binding sites which are common to the input promoters. Great! That is what I really want to know: Which TF sites do they have in common?
Phylogenetically conserved frameworks: Great! That is what I really want to know: Which TF sites do they have in common? Be careful!! This is not more than a tiny hint! I can show you many cases where totally unrelated exons have more TF sites in common than closely co-regulated promoters. What you are really looking for is a conserved pattern of TF sites. And we are going to look for one. But first let's have a look at the nucleotide sequence level...
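
The difference between "common TF sites" and a "conserved pattern" is easy to see in code: the common-sites question is a plain set intersection that ignores order, distance and strand. The sketch below uses invented family names and is not the Genomatix implementation.

```python
def common_factors(promoter_sites):
    """TF families that occur at least once in every promoter.

    promoter_sites maps a promoter name to the list of TF family names
    predicted in it. Position, order and strand are deliberately ignored,
    which is why the result is only a weak hint.
    """
    sets = [set(names) for names in promoter_sites.values()]
    return set.intersection(*sets) if sets else set()


# Hypothetical predictions for three orthologous promoters.
promoter_sites = {
    "human": ["F1", "F2", "F3", "F5"],
    "rat": ["F2", "F1", "F4"],
    "dog": ["F1", "F2", "F2", "F6"],
}
print(sorted(common_factors(promoter_sites)))
```

Because physical sites are so dense, an intersection like this will rarely be empty even for unrelated sequences, which is the reason for the warning above.
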
Phylogenetically conserved frameworks: DiAlign TF gives an overlay of a true multiple sequence alignment (not pairwise) and common TF sites. Check DiAlign for other sequences (including amino acids)! It is extremely fast and especially powerful for finding short homologies in largely unrelated sequences.
Phylogenetically conserved frameworks: The parameters should be self-explanatory. You can always click for help.
Phylogenetically conserved frameworks: Here is an output example.
Phylogenetically conserved frameworks: Why did you do this? What does it tell me? It is pretty informative to get a feeling for the degree of homology, which parts are more conserved than others, and which TF binding sites reside in the homologous parts. Then, it is of interest to see where the evolutionary pressure was on functional conservation (TFBS) rather than on sequence conservation.
Phylogenetically conserved frameworks: Why did you do this? What does it tell me? Also, if you do a framework analysis on two highly homologous sequences, you run into a combinatorial explosion. FrameWorker checks for it and might give you a warning. However, in this case everything is fine...
Phylogenetically conserved frameworks: Now, we finally go to the FrameWorker analysis!
Phylogenetically conserved frameworks: Here you can choose the matrix library. Here you can select only for TFs known to be associated with certain tissues. This filter is a positive filter! Only TFs known to be associated with a tissue are listed here. A TF not being listed for a certain tissue does NOT mean that it is not expressed there! It just has not been reported yet.
Phylogenetically conserved frameworks: More options gives you... well, more options! Don't change those parameters unless you know exactly what you are doing!
Phylogenetically conserved frameworks: the FrameWorker parameters
- This decides the number of input sequences which have to show a common pattern of TF sites.
- If you know that a certain TF is involved in the regulation of your gene, make it a mandatory element and search only for frameworks containing it. Mandatory elements are most helpful in focusing your analysis. If you don't know one a priori, I'll show you later in BiblioSpherePE how to get to those. Toggle multiple choices by holding the "Ctrl" key when clicking!
- This sets the distance constraints between two adjacent TF sites. More important than the absolute distance is the distance variance. And always think about the HELP pages!
- One word on this parameter. It decides the minimum/maximum number of TF sites allowed in one framework. In this case I increased the default value from 6 up to 10, since we want to identify the largest conserved pattern in this phylogenetic promoter set. We might lower this later.
- Always start at the default values (unless you already know better) and relax them gradually if nothing meaningful is found.
- This option gives you an idea of the specificity of the frameworks found. It checks how often a framework would be found in a background of 5,000 random human promoter sequences. Use it with care! It slows down FrameWorker considerably!
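
To illustrate how the first and third parameters above interact (the number of sequences that must share the pattern, and the distance constraints with their allowed variance), here is a deliberately small sketch restricted to two-element frameworks. It is not the FrameWorker algorithm; the function, its parameter names and defaults, and the input data are all assumptions made for illustration.

```python
from collections import defaultdict


def conserved_pairs(promoters, min_dist=5, max_dist=200,
                    max_variance=10, min_sequences=None):
    """Find two-element frameworks shared by at least `min_sequences` promoters.

    promoters maps a promoter name to a list of (family, position, strand).
    A framework is an ordered, strand-specific pair of families whose
    distance lies in [min_dist, max_dist] and varies by at most
    `max_variance` bp between the supporting promoters.
    """
    if min_sequences is None:
        min_sequences = len(promoters)

    # framework key -> promoter name -> observed distances
    observed = defaultdict(lambda: defaultdict(list))
    for name, sites in promoters.items():
        for fam_a, pos_a, strand_a in sites:
            for fam_b, pos_b, strand_b in sites:
                distance = pos_b - pos_a
                if min_dist <= distance <= max_dist:
                    key = ((fam_a, strand_a), (fam_b, strand_b))
                    observed[key][name].append(distance)

    frameworks = {}
    for key, per_promoter in observed.items():
        # Try each observed distance as the lower edge of the allowed window.
        starts = sorted(d for ds in per_promoter.values() for d in ds)
        for start in starts:
            supporting = [
                name for name, ds in per_promoter.items()
                if any(start <= d <= start + max_variance for d in ds)
            ]
            if len(supporting) >= min_sequences:
                frameworks[key] = (start, start + max_variance, sorted(supporting))
                break
    return frameworks


# Hypothetical site predictions in three orthologous promoters.
promoters = {
    "human": [("F1", 100, "+"), ("F2", 130, "+"), ("F3", 180, "-")],
    "rat": [("F1", 90, "+"), ("F2", 124, "+"), ("F3", 260, "-")],
    "dog": [("F1", 110, "+"), ("F2", 138, "+"), ("F3", 150, "-")],
}
result = conserved_pairs(promoters)
for (first, second), (low, high, names) in result.items():
    print(first, "->", second, f"distance {low}-{high} bp in", names)
```

All three families occur in all three promoters, but only the F1/F2 pair keeps its spacing within the allowed variance, so only that pair is reported. Unlike the description above, the sketch compares every pair of sites rather than only adjacent ones, and it does not estimate specificity against a background of random promoters.
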
Phylogenetically conserved frameworks: All four promoters have 18 TF sites in common. This number might differ from the “search for common TF” job earlier, since now we take strand specificity into account. The longest frameworks contain 8 TF sites. There are 4 different frameworks. If you click the link, you jump directly to the graphical representation.
Phylogenetically conserved frameworks: Here you see the detailed description of the framework. It is perfectly conserved throughout the species. You can save this framework in your personal directory for subsequent sequence or database scans. Here you have a graphical representation. You already know how this works... Scroll down to the bottom of the page...
Phylogenetically conserved frameworks: At the bottom of the output you find this list. Now we have identified not only the TFs but also the exact positions which are worth a closer look. You can scan all of our promoter databases with your saved frameworks for promoters with a similar organization. Why should I do this? Would this give me additional information?
Phylogenetically conserved frameworks: Why should I do this? Would this give me additional information? In this example, with an 8 element framework and almost no distance variation between the TF sites, you will find exactly 1 match in over 56,000 human promoters: the input gene. How do you use this approach with less selective frameworks to identify similarly organized promoters? I'll show you later…
Knowledge based analysis: Fine! I think I have now seen two strategies. You mentioned three? Yes. The third is knowledge driven and is based on a combination of literature data mining, sequence analysis and pathway/network analysis. For this you first need to download and install the Java client of BiblioSpherePE.
Knowledge based analysis For a more detailed introduction to BiblioSphere PE please have a look at http://www.genomatix.de/products/BiblioSpherePE5.html
Knowledge based analysis Choose "single gene" here (HMGCS1)... un-tick this box... We are interested in the full network around our gene, not only the connected transcription factors.
Knowledge based analysis Here you have a list of all other genes connected to your input gene by at least one co-citation in the entire PubMed on abstract level. This sets the context sensitive filter stringency. The most stringent level including computer based semantic analysis is an ordered Gene1 - functional word - Gene2 level (B3). (B4) shows expert curated gene-gene relationships only. Expert knowledge is derived from different sources, like Genomatix experts, Molecular Connections' NetPro database, STKE, etc... Click around, and see what happens!
Knowledge based analysis This filters the co-citation frequency. I have intentionally chosen an example with no expert curation available, since I want to demonstrate how to generate new knowledge!
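The co-citation idea can be sketched in a few lines of Python. This is only an illustration and not the BiblioSphere implementation. It assumes every abstract has already been reduced to the set of gene symbols it mentions, and the three toy abstracts below are invented. The helper names cocitation_counts and filter_by_frequency are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, Set


def cocitation_counts(abstracts: Iterable[Set[str]], input_gene: str) -> Counter:
    """Count, for every other gene, in how many abstracts it is mentioned
    together with the input gene."""
    counts: Counter = Counter()
    for genes in abstracts:
        if input_gene in genes:
            for gene in genes:
                if gene != input_gene:
                    counts[gene] += 1
    return counts


def filter_by_frequency(counts: Counter, min_cocitations: int) -> Dict[str, int]:
    """Keep only genes co-cited at least min_cocitations times."""
    return {gene: n for gene, n in counts.items() if n >= min_cocitations}


# Invented toy abstracts, each reduced to the gene symbols it mentions.
toy_abstracts = [
    {"HMGCS1", "SREBF1"},
    {"HMGCS1", "SREBF1", "LDLR"},
    {"LDLR", "MVK"},
]
counts = cocitation_counts(toy_abstracts, "HMGCS1")
network = filter_by_frequency(counts, min_cocitations=2)
```

A threshold of 1 keeps every gene that shares at least one abstract with the input gene. Raising it thins out the network, in the spirit of the frequency filter on the slide.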
Knowledge based analysis Here you see the network around HMGCS1, with all other genes connected on GFG level.
Knowledge based analysis Here you see only the transcription factors connected on GFG level.
Knowledge based analysis Now all connected transcription factors.
Knowledge based analysis A connection line between two genes means that there is a bibliographic connection on abstract level (B0)...
Knowledge based analysis "Mouse over" and clicking gives you more information...
Knowledge based analysis The green indicates that there is a binding site for SREBF1 (V$SREB) in at least one of the promoters of HMGCS1.
Knowledge based analysis There is more encoded in the connection lines...
Knowledge based analysis The little symbols give you some information about the gene and its association with pathways.
Knowledge based analysis The tagged text tells us that the TF SREBF1 is involved in regulation of HMGCS1. Some more helpful options from this page...
Knowledge based analysis This you know already... You can get all info about any gene you click up there... over here...
Knowledge based analysis ...as well as this.
Knowledge based analysis Hey, hey! Stop it! I want to know about the regulation of my gene, not to play around with your Biblio... thing!
Knowledge based analysis BiblioSphere PathwayEdition! We already found TFs of interest, known to be involved in regulation of our gene. Now let's see the biological environment of our gene and find a group of related genes which might share some regulatory motifs. Let's go back and display all genes contained in this network...
Knowledge based analysis Let's load the GOFilter "biological process"...
Knowledge based analysis Here you see the tree for the selected filter. Expand or collapse by clicking on the +/-. Go to the table view by this tab...
Knowledge based analysis The Z-Score gives you a measure of whether certain categories are significantly over- or under-represented in the displayed gene set. Everything above 3 is statistically significant! Top scoring is sterol and cholesterol metabolism... Clicking here opens the tree on the left and highlights the category as well as the respective genes in the pathway view.
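The slides do not say how the Z-Score is computed. A common textbook choice, used here purely as an assumption, compares the observed number of genes in a category with the number expected under a hypergeometric model and divides by the standard deviation. All numbers in the example are hypothetical, and the function name go_z_score is made up for this sketch.

```python
import math


def go_z_score(hits: int, set_size: int,
               category_size: int, background_size: int) -> float:
    """Z-score of observing `hits` category members in a gene set of
    `set_size` genes, drawn from `background_size` annotated genes of
    which `category_size` belong to the category (hypergeometric model)."""
    p = category_size / background_size
    expected = set_size * p
    variance = (set_size * p * (1 - p)
                * (background_size - set_size) / (background_size - 1))
    return (hits - expected) / math.sqrt(variance)


# Hypothetical numbers: 6 of 8 genes fall into a category that holds
# 100 of 20,000 annotated genes.
z = go_z_score(hits=6, set_size=8, category_size=100, background_size=20000)
significant = z > 3   # the threshold quoted on the slide
```

A positive score means over-representation and a negative one under-representation. A gene set that contains exactly the expected number of category members scores 0.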
Knowledge based analysis This finally applies the filter to your gene set. Superimpose as many filters as you'd like!
Knowledge based analysis We see two TFs in here, SREBF1 and SREBF2, both Sterol Regulatory Element Binding Protein factors. Double-click on SREBF1 in order to see all connections to that TF. The "redraw" button.
Knowledge based analysis Another table view...
Knowledge based analysis ...the colors encode for... Highlight those genes with your mouse, and copy them...
Knowledge based analysis Now we have expanded our single input gene with a set of seven additional genes! And we already know quite a lot about them! They are all connected with my original gene in PubMed. All genes belong, with very high statistical significance, to the GO category "Cholesterol Metabolic Process". SREB transcription factors seem to play a role in the regulation of those genes. Now let's check whether the promoters of those genes share a complex framework. For this we first need to export those genes into GenomatixSuite's Gene2Promoter.
Back to sequence level Oh my god... more... Where do I find this now? Relax! It's easy and not far away...
Back to sequence level Paste here the gene symbols which we just copied in BiblioSphere PE: APOA1, LDLR, SREBF2, VLDLR, FDFT1, FDPS, MVK, HMGCS1. Don't forget this! Otherwise you will be asked for all findings in all organisms.
Back to sequence level Hey stop! Haven't I seen this before? You are right! It is pretty much the same display as the comparative genomics page which we generated earlier. The difference in this case is that we now compare promoters of different genes within one organism…
Back to sequence level Eight loci with 26 different unique promoters! 9,216 combinations possible for exhaustive analysis! Combinatorial explosion! We have to find a way to circumvent this. How should I know which ones? Since we are concentrating on SREB TF-sites, let's concentrate on those promoters which contain a V$SREB binding site. How do I do this? Very easy! Just scroll down to the bottom of the page...
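Where does a number like 9,216 come from? The slide does not give the promoter count per locus. Assuming that an exhaustive analysis picks one promoter per locus, the number of combinations is the product of the per-locus counts. The split below is a hypothetical one that happens to reproduce both figures from the slide (26 promoters, 9,216 combinations). The real distribution may differ.

```python
import math

# Hypothetical number of alternative promoters for each of the eight loci.
promoters_per_locus = [4, 4, 4, 4, 3, 3, 2, 2]

total_promoters = sum(promoters_per_locus)   # promoters across all loci
combinations = math.prod(promoters_per_locus)  # one promoter chosen per locus
```

Because the count is a product, removing even one promoter per locus shrinks the search space quickly, which is why the next step restricts the set to promoters with a V$SREB site.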
Back to sequence level Select the desired TF-matrix family here.
Back to sequence level ...and all relevant promoters are already checked for you. Now we have reduced the set to 12 different promoters from 8 different loci, each containing at least one SREB site.
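The selection step itself is simple to express in code. The sketch below is an illustration only. It assumes the site predictions are available as a table from promoter id to a list of (matrix family, position, strand) tuples. The promoter ids, positions and the family V$FAM_A are hypothetical.

```python
from typing import Dict, List, Tuple

Site = Tuple[str, int, str]   # (matrix family, position in bp, strand)


def promoters_with_family(site_table: Dict[str, List[Site]],
                          family: str) -> List[str]:
    """Return the ids of all promoters with at least one site of `family`."""
    return [pid for pid, sites in site_table.items()
            if any(fam == family for fam, _pos, _strand in sites)]


# Hypothetical toy table with three promoters from two loci.
toy_sites = {
    "HMGCS1_p1": [("V$SREB", 412, "+"), ("V$FAM_A", 450, "-")],
    "HMGCS1_p2": [("V$FAM_A", 120, "+")],
    "MVK_p1": [("V$SREB", 88, "-")],
}
selected = promoters_with_family(toy_sites, "V$SREB")
```

Only the promoters in `selected` would go into the framework search, which keeps the number of promoter combinations manageable.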
Back to sequence level Scroll to the bottom of the Gene2Promoter result page... We have done this before...
Back to sequence level You see? Now we have tolerable combinatorics and can perform an exhaustive promoter analysis.
Back to sequence level Remember? We have been here before, too... but now we choose V$SREB as a mandatory element for our framework. Hint: you can select multiple elements by holding the "Ctrl" key while clicking. With these parameters you have to play around a little bit. Start at default. Gradually relax stringency: go down in Quorum Constraint step by step, or allow for higher distance variance (e.g. 20, 30, 40, 50, etc.). The lower the distance variance and the more elements per model, the higher the resulting model selectivity.
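How the two parameters interact can be sketched as follows. This is a simplified reading, not the FrameWorker algorithm. It assumes that the quorum is the fraction of input promoters that must contain a pair of elements, and that the distance variance is the largest allowed spread (maximum minus minimum, in bp) of the distance between the two elements across those promoters. Promoter ids and distances are hypothetical.

```python
import math
from typing import Dict


def passes_constraints(distances_by_promoter: Dict[str, int],
                       n_promoters: int,
                       quorum: float,
                       max_variance: int) -> bool:
    """Check whether enough promoters share the element pair at a similar
    distance. `distances_by_promoter` holds the observed distance (bp)
    for every promoter that contains both elements."""
    needed = math.ceil(quorum * n_promoters)
    values = sorted(distances_by_promoter.values())
    best = 0
    start = 0
    # Sliding window over the sorted distances: largest group of promoters
    # whose distances differ by at most max_variance.
    for end in range(len(values)):
        while values[end] - values[start] > max_variance:
            start += 1
        best = max(best, end - start + 1)
    return best >= needed


# Hypothetical distances between two elements in 4 of 12 input promoters.
observed = {"p1": 30, "p2": 35, "p3": 90, "p4": 120}

strict = passes_constraints(observed, 12, quorum=0.3, max_variance=10)
wider = passes_constraints(observed, 12, quorum=0.3, max_variance=100)
lower_quorum = passes_constraints(observed, 12, quorum=0.1, max_variance=10)
```

In this toy case the strict setting finds nothing, while either relaxation (a larger distance variance or a lower quorum) lets the element pair through. That is the "start at default, then relax step by step" strategy from the slide.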
Back to sequence level For example, at a quorum of 30%, an allowed distance range of 5 to 200 bp, a distance variance of 50 bp and a maximum of 10 elements allowed, we find quite a lot of frameworks in the different promoter combinations. There are frameworks with 6 elements! This is quite significant and expected to be extremely selective. With 6 elements I expect to find only the 3 genes from which these models were derived: SREBF2, HMGCS1, and MVK. Tick the boxes of the models for subsequent database search for other promoters with similar organization.
Back to sequence level Scroll all the way down... This list is quite interesting! Here we have the different TF sites in this set of frameworks. This list represents those TFs which we should concentrate on when analyzing the regulation of the original input gene. It is pretty comparable to the list from our phylogenetic approach before. There is now good evidence that those factors play a role in regulation in the biological context of cholesterol metabolism. Now let's see how selective this model is...
Back to sequence level It is just one click away...
Back to sequence level This should look familiar to you! But now we are going for the database section... Unless you have a good reason not to, always go for the database of promoters of annotated genes. This allows for GO-group Z-scoring of the database hits later on...
Back to sequence level This is a termination parameter. If this number of hits is reached before the end of the database, the search is terminated. Careful! Some browsers crash with too many hits to display in HTML (>10,000)! A database search usually takes several minutes. In order to avoid a server time-out go for the e-mail option. You'll receive a mail with a direct link to your result file (it will be kept in your "Results Directory", too).
Back to sequence level Eight matches in four sequences, out of a total of 56,193 different promoters. Wow! Each model matches exactly once per sequence... The three genes of our "training set"...
Back to sequence level ...plus one additional "new" gene! This one was not in our input list and is identified only by common promoter organization!
Back to sequence level Those four genes now are extremely likely to share common regulation in the given biological context! The TFs in the framework now are the top candidates for further inspection.
Back to sequence level … STOP!! First I had too many matches in MatInspector, now there are too many slides!!
New Knowledge I am terribly sorry for that! However, eukaryotic transcriptional regulation is pretty complex. Our group of researchers has worked in this field for more than two decades. As you have seen, our tools, though pretty easy to use, require some explanation and sometimes a slightly different mindset, going beyond looking at single, isolated TF binding sites. I hope I was able to show you some basic strategies to follow. Nevertheless, let's have a final look at the additional gene which we have found with the database search in our example...
New Knowledge MMAB is a transferase involved in vitamin B12 activation and linked to a disease: methylmalonic aciduria.
New Knowledge Feeding all 4 genes from ModelInspector into BiblioSphere PE shows that they are all connected plus...
In our example, we started with a single gene, HMGCS1 (ElDorado), put it into biological context and concentrated on a potential regulator, SREB (BiblioSphere PE), identified common promoter organization, the TF-Framework (GEMS Launcher, FrameWorker), searched for additional genes with similar promoter organization (GEMS Launcher, ModelInspector) and put the genes back into biological context (BiblioSphere PE). Literature confirmed that we indeed found a co-regulated network and identified the molecular basis for it. This could NEVER be achieved by statistical analysis of isolated TFBS.
There is so much more in GenomatixSuite PE. I did not say a word about matrix generation, about direct experimental planning for knock-out/knock-in experiments with SequenceShaper, about expanding the hit list by shortening the framework, etc... Get in touch with us via support@genomatix.de and we will give you a tour through the entire system at a web-meeting. Some informative links:
http://www.genomatix.de/company/publications1.html
http://www.genomatix.de/training/tasks.html
http://www.genomatix.de/download4.html
http://www.genomatix.de/cgi-bin/UMapps/register.pl