NMR Spectroscopy and Protein Structures
Chem 991A Special Topics in Physical Chemistry
COURSE OUTLINE
Lectures: MWF 10:30 am-11:20 am, Rm 733 Hamilton Hall
Class Projects & Exams: Thur. 6:00-8:00 pm, Rm 733 Hamilton Hall
Instructor: Dr. Robert Powers
Office Address: 722 HaH
Phone: 472-3039
e-mail: rpowers [email protected] edu
web page: http://bionmr.unl.edu/
Labs: 721 HaH
Phone: 472-5316
Office Hours: 11:30 am-12:30 pm MWF or by Special Appointment
Required Text: J. N. S. Evans, Biomolecular NMR Spectroscopy, Oxford University Press
Recommended Text: M. H. Levitt, Spin Dynamics – Basics of Nuclear Magnetic Resonance, Wiley

Course Outline (cont.)
Some Other Recommended Resources
• “NMR of Proteins and Nucleic Acids”, Kurt Wüthrich
• “Protein NMR Spectroscopy: Principles and Practice”, John Cavanagh, Arthur Palmer, Nicholas J. Skelton, Wayne Fairbrother
• “Principles of Protein Structure”, G. E. Schulz & R. H. Schirmer
• “Introduction to Protein Structure”, C. Branden & J. Tooze
• “Enzymes: A Practical Introduction to Structure, Mechanism, and Data Analysis”, R. Copeland
• “Biophysical Chemistry”, Parts I to III, C. Cantor & P. Schimmel
• “Principles of Nucleic Acid Structure”, W. Saenger

Course Outline (cont.)
Some Important Web Sites:
• RCSB Protein Data Bank (PDB): http://www.rcsb.org/pdb/
  Database of NMR & X-ray Structures
• BMRB (BioMagResBank): http://www.bmrb.wisc.edu/
  Database of NMR resonance assignments
• CATH Protein Structure Classification: http://www.cathdb.info/
  Classification of All Proteins in PDB
• SCOP: Structural Classification of Proteins: http://scop.berkeley.edu
  Classification of All Structures into Families, Super Families etc.
• PDBeFold: http://www.ebi.ac.uk/msd-srv/ssm/
  Compares 3D-Structures of Proteins to Determine Structural Similarities of New Structures
• NMR Information Server: http://www.spincore.com/nmrinfo/
  NMR Groups, News, Links, Conferences, Jobs
• NMR Knowledge Base: http://www.spectroscopynow.com/
  A lot of useful NMR links

Course Outline (cont.)
Course Work:
  Oral Reports (2):        100 pts   (variable due dates)
  Ubiquitin Assignment:    100 pts   (due Dec. 13)
  Problem Set:             100 pts   (due Dec. 13)
  Exam 1:                  100 pts   (Thur., Oct. 3)
  Exam 2:                  100 pts   (Thur., Nov. 7)
  Final Exam:              200 pts   (Fri., Dec. 20, 10 am-12 pm)
  Total:                   700 pts
Answer keys for the problem sets and exams will be posted on Blackboard.
Grading scale: A+=95%; A=90%; A-=85%; B+=80%; B=75%; B-=70%; C+=65%; C=60%; C-=55%; D=50%; D-=45%; F=40%

Course Outline (cont.)
Class Participation
• Reading assignments should be completed prior to each lecture. The required text will only supplement the lecture material. The vast majority of the material for the class will come from the lectures.
• You are expected to participate in ALL classroom discussions
Exams
• All exams (except the final) will take place at 6 pm in Hamilton Hall Rm. 733 on the scheduled date.
• The length of each exam (except the final) will be open-ended. You will have as much time as needed to complete the exam.
• Bring a TI-89 style calculator or a simpler model, and an approved translator if required.
• A review session will take place during the normal class time prior to each exam.
• ALWAYS SHOW ALL WORK!

Lecture Topics (Tentative Schedule)
Date                 Topic
I. Overview of Protein Structures
Aug 26               Introduction
Aug 28               Linux and Awk
Aug 30               Protein Structures from an NMR Perspective
Sept 4, Sept 6, Sept 9, Sept 11, Sept 13, Sept 16, Sept 18, Sept 20, Sept 23
Sept 25              Protein Modeling Software
Sept 27, Sept 30, Oct 2
Oct 3                EXAM 1
Oct 4                Molecular Mechanics and Dynamics
Oct 7
Oct 9                Comparison of X-ray and NMR Structures
Oct 11
Oct 14               Isotope Labeling of Proteins
Oct 16
II. NMR Assignment Problem
Oct 18               NMR Software
Oct 21 to Oct 22     Fall Break
Chapter (in listed order): 4; 3.9; 3.5-3.9; 4.2.2 – 4.2.3; 2; 3.9

Lecture Topics (continued)
Date                 Topic
Oct 23
Oct 25               2D NMR
Oct 28
Oct 30               3D NMR
Nov 1                4D NMR
III. NMR Structure Determination
Nov 4                NOEs
Nov 6
Nov 7                EXAM 2
Nov 8
Nov 11               Chemical shifts, Coupling constants, Amide Exchanges
Nov 13
Nov 15               Stereospecific assignments, RDCs
Nov 18               Quality of NMR Structures
Nov 20
IV. Protein Dynamics
Nov 22               T1, T2, NOE & S2
Nov 25
Nov 27 to Nov 29     Thanksgiving
V. Protein-Ligand Structures
Dec 2                SAR by NMR, Other 1D and 2D Methods
Dec 4                Transfer NOE
Dec 6                Filtered & edited NMR experiments
Dec 9                Metabolomics
Dec 11
Dec 13               Problem Set & Ubiquitin Assignment due
Dec 20               FINAL EXAM
Chapter (in listed order): 2.1; 2.2; 2.3; 3; 3.1; 4.1.4, 3.2, 4.1.3, 5.2; 4.1.2; 3.10; 1.3, 1.4, 6.3; 6.5; 6.7

ORAL PRESENTATION OF STRUCTURE PAPERS
– Two 20 minute Oral Presentations
  • Thursday Evenings at 6 pm in HaH 733
  • Audience Participation is Expected (like a journal club)
  • Presentation Dates Randomly Assigned (see syllabus page 4)
  • 50 points per presentation – total of 100 points
– Paper of Your Choice
  • A Protein Structure Should be a Major Focus of the Paper
  • The Paper Topic Should be of General Interest and of Significant Impact
  • Send an Electronic Copy of the Paper to the Class Prior to Your Presentation
– Some Recommended Sources
  • Nature Structural Biology, Science, Nature, Cell, Molecular Cell, Structure, Protein Science, PNAS, Journal of Molecular Biology, Biochemistry, and Journal of Biomolecular NMR.
  • The paper may cover a protein structure or a protein complex (small molecule, protein, DNA, RNA, etc.).

ORAL PRESENTATION OF STRUCTURE PAPERS
– Presentation Goal
  • Present a Clear Understanding of the Goals and Findings of the Paper to the Class
  • Why was the particular protein the target of the paper?
  • How was the structure determined? Were there any challenging issues?
  • What structure was determined for the protein (fold?)
  • What are some interesting features of the structure (dynamics)?
  • Are there any unique structural differences compared to other members of the family?
  • What structural features are important to function?
  • How was the structure used to support or refute the biological focus of the paper?
  • Does the structure actually support the conclusion, or did the authors overinterpret the data?
  • Do the data/structure suggest other equally plausible conclusions?

ORAL PRESENTATION OF STRUCTURE PAPERS
– Grading
  • Combination of My Assessment and the Other Students’ Assessment
  • Each Student will be Limited to Giving Approximately 30% As, 55% Bs, and 15% Cs
  • Default Grade is a B; an A or C will Require Justification
  • All the assessments will be averaged together to determine the number of points
  • Average Assessed Grade: A: 50 pts, B+: 45 pts, B: 40 pts, B-: 35 pts, C+: 30 pts, C: 25 pts
– Assessing the Presenter
  • How well did the presenter understand the material?
  • How clearly did the presenter discuss the material?
  • Was the chosen paper of general interest and biologically significant?
  • Was the structure relevant and important to the paper?
  • How well did the presenter answer questions?
  • Did the paper lead to an interesting discussion?

ORAL PRESENTATION OF STRUCTURE PAPERS
Tentative Oral Presentation Schedule
9/19 9/26 9/5 9/12 Jonathan Catazaro Jeffrey Jeppson Mark Carter Jessica Periago Bradley Worley Shulei Lei 10/10 Teklab Gebregiworgis Darrell Marshall Jonathan Catazaro Jeffrey Jeppson 10/17 11/14 11/21 12/5 Bradley Worley Shulei Lei Teklab Gebregiworgis Darrell Marshall 12/12 10/24 Mark Carter Jessica Periago

Course Assignments
– Two Separate Graded Assignments
  • A standard problem set – included at the end of the syllabus
  • An NMR assignment problem
  • Due date for both assignments is the beginning of class on Fri. Dec. 13
  • Late Problem Sets will NOT be accepted
– Grading – General
  • Each Assignment is worth 100 pts. (200 pts. total)
  • Show ALL work to receive full credit
  • You must submit your own set of answers
– Some Additional Considerations
  • Please start both assignments NOW!
  • Please work together
  • Please visit my office hours for assistance

Course Assignments
– The Standard Problem Set Has Two Sections
  • Writing simple AWK programs to manipulate files
  • Using Xplor and other software to analyze protein structures
  • Due date for both assignments is the beginning of class on Fri. Dec. 13
  • Late Problem Sets will NOT be accepted
– Grading – Standard Problem Set
  • No unique answer for the programming section: either it works or it doesn’t
  • E-mail me your scripts and I will run them
  • If it works, full credit; if not, zero points
  • The analysis of the protein structures section will have defined answers
  • Please submit the answers to the protein structure section on the due date

Course Assignments
– NMR Assignment Problem Set
  • Determine the backbone NMR Assignments for Ubiquitin
    Sequence:
     1-40:  MQIFVKTLTG KTITLEVEPS DTIENVKAKI QDKEGIPPDQ
    41-76:  QRLIFAGKQL EDGRTLSDYN IQKESTLHLV LRLRGG
  • The completed project should include a cover page that summarizes your assignments using the following template:

Course Assignments
– NMR Assignment Problem Set
  • You will ALL have access to a standard dataset of NMR spectra:
  • 2D 1H-15N HSQC, 2D 1H-13C HSQC, 3D HNCO, 3D HNCA, 3D CBCANH, and 3D CBCACONH
  • Data will be available on the computers in the Research Instrument NMR Facility (HaH 832)
  • All the necessary software for processing and analyzing the data will also be available on these computers
– Goal
  • Assign the minimal set of backbone resonances (HN, 15N, 13CO, Cα, Cβ)
  • Provide practical experience with using NMR data to assign a protein
  • Complete as much of the backbone assignments as possible
– Grading – NMR Assignment Problem Set
  • Based on how complete the assignments are
  • Scaled based on overall success of the class

Introduction to Linux/Unix
Linux: A UNIX-like operating system developed as free and open source software
User interface is a traditional and cumbersome command line in a shell (window)
“Linux is for Adults” – Stephan Grzesiek
There are a number of flavors (distributions) of Linux with different graphical user interfaces (GUI) or desktop interfaces (which attempt to be Mac or Windows-like): Debian, Fedora, Ubuntu, Mageia, Mint Linux, etc.
Similarly, there are a number of PC look-alike software programs (free & commercial) (WORD, EXCEL, etc.).
It was initially thought that Linux would replace the Windows PC
Very popular in academia because it is free and open for development

Introduction to Linux/Unix
Typical Linux Shell Environment: simple “command line” execution of programs or editing of files
Typical Linux “Windows” Environment: mimics PC/Mac desktop GUI environment

Introduction to Linux/Unix
Connecting from a PC by a Terminal Emulation Software (PuTTY): command line environment
Connecting from a PC by Samba: PC/Mac folder environment

Introduction to Linux/Unix
– Graphical User Interface (GUI) or PC/MAC Desktop Environment
  • You can use the Desktop like a PC, but it can be cumbersome
    Ø Minimal (if any) standards; everything in the environment needs to be configured
    Ø Downside of open-source (free) software – many contributors with little to no management
– More common to work in a shell using the command line
  • Primitive (“Old School”)
    Ø Minimal mouse functions, pull-down menus or other common features we are accustomed to
  • Need to memorize commands and options (“flags”)
  • Need to open a Terminal, Window or Shell
    Ø Right click mouse and select “open terminal”

Introduction to Linux/Unix
– Three Common Linux Commands: pwd, ls and cd
  • pwd – identifies the current path or directory
  • ls – lists the files and folders in the current directory
  • cd path – move to the defined path (change directory)
    ‒ cd .. (move up one directory)
    ‒ cd ../.. (move up two directories)
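The three commands can be tried safely in a throwaway directory. The sketch below is only an illustration; the folder names project and data are hypothetical, and mktemp is assumed to be available to create the scratch directory.

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
mkdir -p "$workdir/project/data"   # hypothetical folder names
cd "$workdir/project/data"
pwd                                # prints the full path of the current directory
cd ..                              # move up one directory, into project
listing=$(ls)                      # ls lists what project contains
echo "$listing"
cd ../..                           # move up two directories
pwd
```

After cd .. the listing contains only the data folder, because that is the only entry created inside project.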

Introduction to Linux/Unix
‒ For a Complete List of Linux Commands and Explanations see
  • http://linuxcommand.org/
  • Or the book “Linux in a Nutshell”
‒ Some Other Common Commands
  • echo “text” – display or print text
  • exit – close a terminal
  • clear – clear all text in a terminal
  • mkdir – make a new directory
  • rm – remove/delete file
  • mv – moves files
  • cp – copies files
  • ps – lists all active user programs and displays a PID (process identification number)
  • kill pid – will kill (stop) the process with the listed pid number
  • man command – will display the manual for the listed command
  • cat file – display the contents of a file (also used to combine or concatenate multiple files)
  • vi file – will open file with a primitive text editor
  • chmod flags file – will change or set permissions for file as defined by flags

Introduction to Linux/Unix
‒ It Gets More Complicated!
‒ A number of commands have a range of options that are implemented on the command line with a “flag”
  • ls -l – lists files and folders with associated permissions
  • rm -R – remove/delete folder
  • mv -i – prompt before overwriting an existing file with the same name
  • cp -n – do not overwrite an existing file with the same name
  • cp -u – only overwrite an older file with the same name
  • ps -axu – lists the detailed status of every process on the system with the name of the user
  • chmod 755 file – change file’s permission such that the file's owner may read, write, and execute the file. All others may only read and execute the file.
‒ Multiple flags can be used simultaneously
  • Again, man pages, Linux web sites and reference books provide more details
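A few of these flags can be checked in a scratch directory. This is a minimal sketch under assumed names: run.sh and results are hypothetical, and cut is used only to isolate the permission column of the ls -l output.

```shell
workdir=$(mktemp -d)
cd "$workdir"
mkdir results                        # hypothetical directory name
printf '#!/bin/sh\necho hello\n' > run.sh
chmod 755 run.sh                     # owner: rwx, group and others: r-x
perms=$(ls -l run.sh | cut -c1-10)   # first 10 characters: file type + permissions
echo "$perms"                        # -rwxr-xr-x
rm -R results                        # -R removes the directory and its contents
```

The leading "-" in the permission string marks a regular file; a directory would show "d" in that position.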

Introduction to Linux/Unix
‒ One More Very Useful Command
‒ sort
  • Quickly re-order or sort the rows of a tabular file with n number of columns
    sort -rn -k N filename > newfilename
    - -k N – the number of the column that will be sorted
    - r – sort in reverse order
    - n – sort based on numeric value of the string
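As an illustration, the sketch below sorts a small table by its second column, largest value first. It assumes the standard -k option is used to pick the column (-k2,2 restricts the key to column 2 only); the file names and the three-row peak table are made up for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir"
# Hypothetical three-column table: peak ID, NH shift, N15 shift
printf '1 9.35 126.75\n2 9.10 126.69\n3 9.73 126.68\n' > peaks.txt
# LC_ALL=C keeps "." as the decimal point regardless of the system locale
LC_ALL=C sort -rn -k2,2 peaks.txt > peaks_sorted.txt
first=$(head -n 1 peaks_sorted.txt)
cat peaks_sorted.txt
```

The row with the largest NH shift (9.73) ends up first and the smallest (9.10) last; the original peaks.txt is left unchanged because the output is redirected to a new file.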

Permissions
‒ You can’t read, write, edit or execute a file without permission!
Labels on the annotated directory listing: Directory; File Owner; Number of files in Directory; Size of file in kilobytes; Group Owner belongs to; Filename; File Date or Time Stamp

Permissions
‒ Reading and understanding permissions

Permissions
‒ Where did the 755 come from in the chmod command?
Think of the permission settings as a series of bits:
  rwx rwx rwx = 111 111 111
  rw- rw- rw- = 110 110 110
  rwx --- --- = 111 000 000
  and so on...
  rwx = 111 in binary = 7
  rw- = 110 in binary = 6
  r-x = 101 in binary = 5
  r-- = 100 in binary = 4
  -wx = 011 in binary = 3
  -w- = 010 in binary = 2
  --x = 001 in binary = 1
  --- = 000 in binary = 0
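The bit arithmetic can be sketched as a small shell function that adds 4 for r, 2 for w and 1 for x. The function name triplet_to_digit is hypothetical and exists only for this illustration.

```shell
# Convert one permission triplet (e.g. "r-x") to its octal digit.
triplet_to_digit() {
    digit=0
    case "$1" in r??) digit=$((digit + 4)) ;; esac   # read bit    = 100 = 4
    case "$1" in ?w?) digit=$((digit + 2)) ;; esac   # write bit   = 010 = 2
    case "$1" in ??x) digit=$((digit + 1)) ;; esac   # execute bit = 001 = 1
    echo "$digit"
}

# Owner rwx, group r-x, others r-x
mode="$(triplet_to_digit rwx)$(triplet_to_digit r-x)$(triplet_to_digit r-x)"
echo "$mode"   # 755
```

Reading the three digits from left to right gives the owner, group and everyone-else permissions, which is why chmod 755 lets only the owner write to the file.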

Pipes and Redirection
| (pipe) – passes the output of one Linux command to the input of a second command
  • Example: ls | wc (wc – counts the number of characters, words and lines)
  • Not limited to just one pipe; multiple pipes can be strung together
>, < – redirection of files
  • command > filename – output of command (or program) is sent to a file called filename instead of being displayed on the screen
    Ø Example: ls > file_list
  • command < filename – the filename is the input to the command or program
    Ø Example: xplor < psf.inp
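A short sketch of both ideas, using made-up file names (a.pdb, b.pdb, c.txt) in a scratch directory. grep and tr are used here only to filter the listing and to strip the padding some versions of wc add.

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch a.pdb b.pdb c.txt                              # hypothetical file names
# Three commands chained with pipes: list, keep names ending in pdb, count lines
npdb=$(ls | grep 'pdb$' | wc -l | tr -d ' ')
ls > file_list                                       # output goes to file_list, not the screen
# file_list is created before ls runs, so it appears in its own listing
nfiles=$(wc -l < file_list | tr -d ' ')              # file_list is the input to wc
echo "$npdb pdb files, $nfiles entries in file_list"
```

The second count is 4 rather than 3 because the shell creates file_list before ls starts, so the listing includes the output file itself.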

Background Calculations
‒ For long calculations, you don’t want the process directly associated with the window or shell
  • Window must remain open and active during the calculation
  • Window is “locked” until the program is finished
  • Calculations will be stopped if the window is closed
  • An intense calculation can overwhelm the shell environment, crashing the window or even slowing down your computer
  • Output displayed in the window can be lost, lock the window or crash the computer
‒ Instead, submit your “job” to the “background”
  • Lowers the calculation’s priority to access the CPU
  • Any interactive calculation has the highest priority
  • Example:
      background: xplor < psf.inp > psf.out &
      interactive: xplor < psf.inp
‒ Use the ps command to monitor the status of background jobs
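The pattern can be tried without xplor. In this sketch a one-second sleep stands in for the long calculation, and job.out is a hypothetical output file; kill -0 is used as a lightweight check that the process exists (ps -p with the same PID gives a fuller status line).

```shell
workdir=$(mktemp -d)
cd "$workdir"
# "sleep 1; echo finished" stands in for a long calculation such as xplor.
( sleep 1; echo "finished" ) > job.out &     # trailing & sends the job to the background
job_pid=$!                                   # PID of the background job
kill -0 "$job_pid" && echo "job $job_pid is still running"
wait "$job_pid"                              # block until the background job ends
result=$(cat job.out)
echo "$result"
```

The prompt returns immediately after the line ending in &, which is what keeps the window usable while the calculation runs.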

vi – Primitive Text Editor
‒ Opens any text-based file for reading, editing and writing
  • Only simple text or ASCII files can be edited with vi
  • You will see gibberish with *.doc, *.pdf, etc.
‒ Like Linux, vi uses a number of simple command line functions
  • A number of the functions require a key combination (ctrl key + another key)
  • For a Complete List of Vi Commands and Explanations see
    Ø “The Vi Lovers Home Page” http://thomer.com/vi/vi.html
    Ø Or “Learning the vi Editor” by L. Lamb, O’Reilly & Associates, Inc
‒ vi filename
  • If filename exists, vi will open the file for editing
  • If filename doesn’t exist, vi will create the file for editing

vi – Primitive Text Editor
Labels on the annotated vi screen: Cursor; What part of the text is shown (All, Top, Bot, or Percentage); Editing Mode; Line number; Column number; Cursor Location

vi – Primitive Text Editor
‒ Working with files
  • :q – quits only if no changes to the file have been made
  • :q! – force vi to quit without saving any changes
  • :wq filename – quits and writes the contents of the file to a new file named filename
  • :wq! – quits and writes the file to the current filename
  • :r filename – inserts the contents of filename into the current file at the cursor location
‒ Moving around the file
  • :number – jumps to the specified line number in the text
  • G or :$ – jump to last line
  • Ctrl-g – gives current line number
  • Ctrl-f or Ctrl-d – move forward
  • Ctrl-b or Ctrl-u – move back
  • Arrow Keys – allow you to move around the file and position the cursor

vi – Primitive Text Editor
‒ Adding to a file
  • Enter Key – adds a blank line at the cursor position
  • Esc key – exits or leaves the active vi function
  • i or a – enters insert mode, allows text to be typed into the file at the location of the cursor
  • R – enters replace mode, allows text to be typed into the file at the location of the cursor, replacing any existing text
‒ Deleting
  • dd – deletes the line at the position of the cursor
  • dw – deletes the word at the position of the cursor
  • x – deletes the character under the cursor
  • r – replace the character under the cursor
  • D – deletes from the cursor position to the end of the line
  • u – undo the last edit or change
  • U – undo all the edits on a single line
  • Place a number in front of a command and the command will be executed that many times

vi – Primitive Text Editor
‒ Copying and Pasting Text
  • number yy – yanks (copies) the specified number of lines (starting at the cursor)
  • p – puts (pastes) the previously yanked (copied) lines in the text after the cursor
  • J – joins two lines at the position of the cursor
‒ Global Search and Replace
  • /text – moves the cursor to the next location of text in the file
  • n – moves to the next occurrence of text in the file
  • :%s/search_string/replacement_string/g – globally replace search_string with replacement_string

Awk/Nawk – Primitive (but Powerful) Programming Language
‒ Interpreted (not compiled) language
  • C-like
  • A file containing the software code needs to be passed to Awk
    awk -f awk_script.awk infilename > outfilename
    - awk_script.awk – the Awk program
    - infilename – the file used by the Awk program
    - outfilename – the output generated by the Awk program
‒ Awk significantly simplifies writing a quick program
  • Automatically handles opening and reading files and inputting data into standard variables
  • Structured to read a file composed of rows and columns
  • IMPORTANT – sequentially reads each row as it executes the program
    Ø If 10 rows, the program gets executed 10 times – major source of confusion
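The once-per-row behavior is easiest to see with a tiny input. In this sketch the file names rows.awk, infile.txt and outfile.txt are hypothetical, and the script is written to disk with a here-document so that it can be passed with -f.

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf 'A 1\nB 2\nC 3\n' > infile.txt        # hypothetical 3-row input file
cat > rows.awk <<'EOF'
# This block has no BEGIN or END label, so it runs once for every row read.
{ print "row", NR, "first column is", $1 }
EOF
awk -f rows.awk infile.txt > outfile.txt
nlines=$(wc -l < outfile.txt | tr -d ' ')
cat outfile.txt
```

Three input rows produce three output lines, even though the print statement appears only once in the script.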

Awk Program Structure
Diagram elements: BEGIN; INPUT; logic statements (if, and, or, not); arithmetic; looping; table arrays; printing; END; OUTPUT

BEGIN/END
• All of the commands in the section defined by BEGIN occur BEFORE the file is read
• All of the commands in the section defined by END occur AFTER the file is read
• To comment out a line of text from a script, add “#” before the text
  - Line is skipped by Awk
    # This script politely introduces itself
    BEGIN { print "Hello, world" }
    {
        # Main – does nothing, but still reads file
    }
    END { print "Bye, world" }

BEGIN/END
• The BEGIN section is commonly used to set or define the value of variables used by the MAIN program
• Also, to read input data or information from other files
• The END section is commonly used to print out the results of the Awk program
    BEGIN {
        CAmax[0] = "65.52"
        CAmin[0] = "43.00"
        CBmax[0] = "38.70"
        CBmin[0] = "0.00"
        Res[0] = "A"
        i = 1
        while ((getline < "ref.pck") > 0) {
            CAref[i] = $1
            CBref[i] = $2
            i++
        }
    }
    {
        # Main – does nothing, but still reads file
    }
    END {
        for (i = 1; i <= NR; i++) {
            print CA[i], CB[i]
        }
    }
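Reading a second file inside BEGIN can be sketched as follows. The contents written to ref.pck here are invented for the example, and the variables line and f are hypothetical helpers; the parentheses around the getline expression make sure its return value (1 per line read, 0 at end of file, -1 on error) is what gets compared with 0.

```shell
workdir=$(mktemp -d)
cd "$workdir"
printf '65.52 38.70\n43.00 0.00\n' > ref.pck      # made-up reference values
out=$(awk 'BEGIN {
    i = 1
    while ((getline line < "ref.pck") > 0) {      # one pass per line of ref.pck
        split(line, f, " ")                       # f[1], f[2] hold the two columns
        CAref[i] = f[1]
        CBref[i] = f[2]
        i++
    }
    print i - 1, CAref[1], CBref[2]               # lines read, then two stored values
}')
echo "$out"
```

Because everything happens in BEGIN, no main input file is needed on the command line; if ref.pck were missing, getline would return -1 and the loop would simply be skipped.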

MAIN
• The various functions of Awk perform the tasks you want as the program sequentially reads the input file
Consider the following input file:
  PkID   NH     N15      CA      CB      CAi     CBi
  1.00   9.35   126.75   53.19   40.06   63.53   69.87
  2.00   9.10   126.69   59.42   31.90   52.92   43.03
  3.00   9.73   126.68   60.73   38.11   54.64   31.38
  4.00   7.80   126.57   57.28   33.99   56.10   30.75
  5.00   8.84   126.52   58.35   28.58   53.25   40.03
  6.00   8.14   125.85   65.85   31.89   53.03   41.07
  7.00   9.01   125.35   62.57   42.15   52.70   41.84
  8.00   8.15   125.24   54.86   40.69   55.79   30.35
  $1     $2     $3       $4      $5      $6      $7
  COi ($8), values as listed: 172.90 174.94 171.92 172.60 173.12 171.99 171.17
Awk sequentially reads each row, redefining the value of each standard variable ($1 to $8)
  - NF is set to the number of fields (columns), 8 in this example
  - NR is set to the number of the row currently being read; it reaches 9 after the last row in this example (the header row plus 8 data rows)
  - $0 is a string corresponding to the entire row
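How these built-in variables change from row to row can be seen on a cut-down version of the table. The sketch below uses only three columns and two data rows (taken from the first rows above) so the output stays short.

```shell
out=$(printf 'PkID NH N15\n1.00 9.35 126.75\n2.00 9.10 126.69\n' |
    awk '{ print NR, NF, $2 }            # row number, field count, second column
         END { print "rows read:", NR }  # in END, NR holds the total row count
')
echo "$out"
last=$(echo "$out" | tail -n 1)
```

The header counts as row 1, so the two data rows are rows 2 and 3 and the END block reports three rows read.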

MAIN
• The primary Awk functions can be grouped into 5 categories
  – Logic statements
  – Arithmetic
  – Looping
  – Arrays
  – Printing

MAIN
• As the file is being read in, you can now write instructions to test, change or manipulate the original data
• You can define your own variable names
• You can do any number of arithmetic functions
  – Basic math: +, -, *, /, ^
  – General functions: cos(x), exp(x), sqrt(x), etc.
    BEGIN {
        CAmax[0] = "65.52"
        CAmin[0] = "43.00"
        CBmax[0] = "38.70"
        CBmin[0] = "0.00"
        Res[0] = "A"
    }
    {
        PkID = $1
        NH = $2
        N15 = $3
        CAiatom = $4
        CBiatom = $5
        CAatom = $6
        CBatom = $7
        COi = $8
        CAup = sqrt(CAmax[0] - CAiatom)
        CBdn = CBiatom/CBmin[0]    # with CBmin[0] = 0.00 as set above, awk stops with a division-by-zero error; a nonzero value is needed
        CO2 = COi^2
    }

Functions
• Logic statements
  – if (logical test of a parameter/variable)
    – Probably the most important logic command
    – General call structure is
      • if (statement to test) {action}
      • Example: if ($1 == "HAPPY")
        – Reads “if column 1 equals HAPPY”
        – If this is true then we do something
  – else
    ‒ Used to perform an action when the if statement is false
    ‒ else {action}
    ‒ Example
        BEGIN {
            $1 = "HAPPY"
            if ($1 == "HAPPY")
                print "I am HAPPY"
            else
                print "I am SAD"
        }

Functions
• Logic statements
  – ! (not) true if not a match
    • Example: if ($1 != "HAPPY")
    • True if $1 is NOT EQUAL to “HAPPY”
  – && (and) true only if both conditions are met
    • Example: if ($1 > $2 && $1 > $3)
    • True if $1 is larger than BOTH $2 and $3
  – || (or) true if one of multiple conditions is met
    • Example: if ($1 > $2 || $1 > $3)
    • $1 only needs to be larger than either $2 or $3 for the statement to be true
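The operators can be combined in a single test chain. In this sketch the shell function classify is a hypothetical wrapper that feeds three numbers to awk as $1, $2 and $3 and reports how the first compares with the other two.

```shell
# Hypothetical helper: pass three numbers to awk as one row.
classify() {
    echo "$1 $2 $3" | awk '{
        if ($1 > $2 && $1 > $3)         # and: larger than BOTH
            print "largest"
        else if ($1 > $2 || $1 > $3)    # or: larger than at least one
            print "middle"
        else
            print "smallest"
    }'
}

classify 5 2 3    # largest
classify 3 2 5    # middle
classify 1 2 3    # smallest
```

The order of the tests matters: the && test is tried first, so the || branch is only reached when $1 fails to beat at least one of the other two columns.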

Functions
• Loops – allow you to repeat a set of instructions until a condition is met
• Major source of problems – infinite loop
  – The exit condition is never met
• Two loop functions
  – for
  – while
    BEGIN {
        i = 1
        while ((getline < "ref.pck") > 0) {
            CAref[i] = $1
            CBref[i] = $2
            i++
        }
    }
    {
        for (i = 1; i <= NF; i++) {
            if ($i >= 54.0 && $i <= 55.0) count++
        }
    }
    END {
        for (i = 1; i <= NR; i++) {
            print CA[i], CB[i]
        }
    }
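A minimal sketch of a for loop over the fields of one row, counting how many values fall between 54.0 and 55.0. The five input values are chosen for illustration; both sides of the && need their own $i.

```shell
count=$(echo "53.19 54.86 54.64 56.10 54.00" | awk '{
    count = 0
    for (i = 1; i <= NF; i++) {                  # exit condition: i passes the last field
        if ($i >= 54.0 && $i <= 55.0) count++
    }
    print count
}')
echo "$count"   # 3
```

The loop ends because i increases on every pass and NF is fixed for the row; dropping the i++ would leave the exit condition unmet and produce an infinite loop.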

Functions
• Arrays – allow you to assign multiple values to a single variable
• Effectively allow you to sort or group information
• Two types of Arrays
  – 1D: CA[0]
  – 2D: CB[0, 0]
    BEGIN {
        i = 1
    }
    {
        PkID[i] = $1
        NH[i] = $2
        N15[i] = $3
        CAiatom[i] = $4
        CBiatom[i] = $5
        CAatom[i] = $6
        CBatom[i] = $7
        COi[i] = $8
        i++
    }
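One use of arrays is to hold every row until the whole file has been read. This sketch stores two columns in 1D arrays indexed by NR and prints them in reverse order from the END block; the three rows reuse values from the earlier example table.

```shell
out=$(printf '1 53.19 40.06\n2 59.42 31.90\n3 60.73 38.11\n' | awk '
    { CA[NR] = $2; CB[NR] = $3 }                 # one array element per row
    END {
        for (i = NR; i >= 1; i--)                # walk the stored rows backwards
            print CA[i], CB[i]
    }
')
echo "$out"
first=$(echo "$out" | head -n 1)
```

Without the arrays only the last row would still be available in END, since $1, $2 and so on are overwritten each time a new row is read.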

Functions
• printf – primary mechanism of reporting the results of the Awk program to the user
• Extremely flexible; a number of options are available to format output
  – Can do calculations within the print statement
  – Can be frustrating to get it right.
• Two types of print statements
  – print: no formatting, just prints the value of the variable
  – printf: full range of formats available
    BEGIN {
        state = "HAPPY"
        for (i = 1; i <= 10; i++) {
            print i*i
            printf ("%s\n", state)
        }
    }

Functions
‒ Examples of different formatting options with printf
• Each variable needs a type definition:
  § %d – decimal
  § %s – string
  § %f – floating point
  § %e – floating point with scientific notation
• Formatting is “literal”
    printf ("%s%s\n", $1, $2)
    – print all the characters in column 1 (%s) and column 2 (%s)
    – \n prints a new line
    – no spacing
      » $1 = HAPPY and $2 = SAD, the output would be HAPPYSAD

Functions
‒ Examples of different formatting options with printf
• Spacing, Tabs and Justification
  § The number of spaces between type definitions will be printed
  § \t – Tab, using system-defined tab locations
  § \n – print new line
  § Can use any number or combination of tabs, spaces and new lines
  § Default printing is right justified
  § For left justification, place a - in front of the type classification (e.g. %-10s)
    printf ("%s %s\n", $1, $2)
    – single space
      » $1 = HAPPY and $2 = SAD, the output would be HAPPY SAD
    printf ("%s     \t%s\n\n", $1, $2)
    – five spaces then tab
      » $1 = HAPPY and $2 = SAD, the output would be HAPPY, five spaces, a tab, then SAD
      » Followed by two new lines

Functions
‒ Examples of different formatting options with printf
• Precision Modifier
  § “Fine tunes” how the variable is printed
  § Defines both spacing and number of characters or significant figures printed
  § Simply place a number in front of the type classification (e.g. %5.3f)
    printf ("%10s%5s\n", $1, $2)
    – 10 spaces for the first string and 5 spaces for the second string
    – Spaces include the number of characters in the string
      » $1 = HAPPY and $2 = SAD, the output would be HAPPY and SAD, each padded on the left
      » 5 spaces in front of HAPPY (5 spaces + 5 characters in HAPPY = 10)
      » 2 spaces in front of SAD (2 spaces + 3 characters in SAD = 5)
      » OR printing of $1 will end on column 10 and printing of $2 will end on column 15
    printf ("%f %5.3f\n", $1, $1)
      » $1 = 1/3, the output would be 0.333333 0.333
      » %f – prints the default six digits after the decimal point
      » 5 in %5.3f indicates a minimum total of 5 characters printed (including the decimal point)
      » 3 in %5.3f indicates that 3 digits are printed to the right of the decimal point
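The width and precision modifiers can be checked directly from the command line. In this sketch the values are supplied as literals inside BEGIN rather than read from a file, and a "|" is printed after the left-justified field only to make the padding visible.

```shell
right=$(awk 'BEGIN { printf("%10s%5s\n", "HAPPY", "SAD") }')    # right justified
left=$(awk 'BEGIN { printf("%-10s|%s\n", "HAPPY", "SAD") }')    # left justified
third=$(awk 'BEGIN { x = 1/3; printf("%f %5.3f\n", x, x) }')    # default vs. 3 decimals
echo "[$right]"   # [     HAPPY  SAD]
echo "[$left]"    # [HAPPY     |SAD]
echo "[$third]"   # [0.333333 0.333]
```

Each %-style type definition consumes one argument, which is why x is passed twice in the last call.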

Functions
‒ Examples of different formatting options with printf
• Printing is “literal”
  § Anything within the quotes is printed
    printf ("%s HELLO %s\n", $1, $2)
      » $1 = HAPPY and $2 = SAD, the output would be HAPPY HELLO SAD
    printf ("Hello World\n")
      » Don’t need to print a variable
      » The output would simply be: Hello World
• Print to a File
  § Simply redirect the output of the print or printf statement to a quoted file name
    printf ("Hello World\n") > "helloworld.txt"

Functions
‒ Examples of different formatting options with printf
• Can do Math within the print and printf statement
    printf ("%f %5.3f\n", $1^2, sqrt($2))
      » $1 = 1/3 and $2 = 1/3, the output would be 0.111111 0.577
• This is a general feature of Awk: functions can be embedded within other functions
• For more information on Awk, see
  • The book “sed and awk” by Dale Dougherty, O’Reilly and Associates
  • The GNU Awk User’s Guide: http://www.gnu.org/software/gawk/manual/gawk.html
  • Effective Awk Programming: http://www.gnu.org/software/gawk/manual/

Linux & AWK – Final Thoughts
• These lectures are only meant to serve as a general introduction to both Linux and Awk
• There is a lot more detail and other topics that simply were not covered. Entire courses are dedicated to these topics. I did not present everything there is to know about Linux and Awk or programming in general
• Mastering an operating system and computer programming will only come from extensive effort and practice
• The best way to learn is by doing!