EE 3721 Computer System Principles Academic Year 2011 2012

Number of slides: 151

EE 3721 Computer System Principles, Academic Year 2011/2012
- HD Subject Code: EE 3721
- Lecturer: Y. F. Fung
- Office (Y. F. Fung): CF 605
- Email: eeyffung@polyu.edu.hk
- Consultation: Wednesday after 3 pm

Lab Schedule
- Labs commence in week 6 (6 Oct)
- A short briefing will be provided before the session
- Form your own groups, with 2 students per group

Computer System Engineering: Reference Textbooks
- The Intel Microprocess …, Barry Brey, Prentice-Hall. ISBN 0-13-060714-2
- Intel Microprocessors: Architecture, Programming and Interfacing, Ray & Bhurchandi, McGraw-Hill. ISBN 0-07-120169-6
- Pentium Processor System Architecture, 2nd Edition, Don Anderson / Tom Shanley, MindShare, Inc.
- Use the latest edition of the above titles

Reference books
- IBM PC Assembly Language and Programming, Fourth Edition (Peter Abel, Prentice-Hall International Inc.)
- The 8086 and 80286 Microprocessors: Hardware, Software and Interfacing (A. Singh, W. A. Triebel, Prentice-Hall)

Teaching materials
- Available on the WebCT web site
- Some useful information can also be found at ftp.ee.polyu.edu.hk/yffung/ee2531

Related Topics
- Computer architectures of microprocessors (2.5 weeks)
- Assembly language programming (3 weeks)
- Memory system overview (2 weeks)
- Input/output mechanism (2 weeks)
- Other computing supporting facilities: interrupt (1.5 weeks), communication (1.5 weeks)

Assessment methods
- Continuous assessment (40%)
  - Performance in lab and online exercises (10%)
  - Lab report (10%)
  - Quiz (usually only one test or quiz, usually around week 8 or 9) (20%)
- Examination (60%)

Criteria Reference Assessment
- Pass (D/D+)
  - Some understanding of assembly language programming
  - Some understanding of other topics
- Grade C/C+
  - Able to write simple assembly language programs
  - Able to demonstrate in-depth understanding of certain topics
- Grade B/B+
  - Able to write assembly language programs to achieve specific tasks
  - Able to demonstrate in-depth understanding of most topics
- Grade A/A+
  - Able to write assembly language programs
  - Able to demonstrate in-depth understanding of all topics
  - Able to apply the knowledge learnt to solve real-life problems

Objectives
- To understand the basic principles (hardware components) of a computer system
- To learn how to control a computer system using assembly language
- To learn how to design a simple computer system (e.g. for the development of an electronic mouse, a robot, or a simple control system for your project)

Learning outcomes
1. Given the specifications of an application and the instruction set of the microprocessor, design an assembly program to carry out the necessary operations
2. Able to appreciate advanced features of the latest microprocessors
3. Given a set of conditions, design a simple computer system
4. Able to think logically and to present results

Things that you can do after this

More things that you can do

Attention! This is not a traditional "Engineering" subject! Equations: almost none! Try to think, understand and learn. Do not try to memorize.

What is a computer? Nowadays, computers come in different forms! In this subject, we try to learn the generic form of a computer. If you are willing to spend the time, it is possible to build your own game console or cell phone.

What is a computer?

The PC is dead “It was fun while it lasted, but as of 2011 the era of the desktop computer is officially over.” Read more: http://www.foxnews.com/scitech/2011/01/11/pc-is-dead-ces-tabletssmartphone/#ixzz1X1jBuY8N

Different forms of computer Can you think of a real-life example that involves the application of a microprocessor or a computer?

Computer is a digital system Computer is a combination of digital and analogue systems, but mainly digital So to learn this subject, you also need to understand basic digital system as well as binary number system

Number systems (revision)
- The microprocessor (µP) is a binary device: everything inside the microprocessor is represented by 0 and 1.
- The most direct number system used inside the microprocessor is the binary system (base 2), with only the digits 0 and 1, for example 01010101.
- Each digit in the number represents a power of 2, starting from the right-hand side (RHS): the first digit is 2^0, then 2^1, 2^2, etc.
- The rightmost bit is called the LSB (least significant bit), while the leftmost bit is the MSB (most significant bit).
- So the value 01010101 = 0×2^7 + 1×2^6 + 0×2^5 + 1×2^4 + 0×2^3 + 1×2^2 + 0×2^1 + 1×2^0 = 64 + 16 + 4 + 1 = 85
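
The positional rule can be checked with a few lines of Python. This is only an illustrative sketch; the function name `binary_value` is made up for this note and is not part of the course material. It evaluates the 8-bit pattern 01010101, whose expansion gives 85.

```python
def binary_value(bits: str) -> int:
    """Sum bit * 2**position, counting positions from the right-hand side (LSB = position 0)."""
    total = 0
    for position, bit in enumerate(reversed(bits)):
        total += int(bit) * 2 ** position
    return total


print(binary_value("01010101"))  # 64 + 16 + 4 + 1 = 85
```

Python's built-in `int("01010101", 2)` performs the same conversion.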

Number system
- In base 10: 123 = 1×10^2 + 2×10^1 + 3×10^0
- In subtraction, 0 - 1 = -1, but there is no minus sign in the binary system, so the answer is 11. The first "1" is called the borrow bit.
- In addition, 1 + 1 = 10. The first "1" is called the carry bit.
- How about 01 + 01 = 10: is there a carry?

Number system
- We use the term byte for an 8-bit value and the term word for a 16-bit value; a 32-bit value is called a double word.
- Representing a very large number in binary needs many bits, which is not very convenient.
- Therefore we usually use number systems derived from the binary system. The most commonly used are octal (base 8) and hexadecimal (base 16), usually just called hex.

Hex number system
- In octal, only the digits 0 to 7 are used; in hexadecimal, we use 0 to 9 and A, B, C, D, E, F, where A = 10, B = 11, C = 12, D = 13, E = 14, F = 15.
- Converting a binary number to hex is very easy: starting from the RHS, every 4-bit group of the binary number converts directly into one hex digit.
- For example, 01010101 is equal to 55H (H = hexadecimal): there are two 4-bit groups, 0101 and 0101, and 0101 = 5, so the binary pattern is 55H.
- Example: 1100 0111 1010 1011 = C7ABH
- To convert back: ABCDH = 1010 1011 1100 1101
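
The 4-bit grouping rule can be sketched in Python as follows. The helper names `binary_to_hex` and `hex_to_binary` are hypothetical and chosen only for this illustration.

```python
def binary_to_hex(bits: str) -> str:
    """Convert a binary string to hex by grouping 4 bits at a time from the right."""
    bits = bits.replace(" ", "")
    # Pad on the left so that the grouping effectively starts from the RHS.
    bits = bits.zfill((len(bits) + 3) // 4 * 4)
    digits = "0123456789ABCDEF"
    groups = [bits[i:i + 4] for i in range(0, len(bits), 4)]
    return "".join(digits[int(group, 2)] for group in groups)


def hex_to_binary(hex_digits: str) -> str:
    """Expand every hex digit into its 4-bit group."""
    return " ".join(format(int(digit, 16), "04b") for digit in hex_digits)


print(binary_to_hex("1100 0111 1010 1011"))  # C7AB
print(hex_to_binary("ABCD"))                 # 1010 1011 1100 1101
```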

Unsigned and signed numbers
- If a binary pattern can represent both positive and negative values then it is regarded as signed; otherwise it is unsigned (only positive).
- There are two types of signed notation: using a sign bit (usually the leftmost bit) and 2's complement.
- For example, using the leftmost bit as a sign bit (0 = positive, 1 = negative): 10101010 = -42, while 00101010 = 42

2's complement
- Usually the 2's complement system is used.
- To convert a value X to -X using 2's complement, we first take the 1's complement of X and then add 1 to the result.
- The 1's complement of a binary value is obtained by inverting its 0's and 1's.
- Example: X = 00101010 = 42. The 1's complement of X is 11010101; adding 1 gives 11010101 + 1 = 11010110 = -42 (compare with 10101010 in the sign-bit notation)
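
The invert-and-add-one procedure can be sketched in Python. The function names and the default width of 8 bits are assumptions made for this illustration.

```python
def twos_complement_negate(value: int, width: int = 8) -> int:
    """Return the bit pattern of -value: invert every bit (1's complement), then add 1."""
    mask = (1 << width) - 1
    ones_complement = ~value & mask
    return (ones_complement + 1) & mask


def as_signed(pattern: int, width: int = 8) -> int:
    """Interpret a bit pattern as a 2's complement signed value."""
    if pattern & (1 << (width - 1)):      # MSB set means the value is negative
        return pattern - (1 << width)
    return pattern


negated = twos_complement_negate(0b00101010)
print(format(negated, "08b"))  # 11010110
print(as_signed(negated))      # -42
```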

Floating point representation
- As mentioned above, each digit in a binary pattern represents a power of 2.
- In a floating point format, the digits after the binary point represent the values 2^-1, 2^-2, etc.
- Example: 0.1010 = 1×2^-1 + 0×2^-2 + 1×2^-3 + 0×2^-4 = 0.5 + 0.125 = 0.625
- How to represent signed floating-point values will be discussed at a later stage.
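
A short Python sketch of the same sum; the function name `binary_fraction_value` is invented for this note.

```python
def binary_fraction_value(bits_after_point: str) -> float:
    """Evaluate the digits after the binary point: the first is worth 2**-1, the next 2**-2, and so on."""
    return sum(int(bit) * 2.0 ** -(index + 1)
               for index, bit in enumerate(bits_after_point))


print(binary_fraction_value("1010"))  # 0.5 + 0.125 = 0.625
```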

Floating-point format
- Floating-point numbers are represented in the form X = ±F × 2^(±E)
- F is called the fraction (or mantissa) and E is the exponent
- Floating-point numbers are usually represented in one of two standard formats (defined by IEEE): 32-bit (single precision) and 64-bit (double precision). An 80-bit (extended precision) format is also available.

IEEE Floating-point standard
- For single precision: there are 23 bits for the fraction, 8 bits for the exponent and 1 sign bit
- For double precision: there are 52 bits for the fraction, 11 bits for the exponent and 1 sign bit
- The bit layout is: sign | exponent | fraction

Floating-point arithmetic
Using the format ±F × 2^(±E), floating-point arithmetic can be implemented very easily:
- X = a × 2^b
- Y = c × 2^d
- X × Y = (a × c) × 2^(b+d)
- X / Y = (a / c) × 2^(b-d)
- How about X + Y and X - Y???
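
As a rough illustration of the multiply and divide rules, the sketch below uses Python's `math.frexp` to split a number into a fraction and a power-of-two exponent, and `math.ldexp` to put them back together. The function names `multiply` and `divide` are made up for this note; real hardware also has to normalize and round the result.

```python
import math


def multiply(x: float, y: float) -> float:
    """X*Y = (a*c) * 2**(b+d): multiply the fractions, add the exponents."""
    a, b = math.frexp(x)   # x == a * 2**b
    c, d = math.frexp(y)   # y == c * 2**d
    return math.ldexp(a * c, b + d)


def divide(x: float, y: float) -> float:
    """X/Y = (a/c) * 2**(b-d): divide the fractions, subtract the exponents."""
    a, b = math.frexp(x)
    c, d = math.frexp(y)
    return math.ldexp(a / c, b - d)


print(multiply(6.0, 0.5))  # 3.0
print(divide(6.0, 0.5))    # 12.0
```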

Floating-point format
For the single precision format:
- If E = 255 and F ≠ 0: NaN (not a number)
- If E = 255, F = 0, S = 1: -infinity
- If E = 255, F = 0, S = 0: +infinity
- If 0

Example of floating point
Given 6.125, express the value in binary using the single precision floating-point format.
- First convert the value into the form 1.XXX × 2^y: 6.125 = 1.53125 × 2^2
- So the exponent is 2 and the significand is 1.53125
- As all normalized floating-point values come with the leading '1.', the '1.' is not stored and only the fraction 0.53125 is saved!
- 0.53125 = .10001 in binary
- E - 127 = 2 => E = 129 = 10000001
- Stored pattern: 0 10000001 10001000000000000000000
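
The worked example can be cross-checked with Python's standard `struct` module, which packs a value into the IEEE single precision format. The helper name `float32_fields` is an assumption made for this sketch.

```python
import struct


def float32_fields(value: float):
    """Split a value into the sign, exponent and fraction fields of its 32-bit IEEE pattern."""
    (raw,) = struct.unpack(">I", struct.pack(">f", value))
    sign = raw >> 31                 # 1 bit
    exponent = (raw >> 23) & 0xFF    # 8 bits, stored with a bias of 127
    fraction = raw & 0x7FFFFF        # 23 bits, the leading '1.' is not stored
    return sign, exponent, fraction


sign, exponent, fraction = float32_fields(6.125)
print(sign, format(exponent, "08b"), format(fraction, "023b"))
# 0 10000001 10001000000000000000000
```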

Characters In addition to values, characters are also represented using binary codes usually in ASCII (American Standard Code for Information Interchange) code.

ASCII table: A = 41H; a = 61H, so a > A
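
The two codes can be inspected directly in Python with the built-in `ord` function; the variable name `case_gap` is only illustrative.

```python
print(hex(ord("A")), hex(ord("a")))  # 0x41 0x61
print(ord("a") > ord("A"))           # True

# Upper-case and lower-case letters are a fixed distance apart in ASCII.
case_gap = ord("a") - ord("A")
print(hex(case_gap))                 # 0x20
```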

Chinese characters
- There are several ways of representing Chinese characters in a computer:
  - Unicode
  - Big 5
  - GB 2312-80
- Your computer needs to have the proper fonts installed in order to display the corresponding characters

Chinese characters (Big 5)
- Big 5 is the character encoding standard most commonly used for traditional Chinese characters. Regions such as Taiwan and Hong Kong use this encoding standard.
- GB is for simplified Chinese characters.
- Every Chinese character is represented by a two-byte code. The first byte ranges from 0x81 to 0xFE, while the second byte ranges from 0x40 to 0x7E and from 0xA1 to 0xFE.
- Thus, in a document that contains both Chinese characters and regular ASCII characters, the ASCII characters are still represented with a single byte.

Big-5 coding ranges
Range      | Total | Name of block
8140-8DFE  | 2041  | User-defined area
8E40-A0FE  | 2983  | User-defined area
A140-A3FE  | 471   | Big 5 symbols and control codes
A440-C67E  | 4501  | Big 5 primary character set
C6A1-C8FE  | 408   | Vendor-defined area
C940-F9D5  | 7652  | Big 5 secondary character set
F9D6-F9FE  | 41    | Vendor-defined area
FA40-FEFE  | 785   | User-defined area

Big 5 Chinese characters
Address | Character
A640H   | 共
AF66H   | 病
A741H   | 你
AA46H   | 東
Source: http://www.khngai.com/chinese/charmap/tblbig.php
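
Python ships a `big5` codec, so the two-byte codes can be inspected directly. The sketch below is illustrative only: the helper names are invented for this note, and the range checks simply restate the byte ranges given for Big 5 above.

```python
def big5_code(character: str) -> str:
    """Return the Big 5 code of one character as an upper-case hex string."""
    return character.encode("big5").hex().upper()


def is_big5_lead_byte(byte: int) -> bool:
    """First byte of a two-byte code: 0x81 to 0xFE."""
    return 0x81 <= byte <= 0xFE


def is_big5_trail_byte(byte: int) -> bool:
    """Second byte of a two-byte code: 0x40 to 0x7E, or 0xA1 to 0xFE."""
    return 0x40 <= byte <= 0x7E or 0xA1 <= byte <= 0xFE


print(big5_code("你"))                # A741
print(len("A你".encode("big5")))      # 3: one byte for 'A', two bytes for the Chinese character
```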

Exercises
- Convert the number -4.5 into an IEEE 32-bit floating-point pattern
- Convert 9 into an IEEE 32-bit floating-point pattern
- Without using a calculator, determine 01010010 + 10101101
- Without using a calculator, determine 01010010 - 10101101
- What is (a - B)?

Revision exercises
- What is a bit, a byte, a word, one K, one M?
- What is hex, octal, binary?
- Do you know how to convert a number into different base systems without using a calculator?
- Binary number notation (1010 = ???)
- How many different values can be represented by an 8-bit pattern?
- Hexadecimal A = ???
- How to convert binary 110101110101 to hex?
- CFH (hex) = ???

More exercises
- How to represent negative values in the binary number system?
- How about using the binary system to represent a floating-point number?
- What is ASCII? What is BCD? What is a string?
- Do you know how to program in C/C++, Fortran, Java, assembly language, etc.?
- Do you know what a multiplexer, a register and a latch are?
- Referring to the block diagram of a memory, do you know what an address is and what data is?

Microprocessor/Microcomputer
- What is a microcomputer?
  - A complete computer based on a particular microprocessor chip
  - So the microprocessor is the most important component in a microcomputer
  - So to study a microcomputer system, we must first understand the microprocessor
- What is a microprocessor?
  - A processor-on-a-chip can be described as a microprocessor: 8051 series, 8086, Pentium series, etc.

Block diagram of a generic microcomputer system (figure labels: RAM; hard disk; CD-ROM; data and address bus; keyboard, mouse; monitor, printer)

Microprocessor-based system (figure: sensor -> input -> µP with control program -> output -> motor)

Structure of a modern computer system

Model | Year | Max. clock frequency at introduction | Transistors per die | Register sizes | Ext. data bus size | Max. external address space | Caches
8086 | 1978 | 8 MHz | 29 K | 16 GP | 16 | 1 MB | None
486 | 1989 | 25 MHz | 1.2 M | 32 GP, 80 FPU | 32 | 4 GB | L1: 8 KB
Pentium | 1993 | 60 MHz | 3.1 M | 32 GP, 80 FPU | 64 | 4 GB | L1: 16 KB
P3 | 1999 | 500 MHz | 8.2 M | 32 GP, 80 FPU, 64 MMX, 128 XMM | 64 | 64 GB | L1: 32 KB, L2: 512 KB
Pentium Dual Core | 2007 | 1.6 GHz to 2.4 GHz | 167 M | - | 64 | 64 GB | L2: 1 MB
GP: general purpose; FPU: floating point unit; Register: a device to store binary data

http://www.hknotebook.com/polyu2011/lenovo/main_notebooks.jsp

The Intel 8086 Microprocessor
- The 8086 was a popular device in the late 1970s and the 1980s; its architecture is simple and suitable for teaching computer architecture
- Once we have grasped the basic concepts of the 8086, we can discuss the more advanced microprocessors
- Many features found in the 8086 are still embedded in modern microprocessors, but enhanced!

8086 Microprocessor
- This is a 16-bit microprocessor chip manufactured with high-performance metal-oxide semiconductor (HMOS) technology
- The circuitry on the chip is approximately 29,000 transistors
- Comes in a 40-pin package

Self test
- Do you know what is meant by a 16-bit, 32-bit or 64-bit processor?
- How would you describe an Intel Core 2 Duo CPU?

Basic 8086 features
- True 16-bit microprocessor with 16-bit internal and external data bus
- The address bus and data bus are multiplexed: address and data share the same pins!
- A 20-bit address bus, which allows access to 1 MB of memory
- Can address up to 64 K byte-wide I/O ports, or 32 K word-wide ports (word = 16 bits)
- Details regarding I/O ports will be discussed in the I/O System

Pins layout for 8086 A/D – address/data (address and data share the pins - multiplexed) Also pay attention to “active high” and “active low” signals

8086 Features
- The 8086 has two modes: min. and max.
- Min. mode: used as a typical microprocessor
- Max. mode: used with multiple processors (usually for floating-point arithmetic)
- The mode is selected via the MN/MX input

Block diagram for a simple computer system (figure labels: display unit (LCD), memory, CPU, I/O)
What are the basic operations performed by a computer?
- Get instruction from memory
- Perform/execute operation
- Get next instruction

What are the basic operations performed by a microprocessor?
- Get instruction from memory
- Perform/execute operation
- Get next instruction
So that it can perform the above operations effectively, the microprocessor is organized internally into two units: the Bus Interface Unit (BIU) and the Execution Unit (EU).

Processor Model for 8086

The 8086 Internal Architecture
- The internal functions of the 8086 µP are divided between two separate processing units: the Bus Interfacing Unit (BIU) and the Execution Unit (EU).
- The BIU is responsible for performing all bus operations, such as instruction fetching, reading and writing operands from/to memory, and inputting and outputting data for peripherals.
- The EU is responsible for executing instructions.
- The two units operate asynchronously, so overlapping instruction fetch and execution is possible (what's the advantage of this???)

Terminology
- A program is stored in memory and consists of a sequence of instructions and some data
- Executing an instruction may require some operands
- What is an operand? An operand is the object that is being operated upon!
- Example: in the instruction ADD A, B (A = A + B), ADD (addition) is the operation, and A and B are the operands

Bus Interfacing Unit (BIU)
- The BIU is the 8086's interface to the outside world (external memory)
- The major task of the BIU is to get "information" from the memory; information includes data and instructions
- How can we get data from memory? To access the memory, we need to issue an address (via the address bus) and then read the data (via the data bus)
- (Details of this mechanism will be discussed when we discuss the memory systems)

BIU
- There is a full 16-bit bidirectional data bus and a 20-bit address bus
- It has the following functions: instruction fetch, instruction queueing, operand fetch and storage, and bus control
- It contains the segment registers, internal communication registers, instruction pointer, instruction object code queue, address summer, and bus control logic

How BIU and EU collaborate
- What does a program consist of? A program is a collection of instructions and data
- The BIU fetches an instruction from memory and puts it in a queue, called the instruction queue (refer to the block diagram)
- The EU fetches the instruction from the queue and executes it
- The BIU and EU implement a pipeline (BIU -> EU) and pre-fetch to optimize performance

BIU – EU Pipeline mechanism
Note: there are 3 components in the pipeline
(figure labels: information coming from memory; BIU: control to access the memory; instruction queue that can store 6 bytes of instructions; EU executes the instruction; EU requests BIU to get operands)

Pre-fetch concept Pre-fetching is similar to what you do when you’re having a buffet dinner. You collect different kinds of food from the buffet table, for example, you take the sashimi, roast beef, soup, and salad etc. When you’re eating the salad, you have already pre-fetched the sashimi and the roast beef! If you do not pre-fetch then you take the salad first, go back to the table, eat your salad. When you finish the salad then you go and get some other food. Why pre-fetching your food? ? ?

Pre-fetch Is pre-fetching in a buffet dinner exactly the same as the pre-fetching mechanism in a microprocessor? The plate is equivalent to which component? Is a big plate better than a small plate?

Pre-fetch by BIU
What's pre-fetch?
- When the queue has room for at least 2 more bytes, and the EU is not requesting the BIU to read or write operands from memory, the BIU looks ahead in the program by pre-fetching the next sequential instruction
- The pre-fetched instructions are held in the queue, which is a FIFO (first-in-first-out) device
- Two bytes are fetched (16-bit data bus) in a single memory cycle
- The EU reads one instruction byte at a time from the output of the queue

Pre-fetch (figure: memory holding instruction bytes int1a, int1b, int2a, int2b, int2c, int3a, int3b, int4a; the queue feeding the EU)
While the EU is processing "int1a", int2a and int2b have already been pre-fetched

Instruction sequence Fetch Execute

Pre-fetching by BIU
- If the instruction queue is full and the EU is not requesting access to operands in memory, the BIU does not perform any bus cycles; these are called idle states
- If the BIU is in the middle of fetching an instruction when the EU requests its services, the BIU first completes the instruction fetch bus cycle and then serves the EU
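
The pre-fetch rules can be mimicked with a small FIFO simulation. This is a simplified sketch, not a model of the real bus timing: the function name `biu_step` is invented, fetches are assumed to be word-aligned, and the sample bytes are taken from the machine-code example later in these notes.

```python
from collections import deque

QUEUE_CAPACITY = 6  # the instruction queue stores 6 bytes


def biu_step(queue, memory, fetch_address, eu_needs_bus=False):
    """Perform one BIU step and return the next fetch address."""
    free_bytes = QUEUE_CAPACITY - len(queue)
    if eu_needs_bus or free_bytes < 2:
        # Either the BIU serves the EU, or the queue is full (idle state): no pre-fetch.
        return fetch_address
    # Two bytes are fetched in a single memory cycle (16-bit data bus).
    queue.extend(memory[fetch_address:fetch_address + 2])
    return fetch_address + 2


memory = bytes.fromhex("B802F8508BCB8BD1")  # sample instruction bytes
queue = deque()
address = 0
for _ in range(4):               # the fourth step finds the queue full and idles
    address = biu_step(queue, memory, address)
print(len(queue), address)       # 6 6

first_byte = queue.popleft()     # the EU reads one byte from the output of the queue
print(hex(first_byte))           # 0xb8
```

After the EU has removed a second byte there is room for two bytes again, and the next call to `biu_step` resumes pre-fetching.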

Exercise An instruction can be completed by a single processing stage and it takes 25 clock cycles to finish. On the other hand, the instruction can be divided into five sub-tasks each of which can be carried out by one processing stage forming a pipeline. If sub-task 1 takes 10 clock cycles, subtask 2 takes 5 clock cycles, sub-task 3 takes 8 clock cycles, sub-task 4 takes 10 clock cycles and sub-task 5 takes 12 clock cycles, how many instructions should be executed so that the subtask approach will be more cost-effective?
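
One way to explore this exercise numerically is sketched below. It rests on two assumptions that the exercise does not state: "more cost-effective" is read as "fewer total clock cycles", and the pipeline is assumed to deliver one further instruction every slowest-stage time (12 cycles) after the first instruction has passed through all five stages. The function names are hypothetical, and a different pipeline model would give a different answer.

```python
STAGE_CYCLES = (10, 5, 8, 10, 12)   # sub-task times from the exercise
SINGLE_STAGE_CYCLES = 25            # time per instruction without a pipeline


def unpipelined_cycles(n: int) -> int:
    """Total cycles for n instructions on the single processing stage."""
    return n * SINGLE_STAGE_CYCLES


def pipelined_cycles(n: int) -> int:
    """First instruction passes through every stage; each later one finishes one slowest-stage time later."""
    return sum(STAGE_CYCLES) + (n - 1) * max(STAGE_CYCLES)


def break_even() -> int:
    """Smallest number of instructions for which the pipeline needs fewer cycles."""
    n = 1
    while pipelined_cycles(n) >= unpipelined_cycles(n):
        n += 1
    return n


print(break_even())  # 3 under the assumptions above (69 cycles versus 75)
```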

Components in the BIU
- The BIU's job is to read/write the memory
- What is needed to access the memory? We need to generate an address and read/write the data
- The BIU contains a dedicated adder, which is used to generate the physical address of the memory location
- The address is formed by adding a 16-bit segment address, with a hex 0 appended, to a 16-bit offset address
- Example: the physical address of the next instruction to be fetched is formed by combining the current contents of the code segment (CS) register (16-bit) and the current contents of the instruction pointer (IP) register (16-bit)

Generating the physical address
- If CS (code segment) is 1005H and the IP is 5555H, what is the physical address? (How do we determine the physical address?)
- Point to consider: the address bus of the 8086 is 20-bit, but the registers are 16-bit. Is it a problem?
- Consider the sum of two 16-bit values: what is the maximum integer value represented by 16 bits? What is the sum, and how many bits are required to represent the sum?
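
The address calculation can be sketched in Python as below. The function name `physical_address` is invented for this note. The segment value is shifted left by 4 bits (the same as appending a hex 0) before the offset is added, and the result is masked to 20 bits because the 8086 has only 20 address lines.

```python
def physical_address(segment: int, offset: int) -> int:
    """Return the 20-bit physical address for a 16-bit segment and a 16-bit offset."""
    base = segment << 4                  # append one hex 0, i.e. multiply by 16
    return (base + offset) & 0xFFFFF     # keep 20 bits


print(format(physical_address(0x1005, 0x5555), "05X"))  # 155A5
```

With CS = 1005H the base address is 10050H, and adding IP = 5555H gives 155A5H.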

Segment concept
- The 8086 can support up to 1 MB of memory
- Memory is divided into segments; each segment is 64 K
- To access data inside a segment, we need to know the base address of the segment as well as the offset
- This is similar to a building: we can have room 10B, 11B, etc. The floor is the base address and the letter 'B' is the offset

Segment in 8086
Why is the segment mechanism needed in the 8086?
- The address bus size (20-bit) > register size (16-bit)
  - Example: if the address bus is 4 bits wide then you can access 16 locations
  - If you can only output a 2-bit address from your register, what will happen?
- Save components: the size of the registers can be reduced
- A segment is a 64 Kbyte memory block

Segment concept: segment analogy
- It is similar to a housing estate
- Instead of building one very tall building, we build smaller blocks
- If you live in one of the smaller blocks then your address has two components: the block number (segment) and the floor (offset)

Segment concept (figure: 64 K segments within the 1 M memory)
How can we access locations within a segment???

Segment Registers
- The segment registers are used for accessing the memory
- The 8086 address space is segmented into 64 K-byte segments and just four segments can be active at a time, because there are only 4 segment registers
- In theory, how many segments can we have?
  - Total memory is 1 M and a segment is 64 K, so there are 1 M / 64 K = 16 non-overlapping segments

The Segment concept
- So the real address (physical address) = base address (20 bits) + offset (16 bits)
- The base address must be divisible by 16, so its last hex digit is equal to 0; the '0' is not stored, so a 16-bit register can hold the rest of the address
- (figure labels: memory (a segment); base address (segment address); offset; real address)

Segment concept
For example:
- FFFFEH is not divisible by 16
- FFFF0H is divisible by 16
- 12340H is also divisible by 16

Segment concept
- The maximum value of a 16-bit value is FFFF (hex). If two 16-bit values are simply added together, such as FFFF (segment) + FFFF (offset), the result is 1FFFE (hex) (physical); it is only a 17-bit value, and values from 1FFFFH to FFFFFH cannot be produced.
- A 20-bit pattern represents values from 00000H to FFFFFH.
- So in the 8086, you cannot assign a segment at an arbitrary address. The segment address must satisfy one condition: the base address must be divisible by 16.
- If a value is divisible by 16 and we are using hex (base 16) as the number system, then the last digit of the value must be '0'.
- For example, if the value in the segment register is 1234H and the offset is 20H, then the physical address is 12340H + 20H = 12360H.

Segment concept: the segment concept analogy
- Suppose you are designing the elevator for a very tall building, for example with 100 levels. How are you going to arrange the buttons if the elevator is able to reach all levels?

Execution Unit (EU)
- The EU is responsible for decoding and executing all instructions
- What is decoding? The EU sees data such as 8B C3 (10001011 11000011)
- Decoding means carrying out the proper operation according to the binary string (10001011 11000011)
- 8B C3 is MOV AX, BX
- After decoding, the EU performs the move (MOV) operation
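
To make the idea of decoding concrete, the sketch below decodes only the register-to-register form of opcode 8B (MOV of a 16-bit register). The function and table names are invented for this note, and a real decoder handles far more cases. The second byte is split into three fields: a 2-bit mode, a 3-bit destination register and a 3-bit source register.

```python
REG16 = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]  # 8086 16-bit register numbering


def decode_mov_r16_r16(code: bytes) -> str:
    """Decode the two-byte instruction '8B xx' when both operands are registers."""
    opcode, second_byte = code[0], code[1]
    if opcode != 0x8B:
        raise ValueError("only opcode 8B is handled in this sketch")
    mode = second_byte >> 6                      # top 2 bits: 11 means register operand
    destination = (second_byte >> 3) & 0b111     # middle 3 bits
    source = second_byte & 0b111                 # bottom 3 bits
    if mode != 0b11:
        raise ValueError("only the register-to-register form is handled")
    return f"MOV {REG16[destination]}, {REG16[source]}"


print(decode_mov_r16_r16(bytes.fromhex("8BC3")))  # MOV AX, BX
```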

Decoding (figure: instruction 10101100 -> decoder -> control signals)

Execution Unit (EU) (Cont'd)
- The EU consists of an ALU (Arithmetic and Logic Unit), status and control flags, eight general-purpose registers, temporary registers, and queue control logic
- The EU extracts instructions from the top of the queue in the BIU, decodes them, generates operand addresses if necessary, passes them to the BIU and requests it to perform the read or write bus cycles to memory or I/O, and performs the operation specified by the instruction on the operands
- During execution of the instruction, the EU tests the status and control flags and updates them based on the results of executing the instruction

Functions of EU
- ADD AX, 16 ; meaning add 16 to AX, where AX is a register inside the CPU
- If AX is 20 then after the operation it becomes 36
- For the above operation, do we need to fetch an operand from memory?
- The 16 in the above operation is called an immediate. Immediate values are stored as part of an instruction and fetched together with the instruction
- Now consider ADD AX, X ; X is a variable. Do we need to fetch the operand X from memory?

Functions of EU If the instruction queue is empty, the EU waits for the next instruction byte to be fetched and shifted to the top of the queue. When the EU executes a branch or jump instruction, it transfers control to a location corresponding to another set of sequential instructions. Whenever this happens, the BIU automatically resets the queue and then begins to fetch instructions from this new location to refill the queue.

Jump and branch

Summary What is the pre-fetch concept? What is a pipeline and its advantage? What are functions performed by the BIU and EU What is a multiplexed address/data bus What is the segment concept

8086 Internal Registers
- Registers are a very important component because they are used as temporary storage, as well as for storing the current status of the CPU
- The contents of some registers indicate the memory locations to be fetched
- Registers are internal components that we can control with assembly language programming
- 4 groups of 16-bit registers:
  - Instruction Pointer (IP)
  - Data Registers (4)
  - Pointer and Index Registers (4)
  - Segment Registers (4)
- The Flag Register

Instruction Pointer (IP)
- Identifies the location of the next instruction to be executed in the current code segment
- IP contains an offset value, not the physical address of the next instruction
- Physical address = CS × 16 + IP (CS = code segment register)
- Every time an instruction word is fetched from memory, the BIU updates the value in IP (e.g. IP = IP + 1) such that it points to the next sequential instruction word in memory

Data Registers
- 4 general purpose data registers, used for temporary storage of frequently used intermediate results. This can improve the speed (why???)
- Each register can be used either as 8-bit or as 16-bit:
  - Accumulator Register (AX: AH, AL)
  - Base Register (BX: BH, BL)
  - Count Register (CX: CH, CL)
  - Data Register (DX: DH, DL)
- (figure: AX (16-bit) = AH (8-bit) + AL (8-bit))

Data Registers
- The general purpose data registers can be used for arithmetic or logic operations
- For example, to carry out an addition: add ax, bx. The result is stored in ax and is equal to the sum of the values in ax and bx (in C, it is similar to ax += bx)
- For string instructions, the CX register is used to store a count value representing the number of bytes to be moved
- All I/O operations require the data to be input or output to be in the accumulator (AL or AX), while register DX holds the address of the I/O port

Segment Registers
- The segment registers are used for accessing the memory
- The 8086 address space is segmented into 64 K-byte segments and just four segments can be active at a time. In theory, how many segments can we have?
- The segment registers are used to select the active segments
- Code Segment (CS) Register: CS identifies the starting address of the 64 K-byte segment known as the code segment. Code segments of memory contain the instructions of the program.
- Data Segment (DS) Register: DS identifies the starting location of the current data segment in memory. Data is stored in the data segment.

Segment Registers (Cont'd)
- Stack Segment (SS) Register: SS contains a logical address that identifies the starting location of the current stack segment in memory. The stack is used for temporary storage.
- Extra Segment (ES) Register: ES identifies the extra segment, usually used for data storage.
- The segment registers store the base address of a segment. To determine the physical address, an offset is required. The index registers are used to store the offset value.

Pointer and Index Registers
- Stack Pointer (SP): permits easy access to locations in the stack segment of memory
  - The value in SP represents the offset of the next stack location which can be accessed relative to the current address in the stack segment (SS) register, i.e. it always points to the top of the stack
- Base Pointer (BP): BP represents an offset from the SS register. However, it is used to access data within the stack segment
  - Used in the based addressing mode
- The applications of the various registers will be discussed in detail when we learn assembly language programming

Pointer and Index Registers
- Index registers are used to hold offset addresses for instructions that access data stored in the data segment of memory
- Source Index Register (SI): SI is used to store an offset address for a source operand under index addressing for string and memory operations
- Destination Index Register (DI): DI is used for storage of an offset that identifies the location of a destination operand; it is also used in some string operations
- Remarks: the offset value is always referenced to the value in the data segment (DS) register

Registers and pointers
Segment register   | Pointer
CS (code segment)  | IP (instruction pointer)
DS (data segment)  | DI, SI
SS (stack segment) | SP (stack pointer), BP (base pointer)
ES (extra segment) | DI

Flag Register The flag register is a 16 -bit register within the execution unit. The status flags in the register indicate conditions that are produced as the result of executing an arithmetic or logic instruction. What kind of conditions can you think of? ? ?

8086 Flag Register (status flags)
- C: Carry bit (set if there is a carry-out or borrow-in)
- P: Parity bit (set if the lower byte of the result contains an even number of 1s); odd parity
- Z: Zero bit (set if the result of an operation is equal to zero)
- S: Sign bit (indicates that a negative value was produced by an operation)
- O: Overflow bit (result is out of range), i.e. the result of a signed operation is too large to be accommodated in the destination register. When will an overflow occur? Give one example
- The above are the most commonly used flags; there are others, but they are not discussed in this subject

Example
- If our data is only 8-bit and we do FFH + 1H = 1 0000 0000, the result is a 9-bit value; the leading '1' is the carry!
- Similarly, when we do 00H - 1H the result is 1 1111 1111; the leading 1 is the borrow bit.
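
The carry, zero, sign and overflow conditions for 8-bit data can be sketched in Python as follows. The function names `add8` and `sub8` are made up for this note, both operands are assumed to be in the range 00H to FFH, and only the four flags C, Z, S and O are modelled.

```python
def add8(a: int, b: int):
    """8-bit addition: return the 8-bit result and the C, Z, S, O flags."""
    total = a + b
    result = total & 0xFF
    flags = {
        "C": total > 0xFF,              # a ninth bit was produced (carry out)
        "Z": result == 0,
        "S": bool(result & 0x80),       # bit 7 is the sign bit
        # Overflow: operands have the same sign but the result's sign differs.
        "O": bool(~(a ^ b) & (a ^ result) & 0x80),
    }
    return result, flags


def sub8(a: int, b: int):
    """8-bit subtraction a - b: return the 8-bit result and the C, Z, S, O flags."""
    result = (a - b) & 0xFF
    flags = {
        "C": a < b,                     # a borrow was needed
        "Z": result == 0,
        "S": bool(result & 0x80),
        # Overflow: operands have different signs and the result's sign differs from a.
        "O": bool((a ^ b) & (a ^ result) & 0x80),
    }
    return result, flags


print(add8(0xFF, 0x01))  # result 0 with C and Z set
print(sub8(0x00, 0x01))  # result 255 (FFH) with C and S set
```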

Flags
Consider using 8-bit values A and B; determine the flag status for C, S, Z and O:
- If A = 0FH, B = 1; A + B
- If A = 0, B = 1; A - B
- If A = 7FH, B = 1; A + B
- If A = 80, B = 0F; A - B
- If A = FFH, B = 1; A + B
- If A = 2FH, B = 60, C = -1; (A + C) - B

Block diagram for a simple computer system Display unit LCD memory CPU Get instruction from memory Perform operation Get next instruction I/O

Bus Cycle
- Bus: address and data
- A bus cycle is used to access memory, I/O devices, or the interrupt controller
- A bus cycle starts with an address being output on the system bus, followed by a read or write data transfer
- A series of control signals is produced to control the direction and timing of the bus
- A standard bus cycle consists of 4 clock periods
- Understanding system bus timing will help you to choose the proper memory device

Bus cycle
- T1: BIU puts an address on the bus
- T2: data are put on the bus (for a write cycle)
- T2: bus in high-Z mode (for a read cycle)
- T3: data on the bus
- T4: data on the bus
- For a 5 MHz system, how long does it take to complete 1 bus cycle?
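
The timing question is simple arithmetic: one clock period is 1/frequency, and a standard bus cycle lasts 4 clock periods. A minimal sketch follows; the function name and the optional `wait_states` parameter (extra clock periods added to the cycle, defaulting to none) are assumptions made for this note.

```python
def bus_cycle_time_ns(clock_hz: float, wait_states: int = 0) -> float:
    """Duration of one bus cycle in nanoseconds: 4 clock periods plus any extra ones."""
    clock_period_ns = 1e9 / clock_hz
    return (4 + wait_states) * clock_period_ns


print(bus_cycle_time_ns(5_000_000))  # 800.0 (4 periods of 200 ns each)
```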

Read cycle

Write cycle

Wait states
- Wait states can be inserted into a bus cycle
- In response to a request by an event in external hardware, the READY input of the 8086 is set LOW
- As long as READY is held low, wait states (Tw) are inserted between T3 and T4
- For a write cycle, data are maintained on the bus
- Purpose: to extend the duration of the bus cycle, so that slower memory devices can be used

Bus cycle

Example
Assembly Language | Machine Code
MOV AX, 0F802H    | B802F8
PUSH AX           | 50
MOV CX, BX        | 8BCB
MOV DX, CX        | 8BD1
ADD AX, [SI]      | 0304
ADD SI, 8086H     | 81C68680
JMP $-14          | EBF0
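
Two details of this machine code can be checked with a short Python sketch: 16-bit immediates are stored low byte first, and the byte after the EB opcode is a signed displacement added to the address of the instruction that follows the JMP. The names `PROGRAM`, `immediate16` and `short_jump_target` are invented for this note, and the program is assumed to start at offset 0.

```python
# The seven instructions of the example, concatenated in order (16 bytes).
PROGRAM = bytes.fromhex("B802F8" "50" "8BCB" "8BD1" "0304" "81C68680" "EBF0")


def immediate16(code: bytes, offset: int) -> int:
    """Read a 16-bit immediate stored low byte first (little-endian)."""
    return int.from_bytes(code[offset:offset + 2], "little")


def short_jump_target(code: bytes, jmp_offset: int) -> int:
    """Target offset of a short JMP (opcode EB followed by a signed 8-bit displacement)."""
    if code[jmp_offset] != 0xEB:
        raise ValueError("not a short JMP opcode")
    displacement = int.from_bytes(code[jmp_offset + 1:jmp_offset + 2], "little", signed=True)
    next_ip = jmp_offset + 2             # IP already points past the 2-byte JMP
    return (next_ip + displacement) & 0xFFFF


print(hex(immediate16(PROGRAM, 1)))      # 0xf802, the operand of MOV AX, 0F802H
print(short_jump_target(PROGRAM, 14))    # 0: the JMP at offset 14 goes back 14 bytes to the start
```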

Instruction sequence

Instruction sequence B 8 02 02 F 8 50

Instruction sequence

Pins definition
- AD15 – AD0: address/data bus
- A19/S6 – A16/S3: address/status
- BHE/S7: bus high enable
- MN/MX: min./max. mode control
- RD: read control (read data from memory or I/O)
- TEST: wait on test (input). If the TEST signal is HIGH then the processor will be in an idle state
- READY: (input) wait state control
- RESET: system reset (if kept HIGH for 4 clock cycles)
- NMI: non-maskable interrupt request
- INTR: interrupt request
- CLK: system clock

Pins definition
HOLD – hold request (used with DMA)
HLDA – hold acknowledge (entered the HOLD state)
WR – write control
M/IO – memory/IO control
DT/R – data transmit/receive (to enable the external data bus buffer)
DEN – data enable
ALE – address latch enable
INTA – interrupt acknowledge

Pins definition
RQ/GT1, RQ/GT0 – request/grant bus access control (used in max. mode to force the processor to release the local bus at the end of its current bus cycle)
LOCK – bus priority lock control (prevents other bus masters from gaining access to the system bus)
/S2 to /S0 – bus cycle status (these lines reflect the type of operation being carried out by the processor)
QS1, QS0 – instruction queue status (give information about the status of the code-prefetch queue)

Status lines – reflect activities of the BIU
/S2  /S1  /S0  Indication
0    0    0    Interrupt acknowledge
0    0    1    Read I/O port
0    1    0    Write I/O port
0    1    1    Halt
1    0    0    Code access
1    0    1    Read memory
1    1    0    Write memory
1    1    1    Passive
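
The status table is a plain 3-bit lookup, which can be sketched as follows. The dictionary and function names are illustrative; the bit order (/S2 as the most significant bit) follows the column order of the table.

```python
# Bus cycle type indexed by the 3-bit code formed from /S2, /S1, /S0.
BUS_STATUS = {
    0b000: "Interrupt acknowledge",
    0b001: "Read I/O port",
    0b010: "Write I/O port",
    0b011: "Halt",
    0b100: "Code access",
    0b101: "Read memory",
    0b110: "Write memory",
    0b111: "Passive",
}


def decode_status(s2, s1, s0):
    """Return the bus activity indicated by the logic levels on /S2, /S1, /S0."""
    return BUS_STATUS[(s2 << 2) | (s1 << 1) | s0]


if __name__ == "__main__":
    print(decode_status(1, 0, 1))
```

In maximum mode an external bus controller performs this kind of decoding in hardware to produce the memory and I/O command signals.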

Instruction Queue status
QS1  QS0  Indication
0    0    No operation
0    1    First byte of opcode from the queue
1    0    Empty queue
1    1    Subsequent byte from the queue

Instruction sequence 100 B8 02 10 02 F8 50

Status signals
S6 to S3 are output on the bus at the same time that data are transferred over the other bus lines.
S4 and S3 form a 2-bit binary code that identifies which of the 8086's internal segment registers was used to generate the physical address:
00 – extra
01 – stack
10 – code/none
11 – data
S5 – logic level of the interrupt enable flag
S6 – not used, always at 0

Control signals are provided to support memory and I/O interfaces
ALE – a 0 -> 1 transition signals external circuitry that a valid address word is on the bus
BHE – 0 enables the most significant byte (high byte) half of the data bus
M/IO – 1 represents a memory operation; 0 represents an I/O operation
DT/R – 1: bus in transmit mode; 0: bus in receive mode
RD – 0 indicates a read cycle, in which data are read from the bus
WR – 0 indicates a write cycle, in which valid write or output data are on the bus
DEN – signals external devices when they should put data on the bus

Operating modes of the 8086
The 8086 can run in two different modes: minimum and maximum.
In min. mode, the 8086 provides all the control signals needed to implement the memory and I/O interfaces.
In max. mode, it provides status signals for implementing a multiprocessor/coprocessor system environment.
In max. mode, a bus controller and a bus arbiter are included in the system. The controller derives the control signals from the status signals.

Max. mode application

Maximum mode
The basic function of the bus controller chip (8288) is to derive control signals such as /RD, /WR, /DEN, DT/R and ALE from the status lines.
/IORC, /IOWC – I/O read and write command signals. They enable an I/O interface to read or write data from or to the addressed port.
/MRDC, /MWTC – memory read and write command signals. They instruct memory to accept or send data from or to the bus.
/AIOWC, /AMWTC – advanced /IOWC and /MWTC. They serve the same purpose as /IOWC and /MWTC but are activated one clock cycle earlier.

Maximum mode circuit

Minimum mode circuit
Latch – store
Transceiver – transmit + receive

Read cycle: why are there two address waveforms? One is the output from the address latch; the other is the output from the multiplexed address/data bus.

Memory
To access the memory, both address and data must be available at the same time. On the 8086 the bus is multiplexed, so it is physically impossible to supply both pieces of information simultaneously. Address latches are therefore used to hold the address information while the bus is used to transfer data.
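
The role of the address latch can be illustrated with a toy model of a read cycle on a multiplexed bus. Everything here is a simplification for illustration: the class and function names are invented, the latch is modelled as transparent while ALE is high, and memory is a plain dictionary.

```python
class AddressLatch:
    """Toy transparent latch: follows its input while ALE is high, holds otherwise."""

    def __init__(self):
        self.output = 0

    def update(self, bus_value, ale):
        if ale:
            self.output = bus_value
        return self.output


def read_cycle(memory, address):
    """Simulate one read over a multiplexed address/data bus.

    memory is a dict mapping address to data (a stand-in for a memory chip).
    """
    latch = AddressLatch()

    # T1: the CPU drives the address onto the shared bus and pulses ALE.
    latch.update(address, ale=1)

    # T2: ALE is low and the CPU floats the bus (modelled as None);
    # the latch keeps presenting the address to memory.
    latch.update(None, ale=0)

    # T3: memory answers on the same bus lines, selected by the latched address.
    bus = memory[latch.output]
    latch.update(bus, ale=0)  # data on the bus does not disturb the latch

    assert latch.output == address  # the address stayed stable throughout
    return bus


if __name__ == "__main__":
    print(hex(read_cycle({0x1234: 0xABCD}, 0x1234)))
```

The memory sees a stable address from the latch output for the whole cycle even though the bus lines themselves switch from address to data.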

Modern microprocessor architecture
Based on your understanding of the 8086 microprocessor, which features of the 8086 would need to be modified to make it more powerful? What are the major differences between a modern microprocessor and an 8086?

Modern microprocessors
The following features are crucial to performance:
Operating speed (clock rate)
Memory (size and speed)
Data size
Floating-point processing
Overlapping of execution and memory access
Performing more tasks in a single cycle

Modern microprocessor: operating speed
The 8086 operates at 5 MHz.
The latest Intel microprocessors (i5, i7, i9) can operate in the 2.6 GHz range.
The operating frequency is governed by the fabrication technique (0.13 micron technology for the P4, while the 8086 is based on 3 micron technology). With submicron technology, more components can be put into the chip.
The higher the operating speed, the more heat is generated, so cooling the CPU becomes more important. No CPU cooling was required in the 486 era.

Modern microprocessor: memory
The 8086 has a 20-bit address bus, so the maximum memory size is only 1 MByte.
The Pentium II can access up to 64 GBytes of memory (what is the size of the address bus?).
The speed of memory chips has also increased in the last 20 years. RDRAM can operate at 600 MHz or 800 MHz.
What are the advantages of having more memory?
Using the 8086 as an example, a memory read/write must complete in 4 clock cycles, so if the memory is too slow the access cannot be completed within the standard bus cycle.
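
The relation between address bus width and memory size is a power of two in each direction, which answers the address-bus question for the 64 GByte case. A minimal sketch, with illustrative function names:

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses reachable with the given bus width."""
    return 1 << address_bits


def address_bits_needed(num_bytes):
    """Smallest address bus width that can reach num_bytes locations."""
    return (num_bytes - 1).bit_length()


if __name__ == "__main__":
    MB = 1 << 20
    GB = 1 << 30
    print(addressable_bytes(20) // MB, "MByte with 20 address lines")
    print(address_bits_needed(64 * GB), "address lines for 64 GBytes")
```

Twenty address lines reach 2^20 bytes = 1 MByte, and 64 GBytes = 2^36 bytes, which needs 36 address lines.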

Memory speed
To improve performance, internal high-speed memory is provided for the storage of data as well as instructions. This internal memory is called cache.
The cache inside the CPU is called Level 1 (L1) cache (at least 8 KB in the P4).
The cache located between the CPU and the external memory is called Level 2 (L2) cache (512 KB in the P4). Cache is faster than traditional memory.
In newer microprocessor designs, the L2 cache is also located inside the CPU, so an L3 cache is provided as well. The L3 cache is external, just like the L2 cache in the old days.

Cache
Accessing data or instructions from the cache is faster than accessing traditional memory, so if the entire program is already stored in the cache when it executes, overall performance improves.

Cache is fast
Access time for DRAM could be 60 ns, while for the cache it could be 15 ns.
To access cache memory, the CPU does not have to use the motherboard's system bus (also slow) for data transfer.
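
The benefit of the cache depends on how often the data are found there. A simple average-access-time estimate, using the 15 ns and 60 ns figures above, is sketched below. The hit ratio values are hypothetical, and the model assumes a miss costs just one DRAM access.

```python
def average_access_ns(hit_ratio, cache_ns=15.0, dram_ns=60.0):
    """Weighted average access time for a given cache hit ratio (0.0 to 1.0).

    Simplified model: a hit costs cache_ns, a miss costs dram_ns.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ns + (1.0 - hit_ratio) * dram_ns


if __name__ == "__main__":
    for ratio in (0.0, 0.5, 0.9, 1.0):  # hypothetical hit ratios
        print(ratio, average_access_ns(ratio), "ns")
```

With no hits the average is the DRAM figure of 60 ns; with every access hitting the cache it falls to 15 ns.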

Cache (figure): CPU, L1, L2, memory

Cache
The cache is usually divided into smaller units called blocks. In the 486, the 8 KB cache is divided into four 2 KB blocks. Each block is 16 bytes x 128 rows. Each 16-byte row is divided into four 4-byte lines, and none of the 4 lines can be accessed partially.
During operation, the microprocessor always checks the cache for data and instructions first. If the data are already stored in the cache, it is called a cache hit. If the data are not found, it is a cache miss.
Usually a mechanism called paging is employed to write frequently used data and instructions into the cache.

Modern microprocessor: data
The 8086 has a 16-bit data bus.
The P4 has a 64-bit data bus, so it can get more data in a single read/write cycle.
High-precision data can also be processed more effectively.
Consider this: what is the maximum value that can be represented by a 16-bit pattern and by a 32-bit pattern?
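
The question about maximum values follows from the number of bit patterns available. A short sketch, assuming plain unsigned binary and two's-complement signed interpretations (the function names are illustrative):

```python
def max_unsigned(bits):
    """Largest value of an unsigned binary pattern with the given width."""
    return (1 << bits) - 1


def max_signed(bits):
    """Largest value of a two's-complement pattern with the given width."""
    return (1 << (bits - 1)) - 1


if __name__ == "__main__":
    for width in (16, 32):
        print(width, max_unsigned(width), max_signed(width))
```

A 16-bit pattern reaches 65535 unsigned, while a 32-bit pattern reaches 4294967295.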

Overlapping of operations
In the 8086, the instruction pipeline enables instruction execution to overlap with instruction fetching.
The Pentium processor has a superscalar architecture and can execute multiple instructions concurrently:
Two integer pipelines, U and V, each with 5 stages
One floating-point execution unit

Overlapping of operations So it is possible to execute three instructions simultaneously (with proper programming the speedup can be up to 40%)

Pentium Processor
The Pentium processor issues 2 instructions in parallel to the 2 independent integer pipelines (U and V). This enhances the speed of integer arithmetic.
In the prefetch stage, the CPU fetches instructions from the instruction cache.
In the write-back stage, the CPU updates the registers' contents or the status in the flag register, depending on the execution result.
The Pentium processor needs only one clock cycle to decode an instruction, compared with 2 clock cycles on the 486.
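
The gain from pipelining and from a second pipe can be estimated with an idealised cycle count. This is only a sketch: it assumes every stage takes one clock, there are no stalls, and (for two pipes) every pair of instructions can be issued together, which real code rarely achieves.

```python
import math


def cycles_unpipelined(instructions, stages=5):
    """Each instruction passes through all stages before the next one starts."""
    return instructions * stages


def cycles_pipelined(instructions, stages=5, pipes=1):
    """Ideal pipeline: fill it once, then retire one group of instructions per clock."""
    if instructions == 0:
        return 0
    return stages + math.ceil(instructions / pipes) - 1


if __name__ == "__main__":
    n = 100
    print(cycles_unpipelined(n))         # no overlap at all
    print(cycles_pipelined(n))           # one 5-stage pipeline
    print(cycles_pipelined(n, pipes=2))  # two pipes, perfect pairing assumed
```

For 100 instructions the idealised counts are 500, 104 and 54 clocks, which shows why overlap matters more than shortening any single stage.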

Block diagram of Pentium processor Can you see the pipeline in the above?

Block diagram of U-V pipe for Pentium processor
How many stages does the pipeline of the 8086 have?

The 8086 instruction execution sequence
Basic operations performed by a CPU:
Fetch instruction
Decode and determine operand address
Fetch operand
Execute

Modern microprocessor: floating-point hardware
Do you know how a floating-point number is represented in a computer?
Floating-point hardware is an important design feature in modern processors.
No floating-point hardware is available in the 8086.
For Intel, the co-processor mechanism was used until the emergence of the 486 microprocessors.
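
One way to explore the representation question is to split a number into its sign, exponent and fraction fields. The sketch below assumes the IEEE 754 single-precision layout (1 sign bit, 8 exponent bits with a bias of 127, 23 fraction bits), which these notes do not name explicitly; the function name is illustrative.

```python
import struct


def float32_fields(value):
    """Return (sign, biased_exponent, fraction) of value as a 32-bit float."""
    (bits,) = struct.unpack(">I", struct.pack(">f", value))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF  # stored with a bias of 127
    fraction = bits & 0x7FFFFF      # 23 bits after the implied leading 1
    return sign, exponent, fraction


if __name__ == "__main__":
    for x in (1.0, -2.5, 0.15625):
        sign, exponent, fraction = float32_fields(x)
        print(x, sign, exponent - 127, hex(fraction))
```

For example, -2.5 is -1.25 x 2^1, so the sign bit is 1, the stored exponent is 128 and the fraction field holds 0.25.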

Perform more tasks in a single cycle
MMX (multimedia extension) is for improving 2-D and 3-D multimedia applications.
Using the MMX instruction set, eight 8-bit image elements can be loaded simultaneously and operated on concurrently.
MMX instructions make use of 64-bit registers (the mantissa portion of the floating-point registers).
MMX instructions are mainly integer operations.
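
The idea of operating on eight 8-bit elements at once can be imitated in software by treating a 64-bit integer as eight independent byte lanes. This sketch models a wraparound packed byte addition; it is not the hardware mechanism, and the function name is hypothetical.

```python
def packed_add_bytes(x, y):
    """Add two 64-bit values lane by lane as eight unsigned 8-bit elements.

    Each lane wraps around modulo 256 and never carries into its neighbour.
    """
    result = 0
    for lane in range(8):
        shift = 8 * lane
        a = (x >> shift) & 0xFF
        b = (y >> shift) & 0xFF
        result |= ((a + b) & 0xFF) << shift
    return result


if __name__ == "__main__":
    pixels = 0x0102030405060708
    brighten = 0x1010101010101010
    print(hex(packed_add_bytes(pixels, brighten)))
```

A real MMX instruction performs all eight lane additions in one operation, whereas the loop above needs eight iterations, which is the source of the speed advantage.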

MMX/SSE mechanism (figure): operands d1 to d4 and d5 to d8 are combined element by element by one operation to give results r1 to r4. A 128-bit register can store 4 floating-point values, so 4 results can be obtained in a single operation.

Points about the P4
More features to support parallel processing: hyper-threading technology, SSE and MMX instructions, deeper pipeline, dual-core
Still compatible with existing Intel architecture based software
Operates at very high frequency, up to 3 GHz
8 KB data cache and 12 K instruction cache
Integer ALUs operate at twice the core processor frequency

Self-test
How can a microprocessor be made more powerful?
What is the most important hardware feature embedded in an 8086 microprocessor?
What are the basic operations performed when a computer is running?
What is the maximum value represented by a 20-bit pattern?
What is the maximum value represented by the sum of two 16-bit patterns?
What is the major function of an ALU?
What are the major functions of a BIU?
Can you use a block diagram to represent a memory?
What is an instruction pipeline, and why can it improve the overall performance of a microprocessor?

Block diagram of Intel P4 CPU
Can you spot any features that come from the 8086?

Introduction to microcontroller
If you want to develop a simple robot, are you going to use a P5 microprocessor in your system? Usually a microcontroller is used instead of a microprocessor to implement a simple system. A microcontroller can be regarded as an all-in-one device, with a CPU, memory and input/output interfaces all included in a single package. Commonly used microcontrollers include the 8051 series, Basic Stamp, BasicX etc. In addition, a microcontroller usually costs less than a traditional microprocessor; for example, an 8051 is only HKD 25.

Some microcontrollers also come with additional features such as ADC (analog to digital converter), DAC (digital to analog converter) and PWM (pulse-width modulation) output. All those features are very useful for the implementation of basic robotic or control systems.

The ADuC832 device
The ADuC832 is a powerful microcontroller and is used in the experimental setup. Basic features of the device include:
8-channel 12-bit ADC
Two 12-bit DACs
62 Kbytes program memory, 4 kBytes data memory
2304 bytes on-chip data RAM
Dual PWM output (the PWM can be used to control an analog device)
8051 compatible instruction set
Four 8-bit input/output ports

Block diagram of ADuC832

Application using microcontroller
The components can be connected directly to the microcontroller via the I/O ports without other supporting devices.
Figure: sensor and motor connected to the ADuC832