An Introduction to IA-32 Processor Architecture
Eddie Lopez
CSCI 6303
Oct 6, 2008

Overview
• Microcomputer Design
• Intel IA-32 Family Tree
• Operating Environment
• Input / Output
• The Future

Microcomputer Design
• What is IA-32?
  • Intel Architecture, 32-bit
  • Also known as x86 or i386
  • Intel 80386 chip released in 1985
  • First Intel 32-bit chip
  • Backward compatibility preserved
  • Replaced the 16-bit architecture of the 8086, 80186 and 80286

Microcomputer Design
• Other manufacturers also produced IA-32 compatible processors
  • AMD, Cyrix, VIA

Microcomputer Design
The Central Processing Unit (CPU)

Microcomputer Design
Motherboard

Microcomputer Design
CPU Heat Sinks

Microcomputer Design
• The Central Processing Unit contains:
  • Control Unit
  • Arithmetic Logic Unit (ALU)
  • High-frequency clock
  • Registers

Microcomputer Design

Microcomputer Design
• IA-32 Instruction Execution Pipeline:
  • Bus Interface Unit – accesses memory
  • Code Prefetch Unit – instruction queue
  • Instruction Decode Unit – translates to microcode
  • Execution Unit – executes microcode
  • Segment Unit – translates logical addresses to linear addresses
  • Paging Unit – translates linear addresses to physical addresses

Microcomputer Design
• Instruction Execution Cycle
  • Fetch – gets the instruction from memory
  • Decode – translates it into microcode
  • Fetch input – gets data from memory
  • Execute – ALU performs the instruction
  • Store output – stores data back into memory
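
The execution cycle above can be sketched as a toy interpreter. This is only an illustration: the three-instruction machine (`LOAD`, `ADD`, `STORE`), the `run` function and the dictionary-based memory are invented for this sketch and do not correspond to real IA-32 encodings.

```python
# Toy machine: instruction names and encoding are hypothetical, not IA-32.
def run(program, memory):
    """Execute a list of (opcode, operand) tuples against a dict-based memory."""
    acc = 0  # accumulator register
    ip = 0   # instruction pointer
    while ip < len(program):
        opcode, operand = program[ip]   # fetch the next instruction
        ip += 1
        if opcode == "LOAD":            # decode, then fetch input from memory
            acc = memory[operand]
        elif opcode == "ADD":           # execute in the "ALU", wrapping at 32 bits
            acc = (acc + memory[operand]) & 0xFFFFFFFF
        elif opcode == "STORE":         # store output back into memory
            memory[operand] = acc
        else:
            raise ValueError(f"unknown opcode {opcode!r}")
    return memory


memory = {"a": 2, "b": 3, "sum": 0}
run([("LOAD", "a"), ("ADD", "b"), ("STORE", "sum")], memory)
print(memory["sum"])  # 5
```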

Questions?

IA-32 Architecture
• Microcomputer Design
• Intel IA-32 Family Tree
• Operating Environment
• Input / Output
• The Future

IA-32 Family Tree
• 8086 (1978)
  • Segmented memory
  • 20-bit addressing
  • 1 MB limit

IA-32 Family Tree
• 80286 (1982)
  • Protected Mode
  • Privilege Rings
    • Ring 0 – Kernel
    • Ring 1 – OS / Device Drivers
    • Ring 2 – Device Drivers
    • Ring 3 – Applications

IA-32 Family Tree
• 80386 (1985)
  • Intel's first 32-bit processor
  • Flat memory model
  • 32-bit addressing
  • 4 GB limit
  • Paging

IA-32 Family Tree
• 80486 (1989)
  • Level 1 cache (8 KB)
  • On-board FPU (Floating Point Unit)
  • 5-stage pipeline

IA-32 Family Tree
• Pentium (1993)
  • Superscalar (u and v pipelines)
  • Separate code and data caches (8 KB each)
  • Branch prediction

IA-32 Family Tree
Branch Prediction Model
  Loop 100 times
    Do something
  Next loop
  Next instruction
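
The slide does not say which prediction scheme is meant, so the sketch below assumes a textbook 2-bit saturating counter applied to the loop's closing branch. The function name `simulate_loop_branch` and the starting state are choices made for this illustration, not details from the deck.

```python
def simulate_loop_branch(iterations, state=2):
    """Count mispredictions of a 2-bit saturating counter on a loop's closing branch.

    States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    The branch is taken on every iteration except the last one.
    """
    mispredictions = 0
    for i in range(iterations):
        taken = i < iterations - 1
        predicted_taken = state >= 2
        if predicted_taken != taken:
            mispredictions += 1
        # Move the counter toward the actual outcome, saturating at 0 and 3.
        state = min(state + 1, 3) if taken else max(state - 1, 0)
    return mispredictions


print(simulate_loop_branch(100))           # 1
print(simulate_loop_branch(100, state=0))  # 3
```

Under these assumptions a 100-iteration loop costs a single misprediction at loop exit once the counter is warmed up, which is why predicting "taken" pays off for loops.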

IA-32 Family Tree
• Pentium Pro (1995)
  • 3 instruction pipelines
  • Out-of-order execution
  • 36-bit address bus can address 64 GB of memory
  • 256 KB Level 2 cache
  • No MMX yet: the MMX instruction set arrived in 1997 with the Pentium MMX and Pentium II
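
A quick check of the memory limits quoted across the family tree (20-bit 8086, 32-bit 80386, 36-bit Pentium Pro). `addressable_bytes` is a helper name invented for this sketch.

```python
def addressable_bytes(address_bits):
    """Return how many bytes can be addressed with the given number of address bits."""
    return 1 << address_bits


MB = 1024 ** 2
GB = 1024 ** 3
print(addressable_bytes(20) // MB, "MB")  # 1 MB
print(addressable_bytes(32) // GB, "GB")  # 4 GB
print(addressable_bytes(36) // GB, "GB")  # 64 GB
```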

IA-32 Family Tree
• Pentium II (1997)
  • Level 1 cache increased to 16 KB each
  • Level 2 cache: 256 KB, 512 KB, 1 MB
  • Celeron: 128 KB (value market)

IA-32 Family Tree
• Pentium III (1999)
  • SSE instruction set (XMM registers)
• Pentium 4 (2000)
  • SSE2 instruction set
  • NetBurst micro-architecture
  • Hyper-Threading

IA-32 Family Tree
NetBurst Micro-Architecture
• ALU runs at 2x speed
• Dynamic Execution
• Out-of-Order

IA-32 Family Tree
Core Micro-Architecture
• 4 Pipelines (14 stages)
• 3 ALU Units
• 4 Instruction Decoders
• Macrofusion

IA-32 Family Tree
Core Micro-Architecture (Intel Conroe)

Questions?

Overview
• Operating Environment
  • Operating Modes
  • Registers
  • Memory Management
  • Instruction Format

Operating Modes
• Real Mode
• Protected Mode
• System Management Mode
• Virtual 8086 Mode

Operating Modes
• Real Mode
  • Operating mode of the 8086
  • 20-bit addressing: 1 MB of memory
  • No memory protection or multitasking
  • Modern chips start up in real mode for backward compatibility

Operating Modes
• Protected Mode
  • Introduced in the Intel 80286 chip (24-bit addressing: 16 MB)
  • 32-bit addressing since the 80386: 4 GB of memory
  • Flat memory model
  • Uses privilege rings (0-3) to regulate applications

Operating Modes
• Protection Rings

Operating Modes
• Virtual 8086 Mode
  • Allows "real mode" programs to run under the supervision of a protected mode operating system
  • Lets operating systems run virtual DOS machines for legacy software

Operating Modes
• System Management Mode
  • Provides the OS with power management and system security functions

Registers
• What is a register?
  • Storage space on the CPU
  • Used for fast storage and processing
  • Each of the general registers has a special name and a specific use

Registers

Registers
• Floating Point registers (80-bit)
  • ST0 – ST7 (part of the Floating Point Unit)
• MMX registers (64-bit)
  • MM0 – MM7
• SIMD registers (128-bit)
  • XMM0 – XMM7
• Control Registers (32-bit)
  • CR0 – CR4

Registers
• Test Registers
  • TR4 – TR7
• Descriptor Table Registers
  • GDTR, LDTR, IDTR
• Task Register
  • TR
• Control Registers (32-bit)
  • CR0 – CR4

Registers
• MMX
  • Multi-Media Extensions
  • Introduced on the Pentium MMX (1997), not the Pentium Pro
  • Used for graphics and multimedia
• SSE
  • Streaming SIMD Extensions
  • Introduced on the Pentium III
  • One instruction can be applied to multiple data
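
"One instruction applied to multiple data" can be illustrated in plain Python by treating a 64-bit integer as four packed 16-bit lanes, in the spirit of the MMX packed-add-words operation (PADDW). The function name `packed_add16` is invented for this sketch, and the loop here does serially what the hardware does in one step.

```python
def packed_add16(a, b):
    """Add four 16-bit lanes packed into two 64-bit integers.

    Each lane wraps around on its own; no carry crosses into the next lane.
    """
    result = 0
    for lane in range(4):
        shift = lane * 16
        lane_sum = ((a >> shift) + (b >> shift)) & 0xFFFF  # keep only this lane's 16 bits
        result |= lane_sum << shift
    return result


print(hex(packed_add16(0x0001_0002_0003_FFFF, 0x0001_0001_0001_0001)))
# 0x2000300040000
```

The lowest lane wraps from FFFF to 0000 without disturbing its neighbour, which is the difference between a packed add and an ordinary 64-bit add.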

Registers
• 6 Segment Registers (16-bit) contain address pointers to segments of the currently running process
  • CS – Code Segment
  • DS, ES, FS, GS – Data Segments
  • SS – Stack Segment
• 1 Instruction Pointer (32-bit)
  • Contains the memory address of the next instruction

Registers
• Compatibility with previous architecture
  • To allow backward compatibility, registers EAX, EBX, ECX, and EDX can be addressed as subsets
  • Example using the EAX register:
    • EAX – all 32 bits
    • AX – low 16 bits of EAX
    • AH – high byte of AX (bits 8-15)
    • AL – low byte of AX (bits 0-7)
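
The overlapping views of EAX amount to simple masks and shifts. A minimal sketch follows; `split_eax` is a name made up for this example.

```python
def split_eax(eax):
    """Return the AX, AH and AL views of a 32-bit EAX value."""
    eax &= 0xFFFFFFFF
    ax = eax & 0xFFFF      # low 16 bits
    ah = (ax >> 8) & 0xFF  # bits 8-15
    al = ax & 0xFF         # bits 0-7
    return {"EAX": eax, "AX": ax, "AH": ah, "AL": al}


views = split_eax(0x12345678)
print({name: hex(value) for name, value in views.items()})
# {'EAX': '0x12345678', 'AX': '0x5678', 'AH': '0x56', 'AL': '0x78'}
```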

Registers
• Roles for Generic Registers
  • EAX – Accumulator
  • EBX – Base Addressing
  • ECX – Counter
  • EDX – Data Operand
  • EDI – Destination Address
  • ESI – Source Address
  • ESP – Stack Pointer
  • EBP – Stack Base Pointer

Registers
• EFLAGS register
  • Carry Flag (CF) – unsigned carry
  • Overflow Flag (OF) – signed overflow
  • Sign Flag (SF) – negative arithmetic results
  • Zero Flag (ZF) – zero arithmetic results
  • Auxiliary Carry Flag (AF) – carry out of the low 4 bits
  • Parity Flag (PF) – whether the low byte of a result has an even number of 1 bits
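
To make the flag definitions concrete, here is a sketch that models how an 8-bit addition would set them. `add8_flags` is a hypothetical helper written for this illustration; it models the documented flag rules rather than any particular chip.

```python
def add8_flags(a, b):
    """Add two 8-bit values and return (result, flags) following the EFLAGS rules."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    flags = {
        "CF": total > 0xFF,                               # unsigned carry out of bit 7
        "OF": ((a ^ result) & (b ^ result) & 0x80) != 0,  # operands share a sign, result differs
        "SF": (result & 0x80) != 0,                       # top bit of the result
        "ZF": result == 0,                                # result is zero
        "AF": ((a & 0x0F) + (b & 0x0F)) > 0x0F,           # carry out of bit 3
        "PF": bin(result).count("1") % 2 == 0,            # even number of 1 bits
    }
    return result, flags


print(add8_flags(0x7F, 0x01))  # signed overflow: 127 + 1 does not fit in a signed byte
print(add8_flags(0xFF, 0x01))  # unsigned carry: 255 + 1 wraps to 0
```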

Instruction Set
• IA-32 Architecture uses CISC
  • CISC – Complex Instruction Set Computer
  • Large number of complex instructions
  • Easier for compilers and programmers
  • But places a strain on the decoder
  • Backward compatibility is a burden
• RISC
  • Reduced Instruction Set Computer
  • Atomic instructions
  • Easy to decode and run quickly

Instruction Format
• Instructions of varying length
  • Design decisions from the 8086 have placed a burden on the modern architecture
  • One instruction can vary from 1 byte to 15 bytes (the architectural limit)

Instruction Format
• The instruction format
  • Prefix (0-4 bytes)
  • Opcode (1-3 bytes)
  • R/M Modifier (0-1 byte)
  • SIB Modifier (0-1 byte)
  • Displacement Modifier (0-4 bytes)
  • Data elements (0-4 bytes)

Instruction Format
• Prefix (0-4 bytes)
  • Alerts the CPU that address or operand sizes are about to change
• Opcode (1-3 bytes)
  • The operation to execute. Common operations have a one-byte opcode; less frequently used ones take up to three bytes
• R/M Modifier (0-1 byte)
  • Specifies the addressing mode – register or memory

Instruction Format
• Scale / Index / Base (0-1 byte)
  • Indicates whether the register serves as an index or a base and gives the scale factor
• Displacement Modifier (0-4 bytes)
  • Provides an additional data offset
• Data elements (0-4 bytes)
  • Immediate data (values and addresses)
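
The R/M and SIB bytes are packed bit fields (2 + 3 + 3 bits each). The sketch below unpacks them; the function names are invented here, and the sample bytes 0x44 and 0x8B are taken from what should be the encoding of `mov eax, [ebx+ecx*4+8]` (8B 44 8B 08), offered as an illustrative assumption rather than something from the slides.

```python
def decode_modrm(byte):
    """Split a ModR/M byte into its mod (2 bits), reg (3 bits) and r/m (3 bits) fields."""
    return {"mod": (byte >> 6) & 0b11, "reg": (byte >> 3) & 0b111, "rm": byte & 0b111}


def decode_sib(byte):
    """Split a SIB byte into scale factor, index register number and base register number."""
    return {
        "scale": 1 << ((byte >> 6) & 0b11),  # 2-bit field encodes a factor of 1, 2, 4 or 8
        "index": (byte >> 3) & 0b111,
        "base": byte & 0b111,
    }


print(decode_modrm(0x44))  # {'mod': 1, 'reg': 0, 'rm': 4}
print(decode_sib(0x8B))    # {'scale': 4, 'index': 1, 'base': 3}
```

In the sample, r/m = 4 signals that a SIB byte follows, and mod = 1 that a one-byte displacement comes after it.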

Instruction Sets
• Types of instructions in the set:
  • Moving data between memory and registers
  • Exchanging data
  • Integer arithmetic
  • Flow control
  • Procedure call and return
  • Manipulating the stack
  • Character string operations

Memory Management
• Real Mode
  • 20-bit addressing: 1 MB of memory
  • Addresses: 00000 to FFFFF
  • Memory is logically divided into 64 KB segments
  • Segment registers store the segment
  • CPU converts a segment:offset value to its linear equivalent
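
The segment:offset conversion is a shift and an add: the 16-bit segment is multiplied by 16 and the 16-bit offset is added. A minimal sketch, with `real_mode_linear` as a made-up name and 8086-style wraparound at 1 MB assumed:

```python
def real_mode_linear(segment, offset):
    """Return the 20-bit linear address for a 16-bit segment:offset pair."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Shift the segment left by 4 bits (multiply by 16), add the offset,
    # and wrap to 20 bits as on the original 8086 address bus.
    return ((segment << 4) + offset) & 0xFFFFF


print(hex(real_mode_linear(0x08F1, 0x0100)))  # 0x9010
```

Because segments start every 16 bytes but span 64 KB, many different segment:offset pairs map to the same linear address.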

Memory
• Reading From Memory
  • Fetching operands from RAM is slow
  • Bus Interface Unit polls RAM for data and waits
  • The CPU goes into a wait state
  • Requires many clock cycles, depending on the speed of the RAM
  • Level-1 cache is much faster – keeps data near the CPU
  • Registers are the fastest

Memory
• Reading From Memory
  • Processor places the address on the address bus
  • Processor asserts the memory read control signal
  • Processor waits for memory to place the data on the data bus
  • Processor reads the data from the data bus
  • Processor drops the memory read signal

Memory Management
• Protected Mode
  • 32-bit addressing: 4 GB of memory
  • Addresses: 00000000 to FFFFFFFF
  • Each process "sees" the full 4 GB
  • Segment registers store indexes into a global descriptor table
  • Multiple processes running simultaneously
  • Prevents processes from corrupting each other's data
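
In protected mode the 16-bit value in a segment register is a selector: a table index plus a few control bits. The sketch below unpacks one; `decode_selector` is a name invented for this example, and the field layout (13-bit index, table-indicator bit, 2-bit requested privilege level) is the standard IA-32 one rather than something stated on the slide.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into table index, table indicator and RPL."""
    selector &= 0xFFFF
    return {
        "index": selector >> 3,                         # entry number in the descriptor table
        "table": "LDT" if selector & 0b100 else "GDT",  # table indicator bit
        "rpl": selector & 0b11,                         # requested privilege level (ring 0-3)
    }


print(decode_selector(0x08))  # {'index': 1, 'table': 'GDT', 'rpl': 0}
print(decode_selector(0x23))  # {'index': 4, 'table': 'GDT', 'rpl': 3}
```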

Memory Management
• Paging
  • Segments are divided into 4 KB blocks (pages)
  • Virtual Memory Manager
  • Blocks are sent to the page file on the hard disk when they are not in use
  • Switching between applications in a low-memory condition requires a delay
  • The more memory, the less paging is required
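
With 4 KB pages, the paging unit reads a 32-bit linear address as three fields. The sketch assumes the classic two-level IA-32 layout (10-bit page directory index, 10-bit page table index, 12-bit offset), which the slide does not spell out; `split_linear_address` is a hypothetical helper.

```python
PAGE_SIZE = 4096  # 4 KB pages


def split_linear_address(linear):
    """Split a 32-bit linear address into directory index, table index and page offset."""
    linear &= 0xFFFFFFFF
    directory = linear >> 22           # top 10 bits select a page directory entry
    table = (linear >> 12) & 0x3FF     # middle 10 bits select a page table entry
    offset = linear & (PAGE_SIZE - 1)  # low 12 bits locate the byte inside the page
    return directory, table, offset


print(split_linear_address(0x00403004))  # (1, 3, 4)
```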

Program Execution
• What happens when a program runs?
  • User clicks on a program icon
  • Operating System (OS) searches for the program
  • OS loads the program into available memory
    • What happens if memory is full?
  • OS allocates blocks of memory and adjusts pointers in the code to point to the data
  • OS branches to the first executable instruction
  • At this point, it becomes a process
  • Memory is released after the program ends

Program Execution
• Multi-tasking
  • OS can run multiple processes
  • Only one process runs at any given time
  • Processes run in a time slice
  • CPU must support task switching
    • Task switching requires that all registers and the program counter be stored when switching to another process
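
Time slicing with save and restore can be sketched as a round-robin loop. Everything here is illustrative: `round_robin`, the task dictionaries and the single `saved_counter` field stand in for a real scheduler and a full register set.

```python
from collections import deque


def round_robin(tasks, time_slice):
    """Run (name, work_units) tasks in turn and return the order in which they ran."""
    ready = deque({"name": n, "remaining": w, "saved_counter": 0} for n, w in tasks)
    order = []
    while ready:
        task = ready.popleft()
        counter = task["saved_counter"]   # restore the state saved at the last switch
        work = min(time_slice, task["remaining"])
        counter += work                   # "execute" for at most one time slice
        task["remaining"] -= work
        task["saved_counter"] = counter   # save state before switching away
        order.append(task["name"])
        if task["remaining"] > 0:
            ready.append(task)            # not finished: back of the queue
    return order


print(round_robin([("A", 3), ("B", 1)], time_slice=2))  # ['A', 'B', 'A']
```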

Questions?

IA-32 Architecture
• Microcomputer Design
• Intel IA-32 Family Tree
• Operating Environment
• Input / Output
• The Future

Input / Output
• Input
  • Keyboard, Mouse, Network Card, etc.
• Output
  • Monitor, Printer, etc.

Input / Output
• There are 4 access levels of I/O interaction
  • Level 3 – High-level programming language
  • Level 2 – Operating System API
  • Level 1 – BIOS
  • Level 0 – Direct hardware interaction
• The lower the access level, the faster the result, but what is the trade-off?
• The operating system may reserve direct access to hardware

Input / Output
• Input/Output is interrupt driven
• What happens when you press a key on the keyboard?
  • Keyboard sends a signal to the CPU that a key was struck
  • CPU stops and handles the keyboard's request
  • CPU puts the keystroke into a buffer and returns to the previous process
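
The keystroke path can be modelled as a handler that only queues the key, leaving the program to collect it later. `KeyboardBuffer`, its capacity and the sample scancodes are assumptions made for this sketch, not details of any real BIOS or driver.

```python
from collections import deque


class KeyboardBuffer:
    """Toy model of an interrupt-filled keyboard buffer."""

    def __init__(self, capacity=16):
        self.keys = deque()
        self.capacity = capacity

    def on_interrupt(self, scancode):
        """Interrupt handler: store the keystroke, then return to the interrupted work."""
        if len(self.keys) < self.capacity:
            self.keys.append(scancode)
            return True
        return False  # buffer full, keystroke dropped

    def read_key(self):
        """Called later by the running program; returns None if no key is waiting."""
        return self.keys.popleft() if self.keys else None


buffer = KeyboardBuffer(capacity=2)
buffer.on_interrupt(0x1E)
buffer.on_interrupt(0x30)
print(buffer.on_interrupt(0x2E))  # False, the buffer is already full
print(hex(buffer.read_key()))     # 0x1e, keys come out in the order they were struck
```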

The Future
• Intel 64
• Shrinking Cores
  • 45 nm core (Intel Penryn)
  • 32 nm (Intel 2009)
• Multiple Cores
  • Xeon 7400 Hexcore (9/16/08)
• IA-32 phase-out

Questions?
The End…