
  • Number of slides: 40

Input / Output
CPS 104, Week 14, Lecture 1

Administrivia
• HW 5 Due
• HW 6 Assigned
  – Due last day of class
© Alvin R. Lebeck 1998, CPS 104

Overview
• I/O devices
  – device controller
• Rotational media (disks)
• Device drivers
• Memory Mapped I/O
• Programmed I/O
• Direct Memory Access (DMA)
• I/O bus, memory bus
• RAID (if time)

I/O Systems
[Diagram: Processor and Cache on the Memory Bus with Main Memory; an I/O Bridge joins the Memory Bus to the I/O Bus, which carries the Disk Controller (Disk), Graphics Controller (Graphics), and Network Interface (Network); devices signal the Processor with interrupts.]
Time(workload) = Time(CPU) + Time(I/O) - Time(Overlap)
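The workload-time formula on this slide can be tried out with a small sketch. The function name and the sample numbers are illustrative assumptions, not part of the lecture.

```python
def workload_time(cpu_s: float, io_s: float, overlap_s: float) -> float:
    """Time(workload) = Time(CPU) + Time(I/O) - Time(Overlap), all in seconds."""
    # The overlapped portion cannot be longer than the shorter of the two phases.
    if overlap_s < 0 or overlap_s > min(cpu_s, io_s):
        raise ValueError("overlap must lie between 0 and min(cpu_s, io_s)")
    return cpu_s + io_s - overlap_s


if __name__ == "__main__":
    # Hypothetical workload: 6 s of CPU work and 4 s of I/O.
    print(workload_time(6.0, 4.0, 0.0))  # no overlap
    print(workload_time(6.0, 4.0, 4.0))  # I/O fully hidden behind CPU work
```

With no overlap the two times simply add; with full overlap the I/O time disappears from the total, which is the motivation for overlapping I/O with computation.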

Why I/O?
• Interactive Apps
• Long term storage (files, data repository)
• Swap for VM
• Many different devices
  – character vs. block
  – Networks are everywhere!
• 10^6 difference: CPU (10^-9 s) vs. I/O (10^-3 s)
• Response Time vs. Throughput
  – Not always another process to execute
• OS hides (some) differences in devices
  – same (similar) interface to many devices
• Permits many apps to share one device

Device Drivers
• top-half
  – API (open, close, read, write, ioctl)
  – I/O Control (IOCTL, device specific arguments)
• bottom-half
  – interrupt handler
  – communicates with device
  – resumes process
• Must have access to user address space and device control registers => runs in kernel mode.

Review: Interrupts and Exceptions
• Unnatural change in control flow
• Interrupt is external event
  – devices: disk, network, keyboard, etc.
  – clock for timeslicing
  – these are useful events, must do something when they occur.
• Exception is often potential problem with program
  – segmentation fault
  – bus error
  – divide by 0
  – don't want my bug to crash the entire machine
  – page fault (virtual memory…)

Review: Handling an Interrupt/Exception
[Diagram: the User Program (ld, add, st, mul, beq, ld, sub, bne) is interrupted; control passes to the Interrupt Handler and its Service Routines, then returns via RETT.]
• Invoke specific kernel routine based on type of interrupt
  – interrupt/exception handler
• Must determine what caused interrupt
  – could use software to examine each device
  – PC = interrupt_handler
• Vectored Interrupts
  – PC = interrupt_table[i]
• Clear the interrupt
• kernel initializes table at boot time
• May return from interrupt (RETT) to different process (e.g., context switch)
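A minimal sketch of the vectored-interrupt idea (PC = interrupt_table[i]), written as a Python dispatch table. The handler names, vector numbers, and return strings are hypothetical; a real kernel stores handler addresses, not Python functions.

```python
from typing import Callable, Dict


def disk_handler() -> str:
    # Stand-in for a disk service routine.
    return "disk serviced"


def keyboard_handler() -> str:
    # Stand-in for a keyboard service routine.
    return "keyboard serviced"


# Built once, in the way the kernel initializes its table at boot time.
interrupt_table: Dict[int, Callable[[], str]] = {
    0: disk_handler,
    1: keyboard_handler,
}


def dispatch(vector: int) -> str:
    """Jump straight to the handler for this vector, with no device scan."""
    handler = interrupt_table[vector]
    return handler()


if __name__ == "__main__":
    print(dispatch(1))
```

The non-vectored alternative on the slide would instead enter one common handler and examine each device in software to find the cause.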

Types of Storage Devices
• Magnetic Disks
• Magnetic Tapes
• CD ROM
• Juke Box (automated tape library, robots)

Magnetic Disks
• Long term nonvolatile storage
• Another slower, less expensive level of memory hierarchy
[Diagram labels: Track, Sector, Arm, Cylinder, Head, Platter]

Disk Access
• Access time = queue + seek + rotational + transfer + overhead
• Seek time
  – move arm over track
  – average is confusing (startup, slowdown, locality of accesses)
• Rotational latency
  – wait for sector to rotate under head
  – average = 0.5 rotation / (3600 RPM / 60) = 8.3 ms
• Transfer Time
  – f(size, BW bytes/sec)

Disk Access Time Example
• Disk Parameters:
  – Transfer size is 8 KB
  – Advertised average seek is 12 ms
  – Disk spins at 7200 RPM
  – Transfer rate is 4 MB/sec
• Controller overhead is 2 ms
• Assume that disk is idle so no queuing delay
• What is Average Disk Access Time for a Sector?
  – Ave seek + ave rot delay + transfer time + controller overhead
  – 12 ms + 0.5/(7200 RPM/60) + 8 KB/(4 MB/s) + 2 ms
  – 12 + 4.2 + 2 + 2 ≈ 20 ms
• Advertised seek time assumes no locality: the actual seek is typically 1/4 to 1/3 of the advertised seek time, so 20 ms => 12 ms
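The worked example above can be checked with a short calculation. This is a sketch under two assumptions that the slide does not state: KB and MB are taken as decimal units (8 KB = 8000 bytes, 4 MB/s = 4,000,000 bytes/s), and the function and parameter names are made up for illustration.

```python
def disk_access_ms(seek_ms: float, rpm: float, transfer_bytes: float,
                   bandwidth_bytes_per_s: float, overhead_ms: float,
                   queue_ms: float = 0.0) -> float:
    """Access time = queue + seek + rotational + transfer + overhead (in ms)."""
    # Average rotational latency is half a rotation.
    rotational_ms = 0.5 / (rpm / 60.0) * 1000.0
    transfer_ms = transfer_bytes / bandwidth_bytes_per_s * 1000.0
    return queue_ms + seek_ms + rotational_ms + transfer_ms + overhead_ms


if __name__ == "__main__":
    # Slide parameters with the advertised 12 ms seek: about 20.2 ms.
    print(round(disk_access_ms(12, 7200, 8000, 4e6, 2), 1))
    # Same disk with a seek of 1/3 the advertised value (4 ms): about 12.2 ms.
    print(round(disk_access_ms(4, 7200, 8000, 4e6, 2), 1))
```

The second call reproduces the slide's closing remark: once locality cuts the seek to roughly a third, the total drops from about 20 ms to about 12 ms.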

DRAM as Disk
• Solid state disk, Expanded Storage, NVRAM
• Disk is slow, DRAM is fast => replace Disk with battery backed DRAM
• BUT, Disk is cheap, much cheaper than DRAM
• Network Memory
  – fast networks (e.g., Myrinet)
  – use DRAM of other workstations as backing store
  – Trapeze/GMS project here

Alternative Storage
• CD ROM
  – read only: good distribution, archiving
• Magnetic Tape
  – Sequential Access
  – R-DAT (Rotating Digital Audio Tape)
    » Helical Scan (angle to tape, high density ~5 GB)
  – Tera to peta bytes of storage (NASA EOS)

Connecting I/O Devices to CPU/Memory
• Memory Bus
  – Short
  – Fast
  – Known set of components
  – Proprietary (don't release design free)
• Separate I/O Bus (e.g., PCI)
  – Standard
  – Accept variety of components (w/ different BW performance)
  – Long
  – Slow

Processor Interface Issues
• Interconnections
  – Busses
• Processor interface
  – I/O Instructions
  – Memory mapped I/O
• I/O Control Structures
  – Device Controllers
  – Polling/Interrupts
• Data movement
  – Programmed I/O / DMA
• Capacity, Access Time, Bandwidth

Device Controllers
[Diagram: the Device Controller sits between the Bus and the Device; its registers are Command, Status (Interrupt?, Busy, Done, Error), and Data 0, Data 1, ..., Data n-1.]
• Controller deals with mundane control (e.g., position head, error detection/correction)
• Processor communicates with Controller

I/O Instructions
[Diagram 1: CPU and Memory on the memory bus; Controllers and Devices on an independent I/O bus.]
• Separate I/O instructions (in, out)
[Diagram 2: CPU, Memory, and Controllers/Devices on a common memory & I/O bus.]
• Lines distinguish between I/O and memory transfers
• VME bus, Multibus-II, Nubus: 40 Mbytes/sec optimistically
• 10 MIPS processor completely saturates the bus!

Memory Mapped I/O
• Single Memory & I/O Bus
• No Separate I/O Instructions
[Diagram 1: CPU, Memory, and Controllers/Devices on one bus; the physical address space holds ROM, RAM, and I/O.]
[Diagram 2: CPU with $ and L2 $ on the Memory Bus with Memory; a Bus Adapter (Bridge) leads to the I/O bus and the Device Controllers.]
• Issue command through store instruction
• Check status with load instruction
• Caches?

Communicating with the processor
• Polling
  – can waste time waiting for slow I/O device
  – busy wait
  – can interleave with useful work
• Interrupts
  – interrupt overhead
  – interrupt could happen anytime (asynchronous)
  – no busy wait

Data Movement
• Programmed I/O
  – processor has to touch all the data
  – too much processor overhead
    » for high bandwidth devices (disk, network)
• DMA
  – processor sets up transfer(s)
  – DMA controller transfers data
  – complicates memory system

Programmed I/O & Polling
[Diagram: CPU with $ and L2 $ on the Memory Bus with Memory; a Bus Adapter leads to the I/O bus and the Device Controller.]
[Flowchart: Is the data ready? If no, check again; if yes, load data, then store data. Done? If no, repeat; if yes, finish.]
• busy wait loop not an efficient way to use the CPU unless the device is very fast!
• but checks for I/O completion can be dispersed among computationally intensive code
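The polling flowchart can be sketched against a simulated device. Everything here is an assumption for illustration: SimulatedDevice, its ready bit that turns 1 after a fixed number of status reads, and the method names stand in for loads from real memory-mapped status and data registers.

```python
from typing import Tuple


class SimulatedDevice:
    """Hypothetical device: status reads return 0 until the data is ready."""

    def __init__(self, data: int, polls_until_ready: int) -> None:
        self._data = data
        self._remaining = polls_until_ready

    def read_status(self) -> int:
        # Returns the ready bit: 0 while busy, 1 once data is valid.
        if self._remaining > 0:
            self._remaining -= 1
            return 0
        return 1

    def read_data(self) -> int:
        return self._data


def polled_read(device: SimulatedDevice) -> Tuple[int, int]:
    """Busy-wait on the ready bit, then load the data.

    Returns the data and the number of wasted status checks.
    """
    wasted_polls = 0
    while device.read_status() == 0:  # "Is the data ready?" -> no
        wasted_polls += 1
    return device.read_data(), wasted_polls


if __name__ == "__main__":
    print(polled_read(SimulatedDevice(ord("q"), polls_until_ready=3)))
```

The wasted-poll count is the cost the slide warns about: every failed status check is CPU time spent doing no useful work.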

Interrupt Driven Data Transfer
[Diagram: the user program (add, sub, and, or, nop) runs on the CPU; (1) I/O interrupt from the Device Controller; (2) save PC; (3) interrupt service addr; (4) the interrupt service routine in memory (read, store, ..., rti) moves the data.]
• User program progress only halted during actual transfer
• Interrupt overhead can dominate transfer time.
• 1000 xfers of 1000 bytes each:
  – 2 µsec for interrupt
  – 98 µsec for service
• Device xfer rate: 10 MB/s => 0.1 µsec/byte => 0.1 ms for 1000 bytes

Direct Memory Access
• CPU sends a starting address, direction, and length count to DMAC. Then issues "start".
• Time to do 1000 x 1000 bytes:
  – 1 DMA set-up sequence @ 50 µsec
  – 1 interrupt @ 2 µsec
  – 1 interrupt service sequence @ 48 µsec
  – 0.0001 second of CPU time
[Diagram: CPU with $ and L2 $ on the Memory Bus with Memory; a Bus Adapter leads to the I/O bus and the DMA CNTRL; the address map runs from 0 to n and holds ROM, RAM, and Peripherals (memory mapped I/O).]
• DMAC provides handshake signals for device controller, and memory addresses and handshake signals for memory.
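The CPU-time figures for interrupt-driven transfer (previous slide) and DMA (this slide) can be compared with a few lines of arithmetic. This sketch assumes the interrupt-driven case pays the 2 µsec + 98 µsec cost once per 1000-byte transfer, and that the DMA case covers all 1000 transfers with one set-up and one completion interrupt; the function names are illustrative.

```python
def interrupt_driven_cpu_us(transfers: int, interrupt_us: int = 2,
                            service_us: int = 98) -> int:
    """CPU time when every transfer raises and services its own interrupt."""
    return transfers * (interrupt_us + service_us)


def dma_cpu_us(setup_us: int = 50, interrupt_us: int = 2,
               service_us: int = 48) -> int:
    """CPU time for one DMA set-up plus one completion interrupt."""
    return setup_us + interrupt_us + service_us


if __name__ == "__main__":
    per_interrupt = interrupt_driven_cpu_us(1000)
    with_dma = dma_cpu_us()
    print(per_interrupt / 1_000_000, "s of CPU time, interrupt driven")
    print(with_dma / 1_000_000, "s of CPU time, DMA")
    print(per_interrupt // with_dma, "x less CPU time with DMA")
```

Under these assumptions the DMA path costs 100 µsec of CPU time, matching the 0.0001 second on the slide, against 100,000 µsec for the interrupt-per-transfer path.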

I/O Data Flow
• Impediment to high performance: multiple copies, complex hierarchy

Communication Networks
• Performance limiter is memory system, OS overhead, not HW protocols
• Send/receive queues in processor memories
• Network controller copies back and forth via DMA
• No host intervention needed
• Interrupt host when message sent or received

Relationship to Processor Architecture
• Virtual memory frustrates DMA
  – page faults during DMA?
• Synchronization between controller and CPU
• Caches required for processor performance cause problems for I/O
  – Flushing is expensive, I/O pollutes cache
  – Solution is borrowed from shared memory multiprocessors: "snooping" (coherent DMA)
• Caches and write buffers
  – need uncached and write buffer flush for memory mapped I/O

Bus Arbitration
• Parallel (Centralized) Arbitration
  [Diagram: each master (M) has its own Bus Request (BR) and Bus Grant (BG) line.]
• Serial Arbitration (daisy chaining)
  [Diagram: the A.U. issues BG, which passes through each master from BGi to BGo; the masters share BR.]
• Self Selection
• Collision Detection
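Serial (daisy-chain) arbitration can be sketched as a grant signal walking down the chain until a requesting master absorbs it. The list-of-booleans representation and the function name are assumptions made for this illustration, not something shown on the slide.

```python
from typing import List, Optional


def daisy_chain_grant(requests: List[bool]) -> Optional[int]:
    """Return the chain position of the master that receives the bus.

    requests[i] is True if master i is asserting Bus Request; position 0 is
    closest to the arbiter. The grant enters at position 0 (BGi) and is
    forwarded (BGo) only by masters that are not requesting.
    """
    for position, requesting in enumerate(requests):
        if requesting:
            return position  # this master keeps the grant
    return None  # nobody requested the bus


if __name__ == "__main__":
    print(daisy_chain_grant([False, True, True]))
```

Position in the chain acts as a fixed priority: a master further from the arbiter gets the bus only when every master ahead of it is idle.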

Bus Options
Option | High performance | Low cost
Bus width | Separate address & data lines | Multiplex address & data lines
Data width | Wider is faster (e.g., 32 bits) | Narrower is cheaper (e.g., 8 bits)
Transfer size | Multiple words has less bus overhead | Single-word transfer is simpler
Bus masters | Multiple (requires arbitration) | Single master (no arbitration)
Split transaction? | Yes: separate Request and Reply packets gets higher bandwidth (needs multiple masters) | No: continuous connection is cheaper and has lower latency
Clocking | Synchronous | Asynchronous

Asynchronous Handshake: Write Transaction
[Timing diagram: Address, Data, Read, Req, and Ack signals across t0 to t5; the master asserts the address and data; 4 Cycle Handshake.]
• t0: Master has obtained control and asserts address, direction, data; waits a specified amount of time for slaves to decode target
• t1: Master asserts request line
• t2: Slave asserts ack, indicating data received
• t3: Master releases req
• t4: Slave releases ack
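The write handshake steps t0 to t4 can be traced with a small simulation. The dictionary-based bus, the function name, and the (step, req, ack) trace format are assumptions for illustration; real hardware does this with signal levels, not function calls.

```python
from typing import Dict, List, Tuple


def write_handshake(address: int, data: int,
                    memory: Dict[int, int]) -> List[Tuple[str, int, int]]:
    """Perform one write and return a trace of (step, req, ack) levels."""
    trace: List[Tuple[str, int, int]] = []

    # t0: master drives address and data; req and ack are both low.
    bus = {"address": address, "data": data, "req": 0, "ack": 0}
    trace.append(("t0", bus["req"], bus["ack"]))

    # t1: master asserts the request line.
    bus["req"] = 1
    trace.append(("t1", bus["req"], bus["ack"]))

    # t2: slave latches the data, then asserts ack (data received).
    memory[bus["address"]] = bus["data"]
    bus["ack"] = 1
    trace.append(("t2", bus["req"], bus["ack"]))

    # t3: master sees ack and releases req.
    bus["req"] = 0
    trace.append(("t3", bus["req"], bus["ack"]))

    # t4: slave sees req released and releases ack.
    bus["ack"] = 0
    trace.append(("t4", bus["req"], bus["ack"]))
    return trace


if __name__ == "__main__":
    mem: Dict[int, int] = {}
    for step in write_handshake(0x10, 0xAB, mem):
        print(step)
    print(mem)
```

Each side changes its signal only after observing the other side's previous change, which is why no shared clock is needed.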

Read Transaction
[Timing diagram: Address, Data, Read, Req, and Ack signals across t0 to t5; the master asserts the address; 4 Cycle Handshake.]
• t0: Master has obtained control and asserts address, direction, data; waits a specified amount of time for slaves to decode target
• t1: Master asserts request line
• t2: Slave asserts ack, indicating ready to transmit data
• t3: Master releases req, data received
• t4: Slave releases ack
• Time Multiplexed Bus: address and data share lines

Manufacturing Advantages of Disk Arrays
• Disk Product Families
  – Conventional: 4 disk designs (3.5", 5.25", 10", 14"), from Low End to High End
  – Disk Array: 1 disk design (3.5")

Redundant Arrays of Disks
• Files are "striped" across multiple spindles
• Redundancy yields high data availability
  – Disks will fail
  – Contents reconstructed from data redundantly stored in the array
    » Capacity penalty to store it
    » Bandwidth penalty to update
• Techniques:
  – Mirroring/Shadowing (high capacity cost)
  – Horizontal Hamming Codes (overkill)
  – Parity & Reed-Solomon Codes
  – Failure Prediction (no capacity overhead!)
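Of the techniques listed, parity is the simplest to sketch: one extra block holds the XOR of the data blocks, and any single lost block is the XOR of everything that survives. The function names and the three tiny sample blocks are illustrative assumptions; the slide does not specify a particular layout.

```python
from functools import reduce
from typing import List, Sequence


def parity_block(blocks: Sequence[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(surviving_blocks: Sequence[bytes], parity: bytes) -> bytes:
    """Rebuild the one missing data block from the survivors and the parity."""
    return parity_block(list(surviving_blocks) + [parity])


if __name__ == "__main__":
    # Hypothetical stripe across three data disks.
    data: List[bytes] = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = parity_block(data)
    # Pretend disk 1 failed; recover its block from disks 0 and 2 plus parity.
    recovered = reconstruct([data[0], data[2]], parity)
    print(recovered == data[1])
```

This shows both penalties from the slide: the parity block takes capacity, and every data update must also rewrite the parity.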

Summary
• I/O devices
  – device controller
• Rotational media (disks)
• Device drivers (two parts)
  – help isolate specifics of device
• Memory Mapped I/O
• Programmed I/O
• Direct Memory Access (DMA)
• I/O bus, memory bus
• RAID

Homework 6

Interrupt Handler
• MIPS/SPIM program
• Use memory-mapped I/O
• Use interrupts
• Program should:
  – Accept keyboard input
    » interrupts
  – Echo input to terminal
    » polling
  – Exit if user typed 'q'
• Programmed I/O?

Terminal Control
• Memory mapped I/O
  – use LW, SW
• -mapped_io command line option
• Receiver (input)
  – ready=1 when data valid
• Transmitter
  – ready=1 when ready to print next char

Interrupt Driven I/O
• Set Interrupt Enable = 1
  – generates a level 0 interrupt when Ready becomes 1
  – if interrupt is enabled in Status Register also
[Diagram: Receiver control (0xffff0000): Unused | Interrupt enable (1 bit) | Ready (1 bit)]
• Run spim with -notrap
  – allows you to install interrupt handler

Status Register
• Bit 0 = interrupt enable
• Bit 8 = allow level 0 interrupts
  – terminal input generates level 0 int.
• Coprocessor 0, register 12
  – use mfc0, mtc0
• On interrupt, bits 0-5 are shifted left by 2
  – disables interrupts and enters kernel mode
• When done servicing interrupt, use rfe to restore
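The shift-left-by-2 behavior of the low six Status bits, and the restore done by rfe, can be modeled with plain integer arithmetic. This is a sketch of the register value only, not SPIM code; it assumes the MIPS R2000 layout in which bit 1 is the kernel/user bit (1 = user), and the function names are made up.

```python
LOW6 = 0x3F  # bits 0-5: three (kernel/user, interrupt enable) pairs


def enter_exception(status: int) -> int:
    """On interrupt: shift bits 0-5 left by 2, zero-filling bits 1 and 0.

    The new current pair is 00, so interrupts are disabled and the
    processor is in kernel mode.
    """
    shifted = ((status & LOW6) << 2) & LOW6
    return (status & ~LOW6) | shifted


def rfe(status: int) -> int:
    """Restore: shift bits 0-5 right by 2; bits 5-4 keep their value."""
    low = status & LOW6
    restored = (low & 0x30) | (low >> 2)
    return (status & ~LOW6) | restored


if __name__ == "__main__":
    # Bit 8 set (level 0 interrupts allowed), user mode, interrupts enabled.
    user_status = 0x103
    in_handler = enter_exception(user_status)
    print(hex(in_handler))       # low two bits are now 0
    print(hex(rfe(in_handler)))  # back to the original value
```

The handler never has to save the interrupted mode itself; the previous pair carries it until rfe shifts it back.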

Cause Register
• Code 0000 = external interrupt
  – terminal interrupt
[Diagram: bits 15-10 = Pending interrupts; bits 5-2 = Exception code]
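A handler typically starts by pulling the two fields out of the Cause register. The sketch below uses the bit positions from the slide's diagram (pending interrupts in bits 15-10, exception code in bits 5-2); the function name and sample values are assumptions, and it does not claim which pending bit corresponds to which interrupt level.

```python
from typing import Tuple


def decode_cause(cause: int) -> Tuple[int, int]:
    """Return (exception_code, pending_interrupts) from a Cause value."""
    exception_code = (cause >> 2) & 0xF  # bits 5-2
    pending = (cause >> 10) & 0x3F       # bits 15-10
    return exception_code, pending


if __name__ == "__main__":
    # Hypothetical value: exception code 0000 (external interrupt) with the
    # lowest pending-interrupt bit set.
    code, pending = decode_cause(0x0400)
    print(code, bin(pending))
```

An exception code of 0 tells the handler to look at the pending field and the device registers; any other code means the exception came from the program itself.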