
  • Number of slides: 40

Chapter 8: Memory Management
• A program may not be executed until it is:
  – associated with a process
  – brought into memory
• To allow multiprogramming, the OS must be able to allocate memory to each process
  – Several processes at once
  – Requires a “Memory Management” scheme and appropriate hardware support
  – Security?
• The memory management scheme has a large impact upon how a program for a particular platform must be designed and compiled
  – How much memory is available?
  – How should we bind addresses?
CEG 433/633 - Operating Systems I, Dr. T. Doom

Address Binding
• Instruction and data addresses in program source code are symbolic:
  – goto errjmp;
  – X = A + B;
• These symbolic addresses must be bound to addresses in physical memory before the code can be executed
• Address binding: a mapping from one address space to another
• The address binding can take place at compile time, load time, or execution time.
• Compile-time Binding: the compiler generates absolute code
  – memory location must be known a priori
  – must recompile to move code
  – MS-DOS .COM format programs

Load-time Binding
• Most modern compilers generate relocatable object code
  – symbolic addresses are bound to relocatable addresses
  – e.g., “286 bytes from the beginning of the module doom.C.o”
• The linkage editor (linker) combines the multiple modules into a relocatable executable
• The load module (loader) places the program in memory
  – The loader performs the final binding of relocatable addresses to absolute addresses
• Load-time Binding: Bind relocatable code to an address on load
  – Must generate relocatable code
  – Memory location need not be known at compile time
  – If the starting address must change, we must “reload” the code

Execution-time Binding
• A logical (or virtual) address space may be bound to a separate physical address space
  – Provides an abstraction of physical memory
  – Logical (virtual) address – generated by the CPU
  – Physical address – address seen by the memory unit
• The user program deals with logical addresses; it never sees the “real” physical addresses
• Memory-Management Unit (MMU): Hardware device that translates CPU-generated logical addresses into physical memory addresses
• Execution-time Binding: Binding delayed until run time
  – process can be moved during its execution from one memory segment to another
  – logical and physical addresses differ (requires mapping)
  – requires hardware and OS support for address mapping

Memory-Management Unit (MMU)
• Hardware device that maps virtual to physical addresses.
• Logical and physical addresses are the same in compile-time and load-time address-binding schemes; logical (virtual) and physical addresses differ in the execution-time address-binding scheme.
• The user program deals with logical addresses; it never sees the real physical addresses.
• In the most basic MMU scheme, all logical addresses begin at 0, and the base register is replaced by a relocation register
  – The value in the relocation register is added to every logical address generated by a user process at the time it is sent to memory, producing the physical address
  – To move the program, simply change the value in the register
  – The limit register remains unchanged
• Thus, each logical address is bound to a physical address
  – Is security maintained?
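The relocation-register scheme can be sketched in a few lines. This is a minimal illustration, not a model of any particular MMU: the function name `translate`, the `AddressError` exception, and the register values are assumptions made for the example.

```python
class AddressError(Exception):
    """Raised when a logical address falls outside the process's limit."""


def translate(logical: int, relocation: int, limit: int) -> int:
    """Map a logical address to a physical one, as a basic MMU would."""
    # Every logical address must lie in [0, limit).
    if not 0 <= logical < limit:
        raise AddressError(f"logical address {logical} outside limit {limit}")
    # The relocation register is added on the way to memory.
    return relocation + logical


# Hypothetical process loaded at physical address 14000, 3000 bytes long.
print(translate(346, relocation=14000, limit=3000))  # 14346
# Moving the process changes only the relocation register, not the limit.
print(translate(346, relocation=20000, limit=3000))  # 20346
```

The limit check is what answers the security question on the slide: an address outside the process's own range never reaches memory.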

Can we reduce memory requirements?
• Loading: Placing the program in memory
• Dynamic Loading: Routine is not loaded until it is called
  – Program must check and load before calling
  – If a needed routine is not available in memory, the relocatable linker/loader loads the routine and updates the program’s address tables
• Better memory-space utilization; unused routine is never loaded
  – Size of executable is unchanged
  – Runtime footprint is smaller
  – Useful when large amounts of code are needed to handle infrequently occurring cases.
• No special support from the operating system is required
  – Implemented through program design

Can we reduce executable size?
• Linking: combining object modules into an executable
• Most OSes require static linking
  – All library routines become part of the executable
• Modern OSes often allow dynamic linking
  – Linking postponed until execution time
• Instead of placing the code for each library routine in the executable, include only a stub (a small piece of code) which:
  ▪ locates the appropriate memory-resident library routine
  ▪ replaces itself with the address of the routine, and executes the routine
• Executable footprint is reduced
  – program will not run without its libraries
  – New (minor) versions of the library do not require recompilation
• Some operating systems provide support for sharing the memory associated with library modules between processes (shared libs)
  – Very efficient! No read() required, less overall memory usage

What if there isn’t enough memory?
• How can we execute an executable whose code footprint is larger than the memory available?
  – This was a major problem in the 1960s and 1970s for general purpose computers and remains a major problem
    ▪ Consider memory usage in an e-mail pager or ISDN box
• Solution: Keep in memory only those instructions and data that are needed at any given time; overlay during run-time
  – Overwrite this memory with a new set of instructions and data when we get to a significantly different part of the code
  – Each set of instructions/data is an overlay
  – Programming design of overlay structure is non-trivial
• No special support needed from operating system
  – Implemented by user design
• Modern general purpose OSes use virtual memory to deal with this problem

How does the OS allocate memory?
• Contiguous Allocation Scheme: All memory granted to a process must be contiguous
• Single-partition contiguous allocation
  – Only one “partition” exists in memory for user processes
    ▪ Only one user process is granted memory at a time
    ▪ The resident operating system must also be held in memory
    ▪ OS size changes as “transient” code is loaded
  – Place OS in low memory, use relocation-register to define the beginning of the user partition
    ▪ Relocation-register protects the OS code and data
    ▪ Allows relocation of user code if OS requirements change
  – Relocation register contains the value of the smallest physical address; limit register contains the range of logical addresses
  – Each logical address must be less than the limit register
  – To change context, must swap out main memory to a backing store

Swapping
• A process can be suspended and swapped temporarily out of memory to a backing store, and then brought back into memory for continued execution
• Backing store – usually a fast disk large enough to accommodate copies of all memory images for all users; must provide direct access to these memory images
  – swap may be from memory (conventional) to memory (extended)
• Roll out, roll in – swapping variant used for priority-based scheduling algorithms (or round-robin with a huge quantum); lower-priority process is swapped out so higher-priority process can be loaded and executed.
• Major part of swap time is transfer time; total transfer time is directly proportional to the amount of memory swapped.
• Requires execution-time binding if a process can be restored to a different memory space than it occupied previously
• OS management of I/O buffers is required to swap a process awaiting I/O
• Modified versions of swapping are found on many systems, e.g., UNIX and Microsoft Windows

[Figure: Swapping in Single Partition Scheme]

Contiguous Allocation (Cont.)
• For multi-processing systems it is far more efficient to allow several user processes to allocate memory
  – The OS must keep track of the size and owner of each partition
  – The OS must determine how and where to allocate new requests
• Multiple-partition contiguous allocation
  – Fixed-partition: Memory is pre-partitioned; the OS must assign each process to the best free partition
    ▪ Hard limit to the number of processes in memory
    ▪ Efficient?
[Figure: memory divided into fixed partitions: OS, 100 K, 500 K, 200 K]

Contiguous Allocation (Cont.)
• Multiple-partition contiguous allocation
  – Dynamic allocation: Memory is partitioned by the OS “on the fly”
    ▪ Operating system maintains information about: a) allocated partitions b) free partitions (holes)
    ▪ Hole: block of available memory; holes of various sizes are scattered throughout memory.
    ▪ When a process arrives, it is allocated memory from a hole large enough to accommodate it
[Figure: successive memory layouts showing the OS and processes 5, 8, 9, 10 and 2]

Dynamic Storage-Allocation Problem
• How do we satisfy a request of size n from a list of free holes? Optimization metrics include speed and storage utilization.
  – First-fit: Allocate the first hole that is big enough. Search begins at top of list. Fast search.
  – Next-fit: Allocate the first hole that is big enough. Search begins at the end of the last search. Fast search.
  – Best-fit: Allocate the smallest hole that is big enough; must search entire list, unless ordered by size. Produces the smallest leftover hole.
  – Worst-fit: Allocate the largest hole; must also search entire list, unless ordered by size. Produces the largest leftover hole.
• Simulation shows that:
  – First-fit is better (in terms of storage utilization) than worst-fit
  – First-fit is as good (in terms of storage utilization) as best-fit
  – First-fit is faster than best-fit
  – Next-fit is generally better than first-fit
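The four placement policies differ only in which qualifying hole they pick. The sketch below assumes the free list is a plain Python list of (start, size) pairs in address order; the hole sizes and the request size are made-up values for illustration.

```python
def first_fit(holes, n):
    """Index of the first hole with size >= n, or None."""
    for i, (_, size) in enumerate(holes):
        if size >= n:
            return i
    return None


def next_fit(holes, n, start):
    """Like first-fit, but the scan begins at index `start` and wraps."""
    for k in range(len(holes)):
        i = (start + k) % len(holes)
        if holes[i][1] >= n:
            return i
    return None


def best_fit(holes, n):
    """Index of the smallest hole that is big enough, or None."""
    fits = [(size, i) for i, (_, size) in enumerate(holes) if size >= n]
    return min(fits)[1] if fits else None


def worst_fit(holes, n):
    """Index of the largest hole (earliest on ties), or None."""
    fits = [(size, i) for i, (_, size) in enumerate(holes) if size >= n]
    return max(fits, key=lambda f: (f[0], -f[1]))[1] if fits else None


# Hypothetical free list: (start address, size), both in KB.
holes = [(0, 100), (200, 500), (800, 200), (1100, 300), (1500, 600)]
print(first_fit(holes, 212))          # 1  (the 500 KB hole)
print(next_fit(holes, 212, start=2))  # 3  (the 300 KB hole)
print(best_fit(holes, 212))           # 3  (leftover of 88 KB)
print(worst_fit(holes, 212))          # 4  (leftover of 388 KB)
```

Best-fit and worst-fit build the full candidate list, which mirrors the slide's point that both must search the entire list unless it is ordered by size.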

Fragmentation
• How do we measure storage utilization?
  – How much space is wasted?
• Internal fragmentation – allocated memory may be slightly larger than requested memory; this size difference is memory internal to a partition, but not being used
  – Problem in fixed-partition allocation
• External fragmentation – total memory space exists to satisfy a request, but it is not contiguous.
  – Problem in dynamic allocation
  – 50% rule: Simulations show that for every n allocated blocks, another n/2 blocks are lost to fragmentation, so n/2 out of 3n/2 blocks, or 1/3 of memory, is wasted
  – External fragmentation can be reduced by compaction
    ▪ Shuffle memory contents to place all free memory together in one large block
    ▪ Compaction is possible only if relocation is dynamic and done at execution time, and if the OS provides I/O buffers so that devices don’t DMA into relocated memory
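Compaction can be pictured as sliding every allocated partition toward low memory so that the holes merge into one. The sketch below only recomputes start addresses; it assumes partitions are (owner, start, size) tuples and does not model the actual copying of memory or the update of relocation registers. All names and sizes are hypothetical.

```python
def compact(allocated, memory_size):
    """Return the relocated partitions and the single remaining hole."""
    next_free = 0
    moved = []
    # Keep address order so each partition only ever moves downward.
    for owner, _start, size in sorted(allocated, key=lambda b: b[1]):
        moved.append((owner, next_free, size))
        next_free += size
    hole = (next_free, memory_size - next_free)
    return moved, hole


# Hypothetical 2560 KB memory with a 600 KB gap between P5 and P2.
allocated = [("OS", 0, 400), ("P5", 400, 500), ("P2", 1500, 300)]
moved, hole = compact(allocated, 2560)
print(moved)  # [('OS', 0, 400), ('P5', 400, 500), ('P2', 900, 300)]
print(hole)   # (1200, 1360)
```

Only P2 changes address here, which is why compaction requires execution-time binding: its relocation register would have to be updated from 1500 to 900.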

Non-Contiguous Memory Allocation
• Goal: Reduce memory loss to external fragmentation without incurring the overhead of compaction
• Solution: Abandon the requirement that allocated memory be contiguous.
• Non-contiguous memory allocation approaches include:
  – Paging: Allow the logical address space of a process to be noncontiguous in physical memory. This complicates the binding (MMU) but allows the process to be allocated physical memory wherever it is available.
  – Segmentation: Allow the segmentation of a process into many logically connected components. Each begins at its own (local) virtual address 0.
    ▪ This allows many other useful features, including protection permissions on a per-segment basis, etc.
    ▪ Example segmentation: Text, Data, Stack.
  – Segmentation with Paging: Hybrid approach

Paging
• Physical memory is broken up into fixed-size partitions called frames
• Logical memory is broken up into frame-size partitions called pages
• The OS keeps track of all free frames
  – Frame size = Page size (power of 2, usually 512 bytes to 8 KB)
  – To run a program of size n pages, need to find n free frames and load program
  – Internal fragmentation (average of 50% of one page per process)
• Logical addresses must be mapped to physical addresses
  – Set up a page table to note which frame holds each page
• Logical address generated by the CPU is divided into:
  – Page number (p) – used as an index into a page table which contains the base address of each page in physical memory
  – Page offset (d) – combined with the base address to define the physical memory address that is sent to the memory unit

[Figure: Paging Example]

Implementing Paging
• Paging is transparent to the process (memory is still viewed as contiguous)
• Divide an m-bit logical address for a system with pages of size $2^n$ into:
  – n-bit page offset (d)
  – (m-n)-bit page number (p)
  – Layout of the m-bit logical address: | page number p (m - n bits) | page offset d (n bits) |
• The page number p is an index to the page table, which stores the location of the frame
• Frames and pages are the same size, thus the displacement within a page is also the displacement within the frame
• Mapping is:
  – Physical address = page-table(p) + d, where page-table(p) is the base address of the frame holding page p
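The split-and-lookup step can be sketched with shifts and masks. This is an illustrative model only: the page table is assumed to be a Python list of frame numbers indexed by page number, and the tiny 4-byte page size and table contents are made-up values chosen to keep the arithmetic visible.

```python
def paged_translate(logical: int, page_table, offset_bits: int) -> int:
    """Translate a logical address using a single-level page table."""
    p = logical >> offset_bits               # page number: high m-n bits
    d = logical & ((1 << offset_bits) - 1)   # page offset: low n bits
    frame = page_table[p]                    # frame number holding page p
    # Frame base address is frame * 2^n; the offset carries over unchanged.
    return (frame << offset_bits) | d


# Hypothetical system: 4-byte pages (n = 2), pages 0..3 in frames 5, 6, 1, 2.
page_table = [5, 6, 1, 2]
print(paged_translate(0, page_table, 2))   # 20  (page 0, offset 0)
print(paged_translate(3, page_table, 2))   # 23  (page 0, offset 3)
print(paged_translate(13, page_table, 2))  # 9   (page 3, offset 1)
```

Because the frame base is a multiple of the page size, adding the offset and OR-ing it in give the same result.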

[Figure: Address Translation Architecture]

Page Size
• How large should a page be?
  – Smaller pages reduce internal fragmentation
  – Larger pages reduce the number of page table entries
• If s is the average process size, p is the page size (in bytes) and e is the number of bytes per page table entry, then:
  – $s/p$: number of pages per process
  – $se/p$: size of the page table per process
  – $p/2$: memory lost to internal fragmentation
  – Overhead $= se/p + p/2$
  – Minimize: $\frac{d}{dp}(\text{overhead}) = -se/p^2 + 1/2 = 0$, so $p = \sqrt{2se}$
• For current process sizes, and available physical memory, optimal page sizes range between 512 bytes and 8 KB
• Page table must be kept in main memory.
  – Why? If a page is 4 KB (12 bits) and the CPU uses a 32-bit address then there are $2^{20}$ possible pages per process
  – Number of bits per entry depends upon the size of physical memory
  – The memory consumed by this table is overhead/waste
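The overhead formula is easy to check numerically. The sketch below evaluates it for a hypothetical average process size of 1 MiB and 8-byte page table entries; both figures are assumptions picked so that the optimum lands on a power of two.

```python
import math


def overhead(s: float, e: float, p: float) -> float:
    """Per-process overhead in bytes: page table plus internal fragmentation."""
    return s * e / p + p / 2


def optimal_page_size(s: float, e: float) -> float:
    """Page size that minimizes the overhead: p = sqrt(2 * s * e)."""
    return math.sqrt(2 * s * e)


s = 2 ** 20  # assumed average process size: 1 MiB
e = 8        # assumed bytes per page table entry
print(optimal_page_size(s, e))  # 4096.0
for p in (2048, 4096, 8192):
    print(p, overhead(s, e, p))  # 5120.0, 4096.0, 5120.0
```

At the optimum the two terms are equal (2048 bytes each), which is the usual signature of minimizing a sum of the form a/p + bp.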

Implementation of Page Table
• The page table must be kept in main memory
  – Page-table base register (PTBR) points to the page table
    ▪ add PTBR + page number (p) to get the lookup address
  – Page-table length register (PTLR) indicates the size of the table
    ▪ Only make the page table as large as necessary
    ▪ Addresses in unallocated pages cause an exception
• For each CPU memory access there are two physical accesses
  – access the page table (in memory) to retrieve the frame
  – access the data/instruction
• The inefficiency of this two-memory-access solution can be reduced by the use of a special fast-lookup hardware cache for the page table
  – associative registers or translation look-aside buffers (TLBs)
• Hit Ratio: The percentage of lookups for which the necessary data is present in the cache
  – otherwise, get the data from the page table in main memory

Effective Access Time
• Effective Access Time (EAT) is a weighted average
  – $t_{TLB}$: time required for a TLB lookup
  – $t_{mem}$: time required for an access to main memory
  – $\alpha$: hit ratio
  – $EAT = \alpha (t_{TLB} + t_{mem}) + (1 - \alpha)(t_{TLB} + 2 t_{mem})$, since a TLB miss costs one extra memory access to read the page table
• Even for fairly small TLBs, hit ratios of .98 to .99 are common
  – Most programs refer to memory very sequentially and locally
  – The 32-entry TLB in the 486 generally has a .98 hit ratio
• Thus, we can implement paging without suffering a significant latency cost
• Try it with a TLB search of 20 ns, memory access of 100 ns, and hit ratios of .80 and .98
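The "try it" exercise can be worked as follows. The sketch assumes single-level paging, so a TLB miss costs one page-table access in memory plus the data access; the function name is an invention for this example.

```python
def effective_access_time(hit_ratio: float, t_tlb: float, t_mem: float) -> float:
    """Weighted average of the TLB-hit and TLB-miss access times (in ns)."""
    hit_time = t_tlb + t_mem        # TLB lookup, then the data access
    miss_time = t_tlb + 2 * t_mem   # TLB lookup, page table access, data access
    return hit_ratio * hit_time + (1 - hit_ratio) * miss_time


for alpha in (0.80, 0.98):
    eat = effective_access_time(alpha, t_tlb=20, t_mem=100)
    print(f"hit ratio {alpha:.2f}: EAT = {eat:.1f} ns")
# hit ratio 0.80: EAT = 140.0 ns
# hit ratio 0.98: EAT = 122.0 ns
```

Against a raw memory access of 100 ns, that is a 40 percent slowdown at a hit ratio of .80 and 22 percent at .98.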

Memory Protection
• Protection bits are included for each entry in the page table:
  – Valid-invalid bit indicates if the associated page is in the process’ logical address space, and is thus a legal page
    ▪ Machines which have a PTLR can avoid the “wasted” page table entries necessary to house the invalid bit.
  – RO/RW/X bits indicate if the page should be considered read-only, read-write and/or executable
  – Protection exceptions are calculated in parallel with the physical address (after the page table lookup)
• Page tables allow processes to share memory by having their page tables point to the same frame
  – Note: Processes cannot reference physical memory unless the OS allows it via page table setup
• The OS keeps a frame-table (one entry per frame) which indicates if each frame is full or empty, to which process the frame is allocated, when it was last referenced, etc.
  – Memory protection is implemented by associating protection bits with each frame

Shared Pages
• Private code and data
  – Each process keeps a separate copy of the code and data
• Shared code
  – To be sharable, code must be reentrant (or “pure”)
    ▪ All non-self-modifying code is pure: it never changes during execution (i.e., read-only code)
    ▪ Each process has its own copy of registers and data storage to hold the data for its execution
  – One copy of reentrant code can be shared among processes (e.g., text editors, compilers, window systems)
  – Problem: Shared code must appear at the same location in the logical address space of each process
    ▪ internal branch and memory addresses must be consistent

[Figure: Shared Pages Example]

Two-Level Paging
• Consider a page table for a 32-bit logical address space on a machine with a 32-bit physical address space and 4 K pages
  – logical space/page size = $2^{32} / 2^{12} = 2^{20}$ entries
  – physical space/frame size = $2^{32} / 2^{12} = 2^{20}$ frames, so 20 bits/entry + ~12 protection bits ~= 4 Bytes/entry
  – Page table size = $2^{20}$ entries * 4 Bytes/entry = 4 MB
  – 4 MB >> 4 K: The page table itself is larger than one page!
  – We can’t allocate the page table in contiguous memory
• We must page the page table! The page number is divided into:
  – How many 4-Byte entries per 4 K page? $2^{12} / 2^{2} = 2^{10}$
    ▪ a 10-bit page offset
  – How many bits remain? 20 - 10 = 10
    ▪ a 10-bit page number
• Thus, a logical address is divided as | p1 (10 bits) | p2 (10 bits) | d (12 bits) |, where p1 is an index into the outer page table, and p2 is the displacement within the page of the outer page table
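The 10/10/12 split can be sketched with shifts and masks. This is a minimal model that assumes the outer table and each inner page of the page table are Python dicts keyed by index (so unmapped entries simply do not exist); the address and frame number are hypothetical.

```python
OFFSET_BITS = 12  # 4 K pages
INDEX_BITS = 10   # 2^10 four-byte entries fit in one 4 K page


def split(addr: int):
    """Split a 32-bit logical address into (p1, p2, d)."""
    d = addr & ((1 << OFFSET_BITS) - 1)
    p2 = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    p1 = (addr >> (OFFSET_BITS + INDEX_BITS)) & ((1 << INDEX_BITS) - 1)
    return p1, p2, d


def two_level_translate(addr: int, outer_table) -> int:
    """Walk the outer table, then the inner page of the page table."""
    p1, p2, d = split(addr)
    inner_table = outer_table[p1]  # first memory access
    frame = inner_table[p2]        # second memory access
    return (frame << OFFSET_BITS) | d


# Hypothetical mapping: outer entry 1, inner entry 3 -> frame 0x2A.
outer = {1: {3: 0x2A}}
print(split(0x00403004))                            # (1, 3, 4)
print(hex(two_level_translate(0x00403004, outer)))  # 0x2a004
```

Only the inner tables that are actually populated need to exist, which is what lets the 4 MB table be broken into page-sized pieces.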

[Figure: Two-Level Page-Table Scheme]

Multilevel Paging Performance
• The concept can be extended to any number of page-table levels
• Since each level is stored as a separate table in memory, converting a logical address to a physical one may take many memory accesses
• Even though the time needed for one memory access is increased, caching (via TLB) permits performance to remain reasonable
• Example: In a system with a two-level paging scheme, a memory access time of 100 ns, and a 20 ns TLB with a hit rate of 98 percent, a TLB miss costs two page-table accesses plus the data access:
  effective access time = $0.98 \times (20 + 100) + 0.02 \times (20 + 300) = 124$ ns,
  which is only a 24 percent slowdown in memory access time.
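To see how the cost grows with depth, the calculation can be parameterized by the number of page-table levels. The sketch assumes each level costs exactly one memory access on a TLB miss and that the hit ratio stays at 98 percent regardless of depth; both are simplifications.

```python
def multilevel_eat(hit_ratio: float, t_tlb: float, t_mem: float, levels: int) -> float:
    """Effective access time (ns) with `levels` page-table levels."""
    hit_time = t_tlb + t_mem
    # On a miss: one memory access per level, then the data access.
    miss_time = t_tlb + (levels + 1) * t_mem
    return hit_ratio * hit_time + (1 - hit_ratio) * miss_time


for levels in (1, 2, 3, 4):
    eat = multilevel_eat(0.98, t_tlb=20, t_mem=100, levels=levels)
    print(f"{levels} level(s): {eat:.1f} ns")
# 1 level(s): 122.0 ns
# 2 level(s): 124.0 ns
# 3 level(s): 126.0 ns
# 4 level(s): 128.0 ns
```

Each extra level adds only (1 - 0.98) x 100 = 2 ns on average, which is why deep page tables remain practical when the TLB hit ratio is high.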

Inverted Page Table
• Problem: Each process requires its own page table, which consists of many entries (possibly millions). How can we reduce this overhead?
• Solution: The number of frames is fixed (and shared between the processes). Store the process/page information by frame!
  – One entry for each “real” page of memory
  – Entry consists of the virtual address of the page stored in that real memory location, with information about the process that owns that page
• Concern: Decreases memory needed to store each page table, but increases time needed to search the table when a page reference occurs
  – Use a hash table to limit the search to one, or at most a few, page-table entries
  – hash table requires another memory lookup (of course)
• Concern for later: The use of an inverted page table does not obviate the need for a normal page table in demand-paged systems (ch. 9)
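An inverted page table with a hash index can be sketched as below. A Python dict stands in for the hardware-assisted hash table, and the class, its method names, the process ids and the 4 K page size are all assumptions made for the example.

```python
class InvertedPageTable:
    """One entry per physical frame, indexed by a hash on (pid, page)."""

    def __init__(self, num_frames: int, offset_bits: int = 12):
        self.offset_bits = offset_bits
        self.frames = [None] * num_frames  # frame -> (pid, page) or None
        self.index = {}                    # hash: (pid, page) -> frame

    def map(self, pid: int, page: int, frame: int) -> None:
        """Record that `frame` now holds `page` of process `pid`."""
        evicted = self.frames[frame]
        if evicted is not None:
            del self.index[evicted]
        previous = self.index.get((pid, page))
        if previous is not None:
            self.frames[previous] = None
        self.frames[frame] = (pid, page)
        self.index[(pid, page)] = frame

    def translate(self, pid: int, logical: int) -> int:
        """Look up the frame by hashing (pid, page) instead of scanning."""
        page = logical >> self.offset_bits
        d = logical & ((1 << self.offset_bits) - 1)
        frame = self.index.get((pid, page))
        if frame is None:
            raise LookupError(f"page {page} of process {pid} is not resident")
        return (frame << self.offset_bits) | d


ipt = InvertedPageTable(num_frames=8)
ipt.map(pid=7, page=3, frame=5)
print(hex(ipt.translate(7, 0x3010)))  # 0x5010
```

The table size depends only on the number of frames, not on the number of processes or the size of their address spaces.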

[Figure: Inverted Page Table Architecture]

Segmentation
• Segmentation is a non-contiguous memory allocation scheme
  – “simpler” than paging, but not as efficient
  – supports user view of memory
• Programmers tend not to consider memory as a linear array of bytes; they prefer to view memory as a collection of variable-sized segments
  – Never forget, however, that memory is a linear array of bytes
• A segment is a logical unit such as:
  – main program, procedure, function, local variables, global variables, common block, stack, symbol table, arrays, etc.
• Segmentation is a memory management scheme that supports this user view of memory
  – segments are numbered and referred to by that number
  – a logical address consists of a segment number and an offset
  – A mapping between segments and physical addresses must be performed

[Figure: Logical View of Segmentation. Segments 1 to 4 in user space and their placement in physical memory space]

Segmentation Architecture
• Logical address consists of a two-tuple: <segment-number, offset>
• Segment table – maps two-dimensional logical addresses to one-dimensional physical addresses; each table entry has:
  – base – contains the starting physical address where the segment resides in memory.
  – limit – specifies the length of the segment.
• Segment-table base register (STBR) points to the segment table’s location in memory.
• Segment-table length register (STLR) indicates the number of segments used by a program
  – segment number s is legal if s < STLR.
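The two checks (segment number against STLR, offset against the segment's limit) can be sketched as follows. The segment table is assumed to be a Python list of (base, limit) pairs whose length plays the role of STLR; the table contents are made-up values.

```python
class SegmentationFault(Exception):
    """Raised for an illegal segment number or an out-of-range offset."""


def segment_translate(s: int, d: int, segment_table) -> int:
    """Translate the logical address <s, d> to a physical address."""
    # Segment number s is legal only if s < STLR.
    if not 0 <= s < len(segment_table):
        raise SegmentationFault(f"segment {s} is not in the table")
    base, limit = segment_table[s]
    # The offset must fall inside the segment.
    if not 0 <= d < limit:
        raise SegmentationFault(f"offset {d} exceeds limit {limit}")
    return base + d


# Hypothetical segment table: (base, limit) for segments 0..4.
segment_table = [(1400, 1000), (6300, 400), (4300, 400), (3200, 1100), (4700, 1000)]
print(segment_translate(2, 53, segment_table))   # 4353
print(segment_translate(3, 852, segment_table))  # 4052
```

A reference such as <0, 1222> raises the fault, because segment 0 is only 1000 bytes long.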

Segmentation Architecture (Cont.)
• Relocation
  – dynamic (execution-time)
  – by segment table
• Sharing
  – similar to sharing in a paged system
    ▪ shared segments
    ▪ must have the same segment number in each program
  – protection/sharing bits in each segment table entry
• Memory allocation
  – segments vary in length
  – dynamic-storage problem: first fit/best fit?
  – external fragmentation
    ▪ segmentation doesn’t use frames, thus external fragmentation exists
    ▪ periodic compaction may be necessary and is possible as dynamic relocation is supported

[Figure: Sharing of segments]

Hybrid: Segmentation with Paging
• Segmentation and paging have their advantages and disadvantages
  – segmentation suffers from dynamic allocation problems
    ▪ lengthy search time for a memory hole
    ▪ external fragmentation can waste significant resources
  – paging reduces dynamic allocation problems
    ▪ quick search (just find enough empty frames if they exist)
    ▪ eliminates external fragmentation
  – Note: it does introduce internal fragmentation
• Solution: page the segments!
  – First seen in MULTICS, dominates current allocation schemes
• Solution differs from pure segmentation in that the segment-table entry contains not the base address of the segment, but rather the base address of a page table for the segment
  – Logical address layout: | segment s (18 bits) | page p (6 bits) | offset d’ (10 bits) |

[Figure: MULTICS Address Translation Scheme. Logical address fields: segment s (18 bits), page p (6 bits), offset d’ (10 bits)]
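The paged-segment lookup can be sketched using the 18/6/10 field widths from the figure labels. This is a simplified model, not the actual MULTICS hardware: the segment table is assumed to be a dict mapping a segment number directly to that segment's page table (itself a dict of page to frame), and segment length checks are omitted.

```python
SEG_BITS, PAGE_BITS, OFFSET_BITS = 18, 6, 10


def split_address(addr: int):
    """Split a 34-bit logical address into (s, p, d')."""
    d = addr & ((1 << OFFSET_BITS) - 1)
    p = (addr >> OFFSET_BITS) & ((1 << PAGE_BITS) - 1)
    s = (addr >> (OFFSET_BITS + PAGE_BITS)) & ((1 << SEG_BITS) - 1)
    return s, p, d


def paged_segment_translate(addr: int, segment_table) -> int:
    """Segment entry -> page table for that segment -> frame."""
    s, p, d = split_address(addr)
    page_table = segment_table[s]  # entry holds a page table, not a base
    frame = page_table[p]
    return (frame << OFFSET_BITS) | d


# Hypothetical mapping: segment 2, page 5 lives in frame 9.
segment_table = {2: {5: 9}}
addr = (2 << 16) | (5 << 10) | 7
print(split_address(addr))                           # (2, 5, 7)
print(paged_segment_translate(addr, segment_table))  # 9223
```

Since each segment is made of fixed-size pages, the allocator only has to find free frames, and external fragmentation disappears.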

Generalized Summary
• Parkinson’s Law: “Programs expand to fill available memory”
• Mono-programmed systems:
  – One user process in memory
  – OS and device drivers also present
  – Overlays used to increase program size
  – Relocatable at compile-time only
  – Protection: Base and limit register
• Multi-programmed systems/fixed number of tasks (OS/360 MFT):
  – Memory allocation on fixed-sized/numbered partitions
  – Queue for each partition size
  – Relocatable at load time
  – Protection: Base and limit register, or protection code (pid) if multiple non-contiguous blocks are allowed

Generalized Summary (Cont.)
• Multi-programmed and time-shared systems with variable partitions
  – Memory manager must keep track of partitions and holes
  – Dynamic allocation algorithm: First-fit, Next-fit, Best-fit, etc.
  – Compaction to reduce external fragmentation
  – Protection:
    ▪ relocation (base) register and limit register, or
    ▪ virtual addresses: the OS produces the physical address; user programs cannot generate addresses which belong to other processes
  – Relocatable during execution (or no compaction possible)
    ▪ Change relocation register value or page-to-frame mapping