
5b382f9e1e7e8732c8d8a46583967377.ppt
- Number of slides: 111
Storage Devices CSC-3004 Introduction to Software Development Spring 2011 Dr. James Skon
Secondary device types Magnetic disk Magnetic tape Semiconductor memory devices Optical Devices Mass storage devices
Magnetic disks Disk platters, coated with ferrous oxide, rotate on a spindle. Read/write heads read and record information in single bit wide "tracks". These tracks are broken up into blocks, or "sectors".
Magnetic disks Performance - 3 aspects to timing. Seek time - time to move head to the correct cylinder. Latency - time for disk to rotate to correct position. Transfer rate - speed at which data may be read. Instantaneous - rate at an instant in time. Average - rate including time for IBG.
Magnetic disks Hard disks. Sector - block size on disk (if fixed). Track - all sectors in a concentric circle. Platter - one physical disk - two surfaces. May have multiple platters. All parallel tracks form a "cylinder".
Magnetic disks Hard disks. Sector – Typical size 512 bytes
Magnetic disks Disks spin fast (~ 7200 rpm or more). Heads "fly" over surface. If they touch, or "crash" both heads and surface may be damaged. the closer the heads, the higher the density Movable heads must accurately locate correct track. Often one surface is used for timing and position sensing
Magnetic disks Fixed technology disks since sealed, no dirt can cause crash, heads fly very close. May have multiple heads per surface. High density. Fast (multiple heads & high density)
Magnetic disks Removable Lower density than fixed.
Magnetic disks Fixed head One head for every track. Very fast. Expensive Not made anymore
Magnetic disks Floppy disks: single flexible platter Rotate slowly (360 rpm) Head in constant contact with surface Easily damaged Heads seek slowly
Magnetic disks Disk defects: due to the thinness of the surface coating, most disks have small flaws or defects. Spare tracks or sectors are provided for storage of data that normally would be stored in the damaged location. Either the hardware or software must handle these "bad" sections.
Magnetic disks Disk track formats Tracks are divided into either fixed length sectors or variable or user-defined length blocks.
Sector-addressable devices The disk tracks are subdivided into fixed size sectors. Advantages: simple allocation of storage space, simple address calculations. Disadvantages: internal fragmentation. [Figure: a track divided into 13 numbered sectors]
Sector-addressable devices interleaved Disks spin too fast to read adjacent blocks. Solution - interleave blocks. Logically adjacent blocks are not physically adjacent. Interleaving factor - distance between blocks. [Figure: 13-sector track laid out with interleave factor 3]
Sector-addressable devices interleaved If the factor is n, then n revolutions are required to read the whole track. High performance controller speeds now allow up to 1:1 interleaving! [Figure: 13-sector track laid out with interleave factor 3]
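The interleaved layout can be sketched in a few lines. This is a minimal illustration, not a controller algorithm from the slides: the function name `interleave_layout` and the "skip to the next free slot" rule are assumptions made so the sketch also works when the factor divides the sector count.

```python
def interleave_layout(num_sectors, factor):
    """Return the physical order of logical sectors on one track.

    layout[p] is the logical sector number stored in physical slot p.
    Consecutive logical sectors are placed `factor` slots apart.
    """
    layout = [None] * num_sectors
    pos = 0
    for logical in range(1, num_sectors + 1):
        # If the target slot is taken, slide forward to the next free one.
        while layout[pos] is not None:
            pos = (pos + 1) % num_sectors
        layout[pos] = logical
        pos = (pos + factor) % num_sectors
    return layout


print(interleave_layout(13, 3))
# [1, 10, 6, 2, 11, 7, 3, 12, 8, 4, 13, 9, 5]
```

With 13 sectors and a factor of 3, reading sectors 1 through 13 in logical order takes three passes over the track, which matches the "n revolutions" rule.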
Sector-addressable devices Clustered File System groups sectors into logically contiguous clusters. All allocation, reading, and writing is done on an entire cluster. For example, with 512 byte sectors, can have cluster sizes ranging from 1 to 65,535 sectors.
Sector-addressable devices Clustered Advantages over non-clustered: Blocking - do fewer reads and writes, so faster overall performance. Management - maintain information on a file as a list of clusters, rather than a (longer) list of sectors. [Figure: file allocation table mapping cluster numbers 1, 2, 3, ... to cluster locations]
Sector-addressable devices Clustered Disadvantages More Wasted Space - more Internal Fragmentation Thus cluster size is a space/time tradeoff!
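The space side of the tradeoff is easy to quantify. The sketch below assumes 512-byte sectors and that a file always occupies a whole number of clusters; the function name `wasted_bytes` is made up for illustration.

```python
def wasted_bytes(file_size, sectors_per_cluster, sector_size=512):
    """Internal fragmentation: bytes allocated to the file but not used."""
    cluster_size = sectors_per_cluster * sector_size
    clusters = -(-file_size // cluster_size)  # ceiling division
    return clusters * cluster_size - file_size


# A 1000-byte file under increasingly large clusters.
for spc in (1, 2, 8):
    print(spc, wasted_bytes(1000, spc))
# 1 24
# 2 24
# 8 3096
```

Larger clusters mean fewer I/O operations and shorter allocation lists, but each file wastes on average about half a cluster.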
Sector-addressable devices extents An extent is a physically contiguous collection of clusters. If a file is in one extent, it is all physically contiguous. Reduces seek time to read entire file. A file may need more than one extent if not enough physically contiguous space is available; the disk is then “fragmented”.
Block-addressable devices Block size is programmable, as in magnetic tapes. Block sizes may be mixed on a single device. Advantages: As with mag. tape, space is saved by blocking (fewer gaps). With the block size a multiple of the logical record size, there is no internal fragmentation (unused area at end of block). Disadvantages: External fragmentation. Complex space management.
Space utilization of sector addressable devices Consider a disk with: 512 bytes per sector, 32 sectors per track, 20 tracks per cylinder, 400 cylinders per disk pack. What is the disk size in bytes? 512 * 32 * 20 * 400 = 131,072,000 bytes or 131 megabytes.
Space utilization How many sectors will be used to store 8,000 records on the above disk if record size is 100 bytes? Blocking factor = floor(512 / 100) = 5 records per sector. Thus 8,000 / 5 = 1,600 sectors are needed.
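Space utilization Utilization - how much is used? Each 512-byte sector holds 5 records of 100 bytes. Thus: utilization = (5 * 100) / 512 = 500 / 512 ≈ 97.7%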
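The capacity, sector count, and utilization figures for this example disk can be checked with a short script. It is a sketch under one assumption not stated on the slides: records never span a sector boundary. The function names are illustrative.

```python
import math

BYTES_PER_SECTOR = 512
SECTORS_PER_TRACK = 32
TRACKS_PER_CYLINDER = 20
CYLINDERS = 400


def disk_capacity_bytes():
    return BYTES_PER_SECTOR * SECTORS_PER_TRACK * TRACKS_PER_CYLINDER * CYLINDERS


def sectors_needed(num_records, record_size):
    # Whole records per sector (the blocking factor).
    blocking_factor = BYTES_PER_SECTOR // record_size
    return math.ceil(num_records / blocking_factor)


def utilization(record_size):
    # Fraction of each sector that actually holds record data.
    blocking_factor = BYTES_PER_SECTOR // record_size
    return blocking_factor * record_size / BYTES_PER_SECTOR


print(disk_capacity_bytes())       # 131072000
print(sectors_needed(8000, 100))   # 1600
print(utilization(100))            # 0.9765625
```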
Nondata Overhead Disks require space for non-data overhead: interblock gaps, block headers, synchronization marks. These fields are invisible on sector addressable devices, and usually need not be considered in space computations.
Magnetic Disk Timing is a function of the following device specific factors: seek time, rotational delay (latency), transmission time (read time). These times are not fixed, but vary based on the previous status of the disk drive, disk and head position relative to desired position.
Magnetic Disk Timing Consider the following times: Seek time: Track to track time: 1 millisecond. Full disk movement: 9 milliseconds. Average move time: 7.6 milliseconds. Rotational speed: 7200 RPM. Average rotational delay: (60/7200)/2 = 4.16 milliseconds. Transfer rate: 66.6 Mbytes/second. Sector size: 512 bytes.
Magnetic Disk Timing Thus it would take: 512 bytes / 66.6 Mbytes/second ≈ 0.00769 ms to transfer a sector.
Magnetic Disk Timing Average access per sector is: average sector access time = seek time + rotational delay + transfer time. Thus, for the case above: average sector access time is 7.6 + 4.16 + 0.00769 = 11.76769 ms
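The same access-time formula can be written as a small function. This is a sketch, assuming 1 Mbyte = 1,000,000 bytes; the function name and return shape are illustrative. Because it does not truncate the rotational delay to 4.16 ms, its total comes out near 11.77 ms rather than matching the hand calculation digit for digit.

```python
def average_access_time_ms(seek_ms, rpm, transfer_bytes_per_s, sector_bytes):
    """Return (total, rotational_delay, transfer_time), all in milliseconds."""
    # On average the disk must turn half a revolution.
    rotational_delay_ms = (60_000 / rpm) / 2
    transfer_ms = sector_bytes / transfer_bytes_per_s * 1000
    total_ms = seek_ms + rotational_delay_ms + transfer_ms
    return total_ms, rotational_delay_ms, transfer_ms


total, rotation, transfer = average_access_time_ms(7.6, 7200, 66.6e6, 512)
print(f"rotation {rotation:.3f} ms, transfer {transfer:.5f} ms, total {total:.3f} ms")
```

The transfer term is three orders of magnitude smaller than seek and rotation, which is why reducing head movement matters far more than raw transfer rate for small reads.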
Magnetic Disk Timing Clustering
Magnetic tape Typically nine tracks wide 800, 1600, 6250 bits per inch (bpi) Storage based on the magnetic polarity of ferrous oxide particles on the tape. The tape moves over read/write heads to store and retrieve information
Magnetic tape The write head magnetizes small regions of the tape in one of two directions. The read head senses the places where magnetic polarity changes, called "flux change". Flux changes cause an electrical current to be produced in the windings of the read head. Speed varies between 40 to 200 inches per second (ips)
Magnetic tape Vacuum loops hold a reservoir of tape. This way the bulky reels do not have to keep up with acceleration/deceleration of tape, but can catch up a short time later.
Magnetic tape Streaming tape drive - No loops needed. Very slow in start/stop mode (~20 k/sec), but extremely fast in continuous mode. (~160 k/sec). Often these are cartridge type devices. Used for high speed/low cost backup devices.
Magnetic tape Error checking and correction Even/odd parity. Vertical redundancy checking (VRC): An extra bit per column is set or clear to make the number of bits set either even or odd. Longitudinal redundancy checking (LRC): Each "row" of bits in a block has a parity bit. Using VRC and LRC together, errors may be found and corrected in flight.
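Combining row and column parity locates a single flipped bit at the intersection of the failing row and the failing column. The sketch below assumes even parity and represents a block as a list of equal-length bit rows; the function names are illustrative, not taken from any tape controller.

```python
def parities(block):
    """Return (row_parity, column_parity) using even parity."""
    row_parity = [sum(row) % 2 for row in block]
    col_parity = [sum(col) % 2 for col in zip(*block)]
    return row_parity, col_parity


def correct_single_error(block, row_parity, col_parity):
    """Fix one flipped bit in place; return its (row, col) or None."""
    new_rows, new_cols = parities(block)
    bad_rows = [i for i, (a, b) in enumerate(zip(row_parity, new_rows)) if a != b]
    bad_cols = [j for j, (a, b) in enumerate(zip(col_parity, new_cols)) if a != b]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        block[bad_rows[0]][bad_cols[0]] ^= 1
        return bad_rows[0], bad_cols[0]
    return None  # no error, or more errors than this scheme can locate


block = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]]
stored_rows, stored_cols = parities(block)
block[1][2] ^= 1  # simulate a single-bit read error
print(correct_single_error(block, stored_rows, stored_cols))  # (1, 2)
```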
Magnetic tape Error checking and correction Checksum: addition of all data in a block together using modulo arithmetic. This value is then recorded at the end of the data block.
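A minimal checksum sketch follows. The modulus of 256 (a one-byte checksum) is an assumption chosen for illustration; the slides do not state which modulus is used.

```python
def checksum(data: bytes, modulus: int = 256) -> int:
    """Sum of all bytes in the block, reduced modulo `modulus`."""
    return sum(data) % modulus


def append_checksum(data: bytes) -> bytes:
    # Record the checksum as one extra byte at the end of the block.
    return data + bytes([checksum(data)])


def verify(block: bytes) -> bool:
    return checksum(block[:-1]) == block[-1]


block = append_checksum(b"storage devices")
print(verify(block))                       # True
damaged = bytes([block[0] ^ 0x01]) + block[1:]
print(verify(damaged))                     # False
```

A simple sum detects any single corrupted byte but misses some multi-byte errors, for example two bytes swapped with each other.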
Magnetic tape Error checking and correction Cyclic redundancy check (CRC): Based on calculating polynomial functions of data. Can detect multiple-bit errors.
Magnetic tape Error checking and correction Soft errors - errors which can be corrected. Hard errors - errors that cannot be corrected.
Magnetic tape Blocking Tapes must be read at a constant speed. To facilitate starting and stopping midtape, interblock gaps (IBG) are used to allow time for acceleration/deceleration of tape. Typical size 0.6 inch.
Magnetic tape Buffering Blocks of tape read into buffer for subsequent processing. One physical block may hold several logical blocks. blocking factor - number of logical blocks per physical block. Optimizes slow I/O time.
Space utilization Blocking factor greatly affects utilization of tape. Block size = record size x blocking factor. Gap size (bytes) = density (bytes per inch) x gap length (in)
Space utilization Consider: 6250 BPI tape, 0.6 inch IBG, 100 byte records
Space utilization
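For the 6250 BPI tape with 0.6 inch gaps and 100-byte records, utilization as a function of blocking factor can be sketched as follows. The model assumes one gap per block and treats the gap as the number of bytes that could have been stored in its length; the function name is illustrative.

```python
def tape_utilization(record_size, blocking_factor, density_bpi=6250, gap_inches=0.6):
    """Fraction of tape length that holds data rather than interblock gap."""
    block_bytes = record_size * blocking_factor
    gap_bytes = density_bpi * gap_inches  # 3750 bytes' worth of tape per gap
    return block_bytes / (block_bytes + gap_bytes)


for bf in (1, 10, 50):
    print(bf, f"{tape_utilization(100, bf):.1%}")
# 1 2.6%
# 10 21.1%
# 50 57.1%
```

With unblocked 100-byte records almost all of the tape is gap; raising the blocking factor is the only way to recover that space.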
Timing considerations
Timing considerations Consider: 6250 BPI tape, 100 byte records, 100 IPS (inches per second), .03 second start time, .03 second stop time
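A rough timing model for these parameters is sketched below. It assumes start/stop operation, where every block pays the full start time and stop time in addition to the time its data takes to pass the head; that model and the function names are assumptions, since the slide's own formulas are not present in the text.

```python
def tape_block_time(record_size, blocking_factor, density_bpi=6250,
                    speed_ips=100, start_s=0.03, stop_s=0.03):
    """Seconds to read one block in start/stop mode."""
    block_inches = record_size * blocking_factor / density_bpi
    return start_s + block_inches / speed_ips + stop_s


def effective_rate(record_size, blocking_factor):
    """Data bytes delivered per second, including start/stop overhead."""
    block_bytes = record_size * blocking_factor
    return block_bytes / tape_block_time(record_size, blocking_factor)


for bf in (1, 10, 50):
    print(bf, round(effective_rate(100, bf)), "bytes/s")
```

Under this model the start and stop times dominate for small blocks, so throughput grows almost linearly with the blocking factor until the data time becomes comparable to 0.06 seconds.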
CD-ROM 600 megabytes, read-only (write-once), very cheap to produce. History: Offspring of videodisc from the late 60’s, early 70’s. Many standards caused problems. In the early 80’s work began on developing an audio disc; Sony and Philips developed it as a standard. Introduced in 1984. File system standard developed in 1985. DVD is a later standard - 10 gigabytes
CD-ROM Strengths High Capacity Inexpensive Durable Weaknesses extremely slow seek speed (transfer rate is reasonable)
CD-ROM: Physical Organization Creating Bits stored as Pits and Lands: CD-ROMs are stamped from a glass master disk which has a coating that is changed by the laser beam. When the coating is developed, the areas hit by the laser beam turn into pits along the track followed by the beam. The smooth unchanged areas between the pits are called lands.
CD-ROM: Physical Organization Reading A beam of laser light is focused on the track as it moves under the optical pickup. The pits scatter the light, but the lands reflect most of it back to the pickup. This alternating pattern of high- and low-intensity reflected light is the signal used to reconstruct the original digital information.
CD-ROM: Physical Organization Digital Encoding 1’s are represented by the transition from pit to land and back again. 0’s are represented by the amount of time between transitions. The longer between transitions, the more 0’s we have.
CD-ROM: Physical Organization Digital Encoding Given this scheme, it is not possible to have two adjacent 1s: 1s are always separated by 0s. As a matter of fact, because of physical limitations, there must be at least two 0s between any pair of 1s. Raw patterns of 1s and 0s have to be translated to get the 8-bit patterns of 1s and 0s that form the bytes of the original data.
CD-ROM: Physical Organization Digital Encoding EFM encoding (Eight-to-Fourteen Modulation) turns the original 8 bits of data into 14 expanded bits that can be represented in the pits and lands on the disk. Since 0s are represented by the length of time between transitions, the disk must be rotated at a precise and constant speed. This affects the CD-ROM drive’s ability to seek quickly.
CD-ROM: Physical Organization CLV instead of CAV CLV: Constant Linear Velocity CAV: Constant Angular Velocity
CD-ROM: Physical Organization CLV instead of CAV Data on a CD-ROM is stored in a single, spiral track. Constant Linear Velocity Constant Angular Velocity
CD-ROM: Physical Organization CLV instead of CAV This allows the data to be packed as tightly as possible since all the sectors have the same size (whether in the center or at the edge). In the magnetic disk drive the data is packed more densely in the center than at the edge, thus space is lost at the edge. Since reading the data requires that it passes under the optical pick-up device at a constant rate, the disc has to spin more slowly when reading the outer edges than when reading towards the center.
CD-ROM: Physical Organization CLV instead of CAV The CLV format is responsible, in large part, for the poor seeking performance of CD-ROM Drives: there is no straightforward way to jump to a location. Part of the problem is the need to change rotational speed.
CD-ROM: Physical Organization CLV instead of CAV To read the address info that is stored on the disc along with the user’s data, we need to be moving the data under the optical pick up at the correct speed. But to know how to adjust the speed, we need to be able to read the address info so we know where we are. How do we break this loop? By guessing and through trial and error ==> Slows down performance.
CD-ROM: Physical Organization CD Addressing Each second of playing time on a CD is divided into 75 sectors. Each sector holds 2 Kilobytes of data. Each CD-ROM contains at least one hour of playing time. Thus the disc is capable of holding at least: 60 min * 60 sec/min * 75 sectors/sec * 2 Kilobytes/sector = 540,000 Kbytes. Often, it is actually possible to store over 600,000 Kbytes. Sectors are addressed by min:sec:sector, e.g., 16:22:34
CD error correction CIRC - Cross-Interleaved Reed-Solomon Code. Adds to every three data bytes one redundant parity byte. CIRC corrects error bursts up to 3,500 bits in sequence (2.4 mm in length as seen on CD surface) and compensates for error bursts up to 12,000 bits (8.5 mm) that may be caused by minor scratches.
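The min:sec:sector addressing scheme maps directly onto an absolute sector number. The helper names below are made up for illustration; the constants come from the slide.

```python
SECTORS_PER_SECOND = 75
KBYTES_PER_SECTOR = 2


def address_to_sector(minutes, seconds, sector):
    """Convert a min:sec:sector address to an absolute sector number."""
    return (minutes * 60 + seconds) * SECTORS_PER_SECOND + sector


def capacity_kbytes(playing_minutes):
    return playing_minutes * 60 * SECTORS_PER_SECOND * KBYTES_PER_SECTOR


print(address_to_sector(16, 22, 34))  # 73684
print(capacity_kbytes(60))            # 540000
```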
DVD "Digital Video Disc" or "Digital Versatile Disc". Many similarities with CD
DVD In the early 1990’s there were two competing standards: MultiMedia Compact Disc (MMCD) (Philips and Sony) and Super Density Disc (SD) (Toshiba, Time-Warner, Matsushita Electric, Hitachi, Mitsubishi Electric, Pioneer, Thomson, and JVC). SD format accepted with two modifications. First players in 1996-1997
DVD A single-layer DVD can store 4.7 GB, which is around seven times as much as a standard CD-ROM. Higher frequency red laser increased density by a factor of 3.5. More efficient coding increased efficiency by 47%. CIRC replaced by a powerful Reed-Solomon product code, RS-PC. Eight-to-Fourteen Modulation (EFM) is replaced by a more efficient version, EFMPlus, which uses eight-to-sixteen modulation
DVD Formats DVD-5: single sided, single layer, 4.7 gigabytes (GB), or 4.38 gibibytes (GiB). DVD-9: single sided, double layer, 8.5 GB (7.92 GiB). DVD-10: double sided, single layer on both sides, 9.4 GB (8.75 GiB). DVD-14: double sided, double layer on one side, single layer on other, 13.3 GB (12.3 GiB). DVD-18: double sided, double layer on both sides, 17.1 GB (15.9 GiB)
DVD Mediums DVD-ROM: read only, manufactured by a press DVD-R: recordable once DVD-RW: rewritable DVD-RAM: random access rewritable DVD+R: recordable once DVD+RW: rewritable DVD-R DL: dual layer record once DVD+R DL: dual layer record once DVD-RW DL: dual layer rewritable DVD+RW DL: dual layer rewritable
Blu-ray disc Capacity: 25 GB (single layer), 50 GB (dual layer). Block size - 64 KB ECC. Speed: 1× at 36 Mbps (4.5 MBps), 2× at 72 Mbps (9 MBps), 4× at 144 Mbps (18 MBps), 6× at 216 Mbps (27 MBps), 8× at 288 Mbps (36 MBps), 12× at 432 Mbps (54 MBps)
Blu-ray disc Laser and optics Conventional DVDs and CDs use red and near infrared lasers at 650 nm and 780 nm respectively. Blu-ray uses 405 nm (violet, but called blue) The minimum "spot size" on which a laser can be focused is limited by diffraction, and depends on the wavelength of the light
SSD - Solid-state drive: a data storage device that uses solid-state memory to store persistent data. An SSD emulates a hard disk drive interface, thus easily replacing it in most applications. Most SSD manufacturers use non-volatile flash memory. Lower priced drives usually use multi-level cell (MLC) flash memory, which is slower and less reliable than single-level cell (SLC) flash memory.
Flash memory Flash memory is a non-volatile computer memory that can be electrically erased and reprogrammed. Flash memory stores information in an array of memory cells made from floating-gate transistors. Flash cells “wear” from being erased and reprogrammed, limiting life.
Flash Memory Flash memory stores information in an array of memory cells made from floating-gate transistors. single-level cell (SLC) devices: each cell stores only one bit of information. Faster. multi-level cell (MLC) devices, can store more than one bit per cell by choosing between multiple levels of electrical charge to apply to the floating gates of its cells. Slower.
Flash Memory Erases to “1”, programs to “0”. A “large” voltage is needed to force the transistors between states. This is done by a “voltage pump”
Flash Memory limitations Block erasure - although it can be read or programmed a byte or a word at a time in a random access fashion, it must be erased a "block" at a time. Memory wear - flash memory has a finite number of erase-write cycles. Most commercially available flash products are guaranteed to withstand around 100,000 write-erase cycles. Wear leveling.
Flash Memory – Wear Leveling Memory wear is partially offset in some chip firmware or file system drivers by counting the writes and dynamically remapping blocks in order to spread write operations between sectors. Another approach is to perform write verification and remapping to spare sectors in case of write failure, a technique called Bad Block Management (BBM).
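The remapping idea can be illustrated with a toy model. This is a deliberately simplified sketch, not how any real flash controller works: the class name, the "pick the least-erased available block" policy, and the one-erase-per-write accounting are all assumptions.

```python
class WearLevelingFlash:
    """Toy flash device that spreads erases across physical blocks."""

    def __init__(self, num_blocks):
        self.erase_counts = [0] * num_blocks
        self.data = [None] * num_blocks
        self.mapping = {}  # logical block number -> physical block number

    def write(self, logical, payload):
        old = self.mapping.get(logical)
        in_use = set(self.mapping.values())
        # Candidates: every free block, plus the block being overwritten.
        candidates = [p for p in range(len(self.data))
                      if p not in in_use or p == old]
        target = min(candidates, key=lambda p: self.erase_counts[p])
        self.erase_counts[target] += 1  # erase before programming
        if old is not None and old != target:
            self.data[old] = None       # old copy is now stale
        self.data[target] = payload
        self.mapping[logical] = target

    def read(self, logical):
        return self.data[self.mapping[logical]]


flash = WearLevelingFlash(num_blocks=4)
for i in range(100):
    flash.write(0, f"version {i}")  # hammer a single logical block
print(flash.erase_counts)  # [25, 25, 25, 25]
print(flash.read(0))       # version 99
```

Without remapping, all 100 erases would land on one physical block; with it, each block absorbs a quarter of the wear.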
I/O in Unix I/O is performed by calls to the I/O portion of the Unix kernel. The kernel presents a simple view of I/O - as sequences of bytes. The kernel maintains a series of tables to keep track of I/O
Unix File System The file system resides on a single logical disk or partition. A partition can be viewed as a linear array of blocks. A block represents the granularity of space allocation for files. A disk block is 512 bytes * some power of 2. A physical block number identifies a block on a given disk partition. The physical block number can be translated into a physical location on a partition.
Disk partition Layout: boot area (B), superblock (S), inode list, data blocks. Boot area: code required to bootstrap the operating system. Superblock: attributes and metadata of the file system itself. Inode list: a linear array of inodes. Data blocks: data blocks for files and directories, and indirect blocks.
Superblock It contains: size in blocks of the file system, size in blocks of the inode list, number of free blocks and inodes, free block list (partial), free inode list (partial). The kernel reads the superblock and stores it in memory when mounting the file system
Inode Each file has an inode associated with it Inode contains metadata for file on-disk inode refers to inode stored in disk within the inode list in-core inode refers to inode stored in memory when a file is open
I/O in Unix - tables File Descriptor Table: One for each process. Maps file descriptors onto specific open files in open file table. Open files table: System wide. Entry for each instance of open file. File may be opened by more than one process
I/O in Unix - tables Table of index nodes (inodes): Used to describe each file. Describes file, points to all blocks. Index nodes: Each contains a list of 13 pointers. The first 10 point directly to the first ten data blocks. The 11th points to an indirect block of 1000 pointers to blocks. The 12th points to a block of 1000 pointers, each of which points to a block of 1000 pointers (1 meg blocks). The 13th adds one more level of indirection, giving 1 billion blocks!
File Descriptor Table [Figure: two columns, file descriptor and file table entry. Descriptors: 0 (keyboard), 1 (screen), 2 (error), 3 (normal file), 4 (normal file), ...; each file table entry points to the open file table]
Open files table [Figure: one row per open file instance, with columns: R/W mode, number of processes using it, offset of next access, pointer to write routine, ..., pointer to inode table entry. The example rows show modes write and read, with offsets such as 100 and 3214]
On-disk inode The size of on-disk inode is 64 bytes
Index nodes (inodes) [Figure: inode fields: device, permissions, owner’s userid, file size, ..., block count, file allocation table]
On-disk inode Unix files are not contiguous on disk. The file system needs to maintain a map of the disk location of every block of the file. [Figure: 13 block pointers numbered 0 to 12; 0 to 9 are direct, 10 is indirect, 11 is double indirect, 12 is triple indirect]
Index nodes (inodes) [Figure: root inode with 10 direct blocks; a first indirect node with 1000 blocks; a second with 1000 indirect nodes; a third with 1000 pointers to indirect nodes]
In-core Inode It contains all the fields of on-disk inode, and some additional fields, such as: the status of the in-core inode (whether the inode is locked, which process is waiting, etc.), the logical device number containing the file, the inode number of the file, pointers to keep the inode on a free list, pointers to keep the inode on a hash queue, block number of last block read.
Inode operation Inode lookup: lookuppn() and s5lookup() translate a pathname and return a pointer to the vnode of the desired file. Allocate inode: iget() reads an inode from disk into memory by inode number, or initializes an empty inode if not found. Release inode: iput(); the kernel writes the inode to disk if the in-core copy differs from the disk copy
File Operation Read and write system calls use the following arguments: file descriptor, user buffer address, count of bytes to transfer. Offset is obtained from the opened file object. Offset is advanced by the number of bytes transferred. For random I/O “lseek” is used to set the offset to the desired location. Kernel verifies the file mode and puts an exclusive lock on the inode for serialized access. File read: s5read()
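The descriptor, offset, and lseek behaviour can be observed from user space through Python's `os` module, which exposes thin wrappers over the underlying system calls. The file name `demo.dat` and the temporary directory are arbitrary choices for this sketch.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.dat")

    # open() returns a small integer: an index into the file descriptor table.
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    os.write(fd, b"0123456789")       # offset advances from 0 to 10

    os.lseek(fd, 4, os.SEEK_SET)      # random access: set the offset to 4
    chunk = os.read(fd, 3)            # reads bytes 4..6
    offset_after = os.lseek(fd, 0, os.SEEK_CUR)  # query the current offset

    os.close(fd)

print(chunk)         # b'456'
print(offset_after)  # 7
```

The offset lives in the open file table entry, not in the program, which is why the read leaves it at 7 without the caller doing any bookkeeping.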
Structure of the File System • File system is organised as a hierarchy of directories • It starts from a single directory called root (represented by a /). [Figure: directory tree with / (root) at the top and /bin, /dev, /etc, /tmp, /usr and the kernel file beneath it]
Different types of Files • Ordinary files • Directories • Special files • Pipes
Directories • A directory is a file containing a list of files and subdirectories • It has fixed size records of 16 bytes each, which contain a filename (14 bytes) and an inode number (2 bytes), which acts as a pointer to where the system can find info about the file.
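A 16-byte directory record is simple to pack and parse with `struct`. Two details in this sketch are assumptions rather than facts from the slide: the inode number is placed before the name in little-endian order, and an inode number of 0 is treated as an unused slot.

```python
import struct

# Assumed layout: 2-byte unsigned inode number, then a 14-byte name field.
DIR_ENTRY = struct.Struct("<H14s")


def pack_entry(inode_number, name):
    # struct pads short names with NUL bytes up to 14 bytes.
    return DIR_ENTRY.pack(inode_number, name.encode("ascii"))


def parse_directory(raw):
    """Map each filename in a raw directory block to its inode number."""
    entries = {}
    for inode_number, raw_name in DIR_ENTRY.iter_unpack(raw):
        if inode_number == 0:  # assumed marker for an empty slot
            continue
        entries[raw_name.rstrip(b"\0").decode("ascii")] = inode_number
    return entries


raw = pack_entry(4, "usr") + pack_entry(0, "deleted") + pack_entry(9, "tmp")
print(DIR_ENTRY.size)        # 16
print(parse_directory(raw))  # {'usr': 4, 'tmp': 9}
```

The directory stores only names and inode numbers; everything else about a file lives in the inode.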
Special Files • Special files are contained in the directory /dev. • They are used to represent a real physical device such as a printer, tape device etc. • Ex: Special device - /dev/null(unwanted output can be redirected).
Pipes UNIX allows us to link commands together using a pipe. The pipe acts as a temporary file which only exists to hold data from one command until it is read by another. Ex: command1 | command2 | command3 ...
File Allocation Consider a 1 MB file on a system with a block size set to 8 KB. Then the file will have 125 blocks. First 10 pointed at directly by root inode, next 115 pointed at indirectly through an indirect block. Max file size: 8 KB*(10 + 2**10 + 2**20 + 2**30), that is more than 8 TB! Depends on block (or cluster) size
Using UNIX filesystem data structures Example: find /usr/bin/vi (from Leffler, McKusick, Karels and Quarterman). 1. Search root directory of filesystem to find /usr: root directory inode is, by convention, stored in inode #2; inode shows where data blocks are for root directory – these blocks (not the inode itself) must be retrieved and searched for entry usr; we discover that the directory usr’s inode is inode #4. 2. Search usr for bin: access blocks pointed to by inode #4 and search contents of blocks for entry that gives us bin’s inode; we discover that bin’s inode is inode #7. 3. Search bin for vi: access blocks pointed to by inode #7 and search contents of block for an entry that gives us vi’s inode; we discover that vi’s inode is inode #7. 4. Access inode #7 - this is vi’s inode
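The block counts and size limit can be computed directly. This sketch assumes 10 direct pointers plus single, double, and triple indirect levels with 1024 pointers per indirect block, and uses decimal units (1 MB = 1,000,000 bytes, 8 KB = 8,000 bytes) for the 125-block example; the function names are illustrative.

```python
import math

DIRECT_POINTERS = 10
POINTERS_PER_BLOCK = 1024  # assumed; depends on block and pointer size


def blocks_for_file(file_bytes, block_bytes):
    """Return (total, direct, indirect) block counts for a small file."""
    total = math.ceil(file_bytes / block_bytes)
    direct = min(total, DIRECT_POINTERS)
    return total, direct, total - direct


def max_file_size(block_bytes):
    n = POINTERS_PER_BLOCK
    return block_bytes * (DIRECT_POINTERS + n + n**2 + n**3)


print(blocks_for_file(1_000_000, 8_000))  # (125, 10, 115)
print(max_file_size(8 * 1024))            # 8804691427328
```

Almost all of the maximum comes from the triple indirect level, so the limit scales roughly with block size times the cube of the pointers per block.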
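The lookup walk can be modelled with a dictionary standing in for the directory data blocks. The inode numbers for root, usr, and bin follow the example; the number used for vi (23) is a hypothetical placeholder, since a file and its parent directory cannot share an inode.

```python
ROOT_INODE = 2  # by convention, the root directory's inode

# inode number of a directory -> {entry name: inode number}
directories = {
    2: {"usr": 4},
    4: {"bin": 7},
    7: {"vi": 23},  # 23 is a made-up value for illustration
}


def lookup(path):
    """Resolve an absolute path to an inode number, one component at a time."""
    inode = ROOT_INODE
    visited = [inode]
    for name in (part for part in path.split("/") if part):
        # Each step reads the current directory's blocks and searches them.
        inode = directories[inode][name]
        visited.append(inode)
    return inode, visited


print(lookup("/usr/bin/vi"))  # (23, [2, 4, 7, 23])
```

Every path component costs at least one inode read and one directory block read, which is why deep paths are slower to open unless the kernel caches the results.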
File performance The first 10 blocks are accessed with a single read: the pointers are in the inode, which is brought into main memory when the file is opened. The next 1 K blocks require up to two reads, one for the index block, one for the data block. The next 1 M blocks require up to three reads. The next 1 G blocks require up to four reads. Reads are slower farther into the file!
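The number of disk reads for a given logical block follows from which range the block falls into. This sketch assumes 1024 pointers per index block and no caching of index blocks, so it returns the worst case the slide describes.

```python
DIRECT = 10
PTRS = 1024  # assumed pointers per index block


def reads_needed(block_index):
    """Worst-case disk reads to fetch logical block `block_index` (0-based)."""
    if block_index < DIRECT:
        return 1                      # pointer is in the in-core inode
    block_index -= DIRECT
    for level in (1, 2, 3):           # single, double, triple indirect
        if block_index < PTRS ** level:
            return level + 1          # one read per index level, plus data
        block_index -= PTRS ** level
    raise ValueError("block index beyond maximum file size")


for i in (0, 9, 10, 10 + PTRS, 10 + PTRS + PTRS**2):
    print(i, reads_needed(i))
```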
Mount System Call How to attach a file system into a name space? Simple Idea: use letters C, D, E, etc. Better Idea: Allow attachment at arbitrary points in namespace Designate one tree as the “root” file system Others are attached to the root
A Journey of A Byte: What happens when the program statement write(textfile, ‘P’, 1) is executed? Part that takes place in memory: The statement calls the Operating System (OS), which oversees the operation. The file manager (the part of the OS that deals with I/O): Checks whether the operation is permitted. Locates the physical location where the byte will be stored (drive, cylinder, track and sector). Finds out whether the sector that will hold the ‘P’ is already in memory (if not, brings it into the I/O buffer). Puts ‘P’ in the I/O buffer. Keeps the sector in memory to see if more bytes will be going to the same sector in the file.
A Journey of A Byte: Part that takes place outside of memory: I/O Processor: Wait for an external data path to become available (CPU is faster than data-paths ==> Delays) Disk Controller: I/O Processor asks the disk controller if the disk drive is available for writing Disk Controller instructs the disk drive to move its read/write head to the right track and sector. Disk spins to right location and byte is written
Data transfer time disparity Disk access time is slowed by the time required for the heads to move into position (seek time), and the time for the disk to rotate to the correct position (latency). There are several ways to avoid costly delays while waiting for the disk.
Data transfer time disparity Multiprogramming In a single process environment, the CPU must usually sit "idle" while it waits for I/O to complete. This is just wasted CPU time. Solution: Share CPU among several users (processes). While one process is waiting for I/O, another runs. The OS is responsible for arbitrating the use of the CPU among the waiting processes (users).
Data transfer time disparity [Figure: timelines comparing a single process, which alternates between Run and Wait, with multiple processes (1, 2, 3, 4) sharing the CPU so that one runs while another waits]
Direct Memory Access (DMA) Sophisticated I/O controllers transfer requested blocks directly into memory while the CPU is working on something else. The I/O controller is given the address of the data on the device. The I/O controller locates the data, and "steals" bus cycles from the CPU to perform transfers. [Figure: CPU, I/O controller and primary memory connected by a bus]
Direct Memory Access (DMA) [Figure: memory activity timeline showing process cycles interleaved with “stolen” DMA cycles]
Buffering Consider the following characteristics of disk access: the majority of I/O time is consumed by head movement time. Each I/O call has related overhead. Data must often be read in a certain minimum size (physical block size). Files are often read in a sequential order. It doesn't take much more time to read several records than one.
Buffering Solution: Buffering - read or write several records during each transfer operation. Reading - “Anticipatory buffering”: Read several records at a time into buffer. Use records from buffer if possible. Read only when buffer empty. Writing: Write records to buffer rather than I/O device. Write buffer to I/O device when full
Buffering [Figure: timelines. Without buffering: I/O performs Read 1 through Read 5 one at a time, and the CPU processes each record (1 to 5) after its read. With buffering (5 records per read): I/O performs Read 1-5, then Read 6-10, and the CPU processes records 1-5, then 6, 7, 8, ...]
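Anticipatory read buffering can be sketched as a small wrapper that refills its buffer only when it runs empty. The class name and the `physical_reads` counter are illustrative additions, and an in-memory `io.BytesIO` stands in for the device.

```python
import io


class BufferedRecordReader:
    """Serve fixed-size records from a buffer filled several records at a time."""

    def __init__(self, raw, record_size, blocking_factor):
        self.raw = raw
        self.record_size = record_size
        self.block_size = record_size * blocking_factor
        self.buffer = b""
        self.pos = 0
        self.physical_reads = 0  # how many times the device was touched

    def read_record(self):
        if self.pos >= len(self.buffer):         # buffer empty: refill it
            self.buffer = self.raw.read(self.block_size)
            self.physical_reads += 1
            self.pos = 0
            if not self.buffer:
                return None                      # end of file
        record = self.buffer[self.pos:self.pos + self.record_size]
        self.pos += self.record_size
        return record


device = io.BytesIO(bytes(80 * 12))              # twelve 80-byte records
reader = BufferedRecordReader(device, record_size=80, blocking_factor=6)
records = [reader.read_record() for _ in range(12)]
print(len(records), reader.physical_reads)       # 12 2
```

Twelve logical reads cost only two physical reads here; an unbuffered reader would have gone to the device twelve times.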
Buffer Size Blocking factor - number of records per block, usually an integral number of records. The buffer size often is the same size as the block size of the physical device. Example: Record Size: 80 bytes. Physical Block Size: 512 bytes. Blocking Factor: floor(512/80) = 6
Overlapped buffering or double buffering Technique whereby a single process can overlap record processing with the I/O process. Consider a case of double buffering: Allocate two buffers for the file. When the file is opened, fill both buffers. As soon as one block is requested by the user program, an anticipatory read is begun for the next block. The concept of buffering is like passing buckets of water to a burning house.
Single buffering [Figure: timeline with I/O performing Read 1 to Read 4 and the CPU processing blocks 1 to 4, each after its read completes]
Double buffering [Figure: timeline with I/O performing Read 1 to Read 7 back to back while the CPU processes blocks 1 to 7 in parallel] Here the I/O time is greater than the processing time. What if processing time is greater?
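A simple timing model helps answer that question. It assumes every read takes the same time, every block takes the same time to process, and reads and processing can overlap fully once the first block has arrived; the function names are illustrative.

```python
def single_buffer_time(num_blocks, read_time, process_time):
    # Reads and processing strictly alternate.
    return num_blocks * (read_time + process_time)


def double_buffer_time(num_blocks, read_time, process_time):
    if num_blocks == 0:
        return 0
    # First read cannot overlap; after that the slower stage sets the pace;
    # the last block still has to be processed after its read completes.
    bottleneck = max(read_time, process_time)
    return read_time + (num_blocks - 1) * bottleneck + process_time


# I/O-bound case: reading (3) is slower than processing (1).
print(single_buffer_time(7, 3, 1), double_buffer_time(7, 3, 1))  # 28 22
# CPU-bound case: processing (3) is slower than reading (1).
print(single_buffer_time(7, 1, 3), double_buffer_time(7, 1, 3))  # 28 22
```

When processing time is greater, the CPU becomes the bottleneck: the I/O device sits idle between reads, and total time approaches the number of blocks times the processing time.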