

  • Number of slides: 41

Slide 1: COMP 3438 System Programming, UNIX Files (Chapter 5)

Slide 2: Course organization (this lecture was marked in red on the original slide)
  • Overview of the Subject (COMP 3438)
  • Part I: Unix System Programming (Device Driver Development)
      • Overview of Unix Sys. Prog.
      • Process/File (HW #1)
      • Overview of Device Driver Development
      • Introduction to Block Device Driver
      • Character Device Driver Development (HW #2)
  • Part II: Compiler Design
      • Overview of Compiler Design
      • Lexical Analysis (HW #3)
      • Syntax Analysis (HW #4)

Slide 3: UNIX files
  • What is a file in UNIX?
  • How many types of files are there?
  • How are the files in a file system structured?
  • How is a file represented in memory and on disk?
  • How is a file found from its name?
  • How are files accessed from a UNIX program?

Slide 4: Unix file system
  • The file system provides abstractions for naming, storing, and accessing files.
  • A file is a container of some information:
      • data
      • programs
      • what else?
  • In UNIX, devices (disks, tapes, CD-ROMs, screens, keyboards, printers, mice, and network interfaces) are also treated as files:
      • unified interface
      • device independence

Slide 5: How to handle devices?
  • The OS provides system calls to the programmer for performing control and I/O on devices.
      • These system calls are handled by device drivers, which hide the details of device operation and protect the devices from unauthorized use.
  • Some operating systems provide specific system calls for each type of supported device.
  • In UNIX, by contrast, devices are named and accessed in the same way as ordinary disk files.
      • UNIX provides a uniform device interface through file descriptors.
      • This allows uniform access to most devices through the file system calls open, close, read, write, etc.

Slide 6: Types of files in Unix
  • Regular file: an ordinary data file on disk; it contains bytes of data organized into a linear array.
  • Special file: a file representing a device, located in the /dev directory.
      • Block special file: a device that transfers information in blocks or chunks, such as a disk or CD-ROM.
      • Character special file: a device that transfers information as a stream of bytes that must be accessed sequentially, e.g., a keyboard or printer.
      • FIFO special file: used for inter-process communication.
  • Directory: provided so that files can be referred to by name rather than by physical location.
      • A user gives a filename and UNIX translates it to the location of the physical file; the translation is done via directories.
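The file types listed on this slide can be told apart from a program with stat(). The following is a minimal sketch, not part of the original slides; the helper name file_type_char and the one-letter codes (borrowed from the first column of ls -l output) are assumptions chosen for illustration.

```c
#include <sys/stat.h>

/* Hypothetical helper: classify a path using the type bits in st_mode.
 * Returns '-' regular, 'd' directory, 'b' block special,
 * 'c' character special, 'p' FIFO, 'o' any other type,
 * or '?' if stat() fails (for example, the path does not exist). */
char file_type_char(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return '?';
    if (S_ISREG(st.st_mode))
        return '-';
    if (S_ISDIR(st.st_mode))
        return 'd';
    if (S_ISBLK(st.st_mode))
        return 'b';
    if (S_ISCHR(st.st_mode))
        return 'c';
    if (S_ISFIFO(st.st_mode))
        return 'p';
    return 'o';
}
```

For example, file_type_char("/") yields 'd'. On a typical Linux system a terminal device such as /dev/tty would be expected to report 'c', but that depends on the system.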

Slide 7: Question
What are the differences between a regular file and a directory file?
Answer: they differ in
  • Contents: data vs. file information
  • Operations: what can be done and who can do it?

Slide 8: Question
What are the differences between a regular file and a device file? Compare:
    % cp /etc/passwd /tmp/garbage
    % cp /etc/passwd /dev/console

Slide 9: Hierarchical file organization
A UNIX filesystem has a hierarchical tree structure, where internal nodes are directories and leaf nodes are files.
    / (root)
        dirA
            dirB
                My1.dat
            My1.dat
            My2.dat
        dirC
        My3.dat
  • The absolute or fully qualified pathname uniquely specifies a file, e.g., /dirA/My1.dat differs from /dirA/dirB/My1.dat.
  • We can also use a relative pathname, which need not begin with the root directory, e.g., ../My2.dat

Slide 10: Current working directory
  • At any time, every process has an associated directory called the current working directory (cwd).
      • It is denoted by a dot ".", e.g., "mv ../file1 ."
      • The initial cwd of a user's login shell is the user's home directory.
      • pwd prints the name of the cwd.
  • A relative pathname is always interpreted starting from the cwd.
  • The C library function getcwd returns the pathname of the current working directory:
        char *getcwd(char *buf, size_t size);
      • size specifies the maximum pathname length (the size of buf); if the pathname is longer than that, getcwd returns NULL and sets errno to ERANGE.
      • If buf is not NULL, getcwd copies the name into buf.
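A short sketch of calling getcwd and checking for the ERANGE case described above. The wrapper name get_cwd is an assumption for illustration; note that getcwd itself signals failure by returning NULL.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical wrapper: copy the cwd into buf.
 * Returns 0 on success and -1 on failure, after reporting the reason. */
int get_cwd(char *buf, size_t size)
{
    if (getcwd(buf, size) == NULL) {
        if (errno == ERANGE)
            fprintf(stderr, "getcwd: a buffer of %zu bytes is too small\n", size);
        else
            perror("getcwd");
        return -1;
    }
    return 0;
}
```

A caller would typically pass a generously sized array, for instance char buf[4096], and print buf on success. The result is always an absolute pathname, so it begins with "/".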

Slide 11: File representation
  • Information about a filesystem's structure is stored both on disk and in main memory.
  • UNIX uses a logical structure called an i-node to store the information about a file on disk. Each file in a filesystem is represented by an i-node, which contains information about the file:
      • size,
      • pointer to physical location,
      • owner,
      • creation time, time of last access/modification,
      • permission,
      • etc.

Slide 12: i-node structure
  • File information: size (in bytes), owner UID and GID, relevant times (3), link and block counts, permissions.
  • Direct pointers to the beginning file blocks (e.g., 12 such direct pointers).
  • Single indirect pointer, double indirect pointer, and triple indirect pointer: pointers to the next file blocks.
  • An i-node occupies 64 or 128 bytes (system dependent); a block is typically 8 K.
  • The i-node does not contain the file name. Why not?
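To see how far the direct and indirect pointers reach, the arithmetic can be sketched as follows. The function name is hypothetical, and the 4-byte block pointer used in the example call is an assumption (the slide gives only the 8 K block size and the 12 direct pointers).

```c
#include <stdint.h>

/* Hypothetical helper: number of data blocks addressable from one i-node
 * that has ndirect direct pointers plus one single, one double, and one
 * triple indirect pointer. Each indirect block holds
 * block_size / ptr_size block pointers. */
uint64_t max_file_blocks(uint64_t block_size, uint64_t ptr_size,
                         uint64_t ndirect)
{
    uint64_t p = block_size / ptr_size;   /* pointers per indirect block */

    return ndirect          /* reached directly from the i-node */
         + p                /* via the single indirect block    */
         + p * p            /* via the double indirect block    */
         + p * p * p;       /* via the triple indirect block    */
}
```

With 8192-byte blocks and the assumed 4-byte pointers, each indirect block holds 2048 pointers, so max_file_blocks(8192, 4, 12) is 12 + 2048 + 2048^2 + 2048^3 = 8594130956 blocks, or roughly 64 TiB of data. Real filesystems may impose lower limits for other reasons.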

Slide 13: Where is the i-node stored, and how can we find the i-node of a file?
  • i-nodes are kept at the front of each region of disk that contains a UNIX file system.
        Filesystem layout: boot block | superblock | i-node 1 ... i-node k | data block 1 ... data block n
  • A file has a number called its i-number, which is an index into the array of i-nodes on disk.
      • Each i-node corresponds to (is indexed by) a unique i-number.
  • How can we find the i-number of a file, given the file name?

Slide 14: Directory: mapping filename to i-number
  • A directory contains a list of entries mapping file names to i-numbers.
      • A (file name, i-number) pair is called a link.
      • You can create many links for one file (multiple names).
  • i-nodes are the hidden part, while directories are the visible structure, of the UNIX file system.

Slide 15: Directory: mapping filename to i-number
Partial slice of the directory "dirA":
    i-number    file name
    123         .
    247         ..
    220         dirB
    230         My1.dat
    230         My2.dat

Slide 16: Directory: mapping filename to i-number
  • Now, can you see why the i-node itself does not contain the file name? The correspondence between a file name and its i-node is maintained in the directory.
  • Name resolution is the function that converts a pathname to an i-node number.
      • When a program references a file by pathname, the OS traverses the filesystem tree to find the filename and the i-number in the appropriate directory.
      • Once it has the i-number, the OS can determine other information about the file by accessing the i-node.
      • Usually, the OS keeps copies of the i-nodes of active files in memory for efficiency.
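The result of name resolution is visible to a program through stat(), which reports the i-number in the st_ino field. The sketch below (the helper name same_file is an assumption) checks whether two pathnames resolve to the same i-node, which is the case for two links to one file.

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if both pathnames resolve to the same
 * i-node on the same device, 0 if they do not, and -1 if either
 * pathname cannot be resolved. */
int same_file(const char *a, const char *b)
{
    struct stat sa, sb;

    if (stat(a, &sa) == -1 || stat(b, &sb) == -1)
        return -1;
    /* i-numbers are unique only within one filesystem, so the device
     * number is compared as well. */
    return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
}
```

For instance, same_file("/", "/.") returns 1, because the "." entry in the root directory is a link to the root directory's own i-node. In the root directory the ".." entry also leads back to the root, so same_file("/", "/..") returns 1 as well.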

Slide 17: Directory: mapping filename to i-number
  • All devices are represented as files located in the directory /dev:
      • /dev/tty is the terminal
      • /dev/stdin, /dev/stdout, /dev/stderr
  • Directories are disk-resident files that can be read by any program. In many respects they are treated the same way as regular files.

Slide 18: How are files accessed in user programs?
  • Within C programs a file is accessed either through a file pointer or through a file descriptor; both provide logical names (handles) for performing device-independent I/O.
  • The ANSI C I/O library uses file pointers (via fopen, fscanf, fprintf, fread, fwrite, fclose, and so on).
  • The Unix file system calls use file descriptors (via open, read, write, close, and ioctl).
  • The standard file pointers (in stdio.h) are stdin, stdout, and stderr, while the corresponding file descriptors (in unistd.h) are STDIN_FILENO, STDOUT_FILENO, and STDERR_FILENO.

Slide 19: File descriptors
  • The open() system call returns a file descriptor (an integer) associated with a file. For example, the following code segment opens the file /home/ann/my.dat for reading:
        #include <sys/types.h>
        #include <sys/stat.h>
        #include <fcntl.h>

        int myfd;
        myfd = open("/home/ann/my.dat", O_RDONLY);
  • A file descriptor is an index into the file descriptor table (FDT) of the process.
  • Entries of the FDT contain pointers to entries in the system file table (SFT), which is shared by the whole system.
  • When a file is opened, an entry is created in the system file table.

Slide 20: File descriptors
  • An SFT entry contains information about whether the file is open for reading or writing, its protection and locks, the file offset (where the next data is read from or written to in the file), etc.
  • Several entries in the SFT may point to the same physical file.

Slide 21: File descriptors
  • Three files are opened automatically:
      • STDIN_FILENO: standard input
      • STDOUT_FILENO: standard output
      • STDERR_FILENO: standard error
      • The corresponding constants in unistd.h are 0, 1, and 2.
  • When a new file is opened, it is assigned the lowest available file descriptor.
  • Accessing a file for I/O is a three-step process, whether it is a regular file or a device:
      • Open the file for I/O.
      • Read from and write to the file.
      • Close the file when finished with I/O.
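The lowest-available rule can be observed directly. The following sketch (the function name is an assumption) opens a path twice, closes the first descriptor, and opens the path again; in a single-threaded program the third open is expected to reuse the slot that was just freed.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if a descriptor number freed by close()
 * is handed out again by the next open(), 0 if not, and -1 if the path
 * cannot be opened for reading. */
int reopen_reuses_lowest_fd(const char *path)
{
    int a = open(path, O_RDONLY);          /* lowest free slot */
    if (a == -1)
        return -1;

    int b = open(path, O_RDONLY);          /* next free slot, above a */
    if (b == -1) {
        close(a);
        return -1;
    }

    close(a);                              /* slot a is free again */

    int c = open(path, O_RDONLY);          /* should land in slot a */
    if (c == -1) {
        close(b);
        return -1;
    }

    int reused = (c == a);
    close(b);
    close(c);
    return reused;
}
```

This property is what makes the close-then-dup idiom for redirection work: closing descriptor 1 guarantees that the next descriptor handed out is 1, provided descriptor 0 is still open.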

Slide 22: open()
    fd = open(path, flags, mode)
  • path: char *, an absolute or relative path
  • flags:
      • O_RDONLY: open for reading
      • O_WRONLY: open for writing
      • O_RDWR: open for reading and writing
      • O_CREAT: create the file if it doesn't exist
      • O_TRUNC: truncate the file if it exists (overwrite)
      • O_APPEND: only write at the end of the file
  • mode: specifies the permissions when O_CREAT is used
  • Returns the newly assigned file descriptor, or -1 if an error occurred
        fd = open("myFile", O_WRONLY | O_CREAT, 0644);

Slide 23: read()
    bytes = read(fd, buffer, count)
  • Reads from the file associated with fd and places up to count bytes into buffer
      • fd: file descriptor to read from
      • buffer: pointer to an array
      • count: number of bytes to read
  • Returns the number of bytes read (0 at end of file), or -1 if an error occurred
        int fd = open("someFile", O_RDONLY);
        char buffer[4];
        int bytes = read(fd, buffer, 4 * sizeof(char));

Slide 24: write()
    bytes = write(fd, buffer, count)
  • Writes the contents of buffer to the file associated with fd
      • fd: file descriptor
      • buffer: pointer to an array
      • count: number of bytes to write
  • Returns the number of bytes written, or -1 if an error occurred
        int fd = open("someFile", O_WRONLY);
        char buffer[4] = { 'd', 'a', 't', 'a' };   /* fill the buffer before writing it */
        int bytes = write(fd, buffer, 4 * sizeof(char));

Slide 25: close()
    return_val = close(fd)
  • Closes an open file descriptor
  • Returns 0 on success, -1 on error
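Putting open(), read(), write(), and close() together, a file copy might be sketched as below. The function name copy_file and the 4096-byte buffer are assumptions for illustration. The inner loop is there because write() may write fewer bytes than requested; retrying after an interrupted call (EINTR) is left out to keep the sketch short.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: copy src to dst with the file system calls.
 * Returns the number of bytes copied, or -1 on any error. */
long copy_file(const char *src, const char *dst)
{
    int in = open(src, O_RDONLY);
    if (in == -1)
        return -1;

    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out == -1) {
        close(in);
        return -1;
    }

    char buf[4096];
    long total = 0;
    ssize_t n;

    /* read() returns 0 at end of file and -1 on error. */
    while ((n = read(in, buf, sizeof buf)) > 0) {
        ssize_t off = 0;
        while (off < n) {                  /* handle partial writes */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w == -1) {
                close(in);
                close(out);
                return -1;
            }
            off += w;
        }
        total += n;
    }

    close(in);
    if (close(out) == -1 || n == -1)
        return -1;
    return total;
}
```

The same function works unchanged when src or dst names a device file, which is the device independence the earlier slides describe.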

Slide 26: File pointers
  • A file pointer points to a data structure of type FILE, called a file structure, in the user area of the process.
  • A file structure contains a buffer and a file descriptor (so a file pointer is a handle to a handle).
  • The following code segment opens the file /home/ann/my.dat for output and then writes a string to the file:
        #include <stdio.h>

        FILE *myfp;
        if ((myfp = fopen("/home/ann/my.dat", "w")) == NULL)
            fprintf(stderr, "Could not fopen file\n");
        else
            fprintf(myfp, "This is a test");

Slide 27: File pointers
[Figure: myfp points to the file structure for /home/ann/my.dat, which holds the buffer "This is a test" and file descriptor 3; entry 3 of the file descriptor table points to the system file table.]
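The relationship between a file pointer and its descriptor can be inspected with the POSIX function fileno(). The sketch below is illustrative only; the helper name write_via_stream is an assumption.

```c
#define _POSIX_C_SOURCE 200809L   /* makes fileno() visible in strict C modes */
#include <stdio.h>

/* Hypothetical helper: write text to path through a FILE stream and
 * return the descriptor number the stream was using, or -1 on error. */
int write_via_stream(const char *path, const char *text)
{
    FILE *fp = fopen(path, "w");
    if (fp == NULL) {
        perror("fopen");
        return -1;
    }

    int fd = fileno(fp);              /* the descriptor inside the FILE */

    if (fputs(text, fp) == EOF) {     /* goes into the stream's buffer  */
        fclose(fp);
        return -1;
    }
    if (fclose(fp) == EOF)            /* flushes the buffer, closes fd  */
        return -1;
    return fd;
}
```

In a process whose three standard descriptors are open and that has no other files open, the returned value would be 3, matching the figure above. The text only reaches the file when the buffer is flushed, here by fclose().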

Slide 28: File pointers and FILE
  • File pointers are used by the following higher-level I/O functions in the C library:
      • fopen()
      • printf()
      • scanf()
      • fclose()
  • These use the FILE data type.

Slide 29: fopen() and fclose()
    FILE *file_stream = fopen(path, mode)
  • path: char *, an absolute or relative path
  • mode:
      • r: open file for reading
      • r+: open file for reading and writing
      • w: overwrite file or create file for writing
      • w+: open for reading and writing; overwrites file
      • a: open file for appending (writing at end of file)
      • a+: open file for appending and reading
    fclose(file_stream)
  • Closes an open file stream

Slide 30: printf()
    printf(formatted_string, ...)
  • formatted_string: a string that describes the output information
      • variable types are escaped with % (see the list of types on Slide 32)
  • The string is followed by as many expressions as are referenced in the formatted string.

Slide 31: printf()
        int term = 15;
        printf("Twice %d is %d\n", term, 2*term);
  • "Twice %d is %d\n" is the formatted string.
  • term and 2*term are the expressions.

Slide 32: Escaping variable types
  • %d, %i: decimal integer
  • %u: unsigned decimal integer
  • %o: unsigned octal integer
  • %x, %X: unsigned hexadecimal integer
  • %c: character
  • %s: string or character array
  • %f: floating point (decimal notation)
  • %e, %E: floating point (scientific notation)
  • %g, %G: floating point (the shorter of the %f and %e styles)
  • %%: outputs a % character

Slide 33: printf() examples
  • printf("The sum of %d, %d, and %d is %d\n", 65, 87, 33, 65+87+33);
      • Output: The sum of 65, 87, and 33 is 185
  • printf("Error %s occurred at line %d\n", emsg, lno);
      • emsg and lno are variables
      • Output: Error invalid variable occurred at line 27
  • printf("Hexadecimal form of %d is %x\n", 59, 59);
      • Output: Hexadecimal form of 59 is 3b

Slide 34: scanf()
    scanf(formatted_string, ...)
  • The syntax is similar to printf, except that the formatted string describes the data being read in.
  • Variables must be passed by reference (as pointers).
  • Example:
        scanf("%d %c %s", &int_var, &char_var, string_var);

Slide 35: printf() and scanf() families
    fprintf(file_stream, formatted_string, ...)
  • Prints to a file stream instead of stdout
    sprintf(char_array, formatted_string, ...)
  • Prints to a character array instead of stdout
    fscanf(file_stream, formatted_string, ...)
  • Reads from a file stream instead of stdin
    sscanf(char_array, formatted_string, ...)
  • Reads from a string instead of stdin
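A small round trip through the string variants, as an illustration. The helper names and the "(x, y)" text format are assumptions; snprintf is used in place of the slide's sprintf because it takes the size of the array and so cannot overflow it.

```c
#include <stdio.h>

/* Hypothetical helper: format a point as "(x, y)" into buf.
 * Returns the number of characters written (excluding the '\0'),
 * or -1 if buf is too small. */
int format_point(char *buf, size_t size, int x, int y)
{
    int n = snprintf(buf, size, "(%d, %d)", x, y);

    return (n >= 0 && (size_t)n < size) ? n : -1;
}

/* Hypothetical helper: parse "(x, y)" from a string.
 * Returns 0 if both integers were read, -1 otherwise. */
int parse_point(const char *s, int *x, int *y)
{
    /* sscanf returns the number of successful conversions. */
    return sscanf(s, "(%d, %d)", x, y) == 2 ? 0 : -1;
}
```

Formatting the point (3, -4) produces the 7-character string "(3, -4)", and parsing that string gives back 3 and -4. As with scanf, the integers are passed by reference.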

Slide 36: I/O redirection
  • Recall: to access a file, a process uses a file descriptor, which is an index into the process's file descriptor table, whose entry in turn points to an entry in the system file table.
  • Redirection means that the process modifies its file descriptor table entry so that it points to a different entry in the system file table.
  • Consider the command cat, which reads from a file and echoes it to standard output. The following command redirects standard output to my.file:
        cat test > my.file

Slide 37: Use "dup()" to implement "redirection"
    int dup(int fd)
  • dup() is a "smart" function that duplicates the file descriptor fd to the lowest-numbered unused file descriptor in the file descriptor table.


Slide 39: I/O redirection
File descriptor table before redirection (cat test):
    [0] standard input
    [1] standard output
    [2] standard error
File descriptor table after redirection (cat test > my.file):
    [0] standard input
    [1] write to my.file
    [2] standard error
Code:
    fd = open("my.file", O_WRONLY | O_CREAT | O_TRUNC, 0644);  /* create the file my.file, open for writing */
    close(1);                                                  /* close 1 (standard output) */
    dup(fd);                                                   /* duplicate fd into slot 1 */
    execl("/bin/cat", "cat", "test", (char *)NULL);
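The close(1) and dup(fd) steps can be tried without exec by redirecting standard output for a single write and then restoring it. This is a sketch only: the function name write_redirected is an assumption, and it relies on descriptor 0 being open so that 1 is the lowest free slot after close(1).

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical demo: write msg to path by temporarily pointing file
 * descriptor 1 at that file. Returns 0 on success, -1 on failure. */
int write_redirected(const char *path, const char *msg)
{
    int saved = dup(STDOUT_FILENO);        /* remember the real stdout */
    if (saved == -1)
        return -1;

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd == -1) {
        close(saved);
        return -1;
    }

    close(STDOUT_FILENO);                  /* free slot 1              */
    int copy = dup(fd);                    /* lowest free slot: 1      */
    close(fd);                             /* slot 1 keeps the file    */

    int ok = 0;
    if (copy == STDOUT_FILENO) {
        size_t len = strlen(msg);
        ok = write(STDOUT_FILENO, msg, len) == (ssize_t)len;
    } else if (copy != -1) {
        close(copy);                       /* landed in the wrong slot */
    }

    dup2(saved, STDOUT_FILENO);            /* put the real stdout back */
    close(saved);
    return ok ? 0 : -1;
}
```

The write goes to descriptor 1 exactly as it would in an unredirected program; only the table entry behind that number has changed. In new code, dup2(fd, 1) is often preferred to the close-then-dup pair because it performs both steps in a single call.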

Slide 40: Communication between parent and child via a pipe
  • The system call pipe() returns two file descriptors through which we can access the input and output of a pipe (an I/O mechanism):
        int fd[2];
        int pipe(int fd[2]);
      • Returns 0 on success, -1 on error
      • fd[1] is the end of the pipe for writing
      • fd[0] is the end of the pipe for reading
  • Question: how would you implement "ls -l | wc -l"?
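One possible answer to the question, sketched under stated assumptions: the parent creates the pipe and forks two children, each child redirects one standard descriptor onto an end of the pipe and then execs its program. The function name run_pipeline is hypothetical, dup2() is used in place of the close/dup pair, and error handling is kept minimal.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run "left | right".
 * Returns the exit status of right, or -1 on error. */
int run_pipeline(char *const left[], char *const right[])
{
    int fd[2];
    if (pipe(fd) == -1)
        return -1;

    pid_t lpid = fork();
    if (lpid == -1) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (lpid == 0) {
        /* Left child: standard output becomes the write end. */
        dup2(fd[1], STDOUT_FILENO);
        close(fd[0]);
        close(fd[1]);
        execvp(left[0], left);
        _exit(127);                        /* exec failed */
    }

    pid_t rpid = fork();
    if (rpid == -1) {
        close(fd[0]);
        close(fd[1]);
        waitpid(lpid, NULL, 0);
        return -1;
    }
    if (rpid == 0) {
        /* Right child: standard input becomes the read end. */
        dup2(fd[0], STDIN_FILENO);
        close(fd[0]);
        close(fd[1]);
        execvp(right[0], right);
        _exit(127);                        /* exec failed */
    }

    /* The parent must close both ends; otherwise the reader never
     * sees end of file, because a write end would still be open. */
    close(fd[0]);
    close(fd[1]);

    int status = 0;
    waitpid(lpid, NULL, 0);
    if (waitpid(rpid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A call for the slide's pipeline would pass the two argument vectors {"ls", "-l", NULL} and {"wc", "-l", NULL}; the line count is then printed on the caller's standard output by wc.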

Slide 41: End of this section
But is this all there is to programming with UNIX files? No! Check out:
  • the details of the C library functions and system calls for creating, opening, reading, writing, locating, and closing files and directories;
  • the C library functions and system calls for renaming and deleting files.