The st_size member of the stat structure contains the size of the file in bytes. This field is meaningful only for regular files, directories, and symbolic links.
Solaris also defines the file size for a pipe as the number of bytes that are available for reading from the pipe. We'll discuss pipes in Section 15.2.
For a regular file, a file size of 0 is allowed. We'll get an end-of-file indication on the first read of the file.
For a directory, the file size is usually a multiple of a number, such as 16 or 512. We talk about reading directories in Section 4.21.
For a symbolic link, the file size is the number of bytes in the filename. For example, in the following case, the file size of 7 is the length of the pathname usr/lib:
lrwxrwxrwx 1 root 7 Sep 25 07:14 lib -> usr/lib
(Note that symbolic links do not contain the normal C null byte at the end of the name, as the length is always specified by st_size.)
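Because the link contents carry no terminating null byte, a program that wants a C string has to add one itself. The following is a minimal sketch of that idea, not a definitive implementation: the helper name read_link_target is hypothetical, and the code assumes the file system reports the target length in st_size (some special file systems report 0 for symbolic links, in which case this helper simply fails).

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: return a malloc'ed, null-terminated copy of the
 * target of the symbolic link at path, or NULL on error. The caller
 * frees the result.
 */
char *read_link_target(const char *path)
{
    struct stat sb;
    char *buf;
    ssize_t n;

    assert(path != NULL);

    /* lstat, not stat: we want the link itself, not what it points to. */
    if (lstat(path, &sb) < 0 || !S_ISLNK(sb.st_mode))
        return NULL;

    /* st_size is the length of the target without a null byte. */
    buf = malloc((size_t)sb.st_size + 1);
    if (buf == NULL)
        return NULL;

    n = readlink(path, buf, (size_t)sb.st_size + 1);
    if (n < 0 || n > sb.st_size) {  /* error, or link changed meanwhile */
        free(buf);
        return NULL;
    }
    buf[n] = '\0';                  /* readlink does not add the null byte */
    return buf;
}
```

For the link shown above, read_link_target("lib") would be expected to return the string usr/lib, whose length matches the st_size of 7.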
Most contemporary UNIX systems provide the fields st_blksize and st_blocks. The former is the preferred block size for I/O for the file, and the latter is the actual number of 512-byte blocks that are allocated. Recall from Section 3.9 that we encountered the minimum amount of time required to read a file when we used st_blksize for the read operations. The standard I/O library, which we describe in Chapter 5, also tries to read or write st_blksize bytes at a time, for efficiency.
Be aware that different versions of the UNIX System use units other than 512-byte blocks for st_blocks. Using this value is nonportable.
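To see these three fields side by side, we can print them for any pathname. This is only a sketch: the helper name print_sizes is hypothetical, and the casts are there because the underlying types (off_t, blksize_t, blkcnt_t) vary in width between systems.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: print the size-related members of the stat
 * structure for path. Returns 0 on success, -1 if stat fails.
 */
int print_sizes(const char *path)
{
    struct stat sb;

    assert(path != NULL);

    if (stat(path, &sb) < 0) {
        perror(path);
        return -1;
    }

    /* The unit of st_blocks is often 512 bytes, but that is not guaranteed. */
    printf("%s: st_size=%lld st_blksize=%ld st_blocks=%lld\n",
           path, (long long)sb.st_size, (long)sb.st_blksize,
           (long long)sb.st_blocks);
    return 0;
}
```

Calling print_sizes on a regular file and on a directory is a quick way to compare the reported size with the space actually allocated on a given system.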
Holes in a File
In Section 3.6, we mentioned that a regular file can contain "holes." We showed an example of this in Figure 3.2. Holes are created by seeking past the current end of file and writing some data. As an example, consider the following:
$ ls -l core
-rw-r--r-- 1 sar 8483248 Nov 18 12:18 core
$ du -s core
272 core
The size of the file core is just over 8 MB, yet the du command (which estimates file space usage) reports that the amount of disk space used by the file is 272 512-byte blocks (139,264 bytes). (The du command on many BSD-derived systems reports the number of 1,024-byte blocks; Solaris reports the number of 512-byte blocks, as does CentOS.) Obviously, this file has many holes.
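A file like this can be produced deliberately by seeking past the end of file before writing. The sketch below is an illustration under stated assumptions, not the program that produced core: the helper names make_file_with_hole and allocated_bytes are hypothetical, and allocated_bytes assumes that st_blocks counts 512-byte units, which is not portable.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: create path holding one byte, then a hole of
 * hole_len bytes, then one more byte. Returns 0 on success, -1 on error.
 */
int make_file_with_hole(const char *path, off_t hole_len)
{
    int fd, ok;

    assert(path != NULL && hole_len >= 0);

    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* Seeking past the end of file and then writing leaves a hole. */
    ok = write(fd, "a", 1) == 1 &&
         lseek(fd, hole_len, SEEK_CUR) != (off_t)-1 &&
         write(fd, "z", 1) == 1;

    if (close(fd) < 0)
        ok = 0;
    return ok ? 0 : -1;
}

/* Bytes allocated on disk, ASSUMING st_blocks is in 512-byte units. */
long long allocated_bytes(const struct stat *sb)
{
    return (long long)sb->st_blocks * 512;
}
```

After creating a file with a large hole, comparing st_size with allocated_bytes shows whether the file system stored the hole without allocating blocks for it. File systems that do not support holes allocate the full size.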
As we mentioned in Section 3.6, the read function returns data bytes of 0 for any byte positions that have not been written. If we execute the following, we can see that the normal I/O operations read up through the size of the file:
$ wc -c core
8483248 core
The wc(1) command with the -c option counts the number of characters (bytes) in the file.
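The same behavior can be checked from a program: reading a file with holes yields every byte position up to the file size, with zeros where nothing was written. The following is a minimal sketch; the helper name count_bytes and the 4,096-byte buffer are arbitrary choices made for illustration.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: read path through end of file, storing the
 * number of bytes read in *total and the number of nonzero bytes in
 * *nonzero. Returns 0 on success, -1 on error.
 */
int count_bytes(const char *path, long long *total, long long *nonzero)
{
    char buf[4096];
    ssize_t n, i;
    int fd;

    assert(path != NULL && total != NULL && nonzero != NULL);

    *total = 0;
    *nonzero = 0;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* Holes are returned by read as ordinary bytes with the value 0. */
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        *total += n;
        for (i = 0; i < n; i++)
            if (buf[i] != 0)
                (*nonzero)++;
    }

    close(fd);
    return n < 0 ? -1 : 0;
}
```

For a file with holes, *total should equal st_size, as wc -c reports, while *nonzero can be far smaller.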
If we make a copy of this file, using a utility such as cat(1), all these holes are written out as actual data bytes of 0:
$ cat core > core.copy
$ ls -l core*
-rw-r--r-- 1 sar 8483248 Nov 18 12:18 core
-rw-rw-r-- 1 sar 8483248 Nov 18 12:27 core.copy
$ du -s core*
272 core
16592 core.copy
Here, the actual number of bytes used by the new file is 8,495,104 (512 × 16,592). The difference between this size and the size reported by ls is caused by the number of blocks used by the file system to hold pointers to the actual data blocks.
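A plain read and write loop has the same effect as the cat command here, because read returns the holes as zero bytes and write stores them as real data. The sketch below is an approximation of such a copy, not the source of cat: the helper name plain_copy is hypothetical, it uses the st_blksize of the source as its buffer size (falling back to an assumed 4,096 bytes), and it treats a short write as an error for brevity.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy src to dst with a plain read/write loop.
 * Holes in src are written to dst as actual zero bytes.
 * Returns 0 on success, -1 on error.
 */
int plain_copy(const char *src, const char *dst)
{
    struct stat sb;
    char *buf;
    size_t bufsize;
    ssize_t n = -1;
    int in, out, rc = -1;

    assert(src != NULL && dst != NULL);

    if ((in = open(src, O_RDONLY)) < 0)
        return -1;
    if ((out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644)) < 0) {
        close(in);
        return -1;
    }

    /* Prefer the I/O size the file system suggests for this file. */
    bufsize = (fstat(in, &sb) == 0 && sb.st_blksize > 0)
                  ? (size_t)sb.st_blksize : 4096;

    buf = malloc(bufsize);
    if (buf != NULL) {
        while ((n = read(in, buf, bufsize)) > 0) {
            if (write(out, buf, (size_t)n) != n) {  /* short write: give up */
                n = -1;
                break;
            }
        }
        free(buf);
    }

    close(in);
    if (close(out) == 0 && n == 0)  /* n == 0 means we reached end of file */
        rc = 0;
    return rc;
}
```

The copy has the same st_size as the original. On file systems that support holes, its st_blocks value would typically be much larger, as the du output above shows.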