The prototype of readdir is:
struct dirent *readdir(DIR *dirp);
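Because an implementation is allowed to return a pointer to internal (possibly static) storage, readdir is not considered a thread-safe function. The POSIX standard [i] describes it as follows: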
The application shall not modify the structure to which the return value of readdir() points, nor any storage areas pointed to by pointers within the structure. The returned pointer, and pointers within the structure, might be invalidated or the structure or the storage areas might be overwritten by a subsequent call to readdir() on the same directory stream. They shall not be affected by a call to readdir() on a different directory stream.
If a file is removed from or added to the directory after the most recent call to opendir() or rewinddir(), whether a subsequent call to readdir() returns an entry for that file is unspecified.
The readdir() function need not be thread-safe.
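The practical consequence of the first quoted paragraph is that a caller who needs an entry name beyond the next readdir() call has to copy it. The following is a minimal sketch of that idea, not a definitive implementation: collect_names and its parameters are hypothetical names invented for this illustration, and for brevity a read error is treated the same as end of directory.

```c
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy up to `max` entry names from `path` into `out`.
 * Returns the number of names stored, or -1 if the directory cannot be
 * opened or memory runs out. The caller frees each out[i].
 * Simplification: a readdir() error is treated like end of directory. */
int collect_names(const char *path, char **out, int max)
{
    assert(path != NULL && out != NULL);

    DIR *dirp = opendir(path);
    if (dirp == NULL)
        return -1;

    int n = 0;
    struct dirent *ent;
    while (n < max && (ent = readdir(dirp)) != NULL) {
        /* ent->d_name may be overwritten by the next readdir() on dirp,
         * so keep a private copy instead of storing the pointer. */
        char *copy = malloc(strlen(ent->d_name) + 1);
        if (copy == NULL) {
            while (n > 0)
                free(out[--n]);
            closedir(dirp);
            return -1;
        }
        strcpy(copy, ent->d_name);
        out[n++] = copy;
    }

    closedir(dirp);
    return n;
}
```

Storing ent->d_name directly in out[] would compile, but every slot could end up pointing at storage that a later readdir() call has already reused.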
That is why readdir_r was introduced. Its prototype is:
int readdir_r(DIR *dirp, struct dirent *entry, struct dirent **result);
readdir_r writes its result into the caller-supplied entry buffer, which is what makes it thread-safe.
However, the official GNU documentation [ii] says the following:
In POSIX.1-2008, readdir is not thread-safe. In the GNU C Library implementation, it is safe to call readdir concurrently on different dirstreams, but multiple threads accessing the same dirstream result in undefined behavior. readdir_r is a fully thread-safe alternative, but suffers from poor portability (see below). It is recommended that you use readdir, with external locking if multiple threads access the same dirstream.
Portability Note: It is recommended to use readdir instead of readdir_r for the following reasons:
On systems which do not define NAME_MAX, it may not be possible to use readdir_r safely because the caller does not specify the length of the buffer for the directory entry.
On some systems, readdir_r cannot read directory entries with very long names. If such a name is encountered, the GNU C Library implementation of readdir_r returns with an error code of ENAMETOOLONG after the final directory entry has been read. On other systems, readdir_r may return successfully, but the d_name member may not be NUL-terminated or may be truncated.
POSIX-1.2008 does not guarantee that readdir is thread-safe, even when access to the same dirstream is serialized. But in current implementations (including the GNU C Library), it is safe to call readdir concurrently on different dirstreams, so there is no need to use readdir_r in most multi-threaded programs. In the rare case that multiple threads need to read from the same dirstream, it is still better to use readdir and external synchronization.
It is expected that future versions of POSIX will obsolete readdir_r and mandate the level of thread safety for readdir which is provided by the GNU C Library and other implementations today.
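Although POSIX does not guarantee that readdir is thread-safe, current implementations (including the GNU C Library) make it safe to call readdir concurrently on different dirstreams (dirp). Multi-threaded programs therefore have little reason to use readdir_r. Even in the rare case where several threads must share one dirstream, readdir combined with external synchronization (locking) is the better choice. Future versions of the POSIX standard are expected to obsolete readdir_r.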
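For the rare shared-dirstream case, "readdir plus external synchronization" could look roughly like the sketch below. The struct shared_dir type and the shared_dir_next function are hypothetical names made up for this example. The sketch assumes every thread goes through this wrapper and never calls readdir on the stream directly.

```c
#include <assert.h>
#include <dirent.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper: a directory stream plus the lock that guards it. */
struct shared_dir {
    DIR *dirp;
    pthread_mutex_t lock;
};

/* Copy the next entry name into buf (size bytes).
 * Returns 1 if a name was copied, 0 at end of directory or on a read
 * error, and -1 if buf is too small for the name. */
int shared_dir_next(struct shared_dir *sd, char *buf, size_t size)
{
    assert(sd != NULL && sd->dirp != NULL && buf != NULL);
    int rc = 0;

    pthread_mutex_lock(&sd->lock);
    struct dirent *ent = readdir(sd->dirp);
    if (ent != NULL) {
        size_t len = strlen(ent->d_name);
        if (len < size) {
            /* Copy while still holding the lock: another thread's
             * readdir() on the same stream may overwrite *ent later. */
            memcpy(buf, ent->d_name, len + 1);
            rc = 1;
        } else {
            rc = -1;
        }
    }
    pthread_mutex_unlock(&sd->lock);
    return rc;
}
```

The important detail is that the copy happens inside the critical section. Returning the struct dirent pointer and unlocking first would reintroduce the race the lock is meant to prevent.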
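Apart from being unnecessary for thread safety, readdir_r also has portability drawbacks; for example, on some systems it cannot handle directory entries with very long names.
Moreover, POSIX mandates very few members of struct dirent. d_name is among them, but its length is left unspecified (on Linux, d_name is declared as an array of 256 bytes). On some systems, therefore, using readdir_r requires allocating the memory for entry like this: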
name_max = pathconf(dirpath, _PC_NAME_MAX);
if (name_max == -1)         /* Limit not defined, or error */
    name_max = 255;         /* Take a guess */
len = offsetof(struct dirent, d_name) + name_max + 1;
entryp = malloc(len);
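However, this approach has problems of its own [iii].
So the conclusion is:
As long as multiple threads do not share the same dirstream, use readdir wherever possible; it is both simpler and safer. When threads must share one, readdir plus a lock is still preferable to readdir_r.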
References:
[i] http://pubs.opengroup.org/onlinepubs/9699919799/
[ii] https://www.gnu.org/software/libc/manual/html_mono/libc.html#Reading_002fClosing-Directory
[iii] http://elliotth.blogspot.co.uk/2012/10/how-not-to-use-readdirr3.html