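The ability to pass an open file descriptor between processes is very useful, and it allows client-server applications to be designed in different ways. One process (typically a server) can do everything required to open a file (such as translating a network name to a network address, dialing a modem, or negotiating file locks) and then hand back to the calling process a descriptor that works with all subsequent I/O functions. All the details of opening the file or device are hidden from the client.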
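Let us look more closely at what it means to "pass an open file descriptor" from one process to another. Recall Figure 3-2 in http://www.cnblogs.com/nufangrensheng/p/3498509.html, which shows two processes that have opened the same file. Although they share the same v-node, each process has its own file table entry.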
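When we pass an open file descriptor from one process to another, we want the sending process and the receiving process to share the same file table entry. Figure 17-8 shows the desired arrangement.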
Figure 17-8 Passing an open file from the top process to the bottom process
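Technically, the sending process passes the receiver a pointer to an open file table entry. That pointer is assigned to the first available descriptor slot in the receiving process. (Do not assume that the descriptor number is the same in both processes; usually it is not.) Two processes sharing an open file table entry is exactly what happens after a fork, when parent and child share open file table entries (see Figure 8-1 in http://www.cnblogs.com/nufangrensheng/p/3509492.html).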
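The sharing described above can be observed with fork, which the text names as the equivalent case. The following is a minimal sketch, assuming a POSIX system with a writable /tmp; the function name offset_seen_by_parent is made up for this illustration. The child moves the file offset, and the parent then sees the new offset because both descriptors refer to the same file table entry.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: open a temporary file, fork, let the child
 * seek to offset `target`, and return the offset the parent observes
 * afterward. Returns -1 on error.
 */
off_t offset_seen_by_parent(off_t target)
{
    char path[] = "/tmp/fdshareXXXXXX";
    int fd, status;
    pid_t pid;
    off_t seen;

    assert(target >= 0);
    if ((fd = mkstemp(path)) < 0)
        return -1;
    unlink(path);               /* file goes away with the last descriptor */

    if ((pid = fork()) < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0)               /* child: move the shared offset */
        _exit(lseek(fd, target, SEEK_SET) == target ? 0 : 1);

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) ||
        WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }
    seen = lseek(fd, 0, SEEK_CUR);  /* the parent never moved it itself */
    close(fd);
    return seen;
}
```

If each process had its own file table entry (as happens when both call open on the same path), the parent would still see offset 0.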
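After passing the descriptor to the receiver, the sending process usually closes it. Closing the descriptor in the sender does not close the file or device, because the file is still considered open by the receiving process (even if the receiver has not yet actually received the descriptor).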
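We define the following three functions to send and receive file descriptors. This section gives implementations of all three for both STREAMS and sockets.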
    #include "apue.h"

    int send_fd(int fd, int fd_to_send);
    int send_err(int fd, int status, const char *errmsg);
        Both return: 0 if OK, -1 on error

    int recv_fd(int fd, ssize_t (*userfunc)(int, const void *, size_t));
        Returns: file descriptor if OK, negative value on error
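A process (normally a server) that wants to pass a descriptor to another process calls either send_fd or send_err. The process waiting to receive the descriptor (the client) calls recv_fd.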
send_fd sends the descriptor fd_to_send across the STREAMS pipe or UNIX domain socket represented by fd.
send_err sends errmsg over fd, followed by the status byte. The value of status must be in the range -1 through -255.
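The client calls recv_fd to receive a descriptor. If all is well (the sender called send_fd), the function returns the non-negative descriptor. Otherwise, the return value is the status that was sent by send_err (a negative value in the range -1 through -255). In addition, if the server sent an error message, the client's own userfunc is called to process it. The first argument to userfunc is the constant STDERR_FILENO, followed by a pointer to the error message and its length. The return value of userfunc is the number of bytes written or a negative number on error. Clients often specify the ordinary write function as userfunc.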
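Any function with the same signature as write can serve as userfunc. As an illustration only, the sketch below collects the error text in memory instead of writing it to standard error; the names last_error and collect_error are hypothetical and do not appear in the book. It appends rather than overwrites, since recv_fd may invoke the callback more than once as data arrives.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/* Illustrative storage for the error text received so far (hypothetical). */
char last_error[256];

/*
 * A userfunc-compatible callback: instead of writing to the
 * descriptor it is given, it appends the bytes to last_error.
 * Returns the number of bytes consumed, or -1 if they do not fit.
 */
ssize_t collect_error(int fd, const void *buf, size_t nbytes)
{
    size_t used = strlen(last_error);

    (void)fd;                   /* recv_fd passes STDERR_FILENO here */
    assert(buf != NULL);
    if (nbytes >= sizeof(last_error) - used)
        return -1;              /* no room left, report failure */
    memcpy(last_error + used, buf, nbytes);
    last_error[used + nbytes] = '\0';
    return (ssize_t)nbytes;
}
```

A client would then call recv_fd(fd, collect_error) and inspect last_error when the return value is negative. Returning -1 from the callback makes recv_fd itself return -1, because recv_fd compares the callback's return value with the byte count it passed in.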
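These three functions use a protocol of our own. To send a descriptor, send_fd sends two bytes of 0, followed by the actual descriptor. To send an error, send_err sends the errmsg, followed by a byte of 0, followed by the absolute value of the status byte (1 through 255). recv_fd reads everything on the s-pipe (a full-duplex pipe that can be implemented as either a STREAMS pipe or a UNIX domain socket) up to the null byte. Any characters read before the null byte are passed to the caller's userfunc. The next byte read by recv_fd is the status byte. If the status byte is 0, a descriptor was passed; otherwise, there is no descriptor to receive.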
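The byte layout of this protocol can be sketched independently of any transport. The helpers below, frame_message and parse_frame, are hypothetical names introduced for this illustration; they assume the whole frame sits in one buffer, whereas the real recv_fd has to cope with data arriving across several reads.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: lay out errmsg (may be empty), a null byte,
 * and the status byte (0 = descriptor follows, 1-255 = error).
 * Returns the number of bytes written, or 0 if out is too small.
 */
size_t frame_message(char *out, size_t outlen, const char *errmsg,
                     unsigned char status)
{
    size_t n = strlen(errmsg);

    if (n + 2 > outlen)
        return 0;
    memcpy(out, errmsg, n);
    out[n] = 0;                     /* null byte ends the error text */
    out[n + 1] = (char)status;
    return n + 2;
}

/*
 * Hypothetical helper: parse one complete frame. Stores the length
 * of the error text in *msglen and returns the status byte (0-255),
 * or -1 if the null byte is missing or is not the next-to-last byte.
 */
int parse_frame(const char *buf, size_t len, size_t *msglen)
{
    const char *nul = memchr(buf, 0, len);

    assert(msglen != NULL);
    if (nul == NULL || (size_t)(nul - buf) + 2 != len)
        return -1;
    *msglen = (size_t)(nul - buf);
    return nul[1] & 0xFF;           /* mask to prevent sign extension */
}
```

The mask in parse_frame matters on platforms where char is signed: a status byte of 200 would otherwise come back as a negative number.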
The send_err function calls send_fd after writing the error message to the s-pipe, as shown in Listing 17-11.
Listing 17-11 The send_err function
    #include "apue.h"

    /*
     * Used when we had planned to send an fd using send_fd(),
     * but encountered an error instead. We send the error back
     * using the send_fd()/recv_fd() protocol.
     */
    int send_err(int fd, int errcode, const char *msg)
    {
        int n;

        if((n = strlen(msg)) > 0)
            if(writen(fd, msg, n) != n)    /* send the error message */
                return(-1);

        if(errcode >= 0)
            errcode = -1;                  /* must be negative */

        if(send_fd(fd, errcode) < 0)
            return(-1);

        return(0);
    }
1. Passing file descriptors over STREAMS-based pipes
File descriptors are exchanged over a STREAMS pipe with two ioctl commands: I_SENDFD and I_RECVFD. To send a descriptor, we set the third argument of ioctl to the actual descriptor.
Listing 17-12 The send_fd function for STREAMS pipes
    #include "apue.h"
    #include <stropts.h>

    /*
     * Pass a file descriptor to another process.
     * If fd < 0, then -fd is sent back instead as the error status.
     */
    int send_fd(int fd, int fd_to_send)
    {
        char buf[2];    /* send_fd()/recv_fd() 2-byte protocol */

        buf[0] = 0;     /* null byte flag to recv_fd() */
        if(fd_to_send < 0)
        {
            buf[1] = -fd_to_send;   /* nonzero status means error */
            if(buf[1] == 0)
                buf[1] = 1;         /* -256, etc. would screw up protocol */
        }
        else
        {
            buf[1] = 0;             /* zero status means OK */
        }

        if(write(fd, buf, 2) != 2)
            return(-1);

        if(fd_to_send >= 0)
            if(ioctl(fd, I_SENDFD, fd_to_send) < 0)
                return(-1);
        return(0);
    }
When we receive a descriptor, the third argument of ioctl is a pointer to a strrecvfd structure.
    struct strrecvfd {
        int   fd;       /* new descriptor */
        uid_t uid;      /* effective user ID of sender */
        gid_t gid;      /* effective group ID of sender */
        char  fill[8];
    };
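recv_fd reads the STREAMS pipe until the first byte of the 2-byte protocol (the null byte) is received. When we issue the I_RECVFD ioctl command, the next message on the stream head's read queue must be a descriptor sent by I_SENDFD; otherwise, the ioctl fails with an error.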
Listing 17-13 The recv_fd function for STREAMS pipes
    #include "apue.h"
    #include <stropts.h>

    /*
     * Receive a file descriptor from another process (a server).
     * In addition, any data received from the server is passed
     * to (*userfunc)(STDERR_FILENO, buf, nbytes). We have a
     * 2-byte protocol for receiving the fd from send_fd().
     */
    int recv_fd(int fd, ssize_t (*userfunc)(int, const void *, size_t))
    {
        int               newfd, nread, flag, status;
        char              *ptr;
        char              buf[MAXLINE];
        struct strbuf     dat;
        struct strrecvfd  recvfd;

        status = -1;
        for(;;)
        {
            dat.buf = buf;
            dat.maxlen = MAXLINE;
            flag = 0;
            if(getmsg(fd, NULL, &dat, &flag) < 0)
                err_sys("getmsg error");
            nread = dat.len;
            if(nread == 0)
            {
                err_ret("connection closed by server");
                return(-1);
            }
            /*
             * See if this is the final data with null & status.
             * Null must be next to last byte of buffer, status
             * byte is last byte. Zero status means there must
             * be a file descriptor to receive.
             */
            for(ptr = buf; ptr < &buf[nread]; )
            {
                if(*ptr++ == 0)
                {
                    if(ptr != &buf[nread - 1])
                        err_dump("message format error");
                    status = *ptr & 0xFF;   /* prevent sign extension */
                    if(status == 0)
                    {
                        if(ioctl(fd, I_RECVFD, &recvfd) < 0)
                            return(-1);
                        newfd = recvfd.fd;  /* new descriptor */
                    }
                    else
                    {
                        newfd = -status;
                    }
                    nread -= 2;
                }
            }
            if(nread > 0)
                if((*userfunc)(STDERR_FILENO, buf, nread) != nread)
                    return(-1);

            if(status >= 0)         /* final data has arrived */
                return(newfd);      /* descriptor, or -status */
        }
    }
2. Passing file descriptors over UNIX domain sockets
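To exchange file descriptors over UNIX domain sockets, we call the sendmsg(2) and recvmsg(2) functions (http://www.cnblogs.com/nufangrensheng/p/3567376.html). Both take a pointer to a msghdr structure that contains all the information about what to send or receive. The structure looks roughly like this: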
    struct msghdr {
        void          *msg_name;        /* optional address */
        socklen_t      msg_namelen;     /* address size in bytes */
        struct iovec  *msg_iov;         /* array of I/O buffers */
        int            msg_iovlen;      /* number of elements in array */
        void          *msg_control;     /* ancillary data */
        socklen_t      msg_controllen;  /* number of ancillary bytes */
        int            msg_flags;       /* flags for received message */
    };
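The first two members are typically used for sending datagrams over a network connection, where the destination address can be specified with each datagram. The next two members let us specify an array of buffers (scatter read and gather write), as described for the readv and writev functions (http://www.cnblogs.com/nufangrensheng/p/3559304.html). The msg_flags field contains flags describing the message received, as summarized in Table 16-9 (http://www.cnblogs.com/nufangrensheng/p/3567376.html).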
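The msg_iov and msg_iovlen members use the same struct iovec array as readv and writev. As a reminder of how a gather write behaves, here is a small sketch that uses writev on an ordinary pipe rather than sendmsg on a socket; the function name gather_demo is made up for this example.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Hypothetical helper: write two separate buffers with one writev
 * call (gather write) into a pipe, then read them back as a single
 * contiguous byte stream. Returns the number of bytes read, or -1.
 */
ssize_t gather_demo(char *out, size_t outlen)
{
    int p[2];
    char head[] = "hello, ";
    char tail[] = "world";
    struct iovec iov[2];
    ssize_t n = -1;

    assert(out != NULL);
    if (pipe(p) < 0)
        return -1;

    iov[0].iov_base = head;
    iov[0].iov_len = strlen(head);
    iov[1].iov_base = tail;
    iov[1].iov_len = strlen(tail);

    /* both buffers leave in one call, in array order */
    if (writev(p[1], iov, 2) == (ssize_t)(strlen(head) + strlen(tail)))
        n = read(p[0], out, outlen);

    close(p[0]);
    close(p[1]);
    return n;
}
```

The listings in this section only ever use a one-element array, pointing at the 2-byte protocol buffer.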
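Two members deal with sending and receiving control information: msg_control points to a cmsghdr (control message header) structure, and msg_controllen contains the number of bytes of control information.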
    struct cmsghdr {
        socklen_t  cmsg_len;    /* data byte count, including header */
        int        cmsg_level;  /* originating protocol */
        int        cmsg_type;   /* protocol-specific type */
        /* followed by the actual control message data */
    };
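To send a file descriptor, we set cmsg_len to the size of the cmsghdr structure plus the size of an integer (the descriptor). The cmsg_level field is set to SOL_SOCKET, and cmsg_type is set to SCM_RIGHTS to indicate that we are passing access rights. (SCM stands for socket-level control message.) Access rights can be passed only across a UNIX domain socket. The descriptor is stored right after the cmsg_type field, and the CMSG_DATA macro yields a pointer to this integer.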
Three macros are used to access the control data, and one macro helps calculate the value to use for cmsg_len.
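    #include <sys/socket.h>

    unsigned char *CMSG_DATA(struct cmsghdr *cp);
        Returns: pointer to the data associated with the cmsghdr structure

    struct cmsghdr *CMSG_FIRSTHDR(struct msghdr *mp);
        Returns: pointer to the first cmsghdr structure associated with
        the msghdr structure, or NULL if none exists

    struct cmsghdr *CMSG_NXTHDR(struct msghdr *mp, struct cmsghdr *cp);
        Returns: pointer to the next cmsghdr structure associated with
        the msghdr structure, given the current cmsghdr structure, or
        NULL if the current one is the last

    unsigned int CMSG_LEN(unsigned int nbytes);
        Returns: size to allocate for a data object nbytes large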
The Single UNIX Specification defines the first three macros but omits CMSG_LEN.
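The CMSG_LEN macro returns the number of bytes needed to store a data object of size nbytes (the control data). It adds the size of the cmsghdr structure (the control data header) to nbytes, adjusts for any alignment constraints of the processor architecture, and rounds up.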
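To make the arithmetic concrete, the sketch below reports the sizes involved in a control message that carries one descriptor. The three function names are hypothetical. CMSG_SPACE is not mentioned in the text above; it is assumed here because Linux and the BSDs provide it for sizing the buffer, including any trailing padding.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Value to store in cmsg_len for one descriptor. */
size_t fd_cmsg_len(void)
{
    return CMSG_LEN(sizeof(int));
}

/* Buffer space for one descriptor, trailing padding included. */
size_t fd_cmsg_space(void)
{
    return CMSG_SPACE(sizeof(int));
}

/* Distance from the start of the header to the data area. */
size_t fd_cmsg_data_offset(void)
{
    struct cmsghdr hdr;     /* placeholder; only its address is used */
    size_t off = (size_t)(CMSG_DATA(&hdr) - (unsigned char *)&hdr);

    assert(off >= sizeof(struct cmsghdr));  /* data follows the header */
    return off;
}
```

On common platforms CMSG_LEN(n) equals the data offset plus n, which is why the listings can store the descriptor at CMSG_DATA and set cmsg_len to CMSG_LEN(sizeof(int)).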
Listing 17-14 The send_fd function for UNIX domain sockets
    #include "apue.h"
    #include <sys/socket.h>

    /* size of control buffer to send/recv one file descriptor */
    #define CONTROLLEN  CMSG_LEN(sizeof(int))

    static struct cmsghdr *cmptr = NULL;    /* malloc'ed first time */

    /*
     * Pass a file descriptor to another process.
     * If fd < 0, then -fd is sent back instead as the error status.
     */
    int send_fd(int fd, int fd_to_send)
    {
        struct iovec   iov[1];
        struct msghdr  msg;
        char           buf[2];  /* send_fd()/recv_fd() 2-byte protocol */

        iov[0].iov_base = buf;
        iov[0].iov_len  = 2;
        msg.msg_iov     = iov;
        msg.msg_iovlen  = 1;
        msg.msg_name    = NULL;
        msg.msg_namelen = 0;
        if(fd_to_send < 0)
        {
            msg.msg_control    = NULL;
            msg.msg_controllen = 0;
            buf[1] = -fd_to_send;   /* nonzero status means error */
            if(buf[1] == 0)
                buf[1] = 1;         /* -256, etc. would screw up protocol */
        }
        else
        {
            if(cmptr == NULL && (cmptr = malloc(CONTROLLEN)) == NULL)
                return(-1);
            cmptr->cmsg_level  = SOL_SOCKET;
            cmptr->cmsg_type   = SCM_RIGHTS;
            cmptr->cmsg_len    = CONTROLLEN;
            msg.msg_control    = cmptr;
            msg.msg_controllen = CONTROLLEN;
            *(int *)CMSG_DATA(cmptr) = fd_to_send;  /* the fd to pass */
            buf[1] = 0;             /* zero status means OK */
        }
        buf[0] = 0;                 /* null byte flag to recv_fd() */
        if(sendmsg(fd, &msg, 0) != 2)
            return(-1);
        return(0);
    }
In the sendmsg call, we send both the protocol data (the null byte and the status byte) and the descriptor.
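To receive a file descriptor, we allocate enough room for a cmsghdr structure and a descriptor, point msg_control at that storage, and call recvmsg. We use the CMSG_LEN macro to calculate the amount of space needed.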
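We read from the UNIX domain socket until we read the null byte that precedes the final status byte. Everything up to this null byte is an error message from the sender.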
Listing 17-15 The recv_fd function for UNIX domain sockets
    #include "apue.h"
    #include <sys/socket.h>     /* struct msghdr */

    /* size of control buffer to send/recv one file descriptor */
    #define CONTROLLEN  CMSG_LEN(sizeof(int))

    static struct cmsghdr *cmptr = NULL;    /* malloc'ed first time */

    /*
     * Receive a file descriptor from a server process. Also, any data
     * received is passed to (*userfunc)(STDERR_FILENO, buf, nbytes).
     * We have a 2-byte protocol for receiving the fd from send_fd().
     */
    int recv_fd(int fd, ssize_t (*userfunc)(int, const void *, size_t))
    {
        int            newfd, nr, status;
        char           *ptr;
        char           buf[MAXLINE];
        struct iovec   iov[1];
        struct msghdr  msg;

        status = -1;
        for(;;)
        {
            iov[0].iov_base = buf;
            iov[0].iov_len  = sizeof(buf);
            msg.msg_iov     = iov;
            msg.msg_iovlen  = 1;
            msg.msg_name    = NULL;
            msg.msg_namelen = 0;
            if(cmptr == NULL && (cmptr = malloc(CONTROLLEN)) == NULL)
                return(-1);
            msg.msg_control    = cmptr;
            msg.msg_controllen = CONTROLLEN;
            if((nr = recvmsg(fd, &msg, 0)) < 0)
            {
                err_sys("recvmsg error");
            }
            else if(nr == 0)
            {
                err_ret("connection closed by server");
                return(-1);
            }
            /*
             * See if this is the final data with null & status. Null
             * is next to last byte of buffer; status byte is last byte.
             * Zero status means there is a file descriptor to receive.
             */
            for(ptr = buf; ptr < &buf[nr]; )
            {
                if(*ptr++ == 0)
                {
                    if(ptr != &buf[nr - 1])
                    {
                        err_dump("message format error");
                    }
                    status = *ptr & 0xFF;   /* prevent sign extension */
                    if(status == 0)
                    {
                        if(msg.msg_controllen != CONTROLLEN)
                        {
                            err_dump("status = 0 but no fd");
                        }
                        newfd = *(int *)CMSG_DATA(cmptr);
                    }
                    else
                    {
                        newfd = -status;
                    }
                    nr -= 2;
                }
            }
            if(nr > 0 && (*userfunc)(STDERR_FILENO, buf, nr) != nr)
                return(-1);
            if(status >= 0)         /* final data has arrived */
                return(newfd);      /* descriptor, or -status */
        }
    }
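Note that we always prepare to receive a descriptor (we set msg_control and msg_controllen before each call to recvmsg), but only if msg_controllen is nonzero on return did we actually receive a descriptor.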
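For readers without apue.h at hand, the whole exchange can be tried in a single process, since a descriptor can be sent across a socketpair and received at the other end. This is a sketch under stated assumptions, not the book's implementation: the names fd_control, setup_msg, pass_fd, take_fd and passed_fd_offset are invented here, the 2-byte protocol is replaced by a single data byte, and the control buffer is a union sized with CMSG_SPACE (assumed available, as on Linux and the BSDs) instead of a malloc'ed CMSG_LEN buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer for one descriptor, aligned for struct cmsghdr. */
union fd_control {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Fill in msg so that it carries one data byte plus the control buffer. */
static void setup_msg(struct msghdr *msg, struct iovec *iov, char *byte,
                      union fd_control *ctl)
{
    memset(msg, 0, sizeof(*msg));
    memset(ctl, 0, sizeof(*ctl));
    iov->iov_base = byte;
    iov->iov_len = 1;
    msg->msg_iov = iov;
    msg->msg_iovlen = 1;
    msg->msg_control = ctl->buf;
    msg->msg_controllen = sizeof(ctl->buf);
}

/* Send fd over the UNIX domain socket sock; 0 if OK, -1 on error. */
int pass_fd(int sock, int fd)
{
    char byte = 0;          /* Linux stream sockets need real data too */
    struct iovec iov;
    union fd_control ctl;
    struct msghdr msg;
    struct cmsghdr *cmp;

    setup_msg(&msg, &iov, &byte, &ctl);
    cmp = CMSG_FIRSTHDR(&msg);
    assert(cmp != NULL);
    cmp->cmsg_level = SOL_SOCKET;
    cmp->cmsg_type = SCM_RIGHTS;
    cmp->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmp), &fd, sizeof(int));
    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from sock; returns it, or -1 on error. */
int take_fd(int sock)
{
    char byte;
    struct iovec iov;
    union fd_control ctl;
    struct msghdr msg;
    struct cmsghdr *cmp;
    int fd;

    setup_msg(&msg, &iov, &byte, &ctl);
    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmp = CMSG_FIRSTHDR(&msg);
    if (cmp == NULL || cmp->cmsg_level != SOL_SOCKET ||
        cmp->cmsg_type != SCM_RIGHTS ||
        cmp->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;
    memcpy(&fd, CMSG_DATA(cmp), sizeof(int));
    return fd;
}

/* Write 3 bytes, pass the descriptor, close the original, and return
 * the offset seen through the received descriptor (-1 on error). */
off_t passed_fd_offset(void)
{
    char path[] = "/tmp/passfdXXXXXX";
    int sv[2], sent, newfd, fd = mkstemp(path);
    off_t off = -1;

    if (fd < 0)
        return -1;
    unlink(path);
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        close(fd);
        return -1;
    }
    sent = (write(fd, "abc", 3) == 3 && pass_fd(sv[0], fd) == 0);
    close(fd);              /* sender closes; the file stays open */
    if (sent && (newfd = take_fd(sv[1])) >= 0) {
        off = lseek(newfd, 0, SEEK_CUR);
        close(newfd);
    }
    close(sv[0]);
    close(sv[1]);
    return off;
}
```

The demo closes the original descriptor before the receiver has picked up the copy, which is the "in flight" case described earlier, and the offset of 3 seen through the new descriptor shows that the file table entry is shared.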
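When it comes to passing file descriptors, one difference between UNIX domain sockets and STREAMS pipes is that with STREAMS pipes we also get the identity of the sending process (the uid and gid members of the strrecvfd structure).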
FreeBSD 5.2.1 and Linux 2.4.22 support sending credentials over UNIX domain sockets, but they do so in different ways.
On FreeBSD, credentials are transmitted as a cmsgcred structure.
    #define CMGROUP_MAX 16

    struct cmsgcred {
        pid_t  cmcred_pid;                  /* sender's process ID */
        uid_t  cmcred_uid;                  /* sender's real UID */
        uid_t  cmcred_euid;                 /* sender's effective UID */
        gid_t  cmcred_gid;                  /* sender's real GID */
        short  cmcred_ngroups;              /* number of groups */
        gid_t  cmcred_groups[CMGROUP_MAX];  /* groups */
    };
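When we transmit credentials, we need only reserve space for the cmsgcred structure. The kernel fills it in, which prevents an application from pretending to have a different identity.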
On Linux, credentials are transmitted as a ucred structure.
    struct ucred {
        uint32_t pid;   /* sender's process ID */
        uint32_t uid;   /* sender's user ID */
        uint32_t gid;   /* sender's group ID */
    };
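Unlike FreeBSD, Linux requires that we initialize the structure before transmission. The kernel ensures that applications either use values that correspond to the caller or have the appropriate privilege to use other values.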
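The Linux behavior can be tried in isolation with a socketpair. The sketch below is Linux-only and is not the book's code: the function name cred_roundtrip_pid is invented, it assumes glibc (where _GNU_SOURCE must be defined before any header to expose struct ucred and SCM_CREDENTIALS), and it sizes the buffer with CMSG_SPACE. It sends the caller's own credentials and returns the process ID reported at the receiving end.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* exposes struct ucred in glibc */
#endif
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Linux-only sketch: returns the sender PID seen by the receiver, or -1. */
pid_t cred_roundtrip_pid(void)
{
    int sv[2];
    const int on = 1;
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr hdr;
        char buf[CMSG_SPACE(sizeof(struct ucred))];
    } ctl;
    struct msghdr msg;
    struct cmsghdr *cmp;
    struct ucred cred;
    pid_t result = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    /* the receiving socket enables delivery of credentials */
    if (setsockopt(sv[1], SOL_SOCKET, SO_PASSCRED, &on, sizeof(on)) < 0)
        goto out;

    /* sender: initialize the structure; the kernel checks the values */
    cred.pid = getpid();
    cred.uid = geteuid();
    cred.gid = getegid();
    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);
    cmp = CMSG_FIRSTHDR(&msg);
    assert(cmp != NULL);
    cmp->cmsg_level = SOL_SOCKET;
    cmp->cmsg_type = SCM_CREDENTIALS;
    cmp->cmsg_len = CMSG_LEN(sizeof(struct ucred));
    memcpy(CMSG_DATA(cmp), &cred, sizeof(cred));
    if (sendmsg(sv[0], &msg, 0) != 1)
        goto out;

    /* receiver: reuse the buffers and look for SCM_CREDENTIALS */
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_controllen = sizeof(ctl.buf);
    if (recvmsg(sv[1], &msg, 0) != 1)
        goto out;
    for (cmp = CMSG_FIRSTHDR(&msg); cmp != NULL;
         cmp = CMSG_NXTHDR(&msg, cmp)) {
        if (cmp->cmsg_level == SOL_SOCKET &&
            cmp->cmsg_type == SCM_CREDENTIALS) {
            memcpy(&cred, CMSG_DATA(cmp), sizeof(cred));
            result = cred.pid;
        }
    }
out:
    close(sv[0]);
    close(sv[1]);
    return result;
}
```

Because sender and receiver are the same process here, the returned value should match getpid(). Filling cred with another process's ID would make sendmsg fail unless the caller has the appropriate privilege, which is the kernel check described above.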
Listing 17-16 Sending credentials over UNIX domain sockets
    #include "apue.h"
    #include <sys/socket.h>

    #if defined(SCM_CREDS)              /* BSD interface */
    #define CREDSTRUCT      cmsgcred
    #define SCM_CREDTYPE    SCM_CREDS
    #elif defined(SCM_CREDENTIALS)      /* Linux interface */
    #define CREDSTRUCT      ucred
    #define SCM_CREDTYPE    SCM_CREDENTIALS
    #else
    #error passing credentials is unsupported!
    #endif

    /* size of control buffer to send/recv one file descriptor */
    #define RIGHTSLEN   CMSG_LEN(sizeof(int))
    #define CREDSLEN    CMSG_LEN(sizeof(struct CREDSTRUCT))
    #define CONTROLLEN  (RIGHTSLEN + CREDSLEN)

    static struct cmsghdr *cmptr = NULL;    /* malloc'ed first time */

    /*
     * Pass a file descriptor to another process.
     * If fd < 0, then -fd is sent back instead as the error status.
     */
    int send_fd(int fd, int fd_to_send)
    {
        struct CREDSTRUCT  *credp;
        struct cmsghdr     *cmp;
        struct iovec       iov[1];
        struct msghdr      msg;
        char               buf[2];  /* send_fd()/recv_fd() 2-byte protocol */

        iov[0].iov_base = buf;
        iov[0].iov_len  = 2;
        msg.msg_iov     = iov;
        msg.msg_iovlen  = 1;
        msg.msg_name    = NULL;
        msg.msg_namelen = 0;
        msg.msg_flags   = 0;
        if(fd_to_send < 0)
        {
            msg.msg_control    = NULL;
            msg.msg_controllen = 0;
            buf[1] = -fd_to_send;   /* nonzero status means error */
            if(buf[1] == 0)
                buf[1] = 1;         /* -256, etc. would screw up protocol */
        }
        else
        {
            if(cmptr == NULL && (cmptr = malloc(CONTROLLEN)) == NULL)
                return(-1);
            msg.msg_control    = cmptr;
            msg.msg_controllen = CONTROLLEN;
            cmp = cmptr;
            cmp->cmsg_level = SOL_SOCKET;
            cmp->cmsg_type  = SCM_RIGHTS;
            cmp->cmsg_len   = RIGHTSLEN;
            *(int *)CMSG_DATA(cmp) = fd_to_send;    /* the fd to pass */

            cmp = CMSG_NXTHDR(&msg, cmp);
            cmp->cmsg_level = SOL_SOCKET;
            cmp->cmsg_type  = SCM_CREDTYPE;
            cmp->cmsg_len   = CREDSLEN;
            credp = (struct CREDSTRUCT *)CMSG_DATA(cmp);
    #if defined(SCM_CREDENTIALS)
            credp->uid = geteuid();
            credp->gid = getegid();
            credp->pid = getpid();
    #endif
            buf[1] = 0;             /* zero status means OK */
        }
        buf[0] = 0;                 /* null byte flag to recv_fd() */
        if(sendmsg(fd, &msg, 0) != 2)
            return(-1);
        return(0);
    }
Note that the credentials structure needs to be initialized only on Linux.
Listing 17-17 Receiving credentials over UNIX domain sockets
    #include "apue.h"
    #include <sys/socket.h>     /* struct msghdr */

    #if defined(SCM_CREDS)              /* BSD interface */
    #define CREDSTRUCT      cmsgcred
    #define CR_UID          cmcred_uid
    #define CREDOPT         LOCAL_PEERCRED
    #define SCM_CREDTYPE    SCM_CREDS
    #elif defined(SCM_CREDENTIALS)      /* Linux interface */
    #define CREDSTRUCT      ucred
    #define CR_UID          uid
    #define CREDOPT         SO_PASSCRED
    #define SCM_CREDTYPE    SCM_CREDENTIALS
    #else
    #error passing credentials is unsupported!
    #endif

    /* size of control buffer to send/recv one file descriptor */
    #define RIGHTSLEN   CMSG_LEN(sizeof(int))
    #define CREDSLEN    CMSG_LEN(sizeof(struct CREDSTRUCT))
    #define CONTROLLEN  (RIGHTSLEN + CREDSLEN)

    static struct cmsghdr *cmptr = NULL;    /* malloc'ed first time */

    /*
     * Receive a file descriptor from a server process. Also, any data
     * received is passed to (*userfunc)(STDERR_FILENO, buf, nbytes).
     * We have a 2-byte protocol for receiving the fd from send_fd().
     */
    int recv_fd(int fd, uid_t *uidptr,
                ssize_t (*userfunc)(int, const void *, size_t))
    {
        struct cmsghdr     *cmp;
        struct CREDSTRUCT  *credp;
        int                newfd, nr, status;
        char               *ptr;
        char               buf[MAXLINE];
        struct iovec       iov[1];
        struct msghdr      msg;
        const int          on = 1;

        status = -1;
        newfd = -1;
        if(setsockopt(fd, SOL_SOCKET, CREDOPT, &on, sizeof(int)) < 0)
        {
            err_ret("setsockopt failed");
            return(-1);
        }
        for(;;)
        {
            iov[0].iov_base = buf;
            iov[0].iov_len  = sizeof(buf);
            msg.msg_iov     = iov;
            msg.msg_iovlen  = 1;
            msg.msg_name    = NULL;
            msg.msg_namelen = 0;
            if(cmptr == NULL && (cmptr = malloc(CONTROLLEN)) == NULL)
                return(-1);
            msg.msg_control    = cmptr;
            msg.msg_controllen = CONTROLLEN;
            if((nr = recvmsg(fd, &msg, 0)) < 0)
            {
                err_sys("recvmsg error");
            }
            else if(nr == 0)
            {
                err_ret("connection closed by server");
                return(-1);
            }
            /*
             * See if this is the final data with null & status. Null
             * is next to last byte of buffer; status byte is last byte.
             * Zero status means there is a file descriptor to receive.
             */
            for(ptr = buf; ptr < &buf[nr]; )
            {
                if(*ptr++ == 0)
                {
                    if(ptr != &buf[nr - 1])
                    {
                        err_dump("message format error");
                    }
                    status = *ptr & 0xFF;   /* prevent sign extension */
                    if(status == 0)
                    {
                        if(msg.msg_controllen != CONTROLLEN)
                        {
                            err_dump("status = 0 but no fd");
                        }
                        /* process the control data */
                        for(cmp = CMSG_FIRSTHDR(&msg); cmp != NULL;
                            cmp = CMSG_NXTHDR(&msg, cmp))
                        {
                            if(cmp->cmsg_level != SOL_SOCKET)
                                continue;
                            switch(cmp->cmsg_type)
                            {
                            case SCM_RIGHTS:
                                /* read from the current header, not cmptr */
                                newfd = *(int *)CMSG_DATA(cmp);
                                break;
                            case SCM_CREDTYPE:
                                credp = (struct CREDSTRUCT *)CMSG_DATA(cmp);
                                *uidptr = credp->CR_UID;
                            }
                        }
                    }
                    else
                    {
                        newfd = -status;
                    }
                    nr -= 2;
                }
            }
            if(nr > 0 && (*userfunc)(STDERR_FILENO, buf, nr) != nr)
                return(-1);
            if(status >= 0)         /* final data has arrived */
                return(newfd);      /* descriptor, or -status */
        }
    }
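The content of this post is excerpted from Advanced Programming in the UNIX Environment (2nd edition) and is kept only as personal study notes. For more about the book, see http://www.apuebook.com/.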