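Thanks to rulary for the correction! My understanding of IOCP in this post was wrong; for the correct approach, please see rulary's reply in the comments.
Because of the project's actual design needs, the final I/O event handling did not use IOCP. It uses WSAPoll, introduced in NT 6.0, whose programming model is basically the same as poll on Linux, so I won't go into it here.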
==================================================
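IOCP is the most efficient way to handle I/O events on Windows. Combined with overlapped I/O, it gives you truly fully asynchronous I/O. In this mode Windows offers a one-stop service: you submit an I/O request, Windows does all the remaining work for you, and you only have to wait for its completion notification.
Xiangma (响马大叔) has a post in his Spore community (孢子社区), "再谈select, iocp, epoll,kqueue及各种I/O复用机制" (Revisiting select, iocp, epoll, kqueue and other I/O multiplexing mechanisms), that compares these mechanisms fairly thoroughly, so I won't repeat that side of IOCP here. Instead, I'll talk about what I found bad about IOCP during my own development.
The problem is this: a file/socket handle cannot be bound to different IOCPs by calling CreateIoCompletionPort() multiple times. Only the first call succeeds; from the second call on it fails with an invalid-parameter error (error 87). So once a handle is bound to one IOCP, it cannot be migrated to another. I reached this conclusion by testing with real code and by reading the ReactOS implementation. The test code: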
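    #include <assert.h>
    #include <stdio.h>

    #define WIN32_LEAN_AND_MEAN
    #include <winsock2.h>
    #include <windows.h>

    /* Stand-in for the helper used in the original test, whose definition is
       not shown in the post; any overlapped socket will do here. */
    static SOCKET create_client_socket(void)
    {
        return WSASocket(AF_INET, SOCK_STREAM, IPPROTO_TCP, NULL, 0, WSA_FLAG_OVERLAPPED);
    }

    int main(int argc, char *argv[])
    {
        HANDLE iocp;
        HANDLE iocp1;
        SOCKET s;
        HANDLE ret;

        WSADATA wsa_data;
        WSAStartup(MAKEWORD(2, 2), &wsa_data);

        /* Two independent completion ports */
        iocp = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 4);
        iocp1 = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 4);
        s = create_client_socket();

        assert(NULL != iocp);
        assert(NULL != iocp1);
        assert(INVALID_SOCKET != s);

        /* First bind: expected to succeed */
        ret = CreateIoCompletionPort((HANDLE)s, iocp, 0, 0);
        printf("first bind, ret: %lu, error: %lu\n", (unsigned long)(ULONG_PTR)ret, (unsigned long)GetLastError());

        /* Second bind of the same socket to a different port: expected to fail */
        ret = CreateIoCompletionPort((HANDLE)s, iocp1, 0, 0);
        printf("second bind, ret: %lu, error: %lu\n", (unsigned long)(ULONG_PTR)ret, (unsigned long)GetLastError());

        CloseHandle(iocp);
        CloseHandle(iocp1);
        closesocket(s);

        WSACleanup();

        return 0;
    }

Output: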
Administrator@attention /e/tinylib/windows/net_iocp
$ iocp.exe
first bind, ret: 60, error: 0
second bind, ret: 0, error: 87
In the ReactOS-0.3.12-REL-src code, the restriction shows up in the following fragment of NtSetInformationFile():
    /* FIXME: Later, we can implement a lot of stuff here and avoid a driver call */
    /* Handle IO Completion Port quickly */
    if (FileInformationClass == FileCompletionInformation)
    {
        /* Check if the file object already has a completion port */
        if ((FileObject->Flags & FO_SYNCHRONOUS_IO) ||
            (FileObject->CompletionContext))
        {
            /* Fail */
            Status = STATUS_INVALID_PARAMETER;
        }
        else
        {
            /* Reference the Port */
            CompletionInfo = Irp->AssociatedIrp.SystemBuffer;
            Status = ObReferenceObjectByHandle(CompletionInfo->Port,
                                               IO_COMPLETION_MODIFY_STATE,
                                               IoCompletionType,
                                               PreviousMode,
                                               (PVOID*)&Queue,
                                               NULL);
            if (NT_SUCCESS(Status))
            {
                /* Allocate the Context */
                Context = ExAllocatePoolWithTag(PagedPool,
                                                sizeof(IO_COMPLETION_CONTEXT),
                                                IOC_TAG);
                if (Context)
                {
                    /* Set the Data */
                    Context->Key = CompletionInfo->Key;
                    Context->Port = Queue;
                    if (InterlockedCompareExchangePointer((PVOID*)&FileObject->
                                                          CompletionContext,
                                                          Context,
                                                          NULL))
                    {
                        /*
                         * Someone else set the completion port in the
                         * meanwhile, so dereference the port and fail.
                         */
                        ExFreePool(Context);
                        ObDereferenceObject(Queue);
                        Status = STATUS_INVALID_PARAMETER;
                    }
                }
                else
                {
                    /* Dereference the Port now */
                    ObDereferenceObject(Queue);
                    Status = STATUS_INSUFFICIENT_RESOURCES;
                }
            }
        }

        /* Set the IRP Status */
        Irp->IoStatus.Status = Status;
        Irp->IoStatus.Information = 0;
    }
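MSDN also explicitly encourages developers to start multiple threads that wait on a single IOCP with GetQueuedCompletionStatus() to handle I/O events; at least that is how I read it. The original text: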
- NumberOfConcurrentThreads
- [in] Maximum number of threads that the operating system allows to concurrently process I/O completion packets for the I/O completion port. If this parameter is zero, the system allows as many concurrently running threads as there are processors in the system.
Although any number of threads can call the GetQueuedCompletionStatus function to wait for an I/O completion port, each thread is associated with only one completion port at a time. That port is the port that was last checked by the thread.
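But this brings another problem: completion events for the same I/O handle get spread across different threads, which introduces extra contention when handling the I/O events of a single handle. I wrote code to confirm this as well: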
    /*
    Build command:
    gcc iocp.c -o iocp -lws2_32 -g

    Test command:
    nc -u 192.168.100.101 1993
    then send data rapidly and repeatedly

    Actual output:
    Administrator@attention /e/code
    $ gdb -q iocp.exe
    Reading symbols from e:\code\iocp.exe...done.
    (gdb) r
    Starting program: e:\code\iocp.exe
    [New Thread 1984.0x1788]
    [New Thread 1984.0x914]
    thread: 6024, 3 bytes received fro 168 notified by IOCP
    thread: 6024, 3 bytes received fro 168 notified by IOCP
    thread: 6024, 3 bytes received fro 168 notified by IOCP
    thread: 6024, 4 bytes received fro 168 notified by IOCP
    thread: 6024, 3 bytes received fro 168 notified by IOCP
    thread: 2324, 4 bytes received fro 168 notified by IOCP
    thread: 2324, 2 bytes received fro 168 notified by IOCP
    thread: 2324, 4 bytes received fro 168 notified by IOCP
    thread: 2324, 3 bytes received fro 168 notified by IOCP
    thread: 6024, 5 bytes received fro 168 notified by IOCP
    thread: 2324, 4 bytes received fro 168 notified by IOCP
    thread: 2324, 4 bytes received fro 168 notified by IOCP
    thread: 2324, 3 bytes received fro 168 notified by IOCP
    thread: 6024, 4 bytes received fro 168 notified by IOCP
    thread: 6024, 4 bytes received fro 168 notified by IOCP
    thread: 6024, 2 bytes received fro 168 notified by IOCP
    */

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    #define WIN32_LEAN_AND_MEAN
    #include <winsock2.h>
    #include <windows.h>
    #include <process.h>

    HANDLE iocp;
    SOCKET s_udp;

    /* Receive loop run by both threads against the same socket and the same IOCP.
       Caveat: buffer and ovlp live on this thread's stack, and a completion dequeued
       below may belong to the request posted by the other thread. A production
       version would allocate one context (OVERLAPPED + buffer) per operation. */
    void routine(void)
    {
        unsigned threadId;

        ULONG_PTR key;
        LPOVERLAPPED povlp;
        BOOL result;

        char buffer[65535];
        WSABUF wsabuf;
        DWORD received;
        DWORD flag;
        struct sockaddr_in peer_addr;
        int addr_len;
        WSAOVERLAPPED ovlp;
        int error;

        do
        {
            wsabuf.len = sizeof(buffer);
            wsabuf.buf = buffer;
            received = 0;
            flag = 0;
            addr_len = sizeof(peer_addr);
            memset(&peer_addr, 0, addr_len);
            memset(&ovlp, 0, sizeof(ovlp));

            threadId = GetCurrentThreadId();

            /* Post an overlapped receive; 0 means it completed synchronously */
            if (WSARecvFrom(s_udp, &wsabuf, 1, &received, &flag, (struct sockaddr*)&peer_addr, &addr_len, &ovlp, NULL) == 0)
            {
                printf("thread: %u, %lu bytes received for %lu immediately\n", threadId, received, (unsigned long)s_udp);
                continue;
            }

            /* Anything other than WSA_IO_PENDING is a real failure */
            error = WSAGetLastError();
            if (WSA_IO_PENDING != error)
            {
                printf("WSARecvFrom() failed, error: %d\n", error);
                continue;
            }

            /* Wait on the shared IOCP until some completion is dequeued */
            while (1)
            {
                result = GetQueuedCompletionStatus(iocp, &received, &key, &povlp, 10);
                if (FALSE == result)
                {
                    error = WSAGetLastError();
                    if (WAIT_TIMEOUT != error)
                    {
                        printf("GetQueuedCompletionStatus() failed, error: %d\n", error);
                    }
                    continue;
                }

                printf("thread: %u, %lu bytes received fro %lu notified by IOCP\n", threadId, received, (unsigned long)s_udp);
                break;
            }
        } while (1);

        return;
    }

    unsigned __stdcall thread(void *arg)
    {
        routine();

        return 0;
    }

    /* Create an overlapped UDP socket bound to ip:port (INADDR_ANY if ip is NULL) */
    SOCKET create_udp_socket(unsigned short port, const char *ip)
    {
        SOCKET fd;
        struct sockaddr_in addr;

        fd = WSASocket(AF_INET, SOCK_DGRAM, IPPROTO_UDP, NULL, 0, WSA_FLAG_OVERLAPPED);
        if (INVALID_SOCKET == fd)
        {
            printf("create_udp_socket: WSASocket() failed, error: %d\n", WSAGetLastError());
            return INVALID_SOCKET;
        }

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = (NULL != ip ? inet_addr(ip) : INADDR_ANY);
        addr.sin_port = htons(port);
        if (bind(fd, (struct sockaddr*)&addr, sizeof(addr)) != 0)
        {
            printf("create_udp_socket: bind() failed, error: %d\n", WSAGetLastError());
            closesocket(fd);
            return INVALID_SOCKET;
        }

        return fd;
    }

    int main(int argc, char *argv[])
    {
        unsigned threadId;
        HANDLE t;
        WSADATA wsadata;

        WSAStartup(MAKEWORD(2,2), &wsadata);

        iocp = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 4);
        s_udp = create_udp_socket(1993, "0.0.0.0");
        CreateIoCompletionPort((HANDLE)s_udp, iocp, 0, 0);

        /* A second thread plus the main thread both run the receive loop */
        t = (HANDLE)_beginthreadex(NULL, 0, thread, NULL, 0, &threadId);

        routine();

        WaitForSingleObject(t, INFINITE);
        CloseHandle(t);
        closesocket(s_udp);
        CloseHandle(iocp);

        WSACleanup();

        return 0;
    }
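If so, this contention largely cancels out the benefit of using multiple threads for concurrent processing; you might as well handle all I/O events in a single thread and save a lot of locking overhead. But modern programs almost always run on multi-core CPUs, and if IOCP forces you to put all the related work into one thread, you cannot exploit multi-core parallelism. In practice, when designing a concurrency model we often start several workers to balance the load, and the IOCP restriction above conflicts with that.
epoll on Linux additionally provides a del operation (EPOLL_CTL_DEL), so an fd can be detached from its current epoll instance at any time and immediately added to another one. That lets you start multiple worker threads, each running its own epoll, spread different fds across the workers for load balancing, and freely migrate an fd from one thread to another. This kind of rebalancing is very common in real services, where business logic requires you to hand different fds over to other threads, and it is rather inconvenient with IOCP.
That is my one gripe about IOCP.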
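The detach-and-reattach step described above can be sketched roughly as follows. This is a minimal single-threaded illustration, not code from the project: migrate_fd(), is_readable() and demo_migration() are hypothetical names, and a socketpair stands in for a real connection.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/epoll.h>
#include <sys/socket.h>

/* Hypothetical helper: move fd from one epoll instance to another.
   Returns 0 on success, -1 on error. */
static int migrate_fd(int from_ep, int to_ep, int fd)
{
    struct epoll_event ev;

    assert(from_ep != to_ep);

    /* Detach from the current epoll instance. */
    if (epoll_ctl(from_ep, EPOLL_CTL_DEL, fd, NULL) != 0)
        return -1;

    /* Attach to the new one; its loop will now see the fd's events. */
    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    return epoll_ctl(to_ep, EPOLL_CTL_ADD, fd, &ev);
}

/* Returns 1 if ep currently reports fd as readable, else 0. */
static int is_readable(int ep, int fd)
{
    struct epoll_event out;
    int n = epoll_wait(ep, &out, 1, 0);   /* timeout 0: do not block */
    return n == 1 && out.data.fd == fd && (out.events & EPOLLIN);
}

/* Returns 0 if the fd's events follow it from epoll A to epoll B. */
int demo_migration(void)
{
    int sv[2];
    int ep_a = epoll_create1(0);
    int ep_b = epoll_create1(0);
    struct epoll_event ev;
    int ok;

    if (ep_a < 0 || ep_b < 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    /* Initially the fd belongs to epoll A. */
    memset(&ev, 0, sizeof(ev));
    ev.events = EPOLLIN;
    ev.data.fd = sv[0];
    if (epoll_ctl(ep_a, EPOLL_CTL_ADD, sv[0], &ev) != 0)
        return -1;

    /* Make sv[0] readable. */
    if (write(sv[1], "x", 1) != 1)
        return -1;

    ok = is_readable(ep_a, sv[0]) && !is_readable(ep_b, sv[0]);
    ok = ok && migrate_fd(ep_a, ep_b, sv[0]) == 0;
    ok = ok && !is_readable(ep_a, sv[0]) && is_readable(ep_b, sv[0]);

    close(sv[0]);
    close(sv[1]);
    close(ep_a);
    close(ep_b);
    return ok ? 0 : -1;
}
```

Calling demo_migration() from a main() should return 0 on Linux. In a real multi-threaded server the hand-off would also need coordination, so that the old worker has stopped touching the fd before the new one starts.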
~~end~~
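===== Divider =====
It seems I still haven't made my point very clearly, so let me add a bit:
What I intended was also to run multiple processing loops in multiple threads, each loop owning its own IOCP and handling different sockets. But the business logic needs to migrate a socket from one thread's loop to another thread's loop, and because of the IOCP restriction described above, the socket cannot be bound to the new thread's IOCP, so it cannot be migrated!
And if multiple threads wait on the same IOCP instead, you run into the contention problem described above!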
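===== Divider =====
I need to add some more background:
The actual situation is this: we have two different services, but for reasons outside our control, requests for both can only come in through one port. After a connection arrives, we have to receive a small piece of data before we know which service is being requested, and the two services are implemented in different thread loops. So there is an additional entry server thread that accepts connections and reads this small piece of data. With IOCP, an accepted connection first has to be bound to the entry server's IOCP, and once bound it cannot be migrated out, yet the two services later need to keep sending and receiving data on that connection in their own loops.
When I first designed and implemented this, I naturally planned for each loop to own one IOCP and handle its own connections, with no migration. In practice, the reasons above created a need for migration. That is what you call a real pain!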
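For comparison, here is a rough sketch of the same entry-server hand-off in a poll-style model, where a descriptor is not bound to any kernel object and can simply be given to another loop. It uses POSIX poll(), since the post says WSAPoll follows basically the same model; it is not the project's code. The one-byte tag protocol ('0' or '1' as the first byte), struct worker, and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>
#include <poll.h>
#include <unistd.h>
#include <sys/socket.h>

#define MAX_FDS 16

/* Hypothetical per-service state: the descriptors its poll() loop watches. */
struct worker {
    struct pollfd fds[MAX_FDS];
    nfds_t count;
};

/* Entry-loop step: wait for the first byte and read it as the service tag.
   Returns the tag byte, or -1 on timeout or error. */
static int read_service_tag(int fd, int timeout_ms)
{
    struct pollfd p;
    unsigned char tag;

    p.fd = fd;
    p.events = POLLIN;
    p.revents = 0;
    if (poll(&p, 1, timeout_ms) != 1 || !(p.revents & POLLIN))
        return -1;
    if (recv(fd, &tag, 1, 0) != 1)
        return -1;
    return tag;
}

/* Hand the descriptor to a worker; with poll there is no binding to undo. */
static int worker_add(struct worker *w, int fd)
{
    assert(w != NULL);
    if (w->count >= MAX_FDS)
        return -1;
    w->fds[w->count].fd = fd;
    w->fds[w->count].events = POLLIN;
    w->fds[w->count].revents = 0;
    w->count++;
    return 0;
}

/* Read the tag and route fd to the matching worker.
   Returns the worker index, or -1 on failure. */
int dispatch(int fd, struct worker *workers, int nworkers, int timeout_ms)
{
    int tag = read_service_tag(fd, timeout_ms);
    int index = tag - '0';

    if (tag < 0 || index < 0 || index >= nworkers)
        return -1;
    return worker_add(&workers[index], fd) == 0 ? index : -1;
}

/* Returns 0 if a connection tagged '1' ends up readable in worker 1. */
int demo_dispatch(void)
{
    struct worker workers[2];
    int sv[2];
    int chosen;
    int ready = -1;

    memset(workers, 0, sizeof(workers));
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    /* The client sends the tag followed by its payload. */
    if (write(sv[1], "1hello", 6) != 6)
        return -1;

    chosen = dispatch(sv[0], workers, 2, 1000);
    if (chosen >= 0)
        ready = poll(workers[chosen].fds, workers[chosen].count, 0);

    close(sv[0]);
    close(sv[1]);
    return (chosen == 1 && ready == 1) ? 0 : -1;
}
```

Calling demo_dispatch() from a main() should return 0. With real threads, each worker's descriptor set would have to be updated through a queue or under a lock rather than written directly by the entry thread.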