Improving Response Times with I/O Completion Ports & Overlapped I/O

Introduction

Windows NT offers some special programming features, rarely found in other operating systems, that can improve response times in enterprise systems that handle a large number of transactions. For example, an electronic trading system on the Web may receive over a thousand messages per second at peak times while distributing computed information to, or interacting with, more than a thousand users at the same time. To handle this kind of information flow efficiently, NT provides mechanisms known as I/O completion ports and overlapped I/O.

Overlapped I/O is essentially I/O that proceeds mostly in parallel with computation. Your program can issue an I/O request and, instead of sitting there twiddling its thumbs until the operation completes, go ahead with other processing while the I/O takes place in parallel. If desired, NT will inform the process of the completion of the operation through an event or through an I/O completion port. Overlapped I/O thus helps you make better use of the CPU, provided the program is designed to take advantage of the feature.
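The issue-then-continue pattern can be sketched in a few lines. What follows is a portable Python analogy, not the Win32 API: on NT the request would typically be a `ReadFile` call carrying an `OVERLAPPED` structure, whereas here a background thread stands in for the kernel and a `threading.Event` stands in for the completion event. The names `PendingRead` and `overlapped_read`, and the temporary input file, are illustrative assumptions.

```python
import os
import tempfile
import threading


class PendingRead:
    """Tracks one in-flight read; 'done' plays the role of the completion event."""

    def __init__(self):
        self.done = threading.Event()
        self.data = None


def overlapped_read(path):
    """Start reading 'path' in the background and return immediately."""
    pending = PendingRead()

    def worker():
        try:
            with open(path, "rb") as f:
                pending.data = f.read()
        finally:
            pending.done.set()  # signal completion, even if the read failed

    threading.Thread(target=worker, daemon=True).start()
    return pending


# Create a small input file so the sketch is self-contained.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"quote: ACME 101.25\n")
    path = tmp.name

pending = overlapped_read(path)  # issue the request; this call does not block

# Useful work proceeds while the read is in flight.
checksum = sum(i * i for i in range(10_000))

pending.done.wait()  # block only at the point where the data is needed
message = pending.data.decode()
os.remove(path)
```

The caller blocks only at `pending.done.wait()`, which is the point where the result is actually required; everything before it overlaps with the read.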

I/O completion ports add another dimension to this picture. They allow you to manage a pool of threads efficiently and control the amount of concurrency within it. In a client-server situation where a large number of clients connect to a server, it is rarely practical to create a thread for each client, especially if per-client interactions are low in volume or sporadic. It is often better to pre-create a set of threads (a pool) and allocate them to the clients using some ad hoc scheduling scheme. NT makes this very efficient!

A set of N threads may wait simultaneously on an I/O completion port for I/O completion events. These could be, for example, the results of asynchronous reads on a set of network sockets. NT schedules threads from the pool in the most efficient way: when an event needs to be processed, it unblocks the thread that was most recently active, so that no new thread resources are required. This saves on thread switching and virtual memory operations, since another thread is not activated as long as a previously run thread can service the event. Dormant threads may be swapped out of memory; they are activated only if the earlier running threads are busy processing events. NT can also map these threads onto multiple processors for the most efficient processing. You can assign threads to processors explicitly, but this task is best left to the NT kernel. You can also control the concurrency level on the I/O completion port. If you set the level to K (with K < N), then no more than K threads from the pool will be activated by NT.

However, if one of the K threads blocks on some other operation while an event is pending on the port, NT will activate another thread to service it! So what happens when the blocked thread is ready to run again? NT schedules it so that it can finish its work and return to waiting on the completion port. Thus the concurrency level can sometimes rise above K temporarily, and then quickly subside to K or less. You can experiment with various values of K depending on the system configuration. A good rule of thumb is to use 2*M threads, where M is the number of processors in the system.

This kind of thread pool management need not be limited to processing overlapped I/O events. The events can be anything of interest to your system, such as activity on one or more of your network or peripheral interfaces: you can post a message directly to an I/O completion port, to be picked up and processed by the thread pool. I/O completion ports let you juggle all these events and manage them efficiently.
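The pool-and-port arrangement described above can be modeled as follows. This is a portable Python analogy under stated assumptions, not the Win32 API: a `queue.Queue` stands in for the completion port (on NT, `CreateIoCompletionPort` creates the port, `GetQueuedCompletionStatus` waits on it, and `PostQueuedCompletionStatus` posts to it), and a semaphore approximates the concurrency level K. The model does not reproduce NT's preference for the most recently active thread, nor the temporary rise above K when a thread blocks. The event kinds, the `handle` function, and the choice K = M are hypothetical.

```python
import os
import queue
import threading
import time

M = os.cpu_count() or 1        # number of processors
N = 2 * M                      # pool size, following the 2*M rule of thumb
K = M                          # concurrency cap (assumed value for this sketch)

port = queue.Queue()           # stands in for the completion port
gate = threading.Semaphore(K)  # at most K workers process events at once
lock = threading.Lock()
active = 0                     # workers currently processing an event
peak = 0                       # highest value 'active' ever reached
handled = []
SHUTDOWN = object()            # sentinel telling a worker to exit


def handle(kind, payload):
    """Application logic for one event; the event kinds are made up."""
    time.sleep(0.001)          # simulate a little work
    return f"{kind}:{payload}"


def worker():
    global active, peak
    while True:
        packet = port.get()    # block until an event is queued
        if packet is SHUTDOWN:
            break
        with gate:             # enforce the concurrency cap
            with lock:
                active += 1
                peak = max(peak, active)
            result = handle(*packet)
            with lock:
                active -= 1
                handled.append(result)


threads = [threading.Thread(target=worker) for _ in range(N)]
for t in threads:
    t.start()

# Post a mix of I/O-style completions and application-defined events.
for i in range(20):
    port.put(("read_done", i))
    port.put(("timer", i))

for _ in threads:
    port.put(SHUTDOWN)         # one sentinel per worker
for t in threads:
    t.join()
```

After the run, `handled` holds one result per posted event and `peak` never exceeds K, even though N threads were waiting on the queue. Changing `K` and re-running is a quick way to see how the cap interacts with the pool size.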