When building server based applications in C#, it is important to have the ability to create thread pools. Thread pools allow our server to queue and perform work in the most efficient and scalable way possible. Without thread pooling we are left with two options.
The first option is to perform all of the work on a single thread. The second option is to spawn a thread every time some piece of work needs to be done. For this article, work is defined as an event that requires the processing of code. Work may or may not be associated with data, and it is our job to process all of the work our server receives in the most efficient and fastest way possible.
As a general rule, if you can accomplish all of the work required with a single thread, then only use a single thread. Having multiple threads performing work at the same time does not necessarily mean our application is getting more work done, or getting work done faster. This is true for many reasons.
For example, if you spawn multiple threads which attempt to access the same resource bound to a synchronization object, like a monitor object, these threads will serialize and fall in line waiting for the resource to become available. As each thread tries to access the resource, it has the potential to block, and wait for the thread that owns the resource to release the resource.
At that point, these waiting threads are put to sleep, and not getting any work done. In fact, these waiting threads have caused more work for the operating system to perform. Now the operating system must task another thread to perform work, and then determine which thread, waiting for the resource, may access the resource next, once it becomes available.
If the threads that need to perform work are sleeping, because they are waiting for the resource to become available, we have actually created a performance problem. In this case it would be more efficient to queue up this work and have a single thread process the queue.
Threads that start waiting for a resource before other threads, are not guaranteed to be given the resource first. In diagram A, thread 1 requests access to the resource before thread 2, and thread 2 requests access to the resource before thread 3. The operating system however decides to give the resource to thread 1 first, then thread 3, and then thread 2. This scenario causes work to be performed in an undetermined order. The possible issues are endless when dealing with multi-threaded applications.
If work received can be performed independent of each other, we could always spawn a thread for processing that piece of work. The problem here is that an operating system like Windows has severe performance problems when a large number of threads are created or running at the same time, waiting to have access to the CPU.
The Windows operating system needs to manage all of these threads, and compared to the UNIX operating system, it just doesn’t hold up. If large amounts of work are issued to the server, this model will most likely cause the Windows operating system to become overloaded. System performance will degrade drastically.<>
This article is a case study comparing thread performance between Windows NT and Solaris.
In the .NET framework, the “System.Threading” namespace has a ThreadPool class. Unfortunately, it is a static class and therefore our server can only have a single thread pool. This isn’t the only issue. The ThreadPool class does not allow us to set the concurrency level of the thread pool.
The concurrency level is the most important setting when configuring a thread pool. The concurrency level defines how many threads in the pool may be in an “active state” at the same time. If we set this parameter correctly, we will have the most efficient, performance enhanced thread pool for the work being processed.
Imagine we have a thread pool with 4 threads and a concurrency level of 1. Then, three pieces of work are queued up for processing in the pool. Since the concurrency level for the thread pool is 1, only a single thread from the pool is activated and given work from the queue. Even though there are two pieces of work queued up, no other threads are activated. This is because the concurrency level is set to 1. If the concurrency level was set to 2, then another thread would have been activated immediately and given work from the queue. In diagram B we have thread 1 running and all of the other threads sleeping with two pieces of work queued.
So the question exists, why have more than 1 thread in the pool if the concurrency level is set to 1? If thread 1 in diagram B ever goes to sleep before it completes its work, another thread from the pool will be activated. When thread 1 goes to sleep, there are 0 threads “active” in the pool and it is ok to activate a new thread based on the concurrency level. In diagram C, we now have thread 1 sleeping and thread 4 running with one piece of work queued.
Eventually, thread 1 will wake up, and it is possible for thread 4 to still be active. We have 2 threads active in the pool, even though the concurrency level is set to 1. In diagram D, we now have thread 1 and thread 4 running and one piece of work still queued.
The last piece of work in the queue will need to wait until both threads return to a sleeping state. This is because the concurrency level is set to 1. As we can see, even though the concurrency level restricts the number of active threads in the pool at any given time, we could have more active threads then the concurrency level allows. It all depends on the state of the threads in the pool and how fast the threads can complete the work they are processing.
A good rule of thumb is to set the concurrency level to match the number of CPU’s in the system. If the machine our server is running on only has one CPU, then only one thread can be executing at any given time. It will require a task swap to have another thread get CPU time. We want to reduce the number of active threads at any given time to maximize performance. This also leads to scalability. As the number of CPU’s increase, we can increase the concurrency level because there is a CPU to execute that thread. This is a general rule and is always a good starting point for configuring our thread pools.
The bottom line is, if the CPU is available, and there is work to perform, activate a thread. If the CPU is not available, do not activate a thread. One other thing, we need to be careful that we don’t cause a situation where the threads in the pool are constantly being put to sleep for long periods of time during the processing of work. This may cause all of the threads in the pool to constantly be in an active state, defeating the efficiency of the pool and the performance of the server.
The remaining scope of this article will show you how to add IOCP thread pools to your C# server based applications. How to configure the thread pools for your specific application will not be covered. It is suggested to use the general rules as discussed.