Saturday, February 14, 2015

Windows Internals - Thread resumption and synchronization objects

Hello, in the two previous blog entries I discussed how thread suspension works. I'll dedicate this post to share my research concerning thread resumption, it was crucial to explore some parts of the internal synchronization mechanisms to achieve a better understanding. As usual, the reversing was done on a Windows 7 32-bit machine.

To resume a suspended thread you normally call ResumeThread from usermode or ZwResumeThread from kernelmode, as a result you'll be landing in the NtResumeThread kernel function, it's very similar to NtSuspendThread that I already talked about in the previous posts.
This is the function's prototype :
NTSTATUS NtResumeThread(HANDLE ThreadHandle,PULONG PreviousSuspendCount)

It returns a status code and optionally a previous suspend count indicating the thread's suspend count before it was decremented, as you might already know suspending a thread x times requires resuming it x times to make it able to run again.
In order to start the thread resumption and to get the previous suspend count, NtResumeThread calls KeResumeThread which prototype is the following :
LONG KeResumeThread(PKTHREAD Thread)

KeResumeThread returns the previous suspend thread count and resumes the thread if the suspend count reached 0. Let's look more closely at this function :

First the IRQL is raised to DISPATCH_LEVEL and the target thread's ApcQueueLock is acquired, after that the previous thread count is saved. If it isn't null, the thread was in fact suspended and the routine wasn't called just by mistake on a thread in a different state. The suspend count is then decremented and checked alongside the freeze count against 0. If both of them are null, the thread must be resumed and here where it gets interesting : A thread is suspended when executing an APC that just waits on the thread's Suspend Semaphore to switch into the signaled state. This Semaphore is initially in the non-signaled state and won't switch its state until the thread has to be resumed or was forced to be resumed (KeForceResumeThread).

Like any other synchronization object (mutex,thread,timer...) a semaphore has a header structure (_DISPATCH_HEADER). Its most important fields are the type, signal state, lock and the wait list head.

The WaitListHead field is the doubly linked list head of the wait blocks (KWAIT_BLOCK) waiting for the synchronization object. Let's dump KWAIT_BLOCK structure :
- The pointers to the next and previous wait block waiting on the same synchronization object are in the LIST_ENTRY WaitListEntry field. e.i : if there are threads waiting on a synchronization object, the dispatch header's WaitListHead field points to the first block's WaitListEntry field. The object fields of the wait blocks in this list is the same, but the thread field isn't.
- The NextWaitBlock field points to the next wait block when the wait is performed using KeWaitForMultipleObjects and each object field in this list points to a different synchronization object but the thread field is the same.
- The WaitKey field is the index of the wait block in the array used by KeWaitForMultipleEvents (either the default or the supplied array : see msdn). Its type is NTSTATUS and serves to know the index of the signaled object in case WaitAny was supplied. In KeWaitForSingleEvent this field is set to 0x00 (STATUS_SUCCESS/STATUS_WAIT_0).
- WaitType : WaitAll or WaitAny when waiting for multiple objects. WaitAny by default when waiting on a single object.

Back to KeResumeThread, if the signal state field value is greater than 0, then the synchronization object is in the signaled state and the wait could be satisfied for the thread(s) waiting on that object (depends on the object though). Compared to a mutex a semaphore is able to satisfy a wait for more than one single thread, a semaphore object has a Limit field in its structure indicating the limit of those threads. In addition, a semaphore has a semaphore count which is the SignalState field ; its value can't be above the Limit. Being in the signaled state, a semaphore will satisfy the wait for semaphore count threads.
KeResumeThread turns the semaphore into the signaled state by incrementing its count and then it calls KiSignalSynchronizationObject. Here's the routine :
The WaitListHead comes into scene in this function, where it is used to walk the doubly linked list of KWAIT_BLOCK structures waiting on the synchronization object. I forgot to mention earlier that the thread object structure KTHREAD stores 4 KWAIT_BLOCK structures in an array, more than one WaitBlock is clearly used when the thread is waiting on multiple objects , the msdn documentation on KeWaitForMultipleObjects discusses that point. The WaitBlock is mainly initialized inside KeWaitForSingleObject or KeWaitForMultipleObjects and then inserted in the tail of the KWAIT_BLOCK structures waiting list of the synchronization object.
You notice from the code above that WaitBlock->WaitType is checked, let's see the type definition of the WaitType field type.

typedef enum _WAIT_TYPE {
    WaitAll,
    WaitAny
} WAIT_TYPE;

- WaitAll means that the wait isn't satisfied for the waiting thread until all the synchronization object become in the signaled state (KeWaitForMultipleObjects).
- WaitAny means that the wait is satisfied for the thread if at least one synchronization object turns into the signaled state.

Let's get back to where we stopped and treat each case alone. If the WaitType is WaitAny, an attempt to unwait the waiting thread is made by calling KiTryUnwaitThread (we'll looking into this function shortly). If the thread exited the wait state, then the synchronization object's signaled state field is decremented. If it reached 0 as a result, we stop iterating through the wait blocks linked list because the synchronization object would be in the non-signaled state.
Now let's see if the WaitType is equal to WaitAll ; In that case only a call to KiTryUnwaitThread is made.
The arguments given to KiTryUnwaitThread are quite different in the two cases. Here is the decompilation of parts that interest us of the function :
The function appears to call KiSignalThread , let's take a look at it too :
In general, KiTryUnwaitThread calls KiSignalThread if the thread is waiting and return a boolean value describing if the thread was signaled or not. In fact this boolean value is returned by KiSignalThread, this function unlinks the thread from the linked list of threads in the waiting state for the processor it was executing in before exiting the running state (WaitPrcb), then it inserts the thread into the deferred ready list and set its state to DeferredReady , after that it sets the Thread->WaitStatus to the same status code passed to KiTryUnwaitThread and then it returns TRUE. KiSignalThread does what I described previously if the Thread->WaitRegister.State == 1; KiCommitThreadWait initializes this field to 1. But if Thread->WaitRegister.State == 0 (this field is initialized to 0 by KeDelayExecutionThread), the WaitStatus is set to the status code and TRUE is returned.
The Thread->WaitStatus field is returned by KiSwapThread function which is called by KeWaitForSingleObject and KeWaitForMultipleObjects. KiSwapThread basically won't return to KiCommitThreadWait until the waiting thread exited the wait state (KiSignalThread). In our case, KiCommitThreadWait returns to KeWaitForXXXObject(s) with the WaitStatus as a return value. This WaitStatus describes the reason why the thread was awaken from its wait state. KeWaitForXXXObject(s) checks on this return value. Here's a very simplified pseudo code of what interests us:

Everything has become quite clear at this stage to explain why KiSignalSynchronizationObject supplies different arguments to KiTryUnwaitThread and also why it decrements the SignalState when the wait type is WaitAny. Let me explain :
When the wait type is WaitAny, this means that the waiting thread entered the wait state upon calling KeWaitForSingleObject or KeWaitForMultipleObject with the WaitAny wait type.Thus, KiTryUnwaitThread is called with the WaitBlock->WaitKey as the wait status. So when the awaken thread returns from KiCommitThreadWait in KeWaitForMultipleObjects the wait status won't be STATUS_KERNEL_APC and we'll bail out directly returning the index of the signaled synchronization object. In this case, the synchronization object signal state wasn't touched that's why it must be decremented after successfully unwaiting the thread.
Let's see now, if the wait type is WaitAll ; this implicates that the waiting thread waits for multiple objects to become in the signal state. That's why KiTryUnwaitThread is called with STATUS_KERNEL_APC so that KeWaitForMultipleObjects iterates again and checks the signaled state of all synchronization objects. If it turns out that they're all signaled KeWaitForMultipleObject takes care this time of decrementing or zeroing (depends on the object) the signal state of all the synchronization objects the thread was waiting on.

Let's continue the story of the suspended thread and see what happens when it's finally resumed. Now that the semaphore is signaled and therefore the thread is deferred ready thanks to KiSignalThread, it will be scheduled to run soon. When it does run it will return from KeWaitForSingleEvent with a STATUS_SUCCESS/STATUS_WAIT_0 (Status != STATUS_KERNEL_APC). We're now in the kernel APC routine after returning...

Conclusion :
While thread suspension relies on queuing a kernel APC calling WaitForSingleEvent on a suspend semaphore, thread resumption takes us more deeply into exploring synchronization objects and how the waiting threads behave differently when waiting on a single or multiple objects.

I hope you enjoyed reading this post.
Follow me on twitter : Here

No comments:

Post a Comment