Function RTKSetCPUMask

This function can be used to restrict the execution of a thread to a specific subset of logical CPUs:

RTKBool RTKSetCPUMask(RTKTaskHandle Handle, DWORD Mask);

Parameters

Handle

Handle of the task for which to set the CPU mask.

Mask

The CPU mask to apply. If bit 0 is set, the thread may run on CPU 0. If bit 1 is set, the thread may run on CPU 1, etc.
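The bit layout described above can be sketched with two small helpers. This is only an illustration: CPUBit and MaskAllowsCPU are hypothetical names that are not part of RTKernel-32, and the DWORD typedef merely stands in for the 32-bit unsigned type normally supplied by the system headers.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the 32-bit unsigned DWORD type (assumption: with the
   real headers this typedef is not needed). */
typedef uint32_t DWORD;

/* Hypothetical helper: the mask bit that permits execution on one CPU. */
static DWORD CPUBit(unsigned cpu)
{
    assert(cpu < 32);                 /* a DWORD mask has 32 bits */
    return (DWORD)1u << cpu;
}

/* Hypothetical helper: nonzero if Mask permits execution on the CPU. */
static int MaskAllowsCPU(DWORD Mask, unsigned cpu)
{
    return cpu < 32 && (Mask & ((DWORD)1u << cpu)) != 0;
}
```

For example, CPUBit(0) | CPUBit(2) yields the mask 0x5, which permits CPUs 0 and 2 but not CPU 1.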

Return Value

If the specified CPU mask is applied, the function returns TRUE; otherwise, it returns FALSE.

The mask for the Idle Tasks may only be changed to include all available CPUs.

If the task whose mask is changed is currently running on a different CPU, the new mask takes effect at the next task switch on that CPU. In all other cases, the new mask is applied immediately.

If this function is never called for a thread, the thread may run on any logical CPU (i.e., the applied mask is ((1 << RTKCPUs()) - 1)). Mask 0xFFFFFFFF may be specified to restore this default state. Any other value of parameter Mask must not have bits set for CPUs which do not exist, and the value 0 is not allowed.
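The rules for parameter Mask can be checked before calling the function. The following is a minimal sketch, not the kernel's own validation: DefaultCPUMask, IsAcceptableCPUMask, and RESTORE_DEFAULT_MASK are hypothetical names, the CPU count is passed in as a plain argument (in a real program it would presumably come from RTKCPUs()), and the DWORD typedef stands in for the system headers. The case of 32 CPUs is handled separately because shifting a 32-bit value by 32 is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t DWORD;   /* stand-in for the system DWORD type */

#define RESTORE_DEFAULT_MASK 0xFFFFFFFFu   /* hypothetical name */

/* Default mask for cpuCount CPUs, i.e. ((1 << cpuCount) - 1),
   computed without an out-of-range shift when cpuCount is 32. */
static DWORD DefaultCPUMask(unsigned cpuCount)
{
    assert(cpuCount <= 32);
    return (cpuCount == 32) ? 0xFFFFFFFFu
                            : (((DWORD)1u << cpuCount) - 1u);
}

/* Returns nonzero if Mask obeys the documented rules: not 0, and
   either the restore value 0xFFFFFFFF or only bits of existing CPUs. */
static int IsAcceptableCPUMask(DWORD Mask, unsigned cpuCount)
{
    if (Mask == 0)
        return 0;
    if (Mask == RESTORE_DEFAULT_MASK)
        return 1;
    return (Mask & ~DefaultCPUMask(cpuCount)) == 0;
}
```

On a 4 CPU system, this sketch accepts 0x5 and 0xFFFFFFFF but rejects 0 and 0x10, since bit 4 refers to a CPU that does not exist.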

Please note that limiting the number of CPUs a thread may run on can degrade real-time performance. If, for example, a thread with a high priority is activated on CPU 0, but this thread may only run on CPU 1, then an IPI (Inter-Processor Interrupt) must be sent from CPU 0 to CPU 1 to inform the scheduler on CPU 1 that a task switch is required. Sending an IPI typically takes about 1 to 10 microseconds. The activation of this high priority thread is thus delayed by this time.

The real-time behavior of a program can also be compromised. RTKernel-32 guarantees real-time behavior primarily through Scheduling Rule 1: "Of all threads in the state Ready, the N threads with the highest priorities run, where N is the number of available CPUs." However, when function RTKSetCPUMask is used, this rule cannot always be fulfilled, as illustrated by this example:

Consider a system with 2 logical CPUs (0 and 1) and 3 threads (A, B, and C) with priorities 1, 2, and 3, respectively. Thread A may run only on CPU 0 while threads B and C may run only on CPU 1. When all 3 threads are ready to run, threads A and C run. This violates Scheduling Rule 1, as thread B (priority 2) is not running even though it has a higher priority than running thread A (priority 1).
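The example can be reproduced with a toy model. The sketch below is not the RTKernel-32 scheduler; it is a simple greedy model written for illustration, in which Ready threads are visited from highest to lowest priority and each receives the lowest-numbered free CPU that its mask permits. All names (ToyThread, ToySchedule, ToyExample) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t DWORD;   /* stand-in for the system DWORD type */

#define MAX_CPUS    32
#define NOT_RUNNING (-1)
#define UNVISITED   (-2)

/* Toy thread descriptor (hypothetical, for illustration only). */
typedef struct {
    char  Name;
    int   Priority;   /* higher value = higher priority */
    DWORD Mask;       /* bit n set: thread may run on CPU n */
    int   CPU;        /* result: assigned CPU or NOT_RUNNING */
} ToyThread;

/* Greedy model: highest priority first, lowest free permitted CPU. */
static void ToySchedule(ToyThread *t, int n, int cpuCount)
{
    int busy[MAX_CPUS] = {0};
    int i, round, cpu;

    assert(cpuCount >= 1 && cpuCount <= MAX_CPUS);
    for (i = 0; i < n; i++)
        t[i].CPU = UNVISITED;

    for (round = 0; round < n; round++) {
        int best = -1;

        /* pick the unvisited thread with the highest priority */
        for (i = 0; i < n; i++)
            if (t[i].CPU == UNVISITED &&
                (best < 0 || t[i].Priority > t[best].Priority))
                best = i;

        t[best].CPU = NOT_RUNNING;
        for (cpu = 0; cpu < cpuCount; cpu++)
            if (!busy[cpu] && (t[best].Mask & ((DWORD)1u << cpu))) {
                busy[cpu] = 1;
                t[best].CPU = cpu;
                break;
            }
    }
}

/* Threads A, B, C from the example above on a 2 CPU system. */
static void ToyExample(ToyThread out[3])
{
    out[0] = (ToyThread){ 'A', 1, 0x1u, NOT_RUNNING };  /* CPU 0 only */
    out[1] = (ToyThread){ 'B', 2, 0x2u, NOT_RUNNING };  /* CPU 1 only */
    out[2] = (ToyThread){ 'C', 3, 0x2u, NOT_RUNNING };  /* CPU 1 only */
    ToySchedule(out, 3, 2);
}
```

In this model, thread C takes CPU 1, thread B finds no free permitted CPU, and thread A takes CPU 0, which matches the outcome described above. If all three masks are 0x3 instead, threads C and B run and Scheduling Rule 1 holds.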

Limiting the usable set of CPUs per thread can also degrade overall performance, mostly due to the higher IPI load and because some CPUs may sit idle when their masks prevent them from running threads in the state Ready. Here are some benchmark figures collected with demo program MPPrimes, test "T2", on a 4-core target:

Test	Time (s)	IPIs	Comments
Default	2.637	19526	All threads may run on all CPUs
RANDOM_CMASK	4.621	132671	Each thread has a random CPU mask
UNIQUE_CMASK	4.742	141418	Each thread may run on only 1 CPU
UNIQUE_CMASK INC_PRIOS	8.682	470407	Each thread may run on only 1 CPU; each thread has a unique priority

For best performance, always ensure that threads activated through interrupts may run on the CPU on which these interrupts arrive. Function RTMPBalanceINTCPUs can be used to send hardware interrupts to specific CPUs. The only exception is the Timer interrupt, which is always directed to CPU 0.

Function RTKCPUs