Execution happens in the context of a Thread. A thread has a stack and a register state. One thread can be executed on a given processor at any one time, although multiple threads might be ready to run, waiting for their turn. Execution of a thread may be interrupted asyncronously by an interrupt or by a call to an object that calls the scheduler synchronously. The scheduler is called each time a thread blocks or a new thread becomes runnable or a timer interrupt occurs. The scheduler decides which thread should run next and programs the timer interrupt as needed. The synchronization primitives control access to shared resources and provide blocking and signalling between threads and affects scheduling.
There can be multiple different schedulers, as long as they implement the common scheduling interface and provide sufficient semantic as required by applications. The simplest scheduler supports just one thread. A more sophisticatyed scheduler might provide time sharing, a yet more interesting scheduler provides real-time scheduling with feedback.
The system provides mutexes and condition variables for thread synchronization.
A mutex is a semaphore object that prevents multiple threads from accessing a resource concurrently. MMLite supports blocking on a mutex, and also integrates mutexes with constraint-based thread scheduling.
The following functions manage mutexes.
|
MUTEX |
|
|
|
A mutex object. |
|
Mutex_Destroy |
|
|
|
Deletes a mutex object. |
|
Mutex_Init |
|
|
|
Initializes a mutex object. |
|
Mutex_Lock |
|
|
|
Acquires a lock on a mutex object, waiting for its availability, if necessary. |
|
Mutex_Locked |
|
|
|
Determines if the current thread owns lock on a a specified mutex. |
|
Mutex_TryLock |
|
|
|
Attempts to acquire lock on a a mutex object, but doesn't wait if it isn't available. |
|
Mutex_Unlock |
|
|
|
Releases a previously acquired lock on a mutex object. |
A condition variable implements thread wait and wakeup. For example, a thread can be put to sleep and told to wait until an interrupt signal is received. When the condition is signaled, the thread resumes execution. Operations on condition variables include the following.
|
CONDITION |
|
|
|
Structure for representing condition variables. |
|
Condition_Broadcast |
|
|
|
Wakes up all threads waiting on a condition variable. |
|
Condition_Destroy |
|
|
|
Deletes a condition variable. |
|
Condition_Init |
|
|
|
Creates and initializes a condition variable. |
|
Condition_InterruptSignal |
|
|
|
Wakes up a thread waiting on a condition variable. From interrupt context. |
|
Condition_Signal |
|
|
|
Wakes up a thread waiting on a condition variable. From thread context. |
|
Condition_WaitAndBeginConstraint |
|
|
|
Makes thread wait for a condition with an optional timeout, then restarts thread with constraint. |
|
Condition_Wait |
|
|
|
Makes thread wait for a condition variable with optional timeout. |
Atomic Functions provide thread-safe counting without mutexes and building blocks for simple lock-free algorithms.
|
AtomicAdd |
|
|
|
Adds a value to a parameter. |
|
AtomicCmpAndSwap |
|
|
|
Compares a parameter to a value and if equal, sets the parameter to a new value. |
|
AtomicDec |
|
|
|
Decrements a parameter. |
|
AtomicInc |
|
|
|
Increments a parameter. |
|
AtomicSub |
|
|
|
Subtracts a value from a parameter. |
|
AtomicSwap |
|
|
|
Sets a parameter to a value. |
A thread is the basic entity to which the operating system allocates CPU time. When ab executable image is loaded, a new thread starting at the entry point of the image is also created. This thread can create additional threads that execute independently.
A thread can execute any part of the program's code, including a part executed by another thread. Threads can also execute code that is part of some other process, typically by invoking a method on some separately loaded object. Method invocation is transparent across any logical process boundaries, including address space transitions, should an MMU ever be present in hardware.
Each thread maintains a set of structures for saving its context while waiting to be scheduled for processing time. The context includes the thread's set of machine registers, a thread description block, and a stack. The stack is by far the main contributor in the memory cost of a thread. Stack size is determined at thread creation time, either explicitly by the programmer or by using a default size for each process.
All threads of a process share the virtual address space and can access the global variables and system resources of the process. In the absence of virtual memory, all threads have access to all of the memory.
MMLite schedules all runnable threads according to the scheduling policies implemented by the system scheduler. By default, threads are scheduled with a time-sharing policy not optimized for real-time processing. This policy attempts to divide the available CPU time equally among all its threads. Threads that have been suspended for the longest time have the highest urgency under this policy.
To support real-time processing, a scheduling policy is implemented using time constraints. A constraint is a set of scheduling parameters that include earliest start time, deadline, execution time estimate, and criticality. The urgency of a real-time thread is defined by the constraint in effect for it. The urgency of a constraint is the deadline minus the estimated execution time. The BaseRtl interface provides several constraint-related API functions for setting scheduling parameters for a currently running thread. This allows applications to implement fine-grained control of real-time behavior.
A thread is created using IProcess::CreateThread. A thread can terminate its execution by calling ThreadExit. Other thread functions are managed by the IThread interface, which implements the following methods.
|
CurrentThread |
|
|
|
Gets a pointer to the thread object that represents the currently running thread in its process. |
|
IThread::GetProcess |
|
|
|
Gets a pointer to the process object that the thread runs in. |
MMLite provides a constraint-based real-time scheduler with feedback to dispatch threads. The programmer dynamically provides the scheduler with information about the time constraints under which a certain section of code is to be executed. The scheduler will make sure that the thread is scheduled before the given deadline with enough projected time to complete the indicated computing. To prevent priority inversion, the currently running thread's constraints are inherited by other threads that might block it.
In the absence of programmer-specified constraints, the constraint scheduler schedules threads in a time-sharing fashion. Time-sharing is internally implemented using artificial constraints. Threads that specify time constraints are given priority over all time-sharing threads.
Constraints are specified through the use of the BeginConstraint and EndConstraint functions. Basically, when a thread needs to execute a block of code by a certain point in time, it first calls BeginConstraint, passing a structure which includes the following four parameters:
|
Start time |
|
|
|
Earliest time desired for execution. Measured in absolute time. Relative time will be converted to absolute time. Thread sleeps until this time arrives. |
|
Estimated time |
|
|
|
Estimated time for the execution of the thread inside this constraint. Measured in absolute time (a time interval). |
|
Deadline |
|
|
|
Latest time required for completion of execution inside this constraint. Measured in absolute time. Relative time will be converted to absolute time. |
|
Criticality |
|
|
|
Importance of meeting the deadline. Criticality is a data type with values of CRITICAL and NONCRITICAL. Only safety-critical (critical real-time) activities use the CRITICAL value. |
The scheduler compares the real-time needs of this thread with that of other threads, and then schedules each thread to meet its deadline. If the scheduler predicts that it will not have enough time to complete execution of the block before its deadline, BeginConstraint returns an out-of-time error code. This is an exceptional situation, and the thread should execute some recovery action, perhaps shedding part of the load.
The end of the block is marked by a call to EndConstraint. This call serves two purposes:
It tells the scheduler that the block which was under a constraint has completed execution.
It gives feedback to the thread about the actual amount of processor time that it took to execute the block.
The thread can use this feedback in future BeginConstraint calls to more accurately predict the amount of processor time it will consume.
Three common execution styles are:
Periodic execution, in which a thread runs every N milliseconds. The earliest start time parameter is used to implement this execution style, it is incremented by N milliseconds on each cycle, and a new BeginConstraint call is made. The thread is suspended until the earliest start time arrives.
Event-driven execution, in which a thread unpredictably runs when certain events materialize. In this case, the earliest start time is unpredictable, but the expected amount of processor time and the deadline (relative to the start time) are known in advance. This execution style is implemented via the Condition_WaitAndBeginConstraint function.
Data-driven execution, in which a thread receives a block of data to process, perhaps at a constant rate, but the computation time is unpredictable and dependent on the data item itself. This style of execution can be implemented by having the thread wake up and inspect the data, determine the expected amount of processor time and the deadline, and then call BeginConstraint again with new, more accurate estimates. Pessimistic estimates are acceptable, but more accurate information benefits overall system performance and throughput.
Constraints do not start or stop threads. They provide a way to modify the scheduling of threads to meet real-time deadlines. You must provide code to handle the results of a constraint's indication of a missed deadline. You can repeatedly call BeginConstraint (to indicate EndPreviousConstraint is TRUE), if you need to have a thread be scheduled to meet a succession of deadlines. The term chained constraint is sometimes used to indicate such a succession of deadlines.
Scheduling is supported by the following BaseRtl API calls.
|
BeginConstraint |
|
|
|
Sets a constraint for the current thread and optionally ends the previous constraint. |
|
CRITICALITY |
|
|
|
Data type for indicating how well a thread handles missed deadlines. |
|
EndConstraint |
|
|
|
Ends the current constraint, reports the time taken since the constraint began, and restores the previous constraints. |
|
TIME |
|
|
|
Data type for representing a number of 100-nanosecond intervals. |
|
TIME_CONSTRAINT |
|
|
|
Structure for defining constraint values. |
|
TIME_FOREVER |
|
|
|
Macro for determining the largest possible absolute time. |
|
TIME_MICROS |
|
|
|
Macro for converting microseconds into TIME units. |
|
TIME_MILLIS |
|
|
|
Macro for converting milliseconds into TIME units. |
|
TIME_ORIGIN |
|
|
|
Macro for determining the smallest possible absolute time. |
|
TIME_RELATIVE |
|
|
|
Macro for converting an absolute time to a relative time. |
|
TIME_SECONDS |
|
|
|
Macro for converting seconds into TIME units. |
The system scheduler preemptively context switches among threads. A thread does not need to explicitly call a system function to yield the processor to another thread. Any of the following conditions can cause a context switch from one thread to another:
The currently running thread exhausts its time slice.
The currently running thread becomes blocked waiting for some resource.
A blocked thread of higher priority becomes runnable.
The scheduler uses a Minimum-Laxity-First algorithm to determine the execution priority of threads. In this algorithm, the laxity is:
laxity = deadline_time - current_time - CPU_time_needed
A thread with a low laxity must run promptly to enable it to meet its deadline. For example, a thread with a laxity of zero must run immediately. This scheduling algorithm results in very efficient scheduling when the system is not in an overload condition. Threads with no associated deadline receive the lowest priority and context switch among each other in a round-robin fashion.
Preemptive context switching prevents ill-behaved threads from monopolizing the processor, causing other threads to miss their deadlines. The scheduler also includes mechanisms to deal with overload conditions. The next section discusses these mechanisms.
Detection and resolution of overload situations is a design requirement of any proper multitasking, real-time system. Choosing a mechanism to prevent, detect, and manage overload involves a number of tradeoffs.
Overload prevention is the simplest solution to the overload requirement. By preventing overload situations in the first place, the system never needs to worry about what to do in an overload situation. In a properly designed system, overload detection and management are unnecessary.
Overload prevention results from predicting in advance the worst-case performance characteristics of each task in the system. A scheduler uses this information to run combinations of tasks that don't cause the system to enter an overload condition. When a request to load or run a task fails, due to a prediction that the task may cause an overload, the system provides a clean feedback mechanism to the requesting application or driver. When the request to load or run a task succeeds, the requesting application or driver receives assurance that there is enough projected real time available to run the task.
The predictive approach does have a disadvantage: by predicting the worst-case load for each task before running the task, the system maintains an essentially static picture of system load based upon an assumed worst-case scenario. This leads to underutilization of the system, because over time the average load of all threads will be much less than the worst case. Because the worst case might exceed the average case by an order of magnitude or more, underutilization of the system can be quite dramatic.
An alternative to overload prediction is the dynamic detection and resolution of overload situations. This approach produces much better utilization of processor bandwidth but requires more intelligent applications to manage overload situations.
The constraint scheduler uses a combination of prediction and detection to try to provide the best of both approaches. It supplies a mechanism to allow programs to dynamically predict and communicate to the scheduler their own expected processor requirements. The system then uses that information to schedule threads within their real-time requirements and informs threads if they cannot be scheduled due to a potential overload situation. The scheduler also detects when a thread has underestimated its real-time requirements, preempts that thread so that it does not degrade the performance of other threads in the system, and provides a feedback mechanism to the thread so that it can correct its future predictions.
This method does place a few demands on the implementation of application threads. Before executing a block of time-critical code, a thread might find out that there is not enough real time available for it to execute the block. Even after executing a block of time-critical code, the thread may find that it missed its deadline, typically because it underestimated its real-time processing needs. In either of these cases, the thread will have to put in place a policy to either reduce its real-time processing requirements or to correct its estimate of its real-time needs. In addition, during normal operation, the system may become instantaneously overloaded. That is, because of a rare combination of (interrupt) events, not all the threads that should run will be able to run. This is different from a permanent overload situation, where there are just not enough processor cycles (or other resources) to perform all necessary computations.
The system deals with a temporary overload condition by allowing program control over which of the threads with real time requirements should fail their deadlines. The application designer, therefore, can define the set of tasks that should never fail their deadlines; for example, because of life-threatening conditions associated with such failure. The design can preclude an overload condition when only these threads are present. The remaining activities might have real-time scheduling requirements, but they will not be able to disrupt the operation of the life-critical threads.
The constraint scheduler is an extension of deadline-based scheduling. Threads are context switched at any time based upon decisions by the system scheduler. A thread may specify real-time constraints to the scheduler through the use of the BeginConstraint and EndConstraint functions. Basically, when a thread needs to execute a block of code by a certain point in time, it first calls BeginConstraint, and then passes a structure that includes the following four parameters:
The deadline by which the block must complete execution.
The expected amount of processor time that the application believes it will take to execute the block.
The earliest start time to begin executing the block.
A criticality value that the scheduler uses to prioritize threads under overload conditions.
At this point, the scheduler can compare the real-time needs of this thread with that of other threads and reschedule the thread to meet its deadline. If the scheduler predicts that it does not have enough time to complete execution of the block before its deadline, BeginConstraint returns an S_OUTATIME error code.
The constraint mechanism is the lowest level of real-time resource management in the system. It can manage processor resources with a relatively fine time resolution (on the order of tens of microseconds).
EndConstraint marks the end of a constrained block. This call serves two purposes:
It tells the system that the constrained block of code has completed execution.
It feeds back to the thread the actual amount of processor time that it took to execute the block.
The thread can use this feedback in future BeginConstraint calls to more accurately predict the amount of processor time it will consume.
Typically, a thread has only a rough estimate of the amount of real-time that it will take to complete some task. In some cases, the thread will have no idea, or its estimate will be far above or below the actual amount of real time required. However, the amount of time required to perform a task will remain fairly constant each time the task is performed on a particular hardware platform. The first time a thread enters a constrained block of code, it will probably either underestimate or overestimate the amount of real-time required. However, after executing the block once, it can use the information supplied by the scheduler to correct its estimate so that the next time through the block it can correct its estimate.
The application design should ensure that if the first execution of a constrained block misses its deadline, the end user will not notice any negative effects. Alternatively, the system designer can precalculate the expected real time on the reference platform and compile this information into his code. An even more sophisticated implementation could execute each block during software installation or initialization and save the appropriate constraint values for later use.
Whenever the system schedules a thread to run, it programs the hardware timer to interrupt after a prescribed time. The prescribed time is the time requested by the thread to execute the constrained block of code before the system interrupts to perform a context switch. This mechanism reduces the overhead of context switching in the system (because the timer does not need to interrupt at a fixed, high frequency) while maintaining real-time priorities for each thread. If a thread underestimates the amount of processor time it needs to complete a block of code, it might be context switched out before completing its task and calling EndConstraint. In this case, it will probably miss its deadline. The thread can use the information returned from EndConstraint either to correct its estimate of its processing time requirements or to determine how much to degrade its operation to fit within the amount of time it wishes to use.
There is also an exceptional case where a thread that has correctly predicted its processor time still fails to meet its deadline. This can happen in the following scenario:
1 The thread calls BeginConstraint, which returns success because the scheduler can schedule the constrained block of code to run within its deadline. Due to load conditions though, the thread is scheduled very close to its deadline.
2 An I/O interrupt occurs, and its associated interrupt routine wakes up a second thread.
3 This second thread has a tighter deadline and a higher criticality value than the original thread. Therefore, the scheduler decides to give the processor to this second thread. There is now not enough time to allow the original thread to meet its constraint.
The system also includes a number of facilities for resource management and allocation. Chief among these is support for mutex (semaphore) objects to prevent multiple threads from accessing a resource concurrently. The system supports blocking on a mutex, and also integrates mutexes with constraint-based thread scheduling. If a constrained thread is blocked on a mutex, the thread that owns the mutex can inherit the constraint of the blocked thread to allow that owner to complete its use of the resource and release it. This is called priority inheritance. It is a solution to a problem known as priority-inversion where an otherwise low-priority task acquires a resource needed by a higher-priority task, thereby preventing it from executing. The system also implements methods for signaling using condition variables and interprocess communication.
Almost all Talisman multimedia tasks have real-time requirements such as a start time and a deadline. The system provides threads with fine-grained control of execution through the use of constraints. A constraint is a set of scheduling parameters that include earliest start time, deadline, execution time estimate, and criticality. A running thread dynamically declares constraints to the scheduler.
Some multimedia tasks have regular and predictable execution patterns. An MPEG audio decoder, for example, does a fairly constant unit of work in a well-defined time period. Such a task can, after performing a unit of work, register a constraint (using BeginConstraint) that causes itself to run again in time for the next unit of work. (For example, it could adjust the constraint so that the start time and deadline are incremented by one frame time from the current constraint).
Some tasks may not have regular periods because the media data arrives asynchronously. They cannot be scheduled using the mechanism discussed above. These threads can use the Condition_WaitAndBeginConstraint function to block on a condition and wake up with a constraint. Other tasks have execution times that cannot be determined in advance, because they are dependent on the data in the media stream. These threads can block on a condition (using Condition_Wait), and upon awakening examine the data stream (which may include timing data) and register a constraint.
Finally, some tasks don't have high time criticality. This is the case with some graphics-rendering operations and GDI/DDI functions. These threads can block waiting for data, and then run without constraints. The scheduler will run all such threads in a time slice after all of the threads with constraints have run.
When an interrupt occurs, the MMLite interrupt dispatcher transfers control to the interrupt service routine (ISR) that is responsible for initiating any actions necessary to service the interrupting device. Application programs install their ISRs by calling the BaseRtl AddDevice function. The ISR executes within the context of the thread that was running when the interrupt was asserted and uses the interrupted thread's stack.
To avoid interfering with the constraint-based scheduling system, the interrupt service routine must be as short and fast as possible. The ISR should do the minimum amount of processing necessary at the time of the interrupt. If additional processing is required, the ISR should wake up a device-handler thread to run in a fully preemptible-thread context. Notice that this is mandatory in any case where the processing requires synchronization of some form, since the ISR itself cannot block. MMLite base code is always interruptible, and the longest time with interrupts disabled is only a few cycles—much less than a microsecond.
Typically, a user-installed ISR performs one or both of the following:
Signals a condition variable. This awakens the thread that is waiting on the condition variable.
Accesses a physical device's registers to control or monitor the state of the device. Any lengthy device operations should wait until the device-handler thread begins executing.
The following example is an ISR that an application program installs.
BOOL MyIsr(void *pDevice, BOOL *pNotMine)
{
BOOL status;
MYDEVICE *pDev = (MYDEVICE *)pDevice;
/* Acknowledge interrupt by writing to device register. */
*pDev->pIntAckReg = 0x01;
/* Indicate acceptance of this interrupt. (Optional) */
*pNotMine = FALSE;
/* Awaken device-handler thread. */
status = Condition_InterruptSignal(&pDev->pMyCondVar);
return (status);
}
In the preceding example, the first parameter is a pointer to a device-specific MYDEVICE structure that contains a pIntAckReg member that points to a device register. The structure also contains the pMyCondVar condition variable. The ISR clears the interrupt-request signal from the device by writing to the register location pDev->pIntAckReg. The pMyCondVar condition variable, when signaled, awakens the thread that handles the bulk of the interrupt processing.
Note that the current implementation of the system's interrupt dispatcher simply ignores the value that the ISR writes to the location pointed to by the second parameter, pNotMine. In a future implementation, however, the pNotMine parameter might be used to enable chaining of multiple interrupts so that they can share a single interrupt-request level. When interrupt chaining is supported, an ISR will write the value FALSE to *pNotMine to indicate that it does not recognize the interrupt. The interrupt dispatcher initializes it to FALSE before invoking the ISR, assuming that the ISR will recognize the interrupt.
The value returned by an ISR indicates whether the interrupt dispatcher needs to call the scheduler to perform rescheduling as a result of the interrupt. In the preceding example, the value returned by the ISR is the same value returned from the call to Condition_InterruptSignal. If a thread was waiting on the condition variable that was signaled, Condition_InterruptSignal readies the thread to run and returns the value TRUE. When this occurs, the ISR must also return TRUE to ensure that the thread is scheduled to begin execution in a timely manner. If the ISR does not need to invoke Condition_InterruptSignal, it should return FALSE to avoid an unnecessary invocation of the scheduler.
An application program calls the AddDevice function to install an ISR. The function has the following syntax.
SCODE AddDevice(void *pDevice, void *pIsr,
UINT Arg0, UINT Arg1, UINT Arg2)
The five parameters have these meanings:
The first parameter, pDevice, is the value the interrupt dispatcher passes as a parameter to the ISR. Although pDevice is defined to be type (void *), it typically contains a pointer to a structure containing device-specific information needed by the ISR. In general, this structure varies from device to device, although a common structure might apply across a family of similar devices.
The second parameter, pIsr, is a pointer to the user's ISR. The dispatcher calls this ISR when the designated interrupt occurs.
The last three parameters contain implementation-specific information. In the current implementation of AddDevice, Arg1 contains the interrupt level associated with the ISR; the ISR is called when an interrupt occurs at the specified level. The Philips TM-MS architecture, for instance, defines interrupt levels 0 to 31, although not all 32 levels are available for application programs. The current implementation of AddDevice ignores the Arg0 and Arg2 parameters, but these should both be specified as zero for the sake of compatibility with future enhancements.
AddDevice returns a status code of S_OK if the function is successful; otherwise, it returns an error code.
To install the ISR in the preceding example, the call to AddDevice looks like this:
MYDEVICE MyDevice = {...};
#define IRQ_LEVEL 6
AddDevice((void *)&MyDevice, (void *)MyIsr, 0, IRQ_LEVEL, 0);
The &MyDevice parameter refers to an instance of the MYDEVICE structure that was described previously. When the interrupt occurs, the address of this structure is passed to the ISR as an parameter. The MyIsr parameter is the ISR function in the example. The IRQ_LEVEL parameter is a small, nonnegative integer value that specifies an interrupt-request level. The third and fifth parameters are not used in the current implementation and are specified as zero.
In general, an interrupt-service routine should avoid using any resources that may be in use by other threads. The only possible way to synchronize an ISR and a thread is via lock-free algorithms. The base supports these algorithms with the Atomic Functions and the IQueue interface. Refer to a parallel programming textbook for further information and examples of lock-free algorithms.