17th ACM Symposium on Operating Systems Principles (SOSP'99)

Published as Operating Systems Review 34(5): 247–260, Dec. 1999

Progress-based regulation of low-importance processes


 


John R. Douceur and William J. Bolosky

Microsoft Research
Redmond, WA 98007

{johndo, bolosky}@microsoft.com


 

 


Abstract

MS Manners is a mechanism that employs progress-based regulation to prevent resource contention with low-importance processes from degrading the performance of high-importance processes.  The mechanism assumes that resource contention that degrades the performance of a high-importance process will also retard the progress of the low-importance process.  MS Manners detects this contention by monitoring the progress of the low-importance process and inferring resource contention from a drop in the progress rate.  This technique recognizes contention over any system resource, as long as the performance impact on contending processes is roughly symmetric.  MS Manners employs statistical mechanisms to deal with stochastic progress measurements; it automatically calibrates a target progress rate, so no manual tuning is required; it supports multiple progress metrics from applications that perform several distinct tasks; and it orchestrates multiple low-importance processes to prevent measurement interference.  Experiments with two low-importance applications show that MS Manners can reduce the degradation of high-importance processes by up to an order of magnitude.

Categories and subject descriptors:  D.4.1 [Operating sys­tems]: Process management – scheduling; G.3 [Probability and statistics]; G.4 [Mathematical software]

General terms:  Algorithms, Measurement, Performance

Additional key words and phrases:  progress-based feed­back, symmetric resource contention, process priority

1.     Introduction

We have developed a method that prevents low-importance processes from interfering with the performance of high-importance processes.  Such interference is generally caused by contention over shared resources, so our method relieves this contention by suspending the low-importance process whenever it detects resource contention.  Our method detects resource contention by means of progress-based feedback:  A control system monitors the progress of the low-importance application and regards a drop in the progress rate as an indication of resource contention.  We refer to this control system as MS Manners, because it determines when a low-importance process should politely defer to a high-importance process.

The need for such a method arose in the development of the SIS Groveler, a Microsoft® Windows® 2000 system utility that finds and merges duplicate files on a file system.  We needed a mechanism that would allow our low-importance utility to use idle resources without degrading other processes.  The Groveler is I/O-bound, so CPU priority is insufficient to control its resource usage.  The Windows 2000 kernel does not support priority for other resources, and we did not wish to modify the kernel.  The Groveler runs in server environments with continuously running applications and unpredictable workload schedules, which disqualifies all prior solutions of which we are aware.

We found that this method worked well for the Groveler, so we expected it might work well for other long-running, low-importance applications, such as a backup system, a disk defragmenter, a file archiver, a virus scanner, or other housekeeping utilities.  Therefore, we re-packaged our method as a general application in two different forms: a library that can be called by low-importance applications, and an executable program that externally monitors and regulates low-importance applications that report progress through a standard Windows NT® mechanism.

Performance measurements indicate that MS Manners is very effective at reducing the impact of low-importance processes on high-importance processes.  In two example low-importance applications, MS Manners reduced the performance hit on high-importance applications by as much as an order of magnitude:  Running a low-importance process increased the execution time of a concurrent high-importance process by 90%; applying MS Manners reduced this increase to 7 – 12%.

In the following sections, we review previous approaches to low-importance process regulation, we explain the assumptions and limitations of our approach, we describe our technique in detail, and we present experimental results on its effectiveness.

2.     Previous approaches

Many approaches to low-importance process regulation have been proposed and implemented, such as scheduling for specific times, running as a screen saver, scanning the system process queue, and various resource-specific methods.  These approaches vary considerably in their generality, complexity, and invasiveness.  Our approach is not strictly better than any of the others; it is merely another design point in the overall problem space.

If system activity is at least coarsely predictable, one simple approach is to schedule a low-importance process to run during expected periods of little or no system activity, such as the middle of the night.  This technique fails to exploit unanticipated idle times, and it fails to regulate during periods of unanticipated activity.  Moreover, if system activity is not predictably periodic, this technique does not work at all, and a mechanism to dynamically determine idle periods is needed.

One method for determining idle periods is to use the approach of a screen saver: activate the low-importance process in the absence of keyboard or mouse activity.  This approach relies on the assumption that a lack of user input indicates that the system is unused.  This assumption may be reasonable for a desktop workstation, but it is not valid for a server, which is often busy but which rarely receives direct user input.

Another approach is to run low-importance processes only when no high-importance processes are in the system process queue [26].  However, a high-importance process may be in the process queue without consuming significant resources.  For example, a database-server application might run continuously but only require resources when given a workload.  In such a scenario, this approach would never allow a low-importance process to run.

A time-honored method [12] is to use CPU priority.  When assigned a low CPU priority, a low-importance process is prevented from using the CPU when any normal priority process is using it.  This works well if the CPU is the limiting resource.  If another resource – such as disk or memory – is the limiting factor, the CPU is uncontended, so CPU priority is ineffective.

Priority can be extended to other resources besides the CPU.  In general, this requires modifying the OS kernel.  For example, Stealth [11] is a process scheduler that prioritizes CPU, virtual memory, and file system cache.  This extension is still limited to a few specific resources, and the techniques are resource-specific.

A similar but more general approach is performance isolation [27], which supports various allocation and sharing policies for multiple resources.  If a low-importance process is entitled to zero resources, it can only use shares of resources that are unused by other processes.  Although the performance isolation framework is general, each resource requires a specific isolation technique.

Another resource-specific approach comes from the domain of real-time systems.  Resource kernels [19] schedule multiple system resources among concurrent processes.  All processes must make explicit reservations for resources, and actual resource usage is monitored and enforced by the kernel.

Our approach is resource-independent, and it requires no kernel modifications.  It works in server environments in which high-importance applications run continuously and receive workloads at unpredictable times.  These features differentiate it from previous approaches.

3.     Assumptions and limitations

Like the earlier approaches described above, MS Manners relies on several assumptions, primarily the symmetry of performance impact from resource contention and the accessibility of progress measures that accurately reflect resource consumption.

Our driving assumption is that the performance impact from resource contention is roughly symmetric.  If a low-importance process is substantially degrading a high-importance process, then the low-importance process will also be substantially degraded, so our method will observe the effect and appropriately suspend the low-importance process.  On the other hand, if the degradation of the low-importance process is too small to observe, then the impact on the high-importance process is likely to be small as well, so the low-importance process can continue.

This assumption of symmetry can be violated if the operating system does not aim for fairness in sharing resources among processes.  For example, a disk scheduler might favor small transfers over large ones, so a low-importance process making small disk transfers will degrade the performance of a high-importance process making large disk transfers, without itself experiencing any reciprocal degradation.

One noteworthy example of resource asymmetry is physical memory.  If the combined memory requirement of two processes exceeds the available physical memory, operating systems tend to drastically favor one process over another [23], in order to avoid page thrashing.  This is reasonable behavior, but it invalidates our key assumption for this important resource.

For the general case, we have no solution to the problem of resource asymmetry.  For resources with user-settable priority, the problem can be averted by lowering the resource priority of the low-importance process.

A second critical assumption is that the regulator has access to some progress metric for the low-importance process.  This means either that the application must export one or more progress metrics via a standard interface, or that the application can be modified to indicate its progress via a library call.

A malicious application might provide false progress information in order to avoid being regulated.  Our method assumes that progress information is correct and reasonably accurate, and it makes no attempt to detect or suspend malicious processes.

Beyond these assumptions, MS Manners has several limitations that follow from its premises.  Since it automatically calibrates its tuning parameters (see section 4.3), it requires significant periods of resource idleness.  If resources are continually busy, the calibrator cannot determine correct parameter values.  Even given significant idle periods, calibration can require many hours or even days if execution begins on a heavily loaded system.  If the calibration is started on a relatively idle system, this time can be shortened to a few minutes.  This limitation generally restricts progress-based regulation to processes that are long-running.

Since MS Manners is completely resource-independent, it does not discriminate between various classes of resources, such as those internal and external to a machine.  For example, a web crawler’s progress rate will degrade when the network is loaded, triggering MS Manners to suspend the process, which may not be as desired.  Solving this problem requires either making MS Manners resource-aware or modifying the application to adjust its progress metrics for external delays.  These fixes oppose our goals of resource independence and noninvasiveness.

Since MS Manners can regulate unmodified applications (see section 7.2), it may suspend a process that is holding a shared system resource, such as a lock or an exclusive file handle.  This resource will remain unavailable to any high-importance process that requires it, causing a priority inversion.  For applications that are modified to use MS Manners directly, suspension occurs only at well defined points in the application code, so this problem can be avoided by not holding a resource at such a point.  However, for unmodified applications, we have no solution.

4.     Architectural components

Figure 1. Application environment

Figure 1 illustrates the application environment of a standard operating system, in which multiple applications may run concurrently.  Some system resources may be used exclusively by a single application, whereas others may be shared by more than one application.  If a shared resource is limited, then use of that resource by one process will degrade the performance of other processes that use that same resource.

If a user explicitly designates one or more applications as having low importance, then the MS Manners control system prevents the process from using resources that are in use by any normal, high-importance application.  The control system monitors the progress of the low-importance application.  When the control system sees a drop in the rate of progress, it infers that the application is experiencing resource contention, so it suspends its execution.

Figure 2. System architecture

Figure 2 illustrates the main architectural components of MS Manners.  Periodically, a low-importance process provides an indication of its progress, through either a library call or a standard reporting interface.  A rate calculator combines this progress indication with temporal information from a system clock to determine the process’s progress rate.  This progress rate is used for two purposes:  First, it is fed into a target calibrator, which analyzes many progress rate measurements to determine a target rate for the process.  Second, the progress rate is fed into a rate comparator, which compares it against the target rate from the target calibrator.  The rate comparator judges whether the current progress rate is less than the target progress rate.  This judgment is used for two purposes:  First, it is fed into a suspension time calculator, which maintains a suspension time value; the calculator increases this value when the progress is judged to be below target, and it decreases this value when the progress is judged to be at or above target.  Second, the judgment is fed into an execution regulator; if the progress is below target, the regulator suspends the process for the suspension time.

The following sections describe these components in the abstract.  For implementation details, see section 7.

4.1     Core components

MS Manners’ core components are measuring the application’s rate of progress, comparing this rate against a target rate, and suspending the process when the rate falls below target.  This section describes these components in their simplest form; more sophisticated versions are described in subsequent sections.

Periodically, at times known as testpoints, the control system acquires metrics of the application’s progress.  (For the moment, we ignore the mechanism by which progress metrics are conveyed from the application to the control system.  Two such mechanisms are described in section 7.)  These progress metrics can be expressed in virtually any unit that is meaningful to and easily tracked by the application.  For example, a file compressor might indicate quantity of data compressed, whereas a content indexer might indicate the number of directory entries scanned.  Section 5 discusses properties of good progress metrics.

Testpoints should be made fairly frequently, at least once per few hundred milliseconds, so the process can be suspended promptly when necessary.  At each testpoint, MS Manners calculates the elapsed time and the progress made since the previous testpoint.  It then calculates the progress rate as the ratio of these two values.

MS Manners compares this progress rate to a target progress rate.  The target rate is the progress rate expected when the application is not contending for any resources (see section 4.3).  If the actual progress rate is at least as good as the target, MS Manners judges the progress rate to be good; otherwise, it judges it to be poor.

If the progress rate is good, the control system allows the process to continue immediately.  If the progress rate is poor, the control system suspends the process for a period of time before allowing it to continue.  The execution is not stopped entirely, or else there would be no way to determine when it is okay to continue.

The time a process is suspended depends on how many successive testpoints indicate poor progress.  On each testpoint that indicates poor progress, the suspension time is doubled, up to a set limit.  Once a testpoint indicates good progress, the process is allowed to continue, and the suspension time is restored to its initial value.

The exponential increase makes the low-importance process adjust to the time scale of other processes’ execution patterns.  Following short periods of activity by a high-importance process, the low-importance process will resume promptly, but during long periods of high-importance activity, the low-importance process makes only infrequent execution probes.  The limit on suspension time places a bound on the worst-case resumption time.

These components are necessary for progress-based regulation, but they are not always sufficient.  For example, if progress measurements are stochastic, directly comparing them to the target rate may yield an incorrect judgment of the progress rate.  Also, these components do not include a method for determining a target progress rate.  The rate calculation cannot cope with an application whose progress is naturally measured along two or more dimensions.  Finally, if multiple low-importance threads execute at the same time, they can interfere with each other’s progress measurements if they use any common resources.  The following sections describe additional components that deal with each of these complications.

4.2     Statistical rate comparison

Progress rate can fluctuate due to several factors, such as variable I/O timing [29], coarse progress measures, and clock granularity.  If the control system directly compares progress rate to target rate, it may frequently make incorrect progress-rate judgments, causing inappropriate suspension or execution of the process.

MS Manners copes with noisy measurements by using a statistical rate comparator.  Rather than making an immediate judgment about the progress rate, the comparator continues to collect progress-rate measurements until it has enough data to confidently make a judgment.

The comparator feeds each progress-rate measurement into a statistical hypothesis test (see section 6.1).  The test determines whether the progress rate is below the target rate, whether it is at or above the target rate, or whether there is not enough data to make such a judgment.  In the latter case, the process is allowed to continue until its next testpoint, but the current value of the suspension time is preserved.  In this manner, the process is repeatedly allowed to continue, and the progress rate is repeatedly measured, until the hypothesis test determines that there is enough data to make a judgment.  At that point, a good judgment will reset the suspension time, or a poor judgment will double the suspension time and suspend the process.

This technique assumes that the variability in an application’s measured progress rate is not serially correlated.  For example, a disk-bound application may, even on an unloaded system, encounter some very lengthy seeks.  As long as these lengthy seeks are interspersed with short seeks, the statistical comparator will correctly recognize that the progress rate is good.  However, a correlated series of long seeks will inappropriately trigger suspension.

4.3     Automatic target calibration

Progress-based regulation requires a target progress rate for the regulated process.  Ideally, this target rate represents the expected progress rate when the process is not contending for resources.  This ideal target rate may change over time as properties of the resources change; for example, file fragmentation [24] may reduce the ideal target rate for a process that reads files.  Therefore, it is necessary to track changes in the ideal target rate over time.

To determine the ideal target rate, the process must run for a while without resource contention.  We could require the user of a low-importance application to perform a calibration procedure, during which no other process runs on the system.  However, this is a burden for the user, especially since the calibration would have to be re-run periodically to track changes in resource characteristics.

Instead, MS Manners automatically establishes a target rate as the exponential average of the measured progress rate at each testpoint (see section 6.2).  Clearly, this approach tracks changes over time, but it is not clear that it reflects uncontended progress.  The key insight is that the averaging procedure gives equal weight to each testpoint’s progress-rate measurement.  Since the process is usually suspended when the progress rate is poor, few testpoints reflect poor progress.  Since the process is usually executing when the progress rate is good, many testpoints reflect good progress.  Thus, the average tends to converge to the rate of good progress.

This procedure is self-perpetuating, but it requires some way to get started.  Our method begins by allowing the process to execute briefly with no true regulation.  During this time, the calibrator averages the progress rate measurements to bootstrap the calibration procedure.

If this initial execution is performed on a relatively idle system, the initial value for target rate will be close to the ideal target rate, so correct regulation will immediately commence.  However, if the process experiences resource contention during its initial execution, the target progress rate will be set too low.  Eventually, once the low-importance process executes without resource contention, the calibrator will increase the target rate, but in the mean time, the control system will not prevent the low-importance process from running when it should not.  To deal with this problem, MS Manners limits the maximum execution rate for a probationary period, reducing the impact on other processes.

A major weakness of this approach is that there may never be a significant length of time when resources are uncontended, so the target rate may never be set correctly.  This weakness is inherent in any approach that does not require the user to establish an idle system.

This calibration procedure relies on the suspension behavior of a regulated process.  However, if a process becomes critical, it might choose to ignore regulation.  For example, a utility that archives old files might, when disk space becomes scarce, choose to execute even if it contends with a high-importance process.  When a process runs more aggressively than the regulator dictates, the calibrator subsamples the progress data, ignoring measurements from testpoints that would not have executed if the thread were following regulation strictly.

To preserve target values across restarts, calibration data is maintained persistently (see section 7.1).

4.4     Multiple progress metrics

The progress of some applications is not easily measured along a single dimension.  Some applications execute in sequential phases with a different type of progress in each phase.  For example, a garbage collector’s progress might be measured by its mark rate during its mark phase and by its sweep rate during its sweep phase.  Furthermore, some applications progress along multiple dimensions concurrently.  For example, a content indexer might measure progress in both bytes of content scanned and the count of indices added to its database.  Over a long term, separate metrics may be positively correlated, because one type of progress may be a precursor to another, as scanning is to indexing for a content indexer.  However, over a short term, separate metrics may be negatively correlated, because time spent progressing along one axis is time not spent progressing along another.  For such applications, there is no single scalar value that accurately reflects the progress rate.

Our method includes two ways of dealing with multiple progress metrics.  For applications that execute in discrete phases, any given testpoint will occur during some specific phase.  If the phase is known, our method compares the progress rate measured at each testpoint with a target rate specific to the phase in which the testpoint occurs.  For the garbage collector example, when a testpoint occurs during the mark phase, the measured mark rate is compared to a target mark rate, and when a testpoint occurs during the sweep phase, the measured sweep rate is compared to a target sweep rate.  Each target rate is calibrated separately.

If the number of testpoints in each phase is very small, the statistical rate comparator may not accrue sufficient data within a phase to make a judgment.  If the comparator were unable to combine measurements from separate phases, the control system would never be able to judge the progress rate, and the process could not be regulated.  However, the hypothesis test described in section 6.1 compares each sample against a separate target and combines these separate comparisons into a single judgment.  This enables the rate comparison to span multiple phases, permitting regulation even when the execution phases are very brief.

The second way of dealing with multiple progress metrics is used for applications that progress along multiple dimensions concurrently, or whose phases are not available to the control system.  To accommodate this situation, we must change the way progress rates are compared.  Section 4.1 stated that the control system calculates the progress rate from the measured progress and the measured duration since the previous testpoint, and it compares this calculated rate to a target rate.  The modification is to calculate a target duration based on the measured progress and the target rate, and to compare this calculated duration to the measured duration since the last testpoint.  For a single progress metric, these formulations produce equivalent results, but the latter allows an extension to support multiple progress metrics.

For multiple progress metrics, the control system calculates a target duration as follows:  Each progress metric is combined with its corresponding target rate to yield a target duration for the progress along that metric.  These separate target durations are added together to produce an overall target duration, and this is compared against the measured duration since the last testpoint.

As an example, consider a content indexer that scans data at a target rate of 750 kB/sec and adds indices to its database at a target rate of 120 indices/sec.  If a testpoint indicates that it scanned 60 kB of data and added 5 indices to its database in 120 milliseconds, then its target durations are 80 msec for scanning and 42 msec for indexing.  The sum of these durations, 122 milliseconds, is the overall target duration.  When the rate comparator compares this value against the actual duration of 120 milliseconds, it determines that the progress rate is good.

This method assumes that the time to make progress along multiple dimensions is equal to the sum of the times to make progress along each separate dimension.  This assumption can be rendered incorrect by overlapping operations.

The calibrator establishes a target rate for each progress metric.  For phased execution, this is straightforward, since each testpoint reports progress for a single metric.  For multiple concurrent progress, the calibrator uses linear regression to infer the contribution of each metric to the overall duration between testpoints (see section 6.3); in particular, it uses a technique called “ridge regression” that is not thwarted by even highly correlated metrics.  Rather than exponentially averaging the progress rate over time, the system exponentially averages the state information needed for the linear regression.

4.5     Multiple threads and processes

When multiple threads or processes run concurrently, every low-importance process should defer to any high-importance process.  If no high-importance process is contending for system resources, the low-importance processes should share the resources fairly.

If multiple low-importance processes or threads were to execute concurrently, they might contend with each other over resources, reducing each other’s progress rates and causing the control system to suspend them.  Mutually induced suspension with binary exponential back-off can lead to unfairness [20] or instability [5, 21], fully suspending the processes even on an idle system.

To address this problem, MS Manners allows only one low-importance process or thread to execute at a time.  If multiple low-importance processes or threads run concurrently, the control system multiplexes among them, allowing each to execute until its next testpoint before suspending it and executing another.  This “time-multiplex isolation” is somewhat inefficient, since it prevents low-importance processes from overlapping usage of different resources with each other.  However, since these processes are not critical, we consider this an acceptable cost.

Each thread is regulated separately.  There is one progress-rate comparator per thread, so threads that use different system resources will not be implicitly coupled.  For example, if only one disk on a computer is being used by a high-importance process, a low-importance thread that is using that disk will be suspended, but another low-importance thread using another disk may not be.

A subset of an application’s threads can be designated as low-importance.  Any unregulated threads in a process will not be time-multiplex isolated, so they will reduce the progress rate of the regulated threads if they contend for resources.  However, because this contention is always present, it will also reduce the target rate and thus not interfere with progress-based regulation.

5.     Progress metrics

In the abstract, progress-based regulation can be based on any unit of progress that is meaningful to the application being regulated.  However, since our method assumes that a drop in progress rate indicates resource contention, any progress metric that is used for regulation should have an approximately constant target rate over the life of the application.  For example, in a numerical solver, estimated solution accuracy is a poor metric, because its rate of change decreases as the solution converges.  A better metric for this example is the count of iterated solution steps, because its expected rate of change is constant, barring interference due to resource contention.

It is also important to use metrics that provide sufficient coverage of all progress that the application might make.  For example, consider a file archive utility that scans through files and only archives those older than a certain date.  It is not sufficient to regulate based on count of files scanned, because this rate will drop when scanning old files, since time will be consumed archiving them.  Similarly, it is not sufficient to regulate based on count of files archived, because this rate will drop when scanning new files, since time will be consumed scanning but not archiving them.

To help convey good choices of progress metrics for various applications, we present a representative but non-exhaustive list of low-importance applications that could profitably use progress-based regulation, along with a suggested set of progress metrics for each:

·         A file compressor might indicate the quantity of data it compresses.  This would account for resources consumed reading data, writing data, and compressing data.  It could also indicate the count of files it compresses, if the overhead in opening and closing a file is significant relative to reading, writing, and compressing.

·         A content indexer might indicate both the quantity of content it scans and the count of indices it adds to its database.

·         A file archive utility, as mentioned above, might indicate the count of files it scans and the count of files it archives; it should also indicate quantity of data it archives, since there is likely some resource cost per byte as well as some resource cost per file.

·         The SETI@home [2] program, which downloads and analyzes radio telescope data, currently runs under a screen saver.  It could instead use progress-based regulation by indicating the quantity of data it transfers and the number of computation steps it performs.

·         A backup system might indicate the quantity of data it uploads.  This would account for both disk and network resources.

·         A virus scanner might indicate the count of files and the quantity of data it scans.

·         A synchronization engine for a distributed file system, such as Coda [9], scans files and uploads copies of those that have been modified since a certain date.  It might indicate the count of files it scans, the count of files it uploads, and the quantity of data it uploads.

·         The disk defragmenter described in section 8 indicates the count of file blocks it moves and the count of move operations it performs.

·         The Single Instance Store Groveler, as described in section 8, finds and merges duplicate files.  It reports the count of read operations it performs and the quantity of data it reads.

6.     Mathematical details

The following sections provide mathematical details of the statistical hypothesis test used by the statistical comparator, the exponential averaging technique used by the automatic calibration procedure, and the linear regression technique used by the automatic calibration procedure for multiple progress metrics.

6.1     Statistical hypothesis test

The statistical comparator described in section 4.2 makes use of a statistical hypothesis test to determine whether the progress rate is below the target rate, whether it is at or above the target rate, or whether there is not enough data to make such a judgment.

The comparator uses a paired-sample sign test [4], which is a non-parametric hypothesis test.  A non-parametric test makes no assumptions about the distribution of the data and is therefore very robust.  The test depends on n, the sample size, and r, the count of sample rates that are below their corresponding target values (or the count of sample durations that are above their target values).  If r is greater than a threshold value that is a function of both n and a control parameter a, the progress rate is judged to be poor.  If r is less than a different threshold value that is a function of both n and another control parameter b, the progress rate is judged to be good.  If r falls between the two threshold values, the progress rate is indeterminate given the current data.

The control parameters a and b determine the sensitivity of the comparator.  The parameter a is the probability of making a type-I error, judging the progress rate to be poor when it is actually good.  The parameter b is the probability of making a type-II error, judging the progress rate to be good when it is actually poor.  Increasing a improves the system’s responsiveness, decreasing b improves the system’s efficacy, and increasing b relative to a improves the system’s efficiency, as follows:

Increasing a allows faster reaction to poor progress, because a is negatively related to m, the minimum number of samples for the sign test to recognize poor progress:

 

(1)

Decreasing b reduces the performance impact on high-importance processes, because b is – by definition – the probability that a marginally poor progress rate will be judged incorrectly to be good.

Increasing b relative to a improves the stability of process execution, because when progress is good, the process suspension state is a birth-death system that is isomorphic to a bulk service queue [10] of infinite group size with an arrival rate of a and a bulk service rate of b.  Thus, the steady-state probability that k judgments of poor progress have occurred since the last judgment of good progress is given by:

 

(2)

With a probability of a, the next judgment will yield poor progress.  m testpoints are required for such a judgment, and the suspension time will be 2k.  Thus, the mean steady-state fraction of time suspended is:

 

(3)

This system is unstable unless a < b.  Increasing b relative to a increases the duty cycle of the background process when its progress rate is good.

We have selected values of a = 0.05 and b = 0.2 for our experiments.  Theoretically, with a testpoint every few hundred milliseconds, these values yield a reaction time of a few seconds and a 1% performance degradation on the low-importance process.  Empirically (see section 9), these values demonstrate a prompt reaction to high-importance activity, a very stable suspension state, and a fairly low impact on a high-importance process.

6.2     Exponential averaging

The automatic calibration procedure described in section 4.3 uses exponential averaging to track changes in the target progress rate over time.  Each time a testpoint occurs, the duration d since the previous testpoint and the amount of progress Dp since the previous testpoint are used to update the target progress rate r according to the following rule:

 

r   ¬   x r  +  (1 – x) Dp / d

(4)

The value of x is determined by the following equation:

 

x  =  (n – 1) / n

(5)

where n is selected by its effect on the following values:

 

ts  =  n ´ expected time between testpoints

(6)

 

tl  =  n / m ´ maximum suspension time

(7)

ts is the time constant for smoothing out short-term variations in progress, so it should be large enough to maintain a steady target rate.  tl is the time constant for tracking long-term changes in the target progress rate, so it should be small enough to respond to changes in resource performance characteristics.

For our performance experiments, we have set n to 10,000.  Given our other parameters, this will smooth out short-term variation with a time constant of 20 – 30 minutes and track long-term changes with a time constant of 7 days.

6.3     Linear regression and averaging

The method for dealing with multiple progress metrics described in section 4.4 uses linear regression to infer the contributions of separate progress metrics to the duration between testpoints.  To track changes in target progress rates over time, it exponentially averages the state information needed for linear regression.

The multiple-metric calibration procedure determines target rate values rk for each progress metric k.  It assumes that the duration d since the last testpoint equals the sum of the times to make each type of progress, where each of these times is the inverse of the target progress rate rk times the measured progress Dpk:

 

(8)

The calibration procedure performs least-squares linear regression [4] on Equation 8 to estimate the regression coefficients 1/rk.  Since Equation 8 has a zero offset, the regression is constrained to have no bias term.  The minimum data needed to solve the regression are known as the “sufficient statistics,” which in this case are the matrix x and the vector y, defined as follows:

 

(9)

 

(10)

To track changes in resource performance characteristics over time, the calibrator exponentially averages these sufficient statistics.  A