previous | contents | next

Chapter 22 ½ The C.mmp/Hydra Project: An Architectural Overview 353

memory or register location. Since the registers of Dmap are given addresses in the control-register page, they are always addressable by Hydra.

As illustrated in Fig. 3, the Dmap intercepts the 18-bit UNIBUS addresses (16-bit words plus the two space bits) and converts them in the following manner: the three high-order bits of the 16-bit word select a register from the bank specified by the space bits. The contents of the register provide a 12-bit page-frame number; the remaining 13 bits from the address word are the displacement within that page. The two are concatenated to form the 25-bit shared memory address. The 13-bit displacement gives an 8-Kbyte page size. This transparent mapping is performed for all shared memory accesses. In addition to the 12 page-frame bits, there are 4 bits in each relocation register used for control. The first three are designated no page loaded, write-protected, and written-into, and the fourth bit controls whether values from the page may be stored in Mcache.

After the 25-bit address is generated, Mcache is checked to see if the data are already available. If the access is a read cycle and the datum is in Mcache, the datum is immediately returned, bypassing shared memory. Although Mcache is a write-through design, only read-only data are cached, because the cache/Pc PMS structure allows multiple, and possibly different, copies of a datum. However, since approximately 70 percent of the memory accesses are to code rather than data, the read-only requirement is expected to produce a high "hit" rate for pure code programs [Fuller and Harbison, 1978].

The internal characteristics of Mcache are:
 
Capacity 

Block size 

Set size 

2 Kbyte 

2 bytes (one word stored or returned per access) 

1

Although only a single prototype of Mcache was built, it estimated that it would save 50 percent of the time for a read cycle if the data were in the cache. However, it is important to realize that the motivation for including a cache on each processor was reduce memory contention rather than directly provide fast memory. The effects of not having the caches will be discussed ii Sec. 3.1.4 of this chapter.

Parity for both data and the 25-bit address is generated by the Dmap interface to the bus from the switch. The address parity checked at the switch interface. If the check fails, the request aborted and the processor interrupted. Data parity is not checked until the data are read from memory and returned to a Dmap. The fact that data parity is checked only by Dmap and not at any other point either in the cross-point switch or in the memory modules themselves has probably contributed to the reliability problems due to parity errors (see Sec. 3.3.1). Separate parity bits art maintained for both bytes of the word: one byte is given odd parity, the other, even. This detects words of all 1s or all 0s, both of which are common results of transient timing errors.

1.2.2 The Cross-Point Switch: Smp Smp routes the 25-bit address request to the memory port specified by the high-order four bits of the address. A port is requested by setting the bit corresponding to the requesting Pc in the port's request register. Contention for the port is resolved by periodically gating the request register into a second register, the queue register, which is left-shifted as the port becomes available. The shifting creates a priority-ordered queue: as a 1 bit is shifted out, the corresponding Pc is granted access to the port. Processor 15 is assigned the high-order bit and processor 0 the low-order bit, defining the priority. When the queue register is 0, all requests have been satisfied. The request register is again gated into the queue register and cleared, and a new cycle begins. A second request for the same port by a processor must enter via the request register; hence equality of service among the Pc's is maintained. The two-level request mechanism obscures the internal queue's priority ordering to the point that it is of virtually no importance outside the switch, preserving the symmetry of Smp.1 The switch's maximum concurrency (16 independent paths) is achieved if all Pc's request different ports.
 
 

1In the worst case, in which all Pc's repeatedly access the same port, the lowest-priority Pc suffers a 50 percent memory access time degradation, but since this situation is extremely rare in practice, the effect is negligible [McGehearty, 1980].
 
 

previous | contents | next