John D. Davis
Empowered by the virtually endless supply of transistors provided by today’s leading-edge silicon process technology, computer architects seem to have an infinite number of design possibilities at their disposal. At the same time, the diminishing returns of instruction-level parallelism (ILP) have pushed designers toward designs that exploit thread-level parallelism (TLP). As a result, new chip multiprocessor (CMP) designs connect multiple processors using novel memory system designs instead of building larger and more complex uniprocessors. Unfortunately, these CMPs cannot be easily explored and evaluated using existing software-only simulation tools. Historically, software simulation has been the vehicle of choice for studying computer architecture because of its flexibility and low cost. However, designers of software simulators must choose between high simulation performance and detailed hardware emulation. Building actual hardware, in contrast, provides high performance and accurate results, but it is very expensive and lacks the flexibility to explore multiple designs. These tradeoffs have impeded our ability to thoroughly explore and evaluate new computer architectures. This dissertation describes the architecture, implementation, and evaluation of a simple prototype. This proof of concept is implemented on a new hybrid hardware prototyping platform that integrates a variety of hardware components on a printed circuit board (PCB) to implement CMP or multiprocessor (MP) systems. The Flexible Architecture for Simulation and Testing (FAST) combines the flexibility of software simulation with the accuracy and speed of a hardware implementation, enabling computer architects to implement new multithreaded, multiprocessor, or CMP architectures for in-depth evaluation and software development.
This is accomplished by combining dedicated microprocessor chips and SRAMs with Field Programmable Gate Arrays (FPGAs), which lets FAST operate at hardware speeds while still emulating the latency and bandwidth characteristics of a modern CMP architecture. FAST provides the foundation for future TLP-focused computer architecture research and software development that is currently not possible or practical using software-only or hardware-only solutions.