view 0__Papers/Hardware/QMod/QMod pipeline equations abstract.txt @ 108:e488b77f2015
Def of sync paper
| author | Sean Halle <seanhalle@yahoo.com> |
|---|---|
| date | Sun, 09 Nov 2014 09:11:20 -0800 |
We present research that gives designers of out-of-order processors an opportunity to gain intuition and potentially speed exploration of the design space. It does not so much solve a problem as provide insight into processor design and speed up the identification of fruitful avenues to explore during design.
The research consists of a set of equations that encode patterns found in all "typical" out-of-order pipelines. The equations take as input design choices, such as instruction window width and issue width, together with measured quantities, such as branch prediction effectiveness on a given application. From these they calculate the instructions per cycle delivered by the design. Note, however, that this requires an up-front "tuning" simulation, from which the measured values are gathered. Hence, the equations presented here do not replace simulation, and they do not predict the behavior of source code. Rather, they take a single simulation run as input, then predict the outcome of other simulations, were they to be run with different design choices. In this way, the equations speed up the design exploration process.
The equations can be used in a variety of ways: to find bugs in processor simulators, to gain intuition about relationships between design quantities without resorting to time-consuming simulations, and to quickly calculate the values that particular quantities must take in order to utilize a design with chosen parameters. We do not claim to solve any particular problem, but rather hope to aid general insight and to speed up design exploration.
To validate the equations, we compare them against simulations of corner-case architectures executing the SPEC reference suite. All major discrepancies encountered turned out to be the result of simulator bugs, which the comparison uncovered. We measure the maximum overall average difference between equations and simulator to be 0.35%, and the maximum single-benchmark difference to be 1.8%. Hence, once the "tuning" simulation has been completed, the calculated instructions per cycle (IPC) can be relied upon to match simulation for other design choices.
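The two summary figures above can be computed along the following lines. This is a minimal sketch under stated assumptions: it takes "overall average difference" to mean the mean absolute percentage difference across benchmarks for one architecture, and the function names, architecture names, benchmark names, and IPC numbers are all hypothetical placeholders, not data from the paper.

```python
def percent_diff(eq_ipc, sim_ipc):
    """Absolute equation-vs-simulator IPC difference, as a percent of the simulator value."""
    return abs(eq_ipc - sim_ipc) / sim_ipc * 100.0


def summarize(results):
    """results maps architecture -> benchmark -> (equation IPC, simulator IPC).

    Returns (largest per-architecture average difference,
             largest single-benchmark difference), both in percent.
    """
    worst_average = 0.0
    worst_single = 0.0
    for benchmarks in results.values():
        diffs = [percent_diff(eq, sim) for eq, sim in benchmarks.values()]
        worst_average = max(worst_average, sum(diffs) / len(diffs))
        worst_single = max(worst_single, max(diffs))
    return worst_average, worst_single


# Placeholder numbers for illustration only; they are not results from the paper.
example = {
    "arch_narrow": {"bench_a": (1.00, 1.00), "bench_b": (0.99, 1.00)},
    "arch_wide": {"bench_a": (2.02, 2.00), "bench_b": (2.00, 2.00)},
}
worst_average, worst_single = summarize(example)
print(f"max average difference: {worst_average:.2f}%")
print(f"max single-benchmark difference: {worst_single:.2f}%")
```

If the averages were instead taken over signed differences, errors of opposite sign would cancel and the reported average would be smaller; the absolute form used here is the more conservative assumption.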
The equations evaluate within a few microseconds, giving the answer for an entire SPEC application run, which otherwise requires seconds to run on hardware and minutes to run on a simulator. Millions of design points can therefore be evaluated via the equations in the time required to simulate just one. Because the single tuning simulation then dominates the total time, if finding the optimal point requires testing hundreds of thousands of design points, the equations provide an overall speedup on the order of hundreds of thousands.
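The arithmetic behind this speedup estimate can be sketched as follows. The specific timings (120 seconds per simulation, 3 microseconds per equation evaluation) are assumptions chosen to match the "minutes" and "few microseconds" stated above, and the function name is hypothetical.

```python
def exploration_speedup(n_points, sim_seconds, eq_seconds, tuning_runs=1):
    """Speedup of equation-based exploration over simulating every design point.

    The equation-based flow still pays for the up-front tuning simulation(s),
    then evaluates the equations once per design point.
    """
    simulate_everything = n_points * sim_seconds
    tune_then_evaluate = tuning_runs * sim_seconds + n_points * eq_seconds
    return simulate_everything / tune_then_evaluate


SIM_SECONDS = 120.0  # assumed: one simulated SPEC run takes about two minutes
EQ_SECONDS = 3e-6    # assumed: one equation evaluation takes about three microseconds

for n_points in (1_000, 100_000, 1_000_000):
    speedup = exploration_speedup(n_points, SIM_SECONDS, EQ_SECONDS)
    print(f"{n_points:>9} design points: speedup of about {speedup:,.0f}x")
```

Under these assumptions the speedup stays close to the number of design points explored, because the one tuning simulation costs far more than all of the equation evaluations combined.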
===========
The equations calculate the instructions per cycle that result from a combination of quantities, each of which represents a basic aspect of the pipeline. Some of the quantities are design decisions, such as the fetch width. Other quantities are measured outcomes, such as branch prediction accuracy and cache miss rate. All the measured quantities are statically determined for a given instruction trace and choice of structure, and remain constant regardless of the other design choices. This allows what-if exploration: various design choices can be combined to see what IPC each scenario yields.
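The what-if workflow described above might be organized as in the sketch below: the measured quantities are gathered once and held fixed, while the design choices are swept. Everything here is hypothetical. In particular, placeholder_ipc is a toy stand-in with invented penalty constants, NOT the equations of this paper; it only marks the place where the real equations would be called.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Measured:
    """Quantities gathered once from the tuning simulation (hypothetical fields)."""
    mispredicts_per_instr: float
    cache_misses_per_instr: float


@dataclass(frozen=True)
class Design:
    """Design choices varied during exploration (hypothetical fields)."""
    fetch_width: int
    issue_width: int


def placeholder_ipc(design, measured):
    """Toy stand-in model, NOT the paper's equations."""
    peak = min(design.fetch_width, design.issue_width)
    # Invented penalties: 10 cycles per mispredict, 20 cycles per cache miss.
    stall_cpi = (measured.mispredicts_per_instr * 10
                 + measured.cache_misses_per_instr * 20)
    return 1.0 / (1.0 / peak + stall_cpi)


def sweep(measured, fetch_widths, issue_widths, model=placeholder_ipc):
    """Evaluate every combination of design choices, best IPC first."""
    results = []
    for fetch, issue in product(fetch_widths, issue_widths):
        design = Design(fetch, issue)
        results.append((design, model(design, measured)))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Placeholder measurements; in practice these come from the tuning simulation.
measured = Measured(mispredicts_per_instr=0.005, cache_misses_per_instr=0.002)
for design, ipc in sweep(measured, [2, 4, 8], [2, 4, 8]):
    print(design, f"IPC = {ipc:.3f}")
```

Passing the model as a parameter keeps the sweep independent of any particular set of equations, so the real ones could be substituted without changing the exploration loop.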
Note, however, that this requires an up-front "tuning" simulation, from which the measured values are gathered. Hence, the equations presented here do not replace simulation, and they do not predict the behavior of source code. Rather, they take a single simulation run as input, then predict the outcome of other simulations, were they to be run with different design choices. In this way, the equations do speed up the design exploration process.
XXX First, an out-of-order (OoO) pipeline is abstractly modelled as a number of communicating blocks. Each block represents a function that serves a particular purpose in an OoO pipeline, such as decode, renaming, store-to-load forwarding, reordering, and so on. Each function, as we define it, is independent of implementation strategy, and so is common to all pipeline implementations that include it. These functions and their interactions are encoded in the equations presented. The main contribution of this paper is the equations themselves, but we also illustrate several ways they can be used to gain insight. Note that several parameters in the equations, such as branch prediction accuracy, are collected from simulation. Therefore, the equations do not directly predict performance given just code and pipeline design. However, the simulation inputs are generated once and characterize the code, so they are reused as the designer explores alternative choices, and the equations accurately predict the resulting change in performance.
