           Contesting Simulator

 

 

Architectural contesting is a redundant-execution technique for enhancing single-thread performance by enabling fine-grained regions of code to be executed on better-customized core configurations. See this paper [H.H. Najaf-abadi, E. Rotenberg, Computer Arch. News, 2007] for more details.

The simulator presented here supports simultaneous execution of the same code on two differently configured and differently clocked simulator instances. The instances model proportional clock periods through a lock that keeps them in step. They check the global clock in order (each core is assigned a number), and only one of the instances increments the global time. When the global time equals a multiple of a given core's clock period, that core is allowed to execute one iteration of its main simulator loop.
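The stepping rule above can be sketched in a few lines of C. This is only an illustration under simplifying assumptions: both cores live in one process and one loop, whereas the actual simulator runs separate instances synchronized through a lock. The names struct core and run_ticks are hypothetical and do not come from the simulator source.

```c
#include <assert.h>

#define NUM_CORES 2

/* Hypothetical per-core state; not taken from the simulator source. */
struct core {
    unsigned period;       /* clock period in hundredths of a nanosecond */
    unsigned long cycles;  /* main-loop iterations executed so far */
};

/*
 * Advance the global time one unit at a time for `ticks` units.
 * A core executes one iteration of its main loop whenever the
 * global time is a multiple of that core's clock period.
 * Cores are visited in order of their assigned number.
 */
static void run_ticks(struct core cores[NUM_CORES], unsigned long ticks)
{
    for (int id = 0; id < NUM_CORES; id++)
        assert(cores[id].period > 0);

    for (unsigned long now = 1; now <= ticks; now++) {
        for (int id = 0; id < NUM_CORES; id++) {
            if (now % cores[id].period == 0)
                cores[id].cycles++;  /* stands in for one simulator iteration */
        }
    }
}
```

For example, with periods of 33 and 50 (roughly 3 GHz and 2 GHz), 3300 units of global time give the first core 100 iterations and the second core 66.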

Each simulator instance writes its instruction results to a shared memory space that can be checked by the other simulator instance. Instruction results are tagged with the time the instruction result can be accessed.

If, on dispatch, a simulator instance finds that the instruction it is dispatching has already been committed by the other simulator instance and that the instruction's access time has passed (the global clock is greater than the access time), it commits the instruction directly by forwarding it to the commit stage, where it is committed within the available commit bandwidth. Store instructions are exempt from this process and flow through the pipeline as usual.

Four new input flags must be provided when executing the simulator: 1) num_prc sets the number of simulator instances being contested (currently only 2 cores can be used); 2) this_prc sets the core number assigned to the simulator instance at hand (this can be arbitrary: the order in which the instances are numbered does not affect the final results, but an order is needed for synchronization purposes); 3) ClockRatio sets the proportional clock period of the simulator instance at hand in hundredths of nanoseconds (e.g. for a simulator instance at 3 GHz, this parameter should be set to 33); 4) c2c:lat sets the core-to-core latency in nanoseconds.
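The ClockRatio value for a given core frequency follows from simple arithmetic: the period in nanoseconds is 1000 divided by the frequency in MHz, so the period in hundredths of a nanosecond is 100000 divided by the frequency in MHz. The helper below is a hypothetical convenience, not part of the simulator, and rounding to the nearest integer is an assumption; the 3 GHz example above only fixes the result for that one case.

```c
#include <assert.h>

/*
 * Hypothetical helper: convert a core frequency in MHz to a clock
 * period in hundredths of a nanosecond, rounded to the nearest integer.
 *
 *   period [ns]      = 1000 / mhz
 *   period [0.01 ns] = 100000 / mhz
 */
static unsigned clock_ratio_from_mhz(unsigned mhz)
{
    assert(mhz > 0);
    return (100000u + mhz / 2u) / mhz;  /* integer division with rounding */
}
```

With this helper, 3000 MHz maps to 33, which matches the example given for a 3 GHz instance, and 2000 MHz maps to 50.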

Note: this simulator is free software. It is based on the SimpleScalar 4.0 code. You can redistribute it and/or modify it under the terms of the GNU Lesser General Public License version 2.1 as published by the Free Software Foundation.

 

 

Download

                    Source code:    mase_cont.tgz

  

The material located at this site is not endorsed, sponsored or provided by or on behalf of North Carolina State University.