EPIC (explicitly parallel instruction computing) is a class of microprocessor architectures with explicit instruction-level parallelism. The term was introduced in 1997 by the HP and Intel alliance [1] during the development of the Intel Itanium architecture [2]. EPIC lets the processor execute instructions in parallel based on information supplied by the compiler, rather than using dedicated circuitry to detect at run time which instructions can run in parallel. In theory, this makes it easier to scale processor performance without raising the clock frequency.
Origins in VLIW
In 1989, researchers at Hewlett-Packard concluded that the number of instructions a RISC processor can execute in one clock cycle is limited. They began developing a new architecture, based on VLIW, that came to be called EPIC [2]. In a VLIW processor, one instruction (one instruction word) encodes several operations, which are executed simultaneously by different execution units.
The goals of EPIC development were:
- removing the instruction scheduler from the processor;
- increasing the number of instructions the processor can execute at the same time (instruction-level parallelism, ILP).
The instruction scheduler is a block of complex logic inside the processor that determines the order in which instructions are executed. Removing it frees die area for other units (for example, ALUs). The scheduler's functions are assigned to the compiler instead.
Instruction-level parallelism is increased by relying on the compiler's ability to find independent instructions.
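To make the compiler's role concrete, here is a minimal sketch of how independent instructions could be grouped statically. It is an illustration only, not the algorithm of any real EPIC compiler: the `Instr` type and the `group_independent` function are hypothetical names, and the sketch assumes that all reads within a group happen before any writes, so only read-after-write and write-after-write dependencies force a new group.

```python
from typing import List, NamedTuple, Set, Tuple


class Instr(NamedTuple):
    """Hypothetical three-address instruction: dest = op(srcs)."""
    dest: str
    srcs: Tuple[str, ...]


def group_independent(program: List[Instr]) -> List[List[Instr]]:
    """Greedy, in-order grouping of instructions that can issue together.

    A new group is started when an instruction reads or writes a register
    that was already written inside the current group.
    """
    groups: List[List[Instr]] = [[]]
    written: Set[str] = set()  # registers written by the current group
    for ins in program:
        reads_written = any(src in written for src in ins.srcs)
        writes_written = ins.dest in written
        if (reads_written or writes_written) and groups[-1]:
            groups.append([])
            written = set()
        groups[-1].append(ins)
        written.add(ins.dest)
    return groups


example = [
    Instr("r1", ("r2", "r3")),
    Instr("r4", ("r5", "r6")),   # independent of the first instruction
    Instr("r7", ("r1", "r4")),   # needs r1 and r4, so a new group starts
    Instr("r8", ("r2",)),        # independent of r7, joins the same group
]
for number, group in enumerate(group_independent(example)):
    print(number, [ins.dest for ins in group])
```

A real compiler would also reorder instructions and account for execution-unit types and latencies; the sketch only shows that the dependency check itself can be done ahead of time, without scheduling hardware.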
The VLIW architecture in its original form had several drawbacks that prevented its widespread adoption:
- VLIW instruction sets were not compatible between processor generations (a program compiled for a processor with more execution units, for example more ALUs, could not run on a processor with fewer);
- load latencies from the memory hierarchy (caches, DRAM) were not fully predictable, which made it hard to statically schedule the instructions that load data and those that use it.
VLIW Evolution
The EPIC architecture has the following features to address the shortcomings of VLIW:
- Each group of several instructions is called a bundle. A bundle may carry a stop bit indicating that the following group depends on the results of this one. This makes it possible to build future generations of the architecture that issue more bundles in parallel. The dependency information is computed by the compiler, so the hardware does not have to check operand independence itself.
- A software prefetch instruction is used to prefetch data. Prefetching increases the likelihood that the data will already be in the cache by the time the load instruction executes. The instruction can also carry hints selecting which cache levels the data should be placed in.
- A speculative load instruction loads data before it is known whether the data will be used (bypassing control dependencies) or whether it will be modified before use (bypassing data dependencies).
- A check load instruction supports speculative loads by checking whether the speculative load depended on a later store. If it did, the load must be repeated.
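The interaction between a speculative load, an intervening store, and the later check can be sketched with a toy model. Everything here is hypothetical and greatly simplified: the `ToyMachine` class and its method names are invented for illustration, and the tracking table is only loosely inspired by the advanced-load mechanism of Itanium, which the text above does not describe in detail.

```python
class ToyMachine:
    """Toy model of data speculation: load early, verify later."""

    def __init__(self, memory):
        self.memory = dict(memory)  # address -> value
        self.regs = {}              # register name -> value
        self.tracked = {}           # register name -> address loaded speculatively

    def speculative_load(self, reg, addr):
        """Load early and remember which address the register came from."""
        self.regs[reg] = self.memory[addr]
        self.tracked[reg] = addr

    def store(self, addr, value):
        """Write memory and drop tracking entries for the same address."""
        self.memory[addr] = value
        stale = [reg for reg, a in self.tracked.items() if a == addr]
        for reg in stale:
            del self.tracked[reg]

    def check_load(self, reg, addr):
        """Return True if the load had to be repeated, False otherwise."""
        if self.tracked.get(reg) == addr:
            return False            # no conflicting store: early value is valid
        self.regs[reg] = self.memory[addr]
        return True


machine = ToyMachine({0x10: 1, 0x20: 2})
machine.speculative_load("r1", 0x10)   # hoisted above the store below
machine.store(0x20, 9)                 # different address, no conflict
print(machine.check_load("r1", 0x10))  # False: speculation succeeded
```

The point of the sketch is the division of labor: the compiler decides to move the load above the store, and the check step makes that decision safe when the two addresses turn out to alias.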
The EPIC architecture also includes a grab-bag of concepts for increasing ILP (instruction-level parallelism):
- Predicated execution is used to reduce the number of branches and to increase the speculative execution of instructions. A conditional branch is converted into instructions that set predicate registers, and then both sides of the branch are executed. The results of the side that should not have executed are discarded according to the value of the predicate register.
- Deferred exceptions, implemented with a "not a thing" (NaT) bit in the general-purpose registers, allow speculative execution to continue past possible exceptions.
- A very large architectural register file avoids the need for register renaming.
- Multi-way branch instructions improve branch prediction by combining several alternative branches into a single bundle.
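As a rough illustration of converting a branch into predicated instructions, the sketch below computes an absolute difference twice: once with an ordinary branch and once in a style where both sides are evaluated and a predicate decides which result is kept. The function names and the `commit` helper are hypothetical, and the Python `if` inside `commit` merely stands in for hardware that suppresses a write when the predicate is false.

```python
def abs_diff_branchy(a: int, b: int) -> int:
    """Reference version with an ordinary conditional branch."""
    if a > b:
        return a - b
    return b - a


def commit(regs: dict, predicate: bool, dest: str, value: int) -> None:
    """Write `value` to `dest` only when the guarding predicate is true."""
    if predicate:
        regs[dest] = value


def abs_diff_predicated(a: int, b: int) -> int:
    """If-converted version: no branch between the two sides."""
    regs = {}
    p_taken = a > b          # the compare fills two complementary predicates
    p_not_taken = not p_taken
    # Both sides are computed; only the one whose predicate is set commits.
    commit(regs, p_taken, "r1", a - b)
    commit(regs, p_not_taken, "r1", b - a)
    return regs["r1"]


print(abs_diff_predicated(3, 8), abs_diff_branchy(3, 8))
```

The trade-off is visible even in this toy: the predicated version always does the work of both sides, which pays off only when the branch would have been hard to predict and the sides are short.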
The Itanium architecture also added a rotating register file [3], which simplifies software pipelining. With such a file, there is no need for manual loop unrolling or manual register renaming [4].
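The idea of register rotation can be sketched as follows. This is a simplified model under stated assumptions, not a description of the actual Itanium mechanism: `RotatingRegisterFile` and `pipelined_double` are hypothetical names, the file has only four registers, and plain `if` statements replace the predicates a real software-pipelined loop would use to fill and drain the pipeline.

```python
class RotatingRegisterFile:
    """Logical register n maps to a physical slot through a rotating base."""

    def __init__(self, size: int):
        self.size = size
        self.phys = [None] * size
        self.base = 0

    def _slot(self, logical: int) -> int:
        return (self.base + logical) % self.size

    def read(self, logical: int):
        return self.phys[self._slot(logical)]

    def write(self, logical: int, value) -> None:
        self.phys[self._slot(logical)] = value

    def rotate(self) -> None:
        """After rotation, the value in logical n is visible as logical n + 1."""
        self.base = (self.base - 1) % self.size


def pipelined_double(xs):
    """Two-stage pipelined loop: stage 1 loads, stage 2 doubles."""
    rf = RotatingRegisterFile(4)
    out = []
    for i in range(len(xs) + 1):          # one extra iteration drains stage 2
        if i < len(xs):
            rf.write(0, xs[i])            # stage 1 always writes logical 0
        if i >= 1:
            out.append(rf.read(1) * 2)    # stage 2 reads last iteration's load
        rf.rotate()
    return out


print(pipelined_double([1, 2, 3]))
```

The loop body names the same logical registers in every iteration, yet each iteration's load lands in a different physical slot, which is what removes the need to unroll the loop and rename registers by hand.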
Other developments and studies
There has also been research on EPIC architectures that was not directly tied to the development of Itanium.
- The IMPACT project at the University of Illinois at Urbana-Champaign, led by Wen-mei Hwu, greatly influenced later research.
- The PlayDoh architecture from HP Labs.
- The Gelato Federation, a community developing more efficient compilers for Linux on Itanium servers.
See also
- Complex instruction set computer (CISC)
- Reduced instruction set computer (RISC)
- Very long instruction word (VLIW)
- Elbrus - Russian processor
- IA-64
- Superscalar
Notes
- ↑ Schlansker and Rau. EPIC: An Architecture for Instruction-Level Parallel Processors (PDF). HP Laboratories Palo Alto, HPL-1999-111 (February 2000). Retrieved May 8, 2008. Archived April 27, 2012.
- ↑ 1 2 Inventing Itanium: How HP Labs Helped Create the Next-Generation Chip Architecture. HP Labs (June 2001). Retrieved December 14, 2007. Archived April 27, 2012.
- ↑ Modern server processors. Part 2. Intel Itanium, HP PA8700, Alpha. Archived January 12, 2012.
- ↑ De Gelas, Johan. Itanium – Is there a light? AnandTech (November 9, 2005). Retrieved May 8, 2008. Archived April 27, 2012.
Links
- Historical background for EPIC
- Mark Smotherman (2002) " Understanding EPIC Architectures and Implementations "