Three weeks ago, Intel launched its Xeon Phi coprocessor card with much fanfare. We took the time to analyze some technical details of the chip that cast doubt on Intel’s claim of x86 compatibility. To be fair, Intel worded its press release very carefully and only made claims about the x86 programming model. Still, we believe it is worth pointing out that beyond the basic x86 instruction set and programming model, Xeon Phi is quite a different beast from current and past x86 CPUs.
To understand Xeon Phi, it helps to know its ancestry. The chip descends from Intel’s Larrabee project, and its individual cores are based on the P5 microarchitecture, which debuted in 1993. A slightly evolved version of that microarchitecture is used in Intel’s Atom line of CPUs and now also in the Many Integrated Core (MIC) architecture, in which Larrabee ultimately culminated. Despite the shared ancestry, the feature sets of Atom and Xeon Phi have little in common beyond the basic x86 instructions and the 64-bit extensions that were added to the microarchitecture.
On page 657 and following of the “Knights Corner Instruction Set Reference Manual”, which can be downloaded via a link in this forum post aimed at developers, Intel details which registers and instructions are not supported in the Knights Corner architecture. The list covers every instruction operating on MMX, XMM and YMM registers, which amounts to more or less all of the instruction set extensions introduced over the last 15 years or so: MMX, every iteration of SSE, and AVX.
The same manual also describes what is supported. This includes the basic x86 instruction set as well as the additions for Intel 64, which is Intel’s moniker for AMD64, the well-known 64-bit extension of x86. Knights Corner also supports the x87 FPU instructions, which have been part of the CPU itself since the arrival of the 486. In addition, Intel added a new set of 32 ZMM registers, each 512 bits wide, accompanied by a new vector instruction set that operates on them. These instructions work on vectors of 32-bit or 64-bit integer and floating point values, which makes a vector 16 or 8 elements wide, respectively. The gory details of these instructions are explained in the reference manual as well.
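The element counts follow directly from the register width. A minimal sketch of that arithmetic in Python, where the helper name `lanes` is purely illustrative and not taken from Intel’s manual:

```python
ZMM_BITS = 512  # width of one ZMM register, as stated above


def lanes(element_bits, register_bits=ZMM_BITS):
    """Return how many elements of the given width fit in one register."""
    if register_bits % element_bits != 0:
        raise ValueError("element width must divide the register width")
    return register_bits // element_bits


for bits in (32, 64):
    print(f"{bits}-bit elements: {lanes(bits)} per ZMM register")
```

For comparison, `lanes(32, 128)` gives 4 for a 128-bit XMM register as used by SSE, so a single ZMM register holds four times as many 32-bit values.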
As a consequence of these architectural changes, binary compatibility with existing x86 software is unlikely. At the very least, software has to be recompiled before it can run on Xeon Phi. Depending on how the software in question is implemented, it might also be necessary to put in some reengineering effort. To put this into perspective, Intel is throwing out over 15 years of x86 CPU innovations here. Handwritten SIMD code targeting MMX, SSE or AVX is basically worthless on Xeon Phi. This means that any HPC application relying on such SIMD optimizations will need portions of its code rewritten.
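To illustrate why existing binaries are unlikely to run unmodified, here is a minimal sketch in Python that models a binary as the set of instruction set extensions it uses and a processor as the set it implements. The feature labels, the example binaries and the helpers `missing_features` and `can_run` are hypothetical and chosen for illustration only; the Knights Corner set simply mirrors the supported and unsupported lists summarized above.

```python
# Extensions Knights Corner implements, as summarized from the manual above.
# "KNC_VECTOR" is a made-up label for the new 512-bit vector instructions.
KNIGHTS_CORNER = {"x86", "x86-64", "x87", "KNC_VECTOR"}


def missing_features(binary_needs, cpu_has):
    """Return the extensions a binary uses that the CPU does not implement."""
    return sorted(set(binary_needs) - set(cpu_has))


def can_run(binary_needs, cpu_has):
    """A binary can run only if every extension it uses is implemented."""
    return not missing_features(binary_needs, cpu_has)


# Two hypothetical binaries: one purely scalar, one with SIMD code paths.
scalar_app = {"x86", "x86-64", "x87"}
simd_app = {"x86", "x86-64", "SSE2", "AVX"}

print(can_run(scalar_app, KNIGHTS_CORNER))
print(missing_features(simd_app, KNIGHTS_CORNER))
```

In practice, even code without handwritten SIMD tends to fall into the second category: SSE2 is part of the AMD64 baseline, and compilers targeting 64-bit x86 typically use it for ordinary scalar floating point math, which is one reason a recompile is the minimum requirement.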