If you are still using Intel's Itanium processors, you'd better get your orders in soon. Intel has announced that it will make its final shipments of Itanium 9700 processors on July 29, 2021. The company says orders must be placed by January 30, 2020 (spotted by AnandTech).
The Itanium 9700 line of four- and eight-core processors represents the last vestiges of Intel's attempt to switch the world to an entirely new processor architecture: IA-64. Instead of being a 64-bit extension to IA-32 ("Intel Architecture-32", Intel's preferred name for x86-compatible designs), IA-64 was a brand-new design built around what Intel and HP called "explicitly parallel instruction computing" (EPIC).
The high-end processors of the late 1990s – both the RISC processors of the Unix world and Intel's IA-32 Pentium Pro – were becoming more and more complicated pieces of hardware. The instruction sets these processors used were essentially serial, describing a sequence of operations to be performed one after another. Executing instructions in this exact serial order limits performance (because each instruction must wait for its predecessor to complete) and, in fact, is often unnecessary.
Often there are instructions that do not depend on each other and can be executed simultaneously. Processors such as the Pentium Pro and DEC Alpha analyzed the instructions they were running and the dependencies between them, and used this information to execute instructions out of order. They extracted the parallelism between independent instructions, freeing themselves from the strictly serial order that the program code implies. These processors also performed speculative execution: an instruction that depends on the result of another instruction can still be executed if the processor can guess what the result of the first instruction will be. If the guess is correct, the speculative calculation is used; if the guess is wrong, the processor undoes the speculation and repeats the calculation with the correct value.
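The dependency analysis described above can be sketched in a few lines of Python. This is a toy model, not how any real processor is built: the `Instr` type, the register names, and the assumptions of unlimited functional units, one-cycle latency, and each register being written only once are all hypothetical simplifications chosen for illustration.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str    # register written by the instruction
    srcs: tuple  # registers read by the instruction


def dataflow_cycles(program):
    """Return the earliest cycle in which each instruction could issue.

    Toy assumptions: unlimited functional units, one-cycle latency,
    and only true (read-after-write) dependencies are considered.
    """
    ready_at = {}  # register -> first cycle in which its value is available
    issue = []
    for ins in program:
        # Inputs that no instruction in the program produces are ready at cycle 0.
        cycle = max((ready_at.get(r, 0) for r in ins.srcs), default=0)
        issue.append(cycle)
        ready_at[ins.dest] = cycle + 1
    return issue


program = [
    Instr("r1", ("a", "b")),    # r1 = a + b
    Instr("r2", ("c", "d")),    # r2 = c + d    (independent of r1)
    Instr("r3", ("r1", "r2")),  # r3 = r1 * r2  (needs both results)
    Instr("r4", ("e", "f")),    # r4 = e - f    (independent of everything)
]

issue = dataflow_cycles(program)
serial_cycles = len(program)      # one instruction per cycle, strictly in order
parallel_cycles = max(issue) + 1  # cycles needed when independent work overlaps
print(issue, serial_cycles, parallel_cycles)
```

In this sketch only the third instruction has to wait, so the four instructions fit into two cycles instead of four. A real out-of-order core does this bookkeeping in hardware, on the fly, and within a limited window of instructions.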
Considerable processor resources are dedicated to handling this machinery; the processor must still act "as if" it were executing instructions serially, one by one, in the exact order the program specifies. Instead of putting all this complexity in the processor, Intel's idea for IA-64 was to put it in the compiler. Let the compiler identify which instructions can be executed simultaneously, and let it explicitly tell the processor to execute those independent instructions in parallel. With this approach, the processor's transistors could be spent on things like cache and functional units – first-generation IA-64 processors could run six instructions in parallel, and current chips can run 12 instructions in parallel – instead of on all the machinery needed to handle speculative, out-of-order execution.
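To make the compiler's job concrete, here is a minimal sketch of static scheduling: a greedy pass that packs instructions into fixed-width groups that are meant to execute together. The function `pack_groups`, the instruction names, and the width of three are illustrative assumptions, not Intel's actual bundle format or any real compiler's algorithm, and the sketch assumes each register is written only once.

```python
def pack_groups(program, width=3):
    """Greedy static scheduling sketch.

    program: list of (name, dest, srcs) tuples.
    Each instruction goes into the earliest group that comes after all of
    its producers and still has a free slot.
    """
    groups = []    # groups[i] holds the names of instructions issued together
    group_of = {}  # dest register -> index of the group that produces it
    for name, dest, srcs in program:
        # An instruction must come at least one group after each producer.
        earliest = max((group_of[s] + 1 for s in srcs if s in group_of), default=0)
        g = earliest
        while g < len(groups) and len(groups[g]) >= width:
            g += 1  # that group is full, try the next one
        while len(groups) <= g:
            groups.append([])
        groups[g].append(name)
        group_of[dest] = g
    return groups


program = [
    ("load_a", "r1", ()),
    ("load_b", "r2", ()),
    ("add",    "r3", ("r1", "r2")),
    ("load_c", "r4", ()),
    ("load_d", "r5", ()),
    ("mul",    "r6", ("r3", "r4")),
]

groups = pack_groups(program)
for i, group in enumerate(groups):
    print(i, group)
```

The six instructions end up in three groups. The key point is that this grouping is decided once, at compile time, using only what the compiler can know in advance.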
Theory meets reality
It was a neat idea, and indeed for some workloads – particularly heavy-duty floating-point data processing – Itanium chips performed decently. But for common integer workloads, Intel discovered a problem that compiler developers had been warning the company about all along: it is really hard to figure out all these dependencies, and to know which things can be done in parallel, at compile time.
For example, loading a value from memory takes a variable amount of time. If the value is in the processor's cache, it can be very fast, fewer than 10 cycles. If it is in main memory, it may take a few hundred cycles to load. If it has been paged out to a hard disk, it may be billions of cycles before the value is actually available for the processor to use. An instruction that depends on this value can thus become ready for execution within a handful of nanoseconds, or a billion of them. When the processor is dynamically choosing which instructions to execute and when, it can handle that kind of variation. But with EPIC, the scheduling of instructions is fixed and static. The processor cannot carry on with other work while it waits for a value to be fetched from memory, and it cannot easily fetch values "in advance" so that they are available when they are actually needed.
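A toy timing model shows why variable load latency hurts a fixed schedule. Everything here is an assumption made for illustration: the function `ready_times`, the five-instruction program, the latencies of 3 cycles for a cache hit and 200 for a miss, an in-order machine that issues one instruction per cycle and stalls when an operand is missing, and a dynamic machine with unlimited issue width. The program order is the one a compiler might pick if it expected the load to hit in cache.

```python
def ready_times(program, latency, in_order):
    """Return a dict mapping each result to the cycle it becomes available.

    program: list of (dest, srcs, kind); latency: dict kind -> cycles.
    in_order=True models a fixed schedule: one instruction per cycle, and a
    stalled instruction blocks everything behind it.
    in_order=False models an idealised dynamic scheduler with no width limit.
    """
    ready = {}
    prev_issue = -1
    for dest, srcs, kind in program:
        operands_ready = max((ready[s] for s in srcs), default=0)
        if in_order:
            issue = max(prev_issue + 1, operands_ready)
        else:
            issue = operands_ready
        prev_issue = issue
        ready[dest] = issue + latency[kind]
    return ready


# Order chosen on the assumption that the load takes 3 cycles.
program = [
    ("x", (), "load"),
    ("a", (), "alu"),
    ("b", ("a",), "alu"),
    ("y", ("x",), "alu"),  # first use of the loaded value
    ("c", ("b",), "alu"),  # does not depend on the load at all
]

hit = {"load": 3, "alu": 1}
miss = {"load": 200, "alu": 1}

static_hit = ready_times(program, hit, in_order=True)
static_miss = ready_times(program, miss, in_order=True)
dynamic_hit = ready_times(program, hit, in_order=False)
dynamic_miss = ready_times(program, miss, in_order=False)
print(static_hit["c"], static_miss["c"], dynamic_hit["c"], dynamic_miss["c"])
```

With a cache hit the fixed schedule runs without a stall. With a miss, `c` is stuck behind the stalled `y` even though it never needed the loaded value, while in the dynamic model `c` finishes at the same time either way.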
This problem alone was probably insurmountable, at least for general-purpose computing. But Itanium faced challenges even in the fields where it showed some strength. The initial Itanium hardware included hardware-based IA-32 compatibility so that it could run existing x86 software, albeit slowly. For companies wanting to transition their software from 32-bit to 64-bit, this was not very satisfying. During such a transition, the ability to run mixed workloads (some 32-bit software, some 64-bit) is valuable. IA-64 did not really offer this transition path: it could run 64-bit software at native speed, but it was very poor at running 32-bit software, and the chips that were good at 32-bit software could not run IA-64 software at all.
Lacking the resources to create an all-new 64-bit architecture, AMD did something different: it developed the AMD64 architecture as an extension to x86 that supported 64-bit computing. AMD did not set out to fundamentally change how processors and compilers worked; AMD64 processors continued to use the same complex out-of-order hardware found on high-performance IA-32 chips (and still essential to high-performance processors today). Because AMD64 and IA-32 were so similar, the same hardware could easily be designed to handle both, and there was no performance penalty for running 32-bit software on 64-bit chips, so mixed and transitional workloads could run without hindrance.
This made AMD64 much more attractive to developers and companies. Intel scrambled to create its own 64-bit extension to IA-32, but Microsoft – which already supported IA-32, IA-64 and AMD64 – told the company that it was unwilling to support a second 64-bit extension to x86, leaving Intel with little choice but to adopt AMD64 itself. It did so (though with some incompatibilities) under the name Intel 64.
IA-64 left with nowhere to go
This shut Itanium out of most markets. AMD64 offered a transition path from IA-32, so it conquered the enterprise and quickly spread to the consumer space as well. Itanium still had a few tricks up its sleeve – Intel's most advanced reliability, availability, and serviceability (RAS) features were released on Itanium first, so if you needed a system that could tolerate serious problems such as memory and processor failures, Itanium was, for a time, the way to go. But for the most part, these features are now available on Xeon chips, eliminating even that advantage.
The proliferation of vector instruction sets – AMD64 made SSE2 mandatory, and Intel's AVX-512 adds substantial new features – also means that it is still possible, in some ways, to explicitly instruct the processor to perform operations in parallel, but in a much more restricted way. Instead of bundles of different instructions all intended to be executed simultaneously, the vector instruction sets apply the same instruction to multiple pieces of data simultaneously. This is not as rich and flexible as the EPIC idea, but it turns out to be good enough for many of the same workloads at which Itanium excelled.
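The difference is easy to see in a small model of a vector add. This is a pure-Python illustration rather than real SIMD code: the function `vector_add` and its `lanes` parameter are hypothetical, and the default of four lanes simply mirrors the four 32-bit values that fit in a 128-bit SSE2 register.

```python
def vector_add(a, b, lanes=4):
    """Model a SIMD add: each step applies one 'add' to `lanes` elements at once.

    Returns the element-wise sums and the number of vector steps taken.
    """
    assert len(a) == len(b)
    out = []
    steps = 0
    for i in range(0, len(a), lanes):
        # One vector instruction: the same operation on every lane.
        out.extend(x + y for x, y in zip(a[i:i + lanes], b[i:i + lanes]))
        steps += 1
    return out, steps


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10] * 8
sums, steps = vector_add(a, b)
print(sums, steps)
```

Eight additions take two vector steps here instead of eight scalar ones, but every lane has to perform the same operation, which is exactly the restriction the paragraph above describes.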
Currently the only vendor still selling Itanium machines is HPE (the company that emerged from the 2014 split of HP), in its Integrity Superdome line, which runs the HP-UX operating system. Superdome systems place a particular emphasis on RAS, which once made Itanium a good fit, but they can now be equipped with Xeon chips. Those, rather than Itanium, have a long-term future. HPE will support the systems until at least 2025, but with manufacturing ending in 2021, the machines will be living on borrowed time.