POWER9: A Deep Dive into IBM's 14nm FinFET Microprocessor (2024)
Executive Summary
POWER9 is a family of superscalar, multithreading, multi-core microprocessors produced by IBM and based on the Power ISA. Announced in August 2016, POWER9-based processors are manufactured on a 14 nm FinFET process, in 12- and 24-core versions, for scale-out and scale-up applications. The POWER9 architecture is open for licensing and modification by OpenPOWER Foundation members. This article looks at the POWER9's design, architecture, performance, and market positioning.
The POWER9 design is modular, so that it can be used in more processor variants and licensed for manufacture on fabrication processes other than IBM's. On-chip are co-processors for compression and cryptography, as well as a large, low-latency eDRAM L3 cache. The POWER9 also introduces a new interrupt controller architecture, the "eXternal Interrupt Virtualization Engine" (XIVE), which replaces the much simpler architecture used in POWER4 through POWER8.
Summit, ranked the ninth fastest supercomputer in the world, is based on POWER9 and uses Nvidia Tesla GPUs as accelerators. The POWER9 is also used in the IBM Power System E980 server, which IBM positions as a foundation for private cloud infrastructure running large-scale, mission-critical enterprise applications.
Architecture & Design
The POWER9 core comes in two variants, a four-way multithreaded one called SMT4 and an eight-way one called SMT8. The SMT4- and SMT8-cores are similar, in that they consist of a number of so-called slices fed by common schedulers. A slice is a rudimentary 64-bit single-threaded processing core with load store unit (LSU), integer unit (ALU) and a vector scalar unit (VSU, doing SIMD and floating point). A super-slice is the combination of two slices.
An SMT4-core consists of a 32 KiB L1 instruction cache, a 32 KiB L1 data cache, an instruction fetch unit (IFU) and an instruction sequencing unit (ISU), which together feed two super-slices. An SMT8-core has two sets of L1 caches, IFUs and ISUs to feed four super-slices. As a result, the 12-core and 24-core versions of POWER9 contain the same number of slices and the same amount of L1 cache: 12 SMT8-cores with 8 slices each and 24 SMT4-cores with 4 slices each both come to 96 slices.
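The slice and cache arithmetic above can be checked with a small model. This is only an illustrative sketch: the class and function names are made up for the example, and it assumes each set of L1 caches is one 32 KiB instruction cache plus one 32 KiB data cache.

```python
from dataclasses import dataclass

SLICES_PER_SUPER_SLICE = 2       # a super-slice combines two slices
L1_KIB_PER_CACHE_SET = 32 + 32   # assumed split: 32 KiB instruction + 32 KiB data


@dataclass(frozen=True)
class CoreVariant:
    """Hypothetical description of a POWER9 core variant."""
    name: str
    cache_sets: int    # sets of L1 caches, each with its own IFU and ISU
    super_slices: int

    @property
    def slices(self) -> int:
        return self.super_slices * SLICES_PER_SUPER_SLICE

    @property
    def l1_kib(self) -> int:
        return self.cache_sets * L1_KIB_PER_CACHE_SET


SMT4 = CoreVariant("SMT4", cache_sets=1, super_slices=2)
SMT8 = CoreVariant("SMT8", cache_sets=2, super_slices=4)


def chip_totals(variant: CoreVariant, cores: int) -> dict:
    """Total slices and total L1 cache (KiB) for a chip built from `cores` cores."""
    return {"slices": variant.slices * cores, "l1_kib": variant.l1_kib * cores}


if __name__ == "__main__":
    print("24 x SMT4:", chip_totals(SMT4, 24))
    print("12 x SMT8:", chip_totals(SMT8, 12))
```

Both configurations produce the same totals, which is the point of the comparison: the two chip versions differ in how slices are grouped into cores, not in how many slices they have.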
A POWER9 core, whether SMT4 or SMT8, has a 12-stage pipeline (five stages shorter than its predecessor, the POWER8), but aims to retain a clock frequency of around 4 GHz. It is the first processor to incorporate elements of Power ISA v3.0, which was released in December 2015, including the VSX-3 instructions.
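To put the shorter pipeline in terms of time, the sketch below converts stage counts to nanoseconds at the roughly 4 GHz clock mentioned above. It is a deliberately simplified model: it assumes every stage takes exactly one clock cycle and ignores stalls, and the POWER8 stage count is derived from the "five stages shorter" statement rather than taken from a datasheet. The function names are illustrative.

```python
POWER9_STAGES = 12
POWER8_STAGES = POWER9_STAGES + 5   # derived from "five stages shorter"
CLOCK_GHZ = 4.0                     # approximate clock from the text


def cycle_time_ns(clock_ghz: float) -> float:
    """Duration of one clock cycle in nanoseconds."""
    return 1.0 / clock_ghz


def pipeline_traversal_ns(stages: int, clock_ghz: float) -> float:
    """Time for one instruction to pass through every stage,
    assuming one cycle per stage and no stalls (a simplification)."""
    return stages * cycle_time_ns(clock_ghz)


if __name__ == "__main__":
    for name, stages in (("POWER9", POWER9_STAGES), ("POWER8", POWER8_STAGES)):
        ns = pipeline_traversal_ns(stages, CLOCK_GHZ)
        print(f"{name}: {stages} stages, {ns:.2f} ns at {CLOCK_GHZ} GHz")
```

Under these assumptions a shorter pipeline at the same clock means less time to refill after a flush, which is one common motivation for reducing stage count.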
| Core Variant | Number of Slices | Number of Super-Slices | L1 Cache (instruction + data) |
|---|---|---|---|
| SMT4 | 4 | 2 | 32 KiB + 32 KiB |
| SMT8 | 8 | 4 | 2 × (32 KiB + 32 KiB) |
Performance & Thermal
The POWER9 is designed to combine high performance with low power consumption. The 12-core and 24-core versions have a thermal design power (TDP) of around 190 W and 240 W, respectively.
The POWER9 has been deployed in high-performance computing, notably the Summit supercomputer, and in enterprise servers such as the IBM Power System E980, which is designed for large-scale, mission-critical applications.
The performance figures cited for the POWER9 are best-case ("up to") numbers: up to 4x the performance of the POWER8 at the same power consumption, and up to 2x the performance of the Intel Xeon E5-2699 v4 at the same power consumption.
| Processor | Number of Cores | TDP | Claimed Performance (at equal power) |
|---|---|---|---|
| POWER9 | 12 | 190 W | Up to 4x POWER8 |
| POWER9 | 24 | 240 W | Up to 2x Intel Xeon E5-2699 v4 |
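The TDP figures are easier to compare once divided by core and thread count. The sketch below does that arithmetic. It assumes the 12-core part uses SMT8 cores and the 24-core part uses SMT4 cores, and it treats TDP as a whole-chip budget, so the per-core numbers include caches, interconnect and I/O and are only a rough guide. All names are illustrative.

```python
# cores -> (approximate TDP in watts, hardware threads per core)
# Thread counts assume 12-core = SMT8 and 24-core = SMT4.
CONFIGS = {
    12: (190.0, 8),
    24: (240.0, 4),
}


def watts_per_core(cores: int) -> float:
    """Whole-chip TDP divided evenly across cores."""
    tdp_w, _ = CONFIGS[cores]
    return tdp_w / cores


def watts_per_thread(cores: int) -> float:
    """Whole-chip TDP divided evenly across hardware threads."""
    tdp_w, threads_per_core = CONFIGS[cores]
    return tdp_w / (cores * threads_per_core)


if __name__ == "__main__":
    for cores in CONFIGS:
        print(f"{cores}-core: {watts_per_core(cores):.2f} W/core, "
              f"{watts_per_thread(cores):.2f} W/thread")
```

Both versions expose 96 hardware threads under these assumptions, so the per-thread figure differs only because of the difference in TDP.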
Market Positioning
The POWER9 is positioned as a high-performance processor with low power consumption for large-scale, mission-critical applications, which makes it an attractive option for data centers and cloud computing.
The POWER9 competes with other high-performance server processors, such as the Intel Xeon and the AMD EPYC. Its distinguishing features include the XIVE interrupt controller and the VSX-3 instructions, alongside its performance and power characteristics.
The target buyers are enterprise data centers and cloud computing providers.
Specifications
Technical Specifications
| Specification | Detail |
|---|---|
| Process Node | 14 nm FinFET |
| Number of Cores | 12 or 24 |
| TDP | 190 W (12-core), 240 W (24-core) |
| Clock Speed | Up to 4 GHz |
| L1 Cache | 32 KiB instruction + 32 KiB data per SMT4 core; two such sets per SMT8 core |
| L2 Cache | 512 KiB |
| L3 Cache | 120 MB |
| Memory Support | DDR4 |
| Memory Bandwidth | Up to 120 GB/s |
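A few derived figures follow from the table. The sketch below divides the L3 capacity and memory bandwidth across cores and expresses the bandwidth in bytes per clock cycle. It assumes the L3 and bandwidth values are chip-wide totals shared evenly, that the clock runs at the 4 GHz maximum, and that GB means 10^9 bytes; none of these assumptions is stated in the table, and the names are illustrative.

```python
L3_MB = 120.0          # assumed chip-wide total
MEM_BW_GB_S = 120.0    # assumed chip-wide peak, GB = 10**9 bytes
CLOCK_GHZ = 4.0        # "up to" clock from the table


def share_per_core(total: float, cores: int) -> float:
    """Even split of a chip-wide resource across cores."""
    return total / cores


def bytes_per_cycle(bw_gb_s: float, clock_ghz: float) -> float:
    """Peak memory bytes available per clock cycle, chip-wide.
    GB/s divided by GHz cancels the 10**9 factors."""
    return bw_gb_s / clock_ghz


if __name__ == "__main__":
    for cores in (12, 24):
        print(f"{cores}-core: "
              f"{share_per_core(L3_MB, cores):.1f} MB L3 per core, "
              f"{share_per_core(MEM_BW_GB_S, cores):.1f} GB/s per core")
    print(f"{bytes_per_cycle(MEM_BW_GB_S, CLOCK_GHZ):.1f} bytes per cycle, chip-wide")
```

Real workloads rarely reach the peak bandwidth, so these values are best read as ceilings.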
Frequently Asked Questions
What is the process node used in the POWER9 processor?
The POWER9 processor is manufactured using a 14nm FinFET process.
How many cores does the POWER9 processor have?
The POWER9 processor is available in 12-core and 24-core versions.
What is the clock speed of the POWER9 processor?
The POWER9 processor has a clock speed of up to 4 GHz.
What is the TDP of the POWER9 processor?
The TDP of the POWER9 processor is around 190 W for the 12-core version and around 240 W for the 24-core version.
What is the L1 cache size of the POWER9 processor?
Each SMT4 core has a 32 KiB L1 instruction cache and a 32 KiB L1 data cache. An SMT8 core has two sets of these L1 caches.