AMD Instinct MI325X vs NVIDIA GB200

AMD Instinct MI325X

NVIDIA GB200

19456 Shaders 288GB HBM3e 2100MHz	2x 18944 Shaders 384GB (2x 192GB) HBM3e 2112MHz
Peak AI Performance 5.23 POPS INT4 Tensor Sparse	Peak AI Performance 40 PFLOPS FP4 Tensor Sparse
FP32 163.43 TFLOPS	FP32 160.04 TFLOPS
FP16 163.43 TFLOPS	FP16 320.08 TFLOPS
Form Factor OAM Module -	Form Factor Superchip -
TDP 560W	TDP 2700W

Benchmarks

Geekbench 6

GB6 OpenCL N/A 0%	GB6 OpenCL N/A 0%
GB6 Metal N/A 0%	GB6 Metal N/A 0%
GB6 Vulkan N/A 0%	GB6 Vulkan N/A 0%

Geekbench 5

GB5 OpenCL N/A 0%	GB5 OpenCL N/A 0%
GB5 CUDA N/A 0%	GB5 CUDA N/A 0%
GB5 Metal N/A 0%	GB5 Metal N/A 0%
GB5 Vulkan N/A 0%	GB5 Vulkan N/A 0%

OctaneBench

OCT 2020.1 N/A 0%	OCT 2020.1 N/A 0%
OCT Metal N/A 0%	OCT Metal N/A 0%

Tech Specs

Theoretical Performance

Peak AI Performance 5.23 POPS INT4 Tensor Sparse	Peak AI Performance 40 PFLOPS FP4 Tensor Sparse
- - -	FP4 20 PFLOPS Tensor 40 PFLOPS Tensor Sparse
FP8 - 2.61 PFLOPS Tensor (FP16 Accumulate) 5.23 PFLOPS Tensor (FP16 Accumulate) Sparse 2.61 PFLOPS Tensor (FP32 Accumulate) 5.23 PFLOPS Tensor (FP32 Accumulate) Sparse	FP8 - 10 PFLOPS Tensor (FP16 Accumulate) 20 PFLOPS Tensor (FP16 Accumulate) Sparse 10 PFLOPS Tensor (FP32 Accumulate) 20 PFLOPS Tensor (FP32 Accumulate) Sparse
FP16 163.43 TFLOPS 1.31 PFLOPS Tensor (FP16 Accumulate) 2.61 PFLOPS Tensor (FP16 Accumulate) Sparse 1.31 PFLOPS Tensor (FP32 Accumulate) 2.61 PFLOPS Tensor (FP32 Accumulate) Sparse	FP16 320.08 TFLOPS 5 PFLOPS Tensor (FP16 Accumulate) 10 PFLOPS Tensor (FP16 Accumulate) Sparse 5 PFLOPS Tensor (FP32 Accumulate) 10 PFLOPS Tensor (FP32 Accumulate) Sparse
FP32 163.43 TFLOPS 163.43 TFLOPS Tensor 326.86 TFLOPS Tensor Sparse	FP32 160.04 TFLOPS - -
FP64 81.72 TFLOPS 163.43 TFLOPS Tensor	FP64 80.02 TFLOPS 78.13 TFLOPS Tensor
BF16 - 1.31 PFLOPS Tensor 2.61 PFLOPS Tensor Sparse	BF16 320.08 TFLOPS 5 PFLOPS Tensor 10 PFLOPS Tensor Sparse
TF32 653.72 TFLOPS Tensor 1.31 PFLOPS Tensor Sparse	TF32 2.5 PFLOPS Tensor 5 PFLOPS Tensor Sparse
INT4 2.61 POPS Tensor 5.23 POPS Tensor Sparse	- - -
INT8 - 2.61 POPS Tensor 5.23 POPS Tensor Sparse	INT8 - 10 POPS Tensor 20 POPS Tensor Sparse
- -	INT32 160.04 TOPS
Pixel Fillrate -	Pixel Fillrate 67.584 GPixel/s
Texture Fillrate 2553.6 GTexel/s	Texture Fillrate 1250.304 GTexel/s
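
The vector and fill-rate figures above can be reproduced from the unit counts and clocks listed elsewhere in this comparison. The following is a minimal sketch: the FLOPs-per-shader-per-clock factors (4 and 2 for MI325X FP32 and FP64, 2 and 1 for GB200) are assumptions chosen because they match the table, and the function names are illustrative only.

```python
def peak_tflops(shaders, clock_mhz, flops_per_clock, gpus=1):
    """Peak vector throughput in TFLOPS: units x FLOPs per clock x clock."""
    return gpus * shaders * flops_per_clock * clock_mhz * 1e6 / 1e12


def fillrate_g_per_s(units, clock_mhz):
    """Fill rate in GTexel/s or GPixel/s: one texel or pixel per unit per clock."""
    return units * clock_mhz / 1000


# MI325X: 19456 shaders at 2100MHz.
# Assumed 4 FP32 and 2 FP64 FLOPs per shader per clock (not stated in the table).
mi325x_fp32 = peak_tflops(19456, 2100, flops_per_clock=4)
mi325x_fp64 = peak_tflops(19456, 2100, flops_per_clock=2)

# GB200: 2x 18944 shaders at 2112MHz.
# Assumed 2 FP32 (one fused multiply-add) and 1 FP64 FLOP per shader per clock.
gb200_fp32 = peak_tflops(18944, 2112, flops_per_clock=2, gpus=2)
gb200_fp64 = peak_tflops(18944, 2112, flops_per_clock=1, gpus=2)

# Fill rates from TMU and ROP counts.
mi325x_texture = fillrate_g_per_s(1216, 2100)
gb200_texture = fillrate_g_per_s(592, 2112)
gb200_pixel = fillrate_g_per_s(32, 2112)

for name, value in [
    ("MI325X FP32 TFLOPS", mi325x_fp32),
    ("MI325X FP64 TFLOPS", mi325x_fp64),
    ("GB200 FP32 TFLOPS", gb200_fp32),
    ("GB200 FP64 TFLOPS", gb200_fp64),
    ("MI325X GTexel/s", mi325x_texture),
    ("GB200 GTexel/s", gb200_texture),
    ("GB200 GPixel/s", gb200_pixel),
]:
    print(f"{name}: {value:.2f}")
```

Under these assumptions, the listed GB200 fill rates correspond to a single GPU's 592 TMUs and 32 ROPs, while its FP32 and FP64 figures count both GPUs.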

Chip

Manufacturer AMD	Manufacturer NVIDIA
Chip Designer AMD	Chip Designer NVIDIA
Architecture CDNA 3	Architecture Blackwell
Family Instinct	Family Server
Codename Aqua Vanjaram Aqua Vanjaram XTX Variant Aqua Vanjaram XTX	Codename NV190 GB100 Variant Oberon
Market Segment Server	Market Segment Server
Release Date 6/2/2024	Release Date 3/18/2024

Fabrication

Foundry TSMC, TSMC (Active Interposer Die)	Foundry TSMC
Fabrication Node N5, N6 (Active Interposer Die)	Fabrication Node 4NP
Die Size 4x 370 mm² -	Die Size 4x 810 mm² -
Transistor Count 4x 36.5 Billion -	Transistor Count 4x 104 Billion -
Transistor Density 98.65M/mm² -	Transistor Density 128.40M/mm² -
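
The density row follows from the two rows above it. As a quick check, this sketch divides the per-die transistor count by the per-die area; the helper name is hypothetical.

```python
def density_m_per_mm2(transistors_billion, die_mm2):
    """Transistor density in millions of transistors per square millimetre."""
    return transistors_billion * 1000 / die_mm2


# Per-die figures taken from the Fabrication rows.
mi325x_density = density_m_per_mm2(36.5, 370)
gb200_density = density_m_per_mm2(104, 810)

print(f"MI325X: {mi325x_density:.2f}M/mm²")
print(f"GB200: {gb200_density:.2f}M/mm²")
```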

Form

Form Factor OAM Module	Form Factor Superchip

Core Configuration

Shading Units 19456 Shaders -	Shading Units 2x 18944 Shaders -
Texture Mapping Units 1216 TMUs	Texture Mapping Units 2x 592 TMUs
Render Output Units -	Render Output Units 2x 32 ROPs
Tensor Cores 1216 T-Cores	Tensor Cores 2x 592 T-Cores
- -	Streaming Multiprocessors 2x 148 SMs
Compute Units 304 CUs	- -

Clock Speeds

1000MHz Base 2100MHz	2062MHz Tensor 2112MHz

Cache

L1 - - 16KB/CU -	L1 64KB/SM Tex 256KB/SM - -
L2 16MB Shared	L2 64MB Shared
L3 256MB Shared -	- - -

Memory

288GB HBM3e ECC	384GB (2x 192GB) HBM3e -
Bus Width 8192Bit	Bus Width 8192Bit
Clock 2933MHz Transfer Rate 5.9GT/s Bandwidth 6006.8GB/s	Clock 3906MHz Transfer Rate 7.8GT/s Bandwidth 8000.1GB/s
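
The bandwidth figures can be approximated from bus width and memory clock. This is a minimal sketch that assumes two transfers per clock (double data rate), consistent with the 2933MHz and 5.9GT/s pair in the row above; the function name is illustrative.

```python
def bandwidth_gb_per_s(bus_width_bits, clock_mhz, transfers_per_clock=2):
    """Peak memory bandwidth in GB/s: bytes per transfer x transfers per second."""
    bytes_per_transfer = bus_width_bits / 8
    mega_transfers_per_s = clock_mhz * transfers_per_clock
    return bytes_per_transfer * mega_transfers_per_s / 1000


# Bus width and clock taken from the Memory rows; DDR signalling is assumed.
mi325x_bw = bandwidth_gb_per_s(8192, 2933)
gb200_bw = bandwidth_gb_per_s(8192, 3906)

print(f"MI325X: {mi325x_bw:.1f} GB/s")
print(f"GB200: {gb200_bw:.1f} GB/s")
```

The MI325X result matches the listed 6006.8GB/s. The GB200 result comes out slightly under the listed 8000.1GB/s, which suggests the 3906MHz clock shown is a rounded value.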

Power & Thermals

TDP 560W	TDP 2700W

Ports

No Ports	No Ports

Video Output

Max Resolution Unknown	Max Resolution Unknown
Max Resolution Refresh Rate -	Max Resolution Refresh Rate -
Variable Refresh Rate - - -	Variable Refresh Rate G-Sync FreeSync -
Display Stream Compression (DSC) Not Supported	Display Stream Compression (DSC) Not Supported
Multi Monitor Support Unknown	Multi Monitor Support Unknown

Video Encoder

Model 2x VCN 2.6	No Encoders
Codec AVC (H.264) HEVC (H.265)	-

Video Decoder

No Decoders	Model 7x NVDEC 6
-	Codec MPEG-1 MPEG-2 MPEG-4 VC-1 VP8 VP9 AVC (H.264) HEVC (H.265) AV1

API Support

OpenCL 3.0	OpenCL 3.0
GFX 9.4	CUDA Compute Capability 10.0 PureVideo HD VP13 VDPAU Feature Set M

Card

-	Not a Card
PCIe Version 5.0 PCIe Lanes 16	PCIe Version 6.0 PCIe Lanes 16
Multi GPU Support Supported Type Infinity Fabric	Multi GPU Support Supported Type NVLink

Competitors

AMD Instinct MI325X vs AMD Instinct MI300X

AMD Instinct MI325X vs AMD Instinct MI350X

AMD Instinct MI325X vs AMD Instinct MI355X
