AMD Instinct MI300A vs NVIDIA H100

AMD Instinct MI300A

NVIDIA H100

14592 Shaders 128GB HBM3 2100MHz	14592 Shaders 80GB HBM2e 1755MHz
Peak AI Performance 3.92 POPS INT4 Tensor Sparse	Peak AI Performance 3.03 POPS INT8 Tensor Sparse
FP32 122.57 TFLOPS	FP32 51.22 TFLOPS
FP16 122.57 TFLOPS	FP16 102.44 TFLOPS
Form Factor APU SH5 Socket	Form Factor PCIe Card, 2.0 Slots
TDP 560W	TDP 350W
Power Connectors -	Power Connectors 1x 16-Pin 12VHPWR

Benchmarks

Geekbench 6

GB6 OpenCL N/A	GB6 OpenCL 287,370 74%
GB6 Metal N/A	GB6 Metal N/A
GB6 Vulkan N/A	GB6 Vulkan 284,775 75%

Geekbench 5

GB5 OpenCL N/A	GB5 OpenCL N/A
GB5 CUDA N/A	GB5 CUDA N/A
GB5 Metal N/A	GB5 Metal N/A
GB5 Vulkan N/A	GB5 Vulkan N/A

OctaneBench

OCT 2020.1 N/A	OCT 2020.1 N/A
OCT Metal N/A	OCT Metal N/A

Tech Specs

Theoretical Performance

Peak AI Performance 3.92 POPS INT4 Tensor Sparse	Peak AI Performance 3.03 POPS INT8 Tensor Sparse
FP8 1.96 PFLOPS Tensor (FP16 Accumulate), 3.92 PFLOPS Tensor (FP16 Accumulate) Sparse, 1.96 PFLOPS Tensor (FP32 Accumulate), 3.92 PFLOPS Tensor (FP32 Accumulate) Sparse	FP8 1.51 PFLOPS Tensor (FP16 Accumulate), 3.03 PFLOPS Tensor (FP16 Accumulate) Sparse, 1.51 PFLOPS Tensor (FP32 Accumulate), 3.03 PFLOPS Tensor (FP32 Accumulate) Sparse
FP16 122.57 TFLOPS, 980.58 TFLOPS Tensor (FP16 Accumulate), 1.96 PFLOPS Tensor (FP16 Accumulate) Sparse, 980.58 TFLOPS Tensor (FP32 Accumulate), 1.96 PFLOPS Tensor (FP32 Accumulate) Sparse	FP16 102.44 TFLOPS, 756.45 TFLOPS Tensor (FP16 Accumulate), 1.51 PFLOPS Tensor (FP16 Accumulate) Sparse, 756.45 TFLOPS Tensor (FP32 Accumulate), 1.51 PFLOPS Tensor (FP32 Accumulate) Sparse
FP32 122.57 TFLOPS, 122.57 TFLOPS Tensor, 245.15 TFLOPS Tensor Sparse	FP32 51.22 TFLOPS
FP64 61.29 TFLOPS, 122.57 TFLOPS Tensor	FP64 25.61 TFLOPS, 47.28 TFLOPS Tensor
BF16 980.58 TFLOPS Tensor, 1.96 PFLOPS Tensor Sparse	BF16 102.44 TFLOPS, 756.45 TFLOPS Tensor, 1.51 PFLOPS Tensor Sparse
TF32 490.29 TFLOPS Tensor, 980.58 TFLOPS Tensor Sparse	TF32 378.23 TFLOPS Tensor, 756.45 TFLOPS Tensor Sparse
INT4 1.96 POPS Tensor, 3.92 POPS Tensor Sparse	INT4 -
INT8 1.96 POPS Tensor, 3.92 POPS Tensor Sparse	INT8 1.51 POPS Tensor, 3.03 POPS Tensor Sparse
INT32 -	INT32 25.61 TOPS
Pixel Fillrate -	Pixel Fillrate 42.12 GPixel/s
Texture Fillrate 1915.2 GTexel/s	Texture Fillrate 800.28 GTexel/s
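
Several of the theoretical figures above can be reproduced from the unit counts and clocks listed elsewhere on this page. The sketch below is only a cross-check, not the method the page's authors necessarily used. It assumes that fill rate equals units times clock, and that each shader retires one fused multiply-add (2 FLOPs) per clock; the function names are hypothetical.

```python
def fill_rate_g_per_s(units: int, clock_mhz: float) -> float:
    """Fill rate in giga-operations per second: units (TMUs or ROPs) times clock."""
    return units * clock_mhz / 1000.0


def peak_tflops(shaders: int, clock_mhz: float, flops_per_clock: int = 2) -> float:
    """Peak TFLOPS, assuming each shader does `flops_per_clock` FLOPs per cycle."""
    return shaders * flops_per_clock * clock_mhz / 1e6


# Inputs are the TMU, ROP, shader and clock values listed on this page.
mi300a_texture = fill_rate_g_per_s(912, 2100)  # listed: 1915.2 GTexel/s
h100_texture = fill_rate_g_per_s(456, 1755)    # listed: 800.28 GTexel/s
h100_pixel = fill_rate_g_per_s(24, 1755)       # listed: 42.12 GPixel/s
h100_fp32 = peak_tflops(14592, 1755)           # listed: 51.22 TFLOPS

print(f"MI300A texture fill rate: {mi300a_texture:.1f} GTexel/s")
print(f"H100 texture fill rate:   {h100_texture:.2f} GTexel/s")
print(f"H100 pixel fill rate:     {h100_pixel:.2f} GPixel/s")
print(f"H100 FP32:                {h100_fp32:.2f} TFLOPS")
```

With the same 2 FLOPs per clock assumption, the MI300A inputs (14592 shaders, 2100MHz) give about 61.29 TFLOPS, which matches its listed FP64 figure; its listed FP32 figure is twice that.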

Chip

Manufacturer AMD	Manufacturer NVIDIA
Chip Designer AMD	Chip Designer NVIDIA
Architecture CDNA 3	Architecture Hopper
Family Instinct	Family Server
Codename Aqua Vanjaram, Variant Aqua Vanjaram XT	Codename NV180, GH100
Market Segment Server	Market Segment Server
Release Date 12/6/2023	Release Date 3/22/2022

Fabrication

Foundry TSMC, TSMC (Active Interposer Die)	Foundry TSMC
Fabrication Node N5, N6 (Active Interposer Die)	Fabrication Node 4N
Die Size 3x 370 mm²	Die Size 814 mm²
Transistor Count 3x 36.5 Billion	Transistor Count 80 Billion
Transistor Density 98.65M/mm²	Transistor Density 98.28M/mm²
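
The transistor density row follows from the two rows above it: transistor count divided by die area. A minimal sketch of that arithmetic, using only the figures listed here (the function name is hypothetical):

```python
def transistor_density_m_per_mm2(transistors_billion: float, die_area_mm2: float) -> float:
    """Density in millions of transistors per square millimetre."""
    return transistors_billion * 1000.0 / die_area_mm2


# MI300A is listed as 3 dies of 370 mm² with 36.5 billion transistors each.
mi300a_density = transistor_density_m_per_mm2(3 * 36.5, 3 * 370)
# H100 is listed as a single 814 mm² die with 80 billion transistors.
h100_density = transistor_density_m_per_mm2(80, 814)

print(f"MI300A: {mi300a_density:.2f}M/mm²")
print(f"H100:   {h100_density:.2f}M/mm²")
```

Both results round to the listed values, so the MI300A density figure appears to cover only the three dies counted in the Die Size row.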

Form

Form Factor APU SH5 Socket	Form Factor PCIe Card

Core Configuration

Shading Units 14592 Shaders	Shading Units 14592 Shaders
Texture Mapping Units 912 TMUs	Texture Mapping Units 456 TMUs
Render Output Units -	Render Output Units 24 ROPs
Tensor Cores 912 T-Cores	Tensor Cores 456 T-Cores
Streaming Multiprocessors -	Streaming Multiprocessors 114 SMs
Compute Units 228 CUs	Compute Units -

Clock Speeds

1000MHz Base	1065MHz Base
2100MHz	1755MHz
-	1620MHz Tensor

Cache

L1 16KB/CU	L1 64KB/SM Tex 256KB/SM
L2 16MB Shared	L2 50MB Shared
L3 192MB Shared	L3 -

Memory

128GB HBM3 ECC	80GB HBM2e
Bus Width 8192Bit	Bus Width 5120Bit
Clock 2600MHz	Clock 1593MHz
Transfer Rate 5.2GT/s	Transfer Rate 3.2GT/s
Bandwidth 5324.8GB/s	Bandwidth 2039GB/s
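
The bandwidth figures can be derived from bus width and transfer rate. The sketch below assumes the transfer rate is twice the listed memory clock (double data rate), which is consistent with both columns; the function name is hypothetical.

```python
def bandwidth_gb_s(bus_width_bits: int, clock_mhz: float) -> float:
    """Memory bandwidth in GB/s, assuming two transfers per clock (double data rate)."""
    transfer_rate_gt_s = 2 * clock_mhz / 1000.0
    bytes_per_transfer = bus_width_bits / 8
    return bytes_per_transfer * transfer_rate_gt_s


mi300a_bandwidth = bandwidth_gb_s(8192, 2600)  # listed: 5324.8GB/s
h100_bandwidth = bandwidth_gb_s(5120, 1593)    # listed: 2039GB/s

print(f"MI300A: {mi300a_bandwidth:.1f} GB/s")
print(f"H100:   {h100_bandwidth:.1f} GB/s")
```

For the H100, the listed 3.2GT/s appears to be rounded from 2 x 1593MHz = 3.186GT/s. Using the rounded value would give 2048GB/s rather than the listed 2039GB/s.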

Power & Thermals

TDP 560W	TDP 350W

Ports

No Ports	No Ports

Video Output

Max Resolution Unknown	Max Resolution Unknown
Max Resolution Refresh Rate -	Max Resolution Refresh Rate -
Variable Refresh Rate -	Variable Refresh Rate G-Sync, FreeSync
Display Stream Compression (DSC) Not Supported	Display Stream Compression (DSC) Not Supported
Multi Monitor Support Unknown	Multi Monitor Support Unknown

Video Encoder

Model 2x VCN 2.6	No Encoders
Codec AVC (H.264), HEVC (H.265)	Codec -

Video Decoder

No Decoders	Model 7x NVDEC 5
Codec -	Codec MPEG-1, MPEG-2, MPEG-4, VC-1, VP8, VP9, AVC (H.264), HEVC (H.265), AV1

API Support

OpenCL 3.0	OpenCL 3.0
GFX 9.4	CUDA 9.0, PureVideo HD VP11, VDPAU Feature Set K

Card

Power Connectors -	Power Connectors 1x 16-Pin 12VHPWR
Slots Required -	Slots Required 2.0
PCIe Version 5.0	PCIe Version 5.0
PCIe Lanes 16	PCIe Lanes 16
Multi GPU Support Supported, Type Infinity Fabric	Multi GPU Support Supported, Type NVLink
Height -	Height 111 mm (4.37 in)
Width -	Width 268 mm (10.55 in)
Depth -	Depth 40 mm (1.57 in)
