AMD Instinct MI300P vs NVIDIA GRID A100A

AMD Instinct MI300P	NVIDIA GRID A100A

9728 Shaders, 64GB HBM3, 2100MHz	6912 Shaders, 32GB HBM2e, 900MHz
Peak AI Performance 2.61 POPS INT4 Tensor Sparse	Peak AI Performance 1.59 POPS INT4 Tensor Sparse
FP32 81.72 TFLOPS	FP32 12.44 TFLOPS
FP16 81.72 TFLOPS	FP16 49.77 TFLOPS
Form Factor OAM Module	Form Factor vGPU
TDP 300W	TDP 400W

Benchmarks

No Geekbench 6 (OpenCL, Metal, Vulkan), Geekbench 5 (OpenCL, CUDA, Metal, Vulkan), or OctaneBench (2020.1, Metal) results are available for either GPU.

Tech Specs

Theoretical Performance

Peak AI Performance 2.61 POPS INT4 Tensor Sparse	Peak AI Performance 1.59 POPS INT4 Tensor Sparse
FP8 1.31 PFLOPS Tensor (FP16 Accumulate), 2.61 PFLOPS Tensor (FP16 Accumulate) Sparse, 1.31 PFLOPS Tensor (FP32 Accumulate), 2.61 PFLOPS Tensor (FP32 Accumulate) Sparse	-
FP16 81.72 TFLOPS, 653.72 TFLOPS Tensor (FP16 Accumulate), 1.31 PFLOPS Tensor (FP16 Accumulate) Sparse, 653.72 TFLOPS Tensor (FP32 Accumulate), 1.31 PFLOPS Tensor (FP32 Accumulate) Sparse	FP16 49.77 TFLOPS, 199.07 TFLOPS Tensor (FP16 Accumulate), 398.13 TFLOPS Tensor (FP16 Accumulate) Sparse, 199.07 TFLOPS Tensor (FP32 Accumulate), 398.13 TFLOPS Tensor (FP32 Accumulate) Sparse
FP32 81.72 TFLOPS, 81.72 TFLOPS Tensor, 163.43 TFLOPS Tensor Sparse	FP32 12.44 TFLOPS
FP64 40.86 TFLOPS, 81.72 TFLOPS Tensor	FP64 6.22 TFLOPS, 12.44 TFLOPS Tensor
BF16 653.72 TFLOPS Tensor, 1.31 PFLOPS Tensor Sparse	BF16 24.88 TFLOPS, 199.07 TFLOPS Tensor, 398.13 TFLOPS Tensor Sparse
TF32 326.86 TFLOPS Tensor, 653.72 TFLOPS Tensor Sparse	TF32 99.53 TFLOPS Tensor, 199.07 TFLOPS Tensor Sparse
INT4 1.31 POPS Tensor, 2.61 POPS Tensor Sparse	INT4 796.26 TOPS Tensor, 1.59 POPS Tensor Sparse
INT8 1.31 POPS Tensor, 2.61 POPS Tensor Sparse	INT8 398.13 TOPS Tensor, 796.26 TOPS Tensor Sparse
-	INT32 12.44 TOPS
Pixel Fillrate -	Pixel Fillrate 172.8 GPixel/s
Texture Fillrate 1276.8 GTexel/s	Texture Fillrate 388.8 GTexel/s
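
The fill rates and the A100A FP32 figure above appear consistent with the usual unit-count-times-clock estimate. The sketch below is not the page's stated method: it assumes one texel per TMU and one pixel per ROP per clock, one fused multiply-add (2 FLOPs) per shader per clock, and the 2100MHz and 900MHz clocks from the summary. Unit counts come from Core Configuration. The function names are illustrative only.

```python
def fillrate_g_per_s(units: int, clock_mhz: int) -> float:
    """Assumed peak rate: one operation per unit per clock, in G-ops/s."""
    return units * clock_mhz / 1000


def fma_tflops(shaders: int, clock_mhz: int) -> float:
    """Assumed peak rate: one FMA (2 FLOPs) per shader per clock, in TFLOPS."""
    return shaders * 2 * clock_mhz / 1_000_000


mi300p_texture = fillrate_g_per_s(608, 2100)  # 608 TMUs at 2100MHz
a100a_texture = fillrate_g_per_s(432, 900)    # 432 TMUs at 900MHz
a100a_pixel = fillrate_g_per_s(192, 900)      # 192 ROPs at 900MHz
a100a_fp32 = fma_tflops(6912, 900)            # 6912 shaders at 900MHz
mi300p_fma = fma_tflops(9728, 2100)           # 9728 shaders at 2100MHz

print(f"MI300P texture fill rate: {mi300p_texture:.1f} GTexel/s")
print(f"A100A texture fill rate:  {a100a_texture:.1f} GTexel/s")
print(f"A100A pixel fill rate:    {a100a_pixel:.1f} GPixel/s")
print(f"A100A FP32 (FMA):         {a100a_fp32:.2f} TFLOPS")
print(f"MI300P FMA estimate:      {mi300p_fma:.2f} TFLOPS")
```

Under these assumptions the results round to 1276.8 GTexel/s, 388.8 GTexel/s, 172.8 GPixel/s, and 12.44 TFLOPS, matching the listed values. For the MI300P the same FMA estimate gives 40.86 TFLOPS, which equals its listed FP64 figure; the listed FP32 figure is twice that, so it presumably counts more than one FMA per shader per clock.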

Chip

Manufacturer AMD	Manufacturer NVIDIA
Chip Designer AMD	Chip Designer NVIDIA
Architecture CDNA 3	Architecture Ampere
Family Instinct	Family GRID
Codename Aqua Vanjaram, Aqua Vanjaram XL, Variant Aqua Vanjaram XL	Codename NV170, GA100
Market Segment Server	Market Segment Server
Release Date 12/6/2023	Release Date 5/14/2020

Fabrication

Foundry TSMC, TSMC (Active Interposer Die)	Foundry TSMC
Fabrication Node N5, N6 (Active Interposer Die)	Fabrication Node 7N
Die Size 2x 370 mm²	Die Size 826 mm²
Transistor Count 2x 36.5 Billion	Transistor Count 54.2 Billion
Transistor Density 98.65M/mm²	Transistor Density 65.62M/mm²
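
The density row follows from dividing transistor count by die area. A minimal sketch of that arithmetic, assuming the MI300P figure is taken per die (the 2x multiplier cancels out); the function name is illustrative:

```python
def density_m_per_mm2(transistors_billion: float, die_area_mm2: float) -> float:
    """Transistor density in millions of transistors per square millimetre."""
    return transistors_billion * 1000 / die_area_mm2


mi300p_density = density_m_per_mm2(36.5, 370)  # per die: 36.5 billion on 370 mm²
a100a_density = density_m_per_mm2(54.2, 826)   # 54.2 billion on 826 mm²

print(f"MI300P: {mi300p_density:.2f}M/mm²")
print(f"A100A:  {a100a_density:.2f}M/mm²")
```

Both results round to the listed 98.65M/mm² and 65.62M/mm².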

Form

Form OAM Module	Form vGPU

Core Configuration

Shading Units 9728 Shaders	Shading Units 6912 Shaders
Texture Mapping Units 608 TMUs	Texture Mapping Units 432 TMUs
Render Output Units -	Render Output Units 192 ROPs
Tensor Cores 608 T-Cores	Tensor Cores 432 T-Cores
-	Streaming Multiprocessors 108 SMs
Compute Units 152 CUs	-

Clock Speeds

1000MHz Base, 2100MHz	900MHz

Cache

L1 16KB/CU	L1 64KB/SM, Tex 192KB/SM
L2 16MB Shared	L2 40MB Shared
L3 128MB Shared	-

Memory

64GB HBM3 ECC	32GB HBM2e
Bus Width 4096Bit	Bus Width 6144Bit
Clock 2600MHz, Transfer Rate 5.2GT/s, Bandwidth 2662.4GB/s	Clock 1215MHz, Transfer Rate 2.4GT/s, Bandwidth 1866.2GB/s
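
The bandwidth figures can be reproduced from bus width and memory clock. This is a sketch assuming double-data-rate signalling (two transfers per clock), which is what the listed clock and transfer-rate pairs imply; the function and parameter names are illustrative:

```python
def memory_bandwidth_gb_s(bus_width_bits: int, memory_clock_mhz: int,
                          transfers_per_clock: int = 2) -> float:
    """Peak bandwidth in GB/s: bytes per transfer times transfers per second."""
    transfer_rate_mt_s = memory_clock_mhz * transfers_per_clock
    return bus_width_bits / 8 * transfer_rate_mt_s / 1000


mi300p_bw = memory_bandwidth_gb_s(4096, 2600)  # 5200 MT/s over a 4096-bit bus
a100a_bw = memory_bandwidth_gb_s(6144, 1215)   # 2430 MT/s over a 6144-bit bus

print(f"MI300P: {mi300p_bw:.1f} GB/s")
print(f"A100A:  {a100a_bw:.1f} GB/s")
```

The results round to 2662.4GB/s and 1866.2GB/s. The A100A value only matches when the unrounded 2.43GT/s rate is used; the listed 2.4GT/s would give 1843.2GB/s.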

Power & Thermals

TDP 300W	TDP 400W

Ports

No Ports	No Ports

Video Output

Max Resolution Unknown	Max Resolution Unknown
Variable Refresh Rate -	Variable Refresh Rate G-Sync, FreeSync
Display Stream Compression (DSC) Not Supported	Display Stream Compression (DSC) Not Supported
Multi Monitor Support Unknown	Multi Monitor Support Unknown

Video Encoder

Model 2x VCN 2.6	No Encoders
Codec AVC (H.264), HEVC (H.265)	-

Video Decoder

No Decoders	Model 5x NVDEC 4
-	Codec MPEG-1, MPEG-2, MPEG-4, VC-1, VP8, VP9, AVC (H.264), HEVC (H.265)

API Support

OpenCL 3.0	OpenCL 3.0, Vulkan 1.2
GFX 9.4	CUDA 8.0, PureVideo HD VP10, VDPAU Feature Set J

Card

-	Not a Card
PCIe Version 5.0, PCIe Lanes 16	PCIe Version 4.0, PCIe Lanes 16
Multi GPU Support Supported, Type Infinity Fabric	-
