NVIDIA V100 vs AMD Radeon Instinct MI60

NVIDIA V100 - 5120 Shaders, 32GB HBM2, 1380MHz
AMD Radeon Instinct MI60 - 4096 Shaders, 32GB HBM2, 1800MHz
Peak AI Performance
113.05 TFLOPS
FP16 Tensor (FP16 Accumulate)
Peak AI Performance
29.49 TFLOPS
FP16
FP32
14.13 TFLOPS
FP32
14.75 TFLOPS
FP16
28.26 TFLOPS
FP16
29.49 TFLOPS
Form Factor
PCIe Card
2.0-Slots
Form Factor
PCIe Card
2.0-Slots
TDP
250W
TDP
300W
Power Connectors
-
2x 8-Pin
Power Connectors
1x 6-Pin
1x 8-Pin

Peak AI Performance

  • V100: 3.83x faster than Radeon Instinct MI60
  • Radeon Instinct MI60: 74% slower than V100
V100 - 113.05 TFLOPS FP16 Tensor (FP16 Accumulate)
x3.83
Radeon Instinct MI60 - 29.49 TFLOPS FP16
x1
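
The multiplier and percentage figures in these comparison blocks can be reproduced from the listed values. The following is a minimal sketch, assuming the page computes a plain ratio for the "x" figure and a relative difference for the percentage; the function names `speedup` and `percent_slower` are hypothetical and not taken from the source.

```python
def speedup(a: float, b: float) -> float:
    """Return how many times larger a is than b (assumed meaning of 'x3.83')."""
    return a / b


def percent_slower(slow: float, fast: float) -> float:
    """Return how far 'slow' falls below 'fast', as a percentage of 'fast'."""
    return (1 - slow / fast) * 100


# Peak AI performance figures as listed on the page (TFLOPS).
v100_tflops = 113.05   # FP16 Tensor (FP16 Accumulate)
mi60_tflops = 29.49    # FP16

print(f"V100 vs MI60: x{speedup(v100_tflops, mi60_tflops):.2f}")
print(f"MI60 vs V100: {percent_slower(mi60_tflops, v100_tflops):.0f}% slower")
```

Note that the two figures are not symmetric: a 3.83x advantage in one direction corresponds to a 74% deficit in the other, because each percentage is taken relative to a different baseline.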

FP32

  • V100: 4% slower than Radeon Instinct MI60
  • Radeon Instinct MI60: 4% faster than V100
V100 - 14.13 TFLOPS FP32
x1
Radeon Instinct MI60 - 14.75 TFLOPS FP32
x1.04

FP16

  • V100: 4% slower than Radeon Instinct MI60
  • Radeon Instinct MI60: 4% faster than V100
V100 - 28.26 TFLOPS FP16
x1
Radeon Instinct MI60 - 29.49 TFLOPS FP16
x1.04

Memory Bandwidth

  • V100: 13% lower than Radeon Instinct MI60
  • Radeon Instinct MI60: 14% higher than V100
V100 - 896GB/s
x1
Radeon Instinct MI60 - 1024GB/s
x1.14

TDP

  • V100: 17% lower than Radeon Instinct MI60
  • Radeon Instinct MI60: 20% higher than V100
V100 - 250W
x1
Radeon Instinct MI60 - 300W
x1.2
Manufacturer
NVIDIA
Manufacturer
AMD
Chip Designer
NVIDIA
Chip Designer
AMD
Architecture
Volta
Architecture
GCN 5
Family
Server
Family
Instinct
Codename
V100 - NV140, GV100
Radeon Instinct MI60 - Moonshot, Vega 20
Variant
Radeon Instinct MI60 - Vega 20 XT GL
Market Segment
Server
Market Segment
Server
Release Date
3/27/2018
Release Date
11/18/2018
Foundry
TSMC
Foundry
TSMC
Fabrication Node
12FFN
Fabrication Node
N7
Die Size
815 mm²
Die Size
331 mm²
Transistor Count
21.1 Billion
Transistor Count
13.2 Billion
Transistor Density
25.89M/mm²
Transistor Density
39.97M/mm²
Form
PCIe Card
Form
PCIe Card
Shading Units
5120 Shaders
Shading Units
4096 Shaders
Texture Mapping Units
320 TMUs
Texture Mapping Units
256 TMUs
Render Output Units
128 ROPs
Render Output Units
64 ROPs
Tensor Cores
V100 - 640 T-Cores
Streaming Multiprocessors
V100 - 80 SMs
Compute Units
Radeon Instinct MI60 - 64 CUs
Graphics Processing Clusters
V100 - 6 GPCs
Clock Speed
V100 - 1230MHz Base, 1380MHz Boost
Radeon Instinct MI60 - 1500MHz Base, 1800MHz Boost
Peak AI Performance
113.05 TFLOPS
FP16 Tensor (FP16 Accumulate)
Peak AI Performance
29.49 TFLOPS
FP16
FP16
28.26 TFLOPS
113.05 TFLOPS Tensor (FP16 Accumulate)
FP16
29.49 TFLOPS
-
FP32
14.13 TFLOPS
FP32
14.75 TFLOPS
FP64
7.07 TFLOPS
FP64
7.37 TFLOPS
INT8
56.53 TOPS
-
-
INT32
14.13 TOPS
-
-
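
The listed throughput figures are theoretical peaks that follow from unit counts and clock speeds. The sketch below reproduces them under stated assumptions: each shader retires one fused multiply-add (2 FLOPs) per clock, FP16 runs at twice and FP64 at half the FP32 rate on both cards (inferred from the listed ratios), and each Volta tensor core performs 64 FMAs (128 FLOPs) per clock. The helper name `peak_tflops` is hypothetical.

```python
def peak_tflops(units: int, clock_mhz: float, flops_per_unit_per_clock: int = 2) -> float:
    """Theoretical peak in TFLOPS: units x FLOPs per clock x clock rate."""
    return units * flops_per_unit_per_clock * clock_mhz * 1e6 / 1e12


# FP32: shader count x 2 FLOPs (one FMA) x boost clock.
v100_fp32 = peak_tflops(5120, 1380)
mi60_fp32 = peak_tflops(4096, 1800)

# Assumed rate ratios, inferred from the listed figures.
v100_fp16, mi60_fp16 = 2 * v100_fp32, 2 * mi60_fp32
v100_fp64, mi60_fp64 = v100_fp32 / 2, mi60_fp32 / 2

# Tensor cores: 640 cores x 128 FLOPs per clock (assumption: 64 FMAs per core).
v100_tensor = peak_tflops(640, 1380, flops_per_unit_per_clock=128)

for name, value in [
    ("V100 FP32", v100_fp32), ("MI60 FP32", mi60_fp32),
    ("V100 FP16", v100_fp16), ("MI60 FP16", mi60_fp16),
    ("V100 FP64", v100_fp64), ("MI60 FP64", mi60_fp64),
    ("V100 FP16 Tensor", v100_tensor),
]:
    print(f"{name}: {value:.2f} TFLOPS")
```

These are upper bounds at the boost clock; sustained throughput in real workloads depends on memory bandwidth, occupancy, and thermal limits.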
Pixel Fillrate
176.64 GPixel/s
Pixel Fillrate
115.2 GPixel/s
Texture Fillrate
441.6 GTexel/s
Texture Fillrate
460.8 GTexel/s
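
The fillrate figures follow the same pattern as the compute peaks. As a rough sketch, assuming one pixel per ROP per clock and one texel per TMU per clock at the boost clock (the helper name `fillrate_g_per_s` is hypothetical):

```python
def fillrate_g_per_s(units: int, clock_mhz: float) -> float:
    """Peak fillrate in giga-operations per second: units x clock (MHz) / 1000."""
    return units * clock_mhz / 1000


# Pixel fillrate: ROPs x boost clock.
v100_pixel = fillrate_g_per_s(128, 1380)
mi60_pixel = fillrate_g_per_s(64, 1800)

# Texture fillrate: TMUs x boost clock.
v100_texel = fillrate_g_per_s(320, 1380)
mi60_texel = fillrate_g_per_s(256, 1800)

print(f"V100: {v100_pixel:.2f} GPixel/s, {v100_texel:.1f} GTexel/s")
print(f"MI60: {mi60_pixel:.1f} GPixel/s, {mi60_texel:.1f} GTexel/s")
```

This shows why the V100 leads on pixel fillrate (twice the ROPs) while the MI60 edges ahead on texture fillrate (fewer TMUs, but a higher clock).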
L1
V100 - 128KB/SM
Radeon Instinct MI60 - 16KB/CU
L2
V100 - 6MB Shared
Radeon Instinct MI60 - 4MB Shared
Memory
V100 - 32GB HBM2
Radeon Instinct MI60 - 32GB HBM2 ECC
Bus Width
4096Bit
Bus Width
4096Bit
Memory Clock
V100 - 875MHz
Radeon Instinct MI60 - 1000MHz
Transfer Rate
V100 - 1.75GT/s
Radeon Instinct MI60 - 2GT/s
Bandwidth
V100 - 896GB/s
Radeon Instinct MI60 - 1024GB/s
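
The bandwidth figures can be derived from the bus width and memory clock. A minimal sketch, assuming HBM2 transfers data twice per memory clock (double data rate); the helper name `bandwidth_gb_s` is hypothetical:

```python
def bandwidth_gb_s(bus_width_bits: int, memory_clock_mhz: float,
                   transfers_per_clock: int = 2) -> float:
    """Peak memory bandwidth in GB/s.

    bytes per transfer = bus width / 8
    transfer rate (GT/s) = memory clock (MHz) x transfers per clock / 1000
    """
    transfer_rate_gt_s = memory_clock_mhz * transfers_per_clock / 1000
    return bus_width_bits / 8 * transfer_rate_gt_s


v100_bw = bandwidth_gb_s(4096, 875)    # 512 bytes x 1.75 GT/s
mi60_bw = bandwidth_gb_s(4096, 1000)   # 512 bytes x 2 GT/s

print(f"V100: {v100_bw:.0f} GB/s")
print(f"MI60: {mi60_bw:.0f} GB/s")
```

With identical 4096-bit buses, the MI60's bandwidth advantage comes entirely from its higher memory clock.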
TDP
250W
TDP
300W
Display Outputs
Radeon Instinct MI60 - 1x mini-DisplayPort 1.4
Max Resolution
V100 - Unknown
Radeon Instinct MI60 - 7680x4320
Max Resolution Refresh Rate
Radeon Instinct MI60 - 60Hz
Variable Refresh Rate
V100 - G-Sync, FreeSync
Radeon Instinct MI60 - FreeSync
Display Stream Compression (DSC)
V100 - Not Supported
Radeon Instinct MI60 - Not Supported
Multi Monitor Support
V100 - Unknown
Radeon Instinct MI60 - 3
Content Protection
Radeon Instinct MI60 - HDCP 2.2
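Video Encoder
V100 - 3x NVENC 5
Radeon Instinct MI60 - VCE 4.1
Encode Codecs
V100 - AVC (H.264), HEVC (H.265)
Radeon Instinct MI60 - AVC (H.264), HEVC (H.265)
Video Decoder
V100 - NVDEC 3
Radeon Instinct MI60 - UVD 7.2
Decode Codecs
V100 - MPEG-1, MPEG-2, MPEG-4, VC-1, VP9, AVC (H.264), HEVC (H.265)
Radeon Instinct MI60 - MPEG-1, MPEG-2, MPEG-4, JPEG, VC-1, AVC (H.264), HEVC (H.265)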
Direct X
12
Direct 3D
12_1
Direct X
12
Direct 3D
12_1
OpenGL
4.6
OpenCL
3.0
-
-
OpenGL
4.6
OpenCL
2.1
Vulkan
1.3
Shader Model
6.7
CUDA Compute Capability
7.0
-
-
PureVideo HD
VP9
VDPAU
Feature Set I
Shader Model
6.7
-
-
GFX
9
-
-
-
-
Power Connectors
-
2x 8-Pin
Power Connectors
1x 6-Pin
1x 8-Pin
Slots Required
2.0
PCIe Version
3.0
PCIe Lanes
16
Slots Required
2.0
PCIe Version
3.0
PCIe Lanes
16
Multi GPU Support
Supported
Type
NVLink
Multi GPU Support
Supported
Type
CrossFire XDMA
Height
111 mm (4.37 in)
Width
267 mm (10.51 in)
Depth
40 mm (1.57 in)
Height
111 mm (4.37 in)
Width
267 mm (10.51 in)
Depth
37 mm (1.46 in)