NVIDIA T40 vs AMD Radeon Pro V340

NVIDIA T40
  • 4608 Shaders
  • 24GB GDDR6
  • 1560MHz
AMD Radeon Pro V340
  • 2x 3584 Shaders
  • 32GB (2x 16GB) HBM2
  • 1500MHz
Peak AI Performance
  • T40: 460.06 TOPS (INT4 Tensor)
  • Radeon Pro V340: 43.01 TFLOPS (FP16)
FP32
  • T40: 14.38 TFLOPS
  • Radeon Pro V340: 21.5 TFLOPS
FP16
  • T40: 28.75 TFLOPS
  • Radeon Pro V340: 43.01 TFLOPS
Form Factor
  • T40: PCIe Card, 1 Slot
  • Radeon Pro V340: PCIe Card, 2 Slots
TDP
  • T40: 150W
  • Radeon Pro V340: 300W
Power Connectors
  • T40: 1x 6-Pin, 1x 8-Pin
  • Radeon Pro V340: 2x 8-Pin

Peak AI Performance

  • T40: 460.06 TOPS INT4 Tensor (x10.7), 10.7x faster vs Radeon Pro V340
  • Radeon Pro V340: 43.01 TFLOPS FP16 (x1), 91% slower vs T40

FP32

  • T40: 14.38 TFLOPS FP32 (x1), 33% slower vs Radeon Pro V340
  • Radeon Pro V340: 21.5 TFLOPS FP32 (x1.5), 50% faster vs T40

FP16

  • T40: 28.75 TFLOPS FP16 (x1), 33% slower vs Radeon Pro V340
  • Radeon Pro V340: 43.01 TFLOPS FP16 (x1.5), 50% faster vs T40

Memory Bandwidth

  • T40: 624GB/s (x1.29), 29% faster vs Radeon Pro V340
  • Radeon Pro V340: 483.8GB/s (x1), 22% slower vs T40

TDP

  • T40: 150W (x1), 50% lower vs Radeon Pro V340
  • Radeon Pro V340: 300W (x2), 100% higher vs T40
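
The multipliers and percentages in the comparison blocks above follow from simple ratios of the two listed values. A minimal Python sketch, assuming "faster/higher" means (a/b - 1) and "slower/lower" means (1 - b/a); the helper name `compare` is hypothetical and not part of the source page:

```python
def compare(a: float, b: float) -> dict:
    """Relate the larger listed value `a` to the smaller listed value `b`."""
    ratio = a / b
    return {
        "ratio": round(ratio, 2),                # bar multiplier, e.g. x1.29
        "pct_higher": round((ratio - 1) * 100),  # how much higher a is than b
        "pct_lower": round((1 - b / a) * 100),   # how much lower b is than a
    }


ai = compare(460.06, 43.01)        # T40 INT4 TOPS vs Radeon Pro V340 FP16 TFLOPS
fp32 = compare(21.5, 14.38)        # Radeon Pro V340 vs T40, TFLOPS
bandwidth = compare(624.0, 483.8)  # T40 vs Radeon Pro V340, GB/s
tdp = compare(300.0, 150.0)        # Radeon Pro V340 vs T40, W

for name, result in [("AI", ai), ("FP32", fp32), ("Bandwidth", bandwidth), ("TDP", tdp)]:
    print(name, result)
```

Note that "50% faster" and "33% slower" describe the same FP32 gap from the two cards' points of view, which is why the paired percentages differ.
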
Manufacturer
  • T40: NVIDIA
  • Radeon Pro V340: AMD
Chip Designer
  • T40: NVIDIA
  • Radeon Pro V340: AMD
Architecture
  • T40: Turing
  • Radeon Pro V340: GCN 5
Family
  • T40: Server
  • Radeon Pro V340: Radeon Pro V
Codename
  • T40: NV162, TU102
  • Radeon Pro V340: Greenland, Vega 10
Variant
  • T40: TU102-890-KCD-A1
  • Radeon Pro V340: Vega 10 XL GL
Market Segment
  • T40: Server
  • Radeon Pro V340: Server
Release Date
  • T40: 9/13/2018
  • Radeon Pro V340: 8/26/2018
Foundry
  • T40: TSMC
  • Radeon Pro V340: GlobalFoundries
Fabrication Node
  • T40: 12FFN
  • Radeon Pro V340: 14LPP
Die Size
  • T40: 754 mm²
  • Radeon Pro V340: 2x 486 mm²
Transistor Count
  • T40: 18.6 Billion
  • Radeon Pro V340: 2x 12.5 Billion
Transistor Density
  • T40: 24.67M/mm²
  • Radeon Pro V340: 25.72M/mm²
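
The transistor density figures can be reproduced from the listed transistor counts and die sizes. A small Python sketch; the function name is illustrative, and the Radeon Pro V340 value is computed per die since the card carries two identical dies:

```python
def density_m_per_mm2(transistors: float, die_area_mm2: float) -> float:
    """Transistor density in millions of transistors per square millimetre."""
    return transistors / die_area_mm2 / 1e6


t40_density = density_m_per_mm2(18.6e9, 754)   # single TU102 die
v340_density = density_m_per_mm2(12.5e9, 486)  # one of the two Vega 10 dies

print(f"T40: {t40_density:.2f}M/mm²")
print(f"Radeon Pro V340: {v340_density:.2f}M/mm²")
```
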
Form
  • T40: PCIe Card
  • Radeon Pro V340: PCIe Card
Shading Units
  • T40: 4608 Shaders
  • Radeon Pro V340: 2x 3584 Shaders
Texture Mapping Units
  • T40: 288 TMUs
  • Radeon Pro V340: 2x 224 TMUs
Render Output Units
  • T40: 96 ROPs
  • Radeon Pro V340: 2x 64 ROPs
Tensor Cores
  • T40: 576 T-Cores
  • Radeon Pro V340: -
Ray-Tracing Cores
  • T40: 72 RT-Cores
  • Radeon Pro V340: -
Streaming Multiprocessors
  • T40: 72 SMs
  • Radeon Pro V340: -
Compute Units
  • T40: -
  • Radeon Pro V340: 2x 56 CUs
GPU Clock
  • T40: 1305MHz Base, 1560MHz
  • Radeon Pro V340: 852MHz Base, 1500MHz
Peak AI Performance
  • T40: 460.06 TOPS (INT4 Tensor)
  • Radeon Pro V340: 43.01 TFLOPS (FP16)
FP16
  • T40: 28.75 TFLOPS, 115.02 TFLOPS Tensor (FP16 Accumulate), 115.02 TFLOPS Tensor (FP32 Accumulate)
  • Radeon Pro V340: 43.01 TFLOPS
FP32
  • T40: 14.38 TFLOPS
  • Radeon Pro V340: 21.5 TFLOPS
FP64
  • T40: 450 GFLOPS
  • Radeon Pro V340: 1.34 TFLOPS
INT4
  • T40: 460.06 TOPS Tensor
  • Radeon Pro V340: -
INT8
  • T40: 230.03 TOPS Tensor
  • Radeon Pro V340: -
Ray Tracing
  • T40: 56.2 TOPS
  • Radeon Pro V340: -
Pixel Fillrate
  • T40: 149.76 GPixel/s
  • Radeon Pro V340: 96 GPixel/s
Texture Fillrate
  • T40: 449.28 GTexel/s
  • Radeon Pro V340: 336 GTexel/s
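
The peak throughput and fill rate figures above appear to be derived from unit counts and the higher listed GPU clock. The following Python sketch reproduces them under stated assumptions: one fused multiply-add (2 ops) per shader per clock for FP32, FP16 at twice the FP32 rate, and 64 FMAs (128 ops) per Tensor Core per clock for FP16, doubling for INT8 and again for INT4. The function names are illustrative only.

```python
def tflops(lanes: int, clock_mhz: float, ops_per_clock: int = 2) -> float:
    """Peak rate in TFLOPS (or TOPS): lanes x ops per clock x clock."""
    return lanes * ops_per_clock * clock_mhz * 1e6 / 1e12


def fillrate_g(units: int, clock_mhz: float) -> float:
    """Fill rate in GPixel/s or GTexel/s: one result per unit per clock."""
    return units * clock_mhz / 1000


# Shader throughput (assumed: 2 ops per shader per clock, FP16 = 2x FP32).
t40_fp32 = tflops(4608, 1560)
v340_fp32 = tflops(2 * 3584, 1500)  # both GPUs on the card combined
t40_fp16 = 2 * t40_fp32
v340_fp16 = 2 * v340_fp32

# Tensor throughput (assumed: 128 FP16 ops per Tensor Core per clock).
t40_tensor_fp16 = tflops(576, 1560, ops_per_clock=128)
t40_int8 = 2 * t40_tensor_fp16
t40_int4 = 4 * t40_tensor_fp16

# Fill rates: ROPs or TMUs times clock.
t40_pixel = fillrate_g(96, 1560)
t40_texel = fillrate_g(288, 1560)
v340_pixel = fillrate_g(64, 1500)   # one GPU's 64 ROPs
v340_texel = fillrate_g(224, 1500)  # one GPU's 224 TMUs

print(f"T40 FP32 {t40_fp32:.2f} TFLOPS, INT4 {t40_int4:.2f} TOPS")
print(f"Radeon Pro V340 FP32 {v340_fp32:.2f} TFLOPS, FP16 {v340_fp16:.2f} TFLOPS")
```

Under these assumptions, the listed Radeon Pro V340 fill rates match a single GPU (64 ROPs, 224 TMUs), while its FLOPS figures match both GPUs combined.
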
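L1
  • T40: 32KB/SM Tex, 64KB/SM
  • Radeon Pro V340: 16KB/CU
L2
  • T40: 6MB Shared
  • Radeon Pro V340: 4MB Shared
Memory
  • T40: 24GB GDDR6
  • Radeon Pro V340: 32GB (2x 16GB) HBM2
Bus Width
  • T40: 384Bit
  • Radeon Pro V340: 2048Bit
Clock
  • T40: 1625MHz
  • Radeon Pro V340: 945MHz
Transfer Rate
  • T40: 13GT/s
  • Radeon Pro V340: 1.9GT/s
Bandwidth
  • T40: 624GB/s
  • Radeon Pro V340: 483.8GB/s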
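
The bandwidth figures follow from bus width and transfer rate. A short Python sketch, assuming 8 transfers per listed clock for GDDR6 (1625MHz to 13GT/s) and 2 transfers per clock for HBM2 (945MHz to 1.89GT/s, shown rounded as 1.9GT/s); the function name is illustrative:

```python
def bandwidth_gb_s(bus_width_bits: int, transfer_rate_gt_s: float) -> float:
    """Peak memory bandwidth in GB/s: bytes per transfer x transfers per second."""
    return bus_width_bits / 8 * transfer_rate_gt_s


t40_rate = 1625 * 8 / 1000   # GDDR6: 13.0 GT/s (assumed 8 transfers per clock)
v340_rate = 945 * 2 / 1000   # HBM2: 1.89 GT/s (double data rate)

t40_bw = bandwidth_gb_s(384, t40_rate)
v340_bw = bandwidth_gb_s(2048, v340_rate)  # matches the per-GPU 2048Bit bus

print(f"T40: {t40_bw:.1f}GB/s")
print(f"Radeon Pro V340: {v340_bw:.1f}GB/s")
```
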
TDP
  • T40: 150W
  • Radeon Pro V340: 300W
Display Outputs
  • T40: -
  • Radeon Pro V340: 1x mini-DisplayPort 1.4
Max Resolution
  • T40: Unknown
  • Radeon Pro V340: 7680x4320
Max Resolution Refresh Rate
  • T40: -
  • Radeon Pro V340: 60Hz
Variable Refresh Rate
  • T40: G-Sync, FreeSync
  • Radeon Pro V340: FreeSync
Display Stream Compression (DSC)
  • T40: Not Supported
  • Radeon Pro V340: Not Supported
Multi Monitor Support
  • T40: Unknown
  • Radeon Pro V340: 3
Content Protection
  • T40: -
  • Radeon Pro V340: HDCP 2.2
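Video Encoder
  • T40: NVENC 7
  • Radeon Pro V340: VCE 4.0
Video Encoder Codecs
  • T40: AVC (H.264), HEVC (H.265)
  • Radeon Pro V340: AVC (H.264), HEVC (H.265)
Video Decoder
  • T40: NVDEC 4
  • Radeon Pro V340: UVD 7.0
Video Decoder Codecs
  • T40: MPEG-1, MPEG-2, MPEG-4, VC-1, VP8, VP9, AVC (H.264), HEVC (H.265)
  • Radeon Pro V340: MPEG-1, MPEG-2, MPEG-4, JPEG, VC-1, AVC (H.264), HEVC (H.265)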
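DirectX
  • T40: 12
  • Radeon Pro V340: 12
Direct3D
  • T40: 12_2
  • Radeon Pro V340: 12_1
OpenGL
  • T40: 4.6
  • Radeon Pro V340: 4.6
OpenCL
  • T40: 3.0
  • Radeon Pro V340: 2.1
Vulkan
  • T40: 1.2
  • Radeon Pro V340: 1.3
Shader Model
  • T40: 6.6
  • Radeon Pro V340: 6.7
CUDA
  • T40: 7.5
  • Radeon Pro V340: -
PureVideo HD
  • T40: VP10
  • Radeon Pro V340: -
VDPAU
  • T40: Feature Set J
  • Radeon Pro V340: -
GFX
  • T40: -
  • Radeon Pro V340: 9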
Power Connectors
  • T40: 1x 6-Pin, 1x 8-Pin
  • Radeon Pro V340: 2x 8-Pin
Slots Required
  • T40: 1.0
  • Radeon Pro V340: 2.0
PCIe Version
  • T40: 3.0
  • Radeon Pro V340: 3.0
PCIe Lanes
  • T40: 16
  • Radeon Pro V340: 16
Multi GPU Support
  • T40: Supported (NVLink)
  • Radeon Pro V340: Supported (CrossFire XDMA)
Height
  • T40: 111 mm (4.37 in)
  • Radeon Pro V340: 111 mm (4.37 in)
Width
  • T40: 267 mm (10.51 in)
  • Radeon Pro V340: 267 mm (10.51 in)
Depth
  • T40: 20 mm (0.79 in)
  • Radeon Pro V340: 37 mm (1.46 in)