AMD Radeon Pro V340 vs NVIDIA Tesla P40

Specification                 AMD Radeon Pro V340         NVIDIA Tesla P40 ($5,699)
Shaders                       2x 3584                     3840
Memory                        32GB (2x 16GB) HBM2         24GB GDDR5
Clock                         1500MHz                     1531MHz
Peak AI Performance           43.01 TFLOPS (FP16)         47.03 TOPS (INT8)
FP32                          21.5 TFLOPS                 11.76 TFLOPS
FP16                          43.01 TFLOPS                180 GFLOPS
Form Factor                   PCIe Card, 2.0 slots        PCIe Card, 2.0 slots
TDP                           300W                        250W
Power Connectors              2x 8-Pin                    1x 6-Pin, 1x 8-Pin

Peak AI Performance

  • 9% slower vs Tesla P40 (FP16 TFLOPS compared against INT8 TOPS)
  • 9% faster vs Radeon Pro V340 (INT8 TOPS compared against FP16 TFLOPS)
Radeon Pro V340 - 43.01 TFLOPS FP16
x1
Tesla P40 - 47.03 TOPS INT8
x1.09

FP32

  • 83% faster vs Tesla P40
  • 45% slower vs Radeon Pro V340
Radeon Pro V340 - 21.5 TFLOPS FP32
x1.83
Tesla P40 - 11.76 TFLOPS FP32
x1
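
The "faster" and "slower" percentages above are not symmetric because each one uses a different card as its baseline. A minimal Python sketch of that arithmetic, using the FP32 figures listed above (the function name relative_difference is illustrative and not taken from the source page):

```python
def relative_difference(a: float, b: float) -> tuple[float, float]:
    """Return (percent by which a exceeds b, percent by which b trails a)."""
    faster = (a / b - 1.0) * 100.0   # baseline is the slower card, b
    slower = (1.0 - b / a) * 100.0   # baseline is the faster card, a
    return faster, slower


v340_fp32 = 21.5    # TFLOPS, Radeon Pro V340 as listed
p40_fp32 = 11.76    # TFLOPS, Tesla P40 as listed

faster, slower = relative_difference(v340_fp32, p40_fp32)
print(f"{faster:.0f}% faster, {slower:.0f}% slower")
```

The same function applies to the other comparisons on this page, for example the memory bandwidth figures.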

FP16

  • 238.94x faster vs Tesla P40
  • 99.6% slower vs Radeon Pro V340
Radeon Pro V340 - 43.01 TFLOPS FP16
x238.94
Tesla P40 - 180 GFLOPS FP16
x1

Memory Bandwidth

  • 40% faster vs Tesla P40
  • 29% slower vs Radeon Pro V340
Radeon Pro V340 - 483.8GB/s
x1.4
Tesla P40 - 345.6GB/s
x1

TDP

  • 20% higher vs Tesla P40
  • 17% lower vs Radeon Pro V340
Radeon Pro V340 - 300W
x1.2
Tesla P40 - 250W
x1

Specification                 Radeon Pro V340             Tesla P40
Manufacturer                  AMD                         NVIDIA
Chip Designer                 AMD                         NVIDIA
Architecture                  GCN 5                       Pascal
Family                        Radeon Pro V                Tesla P
Codename                      Greenland, Vega 10          NV132, GP102
Variant                       Vega 10 XL GL               GP102-890-A1
Market Segment                Server                      Server
Release Date                  8/26/2018                   9/13/2016
Foundry                       GlobalFoundries             TSMC
Fabrication Node              14LPP                       16FF
Die Size                      2x 486 mm²                  471 mm²
Transistor Count              2x 12.5 Billion             11.8 Billion
Transistor Density            25.72M/mm²                  25.05M/mm²
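
The transistor density figures follow from dividing the transistor count by the die size. A short Python sketch of that calculation, assuming the Radeon Pro V340 value is computed per die (one Vega 10 chip) rather than for the whole card; transistor_density is an illustrative name:

```python
def transistor_density(transistors: float, die_area_mm2: float) -> float:
    """Return density in millions of transistors per square millimetre."""
    return transistors / die_area_mm2 / 1e6


# Per-die inputs as listed: 12.5 billion on 486 mm², 11.8 billion on 471 mm².
v340_density = transistor_density(12.5e9, 486)
p40_density = transistor_density(11.8e9, 471)

print(f"Radeon Pro V340: {v340_density:.2f}M/mm²")
print(f"Tesla P40:       {p40_density:.2f}M/mm²")
```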

Specification                 Radeon Pro V340             Tesla P40
Form                          PCIe Card                   PCIe Card
Shading Units                 2x 3584 Shaders             3840 Shaders
Texture Mapping Units         2x 224 TMUs                 240 TMUs
Render Output Units           2x 64 ROPs                  96 ROPs
Streaming Multiprocessors     -                           30 SMs
Compute Units                 2x 56 CUs                   -
Graphics Processing Clusters  -                           6 GPCs
Base Clock                    852MHz                      1303MHz
Boost Clock                   1500MHz                     1531MHz
Peak AI Performance           43.01 TFLOPS (FP16)         47.03 TOPS (INT8)
FP16                          43.01 TFLOPS                180 GFLOPS
FP32                          21.5 TFLOPS                 11.76 TFLOPS
FP64                          1.34 TFLOPS                 370 GFLOPS
INT8                          -                           47.03 TOPS
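
The peak throughput figures can be reproduced from the shader counts and boost clocks. The Python sketch below assumes two FP32 operations per shader per clock (one fused multiply-add) and rate ratios that are inferred from the listed numbers rather than stated on this page: FP16 at 2x and FP64 at 1/16 of FP32 on the Radeon Pro V340, INT8 at 4x and FP64 at 1/32 of FP32 on the Tesla P40. The function name peak_tflops is illustrative.

```python
def peak_tflops(shaders: int, clock_mhz: float, ops_per_clock: float = 2.0) -> float:
    """Peak throughput in tera-operations per second."""
    return shaders * ops_per_clock * clock_mhz * 1e6 / 1e12


# FP32: both V340 GPUs are counted together, matching the listed 21.5 TFLOPS.
v340_fp32 = peak_tflops(2 * 3584, 1500)
p40_fp32 = peak_tflops(3840, 1531)

# Assumed rate ratios relative to FP32 (inferred from the listed figures).
v340_fp16 = v340_fp32 * 2
v340_fp64 = v340_fp32 / 16
p40_int8 = p40_fp32 * 4
p40_fp64 = p40_fp32 / 32

print(f"V340 FP32 {v340_fp32:.2f}, FP16 {v340_fp16:.2f}, FP64 {v340_fp64:.2f} TFLOPS")
print(f"P40  FP32 {p40_fp32:.2f} TFLOPS, INT8 {p40_int8:.2f} TOPS, "
      f"FP64 {p40_fp64 * 1000:.0f} GFLOPS")
```

Under these assumptions the Tesla P40 FP64 result is about 367 GFLOPS, which the listing appears to round to 370 GFLOPS.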

Specification                 Radeon Pro V340             Tesla P40
Pixel Fillrate                96 GPixel/s                 146.976 GPixel/s
Texture Fillrate              336 GTexel/s                367.44 GTexel/s
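
Fillrates are the unit count multiplied by the boost clock. A brief Python sketch, with the caveat that the Radeon Pro V340 figures only match when a single GPU (64 ROPs, 224 TMUs) is used, which suggests they are per-GPU values; fillrate_g_per_s is an illustrative name:

```python
def fillrate_g_per_s(units: int, clock_mhz: float) -> float:
    """Return fillrate in giga-pixels or giga-texels per second."""
    return units * clock_mhz / 1000.0


# Radeon Pro V340, one GPU: 64 ROPs and 224 TMUs at 1500MHz.
v340_pixel = fillrate_g_per_s(64, 1500)
v340_texel = fillrate_g_per_s(224, 1500)

# Tesla P40: 96 ROPs and 240 TMUs at 1531MHz.
p40_pixel = fillrate_g_per_s(96, 1531)
p40_texel = fillrate_g_per_s(240, 1531)

print(f"V340: {v340_pixel:.0f} GPixel/s, {v340_texel:.0f} GTexel/s")
print(f"P40:  {p40_pixel:.3f} GPixel/s, {p40_texel:.2f} GTexel/s")
```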

Specification                 Radeon Pro V340             Tesla P40
L1                            16KB/CU                     48KB/SM
L2                            4MB Shared                  3MB Shared
Memory Size                   32GB (2x 16GB)              24GB
Memory Type                   HBM2                        GDDR5
Bus Width                     2048Bit                     384Bit
Memory Clock                  945MHz                      1800MHz
Transfer Rate                 1.9GT/s                     7.2GT/s
Bandwidth                     483.8GB/s                   345.6GB/s
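
Memory bandwidth follows from the bus width and the transfer rate. The Python sketch below assumes two transfers per clock for HBM2 and four for GDDR5, multipliers inferred from the listed clocks and transfer rates rather than stated on this page. The 2048Bit bus corresponds to one Vega 10 GPU, so the Radeon Pro V340 result reads as a per-GPU figure. The name bandwidth_gb_s is illustrative.

```python
def bandwidth_gb_s(bus_width_bits: int, transfer_rate_gt_s: float) -> float:
    """Return bandwidth in GB/s: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * transfer_rate_gt_s


# Transfer rate in GT/s = memory clock in MHz x transfers per clock / 1000.
v340_rate = 945 * 2 / 1000    # HBM2, assumed 2 transfers per clock -> 1.89 GT/s
p40_rate = 1800 * 4 / 1000    # GDDR5, assumed 4 transfers per clock -> 7.2 GT/s

v340_bw = bandwidth_gb_s(2048, v340_rate)
p40_bw = bandwidth_gb_s(384, p40_rate)

print(f"Radeon Pro V340: {v340_bw:.1f}GB/s")
print(f"Tesla P40:       {p40_bw:.1f}GB/s")
```

Using the unrounded 1.89GT/s reproduces the listed 483.8GB/s; the rounded 1.9GT/s shown in the table would give a slightly higher value.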

Specification                 Radeon Pro V340             Tesla P40
TDP                           300W                        250W
Display Outputs               1x mini-DisplayPort 1.4     -
Max Resolution                7680x4320                   Unknown
Max Resolution Refresh Rate   60Hz                        -
Variable Refresh Rate         FreeSync                    G-Sync, FreeSync
Display Stream Compression    Not Supported               Not Supported
Multi Monitor Support         3                           Unknown
Content Protection            HDCP 2.2                    -

Specification                 Radeon Pro V340             Tesla P40
Video Encoder Model           VCE 4.0                     2x NVENC 4
Video Encoder Codecs          AVC (H.264)                 AVC (H.264)
                              HEVC (H.265)                HEVC (H.265)
Video Decoder Model           UVD 7.0                     NVDEC 3
Video Decoder Codecs          MPEG-1                      MPEG-1
                              MPEG-2                      MPEG-2
                              MPEG-4                      MPEG-4
                              JPEG                        -
                              VC-1                        VC-1
                              -                           VP9
                              AVC (H.264)                 AVC (H.264)
                              HEVC (H.265)                HEVC (H.265)

Specification                 Radeon Pro V340             Tesla P40
DirectX                       12                          12
Direct3D                      12_1                        12_1
OpenGL                        4.6                         4.6
OpenCL                        2.1                         3.0
Vulkan                        1.3                         1.3
Shader Model                  6.7                         6.7
GFX                           9                           -
CUDA                          -                           6.1
PureVideo HD                  -                           VP8
VDPAU                         -                           Feature Set H

Specification                 Radeon Pro V340             Tesla P40
Power Connectors              2x 8-Pin                    1x 6-Pin, 1x 8-Pin
Slots Required                2.0                         2.0
PCIe Version                  3.0                         3.0
PCIe Lanes                    16                          16
Multi GPU Support             Supported                   Supported
Multi GPU Type                CrossFire XDMA              -
Height                        111 mm (4.37 in)            111 mm (4.37 in)
Width                         267 mm (10.51 in)           267 mm (10.51 in)
Depth                         37 mm (1.46 in)             40 mm (1.57 in)