NVIDIA Tesla P100 vs NVIDIA Tesla P40

| | NVIDIA Tesla P100 | NVIDIA Tesla P40 |
|---|---|---|
| Price | $5,699 | $5,699 |
| Shaders | 3584 | 3840 |
| Memory | 16GB HBM2 | 24GB GDDR5 |
| Clock | 1303MHz | 1531MHz |
| Peak AI Performance | 18.68 TFLOPS (FP16) | 47.03 TOPS (INT8) |
| FP32 | 9.34 TFLOPS | 11.76 TFLOPS |
| FP16 | 18.68 TFLOPS | 180 GFLOPS |
| Form Factor | PCIe Card, 2.0-Slots | PCIe Card, 2.0-Slots |
| TDP | 250W | 250W |
| Power Connectors | 1x 8-Pin | 1x 6-Pin, 1x 8-Pin |

Peak AI Performance

  • Tesla P100 - 18.68 TFLOPS FP16 (x1): 60% slower vs Tesla P40
  • Tesla P40 - 47.03 TOPS INT8 (x2.52): 2.52x faster vs Tesla P100

FP32

  • Tesla P100 - 9.34 TFLOPS FP32 (x1): 21% slower vs Tesla P40
  • Tesla P40 - 11.76 TFLOPS FP32 (x1.26): 26% faster vs Tesla P100

FP16

  • Tesla P100 - 18.68 TFLOPS FP16 (x103.78): 103.78x faster vs Tesla P40
  • Tesla P40 - 180 GFLOPS FP16 (x1): 99% slower vs Tesla P100

Memory Bandwidth

  • Tesla P100 - 732.2GB/s (x2.12): 2.12x faster vs Tesla P40
  • Tesla P40 - 345.6GB/s (x1): 53% slower vs Tesla P100

TDP

  • Tesla P100 - 250W (x1): Same TDP vs Tesla P40
  • Tesla P40 - 250W (x1): Same TDP vs Tesla P100
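
The "x faster" and "% slower" figures in the comparisons above appear to follow from simple ratios of the two cards' values. The sketch below reproduces them under that assumption; the helper names `speedup` and `percent_slower` are illustrative and do not come from the source page.

```python
def speedup(fast: float, slow: float) -> float:
    """How many times larger `fast` is than `slow` (e.g. 1.26 means 26% faster)."""
    return fast / slow


def percent_slower(slow: float, fast: float) -> float:
    """Shortfall of `slow` relative to `fast`, as a percentage."""
    return (1 - slow / fast) * 100


# Values copied from the comparison above (FP32 in TFLOPS, bandwidth in GB/s).
fp32 = {"Tesla P100": 9.34, "Tesla P40": 11.76}
bandwidth = {"Tesla P100": 732.2, "Tesla P40": 345.6}

print(round(speedup(fp32["Tesla P40"], fp32["Tesla P100"]), 2))        # P40 vs P100, FP32
print(round(percent_slower(fp32["Tesla P100"], fp32["Tesla P40"])))    # P100 vs P40, FP32
print(round(speedup(bandwidth["Tesla P100"], bandwidth["Tesla P40"]), 2))
print(round(percent_slower(bandwidth["Tesla P40"], bandwidth["Tesla P100"])))
```

Note that the two figures in each pair are not symmetric: a card that is 26% faster makes the other one 21% slower, because each percentage is taken relative to a different baseline.
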

| Specification | Tesla P100 | Tesla P40 |
|---|---|---|
| Manufacturer | NVIDIA | NVIDIA |
| Chip Designer | NVIDIA | NVIDIA |
| Architecture | Pascal | Pascal |
| Family | Tesla P | Tesla P |
| Codename | NV130 (GP100) | NV132 (GP102) |
| Variant | GP100-893-A1 | GP102-890-A1 |
| Market Segment | Server | Server |
| Release Date | 6/20/2016 | 9/13/2016 |
| Foundry | TSMC | TSMC |
| Fabrication Node | 16FF | 16FF |
| Die Size | 610 mm² | 471 mm² |
| Transistor Count | 15.3 Billion | 11.8 Billion |
| Transistor Density | 25.08M/mm² | 25.05M/mm² |
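
The transistor density figures are consistent with dividing the transistor count by the die size. A minimal sketch of that arithmetic, assuming this is how the page derives them (the function name `transistor_density` is illustrative):

```python
def transistor_density(transistors: float, die_area_mm2: float) -> float:
    """Transistor density in millions of transistors per mm²."""
    return transistors / die_area_mm2 / 1e6


# Transistor counts and die sizes as listed for each card.
p100 = transistor_density(15.3e9, 610)
p40 = transistor_density(11.8e9, 471)

print(f"Tesla P100: {p100:.2f}M/mm²")
print(f"Tesla P40:  {p40:.2f}M/mm²")
```

Both chips are built on the same 16FF node, which fits with the two densities being nearly identical.
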

| Specification | Tesla P100 | Tesla P40 |
|---|---|---|
| Form | PCIe Card | PCIe Card |
| Shading Units | 3584 Shaders | 3840 Shaders |
| Texture Mapping Units | 224 TMUs | 240 TMUs |
| Render Output Units | 128 ROPs | 96 ROPs |
| Streaming Multiprocessors | 56 SMs | 30 SMs |
| Graphics Processing Clusters | - | 6 GPCs |
| Base Clock | 1126MHz | 1303MHz |
| Boost Clock | 1303MHz | 1531MHz |
| Peak AI Performance | 18.68 TFLOPS (FP16) | 47.03 TOPS (INT8) |
| FP16 | 18.68 TFLOPS | 180 GFLOPS |
| FP32 | 9.34 TFLOPS | 11.76 TFLOPS |
| FP64 | 4.67 TFLOPS | 370 GFLOPS |
| INT8 | - | 47.03 TOPS |
| Pixel Fillrate | 166.784 GPixel/s | 146.976 GPixel/s |
| Texture Fillrate | 291.872 GTexel/s | 367.44 GTexel/s |
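
The FP32 and fillrate figures match the usual theoretical-peak formulas: two floating-point operations (one fused multiply-add) per shader per clock, one pixel per ROP per clock, and one texel per TMU per clock, all evaluated at the higher of the two listed clocks. The sketch below assumes those formulas; the function names are illustrative, and real workloads will not reach these peaks.

```python
def peak_fp32_tflops(shaders: int, clock_mhz: float) -> float:
    """Theoretical FP32 peak: 2 FLOPs (one FMA) per shader per clock."""
    return shaders * 2 * clock_mhz / 1e6


def pixel_fillrate_gpixels(rops: int, clock_mhz: float) -> float:
    """Theoretical pixel fillrate: one pixel per ROP per clock."""
    return rops * clock_mhz / 1e3


def texture_fillrate_gtexels(tmus: int, clock_mhz: float) -> float:
    """Theoretical texture fillrate: one texel per TMU per clock."""
    return tmus * clock_mhz / 1e3


# Unit counts and clocks as listed for each card.
cards = {
    "Tesla P100": {"shaders": 3584, "tmus": 224, "rops": 128, "clock_mhz": 1303},
    "Tesla P40": {"shaders": 3840, "tmus": 240, "rops": 96, "clock_mhz": 1531},
}

for name, c in cards.items():
    print(
        name,
        f"{peak_fp32_tflops(c['shaders'], c['clock_mhz']):.2f} TFLOPS",
        f"{pixel_fillrate_gpixels(c['rops'], c['clock_mhz']):.3f} GPixel/s",
        f"{texture_fillrate_gtexels(c['tmus'], c['clock_mhz']):.3f} GTexel/s",
    )
```
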
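
| Specification | Tesla P100 | Tesla P40 |
|---|---|---|
| L1 Cache | 64KB/SM | 48KB/SM |
| L2 Cache | 4MB Shared | 3MB Shared |
| Memory Size | 16GB | 24GB |
| Memory Type | HBM2 | GDDR5 |
| Bus Width | 4096Bit | 384Bit |
| Memory Clock | 715MHz | 1800MHz |
| Transfer Rate | 1.4GT/s | 7.2GT/s |
| Bandwidth | 732.2GB/s | 345.6GB/s |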
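
The bandwidth figures can be reproduced from the bus width and memory clock. The sketch below assumes two transfers per clock for HBM2 and four for GDDR5, which is what the listed clocks and transfer rates imply; the function name `memory_bandwidth_gbs` is illustrative. The P100's listed 1.4GT/s looks like a rounding of 1.43GT/s (715MHz x 2), since only the unrounded value yields 732.2GB/s.

```python
def memory_bandwidth_gbs(bus_width_bits: int, clock_mhz: float, transfers_per_clock: int) -> float:
    """Theoretical memory bandwidth in GB/s.

    bytes per transfer = bus width / 8
    transfers per second = clock * transfers per clock
    """
    bytes_per_transfer = bus_width_bits / 8
    mega_transfers_per_s = clock_mhz * transfers_per_clock
    return bytes_per_transfer * mega_transfers_per_s / 1e3


# Assumed transfers per clock: 2 for HBM2 (double data rate), 4 for GDDR5.
p100 = memory_bandwidth_gbs(4096, 715, 2)
p40 = memory_bandwidth_gbs(384, 1800, 4)

print(f"Tesla P100: {p100:.1f} GB/s")
print(f"Tesla P40:  {p40:.1f} GB/s")
```

The P100's much wider bus more than offsets its lower memory clock, which is where its 2.12x bandwidth advantage comes from.
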

| Specification | Tesla P100 | Tesla P40 |
|---|---|---|
| TDP | 250W | 250W |
| Max Resolution | Unknown | Unknown |
| Variable Refresh Rate | G-Sync, FreeSync | G-Sync, FreeSync |
| Display Stream Compression (DSC) | Not Supported | Not Supported |
| Multi Monitor Support | Unknown | Unknown |
| Video Encoder Model | 3x NVENC 4 | 2x NVENC 4 |
| Video Encoder Codecs | AVC (H.264), HEVC (H.265) | AVC (H.264), HEVC (H.265) |
| Video Decoder Model | NVDEC 3 | NVDEC 3 |
| Video Decoder Codecs | MPEG-1, MPEG-2, MPEG-4, VC-1, VP9, AVC (H.264), HEVC (H.265) | MPEG-1, MPEG-2, MPEG-4, VC-1, VP9, AVC (H.264), HEVC (H.265) |
| DirectX | 12 | 12 |
| Direct3D | 12_1 | 12_1 |
| OpenGL | 4.6 | 4.6 |
| OpenCL | 3.0 | 3.0 |
| Vulkan | 1.3 | 1.3 |
| Shader Model | 6.0 | 6.7 |
| CUDA | 6.0 | 6.1 |
| PureVideo HD | VP8 | VP8 |
| VDPAU | Feature Set H | Feature Set H |
| Power Connectors | 1x 8-Pin | 1x 6-Pin, 1x 8-Pin |
| Slots Required | 2.0 | 2.0 |
| PCIe Version | 3.0 | 3.0 |
| PCIe Lanes | 16 | 16 |
| Multi GPU Support | Supported | Supported |
| Height | 111 mm (4.37 in) | 111 mm (4.37 in) |
| Width | 267 mm (10.51 in) | 267 mm (10.51 in) |
| Depth | 40 mm (1.57 in) | 40 mm (1.57 in) |