NVIDIA GRID RTX T10-4 vs Intel Gaudi

| Specification | NVIDIA GRID RTX T10-4 | Intel Gaudi |
| --- | --- | --- |
| Shaders | 4608 | 2048 |
| Memory | 4GB GDDR6 | 32GB HBM2 |
| Clock | 1065MHz | 610MHz |
| Peak AI Performance | 314.08 TOPS (INT8 Tensor) | 159.91 TFLOPS (BF16 Tensor) |
| FP32 | 9.82 TFLOPS | - |
| FP16 | 19.63 TFLOPS | - |
| Form Factor | PCIe Card, 2.0-Slots | OAM Module |
| TDP | 260W | 350W |
| Power Connectors | 1x 6-Pin, 1x 8-Pin | - |

Benchmarks: no results (N/A) are listed for either card in GB6 (OpenCL, Metal, Vulkan), GB5 (OpenCL, CUDA, Metal, Vulkan), or OCT (2020.1, Metal).

| Performance | NVIDIA GRID RTX T10-4 | Intel Gaudi |
| --- | --- | --- |
| FP16 | 19.63 TFLOPS; 78.52 TFLOPS Tensor (FP32 Accumulate) | - |
| FP32 | 9.82 TFLOPS; 78.52 TFLOPS Tensor | - |
| FP64 | 310 GFLOPS; 78.52 TFLOPS Tensor | - |
| BF16 | 157.04 TFLOPS Tensor | 159.91 TFLOPS Tensor |
| INT8 | 314.08 TOPS Tensor | - |
| Ray Tracing | 38.3 TOPS | - |
| Pixel Fillrate | 102.24 GPixel/s | - |
| Texture Fillrate | 306.72 GTexel/s | - |

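The NVIDIA FP32 and fillrate figures appear to follow directly from the unit counts and the 1065MHz clock listed in this comparison (4608 shaders, 96 ROPs, 288 TMUs). A minimal sketch, assuming the usual conventions of 2 FLOPs per shader per clock (one fused multiply-add), one pixel per ROP per clock, and one texel per TMU per clock; the function names are illustrative and not taken from the source:

```python
def fp32_tflops(shaders: int, clock_mhz: float) -> float:
    """Peak FP32 throughput in TFLOPS, assuming 2 FLOPs (one FMA) per shader per clock."""
    return shaders * 2 * clock_mhz * 1e6 / 1e12


def fillrate_giga(units: int, clock_mhz: float) -> float:
    """Peak fillrate in G(pixel|texel)/s, assuming one result per unit per clock."""
    return units * clock_mhz / 1000


# Unit counts and clock as listed for the NVIDIA GRID RTX T10-4
print(round(fp32_tflops(4608, 1065), 2))   # FP32 TFLOPS
print(round(fillrate_giga(96, 1065), 2))   # pixel fillrate, GPixel/s
print(round(fillrate_giga(288, 1065), 2))  # texture fillrate, GTexel/s
```

Under these assumptions the results match the listed 9.82 TFLOPS, 102.24 GPixel/s, and 306.72 GTexel/s. The same arithmetic cannot be applied to the Gaudi column, because no ROP or TMU counts are listed for it.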
| Chip | NVIDIA GRID RTX T10-4 | Intel Gaudi |
| --- | --- | --- |
| Manufacturer | NVIDIA | Intel |
| Chip Designer | NVIDIA | Intel |
| Architecture | Turing | Gaudi |
| Family | GRID | Gaudi |
| Codename | NV162, TU102 | Gaudi, HL-2000 |
| Variant | - | HL-2000 |
| Market Segment | Server | Server |
| Release Date | 1/1/2020 | 6/17/2019 |
| Foundry | TSMC | TSMC |
| Fabrication Node | 12FFN | 16FF |
| Die Size | 754 mm² | - |
| Transistor Count | 18.6 Billion | - |
| Transistor Density | 24.67M/mm² | - |
| Form | PCIe Card | OAM Module |

| Compute | NVIDIA GRID RTX T10-4 | Intel Gaudi |
| --- | --- | --- |
| Shading Units | 4608 Shaders | 2048 Shaders |
| Texture Mapping Units | 288 TMUs | - |
| Render Output Units | 96 ROPs | - |
| Tensor Cores | 576 T-Cores | 8 T-Cores |
| Ray-Tracing Cores | 72 RT-Cores | - |
| Streaming Multiprocessors | 72 SMs | - |
| Compute Units | - | 1 CU |
| GPU Clock | 1065MHz | 610MHz |
| L1 | 32KB/SM Tex, 64KB/SM | Unknown |
| L2 | 6MB Shared | Unknown |

| Memory | NVIDIA GRID RTX T10-4 | Intel Gaudi |
| --- | --- | --- |
| Size | 4GB | 32GB |
| Type | GDDR6 | HBM2, ECC |
| Bus Width | 384Bit | 4096Bit |
| Memory Clock | 1750MHz | 980MHz |
| Transfer Rate | 14GT/s | 2GT/s |
| Bandwidth | 672GB/s | 1003.5GB/s |

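The bandwidth figures can be checked against the listed bus widths and transfer rates. A minimal sketch, assuming peak bandwidth is bytes per transfer (bus width / 8) multiplied by the transfer rate, and assuming the Gaudi's "2GT/s" is a rounded form of 2 transfers per clock at the listed 980MHz (1.96GT/s); the function name is illustrative:

```python
def bandwidth_gb_s(bus_width_bits: int, transfer_rate_gt_s: float) -> float:
    """Peak memory bandwidth in GB/s: bytes moved per transfer times transfers per second."""
    return bus_width_bits / 8 * transfer_rate_gt_s


# NVIDIA GRID RTX T10-4: 384-bit GDDR6 at the listed 14 GT/s
gddr6 = bandwidth_gb_s(384, 14.0)

# Intel Gaudi: 4096-bit HBM2, assuming double data rate on the listed 980 MHz clock
hbm2 = bandwidth_gb_s(4096, 2 * 0.980)

print(f"GDDR6: {gddr6:.1f} GB/s")
print(f"HBM2:  {hbm2:.1f} GB/s")
```

Under these assumptions the GDDR6 result matches the listed 672GB/s, and the HBM2 result matches the listed 1003.5GB/s only when the unrounded 1.96GT/s is used; the rounded 2GT/s would give 1024GB/s instead.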
| On-chip Memory and Power | NVIDIA GRID RTX T10-4 | Intel Gaudi |
| --- | --- | --- |
| eSRAM | - | 24MB, 3200GB/s |
| TDP | 260W | 350W |

| Display | NVIDIA GRID RTX T10-4 | Intel Gaudi |
| --- | --- | --- |
| Display Outputs | No Ports | No Ports |
| Max Resolution | Unknown | Unknown |
| Max Resolution Refresh Rate | - | - |
| Variable Refresh Rate | G-Sync, FreeSync | - |
| Display Stream Compression (DSC) | Not Supported | Not Supported |
| Multi Monitor Support | Unknown | Unknown |

| Video | NVIDIA GRID RTX T10-4 | Intel Gaudi |
| --- | --- | --- |
| Encoder Model | NVENC 7 | No Encoders |
| Encoder Codecs | AVC (H.264), HEVC (H.265) | - |
| Decoder Model | NVDEC 4 | No Decoders |
| Decoder Codecs | MPEG-1, MPEG-2, MPEG-4, VC-1, VP8, VP9, AVC (H.264), HEVC (H.265) | - |

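| API Support | NVIDIA GRID RTX T10-4 | Intel Gaudi |
| --- | --- | --- |
| Direct X | 12 | - |
| Direct 3D | 12_2 | - |
| OpenGL | 4.6 | - |
| OpenCL | 3.0 | 3.0 |
| Vulkan | 1.2 | - |
| Shader Model | 6.6 | - |
| CUDA | 7.5 | - |
| PureVideo HD | VP10 | - |
| VDPAU | Feature Set J | - |
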
| Board | NVIDIA GRID RTX T10-4 | Intel Gaudi |
| --- | --- | --- |
| Board | - | Not a Card |
| Power Connectors | 1x 6-Pin, 1x 8-Pin | - |
| Slots Required | 2.0 | - |
| PCIe Version | 3.0 | 4.0 |
| PCIe Lanes | 16 | 16 |
| Multi GPU Support | - | Supported (Type: RoCE) |
| Height | 111 mm (4.37 in) | - |
| Width | 267 mm (10.51 in) | - |
| Depth | 40 mm (1.57 in) | - |