In April, NVIDIA launched a new product: the RTX 4000 SFF ADA, a small form factor GPU designed for workstation applications. It replaces the A2000 and is suited to demanding tasks, including scientific research, engineering calculations, and data visualization.
The RTX 4000 SFF ADA features 6,144 CUDA cores, 192 Tensor cores, 48 RT cores, and 20GB of GDDR6 ECC VRAM. One of the key benefits of the new GPU is its power efficiency: it consumes only 70W, which lowers both power costs and system heat. The GPU can also drive multiple displays through its four Mini-DisplayPort 1.4a outputs.
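If you want to confirm these specifications on a running system, a short PyTorch check is enough. This is a minimal sketch; it assumes a CUDA-enabled PyTorch installation and simply reports whatever GPU sits in slot 0:

```python
import torch

# Requires a CUDA build of PyTorch and a working NVIDIA driver.
assert torch.cuda.is_available(), "No CUDA-capable GPU detected"

props = torch.cuda.get_device_properties(0)
print(f"GPU:             {props.name}")
print(f"VRAM:            {props.total_memory / 1024**3:.1f} GB")
print(f"Multiprocessors: {props.multi_processor_count}")
print(f"Compute cap.:    {props.major}.{props.minor}")
```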
Compared with other devices in its class, the RTX 4000 SFF ADA delivers single-precision performance similar to that of the previous-generation RTX A4000, a card that consumes twice as much power (140W vs. 70W).
The RTX 4000 SFF ADA is built on the Ada Lovelace architecture using a 5nm process. It carries next-generation RT and Tensor cores that are faster and more efficient than those of the RTX A4000, which translates into a significant performance gain. In addition, the RTX 4000 SFF ADA comes in a compact package: the card is 168mm long and two expansion slots thick.
The improved ray tracing cores deliver efficient performance in workloads that rely on the technology, such as 3D design and rendering. Furthermore, the new GPU's 20GB of memory allows it to handle large scenes.
According to the manufacturer, the fourth-generation Tensor cores deliver high AI computational performance, a twofold increase over the previous generation. The new Tensor cores also support FP8 acceleration, which should appeal to those developing and deploying AI models in fields such as genomics and computer vision.
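To give a sense of how FP8 is used in practice, NVIDIA's Transformer Engine library wraps supported layers in an FP8 autocast context. The snippet below is an illustrative sketch, not code from this test: it assumes the transformer-engine package is installed and an Ada (or newer) GPU is present, and the layer and batch sizes are arbitrary.

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common.recipe import DelayedScaling, Format

# Arbitrary sizes, chosen only for illustration.
layer = te.Linear(1024, 1024, bias=True).cuda()
x = torch.randn(32, 1024, device="cuda", dtype=torch.bfloat16)

# Run the forward pass with FP8 math where the recipe allows it.
fp8_recipe = DelayedScaling(fp8_format=Format.HYBRID)
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)
print(y.shape)
```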
It is also worth noting that the additional encode and decode engines make the RTX 4000 SFF ADA a good fit for multimedia workloads such as video processing.
Technical specifications of the NVIDIA RTX A4000, RTX A5000, and RTX 3090 graphics cards
Description of the test environment
Test results
V-Ray 5 Benchmark
V-Ray GPU CUDA and RTX tests measure relative GPU rendering performance. In both tests, the RTX A4000 trails the RTX 4000 SFF ADA slightly: by 4% in the CUDA test and 11% in the RTX test.
Machine Learning
“Dogs vs. Cats”
To compare GPU performance on neural networks, we used the "Dogs vs. Cats" dataset: the test analyzes a photo and determines whether it shows a cat or a dog. All the necessary raw data can be found here. We ran this test on different GPUs and cloud services and got the following results:
In this test, the RTX 4000 SFF ADA outperformed the RTX A4000 by a modest 9%, but keep in mind the new GPU's small size and low power consumption.
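For readers who want to reproduce a comparable measurement, here is a minimal PyTorch sketch. It is not the exact setup used in this test: it assumes the Kaggle "Dogs vs. Cats" images have been unpacked into class subfolders under ./dogs-vs-cats/train, and the backbone, batch size, and single-epoch timing are illustrative choices.

```python
import time
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

# Illustrative settings; not the exact configuration used in this test.
DATA_DIR = "./dogs-vs-cats/train"   # ImageFolder layout: train/cat/*, train/dog/*
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
train_set = datasets.ImageFolder(DATA_DIR, transform=transform)
loader = DataLoader(train_set, batch_size=64, shuffle=True, num_workers=4)

# A small off-the-shelf backbone with a two-class head (cat vs. dog).
model = models.resnet18(weights=None, num_classes=2).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

model.train()
start = time.time()
for images, labels in loader:
    images, labels = images.to(device), labels.to(device)
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
if device.type == "cuda":
    torch.cuda.synchronize()
print(f"One training epoch took {time.time() - start:.1f} s")
```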
AI-Benchmark
AI-Benchmark measures a device's performance on AI inference tasks. The unit of measurement varies by test, but it is usually the number of operations per second (OPS) or frames per second (FPS).
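The benchmark is distributed as a Python package, so a run of this kind takes only a couple of lines. A minimal sketch, assuming the ai-benchmark package and a GPU-enabled TensorFlow backend are installed:

```python
# pip install ai-benchmark
from ai_benchmark import AIBenchmark

# Runs the full suite of tests, printing per-test timings
# and an overall AI Score at the end.
results = AIBenchmark().run()
```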
In this test, the RTX A4000 scored 6% higher than the RTX 4000 SFF ADA, with the caveat that results may vary depending on the specific task and operating conditions.
RTX A4000
Benchmarking ___ Model average time (ms)
Training double precision type mnasnet0_5 ___ 62.995805740356445
Training double precision type mnasnet0_75 ___ 98.39066505432129
Training double precision type mnasnet1_0 ___ 126.60405158996582
Training double precision type mnasnet1_3 ___ 186.89460277557373
Training double precision type resnet18 ___ 428.08079719543457
Training double precision type resnet34 ___ 883.5790348052979
Training double precision type resnet50 ___ 1016.3950300216675
Training double precision type resnet101 ___ 1927.2308254241943
Training double precision type resnet152 ___ 2815.663013458252
Training double precision type resnext50_32x4d ___ 1075.4373741149902
Training double precision type resnext101_32x8d ___ 4050.0641918182373
Training double precision type wide_resnet50_2 ___ 2615.9953451156616
Training double precision type wide_resnet101_2 ___ 5218.524832725525
Training double precision type densenet121 ___ 751.9759511947632
Training double precision type densenet169 ___ 910.3225564956665
Training double precision type densenet201 ___ 1163.036551475525
Training double precision type densenet161 ___ 2141.505298614502
Training double precision type squeezenet1_0 ___ 203.14435005187988
Training double precision type squeezenet1_1 ___ 98.04857730865479
Training double precision type vgg11 ___ 1697.710485458374
Training double precision type vgg11_bn ___ 1729.2972660064697
Training double precision type vgg13 ___ 2491.615080833435
Training double precision type vgg13_bn ___ 2545.1631927490234
Training double precision type vgg16 ___ 3371.1953449249268
Training double precision type vgg16_bn ___ 3423.8639068603516
Training double precision type vgg19_bn ___ 4314.5153522491455
Training double precision type vgg19 ___ 4249.422650337219
Training double precision type mobilenet_v3_large ___ 105.54619789123535
Training double precision type mobilenet_v3_small ___ 37.6680850982666
Training double precision type shufflenet_v2_x0_5 ___ 26.51611328125
Training double precision type shufflenet_v2_x1_0 ___ 61.260504722595215
Training double precision type shufflenet_v2_x1_5 ___ 105.30067920684814
Training double precision type shufflenet_v2_x2_0 ___ 181.03694438934326
Inference double precision type mnasnet0_5 ___ 17.397074699401855
Inference double precision type mnasnet0_75 ___ 28.902697563171387
Inference double precision type mnasnet1_0 ___ 38.387718200683594
Inference double precision type mnasnet1_3 ___ 58.228821754455566
Inference double precision type resnet18 ___ 147.95727252960205
Inference double precision type resnet34 ___ 293.519492149353
Inference double precision type resnet50 ___ 336.44991874694824
Inference double precision type resnet101 ___ 637.9982376098633
Inference double precision type resnet152 ___ 948.9351654052734
Inference double precision type resnext50_32x4d ___ 372.80876636505127
Inference double precision type resnext101_32x8d ___ 1385.1624917984009
Inference double precision type wide_resnet50_2 ___ 873.048791885376
Inference double precision type wide_resnet101_2 ___ 1729.2765426635742
Inference double precision type densenet121 ___ 270.13323307037354
Inference double precision type densenet169 ___ 327.1932888031006
Inference double precision type densenet201 ___ 414.733362197876
Inference double precision type densenet161 ___ 766.3542318344116
Inference double precision type squeezenet1_0 ___ 74.86292839050293
Inference double precision type squeezenet1_1 ___ 34.04905319213867
Inference double precision type vgg11 ___ 576.3767147064209
Inference double precision type vgg11_bn ___ 580.5839586257935
Inference double precision type vgg13 ___ 853.4365510940552
Inference double precision type vgg13_bn ___ 860.3136301040649
Inference double precision type vgg16 ___ 1145.091052055359
Inference double precision type vgg16_bn ___ 1152.8028392791748
Inference double precision type vgg19_bn ___ 1444.9562692642212
Inference double precision type vgg19 ___ 1437.0987701416016
Inference double precision type mobilenet_v3_large ___ 30.876317024230957
Inference double precision type mobilenet_v3_small ___ 11.234536170959473
Inference double precision type shufflenet_v2_x0_5 ___ 7.425284385681152
Inference double precision type shufflenet_v2_x1_0 ___ 18.25782299041748
Inference double precision type shufflenet_v2_x1_5 ___ 33.34946632385254
Inference double precision type shufflenet_v2_x2_0 ___ 57.84676551818848
RTX 4000 SFF ADA
Benchmarking ___ Model average time (ms)
Training half precision type mnasnet0_5 ___ 20.266618728637695
Training half precision type mnasnet0_75 ___ 21.445374488830566
Training half precision type mnasnet1_0 ___ 26.714019775390625
Training half precision type mnasnet1_3 ___ 26.5126371383667
Training half precision type resnet18 ___ 19.624991416931152
Training half precision type resnet34 ___ 32.46446132659912
Training half precision type resnet50 ___ 57.17473030090332
Training half precision type resnet101 ___ 98.20127010345459
Training half precision type resnet152 ___ 138.18389415740967
Training half precision type resnext50_32x4d ___ 75.56005001068115
Training half precision type resnext101_32x8d ___ 228.8706636428833
Training half precision type wide_resnet50_2 ___ 113.76442432403564
Training half precision type wide_resnet101_2 ___ 204.17311191558838
Training half precision type densenet121 ___ 68.97401332855225
Training half precision type densenet169 ___ 85.16453742980957
Training half precision type densenet201 ___ 103.299241065979
Training half precision type densenet161 ___ 137.54578113555908
Training half precision type squeezenet1_0 ___ 16.71830177307129
Training half precision type squeezenet1_1 ___ 12.906527519226074
Training half precision type vgg11 ___ 51.7004919052124
Training half precision type vgg11_bn ___ 57.63327598571777
Training half precision type vgg13 ___ 86.10869407653809
Training half precision type vgg13_bn ___ 95.86676120758057
Training half precision type vgg16 ___ 102.91589260101318
Training half precision type vgg16_bn ___ 113.74778270721436
Training half precision type vgg19_bn ___ 131.56734943389893
Training half precision type vgg19 ___ 119.70191955566406
Training half precision type mobilenet_v3_large ___ 31.30636692047119
Training half precision type mobilenet_v3_small ___ 19.44464683532715
Training half precision type shufflenet_v2_x0_5 ___ 13.710575103759766
Training half precision type shufflenet_v2_x1_0 ___ 23.608479499816895
Training half precision type shufflenet_v2_x1_5 ___ 26.793746948242188
Training half precision type shufflenet_v2_x2_0 ___ 24.550962448120117
Inference half precision type mnasnet0_5 ___ 4.418272972106934
Inference half precision type mnasnet0_75 ___ 4.021778106689453
Inference half precision type mnasnet1_0 ___ 4.42598819732666
Inference half precision type mnasnet1_3 ___ 4.618926048278809
Inference half precision type resnet18 ___ 5.803341865539551
Inference half precision type resnet34 ___ 9.756693840026855
Inference half precision type resnet50 ___ 15.873079299926758
Inference half precision type resnet101 ___ 28.268003463745117
Inference half precision type resnet152 ___ 40.04594326019287
Inference half precision type resnext50_32x4d ___ 19.53421115875244
Inference half precision type resnext101_32x8d ___ 62.44826316833496
Inference half precision type wide_resnet50_2 ___ 33.533992767333984
Inference half precision type wide_resnet101_2 ___ 59.60897445678711
Inference half precision type densenet121 ___ 18.052735328674316
Inference half precision type densenet169 ___ 21.956982612609863
Inference half precision type densenet201 ___ 27.85182476043701
Inference half precision type densenet161 ___ 37.41891860961914
Inference half precision type squeezenet1_0 ___ 4.391803741455078
Inference half precision type squeezenet1_1 ___ 2.4281740188598633
Inference half precision type vgg11 ___ 17.11493968963623
Inference half precision type vgg11_bn ___ 18.40585231781006
Inference half precision type vgg13 ___ 28.438148498535156
Inference half precision type vgg13_bn ___ 30.672597885131836
Inference half precision type vgg16 ___ 34.43562984466553
Inference half precision type vgg16_bn ___ 36.92122936248779
Inference half precision type vgg19_bn ___ 43.144264221191406
Inference half precision type vgg19 ___ 40.5385684967041
Inference half precision type mobilenet_v3_large ___ 5.350713729858398
Inference half precision type mobilenet_v3_small ___ 4.016985893249512
Inference half precision type shufflenet_v2_x0_5 ___ 5.079126358032227
Inference half precision type shufflenet_v2_x1_0 ___ 5.593156814575195
Inference half precision type shufflenet_v2_x1_5 ___ 5.649552345275879
Inference half precision type shufflenet_v2_x2_0 ___ 5.355663299560547
Training double precision type mnasnet0_5 ___ 50.2386999130249
Training double precision type mnasnet0_75 ___ 80.66896915435791
Training double precision type mnasnet1_0 ___ 103.32422733306885
Training double precision type mnasnet1_3 ___ 154.6230697631836
Training double precision type resnet18 ___ 337.94031620025635
Training double precision type resnet34 ___ 677.7706575393677
Training double precision type resnet50 ___ 789.9243211746216
Training double precision type resnet101 ___ 1484.3351316452026
Training double precision type resnet152 ___ 2170.570478439331
Training double precision type resnext50_32x4d ___ 877.3719882965088
Training double precision type resnext101_32x8d ___ 3652.4944639205933
Training double precision type wide_resnet50_2 ___ 2154.612874984741
Training double precision type wide_resnet101_2 ___ 4176.522083282471
Training double precision type densenet121 ___ 607.8699731826782
Training double precision type densenet169 ___ 744.6409797668457
Training double precision type densenet201 ___ 962.677731513977
Training double precision type densenet161 ___ 1759.772515296936
Training double precision type squeezenet1_0 ___ 164.3690824508667
Training double precision type squeezenet1_1 ___ 78.70647430419922
Training double precision type vgg11 ___ 1362.6095294952393
Training double precision type vgg11_bn ___ 1387.2539138793945
Training double precision type vgg13 ___ 2006.0230445861816
Training double precision type vgg13_bn ___ 2047.526364326477
Training double precision type vgg16 ___ 2702.2086429595947
Training double precision type vgg16_bn ___ 2747.241234779358
Training double precision type vgg19_bn ___ 3447.1724700927734
Training double precision type vgg19 ___ 3397.990345954895
Training double precision type mobilenet_v3_large ___ 84.65698719024658
Training double precision type mobilenet_v3_small ___ 29.816465377807617
Training double precision type shufflenet_v2_x0_5 ___ 27.401342391967773
Training double precision type shufflenet_v2_x1_0 ___ 48.322744369506836
Training double precision type shufflenet_v2_x1_5 ___ 82.22103118896484
Training double precision type shufflenet_v2_x2_0 ___ 141.7021369934082
Inference double precision type mnasnet0_5 ___ 12.988653182983398
Inference double precision type mnasnet0_75 ___ 22.422199249267578
Inference double precision type mnasnet1_0 ___ 30.056486129760742
Inference double precision type mnasnet1_3 ___ 46.953935623168945
Inference double precision type resnet18 ___ 118.04479122161865
Inference double precision type resnet34 ___ 231.52336597442627
Inference double precision type resnet50 ___ 268.63497734069824
Inference double precision type resnet101 ___ 495.2010440826416
Inference double precision type resnet152 ___ 726.4922094345093
Inference double precision type resnext50_32x4d ___ 291.47679328918457
Inference double precision type resnext101_32x8d ___ 1055.10901927948
Inference double precision type wide_resnet50_2 ___ 690.6917667388916
Inference double precision type wide_resnet101_2 ___ 1347.5529861450195
Inference double precision type densenet121 ___ 224.35829639434814
Inference double precision type densenet169 ___ 268.9145278930664
Inference double precision type densenet201 ___ 343.1972026824951
Inference double precision type densenet161 ___ 635.866231918335
Inference double precision type squeezenet1_0 ___ 61.92759037017822
Inference double precision type squeezenet1_1 ___ 27.009410858154297
Inference double precision type vgg11 ___ 462.3375129699707
Inference double precision type vgg11_bn ___ 468.4495782852173
Inference double precision type vgg13 ___ 692.8219032287598
Inference double precision type vgg13_bn ___ 703.3538103103638
Inference double precision type vgg16 ___ 924.4353818893433
Inference double precision type vgg16_bn ___ 936.5075063705444
Inference double precision type vgg19_bn ___ 1169.098300933838
Inference double precision type vgg19 ___ 1156.3771772384644
Inference double precision type mobilenet_v3_large ___ 24.2356014251709
Inference double precision type mobilenet_v3_small ___ 8.85490894317627
Inference double precision type shufflenet_v2_x0_5 ___ 6.360034942626953
Inference double precision type shufflenet_v2_x1_0 ___ 14.301743507385254
Inference double precision type shufflenet_v2_x1_5 ___ 24.863481521606445
Inference double precision type shufflenet_v2_x2_0 ___ 43.8505744934082
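The per-model numbers above come from timing training and inference steps of standard torchvision models at different precisions. The sketch below approximates that methodology; the model list is trimmed, and the batch size, warm-up, and iteration counts are assumptions rather than the exact values used for the tables.

```python
import time
import torch
import torchvision.models as models

device = torch.device("cuda")
BATCH, WARMUP, ITERS = 16, 5, 20   # assumed values, not the exact ones used

def avg_time_ms(model, dtype, train):
    """Average per-step time (ms) for one model at the given precision."""
    model = model.to(device=device, dtype=dtype)
    x = torch.randn(BATCH, 3, 224, 224, device=device, dtype=dtype)
    target = torch.randint(0, 1000, (BATCH,), device=device)
    criterion = torch.nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    if train:
        model.train()
    else:
        model.eval()

    times = []
    for i in range(WARMUP + ITERS):
        torch.cuda.synchronize()
        start = time.time()
        if train:
            optimizer.zero_grad()
            loss = criterion(model(x), target)
            loss.backward()
            optimizer.step()
        else:
            with torch.no_grad():
                model(x)
        torch.cuda.synchronize()
        if i >= WARMUP:
            times.append((time.time() - start) * 1000)
    return sum(times) / len(times)

for name in ["resnet18", "resnet50", "vgg16"]:   # trimmed model list
    net = getattr(models, name)(weights=None)
    print(f"Training half precision type {name} ___ "
          f"{avg_time_ms(net, torch.half, train=True):.3f}")
```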
Conclusion
The new graphics card has proven to be an effective solution for a range of workloads. Thanks to its compact size, it is ideal for powerful SFF (Small Form Factor) computers. Its 6,144 CUDA cores and 20GB of memory on a 160-bit bus make it one of the most capable cards in its class, while the low 70W TDP helps reduce power costs. Four Mini-DisplayPort outputs allow the card to drive multiple monitors or serve as a multi-channel graphics solution.
The RTX 4000 SFF ADA represents a significant advance over previous generations, delivering performance on par with a card that draws twice the power. And because it needs no auxiliary PCIe power connector, it is easy to integrate into low-power workstations without sacrificing performance.