#45 |
∂²ω=0
Sep 2002
República de California
5×2,347 Posts |
@Magellan3s: What about int32 performance vs float32? I'm guessing much of the TF code uses that.
No need for big spec-table dumps, just the rundown of int32 vs float32 for the various GPUs of interest. We know that float32 has very few bits of significance left over for FFT-mul data. Not having to throw away 2, 3 or 4 of those bits on roundoff error, by using an int32-based NTT instead, could very well be a win even if int32 runs, say, half as fast as float32.
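To make the roundoff point concrete, here is a minimal sketch of an exact cyclic convolution using a number-theoretic transform. The modulus `P = 998244353` with primitive root `3` and all function names are illustrative assumptions, not anything taken from existing GIMPS code. Residues fit in 32 bits, but each modular multiply needs a 64-bit intermediate product, which is one reason int32 throughput alone does not settle the question.

```python
# Sketch only: exact cyclic convolution via a radix-2 NTT over Z/PZ.
# Assumption: P = 119 * 2**23 + 1 = 998244353, primitive root G = 3.
P = 998244353
G = 3


def ntt(a, invert=False):
    """Recursive radix-2 NTT; len(a) must be a power of two dividing 2**23."""
    n = len(a)
    if n == 1:
        return a[:]
    # Primitive n-th root of unity mod P (its inverse for the inverse transform).
    w_n = pow(G, (P - 1) // n, P)
    if invert:
        w_n = pow(w_n, P - 2, P)
    even = ntt(a[0::2], invert)
    odd = ntt(a[1::2], invert)
    out = [0] * n
    w = 1
    for k in range(n // 2):
        t = w * odd[k] % P  # on a GPU this product needs 64-bit arithmetic
        out[k] = (even[k] + t) % P
        out[k + n // 2] = (even[k] - t) % P
        w = w * w_n % P
    return out


def cyclic_convolve(x, y):
    """Cyclic convolution of two equal-length word vectors, exact mod P."""
    n = len(x)
    prod = [u * v % P for u, v in zip(ntt(x), ntt(y))]
    n_inv = pow(n, P - 2, P)  # the inverse transform needs a 1/n scaling
    return [c * n_inv % P for c in ntt(prod, invert=True)]


def schoolbook(x, y):
    """Reference O(n^2) cyclic convolution over the plain integers."""
    n = len(x)
    out = [0] * n
    for i in range(n):
        for j in range(n):
            out[(i + j) % n] += x[i] * y[j]
    return out


# Eight 12-bit words: every output term is 8 * 4095**2, which is below P.
words = [4095] * 8
exact = cyclic_convolve(words, words)
```

The result matches the schoolbook convolution bit for bit, with no bits reserved for roundoff. The catch is that every convolution output must stay below the modulus, so a single roughly 30-bit prime only carries small words at short lengths. Longer transforms would need several primes combined by CRT or a wider modulus.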
#46 | |
Mar 2022
61 Posts |
Quote:

"GA10x includes FP32 processing on both datapaths, doubling the peak processing rate for FP32 operations. One datapath in each partition consists of 16 FP32 CUDA Cores capable of executing 16 FP32 operations per clock. Another datapath consists of both 16 FP32 CUDA Cores and 16 INT32 Cores, and is capable of executing either 16 FP32 operations OR 16 INT32 operations per clock. As a result of this new design, each GA10x SM partition is capable of executing either 32 FP32 operations per clock, or 16 FP32 and 16 INT32 operations per clock. All four SM partitions combined can execute 128 FP32 operations per clock, which is double the FP32 rate of the Turing SM, or 64 FP32 and 64 INT32 operations per clock."

FP32 compute performance for the 3080 is 30 TFLOPS, for the 3080 Ti 34 TFLOPS, and for the 3090 36 TFLOPS.

"The RTX 3000 cards are built on an architecture NVIDIA calls "Ampere," and its SM, in some ways, takes both the Pascal and the Turing approach. Ampere keeps the 64 FP32 cores as before, but the 64 other cores are now designated as "FP32 and INT32." So, half the Ampere cores are dedicated to floating-point, but the other half can perform either floating-point or integer math, just like in Pascal."
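The per-clock figures in the quote can be turned into peak rates with a little arithmetic. This is a rough sketch, not a benchmark. The SM counts and boost clocks below do not come from this thread; they are published spec-sheet values brought in as assumptions, and the helper names are hypothetical. It counts an FP32 fused multiply-add as 2 FLOPs, and an INT32 operation as 1 op, following the "operations per clock" wording in the quote.

```python
# Sketch only: peak-rate arithmetic from the per-SM figures quoted above.
# Assumption: SM counts and boost clocks (GHz) are spec-sheet values,
# not numbers given in this thread.
CARDS = {
    "RTX 3080": (68, 1.710),
    "RTX 3080 Ti": (80, 1.665),
    "RTX 3090": (82, 1.695),
}

FP32_LANES_PER_SM = 128  # all four partitions issuing FP32 only
INT32_LANES_PER_SM = 64  # the shared FP32/INT32 datapath issuing INT32


def peak_fp32_tflops(sm_count, boost_ghz):
    """Peak FP32 TFLOPS, counting one fused multiply-add as 2 FLOPs."""
    return sm_count * FP32_LANES_PER_SM * 2 * boost_ghz / 1000.0


def peak_int32_tops(sm_count, boost_ghz):
    """Peak INT32 tera-ops per second, counting one op per lane per clock."""
    return sm_count * INT32_LANES_PER_SM * boost_ghz / 1000.0


for name, (sms, ghz) in CARDS.items():
    fp32 = peak_fp32_tflops(sms, ghz)
    int32 = peak_int32_tops(sms, ghz)
    print(f"{name}: {fp32:.1f} FP32 TFLOPS, {int32:.1f} INT32 Tops/s")
```

Under these assumptions the FP32 results round to the 30, 34 and 36 TFLOPS quoted above. Per clock, an SM issues at most half as many INT32 operations (64) as FP32 operations (128), which is the "half as fast" scenario raised in post #45. Running the INT32 lanes also takes the shared datapath away from FP32 work.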