20071017, 19:33  #1 
P90 years forever!
Aug 2002
Yeehaw, FL
2×5×769 Posts 
Bits in an FFT word
A recent bug report has revealed an error in my calculation for the expected number of bits in an FFT result word prior to the carry propagation step.
When working on Mersenne numbers, 2^p1, we put n=p/FFTlen bits in each FFT input word. Since we use balanced representation each result word is the sum of many (n1) times (n1) bit values. Some of these "cancelout" as the quantities are both positive and negative. Lots of data has shown that I can expect the result words to have at most (n1) + (n1) + 0.6 * log2 (FFTlen) bits. When I test k*2^p1 numbers, there an extra log2(k)/2 bits in each input word. This adds an extra log2(k) bits to each result word. Tests have shown this formula works quite well. The problem arises when I test numbers 2^pc. In the Mersenne case the weights for each input word are between 1 and 2 (or is it 0.5 and 1.0  I forget). In the 2^pc case the input words are weighted with multipliers between 1 and c. In the Mersenne case every input word has roughly the same number of bits. In the 2^nc case, the input words have anywhere between n and n+log2(c) bits. Any ideas on what my formula should be in this case? Ernst, help! 
20071018, 19:24  #2 
∂^{2}ω=0
Sep 2002
República de California
2^{2}×3×7×139 Posts 
Well, assuming that the inputwordmultipliers in the mod(2^{n}c) DWT are approximately randomly [or uniformly] distributed in the interval [1,c], this leads to a multiplier (1+c)/2 on the average input wordsize W. It seems like you should just be able to rework formulae 58 in the F_{24} paper by replacing W with W*(1+c)/2 [which reduces to W in the Mersennemod case, by way of a basic sanity check] to get the generalized error estimate you need. I have a simple C routine which I've been using [starting with the Mlucas 3.0 beta] that implements formula (8) in the above paper, and which has saved me much grief in terms of calculating FFTlength breakover points. Here it is  feel free to use and customize as desired. let me know if this, together with the above Wfiddling works for you:
Code:
/* For a given FFT length, estimate maximum exponent that can be tested. This implements formula (8) in the F24 paper (Math Comp. 72 (243), pp.15551572, December 2002) in order to estimate the maximum average wordsize for a given FFT length. For roughly IEEE64compliant arithmetic, an asymptotic constant of 0.6 (log2(C) in the the paper, which recommends something around unity) seems to fit the observed data best. */ uint32 given_N_get_maxP(uint32 N) { const double Bmant = 53; const double AsympConst = 0.6; const double ln2inv = 1.0/log(2.0); double ln_N, lnln_N, l2_N, lnl2_N, l2l2_N, lnlnln_N, l2lnln_N; double Wbits, maxExp2; ln_N = log(1.0*N); lnln_N = log(ln_N); l2_N = ln2inv*ln_N; lnl2_N = log(l2_N); l2l2_N = ln2inv*lnl2_N; lnlnln_N = log(lnln_N); l2lnln_N = ln2inv*lnlnln_N; Wbits = 0.5*( Bmant  AsympConst  0.5*(l2_N + l2l2_N)  1.5*(l2lnln_N) ); maxExp2 = Wbits*N; /* 3/10/05: Future versions will need to loosen this p < 2^32 restriction: */ ASSERT(HERE, maxExp2 <= 1.0*0xffffffff,"given_N_get_maxP: maxExp2 <= 1.0*0xffffffff"); return (uint32)maxExp2; } 
20071018, 20:00  #3 
∂^{2}ω=0
Sep 2002
República de California
2^{2}×3×7×139 Posts 
It seems the suitably modified inequality (8) leads to a generalized version of the above function of this form: I've also added a superbasic test utility, you should be able to copy this into a .c file and build:
Code:
#include <stdio.h> #include <math.h> typedef unsigned int uint32; /* For a given FFT length N [= number of floats in the input data array], estimate maximum exponent P that can be tested in a DWT modulo (2^n  C). */ uint32 given_N_and_C__get_maxP(uint32 N, uint32 C) { const double Bmant = 53; const double AsympConst = 0.6; const double ln2inv = 1.0/log(2.0); double ln_N, lnln_N, l2_N, lnl2_N, l2l2_N, lnlnln_N, l2lnln_N, l2cplus1; double Wbits, maxExp2; ln_N = log(1.0*N); lnln_N = log(ln_N); l2_N = ln2inv*ln_N; lnl2_N = log(l2_N); l2l2_N = ln2inv*lnl2_N; lnlnln_N = log(lnln_N); l2lnln_N = ln2inv*lnlnln_N; l2cplus1 = ln2inv*log(1.0 + C); Wbits = 1.0  l2cplus1 + 0.5*( Bmant  AsympConst  0.5*(l2_N + l2l2_N)  1.5*(l2lnln_N) ); maxExp2 = Wbits*N; return (uint32)maxExp2; } int main() { uint32 c, i, n; printf("Enter subtractive constant C in modulus (2^n  C) >\n"); scanf("%u", &c); printf("Max p for various powerof2 FFT lengths:\n"); for(i=10;i<=25;i++) { n = 1 << i; printf("N =2^%2d, Max p = %10u\n",i,given_N_and_C__get_maxP(n,c)); } return 0; } 
20071019, 13:25  #4  
"Lucan"
Dec 2006
England
2·3·13·83 Posts 
Quote:
Doesn't a simple application of the normal (Gaussian) distribution inform the argument a bit? David Last fiddled with by davieddy on 20071019 at 13:29 

20071019, 16:27  #5  
∂^{2}ω=0
Sep 2002
República de California
2^{2}×3×7×139 Posts 
Quote:
Sans DWT weighting  or at least with the simpler kinds of DWT weights we use to effect negacyclics as for Fermat numbers [weights are the set of Nth complex roots of unity] and Mersennes [weights are fractional powers of 2 roughly randomly in [1, 2] things seem to follow the predictions of the randomwalk heuristic very well. What George is saying that in his moregeneral case something seems to go awry with the above approach, which may require deeper analysis to understand. 

20071019, 22:20  #6  
P90 years forever!
Aug 2002
Yeehaw, FL
2×5×769 Posts 
Quote:
I'm not sure I agree that plugging the (1+c)/2 average weighting in your formula is OK. Just doing a little thought experiment: If I have a length 1024 FFT, and each FFT word has 10 bits after weighting in a Mersennemod scenario. Then each output word is the sum of 1024 20bit products. We add 0.6 * log2 (1024) to hold carry bits as the 1024 values are both positive and negative. Thus, we feel safe with 26 bits of precision. Now lets say c is such that the DWT weights now add 0,1,...,8 bits to each input word (so the input words are now 1018 bits). If we use the average bit length of inputs, you'd think we'd need 14*2+6 = 34. However, occasionally we are adding 18bit times 18bit products, so this is clearly insufficient! 

20071019, 23:28  #7 
∂^{2}ω=0
Sep 2002
República de California
11676_{10} Posts 
Hmm, yes, perhaps the dynamic range of the input multipliers is key here, as you say  for Fermatmod the DWT weights are all of complex norm 1, so as uniform as can be, and for Mersennemod the dynamic range is from 1 to 2, which is also very small, so perhaps neglecting it [or treated it by way of an average] doesn't hurt us  at least a whole lot of computation tells us that seems to be the case. You could always try C rather than (1+C)/2 as the multiplier and see which of the two models seems to give a morereasonable ROE size  the "right" formula should give a nearlyconstant max. ROE with increasing FFT length, for moduli around the maxp at each length.

Thread Tools  
Similar Threads  
Thread  Thread Starter  Forum  Replies  Last Post 
Longest Worm  mfgoode  Lounge  108  20200420 22:06 
Random bit in a word  Prime95  Programming  19  20121006 09:34 
My last name is a word. I had no idea.  jasong  jasong  3  20071005 18:02 
Word Problem  mfgoode  Lounge  11  20060401 16:16 
The only word  mfgoode  Puzzles  11  20040914 19:08 