mersenneforum.org  

2011-08-28, 17:11   #45
Mini-Geek ("Tim Sorbera", Aug 2006, San Antonio, TX USA)

Quote:
Originally Posted by ciic View Post
My problem is more with the first-mentioned book by Crandall et al.: it uses mathematical notation which I do not understand, e.g. an "=" with three horizontal lines, sum signs, pi signs and brackets...

Could someone please give me a hint as to where I could probably find information on this? Some "search strings" for an extended Google/Amazon/Wikipedia search would also be very helpful to me.

Cheers, ciic.
http://en.wikipedia.org/wiki/List_of...atical_symbols is a useful reference. And if, for example, you see a pi sign used in a way you don't understand, you can look up the other uses of pi on Wikipedia.

2011-09-09, 14:17   #46
ciic (May 2011, FR Germany - Berlin)

Thank you!
I found the keyword I was missing: "Set Theory".

I will go with Goldrei, Derek (1996), Classic Set Theory, London: Chapman and Hall, p. 3, ISBN 0-412-60610-0 ...

Cheers, ciic.

---

P.S.:
"Them ain't Bagels, they doughnuts
": See http://en.wikipedia.org/wiki/Bagels , please.
"PS Can't fool me: there ain't no sanity clause." : 't me either!

2011-09-09, 16:00   #47
R.D. Silverman (Nov 2003)

Quote:
Originally Posted by DJones View Post
Marvellous, thank you. Now, any chance of being pointed at a suitable site/book (preferably the former) which explains why powers of e^(i * 2 * pi / n) work as a Fourier transform matrix? Yes, I appreciate I could Google this and have done so, but mathematical functions and definitions tend to be very flexible on the web, which isn't surprising given that any idiot can put a web page up.

I also appreciate I'm coming at this from the wrong end - I stumbled across the Mersenne Primes when looking for a DC project to get involved in, and have tried to backtrack through the proofs and methods so that I understand what's going on. Unfortunately backtracking through proofs, as opposed to building on proofs you already have, is proving to be rather convoluted.
Read Nussbaumer's book. It should be required reading.
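For the specific question asked above, the key fact is the orthogonality of the n-th roots of unity. Writing \omega = e^{2\pi i/n} and letting F be the n-by-n matrix with entries F_{jk} = \omega^{jk} (notation introduced here just for this sketch), the geometric-series identity

\sum_{k=0}^{n-1} \omega^{k(j-m)} =
\begin{cases}
n, & j \equiv m \pmod{n},\\
0, & \text{otherwise},
\end{cases}
\qquad\text{gives}\qquad
\left(F\,\overline{F}\right)_{jm} = \sum_{k=0}^{n-1} \omega^{jk}\,\overline{\omega^{km}} = n\,\delta_{jm},
\quad\text{so}\quad F^{-1} = \tfrac{1}{n}\,\overline{F}.

In other words, the matrix of powers of e^(-i * 2 * pi / n), scaled by 1/n, undoes the forward transform, which is why the same butterfly structure serves for both the FFT and its inverse.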

2011-11-06, 20:46   #48
Maximus (Nov 2011)

I found an algorithm for polynomial multiplication. The algorithm is related to the FFT and has a simple explanation. Is anybody interested?

2011-11-07, 02:20   #49
LaurV (Jun 2011, Thailand)

Quote:
Originally Posted by Maximus View Post
I found an algorithm for polynomial multiplication. The algorithm is related to the FFT and has a simple explanation. Is anybody interested?
I am confused here. Did you find an algorithm to multiply polynomials (in whatever time complexity), or did you find an algorithm to multiply integers in polynomial time?

2011-11-07, 02:43   #50
Christenson (Dec 2010, Monticello)

Let's have it, with a small example. Be mathematically precise about what you mean... and post it in the homework thread, since your post (and my reply) is off-topic, and it will draw the fewest flames there.

2012-01-29, 18:36   #51
henryzz ("David", Sep 2007, Cambridge (GMT/BST))

I have been fiddling around trying to make small FFTs optimal. Here is my pseudocode for 4x4 multiplication using an FFT.

Input: (a,b,c,d,0,0,0,0)*(e,f,g,h,0,0,0,0)

FFT:
Code:
brt=b*1/sqrt(2)
drt=d*1/sqrt(2)
mtmp1=a+c
mtmp2=b+d
mtmp3=brt-drt
mtmp4=brt+drt
m1=mtmp1+mtmp2
m2=a+mtmp3
m3=c+mtmp4
m4=a-c
m5=b-d
m6=a-mtmp3
m7=c-mtmp4
m8=mtmp1-mtmp2

frt=f*1/sqrt(2)
hrt=h*1/sqrt(2)
ntmp1=e+g
ntmp2=f+h
ntmp3=frt-hrt
ntmp4=frt+hrt
n1=ntmp1+ntmp2
n2=e+ntmp3
n3=g+ntmp4
n4=e-g
n5=f-h
n6=e-ntmp3
n7=g-ntmp4
n8=ntmp1-ntmp2
These are the two forward FFTs. Each of the two FFTs costs 6 adds, 6 subs and 2 muls; 1/sqrt(2) must also have been precalculated.

The pointwise multiplication of the transformed values can be done by:
Code:
r1=m1*n1
r2=m2*n2-m3*n3
r3=m4*n4-m5*n5
r4=m6*n6-m7*n7
r5=m8*n8
r6=m2*n3+m3*n2
r7=m4*n5+m5*n4
r8=m6*n7+m7*n6
This is 3 adds, 3 subs and 14 muls.

Inverse FFT:
Code:
tmp1=(r1+r5)/8
tmp2=r3/4
tmp3=tmp1+tmp2
tmp4=tmp1-tmp2
tmp5=(r2+r4)/4
tmp6=(r6+r8)/4
tmp7=(r1-r5)/8
tmp8=r7/4
tmp9=tmp7+tmp8
tmp10=tmp7-tmp8
tmp11=(r2-r4)*sqrt(2)/8
tmp12=(r6-r8)*sqrt(2)/8
output1: tmp3 + tmp5
output2: tmp9 + tmp11 + tmp12
output3: tmp4 + tmp6
output4: tmp10 - tmp11 + tmp12
output5: tmp3 - tmp5
output6: tmp9 - tmp11 - tmp12
output7: tmp4 - tmp6
output8: tmp10 + tmp11 - tmp12
This is 11 adds, 11 subs, 2 muls and 6 shifts. sqrt(2)/8 must also have been precalculated.

Total for 4x4 multiplication by FFT is 26 adds, 26 subs, 20 muls and 6 shifts. This is 78 operations, which is slightly less than the 10*N*log2(N) estimate of 80 (taking N = 4, the input length).

Can anyone see any mistakes or any improvements? I am going to move on to 8x8 multiplication next.
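As a sanity check, here is a small Python transcription of the three stages above, compared against a schoolbook multiply. The function names are mine and purely illustrative; plain double-precision floats are assumed.
Code:
from math import sqrt
import random

def fft4x4(a, b, c, d, e, f, g, h):
    """Multiply (a + b*x + c*x^2 + d*x^3)*(e + f*x + g*x^2 + h*x^3) via the
    length-8 real FFT above; returns 8 coefficients (the last is always 0)."""
    s = 1 / sqrt(2)
    # forward FFT of (a,b,c,d,0,0,0,0)
    brt, drt = b * s, d * s
    mtmp1, mtmp2, mtmp3, mtmp4 = a + c, b + d, brt - drt, brt + drt
    m1, m2, m3, m4 = mtmp1 + mtmp2, a + mtmp3, c + mtmp4, a - c
    m5, m6, m7, m8 = b - d, a - mtmp3, c - mtmp4, mtmp1 - mtmp2
    # forward FFT of (e,f,g,h,0,0,0,0)
    frt, hrt = f * s, h * s
    ntmp1, ntmp2, ntmp3, ntmp4 = e + g, f + h, frt - hrt, frt + hrt
    n1, n2, n3, n4 = ntmp1 + ntmp2, e + ntmp3, g + ntmp4, e - g
    n5, n6, n7, n8 = f - h, e - ntmp3, g - ntmp4, ntmp1 - ntmp2
    # pointwise multiply: r1, r5 are the real bins, the rest are (re, im) pairs
    r1, r5 = m1 * n1, m8 * n8
    r2, r6 = m2 * n2 - m3 * n3, m2 * n3 + m3 * n2
    r3, r7 = m4 * n4 - m5 * n5, m4 * n5 + m5 * n4
    r4, r8 = m6 * n6 - m7 * n7, m6 * n7 + m7 * n6
    # inverse FFT
    tmp1, tmp2 = (r1 + r5) / 8, r3 / 4
    tmp3, tmp4 = tmp1 + tmp2, tmp1 - tmp2
    tmp5, tmp6 = (r2 + r4) / 4, (r6 + r8) / 4
    tmp7, tmp8 = (r1 - r5) / 8, r7 / 4
    tmp9, tmp10 = tmp7 + tmp8, tmp7 - tmp8
    tmp11, tmp12 = (r2 - r4) * sqrt(2) / 8, (r6 - r8) * sqrt(2) / 8
    return [tmp3 + tmp5, tmp9 + tmp11 + tmp12, tmp4 + tmp6,
            tmp10 - tmp11 + tmp12, tmp3 - tmp5, tmp9 - tmp11 - tmp12,
            tmp4 - tmp6, tmp10 + tmp11 - tmp12]

def schoolbook(p, q):
    # plain O(n^2) reference multiply
    out = [0.0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

random.seed(1)
p = [random.randint(0, 9) for _ in range(4)]
q = [random.randint(0, 9) for _ in range(4)]
got = fft4x4(*p, *q)
want = schoolbook(p, q) + [0.0]              # pad to 8 coefficients
assert all(abs(x - y) < 1e-9 for x, y in zip(got, want))
print([round(x) for x in got])
On random digit-sized inputs this reproduces the schoolbook coefficients to within rounding error.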

2012-01-29, 20:58   #52
ewmayer (Sep 2002, República de California)

Without delving into your code in much detail, several general notes:

1. You need to be clear what type of transform you are working on - real-vector, complex, modular, what? (It seems from your examples you intend a real-signal convolution-based multiply, is that right?)

2. You should never use 1/sqrt(N) normalizations - just apply 1/N during the final normalize-and-carry step on the convolution outputs.

3. "Shifts" are not an option on most floating-point hardware. What you call shifts should instead be muls by inverse powers of 2, and they should never appear explicitly: they should be absorbable into the other multiplicative constants appearing in the algorithm.

4. You need to think about whether this kind of small-transform-optimization approach will generalize well to larger signal lengths. It appears to be useful for the small DFTs surrounding the dyadic-mul step, but large FFTs will need more-generic-radix passes (mostly small powers of 2, perhaps one odd-radix pass per forward FFT and iFFT) preceding and following the dyadic-mul step. Those passes need to accommodate array accesses with varying strides, as well as complex "twiddle" muls.
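To illustrate point 2 with a toy example: below is a deliberately slow O(n^2) DFT-based multiply in Python (the names and structure are mine, purely illustrative) in which neither transform carries any normalization and the single 1/n shows up only in the final normalize-and-carry loop. The same thinking applies to point 3: constants such as sqrt(2)/8 would be precomputed and folded into ordinary multiplies rather than appearing as run-time divisions or shifts.
Code:
import cmath

def dft(x, sign):
    # unnormalized DFT / inverse DFT, O(n^2) on purpose for clarity
    n = len(x)
    w = cmath.exp(sign * 2j * cmath.pi / n)
    return [sum(x[j] * w ** (j * k) for j in range(n)) for k in range(n)]

def bigmul(x_digits, y_digits, base=10):
    # little-endian digit vectors in; little-endian digits of the product out
    n = 1
    while n < len(x_digits) + len(y_digits):
        n *= 2
    X = dft(list(x_digits) + [0] * (n - len(x_digits)), +1)   # no 1/sqrt(n) here
    Y = dft(list(y_digits) + [0] * (n - len(y_digits)), +1)   # no 1/sqrt(n) here
    conv = dft([a * b for a, b in zip(X, Y)], -1)             # still scaled by n
    out, carry = [], 0
    for c in conv:
        t = int(round(c.real / n)) + carry                    # the only 1/n, applied here
        out.append(t % base)
        carry = t // base
    while carry:
        out.append(carry % base)
        carry //= base
    while len(out) > 1 and out[-1] == 0:
        out.pop()                                             # drop padding zeros
    return out

print(bigmul([3, 2, 1], [6, 5, 4]))    # 123 * 456 -> [8, 8, 0, 6, 5], i.e. 56088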

2012-01-30, 00:39   #53
R.D. Silverman (Nov 2003)

Quote:
Originally Posted by ewmayer View Post
Without delving into your code in much detail, several general notes:

1. You need to be clear what type of transform you are working on - real-vector, complex, modular, what? (It seems from your examples you intend a real-signal convolution-based multiply, is that right?)

2. You should never use 1/sqrt(N) normalizations - just apply 1/N during the final normalize-and-carry step on the convolution outputs.

3. "Shifts" are not an option on most floating-point hardware. What you call shifts should instead be muls-by-inverse-powers-of-2, should never appear explicitly. These should be absorbable into the other multiplicative constants appearing in the algorithm.

4. You need to think about whether this kind of small-transform-optimization approach will generalize well to larger signal lengths. It appears to be useful for the small DFTs surrounding the dyadic-mul step, but large FFTs will need more-generic-radix passes (mostly small powers of 2, perhaps one odd-radix pass per forward FFT and iFFT) preceding and following the dyadic-mul step. Those passes need to accommodate array accesses with varying strides, as well as complex "twiddle" muls.
Read Nussbaumer's book!!! It contains a lot of optimized small FFTs for various domains.

2012-01-30, 00:58   #54
henryzz ("David", Sep 2007, Cambridge (GMT/BST))

Quote:
Originally Posted by ewmayer View Post
Without delving into your code in much detail, several general notes:

1. You need to be clear what type of transform you are working on - real-vector, complex, modular, what? (It seems from your examples you intend a real-signal convolution-based multiply, is that right?)
All inputs and outputs should be real. All the variables are real, though some have implied imaginary parts; I just changed the sign when necessary when using them later on. I started from your example in the fifth post and used e^(i*2*pi/8) = 1/sqrt(2) + 1/sqrt(2)*i as E in an 8x8 matrix. Once I had done that I just looked for repeated operations and optimised them out.
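Concretely, the correspondence between that 8x8 matrix of powers of E and the m-values looks like this (a quick check only; numpy is used purely for convenience and the variable names follow the earlier post):
Code:
import numpy as np

E = np.exp(2j * np.pi / 8)                        # = 1/sqrt(2) + 1/sqrt(2)*i
F = np.array([[E ** (j * k) for k in range(8)] for j in range(8)])

a, b, c, d = 3.0, 1.0, 4.0, 1.0
X = F @ np.array([a, b, c, d, 0, 0, 0, 0])        # forward transform of the padded input

s = 1 / np.sqrt(2)
m = [a + b + c + d,                               # m1 = X[0] (purely real)
     a + (b - d) * s, c + (b + d) * s,            # m2, m3 = Re, Im of X[1]
     a - c, b - d,                                # m4, m5 = Re, Im of X[2]
     a - (b - d) * s, c - (b + d) * s,            # m6, m7 = Re, -Im of X[3] (the implied sign flip)
     a - b + c - d]                               # m8 = X[4] (purely real)
print(np.round(X, 6))
print(np.round(m, 6))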
Quote:
2. You should never use 1/sqrt(N) normalizations - just apply 1/N during the final normalize-and-carry step on the convolution outputs.
Are you referring to the 1/sqrt(2)s in the code or to something about my iFFT? N is 4 in my case and I am doing a length-2N FFT.
Quote:
3. "Shifts" are not an option on most floating-point hardware. What you call shifts should instead be muls-by-inverse-powers-of-2, should never appear explicitly. These should be absorbable into the other multiplicative constants appearing in the algorithm.
I didn't realize they didn't exist for floating point. Previously I have almost exclusively used integer arithmatic which does support shifts.
Quote:
4. You need to think about whether this kind of small-transform-optimization approach will generalize well to larger signal lengths. It appears to be useful for the small DFTs surrounding the dyadic-mul step, but large FFTs will need more-generic-radix passes (mostly small powers of 2, perhaps one odd-radix pass per forward FFT and iFFT) preceding and following the dyadic-mul step. Those passes need to accommodate array accesses with varying strides, as well as complex "twiddle" muls.
Not sure what you mean by generic-radix and odd-radix, though it might be the names fooling me. Do you mean that calculating r1 and r5 are odd-radix steps and the others generic-radix steps?
I realize that the optimisations will get more complex. With a length-8 FFT, E was 1/sqrt(2)+1/sqrt(2)*i and real(E)=im(E); with higher powers real(E)!=im(E).
The code above works after I fixed a typo: m1 should be mtmp1+mtmp2, not mtmp1+ntmp2.
I have gleaned most of my information on FFTs from this thread. I have studied basic linear transformations at uni, but nothing like this; I am currently halfway through my first year studying maths at Leicester University. I will look at this again tomorrow. Need to get to sleep.

2012-01-30, 20:40   #55
ewmayer (Sep 2002, República de California)

Quote:
Originally Posted by R.D. Silverman View Post
Read Nussbaumer's book!!! It contains a lot of optimized small FFTs for various domains.
It was one of my main references when I was writing my own early-stage FFT code, although I found it most useful for the optimized short-length-DFT algos rather than the overall implementation of a large-signal FFT. Or were you talking to Henry?

Quote:
Originally Posted by henryzz View Post
All inputs and outputs should be real. All the variables are real, though some have implied imaginary parts; I just changed the sign when necessary when using them later on. I started from your example in the fifth post and used e^(i*2*pi/8) = 1/sqrt(2) + 1/sqrt(2)*i as E in an 8x8 matrix. Once I had done that I just looked for repeated operations and optimised them out.
Are you referring to the 1/sqrt(2)s in the code or to something about my iFFT? N is 4 in my case and I am doing a length-2N FFT.
Now that you have made clear that you are writing a real-vector FFT, the reason for the sqrt(2)s is clear to me. I misread that part of your code, partly because I didn't have time to delve deeply into your implementation yesterday and instead focused on big-picture stuff.

Quote:
Not sure what you mean by generic-radix and odd-radix, though it might be the names fooling me. Do you mean that calculating r1 and r5 are odd-radix steps and the others generic-radix steps?
No, I mean that in general one seeks to write code that will be able to handle a wide variety of transform lengths. So, e.g., if length 64 is too small to permit an exact multiply of one's target inputs, one needs to be able to go easily to a larger transform length. One generally assembles these out of highly optimized smaller-DFT routines. For example, in my code I found it best to support power-of-2 complex-DFT-pass routines of lengths 8, 16 and 32. Lengths < 8 are inefficient because they lead to too many passes through the data array, but lengths > 32 yield little further gain in that regard. So for n=64 I would use 2 radix-8 passes, and for n=128 I would do one pass at radix 8 and another at radix 16.

Once one needs to get seriously efficient one also needs to support non-power-of-2 intermediate runlengths, e.g. n=80, which one could do via radix [5,16] or [8,10] combos. That gets nontrivial because one needs to consider how to generalize the concept of "bit-reversal reordering" to such runlengths.
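For concreteness, here is one way the "bit reversal" reordering generalizes to such mixed-radix runlengths (a Python sketch; the function name and the [5,16] radix split are just for illustration):
Code:
def digit_reverse(j, radices):
    # Write j in mixed-radix digits (least-significant first, radices in the
    # order given), then reassemble with the digit order reversed, so digits[0]
    # becomes the most significant digit of the result.
    # For radices = [2, 2, ..., 2] this is ordinary bit reversal.
    digits = []
    for r in radices:
        j, d = divmod(j, r)
        digits.append(d)
    out = digits[0]
    for r, d in zip(radices[1:], digits[1:]):
        out = out * r + d
    return out

radices = [5, 16]                        # n = 80 as one radix-5 and one radix-16 pass
n = 5 * 16
perm = [digit_reverse(j, radices) for j in range(n)]
assert sorted(perm) == list(range(n))    # it really is a permutation of 0..n-1
print(perm[:8])                          # [0, 16, 32, 48, 64, 1, 17, 33]

# The inverse reordering is the same routine with the radix list reversed:
assert all(digit_reverse(perm[j], radices[::-1]) == j for j in range(n))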

But you seem like you're on the right track, so let us know how things go.

Last fiddled with by ewmayer on 2012-01-30 at 20:42