20060606, 17:09  #1 
Jun 2006
5 Posts 
Can anyone help me?
Hi friends,
Filtering the sieve matrix is one of the important steps. Can anyone please point me to the actual algorithm used in GGNFS, or to any recent literature on the subject? I know only the papers by Pomerance-Smith and S. Cavallar. 
20060606, 17:29  #2  
Nov 2003
2^{2}·5·373 Posts 
Quote:
S. Cavallar. I have also implemented filtering code (15 years ago!) based upon the intelligent Gaussian elimination ideas of A. Odlyzko. It was written in Fortran (!!), and is *not* competitive with the CWI code in terms of the quality of the final matrix. I use the CWI code. While it has a few quirks and bugs, it produces excellent results (but it needs quite a bit of manual coaxing to work well). If you do a web search on Stefania's name, I am sure her paper will turn up: Strategies in Filtering in the Number Field Sieve, CWI Report MAS-R0012 

20060606, 17:36  #3  
Jul 2005
Saint Petersburg, Russia
5_{8} Posts 


20060607, 09:24  #4 
Jun 2006
5 Posts 
To R.D. Silverman
Thanks for your quick response. As mentioned in my post, I am aware of S. Cavallar's approach and also of Pomerance and Smith's approach, which is a modification of Odlyzko's structured Gaussian elimination. But S. Cavallar's thesis is at least 5 years old, and the Pomerance-Smith and Odlyzko papers are 15 years old. I was wondering if some paper has been published after 2000. 
20060607, 09:41  #5  
Nov 2003
2^{2}·5·373 Posts 


20060607, 10:10  #6 
Jun 2006
101_{2} Posts 
I believe the filtering was done on a single high-end machine (please correct me if this is wrong). But in future the sieve-output matrix size is expected to grow beyond 20 GB. In that case a single high-end machine won't suffice, and we cannot use too many machines either: the problem is that communication will become the bottleneck, at least for the Pomerance-Smith and S. Cavallar approaches on multiple machines!

20060607, 10:44  #7  
Nov 2003
2^{2}·5·373 Posts 
Quote:
CWI code is not very memory efficient. (It could be with some work; I don't have the time.) A single 64-bit processor machine with a lot of memory (16 to 32 GB) will suffice for even the largest matrices in the near future. I do not see multiple processors being required for filtering for quite some time. 

20060607, 10:56  #8  
Bamboozled!
May 2003
Down not across
2802_{16} Posts 
Quote:
Don wrote a duplicate remover; I wrote a quick and dirty singleton remover. My code certainly needs more work because at present it looks only at primes and not prime ideals. Nonetheless, it does a useful job. Richard Wackerbarth and I have brainstormed a few ideas about parallel filtering, but we've not tried implementing any of them yet. Given our excessive workloads at the moment, I doubt we'll get much further in the near future. Paul 
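Not Paul's or Don's actual code, of course; just a minimal sketch of the singleton-removal idea, assuming relations are stored as sets of primes (a real implementation would track prime ideals, as noted above):

```python
# Singleton removal sketch: repeatedly drop any relation containing a
# prime that appears in only one surviving relation, until no such
# relation remains.  Relation format (set of primes) is illustrative.
from collections import Counter

def remove_singletons(relations):
    """relations: list of sets of primes; returns the survivors."""
    relations = list(relations)
    while True:
        counts = Counter(p for rel in relations for p in rel)
        survivors = [rel for rel in relations
                     if all(counts[p] > 1 for p in rel)]
        if len(survivors) == len(relations):  # fixed point reached
            return survivors
        relations = survivors
```

Removing one singleton can create new ones, which is why the loop runs to a fixed point.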

20060607, 13:13  #9  
Tribal Bullet
Oct 2004
3,529 Posts 
Quote:
My understanding of Cavallar's algorithm is that it actually is Gaussian elimination, except that the clique-processing part of it only figures into how the initial matrix is built. I'm experimenting with using Gaussian elimination (after the clique phase) and a variant of full Markowitz pivoting, which can be efficiently implemented using a discrete priority queue. Both Odlyzko's method and Cavallar's merge phase sound like approximations to that, and I'm hoping that simpler code which doesn't approximate can produce a decent matrix. It seems to work well, though obviously I haven't thrown big problems at it. The (undocumented) source code is in the current version of msieve. jasonp PS: Haven't read it carefully, but for block Lanczos there's http://www.math.uu.nl/people/bisselin/flesch.pdf 
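This is not the msieve code; just a toy sketch of Markowitz-style pivot selection with a lazy priority queue, where rows are sets of column indices (an illustrative layout). It only *selects* pivots in increasing Markowitz cost (r_i - 1)(c_j - 1); a real implementation would also perform the eliminations and update the counts:

```python
# Markowitz pivot selection via a heap with lazy re-checking: stale
# entries (already-pivoted row/col, or outdated cost) are skipped or
# re-pushed instead of being deleted from the heap.
import heapq

def markowitz_order(rows, ncols):
    rows = [set(r) for r in rows]
    col_count = [0] * ncols
    for r in rows:
        for j in r:
            col_count[j] += 1
    heap = []
    for i, r in enumerate(rows):
        for j in r:
            heapq.heappush(heap, ((len(r) - 1) * (col_count[j] - 1), i, j))
    used_rows, used_cols = set(), set()
    while heap:
        cost, i, j = heapq.heappop(heap)
        if i in used_rows or j in used_cols:
            continue                          # stale: already pivoted
        cur = (len(rows[i]) - 1) * (col_count[j] - 1)
        if cur != cost:                       # cost went stale: re-key
            heapq.heappush(heap, (cur, i, j))
            continue
        used_rows.add(i)
        used_cols.add(j)
        for k in rows[i]:                     # retire this row's counts
            col_count[k] -= 1
        yield i, j
```

The lazy-deletion trick is what makes a "discrete priority queue" practical here: entries are never removed from the middle of the heap, only invalidated.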

20060608, 07:04  #10 
Jun 2006
5 Posts 
Thanks, jasonp

20060608, 14:30  #11 
Jun 2006
5_{8} Posts 
One more query:
After removing all the singletons, we need to throw out excess rows if there are more than the k extra relations we intend to keep. There are two strategies: 1. Pomerance and Smith throw out the heaviest relations. 2. S. Cavallar builds cliques of the relations that are 2-way mergeable, weighs them with some heuristic, and throws out the cliques with suitable priority. My question is: is the second approach always better than the first one, or is it just an alternative way? Has anyone compared these two experimentally? 
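Neither function below is Cavallar's actual algorithm; this is only a toy contrast of the two strategies, with relations as sets of primes, relation weight taken to be the number of prime factors, and cliques formed by joining relations that share a prime of count 2 (all of these are simplifying assumptions):

```python
# Two toy excess-removal strategies: heaviest-relations-first
# (Pomerance-Smith style) vs. heaviest-cliques-first (Cavallar style).
from collections import Counter, defaultdict

def drop_heaviest(relations, excess):
    """Discard the `excess` heaviest individual relations."""
    order = sorted(range(len(relations)),
                   key=lambda i: len(relations[i]), reverse=True)
    doomed = set(order[:excess])
    return [r for i, r in enumerate(relations) if i not in doomed]

def drop_heaviest_cliques(relations, excess):
    """Join relations sharing a prime of count 2 into cliques
    (union-find), then discard whole cliques, heaviest first,
    until at least `excess` relations are gone."""
    n = len(relations)
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x
    counts = Counter(p for r in relations for p in r)
    by_prime = defaultdict(list)
    for i, r in enumerate(relations):
        for p in r:
            if counts[p] == 2:
                by_prime[p].append(i)
    for ids in by_prime.values():
        parent[find(ids[0])] = find(ids[1])
    cliques = defaultdict(list)
    for i in range(n):
        cliques[find(i)].append(i)
    doomed = set()
    for members in sorted(cliques.values(),
                          key=lambda m: sum(len(relations[i]) for i in m),
                          reverse=True):
        if len(doomed) >= excess:
            break
        doomed.update(members)
    return [r for i, r in enumerate(relations) if i not in doomed]
```

The point of the clique variant is that removing one relation of a 2-way-mergeable pair makes its partner a singleton anyway, so whole cliques stand or fall together.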