mersenneforum.org  

2016-01-26, 06:30   #1
Dubslow
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88
7221₁₀ Posts

The big filtering bug strikes again (I think)

Code:
read 370M relations
error -9 reading relation 372978068
error -9 reading relation 373228223
error -9 reading relation 373228453
error -15 reading relation 373559422
read 380M relations
error -15 reading relation 381855498
skipped 89 relations with b > 2^32
skipped 64 relations with composite factors
found 74761703 hash collisions in 383596801 relations
added 1217250 free relations
commencing duplicate removal, pass 2
found 27671124 duplicates and 355926667 unique relations
memory use: 852.8 MB
reading ideals above 150011904
commencing singleton removal, initial pass
memory use: 6024.0 MB
reading all ideals from disk
memory use: 103.9 MB
Segmentation fault (core dumped)
This is the first attempt at post-processing 4051^71-1, an SNFS 260 job, with nearly 400M relations (with a roughly 8% duplication rate, as seen in the above paste).

Am I correct in thinking this is the big data filtering bug that is known to exist but has never been tracked down? What's the workaround? Will changing the target density suffice, or perhaps should I resort to manually de-duplicating?
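For the manual de-duplication option, a minimal sketch in Python might look like the following. It assumes the usual GGNFS-style text layout in which each relation line begins with `a,b:` and treats two lines with the same `a,b` prefix as duplicates. `dedupe_relations` is a hypothetical helper, not part of Msieve, and the sample lines are made up. Holding every key in one in-memory set would likely be impractical at ~380M relations without first sharding the input by hash, so this only illustrates the idea.

```python
def dedupe_relations(lines):
    """Return (unique_lines, duplicate_count), keyed on the text before the first ':'."""
    seen = set()
    unique = []
    duplicates = 0
    for line in lines:
        line = line.strip()
        # Skip blanks, comment lines, and anything without an 'a,b:' prefix.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key = line.split(":", 1)[0]
        if key in seen:
            duplicates += 1
            continue
        seen.add(key)
        unique.append(line)
    return unique, duplicates


# Made-up sample lines, for illustration only.
sample = [
    "-5,7:2089,6a1:851,7105",
    "11,3:43fbb:19e23",
    "-5,7:2089,6a1:851,7105",
]
unique, dups = dedupe_relations(sample)
print(len(unique), dups)
```

For scale, the log above reports 27671124 duplicates against 355926667 unique relations, which is about 7.2% of the combined total, or about 7.8% relative to the unique count.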

Last fiddled with by Dubslow on 2016-01-26 at 06:32

2016-01-26, 08:16   #2
Dubslow

Actually, wait, it could be due to the disk being full. 40GB SSD biting me again... (discovered this since prime95 was again reporting errors writing save files...)

Edit: Now I'm deleting 4108 MB of old Linux kernels (most of the space being taken by the source in /usr/src). That should help. du -sh * | sort -k 1 -h is a very useful command.

That it caused Msieve to segfault with no message (vs Prime95's errors about writing to disk) is unfortunate, but I suppose state of the art factoring software is unlikely to win any awards for user friendliness.
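A pre-flight check for free space is one way to catch this before a long run starts. The sketch below is not something Msieve does; `has_free_space` and the 20 GiB threshold are assumptions chosen for illustration, and the right threshold depends on the size of the job's intermediate files.

```python
import shutil


def has_free_space(path, required_bytes):
    """Return True if the filesystem holding `path` has at least `required_bytes` free."""
    return shutil.disk_usage(path).free >= required_bytes


# Hypothetical threshold: warn unless 20 GiB is free in the working directory.
REQUIRED = 20 * 1024**3
if not has_free_space(".", REQUIRED):
    print("warning: less than 20 GiB free; intermediate files may be truncated")
```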

Last fiddled with by Dubslow on 2016-01-26 at 08:27

2016-01-26, 11:56   #3
jasonp
Tribal Bullet
Oct 2004
5×709 Posts

Jeez, everyone's a critic. "Halting factorization, you're out of disk space. Please delete something under /usr, like those KDE libraries you haven't used lately"

2016-01-26, 12:21   #4
Dubslow

Quote:
Originally Posted by jasonp
Jeez, everyone's a critic. "Halting factorization, you're out of disk space. Please delete something under /usr, like those KDE libraries you haven't used lately"

I'm sorry, I don't mean to be rude, it is of course wonderful software, you've done and do amazing work.

2016-01-26, 15:22   #5
jasonp

No offense taken. You should see some of the lobbying I've gotten for setting a time deadline to run :)

Actually this error is interesting because neither the memory allocation nor the disk read failed (either of those would have generated log messages), but the file was truncated and probably didn't match the size the rest of the filtering expected.
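The general defence against this failure mode is for the reader to compare the file's actual length with the length the writer intended before trusting its contents. The sketch below illustrates that idea only; it is not Msieve's code, and the record layout and the names `write_records` and `verify_size` are hypothetical.

```python
import os
import struct

# Hypothetical fixed-size record: one 64-bit and one 32-bit little-endian field.
RECORD = struct.Struct("<QI")


def write_records(path, records):
    """Write records, force them to disk, and return the byte count that should exist."""
    with open(path, "wb") as f:
        for rec in records:
            f.write(RECORD.pack(*rec))
        f.flush()
        os.fsync(f.fileno())
    return len(records) * RECORD.size


def verify_size(path, expected_bytes):
    """Raise if the file on disk is not exactly the size the writer intended."""
    actual = os.path.getsize(path)
    if actual != expected_bytes:
        raise OSError(f"{path}: expected {expected_bytes} bytes, found {actual}")
```

A later stage would call `verify_size` with the count returned by `write_records` and stop with a message instead of indexing past the end of a short file.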

Last fiddled with by jasonp on 2016-01-26 at 16:00

2016-01-26, 21:01   #6
Dubslow

Quote:
Originally Posted by jasonp
Actually this error is interesting because neither the memory allocation nor the disk read failed (either of those would have generated log messages), but the file was truncated and probably didn't match the size the rest of the filtering expected.

Yes, that's what I was wondering too, surely a system call should have failed first...?

Anyways, take 2 has hung well and good:

Code:
read 380M relations
error -15 reading relation 381855498
skipped 89 relations with b > 2^32
skipped 64 relations with composite factors
found 74761735 hash collisions in 383597122 relations
added 1216929 free relations
commencing duplicate removal, pass 2

bt
^C
Program received signal SIGINT, Interrupt.
__GI_____strtoul_l_internal (
    nptr=0x7fffffffc7bb "42753584:2089,6A1,3692C6F,2B1A7C1,B8FCE1,43FBB,6EE9,7CCF2B5:157E5435,5486A6BB,851,1913F7D,1D010D,1643A7,19E23,7105\n", endptr=0x0, base=<optimized out>, group=<optimized out>, loc=<optimized out>)
    at ../stdlib/strtol_l.c:438
438     ../stdlib/strtol_l.c: No such file or directory.
(gdb) bt
#0  __GI_____strtoul_l_internal (
    nptr=0x7fffffffc7bb "42753584:2089,6A1,3692C6F,2B1A7C1,B8FCE1,43FBB,6EE9,7CCF2B5:157E5435,5486A6BB,851,1913F7D,1D010D,1643A7,19E23,7105\n", endptr=0x0, base=<optimized out>, group=<optimized out>, loc=<optimized out>)
    at ../stdlib/strtol_l.c:438
#1  0x000000000045ee80 in nfs_purge_duplicates ()
#2  0x0000000000427eae in nfs_filter_relations ()
#3  0x0000000000415bfc in factor_gnfs ()
#4  0x0000000000405077 in msieve_run ()
#5  0x0000000000403fa2 in factor_integer ()
#6  0x0000000000403a82 in main ()
^ It was stuck there for nearly twelve hours with no further output before I interrupted it to get the bt. Could this be a symptom of some bug, or is it more likely to be a symptom of corrupted files from the first run?

2016-01-26, 22:16   #7
Dubslow

Okay now I'm scared. This is take 3, which is identical to take 2 except for deleting all intermediate files that had been produced by msieve, leaving the original .dat.gz, .fb, and .ini files to start over with:

Code:
read 370M relations
error -9 reading relation 372978068
error -9 reading relation 373228223
error -9 reading relation 373228453
error -15 reading relation 373559422
read 380M relations
error -15 reading relation 381855498
read 390M relations
read 400M relations
read 410M relations
read 420M relations
read 430M relations
read 440M relations
read 450M relations
read 460M relations
read 470M relations
read 480M relations
read 490M relations
read 500M relations
There's only supposed to be 380M relations... I'm going to keep letting it run a while longer to see if/when it stops...
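One way to rule the input file in or out is to count its lines independently of Msieve and check that the gzip stream is intact. The following is a sketch under that assumption; `count_gz_lines` is a hypothetical helper, and a full pass over a file of this size would take a while.

```python
import gzip
import zlib


def count_gz_lines(path):
    """Return (line_count, intact); intact is False if the gzip stream is damaged or cut short."""
    count = 0
    try:
        with gzip.open(path, "rb") as f:
            for _ in f:
                count += 1
    except (EOFError, OSError, zlib.error):
        # A truncated or corrupt archive fails partway through decompression.
        return count, False
    return count, True
```

A count far from the expected ~380M, or `intact` coming back `False`, would point at the .dat.gz file rather than at the filtering code.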

2016-01-26, 23:49   #8
Dubslow

Okay I finally got bored (and am trying to get rid of some other files, so I need the disk IO capacity), so I killed it at 1.25B "relations" read:

Code:
read 1000M relations
read 1010M relations
read 1020M relations
read 1030M relations
read 1040M relations
read 1050M relations
read 1060M relations
read 1070M relations
read 1080M relations
read 1090M relations
read 1100M relations
read 1110M relations
read 1120M relations
read 1130M relations
read 1140M relations
read 1150M relations
read 1160M relations
read 1170M relations
read 1180M relations
read 1190M relations
read 1200M relations
read 1210M relations
read 1220M relations
read 1230M relations
read 1240M relations
read 1250M relations
^C
Program received signal SIGINT, Interrupt.
0x00007ffff7769642 in __gmpz_tdiv_q_ui () from /usr/lib/x86_64-linux-gnu/libgmp.so.10
(gdb) bt
#0  0x00007ffff7769642 in __gmpz_tdiv_q_ui () from /usr/lib/x86_64-linux-gnu/libgmp.so.10
#1  0x0000000000437e58 in divide_factor_out (tmp3=0x7fffffffcb98, tmp2=0x7fffffffcb88, tmp1=0x7fffffffcb78, 
    compress=1, num_factors=<synthetic pointer>, array_size_in=<synthetic pointer>, 
    factors=0x7fffffffc580 "\t\301!\215oX$\233AOF\225ayc\205;\177\220i]\201\065e3\276\203_\204\065(y+\201;M\032$\205Q\220}~D\214\r\002\364'\a\331#<\206\005b\201\205\213\203K\204\201\202\207\202Q\202s\206", p=41, 
    polyval=0x7fffffffc520) at gnfs/relation.c:38
#2  nfs_read_relation (buf=<optimized out>, fb=0x7fffffffcae0, r=0x7fffffffc530, array_size_out=0x7fffffffc4fc, 
    compress=1, polyval=0x7fffffffc520, test_primality=1) at gnfs/relation.c:210
#3  0x000000000045e7e8 in nfs_purge_duplicates (obj=0x6b25c8, obj@entry=0x6a0250, fb=0x6b08e0, 
    fb@entry=0x7fffffffcae0, max_relations=max_relations@entry=0, num_relations_out=0xfffffffa, 
    num_relations_out@entry=0x7fffffffc97c) at gnfs/filter/duplicate.c:361
#4  0x0000000000427eae in nfs_filter_relations (obj=obj@entry=0x6a0250, n=0x7fffffffcd20, n@entry=0x7fffffffcd10)
    at gnfs/filter/filter.c:322
#5  0x0000000000415bfc in factor_gnfs (obj=obj@entry=0x6a0250, input_n=input_n@entry=0x7fffffffd1d0, 
    factor_list=factor_list@entry=0x7fffffffd380) at gnfs/gnfs.c:153
#6  0x0000000000405077 in msieve_run_core (factor_list=0x7fffffffd380, n=0x7fffffffd1d0, obj=0x6a0250)
    at common/driver.c:158
#7  msieve_run (obj=0x6a0250) at common/driver.c:268
#8  0x0000000000403fa2 in factor_integer (
    buf=buf@entry=0x7fffffffdce0 "33842640673654900230841309689581377811814193589205837074162807524059165165109885829323413987612718480377383114930080316434037965607731519369328496941521960020893161774811908489956894175554089119794354"..., flags=flags@entry=7171, savefile_name=savefile_name@entry=0x0, logfile_name=logfile_name@entry=0x0, 
    nfs_fbfile_name=nfs_fbfile_name@entry=0x0, seed1=seed1@entry=0x7fffffffdcc4, seed2=0x7fffffffdcc8, 
    max_relations=0, cpu=cpu_core, cache_size1=32768, cache_size2=6291456, num_threads=4, which_gpu=0, 
    nfs_args=0x7fffffffe291 "target_density=140") at demo.c:233
#9  0x0000000000403a82 in main (argc=<optimized out>, argv=<optimized out>) at demo.c:599

2016-01-26, 23:50   #9
wombatman
I moo ablest echo power!
May 2013
2²·449 Posts

If you'd like, I can try it on my computer and see if it throws an error. Probably will have to download overnight or something, but I can post an update once it's building/built.

2016-01-26, 23:55   #10
Dubslow

I'm not ready to give up quite yet, I want to finish tarring and feathering zipping some 2.5GB of other old data off to a friend before I give it another whack.

2016-01-27, 00:08   #11
unconnected
May 2009
Russia, Moscow
2²×5×139 Posts

I successfully ran filtering on this job. Msieve v1.52 produced a 13.3M matrix with TD=120.

All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.