mersenneforum.org


RedGolpe 2020-08-06 13:51

CADO-NFS error (exit code -9)
 
Hi all, I am factoring a C143 but the program stopped with an error, with no mention of what kind. Here are the last few lines displayed.

[CODE]Info:Square Root: Creating file of (a,b) values
Warning:Command: Process with PID 1917 finished with return code -9
Error:Square Root: Program run on server failed with exit code -9
Error:Square Root: Command line was: /home/ubuntu/cado-nfs/build/ip-172-31-36-46/sqrt/sqrt -poly nfsdata/c145.poly -prefix nfsdata/c145.dep.gz -dep 0 -t 8 -side0 -side1 -gcd > nfsdata/c145.sqrt.stdout.4 2> nfsdata/c145.sqrt.stderr.4
Error:Square Root: Stderr output (last 10 lines only) follow (stored in file nfsdata/c145.sqrt.stderr.4):
Error:Square Root: Alg(1): starting level 3 at cpu=1962.67s (wct=257.15s), 8 values to multiply
Error:Square Root: Alg(7): level 1 took cpu=147.07s (wct=18.52s)
Error:Square Root: Alg(7): starting level 2 at cpu=1964.10s (wct=257.33s), 16 values to multiply
Error:Square Root: Alg(4): level 2 took cpu=167.89s (wct=21.14s)
Error:Square Root: Alg(4): starting level 3 at cpu=1964.98s (wct=257.44s), 8 values to multiply
Error:Square Root: Alg(2): level 2 took cpu=167.94s (wct=21.15s)
Error:Square Root: Alg(2): starting level 3 at cpu=1967.93s (wct=257.81s), 8 values to multiply
Error:Square Root: Alg(3): level 2 took cpu=166.18s (wct=20.93s)
Error:Square Root: Alg(3): starting level 3 at cpu=1982.80s (wct=259.69s), 8 values to multiply
Error:Square Root:
Traceback (most recent call last):
File "cado-nfs/cado-nfs.py", line 122, in <module>
factors = factorjob.run()
File "cado-nfs/scripts/cadofactor/cadotask.py", line 5957, in run
last_status, last_task = self.run_next_task()
File "cado-nfs/scripts/cadofactor/cadotask.py", line 6049, in run_next_task
return [task.run(), task.title]
File "cado-nfs/scripts/cadofactor/cadotask.py", line 4940, in run
raise Exception("Program failed")
Exception: Program failed[/CODE]

I resumed the computation and had the same issue. Can it just be that not enough memory is available? Thank you in advance.

charybdis 2020-08-06 14:24

Yes, this looks like running out of memory. By default CADO runs one dependency per thread which makes the square root very memory-intensive.

Try adding a line
[code]tasks.sqrt.threads = 1[/code]
to your parameter file and resuming - this will cause sqrt to run single-threaded so memory use is lower.
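For anyone wondering why -9 points to memory: on Linux, Python's subprocess module reports a child terminated by a signal as the negative signal number, so -9 means the process received SIGKILL, which is what the kernel's out-of-memory killer sends. The sketch below only illustrates that convention; describe_return_code is a hypothetical helper and not part of CADO-NFS, and it assumes a POSIX system.

```python
import signal
import subprocess
import sys


def describe_return_code(code: int) -> str:
    """Translate a subprocess return code into a readable description.

    Hypothetical helper for illustration; not part of CADO-NFS.
    """
    if code < 0:
        # Negative values mean "terminated by signal number -code" (POSIX).
        try:
            name = signal.Signals(-code).name
        except ValueError:
            name = f"signal {-code}"
        return f"killed by {name}"
    return f"exited with status {code}"


# Demonstration: start a child Python process that sends SIGKILL to itself,
# which mimics what the parent sees when the OOM killer strikes.
child = subprocess.run(
    [sys.executable, "-c",
     "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
)
print(child.returncode, describe_return_code(child.returncode))
```

A SIGKILL can also come from elsewhere (for example a manual kill -9), so checking the kernel log for an out-of-memory message is the way to confirm the cause.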

RedGolpe 2020-08-06 14:54

Thank you, it worked.

EdH 2020-08-06 15:07

Another way that I've used with larger runs is to add a swap file so I could still use full threads.
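Before choosing between fewer threads and more swap, it can help to see how much RAM and swap the machine actually has free. Below is a minimal sketch that parses text in the format of Linux's /proc/meminfo; the function names and the sample values are made up for illustration, and nothing here is specific to CADO-NFS.

```python
import os


def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into a {field name: kibibytes} dict."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def memory_summary(text: str) -> str:
    """Summarise available RAM and free swap in GiB."""
    info = parse_meminfo(text)
    kib_per_gib = 1024 * 1024
    avail = info.get("MemAvailable", 0) / kib_per_gib
    swap = info.get("SwapFree", 0) / kib_per_gib
    return f"RAM available: {avail:.1f} GiB, swap free: {swap:.1f} GiB"


# Made-up sample in the /proc/meminfo format, used when the real file is absent.
SAMPLE = "MemTotal: 16777216 kB\nMemAvailable: 2097152 kB\nSwapFree: 0 kB\n"

if os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        print(memory_summary(f.read()))
else:
    print(memory_summary(SAMPLE))
```

Running this in a loop while sqrt is working gives a rough picture of whether the remaining headroom is shrinking toward zero.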

RedGolpe 2020-09-01 11:02

Both methods seem to work fine. Since I am running remote instances with limited disk space, and also considering that the sqrt phase takes a negligible time compared to the rest of the factorization, I'll stick with the thread tweaking for now.

I work with 8 threads, so I expect the default is tasks.sqrt.threads = 8; indeed, when I monitored memory usage in near real time, it ran out. I tried a C140 with tasks.sqrt.threads = 4 and it completed fine. To make it work, I edited the <workdir>/c140.parameters_snapshot.<x> file after the error occurred, so I was able to keep the previous computation.

Is there a way to make this change permanent? Should I edit the files in cado-nfs/parameters/factor/ ? I assume for example that for a C140 the relevant file is cado-nfs/parameters/factor/params.c140, is that correct?

Thank you again.

EdH 2020-09-01 12:25

You can edit the params files or you can add the option in your command line. I do the latter on a continual basis with my scripts. Just add "tasks.sqrt.threads=4" within the command:
[code]
./cado-nfs.py <comp> . . . tasks.sqrt.threads=4 . . .
[/code]
I pull the spaces around the "=" out when I add an option to the command line, but I probably don't need to. Options added to the command line will show up in the snapshots.
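For the other route, editing the params file, here is a rough sketch of a script that sets the option in the file's text. It assumes the file consists of plain "key = value" lines, as in the tasks.sqrt.threads = 1 line quoted earlier in the thread; set_param is a hypothetical helper, not something shipped with CADO-NFS, and the sample text is invented.

```python
import re


def set_param(text: str, key: str, value: str) -> str:
    """Return params text with `key = value` set.

    Replaces the first uncommented line that assigns `key`, or appends a new
    line if there is none. Assumes one "key = value" assignment per line.
    """
    pattern = re.compile(r"^\s*" + re.escape(key) + r"\s*=")
    lines = text.splitlines()
    for i, line in enumerate(lines):
        if pattern.match(line):
            lines[i] = f"{key} = {value}"
            break
    else:
        # No existing assignment found: add one at the end.
        lines.append(f"{key} = {value}")
    return "\n".join(lines) + "\n"


# Invented sample content, for illustration only.
original = "name = c140\n# tasks.sqrt.threads = 8\n"
print(set_param(original, "tasks.sqrt.threads", "4"), end="")
```

To use it on a real file, read the file's contents, pass them through set_param, and write the result back, keeping a backup copy of the original.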

RedGolpe 2020-09-01 12:29

Ah, that's useful! Thank you very much!


All times are UTC.