mersenneforum.org Windows x64 CUDA Build

2013-08-17, 19:00   #1
Brian Gladman

May 2008
Worcester, United Kingdom

7²·11 Posts
Windows x64 CUDA Build

I have finally been able to work on the Visual Studio 2012 x64 build of msieve with GPU support. I have uploaded this to the msieve repository so anyone who is brave enough to build from source using Visual Studio 2012 will be able to try it out. I have also sent Jeff Gilchrist a set of binaries that I hope he will upload to his factoring binaries page shortly.

Although I have not disabled the win32 GPU build, only the x64 build has been tested. Users of the win32 build are going to be largely on their own as I no longer undertake win32 development and might therefore have little to offer if help is needed.

best regards,

Brian

2013-08-17, 19:06   #2
wombatman
I moo ablest echo power!

May 2013

2²·463 Posts

Excellent! Once I get my CUDA issues worked out, I'll definitely try it out.

2013-08-23, 09:29   #3
JP12

Aug 2013

3 Posts
Crash on Algebra stage

Quote:
 Originally Posted by Brian Gladman I have finally been able to work on the Visual Studio 2012 x64 build of msieve with GPU support. I have uploaded this to the msieve repository so anyone who is brave enough to build from source using Visual Studio 2012 will be able to try it out. I have also sent Jeff Gilchrist a set of binaries that I hope he will upload to his factoring binaries page shortly. Although I have not disabled the win32 GPU build, only the x64 build has been tested. Users of the win32 build are going to be largely on their own as I no longer undertake win32 development and might therefore have little to offer if help is needed. best regards, Brian
I built SVN 945 with VS2012, and it compiled without a glitch.
I set USE_CUDA = True in factsieve.py and started factoring this number:
2881039827457895971881627053137530734638790825166127496066674320241571446494762386620442953820735453

It crashes after the last line shown here:
Found 5065278 relations, 123.7% of the estimated minimum (4095000).
-> msieve -s example\mynum.dat -l example\mynum.log -i example\mynum.ini -nf example\mynum.fb -t 8 -nc1
-> Running matrix solving step ...
-> msieve -s example\mynum.dat -l example\mynum.log -i example\mynum.ini -nf example\mynum.fb -t 8 -nc2

I can send the compiled binaries, but they are too big to attach here (1263KB zipped).

Regards,

Jose

2013-08-26, 11:33   #4
Brian Gladman

May 2008
Worcester, United Kingdom

7²×11 Posts

Quote:
 Originally Posted by JP12 I built with VS2012 the SVN 945, compiled without a glitch. Set factsieve.py with USE_CUDA = True, and start factoring this number: 2881039827457895971881627053137530734638790825166127496066674320241571446494762386620442953820735453 Crashes after last line: Found 5065278 relations, 123.7% of the estimated minimum (4095000). -> msieve -s example\mynum.dat -l example\mynum.log -i example\mynum.ini -nf example\mynum.fb -t 8 -nc1 -> Running matrix solving step ... -> msieve -s example\mynum.dat -l example\mynum.log -i example\mynum.ini -nf example\mynum.fb -t 8 -nc2 I can send compiled binaries, but too big to attach here (1263KB zipped). Regards, Jose
Hi Jose

I tried your number but I didn't get a crash. However, I didn't get a result either since it seems to go into an infinite loop at the nc2 stage:

Found 5117117 relations, 125.0% of the estimated minimum (4095000).
-> msieve -s ..\..\..\..\ggnfs\tests\mtest\mtest.dat -l ..\..\..\..\ggnfs\tests\mtest\mtest.log -i ..\..\..\..\ggnfs\tests\mtest\mtest.ini -nf ..\..\..\..\ggnfs\tests\mtest\mtest.fb -t 8 -nc1
-> Running matrix solving step ...
-> msieve -s ..\..\..\..\ggnfs\tests\mtest\mtest.dat -l ..\..\..\..\ggnfs\tests\mtest\mtest.log -i ..\..\..\..\ggnfs\tests\mtest\mtest.ini -nf ..\..\..\..\ggnfs\tests\mtest\mtest.fb -t 8 -nc2
linear algebra completed 3577949 of 158541 dimensions (2256.8%, ETA 630h22m)
Signal caught. Terminating...

It seems odd that it reports completing 3577949 dimensions when it only seeks 158541.
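
As a quick check of the figures in that log line: 3577949 / 158541 is roughly 22.568, which matches the reported 2256.8%. So the percentage is computed consistently, and it is the completed count itself that has run past the number of dimensions. A minimal sketch of that arithmetic and a sanity check follows; the helper names `percent_done` and `progress_is_sane` are hypothetical and are not part of msieve.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; not msieve code. */

/* Percentage of work done, as a progress line would report it. */
double percent_done(unsigned long completed, unsigned long total)
{
    assert(total > 0);  /* avoid dividing by zero */
    return 100.0 * (double)completed / (double)total;
}

/* Returns 1 when the counter is plausible, i.e. completed <= total,
   and 0 when it has run past the total (as in the log above). */
int progress_is_sane(unsigned long completed, unsigned long total)
{
    return total > 0 && completed <= total;
}
```

Calling `progress_is_sane(3577949UL, 158541UL)` returns 0, which is the condition the log line exhibits.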

Maybe Jason can shed light on what may be happening here.

Brian

2013-08-27, 00:01   #5
jasonp
Tribal Bullet

Oct 2004

3²×5×79 Posts

fivemack has had crash problems with later SVN versions; could you see if the problem goes away if you replace the common/lanczos directory with the version from SVN923?

2013-08-27, 08:23   #6
Brian Gladman

May 2008
Worcester, United Kingdom

7²×11 Posts

Quote:
 Originally Posted by jasonp fivemack has had crash problems with later SVN versions; could you see if the problem goes away if you replace the common/lanczos directory with the version from SVN923?
Yes, that works perfectly for this number.

Brian

2013-08-27, 10:37   #7
Brian Gladman

May 2008
Worcester, United Kingdom

7²·11 Posts

Quote:
 Originally Posted by Brian Gladman Yes, that works perfectly for this number. Brian
It is the changes to lanczos_matmul0.c after SVN927 that introduced this problem.

2013-08-27, 11:24   #8
jasonp
Tribal Bullet

Oct 2004

3²·5·79 Posts

I don't see how; the only changes in that neighborhood were in SVN938, which just moved some initialization upwards, and SVN939 which moved structure freeing upwards. I'm starting to think it's actually a buffer overrun somewhere, unless you see something in that code that I don't.

2013-08-27, 11:48   #9
Brian Gladman

May 2008
Worcester, United Kingdom

1000011011₂ Posts

Quote:
 Originally Posted by jasonp I don't see how; the only changes in that neighborhood were in SVN938, which just moved some initialization upwards, and SVN939 which moved structure freeing upwards. I'm starting to think it's actually a buffer overrun somewhere, unless you see something in that code that I don't.
I couldn't see how either :-(

But the only change I need to make to the current SVN version to get it to work is to revert the changes in this file alone. As you say, this change must be triggering some issue elsewhere.

Since the error turns up in the nc2 stage, what files do I need to delete (or keep) in order to rerun only the nc2 and subsequent stages?

Last fiddled with by Brian Gladman on 2013-08-27 at 12:27 Reason: ask a question

2013-08-29, 20:24   #10
JP12

Aug 2013

3 Posts
lanczos_matmul0.c

Quote:
 Originally Posted by Brian Gladman It is the changes to lanczos_matmul0.c after SVN927 that introduced this problem.
I think so. Why has the following code moved out of the else block?
Code:
if (p->num_threads > 1) {
}
free(p->tasks);
Can this explain the problem?
Jose

2013-08-30, 10:52   #11
jasonp
Tribal Bullet

Oct 2004

3²×5×79 Posts

No; this is a structure that was built with multithreading in mind, but is needed for single-thread runs as well, so it always has to get cleaned up. Plus the code you quote only runs when shutting down, which would not explain why the linear algebra itself fails early on.

My current guess is that there is a buffer overflow early on, which happens to stomp on memory that is used by the multithreading, because the current SVN allocates multithreading memory in a different place now. But I don't have the time to diagnose it.
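
The guess above, that an existing overrun only becomes visible once memory is laid out differently, can be illustrated with a small sketch. Everything here is hypothetical (the struct layouts, the field names and the `state_survives_*` functions are not msieve code), and the oversized write is kept inside the enclosing struct so the demo stays well-defined C. A real overrun is undefined behaviour.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical layouts for illustration only; not msieve structures.
   All members are byte arrays, so compilers in practice add no padding. */
struct layout_a {                  /* thread state sits right after buf */
    unsigned char buf[8];
    unsigned char thread_state[8];
    unsigned char scratch[8];
};

struct layout_b {                  /* scratch space sits in between */
    unsigned char buf[8];
    unsigned char scratch[8];
    unsigned char thread_state[8];
};

/* Returns 1 if every byte in p[0..n) is zero. */
static int all_zero(const unsigned char *p, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}

/* Simulates a write intended for buf (8 bytes) that writes n bytes
   instead, then reports whether thread_state is still intact. */
int state_survives_a(size_t n)
{
    struct layout_a s;
    memset(&s, 0, sizeof s);
    if (n > sizeof s)
        n = sizeof s;              /* stay inside the struct */
    memset(&s, 0xFF, n);           /* the "buggy" oversized write */
    return all_zero(s.thread_state, sizeof s.thread_state);
}

int state_survives_b(size_t n)
{
    struct layout_b s;
    memset(&s, 0, sizeof s);
    if (n > sizeof s)
        n = sizeof s;
    memset(&s, 0xFF, n);
    return all_zero(s.thread_state, sizeof s.thread_state);
}
```

With a 12-byte write, `state_survives_a(12)` reports the thread state as corrupted while `state_survives_b(12)` reports it as intact: the same bug is present in both cases, but only one layout shows a symptom. That is consistent with reverting a single file appearing to fix the problem without the reverted change itself being wrong.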

All times are UTC.