mersenneforum.org  

Old 2012-05-25, 03:16   #1310
Dubslow
Basketry That Evening!
"Bunslow the Bold"
Jun 2011
40<A<43 -89<O<-88

3·29·83 Posts

Quote:
Originally Posted by msft View Post
Ver 2.01
Fix calculation of roundoff error.
Code:
M( 26974951 )C, 0xe72576f52d2b0d8c, n = 1474560, CUDALucas v2.00
M( 26975743 )C, 0x67d12882dc466fd7, n = 1474560, CUDALucas v2.00
M( 26767891 )C, 0xbbeb0ad54a815dda, n = 1474560, CUDALucas v2.00
M( 26768243 )C, 0x3280d4e28ef0b188, n = 1474560, CUDALucas v2.00
M( 26822449 )C, 0xad9016be8bd360a9, n = 1474560, CUDALucas v2.00
M( 26823619 )C, 0xb989439408521303, n = 1474560, CUDALucas v2.00
M( 27722911 )C, 0x42c0350cdd3596a9, n = 1572864, CUDALucas v2.00
M( 27192083 )C, 0xc5ec2d3fd58a9ccf, n = 1474560, CUDALucas v2.00
M( 27192391 )C, 0xc882ca1522e59f5e, n = 1474560, CUDALucas v2.00
M( 27699043 )C, 0xd0eda360a4e70525, n = 1572864, CUDALucas v2.01
M( 27703817 )C, 0xd73a94c433bcd689, n = 1572864, CUDALucas v2.01
M( 27706841 )C, 0x6708a475e7e3db2b, n = 1572864, CUDALucas v2.01
M( 27707293 )C, 0xf82f49632f7b11c1, n = 1572864, CUDALucas v2.01
M( 27708413 )C, 0x78c5e390d697eec8, n = 1572864, CUDALucas v2.01
M( 27661351 )C, 0x13df0d77627419bc, n = 1572864, CUDALucas v2.01
All DCs completed successfully.
The 2.01 makefile no longer has -lcudart in it. Is that correct? That is definitely not correct. Then again, it probably only matters to me.

Last fiddled with by Dubslow on 2012-05-25 at 03:18
Old 2012-05-26, 02:45   #1311
kdgehman
 
Feb 2011

2²×3 Posts

Quote:
Originally Posted by apsen View Post
I've got a mismatch with 2.0 for 29668031.

Could someone run it with P95?

Thanks,
Andriy

Sending result to server: UID: KDGehman1978/i5-2500k_2, M29668031 is not prime. Res64: 846461BD45E022__
PrimeNet success code with additional info:
LL test successfully completes double-check of M29668031
CPU credit is 29.5032 GHz-days.
Old 2012-05-27, 04:12   #1312
Dubslow
CUDALucas 2.02

Well, I've tinkered with CUDALucas before, but this time I think it's a more worthwhile upgrade. In addition to the standard worktodo.txt functionality I added previously, this version now supports proper .ini functionality, in the same vein as mfakt*. That is, almost all of the command line options can now be set via "CUDALucas.ini", so you don't have to remember the complicated command each time you restart it (or spam the up-arrow, in my case). The best part is that any command line options will override the corresponding values in CUDALucas.ini. I've tentatively labelled it 2.02.

I have not messed with any of the computational code; any bugs that cause incorrect results are also in 2.01. (I am not aware of any, for the record. I'm ~15/16 with 2.00 and 2.01.)

There is an extra file, plus anyone compiling on Windows (flash!) should look at lines 32-40. In order to make gcc link function calls in a .cu file to functions defined in a .c file, I had to add extern "C" -- no idea how MSVC will react.
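
For anyone unfamiliar with the extern "C" trick: nvcc compiles .cu files as C++, so declarations of functions that live in a plain .c file need C linkage or the linker looks for mangled names. The usual pattern is a guarded block in a shared header. This is only a sketch; the header name parse.h is an assumption for illustration, and isprime() is simply borrowed from parse.c as an example function.

```c
/* ---- parse.h (hypothetical header name) ---- */
#ifndef PARSE_H
#define PARSE_H

#ifdef __cplusplus
extern "C" {   /* C++ compilers (nvcc, g++, MSVC in C++ mode): use C linkage */
#endif

int isprime(unsigned int n);   /* defined in a .c file, compiled as plain C */

#ifdef __cplusplus
}
#endif

#endif /* PARSE_H */

/* ---- parse.c (compiled as C, so the guard above is skipped) ---- */
int isprime(unsigned int n)
{
  unsigned int i;

  if (n <= 1) return 0;
  if (n > 2 && n % 2 == 0) return 0;
  for (i = 3; i * i <= n && i < 0x10000; i += 2)
    if (n % i == 0) return 0;
  return 1;
}
```

The `__cplusplus` guard is what keeps the same header usable from both sides: a C compiler never sees the extern "C" lines at all.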

There is also a new Makefile, which corrects the thing I mentioned two posts above -- it also now includes warnings from nvcc. msft, I got these warnings when I compiled it:
Code:
nvcc -O2 -arch=sm_13 --compiler-options=-Wall -c CUDALucas.cu
CUDALucas.cu: In function ‘void printbits(double*, int, int, int, int, double, double, int, int, char*)’:
CUDALucas.cu:895:20: warning: zero-length gnu_printf format string
CUDALucas.cu: In function ‘int check(int, char*)’:
CUDALucas.cu:1161:19: warning: comparison between signed and unsigned integer expressions
CUDALucas.cu: In function ‘void printbits(double*, int, int, int, int, double, double, int, int, char*)’:
CUDALucas.cu:848:7: warning: ‘fp’ may be used uninitialized in this function
I could probably fix those myself, but I didn't want to touch the code so I could make the statement above.

The functions to read the ini file were taken from mfaktc and modified to my tastes, styled after Prime95's ini-reading functions.
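
To illustrate the precedence described above (command line beats ini file, ini file beats built-in default), here is a minimal sketch. It is not the code from mfaktc or from 2.02: the helper names ini_get_int and resolve_option are made up, and it scans ini text already held in memory instead of reading the file, purely to keep the example short.

```c
#include <stdlib.h>
#include <string.h>

#ifdef _MSC_VER
#define strncasecmp _strnicmp   /* MSVC spelling of the same function */
#else
#include <strings.h>            /* POSIX home of strncasecmp */
#endif

/* Hypothetical helper: find "key=<integer>" at the start of a line.
   Returns 1 and stores the value if found, 0 otherwise. */
int ini_get_int(const char *ini_text, const char *key, int *value)
{
  size_t klen = strlen(key);
  const char *line = ini_text;

  while (line && *line) {
    /* Skip '#' comment lines; match the key case-insensitively. */
    if (*line != '#' && strncasecmp(line, key, klen) == 0 && line[klen] == '=') {
      *value = atoi(line + klen + 1);
      return 1;
    }
    line = strchr(line, '\n');
    if (line) line++;           /* step past the newline to the next line */
  }
  return 0;
}

/* Precedence: command line (if given) > ini file > built-in default. */
int resolve_option(const char *ini_text, const char *key,
                   int have_cli, int cli_value, int default_value)
{
  int v = default_value;

  ini_get_int(ini_text, key, &v);   /* leaves v untouched if key is absent */
  return have_cli ? cli_value : v;
}
```

With an ini containing `CheckpointIterations=10000`, `resolve_option(ini, "CheckpointIterations", 1, 20000, 5000)` yields the command-line value 20000, while passing `have_cli = 0` yields 10000 from the ini text.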

Note that the ini file name of "CUDALucas.ini" will clobber the old .ini file in Windows, where file names are not case sensitive (or so I hear). If anyone has a better idea for the name before flash compiles, it's set at something like line 42 in CUDALucas.cu.

As the computational stuff hasn't been touched, the checkpoint files are the same. However, your worktodo.txt will need to be reformatted to the "Test=" or "DoubleCheck=" format. Copy and pastes from GPU272 or PrimeNet/Manual will work just fine.
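
As a rough picture of what reading a "Test=" or "DoubleCheck=" line involves, here is a sketch that pulls out the exponent. It assumes the Prime95-style layout `Keyword=[AID,]exponent[,more fields]`, where the optional first field is a 32-character assignment ID or the literal N/A; the function name worktodo_exponent is made up, and this is not the parser shipped in 2.02.

```c
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

/* Returns the exponent from a "Test=" or "DoubleCheck=" line,
   or 0 if the line is not recognised. */
unsigned long worktodo_exponent(const char *line)
{
  const char *p;
  const char *comma;
  char *end;
  unsigned long expo;

  if (strncmp(line, "Test=", 5) == 0) p = line + 5;
  else if (strncmp(line, "DoubleCheck=", 12) == 0) p = line + 12;
  else return 0;

  /* Assumption: a leading 32-character field or "N/A" is an assignment ID. */
  comma = strchr(p, ',');
  if (comma && ((comma - p) == 32 || strncmp(p, "N/A,", 4) == 0))
    p = comma + 1;

  if (!isdigit((unsigned char)*p)) return 0;
  expo = strtoul(p, &end, 10);

  /* The exponent must be followed by a field separator or end of line. */
  if (*end != ',' && *end != '\0' && *end != '\n' && *end != '\r') return 0;
  return expo;
}
```

For example, `worktodo_exponent("DoubleCheck=29668031,69,1")` returns 29668031 (the trailing fields here are only placeholders), while a bare exponent with no keyword returns 0.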

Any bugs should obviously be reported, but this passed some basic testing, and I'm not too worried since the core of this code is already used in mfaktc.


Code:
bill@Gravemind:~/CUDALucas/test∰∂ cat CUDALucas.ini
# You can use this file to customize CUDALucas without having to create a long
# and complex command. I got tired of having to hit the up arrow a bunch of
# times whenever I rebooted, so I created this. You can set most of the command
# line options here; however, if you do use command line options, they will
# override their corresponding value in this file.

# CheckpointIterations is the same as the -c option; it determines how often
# checkpoints are written and also how often CUDALucas prints to terminal.
CheckpointIterations=10000

# This sets the name of the workfile used by CUDALucas.
WorkFile=worktodo.txt

# Polite is the same as the -polite option. If it's 1, each iteration is
# polite. If it's (for example) 12, then every 12th iteration is polite. Thus
# the higher the number, the less polite the program is. Set to 0 to turn off
# completely. Polite!=0 will incur a slight performance drop, but the screen 
# should be more responsive. Trade responsiveness for performance.
Polite=1

# CheckRoundoffAllIterations is the same as the -t option. When active, each 
# iteration's roundoff error is checked, at the price of a small performance 
# cost. I'm not sure how often it's checked otherwise. This is a binary option;
# set to 1 to activate, 0 to de-activate.
CheckRoundoffAllIterations=0

# SaveAllCheckpoints is the same as the -s option. When active, CUDALucas will
# save each checkpoint separately in the folder specified in the "SaveFolder" 
# option below. This is a binary option; set to 1 to activate, 0 to de-activate.
SaveAllCheckpoints=0

# This option is the name of the folder where the separate checkpoint files are
# saved. This option is only checked if SaveAllCheckpoints is activated.
SaveFolder=savefiles

# Interactive is the same as the -k option. When active, you can press p, t, or
# s to change the respective options while the program is running. p is Polite,
# t is CheckRoundoffAllIterations, and s is the SaveAllCheckpoints feature
# above. This is a binary option; set to 1 to activate, 0 to de-activate.
Interactive=0

# Threads is the same as the -threads option. This sets the number of threads
# used in the FFTs. This must be 32, 64, 128, 256, 512, or 1024. (Some FFT
# lengths have a higher minimum than 32.)
Threads=256

# DeviceNumber is the same as the -d option. Use this to run CUDALucas on a GPU
# other than "the first one". Only useful if you have more than one GPU.
DeviceNumber=0

# FFTLength is the same as the -f option. If this is 0, CUDALucas will 
# autoselect a length for each exponent. Otherwise, you can set this with an
# override length; this length will be used for all exponents in worktodo.txt, 
# which may not be optimal (or even possible). In the future, I would like to 
# both create a better FFT length selection function, as well as be able to 
# specify a length on an individual-exponent basis (probably through a field in
# Test= in the work file). To see a list of reasonable FFT lengths, try running
# "$ CUDALucas -cufftbench 32768 3276800 32768" which will test a large range.
# In my personal experience on a GTX 460, I've found that for 26M exponents, 
# FFTLength=1474560 is a good length. (Technical note: FFT length must be a 
# multiple of 128*threads. See
# http://www.mersenneforum.org/showpost.php?p=292776&postcount=959 )
FFTLength=0
Edit: I probably should have put this in the attached ini file, but polite 0 is known to cause CUDALucas to take some small CPU time; polite 64 seems to be a good compromise.
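
In case the Polite semantics are unclear, the rule in the ini comments ("1 means every iteration, 12 means every 12th, 0 means never") boils down to a modulus test. This is only a sketch of that rule with a made-up function name; what the program actually does on a polite iteration, and how it counts iterations, is not shown here.

```c
/* Returns 1 when iteration number `iter` should be a "polite" one.
   polite == 0 turns the feature off; polite == n means every nth iteration. */
int is_polite_iteration(unsigned int iter, unsigned int polite)
{
  if (polite == 0) return 0;     /* also avoids a modulo-by-zero */
  return iter % polite == 0;
}
```

So Polite=64 means roughly 1 in 64 iterations pays the responsiveness cost, versus every iteration with Polite=1.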

Edit2: Forgot the todo list. Here it is:
1) Add some sort of way to specify the FFT on a per-exponent basis, presumably through some field in "DoubleCheck=...". The Prime95 way of doing it would be rather tedious to parse... anyone have any ideas?
2) I'd like to refine the FFT autoselect function to do better; to that end, at some point over the summer I'll write a script to test a bunch of exponents and record the round off error, then distribute that around here to get as much hardware covered as possible. Then we'd use the data to either create a table of FFT lengths or a reasonably accurate regression of some sort. (Prime95 uses (very large) tables.)
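
Whatever selection scheme ends up being used, any candidate length still has to satisfy the constraint from the ini comments: the FFT length must be a multiple of 128*threads, with threads one of 32, 64, 128, 256, 512, or 1024. A small sketch of that check follows; the function name fft_length_ok is made up, and it ignores the caveat that some FFT lengths need a minimum thread count above 32.

```c
/* Returns 1 if fft_length is a positive multiple of 128*threads and
   threads is a power of two between 32 and 1024, else 0. */
int fft_length_ok(unsigned int fft_length, unsigned int threads)
{
  /* Threads must be 32, 64, 128, 256, 512, or 1024. */
  if (threads < 32 || threads > 1024 || (threads & (threads - 1)) != 0)
    return 0;

  if (fft_length == 0) return 0;   /* 0 means "autoselect", not a real length */

  return fft_length % (128u * threads) == 0;
}
```

For instance, 1474560 = 45 * (128 * 256), so it is valid with Threads=256, but it is not a multiple of 128 * 1024 = 131072, so it would be rejected with Threads=1024.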

Edit3: Anyone who needs a Linux binary can of course just ask me.
Attached Files
File Type: bz2 CUDALucas.2.02.tar.bz2 (17.7 KB, 85 views)

Last fiddled with by Dubslow on 2012-05-27 at 04:47
Old 2012-05-27, 06:50   #1313
flashjh
 
"Jerry"
Nov 2011
Vancouver, WA

1,123 Posts

I'm running into a few issues compiling on Windows. I've been able to get around most, but here is the main stuff:

Code:
 
parse.c(166) : warning C4996: 'strcpy': This function or variable may be unsafe.
 Consider using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
        C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\INCLUDE\string.h(105) : see declaration of 'strcpy'
parse.c(176) : warning C4127: conditional expression is constant
parse.c(191) : error C3861: 'strncasecmp': identifier not found
parse.c(192) : error C3861: 'strncasecmp': identifier not found
parse.c(305) : warning C4127: conditional expression is constant
parse.c(312) : warning C4996: 'strcpy': This function or variable may be unsafe.
 Consider using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
        C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\INCLUDE\string.h(105) : see declaration of 'strcpy'
make: *** [parse.x64.obj] Error 2
I'll look at it more tomorrow, but maybe you can identify the problem?
Old 2012-05-27, 07:42   #1314
Dubslow

Quote:
Originally Posted by flashjh View Post
I'm running into a few issues compiling on Windows. I've been able to get around most, but here is the main stuff:

Code:
 
parse.c(166) : warning C4996: 'strcpy': This function or variable may be unsafe.
 Consider using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
        C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\INCLUDE\string.h(105) : see declaration of 'strcpy'
parse.c(176) : warning C4127: conditional expression is constant
parse.c(191) : error C3861: 'strncasecmp': identifier not found
parse.c(192) : error C3861: 'strncasecmp': identifier not found
parse.c(305) : warning C4127: conditional expression is constant
parse.c(312) : warning C4996: 'strcpy': This function or variable may be unsafe.
 Consider using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
        C:\Program Files (x86)\Microsoft Visual Studio 10.0\VC\INCLUDE\string.h(105) : see declaration of 'strcpy'
make: *** [parse.x64.obj] Error 2
I'll look at it more tomorrow, but maybe you can identify the problem?
Bluh, you'll have to ask TheJudger how he compiles mfaktc for Windows. I'll look through his code tomorrow and look for Windows ifdefs.


Edit: Found one in compatibility.h: #ifdef _MSC_VER; #define strncasecmp _strnicmp (semicolons standing in for newlines). Next stop: makefile.win.

Last fiddled with by Dubslow on 2012-05-27 at 07:49
Old 2012-05-27, 15:33   #1315
flashjh
 

Quote:
Originally Posted by Dubslow View Post
Bluh, you'll have to ask TheJudger how he compiles mfaktc for Windows. I'll look through his code tomorrow and look for Windows ifdefs.


Edit: Found one in compatibility.h: #ifdef _MSC_VER; #define strncasecmp _strnicmp (semicolons standing in for newlines). Next stop: makefile.win.
I probably should have them already. Do you have a link to the mfaktc source files?

I'm pretty sure I need _strnicmp instead of strncasecmp and strcpy_s instead of strcpy; however, I don't know what to do with the 'conditional expression is constant' warning.
Old 2012-05-27, 15:58   #1316
Dubslow

Quote:
Originally Posted by flashjh View Post
I probably should have them already. Do you have a link to the mfaktc source files?
http://mersenneforum.org/mfaktc/ (The unqualified 0.18 would be the source.)
Quote:
Originally Posted by flashjh View Post
I'm pretty sure I need _strnicmp instead of strncasecmp and strcpy_s instead of strcpy,
I don't see anything in the mfaktc source about strcpy_s, but I'll keep looking, and you certainly know MSVC better than I do.
Edit: http://msdn.microsoft.com/en-us/libr...(v=vs.80).aspx
Quote:
Originally Posted by MS
For example, the strcpy function has no way of telling if the string that it is copying is too big for its destination buffer. However, its secure counterpart, strcpy_s, takes the size of the buffer as a parameter, so it can determine if a buffer overrun will occur. If you use strcpy_s to copy eleven characters into a ten-character buffer, that is an error on your part; strcpy_s cannot correct your mistake, but it can detect your error and inform you by invoking the invalid parameter handler.
strcpy_s has a different definition, requiring a third size argument; it's just meant to be a "catch stupid programmer error" thing. Since the code does a check for a long line anyways, the extra functionality is unnecessary.
Code:
bill@Gravemind:~/CUDALucas/test∰∂ cat parse.c
#ifdef _MSC_VER
/* Must be defined before any CRT header is included, or C4996 still fires. */
#define _CRT_SECURE_NO_WARNINGS
#define strncasecmp _strnicmp
#endif

#include <stdio.h>
#include <string.h>
#include <limits.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

int isprime(unsigned int n)
/*
returns
0 if n is composite
1 if n is prime
*/
{
  unsigned int i;
  
  if(n<=1) return 0;
  if(n>2 && n%2==0)return 0;

  i=3;
  while(i*i <= n && i < 0x10000)
  {
    if(n%i==0)return 0;
    i+=2;
  }
  return 1;
}
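
On the strcpy versus strcpy_s point above: if the goal is just to keep a copy from overrunning its buffer, an explicit length check before the copy does the same job and works with any compiler. A minimal sketch, with a made-up helper name (copy_checked) that is not part of parse.c:

```c
#include <string.h>

/* Copies src into dst only if it fits, terminator included.
   Returns 0 on success, -1 if src is too long (dst becomes an empty string). */
int copy_checked(char *dst, size_t dst_size, const char *src)
{
  size_t len = strlen(src);

  if (dst_size == 0) return -1;          /* nowhere to put even the terminator */
  if (len >= dst_size) {                 /* needs len + 1 bytes, we have fewer */
    dst[0] = '\0';
    return -1;
  }
  memcpy(dst, src, len + 1);             /* +1 copies the terminating NUL */
  return 0;
}
```

With a ten-character buffer, a nine-character string (ten bytes with the terminator) is accepted and a ten-character string (eleven bytes) is rejected, which is the same case the MS quote describes.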
Quote:
Originally Posted by flashjh View Post
however, I don't know what to do with the 'conditional expression is constant'.
Oh, that's just a while(1), that warning can be ignored. (Perhaps deleting the 1 and leaving the conditional blank will work?) Edit: Actually, in further looking, I actually have no clue why that loop is there at all. If the func finds a too-long line, then it reads in the characters until a control character then breaks? Wtf?
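
Two notes on that loop, offered as a sketch and not as a description of the mfaktc code. First, an empty `while()` does not compile in C; `for (;;)` is the usual infinite-loop spelling and is the form MSVC's C4127 documentation suggests to avoid the warning. Second, a common reason to read and discard characters after a too-long line is so that the next read starts at the beginning of the following line instead of in the middle of the oversized one. The function name read_line and its return codes are made up for illustration:

```c
#include <stdio.h>
#include <string.h>

/* Reads one line into buf, without the trailing newline.
   Returns 1 on success, 0 at end of file, -1 if the line was too long
   (in which case the rest of that line has been discarded). */
int read_line(FILE *f, char *buf, size_t size)
{
  int c;
  size_t len;

  if (fgets(buf, (int)size, f) == NULL) return 0;

  len = strlen(buf);
  if (len > 0 && buf[len - 1] == '\n') {
    buf[len - 1] = '\0';
    return 1;
  }

  /* No newline in the buffer: either the line fit exactly, or it overflowed. */
  c = fgetc(f);
  if (c == '\n' || c == EOF) return 1;

  /* Overflow: swallow the remainder so the next call starts on a fresh line. */
  for (;;) {
    c = fgetc(f);
    if (c == '\n' || c == EOF) break;
  }
  return -1;
}
```

Without the discard loop, the tail of the long line would be returned by the next call and parsed as if it were a separate line.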


PS I forgot to mention above, but the help message was moved from no-args to the -h arg.
PPS Here's a slightly larger .ini file:
Code:
# Polite is the same as the -polite option. If it's 1, each iteration is
# polite. If it's (for example) 12, then every 12th iteration is polite. Thus
# the higher the number, the less polite the program is. Set to 0 to turn off
# completely. Polite!=0 will incur a slight performance drop, but the screen 
# should be more responsive. Trade responsiveness for performance. (Note:
# polite=0 is known to cause CUDALucas to use some extra CPU time; Polite=64 or
# higher is a good compromise.)
Polite=1
flash, could you add this to your copy before you bundle the executable?

The attached archive contains the modified parse.c and CUDALucas.ini.
Attached Files
File Type: bz2 CUDALucas.2.02.tar.bz2 (17.7 KB, 89 views)

Last fiddled with by Dubslow on 2012-05-27 at 16:27
Old 2012-05-31, 04:21   #1317
Batalov
 
"Serge"
Mar 2008
Phi(4,2^7658614+1)/2

2×4,787 Posts

OT is moved to a separate thread
Old 2012-05-31, 05:30   #1318
flashjh
 

Quote:
Originally Posted by Dubslow View Post
http://mersenneforum.org/mfaktc/ (The unqualified 0.18 would be the source.)

I don't see anything in the mfaktc source about strcpy_s, but I'll keep looking, and you certainly know MSVC better than I do.
Edit: http://msdn.microsoft.com/en-us/libr...(v=vs.80).aspx

strcpy_s has a different definition, requiring a third size argument; it's just meant to be a "catch stupid programmer error" thing. Since the code does a check for a long line anyways, the extra functionality is unnecessary.
Code:
bill@Gravemind:~/CUDALucas/test∰∂ cat parse.c
#ifdef _MSC_VER
/* Must be defined before any CRT header is included, or C4996 still fires. */
#define _CRT_SECURE_NO_WARNINGS
#define strncasecmp _strnicmp
#endif

#include <stdio.h>
#include <string.h>
#include <limits.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
 
int isprime(unsigned int n)
/*
returns
0 if n is composite
1 if n is prime
*/
{
  unsigned int i;
 
  if(n<=1) return 0;
  if(n>2 && n%2==0)return 0;
 
  i=3;
  while(i*i <= n && i < 0x10000)
  {
    if(n%i==0)return 0;
    i+=2;
  }
  return 1;
}
Oh, that's just a while(1), that warning can be ignored. (Perhaps deleting the 1 and leaving the conditional blank will work?) Edit: Actually, in further looking, I actually have no clue why that loop is there at all. If the func finds a too-long line, then it reads in the characters until a control character then breaks? Wtf?


PS I forgot to mention above, but the help message was moved from no-args to the -h arg.
PPS Here's a slightly larger .ini file:
Code:
# Polite is the same as the -polite option. If it's 1, each iteration is
# polite. If it's (for example) 12, then every 12th iteration is polite. Thus
# the higher the number, the less polite the program is. Set to 0 to turn off
# completely. Polite!=0 will incur a slight performance drop, but the screen 
# should be more responsive. Trade responsiveness for performance. (Note:
# polite=0 is known to cause CUDALucas to use some extra CPU time; Polite=64 or
# higher is a good compromise.)
Polite=1
flash, could you add this to your copy before you bundle the executable?

The attached archive contains the modified parse.c and CUDALucas.ini.
Attached is CUDALucas 2.02 - UNTESTED. I included all the source files and the MAKEFILE for Windows. I switched to the 'safe' functions so MSVC would not complain**. The two while(1) lines cause a warning, but they're safe to ignore. There is no CUDA 3.2 build because there were too many errors during compile for me to fix tonight.

@Dubslow: This build is based on your newest archive from earlier this afternoon. **I can't test the changes because I remoted into my system to do the build and CuLu doesn't detect my nVidia card while in a remote session.

I'll test it tomorrow; if anyone else can test it and let me know if it performs as expected, that would be great -- thanks.
Attached Files
File Type: zip CUDALucas2.02x64.zip (184.0 KB, 84 views)

Last fiddled with by flashjh on 2012-05-31 at 05:32
Old 2012-05-31, 05:37   #1319
Dubslow

Quote:
Originally Posted by flashjh View Post
Attached is CUDALucas 2.02 - UNTESTED. I included all the source files and the MAKEFILE for windows. I modified to 'safe' functions so MSVC would not complain**. The two while(1) lines cause a warning, but they're safe to ignore. No CUDA 3.2 because there were too many errors during compile for me to fix tonight.

@Dubslow: This build is based on your newest archive from earlier this afternoon. **I can't test the changes because I remoted into my system to do the build and CuLu doesn't detect my nVidia card while in a remote session.

I'll test it tomorrow; if anyone else can test it and let me know if it performs as expected, that would be great -- thanks.
Heehee, I've been modifying the source with trifling differences for much of the evening, and that DL was symlinked to my files. I'll see how late it was.
I've completed one expo with 2.02 myself, and another one due in a few hours. I'll come back and edit this post with a source archive that compiles "out of the box" on both Windows and Linux.

Edit: Sheesh, where did you get that makefile? In all 2.x versions I've seen from msft, the makefiles he's included all had arch=sm_13 and -O2; this win makefile has a whole bunch of stuff that is (AFAICT) unnecessary. How did you even get it to compile with arch not sm_13? The change msft made in 2.01 requires sm_13 (at least, nvcc threw me an error when I tried later arches; of course, now it works fine), and we've figured out it's the fastest.
Code:
CUFLAGS = -m64 --ptxas-options=-v -ccbin=$(CCLOC) -D$(BIT)  -Xcompiler /EHsc,/W3,/nologo,/Ox,/Oy,/GL -arch=$(CUDA_ARCH) -DMERS_PACKAGE -DBIT_SIEVE -DTESTING_SMALL_EXPONENTS -DSIEVE_SIZE_IN_BYTES=32 -DNUM_SMALL_PRIMES=32768 -DDO_NOT_USE_LONG_DOUBLE  "-I$(CUDA)/include"  -D__x86_64__ -O3
I'm pretty sure if you did a Ctrl+F on the defines in the source, you wouldn't find anything. Here's my CUFLAGS:
Code:
CUFLAGS = -O2 -arch=sm_13 --compiler-options=-Wall
The first two flags are from msft's makefile, and the last one is the one I added which produced warnings.

Is this win makefile a relic of MacLucas?

Last fiddled with by Dubslow on 2012-05-31 at 06:12 Reason: s/two are/two flags are/
Old 2012-05-31, 05:45   #1320
flashjh
 

Quote:
Originally Posted by Dubslow View Post
Heehee, I've been modifying the source with trifling differences for much of the evening, and that DL was symlinked to my files. I'll see how late it was
I've completed one expo with 2.02 myself, and another one due in a few hours. I'll come back and edit this post with a source archive that compiles "out of the box" on both Windows and Linux.
I downloaded the file @ 17:23 MDT

Edit: Can you compile/run Windows versions?

Last fiddled with by flashjh on 2012-05-31 at 05:46
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.