mersenneforum.org  

2021-08-28, 22:06   #23
WraithX
Mar 2006
3·173 Posts

Quote:
Originally Posted by henryzz
Although with B1=20000 I still get:
Code:
echo "(2^499-1)/20959" | ./ecm -gpu -cgbn -gpucurves 3584 -sigma 3:1000 20000
GMP-ECM 7.0.5-dev [configured with GMP 6.2.99, --enable-asm-redc, --enable-gpu, --enable-assert, --enable-openmp] [ECM]
Input number is (2^499-1)/20959 (146 digits)
Using B1=20000, B2=3804582, sigma=3:1000-3:4583 (3584 curves)
CUDA error (702) occurred: the launch timed out and was terminated
While running cudaDeviceSynchronize()   (file cgbn_stage1.cu, line 731)
What happens if you specify 0 for B2? Like this:
Code:
echo "(2^499-1)/20959" | ./ecm -gpu -cgbn -gpucurves 3584 -sigma 3:1000 20000 0

2021-08-28, 22:40   #24
henryzz
Just call me Henry
"David"
Sep 2007
Liverpool (GMT/BST)
5992₁₀ Posts

Quote:
Originally Posted by WraithX
What happens if you specify 0 for B2? Like this:
Code:
echo "(2^499-1)/20959" | ./ecm -gpu -cgbn -gpucurves 3584 -sigma 3:1000 20000 0
The same thing.


If I run fewer curves at once it works. Possibly it's just that my GPU is pathetic (750 Ti):
Code:
echo "(2^499-1)/20959" | ./ecm -gpu -cgbn -sigma 3:1000 20000
GMP-ECM 7.0.5-dev [configured with GMP 6.2.99, --enable-asm-redc, --enable-gpu, --enable-assert, --enable-openmp] [ECM]
Input number is (2^499-1)/20959 (146 digits)
Using B1=20000, B2=3804582, sigma=3:1000-3:1319 (320 curves)
Computing 320 Step 1 took 756ms of CPU time / 1269ms of GPU time
Computing 320 Step 2 on CPU took 7488ms

Last fiddled with by henryzz on 2021-08-28 at 22:42

2021-08-28, 22:44   #25
SethTro
"Seth"
Apr 2019
19×23 Posts

You might try making this change in cgbn_stage1.cu:

-#define S_BITS_PER_CALL 10000
+#define S_BITS_PER_CALL 1000

and then running with -v, which might tell you when the GPU died (the smaller value might also prevent timeouts).



Code:
$ echo "(2^499-1)/20959" | ./ecm -v -gpu -cgbn -gpucurves 3584 -sigma 3:1000 20000 0
GMP-ECM 7.0.5-dev [configured with GMP 6.2.99, --enable-asm-redc, --enable-gpu, --enable-assert] [ECM]
Input number is (2^499-1)/20959 (146 digits)
GPU: will use device 0: GeForce GTX 1080 Ti, compute capability 6.1, 28 MPs.
Using B1=20000, B2=0, sigma=3:1000-3:4583 (3584 curves)
Running CGBN<512,4> kernel<112,128> at bit 0/28820 (0.0%)...
Running CGBN<512,4> kernel<112,128> at bit 1000/28820 (3.5%)...
...
Running CGBN<512,4> kernel<112,128> at bit 27000/28820 (93.7%)...
Running CGBN<512,4> kernel<112,128> at bit 28000/28820 (97.2%)...
Copying results back to CPU ...
Computing 3584 Step 1 took 15ms of CPU time / 1105ms of GPU time
Throughput: 3244.848 curves per second (on average 0.31ms per Step 1)
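
The effect of S_BITS_PER_CALL can be sketched on the host side as below. This is a rough illustration and not the actual loop in cgbn_stage1.cu: `BitRange` and `plan_kernel_calls` are hypothetical names. It assumes, as the -v output above suggests (progress reported every 1000 bits out of 28820), that stage 1 is split into separate kernel launches of at most S_BITS_PER_CALL bits each. Shorter launches are less likely to hit the launch timeout that CUDA error 702 reports.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper type: one entry per kernel launch.
struct BitRange {
    uint32_t begin;  // first bit handled by this launch
    uint32_t end;    // one past the last bit handled by this launch
};

// Split total_bits of stage-1 work into launches of at most bits_per_call bits.
// Assumption: each launch covers one contiguous range of bits.
std::vector<BitRange> plan_kernel_calls(uint32_t total_bits, uint32_t bits_per_call) {
    assert(bits_per_call > 0);
    std::vector<BitRange> calls;
    for (uint32_t start = 0; start < total_bits; start += bits_per_call) {
        calls.push_back({start, std::min(start + bits_per_call, total_bits)});
    }
    return calls;
}
```

With the 28820 bits from the log, `plan_kernel_calls(28820, 1000)` gives 29 launches where a value of 10000 gives 3, so each launch does roughly a tenth of the work.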

2021-08-28, 22:50   #26
frmky
Jul 2003
So Cal
100110000000₂ Posts

Quote:
Originally Posted by SethTro
Glad you got a working binary! Would you mind measuring the speedup of echo "2^997-1" with -gpu vs -cgbn?
Code:
$ echo "(2^997-1)" | ./ecm -gpu -sigma 3:1000 20000 0
GMP-ECM 7.0.5-dev [configured with GMP 6.2.1, --enable-asm-redc, --enable-gpu, --enable-assert] [ECM]
Input number is (2^997-1) (301 digits)
Using B1=20000, B2=0, sigma=3:1000-3:6119 (5120 curves)
GPU: Block: 32x32x1 Grid: 160x1x1 (5120 parallel curves)
Computing 5120 Step 1 took 183ms of CPU time / 5364ms of GPU time

$ echo "(2^997-1)" | ./ecm -gpu -cgbn -sigma 3:1000 20000 0
GMP-ECM 7.0.5-dev [configured with GMP 6.2.1, --enable-asm-redc, --enable-gpu, --enable-assert] [ECM]
Input number is (2^997-1) (301 digits)
Using B1=20000, B2=0, sigma=3:1000-3:6119 (5120 curves)
Computing 5120 Step 1 took 1284ms of CPU time / 3057ms of GPU time
I'll try the configure changes later. Overnight I ran 2560 stage-1 curves on the C201 blocking the aliquot sequence starting at 3366 using B1=85e7. I'm working through stage 2 on those now.
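
For reference, the speedup asked about can be read off the two runs above. The helpers below are only a small sketch of that arithmetic (`speedup` and `curves_per_second` are hypothetical names, not part of GMP-ECM), and they compare GPU time only.

```cpp
#include <cassert>

// Ratio of two timings for the same amount of work (here: 5120 stage-1 curves).
double speedup(double baseline_ms, double candidate_ms) {
    assert(candidate_ms > 0.0);
    return baseline_ms / candidate_ms;
}

// Curves completed per second of GPU time.
double curves_per_second(unsigned curves, double gpu_ms) {
    assert(gpu_ms > 0.0);
    return curves * 1000.0 / gpu_ms;
}
```

`speedup(5364, 3057)` is about 1.75, so for this 301-digit input -cgbn takes roughly 1.75x less GPU time than plain -gpu. The CPU time went the other way (183ms vs 1284ms).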

2021-08-28, 23:23   #27
frmky
Jul 2003
So Cal
2⁷·19 Posts

Those changes to acinclude.m4 aren't enough. It still can't find gmp.h during the test compile. We need to add a -I for the gmp include directory. And that breaks the build since it's trying to include libgmp.a during compile.

Last fiddled with by frmky on 2021-08-28 at 23:27

2021-08-29, 07:08   #28
henryzz
Just call me Henry
"David"
Sep 2007
Liverpool (GMT/BST)
2³·7·107 Posts

Reducing S_BITS_PER_CALL has fixed it for me. Thank you 😀

2021-08-29, 12:53   #29
Gimarel
Apr 2010
11100100₂ Posts

Current git fails for inputs near 512 bits. It seems that a condition is the wrong way around:
Code:
diff --git a/cgbn_stage1.cu b/cgbn_stage1.cu
index 1b512ecd..f67f8715 100644
--- a/cgbn_stage1.cu
+++ b/cgbn_stage1.cu
@@ -653,7 +653,7 @@ int run_cgbn(mpz_t *factors, int *array_stage_found,
 #endif /* IS_DEV_BUILD */
   for (int k_i = 0; k_i < available_kernels.size(); k_i++) {
     uint32_t kernel_bits = available_kernels[k_i];
-    if (kernel_bits + 6 >=  mpz_sizeinbase(N, 2)) {
+    if (kernel_bits >=  mpz_sizeinbase(N, 2) + 6) {
       BITS = kernel_bits;
       assert( BITS % 32 == 0 );
       TPI = (BITS <= 512) ? 4 : (BITS <= 2048) ? 8 : (BITS <= 8192) ? 16 : 32;

2021-08-29, 22:55   #30
SethTro
"Seth"
Apr 2019
19·23 Posts

Quote:
Originally Posted by Gimarel
Current git fails for inputs near 512 bits. It seems that a condition is the wrong way around:
Code:
diff --git a/cgbn_stage1.cu b/cgbn_stage1.cu
index 1b512ecd..f67f8715 100644
--- a/cgbn_stage1.cu
+++ b/cgbn_stage1.cu
@@ -653,7 +653,7 @@ int run_cgbn(mpz_t *factors, int *array_stage_found,
 #endif /* IS_DEV_BUILD */
   for (int k_i = 0; k_i < available_kernels.size(); k_i++) {
     uint32_t kernel_bits = available_kernels[k_i];
-    if (kernel_bits + 6 >=  mpz_sizeinbase(N, 2)) {
+    if (kernel_bits >=  mpz_sizeinbase(N, 2) + 6) {
       BITS = kernel_bits;
       assert( BITS % 32 == 0 );
       TPI = (BITS <= 512) ? 4 : (BITS <= 2048) ? 8 : (BITS <= 8192) ? 16 : 32;
Whoops, totally backwards, coding is hard :p I'll fix it tonight.
Thanks for testing
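
The corrected check from the diff can be exercised in isolation with a sketch like the one below. It is a stand-in for the loop in run_cgbn(), not a copy of it: `select_kernel_bits`, `old_condition` and `threads_per_instance` are hypothetical names, `n_bits` plays the role of `mpz_sizeinbase(N, 2)`, the kernel list is assumed to be sorted in increasing order, and returning the first match is an assumption. The 6 spare bits are taken from the diff; the thread does not say why they are needed.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Returns the first kernel width that leaves at least 6 spare bits above
// n_bits, or 0 if no listed kernel is wide enough.
uint32_t select_kernel_bits(const std::vector<uint32_t>& available_kernels,
                            uint32_t n_bits) {
    for (uint32_t kernel_bits : available_kernels) {
        assert(kernel_bits % 32 == 0);
        if (kernel_bits >= n_bits + 6) {  // corrected condition from the diff
            return kernel_bits;
        }
    }
    return 0;
}

// The pre-fix condition, kept only for comparison.
bool old_condition(uint32_t kernel_bits, uint32_t n_bits) {
    return kernel_bits + 6 >= n_bits;
}

// Threads per instance, following the TPI expression shown in the diff.
uint32_t threads_per_instance(uint32_t bits) {
    return (bits <= 512) ? 4 : (bits <= 2048) ? 8 : (bits <= 8192) ? 16 : 32;
}
```

Assuming a kernel list of {512, 1024}, a 510-bit input now selects 1024. The old condition accepted the 512-bit kernel for anything up to 518 bits, which is wider than the kernel itself and would explain the failures near 512 bits.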

2021-08-30, 16:06   #31
chris2be8
Sep 2009
93D₁₆ Posts

Has anyone checked ecm-cgbn can find factors? On my system with a sm_30 GPU I updated test.gpuecm to pass -cgbn to ecm. But it failed to find any factors when the test cases expected them to be found!

It is *probably* because sm_30 is too low for CGBN.

It will be a while before I can test my newer GPU. The system it's on is running an old version of Linux which doesn't support CUDA 9.0. (I've been working on an "if it works, don't fix it" basis since it's only used for computations.) Upgrading Linux will probably need a complete re-install, which I'll need to plan for a time when I don't need it for a few hours/days. And I'd be happier if I was sure CGBN would work once I got it installed.

2021-08-30, 18:51   #32
SethTro
"Seth"
Apr 2019
19×23 Posts

Quote:
Originally Posted by chris2be8
Has anyone checked ecm-cgbn can find factors? On my system with a sm_30 GPU I updated test.gpuecm to pass -cgbn to ecm. But it failed to find any factors when the test cases expected them to be found!

It is *probably* because sm_30 is too low for CGBN.

It will be a while before I can test my newer GPU. The system it's on is running an old version of Linux which doesn't support CUDA 9.0. (I've been working on an "if it works, don't fix it" basis since it's only used for computations.) Upgrading Linux will probably need a complete re-install, which I'll need to plan for a time when I don't need it for a few hours/days. And I'd be happier if I was sure CGBN would work once I got it installed.
Yes, many of us have found the same test factor for (2^499-1)/20959, and I've verified several times that the residuals exactly match those produced by `-gpu`. I've also tested with `$ sage check_gpuecm.sage "./ecm -cgbn"`

Last fiddled with by SethTro on 2021-08-30 at 18:51

2021-08-30, 18:52   #33
frmky
Jul 2003
So Cal
2⁷×19 Posts

Quote:
Originally Posted by chris2be8
Has anyone checked ecm-cgbn can find factors?
Yes, test.gpuecm completes successfully both with and without -cgbn. I'm using a V100 with CUDA 11.3.