[QUOTE=henryzz;586760]
Although with B1=20000 I still get:
[CODE]echo "(2^499-1)/20959" | ./ecm -gpu -cgbn -gpucurves 3584 -sigma 3:1000 20000
GMP-ECM 7.0.5-dev [configured with GMP 6.2.99, --enable-asm-redc, --enable-gpu, --enable-assert, --enable-openmp] [ECM]
Input number is (2^499-1)/20959 (146 digits)
Using B1=20000, B2=3804582, sigma=3:1000-3:4583 (3584 curves)
CUDA error (702) occurred: the launch timed out and was terminated
While running cudaDeviceSynchronize() (file cgbn_stage1.cu, line 731)[/CODE][/QUOTE]
What happens if you specify 0 for B2? Like this:
[CODE]echo "(2^499-1)/20959" | ./ecm -gpu -cgbn -gpucurves 3584 -sigma 3:1000 20000 0[/CODE] |
[QUOTE=WraithX;586761]What happens if you specify 0 for B2? Like this:
[CODE]echo "(2^499-1)/20959" | ./ecm -gpu -cgbn -gpucurves 3584 -sigma 3:1000 20000 0[/CODE][/QUOTE]
The same thing. If I run fewer curves at once it works. Possibly it's just that my GPU is pathetic (750 Ti):
[CODE]echo "(2^499-1)/20959" | ./ecm -gpu -cgbn -sigma 3:1000 20000
GMP-ECM 7.0.5-dev [configured with GMP 6.2.99, --enable-asm-redc, --enable-gpu, --enable-assert, --enable-openmp] [ECM]
Input number is (2^499-1)/20959 (146 digits)
Using B1=20000, B2=3804582, sigma=3:1000-3:1319 (320 curves)
Computing 320 Step 1 took 756ms of CPU time / 1269ms of GPU time
Computing 320 Step 2 on CPU took 7488ms[/CODE] |
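Since running fewer curves per launch works on the 750 Ti, one workaround is to split a large run into smaller batches with consecutive sigma ranges. The logs above show the ranges are contiguous (3584 curves from 3:1000 end at 3:4583, and 320 curves end at 3:1319). Below is a minimal sketch of that bookkeeping. `SigmaBatch` and `split_curves` are hypothetical names, not part of GMP-ECM, and the thread does not confirm that ecm accepts every batch size on every GPU.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One batch of curves, identified by an inclusive sigma range.
struct SigmaBatch {
    uint64_t first_sigma;
    uint64_t last_sigma;
};

// Split `total_curves` curves starting at `first_sigma` into batches of at
// most `batch_size` curves each. Hypothetical helper, not GMP-ECM code.
std::vector<SigmaBatch> split_curves(uint64_t first_sigma,
                                     uint64_t total_curves,
                                     uint64_t batch_size) {
    assert(batch_size > 0);
    std::vector<SigmaBatch> batches;
    uint64_t done = 0;
    while (done < total_curves) {
        // The last batch may be smaller than batch_size.
        uint64_t n = std::min(batch_size, total_curves - done);
        batches.push_back({first_sigma + done, first_sigma + done + n - 1});
        done += n;
    }
    return batches;
}
```

Each batch would then be a separate ecm run with `-sigma 3:<first_sigma>` and a matching `-gpucurves` count.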
You might try changing this in cgbn_stage1.cu:
-#define S_BITS_PER_CALL 10000
+#define S_BITS_PER_CALL 1000
then running with -v, which might tell you when the GPU died (and might also prevent timeouts).
[CODE]$ echo "(2^499-1)/20959" | ./ecm -v -gpu -cgbn -gpucurves 3584 -sigma 3:1000 20000 0
GMP-ECM 7.0.5-dev [configured with GMP 6.2.99, --enable-asm-redc, --enable-gpu, --enable-assert] [ECM]
Input number is (2^499-1)/20959 (146 digits)
GPU: will use device 0: GeForce GTX 1080 Ti, compute capability 6.1, 28 MPs.
Using B1=20000, B2=0, sigma=3:1000-3:4583 (3584 curves)
Running CGBN<512,4> kernel<112,128> at bit 0/28820 (0.0%)...
Running CGBN<512,4> kernel<112,128> at bit 1000/28820 (3.5%)...
...
Running CGBN<512,4> kernel<112,128> at bit 27000/28820 (93.7%)...
Running CGBN<512,4> kernel<112,128> at bit 28000/28820 (97.2%)...
Copying results back to CPU ...
Computing 3584 Step 1 took 15ms of CPU time / 1105ms of GPU time
Throughput: 3244.848 curves per second (on average 0.31ms per Step 1)
[/CODE] |
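To see why a smaller S_BITS_PER_CALL could help with the "launch timed out" error, it is useful to count how many kernel launches the stage-1 work is split into. The sketch below assumes each launch handles at most S_BITS_PER_CALL bits of the 28820-bit total shown in the verbose log; `launches_needed` is a hypothetical name, not a function from cgbn_stage1.cu.

```cpp
#include <cassert>
#include <cstdint>

// Number of kernel launches needed when each launch processes at most
// `bits_per_call` bits of the stage-1 work (ceiling division).
// Hypothetical helper for illustration only.
uint32_t launches_needed(uint32_t total_bits, uint32_t bits_per_call) {
    assert(bits_per_call > 0);
    return (total_bits + bits_per_call - 1) / bits_per_call;
}
```

With 28820 bits this gives 3 launches at 10000 bits per call and 29 at 1000, matching the progress lines from bit 0 to bit 28000 in the log. Each launch is then roughly ten times shorter, which presumably keeps a slow card under the driver's launch time limit.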
[QUOTE=SethTro;586711]Glad you got a working binary! Would you mind measuring the speedup of echo "2^997-1" with -gpu vs -cgbn?[/QUOTE]
[CODE]$ echo "(2^997-1)" | ./ecm -gpu -sigma 3:1000 20000 0
GMP-ECM 7.0.5-dev [configured with GMP 6.2.1, --enable-asm-redc, --enable-gpu, --enable-assert] [ECM]
Input number is (2^997-1) (301 digits)
Using B1=20000, B2=0, sigma=3:1000-3:6119 (5120 curves)
GPU: Block: 32x32x1 Grid: 160x1x1 (5120 parallel curves)
Computing 5120 Step 1 took 183ms of CPU time / 5364ms of GPU time

$ echo "(2^997-1)" | ./ecm -gpu -cgbn -sigma 3:1000 20000 0
GMP-ECM 7.0.5-dev [configured with GMP 6.2.1, --enable-asm-redc, --enable-gpu, --enable-assert] [ECM]
Input number is (2^997-1) (301 digits)
Using B1=20000, B2=0, sigma=3:1000-3:6119 (5120 curves)
Computing 5120 Step 1 took 1284ms of CPU time / 3057ms of GPU time
[/CODE]
I'll try the configure changes later. Overnight I ran 2560 stage-1 curves on the C201 blocking the aliquot sequence starting at 3366 using B1=85e7. I'm working through stage 2 on those now. |
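For anyone comparing the two runs above, the GPU-time figures can be turned into a speedup and a throughput with a couple of one-liners. This is only a sketch of the arithmetic; `speedup` and `curves_per_second` are hypothetical helper names, and it uses the rounded millisecond values printed by ecm.

```cpp
#include <cassert>

// Ratio of two timings; values above 1 mean the second run was faster.
double speedup(double old_ms, double new_ms) {
    assert(new_ms > 0.0);
    return old_ms / new_ms;
}

// Curves processed per second for a batch that took `gpu_ms` milliseconds.
double curves_per_second(unsigned curves, double gpu_ms) {
    assert(gpu_ms > 0.0);
    return curves * 1000.0 / gpu_ms;
}
```

Plugging in 5364ms (-gpu) and 3057ms (-cgbn) for the same 5120 curves gives a GPU-time speedup of about 1.75x. Note that the reported CPU time went the other way, from 183ms to 1284ms.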
Those changes to acinclude.m4 aren't enough. It still can't find gmp.h during the test compile. We need to add a -I for the gmp include directory. And that breaks the build since it's trying to include libgmp.a during compile.
|
Reducing S_BITS_PER_CALL has fixed it for me. Thank you 😀
|
Current git fails for inputs near 512 bits. It seems that there is a condition the wrong way:
[CODE]diff --git a/cgbn_stage1.cu b/cgbn_stage1.cu
index 1b512ecd..f67f8715 100644
--- a/cgbn_stage1.cu
+++ b/cgbn_stage1.cu
@@ -653,7 +653,7 @@ int run_cgbn(mpz_t *factors, int *array_stage_found,
 #endif /* IS_DEV_BUILD */
   for (int k_i = 0; k_i < available_kernels.size(); k_i++) {
     uint32_t kernel_bits = available_kernels[k_i];
-    if (kernel_bits + 6 >= mpz_sizeinbase(N, 2)) {
+    if (kernel_bits >= mpz_sizeinbase(N, 2) + 6) {
       BITS = kernel_bits;
       assert( BITS % 32 == 0 );
       TPI = (BITS <= 512) ? 4 : (BITS <= 2048) ? 8 : (BITS <= 8192) ? 16 : 32;[/CODE] |
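To make the effect of the swapped condition concrete, here is a small standalone sketch of the kernel selection from the diff. The kernel list `{512, 1024, 2048}` used in the note below is an assumption for illustration, and `pick_kernel_bits`, `old_condition` and `threads_per_instance` are hypothetical names. Only the two comparisons and the TPI ternary chain are taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Return the first kernel size (in bits) that leaves at least `spare_bits`
// of headroom above an n_bits-wide input, or 0 if none fits.
// This mirrors the corrected condition from the diff.
uint32_t pick_kernel_bits(const std::vector<uint32_t>& kernels,
                          uint32_t n_bits, uint32_t spare_bits = 6) {
    for (uint32_t kernel_bits : kernels) {
        assert(kernel_bits % 32 == 0);
        if (kernel_bits >= n_bits + spare_bits) return kernel_bits;
    }
    return 0;
}

// The pre-patch test, kept only to show which inputs it mis-sized.
bool old_condition(uint32_t kernel_bits, uint32_t n_bits) {
    return kernel_bits + 6 >= n_bits;
}

// Threads per instance, copied from the ternary chain in the diff.
uint32_t threads_per_instance(uint32_t bits) {
    return (bits <= 512) ? 4 : (bits <= 2048) ? 8 : (bits <= 8192) ? 16 : 32;
}
```

With the old test, a 515-bit input would still be assigned the 512-bit kernel because 512 + 6 >= 515. With the corrected test, anything above 506 bits moves up to the next size, so a 510-bit input gets 1024 in this assumed list.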
[QUOTE=Gimarel;586805]Current git fails for inputs near 512 Bits. It seems that there is a condition the wrong way:[CODE]diff --git a/cgbn_stage1.cu b/cgbn_stage1.cu
index 1b512ecd..f67f8715 100644
--- a/cgbn_stage1.cu
+++ b/cgbn_stage1.cu
@@ -653,7 +653,7 @@ int run_cgbn(mpz_t *factors, int *array_stage_found,
 #endif /* IS_DEV_BUILD */
   for (int k_i = 0; k_i < available_kernels.size(); k_i++) {
     uint32_t kernel_bits = available_kernels[k_i];
-    if (kernel_bits + 6 >= mpz_sizeinbase(N, 2)) {
+    if (kernel_bits >= mpz_sizeinbase(N, 2) + 6) {
       BITS = kernel_bits;
       assert( BITS % 32 == 0 );
       TPI = (BITS <= 512) ? 4 : (BITS <= 2048) ? 8 : (BITS <= 8192) ? 16 : 32;[/CODE][/QUOTE]
Whoops, totally backwards, coding is hard :p I'll fix it tonight. Thanks for testing. |
Has anyone checked ecm-cgbn can find factors? On my system with a sm_30 GPU I updated test.gpuecm to pass -cgbn to ecm. But it failed to find any factors when the test cases expected them to be found!
It is *probably* because sm_30 is too low for CGBN. It will be a while before I can test my newer GPU. The system it's on is running an old version of Linux which doesn't support CUDA 9.0. (I've been working on an "if it works don't fix it" basis since it's only used for computations.) Upgrading Linux will probably need a complete re-install which I'll need to plan for a time when I don't need it for a few hours/days. And I'd be happier if I was sure CGBN would work once I got it installed. |
[QUOTE=chris2be8;586868]Has anyone checked ecm-cgbn can find factors? On my system with a sm_30 GPU I updated test.gpuecm to pass -cgbn to ecm. But it failed to find any factors when the test cases expected them to be found!
It is *probably* because sm_30 is too low for CGBN. It will be a while before I can test my newer GPU. The system it's on is running an old version of Linux which doesn't support CUDA 9.0. (I've been working on a "if it works don't fix it" base since it's only used for computations.) Upgrading Linux will probably need a complete re-install which I'll need to plan for a time when I don't need it for a few hours/days. And I'd be happier if I was sure CGBN would work once I got it installed.[/QUOTE]
Yes, many of us have found the same test factor for (2^499-1)/20959, and I've verified several times that the residuals exactly match those produced by `-gpu`. I've also tested with `$ sage check_gpuecm.sage "./ecm -cgbn"` |
[QUOTE=chris2be8;586868]Has anyone checked ecm-cgbn can find factors?[/QUOTE]
Yes, test.gpuecm completes successfully both with and without -cgbn. I'm using a V100 with CUDA 11.3. |