mersenneforum.org  

Showing results 1 to 25 of 34
Search: Posts Made By: SethTro
Forum: Factoring 2021-11-12, 04:04
Replies: 101
Views: 7,368
Posted By SethTro
I spent a good part of this week trying to...

I spent a good part of this week trying to implement fast squaring for CGBN. Ultimately my code (https://github.com/NVlabs/CGBN/issues/19#issuecomment-966779554) was 10% slower and still had breaking...
Forum: Factoring 2021-11-04, 10:12
Replies: 101
Views: 7,368
Posted By SethTro
I was playing around with CGBN today and I...

I was playing around with CGBN today and I realized that it doesn't use fast squaring (https://github.com/NVlabs/CGBN/blob/master/include/cgbn/impl_cuda.cu#L1033). In GMP fast squaring this yields a...
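
For readers unfamiliar with the term, here is a minimal Python sketch of why a dedicated squaring routine can do less work than a general multiply: each cross product a[i]*a[j] with i != j appears twice in a square, so it can be computed once and doubled. This is not CGBN's or GMP's code. The function names (`to_limbs`, `schoolbook_mul`, `schoolbook_sqr`) and the 32-bit limb size are assumptions made for illustration only.

```python
BASE = 2**32  # assumed limb size, for illustration only


def to_limbs(x, n):
    """Split a non-negative integer into n little-endian limbs."""
    limbs = []
    for _ in range(n):
        x, low = divmod(x, BASE)
        limbs.append(low)
    return limbs


def from_limbs(limbs):
    """Rebuild an integer from little-endian limbs."""
    return sum(v * BASE**i for i, v in enumerate(limbs))


def _carry(columns):
    """Propagate carries so every limb is below BASE."""
    out, carry = [], 0
    for v in columns:
        carry, low = divmod(v + carry, BASE)
        out.append(low)
    return out


def schoolbook_mul(a, b):
    """General multiply: n*m limb products. Returns (limbs, product count)."""
    cols = [0] * (len(a) + len(b))
    count = 0
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            cols[i + j] += x * y
            count += 1
    return _carry(cols), count


def schoolbook_sqr(a):
    """Squaring: each off-diagonal product is computed once and doubled."""
    n = len(a)
    cols = [0] * (2 * n)
    count = 0
    for i in range(n):
        cols[2 * i] += a[i] * a[i]          # diagonal term
        count += 1
        for j in range(i + 1, n):
            cols[i + j] += 2 * a[i] * a[j]  # cross term, counted once
            count += 1
    return _carry(cols), count
```

For n limbs the general routine uses n*n limb products and the squaring routine uses n*(n+1)/2. Whether that saving shows up as wall-clock time on a GPU is a separate question, as the 2021-11-12 post above suggests.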
Forum: Factoring 2021-10-24, 08:47
Replies: 101
Views: 7,368
Posted By SethTro
I just merged...

I just merged https://gitlab.inria.fr/zimmerma/ecm/-/merge_requests/27 which contains a fix of B1 limit along with a number of quality of life improvements: multiple kernels included by default (512...
Forum: Factoring 2021-10-22, 18:50
Replies: 101
Views: 7,368
Posted By SethTro
It's to prevent GPU memory issues so it can be...

It's to prevent GPU memory issues so it can be ignored (unless you run with a very large number).
It's on my to-do list to remove but I'm sadly without internet today.
You can remove the assert and...
Forum: Factoring 2021-09-21, 08:35
Replies: 101
Views: 7,368
Posted By SethTro
I was confused when you saw only moderate gains...

I was confused when you saw only moderate gains so I rented a V100 (V100-SXM2-16GB) on AWS today.
I'm seeing the new code be 3.1x faster which is similar to the 2-3x improvement I've seen on a...
Forum: Factoring 2021-09-10, 09:46
Replies: 101
Views: 7,368
Posted By SethTro
Two late night performance thoughts. 1. You...

Two late night performance thoughts.
1. You might get 10% more throughput by setting VERIFY_NORMALIZED to 0 on line 55.
It's a nice debug check while this is still in development but it has never...
Forum: Factoring 2021-09-10, 04:01
Replies: 101
Views: 7,368
Posted By SethTro
I spent most of today working on new optimal...

I spent most of today working on new optimal bounds. It can be a large speedup (https://www.mersenneforum.org/showpost.php?p=587617&postcount=22) to use these instead of the traditionally optimal B1...
Forum: Factoring 2021-09-09, 17:03
Replies: 101
Views: 7,368
Posted By SethTro
Yes! In cgbn_stage1.cu search for this line ...

Yes! In cgbn_stage1.cu search for this line
/* NOTE: Custom kernel changes here

You can either add a new kernel or, as I recommend, just change `cgbn_params_512`

- typedef cgbn_params_t<4,...
Forum: Factoring 2021-09-07, 06:25
Replies: 101
Views: 7,368
Posted By SethTro
I'm glad we finally got here! 2.2x speedup...

I'm glad we finally got here!

A 2.2x speedup for the 1024-bit case is almost exactly what everyone else is seeing (except bsquared, maybe because of a newer card?).

You can often improve overall...
Forum: Factoring 2021-09-05, 07:01
Replies: 101
Views: 7,368
Posted By SethTro
Ignore this, but for completeness' sake you can...

Ignore this, but for completeness' sake you can probably clone my copy of CGBN with `git clone -b cgbn_swap https://github.com/sethtroisi/CGBN.git`

The top entry from `git log` should be


commit...
Forum: Factoring 2021-09-05, 03:40
Replies: 101
Views: 7,368
Posted By SethTro
It doesn't reduce runtime, but it does make it faster...

It doesn't reduce runtime, but it does make it faster for me to test things and slightly reduces register pressure.
Forum: Factoring 2021-09-05, 03:28
Replies: 101
Views: 7,368
Posted By SethTro
This is an easy fix, you are on the home stretch!...

This is an easy fix, you are on the home stretch!

I've committed a change that depends on https://github.com/NVlabs/CGBN/pull/17 being accepted. I've committed a change reverting that to 3...
Forum: Factoring 2021-09-02, 23:57
Replies: 101
Views: 7,368
Posted By SethTro
Happy me! I found two 35 digit factors from...

Happy me!

I found two 35 digit factors from a C303 (http://factordb.com/index.php?id=1100000002657449020) today (from Factoring for a publication...
Forum: Factoring 2021-09-02, 20:06
Replies: 101
Views: 7,368
Posted By SethTro
You can find the literal program it failed to...

You can find the literal program it failed to compile in config.log or the shape in acinclude.m4 (basically wrap the 2nd block in int main() { ... })


AC_RUN_IFELSE([AC_LANG_PROGRAM([
...
Forum: Factoring 2021-09-02, 09:12
Replies: 101
Views: 7,368
Posted By SethTro
I halved compile time by adding cgbn_swap and...

I halved compile time by adding cgbn_swap and avoiding inlining double_add_v2 twice.

Sadly I pushed the branch and it will probably fail to compile for everyone till...
Forum: Factoring 2021-09-02, 00:45
Replies: 101
Views: 7,368
Posted By SethTro
Maybe this relates to registers used by the...

Maybe this relates to registers used by the kernel? Max threads per block? Any insight from CUDA experts would be appreciated.
Forum: Factoring 2021-09-02, 00:44
Replies: 101
Views: 7,368
Posted By SethTro
I added `gpu_throughput_test.sh` which runs...

I added `gpu_throughput_test.sh` which runs different sized inputs and measures throughput.

On my system maximum results are achieved at

256 bits: 2x default curves (or 3584 curves), same speed...
Forum: Factoring 2021-09-01, 22:46
Replies: 101
Views: 7,368
Posted By SethTro
Fixed along with another issue.

Fixed along with another issue.
Forum: Factoring 2021-09-01, 20:30
Replies: 101
Views: 7,368
Posted By SethTro
I don't know how static linking works especially...

I don't know how static linking works, especially with respect to CUDA, but I compiled ecm with all supported SM (including sm35 and sm70) using CUDA 11.2. Feel free to try it, but I wouldn't be to...
Forum: Factoring 2021-09-01, 20:26
Replies: 101
Views: 7,368
Posted By SethTro
Can you try running with `-v --gpucurves 1280`...

Can you try running with `-v --gpucurves 1280` and `--gpucurves 2560` (if you are having fun you can also try 640 and 1792)?
The new code should give you approximate timings quite quickly so no...
Forum: Factoring 2021-09-01, 19:00
Replies: 101
Views: 7,368
Posted By SethTro
I know that feeling and I really empathize. I'm...

I know that feeling and I really empathize. I'm building on the pile of kludge that is CUDA and I wish I could make this easier.

Did you try with CC=gcc-9? I can also maybe add some debug to the...
Forum: Factoring 2021-09-01, 18:54
Replies: 101
Views: 7,368
Posted By SethTro
I rebased the branch to clean up the git history....

I rebased the branch to clean up the git history, so everyone will likely need to `git pull` and `git reset --hard origin/gpu_integration`. I'm sorry, but also we're in development and everything is...
Forum: Factoring 2021-08-31, 21:01
Replies: 101
Views: 7,368
Posted By SethTro
The most important factor is the size of N (which...

The most important factor is the size of N (which is limited by CGBN to 32K bits for GPUs, or ~10,000 digits).
Both CPU and GPU have the same linear scaling for B1 which can be increased to any number...
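
As a quick sanity check on the "~10,000 digits" figure, here is a small Python sketch that converts a bit width into a maximum decimal digit count. It assumes "32K" means 32,768 bits. The function name `max_decimal_digits` is made up for this example and is not part of GMP-ECM or CGBN.

```python
import math


def max_decimal_digits(bits):
    """Decimal digits of the largest integer that fits in `bits` bits.

    The largest value is 2**bits - 1; since 2**bits is never a power
    of 10, its digit count is floor(bits * log10(2)) + 1.
    """
    return math.floor(bits * math.log10(2)) + 1


print(max_decimal_digits(32 * 1024))
```

The result lands a little under 10,000, which is consistent with the rounded figure in the post.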
Forum: Factoring 2021-08-31, 08:49
Replies: 101
Views: 7,368
Posted By SethTro
@EdH I started using ECM.py again and it's great!...

@EdH I started using ECM.py again and it's great!

---

I wrote a bunch of code today so S_BITS_PER_BATCH is dynamic and there's better verbose output.

Verbose output includes this message,...
Forum: Factoring 2021-08-30, 18:51
Replies: 101
Views: 7,368
Posted By SethTro
Yes, many of us have found the same test factor...

Yes, many of us have found the same test factor for (2^499-1)/20959 and I've verified several times that the residuals exactly match those produced by `-gpu`. I've also tested with `$ sage...
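
For anyone who wants to reproduce the test input, here is a minimal Python sketch that builds the number (2^499-1)/20959 and confirms the division is exact. It only constructs the input and does not run ECM or compare residues. The helper name `mersenne_cofactor` is an assumption for illustration.

```python
def mersenne_cofactor(p, known_factor):
    """Return (2**p - 1) // known_factor, raising if the division is not exact."""
    mersenne = (1 << p) - 1
    quotient, remainder = divmod(mersenne, known_factor)
    if remainder != 0:
        raise ValueError(f"{known_factor} does not divide 2^{p}-1")
    return quotient


n = mersenne_cofactor(499, 20959)
print(n)  # the cofactor used as the test input
```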
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.