Old 2019-10-22, 12:02   #12
Tribal Bullet
Oct 2004

3·1,163 Posts

To be clear, I've read the paper and didn't see anything actionable. The problem with GPU SpMV (sparse matrix-vector multiplication) is not that we don't know how to do one, but that NFS matrices are so sparse that the GPU spends most of its time waiting for memory. Maybe 90% of the memory accesses are random, and that is independent of block format.
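To make the "random access" point concrete, here is a toy sketch (not msieve code) that estimates how often the gather of x[col] lands on a different cache line than the previous gather. The function name, the 16-words-per-line figure (a 128-byte line of 64-bit words) and the column indices are all assumptions made up for illustration; it is a crude model that ignores caches and cross-thread coalescing.

```python
def fraction_new_cache_line(col_idx, words_per_line=16):
    """Fraction of gathers x[col] that touch a different cache line
    than the immediately preceding gather (hypothetical metric)."""
    if not col_idx:
        return 0.0
    misses = 1  # the first access always has to fetch a line
    prev_line = col_idx[0] // words_per_line
    for c in col_idx[1:]:
        line = c // words_per_line
        if line != prev_line:
            misses += 1
        prev_line = line
    return misses / len(col_idx)


# Consecutive columns, as in a dense row: most accesses reuse a line.
dense_like = list(range(64))
# Scattered columns, as in a very sparse row (made-up indices).
sparse_like = [5, 70001, 312, 4500000, 98, 1200345, 77777, 31]

print(fraction_new_cache_line(dense_like))
print(fraction_new_cache_line(sparse_like))
```

Reordering or blocking the nonzeros changes the order of the gathers, but when a row's columns are spread over millions of entries there is little locality left to recover, which is presumably why the block format does not help much.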

You can implement the SpMV in GF(2) as a prefix sum with a little postprocessing. GPUs are awesome at prefix sums, but try it on an NFS matrix and you get 5% of the documented prefix sum performance.
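One way to read the prefix-sum formulation, sketched in plain Python rather than CUDA: gather one word of x per nonzero in CSR order, run an XOR scan over the whole gathered array, then recover each row as the XOR of the scan values at its two boundaries. This works because addition in GF(2) is XOR and XOR is its own inverse. The CSR layout, the function names and the tiny test matrix are assumptions for illustration, not the msieve implementation.

```python
from itertools import accumulate
from operator import xor


def spmv_gf2_scan(row_ptr, col_idx, x):
    """y = A*x over GF(2), A in CSR form, each x[j] a bit-packed word."""
    # Gather step: this is where the random memory accesses happen.
    gathered = [x[c] for c in col_idx]
    # Exclusive-style XOR scan: prefix[k] is the XOR of gathered[0..k-1].
    prefix = [0] + list(accumulate(gathered, xor))
    # Postprocessing: a row's sum is the XOR of the scan at its boundaries.
    return [prefix[row_ptr[i + 1]] ^ prefix[row_ptr[i]]
            for i in range(len(row_ptr) - 1)]


def spmv_gf2_reference(row_ptr, col_idx, x):
    """Straightforward row-by-row version, used only as a cross-check."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc ^= x[col_idx[k]]
        y.append(acc)
    return y


# 3x4 example: row 0 has columns {0, 2}, row 1 is empty, row 2 has {1, 2, 3}.
row_ptr = [0, 2, 2, 5]
col_idx = [0, 2, 1, 2, 3]
x = [0b0001, 0b0010, 0b0100, 0b1000]

print(spmv_gf2_scan(row_ptr, col_idx, x))
```

The scan itself streams through memory, so the cost on a real matrix is dominated by the gather on the first line of the function, which is consistent with seeing only a small fraction of the advertised scan throughput.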

As a performance datapoint: back in the day we compared a K20 running the msieve-lacuda branch to the Ivy Bridge it was plugged into, on a moderately large problem. The K20 finished in half the time, which is great, but a K20 should be able to add a zero to the Ivy Bridge throughput, i.e. be roughly 10x faster rather than 2x.

Last fiddled with by jasonp on 2019-10-22 at 12:15

All times are UTC.


Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.