 Originally Posted by SethTro Better test runner script...
IMO, this is an excellent example of excellent work.

Test the code nine ways to Sunday before you even begin to trust it, particularly if you wrote it yourself (one is, heuristically, prone to mistakes).

Only human... 8-)

 2022-11-18, 10:13 #772 sweety439   "99(4^34019)99 palind" Nov 2016 (P^81993)SZ base 36 111001100010₂ Posts @rogue: Why, when I click "download" on the page https://sourceforge.net/projects/mtsieve/, does it only offer "srsieve2cl.exe"? It seems that we have to click https://sourceforge.net/projects/mts....3.7z/download to download the full mtsieve package.
I'm not certain why it does that. After these changes are integrated I will delete any standalone exe files in sourceforge so it should d/l the 7z with all exes.

 2022-11-18, 13:17 #774 rogue     "Mark" Apr 2003 Between here and the 6,971 Posts SethTro, did you output the number of collisions vs inserts in the hash table from those tests? Maybe there are opportunities to size the hash table better.
 Originally Posted by SethTro In two quick checks I saw a 10% improvement in the OpenCL code by making this one-line fix Code:  for (idx=0; idx
I have confirmed the 10% improvement in the OpenCL performance. This is due to choosing a better code path when inserting into the hash table. I was sieving over 3600 at a time.

I have not tested the CPU logic changes yet.
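
For anyone who wants to try the collisions-vs-inserts measurement asked about above, here is a minimal sketch in Python. It is not the mtsieve HashTable code: the class name `CountingHashTable`, the linear-probing scheme, the power-of-two sizing, and the random 32-bit test values are all assumptions made for illustration. The count of 3600 values is simply borrowed from the post above as an arbitrary workload size.

```python
class CountingHashTable:
    """Open-addressing hash table (linear probing) that tallies collisions.

    Hypothetical illustration only; this is not the mtsieve HashTable class.
    """

    def __init__(self, min_slots):
        # Round the slot count up to a power of two so that the slot index
        # can be computed with a cheap bit mask instead of a modulo.
        size = 1
        while size < min_slots:
            size *= 2
        self.mask = size - 1
        self.slots = [None] * size
        self.inserts = 0      # number of insert() calls
        self.collisions = 0   # occupied slots stepped over while probing

    def insert(self, value):
        if self.inserts >= len(self.slots):
            raise OverflowError("table is full")
        self.inserts += 1
        idx = value & self.mask
        while self.slots[idx] is not None:
            self.collisions += 1
            idx = (idx + 1) & self.mask
        self.slots[idx] = value


def collision_report(values, slot_counts):
    """Return {slots: (inserts, collisions)} for each candidate table size."""
    report = {}
    for min_slots in slot_counts:
        table = CountingHashTable(min_slots)
        for v in values:
            table.insert(v)
        report[len(table.slots)] = (table.inserts, table.collisions)
    return report


if __name__ == "__main__":
    import random

    rng = random.Random(2022)  # fixed seed so runs are repeatable
    values = [rng.getrandbits(32) for _ in range(3600)]
    for slots, (ins, col) in collision_report(values, [4096, 8192, 16384]).items():
        print(f"{slots:6d} slots: {ins} inserts, {col} collisions")
```

Comparing the collision count across table sizes for a realistic workload is one way to judge whether a larger (or smaller) table would pay off; the real trade-off also depends on cache and GPU memory behavior, which this sketch does not model.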

 2022-11-22, 18:29 #776 rogue     "Mark" Apr 2003 Between here and the 1B3B₁₆ Posts I have still not been able to test the CPU changes as I have been busy, but I have uploaded mtsieve 2.3.5 and all executables to sourceforge. Here are the changes:
Code:
framework:
   Added code to destructors to free allocated memory.
   Performance updates to HashTable.
srsieve2/srsieve2cl: version 1.6.5
   Fixed HashTable usage to get up to 10% better performance.
 2022-11-22, 19:58 #777 SethTro     "Seth" Apr 2019 1DF₁₆ Posts I'm glad these could get integrated. I wanted to find a few primes in a sequence and I was so happy to find that a full-featured sieving tool already existed for my problem. My first attempt was single-threaded and missed factors, so it was great to find sr2sieve. I also appreciate that it's fully open source and that I could modify and improve it, as I needed this to hack around all the terms being divisible by 2 for (3^n-7)/2. If you wanted to add a line somewhere that acknowledged optimization/profiling from Seth Troisi, it would make me feel extra valued for the work I did. Last fiddled with by SethTro on 2022-11-22 at 19:58
 Originally Posted by SethTro I'm glad these could get integrated. I wanted to find a few primes in a sequence and I was so happy to find that a full-featured sieving tool already existed for my problem. My first attempt was single-threaded and missed factors, so it was great to find sr2sieve. I also appreciate that it's fully open source and that I could modify and improve it, as I needed this to hack around all the terms being divisible by 2 for (3^n-7)/2. If you wanted to add a line somewhere that acknowledged optimization/profiling from Seth Troisi, it would make me feel extra valued for the work I did.

The divisible-by-d sequences need work. In one of these threads I laid out the conditions that must be met for srsieve2/srsieve2cl to sieve such sequences; I just don't recall where it is. That would be a nice contribution if you want to work on it.
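
To make the divisible-by-d problem concrete, here is a naive Python sketch for terms of the form (k*b^n+c)/d, such as (3^n-7)/2 from the quoted post. It is not how srsieve2 handles these sequences and it does not restate the conditions mentioned above, which are not repeated in this thread. The helper names `term_mod` and `small_factors` are made up for illustration. The one idea it shows is that reducing modulo p*d rather than modulo p gives the right answer even when p shares a factor with d (for example p = 2 with d = 2).

```python
def term_mod(k, b, n, c, d, p):
    """Return ((k*b^n + c) / d) mod p without building the full term.

    Hypothetical helper. Works modulo p*d: if N = k*b^n + c and
    r = N mod (p*d), then d divides N exactly when d divides r,
    and (N/d) mod p equals r/d.
    """
    m = p * d
    r = (k * pow(b, n, m) + c) % m
    if r % d != 0:
        raise ValueError("d does not divide k*b^n+c for this n")
    return r // d


def small_factors(k, b, c, d, n_values, primes):
    """Map each n to the listed primes that divide (k*b^n + c)/d."""
    return {
        n: [p for p in primes if term_mod(k, b, n, c, d, p) == 0]
        for n in n_values
    }


if __name__ == "__main__":
    # (3^n - 7)/2 for n = 3..8, trial primes up to 19
    trial_primes = [2, 3, 5, 7, 11, 13, 17, 19]
    for n, found in small_factors(1, 3, -7, 2, range(3, 9), trial_primes).items():
        print(n, found)
```

This tests one prime against one n at a time, so it is only useful for checking a sieve's output on small ranges; a real sieve finds all n for a given p in one pass.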

 2022-11-28, 15:32 #779 rogue     "Mark" Apr 2003 Between here and the 6,971 Posts I have posted mtsieve 2.3.6 to sourceforge. Outside of modifying CHANGES.txt to mention Seth Troisi's addition, here are the changes for 2.3.6:
Code:
cksieve/cksievecl: version 1.4
   Initial release of cksievecl.
   cksieve will now run on non-x86 CPUs. It is 25% faster than the previous version.
   cksievecl is about 5x faster than cksieve when comparing i9-11950H vs NVIDIA RTX A5000.
The only sieves without ARM builds are afsieve, gcwsieve, pixsieve, xyyxsieve and their OpenCL/Metal equivalents.
 2022-11-29, 17:26 #780 rogue     "Mark" Apr 2003 Between here and the 6,971 Posts I have posted mtsieve 2.3.7 to sourceforge. Here are the changes for 2.3.7:
Code:
gcwsieve/gcwsievecl: version 1.5
   Added support for non-x86 CPUs. FPU or AVX is still used on x86 CPUs.
   Added -A to enable AVX on x86 CPUs. AVX code can be faster than the FPU, but you will have to test ranges (for p > max n) to see which is faster.
   Updated invmod method in the GPU and FPU code to gain about 2%.
 2022-11-29, 18:05 #781 storm5510 Random Account     Aug 2009 Not U. + S.A. 100111101110₂ Posts I have an older package. With all due respect, I have not yet seen any good examples of how to use these things.

