mersenneforum.org  

Old 2018-09-19, 13:23   #78
ET_
Banned
 
"Luigi"
Aug 2002
Team Italia

4,871 Posts
Default

Hi folks.

I have a basic core2 machine at home and an AVX-512 capable 4-core server via AWS. I need a copy of gfndsieve compiled for 64-bit Linux with all the AVX optimizations turned on and no OpenCL, but on the AWS system I only have gcc version 4.8 and can't compile my optimized copy.

Is there anybody out there who can provide me with such a file? Or should I use the correct -march flag and compile on my machine?

Thanks in advance.

Luigi
---

Last fiddled with by ET_ on 2018-09-19 at 13:25
Old 2018-09-19, 13:33   #79
rogue
 
"Mark"
Apr 2003
Between here and the

2⁶·3²·13 Posts
Default

Quote:
Originally Posted by ET_ View Post
Hi folks.

I have a basic core2 machine at home and an AVX-512 capable 4-core server via AWS. I need a copy of gfndsieve compiled for 64-bit Linux with all the AVX optimizations turned on and no OpenCL, but on the AWS system I only have gcc version 4.8 and can't compile my optimized copy.

Is there anybody out there who can provide me with such a file? Or should I use the correct -march flag and compile on my machine?
There is no AVX512 code (yet), and the choice between AVX and SSE/FPU is made at runtime based upon the capability of the CPU.

To disable compiling and linking with GPU code, set ENABLE_GPU to no in the makefile.

Once you do that, what issues are you getting on that box with gcc?
Old 2018-09-19, 14:05   #80
ET_
Banned
 
"Luigi"
Aug 2002
Team Italia

4,871 Posts
Default

Quote:
Originally Posted by rogue View Post
There is no AVX512 code (yet), and the choice between AVX and SSE/FPU is made at runtime based upon the capability of the CPU.

To disable compiling and linking with GPU code, set ENABLE_GPU to no in the makefile.

Once you do that, what issues are you getting on that box with gcc?
I knew there were distinct code paths in the executable, but I thought one had to enable the relevant processor optimizations at compile time for the code to recognize them.

In other words, if my code is compiled with -march=native on an Intel G2030 processor (a crippled Ivy Bridge with no AVX / FMA3 support), will the executable automatically run the FMA3 path once it is run on an AWS Skylake architecture?

If so, then I solved the issue.
If not, then I should recompile the code on an architecture whose "native" processor recognizes the optimizations. But the AWS gcc is locked at version 4.8, and I'm afraid it wouldn't recognize FMA3 optimizations.

I'm a master in complicating my own life...
Old 2018-09-19, 14:32   #81
rogue
 
"Mark"
Apr 2003
Between here and the

2⁶·3²·13 Posts
Default

Quote:
Originally Posted by ET_ View Post
I knew there were distinct code paths in the executable, but I thought one had to enable the relevant processor optimizations at compile time for the code to recognize them.

In other words, if my code is compiled with -march=native on an Intel G2030 processor (a crippled Ivy Bridge with no AVX / FMA3 support), will the executable automatically run the FMA3 path once it is run on an AWS Skylake architecture?

If so, then I solved the issue.
If not, then I should recompile the code on an architecture whose "native" processor recognizes the optimizations. But the AWS gcc is locked at version 4.8, and I'm afraid it wouldn't recognize FMA3 optimizations.

I'm a master in complicating my own life...
It should. It calls a gcc built-in function called __builtin_cpu_supports() when deciding if it can use AVX code. I assume that function checks something specific to the computer upon which the code is executing.
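For readers following along, here is a minimal sketch of that kind of runtime check. It is not mtsieve's actual code: the function name `select_code_path` and the returned labels are made up for illustration. The two built-ins are real GCC/Clang functions on x86 (GCC added them in the 4.8 series, as far as I know), and they query the CPU the program is running on, not the one it was compiled on.

```cpp
#include <string>

// Hypothetical helper (an assumption, not taken from mtsieve): returns a label
// for the widest code path the CPU we are *running on* can execute.
std::string select_code_path() {
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();  // initialise the CPU feature data before querying it
    if (__builtin_cpu_supports("avx2")) return "avx2";
    if (__builtin_cpu_supports("avx"))  return "avx";
    if (__builtin_cpu_supports("sse2")) return "sse2";
    return "fpu";
#else
    return "generic";      // non-x86 target or other compiler: no feature query here
#endif
}
```

A program would call `select_code_path()` once at startup and branch on the result. Because the query happens at runtime, a binary built on a CPU without AVX can still pick an AVX routine when it is later run on a CPU that has it, provided that routine was compiled into the binary.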
Old 2018-09-19, 14:50   #82
ET_
Banned
 
"Luigi"
Aug 2002
Team Italia

1001100000111₂ Posts
Default

Quote:
Originally Posted by rogue View Post
It should. It calls a gcc built-in function called __builtin_cpu_supports() when deciding if it can use AVX code. I assume that function checks something specific to the computer upon which the code is executing.
Thank you Mark. I will test it and then report here. BTW, is there a message saying what optimizations are used at runtime?
Old 2018-09-19, 15:47   #83
ET_
Banned
 
"Luigi"
Aug 2002
Team Italia

4,871 Posts
Default

Quote:
Originally Posted by ET_ View Post
Thank you Mark. I will test it and then report here. BTW, is there a message saying what optimizations are used at runtime?
With 50G primes tested using the same executable, the AVX version on the Xeon is 35%-40% faster than the base version at the same clock.
Old 2018-09-19, 16:51   #84
rogue
 
"Mark"
Apr 2003
Between here and the

1D40₁₆ Posts
Default

Quote:
Originally Posted by ET_ View Post
Thank you Mark. I will test it and then report here. BTW, is there a message saying what optimizations are used at runtime?
Not at this time, but it is something I have considered adding.
Old 2018-09-26, 01:36   #85
rogue
 
"Mark"
Apr 2003
Between here and the

2⁶·3²·13 Posts
Default

I have posted mtsieve 1.8.0 at my website. Here are the changes:

Code:
   Added twinsieve.  This is more than 3x faster than newpgen's twin sieve.

   Modified OpenCL code to change calculation for default workunits to improve GPU throughput.
   Modified "start sieving" message to include expected factors, but only if -P is not the default value.
   Modified all sieves to have a custom "start sieving" message so that each shows more detail specific to that sieve.
The default behavior of twinsieve is to sieve such that only potential twin primes are remaining, but there is a -i switch that allows one to sieve the +1 and -1 side independently.
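To make the difference between the two modes concrete, here is a toy sketch. It is not twinsieve's implementation: the `Term` struct and all function names are assumptions for illustration, and it only handles small odd primes p (below 2^32, so the products fit in 64 bits). It also ignores the case where p equals the candidate itself. A prime p removes the +1 side when k*b^n ≡ -1 (mod p) and the -1 side when k*b^n ≡ 1 (mod p).

```cpp
#include <cstdint>
#include <vector>

// One candidate k, tracking both sides of k*b^n +/- 1 (hypothetical layout).
struct Term {
    std::uint64_t k;
    bool plusAlive;   // no factor found yet for k*b^n + 1
    bool minusAlive;  // no factor found yet for k*b^n - 1
};

// b^e mod m by square-and-multiply; m must be below 2^32 to avoid overflow.
std::uint64_t pow_mod(std::uint64_t b, std::uint64_t e, std::uint64_t m) {
    std::uint64_t result = 1 % m;
    b %= m;
    while (e > 0) {
        if (e & 1) result = result * b % m;
        b = b * b % m;
        e >>= 1;
    }
    return result;
}

// Mark every side that the odd prime p divides.
void sieve_prime(std::vector<Term>& terms, std::uint64_t b, std::uint64_t n, std::uint64_t p) {
    const std::uint64_t bn = pow_mod(b, n, p);
    for (Term& t : terms) {
        const std::uint64_t r = (t.k % p) * bn % p;  // k*b^n mod p
        if (r == p - 1) t.plusAlive = false;         // p divides k*b^n + 1
        if (r == 1)     t.minusAlive = false;        // p divides k*b^n - 1
    }
}

// Default (twin) mode: k stays only while both sides are still unfactored.
bool survives_twin(const Term& t) { return t.plusAlive && t.minusAlive; }

// Independent mode: each side is kept on its own, so k stays if either side is alive.
bool survives_independent(const Term& t) { return t.plusAlive || t.minusAlive; }
```

For example, with b=2 and n=3, k=3 gives 23 and 25. The prime 5 kills the +1 side, so k=3 drops out in twin mode but its -1 side is still a candidate in independent mode.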

Last fiddled with by rogue on 2018-09-26 at 01:38
Old 2018-09-26, 07:33   #86
pepi37
 
Dec 2011
After 1.58M nines:)

13·137 Posts
Default

Quote:
Originally Posted by rogue View Post
I have posted mtsieve 1.8.0 at my website. Here are the changes:

Code:
   Added twinsieve.  This is more than 3x faster than newpgen's twin sieve.

   Modified OpenCL code to change calculation for default workunits to improve GPU throughput.
   Modified "start sieving" message to include expected factors, but only if -P is not the default value.
   Modified all sieves to have a custom "start sieving" message so that each shows more detail specific to that sieve.
The default behavior of twinsieve is to sieve such that only potential twin primes are remaining, but there is a -i switch that allows one to sieve the +1 and -1 side independently.

In twinsieve you use the -i switch in two different places:

-i --inputterms=i input file of remaining candidates
-i --independent Sieve +1 and -1 independently
if (!ib_OnlyTwins && it_Format == FF_ABC)
FatalError("Can only support ABC format if sieving +1 and -1 independently");
If I use --independent, there is always zero output, regardless of the ABC format.


d:\MTSIEVE\TWINSIEVE>twinsieve -P100000000000 -w10000000 -i1.npg -ofact.txt -W4 -fN -r
twinsieve v1.0.0, a program to find factors of k*b^n+1/-1 numbers for fixed b and n and variable k
Sieve started: 30000000001 < p < 1e11 with 18446744073709502166 terms (261 < k < 99309, k*2^1778899) (expecting 876855490500155136 factors)


If the -r switch is on the command line, you get this; if you remove it, all is OK.
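A note on the term count in that output: 18446744073709502166 is exactly 2^64 - 49450, which is what an unsigned 64-bit counter shows after being decremented 49450 steps below zero. The sketch below only checks that arithmetic; the function name is made up, and why the count went negative with -r cannot be told from the output alone.

```cpp
#include <cstdint>

// Hypothetical helper: how far below zero a wrapped unsigned 64-bit counter went.
// Unsigned subtraction is modular, so 0 - wrapped yields the distance.
constexpr std::uint64_t underflow_distance(std::uint64_t wrapped) {
    return std::uint64_t{0} - wrapped;
}

// The "terms" value printed by twinsieve in the report above.
constexpr std::uint64_t reported_terms = 18446744073709502166ULL;

static_assert(underflow_distance(reported_terms) == 49450,
              "reported term count equals 2^64 - 49450");
```

The huge "expecting ... factors" figure on the same line is presumably derived from that wrapped count, so it is probably not meaningful either.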

Last fiddled with by pepi37 on 2018-09-26 at 08:11 Reason: add more info
Old 2018-09-26, 12:19   #87
pepi37
 
Dec 2011
After 1.58M nines:)

3365₈ Posts
Default

And one last thing:
If the sieve passes 54105949591 (or gets very close to this value), there is no output and the program just terminates.
If the sieve depth is lower than that value, the program gives output as it should.
Old 2018-09-26, 13:08   #88
rogue
 
"Mark"
Apr 2003
Between here and the

16500₈ Posts
Default

Quote:
Originally Posted by pepi37 View Post
In twinsieve you use the -i switch in two different places:

-i --inputterms=i input file of remaining candidates
-i --independent Sieve +1 and -1 independently
if (!ib_OnlyTwins && it_Format == FF_ABC)
FatalError("Can only support ABC format if sieving +1 and -1 independently");
If I use --independent, there is always zero output, regardless of the ABC format.


d:\MTSIEVE\TWINSIEVE>twinsieve -P100000000000 -w10000000 -i1.npg -ofact.txt -W4 -fN -r
twinsieve v1.0.0, a program to find factors of k*b^n+1/-1 numbers for fixed b and n and variable k
Sieve started: 30000000001 < p < 1e11 with 18446744073709502166 terms (261 < k < 99309, k*2^1778899) (expecting 876855490500155136 factors)

If the -r switch is on the command line, you get this; if you remove it, all is OK.
I'll switch it to use a different character as -i is reserved for the underlying framework.
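A clash like this can be caught mechanically. The sketch below is an assumption about how one might check an option table, not how the mtsieve framework is written: `OptionSpec` and `duplicate_short_options` are hypothetical names, and the only option names taken from this thread are --inputterms and --independent.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical description of one command-line switch.
struct OptionSpec {
    char shortName;        // e.g. 'i'
    std::string longName;  // e.g. "inputterms"
};

// Returns every short-option character that appears more than once,
// in ascending character order.
std::vector<char> duplicate_short_options(const std::vector<OptionSpec>& options) {
    std::map<char, int> uses;
    for (const OptionSpec& o : options) {
        ++uses[o.shortName];
    }
    std::vector<char> duplicates;
    for (const auto& entry : uses) {
        if (entry.second > 1) duplicates.push_back(entry.first);
    }
    return duplicates;
}
```

Run over the combined framework and sieve-specific options at startup (or in a unit test), this would report 'i' for the two switches quoted above.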
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.
