USE_BMI2=1 did solve the compilation issue, so thank you for that!
I am also encountering the issue when running tune() with the second SIQS number:
Code:
starting SIQS on c65: 34053408309992030649212497354061832056920539397279047809781589871
overriding small TF cutoff of 20 to 20
==== sieve params ====
n = 67 digits, 220 bits
factor base: 6384 primes (max prime = 136207)
single large prime cutoff: 10215525 (75 * pmax)
allocating 3 large prime slices of factor base
buckets hold 2048 elements
large prime hashtables have 196608 bytes
using AVX2 enabled 32k sieve core
sieve interval: 4 blocks of size 32768
polynomial A has ~ 8 factors
using multiplier of 47
using Q2(x) polynomials for kN mod 8 = 1
using SPV correction of 20 bits, starting at offset 40
trial factoring cutoff at 75 bits
==== sieving in progress (1 thread): 6448 relations needed ====
==== Press ctrl-c to abort and save state ====
750 rels found: 646 full + 104 from 5321 partial, (6828.42 rels/sec)
Max specified relations found
sieve time = 0.0000, relation time = 0.0000, poly_time = 0.0000
trial division touched 99728 sieve locations out of 2064384000
1699 rels found: 1276 full + 423 from 10689 partial, (6822.74 rels/sec)
sieving required 7875 total polynomials (124 'A' polynomials)
trial division touched 99728 sieve locations out of 2064384000
total reports = 99728, total surviving reports = 32675
total blocks sieved = 63000, avg surviving reports per block = 0.52
Elapsed time: 1.7537 sec
elapsed time for ~10k relations of c65 = 1.7539 seconds.
extrapolated time for complete factorization = 4.8349 seconds
double free or corruption (!prev)
Aborted
One has to admit that this naming scheme was not very evident; the one for Intel Core i is much more understandable (excluding some oddities like recent mobile CPUs etc.).

Problem in tune() fixed.
Also fixed an issue in microecm when compiling with gcc (verified it works now with gcc 11.1.0). Rebuilt and reuploaded the Windows executables.
New build of the regular version completed tune fine on my i3-8100, and it's still running on my i7-3930K.
I do notice that the 3930K is embarrassingly slow in some cases; perhaps there's a fallback code path due to missing hardware features. For example, SIQS on c90 42735....7841 ran at 1135 rel/s on my i3, and is currently running at 3.26 rel/s on my 3930K. That's about 350x slower. Is that sane and/or expected?
Last fiddled with by James Heinrich on 2022-09-22 at 16:11
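One way to test the "missing hardware features" guess is to look at the CPU's feature flags directly. Below is a minimal sketch, assuming a Linux machine where /proc/cpuinfo has a "flags" line; the helper name missing_features is made up for illustration, and the two features checked (avx2, bmi2) are simply the ones that come up in this thread (the USE_BMI2 build option and the "AVX2 enabled 32k sieve core" log line). It says nothing about how YAFU itself picks a code path. For what it's worth, the i7-3930K is a Sandy Bridge-E part, which as far as I know has AVX but neither AVX2 nor BMI2.

```python
# Sketch only: report which CPU features named in this thread are missing.
# Assumes a Linux-style "flags" line as found in /proc/cpuinfo.

WANTED = ("avx2", "bmi2")  # features mentioned in the thread, not YAFU's full list


def missing_features(cpuinfo_text, wanted=WANTED):
    """Return the wanted feature names absent from the first 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            present = set(value.split())
            return [name for name in wanted if name not in present]
    # No flags line found: nothing can be confirmed, so report all as missing.
    return list(wanted)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as fh:
            text = fh.read()
    except OSError:
        text = ""  # not Linux, or file unreadable
    print("missing:", missing_features(text) or "none")
```

If either name shows up as missing, a slower fallback sieve would at least be plausible, though the size of the slowdown would still need explaining.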
When I run
yafu.exe "factor(3105695207255595953041248693082694537249993263358883561705428359401428617)" -p -threads 8
with use_gpuecm, YAFU runs SIQS with one thread.
Last fiddled with by kotenok2000 on 2022-09-23 at 15:21
GNFS also runs with one thread, even with lathreads=8.

As far as I know, YAFU has no GPU capability. 

