I revisited the currently-disabled checkpointing code yesterday - it predates my experiments with multithreaded runs and thus needs a complete overhaul, but in silver-lining news, I believe I've found a way to do it that will serve both your current "poor man's multithreading" run mode and a future true-multithreaded implementation. Here are 2 code comments describing the schema I have in mind:
Code:
/* The factoring checkpoint file is assumed to be in ASCII format, with the following [entries].
User-added comments or annotations are allowed below lines 1 and 2, as long as they do not
trigger hits for the find-substring-in-file search used to locate the mandatory,
program-autogenerated entries:
Line 1: [String containing the current exponent stored in pstring.]
Exponent must be odd (but not necessarily prime), and have digit length corresponding
to the number * of 64-bit limbs set via -DP*WORD at build time. (-DNWORD means unlimited)
Line 2: [Value of TF_PASSES in the build (16 or 960)]
Followed by anywhere from 1 to TF_PASSES lines of the following form,
which need not be in numeric order but must have no index repeats:
Pass [index < TF_PASSES]: [Max factor-k value reached for this pass number by the run(s) which updated this savefile]
Here, "factor-k value" refers to the standard form of prime-exponent Mersenne number factors: M(p) has 1 or more
factors of form q = 2.k.p+1. Mfactor does not in fact require prime exponents, but for nonprime ones will only search
for factors of that form. For example, for the composite-exponent case M(25) = 2^25-1
*/
/*
Every 1024th pass through the small-primes sieve, and also following the final pass
through the sieve, write the checkpoint file, with format as described previously.
Since we expect that multiple jobs and/or threads may be working on the same exponent,
we make such checkpoint-updates atomic as follows:
1. job/thread X acquires file lock and opens checkpoint file <filename> for reading, if it exists.
No other job/thread may acquire file lock until X releases it;
2. X also opens a 2nd, temporary, file <filename.tmp> for writing, beginning with line 1: exponent (= pstring)
and line 2: value of TF_PASSES in the build (16 or 960);
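3. If there was an existing savefile found in step [1], X copies its contents (from line 3 onward,
since lines 1 and 2 were already written in [2]) to the .tmp file line-by-line, only updating
the single "Pass *: [max k reached]" entry corresponding to the current pass whose progress is
being saved via checkpointing. If that pass has no entry yet, or there was no savefile, X appends one;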
4. X closes both files, renames <filename.tmp> to <filename>, thus overwriting the now-obsolete
version of the latter;
5. X releases the file lock and resumes processing factor candidates.
*/
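
For concreteness, here is a rough sketch of how steps 1-5 might look in C. It is not Mfactor code: `checkpoint_update`, `checkpoint_read` and `skip_line` are hypothetical names, and it assumes a POSIX system with `flock()`, a k value that fits in 64 bits, "Pass" lines shorter than 1 KB, and an exact "Pass <index>: <k>" text layout. It also takes the lock on a separate `<filename>.lock` file (my assumption, not part of the description above), because the rename in step 4 replaces the checkpoint file itself and a lock held on the old copy would no longer protect the new one.

```c
#include <assert.h>
#include <fcntl.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper (not Mfactor code): discard the rest of the current line. */
static void skip_line(FILE *f)
{
    int c;
    while ((c = fgetc(f)) != EOF && c != '\n')
        ;
}

/* Hypothetical helper (not Mfactor code): record that pass `pass` has reached
   factor-k value `kmax`. Returns 0 on success, -1 on any error. */
int checkpoint_update(const char *path, const char *pstring,
                      int tf_passes, int pass, uint64_t kmax)
{
    char lockpath[4096], tmppath[4096], line[1024];
    int idx, replaced = 0, need_nl = 0, ok;

    assert(path && pstring && pass >= 0 && pass < tf_passes);
    if (snprintf(lockpath, sizeof lockpath, "%s.lock", path) >= (int)sizeof lockpath ||
        snprintf(tmppath, sizeof tmppath, "%s.tmp", path) >= (int)sizeof tmppath)
        return -1;

    /* Step 1: take an exclusive lock (blocks until any other holder releases it),
       then open the existing checkpoint file, if there is one. */
    int lockfd = open(lockpath, O_CREAT | O_RDWR, 0644);
    if (lockfd < 0)
        return -1;
    if (flock(lockfd, LOCK_EX) != 0) {
        close(lockfd);
        return -1;
    }
    FILE *in = fopen(path, "r");   /* NULL simply means no checkpoint yet */

    /* Step 2: start the temporary file with line 1 (exponent) and line 2 (TF_PASSES). */
    FILE *out = fopen(tmppath, "w");
    ok = (out != NULL);
    if (ok)
        ok = fprintf(out, "%s\n%d\n", pstring, tf_passes) > 0;

    /* Step 3: copy everything after the two header lines, replacing only the
       entry for the current pass. */
    if (ok && in) {
        skip_line(in);
        skip_line(in);
        while (fgets(line, sizeof line, in)) {
            if (sscanf(line, "Pass %d:", &idx) == 1 && idx == pass) {
                if (!replaced)
                    fprintf(out, "Pass %d: %" PRIu64 "\n", pass, kmax);
                replaced = 1;
                need_nl = 0;
            } else {
                fputs(line, out);
                need_nl = (strchr(line, '\n') == NULL);
            }
        }
    }
    if (ok && !replaced)           /* first checkpoint for this pass: append it */
        ok = fprintf(out, "%sPass %d: %" PRIu64 "\n", need_nl ? "\n" : "", pass, kmax) > 0;

    /* Step 4: close both files and rename the temporary file over the old one. */
    if (in)
        fclose(in);
    if (out && fclose(out) != 0)
        ok = 0;
    if (ok)
        ok = (rename(tmppath, path) == 0);
    if (!ok)
        remove(tmppath);

    /* Step 5: release the lock. */
    flock(lockfd, LOCK_UN);
    close(lockfd);
    return ok ? 0 : -1;
}

/* Hypothetical helper: fetch the saved k for `pass`; returns 0 if found, else -1. */
int checkpoint_read(const char *path, int pass, uint64_t *kmax)
{
    char line[1024];
    int idx, rc = -1;
    uint64_t k;
    FILE *in = fopen(path, "r");
    if (!in)
        return -1;
    while (rc != 0 && fgets(line, sizeof line, in))
        if (sscanf(line, "Pass %d: %" SCNu64, &idx, &k) == 2 && idx == pass) {
            *kmax = k;
            rc = 0;
        }
    fclose(in);
    return rc;
}
```

A job working on pass 3 would call something like `checkpoint_update("p25.ckpt", "25", 16, 3, k)` every 1024th sieve pass (the filename here is made up), and `checkpoint_read` on restart to find where to resume. Since `flock()` locks belong to the open file description, threads that each open the lock file themselves should exclude one another just as separate jobs do, though I have not tested that on every platform.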