Thread: mlucas on sun
2004-01-02, 18:03   #4
ewmayer

You've got two choices here. The first is to build the latest version yourself (anon-ftp to hogranch.com, cd pub/mayer/src/C, mget *), assuming you have access to the SunPro C compiler, and then use the automated self-test feature (type Mlucas -h to see the options) to find the best set of FFT radices for the runlengths of interest. Those radix sets go into the mlucas.cfg file, whose format and purpose is described here.


Your second choice is to try a gzipped version of the sparc binary I use (built for me by Bill Rea - our Sparcs at work only have gcc) here:

ftp://hogranch.com/pub/mayer/bin/SPA...as2.8_sparc.gz

That version of the code has pretty much the same performance as the latest code, but lacks the automated self-test feature. To use it to build your mlucas.cfg file, go to the above .../src/C ftp archive and get only the Mlucas.c file. Scroll to the bottom portion of the source file, where you'll see a table of exponents and 64-bit hex residues, with entries that look like

Code:
/* Array of distinct test cases for self-tests. Add one extra slot to vector for user-specified self-test exponents: */
struct testCase testVec[numTest+1] =
{
	/* FFT #radices  p  100-iter Res64             #bits per digit  FFT radices      AvgMaxErr  */
	/* Small:                                                                         x86  alfa */
	{ 128, 3,  2550001,"CB6030D5790E2460"},/* testVec[ 0]   19.455  16,16,16,16     .1034 .1334 */
	{ 144, 2,  2920013,"7CC1B41482BCB7C0"},/* testVec[ 1]   19.803   9,16,16,32     .1508 .2113 */
	{ 160, 6,  3265007,"B912804D7FE4A9E5"},/* testVec[ 2]   19.928  10,16,16,32     .2020 .2656 */
	{ 176, 3,  3550007,"5059094E256FB886"},/* testVec[ 3]   19.698  11,16,16,32     .1686 .2403 */
	{ 192, 6,  3900067,"4744CB8E5287DA60"},/* testVec[ 4]   19.837  12,16,16,32     .1885 .2523 */
	{ 224, 6,  4540007,"1DA37E1FAC27BC68"},/* testVec[ 5]   19.793  14,16,16,32     .2097 .2929 */
	{ 256, 7,  5190001,"15216788A374E144"},/* testVec[ 6]   19.798  16,16,16,32     .2563 .3086 */
	{ 288, 2,  5780087,"ADB1333A531F6EED"},/* testVec[ 7]   19.599   9,16,32,32     .1774 .2384 */
	{ 320, 3,  6400013,"6B2DF2F4FD779CBC"},/* testVec[ 8]   19.531  10,16,32,32     .1846 .2392 */
	{ 352, 2,  7010011,"4FC7B9144100998F"},/* testVec[ 9]   19.448  11,16,32,32     .1756 .2585 */
	{ 384, 3,  7600013,"2AFA7C90899B583E"},/* testVec[10]   19.328  12,16,32,32     .1383 .1872 */
	{ 416, 2,  8330009,"74AB1D925A0E7DB7"},/* testVec[11]   19.555  13,16,32,32     .2488 .3152 */
	{ 448, 3,  8950001,"7D9DD642E10F2525"},/* testVec[12]   19.509  14,16,32,32     .2041 .2906 */
	{ 480, 2,  9490001,"01A4E738255C522B"},/* testVec[13]   19.307  15,16,32,32     .1642 .2186 */
	{ 512, 3, 10110007,"24AAC84A6CD400BE"},/* testVec[14]   19.283  16,16,32,32     .1884 .2260 */
	/* Medium:                                                                                  */
	{ 576, 2, 11350013,"7087EA4B45F416A6"},/* testVec[15]   19.243   9,32,32,32     .1657 .2181 */
	{ 640, 2, 12590009,"93E43FC168EAF6BF"},/* testVec[16]   19.211  10,32,32,32     .1885 .2382 */
	{ 704, 1, 13799939,"7A8B6F72D5F3A862"},/* testVec[17]   19.143  11,32,32,32     .1747 .2542 */
	{ 768, 2, 15099979,"D731A6D76D99F3F5"},/* testVec[18]   19.201  12,32,32,32     .1692 .2304 */
	{ 832, 1, 16299979,"39AB362A15AF832C"},/* testVec[19]   19.132  13,32,32,32     .2154 .2632 */
	{ 896, 2, 17599997,"EDF99B1D21DE8835"},/* testVec[20]   19.182  14,32,32,32     .2041 .2773 */
	{ 960, 1, 18899999,"AF0F81144A3372A4"},/* testVec[21]   19.226  15,32,32,32     .2186 .2915 */
	{1024, 6, 20099983,"119B2956917D0CC1"},/* testVec[22]   19.169  16,32,32,32     .2457 .2934 */
	{1152, 5, 22500011,"3D81D5C9CC3D1C65"},/* testVec[23]   19.073   9,16,16,16,16  .1845 .2582 */
	{1280, 2, 25000009,"B4A3AF6909228279"},/* testVec[24]   19.073  10,16,16,16,16  .2534 .3083 */
...
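As a quick sanity check on the "#bits per digit" column when you pick rows from the table: it is just the exponent divided by the number of FFT words, i.e. p / (K * 1024). The helper below is only an illustrative sketch (the name bits_per_digit is mine, it is not part of Mlucas.c):

```c
#include <stdio.h>

/* Hypothetical helper, not from Mlucas.c: average number of bits stored in
   each FFT word when exponent p is tested at an FFT length of fft_k K. */
double bits_per_digit(unsigned long p, unsigned int fft_k)
{
    return (double)p / ((double)fft_k * 1024.0);
}

/* Print one exponent/FFT-length pair in the same 3-decimal style as the table. */
void print_bits_per_digit(unsigned long p, unsigned int fft_k)
{
    printf("%4uK  %9lu  %.3f bits/digit\n", fft_k, p, bits_per_digit(p, fft_k));
}
```

Calling print_bits_per_digit(22500011UL, 1152) should reproduce the 19.073 listed for testVec[23].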
Find the table rows containing FFT lengths around the current GIMPS wavefront (as of the start of 2004, you'll want 1152K and 1280K). Then do 100-iteration timing tests of the corresponding exponents, using a variety of FFT radix sets. For instance, for 1152K, you run a single 100-iteration self-test with radix set 0 by pasting the following (sans my <=== comments) into your command window:

time Mlucas
22500011 <=== exponent for LL test
1152 <=== FFT length (in K) for LL test
1 <=== 0 for a full LL test, 1 for a shorter timing test
100 <=== if previous line was a 1, how many iterations for the timing test
0 <=== This is the radix set index
1 <=== 0 for error checking off, 1 for EC on.
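If you'd rather not retype those six answers for every radix set, something like the following sketch can generate them. The function name build_timing_input is made up for illustration, and feeding the result to the program through a pipe or a redirected file assumes the binary reads its prompts from standard input, which I have not verified for every build:

```c
#include <stdio.h>

/* Hypothetical helper: fill buf with the six answers in the order shown
   above, one per line. The third answer is fixed at 1 (timing test rather
   than a full LL test). Returns the character count, as snprintf does. */
int build_timing_input(char *buf, size_t len,
                       unsigned long exponent, unsigned int fft_k,
                       unsigned int iters, unsigned int radix_set,
                       int error_check)
{
    return snprintf(buf, len, "%lu\n%u\n1\n%u\n%u\n%d\n",
                    exponent, fft_k, iters, radix_set,
                    error_check ? 1 : 0);
}
```

For the 1152K example, build_timing_input(buf, sizeof buf, 22500011UL, 1152, 100, 0, 1) yields exactly the six lines above without the <=== comments; bump the radix_set argument by one for each subsequent run.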

Start with radix set 0 and increase by one each run until you start getting "radix set XYZ not available - using defaults" warnings. All radix sets should give Res64 = 3D81D5C9CC3D1C65, as per the Mlucas.c table entry. Of the radix sets you tried, pick the one that yielded the smallest runtime and add the corresponding entry to your mlucas.cfg file. For example, if radix set 3 gave the best time at 1152K, your mlucas.cfg file would look like

#
# mlucas.cfg optimized for UltraSparc blah blah...
#
200000
#
1152 3
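The bookkeeping for that step (pick the fastest radix set, then write out a file like the one above) could be sketched roughly as follows. Both function names are hypothetical, the timings you pass in are whatever your own runs of "time Mlucas" reported, and the layout simply copies the example above:

```c
#include <stdio.h>

/* Hypothetical helper: index of the smallest runtime among the n radix
   sets timed (seconds[i] is the runtime for radix set i); -1 if n <= 0. */
int fastest_radix_set(const double *seconds, int n)
{
    int best = -1;
    int i;
    for (i = 0; i < n; i++) {
        if (best < 0 || seconds[i] < seconds[best])
            best = i;
    }
    return best;
}

/* Hypothetical helper: format a one-entry mlucas.cfg in the layout of the
   example above (three #-lines, the EC iteration count, a # separator,
   then "FFT-length radix-set"). */
int format_cfg(char *buf, size_t len, const char *comment,
               long ec_iters, unsigned int fft_k, int radix_set)
{
    return snprintf(buf, len, "#\n# %s\n#\n%ld\n#\n%u %d\n",
                    comment, ec_iters, fft_k, radix_set);
}
```

With made-up timings of {12.5, 11.9, 12.1, 11.2, 11.8} seconds for radix sets 0 through 4, fastest_radix_set returns 3, which is the entry you would then pass to format_cfg for 1152K.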

The format of the .cfg file is important - you must begin with precisely 3 #-prefixed lines, where you may enter comments to the right of the # as desired. The fourth line tells the program how many initial iterations to do with per-iteration error checking (EC) turned on - in the above example, if it gets through the first 200000 iterations on a given exponent with no roundoff errors greater than roughly 0.4, it turns off EC for the rest of the run. You can see whether EC slows the code down appreciably by rerunning the self-tests, but entering a 0 instead of a 1 on the last line of input. If EC-on is no more than 1 or 2% slower than EC-off, I recommend putting a large signed 32-bit integer (say 1000000000) on line 4 of the .cfg file, to force EC to be always on.
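The EC-on versus EC-off comparison boils down to a percentage, which you could compute along these lines. This is only a sketch: the function names are mine, the 2% threshold is the rule of thumb from the paragraph above, and the fallback is whatever iteration count you would otherwise have put on line 4:

```c
/* Hypothetical helper: how much slower the EC-on run was than the EC-off
   run, in percent, given the two runtimes for the same self-test. */
double ec_overhead_percent(double t_on, double t_off)
{
    return 100.0 * (t_on - t_off) / t_off;
}

/* Hypothetical helper: value for line 4 of mlucas.cfg. If the overhead is
   within max_percent, return a large signed 32-bit count so EC stays on for
   the whole run; otherwise return the caller's fallback count. */
long ec_iterations(double t_on, double t_off, double max_percent, long fallback)
{
    if (ec_overhead_percent(t_on, t_off) <= max_percent)
        return 1000000000L;
    return fallback;
}
```

For example, hypothetical runtimes of 10.15 s with EC on and 10.0 s with EC off give an overhead of about 1.5%, so ec_iterations(10.15, 10.0, 2.0, 200000L) picks the always-on value.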

Once you've set up your mlucas.cfg file, create a worktodo.ini file in the same dir as your executable and your .cfg file, enter an exponent in it, and invoke the program sans any flags, e.g. with "nice Mlucas &".