mersenneforum.org  

Old 2019-05-25, 23:59   #3158
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

3⁴·29 Posts
Default

I'm getting the following errors while trying to compile mfaktc on a Linux system:

Code:
nvcc fatal   : Unsupported gpu architecture 'compute_11'
Makefile:55: recipe for target 'tf_72bit.o' failed
Any ideas?
Old 2019-05-26, 00:24   #3159
GP2
 
Sep 2003

3²×7×41 Posts
Default

Quote:
Originally Posted by ixfd64
Any ideas?
Just edit the Makefile and delete the lines with compute_11 and probably also compute_20. Those are only applicable to very old GPUs.
Old 2019-05-26, 00:39   #3160
nomead
 
"Sam Laur"
Dec 2018
Turku, Finland

2·3·5·11 Posts
Default

Quote:
Originally Posted by ixfd64
I'm getting the following errors while trying to compile mfaktc on a Linux system:

Code:
nvcc fatal   : Unsupported gpu architecture 'compute_11'
Makefile:55: recipe for target 'tf_72bit.o' failed
Any ideas?
You need to modify the Makefile so that the compute_xx and sm_xx values match the hardware you're going to run mfaktc on. You're getting this error because newer versions of CUDA don't support compute capability 1.1 anymore (CUDA SDK 6.5 was the last one that did). But that's an old architecture: 1.1-1.3 is for cards that are over 10 years old by now, the GTX 2xx series and the like. Conversely, the newest cards (RTX 20 / GTX 16) need compute capability 7.5, which isn't supported by anything older than CUDA SDK 10.0. So find out what you need, put it in the Makefile, check that your CUDA version is OK and recompile. That should do it.
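As a rough illustration of the advice above (not part of mfaktc), the Makefile lines can be generated with a small Python sketch. The helper name nvcc_flag_lines is hypothetical, the flag format is copied from the NVCCFLAGS line posted later in this thread, and the only version limits checked are the two stated here (1.x dropped after CUDA 6.5, 7.5 needing CUDA 10.0). Treating all of 1.x like 1.1 is a simplifying assumption.

```python
def nvcc_flag_lines(capabilities, cuda_version):
    """Return Makefile lines enabling each requested compute capability.

    capabilities: iterable of (major, minor) tuples, e.g. (7, 0) for a Tesla V100.
    cuda_version: (major, minor) tuple of the installed CUDA toolkit.
    """
    lines = []
    for major, minor in sorted(set(capabilities)):
        # Stated in the thread for 1.1; applied to all 1.x here as an assumption.
        if major == 1 and cuda_version > (6, 5):
            raise ValueError(
                f"compute capability {major}.{minor} needs CUDA 6.5 or older"
            )
        # Stated in the thread: 7.5 (RTX 20 / GTX 16) needs CUDA 10.0 or newer.
        if (major, minor) == (7, 5) and cuda_version < (10, 0):
            raise ValueError("compute capability 7.5 needs CUDA 10.0 or newer")
        tag = f"{major}{minor}"
        lines.append(
            f"NVCCFLAGS += --generate-code arch=compute_{tag},code=sm_{tag}"
        )
    return lines


if __name__ == "__main__":
    # Example: a GTX 970 (5.2) and a Tesla V100 (7.0), built with CUDA 10.1.
    for line in nvcc_flag_lines([(5, 2), (7, 0)], cuda_version=(10, 1)):
        print(line)
```

The printed lines are meant to be pasted into the Makefile in place of the unsupported compute_11 / compute_20 entries. Other toolkit limits are not modelled, so check the nvcc documentation for your CUDA release.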
Old 2019-05-27, 03:07   #3161
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

92D₁₆ Posts
Default

Thanks. Commenting out the appropriate lines in the makefile resolved the issue.
Old 2019-06-06, 02:53   #3162
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

3⁴·29 Posts
Default

I recently got access to two Tesla V100 GPUs. However, running mfaktc gives me the following error:

Quote:
no kernel image is available for execution on the device
Adding CUDA Compute Capability 7.0 to the makefile solved the problem:

Code:
NVCCFLAGS += --generate-code arch=compute_70,code=sm_70
Is mfaktc supposed to support Volta GPUs out of the box?
Old 2019-06-07, 22:23   #3163
TheJudger
 
"Oliver"
Mar 2005
Germany

2·3·5·37 Posts
Default

Quote:
Originally Posted by ixfd64
I recently got access to two Tesla V100 GPUs. However, running mfaktc gives me the following error:



Adding CUDA Compute Capability 7.0 to the makefile solved the problem:

Code:
NVCCFLAGS += --generate-code arch=compute_70,code=sm_70
Is mfaktc supposed to support Volta GPUs out of the box?
Yes, all you need to do is to modify the Makefile as you already did!

Oliver
Old 2019-06-08, 17:27   #3164
ixfd64
Bemusing Prompter
 
"Danny"
Dec 2002
California

3⁴·29 Posts
Default

It might be a good idea for the makefile to have rules for different CUDA versions. For example, make cuda10 would automatically apply the flags for compute capabilities 3.0 to 7.5.

Ideally, non-developers shouldn't have to know how to modify makefiles. At the very least, the documentation should probably be updated.
Old 2019-06-16, 02:45   #3165
hansl
 
Apr 2019

5·41 Posts
Default

Hi, I've just built mfaktc-0.21 against CUDA 10.1 on Linux, with a GTX 970, and I'm getting errors like this:
Code:
mfaktc v0.21 (64bit built)

Compiletime options
  THREADS_PER_BLOCK         256
  SIEVE_SIZE_LIMIT          32kiB
  SIEVE_SIZE                193154bits
  SIEVE_SPLIT               250
  MORE_CLASSES              enabled

Runtime options
  SievePrimes               25000
  SievePrimesAdjust         1
  SievePrimesMin            5000
  SievePrimesMax            100000
  NumStreams                1
  CPUStreams                1
  GridSize                  0
  GPU Sieving               enabled
  GPUSievePrimes            82486
  GPUSieveSize              4Mi bits
  GPUSieveProcessSize       8Ki bits
  Checkpoints               enabled
  CheckpointDelay           30s
  WorkFileAddDelay          600s
  Stages                    enabled
  StopAfterFactor           bitlevel
  PrintMode                 full
  V5UserID                  (none)
  ComputerID                (none)
  AllowSleep                no
  TimeStampInResults        no

CUDA version info
  binary compiled for CUDA  10.10
  CUDA runtime version      10.10
  CUDA driver version       10.10

CUDA device info
  name                      GeForce GTX 970
  compute capability        5.2
  max threads per block     1024
  max shared memory per MP  98304 byte
  number of multiprocessors 13
  CUDA cores per MP         128
  CUDA cores - total        1664
  clock rate (CUDA cores)   1240MHz
  memory clock rate:        3505MHz
  memory bus width:         256 bit

Automatic parameters
  threads per grid          106496
ERROR: cudaStreamCreate() failed for stream 0
  cudaGetLastError() returned 2: out of memory
My settings may look a little odd at this point, since I've tried to lower anything that looked like it would be related to increased memory usage.

About 1 in every 5-10 attempts it actually gets past this error and then everything seems to run fine (I completed -st and -st2 with all tests passing). But most of the time I just get that "out of memory" error before it even gets started. Any ideas what is wrong?
Is it likely due to some improper configuration on my part when building?

Using nvidia-smi to report utilization (without mfaktc running) I see memory usage of: 152MiB / 4039MiB

Is it possible it is complaining about regular system memory, not GPU memory? I have 16GB of actual RAM, which is currently all used up by another process, but there are still gobs of swap space free on this system (>300GB) from an NVMe SSD.
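For anyone wanting to check GPU memory from a script before launching mfaktc, here is a minimal Python sketch. It assumes nvidia-smi is on the PATH and uses its CSV query mode; the function names are made up for illustration, and nothing here is part of mfaktc.

```python
import subprocess


def parse_memory_csv(text):
    """Parse 'used, total' pairs in MiB, one line per GPU."""
    result = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        result.append((used, total))
    return result


def query_gpu_memory():
    """Ask nvidia-smi for used and total memory of every GPU, in MiB."""
    completed = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_memory_csv(completed.stdout)
```

Calling query_gpu_memory() on the machine in question returns a list of (used, total) tuples, one per GPU. With the figures quoted above, parse_memory_csv("152, 4039") gives [(152, 4039)], i.e. the card itself has plenty of free memory.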
Old 2019-06-23, 04:26   #3166
hansl
 
Apr 2019

5·41 Posts
Default

Never mind about the above. It seems the issue was fixed after rebooting. I had been waiting for a very long job to finish and I think I must have forgotten to reboot since installing CUDA, causing it to behave in all kinds of strange ways.
Old 2019-07-03, 02:49   #3167
kriesel
 
"TF79LL86GIMPS96gpu17"
Mar 2017
US midwest

4843₁₀ Posts
Default false factor 38814612911305349835664385407 generated reliably by display TDRs

Playing with an old, slow GPU from my spare parts bin, on an equally old Core 2 Duo system, I found that even after it passes ~2900 self-tests, TDR timeouts reliably produce the known false factor 38814612911305349835664385407 in mfaktc:
Code:
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:01:57.14 
mfaktc v0.21 (64bit built)

Compiletime options
  THREADS_PER_BLOCK         256
  SIEVE_SIZE_LIMIT          32kiB
  SIEVE_SIZE                193154bits
  SIEVE_SPLIT               250
  MORE_CLASSES              enabled

Runtime options
  SievePrimes               25000
  SievePrimesAdjust         1
  SievePrimesMin            5000
  SievePrimesMax            100000
  NumStreams                3
  CPUStreams                3
  GridSize                  3
  GPU Sieving               enabled
  GPUSievePrimes            82486
  GPUSieveSize              64Mi bits
  GPUSieveProcessSize       16Ki bits
  Checkpoints               enabled
  CheckpointDelay           900s
  WorkFileAddDelay          600s
  Stages                    enabled
  StopAfterFactor           bitlevel
  PrintMode                 full
  V5UserID                  kriesel
  ComputerID                eaglet-nvs295
  AllowSleep                no
  TimeStampInResults        yes

CUDA version info
  binary compiled for CUDA  6.50
  CUDA runtime version      6.50
  CUDA driver version       6.50

CUDA device info
  name                      Quadro NVS 295
  compute capability        1.1
  max threads per block     512
  max shared memory per MP  16384 byte
  number of multiprocessors 1
  CUDA cores per MP         8
  CUDA cores - total        8
  clock rate (CUDA cores)   1300MHz
  memory clock rate:        695MHz
  memory bus width:         64 bit

Automatic parameters
  threads per grid          1048576
  GPUSievePrimes (adjusted) 82486
  GPUsieve minimum exponent 1055144

running a simple selftest...
Selftest statistics
  number of tests           107
  successfull tests         107

selftest PASSED!

got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min =  19676691147960
 k_max =  39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:02 |    0   0.1% |  3.928   1h02m |    182.63    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:02:39.49mfaktc quadro nvs295 exit logged by batch wrapper 
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:02:39.53 
mfaktc v0.21 (64bit built)
...

selftest PASSED!

got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min =  19676691147960
 k_max =  39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:03 |    0   0.1% |  3.573  57m07s |    200.78    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:03:21.71mfaktc quadro nvs295 exit logged by batch wrapper 
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:03:21.72 
mfaktc v0.21 (64bit built)
...

selftest PASSED!

got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min =  19676691147960
 k_max =  39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:04 |    0   0.1% |  3.604  57m36s |    199.05    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:04:03.89mfaktc quadro nvs295 exit logged by batch wrapper 
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:08:00.62 
mfaktc v0.21 (64bit built)
...

selftest PASSED!

got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min =  19676691147960
 k_max =  39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:08 |    0   0.1% |  3.629  58m00s |    197.68    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:08:42.44mfaktc quadro nvs295 exit logged by batch wrapper 
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:08:42.60 
mfaktc v0.21 (64bit built)
...

selftest PASSED!

got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min =  19676691147960
 k_max =  39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:09 |    0   0.1% |  3.606  57m38s |    198.94    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:09:25.30mfaktc quadro nvs295 exit logged by batch wrapper 
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:09:25.36 
mfaktc v0.21 (64bit built)
...

selftest PASSED!

got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min =  19676691147960
 k_max =  39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:10 |    0   0.1% |  3.628  57m59s |    197.74    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:10:07.93mfaktc quadro nvs295 exit logged by batch wrapper 
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:30:12.32 
mfaktc v0.21 (64bit built)
...

selftest PASSED!

got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min =  19676691147960
 k_max =  39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:30 |    0   0.1% |  3.557  56m51s |    201.68    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:30:54.43mfaktc quadro nvs295 exit logged by batch wrapper 
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:30:54.49 
mfaktc v0.21 (64bit built)
...

selftest PASSED!

got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min =  19676691147960
 k_max =  39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:31 |    0   0.1% |  3.585  57m18s |    200.11    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:31:36.79mfaktc quadro nvs295 exit logged by batch wrapper 
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:31:37.01 
mfaktc v0.21 (64bit built)
...

selftest PASSED!

got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min =  19676691147960
 k_max =  39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:32 |    0   0.1% |  4.533   1h12m |    158.26    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:32:20.23mfaktc quadro nvs295 exit logged by batch wrapper 
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:44:07.67 
mfaktc v0.21 (64bit built)
...

selftest PASSED!

got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min =  19676691147960
 k_max =  39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:44 |    0   0.1% |  3.522  56m18s |    203.69    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:44:49.76mfaktc quadro nvs295 exit logged by batch wrapper 
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:44:49.82 
mfaktc v0.21 (64bit built)
...

selftest PASSED!

got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min =  19676691147960
 k_max =  39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:45 |    0   0.1% |  3.556  56m50s |    201.74    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:45:32.10mfaktc quadro nvs295 exit logged by batch wrapper 
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:45:32.15 
mfaktc v0.21 (64bit built)
...

selftest PASSED!

got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min =  19676691147960
 k_max =  39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 20:46 |    0   0.1% |  3.548  56m43s |    202.20    82485    n.a.%
M119998999 has a factor: 38814612911305349835664385407
ERROR: cudaGetLastError() returned 30: unknown error
at Tue 07/02/2019 20:46:14.24mfaktc quadro nvs295 exit logged by batch wrapper
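A reported factor like this can be sanity-checked independently of the GPU. The Python sketch below is only an illustration with hypothetical helper names: it checks whether q lies in the bit range that was being searched, and whether q divides 2^p - 1 (equivalently, whether 2^p mod q equals 1). The exponent, bit levels and factor are taken from the log above.

```python
def is_mersenne_factor(p, q):
    """True if q divides 2**p - 1, i.e. 2**p is congruent to 1 modulo q."""
    return q > 1 and pow(2, p, q) == 1


def in_bit_range(q, bit_min, bit_max):
    """True if 2**bit_min <= q < 2**bit_max."""
    return (1 << bit_min) <= q < (1 << bit_max)


if __name__ == "__main__":
    p = 119998999                        # exponent from the log
    q = 38814612911305349835664385407    # factor reported in the log
    print("bits in q:", q.bit_length())
    print("inside 2^72..2^73:", in_bit_range(q, 72, 73))
    print("divides M(p):", is_mersenne_factor(p, q))
```

The range check alone already flags the result: the run covers 2^72 to 2^73, while the reported value is a 95-bit number. Since this is described as a known false factor, the divisibility check would be expected to fail as well, and three-argument pow makes that cheap to confirm on any CPU.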
At this point, it becomes obvious that the known bad factor, the absurdly optimistic GHz-d/day rate, the error 30, and the periodic screen blanking correspond 1:1, in 12 tries, to the default TdrDelay of 2 seconds being exceeded. Run REGEDT32, add the TdrDelay DWORD value 0x20 (32 seconds) as described at
https://docs.microsoft.com/en-us/windows-hardware/drivers/display/tdr-registry-keys
and retry, which yields success of a sort:
Code:
batch wrapper logs (re)launch of EAGLET mfaktc quadro nvs295 at Tue 07/02/2019 20:57:36.34 
mfaktc v0.21 (64bit built)

Compiletime options
  THREADS_PER_BLOCK         256
  SIEVE_SIZE_LIMIT          32kiB
  SIEVE_SIZE                193154bits
  SIEVE_SPLIT               250
  MORE_CLASSES              enabled

Runtime options
  SievePrimes               25000
  SievePrimesAdjust         1
  SievePrimesMin            5000
  SievePrimesMax            100000
  NumStreams                3
  CPUStreams                3
  GridSize                  3
  GPU Sieving               enabled
  GPUSievePrimes            82486
  GPUSieveSize              64Mi bits
  GPUSieveProcessSize       16Ki bits
  Checkpoints               enabled
  CheckpointDelay           900s
  WorkFileAddDelay          3600s
  Stages                    enabled
  StopAfterFactor           bitlevel
  PrintMode                 full
  V5UserID                  kriesel
  ComputerID                eaglet-nvs295
  AllowSleep                no
  TimeStampInResults        yes

CUDA version info
  binary compiled for CUDA  6.50
  CUDA runtime version      6.50
  CUDA driver version       6.50

CUDA device info
  name                      Quadro NVS 295
  compute capability        1.1
  max threads per block     512
  max shared memory per MP  16384 byte
  number of multiprocessors 1
  CUDA cores per MP         8
  CUDA cores - total        8
  clock rate (CUDA cores)   1300MHz
  memory clock rate:        695MHz
  memory bus width:         64 bit

Automatic parameters
  threads per grid          1048576
  GPUSievePrimes (adjusted) 82486
  GPUsieve minimum exponent 1055144

running a simple selftest...
Selftest statistics
  number of tests           107
  successfull tests         107

selftest PASSED!

got assignment: exp=119998999 bit_min=72 bit_max=73 (7.97 GHz-days)
Starting trial factoring M119998999 from 2^72 to 2^73 (7.97 GHz-days)
 k_min =  19676691147960
 k_max =  39353382296711
Using GPU kernel "barrett76_mul32_gs"
Date    Time | class   Pct |   time     ETA | GHz-d/day    Sieve     Wait
Jul 02 21:05 |    0   0.1% | 411.97   4d13h |      1.74    82485    n.a.%
Jul 02 21:11 |    5   0.2% | 411.85   4d13h |      1.74    82485    n.a.%
Jul 02 21:18 |    9   0.3% | 411.49   4d13h |      1.74    82485    n.a.%
Jul 02 21:25 |   12   0.4% | 411.39   4d13h |      1.74    82485    n.a.%
I had estimated it at around 2.5 GHz-d/day, so this is more consistent with expectations. And perhaps a contender for the turtle consolation prize in GPU benchmarks. At 23 W for about 2 GHz-d/day it is rather costly, too. And this little gem does not have double-precision floats, so no P-1 or primality testing.
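For reference, the registry change described above can also be expressed as a .reg file. The Python sketch below only builds the file text; the helper name is hypothetical, and the key path (GraphicsDrivers under HKEY_LOCAL_MACHINE) is the one given in Microsoft's TDR registry keys documentation linked earlier. Importing it needs administrator rights, and a reboot is typically required before the new timeout applies.

```python
GRAPHICS_DRIVERS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
)


def tdr_delay_reg_file(seconds):
    """Return the text of a .reg file that sets TdrDelay to `seconds`."""
    if not 0 <= seconds <= 0xFFFFFFFF:
        raise ValueError("TdrDelay must fit in a 32-bit DWORD")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{GRAPHICS_DRIVERS_KEY}]\n"
        # .reg files store DWORD values as eight hexadecimal digits.
        f'"TdrDelay"=dword:{seconds:08x}\n'
    )


if __name__ == "__main__":
    # 0x20 = 32 seconds, the value used in the post above.
    print(tdr_delay_reg_file(0x20), end="")
```

A longer TdrDelay means a genuinely hung display driver also takes longer to recover, so on a slow card it may be preferable to keep each kernel launch short (for example via GridSize) rather than raise the timeout further.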
Old 2019-07-06, 01:07   #3168
hansl
 
Apr 2019

5·41 Posts
Default

Are there any particular recommended settings that would help to maximize throughput for TF of many large exponents (between 10^9 and 2^32) to fairly low bit levels? I've tried a few different combinations to reduce overhead, but nothing in particular seems to make a noticeable difference.

I'm using a "less classes" build by the way.

Also, strangely, I seem to get better performance from my laptop's Quadro M1000M than from a GTX 780. I would have expected the GTX to be faster, having more than 4x the CUDA core count of the mobile Quadro.
https://www.videocardbenchmark.net/c...780/3349vs2525
The Quadro does have a somewhat higher core clock, though; is that all that matters in this case?
All times are UTC.

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.