2023-02-05, 10:51  #1
"Oliver"
Sep 2017
Porta Westfalica, DE
2640_{8} Posts 
One trillion digits of Pi download
Around five years ago, I finished a calculation of 1e12 digits of Pi and wanted to share the results. Back then, I had problems getting a Torrent setup to work. With a bit of James Heinrich's guidance, it now works. The Torrent file is attached.
Be aware that this is a huge download (around 406 GB)!
2023-02-05, 13:06  #2
"Robert Gerbicz"
Oct 2005
Hungary
1,621 Posts 
Quote:
Code:
? log(10)/log(256)*10^12/2^30.
%1 = 386.72332825224873934294090483912618086

2023-02-05, 14:08  #3
"Oliver"
Sep 2017
Porta Westfalica, DE
2^{5}×3^{2}×5 Posts 
FWIW, I used pigz with the -H flag (Huffman-only compression) and increased the block size dramatically.
Another option would have been to use the digit compressor by Mysticial, but that is not a "standard compression".
For Huffman, one should expect a tree with six entries of size 3 and four of size 4. That would give an expected file size of 6e11 * 3/8 + 4e11 * 4/8 = 4.25e11 bytes, which is nearly 396 binary GB. This is of course more than the optimum, but also less than the actual file size, most likely because of overhead and suboptimal tree choices.
The best option, of course, would be to convert to the hexadecimal representation and then store two hexadecimal digits per byte. I went with pigz to be practical. Yes, I used around 5 % more data than optimal, but I think I can argue that this is not useless. You can use all standard gz tools like zcat etc., and extracting is fast. Base conversion takes much more time and LOTS of memory.
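The Huffman estimate above can be checked with a few lines of Python. This is only a sketch under the assumption that all ten digits are exactly equally frequent; `huffman_code_lengths` is a hypothetical helper written for this illustration and is not what pigz does internally.

```python
import heapq
from collections import Counter


def huffman_code_lengths(weights):
    """Return {symbol: code length in bits} for a Huffman code over `weights`."""
    # Heap entries: (total weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(sorted(weights.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


TOTAL_DIGITS = 10**12

# Assumption: each digit 0..9 occurs equally often.
lengths = huffman_code_lengths({str(d): 1 for d in range(10)})
histogram = dict(Counter(lengths.values()))

# Each digit value accounts for one tenth of the file.
total_bits = sum(length * (TOTAL_DIGITS // 10) for length in lengths.values())
total_bytes = total_bits // 8

print(histogram)                       # how many codes of each length
print(total_bytes, total_bytes / 2**30)
```

The histogram shows six codes of 3 bits and four codes of 4 bits, and the byte count matches the 4.25e11 figure, before any container or per-block overhead.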
2023-02-05, 15:26  #4
"Robert Gerbicz"
Oct 2005
Hungary
1,621 Posts 
Quote:
Code:
? 10^12/2/2^30.
%2 = 465.66128730773925781250000000000000001
Code:
? 10^12/12*5/2^30.
%10 = 388.05107275644938151041666666666666668
https://www.spoj.com/problems/MAGIC2/
The first few solvers (including me) achieve a better compression rate than the popular programs do.
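The second figure reads as "12 decimal digits in 5 bytes", which works because 10^12 < 2^40. A minimal sketch of such a packing follows; the function names and the big-endian byte order are assumptions made for the illustration, not something taken from the post.

```python
DIGITS_PER_GROUP = 12
BYTES_PER_GROUP = 5  # 10**12 < 2**40, so 12 digits fit into 40 bits


def pack_digits(digits: str) -> bytes:
    """Pack a decimal string (length a multiple of 12) into 5 bytes per group."""
    if len(digits) % DIGITS_PER_GROUP:
        raise ValueError("digit count must be a multiple of 12")
    out = bytearray()
    for i in range(0, len(digits), DIGITS_PER_GROUP):
        group = int(digits[i:i + DIGITS_PER_GROUP])
        out += group.to_bytes(BYTES_PER_GROUP, "big")
    return bytes(out)


def unpack_digits(data: bytes) -> str:
    """Inverse of pack_digits; leading zeros of each group are restored."""
    if len(data) % BYTES_PER_GROUP:
        raise ValueError("byte count must be a multiple of 5")
    parts = []
    for i in range(0, len(data), BYTES_PER_GROUP):
        value = int.from_bytes(data[i:i + BYTES_PER_GROUP], "big")
        parts.append(f"{value:0{DIGITS_PER_GROUP}d}")
    return "".join(parts)


sample = "141592653589793238462643"  # first 24 decimals of Pi
packed = pack_digits(sample)
print(len(sample), "digits ->", len(packed), "bytes")
```

Decoding needs no base conversion of the whole number, only of each 12-digit group, so it stays fast and uses almost no memory.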

2023-02-05, 15:36  #5
Undefined
"The unspeakable one"
Jun 2006
My evil lair
1A3A_{16} Posts 
I like your alternative way of expressing "plain binary": two hexadecimal digits per byte.
Last fiddled with by retina on 2023-02-05 at 15:37
2023-02-05, 15:43  #6
"Oliver"
Sep 2017
Porta Westfalica, DE
2^{5}·3^{2}·5 Posts 
Quote:
Quote:
Quote:


2023-02-05, 16:02  #7
"Robert Gerbicz"
Oct 2005
Hungary
1,621 Posts 

2023-02-07, 15:46  #8
"/X\(‘‘)/X\"
Jan 2013
2·11^{2}·13 Posts 
Just out of curiosity, I took the first million decimal and first million hexadecimal digits of pi to see how various compression commands/algorithms worked. Obviously not as good as plain binary.
Source files were 1,000,002 bytes (from All Digits of Pi).
brotli pi_dec_1m.txt => 424825
brotli pi_hex_1m.txt => 500051
bzip2 pi_dec_1m.txt => 431435
bzip2 pi_hex_1m.txt => 509456
gzip -9 pi_dec_1m.txt => 470449
gzip -9 pi_hex_1m.txt => 569818
lz4 -z pi_dec_1m.txt => 948602
lz4 -z pi_hex_1m.txt => 1000021
cat pi_dec_1m.txt | ~/go/bin/snappycompress => 851417
cat pi_hex_1m.txt | ~/go/bin/snappycompress => 1000140
xz -z pi_dec_1m.txt => 437952
xz -z pi_hex_1m.txt => 519644
zip -9 pi_dec_1m.txt => 470593
zip -9 pi_hex_1m.txt => 569962
cat pi_dec_1m.txt | zstd -z => 484361
cat pi_hex_1m.txt | zstd -z => 516778
The conclusion is that algorithms at both ends of the alphabet do better, while those in the middle barely compress if at all.
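A comparison of this kind can be roughly reproduced with Python's standard library alone. The sketch below is not the test above: it uses seeded pseudo-random decimal digits as a stand-in for Pi (an assumption, since no digit file ships with it), only 100,000 of them, and only the three codecs the stdlib offers.

```python
import bz2
import lzma
import math
import random
import zlib

N_DIGITS = 100_000

# Stand-in data: pseudo-random decimal digits, NOT actual digits of Pi.
rng = random.Random(0)
data = "".join(rng.choices("0123456789", k=N_DIGITS)).encode("ascii")

compressors = {
    "zlib level 9": lambda raw: zlib.compress(raw, 9),
    "bz2 level 9": lambda raw: bz2.compress(raw, 9),
    "lzma (xz)": lzma.compress,
}

sizes = {name: len(fn(data)) for name, fn in compressors.items()}

# Information-theoretic minimum: log2(10) bits per decimal digit.
bound = math.ceil(N_DIGITS * math.log2(10) / 8)

for name, size in sizes.items():
    print(f"{name:14s} {size:7d} bytes  ({size / bound - 1:+.1%} vs. bound)")
print(f"{'lower bound':14s} {bound:7d} bytes")
```

None of the general-purpose codecs should get below the log2(10) bits per digit bound on data like this; the spread between them shows how much each one pays for looking for patterns that are not there.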
2023-02-07, 15:49  #9
"Oliver"
Sep 2017
Porta Westfalica, DE
2640_{8} Posts 
Compared to pigz -H (437102 bytes), only brotli and bzip2 produce smaller files.
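For anyone who wants to try Huffman-only deflate without pigz, zlib exposes the same strategy as `Z_HUFFMAN_ONLY`. The sketch below is a stand-in and not pigz itself: it writes a zlib stream rather than a gzip file, uses zlib's own block sizes, and runs on seeded pseudo-random digits rather than Pi. `huffman_only_size` is a hypothetical helper name.

```python
import math
import random
import zlib

N_DIGITS = 100_000


def huffman_only_size(raw: bytes, level: int = 9) -> int:
    """Size of `raw` after deflate with string matching disabled."""
    comp = zlib.compressobj(
        level, zlib.DEFLATED, zlib.MAX_WBITS, 9, zlib.Z_HUFFMAN_ONLY
    )
    return len(comp.compress(raw) + comp.flush())


# Stand-in data: pseudo-random decimal digits, NOT actual digits of Pi.
rng = random.Random(0)
data = "".join(rng.choices("0123456789", k=N_DIGITS)).encode("ascii")

huff_size = huffman_only_size(data)
bound = math.ceil(N_DIGITS * math.log2(10) / 8)

print("Huffman only:", huff_size, "bytes")
print("lower bound :", bound, "bytes")
```

With codes of 3 and 4 bits the result should land near 3.4 bits per digit plus per-block tree headers, which is why Huffman-only can come out ahead of full deflate on digits that contain no useful repeats.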

2023-02-08, 20:43  #10
"Carlos Pinho"
Oct 2011
Milton Keynes, UK
2^{2}×5×257 Posts 
Thank you. I'm still downloading it; another day and it should be complete.

2023-02-20, 00:46  #11
Sep 2016
373 Posts 
y-cruncher's .ycd format stores 19 decimal digits in 8 bytes (1.40% overhead).
Not as good as 3 digits / 10 bits (0.34% overhead), but the 8-byte alignment worked better from a coding perspective. In retrospect, I could've used 40-byte chunks for 320-bit blocks with 3 digits / 10 bits, as that would achieve both the 1024/1000 efficiency and alignment at the same time. But I made this decision some 10 years ago, and the 1.40% -> 0.34% improvement isn't big enough to justify redesigning the whole thing.
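The two overhead figures, and the 40-byte idea, can be sketched in Python as follows. The block layout below (32 ten-bit groups, big-endian, 96 digits per 40-byte block) is purely an assumption for illustration; it is not the .ycd format, and `overhead`, `pack_block` and `unpack_block` are hypothetical names.

```python
import math

OPTIMAL_BITS_PER_DIGIT = math.log2(10)


def overhead(bits: int, digits: int) -> float:
    """Relative overhead of storing `digits` decimal digits in `bits` bits."""
    return bits / digits / OPTIMAL_BITS_PER_DIGIT - 1


BLOCK_BYTES = 40                       # 320 bits
GROUPS_PER_BLOCK = BLOCK_BYTES * 8 // 10
DIGITS_PER_BLOCK = GROUPS_PER_BLOCK * 3


def pack_block(digits: str) -> bytes:
    """Pack 96 decimal digits into one 40-byte block, 3 digits per 10 bits."""
    if len(digits) != DIGITS_PER_BLOCK:
        raise ValueError("a block holds exactly 96 digits")
    value = 0
    for i in range(0, DIGITS_PER_BLOCK, 3):
        value = (value << 10) | int(digits[i:i + 3])  # 0..999 fits in 10 bits
    return value.to_bytes(BLOCK_BYTES, "big")


def unpack_block(block: bytes) -> str:
    """Inverse of pack_block."""
    value = int.from_bytes(block, "big")
    groups = []
    for _ in range(GROUPS_PER_BLOCK):
        groups.append(f"{value & 0x3FF:03d}")
        value >>= 10
    return "".join(reversed(groups))


print(f"19 digits / 64 bits: {overhead(64, 19):.2%}")
print(f" 3 digits / 10 bits: {overhead(10, 3):.2%}")
```

Since 320 is a multiple of both 10 and 64, such a block holds a whole number of 10-bit groups and still spans exactly five 64-bit words.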