#1 | Nov 2010
I recently started working on NFS postprocessing for DLOG and came across an interesting thing called MapReduce. I noticed that the duplicate and singleton removal stages of the postprocessing fit the MapReduce distributed programming paradigm very well. I wonder if anyone has tried to implement distributed postprocessing before? It could be quite interesting for all the distributed sieving projects, since with MapReduce one generally doesn't need to collect all the data from all the sources in one place; instead the peers do point-to-point exchanges among themselves.
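For example, duplicate removal could look roughly like this (just a toy Python sketch, assuming a relation is identified by its (a, b) pair; none of this is from a real implementation):

```python
# Toy sketch of duplicate removal as a MapReduce job.
# Assumption: a relation is a text line whose first two fields are a and b,
# and two relations are duplicates iff they share the same (a, b) pair.

def map_relation(line):
    a, b = line.split(',')[:2]
    yield (a, b), line          # key = (a, b), value = the whole relation

def reduce_duplicates(key, lines):
    yield lines[0]              # keep one representative per (a, b) key

def mapreduce(relations, mapper, reducer):
    # single-machine driver, only to show the data flow
    groups = {}
    for rel in relations:
        for key, value in mapper(rel):
            groups.setdefault(key, []).append(value)
    for key, values in groups.items():
        yield from reducer(key, values)

unique = list(mapreduce(["1,2,rest", "1,2,rest", "3,4,rest"],
                        map_relation, reduce_duplicates))
print(unique)   # two relations survive, the duplicate is gone
```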

#2 | Sep 2004
I didn't read the link you gave, but I was wondering how it would manage to organize the relation file in parallel with all the computers connected.
For a task with 31-bit large primes the relation file is ~14 GB zipped. For example, 2,1019- had a relation file of more than 100 GB and a matrix of ~25 GB. The point-to-point exchanges between peers would amount to a lot of gigabytes.

Last fiddled with by em99010pepe on 2012-12-05 at 11:19

#3 | Nov 2010
Say you have N relations spread over P peers, each holding N/P relations locally. If you want to collect all the data in one place, you need O((P-1)·N/P) communication time and then O(N) time to process the data.
In MapReduce both times could be about P times smaller. After the Map stage each peer holds O(N/P) key-value pairs, which it sorts and then sends O(N/P^2) of to each of the other peers: the first O(N/P^2) key-value pairs go to its nearest neighbour, the next O(N/P^2) to the next nearest, and so on. Each peer needs to be able to connect to any other peer, but only to one at a time. Since all the send/receive operations run in parallel, the overall communication time is O((P-1)·N/P^2). Once the exchanges finish, each peer again holds O(N/P) key-value pairs to process in the Reduce stage. Of course, in real life the peers won't hold exactly equal amounts of data and key-value pairs, but in general there should be a very noticeable speedup.
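As an illustration, here is a toy single-machine simulation of that exchange (the peer count and hashing scheme are made up, just to show how each key-value pair gets routed to exactly one owner):

```python
# Toy simulation of the shuffle: P peers, each starts with ~N/P key-value
# pairs and sends every pair to the peer that owns hash(key) mod P.

import hashlib

P = 4

def owner(key):
    # deterministic hash so every peer agrees on the destination
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % P

def shuffle(local_data):
    """local_data[p] = list of (key, value) pairs sitting on peer p."""
    outboxes = [[[] for _ in range(P)] for _ in range(P)]
    for p in range(P):                       # Map side: bucket by destination
        for key, value in local_data[p]:
            outboxes[p][owner(key)].append((key, value))
    # the P-1 point-to-point exchanges; done pairwise and in parallel in real life
    received = [[] for _ in range(P)]
    for src in range(P):
        for dst in range(P):
            received[dst].extend(outboxes[src][dst])
    return received                          # input to the Reduce stage

data = [[("key%d" % (i * 10 + j), j) for j in range(10)] for i in range(P)]
print([len(r) for r in shuffle(data)])       # each peer ends up with ~N/P pairs
```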

#4 | (loop (#_fork)) | Feb 2006 | Cambridge, England
Dedupe and singleton removal were done on a cluster for RSA768.
For anything smaller than that, they're a trivial amount of work (100GB of disc and a few GB of memory) and basically limited by disc I/O speed. |

#6 | Tribal Bullet | Oct 2004
MapReduce works extremely well on these kinds of problems, but the implicit assumption when using it is that you have an HPC-class distributed filesystem. Otherwise the main operation, shuffling keys around, finishes very quickly and disk I/O becomes the bottleneck.
I have considered an MPI decomposition of the problem across a few machines, so that duplicate and singleton removal would split the large hash tables involved across the memory of several machines. MPI even has primitives for remote access to a distributed hash table.
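Something along those lines could be sketched with mpi4py, for example (a toy illustration only, not msieve code; the data and bucket scheme are invented):

```python
# Toy sketch (mpi4py, not msieve): spread the duplicate-removal hash table
# across the memories of several MPI ranks.  Each rank buckets its local
# relations by owner, routes them with one all-to-all exchange, and the
# owning rank holds that slice of the global hash table.
# Run with e.g.:  mpiexec -n 4 python dedup_mpi.py   (filename is made up)

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# pretend these (a, b) pairs were read from this rank's slice of the relation file
local_relations = [(i % 100, i % 100 + 1) for i in range(rank, 1000, size)]

# bucket each relation by the rank that owns hash((a, b)) mod size
outgoing = [[] for _ in range(size)]
for a, b in local_relations:
    outgoing[hash((a, b)) % size].append((a, b))

# one collective all-to-all replaces the P-1 point-to-point exchanges
incoming = comm.alltoall(outgoing)

# each rank now deduplicates a disjoint part of the key space
seen = set()
for chunk in incoming:
    seen.update(chunk)

total = comm.reduce(len(seen), op=MPI.SUM, root=0)
if rank == 0:
    print("unique relations:", total)
```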

#7 | Nov 2010
And there's a working MPI MapReduce implementation. ;)
I believe one can also do naive large prime removal using a double MapReduce.
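Roughly like this, I imagine (a toy Python sketch of one round, with made-up data; pass 1 counts how often each large prime occurs, pass 2 filters out relations containing a prime that occurs only once, and the round is repeated until nothing more drops out):

```python
# Toy sketch of one round of naive large prime (singleton) removal
# as two MapReduce passes.  A "relation" here is just the set of large
# prime ideals it contains.

from collections import defaultdict

def count_primes(relations):
    """Pass 1: map each relation to (prime, 1) pairs, reduce by summing."""
    counts = defaultdict(int)
    for rel in relations:
        for p in rel:
            counts[p] += 1
    return counts

def drop_singletons(relations, counts):
    """Pass 2: keep a relation only if every one of its primes occurs elsewhere too."""
    return [rel for rel in relations if all(counts[p] > 1 for p in rel)]

relations = [{101, 103}, {101, 103, 107}, {107, 109}, {109, 101}, {113}]
while True:
    survivors = drop_singletons(relations, count_primes(relations))
    if len(survivors) == len(relations):
        break                      # no relation removed, we are done
    relations = survivors          # removals can create new singletons, so repeat
print(relations)                   # {113} is removed, the other four survive
```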