mersenneforum.org Cloud computing

 2011-04-28, 15:50 #1 Unregistered   2²·1,019 Posts Cloud computing Can cloud computing be used to quickly test large numbers to determine if they are prime?
 2011-04-28, 18:05 #2 cheesehead     "Richard B. Woods" Aug 2002 Wisconsin USA 2²×3×641 Posts It can, depending on your definitions of "quickly" and "large", if the cloud has appropriate primality-testing software.
2011-04-28, 18:30   #3
R. Gerbicz

"Robert Gerbicz"
Oct 2005
Hungary

1,493 Posts

Quote:
 Originally Posted by Unregistered Can cloud computing be used to quickly test large numbers to determine if they are prime?
2^(2^1000000000000000000000) is not prime, and I haven't used cloud computing to prove this. I hope this is a large number.

 2011-04-28, 19:51 #4 CRGreathouse     Aug 2006 3·1,993 Posts The question is ambiguous. I will take it a different way from the other posters: "Can a number be tested for primality faster with many computers than just one?" That is, could a sufficiently wealthy person test a number faster by using, say, AWS than a single fast machine?

Even this reformulation is very open-ended. Are we talking about proving primality or just characterizing primality with a high degree of confidence? How much confidence is needed in the latter case? And how many machines do we have access to? A typical person might use 10 EC2 instances; a motivated person might be able to get 100 times as many for a short time; a rich person with appropriate connections and lots of cash could get perhaps 100 times that amount; a large country should be able to manage 100 times that many; the world, faced with some gargantuan task "compute this or be destroyed" might be able to ramp up production and usurp all earthly capacity for another factor of 100. But if we're talking pure theory we could go higher.

If probable-primality is good enough I don't know of a good way to scale this at all, though with extra machines you could get confidence in the result. For one-in-a-trillion confidence only 20 fast computers would be needed, plus some number of slower ones to check the incremental results (say 500 machines at a tenth the speed).

If primality needs to be proven it will depend in part on the number itself. ECPP can be parallelized to some degree, but I'm not sure how this scales. If the number of computers can be unreasonable (a mesh of 10^50 nodes, say) then there are other techniques available I imagine.
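One way to read the "20 fast computers" figure, and this is an assumption rather than something the post states, is as 20 independent Miller-Rabin rounds: a composite number passes a round with a random base with probability at most 1/4, so 20 rounds leave an error bound of 4^-20, a bit under one in a trillion. A minimal Python sketch, with each round written as an independent function call that could in principle run on its own machine (the function names are made up for illustration):

```python
import random

def miller_rabin_round(n, a):
    """Return True if odd n > 2 passes one Miller-Rabin round to base a."""
    d, s = n - 1, 0
    while d % 2 == 0:          # write n - 1 as d * 2**s with d odd
        d //= 2
        s += 1
    x = pow(a, d, n)
    if x == 1 or x == n - 1:
        return True
    for _ in range(s - 1):
        x = pow(x, 2, n)
        if x == n - 1:
            return True
    return False               # a is a witness: n is definitely composite

def is_probable_prime(n, rounds=20, rng=None):
    """Probable-prime test; every round is independent of the others."""
    if n < 2:
        return False
    if n in (2, 3):
        return True
    if n % 2 == 0:
        return False
    rng = rng or random.Random()
    bases = [rng.randrange(2, n - 1) for _ in range(rounds)]
    # Each entry of `bases` could be handed to a different machine.
    return all(miller_rabin_round(n, a) for a in bases)

# Worst-case chance that a composite survives 20 independent rounds.
error_bound = 0.25 ** 20
```

Note that the rounds only add confidence. They do not make a single round on a huge number finish any sooner, which matches the remark that probable-primality testing does not obviously scale with more machines.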
2011-04-28, 21:58   #5
Christenson

Dec 2010
Monticello

5×359 Posts

Quote:
 Originally Posted by CRGreathouse The question is ambiguous. I will take it a different way from the other posters: "Can a number be tested for primality faster with many computers than just one?" That is, could a sufficiently wealthy person test a number faster by using, say, AWS than a single fast machine?
Let's ask a different question whose results can easily be adjusted for specific meanings of the terminology: For each important problem in computational number theory, what is the smallest-grained parallelism that makes sense, given some kind of assumptions about the communications overhead?

That is, if we assume "cloud computing" means something like Primenet or NFS@home, with relatively large communications delays, then some jobs, like sieving, make sense, while others, like the block Lanczos reduction of the sieve output, make little sense because they require too much communication. That last tends to get done on the 900-node Lomonosov cluster because it has much better interprocessor communication.

LL tests, for example, don't make much sense to parallelise beyond the level of the Mersenne number being tested... but 10 days to test an 8 million digit number can be fast or slow, and the number can be large or small, depending on your perspective.

Trial Factoring, on the other hand, parallelises very effectively. So do ECM passes, as each of many curves can run independently. I don't know about P-1 tests.
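To illustrate why trial factoring splits so cleanly, here is a small Python sketch (function names are hypothetical, not taken from any GIMPS software). It relies on two standard facts: any factor q of 2^p - 1 with p an odd prime has the form q = 2kp + 1, and q must be 1 or 7 mod 8. Each range of k is searched with no communication at all between workers.

```python
def trial_factor_range(p, k_start, k_end):
    """Look for a factor q = 2*k*p + 1 of 2**p - 1 with k in [k_start, k_end)."""
    for k in range(k_start, k_end):
        q = 2 * k * p + 1
        if q % 8 not in (1, 7):      # factors of Mersenne numbers are +-1 mod 8
            continue
        if pow(2, p, q) == 1:        # q divides 2**p - 1
            return q
    return None

def split_ranges(k_max, workers):
    """Cut k = 1..k_max into contiguous chunks, one per worker."""
    step = -(-k_max // workers)      # ceiling division
    return [(lo, min(lo + step, k_max + 1)) for lo in range(1, k_max + 1, step)]

# Each (lo, hi) pair could be shipped to a different machine; here they
# simply run one after another.
found = [trial_factor_range(11, lo, hi) for lo, hi in split_ranges(10, 3)]
```

For M11 = 2047 = 23 × 89 the first chunk turns up 23 (k = 1) and, if the search starts past it, 89 (k = 4).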

2011-04-28, 23:33   #6
diep

Sep 2006
The Netherlands

757 Posts

Quote:
 Originally Posted by Unregistered Can cloud computing be used to quickly test large numbers to determine if they are prime?
In theory you could rent a reasonable box, say 8 cores and 8 GB of RAM or something, and get nearly full-time use of it if you're lucky enough that no one else rents at the same time. Yet the price is a bit high.

If I remember well, cloud computing was started by Amazon, at laughably high prices. Other companies saw this and also wanted to make some profit there.

Not too long ago, just a few years, most companies initially had a price for the above system of 1.40 dollar cents, based upon renting a 1 GHz type processor, similar to, say, a K8 processor, but it could also be slower.

So renting a system capable of quickly calculating your prime numbers would cost in the thousands of dollars just to run a single LL.

In the last few years prices got a bit cheaper and now they seem to have a more relaxed form of renting.

Realize a year has far over 8000 hours, so if they can now rent out an 8 core box
for 1.60 dollar per hour in total they are happy. However, you have zero guarantees then and pay a huge price.

How many hours does it take in total to do a check of a reasonable Mersenne with 8 simple cores, hyperthreading on of course? Say 4 weeks or so?

And of course other software also spoils your bandwidth now and then.

$1.60 * 24 hours * 30 days = $1152.
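The arithmetic is easy to replay. The short Python sketch below (a hypothetical helper, using only the $1.60 per hour rate and the run lengths mentioned in this post) shows the 30-day figure next to the "4 weeks or so" estimate:

```python
def rental_cost(rate_per_hour, days, hours_per_day=24):
    """Total rental cost for a box that is billed around the clock."""
    return rate_per_hour * hours_per_day * days

cost_30_days = rental_cost(1.60, 30)   # the figure used in the post
cost_4_weeks = rental_cost(1.60, 28)   # the "4 weeks or so" estimate

print(f"30 days: ${cost_30_days:.2f}, 4 weeks: ${cost_4_weeks:.2f}")
```

Either way a single check lands a little above a thousand dollars at that hourly rate.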

I'd swear you can buy similar crunching hardware for that.

The whole idea of cloud computing is just nonsense of course, except if you work for a huge company and can get that system time for free.

It's a few semi-civil servants who get ripped off at laughable rates.

The price at which you can commercially exploit system time must always be very high, as they must take into account that they won't sell much of the system time on those machines.

Furthermore, if you look in the definitions, some of the companies can, if they want to, still fall back on that 1 GHz definition of what a CPU node hour is.

Cloud computing will never be interesting if you aren't a member of a company that has these machines available to it for free.

It's a buzzword which, if I remember well, Amazon first tried to sell; then a number of civil servants were asleep and started yelling the buzzword until a number of companies also tried to make money with it, yet I'm not sure whether any company really makes money with it.

When I check the conditions of the different 'cloud computing' offers, and I do this every year or so, they can rip you off, and if you're not working for the government at a specific spot they will rip you off; especially if you're not a US citizen.

Please just compare the above cloud computing price with what some supercomputer centers offer. They offer a full fast node for 50 euro a week per node. That is a commercial price if you rent little there.

Most similar offers in cloud computing are a factor 10-100 or so higher than that for the same configuration, and you sure need that configuration for fast prime number checking.

A node usually is a 2 socket system.
A price no cloud computing center yet offers.
It's a commercial rip-off hype.
Clever buzz word, just makes them more cash than HPC centers already were charging.

Yet the civil servants who are not aware that they can also rent from HPC centers sign up too quickly for such hypes.

Signing a contract with a company to do your calculations or doing it yourself always is far cheaper.

Regards,
Vincent

Last fiddled with by diep on 2011-04-28 at 23:39

 2011-04-29, 02:30 #7 Christenson     Dec 2010 Monticello 5·359 Posts By diep's definitions, "Cloud" computing doesn't make sense for computational number theory. The value it offers (high availability, distribution, and convenient administration freeing up organizational bandwidth for other tasks) just doesn't justify the charges for what is essentially a very compute-intensive amateur operation. And, as I mentioned, NFS@home certainly uses some high-performance supercomputing (Teragrid and 900-node Lomonosov cluster) for a few hours at a time when reducing the sieve results into factors.

Now, a commercial organization might use cloud computing because most commercial computing is storage and database-intensive, with relatively minor amounts of computing. Remember, if you aren't playing video games, or editing movies, or god forbid, hunting primes, your 10 year old Win98 system at 400MHz is nearly indistinguishable from your multiple GHz, multigigabyte machine of today. It can save a lot of headaches when desktops and laptops die; all the critical data is in the cloud, so just buy your user a shiny new laptop when it dies after two or so years, whether from hardware failure or windows bloat. No fooling with backups; those are taken care of by the cloud.

Me, when I think of the cloud, I think of all the distributed computing projects, which are certainly running on thousands of computers distributed all over the world, none of which is terribly reliable, but, over time, contributing huge amounts of compute cycles. Every one of the seven PCs I run prime95 on has crashed this week, either for windows update, loss of power, or standard windows memory leaks in large applications, but my computations march on, only slightly delayed.

Last fiddled with by S485122 on 2011-04-29 at 08:14 Reason: orphan sentence
2011-04-29, 04:02   #8
diep

Sep 2006
The Netherlands

757 Posts

A cloud is inherently a commercial entity that gets sold, much unlike distributed computing. As clouds change over time, so obviously will that definition of a cloud.

I recently read a discussion among a few commercial companies that said they couldn't even look into cloud computing any further, as they needed big I/O and the internet line from their company to the cloud computing organization would always be factors (and not a little bit) short of what they actually needed to transfer in data.

So cloud computing for sure isn't for those in need of big I/O, as you can't transfer the data.

Secondly, there are a number of crypto guys around here, yet most of them deal with military secrets. Much more important than military secrets are financial secrets. While military secrets can be of value to other nations, in themselves they have no direct cash value for the persons involved, as it'll mean big trouble.

Public companies' secrets, however, turn into cash *immediately* and can also be used in sneaky ways: a direct payout, possibly without anyone noticing. Just look at the exchanges right now, where the fast traders are a factor 1000 faster than any government organisation would be able to analyze; in short, they won't be able to do so.

Reprogrammable FPGAs can get reprogrammed a few hours after the exchange closes and no one will ever figure it out, assuming the actual trading logic would even reveal why specific decisions get taken.

Cloud computing offers 0 protection against all that.

In practice cloud computing gets used as a commercial cash cow (as we say over here with a Dutch saying, but I'd guess it's pretty much an international saying).

You can easily quantify why, for prime numbers, commercial cloud computing will never be attractive.

Just think of being an organisation, what you buy and at what price, then compare it with what you would buy it for.

Then also use the model of how to exploit it financially. Let's use the more careful model (which makes less profit) that HPC uses.

That's 30% resource usage in the first year, 50% in the second and 70% in the third, after which the hardware is outdated. This is a rough model; HPC uses its hardware very efficiently compared to commercial datacenters.

Then you'll need a huge profit margin, as regularly 90% of a calculation center is idle, unlike HPC (where the above model applies).
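The utilisation model above can be turned into a rough break-even price per sold hour. The Python sketch below is only an illustration under loud assumptions: it counts hardware cost alone (no power, cooling, staff or profit), takes a year as 8760 hours, and borrows the $6k machine price from the example later in this post; the function name is made up.

```python
HOURS_PER_YEAR = 8760

def break_even_rate(purchase_price, yearly_utilisation):
    """Hardware-only price per sold hour needed to recover the purchase price."""
    sold_hours = sum(u * HOURS_PER_YEAR for u in yearly_utilisation)
    return purchase_price / sold_hours

# HPC-style model from the post: 30%, 50%, 70% usage over three years.
hpc_rate = break_even_rate(6000, [0.30, 0.50, 0.70])

# Commercial centre that sits 90% idle, i.e. 10% usage each year.
idle_rate = break_even_rate(6000, [0.10, 0.10, 0.10])

print(f"HPC model: ${hpc_rate:.2f}/h, 90% idle: ${idle_rate:.2f}/h")
```

Under these assumptions the mostly idle centre has to charge five times as much per hour just to pay for the same box, before any profit margin is added.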

Yet it all starts with WHAT do you buy it for?

Let's take a simple example. You will say it's silly maybe. But let's face it.
These datacenters all buy 2-socket machines. And usually that'll be an expensive Intel.
Pricewise it runs 6 months behind the market.

So if you could get a 2-socket Nehalem with the latest chip right now for, say, $6k fully configured, a machine with *those* specs is in production in such datacenters on a massive scale 6 months later. It takes time to order it and to deliver it and bla bla. It'll have fast disks, probably also some expensive SSDs, it'll have ECC RAM and quite a decent amount of it, 2 very expensive Intel processors and a rackmount. Just the price of the rackmount is already the price you'd build that box yourself for, equipped of course with the latest single-socket CPU and without ECC. Now for datacenters ECC might simply be a requirement, even though today's RAM is so good and usually already has a CRC by default (take the GDDR5 on GPUs: the RAM itself there simply won't make an error); you pay a price for all that.

We didn't even discuss GPU computing. What GPU do you buy to compute on? I bought a 6000 series AMD, only to figure out that it delivers a lot of double-precision gflops but seemingly has problems with integers; as a result I must do things in 14 bits of significance inside 32-bit integers. Bummer. Yet I'll reach a speed of some part of, maybe over 50% of, or maybe even better than 270M - 300M/s on a single GPU. On a CPU you'll hit 4M/s per core (@ 2.3 GHz or so). There is also a 6990 version of it with a lot more cores (PEs). I bought the card for 318 euro (it's now 280, I saw in another shop).

You'd really believe a datacenter would have an AMD GPU, huh? Oh come on. If you are really lucky it'll have a Tesla of $2200 inside a dual-socket Nehalem box worth $6k, and you'll have to pay for airco, system administrators and a huge profit factor. Just that already makes it too expensive, even though, as it happens, AMD seems not so well equipped in their GPUs for integer multiplication above 32 bits (though I'm awaiting some messages from AMD with news on whether my guess is correct).

If I want to switch to Nvidia I could switch to a GTX590 for 700 euro or so, yet such cards will NEVER make it into ANY datacenter. If it has a Tesla, that'll be a lot already. You realize a Tesla has 448 cores @ 1.15 GHz @ $2200,
and a GTX590 has 1024 cores @ 1.2-1.4 GHz @ less than half that price.

You can buy that GTX590 today; a datacenter would have it 1 year from now at the earliest,
and I can assure you it won't have them. Yet you must also pay for that $20k infrastructure in the price you pay for the cloud computing.
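The remark about working "in 14 bits significance inside 32 bits integers" is not explained further in the post. One plausible reading, offered here purely as an assumption, is that big numbers are stored as 14-bit limbs so that a limb product (at most 28 bits) plus a running limb and carry never overflows a 32-bit register. A small Python sketch of schoolbook multiplication in that representation, with an assertion standing in for the 32-bit limit (all names are invented for illustration):

```python
LIMB_BITS = 14
LIMB_MASK = (1 << LIMB_BITS) - 1
U32_MAX = (1 << 32) - 1

def to_limbs(n):
    """Split a non-negative integer into 14-bit limbs, least significant first."""
    limbs = []
    while n:
        limbs.append(n & LIMB_MASK)
        n >>= LIMB_BITS
    return limbs or [0]

def from_limbs(limbs):
    """Reassemble an integer from its 14-bit limbs."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))

def mul_limbs(a, b):
    """Schoolbook product where every intermediate value fits in 32 bits."""
    out = [0] * (len(a) + len(b))
    for i, ai in enumerate(a):
        carry = 0
        for j, bj in enumerate(b):
            t = out[i + j] + ai * bj + carry
            assert t <= U32_MAX          # never exceeds a 32-bit register
            out[i + j] = t & LIMB_MASK
            carry = t >> LIMB_BITS
        out[i + len(b)] = carry
    return out

x, y = 2**89 - 1, 2**61 - 1
product = from_limbs(mul_limbs(to_limbs(x), to_limbs(y)))
```

The cost of this layout is that each 32-bit word carries fewer than half its bits of payload, which fits the "Bummer" in the post.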

And i argue they really should not try to make a loss at it.

Quote:
 Originally Posted by Christenson By diep's definitions, "Cloud" computing doesn't make sense for computational number theory. The value it offers (high availability, distribution, and convenient administration freeing up organizational bandwidth for other tasks) just don't justify the charges for what is essentially a very compute-intensive amateur operation. And, as I mentioned, NFS@home certainly uses some high-performance supercomputing (Teragrid and 900-node Lomonosov cluster) for a few hours at a time when reducing the sieve results into factors. Now, a commercial organization might use cloud computing because most commercial computing is storage and database-intensive, with relatively minor amounts of computing. Remember, if you aren't playing video games, or editing movies, or god forbid, hunting primes, your 10 year old Win98 system at 400MHz is nearly indistinguishable from your multiple GHz, multigigabyte machine of today. It can save a lot of headaches when desktops and laptops die; all the critical data is in the cloud, so just buy your user a shiny new laptop when it dies after two or so years, whether from hardware failure or windows bloat. No fooling with backups; those are taken care of by the cloud. Me, when I think of the cloud, I think of all the distributed computing projects, which are certainly running on thousands of computers distributed all over the world, none of which is terribly reliable, but, over time, contributing huge amounts of compute cycles. Every one of the seven PCs I run prime95 on has crashed this week, either for windows update, loss of power, or standard windows memory leaks in large applications, but my computations march on, only slightly delayed. And we

Last fiddled with by diep on 2011-04-29 at 04:02

2011-04-29, 04:48   #9
Christenson

Dec 2010
Monticello

5×359 Posts

Quote:
 Originally Posted by CRGreathouse Based on Christenson's post, here's an interesting (?) question. Suppose you had an extremely large number of processors, but not enough time to trial-divide the large number you are given. Can you quickly (1) factor the number (2) check the primality of the number (3) prove the primality of the number? Obviously the three tasks deal with differently-sized numbers.
We are approaching 10^3 CPUs on a chip on GPUs, which are "Sh*t hot" at TF. 10^3 GPUs TF'ing mersenne numbers isn't unreasonable in a few years. We'd take the time to do a TF, but, like it does now, it might fail; we'll simply ASSUME that TF over a reasonable proportion of our compute horses came up with no factor.

For factoring, something like GNFS or SNFS would be the order of the day; that is certainly parallelizable just like it is today by NFS@home. We then need to become concerned with how to reduce the large matrix of relations that come out of it. On that, my hunch is there *is* a breakthrough with power similar to the FFT multiplication in the relatively near future. This is where the communications eat the work alive at present.

I don't know the structure of PrP testing or primality testing on numbers of no special form well enough to comment.

I do know that a single, standard LL test might be pretty difficult to parallelise. I predict we are going to find that, as with Block Lanczos, significant amounts of redundant calculation are cheaper in time than waiting for communications. I think the approach would be to assign to each CPU a group of digits, with overlaps on either side that get filled in from the neighboring CPU. That is, we line up all of the worker CPUs in a virtual row along the digits, and have them simultaneously work on each iteration, with corrections to the overlaps arriving just before they are required, so as not to perturb the central group of digits.
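For reference, the textbook Lucas-Lehmer test is short enough to show why it resists parallelisation. The Python sketch below is the plain definition, not the FFT-based code that Prime95 or CUDALucas use: every iteration needs the complete result of the previous one, so any parallelism has to live inside the single big squaring.

```python
def lucas_lehmer(p):
    """Return True if 2**p - 1 is prime, for an odd prime exponent p."""
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        # Each step depends on the full previous value of s, so the
        # p - 2 iterations cannot be handed out to different machines.
        s = (s * s - 2) % m
    return s == 0

mersenne_prime_exponents = [p for p in (3, 5, 7, 11, 13, 17, 19) if lucas_lehmer(p)]
```

The digit-group scheme described above would amount to splitting `s * s` across workers on every iteration, which is where the communication cost comes in.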

Hmm, I wonder about a billion digit Mersenne number? LL'ing M(50Meg) takes about 40 days on one of my PCs, I think the time scales as the square of the argument, so M(300Meg) should take 36*40 days = 4 years, and 400 years for M(3,000Meg). Wow! we need a breakthrough in CUDALucas!
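The estimate is easy to reproduce. The sketch below takes the poster's own assumptions at face value (40 days for M(50Meg), run time growing as the square of the exponent); the helper name is made up. With FFT multiplication the true growth is a little steeper than quadratic, since there are about p iterations each costing roughly p log p, so these numbers are if anything optimistic.

```python
def scaled_ll_days(base_exponent, base_days, target_exponent):
    """Extrapolate LL run time assuming it grows with the exponent squared."""
    return base_days * (target_exponent / base_exponent) ** 2

days_300m = scaled_ll_days(50_000_000, 40, 300_000_000)
days_3000m = scaled_ll_days(50_000_000, 40, 3_000_000_000)

print(f"M(300Meg): {days_300m / 365.25:.1f} years")
print(f"M(3,000Meg): {days_3000m / 365.25:.0f} years")
```

That gives 1,440 days (just under 4 years) and 144,000 days (roughly 400 years), in line with the figures quoted.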

As noted, necessarily, as the number of processors grows, the delays between CPUs will end up scaling with the physical distance. Even now, we have a hierarchy of storage: CPU registers, L1,L2 cache, main memory, solid-state disk, hard disk, network-attached storage. We will also end up with a hierarchy of communications -- a node only supports so many communications links, and the longer the distance to cover, the fewer the links that can be supported.

 2011-05-09, 18:35 #10 em99010pepe     Sep 2004 2·5·283 Posts Cloud Computing Principles and Paradigms, 2011. may be a good book but encouraging others to violate copyright restrictions is not a good idea. Last fiddled with by xilman on 2011-05-10 at 13:05 Reason: Remove URL
2011-05-10, 00:57   #11
Christenson

Dec 2010
Monticello

703₁₆ Posts

Quote:
 Originally Posted by em99010pepe Cloud Computing Principles and Paradigms, 2011.
Uhh, I didn't see your name on this book. I did see a copyright notice on the front. Can you explain to me how my downloading and reading this would fit within fair use?

Last fiddled with by xilman on 2011-05-10 at 13:05 Reason: URL removed
