mersenneforum.org  

2008-05-24, 23:18   #111
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

3·2,083 Posts

Quote:
Originally Posted by gd_barnes View Post
You missed my point, Anon. My point is that since there should be little manual intervention required for LLRnet, THAT is why it's faster.

The down times in between finishing and starting new manual files always lost more CPU time for me than the slower LLR times for LLRnet.

What piles up is idle CPU time. If we have the servers set up to handle the load, there should be zero idle CPU time for the cores of everyone who is connected.

I remember before when I went on a trip, I would have to make sure I loaded enough manual work on ALL of ~30 cores. It was a lot of work, several hours' worth. Now I just leave on the trip and don't worry about it.


Gary

Oh, I see. Usually when I do manual work, I always try to make sure I have everything covered so that, at maximum, I only have a minute or two of downtime when switching files or whatnot. (Even that can often be avoided by adding one file to the end of another file when doing manual LLR.) However, I forgot how such an approach, while easy for me with only 2 cores, would be a real bear for someone with much more.

What we need is an LLRnet based on LLR 3.7.1 so we can take advantage of the automation and the speed boost at the same time!
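
To put rough numbers on the trade-off discussed above (manual LLR finishes each test faster, LLRnet keeps the cores busy), here is a minimal Python sketch. The function names and the 10% speed advantage are illustrative assumptions, not measurements from this thread; the thread only says that manual LLR is faster per test than the LLRnet client.

```python
def effective_rate(tests_per_hour: float, idle_fraction: float) -> float:
    """Tests finished per wall-clock hour when a core idles for idle_fraction of the time."""
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be between 0 and 1")
    return tests_per_hour * (1.0 - idle_fraction)


def break_even_idle_fraction(speedup: float) -> float:
    """Idle fraction at which a faster manual setup only matches an always-busy slower one.

    speedup is the manual rate divided by the automated rate (1.10 means 10% faster).
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 - 1.0 / speedup


if __name__ == "__main__":
    speedup = 1.10  # hypothetical: manual LLR 10% faster than the LLRnet client
    f = break_even_idle_fraction(speedup)
    print(f"break-even idle time: {f:.1%} (about {f * 24:.1f} hours per day)")
```

Under that assumed 10% figure, manual work stops paying off once a core sits idle for more than about 9% of the time, which is a bit over two hours a day. With ~30 cores to reload by hand, that much downtime is easy to reach.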

2008-05-25, 00:36   #112
gd_barnes

May 2007
Kansas; USA

10339₁₀ Posts

Quote:
Originally Posted by Anonymous View Post
Oh, I see. Usually when I do manual work, I always try to make sure I have everything covered so that, at maximum, I only have a minute or two of downtime when switching files or whatnot. (Even that can often be avoided by adding one file to the end of another file when doing manual LLR.) However, I forgot how such an approach, while easy for me with only 2 cores, would be a real bear for someone with much more.

What we need is an LLRnet based on LLR 3.7.1 so we can take advantage of the automation and the speed boost at the same time!

Agreed totally. And yes, you are correct that the number of cores makes a big difference. When I first started prime-searching and personally had only 6 slower-speed cores, it wasn't a big deal to add a file onto the end of another or the like if I was out of town for a while.

That first weekend when I went out of town after I got 5 of my 6 quads built but didn't have them online yet was a bear. Besides learning a new operating system, I had to make sure each one of a total of ~25 cores had enough manual work for ~10 days, and several of those cores were in the middle of manual work that would finish in 2-3 days. It took me ~3 hours to decide what to do, calculate approximate completion times, load files on, add them to the end of other files, etc. On this last trip, for cores that would finish before I got back, I just stopped them, ran LLRnet on them, and then finished their manual work after I got back. It took me < 1 hour.

I know a remote desktop will help but not only do I not have it set up yet (hope to within a few weeks), I'm not sure I'd have much time to mess with my ranges while I'm out of town.

I can see where the very heavy-hitters, and by that I mean much bigger than me, i.e. 50+ cores, are coming from. They want to run LLRnet for an extended period if necessary and not worry about it much. Otherwise, they'd be spending 4+ hours per day determining what to reserve, loading files, concatenating onto the end of existing files, posting results files, etc.

And finally, what would it take to convert LLRnet to run version 3.7.1 of LLR?


Gary

2008-05-25, 06:38   #113
Brucifer

Dec 2005

139₁₆ Posts

Just a little over twelve hours left to go, huh? Giving it another shot; maybe with a little luck it will make it through the night tonight.

2008-05-25, 07:52   #114
gd_barnes

May 2007
Kansas; USA

7²×211 Posts

Quote:
Originally Posted by Anonymous View Post
Hmm...if we were to do that then we'd have to change your port 400 server to port 300, since that's what Adam's server is on and that would be a mess for lots of different users to have to change their config files for. (Whereas with your server, you could just set up port forwarding so that port 300 points to port 400.)

I think he meant AFTER the rally, once we've cleaned out his port 400 server. Certainly we wouldn't want to change things midstream here.

This sounds like a very good idea to me. All port 300 servers on drive 1. All port 5000 servers on drive 3 if we ever need more than one there.


Gary

2008-05-25, 12:17   #115
Mini-Geek
Account Deleted

"Tim Sorbera"
Aug 2006
San Antonio, TX USA

17×251 Posts

Quote:
Originally Posted by gd_barnes View Post
You missed my point, Anon. My point is that since there should be little manual intervention required for LLRnet, THAT is why it's faster.

The down times in between finishing and starting new manual files always lost more CPU time for me than the slower LLR times for LLRnet.

What piles up is idle CPU time. If we have the servers set up to handle the load, there should be zero idle CPU time for the cores of everyone who is connected.

I remember before when I went on a trip, I would have to make sure I loaded enough manual work on ALL of ~30 cores. It was a lot of work, several hours' worth. Now I just leave on the trip and don't worry about it.


Gary

You could run LLRnet and LLR on all of them, with LLR set to priority 2 (so it makes LLRnet wait until LLR runs out). I'd recommend giving LLRnet a small cache size, maybe 2-3, so that if the manual reservations make it expire, either it won't waste much time or you can cancel it more easily. That way, the cores aren't idle while you're moving over, but you still usually get a speed boost. Of course, if you'd prefer a speed reduction over taking the time to do it (even without idle CPU time), that's your decision. I know I'd rather not spend hours every week or two to load up 30 cores.
Of course, this will all be irrelevant once someone makes LLRnet 3.7.1. I'll probably still use manual reservations, because I don't like the randomness of whether I'll get fed a prime. Of course, it's practically random either way, but I guess it's just that I don't want to be fed a number that was just before or after a prime, and miss a prime, because of small differences in iteration time or CPU usage or network delay.

2008-05-25, 13:23   #116
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

1869₁₆ Posts

Quote:
Originally Posted by gd_barnes View Post
I think he meant AFTER the rally, once we've cleaned out his port 400 server. Certainly we wouldn't want to change things midstream here.

This sounds like a very good idea to me. All port 300 servers on drive 1. All port 5000 servers on drive 3 if we ever need more than one there.


Gary

Ah, I see. Yes, I agree, this would be a good idea.

2008-05-26, 10:32   #117
Flatlander
I quite division it

"Chris"
Feb 2005
England

31·67 Posts

My clients are sleeping on port 300.

2008-05-26, 11:40   #118
glennpat

May 2007
Minnesota USA

72 Posts

The ones that I had a queue of 10 on are sleeping with nothing in the queue. I had two with larger queues that are working off the queue, but there is about 45 seconds of idle time between each WU.

2008-05-26, 15:50   #119
gd_barnes

May 2007
Kansas; USA

7²×211 Posts

Quote:
Originally Posted by Anonymous View Post
Ah, I see. Yes, I agree, this would be a good idea.
Quote:
Originally Posted by Flatlander View Post
My clients are sleeping on port 300.
Quote:
Originally Posted by glennpat View Post
The ones that I had a queue of 10 on are sleeping with nothing in the queue. I had two with larger queues that are working off the queue, but there is about 45 seconds of idle time between each WU.

Blast!! What is causing this? It keeps happening at the worst possible time of day! It's consistently been from 3-5 AM local time (8-10 AM GMT).

Are people still having a problem?

Adam, when you get a chance please check and see if you can come up with a permanent fix.

We may not want to run rallies on port 300 in the future. We can use it for regular crunching but use port 400 (or 5000) for rallies...but making sure that we always use 2 servers for rallies.

Karsten and Anon, after the rally is over, we can talk about how many and what port future servers should be on for the project. We may need 1 regular server for non-rally times and 2 separate servers, all on drive 1, for rallies.


Gary

Last fiddled with by gd_barnes on 2008-05-26 at 15:50
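
For anyone lining up the outage window against server logs, the local-to-GMT conversion quoted in the post above can be checked with a short Python sketch. The UTC-5 offset is an assumption inferred from the post itself (3-5 AM local given as 8-10 AM GMT), and the function name is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offset, inferred from "3-5 AM local time (8-10 AM GMT)".
LOCAL = timezone(timedelta(hours=-5))


def local_window_to_utc(year, month, day, start_hour, end_hour, tz=LOCAL):
    """Convert a same-day local time window to a (start, end) pair in UTC."""
    start = datetime(year, month, day, start_hour, tzinfo=tz)
    end = datetime(year, month, day, end_hour, tzinfo=tz)
    return start.astimezone(timezone.utc), end.astimezone(timezone.utc)


if __name__ == "__main__":
    start, end = local_window_to_utc(2008, 5, 26, 3, 5)
    print(start.strftime("%H:%M"), "to", end.strftime("%H:%M"), "UTC")
```

Since the forum timestamps are in UTC, the converted window can be compared directly against the post times in this thread.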

2008-05-26, 15:54   #120
glennpat

May 2007
Minnesota USA

72 Posts

Mine are still sleeping.

2008-05-26, 15:56   #121
mdettweiler
A Sunny Moo

Aug 2007
USA (GMT-5)

6249₁₀ Posts

Quote:
Originally Posted by gd_barnes View Post
Blast!! What is causing this? It keeps happening at the worst possible time of day! It's consistently been from 3-5 AM local time (8-10 AM GMT).

Are people still having a problem?

Adam, when you get a chance please check and see if you can come up with a permanent fix.

We may not want to run rallies on port 300 in the future. We can use it for regular crunching but use port 400 (or 5000) for rallies...but making sure that we always use 2 servers for rallies.

Karsten and Anon, after the rally is over, we can talk about how many and what port future servers should be on for the project. We may need 1 regular server for non-rally times and 2 separate servers, all on drive 1, for rallies.


Gary

Hmm...I was just trying to go to the http://nplb.rieselprime.org website and it was down. Maybe this time it's a different problem that we're facing?

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation.
A copy of the license is included in the FAQ.