
2016-07-19, 15:27   #1068
Serpentine Vermin Jar

Jul 2014

29·113 Posts

Quote:
 Originally Posted by James Heinrich I'm not sure why so many people are jumping to the conclusion that the IP in question is a VPN IP. In any case, if all the DoS traffic is heading to a single report page, you could perhaps try a soft block: rather than completely blocking the IP address at the server, just insert a couple of lines of code at the top of the report page to prevent running the expensive queries while still giving the spider in question some feedback. Something like Code: if ($_SERVER['REMOTE_ADDR'] == '123.234.345.456') { die('You have been blocked for aggressive spidering. Please email madpoo@primenet to discuss better ways of getting the data you want'); }
Not a bad idea... a friendlier way to communicate to the user that "you're welcome to crawl, but come, let us reason together".

I'm fairly certain it's not a VPN endpoint... the IP has a PTR that indicates it's a common residential DSL dynamic IP.
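The PTR check Madpoo describes is easy to reproduce; a minimal sketch in Python (the marker strings below are illustrative guesses at what residential/dynamic pools tend to embed in their reverse-DNS names, not an authoritative list):

```python
import socket

# Hypothetical markers often seen in residential/dynamic-pool PTR names.
RESIDENTIAL_MARKERS = ("dsl", "dyn", "pool", "cable")

def lookup_ptr(ip):
    """Reverse-resolve an IP to its PTR name, or None if it has none."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None

def ptr_suggests_residential(ptr_name):
    """Rough heuristic: does the PTR name look like a dynamic pool?
    A hint, not proof -- naming conventions vary by ISP."""
    return any(marker in ptr_name.lower() for marker in RESIDENTIAL_MARKERS)
```
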

2016-07-21, 10:50   #1069
retina
Undefined

"The unspeakable one"
Jun 2006
My evil lair

5788₁₀ Posts

Quote:
 Originally Posted by Madpoo I'm fairly certain it's not a VPN endpoint... the IP has a PTR that indicates it's a common residential DSL dynamic IP.
And once the user restarts the DSL router connection then a new IP is generated and we go through this all again. And some other innocent DSL user will soon get the tainted IP.

I'd suggest triggering on something other than the IP address. You mention the user agent having some unusual characteristics, so perhaps that is a better way to filter the problem requests.
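Retina's user-agent idea could be prototyped like this; a sketch in Python for illustration (the live page is PHP, and the marker strings below are hypothetical -- the real ones would come from the server logs):

```python
# Hypothetical user-agent fragments observed in the problem traffic.
SUSPICIOUS_AGENTS = ("ruby", "python-requests", "curl")

BLOCK_MESSAGE = ("You have been blocked for aggressive spidering. "
                 "Please email madpoo@primenet to discuss better ways "
                 "of getting the data you want.")

def soft_block(environ):
    """Return a block message for suspicious user agents, None otherwise.
    Intended to run before the expensive report queries."""
    ua = environ.get("HTTP_USER_AGENT", "").lower()
    if any(marker in ua for marker in SUSPICIOUS_AGENTS):
        return BLOCK_MESSAGE
    return None
```
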

Last fiddled with by retina on 2016-07-21 at 10:52

2016-07-21, 13:12   #1070
0PolarBearsHere

Oct 2015

100001010₂ Posts

If it's as aggressive as it sounds, it could be as simple as putting in blocks for any host that hits more than a certain number of times per minute. Google etc. can be told what rate limits to use through various tools, so they should never trip it.
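The per-host threshold 0PolarBearsHere describes is essentially a sliding-window rate limiter; a minimal sketch in Python (the real site uses IIS's built-in dynamic IP blocking, and the limit/window values here are placeholders):

```python
import time
from collections import defaultdict, deque

class RateLimiter:
    """Allow at most `limit` requests per `window` seconds per client."""

    def __init__(self, limit=60, window=60.0):
        self.limit = limit
        self.window = window
        self.hits = defaultdict(deque)  # client -> timestamps of recent hits

    def allow(self, client, now=None):
        """Record a hit and return True if the client is under the limit."""
        now = time.monotonic() if now is None else now
        q = self.hits[client]
        # Drop hits that have aged out of the window.
        while q and now - q[0] > self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True
```

A well-behaved crawler that honors a stated rate limit never accumulates enough hits in the window to trip it.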
2016-07-21, 15:24   #1071
Serpentine Vermin Jar

Jul 2014

29×113 Posts

Quote:
 Originally Posted by 0PolarBearsHere If it's as aggressive as it sounds it could just be as simple as putting in blocks for any host that hits more than a certain amount of times per minute. Google etc can be told what rate limits to use through various tools, so they should never trip it.
That was my first attempt, but even then I think I set the dynamic IP blocking threshold (it's an IIS feature) too high... their crawl was still impacting the server in bad ways.

Blocking the IP outright was my "hurry up and do something to get the server stable again" effort, since I'm travelling this week and couldn't spend more time on it. So, to put retina's concerns to rest, it will get more attention soon, and the IP will be unblocked when I'm actually able to monitor the situation in real time (if they're still even trying by that point).

2016-07-22, 03:26   #1072
Serpentine Vermin Jar

Jul 2014

6315₈ Posts

Quote:
 Originally Posted by Madpoo Blocking the IP outright was my "hurry up and do something to get the server stable again" effort since I'm travelling this week and couldn't spend more time on it. So, to put retina's concerns to rest, it will get more attention soon and the IP will be unblocked when I'm actually going to be able to monitor the situation in real time (if they're still even trying by this point).
Now that I've had some quiet time to take a closer look...

That user had hit ~600K pages for 3-4 days in a row. I looked at the hits in 5-, 10- and 30-second intervals, and we're talking about rates of 1500+ pages in any given 30-second interval.

When I say it was an aggressive crawl, I'm not kidding.
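The interval measurement above is easy to reproduce from an access log; a sketch in Python, assuming you've already parsed the request timestamps into epoch seconds:

```python
from collections import Counter

def peak_hits(timestamps, window=30):
    """Bucket request timestamps (epoch seconds) into fixed windows of
    `window` seconds and return the busiest window's hit count."""
    if not timestamps:
        return 0
    buckets = Counter(int(t) // window for t in timestamps)
    return max(buckets.values())
```
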

Anyway, I removed the IP block and put in a dynamic block that should prevent that type of thing in the future, and I made sure the settings I used wouldn't have blocked any other access pattern that might come up.

For instance, it wouldn't be too unusual for a user to open a handful of exponent report pages in a row; while the short-term load is higher, it's not a big deal, and those will be fine. But if you're hitting a couple thousand URLs per minute, expect to get errors until the rate is reduced to a normal level.

I'm guessing this person wrote a Ruby script or something to hit the exponent report page for every prime number between X and Y... I haven't taken an exact look, but they started around 332M and just seemed to go up from there, although it skips around a bit. At first they were actually doing bulk reports, not just a single exponent, so it would have been limited to 1000 or whatever. But at some point they just changed to crawling one at a time.

FYI, they did get up to 366M or so before the IP block kicked in...would they really have gone all the way to 1000M ?
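For comparison, a crawler that wanted this data without hammering the server could simply pace itself; a sketch where `fetch` stands in for whatever HTTP call the crawler actually makes (the bulk/XML reports would of course be better still):

```python
import time

def crawl_politely(fetch, exponents, delay=2.0):
    """Fetch report pages one at a time, sleeping `delay` seconds
    between requests so the server load stays negligible."""
    results = []
    for e in exponents:
        results.append(fetch(e))
        time.sleep(delay)
    return results
```
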

2016-07-22, 14:04   #1073
Mark Rose
"/X\(‘-‘)/X\"

Jan 2013

3²×11×29 Posts

Again, I think the pages should have comments on them explaining XML is available, et cetera.
2016-07-25, 14:41   #1074
Serpentine Vermin Jar

Jul 2014

29×113 Posts

Quote:
 Originally Posted by Mark Rose Again, I think the pages should have comments on them explaining XML is available, et cetera.
Yeah... I know. In the back of my mind, I had a feeling I was going to do some additional testing on those to make sure they were all good (I'm pretty sure they are) and then add a link, but somehow I just haven't gotten to it yet. I'm not entirely sure where/how to add that info anyway... some text and/or link on the report page itself to the XML version of the same thing, or an extra link from the menus... yeah, not really sure. I'll have to mull it over.

2016-08-05, 14:54   #1075
thyw

Feb 2016
! North_America

2×5×7 Posts

ECM assignments/reservations problem(?)

Hello, I couldn't find a better thread to ask this question; move it if you want. So, about ECM assignments: I was working on M81239 ECM, B1=1M... but about halfway through, the server unreserved it from me. Somebody finished the current "range/bounds" by completing enough of the required curves on the ECM progress page. It was about 5 GHz-days / 150 curves, but I cannot continue to ECM with those bounds even from my existing savefile; it jumps to a higher-bounds assignment. Is it intended to unreserve exponents from users when the range/curve count is complete? My faults: I was slower than I promised, didn't run this 24h/day, and this was not my only exponent. Also, this happened over a week ago. And it looks like a separate (not connected) P95 client also couldn't continue the exponent from the savefile(s). It could be savefile corruption from copying it afterwards, but the unreserving is still a question.

Last fiddled with by thyw on 2016-08-05 at 15:02
2016-08-05, 15:49   #1076
GP2

Sep 2003

29×89 Posts

Quote:
 Originally Posted by thyw but i cannot continue to ecm on with those bounds even from my existing savefile,
Well, you can set UsePrimenet=0 in your prime.txt and edit your ECM2= line in worktodo.txt to remove the 32-digit assignment ID. Will it let you use the existing savefile then?

Let it run to completion, then manually submit the result and cross your fingers and hope PrimeNet awards you credit. Then remember to turn UsePrimenet=1 back on.
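Removing the 32-digit assignment ID by hand is easy to get wrong; a small sketch that strips it from an ECM2= line (the exact worktodo field layout in the example is assumed from Prime95 convention -- check your own file before editing):

```python
import re

def strip_assignment_id(line):
    """Remove a leading 32-hex-digit assignment ID from an ECM2= worktodo
    entry, leaving the rest of the fields untouched. Lines without an
    ID pass through unchanged."""
    return re.sub(r'^(ECM2=)[0-9A-Fa-f]{32},', r'\1', line)
```
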

Last fiddled with by GP2 on 2016-08-05 at 15:51

2016-08-05, 18:14   #1077
thyw

Feb 2016
! North_America

2×5×7 Posts

Thank you GP2, but I think I screwed up and probably overwrote the savefiles when I tried to reassign. It won't continue, in either the old or the new instance. Never mind, but this would be painful on a bigger job. Backups! I should've created one. Or the ECM assigning/unreserving rule should be disabled/edited.
2016-08-06, 02:11   #1078
LaurV
Romulan Interpreter

Jun 2011
Thailand

11²·73 Posts

You should still be able to report the work already done and get credit for it. Then switch to higher curves. The number of curves in the table is a guideline, not something that "must" be kept; i.e., if I want to do another 50 curves at 1M for an 8M exponent, then I can do them and report them. The server will not cut me off unless I have strange settings. If that happens, you can use N/A instead of the assignment key and the server will not be asked (replace the key with "N/A", without quotes, in your worktodo file; use any text editor you like, and stop and exit P95 from the Tools/Exit menu before editing worktodo). Of course, my chances of finding a factor would be extremely small if 1600 curves were already done at that size, but it is my hardware and my money, and I can do them and report them. I don't believe the server "cut you off"; most probably you did something strange there... It has happened to all of us, and not only once.
