
Downtime today (5/30/07)

Well, after having read this thread I must admit, job well done; I feel like a complete idiot... guess I will go play in traffic on my way to work, lick a few windows on the way, and try to remember some of my nursing skills...

whatever was done and whoever did it, thanks for performing CPR on BP, it appears for the time being we have a normal sinus rhythm! :biggrin:
 
MililaniBuckeye;854329; said:
You can specify which IPs/sites are denied, and thus you can block specific protocols/services to specific sites. Therefore you can indeed block pings to BP...whether or not C-Dog's provider is doing that or not is another matter.

I am aware of that, and believe it or not I am not trying to argue or debate. My statement was simply that I had been able to ping before, and yesterday I wasn't able to, so I asked about the status of the system to see if I could offer up any input. I fix networks for a living, so I thought I would ask.
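For what it's worth, "blocking ping" at a provider usually just means dropping ICMP echo requests at a firewall somewhere upstream, so the box stops answering pings even though HTTP is still up. A minimal sketch of the host-side version, assuming a Linux box running iptables (no idea what C-Dog's provider actually uses):

    # Sketch only -- assumes a Linux host with iptables, not BP's actual setup.
    # Stop answering pings while leaving web traffic alone:
    iptables -A INPUT -p icmp --icmp-type echo-request -j DROP
    # Or deny everything from one specific source range:
    iptables -A INPUT -s 192.0.2.0/24 -j DROP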
 
OCBucksFan;854442; said:
I am aware of that, and believe it or not I am not trying to argue or debate. My statement was simply that I had been able to ping before, and yesterday I wasn't able to, so I asked about the status of the system to see if I could offer up any input. I fix networks for a living, so I thought I would ask.

Cool. I'm a sysadmin myself and maintained web servers for the USAF for five years... I misinterpreted your post to mean that denials can't be set specifically.

If it's a hard drive crapping out, things could get ugly...

EDIT: If the server "crash" and the network glitch happened at the same time, they could be having power spike issues... that would explain possible damage to the HDD on the server and the problems with the routers occurring simultaneously.
 
We've got one (I think) layer of HD redundancy for the drive the web and databases are on, so if one drive dies, we should be okay.

Hmm, actually, how would I know when one dies and we're on the next one in a RAID format?

One of the biggest issues yesterday was that endless numbers of mysqld processes were being spawned, to the point that everything was erroring out with too many windows/tasks/whatever open. Somehow this was solved just by moving the database directories and going to default .conf settings, which is really bizarre stuff. If they didn't start mysql in --safe before the move, it would just hang. When they tried to stop it, even in --safe, it would hang.

The default mysql settings just aren't enough for our levels of traffic. We can get by on them here in the offseason, but I'll have to crank it all back up before the Fall, and there was little confidence that doing so would be an error-free experience.
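For reference, the knobs that matter for the "too many open" symptom and for higher traffic mostly live in my.cnf under [mysqld]. The values below are purely illustrative guesses, not BP's actual config, just to show the sort of thing that would need cranking back up before the Fall:

    # my.cnf -- illustrative values only, not the real BP settings
    [mysqld]
    max_connections   = 200    # cap how many client threads can pile up
    open_files_limit  = 4096   # raise the "too many open files" ceiling
    table_cache       = 512    # renamed table_open_cache in later MySQL versions
    key_buffer_size   = 64M    # MyISAM index cache
    thread_cache_size = 16     # reuse threads instead of spawning new ones
    wait_timeout      = 300    # drop idle connections sooner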
 
Clarity;854657; said:
We've got one (I think) layer of HD redundancy for the drive the web and databases are on, so if one drive dies, we should be okay.

Hmm, actually, how would I know when one dies and we're on the next one in a RAID format?

One of the biggest issues yesterday was that endless numbers of mysqld processes were being spawned, to the point that everything was erroring out with too many windows/tasks/whatever open. Somehow this was solved just by moving the database directories and going to default .conf settings, which is really bizarre stuff. If they didn't start mysql in --safe before the move, it would just hang. When they tried to stop it, even in --safe, it would hang.

The default mysql settings just aren't enough for our levels of traffic. We can get by on them here in the offseason, but I'll have to crank it all back up before the Fall, and there was little confidence that doing so would be an error-free experience.


Well, if one drive dies in a RAID array, people around the server will know; most controllers have an audible alarm, and the others will constantly spam messages into the logfiles.
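If it happens to be Linux software RAID (md) rather than a hardware card, the array state is also easy to check by hand; a hardware controller would have its own utility instead. A rough sketch, with the array name and mail address assumed:

    # Assumes Linux software RAID; /dev/md0 and the address are placeholders.
    cat /proc/mdstat                  # [UU] = both mirrors up, [U_] = one member dead
    mdadm --detail /dev/md0           # per-disk state for the array
    mdadm --monitor --scan --daemonise --mail=admin@example.com
                                      # daemon that mails you when a member drive fails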

We have had something similar occur. We had a server here that managed our listserv, and it just generated log after log after log, and we never paid attention to it. Soon it started crashing, so I went to check it out, and every time I tried to go into the directory, whether by Explorer or the command prompt, it would just hang. We ended up pointing the logs to a new directory and adding a script to delete them, set to run every month. The drive wasn't dead; it turned out to be a corruption in NTFS, so that may be some food for thought. That's if it's a Windows server; if it's a *NIX box, I would just check your system logfiles for anything out of the ordinary.
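On the *NIX side, the monthly cleanup script can be as simple as a cron entry; the directory path here is made up for the example:

    # crontab entry: 03:00 on the 1st of every month, purge logs older than 30 days
    0 3 1 * * find /var/log/listserv -type f -name '*.log' -mtime +30 -delete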
 
One more note: if you are using ATA-RAID/IDE drives, I would do some sector scans over the drives. IDE is horrible for having small failures on one drive that aren't enough for the RAID controller to be notified but will screw up the whole RAID.
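Concretely, that kind of scan is just a SMART self-test or a surface read; the device names below are guesses, so substitute whatever the box actually uses:

    # Device names are placeholders -- check which disks the box really has first.
    smartctl -H /dev/hda       # quick SMART health verdict
    smartctl -t long /dev/hda  # full surface self-test, runs in the background
    badblocks -sv /dev/hda     # read-only scan for unreadable sectors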
 
MililaniBuckeye;854668; said:
C-Dog, what OS and webserver are you running? UNIX and Apache?

Gentoo Linux, Apache2

OCBucksFan;854687; said:
Well, if one drive dies in a RAID array, people around the server will know; most controllers have an audible alarm, and the others will constantly spam messages into the logfiles.

We're in a secure facility across the street from the NOC, as I understand. They only go in there as needed, but that would be enough to hear and report an alarm. Nothing in the logfiles about bad drives. :)

OCBucksFan;854694; said:
One more note: if you are using ATA-RAID/IDE drives, I would do some sector scans over the drives. IDE is horrible for having small failures on one drive that aren't enough for the RAID controller to be notified but will screw up the whole RAID.

I honestly don't remember what type our drives are. I'll have to find the thread where we specced it all out.
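If digging up that thread is a pain, a couple of commands on the box itself will say what the drives are; the device name is assumed here:

    # Device name assumed; adjust to whatever the kernel actually assigned.
    smartctl -i /dev/sda                  # model, serial, and ATA/SATA version
    hdparm -I /dev/sda | head -n 20       # detailed identify info for ATA drives
    cat /proc/scsi/scsi                   # SCSI/SATA devices the kernel sees
    dmesg | grep -i -E 'hd[a-z]|sd[a-z]'  # how the kernel detected the disks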
 
Clarity;854704; said:
We're in a secure facility ... the NOC .... They only go in there as needed.

Are you sure the NOC is secure? A few drops in a guy's coffee cup, and ...

Mission-impossible1.jpg
 