Crash/hardware failure - 9/27

BUCKYLE;1274328; said:
Seriously, C, you don't ever need to apologize to anyone here for anything. Thanks for hookin' us back up!

I do though. The needs of the site, from a technical standpoint, have surpassed my limited ability to manage them, and I have few answers in regard to what to do about that. The crash we suffered today was clearly hardware related, which is what caused this later corruption. The problem is, no one has any idea (least of all me) what piece of hardware may be causing the problem. When Puregig rebooted us, they hooked up a monitor, and it was just a blank/black screen, no feedback whatsoever.

So now I'm just waiting for another crash. Our last was on the 16th, and then another today on the 27th. The good news is reboots have brought the site back online, and now that I know that the crashes can cause these sorts of database problems, I'll be making sure I go through and check (and where necessary repair) all tables after it happens again, which should prevent this sort of extra tomfoolery.
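
For anyone curious, the check-and-repair pass described above could be sketched roughly as follows. This is only an illustration, not what the site actually runs: it assumes a MySQL database (the thread doesn't say which database is in use), and the table names in the usage note are hypothetical. `CHECK TABLE` and `REPAIR TABLE` are real MySQL statements, but `REPAIR TABLE` only applies to certain storage engines such as MyISAM. The helpers just build the statements, so nothing here touches a live server.

```python
# Sketch only: builds MySQL statements as strings, assuming a MySQL backend.
# The function names and any table names passed in are hypothetical.

# Status texts that CHECK TABLE reports for a table that needs no repair.
HEALTHY = {"OK", "TABLE IS ALREADY UP TO DATE"}


def quote_name(table):
    """Backtick-quote a table name so unusual names remain valid SQL."""
    return "`{}`".format(table.replace("`", "``"))


def check_statements(tables):
    """Return one CHECK TABLE statement per table, in sorted order."""
    return ["CHECK TABLE {};".format(quote_name(t)) for t in sorted(tables)]


def plan_repairs(check_results):
    """Return REPAIR TABLE statements for tables whose check was not healthy.

    check_results maps a table name to the final status text from CHECK TABLE.
    """
    statements = []
    for table, status in sorted(check_results.items()):
        if status.strip().upper() not in HEALTHY:
            statements.append("REPAIR TABLE {};".format(quote_name(table)))
    return statements
```

After a reboot, one would run the output of `check_statements(...)` against the database, collect each table's status text, and feed that mapping to `plan_repairs(...)` to see which tables need attention.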

I'm copying our database and file directories to my local machines here right now, just in case something absolutely horrible happens, so there's little chance we could lose everything -- but not knowing when the next shoe will drop is very frustrating and stressful.

My interest has always been the community/people/discussion side of the site, and the technical demands have largely robbed me of that over the last few years. I need to come up with some sort of solution that would afford us better managed care and oversight, meaning, someone with a clue -- but I don't really know where to start. We should also probably have some hardware redundancy. A couple smaller machines perhaps and a load balancer, which might even be cheaper than another big single loaded box -- but that's neither here nor there.

Anyway, I am sorry, because I can't promise we won't crash again, and there's nothing (at present) I can do to prevent that. All I can say is that if and when it happens, I'll call for a reboot asap, and check the databases immediately thereafter.
 
Clarity;1274341; said:
I do though.

From my perspective, it's like JT feeling he needs to apologize to the fans for a poorly played quarter of football. Though we are disappointed, an apology is NEVER in order. You do what you're capable of doing, and so many times, you go above and beyond. The words to thank you for this site don't exist, and you apologizing makes me feel guilty. :biggrin:
 
Clarity;1274341; said:
I do though. The needs of the site, from a technical standpoint, have surpassed my limited ability to manage them, and I have few answers in regard to what to do about that. The crash we suffered today was clearly hardware related, which is what caused this later corruption. The problem is, no one has any idea (least of all me) what piece of hardware may be causing the problem. When Puregig rebooted us, they hooked up a monitor, and it was just a blank/black screen, no feedback whatsoever.

Makes me think it's a memory issue; that or a storage issue would be my two guesses. I would try to get a memory test and a full disk check done and go from there. Also, make sure there are no issues with the system overheating; any server that gets too hot usually has really weird issues.
 
OCBucksFan;1274358; said:
Makes me think it's a memory issue; that or a storage issue would be my two guesses. I would try to get a memory test and a full disk check done and go from there. Also, make sure there are no issues with the system overheating; any server that gets too hot usually has really weird issues.

Can't really do any of that, unfortunately. The machine is colocated in Arizona, and hiring their engineers would be prohibitively expensive (hiring a partner from Latham & Watkins would be cheaper). They recommend I send someone in to check the machine out, which is fine and good -- anyone with a clue happen to live in 'Zona? We do have a guy (a savior, on many levels) who does things for us from time to time (install new RAM, upgrade the OS, etc.), but I don't know how much he can help us on this.

Not trying to paint a doom and gloom picture here; it may just be that we need to go with a managed hosting solution, rather than colocating with a high-end company like Puregig.

The good news is, I've been watching the database, site, and server like a hawk for the last hour, and everything's been as smooth as butter. So for tonight, at least, it seems we're out of the woods.
 
Clarity;1274364; said:
Can't really do any of that, unfortunately. The machine is colocated in Arizona, and hiring their engineers would be prohibitively expensive (hiring a partner from Latham & Watkins would be cheaper). They recommend I send someone in to check the machine out, which is fine and good -- anyone with a clue happen to live in 'Zona?

If not, I'll throw in on the purchase of a plane ticket and accommodations for a weekend trip for anyone with the time and knowledge.
 
I haven't been able to resist checking on it all night, and obviously we're still up and running. I'm guessing we'll be good for a while until the next hardware problem. The software/database, at least, seems solid again after the earlier fixes.
 
Clarity;1274536; said:
I haven't been able to resist checking on it all night, and obviously we're still up and running. I'm guessing we'll be good for a while until the next hardware problem. The software/database, at least, seems solid again after the earlier fixes.

I think you should publicly post your phone numbers here, cell and home, business as well, in case anything should happen, so that we can contact you.









:p
 