A malfunction that shut down all of Toyota Motor's assembly plants in Japan for about a day last week occurred because some servers used to process parts orders became unavailable after maintenance procedures, the company said.
I haven’t read the article because documentation is overhead but I’m guessing the real reason is because the guy who kept saying they needed to add more storage was repeatedly told to calm down and stop overreacting.
I used to do some freelance work years ago and I had a number of customers who operated assembly lines. I specialized in emergency database restoration, and the assembly line folks were my favorite customers. They know how much it costs them for every hour of downtime, and never balked at my rates and minimums.
The majority of the time the outages were due to failure to follow basic maintenance, and log files eating up storage space was a common culprit.
So yes, I wouldn’t be surprised at all if the problem was something local IT had called out, only to be overruled for one reason or another.
Another classic symptom of poorly maintained software. Constant announcements of trivial nonsense, like
[INFO]: Sum(1, 1) - got result 2!
filling up disks.
I don’t know if the systems you’re talking about are like this, but it wouldn’t surprise me!
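For what it's worth, the usual fix for chatty logs eating a disk is a level filter plus size-based rotation. Here is a minimal sketch using Python's standard logging module. The logger name, file path, and size limits are made up for illustration and say nothing about what any of the systems discussed here actually run.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_logger(path, max_bytes=1_000_000, backups=3, name="assembly_line_demo"):
    """Return a logger whose files on disk stay capped.

    Total footprint is at most roughly max_bytes * (backups + 1).
    Call this once per logger name; every call attaches a new handler.
    """
    logger = logging.getLogger(name)
    # WARNING and above only: INFO chatter such as
    # "Sum(1, 1) - got result 2!" never reaches the disk.
    logger.setLevel(logging.WARNING)
    logger.propagate = False

    # Rotate to path.1, path.2, ... and discard the oldest file
    # once `backups` rotated files exist.
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(logging.Formatter("[%(levelname)s]: %(message)s"))
    logger.addHandler(handler)
    return logger
```

With something like this in place, the oldest log file is thrown away automatically instead of the disk filling up. Whether discarding old logs is acceptable is a separate question for whoever owns the system.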
You gotta forward that to Spunk so your logs ain’t filling up the server generating them. Plus you can set up automated alerts for when the result stops being 2.
This message brought to you by Big Splunk.
I think you missed a letter…
I always make sure my logs are covered by Spunk.
spunking my logs is one of my favorite pastimes
Big Splunk
Missed letter? You mean Big Spunk, right?
And yet that’s probably there because sometime, somewhere, it returned 1.9 or 2.00001 or some such nonsense.
1 + 1 = 2.000001 for sufficiently large (but not by much) values of 1.
this is software specifically for assembly line management?
There is specific software for everything
I’m this person in my organization. I sent an email up the chain warning folks we were going to eventually run out of space about 2 years ago.
Guess what just recently happened?
ShockedPikachuFace.gif
You got approval for new SSDs because manglement recognised the threat you identified as critical?
Right?
Literally sent that email this morning. It’s not that we don’t have the space, it’s that I can’t get a maintenance window to migrate the data to the new storage platform.
Can’t you just add a few external USB drives? (heard this more than once at an NGO think tank.)
I mean I’ve worked at a hosting company that had a bunch of static sites running off an SSD connected by usb to the server so this did happen back in the day. I try not to think about those days.
“What’s that? Your accounting front end that’s built in obsolete FrontPage code on an Access database isn’t working again? It’s probably a file lock, I’ll restart IIS.”
Sometimes that person is very silly though. We had a vendor call us saying we needed to clear our logs ASAP!!! due to their size. The log file was, no joke, 20 years old. At the current rate, our disk would be full in another 20 years. We cleared it, but like, calm down dude.
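The back-of-envelope math behind "full in another 20 years" is just free space divided by the log's average growth rate. A rough sketch, assuming the log is the only thing growing on that disk and grows at a steady rate (the function name and the numbers are hypothetical):

```python
def years_until_full(log_bytes, log_age_years, free_bytes):
    """Estimate years until the disk fills, assuming linear log growth."""
    if log_age_years <= 0:
        raise ValueError("log_age_years must be positive")
    growth_per_year = log_bytes / log_age_years
    if growth_per_year <= 0:
        # A log that is not growing never fills the disk.
        return float("inf")
    return free_bytes / growth_per_year


# Made-up numbers: a 50 GB log that took 20 years to accumulate,
# with 50 GB still free on the same disk.
estimate = years_until_full(50 * 10**9, 20, 50 * 10**9)
print(f"{estimate:.1f} years of headroom")
```

On a real machine, `shutil.disk_usage(path).free` gives the free-space figure. The linear-growth assumption is the weak part, since log volume tends to jump when something starts failing.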
Ballast!
Just plonk a large file in the storage, sized to roughly however much is normally used in the span of a work week or so. Then when shit hits the fan, delete the ballast and you’ll suddenly have bought a week to “find” and implement a solution. You’ll be hailed as a hero, rather than be the annoying doomer that just bothers people about technical stuff that’s irrelevant to the here and now.
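A minimal sketch of the ballast trick, for anyone who wants to try it. The function names and chunk size are made up for this example. It writes random bytes rather than creating a sparse file or a file of zeros, on the assumption that some filesystems would otherwise not actually reserve the space.

```python
import os


def create_ballast(path, size_bytes, chunk_bytes=1024 * 1024):
    """Fill `path` with size_bytes of random data so the space is really used."""
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(remaining, chunk_bytes)
            # Random data is incompressible, so compression or
            # sparse-file handling cannot quietly hand the space back.
            f.write(os.urandom(n))
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # make sure the blocks are on disk


def release_ballast(path):
    """Delete the ballast file and return how many bytes were freed."""
    size = os.path.getsize(path)
    os.remove(path)
    return size
```

Size it from a week of observed growth, and treat deleting it as the start of the clock on a real fix, not as the fix.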
Or you could be fired because technically you’re the one that caused the outage.
Damned if you do, damned if you don’t!
The ultimate goal is having no downtime. Ballast gives you that result. The cost of downtime is far larger than the cost of wasting extra space on ballast.
Except then they’ll decide you fixed it, so nothing more needs to be done. I’ve seen this happen more than once.
And was fired for not doing his job, which management prevented him from doing in the first place.