The 64th Gamer

rackmount-official-my-ass:

myconetted:

speaking of twitter’s skeleton crew, here’s an absolutely unhinged article

he’s truly lucky to have bought a company that invested so hard in uptime and reliability that his ridiculous antics only resulted in massive instability for two months instead of taking things offline entirely. i can’t imagine how much work he’s set on fire and tossed out because it was “unnecessary”. they’re only unnecessary because people did all the “unnecessary” engineering work to make it this stable in the first place

that microservice architecture he hates so much is what makes it so robust to outages: instead of the whole site going down, it degrades “gracefully” (that is, features go down, but twitter.com still loads). even if it doesn’t exactly show you tweets or have working notifications, it’s still “online”. now think about how this applies to regional service, i.e. how many micro-outages have you not seen because the downtime only affected certain regions?
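to make “degrades gracefully” concrete, here’s a toy sketch in python. it is not twitter’s actual code: `load_page`, `timeline`, and `notifications` are made-up names, and the only assumption is that each feature is fetched independently, so one broken service can’t take the whole page down with it.

```python
from typing import Callable, Dict


def load_page(features: Dict[str, Callable[[], str]]) -> Dict[str, str]:
    """Fetch each feature on its own; a failing feature degrades, the page still loads."""
    page = {}
    for name, fetch in features.items():
        try:
            page[name] = fetch()
        except Exception:
            # this one feature is down, not the site: show a placeholder instead
            page[name] = "unavailable"
    return page


# hypothetical stand-ins for two separate backend services
def timeline() -> str:
    return "tweets"


def notifications() -> str:
    raise ConnectionError("notification service is down")


page = load_page({"timeline": timeline, "notifications": notifications})
```

in a monolith, that one `ConnectionError` is the kind of thing that can turn into an error page for everybody. here the page still comes back, just with a hole where notifications should be.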

if i were an ex-twitter infrastructure engineer, i’d be so proud of how the platform has held up despite elon’s wild ride. and i’d be so sad to watch him tearing it apart.

data center side of tumblr here

this story. makes me. livid.

i work in data centers. (not twitter.) my job is probably what Alex from Uzbekistan does. i deploy racks and servers, i identify broken machines, i order and swap parts, and i decommission old equipment. when a hard drive fails, it often still has your data on it. it’s my job to carry that “dirty media” from the machine to a hydraulic crusher, where the platters are bent or shattered, for your security.

it cannot be overstated how wrong this all is. red for safety hazards, orange for security hazards, blue for legal and liability hazards, pink for what the fuck did you think would happen you maladapted baboon

- you don’t just let someone in because they look like the CEO…that should have taken at least like an hour to get access filed. also it’s not your building.

- you do not raise the floor tiles without authorization. you don’t know what’s down there (generally just (big) power cables and a lot of forced cold air). it’s a confined space with electrocution and arc flash hazards. (these power cables are probably about a quarter to a silver dollar in diameter; see IEC 60309.) also that’s probably not your space.

- you ABSOLUTELY do not…what was the word? “jimmy open”? an electrical panel with a knife??? your ass is not a Qualified Electrical Worker, and in a colocated data center like this one, that’s not your property. (!!) even if the subfloor was properly secured with cage wall to keep you out of other people’s space, he is now at the point of 1) trying to die and 2) endangering other companies’ equipment (the colo owns the power equipment, and other tenants could be sharing that power equipment with you).

- yeah just unplug some shit and see what happens. fuck you.

- DID I MENTION ENERGIZED CIRCUIT. the colo or other QEW opens the breaker for the circuit before you disconnect the cable. this is how you get arc flashes! you will be fucking vaporized!

- hello? shipment order? did you just walk into the dock and yell at the dock master until he opened the door? security??

- seriously, floor weight ratings are real and not a joke

- racks go in crates so they don’t break their casters or tip over or bang into each other during shipment

- an empty standard rack is heavy. at eight feet, these were probably 48U? even heavier. massive tipping risk. full of servers? fucking bonkers. i was moving a rack with a guy once (we were not trained for that btw) and he wasn’t paying attention and smashed his finger between the rolling metal and a structural I-beam running floor-to-ceiling. the only reason he kept the finger was because his wedding ring took most of the blow. he had a cast for a few weeks. data centers are dangerous, training is life safety critical, and procedures must be followed.

- so yeah we’re just gonna jam all of these full racks into a truck that shouldn’t be parked here, on no notice. probably about 5 tons every 3-4 feet along the length of the trailer. consumer-grade straps. no crate, no pallet, no bracing, just a 500 mile bumpy ride for those casters.

- “nobody told me there was data on the servers boo hoo how was i supposed to know” you’re a fucking moron and you should feel bad. what in god’s green ass did you think was on them. WHAT DO YOU THINK SERVERS ARE

- btw did you idk logically decommission any of the servers or switches or routers? nope. when they get to Portland, they will still think they’re in Sacramento.

- hey so where do you want these put? just wherever? ok. so what do you want them plugged into? just whatever? …ok… that’s not how networks work but alright.

like. it takes a couple months of notice generally when they send me a new rack to deploy. i have to get the colo to remove a cage wall because the door is too short to roll the thing in, coordinate a contractor to brace the rack (drill 3/4” holes and drop 3’ bolts into the ground, basically), load the servers into the rack, terminate copper and pull fiber for the switch and management gear, file a ticket to have the colo energize the circuit, and then the physical turnup is done. that’s on top of my regular maintenance work. after that, i kick off the logical side of the turnup, wherein the switch is onboarded into the network and QA tested, then the same for the servers. a typo can fuck it all up for weeks. someone forgetting a step, weeks.
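a rough way to picture why one forgotten step costs weeks: treat the turnup as an ordered checklist where nothing later can start until everything before it is done. this is a sketch, not any real tooling. the step names are paraphrased from the paragraph above, and the strict one-after-another ordering is a simplifying assumption (in real life some of these can overlap).

```python
from typing import List, Optional, Set

# step names paraphrased from the description above; strict ordering is an assumption
PHYSICAL_STEPS: List[str] = [
    "remove cage wall",
    "brace rack",
    "load servers",
    "terminate copper and pull fiber",
    "energize circuit",
]
LOGICAL_STEPS: List[str] = [
    "onboard switch",
    "QA test switch",
    "onboard servers",
    "QA test servers",
]
TURNUP: List[str] = PHYSICAL_STEPS + LOGICAL_STEPS


def next_step(done: Set[str]) -> Optional[str]:
    """Return the first step not yet done; everything after it is blocked."""
    for step in TURNUP:
        if step not in done:
            return step
    return None  # turnup complete


def blocked_steps(done: Set[str]) -> List[str]:
    """Return every step waiting on the current blocker."""
    blocker = next_step(done)
    if blocker is None:
        return []
    return TURNUP[TURNUP.index(blocker) + 1:]


# someone skipped "load servers" and went ahead with later work anyway
done = {"remove cage wall", "brace rack", "onboard switch"}
blocker = next_step(done)
waiting = blocked_steps(done)
```

note that "onboard switch" being ticked off doesn’t help: it still shows up as waiting, because the rack it depends on isn’t actually finished.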

i will give him a pass on using the term “server centers” because he’s not from the US and idk what they called them where he grew up. never heard that term before though.

if some rich asshole walked into my DC and started pulling this shit, I would have said “I need you to stop what you’re doing because it seems dangerous”—this is exercising my stop work authority—and when he ignored me i would gtfo of that data hall and get security. and i would be calling the building owners to be sure they’re aware of what’s going on. and i would be writing everything down so i have a record for whatever investigations happen next.

and yeah—mad props to the infra eng at twitter for how the service handled a trusted adversary unplugging thousands of machines at once and then popping them up in a different region. that is absolutely insane and a lesser platform would not have survived. i’m impressed.

fuck elmo