
Did you expect it to crash the second people were let go? It’s not like they power Twitter servers with power-generating bicycles.



I'm sure they severely messed up in a bunch of things during the layoffs. And things broke in the process.

But the catastrophic scenario that some people painted, where Twitter was basically a walking-dead timebomb ready to crash irreparably at any second, clearly wasn't true.


I agree those were odd takes. I've likened firing most of the engineers to taking your hands off the wheel in the car. It won't crash immediately, but it doesn't mean the car can go driverless.

With that said, there are differences between internal systems and something like Twitter on the public internet. I assume that Twitter is a system under constant attack. What happens when the next log4shell level vulnerability comes out?


None of the actual engineers involved with Twitter said it would crash and burn immediately. Nor have other SREs. The belief in an imminent, complete collapse was mostly among non-specialists.

But SREs including former Twitter employees have said severe outages will become increasingly likely as the weeks go by, and that in the meantime we’d see an increasing number of unresolved glitches. So far that second part is proving quite true.


I think there are going to be two levels of shock:

1) Never give people an ultimatum. It basically never works in your favor. That was such a dumb thing by Elon and his team. I don't think Elon expected the numbers to be so high.

2) Workers often overestimate their own worth within a company and underestimate how easily they can be replaced. It's likely that all the people leaving while saying Twitter is going to majorly break soon won't be correct, as I doubt the core engineers are the ones posting emojis in the Slack channels.


>removing engineers won't instantly crash the product. It'll happen slowly

It's amazing to me how many people following the Twitter saga, some familiar with or actually working in technology, thought that Twitter would crash within days of the engineers being fired, and now conclude that because it didn't, the job cuts were justified.


I think people who work in reliability see this type of thing as the real existential threat to Twitter. It's unrealistic that a large infrastructure would fall over overnight, but what is very realistic is small problems being neglected until they become big ones, or multiple problems happening at the same time.

This alone is probably manageable; it might even be simple but painful to handle for 2-15 of Twitter's (pre-firing) employees with specialized knowledge. If three people knew the disaster recovery plan and they all got fired because they were so busy maintaining things and fighting fires that they failed to get good reviews by building things, well, I wouldn't be surprised. Likewise, the employees trusted with extreme disaster recovery mechanisms are not the poor souls on H-1Bs who don't have the option of leaving easily, so the people trusted with access might have already jumped ship, since they aren't being coerced into staying on board with a madman.

The real existential threat is another problem compounding on top of this, or a disastrous recovery effort. Auto-remediation systems could do something awful. A master database could fall over and a replica be promoted, but what if that happens twice, or four times? Without Puppet to configure replacement machines appropriately, there could be a very real problem very quickly. Similarly, extremely powerful tools, like a root ssh key, might be taken out, but those keys have no seat belts, and one command typed wrong could be catastrophic. Sometimes bigger disasters are made trying to fix smaller ones.

Puppet can be in the critical path of both recovery (via config change) and capacity.
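To illustrate the compounding-failover scenario above, here is a minimal, hypothetical sketch of an auto-remediation loop that promotes the least-lagged replica, with a circuit breaker so repeated promotions halt and wait for a human instead of cascading. The names and thresholds are made up for illustration, not Twitter's actual tooling:

```python
# Hypothetical auto-remediation sketch (illustrative names, not real tooling).
# Promote a replica when the primary dies, but refuse to keep promoting
# automatically: repeated promotions, with nothing (e.g. Puppet) rebuilding
# replacement machines behind them, is how small incidents compound.

MAX_PROMOTIONS = 2  # beyond this, stop auto-remediating and page a human

def failover(cluster, promotions_so_far):
    """Promote the most caught-up replica, or halt if we've flapped."""
    if promotions_so_far >= MAX_PROMOTIONS:
        return ("halt", None)  # require a human decision
    if not cluster["replicas"]:
        return ("halt", None)  # nothing left to promote
    # pick the replica with the least replication lag
    candidate = min(cluster["replicas"], key=lambda r: r["lag_s"])
    cluster["replicas"].remove(candidate)
    cluster["primary"] = candidate["name"]
    return ("promoted", candidate["name"])

cluster = {
    "primary": "db1",
    "replicas": [{"name": "db2", "lag_s": 3}, {"name": "db3", "lag_s": 0}],
}
print(failover(cluster, promotions_so_far=0))  # ('promoted', 'db3')
print(failover(cluster, promotions_so_far=2))  # ('halt', None)
```

The cap is the whole point: one promotion is routine, but a second or fourth in quick succession usually means something else is wrong, and blind automation can make it worse.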


> I've likened firing most of the engineers to taking your hands off the wheel in the car. It won't crash immediately, but it doesn't mean the car can go driverless.

This is an excellently apt analogy, in light of Twitter's new owner.


There were many different hypotheses on the outcome of Elon firing 80%+ of Twitter employees:

1. The site would crash and burn the next day.

2. Nothing would change. All those engineers were just sitting around anyway.

3. There would be no immediate impact (servers can run by themselves after all), but the site would slowly degrade over time as institutional knowledge around maintenance, upkeep and all the various system quirks was gone.

We are now seeing #3 play out in front of us.


And I think you are vastly underestimating the impact as perceived by the people involved, which was kind of my point.

Both Twitter going down for an hour and a car-assembly plant going down for an hour aren't that big a deal in the grand scheme of things, but for people who are intimately connected (they work there, know someone who does, are emotionally attached to the product in some fashion, etc.), it feels a lot bigger than it is.


It's interesting seeing the predictions come true. I was slightly nervous when Elon fired 90% of Twitter's staff and the site kept working: if stuff never broke, then empirically, firing 90% of your staff seems like a good idea.

But now stuff is breaking each week, and it's evident that maybe it wasn't the best idea.


I’m not surprised to see Twitter’s reliability get progressively worse.

Overall, I think they could have reduced engineering staff in the long term by adopting mainstream open source technologies instead of their own custom (albeit also oss) database/filesystem/streaming system/rpc library/spark stack.

But those migrations would have taken years. Instead, slashing staff that fast is likely to cause slow rot of their ad network, and overall system resiliency.


If you fire people hastily but are then forced to ask them to come back because you discovered they were actually key personnel, that speaks to incompetence. The speed of Twitter's layoffs was completely under Musk's control. There was no crisis or external factor that forced them to cut people so quickly.

Mistakes certainly can happen, but they're much more likely to happen when things are done chaotically, without proper planning.


I've been in companies where a fifth of the engineers were laid off. The state of the company's infra and the product dropped significantly, and it took a long time to even approach recovery, at which point they had hired the same amount of staff back.

I can't imagine how dysfunctional Twitter will be with 75% of staff gone; I don't even know if it will keep running as a business after that.


Elon fired half the staff and Twitter is still running.

Tells you just how much wasted staff there is at most big companies.


The claim that Twitter would collapse immediately never made any sense. Most engineers tend to design software so that they can take holidays. There might be a handful of ops people who need to be available 24/7, but even at a website the size of Twitter those probably number fewer than a few dozen.

I'm amazed at how blind (we) tech people can be.

I don't see how the events up to today don't tell us that Twitter can indeed survive on a reduced workforce.

There may be growing and adapting pains but, overall, the service is working and not degrading.

The company may very well not survive much longer but I don't think it'll have anything to do with having fired all those devs.

The problematic side was the business side, so if laying off a few thousand people extended the runway to figure things out, it made all the sense in the world. And if it ends up crashing and burning regardless, then who cares?


I honestly think the engineering/ops problems are the least of their failures. They bled revenue not because of outages, but because wild policy swings and chaotic management style alienated some of their biggest customers.

If they had frozen features and left the existing policies intact, I suspect we would have a dramatically different narrative about the layoffs. If brief interruptions like this are the worst that happens when you cut engineering to the bone, that's a good argument that Twitter was indeed wildly overstaffed.

Instead, though, we have a company in crisis due to its mismanagement of other areas, so we're primed to view stuff like this through the lens of that broader failure.


I know some of the former Twitter engineers, and they were great engineers. I'm sure they built a fairly resilient system.

But at that scale and complexity, eventually you hit an unforeseen error that requires institutional knowledge to fix, and it's doubtful they still have that, unless the ex-engineers loved the product so much that they're willing to take phone calls from the people who are left and point them in the right direction.

I don't think we landed on a time frame. I'm actually surprised it stayed up this long.


What would be a completely stupid thing for Twitter to do? Something like writing a post about their 12 hours of downtime, ending in a rollback, by

1. starting the post with 5 paragraphs and a list of all the virtues of Twitter's engineering

2. spinning the fact that, after a simple DB screw-up in production, it took them about 6 hours to decide to just roll back from a backup

3. ending the "postmortem" (which is, by the way, completely devoid of any apology) by trying to hire

all the while talking about how working at Twitter is like building a rocket mid-flight (so, just like every other company)?

http://engineering.twitter.com/2010/07/twitter-performance-u...

