
It is probably one of those little process changes to minimize the chance of catastrophic failure. Sure, the risk of the daisy-chained system going poof is low, but not zero. Instead, you should try to rework your plans so you do not need to daisy chain.
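To put a rough number on "low, but not zero": if each link in a daisy chain fails independently, the chain's failure probability compounds with every link. A minimal sketch, with hypothetical per-link numbers that are not from the comment:

```python
# Illustrative: failure probability of a daisy chain of independent links.
# p_link is a hypothetical per-link failure probability, not a measurement.
def chain_failure_probability(p_link: float, n_links: int) -> float:
    """Probability that at least one link in the chain fails."""
    return 1.0 - (1.0 - p_link) ** n_links

# A 1% per-link risk compounds quickly as the chain grows.
for n in (1, 4, 8):
    print(n, round(chain_failure_probability(0.01, n), 4))
```

This is why "rework the plan so you don't need to daisy chain" beats accepting a chain that is individually low-risk per link.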



I don’t know about that specifically, but I’ve seen it done often as a cost-cutting measure in low-risk situations. In that case, the risk is low because 1) it’s not safety-critical and 2) the switching moat is large enough to ensure they don’t lose customers.

Sounds like you are cutting corners on redundancy. Saves $$$, but risky.

That does save a lot of money :-) The trick is managing the rate of change and the risk of disruption. If you manage it to no risk, you end up changing too slowly; if you manage it too close to the edge, you end up with unexpected downtime and other customer-impacting events. Understanding where you are between no risk and certain doom really only comes with experience.

If there is no spec or process, then the risk that the system becomes a mess is higher - but on the bright side it’s a lot easier to fix.

It's risk avoidance taken to the point where the avoidance itself creates new kinds of risk. The idea that you can architect yourself out of failure modes so thoroughly that you no longer need backups is one I see every other week or so, and the number of companies out there that believe that, because they have redundancies, they don't need backups any more is staggering.

If the risk of the change was minimal, why would they not proceed?

How can you plan for things that occur outside your control? CF engineers are people too. Things like this happen, and there will be lessons to take from it (like how to fail over faster).


And they can't afford even a tiny bit of redundancy?

Big companies tend to defer risk. Managers and project leads want to start new projects rather than upgrade existing infrastructure. Combine these forces and sometimes you get a catastrophe.


Your risk increases with more moving parts that you introduce into your product. It does not decrease.

You're probably talking about redundancy, where, I agree, risk does go down.


The only way to optimize for lowest overall risk is to optimize for speed of change.

All the checklists in the world to prevent something from happening are fine and dandy until something happens anyway (which it will). And then they hamstring you from actually fixing it.

Instead, if you can move fast consistently, you can minimize the total downtime.


That's not the greatest risk to your project.

The obvious risk here is that you're combining a rush job with limited oversight and large-scale distribution. Any mistake that would have been caught by the existing process will instead have the potential to roll into a global disaster.

When you're doing a project like this, you want less risk, not more.

Changing the controller design would be a foolhardy addition of risk. Something very familiar is a great choice here.


The gamble is to have small, contained issues you can deal with in a timely manner, vs. full-scale propagated failures you'd have to deal with at the worst time ever.

It's like accidents during fire drills: they happen, yet the drills are worth doing, all things considered.


Thanks for your response. So it's all about designing the system to reduce the risk involved in system interactions?

That’s a business continuity risk. Generally you want to abstract the business logic from infrastructure (code in a separate system or in escrow).

There are also risks inherent in a more complicated system.

You can engineer a more complicated system with the goal of avoiding downtime, but this added complexity may end up with unexpected corner-cases and cause a net decrease in uptime, at least in the short term.

It's often better to concentrate on improving mean time to repair (MTTR).
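The MTTR point can be made concrete with the standard steady-state availability formula, availability = MTBF / (MTBF + MTTR). A sketch with hypothetical numbers, chosen only to show why cutting repair time helps:

```python
# Illustrative: steady-state availability from MTBF and MTTR.
# All numbers below are hypothetical, not from the comment.
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the system is up, assuming steady-state operation."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

# Same failure rate (one failure per ~30 days), but repair in 1 hour vs. 8:
print(availability(720.0, 8.0))  # ~0.989
print(availability(720.0, 1.0))  # ~0.9986
```

Shrinking MTTR improves uptime without adding the corner cases that extra redundancy machinery can introduce.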


And then there are three possibilities, and you failed to discern which applies:

1) thing is not essential, you saved the cost

2) a workaround that is inconvenient or costly exists, compare costs

3) you just caused an opportunity cost

In other words, you're breaking the process. Unless you're a process engineer and track both causes and results diligently, you should not do that. Ever.


I'm not a company myself, so it doesn't really make sense for me, although I do consider my parents' home a backup if I need to leave my house for some reason.

But my company does everything it can to avoid having a single point of failure, however unlikely it is to fail. A low probability still means it can happen, and you don't want to allow that if you can avoid it.
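"Low probability still means it can happen" compounds over time: a failure that is rare on any given day becomes likely over a year of exposure. A Monte Carlo sketch with hypothetical numbers (a made-up 0.1% daily failure chance):

```python
import random

# Illustrative Monte Carlo: how often does a "rare" daily failure (0.1%)
# strike at least once over a year? All numbers are hypothetical.
def saw_failure(p_daily: float, days: int, rng: random.Random) -> bool:
    """True if at least one failure occurred in the simulated period."""
    return any(rng.random() < p_daily for _ in range(days))

rng = random.Random(42)
trials = 10_000
hits = sum(saw_failure(0.001, 365, rng) for _ in range(trials))
print(hits / trials)  # close to the analytic value 1 - 0.999**365 ≈ 0.306
```

Roughly a 30% chance per year: low per-day odds are not a substitute for removing the single point of failure.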


Well, for some projects supply chain risks are better than no supply at all, so there's that.
