Hacker News

My point is that it is easy to tell in the restaurant, and it is not easy to tell in a cloud storage context. We're not disagreeing about the former.

I actually have cloud storage customers at my company. If you asked me what an unreasonable amount of cloud storage consumption was, I would have a really hard time coming up with a number. A petabyte, I guess? We have customers that store kilobytes, and others that store many terabytes. Frankly we'd love a petabyte customer, because we actually charge people for storage, like any sane B2B storage offering. Some things are not meant to be unlimited.




Roughly how many petabyte customers are you talking about?

The parent post is suggesting that 5 million customers does not in any way imply 200 shards. It's much more likely that with 5M customers you have one or two shards; that works out to 4 KB (one document) per customer.
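A quick back-of-envelope check of that arithmetic (the per-shard capacity below is a hypothetical figure, not from the thread):

```python
import math

# 5M customers, one ~4 KB document each (figures from the comment above)
customers = 5_000_000
bytes_per_customer = 4 * 1024

total_gib = customers * bytes_per_customer / 1024**3
print(f"total data: {total_gib:.1f} GiB")        # ~19.1 GiB

# Assuming a single shard comfortably holds ~20 GiB (hypothetical),
# one shard covers all 5M customers -- nowhere near 200.
shard_capacity_gib = 20
shards_needed = math.ceil(total_gib / shard_capacity_gib)
print(f"shards needed: {shards_needed}")
```

At one small document per customer, the whole dataset is tens of gigabytes, so even generous replication doesn't get anywhere near 200 shards.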

It's definitely not zero. If you have a million machines in your data centers, it easily adds up to something quantifiable.

They are not, but actual usage is typically a single-digit percentage of promised space. So power users are served at cost (or even at a loss), while the overwhelming majority of users are actually overpaying for what they use.
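The overselling economics can be sketched with assumed numbers (the plan size, utilization rate, and unit cost below are all hypothetical):

```python
# All figures are illustrative assumptions, not real provider numbers.
promised_gb = 1024          # e.g. a "1 TB" plan
avg_utilization = 0.05      # "single-digit %" actual usage
cost_per_gb_month = 0.02    # hypothetical $/GB-month provisioning cost

avg_stored_gb = promised_gb * avg_utilization
avg_cost = avg_stored_gb * cost_per_gb_month
full_cost = promised_gb * cost_per_gb_month
print(f"average user costs ${avg_cost:.2f}/month; "
      f"a full-quota power user costs ${full_cost:.2f}/month")
```

At these assumed figures, a flat price anywhere above roughly a dollar a month covers the typical user, while the rare full-quota user costs about 20x the average and is subsidized by everyone else.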

It's just a positive assessment of relative size. I could be wrong about that positive statement, of course, but it seems like a very low amount for a cloud service.

I was trying to understand how the business thinking works, not judge or praise it.


I don't think anyone in their right mind interprets that as 800 Mb/s or 1733 Mb/s to 500 users simultaneously. It's pretty clear what it means.

I agree; I'm really scratching my head here too, trying to understand what data volume has to do with unique users.

"All 14 other cloud providers combined have 1/5th the aggregate capacity of AWS (estimate by Gartner)", yet the slides say "5X the cloud capacity in use than the aggregate total of the other 14 providers". The "in use" part is very important to include in the sentence as it makes a difference to understand capacity vs just having a lot of customers.

Yeah I get that, but I think there are two sides to this.

One is that it is possible to guess ballpark figures without knowing the intricacies, and I think anyone with experience at a big tech company of this ilk can probably make a reasonable estimate (to within 50%?).

The other is that, from the inside of these situations, it's easy to justify each individual role and miss the fact that the system is too complex and that a simplification might drop significant numbers. I'm not suggesting it is too complex; I fully realise there's a different sort of scale involved when you operate a large, globally available service like this. But duplication still happens, engineers still make things more complex than they should, etc.


> AirBnB runs thousands of compute instances and handles hundreds of terabytes of data with only five IT staff.

I'm not sure how they're measuring IT staff, but having worked at Airbnb I can't think of any sane definition of "IT staff" that would lead to this count.


I'm not talking about growth though, I'm talking about marginal cost per service rendered. To serve twice as many people in a restaurant you need roughly twice the servers. You don't need twice the amount of programmers (and other workers) to serve Netflix to twice as many people.

4.3PB/day.

A rare insight into the scale of cloud services. Amazon, Microsoft, and Google are generally extremely tight-lipped about the scale they operate at.


> The company now operates 15 data centers that handle 55 million unique visitors every month, each of which watches an average of 106 minutes of streaming video a day.

That can't be right, can it? 55M viewers x ~100 minutes each ... the Super Bowl audience is around 100M, so that's half a Super Bowl's worth of viewers every day.
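The comparison can be checked directly (the viewer figures are from the quoted article and the comment; the Super Bowl number is the commenter's rough estimate):

```python
daily_viewers = 55_000_000       # unique monthly visitors, per the quoted article
minutes_each = 106               # average daily watch time, per the article
superbowl_viewers = 100_000_000  # commenter's rough Super Bowl audience

viewer_minutes = daily_viewers * minutes_each
print(f"{viewer_minutes / 1e9:.2f} billion viewer-minutes per day")

# The "half a Super Bowl" claim compares audience sizes, not minutes:
print(f"{daily_viewers / superbowl_viewers:.2f} Super Bowl audiences daily")
```

So the 0.55 ratio holds for audience size; by total viewer-minutes the daily volume is far larger than a single broadcast event.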


First of all, no, it isn't. At a large enough scale, any metric that goes up with usage is a perfectly fine proxy for many conversations. Every day for a decade we had the same 3% of our user base online at peak doing the same transactions as yesterday. Over longer periods, the number crept up and the transaction mix changed, but "number of simultaneous users" got me accurate, if not precise, capacity planning from 27 to 1.5 million users.
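The proxy-metric approach described above amounts to a simple linear model. A minimal sketch, where only the 3% peak-concurrency figure comes from the comment and the per-user and per-server rates are assumptions:

```python
# Linear capacity model driven by "number of simultaneous users".
# Only the 3% concurrency figure is from the comment; the rest is assumed.
user_base = 1_500_000
peak_fraction = 0.03            # same ~3% of users online at peak
tps_per_online_user = 2.0       # assumed transactions/sec per concurrent user
tps_per_server = 500.0          # assumed per-server throughput

peak_users = user_base * peak_fraction
peak_tps = peak_users * tps_per_online_user
servers_needed = int(-(-peak_tps // tps_per_server))   # ceiling division
print(f"peak users: {peak_users:.0f}, peak TPS: {peak_tps:.0f}, "
      f"servers: {servers_needed}")
```

As the comment says, this is accurate rather than precise: the coefficients drift slowly as the transaction mix changes, so the model gets refit over time rather than derived from first principles.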

Stress-testing, shared-nothing and dollar-scalable are platonic ideals, and they're not always achievable. If Dropbox had three infrastructure engineers, they probably weren't able to build proper capacity planning models, and probably couldn't afford to build a full production work-alike for stress testing anyway. (And at some scales, that's literally impossible. Our vendors couldn't physically manufacture enough servers to build a full test environment, cost aside.) I'm sure they did some simulated tests as well, but those won't tell you the whole story.

You're focused on IOPS, but you have no idea if that's what Dropbox's bottlenecks were. (Not to mention: What does IOPS mean on an EBS and S3 infrastructure?) Complex systems fall over in complex ways. You can predict the next bottleneck, but not the one after that; by the time you get there, your fix for the first bottleneck will have changed the dynamics.

It sounds like they did do stress testing, using real-world loads, on a system that was 100% identical to their production system: they ran continuous, just-in-time stress tests in the Big Lab.


> The cost is strongly related to the size of the organization

There is a correlation between the number of GB you store and, e.g., how many DPOs you require?


Actually, this matters a lot in many enterprises. Bean counters hate excess capacity, so there are never enough spares and everything is always almost full.

Maybe SV is different...


It isn't even true there. This is one of the common misconceptions that show up in blogs. There's no such thing as a minimum metric for an A - SaaS or otherwise.

> It can handle more than 500,000 orders per month when hosted on a $6/month server

Is this a real-world claim? What business handles 500,000 orders a month from a $6 server? I mean, if you sell that much, you'd probably want a high-availability setup, and those don't sell for 6 bucks.


Maybe, maybe not. But I think I agree with you: $10M/month is a comically large number (I wonder how many companies spend even 1/100th of that per month on cloud compute?), and I have no reason to believe this person, so why would I?
