Building Twitter/Mastodon *not at scale* isn't that hard and certainly doesn't take 200 person-years. Building it *at scale* is a completely different story. Remember the fail-whale? That was years of Twitter struggling to scale their product.
That said, as we described in the post, our implementation of Mastodon is less code than Mastodon's official implementation. So not only is Rama orders of magnitude more efficient for building applications at scale, it's also much faster for building first versions of an application.
I hate to be that person, but I've seen complicated dynamic applications push much higher bandwidth and serve millions of concurrent users with similar, if not smaller, hardware requirements.
It would be interesting to see Twitter's complete backend, and while Mastodon might not be an apples-to-apples comparison, I'd also be interested in a cost-per-user infrastructure analysis.
Twitter is a really simple application to scale. The fact that it took Twitter so long to get it somewhat right is remarkable, but I'm not sure I'd be so keen to look at them as an example of how to design scalable architectures.
It scaled for a long time for Twitter. With few exceptions, we're not going to work at anything close to the scale of Twitter at the time, so that's plenty scalable for most use cases. Now, if you want to be more compute-efficient, that's a different question...
Check out twitter-scale-mastodon, a from-scratch implementation of Mastodon's backend that runs at Twitter scale. It's more than 40% less code than Mastodon's backend and 100x less code than what Twitter wrote to build the equivalent.
I'd be surprised if anyone thought building Twitter with similar scalability would be easy. Sure, while you could fit it into one MySQL DB it would be easy to get the basic features down with a much simpler UI and no API. Past that, though, they would be vastly underestimating the work.
My understanding is that the particular scaling problem of Twitter is high fanout through subscriptions. Receiving 7k messages per second and storing them in a database is actually fairly straightforward.
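To make the fanout point concrete, here is a back-of-the-envelope sketch. The 7k/s ingest rate is the figure above; the average follower count is an assumed number purely for illustration.

```python
# Back-of-the-envelope fanout arithmetic. The ingest rate comes from the comment
# above; the average follower count is an assumption for illustration only.
incoming_tweets_per_sec = 7_000
avg_followers_per_author = 200        # assumed; the real distribution is heavily skewed

# Storing the raw tweets: one write per incoming message.
raw_writes_per_sec = incoming_tweets_per_sec

# Delivering each tweet to every follower's timeline multiplies the write load.
timeline_writes_per_sec = incoming_tweets_per_sec * avg_followers_per_author

print(f"raw writes/s:      {raw_writes_per_sec:,}")       # 7,000: easy for one database
print(f"timeline writes/s: {timeline_writes_per_sec:,}")  # 1,400,000: the hard part
```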
The title pretty much covers it. Twitter is more like an IM network or email list service than like a typical database-backed webapp; a relational database is just the wrong platform for large-scale messaging.
I don't get why Twitter doesn't scale. It's just webmail, but with smaller messages and a simpler UI. Here's how Twitter should work: every user should have a list of users following them. When they tweet, each follower gets a copy of that message in their personal inbox. A copy is also attached to the tweeter's account, so new followers can suck that copy in when they start following them.
That's it. Now, sending a message takes O(n) (n = followers) time, which is really cheap. On my machine, it takes about a second to create and sync 40,000 files (there's not much data, so replicating this via NFS wouldn't be that expensive either). With that out of the way, all you have to do is ls your "twitter directory" to see all of your friends' messages. This is another incredibly cheap operation. It's easy to distribute, and there's no locking.
Anyway, just look at the mail handling systems at huge universities and corporations. They scale fine, and they're much more complicated than Twitter. Twitter is just a subset of e-mail, so it should be implemented that way, not as a "SELECT * FROM tweets WHERE user IN (list, of, followers) ORDER BY date". That is the wrong approach because it makes reads (very common) expensive and writes (very uncommon) cheap. That's why Twitter doesn't scale.
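A minimal sketch of the "inbox per user" design described above, using in-memory dicts as a stand-in for the mail-spool-style storage. All names here (post_tweet, read_timeline, etc.) are invented for illustration and are not from any real Twitter or Mastodon code.

```python
from collections import defaultdict, deque

# In-memory stand-ins for the per-user mailboxes from the email analogy above.
INBOX_SIZE = 800                                           # keep only a recent window
inboxes = defaultdict(lambda: deque(maxlen=INBOX_SIZE))    # user_id -> recent tweets
followers = defaultdict(set)                               # author_id -> follower ids
authored = defaultdict(list)                               # author's own copy, for backfill

def follow(follower_id, author_id):
    """New follower: record the edge and backfill from the author's own copy."""
    followers[author_id].add(follower_id)
    for text in authored[author_id]:
        inboxes[follower_id].append((author_id, text))

def post_tweet(author_id, text):
    """Write path: O(number of followers) copies, one per follower inbox."""
    authored[author_id].append(text)
    inboxes[author_id].append((author_id, text))
    for follower_id in followers[author_id]:
        inboxes[follower_id].append((author_id, text))

def read_timeline(user_id):
    """Read path: a single cheap lookup, with no joins and no fan-in query."""
    return list(inboxes[user_id])

# Contrast with the SQL approach quoted above:
#   SELECT * FROM tweets WHERE user IN (list, of, followers) ORDER BY date
# which does the fan-in work on every read; here it is done once, at write time.
follow("bob", "alice")
post_tweet("alice", "hello")
assert read_timeline("bob") == [("alice", "hello")]
```

Since reads vastly outnumber writes, paying O(followers) at write time in exchange for constant-cost reads is usually the right trade, with the usual caveat (not handled in this sketch) that accounts with millions of followers need a different strategy.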
'... Six Apart didn't hire Brad -- they acquired LiveJournal from him and named him "Chief Architect" or something ...'
Same result.
'... their developer does not seem to be amazingly qualified to do what he's doing ...'
The thing that strikes me is that the system is not layered enough. The APIs the app developers should be calling would shield them from having to deal with these types of problems. nostrodemons [0] summarized Flickr's approach to optimisation. [1] So is it the lack of a scaling infrastructure where Twitter is failing?
'... how many are dynamic and how many are cachable/static ...'
One thing I notice with Twitter is the update interval on the system: every 2 minutes.
For most users, 5-10 minutes would probably be ample. I often wonder why they don't say "right, you want real-time (RT)? Well, here's the monthly subscription".
As for the dynamic and cacheable split, the main hits appear to be reads of the public timeline RSS. [2] RT generation allows little or no caching since the RSS would be built on the fly. Couple that with Rails' inability to talk to multiple DBs [3] and you get bottlenecks. Makes you wonder why they don't switch certain layers to Perl?
[2] Google Groups, Twitter Development Talk, Alex Payne: "we don't guarantee that you'll be able to collect contiguous sets of data from the public timeline API method. It's our most-requested method, so right now it's optimized for performance, not archival"
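As a rough illustration of that caching trade-off, here is a sketch of serving the public-timeline RSS from a short-lived cache. The function names and the 5-minute TTL are assumptions for illustration, not anything Twitter actually did.

```python
import time

CACHE_TTL_SECONDS = 300            # the ~5-minute freshness suggested above
_cache = {"body": None, "at": 0.0}

def build_public_timeline_rss():
    # Stand-in for the expensive on-the-fly generation (DB reads + RSS templating).
    return "<rss>...recent public tweets...</rss>"

def get_public_timeline_rss():
    """Serve a cached copy unless it has gone stale.

    Promising strict real-time delivery forces the TTL toward zero and pushes
    every request back onto the database, which is the "no or little caching"
    problem described above.
    """
    now = time.time()
    if _cache["body"] is None or now - _cache["at"] > CACHE_TTL_SECONDS:
        _cache["body"] = build_public_timeline_rss()
        _cache["at"] = now
    return _cache["body"]
```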
The bottleneck is Twitter's API limits; data-wise and HTTP-connection-wise I have plenty of headroom.
Simple HTTP connections to parse/spider the follower records from public pages are a no-go, since Twitter then blocks the IP, and scaling this out will eventually not end well.