Ask HN: How technically hard was it for Twitter to move to 280 chars?
31 points by jabo | karma 3727 | avg karma 5.62 | 2017-09-27 16:02:46+00:00 | 42 comments

This is mostly for folks who work at Twitter or who have heard some behind-the-scenes stories about what it took to do this change.

Given that Twitter has long allowed only 140 characters, I'm wondering whether the 140-character assumption was hard-baked into any parts of their architecture that they had to spend time refactoring, or whether it was as simple as removing a constraint on the front-most part of their stack. Did it, for example, require any changes to the storage layer?

Just curious.




Hopefully it was.

#define MAX_TWEET_CHARS 140

to

#define MAX_TWEET_CHARS 280

and a re-compile.


no adjustment to css? you savage!

> no adjustment to css? you savage!

You mean your CSS rules aren't auto-generated from the C preprocessor?


C Preprocessor define, mustache in your CSS. Boom!!! Done.

Again, hopefully they had their code base in order and it was as simple as changing a static setting with a single-line code change.


What if they had to re-fab their load balancing ASICs? That's a year waiting for the shuttle run to come through!

Hey now, we're a scala shop :)

is this better?

object MAX_TWEET_CHARS { val value = 140 }

to

object MAX_TWEET_CHARS { val value = 280 }

and a re-compile.


That's the comment I was looking to find on HN, haha.

I'm certain some obscure optimizations were done from top to bottom for 140 chars. 280 chars will also require a different mindset and optimizations.

The technical “changing-140-to-280” is only one aspect of the work. Other very important aspects include QA (e.g., does increasing the limit break the layout when 280 characters are used?) and data science (e.g., does having longer tweets actually keep people on Twitter longer and increase engagement?).

Both of those are reasons why companies do staged rollouts of features, in case the company makes a mistake.
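
A staged rollout like the one described above is often done by hashing each user into a stable bucket and comparing it against a percentage that is raised over time. A minimal sketch, assuming a hypothetical feature name "tweet_280" and hypothetical helper names; nothing here is known to match how Twitter actually did it:

```python
import hashlib


def rollout_bucket(user_id: str, feature: str = "tweet_280") -> int:
    """Map a user to a stable bucket in the range [0, 100).

    The feature name is mixed into the hash so that different features
    do not always pick the same early users. Both names are hypothetical.
    """
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def max_tweet_chars(user_id: str, rollout_percent: int) -> int:
    """Return the limit this user sees at the current rollout percentage."""
    return 280 if rollout_bucket(user_id) < rollout_percent else 140


# At 0% nobody gets the new limit; at 100% everybody does.
print(max_tweet_chars("alice", 0))
print(max_tweet_chars("alice", 100))
```

Because the bucket depends only on the user and the feature name, raising the percentage only ever adds users; nobody flips back to 140 as the rollout widens.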


Agreed, but also storage requirements I would imagine. Those extra 140 characters add up quick in an environment like Twitter.

It's estimated at 500 million tweets a day. Say UTF-8 characters take 1 byte each. That's an extra 140 bytes needed per tweet (at the minimum; I'm not positive how it's stored in the back end, or whether they do n-grams or whatever else to a tweet). So that's 70,000,000,000 bytes. That's... 70 GB, right? At 70 GB per day, that's an extra 25.55 TB of space needed per year. Not a large amount. Hell, I have a surveillance server with more storage than that. But then running analytics against that extra data, ALL THE DAMN TIME. Plus, if going to 280 does, in theory, increase usage, then that number would be higher. I'm assuming UTF-8 at 1 byte per character. I think I'm wrong. Might be UTF-16, which then increases it to... 2 bytes a character or more? I should really open a Google tab, but I'm too lazy. I already grabbed a calculator. My exercise for the day is met :P

An extra 140 characters SEEMS small, but at Twitter scale, that's not a light matter. Shit like that can run away on you fast.

edit: Jesus Christ, yes, I didn't include every single bloody thing. This is just a simple 5-minute example of how Twitter scale can have a huge impact on "small" decisions. There are outliers in the average statistics. Not everyone will use the full character limit, frequency changes, etc. Nitpickers, man...
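
For anyone who wants to redo the back-of-the-envelope numbers above, here is a small sketch. It assumes the 500 million tweets per day estimate from the comment, and the worst case where every tweet uses all 140 extra characters; `bytes_per_char` is a knob added here for illustration, not a known storage detail:

```python
TWEETS_PER_DAY = 500_000_000  # estimate quoted in the comment above
EXTRA_CHARS = 140             # worst case: every tweet uses all the new room


def extra_bytes(days: int = 1, bytes_per_char: int = 1) -> int:
    """Upper bound on the extra text storage over the given number of days."""
    return TWEETS_PER_DAY * EXTRA_CHARS * bytes_per_char * days


per_day = extra_bytes()
per_year = extra_bytes(days=365)
print(per_day / 1e9, "GB per day")
print(per_year / 1e12, "TB per year")

# Same bound if every character needed 4 bytes instead of 1.
print(extra_bytes(days=365, bytes_per_char=4) / 1e12, "TB per year at 4 bytes/char")
```

With 1 byte per character this gives 70 GB per day and, over 365 days, about 25.55 TB per year. It ignores metadata, indexes, replication, and the fact that most tweets will not use the full limit.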


You forgot to take images and videos into account.

Only a small percentage of tweets actually use all 140 characters. I would not assume that because the limit is doubled, 100% of new tweets will be 280. Probably less than 5% of tweets will go beyond 140.

It would be interesting to see the flip side: how many multi-tweet posts will now get condensed into a single tweet or fewer tweets, thus saving metadata overhead.

That's a good point.

And how many photographs/screengrabs of pieces of text are now posted as plain text.

There is also increased network traffic and more bytes to keep in the caches (RAM?). Maybe effectively doubling the amount of data you store (in the worst case) has terrible effects on some of your queries and increases latency somewhere in the system, and you have to find where...

There could be "big" ramifications, it's a pretty interesting thing to think about :)


UTF-8 characters take 1 to 4 bytes, depending on the character.

Well, there will also be a reduction when all these (1/2) double tweets go away. I'm pretty sure the metadata, analytics, events, etc. that 2 tweets generate are way more than the cost of storing those extra bytes per tweet.
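
A quick way to see the 1 to 4 byte range is to encode a few characters and count the bytes. A small sketch; the sample characters are arbitrary picks for illustration:

```python
def utf8_len(ch: str) -> int:
    """Number of bytes a string occupies once encoded as UTF-8."""
    return len(ch.encode("utf-8"))


samples = {
    "a": "ASCII letter",
    "\u00e9": "Latin small e with acute",
    "\u65e5": "CJK ideograph",
    "\U0001F600": "emoji outside the Basic Multilingual Plane",
}

for ch, description in samples.items():
    print(f"U+{ord(ch):04X} {description}: {utf8_len(ch)} byte(s)")
```

So a tweet of ASCII text costs about one byte per character, while the same number of CJK characters or emoji costs three or four times that.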

I saw a chart somewhere recently comparing the tweet length distributions between Japanese and other languages. It basically shows there is less of a surge near the 140-char limit in JP than elsewhere, which may explain why Twitter is still popular in JP while it's struggling in other countries: JP people are less frustrated by the char limit and thus experience a better UX.

I always wondered whether Asian countries actually like Twitter or whether the character limit was debilitating.

Cool to know, thanks.


I think the chart you mention is from Twitter's blog:

https://blog.twitter.com/official/en_us/topics/product/2017/...


Terribly interested in this as well, and commenting to follow the conversation.

Code refactoring is the process of restructuring existing computer code—changing the factoring—without changing its external behavior.

https://en.wikipedia.org/wiki/Code_refactoring


The fact that the maximum tweet length doubled means that Twitter's external behavior did change.

I think that's the parent's point: it's not "refactoring" if the external behavior changed. Though I generally agree with the down-voting of the comment as needlessly pedantic, given the common usage of the term.

The common incorrect usage of the term is just one of my pet peeves. I couldn't help myself.

Sometimes, you may have to refactor existing code to be able to add new features.

I'm going to say far less technically involved than fixing their harassment problem.

Much more tractable problem.

Twitter's original limit was 140 bytes, but it's been 140 normalized Unicode code points for years. The physical message can be much longer than 140 bytes.
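
To illustrate the difference between counting bytes and counting normalized code points, here is a sketch that assumes NFC normalization. The function name is made up for this example, and Twitter's real counting rules may differ in details:

```python
import unicodedata


def tweet_length(text: str) -> int:
    """Count code points after NFC normalization (an assumed counting rule)."""
    return len(unicodedata.normalize("NFC", text))


# "e" followed by a combining acute accent: two code points before NFC, one after.
decomposed = "cafe\u0301"
print(len(decomposed), tweet_length(decomposed))

# Three code points, but nine bytes on the wire in UTF-8.
japanese = "\u65e5\u672c\u8a9e"
print(tweet_length(japanese), len(japanese.encode("utf-8")))
```

Under this rule a 140 code point tweet can occupy anywhere from 140 to 560 bytes in UTF-8, which is why a byte-sized storage field cannot simply be set to the character limit.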

change the db field from varchar(140) to varchar(280)

I suspect they would use unicode - nvarchar(280).

It'll be interesting to see what apps and sites that use twitter feeds start crapping out when they get all those extra characters.

Here's a blog article regarding the changes required for search to support the tweet length changes: https://blog.twitter.com/engineering/en_us/topics/infrastruc...

The interesting question is why SMS only had 160 characters. I know those 160 characters are 7-bit and that an 8-bit SMS can contain 140 characters, but why does the control channel limit it to that?

I worked on a WAP browser a long time ago, which would support SMS in PDU mode as a bearer. WDP packets would just be encapsulated in an SMS. I am too old ... :(
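
The 160 and 140 figures in the question above are the same payload counted in different character sizes: a 140-octet payload holds 1120 bits. A small sketch of that arithmetic; the function name is just for illustration:

```python
PAYLOAD_OCTETS = 140  # user-data size of a single SMS, in 8-bit bytes


def max_chars(bits_per_char: int) -> int:
    """How many fixed-width characters fit in one SMS payload."""
    return PAYLOAD_OCTETS * 8 // bits_per_char


print(max_chars(7))   # 7-bit default alphabet, septets packed back to back
print(max_chars(8))   # 8-bit data
print(max_chars(16))  # 16-bit UCS-2 text
```

This gives 160, 140, and 70 characters respectively, which is also where Twitter's original 140 came from: an SMS minus room for a username.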


Because the technology that came before cell phones was pagers.

Good point, but why not 139 bytes or 141 bytes?

Many years ago I worked on an SMSC. SMS are sent over the same data channels as phone signalling (call setup etc.) using the SS7 protocol stack. The SS7 protocol physical layers used fixed-size data packets of around 256 bytes, for example on T1 data channels. On top of that you need routing info such as addressing in MTP and TCAP to get the packet to the correct mobile switching center, then GSM or IS-41 MAP to route to the device. Finally, the SMS bearer encoding needs some extra bytes for things like delivery reports and the source phone number. This left 140 bytes free for the message, which is 160 characters at 7 bits each.

In other words, SMS messages could be delivered across the communication channel that the devices were already using in order to remain in contact with towers, essentially allowing providers to charge (SMS used to be an add-on) for something that cost carriers basically nothing in transmission costs.

Is that a correct translation?


Yup, exactly.
