Sunday, June 14, 2009

Twitpocalypse

Twitter has hit an interesting limit in their software. They used the four-byte, signed integer representation for their message ids. That means 31 bits to count messages. (Four bytes is 32 bits but you need to save one to use for the plus/minus sign). How much can you count with 31 bits?

Well, you can count to 2^31 - 1 = 2,147,483,647. Computer users who have been around for a few years will recognize this as the two-billion, i.e., two-gigabyte, limit. On old UNIX systems, file sizes were limited to 2 GB. On many systems, memory was limited the same way. It's even the source of the year-2038 limit on the UNIX time stamp. That's when it will have been a little over two billion seconds (2^31, to be exact) since 1 Jan 1970, the UNIX epoch.
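If you want to check the arithmetic yourself, here is a small Python sketch. It computes the largest signed 32-bit value and the last moment a 32-bit count of seconds since the epoch can represent. The names INT32_MAX, EPOCH, and last_moment are just mine for illustration.

```python
from datetime import datetime, timedelta, timezone

# Largest value a signed 32-bit integer can hold: 31 bits of magnitude.
INT32_MAX = 2**31 - 1

# The UNIX epoch: midnight UTC on 1 Jan 1970.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# The last moment a signed 32-bit count of seconds can represent.
last_moment = EPOCH + timedelta(seconds=INT32_MAX)

print(INT32_MAX)                # 2147483647
print(last_moment.isoformat())  # 2038-01-19T03:14:07+00:00
```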

The prediction was that many Twitter clients wouldn't handle the roll-over. (This is like your odometer hitting its limit and rolling over to zero again. That makes it a little harder to figure out, for example, how far you've driven if you started before the rollover.) The prediction seems to have come true.
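Here is a rough sketch of what a roll-over can look like, assuming a client keeps message ids in a signed 32-bit slot with the usual two's-complement wraparound. I don't know how any particular Twitter client stores its ids, so next_id is a hypothetical helper, not anybody's real code. Note that a signed integer doesn't roll over to zero like an odometer; it jumps to the most negative value.

```python
import ctypes

INT32_MAX = 2**31 - 1

def next_id(current: int) -> int:
    """Hypothetical helper: increment an id kept in a signed 32-bit slot."""
    # ctypes integer types do no overflow checking, so the value wraps.
    return ctypes.c_int32(current + 1).value

last_good = INT32_MAX
wrapped = next_id(last_good)

print(wrapped)              # -2147483648
print(wrapped > last_good)  # False: the "newer" message now sorts as older
```

A client that assumes a bigger id means a newer message would get confused right about here.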

Since I pretty much just read Twitter from their web site and don't really use a client, I haven't experienced the problem directly.

Other articles: