
Scaling Twitter: Making Twitter 10000 Percent Faster | High S...

Bookmark History

Saved by 37 people (-7 private), first by anonymouse user on 2007-09-14


Public Sticky notes

Update 6: Some interesting changes from Twitter's Evan Weaver: everything is in RAM now and the database is a backup; peaks at 300 tweets/second; every tweet is followed by an average of 126 people; vector cache of tweet IDs; row cache; fragment cache; page cache; the caches are kept separate; GC makes Ruby resistant to optimization, so they went with Scala; Thrift and HTTP are used internally; hundreds of internal requests for every external request; rewrote the MQ but kept the interface the same; 3 queues are used to load balance requests; extensive A/B testing for backwards compatibility; switched to a C memcached client for speed; optimize the critical path; it is faster to get cached results from network memory than to recompute them locally.

Highlighted by joel
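One plausible reading of the "vector cache of tweet IDs" plus "row cache" pairing in the update above is that a timeline is cached as a list of IDs and each tweet body is cached once under its own ID. The sketch below illustrates that reading only; the function name, the dict-based caches, and the `load_row` callback are hypothetical and do not come from Twitter's code.

```python
def read_timeline(user_id, vector_cache, row_cache, load_row):
    """Assemble a timeline from an ID vector and a per-tweet row cache.

    vector_cache: dict mapping user_id -> list of tweet IDs (assumed layout)
    row_cache:    dict mapping tweet_id -> tweet row (assumed layout)
    load_row:     fallback called only when a row is missing from row_cache
    """
    tweets = []
    for tweet_id in vector_cache.get(user_id, []):
        row = row_cache.get(tweet_id)
        if row is None:
            # Row-cache miss: fetch once, then keep it for every other timeline.
            row = load_row(tweet_id)
            row_cache[tweet_id] = row
        tweets.append(row)
    return tweets
```

Keeping the two caches separate means a tweet that appears in many timelines is stored once, and each timeline entry costs only an ID.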

For us, it’s really about scaling horizontally - to that end, Rails and Ruby haven’t been stumbling blocks, compared to any other language or framework. The performance boosts associated with a “faster” language would give us a 10-20% improvement, but thanks to architectural changes that Ruby and Rails happily accommodated, Twitter is 10000% faster than it was in January.

Highlighted by vincent

Highlighted by fulvius

600 requests per second.

Highlighted by inouemak

Average 200-300 connections per second. Spiking to 800 connections per second.

Highlighted by inouemak

MySQL handled 2,400 requests per second.

Highlighted by inouemak

180

Highlighted by inouemak

  • Use caching with memcached a lot.
    - For example, if getting a count is slow, you can memoize the count into memcache in a millisecond.
    - Getting your friends status is complicated. There are security and other issues. So rather than doing a query, a friend's status is updated in cache instead. It never touches the database. This gives a predictable response time frame (upper bound 20 msecs).
    - ActiveRecord objects are huge, which is why they aren't cached. Instead, they want to store critical attributes in a hash and lazy load the other attributes on access.
    Highlighted by wade
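The two caching ideas in the list above (memoizing a slow count, and keeping only critical attributes in a hash while lazy loading the rest) can be sketched roughly as follows. This is a minimal illustration, not Twitter's code: `TinyCache` is a hypothetical dict-backed stand-in for a memcached client, and `memoized_count` and `LazyUser` are made-up names.

```python
import time


class TinyCache:
    """Dict-backed stand-in for a memcached client (hypothetical, get/set only)."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        value, expires_at = self._store.get(key, (None, 0.0))
        if value is not None and time.monotonic() >= expires_at:
            del self._store[key]  # entry expired, treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds=60.0):
        self._store[key] = (value, time.monotonic() + ttl_seconds)


def memoized_count(cache, key, slow_count):
    """Return the cached count, calling slow_count() only on a cache miss."""
    cached = cache.get(key)
    if cached is not None:
        return cached
    value = slow_count()
    cache.set(key, value)
    return value


class LazyUser:
    """Holds a few critical attributes in a plain dict; loads the rest on demand."""

    def __init__(self, critical, load_rest):
        self._critical = dict(critical)
        self._load_rest = load_rest  # callable returning the remaining attributes
        self._rest = None

    def get(self, name):
        if name in self._critical:
            return self._critical[name]
        if self._rest is None:
            self._rest = self._load_rest()  # one load, on first non-critical access
        return self._rest[name]
```

The small dict is cheap to serialize into a cache, while the expensive load runs only for requests that actually touch the less common attributes.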

    - Send message to invalidate friend's cache in the background instead of doing all individually, synchronously.

    Highlighted by wade
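As a rough sketch of what background invalidation can look like, the request path below only enqueues messages and a worker thread does the cache deletes. Everything here is an assumption for illustration: the `timeline:<id>` key scheme, the function names, and the use of an in-process `queue.Queue` in place of a real message queue.

```python
import queue
import threading


def start_invalidator(cache, jobs):
    """Start a daemon thread that deletes cache entries named by queued user IDs."""

    def run():
        while True:
            user_id = jobs.get()
            try:
                if user_id is None:  # sentinel: stop the worker
                    break
                cache.pop(f"timeline:{user_id}", None)  # hypothetical key scheme
            finally:
                jobs.task_done()

    worker = threading.Thread(target=run, daemon=True)
    worker.start()
    return worker


def post_update(author_id, friend_ids, jobs):
    """Request path: enqueue one invalidation per friend and return immediately."""
    for friend_id in friend_ids.get(author_id, []):
        jobs.put(friend_id)
```

The request that posts the update returns as soon as the messages are queued, so its latency no longer grows with the number of friends.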

    Moved to Starling, a distributed queue written in Ruby.
    - Distributed queues were made to survive system crashes by writing them to disk. Other big websites take this simple approach as well.

    Highlighted by wade
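The idea of a queue that survives crashes by writing to disk can be illustrated with an append-only journal that is replayed on startup. This is a toy sketch under stated assumptions, not Starling's implementation: the class name, the JSON line format, and the single-process design are all hypothetical.

```python
import json
import os


class JournaledQueue:
    """FIFO queue that logs every put/get to a file and replays it on startup."""

    def __init__(self, path):
        self._path = path
        self._items = []
        if os.path.exists(path):
            self._replay()
        self._log = open(path, "a", encoding="utf-8")

    def _replay(self):
        with open(self._path, encoding="utf-8") as journal:
            for line in journal:
                try:
                    op, payload = json.loads(line)
                except json.JSONDecodeError:
                    break  # torn final record from a crash: stop replaying here
                if op == "put":
                    self._items.append(payload)
                elif op == "get" and self._items:
                    self._items.pop(0)

    def _append(self, record):
        self._log.write(json.dumps(record) + "\n")
        self._log.flush()
        os.fsync(self._log.fileno())  # force the record to disk before continuing

    def put(self, item):
        self._append(["put", item])
        self._items.append(item)

    def get(self):
        if not self._items:
            return None
        self._append(["get", None])
        return self._items.pop(0)

    def close(self):
        self._log.close()
```

Calling `fsync` on every operation is the simplest durable choice and also the slowest; a real queue would likely batch writes and compact the journal.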

    Abuse
    - A lot of downtime because people crawl the site and add everyone as friends. 9000 friends in 24 hours. It would take down the site.

    Highlighted by wade

    - Be ruthless. Delete them as users.

    Highlighted by wade

    Abuse
    - A lot of downtime because people crawl the site and add everyone as friends. 9000 friends in 24 hours. It would take down the site.
    - Build tools to detect these problems so you can pinpoint when and where they are happening.

    Highlighted by fulvius

    Abuse
    - A lot of downtime because people crawl the site and add everyone as friends. 9000 friends in 24 hours. It would take down the site.
    - Build tools to detect these problems so you can pinpoint when and where they are happening.
    - Be ruthless. Delete them as users.

    Highlighted by fulvius

    Build in user limits. People will try to bust your system. Put in reasonable limits and detection mechanisms to protect your system from being killed.

    Highlighted by wade
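A per-user limit of the kind described above could be sketched as a sliding-window counter. The class name and the idea of passing the current time in explicitly are assumptions for illustration, and any concrete limit values are hypothetical rather than Twitter's actual thresholds.

```python
from collections import defaultdict, deque


class ActionLimiter:
    """Allow at most max_actions per user within a sliding time window."""

    def __init__(self, max_actions, window_seconds):
        self.max_actions = max_actions
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # user_id -> timestamps of allowed actions

    def allow(self, user_id, now):
        events = self._events[user_id]
        # Drop timestamps that have aged out of the window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        if len(events) >= self.max_actions:
            return False  # over the limit: reject, and a caller could flag the account
        events.append(now)
        return True
```

A rejected call is also a natural place to hook in the detection tooling mentioned earlier, since repeated rejections for one account are a strong signal of crawling.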

    For example, they store all of a user's friend IDs together, which prevented a lot of costly joins.

    Highlighted by wade
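To see why storing a user's friend IDs together avoids joins, compare scanning a row-per-friendship table with a single keyed lookup. The data and function names below are hypothetical; the source does not say whether Twitter kept these lists in the database or in cache.

```python
# One row per friendship, as a join table would hold it: (user_id, friend_id).
friendship_rows = [(1, 2), (1, 3), (2, 3), (3, 1)]


def friends_by_scan(user_id, rows):
    """Normalized layout: every lookup filters the whole friendship table."""
    return [friend for (user, friend) in rows if user == user_id]


def build_friend_index(rows):
    """Denormalized layout: one entry per user holding all friend IDs together."""
    index = {}
    for user, friend in rows:
        index.setdefault(user, []).append(friend)
    return index


def friends_by_key(user_id, index):
    """A single keyed read returns the full friend list, with no join or scan."""
    return index.get(user_id, [])


friend_index = build_friend_index(friendship_rows)
```

The trade-off is that every friendship change must update the stored list, which is the usual cost of denormalizing for read speed.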

    Talk to the community. Don't hide and try to solve all problems yourself. Many brilliant people are willing to help if you ask.

    Highlighted by fulvius

    Cache the hell out of everything. Individual active records are not cached, yet. The queries are fast enough for now.

    Highlighted by fulvius

    Turn your website into an open service by creating an API. Their API is a huge reason for Twitter's success. It allows users to create an ever-expanding ecosystem around Twitter that is difficult to compete with. You can never do all the work your users can do, and you probably won't be as creative. So open your application up and make it easy for others to integrate your application with theirs.

    Highlighted by fulvius