Hueniverse: Scaling a Microblogging Service - Part I
Bookmark History
Saved by 9 people (0 private), first by an anonymous user on 2008-05-14
- Rickyrobinson on 2008-06-18 - Tags webdev, web2.0, programming
- J_thompson on 2008-06-16 - Tags no_tag
- Nycrican2 on 2008-06-09 - Tags web2.0links
- Marcuhlig on 2008-06-01 - Tags twitter, development, infrastructure, programming, scaling, microblogging, scalability, performance, architecture
- Amygdala on 2008-05-30 - Tags twitter, scalability, tbr
Public Sticky notes
Highlighted by rickyrobinson
Highlighted by j_thompson
The retrieval system is where things are not as simple. Unlike webmail services where refreshing a user’s inbox only queries a very simple data set (is there anything new in MY inbox?), refreshing a user’s home page on Twitter queries a much more complex data set (are there any new updates in ALL my friends’ pages?) and the nature of the service means that the ratio of reads to writes is significantly different from most other web services.
It is these constant queries that bring Twitter down during popular events. Because a single request for a user’s home page, or an API request for a user’s friends timeline, must perform a complex iterative query, some requests take too long: at best they time out, and at worst they cause the entire system to lag behind. These custom views are expensive, and they make it much more difficult to cache the final result of a request.
Going through a timeline request, the server first looks up the list of people the user is following. Then, for each person, it checks whether their timeline is public or private and, if private, whether the requesting user has the rights to view it. If the user has rights, the last few messages are retrieved, and the same is repeated for each person being followed. When done, all the messages are collected and sorted, and the latest messages are converted from their internal representation to the requested format (JSON, XML, RSS, or Atom).
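The timeline request described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Twitter's actual code: the `User` class, its fields, and the functions `can_view`, `friends_timeline`, and `render` are hypothetical names invented for this example, everything is held in memory, and only JSON output is shown.

```python
import json
from dataclasses import dataclass, field


@dataclass
class User:
    # Hypothetical in-memory model; a real service would query a database.
    name: str
    protected: bool = False                        # True if the timeline is private
    following: set = field(default_factory=set)    # names this user follows
    approved: set = field(default_factory=set)     # names allowed to read a private timeline
    messages: list = field(default_factory=list)   # (timestamp, text) pairs, oldest first


def can_view(viewer, author):
    """Public timelines are visible to all; private ones only to approved viewers."""
    return not author.protected or viewer.name in author.approved


def friends_timeline(users, viewer_name, per_user=20, limit=20):
    """Build the viewer's timeline by visiting every followed user in turn."""
    viewer = users[viewer_name]
    collected = []
    for name in viewer.following:                  # one iteration per followed user
        author = users[name]
        if not can_view(viewer, author):           # permission check per author
            continue
        for ts, text in author.messages[-per_user:]:   # last few messages only
            collected.append({"user": name, "time": ts, "text": text})
    collected.sort(key=lambda m: m["time"], reverse=True)  # newest first
    return collected[:limit]


def render(messages, fmt="json"):
    """Convert the internal representation to the requested format (JSON only here)."""
    if fmt != "json":
        raise ValueError("only JSON is implemented in this sketch")
    return json.dumps(messages)
```

The work grows with the number of people being followed, and the result depends on who is asking (because of the permission check), which is consistent with the excerpt's point that these per-user views are expensive and hard to cache.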
Highlighted by smoody
Highlighted by bankwatch