Designing a News Feed System

The news feed is the constantly updating list of stories in the middle of your home page. It includes status updates, photos, videos, links, app activity, and likes from people, pages, and groups that you follow on Facebook.

Clarifying Questions

  • Is this a mobile app? Or a web app? Or both?
  • What are the important features?
  • Is the news feed sorted by reverse chronological order or any particular order such as topic scores?
  • How many friends can a user have?
  • What is the traffic volume?
  • Can the feed contain images, videos, or just text?

Answers:

  • Both, mobile app and web app
  • A user can publish a post and see her friends’ posts on the news feed page
  • The feed is sorted by reverse chronological order
  • A user can have up to 5,000 friends
  • 10 million DAU
  • It can contain media files, including both images and videos

High Level Design

Newsfeed APIs

Feed Publishing API

POST /v1/me/feed

Params:

  • content: content is the text of the post
  • auth_token: it is used to authenticate API requests

Newsfeed retrieval API

GET /v1/me/feed

Params:

  • auth_token: it is used to authenticate API requests
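
As a rough illustration of the two endpoints, the sketch below models them as plain Python functions over in-memory dictionaries. Only the paths and the content and auth_token parameters come from the design above; the function names, the token table, the response shape, and the storage layout are assumptions made for this example, and fanout to friends is left out.

```python
from typing import Dict, List

# Hypothetical in-memory stand-ins for the auth system and the feed store.
TOKENS: Dict[str, str] = {"token-alice": "alice"}  # auth_token -> user_id
FEEDS: Dict[str, List[str]] = {}                   # user_id -> post texts, oldest first


def authenticate(auth_token: str) -> str:
    """Resolve auth_token to a user ID, or raise if the token is unknown."""
    try:
        return TOKENS[auth_token]
    except KeyError:
        raise PermissionError("invalid auth_token") from None


def post_feed(content: str, auth_token: str) -> dict:
    """Models POST /v1/me/feed: store the text of a new post.

    Only the author's own feed is updated here; delivering the post
    to friends is outside the scope of this sketch.
    """
    user_id = authenticate(auth_token)
    FEEDS.setdefault(user_id, []).append(content)
    return {"status": 201, "user_id": user_id, "content": content}


def get_feed(auth_token: str) -> dict:
    """Models GET /v1/me/feed: return feed items, newest first."""
    user_id = authenticate(auth_token)
    return {"status": 200, "items": list(reversed(FEEDS.get(user_id, [])))}
```

Both functions authenticate before touching any data, which mirrors the fact that auth_token is the one parameter the two endpoints share.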

Our design is divided into two flows:

  • Feed Publishing
  • Newsfeed Building

Feed Publishing

  1. User: A user can view the news feed on a browser or mobile device.
  2. The user makes a post with the content "Hello" through the API:
    • /v1/me/feed?content=Hello&auth_token={auth_token}
  3. Load balancer: Distributes traffic to web servers.
  4. Web servers: Redirect traffic to different internal services.
  5. Post service: Persists the post in the database and cache.
  6. Fanout service: Pushes new content to friends' news feeds. News feed data is stored in the cache for fast retrieval.
  7. Notification service: Informs friends that new content is available and sends out push notifications.

Newsfeed Building

  1. User: A user sends a request to retrieve her news feed.
    • /v1/me/feed
  2. Load balancer: The load balancer redirects traffic to web servers.
  3. Web servers: Web servers route requests to the news feed service.
  4. News feed service: The news feed service fetches the news feed from the cache.
  5. News feed cache: Stores the news feed IDs needed to render the news feed.

Design Deep Dive

Feed Publishing

Web Servers: Besides communicating with clients, web servers enforce authentication and rate limiting. Users can make only a certain number of posts within a time window, which helps prevent spam and abusive content.
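
The rate limit on posts could be enforced in several ways; the design above does not name an algorithm. The sketch below uses a fixed-window counter as one simple option. The class name, the limit, the window length, and the injectable clock are all assumptions made for illustration.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Allow at most `limit` posts per user in each window of `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        # user_id -> (index of the window last seen, posts counted in that window)
        self._counts: Dict[str, Tuple[int, int]] = defaultdict(lambda: (-1, 0))

    def allow(self, user_id: str) -> bool:
        """Return True and count the post if the user is under the limit."""
        window = int(self.clock() // self.window_seconds)
        seen_window, count = self._counts[user_id]
        if seen_window != window:
            count = 0  # a new window started, so the counter resets
        if count >= self.limit:
            self._counts[user_id] = (window, count)
            return False
        self._counts[user_id] = (window, count + 1)
        return True
```

A fixed window is easy to reason about but lets a user burst at a window boundary; a sliding window or token bucket would smooth that out at the cost of more state.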

Fanout service: Fanout is the process of delivering a post to all friends. There are two models:

  • Fanout on write (push model):
    • The newsfeed is precomputed during write time
    • A new post is delivered to friends’ cache immediately after it is published
    • Pros:
      – The news feed is generated in real time and can be pushed to friends immediately
      – Fetching the news feed is fast because it is pre-computed at write time
    • Cons:
      – Hotkey problem: if a user has many friends, fetching the friend list and generating news feeds for all of them is slow and time-consuming
      – For inactive users, pre-computing news feeds wastes computing resources
  • Fanout on read (pull model):
    • The news feed is generated at read time
    • This is an on-demand model
    • Recent posts are pulled when a user loads her home page
    • Pros:
      – For inactive users, the pull model works better because it does not waste computing resources on them
      – Data is not pushed to friends, so there is no hotkey problem
    • Cons:
      – Fetching the news feed is slow because it is not pre-computed
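
To make the trade-off between the two models concrete, here is a minimal sketch that implements both over the same in-memory data. The friend graph, the timestamps, and all function names are hypothetical; a real system would use caches and databases rather than dictionaries.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical sample data: user -> set of friends (symmetric friendships).
FRIENDS: Dict[str, Set[str]] = {
    "alice": {"bob", "carol"},
    "bob": {"alice"},
    "carol": {"alice"},
}

# author -> list of (timestamp, post_id)
posts_by_author: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
# reader -> list of (timestamp, post_id), filled in at write time
precomputed_feed: Dict[str, List[Tuple[int, str]]] = defaultdict(list)


def publish(author: str, post_id: str, timestamp: int) -> None:
    """Store the post, then fan out on write to every friend's feed."""
    posts_by_author[author].append((timestamp, post_id))
    for friend in FRIENDS[author]:  # write cost grows with the friend count
        precomputed_feed[friend].append((timestamp, post_id))


def read_feed_push(user: str) -> List[str]:
    """Push model read: the feed already exists, so only order and return it."""
    return [pid for _, pid in sorted(precomputed_feed[user], reverse=True)]


def read_feed_pull(user: str) -> List[str]:
    """Pull model read: gather every friend's posts on demand, then order them."""
    gathered = [p for friend in FRIENDS[user] for p in posts_by_author[friend]]
    return [pid for _, pid in sorted(gathered, reverse=True)]
```

Both read functions return the same reverse-chronological feed. The difference is where the per-friend loop runs: inside publish for the push model, inside the read for the pull model.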

We adopt a hybrid approach:

  • Push Model for the majority of the users
  • Pull Model for celebrities or those with many friends and followers
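
One way to picture the hybrid approach is a publish path that skips fanout for high-follower authors and a read path that merges the precomputed feed with those authors' posts. The sketch below assumes a follower-count cutoff to decide who counts as a celebrity; the threshold value, the sample graph, and the function names are illustrative assumptions, not part of the design above.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

CELEBRITY_THRESHOLD = 3  # assumed cutoff, kept tiny so the example stays small

# Hypothetical graph: author -> set of followers.
FOLLOWERS: Dict[str, Set[str]] = {
    "star": {"ann", "ben", "cat", "dan"},
    "ann": {"ben"},
}

posts_by_author: Dict[str, List[Tuple[int, str]]] = defaultdict(list)
feed_cache: Dict[str, List[Tuple[int, str]]] = defaultdict(list)


def is_celebrity(author: str) -> bool:
    return len(FOLLOWERS.get(author, ())) > CELEBRITY_THRESHOLD


def publish(author: str, post_id: str, timestamp: int) -> None:
    """Push to followers' feeds unless the author is a celebrity."""
    posts_by_author[author].append((timestamp, post_id))
    if is_celebrity(author):
        return  # followers pull these posts at read time instead
    for follower in FOLLOWERS.get(author, ()):
        feed_cache[follower].append((timestamp, post_id))


def read_feed(user: str) -> List[str]:
    """Merge the pushed feed with posts pulled from followed celebrities."""
    items = list(feed_cache[user])
    for author, followers in FOLLOWERS.items():
        if user in followers and is_celebrity(author):
            items.extend(posts_by_author[author])
    return [pid for _, pid in sorted(items, reverse=True)]
```

With this split, a celebrity's post costs one write no matter how many followers she has, while ordinary users still get the fast precomputed read path.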

The fanout service works as follows:

  1. Fetch friend IDs from the graph database
  2. Get friends info from the user cache
  3. Send friends list and new post ID to the message queue
  4. Fanout workers fetch data from the message queue and store news feed data in the news feed cache as a mapping:
    • <post_id, user_id>
  5. Store <post_id, user_id> in the news feed cache.
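
The five steps above can be sketched with Python's standard queue module standing in for the message queue. Everything named here is hypothetical: the dictionaries play the roles of the graph database and user cache, the user cache lookup is modeled as a simple filter that drops friends with no cached entry, and user_id in the stored pair is interpreted as the friend whose feed receives the post.

```python
import queue
from typing import Dict, List, Set, Tuple

# Hypothetical stand-ins for the graph database and the user cache.
GRAPH_DB: Dict[str, Set[str]] = {"alice": {"bob", "carol", "dave"}}
USER_CACHE: Dict[str, dict] = {"bob": {"name": "Bob"}, "carol": {"name": "Carol"}}

fanout_queue: "queue.Queue" = queue.Queue()      # carries (post_id, friend_ids)
news_feed_cache: List[Tuple[str, str]] = []      # <post_id, user_id> mapping table


def enqueue_fanout(author_id: str, post_id: str) -> None:
    """Steps 1 to 3: look up friends, check the user cache, enqueue the job."""
    friend_ids = sorted(GRAPH_DB.get(author_id, set()))         # 1. fetch friend IDs
    known = [f for f in friend_ids if f in USER_CACHE]          # 2. get friends info (assumed filter)
    fanout_queue.put((post_id, known))                          # 3. send to the message queue


def run_fanout_worker() -> int:
    """Steps 4 and 5: drain the queue and store <post_id, user_id> pairs.

    Returns the number of pairs written.
    """
    written = 0
    while True:
        try:
            post_id, friend_ids = fanout_queue.get_nowait()
        except queue.Empty:
            return written
        for friend_id in friend_ids:
            news_feed_cache.append((post_id, friend_id))
            written += 1
```

The queue decouples publishing from delivery: the publish request returns once the job is enqueued, and workers can be scaled independently to absorb fanout load.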

Newsfeed Retrieval

  1. A user sends a request to retrieve her news feed.
    • /v1/me/feed
  2. The load balancer distributes requests to web servers
  3. Web servers call the news feed service to fetch the news feed
  4. The news feed service gets a list of post IDs from the news feed cache
  5. The news feed service fetches the complete user and post objects from caches (user cache and post cache) to construct the fully hydrated news feed.
  6. The fully hydrated news feed is returned in JSON format back to the client for rendering.
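
Steps 4 to 6 amount to turning a list of post IDs into full objects. The following is a minimal sketch of that hydration, with dictionaries standing in for the three caches. The field names, the sample data, and the choice to skip posts missing from the post cache are assumptions for illustration.

```python
import json
from typing import Dict, List

# Hypothetical cache contents.
NEWS_FEED_CACHE: Dict[str, List[str]] = {"alice": ["p2", "p1", "p0"]}  # newest first
POST_CACHE: Dict[str, dict] = {
    "p1": {"author_id": "bob", "text": "Hello"},
    "p2": {"author_id": "carol", "text": "Hi"},
}
USER_CACHE: Dict[str, dict] = {"bob": {"name": "Bob"}, "carol": {"name": "Carol"}}


def get_hydrated_feed(user_id: str) -> str:
    """Build the fully hydrated feed for user_id and return it as JSON."""
    feed = []
    for post_id in NEWS_FEED_CACHE.get(user_id, []):    # list of post IDs
        post = POST_CACHE.get(post_id)                  # post object
        if post is None:
            continue  # cache miss: skipped here; a real system might fall back to the database
        author = USER_CACHE.get(post["author_id"], {})  # user object
        feed.append({
            "post_id": post_id,
            "text": post["text"],
            "author": {"id": post["author_id"], "name": author.get("name")},
        })
    return json.dumps(feed)
```

Storing only IDs in the news feed cache keeps it small; the heavier post and user objects live in their own caches and are joined in at read time.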

Cache Architecture

News Feed: It stores the post IDs that make up each news feed

Content:

  • It stores the data of every post
  • Popular content is stored in hot cache

Social Graph: It stores user relationship data

Action: It stores info about whether a user:

  • Liked a post
  • Replied to a post
  • Took other actions on a post

Counters: It stores counters for likes, replies, followers, following, etc.
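
As a loose illustration of how the five cache tiers relate, the sketch below groups them in one dataclass and shows a like touching both the Action and Counters tiers. The key layouts and the like method are assumptions; the design above only names the tiers and what each one holds.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class FeedCaches:
    """Hypothetical in-memory model of the five cache tiers."""
    # News Feed: user_id -> post IDs
    news_feed: Dict[str, List[str]] = field(default_factory=lambda: defaultdict(list))
    # Content: post_id -> post data
    content: Dict[str, dict] = field(default_factory=dict)
    # Social Graph: user_id -> related user IDs
    social_graph: Dict[str, Set[str]] = field(default_factory=lambda: defaultdict(set))
    # Action: (user_id, action, post_id) records
    action: Set[Tuple[str, str, str]] = field(default_factory=set)
    # Counters: (post_id, kind) -> count
    counters: Dict[Tuple[str, str], int] = field(default_factory=lambda: defaultdict(int))

    def like(self, user_id: str, post_id: str) -> bool:
        """Record a like once per user and bump the post's like counter."""
        key = (user_id, "like", post_id)
        if key in self.action:
            return False  # already liked, so the counter stays unchanged
        self.action.add(key)
        self.counters[(post_id, "like")] += 1
        return True
```

Keeping the counter separate from the action records means a feed can show a like count without scanning every individual like.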