How Tumblr Recommends Content

Tumblr is a digital community—part microblogging, part social network—where people come together around shared interests to celebrate ideas and art through genuine conversations. The experience on Tumblr leans on our key values of community, privacy, authenticity, freedom of expression, and users’ control over the digital experience. We empower creators to make their best work and get it in front of the audience they deserve. 

As part of that, our content feeds aspire to provide high-quality, safe, entertaining, inspiring, and relevant content to each user. Several feeds are available, including one based on a curated list of blogs and tags the user already follows, and another that dynamically serves trending content along with the types of content we believe the user might be interested in. To build these feeds, we use a wide range of content personalization techniques and signals, including the user’s dashboard preferences. Read on to learn more about each type of feed and how we select and order content to populate it.

Feeds

We support different consumption experiences, mostly through three tabs:

Following

 In this space we mostly showcase content from blogs the user follows. Users take an active role in controlling their experience by customizing which blogs they follow, as well as providing feedback about what they don’t want to see (e.g. filtering out content from specific blogs and tags).

We also provide occasional suggestions for: 

Examples of recommendation explanations in the Following feed

For most users, the volume of new content available since their last visit tends to be higher than what they can browse in one session. For this reason, the default experience on the Following feed ranks content algorithmically by predicted likelihood of engagement. Users can opt out of algorithmic ranking on the Following feed, and see a chronological feed instead, by turning off the “Best Stuff First” toggle in their dashboard preferences.
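
The effect of the toggle can be pictured with a minimal sketch. The field names (`created_at`, `predicted_engagement`) and the function itself are hypothetical, and "chronological" is assumed here to mean newest first; this is an illustration, not Tumblr's actual implementation.

```python
def order_following_feed(posts, best_stuff_first=True):
    """Order Following-feed posts.

    `posts` is a list of dicts with two hypothetical keys:
    'created_at' (a sortable timestamp) and 'predicted_engagement' (a float).
    """
    # Assumption: with the toggle off, the feed is simply newest first.
    key = "predicted_engagement" if best_stuff_first else "created_at"
    return sorted(posts, key=lambda post: post[key], reverse=True)
```

With the toggle on, an older post with a high predicted engagement can appear above a newer one; with it off, only the timestamp matters.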

For You

The content on the For You feed comes from a mix of posts created or reblogged by blogs the user already follows, and posts from sources (either blogs or topics) the user might not know yet. 

When recommending content in the For You feed, we use different signals to identify the user’s engagement patterns, both historical and real-time. These signals include explicit positive and negative engagement with blogs (e.g. following, blocking), posts (e.g. likes, replies, reblogs, shares, dismissals), and tags (e.g. following, blocking), as well as search queries and browsing events (e.g. tapping, clicking). We assign different importance weights to these events. Explicit engagements receive higher weights than browsing events because they reflect the user’s preferences more accurately, while browsing events can be noisier (e.g. a user may click on a post they don’t like). We also consider the time elapsed since each action and give more importance to recent events, which allows us to capture shifts in the user’s preferences over time.
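
As a rough illustration of weighting events by type and by recency, the sketch below sums exponentially decayed event weights per tag. The weight values, the 14-day half-life, and the function names are all assumptions made for the example; the source does not state the actual weights or decay scheme.

```python
# Hypothetical per-event weights: explicit engagements outweigh browsing events,
# and negative signals carry negative weight.
EVENT_WEIGHTS = {
    "follow": 5.0,
    "reblog": 4.0,
    "like": 3.0,
    "reply": 3.0,
    "click": 0.5,
    "not_interested": -4.0,
    "block": -10.0,
}

HALF_LIFE_DAYS = 14.0  # assumed: an event loses half its influence every 14 days


def decayed_weight(event_type, age_days, half_life_days=HALF_LIFE_DAYS):
    """Weight of a single event after exponential time decay."""
    base = EVENT_WEIGHTS.get(event_type, 0.0)
    return base * 0.5 ** (age_days / half_life_days)


def tag_affinities(events):
    """Sum decayed event weights per tag.

    `events` is an iterable of (tag, event_type, age_days) tuples.
    """
    scores = {}
    for tag, event_type, age_days in events:
        scores[tag] = scores.get(tag, 0.0) + decayed_weight(event_type, age_days)
    return scores
```

Under these assumed numbers, a like from today counts for 3.0, while a like from four weeks ago counts for 0.75, so recent behavior dominates without older history being discarded outright.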

We use this understanding of the user’s preferences over blogs, tags, and posts to identify a selection of posts (from blogs not yet followed) that are potentially relevant to the user. This happens through a suite of sourcing algorithms, each specialized in identifying relevant candidates using a subset of signals and its own definition of content similarity. For example, collaborative filtering algorithms surface posts engaged with by users who have similar engagement patterns (e.g. who engaged with the same posts), while content-based approaches suggest posts whose content (e.g. textual information, tags, media objects) is similar to the user’s interests (e.g. tags the user follows, posts the user has recently engaged with).
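
The two families of sourcing algorithms can be contrasted with a toy sketch. It uses Jaccard set overlap as the similarity measure for both, which is a simplification chosen for the example; the function names, the 0.2 threshold, and the data shapes are hypothetical and not taken from Tumblr's systems.

```python
def jaccard(a, b):
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def collaborative_candidates(user, engagements, min_similarity=0.2):
    """Posts engaged with by similar users that `user` has not engaged with.

    `engagements` maps user id -> set of post ids.
    """
    mine = engagements[user]
    scores = {}
    for other, theirs in engagements.items():
        if other == user:
            continue
        similarity = jaccard(mine, theirs)
        if similarity < min_similarity:
            continue
        # Each unseen post inherits the similarity of the user who engaged with it.
        for post in theirs - mine:
            scores[post] = scores.get(post, 0.0) + similarity
    return sorted(scores, key=lambda post: (-scores[post], post))


def content_candidates(interest_tags, post_tags, min_similarity=0.2):
    """Posts whose tags overlap with the tags the user is interested in.

    `post_tags` maps post id -> set of tags.
    """
    scored = [(jaccard(interest_tags, tags), post) for post, tags in post_tags.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [post for similarity, post in scored if similarity >= min_similarity]
```

The collaborative version never looks at what a post contains, only at who engaged with it; the content-based version never looks at other users, only at the post's own tags.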

Ultimately, the ordering of posts in the For You feed is determined by the predicted likelihood the user will find each post engaging and relevant to their interests. We also try to make sure that content on this feed reflects a wide range of sources and interests. 

Because the For You feed is algorithmically driven, users can influence what’s shown by engaging organically with the feed, curating the list of blogs and tags they follow or block, and marking irrelevant content with the “Not interested in this post” link in the meatballs menu (●●●) at the upper right-hand corner of the post.

Your Tags

This feed is meant to be a place for the user to catch up with the best and most recent content related to tags they follow. The ordering of posts in this feed balances recency and popularity in order to provide a mix of content that is fresh, relevant, and high quality. We also showcase a selection of the most popular creators in each tag, which is determined by their number of recent contributions with the tag and the corresponding engagement. Users can modify what we recommend by managing their followed tags, blocking tags, and filtering the Your Tags feed to only view content from certain tags. 
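
One common way to balance recency and popularity is to divide an engagement count by a power of the post's age. The sketch below uses that pattern purely as an illustration; the formula, the `gravity` exponent, the two-hour offset, and the function names are assumptions, since the source does not describe how the balance is actually computed.

```python
def tag_feed_score(engagement_count, age_hours, gravity=1.5):
    """Popularity discounted by age; a higher score means nearer the top.

    `gravity` (assumed value) controls how quickly older posts sink.
    The +2.0 offset keeps brand-new posts from getting an unbounded score.
    """
    return engagement_count / (age_hours + 2.0) ** gravity


def order_tag_feed(posts):
    """Sort (post_id, engagement_count, age_hours) tuples by score, best first."""
    ranked = sorted(posts, key=lambda post: tag_feed_score(post[1], post[2]), reverse=True)
    return [post_id for post_id, _, _ in ranked]
```

With these assumed parameters, a one-hour-old post with 10 engagements outranks a two-day-old post with 100, while among posts of the same age the more popular one still wins.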

How we order content on feeds

Feeds are typically composed through the process below. Some steps may be skipped based on user preferences.

  1. Retrieving candidate posts from a variety of underlying sources (the follow graph for posts created by followed blogs, collaborative filtering for posts similar to those the user recently engaged with, and content-based sourcing for posts whose content matches the user’s interests). 
  2. Applying several filters to make sure the content is available (e.g. it hasn’t been deleted, and post and blog visibility is set to public), complies with community guidelines and mature content visibility preferences, respects the user’s filtering settings on blogs and tags, and hasn’t already been engaged with by the user. On the For You feed, we also remove posts the user has recently seen, to improve diversity and freshness.
  3. Ordering this pool of content in a way that provides the user with the most engaging and relevant posts near the top of their feeds. This phase involves predicting the likelihood that a user will find each specific candidate post relevant to their interests and engaging, and then sorting candidates accordingly.
  4. Rearranging results to improve the diversity of the sequence of posts in the feed. The previous phase might produce rows of similar content (e.g. created by the same blog, or about the same topic), which might result in a poor user experience. A diversification re-ranking ensures that the sequence of posts on the feed covers different interests of the user and comes from a balanced mix of sources. 
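
Steps 2 to 4 above can be sketched as a small pipeline. Everything here is hypothetical: the `Post` fields, the function names, and in particular the diversification rule, which only avoids consecutive posts from the same blog and is much simpler than a real re-ranker would be. Step 1 (candidate retrieval) is assumed to have already produced the input list, and the predicted engagement scores are passed in as a plain dict.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: str
    blog: str
    tags: frozenset
    is_public: bool = True


def filter_candidates(posts, blocked_blogs, blocked_tags, already_engaged):
    """Step 2: drop unavailable, filtered, or already-engaged posts."""
    return [
        post for post in posts
        if post.is_public
        and post.blog not in blocked_blogs
        and not (post.tags & blocked_tags)
        and post.post_id not in already_engaged
    ]


def rank(posts, predicted_engagement):
    """Step 3: sort by predicted likelihood of engagement, highest first."""
    return sorted(posts, key=lambda post: predicted_engagement[post.post_id], reverse=True)


def diversify(ranked):
    """Step 4: greedily avoid placing two posts from the same blog back to back."""
    remaining = list(ranked)
    result = []
    while remaining:
        last_blog = result[-1].blog if result else None
        # Take the best-ranked post from a different blog; fall back to the
        # best-ranked post overall when only the same blog is left.
        pick = next((post for post in remaining if post.blog != last_blog), remaining[0])
        remaining.remove(pick)
        result.append(pick)
    return result


def build_feed(candidates, predicted_engagement, blocked_blogs=frozenset(),
               blocked_tags=frozenset(), already_engaged=frozenset()):
    """Apply steps 2-4 to candidates already retrieved in step 1."""
    kept = filter_candidates(candidates, blocked_blogs, blocked_tags, already_engaged)
    return diversify(rank(kept, predicted_engagement))
```

Keeping filtering ahead of ranking means the prediction step only has to score posts that are eligible to be shown, and keeping diversification last means it can trade a little predicted relevance for variety without affecting what is eligible.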

The relevance and engagement score we associate with each post during the ranking phase depends on a variety of factors. We employ machine learning techniques to learn, from a large pool of historical events, how the interplay between those factors (features) influences the likelihood of the user engaging with a candidate post. 
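
To make the idea of turning features into a likelihood concrete, here is a minimal logistic-model sketch. Logistic regression is used only as a stand-in, since the source does not say which model family is used, and the feature names and weight values in the usage note are invented for illustration. In practice the weights would be learned from historical events rather than written by hand.

```python
import math


def predict_engagement(features, weights, bias=0.0):
    """Logistic model: map a weighted sum of features to a probability in (0, 1).

    `features` and `weights` are dicts keyed by (hypothetical) feature name;
    features without a weight contribute nothing.
    """
    z = bias + sum(weights.get(name, 0.0) * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))
```

For example, with assumed weights `{"tag_affinity": 2.0, "post_age_days": -0.1}`, a post with high tag affinity scores closer to 1.0 than an otherwise identical post with none, and the ranking step would place it higher.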

While we use feed-specific prediction models and we often iterate on those models to improve their accuracy, we found that the categories of features with higher predictive power tend to be the same. They include: 

User control over feeds 

We offer several ways for users to customize their experience on Tumblr, and we update the content on feeds in real-time to reflect the current settings. Users can:
