How Tumblr Recommends Content

Tumblr is a digital community—part microblogging, part social network—where people come together around shared interests to celebrate ideas and art through genuine conversations. The experience on Tumblr leans on our key values of community, privacy, authenticity, freedom of expression, and users’ control over the digital experience. We empower creators to make their best work and get it in front of the audience they deserve.

As part of that, our content feeds aspire to provide high-quality, safe, entertaining, inspiring, and relevant content to each user. There are different feeds available, including one based on content from a curated list of blogs and tags the user already follows and another that dynamically serves content based on what’s trending along with the types of content we believe the user might be interested in. To develop these feeds, we use a wide range of content personalization techniques and signals, including the user’s dashboard preferences. Read on to learn more about each type of feed and how we select and order content to populate the feeds.

Feeds

We support different consumption experiences, mostly through three tabs:

Following

In this space we mostly showcase content from blogs the user follows. Users take an active role in controlling their experience by customizing which blogs they follow, as well as providing feedback about what they don’t want to see (e.g. filtering out content from specific blogs and tags).

We also provide occasional suggestions for:

Blogs to follow (“Check out these blogs”). These recommendations are based on follow-relationships (blogs followed by blogs the user has more recently followed, labeled as “In your orbit” or “Like blogs you follow”) and content-similarity (blogs that share similar content to what the user has recently engaged with, labeled as “You seem interested”). We don’t recommend blogs that the user has previously dismissed, reported, blocked, or recently unfollowed. Users can influence blog recommendations by engaging with blogs and content, by blocking or unfollowing blogs, or by dismissing a specific recommendation using the meatballs menu (●●●) at the upper right-hand corner of the post.
Tags to follow (“Check out these tags”). For these recommendations we consider relationships between tags (e.g. tags used together frequently) and tags associated with content the user has recently engaged with. Users influence tag recommendations by engaging organically with content and blocking tags, as we don’t recommend tags the user has previously blocked.
Posts, including those from not-yet-followed blogs. These posts come from different sources to better diversify the feed, but we limit the frequency of these recommended posts to make sure the Following feed mostly contains content created by blogs the user follows. Each recommended post is labeled with a specific explanation so that the user can easily identify them. Examples of posts we recommend include:
- Posts trending in topics the user follows (labeled as “Because you follow #tag”). We surface popular posts that are tagged with the tags the user follows and has more recently engaged with. Users can opt out of seeing these posts by turning off the “Include followed tag posts” toggle from their dashboard preferences.
- Posts related to the user’s recent activity (labeled as “Based on Your Likes!”). We surface posts that are “similar” to those recently liked by the user. Our definition of similarity is based on engagement: two posts are considered similar if they have been engaged with by the same users. Users can opt out of these recommendations by turning off the “Include “Based On Your Likes!”” toggle from their dashboard preferences.
- Posts liked by blogs the user follows (labeled as “Liked by @blogname”). We aggregate the “likes” activity of blogs the user follows and recommend content based on: (i) how many of the user’s followed blogs have liked each post (the more, the better); (ii) the frequency of interactions between the user and the blogs that liked the post (the higher, the better); (iii) the age of the post (the newer, the better). Users can opt out of these recommendations by turning off the “Include posts liked by the blogs you follow” toggle from their dashboard preferences. (We of course don’t share likes by users who have set their likes to private by turning off the “Share posts you like” toggle in their privacy options.)
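
As an illustration only, the three factors listed for “Liked by @blogname” recommendations could be combined into a single score as sketched below. The function name, the way the factors are multiplied together, and the 24-hour half-life are all assumptions made for this example, not a description of the production formula.

```python
import math

def liked_by_score(liker_interactions, post_age_hours, half_life_hours=24.0):
    """Hypothetical score for a post liked by blogs the user follows.

    liker_interactions maps each followed blog that liked the post to the
    number of recent interactions between the user and that blog.
    """
    # (i) every followed blog that liked the post adds to the score, and
    # (ii) blogs the user interacts with more often add a bit more.
    social = sum(1.0 + math.log1p(n) for n in liker_interactions.values())
    # (iii) newer posts score higher: the score halves every half_life_hours.
    freshness = 0.5 ** (post_age_hours / half_life_hours)
    return social * freshness

# A fresh post liked by two followed blogs outranks an older post liked by one.
fresh = liked_by_score({"blog-a": 5, "blog-b": 0}, post_age_hours=2.0)
older = liked_by_score({"blog-a": 5}, post_age_hours=48.0)
```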

Examples of recommendation explanations in the Following feed

For most users, the volume of new content available since their last visit tends to be higher than what they are typically able to browse in one session. For this reason, our default experience on the Following feed ranks content algorithmically by predicted likelihood of engagement. Users can opt out of algorithmic ranking on the Following feed, and instead have a chronological feed, by turning off the “Best Stuff First” toggle in their dashboard preferences.

For You

The content on the For You feed comes from a mix of posts created or reblogged by blogs the user already follows, and posts from sources (either blogs or topics) the user might not know yet.

When recommending content in the For You feed, we use different signals to identify the user’s engagement patterns, in terms of historical and real-time preferences over content. These signals include explicit positive and negative engagement on blogs (e.g. following, blocking), posts (e.g. likes, replies, reblogs, shares, dismissing), and tags (e.g. following, blocking), as well as search queries and browsing events (e.g. tapping, clicking). We assign different importance weights to these events, with explicit engagements having higher weights than browsing events, because they more accurately reflect the user’s preferences over content, while browsing events can be noisier (e.g. a user may click on a post they don’t like). We also consider the time elapsed since each action to give more importance to recent engagement and events, as this allows us to capture shifts in the user’s preferences over time.
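
To make the idea of weighted, time-decayed signals concrete, here is a minimal sketch. The event names, the weight values, and the 14-day half-life are placeholders invented for this example; the only properties taken from the description above are that explicit engagements outweigh browsing events, that negative signals count against a topic, and that recent events matter more.

```python
from collections import defaultdict

# Hypothetical weights: explicit engagement counts more than browsing,
# and negative signals carry negative weight.
EVENT_WEIGHTS = {
    "follow": 3.0, "reblog": 2.5, "like": 2.0, "reply": 2.0, "share": 2.0,
    "click": 0.3, "dismiss": -2.0, "block": -5.0,
}

def interest_profile(events, half_life_days=14.0):
    """Aggregate (tag, event_type, age_days) tuples into a tag -> score map."""
    profile = defaultdict(float)
    for tag, event_type, age_days in events:
        # Exponential decay: an event loses half its weight every half-life.
        decay = 0.5 ** (age_days / half_life_days)
        profile[tag] += EVENT_WEIGHTS.get(event_type, 0.0) * decay
    return dict(profile)

profile = interest_profile([
    ("art", "like", 0.0),
    ("art", "click", 0.0),
    ("cats", "like", 14.0),   # same action, two weeks old: half the weight
    ("spam", "dismiss", 1.0),
])
```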

We use this understanding of the user’s preferences over blogs, tags, and posts to identify a selection of posts (from blogs not yet followed) that are potentially relevant to the user. This happens through a suite of different sourcing algorithms, each one specialized in identifying relevant candidates using a subset of signals and definition of content-similarity. For example, collaborative filtering algorithms will surface posts engaged with by users who have similar engagement patterns (e.g. engaged with the same posts), while content-based approaches suggest posts whose content (e.g. textual information, tags, media objects) is similar to the user’s interests (e.g. tags the user follows, posts the user has recently engaged with).
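
One simple way to picture engagement-based similarity is to compare the sets of users who engaged with two posts. The sketch below uses Jaccard overlap for that comparison; the choice of Jaccard, the function names, and the toy data are assumptions for illustration, and real collaborative filtering systems typically use more elaborate models.

```python
def jaccard(a, b):
    """Overlap between two sets of user ids, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similar_posts(seed_post, engagers_by_post, top_k=3):
    """Return up to top_k posts whose engaged users overlap with the seed's."""
    seed_users = engagers_by_post[seed_post]
    scored = [
        (jaccard(seed_users, users), post)
        for post, users in engagers_by_post.items()
        if post != seed_post
    ]
    # Keep only posts with some shared audience, best overlap first.
    scored = [(score, post) for score, post in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [post for _, post in scored[:top_k]]

# Toy data: post id -> set of users who engaged with it.
engagers = {
    "p1": {"u1", "u2", "u3"},
    "p2": {"u1", "u2"},
    "p3": {"u3", "u9"},
    "p4": {"u7"},
}
recommended = similar_posts("p1", engagers)
```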

Ultimately, the ordering of posts in the For You feed is determined by the predicted likelihood the user will find each post engaging and relevant to their interests. We also try to make sure that content on this feed reflects a wide range of sources and interests.

Because the consumption experience on the “For You” feed is algorithmically driven, users can influence what’s shown by engaging organically with the feed, curating the list of blogs and tags they follow or block, and marking irrelevant content by using the “Not interested in this post” link in the meatballs menu (●●●) at the upper right-hand corner of the post.

Your Tags

This feed is meant to be a place for the user to catch up with the best and most recent content related to tags they follow. The ordering of posts in this feed balances recency and popularity in order to provide a mix of content that is fresh, relevant, and high quality. We also showcase a selection of the most popular creators in each tag, which is determined by their number of recent contributions with the tag and the corresponding engagement. Users can modify what we recommend by managing their followed tags, blocking tags, and filtering the Your Tags feed to only view content from certain tags.
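
A common way to balance recency and popularity is a time-decayed popularity score, in which engagement is divided by a power of the post’s age. The sketch below uses that general pattern; the formula, the `gravity` parameter, and the sample posts are assumptions for illustration and not necessarily what the Your Tags feed uses.

```python
def tag_feed_score(engagement_count, age_hours, gravity=1.5):
    """Hypothetical score: popularity discounted by age.

    A larger gravity makes older posts fall down the feed faster.
    """
    return engagement_count / (age_hours + 2.0) ** gravity

# (post id, engagement count, age in hours), made-up values.
posts = [
    ("old_hit", 500, 72.0),
    ("fresh", 40, 1.0),
    ("stale", 5, 48.0),
]
ordered = sorted(posts, key=lambda p: tag_feed_score(p[1], p[2]), reverse=True)
ordered_ids = [post_id for post_id, _, _ in ordered]
```

With these made-up numbers, a fresh post with modest engagement ranks above a three-day-old post with much higher engagement, which is the kind of trade-off the paragraph above describes.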

How we order content on feeds

The composition of feeds typically follows the process below. Some steps may be skipped based on user preferences.

Retrieving candidate posts from a variety of underlying sources (follow-graph for posts created by followed blogs, collaborative filtering for posts similar to those the user recently engaged with, and content-based for posts with content matching the user’s interests).
Applying several filters to make sure the content is available (e.g. it hasn’t been deleted, and post and blog visibility is set to public), complies with community guidelines and mature content visibility preferences, respects the user’s filtering settings on blogs and tags, and hasn’t already been engaged with by the user. On the For You feed, we also apply a filtering step to remove posts the user has recently seen, to improve diversification and freshness.
Ordering this pool of content in a way that provides the user with the most engaging and relevant posts near the top of their feeds. This phase involves predicting the likelihood that a user will find each specific candidate post relevant to their interests and engaging, and then sorting candidates accordingly.
Rearranging results to improve the diversity of the sequence of posts in the feed. The previous phase might produce rows of similar content (e.g. created by the same blog, or about the same topic), which might result in a poor user experience. A diversification re-ranking ensures that the sequence of posts on the feed covers different interests of the user and comes from a balanced mix of sources.
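
The four steps above can be sketched as a small pipeline. Everything in this example is hypothetical: the `Post` fields, the `compose_feed` name, the use of a precomputed `score` as a stand-in for the predicted engagement likelihood, and the greedy rule that avoids consecutive posts from the same blog are simplifications chosen to keep the sketch short.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Post:
    post_id: str
    blog: str
    score: float            # stand-in for a predicted engagement likelihood
    is_public: bool = True

def compose_feed(candidates, blocked_blogs, seen_ids, max_run=1):
    """Filter, rank, and diversify candidates (max_run must be >= 1)."""
    # 1. Retrieval is assumed done upstream; merge sources and drop duplicates.
    unique = list({p.post_id: p for p in candidates}.values())
    # 2. Filter out unavailable, blocked, and already-seen posts.
    eligible = [
        p for p in unique
        if p.is_public and p.blog not in blocked_blogs and p.post_id not in seen_ids
    ]
    # 3. Rank by predicted relevance, highest first.
    pending = sorted(eligible, key=lambda p: p.score, reverse=True)
    # 4. Diversify: avoid more than max_run consecutive posts from one blog.
    feed = []
    while pending:
        recent = [p.blog for p in feed[-max_run:]]
        run_full = len(recent) == max_run and len(set(recent)) == 1
        pick = next(
            (p for p in pending if not (run_full and p.blog == recent[0])),
            pending[0],  # fall back when only the repeated blog remains
        )
        feed.append(pick)
        pending = [p for p in pending if p is not pick]
    return feed

candidates = [
    Post("1", "a", 0.9), Post("2", "a", 0.8), Post("3", "b", 0.7),
    Post("4", "c", 0.6, is_public=False), Post("5", "d", 0.5),
    Post("6", "b", 0.4), Post("1", "a", 0.9),
]
feed = compose_feed(candidates, blocked_blogs={"d"}, seen_ids={"6"})
feed_ids = [p.post_id for p in feed]
```

In this toy run the second post from blog "a" is pushed below the post from blog "b", even though its score is higher, so that the feed does not open with two posts from the same source.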

The relevance/engagement we associate with each post during the ranking phases depends on a variety of factors. We employ machine learning techniques to learn, from a large pool of historical events, how the interplay between those factors (features) influences the likelihood of the user engaging with the candidate posts.

While we use feed-specific prediction models and we often iterate on those models to improve their accuracy, we found that the categories of features with higher predictive power tend to be the same. They include:

Information about the creator of the content, such as their popularity (e.g. number of followers), and their level of recent activity (e.g. number of posts created recently);
Information about the post, such as its type (reblog vs original post), age, popularity (e.g. engagement count, typically broken down by type of engagement), and information about the post content (e.g. post-type, presence and number of tags, images, length of textual body);
Information about the user, such as their interests and preference for different types of post content;
Information about the user and the creator of the post, such as the existence of a one-way or two-way social connection (do they follow each other?), subscription (does the user subscribe to the creator?), and recent level of engagement of the user on content shared by the creator of the post.
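
As a rough illustration of learning from historical events, the sketch below trains a tiny logistic regression with plain gradient descent and uses it to predict an engagement probability. Logistic regression is used here only because it is the simplest model of this kind; the three features, the training rows, and the hyperparameters are invented for the example and say nothing about the models or data actually used.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, features):
    """Predicted probability that the user engages with the post."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))

def train_logistic(rows, labels, epochs=500, lr=0.1):
    """Fit weights with stochastic gradient descent on the log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(rows, labels):
            error = predict(weights, bias, features) - label
            bias -= lr * error
            weights = [w - lr * error * x for w, x in zip(weights, features)]
    return weights, bias

# Invented features per (user, post) pair:
# [user follows creator (0/1), recent engagement with creator (0..1), post age (0..1)]
ROWS = [
    [1, 0.9, 0.1], [1, 0.5, 0.2], [0, 0.0, 0.9],
    [0, 0.1, 0.5], [1, 0.2, 0.8], [0, 0.6, 0.1],
]
LABELS = [1, 1, 0, 0, 0, 1]  # 1 = the user engaged, 0 = they did not

weights, bias = train_logistic(ROWS, LABELS)
likely = predict(weights, bias, [1, 0.9, 0.1])    # fresh post, familiar creator
unlikely = predict(weights, bias, [0, 0.0, 0.9])  # old post, unfamiliar creator
```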

User control over feeds

We offer several ways for users to customize their experience on Tumblr, and we update the content on feeds in real-time to reflect the current settings. Users can:

Choose between chronological vs algorithmic ranking on the Following feed through the “Best Stuff First” option in their dashboard preferences.
Control recommended posts on the Following feed, through the “Include followed tag posts”, “Include “Based On Your Likes!”” and “Include posts liked by the blogs you follow” toggles from their dashboard preferences.
Dismiss recommended posts, through the “Not interested in this post” or “Dismiss” item via the meatballs menu (●●●) at the upper right-hand corner of the post.
Block posts from a specific creator.
Block posts with specific tags.
Report posts as spam or for content-safety review, from the “Report post” or “Suggest community label” items via the meatballs menu (●●●) at the upper right-hand corner of the post.
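
The dashboard toggles described in this section can be thought of as a small settings object that is consulted each time the Following feed is built. The sketch below is a hypothetical model of that idea: the field names mirror the toggle labels, while the `source` values, the post dictionaries, and the `build_following_feed` function are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class DashboardPreferences:
    best_stuff_first: bool = True
    include_followed_tag_posts: bool = True
    include_based_on_your_likes: bool = True
    include_liked_by_followed_blogs: bool = True

def build_following_feed(posts, prefs):
    """Apply toggles to posts shaped like {'id', 'source', 'timestamp', 'score'}."""
    enabled = {
        "following": True,  # posts from followed blogs are always eligible
        "followed_tag": prefs.include_followed_tag_posts,
        "based_on_your_likes": prefs.include_based_on_your_likes,
        "liked_by_followed": prefs.include_liked_by_followed_blogs,
    }
    visible = [p for p in posts if enabled.get(p["source"], False)]
    # "Best Stuff First" on: algorithmic order; off: newest first.
    sort_key = "score" if prefs.best_stuff_first else "timestamp"
    ordered = sorted(visible, key=lambda p: p[sort_key], reverse=True)
    return [p["id"] for p in ordered]

posts = [
    {"id": "a", "source": "following", "timestamp": 3, "score": 0.2},
    {"id": "b", "source": "based_on_your_likes", "timestamp": 2, "score": 0.9},
    {"id": "c", "source": "followed_tag", "timestamp": 1, "score": 0.5},
]
default_feed = build_following_feed(posts, DashboardPreferences())
chronological = build_following_feed(
    posts,
    DashboardPreferences(best_stuff_first=False, include_based_on_your_likes=False),
)
```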