RedQueen

An Online Algorithm for Smart Broadcasting in Social Networks

Presented at WSDM, 2017;
Full paper at arxiv.org/abs/1610.05773;
Source code at Networks-Learning/RedQueen

Introduction

User attention is becoming a sparse commodity. This is evinced by the proliferation of studies which point to information overload or to the plethora of articles proclaiming to recommend when to post your messages to maximise the probability of your followers reacting to the post.

We will look at a broadcaster who wants his posts to gain more attention on the feed of one follower and see how he can regulate his posting to do that.

Feeds in social media

Feeds in social media display posts of all followees of the user (the follower) in some order. However, the follower does not pay equal attention to all posts. Instead, more attention is given to the posts on top of the feed while the posts which disappear underneath the fold are given little attention. See Leveraging Position Bias to Improve Peer Recommendation, Information is not a Virus, and Other Consequences of Human Cognitive Limits, On the Efficiency of the Information Networks in Social Media

Let us define the rank $ r(t) $ as the position of the first post by the broadcaster on the follower's feed. It is a measure of visibility of the broadcasters post. One way of seeking the follower's attention is to minimize the rank $ r(t) $.

In order to minimize the rank, we first need to understand how it evolves. If the feeds are sorted by the time of the post, with most recent posts appearing on top, then the dynamics of the rank $ r(t) $ of the broadcaster in the feed can be modeled using the following equation: ⊕ $$ r(t+dt) = \left\{ \begin{array}{ccccl} & (r(t)+1) & {\color{other} dM(t)} & {\color{us} (1-dN(t))} & {\color{other} \text{Others post}}\\ + & 0 & {\color{other}(1 - dM(t))} & {\color{us} dN(t)} & {\color{us} \text{Our broadcaster posts}}\\ + & r(t) & {\color{other} (1 - dM(t))} & {\color{us} (1- dN(t))} & \text{No-one posts} \end{array} \right. $$ where $ {\color{other} dM(t)}, {\color{us} dN(t)} \in \{0, 1\} $ denote the presence or absence of a broadcast in the respective point process at time $ t $. Our method can generalize to any ordering of posts as long as we can reverse-engineer the algorithm and write out the dynamics of $ r(t) $.

The evolution of $ \color{us} N(t) $ is under the control of the broadcaster, whereas the $ \color{other} M(t) $ may behave in various ways:

It may be nearly constant (e.g. large number of diverse followees)
It may have bursts (e.g. breaking news posts)
It may be piecewise constant with some period (e.g. most followees are from a particular time zones)
$ \ldots $

⊕ Karimi et. al. assume that the broadcasting frequency of other broadcasters is piecewise constant. Thanks to our novel formulation of the problem, our solution RedQueen is able to handle all of these cases, without needing to perform any model inference using historical data for $ \color{other} M(t) $.

Methodology

Given the dynamics of the rank and of other broadcasters, as stochastic jump differential equations, we can attempt to find the optimal rate of posting for the broadcaster $ u(t) $ by formulating it as a variable in an stochastic control problem. However, before that, we have to choose a loss function which we will aim to optimize for.

We choose the following loss function: ⊕ This objective can be generalized for multiple followers. For details, see our paper. $$ \begin{align} \underset{u(t_0, t_f]}{\text{minimize}} & \quad \mathbb{E}_{({\color{us} N}, {\color{other} M})(t_0, t_f]}\left[ \int_{t_0}^{t_f} \left( \frac{1}{2} s \,r^2(\tau) + \frac{1}{2} q\,u^2(\tau) \right) d\tau \right] \\ \text{subject to} & \quad u(t) \geq 0 \quad \forall t \in (t_0, t_f] \end{align} $$

⊕ Karimi et. al. optimize for Time spent in top-k, wherein, they find the rates which maximize the time the broadcaster's posts spend with $ r(t) \lt k $. In our paper, we show that our approach, though optimizing for a different loss function, still outperforms the offline method. This function penalizes high ranks $ \frac{1}{2} s \,r^2(t) $ as well as posting with high frequency $ \frac{1}{2} q\,u^2(t) $, with tunable weights $ s $ and $ q $ to trade off between them.

We integrate the this loss function over the desired interval $ [ t_0, t_j ] $ and, since the future is stochastic, we attempt to find the posting rate $ u(t) $ of the broadcaster which minimizes the expectation of the integral.

With this choice of the loss function, the optimization problem can be solved analytically by formulating the optimal cost-to-go, deriving a PDE using the Hamilton-Jacobi-Bellman equation and then solving it.

Finally, on solving the PDE, we arrive at the following elegant solution: $$ \overbrace{u(t)}^{\text{broadcast intensity}} = \sqrt{s/q}\,\underbrace{r(t)}_{\text{Rank in feed}} $$

This function states that the rate at which the broadcaster must post messages (i.e. the urgency of posting), should be proportional to how bad his rank $ r(t) $ is. Hence, he will have to post faster to stay where he originally was (i.e. at the top).

Determining the rate requires only knowing the rank of the broadcaster on the feed(s) of his follower(s). This information can be calculated in $ \mathcal{O}(1) $ time, given the stream of posts from other broadcasters, the sampling, as shown below, too takes only $ \mathcal{O}(1) $ time, making this an efficient online algorithm for determining when-to-post.

From rate-of-posting to when-to-post

Now that we know how the broadcaster should regulate his messages, we have to determine when exactly should the next post be made.

First, notice that $ u(t) $ changes only when $ r(t) $ changes and in steps of size $ \sqrt{{s}/{q}} $.

Thanks to the superposition property of Poisson point processes we can treat each jump in $ u(t) $ as the start of a new Poisson process.

u(t) is superposition of multiple point processes

Now to draw the time of first event for $ u(t) $, i.e. when our broadcaster posts, we can draw samples from all point processes and take the minimum.

sampling u(t) involves taking the minimum of all samples for individual processes

Since drawing one sample is an $ \mathcal{O}(1) $ operation, and we draw one sample each time other broadcasters post, we only have to take $ \mathcal{O}(M(t_f)) $ samples.

Demo

Choose the metric you would like to see and press the Play button to start the demo.

Rank $ r(t) $ Time at top $ \int_{0}^{t} \mathbb{I}(r(\tau) \lt 1)\, d\tau $

⊕ Interactive demo showing the performance (i.e. $ r(t) $) of RedQueen against a Poisson broadcaster (who posts at random times) against a bursty stream of tweets coming from other broadcasters.

More details

Our paper contains a mathematical description of RedQueen, and several extensions including, but not limited to:

Comparison against an Oracle and the state of the art and saw that our algorithm fared favourably.
Extension to handle multiple users and to handle time varying weights for followers.
Testing under truthful what-if conditions by playing back tweets on falls of followers of over 2000 Twitter users. RedQueen was able to obtain better average rank as well as higher time in the top-1 compared to the real users as well as against the current state of the art.

The Name RedQueen

The name is taken from the Red Queen's race incident in Lewis Carroll's book Through the Looking Glass. In particular, the Queen's quote:

"Now, here, you see, it takes all the running you can do, to keep in the same place."
Red Queen

Authors

The webpage was created by Utkarsh Upadhyay. The other authors of RedQueen are Ali Zarezade, Hamid R. Rabiee, and Manuel Gomez-Rodriguez.

Acknowledgements

The RedQueen icon was made by Zlatko Najdenovski from www.flaticon.com. It is licensed by CC 3.0 BY.

Source code

Imprint / Data Protection