2015-06-04 17:57:11

by Manfred Spraul

[permalink] [raw]
Subject: Re: [PATCH 2/2] ipc,msg: provide barrier pairings for lockless receive

On 05/30/2015 02:03 AM, Davidlohr Bueso wrote:
> We currently use a full barrier on the sender side to
> to avoid receiver tasks disappearing on us while still
> performing on the sender side wakeup. We lack however,
> the proper CPU-CPU interactions pairing on the receiver
> side which busy-waits for the message. Similarly, we do
> not need a full smp_mb, and can relax the semantics for
> the writer and reader sides of the message. This is safe
> as we are only ordering loads and stores to r_msg. And in
> both smp_wmb and smp_rmb, there are no stores after the
> calls _anyway_.
I like the idea, the pairing in ipc is not good.
Another one is still open in sem.

Perhaps we should formalize it a bit more, so that it is easy to find
which barrier pair belongs together.
It is only an idea, but right now there are too many bugs.

> This obviously applies for pipelined_send and expunge_all,
> for EIRDM when destroying a queue.
>
> Signed-off-by: Davidlohr Bueso <[email protected]>
> ---
> ipc/msg.c | 31 ++++++++++++++++++++++---------
> 1 file changed, 22 insertions(+), 9 deletions(-)
>
> diff --git a/ipc/msg.c b/ipc/msg.c
> index 2b6fdbb..ac5116e 100644
> --- a/ipc/msg.c
> +++ b/ipc/msg.c
> @@ -196,7 +196,7 @@ static void expunge_all(struct msg_queue *msq, int res)
> * or dealing with -EAGAIN cases. See lockless receive part 1
> * and 2 in do_msgrcv().
> */
> - smp_mb();
> + smp_wmb();
Idea for improvement: Add here /* smp_barrier_pair: ipc_msg_01 */

> msr->r_msg = ERR_PTR(res);
> }
> }
> @@ -580,7 +580,7 @@ static inline int pipelined_send(struct msg_queue *msq, struct msg_msg *msg)
> /* initialize pipelined send ordering */
> msr->r_msg = NULL;
> wake_up_process(msr->r_tsk);
> - smp_mb(); /* see barrier comment below */
> + smp_wmb(); /* see barrier comment below */
> msr->r_msg = ERR_PTR(-E2BIG);
> } else {
> msr->r_msg = NULL;
> @@ -589,11 +589,12 @@ static inline int pipelined_send(struct msg_queue *msq, struct msg_msg *msg)
> wake_up_process(msr->r_tsk);
> /*
> * Ensure that the wakeup is visible before
> - * setting r_msg, as the receiving end depends
> - * on it. See lockless receive part 1 and 2 in
> - * do_msgrcv().
> + * setting r_msg, as the receiving can otherwise
> + * exit - once r_msg is set, the receiver can
> + * continue. See lockless receive part 1 and 2
> + * in do_msgrcv().
> */
> - smp_mb();
> + smp_wmb();
Idea for improvement: Add here /* smp_barrier_pair: ipc_msg_02 */
> msr->r_msg = msg;
>
> return 1;
> @@ -934,10 +935,22 @@ long do_msgrcv(int msqid, void __user *buf, size_t bufsz, long msgtyp, int msgfl
> * wake_up_process(). There is a race with exit(), see
> * ipc/mqueue.c for the details.
> */
> - msg = (struct msg_msg *)msr_d.r_msg;
> - while (msg == NULL) {
> - cpu_relax();
> + for (;;) {
> + /*
> + * Pairs with writer barrier in pipelined_send
> + * or expunge_all
> + */
> + smp_rmb();
And here again /* smp_barrier_pair: for ipc_msg_01 and ipc_msg_02 */

--
Manfred


2015-06-04 18:41:40

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: [PATCH 2/2] ipc,msg: provide barrier pairings for lockless receive

On Thu, 2015-06-04 at 19:57 +0200, Manfred Spraul wrote:
> On 05/30/2015 02:03 AM, Davidlohr Bueso wrote:
> > We currently use a full barrier on the sender side to
> > to avoid receiver tasks disappearing on us while still
> > performing on the sender side wakeup. We lack however,
> > the proper CPU-CPU interactions pairing on the receiver
> > side which busy-waits for the message. Similarly, we do
> > not need a full smp_mb, and can relax the semantics for
> > the writer and reader sides of the message. This is safe
> > as we are only ordering loads and stores to r_msg. And in
> > both smp_wmb and smp_rmb, there are no stores after the
> > calls _anyway_.
> I like the idea, the pairing in ipc is not good.
> Another one is still open in sem.

Hmm for sems are you referring to spinning on ->status in
get_queue_result() while another task is performing a wakeup in between
wake_up_sem_queue_prepare() and wake_up_sem_queue_do()?

>
> Perhaps we should formalize it a bit more, so that it is easy to find
> which barrier pair belongs together.
> It is only an idea, but right now there are too many bugs.

Good point, however, what do you think of the below instead? Makes it
crystal clear, imho.

Thanks,
Davidlohr

diff --git a/ipc/msg.c b/ipc/msg.c
index 2b6fdbb..ce7bf50 100644
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -196,7 +196,7 @@ static void expunge_all(struct msg_queue *msq, int res)
* or dealing with -EAGAIN cases. See lockless receive part 1
* and 2 in do_msgrcv().
*/
- smp_mb();
+ smp_wmb(); /* barrier (B) */
msr->r_msg = ERR_PTR(res);
}
}
@@ -580,7 +580,8 @@ static inline int pipelined_send(struct msg_queue *msq, struct msg_msg *msg)
/* initialize pipelined send ordering */
msr->r_msg = NULL;
wake_up_process(msr->r_tsk);
- smp_mb(); /* see barrier comment below */
+ /* barrier (B) see barrier comment below */
+ smp_wmb();
msr->r_msg = ERR_PTR(-E2BIG);
} else {
msr->r_msg = NULL;
@@ -589,11 +590,12 @@ static inline int pipelined_send(struct msg_queue *msq, struct msg_msg *msg)
wake_up_process(msr->r_tsk);
/*
* Ensure that the wakeup is visible before
- * setting r_msg, as the receiving end depends
- * on it. See lockless receive part 1 and 2 in
- * do_msgrcv().
+ * setting r_msg, as the receiving can otherwise
+ * exit - once r_msg is set, the receiver can
+ * continue. See lockless receive part 1 and 2
+ * in do_msgrcv(). Barrier (B).
*/
- smp_mb();
+ smp_wmb();
msr->r_msg = msg;

return 1;
@@ -932,12 +934,38 @@ long do_msgrcv(int msqid, void __user *buf, size_t bufsz, long msgtyp, int msgfl
/* Lockless receive, part 2:
* Wait until pipelined_send or expunge_all are outside of
* wake_up_process(). There is a race with exit(), see
- * ipc/mqueue.c for the details.
+ * ipc/mqueue.c for the details. The correct serialization
+ * ensures that a receiver cannot continue without the wakeup
+ * being visibible _before_ setting r_msg:
+ *
+ * CPU 0 CPU 1
+ * <loop receiver>
+ * smp_rmb(); (A) <-- pair -. <waker thread>
+ * <load ->r_msg> | msr->r_msg = NULL;
+ * | wakeup_process()
+ * <continue> `------> smp_wmb(); (B)
+ * msr->r_msg = msg;
+ *
+ * Where (A) orders the message value read and where (B) orders
+ * the write to the futex -- done in both pipelined_send and
+ * expunge_all. *
*/
- msg = (struct msg_msg *)msr_d.r_msg;
- while (msg == NULL) {
- cpu_relax();
+ for (;;) {
+ /*
+ * Pairs with writer barrier in pipelined_send
+ * or expunge_all.
+ */
+ smp_rmb(); /* barrier (A) */
msg = (struct msg_msg *)msr_d.r_msg;
+ if (msg)
+ break;
+
+ /*
+ * The cpu_relax() call is a compiler barrier
+ * which forces everything in this loop to be
+ * re-loaded.
+ */
+ cpu_relax();
}

/* Lockless receive, part 3:

2015-06-04 18:57:57

by Davidlohr Bueso

[permalink] [raw]
Subject: Re: [PATCH 2/2] ipc,msg: provide barrier pairings for lockless receive

On Thu, 2015-06-04 at 11:41 -0700, Davidlohr Bueso wrote:
> On Thu, 2015-06-04 at 19:57 +0200, Manfred Spraul wrote:
> > On 05/30/2015 02:03 AM, Davidlohr Bueso wrote:
> > > We currently use a full barrier on the sender side to
> > > to avoid receiver tasks disappearing on us while still
> > > performing on the sender side wakeup. We lack however,
> > > the proper CPU-CPU interactions pairing on the receiver
> > > side which busy-waits for the message. Similarly, we do
> > > not need a full smp_mb, and can relax the semantics for
> > > the writer and reader sides of the message. This is safe
> > > as we are only ordering loads and stores to r_msg. And in
> > > both smp_wmb and smp_rmb, there are no stores after the
> > > calls _anyway_.
> > I like the idea, the pairing in ipc is not good.
> > Another one is still open in sem.
>
> Hmm for sems are you referring to spinning on ->status in
> get_queue_result() while another task is performing a wakeup in between
> wake_up_sem_queue_prepare() and wake_up_sem_queue_do()?
>
> >
> > Perhaps we should formalize it a bit more, so that it is easy to find
> > which barrier pair belongs together.
> > It is only an idea, but right now there are too many bugs.
>
> Good point, however, what do you think of the below instead? Makes it
> crystal clear, imho.

We had to do some formalizing in futex too.

> + * Where (A) orders the message value read and where (B) orders
> + * the write to the futex -- done in both pipelined_send and
^^ this should be r_msg