Message-ID: <1430526956.1940.8.camel@stgolabs.net>
Subject: Re: [PATCH 3/3] ipc/mqueue: lockless pipelined wakeups
From: Davidlohr Bueso
To: George Spelvin
Cc: mingo@redhat.com, peterz@infradead.org, tglx@linutronix.de,
    bigeasy@linutronix.de, clm@fb.com, linux-kernel@vger.kernel.org,
    manfred@colorfullife.com, rostedt@goodmis.org,
    torvalds@linux-foundation.org
Date: Fri, 01 May 2015 17:35:56 -0700
In-Reply-To: <20150501215207.25731.qmail@ns.horizon.com>
References: <20150501215207.25731.qmail@ns.horizon.com>

On Fri, 2015-05-01 at 17:52 -0400, George Spelvin wrote:
> In general, Acked-by, but you're making me fix all your comments. :-)
>
> This is a nice use of the wake queue, since the code was already handling
> the same problem in a similar way with STATE_PENDING.
>
> >  * The receiver accepts the message and returns without grabbing the queue
> >+ * spinlock. The used algorithm is different from sysv semaphores (ipc/sem.c):
>
> Is that last sentence even wanted?

Yeah, we can probably remove it now.

> >+ *
> >+ * - Set pointer to message.
> >+ * - Queue the receiver task's for later wakeup (without the info->lock).
>
> It's "task" singular, and the apostrophe would be wrong if it were plural.
>
> >+ * - Update its state to STATE_READY. Now the receiver can continue.
> >+ * - Wake up the process after the lock is dropped. Should the process wake up
> >+ *   before this wakeup (due to a timeout or a signal) it will either see
> >+ *   STATE_READY and continue or acquire the lock to check the sate again.
>
> "check the sTate again".
>
> >+    wake_q_add(wake_q, receiver->task);
> >+    /*
> >+     * Rely on the implicit cmpxchg barrier from wake_q_add such
> >+     * that we can ensure that updating receiver->state is the last
> >+     * write operation: As once set, the receiver can continue,
> >+     * and if we don't have the reference count from the wake_q,
> >+     * yet, at that point we can later have a use-after-free
> >+     * condition and bogus wakeup.
> >+     */
> >     receiver->state = STATE_READY;
>
> How about:
>     /*
>      * There must be a write barrier here; setting STATE_READY
>      * lets the receiver proceed without further synchronization.
>      * The cmpxchg inside wake_q_add serves as the barrier here.
>      */
>
> The need for a wake queue to take a reference to avoid use-after-free
> is generic to wake queues, and handled in generic code; I don't see why
> it needs a comment here.

You are not wrong, but I'd rather leave the comment as is, as it will
vary from user to user. The comments in the sched wake_q bits are
already pretty clear, and if users cannot see the need for holding
reference and the task disappearing on their own they have no business
using wake_q. Furthermore, I think my comment serves better in mqueues
as the need for it isn't immediately obvious.

> >@@ -1084,6 +1094,7 @@ SYSCALL_DEFINE5(mq_timedreceive, mqd_t, mqdes, char __user *, u_msg_ptr,
> >     ktime_t expires, *timeout = NULL;
> >     struct timespec ts;
> >     struct posix_msg_tree_node *new_leaf = NULL;
> >+    WAKE_Q(wake_q);
> >
> >     if (u_abs_timeout) {
> >         int res = prepare_timeout(u_abs_timeout, &expires, &ts);
> >@@ -1155,8 +1166,9 @@ SYSCALL_DEFINE5(mq_timedreceive, mqd_t, mqdes, char __user *, u_msg_ptr,
> >                 CURRENT_TIME;
> >
> >         /* There is now free space in queue. */
> >-        pipelined_receive(info);
> >+        pipelined_receive(&wake_q, info);
> >         spin_unlock(&info->lock);
> >+        wake_up_q(&wake_q);
> >         ret = 0;
> >     }
> >     if (ret == 0) {
>
> Since WAKE_Q actually involves some initialization, would it make sense to
> move its declaration to inside the condition that needs it?
>
> (I'm also a fan of declaring variables in the smallest scope possible,
> just on general principles.)

Agreed.

Thanks,
Davidlohr
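
For reference, the ordering discussed above (queue the receiver and take the task reference first, publish STATE_READY last, do the actual wakeup only after the lock is dropped) can be modelled in userspace roughly as below. This is only a sketch, not the kernel code: struct task, struct receiver, the fixed-size struct wake_q and every sketch_* function are hypothetical stand-ins, and C11 atomics stand in for the cmpxchg barrier that the real wake_q_add() provides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these types are the kernel's. */
enum { STATE_NONE, STATE_READY };

#define SKETCH_WAKE_Q_CAP 8

struct task {
    atomic_int refcount;   /* models the task reference count */
    int woken;             /* set when the deferred wakeup is delivered */
};

struct receiver {
    struct task *task;
    const char *msg;       /* the "pointer to message" handed over */
    atomic_int state;
};

struct wake_q {
    struct task *tasks[SKETCH_WAKE_Q_CAP];
    size_t n;
};

/* Take a reference and queue the task for a later wakeup. */
static void sketch_wake_q_add(struct wake_q *q, struct task *t)
{
    assert(q->n < SKETCH_WAKE_Q_CAP);
    /* Sequentially consistent RMW: stands in for the cmpxchg barrier. */
    atomic_fetch_add(&t->refcount, 1);
    q->tasks[q->n++] = t;
}

/* Would run with the queue lock held in the real code. */
static void sketch_pipelined_handoff(struct wake_q *q, struct receiver *r,
                                     const char *msg)
{
    r->msg = msg;                      /* 1. set pointer to message */
    sketch_wake_q_add(q, r->task);     /* 2. queue task, reference held */
    /*
     * 3. Publish STATE_READY last. Once the receiver observes it, it may
     * return and its task may exit, so the reference must already be held.
     */
    atomic_store_explicit(&r->state, STATE_READY, memory_order_release);
}

/* Would run after the lock is dropped: deliver wakeups, drop references. */
static void sketch_wake_up_q(struct wake_q *q)
{
    for (size_t i = 0; i < q->n; i++) {
        q->tasks[i]->woken = 1;
        atomic_fetch_sub(&q->tasks[i]->refcount, 1);
    }
    q->n = 0;
}
```

A caller would invoke sketch_pipelined_handoff() where the lock is held and sketch_wake_up_q() right after unlocking, mirroring the pipelined_receive(&wake_q, info) / spin_unlock() / wake_up_q(&wake_q) sequence in the quoted hunk.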