Date: 20 Apr 2015 21:39:39 -0400
Message-ID: <20150421013939.11690.qmail@ns.horizon.com>
From: "George Spelvin" <linux@horizon.com>
To: dave@stgolabs.net, linux@horizon.com
Subject: Re: [PATCH 1/2] sched: lockless wake-queues
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org
In-Reply-To: <1429560500.2042.17.camel@stgolabs.net>
Sender: linux-kernel-owner@vger.kernel.org

>> Is there some reason you don't use the simpler singly-linked list
>> construction with the tail being a pointer to a pointer:

> Sure, that would also work.

It's just a convenient simplification, already used in struct hlist_node.
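
For concreteness, here is a minimal user-space sketch of the construction
I mean.  The names (struct node, struct queue, queue_init, queue_add_tail)
are made up for illustration and are not taken from the patch:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list node; "id" is only there to tell nodes apart. */
struct node {
	struct node *next;
	int id;
};

/*
 * The tail is not a pointer to the last node but a pointer to the
 * "next" pointer that the following insertion must fill in.  For an
 * empty queue that is &first, so appending needs no special case.
 */
struct queue {
	struct node *first;
	struct node **lastp;
};

static void queue_init(struct queue *q)
{
	q->first = NULL;
	q->lastp = &q->first;
}

static void queue_add_tail(struct queue *q, struct node *n)
{
	n->next = NULL;
	*q->lastp = n;		/* link in: writes either q->first or prev->next */
	q->lastp = &n->next;	/* the new node's next pointer is the new tail slot */
}
```

The point is only that the append path has no "is the list empty?" branch.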

>> +/*
>> + * Queue a task for later wake-up by wake_up_q().  If the task is already
>> + * queued by someone else, leave it to them to deliver the wakeup.
>
> This is already commented in the cmpxchg.
>
>> + *
>> + * This property makes it impossible to guarantee the order of wakeups,
>> + * but for efficiency we try to deliver wakeups in the order tasks
>> + * are added.  
>
> Ok.

This is just me thinking "out loud" about the semantics.

>> It may also be worth commenting the fact that wake_up_q() leaves the
>> struct wake_q_head in a corrupt state, so don't try to do it again.

> Right, we could re-init the list once the loop is complete, yes. But it
> shouldn't matter due to how we use wake-queues.

Oh, indeed, there's no point.  Unless it's worth a debugging option,
but as you say the usage patterns are such that I don't expect it's
needed.

It just seemed worth commenting explicitly.


If I were going to comment it, here's what I'd write.  Feel free
to copy any or none of this:

/*
 * Wake-queues are lists of tasks about to be woken up.
 * Deferring the wakeup is useful when the waker is waking up multiple
 * tasks while holding a lock which the woken tasks will need, so they'd
 * go straight into a wait queue anyway.
 *
 * So instead, the waker can wake_q_add(&q, task) under the lock,
 * and then wake_up_q(&q) afterward.
 *
 * The list head is allocated on the waker's stack, and the queue nodes
 * are preallocated as part of the task struct.
 *
 * A reference to each task (get_task_struct()) is held while it is queued,
 * so the list will remain valid through wake_up_q().
 *
 * One queue node per task suffices, because a task never needs to be in
 * two wake queues simultaneously: it is forbidden to abandon a task in
 * a wake queue (a call to wake_up_q() _must_ follow), so if a task is
 * already in a wake queue, the wakeup will happen soon and the second
 * waker can just skip it.
 *
 * As with all Linux wakeup primitives, there is no guarantee about the
 * order, but this code tries to wake tasks in wake_q_add order.
 *
 * The WAKE_Q macro declares and initializes the list head.
 * wake_up_q() does NOT reinitialize the list; it's expected to be
 * called near the end of a function, where the fact that the queue is
 * not used again will be easy to see by inspection.
 */
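
To make the semantics in that comment concrete, here is a rough
single-threaded user-space model.  It is a sketch under assumptions, not
the patch itself: every toy_* name is hypothetical, the C11
compare-exchange stands in for the kernel's cmpxchg, the "refs" counter
stands in for get_task_struct()/put_task_struct(), and "woken" stands in
for the actual wakeup.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct toy_node {
	_Atomic(struct toy_node *) next;	/* NULL means "not on any wake queue" */
};

struct toy_task {
	struct toy_node wake_node;	/* first member, so node -> task is a cast */
	int refs;			/* stand-in for the task reference count */
	int woken;			/* counts delivered wakeups */
};

struct toy_head {
	_Atomic(struct toy_node *) first;
	_Atomic(struct toy_node *) *lastp;	/* slot the next add fills in */
};

/* Non-NULL end marker, so a queued last node is distinguishable from an unqueued one. */
#define TOY_TAIL ((struct toy_node *)0x1)

static void toy_task_init(struct toy_task *t)
{
	atomic_init(&t->wake_node.next, NULL);
	t->refs = 0;
	t->woken = 0;
}

static void toy_wake_q_init(struct toy_head *head)
{
	atomic_init(&head->first, TOY_TAIL);
	head->lastp = &head->first;
}

static void toy_wake_q_add(struct toy_head *head, struct toy_task *t)
{
	struct toy_node *expected = NULL;

	/* Claim the node.  If it is already claimed, some other waker will deliver the wakeup. */
	if (!atomic_compare_exchange_strong(&t->wake_node.next, &expected, TOY_TAIL))
		return;

	t->refs++;				/* hold the task while it is queued */
	atomic_store(head->lastp, &t->wake_node);
	head->lastp = &t->wake_node.next;
}

/* Wakes everything queued, in add order; returns the number of tasks woken. */
static int toy_wake_up_q(struct toy_head *head)
{
	struct toy_node *node = atomic_load(&head->first);
	int n = 0;

	while (node != TOY_TAIL) {
		struct toy_task *t = (struct toy_task *)node;

		node = atomic_load(&t->wake_node.next);
		atomic_store(&t->wake_node.next, NULL);	/* task may be queued again */
		t->woken++;
		t->refs--;
		n++;
	}
	return n;	/* note: head is NOT reinitialized here */
}
```

In this model a second toy_wake_q_add() of an already-queued task is a
no-op, and the head still points at stale nodes after toy_wake_up_q(),
which is the "don't use it again" property mentioned above.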