Hello!
Both __kfifo_put() and __kfifo_get() have header comments stating
that if there is but one concurrent reader and one concurrent writer,
locking is not necessary. This is almost the case, but a couple of
memory barriers are needed. Another option would be to change the
header comments to remove the bit about locking not being needed, and
to change the those callers who currently don't use locking to add
the required locking. The attachment analyzes this approach, but the
patch below seems simpler.
Signed-off-by: Paul E. McKenney <[email protected]>
---
kfifo.c | 4 ++++
1 files changed, 4 insertions(+)
diff -urpNa -X dontdiff linux-2.6.18-rc2/kernel/kfifo.c linux-2.6.18-rc2-kfifo/kernel/kfifo.c
--- linux-2.6.18-rc2/kernel/kfifo.c 2006-07-15 14:53:08.000000000 -0700
+++ linux-2.6.18-rc2-kfifo/kernel/kfifo.c 2006-08-09 14:01:53.000000000 -0700
@@ -129,6 +129,8 @@ unsigned int __kfifo_put(struct kfifo *f
/* then put the rest (if any) at the beginning of the buffer */
memcpy(fifo->buffer, buffer + l, len - l);
+ smp_wmb();
+
fifo->in += len;
return len;
@@ -161,6 +163,8 @@ unsigned int __kfifo_get(struct kfifo *f
/* then get the rest (if any) from the beginning of the buffer */
memcpy(buffer + l, fifo->buffer, len - l);
+ smp_mb();
+
fifo->out += len;
return len;
On Wed, 9 Aug 2006 17:18:23 -0700
"Paul E. McKenney" <[email protected]> wrote:
> @@ -129,6 +129,8 @@ unsigned int __kfifo_put(struct kfifo *f
> /* then put the rest (if any) at the beginning of the buffer */
> memcpy(fifo->buffer, buffer + l, len - l);
>
> + smp_wmb();
> +
> fifo->in += len;
>
> return len;
> @@ -161,6 +163,8 @@ unsigned int __kfifo_get(struct kfifo *f
> /* then get the rest (if any) from the beginning of the buffer */
> memcpy(buffer + l, fifo->buffer, len - l);
>
> + smp_mb();
> +
> fifo->out += len;
>
> return len;
When adding barriers, please also add a little comment explaining what the
barrier is protecting us from.
Often it's fairly obvious, but sometimes it is not, and in both cases it is still
useful to communicate the programmer's intent in this way.
Thanks.
OK, it appears that we are even. I forgot to attach the promised
analysis of the callers to __kfifo_put() and __kfifo_get(), and
the [email protected] email address listed as maintainer
in drivers/scsi/libiscsi.c bounces complaining that, as a non-member,
I am not allowed to send it email. ;-)
Anyway, this time the analysis really is attached, sorry for my confusion!
Could someone please let the guys on [email protected] know
that they should take a look at this?
Thanx, Paul
On Wed, Aug 09, 2006 at 05:18:23PM -0700, Paul E. McKenney wrote:
> Hello!
>
> Both __kfifo_put() and __kfifo_get() have header comments stating
> that if there is but one concurrent reader and one concurrent writer,
> locking is not necessary. This is almost the case, but a couple of
> memory barriers are needed. Another option would be to change the
> header comments to remove the bit about locking not being needed, and
> to change the those callers who currently don't use locking to add
> the required locking. The attachment analyzes this approach, but the
> patch below seems simpler.
>
> Signed-off-by: Paul E. McKenney <[email protected]>
> ---
>
> kfifo.c | 4 ++++
> 1 files changed, 4 insertions(+)
>
> diff -urpNa -X dontdiff linux-2.6.18-rc2/kernel/kfifo.c linux-2.6.18-rc2-kfifo/kernel/kfifo.c
> --- linux-2.6.18-rc2/kernel/kfifo.c 2006-07-15 14:53:08.000000000 -0700
> +++ linux-2.6.18-rc2-kfifo/kernel/kfifo.c 2006-08-09 14:01:53.000000000 -0700
> @@ -129,6 +129,8 @@ unsigned int __kfifo_put(struct kfifo *f
> /* then put the rest (if any) at the beginning of the buffer */
> memcpy(fifo->buffer, buffer + l, len - l);
>
> + smp_wmb();
> +
> fifo->in += len;
>
> return len;
> @@ -161,6 +163,8 @@ unsigned int __kfifo_get(struct kfifo *f
> /* then get the rest (if any) from the beginning of the buffer */
> memcpy(buffer + l, fifo->buffer, len - l);
>
> + smp_mb();
> +
> fifo->out += len;
>
> return len;
On Wed, Aug 09, 2006 at 05:29:10PM -0700, Andrew Morton wrote:
> On Wed, 9 Aug 2006 17:18:23 -0700
> "Paul E. McKenney" <[email protected]> wrote:
>
> > @@ -129,6 +129,8 @@ unsigned int __kfifo_put(struct kfifo *f
> > /* then put the rest (if any) at the beginning of the buffer */
> > memcpy(fifo->buffer, buffer + l, len - l);
> >
> > + smp_wmb();
> > +
> > fifo->in += len;
> >
> > return len;
> > @@ -161,6 +163,8 @@ unsigned int __kfifo_get(struct kfifo *f
> > /* then get the rest (if any) from the beginning of the buffer */
> > memcpy(buffer + l, fifo->buffer, len - l);
> >
> > + smp_mb();
> > +
> > fifo->out += len;
> >
> > return len;
>
> When adding barriers, please also add a little comment explaining what the
> barrier is protecting us from.
>
> Often it's fairly obvious, but sometimes it is not, and in both cases it is still
> useful to communicate the programmer's intent in this way.
I certainly cannot claim that it was obvious in this case, as the act
of adding the comments caused me to realize that I had added only two
of the four memory barriers that are required. :-/ Updated patch below.
Signed-off-by: Paul E. McKenney <[email protected]>
---
kfifo.c | 28 ++++++++++++++++++++++++++++
1 files changed, 28 insertions(+)
diff -urpNa -X dontdiff linux-2.6.18-rc2/kernel/kfifo.c linux-2.6.18-rc2-kfifo/kernel/kfifo.c
--- linux-2.6.18-rc2/kernel/kfifo.c 2006-07-15 14:53:08.000000000 -0700
+++ linux-2.6.18-rc2-kfifo/kernel/kfifo.c 2006-08-09 17:45:41.000000000 -0700
@@ -122,6 +122,13 @@ unsigned int __kfifo_put(struct kfifo *f
len = min(len, fifo->size - fifo->in + fifo->out);
+ /*
+ * Ensure that we sample the fifo->out index -before- we
+ * start putting bytes into the kfifo.
+ */
+
+ smp_mb();
+
/* first put the data starting from fifo->in to buffer end */
l = min(len, fifo->size - (fifo->in & (fifo->size - 1)));
memcpy(fifo->buffer + (fifo->in & (fifo->size - 1)), buffer, l);
@@ -129,6 +136,13 @@ unsigned int __kfifo_put(struct kfifo *f
/* then put the rest (if any) at the beginning of the buffer */
memcpy(fifo->buffer, buffer + l, len - l);
+ /*
+ * Ensure that we add the bytes to the kfifo -before-
+ * we update the fifo->in index.
+ */
+
+ smp_wmb();
+
fifo->in += len;
return len;
@@ -154,6 +168,13 @@ unsigned int __kfifo_get(struct kfifo *f
len = min(len, fifo->in - fifo->out);
+ /*
+ * Ensure that we sample the fifo->in index -before- we
+ * start removing bytes from the kfifo.
+ */
+
+ smp_rmb();
+
/* first get the data from fifo->out until the end of the buffer */
l = min(len, fifo->size - (fifo->out & (fifo->size - 1)));
memcpy(buffer, fifo->buffer + (fifo->out & (fifo->size - 1)), l);
@@ -161,6 +182,13 @@ unsigned int __kfifo_get(struct kfifo *f
/* then get the rest (if any) from the beginning of the buffer */
memcpy(buffer + l, fifo->buffer, len - l);
+ /*
+ * Ensure that we remove the bytes from the kfifo -before-
+ * we update the fifo->out index.
+ */
+
+ smp_mb();
+
fifo->out += len;
return len;
Paul E. McKenney wrote:
> OK, it appears that we are even. I forgot to attach the promised
> analysis of the callers to __kfifo_put() and __kfifo_get(), and
> the [email protected] email address listed as maintainer
> in drivers/scsi/libiscsi.c bounces complaining that, as a non-member,
> I am not allowed to send it email. ;-)
Sorry about that. I do not have any control over the email list. I will
change the maintainer info entry to indicate that users should just send
mail to me or linux-scsi.
>
> Anyway, this time the analysis really is attached, sorry for my confusion!
>
We have change the code a little since the analysis was made. But, it
does not really matter much now. I am fine with us just grabbing the
session lock or xmitmitex (I will send a patch and test it as well) if
your barrier patch is not accepted. We grab the session lock in the fast
path now so there is not much benefit left for us.
On Thu, Aug 10, 2006 at 01:48:02AM -0400, Mike Christie wrote:
> Paul E. McKenney wrote:
> > OK, it appears that we are even. I forgot to attach the promised
> > analysis of the callers to __kfifo_put() and __kfifo_get(), and
> > the [email protected] email address listed as maintainer
> > in drivers/scsi/libiscsi.c bounces complaining that, as a non-member,
> > I am not allowed to send it email. ;-)
>
> Sorry about that. I do not have any control over the email list. I will
> change the maintainer info entry to indicate that users should just send
> mail to me or linux-scsi.
Sounds good!
> > Anyway, this time the analysis really is attached, sorry for my confusion!
>
> We have change the code a little since the analysis was made. But, it
> does not really matter much now. I am fine with us just grabbing the
> session lock or xmitmitex (I will send a patch and test it as well) if
> your barrier patch is not accepted. We grab the session lock in the fast
> path now so there is not much benefit left for us.
I am happy to go either way -- the patch with the memory barriers
(which does have the side-effect of slowing down kfifo_get() and
kfifo_put(), by the way), or a patch removing the comments saying
that it is OK to invoke __kfifo_get() and __kfifo_put() without
locking.
Any other thoughts on which is better? (1) the memory barriers or
(2) requiring the caller hold appropriate locks across calls to
__kfifo_get() and __kfifo_put()?
Thanx, Paul
Le jeudi 10 ao?t 2006 ? 06:41 -0700, Paul E. McKenney a ?crit :
> I am happy to go either way -- the patch with the memory barriers
> (which does have the side-effect of slowing down kfifo_get() and
> kfifo_put(), by the way), or a patch removing the comments saying
> that it is OK to invoke __kfifo_get() and __kfifo_put() without
> locking.
>
> Any other thoughts on which is better? (1) the memory barriers or
> (2) requiring the caller hold appropriate locks across calls to
> __kfifo_get() and __kfifo_put()?
If someone wants to use explicit locking, he/she can go with kfifo_get()
instead of the __ version.
I'd rather keep the __kfifo_get() and __kfifo_put() functions lockless,
so I say go for (1) even if there is a tiny price to pay for corectness.
Stelian.
--
Stelian Pop <[email protected]>
On Thu, Aug 10, 2006 at 04:26:53PM +0200, Stelian Pop wrote:
> Le jeudi 10 ao?t 2006 ? 06:41 -0700, Paul E. McKenney a ?crit :
>
> > I am happy to go either way -- the patch with the memory barriers
> > (which does have the side-effect of slowing down kfifo_get() and
> > kfifo_put(), by the way), or a patch removing the comments saying
> > that it is OK to invoke __kfifo_get() and __kfifo_put() without
> > locking.
> >
> > Any other thoughts on which is better? (1) the memory barriers or
> > (2) requiring the caller hold appropriate locks across calls to
> > __kfifo_get() and __kfifo_put()?
>
> If someone wants to use explicit locking, he/she can go with kfifo_get()
> instead of the __ version.
However, the kfifo_get()/kfifo_put() interfaces use the internal lock,
which cannot be used by the caller to protect other code surrounding
the call to kfifo_get()/kfifo_put(). See for example the ISCSI use,
where they have a session->lock that, among other things, protects their
__kfifo_get()/__kfifo_put() calls.
> I'd rather keep the __kfifo_get() and __kfifo_put() functions lockless,
> so I say go for (1) even if there is a tiny price to pay for corectness.
If we require the caller to supply the locks for __kfifo_get() and
__kfifo_put(), then we have -both- correctness -and- better performance.
And the only current user of __kfifo_get()/__kfifo_put() stated that
they could easily expand their session->lock to cover all such calls,
and that doing so would not hurt their performance.
So, are you sure? And if so, why?
Thanx, Paul
Le jeudi 10 ao?t 2006 ? 08:39 -0700, Paul E. McKenney a ?crit :
> On Thu, Aug 10, 2006 at 04:26:53PM +0200, Stelian Pop wrote:
> > Le jeudi 10 ao?t 2006 ? 06:41 -0700, Paul E. McKenney a ?crit :
> >
> > > I am happy to go either way -- the patch with the memory barriers
> > > (which does have the side-effect of slowing down kfifo_get() and
> > > kfifo_put(), by the way), or a patch removing the comments saying
> > > that it is OK to invoke __kfifo_get() and __kfifo_put() without
> > > locking.
> > >
> > > Any other thoughts on which is better? (1) the memory barriers or
> > > (2) requiring the caller hold appropriate locks across calls to
> > > __kfifo_get() and __kfifo_put()?
> >
> > If someone wants to use explicit locking, he/she can go with kfifo_get()
> > instead of the __ version.
>
> However, the kfifo_get()/kfifo_put() interfaces use the internal lock,
... and the internal lock can be supplied by the user at kfifo_alloc()
time.
Stelian.
--
Stelian Pop <[email protected]>
On Thu, Aug 10, 2006 at 05:47:22PM +0200, Stelian Pop wrote:
> Le jeudi 10 ao?t 2006 ? 08:39 -0700, Paul E. McKenney a ?crit :
> > On Thu, Aug 10, 2006 at 04:26:53PM +0200, Stelian Pop wrote:
> > > Le jeudi 10 ao?t 2006 ? 06:41 -0700, Paul E. McKenney a ?crit :
> > >
> > > > I am happy to go either way -- the patch with the memory barriers
> > > > (which does have the side-effect of slowing down kfifo_get() and
> > > > kfifo_put(), by the way), or a patch removing the comments saying
> > > > that it is OK to invoke __kfifo_get() and __kfifo_put() without
> > > > locking.
> > > >
> > > > Any other thoughts on which is better? (1) the memory barriers or
> > > > (2) requiring the caller hold appropriate locks across calls to
> > > > __kfifo_get() and __kfifo_put()?
> > >
> > > If someone wants to use explicit locking, he/she can go with kfifo_get()
> > > instead of the __ version.
> >
> > However, the kfifo_get()/kfifo_put() interfaces use the internal lock,
>
> ... and the internal lock can be supplied by the user at kfifo_alloc()
> time.
Would that really work for them? Looks to me like it would result
in self-deadlock if they passed in session->lock.
Or did you have something else in mind for them?
These are the __kfifo_get() and __kfifo_put() uses in:
drivers/scsi/iscsi_tcp.c
drivers/scsi/libiscsi.c
drivers/infiniband/ulp/iser/iser_initiator.c
Thanx, Paul
Le jeudi 10 ao?t 2006 ? 09:11 -0700, Paul E. McKenney a ?crit :
> On Thu, Aug 10, 2006 at 05:47:22PM +0200, Stelian Pop wrote:
> > Le jeudi 10 ao?t 2006 ? 08:39 -0700, Paul E. McKenney a ?crit :
> > > On Thu, Aug 10, 2006 at 04:26:53PM +0200, Stelian Pop wrote:
> > > > Le jeudi 10 ao?t 2006 ? 06:41 -0700, Paul E. McKenney a ?crit :
> > > >
> > > > > I am happy to go either way -- the patch with the memory barriers
> > > > > (which does have the side-effect of slowing down kfifo_get() and
> > > > > kfifo_put(), by the way), or a patch removing the comments saying
> > > > > that it is OK to invoke __kfifo_get() and __kfifo_put() without
> > > > > locking.
> > > > >
> > > > > Any other thoughts on which is better? (1) the memory barriers or
> > > > > (2) requiring the caller hold appropriate locks across calls to
> > > > > __kfifo_get() and __kfifo_put()?
> > > >
> > > > If someone wants to use explicit locking, he/she can go with kfifo_get()
> > > > instead of the __ version.
> > >
> > > However, the kfifo_get()/kfifo_put() interfaces use the internal lock,
> >
> > ... and the internal lock can be supplied by the user at kfifo_alloc()
> > time.
>
> Would that really work for them? Looks to me like it would result
> in self-deadlock if they passed in session->lock.
Yeah, it will deadlock if the lock is already taken before calling
__kfifo_get and __kfifo_put.
> Or did you have something else in mind for them?
What I had in mind is to replace all occurences of:
kfifo_alloc(..., NULL);
...
spin_lock(&session->lock)
__kfifo_get()
spin_unlock()
with the simpler:
kfifo_alloc(..., &session->lock)
...
kfifo_get()
As for the occurences of:
...
spin_lock(&session->lock)
do_something();
__kifo_get();
well, there is not much we can do about them...
Let's take this problem differently: is a memory barrier cheaper than a
spinlock ?
If the answer is yes as I suspect, why should the kfifo API force the
user to take a spinlock ?
Stelian.
--
Stelian Pop <[email protected]>
On Thu, Aug 10, 2006 at 06:23:04PM +0200, Stelian Pop wrote:
> Le jeudi 10 ao?t 2006 ? 09:11 -0700, Paul E. McKenney a ?crit :
> > On Thu, Aug 10, 2006 at 05:47:22PM +0200, Stelian Pop wrote:
> > > Le jeudi 10 ao?t 2006 ? 08:39 -0700, Paul E. McKenney a ?crit :
> > > > On Thu, Aug 10, 2006 at 04:26:53PM +0200, Stelian Pop wrote:
> > > > > Le jeudi 10 ao?t 2006 ? 06:41 -0700, Paul E. McKenney a ?crit :
> > > > >
> > > > > > I am happy to go either way -- the patch with the memory barriers
> > > > > > (which does have the side-effect of slowing down kfifo_get() and
> > > > > > kfifo_put(), by the way), or a patch removing the comments saying
> > > > > > that it is OK to invoke __kfifo_get() and __kfifo_put() without
> > > > > > locking.
> > > > > >
> > > > > > Any other thoughts on which is better? (1) the memory barriers or
> > > > > > (2) requiring the caller hold appropriate locks across calls to
> > > > > > __kfifo_get() and __kfifo_put()?
> > > > >
> > > > > If someone wants to use explicit locking, he/she can go with kfifo_get()
> > > > > instead of the __ version.
> > > >
> > > > However, the kfifo_get()/kfifo_put() interfaces use the internal lock,
> > >
> > > ... and the internal lock can be supplied by the user at kfifo_alloc()
> > > time.
> >
> > Would that really work for them? Looks to me like it would result
> > in self-deadlock if they passed in session->lock.
>
> Yeah, it will deadlock if the lock is already taken before calling
> __kfifo_get and __kfifo_put.
>
> > Or did you have something else in mind for them?
>
> What I had in mind is to replace all occurences of:
> kfifo_alloc(..., NULL);
> ...
> spin_lock(&session->lock)
> __kfifo_get()
> spin_unlock()
>
> with the simpler:
> kfifo_alloc(..., &session->lock)
> ...
> kfifo_get()
Fair enough!
> As for the occurences of:
> ...
> spin_lock(&session->lock)
> do_something();
> __kifo_get();
>
> well, there is not much we can do about them...
Agreed!
> Let's take this problem differently: is a memory barrier cheaper than a
> spinlock ?
Almost always, yes. But a spinlock is cheaper than a spinlock plus
a pair of memory barriers.
> If the answer is yes as I suspect, why should the kfifo API force the
> user to take a spinlock ?
My concern is that currently a majority of the calls to __kfifo_{get,put}()
are already holding a spinlock.
But if you could send me your tests for lock-free __kfifo_{get,put}(),
I would be happy to run them on weak-memory-consistency model machines
with the memory barriers. And without the memory barriers -- we need
a test that fails in the latter case to prove that the memory barriers
really are in the right place and that all of them are present.
Does this sound reasonable?
Thanx, Paul
[[email protected] bouncing, removed from CC:]
Le jeudi 10 ao?t 2006 ? 09:47 -0700, Paul E. McKenney a ?crit :
> > Let's take this problem differently: is a memory barrier cheaper than a
> > spinlock ?
>
> Almost always, yes. But a spinlock is cheaper than a spinlock plus
> a pair of memory barriers.
Right, but I think we're optimizing too much here.
> > If the answer is yes as I suspect, why should the kfifo API force the
> > user to take a spinlock ?
>
> My concern is that currently a majority of the calls to __kfifo_{get,put}()
> are already holding a spinlock.
>
> But if you could send me your tests for lock-free __kfifo_{get,put}(),
> I would be happy to run them on weak-memory-consistency model machines
> with the memory barriers. And without the memory barriers -- we need
> a test that fails in the latter case to prove that the memory barriers
> really are in the right place and that all of them are present.
>
> Does this sound reasonable?
It would sound reasonable if I had any tests to send to you :)
Since I don't have any and since you're the one proposing the change, I
guess it's up to you to write them. :)
Stelian.
--
Stelian Pop <[email protected]>
On Thu, Aug 10, 2006 at 10:27:42PM +0200, Stelian Pop wrote:
> [[email protected] bouncing, removed from CC:]
>
> Le jeudi 10 ao?t 2006 ? 09:47 -0700, Paul E. McKenney a ?crit :
>
> > > Let's take this problem differently: is a memory barrier cheaper than a
> > > spinlock ?
> >
> > Almost always, yes. But a spinlock is cheaper than a spinlock plus
> > a pair of memory barriers.
>
> Right, but I think we're optimizing too much here.
That was in fact my point initially -- why not just require locking,
either that registered at kfifo_alloc() time or a separately acquired
lock?
> > > If the answer is yes as I suspect, why should the kfifo API force the
> > > user to take a spinlock ?
> >
> > My concern is that currently a majority of the calls to __kfifo_{get,put}()
> > are already holding a spinlock.
> >
> > But if you could send me your tests for lock-free __kfifo_{get,put}(),
> > I would be happy to run them on weak-memory-consistency model machines
> > with the memory barriers. And without the memory barriers -- we need
> > a test that fails in the latter case to prove that the memory barriers
> > really are in the right place and that all of them are present.
> >
> > Does this sound reasonable?
>
> It would sound reasonable if I had any tests to send to you :)
>
> Since I don't have any and since you're the one proposing the change, I
> guess it's up to you to write them. :)
Ah, but you owe a test debt from your earlier submission of kfifo! ;-)
Thanx, Paul