2024-03-14 18:44:07

by David Teigland

Subject: [GIT PULL] dlm fixes for 6.9

Hi Linus,

Please pull dlm fixes from tag:

git://git.kernel.org/pub/scm/linux/kernel/git/teigland/linux-dlm.git dlm-6.9

Fix two refcounting bugs from recent changes:
- misuse of atomic_dec_and_test results in missed ref decrement
- wrong variable assignment results in another missed ref decrement

Thanks,
Dave


fs/dlm/lock.c | 10 ++++++----
fs/dlm/user.c | 10 +++++-----
2 files changed, 11 insertions(+), 9 deletions(-)


Alexander Aring (2):
dlm: fix user space lkb refcounting
dlm: fix off-by-one waiters refcount handling



2024-03-15 17:10:51

by Linus Torvalds

Subject: Re: [GIT PULL] dlm fixes for 6.9

On Thu, 14 Mar 2024 at 11:43, David Teigland <[email protected]> wrote:
>
> Fix two refcounting bugs from recent changes:
> - misuse of atomic_dec_and_test results in missed ref decrement
> - wrong variable assignment results in another missed ref decrement

I pulled this, and then I unpulled it again.

That code is insane.

This is *NOT* sane or valid code:

+	while (atomic_read(&lkb->lkb_wait_count)) {
+		if (atomic_dec_and_test(&lkb->lkb_wait_count))
+			list_del_init(&lkb->lkb_wait_reply);
+
+		unhold_lkb(lkb);
+	}

the above is completely crazy. That's simply not how atomics work.

What's the point of using a refcount - an atomic one at that - if you
just use it as a counter. That's not a "reference count", that's
literally just broken.

The whole - and *ONLY* - point of a refcount is that you are counting
references. References that *YOU* hold. Not that somebody else is
holding and you are releasing.

If you're the only holder of any counts, don't make them atomic, don't
put them in a data structure. But you're *not* the only holder of that
refcount here, are you?

Using atomics for this kind of sequence shows some crazy crazy
behavior. It's not valid to say "ok, as long as this atomic is not
zero, let's decrement it and test whether it just became zero".

Because for an atomic value to MAKE SENSE IN THE FIRST PLACE, there
must be the possibility that somebody else comes in and also decrements it.

And if that happens between the test of "is this zero" and "did I
decrement it to zero", you now had two decrements, and that value is
now negative. So you didn't really have an atomic value, because you
did two operations on it.

And dammit, if that mutex means that it cannot happen, then WHY WAS IT
AN ATOMIC IN THE FIRST PLACE?

IOW, if you have locking that protects the value, then atomic accesses
are STILL wrong.

So there is not a single situation where I can see the above kind of
code ever being valid.

Now, if the issue is that you want to clean up something that is never
getting cleaned up by anybody else, and this is a fatal error, and
you're just trying to fix things up (badly), and you know that this is
all racy but the code is trying to kill a dead data structure, then
you should

(a) need a damn big comment (bigger than the comment is already)

(b) should *NOT* pretend to do some stupid "atomic decrement and test" loop

IOW, if what you want to do is get rid of stuck entries and set the
refcount to zero, then doing that would probably be something like

	/* This is broken, but.. */
	stale = atomic_xchg(&lkb->lkb_wait_count, 0);
	if (stale) {
		list_del_init(&lkb->lkb_wait_reply);
		do { unhold_lkb(lkb); } while (--stale);
	}

and it needs a much bigger comment than that "This is broken".

(And I don't know if you want that list_del_init() before or after the
'unhold N times' loop).

The above is still completely broken, but at least it doesn't do some
kind of odd non-atomic test and decrement stuff in a loop, and
hopefully makes it clear that we're very much talking about fixing up
stale final values.

And no, I didn't look at the code around it. Because I really think
that "while (atomic_read(...)" loop cannot POSSIBLY be correct,
regardless of any context.

Linus

2024-03-15 18:43:14

by David Teigland

Subject: Re: [GIT PULL] dlm fixes for 6.9

On Fri, Mar 15, 2024 at 10:10:00AM -0700, Linus Torvalds wrote:
> Now, if the issue is that you want to clean up something that is never
> getting cleaned up by anybody else, and this is a fatal error, and
> you're just trying to fix things up (badly), and you know that this is
> all racy but the code is trying to kill a dead data structure, then
> you should
>
> (a) need a damn big comment (bigger than the comment is already)
>
> (b) should *NOT* pretend to do some stupid "atomic decrement and test" loop

Yes, that looks pretty messed up; the counter should not be an atomic_t. I was
a bit wary of making that atomic when it wasn't necessary, but didn't push back
enough on that change:

commit 75a7d60134ce84209f2c61ec4619ee543aa8f466
Author: Alexander Aring <[email protected]>
Date:   Mon May 29 17:44:38 2023 -0400

    Currently the lkb_wait_count is locked by the rsb lock and it should be
    fine to handle lkb_wait_count as non atomic_t value. However for the
    overall process of reducing locking this patch converts it to an
    atomic_t value.

.. and the result is the primitives get abused, and the code becomes crazy.
My initial plan is to go back to a non-atomic counter there. It is indeed a
recovery situation that involves a forced reset of state, but I'll need to go
back and study that case further before I can say what it should finally look
like. Whatever that looks like, it'll have a very good comment :) Dropping
the pull is fine; there's a chance I may resend with the other patch and a new
fix, we'll see.

Thanks,
Dave