2022-06-28 01:00:03

by Alexander Aring

[permalink] [raw]
Subject: Re: sparse warnings related to kref_put_lock() and refcount_dec_and_lock()

Hi Luc and others,

On Mon, Jun 27, 2022 at 2:42 PM Luc Van Oostenryck
<[email protected]> wrote:
>
> On Mon, Jun 27, 2022 at 11:15:17AM -0400, Alexander Aring wrote:
> > Hi,
> >
> > I recently converted to use kref_put_lock() in fs/dlm subsystem and
> > now I get the following warning in sparse:
> >
> > warning: context imbalance in 'put_rsb' - unexpected unlock
> >
> > It seems sparse is not able to detect that there is a conditional
> > requirement that the lock passed to kref_put_lock() (or also
> > refcount_dec_and_lock()) is locked or not. I evaluate the return value
> > to check if kref_put_lock() helds the lock and unlock it then. The
> > idea is that the lock needs only to be held when the refcount is going
> > to be zero.
> >
> > It seems other users unlock the lock at the release callback of
> > kref_put_lock() and annotate the callback with "__releases()" which
> > seems to work to avoid the sparse warning. However this works if you
> > don't have additional stack pointers which you need to pass to the
> > release callback.
> >
> > My question would be is this a known problem and a recommended way to
> > avoid this sparse warning (maybe just for now)?
>
> Hi,
>
> I suppose that your case here can be simplified into something like:
>
> if (some_condition)
> take(some_lock);
>
> do_stuff();
>
> if (some_condition)
> release(some_lock);
>
> Sparse issues the 'context imbalance' warning because, a priori,
> it can't exclude that some execution will takes the lock and not
> releases it (or the opposite). In some case, when do_stuff() is
> very simple, sparse can see that everything is OK, but generally
> it won't (some whole kernel analysis but the general case is
> undecidable anyway).
>
> The recommended way would be to write things rather like this:
>
> if (some_condition) {
> take(some_lock);
> do_stuff();
> release(some_lock);
> } else {
> do_stuff();
> }
>

This is not an alternative for me because the lock needs to hold
during the "some_condition". (More background is that we dealing with
data structures here and cannot allow a get() from this data
structures during "some_condition", the lock is preventing this)
It is the refcount code which causes trouble here [0] and I think the
refcount code should never call the unlock() procedure in any
condition and leave it to the caller to call unlock() in any case.

I to'ed (hope to get more attention to this) more people related to
lib/refcount.c implementation (provided by get_maintainers.pl -f).

>
> The __acquires() and __releases() annotations are needed for sparse
> to know that the annotated function always take or always release
> some lock but won't handle conditional locks.
>

If we change the refcount code to _never_ calling unlock() for the
specific lock, then all those foo_and_lock_bar() functions can be
annotated with "__acquires()". This should also end in the same code?
For me it looks like the current implementation of refcount.c is fine
except sparse cannot figure out what's going on and maybe a reason to
change the specific handling to the mentioned one.

> I hope that this is helpful to you.
>

I hope we will find some solution, because I don't like sparse warnings.

- Alex

[0] https://elixir.bootlin.com/linux/v5.19-rc4/source/lib/refcount.c#L144


2022-06-28 01:52:13

by Alexander Aring

[permalink] [raw]
Subject: Re: sparse warnings related to kref_put_lock() and refcount_dec_and_lock()

Hi,

On Mon, Jun 27, 2022 at 8:56 PM Alexander Aring <[email protected]> wrote:
>
> Hi Luc and others,
>
> On Mon, Jun 27, 2022 at 2:42 PM Luc Van Oostenryck
> <[email protected]> wrote:
> >
> > On Mon, Jun 27, 2022 at 11:15:17AM -0400, Alexander Aring wrote:
> > > Hi,
> > >
> > > I recently converted to use kref_put_lock() in fs/dlm subsystem and
> > > now I get the following warning in sparse:
> > >
> > > warning: context imbalance in 'put_rsb' - unexpected unlock
> > >
> > > It seems sparse is not able to detect that there is a conditional
> > > requirement that the lock passed to kref_put_lock() (or also
> > > refcount_dec_and_lock()) is locked or not. I evaluate the return value
> > > to check if kref_put_lock() helds the lock and unlock it then. The
> > > idea is that the lock needs only to be held when the refcount is going
> > > to be zero.
> > >
> > > It seems other users unlock the lock at the release callback of
> > > kref_put_lock() and annotate the callback with "__releases()" which
> > > seems to work to avoid the sparse warning. However this works if you
> > > don't have additional stack pointers which you need to pass to the
> > > release callback.
> > >
> > > My question would be is this a known problem and a recommended way to
> > > avoid this sparse warning (maybe just for now)?
> >
> > Hi,
> >
> > I suppose that your case here can be simplified into something like:
> >
> > if (some_condition)
> > take(some_lock);
> >
> > do_stuff();
> >
> > if (some_condition)
> > release(some_lock);
> >
> > Sparse issues the 'context imbalance' warning because, a priori,
> > it can't exclude that some execution will takes the lock and not
> > releases it (or the opposite). In some case, when do_stuff() is
> > very simple, sparse can see that everything is OK, but generally
> > it won't (some whole kernel analysis but the general case is
> > undecidable anyway).
> >
> > The recommended way would be to write things rather like this:
> >
> > if (some_condition) {
> > take(some_lock);
> > do_stuff();
> > release(some_lock);
> > } else {
> > do_stuff();
> > }
> >
>
> This is not an alternative for me because the lock needs to hold
> during the "some_condition". (More background is that we dealing with
> data structures here and cannot allow a get() from this data
> structures during "some_condition", the lock is preventing this)
> It is the refcount code which causes trouble here [0] and I think the
> refcount code should never call the unlock() procedure in any
> condition and leave it to the caller to call unlock() in any case.
>
> I to'ed (hope to get more attention to this) more people related to
> lib/refcount.c implementation (provided by get_maintainers.pl -f).
>
> >
> > The __acquires() and __releases() annotations are needed for sparse
> > to know that the annotated function always take or always release
> > some lock but won't handle conditional locks.
> >
>
> If we change the refcount code to _never_ calling unlock() for the
> specific lock, then all those foo_and_lock_bar() functions can be
> annotated with "__acquires()". This should also end in the same code?

sorry, this will not work because of the first condition of "if
(refcount_dec_not_one(r))" which will never hold the lock if true...
that's what the optimization is all about. However, maybe somebody has
another idea...

- Alex

2022-06-28 09:42:22

by Luc Van Oostenryck

[permalink] [raw]
Subject: Re: sparse warnings related to kref_put_lock() and refcount_dec_and_lock()

On Mon, Jun 27, 2022 at 09:06:43PM -0400, Alexander Aring wrote:
> >
> > If we change the refcount code to _never_ calling unlock() for the
> > specific lock, then all those foo_and_lock_bar() functions can be
> > annotated with "__acquires()". This should also end in the same code?
>
> sorry, this will not work because of the first condition of "if
> (refcount_dec_not_one(r))" which will never hold the lock if true...
> that's what the optimization is all about. However, maybe somebody has
> another idea...

I would certainly not recommend this but ...
if it's OK to cheat and lie then you can do:
+ bool refcount_dec_and_lock(refcount_t *r, spinlock_t *lock) __acquires(lock);
+
bool refcount_dec_and_lock(refcount_t *r, spinlock_t *lock)
{
- if (refcount_dec_not_one(r))
- return false;
+ if (refcount_dec_not_one(r)) {
+ __acquire(lock);
+ return false;
+ }

spin_lock(lock);
if (!refcount_dec_and_test(r)) {
spin_unlock(lock);
+ __acquire(lock);
return false;
}

return true;
}

In other word, pretend that the lock is always taken but ...
1) it's ugly
2) it's lying and can be confusing
3) now all the users of this function will have an imbalance problem
(but they probably already have one since refcount_dec_and_lock()
is not annotated).

What is needed is some kind of annotation for conditional locks.
I've tried a few time and in itself it was working but in most
cases the usage pattern was so that there was a imbalance anyway.

-- Luc

2022-06-28 14:09:36

by Alexander Aring

[permalink] [raw]
Subject: Re: sparse warnings related to kref_put_lock() and refcount_dec_and_lock()

Hi Luc,

On Tue, Jun 28, 2022 at 4:58 AM Luc Van Oostenryck
<[email protected]> wrote:
>
> On Mon, Jun 27, 2022 at 09:06:43PM -0400, Alexander Aring wrote:
> > >
> > > If we change the refcount code to _never_ calling unlock() for the
> > > specific lock, then all those foo_and_lock_bar() functions can be
> > > annotated with "__acquires()". This should also end in the same code?
> >
> > sorry, this will not work because of the first condition of "if
> > (refcount_dec_not_one(r))" which will never hold the lock if true...
> > that's what the optimization is all about. However, maybe somebody has
> > another idea...
>
> I would certainly not recommend this but ...
> if it's OK to cheat and lie then you can do:
> + bool refcount_dec_and_lock(refcount_t *r, spinlock_t *lock) __acquires(lock);
> +
> bool refcount_dec_and_lock(refcount_t *r, spinlock_t *lock)
> {
> - if (refcount_dec_not_one(r))
> - return false;
> + if (refcount_dec_not_one(r)) {
> + __acquire(lock);
> + return false;
> + }
>
> spin_lock(lock);
> if (!refcount_dec_and_test(r)) {
> spin_unlock(lock);
> + __acquire(lock);
> return false;
> }
>
> return true;
> }
>
> In other word, pretend that the lock is always taken but ...
> 1) it's ugly
> 2) it's lying and can be confusing
> 3) now all the users of this function will have an imbalance problem
> (but they probably already have one since refcount_dec_and_lock()
> is not annotated).
>
> What is needed is some kind of annotation for conditional locks.
> I've tried a few time and in itself it was working but in most
> cases the usage pattern was so that there was a imbalance anyway.
>

may we can add something like __may_acquires() in the sense of
ignoring imbalances for the specific lock. This will not check
anything and probably ends in the same, but at least will stop
dropping warnings... my alternative would be to add a #ifdef
__CHECKER__ around my lock unlock().

Maybe just for now as no better option exists to validate such code
during compile time _at the moment_?

- Alex

2022-06-28 17:38:13

by Linus Torvalds

[permalink] [raw]
Subject: Re: sparse warnings related to kref_put_lock() and refcount_dec_and_lock()

On Tue, Jun 28, 2022 at 1:58 AM Luc Van Oostenryck
<[email protected]> wrote:
>
> I would certainly not recommend this but ...
> if it's OK to cheat and lie then you can do:
> + bool refcount_dec_and_lock(refcount_t *r, spinlock_t *lock) __acquires(lock);

Actually, we have "__cond_lock()" in the kernel to actually document
that something takes a lock only in certain conditions.

It needs to be declared as a macro in the header file, with this as an example:

#define raw_spin_trylock(lock) __cond_lock(lock, _raw_spin_trylock(lock))

ie that says that "raw_spin_trylock() takes 'lock' when
_raw_spin_trylock() returned true".

That then makes it possible to write code like

if (raw_spin_trylock(lock)) {..
raw_spin_unlock(lock));
}

and sparse will get the nesting right.

But that does *not* solve the issue of people then writing this as

locked = raw_spin_trylock(lock);
.. do_something ..
if (locked)
raw_spin_unlock(lock));

and you end up with the same thing again.

Anyway, for things like refcount_dec_and_lock(), I suspect that first
pattern should work, because you really shouldn't have that second
pattern. If you just decremented without any locking, the actions are
completely different from the "oh, got the last decrement and now it's
locked and I need to free things" or whatever.

Linus

2022-06-29 15:01:00

by Alexander Aring

[permalink] [raw]
Subject: Re: sparse warnings related to kref_put_lock() and refcount_dec_and_lock()

Hi,

On Tue, Jun 28, 2022 at 1:27 PM Linus Torvalds
<[email protected]> wrote:
>
> On Tue, Jun 28, 2022 at 1:58 AM Luc Van Oostenryck
> <[email protected]> wrote:
> >
> > I would certainly not recommend this but ...
> > if it's OK to cheat and lie then you can do:
> > + bool refcount_dec_and_lock(refcount_t *r, spinlock_t *lock) __acquires(lock);
>
> Actually, we have "__cond_lock()" in the kernel to actually document
> that something takes a lock only in certain conditions.
>
> It needs to be declared as a macro in the header file, with this as an example:
>
> #define raw_spin_trylock(lock) __cond_lock(lock, _raw_spin_trylock(lock))
>

I added a prefix of "raw_" to refcount_dec_and_lock() and similar
functions and replaced the original functions with the __cond_lock()
macro to redirect to their raw_ functions. Similar to how the
raw_spinlock_trylock() naming scheme is doing it. The "raw_"
functionality should never be called by the user then.

> ie that says that "raw_spin_trylock() takes 'lock' when
> _raw_spin_trylock() returned true".
>
> That then makes it possible to write code like
>
> if (raw_spin_trylock(lock)) {..
> raw_spin_unlock(lock));
> }
>
> and sparse will get the nesting right.
>
> But that does *not* solve the issue of people then writing this as
>
> locked = raw_spin_trylock(lock);
> .. do_something ..
> if (locked)
> raw_spin_unlock(lock));
>
> and you end up with the same thing again.
>

Yes it mostly removed all sparse warnings I see. I needed to move at
one call of the refcount_dec_and_lock() function inside the if
condition and the sparse warning was gone. It should not be a problem
to change it in this way.

If there are no other complaints I will send a patch for the raw_
prefix to all those conditional refcount lock functions and the add a
__cond_lock() macro for the original function calls.

Thanks!

- Alex