2022-10-13 17:33:31

by Frederic Weisbecker

Subject: [PATCH 1/3] srcu: Warn when NMI-unsafe API is used in NMI

Using the NMI-unsafe reader API from within NMIs is very likely to be
buggy for three reasons:

1) NMIs aren't strictly re-entrant (a pending nested NMI will execute
at the end of the current one) so it should be fine to use a
non-atomic increment here. However breakpoints can still interrupt
NMIs and if a breakpoint callback has a reader on that same ssp, a
racy increment can happen.

2) If the only reader site for a given ssp is in an NMI, RCU is definetly
a better choice over SRCU.

3) Because of the previous reason (2), an ssp having an SRCU read side
critical section in an NMI is likely to have another one from a task
context.

For all these reasons, warn if an nmi unsafe reader API is used from an
NMI.

Signed-off-by: Frederic Weisbecker <[email protected]>
---
kernel/rcu/srcutree.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index c54142374793..8b7ef1031d89 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -642,6 +642,8 @@ static void srcu_check_nmi_safety(struct srcu_struct *ssp, bool nmi_safe)

 	if (!IS_ENABLED(CONFIG_PROVE_RCU))
 		return;
+	/* NMI-unsafe use in NMI is a bad sign */
+	WARN_ON_ONCE(!nmi_safe && in_nmi());
 	sdp = raw_cpu_ptr(ssp->sda);
 	old_nmi_safe_mask = READ_ONCE(sdp->srcu_nmi_safety);
 	if (!old_nmi_safe_mask) {
--
2.25.1
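
For readers who want to see the new check in isolation, here is a minimal
userspace sketch of the condition the patch adds. It is not kernel code:
model_in_nmi stands in for in_nmi(), the warn-once latch stands in for
WARN_ON_ONCE(), and every model_* name is hypothetical. The
CONFIG_PROVE_RCU gate and the per-CPU srcu_nmi_safety bookkeeping from the
surrounding function are deliberately left out.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for kernel state; none of these exist in the kernel. */
static bool model_in_nmi;	/* models the result of in_nmi() */
static bool model_warned_once;	/* models the "once" latch of WARN_ON_ONCE() */
static int model_warn_count;	/* number of warnings actually reported */

/* Models only the two lines added by the patch. */
static void model_srcu_check_nmi_safety(bool nmi_safe)
{
	/* NMI-unsafe use in NMI is a bad sign */
	if (!nmi_safe && model_in_nmi && !model_warned_once) {
		model_warned_once = true;
		model_warn_count++;
		fprintf(stderr, "WARN: NMI-unsafe SRCU reader used in NMI\n");
	}
}

/*
 * Simulates one reader entry from the given context and returns the
 * total number of warnings reported so far.
 */
static int model_reader(bool nmi_safe, bool in_nmi)
{
	model_in_nmi = in_nmi;
	model_srcu_check_nmi_safety(nmi_safe);
	return model_warn_count;
}
```

In this model, an NMI-safe reader in NMI context and an NMI-unsafe reader
in task context both stay silent; only the NMI-unsafe reader in NMI context
warns, and only the first time.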


2022-10-14 23:08:45

by Joel Fernandes

Subject: Re: [PATCH 1/3] srcu: Warn when NMI-unsafe API is used in NMI

On Thu, Oct 13, 2022 at 07:22:42PM +0200, Frederic Weisbecker wrote:
> Using the NMI-unsafe reader API from within NMIs is very likely to be
> buggy for three reasons:
>
> 1) NMIs aren't strictly re-entrant (a pending nested NMI will execute
> at the end of the current one) so it should be fine to use a
> non-atomic increment here. However breakpoints can still interrupt
> NMIs and if a breakpoint callback has a reader on that same ssp, a
> racy increment can happen.
>
> 2) If the only reader site for a given ssp is in an NMI, RCU is definetly
definitely
> a better choice over SRCU.

Just checking: because NMIs are by definition non-preemptible, SRCU over
RCU doesn't make much sense, right?

Reviewed-by: Joel Fernandes (Google) <[email protected]>

thanks,

- Joel

>
> 3) Because of the previous reason (2), an ssp having an SRCU read side
> critical section in an NMI is likely to have another one from a task
> context.
>
> For all these reasons, warn if an nmi unsafe reader API is used from an
> NMI.
>
> Signed-off-by: Frederic Weisbecker <[email protected]>
> ---
> kernel/rcu/srcutree.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index c54142374793..8b7ef1031d89 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -642,6 +642,8 @@ static void srcu_check_nmi_safety(struct srcu_struct *ssp, bool nmi_safe)
>
>  	if (!IS_ENABLED(CONFIG_PROVE_RCU))
>  		return;
> +	/* NMI-unsafe use in NMI is a bad sign */
> +	WARN_ON_ONCE(!nmi_safe && in_nmi());
>  	sdp = raw_cpu_ptr(ssp->sda);
>  	old_nmi_safe_mask = READ_ONCE(sdp->srcu_nmi_safety);
>  	if (!old_nmi_safe_mask) {
> --
> 2.25.1
>

2022-10-20 22:26:56

by Paul E. McKenney

Subject: Re: [PATCH 1/3] srcu: Warn when NMI-unsafe API is used in NMI

On Fri, Oct 14, 2022 at 10:45:04PM +0000, Joel Fernandes wrote:
> On Thu, Oct 13, 2022 at 07:22:42PM +0200, Frederic Weisbecker wrote:
> > Using the NMI-unsafe reader API from within NMIs is very likely to be
> > buggy for three reasons:
> >
> > 1) NMIs aren't strictly re-entrant (a pending nested NMI will execute
> > at the end of the current one) so it should be fine to use a
> > non-atomic increment here. However breakpoints can still interrupt
> > NMIs and if a breakpoint callback has a reader on that same ssp, a
> > racy increment can happen.
> >
> > 2) If the only reader site for a given ssp is in an NMI, RCU is definetly
> definitely
> > a better choice over SRCU.
>
> Just checking: because NMIs are by definition non-preemptible, SRCU over
> RCU doesn't make much sense, right?

Agreed. But you never know...

> Reviewed-by: Joel Fernandes (Google) <[email protected]>

I will apply on the next rebase (after today's rebase), thank you!

Thanx, Paul

> thanks,
>
> - Joel
>
> >
> > 3) Because of the previous reason (2), an ssp having an SRCU read side
> > critical section in an NMI is likely to have another one from a task
> > context.
> >
> > For all these reasons, warn if an nmi unsafe reader API is used from an
> > NMI.
> >
> > Signed-off-by: Frederic Weisbecker <[email protected]>
> > ---
> > kernel/rcu/srcutree.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> > index c54142374793..8b7ef1031d89 100644
> > --- a/kernel/rcu/srcutree.c
> > +++ b/kernel/rcu/srcutree.c
> > @@ -642,6 +642,8 @@ static void srcu_check_nmi_safety(struct srcu_struct *ssp, bool nmi_safe)
> >
> >  	if (!IS_ENABLED(CONFIG_PROVE_RCU))
> >  		return;
> > +	/* NMI-unsafe use in NMI is a bad sign */
> > +	WARN_ON_ONCE(!nmi_safe && in_nmi());
> >  	sdp = raw_cpu_ptr(ssp->sda);
> >  	old_nmi_safe_mask = READ_ONCE(sdp->srcu_nmi_safety);
> >  	if (!old_nmi_safe_mask) {
> > --
> > 2.25.1
> >
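
Reason (1) in the patch description, the racy increment when a breakpoint
callback interrupts an NMI reader, can be illustrated with a deterministic
userspace sketch. It is only a model under stated assumptions: the
non-atomic increment is split by hand into a load and a store, the
"breakpoint" is an ordinary function call placed between them, and all
model_* names are hypothetical. It does not reproduce how SRCU actually
stores its per-CPU counters.

```c
#include <stdbool.h>

/* Hypothetical per-CPU reader counter standing in for an SRCU lock count. */
static unsigned long model_lock_count;

/* Hypothetical breakpoint callback that has its own reader on the same ssp. */
static void model_breakpoint_reader(void)
{
	model_lock_count++;
}

/*
 * A non-atomic increment, written as separate load and store steps so
 * that an interruption can be placed between them.
 */
static void model_nonatomic_inc(bool breakpoint_hits)
{
	unsigned long tmp = model_lock_count;	/* load */

	if (breakpoint_hits)
		model_breakpoint_reader();	/* runs between load and store */

	model_lock_count = tmp + 1;		/* store overwrites the nested update */
}

/* Resets the counter, runs one outer reader, and returns the final count. */
static unsigned long model_run(bool breakpoint_hits)
{
	model_lock_count = 0;
	model_nonatomic_inc(breakpoint_hits);
	return model_lock_count;
}
```

With the breakpoint in place, two readers have entered but the counter
still ends at 1, which is the lost update that an atomic increment in the
NMI-safe reader path would avoid.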