2022-11-22 02:16:29

by Paul E. McKenney

Subject: [PATCH v2 rcu 0/16] Lazy call_rcu() updates for v6.2

Hello!

This series improves energy efficiency on nearly-idle systems by making
call_rcu() more lazy. (A short usage sketch follows the list below.)
Several NOCB changes come along for the ride:

1. Simplify rcu_init_nohz() cpumask handling, courtesy of Zhen Lei.

2. Fix late wakeup when flush of bypass cblist happens, courtesy of
"Joel Fernandes (Google)".

3. Fix missing nocb gp wake on rcu_barrier(), courtesy of Frederic
Weisbecker.

4. Make call_rcu() lazy to save power, courtesy of "Joel Fernandes
(Google)".

5. Refactor code a bit in rcu_nocb_do_flush_bypass(), courtesy of
"Joel Fernandes (Google)".

6. Shrinker for lazy rcu, courtesy of Vineeth Pillai.

7. Add laziness and kfree tests, courtesy of "Joel Fernandes
(Google)".

8. percpu-refcount: Use call_rcu_flush() for atomic switch, courtesy
of "Joel Fernandes (Google)".

9. Use call_rcu_flush() instead of call_rcu(), courtesy of "Joel
Fernandes (Google)".

10. Use call_rcu_flush() for async reader test, courtesy of "Joel
Fernandes (Google)".

11. Use call_rcu_flush() where needed, courtesy of "Joel Fernandes
(Google)".

12. scsi/scsi_error: Use call_rcu_flush() instead of call_rcu(),
courtesy of Uladzislau Rezki.

13. Make queue_rcu_work() use call_rcu_flush(), courtesy of Uladzislau
Rezki.

14. Use call_rcu_flush() instead of call_rcu(), courtesy of "Joel
Fernandes (Google)".

15. Use call_rcu_flush() for dst_release(), courtesy of "Joel
Fernandes (Google)".

16. devinet: Reduce refcount before grace period, courtesy of Eric
Dumazet.
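
As a rough illustration of the resulting API split (a hedged sketch, not
code from any one of these patches; struct foo and its helpers are
hypothetical):

	#include <linux/rcupdate.h>
	#include <linux/slab.h>

	struct foo {
		struct rcu_head rh;
		/* ... payload ... */
	};

	static void free_foo_cb(struct rcu_head *rhp)
	{
		struct foo *fp = container_of(rhp, struct foo, rh);

		kfree(fp);
	}

	/* Callback only frees memory, so laziness is fine. */
	static void foo_release(struct foo *fp)
	{
		call_rcu(&fp->rh, free_foo_cb);
	}

	/* Something observable waits on the callback, so flush. */
	static void foo_release_now(struct foo *fp)
	{
		call_rcu_flush(&fp->rh, free_foo_cb);
	}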

Changes since v1:

o Add more adjustments to avoid excessive laziness (#15 and
#16 above).

o Get appropriate Cc lines onto non-RCU patches.

https://lore.kernel.org/all/20221019225138.GA2499943@paulmck-ThinkPad-P17-Gen-1/

Thanx, Paul

------------------------------------------------------------------------

b/drivers/scsi/scsi_error.c | 2
b/include/linux/rcupdate.h | 9 +
b/kernel/rcu/Kconfig | 8 +
b/kernel/rcu/rcu.h | 8 +
b/kernel/rcu/rcuscale.c | 67 +++++++++++-
b/kernel/rcu/rcutorture.c | 16 +-
b/kernel/rcu/sync.c | 2
b/kernel/rcu/tiny.c | 2
b/kernel/rcu/tree.c | 11 +
b/kernel/rcu/tree.h | 1
b/kernel/rcu/tree_exp.h | 2
b/kernel/rcu/tree_nocb.h | 34 +-----
b/kernel/workqueue.c | 2
b/lib/percpu-refcount.c | 3
b/net/core/dst.c | 2
b/net/ipv4/devinet.c | 19 +--
b/net/rxrpc/conn_object.c | 2
kernel/rcu/rcuscale.c | 2
kernel/rcu/tree.c | 129 +++++++++++++++--------
kernel/rcu/tree.h | 11 +
kernel/rcu/tree_nocb.h | 243 ++++++++++++++++++++++++++++++++++++--------
21 files changed, 434 insertions(+), 141 deletions(-)


2022-11-22 02:16:35

by Paul E. McKenney

Subject: [PATCH v2 rcu 12/16] scsi/scsi_error: Use call_rcu_flush() instead of call_rcu()

From: Uladzislau Rezki <[email protected]>

Earlier commits in this series allow battery-powered systems to build
their kernels with the default-disabled CONFIG_RCU_LAZY=y Kconfig option.
This Kconfig option causes call_rcu() to delay its callbacks in order
to batch them. This means that a given RCU grace period covers more
callbacks, thus reducing the number of grace periods, in turn reducing
the amount of energy consumed, thereby increasing battery lifetime,
which can be a very good thing. This is not a subtle effect: In some
important use cases, the battery lifetime is increased by more than 10%.

This CONFIG_RCU_LAZY=y option is available only for CPUs that offload
callbacks, for example, CPUs mentioned in the rcu_nocbs kernel boot
parameter passed to kernels built with CONFIG_RCU_NOCB_CPU=y.
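
For example (an illustrative configuration, not one mandated by this
series), a battery-powered system might build with:

	CONFIG_RCU_NOCB_CPU=y
	CONFIG_RCU_LAZY=y

and boot with something like:

	rcu_nocbs=0-7

so that callbacks on CPUs 0-7 are offloaded and therefore eligible for
laziness.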

Delaying callbacks is normally not a problem because most callbacks do
nothing but free memory. If the system is short on memory, a shrinker
will kick all currently queued lazy callbacks out of their laziness,
thus freeing their memory in short order. Similarly, the rcu_barrier()
function, which blocks until all currently queued callbacks are invoked,
will also kick lazy callbacks, thus enabling rcu_barrier() to complete
in a timely manner.

However, there are some cases where laziness is not a good option.
For example, synchronize_rcu() invokes call_rcu(), and blocks until
the newly queued callback is invoked. It would not be good for
synchronize_rcu() to block for ten seconds, even on an idle system.
Therefore, synchronize_rcu() invokes call_rcu_flush() instead of
call_rcu(). The arrival of a non-lazy call_rcu_flush() callback on a
given CPU kicks any lazy callbacks that might already be queued on that
CPU. After all, if there is going to be a grace period, all callbacks
might as well get full benefit from it.
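
To make the synchronize_rcu() case concrete, here is a simplified sketch
of the underlying pattern (the in-kernel code differs in detail, and
synchronize_rcu_sketch() is a stand-in name):

	#include <linux/completion.h>
	#include <linux/rcupdate.h>

	struct rcu_synchronize {
		struct rcu_head head;
		struct completion completion;
	};

	static void wakeme_after_rcu(struct rcu_head *head)
	{
		struct rcu_synchronize *rs =
			container_of(head, struct rcu_synchronize, head);

		complete(&rs->completion);
	}

	static void synchronize_rcu_sketch(void)
	{
		struct rcu_synchronize rs;

		init_completion(&rs.completion);
		/* The caller is about to block, so do not be lazy. */
		call_rcu_flush(&rs.head, wakeme_after_rcu);
		wait_for_completion(&rs.completion);
	}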

Yes, this could be done the other way around by creating a
call_rcu_lazy(), but earlier experience with this approach and
feedback at the 2022 Linux Plumbers Conference shifted the approach
to call_rcu() being lazy with call_rcu_flush() for the few places
where laziness is inappropriate.

Another call_rcu() instance that cannot be lazy is the one in the
scsi_eh_scmd_add() function. Leaving this instance lazy results in
unacceptably slow boot times.

Therefore, make scsi_eh_scmd_add() use call_rcu_flush() in order to
revert to the old behavior.

Tested-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Uladzislau Rezki <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
Cc: "James E.J. Bottomley" <[email protected]>
Cc: "Martin K. Petersen" <[email protected]>
Cc: <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
drivers/scsi/scsi_error.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 6995c89792300..634672e67c81f 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -312,7 +312,7 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
* Ensure that all tasks observe the host state change before the
* host_failed change.
*/
- call_rcu(&scmd->rcu, scsi_eh_inc_host_failed);
+ call_rcu_flush(&scmd->rcu, scsi_eh_inc_host_failed);
}

/**
--
2.31.1.189.g2e36527f23


2022-11-22 02:17:01

by Paul E. McKenney

Subject: [PATCH v2 rcu 05/16] rcu: Refactor code a bit in rcu_nocb_do_flush_bypass()

From: "Joel Fernandes (Google)" <[email protected]>

This consolidates the code a bit and makes it cleaner. Functionally it
is the same.

Reported-by: Paul E. McKenney <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/tree_nocb.h | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index d6e4c076b0515..213daf81c057f 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -327,10 +327,11 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
*
* Note that this function always returns true if rhp is NULL.
*/
-static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
+static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp_in,
unsigned long j, bool lazy)
{
struct rcu_cblist rcl;
+ struct rcu_head *rhp = rhp_in;

WARN_ON_ONCE(!rcu_rdp_is_offloaded(rdp));
rcu_lockdep_assert_cblist_protected(rdp);
@@ -345,16 +346,16 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,

/*
* If the new CB requested was a lazy one, queue it onto the main
- * ->cblist so we can take advantage of a sooner grade period.
+ * ->cblist so that we can take advantage of the grace-period that will
+ * happen regardless. But queue it onto the bypass list first so that
+ * the lazy CB is ordered with the existing CBs in the bypass list.
*/
if (lazy && rhp) {
- rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, NULL);
- rcu_cblist_enqueue(&rcl, rhp);
- WRITE_ONCE(rdp->lazy_len, 0);
- } else {
- rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp);
- WRITE_ONCE(rdp->lazy_len, 0);
+ rcu_cblist_enqueue(&rdp->nocb_bypass, rhp);
+ rhp = NULL;
}
+ rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp);
+ WRITE_ONCE(rdp->lazy_len, 0);

rcu_segcblist_insert_pend_cbs(&rdp->cblist, &rcl);
WRITE_ONCE(rdp->nocb_bypass_first, j);
--
2.31.1.189.g2e36527f23


2022-11-22 02:17:03

by Paul E. McKenney

Subject: [PATCH v2 rcu 16/16] net: devinet: Reduce refcount before grace period

From: Eric Dumazet <[email protected]>

Currently, the inetdev_destroy() function waits for an RCU grace period
before decrementing the refcount and freeing memory. This causes a delay
with the new power-saving RCU configuration, which results in the network
interface disappearing later than expected. The resulting delay causes
test failures on ChromeOS.

Refactor the code such that the refcount is decremented before the grace
period and the memory is freed afterwards. With this change, a ChromeOS
network test that does 'ip netns del' and polls for the interface to
disappear now passes.
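
For reference, a simplified sketch of the ordering change (summarizing
the diff below, with refcounting details elided):

	Before:	inetdev_destroy()
		    call_rcu(in_dev_rcu_put)
			-> grace period -> in_dev_put() -> kfree()
		(both the refcount decrement and the free wait for the GP)

	After:	inetdev_destroy()
		    in_dev_put()	(refcount drops immediately)
		in_dev_finish_destroy()
		    call_rcu(in_dev_free_rcu)
			-> grace period -> kfree()
		(only the final free waits for the GP)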

Reported-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
Cc: David Ahern <[email protected]>
Cc: "David S. Miller" <[email protected]>
Cc: Hideaki YOSHIFUJI <[email protected]>
Cc: Jakub Kicinski <[email protected]>
Cc: Paolo Abeni <[email protected]>
Cc: <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
net/ipv4/devinet.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index e8b9a9202fecd..b0acf6e19aed3 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -234,13 +234,20 @@ static void inet_free_ifa(struct in_ifaddr *ifa)
call_rcu(&ifa->rcu_head, inet_rcu_free_ifa);
}

+static void in_dev_free_rcu(struct rcu_head *head)
+{
+ struct in_device *idev = container_of(head, struct in_device, rcu_head);
+
+ kfree(rcu_dereference_protected(idev->mc_hash, 1));
+ kfree(idev);
+}
+
void in_dev_finish_destroy(struct in_device *idev)
{
struct net_device *dev = idev->dev;

WARN_ON(idev->ifa_list);
WARN_ON(idev->mc_list);
- kfree(rcu_dereference_protected(idev->mc_hash, 1));
#ifdef NET_REFCNT_DEBUG
pr_debug("%s: %p=%s\n", __func__, idev, dev ? dev->name : "NIL");
#endif
@@ -248,7 +255,7 @@ void in_dev_finish_destroy(struct in_device *idev)
if (!idev->dead)
pr_err("Freeing alive in_device %p\n", idev);
else
- kfree(idev);
+ call_rcu(&idev->rcu_head, in_dev_free_rcu);
}
EXPORT_SYMBOL(in_dev_finish_destroy);

@@ -298,12 +305,6 @@ static struct in_device *inetdev_init(struct net_device *dev)
goto out;
}

-static void in_dev_rcu_put(struct rcu_head *head)
-{
- struct in_device *idev = container_of(head, struct in_device, rcu_head);
- in_dev_put(idev);
-}
-
static void inetdev_destroy(struct in_device *in_dev)
{
struct net_device *dev;
@@ -328,7 +329,7 @@ static void inetdev_destroy(struct in_device *in_dev)
neigh_parms_release(&arp_tbl, in_dev->arp_parms);
arp_ifdown(dev);

- call_rcu(&in_dev->rcu_head, in_dev_rcu_put);
+ in_dev_put(in_dev);
}

int inet_addr_onlink(struct in_device *in_dev, __be32 a, __be32 b)
--
2.31.1.189.g2e36527f23


2022-11-22 02:17:42

by Paul E. McKenney

Subject: [PATCH v2 rcu 06/16] rcu: Shrinker for lazy rcu

From: Vineeth Pillai <[email protected]>

The shrinker is used to speed up the freeing of memory potentially held
by RCU lazy callbacks. RCU kernel module test cases show this to be
effective. The test is introduced in a later patch.

Signed-off-by: Vineeth Pillai <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
Signed-off-by: Paul E. McKenney <[email protected]>
---
kernel/rcu/tree_nocb.h | 52 ++++++++++++++++++++++++++++++++++++++++++
1 file changed, 52 insertions(+)

diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 213daf81c057f..9e1c8caec5ceb 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -1312,6 +1312,55 @@ int rcu_nocb_cpu_offload(int cpu)
}
EXPORT_SYMBOL_GPL(rcu_nocb_cpu_offload);

+static unsigned long
+lazy_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+ int cpu;
+ unsigned long count = 0;
+
+ /* Snapshot count of all CPUs */
+ for_each_possible_cpu(cpu) {
+ struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
+
+ count += READ_ONCE(rdp->lazy_len);
+ }
+
+ return count ? count : SHRINK_EMPTY;
+}
+
+static unsigned long
+lazy_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
+{
+ int cpu;
+ unsigned long flags;
+ unsigned long count = 0;
+
+ /* Snapshot count of all CPUs */
+ for_each_possible_cpu(cpu) {
+ struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
+ int _count = READ_ONCE(rdp->lazy_len);
+
+ if (_count == 0)
+ continue;
+ rcu_nocb_lock_irqsave(rdp, flags);
+ WRITE_ONCE(rdp->lazy_len, 0);
+ rcu_nocb_unlock_irqrestore(rdp, flags);
+ wake_nocb_gp(rdp, false);
+ sc->nr_to_scan -= _count;
+ count += _count;
+ if (sc->nr_to_scan <= 0)
+ break;
+ }
+ return count ? count : SHRINK_STOP;
+}
+
+static struct shrinker lazy_rcu_shrinker = {
+ .count_objects = lazy_rcu_shrink_count,
+ .scan_objects = lazy_rcu_shrink_scan,
+ .batch = 0,
+ .seeks = DEFAULT_SEEKS,
+};
+
void __init rcu_init_nohz(void)
{
int cpu;
@@ -1342,6 +1391,9 @@ void __init rcu_init_nohz(void)
if (!rcu_state.nocb_is_setup)
return;

+ if (register_shrinker(&lazy_rcu_shrinker, "rcu-lazy"))
+ pr_err("Failed to register lazy_rcu shrinker!\n");
+
if (!cpumask_subset(rcu_nocb_mask, cpu_possible_mask)) {
pr_info("\tNote: kernel parameter 'rcu_nocbs=', 'nohz_full', or 'isolcpus=' contains nonexistent CPUs.\n");
cpumask_and(rcu_nocb_mask, cpu_possible_mask,
--
2.31.1.189.g2e36527f23


2022-11-23 16:30:02

by Frederic Weisbecker

Subject: Re: [PATCH v2 rcu 05/16] rcu: Refactor code a bit in rcu_nocb_do_flush_bypass()

On Mon, Nov 21, 2022 at 05:04:10PM -0800, Paul E. McKenney wrote:
> From: "Joel Fernandes (Google)" <[email protected]>
>
> This consolidates the code a bit and makes it cleaner. Functionally it
> is the same.
>
> Reported-by: Paul E. McKenney <[email protected]>
> Signed-off-by: Joel Fernandes (Google) <[email protected]>
> Signed-off-by: Paul E. McKenney <[email protected]>
> ---
> kernel/rcu/tree_nocb.h | 17 +++++++++--------
> 1 file changed, 9 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index d6e4c076b0515..213daf81c057f 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -327,10 +327,11 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
> *
> * Note that this function always returns true if rhp is NULL.
> */
> -static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> +static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp_in,
> unsigned long j, bool lazy)
> {
> struct rcu_cblist rcl;
> + struct rcu_head *rhp = rhp_in;

Why that intermediate rhp_in?

>
> WARN_ON_ONCE(!rcu_rdp_is_offloaded(rdp));
> rcu_lockdep_assert_cblist_protected(rdp);
> @@ -345,16 +346,16 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>
> /*
> * If the new CB requested was a lazy one, queue it onto the main
> - * ->cblist so we can take advantage of a sooner grade period.
> + * ->cblist so that we can take advantage of the grace-period that will
> + * happen regardless. But queue it onto the bypass list first so that
> + * the lazy CB is ordered with the existing CBs in the bypass list.
> */
> if (lazy && rhp) {
> - rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, NULL);
> - rcu_cblist_enqueue(&rcl, rhp);
> - WRITE_ONCE(rdp->lazy_len, 0);
> - } else {
> - rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp);
> - WRITE_ONCE(rdp->lazy_len, 0);
> + rcu_cblist_enqueue(&rdp->nocb_bypass, rhp);
> + rhp = NULL;
> }
> + rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp);
> + WRITE_ONCE(rdp->lazy_len, 0);

Reviewed-by: Frederic Weisbecker <[email protected]>

Thanks.

>
> rcu_segcblist_insert_pend_cbs(&rdp->cblist, &rcl);
> WRITE_ONCE(rdp->nocb_bypass_first, j);
> --
> 2.31.1.189.g2e36527f23
>

2022-11-23 19:01:35

by Paul E. McKenney

Subject: Re: [PATCH v2 rcu 05/16] rcu: Refactor code a bit in rcu_nocb_do_flush_bypass()

On Wed, Nov 23, 2022 at 04:59:29PM +0100, Frederic Weisbecker wrote:
> On Mon, Nov 21, 2022 at 05:04:10PM -0800, Paul E. McKenney wrote:
> > From: "Joel Fernandes (Google)" <[email protected]>
> >
> > This consolidates the code a bit and makes it cleaner. Functionally it
> > is the same.
> >
> > Reported-by: Paul E. McKenney <[email protected]>
> > Signed-off-by: Joel Fernandes (Google) <[email protected]>
> > Signed-off-by: Paul E. McKenney <[email protected]>
> > ---
> > kernel/rcu/tree_nocb.h | 17 +++++++++--------
> > 1 file changed, 9 insertions(+), 8 deletions(-)
> >
> > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> > index d6e4c076b0515..213daf81c057f 100644
> > --- a/kernel/rcu/tree_nocb.h
> > +++ b/kernel/rcu/tree_nocb.h
> > @@ -327,10 +327,11 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
> > *
> > * Note that this function always returns true if rhp is NULL.
> > */
> > -static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> > +static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp_in,
> > unsigned long j, bool lazy)
> > {
> > struct rcu_cblist rcl;
> > + struct rcu_head *rhp = rhp_in;
>
> Why that intermediate rhp_in?

To avoid modifying the formal parameter, should the original value prove
useful, for example, for tracing or debugging.

> > WARN_ON_ONCE(!rcu_rdp_is_offloaded(rdp));
> > rcu_lockdep_assert_cblist_protected(rdp);
> > @@ -345,16 +346,16 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
> >
> > /*
> > * If the new CB requested was a lazy one, queue it onto the main
> > - * ->cblist so we can take advantage of a sooner grade period.
> > + * ->cblist so that we can take advantage of the grace-period that will
> > + * happen regardless. But queue it onto the bypass list first so that
> > + * the lazy CB is ordered with the existing CBs in the bypass list.
> > */
> > if (lazy && rhp) {
> > - rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, NULL);
> > - rcu_cblist_enqueue(&rcl, rhp);
> > - WRITE_ONCE(rdp->lazy_len, 0);
> > - } else {
> > - rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp);
> > - WRITE_ONCE(rdp->lazy_len, 0);
> > + rcu_cblist_enqueue(&rdp->nocb_bypass, rhp);
> > + rhp = NULL;
> > }
> > + rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp);
> > + WRITE_ONCE(rdp->lazy_len, 0);
>
> Reviewed-by: Frederic Weisbecker <[email protected]>

Thank you! I will apply this on my next rebase.

Thanx, Paul

> Thanks.
>
> >
> > rcu_segcblist_insert_pend_cbs(&rdp->cblist, &rcl);
> > WRITE_ONCE(rdp->nocb_bypass_first, j);
> > --
> > 2.31.1.189.g2e36527f23
> >

2022-11-24 01:56:08

by Joel Fernandes

Subject: Re: [PATCH v2 rcu 05/16] rcu: Refactor code a bit in rcu_nocb_do_flush_bypass()



> On Nov 23, 2022, at 12:54 PM, Paul E. McKenney <[email protected]> wrote:
>
> On Wed, Nov 23, 2022 at 04:59:29PM +0100, Frederic Weisbecker wrote:
>>> On Mon, Nov 21, 2022 at 05:04:10PM -0800, Paul E. McKenney wrote:
>>> From: "Joel Fernandes (Google)" <[email protected]>
>>>
>>> This consolidates the code a bit and makes it cleaner. Functionally it
>>> is the same.
>>>
>>> Reported-by: Paul E. McKenney <[email protected]>
>>> Signed-off-by: Joel Fernandes (Google) <[email protected]>
>>> Signed-off-by: Paul E. McKenney <[email protected]>
>>> ---
>>> kernel/rcu/tree_nocb.h | 17 +++++++++--------
>>> 1 file changed, 9 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
>>> index d6e4c076b0515..213daf81c057f 100644
>>> --- a/kernel/rcu/tree_nocb.h
>>> +++ b/kernel/rcu/tree_nocb.h
>>> @@ -327,10 +327,11 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
>>> *
>>> * Note that this function always returns true if rhp is NULL.
>>> */
>>> -static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>>> +static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp_in,
>>> unsigned long j, bool lazy)
>>> {
>>> struct rcu_cblist rcl;
>>> + struct rcu_head *rhp = rhp_in;
>>
>> Why that intermediate rhp_in?
>
> To avoid modifying the formal parameter, should the original value prove
> useful, for example, for tracing or debugging.

So as not to reassign the function parameter and introduce bugs down the line (someone reading the code thinks they passed a certain rhp, but the code is using something else later in the function).

Thanks.
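
For illustration, a minimal sketch of the pattern (must_divert(),
enqueue_head(), and trace_head() are hypothetical helpers):

	static void example(struct rcu_head *rhp_in)
	{
		struct rcu_head *rhp = rhp_in;	/* mutate only the local copy */

		if (must_divert(rhp))
			rhp = NULL;	/* rhp_in still holds the caller's value */
		enqueue_head(rhp);
		trace_head(rhp_in);	/* original pointer available for tracing */
	}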



>
>>> WARN_ON_ONCE(!rcu_rdp_is_offloaded(rdp));
>>> rcu_lockdep_assert_cblist_protected(rdp);
>>> @@ -345,16 +346,16 @@ static bool rcu_nocb_do_flush_bypass(struct rcu_data *rdp, struct rcu_head *rhp,
>>>
>>> /*
>>> * If the new CB requested was a lazy one, queue it onto the main
>>> - * ->cblist so we can take advantage of a sooner grade period.
>>> + * ->cblist so that we can take advantage of the grace-period that will
>>> + * happen regardless. But queue it onto the bypass list first so that
>>> + * the lazy CB is ordered with the existing CBs in the bypass list.
>>> */
>>> if (lazy && rhp) {
>>> - rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, NULL);
>>> - rcu_cblist_enqueue(&rcl, rhp);
>>> - WRITE_ONCE(rdp->lazy_len, 0);
>>> - } else {
>>> - rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp);
>>> - WRITE_ONCE(rdp->lazy_len, 0);
>>> + rcu_cblist_enqueue(&rdp->nocb_bypass, rhp);
>>> + rhp = NULL;
>>> }
>>> + rcu_cblist_flush_enqueue(&rcl, &rdp->nocb_bypass, rhp);
>>> + WRITE_ONCE(rdp->lazy_len, 0);
>>
>> Reviewed-by: Frederic Weisbecker <[email protected]>
>
> Thank you! I will apply this on my next rebase.
>
> Thanx, Paul
>
>> Thanks.
>>
>>>
>>> rcu_segcblist_insert_pend_cbs(&rdp->cblist, &rcl);
>>> WRITE_ONCE(rdp->nocb_bypass_first, j);
>>> --
>>> 2.31.1.189.g2e36527f23
>>>

2022-11-26 00:35:23

by Bart Van Assche

Subject: Re: [PATCH v2 rcu 12/16] scsi/scsi_error: Use call_rcu_flush() instead of call_rcu()

On 11/21/22 17:04, Paul E. McKenney wrote:
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 6995c89792300..634672e67c81f 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -312,7 +312,7 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
> * Ensure that all tasks observe the host state change before the
> * host_failed change.
> */
> - call_rcu(&scmd->rcu, scsi_eh_inc_host_failed);
> + call_rcu_flush(&scmd->rcu, scsi_eh_inc_host_failed);
> }

Reviewed-by: Bart Van Assche <[email protected]>

2022-11-26 03:24:15

by Martin K. Petersen

Subject: Re: [PATCH v2 rcu 12/16] scsi/scsi_error: Use call_rcu_flush() instead of call_rcu()


Paul,

> Therefore, make scsi_eh_scmd_add() use call_rcu_flush() in order to
> revert to the old behavior.

[...]

> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 6995c89792300..634672e67c81f 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -312,7 +312,7 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
> * Ensure that all tasks observe the host state change before the
> * host_failed change.
> */
> - call_rcu(&scmd->rcu, scsi_eh_inc_host_failed);
> + call_rcu_flush(&scmd->rcu, scsi_eh_inc_host_failed);
> }
>
> /**

OK with me.

Acked-by: Martin K. Petersen <[email protected]>

--
Martin K. Petersen Oracle Linux Engineering

2022-11-28 22:21:59

by Paul E. McKenney

Subject: Re: [PATCH v2 rcu 12/16] scsi/scsi_error: Use call_rcu_flush() instead of call_rcu()

On Fri, Nov 25, 2022 at 04:11:26PM -0800, Bart Van Assche wrote:
> On 11/21/22 17:04, Paul E. McKenney wrote:
> > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> > index 6995c89792300..634672e67c81f 100644
> > --- a/drivers/scsi/scsi_error.c
> > +++ b/drivers/scsi/scsi_error.c
> > @@ -312,7 +312,7 @@ void scsi_eh_scmd_add(struct scsi_cmnd *scmd)
> > * Ensure that all tasks observe the host state change before the
> > * host_failed change.
> > */
> > - call_rcu(&scmd->rcu, scsi_eh_inc_host_failed);
> > + call_rcu_flush(&scmd->rcu, scsi_eh_inc_host_failed);
> > }
>
> Reviewed-by: Bart Van Assche <[email protected]>

Thank you both, I will apply these on the next rebase.

Thanx, Paul