2020-08-25 03:02:30

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge

The donecbs's ->len field is used to store the total count of the segmented
callback list's length. This ->len field is then added to the destination segcb
list.

However, this presents a problem for per-segment length counting which is added
in a future patch. This future patch sets the rcl->len field as we move
segments of callbacks between source and destination lists, thus becoming
incompatible with the donecb's ->len field.

This commit therefore avoids depending on the ->len field in this way. IMHO,
this is also less error-prone and is more accurate - the donecb's ->len field
should be the length of the done segment and not just used as a temporarily
variable.

Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
kernel/rcu/rcu_segcblist.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index 2d2a6b6b9dfb..b70d4154433c 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -513,14 +513,18 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
{
struct rcu_cblist donecbs;
struct rcu_cblist pendcbs;
+ long src_len;

rcu_cblist_init(&donecbs);
rcu_cblist_init(&pendcbs);
- rcu_segcblist_extract_count(src_rsclp, &donecbs);
+
+ src_len = rcu_segcblist_xchg_len(src_rsclp, 0);
rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
- rcu_segcblist_insert_count(dst_rsclp, &donecbs);
+
+ rcu_segcblist_add_len(dst_rsclp, src_len);
rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
+
rcu_segcblist_init(src_rsclp);
}
--
2.28.0.297.g1956fa8f8d-goog


2020-08-25 20:09:35

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge

On Mon, Aug 24, 2020 at 10:48:39PM -0400, Joel Fernandes (Google) wrote:
> The donecbs's ->len field is used to store the total count of the segmented
> callback list's length. This ->len field is then added to the destination segcb
> list.
>
> However, this presents a problem for per-segment length counting which is added
> in a future patch. This future patch sets the rcl->len field as we move
> segments of callbacks between source and destination lists, thus becoming
> incompatible with the donecb's ->len field.

OK, I will bite. What is "rcl"? A placeholder for donecbs and pendcbs?
If so, please just name them both. If not, please explain.

> This commit therefore avoids depending on the ->len field in this way. IMHO,
> this is also less error-prone and is more accurate - the donecb's ->len field
> should be the length of the done segment and not just used as a temporarily
> variable.

Please also mention why ->len is handled specially at all, namely
interactions between rcu_barrier() and callback invocation. This is
the answer to "why not just make all this work like normal lists?"
This might go well in the first paragraph.

> Signed-off-by: Joel Fernandes (Google) <[email protected]>
> ---
> kernel/rcu/rcu_segcblist.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> index 2d2a6b6b9dfb..b70d4154433c 100644
> --- a/kernel/rcu/rcu_segcblist.c
> +++ b/kernel/rcu/rcu_segcblist.c
> @@ -513,14 +513,18 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
> {
> struct rcu_cblist donecbs;
> struct rcu_cblist pendcbs;
> + long src_len;
>
> rcu_cblist_init(&donecbs);
> rcu_cblist_init(&pendcbs);
> - rcu_segcblist_extract_count(src_rsclp, &donecbs);
> +
> + src_len = rcu_segcblist_xchg_len(src_rsclp, 0);

Given that both rcu_segcblist_xchg_len() and rcu_segcblist_extract_count()
have only one callsite each, why not get rid of one of them?

Or better yet, please see below, which should allow getting rid of both
of them.

> rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> - rcu_segcblist_insert_count(dst_rsclp, &donecbs);
> +
> + rcu_segcblist_add_len(dst_rsclp, src_len);
> rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);

Rather than adding the blank lines, why not have the rcu_cblist structures
carry the lengths? You are already adjusting one of the two call sites
that care (rcu_do_batch()), and the other is srcu_invoke_callbacks().
That should shorten this function a bit more. And make callback handling
much more approachable, I suspect.

There would still be the callback-invocation need to be careful with
->cblist.len due to rcu_barrier() and srcu_barrier(). But both of
those should be excluded by this code. (But don't take my word for it,
ask KCSAN.)

Thanx, Paul

> +
> rcu_segcblist_init(src_rsclp);
> }
> --
> 2.28.0.297.g1956fa8f8d-goog
>

2020-08-25 22:48:29

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge

Hi Paul,

On Tue, Aug 25, 2020 at 01:08:09PM -0700, Paul E. McKenney wrote:
> On Mon, Aug 24, 2020 at 10:48:39PM -0400, Joel Fernandes (Google) wrote:
> > The donecbs's ->len field is used to store the total count of the segmented
> > callback list's length. This ->len field is then added to the destination segcb
> > list.
> >
> > However, this presents a problem for per-segment length counting which is added
> > in a future patch. This future patch sets the rcl->len field as we move
> > segments of callbacks between source and destination lists, thus becoming
> > incompatible with the donecb's ->len field.
>
> OK, I will bite. What is "rcl"? A placeholder for donecbs and pendcbs?
> If so, please just name them both. If not, please explain.

Ok will fix.

> > This commit therefore avoids depending on the ->len field in this way. IMHO,
> > this is also less error-prone and is more accurate - the donecb's ->len field
> > should be the length of the done segment and not just used as a temporarily
> > variable.
>
> Please also mention why ->len is handled specially at all, namely
> interactions between rcu_barrier() and callback invocation. This is
> the answer to "why not just make all this work like normal lists?"
> This might go well in the first paragraph.

Are you referring to the cblist structures ->len? I know the segcblist's
->len field is what rcu_barrier() samples but I am not changing that behavior
at all in this patch. This patch is only about the donecb's len (which is a
cblist structure on the stack).

> > Signed-off-by: Joel Fernandes (Google) <[email protected]>
> > ---
> > kernel/rcu/rcu_segcblist.c | 8 ++++++--
> > 1 file changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> > index 2d2a6b6b9dfb..b70d4154433c 100644
> > --- a/kernel/rcu/rcu_segcblist.c
> > +++ b/kernel/rcu/rcu_segcblist.c
> > @@ -513,14 +513,18 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
> > {
> > struct rcu_cblist donecbs;
> > struct rcu_cblist pendcbs;
> > + long src_len;
> >
> > rcu_cblist_init(&donecbs);
> > rcu_cblist_init(&pendcbs);
> > - rcu_segcblist_extract_count(src_rsclp, &donecbs);
> > +
> > + src_len = rcu_segcblist_xchg_len(src_rsclp, 0);
>
> Given that both rcu_segcblist_xchg_len() and rcu_segcblist_extract_count()
> have only one callsite each, why not get rid of one of them?

Good point, I will do that.

> Or better yet, please see below, which should allow getting rid of both
> of them.
>
> > rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> > rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> > - rcu_segcblist_insert_count(dst_rsclp, &donecbs);
> > +
> > + rcu_segcblist_add_len(dst_rsclp, src_len);
> > rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> > rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
>
> Rather than adding the blank lines, why not have the rcu_cblist structures
> carry the lengths? You are already adjusting one of the two call sites
> that care (rcu_do_batch()), and the other is srcu_invoke_callbacks().
> That should shorten this function a bit more. And make callback handling
> much more approachable, I suspect.

Sorry, I did not understand. The rcu_cblist structure already has a length
field. I do modify rcu_segcblist_extract_done_cbs() and
rcu_segcblist_extract_pend_cbs() to carry the length already, in a later
patch.

Just to emphasize, this patch is just a small refactor to avoid an issue in
later patches. It aims to keep current functionality unchanged.

thanks,

- Joel

>
> There would still be the callback-invocation need to be careful with
> ->cblist.len due to rcu_barrier() and srcu_barrier(). But both of
> those should be excluded by this code. (But don't take my word for it,
> ask KCSAN.)
>
> Thanx, Paul
>
> > +
> > rcu_segcblist_init(src_rsclp);
> > }
> > --
> > 2.28.0.297.g1956fa8f8d-goog
> >

2020-08-26 14:36:47

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge

On Tue, Aug 25, 2020 at 06:47:23PM -0400, Joel Fernandes wrote:
> Hi Paul,
>
> On Tue, Aug 25, 2020 at 01:08:09PM -0700, Paul E. McKenney wrote:
> > On Mon, Aug 24, 2020 at 10:48:39PM -0400, Joel Fernandes (Google) wrote:
> > > The donecbs's ->len field is used to store the total count of the segmented
> > > callback list's length. This ->len field is then added to the destination segcb
> > > list.
> > >
> > > However, this presents a problem for per-segment length counting which is added
> > > in a future patch. This future patch sets the rcl->len field as we move
> > > segments of callbacks between source and destination lists, thus becoming
> > > incompatible with the donecb's ->len field.
> >
> > OK, I will bite. What is "rcl"? A placeholder for donecbs and pendcbs?
> > If so, please just name them both. If not, please explain.
>
> Ok will fix.
>
> > > This commit therefore avoids depending on the ->len field in this way. IMHO,
> > > this is also less error-prone and is more accurate - the donecb's ->len field
> > > should be the length of the done segment and not just used as a temporarily
> > > variable.
> >
> > Please also mention why ->len is handled specially at all, namely
> > interactions between rcu_barrier() and callback invocation. This is
> > the answer to "why not just make all this work like normal lists?"
> > This might go well in the first paragraph.
>
> Are you referring to the cblist structures ->len? I know the segcblist's
> ->len field is what rcu_barrier() samples but I am not changing that behavior
> at all in this patch. This patch is only about the donecb's len (which is a
> cblist structure on the stack).

Yes, we agree. I am just suggesting that you call this out in the
commit log. It is probably not obvious to those who have not been
through the code yet. ;-)

> > > Signed-off-by: Joel Fernandes (Google) <[email protected]>
> > > ---
> > > kernel/rcu/rcu_segcblist.c | 8 ++++++--
> > > 1 file changed, 6 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> > > index 2d2a6b6b9dfb..b70d4154433c 100644
> > > --- a/kernel/rcu/rcu_segcblist.c
> > > +++ b/kernel/rcu/rcu_segcblist.c
> > > @@ -513,14 +513,18 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
> > > {
> > > struct rcu_cblist donecbs;
> > > struct rcu_cblist pendcbs;
> > > + long src_len;
> > >
> > > rcu_cblist_init(&donecbs);
> > > rcu_cblist_init(&pendcbs);
> > > - rcu_segcblist_extract_count(src_rsclp, &donecbs);
> > > +
> > > + src_len = rcu_segcblist_xchg_len(src_rsclp, 0);
> >
> > Given that both rcu_segcblist_xchg_len() and rcu_segcblist_extract_count()
> > have only one callsite each, why not get rid of one of them?
>
> Good point, I will do that.
>
> > Or better yet, please see below, which should allow getting rid of both
> > of them.
> >
> > > rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> > > rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> > > - rcu_segcblist_insert_count(dst_rsclp, &donecbs);
> > > +
> > > + rcu_segcblist_add_len(dst_rsclp, src_len);
> > > rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> > > rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
> >
> > Rather than adding the blank lines, why not have the rcu_cblist structures
> > carry the lengths? You are already adjusting one of the two call sites
> > that care (rcu_do_batch()), and the other is srcu_invoke_callbacks().
> > That should shorten this function a bit more. And make callback handling
> > much more approachable, I suspect.
>
> Sorry, I did not understand. The rcu_cblist structure already has a length
> field. I do modify rcu_segcblist_extract_done_cbs() and
> rcu_segcblist_extract_pend_cbs() to carry the length already, in a later
> patch.
>
> Just to emphasize, this patch is just a small refactor to avoid an issue in
> later patches. It aims to keep current functionality unchanged.

True enough. I am just suggesting that an equally small refactor in
a slightly different direction should get to a better place. The key
point enabling this slightly different direction is that this code is
an exception to the "preserve ->cblist.len" rule because it is invoked
only from the CPU hotplug code.

So you could use the rcu_cblist .len field to update the ->cblist.len
field, thus combining the _cbs and _count updates. One thing that helps
is that setting th e rcu_cblist .len field doesn't hurt the other use
cases that require careful handling of ->cblist.len.

Thanx, Paul

> thanks,
>
> - Joel
>
> >
> > There would still be the callback-invocation need to be careful with
> > ->cblist.len due to rcu_barrier() and srcu_barrier(). But both of
> > those should be excluded by this code. (But don't take my word for it,
> > ask KCSAN.)
> >
> > Thanx, Paul
> >
> > > +
> > > rcu_segcblist_init(src_rsclp);
> > > }
> > > --
> > > 2.28.0.297.g1956fa8f8d-goog
> > >

2020-08-27 22:56:21

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge

On Wed, Aug 26, 2020 at 07:20:28AM -0700, Paul E. McKenney wrote:
[...]
> > > Or better yet, please see below, which should allow getting rid of both
> > > of them.
> > >
> > > > rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> > > > rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> > > > - rcu_segcblist_insert_count(dst_rsclp, &donecbs);
> > > > +
> > > > + rcu_segcblist_add_len(dst_rsclp, src_len);
> > > > rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> > > > rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
> > >
> > > Rather than adding the blank lines, why not have the rcu_cblist structures
> > > carry the lengths? You are already adjusting one of the two call sites
> > > that care (rcu_do_batch()), and the other is srcu_invoke_callbacks().
> > > That should shorten this function a bit more. And make callback handling
> > > much more approachable, I suspect.
> >
> > Sorry, I did not understand. The rcu_cblist structure already has a length
> > field. I do modify rcu_segcblist_extract_done_cbs() and
> > rcu_segcblist_extract_pend_cbs() to carry the length already, in a later
> > patch.
> >
> > Just to emphasize, this patch is just a small refactor to avoid an issue in
> > later patches. It aims to keep current functionality unchanged.
>
> True enough. I am just suggesting that an equally small refactor in
> a slightly different direction should get to a better place. The key
> point enabling this slightly different direction is that this code is
> an exception to the "preserve ->cblist.len" rule because it is invoked
> only from the CPU hotplug code.
>
> So you could use the rcu_cblist .len field to update the ->cblist.len
> field, thus combining the _cbs and _count updates. One thing that helps
> is that setting th e rcu_cblist .len field doesn't hurt the other use
> cases that require careful handling of ->cblist.len.

Thank you for the ideas. I am trying something like this on top of this
series based on the ideas. One thing I concerned a bit is if getting rid of
the rcu_segcblist_xchg_len() function (which has memory barriers in them)
causes issues in the hotplug path. I am now directly updating the length
without additional memory barriers. I will test it more and try to reason
more about it as well.

---8<-----------------------

From: Joel Fernandes <[email protected]>
Date: Thu, 27 Aug 2020 18:30:25 -0400
Subject: [PATCH] fixup! rcu/segcblist: Do not depend on donecbs ->len to store
the segcb len during merge

Signed-off-by: Joel Fernandes <[email protected]>
---
kernel/rcu/rcu_segcblist.c | 38 ++++----------------------------------
1 file changed, 4 insertions(+), 34 deletions(-)

diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index 79c2cbe388c5..c33abbc97a07 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -175,26 +175,6 @@ void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp)
rcu_segcblist_add_len(rsclp, 1);
}

-/*
- * Exchange the numeric length of the specified rcu_segcblist structure
- * with the specified value. This can cause the ->len field to disagree
- * with the actual number of callbacks on the structure. This exchange is
- * fully ordered with respect to the callers accesses both before and after.
- */
-static long rcu_segcblist_xchg_len(struct rcu_segcblist *rsclp, long v)
-{
-#ifdef CONFIG_RCU_NOCB_CPU
- return atomic_long_xchg(&rsclp->len, v);
-#else
- long ret = rsclp->len;
-
- smp_mb(); /* Up to the caller! */
- WRITE_ONCE(rsclp->len, v);
- smp_mb(); /* Up to the caller! */
- return ret;
-#endif
-}
-
/*
* Initialize an rcu_segcblist structure.
*/
@@ -361,6 +341,7 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
if (rsclp->tails[i] == rsclp->tails[RCU_DONE_TAIL])
WRITE_ONCE(rsclp->tails[i], &rsclp->head);
rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
+ rcu_segcblist_add_len(rsclp, -(rclp->len));
}

/*
@@ -414,17 +395,7 @@ void rcu_segcblist_extract_pend_cbs(struct rcu_segcblist *rsclp,
WRITE_ONCE(rsclp->tails[i], rsclp->tails[RCU_DONE_TAIL]);
rcu_segcblist_set_seglen(rsclp, i, 0);
}
-}
-
-/*
- * Insert counts from the specified rcu_cblist structure in the
- * specified rcu_segcblist structure.
- */
-void rcu_segcblist_insert_count(struct rcu_segcblist *rsclp,
- struct rcu_cblist *rclp)
-{
- rcu_segcblist_add_len(rsclp, rclp->len);
- rclp->len = 0;
+ rcu_segcblist_add_len(rsclp, -(rclp->len));
}

/*
@@ -448,6 +419,7 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
break;
rclp->head = NULL;
rclp->tail = &rclp->head;
+ rcu_segcblist_add_len(rsclp, rclp->len);
}

/*
@@ -463,6 +435,7 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
rcu_segcblist_add_seglen(rsclp, RCU_NEXT_TAIL, rclp->len);
WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rclp->head);
WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], rclp->tail);
+ rcu_segcblist_add_len(rsclp, rclp->len);
}

/*
@@ -601,16 +574,13 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
{
struct rcu_cblist donecbs;
struct rcu_cblist pendcbs;
- long src_len;

rcu_cblist_init(&donecbs);
rcu_cblist_init(&pendcbs);

- src_len = rcu_segcblist_xchg_len(src_rsclp, 0);
rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);

- rcu_segcblist_add_len(dst_rsclp, src_len);
rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);

--
2.28.0.402.g5ffc5be6b7-goog

2020-08-28 14:22:21

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge

On Thu, Aug 27, 2020 at 06:55:18PM -0400, Joel Fernandes wrote:
> On Wed, Aug 26, 2020 at 07:20:28AM -0700, Paul E. McKenney wrote:
> [...]
> > > > Or better yet, please see below, which should allow getting rid of both
> > > > of them.
> > > >
> > > > > rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> > > > > rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> > > > > - rcu_segcblist_insert_count(dst_rsclp, &donecbs);
> > > > > +
> > > > > + rcu_segcblist_add_len(dst_rsclp, src_len);
> > > > > rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> > > > > rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
> > > >
> > > > Rather than adding the blank lines, why not have the rcu_cblist structures
> > > > carry the lengths? You are already adjusting one of the two call sites
> > > > that care (rcu_do_batch()), and the other is srcu_invoke_callbacks().
> > > > That should shorten this function a bit more. And make callback handling
> > > > much more approachable, I suspect.
> > >
> > > Sorry, I did not understand. The rcu_cblist structure already has a length
> > > field. I do modify rcu_segcblist_extract_done_cbs() and
> > > rcu_segcblist_extract_pend_cbs() to carry the length already, in a later
> > > patch.
> > >
> > > Just to emphasize, this patch is just a small refactor to avoid an issue in
> > > later patches. It aims to keep current functionality unchanged.
> >
> > True enough. I am just suggesting that an equally small refactor in
> > a slightly different direction should get to a better place. The key
> > point enabling this slightly different direction is that this code is
> > an exception to the "preserve ->cblist.len" rule because it is invoked
> > only from the CPU hotplug code.
> >
> > So you could use the rcu_cblist .len field to update the ->cblist.len
> > field, thus combining the _cbs and _count updates. One thing that helps
> > is that setting th e rcu_cblist .len field doesn't hurt the other use
> > cases that require careful handling of ->cblist.len.
>
> Thank you for the ideas. I am trying something like this on top of this
> series based on the ideas. One thing I concerned a bit is if getting rid of
> the rcu_segcblist_xchg_len() function (which has memory barriers in them)
> causes issues in the hotplug path. I am now directly updating the length
> without additional memory barriers. I will test it more and try to reason
> more about it as well.

In this particular case, the CPU-hotplug locks prevent rcu_barrier()
from running concurrently, so it should be OK. Is there an easy way
to make lockdep help us check this? Does lockdep_assert_cpus_held()
suffice, or is it too easily satisfied?

> ---8<-----------------------
>
> From: Joel Fernandes <[email protected]>
> Date: Thu, 27 Aug 2020 18:30:25 -0400
> Subject: [PATCH] fixup! rcu/segcblist: Do not depend on donecbs ->len to store
> the segcb len during merge
>
> Signed-off-by: Joel Fernandes <[email protected]>
> ---
> kernel/rcu/rcu_segcblist.c | 38 ++++----------------------------------
> 1 file changed, 4 insertions(+), 34 deletions(-)
>
> diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> index 79c2cbe388c5..c33abbc97a07 100644
> --- a/kernel/rcu/rcu_segcblist.c
> +++ b/kernel/rcu/rcu_segcblist.c
> @@ -175,26 +175,6 @@ void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp)
> rcu_segcblist_add_len(rsclp, 1);
> }
>
> -/*
> - * Exchange the numeric length of the specified rcu_segcblist structure
> - * with the specified value. This can cause the ->len field to disagree
> - * with the actual number of callbacks on the structure. This exchange is
> - * fully ordered with respect to the callers accesses both before and after.
> - */
> -static long rcu_segcblist_xchg_len(struct rcu_segcblist *rsclp, long v)
> -{
> -#ifdef CONFIG_RCU_NOCB_CPU
> - return atomic_long_xchg(&rsclp->len, v);
> -#else
> - long ret = rsclp->len;
> -
> - smp_mb(); /* Up to the caller! */
> - WRITE_ONCE(rsclp->len, v);
> - smp_mb(); /* Up to the caller! */
> - return ret;
> -#endif
> -}
> -

This looks nice!

> /*
> * Initialize an rcu_segcblist structure.
> */
> @@ -361,6 +341,7 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
> if (rsclp->tails[i] == rsclp->tails[RCU_DONE_TAIL])
> WRITE_ONCE(rsclp->tails[i], &rsclp->head);
> rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
> + rcu_segcblist_add_len(rsclp, -(rclp->len));
> }
>
> /*
> @@ -414,17 +395,7 @@ void rcu_segcblist_extract_pend_cbs(struct rcu_segcblist *rsclp,
> WRITE_ONCE(rsclp->tails[i], rsclp->tails[RCU_DONE_TAIL]);
> rcu_segcblist_set_seglen(rsclp, i, 0);
> }
> -}
> -
> -/*
> - * Insert counts from the specified rcu_cblist structure in the
> - * specified rcu_segcblist structure.
> - */
> -void rcu_segcblist_insert_count(struct rcu_segcblist *rsclp,
> - struct rcu_cblist *rclp)
> -{
> - rcu_segcblist_add_len(rsclp, rclp->len);
> - rclp->len = 0;
> + rcu_segcblist_add_len(rsclp, -(rclp->len));

As does this. ;-)

> }
>
> /*
> @@ -448,6 +419,7 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
> break;
> rclp->head = NULL;
> rclp->tail = &rclp->head;
> + rcu_segcblist_add_len(rsclp, rclp->len);

Does there need to be a compensating action in rcu_do_batch(), or is
this the point of the "rcu_segcblist_add_len(rsclp, -(rclp->len));"
added to rcu_segcblist_extract_done_cbs() above?

If so, a comment would be good.

> }
>
> /*
> @@ -463,6 +435,7 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
> rcu_segcblist_add_seglen(rsclp, RCU_NEXT_TAIL, rclp->len);
> WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rclp->head);
> WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], rclp->tail);
> + rcu_segcblist_add_len(rsclp, rclp->len);
> }
>
> /*
> @@ -601,16 +574,13 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
> {
> struct rcu_cblist donecbs;
> struct rcu_cblist pendcbs;
> - long src_len;
>
> rcu_cblist_init(&donecbs);
> rcu_cblist_init(&pendcbs);
>
> - src_len = rcu_segcblist_xchg_len(src_rsclp, 0);
> rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
>
> - rcu_segcblist_add_len(dst_rsclp, src_len);
> rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);

Can we now pair the corresponding _extract_ and _insert_ calls, thus
requiring only one rcu_cblist structure? This would drop two more lines
of code. Or would that break something?

Thanx, Paul

2020-09-01 15:10:13

by Joel Fernandes

[permalink] [raw]
Subject: Re: [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge

Hi,
Resuming regular activities here, post-LPC :)

On Fri, Aug 28, 2020 at 07:18:55AM -0700, Paul E. McKenney wrote:
> On Thu, Aug 27, 2020 at 06:55:18PM -0400, Joel Fernandes wrote:
> > On Wed, Aug 26, 2020 at 07:20:28AM -0700, Paul E. McKenney wrote:
> > [...]
> > > > > Or better yet, please see below, which should allow getting rid of both
> > > > > of them.
> > > > >
> > > > > > rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> > > > > > rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> > > > > > - rcu_segcblist_insert_count(dst_rsclp, &donecbs);
> > > > > > +
> > > > > > + rcu_segcblist_add_len(dst_rsclp, src_len);
> > > > > > rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> > > > > > rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
> > > > >
> > > > > Rather than adding the blank lines, why not have the rcu_cblist structures
> > > > > carry the lengths? You are already adjusting one of the two call sites
> > > > > that care (rcu_do_batch()), and the other is srcu_invoke_callbacks().
> > > > > That should shorten this function a bit more. And make callback handling
> > > > > much more approachable, I suspect.
> > > >
> > > > Sorry, I did not understand. The rcu_cblist structure already has a length
> > > > field. I do modify rcu_segcblist_extract_done_cbs() and
> > > > rcu_segcblist_extract_pend_cbs() to carry the length already, in a later
> > > > patch.
> > > >
> > > > Just to emphasize, this patch is just a small refactor to avoid an issue in
> > > > later patches. It aims to keep current functionality unchanged.
> > >
> > > True enough. I am just suggesting that an equally small refactor in
> > > a slightly different direction should get to a better place. The key
> > > point enabling this slightly different direction is that this code is
> > > an exception to the "preserve ->cblist.len" rule because it is invoked
> > > only from the CPU hotplug code.
> > >
> > > So you could use the rcu_cblist .len field to update the ->cblist.len
> > > field, thus combining the _cbs and _count updates. One thing that helps
> > > is that setting th e rcu_cblist .len field doesn't hurt the other use
> > > cases that require careful handling of ->cblist.len.
> >
> > Thank you for the ideas. I am trying something like this on top of this
> > series based on the ideas. One thing I concerned a bit is if getting rid of
> > the rcu_segcblist_xchg_len() function (which has memory barriers in them)
> > causes issues in the hotplug path. I am now directly updating the length
> > without additional memory barriers. I will test it more and try to reason
> > more about it as well.
>
> In this particular case, the CPU-hotplug locks prevent rcu_barrier()
> from running concurrently, so it should be OK. Is there an easy way
> to make lockdep help us check this? Does lockdep_assert_cpus_held()
> suffice, or is it too easily satisfied?

Just to clarify, the race you mention is rcu_barrier() should not run
concurrently with CPU hotplug operation.

The reason is:
rcu_barrier() checks whether the segmented list's length is 0 before sending
IPIs for entraining a callback onto the destination CPU. But if it
misunderstands the length's value due to a lack of memory barriers, it may
not missing sending an IPI causing the barrier to fail. Please correct me if
I'm wrong though.

Due to CPU hotplug locking, such race is impossible because both
rcu_barrier() and the CPU migrating the callbacks holds it.

I think the lockdep_assert_cpus_held() may suffice. I can add it to the
rcu_segcblist_merge() function.

BTW, I think we can simplify rcu_barrier() a bit and combine the tracepoint
and rid one outer if clause (diff at end of email).

> > }
> >
> > /*
> > @@ -448,6 +419,7 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
> > break;
> > rclp->head = NULL;
> > rclp->tail = &rclp->head;
> > + rcu_segcblist_add_len(rsclp, rclp->len);
>
> Does there need to be a compensating action in rcu_do_batch(), or is
> this the point of the "rcu_segcblist_add_len(rsclp, -(rclp->len));"
> added to rcu_segcblist_extract_done_cbs() above?
>
> If so, a comment would be good.

I think external compensation by rcu_do_batch is not needed, its best to
handle all cblist.len modifications internally in the segcblist API
where possible.

> > /*
> > @@ -463,6 +435,7 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
> > rcu_segcblist_add_seglen(rsclp, RCU_NEXT_TAIL, rclp->len);
> > WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rclp->head);
> > WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], rclp->tail);
> > + rcu_segcblist_add_len(rsclp, rclp->len);
> > }
> >
> > /*
> > @@ -601,16 +574,13 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
> > {
> > struct rcu_cblist donecbs;
> > struct rcu_cblist pendcbs;
> > - long src_len;
> >
> > rcu_cblist_init(&donecbs);
> > rcu_cblist_init(&pendcbs);
> >
> > - src_len = rcu_segcblist_xchg_len(src_rsclp, 0);
> > rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> > rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> >
> > - rcu_segcblist_add_len(dst_rsclp, src_len);
> > rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> > rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
>
> Can we now pair the corresponding _extract_ and _insert_ calls, thus
> requiring only one rcu_cblist structure? This would drop two more lines
> of code. Or would that break something?

That could work as well, yes. But may not be worth adding another function to
combine extract/insert operation, since the extract and insert operations are
needed by rcutree and srcutree as well (other than hotplug).

thanks,

- Joel

---8<-----------------------

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 16ad99a9ebba..274a1845ad38 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3856,17 +3856,16 @@ void rcu_barrier(void)
if (cpu_is_offline(cpu) &&
!rcu_segcblist_is_offloaded(&rdp->cblist))
continue;
- if (rcu_segcblist_n_cbs(&rdp->cblist) && cpu_online(cpu)) {
- rcu_barrier_trace(TPS("OnlineQ"), cpu,
- rcu_state.barrier_sequence);
- smp_call_function_single(cpu, rcu_barrier_func, (void *)cpu, 1);
- } else if (rcu_segcblist_n_cbs(&rdp->cblist) &&
- cpu_is_offline(cpu)) {
- rcu_barrier_trace(TPS("OfflineNoCBQ"), cpu,
- rcu_state.barrier_sequence);
- local_irq_disable();
- rcu_barrier_func((void *)cpu);
- local_irq_enable();
+ if (rcu_segcblist_n_cbs(&rdp->cblist)) {
+ rcu_barrier_trace(cpu_online(cpu) ? TPS("OnlineQ") : TPS("OfflineNoCBQ"),
+ cpu, rcu_state.barrier_sequence);
+ if (cpu_online(cpu)) {
+ smp_call_function_single(cpu, rcu_barrier_func, (void *)cpu, 1);
+ } else {
+ local_irq_disable();
+ rcu_barrier_func((void *)cpu);
+ local_irq_enable();
+ }
} else if (cpu_is_offline(cpu)) {
rcu_barrier_trace(TPS("OfflineNoCBNoQ"), cpu,
rcu_state.barrier_sequence);

2020-09-01 16:30:15

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v4 -rcu 1/4] rcu/segcblist: Do not depend on rcl->len to store the segcb len during merge

On Tue, Sep 01, 2020 at 11:06:09AM -0400, Joel Fernandes wrote:
> Hi,
> Resuming regular activities here, post-LPC :)
>
> On Fri, Aug 28, 2020 at 07:18:55AM -0700, Paul E. McKenney wrote:
> > On Thu, Aug 27, 2020 at 06:55:18PM -0400, Joel Fernandes wrote:
> > > On Wed, Aug 26, 2020 at 07:20:28AM -0700, Paul E. McKenney wrote:
> > > [...]
> > > > > > Or better yet, please see below, which should allow getting rid of both
> > > > > > of them.
> > > > > >
> > > > > > > rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> > > > > > > rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> > > > > > > - rcu_segcblist_insert_count(dst_rsclp, &donecbs);
> > > > > > > +
> > > > > > > + rcu_segcblist_add_len(dst_rsclp, src_len);
> > > > > > > rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> > > > > > > rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
> > > > > >
> > > > > > Rather than adding the blank lines, why not have the rcu_cblist structures
> > > > > > carry the lengths? You are already adjusting one of the two call sites
> > > > > > that care (rcu_do_batch()), and the other is srcu_invoke_callbacks().
> > > > > > That should shorten this function a bit more. And make callback handling
> > > > > > much more approachable, I suspect.
> > > > >
> > > > > Sorry, I did not understand. The rcu_cblist structure already has a length
> > > > > field. I do modify rcu_segcblist_extract_done_cbs() and
> > > > > rcu_segcblist_extract_pend_cbs() to carry the length already, in a later
> > > > > patch.
> > > > >
> > > > > Just to emphasize, this patch is just a small refactor to avoid an issue in
> > > > > later patches. It aims to keep current functionality unchanged.
> > > >
> > > > True enough. I am just suggesting that an equally small refactor in
> > > > a slightly different direction should get to a better place. The key
> > > > point enabling this slightly different direction is that this code is
> > > > an exception to the "preserve ->cblist.len" rule because it is invoked
> > > > only from the CPU hotplug code.
> > > >
> > > > So you could use the rcu_cblist .len field to update the ->cblist.len
> > > > field, thus combining the _cbs and _count updates. One thing that helps
> > > > is that setting th e rcu_cblist .len field doesn't hurt the other use
> > > > cases that require careful handling of ->cblist.len.
> > >
> > > Thank you for the ideas. I am trying something like this on top of this
> > > series based on the ideas. One thing I concerned a bit is if getting rid of
> > > the rcu_segcblist_xchg_len() function (which has memory barriers in them)
> > > causes issues in the hotplug path. I am now directly updating the length
> > > without additional memory barriers. I will test it more and try to reason
> > > more about it as well.
> >
> > In this particular case, the CPU-hotplug locks prevent rcu_barrier()
> > from running concurrently, so it should be OK. Is there an easy way
> > to make lockdep help us check this? Does lockdep_assert_cpus_held()
> > suffice, or is it too easily satisfied?
>
> Just to clarify, the race you mention is rcu_barrier() should not run
> concurrently with CPU hotplug operation.
>
> The reason is:
> rcu_barrier() checks whether the segmented list's length is 0 before sending
> IPIs for entraining a callback onto the destination CPU. But if it
> misunderstands the length's value due to a lack of memory barriers, it may
> not missing sending an IPI causing the barrier to fail. Please correct me if
> I'm wrong though.

That is the concern for code that is not part of a CPU-hotplug operation.

> Due to CPU hotplug locking, such race is impossible because both
> rcu_barrier() and the CPU migrating the callbacks holds it.

Exactly.

> I think the lockdep_assert_cpus_held() may suffice. I can add it to the
> rcu_segcblist_merge() function.

The main thing to check is that the following results in a lockdep splat:

cpus_read_lock();
lockdep_assert_cpus_held();

> BTW, I think we can simplify rcu_barrier() a bit and combine the tracepoint
> and rid one outer if clause (diff at end of email).

If you were already changing this function for some reason, OK. But you
are trading a pair of lines for a long line (though still less than the
new 100-character limit), so as a standalone patch I am left uninspired.

> > > }
> > >
> > > /*
> > > @@ -448,6 +419,7 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
> > > break;
> > > rclp->head = NULL;
> > > rclp->tail = &rclp->head;
> > > + rcu_segcblist_add_len(rsclp, rclp->len);
> >
> > Does there need to be a compensating action in rcu_do_batch(), or is
> > this the point of the "rcu_segcblist_add_len(rsclp, -(rclp->len));"
> > added to rcu_segcblist_extract_done_cbs() above?
> >
> > If so, a comment would be good.
>
> I think external compensation by rcu_do_batch is not needed, its best to
> handle all cblist.len modifications internally in the segcblist API
> where possible.

At the very least, the segcblist function providing a negative-length
list needs to very carefully explain itself!!! ;-)

> > > /*
> > > @@ -463,6 +435,7 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
> > > rcu_segcblist_add_seglen(rsclp, RCU_NEXT_TAIL, rclp->len);
> > > WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rclp->head);
> > > WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], rclp->tail);
> > > + rcu_segcblist_add_len(rsclp, rclp->len);
> > > }
> > >
> > > /*
> > > @@ -601,16 +574,13 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
> > > {
> > > struct rcu_cblist donecbs;
> > > struct rcu_cblist pendcbs;
> > > - long src_len;
> > >
> > > rcu_cblist_init(&donecbs);
> > > rcu_cblist_init(&pendcbs);
> > >
> > > - src_len = rcu_segcblist_xchg_len(src_rsclp, 0);
> > > rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
> > > rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
> > >
> > > - rcu_segcblist_add_len(dst_rsclp, src_len);
> > > rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
> > > rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
> >
> > Can we now pair the corresponding _extract_ and _insert_ calls, thus
> > requiring only one rcu_cblist structure? This would drop two more lines
> > of code. Or would that break something?
>
> That could work as well, yes. But may not be worth adding another function to
> combine extract/insert operation, since the extract and insert operations are
> needed by rcutree and srcutree as well (other than hotplug).

I was thinking in terms of just reordering the calls to the existing
functions, though I clearly was having trouble counting. Like this:

struct rcu_cblist cbs;

rcu_cblist_init(&cbs);
rcu_segcblist_extract_done_cbs(src_rsclp, &cbs);
rcu_segcblist_insert_done_cbs(dst_rsclp, &cbs);
rcu_cblist_init(&cbs);
rcu_segcblist_insert_pend_cbs(dst_rsclp, &cbs);
rcu_segcblist_extract_pend_cbs(src_rsclp, &cbs);

Thanx, Paul

> thanks,
>
> - Joel
>
> ---8<-----------------------
>
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 16ad99a9ebba..274a1845ad38 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3856,17 +3856,16 @@ void rcu_barrier(void)
> if (cpu_is_offline(cpu) &&
> !rcu_segcblist_is_offloaded(&rdp->cblist))
> continue;
> - if (rcu_segcblist_n_cbs(&rdp->cblist) && cpu_online(cpu)) {
> - rcu_barrier_trace(TPS("OnlineQ"), cpu,
> - rcu_state.barrier_sequence);
> - smp_call_function_single(cpu, rcu_barrier_func, (void *)cpu, 1);
> - } else if (rcu_segcblist_n_cbs(&rdp->cblist) &&
> - cpu_is_offline(cpu)) {
> - rcu_barrier_trace(TPS("OfflineNoCBQ"), cpu,
> - rcu_state.barrier_sequence);
> - local_irq_disable();
> - rcu_barrier_func((void *)cpu);
> - local_irq_enable();
> + if (rcu_segcblist_n_cbs(&rdp->cblist)) {
> + rcu_barrier_trace(cpu_online(cpu) ? TPS("OnlineQ") : TPS("OfflineNoCBQ"),
> + cpu, rcu_state.barrier_sequence);
> + if (cpu_online(cpu)) {
> + smp_call_function_single(cpu, rcu_barrier_func, (void *)cpu, 1);
> + } else {
> + local_irq_disable();
> + rcu_barrier_func((void *)cpu);
> + local_irq_enable();
> + }
> } else if (cpu_is_offline(cpu)) {
> rcu_barrier_trace(TPS("OfflineNoCBNoQ"), cpu,
> rcu_state.barrier_sequence);