2023-12-27 17:47:49

by Joel Fernandes

[permalink] [raw]
Subject: [PATCH v3] srcu: Improve comments about acceleration leak

The comments added in commit 1ef990c4b36b ("srcu: No need to
advance/accelerate if no callback enqueued") are a bit confusing.
The comments are describing a scenario for code that was moved and is
no longer the way it was (snapshot after advancing). Improve the code
comments to reflect this and also document why acceleration can never
fail.

Cc: Frederic Weisbecker <[email protected]>
Cc: Neeraj Upadhyay <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
v1->v2: Fix typo in change log.
v2->v3: Improvement to acceleration comment.

kernel/rcu/srcutree.c | 24 ++++++++++++++++++++----
1 file changed, 20 insertions(+), 4 deletions(-)

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 0351a4e83529..051e149490d1 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -1234,11 +1234,20 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
if (rhp)
rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
/*
- * The snapshot for acceleration must be taken _before_ the read of the
- * current gp sequence used for advancing, otherwise advancing may fail
- * and acceleration may then fail too.
+ * It's crucial to capture the snapshot 's' for acceleration before
+ * reading the current gp_seq that is used for advancing. This is
+ * essential because if the acceleration snapshot is taken after a
+ * failed advancement attempt, there's a risk that a grace period may
+ * conclude and a new one may start in the interim. If the snapshot is
+ * captured after this sequence of events, the acceleration snapshot 's'
+ * could be excessively advanced, leading to acceleration failure.
+ * In such a scenario, an 'acceleration leak' can occur, where new
+ * callbacks become indefinitely stuck in the RCU_NEXT_TAIL segment.
+ * Also note that encountering advancing failures is a normal
+ * occurrence when the grace period for RCU_WAIT_TAIL is in progress.
*
- * This could happen if:
+ * To see this, consider the following events which occur if
+ * rcu_seq_snap() were to be called after advance:
*
* 1) The RCU_WAIT_TAIL segment has callbacks (gp_num = X + 4) and the
* RCU_NEXT_READY_TAIL also has callbacks (gp_num = X + 8).
@@ -1264,6 +1273,13 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
if (rhp) {
rcu_segcblist_advance(&sdp->srcu_cblist,
rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
+ /*
+ * Acceleration can never fail because the base current gp_seq
+ * used for acceleration is <= the value of gp_seq used for
+ * advancing. This means that RCU_NEXT_TAIL segment will
+ * always be able to be emptied by the acceleration into the
+ * RCU_NEXT_READY_TAIL or RCU_WAIT_TAIL segments.
+ */
WARN_ON_ONCE(!rcu_segcblist_accelerate(&sdp->srcu_cblist, s));
}
if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
--
2.43.0.472.g3155946c3a-goog



2023-12-29 23:10:59

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH v3] srcu: Improve comments about acceleration leak

On Wed, Dec 27, 2023 at 12:47:38PM -0500, Joel Fernandes (Google) wrote:
> The comments added in commit 1ef990c4b36b ("srcu: No need to
> advance/accelerate if no callback enqueued") are a bit confusing.
> The comments are describing a scenario for code that was moved and is
> no longer the way it was (snapshot after advancing). Improve the code
> comments to reflect this and also document why acceleration can never
> fail.
>
> Cc: Frederic Weisbecker <[email protected]>
> Cc: Neeraj Upadhyay <[email protected]>
> Reviewed-by: Frederic Weisbecker <[email protected]>
> Signed-off-by: Joel Fernandes (Google) <[email protected]>

Queued, thank you all!

Thanx, Paul

> ---
> v1->v2: Fix typo in change log.
> v2->v3: Improvement to acceleration comment.
>
> kernel/rcu/srcutree.c | 24 ++++++++++++++++++++----
> 1 file changed, 20 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
> index 0351a4e83529..051e149490d1 100644
> --- a/kernel/rcu/srcutree.c
> +++ b/kernel/rcu/srcutree.c
> @@ -1234,11 +1234,20 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
> if (rhp)
> rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp);
> /*
> - * The snapshot for acceleration must be taken _before_ the read of the
> - * current gp sequence used for advancing, otherwise advancing may fail
> - * and acceleration may then fail too.
> + * It's crucial to capture the snapshot 's' for acceleration before
> + * reading the current gp_seq that is used for advancing. This is
> + * essential because if the acceleration snapshot is taken after a
> + * failed advancement attempt, there's a risk that a grace period may
> + * conclude and a new one may start in the interim. If the snapshot is
> + * captured after this sequence of events, the acceleration snapshot 's'
> + * could be excessively advanced, leading to acceleration failure.
> + * In such a scenario, an 'acceleration leak' can occur, where new
> + * callbacks become indefinitely stuck in the RCU_NEXT_TAIL segment.
> + * Also note that encountering advancing failures is a normal
> + * occurrence when the grace period for RCU_WAIT_TAIL is in progress.
> *
> - * This could happen if:
> + * To see this, consider the following events which occur if
> + * rcu_seq_snap() were to be called after advance:
> *
> * 1) The RCU_WAIT_TAIL segment has callbacks (gp_num = X + 4) and the
> * RCU_NEXT_READY_TAIL also has callbacks (gp_num = X + 8).
> @@ -1264,6 +1273,13 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
> if (rhp) {
> rcu_segcblist_advance(&sdp->srcu_cblist,
> rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
> + /*
> + * Acceleration can never fail because the base current gp_seq
> + * used for acceleration is <= the value of gp_seq used for
> + * advancing. This means that RCU_NEXT_TAIL segment will
> + * always be able to be emptied by the acceleration into the
> + * RCU_NEXT_READY_TAIL or RCU_WAIT_TAIL segments.
> + */
> WARN_ON_ONCE(!rcu_segcblist_accelerate(&sdp->srcu_cblist, s));
> }
> if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
> --
> 2.43.0.472.g3155946c3a-goog
>