This is required for several identified use cases, one of them being tracing
how the segmented callback list changes. Tracing this has identified issues in
RCU code in the past.
From Paul:
Another use case is of course more accurately determining whether a given CPU's
large pile of callbacks can be best served by making grace periods go faster,
invoking callbacks more vigorously, or both. It should also be possible to
simplify some of the callback handling a bit, given that some of the unnatural
acts are due to there having been no per-batch counts.
Revision history:
v6: Fixed TREE04, and restored older logic to ensure rcu_barrier works.
v5: Various changes, bug fixes. Discovery of rcu_barrier issue.
v4: Restructured rcu_do_batch() and segcblist merging to avoid issues.
Fixed minor nit from Davidlohr.
v1->v3: minor nits.
(https://lore.kernel.org/lkml/[email protected]/)
Joel Fernandes (Google) (4):
rcu/tree: Make rcu_do_batch count how many callbacks were executed
rcu/segcblist: Add counters to segcblist datastructure
rcu/trace: Add tracing for how segcb list changes
rcu/segcblist: Remove useless rcupdate.h include
include/linux/rcu_segcblist.h | 2 +
include/trace/events/rcu.h | 25 ++++++
kernel/rcu/rcu_segcblist.c | 161 +++++++++++++++++++++++++---------
kernel/rcu/rcu_segcblist.h | 8 +-
kernel/rcu/tree.c | 18 ++--
5 files changed, 165 insertions(+), 49 deletions(-)
--
2.28.0.681.g6f77f65b4e-goog
Currently, rcu_do_batch() depends on the unsegmented callback list's len field
to know how many CBs are executed. This field counts down from 0 as CBs are
dequeued. It is possible that not all CBs could be run because of reaching
limits, in which case the remaining unexecuted callbacks are requeued in the
CPU's segcblist.
The number of callbacks that were not requeued is then the negative count (how
many CBs were run) stored in rcl->len, which has been counting down on every
dequeue. This negative count is then added to the per-CPU segmented callback
list's ->len to correct its count.
Such a design works against future efforts to track the length of each segment
of the segmented callback list. The reason is that
rcu_segcblist_extract_done_cbs() will be populating the unsegmented callback
list's length field (rcl->len) during extraction.
Also, the design of counting down from 0 is confusing and error-prone IMHO.
This commit therefore explicitly counts have many callbacks were executed in
rcu_do_batch() itself, and uses that to update the per-CPU segcb list's ->len
field, without relying on the negativity of rcl->len.
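For illustration, here is a heavily simplified sketch of the two counting
schemes (not the actual kernel code; the offloaded and need_resched() limit
checks are omitted):

	/* Old scheme: rcl.len counts down from zero as CBs are dequeued. */
	while ((rhp = rcu_cblist_dequeue(&rcl))) {	/* decrements rcl.len */
		rhp->func(rhp);				/* invoke the callback */
		if (-rcl.len >= bl)	/* -rcl.len == number executed so far */
			break;
	}
	rcu_segcblist_insert_done_cbs(&rdp->cblist, &rcl); /* requeue leftovers */
	rcu_segcblist_insert_count(&rdp->cblist, &rcl);	/* adds the negative rcl.len */

	/* New scheme: count explicitly; rcl.len is no longer relied upon. */
	count = 0;
	while ((rhp = rcu_cblist_dequeue(&rcl))) {
		count++;
		rhp->func(rhp);
		if (count >= bl)
			break;
	}
	rcu_segcblist_insert_done_cbs(&rdp->cblist, &rcl);
	rcu_segcblist_add_len(&rdp->cblist, -count);	/* subtract what ran */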
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
kernel/rcu/rcu_segcblist.c | 2 +-
kernel/rcu/rcu_segcblist.h | 1 +
kernel/rcu/tree.c | 9 ++++-----
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index 2d2a6b6b9dfb..bb246d8c6ef1 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -95,7 +95,7 @@ static void rcu_segcblist_set_len(struct rcu_segcblist *rsclp, long v)
* This increase is fully ordered with respect to the callers accesses
* both before and after.
*/
-static void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v)
+void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v)
{
#ifdef CONFIG_RCU_NOCB_CPU
smp_mb__before_atomic(); /* Up to the caller! */
diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
index 5c293afc07b8..b90725f81d77 100644
--- a/kernel/rcu/rcu_segcblist.h
+++ b/kernel/rcu/rcu_segcblist.h
@@ -76,6 +76,7 @@ static inline bool rcu_segcblist_restempty(struct rcu_segcblist *rsclp, int seg)
}
void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp);
+void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v);
void rcu_segcblist_init(struct rcu_segcblist *rsclp);
void rcu_segcblist_disable(struct rcu_segcblist *rsclp);
void rcu_segcblist_offload(struct rcu_segcblist *rsclp);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 7623128d0020..50af465729f4 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2427,7 +2427,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
rcu_segcblist_is_offloaded(&rdp->cblist);
struct rcu_head *rhp;
struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl);
- long bl, count;
+ long bl, count = 0;
long pending, tlimit = 0;
/* If no callbacks are ready, just return. */
@@ -2472,6 +2472,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
rcu_callback_t f;
+ count++;
debug_rcu_head_unqueue(rhp);
rcu_lock_acquire(&rcu_callback_map);
@@ -2485,9 +2486,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
/*
* Stop only if limit reached and CPU has something to do.
- * Note: The rcl structure counts down from zero.
*/
- if (-rcl.len >= bl && !offloaded &&
+ if (count >= bl && !offloaded &&
(need_resched() ||
(!is_idle_task(current) && !rcu_is_callbacks_kthread())))
break;
@@ -2510,7 +2510,6 @@ static void rcu_do_batch(struct rcu_data *rdp)
local_irq_save(flags);
rcu_nocb_lock(rdp);
- count = -rcl.len;
rdp->n_cbs_invoked += count;
trace_rcu_batch_end(rcu_state.name, count, !!rcl.head, need_resched(),
is_idle_task(current), rcu_is_callbacks_kthread());
@@ -2518,7 +2517,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
/* Update counts and requeue any remaining callbacks. */
rcu_segcblist_insert_done_cbs(&rdp->cblist, &rcl);
smp_mb(); /* List handling before counting for rcu_barrier(). */
- rcu_segcblist_insert_count(&rdp->cblist, &rcl);
+ rcu_segcblist_add_len(&rdp->cblist, -count);
/* Reinstate batch limit if we have worked down the excess. */
count = rcu_segcblist_n_cbs(&rdp->cblist);
--
2.28.0.681.g6f77f65b4e-goog
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
kernel/rcu/rcu_segcblist.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index df0f31e30947..b65ac8c85b56 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -10,7 +10,6 @@
#include <linux/types.h>
#include <linux/kernel.h>
#include <linux/interrupt.h>
-#include <linux/rcupdate.h>
#include "rcu_segcblist.h"
#include "rcu.h"
--
2.28.0.681.g6f77f65b4e-goog
Add counting of segment lengths of the segmented callback list.
This will be useful for a number of things, such as knowing how big the
ready-to-execute segment has gotten. The immediate benefit is the ability
to trace how the callbacks in the segmented callback list change.
Also, this patch removes hacks related to using donecbs's ->len field as a
temporary variable to save the segmented callback list's length. This cannot be
done anymore and is not needed.
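As a way to picture the new bookkeeping: whenever the list is in a stable
state (lock held, no callbacks temporarily extracted), the per-segment counts
are meant to sum to the existing ->len. A hypothetical debug helper along
these lines (illustrative only, not part of this series) could look like:

	/* Illustrative consistency check: seglen[] should sum to ->len. */
	static bool rcu_segcblist_seglen_consistent(struct rcu_segcblist *rsclp)
	{
		long sum = 0;
		int i;

		for (i = 0; i < RCU_CBLIST_NSEGS; i++)
			sum += rcu_segcblist_get_seglen(rsclp, i);
		return sum == rcu_segcblist_n_cbs(rsclp);
	}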
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
include/linux/rcu_segcblist.h | 2 +
kernel/rcu/rcu_segcblist.c | 124 +++++++++++++++++++++++-----------
kernel/rcu/rcu_segcblist.h | 2 -
3 files changed, 86 insertions(+), 42 deletions(-)
diff --git a/include/linux/rcu_segcblist.h b/include/linux/rcu_segcblist.h
index b36afe7b22c9..d462ae5e340a 100644
--- a/include/linux/rcu_segcblist.h
+++ b/include/linux/rcu_segcblist.h
@@ -69,8 +69,10 @@ struct rcu_segcblist {
unsigned long gp_seq[RCU_CBLIST_NSEGS];
#ifdef CONFIG_RCU_NOCB_CPU
atomic_long_t len;
+ atomic_long_t seglen[RCU_CBLIST_NSEGS];
#else
long len;
+ long seglen[RCU_CBLIST_NSEGS];
#endif
u8 enabled;
u8 offloaded;
diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index bb246d8c6ef1..0e6d19bd3de9 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -88,6 +88,62 @@ static void rcu_segcblist_set_len(struct rcu_segcblist *rsclp, long v)
#endif
}
+/* Get the length of a segment of the rcu_segcblist structure. */
+static long rcu_segcblist_get_seglen(struct rcu_segcblist *rsclp, int seg)
+{
+#ifdef CONFIG_RCU_NOCB_CPU
+ return atomic_long_read(&rsclp->seglen[seg]);
+#else
+ return READ_ONCE(rsclp->seglen[seg]);
+#endif
+}
+
+/* Set the length of a segment of the rcu_segcblist structure. */
+static void rcu_segcblist_set_seglen(struct rcu_segcblist *rsclp, int seg, long v)
+{
+#ifdef CONFIG_RCU_NOCB_CPU
+ atomic_long_set(&rsclp->seglen[seg], v);
+#else
+ WRITE_ONCE(rsclp->seglen[seg], v);
+#endif
+}
+
+/* Add to the length of a segment of the segmented callback list. */
+static void rcu_segcblist_add_seglen(struct rcu_segcblist *rsclp, int seg, long v)
+{
+#ifdef CONFIG_RCU_NOCB_CPU
+ smp_mb__before_atomic(); /* Up to the caller! */
+ atomic_long_add(v, &rsclp->seglen[seg]);
+ smp_mb__after_atomic(); /* Up to the caller! */
+#else
+ smp_mb(); /* Up to the caller! */
+ WRITE_ONCE(rsclp->seglen[seg], rsclp->seglen[seg] + v);
+ smp_mb(); /* Up to the caller! */
+#endif
+}
+
+/* Move from's segment length to to's segment. */
+static void rcu_segcblist_move_seglen(struct rcu_segcblist *rsclp, int from, int to)
+{
+ long len;
+
+ if (from == to)
+ return;
+
+ len = rcu_segcblist_get_seglen(rsclp, from);
+ if (!len)
+ return;
+
+ rcu_segcblist_add_seglen(rsclp, to, len);
+ rcu_segcblist_set_seglen(rsclp, from, 0);
+}
+
+/* Increment segment's length. */
+static void rcu_segcblist_inc_seglen(struct rcu_segcblist *rsclp, int seg)
+{
+ rcu_segcblist_add_seglen(rsclp, seg, 1);
+}
+
/*
* Increase the numeric length of an rcu_segcblist structure by the
* specified amount, which can be negative. This can cause the ->len
@@ -119,26 +175,6 @@ void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp)
rcu_segcblist_add_len(rsclp, 1);
}
-/*
- * Exchange the numeric length of the specified rcu_segcblist structure
- * with the specified value. This can cause the ->len field to disagree
- * with the actual number of callbacks on the structure. This exchange is
- * fully ordered with respect to the callers accesses both before and after.
- */
-static long rcu_segcblist_xchg_len(struct rcu_segcblist *rsclp, long v)
-{
-#ifdef CONFIG_RCU_NOCB_CPU
- return atomic_long_xchg(&rsclp->len, v);
-#else
- long ret = rsclp->len;
-
- smp_mb(); /* Up to the caller! */
- WRITE_ONCE(rsclp->len, v);
- smp_mb(); /* Up to the caller! */
- return ret;
-#endif
-}
-
/*
* Initialize an rcu_segcblist structure.
*/
@@ -149,8 +185,10 @@ void rcu_segcblist_init(struct rcu_segcblist *rsclp)
BUILD_BUG_ON(RCU_NEXT_TAIL + 1 != ARRAY_SIZE(rsclp->gp_seq));
BUILD_BUG_ON(ARRAY_SIZE(rsclp->tails) != ARRAY_SIZE(rsclp->gp_seq));
rsclp->head = NULL;
- for (i = 0; i < RCU_CBLIST_NSEGS; i++)
+ for (i = 0; i < RCU_CBLIST_NSEGS; i++) {
rsclp->tails[i] = &rsclp->head;
+ rcu_segcblist_set_seglen(rsclp, i, 0);
+ }
rcu_segcblist_set_len(rsclp, 0);
rsclp->enabled = 1;
}
@@ -245,6 +283,7 @@ void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
struct rcu_head *rhp)
{
rcu_segcblist_inc_len(rsclp);
+ rcu_segcblist_inc_seglen(rsclp, RCU_NEXT_TAIL);
smp_mb(); /* Ensure counts are updated before callback is enqueued. */
rhp->next = NULL;
WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rhp);
@@ -274,27 +313,13 @@ bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
for (i = RCU_NEXT_TAIL; i > RCU_DONE_TAIL; i--)
if (rsclp->tails[i] != rsclp->tails[i - 1])
break;
+ rcu_segcblist_inc_seglen(rsclp, i);
WRITE_ONCE(*rsclp->tails[i], rhp);
for (; i <= RCU_NEXT_TAIL; i++)
WRITE_ONCE(rsclp->tails[i], &rhp->next);
return true;
}
-/*
- * Extract only the counts from the specified rcu_segcblist structure,
- * and place them in the specified rcu_cblist structure. This function
- * supports both callback orphaning and invocation, hence the separation
- * of counts and callbacks. (Callbacks ready for invocation must be
- * orphaned and adopted separately from pending callbacks, but counts
- * apply to all callbacks. Locking must be used to make sure that
- * both orphaned-callbacks lists are consistent.)
- */
-void rcu_segcblist_extract_count(struct rcu_segcblist *rsclp,
- struct rcu_cblist *rclp)
-{
- rclp->len = rcu_segcblist_xchg_len(rsclp, 0);
-}
-
/*
* Extract only those callbacks ready to be invoked from the specified
* rcu_segcblist structure and place them in the specified rcu_cblist
@@ -307,6 +332,7 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
if (!rcu_segcblist_ready_cbs(rsclp))
return; /* Nothing to do. */
+ rclp->len = rcu_segcblist_get_seglen(rsclp, RCU_DONE_TAIL);
*rclp->tail = rsclp->head;
WRITE_ONCE(rsclp->head, *rsclp->tails[RCU_DONE_TAIL]);
WRITE_ONCE(*rsclp->tails[RCU_DONE_TAIL], NULL);
@@ -314,6 +340,7 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
for (i = RCU_CBLIST_NSEGS - 1; i >= RCU_DONE_TAIL; i--)
if (rsclp->tails[i] == rsclp->tails[RCU_DONE_TAIL])
WRITE_ONCE(rsclp->tails[i], &rsclp->head);
+ rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
}
/*
@@ -330,11 +357,16 @@ void rcu_segcblist_extract_pend_cbs(struct rcu_segcblist *rsclp,
if (!rcu_segcblist_pend_cbs(rsclp))
return; /* Nothing to do. */
+ rclp->len = rcu_segcblist_get_seglen(rsclp, RCU_WAIT_TAIL) +
+ rcu_segcblist_get_seglen(rsclp, RCU_NEXT_READY_TAIL) +
+ rcu_segcblist_get_seglen(rsclp, RCU_NEXT_TAIL);
*rclp->tail = *rsclp->tails[RCU_DONE_TAIL];
rclp->tail = rsclp->tails[RCU_NEXT_TAIL];
WRITE_ONCE(*rsclp->tails[RCU_DONE_TAIL], NULL);
- for (i = RCU_DONE_TAIL + 1; i < RCU_CBLIST_NSEGS; i++)
+ for (i = RCU_DONE_TAIL + 1; i < RCU_CBLIST_NSEGS; i++) {
WRITE_ONCE(rsclp->tails[i], rsclp->tails[RCU_DONE_TAIL]);
+ rcu_segcblist_set_seglen(rsclp, i, 0);
+ }
}
/*
@@ -345,7 +377,6 @@ void rcu_segcblist_insert_count(struct rcu_segcblist *rsclp,
struct rcu_cblist *rclp)
{
rcu_segcblist_add_len(rsclp, rclp->len);
- rclp->len = 0;
}
/*
@@ -359,6 +390,7 @@ void rcu_segcblist_insert_done_cbs(struct rcu_segcblist *rsclp,
if (!rclp->head)
return; /* No callbacks to move. */
+ rcu_segcblist_add_seglen(rsclp, RCU_DONE_TAIL, rclp->len);
*rclp->tail = rsclp->head;
WRITE_ONCE(rsclp->head, rclp->head);
for (i = RCU_DONE_TAIL; i < RCU_CBLIST_NSEGS; i++)
@@ -379,6 +411,8 @@ void rcu_segcblist_insert_pend_cbs(struct rcu_segcblist *rsclp,
{
if (!rclp->head)
return; /* Nothing to do. */
+
+ rcu_segcblist_add_seglen(rsclp, RCU_NEXT_TAIL, rclp->len);
WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rclp->head);
WRITE_ONCE(rsclp->tails[RCU_NEXT_TAIL], rclp->tail);
}
@@ -403,6 +437,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
if (ULONG_CMP_LT(seq, rsclp->gp_seq[i]))
break;
WRITE_ONCE(rsclp->tails[RCU_DONE_TAIL], rsclp->tails[i]);
+ rcu_segcblist_move_seglen(rsclp, i, RCU_DONE_TAIL);
}
/* If no callbacks moved, nothing more need be done. */
@@ -423,6 +458,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
if (rsclp->tails[j] == rsclp->tails[RCU_NEXT_TAIL])
break; /* No more callbacks. */
WRITE_ONCE(rsclp->tails[j], rsclp->tails[i]);
+ rcu_segcblist_move_seglen(rsclp, i, j);
rsclp->gp_seq[j] = rsclp->gp_seq[i];
}
}
@@ -444,7 +480,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
*/
bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq)
{
- int i;
+ int i, j;
WARN_ON_ONCE(!rcu_segcblist_is_enabled(rsclp));
if (rcu_segcblist_restempty(rsclp, RCU_DONE_TAIL))
@@ -487,6 +523,10 @@ bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq)
if (rcu_segcblist_restempty(rsclp, i) || ++i >= RCU_NEXT_TAIL)
return false;
+ /* Accounting: everything below i is about to get merged into i. */
+ for (j = i + 1; j <= RCU_NEXT_TAIL; j++)
+ rcu_segcblist_move_seglen(rsclp, j, i);
+
/*
* Merge all later callbacks, including newly arrived callbacks,
* into the segment located by the for-loop above. Assign "seq"
@@ -516,11 +556,15 @@ void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
rcu_cblist_init(&donecbs);
rcu_cblist_init(&pendcbs);
- rcu_segcblist_extract_count(src_rsclp, &donecbs);
+
+ rcu_segcblist_set_len(src_rsclp, 0);
rcu_segcblist_extract_done_cbs(src_rsclp, &donecbs);
rcu_segcblist_extract_pend_cbs(src_rsclp, &pendcbs);
+
rcu_segcblist_insert_count(dst_rsclp, &donecbs);
+ rcu_segcblist_insert_count(dst_rsclp, &pendcbs);
rcu_segcblist_insert_done_cbs(dst_rsclp, &donecbs);
rcu_segcblist_insert_pend_cbs(dst_rsclp, &pendcbs);
+
rcu_segcblist_init(src_rsclp);
}
diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
index b90725f81d77..3e0eb1056ae9 100644
--- a/kernel/rcu/rcu_segcblist.h
+++ b/kernel/rcu/rcu_segcblist.h
@@ -89,8 +89,6 @@ void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
struct rcu_head *rhp);
bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
struct rcu_head *rhp);
-void rcu_segcblist_extract_count(struct rcu_segcblist *rsclp,
- struct rcu_cblist *rclp);
void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
struct rcu_cblist *rclp);
void rcu_segcblist_extract_pend_cbs(struct rcu_segcblist *rsclp,
--
2.28.0.681.g6f77f65b4e-goog
Track how the segcb list changes before/after acceleration, during
queuing and during dequeuing.
This has proved useful to discover an optimization to avoid unwanted GP
requests when there are no callbacks accelerated. The overhead is minimal as
each segment's length is now stored in the respective segment.
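Given the TP_printk() format in the patch, each event would render roughly
like the following line in the trace buffer (the counts and gp_seq values
here are invented purely for illustration):

	rcu_segcb: SegCbDequeued cb_count: (DONE=0, WAIT=12, NEXT_READY=3, NEXT=1) gp_seq: (DONE=2920, WAIT=2924, NEXT_READY=2928, NEXT=2928)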
Signed-off-by: Joel Fernandes (Google) <[email protected]>
---
include/trace/events/rcu.h | 25 +++++++++++++++++++++++++
kernel/rcu/rcu_segcblist.c | 34 ++++++++++++++++++++++++++++++++++
kernel/rcu/rcu_segcblist.h | 5 +++++
kernel/rcu/tree.c | 9 +++++++++
4 files changed, 73 insertions(+)
diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index 155b5cb43cfd..7b84df3c95df 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -505,6 +505,31 @@ TRACE_EVENT_RCU(rcu_callback,
__entry->qlen)
);
+TRACE_EVENT_RCU(rcu_segcb,
+
+ TP_PROTO(const char *ctx, int *cb_count, unsigned long *gp_seq),
+
+ TP_ARGS(ctx, cb_count, gp_seq),
+
+ TP_STRUCT__entry(
+ __field(const char *, ctx)
+ __array(int, cb_count, 4)
+ __array(unsigned long, gp_seq, 4)
+ ),
+
+ TP_fast_assign(
+ __entry->ctx = ctx;
+ memcpy(__entry->cb_count, cb_count, 4 * sizeof(int));
+ memcpy(__entry->gp_seq, gp_seq, 4 * sizeof(unsigned long));
+ ),
+
+ TP_printk("%s cb_count: (DONE=%d, WAIT=%d, NEXT_READY=%d, NEXT=%d) "
+ "gp_seq: (DONE=%lu, WAIT=%lu, NEXT_READY=%lu, NEXT=%lu)", __entry->ctx,
+ __entry->cb_count[0], __entry->cb_count[1], __entry->cb_count[2], __entry->cb_count[3],
+ __entry->gp_seq[0], __entry->gp_seq[1], __entry->gp_seq[2], __entry->gp_seq[3])
+
+);
+
/*
* Tracepoint for the registration of a single RCU callback of the special
* kvfree() form. The first argument is the RCU type, the second argument
diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
index 0e6d19bd3de9..df0f31e30947 100644
--- a/kernel/rcu/rcu_segcblist.c
+++ b/kernel/rcu/rcu_segcblist.c
@@ -13,6 +13,7 @@
#include <linux/rcupdate.h>
#include "rcu_segcblist.h"
+#include "rcu.h"
/* Initialize simple callback list. */
void rcu_cblist_init(struct rcu_cblist *rclp)
@@ -343,6 +344,39 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
}
+/*
+ * Return how many CBs each segment has along with their gp_seq values.
+ *
+ * This function is O(N) where N is the number of callbacks. Only used from
+ * tracing code which is usually disabled in production.
+ */
+#ifdef CONFIG_RCU_TRACE
+static void rcu_segcblist_countseq(struct rcu_segcblist *rsclp,
+ int cbcount[RCU_CBLIST_NSEGS],
+ unsigned long gpseq[RCU_CBLIST_NSEGS])
+{
+ int i;
+
+ for (i = 0; i < RCU_CBLIST_NSEGS; i++)
+ cbcount[i] = 0;
+
+ for (i = 0; i < RCU_CBLIST_NSEGS; i++) {
+ cbcount[i] = rcu_segcblist_get_seglen(rsclp, i);
+ gpseq[i] = rsclp->gp_seq[i];
+ }
+}
+
+void trace_rcu_segcb_list(struct rcu_segcblist *rsclp, char *context)
+{
+ int cbs[RCU_CBLIST_NSEGS];
+ unsigned long gps[RCU_CBLIST_NSEGS];
+
+ rcu_segcblist_countseq(rsclp, cbs, gps);
+
+ trace_rcu_segcb(context, cbs, gps);
+}
+#endif
+
/*
* Extract only those callbacks still pending (not yet ready to be
* invoked) from the specified rcu_segcblist structure and place them in
diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
index 3e0eb1056ae9..15c10d30f88c 100644
--- a/kernel/rcu/rcu_segcblist.h
+++ b/kernel/rcu/rcu_segcblist.h
@@ -103,3 +103,8 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq);
bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq);
void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
struct rcu_segcblist *src_rsclp);
+#ifdef CONFIG_RCU_TRACE
+void trace_rcu_segcb_list(struct rcu_segcblist *rsclp, char *context);
+#else
+#define trace_rcu_segcb_list(...)
+#endif
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 50af465729f4..e3381ff67fc6 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -1492,6 +1492,8 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp)
if (!rcu_segcblist_pend_cbs(&rdp->cblist))
return false;
+ trace_rcu_segcb_list(&rdp->cblist, "SegCbPreAcc");
+
/*
* Callbacks are often registered with incomplete grace-period
* information. Something about the fact that getting exact
@@ -1512,6 +1514,8 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp)
else
trace_rcu_grace_period(rcu_state.name, gp_seq_req, TPS("AccReadyCB"));
+ trace_rcu_segcb_list(&rdp->cblist, "SegCbPostAcc");
+
return ret;
}
@@ -2469,6 +2473,9 @@ static void rcu_do_batch(struct rcu_data *rdp)
/* Invoke callbacks. */
tick_dep_set_task(current, TICK_DEP_BIT_RCU);
rhp = rcu_cblist_dequeue(&rcl);
+
+ trace_rcu_segcb_list(&rdp->cblist, "SegCbDequeued");
+
for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
rcu_callback_t f;
@@ -2982,6 +2989,8 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
trace_rcu_callback(rcu_state.name, head,
rcu_segcblist_n_cbs(&rdp->cblist));
+ trace_rcu_segcb_list(&rdp->cblist, "SegCBQueued");
+
/* Go handle any RCU core processing required. */
if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
unlikely(rcu_segcblist_is_offloaded(&rdp->cblist))) {
--
2.28.0.681.g6f77f65b4e-goog
On Wed, Sep 23, 2020 at 11:22:07AM -0400, Joel Fernandes (Google) wrote:
>
> This is required for several usecases identified. One of them being tracing how
> the segmented callback list changes. Tracing this has identified issues in RCU
> code in the past.
>
> From Paul:
> Another use case is of course more accurately determining whether a given CPU's
> large pile of callbacks can be best served by making grace periods go faster,
> invoking callbacks more vigorously, or both. It should also be possible to
> simplify some of the callback handling a bit, given that some of the unnatural
> acts are due to there having been no per-batch counts.
>
> Revision history:
> v6: Fixed TREE04, and restored older logic to ensure rcu_barrier works.
>
> v5: Various changes, bug fixes. Discovery of rcu_barrier issue.
>
> v4: Restructured rcu_do_batch() and segcblist merging to avoid issues.
> Fixed minor nit from Davidlohr.
> v1->v3: minor nits.
> (https://lore.kernel.org/lkml/[email protected]/)
Looking much improved, thank you!
I have placed these on branch rcu/test in -rcu for testing and inspection.
I had to apply them at b94e6291a208 ("torture: Force weak-hashed pointers
on console log") and cherry-pick them onto the "dev" branch, but it looks
like things worked nicely.
Thanx, Paul
> Joel Fernandes (Google) (4):
> rcu/tree: Make rcu_do_batch count how many callbacks were executed
> rcu/segcblist: Add counters to segcblist datastructure
> rcu/trace: Add tracing for how segcb list changes
> rcu/segcblist: Remove useless rcupdate.h include
>
> include/linux/rcu_segcblist.h | 2 +
> include/trace/events/rcu.h | 25 ++++++
> kernel/rcu/rcu_segcblist.c | 161 +++++++++++++++++++++++++---------
> kernel/rcu/rcu_segcblist.h | 8 +-
> kernel/rcu/tree.c | 18 ++--
> 5 files changed, 165 insertions(+), 49 deletions(-)
>
> --
> 2.28.0.681.g6f77f65b4e-goog
>
On Wed, Sep 23, 2020 at 11:22:08AM -0400, Joel Fernandes (Google) wrote:
> Currently, rcu_do_batch() depends on the unsegmented callback list's len field
> to know how many CBs are executed. This fields counts down from 0 as CBs are
> dequeued. It is possible that all CBs could not be run because of reaching
> limits in which case the remaining unexecuted callbacks are requeued in the
> CPU's segcblist.
>
> The number of callbacks that were not requeued are then the negative count (how
> many CBs were run) stored in the rcl->len which has been counting down on every
> dequeue. This negative count is then added to the per-cpu segmented callback
> list's to correct its count.
>
> Such a design works against future efforts to track the length of each segment
> of the segmented callback list. The reason is because
> rcu_segcblist_extract_done_cbs() will be populating the unsegmented callback
> list's length field (rcl->len) during extraction.
> Also, the design of counting down from 0 is confusing and error-prone IMHO.
Right :)
>
> This commit therefore explicitly counts have many callbacks were executed in
s/have/how
> rcu_do_batch() itself, and uses that to update the per-CPU segcb list's ->len
> field, without relying on the negativity of rcl->len.
>
> Signed-off-by: Joel Fernandes (Google) <[email protected]>
Reviewed-by: Frederic Weisbecker <[email protected]>
Thanks.
On Fri, Oct 9, 2020 at 4:14 PM Frederic Weisbecker <[email protected]> wrote:
>
> On Wed, Sep 23, 2020 at 11:22:08AM -0400, Joel Fernandes (Google) wrote:
> > Currently, rcu_do_batch() depends on the unsegmented callback list's len field
> > to know how many CBs are executed. This fields counts down from 0 as CBs are
> > dequeued. It is possible that all CBs could not be run because of reaching
> > limits in which case the remaining unexecuted callbacks are requeued in the
> > CPU's segcblist.
> >
> > The number of callbacks that were not requeued are then the negative count (how
> > many CBs were run) stored in the rcl->len which has been counting down on every
> > dequeue. This negative count is then added to the per-cpu segmented callback
> > list's to correct its count.
> >
> > Such a design works against future efforts to track the length of each segment
> > of the segmented callback list. The reason is because
> > rcu_segcblist_extract_done_cbs() will be populating the unsegmented callback
> > list's length field (rcl->len) during extraction.
> > Also, the design of counting down from 0 is confusing and error-prone IMHO.
>
> Right :)
:)
> > This commit therefore explicitly counts have many callbacks were executed in
>
> s/have/how
>
> > rcu_do_batch() itself, and uses that to update the per-CPU segcb list's ->len
> > field, without relying on the negativity of rcl->len.
> >
> > Signed-off-by: Joel Fernandes (Google) <[email protected]>
>
> Reviewed-by: Frederic Weisbecker <[email protected]>
Thanks! Paul, would it be OK to make the minor fixup s/have/how/ that
Frederic pointed out?
- Joel
(Due to COVID issues at home, I'm intermittently working so advance
apologies for slow replies.)
On Sun, Oct 11, 2020 at 09:35:37AM -0700, Joel Fernandes wrote:
> On Fri, Oct 9, 2020 at 4:14 PM Frederic Weisbecker <[email protected]> wrote:
> >
> > On Wed, Sep 23, 2020 at 11:22:08AM -0400, Joel Fernandes (Google) wrote:
> > > Currently, rcu_do_batch() depends on the unsegmented callback list's len field
> > > to know how many CBs are executed. This fields counts down from 0 as CBs are
> > > dequeued. It is possible that all CBs could not be run because of reaching
> > > limits in which case the remaining unexecuted callbacks are requeued in the
> > > CPU's segcblist.
> > >
> > > The number of callbacks that were not requeued are then the negative count (how
> > > many CBs were run) stored in the rcl->len which has been counting down on every
> > > dequeue. This negative count is then added to the per-cpu segmented callback
> > > list's to correct its count.
> > >
> > > Such a design works against future efforts to track the length of each segment
> > > of the segmented callback list. The reason is because
> > > rcu_segcblist_extract_done_cbs() will be populating the unsegmented callback
> > > list's length field (rcl->len) during extraction.
> > > Also, the design of counting down from 0 is confusing and error-prone IMHO.
> >
> > Right :)
>
> :)
>
> > > This commit therefore explicitly counts have many callbacks were executed in
> >
> > s/have/how
> >
> > > rcu_do_batch() itself, and uses that to update the per-CPU segcb list's ->len
> > > field, without relying on the negativity of rcl->len.
> > >
> > > Signed-off-by: Joel Fernandes (Google) <[email protected]>
> >
> > Reviewed-by: Frederic Weisbecker <[email protected]>
>
> Thanks! Paul would be Ok to make the minor fixup s/have/how/ that
> Frederic pointed?
But of course! I was waiting until Frederic gets them all reviewed,
with an eye to applying and wordsmithing them as a set.
> - Joel
> (Due to COVID issues at home, I'm intermittently working so advance
> apologies for slow replies.)
And I hope that this is going as well as it possibly can!
Thanx, Paul
On Wed, Sep 23, 2020 at 11:22:09AM -0400, Joel Fernandes (Google) wrote:
> +/* Return number of callbacks in a segment of the segmented callback list. */
> +static void rcu_segcblist_add_seglen(struct rcu_segcblist *rsclp, int seg, long v)
> +{
> +#ifdef CONFIG_RCU_NOCB_CPU
> + smp_mb__before_atomic(); /* Up to the caller! */
> + atomic_long_add(v, &rsclp->seglen[seg]);
> + smp_mb__after_atomic(); /* Up to the caller! */
> +#else
> + smp_mb(); /* Up to the caller! */
> + WRITE_ONCE(rsclp->seglen[seg], rsclp->seglen[seg] + v);
> + smp_mb(); /* Up to the caller! */
> +#endif
> +}
I know that these "Up to the caller" comments come from the existing len
functions but perhaps we should explain a bit more against what it is ordering
and what it pairs to.
Also why do we need one before _and_ after?
And finally do we have the same ordering requirements than the unsegmented len
field?
> +
> +/* Move from's segment length to to's segment. */
> +static void rcu_segcblist_move_seglen(struct rcu_segcblist *rsclp, int from, int to)
> +{
> + long len;
> +
> + if (from == to)
> + return;
> +
> + len = rcu_segcblist_get_seglen(rsclp, from);
> + if (!len)
> + return;
> +
> + rcu_segcblist_add_seglen(rsclp, to, len);
> + rcu_segcblist_set_seglen(rsclp, from, 0);
> +}
> +
[...]
> @@ -245,6 +283,7 @@ void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
> struct rcu_head *rhp)
> {
> rcu_segcblist_inc_len(rsclp);
> + rcu_segcblist_inc_seglen(rsclp, RCU_NEXT_TAIL);
> smp_mb(); /* Ensure counts are updated before callback is enqueued. */
Since inc_len and even now inc_seglen have two full barriers embracing the add up,
we can probably spare the above smp_mb()?
> rhp->next = NULL;
> WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rhp);
> @@ -274,27 +313,13 @@ bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
> for (i = RCU_NEXT_TAIL; i > RCU_DONE_TAIL; i--)
> if (rsclp->tails[i] != rsclp->tails[i - 1])
> break;
> + rcu_segcblist_inc_seglen(rsclp, i);
> WRITE_ONCE(*rsclp->tails[i], rhp);
> for (; i <= RCU_NEXT_TAIL; i++)
> WRITE_ONCE(rsclp->tails[i], &rhp->next);
> return true;
> }
>
> @@ -403,6 +437,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
> if (ULONG_CMP_LT(seq, rsclp->gp_seq[i]))
> break;
> WRITE_ONCE(rsclp->tails[RCU_DONE_TAIL], rsclp->tails[i]);
> + rcu_segcblist_move_seglen(rsclp, i, RCU_DONE_TAIL);
Do we still need the same amount of full barriers contained in add() called by move() here?
It's called in the reverse order (write queue then len) than usual. If I trust the comment
in rcu_segcblist_enqueue(), the point of the barrier is to make the length visible before
the new callback for rcu_barrier() (although that concerns len and not seglen). But here
above, the unsegmented length doesn't change. I could understand a write barrier between
add_seglen(x, i) and set_seglen(0, RCU_DONE_TAIL) but I couldn't find a paired couple either.
> }
>
> /* If no callbacks moved, nothing more need be done. */
> @@ -423,6 +458,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
> if (rsclp->tails[j] == rsclp->tails[RCU_NEXT_TAIL])
> break; /* No more callbacks. */
> WRITE_ONCE(rsclp->tails[j], rsclp->tails[i]);
> + rcu_segcblist_move_seglen(rsclp, i, j);
Same question here (feel free to reply "same answer" :o)
Thanks!
On 9/23/2020 8:52 PM, Joel Fernandes (Google) wrote:
> Track how the segcb list changes before/after acceleration, during
> queuing and during dequeuing.
>
> This has proved useful to discover an optimization to avoid unwanted GP
> requests when there are no callbacks accelerated. The overhead is minimal as
> each segment's length is now stored in the respective segment.
>
> Signed-off-by: Joel Fernandes (Google) <[email protected]>
> ---
> include/trace/events/rcu.h | 25 +++++++++++++++++++++++++
> kernel/rcu/rcu_segcblist.c | 34 ++++++++++++++++++++++++++++++++++
> kernel/rcu/rcu_segcblist.h | 5 +++++
> kernel/rcu/tree.c | 9 +++++++++
> 4 files changed, 73 insertions(+)
>
> diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
> index 155b5cb43cfd..7b84df3c95df 100644
> --- a/include/trace/events/rcu.h
> +++ b/include/trace/events/rcu.h
> @@ -505,6 +505,31 @@ TRACE_EVENT_RCU(rcu_callback,
> __entry->qlen)
> );
>
> +TRACE_EVENT_RCU(rcu_segcb,
> +
> + TP_PROTO(const char *ctx, int *cb_count, unsigned long *gp_seq),
> +
> + TP_ARGS(ctx, cb_count, gp_seq),
> +
> + TP_STRUCT__entry(
> + __field(const char *, ctx)
> + __array(int, cb_count, 4)
> + __array(unsigned long, gp_seq, 4)
Use RCU_CBLIST_NSEGS in place of 4 ?
> + ),
> +
> + TP_fast_assign(
> + __entry->ctx = ctx;
> + memcpy(__entry->cb_count, cb_count, 4 * sizeof(int));
> + memcpy(__entry->gp_seq, gp_seq, 4 * sizeof(unsigned long));
> + ),
> +
> + TP_printk("%s cb_count: (DONE=%d, WAIT=%d, NEXT_READY=%d, NEXT=%d) "
> + "gp_seq: (DONE=%lu, WAIT=%lu, NEXT_READY=%lu, NEXT=%lu)", __entry->ctx,
> + __entry->cb_count[0], __entry->cb_count[1], __entry->cb_count[2], __entry->cb_count[3],
> + __entry->gp_seq[0], __entry->gp_seq[1], __entry->gp_seq[2], __entry->gp_seq[3])
> +
> +);
> +
> /*
> * Tracepoint for the registration of a single RCU callback of the special
> * kvfree() form. The first argument is the RCU type, the second argument
> diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> index 0e6d19bd3de9..df0f31e30947 100644
> --- a/kernel/rcu/rcu_segcblist.c
> +++ b/kernel/rcu/rcu_segcblist.c
> @@ -13,6 +13,7 @@
> #include <linux/rcupdate.h>
>
> #include "rcu_segcblist.h"
> +#include "rcu.h"
>
> /* Initialize simple callback list. */
> void rcu_cblist_init(struct rcu_cblist *rclp)
> @@ -343,6 +344,39 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
> rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
> }
>
> +/*
> + * Return how many CBs each segment along with their gp_seq values.
> + *
> + * This function is O(N) where N is the number of callbacks. Only used from
N is number of segments?
> + * tracing code which is usually disabled in production.
> + */
> +#ifdef CONFIG_RCU_TRACE
> +static void rcu_segcblist_countseq(struct rcu_segcblist *rsclp,
> + int cbcount[RCU_CBLIST_NSEGS],
> + unsigned long gpseq[RCU_CBLIST_NSEGS])
> +{
> + int i;
> +
> + for (i = 0; i < RCU_CBLIST_NSEGS; i++)
> + cbcount[i] = 0;
> +
What is the reason for initializing to 0?
> + for (i = 0; i < RCU_CBLIST_NSEGS; i++) {
> + cbcount[i] = rcu_segcblist_get_seglen(rsclp, i);
> + gpseq[i] = rsclp->gp_seq[i];
> + }
> +}
> +
> +void trace_rcu_segcb_list(struct rcu_segcblist *rsclp, char *context)
> +{
> + int cbs[RCU_CBLIST_NSEGS];
> + unsigned long gps[RCU_CBLIST_NSEGS];
> +
> + rcu_segcblist_countseq(rsclp, cbs, gps);
> +
> + trace_rcu_segcb(context, cbs, gps);
> +}
> +#endif
> +
> /*
> * Extract only those callbacks still pending (not yet ready to be
> * invoked) from the specified rcu_segcblist structure and place them in
> diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
> index 3e0eb1056ae9..15c10d30f88c 100644
> --- a/kernel/rcu/rcu_segcblist.h
> +++ b/kernel/rcu/rcu_segcblist.h
> @@ -103,3 +103,8 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq);
> bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq);
> void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
> struct rcu_segcblist *src_rsclp);
> +#ifdef CONFIG_RCU_TRACE
> +void trace_rcu_segcb_list(struct rcu_segcblist *rsclp, char *context);
> +#else
> +#define trace_rcu_segcb_list(...)
> +#endif
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 50af465729f4..e3381ff67fc6 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -1492,6 +1492,8 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp)
> if (!rcu_segcblist_pend_cbs(&rdp->cblist))
> return false;
>
> + trace_rcu_segcb_list(&rdp->cblist, "SegCbPreAcc");
Use TPS("SegCbPreAcc") ?
Thanks
Neeraj
> +
> /*
> * Callbacks are often registered with incomplete grace-period
> * information. Something about the fact that getting exact
> @@ -1512,6 +1514,8 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp)
> else
> trace_rcu_grace_period(rcu_state.name, gp_seq_req, TPS("AccReadyCB"));
>
> + trace_rcu_segcb_list(&rdp->cblist, "SegCbPostAcc");
> +
> return ret;
> }
>
> @@ -2469,6 +2473,9 @@ static void rcu_do_batch(struct rcu_data *rdp)
> /* Invoke callbacks. */
> tick_dep_set_task(current, TICK_DEP_BIT_RCU);
> rhp = rcu_cblist_dequeue(&rcl);
> +
> + trace_rcu_segcb_list(&rdp->cblist, "SegCbDequeued");
> +
> for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
> rcu_callback_t f;
>
> @@ -2982,6 +2989,8 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
> trace_rcu_callback(rcu_state.name, head,
> rcu_segcblist_n_cbs(&rdp->cblist));
>
> + trace_rcu_segcb_list(&rdp->cblist, "SegCBQueued");
> +
> /* Go handle any RCU core processing required. */
> if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
> unlikely(rcu_segcblist_is_offloaded(&rdp->cblist))) {
>
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation
On Wed, Oct 14, 2020 at 08:36:16PM +0530, Neeraj Upadhyay wrote:
>
>
> On 9/23/2020 8:52 PM, Joel Fernandes (Google) wrote:
> > Currently, rcu_do_batch() depends on the unsegmented callback list's len field
> > to know how many CBs are executed. This fields counts down from 0 as CBs are
> > dequeued. It is possible that all CBs could not be run because of reaching
> > limits in which case the remaining unexecuted callbacks are requeued in the
> > CPU's segcblist.
> >
> > The number of callbacks that were not requeued are then the negative count (how
> > many CBs were run) stored in the rcl->len which has been counting down on every
> > dequeue. This negative count is then added to the per-cpu segmented callback
> > list's to correct its count.
> >
> > Such a design works against future efforts to track the length of each segment
> > of the segmented callback list. The reason is because
> > rcu_segcblist_extract_done_cbs() will be populating the unsegmented callback
> > list's length field (rcl->len) during extraction.
> >
> > Also, the design of counting down from 0 is confusing and error-prone IMHO.
> >
> > This commit therefore explicitly counts have many callbacks were executed in
> > rcu_do_batch() itself, and uses that to update the per-CPU segcb list's ->len
> > field, without relying on the negativity of rcl->len.
> >
> > Signed-off-by: Joel Fernandes (Google) <[email protected]>
> > ---
> > kernel/rcu/rcu_segcblist.c | 2 +-
> > kernel/rcu/rcu_segcblist.h | 1 +
> > kernel/rcu/tree.c | 9 ++++-----
> > 3 files changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> > index 2d2a6b6b9dfb..bb246d8c6ef1 100644
> > --- a/kernel/rcu/rcu_segcblist.c
> > +++ b/kernel/rcu/rcu_segcblist.c
> > @@ -95,7 +95,7 @@ static void rcu_segcblist_set_len(struct rcu_segcblist *rsclp, long v)
> > * This increase is fully ordered with respect to the callers accesses
> > * both before and after.
> > */
> > -static void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v)
> > +void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v)
> > {
> > #ifdef CONFIG_RCU_NOCB_CPU
> > smp_mb__before_atomic(); /* Up to the caller! */
> > diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
> > index 5c293afc07b8..b90725f81d77 100644
> > --- a/kernel/rcu/rcu_segcblist.h
> > +++ b/kernel/rcu/rcu_segcblist.h
> > @@ -76,6 +76,7 @@ static inline bool rcu_segcblist_restempty(struct rcu_segcblist *rsclp, int seg)
> > }
> > void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp);
> > +void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v);
> > void rcu_segcblist_init(struct rcu_segcblist *rsclp);
> > void rcu_segcblist_disable(struct rcu_segcblist *rsclp);
> > void rcu_segcblist_offload(struct rcu_segcblist *rsclp);
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 7623128d0020..50af465729f4 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2427,7 +2427,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
> > rcu_segcblist_is_offloaded(&rdp->cblist);
> > struct rcu_head *rhp;
> > struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl);
> > - long bl, count;
> > + long bl, count = 0;
> > long pending, tlimit = 0;
> > /* If no callbacks are ready, just return. */
> > @@ -2472,6 +2472,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
> > for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
> > rcu_callback_t f;
> > + count++;
> > debug_rcu_head_unqueue(rhp);
> > rcu_lock_acquire(&rcu_callback_map);
> > @@ -2485,9 +2486,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
> > /*
> > * Stop only if limit reached and CPU has something to do.
> > - * Note: The rcl structure counts down from zero.
> > */
> > - if (-rcl.len >= bl && !offloaded &&
> > + if (count >= bl && !offloaded &&
> > (need_resched() ||
> > (!is_idle_task(current) && !rcu_is_callbacks_kthread())))
> > break;
>
> Update below usage of -rcl.len also?
>
> if (likely((-rcl.len & 31) || local_clock() < tlimit))
Yes, you are right. I need to change that as well, will do. Thanks!
thanks,
- Joel
On Tue, Oct 13, 2020 at 01:20:08AM +0200, Frederic Weisbecker wrote:
> On Wed, Sep 23, 2020 at 11:22:09AM -0400, Joel Fernandes (Google) wrote:
> > +/* Return number of callbacks in a segment of the segmented callback list. */
> > +static void rcu_segcblist_add_seglen(struct rcu_segcblist *rsclp, int seg, long v)
> > +{
> > +#ifdef CONFIG_RCU_NOCB_CPU
> > + smp_mb__before_atomic(); /* Up to the caller! */
> > + atomic_long_add(v, &rsclp->seglen[seg]);
> > + smp_mb__after_atomic(); /* Up to the caller! */
> > +#else
> > + smp_mb(); /* Up to the caller! */
> > + WRITE_ONCE(rsclp->seglen[seg], rsclp->seglen[seg] + v);
> > + smp_mb(); /* Up to the caller! */
> > +#endif
> > +}
>
> I know that these "Up to the caller" comments come from the existing len
> functions but perhaps we should explain a bit more against what it is ordering
> and what it pairs to.
>
> Also why do we need one before _and_ after?
>
> And finally do we have the same ordering requirements than the unsegmented len
> field?
Hi Paul and Neeraj,
Would be nice to discuss this on the call. I actually borrowed the memory
barriers from add_len() just to be safe, but I think Frederic's points are
valid. Would be nice if we could go over all the use cases and discuss which
memory barriers are needed. Thanks for your help!
Another thought: inc_len() calls add_len() which already has smp_mb(), so
callers of inc_len also do not need memory barriers I think.
thanks,
- Joel
> > +
> > +/* Move from's segment length to to's segment. */
> > +static void rcu_segcblist_move_seglen(struct rcu_segcblist *rsclp, int from, int to)
> > +{
> > + long len;
> > +
> > + if (from == to)
> > + return;
> > +
> > + len = rcu_segcblist_get_seglen(rsclp, from);
> > + if (!len)
> > + return;
> > +
> > + rcu_segcblist_add_seglen(rsclp, to, len);
> > + rcu_segcblist_set_seglen(rsclp, from, 0);
> > +}
> > +
> [...]
> > @@ -245,6 +283,7 @@ void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
> > struct rcu_head *rhp)
> > {
> > rcu_segcblist_inc_len(rsclp);
> > + rcu_segcblist_inc_seglen(rsclp, RCU_NEXT_TAIL);
> > smp_mb(); /* Ensure counts are updated before callback is enqueued. */
>
> Since inc_len and even now inc_seglen have two full barriers embracing the add up,
> we can probably spare the above smp_mb()?
>
> > rhp->next = NULL;
> > WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rhp);
> > @@ -274,27 +313,13 @@ bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
> > for (i = RCU_NEXT_TAIL; i > RCU_DONE_TAIL; i--)
> > if (rsclp->tails[i] != rsclp->tails[i - 1])
> > break;
> > + rcu_segcblist_inc_seglen(rsclp, i);
> > WRITE_ONCE(*rsclp->tails[i], rhp);
> > for (; i <= RCU_NEXT_TAIL; i++)
> > WRITE_ONCE(rsclp->tails[i], &rhp->next);
> > return true;
> > }
> >
> > @@ -403,6 +437,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
> > if (ULONG_CMP_LT(seq, rsclp->gp_seq[i]))
> > break;
> > WRITE_ONCE(rsclp->tails[RCU_DONE_TAIL], rsclp->tails[i]);
> > + rcu_segcblist_move_seglen(rsclp, i, RCU_DONE_TAIL);
>
> Do we still need the same amount of full barriers contained in add() called by move() here?
> It's called in the reverse order (write queue then len) than usual. If I trust the comment
> in rcu_segcblist_enqueue(), the point of the barrier is to make the length visible before
> the new callback for rcu_barrier() (although that concerns len and not seglen). But here
> above, the unsegmented length doesn't change. I could understand a write barrier between
> add_seglen(x, i) and set_seglen(0, RCU_DONE_TAIL) but I couldn't find a paired couple either.
>
> > }
> >
> > /* If no callbacks moved, nothing more need be done. */
> > @@ -423,6 +458,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
> > if (rsclp->tails[j] == rsclp->tails[RCU_NEXT_TAIL])
> > break; /* No more callbacks. */
> > WRITE_ONCE(rsclp->tails[j], rsclp->tails[i]);
> > + rcu_segcblist_move_seglen(rsclp, i, j);
>
> Same question here (feel free to reply "same answer" :o)
>
> Thanks!
On 9/23/2020 8:52 PM, Joel Fernandes (Google) wrote:
> Currently, rcu_do_batch() depends on the unsegmented callback list's len field
> to know how many CBs are executed. This fields counts down from 0 as CBs are
> dequeued. It is possible that all CBs could not be run because of reaching
> limits in which case the remaining unexecuted callbacks are requeued in the
> CPU's segcblist.
>
> The number of callbacks that were not requeued are then the negative count (how
> many CBs were run) stored in the rcl->len which has been counting down on every
> dequeue. This negative count is then added to the per-cpu segmented callback
> list's to correct its count.
>
> Such a design works against future efforts to track the length of each segment
> of the segmented callback list. The reason is because
> rcu_segcblist_extract_done_cbs() will be populating the unsegmented callback
> list's length field (rcl->len) during extraction.
>
> Also, the design of counting down from 0 is confusing and error-prone IMHO.
>
> This commit therefore explicitly counts have many callbacks were executed in
> rcu_do_batch() itself, and uses that to update the per-CPU segcb list's ->len
> field, without relying on the negativity of rcl->len.
>
> Signed-off-by: Joel Fernandes (Google) <[email protected]>
> ---
> kernel/rcu/rcu_segcblist.c | 2 +-
> kernel/rcu/rcu_segcblist.h | 1 +
> kernel/rcu/tree.c | 9 ++++-----
> 3 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> index 2d2a6b6b9dfb..bb246d8c6ef1 100644
> --- a/kernel/rcu/rcu_segcblist.c
> +++ b/kernel/rcu/rcu_segcblist.c
> @@ -95,7 +95,7 @@ static void rcu_segcblist_set_len(struct rcu_segcblist *rsclp, long v)
> * This increase is fully ordered with respect to the callers accesses
> * both before and after.
> */
> -static void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v)
> +void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v)
> {
> #ifdef CONFIG_RCU_NOCB_CPU
> smp_mb__before_atomic(); /* Up to the caller! */
> diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
> index 5c293afc07b8..b90725f81d77 100644
> --- a/kernel/rcu/rcu_segcblist.h
> +++ b/kernel/rcu/rcu_segcblist.h
> @@ -76,6 +76,7 @@ static inline bool rcu_segcblist_restempty(struct rcu_segcblist *rsclp, int seg)
> }
>
> void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp);
> +void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v);
> void rcu_segcblist_init(struct rcu_segcblist *rsclp);
> void rcu_segcblist_disable(struct rcu_segcblist *rsclp);
> void rcu_segcblist_offload(struct rcu_segcblist *rsclp);
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 7623128d0020..50af465729f4 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2427,7 +2427,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
> rcu_segcblist_is_offloaded(&rdp->cblist);
> struct rcu_head *rhp;
> struct rcu_cblist rcl = RCU_CBLIST_INITIALIZER(rcl);
> - long bl, count;
> + long bl, count = 0;
> long pending, tlimit = 0;
>
> /* If no callbacks are ready, just return. */
> @@ -2472,6 +2472,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
> for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
> rcu_callback_t f;
>
> + count++;
> debug_rcu_head_unqueue(rhp);
>
> rcu_lock_acquire(&rcu_callback_map);
> @@ -2485,9 +2486,8 @@ static void rcu_do_batch(struct rcu_data *rdp)
>
> /*
> * Stop only if limit reached and CPU has something to do.
> - * Note: The rcl structure counts down from zero.
> */
> - if (-rcl.len >= bl && !offloaded &&
> + if (count >= bl && !offloaded &&
> (need_resched() ||
> (!is_idle_task(current) && !rcu_is_callbacks_kthread())))
> break;
Update below usage of -rcl.len also?
if (likely((-rcl.len & 31) || local_clock() < tlimit))
Thanks
Neeraj
> @@ -2510,7 +2510,6 @@ static void rcu_do_batch(struct rcu_data *rdp)
>
> local_irq_save(flags);
> rcu_nocb_lock(rdp);
> - count = -rcl.len;
> rdp->n_cbs_invoked += count;
> trace_rcu_batch_end(rcu_state.name, count, !!rcl.head, need_resched(),
> is_idle_task(current), rcu_is_callbacks_kthread());
> @@ -2518,7 +2517,7 @@ static void rcu_do_batch(struct rcu_data *rdp)
> /* Update counts and requeue any remaining callbacks. */
> rcu_segcblist_insert_done_cbs(&rdp->cblist, &rcl);
> smp_mb(); /* List handling before counting for rcu_barrier(). */
> - rcu_segcblist_insert_count(&rdp->cblist, &rcl);
> + rcu_segcblist_add_len(&rdp->cblist, -count);
>
> /* Reinstate batch limit if we have worked down the excess. */
> count = rcu_segcblist_n_cbs(&rdp->cblist);
>
--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a
member of the Code Aurora Forum, hosted by The Linux Foundation
On Tue, Oct 13, 2020 at 01:20:08AM +0200, Frederic Weisbecker wrote:
> On Wed, Sep 23, 2020 at 11:22:09AM -0400, Joel Fernandes (Google) wrote:
> > +/* Return number of callbacks in a segment of the segmented callback list. */
> > +static void rcu_segcblist_add_seglen(struct rcu_segcblist *rsclp, int seg, long v)
> > +{
> > +#ifdef CONFIG_RCU_NOCB_CPU
> > + smp_mb__before_atomic(); /* Up to the caller! */
> > + atomic_long_add(v, &rsclp->seglen[seg]);
> > + smp_mb__after_atomic(); /* Up to the caller! */
> > +#else
> > + smp_mb(); /* Up to the caller! */
> > + WRITE_ONCE(rsclp->seglen[seg], rsclp->seglen[seg] + v);
> > + smp_mb(); /* Up to the caller! */
> > +#endif
> > +}
>
> I know that these "Up to the caller" comments come from the existing len
> functions but perhaps we should explain a bit more against what it is ordering
> and what it pairs to.
Sure.
> Also why do we need one before _and_ after?
I removed these memory barriers since they should not be needed; I will
update it this way for v7.
> And finally do we have the same ordering requirements than the unsegmented len
> field?
Do you mean ordering for rsclp->seglen? Yes, we need not have ordering
for that since there are no races AFAICS (all accesses happen with either IRQs
disabled or the nocb lock held for the offloaded case). If you meant
something else like rcl->len, let me know. AFAICS, we don't have ordering
needs for those. Further, current readers of ->seglen are only for tracing.
->seglen does not influence rcu_barrier yet.
> > +/* Move from's segment length to to's segment. */
> > +static void rcu_segcblist_move_seglen(struct rcu_segcblist *rsclp, int from, int to)
> > +{
> > + long len;
> > +
> > + if (from == to)
> > + return;
> > +
> > + len = rcu_segcblist_get_seglen(rsclp, from);
> > + if (!len)
> > + return;
> > +
> > + rcu_segcblist_add_seglen(rsclp, to, len);
> > + rcu_segcblist_set_seglen(rsclp, from, 0);
> > +}
> > +
> [...]
> > @@ -245,6 +283,7 @@ void rcu_segcblist_enqueue(struct rcu_segcblist *rsclp,
> > struct rcu_head *rhp)
> > {
> > rcu_segcblist_inc_len(rsclp);
> > + rcu_segcblist_inc_seglen(rsclp, RCU_NEXT_TAIL);
> > smp_mb(); /* Ensure counts are updated before callback is enqueued. */
>
> Since inc_len and even now inc_seglen have two full barriers embracing the add up,
> we can probably spare the above smp_mb()?
Good point, I'll remove it.
> > rhp->next = NULL;
> > WRITE_ONCE(*rsclp->tails[RCU_NEXT_TAIL], rhp);
> > @@ -274,27 +313,13 @@ bool rcu_segcblist_entrain(struct rcu_segcblist *rsclp,
> > for (i = RCU_NEXT_TAIL; i > RCU_DONE_TAIL; i--)
> > if (rsclp->tails[i] != rsclp->tails[i - 1])
> > break;
> > + rcu_segcblist_inc_seglen(rsclp, i);
> > WRITE_ONCE(*rsclp->tails[i], rhp);
> > for (; i <= RCU_NEXT_TAIL; i++)
> > WRITE_ONCE(rsclp->tails[i], &rhp->next);
> > return true;
> > }
> >
> > @@ -403,6 +437,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
> > if (ULONG_CMP_LT(seq, rsclp->gp_seq[i]))
> > break;
> > WRITE_ONCE(rsclp->tails[RCU_DONE_TAIL], rsclp->tails[i]);
> > + rcu_segcblist_move_seglen(rsclp, i, RCU_DONE_TAIL);
>
> Do we still need the same amount of full barriers contained in add() called by move() here?
> It's called in the reverse order (write queue then len) than usual. If I trust the comment
> in rcu_segcblist_enqueue(), the point of the barrier is to make the length visible before
> the new callback for rcu_barrier() (although that concerns len and not seglen). But here
> above, the unsegmented length doesn't change. I could understand a write barrier between
> add_seglen(x, i) and set_seglen(0, RCU_DONE_TAIL) but I couldn't find a paired couple either.
I'm guessing since I removed the memory barriers from seglen updates, this is
resolved.
> > }
> >
> > /* If no callbacks moved, nothing more need be done. */
> > @@ -423,6 +458,7 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq)
> > if (rsclp->tails[j] == rsclp->tails[RCU_NEXT_TAIL])
> > break; /* No more callbacks. */
> > WRITE_ONCE(rsclp->tails[j], rsclp->tails[i]);
> > + rcu_segcblist_move_seglen(rsclp, i, j);
>
> Same question here (feel free to reply "same answer" :o)
Same answer :P
So based on these and other comments, I will update the patches and send them
out shortly.
thanks,
- Joel
On Wed, Oct 14, 2020 at 08:52:17PM +0530, Neeraj Upadhyay wrote:
>
>
> On 9/23/2020 8:52 PM, Joel Fernandes (Google) wrote:
> > Track how the segcb list changes before/after acceleration, during
> > queuing and during dequeuing.
> >
> > This has proved useful to discover an optimization to avoid unwanted GP
> > requests when there are no callbacks accelerated. The overhead is minimal as
> > each segment's length is now stored in the respective segment.
> >
> > Signed-off-by: Joel Fernandes (Google) <[email protected]>
> > ---
> > include/trace/events/rcu.h | 25 +++++++++++++++++++++++++
> > kernel/rcu/rcu_segcblist.c | 34 ++++++++++++++++++++++++++++++++++
> > kernel/rcu/rcu_segcblist.h | 5 +++++
> > kernel/rcu/tree.c | 9 +++++++++
> > 4 files changed, 73 insertions(+)
> >
> > diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
> > index 155b5cb43cfd..7b84df3c95df 100644
> > --- a/include/trace/events/rcu.h
> > +++ b/include/trace/events/rcu.h
> > @@ -505,6 +505,31 @@ TRACE_EVENT_RCU(rcu_callback,
> > __entry->qlen)
> > );
> > +TRACE_EVENT_RCU(rcu_segcb,
> > +
> > + TP_PROTO(const char *ctx, int *cb_count, unsigned long *gp_seq),
> > +
> > + TP_ARGS(ctx, cb_count, gp_seq),
> > +
> > + TP_STRUCT__entry(
> > + __field(const char *, ctx)
> > + __array(int, cb_count, 4)
> > + __array(unsigned long, gp_seq, 4)
>
> Use RCU_CBLIST_NSEGS in place of 4 ?
Done.
> > + ),
> > +
> > + TP_fast_assign(
> > + __entry->ctx = ctx;
> > + memcpy(__entry->cb_count, cb_count, 4 * sizeof(int));
> > + memcpy(__entry->gp_seq, gp_seq, 4 * sizeof(unsigned long));
> > + ),
> > +
> > + TP_printk("%s cb_count: (DONE=%d, WAIT=%d, NEXT_READY=%d, NEXT=%d) "
> > + "gp_seq: (DONE=%lu, WAIT=%lu, NEXT_READY=%lu, NEXT=%lu)", __entry->ctx,
> > + __entry->cb_count[0], __entry->cb_count[1], __entry->cb_count[2], __entry->cb_count[3],
> > + __entry->gp_seq[0], __entry->gp_seq[1], __entry->gp_seq[2], __entry->gp_seq[3])
> > +
> > +);
> > +
> > /*
> > * Tracepoint for the registration of a single RCU callback of the special
> > * kvfree() form. The first argument is the RCU type, the second argument
> > diff --git a/kernel/rcu/rcu_segcblist.c b/kernel/rcu/rcu_segcblist.c
> > index 0e6d19bd3de9..df0f31e30947 100644
> > --- a/kernel/rcu/rcu_segcblist.c
> > +++ b/kernel/rcu/rcu_segcblist.c
> > @@ -13,6 +13,7 @@
> > #include <linux/rcupdate.h>
> > #include "rcu_segcblist.h"
> > +#include "rcu.h"
> > /* Initialize simple callback list. */
> > void rcu_cblist_init(struct rcu_cblist *rclp)
> > @@ -343,6 +344,39 @@ void rcu_segcblist_extract_done_cbs(struct rcu_segcblist *rsclp,
> > rcu_segcblist_set_seglen(rsclp, RCU_DONE_TAIL, 0);
> > }
> > +/*
> > + * Return how many CBs each segment along with their gp_seq values.
> > + *
> > + * This function is O(N) where N is the number of callbacks. Only used from
>
> N is number of segments?
Yes, will fix.
> > + * tracing code which is usually disabled in production.
> > + */
> > +#ifdef CONFIG_RCU_TRACE
> > +static void rcu_segcblist_countseq(struct rcu_segcblist *rsclp,
> > + int cbcount[RCU_CBLIST_NSEGS],
> > + unsigned long gpseq[RCU_CBLIST_NSEGS])
> > +{
> > + int i;
> > +
> > + for (i = 0; i < RCU_CBLIST_NSEGS; i++)
> > + cbcount[i] = 0;
> > +
>
> What is the reason for initializing to 0?
You are right, not needed. I'll remove.
> > + for (i = 0; i < RCU_CBLIST_NSEGS; i++) {
> > + cbcount[i] = rcu_segcblist_get_seglen(rsclp, i);
> > + gpseq[i] = rsclp->gp_seq[i];
> > + }
> > +}
> > +
> > +void trace_rcu_segcb_list(struct rcu_segcblist *rsclp, char *context)
> > +{
> > + int cbs[RCU_CBLIST_NSEGS];
> > + unsigned long gps[RCU_CBLIST_NSEGS];
> > +
> > + rcu_segcblist_countseq(rsclp, cbs, gps);
> > +
> > + trace_rcu_segcb(context, cbs, gps);
> > +}
> > +#endif
> > +
> > /*
> > * Extract only those callbacks still pending (not yet ready to be
> > * invoked) from the specified rcu_segcblist structure and place them in
> > diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
> > index 3e0eb1056ae9..15c10d30f88c 100644
> > --- a/kernel/rcu/rcu_segcblist.h
> > +++ b/kernel/rcu/rcu_segcblist.h
> > @@ -103,3 +103,8 @@ void rcu_segcblist_advance(struct rcu_segcblist *rsclp, unsigned long seq);
> > bool rcu_segcblist_accelerate(struct rcu_segcblist *rsclp, unsigned long seq);
> > void rcu_segcblist_merge(struct rcu_segcblist *dst_rsclp,
> > struct rcu_segcblist *src_rsclp);
> > +#ifdef CONFIG_RCU_TRACE
> > +void trace_rcu_segcb_list(struct rcu_segcblist *rsclp, char *context);
> > +#else
> > +#define trace_rcu_segcb_list(...)
> > +#endif
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 50af465729f4..e3381ff67fc6 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -1492,6 +1492,8 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp)
> > if (!rcu_segcblist_pend_cbs(&rdp->cblist))
> > return false;
> > + trace_rcu_segcb_list(&rdp->cblist, "SegCbPreAcc");
>
> Use TPS("SegCbPreAcc") ?
Fixed, thanks!
thanks,
- Joel
>
>
> Thanks
> Neeraj
>
> > +
> > /*
> > * Callbacks are often registered with incomplete grace-period
> > * information. Something about the fact that getting exact
> > @@ -1512,6 +1514,8 @@ static bool rcu_accelerate_cbs(struct rcu_node *rnp, struct rcu_data *rdp)
> > else
> > trace_rcu_grace_period(rcu_state.name, gp_seq_req, TPS("AccReadyCB"));
> > + trace_rcu_segcb_list(&rdp->cblist, "SegCbPostAcc");
> > +
> > return ret;
> > }
> > @@ -2469,6 +2473,9 @@ static void rcu_do_batch(struct rcu_data *rdp)
> > /* Invoke callbacks. */
> > tick_dep_set_task(current, TICK_DEP_BIT_RCU);
> > rhp = rcu_cblist_dequeue(&rcl);
> > +
> > + trace_rcu_segcb_list(&rdp->cblist, "SegCbDequeued");
> > +
> > for (; rhp; rhp = rcu_cblist_dequeue(&rcl)) {
> > rcu_callback_t f;
> > @@ -2982,6 +2989,8 @@ __call_rcu(struct rcu_head *head, rcu_callback_t func)
> > trace_rcu_callback(rcu_state.name, head,
> > rcu_segcblist_n_cbs(&rdp->cblist));
> > + trace_rcu_segcb_list(&rdp->cblist, "SegCBQueued");
> > +
> > /* Go handle any RCU core processing required. */
> > if (IS_ENABLED(CONFIG_RCU_NOCB_CPU) &&
> > unlikely(rcu_segcblist_is_offloaded(&rdp->cblist))) {
> >
>
> --
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of
> the Code Aurora Forum, hosted by The Linux Foundation