The handle value used to report an event being cleared by dev_dbg() is
incorrect due to a post increment of the payload handle index.
Delay the increment of the index until after the print. Also add the
debugging for event processing which was useful in finding this bug.
To: Davidlohr Bueso <[email protected]>
To: Jonathan Cameron <[email protected]>
To: Dave Jiang <[email protected]>
To: Alison Schofield <[email protected]>
To: Vishal Verma <[email protected]>
To: Dan Williams <[email protected]>
Cc: Jonathan Cameron <[email protected]>
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Ira Weiny <[email protected]>
---
NOTE: This does fix a bug in the patch referenced below. However, I
don't think that warrants back porting because this is only a debug
print.
Fixes: 6ebe28f9ec72 ("cxl/mem: Read, trace, and clear events on driver load")
---
drivers/cxl/core/mbox.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index d6d067fbee97..f052d5f174ee 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -882,9 +882,10 @@ static int cxl_clear_event_record(struct cxl_memdev_state *mds,
*/
i = 0;
for (cnt = 0; cnt < total; cnt++) {
- payload->handles[i++] = get_pl->records[cnt].hdr.handle;
+ payload->handles[i] = get_pl->records[cnt].hdr.handle;
dev_dbg(mds->cxlds.dev, "Event log '%d': Clearing %u\n", log,
le16_to_cpu(payload->handles[i]));
+ i++;
if (i == max_handles) {
payload->nr_recs = i;
@@ -946,9 +947,13 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
if (!nr_rec)
break;
- for (i = 0; i < nr_rec; i++)
+ for (i = 0; i < nr_rec; i++) {
+ dev_dbg(dev, "Event log %d: processing handle %u\n",
+ type,
+ le16_to_cpu(payload->records[i].hdr.handle));
cxl_event_trace_record(cxlmd, type,
&payload->records[i]);
+ }
if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
trace_cxl_overflow(cxlmd, type, payload);
---
base-commit: 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4
change-id: 20230731-cxl-fix-clear-event-debug-print-3b57da0e956c
Best regards,
--
Ira Weiny <[email protected]>
On Mon, 2023-07-31 at 15:12 -0700, Dan Williams wrote:
> Ira Weiny wrote:
> > The handle value used to report an event being cleared by dev_dbg() is
> > incorrect due to a post increment of the payload handle index.
> >
> > Delay the increment of the index until after the print. Also add the
> > debugging for event processing which was useful in finding this bug.
> >
> > To: Davidlohr Bueso <[email protected]>
> > To: Jonathan Cameron <[email protected]>
> > To: Dave Jiang <[email protected]>
> > To: Alison Schofield <[email protected]>
> > To: Vishal Verma <[email protected]>
> > To: Dan Williams <[email protected]>
>
> Is this some new process recommendation to use "To:", I would only
> expect Cc: for maintainers.
>
> > Cc: Jonathan Cameron <[email protected]>
>
> Jonathan gets listed twice because?
>
> > Cc: [email protected]
> > Cc: [email protected]
>
> I assume this is because b4 takes its address from the patch itself?
>
> It just feels a bit too noisy.
Yes - but the better way to do this is to add the To: and Cc: lines in
b4 prep --edit-cover, even for a single patch branch. That way b4 uses
them to figure out where to send/cc, but other doesn't add them to
commit messages (or even the cover letter in case there is one).
On 7/31/23 13:52, Ira Weiny wrote:
> The handle value used to report an event being cleared by dev_dbg() is
> incorrect due to a post increment of the payload handle index.
>
> Delay the increment of the index until after the print. Also add the
> debugging for event processing which was useful in finding this bug.
>
> To: Davidlohr Bueso <[email protected]>
> To: Jonathan Cameron <[email protected]>
> To: Dave Jiang <[email protected]>
> To: Alison Schofield <[email protected]>
> To: Vishal Verma <[email protected]>
> To: Dan Williams <[email protected]>
> Cc: Jonathan Cameron <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Ira Weiny <[email protected]>
Reviewed-by: Dave Jiang <[email protected]>
> ---
> NOTE: This does fix a bug in the patch referenced below. However, I
> don't think that warrants back porting because this is only a debug
> print.
>
> Fixes: 6ebe28f9ec72 ("cxl/mem: Read, trace, and clear events on driver load")
> ---
> drivers/cxl/core/mbox.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index d6d067fbee97..f052d5f174ee 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -882,9 +882,10 @@ static int cxl_clear_event_record(struct cxl_memdev_state *mds,
> */
> i = 0;
> for (cnt = 0; cnt < total; cnt++) {
> - payload->handles[i++] = get_pl->records[cnt].hdr.handle;
> + payload->handles[i] = get_pl->records[cnt].hdr.handle;
> dev_dbg(mds->cxlds.dev, "Event log '%d': Clearing %u\n", log,
> le16_to_cpu(payload->handles[i]));
> + i++;
>
> if (i == max_handles) {
> payload->nr_recs = i;
> @@ -946,9 +947,13 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
> if (!nr_rec)
> break;
>
> - for (i = 0; i < nr_rec; i++)
> + for (i = 0; i < nr_rec; i++) {
> + dev_dbg(dev, "Event log %d: processing handle %u\n",
> + type,
> + le16_to_cpu(payload->records[i].hdr.handle));
> cxl_event_trace_record(cxlmd, type,
> &payload->records[i]);
> + }
>
> if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
> trace_cxl_overflow(cxlmd, type, payload);
>
> ---
> base-commit: 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4
> change-id: 20230731-cxl-fix-clear-event-debug-print-3b57da0e956c
>
> Best regards,
Ira Weiny wrote:
> The handle value used to report an event being cleared by dev_dbg() is
> incorrect due to a post increment of the payload handle index.
>
> Delay the increment of the index until after the print. Also add the
> debugging for event processing which was useful in finding this bug.
>
> To: Davidlohr Bueso <[email protected]>
> To: Jonathan Cameron <[email protected]>
> To: Dave Jiang <[email protected]>
> To: Alison Schofield <[email protected]>
> To: Vishal Verma <[email protected]>
> To: Dan Williams <[email protected]>
Is this some new process recommendation to use "To:", I would only
expect Cc: for maintainers.
> Cc: Jonathan Cameron <[email protected]>
Jonathan gets listed twice because?
> Cc: [email protected]
> Cc: [email protected]
I assume this is because b4 takes its address from the patch itself?
It just feels a bit too noisy.
> Signed-off-by: Ira Weiny <[email protected]>
> ---
> NOTE: This does fix a bug in the patch referenced below. However, I
> don't think that warrants back porting because this is only a debug
> print.
I don't think that's a reason not to flag for backport. Smaller things
than this have been backported, and if it saves someone getting confused
in the field its worth it.
>
> Fixes: 6ebe28f9ec72 ("cxl/mem: Read, trace, and clear events on driver load")
> ---
> drivers/cxl/core/mbox.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index d6d067fbee97..f052d5f174ee 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -882,9 +882,10 @@ static int cxl_clear_event_record(struct cxl_memdev_state *mds,
> */
> i = 0;
> for (cnt = 0; cnt < total; cnt++) {
> - payload->handles[i++] = get_pl->records[cnt].hdr.handle;
> + payload->handles[i] = get_pl->records[cnt].hdr.handle;
> dev_dbg(mds->cxlds.dev, "Event log '%d': Clearing %u\n", log,
> le16_to_cpu(payload->handles[i]));
> + i++;
>
> if (i == max_handles) {
> payload->nr_recs = i;
> @@ -946,9 +947,13 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
> if (!nr_rec)
> break;
>
> - for (i = 0; i < nr_rec; i++)
> + for (i = 0; i < nr_rec; i++) {
> + dev_dbg(dev, "Event log %d: processing handle %u\n",
> + type,
> + le16_to_cpu(payload->records[i].hdr.handle));
> cxl_event_trace_record(cxlmd, type,
> &payload->records[i]);
This looks unrelated to the fix.
> + }
>
> if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
> trace_cxl_overflow(cxlmd, type, payload);
>
> ---
> base-commit: 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4
> change-id: 20230731-cxl-fix-clear-event-debug-print-3b57da0e956c
>
> Best regards,
> --
> Ira Weiny <[email protected]>
>
On Mon, Jul 31, 2023 at 01:52:27PM -0700, Ira Weiny wrote:
Looks like this is rolling for the distro list, so..
Please be more specific than this:
[PATCH] cxl/mbox: Fix debug message print
Something like this is more useful to folks scanning the one-liners:
[PATCH] cxl/mbox: Use correct handle in events debug message
> The handle value used to report an event being cleared by dev_dbg() is
> incorrect due to a post increment of the payload handle index.
>
> Delay the increment of the index until after the print. Also add the
> debugging for event processing which was useful in finding this bug.
>
"Also" always smells like something that should be a separate patch.
I guess you could rewrite the commit message and keep it in one patch:
[PATCH] cxl/mbox: Improve event dev_dbg() messages
Alison
> To: Davidlohr Bueso <[email protected]>
> To: Jonathan Cameron <[email protected]>
> To: Dave Jiang <[email protected]>
> To: Alison Schofield <[email protected]>
> To: Vishal Verma <[email protected]>
> To: Dan Williams <[email protected]>
> Cc: Jonathan Cameron <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Signed-off-by: Ira Weiny <[email protected]>
> ---
> NOTE: This does fix a bug in the patch referenced below. However, I
> don't think that warrants back porting because this is only a debug
> print.
>
> Fixes: 6ebe28f9ec72 ("cxl/mem: Read, trace, and clear events on driver load")
> ---
> drivers/cxl/core/mbox.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> index d6d067fbee97..f052d5f174ee 100644
> --- a/drivers/cxl/core/mbox.c
> +++ b/drivers/cxl/core/mbox.c
> @@ -882,9 +882,10 @@ static int cxl_clear_event_record(struct cxl_memdev_state *mds,
> */
> i = 0;
> for (cnt = 0; cnt < total; cnt++) {
> - payload->handles[i++] = get_pl->records[cnt].hdr.handle;
> + payload->handles[i] = get_pl->records[cnt].hdr.handle;
> dev_dbg(mds->cxlds.dev, "Event log '%d': Clearing %u\n", log,
> le16_to_cpu(payload->handles[i]));
> + i++;
>
> if (i == max_handles) {
> payload->nr_recs = i;
> @@ -946,9 +947,13 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
> if (!nr_rec)
> break;
>
> - for (i = 0; i < nr_rec; i++)
> + for (i = 0; i < nr_rec; i++) {
> + dev_dbg(dev, "Event log %d: processing handle %u\n",
> + type,
> + le16_to_cpu(payload->records[i].hdr.handle));
> cxl_event_trace_record(cxlmd, type,
> &payload->records[i]);
> + }
>
> if (payload->flags & CXL_GET_EVENT_FLAG_OVERFLOW)
> trace_cxl_overflow(cxlmd, type, payload);
>
> ---
> base-commit: 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4
> change-id: 20230731-cxl-fix-clear-event-debug-print-3b57da0e956c
>
> Best regards,
> --
> Ira Weiny <[email protected]>
>
Alison Schofield wrote:
> On Mon, Jul 31, 2023 at 01:52:27PM -0700, Ira Weiny wrote:
>
> Looks like this is rolling for the distro list, so..
>
> Please be more specific than this:
> [PATCH] cxl/mbox: Fix debug message print
>
> Something like this is more useful to folks scanning the one-liners:
> [PATCH] cxl/mbox: Use correct handle in events debug message
That is better thanks!
>
> > The handle value used to report an event being cleared by dev_dbg() is
> > incorrect due to a post increment of the payload handle index.
> >
> > Delay the increment of the index until after the print. Also add the
> > debugging for event processing which was useful in finding this bug.
> >
>
> "Also" always smells like something that should be a separate patch.
It could be but I did not see this as something which warranted its own
patch as it lead me to fix the bug. Since Dan feels the fix should be
backported that changes my attitude.
>
> I guess you could rewrite the commit message and keep it in one patch:
> [PATCH] cxl/mbox: Improve event dev_dbg() messages
Except that misses that this is a bug fix.
Ira
Dan Williams wrote:
> Ira Weiny wrote:
> > The handle value used to report an event being cleared by dev_dbg() is
> > incorrect due to a post increment of the payload handle index.
> >
> > Delay the increment of the index until after the print. Also add the
> > debugging for event processing which was useful in finding this bug.
> >
> > To: Davidlohr Bueso <[email protected]>
> > To: Jonathan Cameron <[email protected]>
> > To: Dave Jiang <[email protected]>
> > To: Alison Schofield <[email protected]>
> > To: Vishal Verma <[email protected]>
> > To: Dan Williams <[email protected]>
>
> Is this some new process recommendation to use "To:", I would only
> expect Cc: for maintainers.
It is just what b4 does because all those folks are listed as maintainers.
>
> > Cc: Jonathan Cameron <[email protected]>
>
> Jonathan gets listed twice because?
That is odd. I did not notice that.
>
> > Cc: [email protected]
> > Cc: [email protected]
>
> I assume this is because b4 takes its address from the patch itself?
>
> It just feels a bit too noisy.
Yea b4 generated the list and I just copied it here. Vishal is correct
that this is automatically put in the tracking cover letter commit b4 uses
but I copied it here because it was not a series. I'll skip this in the
future.
>
> > Signed-off-by: Ira Weiny <[email protected]>
> > ---
> > NOTE: This does fix a bug in the patch referenced below. However, I
> > don't think that warrants back porting because this is only a debug
> > print.
>
> I don't think that's a reason not to flag for backport. Smaller things
> than this have been backported, and if it saves someone getting confused
> in the field its worth it.
Ok.
>
> >
> > Fixes: 6ebe28f9ec72 ("cxl/mem: Read, trace, and clear events on driver load")
> > ---
> > drivers/cxl/core/mbox.c | 9 +++++++--
> > 1 file changed, 7 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
> > index d6d067fbee97..f052d5f174ee 100644
> > --- a/drivers/cxl/core/mbox.c
> > +++ b/drivers/cxl/core/mbox.c
> > @@ -882,9 +882,10 @@ static int cxl_clear_event_record(struct cxl_memdev_state *mds,
> > */
> > i = 0;
> > for (cnt = 0; cnt < total; cnt++) {
> > - payload->handles[i++] = get_pl->records[cnt].hdr.handle;
> > + payload->handles[i] = get_pl->records[cnt].hdr.handle;
> > dev_dbg(mds->cxlds.dev, "Event log '%d': Clearing %u\n", log,
> > le16_to_cpu(payload->handles[i]));
> > + i++;
> >
> > if (i == max_handles) {
> > payload->nr_recs = i;
> > @@ -946,9 +947,13 @@ static void cxl_mem_get_records_log(struct cxl_memdev_state *mds,
> > if (!nr_rec)
> > break;
> >
> > - for (i = 0; i < nr_rec; i++)
> > + for (i = 0; i < nr_rec; i++) {
> > + dev_dbg(dev, "Event log %d: processing handle %u\n",
> > + type,
> > + le16_to_cpu(payload->records[i].hdr.handle));
> > cxl_event_trace_record(cxlmd, type,
> > &payload->records[i]);
>
> This looks unrelated to the fix.
It helped me to debug the issue. Since you want this marked as a fix for
backport I will split this out. Potentially just squash it into some
other patch or make it stand alone.
Ira