2024-03-07 01:06:02

by Len Brown

[permalink] [raw]
Subject: [PATCH 1/1] nvme: Use pr_dbg, not pr_info, when setting shutdown timeout

From: Len Brown <[email protected]>

Some words are alarming in routine kernel messages.
"timeout" is one of them...

Here NVME is routinely setting a timeout value,
rather than reporting that a timeout has occurred.

Assuming this message is actually useful to somebody somewhere,
let's make it DEBUG level...

Fixes: 1a3838d732ea ("nvme: modify the debug level for setting shutdown timeout")

Signed-off-by: Len Brown <[email protected]>
---
drivers/nvme/host/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 0a96362912ce..e361af1953a4 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -3205,7 +3205,7 @@ static int nvme_init_identify(struct nvme_ctrl *ctrl)
shutdown_timeout, 60);

if (ctrl->shutdown_timeout != shutdown_timeout)
- dev_info(ctrl->device,
+ dev_dbg(ctrl->device,
"Shutdown timeout set to %u seconds\n",
ctrl->shutdown_timeout);
} else
--
2.40.1



2024-03-07 09:29:13

by Max Gurtovoy

[permalink] [raw]
Subject: Re: [PATCH 1/1] nvme: Use pr_dbg, not pr_info, when setting shutdown timeout



On 07/03/2024 3:05, Len Brown wrote:
> From: Len Brown <[email protected]>
>
> Some words are alarming in routine kernel messages.
> "timeout" is one of them...
>
> Here NVME is routinely setting a timeout value,
> rather than reporting that a timeout has occurred.

No.
see the original commit message

"When an NVMe controller reports RTD3 Entry Latency larger than the
value of shutdown_timeout module parameter, we update the
shutdown_timeout accordingly to honor RTD3 Entry Latency. Use an
informational debug level instead of a warning level for it."

So this is not a routine flow. This informs users about using a
different value than the module param they set.



>
> Assuming this message is actually useful to somebody somewhere,
> let's make it DEBUG level...

why ?
I don't think it is necessary.
>
> Fixes: 1a3838d732ea ("nvme: modify the debug level for setting shutdown timeout")

This doesn't sounds like a bug fix to me.

>
> Signed-off-by: Len Brown <[email protected]>
> ---
> drivers/nvme/host/core.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 0a96362912ce..e361af1953a4 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -3205,7 +3205,7 @@ static int nvme_init_identify(struct nvme_ctrl *ctrl)
> shutdown_timeout, 60);
>
> if (ctrl->shutdown_timeout != shutdown_timeout)
> - dev_info(ctrl->device,
> + dev_dbg(ctrl->device,
> "Shutdown timeout set to %u seconds\n",
> ctrl->shutdown_timeout);
> } else

2024-03-07 14:28:21

by Len Brown

[permalink] [raw]
Subject: Re: [PATCH 1/1] nvme: Use pr_dbg, not pr_info, when setting shutdown timeout

On Thu, Mar 7, 2024 at 4:29 AM Max Gurtovoy <[email protected]> wrote:

> > Some words are alarming in routine kernel messages.
> > "timeout" is one of them...
> >
> > Here NVME is routinely setting a timeout value,
> > rather than reporting that a timeout has occurred.
>
> No.
> see the original commit message
>
> "When an NVMe controller reports RTD3 Entry Latency larger than the
> value of shutdown_timeout module parameter, we update the
> shutdown_timeout accordingly to honor RTD3 Entry Latency. Use an
> informational debug level instead of a warning level for it."
>
> So this is not a routine flow. This informs users about using a
> different value than the module param they set.

I have machines in automated testing.
Those machines have zero module params.
This message appears in their dmesg 100% of the time,
and our dmesg scanner complains about them 100% of the time.

Is this a bug in the NVME hardware or software?

If yes, I'll be happy to help debug it.

If no, then exactly what action is the informed user supposed to take
upon seeing this message?

If none, then the message serves no purpose and should be deleted entirely.

thank you,

Len Brown, Intel

2024-03-07 15:17:57

by Keith Busch

[permalink] [raw]
Subject: Re: [PATCH 1/1] nvme: Use pr_dbg, not pr_info, when setting shutdown timeout

On Thu, Mar 07, 2024 at 09:27:21AM -0500, Len Brown wrote:
> On Thu, Mar 7, 2024 at 4:29 AM Max Gurtovoy <[email protected]> wrote:
>
> > > Some words are alarming in routine kernel messages.
> > > "timeout" is one of them...
>
> > > Here NVME is routinely setting a timeout value,
> > > rather than reporting that a timeout has occurred.
> >
> > No.
> > see the original commit message
> >
> > "When an NVMe controller reports RTD3 Entry Latency larger than the
> > value of shutdown_timeout module parameter, we update the
> > shutdown_timeout accordingly to honor RTD3 Entry Latency. Use an
> > informational debug level instead of a warning level for it."
> >
> > So this is not a routine flow. This informs users about using a
> > different value than the module param they set.
>
> I have machines in automated testing.
> Those machines have zero module params.
> This message appears in their dmesg 100% of the time,
> and our dmesg scanner complains about them 100% of the time.
>
> Is this a bug in the NVME hardware or software?
>
> If yes, I'll be happy to help debug it.
>
> If no, then exactly what action is the informed user supposed to take
> upon seeing this message?
>
> If none, then the message serves no purpose and should be deleted entirely.

It lets you know that your device takes longer to safely power off than
the module's default tolerance. System low power transitions may take a
long time, and at one point, people wanted to know about that since it
may affect their power management decisions.

This print was partly from when NVMe protocol did not provide a way to
advertise an appropriate shutdown time, and we had no idea what devices
in the wild actually needed. We often just get a dmesg with bug reports,
and knowing device's shutdown timings was helpful at one point with
suspend and power off issues.

You can make the print go away by adding param

nvme_core.shutdown_timeout=<Largest Observed Value>

But personally, I don't find this print very useful anymore, so I don't
care if it gets removed.

2024-03-07 15:24:19

by Keith Busch

[permalink] [raw]
Subject: Re: [PATCH 1/1] nvme: Use pr_dbg, not pr_info, when setting shutdown timeout

On Thu, Mar 07, 2024 at 08:17:05AM -0700, Keith Busch wrote:
> But personally, I don't find this print very useful anymore, so I don't
> care if it gets removed.

Or maybe we can make the print less scary: how about changing "shutdown
timeout" to "D3 entry latency"?

2024-03-07 16:25:37

by Len Brown

[permalink] [raw]
Subject: Re: [PATCH 1/1] nvme: Use pr_dbg, not pr_info, when setting shutdown timeout

On Thu, Mar 7, 2024 at 10:23 AM Keith Busch <[email protected]> wrote:

> Or maybe we can make the print less scary: how about changing "shutdown
> timeout" to "D3 entry latency"?

Works for me!

(The problem with the word "timeout" is that is is usually something a
human should know about, along with "error" "fail" "abort" etc)

you sending a patch, or shall I?

thanks,
Len Brown, Intel