2024-05-30 15:49:18

by Doug Anderson

Subject: [PATCH v2] serial: port: Don't block system suspend even if bytes are left to xmit

Recently, suspend testing on sc7180-trogdor based devices has started
to sometimes fail with messages like this:

port a88000.serial:0.0: PM: calling pm_runtime_force_suspend+0x0/0xf8 @ 28934, parent: a88000.serial:0
port a88000.serial:0.0: PM: dpm_run_callback(): pm_runtime_force_suspend+0x0/0xf8 returns -16
port a88000.serial:0.0: PM: pm_runtime_force_suspend+0x0/0xf8 returned -16 after 33 usecs
port a88000.serial:0.0: PM: failed to suspend: error -16

I could reproduce these problems by logging in via an agetty on the
debug serial port (which was _not_ used for kernel console) and
running:
  cat /var/log/messages
...and then (via an SSH session) forcing a few suspend/resume cycles.

Tracing through the code and doing some printf()-based debugging shows
that the -16 (-EBUSY) comes from the recently added
serial_port_runtime_suspend().

The idea of the serial_port_runtime_suspend() function is to prevent
the port from being _runtime_ suspended if it still has bytes left to
transmit. Having bytes left to transmit isn't a reason to block
_system_ suspend, though. If a serdev device in the kernel needs to
block system suspend it should block its own suspend and it can use
serdev_device_wait_until_sent() to ensure bytes are sent.

The DEFINE_RUNTIME_DEV_PM_OPS() used by the serial_port code means
that the system suspend function will be pm_runtime_force_suspend().
In pm_runtime_force_suspend() we can see that before calling the
runtime suspend function we'll call pm_runtime_disable(). This should
be a reliable way to detect that we're called from system suspend and
that we shouldn't look for busyness.

Fixes: 43066e32227e ("serial: port: Don't suspend if the port is still busy")
Signed-off-by: Douglas Anderson <[email protected]>
---
In v1 [1] this was part of a 2-patch series. I'm now just sending this
patch on its own since the Qualcomm GENI serial driver has ended up
having a whole pile of problems that are taking a while to unravel.
It makes sense to disconnect the two efforts. The core problem fixed
by this patch and the geni problems never had any dependencies anyway.

[1] https://lore.kernel.org/r/20240523162207.1.I2395e66cf70c6e67d774c56943825c289b9c13e4@changeid/

Changes in v2:
- Fix "regulator" => "regular" in comment.
- Fix "PM Runtime" => "runtime PM" in comment.
- Commit message now says how serdev devices should ensure bytes are transferred.

drivers/tty/serial/serial_port.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/drivers/tty/serial/serial_port.c b/drivers/tty/serial/serial_port.c
index 91a338d3cb34..93ca94426162 100644
--- a/drivers/tty/serial/serial_port.c
+++ b/drivers/tty/serial/serial_port.c
@@ -64,6 +64,16 @@ static int serial_port_runtime_suspend(struct device *dev)
 	if (port->flags & UPF_DEAD)
 		return 0;
 
+	/*
+	 * We only want to check the busyness of the port if runtime PM is
+	 * enabled. Specifically runtime PM will be disabled by
+	 * pm_runtime_force_suspend() during system suspend and we don't want
+	 * to block system suspend even if there is data still left to
+	 * transmit. We only want to block regular runtime PM transitions.
+	 */
+	if (!pm_runtime_enabled(dev))
+		return 0;
+
 	uart_port_lock_irqsave(port, &flags);
 	if (!port_dev->tx_enabled) {
 		uart_port_unlock_irqrestore(port, flags);
--
2.45.1.288.g0e0cd299f1-goog



2024-05-31 05:21:42

by Greg Kroah-Hartman

Subject: Re: [PATCH v2] serial: port: Don't block system suspend even if bytes are left to xmit

On Thu, May 30, 2024 at 08:48:46AM -0700, Douglas Anderson wrote:
> Recently, suspend testing on sc7180-trogdor based devices has started
> to sometimes fail with messages like this:
>
> port a88000.serial:0.0: PM: calling pm_runtime_force_suspend+0x0/0xf8 @ 28934, parent: a88000.serial:0
> port a88000.serial:0.0: PM: dpm_run_callback(): pm_runtime_force_suspend+0x0/0xf8 returns -16
> port a88000.serial:0.0: PM: pm_runtime_force_suspend+0x0/0xf8 returned -16 after 33 usecs
> port a88000.serial:0.0: PM: failed to suspend: error -16
>
> I could reproduce these problems by logging in via an agetty on the
> debug serial port (which was _not_ used for kernel console) and
> running:
> cat /var/log/messages
> ...and then (via an SSH session) forcing a few suspend/resume cycles.
>
> Tracing through the code and doing some printf()-based debugging shows
> that the -16 (-EBUSY) comes from the recently added
> serial_port_runtime_suspend().
>
> The idea of the serial_port_runtime_suspend() function is to prevent
> the port from being _runtime_ suspended if it still has bytes left to
> transmit. Having bytes left to transmit isn't a reason to block
> _system_ suspend, though. If a serdev device in the kernel needs to
> block system suspend it should block its own suspend and it can use
> serdev_device_wait_until_sent() to ensure bytes are sent.
>
> The DEFINE_RUNTIME_DEV_PM_OPS() used by the serial_port code means
> that the system suspend function will be pm_runtime_force_suspend().
> In pm_runtime_force_suspend() we can see that before calling the
> runtime suspend function we'll call pm_runtime_disable(). This should
> be a reliable way to detect that we're called from system suspend and
> that we shouldn't look for busyness.
>
> Fixes: 43066e32227e ("serial: port: Don't suspend if the port is still busy")
> Signed-off-by: Douglas Anderson <[email protected]>
> ---
> In v1 [1] this was part of a 2-patch series. I'm now just sending this
> patch on its own since the Qualcomm GENI serial driver has ended up
> having a whole pile of problems that are taking a while to unravel.
> It makes sense to disconnect the two efforts. The core problem fixed
> by this patch and the geni problems never had any dependencies anyway.
>
> [1] https://lore.kernel.org/r/20240523162207.1.I2395e66cf70c6e67d774c56943825c289b9c13e4@changeid/
>

Hi,

This is the friendly patch-bot of Greg Kroah-Hartman. You have sent him
a patch that has triggered this response. He used to manually respond
to these common problems, but in order to save his sanity (he kept
writing the same thing over and over, yet to different people), I was
created. Hopefully you will not take offence and will fix the problem
in your patch and resubmit it so that it can be accepted into the Linux
kernel tree.

You are receiving this message because of the following common error(s)
as indicated below:

- You have marked a patch with a "Fixes:" tag for a commit that is in an
older released kernel, yet you do not have a cc: stable line in the
signed-off-by area at all, which means that the patch will not be
applied to any older kernel releases. To properly fix this, please
follow the documented rules in the
Documentation/process/stable-kernel-rules.rst file for how to resolve
this.

If you wish to discuss this problem further, or you have questions about
how to resolve this issue, please feel free to respond to this email and
Greg will reply once he has dug out from the pending patches received
from other developers.

thanks,

greg k-h's patch email bot

2024-05-31 08:17:48

by Tony Lindgren

Subject: Re: [PATCH v2] serial: port: Don't block system suspend even if bytes are left to xmit

On Thu, May 30, 2024 at 08:48:46AM -0700, Douglas Anderson wrote:
> The DEFINE_RUNTIME_DEV_PM_OPS() used by the serial_port code means
> that the system suspend function will be pm_runtime_force_suspend().
> In pm_runtime_force_suspend() we can see that before calling the
> runtime suspend function we'll call pm_runtime_disable(). This should
> be a reliable way to detect that we're called from system suspend and
> that we shouldn't look for busyness.

OK makes sense, one comment below though.

> --- a/drivers/tty/serial/serial_port.c
> +++ b/drivers/tty/serial/serial_port.c
> @@ -64,6 +64,16 @@ static int serial_port_runtime_suspend(struct device *dev)
>  	if (port->flags & UPF_DEAD)
>  		return 0;
>
> +	/*
> +	 * We only want to check the busyness of the port if runtime PM is
> +	 * enabled. Specifically runtime PM will be disabled by
> +	 * pm_runtime_force_suspend() during system suspend and we don't want
> +	 * to block system suspend even if there is data still left to
> +	 * transmit. We only want to block regular runtime PM transitions.
> +	 */
> +	if (!pm_runtime_enabled(dev))
> +		return 0;
> +

How about changing the comment a bit to describe why this happens, so
it's easy to remember the next time someone looks at the code? I'd
suggest something like this:

Nothing to do on pm_runtime_force_suspend(), see DEFINE_RUNTIME_DEV_PM_OPS

Other than that:

Reviewed-by: Tony Lindgren <[email protected]>
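
For readers following the thread, the ordering that the patch relies on
can be modelled in plain user-space C. This is only a sketch under stated
assumptions, not kernel code: struct model_dev and the model_*() functions
are hypothetical stand-ins for struct device, serial_port_runtime_suspend()
and pm_runtime_force_suspend(), and the real functions do considerably
more (locking, status checks, error handling).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the device state that matters in this thread. */
struct model_dev {
	bool runtime_pm_enabled;	/* models pm_runtime_enabled(dev) */
	bool tx_pending;		/* models "bytes left to transmit" */
};

/* Models the runtime suspend callback with the check added by the patch. */
static int model_runtime_suspend(const struct model_dev *dev)
{
	/* Runtime PM disabled: assume we were reached via system suspend. */
	if (!dev->runtime_pm_enabled)
		return 0;

	/* Regular runtime PM transition: refuse while data is still queued. */
	return dev->tx_pending ? -EBUSY : 0;
}

/* Models only the ordering described above: disable first, then call. */
static int model_force_suspend(struct model_dev *dev)
{
	dev->runtime_pm_enabled = false;
	return model_runtime_suspend(dev);
}

/* Small self-check of the two paths for a port that still has data queued. */
void model_demo(void)
{
	struct model_dev dev = { .runtime_pm_enabled = true, .tx_pending = true };

	/* Runtime PM path: the busy port blocks the transition. */
	assert(model_runtime_suspend(&dev) == -EBUSY);

	/* System suspend path: the same busy port no longer blocks. */
	assert(model_force_suspend(&dev) == 0);
}
```

In this model the busy port still returns -EBUSY (the -16 from the logs)
for an ordinary runtime PM transition, while the forced path used for
system suspend returns 0 because runtime PM has already been disabled by
the time the callback runs.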