2024-05-02 05:05:00

by Rengarajan S

[permalink] [raw]
Subject: [PATCH net-next v1] net: microchip: lan743x: Reduce PTP timeout on HW failure

The PTP_CMD_CTL is a self clearing register which controls the PTP clock
values. In the current implementation driver waits for a duration of 20
sec in case of HW failure to clear the PTP_CMD_CTL register bit. This
timeout of 20 sec is very long to recognize a HW failure, as it is
typically cleared in one clock(<16ns). Hence reducing the timeout to 1 sec
would be sufficient to conclude if there is any HW failure observed. The
usleep_range will sleep somewhere between 1 msec to 20 msec for each
iteration. By setting the PTP_CMD_CTL_TIMEOUT_CNT to 50 the max timeout
is extended to 1 sec.

Signed-off-by: Rengarajan S <[email protected]>
---
drivers/net/ethernet/microchip/lan743x_ptp.c | 2 +-
drivers/net/ethernet/microchip/lan743x_ptp.h | 1 +
2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/microchip/lan743x_ptp.c b/drivers/net/ethernet/microchip/lan743x_ptp.c
index 2801f08bf1c9..f8e840fd62cd 100644
--- a/drivers/net/ethernet/microchip/lan743x_ptp.c
+++ b/drivers/net/ethernet/microchip/lan743x_ptp.c
@@ -58,7 +58,7 @@ int lan743x_gpio_init(struct lan743x_adapter *adapter)
static void lan743x_ptp_wait_till_cmd_done(struct lan743x_adapter *adapter,
u32 bit_mask)
{
- int timeout = 1000;
+ int timeout = PTP_CMD_CTL_TIMEOUT_CNT;
u32 data = 0;

while (timeout &&
diff --git a/drivers/net/ethernet/microchip/lan743x_ptp.h b/drivers/net/ethernet/microchip/lan743x_ptp.h
index e26d4eff7133..0d29914cd460 100644
--- a/drivers/net/ethernet/microchip/lan743x_ptp.h
+++ b/drivers/net/ethernet/microchip/lan743x_ptp.h
@@ -21,6 +21,7 @@
#define LAN743X_PTP_N_EXTTS 4
#define LAN743X_PTP_N_PPS 0
#define PCI11X1X_PTP_IO_MAX_CHANNELS 8
+#define PTP_CMD_CTL_TIMEOUT_CNT 50

struct lan743x_adapter;

--
2.25.1



2024-05-04 08:52:38

by Simon Horman

[permalink] [raw]
Subject: Re: [PATCH net-next v1] net: microchip: lan743x: Reduce PTP timeout on HW failure

On Thu, May 02, 2024 at 10:33:00AM +0530, Rengarajan S wrote:
> The PTP_CMD_CTL is a self clearing register which controls the PTP clock
> values. In the current implementation driver waits for a duration of 20
> sec in case of HW failure to clear the PTP_CMD_CTL register bit. This
> timeout of 20 sec is very long to recognize a HW failure, as it is
> typically cleared in one clock(<16ns). Hence reducing the timeout to 1 sec
> would be sufficient to conclude if there is any HW failure observed. The
> usleep_range will sleep somewhere between 1 msec to 20 msec for each
> iteration. By setting the PTP_CMD_CTL_TIMEOUT_CNT to 50 the max timeout
> is extended to 1 sec.
>
> Signed-off-by: Rengarajan S <[email protected]>

Reviewed-by: Simon Horman <[email protected]>

2024-05-06 10:00:38

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH net-next v1] net: microchip: lan743x: Reduce PTP timeout on HW failure

Hello:

This patch was applied to netdev/net-next.git (main)
by Paolo Abeni <[email protected]>:

On Thu, 2 May 2024 10:33:00 +0530 you wrote:
> The PTP_CMD_CTL is a self clearing register which controls the PTP clock
> values. In the current implementation driver waits for a duration of 20
> sec in case of HW failure to clear the PTP_CMD_CTL register bit. This
> timeout of 20 sec is very long to recognize a HW failure, as it is
> typically cleared in one clock(<16ns). Hence reducing the timeout to 1 sec
> would be sufficient to conclude if there is any HW failure observed. The
> usleep_range will sleep somewhere between 1 msec to 20 msec for each
> iteration. By setting the PTP_CMD_CTL_TIMEOUT_CNT to 50 the max timeout
> is extended to 1 sec.
>
> [...]

Here is the summary with links:
- [net-next,v1] net: microchip: lan743x: Reduce PTP timeout on HW failure
https://git.kernel.org/netdev/net-next/c/b1de3c0df7ab

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



2024-05-07 01:33:25

by Andrew Lunn

[permalink] [raw]
Subject: Re: [PATCH net-next v1] net: microchip: lan743x: Reduce PTP timeout on HW failure

On Thu, May 02, 2024 at 10:33:00AM +0530, Rengarajan S wrote:
> The PTP_CMD_CTL is a self clearing register which controls the PTP clock
> values. In the current implementation driver waits for a duration of 20
> sec in case of HW failure to clear the PTP_CMD_CTL register bit. This
> timeout of 20 sec is very long to recognize a HW failure, as it is
> typically cleared in one clock(<16ns). Hence reducing the timeout to 1 sec
> would be sufficient to conclude if there is any HW failure observed. The
> usleep_range will sleep somewhere between 1 msec to 20 msec for each
> iteration. By setting the PTP_CMD_CTL_TIMEOUT_CNT to 50 the max timeout
> is extended to 1 sec.

This patch has already been merged, so this is just for my
curiosity. The hardware is dead. Does it really matter if we wait 1s
or 20 seconds. It is still dead? This is a void function. Other than
reporting that the hardware is dead, nothing is done. So this change
seems pointless?

Andrew

2024-05-08 08:53:01

by Rengarajan S

[permalink] [raw]
Subject: Re: [PATCH net-next v1] net: microchip: lan743x: Reduce PTP timeout on HW failure

On Tue, 2024-05-07 at 03:33 +0200, Andrew Lunn wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you
> know the content is safe
>
> On Thu, May 02, 2024 at 10:33:00AM +0530, Rengarajan S wrote:
> > The PTP_CMD_CTL is a self clearing register which controls the PTP
> > clock
> > values. In the current implementation driver waits for a duration
> > of 20
> > sec in case of HW failure to clear the PTP_CMD_CTL register bit.
> > This
> > timeout of 20 sec is very long to recognize a HW failure, as it is
> > typically cleared in one clock(<16ns). Hence reducing the timeout
> > to 1 sec
> > would be sufficient to conclude if there is any HW failure
> > observed. The
> > usleep_range will sleep somewhere between 1 msec to 20 msec for
> > each
> > iteration. By setting the PTP_CMD_CTL_TIMEOUT_CNT to 50 the max
> > timeout
> > is extended to 1 sec.
>
> This patch has already been merged, so this is just for my
> curiosity. The hardware is dead. Does it really matter if we wait 1s
> or 20 seconds. It is still dead? This is a void function. Other than
> reporting that the hardware is dead, nothing is done. So this change
> seems pointless?
>
>         Andrew

Hi Andrew, based on the customer experience they felt that there might
be cases where the 20-sec delay can cause the issue(reporting the HW to
be dead). For boards with defects/failure on few occasions it was found
that resetting the chip can lead to successful resolution; however,
since we need to wait for 20 sec for chip reset, we found that reducing
the timeout to 1 sec would be optimal.

2024-05-08 11:57:34

by Andrew Lunn

[permalink] [raw]
Subject: Re: [PATCH net-next v1] net: microchip: lan743x: Reduce PTP timeout on HW failure

> Hi Andrew, based on the customer experience they felt that there might
> be cases where the 20-sec delay can cause the issue(reporting the HW to
> be dead). For boards with defects/failure on few occasions it was found
> that resetting the chip can lead to successful resolution; however,
> since we need to wait for 20 sec for chip reset, we found that reducing
> the timeout to 1 sec would be optimal.

O.K. This should of been part of the commit message, since with this
comment the change becomes meaningful. Please try to ensure the commit
message explains "Why?"

Andrew