2018-01-16 16:58:32

by Denis Du

Subject: [PATCH] Carrier detect ok, don't turn off negotiation

In drivers/net/wan/hdlc_ppp.c, noise on the physical line can make the protocol fail while carrier detect stays OK. So if carrier detect is OK, don't turn off protocol negotiation.

This patch is against Linux 4.15-rc8.


Attachments:
0001-netdev-carrier-detect-ok-don-t-turn-off-negotiation.patch (1.23 kB)

2018-01-22 20:25:51

by David Miller

Subject: Re: [PATCH] Carrier detect ok, don't turn off negotiation

From: Denis Du <[email protected]>
Date: Tue, 16 Jan 2018 16:58:25 +0000 (UTC)

> From b5902a4dfc709b62b704997ab64f31c9ef69a6db Mon Sep 17 00:00:00 2001
> From: Denis Du <[email protected]>
> Date: Mon, 15 Jan 2018 17:26:06 -0500
> Subject: [PATCH] netdev: carrier detect ok, don't turn off negotiation
>
> Sometimes when physical lines have a just good noise to make the protocol
> handshaking fail, but the carrier detect still good. Then after remove of
> the noise, nobody will trigger this protocol to be start again to cause
> the link to never come back. The fix is when the carrier is still on, not
> terminate the protocol handshaking.
>
> Signed-off-by: Denis Du <[email protected]>

The timer is supposed to restart the protocol again, that's how this
whole thing is designed to work.

I think you are making changes to the symptom rather than the true
cause of the problems you are seeing.

Sorry, I will not apply this until the exact issue is better
understood.

Thank you.

2018-01-22 22:17:58

by Denis Du

Subject: Re: [PATCH] Carrier detect ok, don't turn off negotiation



The timer is supposed to be triggered by the carrier detect interrupt. After the line noise is removed, the carrier detect interrupt is never triggered again, because the carrier stays OK and the interrupt only starts the timer once. Since the protocol was terminated and no new interrupt happens, the link never comes back. So the case here is that the noise is just strong enough to make the protocol fail while carrier detect stays good; the timer is then never triggered again.

Of course, if you increase the noise until even carrier detect fails and then remove the noise, the link will come up, because the carrier goes down and up again, which triggers the timer to restart.

Denis Du





2018-01-23 17:09:30

by Denis Du

Subject: Re: [PATCH] Carrier detect ok, don't turn off negotiation



OK, I checked the source code again. It has nothing to do with the interrupts; it is related to how hdlc.c is implemented.
In drivers/net/wan/hdlc.c#L108:


    if (hdlc->carrier == on)
        goto carrier_exit; /* no change in DCD line level */

    hdlc->carrier = on;

    if (!hdlc->open)
        goto carrier_exit;

    if (hdlc->carrier) {
        netdev_info(dev, "Carrier detected\n");
        hdlc_proto_start(dev);
    } else {
        netdev_info(dev, "Carrier lost\n");
        hdlc_proto_stop(dev);
    }

carrier_exit:
    spin_unlock_irqrestore(&hdlc->state_lock, flags);
    return NOTIFY_DONE;

From the above code, I understand that only a change in carrier restarts the protocol through hdlc_proto_start(dev), and with it the timer; the previous timer has already expired because the protocol failed.

If the carrier does not change, the check

    if (hdlc->carrier == on)
        goto carrier_exit; /* no change in DCD line level */

makes the function do nothing: no new protocol start and thus no timer.


My case is that the carrier is always good but the protocol fails because the noise is at just the right level. This issue was found and reported by our customers, so it is not a theoretical guess; it is a real problem.
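
The behaviour described above can be sketched as a small user-space model. This is only an illustration, assuming the quoted hdlc.c logic: `model_hdlc`, `model_carrier_event` and `model_starts_after_repeated_on` are hypothetical names that do not exist in the kernel, and the counters merely stand in for hdlc_proto_start() and hdlc_proto_stop().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's per-device state. */
struct model_hdlc {
    bool carrier;      /* last DCD level seen */
    bool open;         /* device is open */
    int proto_starts;  /* times the protocol was (re)started */
    int proto_stops;   /* times the protocol was stopped */
};

/* Mirrors the quoted logic: act only when the DCD level changes. */
void model_carrier_event(struct model_hdlc *h, bool on)
{
    assert(h != NULL);

    if (h->carrier == on)
        return;             /* no change in DCD line level */

    h->carrier = on;

    if (!h->open)
        return;

    if (h->carrier)
        h->proto_starts++;  /* stands in for hdlc_proto_start() */
    else
        h->proto_stops++;   /* stands in for hdlc_proto_stop() */
}

/* Report "carrier on" several times in a row and count protocol starts. */
int model_starts_after_repeated_on(int reports)
{
    struct model_hdlc h = { .carrier = false, .open = true };

    for (int i = 0; i < reports; i++)
        model_carrier_event(&h, true);

    return h.proto_starts;
}
```

In this model, model_starts_after_repeated_on(5) counts a single protocol start: once DCD is up, further "on" reports change nothing, so nothing restarts a negotiation that has already given up.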








2018-01-28 14:35:32

by Krzysztof Halasa

Subject: Re: [PATCH] Carrier detect ok, don't turn off negotiation

Denis Du <[email protected]> writes:

> From the above code, I can get that only Carrier have some change, it
> will restart the protocol by hdlc_proto_start(dev);and thus the timer,
> the previous timer expired due to protocol fail.
>
> If carrier keep no change by if (hdlc->carrier == on)
>         goto carrier_exit; /* no change in DCD line level */It will do
> nothing, not start any new protocol and thus the timer.

Sorry about being late, just returned home and am trying to get all the
backlogs under control.

I remember the PPP standard is a bit cloudy about the possible issue,
but the latter indeed exists (the PPP state machine was written directly
to STD-51). There is a related (more visible in practice, though we aren't
affected) issue of "active" vs "passive" mode (hdlc_ppp.c is "active",
and two "passives" wouldn't negotiate at all).

Anyway the problem is real (though not very visible in practice,
especially on relatively modern links rather than 300 or 1200 bps dialup
connections) and should be fixed. Looking at the patch, my first
impression is it makes the code differ from STD-51 a little bit.
On the other hand, perhaps applying it as is and forgetting about the
issue is the way to go.

Ideally, I think the negotiation failure should end up (optionally, in
addition to the current behavior) in some configurable sleep, then
the negotiation should restart. If it's worth the effort at this point,
I don't know.
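
The "configurable sleep, then restart" idea could look roughly like the following user-space sketch. Nothing here exists in hdlc_ppp.c: `model_link`, `retry_delay` and both functions are hypothetical, and a "tick" is an arbitrary time unit chosen for the illustration.

```c
#include <assert.h>
#include <stddef.h>

enum model_state { MODEL_NEGOTIATING, MODEL_SLEEPING };

/* Hypothetical link state; retry_delay is the assumed tunable. */
struct model_link {
    enum model_state state;
    int retry_delay;  /* ticks to sleep after a failed negotiation */
    int sleep_left;   /* ticks remaining in the current sleep */
    int restarts;     /* negotiations restarted after a sleep */
};

/* Called when negotiation gives up: go to sleep instead of stopping for good. */
void model_negotiation_failed(struct model_link *l)
{
    assert(l != NULL);
    l->state = MODEL_SLEEPING;
    l->sleep_left = l->retry_delay;
}

/* Called once per tick: count the sleep down, then restart negotiation. */
void model_tick(struct model_link *l)
{
    assert(l != NULL);

    if (l->state != MODEL_SLEEPING)
        return;

    if (l->sleep_left > 0)
        l->sleep_left--;

    if (l->sleep_left == 0) {
        l->state = MODEL_NEGOTIATING;
        l->restarts++;
    }
}
```

With retry_delay set to 3, the model stays asleep for two ticks after a failure and returns to negotiating on the third, independent of any change in carrier state.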

Perhaps I could look at this later, but no promises (this requires
pulling on and setting up some legacy hardware).

Anyway, since the patch is safe and can solve an existing problem:

Acked-by: Krzysztof Halasa <[email protected]>
--
Krzysztof Halasa

2018-02-06 15:19:41

by Denis Du

Subject: Re: [PATCH] Carrier detect ok, don't turn off negotiation



Hi, David:

What do you think of my patch?

As you can see, Krzysztof thinks my patch is OK to be accepted.

But if you have a better idea for fixing it, I am glad to see it. Either way, this issue has to be fixed.



Denis DU






2018-02-06 15:31:14

by David Miller

Subject: Re: [PATCH] Carrier detect ok, don't turn off negotiation

From: Denis Du <[email protected]>
Date: Tue, 6 Feb 2018 15:15:28 +0000 (UTC)

> How do you think my patch?
>
> As you see, Krzysztof think my patch is ok to be accepted.
> But if you have a better idea to fix it,I am glad to see it. Anyway, this issue have to be fixed.

Please resubmit it and I'll think about it again, thank you.

2018-02-06 17:08:50

by Denis Du

Subject: Re: [PATCH] Carrier detect ok, don't turn off negotiation



OK, I am submitting it again.


In drivers/net/wan/hdlc_ppp.c, noise on the physical line can make the protocol fail while carrier detect stays OK. So if carrier detect is OK, don't turn off protocol negotiation.

This patch is against Linux 4.15-rc8.







Attachments:
0001-netdev-carrier-detect-ok-don-t-turn-off-negotiation.patch (1.23 kB)

2018-02-21 03:37:49

by Denis Du

Subject: Re: [PATCH] Carrier detect ok, don't turn off negotiation

Hi, David:

What are your thoughts on this patch?



From b5902a4dfc709b62b704997ab64f31c9ef69a6db Mon Sep 17 00:00:00 2001
From: Denis Du <[email protected]>
Date: Mon, 15 Jan 2018 17:26:06 -0500
Subject: [PATCH] netdev: carrier detect ok, don't turn off negotiation

Sometimes when physical lines have a just good noise to make the protocol
handshaking fail, but the carrier detect still good. Then after remove of
the noise, nobody will trigger this protocol to be start again to cause
the link to never come back. The fix is when the carrier is still on, not
terminate the protocol handshaking.

Signed-off-by: Denis Du <[email protected]>
---
drivers/net/wan/hdlc_ppp.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/wan/hdlc_ppp.c b/drivers/net/wan/hdlc_ppp.c
index afeca6b..ab8b3cb 100644
--- a/drivers/net/wan/hdlc_ppp.c
+++ b/drivers/net/wan/hdlc_ppp.c
@@ -574,7 +574,10 @@ static void ppp_timer(struct timer_list *t)
            ppp_cp_event(proto->dev, proto->pid, TO_GOOD, 0, 0,
                     0, NULL);
            proto->restart_counter--;
-        } else
+        } else if (netif_carrier_ok(proto->dev))
+            ppp_cp_event(proto->dev, proto->pid, TO_GOOD, 0, 0,
+                     0, NULL);
+        else
            ppp_cp_event(proto->dev, proto->pid, TO_BAD, 0, 0,
                     0, NULL);
        break;
--
2.1.4
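
The decision the patched timer branch makes can be modelled outside the kernel as follows. This is a sketch under the assumption that the hunk above is the whole change: `model_timeout_event` and the `MODEL_TO_*` values are hypothetical stand-ins for the ppp_cp_event() calls with TO_GOOD and TO_BAD, and `carrier_ok` stands in for netif_carrier_ok(proto->dev).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the TO_GOOD and TO_BAD events. */
enum model_event { MODEL_TO_GOOD, MODEL_TO_BAD };

/*
 * Pick the timeout event the way the patched branch does:
 *  - retries left:              TO_GOOD, and use up one retry
 *  - no retries, carrier up:    TO_GOOD (keep negotiating)
 *  - no retries, carrier down:  TO_BAD  (give up)
 */
enum model_event model_timeout_event(int *restart_counter, bool carrier_ok)
{
    assert(restart_counter != NULL);

    if (*restart_counter > 0) {
        (*restart_counter)--;
        return MODEL_TO_GOOD;
    }

    return carrier_ok ? MODEL_TO_GOOD : MODEL_TO_BAD;
}
```

Before the patch, the second case did not exist: an exhausted restart counter always produced TO_BAD, whether or not the carrier was still up.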






2018-02-22 19:05:32

by David Miller

Subject: Re: [PATCH] Carrier detect ok, don't turn off negotiation

From: Denis Du <[email protected]>
Date: Wed, 21 Feb 2018 03:35:31 +0000 (UTC)

> How is your thinking about this patch?

I cannot apply a patch which has been corrupted by your email client like
this.

Please send it properly again, plain ASCII text, and no transformations
by your email client.

You should send the patch to yourself and try to apply the patch you
receive, do not send to the list until you can pass the test properly.

Do not use attachments to fix this problem, the patch must be inline
after your commit message and signoffs.

Please read Documentation/process/submitting-patches.rst and
Documentation/process/email-clients.rst for more information.

Thank you.