2021-03-11 07:25:10

by Xie He

[permalink] [raw]
Subject: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

There are two "netif_running" checks in this driver. One is in
"lapbeth_xmit" and the other is in "lapbeth_rcv". They serve to make
sure that the LAPB APIs called in these functions are called before
"lapb_unregister" is called by the "ndo_stop" function.

However, these "netif_running" checks are unreliable, because it's
possible that immediately after "netif_running" returns true, "ndo_stop"
is called (which causes "lapb_unregister" to be called).

This patch adds locking to make sure "lapbeth_xmit" and "lapbeth_rcv" can
reliably check and ensure the netif is running while doing their work.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Xie He <[email protected]>
---
drivers/net/wan/lapbether.c | 32 +++++++++++++++++++++++++-------
1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wan/lapbether.c b/drivers/net/wan/lapbether.c
index c3372498f4f1..8fda0446ff71 100644
--- a/drivers/net/wan/lapbether.c
+++ b/drivers/net/wan/lapbether.c
@@ -51,6 +51,8 @@ struct lapbethdev {
struct list_head node;
struct net_device *ethdev; /* link to ethernet device */
struct net_device *axdev; /* lapbeth device (lapb#) */
+ bool up;
+ spinlock_t up_lock; /* Protects "up" */
};

static LIST_HEAD(lapbeth_devices);
@@ -101,8 +103,9 @@ static int lapbeth_rcv(struct sk_buff *skb, struct net_device *dev, struct packe
rcu_read_lock();
lapbeth = lapbeth_get_x25_dev(dev);
if (!lapbeth)
- goto drop_unlock;
- if (!netif_running(lapbeth->axdev))
+ goto drop_unlock_rcu;
+ spin_lock_bh(&lapbeth->up_lock);
+ if (!lapbeth->up)
goto drop_unlock;

len = skb->data[0] + skb->data[1] * 256;
@@ -117,11 +120,14 @@ static int lapbeth_rcv(struct sk_buff *skb, struct net_device *dev, struct packe
goto drop_unlock;
}
out:
+ spin_unlock_bh(&lapbeth->up_lock);
rcu_read_unlock();
return 0;
drop_unlock:
kfree_skb(skb);
goto out;
+drop_unlock_rcu:
+ rcu_read_unlock();
drop:
kfree_skb(skb);
return 0;
@@ -151,13 +157,11 @@ static int lapbeth_data_indication(struct net_device *dev, struct sk_buff *skb)
static netdev_tx_t lapbeth_xmit(struct sk_buff *skb,
struct net_device *dev)
{
+ struct lapbethdev *lapbeth = netdev_priv(dev);
int err;

- /*
- * Just to be *really* sure not to send anything if the interface
- * is down, the ethernet device may have gone.
- */
- if (!netif_running(dev))
+ spin_lock_bh(&lapbeth->up_lock);
+ if (!lapbeth->up)
goto drop;

/* There should be a pseudo header of 1 byte added by upper layers.
@@ -194,6 +198,7 @@ static netdev_tx_t lapbeth_xmit(struct sk_buff *skb,
goto drop;
}
out:
+ spin_unlock_bh(&lapbeth->up_lock);
return NETDEV_TX_OK;
drop:
kfree_skb(skb);
@@ -285,6 +290,7 @@ static const struct lapb_register_struct lapbeth_callbacks = {
*/
static int lapbeth_open(struct net_device *dev)
{
+ struct lapbethdev *lapbeth = netdev_priv(dev);
int err;

if ((err = lapb_register(dev, &lapbeth_callbacks)) != LAPB_OK) {
@@ -292,13 +298,22 @@ static int lapbeth_open(struct net_device *dev)
return -ENODEV;
}

+ spin_lock_bh(&lapbeth->up_lock);
+ lapbeth->up = true;
+ spin_unlock_bh(&lapbeth->up_lock);
+
return 0;
}

static int lapbeth_close(struct net_device *dev)
{
+ struct lapbethdev *lapbeth = netdev_priv(dev);
int err;

+ spin_lock_bh(&lapbeth->up_lock);
+ lapbeth->up = false;
+ spin_unlock_bh(&lapbeth->up_lock);
+
if ((err = lapb_unregister(dev)) != LAPB_OK)
pr_err("lapb_unregister error: %d\n", err);

@@ -356,6 +371,9 @@ static int lapbeth_new_device(struct net_device *dev)
dev_hold(dev);
lapbeth->ethdev = dev;

+ lapbeth->up = false;
+ spin_lock_init(&lapbeth->up_lock);
+
rc = -EIO;
if (register_netdevice(ndev))
goto fail;
--
2.27.0


2021-03-11 20:45:04

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

On Wed, 10 Mar 2021 23:23:09 -0800 Xie He wrote:
> There are two "netif_running" checks in this driver. One is in
> "lapbeth_xmit" and the other is in "lapbeth_rcv". They serve to make
> sure that the LAPB APIs called in these functions are called before
> "lapb_unregister" is called by the "ndo_stop" function.
>
> However, these "netif_running" checks are unreliable, because it's
> possible that immediately after "netif_running" returns true, "ndo_stop"
> is called (which causes "lapb_unregister" to be called).
>
> This patch adds locking to make sure "lapbeth_xmit" and "lapbeth_rcv" can
> reliably check and ensure the netif is running while doing their work.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Xie He <[email protected]>

Is this a theoretical issues or do you see a path where it triggers?

Who are the callers sending frames to a device which went down?

2021-03-11 21:16:51

by Xie He

[permalink] [raw]
Subject: Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

On Thu, Mar 11, 2021 at 12:43 PM Jakub Kicinski <[email protected]> wrote:
>
> Is this a theoretical issues or do you see a path where it triggers?
>
> Who are the callers sending frames to a device which went down?

This is a theoretical issue. I didn't see this issue in practice.

When "__dev_queue_xmit" and "sch_direct_xmit" call
"dev_hard_start_xmit", there appears to be no locking mechanism
preventing the netif from going down while "dev_hard_start_xmit" is
doing its work.

David once confirmed in an email that a driver's "ndo_stop" function
would indeed race with its "ndo_start_xmit" function:
https://lore.kernel.org/netdev/[email protected]/

2021-03-11 22:57:08

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

On Thu, 11 Mar 2021 13:12:25 -0800 Xie He wrote:
> On Thu, Mar 11, 2021 at 12:43 PM Jakub Kicinski <[email protected]> wrote:
> >
> > Is this a theoretical issues or do you see a path where it triggers?
> >
> > Who are the callers sending frames to a device which went down?
>
> This is a theoretical issue. I didn't see this issue in practice.
>
> When "__dev_queue_xmit" and "sch_direct_xmit" call
> "dev_hard_start_xmit", there appears to be no locking mechanism
> preventing the netif from going down while "dev_hard_start_xmit" is
> doing its work.

Normally driver's ndo_stop() calls netif_tx_disable() which takes TX
locks, so unless your driver is lockless (LLTX) there should be no xmit
calls after that point.

2021-03-11 23:14:45

by Xie He

[permalink] [raw]
Subject: Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

On Thu, Mar 11, 2021 at 2:52 PM Jakub Kicinski <[email protected]> wrote:
>
> Normally driver's ndo_stop() calls netif_tx_disable() which takes TX
> locks, so unless your driver is lockless (LLTX) there should be no xmit
> calls after that point.

Do you mean I should call "netif_tx_disable" inside my "ndo_stop"
function as a fix for the racing between "ndo_stop" and
"ndo_start_xmit"?

I can't call "netif_tx_disable" inside my "ndo_stop" function because
"netif_tx_disable" will call "netif_tx_stop_queue", which causes
another racing problem. Please see my recent commit f7d9d4854519
("net: lapbether: Remove netif_start_queue / netif_stop_queue")

2021-03-12 00:15:34

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

On Thu, 11 Mar 2021 15:13:01 -0800 Xie He wrote:
> On Thu, Mar 11, 2021 at 2:52 PM Jakub Kicinski <[email protected]> wrote:
> >
> > Normally driver's ndo_stop() calls netif_tx_disable() which takes TX
> > locks, so unless your driver is lockless (LLTX) there should be no xmit
> > calls after that point.
>
> Do you mean I should call "netif_tx_disable" inside my "ndo_stop"
> function as a fix for the racing between "ndo_stop" and
> "ndo_start_xmit"?
>
> I can't call "netif_tx_disable" inside my "ndo_stop" function because
> "netif_tx_disable" will call "netif_tx_stop_queue", which causes
> another racing problem. Please see my recent commit f7d9d4854519
> ("net: lapbether: Remove netif_start_queue / netif_stop_queue")

And the "noqueue" queue is there because it's on top of hdlc_fr.c
somehow or some out of tree driver? Or do you install it manually?

2021-03-12 00:32:32

by Xie He

[permalink] [raw]
Subject: Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

On Thu, Mar 11, 2021 at 4:10 PM Jakub Kicinski <[email protected]> wrote:
>
> And the "noqueue" queue is there because it's on top of hdlc_fr.c
> somehow or some out of tree driver? Or do you install it manually?

No, this driver is not related to "hdlc_fr.c" or any out-of-tree
driver. The default qdisc is "noqueue" for this driver because this
driver doesn't set "tx_queue_len". This means the value of
"tx_queue_len" would be 0. In this case, "alloc_netdev_mqs" will
automatically add the "IFF_NO_QUEUE" flag to the device, then
"attach_one_default_qdisc" in "net/sched/sch_generic.c" will attach
the "noqueue" qdisc for devices with the "IFF_NO_QUEUE" flag.

2021-03-12 01:44:52

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

On Thu, 11 Mar 2021 16:28:47 -0800 Xie He wrote:
> On Thu, Mar 11, 2021 at 4:10 PM Jakub Kicinski <[email protected]> wrote:
> >
> > And the "noqueue" queue is there because it's on top of hdlc_fr.c
> > somehow or some out of tree driver? Or do you install it manually?
>
> No, this driver is not related to "hdlc_fr.c" or any out-of-tree
> driver. The default qdisc is "noqueue" for this driver because this
> driver doesn't set "tx_queue_len". This means the value of
> "tx_queue_len" would be 0. In this case, "alloc_netdev_mqs" will
> automatically add the "IFF_NO_QUEUE" flag to the device, then
> "attach_one_default_qdisc" in "net/sched/sch_generic.c" will attach
> the "noqueue" qdisc for devices with the "IFF_NO_QUEUE" flag.

I see.

2021-03-16 16:42:46

by Martin Schiller

[permalink] [raw]
Subject: Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

On 2021-03-14 02:59, Xie He wrote:
> Hi Martin,
>
> Could you ack? Thanks!

Acked-by: Martin Schiller <[email protected]>

2021-03-16 20:47:47

by Xie He

[permalink] [raw]
Subject: Re: [PATCH net] net: lapbether: Prevent racing when checking whether the netif is running

On Tue, Mar 16, 2021 at 8:17 AM Martin Schiller <[email protected]> wrote:
>
> On 2021-03-14 02:59, Xie He wrote:
> > Hi Martin,
> >
> > Could you ack? Thanks!
>
> Acked-by: Martin Schiller <[email protected]>

Thank you!!