2013-04-02 18:48:13

by Alan Ott

[permalink] [raw]
Subject: [PATCH 0/6] 802.15.4 and 6LoWPAN Buffering Fixes

These patches fix an issue in the 802.15.4 code where the output buffer
(which prior to this patchset is just a workqueue) can grow to an arbitrary
size. This is basically fixed as follows:

1. Use netif_stop_queue() and netif_wake_queue() to stop and start the
transmit queue, preventing packets from piling up without bound on the
mac802154 workqueue.

2. Increase the default tx_buffer_len for mac802154 (wpan) devices from 10
to 300, enabling Qdisc to work properly on the transmit queue.

Additionally the following related issues are fixed:

1. Handle dev_queue_xmit() return values properly in the 6LoWPAN code (and
return the proper errors to the higher layers). This will cause the higher
layers to retry later if the mac802154 queue is full.

2. Fix the retry of transmit failures in mac802154.

Alan Ott (6):
mac802154: Immediately retry sending failed packets
mac802154: Move xmit_attemps to stack
mac802154: Use netif flow control
mac802154: Increase tx_buffer_len
6lowpan: handle dev_queue_xmit error code properly
6lowpan: return the dev_queue_xmit() return value from lowpan_xmit()

net/ieee802154/6lowpan.c | 4 ++--
net/mac802154/tx.c | 34 ++++++++++++++++++++++++----------
net/mac802154/wpan.c | 2 +-
3 files changed, 27 insertions(+), 13 deletions(-)

--
1.7.11.2


2013-04-02 18:48:19

by Alan Ott

[permalink] [raw]
Subject: [PATCH 5/6] 6lowpan: handle dev_queue_xmit error code properly

dev_queue_xmit() can return positive error codes, so check for nonzero.

Signed-off-by: Alan Ott <[email protected]>
---
net/ieee802154/6lowpan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index e1b4580..a68c792 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1139,7 +1139,7 @@ static netdev_tx_t lowpan_xmit(struct sk_buff *skb, struct net_device *dev)
error:
dev_kfree_skb(skb);
out:
- if (err < 0)
+ if (err)
pr_debug("ERROR: xmit failed\n");

return (err < 0 ? NETDEV_TX_BUSY : NETDEV_TX_OK);
--
1.7.11.2

2013-04-02 18:48:17

by Alan Ott

[permalink] [raw]
Subject: [PATCH 6/6] 6lowpan: return the dev_queue_xmit() return value from lowpan_xmit()

dev_queue_xmit() will return a positive value if the packet could not be
queued, often because the real network device (in our case the mac802154
wpan device) has its queue stopped. lowpan_xmit() should return that value
to the higher layer so the higher layer will retry sending the packet.

Signed-off-by: Alan Ott <[email protected]>
---
net/ieee802154/6lowpan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index a68c792..55e1fd5 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1142,7 +1142,7 @@ out:
if (err)
pr_debug("ERROR: xmit failed\n");

- return (err < 0 ? NETDEV_TX_BUSY : NETDEV_TX_OK);
+ return (err < 0) ? NET_XMIT_DROP : err;
}

static struct wpan_phy *lowpan_get_phy(const struct net_device *dev)
--
1.7.11.2

2013-04-02 18:48:14

by Alan Ott

[permalink] [raw]
Subject: [PATCH 1/6] mac802154: Immediately retry sending failed packets

When ops->xmit() fails, immediately retry. Previously the packet was sent
to the back of the workqueue.

Signed-off-by: Alan Ott <[email protected]>
---
net/mac802154/tx.c | 17 ++++++++---------
1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 4e09d07..fbf937c 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -59,19 +59,18 @@ static void mac802154_xmit_worker(struct work_struct *work)
}
}

- res = xw->priv->ops->xmit(&xw->priv->hw, xw->skb);
+ do {
+ res = xw->priv->ops->xmit(&xw->priv->hw, xw->skb);
+ if (res && ++xw->xmit_attempts >= MAC802154_MAX_XMIT_ATTEMPTS) {
+ pr_debug("transmission failed for %d times",
+ MAC802154_MAX_XMIT_ATTEMPTS);
+ break;
+ }
+ } while (res);

out:
mutex_unlock(&xw->priv->phy->pib_lock);

- if (res) {
- if (xw->xmit_attempts++ < MAC802154_MAX_XMIT_ATTEMPTS) {
- queue_work(xw->priv->dev_workqueue, &xw->work);
- return;
- } else
- pr_debug("transmission failed for %d times",
- MAC802154_MAX_XMIT_ATTEMPTS);
- }

dev_kfree_skb(xw->skb);

--
1.7.11.2

2013-04-02 18:50:54

by Alan Ott

[permalink] [raw]
Subject: [PATCH 4/6] mac802154: Increase tx_buffer_len

Increase the buffer length from 10 to 300 packets. Consider that traffic on
mac802154 devices will often be 6LoWPAN, and a full-length (1280 octet)
IPv6 packet will fragment into 15 6LoWPAN fragments (because the MTU of
IEEE 802.15.4 is 127). A 300-packet queue is really 20 full-length IPv6
packets.

With a queue length of 10, an entire IPv6 packet was unable to get queued
at one time, causing fragments to be dropped, and making reassembly
impossible.

Signed-off-by: Alan Ott <[email protected]>
---
net/mac802154/wpan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mac802154/wpan.c b/net/mac802154/wpan.c
index 7d3f659..2ca2f4d 100644
--- a/net/mac802154/wpan.c
+++ b/net/mac802154/wpan.c
@@ -360,7 +360,7 @@ void mac802154_wpan_setup(struct net_device *dev)
dev->header_ops = &mac802154_header_ops;
dev->needed_tailroom = 2; /* FCS */
dev->mtu = IEEE802154_MTU;
- dev->tx_queue_len = 10;
+ dev->tx_queue_len = 300;
dev->type = ARPHRD_IEEE802154;
dev->flags = IFF_NOARP | IFF_BROADCAST;
dev->watchdog_timeo = 0;
--
1.7.11.2

2013-04-02 18:51:22

by Alan Ott

[permalink] [raw]
Subject: [PATCH 3/6] mac802154: Use netif flow control

Use netif_stop_queue() and netif_wake_queue() to control the flow of
packets to mac802154 devices. Since many IEEE 802.15.4 devices have no
output buffer, and since the mac802154 xmit() function is designed to
block, netif_stop_queue() is called after each packet.

Signed-off-by: Alan Ott <[email protected]>
---
net/mac802154/tx.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)

diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index a248246..fe3e02c 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -25,6 +25,7 @@
#include <linux/if_arp.h>
#include <linux/crc-ccitt.h>

+#include <net/ieee802154_netdev.h>
#include <net/mac802154.h>
#include <net/wpan-phy.h>

@@ -45,6 +46,7 @@ static void mac802154_xmit_worker(struct work_struct *work)
{
struct xmit_work *xw = container_of(work, struct xmit_work, work);
u8 xmit_attempts = 0;
+ struct mac802154_sub_if_data *sdata;
int res;

mutex_lock(&xw->priv->phy->pib_lock);
@@ -71,6 +73,12 @@ static void mac802154_xmit_worker(struct work_struct *work)
out:
mutex_unlock(&xw->priv->phy->pib_lock);

+ /* Restart the netif queue on each sub_if_data object. */
+ rcu_read_lock();
+ list_for_each_entry_rcu(sdata, &xw->priv->slaves, list) {
+ netif_wake_queue(sdata->dev);
+ }
+ rcu_read_unlock();

dev_kfree_skb(xw->skb);

@@ -81,6 +89,7 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct sk_buff *skb,
u8 page, u8 chan)
{
struct xmit_work *work;
+ struct mac802154_sub_if_data *sdata;

if (!(priv->phy->channels_supported[page] & (1 << chan))) {
WARN_ON(1);
@@ -108,6 +117,13 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct sk_buff *skb,
return NETDEV_TX_BUSY;
}

+ /* Stop the netif queue on each sub_if_data object. */
+ rcu_read_lock();
+ list_for_each_entry_rcu(sdata, &priv->slaves, list) {
+ netif_stop_queue(sdata->dev);
+ }
+ rcu_read_unlock();
+
INIT_WORK(&work->work, mac802154_xmit_worker);
work->skb = skb;
work->priv = priv;
--
1.7.11.2

2013-04-02 18:51:48

by Alan Ott

[permalink] [raw]
Subject: [PATCH 2/6] mac802154: Move xmit_attemps to stack

There's no reason to have it in the work struct anymore.

Signed-off-by: Alan Ott <[email protected]>
---
net/mac802154/tx.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index fbf937c..a248246 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -39,12 +39,12 @@ struct xmit_work {
struct mac802154_priv *priv;
u8 chan;
u8 page;
- u8 xmit_attempts;
};

static void mac802154_xmit_worker(struct work_struct *work)
{
struct xmit_work *xw = container_of(work, struct xmit_work, work);
+ u8 xmit_attempts = 0;
int res;

mutex_lock(&xw->priv->phy->pib_lock);
@@ -61,7 +61,7 @@ static void mac802154_xmit_worker(struct work_struct *work)

do {
res = xw->priv->ops->xmit(&xw->priv->hw, xw->skb);
- if (res && ++xw->xmit_attempts >= MAC802154_MAX_XMIT_ATTEMPTS) {
+ if (res && ++xmit_attempts >= MAC802154_MAX_XMIT_ATTEMPTS) {
pr_debug("transmission failed for %d times",
MAC802154_MAX_XMIT_ATTEMPTS);
break;
@@ -113,7 +113,6 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct sk_buff *skb,
work->priv = priv;
work->page = page;
work->chan = chan;
- work->xmit_attempts = 0;

queue_work(priv->dev_workqueue, &work->work);

--
1.7.11.2

2013-04-02 20:28:41

by Alan Ott

[permalink] [raw]
Subject: Re: [PATCH 1/6] mac802154: Immediately retry sending failed packets

On 04/02/2013 03:11 PM, Alexander Smirnov wrote:
> 2013/4/2 Alan Ott <[email protected] <mailto:[email protected]>>
>
> When ops->xmit() fails, immediately retry. Previously the packet was
> sent
> to the back of the workqueue.
>
> Signed-off-by: Alan Ott <[email protected] <mailto:[email protected]>>
> ---
> net/mac802154/tx.c | 17 ++++++++---------
> 1 file changed, 8 insertions(+), 9 deletions(-)
>
> diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
> index 4e09d07..fbf937c 100644
> --- a/net/mac802154/tx.c
> +++ b/net/mac802154/tx.c
> @@ -59,19 +59,18 @@ static void mac802154_xmit_worker(struct
> work_struct *work)
> }
> }
>
> - res = xw->priv->ops->xmit(&xw->priv->hw, xw->skb);
> + do {
> + res = xw->priv->ops->xmit(&xw->priv->hw, xw->skb);
> + if (res && ++xw->xmit_attempts >=
> MAC802154_MAX_XMIT_ATTEMPTS) {
> + pr_debug("transmission failed for %d times",
> + MAC802154_MAX_XMIT_ATTEMPTS);
> + break;
> + }
> + } while (res);
>
>
>
> IIRC this 802.15.4 stack uses single-thread-work-queue and all RX/TX
> are performed by using it.

Hi Alexander,

Yes, that is true. As is currently implemented, the driver xmit()
functions are called from a workqueue and block until the packet is sent.


> Doing TX retry in the way you proposed -
> it's possible that you will block other packets pending in this
> queue.

Yes. Since sending data is a blocking operation, any time spent sending
(or re-sending) is blocking.

As it was before this patch series, with the buffer (workqueue) growing
arbitrarily large, doing retry by putting a packet at the end of the
workqueue was largely useless because by the time it came to retry it,
any state associated with it (with respect to fragmentation/reassembly)
had expired.

Keep in mind that with the netif stop/wake code, putting retries at the
end of the workqueue or doing them immediately is basically the same
thing, since the workqueue is no longer the packet queue (and will
ideally only have 0 or 1 packets in it). The workqueue (with these
patches) only serves to lift the driver xmit() calls out of atomic
context, allowing them to block.

However, it is easy to envision one process clogging up the works with
retries by sending many packets to an unavailable address.

What do you recommend doing here instead?

> Despite on Linux is already 'slow' system to provide
> real-time for specific 802.15.4 features, I think it's not a good
> idea to increase nodes communication latency.

With the transmit buffer length increased (and actually being used),
maybe the packets with realtime requirements can be given a higher
priority to deal with these requirements.

Alan.

>
>
> out:
> mutex_unlock(&xw->priv->phy->pib_lock);
>
> - if (res) {
> - if (xw->xmit_attempts++ < MAC802154_MAX_XMIT_ATTEMPTS) {
> - queue_work(xw->priv->dev_workqueue, &xw->work);
> - return;
> - } else
> - pr_debug("transmission failed for %d times",
> - MAC802154_MAX_XMIT_ATTEMPTS);
> - }
>
> dev_kfree_skb(xw->skb);
>
> --
> 1.7.11.2
>
>

2013-04-02 21:23:00

by Sergei Shtylyov

[permalink] [raw]
Subject: Re: [PATCH 3/6] mac802154: Use netif flow control

Hello.

On 04/02/2013 10:47 PM, Alan Ott wrote:

> Use netif_stop_queue() and netif_wake_queue() to control the flow of
> packets to mac802154 devices. Since many IEEE 802.15.4 devices have no
> output buffer, and since the mac802154 xmit() function is designed to
> block, netif_stop_queue() is called after each packet.
>
> Signed-off-by: Alan Ott <[email protected]>
> ---
> net/mac802154/tx.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
> diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
> index a248246..fe3e02c 100644
> --- a/net/mac802154/tx.c
> +++ b/net/mac802154/tx.c
[...]
> @@ -71,6 +73,12 @@ static void mac802154_xmit_worker(struct work_struct *work)
> out:
> mutex_unlock(&xw->priv->phy->pib_lock);
>
> + /* Restart the netif queue on each sub_if_data object. */
> + rcu_read_lock();
> + list_for_each_entry_rcu(sdata, &xw->priv->slaves, list) {
> + netif_wake_queue(sdata->dev);
> + }


Are {} really necessary here?

> @@ -108,6 +117,13 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct sk_buff *skb,
> return NETDEV_TX_BUSY;
> }
>
> + /* Stop the netif queue on each sub_if_data object. */
> + rcu_read_lock();
> + list_for_each_entry_rcu(sdata, &priv->slaves, list) {
> + netif_stop_queue(sdata->dev);
> + }

And here?

WBR, Sergei

2013-04-02 21:28:26

by Alan Ott

[permalink] [raw]
Subject: Re: [PATCH 1/6] mac802154: Immediately retry sending failed packets

On 04/02/2013 04:28 PM, Alan Ott wrote:
> On 04/02/2013 03:11 PM, Alexander Smirnov wrote:
>> 2013/4/2 Alan Ott <[email protected] <mailto:[email protected]>>
>>
>> When ops->xmit() fails, immediately retry. Previously the packet was
>> sent
>> to the back of the workqueue.
>>
>> Signed-off-by: Alan Ott <[email protected] <mailto:[email protected]>>
>> ---
>> net/mac802154/tx.c | 17 ++++++++---------
>> 1 file changed, 8 insertions(+), 9 deletions(-)
>>
>> diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
>> index 4e09d07..fbf937c 100644
>> --- a/net/mac802154/tx.c
>> +++ b/net/mac802154/tx.c
>> @@ -59,19 +59,18 @@ static void mac802154_xmit_worker(struct
>> work_struct *work)
>> }
>> }
>>
>> - res = xw->priv->ops->xmit(&xw->priv->hw, xw->skb);
>> + do {
>> + res = xw->priv->ops->xmit(&xw->priv->hw, xw->skb);
>> + if (res && ++xw->xmit_attempts >=
>> MAC802154_MAX_XMIT_ATTEMPTS) {
>> + pr_debug("transmission failed for %d times",
>> + MAC802154_MAX_XMIT_ATTEMPTS);
>> + break;
>> + }
>> + } while (res);
>>
>>
>>
>> IIRC this 802.15.4 stack uses single-thread-work-queue and all RX/TX
>> are performed by using it.
> Hi Alexander,
>
> Yes, that is true. As is currently implemented, the driver xmit()
> functions are called from a workqueue and block until the packet is sent.
>
>
>> Doing TX retry in the way you proposed -
>> it's possible that you will block other packets pending in this
>> queue.
> Yes. Since sending data is a blocking operation, any time spent sending
> (or re-sending) is blocking.
>
> As it was before this patch series, with the buffer (workqueue) growing
> arbitrarily large, doing retry by putting a packet at the end of the
> workqueue was largely useless because by the time it came to retry it,
> any state associated with it (with respect to fragmentation/reassembly)
> had expired.
>
> Keep in mind that with the netif stop/wake code, putting retries at the
> end of the workqueue or doing them immediately is basically the same
> thing, since the workqueue is no longer the packet queue (and will
> ideally only have 0 or 1 packets in it). The workqueue (with these
> patches) only serves to lift the driver xmit() calls out of atomic
> context, allowing them to block.
>
> However, it is easy to envision one process clogging up the works with
> retries by sending many packets to an unavailable address.
>
> What do you recommend doing here instead?

According to 7.5.6.5 of IEEE 802.15.4-2003, if the retransmission fails
more than aMaxFrameRetries (3) times, it is assumed that it has failed.
Since some transceivers (and I would assume most if not all) do this in
hardware, it's now my opinion that we should _not_ try to retransmit at
all in mac802154/tx.c.

For a driver for a device which _doesn't_ do retransmission in hardware,
maybe it should be handled by that driver then.

>
>> Despite on Linux is already 'slow' system to provide
>> real-time for specific 802.15.4 features, I think it's not a good
>> idea to increase nodes communication latency.
> With the transmit buffer length increased (and actually being used),
> maybe the packets with realtime requirements can be given a higher
> priority to deal with these requirements.
>
> Alan.
>
>>
>> out:
>> mutex_unlock(&xw->priv->phy->pib_lock);
>>
>> - if (res) {
>> - if (xw->xmit_attempts++ < MAC802154_MAX_XMIT_ATTEMPTS) {
>> - queue_work(xw->priv->dev_workqueue, &xw->work);
>> - return;
>> - } else
>> - pr_debug("transmission failed for %d times",
>> - MAC802154_MAX_XMIT_ATTEMPTS);
>> - }
>>
>> dev_kfree_skb(xw->skb);
>>
>> --
>> 1.7.11.2
>>
>>

2013-04-02 22:35:37

by Alan Ott

[permalink] [raw]
Subject: Re: [PATCH 1/6] mac802154: Immediately retry sending failed packets

On 04/02/2013 05:28 PM, Alan Ott wrote:

> According to 7.5.6.5 of IEEE 802.15.4-2003, if the retransmission fails
> more than aMaxFrameRetries (3) times, it is assumed that it has failed.
> Since some transceivers (and I would assume most if not all) do this in
> hardware, it's now my opinion that we should _not_ try to retransmit at
> all in mac802154/tx.c.
>
> For a driver for a device which _doesn't_ do retransmission in hardware,
> maybe it should be handled by that driver then.


It's worth noting that the mrf24j40, the at86rf230, and the cc2420
support retransmission in hardware (the first two are the only two in
mainline).

Alan.

2013-04-02 23:52:27

by Werner Almesberger

[permalink] [raw]
Subject: Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets

Alan Ott wrote:
> it's now my opinion that we should _not_ try to retransmit at
> all in mac802154/tx.c.

I think the currently blocking workqueue design is ugly and
quite contrary to how most the rest of the stack works. So
anything that kills it has my blessing :-)

I do wonder though why it was done like this in the first place.
Just for convenience ?

If we want to move towards an asynchronous interface, it could
exist in parallel with the current one. That way, drivers could
be migrated one by one.

Having said that, the errors you get there may not be failed
single transmissions on the air but some form of congestion in
the driver or a problem with the device. But I don't think
that's a valid reason for retrying the transmission at that
level.

- Werner

2013-04-03 01:24:58

by Alan Ott

[permalink] [raw]
Subject: Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets

On 04/02/2013 07:13 PM, Werner Almesberger wrote:
> Alan Ott wrote:
>> it's now my opinion that we should _not_ try to retransmit at
>> all in mac802154/tx.c.
> I think the currently blocking workqueue design is ugly and
> quite contrary to how most the rest of the stack works. So
> anything that kills it has my blessing :-)

I like it for a couple of reasons.
1. Most supported devices have only single packet output buffer, so
blocking in the driver is the most straight-forward way to handle it.
The alternative is to make each driver have a workqueue for xmit() (to
lift the blocking out from atomic context). This makes each driver simpler.

2. All of the flow control can be handled one time in the mac802154 layer.

> I do wonder though why it was done like this in the first place.
> Just for convenience ?
>
> If we want to move towards an asynchronous interface, it could
> exist in parallel with the current one. That way, drivers could
> be migrated one by one.

Maybe at some point this will be done. Right now we have a ton of
pressing issues (in my opinion).

> Having said that, the errors you get there may not be failed
> single transmissions on the air but some form of congestion in
> the driver or a problem with the device. But I don't think
> that's a valid reason for retrying the transmission at that
> level.

Agreed. Like I said, I'm now of the opinion that the mac802154 layer
should _not_ retry at all.

Alan.

2013-04-03 01:56:32

by David Miller

[permalink] [raw]
Subject: Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets

From: Alan Ott <[email protected]>
Date: Tue, 02 Apr 2013 21:24:59 -0400

> I like it for a couple of reasons.
> 1. Most supported devices have only single packet output buffer, so
> blocking in the driver is the most straight-forward way to handle it.
> The alternative is to make each driver have a workqueue for xmit() (to
> lift the blocking out from atomic context). This makes each driver simpler.
>
> 2. All of the flow control can be handled one time in the mac802154 layer.

We have a perfectly working flow control mechanism in the generic
networking queuing layer. Please use it instead of inventing things.

If it does not meet your needs, fix it, rather than go off and do
your own thing. That way everyone benfits, not just you.

2013-04-03 01:59:36

by Alan Ott

[permalink] [raw]
Subject: Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets

On 04/02/2013 09:56 PM, David Miller wrote:
> From: Alan Ott <[email protected]>
> Date: Tue, 02 Apr 2013 21:24:59 -0400
>
>> I like it for a couple of reasons.
>> 1. Most supported devices have only single packet output buffer, so
>> blocking in the driver is the most straight-forward way to handle it.
>> The alternative is to make each driver have a workqueue for xmit() (to
>> lift the blocking out from atomic context). This makes each driver simpler.
>>
>> 2. All of the flow control can be handled one time in the mac802154 layer.
> We have a perfectly working flow control mechanism in the generic
> networking queuing layer. Please use it instead of inventing things.

I'm pretty sure that's what I'm doing in [1]. When I say "flow control
can be handled," I mean managing calls to netif_stop_queue() and
netif_wake_queue(). Is there something else I should be doing instead?

> If it does not meet your needs, fix it, rather than go off and do
> your own thing. That way everyone benfits, not just you.

Fully agreed.

Alan.

[1] http://www.spinics.net/lists/netdev/msg231483.html

2013-04-03 02:03:20

by David Miller

[permalink] [raw]
Subject: Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets

From: Alan Ott <[email protected]>
Date: Tue, 02 Apr 2013 21:59:37 -0400

> On 04/02/2013 09:56 PM, David Miller wrote:
>> From: Alan Ott <[email protected]>
>> Date: Tue, 02 Apr 2013 21:24:59 -0400
>>
>>> I like it for a couple of reasons.
>>> 1. Most supported devices have only single packet output buffer, so
>>> blocking in the driver is the most straight-forward way to handle it.
>>> The alternative is to make each driver have a workqueue for xmit() (to
>>> lift the blocking out from atomic context). This makes each driver simpler.
>>>
>>> 2. All of the flow control can be handled one time in the mac802154 layer.
>> We have a perfectly working flow control mechanism in the generic
>> networking queuing layer. Please use it instead of inventing things.
>
> I'm pretty sure that's what I'm doing in [1]. When I say "flow control
> can be handled," I mean managing calls to netif_stop_queue() and
> netif_wake_queue(). Is there something else I should be doing instead?

Then you shouldn't need workqueues if the generic netdev facilities
can do the flow control properly.

There are several ethernet devices that have a single transmit buffer
and function just fine, and optimally, solely using the transmit queue
stop/start/wake infrastructure.

2013-04-03 02:25:28

by Alan Ott

[permalink] [raw]
Subject: Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets

On 04/02/2013 10:03 PM, David Miller wrote:
> From: Alan Ott <[email protected]>
> Date: Tue, 02 Apr 2013 21:59:37 -0400
>
>> On 04/02/2013 09:56 PM, David Miller wrote:
>>> From: Alan Ott <[email protected]>
>>> Date: Tue, 02 Apr 2013 21:24:59 -0400
>>>
>>>> I like it for a couple of reasons.
>>>> 1. Most supported devices have only single packet output buffer, so
>>>> blocking in the driver is the most straight-forward way to handle it.
>>>> The alternative is to make each driver have a workqueue for xmit() (to
>>>> lift the blocking out from atomic context). This makes each driver simpler.
>>>>
>>>> 2. All of the flow control can be handled one time in the mac802154 layer.
>>> We have a perfectly working flow control mechanism in the generic
>>> networking queuing layer. Please use it instead of inventing things.
>> I'm pretty sure that's what I'm doing in [1]. When I say "flow control
>> can be handled," I mean managing calls to netif_stop_queue() and
>> netif_wake_queue(). Is there something else I should be doing instead?
> Then you shouldn't need workqueues if the generic netdev facilities
> can do the flow control properly.

The workqueue in mac802154 is only needed because the current mac802154
xmit() function is designed to be blocking and synchronous.

Prior to my patch (#3/6), that very same workqueue would actually queue
up packets (without bound). That's what my patch fixes.

The workqueue in mac802154 also serializes the access to the device for
other functions like setting the channel, ensuring that in the driver
code, one doesn't have to mutex everything. I'm not sure if that's the
"right" way to do it, but that's the way it was when I got here.

> There are several ethernet devices that have a single transmit buffer
> and function just fine, and optimally, solely using the transmit queue
> stop/start/wake infrastructure.

Yes, that does work. enc28j60 works like this. However, since it's an
SPI device (and can sleep), its ndo_start_xmit() _does_ use a workqueue
(to remove it from atomic context), the same as ours (mac802154) does.
The difference is that we do it at the mac802154 layer, while Ethernet
devices do it in the driver.

I guess one advantage to the way it currently is in mac802154, with the
synchronous xmit(), is that a return code can be had from the driver for
each packet. With my new idea that we don't need to retransmit on
failure, I'm not sure we need this return code at all.

Alan.

2013-04-03 02:30:40

by David Miller

[permalink] [raw]
Subject: Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets

From: Alan Ott <[email protected]>
Date: Tue, 02 Apr 2013 22:25:28 -0400

> The workqueue in mac802154 is only needed because the current mac802154
> xmit() function is designed to be blocking and synchronous.
>
> Prior to my patch (#3/6), that very same workqueue would actually queue
> up packets (without bound). That's what my patch fixes.
>
> The workqueue in mac802154 also serializes the access to the device for
> other functions like setting the channel, ensuring that in the driver
> code, one doesn't have to mutex everything. I'm not sure if that's the
> "right" way to do it, but that's the way it was when I got here.

This is entirely duplicating existing facilities.

Your desire to allow blockability during xmit() on the basis of mutual
exclusion is not well founded.

2013-04-03 02:39:23

by Werner Almesberger

[permalink] [raw]
Subject: Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets

Alan Ott wrote:
> 1. Most supported devices have only single packet output buffer, so
> blocking in the driver is the most straight-forward way to handle it.
> The alternative is to make each driver have a workqueue for xmit() (to
> lift the blocking out from atomic context). This makes each driver simpler.

It does make following the program flow a little easier, but
the difference isn't all that large if you think of it,
particularly if you have to wait not only for I/O to finish
but also for the device to send the packet.

The latter will usually be signaled by some form of interrupt,
so you're already in a situation where a callback to the higher
layers of the stack would be very natural.

> Maybe at some point this will be done. Right now we have a ton of
> pressing issues (in my opinion).

Agreed on having no shortage of nasty issues :-) And I'd like
to echo Dave's comment regarding netdev. Those ieee802154_dev
always struck me as peculiar, with flow control just being one
issue.

And things get worse when you have a complex bus underneath
your driver. For example, my USB-using atusb driver (*) has to
do a great many things usbnet already does. And any other
USB-based WPAN driver would be more or less in the same boat.
Of course, one could reinvent that wheel as well and make a
usbwpan, but ... :)

(*) Sneak preview, still with a number of issues, not only style:
https://github.com/wpwrak/ben-wpan-linux/blob/master/drivers/net/ieee802154/atusb.c

- Werner

2013-04-03 02:57:58

by Alan Ott

[permalink] [raw]
Subject: Re: [Linux-zigbee-devel] [PATCH 1/6] mac802154: Immediately retry sending failed packets

On 04/02/2013 10:30 PM, David Miller wrote:
> From: Alan Ott <[email protected]>
> Date: Tue, 02 Apr 2013 22:25:28 -0400
>
>> The workqueue in mac802154 is only needed because the current mac802154
>> xmit() function is designed to be blocking and synchronous.
>>
>> Prior to my patch (#3/6), that very same workqueue would actually queue
>> up packets (without bound). That's what my patch fixes.
>>
>> The workqueue in mac802154 also serializes the access to the device for
>> other functions like setting the channel, ensuring that in the driver
>> code, one doesn't have to mutex everything. I'm not sure if that's the
>> "right" way to do it, but that's the way it was when I got here.
> This is entirely duplicating existing facilities.
>
> Your desire to allow blockability during xmit() on the basis of mutual
> exclusion is not well founded.

I'm not sure it's my desire, but rather a statement of the way it
currently is. To be clear, .ndo_start_xmit() does not block, but queues
a workqueue item which then calls ieee802154_ops->xmit() which does block.

This patch series centers around putting netif_stop_queue() and
netif_wake_queue() in the mac802154 layer. I've sent emails about this
before[1], and gotten no real suggestions about the issue, so I
proceeded with Solution #1 (as described at [1]). If you want to skip
this and go straight to solution #2, then let's talk about what that
might look like. I still think though, that there is benefit in getting
solution #1 in because it fixes some current usability problems
(including the buffer (workqueue) growing without bound).

All that said, I'm not sure I've answered your question or concern.
Please let me know if I'm still not getting it.

Alan.

[1] http://thread.gmane.org/gmane.linux.network/242495/focus=262869

2013-04-03 14:01:20

by Alan Ott

[permalink] [raw]
Subject: [PATCH v2 0/4] 802.15.4 and 6LoWPAN Buffering Fixes

Version 2 of this patch series:

Differences from v1:

1. Patches previously numbered 5 and 6 were squashed (to become current
patch #4) at the request of Alexander Smirnov.

2. Current patch #2 had extraneous braces removed.

3. Current patch #1 was changed. It is now a patch to make mac802154 _not_
retry sending packets on failure. I believe this to be consistent with the
802.15.4 specification (Section 7.5.6.4.3 of IEEE 802.15.4-2006)

Alan Ott (4):
mac802154: Do not try to resend failed packets
mac802154: Use netif flow control
mac802154: Increase tx_buffer_len
6lowpan: handle dev_queue_xmit() error code properly

net/ieee802154/6lowpan.c | 4 ++--
net/mac802154/mac802154.h | 2 --
net/mac802154/tx.c | 26 ++++++++++++++++----------
net/mac802154/wpan.c | 2 +-
4 files changed, 19 insertions(+), 15 deletions(-)

--
1.7.11.2

2013-04-03 14:01:18

by Alan Ott

[permalink] [raw]
Subject: [PATCH v2 4/4] 6lowpan: handle dev_queue_xmit() error code properly

dev_queue_xmit() will return a positive value if the packet could not be
queued, often because the real network device (in our case the mac802154
wpan device) has its queue stopped. lowpan_xmit() should handle the
positive return code (for the debug statement) and return that value to
the higher layer so the higher layer will retry sending the packet.

Signed-off-by: Alan Ott <[email protected]>
---
net/ieee802154/6lowpan.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ieee802154/6lowpan.c b/net/ieee802154/6lowpan.c
index e1b4580..55e1fd5 100644
--- a/net/ieee802154/6lowpan.c
+++ b/net/ieee802154/6lowpan.c
@@ -1139,10 +1139,10 @@ static netdev_tx_t lowpan_xmit(struct sk_buff *skb, struct net_device *dev)
error:
dev_kfree_skb(skb);
out:
- if (err < 0)
+ if (err)
pr_debug("ERROR: xmit failed\n");

- return (err < 0 ? NETDEV_TX_BUSY : NETDEV_TX_OK);
+ return (err < 0) ? NET_XMIT_DROP : err;
}

static struct wpan_phy *lowpan_get_phy(const struct net_device *dev)
--
1.7.11.2

2013-04-03 14:01:16

by Alan Ott

[permalink] [raw]
Subject: [PATCH v2 1/4] mac802154: Do not try to resend failed packets

When ops->xmit() fails, drop the packet. Devices which support hardware
ack and retry (which include all devices currently supported by mainline),
will automatically retry sending the packet (in the hardware) up to 3
times, per the 802.15.4 spec. There is no need, and it is incorrect to
try to do it in mac802154.

Signed-off-by: Alan Ott <[email protected]>
---
net/mac802154/mac802154.h | 2 --
net/mac802154/tx.c | 12 ++----------
2 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/net/mac802154/mac802154.h b/net/mac802154/mac802154.h
index 21fa386..5c9e021 100644
--- a/net/mac802154/mac802154.h
+++ b/net/mac802154/mac802154.h
@@ -88,8 +88,6 @@ struct mac802154_sub_if_data {

#define mac802154_to_priv(_hw) container_of(_hw, struct mac802154_priv, hw)

-#define MAC802154_MAX_XMIT_ATTEMPTS 3
-
#define MAC802154_CHAN_NONE (~(u8)0) /* No channel is assigned */

extern struct ieee802154_reduced_mlme_ops mac802154_mlme_reduced;
diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 4e09d07..7264874 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -39,7 +39,6 @@ struct xmit_work {
struct mac802154_priv *priv;
u8 chan;
u8 page;
- u8 xmit_attempts;
};

static void mac802154_xmit_worker(struct work_struct *work)
@@ -60,18 +59,12 @@ static void mac802154_xmit_worker(struct work_struct *work)
}

res = xw->priv->ops->xmit(&xw->priv->hw, xw->skb);
+ if (res)
+ pr_debug("transmission failed\n");

out:
mutex_unlock(&xw->priv->phy->pib_lock);

- if (res) {
- if (xw->xmit_attempts++ < MAC802154_MAX_XMIT_ATTEMPTS) {
- queue_work(xw->priv->dev_workqueue, &xw->work);
- return;
- } else
- pr_debug("transmission failed for %d times",
- MAC802154_MAX_XMIT_ATTEMPTS);
- }

dev_kfree_skb(xw->skb);

@@ -114,7 +107,6 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct sk_buff *skb,
work->priv = priv;
work->page = page;
work->chan = chan;
- work->xmit_attempts = 0;

queue_work(priv->dev_workqueue, &work->work);

--
1.7.11.2

2013-04-03 14:01:14

by Alan Ott

[permalink] [raw]
Subject: [PATCH v2 2/4] mac802154: Use netif flow control

Use netif_stop_queue() and netif_wake_queue() to control the flow of
packets to mac802154 devices. Since many IEEE 802.15.4 devices have no
output buffer, and since the mac802154 xmit() function is designed to
block, netif_stop_queue() is called after each packet.

Signed-off-by: Alan Ott <[email protected]>
---
net/mac802154/tx.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/net/mac802154/tx.c b/net/mac802154/tx.c
index 7264874..3fd3e07 100644
--- a/net/mac802154/tx.c
+++ b/net/mac802154/tx.c
@@ -25,6 +25,7 @@
#include <linux/if_arp.h>
#include <linux/crc-ccitt.h>

+#include <net/ieee802154_netdev.h>
#include <net/mac802154.h>
#include <net/wpan-phy.h>

@@ -44,6 +45,7 @@ struct xmit_work {
static void mac802154_xmit_worker(struct work_struct *work)
{
struct xmit_work *xw = container_of(work, struct xmit_work, work);
+ struct mac802154_sub_if_data *sdata;
int res;

mutex_lock(&xw->priv->phy->pib_lock);
@@ -65,6 +67,11 @@ static void mac802154_xmit_worker(struct work_struct *work)
out:
mutex_unlock(&xw->priv->phy->pib_lock);

+ /* Restart the netif queue on each sub_if_data object. */
+ rcu_read_lock();
+ list_for_each_entry_rcu(sdata, &xw->priv->slaves, list)
+ netif_wake_queue(sdata->dev);
+ rcu_read_unlock();

dev_kfree_skb(xw->skb);

@@ -75,6 +82,7 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct sk_buff *skb,
u8 page, u8 chan)
{
struct xmit_work *work;
+ struct mac802154_sub_if_data *sdata;

if (!(priv->phy->channels_supported[page] & (1 << chan))) {
WARN_ON(1);
@@ -102,6 +110,12 @@ netdev_tx_t mac802154_tx(struct mac802154_priv *priv, struct sk_buff *skb,
return NETDEV_TX_BUSY;
}

+ /* Stop the netif queue on each sub_if_data object. */
+ rcu_read_lock();
+ list_for_each_entry_rcu(sdata, &priv->slaves, list)
+ netif_stop_queue(sdata->dev);
+ rcu_read_unlock();
+
INIT_WORK(&work->work, mac802154_xmit_worker);
work->skb = skb;
work->priv = priv;
--
1.7.11.2

2013-04-03 14:01:11

by Alan Ott

[permalink] [raw]
Subject: [PATCH v2 3/4] mac802154: Increase tx_buffer_len

Increase the buffer length from 10 to 300 packets. Consider that traffic on
mac802154 devices will often be 6LoWPAN, and a full-length (1280 octet)
IPv6 packet will fragment into 15 6LoWPAN fragments (because the MTU of
IEEE 802.15.4 is 127). A 300-packet queue is really 20 full-length IPv6
packets.

With a queue length of 10, an entire IPv6 packet was unable to get queued
at one time, causing fragments to be dropped, and making reassembly
impossible.

Signed-off-by: Alan Ott <[email protected]>
---
net/mac802154/wpan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/mac802154/wpan.c b/net/mac802154/wpan.c
index 7d3f659..2ca2f4d 100644
--- a/net/mac802154/wpan.c
+++ b/net/mac802154/wpan.c
@@ -360,7 +360,7 @@ void mac802154_wpan_setup(struct net_device *dev)
dev->header_ops = &mac802154_header_ops;
dev->needed_tailroom = 2; /* FCS */
dev->mtu = IEEE802154_MTU;
- dev->tx_queue_len = 10;
+ dev->tx_queue_len = 300;
dev->type = ARPHRD_IEEE802154;
dev->flags = IFF_NOARP | IFF_BROADCAST;
dev->watchdog_timeo = 0;
--
1.7.11.2

2013-04-07 21:06:58

by David Miller

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] 802.15.4 and 6LoWPAN Buffering Fixes

From: Alan Ott <[email protected]>
Date: Wed, 3 Apr 2013 10:00:54 -0400

> Version 2 of this patch series:
>
> Differences from v1:
>
> 1. Patches previously numbered 5 and 6 were squashed (to become current
> patch #4) at the request of Alexander Smirnov.
>
> 2. Current patch #2 had extraneous braces removed.
>
> 3. Current patch #1 was changed. It is now a patch to make mac802154 _not_
> retry sending packets on failure. I believe this to be consistent with the
> 802.15.4 specification (Section 7.5.6.4.3 of IEEE 802.15.4-2006)

Series applied, thanks.