tpacket_snd() can change and kfree an skb after dev_queue_xmit(),
which is illegal.
With debugging by: Stephen Hemminger <[email protected]>
Reported-by: Michael Breuer <[email protected]>
Tested-by: Michael Breuer <[email protected]>
Signed-off-by: Jarek Poplawski <[email protected]>
Acked-by: Stephen Hemminger <[email protected]>
---
net/packet/af_packet.c | 8 +++-----
1 files changed, 3 insertions(+), 5 deletions(-)
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index e0516a2..aba2049 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1021,9 +1021,10 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
status = TP_STATUS_SEND_REQUEST;
err = dev_queue_xmit(skb);
- if (unlikely(err > 0 && (err = net_xmit_errno(err)) != 0))
- goto out_xmit;
packet_increment_head(&po->tx_ring);
+ if (unlikely(err > 0 && (err = net_xmit_errno(err)) != 0))
+ goto out_put;
+
len_sum += tp_len;
} while (likely((ph != NULL) ||
((!(msg->msg_flags & MSG_DONTWAIT)) &&
@@ -1033,9 +1034,6 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
err = len_sum;
goto out_put;
-out_xmit:
- skb->destructor = sock_wfree;
- atomic_dec(&po->tx_ring.pending);
out_status:
__packet_set_status(po, ph, status);
kfree_skb(skb);
From: Jarek Poplawski <[email protected]>
Date: Sat, 9 Jan 2010 13:38:27 +0100
> tpacket_snd() can change and kfree an skb after dev_queue_xmit(),
> which is illegal.
>
> With debugging by: Stephen Hemminger <[email protected]>
>
> Reported-by: Michael Breuer <[email protected]>
> Tested-by: Michael Breuer <[email protected]>
> Signed-off-by: Jarek Poplawski <[email protected]>
> Acked-by: Stephen Hemminger <[email protected]>
Jarek, if this code path triggers, it will deadlock the
send ring with your changes.
We will now leave the ring packet status in the "SENDING" state.
That's not right.
Then, if the application calls send again, we will just return
immediately since we only make progress if the head ring entry is in
SEND_REQUEST state.
This is really bogus behavior. When the qdisc or mid-layer
drops the packet, we should at least mark the packet state
properly (which is what the current code would does, sans
the "reference SKB after dev_queue_xmit()" issue). And
advance the packet ring pointer.
This way the user:
1) can see that the packet got dropped and couldn't be sent
2) can call send again to try sending the rest of the ring
Fix the use after dev_queue_xmit() issue, but don't change other side
effects which are important for correct AF_PACKET TX ring semantics.
On Sun, Jan 10, 2010 at 01:51:35PM -0800, David Miller wrote:
> From: Jarek Poplawski <[email protected]>
> Date: Sat, 9 Jan 2010 13:38:27 +0100
>
> > tpacket_snd() can change and kfree an skb after dev_queue_xmit(),
> > which is illegal.
> >
> > With debugging by: Stephen Hemminger <[email protected]>
> >
> > Reported-by: Michael Breuer <[email protected]>
> > Tested-by: Michael Breuer <[email protected]>
> > Signed-off-by: Jarek Poplawski <[email protected]>
> > Acked-by: Stephen Hemminger <[email protected]>
>
> Jarek, if this code path triggers, it will deadlock the
> send ring with your changes.
>
> We will now leave the ring packet status in the "SENDING" state.
>
> That's not right.
No, the destructor of this skb, tpacket_destruct_skb(), will clean
this. (Just like for other skbs kfreed during dev_queue_xmit().)
Jarek P.
On 10-01-2010 22:51, David Miller wrote:
> From: Jarek Poplawski <[email protected]>
> Date: Sat, 9 Jan 2010 13:38:27 +0100
>
>> tpacket_snd() can change and kfree an skb after dev_queue_xmit(),
>> which is illegal.
>>
>> With debugging by: Stephen Hemminger <[email protected]>
>>
>> Reported-by: Michael Breuer <[email protected]>
>> Tested-by: Michael Breuer <[email protected]>
>> Signed-off-by: Jarek Poplawski <[email protected]>
>> Acked-by: Stephen Hemminger <[email protected]>
>
> Jarek, if this code path triggers, it will deadlock the
> send ring with your changes.
>
> We will now leave the ring packet status in the "SENDING" state.
>
> That's not right.
>
> Then, if the application calls send again, we will just return
> immediately since we only make progress if the head ring entry is in
> SEND_REQUEST state.
>
> This is really bogus behavior. When the qdisc or mid-layer
> drops the packet, we should at least mark the packet state
> properly (which is what the current code would does, sans
> the "reference SKB after dev_queue_xmit()" issue). And
> advance the packet ring pointer.
>
> This way the user:
>
> 1) can see that the packet got dropped and couldn't be sent
>
> 2) can call send again to try sending the rest of the ring
>
> Fix the use after dev_queue_xmit() issue, but don't change other side
> effects which are important for correct AF_PACKET TX ring semantics.
As I wrote already, I don't think this patch is wrong. Alas, we can't
both fix this bug and retain exactly current behaviour, at least
without deeper changes. And I doubt it's worth it if we ignore negative
dev_queue_xmit() return (drops also) at the same time.
Btw, there was an alternative fix (positively) tested - more radical,
but IMHO safe and appropriate at least as a temporary solution for
-stable:
http://permalink.gmane.org/gmane.linux.kernel/934761
Anyway, here is another try, with even more of the current semantics.
If you think it's better, I hope Michael can test it (and send his
Tested-by).
Thanks,
Jarek P.
----------------> (alternative 3)
Subject: af_packet: Don't use skb after dev_queue_xmit()
tpacket_snd() can change and kfree an skb after dev_queue_xmit(),
which is illegal.
With debugging by: Stephen Hemminger <[email protected]>
Reported-by: Michael Breuer <[email protected]>
Signed-off-by: Jarek Poplawski <[email protected]>
Cc: Stephen Hemminger <[email protected]>
---
net/packet/af_packet.c | 19 ++++++++++++++-----
1 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index e0516a2..f126d18 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1021,8 +1021,20 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
status = TP_STATUS_SEND_REQUEST;
err = dev_queue_xmit(skb);
- if (unlikely(err > 0 && (err = net_xmit_errno(err)) != 0))
- goto out_xmit;
+ if (unlikely(err > 0)) {
+ err = net_xmit_errno(err);
+ if (err && __packet_get_status(po, ph) ==
+ TP_STATUS_AVAILABLE) {
+ /* skb was destructed already */
+ skb = NULL;
+ goto out_status;
+ }
+ /*
+ * skb was dropped but not destructed yet;
+ * let's treat it like congestion or err < 0
+ */
+ err = 0;
+ }
packet_increment_head(&po->tx_ring);
len_sum += tp_len;
} while (likely((ph != NULL) ||
@@ -1033,9 +1045,6 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
err = len_sum;
goto out_put;
-out_xmit:
- skb->destructor = sock_wfree;
- atomic_dec(&po->tx_ring.pending);
out_status:
__packet_set_status(po, ph, status);
kfree_skb(skb);
On 1/11/2010 3:04 AM, Jarek Poplawski wrote:
> On 10-01-2010 22:51, David Miller wrote:
>
>> From: Jarek Poplawski<[email protected]>
>> Date: Sat, 9 Jan 2010 13:38:27 +0100
>>
>>
>>> tpacket_snd() can change and kfree an skb after dev_queue_xmit(),
>>> which is illegal.
>>>
>>> With debugging by: Stephen Hemminger<[email protected]>
>>>
>>> Reported-by: Michael Breuer<[email protected]>
>>> Tested-by: Michael Breuer<[email protected]>
>>> Signed-off-by: Jarek Poplawski<[email protected]>
>>> Acked-by: Stephen Hemminger<[email protected]>
>>>
>> Jarek, if this code path triggers, it will deadlock the
>> send ring with your changes.
>>
>> We will now leave the ring packet status in the "SENDING" state.
>>
>> That's not right.
>>
>> Then, if the application calls send again, we will just return
>> immediately since we only make progress if the head ring entry is in
>> SEND_REQUEST state.
>>
>> This is really bogus behavior. When the qdisc or mid-layer
>> drops the packet, we should at least mark the packet state
>> properly (which is what the current code would does, sans
>> the "reference SKB after dev_queue_xmit()" issue). And
>> advance the packet ring pointer.
>>
>> This way the user:
>>
>> 1) can see that the packet got dropped and couldn't be sent
>>
>> 2) can call send again to try sending the rest of the ring
>>
>> Fix the use after dev_queue_xmit() issue, but don't change other side
>> effects which are important for correct AF_PACKET TX ring semantics.
>>
> As I wrote already, I don't think this patch is wrong. Alas, we can't
> both fix this bug and retain exactly current behaviour, at least
> without deeper changes. And I doubt it's worth it if we ignore negative
> dev_queue_xmit() return (drops also) at the same time.
>
> Btw, there was an alternative fix (positively) tested - more radical,
> but IMHO safe and appropriate at least as a temporary solution for
> -stable:
> http://permalink.gmane.org/gmane.linux.kernel/934761
>
> Anyway, here is another try, with even more of the current semantics.
> If you think it's better, I hope Michael can test it (and send his
> Tested-by).
>
> Thanks,
> Jarek P.
> ----------------> (alternative 3)
>
> Subject: af_packet: Don't use skb after dev_queue_xmit()
>
> tpacket_snd() can change and kfree an skb after dev_queue_xmit(),
> which is illegal.
>
> With debugging by: Stephen Hemminger<[email protected]>
>
> Reported-by: Michael Breuer<[email protected]>
> Signed-off-by: Jarek Poplawski<[email protected]>
>
> Cc: Stephen Hemminger<[email protected]>
> ---
>
> net/packet/af_packet.c | 19 ++++++++++++++-----
> 1 files changed, 14 insertions(+), 5 deletions(-)
>
> diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
> index e0516a2..f126d18 100644
> --- a/net/packet/af_packet.c
> +++ b/net/packet/af_packet.c
> @@ -1021,8 +1021,20 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
>
> status = TP_STATUS_SEND_REQUEST;
> err = dev_queue_xmit(skb);
> - if (unlikely(err> 0&& (err = net_xmit_errno(err)) != 0))
> - goto out_xmit;
> + if (unlikely(err> 0)) {
> + err = net_xmit_errno(err);
> + if (err&& __packet_get_status(po, ph) ==
> + TP_STATUS_AVAILABLE) {
> + /* skb was destructed already */
> + skb = NULL;
> + goto out_status;
> + }
> + /*
> + * skb was dropped but not destructed yet;
> + * let's treat it like congestion or err< 0
> + */
> + err = 0;
> + }
> packet_increment_head(&po->tx_ring);
> len_sum += tp_len;
> } while (likely((ph != NULL) ||
> @@ -1033,9 +1045,6 @@ static int tpacket_snd(struct packet_sock *po, struct msghdr *msg)
> err = len_sum;
> goto out_put;
>
> -out_xmit:
> - skb->destructor = sock_wfree;
> - atomic_dec(&po->tx_ring.pending);
> out_status:
> __packet_set_status(po, ph, status);
> kfree_skb(skb);
>
Tested by: Michael Breuer
Note: This patch is delivering better ethernet throughput (15-20%) and
no than the previous two patches. I'm also no longer seeing dropped RX
packets. Good work!
On Mon, Jan 11, 2010 at 05:30:45PM -0500, Michael Breuer wrote:
> On 1/11/2010 3:04 AM, Jarek Poplawski wrote:
...
> >----------------> (alternative 3)
> >
> >Subject: af_packet: Don't use skb after dev_queue_xmit()
> >
> >tpacket_snd() can change and kfree an skb after dev_queue_xmit(),
> >which is illegal.
> >
> >With debugging by: Stephen Hemminger<[email protected]>
> >
> >Reported-by: Michael Breuer<[email protected]>
> >Signed-off-by: Jarek Poplawski<[email protected]>
> >
> >Cc: Stephen Hemminger<[email protected]>
> >---
> >
> > net/packet/af_packet.c | 19 ++++++++++++++-----
> > 1 files changed, 14 insertions(+), 5 deletions(-)
...
> Tested by: Michael Breuer
>
> Note: This patch is delivering better ethernet throughput (15-20%)
> and no than the previous two patches. I'm also no longer seeing
> dropped RX packets. Good work!
David, if this patch is acceptable now, please add:
With help from: David S. Miller <[email protected]>
Tested-by: Michael Breuer<[email protected]>
Thanks everybody,
Jarek P.
From: Jarek Poplawski <[email protected]>
Date: Mon, 11 Jan 2010 23:48:17 +0100
> David, if this patch is acceptable now, please add:
>
> With help from: David S. Miller <[email protected]>
>
> Tested-by: Michael Breuer<[email protected]>
Okie dokie, I plan to look over this stuff later today.
From: Jarek Poplawski <[email protected]>
Date: Mon, 11 Jan 2010 23:48:17 +0100
> David, if this patch is acceptable now, please add:
>
> With help from: David S. Miller <[email protected]>
>
> Tested-by: Michael Breuer<[email protected]>
Looks great, applied, thanks everyone!