2012-12-07 20:31:37

by David Miller

[permalink] [raw]
Subject: Re: [RFC PATCH] af_packet: don't to defrag shared skb

From: Eric Leblond <[email protected]>
Date: Fri, 7 Dec 2012 19:56:01 +0100

Wireless folks, please take a look. The issue is that,
under the circumstances listed below, we get SKBs in
the AF_PACKET input path that are shared.

Given the logic present in ieee80211_deliver_skb() I think
the mac80211 code doesn't expect this either.

More commentary from me below:

> This patch is adding a check on skb before trying to defrag the
> packet for the hash computation in fanout mode. The goal of this
> patch is to avoid an kernel crash in pskb_expand_head.
> It appears that under some specific condition there is a shared
> skb reaching the defrag code and this lead to a crash due to the
> following code:
>
> if (skb_shared(skb))
> BUG();
>
> I've observed this crash under the following condition:
> 1. a program is listening to an wifi interface (let say wlan0)
> 2. it is using fanout capture in flow load balancing mode
> 3. defrag option is on on the fanout socket
> 4. the interface disconnect (radio down for example)
> 5. the interface reconnect (radio switched up)
> 6. once reconnected a single packet is seen with skb->users=2
> 7. the kernel crash in pskb_expand_head at skbuff.c:1035
>
> [BBB55:744364] [<ffffffff812a2761>] ? __pskb_pull_tail+0x43x0x26f
> [BB8S5.744395] [<ffffffff812d29Tb>] ? ip_check_defrag+ox3a/0x14a
> [BBB55.744422] [<ffffffffB1344459>] ? packet_rcv_fanout+ox5e/oxf9
> [BBBS5.7444S0] [<ffffffffB12aaS9b>] ? __netif_receive_skb+ox444/ox4f9
> [BBB55.T4447B] [<ffffffffB12aa?e1>] ? netif_receive_skb+ox6d/0x?3
> [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_deliver_skb+0xbd/0xfa [mac80211]
> [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_rx_h_data+0x1e0/0x21a [mac80211]
> [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_rx_handlers+0x3d5/0x480 [mac80211]
> [BBB55.T4447B] [<ffffffffB12aa?e1>] ? __wake_up
> [BBB55.T4447B] [<ffffffffB12aa?e1>] ? evdev_eventr+0xc0/0xcf [evdev]
>
> Signed-off-by: Eric Leblond <[email protected]>

So if we look at ieee80211_deliver_skb(), it has code to deal with unaligned
packet headers, wherein it memoves() the data into a better aligned location.

But if these SKBs really are skb_shared(), this packet data
modification is illegal.

I suspect that the assumptions built into this unaligned data handling
code, and AF_PACKET, are correct. Meaning that we should never see
skb_shared() packets here. We just have a missing skb_copy()
somewhere in mac80211, Johannes can you please take a look?

Thanks.


2012-12-10 09:40:45

by Johannes Berg

[permalink] [raw]
Subject: [PATCH] ipv4: ip_check_defrag must not modify skb before unsharing

From: Johannes Berg <[email protected]>

ip_check_defrag() might be called from af_packet within the
RX path where shared SKBs are used, so it must not modify
the input SKB before it has unshared it for defragmentation.
Use skb_copy_bits() to get the IP header and only pull in
everything later.

The same is true for the other caller in macvlan as it is
called from dev->rx_handler which can also get a shared SKB.

Reported-by: Eric Leblond <[email protected]>
Cc: [email protected]
Signed-off-by: Johannes Berg <[email protected]>
---
For some versions of the kernel, this code goes into af_packet.c

net/ipv4/ip_fragment.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 448e685..8d5cc75 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -707,28 +707,27 @@ EXPORT_SYMBOL(ip_defrag);

struct sk_buff *ip_check_defrag(struct sk_buff *skb, u32 user)
{
- const struct iphdr *iph;
+ struct iphdr iph;
u32 len;

if (skb->protocol != htons(ETH_P_IP))
return skb;

- if (!pskb_may_pull(skb, sizeof(struct iphdr)))
+ if (!skb_copy_bits(skb, 0, &iph, sizeof(iph)))
return skb;

- iph = ip_hdr(skb);
- if (iph->ihl < 5 || iph->version != 4)
+ if (iph.ihl < 5 || iph.version != 4)
return skb;
- if (!pskb_may_pull(skb, iph->ihl*4))
- return skb;
- iph = ip_hdr(skb);
- len = ntohs(iph->tot_len);
- if (skb->len < len || len < (iph->ihl * 4))
+
+ len = ntohs(iph.tot_len);
+ if (skb->len < len || len < (iph.ihl * 4))
return skb;

- if (ip_is_fragment(ip_hdr(skb))) {
+ if (ip_is_fragment(&iph)) {
skb = skb_share_check(skb, GFP_ATOMIC);
if (skb) {
+ if (!pskb_may_pull(skb, iph.ihl*4))
+ return skb;
if (pskb_trim_rcsum(skb, len))
return skb;
memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));
--
1.8.0




2012-12-07 21:30:04

by Johannes Berg

[permalink] [raw]
Subject: Re: [RFC PATCH] af_packet: don't to defrag shared skb



> Wireless folks, please take a look. The issue is that,
> under the circumstances listed below, we get SKBs in
> the AF_PACKET input path that are shared.

Ok so I took a look, but I can't see where the wireless stack is going
wrong.

> Given the logic present in ieee80211_deliver_skb() I think
> the mac80211 code doesn't expect this either.

This is correct, but the driver should never give us a shared skb. From
the other mail it seems Eric is using iwlwifi, which is definitely not
creating shared SKBs. Nothing in mac80211 creates them either.

> > I've observed this crash under the following condition:
> > 1. a program is listening to an wifi interface (let say wlan0)
> > 2. it is using fanout capture in flow load balancing mode
> > 3. defrag option is on on the fanout socket

How do you set this up, and what does it do? I'd like to try to
reproduce this.

> > 4. the interface disconnect (radio down for example)
> > 5. the interface reconnect (radio switched up)
> > 6. once reconnected a single packet is seen with skb->users=2

That's interesting. A single one seems odd. I might have expected two,
but not one. Well, since you removed the crash ... I guess I'll have to
believe that there's just one and the second one doesn't show up because
we crashed before :-)

> So if we look at ieee80211_deliver_skb(), it has code to deal with unaligned
> packet headers, wherein it memoves() the data into a better aligned location.
>
> But if these SKBs really are skb_shared(), this packet data
> modification is illegal.
>
> I suspect that the assumptions built into this unaligned data handling
> code, and AF_PACKET, are correct. Meaning that we should never see
> skb_shared() packets here. We just have a missing skb_copy()
> somewhere in mac80211, Johannes can you please take a look?

My first theory was related to multiple virtual interfaces, but Eric
didn't say he was running that, but we use skb_copy() for that in
ieee80211_prepare_and_rx_handle(). That's not necessarily the most
efficient (another reason for drivers to use paged RX here) but clearly
not causing the issue.

The only other theory I can come up with right now is that the skb_get()
happens in deliver_skb via __netif_receive_skb. Keeping in mind that
wpa_supplicant might have another packet socket open for authentication
packets, that seems like a possibility. I'll test it once I figure out
how to do this "defrag" option you speak of :)

johannes



2012-12-07 21:33:19

by Eric Leblond

[permalink] [raw]
Subject: Re: [RFC PATCH] af_packet: don't to defrag shared skb

Hi,

On Fri, 2012-12-07 at 15:31 -0500, David Miller wrote:
> From: Eric Leblond <[email protected]>
> Date: Fri, 7 Dec 2012 19:56:01 +0100
>
> Wireless folks, please take a look. The issue is that,
> under the circumstances listed below, we get SKBs in
> the AF_PACKET input path that are shared.
>
> Given the logic present in ieee80211_deliver_skb() I think
> the mac80211 code doesn't expect this either.
>
> More commentary from me below:
>
> > This patch is adding a check on skb before trying to defrag the
> > packet for the hash computation in fanout mode. The goal of this
> > patch is to avoid an kernel crash in pskb_expand_head.
> > It appears that under some specific condition there is a shared
> > skb reaching the defrag code and this lead to a crash due to the
> > following code:
> >
> > if (skb_shared(skb))
> > BUG();
> >
> > I've observed this crash under the following condition:
> > 1. a program is listening to an wifi interface (let say wlan0)
> > 2. it is using fanout capture in flow load balancing mode
> > 3. defrag option is on on the fanout socket
> > 4. the interface disconnect (radio down for example)
> > 5. the interface reconnect (radio switched up)
> > 6. once reconnected a single packet is seen with skb->users=2
> > 7. the kernel crash in pskb_expand_head at skbuff.c:1035
> >
> > [BBB55:744364] [<ffffffff812a2761>] ? __pskb_pull_tail+0x43x0x26f
> > [BB8S5.744395] [<ffffffff812d29Tb>] ? ip_check_defrag+ox3a/0x14a
> > [BBB55.744422] [<ffffffffB1344459>] ? packet_rcv_fanout+ox5e/oxf9
> > [BBBS5.7444S0] [<ffffffffB12aaS9b>] ? __netif_receive_skb+ox444/ox4f9
> > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? netif_receive_skb+ox6d/0x?3
> > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_deliver_skb+0xbd/0xfa [mac80211]
> > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_rx_h_data+0x1e0/0x21a [mac80211]
> > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_rx_handlers+0x3d5/0x480 [mac80211]
> > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? __wake_up
> > [BBB55.T4447B] [<ffffffffB12aa?e1>] ? evdev_eventr+0xc0/0xcf [evdev]
> >
> > Signed-off-by: Eric Leblond <[email protected]>
>
> So if we look at ieee80211_deliver_skb(), it has code to deal with unaligned
> packet headers, wherein it memoves() the data into a better aligned location.
>
> But if these SKBs really are skb_shared(), this packet data
> modification is illegal.
>
> I suspect that the assumptions built into this unaligned data handling
> code, and AF_PACKET, are correct. Meaning that we should never see
> skb_shared() packets here. We just have a missing skb_copy()
> somewhere in mac80211, Johannes can you please take a look?

Here's some more info that may help people knowing the code. During my
test, I've removed the BUG() and replaced with a printk to have a living
kernel. Only one single shared skb was seen for each up event.

I've also add another oops in the same code:
[BBB55:744364] [<ffffffff812a2761>] ? __pskb_pull_tail+0x43x0x26f
[BB8S5.744395] [<ffffffff812d29Tb>] ? ip_check_defrag+ox3a/0x14a
[BBB55.744422] [<ffffffffB1344459>] ? packet_rcv_fanout+ox5e/oxf9
[BBBS5.7444S0] [<ffffffffB12aaS9b>] ? __netif_receive_skb+ox444/ox4f9
[BBB55.T4447B] [<ffffffffB12aa?e1>] ? netif_receive_skb+ox6d/0x?3
[BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_deliver_skb+0xbd/0xfa [mac80211]
[BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_rx_h_data+0x1e0/0x21a [mac80211]
[BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_rx_handlers+0x3d5/0x480 [mac80211]
[BBB55.T4447B] [<ffffffffB12aa?e1>] ? _raw_spin_lock_irqsave+0x14/0x35
[BBB55.T4447B] [<ffffffffB12aa?e1>] ? ieee80211_prepare_and_rx_handle+0x5a3/0x5db [mac80211]
...
[BBB55.T4447B] [<ffffffffB12aa?e1>] ? ttwu_dowakeup+0x2d

Picture of the oops available here:
http://home.regit.org/~regit/wireless-oops.jpg

BR,
--
Eric Leblond <[email protected]>
Blog: https://home.regit.org/


2012-12-07 22:23:17

by Johannes Berg

[permalink] [raw]
Subject: Re: [RFC PATCH] af_packet: don't to defrag shared skb

Oh I should say ... IP is like black magic to me ;-)

> - if (!pskb_may_pull(skb, sizeof(struct iphdr)))
> + if (!skb_copy_bits(skb, 0, &iph, sizeof(iph)))
> return skb;
>
> - iph = ip_hdr(skb);
> - if (iph->ihl < 5 || iph->version != 4)
> + if (iph.ihl < 5 || iph.version != 4)
> return skb;
> - if (!pskb_may_pull(skb, iph->ihl*4))
> - return skb;
> - iph = ip_hdr(skb);
> - len = ntohs(iph->tot_len);
> - if (skb->len < len || len < (iph->ihl * 4))
> +
> + len = ntohs(iph.tot_len);
> + if (skb->len < len || len < (iph.ihl * 4))
> return skb;
>
> - if (ip_is_fragment(ip_hdr(skb))) {
> + if (ip_is_fragment(&iph)) {
> skb = skb_share_check(skb, GFP_ATOMIC);
> if (skb) {
> + if (!pskb_may_pull(skb, iph.ihl*4))
> + return skb;

I moved this here but I have no idea what it does. Asking if we can pull
this here seems a bit pointless, and in the previous place it seemed
similarly pointless since we only use the static part of the IP header
and don't look at any options... So to me it seems this pskb_may_pull()
could just be removed, and that most likely means I'm messing with code
I don't understand :-)

johannes


2012-12-10 11:03:42

by Eric Leblond

[permalink] [raw]
Subject: Re: [PATCH] ipv4: ip_check_defrag must not modify skb before unsharing

Hello,

On Mon, 2012-12-10 at 10:41 +0100, Johannes Berg wrote:
> From: Johannes Berg <[email protected]>
>
> ip_check_defrag() might be called from af_packet within the
> RX path where shared SKBs are used, so it must not modify
> the input SKB before it has unshared it for defragmentation.
> Use skb_copy_bits() to get the IP header and only pull in
> everything later.
>
> The same is true for the other caller in macvlan as it is
> called from dev->rx_handler which can also get a shared SKB.

I've applied the patch and built a new kernel. I did not manage to get
it crashed when using the two techniques (suspend to ram and down/up
interface) that were working well to crash kernel without the patch.

BR,

> Reported-by: Eric Leblond <[email protected]>
> Cc: [email protected]
> Signed-off-by: Johannes Berg <[email protected]>
> ---
> For some versions of the kernel, this code goes into af_packet.c
>
> net/ipv4/ip_fragment.c | 19 +++++++++----------
> 1 file changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
> index 448e685..8d5cc75 100644
> --- a/net/ipv4/ip_fragment.c
> +++ b/net/ipv4/ip_fragment.c
> @@ -707,28 +707,27 @@ EXPORT_SYMBOL(ip_defrag);
>
> struct sk_buff *ip_check_defrag(struct sk_buff *skb, u32 user)
> {
> - const struct iphdr *iph;
> + struct iphdr iph;
> u32 len;
>
> if (skb->protocol != htons(ETH_P_IP))
> return skb;
>
> - if (!pskb_may_pull(skb, sizeof(struct iphdr)))
> + if (!skb_copy_bits(skb, 0, &iph, sizeof(iph)))
> return skb;
>
> - iph = ip_hdr(skb);
> - if (iph->ihl < 5 || iph->version != 4)
> + if (iph.ihl < 5 || iph.version != 4)
> return skb;
> - if (!pskb_may_pull(skb, iph->ihl*4))
> - return skb;
> - iph = ip_hdr(skb);
> - len = ntohs(iph->tot_len);
> - if (skb->len < len || len < (iph->ihl * 4))
> +
> + len = ntohs(iph.tot_len);
> + if (skb->len < len || len < (iph.ihl * 4))
> return skb;
>
> - if (ip_is_fragment(ip_hdr(skb))) {
> + if (ip_is_fragment(&iph)) {
> skb = skb_share_check(skb, GFP_ATOMIC);
> if (skb) {
> + if (!pskb_may_pull(skb, iph.ihl*4))
> + return skb;
> if (pskb_trim_rcsum(skb, len))
> return skb;
> memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));

--
Eric Leblond <[email protected]>
Blog: https://home.regit.org/


2012-12-10 18:45:32

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH] ipv4: ip_check_defrag must not modify skb before unsharing

On Mon, 2012-12-10 at 13:41 -0500, David Miller wrote:
> From: Johannes Berg <[email protected]>
> Date: Mon, 10 Dec 2012 10:41:06 +0100
>
> > From: Johannes Berg <[email protected]>
> >
> > ip_check_defrag() might be called from af_packet within the
> > RX path where shared SKBs are used, so it must not modify
> > the input SKB before it has unshared it for defragmentation.
> > Use skb_copy_bits() to get the IP header and only pull in
> > everything later.
> >
> > The same is true for the other caller in macvlan as it is
> > called from dev->rx_handler which can also get a shared SKB.
> >
> > Reported-by: Eric Leblond <[email protected]>
> > Cc: [email protected]
> > Signed-off-by: Johannes Berg <[email protected]>
> > ---
> > For some versions of the kernel, this code goes into af_packet.c
>
> So the bug is that ip_check_defrag() has a precondition which is met
> properly by all callers except AF_PACKET.
>
> If this is the case, remind me why are we changing ip_check_defrag()
> rather than the violator of the precondition?

I don't think this is the case.

If you're referring to my note about af_packet: the kernels where this
goes into af_packet.c are the kernels that don't even have
ip_check_defrag() because macvlan didn't exist/didn't have ip defrag
support and af_packet had this code there -- see commit bc416d9768a.

If you're not referring to my note about af_packet: both callers (there
are only two) of ip_check_defrag() have this bug as far as I can tell
because they're both in the part of the RX path where shared SKBs might
happen.

johannes



2012-12-10 09:29:33

by Johannes Berg

[permalink] [raw]
Subject: Re: [RFC PATCH] af_packet: don't to defrag shared skb

On Fri, 2012-12-07 at 23:23 +0100, Johannes Berg wrote:
> Oh I should say ... IP is like black magic to me ;-)
>
> > - if (!pskb_may_pull(skb, sizeof(struct iphdr)))
> > + if (!skb_copy_bits(skb, 0, &iph, sizeof(iph)))
> > return skb;
> >
> > - iph = ip_hdr(skb);
> > - if (iph->ihl < 5 || iph->version != 4)
> > + if (iph.ihl < 5 || iph.version != 4)
> > return skb;
> > - if (!pskb_may_pull(skb, iph->ihl*4))
> > - return skb;
> > - iph = ip_hdr(skb);
> > - len = ntohs(iph->tot_len);
> > - if (skb->len < len || len < (iph->ihl * 4))
> > +
> > + len = ntohs(iph.tot_len);
> > + if (skb->len < len || len < (iph.ihl * 4))
> > return skb;
> >
> > - if (ip_is_fragment(ip_hdr(skb))) {
> > + if (ip_is_fragment(&iph)) {
> > skb = skb_share_check(skb, GFP_ATOMIC);
> > if (skb) {
> > + if (!pskb_may_pull(skb, iph.ihl*4))
> > + return skb;
>
> I moved this here but I have no idea what it does.

Well, ok, looking further it does seem kinda obvious -- ip_defrag()
assumes the IP header is (fully?) present in the skb header, so that's
what this does.

Eric (Leblond), could you test this patch?

johannes


2012-12-10 18:50:16

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] ipv4: ip_check_defrag must not modify skb before unsharing

From: Johannes Berg <[email protected]>
Date: Mon, 10 Dec 2012 19:45:52 +0100

> On Mon, 2012-12-10 at 13:41 -0500, David Miller wrote:
>> So the bug is that ip_check_defrag() has a precondition which is met
>> properly by all callers except AF_PACKET.
>>
>> If this is the case, remind me why are we changing ip_check_defrag()
>> rather than the violator of the precondition?
>
> I don't think this is the case.
>
> If you're referring to my note about af_packet: the kernels where this
> goes into af_packet.c are the kernels that don't even have
> ip_check_defrag() because macvlan didn't exist/didn't have ip defrag
> support and af_packet had this code there -- see commit bc416d9768a.
>
> If you're not referring to my note about af_packet: both callers (there
> are only two) of ip_check_defrag() have this bug as far as I can tell
> because they're both in the part of the RX path where shared SKBs might
> happen.

You're right, I misinterpreted what's happening here.

My misunderstanding was that this was a situation where normal IPV4
input processing makes sure the SKB is unshared, and we had special
code paths that didn't make sure that was the case.

Rather, here, we have a special entrypoint for macvlan and AF_PACKET
which is supposed to take care of such issues since it is known to
execute in a different kind of environment.

I'm pretty sure I'll apply this, after I check a few more things,
thanks Johannes!

2012-12-10 18:41:49

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] ipv4: ip_check_defrag must not modify skb before unsharing

From: Johannes Berg <[email protected]>
Date: Mon, 10 Dec 2012 10:41:06 +0100

> From: Johannes Berg <[email protected]>
>
> ip_check_defrag() might be called from af_packet within the
> RX path where shared SKBs are used, so it must not modify
> the input SKB before it has unshared it for defragmentation.
> Use skb_copy_bits() to get the IP header and only pull in
> everything later.
>
> The same is true for the other caller in macvlan as it is
> called from dev->rx_handler which can also get a shared SKB.
>
> Reported-by: Eric Leblond <[email protected]>
> Cc: [email protected]
> Signed-off-by: Johannes Berg <[email protected]>
> ---
> For some versions of the kernel, this code goes into af_packet.c

So the bug is that ip_check_defrag() has a precondition which is met
properly by all callers except AF_PACKET.

If this is the case, remind me why are we changing ip_check_defrag()
rather than the violator of the precondition?

2012-12-07 22:12:23

by Johannes Berg

[permalink] [raw]
Subject: Re: [RFC PATCH] af_packet: don't to defrag shared skb

On Fri, 2012-12-07 at 22:41 +0100, Johannes Berg wrote:

> wpa_supplicant opens a packet socket for ETH_P_EAPOL, which indirectly
> eventually calls dev_add_pack(). But if you do the same for another
> socket, you'll get the same again, and then deliver_skb() will deliver
> only a refcounted packet to the prot_hook->func().
>
> This seems like it could very well cause the problem?

Ok I couldn't reproduce it because suricata didn't work this way, but I
did try using tcpdump (without the fanout) and deliver_skb() *is* called
twice for each packet as I thought. It thus seems the problem is
entirely in af_packet itself. It was changed a bit by Eric Dumazet in
bc416d9768 but the original code goes back to the original defrag
support in 7736d33f4, as far as I can tell. aec27311c changed the code
to not do skb_clone(), but it seems the skb_share_check() should be
before the pskb_pull().

Well, it seems ip_check_defrag() should simply not modify the SKB before
it unshares it ... like this maybe:

diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index 448e685..8d5cc75 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -707,28 +707,27 @@ EXPORT_SYMBOL(ip_defrag);

struct sk_buff *ip_check_defrag(struct sk_buff *skb, u32 user)
{
- const struct iphdr *iph;
+ struct iphdr iph;
u32 len;

if (skb->protocol != htons(ETH_P_IP))
return skb;

- if (!pskb_may_pull(skb, sizeof(struct iphdr)))
+ if (!skb_copy_bits(skb, 0, &iph, sizeof(iph)))
return skb;

- iph = ip_hdr(skb);
- if (iph->ihl < 5 || iph->version != 4)
+ if (iph.ihl < 5 || iph.version != 4)
return skb;
- if (!pskb_may_pull(skb, iph->ihl*4))
- return skb;
- iph = ip_hdr(skb);
- len = ntohs(iph->tot_len);
- if (skb->len < len || len < (iph->ihl * 4))
+
+ len = ntohs(iph.tot_len);
+ if (skb->len < len || len < (iph.ihl * 4))
return skb;

- if (ip_is_fragment(ip_hdr(skb))) {
+ if (ip_is_fragment(&iph)) {
skb = skb_share_check(skb, GFP_ATOMIC);
if (skb) {
+ if (!pskb_may_pull(skb, iph.ihl*4))
+ return skb;
if (pskb_trim_rcsum(skb, len))
return skb;
memset(IPCB(skb), 0, sizeof(struct inet_skb_parm));

johannes


2012-12-07 21:46:49

by Eric Leblond

[permalink] [raw]
Subject: Re: [RFC PATCH] af_packet: don't to defrag shared skb

Hi,

On Fri, 2012-12-07 at 22:30 +0100, Johannes Berg wrote:
>
> > Wireless folks, please take a look. The issue is that,
> > under the circumstances listed below, we get SKBs in
> > the AF_PACKET input path that are shared.
>
> Ok so I took a look, but I can't see where the wireless stack is going
> wrong.
>
> > Given the logic present in ieee80211_deliver_skb() I think
> > the mac80211 code doesn't expect this either.
>
> This is correct, but the driver should never give us a shared skb. From
> the other mail it seems Eric is using iwlwifi, which is definitely not
> creating shared SKBs. Nothing in mac80211 creates them either.
>
> > > I've observed this crash under the following condition:
> > > 1. a program is listening to an wifi interface (let say wlan0)
> > > 2. it is using fanout capture in flow load balancing mode
> > > 3. defrag option is on on the fanout socket
>
> How do you set this up, and what does it do? I'd like to try to
> reproduce this.



> > > 4. the interface disconnect (radio down for example)
> > > 5. the interface reconnect (radio switched up)
> > > 6. once reconnected a single packet is seen with skb->users=2
>
> That's interesting. A single one seems odd. I might have expected two,
> but not one. Well, since you removed the crash ... I guess I'll have to
> believe that there's just one and the second one doesn't show up because
> we crashed before :-)

It was the case with initial code but I've suppressed the BUG() call and
replaced it with a return ;)

> > So if we look at ieee80211_deliver_skb(), it has code to deal with unaligned
> > packet headers, wherein it memoves() the data into a better aligned location.
> >
> > But if these SKBs really are skb_shared(), this packet data
> > modification is illegal.
> >
> > I suspect that the assumptions built into this unaligned data handling
> > code, and AF_PACKET, are correct. Meaning that we should never see
> > skb_shared() packets here. We just have a missing skb_copy()
> > somewhere in mac80211, Johannes can you please take a look?
>
> My first theory was related to multiple virtual interfaces, but Eric
> didn't say he was running that, but we use skb_copy() for that in
> ieee80211_prepare_and_rx_handle(). That's not necessarily the most
> efficient (another reason for drivers to use paged RX here) but clearly
> not causing the issue.
>
> The only other theory I can come up with right now is that the skb_get()
> happens in deliver_skb via __netif_receive_skb. Keeping in mind that
> wpa_supplicant might have another packet socket open for authentication
> packets, that seems like a possibility. I'll test it once I figure out
> how to do this "defrag" option you speak of :)

I've no simple code available to test it. I've add the problem when
running suricata. Maybe you could use it. It is packaged in most
distribution now.
To enable packet fanout. Modify default /etc/suricata/suricata.yaml to
have something like:
af-packet:
- interface: wlan0
# Number of receive threads (>1 will enable experimental flow pinned
# runmode)
threads: 3

Start it with: suricata --af-packet=wlan0
Then get wlan0 interface down and up. After a few seconds, the crash
will occur.
It is a bit complicated for a simple test case. I can cook a little
example code if you want.

BR,
--
Eric Leblond <[email protected]>
Blog: https://home.regit.org/


2012-12-07 21:41:21

by Johannes Berg

[permalink] [raw]
Subject: Re: [RFC PATCH] af_packet: don't to defrag shared skb

On Fri, 2012-12-07 at 22:30 +0100, Johannes Berg wrote:

> The only other theory I can come up with right now is that the skb_get()
> happens in deliver_skb via __netif_receive_skb. Keeping in mind that
> wpa_supplicant might have another packet socket open for authentication
> packets, that seems like a possibility. I'll test it once I figure out
> how to do this "defrag" option you speak of :)

Hmm now I'm venturing into the unknown (for me) and realm of
speculation...

wpa_supplicant opens a packet socket for ETH_P_EAPOL, which indirectly
eventually calls dev_add_pack(). But if you do the same for another
socket, you'll get the same again, and then deliver_skb() will deliver
only a refcounted packet to the prot_hook->func().

This seems like it could very well cause the problem?

johannes


2012-12-07 21:56:10

by Johannes Berg

[permalink] [raw]
Subject: Re: [RFC PATCH] af_packet: don't to defrag shared skb

HI,

> > That's interesting. A single one seems odd. I might have expected two,
> > but not one. Well, since you removed the crash ... I guess I'll have to
> > believe that there's just one and the second one doesn't show up because
> > we crashed before :-)
>
> It was the case with initial code but I've suppressed the BUG() call and
> replaced it with a return ;)

Right. Well, does it actually still work then? I wonder if then you
don't see a second packet because the first one doesn't make it
through ... I'm thinking that this is just an internal af_packet problem
with multiple listeners.


> I've no simple code available to test it. I've add the problem when
> running suricata. Maybe you could use it. It is packaged in most
> distribution now.
> To enable packet fanout. Modify default /etc/suricata/suricata.yaml to
> have something like:
> af-packet:
> - interface: wlan0
> # Number of receive threads (>1 will enable experimental flow pinned
> # runmode)
> threads: 3

That'll should do, thanks.

johannes


2012-12-07 20:41:46

by Johannes Berg

[permalink] [raw]
Subject: Re: [RFC PATCH] af_packet: don't to defrag shared skb

On Fri, 2012-12-07 at 15:31 -0500, David Miller wrote:
> From: Eric Leblond <[email protected]>
> Date: Fri, 7 Dec 2012 19:56:01 +0100
>
> Wireless folks, please take a look. The issue is that,
> under the circumstances listed below, we get SKBs in
> the AF_PACKET input path that are shared.
>
> Given the logic present in ieee80211_deliver_skb() I think
> the mac80211 code doesn't expect this either.

Indeed, it would certainly not like this, I'll take a look.

Eric, what's the driver you're using? I'm wondering whether paged skbs
vs. all data in the header would make a difference, hence the question.

johannes