Subject: [PATCH v2] AF_PACKET doesnt strip VLAN information

When an application sends with AF_PACKET and places a vlan header on
the raw packet; then the AF_PACKET needs to move the tag into the skb
so that it gets processed normally through the rest of the transmit
path.

This is particularly a problem on Hyper-V where the host only allows
vlan in the offload info.

Cc: [email protected]
Cc: Sriram Krishnan <[email protected]>
Signed-off-by: Sriram Krishnan <[email protected]>
---
net/packet/af_packet.c | 26 ++++++++++++++++++++++++--
1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 29bd405adbbd..10988639561e 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1857,15 +1857,35 @@ static int packet_rcv_spkt(struct sk_buff *skb, struct net_device *dev,
return 0;
}

-static void packet_parse_headers(struct sk_buff *skb, struct socket *sock)
+static int packet_parse_headers(struct sk_buff *skb, struct socket *sock)
{
if ((!skb->protocol || skb->protocol == htons(ETH_P_ALL)) &&
sock->type == SOCK_RAW) {
+ __be16 ethertype;
+
skb_reset_mac_header(skb);
+
+ ethertype = eth_hdr(skb)->h_proto;
+ /*
+ * If Vlan tag is present in the packet
+ * move it to skb
+ */
+ if (eth_type_vlan(ethertype)) {
+ int err;
+ __be16 vlan_tci;
+
+ err = __skb_vlan_pop(skb, &vlan_tci);
+ if (unlikely(err))
+ return err;
+
+ __vlan_hwaccel_put_tag(skb, ethertype, vlan_tci);
+ }
+
skb->protocol = dev_parse_header_protocol(skb);
}

skb_probe_transport_header(skb);
+ return 0;
}

/*
@@ -1987,7 +2007,9 @@ static int packet_sendmsg_spkt(struct socket *sock, struct msghdr *msg,
if (unlikely(extra_len == 4))
skb->no_fcs = 1;

- packet_parse_headers(skb, sock);
+ err = packet_parse_headers(skb, sock);
+ if (err)
+ goto out_unlock;

dev_queue_xmit(skb);
rcu_read_unlock();
--
2.24.0


2020-07-18 15:59:29

by Willem de Bruijn

[permalink] [raw]
Subject: Re: [PATCH v2] AF_PACKET doesnt strip VLAN information

On Sat, Jul 18, 2020 at 5:25 AM Sriram Krishnan <[email protected]> wrote:
>
> When an application sends with AF_PACKET and places a vlan header on
> the raw packet; then the AF_PACKET needs to move the tag into the skb
> so that it gets processed normally through the rest of the transmit
> path.
>
> This is particularly a problem on Hyper-V where the host only allows
> vlan in the offload info.

Shouldn't this be resolved in that driver, then?

Packet sockets are not the only way to introduce packets in the
kernel. Tuntap would be another.

2020-07-19 05:46:51

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v2] AF_PACKET doesnt strip VLAN information

Hi Sriram,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on hnaz-linux-mm/master]
[also build test WARNING on sparc-next/master linus/master v5.8-rc5 next-20200717]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Sriram-Krishnan/AF_PACKET-doesnt-strip-VLAN-information/20200718-172707
base: https://github.com/hnaz/linux-mm master
config: s390-randconfig-s031-20200719 (attached as .config)
compiler: s390-linux-gcc (GCC) 9.3.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.2-49-g707c5017-dirty
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=s390

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>


sparse warnings: (new ones prefixed by >>)

>> net/packet/af_packet.c:1877:52: sparse: sparse: incorrect type in argument 2 (different base types) @@ expected unsigned short [usertype] *vlan_tci @@ got restricted __be16 * @@
>> net/packet/af_packet.c:1877:52: sparse: expected unsigned short [usertype] *vlan_tci
>> net/packet/af_packet.c:1877:52: sparse: got restricted __be16 *
>> net/packet/af_packet.c:1881:64: sparse: sparse: incorrect type in argument 3 (different base types) @@ expected unsigned short [usertype] vlan_tci @@ got restricted __be16 [addressable] [usertype] vlan_tci @@
net/packet/af_packet.c:1881:64: sparse: expected unsigned short [usertype] vlan_tci
>> net/packet/af_packet.c:1881:64: sparse: got restricted __be16 [addressable] [usertype] vlan_tci

vim +1877 net/packet/af_packet.c

1859
1860 static int packet_parse_headers(struct sk_buff *skb, struct socket *sock)
1861 {
1862 if ((!skb->protocol || skb->protocol == htons(ETH_P_ALL)) &&
1863 sock->type == SOCK_RAW) {
1864 __be16 ethertype;
1865
1866 skb_reset_mac_header(skb);
1867
1868 ethertype = eth_hdr(skb)->h_proto;
1869 /*
1870 * If Vlan tag is present in the packet
1871 * move it to skb
1872 */
1873 if (eth_type_vlan(ethertype)) {
1874 int err;
1875 __be16 vlan_tci;
1876
> 1877 err = __skb_vlan_pop(skb, &vlan_tci);
1878 if (unlikely(err))
1879 return err;
1880
> 1881 __vlan_hwaccel_put_tag(skb, ethertype, vlan_tci);
1882 }
1883
1884 skb->protocol = dev_parse_header_protocol(skb);
1885 }
1886
1887 skb_probe_transport_header(skb);
1888 return 0;
1889 }
1890

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (3.01 kB)
.config.gz (34.62 kB)
Download all attachments

2020-07-20 13:56:08

by Willem de Bruijn

[permalink] [raw]
Subject: Re: [PATCH v2] AF_PACKET doesnt strip VLAN information

On Mon, Jul 20, 2020 at 12:27 AM Sriram Krishnan (srirakr2)
<[email protected]> wrote:
>
> +Stephen Hemminger
>
> Hi Willem,
> Thanks for looking into the code, I understand that this is more of a generic problem wherein many of the filtering functions assume the vlan tag to be in the skb rather than in the packet. Hence we moved the fix from the driver to the common AF packet that our solution uses.
>
> I recall from the v1 of the patch you had mentioned other common areas where this fix might be relevant (such as tap/tun), but I'm afraid I cant comprehensively test those patches out. Please let me know your thoughts

Please use plain text to respond. HTML replies do not reach the list.

Can you be more precise in which other code besides the hyper-v driver
is affected? Do you have an example?

This is a resubmit of the original patch. My previous
questions/concerns remain valid:

- if the function can now fail, all callers must be updated to detect
and handle that

- any solution should probably address all inputs into the tx path:
packet sockets, tuntap, virtio-net

- this only addresses packet sockets with ETH_P_ALL/ETH_P_NONE. Not
sockets that set ETH_P_8021Q

- which code in the transmit stack requires the tag to be in the skb,
and does this problem after this patch still persist for Q-in-Q?

Subject: Re: [PATCH v2] AF_PACKET doesnt strip VLAN information

I have moved the code to the driver and pushed a new patch due to the below highlighted issues.

Stephen H,
Please let me know if you have any concerns localising the changes to the netvsc driver.


Thanks,
Sriram

On 20/07/20, 7:23 PM, "Willem de Bruijn" <[email protected]> wrote:

On Mon, Jul 20, 2020 at 12:27 AM Sriram Krishnan (srirakr2)
<[email protected]> wrote:
>
> +Stephen Hemminger
>
> Hi Willem,
> Thanks for looking into the code, I understand that this is more of a generic problem wherein many of the filtering functions assume the vlan tag to be in the skb rather than in the packet. Hence we moved the fix from the driver to the common AF packet that our solution uses.
>
> I recall from the v1 of the patch you had mentioned other common areas where this fix might be relevant (such as tap/tun), but I'm afraid I cant comprehensively test those patches out. Please let me know your thoughts

Please use plain text to respond. HTML replies do not reach the list.

Can you be more precise in which other code besides the hyper-v driver
is affected? Do you have an example?

This is a resubmit of the original patch. My previous
questions/concerns remain valid:

- if the function can now fail, all callers must be updated to detect
and handle that

- any solution should probably address all inputs into the tx path:
packet sockets, tuntap, virtio-net

- this only addresses packet sockets with ETH_P_ALL/ETH_P_NONE. Not
sockets that set ETH_P_8021Q

- which code in the transmit stack requires the tag to be in the skb,
and does this problem after this patch still persist for Q-in-Q?

2020-07-20 20:57:29

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [PATCH v2] AF_PACKET doesnt strip VLAN information

On Mon, 20 Jul 2020 09:52:27 -0400
Willem de Bruijn <[email protected]> wrote:

> On Mon, Jul 20, 2020 at 12:27 AM Sriram Krishnan (srirakr2)
> <[email protected]> wrote:
> >
> > +Stephen Hemminger
> >
> > Hi Willem,
> > Thanks for looking into the code, I understand that this is more of a generic problem wherein many of the filtering functions assume the vlan tag to be in the skb rather than in the packet. Hence we moved the fix from the driver to the common AF packet that our solution uses.
> >
> > I recall from the v1 of the patch you had mentioned other common areas where this fix might be relevant (such as tap/tun), but I'm afraid I cant comprehensively test those patches out. Please let me know your thoughts
>
> Please use plain text to respond. HTML replies do not reach the list.
>
> Can you be more precise in which other code besides the hyper-v driver
> is affected? Do you have an example?
>
> This is a resubmit of the original patch. My previous
> questions/concerns remain valid:
>
> - if the function can now fail, all callers must be updated to detect
> and handle that
>
> - any solution should probably address all inputs into the tx path:
> packet sockets, tuntap, virtio-net
>
> - this only addresses packet sockets with ETH_P_ALL/ETH_P_NONE. Not
> sockets that set ETH_P_8021Q
>
> - which code in the transmit stack requires the tag to be in the skb,
> and does this problem after this patch still persist for Q-in-Q?

It matters because the problem is generic, not just to the netvsc driver.
For example, BPF programs and netfilter rules will see different packets
when send is through AF_PACKET than they would see for sends from the
kernel stack.

Presenting uniform data to the lower layers makes sense.

2020-07-20 21:25:05

by Willem de Bruijn

[permalink] [raw]
Subject: Re: [PATCH v2] AF_PACKET doesnt strip VLAN information

On Mon, Jul 20, 2020 at 4:57 PM Stephen Hemminger
<[email protected]> wrote:
>
> On Mon, 20 Jul 2020 09:52:27 -0400
> Willem de Bruijn <[email protected]> wrote:
>
> > On Mon, Jul 20, 2020 at 12:27 AM Sriram Krishnan (srirakr2)
> > <[email protected]> wrote:
> > >
> > > +Stephen Hemminger
> > >
> > > Hi Willem,
> > > Thanks for looking into the code, I understand that this is more of a generic problem wherein many of the filtering functions assume the vlan tag to be in the skb rather than in the packet. Hence we moved the fix from the driver to the common AF packet that our solution uses.
> > >
> > > I recall from the v1 of the patch you had mentioned other common areas where this fix might be relevant (such as tap/tun), but I'm afraid I cant comprehensively test those patches out. Please let me know your thoughts
> >
> > Please use plain text to respond. HTML replies do not reach the list.
> >
> > Can you be more precise in which other code besides the hyper-v driver
> > is affected? Do you have an example?
> >
> > This is a resubmit of the original patch. My previous
> > questions/concerns remain valid:
> >
> > - if the function can now fail, all callers must be updated to detect
> > and handle that
> >
> > - any solution should probably address all inputs into the tx path:
> > packet sockets, tuntap, virtio-net
> >
> > - this only addresses packet sockets with ETH_P_ALL/ETH_P_NONE. Not
> > sockets that set ETH_P_8021Q
> >
> > - which code in the transmit stack requires the tag to be in the skb,
> > and does this problem after this patch still persist for Q-in-Q?
>
> It matters because the problem is generic, not just to the netvsc driver.
> For example, BPF programs and netfilter rules will see different packets
> when send is through AF_PACKET than they would see for sends from the
> kernel stack.
>
> Presenting uniform data to the lower layers makes sense.

Are all forwarded and locally generated packets guaranteed to always
have VLAN information in the tag (so that this is indeed only an issue
with input from userspace, through tuntap, virtio and packet sockets)?

I guess the first might be assured due to this in __netif_receive_skb_core:

if (skb->protocol == cpu_to_be16(ETH_P_8021Q) ||
skb->protocol == cpu_to_be16(ETH_P_8021AD)) {
skb = skb_vlan_untag(skb);
if (unlikely(!skb))
goto out;
}

and the second by this in vlan_dev_hard_start_xmit:

if (veth->h_vlan_proto != vlan->vlan_proto ||
vlan->flags & VLAN_FLAG_REORDER_HDR) {
u16 vlan_tci;
vlan_tci = vlan->vlan_id;
vlan_tci |= vlan_dev_get_egress_qos_mask(dev, skb->priority);
__vlan_hwaccel_put_tag(skb, vlan->vlan_proto, vlan_tci);
}

But I don't know this code very well, so that is based on a very
cursory glance only. Might well be missing other paths. (update: I
think pktgen is another example.)

Netfilter and BPF still need to handle tags in the packet for Q-in-Q,
right? So does this actually simplify their logic?

If the above holds and Q-in-Q is not a problem, then doing the same on
ingress from userspace may make sense. I don't know the kind of BPF
or netfilter programs what would be affected, and how.

Then it would be good to all those inputs at once to really plug the hole.
See also virtio_net_hdr_to_skb for another example of code that
applies to all of tuntap, virtio, pf_packet and uml.

2020-07-20 22:05:10

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [PATCH v2] AF_PACKET doesnt strip VLAN information

On Mon, 20 Jul 2020 17:22:49 -0400
Willem de Bruijn <[email protected]> wrote:

> On Mon, Jul 20, 2020 at 4:57 PM Stephen Hemminger
> <[email protected]> wrote:
> >
> > On Mon, 20 Jul 2020 09:52:27 -0400
> > Willem de Bruijn <[email protected]> wrote:
> >
> > > On Mon, Jul 20, 2020 at 12:27 AM Sriram Krishnan (srirakr2)
> > > <[email protected]> wrote:
> > > >
> > > > +Stephen Hemminger
> > > >
> > > > Hi Willem,
> > > > Thanks for looking into the code, I understand that this is more of a generic problem wherein many of the filtering functions assume the vlan tag to be in the skb rather than in the packet. Hence we moved the fix from the driver to the common AF packet that our solution uses.
> > > >
> > > > I recall from the v1 of the patch you had mentioned other common areas where this fix might be relevant (such as tap/tun), but I'm afraid I cant comprehensively test those patches out. Please let me know your thoughts
> > >
> > > Please use plain text to respond. HTML replies do not reach the list.
> > >
> > > Can you be more precise in which other code besides the hyper-v driver
> > > is affected? Do you have an example?
> > >
> > > This is a resubmit of the original patch. My previous
> > > questions/concerns remain valid:
> > >
> > > - if the function can now fail, all callers must be updated to detect
> > > and handle that
> > >
> > > - any solution should probably address all inputs into the tx path:
> > > packet sockets, tuntap, virtio-net
> > >
> > > - this only addresses packet sockets with ETH_P_ALL/ETH_P_NONE. Not
> > > sockets that set ETH_P_8021Q
> > >
> > > - which code in the transmit stack requires the tag to be in the skb,
> > > and does this problem after this patch still persist for Q-in-Q?
> >
> > It matters because the problem is generic, not just to the netvsc driver.
> > For example, BPF programs and netfilter rules will see different packets
> > when send is through AF_PACKET than they would see for sends from the
> > kernel stack.
> >
> > Presenting uniform data to the lower layers makes sense.
>
> Are all forwarded and locally generated packets guaranteed to always
> have VLAN information in the tag (so that this is indeed only an issue
> with input from userspace, through tuntap, virtio and packet sockets)?
>
> I guess the first might be assured due to this in __netif_receive_skb_core:
>
> if (skb->protocol == cpu_to_be16(ETH_P_8021Q) ||
> skb->protocol == cpu_to_be16(ETH_P_8021AD)) {
> skb = skb_vlan_untag(skb);
> if (unlikely(!skb))
> goto out;
> }
>
> and the second by this in vlan_dev_hard_start_xmit:
>
> if (veth->h_vlan_proto != vlan->vlan_proto ||
> vlan->flags & VLAN_FLAG_REORDER_HDR) {
> u16 vlan_tci;
> vlan_tci = vlan->vlan_id;
> vlan_tci |= vlan_dev_get_egress_qos_mask(dev, skb->priority);
> __vlan_hwaccel_put_tag(skb, vlan->vlan_proto, vlan_tci);
> }
>
> But I don't know this code very well, so that is based on a very
> cursory glance only. Might well be missing other paths. (update: I
> think pktgen is another example.)
>
> Netfilter and BPF still need to handle tags in the packet for Q-in-Q,
> right? So does this actually simplify their logic?
>
> If the above holds and Q-in-Q is not a problem, then doing the same on
> ingress from userspace may make sense. I don't know the kind of BPF
> or netfilter programs what would be affected, and how.
>
> Then it would be good to all those inputs at once to really plug the hole.
> See also virtio_net_hdr_to_skb for another example of code that
> applies to all of tuntap, virtio, pf_packet and uml.


Older versions of Linux used to handle outer VLAN differentl
based on what the driver supported. It was a mess.
Some drivers and code paths would strip and put in meta-data, some
would leave it in skb data. But in recent (like 5 yrs), the kernel
has tried to be more uniform and only have vlan as skb tag.
It looks like AF_PACKET was overlooked at that time.

2020-07-20 23:38:01

by David Miller

[permalink] [raw]
Subject: Re: [PATCH v2] AF_PACKET doesnt strip VLAN information

From: Stephen Hemminger <[email protected]>
Date: Mon, 20 Jul 2020 13:56:50 -0700

> It matters because the problem is generic, not just to the netvsc driver.
> For example, BPF programs and netfilter rules will see different packets
> when send is through AF_PACKET than they would see for sends from the
> kernel stack.
>
> Presenting uniform data to the lower layers makes sense.

The issue here is that for hyperv, vlan offloading is not a "may" but
a "must" and I've never understood it to have that meaning.

And I still haven't heard what is going to happen in Q-in-Q situations
even with the ugly hyperv driver hack that is currently under review.