2020-04-28 03:34:58

by Nathan Chancellor

[permalink] [raw]
Subject: [PATCH] hv_netvsc: Fix netvsc_start_xmit's return type

netvsc_start_xmit is used as a callback function for the ndo_start_xmit
function pointer. ndo_start_xmit's return type is netdev_tx_t but
netvsc_start_xmit's return type is int.

This causes a failure with Control Flow Integrity (CFI), which requires
function pointer prototypes and callback function definitions to match
exactly. When CFI is in enforcing, the kernel panics. When booting a
CFI kernel with WSL 2, the VM is immediately terminated because of this:

$ wsl.exe -d ubuntu
The Windows Subsystem for Linux instance has terminated.

Avoid this by using the right return type for netvsc_start_xmit.

Fixes: fceaf24a943d8 ("Staging: hv: add the Hyper-V virtual network driver")
Link: https://github.com/ClangBuiltLinux/linux/issues/1009
Signed-off-by: Nathan Chancellor <[email protected]>
---

Do note that netvsc_xmit still returns int because netvsc_xmit has a
potential return from netvsc_vf_xmit, which does not return netdev_tx_t
because of the call to dev_queue_xmit.

I am not sure if that is an oversight that was introduced by
commit 0c195567a8f6e ("netvsc: transparent VF management") or if
everything works properly as it is now.

My patch is purely concerned with making the definition match the
prototype so it should be NFC aside from avoiding the CFI panic.

drivers/net/hyperv/netvsc_drv.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index d8e86bdbfba1e..ebcfbae056900 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -707,7 +707,8 @@ static int netvsc_xmit(struct sk_buff *skb, struct net_device *net, bool xdp_tx)
goto drop;
}

-static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *ndev)
+static netdev_tx_t netvsc_start_xmit(struct sk_buff *skb,
+ struct net_device *ndev)
{
return netvsc_xmit(skb, ndev, false);
}

base-commit: 51184ae37e0518fd90cb437a2fbc953ae558cd0d
--
2.26.2


2020-04-28 10:10:31

by Wei Liu

[permalink] [raw]
Subject: Re: [PATCH] hv_netvsc: Fix netvsc_start_xmit's return type

On Mon, Apr 27, 2020 at 08:30:43PM -0700, Nathan Chancellor wrote:
> netvsc_start_xmit is used as a callback function for the ndo_start_xmit
> function pointer. ndo_start_xmit's return type is netdev_tx_t but
> netvsc_start_xmit's return type is int.
>
> This causes a failure with Control Flow Integrity (CFI), which requires
> function pointer prototypes and callback function definitions to match
> exactly. When CFI is in enforcing, the kernel panics. When booting a
> CFI kernel with WSL 2, the VM is immediately terminated because of this:
>
> $ wsl.exe -d ubuntu
> The Windows Subsystem for Linux instance has terminated.
>
> Avoid this by using the right return type for netvsc_start_xmit.
>
> Fixes: fceaf24a943d8 ("Staging: hv: add the Hyper-V virtual network driver")
> Link: https://github.com/ClangBuiltLinux/linux/issues/1009

Please consider pulling in the panic log from #1009 to the commit
message. It is much better than the one line message above.

> Signed-off-by: Nathan Chancellor <[email protected]>
> ---
>
> Do note that netvsc_xmit still returns int because netvsc_xmit has a
> potential return from netvsc_vf_xmit, which does not return netdev_tx_t
> because of the call to dev_queue_xmit.
>
> I am not sure if that is an oversight that was introduced by
> commit 0c195567a8f6e ("netvsc: transparent VF management") or if
> everything works properly as it is now.
>
> My patch is purely concerned with making the definition match the
> prototype so it should be NFC aside from avoiding the CFI panic.
>
> drivers/net/hyperv/netvsc_drv.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index d8e86bdbfba1e..ebcfbae056900 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -707,7 +707,8 @@ static int netvsc_xmit(struct sk_buff *skb, struct net_device *net, bool xdp_tx)
> goto drop;
> }
>
> -static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *ndev)
> +static netdev_tx_t netvsc_start_xmit(struct sk_buff *skb,
> + struct net_device *ndev)
> {
> return netvsc_xmit(skb, ndev, false);
> }
>
> base-commit: 51184ae37e0518fd90cb437a2fbc953ae558cd0d
> --
> 2.26.2
>

2020-04-28 17:59:47

by Nathan Chancellor

[permalink] [raw]
Subject: [PATCH v2] hv_netvsc: Fix netvsc_start_xmit's return type

netvsc_start_xmit is used as a callback function for the ndo_start_xmit
function pointer. ndo_start_xmit's return type is netdev_tx_t but
netvsc_start_xmit's return type is int.

This causes a failure with Control Flow Integrity (CFI), which requires
function pointer prototypes and callback function definitions to match
exactly. When CFI is in enforcing, the kernel panics. When booting a
CFI kernel with WSL 2, the VM is immediately terminated because of this.

The splat when CONFIG_CFI_PERMISSIVE is used:

[ 5.916765] CFI failure (target: netvsc_start_xmit+0x0/0x10):
[ 5.916771] WARNING: CPU: 8 PID: 0 at kernel/cfi.c:29 __cfi_check_fail+0x2e/0x40
[ 5.916772] Modules linked in:
[ 5.916774] CPU: 8 PID: 0 Comm: swapper/8 Not tainted 5.7.0-rc3-next-20200424-microsoft-cbl-00001-ged4eb37d2c69-dirty #1
[ 5.916776] RIP: 0010:__cfi_check_fail+0x2e/0x40
[ 5.916777] Code: 48 c7 c7 70 98 63 a9 48 c7 c6 11 db 47 a9 e8 69 55 59 00 85 c0 75 02 5b c3 48 c7 c7 73 c6 43 a9 48 89 de 31 c0 e8 12 2d f0 ff <0f> 0b 5b c3 00 00 cc cc 00 00 cc cc 00 00 cc cc 00 00 85 f6 74 25
[ 5.916778] RSP: 0018:ffffa803c0260b78 EFLAGS: 00010246
[ 5.916779] RAX: 712a1af25779e900 RBX: ffffffffa8cf7950 RCX: ffffffffa962cf08
[ 5.916779] RDX: ffffffffa9c36b60 RSI: 0000000000000082 RDI: ffffffffa9c36b5c
[ 5.916780] RBP: ffff8ffc4779c2c0 R08: 0000000000000001 R09: ffffffffa9c3c300
[ 5.916781] R10: 0000000000000151 R11: ffffffffa9c36b60 R12: ffff8ffe39084000
[ 5.916782] R13: ffffffffa8cf7950 R14: ffffffffa8d12cb0 R15: ffff8ffe39320140
[ 5.916784] FS: 0000000000000000(0000) GS:ffff8ffe3bc00000(0000) knlGS:0000000000000000
[ 5.916785] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5.916786] CR2: 00007ffef5749408 CR3: 00000002f4f5e000 CR4: 0000000000340ea0
[ 5.916787] Call Trace:
[ 5.916788] <IRQ>
[ 5.916790] __cfi_check+0x3ab58/0x450e0
[ 5.916793] ? dev_hard_start_xmit+0x11f/0x160
[ 5.916795] ? sch_direct_xmit+0xf2/0x230
[ 5.916796] ? __dev_queue_xmit.llvm.11471227737707190958+0x69d/0x8e0
[ 5.916797] ? neigh_resolve_output+0xdf/0x220
[ 5.916799] ? neigh_connected_output.cfi_jt+0x8/0x8
[ 5.916801] ? ip6_finish_output2+0x398/0x4c0
[ 5.916803] ? nf_nat_ipv6_out+0x10/0xa0
[ 5.916804] ? nf_hook_slow+0x84/0x100
[ 5.916807] ? ip6_input_finish+0x8/0x8
[ 5.916807] ? ip6_output+0x6f/0x110
[ 5.916808] ? __ip6_local_out.cfi_jt+0x8/0x8
[ 5.916810] ? mld_sendpack+0x28e/0x330
[ 5.916811] ? ip_rt_bug+0x8/0x8
[ 5.916813] ? mld_ifc_timer_expire+0x2db/0x400
[ 5.916814] ? neigh_proxy_process+0x8/0x8
[ 5.916816] ? call_timer_fn+0x3d/0xd0
[ 5.916817] ? __run_timers+0x2a9/0x300
[ 5.916819] ? rcu_core_si+0x8/0x8
[ 5.916820] ? run_timer_softirq+0x14/0x30
[ 5.916821] ? __do_softirq+0x154/0x262
[ 5.916822] ? native_x2apic_icr_write+0x8/0x8
[ 5.916824] ? irq_exit+0xba/0xc0
[ 5.916825] ? hv_stimer0_vector_handler+0x99/0xe0
[ 5.916826] ? hv_stimer0_callback_vector+0xf/0x20
[ 5.916826] </IRQ>
[ 5.916828] ? hv_stimer_global_cleanup.cfi_jt+0x8/0x8
[ 5.916829] ? raw_setsockopt+0x8/0x8
[ 5.916830] ? default_idle+0xe/0x10
[ 5.916832] ? do_idle.llvm.10446269078108580492+0xb7/0x130
[ 5.916833] ? raw_setsockopt+0x8/0x8
[ 5.916833] ? cpu_startup_entry+0x15/0x20
[ 5.916835] ? cpu_hotplug_enable.cfi_jt+0x8/0x8
[ 5.916836] ? start_secondary+0x188/0x190
[ 5.916837] ? secondary_startup_64+0xa5/0xb0
[ 5.916838] ---[ end trace f2683fa869597ba5 ]---

Avoid this by using the right return type for netvsc_start_xmit.

Fixes: fceaf24a943d8 ("Staging: hv: add the Hyper-V virtual network driver")
Link: https://github.com/ClangBuiltLinux/linux/issues/1009
Signed-off-by: Nathan Chancellor <[email protected]>
---

v1 -> v2:

* Move splat into commit message rather than issue.

Comment from previous version:

Do note that netvsc_xmit still returns int because netvsc_xmit has a
potential return from netvsc_vf_xmit, which does not return netdev_tx_t
because of the call to dev_queue_xmit.

I am not sure if that is an oversight that was introduced by
commit 0c195567a8f6e ("netvsc: transparent VF management") or if
everything works properly as it is now.

My patch is purely concerned with making the definition match the
prototype so it should be NFC aside from avoiding the CFI panic.

drivers/net/hyperv/netvsc_drv.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index d8e86bdbfba1e..ebcfbae056900 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -707,7 +707,8 @@ static int netvsc_xmit(struct sk_buff *skb, struct net_device *net, bool xdp_tx)
goto drop;
}

-static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *ndev)
+static netdev_tx_t netvsc_start_xmit(struct sk_buff *skb,
+ struct net_device *ndev)
{
return netvsc_xmit(skb, ndev, false);
}

base-commit: 51184ae37e0518fd90cb437a2fbc953ae558cd0d
--
2.26.2

2020-04-29 10:14:59

by Wei Liu

[permalink] [raw]
Subject: Re: [PATCH v2] hv_netvsc: Fix netvsc_start_xmit's return type

David

Do you want this to go through net tree? I can submit it via hyperv tree
if that's preferred.

Wei.

On Tue, Apr 28, 2020 at 10:54:56AM -0700, Nathan Chancellor wrote:
> netvsc_start_xmit is used as a callback function for the ndo_start_xmit
> function pointer. ndo_start_xmit's return type is netdev_tx_t but
> netvsc_start_xmit's return type is int.
>
> This causes a failure with Control Flow Integrity (CFI), which requires
> function pointer prototypes and callback function definitions to match
> exactly. When CFI is in enforcing, the kernel panics. When booting a
> CFI kernel with WSL 2, the VM is immediately terminated because of this.
>
> The splat when CONFIG_CFI_PERMISSIVE is used:
>
> [ 5.916765] CFI failure (target: netvsc_start_xmit+0x0/0x10):
> [ 5.916771] WARNING: CPU: 8 PID: 0 at kernel/cfi.c:29 __cfi_check_fail+0x2e/0x40
> [ 5.916772] Modules linked in:
> [ 5.916774] CPU: 8 PID: 0 Comm: swapper/8 Not tainted 5.7.0-rc3-next-20200424-microsoft-cbl-00001-ged4eb37d2c69-dirty #1
> [ 5.916776] RIP: 0010:__cfi_check_fail+0x2e/0x40
> [ 5.916777] Code: 48 c7 c7 70 98 63 a9 48 c7 c6 11 db 47 a9 e8 69 55 59 00 85 c0 75 02 5b c3 48 c7 c7 73 c6 43 a9 48 89 de 31 c0 e8 12 2d f0 ff <0f> 0b 5b c3 00 00 cc cc 00 00 cc cc 00 00 cc cc 00 00 85 f6 74 25
> [ 5.916778] RSP: 0018:ffffa803c0260b78 EFLAGS: 00010246
> [ 5.916779] RAX: 712a1af25779e900 RBX: ffffffffa8cf7950 RCX: ffffffffa962cf08
> [ 5.916779] RDX: ffffffffa9c36b60 RSI: 0000000000000082 RDI: ffffffffa9c36b5c
> [ 5.916780] RBP: ffff8ffc4779c2c0 R08: 0000000000000001 R09: ffffffffa9c3c300
> [ 5.916781] R10: 0000000000000151 R11: ffffffffa9c36b60 R12: ffff8ffe39084000
> [ 5.916782] R13: ffffffffa8cf7950 R14: ffffffffa8d12cb0 R15: ffff8ffe39320140
> [ 5.916784] FS: 0000000000000000(0000) GS:ffff8ffe3bc00000(0000) knlGS:0000000000000000
> [ 5.916785] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 5.916786] CR2: 00007ffef5749408 CR3: 00000002f4f5e000 CR4: 0000000000340ea0
> [ 5.916787] Call Trace:
> [ 5.916788] <IRQ>
> [ 5.916790] __cfi_check+0x3ab58/0x450e0
> [ 5.916793] ? dev_hard_start_xmit+0x11f/0x160
> [ 5.916795] ? sch_direct_xmit+0xf2/0x230
> [ 5.916796] ? __dev_queue_xmit.llvm.11471227737707190958+0x69d/0x8e0
> [ 5.916797] ? neigh_resolve_output+0xdf/0x220
> [ 5.916799] ? neigh_connected_output.cfi_jt+0x8/0x8
> [ 5.916801] ? ip6_finish_output2+0x398/0x4c0
> [ 5.916803] ? nf_nat_ipv6_out+0x10/0xa0
> [ 5.916804] ? nf_hook_slow+0x84/0x100
> [ 5.916807] ? ip6_input_finish+0x8/0x8
> [ 5.916807] ? ip6_output+0x6f/0x110
> [ 5.916808] ? __ip6_local_out.cfi_jt+0x8/0x8
> [ 5.916810] ? mld_sendpack+0x28e/0x330
> [ 5.916811] ? ip_rt_bug+0x8/0x8
> [ 5.916813] ? mld_ifc_timer_expire+0x2db/0x400
> [ 5.916814] ? neigh_proxy_process+0x8/0x8
> [ 5.916816] ? call_timer_fn+0x3d/0xd0
> [ 5.916817] ? __run_timers+0x2a9/0x300
> [ 5.916819] ? rcu_core_si+0x8/0x8
> [ 5.916820] ? run_timer_softirq+0x14/0x30
> [ 5.916821] ? __do_softirq+0x154/0x262
> [ 5.916822] ? native_x2apic_icr_write+0x8/0x8
> [ 5.916824] ? irq_exit+0xba/0xc0
> [ 5.916825] ? hv_stimer0_vector_handler+0x99/0xe0
> [ 5.916826] ? hv_stimer0_callback_vector+0xf/0x20
> [ 5.916826] </IRQ>
> [ 5.916828] ? hv_stimer_global_cleanup.cfi_jt+0x8/0x8
> [ 5.916829] ? raw_setsockopt+0x8/0x8
> [ 5.916830] ? default_idle+0xe/0x10
> [ 5.916832] ? do_idle.llvm.10446269078108580492+0xb7/0x130
> [ 5.916833] ? raw_setsockopt+0x8/0x8
> [ 5.916833] ? cpu_startup_entry+0x15/0x20
> [ 5.916835] ? cpu_hotplug_enable.cfi_jt+0x8/0x8
> [ 5.916836] ? start_secondary+0x188/0x190
> [ 5.916837] ? secondary_startup_64+0xa5/0xb0
> [ 5.916838] ---[ end trace f2683fa869597ba5 ]---
>
> Avoid this by using the right return type for netvsc_start_xmit.
>
> Fixes: fceaf24a943d8 ("Staging: hv: add the Hyper-V virtual network driver")
> Link: https://github.com/ClangBuiltLinux/linux/issues/1009
> Signed-off-by: Nathan Chancellor <[email protected]>
> ---
>
> v1 -> v2:
>
> * Move splat into commit message rather than issue.
>
> Comment from previous version:
>
> Do note that netvsc_xmit still returns int because netvsc_xmit has a
> potential return from netvsc_vf_xmit, which does not return netdev_tx_t
> because of the call to dev_queue_xmit.
>
> I am not sure if that is an oversight that was introduced by
> commit 0c195567a8f6e ("netvsc: transparent VF management") or if
> everything works properly as it is now.
>
> My patch is purely concerned with making the definition match the
> prototype so it should be NFC aside from avoiding the CFI panic.
>
> drivers/net/hyperv/netvsc_drv.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
> index d8e86bdbfba1e..ebcfbae056900 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -707,7 +707,8 @@ static int netvsc_xmit(struct sk_buff *skb, struct net_device *net, bool xdp_tx)
> goto drop;
> }
>
> -static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *ndev)
> +static netdev_tx_t netvsc_start_xmit(struct sk_buff *skb,
> + struct net_device *ndev)
> {
> return netvsc_xmit(skb, ndev, false);
> }
>
> base-commit: 51184ae37e0518fd90cb437a2fbc953ae558cd0d
> --
> 2.26.2
>

2020-04-29 18:13:48

by David Miller

[permalink] [raw]
Subject: Re: [PATCH v2] hv_netvsc: Fix netvsc_start_xmit's return type

From: Wei Liu <[email protected]>
Date: Wed, 29 Apr 2020 11:10:55 +0100

> Do you want this to go through net tree? I can submit it via hyperv tree
> if that's preferred.

I'll be taking this, thanks.

2020-04-30 00:07:59

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH v2] hv_netvsc: Fix netvsc_start_xmit's return type

From: Nathan Chancellor <[email protected]> Sent: Tuesday, April 28, 2020 10:55 AM
>
> Do note that netvsc_xmit still returns int because netvsc_xmit has a
> potential return from netvsc_vf_xmit, which does not return netdev_tx_t
> because of the call to dev_queue_xmit.
>
> I am not sure if that is an oversight that was introduced by
> commit 0c195567a8f6e ("netvsc: transparent VF management") or if
> everything works properly as it is now.
>
> My patch is purely concerned with making the definition match the
> prototype so it should be NFC aside from avoiding the CFI panic.
>

While it probably works correctly now, I'm not too keen on just
changing the return type for netvsc_start_xmit() and assuming the
'int' that is returned from netvsc_xmit() will be correctly mapped to
the netdev_tx_t enum type. While that mapping probably happens
correctly at the moment, this really should have a more holistic fix.

Nathan -- are you willing to look at doing the more holistic fix? Or
should we see about asking Haiyang Zhang to do it?

Michael

2020-04-30 06:06:07

by Nathan Chancellor

[permalink] [raw]
Subject: Re: [PATCH v2] hv_netvsc: Fix netvsc_start_xmit's return type

Hi Michael,

On Thu, Apr 30, 2020 at 12:06:09AM +0000, Michael Kelley wrote:
> From: Nathan Chancellor <[email protected]> Sent: Tuesday, April 28, 2020 10:55 AM
> >
> > Do note that netvsc_xmit still returns int because netvsc_xmit has a
> > potential return from netvsc_vf_xmit, which does not return netdev_tx_t
> > because of the call to dev_queue_xmit.
> >
> > I am not sure if that is an oversight that was introduced by
> > commit 0c195567a8f6e ("netvsc: transparent VF management") or if
> > everything works properly as it is now.
> >
> > My patch is purely concerned with making the definition match the
> > prototype so it should be NFC aside from avoiding the CFI panic.
> >
>
> While it probably works correctly now, I'm not too keen on just
> changing the return type for netvsc_start_xmit() and assuming the
> 'int' that is returned from netvsc_xmit() will be correctly mapped to
> the netdev_tx_t enum type. While that mapping probably happens
> correctly at the moment, this really should have a more holistic fix.

While it might work correctly, I am not sure that the mapping is
correct, hence that comment.

netdev_tx_t is an enum with two acceptable types, 0x00 and 0x10. Up
until commit 0c195567a8f6e ("netvsc: transparent VF management"),
netvsc_xmit was guaranteed to return something of type netdev_tx_t.

However, after said commit, we could return anything from
netvsc_vf_xmit, which in turn calls dev_queue_xmit then
__dev_queue_xmit which will return either an error code (-ENOMEM or
-ENETDOWN) or something from __dev_xmit_skb, which appears to be
NET_XMIT_SUCCESS, NET_XMIT_DROP, or NET_XMIT_CN.

It does not look like netvsc_xmit or netvsc_vf_xmit try to convert those
returns to netdev_tx_t in some way; netvsc_vf_xmit just passes the
return value up to netvsc_xmit, which is the part that I am unsure
about...

> Nathan -- are you willing to look at doing the more holistic fix? Or
> should we see about asking Haiyang Zhang to do it?

I would be fine trying to look at a more holistic fix but I know
basically nothing about this subsystem. I am unsure if something like
this would be acceptable or if something else needs to happen.
Otherwise, I'd be fine with you guys taking a look and just giving me
reported-by credit.

Cheers,
Nathan

diff --git a/drivers/net/hyperv/netvsc_drv.c b/drivers/net/hyperv/netvsc_drv.c
index d8e86bdbfba1e..a39480cfb8fa7 100644
--- a/drivers/net/hyperv/netvsc_drv.c
+++ b/drivers/net/hyperv/netvsc_drv.c
@@ -520,7 +520,8 @@ static int netvsc_vf_xmit(struct net_device *net, struct net_device *vf_netdev,
return rc;
}

-static int netvsc_xmit(struct sk_buff *skb, struct net_device *net, bool xdp_tx)
+static netdev_tx_t netvsc_xmit(struct sk_buff *skb, struct net_device *net,
+ bool xdp_tx)
{
struct net_device_context *net_device_ctx = netdev_priv(net);
struct hv_netvsc_packet *packet = NULL;
@@ -537,8 +538,11 @@ static int netvsc_xmit(struct sk_buff *skb, struct net_device *net, bool xdp_tx)
*/
vf_netdev = rcu_dereference_bh(net_device_ctx->vf_netdev);
if (vf_netdev && netif_running(vf_netdev) &&
- !netpoll_tx_running(net))
- return netvsc_vf_xmit(net, vf_netdev, skb);
+ !netpoll_tx_running(net)) {
+ if (!netvsc_vf_xmit(net, vf_netdev, skb))
+ return NETDEV_TX_OK;
+ goto drop;
+ }

/* We will atmost need two pages to describe the rndis
* header. We can only transmit MAX_PAGE_BUFFER_COUNT number
@@ -707,7 +711,8 @@ static int netvsc_xmit(struct sk_buff *skb, struct net_device *net, bool xdp_tx)
goto drop;
}

-static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *ndev)
+static netdev_tx_t netvsc_start_xmit(struct sk_buff *skb,
+ struct net_device *ndev)
{
return netvsc_xmit(skb, ndev, false);
}

2020-04-30 15:45:09

by Haiyang Zhang

[permalink] [raw]
Subject: RE: [PATCH v2] hv_netvsc: Fix netvsc_start_xmit's return type



> -----Original Message-----
> From: Nathan Chancellor <[email protected]>
> Sent: Thursday, April 30, 2020 2:02 AM
> To: Michael Kelley <[email protected]>
> Cc: KY Srinivasan <[email protected]>; Haiyang Zhang
> <[email protected]>; Stephen Hemminger
> <[email protected]>; Wei Liu <[email protected]>; linux-
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]; Sami
> Tolvanen <[email protected]>
> Subject: Re: [PATCH v2] hv_netvsc: Fix netvsc_start_xmit's return type
>
> Hi Michael,
>
> On Thu, Apr 30, 2020 at 12:06:09AM +0000, Michael Kelley wrote:
> > From: Nathan Chancellor <[email protected]> Sent: Tuesday,
> > April 28, 2020 10:55 AM
> > >
> > > Do note that netvsc_xmit still returns int because netvsc_xmit has a
> > > potential return from netvsc_vf_xmit, which does not return
> > > netdev_tx_t because of the call to dev_queue_xmit.
> > >
> > > I am not sure if that is an oversight that was introduced by commit
> > > 0c195567a8f6e ("netvsc: transparent VF management") or if everything
> > > works properly as it is now.
> > >
> > > My patch is purely concerned with making the definition match the
> > > prototype so it should be NFC aside from avoiding the CFI panic.
> > >
> >
> > While it probably works correctly now, I'm not too keen on just
> > changing the return type for netvsc_start_xmit() and assuming the
> > 'int' that is returned from netvsc_xmit() will be correctly mapped to
> > the netdev_tx_t enum type. While that mapping probably happens
> > correctly at the moment, this really should have a more holistic fix.
>
> While it might work correctly, I am not sure that the mapping is correct,
> hence that comment.
>
> netdev_tx_t is an enum with two acceptable types, 0x00 and 0x10. Up until
> commit 0c195567a8f6e ("netvsc: transparent VF management"), netvsc_xmit
> was guaranteed to return something of type netdev_tx_t.
>
> However, after said commit, we could return anything from netvsc_vf_xmit,
> which in turn calls dev_queue_xmit then __dev_queue_xmit which will
> return either an error code (-ENOMEM or
> -ENETDOWN) or something from __dev_xmit_skb, which appears to be
> NET_XMIT_SUCCESS, NET_XMIT_DROP, or NET_XMIT_CN.
>
> It does not look like netvsc_xmit or netvsc_vf_xmit try to convert those
> returns to netdev_tx_t in some way; netvsc_vf_xmit just passes the return
> value up to netvsc_xmit, which is the part that I am unsure about...
>
> > Nathan -- are you willing to look at doing the more holistic fix? Or
> > should we see about asking Haiyang Zhang to do it?
>
> I would be fine trying to look at a more holistic fix but I know basically nothing
> about this subsystem. I am unsure if something like this would be acceptable
> or if something else needs to happen.
> Otherwise, I'd be fine with you guys taking a look and just giving me
> reported-by credit.

Here is more info regarding Linux network subsystem:

As said in "include/linux/netdevice.h", drivers are allowed to return any codes
from the three different namespaces.
And hv_netvsc needs to support "transparent VF", and calls netvsc_vf_xmit >>
dev_queue_xmit which returns qdisc return codes, and errnos like -ENOMEM,
etc. These are compliant with the guideline below:

79 /*
80 * Transmit return codes: transmit return codes originate from three different
81 * namespaces:
82 *
83 * - qdisc return codes
84 * - driver transmit return codes
85 * - errno values
86 *
87 * Drivers are allowed to return any one of those in their hard_start_xmit()

Also, ndo_start_xmit function pointer is used by upper layer functions which can
handles three types of the return codes.
For example, in the calling stack: ndo_start_xmit << netdev_start_xmit <<
xmit_one << dev_hard_start_xmit():
The function dev_hard_start_xmit() uses dev_xmit_complete() to handle the
return codes. It handles three types of the return codes correctly.

3483 struct sk_buff *dev_hard_start_xmit(struct sk_buff *first, struct net_device *dev,
3484 struct netdev_queue *txq, int *ret)
3485 {
3486 struct sk_buff *skb = first;
3487 int rc = NETDEV_TX_OK;
3488
3489 while (skb) {
3490 struct sk_buff *next = skb->next;
3491
3492 skb_mark_not_on_list(skb);
3493 rc = xmit_one(skb, dev, txq, next != NULL);
3494 if (unlikely(!dev_xmit_complete(rc))) {
3495 skb->next = next;
3496 goto out;
3497 }
3498
3499 skb = next;
3500 if (netif_tx_queue_stopped(txq) && skb) {
3501 rc = NETDEV_TX_BUSY;
3502 break;
3503 }
3504 }
3505
3506 out:
3507 *ret = rc;
3508 return skb;
3509 }


118 /*
119 * Current order: NETDEV_TX_MASK > NET_XMIT_MASK >= 0 is significant;
120 * hard_start_xmit() return < NET_XMIT_MASK means skb was consumed.
121 */
122 static inline bool dev_xmit_complete(int rc)
123 {
124 /*
125 * Positive cases with an skb consumed by a driver:
126 * - successful transmission (rc == NETDEV_TX_OK)
127 * - error while transmitting (rc < 0)
128 * - error while queueing to a different device (rc & NET_XMIT_MASK)
129 */
130 if (likely(rc < NET_XMIT_MASK))
131 return true;
132
133 return false;
134 }

Regarding "a more holistic fix", I believe the return type of ndo_start_xmit should be
int, because of three namespaces of the return codes. This means to change all network
drivers. I'm not proposing to do this big change right now.

So I have no objection of your patch.

Thanks,
- Haiyang

Reviewed-by: Haiyang Zhang <[email protected]>

2020-05-01 22:28:59

by David Miller

[permalink] [raw]
Subject: Re: [PATCH v2] hv_netvsc: Fix netvsc_start_xmit's return type

From: Nathan Chancellor <[email protected]>
Date: Tue, 28 Apr 2020 10:54:56 -0700

> netvsc_start_xmit is used as a callback function for the ndo_start_xmit
> function pointer. ndo_start_xmit's return type is netdev_tx_t but
> netvsc_start_xmit's return type is int.
>
> This causes a failure with Control Flow Integrity (CFI), which requires
> function pointer prototypes and callback function definitions to match
> exactly. When CFI is in enforcing, the kernel panics. When booting a
> CFI kernel with WSL 2, the VM is immediately terminated because of this.
>
> The splat when CONFIG_CFI_PERMISSIVE is used:
...
> Avoid this by using the right return type for netvsc_start_xmit.
>
> Fixes: fceaf24a943d8 ("Staging: hv: add the Hyper-V virtual network driver")
> Link: https://github.com/ClangBuiltLinux/linux/issues/1009
> Signed-off-by: Nathan Chancellor <[email protected]>

Applied.