2018-03-11 19:26:04

by Josh Elsasser

[permalink] [raw]
Subject: [PATCH 0/1] net: avoid a kernel panic during sk_busy_loop

Hi Dave,

I stumbled across a reproducible kernel panic while playing around with
busy_poll on a Linux 4.9.86 kernel. There's an unfortunate interaction
between init_dummy_netdev, which doesn't bother to fill in netdev_ops, and
sk_busy_loop, which assumes netdev_ops is a valid pointer.

To reproduce on the device under test (DUT), I did:

$ ip addr show dev wlan0
8: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq [...]
inet 172.16.122.6/23 brd 172.16.123.255 scope global wlan0
$ sysctl -w net.core.busy_read=50
$ nc -l 172.16.122.6 5001

Then transmitted some data to this socket from a second host:

$ echo "foo" | nc 172.16.122.6 5001

The DUT immediately hits a kernel panic.

I've attached a patch that applies cleanly to the 4.9.87 stable release.
This fix isn't necessary for net/net-next (ndo_busy_poll was removed in
linux-4.11), but a further backport of this commit is likely required for
any stable releases older than linux-4.5.

I hope this is the right way to raise something like this. I couldn't find
a clear answer from the -stable and netdev howtos for bugs against features
that no longer exist in mainline.

Thanks,
Josh


2018-03-11 19:26:14

by Josh Elsasser

[permalink] [raw]
Subject: [PATCH 1/1] net: check dev->reg_state before deref of napi netdev_ops

init_dummy_netdev() leaves its netdev_ops pointer zeroed. This leads
to a NULL pointer dereference when sk_busy_loop fires against an iwlwifi
wireless adapter and checks napi->dev->netdev_ops->ndo_busy_poll.

Avoid this by ensuring that napi->dev is not a dummy device before
dereferencing napi dev's netdev_ops, preventing the following panic:

BUG: unable to handle kernel NULL pointer dereference at 00000000000000c8
IP: [<ffffffff817b4b72>] sk_busy_loop+0x92/0x2f0
Call Trace:
[<ffffffff815a3134>] ? uart_write_room+0x74/0xf0
[<ffffffff817964a9>] sock_poll+0x99/0xa0
[<ffffffff81223142>] do_sys_poll+0x2e2/0x520
[<ffffffff8118d3fc>] ? get_page_from_freelist+0x3bc/0xa30
[<ffffffff810ada22>] ? update_curr+0x62/0x140
[<ffffffff811ea671>] ? __slab_free+0xa1/0x2a0
[<ffffffff811ea671>] ? __slab_free+0xa1/0x2a0
[<ffffffff8179dbb1>] ? skb_free_head+0x21/0x30
[<ffffffff81221bd0>] ? poll_initwait+0x50/0x50
[<ffffffff811eaa36>] ? kmem_cache_free+0x1c6/0x1e0
[<ffffffff815a4884>] ? uart_write+0x124/0x1d0
[<ffffffff810bd1cd>] ? remove_wait_queue+0x4d/0x60
[<ffffffff810bd224>] ? __wake_up+0x44/0x50
[<ffffffff81582731>] ? tty_write_unlock+0x31/0x40
[<ffffffff8158c5c6>] ? tty_ldisc_deref+0x16/0x20
[<ffffffff81584820>] ? tty_write+0x1e0/0x2f0
[<ffffffff81587e50>] ? process_echoes+0x80/0x80
[<ffffffff8120c17b>] ? __vfs_write+0x2b/0x130
[<ffffffff8120d09a>] ? vfs_write+0x15a/0x1a0
[<ffffffff81223455>] SyS_poll+0x75/0x100
[<ffffffff819a6524>] entry_SYSCALL_64_fastpath+0x24/0xcf

Commit 79e7fff47b7b ("net: remove support for per driver ndo_busy_poll()")
indirectly fixed this upstream in linux-4.11 by removing the offending
pointer usage. No other users of napi->dev touch its netdev_ops.

Fixes: 060212928670 ("net: add low latency socket poll")
Fixes: ce6aea93f751 ("net: network drivers no longer need to implement ndo_busy_poll()") - 4.9.y
Signed-off-by: Josh Elsasser <[email protected]>
---
net/core/dev.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 8898618bf341..d0f67d544587 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5042,7 +5042,10 @@ bool sk_busy_loop(struct sock *sk, int nonblock)
goto out;

/* Note: ndo_busy_poll method is optional in linux-4.5 */
- busy_poll = napi->dev->netdev_ops->ndo_busy_poll;
+ if (napi->dev->reg_state != NETREG_DUMMY)
+ busy_poll = napi->dev->netdev_ops->ndo_busy_poll;
+ else
+ busy_poll = NULL;

do {
rc = 0;
--
2.11.0


2018-03-12 23:18:57

by Cong Wang

[permalink] [raw]
Subject: Re: [PATCH 1/1] net: check dev->reg_state before deref of napi netdev_ops

On Sun, Mar 11, 2018 at 12:22 PM, Josh Elsasser <[email protected]> wrote:
> init_dummy_netdev() leaves its netdev_ops pointer zeroed. This leads
> to a NULL pointer dereference when sk_busy_loop fires against an iwlwifi
> wireless adapter and checks napi->dev->netdev_ops->ndo_busy_poll.
>
> Avoid this by ensuring that napi->dev is not a dummy device before
> dereferencing napi dev's netdev_ops, preventing the following panic:

Hmm, how about just checking ->netdev_ops? Checking reg_state looks
odd, although works.

2018-03-13 05:19:11

by Josh Elsasser

[permalink] [raw]
Subject: Re: [PATCH 1/1] net: check dev->reg_state before deref of napi netdev_ops

On Mon, Mar 12, 2018 at 4:17 PM, Cong Wang <[email protected]> wrote:
> On Sun, Mar 11, 2018 at 12:22 PM, Josh Elsasser <[email protected]> wrote:
>> init_dummy_netdev() leaves its netdev_ops pointer zeroed. This leads
>> to a NULL pointer dereference when sk_busy_loop fires against an iwlwifi
>> wireless adapter and checks napi->dev->netdev_ops->ndo_busy_poll.
>>
>> Avoid this by ensuring that napi->dev is not a dummy device before
>> dereferencing napi dev's netdev_ops, preventing the following panic:
>
> Hmm, how about just checking ->netdev_ops? Checking reg_state looks
> odd, although works.

Fair point. I was trying to differentiate between an unexpected
NULL pointer and a dummy netdev, but I guess it was clever
at the expense of readability.

I'll push up a v2 that just does the obvious.