Subject: [PATCH] xen/netfront: destroy queues before real_num_tx_queues is zeroed

xennet_destroy_queues() relies on info->netdev->real_num_tx_queues to
delete queues. Since d7dac083414eb5bb99a6d2ed53dc2c1b405224e5
("net-sysfs: update the queue counts in the unregistration path"),
unregister_netdev() indirectly sets real_num_tx_queues to 0. Together,
those two facts mean that xennet_destroy_queues() called from
xennet_remove() cannot do its job, because it's called after
unregister_netdev(). This results in kfree-ing queues that are still
linked in napi, which ultimately crashes:

BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 1 PID: 52 Comm: xenwatch Tainted: G W 5.16.10-1.32.fc32.qubes.x86_64+ #226
RIP: 0010:free_netdev+0xa3/0x1a0
Code: ff 48 89 df e8 2e e9 00 00 48 8b 43 50 48 8b 08 48 8d b8 a0 fe ff ff 48 8d a9 a0 fe ff ff 49 39 c4 75 26 eb 47 e8 ed c1 66 ff <48> 8b 85 60 01 00 00 48 8d 95 60 01 00 00 48 89 ef 48 2d 60 01 00
RSP: 0000:ffffc90000bcfd00 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff88800edad000 RCX: 0000000000000000
RDX: 0000000000000001 RSI: ffffc90000bcfc30 RDI: 00000000ffffffff
RBP: fffffffffffffea0 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000001 R12: ffff88800edad050
R13: ffff8880065f8f88 R14: 0000000000000000 R15: ffff8880066c6680
FS: 0000000000000000(0000) GS:ffff8880f3300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000000e998c006 CR4: 00000000003706e0
Call Trace:
<TASK>
xennet_remove+0x13d/0x300 [xen_netfront]
xenbus_dev_remove+0x6d/0xf0
__device_release_driver+0x17a/0x240
device_release_driver+0x24/0x30
bus_remove_device+0xd8/0x140
device_del+0x18b/0x410
? _raw_spin_unlock+0x16/0x30
? klist_iter_exit+0x14/0x20
? xenbus_dev_request_and_reply+0x80/0x80
device_unregister+0x13/0x60
xenbus_dev_changed+0x18e/0x1f0
xenwatch_thread+0xc0/0x1a0
? do_wait_intr_irq+0xa0/0xa0
kthread+0x16b/0x190
? set_kthread_struct+0x40/0x40
ret_from_fork+0x22/0x30
</TASK>

Fix this by calling xennet_destroy_queues() from xennet_close() too,
while real_num_tx_queues is still valid. This ensures that the queues are
destroyed before real_num_tx_queues is set to 0, regardless of how
unregister_netdev() was called.
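
To illustrate the ordering problem described above, here is a minimal
userspace sketch. It is not driver code: struct model_netdev and every
model_*() function are hypothetical stand-ins, and the only behaviour
modelled is that the teardown loop is bounded by a counter that
unregistration has already reset.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, much reduced stand-in for the netfront device state. */
struct model_netdev {
	unsigned int real_num_tx_queues; /* loop bound used by teardown */
	unsigned int napi_linked;        /* NAPI contexts still linked */
	int *queues;                     /* backing memory of the queues */
};

static void model_setup(struct model_netdev *dev, unsigned int nq)
{
	dev->queues = calloc(nq, sizeof(*dev->queues));
	assert(dev->queues != NULL);
	dev->real_num_tx_queues = nq;
	dev->napi_linked = nq;
}

/* Models unregistration after the net-sysfs change: count is reset. */
static void model_unregister(struct model_netdev *dev)
{
	dev->real_num_tx_queues = 0;
}

/* Models xennet_destroy_queues(): unlink per-queue NAPI, then free. */
static void model_destroy_queues(struct model_netdev *dev)
{
	unsigned int i;

	for (i = 0; i < dev->real_num_tx_queues; i++)
		dev->napi_linked--;      /* stands in for netif_napi_del() */

	free(dev->queues);               /* freed even if nothing was unlinked */
	dev->queues = NULL;
}

/* Returns how many NAPI contexts still refer to the freed queues. */
static unsigned int model_dangling_after_remove(unsigned int nq)
{
	struct model_netdev dev = { 0 };

	model_setup(&dev, nq);
	model_unregister(&dev);          /* the count becomes 0 first ... */
	model_destroy_queues(&dev);      /* ... so the loop body never runs */
	return dev.napi_linked;
}
```

In this model, model_dangling_after_remove(nq) returns nq: every queue
is freed while still linked, which corresponds to the state
free_netdev() later trips over.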

Originally reported at
https://github.com/QubesOS/qubes-issues/issues/7257

Fixes: d7dac083414eb5bb9 ("net-sysfs: update the queue counts in the unregistration path")
Cc: [email protected] # 5.16+
Signed-off-by: Marek Marczykowski-Górecki <[email protected]>

---
While this fixes the issue, I'm not sure if that is the correct thing
to do. xennet_remove() calls xennet_destroy_queues() under rtnl_lock,
which may be important here? Just moving xennet_destroy_queues() before
unregister_netdev() in xennet_remove() did not help - it crashed in
another way (use-after-free in xennet_close()).

---
drivers/net/xen-netfront.c | 33 +++++++++++++++++----------------
1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index d514d96027a6..5b69a930581e 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -828,6 +828,22 @@ static netdev_tx_t xennet_start_xmit(struct sk_buff *skb, struct net_device *dev
 	return NETDEV_TX_OK;
 }
 
+static void xennet_destroy_queues(struct netfront_info *info)
+{
+	unsigned int i;
+
+	for (i = 0; i < info->netdev->real_num_tx_queues; i++) {
+		struct netfront_queue *queue = &info->queues[i];
+
+		if (netif_running(info->netdev))
+			napi_disable(&queue->napi);
+		netif_napi_del(&queue->napi);
+	}
+
+	kfree(info->queues);
+	info->queues = NULL;
+}
+
 static int xennet_close(struct net_device *dev)
 {
 	struct netfront_info *np = netdev_priv(dev);
@@ -839,6 +855,7 @@ static int xennet_close(struct net_device *dev)
 		queue = &np->queues[i];
 		napi_disable(&queue->napi);
 	}
+	xennet_destroy_queues(np);
 	return 0;
 }
 
@@ -2103,22 +2120,6 @@ static int write_queue_xenstore_keys(struct netfront_queue *queue,
 	return err;
 }
 
-static void xennet_destroy_queues(struct netfront_info *info)
-{
-	unsigned int i;
-
-	for (i = 0; i < info->netdev->real_num_tx_queues; i++) {
-		struct netfront_queue *queue = &info->queues[i];
-
-		if (netif_running(info->netdev))
-			napi_disable(&queue->napi);
-		netif_napi_del(&queue->napi);
-	}
-
-	kfree(info->queues);
-	info->queues = NULL;
-}
-
 
 
 static int xennet_create_page_pool(struct netfront_queue *queue)
--
2.31.1


2022-02-21 09:35:01

by Juergen Gross

Subject: Re: [PATCH] xen/netfront: destroy queues before real_num_tx_queues is zeroed

On 20.02.22 14:42, Marek Marczykowski-Górecki wrote:
> While this fixes the issue, I'm not sure if that is the correct thing
> to do. xennet_remove() calls xennet_destroy_queues() under rtnl_lock,
> which may be important here? Just moving xennet_destroy_queues() before

I checked some of the call paths leading to xennet_close(), and all of
those contained an ASSERT_RTNL(), so it seems the rtnl_lock is already
taken here. Could you test with adding an ASSERT_RTNL() in
xennet_destroy_queues()?
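
For readers following along, the requested check can be modelled in
userspace roughly as below. This is only a sketch: model_rtnl_held,
MODEL_ASSERT_RTNL() and the other model_*() names are hypothetical
stand-ins, and the counter replaces the warning the kernel would print.
In the driver itself the test would amount to placing ASSERT_RTNL() at
the top of xennet_destroy_queues().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the state of the rtnl lock. */
static bool model_rtnl_held;
/* Number of times the check fired (the kernel would warn instead). */
static unsigned int model_rtnl_warnings;

static void model_rtnl_lock(void)
{
	assert(!model_rtnl_held);        /* the model lock is not recursive */
	model_rtnl_held = true;
}

static void model_rtnl_unlock(void)
{
	assert(model_rtnl_held);
	model_rtnl_held = false;
}

#define MODEL_ASSERT_RTNL()                                           \
	do {                                                          \
		if (!model_rtnl_held) {                               \
			model_rtnl_warnings++;                        \
			fprintf(stderr, "model: rtnl not held\n");    \
		}                                                     \
	} while (0)

/* Models xennet_destroy_queues() with the suggested check on entry. */
static void model_destroy_queues(void)
{
	MODEL_ASSERT_RTNL();
	/* Queue teardown would follow here. */
}

/* Models a caller that takes the lock first, as xennet_remove() does. */
static void model_locked_caller(void)
{
	model_rtnl_lock();
	model_destroy_queues();
	model_rtnl_unlock();
}
```

Calling model_locked_caller() leaves model_rtnl_warnings untouched,
while a bare model_destroy_queues() call increments it, which is the
kind of signal the test is meant to surface.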

> unregister_netdev() in xennet_remove() did not help - it crashed in
> another way (use-after-free in xennet_close()).

Yes, this would need to basically do the xennet_close() handling in
xennet_destroy() instead, which I believe is not really an option.

In case your test with the added ASSERT_RTNL() doesn't show any
problem you can add my:

Reviewed-by: Juergen Gross <[email protected]>


Juergen


Subject: Re: [PATCH] xen/netfront: destroy queues before real_num_tx_queues is zeroed

On Mon, Feb 21, 2022 at 07:27:32AM +0100, Juergen Gross wrote:
> I checked some of the call paths leading to xennet_close(), and all of
> those contained an ASSERT_RTNL(), so it seems the rtnl_lock is already
> taken here. Could you test with adding an ASSERT_RTNL() in
> xennet_destroy_queues()?

Tried that and no issues spotted.

> In case your test with the added ASSERT_RTNL() doesn't show any
> problem you can add my:
>
> Reviewed-by: Juergen Gross <[email protected]>

Thanks.

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab



2022-02-23 01:14:52

by Jakub Kicinski

Subject: Re: [PATCH] xen/netfront: destroy queues before real_num_tx_queues is zeroed

On Mon, 21 Feb 2022 07:27:32 +0100 Juergen Gross wrote:
> On 20.02.22 14:42, Marek Marczykowski-Górecki wrote:
> > While this fixes the issue, I'm not sure if that is the correct thing
> > to do. xennet_remove() calls xennet_destroy_queues() under rtnl_lock,
> > which may be important here? Just moving xennet_destroy_queues() before
>
> I checked some of the call paths leading to xennet_close(), and all of
> those contained an ASSERT_RTNL(), so it seems the rtnl_lock is already
> taken here. Could you test with adding an ASSERT_RTNL() in
> xennet_destroy_queues()?
>
> > unregister_netdev() in xennet_remove() did not help - it crashed in
> > another way (use-after-free in xennet_close()).
>
> Yes, this would need to basically do the xennet_close() handling in
> xennet_destroy() instead, which I believe is not really an option.

I think the patch makes open/close asymmetric, tho. After ifup ; ifdown;
the next ifup will fail because queues are already destroyed, no?
IOW xennet_open() expects the queues were created at an earlier stage.
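
The asymmetry can be shown with a small userspace sketch. Everything
here is a hypothetical model (struct model_dev, the model_*() helpers
and the -1 error value are assumptions, not the driver's code); it only
captures that open expects queues created earlier while the v1 close
frees them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reduced device state. */
struct model_dev {
	int *queues;    /* NULL once the queues have been destroyed */
	bool up;
};

/* Models queue creation done once at connect time, not at open time. */
static void model_connect(struct model_dev *d, unsigned int nq)
{
	d->queues = calloc(nq, sizeof(*d->queues));
	assert(d->queues != NULL);
}

/* Models xennet_open(): it assumes the queues already exist. */
static int model_open(struct model_dev *d)
{
	if (!d->queues)
		return -1;      /* nothing to enable, so ifup fails */
	d->up = true;
	return 0;
}

/* Models xennet_close() as changed by v1: it also frees the queues. */
static void model_close_v1(struct model_dev *d)
{
	d->up = false;
	free(d->queues);
	d->queues = NULL;
}

/* Runs ifup; ifdown; ifup and returns the result of the second ifup. */
static int model_ifup_ifdown_ifup(void)
{
	struct model_dev d = { 0 };
	int rc;

	model_connect(&d, 2);
	rc = model_open(&d);
	assert(rc == 0);        /* the first ifup succeeds */
	model_close_v1(&d);
	return model_open(&d);
}
```

In this model the second open returns -1, because nothing recreates the
queues between close and the next open.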

Maybe we can move the destroy to ndo_uninit? (and create to ndo_init?)

Subject: Re: [PATCH] xen/netfront: destroy queues before real_num_tx_queues is zeroed

On Tue, Feb 22, 2022 at 12:03:01PM -0800, Jakub Kicinski wrote:
> On Mon, 21 Feb 2022 07:27:32 +0100 Juergen Gross wrote:
> > On 20.02.22 14:42, Marek Marczykowski-Górecki wrote:
> > > While this fixes the issue, I'm not sure if that is the correct thing
> > > to do. xennet_remove() calls xennet_destroy_queues() under rtnl_lock,
> > > which may be important here? Just moving xennet_destroy_queues() before
> >
> > I checked some of the call paths leading to xennet_close(), and all of
> > those contained an ASSERT_RTNL(), so it seems the rtnl_lock is already
> > taken here. Could you test with adding an ASSERT_RTNL() in
> > xennet_destroy_queues()?
> >
> > > unregister_netdev() in xennet_remove() did not help - it crashed in
> > > another way (use-after-free in xennet_close()).
> >
> > Yes, this would need to basically do the xennet_close() handling in
> > xennet_destroy() instead, which I believe is not really an option.
>
> I think the patch makes open/close asymmetric, tho. After ifup ; ifdown;
> the next ifup will fail because queues are already destroyed, no?
> IOW xennet_open() expects the queues were created at an earlier stage.

Right.

> Maybe we can move the destroy to ndo_uninit? (and create to ndo_init?)

It looks like talk_to_netback(), which currently creates the queues, needs
them for quite a lot of its work. It is also called when reconnecting (and
the netdev is _not_ re-registered in that case), so that would be a
significant refactor.
But moving the destroy to ndo_uninit() should be fine. It works, including
in the ifup;ifdown;ifup case. I'll send v2 shortly.
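
A rough userspace sketch of that arrangement follows. All names are
hypothetical, and the ordering inside model_unregister() (close, then
the uninit hook, then resetting the queue count) is an assumption of
the model rather than something taken from the kernel sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reduced device state. */
struct model_netdev {
	unsigned int real_num_tx_queues; /* loop bound used by teardown */
	unsigned int napi_linked;        /* NAPI contexts still linked */
	int *queues;
	bool up;
};

static void model_connect(struct model_netdev *dev, unsigned int nq)
{
	dev->queues = calloc(nq, sizeof(*dev->queues));
	assert(dev->queues != NULL);
	dev->real_num_tx_queues = nq;
	dev->napi_linked = nq;
}

static int model_open(struct model_netdev *dev)
{
	if (!dev->queues)
		return -1;
	dev->up = true;
	return 0;
}

/* Close only stops the device; the queues survive for the next open. */
static void model_close(struct model_netdev *dev)
{
	dev->up = false;
}

/* Uninit hook: tear the queues down while the count is still valid. */
static void model_uninit(struct model_netdev *dev)
{
	unsigned int i;

	for (i = 0; i < dev->real_num_tx_queues; i++)
		dev->napi_linked--;
	free(dev->queues);
	dev->queues = NULL;
}

/* Assumed order: close, then uninit, then the count is reset. */
static void model_unregister(struct model_netdev *dev)
{
	if (dev->up)
		model_close(dev);
	model_uninit(dev);
	dev->real_num_tx_queues = 0;
}

/* ifup; ifdown; ifup: returns the result of the second ifup. */
static int model_reopen_result(void)
{
	struct model_netdev dev = { 0 };
	int rc;

	model_connect(&dev, 2);
	rc = model_open(&dev);
	assert(rc == 0);
	model_close(&dev);
	rc = model_open(&dev);
	model_unregister(&dev);          /* release the model's memory */
	return rc;
}

/* Returns the NAPI contexts left linked after unregistration. */
static unsigned int model_dangling_after_unregister(unsigned int nq)
{
	struct model_netdev dev = { 0 };

	model_connect(&dev, nq);
	model_unregister(&dev);
	return dev.napi_linked;
}
```

Under these assumptions the second open succeeds and no NAPI context is
left linked after unregistration.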

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

