With the inclusion of d7dac083414e ("net-sysfs: update the queue counts in the
unregistration path"), we have started to see the following message during one of
our stress tests that brings an interface up and down while continuously
trying to send out packets on it:
et3_11_1 selects TX queue 0, but real number of TX queues is 0
It seems that this is a result of a race between remove_queue_kobjects() and
netdev_cap_txqueue() for the last packets before setting dev->flags &= ~IFF_UP
in __dev_close_many(). When this message is displayed, netdev_cap_txqueue()
selects queue 0 anyway (the noop queue at this point). As it did before the
above commit, that queue (which I guess is still around due to reference
counting) proceeds to drop the packet and return NET_XMIT_CN. So there doesn't
appear to be a functional change. However, the warning message seems to be
spurious if not slightly confusing.
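For reference, the check that emits this message can be modeled in userspace roughly as follows. This is a simplified sketch, not the kernel source: struct fake_netdev and cap_txqueue() are hypothetical stand-ins for struct net_device and netdev_cap_txqueue(), the warned out-parameter exists only so the behavior can be observed, and the rate limiting of the real warning is omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the few struct net_device fields used here. */
struct fake_netdev {
	const char *name;
	unsigned int real_num_tx_queues;
};

/*
 * Userspace model of the range check: an out-of-range queue index is
 * reported and replaced by queue 0. *warned records whether the message
 * was printed (illustration only, not part of the kernel interface).
 */
static uint16_t cap_txqueue(const struct fake_netdev *dev,
			    uint16_t queue_index, int *warned)
{
	*warned = 0;
	if (queue_index >= dev->real_num_tx_queues) {
		fprintf(stderr,
			"%s selects TX queue %d, but real number of TX queues is %u\n",
			dev->name, queue_index, dev->real_num_tx_queues);
		*warned = 1;
		return 0;
	}
	return queue_index;
}

/* A device whose queue count was already reset to 0 by unregistration. */
void example_unregistering_device(void)
{
	struct fake_netdev dev = { .name = "et3_11_1", .real_num_tx_queues = 0 };
	int warned;

	/* Queue 0 is still returned, but the warning fires. */
	assert(cap_txqueue(&dev, 0, &warned) == 0);
	assert(warned == 1);
}
```

Once the queue count reads 0, every index is out of range, including 0, so a sender that races with unregistration triggers the message even though queue 0 is what it ends up with either way.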
I'm not exactly sure what the fix for this should be or if there should be
one. In the meantime, we have ignored this message for this test, but I was
wondering if there were any ideas for a better solution.
Thanks,
Kevin
Quoting Kevin Mitchell (2022-09-28 03:27:46)
> With the inclusion of d7dac083414e ("net-sysfs: update the queue counts in the
> unregistration path"), we have started to see the following message during one of
> our stress tests that brings an interface up and down while continuously
> trying to send out packets on it:
>
> et3_11_1 selects TX queue 0, but real number of TX queues is 0
>
> It seems that this is a result of a race between remove_queue_kobjects() and
> netdev_cap_txqueue() for the last packets before setting dev->flags &= ~IFF_UP
> in __dev_close_many(). When this message is displayed, netdev_cap_txqueue()
> selects queue 0 anyway (the noop queue at this point). As it did before the
> above commit, that queue (which I guess is still around due to reference
> counting) proceeds to drop the packet and return NET_XMIT_CN. So there doesn't
> appear to be a functional change. However, the warning message seems to be
> spurious if not slightly confusing.
Do you know the call traces leading to this? Also, I'm not 100% sure I
follow, as remove_queue_kobjects is called in the unregistration path
while the test is setting the iface up & down. What driver is used?
As you said and looking around queue 0 is somewhat special and used as a
fallback. My suggestion would be to 1) check if the above race is
expected 2) if yes, a possible solution would be not to warn when
real_num_tx_queues == 0 as in such cases selecting queue 0 would be the
expected fallback (and you might want to check places like [1]).
Thanks,
Antoine
[1] https://elixir.bootlin.com/linux/latest/source/net/core/dev.c#L4126
On Wed, Sep 28, 2022 at 11:46:20AM +0200, Antoine Tenart wrote:
> Quoting Kevin Mitchell (2022-09-28 03:27:46)
> > With the inclusion of d7dac083414e ("net-sysfs: update the queue counts in the
> > unregistration path"), we have started to see the following message during one of
> > our stress tests that brings an interface up and down while continuously
> > trying to send out packets on it:
> >
> > et3_11_1 selects TX queue 0, but real number of TX queues is 0
> >
> > It seems that this is a result of a race between remove_queue_kobjects() and
> > netdev_cap_txqueue() for the last packets before setting dev->flags &= ~IFF_UP
> > in __dev_close_many(). When this message is displayed, netdev_cap_txqueue()
> > selects queue 0 anyway (the noop queue at this point). As it did before the
> > above commit, that queue (which I guess is still around due to reference
> > counting) proceeds to drop the packet and return NET_XMIT_CN. So there doesn't
> > appear to be a functional change. However, the warning message seems to be
> > spurious if not slightly confusing.
>
> Do you know the call traces leading to this? Also, I'm not 100% sure I
> follow, as remove_queue_kobjects is called in the unregistration path
> while the test is setting the iface up & down. What driver is used?
Sorry, my language was imprecise. The device is being unregistered and
re-registered. The driver is out of tree for our front panel ports. I don't
think this is specific to the driver, but I'd be happy to be convinced
otherwise.
The call trace to the queue removal is
[ 628.165565] dump_stack+0x74/0x90
(remove_queue_kobject)
[ 628.165569] netdev_unregister_kobject+0x7a/0xb3
[ 628.165572] rollback_registered_many+0x560/0x5c4
[ 628.165576] unregister_netdevice_queue+0xa3/0xfc
[ 628.165578] unregister_netdev+0x1e/0x25
[ 628.165589] fdev_free+0x26e/0x29d [strata_dma_drv]
The call trace to the warning message is
[ 1094.355489] dump_stack+0x74/0x90
(netdev_cap_txqueue)
[ 1094.355495] netdev_core_pick_tx+0x91/0xaf
[ 1094.355500] __dev_queue_xmit+0x249/0x602
[ 1094.355503] ? printk+0x58/0x6f
[ 1094.355510] dev_queue_xmit+0x10/0x12
[ 1094.355518] packet_sendmsg+0xe88/0xeee
[ 1094.355524] ? update_curr+0x6b/0x15d
[ 1094.355530] sock_sendmsg_nosec+0x12/0x1d
[ 1094.355533] sock_write_iter+0x8a/0xb6
[ 1094.355539] new_sync_write+0x7c/0xb4
[ 1094.355543] vfs_write+0xfe/0x12a
[ 1094.355547] ksys_write+0x6e/0xb9
[ 1094.355552] ? exit_to_user_mode_prepare+0xd3/0xf0
[ 1094.355555] __x64_sys_write+0x1a/0x1c
[ 1094.355559] do_syscall_64+0x31/0x40
[ 1094.355564] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> As you said and looking around queue 0 is somewhat special and used as a
> fallback. My suggestion would be to 1) check if the above race is
> expected 2) if yes, a possible solution would be not to warn when
> real_num_tx_queues == 0 as in such cases selecting queue 0 would be the
> expected fallback (and you might want to check places like [1]).
Yes, this is exactly where this is happening and that sounds like a good idea to
me. As far as I can tell, the message is completely innocuous. If there really
are no cases where it is useful to have this warning for real_num_tx_queues ==
0, I could submit a patch to not emit it in that case.
>
> Thanks,
> Antoine
>
> [1] https://elixir.bootlin.com/linux/latest/source/net/core/dev.c#L4126
On Wed, 28 Sep 2022 16:20:33 -0700 Kevin Mitchell wrote:
> > As you said and looking around queue 0 is somewhat special and used as a
> > fallback. My suggestion would be to 1) check if the above race is
> > expected 2) if yes, a possible solution would be not to warn when
> > real_num_tx_queues == 0 as in such cases selecting queue 0 would be the
> > expected fallback (and you might want to check places like [1]).
>
> Yes, this is exactly where this is happening and that sounds like a good idea to
> me. As far as I can tell, the message is completely innocuous. If there really
> are no cases where it is useful to have this warning for real_num_tx_queues ==
> 0, I could submit a patch to not emit it in that case.
SGTM, FWIW.
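The change agreed on in this thread, not warning when real_num_tx_queues == 0, might look roughly like the userspace sketch below. It is an illustration under assumptions, not the patch itself: struct fake_netdev and cap_txqueue_quiet() are hypothetical names standing in for struct net_device and netdev_cap_txqueue(), the warned out-parameter exists only to make the behavior observable, and rate limiting is omitted.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the few struct net_device fields used here. */
struct fake_netdev {
	const char *name;
	unsigned int real_num_tx_queues;
};

/*
 * Variant of the range check that stays silent when the device has no
 * real TX queues left: falling back to queue 0 is the expected outcome
 * there, so only a non-zero queue count makes the warning meaningful.
 */
static uint16_t cap_txqueue_quiet(const struct fake_netdev *dev,
				  uint16_t queue_index, int *warned)
{
	*warned = 0;
	if (queue_index >= dev->real_num_tx_queues) {
		if (dev->real_num_tx_queues != 0) {
			fprintf(stderr,
				"%s selects TX queue %d, but real number of TX queues is %u\n",
				dev->name, queue_index,
				dev->real_num_tx_queues);
			*warned = 1;
		}
		return 0;	/* the fallback to queue 0 is unchanged */
	}
	return queue_index;
}

/* Unregistering device: same fallback as before, but no message. */
void example_quiet_fallback(void)
{
	struct fake_netdev dev = { .name = "et3_11_1", .real_num_tx_queues = 0 };
	int warned;

	assert(cap_txqueue_quiet(&dev, 0, &warned) == 0);
	assert(warned == 0);
}
```

A genuinely out-of-range index on a device that still has queues is reported as before; only the zero-queue case is silenced.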