Date: Wed, 28 Sep 2022 16:20:33 -0700
From: Kevin Mitchell
To: Antoine Tenart
Cc: Jakub Kicinski, netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: new warning caused by ("net-sysfs: update the queue counts in the unregistration path")
References: <166435838013.3919.14607521178984182789@kwain>
In-Reply-To: <166435838013.3919.14607521178984182789@kwain>

On Wed, Sep 28, 2022 at 11:46:20AM +0200, Antoine Tenart wrote:
> Quoting Kevin Mitchell (2022-09-28 03:27:46)
> > With the inclusion of d7dac083414e ("net-sysfs: update the queue counts in the
> > unregistration path"), we have started to see the following message during one of
> > our stress tests that brings an interface up and down while continuously
> > trying to send out packets on it:
> >
> > et3_11_1 selects TX queue 0, but real number of TX queues is 0
> >
> > It seems that this is a result of a race between remove_queue_kobjects() and
> > netdev_cap_txqueue() for the last packets before setting dev->flags &= ~IFF_UP
> > in __dev_close_many(). When this message is displayed, netdev_cap_txqueue()
> > selects queue 0 anyway (the noop queue at this point). As it did before the
> > above commit, that queue (which I guess is still around due to reference
> > counting) proceeds to drop the packet and return NET_XMIT_CN. So there doesn't
> > appear to be a functional change. However, the warning message seems to be
> > spurious if not slightly confusing.
>
> Do you know the call traces leading to this? Also I'm not 100% sure to
> follow as remove_queue_kobjects is called in the unregistration path
> while the test is setting the iface up & down. What driver is used?

Sorry, my language was imprecise. The test does not just bring the interface
up and down: the device is being unregistered and re-registered. The driver is
an out-of-tree driver for our front panel ports. I don't think this is specific
to the driver, but I'd be happy to be convinced otherwise.

The call trace to the queue removal is

[  628.165565] dump_stack+0x74/0x90 (remove_queue_kobject)
[  628.165569] netdev_unregister_kobject+0x7a/0xb3
[  628.165572] rollback_registered_many+0x560/0x5c4
[  628.165576] unregister_netdevice_queue+0xa3/0xfc
[  628.165578] unregister_netdev+0x1e/0x25
[  628.165589] fdev_free+0x26e/0x29d [strata_dma_drv]

The call trace to the warning message is

[ 1094.355489] dump_stack+0x74/0x90 (netdev_cap_txqueue)
[ 1094.355495] netdev_core_pick_tx+0x91/0xaf
[ 1094.355500] __dev_queue_xmit+0x249/0x602
[ 1094.355503] ? printk+0x58/0x6f
[ 1094.355510] dev_queue_xmit+0x10/0x12
[ 1094.355518] packet_sendmsg+0xe88/0xeee
[ 1094.355524] ? update_curr+0x6b/0x15d
[ 1094.355530] sock_sendmsg_nosec+0x12/0x1d
[ 1094.355533] sock_write_iter+0x8a/0xb6
[ 1094.355539] new_sync_write+0x7c/0xb4
[ 1094.355543] vfs_write+0xfe/0x12a
[ 1094.355547] ksys_write+0x6e/0xb9
[ 1094.355552] ? exit_to_user_mode_prepare+0xd3/0xf0
[ 1094.355555] __x64_sys_write+0x1a/0x1c
[ 1094.355559] do_syscall_64+0x31/0x40
[ 1094.355564] entry_SYSCALL_64_after_hwframe+0x44/0xa9

>
> As you said and looking around queue 0 is somewhat special and used as a
> fallback. My suggestion would be to 1) check if the above race is
> expected 2) if yes, a possible solution would be not to warn when
> real_num_tx_queues == 0 as in such cases selecting queue 0 would be the
> expected fallback (and you might want to check places like [1]).

Yes, this is exactly where this is happening and that sounds like a good idea
to me. As far as I can tell, the message is completely innocuous. If there
really are no cases where it is useful to have this warning for
real_num_tx_queues == 0, I could submit a patch to not emit it in that case.

>
> Thanks,
> Antoine
>
> [1] https://elixir.bootlin.com/linux/latest/source/net/core/dev.c#L4126
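
To make the suggestion above concrete, here is a rough user-space sketch of
the idea, not a kernel patch. It assumes the capping logic behaves as described
in this thread: an out-of-range queue index falls back to queue 0, and the
warning is skipped only when real_num_tx_queues == 0. The names
fake_net_device, cap_txqueue_sketch and warn_count are made up for
illustration, and the kernel's rate-limited warning is replaced by a plain
fprintf().

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the two struct net_device fields used here. */
struct fake_net_device {
	const char *name;
	unsigned int real_num_tx_queues;
};

/* Counts emitted warnings so the behaviour can be checked; illustration only. */
static unsigned int warn_count;

/*
 * Sketch of the capping step: an out-of-range queue index always falls
 * back to queue 0. Assumption from this thread: with zero real TX queues
 * (device being unregistered) that fallback is expected, so no warning.
 */
static uint16_t cap_txqueue_sketch(const struct fake_net_device *dev,
				   uint16_t queue_index)
{
	if (queue_index >= dev->real_num_tx_queues) {
		if (dev->real_num_tx_queues != 0) {
			fprintf(stderr,
				"%s selects TX queue %u, but real number of TX queues is %u\n",
				dev->name, (unsigned int)queue_index,
				dev->real_num_tx_queues);
			warn_count++;
		}
		return 0;
	}
	return queue_index;
}
```

With this shape, a device whose real_num_tx_queues has already dropped to 0
still gets queue 0 but stays silent, while a genuinely out-of-range selection
on a live device still warns.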