Message-ID: <790f9d0914e2baba66c394f4e21ef118e44d9775.camel@sipsolutions.net>
Subject: Re: netif_carrier_on() race
From: Johannes Berg
To: netdev@vger.kernel.org
Cc: linux-wireless@vger.kernel.org
Date: Tue, 26 Sep 2023 10:07:11 +0200
In-Reply-To: <346b21d87c69f817ea3c37caceb34f1f56255884.camel@sipsolutions.net>

Focusing on this part for a moment, because it affects not just
wireless:

> Then, in netif_carrier_on(), we immediately set the carrier on bit, so
> that you can actually immediately see this from userspace if you ask
> rtnetlink, however, it's not actually immediately _functional_ - it
> still needs to schedule and run the linkwatch work first, to call
> dev_activate() to change the (TX queue) qdisc(s) away from noop.
>
> Also, even though you can already query the carrier state and see it on,
> the actual rtnetlink _event_ for this only happens from the linkwatch
> work as well, via netdev_state_change().
>
> All of this makes sense since you need to hold RTNL for all those state
> changes/notifier chains, but it does lead to the first race/consistency
> problem: if you query at just the right time you can see carrier being
> on, however, if the carrier is actually removed again and the linkwatch
> work didn't run yet, there might never be an event for the carrier on,
> iow, you might have:
>
>  netif_carrier_on()
>  query from userspace and see carrier on
>  netif_carrier_off()
>  linkwatch work runs and sends only carrier off event

and also because, as Andrew mentioned, you can have the exact opposite
problem...

It can actually happen that something _else_ sends an event, so even if
userspace doesn't query but waits for a carrier on event, you could end
up with:

 * netif_carrier_on()
 * something else triggers netdev_state_change(), even userspace
   setting link alias
   netdev_state_change() -> sends an rtnetlink event saying carrier is on
 * userspace transmits but frames are dropped
 * linkwatch work runs and enables qdiscs only now

To address this issue, we could introduce a new state, say
__LINK_STATE_CARRIER_COMPLETE or something like that, which is used when
communicating carrier state to userspace, and only set/cleared in
dev_activate()/dev_deactivate(). That way, any events to userspace (or
userspace querying) wouldn't show the carrier state until it's actually
fully reflected in software (qdiscs) too.

This doesn't fully solve _my_ (wifi) problem, but perhaps lets me work
around it in userspace by querying for the carrier state: if it's
reflected correctly ("fully ready to transmit") then we can do that.
Right now, we can't even do that. But would it break something else?
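
To make the two orderings above concrete, here is a toy userspace model
of them. It is only a sketch and not kernel code: every name in it
(struct model_dev, the model_*() helpers, the report_complete switch) is
made up for illustration, and it assumes the linkwatch work is the only
place that switches the qdisc and the proposed _COMPLETE bit.

```c
#include <stdbool.h>

/* Toy userspace model of the orderings described above; NOT kernel code.
 * All identifiers are hypothetical and exist only for illustration. */
struct model_dev {
	bool carrier;           /* flipped immediately by carrier on/off */
	bool carrier_complete;  /* proposed bit, flipped only by linkwatch */
	bool qdisc_active;      /* false: TX still goes to the noop qdisc */
	bool linkwatch_pending; /* linkwatch work scheduled, not yet run */
	bool report_complete;   /* true: userspace is shown carrier_complete */
	int on_events;          /* events that told userspace "carrier on" */
	int off_events;         /* events that told userspace "carrier off" */
};

/* What a carrier query from userspace would report in this model. */
bool model_query(const struct model_dev *dev)
{
	return dev->report_complete ? dev->carrier_complete : dev->carrier;
}

/* Stands in for netdev_state_change(): emit an event with the current view. */
void model_state_change(struct model_dev *dev)
{
	if (model_query(dev))
		dev->on_events++;
	else
		dev->off_events++;
}

/* Stands in for netif_carrier_on(): set the bit now, defer the rest. */
void model_carrier_on(struct model_dev *dev)
{
	if (!dev->carrier) {
		dev->carrier = true;
		dev->linkwatch_pending = true;
	}
}

/* Stands in for netif_carrier_off(): clear the bit now, defer the rest. */
void model_carrier_off(struct model_dev *dev)
{
	if (dev->carrier) {
		dev->carrier = false;
		dev->linkwatch_pending = true;
	}
}

/* Stands in for the linkwatch work: (de)activate the qdisc, then notify. */
void model_linkwatch_run(struct model_dev *dev)
{
	if (!dev->linkwatch_pending)
		return;
	dev->linkwatch_pending = false;
	dev->qdisc_active = dev->carrier;     /* dev_activate()/dev_deactivate() */
	dev->carrier_complete = dev->carrier; /* proposed bit follows the qdisc */
	model_state_change(dev);
}

/* Returns true if a frame would be queued, false if noop would drop it. */
bool model_transmit(const struct model_dev *dev)
{
	return dev->qdisc_active;
}
```

With report_complete left false, calling model_carrier_on() and then
model_state_change() counts a "carrier on" event while model_transmit()
still returns false, which is the second ordering. The first ordering
shows up as model_query() returning true after model_carrier_on(),
followed by model_carrier_off() and model_linkwatch_run() producing only
an off event. With report_complete set to true, neither the query nor
the event reports carrier on before model_linkwatch_run() has run.
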

There's also a way to query it via ethtool, which perhaps should _not_ be
converted to _COMPLETE since you could argue the ethtool link state is
about the physical link, and with this change the carrier becomes about
the logical link in a fashion?

johannes