Date: Sun, 31 May 2020 01:18:49 +0100
From: Russell King - ARM Linux admin
To: Vladimir Oltean
Cc: stable@vger.kernel.org, gregkh@linuxfoundation.org, netdev@vger.kernel.org,
 andrew@lunn.ch, f.fainelli@gmail.com, hkallweit1@gmail.com, davem@davemloft.net,
 kuba@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH stable-4.19.y] net: phy: reschedule state machine if AN has not completed in PHY_AN state
Message-ID: <20200531001849.GG1551@shell.armlinux.org.uk>
References: <20200530214315.1051358-1-olteanv@gmail.com>
In-Reply-To: <20200530214315.1051358-1-olteanv@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, May 31, 2020 at 12:43:15AM +0300, Vladimir Oltean wrote:
> From: Vladimir Oltean
>
> In kernel 4.19 (and probably earlier too) there are issues surrounding
> the PHY_AN state.
>
> For example, if a PHY is in PHY_AN state and AN has not finished, then
> what is supposed to happen is that the state machine gets rescheduled
> until it is, or until the link_timeout reaches zero which triggers an
> autoneg restart process.
>
> But actually the rescheduling never works if the PHY uses interrupts,
> because the condition under which rescheduling occurs is just if
> phy_polling_mode() is true.
> So basically, this whole rescheduling
> functionality works for AN-not-yet-complete just by mistake. Let me
> explain.
>
> Most of the time the AN process manages to finish by the time the
> interrupt has triggered. One might say "that should always be the case,
> otherwise the PHY wouldn't raise the interrupt, right?".
> Well, some PHYs implement an .aneg_done method which allows them to tell
> the state machine when the AN is really complete.
> The AR8031/AR8033 driver (at803x.c) is one such example. Even when
> copper autoneg completes, the driver still keeps the "aneg_done"
> variable unset until in-band SGMII autoneg finishes too (there is no
> interrupt for that). So we have the premises of a race condition.

Why do we care whether SGMII autoneg has completed - is that not the
domain of the MAC side of the link?

It sounds like things are a little confused. The PHY interrupt is
signalling that the copper side has completed its autoneg. If we're
in SGMII mode, the PHY can now start the process of informing the MAC
about the negotiation results across the SGMII link. When the MAC
receives those results, and sends the acknowledgement back to the PHY,
is it not the responsibility of the MAC to then say "the link is now
up" ?

That's how we deal with it elsewhere with phylink integration, which
is what has to be done when you have to cope with PHYs that switch
their host interface mode between SGMII, 2500BASE-X, 5GBASE-R and
10GBASE-R - the MAC side needs to be dynamically reconfigured depending
on the new host-side operating mode of the PHY. Only when the MAC
subsequently reports that the link has been established is the whole
link from the MAC to the media deemed to be operational.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC for 0.8m (est. 1762m) line in suburbia: sync at 13.1Mbps down 424kbps up