Date: Mon, 27 Apr 2020 13:51:56 +0100
From: Russell King - ARM Linux admin
To: Florinel Iordache
Cc: Andrew Lunn, davem@davemloft.net, netdev@vger.kernel.org,
	f.fainelli@gmail.com, hkallweit1@gmail.com,
	devicetree@vger.kernel.org, linux-doc@vger.kernel.org,
	robh+dt@kernel.org, mark.rutland@arm.com, kuba@kernel.org,
	corbet@lwn.net, shawnguo@kernel.org, Leo Li,
	"Madalin Bucur (OSS)", Ioana Ciornei, linux-kernel@vger.kernel.org
Subject: Re: [PATCH net-next v2 6/9] net: phy: add backplane kr driver support
Message-ID: <20200427125156.GD25745@shell.armlinux.org.uk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Apr 27, 2020 at 12:40:37PM +0000, Florinel Iordache wrote:
> > > +/* Backplane mutex between all KR PHY threads */
> > > +static struct mutex backplane_lock;
> > >
> > > +/* Read AN Link Status */
> > > +static int is_an_link_up(struct phy_device *phydev)
> > > +{
> > > +	struct backplane_device *bpdev = phydev->priv;
> > > +	int ret, val = 0;
> > > +
> > > +	mutex_lock(&bpdev->bpphy_lock);
> >
> > Last time i asked the question about how this mutex and the phy mutex
> > interact.  I don't remember seeing an answer.
> >
> >	Andrew
>
> Hi Andrew,
> Yes, your question was:
> <lock?
> It appears both are trying to do the same thing, serialise access to
> the PHY hardware.>>
> The answer is: yes, you are right, they are both protecting the
> critical section around accesses to the PHY hardware for a particular
> PHY.
> As you can see, each backplane device (bpdev) has one associated
> phy_device (phydev), so bpdev->bpphy_lock and phydev->lock are
> equivalent.
> Normally your assumption would be correct: the backplane driver should
> use the same phydev->lock, but there is the following problem.
> The backplane driver needs to protect all accesses to the PHY
> hardware, including those coming from the backplane scheduled
> workqueues for all lanes within a PHY.
> But phydev->lock is already acquired for a phy_device (from phy.c)
> before each phy_driver callback is called (e.g. config_aneg,
> suspend, ...).
> So if I used phydev->lock instead of bpdev->bpphy_lock, this would
> result in a deadlock when called from the phy_driver callbacks.
> One possible solution would be to remove all the locks that use
> bpphy_lock and instead take the single phydev->lock in the backplane
> KR state machine (bp_kr_state_machine).
> But that solution results in poorer training performance: the total
> training duration increases because only a single lane can enter the
> training procedure at a time, so multi-lane PHY training could
> ultimately fail because training does not finish within 500 ms. I
> wanted to avoid this loss of training performance.
> Yet another possible solution would be to keep the locks where they
> are, at the lowest level, exactly at the phy_read/write_mmd calls, so
> that lane training can run in parallel, but to use phydev->lock, as
> would be normal and according to your suggestion.
> But in that case I must avoid the deadlock I mentioned above by
> differentiating between calls coming from phy_driver callbacks, where
> phydev->lock has already been acquired for this phy_device by the phy
> framework (so the mutex should be skipped), and calls coming from
> anywhere else (for example the backplane KR state machine), where
> phydev->lock has not been acquired and the mutex must be taken.
> If you agree with this latest solution, I can implement it in the next
> version by adding a flag to backplane_device, called
> 'phy_mutex_already_acquired' or 'skip_phy_mutex', which is set in all
> backplane phy_driver callbacks and used to skip taking phydev->lock at
> the phy_read/write_mmd calls in those cases.

I think you have a rather big misunderstanding of the locking in phylib
from what you said above.

The register accessors do not use phydev->lock. Follow the code:
phy_read_mmd() uses phy_lock_mdio_bus(), and phy_lock_mdio_bus() locks
the phydev->mdio.bus->mdio_lock mutex. This is the _bus_ level lock,
and is entirely different from phydev->lock.

It is entirely safe to call phy_read_mmd() from any region of code which
is holding phydev->lock - indeed, we have many PHY drivers that already
do this.

So, I think you need to rewrite your entire locking strategy, because it
seems that you've misunderstood the locking here.

However, it's actually way worse, because of the abuse in your driver of
a single phy_device struct, which you use to access multiple PHYs,
randomly changing phydev->mdio.addr according to which PHY needs to be
accessed - you need to _carefully_ consider how your locking is done for
that. I regard this as a big abuse, and I'm very tempted to NAK your
patches on this abuse alone. I think you need to take onboard my
comments about the (ab)use of phy_device here.
An alternative solution to this is to push the phy_* accessors up a
level to the mdiobus level (we already have some, and I've already been
converting others) so you don't have to mess with phydev->mdio.addr at
all. However, I would still consider your use of struct phy_device to
be an abuse.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 10.2Mbps down 587kbps up
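A minimal sketch of the mdiobus-level direction suggested above,
assuming a kernel of roughly this vintage (v5.x, where Clause 45 reads
are encoded with MII_ADDR_C45): each lane keeps its own MDIO address in
driver-private data, and reads go through mdiobus_read() on the shared
bus rather than by rewriting phydev->mdio.addr. The bp_read_lane()
helper and its lane_addr parameter are hypothetical; this fragment is
not compilable outside a kernel tree.

```c
/* Hypothetical helper: read one MMD register of a given lane directly
 * at the mdiobus level.  mdiobus_read() takes bus->mdio_lock itself,
 * so concurrent lane workqueues serialise at the bus without any
 * driver-private mutex, and phydev->mdio.addr is never modified.
 */
static int bp_read_lane(struct mii_bus *bus, int lane_addr,
			int devad, u16 regnum)
{
	return mdiobus_read(bus, lane_addr,
			    MII_ADDR_C45 | (devad << 16) | regnum);
}
```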