From: Alexander Dahl
To: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Thomas Pfahl
Subject: net: never suspend the ethernet PHY on certain boards?
Date: Wed, 26 Jun 2019 13:23:24 +0200
Message-ID: <4693980.Yko7hG0E1C@ada>

Hei hei,

tl;dr: is there a way to prevent an ethernet PHY from ever powering down, preferably with some dt configuration rather than a hack such as patching out the suspend functions?

With the bugfix 0da70f808029476001109b6cb076737bc04cea2e ("net: macb: do not disable MDIO bus at open/close time"; came with kernel v4.19, was backported to v4.18.7) a problem arises for us which had been masked for ages. It shows up with a special combination of SoC, ethernet PHY and other chips on the same board, and the Linux drivers for them.

The boards use either an at91sam9g20 or a sama5d27 SoC, both using cadence/macb as ethernet driver. Both boards have an SMSC LAN8720A ethernet PHY attached. The RMII clock is generated by the PHY, which uses a 25 MHz crystal for that. This clock line is of course fed into the SoC/MAC, but is also used (you might say hijacked) by other chips on the board which depend on that clock being _always_ on (at least after initial init on boot). The hardware cannot be changed; we are talking about several hundred boards already sold over the last years. O:-)

The symptom is: when calling `ip link set down dev eth0` that clock goes off, and the other chips depending on that clock (neither the SoC nor the PHY) freeze. I could bisect this behaviour change on a vanilla kernel to the commit mentioned above (actually to the backport commit v4.18.7-4-g716fc5ce90cf, because I bisected from v4.17.19 to v4.18.20).

What I tracked down so far: before the bugfix, macb_close() cleared the MPE bit in the MAC Network Control Register, which probably prevents the MAC from sending MDIO telegrams to the PHY? After the bugfix, that bit is not cleared anymore (to still allow talking to other PHYs on the same MDIO bus; we don't have that case).
I assume communicating with the PHY is still possible then.

macb_close() also calls phy_stop(), which sets the state of the PHY driver state machine to PHY_HALTED; on the next run of that state machine phy_suspend() is called. The smsc PHY driver has no special suspend/resume functions but uses genphy_suspend(), which sets BMCR_PDOWN in the MII_BMCR register of that (standard compliant) PHY. I suspect that after that the PHY powers down and the clock goes off. I assume that before the bugfix this power down bit could not be set, because the MDIO interface in the MAC had been disabled, so the PHY stayed on. (However, there's a possible race, because in macb_close() phy_stop() is called before macb_reset_hw(), right?)

So far, these are mostly assumptions. I did not use gdb on the drivers or a logic analyzer on the MDIO lines. I could do that to prove them, however.

What I could do:

1) Revert that change on my tree, which would mean reverting a generic bugfix
2) Patch the smsc PHY driver to not suspend anymore
3) Invent some new way to prevent suspend on a configuration basis (dt?)
4) Anything I did not think of yet

I know 1) and 2) are hacks without a chance to make it to mainline. What would be your suggestions for 3) and 4)?

Greets
Alex
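
P.S. To make the suspected mechanism easier to discuss, here is a minimal userspace sketch of it. It is a toy model, not kernel code: struct model_phy, model_mdio_write(), model_phy_suspend(), model_clock_running() and the keep_powered flag are all hypothetical names made up for illustration. Only MII_BMCR and BMCR_PDOWN carry the values from linux/mii.h. The model assumes that a cleared MPE bit simply makes MDIO writes fail, and that the RMII clock stops as soon as the power-down bit is set, which is exactly the unproven part described above.

```c
#include <stdbool.h>
#include <stdint.h>

#define MII_BMCR   0x00    /* basic mode control register, as in linux/mii.h */
#define BMCR_PDOWN 0x0800  /* power-down bit, as in linux/mii.h */

/* Hypothetical model of one PHY behind one MAC; not a kernel structure. */
struct model_phy {
	uint16_t regs[32];   /* MDIO-visible register file */
	bool mdio_enabled;   /* models the MPE bit in the MAC */
	bool keep_powered;   /* hypothetical per-board "never suspend" setting */
};

/* A write only reaches the PHY while the MAC's MDIO interface is enabled. */
static int model_mdio_write(struct model_phy *phy, int reg, uint16_t val)
{
	if (!phy->mdio_enabled)
		return -1;
	phy->regs[reg] = val;
	return 0;
}

/* Models the effect attributed to genphy_suspend(): set BMCR_PDOWN,
 * unless the (hypothetical) board setting asks to keep the PHY on. */
static int model_phy_suspend(struct model_phy *phy)
{
	if (phy->keep_powered)
		return 0;
	return model_mdio_write(phy, MII_BMCR,
				phy->regs[MII_BMCR] | BMCR_PDOWN);
}

/* Assumption: the PHY-generated clock runs while BMCR_PDOWN is clear. */
static bool model_clock_running(const struct model_phy *phy)
{
	return !(phy->regs[MII_BMCR] & BMCR_PDOWN);
}
```

With mdio_enabled = false (the old behaviour) the suspend write fails and the clock keeps running. With mdio_enabled = true (after the bugfix) the bit is set and the clock stops. Setting keep_powered = true stands in for option 3): the suspend becomes a no-op and the clock stays on.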