Received: by 2002:ac0:a591:0:0:0:0:0 with SMTP id m17-v6csp502850imm; Thu, 5 Jul 2018 04:13:49 -0700 (PDT) X-Google-Smtp-Source: AAOMgpec4Betwa9k8A7t1Ba3Ivf1zHMb71zLZEUM7I//PbONEE5kZUQHdWayVMoh2CWvjrmU0ti1 X-Received: by 2002:a63:35c3:: with SMTP id c186-v6mr5160820pga.217.1530789229672; Thu, 05 Jul 2018 04:13:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1530789229; cv=none; d=google.com; s=arc-20160816; b=mvyZSNV00ikRXXyo/ctXC7qsToxFmjaFv2+njPnO9tMPKNkaxLWPLN1tDdGyHgkt2u tpJPl+eu5qzvPSRMwVW67QKXQdlhqvuyNTj0xR+7CiQw4/L/ynrUAxGLHreAALi3mlTG tdwpZQdeAO+6xiOVhrjUUZK5nE9HOMc/QL4hv2cYduzj/ZiihZoz1SSCMZlmVZB5275V 6fdTlGRN03kqQZmd/fWYWyb+n0I+kFf5nugVvkUnaMRpjhi37MSb92K3zfYkHux9fdyi ija5a+7ybdwX9UX8nCSo+Bh5cDdLsapl614Bxc1Yqma0Jv6DEMVbj2cw7HcMk5hdAORw U+ng== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=vb7LWxzEbIiasBKo9S8np/tabHRsd8b5gwJ4G5c1yu8=; b=u2C00TqgHlCJ9nD5d6qQo1CB15qUP6OTTVma7KlhSHL6cst8ruw7Fp/jNTNRbGkUHI g2cETV4fneJ5uP0AnxNFq7Aog+lY3F9CL2X4cwsC5360c1rOy30vX0ENStnN9o1XHAVS +wed/Xv7WdzJxa6ZwpzjoYVi46IeDauggBDtViydhLpNX1W50YDRUTZn/WoNM549rstT zqLId2EYJZtCtKAEKT67ZH6vqrnmFrbg0EnR3hUik2RsnGG9NbAEFNMwvI9tQia682sW wQ738SPZG6FQyVGojJF7roMGC7J4E131qbLHUk5Lsy50av1SSBn95yqdywgpl2oREYMn S+yQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=QcsOWlzh; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id o23-v6si5436609pgm.170.2018.07.05.04.13.35; Thu, 05 Jul 2018 04:13:49 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@linaro.org header.s=google header.b=QcsOWlzh; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753894AbeGELMo (ORCPT + 99 others); Thu, 5 Jul 2018 07:12:44 -0400 Received: from mail-it0-f65.google.com ([209.85.214.65]:53590 "EHLO mail-it0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753755AbeGELMl (ORCPT ); Thu, 5 Jul 2018 07:12:41 -0400 Received: by mail-it0-f65.google.com with SMTP id a195-v6so11437123itd.3 for ; Thu, 05 Jul 2018 04:12:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=vb7LWxzEbIiasBKo9S8np/tabHRsd8b5gwJ4G5c1yu8=; b=QcsOWlzhfXLw63OZSKuHyrzjqK+b2p3/J1y6u67O4LDRqKZx7Tdn5897+camCD3HTI lc7l6Hv7TEug6Is4yONXWQiCGNc3U7Z6Juia+lPqbuNsfQQhv+GN4/9zQwJL87+/39AI AcTplJXoHzIVh1hHLlIMYzEM8vH4LJ0JUftYE= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=vb7LWxzEbIiasBKo9S8np/tabHRsd8b5gwJ4G5c1yu8=; b=KVURJQaW59khrn6i6fpawQiwLuxUko4RsMQpHyZCCWKpzmls+T6QQ1KncAz/LjeILM z8hg1GZpDyUBNhWPk5W/Cs283BBngkFDOJPFY1CDZIXtZpHG64uNDDcSbbyr6uZfMnVz JeiTlRbg+OLuA4bZGVngipKhO04/c0vqPgeICm5kiUUcXrkrrhd9OUAbnalar757UEgv Q3h324LTb+3IMoCmv0pPsbTZTfYuf5B/clzVkZ3Xg1+qareAu7eQkBPE+De4KVfIqvNw r4dBrGhkzDsOsgWGotzTPBgC23c6hILj09MBjYU5TDa9+u9DLsdMxN+vTJEupDzm1cty LloA== X-Gm-Message-State: APt69E1xfI6vyXoX7J3h+/zx4wBhzS7AQEUsQL9w1lXTpQfhCaSPdrt/ XFhVWAMYRdGuknQNBVd82CxVZ4H4owaC+lIhu0YOtw== X-Received: by 2002:a02:579a:: with SMTP id b26-v6mr4447954jad.107.1530789160740; Thu, 05 Jul 2018 04:12:40 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:a02:818f:0:0:0:0:0 with HTTP; Thu, 5 Jul 2018 04:12:40 -0700 (PDT) In-Reply-To: <20180704212951.43c6a62a@why.wild-wind.fr.eu.org> References: <1530685741-20604-1-git-send-email-stefan@olimex.com> <9b8f30fd-12aa-46ba-ced9-aed38ada0059@arm.com> <20180704212951.43c6a62a@why.wild-wind.fr.eu.org> From: Ulf Hansson Date: Thu, 5 Jul 2018 13:12:40 +0200 Message-ID: Subject: Re: [PATCH v2 1/1] mmc: sunxi: Disable irq during pm_suspend To: Marc Zyngier Cc: Stefan Mavrodiev , Maxime Ripard , Chen-Yu Tsai , "open list:MULTIMEDIA CARD (MMC), SECURE DIGITAL (SD) AND..." , "moderated list:ARM/Allwinner sunXi SoC support" , open list Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 4 July 2018 at 22:29, Marc Zyngier wrote: > On Wed, 4 Jul 2018 15:34:36 +0200 > Ulf Hansson wrote: > >> On 4 July 2018 at 13:34, Marc Zyngier wrote: >> > On 04/07/18 11:50, Ulf Hansson wrote: >> >> + Marc >> >> >> >> On 4 July 2018 at 08:28, Stefan Mavrodiev wrote: >> >>> When mmc host controller enters suspend state, the clocks are >> >>> disabled, but irqs are not. For some reason the irqchip emits >> >>> false interrupts, which causes system lock loop. >> >>> >> >>> Debug log is: >> >>> ... >> >>> sunxi-mmc 1c11000.mmc: setting clk to 52000000, rounded 51200000 >> >>> sunxi-mmc 1c11000.mmc: enabling the clock >> >>> sunxi-mmc 1c11000.mmc: cmd 13(8000014d) arg 10000 ie 0x0000bbc6 len 0 >> >>> sunxi-mmc 1c11000.mmc: irq: rq (ptrval) mi 00000004 idi 00000000 >> >>> sunxi-mmc 1c11000.mmc: cmd 6(80000146) arg 3210101 ie 0x0000bbc6 len 0 >> >>> sunxi-mmc 1c11000.mmc: irq: rq (ptrval) mi 00000004 idi 00000000 >> >>> sunxi-mmc 1c11000.mmc: cmd 13(8000014d) arg 10000 ie 0x0000bbc6 len 0 >> >>> sunxi-mmc 1c11000.mmc: irq: rq (ptrval) mi 00000004 idi 00000000 >> >>> mmc1: new DDR MMC card at address 0001 >> >>> mmcblk1: mmc1:0001 AGND3R 14.6 GiB >> >>> mmcblk1boot0: mmc1:0001 AGND3R partition 1 4.00 MiB >> >>> mmcblk1boot1: mmc1:0001 AGND3R partition 2 4.00 MiB >> >>> sunxi-mmc 1c11000.mmc: cmd 18(80003352) arg 0 ie 0x0000fbc2 len 409 >> >>> sunxi-mmc 1c11000.mmc: irq: rq (ptrval) mi 00004000 idi 00000002 >> >>> mmcblk1: p1 >> >>> sunxi-mmc 1c11000.mmc: irq: rq (null) mi 00000000 idi 00000000 >> >>> sunxi-mmc 1c11000.mmc: irq: rq (null) mi 00000000 idi 00000000 >> >>> sunxi-mmc 1c11000.mmc: irq: rq (null) mi 00000000 idi 00000000 >> >>> sunxi-mmc 1c11000.mmc: irq: rq (null) mi 00000000 idi 00000000 >> >>> and so on... >> >>> >> >>> This issue apears on eMMC cards, routed on MMC2 slot. The patch is >> >>> tested with A20-OLinuXino-MICRO/LIME/LIME2 boards. >> >>> >> >>> Fixes: 9a8e1e8cc2c0 ("mmc: sunxi: Add runtime_pm support") >> >>> Signed-off-by: Stefan Mavrodiev >> >>> --- >> >>> Changes in v2: >> >>> - Add comment why disable_irq() is necessary >> >>> >> >>> --- >> >>> drivers/mmc/host/sunxi-mmc.c | 7 +++++++ >> >>> 1 file changed, 7 insertions(+) >> >>> >> >>> diff --git a/drivers/mmc/host/sunxi-mmc.c b/drivers/mmc/host/sunxi-mmc.c >> >>> index e747259..8e7f3e3 100644 >> >>> --- a/drivers/mmc/host/sunxi-mmc.c >> >>> +++ b/drivers/mmc/host/sunxi-mmc.c >> >>> @@ -1446,6 +1446,7 @@ static int sunxi_mmc_runtime_resume(struct device *dev) >> >>> sunxi_mmc_init_host(host); >> >>> sunxi_mmc_set_bus_width(host, mmc->ios.bus_width); >> >>> sunxi_mmc_set_clk(host, &mmc->ios); >> >>> + enable_irq(host->irq); >> >>> >> >>> return 0; >> >>> } >> >>> @@ -1455,6 +1456,12 @@ static int sunxi_mmc_runtime_suspend(struct device *dev) >> >>> struct mmc_host *mmc = dev_get_drvdata(dev); >> >>> struct sunxi_mmc_host *host = mmc_priv(mmc); >> >>> >> >>> + /* >> >>> + * When clocks are off, it's possible receiving >> >>> + * fake interrupts, which will stall the system. >> >>> + * Disabling the irq will prevent this. >> >>> + */ >> >>> + disable_irq(host->irq); >> >> >> >> No, this doesn't work for shared IRQs. >> > >> > Well, in this case, it does work, because that interrupt line cannot be >> > shared with anything else, if I understand how the SoC is wired: each >> > MMC controller has a dedicated interrupt line to the GIC, and it isn't >> > shared with anything (that's on the A20 though, and I don't know about >> > other SoCs integrating the same IP). >> >> That's the problem. This may work on some SoCs but not on others. >> >> > >> >> >> >>> sunxi_mmc_reset_host(host); >> >>> sunxi_mmc_disable(host); >> >>> >> >>> -- >> >>> 2.7.4 >> >>> >> >> >> >> The only option today is to use free_irq() in runtime suspend and then >> >> re-request the irq to re-install the handler at runtime resume. >> >> >> >> That's not an optimal solution, which is pointed out in the below >> >> discussion as well. Moreover, it has also turned out using free_irq() >> >> is also problematic in cases threaded handlers are used. >> >> >> >> Here's the link to the discussion, it's not the only one I know of, so >> >> this is common problem. >> >> https://lkml.org/lkml/2016/1/28/213 >> >> >> >> Care to have a hack on the "common" solution, which in principle means >> >> adding APIs to genirq that can disable/enable handlers from being >> >> called, rather than the entire IRQ line. >> > >> > That doesn't work. You still end-up with a screaming interrupt, and you >> > will still spend 100% of your time in interrupt context for nothing. >> > >> > Eventually, the kernel will have enough (the /other/ shared handlers >> > returning IRQ_NONE all the time), and will forcefully kill that >> > particular interrupt interrupt line, meaning you end-up in the same >> > situation of having the line disabled for all the users of that >> > interrupt line. Except that now, it is disabled forever. >> >> Ahh, correct! >> >> Sounds like free_irq() is what we need. Only that it's bit heavy >> weight as we need to re-install handlers. > > BTW, free_irq() doesn't help you either in the case of a shared > handler. You'll end-up in the exact same scenario as above. In regards to the spurious interrupt storm issue, yes, I fully agree. On the other hand, in case of a shared IRQ, don't we want the genirq core to deal with disabling the IRQ, rather than the driver? Also, don't forget the other related issue, which is when the IRQ handler gets invoked (not as a storm, but once is enough), either because of a spurious IRQ or because of a shared IRQ - while the device is in a low power state (runtime suspended with clock gated for example). If that happens and the handler accesses a register the handler may hang. > > The real solution to this is to prevent the device itself from > generating interrupts (or to forbid interrupt sharing if it isn't > possible). I fully agree that the device should be configured to not deliver interrupt, this is the first and most important step a driver should take. For example it should mask its device's IRQ register bits. However, this isn't sufficient, because of shared IRQs and buggy HWs delivering spurious IRQs. Kind regards Uffe