Received: by 2002:a25:ab43:0:0:0:0:0 with SMTP id u61csp4797022ybi; Mon, 3 Jun 2019 18:00:11 -0700 (PDT) X-Google-Smtp-Source: APXvYqyKlH/NJKcgopWaKHRyJgnnVdIMb4QO7nl6EMr700fswzhXaS/o1op7FWYTHcdk7yJHYC+R X-Received: by 2002:a63:78cf:: with SMTP id t198mr23158367pgc.82.1559610011182; Mon, 03 Jun 2019 18:00:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1559610011; cv=none; d=google.com; s=arc-20160816; b=L1m6AH5CJ5mNhaePF++neIN86C4fQ+b6XT+M8etD9jEszbTaZWnqF71MAchPZOkIwE dBIa31QCfECwQsTjUx2sIJSLHbz6r9/LJEArI1CiEyyPGAtaP/edJbpB/prDc1eYy3qw sN2r5Qyt90YQU4IFIIe0+sA+pu8mcKcVVU4OPfsShl0rVwHGJFPoOmq4F1zV+75wr1q4 kVuHyG4IRIIdaFFENAEk9lzmdgySej1NsPwi0XlRTYLY8TkZmcX7qjgltpYfXpdG3DpE JOv+XQWKzRcN10NnYz3oqFbc5DH+ZOZwwsIPcHMfqd1AgYTzpn7W7zO5H2TLdMQ2YTnO n6XA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=wGerWka379zTFKk4O18r7e4C+XH4CLZsivoNiqmFflA=; b=OfKoLiyBhxdqlfvT6K9au74kMn/D6V0cn8NtNQmqCgzmOe+C0BhUvqYYtcQ+6DlRiP CpOdfozkGuGJwNWM2pWxtG9NxYpUG0iB8miKWuuhWwPeBsls0v1jf4m6a6l8Awmn4CEj 7JBLungqvSCelmHzhjl6SSf8QhyC+nDhc+IgsxZExfXC4xVFXkGZNBFHqM+tw8GejQ5y RZSkXME9DNpWSyJEJct8ECILgG/Kg/9AkLMQjSo7raEBAmFCHai/IiLYFVAirDpL40cU dYZh5nW1DJGQ4aBRgD2ZRiNwFliSZLFwTyqoneM3oPusk8sDjIVUpCblzKAMGIII3yvI 9keQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b="N/B5m564"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 74si19587085pga.291.2019.06.03.17.59.54; Mon, 03 Jun 2019 18:00:11 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b="N/B5m564"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726566AbfFDA5r (ORCPT + 99 others); Mon, 3 Jun 2019 20:57:47 -0400 Received: from mail-qt1-f194.google.com ([209.85.160.194]:36679 "EHLO mail-qt1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726173AbfFDA5q (ORCPT ); Mon, 3 Jun 2019 20:57:46 -0400 Received: by mail-qt1-f194.google.com with SMTP id u12so11855191qth.3 for ; Mon, 03 Jun 2019 17:57:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=wGerWka379zTFKk4O18r7e4C+XH4CLZsivoNiqmFflA=; b=N/B5m564uGRU5DzGSaidfavHUZK6VVz9j6uZFXnLNih2mM8li/CbD2YgPhtBuvUB8s DN56C6nsS+a65E4/Eg/e5JNxm/80fhAUP4jtU75cgX61L0i1ES1BdsVNbWLZ+IHWgRLe /GgIQasYCPFzw6oMVy2VAHhTKmSHlCsEKIGLo= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=wGerWka379zTFKk4O18r7e4C+XH4CLZsivoNiqmFflA=; b=H6J7u6YYWJY+4CWCxJphweOFZ//3P2sDJIOQRpYaCErR8bN7fdzK9DtJ4KFYelkHlk LcORRXbc4lfWE34+hQWd+lizw1D4UnN/+5I0Iv4cLxSR/BHiTB8kGMdXOANWXEuRwsP9 cJzkoweXdD8d2wzf8Vy4Ie2FBIDU+kPjJKS4cGCM3ukZGvpbWwrdZmO5DqRZ+ONWOs2U Wrta//jLg0iODoA1PCvBg/qpWoPi2JF1dM2Fyc94pdxgQwYGtzgOI6oxTADrMq9j+s0Q SZ5/P3onHuHKdK2VvQE8PkOc61Efj/4GcmrFhQwDNRli36daFuaxj48ln7uxcC33xj4H /aoQ== X-Gm-Message-State: APjAAAXCj+8U8qnjkUCwHxi7rCMAP1ifID2jPdoDxujCJ3Y7g2bxyade bxFSc+MqrH9l2wqbFupaM6R0fdlpHBfh3YVqqTTuSvZuIaZf9w== X-Received: by 2002:a0c:b04d:: with SMTP id l13mr24741070qvc.104.1559609865168; Mon, 03 Jun 2019 17:57:45 -0700 (PDT) MIME-Version: 1.0 References: <20190429035515.73611-1-drinkcat@chromium.org> <20190429035515.73611-3-drinkcat@chromium.org> <155778659317.14659.136626364818483852@swboyd.mtv.corp.google.com> <155786487644.14659.17142525593824613967@swboyd.mtv.corp.google.com> <1559289956.13732.17.camel@mhfsdcap03> <1559548885.13732.28.camel@mhfsdcap03> In-Reply-To: <1559548885.13732.28.camel@mhfsdcap03> From: Nicolas Boichat Date: Tue, 4 Jun 2019 08:57:34 +0800 Message-ID: Subject: Re: [PATCH 2/2] pinctrl: mediatek: Update cur_mask in mask/mask ops To: Chuanjia Liu Cc: Evan Green , Stephen Boyd , "moderated list:ARM/Mediatek SoC support" , Sean Wang , Linus Walleij , Matthias Brugger , "open list:GPIO SUBSYSTEM" , linux-arm Mailing List , lkml , Thomas Gleixner Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jun 3, 2019 at 4:01 PM Chuanjia Liu wrote: > > On Fri, 2019-05-31 at 10:17 -0700, Evan Green wrote: > > On Fri, May 31, 2019 at 1:06 AM Chuanjia Liu wrote: > > > > > > On Thu, 2019-05-30 at 10:12 -0700, Evan Green wrote: > > > > On Wed, May 15, 2019 at 1:05 AM Nicolas Boichat wrote: > > > > > > > > > > On Wed, May 15, 2019 at 4:14 AM Stephen Boyd wrote: > > > > > > > > > > > > Quoting Nicolas Boichat (2019-05-13 18:37:58) > > > > > > > On Tue, May 14, 2019 at 6:29 AM Stephen Boyd wrote: > > > > > > > > > > > > > > > > Quoting Nicolas Boichat (2019-04-28 20:55:15) > > > > > > > > > During suspend/resume, mtk_eint_mask may be called while > > > > > > > > > wake_mask is active. For example, this happens if a wake-source > > > > > > > > > with an active interrupt handler wakes the system: > > > > > > > > > irq/pm.c:irq_pm_check_wakeup would disable the interrupt, so > > > > > > > > > that it can be handled later on in the resume flow. > > > > > > > > > > > > > > > > > > However, this may happen before mtk_eint_do_resume is called: > > > > > > > > > in this case, wake_mask is loaded, and cur_mask is restored > > > > > > > > > from an older copy, re-enabling the interrupt, and causing > > > > > > > > > an interrupt storm (especially for level interrupts). > > > > > > > > > > > > > > > > > > Instead, we just record mask/unmask changes in cur_mask. This > > > > > > > > > also avoids the need to read the current mask in eint_do_suspend, > > > > > > > > > and we can remove mtk_eint_chip_read_mask function. > > > > > > > > > > > > > > > > > > Signed-off-by: Nicolas Boichat > > > > > > > > > > > > > > > > It looks an awful lot like you should just use IRQCHIP_MASK_ON_SUSPEND > > > > > > > > here. Isn't that what's happening? All non-wake irqs should be masked at > > > > > > > > the hardware level so they can't cause a wakeup during suspend and on > > > > > > > > resume they can be unmasked? > > > > > > > > > > > > > > No, this is for an line that has both wake and interrupt enabled. To > > > > > > > reword the commit message above: > > > > > > > > > > > > Is my understanding correct that there isn't a different "wake up" > > > > > > register that can be written to cause a GPIO to be configured to wake > > > > > > the system from suspend? The only way to do so is to leave the GPIO > > > > > > unmasked in the hardware by having EINT_EN[irq] = 1? And thus any > > > > > > interrupts that we don't want to wake us up during suspend should be > > > > > > masked in the hardware? > > > > > > > > > > Yes, that's my understanding as well. > > > > > > > > > > And then, what this driver does is to emulate the behaviour of a > > > > > controller that would actually have separate irq and wake enable > > > > > registers. > > > > > > > > > > > If that's true, the code here that's trying to keep track of enabled > > > > > > irqs and wakeup enabled irqs can be replaced with the irqchip flag so > > > > > > that wakeup irqs are not masked while non-wakeups are masked. > > > > > > > > > > Correct, but with the caveat that I don't see anything that definitely > > > > > requires an interrupt to be enabled to be a wake source. See below... > > > > > > > > > > > > > > > > > > 1. cur_mask[irq] = 1; wake_mask[irq] = 1; EINT_EN[irq] = 1 (interrupt > > > > > > > enabled at hardware level) > > > > > > > 2. System suspends, resumes due to that line (at this stage EINT_HW > > > > > > > == wake_mask) > > > > > > > 3. irq_pm_check_wakeup is called, and disables the interrupt => > > > > > > > EINT_EN[irq] = 0, but we still have cur_mask[irq] = 1 > > > > > > > 4. mtk_eint_do_resume is called, and restores EINT_EN = cur_mask, so > > > > > > > it reenables EINT_EN[irq] = 1 => interrupt storm. > > > > > > > > > > > > > > This patch fixes the issue in step 3. So that the interrupt can be > > > > > > > re-enabled properly later on, sometimes after mtk_eint_do_resume, when > > > > > > > the driver is ready to handle it. > > > > > > > > > > > > Right, we'd rather not see irqchip drivers working around the genirq > > > > > > layer to do these things like tracking cur_mask and wake_mask. That > > > > > > leads to subtle bugs and makes the driver maintain state across the > > > > > > irqchip callbacks and system suspend/resume. > > > > > > > > > > > > > > > > > > > > Also, IRQCHIP_MASK_ON_SUSPEND does not handle lines that are enabled > > > > > > > as a wake source, but without interrupt enabled (e.g. cros_ec driver > > > > > > > does that), which we do want to support. > > > > > > > > > > > > Hmm. I thought that even if the irq is disabled by a driver, that would > > > > > > be a lazy disable so it isn't really masked in the hardware. Then if an > > > > > > interrupt comes in during suspend on a wake configured irq line, the > > > > > > hardware will have left it unmasked because IRQCHIP_MASK_ON_SUSPEND in > > > > > > combination with lazy disable would mean that the line is left unmasked > > > > > > (ignoring whatever this mediatek driver is doing to mask and unmask in > > > > > > PM hooks). > > > > > > > > > > At the very least, that's not what happens with this system. The > > > > > interrupt is definitely not kept enabled in suspend, and the system > > > > > would not wake from an EC interrupt. (see also this series, BTW: > > > > > https://patchwork.kernel.org/cover/10921121/). > > > > > > > > > > > Just reading Documentation/power/suspend-and-interrupts.txt I'm led to > > > > > > believe that the cros_ec driver shouldn't call disable_irq() on the > > > > > > interrupt if it wants to wakeup from it: > > > > > > > > > > > > "Calling enable_irq_wake() causes suspend_device_irqs() to treat the > > > > > > given IRQ in a special way. Namely, the IRQ remains enabled, by on the > > > > > > first interrupt it will be disabled, marked as pending and "suspended" > > > > > > so that it will be re-enabled by resume_device_irqs() during the > > > > > > subsequent system resume. Also the PM core is notified about the event > > > > > > which causes the system suspend in progress to be aborted (that doesn't > > > > > > have to happen immediately, but at one of the points where the suspend > > > > > > thread looks for pending wakeup events)." > > > > > > > > > > I think this describes the behaviour when you keep both enabled. > > > > > > > > > > > I suppose the problem is an irq line disabled in hardware that has > > > > > > wakeup armed on it? Is this even valid? Shouldn't an irq be enabled for > > > > > > wakeup to work? > > > > > > > > > > I couldn't really find a definite answer, but there are a bunch of > > > > > examples of other drivers in the kernel: > > > > > - drivers/extcon/extcon-usb-gpio.c:usb_extcon_suspend > > > > > - drivers/hid/i2c-hid/i2c-hid.c:i2c_hid_suspend > > > > > - drivers/mfd/max77843.c:max77843_suspend > > > > > (not exhaustive, this is quite hard to grep for...) > > > > > > > > > > > We could immediately unmask those lines in the hardware when the > > > > > > set_wake() callback is called. That way the genirq layer can use the > > > > > > driver to do what it wants with the hardware and the driver can make > > > > > > sure that set_wake() will always cause the wakeup interrupt to be > > > > > > delivered to genirq even when software has disabled it. > > > > > > > > > > > > But I think that there might be a problem with how genirq understands > > > > > > the masked state of a line when the wakeup implementation conflates > > > > > > masked state with wakeup armed state. Consider this call-flow: > > > > > > > > > > > > irq masked in hardware, IRQD_IRQ_MASKED is set > > > > > > enable_irq_wake() > > > > > > unmask_irq in hardware > > > > > > IRQD_WAKEUP_ARMED is set > > > > > > > > > > > > handle_level_irq() > > > > > > mask_ack_irq() > > > > > > mask_irq() > > > > > > if (irqd_irq_masked()) -> returns true and skips masking! > > > > > > if (desc->irq_data.chip->irq_ack) > > > > > > ... > > > > > > irq_may_run() > > > > > > irq_pm_check_wakeup() > > > > > > irq_disable() > > > > > > mask_irq() -> does nothing again > > > > > > > > > > > > In the above flow, we never mask the irq because we thought it was > > > > > > already masked when it was disabled, but the irqchip implementation > > > > > > unmasked it to make wakeup work. Maybe we should always mask the irq if > > > > > > wakeup is armed and we're trying to call mask_irq()? Looks hacky. > > > Maybe we can implement irqchip's mask_ack_irq in mediatek driver to > > > always mask the irq. Then flow will always call it without judgment > > > IRQD_IRQ_MASKED. > > > > > > diff --git a/drivers/pinctrl/mediatek/mtk-eint.c > > > b/drivers/pinctrl/mediatek/mtk- > > > index f464f8c..9f1aae2 100644 > > > --- a/drivers/pinctrl/mediatek/mtk-eint.c > > > +++ b/drivers/pinctrl/mediatek/mtk-eint.c > > > @@ -272,12 +272,19 @@ static void mtk_eint_irq_release_resources(struct > > > irq_data > > > gpiochip_unlock_as_irq(gpio_c, gpio_n); > > > } > > > > > > +static void mtk_eint_mask_ack(struct irq_data *d) > > > +{ > > > + mtk_eint_mask(d); > > > + mtk_eint_ack(d); > > > +} > > > + > > > static struct irq_chip mtk_eint_irq_chip = { > > > .name = "mt-eint", > > > .irq_disable = mtk_eint_mask, > > > .irq_mask = mtk_eint_mask, > > > .irq_unmask = mtk_eint_unmask, > > > .irq_ack = mtk_eint_ack, > > > + .irq_mask_ack = mtk_eint_mask_ack, > > > .irq_set_type = mtk_eint_set_type, > > > .irq_set_wake = mtk_eint_irq_set_wake, > > > .irq_request_resources = mtk_eint_irq_request_resources, > > > > > > This seems like a small change. > > > thanks. > > > > Does this work? My understanding is that Linux thinks the irq is > > _already_ masked, so it short-circuits in the generic IRQ code and > > doesn't call mask again. > > -Evan > > Yes, you are right. > > The underlying problem is really that the hardware IRQ enabled state is > out of sync with what linux thinks.In resume flow,Linux thinks the irq > is _already_masked, so it short-circuits in the generic IRQ code and > doesn't call mask again.So in step 3 will have a interrupt storm. > > But we implement irqchip's mask_ack_irq so that mask_ack_irq() calls > desc->irq_data.chip->irq_mask_ack instead of mask_irq() that needs to > judge IRQD_IRQ_MASKED. This will correctly set cur_mask[irq] = 0 to > sync with kernel state. Oh, I see. I guess that would work as mask_ack_irq does not check for current masked status before calling irq_chip->irq_mask_ack... But that's only if irq_chip->irq_mask_ack is defined, else it just calls mask_irq which does that check. But if we follow bf22ff45bed ("genirq: Avoid unnecessary low level irq function calls"), this is actually a bug and we should add that same check to mask_ack_irq (and since pinctrl-rockchip.c does not implement irq_mask_ack, it's understandable why the author missed that). So I'm not sure if this is the right change either... > Also, this patch can solve the issue of [1/2] in this patchset[1] which > also is the interrupt mask cannot be set correctly due to hardware irq > state not sync kernel. > > [1] https://patchwork.kernel.org/patch/10921143/ > >