Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754043AbdLNTDc (ORCPT ); Thu, 14 Dec 2017 14:03:32 -0500 Received: from mail-io0-f173.google.com ([209.85.223.173]:45422 "EHLO mail-io0-f173.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753703AbdLNTDa (ORCPT ); Thu, 14 Dec 2017 14:03:30 -0500 X-Google-Smtp-Source: ACJfBot8MndhYaacw9ICeYZ2RAurgldv2U7cEWsUQc9qQKpHpRYWKkx53wv8/ofaQcOInQu9Kcf8BRzmjy/ulVSJQno= MIME-Version: 1.0 In-Reply-To: References: <168050887.sZlTFXWCmO@aspire.rjw.lan> <20171206121452.GA6320@dhcp22.suse.cz> <0f1d3d63-fa10-5cef-8014-81753dc60243@mblankhorst.nl> <57c8679e-1b88-c9ad-2299-2bea7560b28f@mblankhorst.nl> <20171213162336.GG53955@bhelgaas-glaptop.roam.corp.google.com> From: Linus Torvalds Date: Thu, 14 Dec 2017 11:03:27 -0800 X-Google-Sender-Auth: ufJ8b0MGu8_P82CkBf52Cw4OMiQ Message-ID: Subject: Re: Linux 4.15-rc2: Regression in resume from ACPI S3 To: Thomas Gleixner Cc: Bjorn Helgaas , Maarten Lankhorst , Michal Hocko , "Rafael J. Wysocki" , Andy Lutomirski , Linux Kernel Mailing List , "the arch/x86 maintainers" , Daniel Vetter , Bjorn Helgaas , "Rafael J. Wysocki" , linux-pci@vger.kernel.org, linux-pm@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3324 Lines: 82 On Thu, Dec 14, 2017 at 3:54 AM, Thomas Gleixner wrote: > > So the old scheme silently consumed the spurious vector. I added debug code > to that effect to 4.14 and on that machine IRQ7 is triggered at the same > point post resume and the core code drops it silently because the interrupt > is marked masked and no action assigned. > > So the only difference to today is that the new code complains, while the > old one does an extra mask of the already masked IOAPIC pin and silently > returns. Great debugging, and it looks like Rafael has a patch that already got positive testing. I just wanted to pipe up about that "irq7", because judging from your email it seems like you think it's a real irq: > Now there is a race > whether the kernel resume path manages to mask the PIC again early enough > before something triggers IRQ7 or not. ..and that's not how the PIC works. In fact, "legacy irq 7" is the _normal_ and very traditional spurious interrupt, and it's documented. If the PIC gets an interrupt from _any_ source, but the interrupt goes away before the PIC gets an acknowledge from the CPU (and by "acknowledge", I'm not talking about the explicit software IRQ ACK, I'm talking about the hardware protocol, between the PIC and the CPU), the PIC will then report irq 7 as the interrupt - regardless of what the original was. The reason is almost always something like - CPU interrupts are disabled or masked - driver does a write to the external hardware that causes an interrupt to be raised - CPU doesn't react to the irq due to the disabled/masked nature - but the driver then does something that masks the interrupt again - interrupts are enabled/unmasked on the CPU - CPU now acks the interrupt, but the PIC no longer sees any interrupt source, so the PIC (that has to reply with *something*) replies with that documented spurious irq7. To confuse things further, irq7 is not _exclusively_ the spurious interrupt, You can definitely put real hardware and connect it to pin7 of the PIC, and get real irq7 reports. And to confuse things even *more*, this "irq7" thing is per-PIC, and the PC model obviously has the whole "nested PIC" thing where the second PIC is connected to irq2 of the first PIC. So there are *two* different "spurious interrupt" reports, one for each PIC. Anyway, to avoid this issue, drivers should strive to (a) actually take the interrupt when doing things that can cause them, and have the interrupt handler do whatever it is that causes the interrupt to go away (ie: "normal operation") (b) if you play games with clearing the source of the interrupt _without_ taking the interrupt, you should strive to basically mask the interrupt first. So to do (b) you can do something like mask_device_interrupt(dev); read_from_device_to_synchronize(dev); instead of (or perhaps _before_) disabling interrupts at a CPU level. Suspend/resume obviously does tend to play games with these kinds of things where you are no longer in "normal operation" and you do setup without having interrupts actually enabled. Or you can just decide that spurious interrupts are ok, and ignore the issue. But they *can* be very confusing, and obviously in this case that confusion then seems to have caused actual problems. Linus