Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755814AbcJNM3E (ORCPT ); Fri, 14 Oct 2016 08:29:04 -0400 Received: from mail.linuxfoundation.org ([140.211.169.12]:40947 "EHLO mail.linuxfoundation.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755408AbcJNM2C (ORCPT ); Fri, 14 Oct 2016 08:28:02 -0400 From: Greg Kroah-Hartman To: linux-kernel@vger.kernel.org Cc: Greg Kroah-Hartman , stable@vger.kernel.org, Mika Westerberg , Thomas Gleixner Subject: [PATCH 4.8 20/37] x86/irq: Prevent force migration of irqs which are not in the vector domain Date: Fri, 14 Oct 2016 14:27:06 +0200 Message-Id: <20161014122552.681579448@linuxfoundation.org> X-Mailer: git-send-email 2.10.0 In-Reply-To: <20161014122549.411962735@linuxfoundation.org> References: <20161014122549.411962735@linuxfoundation.org> User-Agent: quilt/0.64 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3056 Lines: 84 4.8-stable review patch. If anyone has any objections, please let me know. ------------------ From: Mika Westerberg commit db91aa793ff984ac048e199ea1c54202543952fe upstream. When a CPU is about to be offlined we call fixup_irqs() that resets IRQ affinities related to the CPU in question. The same thing is also done when the system is suspended to S-states like S3 (mem). For each IRQ we try to complete any on-going move regardless whether the IRQ is actually part of x86_vector_domain. For each IRQ descriptor we fetch its chip_data, assume it is of type struct apic_chip_data and manipulate it by clearing old_domain mask etc. For irq_chips that are not part of the x86_vector_domain, like those created by various GPIO drivers, will find their chip_data being changed unexpectly. Below is an example where GPIO chip owned by pinctrl-sunrisepoint.c gets corrupted after resume: # cat /sys/kernel/debug/gpio gpiochip0: GPIOs 360-511, parent: platform/INT344B:00, INT344B:00: gpio-511 ( |sysfs ) in hi # rtcwake -s10 -mmem <10 seconds passes> # cat /sys/kernel/debug/gpio gpiochip0: GPIOs 360-511, parent: platform/INT344B:00, INT344B:00: gpio-511 ( |sysfs ) in ? Note '?' in the output. It means the struct gpio_chip ->get function is NULL whereas before suspend it was there. Fix this by first checking that the IRQ belongs to x86_vector_domain before we try to use the chip_data as struct apic_chip_data. Reported-and-tested-by: Sakari Ailus Signed-off-by: Mika Westerberg Link: http://lkml.kernel.org/r/20161003101708.34795-1-mika.westerberg@linux.intel.com Signed-off-by: Thomas Gleixner Signed-off-by: Greg Kroah-Hartman --- arch/x86/kernel/apic/vector.c | 23 ++++++++++++++++++++--- 1 file changed, 20 insertions(+), 3 deletions(-) --- a/arch/x86/kernel/apic/vector.c +++ b/arch/x86/kernel/apic/vector.c @@ -661,11 +661,28 @@ void irq_complete_move(struct irq_cfg *c */ void irq_force_complete_move(struct irq_desc *desc) { - struct irq_data *irqdata = irq_desc_get_irq_data(desc); - struct apic_chip_data *data = apic_chip_data(irqdata); - struct irq_cfg *cfg = data ? &data->cfg : NULL; + struct irq_data *irqdata; + struct apic_chip_data *data; + struct irq_cfg *cfg; unsigned int cpu; + /* + * The function is called for all descriptors regardless of which + * irqdomain they belong to. For example if an IRQ is provided by + * an irq_chip as part of a GPIO driver, the chip data for that + * descriptor is specific to the irq_chip in question. + * + * Check first that the chip_data is what we expect + * (apic_chip_data) before touching it any further. + */ + irqdata = irq_domain_get_irq_data(x86_vector_domain, + irq_desc_get_irq(desc)); + if (!irqdata) + return; + + data = apic_chip_data(irqdata); + cfg = data ? &data->cfg : NULL; + if (!cfg) return;