From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Masayoshi Mizuma, Thomas Gleixner, Linus Torvalds, Peter Zijlstra, yasu.isimatu@gmail.com, Ingo Molnar
Subject: [PATCH 4.15 098/163] x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a physical CPU
Date: Wed, 21 Feb 2018 13:48:47 +0100
Message-Id: <20180221124535.713836865@linuxfoundation.org>
In-Reply-To: <20180221124529.931834518@linuxfoundation.org>
References: <20180221124529.931834518@linuxfoundation.org>

4.15-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Masayoshi Mizuma

commit 295cc7eb314eb3321fb6d67ca6f7305f5c50d10f upstream.

When a physical CPU is hot-removed, the following warning messages
are shown while the uncore device is removed in uncore_pci_remove():

  WARNING: CPU: 120 PID: 5 at arch/x86/events/intel/uncore.c:988
  uncore_pci_remove+0xf1/0x110
  ...
  CPU: 120 PID: 5 Comm: kworker/u1024:0 Not tainted 4.15.0-rc8 #1
  Workqueue: kacpi_hotplug acpi_hotplug_work_fn
  ...
  Call Trace:
  pci_device_remove+0x36/0xb0
  device_release_driver_internal+0x145/0x210
  pci_stop_bus_device+0x76/0xa0
  pci_stop_root_bus+0x44/0x60
  acpi_pci_root_remove+0x1f/0x80
  acpi_bus_trim+0x54/0x90
  acpi_bus_trim+0x2e/0x90
  acpi_device_hotplug+0x2bc/0x4b0
  acpi_hotplug_work_fn+0x1a/0x30
  process_one_work+0x141/0x340
  worker_thread+0x47/0x3e0
  kthread+0xf5/0x130

When uncore_pci_remove() runs, it tries to get the package ID to
clear the value of uncore_extra_pci_dev[].dev[] by using
topology_phys_to_logical_pkg(). The warning messages are
shown because topology_phys_to_logical_pkg() returns -1.

  arch/x86/events/intel/uncore.c:
  static void uncore_pci_remove(struct pci_dev *pdev)
  {
  ...
          phys_id = uncore_pcibus_to_physid(pdev->bus);
  ...
                  pkg = topology_phys_to_logical_pkg(phys_id); // returns -1
                  for (i = 0; i < UNCORE_EXTRA_PCI_DEV_MAX; i++) {
                          if (uncore_extra_pci_dev[pkg].dev[i] == pdev) {
                                  uncore_extra_pci_dev[pkg].dev[i] = NULL;
                                  break;
                          }
                  }
                  WARN_ON_ONCE(i >= UNCORE_EXTRA_PCI_DEV_MAX); // <=========== HERE!!

topology_phys_to_logical_pkg() tries to find the
cpuinfo_x86->phys_proc_id that matches the phys_pkg argument.

  arch/x86/kernel/smpboot.c:
  int topology_phys_to_logical_pkg(unsigned int phys_pkg)
  {
          int cpu;

          for_each_possible_cpu(cpu) {
                  struct cpuinfo_x86 *c = &cpu_data(cpu);

                  if (c->initialized && c->phys_proc_id == phys_pkg)
                          return c->logical_proc_id;
          }
          return -1;
  }

However, the phys_proc_id was already set to 0 by remove_siblinginfo()
when the CPU was offlined.
So, topology_phys_to_logical_pkg() cannot find the correct
logical_proc_id and always returns -1.

As a result, uncore_pci_remove() calls WARN_ON_ONCE() and the warning
messages are shown.

What is worse is that the bogus 'pkg' index results in two bugs:

 - We dereference uncore_extra_pci_dev[] with a negative index
 - We fail to clean up a stale pointer in uncore_extra_pci_dev[][]

To fix these bugs, remove the clearing of ->phys_proc_id from remove_siblinginfo().

This should not cause any problems, because ->phys_proc_id is not
used after the CPU is hot-removed and it is set again when the CPU is hot-added.

Signed-off-by: Masayoshi Mizuma
Acked-by: Thomas Gleixner
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: yasu.isimatu@gmail.com
Cc:
Fixes: 30bb9811856f ("x86/topology: Avoid wasting 128k for package id array")
Link: http://lkml.kernel.org/r/ed738d54-0f01-b38b-b794-c31dc118c207@gmail.com
Signed-off-by: Ingo Molnar
Signed-off-by: Greg Kroah-Hartman

---
 arch/x86/kernel/smpboot.c |    1 -
 1 file changed, 1 deletion(-)

--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1431,7 +1431,6 @@ static void remove_siblinginfo(int cpu)
 	cpumask_clear(cpu_llc_shared_mask(cpu));
 	cpumask_clear(topology_sibling_cpumask(cpu));
 	cpumask_clear(topology_core_cpumask(cpu));
-	c->phys_proc_id = 0;
 	c->cpu_core_id = 0;
 	cpumask_clear_cpu(cpu, cpu_sibling_setup_mask);
 	recompute_smt_state();