Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752693AbaKAWla (ORCPT ); Sat, 1 Nov 2014 18:41:30 -0400 Received: from shadbolt.e.decadent.org.uk ([88.96.1.126]:60313 "EHLO shadbolt.e.decadent.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751786AbaKAWjd (ORCPT ); Sat, 1 Nov 2014 18:39:33 -0400 Content-Type: text/plain; charset="UTF-8" Content-Disposition: inline Content-Transfer-Encoding: 8bit MIME-Version: 1.0 From: Ben Hutchings To: linux-kernel@vger.kernel.org, stable@vger.kernel.org CC: akpm@linux-foundation.org, "Yasuaki Ishimatsu" , "Ingo Molnar" , "Prarit Bhargava" , "Toshi Kani" , "Steven Rostedt" , "Linn Crosetto" , "Peter Zijlstra" , "David Rientjes" , "Borislav Petkov" , "Wanpeng Li" Date: Sat, 01 Nov 2014 22:28:03 +0000 Message-ID: X-Mailer: LinuxStableQueue (scripts by bwh) Subject: [PATCH 3.2 069/102] sched: Fix unreleased llc_shared_mask bit during CPU hotplug In-Reply-To: X-SA-Exim-Connect-IP: 192.168.4.249 X-SA-Exim-Mail-From: ben@decadent.org.uk X-SA-Exim-Scanned: No (on shadbolt.decadent.org.uk); SAEximRunCond expanded to false Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org 3.2.64-rc1 review patch. If anyone has any objections, please let me know. ------------------ From: Wanpeng Li commit 03bd4e1f7265548832a76e7919a81f3137c44fd1 upstream. The following bug can be triggered by hot adding and removing a large number of xen domain0's vcpus repeatedly: BUG: unable to handle kernel NULL pointer dereference at 0000000000000004 IP: [..] find_busiest_group PGD 5a9d5067 PUD 13067 PMD 0 Oops: 0000 [#3] SMP [...] Call Trace: load_balance ? _raw_spin_unlock_irqrestore idle_balance __schedule schedule schedule_timeout ? lock_timer_base schedule_timeout_uninterruptible msleep lock_device_hotplug_sysfs online_store dev_attr_store sysfs_write_file vfs_write SyS_write system_call_fastpath Last level cache shared mask is built during CPU up and the build_sched_domain() routine takes advantage of it to setup the sched domain CPU topology. However, llc_shared_mask is not released during CPU disable, which leads to an invalid sched domainCPU topology. This patch fix it by releasing the llc_shared_mask correctly during CPU disable. Yasuaki also reported that this can happen on real hardware: https://lkml.org/lkml/2014/7/22/1018 His case is here: == Here is an example on my system. My system has 4 sockets and each socket has 15 cores and HT is enabled. In this case, each core of sockes is numbered as follows: | CPU# Socket#0 | 0-14 , 60-74 Socket#1 | 15-29, 75-89 Socket#2 | 30-44, 90-104 Socket#3 | 45-59, 105-119 Then llc_shared_mask of CPU#30 has 0x3fff80000001fffc0000000. It means that last level cache of Socket#2 is shared with CPU#30-44 and 90-104. When hot-removing socket#2 and #3, each core of sockets is numbered as follows: | CPU# Socket#0 | 0-14 , 60-74 Socket#1 | 15-29, 75-89 But llc_shared_mask is not cleared. So llc_shared_mask of CPU#30 remains having 0x3fff80000001fffc0000000. After that, when hot-adding socket#2 and #3, each core of sockets is numbered as follows: | CPU# Socket#0 | 0-14 , 60-74 Socket#1 | 15-29, 75-89 Socket#2 | 30-59 Socket#3 | 90-119 Then llc_shared_mask of CPU#30 becomes 0x3fff8000fffffffc0000000. It means that last level cache of Socket#2 is shared with CPU#30-59 and 90-104. So the mask has the wrong value. Signed-off-by: Wanpeng Li Tested-by: Linn Crosetto Reviewed-by: Borislav Petkov Reviewed-by: Toshi Kani Reviewed-by: Yasuaki Ishimatsu Cc: David Rientjes Cc: Prarit Bhargava Cc: Steven Rostedt Cc: Peter Zijlstra Link: http://lkml.kernel.org/r/1411547885-48165-1-git-send-email-wanpeng.li@linux.intel.com Signed-off-by: Ingo Molnar Signed-off-by: Ben Hutchings --- arch/x86/kernel/smpboot.c | 3 +++ 1 file changed, 3 insertions(+) --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -1252,6 +1252,9 @@ static void remove_siblinginfo(int cpu) for_each_cpu(sibling, cpu_sibling_mask(cpu)) cpumask_clear_cpu(cpu, cpu_sibling_mask(sibling)); + for_each_cpu(sibling, cpu_llc_shared_mask(cpu)) + cpumask_clear_cpu(cpu, cpu_llc_shared_mask(sibling)); + cpumask_clear(cpu_llc_shared_mask(cpu)); cpumask_clear(cpu_sibling_mask(cpu)); cpumask_clear(cpu_core_mask(cpu)); c->phys_proc_id = 0; -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/