Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752414AbcLJTLx (ORCPT ); Sat, 10 Dec 2016 14:11:53 -0500
Received: from Galois.linutronix.de ([146.0.238.70]:36908 "EHLO
	Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1751772AbcLJTLw (ORCPT );
	Sat, 10 Dec 2016 14:11:52 -0500
Date: Sat, 10 Dec 2016 20:09:06 +0100 (CET)
From: Thomas Gleixner
To: Boris Ostrovsky
cc: LKML , x86@kernel.org, Peter Zijlstra , Borislav Petkov ,
	"Charles (Chas) Williams" , "M. Vefa Bicakci" , Alok Kataria ,
	xen-devel , =?ISO-8859-15?Q?Juergen_Gro=DF?=
Subject: Re: [PATCH] x86/smpboot: Make logical package management more robust
In-Reply-To: <730d61ff-ff1e-df80-3446-7fceb25a6d63@oracle.com>
Message-ID:
References: <8aa33de4-db18-759b-d2cb-0e25d5ab9d88@oracle.com> <730d61ff-ff1e-df80-3446-7fceb25a6d63@oracle.com>
User-Agent: Alpine 2.20 (DEB 67 2015-01-07)
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1809
Lines: 49

On Fri, 9 Dec 2016, Boris Ostrovsky wrote:
> On 12/09/2016 06:02 PM, Boris Ostrovsky wrote:
> > On 12/09/2016 05:06 PM, Thomas Gleixner wrote:
> > > On Thu, 8 Dec 2016, Thomas Gleixner wrote:
> > >
> > > Boris, can you please verify if that makes the
> > > topology_update_package_map() call which you placed into the Xen cpu
> > > starting code obsolete ?
> >
> > Will do. I did test your patch but without removing
> > topology_update_package_map() call. It complained about package IDs
> > being wrong, but that's expected until I fix Xen part.
>
> Ignore my statement about earlier testing --- it was all on single-node
> machines.
>
> Something is broken with multi-node on Intel, but failure modes are different.
> Prior to this patch build_sched_domain() reports an error and pretty soon we
> crash in scheduler (don't remember off the top of my head). With patch applied
> I crash much later, when one of the drivers does kmalloc_node(..,
> cpu_to_node(cpu)) and cpu_to_node() returns 1, which should never happen
> ("x86: Booted up 1 node, 32 CPUs" is reported, for example).

Hmm. But the cpu_to_node() association is unrelated to the logical package
management.

> 2-node AMD box doesn't have these problems.
>
> I haven't upgraded the Intel machine for about a month but this all must have
> happened in 4.9 timeframe.
>
> So I can't answer your question since we clearly have other problems on Xen. I
> will be looking into this.

Fair enough. What you could do though with this patch applied and the extra
XEN call to topology_update_package_map() removed is to watch out for the
following messages:

    pr_info("Max logical packages: %u\n", __max_logical_packages);

and

    pr_warn("CPU %u Converting physical %u to logical package %u\n", ...)

Ideally the latter won't show.

Thanks,

	tglx
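
P.S. If grepping the boot log by hand gets tedious, something along these
lines might help. It is only a rough sketch, not part of the patch: it scans
log lines for the two messages quoted above. The function name scan_log()
and the script name in the usage note are made up for illustration, and
since the exact prefix the kernel prints in front of these messages is an
assumption, the patterns match anywhere in the line.

```python
import re
import sys

# Patterns mirror the two format strings quoted in the mail. Whatever
# timestamp or subsystem prefix precedes them in the log is unknown here
# (assumption), so we search within the line instead of anchoring.
MAX_PKG_RE = re.compile(r"Max logical packages: (\d+)")
CONVERT_RE = re.compile(
    r"CPU (\d+) Converting physical (\d+) to logical package (\d+)"
)


def scan_log(lines):
    """Return (max_logical_packages or None, list of conversions).

    Each conversion is a (cpu, physical_package, logical_package) tuple.
    If the "Max logical packages" message appears more than once, the
    last occurrence wins.
    """
    max_packages = None
    conversions = []
    for line in lines:
        match = MAX_PKG_RE.search(line)
        if match:
            max_packages = int(match.group(1))
            continue
        match = CONVERT_RE.search(line)
        if match:
            conversions.append(tuple(int(g) for g in match.groups()))
    return max_packages, conversions


if __name__ == "__main__":
    max_packages, conversions = scan_log(sys.stdin)
    print("Max logical packages:", max_packages)
    if conversions:
        # The mail says this message ideally should not show up at all.
        for cpu, phys, logical in conversions:
            print(f"CPU {cpu}: physical {phys} -> logical package {logical}")
    else:
        print("No physical-to-logical conversion messages found.")
```

Possible usage, assuming the sketch is saved as check_pkg_map.py:

    dmesg | python3 check_pkg_map.py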