Subject: Re: [PATCH] x86/smpboot: Make logical package management more robust
From: Boris Ostrovsky
To: Thomas Gleixner
Cc: LKML, x86@kernel.org, Peter Zijlstra, Borislav Petkov,
    "Charles (Chas) Williams", "M. Vefa Bicakci", Alok Kataria,
    xen-devel, Juergen Groß
Date: Sat, 10 Dec 2016 22:23:45 -0500
Message-ID: <60e7a807-27fb-f666-270a-9512804deae8@oracle.com>
References: <8aa33de4-db18-759b-d2cb-0e25d5ab9d88@oracle.com>
    <730d61ff-ff1e-df80-3446-7fceb25a6d63@oracle.com>

On 12/10/2016 02:13 PM, Thomas Gleixner wrote:
> On Sat, 10 Dec 2016, Thomas Gleixner wrote:
>> On Fri, 9 Dec 2016, Boris Ostrovsky wrote:
>>> On 12/09/2016 06:02 PM, Boris Ostrovsky wrote:
>>>> On 12/09/2016 05:06 PM, Thomas Gleixner wrote:
>>>>> On Thu, 8 Dec 2016, Thomas Gleixner wrote:
>>>>>
>>>>> Boris, can you please verify if that makes the
>>>>> topology_update_package_map() call which you placed into the Xen cpu
>>>>> starting code obsolete ?
>>>>
>>>> Will do. I did test your patch but without removing
>>>> topology_update_package_map() call. It complained about package IDs
>>>> being wrong, but that's expected until I fix Xen part.
>>>
>>> Ignore my statement about earlier testing --- it was all on single-node
>>> machines.
>>>
>>> Something is broken with multi-node on Intel, but failure modes are different.
>>> Prior to this patch build_sched_domain() reports an error and pretty soon we
>>> crash in scheduler (don't remember off the top of my head). With patch applied
>>> I crash much later, when one of the drivers does kmalloc_node(..,
>>> cpu_to_node(cpu)) and cpu_to_node() returns 1, which should never happen
>>> ("x86: Booted up 1 node, 32 CPUs" is reported, for example).
>>
>> Hmm. But the cpu_to_node() association is unrelated to the logical package
>> management.
>
> Just came to my mind after hitting send. We had the whole persistent cpuid
> to nodeid association work merged in 4.9. So that might be related.

Yes, that's exactly the reason. It uses _PXM to set nodeID and _PXM is
exposed to dom0 (which is a privileged PV guest).

Re: your previous message: after I "fix" the problem above, I see

    pr_info("Max logical packages: %u\n", __max_logical_packages);

but no

    pr_warn("CPU %u Converting physical %u to logical package %u\n", ...)

with or without topology_update_package_map() in
arch/x86/xen/smp.c:cpu_bringup()

-boris
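
Illustration (not part of the original message): the failure mode above, a
CPU mapped to node 1 while only node 0 was booted, can be modelled in user
space. This is only a sketch under stated assumptions. Every name in it
(online_nodes, cpu_node, model_cpu_to_node, checked_cpu_to_node, NO_NODE) is
hypothetical and is not a kernel API, and the table contents are made up to
mimic a _PXM-derived mapping that disagrees with the set of booted nodes.

```c
#include <assert.h>

#define NR_CPUS   4
#define MAX_NODES 2
#define NO_NODE   (-1)   /* hypothetical "no usable node" marker */

/* Hypothetical model: bit n set means node n came online at boot.
 * Only node 0 is set, mirroring "Booted up 1 node". */
static unsigned int online_nodes = 1u << 0;

/* Hypothetical per-CPU node table. CPUs 2 and 3 point at node 1,
 * which never came online, as in the crash described above. */
static int cpu_node[NR_CPUS] = { 0, 0, 1, 1 };

/* Stand-in for the raw lookup: returns whatever the table says. */
static int model_cpu_to_node(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return cpu_node[cpu];
}

/* True only for a node index that is in range and marked online. */
static int node_is_online(int node)
{
    return node >= 0 && node < MAX_NODES &&
           (online_nodes & (1u << node)) != 0;
}

/* Return the CPU's node if it is online, otherwise NO_NODE so the
 * caller can fall back to an allocation with no node preference. */
static int checked_cpu_to_node(int cpu)
{
    int node = model_cpu_to_node(cpu);

    return node_is_online(node) ? node : NO_NODE;
}
```

With this table, checked_cpu_to_node(0) yields 0, while
checked_cpu_to_node(2) yields NO_NODE because the raw lookup returns the
offline node 1. The sketch only shows why the stale mapping is dangerous
for a node-targeted allocation; it does not claim to be the fix used in
the kernel.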