Date: Mon, 4 Dec 2017 11:48:30 -0800
From: Jakub Kicinski
To: Prarit Bhargava
Cc: linux-kernel@vger.kernel.org, "netdev@vger.kernel.org", Thomas Gleixner, Clark Williams
Subject: Re: [bisected] x86 boot still broken on -rc2
Message-ID: <20171204114830.0cbaf0ad@cakuba.netronome.com>
In-Reply-To: <20171204164521.17870-1-prarit@redhat.com>
References: <20171204164521.17870-1-prarit@redhat.com>

On Mon, 4 Dec 2017 11:45:21 -0500, Prarit Bhargava wrote:
> On 12/04/2017 08:13 AM, Prarit Bhargava wrote:
> > x86: Booting SMP configuration:
> > .... node #0, CPUs: #1 #2 #3 #4
> > .... node #1, CPUs: #5 #6 #7 #8 #9
> > .... node #0, CPUs: #10 #11 #12 #13 #14
> > .... node #1, CPUs: #15 #16 #17 #18 #19
> > smp: Brought up 2 nodes, 20 CPUs
> > smpboot: Max logical packages: 1
> >
> > which means that the calculation of logical packages is wrong because
> >
> > ncpus = cpu_data(0).booted_cores * smp_num_siblings;
> > ncpus = 10 * 2;
> > ncpus = 20;
> >
> > smp_num_siblings is defined as "The number of threads in a core" which
> > should be 1 if HT/SMT is disabled.
> >
> > It looks like my patch has exposed a bug in the
> > smp_num_siblings calculation. I'm still debugging ...
>
> The bug is that smp_num_siblings has been incorrectly calculated as the
> *maximum* number of threads in a core, and not the actual number of threads in
> a core on systems which have a CPUID level greater than 0xb. (see
> arch/x86/kernel/cpu/topology.c:59)
>
> That will take some time to investigate and come up with a proper solution and
> fix. In the meantime, the patch below will fix the problem in the short-term.
> I've tested the patch using SMT enabled, SMT disabled, maxcpus=1 and nr_cpus=1.

Thanks Prarit, the work around does the job!

Indeed, I have SMT disabled.
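
For reference, the arithmetic quoted above can be reproduced with a small
standalone sketch. This is not the kernel code: the helper names
cpus_per_package() and max_logical_packages() are hypothetical, and the
sketch assumes the package count is the total CPU count divided by the CPUs
per package, rounded up.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a kernel function): CPUs in one package,
 * mirroring the quoted expression
 *     ncpus = cpu_data(0).booted_cores * smp_num_siblings;
 */
unsigned int cpus_per_package(unsigned int booted_cores,
                              unsigned int threads_per_core)
{
	return booted_cores * threads_per_core;
}

/*
 * Hypothetical helper: number of logical packages, assuming it is
 * total_cpus / ncpus rounded up (an assumption of this sketch).
 */
unsigned int max_logical_packages(unsigned int total_cpus,
                                  unsigned int booted_cores,
                                  unsigned int threads_per_core)
{
	unsigned int ncpus = cpus_per_package(booted_cores, threads_per_core);

	/* A zero divisor would mean the topology inputs are invalid. */
	assert(ncpus > 0);

	return (total_cpus + ncpus - 1) / ncpus;
}
```

With the values from the log (20 CPUs, 10 booted cores), a
threads_per_core of 2 yields 1 package, which is the value reported by
smpboot. A threads_per_core of 1, as expected with SMT disabled, yields 2.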