Message-ID: <397f513f-9273-76d1-a0ba-9d1d403020c5@intel.com>
Date: Mon, 24 Oct 2022 08:43:33 -0700
Subject: Re: [PATCH v1 1/2] x86/tsc: use logical_package as a better estimation of socket numbers
From: Dave Hansen
To: Feng Tang, Zhang Rui, Thomas Gleixner, Ingo Molnar, Borislav Petkov, "H. Peter Anvin", Peter Zijlstra, x86@kernel.org
Cc: linux-kernel@vger.kernel.org, tim.c.chen@intel.com, Xiongfeng Wang, liaoyu15@huawei.com
References: <20221021062131.1826810-1-feng.tang@intel.com> <63dca468-c94d-844a-5b19-09c03cf84911@intel.com>

On 10/24/22 00:37, Feng Tang wrote:
>> For instance, I can live with the implementation being a bit goofy when
>> kernel commandlines are in play.  We can pr_info() about those cases.
> Something like adding
>
> pr_info("Watchdog for TSC is disabled for this platform while estimating
> 	the socket number is %d, if the real socket number is bigger than
> 	4 (may due to some tricks like 'maxcpus=' cmdline parameter, please
> 	add 'tsc=watchdog' to cmdline as well\n", logical_packages);

That's too wishy-washy.

Also, I *KNOW* Intel has built systems with wonky, opaque numbers of
"sockets".  Cascade Lake was a single physical "socket", but in all
other respects (including enumeration to software) it acted like two
logical sockets.

So, what was the "real" socket number for Cascade Lake?  If you looked
in a chassis, you'd see one socket.  But, there were two dies in that
socket talking to each other over UPI, so it had a system topology
which was indistinguishable from a 2-socket system.

Let's just state the facts:

	pr_info("Disabling TSC watchdog on %d-package system.", ...)

Then, we can have a flag elsewhere to say how reliable that number is.
A taint flag or CPU bug is probably going too far, but something like
this:

	bool logical_package_count_unreliable = false;

	void mark_bad_package_count(char *reason)
	{
		/* Only warn the first time the count is marked bad. */
		if (logical_package_count_unreliable)
			return;

		logical_package_count_unreliable = true;
		pr_warn("processor package count is unreliable: %s\n", reason);
	}

might be OK.  Then you can call mark_bad_package_count() from multiple
sites, like the maxcpus= code.

But, like I said in the other thread, let's make sure we're agreed on
the precise problem that we're solving before we go down this road.

> and adding a new 'tsc=watchdog' option to force watchdog on (might be
> over-complexed?)

Agreed, I don't think that's quite warranted yet.
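
For anyone who wants to poke at the warn-once flag outside the kernel,
here is a minimal userspace sketch of the idea above.  It is only an
illustration under stated assumptions: fprintf() stands in for
pr_warn(), printf() stands in for pr_info(), and both the
warnings_printed counter and report_tsc_watchdog_disabled() are
hypothetical names that appear nowhere in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Set once any caller decides the logical package count cannot be trusted. */
static bool logical_package_count_unreliable;

/* Hypothetical counter, present only so the warn-once behavior can be observed. */
static int warnings_printed;

/* Mark the package count as unreliable; warn only on the first call. */
static void mark_bad_package_count(const char *reason)
{
	assert(reason != NULL);

	if (logical_package_count_unreliable)
		return;

	logical_package_count_unreliable = true;
	warnings_printed++;
	/* Userspace stand-in for pr_warn(). */
	fprintf(stderr, "processor package count is unreliable: %s\n", reason);
}

/* Hypothetical helper: state the facts, and note when the count is suspect. */
static void report_tsc_watchdog_disabled(int logical_packages)
{
	/* Userspace stand-in for pr_info(). */
	printf("Disabling TSC watchdog on %d-package system%s\n",
	       logical_packages,
	       logical_package_count_unreliable ?
	       " (package count unreliable)" : "");
}
```

A caller such as the maxcpus= handling would call
mark_bad_package_count("maxcpus= given") once it knows the count may be
off.  Later calls from other sites return early without printing, so
the log carries a single warning no matter how many sites trip it.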