Received: by 2002:a05:7412:31a9:b0:e2:908c:2ebd with SMTP id et41csp2826145rdb; Tue, 12 Sep 2023 13:22:10 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHdlAb0ZRWssCkHnLJgmCzDZ7aOBy+1fgON8zHYLNUhSKSaE7cG4bB2PqbOejDBwtPChlyU X-Received: by 2002:a05:6a21:3e05:b0:154:d3ac:2063 with SMTP id bk5-20020a056a213e0500b00154d3ac2063mr429707pzc.27.1694550130633; Tue, 12 Sep 2023 13:22:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1694550130; cv=none; d=google.com; s=arc-20160816; b=yGX2M+lT2ppnjHOGSoNBHbbPSfMFHYvT5tt8vRg6pYOxQBQV3mduZm1wcxhoTHJBsp tztBqZFNGxGv3hef/zY5f0QkMCCly4M2cpi2/CqQDbWkUjesMEOQFzS10A5Y/u/eDzfl U6G+QWZLRm8ynb2Ao96+Z5e1dJNVaotGEIaq1Lt+JmdYqfZBW6bHYa0XBG4YRt8S6DzU O9+xMO6KyPNxupv7vRTxaG/7ZMfFUYzb6tXVe4iPL/gzyKG2ev7EuiL6IAqUDxNJhhaB q1qWTkcNaa/t0/HK+0WtyrOpT2iPCsTwMUfwN1VNMMcf+X9coIUek7yCMNAYv05+druc P3iA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=kSp0ERoHwZAlnt8oa6Zyb0pjvDMev+xaSgGMnE1joAc=; fh=hBumMKgdp07rd33RvRRiOkGat7EEsmGW7/N2mfIMmX8=; b=dKYD5N6yhvuIJAkZ9SJtvPzswFvIvg/sR2sRp8k2H34otZtfA4ewXC6u8pQBP639GX w/7MD8qfti1uIoxEc4v5/DFruP944bRPIbMM+ZatQnJWUZjg79f2SVwj6ngLgfqqyFv9 78+kspd/EcUNXfThHs4lv8epK0QX7fekXXmAwdSHHJodGPnNEUIBLejD4Ra5Tz8R0I79 kCj9Gh/bMjsQczjYRQz7Qa4AglwCgHMWXzERk2yM44MzUJhgO+VSE99yErWKjCcXtHzb HScugNrNmziCJyY+TQDlFzbnVT/NAelrXpXd7vrk7uo0LinYW+/GiXNDM/45uR2uyaLs p81g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=IxnW4xBo; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.33 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from lipwig.vger.email (lipwig.vger.email. [23.128.96.33]) by mx.google.com with ESMTPS id t185-20020a6381c2000000b00565f617a486si8363308pgd.212.2023.09.12.13.22.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 12 Sep 2023 13:22:10 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.33 as permitted sender) client-ip=23.128.96.33; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=IxnW4xBo; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.33 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by lipwig.vger.email (Postfix) with ESMTP id C3FD5803797B; Tue, 12 Sep 2023 10:46:15 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at lipwig.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236828AbjILRpm (ORCPT + 99 others); Tue, 12 Sep 2023 13:45:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36540 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233429AbjILRpl (ORCPT ); Tue, 12 Sep 2023 13:45:41 -0400 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E550310C9; Tue, 12 Sep 2023 10:45:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1694540737; x=1726076737; h=date:from:to:cc:subject:message-id:references: mime-version:in-reply-to; bh=CG1SbLvb2i8X0vM67LTH8RnhL6e9+zkf2JDJ8mw6STw=; b=IxnW4xBouY9ecHCXD6+E6beRY3nLH13iUQXewMzIEv0wFPIKiWYjY4TB W511j20qiOS0k4H1YwTh8OaNfLK8eSgV39aMrArO2jTPpDkrqxJ4wGVdg RWlVBnUq04CuQ7/UGo6NCp5b4dfdewLYhUAjJ4HGuHcG7s2Pd7DYlVUkk RbM63Uj3VzSsRyU48GwHlCggB6I7WWHGYb+ewf9x5uEka+Ns6cX0h54Hk OubS7adfGSqykU1aaCq9MTqDlbua5mEiKA+YGPuydeEmoLBthsXSzN/de vQmAdK1ocigY3Aq7PNvi12Wg5esuYLL92mAcd+nizN1FRSjaZTgS7WQO8 g==; X-IronPort-AV: E=McAfee;i="6600,9927,10831"; a="377363183" X-IronPort-AV: E=Sophos;i="6.02,139,1688454000"; d="scan'208";a="377363183" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2023 10:45:14 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10831"; a="809364292" X-IronPort-AV: E=Sophos;i="6.02,139,1688454000"; d="scan'208";a="809364292" Received: from agluck-desk3.sc.intel.com (HELO agluck-desk3) ([172.25.222.74]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 12 Sep 2023 10:45:14 -0700 Date: Tue, 12 Sep 2023 10:45:13 -0700 From: Tony Luck To: Reinette Chatre Cc: Fenghua Yu , Peter Newman , Jonathan Corbet , Shuah Khan , x86@kernel.org, Shaopeng Tan , James Morse , Jamie Iles , Babu Moger , Randy Dunlap , linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, patches@lists.linux.dev Subject: Re: [PATCH v5 0/8] Add support for Sub-NUMA cluster (SNC) systems Message-ID: References: <20230722190740.326190-1-tony.luck@intel.com> <20230829234426.64421-1-tony.luck@intel.com> <45e81506-3c43-95ba-56a4-38e1bb2e42dc@intel.com> <0a59dce4-59f3-3d60-e06a-ef3e84e8c64c@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <0a59dce4-59f3-3d60-e06a-ef3e84e8c64c@intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (lipwig.vger.email [0.0.0.0]); Tue, 12 Sep 2023 10:46:15 -0700 (PDT) X-Spam-Status: No, score=-0.9 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lipwig.vger.email On Tue, Sep 12, 2023 at 10:13:31AM -0700, Reinette Chatre wrote: > Hi Tony, > > On 9/12/2023 9:01 AM, Tony Luck wrote: > > On Mon, Sep 11, 2023 at 01:23:35PM -0700, Reinette Chatre wrote: > >> Hi Tony, > >> > >> On 8/29/2023 4:44 PM, Tony Luck wrote: > >>> The Sub-NUMA cluster feature on some Intel processors partitions > >>> the CPUs that share an L3 cache into two or more sets. This plays > >>> havoc with the Resource Director Technology (RDT) monitoring features. > >>> Prior to this patch Intel has advised that SNC and RDT are incompatible. > >>> > >>> Some of these CPU support an MSR that can partition the RMID > >>> counters in the same way. This allows for monitoring features > >>> to be used (with the caveat that memory accesses between different > >>> SNC NUMA nodes may still not be counted accuratlely. > >> > >> Same typo as in V4. > > > > Sorry. Will fix and re-post. > > > >>> > >>> Note that this patch series improves resctrl reporting considerably > >>> on systems with SNC enabled, but there will still be some anomalies > >>> for processes accessing memory from other sub-NUMA nodes. > >> > >> I have the same question as with V4 that was not answered in that email > >> thread nor in this new version. > >> https://lore.kernel.org/lkml/e350514e-76ed-14ea-3e74-c0852658182f@intel.com/ > > > > Non-SNC systems already have an issue when reporting memory bandwidth > > for a task that Linux may migrate the task to a CPU on a different node > > which means that logging for that task will also move to different files > > in the mon_data/mon_L3_*/ for the new node. > > It is not obvious to me that this is an issue. From what I understand > the data remains accurate. > > How does this map to the earlier "may still not be counted > accurately"? Yes, the data is accurate. But a naive application reading the wrong files from mon_data will not see the accurate data. Without SNC users may only see this issue rarely as Linux tries hard to not migrate processes to other NUMA nodes or L3 cache domains. But with SNC enabled this is no longer the case. the ACPI SLIT distance of 0xC is below the threshold that Linux checks for "is the target CPU for a migration far away" so migration to other SNC nodes may be quite common. I can move this out of the cover letter and provide guidance/warnings in the patch to Documentation/arch/x86/resctrl.rst > > > > > With SNC enabled, migration between NUMA nodes on the same socket may happen > > much more frequently because: > > 1) The CPUs on the other NUMA nodes in the socket are in the same Linux > > L3 cache domain. So Linux regard the migration as "cheap". > > 2) The ACPI SLIT table on SNC enabled systems may also report the > > latency for remote access to another NUMA node on the same socket > > as significantly lower than the latency for cross-socket access. On > > my test system the SLIT distance for same socket nodes is 0xC, > > compared to 0x15 for cross-socket distance. This will also lead > > to Linux being more likely to migrate a task to a CPU on another > > SNC NUMA node in the same socket. > > > > To avoid migration issues, users may use sched_setaffinity(2) to bind > > tasks to the subset of CPUs that share an SNC NUMA node. > > > > I can write this up in a new cover letter. > > > >> I stop my review of this series here. > > > > Reinette > > > > Should I repost the whole series as v6 with the new cover letter. The > > only change to the patches so far is to the selftest reported by > > Shaopeng Tan[1]. > > > > Is this an assurance that the cover letter in no way reflects how > feedback was addressed in the rest of this series? My track record here is far from perfect. I believe I addressed all the issues you raised. But it's very possible that I may have missed some, or misintepreted the concerns you raised. > > Reinette -Tony