Message-ID: <8e92b43d-dd8f-80e0-e31b-5ebfed418a0f@arm.com>
Date: Thu, 27 Apr 2023 15:11:07 +0100
From: James Morse
To: Peter Newman
Cc: x86@kernel.org, linux-kernel@vger.kernel.org, Fenghua Yu,
 Reinette Chatre, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
 H Peter Anvin, Babu Moger, shameerali.kolothum.thodi@huawei.com,
 D Scott Phillips OS, carl@os.amperecomputing.com, lcherian@marvell.com,
 bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com,
 xingxin.hx@openanolis.org, baolin.wang@linux.alibaba.com, Jamie Iles,
 Xin Hao
Subject: Re: [PATCH v3 09/19] x86/resctrl: Queue mon_event_read() instead of sending an IPI
References: <20230320172620.18254-1-james.morse@arm.com>
 <20230320172620.18254-10-james.morse@arm.com>

Hi Peter,

On 22/03/2023 14:07, Peter Newman wrote:
> On Mon, Mar 20, 2023 at 6:27 PM James Morse wrote:
>>
>> x86 is blessed with an abundance of monitors, one per RMID, that can be
>
> As I explained earlier, this is not the case on AMD.

I'll change it to say Intel.

>> read from any CPU in the domain. MPAM's monitors reside in the MMIO MSC,
>> the number implemented is up to the manufacturer. This means when there are
>> fewer monitors than needed, they need to be allocated and freed.
>>
>> Worse, the domain may be broken up into slices, and the MMIO accesses
>> for each slice may need performing from different CPUs.
>>
>> These two details mean MPAM's monitor code needs to be able to sleep, and
>> IPI another CPU in the domain to read from a resource that has been sliced.
>
> This doesn't sound very convincing. Could mon_event_read() IPI all the
> CPUs in the domain? (after waiting to allocate and install monitors
> when necessary?)

On the majority of platforms this would be a waste of time as the IPI only
needs sending to one.
I'd like to keep the cost of being strange limited to the strange platforms.

I don't think exposing a 'sub domain' cpumask to resctrl is helpful: this
needs to be hidden in the architecture specific code.

The IPI is because of SoC components being implemented as slices which are
private to that slice.

The sleeping is because the CSU counters are allowed to be 'not ready'
immediately after programming. The time is short, and to allow platforms that
have too few CSU monitors to support the same user-interface as x86^W Intel,
the MPAM driver needs to be able to multiplex a single CSU monitor between
multiple control/monitor groups. Allowing it to sleep for the advertised
not-ready period is the simplest way of doing this.

>> mon_event_read() already invokes mon_event_count() via IPI, which means
>> this isn't possible. On systems using nohz-full, some CPUs need to be
>> interrupted to run kernel work as they otherwise stay in user-space
>> running realtime workloads. Interrupting these CPUs should be avoided,
>> and scheduling work on them may never complete.
>>
>> Change mon_event_read() to pick a housekeeping CPU, (one that is not using
>> nohz_full) and schedule mon_event_count() and wait. If all the CPUs
>> in a domain are using nohz-full, then an IPI is used as the fallback.
>>
>> This function is only used in response to a user-space filesystem request
>> (not the timing sensitive overflow code).
>>
>> This allows MPAM to hide the slice behaviour from resctrl, and to keep
>> the monitor-allocation in monitor.c.
>
> This goal sounds more likely.
>
> If it makes the initial enablement smoother, then I'm all for it.
> Reviewed-By: Peter Newman
>
> These changes worked fine for me on tip/master, though there were merge
> conflicts to resolve.
>
> Tested-By: Peter Newman

Thanks!

James
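
For anyone skimming the thread, the CPU-selection policy described in the quoted commit message can be modelled in a few lines of userspace C. This is only a sketch and not the patch itself: cpu_mask_t, struct read_plan, first_cpu() and plan_counter_read() are hypothetical names made up for illustration, and the 64-bit bitmask stands in for the kernel's cpumask. It prefers a housekeeping CPU in the domain (queue the read and wait) and falls back to an IPI only when every CPU in the domain is nohz_full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model: one bit per CPU, at most 64 CPUs. */
typedef uint64_t cpu_mask_t;

/* Hypothetical result type: where to read the counter, and how. */
struct read_plan {
	int cpu;      /* CPU chosen to perform the counter read */
	bool use_ipi; /* true: interrupt it; false: queue the work and wait */
};

/* Index of the lowest set bit. The caller guarantees mask != 0. */
static int first_cpu(cpu_mask_t mask)
{
	int cpu = 0;

	while (!(mask & 1)) {
		mask >>= 1;
		cpu++;
	}
	return cpu;
}

/*
 * Prefer a housekeeping CPU (in the domain, not nohz_full) and queue the
 * read there. Only if the whole domain is nohz_full, fall back to an IPI.
 */
static struct read_plan plan_counter_read(cpu_mask_t domain,
					  cpu_mask_t nohz_full)
{
	struct read_plan plan;
	cpu_mask_t housekeeping = domain & ~nohz_full;

	assert(domain != 0); /* an empty domain has nothing to read from */

	if (housekeeping) {
		plan.cpu = first_cpu(housekeeping);
		plan.use_ipi = false;
	} else {
		plan.cpu = first_cpu(domain);
		plan.use_ipi = true;
	}
	return plan;
}
```

With a domain of CPUs 4-7 (mask 0xF0) and CPUs 4-5 marked nohz_full (mask 0x30), the model picks CPU 6 and queues the work. If all of the domain's CPUs are nohz_full, it picks the first CPU in the domain and flags the IPI fallback.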