Subject: Re: [RFC PATCH V2 13/22] x86/intel_rdt: Support schemata write - pseudo-locking core
To: Thomas Gleixner
Cc: fenghua.yu@intel.com, tony.luck@intel.com, gavin.hindman@intel.com,
    vikas.shivappa@linux.intel.com, dave.hansen@intel.com, mingo@redhat.com,
    hpa@zytor.com, x86@kernel.org, linux-kernel@vger.kernel.org
From: Reinette Chatre
Message-ID: <73fb98d2-ce93-0443-b909-fde75908cc1e@intel.com>
Date: Mon, 26 Feb 2018 16:34:41 -0800

Hi Thomas,

On 2/20/2018 9:15 AM, Thomas Gleixner wrote:
> Let's look at the existing ctrl/mon groups which are each represented by a
> directory already.
>
> - Adding a 'size' file to the ctrl groups would be a natural extension
>   which makes sense for regular cache allocations as well.
>
> - Adding a 'exclusive' flag would be an interesting feature even for the
>   normal use case. Marking a group as exclusive prevents other groups to
>   request CBM bits which are held by a exclusive allocation.
>
> I'd suggest to have a file 'mode' for controlling this. The valid values
> would be something like 'shareable' and 'exclusive'.
>
> When trying to set a group to exclusive mode then the schemata has to be
> checked for overlaps with the other schematas and in case of conflict
> the write fails. Once enabled subsequent writes to the schemata file
> need to be checked for conflicts as well.
>
> If the exclusive setting is enabled then the CBM bits of that group
> are excluded from being used in other control groups.
>
> Aside of that a file in the info directory which shows the (un)used CBM
> bits of all groups is really helpful for controlling all of that (even w/o
> pseudo locking). You have this in the 'avail' file, but there is no reason
> why this should only be available for pseudo locking enabled systems.
>
> Now for the pseudo locking part.
>
> What you need on top of the above is a new 'mode': 'locked'. That mode
> utilizes the 'exclusive' mode rules vs. conflict checking and the
> protection against allocating the associated CBM bits in other control
> groups.
>
> The setup would be like this:
>
> mkdir group
> echo '$CONFIG' >group/schemata
> echo 'locked' >group/mode
>
> Setting mode to locked locks down the schemata file along with the
> task/cpus/cpus_list files. The task/cpu files need to be empty when
> entering locked mode, otherwise the operation fails. I'd even would not
> bother handing back the CLOSID. For simplicity the CLOSID should just stay
> associated with the control group until it is destroyed as any other
> control group.

I started looking at how this implementation may look and would like to
confirm with you that your intentions behind the new "exclusive" and
"locked" modes can be maintained. I also have a few questions.

Focusing on CAT, a resource group represents a closid across all domains
(cache instances) of all resources (cache layers) on the system. A full
schemata reflecting the active bitmask associated with this closid for
each domain of each resource is maintained. The current implementation
supports partial writes to the schemata, with the assumption that only
the changed values need to be updated while the others remain as is. For
the current implementation this works well since what is shown by the
schemata reflects the current hardware settings and what is written to
the schemata will change the current hardware settings. This is done
irrespective of any overlap between the bitmasks of different closids
(the "shareable" mode).

A change to start us off with could be to initialize the schemata with
all the shareable and unused bits set for all domains when a new
resource group is created.
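
To make that starting point concrete, here is a purely illustrative
session. It assumes resctrl is already mounted at /sys/fs/resctrl, a
single L2 resource with two cache instances (domains 0 and 1), an 8-bit
capacity bitmask, and no exclusive bits claimed by any other group; the
exact values are made up for illustration:

# cd /sys/fs/resctrl
# mkdir group
# cat group/schemata
L2:0=0xff;1=0xff
# echo "L2:1=0x3" > group/schemata
# cat group/schemata
L2:0=0xff;1=0x3

The first cat shows the proposed initial state (all shareable and unused
bits set for every domain), and the partial write then, as in the
current implementation, only replaces the bitmask of domain 1 while
domain 0 keeps its previous value.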
Moving to "exclusive" mode it appears that, when enabled for a resource group, all domains of all resources are forced to have an "exclusive" region associated with this resource group (closid). This is because the schemata reflects the hardware settings of all resources and their domains and the hardware does not accept a "zero" bitmask. A user thus cannot just specify a single region of a particular cache instance as "exclusive". Does this match your intention wrt "exclusive"? Moving on to the "locked" mode. We cannot support different pseudo-locked regions across multiple resources (eg. L2 and L3). In fact, if we would at some point in the future then a pseudo-locked region on one resource could implicitly span a second resource. Additionally, we would like to enable a user to enable a single pseudo-locked region on a single cache instance. From the above it follows that "locked" mode cannot just simply build on top of "exclusive" mode rules (as I expressed them above) since it cannot enforce a locked region on each domain of each resource. We would like to support something like (as you also have in your example): mkdir group echo "L2:1=0x3" > schemata echo locked > mode The above should only pseudo-lock the indicated region and not touch any other domain. The problem is that the schemata always contain non-zero bitmasks for all domains so at the time "locked" is written it is not known which cache region needs to be locked. I am currently unable to see a simple way to build on top of the current schemata design to support the "locked" mode as you intended. It does seem as though the user's intention to create a pseudo-locked region needs to be communicated before the schemata is written, but from what I understand this does not seem to be supported by the mode/schemata combination. Please do correct me where I am wrong. To continue, when we overcome the above obstacle: A scenario could be where a single resource group will contain all the pseudo-locked regions (to avoid wasting closids). It is not clear to me how to easily support such a usage though since the way writes to the schemata is done is "changes only". If for example, two pseudo-locked regions exists: # mkdir group # echo "L2:1=0x3" > schemata # echo locked > mode # cat schemata L2:1=0x3 # echo "L2:0=0xf" > schemata # cat schemata L2:0=0xf;1=0x3 How can the user remove one of the pseudo-locked regions without affecting the other? Could we perhaps allow zero bitmask writes when a region is locked? Another point I would like to highlight is that when we talked about keeping the closid associated with the pseudo-locked region I mentioned that some resources may have few closids (for example, 4). As discussed this seems ok when there are only 8 bits in the bitmask. What I did not highlight at that time is that the closids are limited to the smallest number supported by all resources. So, if this same platform has a second resource (with more bits in a bitmask) with more closids, they would also be limited to 4. In this case it does seem removing a closid from service would have bigger impact. Reinette