LinuxLists.cc - About add an A64FX cache control function into resctrl

2021-04-09 05:54:58

Subject: About add an A64FX cache control function into resctrl

Hello

I'm Tan Shaopeng from Fujitsu Limited.

I$B!G(Bm trying to implement Fujitsu A64FX$B!G(Bs cache related features.
It is a cache partitioning function we called sector cache function
that using the value of the tag that is upper 8 bits of the 64bit
address and the value of the sector cache register to control virtual
cache capacity of the L1D&L2 cache.

A few days ago, when I sent a driver that realizes this function to
ARM64 kernel community, Will Deacon and Arnd Bergmann suggested
an idea to add the sector cache function of A64FX into resctrl.
https://lore.kernel.org/linux-arm-kernel/CAK8P3a2pFcNTw9NpRtQfYr7A5OcZ=As2kM0D_sbfFcGQ_J2Q+Q@mail.gmail.com/

Based on my study, I think the sector cache function of A64FX can be
added into the allocation features of resctrl after James' resctrl
rework has finished. But, in order to implement this function,
more interfaces for resctrl are need. The details are as follow,
and could you give me some advice?

[Sector cache function]
The sector cache function split cache into multiple sectors and
control them separately. It is implemented on the L1D cache and
L2 cache in the A64FX processor and can be controlled individually
for L1D cache and L2 cache. A64FX has no L3 cache. Each L1D cache
and L2 cache has 4 sectors. Which L1D sector is used is specified
by the value of [57:56] bits of address, how many ways of sector
are specified by the value of register (IMP_SCCR_L1_EL0).
Which L2 sector is used is specified by the value of [56] bits of
address, and how many ways of sector are specified by value of register
(IMP_SCCR_ASSIGN_EL1, IMP_SCCR_SET0_L2_EL1, IMP_SCCR_SET1_L2_EL1).

For more details of sector cache function,
see A64FX HPC extension specification (1.2. Sector cache) in
https://github.com/fujitsu/A64FX

[Difference between resctrl(CAT) and this sector cache function]
L2/L3 CAT (Cache Allocation Technology) enables the user to specify
some physical partition of cache space that an application can fill.
A64FX's L1D/L2 cache has 4 sectors and 16ways. This sector function
enables a user to specify number of ways each sector uses.
Therefore, for CAT it is enough to specify a cache portion for
each cache_id (socket). On the other hand, sector cache needs to
specify cache portion of each sector for each cache_id, and following
extension to resctrl interface is needed to support sector cache.

[Idear for A64FX sector cache function control interface (schemata file details)]
L1:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;$B!D(B
L2:<cache_id0>=>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;$B!D(B

$B!&(BL1: Add a new interface to control the L1D cache.
$B!&(B<cwbm>,<cwbm>,<cwbm>,<cwbm>$B!'(BSpecify the number of ways for each sector.
$B!&(Bcwbm$B!'(BSpecify the number of ways in each sector as a bitmap (percentage),
but the bitmap does not indicate the location of the cache.
* In the sector cache function, L2 sector cache way setting register is
shared among PEs (Processor Element) in shared domain. If two PEs
which share L2 cache belongs to different resource groups, one resource
group's L2 setting will affect to other resource group's L2 setting.
* Since A64FX does not support MPAM, it is not necessary to consider
how to switch between MPAM and sector cache function now.

Some questions:
1.I'm still studying about RDT, could you tell me whether RDT has
the similar mechanism with sector cache function?
2.In RDT, L3 cache is shared among cores in socket. If two cores which
share L3 cache belongs to different resource groups, one resource
group's L3 setting will affect to other resource group's L3 setting?
3.Is this approach acceptable? could you give me some advice?

Best regards
Tan Shaopeng

2021-04-21 09:57:28

by Shaopeng Tan (Fujitsu)

[permalink] [raw]

Subject: RE: About add an A64FX cache control function into resctrl

Hi,

Ping... any comments&advice about add an A64FX cache control function into resctrl?

Best regards
Tan Shaopeng

> Hello
>
>
> I'm Tan Shaopeng from Fujitsu Limited.
>
> I$B!G(Bm trying to implement Fujitsu A64FX$B!G(Bs cache related features.
> It is a cache partitioning function we called sector cache function that using
> the value of the tag that is upper 8 bits of the 64bit address and the value of the
> sector cache register to control virtual cache capacity of the L1D&L2 cache.
>
> A few days ago, when I sent a driver that realizes this function to
> ARM64 kernel community, Will Deacon and Arnd Bergmann suggested an idea
> to add the sector cache function of A64FX into resctrl.
> https://lore.kernel.org/linux-arm-kernel/CAK8P3a2pFcNTw9NpRtQfYr7A5Oc
> [email protected]/
>
> Based on my study, I think the sector cache function of A64FX can be added
> into the allocation features of resctrl after James' resctrl rework has finished.
> But, in order to implement this function, more interfaces for resctrl are need.
> The details are as follow, and could you give me some advice?
>
> [Sector cache function]
> The sector cache function split cache into multiple sectors and control them
> separately. It is implemented on the L1D cache and
> L2 cache in the A64FX processor and can be controlled individually for L1D
> cache and L2 cache. A64FX has no L3 cache. Each L1D cache and L2 cache
> has 4 sectors. Which L1D sector is used is specified by the value of [57:56] bits
> of address, how many ways of sector are specified by the value of register
> (IMP_SCCR_L1_EL0).
> Which L2 sector is used is specified by the value of [56] bits of address, and
> how many ways of sector are specified by value of register
> (IMP_SCCR_ASSIGN_EL1, IMP_SCCR_SET0_L2_EL1,
> IMP_SCCR_SET1_L2_EL1).
>
> For more details of sector cache function, see A64FX HPC extension
> specification (1.2. Sector cache) in https://github.com/fujitsu/A64FX
>
> [Difference between resctrl(CAT) and this sector cache function]
> L2/L3 CAT (Cache Allocation Technology) enables the user to specify some
> physical partition of cache space that an application can fill.
> A64FX's L1D/L2 cache has 4 sectors and 16ways. This sector function enables
> a user to specify number of ways each sector uses.
> Therefore, for CAT it is enough to specify a cache portion for each cache_id
> (socket). On the other hand, sector cache needs to specify cache portion of
> each sector for each cache_id, and following extension to resctrl interface is
> needed to support sector cache.
>
> [Idear for A64FX sector cache function control interface (schemata file
> details)]
> L1:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cw
> bm>,<cwbm>,<cwbm>,<cwbm>;$B!D(B
> L2:<cache_id0>=>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=
> <cwbm>,<cwbm>,<cwbm>,<cwbm>;$B!D(B
>
> $B!&(BL1: Add a new interface to control the L1D cache.
> $B!&(B<cwbm>,<cwbm>,<cwbm>,<cwbm>$B!'(BSpecify the number of ways for each
> sector.
> $B!&(Bcwbm$B!'(BSpecify the number of ways in each sector as a bitmap (percentage),
> but the bitmap does not indicate the location of the cache.
> * In the sector cache function, L2 sector cache way setting register is
> shared among PEs (Processor Element) in shared domain. If two PEs
> which share L2 cache belongs to different resource groups, one resource
> group's L2 setting will affect to other resource group's L2 setting.
> * Since A64FX does not support MPAM, it is not necessary to consider
> how to switch between MPAM and sector cache function now.
>
> Some questions:
> 1.I'm still studying about RDT, could you tell me whether RDT has
> the similar mechanism with sector cache function?
> 2.In RDT, L3 cache is shared among cores in socket. If two cores which
> share L3 cache belongs to different resource groups, one resource
> group's L3 setting will affect to other resource group's L3 setting?
> 3.Is this approach acceptable? could you give me some advice?
>
>
> Best regards
> Tan Shaopeng

2021-04-22 01:48:00

by Reinette Chatre

[permalink] [raw]

Subject: Re: About add an A64FX cache control function into resctrl

Hi Tan Shaopeng,

On 4/21/2021 1:37 AM, [email protected] wrote:
> Hi,
>
> Ping... any comments&advice about add an A64FX cache control function into resctrl?

My apologies for the delay.

>
> Best regards
> Tan Shaopeng
>
>> Hello
>>
>>
>> I'm Tan Shaopeng from Fujitsu Limited.
>>
>> I$B!G(Bm trying to implement Fujitsu A64FX$B!G(Bs cache related features.
>> It is a cache partitioning function we called sector cache function that using
>> the value of the tag that is upper 8 bits of the 64bit address and the value of the
>> sector cache register to control virtual cache capacity of the L1D&L2 cache.
>>
>> A few days ago, when I sent a driver that realizes this function to
>> ARM64 kernel community, Will Deacon and Arnd Bergmann suggested an idea
>> to add the sector cache function of A64FX into resctrl.
>> https://lore.kernel.org/linux-arm-kernel/CAK8P3a2pFcNTw9NpRtQfYr7A5Oc
>> [email protected]/
>>
>> Based on my study, I think the sector cache function of A64FX can be added
>> into the allocation features of resctrl after James' resctrl rework has finished.
>> But, in order to implement this function, more interfaces for resctrl are need.
>> The details are as follow, and could you give me some advice?
>>
>> [Sector cache function]
>> The sector cache function split cache into multiple sectors and control them
>> separately. It is implemented on the L1D cache and
>> L2 cache in the A64FX processor and can be controlled individually for L1D
>> cache and L2 cache. A64FX has no L3 cache. Each L1D cache and L2 cache
>> has 4 sectors. Which L1D sector is used is specified by the value of [57:56] bits
>> of address, how many ways of sector are specified by the value of register
>> (IMP_SCCR_L1_EL0).
>> Which L2 sector is used is specified by the value of [56] bits of address, and
>> how many ways of sector are specified by value of register
>> (IMP_SCCR_ASSIGN_EL1, IMP_SCCR_SET0_L2_EL1,
>> IMP_SCCR_SET1_L2_EL1).
>>
>> For more details of sector cache function, see A64FX HPC extension
>> specification (1.2. Sector cache) in https://github.com/fujitsu/A64FX

The overview in section 12 was informative but very high level.
I was not able to find any instance of "IMP_SCCR" in this document to
explore how this cache allocation works.

Are these cache sectors exposed to the OS in any way? For example, when
the OS discovers the cache, does it learn about these sectors and expose
the details to user space (/sys/devices/system/cpuX/cache)?

The overview of Sector Cache in that document provides details of how
the size of the sector itself is dynamically adjusted to usage. That
description is quite cryptic but it seems like a sector, since the
number of ways associated with it can dynamically change, is more
equivalent to a class of service or resource group in the resctrl
environment.

I really may be interpreting things wrong here, could you perhaps point
me to where I can obtain more details?

>> [Difference between resctrl(CAT) and this sector cache function]
>> L2/L3 CAT (Cache Allocation Technology) enables the user to specify some
>> physical partition of cache space that an application can fill.
>> A64FX's L1D/L2 cache has 4 sectors and 16ways. This sector function enables
>> a user to specify number of ways each sector uses.
>> Therefore, for CAT it is enough to specify a cache portion for each cache_id
>> (socket). On the other hand, sector cache needs to specify cache portion of
>> each sector for each cache_id, and following extension to resctrl interface is
>> needed to support sector cache.
>>
>> [Idear for A64FX sector cache function control interface (schemata file
>> details)]
>> L1:<cache_id0>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=<cw
>> bm>,<cwbm>,<cwbm>,<cwbm>;$B!D(B
>> L2:<cache_id0>=>=<cwbm>,<cwbm>,<cwbm>,<cwbm>;<cache_id1>=
>> <cwbm>,<cwbm>,<cwbm>,<cwbm>;$B!D(B
>>
>> $B!&(BL1: Add a new interface to control the L1D cache.
>> $B!&(B<cwbm>,<cwbm>,<cwbm>,<cwbm>$B!'(BSpecify the number of ways for each
>> sector.
>> $B!&(Bcwbm$B!'(BSpecify the number of ways in each sector as a bitmap (percentage),
>> but the bitmap does not indicate the location of the cache.
>> * In the sector cache function, L2 sector cache way setting register is
>> shared among PEs (Processor Element) in shared domain. If two PEs
>> which share L2 cache belongs to different resource groups, one resource
>> group's L2 setting will affect to other resource group's L2 setting.

In resctrl a "resource group" can be viewed as a class of service.

>> * Since A64FX does not support MPAM, it is not necessary to consider
>> how to switch between MPAM and sector cache function now.
>>
>> Some questions:
>> 1.I'm still studying about RDT, could you tell me whether RDT has
>> the similar mechanism with sector cache function?

This is not clear to me yet. One thing to keep in mind is that a bit in
the capacity bitmask could correspond to some number of ways in a cache,
but it does not have to. It is essentially a hint to hardware on how
much cache space needs to be allocated while also indicating overlap and
isolation from other allocations.

resctrl already supports the bitmask being interpreted differently
between architectures and with the MPAM support there will be even more
support for different interpretations.

>> 2.In RDT, L3 cache is shared among cores in socket. If two cores which
>> share L3 cache belongs to different resource groups, one resource
>> group's L3 setting will affect to other resource group's L3 setting?

This question is not entirely clear to me. Are you referring to the
hardware layout or configuration changes via the resctrl "cpus" file?

Each resource group is a class of service (CLOS) that is supported by
all cache instances. By default each resource group would thus contain
all cache instances on the system (even if some cache instances do not
support the same number of CLOS resctrl would only support the CLOS
supported by all resources).

Reinette

2021-04-23 08:18:10

by Shaopeng Tan (Fujitsu)

[permalink] [raw]

Subject: RE: About add an A64FX cache control function into resctrl

2021-04-28 08:18:09

by Shaopeng Tan (Fujitsu)

[permalink] [raw]

Hi Reinette,

> On 7/7/2021 4:26 AM, [email protected] wrote:
> >>> Sorry, I have not explained A64FX's sector cache function well yet.
> >>> I think I need explain this function from different perspective.
> >>
> >> You have explained the A64FX's sector cache function well. I have
> >> also read both specs to understand it better. It appears to me that
> >> you are not considering the resctrl architecture as part of your
> >> solution but instead just forcing your architecture onto the resctrl
> >> filesystem. For example, in resctrl the resource groups are not just
> >> a directory structure but has significance in what is being
> >> represented within the directory (a class of service). The files
> >> within a resource group's directory build on that. From your side I
> >> have not seen any effort in aligning the sector cache function with the
> resctrl architecture but instead you are just changing resctrl interface to match
> the A64FX architecture.
> >>
> >> Could you please take a moment to understand what resctrl is and how
> >> it could be mapped to A64FX in a coherent way?
> >
> > Previously, my idea is based on how to make instructions use different
> > sectors in one task. After I studied resctrl, to utilize resctrl
> > architecture on A64FX, I think it$B!G(Bs better to assign one sector to one
> > task. Thanks for your idea that "sectors" could be considered the same
> > as the resctrl "classes of service".
> >
> > Based on your idea, I am considering the implementation details.
> > In this email, I will explain the outline of new proposal, and then
> > please allow me to confirm a few technologies about resctrl.
> >
> > The outline of my proposal is as follows.
> > - Add a sector function equivalent to Intel's CAT function into resctrl.
> > (divide shared L2 cache into multiple partitions for multiple cores
> > use)
> > - Allocate one sector to one resource group (one CLOSID). Since one
> > core can only be assigned to one resource group, on A64FX each core
> > only uses one sector at a time.
>
> ok, so a sector is a portion of cache and matches with what can be represented
> with a resource group.
>
> The second part of your comment is not clear to me. In the first part you
> mention: "one core can only be assigned to one resource group" - this seems to
> indicate some static assignment between cores and sectors and if this is the

Sorry, does "static assignment between cores and sectors" mean
each core always use a fixed sector id? For example, core 0 always
use sector 0 at any case. It is not.

> case this needs more thinking since the current implementation assumes that
> any core that can access the cache can access all resource groups associated
> with that cache. On the other hand, you mention "on A64FX each core only uses
> one sector at a time" - this now sounds dynamic and is how resctrl works since
> the CPU is assigned a single class of service to indicate all resources
> accessible to it.

It is correct. Each core can be assigned to any resource group, and
each core only uses one sector at a time. Additionally, which sector
each core uses depends on the resource group (class of service) ID.

> > - Disable A64FX's HPC tag address override function. We only set each
> > core's default sector value according to closid(default sector
> ID=CLOSID).
> > - No L1 cache control since L1 cache is not shared for cores. It is not
> > necessary to add L1 cache interface for schemata file.
> > - No need to update schemata interface. Resctrl's L2 cache interface
> > (L2: <cache_id0> = <cbm>; <cache_id1> = <cbm>; ...)
> > will be used as it is. However, on A64FX, <cbm> does not indicate
> > the position of cache partition, only indicate the number of
> > cache ways (size).
>
> From what I understand the upcoming MPAM support would make this easier
> to do.
>
> >
> > This is the smallest start of incorporating sector cache function into
> > resctrl. I will consider if we could add more sector cache features
> > into resctrl (e.g. selecting different sectors from one task) after
> > finishing this.
> >
> > (some questions are below)
> >
> >>>
> >>>> On 5/17/2021 1:31 AM, [email protected] wrote:
> >>
> >>> --------
> >>> A64FX NUMA-PE-Cache Architecture:
> >>> NUMA0:
> >>> PE0:
> >>> L1sector0,L1sector1,L1sector2,L1sector3
> >>> PE1:
> >>> L1sector0,L1sector1,L1sector2,L1sector3
> >>> ...
> >>> PE11:
> >>> L1sector0,L1sector1,L1sector2,L1sector3
> >>>
> >>> L2sector0,1/L2sector2,3
> >>> NUMA1:
> >>> PE0:
> >>> L1sector0,L1sector1,L1sector2,L1sector3
> >>> ...
> >>> PE11:
> >>> L1sector0,L1sector1,L1sector2,L1sector3
> >>>
> >>> L2sector0,1/L2sector2,3
> >>> NUMA2:
> >>> ...
> >>> NUMA3:
> >>> ...
> >>> --------
> >>> In A64FX processor, one L1 sector cache capacity setting register is
> >>> only for one PE and not shared among PEs. L2 sector cache maximum
> >>> capacity setting registers are shared among PEs in same NUMA, and it
> >>> is to be noted that changing these registers in one PE influences other PE.
> >>
> >> Understood. cache affinity is familiar to resctrl. When a CPU becomes
> >> online it is discovered which caches/resources it has affinity to.
> >> Resources then have CPU mask associated with them to indicate on
> >> which CPU a register could be changed to configure the
> >> resource/cache. See
> >> domain_add_cpu() and struct rdt_domain.
> >
> > Is the following understanding correct?
> > Struct rdt_domain is a group of online CPUs that share a same cache
> > instance. When a CPU is online(resctrl initialization), the
> > domain_add_cpu() function add the online cpu to corresponding
> > rdt_domain (in rdt_resource:domains list). For example, if there are
> > 4 L2 cache instances, then there will be 4 rdt_domain in the list and
> > each CPU is assigned to corresponding rdt_domain.
>
> Correct.
>
> >
> > The set values of cache/memory are stored in the *ctrl_val array
> > (indexed by CLOSID) of struct rdt_domain. For example, in CAT
> > function, the CBM value of CLOSID=x is stored in ctrl_val [x].
> > When we create a resource group and write set values of cache into the
> > schemata file, the update_domains() function updates the CBM value to
> > ctrl_val [CLOSID = resource group ID] in rdt_domain and updates the
> > CBM value to CBM register(MSR_IA32_Lx_CBM_BASE).
>
> For the most part, yes. The only part that I would like to clarify is that each
> CLOSID is represented by a different register, which register is updated
> depends on which CLOSID is changed. Could be written as
> MSR_IA32_L2_CBM_CLOSID/MSR_IA32_L3_CBM_CLOSID. The "BASE"
> register is CLOSID 0, the default, and the other registers are determined as
> offset from it.
>
> Also, the registers have the scope of the resource/cache. So, for example, if
> CPU 0 and CPU 1 share a L2 cache then it is only necessary to update the
> register on one of these CPUs.

Thanks for your explanation. I understood it.
In addition, A64FX's L2 cache setting registers have similar scopes
of resource/cache, and only necessary to update the register on one of
these CPUs.

> >>> The number of ways for L2 Sector ID (0,1 or 2,3) can be set through
> >>> any PEs in same NUMA. The sector ID 0,1 and 2,3 are not available at
> >>> the same time in same NUMA.
> >>>
> >>>
> >>> I think, in your idea, a resource group will be created for each sector ID.
> >>> (> "sectors" could be considered the same as the resctrl "classes of
> >>> service") Then, an example of resource group is created as follows.
> >>> $B!&(B L1: NUMAX-PEY-L1sector0 (X = 0,1,2,3.Y = 0,1,2 ... 11),
> >>> $B!&(B L2: NUMAX-L2sector0 (X = 0,1,2,3)
> >>>
> >>> In this example, sector with same ID(0) of all PEs is allocated to
> >>> resource group. The L1D caches are numbered from
> >>> NUMA0_PE0-L1sector0(0) to NUMA4_PE11-L1sector0(47) and the L2
> >> caches
> >>> numbered from
> >>> NUMA0-L2sector0(0) to NUM4-L2sector0(3).
> >>> (NUMA number X is from 0-4, PE number Y is from 0-11)
> >>> (1) The number of ways of NUMAX-PEY-L1sector0 can be set
> independently
> >>> for each PEs (0-47). When run a task on this resource group,
> >>> we cannot control on which PE the task is running on and how
> many
> >>> cache ways the task is using.
> >>
> >> resctrl does not control the affinity on which PE/CPU a task is run.
> >> resctrl is an interface with which to configure how resources are
> >> allocated on the system. resctrl could thus provide interface with
> >> which each sector of each cache instance is assigned a number of cache
> ways.
> >> resctrl also provides an interface to assign a task with a class of
> >> service (sector id?). Through this the task obtains access to all
> >> resources that is allocated to the particular class of service
> >> (sector id?). Depending on which CPU the task is running it may
> >> indeed experience different performance if the sector id it is
> >> running with does not have the same allocations on all cache instances.
> The affinity of the task needs to be managed separately using for example
> taskset.
> >> Please see Documentation/x86/resctrl.rst "Examples for RDT allocation
> usage"
> >
> > In resctrl_sched_in(), there are comments as follow:
> > /*
> > * If this task has a closid/rmid assigned, use it.
> > * Else use the closid/rmid assigned to this cpu.
> > */
> > I thought when we write PID to tasks file, this task (PID) will only
> > run on the CPUs which are specified in cpus file in the same resource
> > group. So, the task_struct's closid and cpu's closid is the same.
> > When task's closid is different from cpu's closid?
>
> resctrl does not manage the affinity of tasks.
>
> Tony recently summarized the cpus file very well to me: The actual semantics of
> the CPUs file is to associate a CLOSid for a task that is in the default resctrl
> group ? while it is running on one of the listed CPUs.
>
> To answer your question the task's closid could be different from the CPU's
> closid if the task's closid is 0 while it is running on a CPU that is in the cpus file
> of a non-default resource group.
>
> You can see a summary of the decision flow in section "Resource allocation
> rules" in Documentation/x86/resctrl.rst
>
> The "cpus" file was created in support of the real-time use cases. In these use
> cases a group of CPUs can be designated as supporting the real-time work and
> with their own resource group and assigned the needed resources to do the
> real-time work. A real-time task can then be started with affinity to those CPUs
> and dynamically any kernel threads (that will be started on the same CPU)
> doing work on behalf of this task would be able to use the resources set aside
> for the real-time work.

Thanks for your explanation. I understood it.

I will implement this sector function, and if I have other questions,
please allow me to mail you.

Best regards,
Tan Shaopeng

2021-07-21 23:41:39

by Reinette Chatre

[permalink] [raw]

Subject: Re: About add an A64FX cache control function into resctrl

Hi Tan Shaopeng,

On 7/21/2021 1:10 AM, [email protected] wrote:
> Hi Reinette,
>
>> On 7/7/2021 4:26 AM, [email protected] wrote:
>>>>> Sorry, I have not explained A64FX's sector cache function well yet.
>>>>> I think I need explain this function from different perspective.
>>>>
>>>> You have explained the A64FX's sector cache function well. I have
>>>> also read both specs to understand it better. It appears to me that
>>>> you are not considering the resctrl architecture as part of your
>>>> solution but instead just forcing your architecture onto the resctrl
>>>> filesystem. For example, in resctrl the resource groups are not just
>>>> a directory structure but has significance in what is being
>>>> represented within the directory (a class of service). The files
>>>> within a resource group's directory build on that. From your side I
>>>> have not seen any effort in aligning the sector cache function with the
>> resctrl architecture but instead you are just changing resctrl interface to match
>> the A64FX architecture.
>>>>
>>>> Could you please take a moment to understand what resctrl is and how
>>>> it could be mapped to A64FX in a coherent way?
>>>
>>> Previously, my idea is based on how to make instructions use different
>>> sectors in one task. After I studied resctrl, to utilize resctrl
>>> architecture on A64FX, I think it$B!G(Bs better to assign one sector to one
>>> task. Thanks for your idea that "sectors" could be considered the same
>>> as the resctrl "classes of service".
>>>
>>> Based on your idea, I am considering the implementation details.
>>> In this email, I will explain the outline of new proposal, and then
>>> please allow me to confirm a few technologies about resctrl.
>>>
>>> The outline of my proposal is as follows.
>>> - Add a sector function equivalent to Intel's CAT function into resctrl.
>>> (divide shared L2 cache into multiple partitions for multiple cores
>>> use)
>>> - Allocate one sector to one resource group (one CLOSID). Since one
>>> core can only be assigned to one resource group, on A64FX each core
>>> only uses one sector at a time.
>>
>> ok, so a sector is a portion of cache and matches with what can be represented
>> with a resource group.
>>
>> The second part of your comment is not clear to me. In the first part you
>> mention: "one core can only be assigned to one resource group" - this seems to
>> indicate some static assignment between cores and sectors and if this is the
>
> Sorry, does "static assignment between cores and sectors" mean
> each core always use a fixed sector id? For example, core 0 always
> use sector 0 at any case. It is not.
>
>> case this needs more thinking since the current implementation assumes that
>> any core that can access the cache can access all resource groups associated
>> with that cache. On the other hand, you mention "on A64FX each core only uses
>> one sector at a time" - this now sounds dynamic and is how resctrl works since
>> the CPU is assigned a single class of service to indicate all resources
>> accessible to it.
>
> It is correct. Each core can be assigned to any resource group, and
> each core only uses one sector at a time. Additionally, which sector
> each core uses depends on the resource group (class of service) ID.

Thank you for clarifying. From what I understand this could be supported
by existing resctrl flows.

...

>>> In resctrl_sched_in(), there are comments as follow:
>>> /*
>>> * If this task has a closid/rmid assigned, use it.
>>> * Else use the closid/rmid assigned to this cpu.
>>> */
>>> I thought when we write PID to tasks file, this task (PID) will only
>>> run on the CPUs which are specified in cpus file in the same resource
>>> group. So, the task_struct's closid and cpu's closid is the same.
>>> When task's closid is different from cpu's closid?
>>
>> resctrl does not manage the affinity of tasks.
>>
>> Tony recently summarized the cpus file very well to me: The actual semantics of
>> the CPUs file is to associate a CLOSid for a task that is in the default resctrl
>> group ? while it is running on one of the listed CPUs.
>>
>> To answer your question the task's closid could be different from the CPU's
>> closid if the task's closid is 0 while it is running on a CPU that is in the cpus file
>> of a non-default resource group.
>>
>> You can see a summary of the decision flow in section "Resource allocation
>> rules" in Documentation/x86/resctrl.rst
>>
>> The "cpus" file was created in support of the real-time use cases. In these use
>> cases a group of CPUs can be designated as supporting the real-time work and
>> with their own resource group and assigned the needed resources to do the
>> real-time work. A real-time task can then be started with affinity to those CPUs
>> and dynamically any kernel threads (that will be started on the same CPU)
>> doing work on behalf of this task would be able to use the resources set aside
>> for the real-time work.
>
> Thanks for your explanation. I understood it.
>
> I will implement this sector function, and if I have other questions,
> please allow me to mail you.

I will help where I can. You may also be interested in the work James is
busy with. See his latest series at
https://lore.kernel.org/lkml/[email protected]/

Reinette