by Marcelo Tosatti

[permalink] [raw]

Subject: Re: [RFD] CAT user space interface revisited

On Tue, Dec 29, 2015 at 01:44:16PM +0100, Thomas Gleixner wrote:
> Marcelo,
>
> On Wed, 23 Dec 2015, Marcelo Tosatti wrote:
> > On Tue, Dec 22, 2015 at 06:12:05PM +0000, Yu, Fenghua wrote:
> > > > From: Thomas Gleixner [mailto:[email protected]]
> > > >
> > > > I was not able to identify any existing infrastructure where this really fits in. I
> > > > chose a directory/file based representation. We certainly could do the same
> > >
> > > Is this be /sys/devices/system/?
> > > Then create qos/cat directory. In the future, other directories may be created
> > > e.g. qos/mbm?
> >
> > I suppose Thomas is talking about the socketmask only, as discussed in
> > the call with Intel.
>
> I have no idea about what you talked in a RH/Intel call.
>
> > Thomas, is that correct? (if you want a change in directory structure,
> > please explain the whys, because we don't need that change in directory
> > structure).
>
> Can you please start to write coherent and understandable mails? I have no
> idea of which directory structure, which does not need to be changed, you are
> talking.

Thomas,

There is one directory structure in this topic, CAT. That is the
directory structure which is exposed to userspace to control the
CAT HW.

With the current patchset posted by Intel ("Subject: [PATCH V16 00/11]
x86: Intel Cache Allocation Technology Support"), the directory
structure there (the files and directories exposed by that patchset)
(*1) does not allow one to configure different CBM masks on each socket
(that is, it forces the user to configure the same mask CBM on every
socket). This is a blocker for us, and it is one of the points in your
proposal.

There was a call between Red Hat and Intel where it was communicated
to Intel, and Intel agreed, that it was necessary to fix this (fix this
== allow different CBM masks on different sockets).

Now, that is one change to the current directory structure (*1).

(*1) modified to allow for different CBM masks on different sockets,
lets say (*2), is what we have been waiting for Intel to post.
It would handle our usecase, and all use-cases which the current
patchset from Intel already handles (Vikas posted emails mentioning
there are happy users of the current interface, feel free to ask
him for more details).

What i have asked you, and you replied "to go Google read my previous
post" is this:
What are the advantages over you proposal (which is a completely
different directory structure, requiring a complete rewrite),
over (*2) ?

(what is my reason behind this: the reason is that if you, with
maintainer veto power, forces your proposal to be accepted, it will be
necessary to wait for another rewrite (a new set of problems, fully
think through your proposal, test it, ...) rather than simply modify an
already known, reviewed, already used directory structure.

And functionally, your proposal adds nothing to (*2) (other than, well,
being a different directory structure).

If Fenghua or you post a patchset, say in 2 weeks, with your proposal,
i am fine with that. But i since i doubt that will be the case, i am
pushing for the interface which requires the least amount of changes
(and therefore the least amount of time) to be integrated.

>From your email:

"It would even be sufficient for particular use cases to just associate
a piece of cache to a given CPU and do not bother with tasks at all.

We really need to make this as configurable as possible from userspace
without imposing random restrictions to it. I played around with it on
my new intel toy and the restriction to 16 COS ids (that's 8 with CDP
enabled) makes it really useless if we force the ids to have the same
meaning on all sockets and restrict it to per task partitioning."

Yes, thats the issue we hit, that is the modification that was agreed
with Intel, and thats what we are waiting for them to post.

> I described a directory structure for that qos/cat stuff in my proposal and
> that's complete AFAICT.

Ok, lets make the job for the submitter easier. You are the maintainer,
so you decide.

Is it enough for you to have (*2) (which was agreed with Intel), or
would you rather prefer to integrate the directory structure at
"[RFD] CAT user space interface revisited" ?

Thanks.

2015-12-31 22:32:01

by Thomas Gleixner

[permalink] [raw]

Subject: Re: [RFD] CAT user space interface revisited

Marcelo,

On Thu, 31 Dec 2015, Marcelo Tosatti wrote:

First of all thanks for the explanation.

> There is one directory structure in this topic, CAT. That is the
> directory structure which is exposed to userspace to control the
> CAT HW.
>
> With the current patchset posted by Intel ("Subject: [PATCH V16 00/11]
> x86: Intel Cache Allocation Technology Support"), the directory
> structure there (the files and directories exposed by that patchset)
> (*1) does not allow one to configure different CBM masks on each socket
> (that is, it forces the user to configure the same mask CBM on every
> socket). This is a blocker for us, and it is one of the points in your
> proposal.
>
> There was a call between Red Hat and Intel where it was communicated
> to Intel, and Intel agreed, that it was necessary to fix this (fix this
> == allow different CBM masks on different sockets).
>
> Now, that is one change to the current directory structure (*1).

I don't have an idea how that would look like. The current structure is a
cgroups based hierarchy oriented approach, which does not allow simple things
like

T1 00001111
T2 00111100

at least not in a way which is natural to the problem at hand.

> (*1) modified to allow for different CBM masks on different sockets,
> lets say (*2), is what we have been waiting for Intel to post.
> It would handle our usecase, and all use-cases which the current
> patchset from Intel already handles (Vikas posted emails mentioning
> there are happy users of the current interface, feel free to ask
> him for more details).

I cannot imagine how that modification to the current interface would solve
that. Not to talk about per CPU associations which are not related to tasks at
all.

> What i have asked you, and you replied "to go Google read my previous
> post" is this:
> What are the advantages over you proposal (which is a completely
> different directory structure, requiring a complete rewrite),
> over (*2) ?
>
> (what is my reason behind this: the reason is that if you, with
> maintainer veto power, forces your proposal to be accepted, it will be
> necessary to wait for another rewrite (a new set of problems, fully
> think through your proposal, test it, ...) rather than simply modify an
> already known, reviewed, already used directory structure.
>
> And functionally, your proposal adds nothing to (*2) (other than, well,
> being a different directory structure).

Sorry. I cannot see at all how a modification to the existing interface would
cover all the sensible use cases I described in a coherent way. I really want
to see a proper description of the interface before people start hacking on it
in a frenzy. What you described is: "let's say (*2)" modification. That's
pretty meager.

> If Fenghua or you post a patchset, say in 2 weeks, with your proposal,
> i am fine with that. But i since i doubt that will be the case, i am
> pushing for the interface which requires the least amount of changes
> (and therefore the least amount of time) to be integrated.
>
> >From your email:
>
> "It would even be sufficient for particular use cases to just associate
> a piece of cache to a given CPU and do not bother with tasks at all.
>
> We really need to make this as configurable as possible from userspace
> without imposing random restrictions to it. I played around with it on
> my new intel toy and the restriction to 16 COS ids (that's 8 with CDP
> enabled) makes it really useless if we force the ids to have the same
> meaning on all sockets and restrict it to per task partitioning."
>
> Yes, thats the issue we hit, that is the modification that was agreed
> with Intel, and thats what we are waiting for them to post.

How do you implement the above - especially that part:

"It would even be sufficient for particular use cases to just associate a
piece of cache to a given CPU and do not bother with tasks at all."

as a "simple" modification to (*1) ?

> > I described a directory structure for that qos/cat stuff in my proposal and
> > that's complete AFAICT.
>
> Ok, lets make the job for the submitter easier. You are the maintainer,
> so you decide.
>
> Is it enough for you to have (*2) (which was agreed with Intel), or
> would you rather prefer to integrate the directory structure at
> "[RFD] CAT user space interface revisited" ?

The only thing I care about as a maintainer is, that we merge something which
actually reflects the properties of the hardware and gives the admin the
required flexibility to utilize it fully. I don't care at all if it's my
proposal or something else which allows to do the same.

Let me copy the relevant bits from my proposal here once more and let me ask
questions to the various points so you can tell me how that modification to
(*1) is going to deal with that.

>> At top level:
>>
>> xxxxxxx/cat/max_cosids <- Assume that all CPUs are the same
>> xxxxxxx/cat/max_maskbits <- Assume that all CPUs are the same
>> xxxxxxx/cat/cdp_enable <- Depends on CDP availability

Where is that information in (*2) and how is that related to (*1)? If you
think it's not required, please explain why.

>> Per socket data:
>>
>> xxxxxxx/cat/socket-0/
>> ...
>> xxxxxxx/cat/socket-N/l3_size
>> xxxxxxx/cat/socket-N/hwsharedbits

Where is that information in (*2) and how is that related to (*1)? If you
think it's not required, please explain why.

>> Per socket mask data:
>>
>> xxxxxxx/cat/socket-N/cos-id-0/
>> ...
>> xxxxxxx/cat/socket-N/cos-id-N/inuse
>> /cat_mask
>> /cdp_mask <- Data mask if CDP enabled

Where is that information in (*2) and how is that related to (*1)? If you
think it's not required, please explain why.

>> Per cpu default cos id for the cpus on that socket:
>>
>> xxxxxxx/cat/socket-N/cpu-x/default_cosid
>> ...
>> xxxxxxx/cat/socket-N/cpu-N/default_cosid
>>
>> The above allows a simple cpu based partitioning. All tasks which do
>> not have a cache partition assigned on a particular socket use the
>> default one of the cpu they are running on.

Where is that information in (*2) and how is that related to (*1)? If you
think it's not required, please explain why.

>> Now for the task(s) partitioning:
>>
>> xxxxxxx/cat/partitions/
>>
>> Under that directory one can create partitions
>>
>> xxxxxxx/cat/partitions/p1/tasks
>> /socket-0/cosid
>> ...
>> /socket-n/cosid
>>
>> The default value for the per socket cosid is COSID_DEFAULT, which
>> causes the task(s) to use the per cpu default id.

Where is that information in (*2) and how is that related to (*1)? If you
think it's not required, please explain why.

Yes. I ask the same question several times and I really want to see the
directory/interface structure which solves all of the above before anyone
starts to implement it. We already have a completely useless interface (*1)
and there is no point to implement another one based on it (*2) just because
it solves your particular issue and is the fastest way forward. User space
interfaces are hard and we really do not need some half baken solution which
we have to support forever.

Let me enumerate the required points again:

1) Information about the hardware properties

2) Integration of CAT and CDP

3) Per socket cos-id partitioning

4) Per cpu default cos-id association

5) Task association to cos-id

Can you please explain in a simple directory based scheme, like the one I gave
you how all of these points are going to be solved with a modification to (*1)?

Thanks,

tglx