LinuxLists.cc - Re: [RFC v3 1/5] sched/core: add capacity constraints to CPU controller

2017-03-13 10:46:25

Subject: Re: [RFC v3 1/5] sched/core: add capacity constraints to CPU controller

On Tue, Feb 28, 2017 at 6:38 AM, Patrick Bellasi
<[email protected]> wrote:
> The CPU CGroup controller allows to assign a specified (maximum)
> bandwidth to tasks within a group, however it does not enforce any
> constraint on how such bandwidth can be consumed.
> With the integration of schedutil, the scheduler has now the proper
> information about a task to select the most suitable frequency to
> satisfy tasks needs.
[..]

> +static u64 cpu_capacity_min_read_u64(struct cgroup_subsys_state *css,
> + struct cftype *cft)
> +{
> + struct task_group *tg;
> + u64 min_capacity;
> +
> + rcu_read_lock();
> + tg = css_tg(css);
> + min_capacity = tg->cap_clamp[CAP_CLAMP_MIN];

Shouldn't the cap_clamp be accessed with READ_ONCE (and WRITE_ONCE in
the write path) to avoid load-tearing?

Thanks,
Joel

2017-03-15 11:20:44

by Patrick Bellasi

[permalink] [raw]

Subject: Re: [RFC v3 1/5] sched/core: add capacity constraints to CPU controller

On 13-Mar 03:46, Joel Fernandes (Google) wrote:
> On Tue, Feb 28, 2017 at 6:38 AM, Patrick Bellasi
> <[email protected]> wrote:
> > The CPU CGroup controller allows to assign a specified (maximum)
> > bandwidth to tasks within a group, however it does not enforce any
> > constraint on how such bandwidth can be consumed.
> > With the integration of schedutil, the scheduler has now the proper
> > information about a task to select the most suitable frequency to
> > satisfy tasks needs.
> [..]
>
> > +static u64 cpu_capacity_min_read_u64(struct cgroup_subsys_state *css,
> > + struct cftype *cft)
> > +{
> > + struct task_group *tg;
> > + u64 min_capacity;
> > +
> > + rcu_read_lock();
> > + tg = css_tg(css);
> > + min_capacity = tg->cap_clamp[CAP_CLAMP_MIN];
>
> Shouldn't the cap_clamp be accessed with READ_ONCE (and WRITE_ONCE in
> the write path) to avoid load-tearing?

tg->cap_clamp is an "unsigned int" and thus I would expect a single
memory access to write/read it, isn't it? I mean: I do not expect the
compiler "to mess" with these accesses.

However, if your concerns are more about overlapping read/write for the
same capacity from different threads, then perhaps we should better
use a mutex to serialize these two functions... not entirely convinced...

> Thanks,
> Joel

--
#include <best/regards.h>

Patrick Bellasi

2017-03-15 13:21:27

by Joel Fernandes

[permalink] [raw]

Subject: Re: [RFC v3 1/5] sched/core: add capacity constraints to CPU controller

On Wed, Mar 15, 2017 at 4:20 AM, Patrick Bellasi
<[email protected]> wrote:
> On 13-Mar 03:46, Joel Fernandes (Google) wrote:
>> On Tue, Feb 28, 2017 at 6:38 AM, Patrick Bellasi
>> <[email protected]> wrote:
>> > The CPU CGroup controller allows to assign a specified (maximum)
>> > bandwidth to tasks within a group, however it does not enforce any
>> > constraint on how such bandwidth can be consumed.
>> > With the integration of schedutil, the scheduler has now the proper
>> > information about a task to select the most suitable frequency to
>> > satisfy tasks needs.
>> [..]
>>
>> > +static u64 cpu_capacity_min_read_u64(struct cgroup_subsys_state *css,
>> > + struct cftype *cft)
>> > +{
>> > + struct task_group *tg;
>> > + u64 min_capacity;
>> > +
>> > + rcu_read_lock();
>> > + tg = css_tg(css);
>> > + min_capacity = tg->cap_clamp[CAP_CLAMP_MIN];
>>
>> Shouldn't the cap_clamp be accessed with READ_ONCE (and WRITE_ONCE in
>> the write path) to avoid load-tearing?
>
> tg->cap_clamp is an "unsigned int" and thus I would expect a single
> memory access to write/read it, isn't it? I mean: I do not expect the
> compiler "to mess" with these accesses.

This depends on compiler and arch. I'm not sure if its in practice
these days an issue, but see section on 'load tearing' in
Documentation/memory-barriers.txt . If compiler decided to break down
the access to multiple accesses due to some reason, then might be a
problem.

Adding Paul for his expert opinion on the matter ;)

Thanks,
Joel

2017-03-15 16:11:12

by Paul E. McKenney

[permalink] [raw]

Subject: Re: [RFC v3 1/5] sched/core: add capacity constraints to CPU controller

On Wed, Mar 15, 2017 at 06:20:28AM -0700, Joel Fernandes wrote:
> On Wed, Mar 15, 2017 at 4:20 AM, Patrick Bellasi
> <[email protected]> wrote:
> > On 13-Mar 03:46, Joel Fernandes (Google) wrote:
> >> On Tue, Feb 28, 2017 at 6:38 AM, Patrick Bellasi
> >> <[email protected]> wrote:
> >> > The CPU CGroup controller allows to assign a specified (maximum)
> >> > bandwidth to tasks within a group, however it does not enforce any
> >> > constraint on how such bandwidth can be consumed.
> >> > With the integration of schedutil, the scheduler has now the proper
> >> > information about a task to select the most suitable frequency to
> >> > satisfy tasks needs.
> >> [..]
> >>
> >> > +static u64 cpu_capacity_min_read_u64(struct cgroup_subsys_state *css,
> >> > + struct cftype *cft)
> >> > +{
> >> > + struct task_group *tg;
> >> > + u64 min_capacity;
> >> > +
> >> > + rcu_read_lock();
> >> > + tg = css_tg(css);
> >> > + min_capacity = tg->cap_clamp[CAP_CLAMP_MIN];
> >>
> >> Shouldn't the cap_clamp be accessed with READ_ONCE (and WRITE_ONCE in
> >> the write path) to avoid load-tearing?
> >
> > tg->cap_clamp is an "unsigned int" and thus I would expect a single
> > memory access to write/read it, isn't it? I mean: I do not expect the
> > compiler "to mess" with these accesses.
>
> This depends on compiler and arch. I'm not sure if its in practice
> these days an issue, but see section on 'load tearing' in
> Documentation/memory-barriers.txt . If compiler decided to break down
> the access to multiple accesses due to some reason, then might be a
> problem.

The compiler might also be able to inline cpu_capacity_min_read_u64()
fuse the load from tg->cap_clamp[CAP_CLAMP_MIN] with other accesses.
If min_capacity is used several times in the ensuing code, the compiler
could reload multiple times from tg->cap_clamp[CAP_CLAMP_MIN], which at
best might be a bit confusing.

> Adding Paul for his expert opinion on the matter ;)

My personal approach is to use READ_ONCE() and WRITE_ONCE() unless
I can absolutely prove that the compiler cannot do any destructive
optimizations. And I not-infrequently find unsuspected opportunities
for destructive optimization in my own code. Your mileage may vary. ;-)

Thanx, Paul

2017-03-15 16:45:22

by Patrick Bellasi

[permalink] [raw]

Subject: Re: [RFC v3 1/5] sched/core: add capacity constraints to CPU controller

On 15-Mar 09:10, Paul E. McKenney wrote:
> On Wed, Mar 15, 2017 at 06:20:28AM -0700, Joel Fernandes wrote:
> > On Wed, Mar 15, 2017 at 4:20 AM, Patrick Bellasi
> > <[email protected]> wrote:
> > > On 13-Mar 03:46, Joel Fernandes (Google) wrote:
> > >> On Tue, Feb 28, 2017 at 6:38 AM, Patrick Bellasi
> > >> <[email protected]> wrote:
> > >> > The CPU CGroup controller allows to assign a specified (maximum)
> > >> > bandwidth to tasks within a group, however it does not enforce any
> > >> > constraint on how such bandwidth can be consumed.
> > >> > With the integration of schedutil, the scheduler has now the proper
> > >> > information about a task to select the most suitable frequency to
> > >> > satisfy tasks needs.
> > >> [..]
> > >>
> > >> > +static u64 cpu_capacity_min_read_u64(struct cgroup_subsys_state *css,
> > >> > + struct cftype *cft)
> > >> > +{
> > >> > + struct task_group *tg;
> > >> > + u64 min_capacity;
> > >> > +
> > >> > + rcu_read_lock();
> > >> > + tg = css_tg(css);
> > >> > + min_capacity = tg->cap_clamp[CAP_CLAMP_MIN];
> > >>
> > >> Shouldn't the cap_clamp be accessed with READ_ONCE (and WRITE_ONCE in
> > >> the write path) to avoid load-tearing?
> > >
> > > tg->cap_clamp is an "unsigned int" and thus I would expect a single
> > > memory access to write/read it, isn't it? I mean: I do not expect the
> > > compiler "to mess" with these accesses.
> >
> > This depends on compiler and arch. I'm not sure if its in practice
> > these days an issue, but see section on 'load tearing' in
> > Documentation/memory-barriers.txt . If compiler decided to break down
> > the access to multiple accesses due to some reason, then might be a
> > problem.
>
> The compiler might also be able to inline cpu_capacity_min_read_u64()
> fuse the load from tg->cap_clamp[CAP_CLAMP_MIN] with other accesses.
> If min_capacity is used several times in the ensuing code, the compiler
> could reload multiple times from tg->cap_clamp[CAP_CLAMP_MIN], which at
> best might be a bit confusing.

That's actually an interesting case, however I don't think it applies
in this case since cpu_capacity_min_read_u64() is called only via
a function poninter and thus it will never be inlined. isn't it?

> > Adding Paul for his expert opinion on the matter ;)
>
> My personal approach is to use READ_ONCE() and WRITE_ONCE() unless
> I can absolutely prove that the compiler cannot do any destructive
> optimizations. And I not-infrequently find unsuspected opportunities
> for destructive optimization in my own code. Your mileage may vary. ;-)

I guess here the main concern from Joel is that such a pattern:

u64 var = unsigned_int_value_from_memory;

can result is a couple of "load from memory" operations.

In that case a similar:

unsigned_int_left_value = new_unsigned_int_value;

executed on a different thread can overlap with the previous memory
read operations and ending up in "var" containing a not consistent
value.

Question is: can this really happen, given the data types in use?

> Thanx, Paul

Thanks! ;-)

--
#include <best/regards.h>

Patrick Bellasi

2017-03-15 17:24:37

by Paul E. McKenney

[permalink] [raw]

Subject: Re: [RFC v3 1/5] sched/core: add capacity constraints to CPU controller

On Wed, Mar 15, 2017 at 04:44:39PM +0000, Patrick Bellasi wrote:
> On 15-Mar 09:10, Paul E. McKenney wrote:
> > On Wed, Mar 15, 2017 at 06:20:28AM -0700, Joel Fernandes wrote:
> > > On Wed, Mar 15, 2017 at 4:20 AM, Patrick Bellasi
> > > <[email protected]> wrote:
> > > > On 13-Mar 03:46, Joel Fernandes (Google) wrote:
> > > >> On Tue, Feb 28, 2017 at 6:38 AM, Patrick Bellasi
> > > >> <[email protected]> wrote:
> > > >> > The CPU CGroup controller allows to assign a specified (maximum)
> > > >> > bandwidth to tasks within a group, however it does not enforce any
> > > >> > constraint on how such bandwidth can be consumed.
> > > >> > With the integration of schedutil, the scheduler has now the proper
> > > >> > information about a task to select the most suitable frequency to
> > > >> > satisfy tasks needs.
> > > >> [..]
> > > >>
> > > >> > +static u64 cpu_capacity_min_read_u64(struct cgroup_subsys_state *css,
> > > >> > + struct cftype *cft)
> > > >> > +{
> > > >> > + struct task_group *tg;
> > > >> > + u64 min_capacity;
> > > >> > +
> > > >> > + rcu_read_lock();
> > > >> > + tg = css_tg(css);
> > > >> > + min_capacity = tg->cap_clamp[CAP_CLAMP_MIN];
> > > >>
> > > >> Shouldn't the cap_clamp be accessed with READ_ONCE (and WRITE_ONCE in
> > > >> the write path) to avoid load-tearing?
> > > >
> > > > tg->cap_clamp is an "unsigned int" and thus I would expect a single
> > > > memory access to write/read it, isn't it? I mean: I do not expect the
> > > > compiler "to mess" with these accesses.
> > >
> > > This depends on compiler and arch. I'm not sure if its in practice
> > > these days an issue, but see section on 'load tearing' in
> > > Documentation/memory-barriers.txt . If compiler decided to break down
> > > the access to multiple accesses due to some reason, then might be a
> > > problem.
> >
> > The compiler might also be able to inline cpu_capacity_min_read_u64()
> > fuse the load from tg->cap_clamp[CAP_CLAMP_MIN] with other accesses.
> > If min_capacity is used several times in the ensuing code, the compiler
> > could reload multiple times from tg->cap_clamp[CAP_CLAMP_MIN], which at
> > best might be a bit confusing.
>
> That's actually an interesting case, however I don't think it applies
> in this case since cpu_capacity_min_read_u64() is called only via
> a function poninter and thus it will never be inlined. isn't it?
>
> > > Adding Paul for his expert opinion on the matter ;)
> >
> > My personal approach is to use READ_ONCE() and WRITE_ONCE() unless
> > I can absolutely prove that the compiler cannot do any destructive
> > optimizations. And I not-infrequently find unsuspected opportunities
> > for destructive optimization in my own code. Your mileage may vary. ;-)
>
> I guess here the main concern from Joel is that such a pattern:
>
> u64 var = unsigned_int_value_from_memory;
>
> can result is a couple of "load from memory" operations.

Indeed it can. I first learned this the hard way in the early 1990s,
so 20-year-old compiler optimizations are quite capable of making this
sort of thing happen.

> In that case a similar:
>
> unsigned_int_left_value = new_unsigned_int_value;
>
> executed on a different thread can overlap with the previous memory
> read operations and ending up in "var" containing a not consistent
> value.
>
> Question is: can this really happen, given the data types in use?

So we have an updater changing the value of unsigned_int_left_value,
while readers in other threads are accessing it, correct? And you
are asking whether the compiler can optimize the updater so as to
mess up the readers, right?

One such optimization would be a byte-wise write, though I have no
idea why a compiler would do such a thing assuming that the variable
is reasonably sized and aligned. Another is that the compiler could
use the variable as temporary storage just before the assignment.
(You haven't told the compiler that anyone else is reading it, though
I don't know of this being done by production compilers.) A third is
that the compiler could fuse successive stores, which might or might
not be a problem, depending.

Probably more, but that should be a start. ;-)

Thanx, Paul

> Thanks! ;-)
>
> --
> #include <best/regards.h>
>
> Patrick Bellasi
>

2017-03-15 17:57:07

by Patrick Bellasi

[permalink] [raw]

Subject: Re: [RFC v3 1/5] sched/core: add capacity constraints to CPU controller

On 15-Mar 10:24, Paul E. McKenney wrote:
> On Wed, Mar 15, 2017 at 04:44:39PM +0000, Patrick Bellasi wrote:
> > On 15-Mar 09:10, Paul E. McKenney wrote:
> > > On Wed, Mar 15, 2017 at 06:20:28AM -0700, Joel Fernandes wrote:
> > > > On Wed, Mar 15, 2017 at 4:20 AM, Patrick Bellasi
> > > > <[email protected]> wrote:
> > > > > On 13-Mar 03:46, Joel Fernandes (Google) wrote:
> > > > >> On Tue, Feb 28, 2017 at 6:38 AM, Patrick Bellasi
> > > > >> <[email protected]> wrote:
> > > > >> > The CPU CGroup controller allows to assign a specified (maximum)
> > > > >> > bandwidth to tasks within a group, however it does not enforce any
> > > > >> > constraint on how such bandwidth can be consumed.
> > > > >> > With the integration of schedutil, the scheduler has now the proper
> > > > >> > information about a task to select the most suitable frequency to
> > > > >> > satisfy tasks needs.
> > > > >> [..]
> > > > >>
> > > > >> > +static u64 cpu_capacity_min_read_u64(struct cgroup_subsys_state *css,
> > > > >> > + struct cftype *cft)
> > > > >> > +{
> > > > >> > + struct task_group *tg;
> > > > >> > + u64 min_capacity;
> > > > >> > +
> > > > >> > + rcu_read_lock();
> > > > >> > + tg = css_tg(css);
> > > > >> > + min_capacity = tg->cap_clamp[CAP_CLAMP_MIN];
> > > > >>
> > > > >> Shouldn't the cap_clamp be accessed with READ_ONCE (and WRITE_ONCE in
> > > > >> the write path) to avoid load-tearing?
> > > > >
> > > > > tg->cap_clamp is an "unsigned int" and thus I would expect a single
> > > > > memory access to write/read it, isn't it? I mean: I do not expect the
> > > > > compiler "to mess" with these accesses.
> > > >
> > > > This depends on compiler and arch. I'm not sure if its in practice
> > > > these days an issue, but see section on 'load tearing' in
> > > > Documentation/memory-barriers.txt . If compiler decided to break down
> > > > the access to multiple accesses due to some reason, then might be a
> > > > problem.
> > >
> > > The compiler might also be able to inline cpu_capacity_min_read_u64()
> > > fuse the load from tg->cap_clamp[CAP_CLAMP_MIN] with other accesses.
> > > If min_capacity is used several times in the ensuing code, the compiler
> > > could reload multiple times from tg->cap_clamp[CAP_CLAMP_MIN], which at
> > > best might be a bit confusing.
> >
> > That's actually an interesting case, however I don't think it applies
> > in this case since cpu_capacity_min_read_u64() is called only via
> > a function poninter and thus it will never be inlined. isn't it?
> >
> > > > Adding Paul for his expert opinion on the matter ;)
> > >
> > > My personal approach is to use READ_ONCE() and WRITE_ONCE() unless
> > > I can absolutely prove that the compiler cannot do any destructive
> > > optimizations. And I not-infrequently find unsuspected opportunities
> > > for destructive optimization in my own code. Your mileage may vary. ;-)
> >
> > I guess here the main concern from Joel is that such a pattern:
> >
> > u64 var = unsigned_int_value_from_memory;
> >
> > can result is a couple of "load from memory" operations.
>
> Indeed it can. I first learned this the hard way in the early 1990s,
> so 20-year-old compiler optimizations are quite capable of making this
> sort of thing happen.
>
> > In that case a similar:
> >
> > unsigned_int_left_value = new_unsigned_int_value;
> >
> > executed on a different thread can overlap with the previous memory
> > read operations and ending up in "var" containing a not consistent
> > value.
> >
> > Question is: can this really happen, given the data types in use?
>
> So we have an updater changing the value of unsigned_int_left_value,
> while readers in other threads are accessing it, correct? And you
> are asking whether the compiler can optimize the updater so as to
> mess up the readers, right?
>
> One such optimization would be a byte-wise write, though I have no
> idea why a compiler would do such a thing assuming that the variable
> is reasonably sized and aligned. Another is that the compiler could
> use the variable as temporary storage just before the assignment.
> (You haven't told the compiler that anyone else is reading it, though
> I don't know of this being done by production compilers.) A third is
> that the compiler could fuse successive stores, which might or might
> not be a problem, depending.
>
> Probably more, but that should be a start. ;-)

Definitively yes! :)

Thanks for the detailed explanation, I'll add the READ_ONCE as
Joel suggested! +1

--
#include <best/regards.h>

Patrick Bellasi