2013-07-23 17:42:12

by Rakib Mullick

Subject: [PATCH v2] sched: update_top_cache_domain only at the times of building sched domain.

Currently, update_top_cache_domain() is called whenever a sched domain is built or destroyed. But the following
call path shows that both calls happen on the same path, so update_top_cache_domain() can be skipped while destroying
sched domains and run only when building them.

partition_sched_domains()
	detach_destroy_domain()
		cpu_attach_domain()
			update_top_cache_domain()
	build_sched_domains()
		cpu_attach_domain()
			update_top_cache_domain()

Changes since v1: use sd to determine when to skip, courtesy PeterZ

Signed-off-by: Rakib Mullick <[email protected]>
---

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index b7c32cb..387fb66 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5138,7 +5138,8 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
 	rcu_assign_pointer(rq->sd, sd);
 	destroy_sched_domains(tmp, cpu);
 
-	update_top_cache_domain(cpu);
+	if (sd)
+		update_top_cache_domain(cpu);
 }
 
 /* cpus with isolated domains */



2013-07-24 03:26:43

by Michael Wang

Subject: Re: [PATCH v2] sched: update_top_cache_domain only at the times of building sched domain.

Hi, Rakib

On 07/24/2013 01:42 AM, Rakib Mullick wrote:
> Currently, update_top_cache_domain() is called whenever schedule domain is built or destroyed. But, the following
> callpath shows that they're at the same callpath and can be avoided update_top_cache_domain() while destroying schedule
> domain and update only at the times of building schedule domains.
>
> partition_sched_domains()
> detach_destroy_domain()
> cpu_attach_domain()
> update_top_cache_domain()

IMHO, cpu_attach_domain() and update_top_cache_domain() should be
paired; the patch below will open a window in which 'rq->sd == NULL' while
'sd_llc != NULL', won't it?

I don't think we have the promise that before we rebuild the stuff
correctly, no one will utilize 'sd_llc'...

Furthermore, what will happen if the old sd is freed after the next RCU
work cycle while 'sd_llc' still holds a reference to it for some victim
to pick up?

Thus I suggest we leave things untouched, since the benefit we would get
is too small to be worth the risk...
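
The concern about a late free can be sketched with a small user-space model. This is only an illustration under stated assumptions: it assumes the old domain is released once an RCU grace period has passed, and `struct cpu_state`, `detach()`, `grace_period_elapsed()` and `late_reader_hits_freed_sd()` are hypothetical names, not kernel functions. Nothing is really freed here; a flag stands in for the free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct sched_domain (assumption, not the kernel layout). */
struct sched_domain {
	bool queued_for_free;	/* models the deferred free having been requested */
	bool freed;		/* set once the modelled grace period has ended */
};

/* State of one CPU in this model. */
struct cpu_state {
	struct sched_domain *rq_sd;	/* models rq->sd */
	struct sched_domain *sd_llc;	/* models the per-CPU sd_llc pointer */
};

/*
 * Detach the CPU from its domain. With update_llc == false this models
 * the proposed patch, which leaves sd_llc untouched when sd is NULL.
 */
static void detach(struct cpu_state *c, bool update_llc)
{
	struct sched_domain *old = c->rq_sd;

	c->rq_sd = NULL;
	if (old)
		old->queued_for_free = true;
	if (update_llc)
		c->sd_llc = NULL;	/* no domain attached, so no LLC domain */
}

/* Models the end of a grace period: queued domains are now gone. */
static void grace_period_elapsed(struct sched_domain *sd)
{
	if (sd->queued_for_free)
		sd->freed = true;
}

/* True if a reader that starts after the grace period would touch a freed sd. */
bool late_reader_hits_freed_sd(bool update_llc)
{
	struct sched_domain old_sd = { false, false };
	struct cpu_state c = { &old_sd, &old_sd };

	detach(&c, update_llc);
	assert(c.rq_sd == NULL);
	grace_period_elapsed(&old_sd);

	/* A NULL check alone does not protect against a stale pointer. */
	return c.sd_llc != NULL && c.sd_llc->freed;
}
```

In this model `late_reader_hits_freed_sd(true)` stays safe because the reader sees NULL, while `late_reader_hits_freed_sd(false)` reports the stale pointer.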

Regards,
Michael Wang

> build_sched_domains()
> cpu_attach_domain()
> update_top_cache_domain()
>
> Changes since v1: use sd to determine when to skip, courtesy PeterZ
>
> Signed-off-by: Rakib Mullick <[email protected]>
> ---
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index b7c32cb..387fb66 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5138,7 +5138,8 @@ cpu_attach_domain(struct sched_domain *sd, struct root_domain *rd, int cpu)
> rcu_assign_pointer(rq->sd, sd);
> destroy_sched_domains(tmp, cpu);
>
> - update_top_cache_domain(cpu);
> + if (sd)
> + update_top_cache_domain(cpu);
> }
>
> /* cpus with isolated domains */

2013-07-24 08:01:37

by Rakib Mullick

Subject: Re: [PATCH v2] sched: update_top_cache_domain only at the times of building sched domain.

On Wed, Jul 24, 2013 at 9:26 AM, Michael Wang
<[email protected]> wrote:
> Hi, Rakib
>
> On 07/24/2013 01:42 AM, Rakib Mullick wrote:
>> Currently, update_top_cache_domain() is called whenever schedule domain is built or destroyed. But, the following
>> callpath shows that they're at the same callpath and can be avoided update_top_cache_domain() while destroying schedule
>> domain and update only at the times of building schedule domains.
>>
>> partition_sched_domains()
>> detach_destroy_domain()
>> cpu_attach_domain()
>> update_top_cache_domain()
>
> IMHO, cpu_attach_domain() and update_top_cache_domain() should be
> paired, below patch will open a window which 'rq->sd == NULL' while
> 'sd_llc != NULL', isn't it?
>
> I don't think we have the promise that before we rebuild the stuff
> correctly, no one will utilize 'sd_llc'...
>
I never said that. My point is different. partition_sched_domains() works as follows:

- destroy the existing sched domains (if the previous domain and the new
domain aren't the same)
- build the new partition

While doing the first step, it needs to detach all the CPUs in that domain.
Detaching makes each CPU fall back to the default root domain. In this
context (which I've proposed to skip), updating the top cache domain takes
the highest flag domain to set up sd_llc_id, or uses the CPU itself.

Whatever is done above (updating the top cache domain) gets overwritten
while building the new partition.
So why do it beforehand? I hope you understand my point.
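
The update step described above (take the highest domain carrying the cache-sharing flag, or fall back to the CPU itself) can be sketched in user space. This is a simplified model rather than the kernel source: the struct layout, the plain arrays standing in for per-CPU variables, the `first_cpu` field, the flag value and the helper `llc_id_after_detach()` are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4
#define SD_SHARE_PKG_RESOURCES 0x1	/* illustrative value, not the kernel's */

/* Simplified stand-in for struct sched_domain. */
struct sched_domain {
	struct sched_domain *parent;	/* next level up, NULL at the top */
	int flags;
	int first_cpu;			/* models the first CPU in the domain's span */
};

/* Per-CPU state, modelled as plain arrays. */
static struct sched_domain *rq_sd[NR_CPUS];	/* models rq->sd */
static struct sched_domain *sd_llc[NR_CPUS];
static int sd_llc_id[NR_CPUS];

/* Walk up from the base domain while the flag is set; return the highest such level. */
static struct sched_domain *highest_flag_domain(int cpu, int flag)
{
	struct sched_domain *sd, *hsd = NULL;

	for (sd = rq_sd[cpu]; sd; sd = sd->parent) {
		if (!(sd->flags & flag))
			break;
		hsd = sd;
	}
	return hsd;
}

void update_top_cache_domain(int cpu)
{
	struct sched_domain *sd = highest_flag_domain(cpu, SD_SHARE_PKG_RESOURCES);
	int id = cpu;	/* with no cache-sharing domain, the CPU is its own id */

	if (sd)
		id = sd->first_cpu;

	sd_llc[cpu] = sd;
	sd_llc_id[cpu] = id;
}

/* Hypothetical helper: the sd_llc_id a CPU ends up with after being detached. */
int llc_id_after_detach(int cpu)
{
	rq_sd[cpu] = NULL;
	update_top_cache_domain(cpu);
	assert(sd_llc[cpu] == NULL);
	return sd_llc_id[cpu];
}
```

Note that in this model the update after a detach is not a no-op: it resets `sd_llc` to NULL and `sd_llc_id` to the CPU number, which differs from the values a later rebuild will store.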

Thanks,
Rakib.

2013-07-24 08:34:52

by Michael Wang

Subject: Re: [PATCH v2] sched: update_top_cache_domain only at the times of building sched domain.

On 07/24/2013 04:01 PM, Rakib Mullick wrote:
> On Wed, Jul 24, 2013 at 9:26 AM, Michael Wang
> <[email protected]> wrote:
>> Hi, Rakib
>>
>> On 07/24/2013 01:42 AM, Rakib Mullick wrote:
>>> Currently, update_top_cache_domain() is called whenever schedule domain is built or destroyed. But, the following
>>> callpath shows that they're at the same callpath and can be avoided update_top_cache_domain() while destroying schedule
>>> domain and update only at the times of building schedule domains.
>>>
>>> partition_sched_domains()
>>> detach_destroy_domain()
>>> cpu_attach_domain()
>>> update_top_cache_domain()
>>
>> IMHO, cpu_attach_domain() and update_top_cache_domain() should be
>> paired, below patch will open a window which 'rq->sd == NULL' while
>> 'sd_llc != NULL', isn't it?
>>
>> I don't think we have the promise that before we rebuild the stuff
>> correctly, no one will utilize 'sd_llc'...
>>
> I never said it. My point is different. partition_sched_domain works as -
>
> - destroying existing schedule domain (if previous domain and new
> domain aren't same)
> - building new partition
>
> while doing the first it needs to detach all the cpus on that domain.
> By detaching what it does,
> it fall backs to it's root default domain. In this context (which i've
> proposed to skip), by means
> of updating top cache domain it takes the highest flag domain to setup
> it's sd_llc_id or cpu itself.
>
> Whatever done above gets overwritten (updating top cache domain),
> while building new partition.
> Then, why we did that before? Hope you understand my point.

I think you missed this in PeterZ's suggestion:

- cpu_attach_domain(NULL, &def_root_domain, i);

With this change, it will be safe since you will still get an un-freed
sd, although it's an old one.

But your patch runs the risk of getting a freed sd, since you make
'sd_llc' wrong for a period of time (between destroy and rebuild), IMO.

I guess I get your point: you are trying to save one update since
you think it is done twice. But the results of the two updates are
actually different; it's not a redo, it's there to keep 'sd_llc' in sync
with 'rq->sd'.

Regards,
Michael Wang


>
> Thanks,
> Rakib.

2013-07-24 10:49:48

by Peter Zijlstra

Subject: Re: [PATCH v2] sched: update_top_cache_domain only at the times of building sched domain.

On Wed, Jul 24, 2013 at 04:34:39PM +0800, Michael Wang wrote:
> But your patch will run the risk to get a freed sd, since you make
> 'sd_llc' wrong for a period of time (between destroy and rebuild) IMO.
>
> I guess I get you point, you are trying to save one time update since
> you think this will be done twice, but actually the result of this two
> time update was different, it's not redo and it's in order to sync
> 'sd_llc' with 'rq->sd'.

Michael is right, you cannot skip update_top_cache_domain() when
destroying things.

2013-07-24 13:57:22

by Rakib Mullick

Subject: Re: [PATCH v2] sched: update_top_cache_domain only at the times of building sched domain.

On Wed, Jul 24, 2013 at 2:34 PM, Michael Wang
<[email protected]> wrote:
> On 07/24/2013 04:01 PM, Rakib Mullick wrote:
>> On Wed, Jul 24, 2013 at 9:26 AM, Michael Wang
>> <[email protected]> wrote:
>>> Hi, Rakib
>>>
>>> On 07/24/2013 01:42 AM, Rakib Mullick wrote:
>>>> Currently, update_top_cache_domain() is called whenever schedule domain is built or destroyed. But, the following
>>>> callpath shows that they're at the same callpath and can be avoided update_top_cache_domain() while destroying schedule
>>>> domain and update only at the times of building schedule domains.
>>>>
>>>> partition_sched_domains()
>>>> detach_destroy_domain()
>>>> cpu_attach_domain()
>>>> update_top_cache_domain()
>>>
>>> IMHO, cpu_attach_domain() and update_top_cache_domain() should be
>>> paired, below patch will open a window which 'rq->sd == NULL' while
>>> 'sd_llc != NULL', isn't it?
>>>
>>> I don't think we have the promise that before we rebuild the stuff
>>> correctly, no one will utilize 'sd_llc'...
>>>
>> I never said it. My point is different. partition_sched_domain works as -
>>
>> - destroying existing schedule domain (if previous domain and new
>> domain aren't same)
>> - building new partition
>>
>> while doing the first it needs to detach all the cpus on that domain.
>> By detaching what it does,
>> it fall backs to it's root default domain. In this context (which i've
>> proposed to skip), by means
>> of updating top cache domain it takes the highest flag domain to setup
>> it's sd_llc_id or cpu itself.
>>
>> Whatever done above gets overwritten (updating top cache domain),
>> while building new partition.
>> Then, why we did that before? Hope you understand my point.
>
> I think you missed this in PeterZ's suggestion:
>
> - cpu_attach_domain(NULL, &def_root_domain, i);
>
> With this change, it will be safe since you will still get an un-freed
> sd, although it's an old one.
>
I never meant it and clearly I missed it. If you remove
cpu_attach_domain(), then detach_destroy_domain() becomes meaningless.
And I don't have any intent to remove cpu_attach_domain from
detach_destroy_domain() at all.

> But your patch will run the risk to get a freed sd, since you make
> 'sd_llc' wrong for a period of time (between destroy and rebuild) IMO.
>
Building 'sd_llc' depends on the sched domain. If you don't have an sd,
sd_llc will point to NULL and sd_llc_id is
the CPU itself. Since we're about to reconstruct it, for the time
being it doesn't matter, because we're building
it again. Now, please note what you're saying; in your last mail
you said:

"I don't think we have the promise that before we rebuild the stuff
correctly, no one will utilize 'sd_llc'..."

If that is the case, then we shouldn't worry about it at all. And the
comments above (the one I've quoted from the previous mail and the one
I'm replying to here) are self-contradictory.

> I guess I get you point, you are trying to save one time update since
> you think this will be done twice, but actually the result of this two
> time update was different, it's not redo and it's in order to sync
> 'sd_llc' with 'rq->sd'.
>
Yes, you got my point now, but I don't understand yours. Anyway,
I'm not going to argue about this
anymore; this stuff isn't much of an issue, but removing this sort of
stuff is typical in kernel development.

Thanks,
Rakib.

2013-07-25 02:50:16

by Michael Wang

Subject: Re: [PATCH v2] sched: update_top_cache_domain only at the times of building sched domain.

On 07/24/2013 06:49 PM, Peter Zijlstra wrote:
> On Wed, Jul 24, 2013 at 04:34:39PM +0800, Michael Wang wrote:
>> But your patch will run the risk to get a freed sd, since you make
>> 'sd_llc' wrong for a period of time (between destroy and rebuild) IMO.
>>
>> I guess I get you point, you are trying to save one time update since
>> you think this will be done twice, but actually the result of this two
>> time update was different, it's not redo and it's in order to sync
>> 'sd_llc' with 'rq->sd'.
>
> Michael is right, you cannot skip update_top_cache_domain() when
> destroying things.

Thanks for the confirmation :)

Regards,
Michael Wang


2013-07-25 03:15:36

by Michael Wang

Subject: Re: [PATCH v2] sched: update_top_cache_domain only at the times of building sched domain.

On 07/24/2013 09:57 PM, Rakib Mullick wrote:
> On Wed, Jul 24, 2013 at 2:34 PM, Michael Wang
[snip]
>>
>> I think you missed this in PeterZ's suggestion:
>>
>> - cpu_attach_domain(NULL, &def_root_domain, i);
>>
>> With this change, it will be safe since you will still get an un-freed
>> sd, although it's an old one.
>>
> I never meant it and clearly I missed it. If you remove
> cpu_attach_domain(), then detach_destroy_domain() becomes meaningless.
> And I don't have any intent to remove cpu_attach_domain from
> detach_destroy_domain() at all.
>
>> But your patch will run the risk to get a freed sd, since you make
>> 'sd_llc' wrong for a period of time (between destroy and rebuild) IMO.
>>

OK, allow me to try to explain it again; I hope it will be clearer
this time...

> Building 'sd_llc' depends on schedule domain. If you don't have sd,
> sd_llc will point to NULL and sd_llc_id is
> the CPU itself. Since, we're trying to re-construing so for this time
> being it doesn't matter, cause we're building
> it again.

It does matter: although we build it again, we need to keep things in sync
at every point in time in the SMP world.

> Now, please just note what you're saying, on last thread
> you've said -
>
> "I don't think we have the promise that before we rebuild the stuff
> correctly, no one will utilize 'sd_llc'..."
>
> If that is the case, then we shouldn't worry about it at all. And this
> above comments (from previous thread I've quoted and this thread I'm
> replying) they're just self contradictory.

Let's draw a picture:

	destroy
		cpu_attach_domain(NULL)			//cad_A
			update_top_cache_domain()	//utcd_A

	WINDOW		//begin after last rcu_read_unlock()
			//end after next rcu_read_lock()

	build
		cpu_attach_domain(new_sd)		//cad_B
			update_top_cache_domain()	//utcd_B

Now in the old world, what we have is:
1. in 'utcd_A', 'sd_llc' is set to NULL since old_sd was destroyed in
'cad_A'.
2. thus during WINDOW, both 'rq->sd' and 'sd_llc' are NULL.
3. in 'utcd_B', 'sd_llc' is updated to the new 'highest cache-share sd'
since new_sd was attached in 'cad_B'.

Now with your patch applied, what will happen is:
1. 'utcd_A' no longer happens, although the sd that 'sd_llc' points to was
destroyed in 'cad_A'.
2. thus during WINDOW, 'rq->sd' is NULL while 'sd_llc' is the destroyed
'old highest cache-share sd'.
3. in 'utcd_B', 'sd_llc' is updated to the new 'highest cache-share sd'
since new_sd was attached in 'cad_B'.

It seems both end up with the same 'sd_llc', but your patch makes
'sd_llc' point to a destroyed sd during the WINDOW.

And I said:

"I don't think we have the promise that before we rebuild the stuff
correctly, no one will utilize 'sd_llc'..."

By which I mean someone may use 'sd_llc' during the WINDOW. In the old
world that is safe, since they will get NULL; with your patch it is unsafe,
since they get an sd that is being freed.

That's the risk I'm concerned about, and that's my point.
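
The two three-step sequences above can be traced with a small user-space model. It is a sketch under simplifying assumptions, not kernel code: there is a single CPU and a single domain level (so the attached domain is also the cache-sharing one), destruction is a flag set immediately rather than a deferred free, and `struct cpu_state`, `struct snapshot`, the `skip_on_null` parameter and `run_rebuild()` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for struct sched_domain. */
struct sched_domain {
	bool destroyed;		/* set when the domain is handed off for destruction */
};

struct cpu_state {
	struct sched_domain *rq_sd;	/* models rq->sd */
	struct sched_domain *sd_llc;	/* models the per-CPU sd_llc pointer */
};

/* What the model records during WINDOW and after the rebuild. */
struct snapshot {
	bool window_rq_sd_null;
	bool window_llc_null;
	bool window_llc_destroyed;
	bool final_llc_is_new;
};

/* One-level model: the attached domain is also the LLC domain. */
static void update_top_cache_domain(struct cpu_state *c)
{
	c->sd_llc = c->rq_sd;
}

/* skip_on_null == true selects the behaviour of the proposed patch. */
static void cpu_attach_domain(struct cpu_state *c, struct sched_domain *sd,
			      bool skip_on_null)
{
	struct sched_domain *old = c->rq_sd;

	c->rq_sd = sd;
	if (old)
		old->destroyed = true;

	if (sd || !skip_on_null)
		update_top_cache_domain(c);
}

struct snapshot run_rebuild(bool skip_on_null)
{
	struct sched_domain old_sd = { false }, new_sd = { false };
	struct cpu_state c = { &old_sd, &old_sd };
	struct snapshot s;

	cpu_attach_domain(&c, NULL, skip_on_null);	/* cad_A, then utcd_A unless skipped */

	/* WINDOW: what a reader could observe between destroy and build */
	s.window_rq_sd_null = (c.rq_sd == NULL);
	s.window_llc_null = (c.sd_llc == NULL);
	s.window_llc_destroyed = (c.sd_llc != NULL && c.sd_llc->destroyed);

	cpu_attach_domain(&c, &new_sd, skip_on_null);	/* cad_B, utcd_B */
	assert(!new_sd.destroyed);
	s.final_llc_is_new = (c.sd_llc == &new_sd);

	return s;
}
```

Calling `run_rebuild(false)` models the old world and `run_rebuild(true)` the patched one. Both finish with `sd_llc` pointing at the new domain; only the state observed during WINDOW differs.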

>
>> I guess I get you point, you are trying to save one time update since
>> you think this will be done twice, but actually the result of this two
>> time update was different, it's not redo and it's in order to sync
>> 'sd_llc' with 'rq->sd'.
>>
> Yes, you got my point now, but I don't understand your points. Anyway,
> I'm not going to argue with this
> anymore, this stuff isn't much of an issue, but removing this sorts of
> stuff is typical in kernel development.

I'm not arguing, actually there is nothing to argue about... I'm just trying
to explain what is wrong IMO; if I failed to, then I can only blame my poor
writing skills...

Regards,
Michael Wang

>
> Thanks,
> Rakib.