2005-09-06 23:12:55

by Ashok Raj

[permalink] [raw]
Subject: Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode

On Tue, Sep 06, 2005 at 01:16:28AM +0200, Andi Kleen wrote:
> On Sat, Sep 03, 2005 at 02:33:30PM -0700, [email protected] wrote:
> >
> > From: Ashok Raj <[email protected]>
> >
> > Newly introduced physflat_* shares way too much with cluster with only a very
> > differences. So we introduce some common functions in that can be reused in
> > both cases.
> >
> > In addition the following are also fixed.
> > - Use of non-existent CONFIG_CPU_HOTPLUG option renamed to actual one in use.
> > - Removed comment that ACPI would provide a way to select this dynamically
> > since ACPI_CONFIG_HOTPLUG_CPU already exists that indicates platform support
> > for hotplug via ACPI. In addition CONFIG_HOTPLUG_CPU only indicates logical
> > offline/online which is even used by Suspend/Resume folks where the same
> > support (for no-broadcast) is required.
>
>
> (hmm did I reply to that? I think I did but my mailer seems to have
> lost the r flag. My apologies if it's a duplicate)
>
> I didn't like that one because it makes the code less readable than
> before imho. I did a separate patch for the CPU_HOTPLUG typo.

The code is less readable? Now iam confused. Attached the link to patch
below to refresh your memory.

http://marc.theaimsgroup.com/?l=linux-kernel&m=112293577309653&w=2

diffstat would show we have fewer lines ~40 less lines of code. physflat
basicaly copied/cloned some useful code in cluster and some from flat mode
genapic code.

I would have consolidated the code in the first place when you put the physflat
mode. Again this was just my habit, cant step over code bloat and duplication.

Which part of the code is unreadable to you? If you are happy with just renamed
functions with copied body of the code which is what physflat did, thats fine.

I was just puzzeled at the convoluted and less readable part of the code. If
there is something you like to point out, i would be happy to fix it.. or you
can if you prefer it that way.


--
Cheers,
Ashok Raj
- Open Source Technology Center


2005-09-09 17:01:47

by Zwane Mwaikambo

[permalink] [raw]
Subject: Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode

On Tue, 6 Sep 2005, Ashok Raj wrote:

> On Tue, Sep 06, 2005 at 01:16:28AM +0200, Andi Kleen wrote:
> > On Sat, Sep 03, 2005 at 02:33:30PM -0700, [email protected] wrote:
> > >
> > > From: Ashok Raj <[email protected]>
> > >
> > > Newly introduced physflat_* shares way too much with cluster with only a very
> > > differences. So we introduce some common functions in that can be reused in
> > > both cases.

On a slightly different topic, how come we're using physflat for hotplug
cpu?

-#ifndef CONFIG_CPU_HOTPLUG
/* In the CPU hotplug case we cannot use broadcast mode
because that opens a race when a CPU is removed.
- Stay at physflat mode in this case.
- It is bad to do this unconditionally though. Once
- we have ACPI platform support for CPU hotplug
- we should detect hotplug capablity from ACPI tables and
- only do this when really needed. -AK */
+ Stay at physflat mode in this case. - AK */
+#ifdef CONFIG_HOTPLUG_CPU
if (num_cpus <= 8)
genapic = &apic_flat;

Thanks,
Zwane

2005-09-09 20:46:08

by Ashok Raj

[permalink] [raw]
Subject: Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode

On Fri, Sep 09, 2005 at 10:07:28AM -0700, Zwane Mwaikambo wrote:
> On a slightly different topic, how come we're using physflat for hotplug
> cpu?
>
> -#ifndef CONFIG_CPU_HOTPLUG
> /* In the CPU hotplug case we cannot use broadcast mode
> because that opens a race when a CPU is removed.
> - Stay at physflat mode in this case.
> - It is bad to do this unconditionally though. Once
> - we have ACPI platform support for CPU hotplug
> - we should detect hotplug capablity from ACPI tables and
> - only do this when really needed. -AK */
> + Stay at physflat mode in this case. - AK */
> +#ifdef CONFIG_HOTPLUG_CPU
> if (num_cpus <= 8)
> genapic = &apic_flat;

What you say was true before this patch, (Although now that you point out i
realize the ifdef CONFIG_HOTPLUG_CPU is not required).

Think Andi is fixing this in his next drop to -mm*

When physflat was introduced, it also switched to use physflat mode for
#cpus <=8 when hotplug is enabled, since it doesnt use shortcuts and
so is also safer (although slower).

http://marc.theaimsgroup.com/?l=linux-kernel&m=112317686712929&w=2

The link above made using genapic_flat safer by using the
flat_send_IPI_mask(), and hence i switched back to using
logical flat mode when #cpus <=8, since that a little more efficient than
the send_IPI_mask_sequence() used in physflat mode.

In general we need

flat_mode - #cpus <= 8 (Hotplug defined or not, so we use mask version
for safety)

physflat or cluster_mode when #cpus >8.

If we choose physflat as default for #cpus <=8 (with hotplug) would make
IPI performance worse, since it would do one cpu at a time, and requires 2
writes per cpu for each IPI v.s just 2 for a flat mode mask version of the API.

--
Cheers,
Ashok Raj
- Open Source Technology Center

2005-09-10 00:30:24

by Andi Kleen

[permalink] [raw]
Subject: Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode

On Fri, Sep 09, 2005 at 10:07:28AM -0700, Zwane Mwaikambo wrote:
> On Tue, 6 Sep 2005, Ashok Raj wrote:
>
> > On Tue, Sep 06, 2005 at 01:16:28AM +0200, Andi Kleen wrote:
> > > On Sat, Sep 03, 2005 at 02:33:30PM -0700, [email protected] wrote:
> > > >
> > > > From: Ashok Raj <[email protected]>
> > > >
> > > > Newly introduced physflat_* shares way too much with cluster with only a very
> > > > differences. So we introduce some common functions in that can be reused in
> > > > both cases.
>
> On a slightly different topic, how come we're using physflat for hotplug
> cpu?

The original idea was to always use physflat mode for hotplug because
that does all the sequencing stuff and avoids the shortcut races.
But then Ashok decided it was better to add more ifdefs to flat mode
instead and I gave up protesting at some point.

-Andi

>
> -#ifndef CONFIG_CPU_HOTPLUG
> /* In the CPU hotplug case we cannot use broadcast mode
> because that opens a race when a CPU is removed.
> - Stay at physflat mode in this case.
> - It is bad to do this unconditionally though. Once
> - we have ACPI platform support for CPU hotplug
> - we should detect hotplug capablity from ACPI tables and
> - only do this when really needed. -AK */
> + Stay at physflat mode in this case. - AK */
> +#ifdef CONFIG_HOTPLUG_CPU
> if (num_cpus <= 8)
> genapic = &apic_flat;
>
> Thanks,
> Zwane
>

2005-09-10 01:44:46

by Zwane Mwaikambo

[permalink] [raw]
Subject: Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode

On Fri, 10 Sep 2005, Andi Kleen wrote:

> On Fri, Sep 09, 2005 at 10:07:28AM -0700, Zwane Mwaikambo wrote:
> > On Tue, 6 Sep 2005, Ashok Raj wrote:
> >
> > > On Tue, Sep 06, 2005 at 01:16:28AM +0200, Andi Kleen wrote:
> > > > On Sat, Sep 03, 2005 at 02:33:30PM -0700, [email protected] wrote:
> > > > >
> > > > > From: Ashok Raj <[email protected]>
> > > > >
> > > > > Newly introduced physflat_* shares way too much with cluster with only a very
> > > > > differences. So we introduce some common functions in that can be reused in
> > > > > both cases.
> >
> > On a slightly different topic, how come we're using physflat for hotplug
> > cpu?
>
> The original idea was to always use physflat mode for hotplug because
> that does all the sequencing stuff and avoids the shortcut races.
> But then Ashok decided it was better to add more ifdefs to flat mode
> instead and I gave up protesting at some point.

Ok so you wanted to segragate them, i can understand that, but didn't we
have a version which worked around the races by doing the same thing,
hotplug or not? Is this the one where you weren't pleased with the
supposed execution penalty?

Thanks,
Zwane

2005-09-11 16:37:50

by Zwane Mwaikambo

[permalink] [raw]
Subject: Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode

On Fri, 9 Sep 2005, Ashok Raj wrote:

> In general we need
>
> flat_mode - #cpus <= 8 (Hotplug defined or not, so we use mask version
> for safety)
>
> physflat or cluster_mode when #cpus >8.

I agree there.

> If we choose physflat as default for #cpus <=8 (with hotplug) would make
> IPI performance worse, since it would do one cpu at a time, and requires 2
> writes per cpu for each IPI v.s just 2 for a flat mode mask version of the API.

I don't see the benefit then :/ I certainly hope we don't go that route.

Thanks,
Zwane

2005-09-11 23:02:30

by Andi Kleen

[permalink] [raw]
Subject: Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode

On Sun, Sep 11, 2005 at 09:44:16AM -0700, Zwane Mwaikambo wrote:
> On Fri, 9 Sep 2005, Ashok Raj wrote:
>
> > In general we need
> >
> > flat_mode - #cpus <= 8 (Hotplug defined or not, so we use mask version
> > for safety)
> >
> > physflat or cluster_mode when #cpus >8.
>
> I agree there.
>
> > If we choose physflat as default for #cpus <=8 (with hotplug) would make
> > IPI performance worse, since it would do one cpu at a time, and requires 2
> > writes per cpu for each IPI v.s just 2 for a flat mode mask version of the API.
>
> I don't see the benefit then :/ I certainly hope we don't go that route.

I originally made that objection, but Ashok then did some benchmarks
that showed essentially no difference. I can see the point - it's
likely that an access to the local APIC (which is in the CPU) is fast
(we know it is) and all the time is dominated by sending the
requests over the wires between the CPUs. So it shouldn't matter
much if you use sequence mode or masks.

That is why I changed my mind and just made physflat default for the hotplug
case (which will be essentially everywhere because I expect most kernels
to have hotplug enabled in the future)

It's a bit of a mess in mm right now because me and Ashok have been fixing
similar problems in a different way (e.g. the patch in flat to
use the sequence sending path is also not needed anymore with that)
Need to clean this up a bit.

Handling it properly for i386 is also still needed e.g. the older physflat
patch I did needs to be merged with bigsmp and cleaned up a bit.
But I don't have time to do it before monday so it'll miss the 2.6.14
window anyways.

-Andi

2005-09-12 22:24:07

by Ashok Raj

[permalink] [raw]
Subject: Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode

On Mon, Sep 12, 2005 at 01:02:20AM +0200, Andi Kleen wrote:
> On Sun, Sep 11, 2005 at 09:44:16AM -0700, Zwane Mwaikambo wrote:
> > On Fri, 9 Sep 2005, Ashok Raj wrote:
> >
> > > IPI performance worse, since it would do one cpu at a time, and requires 2
> > > writes per cpu for each IPI v.s just 2 for a flat mode mask version of the API.
> >
> > I don't see the benefit then :/ I certainly hope we don't go that route.
>
> I originally made that objection, but Ashok then did some benchmarks
> that showed essentially no difference. I can see the point - it's


Just a minor difference is that for a 8 CPU system

Using IPI shortcut uses just 1 write to local apic
Using mask version of broadcast uses 2 writes to local apic

The stat showed no significant difference when we do the 1 extra write. But
that is for the whole system.

When we use send_IPI_mask_sequence() we use 14 writes to (two writes
for each CPU that we need to target a IPI).

Using the same stats i used earlier, i see about 0x200 - 0x1000 extra cycles
when using mask_sequence() on certain long runs about 100K samples.

We should probably remove the !HOTPLUG case and just use the mask version
for all cases <=8 CPUS, use physflat or the cluster mode for >8cpus as
the case may be, instead of defaulting to sequence_IPI which seems
a little overkill for the intended purpose.

> likely that an access to the local APIC (which is in the CPU) is fast
> (we know it is) and all the time is dominated by sending the
> requests over the wires between the CPUs. So it shouldn't matter
> much if you use sequence mode or masks.
>
> That is why I changed my mind and just made physflat default for the hotplug
> case (which will be essentially everywhere because I expect most kernels
> to have hotplug enabled in the future)
>
> It's a bit of a mess in mm right now because me and Ashok have been fixing
> similar problems in a different way (e.g. the patch in flat to
> use the sequence sending path is also not needed anymore with that)
> Need to clean this up a bit.
>
> Handling it properly for i386 is also still needed e.g. the older physflat
> patch I did needs to be merged with bigsmp and cleaned up a bit.
> But I don't have time to do it before monday so it'll miss the 2.6.14
> window anyways.
>
> -Andi

--
Cheers,
Ashok Raj
- Open Source Technology Center

2005-09-13 08:11:00

by Andi Kleen

[permalink] [raw]
Subject: Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode

> We should probably remove the !HOTPLUG case and just use the mask version
> for all cases <=8 CPUS, use physflat or the cluster mode for >8cpus as
> the case may be, instead of defaulting to sequence_IPI which seems
> a little overkill for the intended purpose.

Or just always use physflat and remove the logical flat case?
That seems cleanest to me. Any objections?

-Andi

2005-09-13 14:46:43

by Zwane Mwaikambo

[permalink] [raw]
Subject: Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode

On Tue, 13 Sep 2005, Andi Kleen wrote:

> > We should probably remove the !HOTPLUG case and just use the mask version
> > for all cases <=8 CPUS, use physflat or the cluster mode for >8cpus as
> > the case may be, instead of defaulting to sequence_IPI which seems
> > a little overkill for the intended purpose.
>
> Or just always use physflat and remove the logical flat case?
> That seems cleanest to me. Any objections?

My objection is the number of APIC writes required to issue IPIs to a
group of processors, however i do understand that it would help
maintainability and testing coverage if we reduce the number of operating
modes, are you proposing physflat for _everything_ ?

Thanks,
Zwane

2005-09-13 15:31:11

by Andi Kleen

[permalink] [raw]
Subject: Re: [patch 13/14] x86_64: Use common functions in cluster and physflat mode

On Tue, Sep 13, 2005 at 07:53:03AM -0700, Zwane Mwaikambo wrote:
> On Tue, 13 Sep 2005, Andi Kleen wrote:
>
> > > We should probably remove the !HOTPLUG case and just use the mask version
> > > for all cases <=8 CPUS, use physflat or the cluster mode for >8cpus as
> > > the case may be, instead of defaulting to sequence_IPI which seems
> > > a little overkill for the intended purpose.
> >
> > Or just always use physflat and remove the logical flat case?
> > That seems cleanest to me. Any objections?
>
> My objection is the number of APIC writes required to issue IPIs to a
> group of processors, however i do understand that it would help

local APIC accesses are cheap, they don't go over any external bus.
It's not like a PCI cycle.

> maintainability and testing coverage if we reduce the number of operating
> modes, are you proposing physflat for _everything_ ?

On everything that's currently run by flat yes. Possibly the others
too when tested.

-Andi