The following patch...
[PATCH 1/1] x86: replace cpu_up hard-coded mdelay with variable
enables reducing cpu_up() time by 10ms on modern systems.
This means that for every processor in the system,
boot-time and resume-time can be reduced by 10ms per-processor.
Once this patch is accepted, I'll send a subsequent patch
to update the default delay, as appropriate.
thanks,
Len Brown, Intel Open Source Technology Center
From: Len Brown <[email protected]>
Default behavior unchanged.
cpu_up() has a hard-coded mdelay(10). Change that to a variable,
with default CONFIG_X86_INIT_MDELAY
and a boot-time override, "cpu_init_mdelay=N"
This patch adds mechanism without changing default policy.
Default policy will be changed in a subsequent patch.
Signed-off-by: Len Brown <[email protected]>
---
Documentation/kernel-parameters.txt | 7 +++++++
arch/x86/Kconfig | 15 +++++++++++++++
arch/x86/kernel/smpboot.c | 16 +++++++++++++++-
3 files changed, 37 insertions(+), 1 deletion(-)
diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index bfcb1a6..ec98968 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -737,6 +737,13 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
cpuidle.off=1 [CPU_IDLE]
disable the cpuidle sub-system
+ cpu_init_mdelay=N
+ [X86] Delay for N millisec between assert and de-assert
+ of APIC INIT to start processor. In most configurations,
+ this occurs once for every CPU upon boot, and online,
+ such as resume from suspend and resume from hibernate.
+ Default: 10
+
cpcihp_generic= [HW,PCI] Generic port I/O CompactPCI driver
Format:
<first_slot>,<last_slot>,<port>,<enum_bit>[,<debug>]
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b7d31ca..d2a91da 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -884,6 +884,21 @@ config SCHED_MC
making when dealing with multi-core CPU chips at a cost of slightly
increased overhead in some places. If unsure say N here.
+config X86_INIT_MDELAY
+ int "Milliseconds to wait to de-assert INIT upon CPU startup" if SMP
+ range 0 10
+ default "10"
+ ---help---
+ This allows you to specify how long the BSP will delay
+ between asserting INIT and de-asserting INIT when starting another CPU.
+ This delay is paid on each cpu-online operation, including boot-time,
+ resume from suspend, and resume from hibernate. The the value of
+ 10 millisec was specified by Multi-Processor Spec 1.4 in 1997.
+ But modern hardware does not require any delay.
+
+ The kernel boot-time parameter "cpu_init_mdelay="
+ will over-ride this build-time default.
+
source "kernel/Kconfig.preempt"
config UP_LATE_INIT
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index febc6aa..973621f 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -555,6 +555,20 @@ wakeup_secondary_cpu_via_nmi(int apicid, unsigned long start_eip)
return (send_status | accept_status);
}
+unsigned int x86_init_mdelay = CONFIG_X86_INIT_MDELAY;
+static int __init cpu_init_mdelay(char *str)
+{
+ unsigned int tmp;
+
+ get_option(&str, &tmp);
+ pr_info("x86_init_mdelay set to %d, was %d", tmp, x86_init_mdelay);
+ x86_init_mdelay = tmp;
+ return 0;
+}
+
+early_param("cpu_init_mdelay", cpu_init_mdelay);
+
+
static int
wakeup_secondary_cpu_via_init(int phys_apicid, unsigned long start_eip)
{
@@ -586,7 +600,7 @@ wakeup_secondary_cpu_via_init(int phys_apicid, unsigned long start_eip)
pr_debug("Waiting for send to finish...\n");
send_status = safe_apic_wait_icr_idle();
- mdelay(10);
+ mdelay(x86_init_mdelay);
pr_debug("Deasserting INIT\n");
--
2.4.0.rc1
* Len Brown <[email protected]> wrote:
> +config X86_INIT_MDELAY
> + int "Milliseconds to wait to de-assert INIT upon CPU startup" if SMP
> + range 0 10
> + default "10"
> + ---help---
> + This allows you to specify how long the BSP will delay
> + between asserting INIT and de-asserting INIT when starting another CPU.
> + This delay is paid on each cpu-online operation, including boot-time,
> + resume from suspend, and resume from hibernate. The the value of
> + 10 millisec was specified by Multi-Processor Spec 1.4 in 1997.
> + But modern hardware does not require any delay.
What's the cutoff for 'modern hardware' - which CPUs stopped requiring
the delay?
Also, is there any public document where the no-delay is specified for
'modern hardware'?
Thanks,
Ingo
* Len Brown <[email protected]> wrote:
> The following patch...
>
> [PATCH 1/1] x86: replace cpu_up hard-coded mdelay with variable
>
> enables reducing cpu_up() time by 10ms on modern systems.
>
> This means that for every processor in the system,
> boot-time and resume-time can be reduced by 10ms per-processor.
>
> Once this patch is accepted, I'll send a subsequent patch
> to update the default delay, as appropriate.
>
> thanks,
> Len Brown, Intel Open Source Technology Center
So instead of playing games with an ancient delay, I'd suggest we
install the 10 msec INIT assertion wait as a platform quirk instead,
and activate it for all CPUs/systems that we think might need it, with
a sufficiently robust and future-proof quirk cutoff condition.
New systems won't have the quirk active and thus won't have to have
this delay configurable either.
Thanks,
Ingo
> What's the cutoff for 'modern hardware' - which CPUs stopped requiring
> the delay?
This is the topic of ongoing research, and I'm not ready to send
the patch setting a new default until I've heard back from a few more HW people.
Every system I've tested appears to work with delay 0.
Were I to guess, I'd venture that every
system that runs an X86_64 kernel might count as "modern" -- even
the 2005 AMD Turion laptop I've got in the bone pile.
> Also, is there any public document where the no-delay is specified for
> 'modern hardware'?
Unfortunately only the converse is true. There is an ancient document specifying
that the delay may be necessary.
cheers,
-Len
* Brown, Len <[email protected]> wrote:
> > What's the cutoff for 'modern hardware' - which CPUs stopped requiring
> > the delay?
>
> This is the topic of ongoing research, and I'm not ready to send
> the patch setting a new default until I've heard back from a few more HW people.
>
> Every system I've tested appears to work with delay 0.
> Were I to guess, I'd venture that every
> system that runs an X86_64 kernel might count as "modern" -- even
> the 2005 AMD Turion laptop I've got in the bone pile.
Could we use the apic version as a cutoff perhaps?
It would be nice to 'automatically' include modern 32-bit x86 systems
as well.
Any failure here would be relatively easy to bisect to, so we might as
well guess a bit and refine the quirk condition if needed?
Thanks,
Ingo
On Sat, Apr 18, 2015 at 8:53 PM, Len Brown <[email protected]> wrote:
> From: Len Brown <[email protected]>
>
[...]
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index b7d31ca..d2a91da 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -884,6 +884,21 @@ config SCHED_MC
> making when dealing with multi-core CPU chips at a cost of slightly
> increased overhead in some places. If unsure say N here.
>
> +config X86_INIT_MDELAY
> + int "Milliseconds to wait to de-assert INIT upon CPU startup" if SMP
> + range 0 10
> + default "10"
> + ---help---
> + This allows you to specify how long the BSP will delay
> + between asserting INIT and de-asserting INIT when starting another CPU.
> + This delay is paid on each cpu-online operation, including boot-time,
> + resume from suspend, and resume from hibernate. The the value of
Happened to notice a 2x "The the..." when I came across this in linux-next
today and was reading the help out of curiousity...
P.
--
> + 10 millisec was specified by Multi-Processor Spec 1.4 in 1997.
> + But modern hardware does not require any delay.
> +
> + The kernel boot-time parameter "cpu_init_mdelay="
> + will over-ride this build-time default.
> +
> source "kernel/Kconfig.preempt"
>
> config UP_LATE_INIT
> diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
> index febc6aa..973621f 100644
> --- a/arch/x86/kernel/smpboot.c
> +++ b/arch/x86/kernel/smpboot.c
> @@ -555,6 +555,20 @@ wakeup_secondary_cpu_via_nmi(int apicid, unsigned long start_eip)
> return (send_status | accept_status);
> }
>
> +unsigned int x86_init_mdelay = CONFIG_X86_INIT_MDELAY;
> +static int __init cpu_init_mdelay(char *str)
> +{
> + unsigned int tmp;
> +
> + get_option(&str, &tmp);
> + pr_info("x86_init_mdelay set to %d, was %d", tmp, x86_init_mdelay);
> + x86_init_mdelay = tmp;
> + return 0;
> +}
> +
> +early_param("cpu_init_mdelay", cpu_init_mdelay);
> +
> +
> static int
> wakeup_secondary_cpu_via_init(int phys_apicid, unsigned long start_eip)
> {
> @@ -586,7 +600,7 @@ wakeup_secondary_cpu_via_init(int phys_apicid, unsigned long start_eip)
> pr_debug("Waiting for send to finish...\n");
> send_status = safe_apic_wait_icr_idle();
>
> - mdelay(10);
> + mdelay(x86_init_mdelay);
>
> pr_debug("Deasserting INIT\n");
>
> --
> 2.4.0.rc1
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
* Paul Gortmaker <[email protected]> wrote:
> On Sat, Apr 18, 2015 at 8:53 PM, Len Brown <[email protected]> wrote:
> > From: Len Brown <[email protected]>
> >
>
> [...]
>
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index b7d31ca..d2a91da 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -884,6 +884,21 @@ config SCHED_MC
> > making when dealing with multi-core CPU chips at a cost of slightly
> > increased overhead in some places. If unsure say N here.
> >
> > +config X86_INIT_MDELAY
> > + int "Milliseconds to wait to de-assert INIT upon CPU startup" if SMP
> > + range 0 10
> > + default "10"
> > + ---help---
> > + This allows you to specify how long the BSP will delay
> > + between asserting INIT and de-asserting INIT when starting another CPU.
> > + This delay is paid on each cpu-online operation, including boot-time,
> > + resume from suspend, and resume from hibernate. The the value of
>
> Happened to notice a 2x "The the..." when I came across this in
> linux-next today and was reading the help out of curiousity...
Len, this commit isn't in a permanent Git tree yet, right? We want to
do this through the x86 tree, and along the suggestions I made.
Thanks,
Ingo
On Tue, Apr 21, 2015 at 11:07 PM, Ingo Molnar <[email protected]> wrote:
>
> * Paul Gortmaker <[email protected]> wrote:
>> Happened to notice a 2x "The the..." when I came across this in
>> linux-next today and was reading the help out of curiousity...
>
> Len, this commit isn't in a permanent Git tree yet, right? We want to
> do this through the x86 tree, and along the suggestions I made.
It was in my -next tree -- no not permanent, yes, since removed (today).
Yes, I plan to send you a v2 and TIP is the right place for it to live.
Paul, thanks for noticing the typo.
Len Brown, Intel Open Source Technology Center
On Mon, Apr 20, 2015 at 12:15 AM, Ingo Molnar <[email protected]> wrote:
> So instead of playing games with an ancient delay, I'd suggest we
> install the 10 msec INIT assertion wait as a platform quirk instead,
> and activate it for all CPUs/systems that we think might need it, with
> a sufficiently robust and future-proof quirk cutoff condition.
>
> New systems won't have the quirk active and thus won't have to have
> this delay configurable either.
Okay, at this time, I think the quirk would apply to:
1. Intel family 5 (original pentium) -- some may actually need the quirk
2. Intel family F (pentium4) -- mostly b/c I don't want to bother
finding/testing p4
3. All AMD (happy to narrow down, if somebody can speak for AMD)
I'd keep the cmdline override, in case we break something,
or somebody wants to optimize/test. (Though I'll update units to
usec, rather than msec.,
so we can go below 1ms without going to 0)
I don't think we need the config option, just a #define to document the quirk.
What do you think?
Len Brown, Intel Open Source Technology Center
On Fri, May 01, 2015 at 02:42:39PM -0700, Len Brown wrote:
> On Mon, Apr 20, 2015 at 12:15 AM, Ingo Molnar <[email protected]> wrote:
>
> > So instead of playing games with an ancient delay, I'd suggest we
> > install the 10 msec INIT assertion wait as a platform quirk instead,
> > and activate it for all CPUs/systems that we think might need it, with
> > a sufficiently robust and future-proof quirk cutoff condition.
> >
> > New systems won't have the quirk active and thus won't have to have
> > this delay configurable either.
>
> Okay, at this time, I think the quirk would apply to:
>
> 1. Intel family 5 (original pentium) -- some may actually need the quirk
> 2. Intel family F (pentium4) -- mostly b/c I don't want to bother
> finding/testing p4
> 3. All AMD (happy to narrow down, if somebody can speak for AMD)
Aravind and I could probably test on a couple of AMD boxes to narrow down.
@Aravind, see here:
https://lkml.kernel.org/r/87d69aab88c14d65ae1e7be55050d1b689b59b4b.1429402494.git.len.brown@intel.com
You could ask around whether a timeout is needed between the assertion
and deassertion of INIT done by the BSP when booting other cores.
If not, we probably should convert, at least modern AMD machines, to the
no-delay default.
> I'd keep the cmdline override, in case we break something,
> or somebody wants to optimize/test. (Though I'll update units to
> usec, rather than msec.,
> so we can go below 1ms without going to 0)
> I don't think we need the config option, just a #define to document the quirk.
>
> What do you think?
>
> Len Brown, Intel Open Source Technology Center
>
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--
On 5/1/15 5:47 PM, Borislav Petkov wrote:
> On Fri, May 01, 2015 at 02:42:39PM -0700, Len Brown wrote:
>> On Mon, Apr 20, 2015 at 12:15 AM, Ingo Molnar <[email protected]> wrote:
>>
>>> So instead of playing games with an ancient delay, I'd suggest we
>>> install the 10 msec INIT assertion wait as a platform quirk instead,
>>> and activate it for all CPUs/systems that we think might need it, with
>>> a sufficiently robust and future-proof quirk cutoff condition.
>>>
>>> New systems won't have the quirk active and thus won't have to have
>>> this delay configurable either.
>> Okay, at this time, I think the quirk would apply to:
>>
>> 1. Intel family 5 (original pentium) -- some may actually need the quirk
>> 2. Intel family F (pentium4) -- mostly b/c I don't want to bother
>> finding/testing p4
>> 3. All AMD (happy to narrow down, if somebody can speak for AMD)
> Aravind and I could probably test on a couple of AMD boxes to narrow down.
>
> @Aravind, see here:
>
> https://lkml.kernel.org/r/87d69aab88c14d65ae1e7be55050d1b689b59b4b.1429402494.git.len.brown@intel.com
>
> You could ask around whether a timeout is needed between the assertion
> and deassertion of INIT done by the BSP when booting other cores.
Sure, I'll ask around and try mdelay(0) on some systems as well.
I can gather Fam15h, Fam16h but don't have K8's or older.
Will let you know how it goes.
-Aravind.
> If not, we probably should convert, at least modern AMD machines, to the
> no-delay default.
>
On 5/1/15 7:42 PM, Aravind Gopalakrishnan wrote:
>
> On 5/1/15 5:47 PM, Borislav Petkov wrote:
>> On Fri, May 01, 2015 at 02:42:39PM -0700, Len Brown wrote:
>>> On Mon, Apr 20, 2015 at 12:15 AM, Ingo Molnar <[email protected]> wrote:
>>>
>>>> So instead of playing games with an ancient delay, I'd suggest we
>>>> install the 10 msec INIT assertion wait as a platform quirk instead,
>>>> and activate it for all CPUs/systems that we think might need it, with
>>>> a sufficiently robust and future-proof quirk cutoff condition.
>>>>
>>>> New systems won't have the quirk active and thus won't have to have
>>>> this delay configurable either.
>>> Okay, at this time, I think the quirk would apply to:
>>>
>>> 1. Intel family 5 (original pentium) -- some may actually need the
>>> quirk
>>> 2. Intel family F (pentium4) -- mostly b/c I don't want to bother
>>> finding/testing p4
>>> 3. All AMD (happy to narrow down, if somebody can speak for AMD)
>> Aravind and I could probably test on a couple of AMD boxes to narrow
>> down.
>>
>> @Aravind, see here:
>>
>> https://lkml.kernel.org/r/87d69aab88c14d65ae1e7be55050d1b689b59b4b.1429402494.git.len.brown@intel.com
>>
>>
>> You could ask around whether a timeout is needed between the assertion
>> and deassertion of INIT done by the BSP when booting other cores.
>
> Sure, I'll ask around and try mdelay(0) on some systems as well.
> I can gather Fam15h, Fam16h but don't have K8's or older.
>
> Will let you know how it goes.
Update:
Fam15h Model00h-0fh, Fam15hModel60h and Fam16h Model 00h-0fh processors
boot fine with mdelay(0) and BSP brings up all secondary cpus correctly.
I don't have Fam15hModel30h system currently up, but I'll try that too
tomorrow.
I am yet to get feedback from HW folks regarding this though.
Thanks,
-Aravind.
On 5/3/2015 11:13 AM, Aravind Gopalakrishnan wrote:
>
> On 5/1/15 7:42 PM, Aravind Gopalakrishnan wrote:
>>
>> On 5/1/15 5:47 PM, Borislav Petkov wrote:
>>> Aravind and I could probably test on a couple of AMD boxes to narrow
>>> down.
>>>
>>> @Aravind, see here:
>>>
>>> https://lkml.kernel.org/r/87d69aab88c14d65ae1e7be55050d1b689b59b4b.1429402494.git.len.brown@intel.com
>>>
>>>
>>> You could ask around whether a timeout is needed between the assertion
>>> and deassertion of INIT done by the BSP when booting other cores.
>>
>> Sure, I'll ask around and try mdelay(0) on some systems as well.
>> I can gather Fam15h, Fam16h but don't have K8's or older.
>>
>> Will let you know how it goes.
> Update:
> Fam15h Model00h-0fh, Fam15hModel60h and Fam16h Model 00h-0fh
> processors boot fine with mdelay(0) and BSP brings up all secondary
> cpus correctly. I don't have Fam15hModel30h system currently up, but
> I'll try that too tomorrow.
>
> I am yet to get feedback from HW folks regarding this though.
Tested a delay of 0 on Fam10h and Fam15h Model 30h-3fh and both work fine.
Feedback from asking internally about this is that we should be OK to
move to a no-delay default from K8 onwards.
Thanks,
-Aravind.
On Mon, May 04, 2015 at 05:45:17PM -0500, Aravind Gopalakrishnan wrote:
> Tested a delay of 0 on Fam10h and Fam15h Model 30h-3fh and both work fine.
Cool, thanks for testing.
> Feedback from asking internally about this is that we should be OK to move
> to a no-delay default from K8 onwards.
Also cool :-)
@Len: So everything with X86_VENDOR_AMD and c->x86 >= 0xf should be fine.
Thanks.
--
Regards/Gruss,
Boris.
ECO tip #101: Trim your mails when you reply.
--