2018-11-01 11:21:52

by wangyufen

Subject: [PATCH] ARM:kexec:offline panic_smp_self_stop CPU

From: Yufen Wang <[email protected]>

panic() can be called at the same time on different CPUs.
For example:
CPU 0:
  panic()
    __crash_kexec
      machine_crash_shutdown
        crash_smp_send_stop
      machine_kexec
        BUG_ON(num_online_cpus() > 1);

CPU 1:
  panic()
    local_irq_disable
    panic_smp_self_stop

If CPU 1 calls panic_smp_self_stop() before CPU 0 calls
crash_smp_send_stop(), kdump fails: CPU 1 has interrupts disabled, so
it never receives the IPI and stays online forever.
I tried changing the BUG_ON to a WARN in the kexec crash path, as arm64
does, but kdump still fails: because num_online_cpus() > 1, the L2
cache can't be disabled in _soft_restart.
To fix this, add an ARM version of panic_smp_self_stop() that calls
set_cpu_online(smp_processor_id(), false) before spinning.

Signed-off-by: Yufen Wang <[email protected]>
---
arch/arm/kernel/setup.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 31940bd..151861f 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -602,6 +602,16 @@ static void __init smp_build_mpidr_hash(void)
 }
 #endif

+void panic_smp_self_stop(void)
+{
+	printk(KERN_DEBUG "CPU %u will stop doing anything useful since another CPU has panicked\n",
+	       smp_processor_id());
+	set_cpu_online(smp_processor_id(), false);
+	while (1)
+		cpu_relax();
+
+}
+
 static void __init setup_processor(void)
 {
 	struct proc_info_list *list;
--
2.7.4
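
The failure mode described in the patch above can be modeled in a few
lines of userspace C. This is only a sketch under stated assumptions,
not kernel code: every model_* name is hypothetical, the online mask is
a plain bitmask, the stop IPI is reduced to "clear the bit of any CPU
that still takes interrupts", and the final spin loop is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only: these are NOT the kernel's implementations. */
#define MODEL_NR_CPUS 2u

static unsigned int model_online_mask;

/* Stand-in for set_cpu_online(): set or clear one bit in the mask. */
static void model_set_cpu_online(unsigned int cpu, bool online)
{
	assert(cpu < MODEL_NR_CPUS);
	if (online)
		model_online_mask |= 1u << cpu;
	else
		model_online_mask &= ~(1u << cpu);
}

/* Stand-in for num_online_cpus(): count the bits that are set. */
static unsigned int model_num_online_cpus(void)
{
	unsigned int cpu, count = 0;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (model_online_mask & (1u << cpu))
			count++;
	return count;
}

/*
 * The CPU that loses the panic race. With 'patched' set it marks itself
 * offline first, as the patch proposes; otherwise it would only spin.
 */
static void model_panic_smp_self_stop(unsigned int cpu, bool patched)
{
	if (patched)
		model_set_cpu_online(cpu, false);
}

/* Assumed IPI behaviour: only CPUs with interrupts enabled react. */
static void model_crash_smp_send_stop(unsigned int self,
				      const bool *irqs_enabled)
{
	unsigned int cpu;

	for (cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (cpu == self || !irqs_enabled[cpu])
			continue;
		model_set_cpu_online(cpu, false);
	}
}

/* Replays the two call chains and returns the resulting online count. */
unsigned int model_concurrent_panic(bool patched)
{
	bool irqs_enabled[MODEL_NR_CPUS] = { true, true };

	model_online_mask = 0;
	model_set_cpu_online(0, true);
	model_set_cpu_online(1, true);

	/* CPU 1: panic() -> local_irq_disable -> panic_smp_self_stop */
	irqs_enabled[1] = false;
	model_panic_smp_self_stop(1, patched);

	/* CPU 0: panic() -> ... -> crash_smp_send_stop */
	model_crash_smp_send_stop(0, irqs_enabled);

	return model_num_online_cpus();
}
```

In this model, model_concurrent_panic(false) leaves both CPUs marked
online, which is the state that trips BUG_ON(num_online_cpus() > 1),
while model_concurrent_panic(true) leaves only the crashing CPU online.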




2018-11-01 11:35:50

by Russell King (Oracle)

Subject: Re: [PATCH] ARM:kexec:offline panic_smp_self_stop CPU

On Thu, Nov 01, 2018 at 07:20:49PM +0800, Wang Yufen wrote:
> From: Yufen Wang <[email protected]>
>
> panic() can be called at the same time on different CPUs.
> For example:
> CPU 0:
>   panic()
>     __crash_kexec
>       machine_crash_shutdown
>         crash_smp_send_stop
>       machine_kexec
>         BUG_ON(num_online_cpus() > 1);
>
> CPU 1:
>   panic()
>     local_irq_disable
>     panic_smp_self_stop
>
> If CPU 1 calls panic_smp_self_stop() before CPU 0 calls
> crash_smp_send_stop(), kdump fails: CPU 1 has interrupts disabled, so
> it never receives the IPI and stays online forever.
> I tried changing the BUG_ON to a WARN in the kexec crash path, as arm64
> does, but kdump still fails: because num_online_cpus() > 1, the L2
> cache can't be disabled in _soft_restart.
> To fix this, add an ARM version of panic_smp_self_stop() that calls
> set_cpu_online(smp_processor_id(), false) before spinning.

Thanks.

I think this may as well go into arch/arm/kernel/smp.c - it won't be
required for single-CPU systems, since there aren't "other" CPUs.

It's probably also worth a comment above the function as to why we
have this.

>
> Signed-off-by: Yufen Wang <[email protected]>
> ---
> arch/arm/kernel/setup.c | 10 ++++++++++
> 1 file changed, 10 insertions(+)
>
> diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
> index 31940bd..151861f 100644
> --- a/arch/arm/kernel/setup.c
> +++ b/arch/arm/kernel/setup.c
> @@ -602,6 +602,16 @@ static void __init smp_build_mpidr_hash(void)
>  }
>  #endif
>
> +void panic_smp_self_stop(void)
> +{
> +	printk(KERN_DEBUG "CPU %u will stop doing anything useful since another CPU has panicked\n",
> +	       smp_processor_id());
> +	set_cpu_online(smp_processor_id(), false);
> +	while (1)
> +		cpu_relax();
> +
> +}
> +
>  static void __init setup_processor(void)
>  {
>  	struct proc_info_list *list;
> --
> 2.7.4
>
>

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up

2018-11-02 01:18:52

by wangyufen

Subject: Re: [PATCH] ARM:kexec:offline panic_smp_self_stop CPU

On 2018/11/1 19:34, Russell King - ARM Linux wrote:
> On Thu, Nov 01, 2018 at 07:20:49PM +0800, Wang Yufen wrote:
>> From: Yufen Wang <[email protected]>
>>
>> panic() can be called at the same time on different CPUs.
>> For example:
>> CPU 0:
>>   panic()
>>     __crash_kexec
>>       machine_crash_shutdown
>>         crash_smp_send_stop
>>       machine_kexec
>>         BUG_ON(num_online_cpus() > 1);
>>
>> CPU 1:
>>   panic()
>>     local_irq_disable
>>     panic_smp_self_stop
>>
>> If CPU 1 calls panic_smp_self_stop() before CPU 0 calls
>> crash_smp_send_stop(), kdump fails: CPU 1 has interrupts disabled, so
>> it never receives the IPI and stays online forever.
>> I tried changing the BUG_ON to a WARN in the kexec crash path, as arm64
>> does, but kdump still fails: because num_online_cpus() > 1, the L2
>> cache can't be disabled in _soft_restart.
>> To fix this, add an ARM version of panic_smp_self_stop() that calls
>> set_cpu_online(smp_processor_id(), false) before spinning.
> Thanks.
>
> I think this may as well go into arch/arm/kernel/smp.c - it won't be
> required for single-CPU systems, since there aren't "other" CPUs.
>
> It's probably also worth a comment above the function as to why we
> have this.

Thanks.

I will send v2.

>> Signed-off-by: Yufen Wang <[email protected]>
>> ---
>> arch/arm/kernel/setup.c | 10 ++++++++++
>> 1 file changed, 10 insertions(+)
>>
>> diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
>> index 31940bd..151861f 100644
>> --- a/arch/arm/kernel/setup.c
>> +++ b/arch/arm/kernel/setup.c
>> @@ -602,6 +602,16 @@ static void __init smp_build_mpidr_hash(void)
>>  }
>>  #endif
>>
>> +void panic_smp_self_stop(void)
>> +{
>> +	printk(KERN_DEBUG "CPU %u will stop doing anything useful since another CPU has panicked\n",
>> +	       smp_processor_id());
>> +	set_cpu_online(smp_processor_id(), false);
>> +	while (1)
>> +		cpu_relax();
>> +
>> +}
>> +
>>  static void __init setup_processor(void)
>>  {
>>  	struct proc_info_list *list;
>> --
>> 2.7.4
>>
>>



2018-11-02 02:33:09

by wangyufen

Subject: [PATCH v2] ARM:kexec:offline panic_smp_self_stop CPU

panic() can be called at the same time on different CPUs.
For example:
CPU 0:
  panic()
    __crash_kexec
      machine_crash_shutdown
        crash_smp_send_stop
      machine_kexec
        BUG_ON(num_online_cpus() > 1);

CPU 1:
  panic()
    local_irq_disable
    panic_smp_self_stop

If CPU 1 calls panic_smp_self_stop() before CPU 0 calls
crash_smp_send_stop(), kdump fails: CPU 1 has interrupts disabled, so
it never receives the IPI and stays online forever.
To fix this, add an ARM version of panic_smp_self_stop() that calls
set_cpu_online(smp_processor_id(), false) before spinning.

Signed-off-by: Yufen Wang <[email protected]>
---
arch/arm/kernel/smp.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)

diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 9000d8b..d7b86e4 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -682,6 +682,21 @@ void smp_send_stop(void)
 		pr_warn("SMP: failed to stop secondary CPUs\n");
 }

+/* If panic() is called on two CPUs at the same time, the CPU that loses
+ * the race calls panic_smp_self_stop() with interrupts disabled, so it
+ * never receives the stop IPI sent by crash_smp_send_stop() and stays
+ * online, which makes kdump fail. Override panic_smp_self_stop() here
+ * so that the CPU marks itself offline before spinning.
+ */
+void panic_smp_self_stop(void)
+{
+	pr_debug("CPU %u will stop doing anything useful since another CPU has panicked\n",
+		 smp_processor_id());
+	set_cpu_online(smp_processor_id(), false);
+	while (1)
+		cpu_relax();
+}
+
 /*
  * not supported here
  */
--
2.7.4



2018-11-02 09:56:56

by Russell King (Oracle)

Subject: Re: [PATCH v2] ARM:kexec:offline panic_smp_self_stop CPU

On Fri, Nov 02, 2018 at 10:31:27AM +0800, wangyufen wrote:
> panic() can be called at the same time on different CPUs.
> For example:
> CPU 0:
>   panic()
>     __crash_kexec
>       machine_crash_shutdown
>         crash_smp_send_stop
>       machine_kexec
>         BUG_ON(num_online_cpus() > 1);
>
> CPU 1:
>   panic()
>     local_irq_disable
>     panic_smp_self_stop
>
> If CPU 1 calls panic_smp_self_stop() before CPU 0 calls
> crash_smp_send_stop(), kdump fails: CPU 1 has interrupts disabled, so
> it never receives the IPI and stays online forever.
> To fix this, add an ARM version of panic_smp_self_stop() that calls
> set_cpu_online(smp_processor_id(), false) before spinning.

Looks fine now, please send it to the patch system (details in my
signature.) Thanks.

>
> Signed-off-by: Yufen Wang <[email protected]>
> ---
> arch/arm/kernel/smp.c | 15 +++++++++++++++
> 1 file changed, 15 insertions(+)
>
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 9000d8b..d7b86e4 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -682,6 +682,21 @@ void smp_send_stop(void)
>  		pr_warn("SMP: failed to stop secondary CPUs\n");
>  }
>
> +/* If panic() is called on two CPUs at the same time, the CPU that loses
> + * the race calls panic_smp_self_stop() with interrupts disabled, so it
> + * never receives the stop IPI sent by crash_smp_send_stop() and stays
> + * online, which makes kdump fail. Override panic_smp_self_stop() here
> + * so that the CPU marks itself offline before spinning.
> + */
> +void panic_smp_self_stop(void)
> +{
> +	pr_debug("CPU %u will stop doing anything useful since another CPU has panicked\n",
> +		 smp_processor_id());
> +	set_cpu_online(smp_processor_id(), false);
> +	while (1)
> +		cpu_relax();
> +}
> +
>  /*
>   * not supported here
>   */
> --
> 2.7.4
>
>

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up