2022-04-06 06:24:55

by Marc Zyngier

Subject: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs

When booting with maxcpus=<small number> (or even loading a driver
while most CPUs are offline), it is pretty easy to observe managed
affinities containing a mix of online and offline CPUs being passed
to the irqchip driver.

This means that the irqchip cannot trust the affinity passed down
from the core code, which is a bit annoying and requires (at least
in theory) all drivers to implement some sort of affinity narrowing.

In order to address this, always limit the cpumask to the set of
online CPUs.

Signed-off-by: Marc Zyngier <[email protected]>
---
kernel/irq/manage.c | 25 +++++++++++++++++--------
1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index c03f71d5ec10..f71ecc100545 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -222,11 +222,16 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
 {
 	struct irq_desc *desc = irq_data_to_desc(data);
 	struct irq_chip *chip = irq_data_get_irq_chip(data);
+	const struct cpumask *prog_mask;
 	int ret;

+	static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
+	static struct cpumask tmp_mask;
+
 	if (!chip || !chip->irq_set_affinity)
 		return -EINVAL;

+	raw_spin_lock(&tmp_mask_lock);
 	/*
 	 * If this is a managed interrupt and housekeeping is enabled on
 	 * it check whether the requested affinity mask intersects with
@@ -248,24 +253,28 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
 	 */
 	if (irqd_affinity_is_managed(data) &&
 	    housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) {
-		const struct cpumask *hk_mask, *prog_mask;
-
-		static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
-		static struct cpumask tmp_mask;
+		const struct cpumask *hk_mask;

 		hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);

-		raw_spin_lock(&tmp_mask_lock);
 		cpumask_and(&tmp_mask, mask, hk_mask);
 		if (!cpumask_intersects(&tmp_mask, cpu_online_mask))
 			prog_mask = mask;
 		else
 			prog_mask = &tmp_mask;
-		ret = chip->irq_set_affinity(data, prog_mask, force);
-		raw_spin_unlock(&tmp_mask_lock);
 	} else {
-		ret = chip->irq_set_affinity(data, mask, force);
+		prog_mask = mask;
 	}
+
+	/* Make sure we only provide online CPUs to the irqchip */
+	cpumask_and(&tmp_mask, prog_mask, cpu_online_mask);
+	if (!cpumask_empty(&tmp_mask))
+		ret = chip->irq_set_affinity(data, &tmp_mask, force);
+	else
+		ret = -EINVAL;
+
+	raw_spin_unlock(&tmp_mask_lock);
+
 	switch (ret) {
 	case IRQ_SET_MASK_OK:
 	case IRQ_SET_MASK_OK_DONE:
--
2.34.1
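
The narrowing step added by this patch can be modelled in userspace. The following is only a minimal sketch, assuming a toy 32-bit bitmask in place of struct cpumask; toy_cpumask and toy_set_affinity() are hypothetical names, not kernel APIs, and the housekeeping and locking parts of the patch are left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy cpumask (hypothetical): bit N set means CPU N is in the mask. */
typedef uint32_t toy_cpumask;

/*
 * Hypothetical stand-in for the tail of irq_do_set_affinity() after
 * this patch: intersect the requested mask with the online mask and
 * refuse to program an empty result.
 * On success the narrowed mask is stored in *effective.
 */
static int toy_set_affinity(toy_cpumask requested, toy_cpumask online,
			    toy_cpumask *effective)
{
	toy_cpumask narrowed = requested & online;

	assert(effective != NULL);
	if (narrowed == 0)
		return -EINVAL;	/* no online CPU left in the request */

	*effective = narrowed;
	return 0;
}
```

With CPUs 0-1 online, a request for CPUs 0-3 is narrowed to CPUs 0-1, while a request that names only offline CPUs is rejected with -EINVAL.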


Subject: [tip: irq/core] genirq: Always limit the affinity to online CPUs

The following commit has been merged into the irq/core branch of tip:

Commit-ID: 33de0aa4bae982ed6f7c777f86b5af3e627ac937
Gitweb: https://git.kernel.org/tip/33de0aa4bae982ed6f7c777f86b5af3e627ac937
Author: Marc Zyngier <[email protected]>
AuthorDate: Tue, 05 Apr 2022 19:50:39 +01:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Sun, 10 Apr 2022 21:06:30 +02:00

genirq: Always limit the affinity to online CPUs

When booting with maxcpus=<small number> (or even loading a driver
while most CPUs are offline), it is pretty easy to observe managed
affinities containing a mix of online and offline CPUs being passed
to the irqchip driver.

This means that the irqchip cannot trust the affinity passed down
from the core code, which is a bit annoying and requires (at least
in theory) all drivers to implement some sort of affinity narrowing.

In order to address this, always limit the cpumask to the set of
online CPUs.

Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Link: https://lore.kernel.org/r/[email protected]

---
kernel/irq/manage.c | 25 +++++++++++++++++--------
1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index c03f71d..f71ecc1 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -222,11 +222,16 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
 {
 	struct irq_desc *desc = irq_data_to_desc(data);
 	struct irq_chip *chip = irq_data_get_irq_chip(data);
+	const struct cpumask *prog_mask;
 	int ret;

+	static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
+	static struct cpumask tmp_mask;
+
 	if (!chip || !chip->irq_set_affinity)
 		return -EINVAL;

+	raw_spin_lock(&tmp_mask_lock);
 	/*
 	 * If this is a managed interrupt and housekeeping is enabled on
 	 * it check whether the requested affinity mask intersects with
@@ -248,24 +253,28 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
 	 */
 	if (irqd_affinity_is_managed(data) &&
 	    housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) {
-		const struct cpumask *hk_mask, *prog_mask;
-
-		static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
-		static struct cpumask tmp_mask;
+		const struct cpumask *hk_mask;

 		hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);

-		raw_spin_lock(&tmp_mask_lock);
 		cpumask_and(&tmp_mask, mask, hk_mask);
 		if (!cpumask_intersects(&tmp_mask, cpu_online_mask))
 			prog_mask = mask;
 		else
 			prog_mask = &tmp_mask;
-		ret = chip->irq_set_affinity(data, prog_mask, force);
-		raw_spin_unlock(&tmp_mask_lock);
 	} else {
-		ret = chip->irq_set_affinity(data, mask, force);
+		prog_mask = mask;
 	}
+
+	/* Make sure we only provide online CPUs to the irqchip */
+	cpumask_and(&tmp_mask, prog_mask, cpu_online_mask);
+	if (!cpumask_empty(&tmp_mask))
+		ret = chip->irq_set_affinity(data, &tmp_mask, force);
+	else
+		ret = -EINVAL;
+
+	raw_spin_unlock(&tmp_mask_lock);
+
 	switch (ret) {
 	case IRQ_SET_MASK_OK:
 	case IRQ_SET_MASK_OK_DONE:

2022-04-13 19:01:05

by Marek Szyprowski

Subject: Re: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs

Hi Marc,

On 05.04.2022 20:50, Marc Zyngier wrote:
> When booting with maxcpus=<small number> (or even loading a driver
> while most CPUs are offline), it is pretty easy to observe managed
> affinities containing a mix of online and offline CPUs being passed
> to the irqchip driver.
>
> This means that the irqchip cannot trust the affinity passed down
> from the core code, which is a bit annoying and requires (at least
> in theory) all drivers to implement some sort of affinity narrowing.
>
> In order to address this, always limit the cpumask to the set of
> online CPUs.
>
> Signed-off-by: Marc Zyngier <[email protected]>

This patch landed in linux next-20220413 as commit 33de0aa4bae9
("genirq: Always limit the affinity to online CPUs"). Unfortunately it
breaks booting of most ARM 32bit Samsung Exynos based boards.

I don't see anything specific in the log, though. Booting just hangs at
some point. The only Samsung Exynos boards that boot properly are those
Exynos4412 based.

I assume that this is related to the Multi Core Timer IRQ configuration
specific to those SoCs. Exynos4412 uses PPI interrupts, while all other
Exynos SoCs have separate IRQ lines for each CPU.

Let me know how I can help debugging this issue.

> ---
> kernel/irq/manage.c | 25 +++++++++++++++++--------
> 1 file changed, 17 insertions(+), 8 deletions(-)
>
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index c03f71d5ec10..f71ecc100545 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -222,11 +222,16 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
> {
> struct irq_desc *desc = irq_data_to_desc(data);
> struct irq_chip *chip = irq_data_get_irq_chip(data);
> + const struct cpumask *prog_mask;
> int ret;
>
> + static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
> + static struct cpumask tmp_mask;
> +
> if (!chip || !chip->irq_set_affinity)
> return -EINVAL;
>
> + raw_spin_lock(&tmp_mask_lock);
> /*
> * If this is a managed interrupt and housekeeping is enabled on
> * it check whether the requested affinity mask intersects with
> @@ -248,24 +253,28 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
> */
> if (irqd_affinity_is_managed(data) &&
> housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) {
> - const struct cpumask *hk_mask, *prog_mask;
> -
> - static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
> - static struct cpumask tmp_mask;
> + const struct cpumask *hk_mask;
>
> hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
>
> - raw_spin_lock(&tmp_mask_lock);
> cpumask_and(&tmp_mask, mask, hk_mask);
> if (!cpumask_intersects(&tmp_mask, cpu_online_mask))
> prog_mask = mask;
> else
> prog_mask = &tmp_mask;
> - ret = chip->irq_set_affinity(data, prog_mask, force);
> - raw_spin_unlock(&tmp_mask_lock);
> } else {
> - ret = chip->irq_set_affinity(data, mask, force);
> + prog_mask = mask;
> }
> +
> + /* Make sure we only provide online CPUs to the irqchip */
> + cpumask_and(&tmp_mask, prog_mask, cpu_online_mask);
> + if (!cpumask_empty(&tmp_mask))
> + ret = chip->irq_set_affinity(data, &tmp_mask, force);
> + else
> + ret = -EINVAL;
> +
> + raw_spin_unlock(&tmp_mask_lock);
> +
> switch (ret) {
> case IRQ_SET_MASK_OK:
> case IRQ_SET_MASK_OK_DONE:

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

2022-04-13 19:09:30

by Marc Zyngier

Subject: Re: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs

Hi Marek,

On Wed, 13 Apr 2022 15:59:21 +0100,
Marek Szyprowski <[email protected]> wrote:
>
> Hi Marc,
>
> On 05.04.2022 20:50, Marc Zyngier wrote:
> > When booting with maxcpus=<small number> (or even loading a driver
> > while most CPUs are offline), it is pretty easy to observe managed
> > affinities containing a mix of online and offline CPUs being passed
> > to the irqchip driver.
> >
> > This means that the irqchip cannot trust the affinity passed down
> > from the core code, which is a bit annoying and requires (at least
> > in theory) all drivers to implement some sort of affinity narrowing.
> >
> > In order to address this, always limit the cpumask to the set of
> > online CPUs.
> >
> > Signed-off-by: Marc Zyngier <[email protected]>
>
> This patch landed in linux next-20220413 as commit 33de0aa4bae9
> ("genirq: Always limit the affinity to online CPUs"). Unfortunately it
> breaks booting of most ARM 32bit Samsung Exynos based boards.
>
> I don't see anything specific in the log, though. Booting just hangs at
> some point. The only Samsung Exynos boards that boot properly are those
> Exynos4412 based.
>
> I assume that this is related to the Multi Core Timer IRQ configuration
> specific to those SoCs. Exynos4412 uses PPI interrupts, while all other
> Exynos SoCs have separate IRQ lines for each CPU.
>
> Let me know how I can help debugging this issue.

Thanks for the heads up. Can you pick the last working kernel, enable
CONFIG_GENERIC_IRQ_DEBUGFS, and dump the /sys/kernel/debug/irq/irqs/
entries for the timer IRQs?

Also, see below.

>
> > ---
> > kernel/irq/manage.c | 25 +++++++++++++++++--------
> > 1 file changed, 17 insertions(+), 8 deletions(-)
> >
> > diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> > index c03f71d5ec10..f71ecc100545 100644
> > --- a/kernel/irq/manage.c
> > +++ b/kernel/irq/manage.c
> > @@ -222,11 +222,16 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
> > {
> > struct irq_desc *desc = irq_data_to_desc(data);
> > struct irq_chip *chip = irq_data_get_irq_chip(data);
> > + const struct cpumask *prog_mask;
> > int ret;
> >
> > + static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
> > + static struct cpumask tmp_mask;
> > +
> > if (!chip || !chip->irq_set_affinity)
> > return -EINVAL;
> >
> > + raw_spin_lock(&tmp_mask_lock);
> > /*
> > * If this is a managed interrupt and housekeeping is enabled on
> > * it check whether the requested affinity mask intersects with
> > @@ -248,24 +253,28 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
> > */
> > if (irqd_affinity_is_managed(data) &&
> > housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) {
> > - const struct cpumask *hk_mask, *prog_mask;
> > -
> > - static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
> > - static struct cpumask tmp_mask;
> > + const struct cpumask *hk_mask;
> >
> > hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
> >
> > - raw_spin_lock(&tmp_mask_lock);
> > cpumask_and(&tmp_mask, mask, hk_mask);
> > if (!cpumask_intersects(&tmp_mask, cpu_online_mask))
> > prog_mask = mask;
> > else
> > prog_mask = &tmp_mask;
> > - ret = chip->irq_set_affinity(data, prog_mask, force);
> > - raw_spin_unlock(&tmp_mask_lock);
> > } else {
> > - ret = chip->irq_set_affinity(data, mask, force);
> > + prog_mask = mask;
> > }
> > +
> > + /* Make sure we only provide online CPUs to the irqchip */
> > + cpumask_and(&tmp_mask, prog_mask, cpu_online_mask);
> > + if (!cpumask_empty(&tmp_mask))
> > + ret = chip->irq_set_affinity(data, &tmp_mask, force);
> > + else
> > + ret = -EINVAL;

Can you also check that with the patch applied, it is this path that
is taken and that it is the timer interrupts that get rejected? If
that's the case, can you put a dump_stack() here and give me that
stack trace? The use of irq_force_affinity() in the driver looks
suspicious...

Finally, is there a QEMU emulation of one of these failing boards?

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

2022-04-14 14:13:03

by Thomas Gleixner

Subject: Re: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs

On Wed, Apr 13 2022 at 18:26, Marc Zyngier wrote:
> Marek Szyprowski <[email protected]> wrote:
>> This patch landed in linux next-20220413 as commit 33de0aa4bae9
>> ("genirq: Always limit the affinity to online CPUs"). Unfortunately it
>> breaks booting of most ARM 32bit Samsung Exynos based boards.
>>
>> I don't see anything specific in the log, though. Booting just hangs at
>> some point. The only Samsung Exynos boards that boot properly are those
>> Exynos4412 based.
>>
>> I assume that this is related to the Multi Core Timer IRQ configuration
>> specific to those SoCs. Exynos4412 uses PPI interrupts, while all other
>> Exynos SoCs have separate IRQ lines for each CPU.
>
> Can you also check that with the patch applied, it is this path that
> is taken and that it is the timer interrupts that get rejected? If
> that's the case, can you put a dump_stack() here and give me that
> stack trace? The use of irq_force_affinity() in the driver looks
> suspicious...

It's pretty clear what happens.

secondary_start_kernel()
notify_cpu_starting(cpu);
exynos4_mct_starting_cpu()
irq_force_affinity() -> fail

set_cpu_online(cpu, true);

Thanks,

tglx
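
The ordering problem in the call chain above can be reproduced with a small userspace model. This is a sketch under stated assumptions: toy_cpumask, toy_set_affinity() and toy_secondary_start() are hypothetical names, a 32-bit bitmask stands in for the kernel's online mask, and only the order of the two steps is taken from the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Toy cpumask (hypothetical): bit N set means CPU N is in the mask. */
typedef uint32_t toy_cpumask;

/* Behaviour of commit 33de0aa4bae9: an all-offline request is rejected. */
static int toy_set_affinity(toy_cpumask requested, toy_cpumask online)
{
	return (requested & online) ? 0 : -EINVAL;
}

/*
 * Hypothetical model of secondary CPU bring-up: the "starting" callback
 * (exynos4_mct_starting_cpu() in the thread) asks for affinity to the
 * incoming CPU before that CPU has been marked online.
 */
static int toy_secondary_start(unsigned int cpu, toy_cpumask *online)
{
	int ret;

	assert(cpu < 32);

	/* Step 1: callback runs, CPU is not in the online mask yet. */
	ret = toy_set_affinity((toy_cpumask)1 << cpu, *online);

	/* Step 2: only now is the CPU marked online. */
	*online |= (toy_cpumask)1 << cpu;

	return ret;
}
```

Starting with only CPU 0 online, bringing up CPU 1 makes the affinity request fail even though CPU 1 ends up online a moment later, which matches the failure described above.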

2022-04-14 18:19:36

by Marek Szyprowski

Subject: Re: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs

Hi Marc,

On 14.04.2022 12:35, Marc Zyngier wrote:
> On Thu, 14 Apr 2022 10:09:31 +0100,
> Marek Szyprowski <[email protected]> wrote:
>> On 13.04.2022 19:26, Marc Zyngier wrote:
>>> On Wed, 13 Apr 2022 15:59:21 +0100,
>>> Marek Szyprowski <[email protected]> wrote:
>>>> On 05.04.2022 20:50, Marc Zyngier wrote:
>>>>> When booting with maxcpus=<small number> (or even loading a driver
>>>>> while most CPUs are offline), it is pretty easy to observe managed
>>>>> affinities containing a mix of online and offline CPUs being passed
>>>>> to the irqchip driver.
>>>>>
>>>>> This means that the irqchip cannot trust the affinity passed down
>>>>> from the core code, which is a bit annoying and requires (at least
>>>>> in theory) all drivers to implement some sort of affinity narrowing.
>>>>>
>>>>> In order to address this, always limit the cpumask to the set of
>>>>> online CPUs.
>>>>>
>>>>> Signed-off-by: Marc Zyngier <[email protected]>
>>>> This patch landed in linux next-20220413 as commit 33de0aa4bae9
>>>> ("genirq: Always limit the affinity to online CPUs"). Unfortunately it
>>>> breaks booting of most ARM 32bit Samsung Exynos based boards.
>>>>
>>>> I don't see anything specific in the log, though. Booting just hangs at
>>>> some point. The only Samsung Exynos boards that boot properly are those
>>>> Exynos4412 based.
>>>>
>>>> I assume that this is related to the Multi Core Timer IRQ configuration
>>>> specific to those SoCs. Exynos4412 uses PPI interrupts, while all other
>>>> Exynos SoCs have separate IRQ lines for each CPU.
>>>>
>>>> Let me know how I can help debugging this issue.
>>> Thanks for the heads up. Can you pick the last working kernel, enable
>>> CONFIG_GENERIC_IRQ_DEBUGFS, and dump the /sys/kernel/debug/irq/irqs/
>>> entries for the timer IRQs?
>> Exynos4210, Trats board, next-20220411:
> Thanks for all of the debug, super helpful. The issue is that we don't
> handle the 'force' case, which a handful of drivers are using when
> bringing up CPUs (and doing so before the CPUs are marked online).
>
> Can you please give the below hack a go?

This patch fixed the issue. Thanks! Feel free to add my:

Reported-by: Marek Szyprowski <[email protected]>

Tested-by: Marek Szyprowski <[email protected]>

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

2022-04-16 00:14:03

by Marek Szyprowski

Subject: Re: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs

Hi Marc,

On 13.04.2022 19:26, Marc Zyngier wrote:
> Hi Marek,
>
> On Wed, 13 Apr 2022 15:59:21 +0100,
> Marek Szyprowski <[email protected]> wrote:
>> Hi Marc,
>>
>> On 05.04.2022 20:50, Marc Zyngier wrote:
>>> When booting with maxcpus=<small number> (or even loading a driver
>>> while most CPUs are offline), it is pretty easy to observe managed
>>> affinities containing a mix of online and offline CPUs being passed
>>> to the irqchip driver.
>>>
>>> This means that the irqchip cannot trust the affinity passed down
>>> from the core code, which is a bit annoying and requires (at least
>>> in theory) all drivers to implement some sort of affinity narrowing.
>>>
>>> In order to address this, always limit the cpumask to the set of
>>> online CPUs.
>>>
>>> Signed-off-by: Marc Zyngier <[email protected]>
>> This patch landed in linux next-20220413 as commit 33de0aa4bae9
>> ("genirq: Always limit the affinity to online CPUs"). Unfortunately it
>> breaks booting of most ARM 32bit Samsung Exynos based boards.
>>
>> I don't see anything specific in the log, though. Booting just hangs at
>> some point. The only Samsung Exynos boards that boot properly are those
>> Exynos4412 based.
>>
>> I assume that this is related to the Multi Core Timer IRQ configuration
>> specific to those SoCs. Exynos4412 uses PPI interrupts, while all other
>> Exynos SoCs have separate IRQ lines for each CPU.
>>
>> Let me know how I can help debugging this issue.
> Thanks for the heads up. Can you pick the last working kernel, enable
> CONFIG_GENERIC_IRQ_DEBUGFS, and dump the /sys/kernel/debug/irq/irqs/
> entries for the timer IRQs?

Exynos4210, Trats board, next-20220411:

root@target:~# cat /proc/interrupts | grep mct
 40:          0          0 GIC-0  89 Level     mct_comp_irq
 41:       4337          0 GIC-0  74 Level     mct_tick0
 42:          0      11061 GIC-0  80 Level     mct_tick1
root@target:~# cat /sys/kernel/debug/irq/irqs/41
handler:  handle_fasteoi_irq
device:   (null)
status:   0x00003504
            _IRQ_NOPROBE
            _IRQ_NOAUTOEN
istate:   0x00000000
ddepth:   0
wdepth:   0
dstate:   0x13403604
            IRQ_TYPE_LEVEL_HIGH
            IRQD_LEVEL
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_NO_BALANCING
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_SET
            IRQD_DEFAULT_TRIGGER_SET
            IRQD_HANDLE_ENFORCE_IRQCTX
node:     0
affinity: 0
effectiv: 0
domain:  :soc:interrupt-controller@10490000
 hwirq:   0x4a
 chip:    GIC-0
  flags:   0x15
             IRQCHIP_SET_TYPE_MASKED
             IRQCHIP_MASK_ON_SUSPEND
             IRQCHIP_SKIP_SET_WAKE
root@target:~# cat /sys/kernel/debug/irq/irqs/42
handler:  handle_fasteoi_irq
device:   (null)
status:   0x00003504
            _IRQ_NOPROBE
            _IRQ_NOAUTOEN
istate:   0x00000000
ddepth:   0
wdepth:   0
dstate:   0x13403604
            IRQ_TYPE_LEVEL_HIGH
            IRQD_LEVEL
            IRQD_ACTIVATED
            IRQD_IRQ_STARTED
            IRQD_NO_BALANCING
            IRQD_SINGLE_TARGET
            IRQD_AFFINITY_SET
            IRQD_DEFAULT_TRIGGER_SET
            IRQD_HANDLE_ENFORCE_IRQCTX
node:     0
affinity: 1
effectiv: 1
domain:  :soc:interrupt-controller@10490000
 hwirq:   0x50
 chip:    GIC-0
  flags:   0x15
             IRQCHIP_SET_TYPE_MASKED
             IRQCHIP_MASK_ON_SUSPEND
             IRQCHIP_SKIP_SET_WAKE
root@target:~#

> Also, see below.
>
>>> ---
>>> kernel/irq/manage.c | 25 +++++++++++++++++--------
>>> 1 file changed, 17 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
>>> index c03f71d5ec10..f71ecc100545 100644
>>> --- a/kernel/irq/manage.c
>>> +++ b/kernel/irq/manage.c
>>> @@ -222,11 +222,16 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
>>> {
>>> struct irq_desc *desc = irq_data_to_desc(data);
>>> struct irq_chip *chip = irq_data_get_irq_chip(data);
>>> + const struct cpumask *prog_mask;
>>> int ret;
>>>
>>> + static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
>>> + static struct cpumask tmp_mask;
>>> +
>>> if (!chip || !chip->irq_set_affinity)
>>> return -EINVAL;
>>>
>>> + raw_spin_lock(&tmp_mask_lock);
>>> /*
>>> * If this is a managed interrupt and housekeeping is enabled on
>>> * it check whether the requested affinity mask intersects with
>>> @@ -248,24 +253,28 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
>>> */
>>> if (irqd_affinity_is_managed(data) &&
>>> housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) {
>>> - const struct cpumask *hk_mask, *prog_mask;
>>> -
>>> - static DEFINE_RAW_SPINLOCK(tmp_mask_lock);
>>> - static struct cpumask tmp_mask;
>>> + const struct cpumask *hk_mask;
>>>
>>> hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ);
>>>
>>> - raw_spin_lock(&tmp_mask_lock);
>>> cpumask_and(&tmp_mask, mask, hk_mask);
>>> if (!cpumask_intersects(&tmp_mask, cpu_online_mask))
>>> prog_mask = mask;
>>> else
>>> prog_mask = &tmp_mask;
>>> - ret = chip->irq_set_affinity(data, prog_mask, force);
>>> - raw_spin_unlock(&tmp_mask_lock);
>>> } else {
>>> - ret = chip->irq_set_affinity(data, mask, force);
>>> + prog_mask = mask;
>>> }
>>> +
>>> + /* Make sure we only provide online CPUs to the irqchip */
>>> + cpumask_and(&tmp_mask, prog_mask, cpu_online_mask);
>>> + if (!cpumask_empty(&tmp_mask))
>>> + ret = chip->irq_set_affinity(data, &tmp_mask, force);
>>> + else
>>> + ret = -EINVAL;
> Can you also check that with the patch applied, it is this path that
> is taken and that it is the timer interrupts that get rejected? If
> that's the case, can you put a dump_stack() here and give me that
> stack trace? The use of irq_force_affinity() in the driver looks
> suspicious...

[    0.158241] smp: Bringing up secondary CPUs ...
[    0.166118] irq_do_set_affinity irq 42
[    0.166160] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.18.0-rc1-00002-g33de0aa4bae9-dirty #11708
[    0.166176] Hardware name: Samsung Exynos (Flattened Device Tree)
[    0.166188]  unwind_backtrace from show_stack+0x10/0x14
[    0.166220]  show_stack from dump_stack_lvl+0x58/0x70
[    0.166239]  dump_stack_lvl from irq_do_set_affinity+0x188/0x1c8
[    0.166258]  irq_do_set_affinity from irq_set_affinity_locked+0xf8/0x17c
[    0.166274]  irq_set_affinity_locked from irq_force_affinity+0x34/0x54
[    0.166290]  irq_force_affinity from exynos4_mct_starting_cpu+0xdc/0x11c
[    0.166308]  exynos4_mct_starting_cpu from cpuhp_invoke_callback+0x190/0x38c
[    0.166328]  cpuhp_invoke_callback from cpuhp_invoke_callback_range+0x98/0xb4
[    0.166345]  cpuhp_invoke_callback_range from notify_cpu_starting+0x54/0x94
[    0.166364]  notify_cpu_starting from secondary_start_kernel+0x160/0x26c
[    0.166383]  secondary_start_kernel from 0x40101a00
[    0.166498] CPU1: thread -1, cpu 1, socket 9, mpidr 80000901
[    0.166515] CPU1: Spectre v2: using BPIALL workaround
[    0.265631] smp: Brought up 1 node, 2 CPUs
[    0.268660] SMP: Total of 2 processors activated (96.00 BogoMIPS).
[    0.274583] CPU: All CPU(s) started in SVC mode.

> Finally, is there a QEMU emulation of one of these failing boards?

yes, smdkc210, I've reproduced it with the following command:

qemu-system-arm -serial null -serial stdio \
    -kernel /tftpboot/qemu/zImage \
    -dtb /tftpboot/qemu/exynos4210-smdkv310.dtb \
    -append "console=ttySAC1,115200n8 earlycon root=/dev/mmcblk0p2 rootwait init=/bin/bash ip=::::target::off cpuidle.off=1" \
    -M smdkc210 \
    -drive file=qemu-smdkv310-mmcblk0.raw,if=sd,bus=0,index=2,format=raw


Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland

Subject: [tip: irq/core] genirq: Take the proposed affinity at face value if force==true

The following commit has been merged into the irq/core branch of tip:

Commit-ID: c48c8b829d2b966a6649827426bcdba082ccf922
Gitweb: https://git.kernel.org/tip/c48c8b829d2b966a6649827426bcdba082ccf922
Author: Marc Zyngier <[email protected]>
AuthorDate: Thu, 14 Apr 2022 15:00:11 +01:00
Committer: Thomas Gleixner <[email protected]>
CommitterDate: Thu, 14 Apr 2022 16:11:25 +02:00

genirq: Take the proposed affinity at face value if force==true

Although setting the affinity of an interrupt to a set of CPUs that doesn't
have any online CPU is generally frowned upon, there are a few limited
cases where such affinity is set from a CPUHP notifier, setting the
affinity to a CPU that isn't online yet.

The saving grace is that this is always done using the 'force' attribute,
which gives a hint that the affinity setting can be outside of the online
CPU mask and the callsite sets this flag with the knowledge that the
underlying interrupt controller knows how to handle it.

This restores the expected behaviour on Marek's system.

Fixes: 33de0aa4bae9 ("genirq: Always limit the affinity to online CPUs")
Reported-by: Marek Szyprowski <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Thomas Gleixner <[email protected]>
Tested-by: Marek Szyprowski <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Link: https://lore.kernel.org/r/[email protected]

---
kernel/irq/manage.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index f71ecc1..f1d5a94 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -266,10 +266,16 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
 		prog_mask = mask;
 	}

-	/* Make sure we only provide online CPUs to the irqchip */
+	/*
+	 * Make sure we only provide online CPUs to the irqchip,
+	 * unless we are being asked to force the affinity (in which
+	 * case we do as we are told).
+	 */
 	cpumask_and(&tmp_mask, prog_mask, cpu_online_mask);
-	if (!cpumask_empty(&tmp_mask))
+	if (!force && !cpumask_empty(&tmp_mask))
 		ret = chip->irq_set_affinity(data, &tmp_mask, force);
+	else if (force)
+		ret = chip->irq_set_affinity(data, mask, force);
 	else
 		ret = -EINVAL;

2022-04-16 01:52:09

by Marc Zyngier

Subject: Re: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs

Hi Marek,

On Thu, 14 Apr 2022 10:09:31 +0100,
Marek Szyprowski <[email protected]> wrote:
>
> Hi Marc,
>
> On 13.04.2022 19:26, Marc Zyngier wrote:
> > Hi Marek,
> >
> > On Wed, 13 Apr 2022 15:59:21 +0100,
> > Marek Szyprowski <[email protected]> wrote:
> >> Hi Marc,
> >>
> >> On 05.04.2022 20:50, Marc Zyngier wrote:
> >>> When booting with maxcpus=<small number> (or even loading a driver
> >>> while most CPUs are offline), it is pretty easy to observe managed
> >>> affinities containing a mix of online and offline CPUs being passed
> >>> to the irqchip driver.
> >>>
> >>> This means that the irqchip cannot trust the affinity passed down
> >>> from the core code, which is a bit annoying and requires (at least
> >>> in theory) all drivers to implement some sort of affinity narrowing.
> >>>
> >>> In order to address this, always limit the cpumask to the set of
> >>> online CPUs.
> >>>
> >>> Signed-off-by: Marc Zyngier <[email protected]>
> >> This patch landed in linux next-20220413 as commit 33de0aa4bae9
> >> ("genirq: Always limit the affinity to online CPUs"). Unfortunately it
> >> breaks booting of most ARM 32bit Samsung Exynos based boards.
> >>
> >> I don't see anything specific in the log, though. Booting just hangs at
> >> some point. The only Samsung Exynos boards that boot properly are those
> >> Exynos4412 based.
> >>
> >> I assume that this is related to the Multi Core Timer IRQ configuration
> >> specific to those SoCs. Exynos4412 uses PPI interrupts, while all other
> >> Exynos SoCs have separate IRQ lines for each CPU.
> >>
> >> Let me know how I can help debugging this issue.
> > Thanks for the heads up. Can you pick the last working kernel, enable
> > CONFIG_GENERIC_IRQ_DEBUGFS, and dump the /sys/kernel/debug/irq/irqs/
> > entries for the timer IRQs?
>
> Exynos4210, Trats board, next-20220411:

Thanks for all of the debug, super helpful. The issue is that we don't
handle the 'force' case, which a handful of drivers are using when
bringing up CPUs (and doing so before the CPUs are marked online).

Can you please give the below hack a go?

Thanks,

M.

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index f71ecc100545..f1d5a94c6c9f 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -266,10 +266,16 @@ int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
 		prog_mask = mask;
 	}

-	/* Make sure we only provide online CPUs to the irqchip */
+	/*
+	 * Make sure we only provide online CPUs to the irqchip,
+	 * unless we are being asked to force the affinity (in which
+	 * case we do as we are told).
+	 */
 	cpumask_and(&tmp_mask, prog_mask, cpu_online_mask);
-	if (!cpumask_empty(&tmp_mask))
+	if (!force && !cpumask_empty(&tmp_mask))
 		ret = chip->irq_set_affinity(data, &tmp_mask, force);
+	else if (force)
+		ret = chip->irq_set_affinity(data, mask, force);
 	else
 		ret = -EINVAL;


--
Without deviation from the norm, progress is not possible.
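
The three-way decision in the hack above can be exercised outside the kernel. What follows is a rough sketch, assuming a toy 32-bit bitmask; toy_cpumask and toy_set_affinity() are hypothetical names, and the irqchip callback is replaced by simply recording which mask would have been programmed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy cpumask (hypothetical): bit N set means CPU N is in the mask. */
typedef uint32_t toy_cpumask;

/*
 * Hypothetical mirror of the patched branch:
 *  - !force and some online CPU in prog_mask: program the narrowed mask
 *  - force: program the caller's original mask untouched
 *  - otherwise: reject with -EINVAL
 * 'mask' is the caller's request, 'prog_mask' the mask left after the
 * housekeeping step (equal to 'mask' when that step does not apply).
 */
static int toy_set_affinity(toy_cpumask mask, toy_cpumask prog_mask,
			    toy_cpumask online, bool force,
			    toy_cpumask *programmed)
{
	toy_cpumask narrowed = prog_mask & online;

	assert(programmed != NULL);

	if (!force && narrowed != 0)
		*programmed = narrowed;
	else if (force)
		*programmed = mask;
	else
		return -EINVAL;

	return 0;
}
```

In this model a forced request for offline CPU 1 is passed through unchanged, which is the case the CPU bring-up path relies on, while the same request without force is still rejected.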

2022-04-21 09:54:14

by Krzysztof Kozlowski

Subject: Re: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs

On 14/04/2022 13:08, Marek Szyprowski wrote:
>> Thanks for all of the debug, super helpful. The issue is that we don't
>> handle the 'force' case, which a handful of drivers are using when
>> bringing up CPUs (and doing so before the CPUs are marked online).
>>
>> Can you please give the below hack a go?
>
> This patch fixed the issue. Thanks! Feel free to add my:
>
> Reported-by: Marek Szyprowski <[email protected]>
>
> Tested-by: Marek Szyprowski <[email protected]>

Hi Marc,

Linux-next still fails to boot on Exynos5422 boards, so I wonder if you
applied the fix?

Instead of silent fail there is now "Unable to handle kernel paging
request at virtual address f0836644", so it is slightly different.

See the dmesg:
https://krzk.eu/#/builders/21/builds/3542/steps/15/logs/serial0


Best regards,
Krzysztof

2022-04-22 09:02:08

by Marc Zyngier

Subject: Re: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs

Hi Krzysztof,

On Wed, 20 Apr 2022 10:13:52 +0100,
Krzysztof Kozlowski <[email protected]> wrote:
>
> On 14/04/2022 13:08, Marek Szyprowski wrote:
> >> Thanks for all of the debug, super helpful. The issue is that we don't
> >> handle the 'force' case, which a handful of drivers are using when
> >> bringing up CPUs (and doing so before the CPUs are marked online).
> >>
> >> Can you please give the below hack a go?
> >
> > This patch fixed the issue. Thanks! Feel free to add my:
> >
> > Reported-by: Marek Szyprowski <[email protected]>
> >
> > Tested-by: Marek Szyprowski <[email protected]>
>
> Hi Marc,
>
> Linux-next still fails to boot on Exynos5422 boards, so I wonder if you
> applied the fix?

It was picked up by Thomas and pushed out into tip, which is pulled by
-next:

maz@hot-poop:~/arm-platforms$ git describe --contains c48c8b829d2b966a6649827426bcdba082ccf922
next-20220420~51^2~3^2

So it definitely is in today's -next.

> Instead of a silent failure there is now "Unable to handle kernel paging
> request at virtual address f0836644", so it is slightly different.
>
> See the dmesg:
> https://krzk.eu/#/builders/21/builds/3542/steps/15/logs/serial0

This looks completely unrelated:

[ 10.382010] Unable to handle kernel paging request at virtual address f0836644
[ 10.388597] [f0836644] *pgd=41c83811, *pte=00000000, *ppte=00000000
[ 10.394482] Internal error: Oops: 807 [#1] PREEMPT SMP ARM
[ 10.399567] Modules linked in:
[ 10.402583] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc3-next-20220420 #2
[ 10.410060] Hardware name: Samsung Exynos (Flattened Device Tree)
[ 10.416106] PC is at cpu_ca15_set_pte_ext+0x4c/0x58
[ 10.420952] LR is at handle_pte_fault+0x218/0x260
[ 10.425631] pc : [<c011d588>] lr : [<c02ab188>] psr: 40000113
[ 10.431874] sp : f0835df0 ip : f0835e5c fp : 00000081
[ 10.437069] r10: c0f2eafc r9 : c1d31000 r8 : 00000000
[ 10.442268] r7 : c1d58000 r6 : 00000081 r5 : befffff6 r4 : f0835e24
[ 10.448773] r3 : 00000000 r2 : 00000000 r1 : 00000040 r0 : f0835e44
[ 10.455273] Flags: nZcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
[ 10.462381] Control: 10c5387d Table: 4000406a DAC: 00000051

This is a crash in cpu_ca15_set_pte_ext() when populating the
userspace page tables, which seems unrelated to interrupt affinity.

I suggest you bisect this to find the actual problem.

Thanks,

M.

--
Without deviation from the norm, progress is not possible.

2022-04-22 17:21:54

by Krzysztof Kozlowski

Subject: Re: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs

On 20/04/2022 11:40, Marc Zyngier wrote:
> It was picked up by Thomas and pushed out into tip, which is pulled by
> -next:
>
> maz@hot-poop:~/arm-platforms$ git describe --contains c48c8b829d2b966a6649827426bcdba082ccf922
> next-20220420~51^2~3^2
>
> So it definitely is in today's -next.
>
>> Instead of a silent failure there is now "Unable to handle kernel paging
>> request at virtual address f0836644", so it is slightly different.
>>
>> See the dmesg:
>> https://krzk.eu/#/builders/21/builds/3542/steps/15/logs/serial0
>
> This looks completely unrelated:
>
> [ 10.382010] Unable to handle kernel paging request at virtual address f0836644
> [ 10.388597] [f0836644] *pgd=41c83811, *pte=00000000, *ppte=00000000
> [ 10.394482] Internal error: Oops: 807 [#1] PREEMPT SMP ARM
> [ 10.399567] Modules linked in:
> [ 10.402583] CPU: 2 PID: 1 Comm: swapper/0 Not tainted 5.18.0-rc3-next-20220420 #2
> [ 10.410060] Hardware name: Samsung Exynos (Flattened Device Tree)
> [ 10.416106] PC is at cpu_ca15_set_pte_ext+0x4c/0x58
> [ 10.420952] LR is at handle_pte_fault+0x218/0x260
> [ 10.425631] pc : [<c011d588>] lr : [<c02ab188>] psr: 40000113
> [ 10.431874] sp : f0835df0 ip : f0835e5c fp : 00000081
> [ 10.437069] r10: c0f2eafc r9 : c1d31000 r8 : 00000000
> [ 10.442268] r7 : c1d58000 r6 : 00000081 r5 : befffff6 r4 : f0835e24
> [ 10.448773] r3 : 00000000 r2 : 00000000 r1 : 00000040 r0 : f0835e44
> [ 10.455273] Flags: nZcv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none
> [ 10.462381] Control: 10c5387d Table: 4000406a DAC: 00000051
>
> This is a crash in cpu_ca15_set_pte_ext() when populating the
> userspace page tables, which seems unrelated to interrupt affinity.
>
> I suggest you bisect this to find the actual problem.

Thanks for checking.


Best regards,
Krzysztof

2022-04-22 18:09:46

by Krzysztof Kozlowski

Subject: Re: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs

On 20/04/2022 11:47, Marek Szyprowski wrote:
>> Instead of a silent failure there is now "Unable to handle kernel paging
>> request at virtual address f0836644", so it is slightly different.
>
> This is yet another issue (related to all ARM 32bit boards) introduced
> in next-20220413, see:
>
> https://lore.kernel.org/all/[email protected]/T/#m6137721ae1323fdf424cee0f8ea1a6af5a3af396

Thanks, Marek!


Best regards,
Krzysztof

2022-04-22 18:23:37

by Marek Szyprowski

Subject: Re: [PATCH v3 2/3] genirq: Always limit the affinity to online CPUs

Hi Krzysztof,

On 20.04.2022 11:13, Krzysztof Kozlowski wrote:
> On 14/04/2022 13:08, Marek Szyprowski wrote:
>>> Thanks for all of the debug, super helpful. The issue is that we don't
>>> handle the 'force' case, which a handful of drivers are using when
>>> bringing up CPUs (and doing so before the CPUs are marked online).
>>>
>>> Can you please give the below hack a go?
>> This patch fixed the issue. Thanks! Feel free to add my:
>>
>> Reported-by: Marek Szyprowski <[email protected]>
>>
>> Tested-by: Marek Szyprowski <[email protected]>
> Hi Marc,
>
> Linux-next still fails to boot on Exynos5422 boards, so I wonder if you
> applied the fix?
>
> Instead of a silent failure there is now "Unable to handle kernel paging
> request at virtual address f0836644", so it is slightly different.

This is yet another issue (related to all ARM 32bit boards) introduced
in next-20220413, see:

https://lore.kernel.org/all/[email protected]/T/#m6137721ae1323fdf424cee0f8ea1a6af5a3af396

Best regards
--
Marek Szyprowski, PhD
Samsung R&D Institute Poland