2018-02-23 15:30:29

by Paul Menzel

Subject: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

Dear Thomas,


On the ASRock E350M1 (AMD A50M), since Linux 4.15 I get the message below.

do_IRQ: 1.55 No irq handler for vector


Kind regards,

Paul



2018-02-23 18:19:20

by Thomas Gleixner

Subject: Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

Paul,

On Fri, 23 Feb 2018, Paul Menzel wrote:
>
> On the ASRock E350M1 (AMD A50M), since Linux 4.15 I get the message below.
>
> do_IRQ: 1.55 No irq handler for vector

Thanks for the report, but the single line of dmesg w/o any context is not
really helpful. Can you please provide a larger piece of dmesg which shows
the context in which this happens?

I assume that this happens during early boot right when CPU1 is brought up,
right?

Borislav is seeing similar issues on larger AMD machines. The interrupt
seems to come from BIOS/microcode during bringup of secondary CPUs and we
have no idea why.

Thanks,

tglx


2018-02-23 19:11:02

by Borislav Petkov

Subject: Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
> Borislav is seeing similar issues on larger AMD machines. The interrupt
> seems to come from BIOS/microcode during bringup of secondary CPUs and we
> have no idea why.

Paul, can you boot 4.14 and grep your dmesg for something like:

[ 0.000000] spurious 8259A interrupt: IRQ7.

?

Thx.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2018-02-24 08:07:43

by Paul Menzel

Subject: Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

Dear Thomas, dear Borislav,


Sorry for not attaching the logs.

Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
>> Borislav is seeing similar issues on larger AMD machines. The interrupt
>> seems to come from BIOS/microcode during bringup of secondary CPUs and we
>> have no idea why.
>
> Paul, can you boot 4.14 and grep your dmesg for something like:
>
> [ 0.000000] spurious 8259A interrupt: IRQ7.
>
> ?

No, I do not see that. Please find the logs attached.


Kind regards,

Paul


Attachments:
20180219–linux_4.14.17–journalctl-k.txt (229.25 kB)
20180224–linux_4.15.4–journalctl-k.txt (149.75 kB)

2018-02-24 09:00:02

by Thomas Gleixner

Subject: Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

On Sat, 24 Feb 2018, Paul Menzel wrote:
> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
> > On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
> > > Borislav is seeing similar issues on larger AMD machines. The interrupt
> > > seems to come from BIOS/microcode during bringup of secondary CPUs and we
> > > have no idea why.
> >
> > Paul, can you boot 4.14 and grep your dmesg for something like:
> >
> > [ 0.000000] spurious 8259A interrupt: IRQ7.
> >
> > ?
>
> No, I do not see that. Please find the logs attached.

From your 4.14 log:

Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 soft=e9b0c000
Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.
Feb 19 09:48:06.843258 kodi kernel: Console: colour VGA+ 80x25

So this has been there in 4.14 already; only the detection mechanism has
changed due to the modifications of the interrupt vector management code.
It's not a real issue, just annoying ....

Thanks,

tglx
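
Note on reading the two messages side by side: the following is a minimal sketch, not kernel code. It assumes that the `do_IRQ` message prints `<cpu>.<vector>` in decimal and that the legacy ISA IRQs are mapped to vectors 0x30..0x3f, which is how the x86 code of that era is understood to lay them out; neither detail is stated in this thread. The function name and constants are illustrative only.

```python
import re

# Assumption (not stated in the thread): external vectors start at 0x20 and
# the 16 legacy ISA IRQs occupy the next 16-aligned block, i.e. 0x30..0x3f.
FIRST_EXTERNAL_VECTOR = 0x20
ISA_VECTOR_BASE = (FIRST_EXTERNAL_VECTOR + 16) & ~15

# Assumption: the two numbers in the message are CPU and vector, in decimal.
PATTERN = re.compile(r"do_IRQ: (\d+)\.(\d+) No irq handler for vector")


def parse_do_irq(line):
    """Return (cpu, vector, isa_irq or None) for a matching line, else None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    cpu, vector = int(match.group(1)), int(match.group(2))
    offset = vector - ISA_VECTOR_BASE
    # Only vectors inside the assumed ISA block map back to a legacy IRQ.
    isa_irq = offset if 0 <= offset < 16 else None
    return cpu, vector, isa_irq


if __name__ == "__main__":
    print(parse_do_irq("do_IRQ: 1.55 No irq handler for vector"))
```

Under these assumptions, `1.55` reads as CPU 1, vector 55 (0x37), which would correspond to legacy IRQ 7, the same line named in the 4.14 "spurious 8259A interrupt" message.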

2018-02-26 16:15:39

by Tom Lendacky

Subject: Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

On 2/24/2018 2:59 AM, Thomas Gleixner wrote:
> On Sat, 24 Feb 2018, Paul Menzel wrote:
>> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
>>> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
>>>> Borislav is seeing similar issues on larger AMD machines. The interrupt
>>>> seems to come from BIOS/microcode during bringup of secondary CPUs and we
>>>> have no idea why.
>>>
>>> Paul, can you boot 4.14 and grep your dmesg for something like:
>>>
>>> [ 0.000000] spurious 8259A interrupt: IRQ7.
>>>
>>> ?
>>
>> No, I do not see that. Please find the logs attached.
>
> From your 4.14 log:
>
> Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 soft=e9b0c000
> Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.

I think I remember seeing something like this previously and it turned out
to be a BIOS bug. All the APs were enabled to work with the legacy 8259
interrupt controller. In an SMP system, only one processor in the system
should be configured to handle legacy 8259 interrupts (ExtINT delivery
mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode). Once
the BIOS was fixed, the spurious interrupt message went away.

I believe at some point during UEFI, the APs were exposed to an ExtINT
interrupt. Since they were configured to handle ExtINT delivery mode and
interrupts were not yet enabled, the interrupt was left pending. When the
APs were started by the OS and interrupts were enabled, the interrupt
triggered. Since the original pending interrupt was handled by the BSP,
there was no longer an interrupt actually pending, so the 8259 responds
with IRQ 7 when queried by the OS. This occurred for each AP.

Thanks,
Tom

> Feb 19 09:48:06.843258 kodi kernel: Console: colour VGA+ 80x25
>
> So this has been there in 4.14 already just the detection mechanism has
> changed due to the modifications of the interrupt vector management code.
> It's not a real issue, just annoying ....
>
> Thanks,
>
> tglx
>
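
The sequence Tom describes can be illustrated with a toy model. This is a sketch under stated assumptions, not a description of the actual firmware or kernel code: `ToyPic8259` and `simulate` are hypothetical names, only a single 8259A is modelled, and the only behaviour relied on is that an acknowledge cycle with nothing pending is answered with IRQ 7 while the in-service bit stays clear.

```python
class ToyPic8259:
    """Toy model of one 8259A: request (IRR) and in-service (ISR) bitmaps."""

    SPURIOUS_IRQ = 7

    def __init__(self):
        self.irr = 0
        self.isr = 0

    def raise_irq(self, irq):
        self.irr |= 1 << irq

    def acknowledge(self):
        """Model an interrupt-acknowledge cycle and return the reported IRQ."""
        for irq in range(8):  # IRQ0 has the highest priority
            if self.irr & (1 << irq):
                self.irr &= ~(1 << irq)
                self.isr |= 1 << irq
                return irq
        # Nothing is pending any more: the PIC answers with IRQ7 but leaves
        # ISR bit 7 clear, which is how the OS can tell it was spurious.
        return self.SPURIOUS_IRQ

    def is_spurious(self, irq):
        return irq == self.SPURIOUS_IRQ and not (self.isr & (1 << irq))


def simulate(num_aps):
    """One real request, taken by the BSP; each AP later acknowledges too."""
    pic = ToyPic8259()
    pic.raise_irq(0)             # a legacy interrupt arrives during firmware
    bsp_irq = pic.acknowledge()  # the BSP services the real request
    log = []
    for cpu in range(1, num_aps + 1):
        irq = pic.acknowledge()  # the AP finally takes its stale ExtINT
        if pic.is_spurious(irq):
            log.append(f"CPU{cpu}: spurious 8259A interrupt: IRQ{irq}.")
    return bsp_irq, log


if __name__ == "__main__":
    print(simulate(3))
```

In this model every AP that was left with a stale ExtINT produces one spurious IRQ 7 report, matching the statement that it occurred for each AP.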

2018-02-26 16:32:39

by Borislav Petkov

Subject: Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

On Mon, Feb 26, 2018 at 10:14:10AM -0600, Tom Lendacky wrote:
> On 2/24/2018 2:59 AM, Thomas Gleixner wrote:
> > On Sat, 24 Feb 2018, Paul Menzel wrote:
> >> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
> >>> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
> >>>> Borislav is seeing similar issues on larger AMD machines. The interrupt
> >>>> seems to come from BIOS/microcode during bringup of secondary CPUs and we
> >>>> have no idea why.
> >>>
> >>> Paul, can you boot 4.14 and grep your dmesg for something like:
> >>>
> >>> [ 0.000000] spurious 8259A interrupt: IRQ7.
> >>>
> >>> ?
> >>
> >> No, I do not see that. Please find the logs attached.
> >
> > From your 4.14 log:
> >
> > Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 soft=e9b0c000
> > Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.
>
> I think I remember seeing something like this previously and it turned out
> to be a BIOS bug. All the AP's were enabled to work with the legacy 8259
> interrupt controller. In an SMP system, only one processor in the system
> should be configured to handle legacy 8259 interrupts (ExtINT delivery
> mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode). Once
> the BIOS was fixed, the spurious interrupt message went away.
>
> I believe at some point during UEFI, the APs were exposed to an ExtINT
> interrupt. Since they were configured to handle ExtINT delivery mode and
> interrupts were not yet enabled, the interrupt was left pending. When the
> APs were started by the OS and interrupts were enabled, the interrupt
> triggered. Since the original pending interrupt was handled by the BSP,
> there was no longer an interrupt actually pending, so the 8259 responds
> with IRQ 7 when queried by the OS. This occurred for each AP.

Interesting - is this something that can happen on Zen too?

Because I have such reports too.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2018-02-26 16:38:59

by Morton, Eric

Subject: Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

Thomas,

Yazen dug out PLAT-21393 as sounding like this issue. I haven't had a chance to digest it.

Eric

On 2/26/18, 10:31 AM, "Borislav Petkov" wrote:

On Mon, Feb 26, 2018 at 10:14:10AM -0600, Tom Lendacky wrote:
> On 2/24/2018 2:59 AM, Thomas Gleixner wrote:
> > On Sat, 24 Feb 2018, Paul Menzel wrote:
> >> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
> >>> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
> >>>> Borislav is seeing similar issues on larger AMD machines. The interrupt
> >>>> seems to come from BIOS/microcode during bringup of secondary CPUs and we
> >>>> have no idea why.
> >>>
> >>> Paul, can you boot 4.14 and grep your dmesg for something like:
> >>>
> >>> [ 0.000000] spurious 8259A interrupt: IRQ7.
> >>>
> >>> ?
> >>
> >> No, I do not see that. Please find the logs attached.
> >
> > From your 4.14 log:
> >
> > Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 soft=e9b0c000
> > Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.
>
> I think I remember seeing something like this previously and it turned out
> to be a BIOS bug. All the AP's were enabled to work with the legacy 8259
> interrupt controller. In an SMP system, only one processor in the system
> should be configured to handle legacy 8259 interrupts (ExtINT delivery
> mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode). Once
> the BIOS was fixed, the spurious interrupt message went away.
>
> I believe at some point during UEFI, the APs were exposed to an ExtINT
> interrupt. Since they were configured to handle ExtINT delivery mode and
> interrupts were not yet enabled, the interrupt was left pending. When the
> APs were started by the OS and interrupts were enabled, the interrupt
> triggered. Since the original pending interrupt was handled by the BSP,
> there was no longer an interrupt actually pending, so the 8259 responds
> with IRQ 7 when queried by the OS. This occurred for each AP.

Interesting - is this something that can happen on Zen too?

Because I have such reports too.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


2018-02-26 16:42:03

by Tom Lendacky

Subject: Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

On 2/26/2018 10:31 AM, Borislav Petkov wrote:
> On Mon, Feb 26, 2018 at 10:14:10AM -0600, Tom Lendacky wrote:
>> On 2/24/2018 2:59 AM, Thomas Gleixner wrote:
>>> On Sat, 24 Feb 2018, Paul Menzel wrote:
>>>> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
>>>>> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
>>>>>> Borislav is seeing similar issues on larger AMD machines. The interrupt
>>>>>> seems to come from BIOS/microcode during bringup of secondary CPUs and we
>>>>>> have no idea why.
>>>>>
>>>>> Paul, can you boot 4.14 and grep your dmesg for something like:
>>>>>
>>>>> [ 0.000000] spurious 8259A interrupt: IRQ7.
>>>>>
>>>>> ?
>>>>
>>>> No, I do not see that. Please find the logs attached.
>>>
>>> From your 4.14 log:
>>>
>>> Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 soft=e9b0c000
>>> Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.
>>
>> I think I remember seeing something like this previously and it turned out
>> to be a BIOS bug. All the AP's were enabled to work with the legacy 8259
>> interrupt controller. In an SMP system, only one processor in the system
>> should be configured to handle legacy 8259 interrupts (ExtINT delivery
>> mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode). Once
>> the BIOS was fixed, the spurious interrupt message went away.
>>
>> I believe at some point during UEFI, the APs were exposed to an ExtINT
>> interrupt. Since they were configured to handle ExtINT delivery mode and
>> interrupts were not yet enabled, the interrupt was left pending. When the
>> APs were started by the OS and interrupts were enabled, the interrupt
>> triggered. Since the original pending interrupt was handled by the BSP,
>> there was no longer an interrupt actually pending, so the 8259 responds
>> with IRQ 7 when queried by the OS. This occurred for each AP.
>
> Interesting - is this something that can happen on Zen too?

Yes, that's where I remember seeing it.

Thanks,
Tom

>
> Because I have such reports too.
>
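
The rule quoted above, that only one processor should accept legacy 8259 interrupts, can be expressed as a check on each CPU's local APIC LVT LINT0 value. The following is a rough sketch only: it assumes the LVT layout documented in the Intel SDM (vector in bits 7:0, delivery mode in bits 10:8 with 111b meaning ExtINT, mask in bit 16), and the register values in the usage example are made up for illustration, not read from any machine in this thread.

```python
# Delivery-mode encodings for a local vector table entry (bits 10:8).
DELIVERY_MODES = {
    0b000: "Fixed",
    0b010: "SMI",
    0b100: "NMI",
    0b101: "INIT",
    0b111: "ExtINT",
}
LVT_MASKED = 1 << 16  # mask bit: set means the entry is inhibited


def decode_lvt(value):
    """Split a raw LVT entry into vector, delivery mode and mask state."""
    mode = (value >> 8) & 0b111
    return {
        "vector": value & 0xFF,
        "delivery_mode": DELIVERY_MODES.get(mode, "Reserved"),
        "masked": bool(value & LVT_MASKED),
    }


def cpus_accepting_extint(lint0_by_cpu):
    """Return the CPUs whose LINT0 entry is unmasked and set to ExtINT."""
    accepting = []
    for cpu, value in lint0_by_cpu.items():
        entry = decode_lvt(value)
        if entry["delivery_mode"] == "ExtINT" and not entry["masked"]:
            accepting.append(cpu)
    return sorted(accepting)


if __name__ == "__main__":
    # Hypothetical values: CPU 0 and CPU 1 unmasked ExtINT, CPU 2 masked.
    print(cpus_accepting_extint({0: 0x00700, 1: 0x00700, 2: 0x10700}))
```

If such a check lists more than one CPU, the configuration matches the firmware behaviour Tom describes.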

2018-02-26 16:43:23

by Tom Lendacky

Subject: Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

On 2/26/2018 10:37 AM, Morton, Eric wrote:
> Thomas,
>
> Yazen dug out PLAT-21393 as sounding like this issue. I haven't had a chance to digest it.

Yes, internally to AMD, that was the bug that tracked the issue I was
referring to.

Thanks,
Tom

>
> Eric
>
> On 2/26/18, 10:31 AM, "Borislav Petkov" wrote:
>
> On Mon, Feb 26, 2018 at 10:14:10AM -0600, Tom Lendacky wrote:
> > On 2/24/2018 2:59 AM, Thomas Gleixner wrote:
> > > On Sat, 24 Feb 2018, Paul Menzel wrote:
> > >> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
> > >>> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
> > >>>> Borislav is seeing similar issues on larger AMD machines. The interrupt
> > >>>> seems to come from BIOS/microcode during bringup of secondary CPUs and we
> > >>>> have no idea why.
> > >>>
> > >>> Paul, can you boot 4.14 and grep your dmesg for something like:
> > >>>
> > >>> [ 0.000000] spurious 8259A interrupt: IRQ7.
> > >>>
> > >>> ?
> > >>
> > >> No, I do not see that. Please find the logs attached.
> > >
> > > From your 4.14 log:
> > >
> > > Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000 soft=e9b0c000
> > > Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.
> >
> > I think I remember seeing something like this previously and it turned out
> > to be a BIOS bug. All the AP's were enabled to work with the legacy 8259
> > interrupt controller. In an SMP system, only one processor in the system
> > should be configured to handle legacy 8259 interrupts (ExtINT delivery
> > mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode). Once
> > the BIOS was fixed, the spurious interrupt message went away.
> >
> > I believe at some point during UEFI, the APs were exposed to an ExtINT
> > interrupt. Since they were configured to handle ExtINT delivery mode and
> > interrupts were not yet enabled, the interrupt was left pending. When the
> > APs were started by the OS and interrupts were enabled, the interrupt
> > triggered. Since the original pending interrupt was handled by the BSP,
> > there was no longer an interrupt actually pending, so the 8259 responds
> > with IRQ 7 when queried by the OS. This occurred for each AP.
>
> Interesting - is this something that can happen on Zen too?
>
> Because I have such reports too.
>
> --
> Regards/Gruss,
> Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.
>
>

2018-03-29 07:22:41

by Kai-Heng Feng

Subject: Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

Hi Tom & Thomas,

> On Feb 27, 2018, at 12:42 AM, Tom Lendacky wrote:
>
> On 2/26/2018 10:37 AM, Morton, Eric wrote:
>> Thomas,
>>
>> Yazen dug out PLAT-21393 as sounding like this issue. I haven't had a
>> chance to digest it.
>
> Yes, internally to AMD, that was the bug that tracked the issue I was
> referring to.

There's also another user [1] affected by this issue.

Ethernet r8169 failed to work after system suspend:

[ 150.944101] do_IRQ: 3.37 No irq handler for vector
[ 150.944105] r8169 0000:01:00.0 enp1s0: link down

It's a regression that started with v4.15-rc5.

[1] https://bugs.launchpad.net/bugs/1752772

Kai-Heng

>
> Thanks,
> Tom
>
>> Eric
>>
>> On 2/26/18, 10:31 AM, "Borislav Petkov" wrote:
>>
>> On Mon, Feb 26, 2018 at 10:14:10AM -0600, Tom Lendacky wrote:
>>> On 2/24/2018 2:59 AM, Thomas Gleixner wrote:
>>>> On Sat, 24 Feb 2018, Paul Menzel wrote:
>>>>> Am 23.02.2018 um 20:09 schrieb Borislav Petkov:
>>>>>> On Fri, Feb 23, 2018 at 07:18:34PM +0100, Thomas Gleixner wrote:
>>>>>>> Borislav is seeing similar issues on larger AMD machines. The
>>>>>>> interrupt
>>>>>>> seems to come from BIOS/microcode during bringup of secondary CPUs
>>>>>>> and we
>>>>>>> have no idea why.
>>>>>>
>>>>>> Paul, can you boot 4.14 and grep your dmesg for something like:
>>>>>>
>>>>>> [ 0.000000] spurious 8259A interrupt: IRQ7.
>>>>>>
>>>>>> ?
>>>>>
>>>>> No, I do not see that. Please find the logs attached.
>>>>
>>>> From your 4.14 log:
>>>>
>>>> Feb 19 09:48:06.843173 kodi kernel: CPU 0 irqstacks, hard=e9b0a000
>>>> soft=e9b0c000
>>>> Feb 19 09:48:06.843216 kodi kernel: spurious 8259A interrupt: IRQ7.
>>>
>>> I think I remember seeing something like this previously and it turned
>>> out
>>> to be a BIOS bug. All the AP's were enabled to work with the legacy 8259
>>> interrupt controller. In an SMP system, only one processor in the system
>>> should be configured to handle legacy 8259 interrupts (ExtINT delivery
>>> mode - see Intel's SDM, Volume 3, section 10.5.1, Delivery Mode). Once
>>> the BIOS was fixed, the spurious interrupt message went away.
>>>
>>> I believe at some point during UEFI, the APs were exposed to an ExtINT
>>> interrupt. Since they were configured to handle ExtINT delivery mode and
>>> interrupts were not yet enabled, the interrupt was left pending. When
>>> the
>>> APs were started by the OS and interrupts were enabled, the interrupt
>>> triggered. Since the original pending interrupt was handled by the BSP,
>>> there was no longer an interrupt actually pending, so the 8259 responds
>>> with IRQ 7 when queried by the OS. This occurred for each AP.
>>
>> Interesting - is this something that can happen on Zen too?
>>
>> Because I have such reports too.
>>
>> --
>> Regards/Gruss,
>> Boris.
>>
>> Good mailing practices for 400: avoid top-posting and trim the reply.

2018-03-29 08:30:47

by Paul Menzel

Subject: Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

Dear Yazen, Eric, Tom,


On 02/26/18 17:42, Tom Lendacky wrote:
> On 2/26/2018 10:37 AM, Morton, Eric wrote:

>> Yazen dug out PLAT-21393 as sounding like this issue. I haven't had
>> a chance to digest it.
>
> Yes, internally to AMD, that was the bug that tracked the issue I
> was referring to.

If you could point me to more details on how to fix this in AGESA, I could
probably fix the version in coreboot.


Kind regards,

Paul



2018-03-29 20:59:06

by Morton, Eric

Subject: Re: `do_IRQ: 1.55 No irq handler for vector` on ASRock E350M1

+Frank G.

Eric

By reading this e-mail, you are consenting to agree with the opinions disclosed within.
On 3/29/18, 3:29 AM, "Paul Menzel" wrote:

Dear Yazen, Eric, Tom,


On 02/26/18 17:42, Tom Lendacky wrote:
> On 2/26/2018 10:37 AM, Morton, Eric wrote:

>> Yazen dug out PLAT-21393 as sounding like this issue. I haven't had
>> a chance to digest it.
>
> Yes, internally to AMD, that was the bug that tracked the issue I
> was referring to.

If you could point me to more details how to fix this in AGESA, I could
probably fix the version in coreboot to fix the issue.


Kind regards,

Paul