2020-04-17 16:10:37

by Marc Dionne

Subject: FreeNAS VM disk access errors, bisected to commit 6f1a4891a592

Hi,

Commit 6f1a4891a592 ("x86/apic/msi: Plug non-maskable MSI affinity
race") causes Linux VMs hosted on FreeNAS (bhyve hypervisor) to lose
access to their disk devices shortly after boot. The disks are zfs
zvols on the host, presented to each VM.

Background: I recently updated some fedora 31 VMs running under the
bhyve hypervisor (hosted on a FreeNAS mini), and they moved to a
distro 5.5 kernel (5.5.15). Shortly after reboot, the disks became
inaccessible with any operation getting EIO errors. Booting back into
a 5.4 kernel, everything was fine. I built a 5.7-rc1 kernel, which
showed the same symptoms, and was then able to bisect it down to
commit 6f1a4891a592. Note that the symptoms do not occur on every
boot, but often enough (roughly 80%) to make bisection possible.

Applying a manual revert of 6f1a4891a592 on top of mainline from
yesterday gives me a kernel that works fine.

Not sure which details are useful, but here are some bits that might
be relevant:
- FreeNAS host is running FreeNAS-11.3-U2
- efi/bios details:
  efi: EFI v2.40 by BHYVE
  efi: SMBIOS=0x7fb5b000 ACPI=0x7fb88000 ACPI 2.0=0x7fb88014
  DMI: BHYVE, BIOS 1.00 03/14/2014
- A sample disk:
  ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
  ata4.00: ATA-9: BHYVE SATA DISK, 001, max UDMA/133
  ata4.00: 2147483680 sectors, multi 128: LBA48 NCQ (depth 32)
  ata4.00: configured for UDMA/133
  scsi 3:0:0:0: Direct-Access ATA BHYVE SATA DISK 001 PQ: 0 ANSI: 5
  scsi 3:0:0:0: Attached scsi generic sg3 type 0
- The first sign of a problem on a "bad" kernel shows up as:
  ata1.00: exception Emask 0x0 SAct 0x78000001 SErr 0x0 action 0x6 frozen

Thanks,
Marc


2020-04-17 20:21:46

by Thomas Gleixner

Subject: Re: FreeNAS VM disk access errors, bisected to commit 6f1a4891a592

Marc,

Marc Dionne <[email protected]> writes:

> Commit 6f1a4891a592 ("x86/apic/msi: Plug non-maskable MSI affinity
> race") causes Linux VMs hosted on FreeNAS (bhyve hypervisor) to lose
> access to their disk devices shortly after boot. The disks are zfs
> zvols on the host, presented to each VM.
>
> Background: I recently updated some fedora 31 VMs running under the
> bhyve hypervisor (hosted on a FreeNAS mini), and they moved to a
> distro 5.5 kernel (5.5.15). Shortly after reboot, the disks became
> inaccessible with any operation getting EIO errors. Booting back into
> a 5.4 kernel, everything was fine. I built a 5.7-rc1 kernel, which
> showed the same symptoms, and was then able to bisect it down to
> commit 6f1a4891a592. Note that the symptoms do not occur on every
> boot, but often enough (roughly 80%) to make bisection possible.
>
> Applying a manual revert of 6f1a4891a592 on top of mainline from
> yesterday gives me a kernel that works fine.

We verified on real hardware and on various hypervisors that the fix
actually works correctly.

That makes me assume that the staged approach of changing affinity for
this non-maskable MSI mess makes your particular hypervisor unhappy.
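
For reference, the staged update in msi_set_affinity()
(arch/x86/kernel/apic/msi.c) looks roughly like the sketch below. This
is simplified from memory; the vector_irq[] bookkeeping, locking and
error handling are left out, so treat the real code as authoritative:

  /*
   * Simplified sketch of the staged affinity update added by
   * 6f1a4891a592; bookkeeping, locking and error handling omitted.
   */
  static int msi_set_affinity(struct irq_data *irqd,
                              const struct cpumask *mask, bool force)
  {
          struct irq_data *parent = irqd->parent_data;
          struct irq_cfg old_cfg = *irqd_cfg(irqd);  /* old vector/dest */
          struct irq_cfg *cfg;
          int ret;

          /* Reserve a vector on the new target CPU */
          ret = parent->chip->irq_set_affinity(parent, mask, force);
          cfg = irqd_cfg(irqd);                      /* new vector/dest */

          /*
           * If only one of vector or destination changes, a non-atomic
           * update of the MSI message cannot be torn in a harmful way,
           * so the message can be written directly.
           */
          if (cfg->vector == old_cfg.vector ||
              cfg->dest_apicid == old_cfg.dest_apicid) {
                  irq_msi_update_msg(irqd, cfg);
                  return ret;
          }

          /*
           * Stage 1: switch the device to the new vector while keeping
           * the old destination, i.e. the CPU this code runs on.  An
           * interrupt raised in this window arrives on either the old
           * or the new vector on this CPU; both can be handled, the
           * new one at worst as a spurious interrupt.
           */
          old_cfg.vector = cfg->vector;
          irq_msi_update_msg(irqd, &old_cfg);

          /* Stage 2: move the new vector over to the new target CPU */
          irq_msi_update_msg(irqd, cfg);
          return ret;
  }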

Are there any messages like this:

"do_IRQ: 0.83 No irq handler for vector"

in dmesg on the Linux side? If they appear, it should be before the
disk timeouts happen.
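
For context, that message comes from the vector entry path: when an
interrupt arrives on a vector for which the CPU has no handler
installed, the kernel can only ack it and complain. Roughly, simplified
from do_IRQ() in arch/x86/kernel/irq.c:

  desc = __this_cpu_read(vector_irq[vector]);
  if (!IS_ERR_OR_NULL(desc)) {
          handle_irq(desc, regs);
  } else {
          ack_APIC_irq();
          if (desc == VECTOR_UNUSED)
                  pr_emerg_ratelimited("%s: %d.%d No irq handler for vector\n",
                                       __func__, smp_processor_id(), vector);
  }

Seeing it would mean an interrupt was injected with a vector/destination
pairing the guest no longer has set up.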

I have absolutely zero knowledge about bhyve, so may I suggest talking
to the bhyve experts about this.

Thanks,

tglx

2020-04-18 00:51:46

by Marc Dionne

Subject: Re: FreeNAS VM disk access errors, bisected to commit 6f1a4891a592

On Fri, Apr 17, 2020 at 5:19 PM Thomas Gleixner <[email protected]> wrote:
>
> Marc,
>
> Marc Dionne <[email protected]> writes:
>
> > Commit 6f1a4891a592 ("x86/apic/msi: Plug non-maskable MSI affinity
> > race") causes Linux VMs hosted on FreeNAS (bhyve hypervisor) to lose
> > access to their disk devices shortly after boot. The disks are zfs
> > zvols on the host, presented to each VM.
> >
> > Background: I recently updated some fedora 31 VMs running under the
> > bhyve hypervisor (hosted on a FreeNAS mini), and they moved to a
> > distro 5.5 kernel (5.5.15). Shortly after reboot, the disks became
> > inaccessible with any operation getting EIO errors. Booting back into
> > a 5.4 kernel, everything was fine. I built a 5.7-rc1 kernel, which
> > showed the same symptoms, and was then able to bisect it down to
> > commit 6f1a4891a592. Note that the symptoms do not occur on every
> > boot, but often enough (roughly 80%) to make bisection possible.
> >
> > Applying a manual revert of 6f1a4891a592 on top of mainline from
> > yesterday gives me a kernel that works fine.
>
> We verified on real hardware and on various hypervisors that the fix
> actually works correctly.
>
> That makes me assume that the staged approach of changing affinity for
> this non-maskable MSI mess makes your particular hypervisor unhappy.
>
> Are there any messages like this:
>
> "do_IRQ: 0.83 No irq handler for vector"

I haven't seen those, although I only have a VNC console that scrolls
by rather fast. I did see a report from someone running Ubuntu 18.04
that had this after the initial errors:

do_IRQ: 2.35 No irq handler for vector
ata1.00: revalidation failed (error=-5)

> in dmesg on the Linux side? If they appear, it should be before the
> disk timeouts happen.
>
> I have absolutely zero knowledge about bhyve, so may I suggest talking
> to the bhyve experts about this.

I opened a ticket with iXsystems. I noticed several people reporting
the same problem in their community forums.

Marc

2020-04-18 09:47:24

by Thomas Gleixner

Subject: Re: FreeNAS VM disk access errors, bisected to commit 6f1a4891a592

Marc,

Marc Dionne <[email protected]> writes:
> On Fri, Apr 17, 2020 at 5:19 PM Thomas Gleixner <[email protected]> wrote:
>> That makes me assume that the staged approach of changing affinity for
>> this non-maskable MSI mess makes your particular hypervisor unhappy.
>>
>> Are there any messages like this:
>>
>> "do_IRQ: 0.83 No irq handler for vector"
>
> I haven't seen those, although I only have a VNC console that scrolls
> by rather fast. I did see a report from someone running Ubuntu 18.04
> that had this after the initial errors:
>
> do_IRQ: 2.35 No irq handler for vector
> ata1.00: revalidation failed (error=-5)

I expected that, and it points to an issue on the HV side, either in
affinity management or in the local APIC emulation.
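
To make that concrete, here is a purely hypothetical sketch of the
invariant the hypervisor side has to maintain when the guest rewrites
an MSI message while the device is active. This is not bhyve code;
vmsi_update(), vmsi_deliver(), inject_into_lapic(), dest_from() and
vector_from() are made-up names for illustration only:

  #include <stdint.h>

  /* Hypothetical helpers, named for illustration only */
  extern void inject_into_lapic(uint32_t dest_apic_id, uint8_t vector);
  extern uint32_t dest_from(uint64_t msi_addr);
  extern uint8_t vector_from(uint32_t msi_data);

  struct vmsi {
          uint64_t addr;  /* destination APIC ID encoded by the guest */
          uint32_t data;  /* vector encoded by the guest */
  };

  /* Guest write path: the MSI message has just been reprogrammed */
  void vmsi_update(struct vmsi *m, uint64_t addr, uint32_t data)
  {
          m->addr = addr;
          m->data = data;
          /* Any cached (destination, vector) route derived from the
           * old message has to be dropped here. */
  }

  /* Device interrupt delivery path */
  void vmsi_deliver(const struct vmsi *m)
  {
          /* The route must come from the *current* message.  Injecting
           * with a stale destination or vector lands on a vector the
           * guest no longer services, and do_IRQ() then prints
           * "No irq handler for vector". */
          inject_into_lapic(dest_from(m->addr), vector_from(m->data));
  }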

Thanks,

tglx