Hi arm64 maintainers,
We now have some syzbot instances testing arm64 (woohoo!) using qemu
emulation. I wanted to write up the current status.
There are three instances. The first uses KASAN:
https://syzkaller.appspot.com/upstream?manager=ci-qemu2-arm64
the second uses KASAN and a 32-bit userspace test load (compat):
https://syzkaller.appspot.com/upstream?manager=ci-qemu2-arm64-compat
the third uses MTE/KASAN_HW_TAGS:
https://syzkaller.appspot.com/upstream?manager=ci-qemu2-arm64-mte
Kernel configs:
https://github.com/google/syzkaller/blob/master/dashboard/config/linux/upstream-arm64-kasan.config
https://github.com/google/syzkaller/blob/master/dashboard/config/linux/upstream-arm64-mte.config
The instances have KCOV disabled because it slows down execution too
much (KASAN under qemu emulation is already extremely slow), so there is
no coverage guidance and no coverage reports for now :(
The instances found a few arm64-specific issues that we have not
observed on other instances:
https://syzkaller.appspot.com/bug?id=1d22a2cc3521d5cf6b41bd6b825793c2015f861f
https://syzkaller.appspot.com/bug?id=bb2c16b0e13b4de4bbf22cf6a4b9b16fb0c20eea
https://syzkaller.appspot.com/bug?id=b75386f45318ec181b7f49260d619fac9877d456
https://syzkaller.appspot.com/bug?id=5a1bc29bca656159f95c7c8bb30e3776ca860332
but they mostly re-discover known bugs we already found on x86.
The instances use qemu emulation and lots of debug configs, so they
are quite slow, and it makes sense to target them at arm64-specific
parts of the kernel as much as possible (rather
than stressing generic subsystems that are already stressed on x86).
So the question is: what arm64-specific parts are there that we can reach
in qemu?
Can you think of any qemu flags (cpu features, device emulation, etc)?
Any kernel subsystems with heavy arm-specific parts that we may be missing?
Testing some of the arm64 drivers that qemu can emulate may be the
most profitable thing.
Currently the instances use the following flags:
-machine virt,virtualization=on,graphics=on,usb=on -cpu cortex-a57
-machine virt,virtualization=on,mte=on,graphics=on,usb=on -cpu max
mte=on + virtualization=on is broken either in the kernel or in qemu:
https://lore.kernel.org/lkml/CAAeHK+wDz8aSLyjq1b=q3+HG9aJXxwYR6+gN_fTttMN5osM5gg@mail.gmail.com/
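For concreteness, a full invocation built from the flags above might look like the following (the kernel image path, rootfs path, and memory/CPU sizes are hypothetical placeholders, not the exact syzbot setup):

```
qemu-system-aarch64 \
  -machine virt,virtualization=on,graphics=on,usb=on \
  -cpu cortex-a57 \
  -m 2G -smp 2 \
  -kernel /path/to/Image \
  -append "console=ttyAMA0 root=/dev/vda" \
  -drive file=/path/to/rootfs.img,if=virtio,format=raw \
  -nographic
```

For the MTE instance, the machine line would instead be `-machine virt,virtualization=on,mte=on,graphics=on,usb=on -cpu max`.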
On Thu, Mar 11, 2021 at 12:38:21PM +0100, 'Dmitry Vyukov' via syzkaller wrote:
> Hi arm64 maintainers,
Hi Dmitry,
> We now have some syzbot instances testing arm64 (woohoo!) using qemu
> emulation. I wanted to write up the current status.
Nice!
> There are 3 instances, first uses KASAN:
> https://syzkaller.appspot.com/upstream?manager=ci-qemu2-arm64
> second KASAN and 32-bit userspace test load (compat):
> https://syzkaller.appspot.com/upstream?manager=ci-qemu2-arm64-compat
> third uses MTE/KASAN_HWTAGS:
> https://syzkaller.appspot.com/upstream?manager=ci-qemu2-arm64-mte
>
> Kernel configs:
> https://github.com/google/syzkaller/blob/master/dashboard/config/linux/upstream-arm64-kasan.config
> https://github.com/google/syzkaller/blob/master/dashboard/config/linux/upstream-arm64-mte.config
FWIW, I keep my fuzzing config fragment in my fuzzing/* branches on
git.kernel.org, and for comparison my fragment for v5.12-rc1 is:
https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=fuzzing/5.12-rc1&id=6d9f7f8a2514fe882823fadbe7478228f71d7ab1
... I'm not sure whether there's anything in that which is novel to you.
> The instances have KCOV disabled because it slows down execution too
> much (KASAN in qemu emulation is already extremely slow), so no
> coverage guidance and coverage reports for now :(
>
> The instances found few arm64-specific issues that we have not
> observed on other instances:
> https://syzkaller.appspot.com/bug?id=1d22a2cc3521d5cf6b41bd6b825793c2015f861f
> https://syzkaller.appspot.com/bug?id=bb2c16b0e13b4de4bbf22cf6a4b9b16fb0c20eea
> https://syzkaller.appspot.com/bug?id=b75386f45318ec181b7f49260d619fac9877d456
> https://syzkaller.appspot.com/bug?id=5a1bc29bca656159f95c7c8bb30e3776ca860332
> but mostly re-discovering known bugs we already found on x86.
Likewise, my general experience these days (fuzzing under KVM on a
ThunderX2 host) is that we mostly hit issues in core code or drivers
rather than anything strictly specific to arm64. As my host is ARMv8.1
that might just be by virtue of not exercising many of the new
architectural features.
> The instances use qemu emulation and lots of debug configs, so they
> are quite slow and it makes sense to target them at arm64-specific
> parts of the kernel as much as possible (rather
> than stress generic subsystems that are already stressed on x86).
> So the question is: what arm64-specific parts are there that we can reach
> in qemu?
> Can you think of any qemu flags (cpu features, device emulation, etc)?
Generally, `-cpu max` will expose the more interesting CPU features, and
you already seem to have that, so I think you're mostly there on that
front.
Devices vary a lot between SoCs (and most aren't even emulated), so
unless you have particular platforms in mind I'd suggest it might be
better to just use PV devices and try to focus fuzzing on arch code and
common code like mm rather than drivers.
> Any kernel subsystems with heavy arm-specific parts that we may be missing?
It looks like your configs already have BPF, which is probably one of
the more interesting subsystems with architecture-specific bits, so I
don't have further suggestions on that front.
> Testing some of the arm64 drivers that qemu can emulate may be the
> most profitable thing.
> Currently the instances use the following flags:
> -machine virt,virtualization=on,graphics=on,usb=on -cpu cortex-a57
> -machine virt,virtualization=on,mte=on,graphics=on,usb=on -cpu max
With `-cpu max`, QEMU will use a relatively expensive SW implementation
of pointer authentication (which I found significantly magnified the
cost of instrumentation like kcov), so depending on your priorities you
might want to disable that, or (assuming you have a recent enough build
of QEMU) you might want to force the use of a cheaper algorithm by
passing `-cpu max,pauth-impdef=on`.
The relevant QEMU commit is:
eb94284d0812b4e7 ("target/arm: Add cpu properties to control pauth")
... but it looks like that might not be in a tagged release yet.
Thanks,
Mark.
> mte=on + virtualization=on is broken either in the kernel or in qemu:
> https://lore.kernel.org/lkml/CAAeHK+wDz8aSLyjq1b=q3+HG9aJXxwYR6+gN_fTttMN5osM5gg@mail.gmail.com/
>
> --
> You received this message because you are subscribed to the Google Groups "syzkaller" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller/CACT4Y%2BbeyZ7rjmy7im0KdSU-Pcqd4Rud3xsxonBbYVk0wU-B9g%40mail.gmail.com.
On Thu, Mar 11, 2021 at 12:38 PM Dmitry Vyukov <[email protected]> wrote:
>
> The instances found few arm64-specific issues that we have not
> observed on other instances:
I've had a brief look at these:
> https://syzkaller.appspot.com/bug?id=1d22a2cc3521d5cf6b41bd6b825793c2015f861f
This one doesn't seem arm64 specific at all. While the KASAN report has shown
up on arm64, the link to
https://syzkaller.appspot.com/bug?id=aa8808729c0a3540e6a29f0d45394665caf79dca
seems to be for x86 machines running into the same problem.
Looking deeper into the log, I see that fw_load_sysfs_fallback() finds
an existing
list entry on the global "pending_fw_head" list, which seems to have been freed
earlier (the allocation listed here is not for a firmware load, so presumably it
was recycled in the meantime). The log shows that this is the second time that
loading the regulatory database failed in that run, so my guess is that it was
the first failed load that left the freed firmware private data on the
list, but I
don't see how that happened.
> https://syzkaller.appspot.com/bug?id=bb2c16b0e13b4de4bbf22cf6a4b9b16fb0c20eea
This one rings a bell: opening an 8250 UART on a well-known port must fail
when no I/O ports are registered in the system, or when the PCI I/O ports
are mapped to an invalid area.
It seems to be attempting a register access at I/O port '1' (virtual
address 0xfffffbfffe800001 is one byte into the well-known PCI_IOBASE),
which is an unusual place for a UART; traditional PCs had it at 0x3F8.
This could be either a result of qemu claiming to support a PIO-based UART
at the first available address, or of the table of UARTs being
uninitialized .bss memory.
Definitely an arm64-specific bug.
> https://syzkaller.appspot.com/bug?id=b75386f45318ec181b7f49260d619fac9877d456
A freed entry on the timer list caused a bug when adding another entry.
The allocation from alloc_fdtable does not seem to be the one at fault,
as the fdtable does not contain a timer. Several of the linked kasan reports
point to ext4_fill_super() as the code that allocated and freed the timer
list entry, so it's possible that this is the same timer that later fails to
be inserted if we ever get to kfree(sbi) without killing the timer first.
I don't see how that could happen, but the code was recently rewritten
in c92dc856848f ("ext4: defer saving error info from atomic context").
> https://syzkaller.appspot.com/bug?id=5a1bc29bca656159f95c7c8bb30e3776ca860332
I see that reiserfs_xattr_jcreate_nblocks() is dereferencing a NULL
inode pointer -- inode->i_sb has offset 0x30. However, that doesn't
make any sense with the call chain, as the pointer was newly allocated
and checked for NULL.
Arnd
On Thu, Mar 11, 2021 at 1:33 PM Mark Rutland <[email protected]> wrote:
>
> On Thu, Mar 11, 2021 at 12:38:21PM +0100, 'Dmitry Vyukov' via syzkaller wrote:
> > Hi arm64 maintainers,
>
> Hi Dmitry,
>
> > We now have some syzbot instances testing arm64 (woohoo!) using qemu
> > emulation. I wanted to write up the current status.
>
> Nice!
>
> > There are 3 instances, first uses KASAN:
> > https://syzkaller.appspot.com/upstream?manager=ci-qemu2-arm64
> > second KASAN and 32-bit userspace test load (compat):
> > https://syzkaller.appspot.com/upstream?manager=ci-qemu2-arm64-compat
> > third uses MTE/KASAN_HWTAGS:
> > https://syzkaller.appspot.com/upstream?manager=ci-qemu2-arm64-mte
> >
> > Kernel configs:
> > https://github.com/google/syzkaller/blob/master/dashboard/config/linux/upstream-arm64-kasan.config
> > https://github.com/google/syzkaller/blob/master/dashboard/config/linux/upstream-arm64-mte.config
>
> FWIW, I keep my fuzzing config fragment in my fuzzing/* branches on
> git.kernel.org, and for comparison my fragment for v5.12-rc1 is:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=fuzzing/5.12-rc1&id=6d9f7f8a2514fe882823fadbe7478228f71d7ab1
>
> ... I'm not sure whether there's anything in that which is novel to you.
Hi Mark,
I've learned about DEBUG_TIMEKEEPING, which we had disabled; I am enabling it.
We also have CONTEXT_TRACKING_FORCE disabled. I don't completely
understand what it does. Is it also a "more debug checks" type of
config?
FWIW we have more debug configs:
https://github.com/google/syzkaller/blob/master/dashboard/config/linux/bits/debug.yml
https://github.com/google/syzkaller/blob/master/dashboard/config/linux/bits/base.yml
https://github.com/google/syzkaller/blob/master/dashboard/config/linux/bits/kasan.yml
https://github.com/google/syzkaller/blob/master/dashboard/config/linux/bits/kmemleak.yml
On Thu, Mar 11, 2021 at 1:33 PM Mark Rutland <[email protected]> wrote:
> On Thu, Mar 11, 2021 at 12:38:21PM +0100, 'Dmitry Vyukov' via syzkaller wrote:
> > Hi arm64 maintainers,
> > The instances have KCOV disabled because it slows down execution too
> > much (KASAN in qemu emulation is already extremely slow), so no
> > coverage guidance and coverage reports for now :(
> >
> > The instances found few arm64-specific issues that we have not
> > observed on other instances:
> > https://syzkaller.appspot.com/bug?id=1d22a2cc3521d5cf6b41bd6b825793c2015f861f
> > https://syzkaller.appspot.com/bug?id=bb2c16b0e13b4de4bbf22cf6a4b9b16fb0c20eea
> > https://syzkaller.appspot.com/bug?id=b75386f45318ec181b7f49260d619fac9877d456
> > https://syzkaller.appspot.com/bug?id=5a1bc29bca656159f95c7c8bb30e3776ca860332
> > but mostly re-discovering known bugs we already found on x86.
>
> Likewise, my general experience these days (fuzzing under KVM on a
> ThunderX2 host) is that we mostly hit issues in core code or drivers
> rather than anything strictly specific to arm64. As my host is ARMv8.1
> that might just be by virtue of not exercising many of the new
> architectural features.
>
> > The instances use qemu emulation and lots of debug configs, so they
> > are quite slow and it makes sense to target them at arm64-specific
> > parts of the kernel as much as possible (rather
> > than stress generic subsystems that are already stressed on x86).
> > So the question is: what arm64-specific parts are there that we can reach
> > in qemu?
> > Can you think of any qemu flags (cpu features, device emulation, etc)?
>
> Generally, `-cpu max` will expose the more interesting CPU features, and
> you already seem to have that, so I think you're mostly there on that
> front.
>
> Devices vary a lot between SoCs (and most aren't even emulated), so
> unless you have particular platforms in mind I'd suggest it might be
> better to just use PV devices and try to focus fuzzing on arch code and
> common code like mm rather than drivers.
I don't have any specific SoC in mind. I think we are interested in
covering something more commonly used rather than a driver used only
on 1 SoC.
Testing virt drivers is good, but since we have 3 arm64 instances, we
could make them use different boards to get more coverage.
What about things like pstore, numa, mtdblock, pflash? When I run
man qemu-system-aarch64, for some reason I see the x86_64 help, so I am
not sure whether these are applicable to arm64.
> > Any kernel subsystems with heavy arm-specific parts that we may be missing?
>
> It looks like your configs already have BPF, which is probably one of
> the more interesting subsystems with architecture-specific bits, so I
> don't have further suggestions on that front.
>
> > Testing some of the arm64 drivers that qemu can emulate may be the
> > most profitable thing.
> > Currently the instances use the following flags:
> > -machine virt,virtualization=on,graphics=on,usb=on -cpu cortex-a57
> > -machine virt,virtualization=on,mte=on,graphics=on,usb=on -cpu max
>
> With `-cpu max`, QEMU will use a relatively expensive SW implementation
> of pointer authentication (which I found significantly magnified the
> cost of instrumentation like kcov), so depending on your priorities you
> might want to disable that, or (assuming you have a recent enough build
> of QEMU) you might want to force the use of a cheaper algorithm by
> passing `-cpu max,pauth-impdef=on`.
>
> The relevant QEMU commit is:
>
> eb94284d0812b4e7 ("target/arm: Add cpu properties to control pauth")
>
> ... but it looks like that might not be in a tagged release yet.
Interesting. I need to note this somewhere.
> > mte=on + virtualization=on is broken either in the kernel or in qemu:
> > https://lore.kernel.org/lkml/CAAeHK+wDz8aSLyjq1b=q3+HG9aJXxwYR6+gN_fTttMN5osM5gg@mail.gmail.com/
On Thu, Mar 11, 2021 at 2:30 PM Arnd Bergmann <[email protected]> wrote:
>
> On Thu, Mar 11, 2021 at 12:38 PM Dmitry Vyukov <[email protected]> wrote:
> >
> > The instances found few arm64-specific issues that we have not
> > observed on other instances:
>
> I've had a brief look at these:
>
> > https://syzkaller.appspot.com/bug?id=1d22a2cc3521d5cf6b41bd6b825793c2015f861f
>
> This one doesn't seem arm64 specific at all. While the KASAN report has shown
> up on arm64, the link to
> https://syzkaller.appspot.com/bug?id=aa8808729c0a3540e6a29f0d45394665caf79dca
> seems to be for x86 machines running into the same problem.
You are right. It's probably a consequence of some configs being enabled.
I think we need to enable CONFIG_FW_LOADER_USER_HELPER on x86_64
upstream instances as well.
> Looking deeper into the log, I see that fw_load_sysfs_fallback() finds
> an existing
> list entry on the global "pending_fw_head" list, which seems to have been freed
> earlier (the allocation listed here is not for a firmware load, so presumably it
> was recycled in the meantime). The log shows that this is the second time that
> loading the regulatory database failed in that run, so my guess is that it was
> the first failed load that left the freed firmware private data on the
> list, but I
> don't see how that happened.
Can it be as simple as: fw_load_sysfs_fallback() adds fw to the pending
list, but then returns with an error without removing it from the list?
There are some error checks after that:
https://elixir.bootlin.com/linux/v5.12-rc2/source/drivers/base/firmware_loader/fallback.c#L536
and it seems that the caller deletes fw in this case:
https://elixir.bootlin.com/linux/v5.12-rc2/source/drivers/base/firmware_loader/main.c#L839
On Thu, Mar 11, 2021 at 2:30 PM Arnd Bergmann <[email protected]> wrote:
> >
> > The instances found few arm64-specific issues that we have not
> > observed on other instances:
>
> I've had a brief look at these:
>
> > https://syzkaller.appspot.com/bug?id=1d22a2cc3521d5cf6b41bd6b825793c2015f861f
>
> This one doesn't seem arm64 specific at all. While the KASAN report has shown
> up on arm64, the link to
> https://syzkaller.appspot.com/bug?id=aa8808729c0a3540e6a29f0d45394665caf79dca
> seems to be for x86 machines running into the same problem.
>
> Looking deeper into the log, I see that fw_load_sysfs_fallback() finds
> an existing
> list entry on the global "pending_fw_head" list, which seems to have been freed
> earlier (the allocation listed here is not for a firmware load, so presumably it
> was recycled in the meantime). The log shows that this is the second time that
> loading the regulatory database failed in that run, so my guess is that it was
> the first failed load that left the freed firmware private data on the
> list, but I
> don't see how that happened.
>
> > https://syzkaller.appspot.com/bug?id=bb2c16b0e13b4de4bbf22cf6a4b9b16fb0c20eea
>
> This one rings a bell: opening a 8250 uart on a well-known port must fail
> when no I/O ports are registered in the system, or when the PCI I/O ports
> are mapped to an invalid area.
>
> It seems to be attempting a register access at I/O port '1' (virtual
> address 0xfffffbfffe800001 is one byte into the well-known PCI_IOBASE),
> which is an unusual place for a UART, traditional PCs had it at 0x3F8.
>
> This could be either a result of qemu claiming to support a PIO based UART
> at the first available address, or the table of UARTS being uninitialized
> .bss memory.
>
> Definitely an arm64 specific bug.
I can reproduce this with just:
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
int main(void)
{
	int fd = syscall(__NR_openat, 0xffffffffffffff9cul, "/dev/ttyS3", 0ul, 0ul);
	char ch = 0;
	syscall(__NR_ioctl, fd, 0x5412, &ch); // TIOCSTI
	return 0;
}
It does not even do any tty setup... does it point to a qemu bug?
On Thu, Mar 11, 2021 at 6:25 PM Dmitry Vyukov <[email protected]> wrote:
>
> On Thu, Mar 11, 2021 at 2:30 PM Arnd Bergmann <[email protected]> wrote:
> >
> > On Thu, Mar 11, 2021 at 12:38 PM Dmitry Vyukov <[email protected]> wrote:
> > >
> > > The instances found few arm64-specific issues that we have not
> > > observed on other instances:
> >
> > I've had a brief look at these:
> >
> > > https://syzkaller.appspot.com/bug?id=1d22a2cc3521d5cf6b41bd6b825793c2015f861f
> >
> > This one doesn't seem arm64 specific at all. While the KASAN report has shown
> > up on arm64, the link to
> > https://syzkaller.appspot.com/bug?id=aa8808729c0a3540e6a29f0d45394665caf79dca
> > seems to be for x86 machines running into the same problem.
>
> You are right. It's probably a consequence of some configs being enabled.
> I think we need to enable CONFIG_FW_LOADER_USER_HELPER on x86_64
> upstream instances as well.
>
>
> > Looking deeper into the log, I see that fw_load_sysfs_fallback() finds
> > an existing
> > list entry on the global "pending_fw_head" list, which seems to have been freed
> > earlier (the allocation listed here is not for a firmware load, so presumably it
> > was recycled in the meantime). The log shows that this is the second time that
> > loading the regulatory database failed in that run, so my guess is that it was
> > the first failed load that left the freed firmware private data on the
> > list, but I
> > don't see how that happened.
>
> Can it be as simple as: fw_load_sysfs_fallback adds fw to the pending
> list, but then returns with an error w/o removing it from the list?
> There are some errors checks after that:
> https://elixir.bootlin.com/linux/v5.12-rc2/source/drivers/base/firmware_loader/fallback.c#L536
> and it seems that the caller deletes fw in this case:
> https://elixir.bootlin.com/linux/v5.12-rc2/source/drivers/base/firmware_loader/main.c#L839
I've enabled CONFIG_FW_LOADER_USER_HELPER for all instances and syzbot
came up with a repro and a KASAN report with alloc/free stacks:
https://lore.kernel.org/lkml/[email protected]/
On Thu, Mar 11, 2021 at 6:57 PM Dmitry Vyukov <[email protected]> wrote:
> On Thu, Mar 11, 2021 at 2:30 PM Arnd Bergmann <[email protected]> wrote:
> > >
> > > The instances found few arm64-specific issues that we have not
> > > observed on other instances:
> >
> > I've had a brief look at these:
> >
> > > https://syzkaller.appspot.com/bug?id=1d22a2cc3521d5cf6b41bd6b825793c2015f861f
> >
> > This one doesn't seem arm64 specific at all. While the KASAN report has shown
> > up on arm64, the link to
> > https://syzkaller.appspot.com/bug?id=aa8808729c0a3540e6a29f0d45394665caf79dca
> > seems to be for x86 machines running into the same problem.
> >
> > Looking deeper into the log, I see that fw_load_sysfs_fallback() finds
> > an existing
> > list entry on the global "pending_fw_head" list, which seems to have been freed
> > earlier (the allocation listed here is not for a firmware load, so presumably it
> > was recycled in the meantime). The log shows that this is the second time that
> > loading the regulatory database failed in that run, so my guess is that it was
> > the first failed load that left the freed firmware private data on the
> > list, but I
> > don't see how that happened.
> >
> > > https://syzkaller.appspot.com/bug?id=bb2c16b0e13b4de4bbf22cf6a4b9b16fb0c20eea
> >
> > This one rings a bell: opening a 8250 uart on a well-known port must fail
> > when no I/O ports are registered in the system, or when the PCI I/O ports
> > are mapped to an invalid area.
> >
> > It seems to be attempting a register access at I/O port '1' (virtual
> > address 0xfffffbfffe800001 is one byte into the well-known PCI_IOBASE),
> > which is an unusual place for a UART, traditional PCs had it at 0x3F8.
> >
> > This could be either a result of qemu claiming to support a PIO based UART
> > at the first available address, or the table of UARTS being uninitialized
> > .bss memory.
> >
> > Definitely an arm64 specific bug.
>
> I can reproduce this with just:
>
> #include <stdlib.h>
> #include <string.h>
> #include <sys/syscall.h>
> #include <sys/types.h>
> #include <unistd.h>
>
> int main(void)
> {
> int fd = syscall(__NR_openat, 0xffffffffffffff9cul, "/dev/ttyS3", 0ul, 0ul);
> char ch = 0;
> syscall(__NR_ioctl, fd, 0x5412, &ch); // TIOCSTI
> return 0;
> }
>
>
> It does not even do any tty setup... does it point to a qemu bug?
There are at least two bugs here, but both could be either in the
kernel or in qemu:
a) accessing a legacy ISA/LPC port should not result in an oops,
but should instead return values with all bits set. There could
be a ratelimited console warning about broken drivers, but we
can't assume that all drivers work correctly, as some ancient
PC style drivers still rely on this.
John Garry has recently worked on a related bugfix, so maybe
either this is the same bug he encountered (and hasn't merged
yet), or if his fix got merged there is still a remaining problem.
b) It should not be possible to open /dev/ttyS3 if the device is
not initialized. What is the output of 'cat /proc/tty/driver/serial'
on this machine? Do you see any messages from the serial
driver in the boot log?
Unfortunately there are so many different ways to probe devices
in the 8250 driver that I don't know where this comes from.
Your config file has
CONFIG_SERIAL_8250_PNP=y
CONFIG_SERIAL_8250_NR_UARTS=32
CONFIG_SERIAL_8250_RUNTIME_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
I guess it's probably the preconfigured UARTs that somehow
get probed without initialization, but it could also be
an explicit device incorrectly described by qemu.
Arnd
On Fri, Mar 12, 2021 at 9:40 AM Arnd Bergmann <[email protected]> wrote:
>
> On Thu, Mar 11, 2021 at 6:57 PM Dmitry Vyukov <[email protected]> wrote:
> > On Thu, Mar 11, 2021 at 2:30 PM Arnd Bergmann <[email protected]> wrote:
> > > >
> > > > The instances found few arm64-specific issues that we have not
> > > > observed on other instances:
> > >
> > > I've had a brief look at these:
> > >
> > > > https://syzkaller.appspot.com/bug?id=1d22a2cc3521d5cf6b41bd6b825793c2015f861f
> > >
> > > This one doesn't seem arm64 specific at all. While the KASAN report has shown
> > > up on arm64, the link to
> > > https://syzkaller.appspot.com/bug?id=aa8808729c0a3540e6a29f0d45394665caf79dca
> > > seems to be for x86 machines running into the same problem.
> > >
> > > Looking deeper into the log, I see that fw_load_sysfs_fallback() finds
> > > an existing
> > > list entry on the global "pending_fw_head" list, which seems to have been freed
> > > earlier (the allocation listed here is not for a firmware load, so presumably it
> > > was recycled in the meantime). The log shows that this is the second time that
> > > loading the regulatory database failed in that run, so my guess is that it was
> > > the first failed load that left the freed firmware private data on the
> > > list, but I
> > > don't see how that happened.
> > >
> > > > https://syzkaller.appspot.com/bug?id=bb2c16b0e13b4de4bbf22cf6a4b9b16fb0c20eea
> > >
> > > This one rings a bell: opening a 8250 uart on a well-known port must fail
> > > when no I/O ports are registered in the system, or when the PCI I/O ports
> > > are mapped to an invalid area.
> > >
> > > It seems to be attempting a register access at I/O port '1' (virtual
> > > address 0xfffffbfffe800001 is one byte into the well-known PCI_IOBASE),
> > > which is an unusual place for a UART, traditional PCs had it at 0x3F8.
> > >
> > > This could be either a result of qemu claiming to support a PIO based UART
> > > at the first available address, or the table of UARTS being uninitialized
> > > .bss memory.
> > >
> > > Definitely an arm64 specific bug.
> >
> > I can reproduce this with just:
> >
> > #include <stdlib.h>
> > #include <string.h>
> > #include <sys/syscall.h>
> > #include <sys/types.h>
> > #include <unistd.h>
> >
> > int main(void)
> > {
> > int fd = syscall(__NR_openat, 0xffffffffffffff9cul, "/dev/ttyS3", 0ul, 0ul);
> > char ch = 0;
> > syscall(__NR_ioctl, fd, 0x5412, &ch); // TIOCSTI
> > return 0;
> > }
> >
> >
> > It does not even do any tty setup... does it point to a qemu bug?
>
> There are at least two bugs here, but both could be either in the
> kernel or in qemu:
>
> a) accessing a legacy ISA/LPC port should not result in an oops,
> but should instead return values with all bits set. There could
> be a ratelimited console warning about broken drivers, but we
> can't assume that all drivers work correctly, as some ancient
> PC style drivers still rely on this.
> John Garry has recently worked on a related bugfix, so maybe
> either this is the same bug he encountered (and hasn't merged
> yet), or if his fix got merged there is still a remaining problem.
>
> b) It should not be possible to open /dev/ttyS3 if the device is
> not initialized. What is the output of 'cat /proc/tty/driver/serial'
> on this machine? Do you see any messages from the serial
> driver in the boot log?
> Unfortunately there are so many different ways to probe devices
> in the 8250 driver that I don't know where this comes from.
> Your config file has
> CONFIG_SERIAL_8250_PNP=y
> CONFIG_SERIAL_8250_NR_UARTS=32
> CONFIG_SERIAL_8250_RUNTIME_UARTS=4
> CONFIG_SERIAL_8250_EXTENDED=y
> I guess it's probably the preconfigured uarts that somehow
> become probed without initialization, but it could also be
> an explicit device incorrectly described by qemu.
Here is the full boot log, the /proc/tty/driver/serial output, and the crash:
https://gist.githubusercontent.com/dvyukov/084890d9b4aa7cd54f468e652a9b5881/raw/54c12248ff6a4885ba6c530d56b3adad59bc6187/gistfile1.txt
On Fri, Mar 12, 2021 at 10:21 AM Dmitry Vyukov <[email protected]> wrote:
>
> On Fri, Mar 12, 2021 at 10:16 AM Arnd Bergmann <[email protected]> wrote:
> >
> > On Fri, Mar 12, 2021 at 9:46 AM Dmitry Vyukov <[email protected]> wrote:
> > > On Fri, Mar 12, 2021 at 9:40 AM Arnd Bergmann <[email protected]> wrote:
> > > > On Thu, Mar 11, 2021 at 6:57 PM Dmitry Vyukov <[email protected]> wrote:
> > > > a) accessing a legacy ISA/LPC port should not result in an oops,
> > > > but should instead return values with all bits set. There could
> > > > be a ratelimited console warning about broken drivers, but we
> > > > can't assume that all drivers work correctly, as some ancient
> > > > PC style drivers still rely on this.
> > > > John Garry has recently worked on a related bugfix, so maybe
> > > > either this is the same bug he encountered (and hasn't merged
> > > > yet), or if his fix got merged there is still a remaining problem.
> >
> > > > b) It should not be possible to open /dev/ttyS3 if the device is
> > > > not initialized. What is the output of 'cat /proc/tty/driver/serial'
> > > > on this machine? Do you see any messages from the serial
> > > > driver in the boot log?
> > > > Unfortunately there are so many different ways to probe devices
> > > > in the 8250 driver that I don't know where this comes from.
> > > > Your config file has
> > > > CONFIG_SERIAL_8250_PNP=y
> > > > CONFIG_SERIAL_8250_NR_UARTS=32
> > > > CONFIG_SERIAL_8250_RUNTIME_UARTS=4
> > > > CONFIG_SERIAL_8250_EXTENDED=y
> > > > I guess it's probably the preconfigured uarts that somehow
> > > > become probed without initialization, but it could also be
> > > > an explicit device incorrectly described by qemu.
> > >
> > >
> > > Here is the full boot log, the /proc/tty/driver/serial output, and the crash:
> > > https://gist.githubusercontent.com/dvyukov/084890d9b4aa7cd54f468e652a9b5881/raw/54c12248ff6a4885ba6c530d56b3adad59bc6187/gistfile1.txt
> >
> > Ok, so there are four 8250 ports, and none of them are initialized,
> > while the console is on /dev/ttyAMA0 using a different driver.
> >
> > I'm fairly sure this is a bug in the kernel then, not in qemu.
> >
> >
> > I also see that the PCI I/O space gets mapped to a physical address:
> > [ 3.974309][ T1] pci-host-generic 4010000000.pcie: IO
> > 0x003eff0000..0x003effffff -> 0x0000000000
> >
> > So it's probably qemu that triggers the 'synchronous external
> > abort' when accessing the PCI I/O space, which in turn hints
> > towards a bug in qemu. Presumably it only returns data from
> > I/O ports that are actually mapped to a device, whereas real hardware
> > is supposed to return 0xffffffff when reading from unused I/O ports.
> > This would be separate from the work that John did, which only
> > fixed the kernel for accessing I/O port ranges that do not have
> > a corresponding MMU mapping to hardware ports.
>
> Will John's patch fix this crash w/o any changes in qemu? That would
> be good enough for syzbot. Otherwise we need to report the issue to
> qemu.
No, this was a third issue. As far as I remember, this would result in
a similar problem in the case where there is no PCI bus at all, or
where no PCI host has an I/O port range, so the inb() from the serial
driver would cause a page fault. The problem you ran into happens
in qemu when the PCI I/O ports are mapped to hardware registers
that cause an exception when accessed.
If you just want to work around the problem for now, it should
go away if you set CONFIG_SERIAL_8250_RUNTIME_UARTS
to zero.
Arnd
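The workaround amounts to a one-line config change; a sketch of the relevant fragment (the remaining 8250 options stay as in the syzbot config):

```
# Keep the 8250 driver built in, but register zero UARTs at boot, so the
# legacy ttyS0..ttyS3 ports are never probed behind the broken I/O window.
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_RUNTIME_UARTS=0
```

UARTs can still be added at runtime via the 8250 driver's nr_uarts module parameter if they are ever needed.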
On Fri, Mar 12, 2021 at 11:11 AM Arnd Bergmann <[email protected]> wrote:
> > > On Fri, Mar 12, 2021 at 9:46 AM Dmitry Vyukov <[email protected]> wrote:
> > > > On Fri, Mar 12, 2021 at 9:40 AM Arnd Bergmann <[email protected]> wrote:
> > > > > On Thu, Mar 11, 2021 at 6:57 PM Dmitry Vyukov <[email protected]> wrote:
> > > > > a) accessing a legacy ISA/LPC port should not result in an oops,
> > > > > but should instead return values with all bits set. There could
> > > > > be a ratelimited console warning about broken drivers, but we
> > > > > can't assume that all drivers work correctly, as some ancient
> > > > > PC style drivers still rely on this.
> > > > > John Garry has recently worked on a related bugfix, so maybe
> > > > > either this is the same bug he encountered (and hasn't merged
> > > > > yet), or if his fix got merged there is still a remaining problem.
> > >
> > > > > b) It should not be possible to open /dev/ttyS3 if the device is
> > > > > not initialized. What is the output of 'cat /proc/tty/driver/serial'
> > > > > on this machine? Do you see any messages from the serial
> > > > > driver in the boot log?
> > > > > Unfortunately there are so many different ways to probe devices
> > > > > in the 8250 driver that I don't know where this comes from.
> > > > > Your config file has
> > > > > CONFIG_SERIAL_8250_PNP=y
> > > > > CONFIG_SERIAL_8250_NR_UARTS=32
> > > > > CONFIG_SERIAL_8250_RUNTIME_UARTS=4
> > > > > CONFIG_SERIAL_8250_EXTENDED=y
> > > > > I guess it's probably the preconfigured uarts that somehow
> > > > > become probed without initialization, but it could also be
> > > > > an explicit device incorrectly described by qemu.
> > > >
> > > >
> > > > Here is the full boot log, /proc/tty/driver/serial and the crash:
> > > > https://gist.githubusercontent.com/dvyukov/084890d9b4aa7cd54f468e652a9b5881/raw/54c12248ff6a4885ba6c530d56b3adad59bc6187/gistfile1.txt
> > >
> > > Ok, so there are four 8250 ports, and none of them are initialized,
> > > while the console is on /dev/ttyAMA0 using a different driver.
> > >
> > > I'm fairly sure this is a bug in the kernel then, not in qemu.
> > >
> > >
> > > I also see that the PCI I/O space gets mapped to a physical address:
> > > [ 3.974309][ T1] pci-host-generic 4010000000.pcie: IO
> > > 0x003eff0000..0x003effffff -> 0x0000000000
> > >
> > > So it's probably qemu that triggers the 'synchronous external
> > > abort' when accessing the PCI I/O space, which in turn hints
> > > towards a bug in qemu. Presumably it only returns data from
> > > I/O ports that are actually mapped to a device when real hardware
> > > is supposed to return 0xffffffff when reading from unused I/O ports.
> > > This would be separate from the work that John did, which only
> > > fixed the kernel for accessing I/O port ranges that do not have
> > > a corresponding MMU mapping to hardware ports.
> >
> > Will John's patch fix this crash w/o any changes in qemu? That would
> > be good enough for syzbot. Otherwise we need to report the issue to
> > qemu.
>
> No, this was a third issue. As far as I remember, this would result in
> a similar problem in the case where there is no PCI bus at all, or
> where no PCI host has an I/O port range, so the inb() from the serial
> driver would cause a page fault. The problem you ran into happens
> in qemu when the PCI I/O ports are mapped to hardware registers
> that cause an exception when accessed.
>
> If you just want to work around the problem for now, it should
> go away if you set CONFIG_SERIAL_8250_RUNTIME_UARTS
> to zero.
It does not happen too often on syzbot so far, so let's try to do the
right thing first.
I've filed: https://bugs.launchpad.net/qemu/+bug/1918917
with a link to this thread. To be fair, I don't fully understand what
I am talking about, I hope I proxied your description properly.
On Fri, Mar 12, 2021 at 11:38 AM Dmitry Vyukov <[email protected]> wrote:
> On Fri, Mar 12, 2021 at 11:11 AM Arnd Bergmann <[email protected]> wrote:
>
> It does not happen too often on syzbot so far, so let's try to do the
> right thing first.
> I've filed: https://bugs.launchpad.net/qemu/+bug/1918917
> with a link to this thread. To be fair, I don't fully understand what
> I am talking about, I hope I proxied your description properly.
Thanks, looks good. I provided a little more detail in a comment there.
Arnd
On Fri, Mar 12, 2021 at 9:46 AM Dmitry Vyukov <[email protected]> wrote:
> On Fri, Mar 12, 2021 at 9:40 AM Arnd Bergmann <[email protected]> wrote:
> > On Thu, Mar 11, 2021 at 6:57 PM Dmitry Vyukov <[email protected]> wrote:
> > a) accessing a legacy ISA/LPC port should not result in an oops,
> > but should instead return values with all bits set. There could
> > be a ratelimited console warning about broken drivers, but we
> > can't assume that all drivers work correctly, as some ancient
> > PC style drivers still rely on this.
> > John Garry has recently worked on a related bugfix, so maybe
> > either this is the same bug he encountered (and hasn't merged
> > yet), or if his fix got merged there is still a remaining problem.
> > b) It should not be possible to open /dev/ttyS3 if the device is
> > not initialized. What is the output of 'cat /proc/tty/driver/serial'
> > on this machine? Do you see any messages from the serial
> > driver in the boot log?
> > Unfortunately there are so many different ways to probe devices
> > in the 8250 driver that I don't know where this comes from.
> > Your config file has
> > CONFIG_SERIAL_8250_PNP=y
> > CONFIG_SERIAL_8250_NR_UARTS=32
> > CONFIG_SERIAL_8250_RUNTIME_UARTS=4
> > CONFIG_SERIAL_8250_EXTENDED=y
> > I guess it's probably the preconfigured uarts that somehow
> > become probed without initialization, but it could also be
> > an explicit device incorrectly described by qemu.
>
>
> Here is the full boot log, /proc/tty/driver/serial and the crash:
> https://gist.githubusercontent.com/dvyukov/084890d9b4aa7cd54f468e652a9b5881/raw/54c12248ff6a4885ba6c530d56b3adad59bc6187/gistfile1.txt
Ok, so there are four 8250 ports, and none of them are initialized,
while the console is on /dev/ttyAMA0 using a different driver.
I'm fairly sure this is a bug in the kernel then, not in qemu.
I also see that the PCI I/O space gets mapped to a physical address:
[ 3.974309][ T1] pci-host-generic 4010000000.pcie: IO
0x003eff0000..0x003effffff -> 0x0000000000
So it's probably qemu that triggers the 'synchronous external
abort' when accessing the PCI I/O space, which in turn hints
towards a bug in qemu. Presumably it only returns data from
I/O ports that are actually mapped to a device when real hardware
is supposed to return 0xffffffff when reading from unused I/O ports.
This would be separate from the work that John did, which only
fixed the kernel for accessing I/O port ranges that do not have
a corresponding MMU mapping to hardware ports.
Arnd
On Fri, Mar 12, 2021 at 10:16 AM Arnd Bergmann <[email protected]> wrote:
>
> On Fri, Mar 12, 2021 at 9:46 AM Dmitry Vyukov <[email protected]> wrote:
> > On Fri, Mar 12, 2021 at 9:40 AM Arnd Bergmann <[email protected]> wrote:
> > > On Thu, Mar 11, 2021 at 6:57 PM Dmitry Vyukov <[email protected]> wrote:
> > > a) accessing a legacy ISA/LPC port should not result in an oops,
> > > but should instead return values with all bits set. There could
> > > be a ratelimited console warning about broken drivers, but we
> > > can't assume that all drivers work correctly, as some ancient
> > > PC style drivers still rely on this.
> > > John Garry has recently worked on a related bugfix, so maybe
> > > either this is the same bug he encountered (and hasn't merged
> > > yet), or if his fix got merged there is still a remaining problem.
>
> > > b) It should not be possible to open /dev/ttyS3 if the device is
> > > not initialized. What is the output of 'cat /proc/tty/driver/serial'
> > > on this machine? Do you see any messages from the serial
> > > driver in the boot log?
> > > Unfortunately there are so many different ways to probe devices
> > > in the 8250 driver that I don't know where this comes from.
> > > Your config file has
> > > CONFIG_SERIAL_8250_PNP=y
> > > CONFIG_SERIAL_8250_NR_UARTS=32
> > > CONFIG_SERIAL_8250_RUNTIME_UARTS=4
> > > CONFIG_SERIAL_8250_EXTENDED=y
> > > I guess it's probably the preconfigured uarts that somehow
> > > become probed without initialization, but it could also be
> > > an explicit device incorrectly described by qemu.
> >
> >
> > Here is the full boot log, /proc/tty/driver/serial and the crash:
> > https://gist.githubusercontent.com/dvyukov/084890d9b4aa7cd54f468e652a9b5881/raw/54c12248ff6a4885ba6c530d56b3adad59bc6187/gistfile1.txt
>
> Ok, so there are four 8250 ports, and none of them are initialized,
> while the console is on /dev/ttyAMA0 using a different driver.
>
> I'm fairly sure this is a bug in the kernel then, not in qemu.
>
>
> I also see that the PCI I/O space gets mapped to a physical address:
> [ 3.974309][ T1] pci-host-generic 4010000000.pcie: IO
> 0x003eff0000..0x003effffff -> 0x0000000000
>
> So it's probably qemu that triggers the 'synchronous external
> abort' when accessing the PCI I/O space, which in turn hints
> towards a bug in qemu. Presumably it only returns data from
> I/O ports that are actually mapped to a device when real hardware
> is supposed to return 0xffffffff when reading from unused I/O ports.
> This would be separate from the work that John did, which only
> fixed the kernel for accessing I/O port ranges that do not have
> a corresponding MMU mapping to hardware ports.
Will John's patch fix this crash w/o any changes in qemu? That would
be good enough for syzbot. Otherwise we need to report the issue to
qemu.
On 12/03/2021 10:52, Arnd Bergmann wrote:
> On Fri, Mar 12, 2021 at 11:38 AM Dmitry Vyukov <[email protected]> wrote:
>> On Fri, Mar 12, 2021 at 11:11 AM Arnd Bergmann <[email protected]> wrote:
>>
>> It does not happen too often on syzbot so far, so let's try to do the
>> right thing first.
>> I've filed: https://bugs.launchpad.net/qemu/+bug/1918917
>> with a link to this thread. To be fair, I don't fully understand what
>> I am talking about, I hope I proxied your description properly.
>
> Thanks, looks good. I provided a little more detail in a comment there.
>
> Arnd
>
From looking at the bug report, my impression is that this is a qemu
issue, as the logical IO space is mapped to the PCI host bridge IO
space, and qemu does not handle accesses to that CPU addressable region
at all. As Arnd said.
However, we really should not be accessing logical IO ports 0 or 0x2f8
at all via ttyS3 if not enumerated from PCI device at that logical IO
port. That is what I think anyway, as who knows what device - if any -
really exists at that location. That is why I had this patch to just
stop accesses to legacy IO port regions on arm64:
https://lore.kernel.org/lkml/[email protected]/
Thanks,
John
On Mon, Mar 15, 2021 at 10:45 AM John Garry <[email protected]> wrote:
> >> It does not happen too often on syzbot so far, so let's try to do the
> >> right thing first.
> >> I've filed: https://bugs.launchpad.net/qemu/+bug/1918917
> >> with a link to this thread. To be fair, I don't fully understand what
> >> I am talking about, I hope I proxied your description properly.
> >
> > Thanks, looks good. I provided a little more detail in a comment there.
> >
> > Arnd
> >
>
> From looking at the bug report, my impression is that this is a qemu
> issue, as the logical IO space is mapped to the PCI host bridge IO
> space, and qemu does not handle accesses to that CPU addressable region
> at all. As Arnd said.
>
> However, we really should not be accessing logical IO ports 0 or 0x2f8
> at all via ttyS3 if not enumerated from PCI device at that logical IO
> port. That is what I think anyway, as who knows what device - if any -
> really exists at that location. That is why I had this patch to just
> stop accesses to legacy IO port regions on arm64:
>
> https://lore.kernel.org/lkml/[email protected]/
Hi John,
Thanks for the info.
The patch is from January, but it's not merged yet, right?
It will fix the crash we see, right?
On 15/03/2021 10:01, Dmitry Vyukov wrote:
> On Mon, Mar 15, 2021 at 10:45 AM John Garry<[email protected]> wrote:
>>>> It does not happen too often on syzbot so far, so let's try to do the
>>>> right thing first.
>>>> I've filed:https://bugs.launchpad.net/qemu/+bug/1918917
>>>> with a link to this thread. To be fair, I don't fully understand what
>>>> I am talking about, I hope I proxied your description properly.
>>> Thanks, looks good. I provided a little more detail in a comment there.
>>>
>>> Arnd
>>>
>> From looking at the bug report, my impression is that this is a qemu
>> issue, as the logical IO space is mapped to the PCI host bridge IO
>> space, and qemu does not handle accesses to that CPU addressable region
>> at all. As Arnd said.
>>
>> However, we really should not be accessing logical IO ports 0 or 0x2f8
>> at all via ttyS3 if not enumerated from PCI device at that logical IO
>> port. That is what I think anyway, as who knows what device - if any -
>> really exists at that location. That is why I had this patch to just
>> stop accesses to legacy IO port regions on arm64:
>>
>> https://lore.kernel.org/lkml/[email protected]/
> Hi John,
>
> Thanks for the info.
>
> The patch is from January, but it's not merged yet, right?
> It will fix the crash we see, right?
It's not merged, and it probably would solve this issue. But following
discussion with Arnd when it was originally posted, I still need to do
some analysis whether it is the proper thing to do.
However, as mentioned, the fundamental issue looks like qemu IO port
access, so it would be good to check that first.
On a related topic, I will cc colleague Jiahui Cen, who I think was
doing some work on qemu arm support in a related area, so may share some
experience here.
Jiahui Cen did have a patch to fix logical PIO code from this work [0],
which is not merged, but I don't think would help here. I will cc you on it.
Thanks,
John
[0]
https://lore.kernel.org/lkml/[email protected]/
On Mon, Mar 15, 2021 at 11:31 AM John Garry <[email protected]> wrote:
>
> On 15/03/2021 10:01, Dmitry Vyukov wrote:
> > On Mon, Mar 15, 2021 at 10:45 AM John Garry<[email protected]> wrote:
> >>>> It does not happen too often on syzbot so far, so let's try to do the
> >>>> right thing first.
> >>>> I've filed:https://bugs.launchpad.net/qemu/+bug/1918917
> >>>> with a link to this thread. To be fair, I don't fully understand what
> >>>> I am talking about, I hope I proxied your description properly.
> >>> Thanks, looks good. I provided a little more detail in a comment there.
> >>>
> >>> Arnd
> >>>
> >> From looking at the bug report, my impression is that this is a qemu
> >> issue, as the logical IO space is mapped to the PCI host bridge IO
> >> space, and qemu does not handle accesses to that CPU addressable region
> >> at all. As Arnd said.
> >>
> >> However, we really should not be accessing logical IO ports 0 or 0x2f8
> >> at all via ttyS3 if not enumerated from PCI device at that logical IO
> >> port. That is what I think anyway, as who knows what device - if any -
> >> really exists at that location. That is why I had this patch to just
> >> stop accesses to legacy IO port regions on arm64:
> >>
> >> https://lore.kernel.org/lkml/[email protected]/
> > Hi John,
> >
> > Thanks for the info.
> >
> > The patch is from January, but it's not merged yet, right?
> > It will fix the crash we see, right?
>
> It's not merged, and it probably would solve this issue. But following
> discussion with Arnd when it was originally posted, I still need to do
> some analysis whether it is the proper thing to do.
OK, I will tell syzbot about the fixing patch.
> However, as mentioned, the fundamental issue looks like qemu IO port
> access, so it would be good to check that first.
I've filed https://bugs.launchpad.net/qemu/+bug/1918917 for qemu.
> On a related topic, I will cc colleague Jiahui Cen, who I think was
> doing some work on qemu arm support in a related area, so may share some
> experience here.
>
> Jiahui Cen did have a patch to fix logical PIO code from this work [0],
> which is not merged, but I don't think would help here. I will cc you on it.
>
> Thanks,
> John
>
> [0]
> https://lore.kernel.org/lkml/[email protected]/
On Mon, Mar 15, 2021 at 11:29 AM John Garry <[email protected]> wrote:
> On 15/03/2021 10:01, Dmitry Vyukov wrote:
> > On Mon, Mar 15, 2021 at 10:45 AM John Garry<[email protected]> wrote:
> >>>> It does not happen too often on syzbot so far, so let's try to do the
> >>>> right thing first.
> >>>> I've filed:https://bugs.launchpad.net/qemu/+bug/1918917
> >>>> with a link to this thread. To be fair, I don't fully understand what
> >>>> I am talking about, I hope I proxied your description properly.
> >>> Thanks, looks good. I provided a little more detail in a comment there.
> >>>
> >>> Arnd
> >>>
> >> From looking at the bug report, my impression is that this is a qemu
> >> issue, as the logical IO space is mapped to the PCI host bridge IO
> >> space, and qemu does not handle accesses to that CPU addressable region
> >> at all. As Arnd said.
> >>
> >> However, we really should not be accessing logical IO ports 0 or 0x2f8
> >> at all via ttyS3 if not enumerated from PCI device at that logical IO
> >> port. That is what I think anyway, as who knows what device - if any -
> >> really exists at that location. That is why I had this patch to just
> >> stop accesses to legacy IO port regions on arm64:
> >>
> >> https://lore.kernel.org/lkml/[email protected]/
> > Hi John,
> >
> > Thanks for the info.
> >
> > The patch is from January, but it's not merged yet, right?
> > It will fix the crash we see, right?
>
> It's not merged, and it probably would solve this issue. But following
> discussion with Arnd when it was originally posted, I still need to do
> some analysis whether it is the proper thing to do.
Right, it might be something we decide to do in order to intentionally
break legacy ISA devices, but I don't think that patch should be necessary
as a bugfix, given that the same devices on x86 just work.
I think Huawei and some others have a BMC on either a PCI port or
an LPC bridge, with PC style device emulation for things that might
include VGA, serial, keyboard/mouse, RTC, IPMI, or ATAPI CDROM.
In a properly designed system, these should all be described by the
firmware if present, but today they should still work even if the firmware
does not describe them (provided the driver looks at the correct address,
which the 8250 driver did not).
Arnd
On Thu, Mar 11, 2021 at 05:56:46PM +0100, Dmitry Vyukov wrote:
> On Thu, Mar 11, 2021 at 1:33 PM Mark Rutland <[email protected]> wrote:
> > FWIW, I keep my fuzzing config fragment in my fuzzing/* branches on
> > git.kernel.org, and for comparison my fragment for v5.12-rc1 is:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=fuzzing/5.12-rc1&id=6d9f7f8a2514fe882823fadbe7478228f71d7ab1
> >
> > ... I'm not sure whether there's anything in that which is novel to you.
>
> Hi Mark,
>
> I've learned about DEBUG_TIMEKEEPING which we had disabled. I am enabling it.
> We also have CONTEXT_TRACKING_FORCE disabled. I don't completely
> understand what it's doing. Is it also "more debug checks" type of
> config?
Context tracking tracks user<->kernel transitions, and tries to disable
RCU when it is not needed (e.g. while a CPU is in userspace), to avoid
the need to perturb that CPU with IPIs and so on. Normally this is not
enabled unless CPUs are set aside for NOHZ usage, as there's some
expense in doing this tracking. I haven't measured how expensive it is
in practice.
CONTEXT_TRACKING_FORCE enables that tracking regardless of whether any
CPUs are set aside for NOHZ usage, and makes it easier to find bugs in
that tracking code, or where it is not being used correctly (e.g. missed
calls, or called in the wrong places).
I added it to my debug fragment back when I fixed the arm64 entry code
accounting for lockdep, and I keep it around to make sure that we don't
accidentally regress any of that.
Thanks,
Mark.
> FWIW we have more debug configs:
> https://github.com/google/syzkaller/blob/master/dashboard/config/linux/bits/debug.yml
> https://github.com/google/syzkaller/blob/master/dashboard/config/linux/bits/base.yml
> https://github.com/google/syzkaller/blob/master/dashboard/config/linux/bits/kasan.yml
> https://github.com/google/syzkaller/blob/master/dashboard/config/linux/bits/kmemleak.yml
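Mark's description maps to a config fragment like the following (a sketch against the kernel version discussed here; exact option names may differ on later kernels):

```
# Track user<->kernel transitions on all CPUs, not just nohz_full ones,
# to shake out missing or misplaced entry/exit accounting calls.
CONFIG_CONTEXT_TRACKING_FORCE=y
# Extra sanity checks in the timekeeping core.
CONFIG_DEBUG_TIMEKEEPING=y
```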
On Wed, Mar 17, 2021 at 7:45 PM Mark Rutland <[email protected]> wrote:
>
> On Thu, Mar 11, 2021 at 05:56:46PM +0100, Dmitry Vyukov wrote:
> > On Thu, Mar 11, 2021 at 1:33 PM Mark Rutland <[email protected]> wrote:
> > > FWIW, I keep my fuzzing config fragment in my fuzzing/* branches on
> > > git.kernel.org, and for comparison my fragment for v5.12-rc1 is:
> > >
> > > https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/commit/?h=fuzzing/5.12-rc1&id=6d9f7f8a2514fe882823fadbe7478228f71d7ab1
> > >
> > > ... I'm not sure whether there's anything in that which is novel to you.
> >
> > Hi Mark,
> >
> > I've learned about DEBUG_TIMEKEEPING which we had disabled. I am enabling it.
> > We also have CONTEXT_TRACKING_FORCE disabled. I don't completely
> > understand what it's doing. Is it also "more debug checks" type of
> > config?
>
> Context tracking tracks user<->kernel transitions, and tries to disable
> RCU when it is not needed (e.g. while a CPU is in userspace), to avoid
> the need to perturb that CPU with IPIs and so on. Normally this is not
> enabled unless CPUs are set aside for NOHZ usage, as there's some
> expense in doing this tracking. I haven't measured how expensive it is
> in practice.
>
> CONTEXT_TRACKING_FORCE enables that tracking regardless of whether any
> CPUs are set aside for NOHZ usage, and makes it easier to find bugs in
> that tracking code, or where it is not being used correctly (e.g. missed
> calls, or called in the wrong places).
>
> I added it to my debug fragment back when I fixed the arm64 entry code
> accounting for lockdep, and I keep it around to make sure that we don't
> accidentally regress any of that.
Looks like a debug config we should enable on syzbot. I am enabling it in:
https://github.com/google/syzkaller/pull/2500/commits/8ebea47133e458293e3c71e1189e18052286393d
Thanks
> > FWIW we have more debug configs:
> > https://github.com/google/syzkaller/blob/master/dashboard/config/linux/bits/debug.yml
> > https://github.com/google/syzkaller/blob/master/dashboard/config/linux/bits/base.yml
> > https://github.com/google/syzkaller/blob/master/dashboard/config/linux/bits/kasan.yml
> > https://github.com/google/syzkaller/blob/master/dashboard/config/linux/bits/kmemleak.yml
On Fri, 12 Mar 2021 at 09:16, Arnd Bergmann <[email protected]> wrote:
> So it's probably qemu that triggers the 'synchronous external
> abort' when accessing the PCI I/O space, which in turn hints
> towards a bug in qemu. Presumably it only returns data from
> I/O ports that are actually mapped to a device when real hardware
> is supposed to return 0xffffffff when reading from unused I/O ports.
Do you have a reference to the bit of the PCI spec that mandates
this -1/discard behaviour for attempted access to places where
there isn't actually a PCI device mapped ? The spec is pretty
long and hard to read...
(Knowing to what extent this behaviour is mandatory for all
PCI systems/host controllers vs just "it would be nice if the
gpex host controller worked this way" would help in figuring
out where in QEMU to change.)
thanks
-- PMM
On Sat, Mar 20, 2021 at 9:43 PM Peter Maydell <[email protected]> wrote:
>
> On Fri, 12 Mar 2021 at 09:16, Arnd Bergmann <[email protected]> wrote:
> > So it's probably qemu that triggers the 'synchronous external
> > abort' when accessing the PCI I/O space, which in turn hints
> > towards a bug in qemu. Presumably it only returns data from
> > I/O ports that are actually mapped to a device when real hardware
> > is supposed to return 0xffffffff when reading from unused I/O ports.
>
> Do you have a reference to the bit of the PCI spec that mandates
> this -1/discard behaviour for attempted access to places where
> there isn't actually a PCI device mapped ? The spec is pretty
> long and hard to read...
>
> (Knowing to what extent this behaviour is mandatory for all
> PCI systems/host controllers vs just "it would be nice if the
> gpex host controller worked this way" would help in figuring
> out where in QEMU to change.)
Sorry, I don't. I can probably find something in there myself,
but in the end it comes down to Linux drivers relying on this
behavior for ISA devices since the start. On an old-style x86
PC, this is the only method for finding out if a device is present
or not, since there is no description in the firmware that lists them.
PCIe devices remain backwards compatible with the old ISA
bus, so the old behavior must generally be kept possible.
I don't think a specification for the ISA bus exists at all, and
I found nothing in the related LPC specification about reading
from an unknown device.
https://tldp.org/HOWTO/Plug-and-Play-HOWTO-6.html#ss6.12
states the behavior of the ISA I/O ports and how Linux drivers rely
on that. Is that enough for you?
Arnd
On Sun, Mar 21, 2021 at 12:52 PM Arnd Bergmann <[email protected]> wrote:
> On Sat, Mar 20, 2021 at 9:43 PM Peter Maydell <[email protected]> wrote:
> https://tldp.org/HOWTO/Plug-and-Play-HOWTO-6.html#ss6.12
> states the behavior of the ISA I/O ports and how Linux drivers rely
> on that. Is that enough for you?
Actually it only mentions it very indirectly, by saying that if a present
device has 0xFF in a register, the detection may fail since it looks like
a missing device.
Arnd
On Sat, Mar 20, 2021 at 9:43 PM Peter Maydell <[email protected]> wrote:
>
> On Fri, 12 Mar 2021 at 09:16, Arnd Bergmann <[email protected]> wrote:
> > So it's probably qemu that triggers the 'synchronous external
> > abort' when accessing the PCI I/O space, which in turn hints
> > towards a bug in qemu. Presumably it only returns data from
> > I/O ports that are actually mapped to a device when real hardware
> > is supposed to return 0xffffffff when reading from unused I/O ports.
>
> Do you have a reference to the bit of the PCI spec that mandates
> this -1/discard behaviour for attempted access to places where
> there isn't actually a PCI device mapped ? The spec is pretty
> long and hard to read...
>
> (Knowing to what extent this behaviour is mandatory for all
> PCI systems/host controllers vs just "it would be nice if the
> gpex host controller worked this way" would help in figuring
> out where in QEMU to change.)
I spent some more time looking at both really old PCI specifications,
and new ones.
The old PCI specs seem to just leave this bit as out of scope because
it does not concern transactions on the bus. The PCI host controller
can either report a 'master abort' to the CPU, or ignore it, and each
bridge can decide to turn master aborts on reads into all 1s.
We do support some SoCs in Linux that trigger a CPU exception,
but we tend to deal with those with an ugly hack that just ignores
all exceptions from the CPU. Most host bridges fortunately behave
like an x86 PC though, and do not trigger an exception here.
In the PCIe 4.0 specification, I found that the behavior is configurable
at the root port, using the 'RP PIO Exception Register' at offset 0x1c
in the DPC Extended Capability. This register defaults to '0', meaning
that reads from an unknown port that generate an 'Unsupported Request
Completion' get turned into all 1s. If the firmware or OS enables it,
this can be turned into an AER log event, generate an interrupt or
a CPU exception.
Linux has a driver for DPC, which apparently configures it to
cause an interrupt to log the event, but it does not hook up the
CPU exception handler to this. I don't see an implementation of DPC
in qemu, which I take as an indication that it should use the
default behavior and cause neither an interrupt nor a CPU exception.
Arnd
On Sun, 21 Mar 2021 at 19:00, Arnd Bergmann <[email protected]> wrote:
>
> On Sat, Mar 20, 2021 at 9:43 PM Peter Maydell <[email protected]> wrote:
> >
> > On Fri, 12 Mar 2021 at 09:16, Arnd Bergmann <[email protected]> wrote:
> > > So it's probably qemu that triggers the 'synchronous external
> > > abort' when accessing the PCI I/O space, which in turn hints
> > > towards a bug in qemu. Presumably it only returns data from
> > > I/O ports that are actually mapped to a device when real hardware
> > > is supposed to return 0xffffffff when reading from unused I/O ports.
> >
> > Do you have a reference to the bit of the PCI spec that mandates
> > this -1/discard behaviour for attempted access to places where
> > there isn't actually a PCI device mapped ? The spec is pretty
> > long and hard to read...
> >
> > (Knowing to what extent this behaviour is mandatory for all
> > PCI systems/host controllers vs just "it would be nice if the
> > gpex host controller worked this way" would help in figuring
> > out where in QEMU to change.)
>
> I spent some more time looking at both really old PCI specifications,
> and new ones.
> The old PCI specs seem to just leave this bit as out of scope because
> it does not concern transactions on the bus. The PCI host controller
> can either report a 'master abort' to the CPU, or ignore it, and each
> bridge can decide to turn master aborts on reads into all 1s.
> We do support some SoCs in Linux that trigger a CPU exception,
> but we tend to deal with those with an ugly hack that just ignores
> all exceptions from the CPU. Most host bridges fortunately behave
> like an x86 PC though, and do not trigger an exception here.
There's apparently a bit in the PCI spec that reads:
The host bus bridge, in PC compatible systems, must return all
1's on a read transaction and
discard data on a write transaction when terminated with Master-Abort.
which obviously applies only to "PC compatible systems".
> In the PCIe 4.0 specification, I found that the behavior is configurable
> at the root port, using the 'RP PIO Exception Register' at offset 0x1c
> in the DPC Extended Capability. This register defaults to '0', meaning
> that reads from an unknown port that generate an 'Unsupported Request
> Completion' get turned into all 1s. If the firmware or OS enables it,
> this can be turned into an AER log event, generate an interrupt or
> a CPU exception.
>
> Linux has a driver for DPC, which apparently configures it to
> cause an interrupt to log the event, but it does not hook up the
> CPU exception handler to this. I don't see an implementation of DPC
> in qemu, which I take as an indication that it should use the
> default behavior and cause neither an interrupt nor a CPU exception.
Hmm, maybe. We should probably also implement -1/discard just because
we're not intending to have 'surprising' behaviour.
TBH I'm having difficulty seeing why the kernel should be doing
this at all, though. The device tree tells you you have a PCI
controller; PCI supports enumeration of devices; you know exactly
where everything is mapped because the BARs tell you that.
I don't see anything that justifies the kernel in randomly
dereferencing areas of the IO or memory windows where it hasn't
mapped anything. You shouldn't be probing for legacy ISA-port
devices unless you're on a system which might actually have them
(eg an x86 PC).
thanks
-- PMM
On Mon, Mar 22, 2021 at 2:53 PM Peter Maydell <[email protected]> wrote:
> On Sun, 21 Mar 2021 at 19:00, Arnd Bergmann <[email protected]> wrote:
> > On Sat, Mar 20, 2021 at 9:43 PM Peter Maydell <[email protected]> wrote:
> > > On Fri, 12 Mar 2021 at 09:16, Arnd Bergmann <[email protected]> wrote:
> > > > So it's probably qemu that triggers the 'synchronous external
> > > > abort' when accessing the PCI I/O space, which in turn hints
> > > > towards a bug in qemu. Presumably it only returns data from
> > > > I/O ports that are actually mapped to a device when real hardware
> > > > is supposed to return 0xffffffff when reading from unused I/O ports.
> > >
> > > Do you have a reference to the bit of the PCI spec that mandates
> > > this -1/discard behaviour for attempted access to places where
> > > there isn't actually a PCI device mapped ? The spec is pretty
> > > long and hard to read...
> > >
> > > (Knowing to what extent this behaviour is mandatory for all
> > > PCI systems/host controllers vs just "it would be nice if the
> > > gpex host controller worked this way" would help in figuring
> > > out where in QEMU to change.)
> >
> > I spent some more time looking at both really old PCI specifications,
> > and new ones.
> > The old PCI specs seem to just leave this bit as out of scope because
> > it does not concern transactions on the bus. The PCI host controller
> > can either report a 'master abort' to the CPU, or ignore it, and each
> > bridge can decide to turn master aborts on reads into all 1s.
> > We do have support for some SoCs in Linux that trigger a CPU exception,
> > but we tend to deal with those with an ugly hack that just ignores
> > all exceptions from the CPU. Most host bridges fortunately behave
> > like an x86 PC though, and do not trigger an exception here.
>
> There's apparently a bit in the PCI spec that reads:
> The host bus bridge, in PC compatible systems, must return all
> 1's on a read transaction and discard data on a write transaction
> when terminated with Master-Abort.
>
> which obviously applies only to "PC compatible systems".
Right. As far as I can tell, all ARMv8 and most ARMv7-based SoCs
do this to be more compatible with PC-style operating systems like
Linux, but you are right that the specification does not mandate
it, and by that reading the older ARMv5 SoCs seem to be compliant
as well.
> > Linux has a driver for DPC, which apparently configures it to
> > cause an interrupt to log the event, but it does not hook up the
> > CPU exception handler to this. I don't see an implementation of DPC
> > in qemu, which I take as an indication that it should use the
> > default behavior and cause neither an interrupt nor a CPU exception.
>
> Hmm, maybe. We should probably also implement -1/discard just because
> we're not intending to have 'surprising' behaviour.
>
> TBH I'm having difficulty seeing why the kernel should be doing
> this at all, though. The device tree tells you you have a PCI
> controller; PCI supports enumeration of devices; you know exactly
> where everything is mapped because the BARs tell you that.
> I don't see anything that justifies the kernel in randomly
> dereferencing areas of the IO or memory windows where it hasn't
> mapped anything. You shouldn't be probing for legacy ISA-port
> devices unless you're on a system which might actually have them
> (eg an x86 PC).
It only happened in this case because there is also a bug in
the 8250 serial port driver, which is configured to assume four
ports exist starting at port zero. On real arm64 hardware this is
apparently harmless because the driver has coped with this for 30
years ;-)
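For context, on PC-style systems that legacy probe walks the classic COM-port table (SERIAL_PORT_DFNS in the x86 headers); the well-known values are sketched below, with the arm64 failure mode noted in a comment:

```python
# The classic PC COM-port assignments the 8250 legacy probe assumes
# (SERIAL_PORT_DFNS on x86): (I/O base, IRQ) pairs.
LEGACY_COM_PORTS = [
    (0x3F8, 4),  # COM1 / ttyS0
    (0x2F8, 3),  # COM2 / ttyS1
    (0x3E8, 4),  # COM3 / ttyS2
    (0x2E8, 3),  # COM4 / ttyS3
]

# arm64 defines no such table, which is apparently how the probe in
# this thread ended up poking I/O port 0 instead of these addresses.
for base, irq in LEGACY_COM_PORTS:
    print(f"would probe 8250 at I/O port 0x{base:03X}, IRQ {irq}")
```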
There are a few other drivers that assume hardware is accessible
at the legacy addresses, and applications can also still open
/dev/port (if that is enabled at compile time) for the same
purpose. Examples
could be PC-style mouse/keyboard (emulated by a server BMC),
PATA/SATA controllers in pre-AHCI mode, VGA console, and a
couple of industrial I/O drivers that have ISA devices behind a
PCI bridge.
Most other actual ISA add-on card drivers can only be enabled
on kernels that support machines with real slots, so you could
get them on an i386 kernel running on a virtualized x86_64
machine, but not on ARMv6 or later kernels, and you can't run
pre-ARMv7 kernels on ARMv8 hardware.
Arnd
>>
>> There's apparently a bit in the PCI spec that reads:
>> The host bus bridge, in PC compatible systems, must return all
>> 1's on a read transaction and discard data on a write transaction
>> when terminated with Master-Abort.
>>
>> which obviously applies only to "PC compatible systems".
>
> Right. As far as I can tell, all ARMv8 and most ARMv7 based SoCs
> do this to be more compatible with PC style operating systems like
> Linux, but you are right that the specification here does not
> mandate that, and the older ARMv5 SoCs seem to be compliant
> as well based on this.
>
>>> Linux has a driver for DPC, which apparently configures it to
>>> cause an interrupt to log the event, but it does not hook up the
>>> CPU exception handler to this. I don't see an implementation of DPC
>>> in qemu, which I take as an indication that it should use the
>>> default behavior and cause neither an interrupt nor a CPU exception.
>>
>> Hmm, maybe. We should probably also implement -1/discard just because
>> we're not intending to have 'surprising' behaviour.
>>
>> TBH I'm having difficulty seeing why the kernel should be doing
>> this at all, though. The device tree tells you you have a PCI
>> controller; PCI supports enumeration of devices; you know exactly
>> where everything is mapped because the BARs tell you that.
>> I don't see anything that justifies the kernel in randomly
>> dereferencing areas of the IO or memory windows where it hasn't
>> mapped anything.
BIOS has described a CPU-addressable PIO region in the PCI hostbridge,
and the kernel has mapped it:
[ 3.974309][ T1] pci-host-generic 4010000000.pcie: IO
0x003eff0000..0x003effffff -> 0x0000000000
So I don't see why any accesses there should fault.
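Worth spelling out how that mapping is used: arm64 has no separate port-I/O instructions, so inb()/outb() become MMIO accesses at PCI_IOBASE plus the port number, inside exactly the window quoted above. A sketch of the translation, using the numbers from that log line:

```python
# PIO window from the pci-host-generic line above:
#   CPU 0x003eff0000..0x003effffff  ->  PIO 0x0000000000
IO_CPU_BASE = 0x3EFF0000
IO_PIO_BASE = 0x0
IO_SIZE = 0x10000  # 64 KiB of legacy port space

def pio_to_cpu(port: int) -> int:
    """Translate a legacy port number to the CPU physical address the
    kernel's inb()/outb() actually dereference (roughly
    PCI_IOBASE + port on arm64)."""
    assert IO_PIO_BASE <= port < IO_PIO_BASE + IO_SIZE, "port outside window"
    return IO_CPU_BASE + (port - IO_PIO_BASE)

print(hex(pio_to_cpu(0x3F8)))  # COM1's port lands at 0x3eff03f8
print(hex(pio_to_cpu(0x0)))    # port 0 is a valid, mapped address too
```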
>> You shouldn't be probing for legacy ISA-port
>> devices unless you're on a system which might actually have them
>> (eg an x86 PC).
>
> It only happened in this case because there is also a bug in
> the 8250 serial port driver that is configured to assume four ports
> exist at port zero. On real arm64 hardware, this is apparently
> harmless because the driver has coped with this for 30 years ;-)
>
> There are a few other drivers that assume hardware is accessible
> at the legacy addresses, and applications can also still open /dev/ioport
> (if that is enabled at compile time) for the same purpose. Examples
> could be PC-style mouse/keyboard (emulated by a server BMC),
> PATA/SATA controllers in pre-AHCI mode, VGA console, and a
> couple of industrial I/O drivers that have ISA devices behind a
> PCI bridge.
>
> Most other actual ISA add-on card drivers can only be enabled
> on kernels that support machines with real slots, so you could
> get them on an i386 kernel running a virtualized x86_64 machine,
> but not on ARMv6 or later kernels, and you can't run pre-ARMv7
> kernels on ARMv8 hardware.
There are also lots of hwmon drivers which use Super I/O, and
probe fixed PIO addresses for HW detection. These may be enabled
on any architecture (apart from PPC, which explicitly disabled
them to avoid issues like this).
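That Super I/O detection is typically a fixed handshake at index/data ports 0x2E/0x2F (or 0x4E/0x4F): write a register index to the index port, then read the value back from the data port. A sketch with the actual port I/O stubbed out, since it only does something on hardware with such a chip (register 0x20 as device ID follows the common Winbond/Nuvoton convention; the mock chip ID is made up):

```python
SIO_INDEX, SIO_DATA = 0x2E, 0x2F
SIO_REG_DEVID = 0x20  # device-ID register on many Winbond/Nuvoton parts

def read_sio_id(outb, inb) -> int:
    """Classic Super I/O ID probe: select the device-ID register via
    the index port, then read it from the data port. On hardware with
    no Super I/O chip this read may return 0xFF -- or, as in this
    thread, fault outright if the PIO window misbehaves."""
    outb(SIO_REG_DEVID, SIO_INDEX)
    return inb(SIO_DATA)

# Mock port I/O standing in for real inb()/outb():
regs, selected = {SIO_REG_DEVID: 0x52}, [None]
def mock_outb(val, port): selected[0] = val
def mock_inb(port): return regs.get(selected[0], 0xFF)

print(hex(read_sio_id(mock_outb, mock_inb)))  # 0x52 from the mock chip
```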
Thanks,
John
On Mon, 22 Mar 2021 at 16:36, John Garry <[email protected]> wrote:
>
> >>
> >> There's apparently a bit in the PCI spec that reads:
> >> The host bus bridge, in PC compatible systems, must return all
> >> 1's on a read transaction and discard data on a write transaction
> >> when terminated with Master-Abort.
> >>
> >> which obviously applies only to "PC compatible systems".
> >
> > Right. As far as I can tell, all ARMv8 and most ARMv7 based SoCs
> > do this to be more compatible with PC style operating systems like
> > Linux, but you are right that the specification here does not
> > mandate that, and the older ARMv5 SoCs seem to be compliant
> > as well based on this.
> >> TBH I'm having difficulty seeing why the kernel should be doing
> >> this at all, though. The device tree tells you you have a PCI
> >> controller; PCI supports enumeration of devices; you know exactly
> >> where everything is mapped because the BARs tell you that.
> >> I don't see anything that justifies the kernel in randomly
> >> dereferencing areas of the IO or memory windows where it hasn't
> >> mapped anything.
>
> BIOS has described a CPU-addressable PIO region in the PCI hostbridge,
> and the kernel has mapped it:
>
> [ 3.974309][ T1] pci-host-generic 4010000000.pcie: IO
> 0x003eff0000..0x003effffff -> 0x0000000000
>
> So I don't see why any accesses there should fault.
As requested above, do you have the PCI spec reference for
why the PIO region is supposed to do -1/discard in the parts
where the kernel hasn't mapped any devices? For classic PCI,
at least, the spec does not seem to mandate it.
thanks
-- PMM