2023-12-15 14:43:47

by Alexander Potapenko

Subject: Re: [syzbot] [crypto?] KMSAN: uninit-value in __crc32c_le_base (3)

On Thu, Dec 14, 2023 at 10:39 PM 'Dave Chinner' via syzkaller-bugs
<[email protected]> wrote:
>
> On Thu, Dec 14, 2023 at 03:55:00PM +0100, Alexander Potapenko wrote:
> > On Wed, Dec 13, 2023 at 10:58 PM 'Dave Chinner' via syzkaller-bugs
> > <[email protected]> wrote:
> > >
> > > On Thu, Dec 14, 2023 at 08:16:07AM +1100, Dave Chinner wrote:
> > > > [cc [email protected] because that's where all questions
> > > > about XFS stuff should be directed, not to random individual
> > > > developers. ]
> > > >
> > > > On Wed, Dec 13, 2023 at 11:49:50AM +0100, Alexander Potapenko wrote:
> > > > > Hi Christoph, Dave,
> > > > >
> > > > > The repro provided by Xingwei indeed works.
> > >
> > > Can you please test the patch below?
> >
> > It fixed the problem for me, feel free to add:
> >
> > Tested-by: Alexander Potapenko <[email protected]>
>
> Thanks.
>
> > As for the time needed to detect the bug, note that kmemcheck was
> > never used together with syzkaller, so it couldn't have the chance to
> > find it.
> >
> > KMSAN found this bug in April
> > (https://syzkaller.appspot.com/bug?extid=a6d6b8fffa294705dbd8),
>
> KMSAN has been used for quite a long time with syzbot, however,
> and it's supposed to find these problems, too. Yet it's only been
> finding this for 6 months?
>
> > only
> > half a year after we started mounting XFS images on syzbot.
>
> Really? Where did you get that from? syzbot has been exercising XFS
> filesystems since 2017 - the bug reports to the XFS list go back at
> least that far.

You are right, syzbot used to mount XFS way before 2022.
On the other hand, last fall there were some major changes to the way
syz_mount_image() works, so I am attributing the newly detected bugs
to those changes.
Unfortunately we don't have much insight into the reasons behind syzkaller
being able to trigger one bug or another: once a bug is found for the
first time, the likelihood of triggering it again increases, but finding
it initially might be tricky.

I don't have a good sense of how trivial the repro at
https://gist.github.com/xrivendell7/c7bb6ddde87a892818ed1ce206a429c4 is,
but overall we are not drilling deep enough into XFS.
https://storage.googleapis.com/syzbot-assets/8547e3dd1cca/ci-upstream-kmsan-gce-c7402612.html
(ouch, 230MB!) shows very limited coverage.


2023-12-15 22:08:08

by Dave Chinner

Subject: Re: [syzbot] [crypto?] KMSAN: uninit-value in __crc32c_le_base (3)

On Fri, Dec 15, 2023 at 03:41:49PM +0100, Alexander Potapenko wrote:
> On Thu, Dec 14, 2023 at 10:39 PM 'Dave Chinner' via syzkaller-bugs
> <[email protected]> wrote:
> >
> > On Thu, Dec 14, 2023 at 03:55:00PM +0100, Alexander Potapenko wrote:
> > > On Wed, Dec 13, 2023 at 10:58 PM 'Dave Chinner' via syzkaller-bugs
> > > <[email protected]> wrote:
> > > >
> > > > On Thu, Dec 14, 2023 at 08:16:07AM +1100, Dave Chinner wrote:
> > > > > [cc [email protected] because that's where all questions
> > > > > about XFS stuff should be directed, not to random individual
> > > > > developers. ]
> > > > >
> > > > > On Wed, Dec 13, 2023 at 11:49:50AM +0100, Alexander Potapenko wrote:
> > > > > > Hi Christoph, Dave,
> > > > > >
> > > > > > The repro provided by Xingwei indeed works.
> > > >
> > > > Can you please test the patch below?
> > >
> > > It fixed the problem for me, feel free to add:
> > >
> > > Tested-by: Alexander Potapenko <[email protected]>
> >
> > Thanks.
> >
> > > As for the time needed to detect the bug, note that kmemcheck was
> > > never used together with syzkaller, so it couldn't have the chance to
> > > find it.
> > >
> > > KMSAN found this bug in April
> > > (https://syzkaller.appspot.com/bug?extid=a6d6b8fffa294705dbd8),
> >
> > KMSAN has been used for quite a long time with syzbot, however,
> > and it's supposed to find these problems, too. Yet it's only been
> > finding this for 6 months?
> >
> > > only
> > > half a year after we started mounting XFS images on syzbot.
> >
> > Really? Where did you get that from? syzbot has been exercising XFS
> > filesystems since 2017 - the bug reports to the XFS list go back at
> > least that far.
>
> You are right, syzbot used to mount XFS way before 2022.
> On the other hand, last fall there were some major changes to the way
> syz_mount_image() works, so I am attributing the newly detected bugs
> to those changes.

Oh, so that's when syzbot first turned on XFS V5 format testing?

Or was that done in April, when this issue was first reported?

> Unfortunately we don't have much insight into the reasons behind syzkaller
> being able to trigger one bug or another: once a bug is found for the
> first time, the likelihood of triggering it again increases, but finding
> it initially might be tricky.
>
> I don't have a good sense of how trivial the repro at
> https://gist.github.com/xrivendell7/c7bb6ddde87a892818ed1ce206a429c4 is,

I just looked at it - all it does is create a new file. It's
effectively "mount; touch", which is exactly what I said earlier
in the thread should reproduce this issue every single time.

> but overall we are not drilling deep enough into XFS.
> https://storage.googleapis.com/syzbot-assets/8547e3dd1cca/ci-upstream-kmsan-gce-c7402612.html
> (ouch, 230MB!) shows very limited coverage.

*sigh*

Did you think to look at the coverage results to check why the
numbers for XFS, ext4 and btrfs are all at 1%? Why didn't the low
number make you dig a bit deeper to see if the number was real or
whether there was a test execution problem during measurement?

I just spent a minute doing exactly that, and the answer is
pretty obvious. Both ext4 and XFS had a mount attempt
rejected at mount option parsing, and btrfs rejected a device scan
ioctl. That's it. Nothing else was exercised in those three
filesystems.

Put simply: the filesystems *weren't tested during coverage
measurement*.

If you are going to do coverage testing, please measure coverage
over *thousands* of different tests performed on a single filesystem
type. It needs to be thousands, because syzbot tests are so shallow
and narrow that actually covering any significant amount of
filesystem code is quite difficult....

-Dave.
--
Dave Chinner
[email protected]