2022-10-18 09:51:46

by Miklos Szeredi

Subject: Re: [Syzkaller] INFO: task hung in fuse_lookup with v6.0 kernel in guest

On Mon, Oct 17, 2022 at 11:17 AM Pengfei Xu <[email protected]> wrote:
>
> Hi Miklos,
>
> Greeting!
>
> Platform: Tiger lake CPU platform.
>
> We found 1 "task hung in fuse_lookup" issue by syzkaller with v6.0 mainline
> kernel in guest.
>
> Bisected and found the bad commit:
> "
> commit: 62dd1fc8cc6b22e3e568be46ebdb817e66f5d6a5
> fuse: move fget() to fuse_get_tree()
> "
>
> Reproduced code generated by syzkaller, binary, bisect log and all the dmesg
> info are in attached package.

Thanks for the report.

I tried out the reproducer, and the deadlock can be triggered.
Unfortunately killing the deadlocked processes is not enough, but it
still should be possible to recover with "echo 1 >
/sys/fs/fuse/connections/$FUSE_DEV/abort". In my tests this works,
so I'm not sure there's anything to fix here.
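
A minimal sketch of the recovery step above, written in Python rather than
shell so it can be tried against a scratch directory first. The helper name
abort_fuse_connection and its "base" parameter are hypothetical, introduced
only for illustration. It assumes the FUSE control filesystem is mounted at
/sys/fs/fuse/connections and that the connection directory name ($FUSE_DEV
in the command above) is already known; writing to the real file normally
needs sufficient privileges.

```python
import os

# Assumed mount point of the FUSE control filesystem (fusectl).
FUSE_CONNECTIONS = "/sys/fs/fuse/connections"


def abort_fuse_connection(conn_id, base=FUSE_CONNECTIONS):
    """Write "1" to <base>/<conn_id>/abort and return the path written.

    Hypothetical helper, intended to mirror:
        echo 1 > /sys/fs/fuse/connections/$FUSE_DEV/abort
    """
    abort_path = os.path.join(base, str(conn_id), "abort")
    with open(abort_path, "w") as f:
        f.write("1\n")
    return abort_path
```

Passing a different "base" (for example a temporary directory) lets the
helper be checked without touching a live FUSE connection.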

Is there a real life situation where this occurs, or is this just
triggered with fuzzing?

I'm wondering why syzbot didn't try aborting using the "abort" file in
sysfs, AFAICS it does know this trick.

Thanks,
Miklos


2022-10-19 03:17:17

by Pengfei Xu

Subject: Re: [Syzkaller] INFO: task hung in fuse_lookup with v6.0 kernel in guest

Hi Miklos,

On 2022-10-18 at 11:23:17 +0200, Miklos Szeredi wrote:
> On Mon, Oct 17, 2022 at 11:17 AM Pengfei Xu <[email protected]> wrote:
> >
> > Hi Miklos,
> >
> > Greeting!
> >
> > Platform: Tiger lake CPU platform.
> >
> > We found 1 "task hung in fuse_lookup" issue by syzkaller with v6.0 mainline
> > kernel in guest.
> >
> > Bisected and found the bad commit:
> > "
> > commit: 62dd1fc8cc6b22e3e568be46ebdb817e66f5d6a5
> > fuse: move fget() to fuse_get_tree()
> > "
> >
> > Reproduced code generated by syzkaller, binary, bisect log and all the dmesg
> > info are in attached package.
>
> Thanks for the report.
>
> I tried out the reproducer, and the deadlock can be triggered.
> Unfortunately killing the deadlocked processes is not enough, but it
> still should be possible to recover with "echo 1 >
> /sys/fs/fuse/connections/$FUSE_DEV/abort". In my tests this works,
> so I'm not sure there's anything to fix here.
Thanks for the solution: "echo 1 > /sys/fs/fuse/connections/$FUSE_DEV/abort"

>
> Is there a real life situation where this occurs, or is this just
> triggered with fuzzing?
It can only be reproduced with the repro.c generated by syzkaller, and we
have not encountered this problem in real life yet.
So it seems to be a low-priority issue, and perhaps not even worth fixing?

>
> I'm wondering why syzbot didn't try aborting using the "abort" file in
> sysfs, AFAICS it does know this trick.
Yes, maybe syzbot could be improved to do that? :)

Thanks!
BR.

>
> Thanks,
> Miklos
>