So looking at the decode, as usual the noise generated by KASAN isn't
being very helpful, but it does look like at least one of the reports
(I picked 5.2 because I don't care about 4.19 etc) is because
kernfs_root(kn) is NULL in kernfs_add_one().
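To make that failure mode concrete, here is a small user-space model of a root lookup that goes through the parent node. It is not kernel code: `struct mock_node`, `mock_root_of()` and `mock_add_one()` are hypothetical names, and the idea that a torn-down parent simply leaves a NULL root pointer is a simplifying assumption for illustration only.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real kernfs structures. */
struct mock_root {
    int id;
};

struct mock_node {
    struct mock_node *parent;   /* NULL for the top-level directory */
    struct mock_root *root;     /* only meaningful on directory nodes */
};

/* Look the root up through the parent directory when there is one. */
static struct mock_root *mock_root_of(const struct mock_node *kn)
{
    if (kn->parent)
        kn = kn->parent;
    return kn->root;
}

/*
 * Model of an "add" step that refuses to continue when the root
 * lookup comes back NULL instead of dereferencing it.
 */
static int mock_add_one(const struct mock_node *kn)
{
    const struct mock_root *root = mock_root_of(kn);

    if (!root)
        return -ENOENT;
    return 0;
}
```

In this model, a child whose parent still carries a root pointer is added fine, while a child whose parent has lost it gets -ENOENT rather than a NULL dereference.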
Looking at the reports, every single one seems to have a call chain
that comes from vhci_write() -> vhci_get_user() ->
vhci_create_device() -> __vhci_create_device() -> hci_register_dev()
-> device_add() -> kobject_add().
(In this case, "every single one" is by looking at the last 10 reports
sorted by date, it wasn't exhaustive).
The way it got into 'write()' can be a bit varied (splice, write, whatever).
That makes me think it's bluetooth that is the problem, but it might
be an effect of how syzbot groups the reports too, of course.
Might the device have been added at the same time that the last
previous device was removed, so that the parent was deleted as the new
device was added? I dunno. The repro seems to be a repeated "open
/dev/vhci, write two random bytes to it".
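The shape of that repro can be sketched in a few lines of C. This is only an illustration of the "open, write two bytes, close, repeat" loop described above, not the syzbot reproducer itself (that one is linked in the report); `poke_device()` and its parameters are hypothetical names, and the byte values are left to the caller.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Open `path`, write two bytes, close; repeat `rounds` times.
 * Returns the number of rounds in which both bytes were written,
 * or -1 as soon as the device cannot be opened.
 */
static int poke_device(const char *path, uint8_t b0, uint8_t b1, int rounds)
{
    int ok = 0;

    assert(path != NULL);

    for (int i = 0; i < rounds; i++) {
        const uint8_t buf[2] = { b0, b1 };
        int fd = open(path, O_RDWR);

        if (fd < 0)
            return -1;
        if (write(fd, buf, sizeof(buf)) == (ssize_t)sizeof(buf))
            ok++;
        /* Closing the fd is what tears the previous device down again. */
        close(fd);
    }
    return ok;
}
```

On a throwaway test machine this would be called as `poke_device("/dev/vhci", b0, b1, n)` with enough privileges to open the device. Running several such loops in parallel is one way to probe the add/remove overlap speculated about above.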
Or might it be some "it happens after you've added enough devices that
something overflows" issue?
Adding bluetooth people to the cc.
Linus
On Mon, Nov 18, 2019 at 10:27 PM syzbot
<[email protected]> wrote:
>
> syzbot has bisected this bug to:
>
> commit 726e41097920a73e4c7c33385dcc0debb1281e18
> Author: Benjamin Herrenschmidt <[email protected]>
> Date: Tue Jul 10 00:29:10 2018 +0000
>
> drivers: core: Remove glue dirs from sysfs earlier
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=168e1012e00000
> start commit: 5e335542 Merge branch 'for-linus' of git://git.kernel.org/..
> git tree: upstream
> final crash: https://syzkaller.appspot.com/x/report.txt?x=158e1012e00000
> console output: https://syzkaller.appspot.com/x/log.txt?x=118e1012e00000
> kernel config: https://syzkaller.appspot.com/x/.config?x=9917ff4b798e1a1e
> dashboard link: https://syzkaller.appspot.com/bug?extid=db1637662f412ac0d556
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=10a66c11400000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=1346c771400000
>
> Reported-by: [email protected]
> Fixes: 726e41097920 ("drivers: core: Remove glue dirs from sysfs earlier")
>
> For information about bisection process see: https://goo.gl/tpsmEJ#bisection
Hi Linus,
> So looking at the decode, as usual the noise generated by KASAN isn't
> being very helpful, but it does look like at least one of the reports
> (I picked 5.2 because I don't care about 4.19 etc) is because
> 'kernfs_root(kn) is NULL in kernfs_add_one().
>
> Looking at the reports, every single one seems to have a call chain
> that comes from vhci_write() -> vhci_get_user() ->
> vhci_create_device() -> __vhci_create_device() -> hci_register_dev()
> -> device_add() -> kobject_add().
>
> (In this case, "every single one" is by looking at the last 10 reports
> sorted by date, it wasn't exhaustive).
>
> The way it got into 'write()' can be a bit varied (splice, write, whatever).
>
> That makes me think it's bluetooth that is the problem, but it might
> be an effect of how syzbot groups the reports too, of course.
>
> Might the device have been added at the same time that the last
> previous device was removed, so that the parent was deleted as the new
> device was aded? I dunno. The repro seem to be a repeated "open
> /dev/vhci, write two random bytes to it"
>
> Or might it be some "it happens after you've added enough devices that
> something overflows" issue?
A long time ago there used to be an issue with quick device remove / device add operations, but that was fixed. I am just too fuzzy on the details since it has been a while.
We also haven’t touched our sysfs integration in a while, and Bluetooth support is so old that this might have been bit-rotting.
I need to run the reproducer myself and see if something stands out that I can spot.
Regards
Marcel
On Tue, 2019-11-19 at 11:00 -0800, Linus Torvalds wrote:
> So looking at the decode, as usual the noise generated by KASAN isn't
> being very helpful, but it does look like at least one of the reports
> (I picked 5.2 because I don't care about 4.19 etc) is because
> 'kernfs_root(kn) is NULL in kernfs_add_one().
>
> Looking at the reports, every single one seems to have a call chain
> that comes from vhci_write() -> vhci_get_user() ->
> vhci_create_device() -> __vhci_create_device() -> hci_register_dev()
> -> device_add() -> kobject_add().
>
> (In this case, "every single one" is by looking at the last 10
> reports
> sorted by date, it wasn't exhaustive).
>
> The way it got into 'write()' can be a bit varied (splice, write,
> whatever).
>
> That makes me think it's bluetooth that is the problem, but it might
> be an effect of how syzbot groups the reports too, of course.
>
> Might the device have been added at the same time that the last
> previous device was removed, so that the parent was deleted as the
> new
> device was aded? I dunno. The repro seem to be a repeated "open
> /dev/vhci, write two random bytes to it"
>
> Or might it be some "it happens after you've added enough devices
> that
> something overflows" issue?
>
> Adding bluetooth people to the cc.
Could this be what was fixed by:
ac43432cb1f5c2950408534987e57c2071e24d8f
("driver core: Fix use-after-free and double free on glue directory")
Which went into 5.3 afaik ?
Cheers,
Ben.
On Tue, Nov 19, 2019 at 8:04 PM Benjamin Herrenschmidt
<[email protected]> wrote:
>
> Could this be what was fixed by:
>
> ac43432cb1f5c2950408534987e57c2071e24d8f
> ("driver core: Fix use-after-free and double free on glue directory")
>
> Which went into 5.3 afaik ?
Hmm. Sounds very possible. It matches the commit syzbot bisected to,
and looking at the reports, I can't find anything that is 5.3 or
later.
I did find a 5.3.0-rc2+ report, but that's still consistent with that
commit: it got merged just before 5.3-rc4.
So I think you're right.
I forget what the magic email rule was to report that something is
fixed to syzbot..
Linus
On Wed, Nov 20, 2019 at 5:54 PM Linus Torvalds
<[email protected]> wrote:
>
> On Tue, Nov 19, 2019 at 8:04 PM Benjamin Herrenschmidt
> <[email protected]> wrote:
> >
> > Could this be what was fixed by:
> >
> > ac43432cb1f5c2950408534987e57c2071e24d8f
> > ("driver core: Fix use-after-free and double free on glue directory")
> >
> > Which went into 5.3 afaik ?
>
> Hmm. Sounds very possible. It matches the commit syzbot bisected to,
> and looking at the reports, the I can't find anything that is 5.3 or
> later.
>
> I did find a 5.3.0-rc2+ report, but that's still consistent with that
> commit: it got merged just before 5.3-rc4.
>
> So I think you're right.
>
> I forget what the magic email rule was to report that something is
> fixed to syzbot..
Hi Linus,
This would be:
#syz fix: driver core: Fix use-after-free and double free on glue directory
FTR, the cheat sheet is referenced in every bug report:
> syzbot will keep track of this bug report. See:
> https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with syzbot.
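For anyone scripting such replies, the command line shown above is just a fixed prefix followed by the commit title. A minimal sketch in C, assuming only the `#syz fix: <title>` form quoted in this thread; `format_syz_fix()` is a hypothetical helper, not part of any syzbot tooling:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the one-line syzbot command for marking a bug as fixed.
 * Returns 0 on success, -1 if `buf` is too small for the result.
 */
static int format_syz_fix(char *buf, size_t size, const char *title)
{
    int n;

    assert(buf != NULL && title != NULL);

    n = snprintf(buf, size, "#syz fix: %s", title);
    if (n < 0 || (size_t)n >= size)
        return -1;
    return 0;
}
```

Passing the title of ac43432cb1f5 yields exactly the line given above. The title should be copied from the commit itself rather than retyped.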