2021-02-10 13:38:16

by syzbot

Subject: possible deadlock in start_this_handle (2)

Hello,

syzbot found the following issue on:

HEAD commit: 1e0d27fc Merge branch 'akpm' (patches from Andrew)
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15cbce90d00000
kernel config: https://syzkaller.appspot.com/x/.config?x=bd1f72220b2e57eb
dashboard link: https://syzkaller.appspot.com/bug?extid=bfdded10ab7dcd7507ae
userspace arch: i386

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

======================================================
WARNING: possible circular locking dependency detected
5.11.0-rc6-syzkaller #0 Not tainted
------------------------------------------------------
kswapd0/2246 is trying to acquire lock:
ffff888041a988e0 (jbd2_handle){++++}-{0:0}, at: start_this_handle+0xf81/0x1380 fs/jbd2/transaction.c:444

but task is already holding lock:
ffffffff8be892c0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30 mm/page_alloc.c:5195

which lock already depends on the new lock.


the existing dependency chain (in reverse order) is:

-> #2 (fs_reclaim){+.+.}-{0:0}:
__fs_reclaim_acquire mm/page_alloc.c:4326 [inline]
fs_reclaim_acquire+0x117/0x150 mm/page_alloc.c:4340
might_alloc include/linux/sched/mm.h:193 [inline]
slab_pre_alloc_hook mm/slab.h:493 [inline]
slab_alloc_node mm/slub.c:2817 [inline]
__kmalloc_node+0x5f/0x430 mm/slub.c:4015
kmalloc_node include/linux/slab.h:575 [inline]
kvmalloc_node+0x61/0xf0 mm/util.c:587
kvmalloc include/linux/mm.h:781 [inline]
ext4_xattr_inode_cache_find fs/ext4/xattr.c:1465 [inline]
ext4_xattr_inode_lookup_create fs/ext4/xattr.c:1508 [inline]
ext4_xattr_set_entry+0x1ce6/0x3780 fs/ext4/xattr.c:1649
ext4_xattr_ibody_set+0x78/0x2b0 fs/ext4/xattr.c:2224
ext4_xattr_set_handle+0x8f4/0x13e0 fs/ext4/xattr.c:2380
ext4_xattr_set+0x13a/0x340 fs/ext4/xattr.c:2493
ext4_xattr_user_set+0xbc/0x100 fs/ext4/xattr_user.c:40
__vfs_setxattr+0x10e/0x170 fs/xattr.c:177
__vfs_setxattr_noperm+0x11a/0x4c0 fs/xattr.c:208
__vfs_setxattr_locked+0x1bf/0x250 fs/xattr.c:266
vfs_setxattr+0x135/0x320 fs/xattr.c:291
setxattr+0x1ff/0x290 fs/xattr.c:553
path_setxattr+0x170/0x190 fs/xattr.c:572
__do_sys_setxattr fs/xattr.c:587 [inline]
__se_sys_setxattr fs/xattr.c:583 [inline]
__ia32_sys_setxattr+0xbc/0x150 fs/xattr.c:583
do_syscall_32_irqs_on arch/x86/entry/common.c:77 [inline]
__do_fast_syscall_32+0x56/0x80 arch/x86/entry/common.c:139
do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:164
entry_SYSENTER_compat_after_hwframe+0x4d/0x5c

-> #1 (&ei->xattr_sem){++++}-{3:3}:
down_read+0x95/0x440 kernel/locking/rwsem.c:1353
ext4_setattr+0x570/0x1fd0 fs/ext4/inode.c:5375
notify_change+0xb60/0x10a0 fs/attr.c:336
chown_common+0x4a9/0x550 fs/open.c:674
vfs_fchown fs/open.c:741 [inline]
vfs_fchown fs/open.c:733 [inline]
ksys_fchown+0x111/0x170 fs/open.c:752
__do_sys_fchown fs/open.c:760 [inline]
__se_sys_fchown fs/open.c:758 [inline]
__x64_sys_fchown+0x6f/0xb0 fs/open.c:758
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
entry_SYSCALL_64_after_hwframe+0x44/0xa9

-> #0 (jbd2_handle){++++}-{0:0}:
check_prev_add kernel/locking/lockdep.c:2868 [inline]
check_prevs_add kernel/locking/lockdep.c:2993 [inline]
validate_chain kernel/locking/lockdep.c:3608 [inline]
__lock_acquire+0x2b26/0x54f0 kernel/locking/lockdep.c:4832
lock_acquire kernel/locking/lockdep.c:5442 [inline]
lock_acquire+0x1a8/0x720 kernel/locking/lockdep.c:5407
start_this_handle+0xfb4/0x1380 fs/jbd2/transaction.c:446
jbd2__journal_start+0x399/0x930 fs/jbd2/transaction.c:503
__ext4_journal_start_sb+0x227/0x4a0 fs/ext4/ext4_jbd2.c:105
__ext4_journal_start fs/ext4/ext4_jbd2.h:320 [inline]
ext4_dirty_inode+0xbc/0x130 fs/ext4/inode.c:5951
__mark_inode_dirty+0x81f/0x1070 fs/fs-writeback.c:2262
mark_inode_dirty_sync include/linux/fs.h:2186 [inline]
iput.part.0+0x57/0x810 fs/inode.c:1676
iput+0x58/0x70 fs/inode.c:1669
dentry_unlink_inode+0x2b1/0x3d0 fs/dcache.c:374
__dentry_kill+0x3c0/0x640 fs/dcache.c:579
shrink_dentry_list+0x144/0x480 fs/dcache.c:1148
prune_dcache_sb+0xe7/0x140 fs/dcache.c:1229
super_cache_scan+0x336/0x590 fs/super.c:105
do_shrink_slab+0x3e4/0x9f0 mm/vmscan.c:511
shrink_slab+0x16f/0x5d0 mm/vmscan.c:672
shrink_node_memcgs mm/vmscan.c:2665 [inline]
shrink_node+0x8cc/0x1de0 mm/vmscan.c:2780
kswapd_shrink_node mm/vmscan.c:3523 [inline]
balance_pgdat+0x745/0x1270 mm/vmscan.c:3681
kswapd+0x5b1/0xdb0 mm/vmscan.c:3938
kthread+0x3b1/0x4a0 kernel/kthread.c:292
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296

other info that might help us debug this:

Chain exists of:
jbd2_handle --> &ei->xattr_sem --> fs_reclaim

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(&ei->xattr_sem);
lock(fs_reclaim);
lock(jbd2_handle);

*** DEADLOCK ***

3 locks held by kswapd0/2246:
#0: ffffffff8be892c0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30 mm/page_alloc.c:5195
#1: ffffffff8be507f0 (shrinker_rwsem){++++}-{3:3}, at: shrink_slab+0xc7/0x5d0 mm/vmscan.c:662
#2: ffff8880442660e0 (&type->s_umount_key#38){++++}-{3:3}, at: trylock_super fs/super.c:418 [inline]
#2: ffff8880442660e0 (&type->s_umount_key#38){++++}-{3:3}, at: super_cache_scan+0x6c/0x590 fs/super.c:80

stack backtrace:
CPU: 0 PID: 2246 Comm: kswapd0 Not tainted 5.11.0-rc6-syzkaller #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:79 [inline]
dump_stack+0x107/0x163 lib/dump_stack.c:120
check_noncircular+0x25f/0x2e0 kernel/locking/lockdep.c:2117
check_prev_add kernel/locking/lockdep.c:2868 [inline]
check_prevs_add kernel/locking/lockdep.c:2993 [inline]
validate_chain kernel/locking/lockdep.c:3608 [inline]
__lock_acquire+0x2b26/0x54f0 kernel/locking/lockdep.c:4832
lock_acquire kernel/locking/lockdep.c:5442 [inline]
lock_acquire+0x1a8/0x720 kernel/locking/lockdep.c:5407
start_this_handle+0xfb4/0x1380 fs/jbd2/transaction.c:446
jbd2__journal_start+0x399/0x930 fs/jbd2/transaction.c:503
__ext4_journal_start_sb+0x227/0x4a0 fs/ext4/ext4_jbd2.c:105
__ext4_journal_start fs/ext4/ext4_jbd2.h:320 [inline]
ext4_dirty_inode+0xbc/0x130 fs/ext4/inode.c:5951
__mark_inode_dirty+0x81f/0x1070 fs/fs-writeback.c:2262
mark_inode_dirty_sync include/linux/fs.h:2186 [inline]
iput.part.0+0x57/0x810 fs/inode.c:1676
iput+0x58/0x70 fs/inode.c:1669
dentry_unlink_inode+0x2b1/0x3d0 fs/dcache.c:374
__dentry_kill+0x3c0/0x640 fs/dcache.c:579
shrink_dentry_list+0x144/0x480 fs/dcache.c:1148
prune_dcache_sb+0xe7/0x140 fs/dcache.c:1229
super_cache_scan+0x336/0x590 fs/super.c:105
do_shrink_slab+0x3e4/0x9f0 mm/vmscan.c:511
shrink_slab+0x16f/0x5d0 mm/vmscan.c:672
shrink_node_memcgs mm/vmscan.c:2665 [inline]
shrink_node+0x8cc/0x1de0 mm/vmscan.c:2780
kswapd_shrink_node mm/vmscan.c:3523 [inline]
balance_pgdat+0x745/0x1270 mm/vmscan.c:3681
kswapd+0x5b1/0xdb0 mm/vmscan.c:3938
kthread+0x3b1/0x4a0 kernel/kthread.c:292
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:296


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.


2021-02-11 10:54:55

by Jan Kara

Subject: Re: possible deadlock in start_this_handle (2)

Hello,

added mm guys to CC.

On Wed 10-02-21 05:35:18, syzbot wrote:
> [...]
>
> the existing dependency chain (in reverse order) is:
>
> -> #2 (fs_reclaim){+.+.}-{0:0}:
> __fs_reclaim_acquire mm/page_alloc.c:4326 [inline]
> fs_reclaim_acquire+0x117/0x150 mm/page_alloc.c:4340
> might_alloc include/linux/sched/mm.h:193 [inline]
> slab_pre_alloc_hook mm/slab.h:493 [inline]
> slab_alloc_node mm/slub.c:2817 [inline]
> __kmalloc_node+0x5f/0x430 mm/slub.c:4015
> kmalloc_node include/linux/slab.h:575 [inline]
> kvmalloc_node+0x61/0xf0 mm/util.c:587
> kvmalloc include/linux/mm.h:781 [inline]
> ext4_xattr_inode_cache_find fs/ext4/xattr.c:1465 [inline]
> ext4_xattr_inode_lookup_create fs/ext4/xattr.c:1508 [inline]
> ext4_xattr_set_entry+0x1ce6/0x3780 fs/ext4/xattr.c:1649
> ext4_xattr_ibody_set+0x78/0x2b0 fs/ext4/xattr.c:2224
> ext4_xattr_set_handle+0x8f4/0x13e0 fs/ext4/xattr.c:2380
> ext4_xattr_set+0x13a/0x340 fs/ext4/xattr.c:2493
> ext4_xattr_user_set+0xbc/0x100 fs/ext4/xattr_user.c:40
> __vfs_setxattr+0x10e/0x170 fs/xattr.c:177
> __vfs_setxattr_noperm+0x11a/0x4c0 fs/xattr.c:208
> __vfs_setxattr_locked+0x1bf/0x250 fs/xattr.c:266
> vfs_setxattr+0x135/0x320 fs/xattr.c:291
> setxattr+0x1ff/0x290 fs/xattr.c:553
> path_setxattr+0x170/0x190 fs/xattr.c:572
> __do_sys_setxattr fs/xattr.c:587 [inline]
> __se_sys_setxattr fs/xattr.c:583 [inline]
> __ia32_sys_setxattr+0xbc/0x150 fs/xattr.c:583
> do_syscall_32_irqs_on arch/x86/entry/common.c:77 [inline]
> __do_fast_syscall_32+0x56/0x80 arch/x86/entry/common.c:139
> do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:164
> entry_SYSENTER_compat_after_hwframe+0x4d/0x5c

This stacktrace should never happen. ext4_xattr_set() starts a transaction,
which internally goes through start_this_handle(), and that calls:

handle->saved_alloc_context = memalloc_nofs_save();

We restore the allocation context only in stop_this_handle() when the handle
is stopped. With that in place, fs_reclaim_acquire() should remove __GFP_FS
from the gfp mask and therefore never call __fs_reclaim_acquire().

Now I have no idea why something here didn't work out. Given we don't have
a reproducer, it will probably be difficult to debug. I'd note that a
similar report happened about a year and a half ago (it got autoclosed), so
it may be something real, but it may also be just some HW glitch or
something like that.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2021-02-11 11:01:45

by Michal Hocko

Subject: Re: possible deadlock in start_this_handle (2)

On Thu 11-02-21 11:49:47, Jan Kara wrote:
> [...]
>
> This stacktrace should never happen. ext4_xattr_set() starts a transaction,
> which internally goes through start_this_handle(), and that calls:
>
> handle->saved_alloc_context = memalloc_nofs_save();
>
> We restore the allocation context only in stop_this_handle() when the handle
> is stopped. With that in place, fs_reclaim_acquire() should remove __GFP_FS
> from the gfp mask and therefore never call __fs_reclaim_acquire().
>
> Now I have no idea why something here didn't work out. Given we don't have
> a reproducer, it will probably be difficult to debug. I'd note that a
> similar report happened about a year and a half ago (it got autoclosed), so
> it may be something real, but it may also be just some HW glitch or
> something like that.

Is it possible this is just a lockdep false positive? Is it possible that
there is a pre-recorded lock dependency chain that happens outside of the
transaction and clashes with this one?

I do not remember any recent changes in how the scope API is handled,
except for the CMA scope API changes, but those should be pretty much
independent.
--
Michal Hocko
SUSE Labs

2021-02-11 11:26:46

by Dmitry Vyukov

Subject: Re: possible deadlock in start_this_handle (2)

On Thu, Feb 11, 2021 at 11:49 AM Jan Kara <[email protected]> wrote:
>
> [...]
>
> Now I have no idea why something here didn't work out. Given we don't have
> a reproducer, it will probably be difficult to debug. I'd note that a
> similar report happened about a year and a half ago (it got autoclosed), so
> it may be something real, but it may also be just some HW glitch or
> something like that.

A HW glitch is theoretically possible. But if we are considering such
causes, I would say kernel memory corruption is far more likely: we have
hundreds of known memory-corruption-capable bugs open. In most cases they
are caught by KASAN before doing silent damage, but KASAN can miss some
cases.

I see at least 4 existing bugs with a similar stack:
https://syzkaller.appspot.com/bug?extid=bfdded10ab7dcd7507ae
https://syzkaller.appspot.com/bug?extid=a7ab8df042baaf42ae3c
https://syzkaller.appspot.com/bug?id=c814a55a728493959328551c769ede4c8ff72aab
https://syzkaller.appspot.com/bug?id=426ad9adca053dafcd698f3a48ad5406dccc972b

All in all, I would not assume it's memory corruption. When we have had
bugs that actually caused silent memory corruption, they produced a spike
of random one-time crashes all over the kernel. This does not look like
that.

2021-02-11 11:35:28

by Dmitry Vyukov

Subject: Re: possible deadlock in start_this_handle (2)

On Thu, Feb 11, 2021 at 12:22 PM Dmitry Vyukov <[email protected]> wrote:
>
> [...]

I wonder if memalloc_nofs_save() (or any other manipulation of
current->flags) could have been invoked from interrupt context? I think
that could cause the failure mode we observe (extremely rare disappearing
flags). It may be useful to add a check for task context there.

2021-02-11 11:50:44

by Jan Kara

Subject: Re: possible deadlock in start_this_handle (2)

On Thu 11-02-21 12:22:39, Dmitry Vyukov wrote:
> [...]
>
> I see at least 4 existing bugs with a similar stack:
> https://syzkaller.appspot.com/bug?extid=bfdded10ab7dcd7507ae
> https://syzkaller.appspot.com/bug?extid=a7ab8df042baaf42ae3c
> https://syzkaller.appspot.com/bug?id=c814a55a728493959328551c769ede4c8ff72aab
> https://syzkaller.appspot.com/bug?id=426ad9adca053dafcd698f3a48ad5406dccc972b

The last one looks different and likely unrelated (I don't see the scoping
API used anywhere in that subsystem), but the others indeed look valid. So
I agree this seems to be some very hard-to-hit problem, and likely not just
random corruption.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2021-02-11 12:14:15

by Jan Kara

Subject: Re: possible deadlock in start_this_handle (2)

On Thu 11-02-21 12:28:48, Dmitry Vyukov wrote:
> [...]
> > https://syzkaller.appspot.com/bug?extid=bfdded10ab7dcd7507ae
> > https://syzkaller.appspot.com/bug?extid=a7ab8df042baaf42ae3c
> > https://syzkaller.appspot.com/bug?id=c814a55a728493959328551c769ede4c8ff72aab
> > https://syzkaller.appspot.com/bug?id=426ad9adca053dafcd698f3a48ad5406dccc972b
> >
> > All in all, I would not assume it's a memory corruption. When we had
> > bugs that actually caused silent memory corruption, that caused a
> > spike of random one-time crashes all over the kernel. This does not
> > look like it.
>
> I wonder if memalloc_nofs_save (or any other manipulation of
> current->flags) could have been invoked from interrupt context? I
> think it could cause the failure mode we observe (extremely rare
> disappearing flags). It may be useful to add a check for task context
> there.

That's an interesting idea. I'm not sure if anything does manipulate
current->flags from inside an interrupt (definitely memalloc_nofs_save()
doesn't seem to be) but I'd think that in a fully preemptible kernel,
scheduler could preempt the task inside memalloc_nofs_save() and the
current->flags manipulation could also clash with a manipulation of these
flags by the scheduler if there's any?

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2021-02-11 12:40:30

by Michal Hocko

Subject: Re: possible deadlock in start_this_handle (2)

On Thu 11-02-21 13:10:20, Jan Kara wrote:
> On Thu 11-02-21 12:28:48, Dmitry Vyukov wrote:
> >
> > I wonder if memalloc_nofs_save (or any other manipulation of
> > current->flags) could have been invoked from interrupt context? I
> > think it could cause the failure mode we observe (extremely rare
> > disappearing flags). It may be useful to add a check for task context
> > there.
>
> That's an interesting idea. I'm not sure if anything does manipulate
> current->flags from inside an interrupt (definitely memalloc_nofs_save()
> doesn't seem to be) but I'd think that in a fully preemptible kernel,
> scheduler could preempt the task inside memalloc_nofs_save() and the
> current->flags manipulation could also clash with a manipulation of these
> flags by the scheduler if there's any?

current->flags should be always manipulated from the user context. But
who knows maybe there is a bug and some interrupt handler is calling it.
This should be easy to catch no?

--
Michal Hocko
SUSE Labs

2021-02-11 13:04:10

by Matthew Wilcox

Subject: Re: possible deadlock in start_this_handle (2)

On Thu, Feb 11, 2021 at 01:34:48PM +0100, Michal Hocko wrote:
> On Thu 11-02-21 13:10:20, Jan Kara wrote:
> > On Thu 11-02-21 12:28:48, Dmitry Vyukov wrote:
> > >
> > > I wonder if memalloc_nofs_save (or any other manipulation of
> > > current->flags) could have been invoked from interrupt context? I
> > > think it could cause the failure mode we observe (extremely rare
> > > disappearing flags). It may be useful to add a check for task context
> > > there.
> >
> > That's an interesting idea. I'm not sure if anything does manipulate
> > current->flags from inside an interrupt (definitely memalloc_nofs_save()
> > doesn't seem to be) but I'd think that in a fully preemptible kernel,
> > scheduler could preempt the task inside memalloc_nofs_save() and the
> > current->flags manipulation could also clash with a manipulation of these
> > flags by the scheduler if there's any?
>
> current->flags should be always manipulated from the user context. But
> who knows maybe there is a bug and some interrupt handler is calling it.
> This should be easy to catch no?

Why would it matter if it were? We save the current value of the nofs
flag and then restore it. That would happen before the end of the
interrupt handler. So the interrupt isn't going to change the observed
value of the flag by the task which is interrupted.

2021-02-11 13:13:45

by Michal Hocko

Subject: Re: possible deadlock in start_this_handle (2)

On Thu 11-02-21 12:57:17, Matthew Wilcox wrote:
> On Thu, Feb 11, 2021 at 01:34:48PM +0100, Michal Hocko wrote:
> >
> > current->flags should be always manipulated from the user context. But
> > who knows maybe there is a bug and some interrupt handler is calling it.
> > This should be easy to catch no?
>
> Why would it matter if it were?

I was thinking about a clobbered state: updates to ->flags are not
atomic because they should never happen concurrently. So maybe
a racing interrupt could corrupt the flags state?
--
Michal Hocko
SUSE Labs

2021-02-11 13:22:23

by Dmitry Vyukov

Subject: Re: possible deadlock in start_this_handle (2)

On Thu, Feb 11, 2021 at 1:57 PM Matthew Wilcox <[email protected]> wrote:
> > > >
> > > > I wonder if memalloc_nofs_save (or any other manipulation of
> > > > current->flags) could have been invoked from interrupt context? I
> > > > think it could cause the failure mode we observe (extremely rare
> > > > disappearing flags). It may be useful to add a check for task context
> > > > there.
> > >
> > > That's an interesting idea. I'm not sure if anything does manipulate
> > > current->flags from inside an interrupt (definitely memalloc_nofs_save()
> > > doesn't seem to be) but I'd think that in a fully preemptible kernel,
> > > scheduler could preempt the task inside memalloc_nofs_save() and the
> > > current->flags manipulation could also clash with a manipulation of these
> > > flags by the scheduler if there's any?
> >
> > current->flags should be always manipulated from the user context. But
> > who knows maybe there is a bug and some interrupt handler is calling it.
> > This should be easy to catch no?
>
> Why would it matter if it were? We save the current value of the nofs
> flag and then restore it. That would happen before the end of the
> interrupt handler. So the interrupt isn't going to change the observed
> value of the flag by the task which is interrupted.

Good question.
I just think that turning some of these assumptions into runtime checks
is useful, as it will allow us to reduce the infinite space of
possibilities of what is called from what context. Maybe also check
that PF_MEMALLOC_NOFS is indeed set when we enter memalloc_nofs_restore().

2021-02-11 13:31:18

by Matthew Wilcox

Subject: Re: possible deadlock in start_this_handle (2)

On Thu, Feb 11, 2021 at 02:07:03PM +0100, Michal Hocko wrote:
> On Thu 11-02-21 12:57:17, Matthew Wilcox wrote:
> > > current->flags should be always manipulated from the user context. But
> > > who knows maybe there is a bug and some interrupt handler is calling it.
> > > This should be easy to catch no?
> >
> > Why would it matter if it were?
>
> I was thinking about a clobbered state because updates to ->flags are
> not atomic because this shouldn't ever be updated concurrently. So maybe
> a racing interrupt could corrupt the flags state?

I don't think that's possible. Same-CPU races between interrupt and
process context are simpler because the CPU always observes its own writes
in order and the interrupt handler completes "between" two instructions.

eg a load-store CPU will do:

        load 0 from address A
        or 8 with result
        store 8 to A

Two CPUs can do:

        CPU 0                           CPU 1
        load 0 from A
                                        load 0 from A
        or 8 with 0
                                        or 4 with 0
        store 8 to A
                                        store 4 to A

and the store of 8 is lost.

        process                         interrupt
        load 0 from A
                                        load 0 from A
                                        or 4 with 0
                                        store 4 to A
        or 8 with 0
        store 8 to A

so the store of 4 would be lost.

but we expect the interrupt handler to restore it. so we actually have this:

        process                         interrupt
        load 0 from A
                                        load 0 from A
                                        or 4 with 0
                                        store 4 to A
                                        load 4 from A
                                        clear 4 from 4
                                        store 0 to A
        or 8 with 0
        store 8 to A


If we have a leak where someone forgets to restore the nofs flag, that
might cause this. We could check that the allocation mask bits are clear at
syscall exit (scheduling with these flags set is obviously ok).

2021-02-11 14:25:42

by Michal Hocko

Subject: Re: possible deadlock in start_this_handle (2)

On Thu 11-02-21 13:25:33, Matthew Wilcox wrote:
> On Thu, Feb 11, 2021 at 02:07:03PM +0100, Michal Hocko wrote:
> > On Thu 11-02-21 12:57:17, Matthew Wilcox wrote:
> > > > current->flags should be always manipulated from the user context. But
> > > > who knows maybe there is a bug and some interrupt handler is calling it.
> > > > This should be easy to catch no?
> > >
> > > Why would it matter if it were?
> >
> > I was thinking about a clobbered state because updates to ->flags are
> > not atomic because this shouldn't ever be updated concurrently. So maybe
> > a racing interrupt could corrupt the flags state?
>
> I don't think that's possible. Same-CPU races between interrupt and
> process context are simpler because the CPU always observes its own writes
> in order and the interrupt handler completes "between" two instructions.

I have to confess I haven't really thought the scenario through. My idea
was simply to add a check for an irq context into the ->flags setting
routine, because this should never be done in the first place. Not only
for scope gfp flags but for any other PF_ flags, IIRC.

--
Michal Hocko
SUSE Labs

2021-02-11 14:30:41

by Matthew Wilcox

Subject: Re: possible deadlock in start_this_handle (2)

On Thu, Feb 11, 2021 at 03:20:41PM +0100, Michal Hocko wrote:
> On Thu 11-02-21 13:25:33, Matthew Wilcox wrote:
> > On Thu, Feb 11, 2021 at 02:07:03PM +0100, Michal Hocko wrote:
> > > On Thu 11-02-21 12:57:17, Matthew Wilcox wrote:
> > > > > current->flags should be always manipulated from the user context. But
> > > > > who knows maybe there is a bug and some interrupt handler is calling it.
> > > > > This should be easy to catch no?
> > > >
> > > > Why would it matter if it were?
> > >
> > > I was thinking about a clobbered state because updates to ->flags are
> > > not atomic because this shouldn't ever be updated concurrently. So maybe
> > > a racing interrupt could corrupt the flags state?
> >
> > I don't think that's possible. Same-CPU races between interrupt and
> > process context are simpler because the CPU always observes its own writes
> > in order and the interrupt handler completes "between" two instructions.
>
> I have to confess I haven't really thought the scenario through. My idea
> was to simply add a simple check for an irq context into ->flags setting
> routine because this should never be done in the first place. Not only
> for scope gfp flags but any other PF_ flags IIRC.

That's not automatically clear to me. There are plenty of places
where an interrupt borrows the context of the task that it happens to
have interrupted. Specifically, interrupts should be using GFP_ATOMIC
anyway, so this doesn't really make a lot of sense, but I don't think
it's necessarily wrong for an interrupt to call a function that says
"Definitely don't make GFP_FS allocations between these two points".

2021-02-11 16:49:53

by Michal Hocko

Subject: Re: possible deadlock in start_this_handle (2)

On Thu 11-02-21 14:26:30, Matthew Wilcox wrote:
> On Thu, Feb 11, 2021 at 03:20:41PM +0100, Michal Hocko wrote:
> > On Thu 11-02-21 13:25:33, Matthew Wilcox wrote:
> > > On Thu, Feb 11, 2021 at 02:07:03PM +0100, Michal Hocko wrote:
> > > > On Thu 11-02-21 12:57:17, Matthew Wilcox wrote:
> > > > > > current->flags should be always manipulated from the user context. But
> > > > > > who knows maybe there is a bug and some interrupt handler is calling it.
> > > > > > This should be easy to catch no?
> > > > >
> > > > > Why would it matter if it were?
> > > >
> > > > I was thinking about a clobbered state because updates to ->flags are
> > > > not atomic because this shouldn't ever be updated concurrently. So maybe
> > > > a racing interrupt could corrupt the flags state?
> > >
> > > I don't think that's possible. Same-CPU races between interrupt and
> > > process context are simpler because the CPU always observes its own writes
> > > in order and the interrupt handler completes "between" two instructions.
> >
> > I have to confess I haven't really thought the scenario through. My idea
> > was to simply add a simple check for an irq context into ->flags setting
> > routine because this should never be done in the first place. Not only
> > for scope gfp flags but any other PF_ flags IIRC.
>
> That's not automatically clear to me. There are plenty of places
> where an interrupt borrows the context of the task that it happens to
> have interrupted. Specifically, interrupts should be using GFP_ATOMIC
> anyway, so this doesn't really make a lot of sense, but I don't think
> it's necessarily wrong for an interrupt to call a function that says
> "Definitely don't make GFP_FS allocations between these two points".

Not sure I got your point. IRQ context never does reclaim, so anything
outside of NOWAIT/ATOMIC is pointless. But you might be referring to
future code where GFP_FS might have a meaning outside of the reclaim
context?

Anyway, if we are to allow modifying PF_ flags from an interrupt context
then I believe we should make that code IRQ aware at least. I do not
feel really comfortable about async modifications when this is stated to
be safe to do in a non-atomic way.

But I suspect we have drifted away from the original issue. I thought
that a simple check would help us narrow down this particular case, and
somebody messing up from the IRQ context didn't sound completely off.

--
Michal Hocko
SUSE Labs

2021-02-12 12:25:54

by Matthew Wilcox

Subject: Re: possible deadlock in start_this_handle (2)

On Fri, Feb 12, 2021 at 08:18:11PM +0900, Tetsuo Handa wrote:
> On 2021/02/12 1:41, Michal Hocko wrote:
> > But I suspect we have drifted away from the original issue. I thought
> > that a simple check would help us narrow down this particular case and
> > somebody messing up from the IRQ context didn't sound like a completely
> > off.
> >
>
> From my experience at https://lkml.kernel.org/r/[email protected] ,
> I think we can replace direct PF_* manipulation with macros which do not receive "struct task_struct *" argument.
> Since TASK_PFA_TEST()/TASK_PFA_SET()/TASK_PFA_CLEAR() are for manipulating PFA_* flags on a remote thread, we can
> define similar ones for manipulating PF_* flags on current thread. Then, auditing dangerous users becomes easier.

No, nobody is manipulating another task's GFP flags.

2021-02-12 12:32:09

by Michal Hocko

Subject: Re: possible deadlock in start_this_handle (2)

On Fri 12-02-21 12:22:07, Matthew Wilcox wrote:
> On Fri, Feb 12, 2021 at 08:18:11PM +0900, Tetsuo Handa wrote:
> > On 2021/02/12 1:41, Michal Hocko wrote:
> > > But I suspect we have drifted away from the original issue. I thought
> > > that a simple check would help us narrow down this particular case and
> > > somebody messing up from the IRQ context didn't sound like a completely
> > > off.
> > >
> >
> > From my experience at https://lkml.kernel.org/r/[email protected] ,
> > I think we can replace direct PF_* manipulation with macros which do not receive "struct task_struct *" argument.
> > Since TASK_PFA_TEST()/TASK_PFA_SET()/TASK_PFA_CLEAR() are for manipulating PFA_* flags on a remote thread, we can
> > define similar ones for manipulating PF_* flags on current thread. Then, auditing dangerous users becomes easier.
>
> No, nobody is manipulating another task's GFP flags.

Agreed. And nobody should be manipulating PF flags on remote tasks
either.

--
Michal Hocko
SUSE Labs

2021-02-12 13:03:57

by Tetsuo Handa

Subject: Re: possible deadlock in start_this_handle (2)

On 2021/02/12 21:30, Michal Hocko wrote:
> On Fri 12-02-21 12:22:07, Matthew Wilcox wrote:
>> On Fri, Feb 12, 2021 at 08:18:11PM +0900, Tetsuo Handa wrote:
>>> On 2021/02/12 1:41, Michal Hocko wrote:
>>>> But I suspect we have drifted away from the original issue. I thought
>>>> that a simple check would help us narrow down this particular case and
>>>> somebody messing up from the IRQ context didn't sound like a completely
>>>> off.
>>>>
>>>
>>> From my experience at https://lkml.kernel.org/r/[email protected] ,
>>> I think we can replace direct PF_* manipulation with macros which do not receive "struct task_struct *" argument.
>>> Since TASK_PFA_TEST()/TASK_PFA_SET()/TASK_PFA_CLEAR() are for manipulating PFA_* flags on a remote thread, we can
>>> define similar ones for manipulating PF_* flags on current thread. Then, auditing dangerous users becomes easier.
>>
>> No, nobody is manipulating another task's GFP flags.
>
> Agreed. And nobody should be manipulating PF flags on remote tasks
> either.
>

No, you are misunderstanding. The bug report above is an example of manipulating PF flags on remote tasks.
You say "nobody should", but the reality is "there indeed was". There might be other unnoticed ones. The point
of this proposal is to make it possible to find such unnoticed users who manipulate PF flags on remote tasks.

2021-02-12 13:15:04

by Michal Hocko

Subject: Re: possible deadlock in start_this_handle (2)

On Fri 12-02-21 21:58:15, Tetsuo Handa wrote:
> On 2021/02/12 21:30, Michal Hocko wrote:
> > On Fri 12-02-21 12:22:07, Matthew Wilcox wrote:
> >> On Fri, Feb 12, 2021 at 08:18:11PM +0900, Tetsuo Handa wrote:
> >>> On 2021/02/12 1:41, Michal Hocko wrote:
> >>>> But I suspect we have drifted away from the original issue. I thought
> >>>> that a simple check would help us narrow down this particular case and
> >>>> somebody messing up from the IRQ context didn't sound like a completely
> >>>> off.
> >>>>
> >>>
> >>> From my experience at https://lkml.kernel.org/r/[email protected] ,
> >>> I think we can replace direct PF_* manipulation with macros which do not receive "struct task_struct *" argument.
> >>> Since TASK_PFA_TEST()/TASK_PFA_SET()/TASK_PFA_CLEAR() are for manipulating PFA_* flags on a remote thread, we can
> >>> define similar ones for manipulating PF_* flags on current thread. Then, auditing dangerous users becomes easier.
> >>
> >> No, nobody is manipulating another task's GFP flags.
> >
> > Agreed. And nobody should be manipulating PF flags on remote tasks
> > either.
> >
>
> No. You are misunderstanding. The bug report above is an example of manipulating PF flags on remote tasks.

Could you be more specific? I do not remember any theory that
somebody is manipulating flags on a remote task. A very vague theory was
that an interrupt context might be doing that on the _current_ task,
but even that is not based on any real evidence. It is pure
speculation.
--
Michal Hocko
SUSE Labs

2021-02-12 15:44:46

by Michal Hocko

Subject: Re: possible deadlock in start_this_handle (2)

On Fri 12-02-21 21:58:15, Tetsuo Handa wrote:
> On 2021/02/12 21:30, Michal Hocko wrote:
> > On Fri 12-02-21 12:22:07, Matthew Wilcox wrote:
> >> On Fri, Feb 12, 2021 at 08:18:11PM +0900, Tetsuo Handa wrote:
> >>> On 2021/02/12 1:41, Michal Hocko wrote:
> >>>> But I suspect we have drifted away from the original issue. I thought
> >>>> that a simple check would help us narrow down this particular case and
> >>>> somebody messing up from the IRQ context didn't sound like a completely
> >>>> off.
> >>>>
> >>>
> >>> From my experience at https://lkml.kernel.org/r/[email protected] ,
> >>> I think we can replace direct PF_* manipulation with macros which do not receive "struct task_struct *" argument.
> >>> Since TASK_PFA_TEST()/TASK_PFA_SET()/TASK_PFA_CLEAR() are for manipulating PFA_* flags on a remote thread, we can
> >>> define similar ones for manipulating PF_* flags on current thread. Then, auditing dangerous users becomes easier.
> >>
> >> No, nobody is manipulating another task's GFP flags.
> >
> > Agreed. And nobody should be manipulating PF flags on remote tasks
> > either.
> >
>
> No. You are misunderstanding. The bug report above is an example of
> manipulating PF flags on remote tasks.

The bug report you are referring to is ancient, and the cpuset code
hasn't touched task->flags for a long time. I haven't checked exactly,
but it has been years since regular and atomic flags were separated,
unless I misremember.

> You say "nobody should", but the reality is "there indeed was". There
> might be unnoticed others. The point of this proposal is to make it
> possible to "find such unnoticed users who are manipulating PF flags
> on remote tasks".

I am really confused about what you are proposing here, TBH, and referring
to an ancient bug doesn't really help. task->flags is _explicitly_
documented to be used only for _current_. Is it possible that somebody
writes buggy code? Sure. Should we build a whole infrastructure around
that to catch such broken code? I am not really sure. One bug 6 years ago
doesn't sound like a good reason for that.

--
Michal Hocko
SUSE Labs

2021-02-13 11:00:08

by Dmitry Vyukov

Subject: Re: possible deadlock in start_this_handle (2)

On Fri, Feb 12, 2021 at 4:43 PM Michal Hocko <[email protected]> wrote:
>
> On Fri 12-02-21 21:58:15, Tetsuo Handa wrote:
> > On 2021/02/12 21:30, Michal Hocko wrote:
> > > On Fri 12-02-21 12:22:07, Matthew Wilcox wrote:
> > >> On Fri, Feb 12, 2021 at 08:18:11PM +0900, Tetsuo Handa wrote:
> > >>> On 2021/02/12 1:41, Michal Hocko wrote:
> > >>>> But I suspect we have drifted away from the original issue. I thought
> > >>>> that a simple check would help us narrow down this particular case and
> > >>>> somebody messing up from the IRQ context didn't sound like a completely
> > >>>> off.
> > >>>>
> > >>>
> > >>> From my experience at https://lkml.kernel.org/r/[email protected] ,
> > >>> I think we can replace direct PF_* manipulation with macros which do not receive "struct task_struct *" argument.
> > >>> Since TASK_PFA_TEST()/TASK_PFA_SET()/TASK_PFA_CLEAR() are for manipulating PFA_* flags on a remote thread, we can
> > >>> define similar ones for manipulating PF_* flags on current thread. Then, auditing dangerous users becomes easier.
> > >>
> > >> No, nobody is manipulating another task's GFP flags.
> > >
> > > Agreed. And nobody should be manipulating PF flags on remote tasks
> > > either.
> > >
> >
> > No. You are misunderstanding. The bug report above is an example of
> > manipulating PF flags on remote tasks.
>
> The bug report you are referring to is ancient. And the cpuset code
> doesn't touch task->flags for a long time. I haven't checked exactly but
> it is years since regular and atomic flags have been separated unless I
> misremember.
>
> > You say "nobody should", but the reality is "there indeed was". There
> > might be unnoticed others. The point of this proposal is to make it
> > possible to "find such unnoticed users who are manipulating PF flags
> > on remote tasks".
>
> I am really confused what you are proposing here TBH and referring to an
> ancient bug doesn't really help. task->flags are _explicitly_ documented
> to be only used for _current_. Is it possible that somebody writes a
> buggy code? Sure, should we build a whole infrastructure around that to
> catch such a broken code? I am not really sure. One bug 6 years ago
> doesn't sound like a good reason for that.

Another similar one was just reported:

https://syzkaller.appspot.com/bug?extid=1b2c6989ec12e467d65c

WARNING: possible circular locking dependency detected
5.11.0-rc7-syzkaller #0 Not tainted
------------------------------------------------------
kswapd0/2232 is trying to acquire lock:
ffff88801f552650 (sb_internal){.+.+}-{0:0}, at: evict+0x2ed/0x6b0 fs/inode.c:577

but task is already holding lock:
ffffffff8be89240 (fs_reclaim){+.+.}-{0:0}, at:
__fs_reclaim_acquire+0x0/0x30 mm/page_alloc.c:5195

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #3 (fs_reclaim){+.+.}-{0:0}:
__fs_reclaim_acquire mm/page_alloc.c:4326 [inline]
fs_reclaim_acquire+0x117/0x150 mm/page_alloc.c:4340
might_alloc include/linux/sched/mm.h:193 [inline]
slab_pre_alloc_hook mm/slab.h:493 [inline]
slab_alloc_node mm/slab.c:3221 [inline]
kmem_cache_alloc_node_trace+0x48/0x520 mm/slab.c:3596
__do_kmalloc_node mm/slab.c:3618 [inline]
__kmalloc_node+0x38/0x60 mm/slab.c:3626
kmalloc_node include/linux/slab.h:575 [inline]
kvmalloc_node+0x61/0xf0 mm/util.c:587
kvmalloc include/linux/mm.h:781 [inline]
ext4_xattr_inode_cache_find fs/ext4/xattr.c:1465 [inline]
ext4_xattr_inode_lookup_create fs/ext4/xattr.c:1508 [inline]
ext4_xattr_set_entry+0x1ce6/0x3780 fs/ext4/xattr.c:1649
ext4_xattr_ibody_set+0x78/0x2b0 fs/ext4/xattr.c:2224
ext4_xattr_set_handle+0x8f4/0x13e0 fs/ext4/xattr.c:2380
ext4_xattr_set+0x13a/0x340 fs/ext4/xattr.c:2493
__vfs_setxattr+0x10e/0x170 fs/xattr.c:177
__vfs_setxattr_noperm+0x11a/0x4c0 fs/xattr.c:208
__vfs_setxattr_locked+0x1bf/0x250 fs/xattr.c:266
vfs_setxattr+0x135/0x320 fs/xattr.c:291
setxattr+0x1ff/0x290 fs/xattr.c:553
path_setxattr+0x170/0x190 fs/xattr.c:572
__do_sys_setxattr fs/xattr.c:587 [inline]
__se_sys_setxattr fs/xattr.c:583 [inline]
__x64_sys_setxattr+0xc0/0x160 fs/xattr.c:583
do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46

2021-02-15 14:09:07

by Tetsuo Handa

Subject: Re: possible deadlock in start_this_handle (2)

On 2021/02/15 21:45, Jan Kara wrote:
> On Sat 13-02-21 23:26:37, Tetsuo Handa wrote:
>> Excuse me, but it seems to me that nothing prevents
>> ext4_xattr_set_handle() from reaching ext4_xattr_inode_lookup_create()
>> without memalloc_nofs_save() when hitting ext4_get_nojournal() path.
>> Will you explain when ext4_get_nojournal() path is executed?
>
> That's a good question but sadly I don't think that's it.
> ext4_get_nojournal() is called when the filesystem is created without a
> journal. In that case we also don't acquire jbd2_handle lockdep map. In the
> syzbot report we can see:

Since syzbot can test filesystem images, it might have tested filesystem
images created both with and without a journal within this boot.

>
> kswapd0/2246 is trying to acquire lock:
> ffff888041a988e0 (jbd2_handle){++++}-{0:0}, at: start_this_handle+0xf81/0x1380 fs/jbd2/transaction.c:444
>
> but task is already holding lock:
> ffffffff8be892c0 (fs_reclaim){+.+.}-{0:0}, at: __fs_reclaim_acquire+0x0/0x30 mm/page_alloc.c:5195
>
> So this filesystem has very clearly been created with a journal. Also the
> journal lockdep tracking machinery uses:

While the locks held by kswapd0/2246 are fs_reclaim, shrinker_rwsem,
&type->s_umount_key#38 and jbd2_handle, isn't the dependency lockdep
considers problematic this one:
Chain exists of:
jbd2_handle --> &ei->xattr_sem --> fs_reclaim

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(fs_reclaim);
lock(&ei->xattr_sem);
lock(fs_reclaim);
lock(jbd2_handle);

where CPU0 is kswapd0/2246 and CPU1 is the ext4_get_nojournal() path?
If someone has taken jbd2_handle and &ei->xattr_sem in this order, isn't
this dependency real?

>
> rwsem_acquire_read(&journal->j_trans_commit_map, 0, 0, _THIS_IP_);
>
> so a lockdep key is per-filesystem. Thus it is not possible that lockdep
> would combine lock dependencies from two different filesystems.
>
> But I guess we could narrow the search for this problem by adding WARN_ONs
> to ext4_xattr_set_handle() and ext4_xattr_inode_lookup_create() like:
>
> WARN_ON(ext4_handle_valid(handle) && !(current->flags & PF_MEMALLOC_NOFS));
>
> It would narrow down a place in which PF_MEMALLOC_NOFS flag isn't set
> properly... At least that seems like the most plausible way forward to me.

You can use CONFIG_DEBUG_AID_FOR_SYZBOT for adding such WARN_ONs on linux-next.