2022-08-17 09:35:07

by J. R. Okajima

[permalink] [raw]
Subject: NFS, two d_delete() calls in nfs_unlink()

Hello NeilBrown and Trond Myklebust,

By the commin in v6.0-rc1,
3c59366c207e 2022-08-08 NFS: don't unhash dentry during unlink/rename
nfs_unlink() stopped calling __d_drop().
And it MAY cause two d_delete() calls. If it happens, the second call
leads to NULL pointer access because d_inode is already cleared.

Here is the detail.

nfs_unlink()
+ nfs_safe_remove()
+ NFS_PROTO(dir)->remove() <-- returns ENOENT
+ nfs_dentry_handle_enoent()
+ if (simple_positive()) d_delete() <-- 1st call and d_inode is cleared
+ nfs_dentry_remove_handle_error()
+ if (ENOENT) d_delete() <-- second call and NULL d_inode is accessed

How about adding a condition for d_delete() call in
nfs_dentry_remove_handle_error(), such like simple_positive()?


J. R. Okajima


2022-08-18 03:07:58

by NeilBrown

[permalink] [raw]
Subject: Re: NFS, two d_delete() calls in nfs_unlink()

On Wed, 17 Aug 2022, J. R. Okajima wrote:
> Hello NeilBrown and Trond Myklebust,
>
> By the commin in v6.0-rc1,
> 3c59366c207e 2022-08-08 NFS: don't unhash dentry during unlink/rename
> nfs_unlink() stopped calling __d_drop().
> And it MAY cause two d_delete() calls. If it happens, the second call
> leads to NULL pointer access because d_inode is already cleared.
>
> Here is the detail.
>
> nfs_unlink()
> + nfs_safe_remove()
> + NFS_PROTO(dir)->remove() <-- returns ENOENT
> + nfs_dentry_handle_enoent()
> + if (simple_positive()) d_delete() <-- 1st call and d_inode is cleared
> + nfs_dentry_remove_handle_error()
> + if (ENOENT) d_delete() <-- second call and NULL d_inode is accessed
>
> How about adding a condition for d_delete() call in
> nfs_dentry_remove_handle_error(), such like simple_positive()?
>

Thanks for the report.
This possibility of calling d_delete() twice has been present
since 9019fb391de0 in v5.16.
It is possible that my patch made it more likely or more problematic,
though I cannot see why.
I posted a patch which Trond has applied to his linux-next branch.
It is in linux-next as commit 9a31abb1c009c40

How did you discover this bug, and why do you think my patch
caused it?

Thanks,
NeilBrown

2022-08-18 03:47:25

by J. R. Okajima

[permalink] [raw]
Subject: Re: NFS, two d_delete() calls in nfs_unlink()

"NeilBrown":
> Thanks for the report.
> This possibility of calling d_delete() twice has been present
> since 9019fb391de0 in v5.16.

I don't think 9019fb391de0 is a problem.
Before v6.0-rc1, the target dentry was unhashed by __d_drop() call in
nfs_unlink(), and nfs_dentry_handle_enoent() skipped calling d_delete()
by simple_positive(). d_delete() was called only once via
nfs_dentry_remove_handle_error().

In v6.0-rc1, the dentry is not unhashed and nfs_dentry_handle_enoent()
doesn't skip calling d_delete().


> How did you discover this bug, and why do you think my patch
> caused it?

I met this problem during a stress test aiming a filesystem I am
developing.
And I think unhashing causes nfs_dentry_handle_enoent() to call
d_delete().


J. R. Okajima

2022-08-18 18:45:49

by Olga Kornievskaia

[permalink] [raw]
Subject: Re: NFS, two d_delete() calls in nfs_unlink()

On Wed, Aug 17, 2022 at 11:47 PM <[email protected]> wrote:
>
> "NeilBrown":
> > Thanks for the report.
> > This possibility of calling d_delete() twice has been present
> > since 9019fb391de0 in v5.16.
>
> I don't think 9019fb391de0 is a problem.
> Before v6.0-rc1, the target dentry was unhashed by __d_drop() call in
> nfs_unlink(), and nfs_dentry_handle_enoent() skipped calling d_delete()
> by simple_positive(). d_delete() was called only once via
> nfs_dentry_remove_handle_error().
>
> In v6.0-rc1, the dentry is not unhashed and nfs_dentry_handle_enoent()
> doesn't skip calling d_delete().
>
>
> > How did you discover this bug, and why do you think my patch
> > caused it?
>
> I met this problem during a stress test aiming a filesystem I am
> developing.
> And I think unhashing causes nfs_dentry_handle_enoent() to call
> d_delete().
>
>

Hello folks,

Just to add my 2cents. Netapp testing is hitting this consistently. It
seems like it's the same thing. Client's kernel is based of Trond's
tree with the whatever is going into 6.0
commit 3fa5cbdc44de190f2c5605ba7db015ae0d26f668 (HEAD ->
08122022.1550-linux-next, tag: nfs-for-5.20-1, origin/linux-nex

I've then reverted Neil's patch and things went back to normal. Again
I'm not implying that the patch causes it but just stating this is hit
every time compared to before.

Aug 18 13:37:23 localhost kernel:
==================================================================
Aug 18 13:37:23 localhost kernel: BUG: KASAN: null-ptr-deref in
_raw_spin_lock+0x78/0xe0
Aug 18 13:37:23 localhost kernel: Write of size 4 at addr
0000000000000088 by task dt/12703
Aug 18 13:37:23 localhost kernel:
Aug 18 13:37:23 localhost kernel: CPU: 5 PID: 12703 Comm: dt Tainted:
G S 5.19.0-rc6+ #18
Aug 18 13:37:23 localhost kernel: Hardware name: Supermicro
SYS-5028R-WR/X10SRW-F, BIOS 3.2 11/22/2019
Aug 18 13:37:23 localhost kernel: Call Trace:
Aug 18 13:37:23 localhost kernel: <TASK>
Aug 18 13:37:23 localhost kernel: dump_stack_lvl+0x33/0x42
Aug 18 13:37:23 localhost kernel: ? _raw_spin_lock+0x78/0xe0
Aug 18 13:37:23 localhost kernel: print_report.cold.17+0x493/0x6b3
Aug 18 13:37:23 localhost kernel: ? start_report.constprop.11+0xdc/0x180
Aug 18 13:37:23 localhost kernel: ? _raw_spin_lock+0x78/0xe0
Aug 18 13:37:23 localhost kernel: kasan_report+0x83/0x100
Aug 18 13:37:23 localhost kernel: ? _raw_spin_lock+0x78/0xe0
Aug 18 13:37:23 localhost kernel: kasan_check_range+0x183/0x1e0
Aug 18 13:37:23 localhost kernel: _raw_spin_lock+0x78/0xe0
Aug 18 13:37:23 localhost kernel: BUG: kernel NULL pointer
dereference, address: 0000000000000088
Aug 18 13:37:23 localhost kernel: ? _raw_spin_lock_irqsave+0xf0/0xf0
Aug 18 13:37:23 localhost kernel: #PF: supervisor write access in kernel mode
Aug 18 13:37:23 localhost kernel: ? _raw_spin_lock_irqsave+0xf0/0xf0
Aug 18 13:37:23 localhost kernel: #PF: error_code(0x0002) - not-present page
Aug 18 13:37:23 localhost kernel: ? _atomic_dec_and_lock+0x1a/0x60
Aug 18 13:37:23 localhost kernel: PGD 8000000202253067 P4D 8000000202253067
Aug 18 13:37:23 localhost kernel: d_delete+0x2c/0x90
Aug 18 13:37:23 localhost kernel: PUD 202b67067 PMD 0
Aug 18 13:37:23 localhost kernel: nfs_dentry_remove_handle_error+0x21/0x60 [nfs]
Aug 18 13:37:23 localhost kernel: Oops: 0002 [#1] PREEMPT SMP KASAN PTI
Aug 18 13:37:23 localhost kernel: nfs_unlink+0x1c4/0x4a0 [nfs]
Aug 18 13:37:23 localhost kernel: CPU: 25 PID: 12182 Comm: dt Tainted:
G S 5.19.0-rc6+ #18
Aug 18 13:37:23 localhost kernel: vfs_unlink+0x1af/0x410
Aug 18 13:37:23 localhost kernel: Hardware name: Supermicro
SYS-5028R-WR/X10SRW-F, BIOS 3.2 11/22/2019
Aug 18 13:37:23 localhost kernel: ? __lookup_hash+0x1f/0xd0
Aug 18 13:37:23 localhost kernel: RIP: 0010:_raw_spin_lock+0x90/0xe0
Aug 18 13:37:23 localhost kernel: do_unlinkat+0x2c9/0x410
Aug 18 13:37:23 localhost kernel: Code: 00 00 48 89 df c7 44 24 20 00
00 00 00 e8 18 8f 42 ff be 04 00 00 00 48 8d 7c 24 20 e8 09 8f 42 ff
ba 01 00 00 00 8b 44 24 20 <f0> 0f b1 13 75 2a 48 b8 00 00 00 00 00 fc
ff df 48 c7 44 05 00 00
Aug 18 13:37:23 localhost kernel: ? __x64_sys_rmdir+0x30/0x30
Aug 18 13:37:23 localhost kernel: RSP: 0018:ffff888462f5fc88 EFLAGS: 00010297
Aug 18 13:37:23 localhost kernel: ? xfd_validate_state+0x34/0xb0
Aug 18 13:37:23 localhost kernel: RAX: 0000000000000000 RBX:
0000000000000088 RCX: ffffffff8cabd117
Aug 18 13:37:23 localhost kernel: ? __virt_addr_valid+0xd8/0x170
Aug 18 13:37:23 localhost kernel: RDX: 0000000000000001 RSI:
0000000000000004 RDI: ffff888462f5fca8
Aug 18 13:37:23 localhost kernel: ? __check_object_size+0x1b3/0x380
Aug 18 13:37:23 localhost kernel: RBP: 1ffff1108c5ebf91 R08:
0000000000000004 R09: ffffed108c5ebf95
Aug 18 13:37:23 localhost kernel: ? strncpy_from_user+0x178/0x1b0
Aug 18 13:37:23 localhost kernel: R10: 0000000000000003 R11:
ffffed108c5ebf95 R12: ffff8881803fac58
Aug 18 13:37:23 localhost kernel: ? getname_flags+0x10d/0x2a0
Aug 18 13:37:23 localhost kernel: R13: ffff88812b20e6b8 R14:
ffff88812b20e6b8 R15: ffff8881803fac30
Aug 18 13:37:23 localhost kernel: __x64_sys_unlink+0x2c/0x30
Aug 18 13:37:23 localhost kernel: FS: 00007fdb273f3700(0000)
GS:ffff889bff280000(0000) knlGS:0000000000000000
Aug 18 13:37:23 localhost kernel: do_syscall_64+0x3a/0x80
Aug 18 13:37:23 localhost kernel: CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Aug 18 13:37:23 localhost kernel: entry_SYSCALL_64_after_hwframe+0x46/0xb0
Aug 18 13:37:23 localhost kernel: CR2: 0000000000000088 CR3:
00000001ab930002 CR4: 00000000003706e0
Aug 18 13:37:23 localhost kernel: RIP: 0033:0x7f279543960b
Aug 18 13:37:23 localhost kernel: DR0: 0000000000000000 DR1:
0000000000000000 DR2: 0000000000000000
Aug 18 13:37:23 localhost kernel: Code: f0 ff ff 73 01 c3 48 8b 0d 7a
58 38 00 f7 d8 64 89 01 48 83 c8 ff c3 0f 1f 84 00 00 00 00 00 f3 0f
1e fa b8 57 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4d 58
38 00 f7 d8 64 89 01 48
Aug 18 13:37:23 localhost kernel: DR3: 0000000000000000 DR6:
00000000fffe0ff0 DR7: 0000000000000400
Aug 18 13:37:23 localhost kernel: RSP: 002b:00007f2794ef9df8 EFLAGS: 00000202
Aug 18 13:37:23 localhost kernel: Call Trace:
Aug 18 13:37:23 localhost kernel: ORIG_RAX: 0000000000000057
Aug 18 13:37:23 localhost kernel: RAX: ffffffffffffffda RBX:
0000000001add1d8 RCX: 00007f279543960b
Aug 18 13:37:23 localhost kernel: <TASK>
Aug 18 13:37:23 localhost kernel: RDX: 000000000048433d RSI:
000000000048433d RDI: 00007f272c000d30
Aug 18 13:37:23 localhost kernel: ? _raw_spin_lock_irqsave+0xf0/0xf0
Aug 18 13:37:23 localhost kernel: RBP: 0000000001ad6890 R08:
0000000000000000 R09: 00007f2794ef9ba6
Aug 18 13:37:23 localhost kernel: ? _raw_spin_lock_irqsave+0xf0/0xf0
Aug 18 13:37:23 localhost kernel: R10: 0000000000000000 R11:
0000000000000202 R12: 000000000000001a
Aug 18 13:37:23 localhost kernel: ? _atomic_dec_and_lock+0x1a/0x60
Aug 18 13:37:23 localhost kernel: R13: 000000000000001a R14:
00007f272c000b90 R15: 00007f272c000d30
Aug 18 13:37:23 localhost kernel: d_delete+0x2c/0x90
Aug 18 13:37:23 localhost kernel: </TASK>
Aug 18 13:37:23 localhost kernel: nfs_dentry_remove_handle_error+0x21/0x60 [nfs]
Aug 18 13:37:23 localhost kernel:
==================================================================
Aug 18 13:37:23 localhost kernel: nfs_unlink+0x1c4/0x4a0 [nfs]
Aug 18 13:37:23 localhost kernel: vfs_unlink+0x1af/0x410

2022-08-18 23:08:37

by NeilBrown

[permalink] [raw]
Subject: Re: NFS, two d_delete() calls in nfs_unlink()

On Thu, 18 Aug 2022, [email protected] wrote:
> "NeilBrown":
> > Thanks for the report.
> > This possibility of calling d_delete() twice has been present
> > since 9019fb391de0 in v5.16.
>
> I don't think 9019fb391de0 is a problem.
> Before v6.0-rc1, the target dentry was unhashed by __d_drop() call in
> nfs_unlink(), and nfs_dentry_handle_enoent() skipped calling d_delete()
> by simple_positive(). d_delete() was called only once via
> nfs_dentry_remove_handle_error().

Ahhh - simple_positive() checks d_unhashed() - I didn't connect that.

So before my recent patch we needed the second call to d_delete() in
nfs_unlink() because the first wasn't effective. However the second
call in nfs_rmdir() was still a problem.

So if the current fix (9a31abb1c009) gets ported back before the patch
that removed unhash (3c59366c207e) then it won't do the right thing
for nfs_unlink(). As it has a Fixes: tag, that is likely.

It would be better to protect the d_delete() with
d_really_is_positive().

Trond: can we drop that patch and replace it, or should I add the
d_really_is_positive() check with a new patch?


>
> In v6.0-rc1, the dentry is not unhashed and nfs_dentry_handle_enoent()
> doesn't skip calling d_delete().
>
>
> > How did you discover this bug, and why do you think my patch
> > caused it?
>
> I met this problem during a stress test aiming a filesystem I am
> developing.
> And I think unhashing causes nfs_dentry_handle_enoent() to call
> d_delete().

Thanks a lot for helping me understand!

Thanks,
NeilBrown

2022-08-18 23:30:09

by Trond Myklebust

[permalink] [raw]
Subject: Re: NFS, two d_delete() calls in nfs_unlink()

On Fri, 2022-08-19 at 09:05 +1000, NeilBrown wrote:
> On Thu, 18 Aug 2022, [email protected] wrote:
> > "NeilBrown":
> > > Thanks for the report.
> > > This possibility of calling d_delete() twice has been present
> > > since  9019fb391de0 in v5.16.
> >
> > I don't think 9019fb391de0 is a problem.
> > Before v6.0-rc1, the target dentry was unhashed by __d_drop() call
> > in
> > nfs_unlink(), and nfs_dentry_handle_enoent() skipped calling
> > d_delete()
> > by simple_positive(). d_delete() was called only once via
> > nfs_dentry_remove_handle_error().
>
> Ahhh - simple_positive() checks d_unhashed() - I didn't connect that.
>
> So before my recent patch we needed the second call to d_delete() in
> nfs_unlink() because the first wasn't effective.  However the second
> call in nfs_rmdir() was still a problem.
>
> So if the current fix (9a31abb1c009) gets ported back before the
> patch
> that removed unhash (3c59366c207e) then it won't do the right thing
> for nfs_unlink().  As it has a Fixes: tag, that is likely.
>
> It would be better to protect the d_delete() with
> d_really_is_positive().
>
> Trond: can we drop that patch and replace it, or should I add the
> d_really_is_positive() check with a new patch?

It is the very last patch in the linux-next branch, so replacing it is
not a huge problem.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]