2024-03-11 16:18:43

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH v2] nfs: fix panic when nfs4_ff_layout_prepare_ds() fails

On Mon, 2024-03-11 at 11:11 -0400, Josef Bacik wrote:
> We've been seeing the following panic in production
>
> BUG: kernel NULL pointer dereference, address: 0000000000000065
> PGD 2f485f067 P4D 2f485f067 PUD 2cc5d8067 PMD 0
> RIP: 0010:ff_layout_cancel_io+0x3a/0x90 [nfs_layout_flexfiles]
> Call Trace:
>  <TASK>
>  ? __die+0x78/0xc0
>  ? page_fault_oops+0x286/0x380
>  ? __rpc_execute+0x2c3/0x470 [sunrpc]
>  ? rpc_new_task+0x42/0x1c0 [sunrpc]
>  ? exc_page_fault+0x5d/0x110
>  ? asm_exc_page_fault+0x22/0x30
>  ? ff_layout_free_layoutreturn+0x110/0x110 [nfs_layout_flexfiles]
>  ? ff_layout_cancel_io+0x3a/0x90 [nfs_layout_flexfiles]
>  ? ff_layout_cancel_io+0x6f/0x90 [nfs_layout_flexfiles]
>  pnfs_mark_matching_lsegs_return+0x1b0/0x360 [nfsv4]
>  pnfs_error_mark_layout_for_return+0x9e/0x110 [nfsv4]
>  ? ff_layout_send_layouterror+0x50/0x160 [nfs_layout_flexfiles]
>  nfs4_ff_layout_prepare_ds+0x11f/0x290 [nfs_layout_flexfiles]
>  ff_layout_pg_init_write+0xf0/0x1f0 [nfs_layout_flexfiles]
>  __nfs_pageio_add_request+0x154/0x6c0 [nfs]
>  nfs_pageio_add_request+0x26b/0x380 [nfs]
>  nfs_do_writepage+0x111/0x1e0 [nfs]
>  nfs_writepages_callback+0xf/0x30 [nfs]
>  write_cache_pages+0x17f/0x380
>  ? nfs_pageio_init_write+0x50/0x50 [nfs]
>  ? nfs_writepages+0x6d/0x210 [nfs]
>  ? nfs_writepages+0x6d/0x210 [nfs]
>  nfs_writepages+0x125/0x210 [nfs]
>  do_writepages+0x67/0x220
>  ? generic_perform_write+0x14b/0x210
>  filemap_fdatawrite_wbc+0x5b/0x80
>  file_write_and_wait_range+0x6d/0xc0
>  nfs_file_fsync+0x81/0x170 [nfs]
>  ? nfs_file_mmap+0x60/0x60 [nfs]
>  __x64_sys_fsync+0x53/0x90
>  do_syscall_64+0x3d/0x90
>  entry_SYSCALL_64_after_hwframe+0x46/0xb0
>
> Inspecting the core with drgn I was able to pull this
>
>   >>> prog.crashed_thread().stack_trace()[0]
>   #0 at 0xffffffffa079657a (ff_layout_cancel_io+0x3a/0x84) in
> ff_layout_cancel_io at fs/nfs/flexfilelayout/flexfilelayout.c:2021:27
>   >>> prog.crashed_thread().stack_trace()[0]['idx']
>   (u32)1
>   >>>
> prog.crashed_thread().stack_trace()[0]['flseg'].mirror_array[1].mirro
> r_ds
>   (struct nfs4_ff_layout_ds *)0xffffffffffffffed
>
> This is clear from the stack trace, we call
> nfs4_ff_layout_prepare_ds()
> which could error out initializing the mirror_ds, and then we go to
> clean it all up and our check is only for if (!mirror->mirror_ds). 
> This
> is inconsistent with the rest of the users of mirror_ds, which have
>
>   if (IS_ERR_OR_NULL(mirror_ds))
>
> to keep from tripping over this exact scenario.  Fix this up in
> ff_layout_cancel_io() to make sure we don't panic when we get an
> error.
> I also spot checked all the other instances of checking mirror_ds and
> we
> appear to be doing the correct checks everywhere, only
> unconditionally
> dereferencing mirror_ds when we know it would be valid.
>
> Signed-off-by: Josef Bacik <[email protected]>
> ---
> v1->v2:
> - My bad, I missed the formatting of the drgn output and it looked
> mangled.
>
>  fs/nfs/flexfilelayout/flexfilelayout.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/nfs/flexfilelayout/flexfilelayout.c
> b/fs/nfs/flexfilelayout/flexfilelayout.c
> index ef817a0475ff..3e724cb7ef01 100644
> --- a/fs/nfs/flexfilelayout/flexfilelayout.c
> +++ b/fs/nfs/flexfilelayout/flexfilelayout.c
> @@ -2016,7 +2016,7 @@ static void ff_layout_cancel_io(struct
> pnfs_layout_segment *lseg)
>   for (idx = 0; idx < flseg->mirror_array_cnt; idx++) {
>   mirror = flseg->mirror_array[idx];
>   mirror_ds = mirror->mirror_ds;
> - if (!mirror_ds)
> + if (IS_ERR_OR_NULL(mirror_ds))
>   continue;
>   ds = mirror->mirror_ds->ds;
>   if (!ds)

That is truly weird... I have the above correct in the original patch,
which is still in the Hammerspace repo. However the port to upstream
appears to have introduced the bug.

Thanks so much for catching this, Josef!

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]