2023-10-09 20:31:17

by Jeff Layton

[permalink] [raw]
Subject: lockd KASAN pop in svc_exit_thread with nfsd-next branch

I hit this KASAN pop today while rebooting a machine. This kernel
contains the contents of Chuck's nfsd-next branch, so it should have
all of Neil's latest thread handling patches.

I was able to reproduce this by just running "systemctl stop nfs-
server". This does not occur on stock v6.6-rc5:

Oct 09 15:26:08 f38-nfsd kernel: ==================================================================
Oct 09 15:26:08 f38-nfsd kernel: BUG: KASAN: slab-use-after-free in svc_exit_thread+0x10c/0x150 [sunrpc]
Oct 09 15:26:08 f38-nfsd kernel: Write of size 8 at addr ffff888112147bb8 by task lockd/973
Oct 09 15:26:08 f38-nfsd kernel:
Oct 09 15:26:08 f38-nfsd kernel: CPU: 5 PID: 973 Comm: lockd Not tainted 6.6.0-rc4-g2615e14a89be #8
Oct 09 15:26:08 f38-nfsd kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014
Oct 09 15:26:08 f38-nfsd kernel: Call Trace:
Oct 09 15:26:08 f38-nfsd kernel: <TASK>
Oct 09 15:26:08 f38-nfsd kernel: dump_stack_lvl+0x43/0x60
Oct 09 15:26:08 f38-nfsd kernel: print_report+0xc5/0x620
Oct 09 15:26:08 f38-nfsd kernel: ? __pfx__raw_spin_lock_irqsave+0x10/0x10
Oct 09 15:26:08 f38-nfsd kernel: ? __virt_addr_valid+0xbe/0x130
Oct 09 15:26:08 f38-nfsd kernel: ? svc_exit_thread+0x10c/0x150 [sunrpc]
Oct 09 15:26:08 f38-nfsd kernel: kasan_report+0xac/0xe0
Oct 09 15:26:08 f38-nfsd kernel: ? svc_exit_thread+0x10c/0x150 [sunrpc]
Oct 09 15:26:08 f38-nfsd kernel: kasan_check_range+0xf0/0x1a0
Oct 09 15:26:08 f38-nfsd kernel: svc_exit_thread+0x10c/0x150 [sunrpc]
Oct 09 15:26:08 f38-nfsd kernel: lockd+0x15d/0x1a0 [lockd]
Oct 09 15:26:08 f38-nfsd kernel: ? __pfx_lockd+0x10/0x10 [lockd]
Oct 09 15:26:08 f38-nfsd kernel: kthread+0x181/0x1c0
Oct 09 15:26:08 f38-nfsd kernel: ? __pfx_kthread+0x10/0x10
Oct 09 15:26:08 f38-nfsd kernel: ret_from_fork+0x2d/0x50
Oct 09 15:26:08 f38-nfsd kernel: ? __pfx_kthread+0x10/0x10
Oct 09 15:26:08 f38-nfsd kernel: ret_from_fork_asm+0x1b/0x30
Oct 09 15:26:08 f38-nfsd kernel: </TASK>
Oct 09 15:26:08 f38-nfsd kernel:
Oct 09 15:26:08 f38-nfsd kernel: Allocated by task 835:
Oct 09 15:26:08 f38-nfsd kernel: kasan_save_stack+0x1c/0x40
Oct 09 15:26:08 f38-nfsd kernel: kasan_set_track+0x21/0x30
Oct 09 15:26:08 f38-nfsd kernel: __kasan_kmalloc+0xa6/0xb0
Oct 09 15:26:08 f38-nfsd kernel: __kmalloc+0x5d/0x160
Oct 09 15:26:08 f38-nfsd kernel: __svc_create+0x2b0/0x470 [sunrpc]
Oct 09 15:26:08 f38-nfsd kernel: lockd_up+0x111/0x3b0 [lockd]
Oct 09 15:26:08 f38-nfsd kernel: nfsd_svc+0x54a/0x580 [nfsd]
Oct 09 15:26:08 f38-nfsd kernel: write_threads+0x148/0x260 [nfsd]
Oct 09 15:26:08 f38-nfsd kernel: nfsctl_transaction_write+0x76/0xa0 [nfsd]
Oct 09 15:26:08 f38-nfsd kernel: vfs_write+0x1b4/0x670
Oct 09 15:26:08 f38-nfsd kernel: ksys_write+0xbd/0x160
Oct 09 15:26:08 f38-nfsd kernel: do_syscall_64+0x38/0x90
Oct 09 15:26:08 f38-nfsd kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Oct 09 15:26:08 f38-nfsd kernel:
Oct 09 15:26:08 f38-nfsd kernel: Freed by task 973:
Oct 09 15:26:08 f38-nfsd kernel: kasan_save_stack+0x1c/0x40
Oct 09 15:26:08 f38-nfsd kernel: kasan_set_track+0x21/0x30
Oct 09 15:26:08 f38-nfsd kernel: kasan_save_free_info+0x27/0x40
Oct 09 15:26:08 f38-nfsd kernel: ____kasan_slab_free+0x166/0x1c0
Oct 09 15:26:08 f38-nfsd kernel: slab_free_freelist_hook+0x9f/0x1e0
Oct 09 15:26:08 f38-nfsd kernel: __kmem_cache_free+0x187/0x2d0
Oct 09 15:26:08 f38-nfsd kernel: svc_destroy+0x105/0x1d0 [sunrpc]
Oct 09 15:26:08 f38-nfsd kernel: svc_exit_thread+0x135/0x150 [sunrpc]
Oct 09 15:26:08 f38-nfsd kernel: lockd+0x15d/0x1a0 [lockd]
Oct 09 15:26:08 f38-nfsd kernel: kthread+0x181/0x1c0
Oct 09 15:26:08 f38-nfsd kernel: ret_from_fork+0x2d/0x50
Oct 09 15:26:08 f38-nfsd kernel: ret_from_fork_asm+0x1b/0x30
Oct 09 15:26:08 f38-nfsd kernel:
Oct 09 15:26:08 f38-nfsd kernel: The buggy address belongs to the object at ffff888112147b00
which belongs to the cache kmalloc-192 of size 192
Oct 09 15:26:08 f38-nfsd kernel: The buggy address is located 184 bytes inside of
freed 192-byte region [ffff888112147b00, ffff888112147bc0)
Oct 09 15:26:08 f38-nfsd kernel:
Oct 09 15:26:08 f38-nfsd kernel: The buggy address belongs to the physical page:
Oct 09 15:26:08 f38-nfsd kernel: page:00000000b3fc2355 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888112147d00 pfn:0x112146
Oct 09 15:26:08 f38-nfsd kernel: head:00000000b3fc2355 order:1 entire_mapcount:0 nr_pages_mapped:0 pincount:0
Oct 09 15:26:08 f38-nfsd kernel: flags: 0x17fffc000000840(slab|head|node=0|zone=2|lastcpupid=0x1ffff)
Oct 09 15:26:08 f38-nfsd kernel: page_type: 0xffffffff()
Oct 09 15:26:08 f38-nfsd kernel: raw: 017fffc000000840 ffff888100042a00 ffffea0004098790 ffffea00043e1b10
Oct 09 15:26:08 f38-nfsd kernel: raw: ffff888112147d00 000000000020000f 00000001ffffffff 0000000000000000
Oct 09 15:26:08 f38-nfsd kernel: page dumped because: kasan: bad access detected
Oct 09 15:26:08 f38-nfsd kernel:
Oct 09 15:26:08 f38-nfsd kernel: Memory state around the buggy address:
Oct 09 15:26:08 f38-nfsd kernel: ffff888112147a80: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
Oct 09 15:26:08 f38-nfsd kernel: ffff888112147b00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Oct 09 15:26:08 f38-nfsd kernel: >ffff888112147b80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
Oct 09 15:26:08 f38-nfsd kernel: ^
Oct 09 15:26:08 f38-nfsd kernel: ffff888112147c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Oct 09 15:26:08 f38-nfsd kernel: ffff888112147c80: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc
Oct 09 15:26:08 f38-nfsd kernel: ==================================================================

--
Jeff Layton <[email protected]>


2023-10-09 20:55:50

by Jeff Layton

[permalink] [raw]
Subject: Re: lockd KASAN pop in svc_exit_thread with nfsd-next branch

On Mon, 2023-10-09 at 16:30 -0400, Jeff Layton wrote:
> I hit this KASAN pop today while rebooting a machine. This kernel
> contains the contents of Chuck's nfsd-next branch, so it should have
> all of Neil's latest thread handling patches.
>
> I was able to reproduce this by just running "systemctl stop nfs-
> server". This does not occur on stock v6.6-rc5:
>
> Oct 09 15:26:08 f38-nfsd kernel: ==================================================================
> Oct 09 15:26:08 f38-nfsd kernel: BUG: KASAN: slab-use-after-free in svc_exit_thread+0x10c/0x150 [sunrpc]
> Oct 09 15:26:08 f38-nfsd kernel: Write of size 8 at addr ffff888112147bb8 by task lockd/973
> Oct 09 15:26:08 f38-nfsd kernel:
> Oct 09 15:26:08 f38-nfsd kernel: CPU: 5 PID: 973 Comm: lockd Not tainted 6.6.0-rc4-g2615e14a89be #8
> Oct 09 15:26:08 f38-nfsd kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc38 04/01/2014
> Oct 09 15:26:08 f38-nfsd kernel: Call Trace:
> Oct 09 15:26:08 f38-nfsd kernel: <TASK>
> Oct 09 15:26:08 f38-nfsd kernel: dump_stack_lvl+0x43/0x60
> Oct 09 15:26:08 f38-nfsd kernel: print_report+0xc5/0x620
> Oct 09 15:26:08 f38-nfsd kernel: ? __pfx__raw_spin_lock_irqsave+0x10/0x10
> Oct 09 15:26:08 f38-nfsd kernel: ? __virt_addr_valid+0xbe/0x130
> Oct 09 15:26:08 f38-nfsd kernel: ? svc_exit_thread+0x10c/0x150 [sunrpc]
> Oct 09 15:26:08 f38-nfsd kernel: kasan_report+0xac/0xe0
> Oct 09 15:26:08 f38-nfsd kernel: ? svc_exit_thread+0x10c/0x150 [sunrpc]
> Oct 09 15:26:08 f38-nfsd kernel: kasan_check_range+0xf0/0x1a0
> Oct 09 15:26:08 f38-nfsd kernel: svc_exit_thread+0x10c/0x150 [sunrpc]
> Oct 09 15:26:08 f38-nfsd kernel: lockd+0x15d/0x1a0 [lockd]
> Oct 09 15:26:08 f38-nfsd kernel: ? __pfx_lockd+0x10/0x10 [lockd]
> Oct 09 15:26:08 f38-nfsd kernel: kthread+0x181/0x1c0
> Oct 09 15:26:08 f38-nfsd kernel: ? __pfx_kthread+0x10/0x10
> Oct 09 15:26:08 f38-nfsd kernel: ret_from_fork+0x2d/0x50
> Oct 09 15:26:08 f38-nfsd kernel: ? __pfx_kthread+0x10/0x10
> Oct 09 15:26:08 f38-nfsd kernel: ret_from_fork_asm+0x1b/0x30
> Oct 09 15:26:08 f38-nfsd kernel: </TASK>
> Oct 09 15:26:08 f38-nfsd kernel:
> Oct 09 15:26:08 f38-nfsd kernel: Allocated by task 835:
> Oct 09 15:26:08 f38-nfsd kernel: kasan_save_stack+0x1c/0x40
> Oct 09 15:26:08 f38-nfsd kernel: kasan_set_track+0x21/0x30
> Oct 09 15:26:08 f38-nfsd kernel: __kasan_kmalloc+0xa6/0xb0
> Oct 09 15:26:08 f38-nfsd kernel: __kmalloc+0x5d/0x160
> Oct 09 15:26:08 f38-nfsd kernel: __svc_create+0x2b0/0x470 [sunrpc]
> Oct 09 15:26:08 f38-nfsd kernel: lockd_up+0x111/0x3b0 [lockd]
> Oct 09 15:26:08 f38-nfsd kernel: nfsd_svc+0x54a/0x580 [nfsd]
> Oct 09 15:26:08 f38-nfsd kernel: write_threads+0x148/0x260 [nfsd]
> Oct 09 15:26:08 f38-nfsd kernel: nfsctl_transaction_write+0x76/0xa0 [nfsd]
> Oct 09 15:26:08 f38-nfsd kernel: vfs_write+0x1b4/0x670
> Oct 09 15:26:08 f38-nfsd kernel: ksys_write+0xbd/0x160
> Oct 09 15:26:08 f38-nfsd kernel: do_syscall_64+0x38/0x90
> Oct 09 15:26:08 f38-nfsd kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> Oct 09 15:26:08 f38-nfsd kernel:
> Oct 09 15:26:08 f38-nfsd kernel: Freed by task 973:
> Oct 09 15:26:08 f38-nfsd kernel: kasan_save_stack+0x1c/0x40
> Oct 09 15:26:08 f38-nfsd kernel: kasan_set_track+0x21/0x30
> Oct 09 15:26:08 f38-nfsd kernel: kasan_save_free_info+0x27/0x40
> Oct 09 15:26:08 f38-nfsd kernel: ____kasan_slab_free+0x166/0x1c0
> Oct 09 15:26:08 f38-nfsd kernel: slab_free_freelist_hook+0x9f/0x1e0
> Oct 09 15:26:08 f38-nfsd kernel: __kmem_cache_free+0x187/0x2d0
> Oct 09 15:26:08 f38-nfsd kernel: svc_destroy+0x105/0x1d0 [sunrpc]
> Oct 09 15:26:08 f38-nfsd kernel: svc_exit_thread+0x135/0x150 [sunrpc]
> Oct 09 15:26:08 f38-nfsd kernel: lockd+0x15d/0x1a0 [lockd]
> Oct 09 15:26:08 f38-nfsd kernel: kthread+0x181/0x1c0
> Oct 09 15:26:08 f38-nfsd kernel: ret_from_fork+0x2d/0x50
> Oct 09 15:26:08 f38-nfsd kernel: ret_from_fork_asm+0x1b/0x30
> Oct 09 15:26:08 f38-nfsd kernel:
> Oct 09 15:26:08 f38-nfsd kernel: The buggy address belongs to the object at ffff888112147b00
> which belongs to the cache kmalloc-192 of size 192
> Oct 09 15:26:08 f38-nfsd kernel: The buggy address is located 184 bytes inside of
> freed 192-byte region [ffff888112147b00, ffff888112147bc0)
> Oct 09 15:26:08 f38-nfsd kernel:
> Oct 09 15:26:08 f38-nfsd kernel: The buggy address belongs to the physical page:
> Oct 09 15:26:08 f38-nfsd kernel: page:00000000b3fc2355 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888112147d00 pfn:0x112146
> Oct 09 15:26:08 f38-nfsd kernel: head:00000000b3fc2355 order:1 entire_mapcount:0 nr_pages_mapped:0 pincount:0
> Oct 09 15:26:08 f38-nfsd kernel: flags: 0x17fffc000000840(slab|head|node=0|zone=2|lastcpupid=0x1ffff)
> Oct 09 15:26:08 f38-nfsd kernel: page_type: 0xffffffff()
> Oct 09 15:26:08 f38-nfsd kernel: raw: 017fffc000000840 ffff888100042a00 ffffea0004098790 ffffea00043e1b10
> Oct 09 15:26:08 f38-nfsd kernel: raw: ffff888112147d00 000000000020000f 00000001ffffffff 0000000000000000
> Oct 09 15:26:08 f38-nfsd kernel: page dumped because: kasan: bad access detected
> Oct 09 15:26:08 f38-nfsd kernel:
> Oct 09 15:26:08 f38-nfsd kernel: Memory state around the buggy address:
> Oct 09 15:26:08 f38-nfsd kernel: ffff888112147a80: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc
> Oct 09 15:26:08 f38-nfsd kernel: ffff888112147b00: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
> Oct 09 15:26:08 f38-nfsd kernel: >ffff888112147b80: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
> Oct 09 15:26:08 f38-nfsd kernel: ^
> Oct 09 15:26:08 f38-nfsd kernel: ffff888112147c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> Oct 09 15:26:08 f38-nfsd kernel: ffff888112147c80: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc
> Oct 09 15:26:08 f38-nfsd kernel: ==================================================================
>

Some follow up info. It looks like the comment in svc_exit_thread about
it not being the last reference may no longer be true for lockd?

$ ./scripts/faddr2line --list net/sunrpc/sunrpc.ko svc_exit_thread+0x10c/0x150
(inlined by) svc_exit_thread at /home/jlayton/git/linux/net/sunrpc/svc.c:943
938 svc_put(serv);
939 /* That svc_put() cannot be the last, because the thread
940 * waiting for SP_VICTIM_REMAINS to clear must hold
941 * a reference. So it is still safe to access pool.
942 */
>943< clear_and_wake_up_bit(SP_VICTIM_REMAINS, &pool->sp_flags);
944 }
945 EXPORT_SYMBOL_GPL(svc_exit_thread);
946
947 /*
948 * Register an "inet" protocol family netid with the local

$ ./scripts/faddr2line --list net/sunrpc/sunrpc.ko svc_exit_thread+0x135/0x150
(inlined by) svc_exit_thread at /home/jlayton/git/linux/net/sunrpc/svc.c:938
933 spin_unlock_bh(&serv->sv_lock);
934 svc_sock_update_bufs(serv);
935
936 svc_rqst_free(rqstp);
937
>938< svc_put(serv);
939 /* That svc_put() cannot be the last, because the thread
940 * waiting for SP_VICTIM_REMAINS to clear must hold
941 * a reference. So it is still safe to access pool.
942 */
943 clear_and_wake_up_bit(SP_VICTIM_REMAINS, &pool->sp_flags);

--
Jeff Layton <[email protected]>