2024-02-28 03:54:17

by Sergey Senozhatsky

Subject: Re: [PATCH 0/2] mm/zsmalloc: simplify synchronization between zs_page_migrate() and free_zspage()

On (24/02/27 03:02), Chengming Zhou wrote:
> free_zspage() has to hold the locks of all pages, since the
> zs_page_migrate() path relies on the page lock to protect against the
> race with zs_free(), so that it can safely get the zspage from page->private.
>
> But this approach is neither good nor simple enough:
>
> 1. Since zs_free() cannot sleep, it can only trylock the pages,
> or it has to call kick_deferred_free() to defer the free to a work item.
>
> 2. Even in the worker context, async_free_zspage() can't simply
> lock all pages in lock_zspage(); it still has to trylock because of
> the race between zs_free() and zs_page_migrate(). Please see
> the commit 2505a981114d ("zsmalloc: fix races between asynchronous
> zspage free and page migration") for details.
>
> Actually, all free_zspage() needs is to get the zspage from the page safely,
> and we can use RCU to achieve that easily. Then free_zspage() doesn't need to
> hold the locks of all pages, so it doesn't need the deferred free mechanism
> at all. This patchset implements that and removes all of the deferred-free
> related code.
>
> Thanks for review and comments!
>
> Signed-off-by: Chengming Zhou <[email protected]>
> ---
> Chengming Zhou (2):
> mm/zsmalloc: don't hold locks of all pages when free_zspage()

That seems to be crashing on me:

[ 28.123867] ==================================================================
[ 28.125303] BUG: KASAN: null-ptr-deref in obj_malloc+0xa9/0x1f0
[ 28.126289] Read of size 8 at addr 0000000000000028 by task mkfs.ext2/432
[ 28.127414]
[ 28.127684] CPU: 8 PID: 432 Comm: mkfs.ext2 Tainted: G N 6.8.0-rc5+ #309
[ 28.129015] Call Trace:
[ 28.129442] <TASK>
[ 28.129805] dump_stack_lvl+0x6f/0xab
[ 28.130437] print_report+0xe0/0x5e0
[ 28.131050] ? _printk+0x59/0x7b
[ 28.131602] ? kasan_report+0x96/0x120
[ 28.132233] ? obj_malloc+0xa9/0x1f0
[ 28.132837] kasan_report+0xe7/0x120
[ 28.133441] ? obj_malloc+0xa9/0x1f0
[ 28.134046] obj_malloc+0xa9/0x1f0
[ 28.134633] zs_malloc+0x22c/0x3e0
[ 28.135211] zram_submit_bio+0x44e/0xee0
[ 28.135871] ? lock_release+0x50c/0x700
[ 28.136520] submit_bio_noacct_nocheck+0x22a/0x650
[ 28.137327] __block_write_full_folio+0x48b/0x710
[ 28.138119] ? __cfi_blkdev_get_block+0x10/0x10
[ 28.138885] ? __cfi_block_write_full_folio+0x10/0x10
[ 28.139737] write_cache_pages+0x83/0xf0
[ 28.140397] ? __cfi_blkdev_get_block+0x10/0x10
[ 28.141152] blkdev_writepages+0x46/0x80
[ 28.141810] do_writepages+0x1be/0x400
[ 28.142443] file_write_and_wait_range+0x104/0x170
[ 28.143254] blkdev_fsync+0x4a/0x70
[ 28.143846] __x64_sys_fsync+0xe9/0x120
[ 28.144491] do_syscall_64+0x8d/0x130
[ 28.145106] entry_SYSCALL_64_after_hwframe+0x46/0x4e