From: Sergey Senozhatsky
To: Andrew Morton
Cc: Minchan Kim , Kyeongdon Kim , linux-kernel@vger.kernel.org, Sergey Senozhatsky , Sergey Senozhatsky , stable@vger.kernel.org
Subject: [PATCH v4 1/2] zram/zcomp: use GFP_NOIO to allocate streams
Date: Tue, 1 Dec 2015 21:36:29 +0900
Message-Id: <1448973390-21170-2-git-send-email-sergey.senozhatsky@gmail.com>
X-Mailer: git-send-email 2.6.2
In-Reply-To: <1448973390-21170-1-git-send-email-sergey.senozhatsky@gmail.com>
References: <1448973390-21170-1-git-send-email-sergey.senozhatsky@gmail.com>

We can end up allocating a new compression stream with GFP_KERNEL from
within the IO path, which may result in nested (recursive) IO operations.
That can introduce problems if the IO path in question is a reclaimer,
holding some locks that will deadlock nested IOs.

Allocate streams and working memory using the GFP_NOIO flag, forbidding
recursive IO and FS operations.

An example:

[ 747.233722] inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
[ 747.233724] git/20158 [HC0[0]:SC0[0]:HE1:SE1] takes:
[ 747.233725] (jbd2_handle){+.+.?.}, at: [] start_this_handle+0x4ca/0x555
[ 747.233733] {IN-RECLAIM_FS-W} state was registered at:
[ 747.233735] [] __lock_acquire+0x8da/0x117b
[ 747.233738] [] lock_acquire+0x10c/0x1a7
[ 747.233740] [] start_this_handle+0x52d/0x555
[ 747.233742] [] jbd2__journal_start+0xb4/0x237
[ 747.233744] [] __ext4_journal_start_sb+0x108/0x17e
[ 747.233748] [] ext4_dirty_inode+0x32/0x61
[ 747.233750] [] __mark_inode_dirty+0x16b/0x60c
[ 747.233754] [] iput+0x11e/0x274
[ 747.233757] [] __dentry_kill+0x148/0x1b8
[ 747.233759] [] shrink_dentry_list+0x274/0x44a
[ 747.233761] [] prune_dcache_sb+0x4a/0x55
[ 747.233763] [] super_cache_scan+0xfc/0x176
[ 747.233767] [] shrink_slab.part.14.constprop.25+0x2a2/0x4d3
[ 747.233770] [] shrink_zone+0x74/0x140
[ 747.233772] [] kswapd+0x6b7/0x930
[ 747.233774] [] kthread+0x107/0x10f
[ 747.233778] [] ret_from_fork+0x3f/0x70
[ 747.233783] irq event stamp: 138297
[ 747.233784] hardirqs last enabled at (138297): [] debug_check_no_locks_freed+0x113/0x12f
[ 747.233786] hardirqs last disabled at (138296): [] debug_check_no_locks_freed+0x33/0x12f
[ 747.233788] softirqs last enabled at (137818): [] __do_softirq+0x2d3/0x3e9
[ 747.233792] softirqs last disabled at (137813): [] irq_exit+0x41/0x95
[ 747.233794] other info that might help us debug this:
[ 747.233796] Possible unsafe locking scenario:
[ 747.233797] CPU0
[ 747.233798] ----
[ 747.233799] lock(jbd2_handle);
[ 747.233801]
[ 747.233801] lock(jbd2_handle);
[ 747.233803] *** DEADLOCK ***
[ 747.233805] 5 locks held by git/20158:
[ 747.233806] #0: (sb_writers#7){.+.+.+}, at: [] mnt_want_write+0x24/0x4b
[ 747.233811] #1: (&type->i_mutex_dir_key#2/1){+.+.+.}, at: [] lock_rename+0xd9/0xe3
[ 747.233817] #2: (&sb->s_type->i_mutex_key#11){+.+.+.}, at: [] lock_two_nondirectories+0x3f/0x6b
[ 747.233822] #3: (&sb->s_type->i_mutex_key#11/4){+.+.+.}, at: [] lock_two_nondirectories+0x66/0x6b
[ 747.233827] #4: (jbd2_handle){+.+.?.}, at: [] start_this_handle+0x4ca/0x555
[ 747.233831] stack backtrace:
[ 747.233834] CPU: 2 PID: 20158 Comm: git Not tainted 4.1.0-rc7-next-20150615-dbg-00016-g8bdf555-dirty #211
[ 747.233837] ffff8800a56cea40 ffff88010d0a75f8 ffffffff814f446d ffffffff81077036
[ 747.233840] ffffffff823a84b0 ffff88010d0a7638 ffffffff814f3849 0000000000000001
[ 747.233843] 000000000000000a ffff8800a56cf6f8 ffff8800a56cea40 ffffffff810795dd
[ 747.233846] Call Trace:
[ 747.233849] [] dump_stack+0x4c/0x6e
[ 747.233852] [] ? up+0x39/0x3e
[ 747.233854] [] print_usage_bug.part.23+0x25b/0x26a
[ 747.233857] [] ? print_shortest_lock_dependencies+0x182/0x182
[ 747.233859] [] mark_lock+0x384/0x56d
[ 747.233862] [] mark_held_locks+0x5f/0x76
[ 747.233865] [] ? zcomp_strm_alloc+0x25/0x73 [zram]
[ 747.233867] [] lockdep_trace_alloc+0xb2/0xb5
[ 747.233870] [] kmem_cache_alloc_trace+0x32/0x1e2
[ 747.233873] [] zcomp_strm_alloc+0x25/0x73 [zram]
[ 747.233876] [] zcomp_strm_multi_find+0xe7/0x173 [zram]
[ 747.233879] [] zcomp_strm_find+0xc/0xe [zram]
[ 747.233881] [] zram_bvec_rw+0x2ca/0x7e0 [zram]
[ 747.233885] [] zram_make_request+0x1fa/0x301 [zram]
[ 747.233889] [] generic_make_request+0x9c/0xdb
[ 747.233891] [] submit_bio+0xf7/0x120
[ 747.233895] [] ? __test_set_page_writeback+0x1a0/0x1b8
[ 747.233897] [] ext4_io_submit+0x2e/0x43
[ 747.233899] [] ext4_bio_write_page+0x1b7/0x300
[ 747.233902] [] mpage_submit_page+0x60/0x77
[ 747.233905] [] mpage_map_and_submit_buffers+0x10f/0x21d
[ 747.233907] [] ext4_writepages+0xc8c/0xe1b
[ 747.233910] [] do_writepages+0x23/0x2c
[ 747.233913] [] __filemap_fdatawrite_range+0x84/0x8b
[ 747.233915] [] filemap_flush+0x1c/0x1e
[ 747.233917] [] ext4_alloc_da_blocks+0xb8/0x117
[ 747.233919] [] ext4_rename+0x132/0x6dc
[ 747.233921] [] ? mark_held_locks+0x5f/0x76
[ 747.233924] [] ext4_rename2+0x29/0x2b
[ 747.233926] [] vfs_rename+0x540/0x636
[ 747.233928] [] SyS_renameat2+0x359/0x44d
[ 747.233931] [] SyS_rename+0x1e/0x20
[ 747.233933] [] entry_SYSCALL_64_fastpath+0x12/0x6f

[minchan@kernel.org: add stable mark]
Signed-off-by: Sergey Senozhatsky
Acked-by: Minchan Kim
Cc: Kyeongdon Kim
Cc:
---
 drivers/block/zram/zcomp.c     | 4 ++--
 drivers/block/zram/zcomp_lz4.c | 2 +-
 drivers/block/zram/zcomp_lzo.c | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index 5cb13ca..c536177 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -76,7 +76,7 @@ static void zcomp_strm_free(struct zcomp *comp, struct zcomp_strm *zstrm)
  */
 static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
 {
-	struct zcomp_strm *zstrm = kmalloc(sizeof(*zstrm), GFP_KERNEL);
+	struct zcomp_strm *zstrm = kmalloc(sizeof(*zstrm), GFP_NOIO);
 	if (!zstrm)
 		return NULL;
 
@@ -85,7 +85,7 @@ static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
 	 * allocate 2 pages. 1 for compressed data, plus 1 extra for the
 	 * case when compressed size is larger than the original one
 	 */
-	zstrm->buffer = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
+	zstrm->buffer = (void *)__get_free_pages(GFP_NOIO | __GFP_ZERO, 1);
 	if (!zstrm->private || !zstrm->buffer) {
 		zcomp_strm_free(comp, zstrm);
 		zstrm = NULL;
diff --git a/drivers/block/zram/zcomp_lz4.c b/drivers/block/zram/zcomp_lz4.c
index f2afb7e..ee44b51 100644
--- a/drivers/block/zram/zcomp_lz4.c
+++ b/drivers/block/zram/zcomp_lz4.c
@@ -15,7 +15,7 @@
 
 static void *zcomp_lz4_create(void)
 {
-	return kzalloc(LZ4_MEM_COMPRESS, GFP_KERNEL);
+	return kzalloc(LZ4_MEM_COMPRESS, GFP_NOIO);
 }
 
 static void zcomp_lz4_destroy(void *private)
diff --git a/drivers/block/zram/zcomp_lzo.c b/drivers/block/zram/zcomp_lzo.c
index da1bc47..683ce04 100644
--- a/drivers/block/zram/zcomp_lzo.c
+++ b/drivers/block/zram/zcomp_lzo.c
@@ -15,7 +15,7 @@
 
 static void *lzo_create(void)
 {
-	return kzalloc(LZO1X_MEM_COMPRESS, GFP_KERNEL);
+	return kzalloc(LZO1X_MEM_COMPRESS, GFP_NOIO);
 }
 
 static void lzo_destroy(void *private)
-- 
2.6.2