Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754520AbZGDVtb (ORCPT ); Sat, 4 Jul 2009 17:49:31 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1752608AbZGDVtV (ORCPT ); Sat, 4 Jul 2009 17:49:21 -0400 Received: from qmta12.emeryville.ca.mail.comcast.net ([76.96.27.227]:39877 "EHLO QMTA12.emeryville.ca.mail.comcast.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752521AbZGDVtU (ORCPT ); Sat, 4 Jul 2009 17:49:20 -0400 To: tytso@mit.edu, adilger@sun.com, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org Subject: ext4 lockdep report re possible reclaim deadlock on jbd2_handle X-Message-Flag: Warning: May contain useful information X-Priority: 1 X-MSMail-Priority: High From: Roland Dreier Date: Sat, 04 Jul 2009 14:49:06 -0700 Message-ID: <87prcgp19p.fsf@shaolin.home.digitalvampire.org> User-Agent: Gnus/5.1008 (Gnus v5.10.8) XEmacs/21.4.21 (linux) MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 7127 Lines: 139 Hi, I recently got the following lockdep warning on my laptop. The kernel is Ubuntu's tree with Linus's git up to d960eea9 pulled it; I don't think there are any non-mainline ext4 changes involved, and the warning below does look valid on mainline. As far as I can tell, lockdep is warning that jbd2_handle is usually acquired while doing reclaim -- which makes sense, as pushing dirty inodes out of memory is of course going to want to do fs transactions; but also there is the code path: ext4_ext_truncate(): ... handle = ext4_journal_start(inode, err); ... ext4_block_truncate_page(handle, mapping, inode->i_size); ... where we have ext4_block_truncate_page(): ... page = grab_cache_page(mapping, from >> PAGE_CACHE_SHIFT); and grab_cache_page() expands as: static inline struct page *grab_cache_page(struct address_space *mapping, pgoff_t index) { return find_or_create_page(mapping, index, mapping_gfp_mask(mapping)); } and if find_or_create_page() doesn't find the page, it just does: page = __page_cache_alloc(gfp_mask); Now, I'm not much of a VM or FS expert, but I see no reason why find_or_create_page() would never have to create a page in the ext4 usage, and I don't see any reason mapping_gfp_mask() would never include __GFP_FS; so it seems that lockdep is correct that this codepath could then trigger reclaim and deadlock by calling back into ext4 and trying to start another transaction. (And indeed the my system did hit the page allocate path, although perhaps that is a bug somewhere else) It does seem a little odd to me that no one else has seen this, since I thought quite a few people tested with lockdep enabled. So maybe something is odd about my system or kernel config. Anyway, I'm happy to try further tests and/or debugging patches. Full lockdep output is below. Thanks, Roland [20786.363249] ================================= [20786.363257] [ INFO: inconsistent lock state ] [20786.363265] 2.6.31-2-generic #14~rbd4gitd960eea9 [20786.363270] --------------------------------- [20786.363276] inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage. [20786.363285] http/8397 [HC0[0]:SC0[0]:HE1:SE1] takes: [20786.363291] (jbd2_handle){+.+.?.}, at: [] jbd2_journal_start+0xdb/0x150 [20786.363314] {IN-RECLAIM_FS-W} state was registered at: [20786.363320] [] mark_irqflags+0xc6/0x1a0 [20786.363334] [] __lock_acquire+0x287/0x430 [20786.363345] [] lock_acquire+0xa5/0x150 [20786.363355] [] jbd2_journal_start+0xfa/0x150 [20786.363365] [] ext4_journal_start_sb+0x58/0x90 [20786.363377] [] ext4_delete_inode+0xc5/0x2c0 [20786.363389] [] generic_delete_inode+0xd3/0x1a0 [20786.363401] [] generic_drop_inode+0x25/0x30 [20786.363411] [] iput+0x62/0x70 [20786.363420] [] dentry_iput+0x98/0x110 [20786.363429] [] d_kill+0x50/0x80 [20786.363438] [] dput+0x95/0x180 [20786.363447] [] ecryptfs_d_release+0x2b/0x70 [20786.363459] [] d_free+0x28/0x60 [20786.363468] [] d_kill+0x68/0x80 [20786.363477] [] prune_one_dentry+0xa3/0xc0 [20786.363487] [] __shrink_dcache_sb+0x271/0x290 [20786.363497] [] prune_dcache+0x109/0x1b0 [20786.363506] [] shrink_dcache_memory+0x3f/0x50 [20786.363516] [] shrink_slab+0x12d/0x190 [20786.363527] [] balance_pgdat+0x4d7/0x640 [20786.363537] [] kswapd+0x117/0x170 [20786.363546] [] kthread+0x9e/0xb0 [20786.363558] [] child_rip+0xa/0x20 [20786.363569] [] 0xffffffffffffffff [20786.363598] irq event stamp: 15997 [20786.363603] hardirqs last enabled at (15997): [] kmem_cache_alloc+0xfd/0x1a0 [20786.363617] hardirqs last disabled at (15996): [] kmem_cache_alloc+0x61/0x1a0 [20786.363628] softirqs last enabled at (15966): [] __do_softirq+0x14a/0x220 [20786.363641] softirqs last disabled at (15861): [] call_softirq+0x1c/0x30 [20786.363651] [20786.363653] other info that might help us debug this: [20786.363660] 3 locks held by http/8397: [20786.363665] #0: (&sb->s_type->i_mutex_key#8){+.+.+.}, at: [] do_truncate+0x64/0x90 [20786.363685] #1: (&sb->s_type->i_alloc_sem_key#5){+++++.}, at: [] notify_change+0x250/0x350 [20786.363707] #2: (jbd2_handle){+.+.?.}, at: [] jbd2_journal_start+0xdb/0x150 [20786.363724] [20786.363726] stack backtrace: [20786.363734] Pid: 8397, comm: http Tainted: G C 2.6.31-2-generic #14~rbd4gitd960eea9 [20786.363741] Call Trace: [20786.363752] [] print_usage_bug+0x18c/0x1a0 [20786.363763] [] ? check_usage_backwards+0x0/0xb0 [20786.363773] [] mark_lock_irq+0xf2/0x280 [20786.363783] [] mark_lock+0x137/0x1d0 [20786.363793] [] mark_held_locks+0x6c/0xa0 [20786.363803] [] lockdep_trace_alloc+0xaf/0xe0 [20786.363813] [] __alloc_pages_nodemask+0x7c/0x180 [20786.363824] [] ? find_get_page+0x91/0xf0 [20786.363835] [] alloc_pages_current+0x87/0xd0 [20786.363845] [] __page_cache_alloc+0x67/0x70 [20786.363856] [] find_or_create_page+0x4f/0xb0 [20786.363867] [] ext4_block_truncate_page+0x3e/0x460 [20786.363876] [] ? jbd2_journal_start+0xfa/0x150 [20786.363885] [] ? jbd2_journal_start+0xdb/0x150 [20786.363895] [] ? ext4_meta_trans_blocks+0x75/0xf0 [20786.363905] [] ext4_ext_truncate+0x1bb/0x1e0 [20786.363916] [] ? unmap_mapping_range+0x75/0x290 [20786.363926] [] ext4_truncate+0x498/0x630 [20786.363938] [] ? _raw_spin_unlock+0x5e/0xb0 [20786.363947] [] ? unmap_mapping_range+0xb6/0x290 [20786.363957] [] ? trace_hardirqs_on+0xd/0x10 [20786.363966] [] ? jbd2_journal_stop+0x1f8/0x2e0 [20786.363976] [] vmtruncate+0xb0/0x110 [20786.363986] [] inode_setattr+0x35/0x170 [20786.363995] [] ext4_setattr+0x186/0x370 [20786.364005] [] notify_change+0x16b/0x350 [20786.364014] [] do_truncate+0x70/0x90 [20786.364021] [] T.657+0xeb/0x110 [20786.364021] [] sys_ftruncate+0xe/0x10 [20786.364021] [] system_call_fastpath+0x16/0x1b -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/