Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S942488AbcJFRSL (ORCPT ); Thu, 6 Oct 2016 13:18:11 -0400
Received: from mx1.redhat.com ([209.132.183.28]:40828 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S942442AbcJFRSC (ORCPT ); Thu, 6 Oct 2016 13:18:02 -0400
Date: Thu, 6 Oct 2016 19:17:58 +0200
From: Oleg Nesterov
To: Dave Chinner
Cc: Jan Kara , Al Viro , Nikolay Borisov , "Paul E. McKenney" , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, fstests@vger.kernel.org
Subject: Re: [PATCH V2 2/2] fs/super.c: don't fool lockdep in freeze_super() and thaw_super() paths
Message-ID: <20161006171758.GA21707@redhat.com>
References: <20160926161856.GB32458@quack2.suse.cz> <20160926165525.GA9338@redhat.com> <20160927065135.GA1139@quack2.suse.cz> <20160927172901.GA11879@redhat.com> <20160930171434.GA2373@redhat.com> <20161002214225.GS9806@dastard> <20161003164435.GB6634@redhat.com> <20161004114341.GA8572@redhat.com> <20161004194435.GW9806@dastard> <20161005164432.GB15121@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20161005164432.GB15121@redhat.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.38]); Thu, 06 Oct 2016 17:18:01 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 5458
Lines: 130

On 10/05, Oleg Nesterov wrote:
>
> On 10/05, Dave Chinner wrote:
> >
> > On Tue, Oct 04, 2016 at 01:43:43PM +0200, Oleg Nesterov wrote:
> >
> > > plus the following warnings:
> > >
> > > [ 1894.500040] run fstests generic/070 at 2016-10-04 05:03:39
> > > [ 1895.076655] =================================
> > > [ 1895.077136] [ INFO: inconsistent lock state ]
> > > [ 1895.077574] 4.8.0 #1 Not tainted
> > > [ 1895.077900] ---------------------------------
> > > [ 1895.078330] inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
> > > [ 1895.078993] fsstress/18239 [HC0[0]:SC0[0]:HE1:SE1] takes:
> > > [ 1895.079522] (&xfs_nondir_ilock_class){++++?-}, at: [] xfs_ilock+0x165/0x210 [xfs]
> > > [ 1895.080529] {IN-RECLAIM_FS-W} state was registered at:
> >
> > And that is a bug in the lockdep annotations for memory allocation because it
> > fails to take into account the current task flags that are set via
> > memalloc_noio_save() to prevent vmalloc from doing GFP_KERNEL allocations. i.e.
> > in _xfs_buf_map_pages():
>
> OK, I see...
>
> I'll re-test with the following change:
>
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -2867,7 +2867,7 @@ static void __lockdep_trace_alloc(gfp_t gfp_mask, unsigned long flags)
>  		return;
>
>  	/* We're only interested __GFP_FS allocations for now */
> -	if (!(gfp_mask & __GFP_FS))
> +	if ((curr->flags & PF_MEMALLOC_NOIO) || !(gfp_mask & __GFP_FS))
>  		return;
>

and with the change above "./check -b auto" finishes without lockdep warnings,
probably I should send this patch to lockdep maintainers.

Now, with 2/2 applied I got the following:

	[ INFO: inconsistent lock state ]
	4.8.0+ #4 Tainted: G W
	---------------------------------
	inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-R} usage.
	kswapd0/32 [HC0[0]:SC0[0]:HE1:SE1] takes:
	 (sb_internal){+++++?}, at: [] __sb_start_write+0xb7/0xf0
	{RECLAIM_FS-ON-W} state was registered at:
	  [] mark_held_locks+0x6f/0xa0
	  [] lockdep_trace_alloc+0xd3/0x120
	  [] kmem_cache_alloc+0x2f/0x280
	  [] kmem_zone_alloc+0x81/0x120 [xfs]
	  [] xfs_trans_alloc+0x6c/0x130 [xfs]
	  [] xfs_sync_sb+0x39/0x80 [xfs]
	  [] xfs_log_sbcount+0x4d/0x50 [xfs]
	  [] xfs_quiesce_attr+0x57/0xb0 [xfs]
	  [] xfs_fs_freeze+0x21/0x40 [xfs]
	  [] freeze_super+0xcf/0x190
	  [] do_vfs_ioctl+0x55f/0x6c0
	  [] SyS_ioctl+0x79/0x90
	  [] entry_SYSCALL_64_fastpath+0x1f/0xbd
	irq event stamp: 36471805
	hardirqs last enabled at (36471805): [] clear_page_dirty_for_io+0x1ed/0x2e0
	hardirqs last disabled at (36471804): [] clear_page_dirty_for_io+0x1bd/0x2e0
	softirqs last enabled at (36468590): [] __do_softirq+0x37a/0x44d
	softirqs last disabled at (36468579): [] irq_exit+0xe5/0xf0

	other info that might help us debug this:
	 Possible unsafe locking scenario:

	       CPU0
	       ----
	  lock(sb_internal);
	    lock(sb_internal);

	 *** DEADLOCK ***

	no locks held by kswapd0/32.

	stack backtrace:
	CPU: 0 PID: 32 Comm: kswapd0 Tainted: G W 4.8.0+ #4
	Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
	 0000000000000086 00000000028a3434 ffff880139b2b520 ffffffff91449193
	 ffff880139b1a680 ffffffff928c1e70 ffff880139b2b570 ffffffff91106c75
	 0000000000000000 0000000000000001 ffff880100000001 000000000000000a
	Call Trace:
	 [] dump_stack+0x85/0xc2
	 [] print_usage_bug+0x215/0x240
	 [] mark_lock+0x58b/0x650
	 [] ? print_shortest_lock_dependencies+0x1a0/0x1a0
	 [] __lock_acquire+0x36d/0x1870
	 [] lock_acquire+0x10d/0x200
	 [] ? __sb_start_write+0xb7/0xf0
	 [] percpu_down_read+0x3c/0x90
	 [] ? __sb_start_write+0xb7/0xf0
	 [] __sb_start_write+0xb7/0xf0
	 [] xfs_trans_alloc+0xe3/0x130 [xfs]
	 [] xfs_iomap_write_allocate+0x1f7/0x380 [xfs]
	 [] ? xfs_map_blocks+0xe3/0x380 [xfs]
	 [] ? rcu_read_lock_sched_held+0x58/0x60
	 [] xfs_map_blocks+0x22a/0x380 [xfs]
	 [] xfs_do_writepage+0x188/0x6c0 [xfs]
	 [] xfs_vm_writepage+0x3b/0x70 [xfs]
	 [] pageout.isra.46+0x190/0x380
	 [] shrink_page_list+0x9ab/0xa70
	 [] shrink_inactive_list+0x252/0x5d0
	 [] shrink_node_memcg+0x5af/0x790
	 [] shrink_node+0xe1/0x320
	 [] kswapd+0x387/0x8b0

Probably false positive? Although when I look at the comment above xfs_sync_sb()
I think that maybe something like below makes sense, but I know absolutely nothing
about fs/ and XFS in particular.

Oleg.

--- x/fs/xfs/xfs_trans.c
+++ x/fs/xfs/xfs_trans.c
@@ -245,7 +245,8 @@ xfs_trans_alloc(
 	atomic_inc(&mp->m_active_trans);
 
 	tp = kmem_zone_zalloc(xfs_trans_zone,
-		(flags & XFS_TRANS_NOFS) ? KM_NOFS : KM_SLEEP);
+		(flags & (XFS_TRANS_NOFS | XFS_TRANS_NO_WRITECOUNT))
+			? KM_NOFS : KM_SLEEP);
 	tp->t_magic = XFS_TRANS_HEADER_MAGIC;
 	tp->t_flags = flags;
 	tp->t_mountp = mp;
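
To spell out what the hunk above changes, here is a small user-space model of the selection logic. It is only a sketch: the flag bit values, the km_mode enum and the helper name pick_trans_alloc_mode() are made up for illustration and are not the kernel's definitions. The point is that a transaction allocated with XFS_TRANS_NO_WRITECOUNT (the freeze path in the trace above reaches xfs_trans_alloc() via xfs_sync_sb()) would no longer allocate in a mode that lets lockdep record RECLAIM_FS-ON.

```c
/* Hypothetical bit values, chosen only for this sketch; the real
 * XFS_TRANS_* definitions live in the XFS headers and may differ. */
#define XFS_TRANS_NO_WRITECOUNT	0x1u
#define XFS_TRANS_NOFS		0x2u

/* Stand-ins for the XFS kmem allocation modes used in the patch. */
enum km_mode { KM_SLEEP, KM_NOFS };

/*
 * Hypothetical helper mirroring the patched expression in
 * xfs_trans_alloc(): pick the no-FS-reclaim mode when the caller asked
 * for NOFS explicitly, or when it skips the write count
 * (XFS_TRANS_NO_WRITECOUNT). Everything else may sleep and enter
 * FS reclaim as before.
 */
static enum km_mode pick_trans_alloc_mode(unsigned int flags)
{
	return (flags & (XFS_TRANS_NOFS | XFS_TRANS_NO_WRITECOUNT))
		? KM_NOFS : KM_SLEEP;
}
```

Before the change only XFS_TRANS_NOFS selected KM_NOFS; in this model a caller passing just XFS_TRANS_NO_WRITECOUNT now gets KM_NOFS as well, and a caller passing neither flag still gets KM_SLEEP.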