Subject: Re: WARNING in account_page_dirtied
From: Steven Whitehouse
To: Jan Kara, syzbot
Cc: akpm@linux-foundation.org, axboe@kernel.dk, hannes@cmpxchg.org, jlayton@redhat.com, keescook@chromium.org, laoar.shao@gmail.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, mhocko@suse.com, syzkaller-bugs@googlegroups.com, tytso@mit.edu, Bob Peterson, cluster-devel@redhat.com
References: <001a113ff9ca1684ab0568cc6bb6@google.com> <20180403120529.z3mthf2v64he52gg@quack2.suse.cz>
Date: Wed, 4 Apr 2018 10:24:48 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0
In-Reply-To: <20180403120529.z3mthf2v64he52gg@quack2.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 03/04/18 13:05, Jan Kara wrote:
> Hello,
>
> On Sun 01-04-18 10:01:02, syzbot wrote:
>> syzbot hit the following crash on upstream commit
>> 10b84daddbec72c6b440216a69de9a9605127f7a (Sat Mar 31 17:59:00 2018 +0000)
>> Merge branch 'perf-urgent-for-linus' of
>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>> syzbot dashboard link:
>> https://syzkaller.appspot.com/bug?extid=b7772c65a1d88bfd8fca
>>
>> C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5705587757154304
>> syzkaller reproducer:
>> https://syzkaller.appspot.com/x/repro.syz?id=5644332530925568
>> Raw console output:
>> https://syzkaller.appspot.com/x/log.txt?id=5472755969425408
>> Kernel config:
>> https://syzkaller.appspot.com/x/.config?id=-2760467897697295172
>> compiler: gcc (GCC) 7.1.1 20170620
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+b7772c65a1d88bfd8fca@syzkaller.appspotmail.com
>> It will help syzbot understand when the bug is fixed. See footer for
>> details.
>> If you forward the report, please keep this part and the footer.
>>
>> gfs2: fsid=loop0.0: jid=0, already locked for use
>> gfs2: fsid=loop0.0: jid=0: Looking at journal...
>> gfs2: fsid=loop0.0: jid=0: Done
>> gfs2: fsid=loop0.0: first mount done, others may mount
>> gfs2: fsid=loop0.0: found 1 quota changes
>> WARNING: CPU: 0 PID: 4469 at ./include/linux/backing-dev.h:341 inode_to_wb
>> include/linux/backing-dev.h:338 [inline]
>> WARNING: CPU: 0 PID: 4469 at ./include/linux/backing-dev.h:341
>> account_page_dirtied+0x8f9/0xcb0 mm/page-writeback.c:2416
>> Kernel panic - not syncing: panic_on_warn set ...
>>
>> CPU: 0 PID: 4469 Comm: syzkaller368843 Not tainted 4.16.0-rc7+ #9
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>> Google 01/01/2011
>> Call Trace:
>> __dump_stack lib/dump_stack.c:17 [inline]
>> dump_stack+0x194/0x24d lib/dump_stack.c:53
>> panic+0x1e4/0x41c kernel/panic.c:183
>> __warn+0x1dc/0x200 kernel/panic.c:547
>> report_bug+0x1f4/0x2b0 lib/bug.c:186
>> fixup_bug.part.10+0x37/0x80 arch/x86/kernel/traps.c:178
>> fixup_bug arch/x86/kernel/traps.c:247 [inline]
>> do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
>> do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
>> invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986
>> RIP: 0010:inode_to_wb include/linux/backing-dev.h:338 [inline]
>> RIP: 0010:account_page_dirtied+0x8f9/0xcb0 mm/page-writeback.c:2416
>> RSP: 0018:ffff8801d966e5c0 EFLAGS: 00010093
>> RAX: ffff8801acb7e600 RBX: 1ffff1003b2cdcba RCX: ffffffff818f47a9
>> RDX: 0000000000000000 RSI: ffff8801d3338148 RDI: 0000000000000082
>> RBP: ffff8801d966e698 R08: 1ffff1003b2cdc13 R09: 000000000000000c
>> R10: ffff8801d966e558 R11: 0000000000000002 R12: ffff8801c96f0368
>> R13: ffffea0006b12780 R14: ffff8801c96f01d8 R15: ffff8801c96f01d8
>> __set_page_dirty+0x100/0x4b0 fs/buffer.c:605
>> mark_buffer_dirty+0x454/0x5d0 fs/buffer.c:1126
> Huh, I don't see how this could possibly happen. The warning is:
>
>         WARN_ON_ONCE(debug_locks &&
>                      (!lockdep_is_held(&inode->i_lock) &&
>                       !lockdep_is_held(&inode->i_mapping->tree_lock) &&
>                       !lockdep_is_held(&inode->i_wb->list_lock)));
>
> Now __set_page_dirty() which called account_page_dirtied() just did:
>
>         spin_lock_irqsave(&mapping->tree_lock, flags);
>
> Now the fact is that account_page_dirtied() actually checks
> mapping->host->i_mapping->tree_lock so if mapping->host->i_mapping doesn't
> get us back to 'mapping', that would explain the warning. But then
> something would have to be very wrong in the GFS2 land... Adding some GFS2
> related CCs just in case they have some idea.

So I looked at this for some time trying to work out what is going on.
I'm still not 100% sure now, but let's see if we can figure it out....

The stack trace shows a call path to the end of the journal flush code,
where we are unpinning pages that have been through the journal.
Assuming that jdata is not in use (it is used for some internal files,
even if it is not selected by the user), it is most likely that this
applies to a metadata page.

For recent gfs2, all the metadata pages are kept in an address space
which, for inodes, is in the relevant glock and, for resource groups, is
a single address space kept only for that purpose in the super block. In
both of those cases mapping->host points to the block device inode.
Since that inode's i_mapping refers only to the block device's own
address space (unused by gfs2), we would not expect
mapping->host->i_mapping to point back to the metadata address space.

As far as I can tell this usage is ok, since it doesn't make much sense
to require lots of inodes to be hanging around uselessly just to keep
metadata pages in. That, after all, is why the address space and inode
are separate structures in the first place, since it is not a one-to-one
relationship.

So I think that probably explains why this triggers, since the test is
not really a valid one in all cases,

Steve.

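To make the mismatch concrete, here is a minimal user-space sketch (it is
not kernel code). The two structs are heavily simplified stand-ins for
struct address_space and struct inode, and the helper names init_inode(),
init_metadata_mapping(), mapping_round_trips() and demo() are made up for
illustration. It only models the pointer relationship described above: a
metadata address space hosted by the block device inode does not get back
to itself via mapping->host->i_mapping, which is the path the lockdep
check follows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-ins for the kernel structures. */
struct inode;

struct address_space {
	struct inode *host;               /* inode this mapping claims as owner */
};

struct inode {
	struct address_space *i_mapping;  /* mapping the inode itself uses */
	struct address_space i_data;      /* mapping embedded in the inode */
};

/* An ordinary inode: its i_mapping is its own embedded i_data. */
void init_inode(struct inode *inode)
{
	inode->i_data.host = inode;
	inode->i_mapping = &inode->i_data;
}

/*
 * A gfs2-style metadata mapping (assumption: modelled only by its host
 * pointer). It lives outside the inode but names the bdev inode as host.
 */
void init_metadata_mapping(struct address_space *meta,
			   struct inode *bdev_inode)
{
	meta->host = bdev_inode;
}

/*
 * True when mapping->host->i_mapping leads back to 'mapping', i.e. when
 * the tree_lock taken by the caller is the one the check would look at.
 */
bool mapping_round_trips(const struct address_space *mapping)
{
	return mapping->host->i_mapping == mapping;
}

/* Walks through both cases discussed in the thread. */
void demo(void)
{
	struct inode bdev;
	struct address_space meta;

	init_inode(&bdev);
	init_metadata_mapping(&meta, &bdev);

	/* The block device's own mapping round-trips as expected. */
	assert(mapping_round_trips(bdev.i_mapping));
	/* The separate metadata mapping does not. */
	assert(!mapping_round_trips(&meta));
}
```

Calling demo() from a small main() runs both assertions. In this model
the second case is the one where a check made through
mapping->host->i_mapping would look at a different lock from the one
taken on 'mapping'.
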
>> gfs2_unpin+0x143/0x12c0 fs/gfs2/lops.c:108
>> buf_lo_after_commit+0x273/0x430 fs/gfs2/lops.c:512
>> lops_after_commit fs/gfs2/lops.h:67 [inline]
>> gfs2_log_flush+0xe2a/0x2750 fs/gfs2/log.c:809
>> do_sync+0x666/0xe40 fs/gfs2/quota.c:958
>> gfs2_quota_sync+0x2cc/0x570 fs/gfs2/quota.c:1301
>> gfs2_sync_fs+0x46/0xb0 fs/gfs2/super.c:956
>> __sync_filesystem fs/sync.c:39 [inline]
>> sync_filesystem+0x188/0x2e0 fs/sync.c:64
>> generic_shutdown_super+0xd5/0x540 fs/super.c:425
>> kill_block_super+0x9b/0xf0 fs/super.c:1146
>> gfs2_kill_sb+0x133/0x1b0 fs/gfs2/ops_fstype.c:1392
>> deactivate_locked_super+0x88/0xd0 fs/super.c:312
>> deactivate_super+0x141/0x1b0 fs/super.c:343
>> cleanup_mnt+0xb2/0x150 fs/namespace.c:1173
>> __cleanup_mnt+0x16/0x20 fs/namespace.c:1180
>> task_work_run+0x199/0x270 kernel/task_work.c:113
>> exit_task_work include/linux/task_work.h:22 [inline]
>> do_exit+0x9bb/0x1ad0 kernel/exit.c:865
>> do_group_exit+0x149/0x400 kernel/exit.c:968
>> SYSC_exit_group kernel/exit.c:979 [inline]
>> SyS_exit_group+0x1d/0x20 kernel/exit.c:977
>> do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
>> entry_SYSCALL_64_after_hwframe+0x42/0xb7
>> RIP: 0033:0x456c29
>> RSP: 002b:00007fff74938dc8 EFLAGS: 00000202 ORIG_RAX: 00000000000000e7
>> RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000456c29
>> RDX: 00000000004170e0 RSI: 0000000000000000 RDI: 0000000000000001
>> RBP: 0000000000000003 R08: 000000000000000a R09: 0000000000418100
>> R10: 00000000200a9300 R11: 0000000000000202 R12: 0000000000000004
>> R13: 0000000000418100 R14: 0000000000000000 R15: 0000000000000000
>> Dumping ftrace buffer:
>> (ftrace buffer empty)
>> Kernel Offset: disabled
>> Rebooting in 86400 seconds..
> Honza