Subject: Re: WARNING in account_page_dirtied
To: Jan Kara
Cc: syzbot, akpm@linux-foundation.org, axboe@kernel.dk, hannes@cmpxchg.org, jlayton@redhat.com, keescook@chromium.org, laoar.shao@gmail.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, mhocko@suse.com, syzkaller-bugs@googlegroups.com, tytso@mit.edu, Bob Peterson, cluster-devel@redhat.com
References: <001a113ff9ca1684ab0568cc6bb6@google.com> <20180403120529.z3mthf2v64he52gg@quack2.suse.cz> <20180404123634.6wz5ctjkryzm5nf7@quack2.suse.cz>
From: Steven Whitehouse
Message-ID: <0d2e8961-a14a-e033-030a-ee2bed6c0f9d@redhat.com>
Date: Wed, 4 Apr 2018 13:56:53 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.5.0
MIME-Version: 1.0
In-Reply-To: <20180404123634.6wz5ctjkryzm5nf7@quack2.suse.cz>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Content-Language: en-US
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 04/04/18 13:36, Jan Kara wrote:
> Hi,
>
> On Wed 04-04-18 10:24:48, Steven Whitehouse wrote:
>> On 03/04/18 13:05, Jan Kara wrote:
>>> Hello,
>>>
>>> On Sun 01-04-18 10:01:02, syzbot wrote:
>>>> syzbot hit the following crash on upstream commit
>>>> 10b84daddbec72c6b440216a69de9a9605127f7a (Sat Mar 31 17:59:00 2018 +0000)
>>>> Merge branch 'perf-urgent-for-linus' of
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
>>>> syzbot dashboard link:
>>>> https://syzkaller.appspot.com/bug?extid=b7772c65a1d88bfd8fca
>>>>
>>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5705587757154304
>>>> syzkaller reproducer:
>>>> https://syzkaller.appspot.com/x/repro.syz?id=5644332530925568
>>>> Raw console output:
>>>> https://syzkaller.appspot.com/x/log.txt?id=5472755969425408
>>>> Kernel config:
>>>> https://syzkaller.appspot.com/x/.config?id=-2760467897697295172
>>>> compiler: gcc (GCC) 7.1.1 20170620
>>>>
>>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>>> Reported-by: syzbot+b7772c65a1d88bfd8fca@syzkaller.appspotmail.com
>>>> It will help syzbot understand when the bug is fixed. See footer for
>>>> details.
>>>> If you forward the report, please keep this part and the footer.
>>>>
>>>> gfs2: fsid=loop0.0: jid=0, already locked for use
>>>> gfs2: fsid=loop0.0: jid=0: Looking at journal...
>>>> gfs2: fsid=loop0.0: jid=0: Done
>>>> gfs2: fsid=loop0.0: first mount done, others may mount
>>>> gfs2: fsid=loop0.0: found 1 quota changes
>>>> WARNING: CPU: 0 PID: 4469 at ./include/linux/backing-dev.h:341 inode_to_wb
>>>> include/linux/backing-dev.h:338 [inline]
>>>> WARNING: CPU: 0 PID: 4469 at ./include/linux/backing-dev.h:341
>>>> account_page_dirtied+0x8f9/0xcb0 mm/page-writeback.c:2416
>>>> Kernel panic - not syncing: panic_on_warn set ...
>>>>
>>>> CPU: 0 PID: 4469 Comm: syzkaller368843 Not tainted 4.16.0-rc7+ #9
>>>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
>>>> Google 01/01/2011
>>>> Call Trace:
>>>>  __dump_stack lib/dump_stack.c:17 [inline]
>>>>  dump_stack+0x194/0x24d lib/dump_stack.c:53
>>>>  panic+0x1e4/0x41c kernel/panic.c:183
>>>>  __warn+0x1dc/0x200 kernel/panic.c:547
>>>>  report_bug+0x1f4/0x2b0 lib/bug.c:186
>>>>  fixup_bug.part.10+0x37/0x80 arch/x86/kernel/traps.c:178
>>>>  fixup_bug arch/x86/kernel/traps.c:247 [inline]
>>>>  do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
>>>>  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
>>>>  invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986
>>>> RIP: 0010:inode_to_wb include/linux/backing-dev.h:338 [inline]
>>>> RIP: 0010:account_page_dirtied+0x8f9/0xcb0 mm/page-writeback.c:2416
>>>> RSP: 0018:ffff8801d966e5c0 EFLAGS: 00010093
>>>> RAX: ffff8801acb7e600 RBX: 1ffff1003b2cdcba RCX: ffffffff818f47a9
>>>> RDX: 0000000000000000 RSI: ffff8801d3338148 RDI: 0000000000000082
>>>> RBP: ffff8801d966e698 R08: 1ffff1003b2cdc13 R09: 000000000000000c
>>>> R10: ffff8801d966e558 R11: 0000000000000002 R12: ffff8801c96f0368
>>>> R13: ffffea0006b12780 R14: ffff8801c96f01d8 R15: ffff8801c96f01d8
>>>>  __set_page_dirty+0x100/0x4b0 fs/buffer.c:605
>>>>  mark_buffer_dirty+0x454/0x5d0 fs/buffer.c:1126
>>> Huh, I don't see how this could possibly happen. The warning is:
>>>
>>>     WARN_ON_ONCE(debug_locks &&
>>>                  (!lockdep_is_held(&inode->i_lock) &&
>>>                   !lockdep_is_held(&inode->i_mapping->tree_lock) &&
>>>                   !lockdep_is_held(&inode->i_wb->list_lock)));
>>>
>>> Now __set_page_dirty() which called account_page_dirtied() just did:
>>>
>>>     spin_lock_irqsave(&mapping->tree_lock, flags);
>>>
>>> Now the fact is that account_page_dirtied() actually checks
>>> mapping->host->i_mapping->tree_lock so if mapping->host->i_mapping doesn't
>>> get us back to 'mapping', that would explain the warning. But then
>>> something would have to be very wrong in the GFS2 land... Adding some GFS2
>>> related CCs just in case they have some idea.
>> So I looked at this for some time trying to work out what is going on. I'm
>> still not 100% sure now, but let's see if we can figure it out....
>>
>> The stack trace shows a call path to the end of the journal flush code where
>> we are unpinning pages that have been through the journal. Assuming that
>> jdata is not in use (it is used for some internal files, even if it is not
>> selected by the user) then it is most likely that this applies to a metadata
>> page.
>>
>> For recent gfs2, all the metadata pages are kept in an address space which
>> for inodes is in the relevant glock, and for resource groups is a single
>> address space kept for only that purpose in the super block. In both of
>> those cases the mapping->host points to the block device inode. Since the
>> inode's mapping->host reflects only the block device address space (unused
>> by gfs2) we would not expect it to point back to the relevant address space.
>>
>> As far as I can tell this usage is ok, since it doesn't make much sense to
>> require lots of inodes to be hanging around uselessly just to keep metadata
>> pages in. That after all, is why the address space and inode are separate
>> structures in the first place since it is not a one to one relationship.
>> So I think that probably explains why this triggers, since the test is not
>> really a valid one in all cases,
> The problem is we really do expect mapping->host->i_mapping == mapping as
> we pass mapping and inode interchangeably in the mm code. The address_space
> and inodes are separate structures because you can have many inodes
> pointing to one address space (block devices). However it is not allowed
> for several address_spaces to point to one inode! That way mm code may end
> up using different address_spaces in different places although they should
> be the same one as is the case in this assert... Probably you use these
> address_spaces in a very limited way and so things seem to work but it is
> really a pure coincidence. From a very quick look you seem to be using
> these special address_spaces to track dirty metadata associated with an
> inode? Anything else?
>
> Honza

Yes, either an inode or a rgrp. However I'm fairly sure that we landed
up doing that because we were told that inodes and address spaces were
intended to be independent at some point in the past. They are used in
a fairly limited way and mostly so that we can efficiently invalidate
metadata belonging to a particular inode (or rgrp).

In the rgrp case we could just use the existing block dev inode's
address space except that we'd have to make sure that we invalidated it
on mount. The rgrps are easy because each one is a single extent only.

For the inode metadata case, we did (a very long time ago) try tracking
the metadata in a different way and it was not very efficient at all, so
using a separate address space was the best solution we could find at
the time. We do not want to go back to having two struct inodes for each
real inode since that took up a lot of memory in cases where there were
lots of small files...
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/gfs2/glock.c?id=009d851837ab26cab18adda6169a813f70b0b21b

and now I remember that it also resolved an issue of a circular
dependency between inodes used for the metadata address space and
"proper" inodes too. When we introduced the change in the above patch,
both inodes and glock were using the address spaces in the glock;
however, we further optimised the rgrps at a later date to share a
single address space between them.

So while that doesn't solve the problem, it does, I hope, explain some
of the background,

Steve.
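
For readers following the thread, the relationship under discussion can be
sketched in plain user-space C. This is only an illustrative model: the two
structs are stripped-down stand-ins for the kernel's struct inode and struct
address_space (keeping just the i_mapping and host pointers named above), and
mapping_round_trips() and model_gfs2_layout() are hypothetical helpers, not
kernel functions. It shows why a check made through
mapping->host->i_mapping would look at the block device's address space
rather than the metadata address space whose lock was taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures. These are NOT the real
 * definitions; they keep only the two pointers discussed in the thread. */
struct address_space;

struct inode {
    struct address_space *i_mapping; /* address space the inode points at */
};

struct address_space {
    struct inode *host;              /* inode this address space names as owner */
};

/* Hypothetical helper: true when mapping->host->i_mapping leads back to
 * 'mapping', which is the relationship the mm code is said to expect. */
static bool mapping_round_trips(const struct address_space *mapping)
{
    return mapping->host != NULL && mapping->host->i_mapping == mapping;
}

/* Hypothetical helper: wire up the layout described for gfs2 metadata.
 * The block device inode and its own address space point at each other,
 * while the per-glock metadata address space also names the block device
 * inode as its host. */
static void model_gfs2_layout(struct inode *bdev_inode,
                              struct address_space *bdev_mapping,
                              struct address_space *glock_mapping)
{
    assert(bdev_inode && bdev_mapping && glock_mapping);
    bdev_inode->i_mapping = bdev_mapping;
    bdev_mapping->host = bdev_inode;
    glock_mapping->host = bdev_inode; /* second address space, same host */
}
```

In this model the block device's own address space round-trips, but the
glock address space does not: following its host and then i_mapping lands on
the block device's address space, so a lock held on the glock address space
would not be the one a check via inode->i_mapping inspects.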