Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0BBD0C433F5 for ; Wed, 22 Dec 2021 14:47:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1343584AbhLVOr2 (ORCPT ); Wed, 22 Dec 2021 09:47:28 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:44040 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1343578AbhLVOrW (ORCPT ); Wed, 22 Dec 2021 09:47:22 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1640184441; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=KkPrGYIHtROONol+rZCx2qQwfF/Vw8Q6rsv1yxJ+1lA=; b=FKVm6/U5lYQ7SaDn9Ft521q8F+U0uboFBQ55IjvMkMKKrXeIJVj+YNLiGa5BEf7eKtriZe 6nNR3+ygTLiofT+uC6/WIWvxgcGjuskTImMKZRALFfrHFJC5ZB2L5D7HWX68sWLx/WEQoR IfpeYE38+I78snTt9zmvuPtyx9vBPag= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-256-RQyFGqGhOEaecN1tVe6sXw-1; Wed, 22 Dec 2021 09:47:20 -0500 X-MC-Unique: RQyFGqGhOEaecN1tVe6sXw-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 8520418C89C4; Wed, 22 Dec 2021 14:47:18 +0000 (UTC) Received: from work (unknown [10.40.193.216]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 549C11084266; Wed, 22 Dec 2021 14:47:16 +0000 (UTC) Date: Wed, 22 Dec 2021 15:47:12 +0100 From: Lukas Czerner To: Jan Kara Cc: Ye Bin , tytso@mit.edu, adilger.kernel@dilger.ca, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH -next] ext4: Fix BUG_ON in ext4_bread when write quota data Message-ID: <20211222144712.fdkgr3d4cqvgxbi2@work> References: <20211222013537.3096310-1-yebin10@huawei.com> <20211222143757.GD685@quack2.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20211222143757.GD685@quack2.suse.cz> X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org On Wed, Dec 22, 2021 at 03:37:57PM +0100, Jan Kara wrote: > On Wed 22-12-21 09:35:37, Ye Bin wrote: > > We got issue as follows when run syzkaller: > > [ 167.936972] EXT4-fs error (device loop0): __ext4_remount:6314: comm rep: Abort forced by user > > [ 167.938306] EXT4-fs (loop0): Remounting filesystem read-only > > [ 167.981637] Assertion failure in ext4_getblk() at fs/ext4/inode.c:847: '(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) || handle != NULL || create == 0' > > [ 167.983601] ------------[ cut here ]------------ > > [ 167.984245] kernel BUG at fs/ext4/inode.c:847! > > [ 167.984882] invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI > > [ 167.985624] CPU: 7 PID: 2290 Comm: rep Tainted: G B 5.16.0-rc5-next-20211217+ #123 > > [ 167.986823] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20190727_073836-buildvm-ppc64le-16.ppc.fedoraproject.org-3.fc31 04/01/2014 > > [ 167.988590] RIP: 0010:ext4_getblk+0x17e/0x504 > > [ 167.989189] Code: c6 01 74 28 49 c7 c0 a0 a3 5c 9b b9 4f 03 00 00 48 c7 c2 80 9c 5c 9b 48 c7 c6 40 b6 5c 9b 48 c7 c7 20 a4 5c 9b e8 77 e3 fd ff <0f> 0b 8b 04 244 > > [ 167.991679] RSP: 0018:ffff8881736f7398 EFLAGS: 00010282 > > [ 167.992385] RAX: 0000000000000094 RBX: 1ffff1102e6dee75 RCX: 0000000000000000 > > [ 167.993337] RDX: 0000000000000001 RSI: ffffffff9b6e29e0 RDI: ffffed102e6dee66 > > [ 167.994292] RBP: ffff88816a076210 R08: 0000000000000094 R09: ffffed107363fa09 > > [ 167.995252] R10: ffff88839b1fd047 R11: ffffed107363fa08 R12: ffff88816a0761e8 > > [ 167.996205] R13: 0000000000000000 R14: 0000000000000021 R15: 0000000000000001 > > [ 167.997158] FS: 00007f6a1428c740(0000) GS:ffff88839b000000(0000) knlGS:0000000000000000 > > [ 167.998238] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 167.999025] CR2: 00007f6a140716c8 CR3: 0000000133216000 CR4: 00000000000006e0 > > [ 167.999987] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > [ 168.000944] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > [ 168.001899] Call Trace: > > [ 168.002235] > > [ 168.007167] ext4_bread+0xd/0x53 > > [ 168.007612] ext4_quota_write+0x20c/0x5c0 > > [ 168.010457] write_blk+0x100/0x220 > > [ 168.010944] remove_free_dqentry+0x1c6/0x440 > > [ 168.011525] free_dqentry.isra.0+0x565/0x830 > > [ 168.012133] remove_tree+0x318/0x6d0 > > [ 168.014744] remove_tree+0x1eb/0x6d0 > > [ 168.017346] remove_tree+0x1eb/0x6d0 > > [ 168.019969] remove_tree+0x1eb/0x6d0 > > [ 168.022128] qtree_release_dquot+0x291/0x340 > > [ 168.023297] v2_release_dquot+0xce/0x120 > > [ 168.023847] dquot_release+0x197/0x3e0 > > [ 168.024358] ext4_release_dquot+0x22a/0x2d0 > > [ 168.024932] dqput.part.0+0x1c9/0x900 > > [ 168.025430] __dquot_drop+0x120/0x190 > > [ 168.025942] ext4_clear_inode+0x86/0x220 > > [ 168.026472] ext4_evict_inode+0x9e8/0xa22 > > [ 168.028200] evict+0x29e/0x4f0 > > [ 168.028625] dispose_list+0x102/0x1f0 > > [ 168.029148] evict_inodes+0x2c1/0x3e0 > > [ 168.030188] generic_shutdown_super+0xa4/0x3b0 > > [ 168.030817] kill_block_super+0x95/0xd0 > > [ 168.031360] deactivate_locked_super+0x85/0xd0 > > [ 168.031977] cleanup_mnt+0x2bc/0x480 > > [ 168.033062] task_work_run+0xd1/0x170 > > [ 168.033565] do_exit+0xa4f/0x2b50 > > [ 168.037155] do_group_exit+0xef/0x2d0 > > [ 168.037666] __x64_sys_exit_group+0x3a/0x50 > > [ 168.038237] do_syscall_64+0x3b/0x90 > > [ 168.038751] entry_SYSCALL_64_after_hwframe+0x44/0xae > > > > In order to reproduce this problem, the following conditions need to be met: > > 1. Ext4 filesystem with no journal; > > 2. Filesystem image with incorrect quota data; > > 3. Abort filesystem forced by user; > > 4. umount filesystem; > > > > As in ext4_quota_write: > > ... > > if (EXT4_SB(sb)->s_journal && !handle) { > > ext4_msg(sb, KERN_WARNING, "Quota write (off=%llu, len=%llu)" > > " cancelled because transaction is not started", > > (unsigned long long)off, (unsigned long long)len); > > return -EIO; > > } > > ... > > We only check handle if NULL when filesystem has journal. There is need > > check handle if NULL even when filesystem has no journal. > > Hum, so I think we can just drop this whole condition and warning and > instead add a silent return of -EROFS if the filesystem is readonly. That > should also fix the bug and also make aborted case less noisy. The message > from ext4_quota_write() IMO is not very useful and ext4_bread() will > provide us with a BUG and stacktrace which is better for debugging what > went wrong anyway. Can we really only get here when the fs is read only ? We're going to end up with NULL handle here any time the ext4_journal_start() fails in ext4_release_dquot() which I think can fail for plenty of reasons, not just EROFS. -Lukas > > Honza > > > > > Signed-off-by: Ye Bin > > --- > > fs/ext4/super.c | 5 +++-- > > 1 file changed, 3 insertions(+), 2 deletions(-) > > > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > > index 071b7b3c5678..c8ca5811ea65 100644 > > --- a/fs/ext4/super.c > > +++ b/fs/ext4/super.c > > @@ -6955,9 +6955,10 @@ static ssize_t ext4_quota_write(struct super_block *sb, int type, > > struct buffer_head *bh; > > handle_t *handle = journal_current_handle(); > > > > - if (EXT4_SB(sb)->s_journal && !handle) { > > + if (!handle) { > > ext4_msg(sb, KERN_WARNING, "Quota write (off=%llu, len=%llu)" > > - " cancelled because transaction is not started", > > + " cancelled because transaction is not started" > > + " or filesystem is abort forced by user", > > (unsigned long long)off, (unsigned long long)len); > > return -EIO; > > } > > -- > > 2.31.1 > > > -- > Jan Kara > SUSE Labs, CR >