From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Josef Bacik, Qu Wenruo, David Sterba, Sasha Levin
Subject: [PATCH 4.14 142/166] btrfs: qgroup: fix data leak caused by race between writeback and truncate
Date: Tue, 29 Sep 2020 13:00:54 +0200
Message-Id: <20200929105942.284255358@linuxfoundation.org>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20200929105935.184737111@linuxfoundation.org>
References: <20200929105935.184737111@linuxfoundation.org>
User-Agent: quilt/0.66
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Qu Wenruo

[ Upstream commit fa91e4aa1716004ea8096d5185ec0451e206aea0 ]

[BUG]
When running tests like generic/013 on test device with btrfs quota
enabled, it can normally lead to data leak, detected at unmount time:

  BTRFS warning (device dm-3): qgroup 0/5 has unreleased space, type 0 rsv 4096
  ------------[ cut here ]------------
  WARNING: CPU: 11 PID: 16386 at fs/btrfs/disk-io.c:4142 close_ctree+0x1dc/0x323 [btrfs]
  RIP: 0010:close_ctree+0x1dc/0x323 [btrfs]
  Call Trace:
   btrfs_put_super+0x15/0x17 [btrfs]
   generic_shutdown_super+0x72/0x110
   kill_anon_super+0x18/0x30
   btrfs_kill_super+0x17/0x30 [btrfs]
   deactivate_locked_super+0x3b/0xa0
   deactivate_super+0x40/0x50
   cleanup_mnt+0x135/0x190
   __cleanup_mnt+0x12/0x20
   task_work_run+0x64/0xb0
   __prepare_exit_to_usermode+0x1bc/0x1c0
   __syscall_return_slowpath+0x47/0x230
   do_syscall_64+0x64/0xb0
   entry_SYSCALL_64_after_hwframe+0x44/0xa9
  ---[ end trace caf08beafeca2392 ]---
  BTRFS error (device dm-3): qgroup reserved space leaked

[CAUSE]
In the offending case, the offending operations are:
2/6: writev f2X[269 1 0 0 0 0] [1006997,67,288] 0
2/7: truncate f2X[269 1 0 0 48 1026293] 18388 0

The following sequence of events could happen after the writev():
        CPU1 (writeback)                |        CPU2 (truncate)
-----------------------------------------------------------------
btrfs_writepages()                      |
|- extent_write_cache_pages()           |
   |- Got page for 1003520              |
   |  1003520 is Dirty, no writeback    |
   |  So (!clear_page_dirty_for_io())   |
   |  gets called for it                |
   |- Now page 1003520 is Clean.        |
   |                                    | btrfs_setattr()
   |                                    | |- btrfs_setsize()
   |                                    |    |- truncate_setsize()
   |                                    |       New i_size is 18388
   |- __extent_writepage()              |
   |  |- page_offset() > i_size         |
      |- btrfs_invalidatepage()         |
         |- Page is clean, so no qgroup |
            callback executed

This means, the qgroup reserved data space is not properly released in
btrfs_invalidatepage() as the page is Clean.

[FIX]
Instead of checking the dirty bit of a page, call
btrfs_qgroup_free_data() unconditionally in btrfs_invalidatepage().

As qgroup rsv are completely bound to the QGROUP_RESERVED bit of
io_tree, not bound to page status, thus we won't cause double freeing
anyway.

Fixes: 0b34c261e235 ("btrfs: qgroup: Prevent qgroup->reserved from going subzero")
CC: stable@vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik
Signed-off-by: Qu Wenruo
Signed-off-by: David Sterba
Signed-off-by: Sasha Levin
---
 fs/btrfs/inode.c | 23 ++++++++++-------------
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 17856e92b93d1..c9e7b92d0f212 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9204,20 +9204,17 @@ again:
 	/*
 	 * Qgroup reserved space handler
 	 * Page here will be either
-	 * 1) Already written to disk
-	 *    In this case, its reserved space is released from data rsv map
-	 *    and will be freed by delayed_ref handler finally.
-	 *    So even we call qgroup_free_data(), it won't decrease reserved
-	 *    space.
-	 * 2) Not written to disk
-	 *    This means the reserved space should be freed here. However,
-	 *    if a truncate invalidates the page (by clearing PageDirty)
-	 *    and the page is accounted for while allocating extent
-	 *    in btrfs_check_data_free_space() we let delayed_ref to
-	 *    free the entire extent.
+	 * 1) Already written to disk or ordered extent already submitted
+	 *    Then its QGROUP_RESERVED bit in io_tree is already cleaned.
+	 *    Qgroup will be handled by its qgroup_record then.
+	 *    btrfs_qgroup_free_data() call will do nothing here.
+	 *
+	 * 2) Not written to disk yet
+	 *    Then btrfs_qgroup_free_data() call will clear the QGROUP_RESERVED
+	 *    bit of its io_tree, and free the qgroup reserved data space.
+	 *    Since the IO will never happen for this page.
 	 */
-	if (PageDirty(page))
-		btrfs_qgroup_free_data(inode, NULL, page_start, PAGE_SIZE);
+	btrfs_qgroup_free_data(inode, NULL, page_start, PAGE_SIZE);
 	if (!inode_evicting) {
 		clear_extent_bit(tree, page_start, page_end,
 				 EXTENT_LOCKED | EXTENT_DIRTY |
-- 
2.25.1
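
Illustration only, not part of the patch: a minimal user-space sketch in C
of the reasoning in [CAUSE] and [FIX]. It assumes a toy model in which one
bool per page stands in for the QGROUP_RESERVED bit of io_tree and another
for PageDirty. Every name prefixed with model_ is hypothetical and does not
exist in the kernel. The sketch contrasts the old dirty-gated release with
the unconditional one, and shows that a release keyed on the reserved bit
cannot free the same range twice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u
#define MODEL_NR_PAGES  512u

/* Hypothetical stand-in for an inode's io_tree state and qgroup counter. */
struct model_inode {
	bool     reserved[MODEL_NR_PAGES]; /* models the QGROUP_RESERVED bit */
	bool     dirty[MODEL_NR_PAGES];    /* models PageDirty */
	uint64_t qgroup_rsv;               /* bytes currently reserved */
};

/* A buffered write: reserve qgroup space once and mark the page dirty. */
static void model_write(struct model_inode *ino, size_t idx)
{
	assert(idx < MODEL_NR_PAGES);
	if (!ino->reserved[idx]) {
		ino->reserved[idx] = true;
		ino->qgroup_rsv += MODEL_PAGE_SIZE;
	}
	ino->dirty[idx] = true;
}

/* Writeback picked the page up: it becomes clean, the reservation stays. */
static void model_clear_dirty_for_io(struct model_inode *ino, size_t idx)
{
	assert(idx < MODEL_NR_PAGES);
	ino->dirty[idx] = false;
}

/*
 * Release keyed on the reserved bit only. Returns the bytes freed, which
 * is 0 when the bit was already clear, so repeated calls are harmless.
 */
static uint64_t model_qgroup_free_data(struct model_inode *ino, size_t idx)
{
	assert(idx < MODEL_NR_PAGES);
	if (!ino->reserved[idx])
		return 0;
	ino->reserved[idx] = false;
	ino->qgroup_rsv -= MODEL_PAGE_SIZE;
	return MODEL_PAGE_SIZE;
}

/* Old behaviour: release only if the page is still dirty. */
static uint64_t model_invalidate_old(struct model_inode *ino, size_t idx)
{
	if (ino->dirty[idx])
		return model_qgroup_free_data(ino, idx);
	return 0;
}

/* Patched behaviour: release unconditionally. */
static uint64_t model_invalidate_new(struct model_inode *ino, size_t idx)
{
	return model_qgroup_free_data(ino, idx);
}
```

In this model, page index 245 corresponds to file offset 1003520
(245 * 4096). Running write, clear-dirty, then model_invalidate_old() leaves
4096 bytes reserved, which matches the "rsv 4096" in the warning above. The
same sequence with model_invalidate_new() brings the counter back to 0, and
a second call frees nothing.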