From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Filipe Manana, David Sterba
Subject: [PATCH 4.14 010/126] Btrfs: fix data corruption when deduplicating between different files
Date: Tue, 18 Sep 2018 00:40:58 +0200
Message-Id: <20180917211705.101518761@linuxfoundation.org>
In-Reply-To: <20180917211703.481236999@linuxfoundation.org>
References: <20180917211703.481236999@linuxfoundation.org>
X-Mailing-List: linux-kernel@vger.kernel.org

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Filipe Manana

commit de02b9f6bb65a6a1848f346f7a3617b7a9b930c0 upstream.

If we deduplicate extents between two different files we can end up
corrupting data if the source range ends at the size of the source file,
the source file's size is not aligned to the filesystem's block size and
the destination range does not go past the size of the destination file.

Example:

  $ mkfs.btrfs -f /dev/sdb
  $ mount /dev/sdb /mnt

  $ xfs_io -f -c "pwrite -S 0x6b 0 2518890" /mnt/foo
  # The first byte with a value of 0xae starts at an offset (2518890)
  # which is not a multiple of the sector size.
  $ xfs_io -c "pwrite -S 0xae 2518890 102398" /mnt/foo

  # Confirm the file content is full of bytes with values 0x6b and 0xae.
  $ od -t x1 /mnt/foo
  0000000 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
  *
  11467540 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ae ae ae ae ae ae
  11467560 ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae
  *
  11777540 ae ae ae ae ae ae ae ae
  11777550

  # Create a second file with a length not aligned to the sector size,
  # whose bytes all have the value 0x6b, so that its extent(s) can be
  # deduplicated with the first file.
  $ xfs_io -f -c "pwrite -S 0x6b 0 557771" /mnt/bar

  # Now deduplicate the entire second file into a range of the first file
  # that also has all bytes with the value 0x6b. The destination range's
  # end offset must not be aligned to the sector size and must be less
  # than the offset of the first byte with the value 0xae (byte at offset
  # 2518890).
  $ xfs_io -c "dedupe /mnt/bar 0 1957888 557771" /mnt/foo

  # The bytes in the range starting at offset 2515659 (end of the
  # deduplication range) and ending at offset 2519040 (that offset
  # rounded up to the block size) must keep their original values (0x6b
  # up to offset 2518890, 0xae from there on) and must not be replaced
  # with 0x00 values. In other words, we should have exactly the same
  # data we had before we asked for deduplication.
  $ od -t x1 /mnt/foo
  0000000 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
  *
  11467540 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b ae ae ae ae ae ae
  11467560 ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae
  *
  11777540 ae ae ae ae ae ae ae ae
  11777550

  # Unmount the filesystem and mount it again. This guarantees any file
  # data in the page cache is dropped.
  $ umount /dev/sdb
  $ mount /dev/sdb /mnt

  $ od -t x1 /mnt/foo
  0000000 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
  *
  11461300 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 00 00 00 00 00
  11461320 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  *
  11470000 ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae ae
  *
  11777540 ae ae ae ae ae ae ae ae
  11777550

  # The bytes in the range 2515659 to 2519040 now have a value of 0x00
  # instead of their original values (0x6b and 0xae): data corruption
  # happened due to the deduplication operation.

So fix this by rounding down, to the sector size, the length used for the
deduplication when the following conditions are met:

  1) Source file's range ends at its i_size;
  2) Source file's i_size is not aligned to the sector size;
  3) Destination range does not cross the i_size of the destination file.

Fixes: e1d227a42ea2 ("btrfs: Handle unaligned length in extent_same")
CC: stable@vger.kernel.org # 4.2+
Signed-off-by: Filipe Manana
Signed-off-by: David Sterba
Signed-off-by: Greg Kroah-Hartman

---
 fs/btrfs/ioctl.c |   19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3158,6 +3158,25 @@ static int btrfs_extent_same(struct inod
 
 		same_lock_start = min_t(u64, loff, dst_loff);
 		same_lock_len = max_t(u64, loff, dst_loff) + len - same_lock_start;
+	} else {
+		/*
+		 * If the source and destination inodes are different, the
+		 * source's range end offset matches the source's i_size, that
+		 * i_size is not a multiple of the sector size, and the
+		 * destination range does not go past the destination's i_size,
+		 * we must round down the length to the nearest sector size
+		 * multiple. If we don't do this adjustment we end replacing
+		 * with zeroes the bytes in the range that starts at the
+		 * deduplication range's end offset and ends at the next sector
+		 * size multiple.
+		 */
+		if (loff + olen == i_size_read(src) &&
+		    dst_loff + len < i_size_read(dst)) {
+			const u64 sz = BTRFS_I(src)->root->fs_info->sectorsize;
+
+			len = round_down(i_size_read(src), sz) - loff;
+			olen = len;
+		}
 	}
 
 	/* don't make the dst file partly checksummed */