Received: by 2002:a05:6a10:206:0:0:0:0 with SMTP id 6csp464865pxj; Thu, 13 May 2021 09:00:19 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyxXIkNX7RhJsvkpqoyxGn5S0YS1el2oyhRXU1NDCvolVIcWblsLhsi+/nknWDY67LEPzHj X-Received: by 2002:a05:6830:611:: with SMTP id w17mr21959738oti.225.1620921619796; Thu, 13 May 2021 09:00:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1620921619; cv=none; d=google.com; s=arc-20160816; b=x5rI+Pt6ua6paZgH0HnjzhfOlStxUlwhRgR7XOHO920ughYUVq4Yw5Cf51koQpLWWn Ahat5nwTEkUFyltJdDwYLYOz8/OAKvQGsODwQ8lzpSPhdJmOYrxCOAAOlDXC2Fmu0I/3 f79e4lOwG1VjuUW9dX/yBUnejzd0GPVwNO4Rm1WOqa8PJrDSpdgH8qUPcEm8lg05yn66 l2HaHRlRcPZaitCr/yPlTHRzc+9sZO+UMk0I0F8BYkD9JyLKXeWcGPL85uxqN6UfxWB+ PbBuBX1j4ydgtQEphkkSTs177sGmxLZLJ9OXl3SdkKGcgF0t7HWolxUHXDt0u6J+wH/Q /ZeA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject; bh=sC9rS2dsZghLNP4cQrgEl3m85yFLwm9XaohEby3pw50=; b=A7ZdtnET6V5Ryju2lmZgyVge9dCoSG/hxvBGtT3D7qYDxB0/kaEINiP4NhGh4vIL+L CSgtORzuAhnr1xCO2HZya5nUgXOWcWgJ31Eg9LUTUr5yyhUPTj1l+XQrXfE4eb3cWVvS aiEqIPmW/LCRU7Go8aelVKZcavCqC1X5FLEYSubSfHt7CNvtVsX7K3CHYn4NJMLG+iR6 UPWDsvJ81VWxsyHP68/Ay7x8dwFdFOxskyAt1n0hV61Xkxkwm5kaCsQAlA3zcnuFrqel Tu6EY3cGApZ8QiKnlxIVcQPJ6kAv99CbhYD+ZcFHIkLigsfQmy5aZ8A6WbQZEmThRQOK Qpww== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id d82si3312228oib.228.2021.05.13.09.00.01; Thu, 13 May 2021 09:00:19 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234946AbhEMQAX (ORCPT + 99 others); Thu, 13 May 2021 12:00:23 -0400 Received: from mail-pl1-f173.google.com ([209.85.214.173]:46893 "EHLO mail-pl1-f173.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234949AbhEMQAX (ORCPT ); Thu, 13 May 2021 12:00:23 -0400 Received: by mail-pl1-f173.google.com with SMTP id s20so14610427plr.13; Thu, 13 May 2021 08:59:13 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=sC9rS2dsZghLNP4cQrgEl3m85yFLwm9XaohEby3pw50=; b=RNxDtndZ2FbFHXEwkwpkMUmEqYt5xX4dC4P0P7Axp6tGYABzi5MeoGO2dbRc3TPFYS DZHbl8bDdDPd8ZuuAoNPiw5Hjqj5t9+hwjeRDqDiaMDTrC2vkPF62HF59XEZMc1/4h0Z 4VEnXdu6rM0IryXkEmqNTkRbUvpBOEpHHVlY/2MVUolz7iWWCeSPWPY5YNke0tZPmmH9 a+6Fq3ow+RNVw2uV7rPhIWob/GRhPiHEwb4epYSGuQuEGrHBNLp9H3USw96hvqyYh9Ma yLznlrnRI/0PpiX+xgu59+MTgSl3uQVreYZ8r0G01KK4bMM/kUbhAYnA1RFIDHT3shB4 U39w== X-Gm-Message-State: AOAM531nDQMyK/FaUR/scokeFFtgaUUKLJMMC8kZLTsos5nAkw6CJCbg tRPey0kinmAOesqtv41IPnJDSLhYjmkqAw== X-Received: by 2002:a17:902:7d8b:b029:ef:8306:6ae7 with SMTP id a11-20020a1709027d8bb02900ef83066ae7mr9036878plm.26.1620921553393; Thu, 13 May 2021 08:59:13 -0700 (PDT) Received: from ?IPv6:2601:647:4000:d7:6dd7:5386:8c63:ccae? ([2601:647:4000:d7:6dd7:5386:8c63:ccae]) by smtp.gmail.com with ESMTPSA id t4sm2410467pfq.165.2021.05.13.08.59.11 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 13 May 2021 08:59:12 -0700 (PDT) Subject: Re: regression: data corruption with ext4 on LUKS on nvme with torvalds master To: Theodore Ts'o , Changheun Lee Cc: alex_y_xu@yahoo.ca, axboe@kernel.dk, bgoncalv@redhat.com, dm-crypt@saout.de, hch@lst.de, jaegeuk@kernel.org, linux-block@vger.kernel.org, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, ming.lei@redhat.com, yi.zhang@redhat.com References: <20210513094222.17635-1-nanich.lee@samsung.com> From: Bart Van Assche Message-ID: Date: Thu, 13 May 2021 08:59:10 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.10.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org On 5/13/21 7:15 AM, Theodore Ts'o wrote: > On Thu, May 13, 2021 at 06:42:22PM +0900, Changheun Lee wrote: >> >> Problem might be casued by exhausting of memory. And memory exhausting >> would be caused by setting of small bio_max_size. Actually it was not >> reproduced in my VM environment at first. But, I reproduced same problem >> when bio_max_size is set with 8KB forced. Too many bio allocation would >> be occurred by setting of 8KB bio_max_size. > > Hmm... I'm not sure how to align your diagnosis with the symptoms in > the bug report. If we were limited by memory, that should slow down > the I/O, but we should still be making forward progress, no? And a > forced reboot should not result in data corruption, unless maybe there > was a missing check for a failed memory allocation, causing data to be > written to the wrong location, a missing error check leading to the > block or file system layer not noticing that a write had failed > (although again, memory exhaustion should not lead to failed writes; > it might slow us down, sure, but if writes are being failed, something > is Badly Going Wrong --- things like writes to the swap device or > writes by the page cleaner must succeed, or else Things Would Go Bad > In A Hurry). After the LUKS data corruption issue was reported I decided to take a look at the dm-crypt code. In that code I found the following: static void clone_init(struct dm_crypt_io *io, struct bio *clone) { struct crypt_config *cc = io->cc; clone->bi_private = io; clone->bi_end_io = crypt_endio; bio_set_dev(clone, cc->dev->bdev); clone->bi_opf = io->base_bio->bi_opf; } [ ... ] static struct bio *crypt_alloc_buffer(struct dm_crypt_io *io, unsigned size) { [ ... ] clone = bio_alloc_bioset(GFP_NOIO, nr_iovecs, &cc->bs); [ ... ] clone_init(io, clone); [ ... ] for (i = 0; i < nr_iovecs; i++) { [ ... ] bio_add_page(clone, page, len, 0); remaining_size -= len; } [ ... ] } My interpretation is that crypt_alloc_buffer() allocates a bio, associates it with the underlying device and clones a bio. The input bio may have a size up to UINT_MAX while the new limit for the size of the cloned bio is max_sectors * 512. That causes bio_add_page() to fail if the input bio is larger than max_sectors * 512, hence the data corruption. Please note that this is a guess only and that I'm not familiar with the dm-crypt code. Bart.