From: Linus Torvalds
Date: Tue, 3 May 2022 09:19:24 -0700
Subject: Re: [GIT PULL] gfs2 fix
To: Andreas Gruenbacher
Cc: Christoph Hellwig, "Darrick J. Wong", Dave Chinner, cluster-devel, Linux Kernel Mailing List
References: <20220426145445.2282274-1-agruenba@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 3, 2022 at 1:56 AM Andreas Gruenbacher wrote:
>
> We still get data corruption with the patch applied. The
> WARN_ON_ONCE(!bytes) doesn't trigger.

Oh well. I was so sure that I'd finally found something.. That partial
write case has had bugs before.

> As an additional experiment, I've added code to check the iterator
> position that iomap_file_buffered_write() returns, and it's all
> looking good as well: an iov_iter_advance(orig_from, written) from the
> original position always gets us to the same iterator.

Yeah, I've looked at the iterator parts (and iov_iter_revert() in
particular) multiple times, because that too is an area where we've
had bugs before.

That too may be easy to get wrong, but I couldn't for the life of me
see any issues there.

> This points at gfs2 getting things wrong after a short write, for
> example, marking a page / folio uptodate that isn't. But the uptodate
> handling happens at the iomap layer, so this doesn't leave me with an
> immediate suspect.

Yeah, the uptodate setting looked safe, particularly with that "if we
copied less than we thought we would, and it wasn't uptodate, just
claim we didn't do anything at all".

That said, I now have a *new* suspect: the 'iter->pos' handling in
iomap_write_iter().

In particular, let's look at iomap_file_buffered_write(), which does:

        while ((ret = iomap_iter(&iter, ops)) > 0)
                iter.processed = iomap_write_iter(&iter, i);

and then look at what happens to iter.pos here.

iomap_write_iter() does this:

        loff_t pos = iter->pos;
        ...
                pos += status;

but it never seems to write the updated position back to the iterator.

So what happens next time iomap_write_iter() gets called?

This looks like such a huge bug that I'm probably missing something,
but I wonder if this is normally hidden by the fact that usually
iomap_write_iter() consumes the whole 'iter', so despite the 'while()'
loop, it's actually effectively only called once.

Except if it gets a short write due to an unhandled page fault..

Am I entirely blind, and that 'iter.pos' is updated somewhere and I
just missed it? Or is this maybe the reason for it all?

                 Linus
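
The question raised above can be made concrete with a small user-space model. This is only a sketch under stated assumptions, not kernel code: `struct toy_iter`, `toy_write_iter()` and `toy_run()` are hypothetical names invented for illustration, and the `outer_advances` flag stands in for the open question of whether something outside the inner function moves the position forward. The inner function copies the position into a local and advances only that local, as in the quoted snippet. The model records where each inner call would start writing.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the iterator; NOT the kernel's struct iomap_iter. */
struct toy_iter {
	long pos;        /* file position the next inner call starts from */
	long len;        /* bytes still to be written */
	long processed;  /* bytes handled by the most recent inner call */
};

/*
 * Inner step, modelled on the quoted snippet: the position is copied
 * into a local, the local is advanced, and nothing is stored back.
 * `chunk` caps how much one call handles (a short write when < len).
 */
static long toy_write_iter(const struct toy_iter *iter, long chunk)
{
	long pos, n;

	assert(chunk > 0);
	pos = iter->pos;
	n = iter->len < chunk ? iter->len : chunk;
	pos += n;        /* local only; iter->pos is untouched */
	(void)pos;
	return n;
}

/*
 * Outer loop. If `outer_advances` is non-zero, the loop itself moves
 * iter.pos forward by iter.processed; otherwise nobody does.
 * Records the starting position of each inner call in `starts`
 * and returns the number of inner calls made.
 */
static size_t toy_run(long start, long len, long chunk, int outer_advances,
		      long *starts, size_t max_calls)
{
	struct toy_iter iter = { .pos = start, .len = len, .processed = 0 };
	size_t calls = 0;

	while (iter.len > 0 && calls < max_calls) {
		starts[calls++] = iter.pos;
		iter.processed = toy_write_iter(&iter, chunk);
		iter.len -= iter.processed;
		if (outer_advances)
			iter.pos += iter.processed;
	}
	return calls;
}
```

In this model, a 10-byte write handled in 4-byte pieces starts its inner calls at 0, 4 and 8 when the outer loop advances the position, and at 0, 0 and 0 when it does not. If a single inner call consumes everything, both variants make exactly one call and look identical, which matches the suspicion that such a problem would only show up after a short write.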