Date: Tue, 24 Oct 2023 08:39:06 -0700
From: "Darrick J. Wong"
To: John Garry
Cc: Dave Chinner, linux-kernel@vger.kernel.org, linux-api@vger.kernel.org, martin.petersen@oracle.com, himanshu.madhani@oracle.com
Subject: Re: [PATCH 2/4] readv.2: Document RWF_ATOMIC flag
Message-ID: <20231024153906.GJ11391@frogsfrogsfrogs>

On Tue, Oct 24, 2023 at 01:35:33PM +0100, John Garry wrote:
> On 09/10/2023 22:05, Darrick J. Wong wrote:
> > > > If the file range is a sparse hole, the directio setup will allocate
> > > > space and create an unwritten mapping before issuing the write bio.  The
> > > > rest of the process works the same as preallocations and has the same
> > > > behaviors.
> > > >
> > > > If the file range is allocated and was previously written, the write is
> > > > issued and that's all that's needed from the fs.
> > > > After a crash, reads of the storage device produce the old contents
> > > > or the new contents.
> > > This is exactly what I explained when reviewing the code that
> > > rejected RWF_ATOMIC without O_DSYNC on metadata dirty inodes.
> > I'm glad we agree.
> >
> > John, when you're back from vacation, can we get rid of this language
> > and all those checks under _is_dsync() in the iomap patch?
> >
> > (That code is 100% the result of me handwaving and bellyaching 6 months
> > ago when the team was trying to get all the atomic writes bits working
> > prior to LSF and I was too burned out to think the xfs part through.
> > As a result, I decided that we'd only support strict overwrites for the
> > first iteration.)
>
> So this following additive code in iomap_dio_bio_iter() should be dropped:
>
> ----8<-----
>
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -275,10 +275,11 @@ static inline blk_opf_t iomap_dio_bio_opflags(struct iomap_dio *dio,
>  static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
>                  struct iomap_dio *dio)
>  {
>
> ...
>
> @@ -292,6 +293,13 @@ static loff_t iomap_dio_bio_iter(const struct iomap_iter *iter,
>              !bdev_iter_is_aligned(iomap->bdev, dio->submit.iter))
>                  return -EINVAL;
>
> +        if (atomic_write && !iocb_is_dsync(dio->iocb)) {
> +                if (iomap->flags & IOMAP_F_DIRTY)
> +                        return -EIO;
> +                if (iomap->type != IOMAP_MAPPED)
> +                        return -EIO;
> +        }
> +
>
> ---->8-----
>
> ok?

Yes.

> >
> > > > Summarizing:
> > > >
> > > > An (ATOMIC|SYNC) request provides the strongest guarantees (data
> > > > will not be torn, and all file metadata updates are persisted before
> > > > the write is returned to userspace).  Programs see either the old data or
> > > > the new data, even if there's a crash.
> > > >
> > > > (ATOMIC|DSYNC) is less strong -- data will not be torn, and any file
> > > > updates for just that region are persisted before the write is returned.
> > > >
> > > > (ATOMIC) is the least strong -- data will not be torn.
> > > > Neither the filesystem nor the device make guarantees that anything
> > > > ended up on stable storage, but if it does, programs see either the
> > > > old data or the new data.
> > > Yup, that makes sense to me.
> > Perhaps this ^^ is what we should be documenting here.
> >
> > > > Maybe we should rename the whole UAPI s/atomic/untorn/...
> > > Perhaps, though "torn writes" is nomenclature that nobody outside
> > > storage and filesystem developers really knows about. All I ever
> > > hear from userspace developers is "we want atomic/all-or-nothing
> > > data writes"...

How about O_NOTEARS -> PWF_NOTEARS -> REQ_NOTEARS.

--D

> > Fair 'enuf.
> >
> Thanks,
> John
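
For what it's worth, the three guarantee levels summarized in the quoted text can be sketched as a small C model. This is only an illustration of that summary: the SK_* flag bits, struct sk_guarantees and the function names are hypothetical, and are neither the kernel UAPI values nor anything from the patch series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this sketch only; NOT the kernel's UAPI values. */
enum {
	SK_ATOMIC = 1 << 0,	/* stands in for an untorn (RWF_ATOMIC-style) write */
	SK_DSYNC  = 1 << 1,	/* stands in for O_DSYNC-style semantics */
	SK_SYNC   = 1 << 2,	/* stands in for O_SYNC-style semantics */
};

/* What the caller may rely on once the write has returned. */
struct sk_guarantees {
	bool untorn;			/* readers see all-old or all-new data */
	bool range_persisted;		/* updates for the written region are stable */
	bool all_meta_persisted;	/* all file metadata updates are stable */
};

/* Map a flag combination onto the guarantees listed in the summary. */
static struct sk_guarantees sk_guarantees_for(unsigned int flags)
{
	struct sk_guarantees g = { false, false, false };

	g.untorn = (flags & SK_ATOMIC) != 0;
	/* SYNC implies everything DSYNC gives, plus all metadata. */
	g.range_persisted = (flags & (SK_SYNC | SK_DSYNC)) != 0;
	g.all_meta_persisted = (flags & SK_SYNC) != 0;
	return g;
}

/* Usage example: plain ATOMIC promises untorn data, but nothing on persistence. */
void sk_self_check(void)
{
	struct sk_guarantees g = sk_guarantees_for(SK_ATOMIC);

	assert(g.untorn);
	assert(!g.range_persisted);
	assert(!g.all_meta_persisted);
}
```

The point of modelling it this way is that "untorn" and "persisted" are independent axes: ATOMIC only controls the first, and SYNC/DSYNC only control the second.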