Date: Wed, 10 Jan 2024 17:40:56 -0800
From: "Darrick J. Wong"
To: Christoph Hellwig
Cc: Dave Chinner, John Garry, axboe@kernel.dk, kbusch@kernel.org,
	sagi@grimberg.me, jejb@linux.ibm.com, martin.petersen@oracle.com,
	viro@zeniv.linux.org.uk, brauner@kernel.org, dchinner@redhat.com,
	jack@suse.cz, linux-block@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	tytso@mit.edu, jbongio@google.com, linux-scsi@vger.kernel.org,
	ming.lei@redhat.com, bvanassche@acm.org, ojaswin@linux.ibm.com
Subject: Re: [PATCH v2 00/16] block atomic writes
Message-ID: <20240111014056.GL722975@frogsfrogsfrogs>
References: <20231219051456.GB3964019@frogsfrogsfrogs>
	<20231219052121.GA338@lst.de>
	<76c85021-dd9e-49e3-80e3-25a17c7ca455@oracle.com>
	<20231219151759.GA4468@lst.de>
	<20231221065031.GA25778@lst.de>
	<73d03703-6c57-424a-80ea-965e636c34d6@oracle.com>
	<20240110091929.GA31003@lst.de>
In-Reply-To: <20240110091929.GA31003@lst.de>

On Wed, Jan 10, 2024 at 10:19:29AM +0100, Christoph Hellwig wrote:
> On Wed, Jan 10, 2024 at 10:04:00AM +1100, Dave Chinner wrote:
> > Hence history teaches us that we should be designing the API around
> > the generic filesystem function required (hard alignment of physical
> > extent allocation), not the specific use case that requires that
> > functionality.
>
> I disagree.  The alignment requirement is an artefact of how you
> implement atomic writes.  As the fs user I care that I can do atomic
> writes on a file and need to query how big the writes can be and
> what alignment is required.
>
> The forcealign feature is a sensible fs side implementation of that
> if using hardware based atomic writes with alignment requirements,
> but it is a really lousy userspace API.
>
> So with John's API proposal for XFS with hardware alignment based atomic
> writes we could still use force align.
>
> Requesting atomic writes for an inode will set the forcealign flag
> and the extent size hint, and after that it'll report atomic write
> capabilities.  Roughly the same implementation, but not an API
> tied to an implementation detail.

Sounds good to me!  So to summarize, userspace programs would have to
do something like this:

	struct statx stx;
	struct fsxattr fsxattr;
	struct iovec iov;	/* set up elsewhere: one aligned 16384-byte buffer */
	int fd = open("/foofile", O_RDWR | O_DIRECT);

	/* read the current inode flags and extent size hint */
	ioctl(fd, FS_IOC_FSGETXATTR, &fsxattr);

	fsxattr.fsx_xflags |= FS_XFLAG_FORCEALIGN | FS_XFLAG_WRITE_ATOMIC;
	fsxattr.fsx_extsize = 16384; /* only for hardware no-tears writes */

	ioctl(fd, FS_IOC_FSSETXATTR, &fsxattr);

	/* ask the kernel what atomic write sizes the file now supports */
	statx(fd, "", AT_EMPTY_PATH, STATX_ALL | STATX_WRITE_ATOMIC, &stx);

	if (stx.stx_atomic_write_unit_max >= 16384) {
		pwritev2(fd, &iov, 1, 0, RWF_SYNC | RWF_ATOMIC);
		printf("HAPPY DANCE\n");
	}

(Assume we bail out on errors.)

--D
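
For the "query how big the writes can be and what alignment is required"
step, a program would probably want to validate each write against the
reported limits before issuing it.  Below is a minimal sketch of such a
check.  Both atomic_write_ok() and is_pow2() are hypothetical helpers, not
part of any kernel or libc API, and the rules they encode (power-of-two
length within the reported unit range, offset naturally aligned to the
length) are an assumption about the atomic write proposal rather than
something spelled out in this message.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if x is a nonzero power of two. */
static inline bool is_pow2(uint64_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Hypothetical helper: decide whether one write of @len bytes at file
 * offset @pos could be issued atomically, given the unit limits that
 * statx() reported for the file.
 *
 * Assumed rules (not taken from this thread):
 *  - unit_max == 0 means the file does not support atomic writes
 *  - @len must be a power of two within [unit_min, unit_max]
 *  - @pos must be naturally aligned, i.e. a multiple of @len
 */
static inline bool atomic_write_ok(uint64_t pos, uint64_t len,
				   uint32_t unit_min, uint32_t unit_max)
{
	if (unit_max == 0 || !is_pow2(len))
		return false;
	if (len < unit_min || len > unit_max)
		return false;
	/* len is a power of two, so masking tests divisibility */
	return (pos & (len - 1)) == 0;
}
```

In the summary above, such a check would sit between the statx() call and
the write, fed with the stx_atomic_write_unit_min and
stx_atomic_write_unit_max fields, so that an unsuitable write is rejected
in userspace instead of failing in the kernel.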