Date: Fri, 20 May 2022 13:27:39 +1000
From: Dave Chinner
To: "Darrick J. Wong"
Cc: Eric Biggers, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
    linux-f2fs-devel@lists.sourceforge.net, linux-xfs@vger.kernel.org,
    linux-api@vger.kernel.org, linux-fscrypt@vger.kernel.org,
    linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, Keith Busch
Subject: Re: [RFC PATCH v2 1/7] statx: add I/O alignment information
Message-ID: <20220520032739.GB1098723@dread.disaster.area>
References: <20220518235011.153058-1-ebiggers@kernel.org>
    <20220518235011.153058-2-ebiggers@kernel.org>
X-Mailing-List: linux-ext4@vger.kernel.org

On Thu, May 19, 2022 at 04:06:05PM -0700, Darrick J. Wong wrote:
> On Wed, May 18, 2022 at 04:50:05PM -0700, Eric Biggers wrote:
> > From: Eric Biggers
> >
> > Traditionally, the conditions for when DIO (direct I/O) is supported
> > were fairly simple: filesystems either supported DIO aligned to the
> > block device's logical block size, or didn't support DIO at all.
> >
> > However, due to filesystem features that have been added over time (e.g,
> > data journalling, inline data, encryption, verity, compression,
> > checkpoint disabling, log-structured mode), the conditions for when DIO
> > is allowed on a file have gotten increasingly complex.
> > Whether a particular file supports DIO, and with what alignment, can
> > depend on various file attributes and filesystem mount options, as well
> > as which block device(s) the file's data is located on.
> >
> > XFS has an ioctl XFS_IOC_DIOINFO which exposes this information to
> > applications.  However, as discussed
> > (https://lore.kernel.org/linux-fsdevel/20220120071215.123274-1-ebiggers@kernel.org/T/#u),
> > this ioctl is rarely used and not known to be used outside of
> > XFS-specific code.  It also was never intended to indicate when a file
> > doesn't support DIO at all, and it only exposes the minimum I/O
> > alignment, not the optimal I/O alignment which has been requested too.
> >
> > Therefore, let's expose this information via statx().  Add the
> > STATX_IOALIGN flag and three fields associated with it:
> >
> > * stx_mem_align_dio: the alignment (in bytes) required for user memory
> >   buffers for DIO, or 0 if DIO is not supported on the file.
> >
> > * stx_offset_align_dio: the alignment (in bytes) required for file
> >   offsets and I/O segment lengths for DIO, or 0 if DIO is not supported
> >   on the file.  This will only be nonzero if stx_mem_align_dio is
> >   nonzero, and vice versa.
> >
> > * stx_offset_align_optimal: the alignment (in bytes) suggested for file
> >   offsets and I/O segment lengths to get optimal performance.  This
> >   applies to both DIO and buffered I/O.  It differs from stx_blocksize
> >   in that stx_offset_align_optimal will contain the real optimum I/O
> >   size, which may be a large value.  In contrast, for compatibility
> >   reasons stx_blocksize is the minimum size needed to avoid page cache
> >   read/write/modify cycles, which may be much smaller than the optimum
> >   I/O size.  For more details about the motivation for this field, see
> >   https://lore.kernel.org/r/20220210040304.GM59729@dread.disaster.area
>
> Hmm.
> So I guess this is supposed to be the filesystem's best guess at
> the IO size that will minimize RMW cycles in the entire stack?  i.e. if
> the user does not want RMW of pagecache pages, of file allocation units
> (if COW is enabled), of RAID stripes, or in the storage itself, then it
> should ensure that all IOs are aligned to this value?
>
> I guess that means for XFS it's effectively max(pagesize, i_blocksize,
> bdev io_opt, sb_width, and (pretend XFS can reflink the realtime volume)
> the rt extent size)?  I didn't see a manpage update for statx(2) but
> that's mostly what I'm interested in. :)

Yup, xfs_stat_blksize() should give a good idea of what we should
do. It will end up being pretty much that, except without the need
for a mount option to turn on the sunit/swidth return, and always
taking into consideration extent size hints rather than just doing
that for RT inodes...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
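
For illustration only, here is a rough userspace sketch of the two ideas
discussed above: taking the max() over the candidate sizes Darrick lists,
and checking a DIO request against the proposed alignment fields. None of
this is the kernel implementation. struct ioalign_info, max_align() and
dio_request_ok() are hypothetical names, and the struct merely mirrors the
three fields proposed in the RFC (stx_mem_align_dio, stx_offset_align_dio,
stx_offset_align_optimal); it is not a real uapi structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the three fields proposed in the RFC. */
struct ioalign_info {
	uint32_t mem_align_dio;        /* stx_mem_align_dio, 0 = no DIO */
	uint32_t offset_align_dio;     /* stx_offset_align_dio, 0 = no DIO */
	uint32_t offset_align_optimal; /* stx_offset_align_optimal */
};

/*
 * Largest of n candidate sizes in bytes, e.g. page size, fs block size,
 * bdev io_opt, stripe width, extent size hint.  A zero entry means "not
 * applicable" and only wins if every entry is zero.
 */
static uint32_t max_align(const uint32_t *cand, size_t n)
{
	uint32_t best = 0;

	assert(cand != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (cand[i] > best)
			best = cand[i];
	return best;
}

/*
 * True if a DIO request satisfies the advertised alignments.  Per the
 * patch description, both DIO fields are zero when DIO is unsupported.
 */
static bool dio_request_ok(const struct ioalign_info *info,
			   uintptr_t buf, uint64_t offset, uint64_t len)
{
	if (info->mem_align_dio == 0 || info->offset_align_dio == 0)
		return false;
	return buf % info->mem_align_dio == 0 &&
	       offset % info->offset_align_dio == 0 &&
	       len % info->offset_align_dio == 0;
}
```

An application would presumably fill such a structure from statx() once
the fields exist, then size and align its buffers from it instead of
probing with trial O_DIRECT I/O. The plain max() here follows the wording
in the thread; it does not try to handle candidates that are not multiples
of each other.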