2020-10-28 22:12:59

by Christoph Hellwig

Subject: Re: [PATCH 1/9] common: extract rt extent size for _get_file_block_size

On Tue, Oct 27, 2020 at 12:01:35PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <[email protected]>
>
> _get_file_block_size is intended to return the size (in bytes) of the
> fundamental allocation unit for a file. This is required for remapping
> operations like fallocate and reflink, which can only operate on
> allocation units. Since the XFS realtime volume can be configured for
> allocation units larger than 1 fs block, we need to factor that in here.

Should this also cover the ext4 bigalloc clusters? Or do they not
matter for fallocate?

> Signed-off-by: Darrick J. Wong <[email protected]>
> ---
> common/rc | 13 ++++++++++---
> common/xfs | 20 ++++++++++++++++++++
> 2 files changed, 30 insertions(+), 3 deletions(-)
>
>
> diff --git a/common/rc b/common/rc
> index 27a27ea3..41f93047 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -3974,11 +3974,18 @@ _get_file_block_size()
> echo "Missing mount point argument for _get_file_block_size"
> exit 1
> fi
> - if [ "$FSTYP" = "ocfs2" ]; then
> +
> + case "$FSTYP" in
> + "ocfs2")
> stat -c '%o' $1
> - else
> + ;;
> + "xfs")
> + _xfs_get_file_block_size $1
> + ;;
> + *)
> _get_block_size $1
> - fi
> + ;;
> + esac
> }
>
> # Get the minimum block size of an fs.
> diff --git a/common/xfs b/common/xfs
> index 79dab058..3f5c14ba 100644
> --- a/common/xfs
> +++ b/common/xfs
> @@ -174,6 +174,26 @@ _scratch_mkfs_xfs()
> return $mkfs_status
> }
>
> +# Get the size of an allocation unit of a file. Normally this is just the
> +# block size of the file, but for realtime files, this is the realtime extent
> +# size.
> +_xfs_get_file_block_size()
> +{
> + local path="$1"
> +
> + if ! ($XFS_IO_PROG -c "stat -v" "$path" 2>&1 | egrep -q '(rt-inherit|realtime)'); then
> + _get_block_size "$path"
> + return
> + fi
> +
> + # Otherwise, call xfs_info until we find a mount point or the root.
> + path="$(readlink -m "$path")"
> + while ! $XFS_INFO_PROG "$path" &>/dev/null && [ "$path" != "/" ]; do
> + path="$(dirname "$path")"
> + done
> + $XFS_INFO_PROG "$path" | grep realtime | sed -e 's/^.*extsz=\([0-9]*\).*$/\1/g'
> +}
> +
> # xfs_check script is planned to be deprecated. But, we want to
> # be able to invoke "xfs_check" behavior in xfstests in order to
> # maintain the current verification levels.
>
---end quoted text---


2020-10-28 22:25:46

by Darrick J. Wong

Subject: Re: [PATCH 1/9] common: extract rt extent size for _get_file_block_size

On Wed, Oct 28, 2020 at 07:41:19AM +0000, Christoph Hellwig wrote:
> On Tue, Oct 27, 2020 at 12:01:35PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <[email protected]>
> >
> > _get_file_block_size is intended to return the size (in bytes) of the
> > fundamental allocation unit for a file. This is required for remapping
> > operations like fallocate and reflink, which can only operate on
> > allocation units. Since the XFS realtime volume can be configured for
> > allocation units larger than 1 fs block, we need to factor that in here.
>
> Should this also cover the ext4 bigalloc clusters? Or do they not
> matter for fallocate?

They don't matter for fallocate, because ext4 doesn't require clusters
to be fully allocated like ocfs2 and xfs do.

This means that all the bigalloc codepaths have this horrible "implied
cluster allocation" thing sprinkled everywhere: to map in a single
block, you have to scan left and right in the extent map to see if anyone
already mapped something. And even more strangely, extent tree blocks
don't do this, so it seems to waste the entire cluster past the first fs
block.

But I guess it /does/ mean that _get_file_block_size doesn't have to do
anything special for ext*.

--D

> > Signed-off-by: Darrick J. Wong <[email protected]>
> > ---
> > common/rc | 13 ++++++++++---
> > common/xfs | 20 ++++++++++++++++++++
> > 2 files changed, 30 insertions(+), 3 deletions(-)
> >
> >
> > diff --git a/common/rc b/common/rc
> > index 27a27ea3..41f93047 100644
> > --- a/common/rc
> > +++ b/common/rc
> > @@ -3974,11 +3974,18 @@ _get_file_block_size()
> > echo "Missing mount point argument for _get_file_block_size"
> > exit 1
> > fi
> > - if [ "$FSTYP" = "ocfs2" ]; then
> > +
> > + case "$FSTYP" in
> > + "ocfs2")
> > stat -c '%o' $1
> > - else
> > + ;;
> > + "xfs")
> > + _xfs_get_file_block_size $1
> > + ;;
> > + *)
> > _get_block_size $1
> > - fi
> > + ;;
> > + esac
> > }
> >
> > # Get the minimum block size of an fs.
> > diff --git a/common/xfs b/common/xfs
> > index 79dab058..3f5c14ba 100644
> > --- a/common/xfs
> > +++ b/common/xfs
> > @@ -174,6 +174,26 @@ _scratch_mkfs_xfs()
> > return $mkfs_status
> > }
> >
> > +# Get the size of an allocation unit of a file. Normally this is just the
> > +# block size of the file, but for realtime files, this is the realtime extent
> > +# size.
> > +_xfs_get_file_block_size()
> > +{
> > + local path="$1"
> > +
> > + if ! ($XFS_IO_PROG -c "stat -v" "$path" 2>&1 | egrep -q '(rt-inherit|realtime)'); then
> > + _get_block_size "$path"
> > + return
> > + fi
> > +
> > + # Otherwise, call xfs_info until we find a mount point or the root.
> > + path="$(readlink -m "$path")"
> > + while ! $XFS_INFO_PROG "$path" &>/dev/null && [ "$path" != "/" ]; do
> > + path="$(dirname "$path")"
> > + done
> > + $XFS_INFO_PROG "$path" | grep realtime | sed -e 's/^.*extsz=\([0-9]*\).*$/\1/g'
> > +}
> > +
> > # xfs_check script is planned to be deprecated. But, we want to
> > # be able to invoke "xfs_check" behavior in xfstests in order to
> > # maintain the current verification levels.
> >
> ---end quoted text---
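
For readers following the patch, the last line of _xfs_get_file_block_size
can be tried in isolation. The sketch below is only an illustration: the
helper name parse_rt_extsz and the sample text are made up for this note
(the sample is modelled on xfs_info output, not captured from a real
filesystem), and the rounding step at the end is an assumption about how a
test might consume the value, not something the patch itself does.

```shell
#!/bin/bash
# Hypothetical helper (not part of the patch): read xfs_info-style text on
# stdin and print the number that follows "extsz=" on the realtime line.
parse_rt_extsz()
{
	grep realtime | sed -e 's/^.*extsz=\([0-9]*\).*$/\1/'
}

# Illustrative sample only; real xfs_info output has more fields and lines.
sample_info='meta-data=/dev/sda1 isize=512 agcount=4, agsize=65536 blks
data     =           bsize=4096 blocks=262144, imaxpct=25
realtime =/dev/sdb1  extsz=65536 blocks=131072, rtextents=8192'

rt_extsz="$(printf '%s\n' "$sample_info" | parse_rt_extsz)"
echo "allocation unit: $rt_extsz bytes"

# Assumed usage: round a byte count down to a whole number of allocation
# units before handing it to a remapping operation.
len=200000
aligned_len=$(( len / rt_extsz * rt_extsz ))
echo "aligned length: $aligned_len bytes"
```

The sed expression is the same one the patch uses, minus the redundant g
flag: since the pattern is anchored at both ends, it can match at most once
per line.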