2010-04-27 06:21:05

by Tao Ma

[permalink] [raw]
Subject: [PATCH] XFS: Let the broken fiemap work in query mode.

According to Documentation/filesystems/fiemap.txt, If fm_extent_count
is zero, then the fm_extents[] array is ignored (no extents will be
returned), and the fm_mapped_extents count will hold the number of
extents needed.

But as the commit 97db39a1f6f69e906e98118392400de5217aa33a has changed
bmv_count to the caller's input buffer, this number query function can't
work any more. As this commit is written to change bmv_count from
MAXEXTNUM because of ENOMEM, we can't find a really suitable number to
set bmv_count now in xfs_vn_fiemap. Since we really have no idea of how
much extents the file has, a big number may cause ENOMEM, while a small
one will mask the real extent no.

So this patch try to resolve this problem by adding a temporary getbmapx
in xfs_getbmap. If the caller didn't give bmv_count, we don't allocate
the "out" either. Instead, every time we want to use 'out', use '&tmp'
instead.

I know this solution is a bit ugly, but I can't find a way to resolve
this issue while not changing the codes too much. So any good suggestion
is welcomed.

Cc: Eric Sandeen <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Alex Elder <[email protected]>
Signed-off-by: Tao Ma <[email protected]>
---
fs/xfs/xfs_bmap.c | 47 +++++++++++++++++++++++++++++------------------
1 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/fs/xfs/xfs_bmap.c b/fs/xfs/xfs_bmap.c
index 98251cd..654d9cf 100644
--- a/fs/xfs/xfs_bmap.c
+++ b/fs/xfs/xfs_bmap.c
@@ -5557,13 +5557,14 @@ xfs_getbmap(
int nexleft; /* # of user extents left */
int subnex; /* # of bmapi's can do */
int nmap; /* number of map entries */
- struct getbmapx *out; /* output structure */
+ struct getbmapx *out = NULL; /* output structure */
int whichfork; /* data or attr fork */
int prealloced; /* this is a file with
* preallocated data space */
int iflags; /* interface flags */
int bmapi_flags; /* flags for xfs_bmapi */
int cur_ext = 0;
+ struct getbmapx tmp, *bmap;

mp = ip->i_mount;
iflags = bmv->bmv_iflags;
@@ -5635,16 +5636,20 @@ xfs_getbmap(
}

nex = bmv->bmv_count - 1;
- if (nex <= 0)
+ if (nex < 0)
return XFS_ERROR(EINVAL);
bmvend = bmv->bmv_offset + bmv->bmv_length;


if (bmv->bmv_count > ULONG_MAX / sizeof(struct getbmapx))
return XFS_ERROR(ENOMEM);
- out = kmem_zalloc(bmv->bmv_count * sizeof(struct getbmapx), KM_MAYFAIL);
- if (!out)
- return XFS_ERROR(ENOMEM);
+ if (nex) {
+ out = kmem_zalloc(bmv->bmv_count * sizeof(struct getbmapx),
+ KM_MAYFAIL);
+ if (!out)
+ return XFS_ERROR(ENOMEM);
+ } else
+ nex = MAXEXTNUM;

xfs_ilock(ip, XFS_IOLOCK_SHARED);
if (whichfork == XFS_DATA_FORK && !(iflags & BMV_IF_DELALLOC)) {
@@ -5700,35 +5705,37 @@ xfs_getbmap(
ASSERT(nmap <= subnex);

for (i = 0; i < nmap && nexleft && bmv->bmv_length; i++) {
- out[cur_ext].bmv_oflags = 0;
+ if (out)
+ bmap = &out[cur_ext];
+ else
+ bmap = &tmp;
+ bmap->bmv_oflags = 0;
if (map[i].br_state == XFS_EXT_UNWRITTEN)
- out[cur_ext].bmv_oflags |= BMV_OF_PREALLOC;
+ bmap->bmv_oflags |= BMV_OF_PREALLOC;
else if (map[i].br_startblock == DELAYSTARTBLOCK)
- out[cur_ext].bmv_oflags |= BMV_OF_DELALLOC;
- out[cur_ext].bmv_offset =
+ bmap->bmv_oflags |= BMV_OF_DELALLOC;
+ bmap->bmv_offset =
XFS_FSB_TO_BB(mp, map[i].br_startoff);
- out[cur_ext].bmv_length =
+ bmap->bmv_length =
XFS_FSB_TO_BB(mp, map[i].br_blockcount);
- out[cur_ext].bmv_unused1 = 0;
- out[cur_ext].bmv_unused2 = 0;
+ bmap->bmv_unused1 = 0;
+ bmap->bmv_unused2 = 0;
ASSERT(((iflags & BMV_IF_DELALLOC) != 0) ||
(map[i].br_startblock != DELAYSTARTBLOCK));
if (map[i].br_startblock == HOLESTARTBLOCK &&
whichfork == XFS_ATTR_FORK) {
/* came to the end of attribute fork */
- out[cur_ext].bmv_oflags |= BMV_OF_LAST;
+ bmap->bmv_oflags |= BMV_OF_LAST;
goto out_free_map;
}

- if (!xfs_getbmapx_fix_eof_hole(ip, &out[cur_ext],
+ if (!xfs_getbmapx_fix_eof_hole(ip, bmap,
prealloced, bmvend,
map[i].br_startblock))
goto out_free_map;

nexleft--;
- bmv->bmv_offset =
- out[cur_ext].bmv_offset +
- out[cur_ext].bmv_length;
+ bmv->bmv_offset = bmap->bmv_offset + bmap->bmv_length;
bmv->bmv_length =
max_t(__int64_t, 0, bmvend - bmv->bmv_offset);
bmv->bmv_entries++;
@@ -5746,8 +5753,12 @@ xfs_getbmap(
for (i = 0; i < cur_ext; i++) {
int full = 0; /* user array is full */

+ if (out)
+ bmap = &out[i];
+ else
+ bmap = &tmp;
/* format results & advance arg */
- error = formatter(&arg, &out[i], &full);
+ error = formatter(&arg, bmap, &full);
if (error || full)
break;
}
--
1.6.3.3.334.g916e1.dirty


2010-04-28 01:51:18

by Dave Chinner

[permalink] [raw]
Subject: Re: [PATCH] XFS: Let the broken fiemap work in query mode.

On Tue, Apr 27, 2010 at 02:17:45PM +0800, Tao Ma wrote:
> According to Documentation/filesystems/fiemap.txt, If fm_extent_count
> is zero, then the fm_extents[] array is ignored (no extents will be
> returned), and the fm_mapped_extents count will hold the number of
> extents needed.
>
> But as the commit 97db39a1f6f69e906e98118392400de5217aa33a has changed
> bmv_count to the caller's input buffer, this number query function can't
> work any more. As this commit is written to change bmv_count from
> MAXEXTNUM because of ENOMEM, we can't find a really suitable number to
> set bmv_count now in xfs_vn_fiemap. Since we really have no idea of how
> much extents the file has, a big number may cause ENOMEM, while a small
> one will mask the real extent no.
>
> So this patch try to resolve this problem by adding a temporary getbmapx
> in xfs_getbmap. If the caller didn't give bmv_count, we don't allocate
> the "out" either. Instead, every time we want to use 'out', use '&tmp'
> instead.
>
> I know this solution is a bit ugly, but I can't find a way to resolve
> this issue while not changing the codes too much. So any good suggestion
> is welcomed.

I don't see a need to change xfs_getbmap() to fix this. We can limit
the maximum allocation size to something realistic just by setting
bm.bmv.count to something sane. e.g, in xfs_vn_fiemap:

- bm.bmv_count = fieinfo->fi_extents_max + 1;
+ bm.bmv.count = !fieinfo->fi_extents_max ? MAXEXTNUM :
+ fieinfo->fi_extents_max - 1;
+ bm.bmv_count = MIN(bm.bmv_count,
(PAGE_SIZE * 16 / sizeof(struct getbmapx)));

Unless I'm missing something, that should also prevent the case of
an application providing a really large fi_extents_max from
triggering ENOMEM in most cases as well.

FWIW, how did you find this? Is it possible for you to add a test
for this regression into xfstests so that we don't break it again
in future?

Cheers,

Dave.
--
Dave Chinner
[email protected]

2010-04-28 02:02:05

by Tao Ma

[permalink] [raw]
Subject: Re: [PATCH] XFS: Let the broken fiemap work in query mode.

Hi Dave,

Dave Chinner wrote:
> On Tue, Apr 27, 2010 at 02:17:45PM +0800, Tao Ma wrote:
>> According to Documentation/filesystems/fiemap.txt, If fm_extent_count
>> is zero, then the fm_extents[] array is ignored (no extents will be
>> returned), and the fm_mapped_extents count will hold the number of
>> extents needed.
>>
>> But as the commit 97db39a1f6f69e906e98118392400de5217aa33a has changed
>> bmv_count to the caller's input buffer, this number query function can't
>> work any more. As this commit is written to change bmv_count from
>> MAXEXTNUM because of ENOMEM, we can't find a really suitable number to
>> set bmv_count now in xfs_vn_fiemap. Since we really have no idea of how
>> much extents the file has, a big number may cause ENOMEM, while a small
>> one will mask the real extent no.
>>
>> So this patch try to resolve this problem by adding a temporary getbmapx
>> in xfs_getbmap. If the caller didn't give bmv_count, we don't allocate
>> the "out" either. Instead, every time we want to use 'out', use '&tmp'
>> instead.
>>
>> I know this solution is a bit ugly, but I can't find a way to resolve
>> this issue while not changing the codes too much. So any good suggestion
>> is welcomed.
>
> I don't see a need to change xfs_getbmap() to fix this. We can limit
> the maximum allocation size to something realistic just by setting
> bm.bmv.count to something sane. e.g, in xfs_vn_fiemap:
>
> - bm.bmv_count = fieinfo->fi_extents_max + 1;
> + bm.bmv.count = !fieinfo->fi_extents_max ? MAXEXTNUM :
> + fieinfo->fi_extents_max - 1;
> + bm.bmv_count = MIN(bm.bmv_count,
> (PAGE_SIZE * 16 / sizeof(struct getbmapx)));
>
> Unless I'm missing something, that should also prevent the case of
> an application providing a really large fi_extents_max from
> triggering ENOMEM in most cases as well.
I just worry about one thing: What if the real extent number is larger
than the PAGE_SIZE * 16 / sizeof(struct getbmapx)? In this case, we will
give up the wrong extent number to the user space.
>
> FWIW, how did you find this? Is it possible for you to add a test
> for this regression into xfstests so that we don't break it again
> in future?
Sure, I will check and see whether I can add it in xfstests.

Regards,
Tao

2010-04-28 02:31:39

by Dave Chinner

[permalink] [raw]
Subject: Re: [PATCH] XFS: Let the broken fiemap work in query mode.

On Wed, Apr 28, 2010 at 10:00:01AM +0800, Tao Ma wrote:
> Hi Dave,
>
> Dave Chinner wrote:
> >On Tue, Apr 27, 2010 at 02:17:45PM +0800, Tao Ma wrote:
> >>According to Documentation/filesystems/fiemap.txt, If fm_extent_count
> >>is zero, then the fm_extents[] array is ignored (no extents will be
> >>returned), and the fm_mapped_extents count will hold the number of
> >>extents needed.
> >>
> >>But as the commit 97db39a1f6f69e906e98118392400de5217aa33a has changed
> >>bmv_count to the caller's input buffer, this number query function can't
> >>work any more. As this commit is written to change bmv_count from
> >>MAXEXTNUM because of ENOMEM, we can't find a really suitable number to
> >>set bmv_count now in xfs_vn_fiemap. Since we really have no idea of how
> >>much extents the file has, a big number may cause ENOMEM, while a small
> >>one will mask the real extent no.
> >>
> >>So this patch try to resolve this problem by adding a temporary getbmapx
> >>in xfs_getbmap. If the caller didn't give bmv_count, we don't allocate
> >>the "out" either. Instead, every time we want to use 'out', use '&tmp'
> >>instead.
> >>
> >>I know this solution is a bit ugly, but I can't find a way to resolve
> >>this issue while not changing the codes too much. So any good suggestion
> >>is welcomed.
> >
> >I don't see a need to change xfs_getbmap() to fix this. We can limit
> >the maximum allocation size to something realistic just by setting
> >bm.bmv.count to something sane. e.g, in xfs_vn_fiemap:
> >
> >- bm.bmv_count = fieinfo->fi_extents_max + 1;
> >+ bm.bmv.count = !fieinfo->fi_extents_max ? MAXEXTNUM :
> >+ fieinfo->fi_extents_max - 1;
> >+ bm.bmv_count = MIN(bm.bmv_count,
> > (PAGE_SIZE * 16 / sizeof(struct getbmapx)));
> >
> >Unless I'm missing something, that should also prevent the case of
> >an application providing a really large fi_extents_max from
> >triggering ENOMEM in most cases as well.
> I just worry about one thing: What if the real extent number is
> larger than the PAGE_SIZE * 16 / sizeof(struct getbmapx)? In this
> case, we will give up the wrong extent number to the user space.

Applications need to handle mappings changing from query to getting
the mapping, so this should not be a major issue. Especially as the
method of fiemap indicating there are more extents to be extracted
from the inode in the case the kernel can't allocate a buffer large
enough is already documented.

Realistically though, xfs_getbmap() needs a complete rewrite so
right now I'd prefer just to do the minimum to fix the reported
problem that continue to make it into even more of a mess than it is
now...

Cheers,

Dave.
--
Dave Chinner
[email protected]

2010-04-28 03:01:38

by Tao Ma

[permalink] [raw]
Subject: [PATCH v2] XFS: Let the broken fiemap work in query mode.

Dave Chinner wrote:
> On Wed, Apr 28, 2010 at 10:00:01AM +0800, Tao Ma wrote:
>> Hi Dave,
>>
>> Dave Chinner wrote:
>>> On Tue, Apr 27, 2010 at 02:17:45PM +0800, Tao Ma wrote:
>>>> According to Documentation/filesystems/fiemap.txt, If fm_extent_count
>>>> is zero, then the fm_extents[] array is ignored (no extents will be
>>>> returned), and the fm_mapped_extents count will hold the number of
>>>> extents needed.
>>>>
>>>> But as the commit 97db39a1f6f69e906e98118392400de5217aa33a has changed
>>>> bmv_count to the caller's input buffer, this number query function can't
>>>> work any more. As this commit is written to change bmv_count from
>>>> MAXEXTNUM because of ENOMEM, we can't find a really suitable number to
>>>> set bmv_count now in xfs_vn_fiemap. Since we really have no idea of how
>>>> much extents the file has, a big number may cause ENOMEM, while a small
>>>> one will mask the real extent no.
>>>>
>>>> So this patch try to resolve this problem by adding a temporary getbmapx
>>>> in xfs_getbmap. If the caller didn't give bmv_count, we don't allocate
>>>> the "out" either. Instead, every time we want to use 'out', use '&tmp'
>>>> instead.
>>>>
>>>> I know this solution is a bit ugly, but I can't find a way to resolve
>>>> this issue while not changing the codes too much. So any good suggestion
>>>> is welcomed.
>>> I don't see a need to change xfs_getbmap() to fix this. We can limit
>>> the maximum allocation size to something realistic just by setting
>>> bm.bmv.count to something sane. e.g, in xfs_vn_fiemap:
>>>
>>> - bm.bmv_count = fieinfo->fi_extents_max + 1;
>>> + bm.bmv.count = !fieinfo->fi_extents_max ? MAXEXTNUM :
>>> + fieinfo->fi_extents_max - 1;
>>> + bm.bmv_count = MIN(bm.bmv_count,
>>> (PAGE_SIZE * 16 / sizeof(struct getbmapx)));
>>>
>>> Unless I'm missing something, that should also prevent the case of
>>> an application providing a really large fi_extents_max from
>>> triggering ENOMEM in most cases as well.
>> I just worry about one thing: What if the real extent number is
>> larger than the PAGE_SIZE * 16 / sizeof(struct getbmapx)? In this
>> case, we will give up the wrong extent number to the user space.
>
> Applications need to handle mappings changing from query to getting
> the mapping, so this should not be a major issue. Especially as the
> method of fiemap indicating there are more extents to be extracted
> from the inode in the case the kernel can't allocate a buffer large
> enough is already documented.
>
> Realistically though, xfs_getbmap() needs a complete rewrite so
> right now I'd prefer just to do the minimum to fix the reported
> problem that continue to make it into even more of a mess than it is
> now...
Fair enough. Here is the updated patch.

btw, I am working on adding the test cases in xfstests.

Regards,
Tao

>From e5d32636c907be106d55d63c253d1750a4a898d7 Mon Sep 17 00:00:00 2001
From: Tao Ma <[email protected]>
Date: Wed, 28 Apr 2010 10:25:33 +0800
Subject: [PATCH v2] XFS: Let the broken fiemap work in query mode.

According to Documentation/filesystems/fiemap.txt, If fm_extent_count
is zero, then the fm_extents[] array is ignored (no extents will be
returned), and the fm_mapped_extents count will hold the number of
extents needed.

But as the commit 97db39a1f6f69e906e98118392400de5217aa33a has changed
bmv_count to the caller's input buffer, this number query function can't
work any more. As this commit is written to change bmv_count from
MAXEXTNUM because of ENOMEM.

This patch just try to set bm.bmv.count to something sane.
Thanks to Dave Chinner <[email protected]> for the suggestion.

Cc: Dave Chinner <[email protected]>
Cc: Eric Sandeen <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Alex Elder <[email protected]>
Signed-off-by: Tao Ma <[email protected]>
---
fs/xfs/linux-2.6/xfs_iops.c | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c
index 2259460..24ccad9 100644
--- a/fs/xfs/linux-2.6/xfs_iops.c
+++ b/fs/xfs/linux-2.6/xfs_iops.c
@@ -662,7 +662,10 @@ xfs_vn_fiemap(
bm.bmv_length = BTOBB(length);

/* We add one because in getbmap world count includes the header */
- bm.bmv_count = fieinfo->fi_extents_max + 1;
+ bm.bmv_count = !fieinfo->fi_extents_max ? MAXEXTNUM :
+ fieinfo->fi_extents_max + 1;
+ bm.bmv_count = MIN(bm.bmv_count,
+ (__s32)(PAGE_SIZE * 16 / sizeof(struct getbmapx)));
bm.bmv_iflags = BMV_IF_PREALLOC;
if (fieinfo->fi_flags & FIEMAP_FLAG_XATTR)
bm.bmv_iflags |= BMV_IF_ATTRFORK;
--
1.6.3.3.334.g916e1.dirty

2010-04-28 14:41:39

by Tao Ma

[permalink] [raw]
Subject: [PATCH v3] XFS: Let the broken fiemap work in query mode.

> On Wed, Apr 28, 2010 at 11:00:25AM +0800, Tao Ma wrote:
>> - bm.bmv_count = fieinfo->fi_extents_max + 1;
>> + bm.bmv_count = !fieinfo->fi_extents_max ? MAXEXTNUM :
>> + fieinfo->fi_extents_max + 1;
>> + bm.bmv_count = MIN(bm.bmv_count,
>> + (__s32)(PAGE_SIZE * 16 / sizeof(struct getbmapx)));
>
> I would use min_t here instead of the case, but otherwise the patch
> looks good to me
>
>
> Reviewed-by: Christoph Hellwig <[email protected]>

Ok, here comes the v3. Thanks for the review.

Regards,
Tao

>From c735bccfbb925841545dcb1853fa8ffb3f3c8785 Mon Sep 17 00:00:00 2001
From: Tao Ma <[email protected]>
Date: Wed, 28 Apr 2010 21:57:32 +0800
Subject: [PATCH v3] XFS: Let the broken fiemap work in query mode.

According to Documentation/filesystems/fiemap.txt, If fm_extent_count
is zero, then the fm_extents[] array is ignored (no extents will be
returned), and the fm_mapped_extents count will hold the number of
extents needed.

But as the commit 97db39a1f6f69e906e98118392400de5217aa33a has changed
bmv_count to the caller's input buffer, this number query function can't
work any more. As this commit is written to change bmv_count from
MAXEXTNUM because of ENOMEM.

This patch just try to set bm.bmv.count to something sane.
Thanks to Dave Chinner <[email protected]> for the suggestion.

Cc: Eric Sandeen <[email protected]>
Cc: Alex Elder <[email protected]>
Reviewed-by: Christoph Hellwig <[email protected]>
Signed-off-by: Tao Ma <[email protected]>
---
fs/xfs/linux-2.6/xfs_iops.c | 5 ++++-
1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c
index 2259460..68531fc 100644
--- a/fs/xfs/linux-2.6/xfs_iops.c
+++ b/fs/xfs/linux-2.6/xfs_iops.c
@@ -662,7 +662,10 @@ xfs_vn_fiemap(
bm.bmv_length = BTOBB(length);

/* We add one because in getbmap world count includes the header */
- bm.bmv_count = fieinfo->fi_extents_max + 1;
+ bm.bmv_count = !fieinfo->fi_extents_max ? MAXEXTNUM :
+ fieinfo->fi_extents_max + 1;
+ bm.bmv_count = min_t(__s32, bm.bmv_count,
+ (PAGE_SIZE * 16 / sizeof(struct getbmapx)));
bm.bmv_iflags = BMV_IF_PREALLOC;
if (fieinfo->fi_flags & FIEMAP_FLAG_XATTR)
bm.bmv_iflags |= BMV_IF_ATTRFORK;
--
1.6.3.3.334.g916e1.dirty

2010-04-28 11:34:04

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v2] XFS: Let the broken fiemap work in query mode.

On Wed, Apr 28, 2010 at 11:00:25AM +0800, Tao Ma wrote:
> - bm.bmv_count = fieinfo->fi_extents_max + 1;
> + bm.bmv_count = !fieinfo->fi_extents_max ? MAXEXTNUM :
> + fieinfo->fi_extents_max + 1;
> + bm.bmv_count = MIN(bm.bmv_count,
> + (__s32)(PAGE_SIZE * 16 / sizeof(struct getbmapx)));

I would use min_t here instead of the case, but otherwise the patch
looks good to me,


Reviewed-by: Christoph Hellwig <[email protected]>