2011-04-02 13:29:59

by Pádraig Brady

Subject: Re: bug#8411: due to missing sync even on 2.6.39, cp fails to copy an odd file

On 02/04/11 12:16, Jim Meyering wrote:
> Hi Pádraig,
>
> As of this change,
>
> copy: with fiemap copy, only sync when needed
> http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=f69876e55
>
> fiemap copy with extents beyond EOF can fail on ext4 even with
> Fedora 15 (2.6.38) and rawhide's 2.6.39 kernel.
>
> Here we construct an odd file. First, preallocate 10MB of space,
> and then write 5KiB of random data into the beginning of that:
>
> $ fallocate -l 10000000 -n k
> $ dd count=10 if=/dev/urandom conv=notrunc iflag=fullblock of=k
>
> However, when we try to copy "k", we get a file, "k2" of the
> expected size, but with only NUL bytes for contents:

So the extent info is not updated until sync(),
which means cp will consider the "unwritten" extent
as NUL data and not bother to read it :(

I guess this is a corner case that was missed
in the fixups for ext4 (and btrfs?) in 2.6.38 for this?
I've copied ext4 devs for clarification.
I.e., if you do this:

fallocate -l 10000000 -n k
dd count=10 if=/dev/urandom conv=notrunc iflag=fullblock of=k
filefrag -v k

Do you get all extents still unwritten?
I do on my 2.6.35.11-83.fc14.i686 kernel, but I expected that.

cheers,
Pádraig.


2011-04-02 13:50:22

by Jim Meyering

Subject: Re: bug#8411: due to missing sync even on 2.6.39, cp fails to copy an odd file

Pádraig Brady wrote:
> On 02/04/11 12:16, Jim Meyering wrote:
>> Hi Pádraig,
>> As of this change,
>>
>> copy: with fiemap copy, only sync when needed
>> http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=f69876e55
>>
>> fiemap copy with extents beyond EOF can fail on ext4 even with
>> Fedora 15 (2.6.38) and rawhide's 2.6.39 kernel.
>>
>> Here we construct an odd file. First, preallocate 10MB of space,
>> and then write 5KiB of random data into the beginning of that:
>>
>> $ fallocate -l 10000000 -n k
>> $ dd count=10 if=/dev/urandom conv=notrunc iflag=fullblock of=k
>>
>> However, when we try to copy "k", we get a file, "k2" of the
>> expected size, but with only NUL bytes for contents:
>
> So the extent info is not updated until sync(),
> which means cp will consider the "unwritten" extent
> as NUL data and not bother to read it :(
>
> I guess this is a corner case that was missed
> in the fixups for ext4 (and btrfs?) in 2.6.38 for this?
> I've copied ext4 devs for clarification.
> I.E. if you do this:
>
> fallocate -l 10000000 -n k
> dd count=10 if=/dev/urandom conv=notrunc iflag=fullblock of=k
> filefrag -v k
>
> Do you get all extents still unwritten?
> I do on my 2.6.35.11-83.fc14.i686 kernel, but I expected that.

Right. The new extent remains unwritten until after a sync
with ext4 on both Fedora 15 and rawhide kernels.

Here's F15:

$ fallocate -l 10000000 -n k
$ dd count=10 if=/dev/urandom conv=notrunc iflag=fullblock of=k
...
$ filefrag -v k
Filesystem type is: ef53
File size of k is 5120 (2 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       0  6498304            2442 unwritten,eof
k: 1 extent found
$ sync
$ filefrag -v k
Filesystem type is: ef53
File size of k is 5120 (2 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       0  6498304               2 eof
   1       2  6498306            2440 unwritten,eof
k: 1 extent found
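
A minimal sketch of why the pre-sync extent map yields an all-NUL copy. It is not the coreutils code: the `struct extent` layout, the `EX_*` names and `bytes_copier_would_read` are hypothetical, the flag values mirror FIEMAP_EXTENT_LAST and FIEMAP_EXTENT_UNWRITTEN from <linux/fiemap.h>, and it assumes filefrag's "eof" corresponds to FIEMAP_EXTENT_LAST and that a copier clamps reads to the file size.

```c
#include <stddef.h>
#include <stdint.h>

/* Values mirror FIEMAP_EXTENT_LAST and FIEMAP_EXTENT_UNWRITTEN; redefined
   here only so the sketch builds without kernel headers.  */
#define EX_LAST      0x0001u
#define EX_UNWRITTEN 0x0800u

/* Hypothetical, simplified extent record (units: file system blocks).  */
struct extent
{
  uint64_t logical_blk;
  uint64_t length_blk;
  uint32_t flags;
};

/* Count the bytes an extent-based copier would actually read if it trusts
   the extent map and treats every "unwritten" extent as NUL data.  */
uint64_t
bytes_copier_would_read (const struct extent *ext, size_t n,
                         uint64_t block_size, uint64_t file_size)
{
  uint64_t total = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (ext[i].flags & EX_UNWRITTEN)
        continue;               /* assumed to be all NULs: never read */

      uint64_t start = ext[i].logical_blk * block_size;
      uint64_t end = start + ext[i].length_blk * block_size;
      if (start >= file_size)
        continue;               /* entirely beyond EOF */
      if (end > file_size)
        end = file_size;        /* clamp the last extent to EOF */
      total += end - start;
    }
  return total;
}
```

Fed the F15 extents above, the single pre-sync extent (0, 2442, unwritten|eof) gives 0 bytes read, while the post-sync pair (0, 2, eof) and (2, 2440, unwritten|eof) gives 5120 bytes, which is the whole file.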

Same problem with rawhide, but slightly different layout:

$ fallocate -l 10000000 -n k
$ dd count=10 if=/dev/urandom conv=notrunc iflag=fullblock of=k
...
$ filefrag -v k
Filesystem type is: ef53
File size of k is 5120 (2 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       0  2054144            2048 unwritten,eof
   1    2048  2060288  2056191    394 unwritten,eof
k: 2 extents found
$ filefrag -v k
[Exit 1]
$ sync
$ filefrag -v k
Filesystem type is: ef53
File size of k is 5120 (2 blocks, blocksize 4096)
 ext logical physical expected length flags
   0       0  2054144               2 eof
   1       2  2054146            2046 unwritten,eof
   2    2048  2060288  2056191    394 unwritten,eof
k: 2 extents found
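
The block counts in the listings above can be checked with a little arithmetic. This is a sketch: `blocks_for_bytes` is a hypothetical helper, and the 5120-byte file size assumes dd's default 512-byte block size (count=10).

```c
#include <stdint.h>

/* Number of blocks needed to hold BYTES bytes, rounding up.
   BLOCK_SIZE must be nonzero.  */
uint64_t
blocks_for_bytes (uint64_t bytes, uint64_t block_size)
{
  return (bytes + block_size - 1) / block_size;
}
```

With a 4096-byte block size, the 10000000-byte preallocation needs 2442 blocks and the 5120 bytes written need 2. Each listing accounts for the same 2442 blocks: 2 + 2440 on F15 after the sync, 2048 + 394 on rawhide before it, and 2 + 2046 + 394 on rawhide after it.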

2011-04-02 18:08:43

by Jim Meyering

Subject: Re: bug#8411: due to missing sync even on 2.6.39, cp fails to copy an odd file

Jim Meyering wrote:
> Pádraig Brady wrote:
>> On 02/04/11 12:16, Jim Meyering wrote:
>>> Hi Pádraig,
>>> As of this change,
>>>
>>> copy: with fiemap copy, only sync when needed
>>> http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=f69876e55
>>>
>>> fiemap copy with extents beyond EOF can fail on ext4 even with
>>> Fedora 15 (2.6.38) and rawhide's 2.6.39 kernel.
>>>
>>> Here we construct an odd file. First, preallocate 10MB of space,
>>> and then write 5KiB of random data into the beginning of that:
>>>
>>> $ fallocate -l 10000000 -n k
>>> $ dd count=10 if=/dev/urandom conv=notrunc iflag=fullblock of=k
>>>
>>> However, when we try to copy "k", we get a file, "k2" of the
>>> expected size, but with only NUL bytes for contents:
>>
>> So the extent info is not updated until sync(),
>> which means cp will consider the "unwritten" extent
>> as NUL data and not bother to read it :(
>>
>> I guess this is a corner case that was missed
>> in the fixups for ext4 (and btrfs?) in 2.6.38 for this?
>> I've copied ext4 devs for clarification.
>> I.E. if you do this:
>>
>> fallocate -l 10000000 -n k
>> dd count=10 if=/dev/urandom conv=notrunc iflag=fullblock of=k
>> filefrag -v k
>>
>> Do you get all extents still unwritten?
>> I do on my 2.6.35.11-83.fc14.i686 kernel, but I expected that.
>
> Right. The new extent remains unwritten until after a sync
> with ext4 on both Fedora 15 and rawhide kernels.

Here's the quick band-aid I'll push tomorrow.
I'll welcome a better fix, but this, at least,
lets "make check" pass once again.

From 0a6d128d0d17c1604245f1caafe6af73584a0bb8 Mon Sep 17 00:00:00 2001
From: Jim Meyering <[email protected]>
Date: Sat, 2 Apr 2011 19:59:30 +0200
Subject: [PATCH] copy: require fiemap sync also for 2.6.38 and 2.6.39 kernels

* src/extent-scan.c (extent_need_sync): Require sync also for 2.6.38
and 2.6.39. Without this, part of the cp/fiemap-empty test would fail
both on F15-to-be and rawhide. For discussion and details, see:
http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/22190
---
src/extent-scan.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/extent-scan.c b/src/extent-scan.c
index 4ddf46e..752de85 100644
--- a/src/extent-scan.c
+++ b/src/extent-scan.c
@@ -31,7 +31,7 @@
# include "fiemap.h"
#endif

-/* Work around Linux kernel issues on BTRFS and EXT4 before 2.6.38.
+/* Work around Linux kernel issues on BTRFS and EXT4 before 2.6.40.
FIXME: remove in 2013, or whenever we're pretty confident
that the offending, unpatched kernels are no longer in use. */
static bool
@@ -50,7 +50,7 @@ extent_need_sync (void)
unsigned long val;
if (xstrtoul (name.release + 4, NULL, 10, &val, NULL) == LONGINT_OK)
{
- if (val < 38)
+ if (val < 40)
need_sync = 1;
}
}
--
1.7.4.2.662.gcbd0

2011-04-02 23:00:13

by Theodore Ts'o

Subject: Re: bug#8411: due to missing sync even on 2.6.39, cp fails to copy an odd file

On Sat, Apr 02, 2011 at 08:08:34PM +0200, Jim Meyering wrote:
> From 0a6d128d0d17c1604245f1caafe6af73584a0bb8 Mon Sep 17 00:00:00 2001
> From: Jim Meyering <[email protected]>
> Date: Sat, 2 Apr 2011 19:59:30 +0200
> Subject: [PATCH] copy: require fiemap sync also for 2.6.38 and 2.6.39 kernels
>
> * src/extent-scan.c (extent_need_sync): Require sync also for 2.6.38
> and 2.6.39. Without this, part of the cp/fiemap-empty test would fail
> both on F15-to-be and rawhide. For discussion and details, see:
> http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/22190

FYI, the following fix has been merged into mainline, which should fix
the problem for 2.6.39 once it is finally released, at least for ext4.
It was merged right before Linus released 2.6.39-rc1. I'm assuming
that Rawhide released a pre-2.6.39-rc1 kernel in the middle of the
merge window.

Some distros will informally, but incorrectly, refer to such a
release as "2.6.39-rc0". I prefer the more technically correct
2.6.38-git18 (which is the first git tag in the Linux git repo which
contained the patch below, as of March 25, 2011). Unfortunately, RPM
doesn't understand that 2.6.38-rc1 sorts before 2.6.38, while
2.6.38-git17 sorts *after* 2.6.38. (Hence the incorrect, but
convenient, use of 2.6.39-rc0.)

- Ted

commit 6d9c85eb700bd3ac59e63bb9de463dea1aca084c
Author: Yongqiang Yang <[email protected]>
Date: Sun Feb 27 17:25:47 2011 -0500

ext4: make FIEMAP and delayed allocation play well together

Fix the FIEMAP ioctl so that it returns all of the page ranges which
are still subject to delayed allocation. We were missing some cases
if the file was sparse.

Reported by Chris Mason <[email protected]>:
>We've had reports on btrfs that cp is giving us files full of zeros
>instead of actually copying them. It was tracked down to a bug with
>the btrfs fiemap implementation where it was returning holes for
>delalloc ranges.
>
>Newer versions of cp are trusting fiemap to tell it where the holes
>are, which does seem like a pretty neat trick.
>
>I decided to give xfs and ext4 a shot with a few tests cases too, xfs
>passed with all the ones btrfs was getting wrong, and ext4 got the basic
>delalloc case right.
>$ mkfs.ext4 /dev/xxx
>$ mount /dev/xxx /mnt
>$ dd if=/dev/zero of=/mnt/foo bs=1M count=1
>$ fiemap-test foo
>ext: 0 logical: [ 0.. 255] phys: 0.. 255
>flags: 0x007 tot: 256
>
>Horray! But once we throw a hole in, things go bad:
>$ mkfs.ext4 /dev/xxx
>$ mount /dev/xxx /mnt
>$ dd if=/dev/zero of=/mnt/foo bs=1M count=1 seek=1
>$ fiemap-test foo
>< no output >
>
>We've got a delalloc extent after the hole and ext4 fiemap didn't find
>it. If I run sync to kick the delalloc out:
>$sync
>$ fiemap-test foo
>ext: 0 logical: [ 256.. 511] phys: 34048.. 34303
>flags: 0x001 tot: 256
>
>fiemap-test is sitting in my /usr/local/bin, and I have no idea how it
>got there. It's full of pretty comments so I know it isn't mine, but
>you can grab it here:
>
>http://oss.oracle.com/~mason/fiemap-test.c
>
>xfsqa has a fiemap program too.

After Fix, test results are as follows:
ext: 0 logical: [ 256.. 511] phys: 0.. 255
flags: 0x007 tot: 256
ext: 0 logical: [ 256.. 511] phys: 33280.. 33535
flags: 0x001 tot: 256

$ mkfs.ext4 /dev/xxx
$ mount /dev/xxx /mnt
$ dd if=/dev/zero of=/mnt/foo bs=1M count=1 seek=1
$ sync
$ dd if=/dev/zero of=/mnt/foo bs=1M count=1 seek=3
$ dd if=/dev/zero of=/mnt/foo bs=1M count=1 seek=5
$ fiemap-test foo
ext: 0 logical: [ 256.. 511] phys: 33280.. 33535
flags: 0x000 tot: 256
ext: 1 logical: [ 768.. 1023] phys: 0.. 255
flags: 0x006 tot: 256
ext: 2 logical: [ 1280.. 1535] phys: 0.. 255
flags: 0x007 tot: 256

Tested-by: Eric Sandeen <[email protected]>
Reviewed-by: Andreas Dilger <[email protected]>
Signed-off-by: Yongqiang Yang <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index d16f6b5..9ea1bc6 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -3775,6 +3775,7 @@ int ext4_convert_unwritten_extents(struct inode *inode, loff_t offset,
}
return ret > 0 ? ret2 : ret;
}
+
/*
* Callback function called for each extent to gather FIEMAP information.
*/
@@ -3782,38 +3783,162 @@ static int ext4_ext_fiemap_cb(struct inode *inode, struct ext4_ext_path *path,
struct ext4_ext_cache *newex, struct ext4_extent *ex,
void *data)
{
- struct fiemap_extent_info *fieinfo = data;
- unsigned char blksize_bits = inode->i_sb->s_blocksize_bits;
__u64 logical;
__u64 physical;
__u64 length;
+ loff_t size;
__u32 flags = 0;
- int error;
+ int ret = 0;
+ struct fiemap_extent_info *fieinfo = data;
+ unsigned char blksize_bits;

- logical = (__u64)newex->ec_block << blksize_bits;
+ blksize_bits = inode->i_sb->s_blocksize_bits;
+ logical = (__u64)newex->ec_block << blksize_bits;

if (newex->ec_start == 0) {
- pgoff_t offset;
- struct page *page;
+ /*
+ * No extent in extent-tree contains block @newex->ec_start,
+ * then the block may stay in 1)a hole or 2)delayed-extent.
+ *
+ * Holes or delayed-extents are processed as follows.
+ * 1. lookup dirty pages with specified range in pagecache.
+ * If no page is got, then there is no delayed-extent and
+ * return with EXT_CONTINUE.
+ * 2. find the 1st mapped buffer,
+ * 3. check if the mapped buffer is both in the request range
+ * and a delayed buffer. If not, there is no delayed-extent,
+ * then return.
+ * 4. a delayed-extent is found, the extent will be collected.
+ */
+ ext4_lblk_t end = 0;
+ pgoff_t last_offset;
+ pgoff_t offset;
+ pgoff_t index;
+ struct page **pages = NULL;
struct buffer_head *bh = NULL;
+ struct buffer_head *head = NULL;
+ unsigned int nr_pages = PAGE_SIZE / sizeof(struct page *);
+
+ pages = kmalloc(PAGE_SIZE, GFP_KERNEL);
+ if (pages == NULL)
+ return -ENOMEM;

offset = logical >> PAGE_SHIFT;
- page = find_get_page(inode->i_mapping, offset);
- if (!page || !page_has_buffers(page))
- return EXT_CONTINUE;
+repeat:
+ last_offset = offset;
+ head = NULL;
+ ret = find_get_pages_tag(inode->i_mapping, &offset,
+ PAGECACHE_TAG_DIRTY, nr_pages, pages);
+
+ if (!(flags & FIEMAP_EXTENT_DELALLOC)) {
+ /* First time, try to find a mapped buffer. */
+ if (ret == 0) {
+out:
+ for (index = 0; index < ret; index++)
+ page_cache_release(pages[index]);
+ /* just a hole. */
+ kfree(pages);
+ return EXT_CONTINUE;
+ }

- bh = page_buffers(page);
+ /* Try to find the 1st mapped buffer. */
+ end = ((__u64)pages[0]->index << PAGE_SHIFT) >>
+ blksize_bits;
+ if (!page_has_buffers(pages[0]))
+ goto out;
+ head = page_buffers(pages[0]);
+ if (!head)
+ goto out;

- if (!bh)
- return EXT_CONTINUE;
+ bh = head;
+ do {
+ if (buffer_mapped(bh)) {
+ /* get the 1st mapped buffer. */
+ if (end > newex->ec_block +
+ newex->ec_len)
+ /* The buffer is out of
+ * the request range.
+ */
+ goto out;
+ goto found_mapped_buffer;
+ }
+ bh = bh->b_this_page;
+ end++;
+ } while (bh != head);

- if (buffer_delay(bh)) {
- flags |= FIEMAP_EXTENT_DELALLOC;
- page_cache_release(page);
+ /* No mapped buffer found. */
+ goto out;
} else {
- page_cache_release(page);
- return EXT_CONTINUE;
+ /*Find contiguous delayed buffers. */
+ if (ret > 0 && pages[0]->index == last_offset)
+ head = page_buffers(pages[0]);
+ bh = head;
+ }
+
+found_mapped_buffer:
+ if (bh != NULL && buffer_delay(bh)) {
+ /* 1st or contiguous delayed buffer found. */
+ if (!(flags & FIEMAP_EXTENT_DELALLOC)) {
+ /*
+ * 1st delayed buffer found, record
+ * the start of extent.
+ */
+ flags |= FIEMAP_EXTENT_DELALLOC;
+ newex->ec_block = end;
+ logical = (__u64)end << blksize_bits;
+ }
+ /* Find contiguous delayed buffers. */
+ do {
+ if (!buffer_delay(bh))
+ goto found_delayed_extent;
+ bh = bh->b_this_page;
+ end++;
+ } while (bh != head);
+
+ for (index = 1; index < ret; index++) {
+ if (!page_has_buffers(pages[index])) {
+ bh = NULL;
+ break;
+ }
+ head = page_buffers(pages[index]);
+ if (!head) {
+ bh = NULL;
+ break;
+ }
+ if (pages[index]->index !=
+ pages[0]->index + index) {
+ /* Blocks are not contiguous. */
+ bh = NULL;
+ break;
+ }
+ bh = head;
+ do {
+ if (!buffer_delay(bh))
+ /* Delayed-extent ends. */
+ goto found_delayed_extent;
+ bh = bh->b_this_page;
+ end++;
+ } while (bh != head);
+ }
+ } else if (!(flags & FIEMAP_EXTENT_DELALLOC))
+ /* a hole found. */
+ goto out;
+
+found_delayed_extent:
+ newex->ec_len = min(end - newex->ec_block,
+ (ext4_lblk_t)EXT_INIT_MAX_LEN);
+ if (ret == nr_pages && bh != NULL &&
+ newex->ec_len < EXT_INIT_MAX_LEN &&
+ buffer_delay(bh)) {
+ /* Have not collected an extent and continue. */
+ for (index = 0; index < ret; index++)
+ page_cache_release(pages[index]);
+ goto repeat;
}
+
+ for (index = 0; index < ret; index++)
+ page_cache_release(pages[index]);
+ kfree(pages);
}

physical = (__u64)newex->ec_start << blksize_bits;
@@ -3822,32 +3947,16 @@ static int ext4_ext_fiemap_cb(struct inode *inode, struct ext4_ext_path *path,
if (ex && ext4_ext_is_uninitialized(ex))
flags |= FIEMAP_EXTENT_UNWRITTEN;

- /*
- * If this extent reaches EXT_MAX_BLOCK, it must be last.
- *
- * Or if ext4_ext_next_allocated_block is EXT_MAX_BLOCK,
- * this also indicates no more allocated blocks.
- *
- * XXX this might miss a single-block extent at EXT_MAX_BLOCK
- */
- if (ext4_ext_next_allocated_block(path) == EXT_MAX_BLOCK ||
- newex->ec_block + newex->ec_len - 1 == EXT_MAX_BLOCK) {
- loff_t size = i_size_read(inode);
- loff_t bs = EXT4_BLOCK_SIZE(inode->i_sb);
-
+ size = i_size_read(inode);
+ if (logical + length >= size)
flags |= FIEMAP_EXTENT_LAST;
- if ((flags & FIEMAP_EXTENT_DELALLOC) &&
- logical+length > size)
- length = (size - logical + bs - 1) & ~(bs-1);
- }

- error = fiemap_fill_next_extent(fieinfo, logical, physical,
+ ret = fiemap_fill_next_extent(fieinfo, logical, physical,
length, flags);
- if (error < 0)
- return error;
- if (error == 1)
+ if (ret < 0)
+ return ret;
+ if (ret == 1)
return EXT_BREAK;

2011-04-03 10:12:51

by Pádraig Brady

Subject: Re: bug#8411: due to missing sync even on 2.6.39, cp fails to copy an odd file

On 03/04/11 00:00, Ted Ts'o wrote:
> On Sat, Apr 02, 2011 at 08:08:34PM +0200, Jim Meyering wrote:
>> From 0a6d128d0d17c1604245f1caafe6af73584a0bb8 Mon Sep 17 00:00:00 2001
>> From: Jim Meyering <[email protected]>
>> Date: Sat, 2 Apr 2011 19:59:30 +0200
>> Subject: [PATCH] copy: require fiemap sync also for 2.6.38 and 2.6.39 kernels
>>
>> * src/extent-scan.c (extent_need_sync): Require sync also for 2.6.38
>> and 2.6.39. Without this, part of the cp/fiemap-empty test would fail
>> both on F15-to-be and rawhide. For discussion and details, see:
>> http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/22190
>
> FYI, the following fix has been merged into mainline, which should fix
> the problem for 2.6.39 once it is finally released, at least for ext4.
> It was merged right before Linus released 2.6.39-rc1. I'm assuming
> that Rawhide released a pre-2.6.39-rc1 kernel in the middle of the
> merge window.

So this fix is not in 2.6.38?
It was committed before 2.6.38-rc6 was released,
and I would have thought it appropriate for 2.6.38 :(
Anyway I guess that we now have to assume that there
can be 2.6.38 kernels in the wild with this issue,
even if the stable branch does get the fix soon.

As for 2.6.39, I guess we can assume it's OK,
and ignore the rawhide aberration for a while.

cheers,
Pádraig.

2011-04-03 10:15:27

by Jim Meyering

Subject: Re: bug#8411: due to missing sync even on 2.6.39, cp fails to copy an odd file

Ted Ts'o wrote:
> On Sat, Apr 02, 2011 at 08:08:34PM +0200, Jim Meyering wrote:
>> From 0a6d128d0d17c1604245f1caafe6af73584a0bb8 Mon Sep 17 00:00:00 2001
>> From: Jim Meyering <[email protected]>
>> Date: Sat, 2 Apr 2011 19:59:30 +0200
>> Subject: [PATCH] copy: require fiemap sync also for 2.6.38 and 2.6.39 kernels
>>
>> * src/extent-scan.c (extent_need_sync): Require sync also for 2.6.38
>> and 2.6.39. Without this, part of the cp/fiemap-empty test would fail
>> both on F15-to-be and rawhide. For discussion and details, see:
>> http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/22190
>
> FYI, the following fix has been merged into mainline, which should fix
> the problem for 2.6.39 once it is finally released, at least for ext4.
> It was merged right before Linus released 2.6.39-rc1. I'm assuming
> that Rawhide released a pre-2.6.39-rc1 kernel in the middle of the
> merge window.
>
> Some distros will informally, but incorrectly, refer to such a
> release as "2.6.39-rc0". I prefer the more technically correct
> 2.6.38-git18 (which is the first git tag in the Linux git repo which
> contained the patch below, as of March 25, 2011). Unfortunately, RPM
> doesn't understand that 2.6.38-rc1 sorts before 2.6.38, while
> 2.6.38-git17 sorts *after* 2.6.38. (Hence the incorrect, but
> convenient, use of 2.6.39-rc0.)

Right. Rawhide's 2.6.39-0.rc0.git11.0.fc16.x86_64 kernel
is from March 22.

Good. That means we needn't condemn 2.6.39.
This sort of uname-based kernel check is precisely
why we should minimize use of -rcN named kernels,
but if it affects only rawhide (and that only briefly),
I won't complain too loudly.

2011-04-03 10:27:04

by Jim Meyering

Subject: Re: bug#8411: due to missing sync even on 2.6.39, cp fails to copy an odd file

Pádraig Brady wrote:

> On 03/04/11 00:00, Ted Ts'o wrote:
>> On Sat, Apr 02, 2011 at 08:08:34PM +0200, Jim Meyering wrote:
>>> From 0a6d128d0d17c1604245f1caafe6af73584a0bb8 Mon Sep 17 00:00:00 2001
>>> From: Jim Meyering <[email protected]>
>>> Date: Sat, 2 Apr 2011 19:59:30 +0200
>>> Subject: [PATCH] copy: require fiemap sync also for 2.6.38 and 2.6.39 kernels
>>>
>>> * src/extent-scan.c (extent_need_sync): Require sync also for 2.6.38
>>> and 2.6.39. Without this, part of the cp/fiemap-empty test would fail
>>> both on F15-to-be and rawhide. For discussion and details, see:
>>> http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/22190
>>
>> FYI, the following fix has been merged into mainline, which should fix
>> the problem for 2.6.39 once it is finally released, at least for ext4.
>> It was merged right before Linus released 2.6.39-rc1. I'm assuming
>> that Rawhide released a pre-2.6.39-rc1 kernel in the middle of the
>> merge window.
>
> So this fix is not in 2.6.38?
> It was committed before 2.6.38-rc6 was released,
> and I would have thought it appropriate for 2.6.38 :(
> Anyway I guess that we now have to assume that there
> can be 2.6.38 kernels in the wild with this issue,
> even if the stable branch does get the fix soon.

Yes, this is unfortunate. But it's just an optimization,
so probably not a big deal for anyone. I haven't measured
the performance difference. Have you?

If it's a problem, I suppose once we've seen that most major distros
have patched their 2.6.38, we can turn off the sync also for 2.6.38
kernels.

> As for 2.6.39, I guess we can assume it's OK,
> and ignore the rawhide aberration for a while.

Yes. I've adjusted and pushed this:

From 1c3654cb1fb0d8f3c422c766028d0783a40f4a42 Mon Sep 17 00:00:00 2001
From: Jim Meyering <[email protected]>
Date: Sat, 2 Apr 2011 19:59:30 +0200
Subject: [PATCH] copy: require fiemap sync also for 2.6.38 kernels

* src/extent-scan.c (extent_need_sync): Require sync also for 2.6.38.
Without this, part of the cp/fiemap-empty test would fail both on
F15-to-be (2.6.38.1-6.fc15.x86_64) and rawhide. For details, see
http://thread.gmane.org/gmane.comp.gnu.coreutils.bugs/22190
---
src/extent-scan.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/extent-scan.c b/src/extent-scan.c
index c0a5de6..d84746c 100644
--- a/src/extent-scan.c
+++ b/src/extent-scan.c
@@ -31,7 +31,7 @@
# include "fiemap.h"
#endif

-/* Work around Linux kernel issues on BTRFS and EXT4 before 2.6.38.
+/* Work around Linux kernel issues on BTRFS and EXT4 before 2.6.39.
FIXME: remove in 2013, or whenever we're pretty confident
that the offending, unpatched kernels are no longer in use. */
static bool
@@ -50,7 +50,7 @@ extent_need_sync (void)
unsigned long val;
if (xstrtoul (name.release + 4, NULL, 10, &val, NULL) == LONGINT_OK)
{
- if (val < 38)
+ if (val < 39)
need_sync = 1;
}
}
--
1.7.4.2.662.gcbd0
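
For reference, a minimal sketch of the version test that the patch adjusts. It is not the coreutils source: `release_needs_sync` is a hypothetical stand-in that takes the release string as an argument instead of calling uname(), and it uses plain strtoul() where the real code uses xstrtoul().

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Return true if RELEASE (a uname-style string such as
   "2.6.38.1-6.fc15.x86_64") names a 2.6.x kernel whose x is below
   FIRST_FIXED_MINOR, i.e. one for which a sync before FIEMAP is wanted.  */
bool
release_needs_sync (const char *release, unsigned long first_fixed_minor)
{
  char *end;
  unsigned long minor;

  if (strncmp (release, "2.6.", 4) != 0)
    return false;               /* not a 2.6.x kernel: no workaround */

  minor = strtoul (release + 4, &end, 10);
  if (end == release + 4)
    return false;               /* no digits after "2.6." */

  return minor < first_fixed_minor;
}
```

With the old threshold of 38, "2.6.38.1-6.fc15.x86_64" is not synced; with 39 it is. The rawhide kernel "2.6.39-0.rc0.git11.0.fc16.x86_64" parses as minor 39, so a threshold of 39 leaves it without the sync even though it predates the ext4 fix. That is the limitation of a uname-based check noted earlier in the thread.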

2011-04-03 10:46:18

by Theodore Ts'o

Subject: Re: bug#8411: due to missing sync even on 2.6.39, cp fails to copy an odd file


On Apr 3, 2011, at 6:12 AM, Pádraig Brady wrote:

> So this fix is not in 2.6.38?
> It was committed before 2.6.38-rc6 was released,
> and I would have thought it appropriate for 2.6.38 :(
> Anyway I guess that we now have to assume that there
> can be 2.6.38 kernels in the wild with this issue,
> even if the stable branch does get the fix soon.


Yeah, the patch was complicated enough that I was concerned about
pushing it that late in the 2.6.38 development cycle. When I was told
that this feature was in a pre-release shellutils, I made the call not
to try to rush things... We can get it into a 2.6.38 stable release
fairly quickly.

-- Ted