Hi,
Weird problem: when I build an app from source with
make; make install
and then run the command, I get "cannot execute binary file".
hexdump shows the installed binary is full of zeros.
Is it related to the ext4 FIEMAP problem described below?
http://lwn.net/Articles/429349/
I finally managed to find a way to reproduce this:
just cp an ELF binary A to file B, then cp B to file C, and you will get:
A == B != C
i.e.
cp /bin/ls ls1
cp ls1 ls2
ls2 will be filled with zeros.
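The reproduction steps above can be scripted so that the comparison is explicit. This is only a minimal sketch, assuming Python 3 and a cp binary on PATH; the helper names sha256_of, is_all_zero and chain_copy are made up for illustration and are not part of any tool mentioned in this thread.

```python
import hashlib
import os
import subprocess


def sha256_of(path):
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_all_zero(path):
    """True if the file is non-empty and every byte in it is zero."""
    seen = False
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            seen = True
            if chunk.count(0) != len(chunk):
                return False
    return seen


def chain_copy(src, workdir):
    """cp src -> ls1, then cp ls1 -> ls2, with no sync in between.

    Returns the digests of A, B and C plus a zero check on C.
    """
    b = os.path.join(workdir, "ls1")
    c = os.path.join(workdir, "ls2")
    subprocess.run(["cp", src, b], check=True)
    subprocess.run(["cp", b, c], check=True)
    return sha256_of(src), sha256_of(b), sha256_of(c), is_all_zero(c)
```

Calling chain_copy("/bin/ls", some_dir) with some_dir on the affected ext4 filesystem should show whether A == B != C holds there; on a healthy setup all three digests match.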
Below is an strace log of install; the kernel version is 3.1.0-rc6+.
geteuid() = 0
umask(0) = 022
stat("/tmp/vpnc", 0x7fff85363710) = -1 ENOENT (No such file or directory)
stat("vpnc", {st_mode=S_IFREG|0755, st_size=368662, ...}) = 0
lstat("/tmp/vpnc", 0x7fff85363250) = -1 ENOENT (No such file or directory)
open("vpnc", O_RDONLY) = 3
fstat(3, {st_mode=S_IFREG|0755, st_size=368662, ...}) = 0
open("/tmp/vpnc", O_WRONLY|O_CREAT|O_EXCL, 0755) = 4
fstat(4, {st_mode=S_IFREG|0755, st_size=0, ...}) = 0
uname({sys="Linux", node="darkstar", ...}) = 0
ioctl(3, FS_IOC_FIEMAP, 0x7fff85361f60) = 0
ftruncate(4, 368662) = 0
fsetxattr(4, "system.posix_acl_access", "\x02\x00\x00\x00\x01\x00\x06\x00\xff\xff\xff\xff\x04\x00\x00\x00\xff\xff\xff\xff \x00\x00\x00\xff\xff\xff\xff", 28, 0) = 0
close(4) = 0
close(3) = 0
chmod("/tmp/vpnc", 0755) = 0
close(0) = 0
close(1) = 0
close(2) = 0
exit_group(0) = ?
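In the trace the FIEMAP ioctl is followed directly by ftruncate(), with no read()/write() pairs in between. That is what an extent-driven copy would do if the extent list came back empty. Below is a toy model of that behaviour, not the coreutils implementation; copy_by_extents and its (offset, length) list are hypothetical names used only for illustration.

```python
def copy_by_extents(src_path, dst_path, extents, size):
    """Copy only the byte ranges listed in extents, then set the final size.

    extents is a list of (offset, length) pairs standing in for whatever an
    extent-mapping call reported. Ranges that are not listed are never read,
    so they read back as zeros in the destination.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for offset, length in extents:
            src.seek(offset)
            dst.seek(offset)
            dst.write(src.read(length))
        # Mirrors the ftruncate(4, 368662) call in the trace: the file gets
        # its full length even if nothing was written to it.
        dst.truncate(size)
```

With an empty extents list the destination has the right size but contains only zeros, which matches the hexdump result described above.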
--
Regards
Dave
On Sat, Oct 01, 2011 at 10:01:35PM +0800, Dave Young wrote:
> Hi,
>
> Weird problem, when I build app from source,
> make; make install
> run the command, but got "cannot execute binary file"
>
> hexdump shows the installed binary is full of zero
>
> Is it related to ext4 fiemap problem described below?
> http://lwn.net/Articles/429349/
There is general agreement that /bin/cp should not have been relying
on FIEMAP, and I believe the more recent versions of /bin/cp have
removed that code by default pending implementation of
SEEK_HOLE/SEEK_DATA. That being said, ext4 had a workaround to its
FIEMAP implementation that landed in 2.6.39, and you're using
3.1.0-rc6.
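For reference, here is a rough sketch of how a copy tool could walk a file with SEEK_DATA/SEEK_HOLE instead of FIEMAP. It assumes Python 3 on a platform where os.SEEK_DATA and os.SEEK_HOLE are available; data_segments is a hypothetical helper, not anything taken from coreutils.

```python
import errno
import os


def data_segments(path):
    """Return [(start, end), ...] byte ranges that lseek() reports as data."""
    segments = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:  # no more data past offset
                    break
                raise
            # The end of this data run is the start of the next hole
            # (or the end of the file).
            end = os.lseek(fd, start, os.SEEK_HOLE)
            segments.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return segments
```

A copier would then read and write only the returned ranges and leave the rest as holes in the destination.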
> I finally managed to find the way to reproduce this:
> just cp a elf binary A to file B, then cp B to file C, then you will get:
> A == B != C
>
> ie.
> cp /bin/ls ls1
> cp ls1 ls2
>
> ls2 will be filled with zero
If you add a "sync" between the two copies, does that work around the
problem? I bet it will...
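A sketch of that workaround in script form, under one assumption that goes beyond this thread: that an fsync() on the source file has the same effect for that file as a system-wide sync. synced_cp is a hypothetical helper name.

```python
import os
import subprocess


def synced_cp(src, dst):
    """Flush src to disk, then copy it with cp."""
    fd = os.open(src, os.O_RDONLY)
    try:
        # Assumption: per-file fsync is enough here; only a plain
        # "sync" between the copies is discussed in this thread.
        os.fsync(fd)
    finally:
        os.close(fd)
    subprocess.run(["cp", src, dst], check=True)
```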
My suggestion is to upgrade to a newer version of coreutils that
doesn't try to use FIEMAP.
- Ted
On Sat, Oct 1, 2011 at 10:39 PM, Ted Ts'o <[email protected]> wrote:
> There is general agreement that /bin/cp should not have been relying
> on FIEMAP, and I believe the more recent versions of /bin/cp have
> removed that code by default pending implementation of
> SEEK_HOLE/SEEK_DATA. That being said, ext4 had a workaround to its
> FIEMAP implementation that landed in 2.6.39, and you're using
> 3.1.0-rc6.
Do you mean it should work in 3.1.0-rc6 even with a cp that depends on FIEMAP?
>
>> I finally managed to find the way to reproduce this:
>> just cp a elf binary A to file B, then cp B to file C, then you will get:
>> A == B != C
>>
>> ie.
>> cp /bin/ls ls1
>> cp ls1 ls2
>>
>> ls2 will be filled with zero
>
> If you add a "sync" between the two copies, does that work around the
> problem? I bet it will...
Yes, it works
>
> My suggestion is to upgrade to a newer version of coreutils that
> doesn't try to use FIEMAP.
Thanks, will try
>
> - Ted
>
--
Regards
Dave
> On Sat, Oct 1, 2011 at 10:39 PM, Ted Ts'o <[email protected]> wrote:
>> There is general agreement that /bin/cp should not have been relying
>> on FIEMAP, and I believe the more recent versions of /bin/cp have
>> removed that code by default pending implementation of
>> SEEK_HOLE/SEEK_DATA. That being said, ext4 had a workaround to its
>> FIEMAP implementation that landed in 2.6.39, and you're using
>> 3.1.0-rc6.
Actually, upstream cp(1) uses FIEMAP only if the source file is sparse; otherwise it does a normal, block-based copy.
Thanks,
-Jeff
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2011-10-01, at 11:41 PM, Jeff liu wrote:
>> On Sat, Oct 1, 2011 at 10:39 PM, Ted Ts'o <[email protected]> wrote:
>>> There is general agreement that /bin/cp should not have been relying
>>> on FIEMAP, and I believe the more recent versions of /bin/cp have
>>> removed that code by default pending implementation of
>>> SEEK_HOLE/SEEK_DATA. That being said, ext4 had a workaround to its
>>> FIEMAP implementation that landed in 2.6.39, and you're using
>>> 3.1.0-rc6.
>
> Actually, upstream cp(1) using FIEMAP only if the source file is sparse, or else, it will do normal copy, i.e, block based.
My understanding is that cp uses the blocks count to determine whether the file is sparse or not. In the case of delayed allocation (where blocks are not yet allocated, if they are not reflected in the i_blocks count) it might mistakenly think that the file is sparse.
Given the danger of this bug, it is important to ensure ext4 returns DELALLOC extents for pages in the page cache. I think Yongqiang Yang just submitted a patch series to do this for ext4, so it would be important to verify it fixes this problem.
Cheers, Andreas
On 2011-10-01, at 11:41 PM, Jeff liu wrote:
>> On Sat, Oct 1, 2011 at 10:39 PM, Ted Ts'o <[email protected]> wrote:
>>> There is general agreement that /bin/cp should not have been relying
>>> on FIEMAP, and I believe the more recent versions of /bin/cp have
>>> removed that code by default pending implementation of
>>> SEEK_HOLE/SEEK_DATA. That being said, ext4 had a workaround to its
>>> FIEMAP implementation that landed in 2.6.39, and you're using
>>> 3.1.0-rc6.
>
> Actually, upstream cp(1) using FIEMAP only if the source file is sparse, or else, it will do normal copy, i.e, block based.
Are there any distros that are shipping with a version of cp that depends on FIEMAP? That would dramatically increase the severity of this problem, since orders of magnitude more users will hit the problem.
Dave, what distro were you seeing this problem on, and had you installed/upgraded your coreutils and/or kernel yourself?
Cheers, Andreas
On 2011-10-2, at 3:59 PM, Andreas Dilger wrote:
> On 2011-10-01, at 11:41 PM, Jeff liu wrote:
>>> On Sat, Oct 1, 2011 at 10:39 PM, Ted Ts'o <[email protected]> wrote:
>>>> There is general agreement that /bin/cp should not have been relying
>>>> on FIEMAP, and I believe the more recent versions of /bin/cp have
>>>> removed that code by default pending implementation of
>>>> SEEK_HOLE/SEEK_DATA. That being said, ext4 had a workaround to its
>>>> FIEMAP implementation that landed in 2.6.39, and you're using
>>>> 3.1.0-rc6.
>>
>> Actually, upstream cp(1) using FIEMAP only if the source file is sparse, or else, it will do normal copy, i.e, block based.
>
> My understanding is that cp uses the blocks count to determine whether the file is sparse or not.
Yes, it uses the block count to determine that.
> In the case of delayed allocation (where blocks are not yet allocated, if they are not reflected in the i_blocks count) it might mistakenly think that the file is sparse.
Thanks for pointing this out, I missed this case.
So for Dave's issue, even if he updates to upstream coreutils, the problem will still occur occasionally with delayed allocation unless sync is run in between.
>
> Given the danger of this bug, it is important to ensure ext4 returns DELALLOC extents for pages in the page cache. I think Yongqiang Yang just submitted a patch series to do this for ext4, so it would be important to verify it fixes this problem.
Thanks,
-Jeff
On Sun, Oct 2, 2011 at 4:02 PM, Andreas Dilger <[email protected]> wrote:
> On 2011-10-01, at 11:41 PM, Jeff liu wrote:
>>> On Sat, Oct 1, 2011 at 10:39 PM, Ted Ts'o <[email protected]> wrote:
>>>> There is general agreement that /bin/cp should not have been relying
>>>> on FIEMAP, and I believe the more recent versions of /bin/cp have
>>>> removed that code by default pending implementation of
>>>> SEEK_HOLE/SEEK_DATA. That being said, ext4 had a workaround to its
>>>> FIEMAP implementation that landed in 2.6.39, and you're using
>>>> 3.1.0-rc6.
>>
>> Actually, upstream cp(1) using FIEMAP only if the source file is sparse, or else, it will do normal copy, i.e, block based.
>
> Are there any distros that are shipping with a version of cp that depends on FIEMAP? That would dramatically increase the severity of this problem, since orders of magnitude more users will hit the problem.
I'm not sure whether it depends on FIEMAP, but I don't think the version is that old.
>
> Dave, what distro were you seeing this problem on, and had you installed/upgraded your coreutils and/or kernel yourself?
Slackware 13.37, coreutils 8.11.
The kernel is always built by myself from Linus's git tree.
--
Regards
Dave
2011/10/2 Jeff liu <[email protected]>:
>
> On 2011-10-2, at 3:59 PM, Andreas Dilger wrote:
>
>> On 2011-10-01, at 11:41 PM, Jeff liu wrote:
>>>> On Sat, Oct 1, 2011 at 10:39 PM, Ted Ts'o <[email protected]> wrote:
>>>>> There is general agreement that /bin/cp should not have been relying
>>>>> on FIEMAP, and I believe the more recent versions of /bin/cp have
>>>>> removed that code by default pending implementation of
>>>>> SEEK_HOLE/SEEK_DATA. That being said, ext4 had a workaround to its
>>>>> FIEMAP implementation that landed in 2.6.39, and you're using
>>>>> 3.1.0-rc6.
>>>
>>> Actually, upstream cp(1) using FIEMAP only if the source file is sparse, or else, it will do normal copy, i.e, block based.
>>
>> My understanding is that cp uses the blocks count to determine whether the file is sparse or not.
> Yes, it based on blocks count to determine that.
>
>> In the case of delayed allocation (where blocks are not yet allocated, if they are not reflected in the i_blocks count) it might mistakenly think that the file is sparse.
I think this might be my case
> Thanks for pointing this out, I missed this case.
> So for Dave's issue, even if he updated to the upstream Coreutils, this issue will still exists occasionally for delayed allocation, if not run sync in between times.
Not just occasionally; I can reproduce it easily.
--
Regards
Dave
On Sun, Oct 02, 2011 at 12:59:22AM -0700, Andreas Dilger wrote:
> My understanding is that cp uses the blocks count to determine whether the file is sparse or not. In the case of delayed allocation (where blocks are not yet allocated, if they are not reflected in the i_blocks count) it might mistakenly think that the file is sparse.
Ext4 fortunately is smart enough to add the delalloc blocks to st_blocks
for stat, just like all other filesystems implementing delayed
allocation.
On Sun, Oct 2, 2011 at 3:59 PM, Andreas Dilger <[email protected]> wrote:
> On 2011-10-01, at 11:41 PM, Jeff liu wrote:
>>> On Sat, Oct 1, 2011 at 10:39 PM, Ted Ts'o <[email protected]> wrote:
>>>> There is general agreement that /bin/cp should not have been relying
>>>> on FIEMAP, and I believe the more recent versions of /bin/cp have
>>>> removed that code by default pending implementation of
>>>> SEEK_HOLE/SEEK_DATA. That being said, ext4 had a workaround to its
>>>> FIEMAP implementation that landed in 2.6.39, and you're using
>>>> 3.1.0-rc6.
>>
>> Actually, upstream cp(1) using FIEMAP only if the source file is sparse, or else, it will do normal copy, i.e, block based.
>
> My understanding is that cp uses the blocks count to determine whether the file is sparse or not. In the case of delayed allocation (where blocks are not yet allocated, if they are not reflected in the i_blocks count) it might mistakenly think that the file is sparse.
>
> Given the danger of this bug, it is important to ensure ext4 returns DELALLOC extents for pages in the page cache. I think Yongqiang Yang just submitted a patch series to do this for ext4, so it would be important to verify it fixes this problem.
It seems the patch [ext4: in fiemap use FIEMAP_EXTENT_LAST flag for
last extent] (http://www.spinics.net/lists/linux-ext4/msg25698.html)
that Lukas submitted makes FIEMAP ignore delayed extents beyond the
last allocated block. E.g. for AAAHHHHDDDD, where
A - allocated, H - hole, D - delayed alloc, the ending delayed
extent is ignored.
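The AAAHHHHDDDD case can be written down as a toy model. This is only an illustration of the behaviour described above, not the kernel code; reported_ranges and its stop_at_last_allocated switch are hypothetical names.

```python
def reported_ranges(layout, stop_at_last_allocated):
    """Return [(start, end, kind), ...] runs that a mapping call would report.

    layout is a string with one character per block:
    'A' allocated, 'H' hole, 'D' delayed allocation.
    """
    runs = []
    i = 0
    while i < len(layout):
        j = i
        while j < len(layout) and layout[j] == layout[i]:
            j += 1
        if layout[i] != "H":  # holes are not reported as extents
            runs.append((i, j, layout[i]))
        i = j
    if stop_at_last_allocated:
        last_alloc_end = max(
            (end for _, end, kind in runs if kind == "A"), default=0
        )
        # Anything starting at or after the end of the last allocated
        # extent is dropped, which loses the trailing delayed extent.
        runs = [r for r in runs if r[0] < last_alloc_end]
    return runs
```

In this model reported_ranges("AAAHHHHDDDD", True) yields only the allocated run, and a freshly written file that is still entirely delayed ("DDDD") yields nothing at all, which would be consistent with the zero-filled copy Dave reported.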
Yongqiang.
--
Best Wishes
Yongqiang Yang
On 10/02/2011 09:43 AM, Dave Young wrote:
> On Sun, Oct 2, 2011 at 4:02 PM, Andreas Dilger <[email protected]> wrote:
>> On 2011-10-01, at 11:41 PM, Jeff liu wrote:
>>>> On Sat, Oct 1, 2011 at 10:39 PM, Ted Ts'o <[email protected]> wrote:
>>>>> There is general agreement that /bin/cp should not have been relying
>>>>> on FIEMAP, and I believe the more recent versions of /bin/cp have
>>>>> removed that code by default pending implementation of
>>>>> SEEK_HOLE/SEEK_DATA. That being said, ext4 had a workaround to its
>>>>> FIEMAP implementation that landed in 2.6.39, and you're using
>>>>> 3.1.0-rc6.
>>>
>>> Actually, upstream cp(1) using FIEMAP only if the source file is sparse, or else, it will do normal copy, i.e, block based.
>>
>> Are there any distros that are shipping with a version of cp that depends on FIEMAP? That would dramatically increase the severity of this problem, since orders of magnitude more users will hit the problem.
>
> I'm not sure if it depends on FIEMAP, I think it should be not so old.
>
>>
>> Dave, what distro were you seeing this problem on, and had you installed/upgraded your coreutils and/or kernel yourself?
>
> Slackware 13.37, coreutils 8.11
> kernel is always built from linus's git by myself
Coreutils 8.11 was only out for 13 days
before 8.12 was released specifically to avoid this issue.
Slackware should update.
Coreutils 8.12 only uses a fiemap based copy for
sparse files, where it will do a sync first.
The sparseness heuristic is st_blocks < st_size / st_blksize
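As a small sketch, the heuristic can be written out exactly as it is stated in this message. The formula is reproduced from the line above rather than taken from the coreutils source, so the units are an assumption (on Linux st_blocks counts 512-byte units, while st_blksize is the preferred I/O size). looks_sparse and file_looks_sparse are hypothetical helper names.

```python
import os


def looks_sparse(st_size, st_blocks, st_blksize):
    """Heuristic as written above: st_blocks < st_size / st_blksize."""
    # Integer division, as the expression would behave in C.
    return st_blocks < st_size // st_blksize


def file_looks_sparse(path):
    """Apply the heuristic to the stat() fields of a real file."""
    st = os.stat(path)
    return looks_sparse(st.st_size, st.st_blocks, st.st_blksize)
```

For the 368662-byte binary from the strace log, a block count of 0 would make the file look sparse, while a fully accounted file would not.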
cheers,
Pádraig.
On Mon, 3 Oct 2011, Yongqiang Yang wrote:
> On Sun, Oct 2, 2011 at 3:59 PM, Andreas Dilger <[email protected]> wrote:
> > On 2011-10-01, at 11:41 PM, Jeff liu wrote:
> >>> On Sat, Oct 1, 2011 at 10:39 PM, Ted Ts'o <[email protected]> wrote:
> >>>> There is general agreement that /bin/cp should not have been relying
> >>>> on FIEMAP, and I believe the more recent versions of /bin/cp have
> >>>> removed that code by default pending implementation of
> >>>> SEEK_HOLE/SEEK_DATA. That being said, ext4 had a workaround to its
> >>>> FIEMAP implementation that landed in 2.6.39, and you're using
> >>>> 3.1.0-rc6.
> >>
> >> Actually, upstream cp(1) using FIEMAP only if the source file is sparse, or else, it will do normal copy, i.e, block based.
> >
> > My understanding is that cp uses the blocks count to determine whether the file is sparse or not. In the case of delayed allocation (where blocks are not yet allocated, if they are not reflected in the i_blocks count) it might mistakenly think that the file is sparse.
> >
> > Given the danger of this bug, it is important to ensure ext4 returns DELALLOC extents for pages in the page cache. I think Yongqiang Yang just submitted a patch series to do this for ext4, so it would be important to verify it fixes this problem.
> It seemed the patch[ ext4: in fiemap use FIEMAP_EXTENT_LAST flag for
> last extent] (http://www.spinics.net/lists/linux-ext4/msg25698.html)
> Lukas submitted on FIEMAP which ignores delayed extents beyond the
> last allocated block. e.g. AAAHHHHDDDD
> A - allocated, H - hole, D - delayed alloc, then the ending delayed
> extent is ignored.
Oops, you're right. I think that the best solution would be to revert
the commit
c03f8aa9abdd517477c2021ea1251939b4da49e6
ext4: use FIEMAP_EXTENT_LAST flag for last extent in fiemap
and then fix the original problem with your delayed extent tree
solution, where we can easily check not only for next allocated extent,
but also for next delayed extent to see if the current one is last or
not.
Currently, the problem is that at the point where we fill the fiemap
extent with fiemap_fill_next_extent() we do not have enough information
to say whether the extent is really the last one or not. And currently
there is no easy way to check for the next delayed extent (which will be
fixed with your delayed extent tree).
I do not know how "ready" your patches are. Is it possible to wait for
them to be ready and fix it in your patch set? That means reverting the
mentioned commit and reimplementing fiemap with the delayed extent tree.
Thanks!
-Lukas
On Mon, Oct 03, 2011 at 03:11:30PM +0200, Lukas Czerner wrote:
>
> Oops, you're right. I think that the best solution would be to revert
> the commit
>
> c03f8aa9abdd517477c2021ea1251939b4da49e6
> ext4: use FIEMAP_EXTENT_LAST flag for last extent in fiemap
>
> and then fix the original problem with your delayed extent tree
> solution, where we can easily check not only for next allocated extent,
> but also for next delayed extent to see if the current one is last or
> not.
> ...
>
> I do not know how "ready" are your patches..Is it possible to wait for
> them to be ready and fix it in your patch set ? That means, revert the
> mentioned commit and reimplement fiemap with delayed extent tree.
Sigh, yeah, we need to fix this to avoid the hang in xfstests #252, but
users losing data, even if the coreutils release was only out there for
13 days, is bad juju.
I'm working on reviewing the kernel patch backlog this week, and I'll
give this series priority.
Thanks to Yongqiang and Lukas for looking into this!
- Ted
On Mon, 3 Oct 2011, Ted Ts'o wrote:
> On Mon, Oct 03, 2011 at 03:11:30PM +0200, Lukas Czerner wrote:
> >
> > Oops, you're right. I think that the best solution would be to revert
> > the commit
> >
> > c03f8aa9abdd517477c2021ea1251939b4da49e6
> > ext4: use FIEMAP_EXTENT_LAST flag for last extent in fiemap
> >
> > and then fix the original problem with your delayed extent tree
> > solution, where we can easily check not only for next allocated extent,
> > but also for next delayed extent to see if the current one is last or
> > not.
> > ...
> >
> > I do not know how "ready" are your patches..Is it possible to wait for
> > them to be ready and fix it in your patch set ? That means, revert the
> > mentioned commit and reimplement fiemap with delayed extent tree.
>
> Sigh, yeah, we need to fix this to avoid the hang in xfstests #252 but
> users losing data even if the coreutils release was only out there for
> 13 days is bad juju.
>
> I'm working on reviewing the kernel patch backlog this week, and I'll
> give this series one priority.
Actually the series needs to be changed to fix the problem. I'll
comment on the appropriate patch.
Thanks!
-Lukas
>
> Thanks to Yongqiang and Lukas for looking into this!
>
> - Ted