Hello,
Here's what happened. I reshaped my RAID5 to a RAID6 and a larger disk count.
Then I ran "resize2fs /dev/md0". It successfully enlarged the filesystem
online (that took a couple of hours, I think).
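(For reference, the sequence was roughly as follows; I'm sketching it from
memory, so the target device count and backup-file path are only illustrative:

mdadm --grow /dev/md0 --level=6 --raid-devices=7 --backup-file=/root/md0-grow.backup
resize2fs /dev/md0

with resize2fs run only after the reshape had completed.)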
Since my count of disks changed, I thought I should change the array stripe
width. Then, with the filesystem still mounted, I did:
tune2fs -E stripe_width=48 /dev/md0
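(For context: stripe_width is supposed to be stride times the number of data
disks, where stride = chunk size / filesystem block size. Purely as an
illustration, and not necessarily my actual geometry, a 32 KiB chunk with 4 KiB
blocks gives stride=8, so with 6 data disks the full invocation would be:

tune2fs -E stride=8,stripe_width=48 /dev/md0

which matches the stripe_width=48 above, although other chunk/disk combinations
produce the same number.)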
Then I started copying files from another array to the one on which the above
operations were conducted. I did:
cp -Rp /mnt/array1/data/* /mnt/array2/new-data/
This completed successfully. No errors on the console, silence in dmesg.
Then I thought I'd verify the destination, just to be sure. Luckily, I have
checksums stored in almost every directory as .SFV files (created with "cfv").
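(The verification was done roughly like this, from the top of the copied tree;
I may be misremembering the exact cfv flags:

cd /mnt/array2/new-data
cfv -r

i.e. cfv in its default test mode, recursing and picking up every .SFV file it
finds.)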
And some checksums did not match. On closer investigation, it appears that
some files deep down in the directory tree (and with a rare occurrence,
something like one file in a thousand) were truncated during copying. E.g.
they'd have a size of 188 MB instead of 349 MB, or 128 MB instead of 170 MB.
Some files (originally less than 1 MB in size) just had zero-length on the
destination.
Other than these truncations, there is NO corruption of data inside any of the
files. Which kinda rules out CRC-style errors in controller/disk/cable.
So... this is completely puzzling to me, and I suspect either a kernel bug, or
my own mistake, namely that tune2fs cannot safely modify a mounted FS (however,
it did NOT show any warning that the FS was mounted, and if that is the case,
it absolutely should have refused to operate on a mounted FS).
Any ideas? :)
--
With respect,
Roman
On 2011-04-09, at 12:39 PM, Roman Mamedov wrote:
> Here's what happened. I reshaped my RAID5 to a RAID6 and a larger disk count.
> Then I ran "resize2fs /dev/md0". It successfully enlarged the filesystem
> online (that took a couple of hours, I think).
The online resize shouldn't take more than a few minutes, unless the disk is crazy busy and you are going from 16GB to 16TB or something.
> Since my count of disks changed, I thought I should change the array stripe
> width. Then, with the filesystem still mounted, I did:
>
> tune2fs -E stripe_width=48 /dev/md0
This does nothing more than set a field in the superblock, and it only gives a hint to the allocator to align the files on multiples of 48-block boundaries.
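You can read the hint back from the superblock to confirm that, e.g. (grep
pattern only for convenience):

dumpe2fs -h /dev/md0 | grep -iE 'stride|stripe'

which should show the "RAID stride" and "RAID stripe width" fields.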
> Then I started copying files from another array to the one on which the above
> operations were conducted. I did:
>
> cp -Rp /mnt/array1/data/* /mnt/array2/new-data/
What kernel version do you have, and what version of coreutils? Is this perhaps a bleeding-edge kernel/coreutils with the "FIEMAP" bug?
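To check, for instance:

uname -r
cp --version | head -n 1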
> This completed successfully. No errors on the console, silence in dmesg.
>
> Then I thought I'd verify the destination, just to be sure. Luckily, I have
> checksums stored in almost every directory as .SFV files (created with "cfv").
>
> And some checksums did not match. On closer investigation, it appears that
> some files deep down in the directory tree (and with a rare occurrence,
> something like one file in a thousand) were truncated during copying. E.g.
> they'd have a size of 188 MB instead of 349 MB, or 128 MB instead of 170 MB.
> Some files (originally less than 1 MB in size) just had zero-length on the
> destination.
Were the source files that had problems recently written themselves in this case?
> Other than these truncations, there is NO corruption of data inside any of the
> files. Which kinda rules out CRC-style errors in controller/disk/cable.
>
> So... this is completely puzzling to me, and I suspect either a kernel bug, or
> my own mistake, namely that tune2fs cannot safely modify a mounted FS (however,
> it did NOT show any warning that the FS was mounted, and if that is the case,
> it absolutely should have refused to operate on a mounted FS).
>
> Any ideas? :)
>
> --
> With respect,
> Roman
Cheers, Andreas
On Sat, 9 Apr 2011 15:30:42 -0600
Andreas Dilger <[email protected]> wrote:
> The online resize shouldn't take more than a few minutes, unless the disk is
> crazy busy and you are going from 16GB to 16TB or something.
I was resizing from 4 TB to 6 TB, and during that time (okay, maybe it was just
an hour) I saw free disk space slowly increase, at about 1 GB every couple of
seconds. The RAID6 was also in a degraded state (one disk missing), so maybe
that's why it was slower than it usually would be.
> > Then I started copying files from another array to the one on which the
> > above operations were conducted. I did:
> >
> > cp -Rp /mnt/array1/data/* /mnt/array2/new-data/
>
> What kernel version do you have, and what version of coreutils? Is this
> perhaps a bleeding-edge kernel/coreutils with the "FIEMAP" bug?
Kernel version 2.6.38.2, cp (GNU coreutils) 8.5.
Also, what I didn't mention in my previous post is that during that session one
e-mail message which had just been received by the mail client (not copied from
another disk via cp or otherwise) got truncated too. The mail client is
claws-mail, which stores individual messages on disk as regular files, one file
per message, and that one file ended up 0 bytes in size. At first I didn't
think it was related, but it looks like all disk writes were affected, not just
those done by cp/coreutils.
> Were the source files that had problems recently written themselves in this
> case?
Recently, as in "while on 2.6.38.x kernels", or "in the past N minutes"? The
latter, definitely not (except for that one e-mail mentioned above); the
former, maybe, but unlikely.
--
With respect,
Roman
Hi Roman,
Your symptoms don't sound familiar to me, other than the standard
concerns about hardware induced file system inconsistency problems.
Have you checked your logs carefully to make sure there weren't any
hardware errors reported? If this is a hardware RAID system, is it
regularly doing disk scrubbing? Has the hardware RAID reported
anything unusual? How long have you been running in a degraded RAID 6
state?
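For example, something like the following against each member disk would be a
reasonable first pass (the device name is illustrative, and smartctl comes from
the smartmontools package):

dmesg | grep -iE 'error|fail'
smartctl -a /dev/sda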
And have you tried shutting down the system and running fsck to make
sure there weren't any file system corruption problems? When's the
last time you've run fsck on the system?
If this is an LVM system, I'd strongly suggest that you set aside space so you
can take a snapshot, then regularly take a snapshot and run fsck on the
snapshot. If any problems are noted, you can then schedule downtime and fsck
the entire system.
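Something like this, with the VG/LV names and snapshot size purely
illustrative:

lvcreate -s -L 16G -n md0snap /dev/vg0/datalv
e2fsck -fn /dev/vg0/md0snap
lvremove /dev/vg0/md0snap

The -fn flags keep e2fsck read-only while forcing a full check of the snapshot.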
Regards,
- Ted
On Mon, 11 Apr 2011 09:10:08 -0400
Ted Ts'o <[email protected]> wrote:
> Your symptoms don't sound familiar to me, other than the standard
> concerns about hardware induced file system inconsistency problems.
The thing is, I do not observe any random in-file data corruption that would
point to a problem at a lower (block-device) level, so I do not think it is a
RAID or HDD problem.
The breakage seems to be at the filesystem-logic level, perhaps something to do
with allocation of space for new files? And since, immediately before that, I
had performed two operations possibly affecting it (the tune2fs stripe_width
change and the online grow with resize2fs), I thought this might be an ext4
problem.
While still in the same session, I then re-copied the affected files, replacing
their "shortened" copies, and they were written out fine the second time. And
after a reboot, no more file truncations have been observed so far.
> Have you checked your logs carefully to make sure there weren't any
> hardware errors reported?
No, there weren't any errors in dmesg, or on the same console where 'cp' would
output its errors.
> If this is a hardware RAID system, is it regularly doing disk scrubbing?
> Has the hardware RAID reported anything unusual? How long have you been
> running in a degraded RAID 6 state?
It is an mdadm RAID6, and it does not report any problems. It was running in a
degraded state for only a short time (less than a day). And AFAIK running
degraded with one disk missing is not a dangerous or risky situation with
RAID6.
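For reference, this is how the status can be checked, and how a scrub can be
kicked off through sysfs (the paths assume the array really is md0):

cat /proc/mdstat
mdadm --detail /dev/md0
echo check > /sys/block/md0/md/sync_action
cat /sys/block/md0/md/mismatch_cnt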
> And have you tried shutting down the system and running fsck to make
> sure there weren't any file system corruption problems? When's the
> last time you've run fsck on the system?
I unmounted it and ran fsck just now. Admittedly, it had been a long time
since the last fsck.
# e2fsck /dev/md0
e2fsck 1.41.12 (17-May-2010)
/dev/md0 has gone 306 days without being checked, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/md0: 367107/364412928 files (4.3% non-contiguous), 1219229259/1457626752
blocks
> If this is an LVM system, I'd strongly suggest that you set aside space so you
> can take a snapshot, then regularly take a snapshot and run fsck on the
> snapshot. If any problems are noted, you can then schedule downtime and fsck
> the entire system.
No, I don't use LVM there.
--
With respect,
Roman