2016-12-09 05:40:37

by Simon Matthews

Subject: Fwd: Filesystem size problem.

Forwarding this to linux-ext4 list because the linux-admin list
appears to be very quiet.

Please accept my apologies in advance if this is the wrong forum for
this question!

Simon

---------- Forwarded message ----------
From: Simon Matthews <[email protected]>
Date: Thu, Dec 8, 2016 at 8:29 PM
Subject: Filesystem size problem.
To: [email protected]


I have an ext3 filesystem that will not mount under newer versions of
the kernel and I hope someone here can help.

Obviously, one solution is "backup and re-create from scratch". I have
the backups, but I hope that there may be a quicker method to fix the
issues.

The root issue is that the filesystem is very slightly smaller than
the allocated space. The filesystem exists on a MDRAID device and I
think that when I converted the MDRAID to a newer metadata version, it
truncated the available size, slightly. However, how I got here isn't
really important, fixing it now is.

With a slightly older kernel (4.0.5), the filesystem can be mounted.
With 4.4.26, the ext3 support is provided by the ext4 subsystem and it
appears that it will not accept the size issues. dmesg showed this
from the mount attempt:

md5: detected capacity change from 0 to 2839999799296
[ 1162.508338] EXT4-fs (md5): mounting ext3 file system using the ext4 subsystem
[ 1162.508560] EXT4-fs (md5): bad geometry: block count 693359344 exceeds size of device (693359326 blocks)

As I stated, the difference is very small, so it was working OK for a long time.
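
For reference, the figures in the dmesg output are consistent with a
4096-byte block size (an assumption; the log does not state it): the
detected capacity divided by 4096 gives exactly the device block count
reported, and the superblock claims 18 blocks more than that. A minimal
sketch of the arithmetic, with helper names made up for illustration:

```shell
#!/bin/sh
# Hypothetical helpers. The 4096-byte block size is an assumption that
# matches the figures in the dmesg output above.

# Whole filesystem blocks that fit in a device of the given size.
# Args: device size in bytes, block size in bytes.
device_blocks() {
    echo $(( $1 / $2 ))
}

# Blocks the superblock claims beyond the end of the device.
# Args: superblock block count, device block count.
excess_blocks() {
    echo $(( $1 - $2 ))
}

dev_blocks=$(device_blocks 2839999799296 4096)
echo "device blocks: $dev_blocks"
echo "excess blocks: $(excess_blocks 693359344 "$dev_blocks")"
```

Under that assumption the overhang is 18 blocks, or 72 KiB.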

My attempts to re-size the filesystem did not work. I don't have the
error messages available. Getting the system up and running was more
important at the time.

Apart from "backup and re-create", how can I fix this? What would be
the correct options to use with resize2fs (if that is the correct
approach)? fsck gave me some serious warnings about possibly
destroying the filesystem, so I did not want to do this without
advice.

Simon


2016-12-09 20:29:55

by Andreas Dilger

Subject: Re: Filesystem size problem.

On Dec 8, 2016, at 10:40 PM, Simon Matthews <[email protected]> wrote:
>
> I have an ext3 filesystem that will not mount under newer versions of
> the kernel and I hope someone here can help.
>
> Obviously, one solution is "backup and re-create from scratch". I have
> the backups, but I hope that there may be a quicker method to fix the
> issues.
>
> The root issue is that the filesystem is very slightly smaller than
> the allocated space. The filesystem exists on a MDRAID device and I
> think that when I converted the MDRAID to a newer metadata version, it
> truncated the available size, slightly. However, how I got here isn't
> really important, fixing it now is.

Running "e2fsck -fy" should fix this. I'd recommend to use the latest
version of e2fsck.

Cheers, Andreas

>
> With a slightly older kernel (4.0.5), the filesystem can be mounted.
> With 4.4.26, the ext3 support is provided by the ext4 subsystem and it
> appears that it will not accept the size issues. dmesg showed this
> from the mount attempt:
>
> md5: detected capacity change from 0 to 2839999799296
> [ 1162.508338] EXT4-fs (md5): mounting ext3 file system using the ext4 subsystem
> [ 1162.508560] EXT4-fs (md5): bad geometry: block count 693359344
> exceeds size of device (693359326 blocks)
>
> As I stated, the difference is very small, so it was working OK for a long time.
>
> My attempts to re-size the filesystem did not work. I don't have the
> error messages available. Getting the system up and running was more
> important at the time.
>
> Apart from "backup and re-create", how can I fix this? What would be
> the correct options to use with resize2fs (if that is the correct
> approach)? fsck gave me some serious warnings about possibly
> destroying the filesystem, so I did not want to do this without
> advice.
>
> Simon
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html


Cheers, Andreas







2016-12-10 02:18:48

by Simon Matthews

Subject: Re: Filesystem size problem.

On Fri, Dec 9, 2016 at 12:29 PM, Andreas Dilger <[email protected]> wrote:
> On Dec 8, 2016, at 10:40 PM, Simon Matthews <[email protected]> wrote:
>>
>> I have an ext3 filesystem that will not mount under newer versions of
>> the kernel and I hope someone here can help.
>>
>> Obviously, one solution is "backup and re-create from scratch". I have
>> the backups, but I hope that there may be a quicker method to fix the
>> issues.
>>
>> The root issue is that the filesystem is very slightly smaller than
>> the allocated space. The filesystem exists on a MDRAID device and I
>> think that when I converted the MDRAID to a newer metadata version, it
>> truncated the available size, slightly. However, how I got here isn't
>> really important, fixing it now is.
>
> Running "e2fsck -fy" should fix this. I'd recommend to use the latest
> version of e2fsck.

The system has v1.42.13 installed. Is that recent enough?

Simon


>
> Cheers, Andreas
>
>>
>> With a slightly older kernel (4.0.5), the filesystem can be mounted.
>> With 4.4.26, the ext3 support is provided by the ext4 subsystem and it
>> appears that it will not accept the size issues. dmesg showed this
>> from the mount attempt:
>>
>> md5: detected capacity change from 0 to 2839999799296
>> [ 1162.508338] EXT4-fs (md5): mounting ext3 file system using the ext4 subsystem
>> [ 1162.508560] EXT4-fs (md5): bad geometry: block count 693359344
>> exceeds size of device (693359326 blocks)
>>
>> As I stated, the difference is very small, so it was working OK for a long time.
>>
>> My attempts to re-size the filesystem did not work. I don't have the
>> error messages available. Getting the system up and running was more
>> important at the time.
>>
>> Apart from "backup and re-create", how can I fix this? What would be
>> the correct options to use with resize2fs (if that is the correct
>> approach)? fsck gave me some serious warnings about possibly
>> destroying the filesystem, so I did not want to do this without
>> advice.
>>
>> Simon
>
>
> Cheers, Andreas

2016-12-10 04:35:08

by Eric Sandeen

Subject: Re: Filesystem size problem.

On 12/9/16 2:29 PM, Andreas Dilger wrote:
> On Dec 8, 2016, at 10:40 PM, Simon Matthews <[email protected]> wrote:
>>
>> I have an ext3 filesystem that will not mount under newer versions of
>> the kernel and I hope someone here can help.
>>
>> Obviously, one solution is "backup and re-create from scratch". I have
>> the backups, but I hope that there may be a quicker method to fix the
>> issues.
>>
>> The root issue is that the filesystem is very slightly smaller than
>> the allocated space.

So the device is now smaller than the filesystem thinks, right?

> The filesystem exists on a MDRAID device and I
>> think that when I converted the MDRAID to a newer metadata version, it
>> truncated the available size, slightly. However, how I got here isn't
>> really important, fixing it now is.
>
> Running "e2fsck -fy" should fix this. I'd recommend to use the latest
> version of e2fsck.

Really? e2fsck can change total blocks in the superblock to accommodate a
shrunken device? That's a new one for me...

I don't think so:

$ truncate --size=101m testfile
$ mkfs.ext3 testfile
$ truncate --size=100m testfile
$ e2fsck -f testfile
...
The physical size of the device is 102400 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort<y>? n
...
$ e2fsck -f testfile
...
The physical size of the device is 102400 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort<y>? n
$ e2fsck -f testfile
...
The physical size of the device is 102400 blocks
Either the superblock or the partition table is likely to be corrupt!
Abort<y>? n

etc.


The proper solution is to fix your block device, not the filesystem; it was
the block device which was inappropriately shortened.

I don't know if just poking a smaller total blocks number into the superblock
via debugfs would be safe or not.

-Eric

> Cheers, Andreas


2016-12-10 05:27:08

by Simon Matthews

Subject: Re: Filesystem size problem.

On Fri, Dec 9, 2016 at 8:35 PM, Eric Sandeen <[email protected]> wrote:
> On 12/9/16 2:29 PM, Andreas Dilger wrote:
>> On Dec 8, 2016, at 10:40 PM, Simon Matthews <[email protected]> wrote:
>>>
>>> I have an ext3 filesystem that will not mount under newer versions of
>>> the kernel and I hope someone here can help.
>>>
>>> Obviously, one solution is "backup and re-create from scratch". I have
>>> the backups, but I hope that there may be a quicker method to fix the
>>> issues.
>>>
>>> The root issue is that the filesystem is very slightly smaller than
>>> the allocated space.
>
> So the device is now smaller than the filesystem thinks, right?

Yes, I got that the wrong way round in my original email. The device
is very slightly smaller than the filesystem.


Simon

2016-12-12 22:36:32

by Andreas Dilger

Subject: Re: Filesystem size problem.

On Dec 9, 2016, at 9:35 PM, Eric Sandeen <[email protected]> wrote:
>
> On 12/9/16 2:29 PM, Andreas Dilger wrote:
>> On Dec 8, 2016, at 10:40 PM, Simon Matthews <[email protected]> wrote:
>>>
>>> I have an ext3 filesystem that will not mount under newer versions of
>>> the kernel and I hope someone here can help.
>>>
>>> Obviously, one solution is "backup and re-create from scratch". I have
>>> the backups, but I hope that there may be a quicker method to fix the
>>> issues.
>>>
>>> The root issue is that the filesystem is very slightly smaller than
>>> the allocated space.
>
> So the device is now smaller than the filesystem thinks, right?
>
>> The filesystem exists on a MDRAID device and I
>>> think that when I converted the MDRAID to a newer metadata version, it
>>> truncated the available size, slightly. However, how I got here isn't
>>> really important, fixing it now is.
>>
>> Running "e2fsck -fy" should fix this. I'd recommend to use the latest
>> version of e2fsck.
>
> Really? e2fsck can change total blocks in the superblock to accommodate a
> shrunken device? That's a new one for me...

Strange, I thought this case was handled properly by e2fsck.

You could probably fix this with:

# debugfs -R "ssv blocks_count 693359326" /dev/md5
# e2fsck -f /dev/md5

to set the blocks count. It is unlikely anything is in the last 18 blocks
of the filesystem, and if it is then it is probably already corrupted by
the RAID superblock stored there.
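
Before writing anything to the superblock, the mismatch can be confirmed
read-only. The sketch below assumes the "Block count:" and "Block size:"
lines that `dumpe2fs -h` prints and the byte size from
`blockdev --getsize64`; the function names are made up for illustration,
and I have not checked how dumpe2fs behaves on this particular device.

```shell
#!/bin/sh
# Read-only sketch; function names are hypothetical.

# Print the value of one "Key: value" field from dumpe2fs -h output on stdin.
sb_field() {
    awk -F: -v key="$1" '$1 == key { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Print how many blocks the superblock claims beyond the device end
# (zero or negative means the filesystem fits).
# Args: superblock header text, device size in bytes.
overhang() {
    count=$(printf '%s\n' "$1" | sb_field "Block count")
    bsize=$(printf '%s\n' "$1" | sb_field "Block size")
    echo $(( count - $2 / bsize ))
}

# Usage on a real system (as root, device path taken from this thread):
#   overhang "$(dumpe2fs -h /dev/md5 2>/dev/null)" "$(blockdev --getsize64 /dev/md5)"
```

If the result matches the difference reported in dmesg, the new
blocks_count value is the superblock count minus that overhang.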

>
> I don't think so:
>
> $ truncate --size=101m testfile
> $ mkfs.ext3 testfile
> $ truncate --size=100m testfile
> $ e2fsck -f testfile
> ...
> The physical size of the device is 102400 blocks
> Either the superblock or the partition table is likely to be corrupt!
> Abort<y>? n
> ...
> $ e2fsck -f testfile
> ...
> The physical size of the device is 102400 blocks
> Either the superblock or the partition table is likely to be corrupt!
> Abort<y>? n
> $ e2fsck -f testfile
> ...
> The physical size of the device is 102400 blocks
> Either the superblock or the partition table is likely to be corrupt!
> Abort<y>? n
>
> etc.
>
>
> The proper solution is to fix your block device, not the filesystem;
> it was the block device which was inappropriately shortened.

This may be more easily said than done...

> I don't know if just poking a smaller total blocks number into the
> superblock via debugfs would be safe or not.

It would probably be better to have e2fsck fix this problem itself,
but it is uncommon enough that there is a danger someone will also
shoot themselves in the foot for cases where this isn't working right.

Cheers, Andreas







2016-12-13 02:48:46

by Simon Matthews

Subject: Re: Filesystem size problem.

On Mon, Dec 12, 2016 at 2:36 PM, Andreas Dilger <[email protected]> wrote:
> On Dec 9, 2016, at 9:35 PM, Eric Sandeen <[email protected]> wrote:
>>
>> On 12/9/16 2:29 PM, Andreas Dilger wrote:
>>> On Dec 8, 2016, at 10:40 PM, Simon Matthews <[email protected]> wrote:
>>>>
>>>> I have an ext3 filesystem that will not mount under newer versions of
>>>> the kernel and I hope someone here can help.
>>>>
>>>> Obviously, one solution is "backup and re-create from scratch". I have
>>>> the backups, but I hope that there may be a quicker method to fix the
>>>> issues.
>>>>
>>>> The root issue is that the filesystem is very slightly smaller than
>>>> the allocated space.
>>
>> So the device is now smaller than the filesystem thinks, right?
>>
>>> The filesystem exists on a MDRAID device and I
>>>> think that when I converted the MDRAID to a newer metadata version, it
>>>> truncated the available size, slightly. However, how I got here isn't
>>>> really important, fixing it now is.
>>>
>>> Running "e2fsck -fy" should fix this. I'd recommend to use the latest
>>> version of e2fsck.
>>
>> Really? e2fsck can change total blocks in the superblock to accommodate a
>> shrunken device? That's a new one for me...
>
> Strange, I thought this case was handled properly by e2fsck.
>
> You could probably fix this with:
>
> # debugfs -R "ssv blocks_count 693359326" /dev/md5



"probably"?

How safe or dangerous is this? Does the filesystem have to be unmounted first?

Simon

2016-12-13 20:55:54

by Andreas Dilger

Subject: Re: Filesystem size problem.

On Dec 12, 2016, at 7:48 PM, Simon Matthews <[email protected]> wrote:
>
> On Mon, Dec 12, 2016 at 2:36 PM, Andreas Dilger <[email protected]> wrote:
>> On Dec 9, 2016, at 9:35 PM, Eric Sandeen <[email protected]> wrote:
>>>
>>> On 12/9/16 2:29 PM, Andreas Dilger wrote:
>>>> On Dec 8, 2016, at 10:40 PM, Simon Matthews <[email protected]> wrote:
>>>>>
>>>>> I have an ext3 filesystem that will not mount under newer versions of
>>>>> the kernel and I hope someone here can help.
>>>>>
>>>>> Obviously, one solution is "backup and re-create from scratch". I have
>>>>> the backups, but I hope that there may be a quicker method to fix the
>>>>> issues.
>>>>>
>>>>> The root issue is that the filesystem is very slightly smaller than
>>>>> the allocated space.
>>>
>>> So the device is now smaller than the filesystem thinks, right?
>>>
>>>>> The filesystem exists on a MDRAID device and I think that when I
>>>>> converted the MDRAID to a newer metadata version, it truncated the
>>>>> available size, slightly. However, how I got here isn't really
>>>>> important, fixing it now is.
>>>>
>>>> Running "e2fsck -fy" should fix this. I'd recommend to use the latest
>>>> version of e2fsck.
>>>
>>> Really? e2fsck can change total blocks in the superblock to accommodate
>>> a shrunken device? That's a new one for me...
>>
>> Strange, I thought this case was handled properly by e2fsck.
>>
>> You could probably fix this with:
>>
>> # debugfs -w -R "ssv blocks_count 693359326" /dev/md5
>
>
> "probably"?
>
> How safe or dangerous is this? Does the filesystem have to be unmounted first?

The filesystem *definitely* needs to be unmounted first.
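
A small guard along these lines could be run before touching the device.
It is only a sketch: the function name is hypothetical, and it assumes
the device appears in /proc/mounts under the same path you pass in, which
may not hold if it was mounted through a symlink or by label.

```shell
#!/bin/sh
# Hypothetical guard: succeeds if the first field of a mounts table
# (default /proc/mounts) exactly matches the given device path.
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Usage: refuse to continue while the device is mounted.
#   is_mounted /dev/md5 && { echo "unmount /dev/md5 first" >&2; exit 1; }
```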

I wouldn't classify this change as being super dangerous, because it is
only removing a few blocks from the end of the filesystem, and e2fsck
should handle inodes that reference blocks beyond the end of the filesystem
the same way as any other corrupt blocks. I don't think that is likely to happen in this
case, unless your filesystem is extremely full, since extN filesystems
front-end bias allocations to the faster part of the storage device.

That said, I haven't tested this process[*], and if you are concerned that
it may eat your data (that is always possible) you should make a backup.
You should probably make a backup even if you aren't going to do this, as
that is always a good idea. As with any free advice you get on the internet,
YMMV, and the final decision is up to you.

The other option is to make a new filesystem on a second set of storage
and then copy the old files over. That also has benefits that the old
filesystem acts as your backup, you get any new features enabled in ext4
when the filesystem is newly formatted, and the files will likely be laid
out on disk contiguously during the copy, so it will defragment the
filesystem (not that ext4 needs this very much).

PS: I added "-w" to the debugfs command above, or it would have failed

Cheers, Andreas

[*] I did just try this on a test filesystem and it worked OK for me:

[root@mookie ~]# dumpe2fs -h /dev/dm-53 | grep "Block count"
dumpe2fs 1.42.13.wc5 (15-Apr-2016)
Block count: 2621440
[root@mookie ~]# debugfs -w -R "ssv blocks_count 2621400" /dev/dm-53
debugfs 1.42.13.wc5 (15-Apr-2016)
[root@mookie ~]# e2fsck -fn /dev/dm-53
e2fsck 1.42.13.wc5 (15-Apr-2016)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Free blocks count wrong for group #79 (26047, counted=26007).
Fix? no

Free blocks count wrong (2226761, counted=2226721).
Fix? no

Padding at end of block bitmap is not set. Fix? no


myth_2-MDT0000: ********** WARNING: Filesystem still has errors **********

myth_2-MDT0000: 1010990/2621440 files (0.1% non-contiguous), 394639/2621400 blocks
[root@mookie ~]# e2fsck -fp /dev/dm-53
myth_2-MDT0000: Padding at end of block bitmap is not set. FIXED.
myth_2-MDT0000: 1010990/2621440 files (0.1% non-contiguous), 394679/2621400 blocks
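
The numbers in this test output are consistent with each other: 40 blocks
were removed, the free counts for the last group and for the whole
filesystem each drop by 40, and the used count in each summary line is
the block total minus the free count in effect at that point. A sketch of
the check, with a hypothetical helper name and an assumed 32768 blocks
per group (not shown in the output):

```shell
#!/bin/sh
# Arithmetic check on the e2fsck output above; helper name is hypothetical.

# Used blocks = total blocks minus free blocks.
used_blocks() {
    echo $(( $1 - $2 ))
}

echo "blocks removed:        $(( 2621440 - 2621400 ))"
echo "used, stale free count: $(used_blocks 2621400 2226761)"   # e2fsck -fn summary
echo "used, fixed free count: $(used_blocks 2621400 2226721)"   # e2fsck -fp summary
# Assumes 32768 blocks per group, which puts the tail in group #79.
echo "last group number:     $(( 2621440 / 32768 - 1 ))"
```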







2016-12-14 01:43:58

by Simon Matthews

Subject: Re: Filesystem size problem.

On Tue, Dec 13, 2016 at 12:48 PM, Andreas Dilger <[email protected]> wrote:
> On Dec 12, 2016, at 7:48 PM, Simon Matthews <[email protected]> wrote:
>>
>> On Mon, Dec 12, 2016 at 2:36 PM, Andreas Dilger <[email protected]> wrote:
>>> On Dec 9, 2016, at 9:35 PM, Eric Sandeen <[email protected]> wrote:
>>>>
>>>> On 12/9/16 2:29 PM, Andreas Dilger wrote:
>>>>> On Dec 8, 2016, at 10:40 PM, Simon Matthews <[email protected]> wrote:
>>>>>>
>>>>>> I have an ext3 filesystem that will not mount under newer versions of
>>>>>> the kernel and I hope someone here can help.
>>>>>>
>>>>>> Obviously, one solution is "backup and re-create from scratch". I have
>>>>>> the backups, but I hope that there may be a quicker method to fix the
>>>>>> issues.
>>>>>>
>>>>>> The root issue is that the filesystem is very slightly smaller than
>>>>>> the allocated space.
>>>>
>>>> So the device is now smaller than the filesystem thinks, right?
>>>>
>>>>>> The filesystem exists on a MDRAID device and I think that when I
>>>>>> converted the MDRAID to a newer metadata version, it truncated the
>>>>>> available size, slightly. However, how I got here isn't really
>>>>>> important, fixing it now is.
>>>>>
>>>>> Running "e2fsck -fy" should fix this. I'd recommend to use the latest
>>>>> version of e2fsck.
>>>>
>>>> Really? e2fsck can change total blocks in the superblock to accommodate
>>>> a shrunken device? That's a new one for me...
>>>
>>> Strange, I thought this case was handled properly by e2fsck.
>>>
>>> You could probably fix this with:
>>>
>>> # debugfs -w -R "ssv blocks_count 693359326" /dev/md5
>>
>>
>> "probably"?
>>
>> How safe or dangerous is this? Does the filesystem have to be unmounted first?
>
> The filesystem *definitely* needs to be unmounted first.
>
> I wouldn't classify this change as being super dangerous, because it is
> only removing a few blocks from the end of the filesystem, and e2fsck
> should handle the case where inodes reference blocks beyond EOFS as any
> other corrupt blocks. I don't think that is likely to happen in this
> case, unless your filesystem is extremely full, since extN filesystems
> front-end bias allocations to the faster part of the storage device.
>
> That said, I haven't tested this process[*], and if you are concerned that
> it may eat your data (that is always possible) you should make a backup.
> You should probably make a backup even if you aren't going to do this, as
> that is always a good idea. As with any free advice you get on the internet,
> YMMV, and the final decision is up to you.
>
> The other option is to make a new filesystem on a second set of storage
> and then copy the old files over. That also has benefits that the old
> filesystem acts as your backup, you get any new features enabled in ext4
> when the filesystem is newly formatted, and the files will likely be laid
> out on disk contiguously during the copy, so it will defragment the
> filesystem (not that ext4 needs this very much).
>
> PS: I added "-w" to the debugfs command above, or it would have failed

Thanks for this. I think that the best solution is to get new drives
and build a new ext4 filesystem on those.

It's always best to know that I have options.

We are using nfsv3 because performance of nfsv4 was terrible. Do you
have any idea if nfsv4 will work better with ext4?

Simon

>
> Cheers, Andreas
>
> [*] I did just try this on a test filesystem and it worked OK for me:
>
> [root@mookie ~]# dumpe2fs -h /dev/dm-53 | grep "Block count"
> dumpe2fs 1.42.13.wc5 (15-Apr-2016)
> Block count: 2621440
> [root@mookie ~]# debugfs -w -R "ssv blocks_count 2621400" /dev/dm-53
> debugfs 1.42.13.wc5 (15-Apr-2016)
> [root@mookie ~]# e2fsck -fn /dev/dm-53
> e2fsck 1.42.13.wc5 (15-Apr-2016)
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> Free blocks count wrong for group #79 (26047, counted=26007).
> Fix? no
>
> Free blocks count wrong (2226761, counted=2226721).
> Fix? no
>
> Padding at end of block bitmap is not set. Fix? no
>
>
> myth_2-MDT0000: ********** WARNING: Filesystem still has errors **********
>
> myth_2-MDT0000: 1010990/2621440 files (0.1% non-contiguous), 394639/2621400 blocks
> [root@mookie ~]# e2fsck -fp /dev/dm-53
> myth_2-MDT0000: Padding at end of block bitmap is not set. FIXED.
> myth_2-MDT0000: 1010990/2621440 files (0.1% non-contiguous), 394679/2621400 blocks