2012-11-19 18:45:46

by Toralf Förster

Subject: ext4 fs errors - immediately reboot needed ?

/me wonders whether the following messages should force me to *immediately*
reboot the system, or whether I should schedule a reboot in the near future:

2012-11-19T19:23:53.099+01:00 n22 kernel: EXT4-fs error (device sdb3) in ext4_new_inode:933: IO failure
2012-11-19T19:24:04.971+01:00 n22 kernel: EXT4-fs error (device sdb3): ext4_mb_generate_buddy:741: group 409, 11806 clusters in bitmap, 11805 in gd
2012-11-19T19:24:04.971+01:00 n22 kernel: JBD2: Spotted dirty metadata buffer (dev = sdb3, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
2012-11-19T19:37:20.188+01:00 n22 kernel: EXT4-fs error (device sdb3) in ext4_new_inode:933: IO failure
2012-11-19T19:44:28.552+01:00 n22 kernel: EXT4-fs error (device sdb3): ext4_mb_generate_buddy:741: group 424, 27235 clusters in bitmap, 27234 in gd
2012-11-19T19:44:28.552+01:00 n22 kernel: JBD2: Spotted dirty metadata buffer (dev = sdb3, blocknr = 0). There's a risk of filesystem corruption in case of system crash.
2012-11-19T19:44:33.734+01:00 n22 kernel: EXT4-fs error (device sdb3): mb_free_blocks:1300: group 424, block 13916749:freeing already freed block (bit 23117)
2012-11-19T19:44:33.734+01:00 n22 kernel: EXT4-fs error (device sdb3): mb_free_blocks:1300: group 409, block 13431904:freeing already freed block (bit 29792)

--
MfG/Sincerely
Toralf Förster
pgp finger print: 7B1A 07F4 EC82 0F90 D4C2 8936 872A E508 7DB6 9DA3


2012-11-19 19:31:37

by Theodore Ts'o

Subject: Re: ext4 fs errors - immediately reboot needed ?

On Mon, Nov 19, 2012 at 07:45:41PM +0100, Toralf Förster wrote:
> /me wonders whether the following messages should force me to *immediately*
> reboot the system, or whether I should schedule a reboot in the near future:

In general, if you're not sure, the answer is yes....

> 2012-11-19T19:37:20.188+01:00 n22 kernel: EXT4-fs error (device sdb3) in ext4_new_inode:933: IO failure

This means that you have some kind of hardware or media error,
probably in an inode bitmap block. There are some planned patches to
try to isolate out failures when reading from a block or inode bitmap,
but they aren't in mainline yet.

> 2012-11-19T19:44:33.734+01:00 n22 kernel: EXT4-fs error (device sdb3): mb_free_blocks:1300: group 424, block 13916749:freeing already freed block (bit 23117)

This means you have a corrupted block bitmap block. When deleting a
file, we found that some data blocks that should have been marked as
in use were already marked as free. That in itself is not dangerous,
but if you have this problem, it's likely that you also have blocks
belonging to files that still exist which should be marked as in use
but are marked as free. The reason why this is dangerous is that ext4
might decide to allocate a block which is still in use, which would
lead to data loss when that block is reused.
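
The hazard described above can be sketched with a toy model. This is
only an illustration, not ext4's allocator: the class ToyBlockBitmap
and the two helper functions are hypothetical names, and the "bitmap"
is just a Python list where True means "in use". One case shows the
harmless symptom (freeing a block whose bit is already clear), the
other shows the dangerous one (a block still owned by a file being
handed out a second time).

```python
class ToyBlockBitmap:
    """Toy stand-in for a block bitmap: True means the block is in use."""

    def __init__(self, nblocks):
        self.in_use = [False] * nblocks

    def allocate(self):
        # Hand out the first block the bitmap believes is free.
        for blk, used in enumerate(self.in_use):
            if not used:
                self.in_use[blk] = True
                return blk
        raise RuntimeError("no free blocks")

    def free(self, blk):
        # If the bit is already clear, report it instead of freeing again.
        if not self.in_use[blk]:
            return f"freeing already freed block (bit {blk})"
        self.in_use[blk] = False
        return None


def _corrupted_setup():
    bitmap = ToyBlockBitmap(8)
    file_a = [bitmap.allocate() for _ in range(3)]  # file_a owns blocks 0, 1, 2
    bitmap.in_use[1] = False  # simulated corruption: bit lost while still owned
    return bitmap, file_a


def harmless_case():
    """Delete file_a: the stale bit is only noticed and reported."""
    bitmap, file_a = _corrupted_setup()
    return [msg for msg in (bitmap.free(b) for b in file_a) if msg]


def dangerous_case():
    """Allocate for a new file first: it receives a block file_a still owns."""
    bitmap, file_a = _corrupted_setup()
    file_b = [bitmap.allocate()]
    return set(file_a) & set(file_b)


if __name__ == "__main__":
    print(harmless_case())
    print(dangerous_case())
```

In the second case both files end up pointing at the same block, so
whichever one writes last silently overwrites the other's data.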

Some future patches might make it safe(r) to continue after seeing
this error (by preventing any allocations from that block group), but
even then, you're probably going to be better off forcing an fsck run
as soon as possible. For now, I would strongly recommend it.
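
To decide which filesystems need that fsck, it can help to pull the
affected devices out of the log. The following is a minimal sketch,
assuming the syslog line format quoted in this thread; the function
name summarize_ext4_errors and the regular expression are my own
illustration, not an existing tool, and other kernel versions may
format these messages differently.

```python
import re

# Matches both message shapes seen in this thread:
#   EXT4-fs error (device sdb3) in ext4_new_inode:933: IO failure
#   EXT4-fs error (device sdb3): mb_free_blocks:1300: group 424, ...
EXT4_ERROR = re.compile(
    r"EXT4-fs error \(device (?P<dev>[^)]+)\)(?: in|:) "
    r"(?P<func>\w+):(?P<line>\d+): (?P<msg>.*)"
)


def summarize_ext4_errors(lines):
    """Map each device to the sorted list of functions that reported errors."""
    seen = {}
    for text in lines:
        match = EXT4_ERROR.search(text)
        if match:
            seen.setdefault(match.group("dev"), set()).add(match.group("func"))
    return {dev: sorted(funcs) for dev, funcs in seen.items()}


SAMPLE = [
    "2012-11-19T19:23:53.099+01:00 n22 kernel: EXT4-fs error (device sdb3) "
    "in ext4_new_inode:933: IO failure",
    "2012-11-19T19:24:04.971+01:00 n22 kernel: EXT4-fs error (device sdb3): "
    "ext4_mb_generate_buddy:741: group 409, 11806 clusters in bitmap, 11805 in gd",
    "2012-11-19T19:24:04.971+01:00 n22 kernel: JBD2: Spotted dirty metadata "
    "buffer (dev = sdb3, blocknr = 0).",
    "2012-11-19T19:44:33.734+01:00 n22 kernel: EXT4-fs error (device sdb3): "
    "mb_free_blocks:1300: group 424, block 13916749:freeing already freed "
    "block (bit 23117)",
]

if __name__ == "__main__":
    for dev, funcs in summarize_ext4_errors(SAMPLE).items():
        print(dev, ", ".join(funcs))
```

Any device that shows up in the result is a candidate for an fsck run
at the next opportunity.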

Regards,

- Ted