2016-04-28 09:44:16

by Lay, Kuan Loon

Subject: EXT4 bad block

Hi,

I am seeing random bad block errors on different files. The message looks like "EXT4-fs error (device mmcblk0p14): ext4_xattr_block_get:298: inode #77: comm (syslogd): bad block 7288".

I am using mke2fs 1.43-WIP (18-May-2015), and it printed this message: "Suggestion: Use Linux kernel >= 3.18 for improved stability of the metadata and journal checksum features."

My current kernel version is 3.14.55. Which patches do I need to backport to fix the bad block issue?

Thanks

Best Regards,
Lay
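
When collecting messages like the one quoted above from a kernel log, it can help to split them into fields (device, reporting function, inode, process, block) so that repeated errors can be compared. The following is only a minimal sketch: the pattern is modelled on the single message quoted in this thread, and the names EXT4_ERROR and parse_ext4_error are made up for illustration. Other ext4 error messages may be laid out differently and would not match.

```python
import re

# Pattern modelled only on the one message quoted in this thread;
# it is an assumption that other "bad block" reports share this layout.
EXT4_ERROR = re.compile(
    r"EXT4-fs error \(device (?P<device>[^)]+)\): "
    r"(?P<function>\w+):(?P<line>\d+): "
    r"inode #(?P<inode>\d+): "
    r"comm (?P<comm>.+?): "
    r"bad block (?P<block>\d+)"
)


def parse_ext4_error(text):
    """Return a dict of fields from an EXT4-fs bad block message, or None."""
    match = EXT4_ERROR.search(text)
    if match is None:
        return None
    fields = match.groupdict()
    # Convert the numeric fields so they can be sorted or compared.
    for key in ("line", "inode", "block"):
        fields[key] = int(fields[key])
    return fields


sample = (
    "EXT4-fs error (device mmcblk0p14): ext4_xattr_block_get:298: "
    "inode #77: comm (syslogd): bad block 7288"
)
print(parse_ext4_error(sample))
```

Grouping the parsed results by inode or block number would show whether the errors hit the same blocks repeatedly or are spread across the partition.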


2016-04-28 14:10:00

by Alan Cox

Subject: Re: EXT4 bad block

On Thu, 28 Apr 2016 09:44:16 +0000
"Lay, Kuan Loon" <[email protected]> wrote:

> Hi,
>
> I am seeing random bad block errors on different files. The message looks like "EXT4-fs error (device mmcblk0p14): ext4_xattr_block_get:298: inode #77: comm (syslogd): bad block 7288".
>
> I am using mke2fs 1.43-WIP (18-May-2015), and it printed this message: "Suggestion: Use Linux kernel >= 3.18 for improved stability of the metadata and journal checksum features."
>
> My current kernel version is 3.14.55. Which patches do I need to backport to fix the bad block issue?

I think you could start by providing a lot more information about the
platform you see it on, the other messages logged, whether you've run
memtest86 on the machine, whether it is stable if you boot a 4.4 kernel
and so on.

Without that information I doubt anyone can help you with your bug.

Alan

2016-05-02 06:49:02

by Philipp Hahn

Subject: Re: EXT4 bad block - ext4_xattr_block_get

Hello,

On 28.04.2016 at 11:44, Lay, Kuan Loon wrote:
> I am seeing random bad block errors on different files. The message looks like "EXT4-fs error (device mmcblk0p14): ext4_xattr_block_get:298: inode #77: comm (syslogd): bad block 7288".

Interesting; I posted a similar bug report on 2016-04-19, titled
"[BUG 4.1.11] EXT4-fs error: ext4_xattr_block_get:299 - Remounting
filesystem read-only".

I never got a reply.

> I am using mke2fs 1.43-WIP (18-May-2015), and it printed this message: "Suggestion: Use Linux kernel >= 3.18 for improved stability of the metadata and journal checksum features."
>
> My current kernel version is 3.14.55. Which patches do I need to backport to fix the bad block issue?

That one happened on 4.1.11 on a virtual machine running inside
VMware-ESX after a hardware change. The last change was to disable the
pvscsi drivers again; the system has been running fine for a week,
but the first time it took a month to notice the corruption, so we are
not yet sure that the problem is solved.

Philipp

2016-05-09 01:52:18

by Lay, Kuan Loon

Subject: RE: EXT4 bad block - ext4_xattr_block_get

Hi,

I no longer get the bad block message after disabling metadata_csum.

Best Regards,
Lay
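
To confirm whether metadata_csum is actually enabled on a filesystem, one option is to read the "Filesystem features:" line from the superblock header printed by `dumpe2fs -h`. The sketch below assumes that output format; the helper names parse_features and has_metadata_csum are hypothetical, and the sample header text is made up for illustration rather than taken from the affected system. Running dumpe2fs typically requires root.

```python
import subprocess


def parse_features(header_text):
    """Extract the feature list from `dumpe2fs -h` style header text."""
    for line in header_text.splitlines():
        if line.startswith("Filesystem features:"):
            # Everything after the first colon is a space-separated list.
            return line.split(":", 1)[1].split()
    return []


def has_metadata_csum(device):
    """Run `dumpe2fs -h` on a device and check for metadata_csum.

    Assumes dumpe2fs is installed and the caller may read the device.
    """
    result = subprocess.run(
        ["dumpe2fs", "-h", device],
        capture_output=True,
        text=True,
        check=True,
    )
    return "metadata_csum" in parse_features(result.stdout)


# Illustrative header text only, not output from the reporter's system.
sample_header = (
    "Filesystem volume name:   <none>\n"
    "Filesystem features:      has_journal ext_attr extent metadata_csum\n"
)
print("metadata_csum" in parse_features(sample_header))
```

On the system from this thread, the call would be along the lines of has_metadata_csum("/dev/mmcblk0p14"), with the device path adjusted to wherever the partition appears.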
