LinuxLists.cc - Re: Mount -o sync regression in 2.6.31

2009-11-13 23:07:24

Subject: Re: Mount -o sync regression in 2.6.31

On Fri, 13 Nov 2009 14:47:27 -0800
Stephen Hemminger <[email protected]> wrote:

> Okay the scenario is:
> 1. Mount floppy with ext2 filesystem in sync mode
> 2. Write to floppy
> 3. Prematurely eject floppy
> 4. Unmount floppy

I assume that by "prematurely eject", you mean that you waited until
the activity LED had turned off, then hit the button without typing
umount?

Really, I think ext2 is just wrong here. At unmount time it runs
ext2_sync_super(), which updates a few fields in the superblock and
tries to write them back.

But if the fs was mounted with `-osync' then the driver should be
operating synchronously, dammit! That means that any time any syscall
alters the filesystem's s_free_blocks_count, s_free_inodes_count or
s_wtime, the driver should write those fields back to the superblock
prior to returning from the syscall.

However that might be unreasonaly anal of us, because those fields
aren't terribly important. A fsck will whine and fix them up.

> This triggers this warning in 2.6.31 which did not occur in 2.6.30

hm, I wonder why that changed. The warning's been there for a while.

> [ 580.325447] floppy0: disk absent or changed during operation
> [ 580.325454] lost page write due to I/O error on fd0
> [ 580.958434] ------------[ cut here ]------------
> [ 580.958437] WARNING: at fs/buffer.c:1152 mark_buffer_dirty+0x23/0x7d()
> [ 580.958440] Hardware name: System Product Name
> [ 580.958442] Modules linked in: hid_belkin binfmt_misc bridge stp llc kvm_intel kvm ipv6 video output pci_slot acpi_cpufreq cpufreq_powersave cpufreq_ondemand cpufreq_conservative cpufreq_userspace psmouse pcspkr serio_raw rtc_cmos wmi evdev sg sr_mod cdrom usbhid mvsas pata_acpi pata_marvell ata_piix libsas ahci igb sky2 scsi_transport_sas dca e1000e floppy dm_mirror dm_region_hash dm_log dm_snapshot dm_mod fuse
> [ 580.958475] Pid: 8340, comm: umount Not tainted 2.6.30-rc6-bisect #12
> [ 580.958477] Call Trace:
> [ 580.958483] [<ffffffff80232ebb>] ? warn_slowpath_fmt+0xd9/0x10b
> [ 580.958489] [<ffffffff80270031>] ? find_get_pages_tag+0x41/0xdd
> [ 580.958494] [<ffffffff802782f0>] ? pagevec_lookup+0x17/0x1e
> [ 580.958498] [<ffffffff80279377>] ? truncate_inode_pages_range+0x21d/0x31d
> [ 580.958502] [<ffffffff80270db5>] ? wait_on_page_writeback_range+0xce/0x11b
> [ 580.958507] [<ffffffff80245037>] ? bit_waitqueue+0x10/0xa0
> [ 580.958510] [<ffffffff802be80d>] ? mark_buffer_dirty+0x23/0x7d
> [ 580.958515] [<ffffffff803003eb>] ? ext2_sync_super+0x39/0x55
> [ 580.958519] [<ffffffff8030079c>] ? ext2_put_super+0x34/0xc7
> [ 580.958524] [<ffffffff802a3174>] ? generic_shutdown_super+0x70/0xfd
> [ 580.958533] [<ffffffff802a3223>] ? kill_block_super+0x22/0x3a
> [ 580.958537] [<ffffffff802a3831>] ? deactivate_super+0x43/0x5a
> [ 580.958541] [<ffffffff802b56f9>] ? sys_umount+0x2d8/0x307
> [ 580.958546] [<ffffffff8020aeeb>] ? system_call_fastpath+0x16/0x1b
> [ 580.958549] ---[ end trace ccbde6411acd243f ]---
> [ 580.958580] floppy0: disk absent or changed during operation
> [ 580.958587] lost page write due to I/O error on fd0

void mark_buffer_dirty(struct buffer_head *bh)
{
WARN_ON_ONCE(!buffer_uptodate(bh));

that's expected, I expect. And it's just a WARN_ON_ONCE() so it
won't be flooding logs. We could just delete it I suppose.

> Bisecting between 2.6.30 and 2.6.31-rc1 reduced it down to:
>
> 8f184f27300f66f6dcc8296c2dae7a1fbe8429c9 is first bad commit
> commit 8f184f27300f66f6dcc8296c2dae7a1fbe8429c9
> Author: Frederic Weisbecker <[email protected]>
> Date: Sat May 16 06:24:36 2009 +0200
>
> tracing/stat: replace linked list by an rbtree for sorting

Current score:

git: 4,559
kernel developers: 0

2009-11-13 23:28:21

by Stephen Hemminger

[permalink] [raw]

Subject: Re: Mount -o sync regression in 2.6.31

On Fri, 13 Nov 2009 15:07:19 -0800
Andrew Morton <[email protected]> wrote:

> On Fri, 13 Nov 2009 14:47:27 -0800
> Stephen Hemminger <[email protected]> wrote:
>
> > Okay the scenario is:
> > 1. Mount floppy with ext2 filesystem in sync mode
> > 2. Write to floppy
> > 3. Prematurely eject floppy
> > 4. Unmount floppy
>
> I assume that by "prematurely eject", you mean that you waited until
> the activity LED had turned off, then hit the button without typing
> umount?

yes. no activity, but not unmounted

>
> Really, I think ext2 is just wrong here. At unmount time it runs
> ext2_sync_super(), which updates a few fields in the superblock and
> tries to write them back.
>
> But if the fs was mounted with `-osync' then the driver should be
> operating synchronously, dammit! That means that any time any syscall
> alters the filesystem's s_free_blocks_count, s_free_inodes_count or
> s_wtime, the driver should write those fields back to the superblock
> prior to returning from the syscall.

Agreed, but it looks like it has always done that.

> However that might be unreasonaly anal of us, because those fields
> aren't terribly important. A fsck will whine and fix them up.
>
> > This triggers this warning in 2.6.31 which did not occur in 2.6.30
>
> hm, I wonder why that changed. The warning's been there for a while.
>
> > [ 580.325447] floppy0: disk absent or changed during operation
> > [ 580.325454] lost page write due to I/O error on fd0
> > [ 580.958434] ------------[ cut here ]------------
> > [ 580.958437] WARNING: at fs/buffer.c:1152 mark_buffer_dirty+0x23/0x7d()
> > [ 580.958440] Hardware name: System Product Name
> > [ 580.958442] Modules linked in: hid_belkin binfmt_misc bridge stp llc kvm_intel kvm ipv6 video output pci_slot acpi_cpufreq cpufreq_powersave cpufreq_ondemand cpufreq_conservative cpufreq_userspace psmouse pcspkr serio_raw rtc_cmos wmi evdev sg sr_mod cdrom usbhid mvsas pata_acpi pata_marvell ata_piix libsas ahci igb sky2 scsi_transport_sas dca e1000e floppy dm_mirror dm_region_hash dm_log dm_snapshot dm_mod fuse
> > [ 580.958475] Pid: 8340, comm: umount Not tainted 2.6.30-rc6-bisect #12
> > [ 580.958477] Call Trace:
> > [ 580.958483] [<ffffffff80232ebb>] ? warn_slowpath_fmt+0xd9/0x10b
> > [ 580.958489] [<ffffffff80270031>] ? find_get_pages_tag+0x41/0xdd
> > [ 580.958494] [<ffffffff802782f0>] ? pagevec_lookup+0x17/0x1e
> > [ 580.958498] [<ffffffff80279377>] ? truncate_inode_pages_range+0x21d/0x31d
> > [ 580.958502] [<ffffffff80270db5>] ? wait_on_page_writeback_range+0xce/0x11b
> > [ 580.958507] [<ffffffff80245037>] ? bit_waitqueue+0x10/0xa0
> > [ 580.958510] [<ffffffff802be80d>] ? mark_buffer_dirty+0x23/0x7d
> > [ 580.958515] [<ffffffff803003eb>] ? ext2_sync_super+0x39/0x55
> > [ 580.958519] [<ffffffff8030079c>] ? ext2_put_super+0x34/0xc7
> > [ 580.958524] [<ffffffff802a3174>] ? generic_shutdown_super+0x70/0xfd
> > [ 580.958533] [<ffffffff802a3223>] ? kill_block_super+0x22/0x3a
> > [ 580.958537] [<ffffffff802a3831>] ? deactivate_super+0x43/0x5a
> > [ 580.958541] [<ffffffff802b56f9>] ? sys_umount+0x2d8/0x307
> > [ 580.958546] [<ffffffff8020aeeb>] ? system_call_fastpath+0x16/0x1b
> > [ 580.958549] ---[ end trace ccbde6411acd243f ]---
> > [ 580.958580] floppy0: disk absent or changed during operation
> > [ 580.958587] lost page write due to I/O error on fd0
>
> void mark_buffer_dirty(struct buffer_head *bh)
> {
> WARN_ON_ONCE(!buffer_uptodate(bh));
>
> that's expected, I expect. And it's just a WARN_ON_ONCE() so it
> won't be flooding logs. We could just delete it I suppose.
>
> > Bisecting between 2.6.30 and 2.6.31-rc1 reduced it down to:
> >
> > 8f184f27300f66f6dcc8296c2dae7a1fbe8429c9 is first bad commit
> > commit 8f184f27300f66f6dcc8296c2dae7a1fbe8429c9
> > Author: Frederic Weisbecker <[email protected]>
> > Date: Sat May 16 06:24:36 2009 +0200
> >
> > tracing/stat: replace linked list by an rbtree for sorting
>
> Current score:
>
> git: 4,559
> kernel developers: 0

--

2009-11-17 00:04:49

by Stephen Hemminger

[permalink] [raw]

Subject: [PATCH] ext2: clear uptodate flag on super block I/O error

This fixes a WARN backtrace in mark_buffer_dirty() that occurs during
unmount when a USB or floppy device is removed. I reported this a kernel
regression, but looks like it might have been there for longer
than that.

The super block update from a previous operation has marked the buffer
as in error, and the flag has to be cleared before doing the update.
(Similar code already exists in ext4).

Signed-off-by: Stephen Hemminger <[email protected]>

--- a/fs/ext2/super.c 2009-11-16 15:55:36.399078475 -0800
+++ b/fs/ext2/super.c 2009-11-16 15:59:49.814765923 -0800
@@ -1121,8 +1121,20 @@ static void ext2_sync_super(struct super
static int ext2_sync_fs(struct super_block *sb, int wait)
{
struct ext2_super_block *es = EXT2_SB(sb)->s_es;
+ struct buffer_head *sbh = EXT2_SB(sb)->s_sbh;

lock_kernel();
+ if (buffer_write_io_error(sbh)) {
+ /*
+ * This happens if USB or floppy device is yanked out.
+ * Maybe user put device back in so warn and update again.
+ */
+ printk(KERN_ERR
+ "EXT2-fs: previous I/O error to superblock detected\n");
+ clear_buffer_write_io_error(sbh);
+ set_buffer_uptodate(sbh);
+ }
+
if (es->s_state & cpu_to_le16(EXT2_VALID_FS)) {
ext2_debug("setting valid to 0\n");
es->s_state &= cpu_to_le16(~EXT2_VALID_FS);

2009-11-17 02:08:19

by Nick Piggin

[permalink] [raw]

Subject: Re: [PATCH] ext2: clear uptodate flag on super block I/O error

On Mon, Nov 16, 2009 at 04:04:49PM -0800, Stephen Hemminger wrote:
> This fixes a WARN backtrace in mark_buffer_dirty() that occurs during
> unmount when a USB or floppy device is removed. I reported this a kernel
> regression, but looks like it might have been there for longer
> than that.
>
> The super block update from a previous operation has marked the buffer
> as in error, and the flag has to be cleared before doing the update.
> (Similar code already exists in ext4).
>
> Signed-off-by: Stephen Hemminger <[email protected]>
>
> --- a/fs/ext2/super.c 2009-11-16 15:55:36.399078475 -0800
> +++ b/fs/ext2/super.c 2009-11-16 15:59:49.814765923 -0800
> @@ -1121,8 +1121,20 @@ static void ext2_sync_super(struct super
> static int ext2_sync_fs(struct super_block *sb, int wait)
> {
> struct ext2_super_block *es = EXT2_SB(sb)->s_es;
> + struct buffer_head *sbh = EXT2_SB(sb)->s_sbh;
>
> lock_kernel();
> + if (buffer_write_io_error(sbh)) {
> + /*
> + * This happens if USB or floppy device is yanked out.
> + * Maybe user put device back in so warn and update again.
> + */
> + printk(KERN_ERR
> + "EXT2-fs: previous I/O error to superblock detected\n");
> + clear_buffer_write_io_error(sbh);
> + set_buffer_uptodate(sbh);
> + }
> +
> if (es->s_state & cpu_to_le16(EXT2_VALID_FS)) {
> ext2_debug("setting valid to 0\n");
> es->s_state &= cpu_to_le16(~EXT2_VALID_FS);

I think the real fix is to avoid clearing uptodate in case of io errors.
For read io errors, the buffer/page should not have been uptodate to
start with, and for write io errors, an error writing back the buffer
does not mean it is somehow no longer the most uptodate copy of the data.

Higher level policy about IO errors (whether to retry, ignore, throw
out the data, etc) would be nice to implement properly too, but that is
not really the job of the low level cache and IO routines.

I proposed some patches a while back but didn't get much interest.
Maybe I should just ask someone to merge them.

2009-11-17 17:16:18

by Stephen Hemminger

[permalink] [raw]

Subject: Re: [PATCH] ext2: clear uptodate flag on super block I/O error

> I think the real fix is to avoid clearing uptodate in case of io errors.
> For read io errors, the buffer/page should not have been uptodate to
> start with, and for write io errors, an error writing back the buffer
> does not mean it is somehow no longer the most uptodate copy of the data.
>
> Higher level policy about IO errors (whether to retry, ignore, throw
> out the data, etc) would be nice to implement properly too, but that is
> not really the job of the low level cache and IO routines.
>
> I proposed some patches a while back but didn't get much interest.
> Maybe I should just ask someone to merge them.

I am interested in fixing current releases. But I agree it should be handled
in a general way in future.