Message-ID: <4955F338.4020509@zappa.cx>
Date: Sat, 27 Dec 2008 10:19:52 +0100
From: Andreas Sundstrom <sunkan@zappa.cx>
User-Agent: Thunderbird 2.0.0.18 (X11/20081125)
MIME-Version: 1.0
To: linux-kernel@vger.kernel.org
Subject: Re: 2.6.28 ext4, xen and lvm volume becomes ro after snapshot
References: <4954BAAB.9090108@zappa.cx> <20081226140721.GN9871@mit.edu> <4954FB62.4090306@zappa.cx> <20081226182145.GP9871@mit.edu> <495526F6.9040704@zappa.cx> <20081226193307.GA2138@mit.edu> <495553EB.6030604@zappa.cx> <20081227030632.GA3539@mit.edu>
In-Reply-To: <20081227030632.GA3539@mit.edu>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Content-Length: 2566
Lines: 72

Theodore Tso wrote:
> On Fri, Dec 26, 2008 at 11:00:11PM +0100, Andreas Sundstrom wrote:
>> But I enabled debugfs and did
>> "echo 3 > /sys/kernel/debug/jbd2/jbd2-debug" and reproduced the problem
>> by taking a snapshot while the system was live.
>> I hope this had the same effect as your proposed change.
> 
> Thanks, that was helpful.  Can you try applying this patch, and let me
> know whether the printk triggers?

No problem

> 
> What I'm guessing is going on is that on a native kernel, we get the
> ENOTSUPP error immediately when we call submit_bh().  However, with
> the Xen kernel, we aren't getting the error right away; we're either
> getting ENOTSUPP later on, when we call wait_on_buffer().  For ext3,
> this doesn't matter, since we call sync_dirty_buffer() which calls
> submit_bh() and wait_on_buffer() synchronously.  But ext4 doesn't use
> sync_dirty_buffer(), instead calling submit_bh() and wait_on_buffer()
> separately.
> 
> This patch should be able to confirm whether or not this supposition
> is correct.
> 
> 						- Ted
> 
> diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c
> index bd1fad0..630196d 100644
> --- a/fs/jbd2/commit.c
> +++ b/fs/jbd2/commit.c
> @@ -174,9 +174,16 @@ static int journal_wait_on_commit_record(struct buffer_head *bh)
>  
>  	clear_buffer_dirty(bh);
>  	wait_on_buffer(bh);
> +	if (buffer_eopnotsupp(bh)) {
> +		printk("jbd2: journal_wait_on_commit_record: eopnotsupp\n");
> +		ret = sync_dirty_buffer(bh);
> +		printk("jbd2: sync_dirty_buffer returned %d\n", ret);
> +	}
>  
> -	if (unlikely(!buffer_uptodate(bh)))
> +	if (unlikely(!buffer_uptodate(bh))) {
> +		printk("jbd2: journal_wait_on_commit_record: not uptodate\n");
>  		ret = -EIO;
> +	}
>  	put_bh(bh);            /* One for getblk() */
>  	jbd2_journal_put_journal_head(bh2jh(bh));
>  

[   44.546636] blkfront: xvda1: write barrier op failed

[   44.546666] blkfront: xvda1: barriers disabled

[   44.546686] end_request: I/O error, dev xvda1, sector 5256

[   44.546710] end_request: I/O error, dev xvda1, sector 5256

[   44.548228] jbd2: journal_wait_on_commit_record: eopnotsupp

[   44.548251] jbd2: sync_dirty_buffer returned 0

[   44.548270] jbd2: journal_wait_on_commit_record: not uptodate

[   44.548293] Aborting journal on device xvda1:8.

More output here http://pastebin.com/m3694a25b
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/