2010-02-03 21:16:28

by Theodore Ts'o

Subject: Re: [PATCH 2/2] ext4: fix delalloc retry loop logic v2

On Wed, Feb 03, 2010 at 10:07:03PM +0300, Dmitry Monakhov wrote:
> Dmitry Monakhov <[email protected]> writes:
>
> > Theodore, please review this patch ASAP; ext4+quota is currently
> > fatally broken due to your patch. The Christmas holidays, when you
> > submitted your patch, are not a good time for a good review, IMHO;
> > I was too lazy to review it carefully.
> > The test case is trivial: it is enough just to hit a quota barrier.
> > dmon$ set-quota-limit /mnt id=dmon --bsoft=1000 --bsoft=1000
> > dmon$ dd if=/dev/zero of=/mnt/file

Sorry, I had to submit 0637c6f somewhat in a hurry because commit
d21cd8f (your patch) was causing a rather large number of failures
that users were complaining about. In retrospect maybe I should have
just backed out d21cd8f entirely and tried to sort out this whole mess
before the next merge window.

OK, I'll look this over as soon as I can.

- Ted

2010-02-04 11:29:32

by Aneesh Kumar K.V

Subject: Re: [PATCH 2/2] ext4: fix delalloc retry loop logic v2

On Wed, 03 Feb 2010 22:07:03 +0300, Dmitry Monakhov <[email protected]> wrote:
> Dmitry Monakhov <[email protected]> writes:
>
> > Theodore, please review this patch ASAP; ext4+quota is currently
> > fatally broken due to your patch. The Christmas holidays, when you
> > submitted your patch, are not a good time for a good review, IMHO;
> > I was too lazy to review it carefully.
> > The test case is trivial: it is enough just to hit a quota barrier.
> > dmon$ set-quota-limit /mnt id=dmon --bsoft=1000 --bsoft=1000
> > dmon$ dd if=/dev/zero of=/mnt/file
> >
> > kernel BUG at fs/jbd2/transaction.c:1027!
> Oops, I'm sorry. It seems that I sent the wrong patch version;
> the only difference is as follows:
> - dqretry = (ret == -EDQUOT) || EXT4_I(inode)->i_reserved_meta_blocks;
> + dqretry = (ret == -EDQUOT) && EXT4_I(inode)->i_reserved_meta_blocks;
> Correct version attached.
> From 3dd53f88470fdc4ec3f06da34cfc760fa8359be8 Mon Sep 17 00:00:00 2001
> From: Dmitry Monakhov <[email protected]>
> Date: Wed, 3 Feb 2010 22:03:17 +0300
> Subject: [PATCH 2/2] ext4: fix delalloc retry loop logic -v2
>
> Current delalloc write path is broken:
> ext4_da_write_begin()
>   ext4_journal_start(inode, 1); -> current->journal != NULL
>   block_write_begin
>     ext4_da_get_block_prep()
>       ext4_da_reserve_space()
>         ext4_should_retry_alloc() -> deadlock
>         write_inode_now() -> BUG_ON due to lack of journal credits
>
> The bug was partly introduced by the following commit:
> 0637c6f4135f592f094207c7c21e7c0fc5557834
> ext4: Patch up how we claim metadata blocks for quota purposes
> In order to preserve the retry logic and eliminate the bugs, we have to
> move the retry loop to ext4_da_write_begin()
>
> Signed-off-by: Dmitry Monakhov <[email protected]>
> ---
> fs/ext4/inode.c | 41 ++++++++++++++++++-----------------------
> 1 files changed, 18 insertions(+), 23 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 2d3fe4d..bd9e573 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -1815,7 +1815,6 @@ static int ext4_journalled_write_end(struct file *file,
> */
> static int ext4_da_reserve_space(struct inode *inode, sector_t lblock)
> {
> - int retries = 0;
> struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
> struct ext4_inode_info *ei = EXT4_I(inode);
> unsigned long md_needed, md_reserved;
> @@ -1825,7 +1824,6 @@ static int ext4_da_reserve_space(struct inode *inode, sector_t lblock)
> * in order to allocate nrblocks
> * worse case is one extent per block
> */
> -repeat:
> spin_lock(&ei->i_block_reservation_lock);
> md_reserved = ei->i_reserved_meta_blocks;
> md_needed = ext4_calc_metadata_amount(inode, lblock);
> @@ -1836,27 +1834,11 @@ repeat:
> * later. Real quota accounting is done at pages writeout
> * time.
> */
> - if (vfs_dq_reserve_block(inode, md_needed + 1)) {
> - /*
> - * We tend to badly over-estimate the amount of
> - * metadata blocks which are needed, so if we have
> - * reserved any metadata blocks, try to force out the
> - * inode and see if we have any better luck.
> - */
> - if (md_reserved && retries++ <= 3)
> - goto retry;
> + if (vfs_dq_reserve_block(inode, md_needed + 1))
> return -EDQUOT;
> - }
>
> if (ext4_claim_free_blocks(sbi, md_needed + 1)) {
> vfs_dq_release_reservation_block(inode, md_needed + 1);
> - if (ext4_should_retry_alloc(inode->i_sb, &retries)) {
> - retry:
> - if (md_reserved)
> - write_inode_now(inode, (retries == 3));
> - yield();
> - goto repeat;
> - }
> return -ENOSPC;
> }
> spin_lock(&ei->i_block_reservation_lock);
> @@ -3033,7 +3015,7 @@ static int ext4_da_write_begin(struct file *file, struct address_space *mapping,
> loff_t pos, unsigned len, unsigned flags,
> struct page **pagep, void **fsdata)
> {
> - int ret, retries = 0;
> + int ret, dqretry, retries = 0;
> struct page *page;
> pgoff_t index;
> unsigned from, to;
> @@ -3090,9 +3072,22 @@ retry:
> ext4_truncate_failed_write(inode);
> }
>
> - if (!(flags & EXT4_AOP_FLAG_NORETRY) &&
> - ret == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
> - goto retry;
> + dqretry = (ret == -EDQUOT) && EXT4_I(inode)->i_reserved_meta_blocks;
> + if ( !(flags & EXT4_AOP_FLAG_NORETRY) &&
> + (ret == -ENOSPC || dqretry) &&
> + ext4_should_retry_alloc(inode->i_sb, &retries)) {
> + if (dqretry) {
> + /*
> + * We tend to badly over-estimate the amount of
> + * metadata blocks which are needed, so if we have
> + * reserved any metadata blocks, try to force out the
> + * inode and see if we have any better luck.
> + */
> + write_inode_now(inode, (retries == 3));
> + }
> + yield();
> + goto retry;
> + }
> out:
> return ret;
> }


Where is EXT4_AOP_FLAG_NORETRY defined? I have submitted a different
version of the patch, and it is already upstream as commit
1db913823c0f8360fccbd24ca67eb073966a5ffd


-aneesh

2010-02-04 19:46:06

by Theodore Ts'o

Subject: Re: [PATCH 2/2] ext4: fix delalloc retry loop logic v2

Dmitry, what version were you testing when you ran into the problem
that you reported? Aneesh's patch hit mainline just before 2.6.33-rc6.

- Ted

2010-02-04 21:50:22

by Dmitry Monakhov

Subject: Re: [PATCH 2/2] ext4: fix delalloc retry loop logic v2

[email protected] writes:

> Dmitry, what version were you testing when you ran into the problem
> that you reported? Aneesh's patch hit mainline just before 2.6.33-rc6.
I hit the bug on Jan's quota tree (33-rc5, before Aneesh's patch).
Yesterday I had no internet connection to check the mainline tree.
Naturally, I expected that the quota tree would contain working quota code.
My patch is almost identical to Aneesh's, so current mainline is OK.
Sorry for the false alarm.
BTW, I want to deploy an automated testing suite to test some devel
trees on a daily basis in order to avoid obvious regressions (e.g. when I
broke ext3+quota). Do you know a good one?
Currently I'm looking into autotest.kernel.org


2010-02-05 03:55:24

by Theodore Ts'o

Subject: Re: [PATCH 2/2] ext4: fix delalloc retry loop logic v2

On Fri, Feb 05, 2010 at 12:50:15AM +0300, Dmitry Monakhov wrote:
> BTW, I want to deploy an automated testing suite to test some devel
> trees on a daily basis in order to avoid obvious regressions (e.g. when I
> broke ext3+quota). Do you know a good one?

My general rule is that I won't push a patch set to Linus until I run
it against the XFSQA test suite. There has been talk about adding
generic quota tests (as opposed to the XFS-specific quota tests, since
XFS has its own quota system different from the one used by other
Linux file systems) to XFSQA, and I think there are a few, but clearly
we need to add more.

So if you want to make the biggest impact in terms of trying to avoid
regressions, helping to contribute more tests to the XFSQA test suite
would be the most useful thing to do. Right now Eric is the only ext4
developer who is really familiar with the test suites, and he's added a
few tests, but he's super busy as of late. I've dabbled with the test
suites a little, and made a few changes, but I haven't added a new
test before, and I'm also super busy as of late. :-(

> Currently I'm looking into autotest.kernel.org

Personally, I don't find frameworks for running automated tests to be
that useful. They have their place, but the problem isn't really
running the tests; the challenge is getting someone to actually *look*
at the results. Having a set of tests which is easy to set up, and
easy to run, is far more important.

If someone sets up autotest, but I don't have an occasion to look at
the results, it's not terribly useful. If it's really easy for me to
run the XFSQA test suite, then I'll run it every couple of patches
that I add to the ext4 patch queue, and run the complete set before I
push a set of patches to Linus. That's **far** more useful.

Automated tests are good, but they tend to be too noisy, and so no one
ever bothers to look at the output. A useful automated system would
only run tests that had clear and unambiguous failures; be able to
tolerate it if some test starts to fail and still be useful, and then
be able to do git-style bisection searches so it can say, "test NNN
started failing at commit XXX", "test MMM started failing at commit
YYY", etc. If it then mailed the results to the relevant maintainer and
to the people who were the patch authors and the people who signed off
on the patch, then it would have a *chance* of being something that
people actually would pay attention to. Unfortunately, I don't know
of any automated test framework which fits this bill. :-(

So instead, I use the discipline of "make check" between almost every
single commit for e2fsprogs, and running "xfsqa -g quick" between most
patches (because the tests take a lot longer to run, I can't afford to
do it between every single patch), and "xfsqa -g auto" before I submit
a patchset to Linus (the most comprehensive set of tests, but it takes
hours so I have to run them overnight).

- Ted
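
The bisection step described above ("test NNN started failing at commit XXX") can be sketched as a binary search over an ordered list of commits. This is only an illustration under stated assumptions: first_bad_commit(), test_passes_fn and passes_before() are hypothetical names, commits are represented as plain indices, and the test is assumed to pass at every commit before the breaking one and fail at every commit from it onwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback: returns true if the test passes when run
 * against the commit at the given index. */
typedef bool (*test_passes_fn)(size_t commit_index, void *ctx);

/* Return the index of the first commit at which the test fails.
 * Preconditions (assumed, not checked by running the test): the test
 * passes at "good", fails at "bad", and good < bad. */
size_t first_bad_commit(size_t good, size_t bad,
			test_passes_fn passes, void *ctx)
{
	assert(passes != NULL);
	assert(good < bad);

	/* Invariant: the test passes at "good" and fails at "bad". */
	while (bad - good > 1) {
		size_t mid = good + (bad - good) / 2;

		if (passes(mid, ctx))
			good = mid;	/* breakage is after mid */
		else
			bad = mid;	/* breakage is at or before mid */
	}
	return bad;
}

/* Toy predicate for demonstration only: ctx points to the index of
 * the commit that introduced the regression. */
bool passes_before(size_t commit_index, void *ctx)
{
	return commit_index < *(const size_t *)ctx;
}
```

In a real setup the callback would check out the commit, build it, and run one test with a clear pass/fail result, which is why noisy or ambiguous tests make this kind of automation unreliable.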