2013-06-19 04:48:26

by Younger Liu

[permalink] [raw]
Subject: [PATCH] fs/jbd2: t_updates should increase when start_this_handle() failed in jbd2__journal_restart()

jbd2_journal_restart() would restart a handle. In this function, it
calls start_this_handle(). Before calling start_this_handle(),subtract
1 from transaction->t_updates.
If start_this_handle() succeeds, transaction->t_updates increases by 1
in it. But if start_this_handle() fails, transaction->t_updates does
not increase.
So, when commit the handle's transaction in jbd2_journal_stop(), the
assertion is false, and then trigger a bug.
The assertion is as follows:
J_ASSERT(atomic_read(&transaction->t_updates) > 0)

Signed-off-by: Younger Liu <[email protected]>
---
fs/jbd2/transaction.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 325bc01..9ddb444 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -530,6 +530,8 @@ int jbd2__journal_restart(handle_t *handle, int nblocks, gfp_t gfp_mask)
lock_map_release(&handle->h_lockdep_map);
handle->h_buffer_credits = nblocks;
ret = start_this_handle(journal, handle, gfp_mask);
+ if (ret < 0)
+ atomic_inc(&transaction->t_updates);
return ret;
}
EXPORT_SYMBOL(jbd2__journal_restart);
--
1.7.9.7


2013-06-20 15:55:55

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] fs/jbd2: t_updates should increase when start_this_handle() failed in jbd2__journal_restart()

[ LKML and linux-fsdevel BCC'ed ]

On Wed, Jun 19, 2013 at 12:48:26PM +0800, Younger Liu wrote:
> jbd2_journal_restart() would restart a handle. In this function, it
> calls start_this_handle(). Before calling start_this_handle(),subtract
> 1 from transaction->t_updates.
> If start_this_handle() succeeds, transaction->t_updates increases by 1
> in it. But if start_this_handle() fails, transaction->t_updates does
> not increase.
> So, when commit the handle's transaction in jbd2_journal_stop(), the
> assertion is false, and then trigger a bug.
> The assertion is as follows:
> J_ASSERT(atomic_read(&transaction->t_updates) > 0)
>
> Signed-off-by: Younger Liu <[email protected]>

Thanks for pointing out this potential problem. Your fix isn't quite
the right one, however.

The problem is once we get to this point, the transaction pointer may
no longer be valid, since once we decrement t_updates, the transaction
could start commiting, and so we should not actually dereference the
transaction pointer after we unlock transaction->t_handle_lock. (We
are referencing t_tid two lines later, and technically that's a bug.
We've just been getting lucky.)

The real issue is that by the time we call start_this_handle() in
jbd2__journal_restart, the handle is not attached to any transaction.
So if jbd2_journal_restart() fails, the handle has to be considered
invalid, and the calling code should not try to use the handle at all,
including calling jbd2_journal_stop().

Jan Kara is I believe currently on vacation but I'd really like him to
chime in with his opinion about the best way to fix this, since he's
also quite familiar with the jbd2 code.

Also, Jan has recently submitted changes to implement reserved handles
(to be submitted in the next merge window), and in these new
functions, if start_this_handle() fails when called from
jbd2_journal_start_reserved(), the handle is left invalidated, and the
caller of jbd2_journal_start_reserved() must not touch the handle
again, including calling jbd2_journal_stop() --- in fact, because
jbd2_journal_start_reserved() clears current->journal_info on failure,
an attempt to call jbd2_journal_stop() will result in the kernel oops
due to an assertion failure.

My inclination is to fix this in the same way, but it will require
changing the current code paths that use jbd2_journal_restart(), and
in some cases passing back the state that the handle is now invalid
and should not be released via jbd2_journal_stop() is going to be
tricky indeed.


Another possible fix is to set the handle to be aborted, via
jbd2_journal_abort_handle(). This function isn't used at all at the
moment, but from what I can tell this should do the right thing. The
one unfortunate thing about this is that when jbd2_journal_stop() gets
called, it will return EROFS, which is a misleading error code. I'm
guessing you're seeing this because start_this_handle() returned
ENOMEM, correct? We could hack around this by stashing the real error
in the handle, and then change jbd2_journal_stop() to return that
error instead of EROFS if it is set.

This second solution is hacky all-around, and it's also inconsistent
with how we are doing things with jbd2_journal_start_reserved(). So
I'm not so happy with this solution. But it would require a lot less
work because the fix would be isolated in the jbd2 layer. OTOH, right
now if the code calls jbd2_journal_stop() on the handle after a
failure in jbd2_journal_start_reserved(), they are crashing anyway, so
changing the code so it changes with an assertion failure doesn't make
things any worse, and then we fix things in ext4 and ocfs2 without any
patch interdependencies --- and this is a problem which appears to
happen very rarely in practice.

(How did you manage to trigger this, BTW? Was this something you
noticed by manual code inspection? Or are you instrumenting the
kernel's memory allocators to occasionally fail to test our error
paths? Or were you running in a system with very heavy memory
pressure?)

- Ted



> ---
> fs/jbd2/transaction.c | 2 ++
> 1 file changed, 2 insertions(+)
>
> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> index 325bc01..9ddb444 100644
> --- a/fs/jbd2/transaction.c
> +++ b/fs/jbd2/transaction.c
> @@ -530,6 +530,8 @@ int jbd2__journal_restart(handle_t *handle, int nblocks, gfp_t gfp_mask)
> lock_map_release(&handle->h_lockdep_map);
> handle->h_buffer_credits = nblocks;
> ret = start_this_handle(journal, handle, gfp_mask);
> + if (ret < 0)
> + atomic_inc(&transaction->t_updates);
> return ret;
> }
> EXPORT_SYMBOL(jbd2__journal_restart);
> --
> 1.7.9.7
>

2013-06-20 16:08:08

by Theodore Ts'o

[permalink] [raw]
Subject: [PATCH] jbd2: fix theoretical race in jbd2__journal_restart

Once we decrement transaction->t_updates, if this is the last handle
holding the transaction from closing, and once we release the
t_handle_lock spinlock, it's possible for the transaction to commit
and be released. In practice with normal kernels, this probably won't
happen, since the commit happens in a separate kernel thread and it's
unlikely this could all happen within the space of a few CPU cycles.

On the other hand, with a real-time kernel, this could potentially
happen, so save the tid found in transaction->t_tid before we release
t_handle_lock. It would require an insane configuration, such as one
where the jbd2 thread was set to a very high real-time priority,
perhaps because a high priority real-time thread is trying to read or
write to a file system. But some people who use real-time kernels
have been known to do insane things, including controlling
laser-wielding industrial robots. :-)

Signed-off-by: "Theodore Ts'o" <[email protected]>
Cc: [email protected]
---

[ I plan to carry this in the ext4 tree. ]

fs/jbd2/transaction.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 91d62e1..7391456 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -652,10 +652,10 @@ int jbd2__journal_restart(handle_t *handle, int nblocks, gfp_t gfp_mask)
}
if (atomic_dec_and_test(&transaction->t_updates))
wake_up(&journal->j_wait_updates);
+ tid = transaction->t_tid;
spin_unlock(&transaction->t_handle_lock);

jbd_debug(2, "restarting handle %p\n", handle);
- tid = transaction->t_tid;
need_to_start = !tid_geq(journal->j_commit_request, tid);
read_unlock(&journal->j_state_lock);
if (need_to_start)
--
1.7.12.rc0.22.gcdd159b


2013-06-20 17:26:12

by Josef Bacik

[permalink] [raw]
Subject: Re: [PATCH] fs/jbd2: t_updates should increase when start_this_handle() failed in jbd2__journal_restart()

On Thu, Jun 20, 2013 at 11:55:55AM -0400, Theodore Ts'o wrote:
> [ LKML and linux-fsdevel BCC'ed ]
>
> On Wed, Jun 19, 2013 at 12:48:26PM +0800, Younger Liu wrote:
> > jbd2_journal_restart() would restart a handle. In this function, it
> > calls start_this_handle(). Before calling start_this_handle(),subtract
> > 1 from transaction->t_updates.
> > If start_this_handle() succeeds, transaction->t_updates increases by 1
> > in it. But if start_this_handle() fails, transaction->t_updates does
> > not increase.
> > So, when commit the handle's transaction in jbd2_journal_stop(), the
> > assertion is false, and then trigger a bug.
> > The assertion is as follows:
> > J_ASSERT(atomic_read(&transaction->t_updates) > 0)
> >
> > Signed-off-by: Younger Liu <[email protected]>
>
> Thanks for pointing out this potential problem. Your fix isn't quite
> the right one, however.
>
> The problem is once we get to this point, the transaction pointer may
> no longer be valid, since once we decrement t_updates, the transaction
> could start commiting, and so we should not actually dereference the
> transaction pointer after we unlock transaction->t_handle_lock. (We
> are referencing t_tid two lines later, and technically that's a bug.
> We've just been getting lucky.)
>
> The real issue is that by the time we call start_this_handle() in
> jbd2__journal_restart, the handle is not attached to any transaction.
> So if jbd2_journal_restart() fails, the handle has to be considered
> invalid, and the calling code should not try to use the handle at all,
> including calling jbd2_journal_stop().
>
> Jan Kara is I believe currently on vacation but I'd really like him to
> chime in with his opinion about the best way to fix this, since he's
> also quite familiar with the jbd2 code.
>
> Also, Jan has recently submitted changes to implement reserved handles
> (to be submitted in the next merge window), and in these new
> functions, if start_this_handle() fails when called from
> jbd2_journal_start_reserved(), the handle is left invalidated, and the
> caller of jbd2_journal_start_reserved() must not touch the handle
> again, including calling jbd2_journal_stop() --- in fact, because
> jbd2_journal_start_reserved() clears current->journal_info on failure,
> an attempt to call jbd2_journal_stop() will result in the kernel oops
> due to an assertion failure.
>
> My inclination is to fix this in the same way, but it will require
> changing the current code paths that use jbd2_journal_restart(), and
> in some cases passing back the state that the handle is now invalid
> and should not be released via jbd2_journal_stop() is going to be
> tricky indeed.
>
>
> Another possible fix is to set the handle to be aborted, via
> jbd2_journal_abort_handle(). This function isn't used at all at the
> moment, but from what I can tell this should do the right thing. The
> one unfortunate thing about this is that when jbd2_journal_stop() gets
> called, it will return EROFS, which is a misleading error code. I'm
> guessing you're seeing this because start_this_handle() returned
> ENOMEM, correct? We could hack around this by stashing the real error
> in the handle, and then change jbd2_journal_stop() to return that
> error instead of EROFS if it is set.
>
> This second solution is hacky all-around, and it's also inconsistent
> with how we are doing things with jbd2_journal_start_reserved(). So
> I'm not so happy with this solution. But it would require a lot less
> work because the fix would be isolated in the jbd2 layer. OTOH, right
> now if the code calls jbd2_journal_stop() on the handle after a
> failure in jbd2_journal_start_reserved(), they are crashing anyway, so
> changing the code so it changes with an assertion failure doesn't make
> things any worse, and then we fix things in ext4 and ocfs2 without any
> patch interdependencies --- and this is a problem which appears to
> happen very rarely in practice.
>

I realize it's been a little bit since I've looked at jbd but I'll offer my
opinion. Callers of jbd2_journal_restart() may not be the ones who originated
the handle, so doing what Jan has done with jbd2_journal_start_reserved() isn't
going to work because all the guy at the top is going to see is an error and
have no way to tell if his handle is invalid or not.

What I would suggest is getting a unified way to mark that the handle has
already been cleaned up and can just be free'd. Then fix
jbd2_journal_start_reserved() and jbd2_journal_restart() to set that in the
handle and make jbd2_journal_stop() just free up the handle and reset
current->journal_info but not return an error. It's important to not return an
error from jbd2_journal_stop() so that it doesn't invoke the ext4 error handling
stuff and you get a read only file system when the error may not be read only
file system worthy.

This way you have a nice clean way of dealing with handle errors that allow you
to pass back a real error to the caller and the caller can just do its normal
jbd2_journal_stop() and cleanup and do its own error handling the way it feels.
This keeps the yucky details of no longer valid handles all internal to jbd2 and
ext4/ocfs2 don't have to worry about it. Thanks,

Josef

2013-06-20 18:12:23

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] fs/jbd2: t_updates should increase when start_this_handle() failed in jbd2__journal_restart()

On Thu, Jun 20, 2013 at 01:26:09PM -0400, Josef Bacik wrote:
> I realize it's been a little bit since I've looked at jbd but I'll offer my
> opinion. Callers of jbd2_journal_restart() may not be the ones who originated
> the handle, so doing what Jan has done with jbd2_journal_start_reserved() isn't
> going to work because all the guy at the top is going to see is an error and
> have no way to tell if his handle is invalid or not.

Yeah, that's what I meant by "it would require changing all of the
callers".

> What I would suggest is getting a unified way to mark that the handle has
> already been cleaned up and can just be free'd.

The problem though is we need to make sure none of the callers don't
try to do anything else with handle besides calling
jbd2_journal_stop(). In particular, we can't allow a call to
jbd2_journal_get_write_access(), jbd2_journal_revoke() to operate on
the handle, because its transaction pointer is (potentially) invalid.

> Then fix jbd2_journal_start_reserved() and jbd2_journal_restart() to
> set that in the handle and make jbd2_journal_stop() just free up the
> handle and reset current->journal_info but not return an error.
> It's important to not return an error from jbd2_journal_stop() so
> that it doesn't invoke the ext4 error handling stuff and you get a
> read only file system when the error may not be read only file
> system worthy.

The handle->h_aborted bit, which is currently not used, does most of
the right thing, modulo the question of the fact that
jbd2_journal_stop() will return an error. What's important from my
perspective is that the various callers that operate on a handle check
is_handle_aborted() before trying to use the it. We'll still need to
audit the callers to make sure there isn't some uncommon-taken code
path where ext4_handle_dirty_metadata() gets called after
ext4_journal_restart() has returned an error.

As a FAST paper once opined, "EIO: Error Handling Is Occasionally correct". :-)

> This way you have a nice clean way of dealing with handle errors that allow you
> to pass back a real error to the caller and the caller can just do its normal
> jbd2_journal_stop() and cleanup and do its own error handling the way it feels.
> This keeps the yucky details of no longer valid handles all internal to jbd2 and
> ext4/ocfs2 don't have to worry about it. Thanks,

Yes, that could work, although we'll need to check to make sure all of
the code paths that invoke jbd2_journal_restart() handle errors
appropriately, and don't rely on jbd2_journal_stop() returning an
error. Thanks for your thoughts!

Regards,

- Ted

2013-06-21 13:29:58

by Younger Liu

[permalink] [raw]
Subject: Re: [PATCH] fs/jbd2: t_updates should increase when start_this_handle() failed in jbd2__journal_restart()

Hi Ted,

On 2013/6/20 23:55, Theodore Ts'o wrote:
> [ LKML and linux-fsdevel BCC'ed ]
>
> On Wed, Jun 19, 2013 at 12:48:26PM +0800, Younger Liu wrote:
>> jbd2_journal_restart() would restart a handle. In this function, it
>> calls start_this_handle(). Before calling start_this_handle(),subtract
>> 1 from transaction->t_updates.
>> If start_this_handle() succeeds, transaction->t_updates increases by 1
>> in it. But if start_this_handle() fails, transaction->t_updates does
>> not increase.
>> So, when commit the handle's transaction in jbd2_journal_stop(), the
>> assertion is false, and then trigger a bug.
>> The assertion is as follows:
>> J_ASSERT(atomic_read(&transaction->t_updates) > 0)
>>
>> Signed-off-by: Younger Liu <[email protected]>
>
> Thanks for pointing out this potential problem. Your fix isn't quite
> the right one, however.
>
> The problem is once we get to this point, the transaction pointer may
> no longer be valid, since once we decrement t_updates, the transaction
> could start commiting, and so we should not actually dereference the
> transaction pointer after we unlock transaction->t_handle_lock. (We
> are referencing t_tid two lines later, and technically that's a bug.
> We've just been getting lucky.)
>
> The real issue is that by the time we call start_this_handle() in
> jbd2__journal_restart, the handle is not attached to any transaction.
> So if jbd2_journal_restart() fails, the handle has to be considered
> invalid, and the calling code should not try to use the handle at all,
> including calling jbd2_journal_stop().
>
> Jan Kara is I believe currently on vacation but I'd really like him to
> chime in with his opinion about the best way to fix this, since he's
> also quite familiar with the jbd2 code.
>
> Also, Jan has recently submitted changes to implement reserved handles
> (to be submitted in the next merge window), and in these new
> functions, if start_this_handle() fails when called from
> jbd2_journal_start_reserved(), the handle is left invalidated, and the
> caller of jbd2_journal_start_reserved() must not touch the handle
> again, including calling jbd2_journal_stop() --- in fact, because
> jbd2_journal_start_reserved() clears current->journal_info on failure,
> an attempt to call jbd2_journal_stop() will result in the kernel oops
> due to an assertion failure.
>
> My inclination is to fix this in the same way, but it will require
> changing the current code paths that use jbd2_journal_restart(), and
> in some cases passing back the state that the handle is now invalid
> and should not be released via jbd2_journal_stop() is going to be
> tricky indeed.
>
>
> Another possible fix is to set the handle to be aborted, via
> jbd2_journal_abort_handle(). This function isn't used at all at the
> moment, but from what I can tell this should do the right thing. The
> one unfortunate thing about this is that when jbd2_journal_stop() gets
> called, it will return EROFS, which is a misleading error code. I'm
> guessing you're seeing this because start_this_handle() returned
> ENOMEM, correct? We could hack around this by stashing the real error
> in the handle, and then change jbd2_journal_stop() to return that
> error instead of EROFS if it is set.
>
> This second solution is hacky all-around, and it's also inconsistent
> with how we are doing things with jbd2_journal_start_reserved(). So
> I'm not so happy with this solution. But it would require a lot less
> work because the fix would be isolated in the jbd2 layer. OTOH, right
> now if the code calls jbd2_journal_stop() on the handle after a
> failure in jbd2_journal_start_reserved(), they are crashing anyway, so
> changing the code so it changes with an assertion failure doesn't make
> things any worse, and then we fix things in ext4 and ocfs2 without any
> patch interdependencies --- and this is a problem which appears to
> happen very rarely in practice.
>
> (How did you manage to trigger this, BTW? Was this something you
> noticed by manual code inspection? Or are you instrumenting the
> kernel's memory allocators to occasionally fail to test our error
> paths? Or were you running in a system with very heavy memory
> pressure?)
>
> - Ted
>

This bug was triggered by the following scenario:
In ocfs2 file system, allocate a very large disk space for a small file
with ocfs2_fallocate(), while the journal file size is 32M.

Because there are much many journal blocks needed by jbd2_journal_restart(),
so that nblocks is greater than journal->j_max_transaction_buffers
in start_this_handle(), and then return -ENOSPC.

In start_this_handle():
if (nblocks > journal->j_max_transaction_buffers) {
printk(KERN_ERR "JBD: %s wants too many credits (%d > %d)\n",
current->comm, nblocks,
journal->j_max_transaction_buffers);
return -ENOSPC;
}

Dump stack:
<ffffffffa0b04620>{jbd2:jbd2_journal_stop+0x50}
<ffffffffa0b671a3>{ocfs2:ocfs2_commit_trans+0x23}
<ffffffffa0b5c8ce>{ocfs2:__ocfs2_extend_allocation+0x66e}
<ffffffffa0b5cc55>{ocfs2:ocfs2_allocate_unwritten_extents+0xc5}
<ffffffffa0b5d315>{ocfs2:__ocfs2_change_file_space+0x3f5}
<ffffffffa0b5d67a>{ocfs2:ocfs2_fallocate+0x7a}
<ffffffff80127689>{do_fallocate+0x129}
<ffffffff801276d6>{sys_fallocate+0x46}
<ffffffff803fd423>{system_call_fastpath+0x16}
[<00007f7283b25030>]

This problem may be because jbd2_journal_restart() was called incorrectly
in __ocfs2_extend_allocation() which was called by ocfs2_fallocate().

Although I solved this question by modifing __ocfs2_extend_allocation(),
there is also a risk in jbd2_journal_restart() for jbd2 system.
So I put this potential risk to see if there is a better ideas.

>
>
>> ---
>> fs/jbd2/transaction.c | 2 ++
>> 1 file changed, 2 insertions(+)
>>
>> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
>> index 325bc01..9ddb444 100644
>> --- a/fs/jbd2/transaction.c
>> +++ b/fs/jbd2/transaction.c
>> @@ -530,6 +530,8 @@ int jbd2__journal_restart(handle_t *handle, int nblocks, gfp_t gfp_mask)
>> lock_map_release(&handle->h_lockdep_map);
>> handle->h_buffer_credits = nblocks;
>> ret = start_this_handle(journal, handle, gfp_mask);
>> + if (ret < 0)
>> + atomic_inc(&transaction->t_updates);
>> return ret;
>> }
>> EXPORT_SYMBOL(jbd2__journal_restart);
>> --
>> 1.7.9.7
>>
>
> .
>

2013-06-21 23:26:20

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] fs/jbd2: t_updates should increase when start_this_handle() failed in jbd2__journal_restart()

On Thu 20-06-13 14:12:15, Ted Tso wrote:
> On Thu, Jun 20, 2013 at 01:26:09PM -0400, Josef Bacik wrote:
> > I realize it's been a little bit since I've looked at jbd but I'll offer my
> > opinion. Callers of jbd2_journal_restart() may not be the ones who originated
> > the handle, so doing what Jan has done with jbd2_journal_start_reserved() isn't
> > going to work because all the guy at the top is going to see is an error and
> > have no way to tell if his handle is invalid or not.
>
> Yeah, that's what I meant by "it would require changing all of the
> callers".
>
> > What I would suggest is getting a unified way to mark that the handle has
> > already been cleaned up and can just be free'd.
>
> The problem though is we need to make sure none of the callers don't
> try to do anything else with handle besides calling
> jbd2_journal_stop(). In particular, we can't allow a call to
> jbd2_journal_get_write_access(), jbd2_journal_revoke() to operate on
> the handle, because its transaction pointer is (potentially) invalid.
I think this is the cleanest solution going forward as well.

> > Then fix jbd2_journal_start_reserved() and jbd2_journal_restart() to
> > set that in the handle and make jbd2_journal_stop() just free up the
> > handle and reset current->journal_info but not return an error.
> > It's important to not return an error from jbd2_journal_stop() so
> > that it doesn't invoke the ext4 error handling stuff and you get a
> > read only file system when the error may not be read only file
> > system worthy.
>
> The handle->h_aborted bit, which is currently not used, does most of
> the right thing, modulo the question of the fact that
> jbd2_journal_stop() will return an error. What's important from my
> perspective is that the various callers that operate on a handle check
> is_handle_aborted() before trying to use the it. We'll still need to
> audit the callers to make sure there isn't some uncommon-taken code
> path where ext4_handle_dirty_metadata() gets called after
> ext4_journal_restart() has returned an error.
Yeah, I don't see a solution where we could avoid the audit... Somewhat
comforting is that if the change to error handling will break something it
has been broken previously as well, only the bug was better hidden :)

> As a FAST paper once opined, "EIO: Error Handling Is Occasionally correct". :-)
>
> > This way you have a nice clean way of dealing with handle errors that allow you
> > to pass back a real error to the caller and the caller can just do its normal
> > jbd2_journal_stop() and cleanup and do its own error handling the way it feels.
> > This keeps the yucky details of no longer valid handles all internal to jbd2 and
> > ext4/ocfs2 don't have to worry about it. Thanks,
>
> Yes, that could work, although we'll need to check to make sure all of
> the code paths that invoke jbd2_journal_restart() handle errors
> appropriately, and don't rely on jbd2_journal_stop() returning an
> error. Thanks for your thoughts!

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2013-06-23 17:36:36

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] fs/jbd2: t_updates should increase when start_this_handle() failed in jbd2__journal_restart()

On Fri, Jun 21, 2013 at 09:29:31PM +0800, Younger Liu wrote:
>
> This bug was triggered by the following scenario:
> In ocfs2 file system, allocate a very large disk space for a small file
> with ocfs2_fallocate(), while the journal file size is 32M.
>
> Because there are much many journal blocks needed by jbd2_journal_restart(),
> so that nblocks is greater than journal->j_max_transaction_buffers
> in start_this_handle(), and then return -ENOSPC.

Ah, I see. I have a patch that should prevent the kernel from
crashing in this situation, and which adds some additional checks to
make sure no one tries to use the handle after jbd2_journal_restart()
fails in this circumstance.

However, you may want to further pursue a fix in ocfs2 so you don't
actually return ENOSPC to userspace, since it is a very misleading
error message --- it's not that the file system is out of space, but
that the journal is too small for the amount of space that you are
trying to allocate using fallocate().

I would think a better way of handling this situation would be to log
a warning message that the journal is probably too small, and then to
break up the fallocate into smaller chunks, so that it can
successfully complete despite the fact that the journal was
unfortunately missized.

I'll be sending the proposed fix in a moment; could you check and see
if the patch prevents ocfs2/jbd2 from tripping over the assertion
given your test case?

Thanks,

- Ted

2013-06-23 17:44:36

by Theodore Ts'o

[permalink] [raw]
Subject: [PATCH] jbd2: invalidate handle if jbd2_journal_restart() fails

If jbd2_journal_restart() fails the handle will have been disconnected
from the current transaction. In this situation, the handle must not
be used for for any jbd2 function other than jbd2_journal_stop().
Enforce this with by treating a handle which has a NULL transaction
pointer as an aborted handle, and issue a kernel warning if
jbd2_journal_extent(), jbd2_journal_get_write_access(),
jbd2_journal_dirty_metadata(), etc. is called with an invalid handle.

This commit also fixes a bug where jbd2_journal_stop() would trip over
a kernel jbd2 assertion check when trying to free an invalid handle.

Also move the responsibility of setting current->journal_info to
start_this_handle(), simplifying the three users of this function.

Signed-off-by: "Theodore Ts'o" <[email protected]>
Reported-by: Younger Liu <[email protected]>
Cc: Jan Kara <[email protected]>
---
fs/jbd2/transaction.c | 27 ++++++++++++++++-----------
include/linux/jbd2.h | 2 +-
2 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 7391456..8ac8306 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -368,6 +368,7 @@ repeat:
atomic_read(&transaction->t_outstanding_credits),
jbd2_log_space_left(journal));
read_unlock(&journal->j_state_lock);
+ current->journal_info = handle;

lock_map_acquire(&handle->h_lockdep_map);
jbd2_journal_free_transaction(new_transaction);
@@ -442,14 +443,11 @@ handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int rsv_blocks,
handle->h_rsv_handle = rsv_handle;
}

- current->journal_info = handle;
-
err = start_this_handle(journal, handle, gfp_mask);
if (err < 0) {
if (handle->h_rsv_handle)
jbd2_free_handle(handle->h_rsv_handle);
jbd2_free_handle(handle);
- current->journal_info = NULL;
return ERR_PTR(err);
}
handle->h_type = type;
@@ -511,16 +509,13 @@ int jbd2_journal_start_reserved(handle_t *handle, unsigned int type,
}

handle->h_journal = NULL;
- current->journal_info = handle;
/*
* GFP_NOFS is here because callers are likely from writeback or
* similarly constrained call sites
*/
ret = start_this_handle(journal, handle, GFP_NOFS);
- if (ret < 0) {
- current->journal_info = NULL;
+ if (ret < 0)
jbd2_journal_free_reserved(handle);
- }
handle->h_type = type;
handle->h_line_no = line_no;
return ret;
@@ -555,6 +550,7 @@ int jbd2_journal_extend(handle_t *handle, int nblocks)
int wanted;

result = -EIO;
+ WARN_ON(!handle->h_transaction);
if (is_handle_aborted(handle))
goto out;

@@ -630,6 +626,7 @@ int jbd2__journal_restart(handle_t *handle, int nblocks, gfp_t gfp_mask)
tid_t tid;
int need_to_start, ret;

+ WARN_ON(!handle->h_transaction);
/* If we've had an abort of any type, don't even think about
* actually doing the restart! */
if (is_handle_aborted(handle))
@@ -654,6 +651,8 @@ int jbd2__journal_restart(handle_t *handle, int nblocks, gfp_t gfp_mask)
wake_up(&journal->j_wait_updates);
tid = transaction->t_tid;
spin_unlock(&transaction->t_handle_lock);
+ handle->h_transaction = NULL;
+ current->journal_info = NULL;

jbd_debug(2, "restarting handle %p\n", handle);
need_to_start = !tid_geq(journal->j_commit_request, tid);
@@ -790,6 +789,7 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
int need_copy = 0;
unsigned long start_lock, time_lock;

+ WARN_ON(!handle->h_transaction);
if (is_handle_aborted(handle))
return -EROFS;

@@ -1057,6 +1057,7 @@ int jbd2_journal_get_create_access(handle_t *handle, struct buffer_head *bh)
int err;

jbd_debug(5, "journal_head %p\n", jh);
+ WARN_ON(!handle->h_transaction);
err = -EROFS;
if (is_handle_aborted(handle))
goto out;
@@ -1269,6 +1270,7 @@ int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)
struct journal_head *jh;
int ret = 0;

+ WARN_ON(!handle->h_transaction);
if (is_handle_aborted(handle))
goto out;
jh = jbd2_journal_grab_journal_head(bh);
@@ -1523,18 +1525,19 @@ int jbd2_journal_stop(handle_t *handle)
{
transaction_t *transaction = handle->h_transaction;
journal_t *journal = transaction->t_journal;
- int err, wait_for_commit = 0;
+ int err = 0, wait_for_commit = 0;
tid_t tid;
pid_t pid;

+ if (!handle->h_transaction)
+ goto free_and_exit;
+
J_ASSERT(journal_current_handle() == handle);

if (is_handle_aborted(handle))
err = -EIO;
- else {
+ else
J_ASSERT(atomic_read(&transaction->t_updates) > 0);
- err = 0;
- }

if (--handle->h_ref > 0) {
jbd_debug(4, "h_ref %d -> %d\n", handle->h_ref + 1,
@@ -1657,6 +1660,7 @@ int jbd2_journal_stop(handle_t *handle)

if (handle->h_rsv_handle)
jbd2_journal_free_reserved(handle->h_rsv_handle);
+free_and_exit:
jbd2_free_handle(handle);
return err;
}
@@ -2364,6 +2368,7 @@ int jbd2_journal_file_inode(handle_t *handle, struct jbd2_inode *jinode)
transaction_t *transaction = handle->h_transaction;
journal_t *journal = transaction->t_journal;

+ WARN_ON(!handle->h_transaction);
if (is_handle_aborted(handle))
return -EIO;

diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 5f3c094..1891112 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1295,7 +1295,7 @@ static inline int is_journal_aborted(journal_t *journal)

static inline int is_handle_aborted(handle_t *handle)
{
- if (handle->h_aborted)
+ if (handle->h_aborted || !handle->h_transaction)
return 1;
return is_journal_aborted(handle->h_transaction->t_journal);
}
--
1.7.12.rc0.22.gcdd159b


2013-06-24 09:54:01

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] jbd2: invalidate handle if jbd2_journal_restart() fails

On Sun 23-06-13 13:44:28, Ted Tso wrote:
> If jbd2_journal_restart() fails the handle will have been disconnected
> from the current transaction. In this situation, the handle must not
> be used for for any jbd2 function other than jbd2_journal_stop().
> Enforce this with by treating a handle which has a NULL transaction
> pointer as an aborted handle, and issue a kernel warning if
> jbd2_journal_extent(), jbd2_journal_get_write_access(),
> jbd2_journal_dirty_metadata(), etc. is called with an invalid handle.
>
> This commit also fixes a bug where jbd2_journal_stop() would trip over
> a kernel jbd2 assertion check when trying to free an invalid handle.
>
> Also move the responsibility of setting current->journal_info to
> start_this_handle(), simplifying the three users of this function.
>
> Signed-off-by: "Theodore Ts'o" <[email protected]>
> Reported-by: Younger Liu <[email protected]>
> Cc: Jan Kara <[email protected]>
The patch looks good to me. So you can add:
Reviewed-by: Jan Kara <[email protected]>

Honza

> ---
> fs/jbd2/transaction.c | 27 ++++++++++++++++-----------
> include/linux/jbd2.h | 2 +-
> 2 files changed, 17 insertions(+), 12 deletions(-)
>
> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> index 7391456..8ac8306 100644
> --- a/fs/jbd2/transaction.c
> +++ b/fs/jbd2/transaction.c
> @@ -368,6 +368,7 @@ repeat:
> atomic_read(&transaction->t_outstanding_credits),
> jbd2_log_space_left(journal));
> read_unlock(&journal->j_state_lock);
> + current->journal_info = handle;
>
> lock_map_acquire(&handle->h_lockdep_map);
> jbd2_journal_free_transaction(new_transaction);
> @@ -442,14 +443,11 @@ handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int rsv_blocks,
> handle->h_rsv_handle = rsv_handle;
> }
>
> - current->journal_info = handle;
> -
> err = start_this_handle(journal, handle, gfp_mask);
> if (err < 0) {
> if (handle->h_rsv_handle)
> jbd2_free_handle(handle->h_rsv_handle);
> jbd2_free_handle(handle);
> - current->journal_info = NULL;
> return ERR_PTR(err);
> }
> handle->h_type = type;
> @@ -511,16 +509,13 @@ int jbd2_journal_start_reserved(handle_t *handle, unsigned int type,
> }
>
> handle->h_journal = NULL;
> - current->journal_info = handle;
> /*
> * GFP_NOFS is here because callers are likely from writeback or
> * similarly constrained call sites
> */
> ret = start_this_handle(journal, handle, GFP_NOFS);
> - if (ret < 0) {
> - current->journal_info = NULL;
> + if (ret < 0)
> jbd2_journal_free_reserved(handle);
> - }
> handle->h_type = type;
> handle->h_line_no = line_no;
> return ret;
> @@ -555,6 +550,7 @@ int jbd2_journal_extend(handle_t *handle, int nblocks)
> int wanted;
>
> result = -EIO;
> + WARN_ON(!handle->h_transaction);
> if (is_handle_aborted(handle))
> goto out;
>
> @@ -630,6 +626,7 @@ int jbd2__journal_restart(handle_t *handle, int nblocks, gfp_t gfp_mask)
> tid_t tid;
> int need_to_start, ret;
>
> + WARN_ON(!handle->h_transaction);
> /* If we've had an abort of any type, don't even think about
> * actually doing the restart! */
> if (is_handle_aborted(handle))
> @@ -654,6 +651,8 @@ int jbd2__journal_restart(handle_t *handle, int nblocks, gfp_t gfp_mask)
> wake_up(&journal->j_wait_updates);
> tid = transaction->t_tid;
> spin_unlock(&transaction->t_handle_lock);
> + handle->h_transaction = NULL;
> + current->journal_info = NULL;
>
> jbd_debug(2, "restarting handle %p\n", handle);
> need_to_start = !tid_geq(journal->j_commit_request, tid);
> @@ -790,6 +789,7 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
> int need_copy = 0;
> unsigned long start_lock, time_lock;
>
> + WARN_ON(!handle->h_transaction);
> if (is_handle_aborted(handle))
> return -EROFS;
>
> @@ -1057,6 +1057,7 @@ int jbd2_journal_get_create_access(handle_t *handle, struct buffer_head *bh)
> int err;
>
> jbd_debug(5, "journal_head %p\n", jh);
> + WARN_ON(!handle->h_transaction);
> err = -EROFS;
> if (is_handle_aborted(handle))
> goto out;
> @@ -1269,6 +1270,7 @@ int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)
> struct journal_head *jh;
> int ret = 0;
>
> + WARN_ON(!handle->h_transaction);
> if (is_handle_aborted(handle))
> goto out;
> jh = jbd2_journal_grab_journal_head(bh);
> @@ -1523,18 +1525,19 @@ int jbd2_journal_stop(handle_t *handle)
> {
> transaction_t *transaction = handle->h_transaction;
> journal_t *journal = transaction->t_journal;
> - int err, wait_for_commit = 0;
> + int err = 0, wait_for_commit = 0;
> tid_t tid;
> pid_t pid;
>
> + if (!handle->h_transaction)
> + goto free_and_exit;
> +
> J_ASSERT(journal_current_handle() == handle);
>
> if (is_handle_aborted(handle))
> err = -EIO;
> - else {
> + else
> J_ASSERT(atomic_read(&transaction->t_updates) > 0);
> - err = 0;
> - }
>
> if (--handle->h_ref > 0) {
> jbd_debug(4, "h_ref %d -> %d\n", handle->h_ref + 1,
> @@ -1657,6 +1660,7 @@ int jbd2_journal_stop(handle_t *handle)
>
> if (handle->h_rsv_handle)
> jbd2_journal_free_reserved(handle->h_rsv_handle);
> +free_and_exit:
> jbd2_free_handle(handle);
> return err;
> }
> @@ -2364,6 +2368,7 @@ int jbd2_journal_file_inode(handle_t *handle, struct jbd2_inode *jinode)
> transaction_t *transaction = handle->h_transaction;
> journal_t *journal = transaction->t_journal;
>
> + WARN_ON(!handle->h_transaction);
> if (is_handle_aborted(handle))
> return -EIO;
>
> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> index 5f3c094..1891112 100644
> --- a/include/linux/jbd2.h
> +++ b/include/linux/jbd2.h
> @@ -1295,7 +1295,7 @@ static inline int is_journal_aborted(journal_t *journal)
>
> static inline int is_handle_aborted(handle_t *handle)
> {
> - if (handle->h_aborted)
> + if (handle->h_aborted || !handle->h_transaction)
> return 1;
> return is_journal_aborted(handle->h_transaction->t_journal);
> }
> --
> 1.7.12.rc0.22.gcdd159b
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2013-06-25 08:34:58

by Younger Liu

[permalink] [raw]
Subject: Re: [PATCH] fs/jbd2: t_updates should increase when start_this_handle() failed in jbd2__journal_restart()

I will check and test the pacth.

I only merge the patch about " jbd2: invalidate handle if
jbd2_journal_restart() fails" int my source. But I do not
merge the patch about "jbd2: Transaction reservation support...".

Does it affect the test?

On 2013/6/24 1:36, Theodore Ts'o wrote:
> On Fri, Jun 21, 2013 at 09:29:31PM +0800, Younger Liu wrote:
>>
>> This bug was triggered by the following scenario:
>> In ocfs2 file system, allocate a very large disk space for a small file
>> with ocfs2_fallocate(), while the journal file size is 32M.
>>
>> Because there are much many journal blocks needed by jbd2_journal_restart(),
>> so that nblocks is greater than journal->j_max_transaction_buffers
>> in start_this_handle(), and then return -ENOSPC.
>
> Ah, I see. I have a patch that should prevent the kernel from
> crashing in this situation, and which adds some additional checks to
> make sure no one tries to use the handle after jbd2_journal_restart()
> fails in this circumstance.
>
> However, you may want to further pursue a fix in ocfs2 so you don't
> actually return ENOSPC to userspace, since it is a very misleading
> error message --- it's not that the file system is out of space, but
> that the journal is too small for the amount of space that you are
> trying to allocate using fallocate().
>
> I would think a better way of handling this situation would be to log
> a warning message that the journal is probably too small, and then to
> break up the fallocate into smaller chunks, so that it can
> successfully complete despite the fact that the journal was
> unfortunately missized.
>
> I'll be sending the proposed fix in a moment; could you check and see
> if the patch prevents ocfs2/jbd2 from tripping over the assertion
> given your test case?
>
> Thanks,
>
> - Ted
>
> .
>


2013-06-25 09:42:15

by Younger Liu

[permalink] [raw]
Subject: Re: [PATCH] jbd2: invalidate handle if jbd2_journal_restart() fails

On 2013/6/24 1:44, Theodore Ts'o wrote:
> If jbd2_journal_restart() fails the handle will have been disconnected
> from the current transaction. In this situation, the handle must not
> be used for for any jbd2 function other than jbd2_journal_stop().
> Enforce this with by treating a handle which has a NULL transaction
> pointer as an aborted handle, and issue a kernel warning if
> jbd2_journal_extent(), jbd2_journal_get_write_access(),
> jbd2_journal_dirty_metadata(), etc. is called with an invalid handle.
>
> This commit also fixes a bug where jbd2_journal_stop() would trip over
> a kernel jbd2 assertion check when trying to free an invalid handle.
>
> Also move the responsibility of setting current->journal_info to
> start_this_handle(), simplifying the three users of this function.
>
> Signed-off-by: "Theodore Ts'o" <[email protected]>
> Reported-by: Younger Liu <[email protected]>
> Cc: Jan Kara <[email protected]>
Hi Ted,
There is a mistake in this patch.
Please see my comments below.
> ---
> fs/jbd2/transaction.c | 27 ++++++++++++++++-----------
> include/linux/jbd2.h | 2 +-
> 2 files changed, 17 insertions(+), 12 deletions(-)
>
> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> index 7391456..8ac8306 100644
> --- a/fs/jbd2/transaction.c
> +++ b/fs/jbd2/transaction.c
...
> @@ -1523,18 +1525,19 @@ int jbd2_journal_stop(handle_t *handle)
> {
> transaction_t *transaction = handle->h_transaction;
> journal_t *journal = transaction->t_journal;
> - int err, wait_for_commit = 0;
> + int err = 0, wait_for_commit = 0;
> tid_t tid;
> pid_t pid;
>
> + if (!handle->h_transaction)
> + goto free_and_exit;
> +
> J_ASSERT(journal_current_handle() == handle);
>
If jbd2__journal_restart fails, handle->h_transaction may be NULL.
So we should check handle->h_transaction before
"journal = transaction->t_journal", Right?
How about the following?

transaction_t *transaction = handle->h_transaction;
journal_t *journal = NULL;
int err = 0, wait_for_commit = 0;
tid_t tid;
pid_t pid;

if (!handle->h_transaction)
goto free_and_exit;

journal = transaction->t_journal;
J_ASSERT(journal_current_handle() == handle);

We should check this in other functions too,such as
jbd2_journal_extend(), jbd2_journal_get_create_access(),
jbd2_journal_dirty_metadata(),jbd2_journal_file_inode(),
jbd2__journal_restart(), etc.

> if (is_handle_aborted(handle))
> err = -EIO;
> - else {
> + else
> J_ASSERT(atomic_read(&transaction->t_updates) > 0);
> - err = 0;
> - }
>
> if (--handle->h_ref > 0) {
> jbd_debug(4, "h_ref %d -> %d\n", handle->h_ref + 1,
> @@ -1657,6 +1660,7 @@ int jbd2_journal_stop(handle_t *handle)
>
> if (handle->h_rsv_handle)
> jbd2_journal_free_reserved(handle->h_rsv_handle);
> +free_and_exit:
> jbd2_free_handle(handle);
> return err;
> }
> @@ -2364,6 +2368,7 @@ int jbd2_journal_file_inode(handle_t *handle, struct jbd2_inode *jinode)
> transaction_t *transaction = handle->h_transaction;
> journal_t *journal = transaction->t_journal;
>
> + WARN_ON(!handle->h_transaction);
> if (is_handle_aborted(handle))
> return -EIO;
>
> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> index 5f3c094..1891112 100644
> --- a/include/linux/jbd2.h
> +++ b/include/linux/jbd2.h
> @@ -1295,7 +1295,7 @@ static inline int is_journal_aborted(journal_t *journal)
>
> static inline int is_handle_aborted(handle_t *handle)
> {
> - if (handle->h_aborted)
> + if (handle->h_aborted || !handle->h_transaction)
> return 1;
> return is_journal_aborted(handle->h_transaction->t_journal);
> }
>


2013-06-29 13:22:35

by Joel Becker

[permalink] [raw]
Subject: Re: [PATCH] fs/jbd2: t_updates should increase when start_this_handle() failed in jbd2__journal_restart()

On Sun, Jun 23, 2013 at 01:36:28PM -0400, Theodore Ts'o wrote:
> On Fri, Jun 21, 2013 at 09:29:31PM +0800, Younger Liu wrote:
> >
> > This bug was triggered by the following scenario:
> > In ocfs2 file system, allocate a very large disk space for a small file
> > with ocfs2_fallocate(), while the journal file size is 32M.
> >
> > Because there are much many journal blocks needed by jbd2_journal_restart(),
> > so that nblocks is greater than journal->j_max_transaction_buffers
> > in start_this_handle(), and then return -ENOSPC.
>
> Ah, I see. I have a patch that should prevent the kernel from
> crashing in this situation, and which adds some additional checks to
> make sure no one tries to use the handle after jbd2_journal_restart()
> fails in this circumstance.
>
> However, you may want to further pursue a fix in ocfs2 so you don't
> actually return ENOSPC to userspace, since it is a very misleading
> error message --- it's not that the file system is out of space, but
> that the journal is too small for the amount of space that you are
> trying to allocate using fallocate().
>
> I would think a better way of handling this situation would be to log
> a warning message that the journal is probably too small, and then to
> break up the fallocate into smaller chunks, so that it can
> successfully complete despite the fact that the journal was
> unfortunately missized.

Yes, this solution is a good one.

Joel

>
> I'll be sending the proposed fix in a moment; could you check and see
> if the patch prevents ocfs2/jbd2 from tripping over the assertion
> given your test case?
>
> Thanks,
>
> - Ted
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--

2013-06-29 23:46:59

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] jbd2: invalidate handle if jbd2_journal_restart() fails

On Tue, Jun 25, 2013 at 05:42:10PM +0800, Younger Liu wrote:
> If jbd2__journal_restart fails, handle->h_transaction may be NULL.
> So we should check handle->h_transaction before
> "journal = transaction->t_journal", Right?

Good point, yes. Here's a new version of the patch which I have in my tree.

- Ted

>From fd8d369f9ad921eb6dc5c56e87e4c6d6106bad56 Mon Sep 17 00:00:00 2001
From: Theodore Ts'o <[email protected]>
Date: Sun, 23 Jun 2013 12:59:01 -0400
Subject: [PATCH] jbd2: invalidate handle if jbd2_journal_restart() fails

If jbd2_journal_restart() fails the handle will have been disconnected
from the current transaction. In this situation, the handle must not
be used for for any jbd2 function other than jbd2_journal_stop().
Enforce this with by treating a handle which has a NULL transaction
pointer as an aborted handle, and issue a kernel warning if
jbd2_journal_extent(), jbd2_journal_get_write_access(),
jbd2_journal_dirty_metadata(), etc. is called with an invalid handle.

This commit also fixes a bug where jbd2_journal_stop() would trip over
a kernel jbd2 assertion check when trying to free an invalid handle.

Also move the responsibility of setting current->journal_info to
start_this_handle(), simplifying the three users of this function.

Signed-off-by: "Theodore Ts'o" <[email protected]>
Reported-by: Younger Liu <[email protected]>
Cc: Jan Kara <[email protected]>
---
fs/jbd2/transaction.c | 74 ++++++++++++++++++++++++++++++---------------------
include/linux/jbd2.h | 2 +-
2 files changed, 44 insertions(+), 32 deletions(-)

diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
index 383b0fb..7aa9a32 100644
--- a/fs/jbd2/transaction.c
+++ b/fs/jbd2/transaction.c
@@ -368,6 +368,7 @@ repeat:
atomic_read(&transaction->t_outstanding_credits),
jbd2_log_space_left(journal));
read_unlock(&journal->j_state_lock);
+ current->journal_info = handle;

lock_map_acquire(&handle->h_lockdep_map);
jbd2_journal_free_transaction(new_transaction);
@@ -442,14 +443,11 @@ handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int rsv_blocks,
handle->h_rsv_handle = rsv_handle;
}

- current->journal_info = handle;
-
err = start_this_handle(journal, handle, gfp_mask);
if (err < 0) {
if (handle->h_rsv_handle)
jbd2_free_handle(handle->h_rsv_handle);
jbd2_free_handle(handle);
- current->journal_info = NULL;
return ERR_PTR(err);
}
handle->h_type = type;
@@ -511,16 +509,13 @@ int jbd2_journal_start_reserved(handle_t *handle, unsigned int type,
}

handle->h_journal = NULL;
- current->journal_info = handle;
/*
* GFP_NOFS is here because callers are likely from writeback or
* similarly constrained call sites
*/
ret = start_this_handle(journal, handle, GFP_NOFS);
- if (ret < 0) {
- current->journal_info = NULL;
+ if (ret < 0)
jbd2_journal_free_reserved(handle);
- }
handle->h_type = type;
handle->h_line_no = line_no;
return ret;
@@ -550,20 +545,21 @@ EXPORT_SYMBOL(jbd2_journal_start_reserved);
int jbd2_journal_extend(handle_t *handle, int nblocks)
{
transaction_t *transaction = handle->h_transaction;
- journal_t *journal = transaction->t_journal;
+ journal_t *journal;
int result;
int wanted;

- result = -EIO;
+ WARN_ON(!transaction);
if (is_handle_aborted(handle))
- goto out;
+ return -EROFS;
+ journal = transaction->t_journal;

result = 1;

read_lock(&journal->j_state_lock);

/* Don't extend a locked-down transaction! */
- if (handle->h_transaction->t_state != T_RUNNING) {
+ if (transaction->t_state != T_RUNNING) {
jbd_debug(3, "denied handle %p %d blocks: "
"transaction not running\n", handle, nblocks);
goto error_out;
@@ -589,7 +585,7 @@ int jbd2_journal_extend(handle_t *handle, int nblocks)
}

trace_jbd2_handle_extend(journal->j_fs_dev->bd_dev,
- handle->h_transaction->t_tid,
+ transaction->t_tid,
handle->h_type, handle->h_line_no,
handle->h_buffer_credits,
nblocks);
@@ -603,7 +599,6 @@ unlock:
spin_unlock(&transaction->t_handle_lock);
error_out:
read_unlock(&journal->j_state_lock);
-out:
return result;
}

@@ -626,14 +621,16 @@ out:
int jbd2__journal_restart(handle_t *handle, int nblocks, gfp_t gfp_mask)
{
transaction_t *transaction = handle->h_transaction;
- journal_t *journal = transaction->t_journal;
+ journal_t *journal;
tid_t tid;
int need_to_start, ret;

+ WARN_ON(!transaction);
/* If we've had an abort of any type, don't even think about
* actually doing the restart! */
if (is_handle_aborted(handle))
return 0;
+ journal = transaction->t_journal;

/*
* First unlink the handle from its current transaction, and start the
@@ -654,6 +651,8 @@ int jbd2__journal_restart(handle_t *handle, int nblocks, gfp_t gfp_mask)
wake_up(&journal->j_wait_updates);
tid = transaction->t_tid;
spin_unlock(&transaction->t_handle_lock);
+ handle->h_transaction = NULL;
+ current->journal_info = NULL;

jbd_debug(2, "restarting handle %p\n", handle);
need_to_start = !tid_geq(journal->j_commit_request, tid);
@@ -783,17 +782,16 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
int force_copy)
{
struct buffer_head *bh;
- transaction_t *transaction;
+ transaction_t *transaction = handle->h_transaction;
journal_t *journal;
int error;
char *frozen_buffer = NULL;
int need_copy = 0;
unsigned long start_lock, time_lock;

+ WARN_ON(!transaction);
if (is_handle_aborted(handle))
return -EROFS;
-
- transaction = handle->h_transaction;
journal = transaction->t_journal;

jbd_debug(5, "journal_head %p, force_copy %d\n", jh, force_copy);
@@ -1052,14 +1050,16 @@ int jbd2_journal_get_write_access(handle_t *handle, struct buffer_head *bh)
int jbd2_journal_get_create_access(handle_t *handle, struct buffer_head *bh)
{
transaction_t *transaction = handle->h_transaction;
- journal_t *journal = transaction->t_journal;
+ journal_t *journal;
struct journal_head *jh = jbd2_journal_add_journal_head(bh);
int err;

jbd_debug(5, "journal_head %p\n", jh);
+ WARN_ON(!transaction);
err = -EROFS;
if (is_handle_aborted(handle))
goto out;
+ journal = transaction->t_journal;
err = 0;

JBUFFER_TRACE(jh, "entry");
@@ -1265,12 +1265,14 @@ void jbd2_buffer_abort_trigger(struct journal_head *jh,
int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)
{
transaction_t *transaction = handle->h_transaction;
- journal_t *journal = transaction->t_journal;
+ journal_t *journal;
struct journal_head *jh;
int ret = 0;

+ WARN_ON(!transaction);
if (is_handle_aborted(handle))
- goto out;
+ return -EROFS;
+ journal = transaction->t_journal;
jh = jbd2_journal_grab_journal_head(bh);
if (!jh) {
ret = -EUCLEAN;
@@ -1364,7 +1366,7 @@ int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)

JBUFFER_TRACE(jh, "file as BJ_Metadata");
spin_lock(&journal->j_list_lock);
- __jbd2_journal_file_buffer(jh, handle->h_transaction, BJ_Metadata);
+ __jbd2_journal_file_buffer(jh, transaction, BJ_Metadata);
spin_unlock(&journal->j_list_lock);
out_unlock_bh:
jbd_unlock_bh_state(bh);
@@ -1395,12 +1397,17 @@ out:
int jbd2_journal_forget (handle_t *handle, struct buffer_head *bh)
{
transaction_t *transaction = handle->h_transaction;
- journal_t *journal = transaction->t_journal;
+ journal_t *journal;
struct journal_head *jh;
int drop_reserve = 0;
int err = 0;
int was_modified = 0;

+ WARN_ON(!transaction);
+ if (is_handle_aborted(handle))
+ return -EROFS;
+ journal = transaction->t_journal;
+
BUFFER_TRACE(bh, "entry");

jbd_lock_bh_state(bh);
@@ -1427,7 +1434,7 @@ int jbd2_journal_forget (handle_t *handle, struct buffer_head *bh)
*/
jh->b_modified = 0;

- if (jh->b_transaction == handle->h_transaction) {
+ if (jh->b_transaction == transaction) {
J_ASSERT_JH(jh, !jh->b_frozen_data);

/* If we are forgetting a buffer which is already part
@@ -1522,19 +1529,21 @@ drop:
int jbd2_journal_stop(handle_t *handle)
{
transaction_t *transaction = handle->h_transaction;
- journal_t *journal = transaction->t_journal;
- int err, wait_for_commit = 0;
+ journal_t *journal;
+ int err = 0, wait_for_commit = 0;
tid_t tid;
pid_t pid;

+ if (!transaction)
+ goto free_and_exit;
+ journal = transaction->t_journal;
+
J_ASSERT(journal_current_handle() == handle);

if (is_handle_aborted(handle))
err = -EIO;
- else {
+ else
J_ASSERT(atomic_read(&transaction->t_updates) > 0);
- err = 0;
- }

if (--handle->h_ref > 0) {
jbd_debug(4, "h_ref %d -> %d\n", handle->h_ref + 1,
@@ -1544,7 +1553,7 @@ int jbd2_journal_stop(handle_t *handle)

jbd_debug(4, "Handle %p going down\n", handle);
trace_jbd2_handle_stats(journal->j_fs_dev->bd_dev,
- handle->h_transaction->t_tid,
+ transaction->t_tid,
handle->h_type, handle->h_line_no,
jiffies - handle->h_start_jiffies,
handle->h_sync, handle->h_requested_credits,
@@ -1657,6 +1666,7 @@ int jbd2_journal_stop(handle_t *handle)

if (handle->h_rsv_handle)
jbd2_journal_free_reserved(handle->h_rsv_handle);
+free_and_exit:
jbd2_free_handle(handle);
return err;
}
@@ -2362,10 +2372,12 @@ void jbd2_journal_refile_buffer(journal_t *journal, struct journal_head *jh)
int jbd2_journal_file_inode(handle_t *handle, struct jbd2_inode *jinode)
{
transaction_t *transaction = handle->h_transaction;
- journal_t *journal = transaction->t_journal;
+ journal_t *journal;

+ WARN_ON(!transaction);
if (is_handle_aborted(handle))
- return -EIO;
+ return -EROFS;
+ journal = transaction->t_journal;

jbd_debug(4, "Adding inode %lu, tid:%d\n", jinode->i_vfs_inode->i_ino,
transaction->t_tid);
diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
index 0302f3f..d5b50a1 100644
--- a/include/linux/jbd2.h
+++ b/include/linux/jbd2.h
@@ -1266,7 +1266,7 @@ static inline int is_journal_aborted(journal_t *journal)

static inline int is_handle_aborted(handle_t *handle)
{
- if (handle->h_aborted)
+ if (handle->h_aborted || !handle->h_transaction)
return 1;
return is_journal_aborted(handle->h_transaction->t_journal);
}
--
1.7.12.rc0.22.gcdd159b


2013-07-03 12:22:36

by Younger Liu

[permalink] [raw]
Subject: Re: [PATCH] jbd2: invalidate handle if jbd2_journal_restart() fails

On 2013/6/30 7:46, Theodore Ts'o wrote:
> On Tue, Jun 25, 2013 at 05:42:10PM +0800, Younger Liu wrote:
>> If jbd2__journal_restart fails, handle->h_transaction may be NULL.
>> So we should check handle->h_transaction before
>> "journal = transaction->t_journal", Right?
>
> Good point, yes. Here's a new version of the patch which I have in my tree.
>
> - Ted
>
>>From fd8d369f9ad921eb6dc5c56e87e4c6d6106bad56 Mon Sep 17 00:00:00 2001
> From: Theodore Ts'o <[email protected]>
> Date: Sun, 23 Jun 2013 12:59:01 -0400
> Subject: [PATCH] jbd2: invalidate handle if jbd2_journal_restart() fails
>
> If jbd2_journal_restart() fails the handle will have been disconnected
> from the current transaction. In this situation, the handle must not
> be used for for any jbd2 function other than jbd2_journal_stop().
> Enforce this with by treating a handle which has a NULL transaction
> pointer as an aborted handle, and issue a kernel warning if
> jbd2_journal_extent(), jbd2_journal_get_write_access(),
> jbd2_journal_dirty_metadata(), etc. is called with an invalid handle.
>
> This commit also fixes a bug where jbd2_journal_stop() would trip over
> a kernel jbd2 assertion check when trying to free an invalid handle.
>
> Also move the responsibility of setting current->journal_info to
> start_this_handle(), simplifying the three users of this function.
>
> Signed-off-by: "Theodore Ts'o" <[email protected]>
> Reported-by: Younger Liu <[email protected]>
> Cc: Jan Kara <[email protected]>
> ---
> fs/jbd2/transaction.c | 74 ++++++++++++++++++++++++++++++---------------------
> include/linux/jbd2.h | 2 +-
> 2 files changed, 44 insertions(+), 32 deletions(-)
>
> diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c
> index 383b0fb..7aa9a32 100644
> --- a/fs/jbd2/transaction.c
> +++ b/fs/jbd2/transaction.c
> @@ -368,6 +368,7 @@ repeat:
> atomic_read(&transaction->t_outstanding_credits),
> jbd2_log_space_left(journal));
> read_unlock(&journal->j_state_lock);
> + current->journal_info = handle;
>
> lock_map_acquire(&handle->h_lockdep_map);
> jbd2_journal_free_transaction(new_transaction);
> @@ -442,14 +443,11 @@ handle_t *jbd2__journal_start(journal_t *journal, int nblocks, int rsv_blocks,
> handle->h_rsv_handle = rsv_handle;
> }
>
> - current->journal_info = handle;
> -
> err = start_this_handle(journal, handle, gfp_mask);
> if (err < 0) {
> if (handle->h_rsv_handle)
> jbd2_free_handle(handle->h_rsv_handle);
> jbd2_free_handle(handle);
> - current->journal_info = NULL;
> return ERR_PTR(err);
> }
> handle->h_type = type;
> @@ -511,16 +509,13 @@ int jbd2_journal_start_reserved(handle_t *handle, unsigned int type,
> }
>
> handle->h_journal = NULL;
> - current->journal_info = handle;
> /*
> * GFP_NOFS is here because callers are likely from writeback or
> * similarly constrained call sites
> */
> ret = start_this_handle(journal, handle, GFP_NOFS);
> - if (ret < 0) {
> - current->journal_info = NULL;
> + if (ret < 0)
> jbd2_journal_free_reserved(handle);
> - }
> handle->h_type = type;
> handle->h_line_no = line_no;
> return ret;
> @@ -550,20 +545,21 @@ EXPORT_SYMBOL(jbd2_journal_start_reserved);
> int jbd2_journal_extend(handle_t *handle, int nblocks)
> {
> transaction_t *transaction = handle->h_transaction;
> - journal_t *journal = transaction->t_journal;
> + journal_t *journal;
> int result;
> int wanted;
>
> - result = -EIO;
> + WARN_ON(!transaction);
> if (is_handle_aborted(handle))
> - goto out;
> + return -EROFS;
> + journal = transaction->t_journal;
>
> result = 1;
>
> read_lock(&journal->j_state_lock);
>
> /* Don't extend a locked-down transaction! */
> - if (handle->h_transaction->t_state != T_RUNNING) {
> + if (transaction->t_state != T_RUNNING) {
> jbd_debug(3, "denied handle %p %d blocks: "
> "transaction not running\n", handle, nblocks);
> goto error_out;
> @@ -589,7 +585,7 @@ int jbd2_journal_extend(handle_t *handle, int nblocks)
> }
>
> trace_jbd2_handle_extend(journal->j_fs_dev->bd_dev,
> - handle->h_transaction->t_tid,
> + transaction->t_tid,
> handle->h_type, handle->h_line_no,
> handle->h_buffer_credits,
> nblocks);
> @@ -603,7 +599,6 @@ unlock:
> spin_unlock(&transaction->t_handle_lock);
> error_out:
> read_unlock(&journal->j_state_lock);
> -out:
> return result;
> }
>
> @@ -626,14 +621,16 @@ out:
> int jbd2__journal_restart(handle_t *handle, int nblocks, gfp_t gfp_mask)
> {
> transaction_t *transaction = handle->h_transaction;
> - journal_t *journal = transaction->t_journal;
> + journal_t *journal;
> tid_t tid;
> int need_to_start, ret;
>
> + WARN_ON(!transaction);
> /* If we've had an abort of any type, don't even think about
> * actually doing the restart! */
> if (is_handle_aborted(handle))
> return 0;
> + journal = transaction->t_journal;
>
> /*
> * First unlink the handle from its current transaction, and start the
> @@ -654,6 +651,8 @@ int jbd2__journal_restart(handle_t *handle, int nblocks, gfp_t gfp_mask)
> wake_up(&journal->j_wait_updates);
> tid = transaction->t_tid;
> spin_unlock(&transaction->t_handle_lock);
> + handle->h_transaction = NULL;
> + current->journal_info = NULL;
>
> jbd_debug(2, "restarting handle %p\n", handle);
> need_to_start = !tid_geq(journal->j_commit_request, tid);
> @@ -783,17 +782,16 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
> int force_copy)
> {
> struct buffer_head *bh;
> - transaction_t *transaction;
> + transaction_t *transaction = handle->h_transaction;
> journal_t *journal;
> int error;
> char *frozen_buffer = NULL;
> int need_copy = 0;
> unsigned long start_lock, time_lock;
>
> + WARN_ON(!transaction);
> if (is_handle_aborted(handle))
> return -EROFS;
> -
> - transaction = handle->h_transaction;
> journal = transaction->t_journal;
>
> jbd_debug(5, "journal_head %p, force_copy %d\n", jh, force_copy);
> @@ -1052,14 +1050,16 @@ int jbd2_journal_get_write_access(handle_t *handle, struct buffer_head *bh)
> int jbd2_journal_get_create_access(handle_t *handle, struct buffer_head *bh)
> {
> transaction_t *transaction = handle->h_transaction;
> - journal_t *journal = transaction->t_journal;
> + journal_t *journal;
> struct journal_head *jh = jbd2_journal_add_journal_head(bh);
> int err;
>
> jbd_debug(5, "journal_head %p\n", jh);
> + WARN_ON(!transaction);
> err = -EROFS;
> if (is_handle_aborted(handle))
> goto out;
> + journal = transaction->t_journal;
> err = 0;
>
> JBUFFER_TRACE(jh, "entry");
> @@ -1265,12 +1265,14 @@ void jbd2_buffer_abort_trigger(struct journal_head *jh,
> int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)
> {
> transaction_t *transaction = handle->h_transaction;
> - journal_t *journal = transaction->t_journal;
> + journal_t *journal;
> struct journal_head *jh;
> int ret = 0;
>
> + WARN_ON(!transaction);
> if (is_handle_aborted(handle))
> - goto out;
> + return -EROFS;
> + journal = transaction->t_journal;
> jh = jbd2_journal_grab_journal_head(bh);
> if (!jh) {
> ret = -EUCLEAN;
> @@ -1364,7 +1366,7 @@ int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh)
>
> JBUFFER_TRACE(jh, "file as BJ_Metadata");
> spin_lock(&journal->j_list_lock);
> - __jbd2_journal_file_buffer(jh, handle->h_transaction, BJ_Metadata);
> + __jbd2_journal_file_buffer(jh, transaction, BJ_Metadata);
> spin_unlock(&journal->j_list_lock);
> out_unlock_bh:
> jbd_unlock_bh_state(bh);
> @@ -1395,12 +1397,17 @@ out:
> int jbd2_journal_forget (handle_t *handle, struct buffer_head *bh)
> {
> transaction_t *transaction = handle->h_transaction;
> - journal_t *journal = transaction->t_journal;
> + journal_t *journal;
> struct journal_head *jh;
> int drop_reserve = 0;
> int err = 0;
> int was_modified = 0;
>
> + WARN_ON(!transaction);
> + if (is_handle_aborted(handle))
> + return -EROFS;
> + journal = transaction->t_journal;
> +
> BUFFER_TRACE(bh, "entry");
>
> jbd_lock_bh_state(bh);
> @@ -1427,7 +1434,7 @@ int jbd2_journal_forget (handle_t *handle, struct buffer_head *bh)
> */
> jh->b_modified = 0;
>
> - if (jh->b_transaction == handle->h_transaction) {
> + if (jh->b_transaction == transaction) {
> J_ASSERT_JH(jh, !jh->b_frozen_data);
>
> /* If we are forgetting a buffer which is already part
> @@ -1522,19 +1529,21 @@ drop:
> int jbd2_journal_stop(handle_t *handle)
> {
> transaction_t *transaction = handle->h_transaction;
> - journal_t *journal = transaction->t_journal;
> - int err, wait_for_commit = 0;
> + journal_t *journal;
> + int err = 0, wait_for_commit = 0;
> tid_t tid;
> pid_t pid;
>
> + if (!transaction)
> + goto free_and_exit;
> + journal = transaction->t_journal;
> +
> J_ASSERT(journal_current_handle() == handle);
>
> if (is_handle_aborted(handle))
> err = -EIO;
> - else {
> + else
> J_ASSERT(atomic_read(&transaction->t_updates) > 0);
> - err = 0;
> - }
>
> if (--handle->h_ref > 0) {
> jbd_debug(4, "h_ref %d -> %d\n", handle->h_ref + 1,
> @@ -1544,7 +1553,7 @@ int jbd2_journal_stop(handle_t *handle)
>
> jbd_debug(4, "Handle %p going down\n", handle);
> trace_jbd2_handle_stats(journal->j_fs_dev->bd_dev,
> - handle->h_transaction->t_tid,
> + transaction->t_tid,
> handle->h_type, handle->h_line_no,
> jiffies - handle->h_start_jiffies,
> handle->h_sync, handle->h_requested_credits,
> @@ -1657,6 +1666,7 @@ int jbd2_journal_stop(handle_t *handle)
>
> if (handle->h_rsv_handle)
> jbd2_journal_free_reserved(handle->h_rsv_handle);
> +free_and_exit:
> jbd2_free_handle(handle);
> return err;
> }
> @@ -2362,10 +2372,12 @@ void jbd2_journal_refile_buffer(journal_t *journal, struct journal_head *jh)
> int jbd2_journal_file_inode(handle_t *handle, struct jbd2_inode *jinode)
> {
> transaction_t *transaction = handle->h_transaction;
> - journal_t *journal = transaction->t_journal;
> + journal_t *journal;
>
> + WARN_ON(!transaction);
> if (is_handle_aborted(handle))
> - return -EIO;
> + return -EROFS;
> + journal = transaction->t_journal;
>
> jbd_debug(4, "Adding inode %lu, tid:%d\n", jinode->i_vfs_inode->i_ino,
> transaction->t_tid);
> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h
> index 0302f3f..d5b50a1 100644
> --- a/include/linux/jbd2.h
> +++ b/include/linux/jbd2.h
> @@ -1266,7 +1266,7 @@ static inline int is_journal_aborted(journal_t *journal)
>
> static inline int is_handle_aborted(handle_t *handle)
> {
> - if (handle->h_aborted)
> + if (handle->h_aborted || !handle->h_transaction)
> return 1;
> return is_journal_aborted(handle->h_transaction->t_journal);
> }
>

Thanks for your patch.
I test the patch in ocfs2 file system, the results look fine.

# mkfs.ocfs2 -b 4K -C 32K -T datafiles /dev/sdc
...(jounral size is 32M)
# mount.ocfs2 /dev/sdc /mnt/ocfs2/
# touch /mnt/ocfs2/1.vhd
# fallocate -o 0 -l 400G /mnt/ocfs2/1.vhd
fallocate: /mnt/ocfs2/1.vhd: fallocate failed: Cannot allocate memory
# tail -f /var/log/messages
Jul 3 19:40:58 linux-KooNDD kernel: [ 7302.646408] JBD: Ignoring recovery information on journal
Jul 3 19:40:58 linux-KooNDD kernel: [ 7302.667908] ocfs2: Mounting device (65,160) on (node 10, slot 0) with ordered data mode.
Jul 3 19:41:19 linux-KooNDD ovsdb-server: 00129|monitor|INFO|Monitor: send HEARTBEART_MESSGE again count = 71400
Jul 3 19:41:54 linux-KooNDD ovs-brcompatd: 00132|monitor|INFO|Monitor: send HEARTBEART_MESSGE again count = 73200
Jul 3 19:42:08 linux-KooNDD kernel: [ 7372.278591] JBD: fallocate wants too many credits (2051 > 2048)
Jul 3 19:42:08 linux-KooNDD kernel: [ 7372.278597] (fallocate,6438,0):__ocfs2_extend_allocation:709 ERROR: status = -12
Jul 3 19:42:08 linux-KooNDD kernel: [ 7372.278603] (fallocate,6438,0):ocfs2_allocate_unwritten_extents:1504 ERROR: status = -12
Jul 3 19:42:08 linux-KooNDD kernel: [ 7372.278607] (fallocate,6438,0):__ocfs2_change_file_space:1955 ERROR: status = -12
Jul 3 19:42:20 linux-KooNDD ovsdb-server: 00130|monitor|INFO|Monitor: send HEARTBEART_MESSGE again count = 72000
Jul 3 19:42:20 linux-KooNDD ovsdb-server: 00131|vlog|INFO|opened log file /var/log/openvswitch//ovsdb-server.log
^C

If fallocate wants too many journal space, it would issue a kernel warning.

# cat test.sh
start=0
end=500
while [ ${start} -le ${end} ]
do
echo ${start}
touch /mnt/ocfs2/${start}.vhd
fallocate -o 0 -l ${start}G /mnt/ocfs2/${start}.vhd
ls -l /mnt/ocfs2/
du -h /mnt/ocfs2/
rm /mnt/ocfs2/${start}.vhd
sleep 2
start=$((${start}+5))
done
#sh test.sh

the results are fine too.

Tested-by: Younger Liu <[email protected]>

-- Younger





2013-07-03 12:42:56

by Younger Liu

[permalink] [raw]
Subject: Re: [PATCH] jbd2: invalidate handle if jbd2_journal_restart() fails

On 2013/7/3 20:22, Younger Liu wrote:
> On 2013/6/30 7:46, Theodore Ts'o wrote:
>> On Tue, Jun 25, 2013 at 05:42:10PM +0800, Younger Liu wrote:
>>> If jbd2__journal_restart fails, handle->h_transaction may be NULL.
>>> So we should check handle->h_transaction before
>>> "journal = transaction->t_journal", Right?
>>
>> Good point, yes. Here's a new version of the patch which I have in my tree.
>>
>> - Ted
>>
>> >From fd8d369f9ad921eb6dc5c56e87e4c6d6106bad56 Mon Sep 17 00:00:00 2001
>> From: Theodore Ts'o <[email protected]>
>> Date: Sun, 23 Jun 2013 12:59:01 -0400
>> Subject: [PATCH] jbd2: invalidate handle if jbd2_journal_restart() fails
>>
>> If jbd2_journal_restart() fails the handle will have been disconnected
>> from the current transaction. In this situation, the handle must not
>> be used for for any jbd2 function other than jbd2_journal_stop().
>> Enforce this with by treating a handle which has a NULL transaction
>> pointer as an aborted handle, and issue a kernel warning if
>> jbd2_journal_extent(), jbd2_journal_get_write_access(),
>> jbd2_journal_dirty_metadata(), etc. is called with an invalid handle.
>>
>> This commit also fixes a bug where jbd2_journal_stop() would trip over
>> a kernel jbd2 assertion check when trying to free an invalid handle.
>>
>> Also move the responsibility of setting current->journal_info to
>> start_this_handle(), simplifying the three users of this function.
>>
>> Signed-off-by: "Theodore Ts'o" <[email protected]>
>> Reported-by: Younger Liu <[email protected]>
>> Cc: Jan Kara <[email protected]>
...
>
> Thanks for your patch.
> I test the patch in ocfs2 file system, the results look fine.
>
> # mkfs.ocfs2 -b 4K -C 32K -T datafiles /dev/sdc
> ...(jounral size is 32M)
> # mount.ocfs2 /dev/sdc /mnt/ocfs2/
> # touch /mnt/ocfs2/1.vhd
> # fallocate -o 0 -l 400G /mnt/ocfs2/1.vhd
> fallocate: /mnt/ocfs2/1.vhd: fallocate failed: Cannot allocate memory
> # tail -f /var/log/messages
> Jul 3 19:40:58 linux-KooNDD kernel: [ 7302.646408] JBD: Ignoring recovery information on journal
> Jul 3 19:40:58 linux-KooNDD kernel: [ 7302.667908] ocfs2: Mounting device (65,160) on (node 10, slot 0) with ordered data mode.
> Jul 3 19:41:19 linux-KooNDD ovsdb-server: 00129|monitor|INFO|Monitor: send HEARTBEART_MESSGE again count = 71400
> Jul 3 19:41:54 linux-KooNDD ovs-brcompatd: 00132|monitor|INFO|Monitor: send HEARTBEART_MESSGE again count = 73200
> Jul 3 19:42:08 linux-KooNDD kernel: [ 7372.278591] JBD: fallocate wants too many credits (2051 > 2048)
> Jul 3 19:42:08 linux-KooNDD kernel: [ 7372.278597] (fallocate,6438,0):__ocfs2_extend_allocation:709 ERROR: status = -12
> Jul 3 19:42:08 linux-KooNDD kernel: [ 7372.278603] (fallocate,6438,0):ocfs2_allocate_unwritten_extents:1504 ERROR: status = -12
> Jul 3 19:42:08 linux-KooNDD kernel: [ 7372.278607] (fallocate,6438,0):__ocfs2_change_file_space:1955 ERROR: status = -12
> Jul 3 19:42:20 linux-KooNDD ovsdb-server: 00130|monitor|INFO|Monitor: send HEARTBEART_MESSGE again count = 72000
> Jul 3 19:42:20 linux-KooNDD ovsdb-server: 00131|vlog|INFO|opened log file /var/log/openvswitch//ovsdb-server.log
> ^C
>
> If fallocate wants too many journal space, it would issue a kernel warning.
>
> # cat test.sh
> start=0
> end=500
> while [ ${start} -le ${end} ]
> do
> echo ${start}
> touch /mnt/ocfs2/${start}.vhd
> fallocate -o 0 -l ${start}G /mnt/ocfs2/${start}.vhd
> ls -l /mnt/ocfs2/
> du -h /mnt/ocfs2/
> rm /mnt/ocfs2/${start}.vhd
> sleep 2
> start=$((${start}+5))
> done
> #sh test.sh
>
> the results are fine too.
>
> Tested-by: Younger Liu <[email protected]>
>
> -- Younger
Sorry, I made a mistake, email is not right.
It should be:
Tested-by: Younger Liu <[email protected]>

>
>
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> .
>