hi!
ext3/jbd use b_committed_data buffer in order to prevent
allocation of blocks which were freed in non-committed
transaction. I think there is bug in this code. look,
some thread commit thread
----------------------------------------------------------
get_undo_access(#1)
dirty_buffer(#1)
stop_journal()
start commit
start_journal()
get_undo_access(#1):
1) wait for #1 to be
in t_forget_list
write #1 to log
put #1 onto t_forget_list
2) b_commit_data exists,
finish get_undo_access()
for_each_bh_in_forget_list() {
if (jh->b_committed_data) {
kfree(jh->b_committed_data);
jh->b_committed_data = NULL;
}
}
/* using of b_committed_data */
b_committed_data is NULL ?
with best regards, Alex
Hi,
On Sun, 2003-05-18 at 18:21, Alex Tomas wrote:
> ext3/jbd use b_committed_data buffer in order to prevent
> allocation of blocks which were freed in non-committed
> transaction. I think there is bug in this code. look,
You just found a rather subtle bug which was fixed around 2 or 3 years
ago. :-)
> some thread commit thread
> ----------------------------------------------------------
> start_journal()
> get_undo_access(#1):
> 1) wait for #1 to be
> in t_forget_list
get_undo_access is a declaration of intention to modify the buffer.
When that happens, it calls do_get_write_access() with the force_copy
flag set. That means that it _always_ creates a new frozen_data copy of
the buffer the first time we get undo access to a bitmap buffer within
any given transaction. That basically means that for bitmaps,
frozen_data always holds the version of the buffer as of the end of the
previously completed transaction.
> for_each_bh_in_forget_list() {
> if (jh->b_committed_data) {
> kfree(jh->b_committed_data);
> jh->b_committed_data = NULL;
> }
Ah, but the *immediately* following lines are:
if (jh->b_frozen_data) {
jh->b_committed_data = jh->b_frozen_data;
jh->b_frozen_data = NULL;
}
so the frozen data that was preserved at get_undo_access() time has now
committed to disk and gets rotated into the b_committed_data version.
This is exactly how we get the new version of the committed data when
the old transaction commits.
Cheers,
Stephen
hi!
please, look:
thread A commit thread
if (jh->b_committed_data) {
kfree(jh->b_committed_data);
jh->b_committed_data = NULL;
}
access for
b_committed_data == NULL ?
if (jh->b_frozen_data) {
jh->b_committed_data = jh->b_frozen_data;
jh->b_frozen_data = NULL;
}
or I miss something subtle here?
>>>>> Stephen C Tweedie (SCT) writes:
SCT> get_undo_access is a declaration of intention to modify the buffer.
SCT> When that happens, it calls do_get_write_access() with the force_copy
SCT> flag set. That means that it _always_ creates a new frozen_data copy of
SCT> the buffer the first time we get undo access to a bitmap buffer within
SCT> any given transaction. That basically means that for bitmaps,
SCT> frozen_data always holds the version of the buffer as of the end of the
SCT> previously completed transaction.
>> for_each_bh_in_forget_list() {
>> if (jh->b_committed_data) {
>> kfree(jh->b_committed_data);
jh-> b_committed_data = NULL;
>> }
SCT> Ah, but the *immediately* following lines are:
SCT> if (jh->b_frozen_data) {
jh-> b_committed_data = jh->b_frozen_data;
jh-> b_frozen_data = NULL;
SCT> }
SCT> so the frozen data that was preserved at get_undo_access() time has now
SCT> committed to disk and gets rotated into the b_committed_data version.
SCT> This is exactly how we get the new version of the committed data when
SCT> the old transaction commits.
SCT> Cheers,
SCT> Stephen
Hi,
On Tue, 2003-05-20 at 01:46, Alex Tomas wrote:
> please, look:
>
> thread A commit thread
>
> if (jh->b_committed_data) {
> kfree(jh->b_committed_data);
> jh->b_committed_data = NULL;
> }
> access for
> b_committed_data == NULL ?
Not with BKL. Without it, yes, that's definitely a risk, and you need
some locking for the access to b_committed_data. Without that, even if
you keep the jh->b_committed_data field valid, you risk freeing the old
copy that another thread is using.
Cheers,
Stephen
aha. now it's clear. thank you. I catched J_ASSERT(b_committed_data != NULL)
in ext3_free_blocks() with de-BKL'ed JBD. hence, my solution is to have a
tid journal_head indicating which transaction uses b_committed_data.
I don't want to look intrusive, but .. what do you think about new locking
schema I'm trying to implement?
>>>>> Stephen C Tweedie (SCT) writes:
>> access for
>> b_committed_data == NULL ?
SCT> Not with BKL. Without it, yes, that's definitely a risk, and you need
SCT> some locking for the access to b_committed_data. Without that, even if
SCT> you keep the jh->b_committed_data field valid, you risk freeing the old
SCT> copy that another thread is using.
SCT> Cheers,
SCT> Stephen