2011-04-26 08:32:40

by Niraj Kulkarni

Subject: Need of revoke mechanism in JBD

Hi all,
I am new to fs development. I am trying to modify the journal
structure of JBD. While analyzing the code, I could understand most of
it, but I am not able to understand the need for the revoke
mechanism. Can anybody enlighten me on this issue?

Regards
Niraj


2011-04-26 08:56:59

by Yongqiang Yang

Subject: Re: Need of revoke mechanism in JBD

AFAIK, it can accelerate the recovery process. If a block is in the
revoke table of a transaction t1 and t1 is committed, then there is
no need to recover that block from any transaction earlier than
t1.
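
As a rough illustration of that rule, here is a minimal sketch. It is not JBD code: the function name `should_replay` and the dict-based revoke table are invented for this example, transaction-id wraparound is ignored, and the comparison assumes a revoke also covers copies logged in the revoking transaction itself.

```python
# Hypothetical model of the replay decision; names are illustrative, not JBD's.

def should_replay(block, tid, revoke_table):
    """Return True if the copy of `block` logged in transaction `tid` should be written back.

    revoke_table maps a block number to the id of the latest committed
    transaction that revoked it. Assumption: a revoke cancels every copy
    logged in that transaction or in an earlier one.
    """
    revoked_in = revoke_table.get(block)
    return revoked_in is None or tid > revoked_in


# Block 100 was revoked in committed transaction t1 = 7.
revoke_table = {100: 7}

print(should_replay(100, 5, revoke_table))  # copy from an earlier transaction
print(should_replay(100, 9, revoke_table))  # copy logged after the revoke
print(should_replay(200, 5, revoke_table))  # block that was never revoked
```

Only the first call is skipped during recovery; the other two copies are still written back.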

--
Best Wishes
Yongqiang Yang

2011-04-26 09:23:21

by Ding Dinghua

Subject: Re: Need of revoke mechanism in JBD

I think it's not only a performance issue but, more importantly, a
correctness issue.
The revoke table is used to prevent a wrong replay of the journal, which
would cause data corruption:
If a modification of block A has been journalled and committed to the
journal but hasn't been checkpointed,
and in a later transaction block A is freed and reused for data in
no-journalled-data mode, then if
we don't have a revoke table recording the release, replay
of the journal will overwrite the new data,
causing data corruption.
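
The hazard can be sketched with a toy model. This is an assumption-laden illustration, not JBD code: the "disk" and "journal" are plain dicts, `naive_replay` is an invented name, and the replay deliberately has no revoke handling so that the corruption is visible.

```python
# Toy model of the scenario above; plain dicts stand in for the disk and the journal.

def naive_replay(journal, disk):
    """Write back every journalled block copy, oldest transaction first.

    There is deliberately no revoke handling here.
    """
    for txn in journal:
        disk.update(txn)
    return disk


# A committed but not yet checkpointed transaction holds a copy of block A.
journal = [{"A": "old metadata"}]

# Block A was later freed and reused for file data, which was written
# straight to disk without going through the journal.
disk_at_crash = {"A": "new file data"}

recovered = naive_replay(journal, dict(disk_at_crash))
print(recovered["A"])
```

After this replay block A holds the old metadata again, so the file data written after the block was reused is lost. A revoke record for block A in the freeing transaction is what tells recovery to skip that stale copy.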

--
Ding Dinghua

2011-04-26 10:42:51

by Niraj Kulkarni

Subject: Re: Need of revoke mechanism in JBD

If I understand correctly, the journal is checkpointed when the filesystem
is unmounted.
This implies the given scenario would be pretty rare.

That is, the filesystem would first have to be mounted in full-journal
mode and crash prior to a checkpoint,
then be remounted in no-journalled-data mode without recovery,
and then be remounted again in full-journalled mode with recovery.

Am I thinking along the right lines?


2011-04-26 10:57:49

by Amir Goldstein

Subject: Re: Need of revoke mechanism in JBD

On Tue, Apr 26, 2011 at 1:47 PM, Niraj Kulkarni wrote:
> If I understand correctly, the journal is checkpointed when the
> filesystem is unmounted.
> This implies the given scenario would be pretty rare.
>
> That is, the filesystem would first have to be mounted in full-journal
> mode and crash prior to a checkpoint,
> then be remounted in no-journalled-data mode without recovery,
> and then be remounted again in full-journalled mode with recovery.
>
> Am I thinking along the right lines?

nope.
first the block is allocated as metadata and journaled.
then the metadata block is freed and re-allocated as non-journaled data.
this is the use case and it would be very easy to corrupt data if it wasn't
for journal revoke.

Amir.


2011-04-26 12:25:59

by Theodore Ts'o

Subject: Re: Need of revoke mechanism in JBD

On Tue, Apr 26, 2011 at 05:23:21PM +0800, Ding Dinghua wrote:
> I think it's not only a performance issue but, more importantly, a
> correctness issue.
> The revoke table is used to prevent a wrong replay of the journal, which
> would cause data corruption:
> If a modification of block A has been journalled and committed to the
> journal but hasn't been checkpointed,
> and in a later transaction block A is freed and reused for data in
> no-journalled-data mode, then if
> we don't have a revoke table recording the release, replay
> of the journal will overwrite the new data,
> causing data corruption.

Yes, this is correct. It should be covered fairly well in Stephen
Tweedie's "Journaling the ext2fs file system" paper, which you can
find at:

https://ext4.wiki.kernel.org/index.php/Publications

if you'd like more details.

Hope this helps!

- Ted

2011-04-26 17:27:22

by Andreas Dilger

Subject: Re: Need of revoke mechanism in JBD

On 2011-04-26, at 4:47 AM, Niraj Kulkarni wrote:
> If I understand correctly, the journal is checkpointed when the
> filesystem is unmounted.
> This implies the given scenario would be pretty rare.
>
> That is, the filesystem would first have to be mounted in full-journal
> mode and crash prior to a checkpoint,
> then be remounted in no-journalled-data mode without recovery,
> and then be remounted again in full-journalled mode with recovery.

It shouldn't be possible to mount the filesystem in no-journal mode
without doing journal recovery. The filesystem sets an INCOMPAT_RECOVER
flag when the journal has any transactions in it, and the journal should
be replayed before the filesystem is finished mounting.

Looking at ext4_fill_super() the "noload" mount option is used to avoid
loading the journal even if there is a journal (COMPAT_HAS_JOURNAL is set),
but if INCOMPAT_RECOVER is set the filesystem will refuse to mount.
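
The decision described above can be modelled roughly as follows. This is a sketch of the description in this message, not a transcription of ext4_fill_super(): the function name `mount_action` and the returned strings are invented, and only the two feature-bit values are taken from the ext4 on-disk definitions.

```python
# Sketch of the mount-time decision described above; illustrative only.

COMPAT_HAS_JOURNAL = 0x0004  # feature_compat bit: the filesystem has a journal
INCOMPAT_RECOVER = 0x0004    # feature_incompat bit: the journal needs recovery


def mount_action(feature_compat, feature_incompat, noload=False):
    """Return what a mount would do with the journal (hypothetical helper)."""
    has_journal = bool(feature_compat & COMPAT_HAS_JOURNAL)
    needs_recovery = bool(feature_incompat & INCOMPAT_RECOVER)

    if not has_journal:
        return "mount without journal"
    if noload:
        # Skipping the journal is only acceptable when nothing is pending in it.
        if needs_recovery:
            return "refuse to mount"
        return "mount without loading journal"
    if needs_recovery:
        return "replay journal, then mount"
    return "load journal and mount"


print(mount_action(COMPAT_HAS_JOURNAL, INCOMPAT_RECOVER, noload=True))
print(mount_action(COMPAT_HAS_JOURNAL, INCOMPAT_RECOVER))
```

In this model there is no path that mounts a filesystem with pending transactions while leaving them unreplayed, which is why the remount sequence proposed earlier in the thread cannot occur.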


Cheers, Andreas






2011-04-27 00:52:33

by Ding Dinghua

Subject: Re: Need of revoke mechanism in JBD

Oh, "no-journalled-data" was misleading here. I meant mounting in
data=writeback or data=ordered mode,
not in data=journal mode.

--
Ding Dinghua

2011-04-29 19:45:25

by Amir Goldstein

Subject: Re: Need of revoke mechanism in JBD

On Tue, Apr 26, 2011 at 3:25 PM, Ted Ts'o wrote:
> Yes, this is correct. It should be covered fairly well in Stephen
> Tweedie's "Journaling the ext2fs file system" paper, which you can
> find at:
>
> https://ext4.wiki.kernel.org/index.php/Publications

Actually, the original paper has no mention of revoke records.
I went out to look for useful documentation on journal forget/revoke
and came back empty handed as well.


2011-04-30 02:06:58

by Niraj Kulkarni

Subject: Re: Need of revoke mechanism in JBD

On Saturday 30 April 2011 01:15 AM, Amir Goldstein wrote:
> Actually, the original paper has no mention of revoke records.
> I went out to look for useful documentation on journal forget/revoke
> and came back empty handed as well.
Yes, I tried some other papers too, but to no avail. Anyway, I've figured
out that for my change I don't need any of the journalling-related
facilities, so I am going to bypass them completely.

Niraj

2011-05-01 22:28:14

by Theodore Ts'o

Subject: Re: Need of revoke mechanism in JBD

On Fri, Apr 29, 2011 at 10:45:23PM +0300, Amir Goldstein wrote:
>
> Actually, the original paper has no mention of revoke records.
> I went out to look for useful documentation on journal forget/revoke
> and came back empty handed as well.

Stephen Tweedie gave a talk back in 2000 which covered revoke records.
There's a transcript of his talk here:

http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html

The link to the audio file of the talk is dead, but I managed to find
a copy of the mp3 file on Google. To make sure it doesn't get lost
I've made a copy of it. The original and the copy can be found at:

http://ftp.gnumonks.org/pub/congress-talks/ols2000/high/cd1/2000-07-20_15-05-22_A_64.mp3
ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/presentations/2000-07-20_15-05-22_A_64.mp3

- Ted

2011-05-02 10:42:51

by Amir Goldstein

Subject: Re: Need of revoke mechanism in JBD

On Mon, May 2, 2011 at 1:28 AM, Ted Ts'o wrote:
> Stephen Tweedie gave a talk back in 2000 which covered revoke records.
> There's a transcript of his talk here:
>
> http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html
>
> The link to the audio file of the talk is dead, but I managed to find
> a copy of the mp3 file on Google. To make sure it doesn't get lost
> I've made a copy of it. The original and the copy can be found at:
>
> http://ftp.gnumonks.org/pub/congress-talks/ols2000/high/cd1/2000-07-20_15-05-22_A_64.mp3
> ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/presentations/2000-07-20_15-05-22_A_64.mp3
>
> - Ted

Thanks for that! It was an interesting read, a little piece of history for me.

Clearly, at the time of this talk, revoke code was still on the design table
and committed_data is not mentioned, so either it was not introduced yet
or just wasn't in the scope of the talk.

Amir.

2011-05-02 14:43:56

by Theodore Ts'o

Subject: Re: Need of revoke mechanism in JBD

On Mon, May 02, 2011 at 01:42:48PM +0300, Amir Goldstein wrote:
>
> Clearly, at the time of this talk, revoke code was still on the design table
> and committed_data is not mentioned, so either it was not introduced yet
> or just wasn't in the scope of the talk.

No, the revoke code was already implemented when he gave the talk.

"The way we're doing that in EXT3 is that deleting metadata can
cause a revoke record to be written into the journal. And when you
do the replay of the journal, the very first pass of the journal
recovery, we look for all of the revoke records and make sure that
any data that's been revoked is never, ever replayed. And so that
deals with that particular case."
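
The recovery order described in that passage can be sketched as two passes over the committed records. This is only an illustration under stated assumptions: the `(tid, kind, block, payload)` record layout and the `recover` function are invented here, and real JBD recovery works on on-disk journal blocks rather than Python tuples.

```python
# Illustrative two-pass recovery over a simplified journal; the record layout is invented.

def recover(records, disk):
    """Replay committed records onto `disk`, honouring revoke records.

    records: list of (tid, kind, block, payload) tuples, where kind is
    "block" for a logged block copy or "revoke" for a revoke record.
    """
    # Pass 1: gather every revoke record, keeping the newest transaction id per block.
    revoked = {}
    for tid, kind, block, _ in records:
        if kind == "revoke":
            revoked[block] = max(tid, revoked.get(block, tid))

    # Pass 2: replay logged copies unless a revoke from the same or a later
    # transaction covers them (an assumption of this sketch).
    for tid, kind, block, payload in records:
        if kind != "block":
            continue
        if block in revoked and tid <= revoked[block]:
            continue
        disk[block] = payload
    return disk


records = [
    (1, "block", 40, "directory block v1"),
    (2, "revoke", 40, None),  # metadata deleted, block 40 freed
    (3, "block", 41, "inode table update"),
]
disk = {40: "user file contents", 41: "stale inode table"}
result = recover(records, disk)
print(result)
```

Block 40 keeps the file contents written after it was freed, while block 41 still receives its journalled update.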

Stephen gave the talk in July, 2000. Journal support first
appeared in e2fsprogs 1.20 (released May 2000), and we fixed a bug in
the revoke handling in e2fsprogs 1.21 (released June 2000). Data
journalling mode (only) came first, and was working by late 1999;
indeed we couldn't do ordered or writeback journaling at all until we
had support for the revoke handling, for reasons which he explained in
his talk.

- Ted