Hi all,
I am new to fs development. I am trying to modify the journal
structure of JBD. While analyzing the code, I could understand most of
it, but I am not able to understand the need for the revoke
mechanism. Can anybody enlighten me on this issue?
Regards
Niraj
AFAIK, it can accelerate the recovery process. If a block is in the
revoke table of a transaction t1 and t1 is committed, then there
is no need to recover that block from transactions that are earlier than
t1.
--
Best Wishes
Yongqiang Yang
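A minimal sketch of the rule described above, not the JBD implementation: the function name, the dict-based revoke table and the block/transaction numbers are all made up for illustration. It also assumes that a copy logged in the revoking transaction itself is covered by the revoke, which the message above does not state.

```python
def should_replay(block, tid, revoked):
    """Decide whether the journal copy of `block` logged in transaction `tid`
    should be written back during recovery.

    `revoked` is a hypothetical revoke table: block number -> ID of the
    latest committed transaction that revoked the block.
    """
    revoke_tid = revoked.get(block)
    # Replay only if the block was never revoked, or if this copy was
    # logged after the revoke (assumption: same-transaction copies are skipped).
    return revoke_tid is None or tid > revoke_tid


revoked = {100: 5}  # block 100 was revoked in transaction 5

print(should_replay(100, 3, revoked))  # copy older than the revoke
print(should_replay(100, 7, revoked))  # copy newer than the revoke
print(should_replay(200, 3, revoked))  # block that was never revoked
```

Only the first call reports that the copy can be skipped; the other two copies still have to be replayed.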
I think it's not only a performance issue but, more importantly, a
correctness issue.
The revoke table is used to prevent an incorrect journal replay that
would cause data corruption:
if a modification to block A has been journalled and committed to the
journal but has not yet been checkpointed,
and in a later transaction block A is freed and reused for data in
no-journalled-data mode, then
without a revoke table recording the release, journal replay
will overwrite the new data,
causing data corruption.
--
Ding Dinghua
If I am thinking correctly, the journal is checkpointed when the
filesystem is unmounted.
This implies the given scenario would be pretty rare.
I.e., the filesystem would first have to be mounted in full-journal mode
and crash prior to checkpoint,
then be remounted in no-journalled-data mode without recovery,
and then be remounted again in full-journalled mode with recovery.
Am I thinking along the right lines?
On Tue, Apr 26, 2011 at 1:47 PM, Niraj Kulkarni
<[email protected]> wrote:
> If I am thinking correctly, the journal is checkpointed when the
> filesystem is unmounted.
> This implies the given scenario would be pretty rare.
>
> I.e., the filesystem would first have to be mounted in full-journal mode
> and crash prior to checkpoint,
> then be remounted in no-journalled-data mode without recovery,
> and then be remounted again in full-journalled mode with recovery.
>
> Am I thinking along the right lines?
nope.
first the block is allocated as metadata and journaled.
then the metadata block is freed and re-allocated as non-journaled data.
this is the use case and it would be very easy to corrupt data if it wasn't
for journal revoke.
Amir.
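The use case above can be played through with a toy model. This is only a sketch under stated assumptions: the journal is a list of dicts, the disk is a dict, and the names `replay`, `honor_revokes`, the block number 42 and the contents strings are all invented for illustration; none of this is the JBD on-disk format or API.

```python
def replay(journal, disk, honor_revokes=True):
    """Replay committed transactions (oldest first) onto `disk`,
    a dict mapping block number -> contents."""
    # Pass 1: remember the latest transaction that revoked each block.
    revoked = {}
    if honor_revokes:
        for txn in journal:
            for block in txn["revokes"]:
                revoked[block] = txn["tid"]
    # Pass 2: write journalled blocks back unless a revoke covers them.
    for txn in journal:
        for block, contents in txn["blocks"].items():
            if block in revoked and txn["tid"] <= revoked[block]:
                continue  # block was freed later; its old copy must not be replayed
            disk[block] = contents
    return disk


journal = [
    # Transaction 1: block 42 is metadata, so its contents are journalled.
    {"tid": 1, "blocks": {42: "old metadata"}, "revokes": []},
    # Transaction 2: block 42 is freed, which logs a revoke record.
    {"tid": 2, "blocks": {}, "revokes": [42]},
]
# Block 42 was then reused for file data that went straight to disk
# (not journalled), and the machine crashed before any checkpoint.
disk_after_crash = {42: "new file data"}

print(replay(journal, dict(disk_after_crash), honor_revokes=False)[42])
print(replay(journal, dict(disk_after_crash), honor_revokes=True)[42])
```

Ignoring the revoke record leaves the stale metadata in block 42 after recovery; honoring it leaves the new file data untouched.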
On Tue, Apr 26, 2011 at 05:23:21PM +0800, Ding Dinghua wrote:
> I think it's not only a performance issue but, more importantly, a
> correctness issue.
> The revoke table is used to prevent an incorrect journal replay that
> would cause data corruption:
> if a modification to block A has been journalled and committed to the
> journal but has not yet been checkpointed,
> and in a later transaction block A is freed and reused for data in
> no-journalled-data mode, then
> without a revoke table recording the release, journal replay
> will overwrite the new data,
> causing data corruption.
Yes, this is correct. It should be covered fairly well in Stephen
Tweedie's "Journaling the Linux ext2fs Filesystem" paper, which you can
find at:
https://ext4.wiki.kernel.org/index.php/Publications
if you'd like more details.
Hope this helps!
- Ted
On 2011-04-26, at 4:47 AM, Niraj Kulkarni wrote:
> If I am thinking correctly, the journal is checkpointed when the
> filesystem is unmounted.
> This implies the given scenario would be pretty rare.
>
> I.e., the filesystem would first have to be mounted in full-journal mode
> and crash prior to checkpoint,
> then be remounted in no-journalled-data mode without recovery,
> and then be remounted again in full-journalled mode with recovery.
It shouldn't be possible to mount the filesystem in no-journal mode
without doing journal recovery. The filesystem sets an INCOMPAT_RECOVER
flag when the journal has any transactions in it, and the journal should
be replayed before the filesystem is finished mounting.
Looking at ext4_fill_super(), the "noload" mount option is used to avoid
loading the journal even if there is a journal (COMPAT_HAS_JOURNAL is set),
but if INCOMPAT_RECOVER is set the filesystem will refuse to mount.
Cheers, Andreas
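The flag checks described above can be summarised as a small decision function. This is a sketch of the description in this message only, not of ext4_fill_super() itself, which handles more cases; the function name, the boolean parameters and the returned strings are hypothetical.

```python
def mount_decision(has_journal, needs_recovery, noload):
    """Model the mount-time checks as described in the message above.

    has_journal    -- COMPAT_HAS_JOURNAL is set
    needs_recovery -- INCOMPAT_RECOVER is set (journal holds transactions)
    noload         -- the "noload" mount option was given
    """
    if not has_journal:
        return "mount without a journal"
    if noload:
        if needs_recovery:
            # Skipping recovery would leave committed changes unapplied.
            return "refuse to mount"
        return "mount without loading the journal"
    if needs_recovery:
        return "replay the journal, then mount"
    return "load the journal and mount"


print(mount_decision(has_journal=True, needs_recovery=True, noload=True))
print(mount_decision(has_journal=True, needs_recovery=True, noload=False))
```

With a journal that needs recovery, "noload" leads to a refused mount, while a normal mount replays the journal first, so the sequence of mounts asked about earlier in the thread cannot skip recovery.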
Oh, "no-journalled-data" was misleading here; I meant mounting in
data=writeback mode or data=ordered mode,
not in data=journal mode.
--
Ding Dinghua
On Tue, Apr 26, 2011 at 3:25 PM, Ted Ts'o <[email protected]> wrote:
> Yes, this is correct. It should be covered fairly well in Stephen
> Tweedie's "Journaling the Linux ext2fs Filesystem" paper, which you can
> find at:
>
> https://ext4.wiki.kernel.org/index.php/Publications
Actually, the original paper has no mention of revoke records.
I went out to look for useful documentation on journal forget/revoke
and came back empty handed as well.
Yes, I tried some other papers too, but to no avail. Anyway, I've figured out
that for my change I don't need any of the journalling-related
facilities, so I am going to bypass them completely.
Niraj
On Fri, Apr 29, 2011 at 10:45:23PM +0300, Amir Goldstein wrote:
>
> Actually, the original paper has no mention of revoke records.
> I went out to look for useful documentation on journal forget/revoke
> and came back empty handed as well.
Stephen Tweedie gave a talk back in 2000 which covered revoke records.
There's a transcript of his talk here:
http://olstrans.sourceforge.net/release/OLS2000-ext3/OLS2000-ext3.html
The link to the audio file of the talk is dead, but I managed to find
a copy of the mp3 file on Google. To make sure it doesn't get lost
I've made a copy of it. The original and the copy can be found at:
http://ftp.gnumonks.org/pub/congress-talks/ols2000/high/cd1/2000-07-20_15-05-22_A_64.mp3
ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/presentations/2000-07-20_15-05-22_A_64.mp3
- Ted
Thanks for that! It was an interesting read, a little piece of history for me.
Clearly, at the time of this talk, revoke code was still on the design table
and committed_data is not mentioned, so either it was not introduced yet
or just wasn't in the scope of the talk.
Amir.
On Mon, May 02, 2011 at 01:42:48PM +0300, Amir Goldstein wrote:
>
> Clearly, at the time of this talk, revoke code was still on the design table
> and committed_data is not mentioned, so either it was not introduced yet
> or just wasn't in the scope of the talk.
No, the revoke code was already implemented when he gave the talk.
"The way we're doing that in EXT3 is that deleting metadata can
cause a revoke record to be written into the journal. And when you
do the replay of the journal, the very first pass of the journal
recovery, we look for all of the revoke records and make sure that
any data that's been revoked is never, ever replayed. And so that
deals with that particular case."
Stephen gave the talk in July, 2000. Journal support first
appeared in e2fsprogs 1.20 (released May 2000), and we fixed a bug in
the revoke handling in e2fsprogs 1.21 (released June 2000). Data
journalling mode (only) came first, and was working by late 1999;
indeed we couldn't do ordered or writeback journaling at all until we
had support for the revoke handling, for reasons which he explained in
his talk.
- Ted