2016-02-25 11:28:27

by OGAWA Hirofumi

[permalink] [raw]
Subject: [PATCH] ext4/jbd2: Fix jbd2_journal_destory() for umount path

On umount path, jbd2_journal_destroy() writes latest transaction ID
(->j_tail_sequence) to be used at next mount.

The bug is that ->j_tail_sequence is not holding latest transaction ID
in some cases. So, at next mount, there is chance to conflict with
remaining (not overwritten yet) transactions.

mount (id=10)
write transaction (id=11)
write transaction (id=12)
umount (id=10) <= the bug doesn't write latest ID

mount (id=10)
write transaction (id=11)
crash

recovery
transaction (id=11)
transaction (id=12) <= valid transaction ID, but old commit
must not replay

Like above, this bug become the cause of recovery failure, or FS
corruption.

So why ->j_tail_sequence doesn't point latest ID?

Because if checkpoint transactions was reclaimed by memory pressure
(i.e. bdev_try_to_free_page()), then ->j_tail_sequence is not updated.
(And another case is, __jbd2_journal_clean_checkpoint_list() is called
with empty transaction.)

So in above cases, ->j_tail_sequence is not pointing latest
transaction ID at umount path. Plus, REQ_FLUSH for checkpoint is not
done too.

So, to fix this problem with minimum changes, this patch updates
->j_tail_sequence, and issue REQ_FLUSH. (With more complex changes,
some optimizations would be possible to avoid unnecessary REQ_FLUSH
for example though.)

BTW,

journal->j_tail_sequence =
++journal->j_transaction_sequence;

Increment of ->j_transaction_sequence seems to be unnecessary, but
ext3 does this.

Signed-off-by: OGAWA Hirofumi <[email protected]>
---

fs/jbd2/journal.c | 16 +++++++++++-----
1 file changed, 11 insertions(+), 5 deletions(-)

diff -puN fs/jbd2/journal.c~ext4-umount-fix fs/jbd2/journal.c
--- linux/fs/jbd2/journal.c~ext4-umount-fix 2016-02-25 03:26:32.710407670 +0900
+++ linux-hirofumi/fs/jbd2/journal.c 2016-02-25 03:26:32.711407673 +0900
@@ -1412,7 +1412,7 @@ out:
* Update a journal's dynamic superblock fields to show that journal is empty.
* Write updated superblock to disk waiting for IO to complete.
*/
-static void jbd2_mark_journal_empty(journal_t *journal)
+static void jbd2_mark_journal_empty(journal_t *journal, int write_op)
{
journal_superblock_t *sb = journal->j_superblock;

@@ -1430,7 +1430,7 @@ static void jbd2_mark_journal_empty(jour
sb->s_start = cpu_to_be32(0);
read_unlock(&journal->j_state_lock);

- jbd2_write_superblock(journal, WRITE_FUA);
+ jbd2_write_superblock(journal, write_op);

/* Log is no longer empty */
write_lock(&journal->j_state_lock);
@@ -1716,7 +1716,13 @@ int jbd2_journal_destroy(journal_t *jour
if (journal->j_sb_buffer) {
if (!is_journal_aborted(journal)) {
mutex_lock(&journal->j_checkpoint_mutex);
- jbd2_mark_journal_empty(journal);
+
+ write_lock(&journal->j_state_lock);
+ journal->j_tail_sequence =
+ ++journal->j_transaction_sequence;
+ write_unlock(&journal->j_state_lock);
+
+ jbd2_mark_journal_empty(journal, WRITE_FLUSH_FUA);
mutex_unlock(&journal->j_checkpoint_mutex);
} else
err = -EIO;
@@ -1975,7 +1981,7 @@ int jbd2_journal_flush(journal_t *journa
* the magic code for a fully-recovered superblock. Any future
* commits of data to the journal will restore the current
* s_start value. */
- jbd2_mark_journal_empty(journal);
+ jbd2_mark_journal_empty(journal, WRITE_FUA);
mutex_unlock(&journal->j_checkpoint_mutex);
write_lock(&journal->j_state_lock);
J_ASSERT(!journal->j_running_transaction);
@@ -2021,7 +2027,7 @@ int jbd2_journal_wipe(journal_t *journal
if (write) {
/* Lock to make assertions happy... */
mutex_lock(&journal->j_checkpoint_mutex);
- jbd2_mark_journal_empty(journal);
+ jbd2_mark_journal_empty(journal, WRITE_FUA);
mutex_unlock(&journal->j_checkpoint_mutex);
}

_

--
OGAWA Hirofumi <[email protected]>


2016-02-25 13:11:19

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH] ext4/jbd2: Fix jbd2_journal_destory() for umount path

Hi OGAWA,

[auto build test WARNING on ext4/dev]
[also build test WARNING on v4.5-rc5 next-20160224]
[if your patch is applied to the wrong git tree, please drop us a note to help improving the system]

url: https://github.com/0day-ci/linux/commits/OGAWA-Hirofumi/ext4-jbd2-Fix-jbd2_journal_destory-for-umount-path/20160225-195750
base: https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
reproduce: make htmldocs

All warnings (new ones prefixed by >>):

include/linux/jbd2.h:434: warning: No description found for parameter 'i_transaction'
include/linux/jbd2.h:434: warning: No description found for parameter 'i_next_transaction'
include/linux/jbd2.h:434: warning: No description found for parameter 'i_list'
include/linux/jbd2.h:434: warning: No description found for parameter 'i_vfs_inode'
include/linux/jbd2.h:434: warning: No description found for parameter 'i_flags'
include/linux/jbd2.h:490: warning: No description found for parameter 'h_rsv_handle'
include/linux/jbd2.h:490: warning: No description found for parameter 'h_reserved'
include/linux/jbd2.h:490: warning: No description found for parameter 'h_type'
include/linux/jbd2.h:490: warning: No description found for parameter 'h_line_no'
include/linux/jbd2.h:490: warning: No description found for parameter 'h_start_jiffies'
include/linux/jbd2.h:490: warning: No description found for parameter 'h_requested_credits'
include/linux/jbd2.h:490: warning: No description found for parameter 'h_lockdep_map'
include/linux/jbd2.h:1033: warning: No description found for parameter 'j_chkpt_bhs[JBD2_NR_BATCH]'
include/linux/jbd2.h:1033: warning: No description found for parameter 'j_devname[BDEVNAME_SIZE+24]'
include/linux/jbd2.h:1033: warning: No description found for parameter 'j_average_commit_time'
include/linux/jbd2.h:1033: warning: No description found for parameter 'j_min_batch_time'
include/linux/jbd2.h:1033: warning: No description found for parameter 'j_max_batch_time'
include/linux/jbd2.h:1033: warning: No description found for parameter 'j_commit_callback'
include/linux/jbd2.h:1033: warning: No description found for parameter 'j_failed_commit'
include/linux/jbd2.h:1033: warning: No description found for parameter 'j_chksum_driver'
include/linux/jbd2.h:1033: warning: No description found for parameter 'j_csum_seed'
include/linux/jbd2.h:1033: warning: Excess struct/union/enum/typedef member 'j_history' description in 'journal_s'
include/linux/jbd2.h:1033: warning: Excess struct/union/enum/typedef member 'j_history_max' description in 'journal_s'
include/linux/jbd2.h:1033: warning: Excess struct/union/enum/typedef member 'j_history_cur' description in 'journal_s'
>> fs/jbd2/journal.c:1438: warning: No description found for parameter 'write_op'
fs/jbd2/transaction.c:429: warning: No description found for parameter 'rsv_blocks'
fs/jbd2/transaction.c:429: warning: No description found for parameter 'gfp_mask'
fs/jbd2/transaction.c:429: warning: No description found for parameter 'type'
fs/jbd2/transaction.c:429: warning: No description found for parameter 'line_no'
fs/jbd2/transaction.c:505: warning: No description found for parameter 'type'
fs/jbd2/transaction.c:505: warning: No description found for parameter 'line_no'
fs/jbd2/transaction.c:635: warning: No description found for parameter 'gfp_mask'

vim +/write_op +1438 fs/jbd2/journal.c

24bcc89c Jan Kara 2012-03-13 1422 WARN_ON(!sb->s_sequence);
f7f4bccb Mingming Cao 2006-10-11 1423 journal->j_flags &= ~JBD2_FLUSHED;
24bcc89c Jan Kara 2012-03-13 1424 write_unlock(&journal->j_state_lock);
6f6a6fda Joseph Qi 2015-06-15 1425
6f6a6fda Joseph Qi 2015-06-15 1426 out:
6f6a6fda Joseph Qi 2015-06-15 1427 return ret;
24bcc89c Jan Kara 2012-03-13 1428 }
24bcc89c Jan Kara 2012-03-13 1429
24bcc89c Jan Kara 2012-03-13 1430 /**
24bcc89c Jan Kara 2012-03-13 1431 * jbd2_mark_journal_empty() - Mark on disk journal as empty.
24bcc89c Jan Kara 2012-03-13 1432 * @journal: The journal to update.
24bcc89c Jan Kara 2012-03-13 1433 *
24bcc89c Jan Kara 2012-03-13 1434 * Update a journal's dynamic superblock fields to show that journal is empty.
24bcc89c Jan Kara 2012-03-13 1435 * Write updated superblock to disk waiting for IO to complete.
24bcc89c Jan Kara 2012-03-13 1436 */
b16feb49 OGAWA Hirofumi 2016-02-25 1437 static void jbd2_mark_journal_empty(journal_t *journal, int write_op)
24bcc89c Jan Kara 2012-03-13 @1438 {
24bcc89c Jan Kara 2012-03-13 1439 journal_superblock_t *sb = journal->j_superblock;
24bcc89c Jan Kara 2012-03-13 1440
a78bb11d Jan Kara 2012-03-13 1441 BUG_ON(!mutex_is_locked(&journal->j_checkpoint_mutex));
24bcc89c Jan Kara 2012-03-13 1442 read_lock(&journal->j_state_lock);
eeecef0a Eric Sandeen 2012-08-18 1443 /* Is it already empty? */
eeecef0a Eric Sandeen 2012-08-18 1444 if (sb->s_start == 0) {
eeecef0a Eric Sandeen 2012-08-18 1445 read_unlock(&journal->j_state_lock);
eeecef0a Eric Sandeen 2012-08-18 1446 return;

:::::: The code at line 1438 was first introduced by commit
:::::: 24bcc89c7e7c64982e6192b4952a0a92379fc341 jbd2: split updating of journal superblock and marking journal empty

:::::: TO: Jan Kara <[email protected]>
:::::: CC: Theodore Ts'o <[email protected]>

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


Attachments:
(No filename) (5.49 kB)
.config.gz (6.08 kB)
Download all attachments

2016-03-10 04:59:47

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH] ext4/jbd2: Fix jbd2_journal_destory() for umount path

On Thu, Feb 25, 2016 at 08:28:27PM +0900, OGAWA Hirofumi wrote:
> On umount path, jbd2_journal_destroy() writes latest transaction ID
> (->j_tail_sequence) to be used at next mount.
>
> The bug is that ->j_tail_sequence is not holding latest transaction ID
> in some cases. So, at next mount, there is chance to conflict with
> remaining (not overwritten yet) transactions.
>
> mount (id=10)
> write transaction (id=11)
> write transaction (id=12)
> umount (id=10) <= the bug doesn't write latest ID
>
> mount (id=10)
> write transaction (id=11)
> crash
>
> recovery
> transaction (id=11)
> transaction (id=12) <= valid transaction ID, but old commit
> must not replay
>
> Like above, this bug become the cause of recovery failure, or FS
> corruption.
>
> So why ->j_tail_sequence doesn't point latest ID?
>
> Because if checkpoint transactions was reclaimed by memory pressure
> (i.e. bdev_try_to_free_page()), then ->j_tail_sequence is not updated.
> (And another case is, __jbd2_journal_clean_checkpoint_list() is called
> with empty transaction.)
>
> So in above cases, ->j_tail_sequence is not pointing latest
> transaction ID at umount path. Plus, REQ_FLUSH for checkpoint is not
> done too.
>
> So, to fix this problem with minimum changes, this patch updates
> ->j_tail_sequence, and issue REQ_FLUSH. (With more complex changes,
> some optimizations would be possible to avoid unnecessary REQ_FLUSH
> for example though.)
>
> BTW,
>
> journal->j_tail_sequence =
> ++journal->j_transaction_sequence;
>
> Increment of ->j_transaction_sequence seems to be unnecessary, but
> ext3 does this.
>
> Signed-off-by: OGAWA Hirofumi <[email protected]>

Thanks, applied.

- Ted