2022-02-12 19:17:48

by Jaegeuk Kim

Subject: [PATCH 1/2] f2fs: fix missing free nid in f2fs_handle_failed_inode

This patch fixes xfstests/generic/475 failure.

[ 293.680694] F2FS-fs (dm-1): May loss orphan inode, run fsck to fix.
[ 293.685358] Buffer I/O error on dev dm-1, logical block 8388592, async page read
[ 293.691527] Buffer I/O error on dev dm-1, logical block 8388592, async page read
[ 293.691764] sh (7615): drop_caches: 3
[ 293.691819] sh (7616): drop_caches: 3
[ 293.694017] Buffer I/O error on dev dm-1, logical block 1, async page read
[ 293.695659] sh (7618): drop_caches: 3
[ 293.696979] sh (7617): drop_caches: 3
[ 293.700290] sh (7623): drop_caches: 3
[ 293.708621] sh (7626): drop_caches: 3
[ 293.711386] sh (7628): drop_caches: 3
[ 293.711825] sh (7627): drop_caches: 3
[ 293.716738] sh (7630): drop_caches: 3
[ 293.719613] sh (7632): drop_caches: 3
[ 293.720971] sh (7633): drop_caches: 3
[ 293.727741] sh (7634): drop_caches: 3
[ 293.730783] sh (7636): drop_caches: 3
[ 293.732681] sh (7635): drop_caches: 3
[ 293.732988] sh (7637): drop_caches: 3
[ 293.738836] sh (7639): drop_caches: 3
[ 293.740568] sh (7641): drop_caches: 3
[ 293.743053] sh (7640): drop_caches: 3
[ 293.821889] ------------[ cut here ]------------
[ 293.824654] kernel BUG at fs/f2fs/node.c:3334!
[ 293.826226] invalid opcode: 0000 [#1] PREEMPT SMP PTI
[ 293.828713] CPU: 0 PID: 7653 Comm: umount Tainted: G OE 5.17.0-rc1-custom #1
[ 293.830946] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 293.832526] RIP: 0010:f2fs_destroy_node_manager+0x33f/0x350 [f2fs]
[ 293.833905] Code: e8 d6 3d f9 f9 48 8b 45 d0 65 48 2b 04 25 28 00 00 00 75 1a 48 81 c4 28 03 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b
[ 293.837783] RSP: 0018:ffffb04ec31e7a20 EFLAGS: 00010202
[ 293.839062] RAX: 0000000000000001 RBX: ffff9df947db2eb8 RCX: 0000000080aa0072
[ 293.840666] RDX: 0000000000000000 RSI: ffffe86c0432a140 RDI: ffffffffc0b72a21
[ 293.842261] RBP: ffffb04ec31e7d70 R08: ffff9df94ca85780 R09: 0000000080aa0072
[ 293.843909] R10: ffff9df94ca85700 R11: ffff9df94e1ccf58 R12: ffff9df947db2e00
[ 293.845594] R13: ffff9df947db2ed0 R14: ffff9df947db2eb8 R15: ffff9df947db2eb8
[ 293.847855] FS: 00007f5a97379800(0000) GS:ffff9dfa77c00000(0000) knlGS:0000000000000000
[ 293.850647] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 293.852940] CR2: 00007f5a97528730 CR3: 000000010bc76005 CR4: 0000000000370ef0
[ 293.854680] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 293.856423] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 293.858380] Call Trace:
[ 293.859302] <TASK>
[ 293.860311] ? ttwu_do_wakeup+0x1c/0x170
[ 293.861800] ? ttwu_do_activate+0x6d/0xb0
[ 293.863057] ? _raw_spin_unlock_irqrestore+0x29/0x40
[ 293.864411] ? try_to_wake_up+0x9d/0x5e0
[ 293.865618] ? debug_smp_processor_id+0x17/0x20
[ 293.866934] ? debug_smp_processor_id+0x17/0x20
[ 293.868223] ? free_unref_page+0xbf/0x120
[ 293.869470] ? __free_slab+0xcb/0x1c0
[ 293.870614] ? preempt_count_add+0x7a/0xc0
[ 293.871811] ? __slab_free+0xa0/0x2d0
[ 293.872918] ? __wake_up_common_lock+0x8a/0xc0
[ 293.874186] ? __slab_free+0xa0/0x2d0
[ 293.875305] ? free_inode_nonrcu+0x20/0x20
[ 293.876466] ? free_inode_nonrcu+0x20/0x20
[ 293.877650] ? debug_smp_processor_id+0x17/0x20
[ 293.878949] ? call_rcu+0x11a/0x240
[ 293.880060] ? f2fs_destroy_stats+0x59/0x60 [f2fs]
[ 293.881437] ? kfree+0x1fe/0x230
[ 293.882674] f2fs_put_super+0x160/0x390 [f2fs]
[ 293.883978] generic_shutdown_super+0x7a/0x120
[ 293.885274] kill_block_super+0x27/0x50
[ 293.886496] kill_f2fs_super+0x7f/0x100 [f2fs]
[ 293.887806] deactivate_locked_super+0x35/0xa0
[ 293.889271] deactivate_super+0x40/0x50
[ 293.890513] cleanup_mnt+0x139/0x190
[ 293.891689] __cleanup_mnt+0x12/0x20
[ 293.892850] task_work_run+0x64/0xa0
[ 293.894035] exit_to_user_mode_prepare+0x1b7/0x1c0
[ 293.895409] syscall_exit_to_user_mode+0x27/0x50
[ 293.896872] do_syscall_64+0x48/0xc0
[ 293.898090] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 293.899517] RIP: 0033:0x7f5a975cd25b

Fixes: 7735730d39d7 ("f2fs: fix to propagate error from __get_meta_page()")
Signed-off-by: Jaegeuk Kim <[email protected]>
---
fs/f2fs/inode.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/fs/f2fs/inode.c b/fs/f2fs/inode.c
index 0ec8e32a00b4..ab8e0c06c78c 100644
--- a/fs/f2fs/inode.c
+++ b/fs/f2fs/inode.c
@@ -885,6 +885,7 @@ void f2fs_handle_failed_inode(struct inode *inode)
err = f2fs_get_node_info(sbi, inode->i_ino, &ni, false);
if (err) {
set_sbi_flag(sbi, SBI_NEED_FSCK);
+ set_inode_flag(inode, FI_FREE_NID);
f2fs_warn(sbi, "May loss orphan inode, run fsck to fix.");
goto out;
}
--
2.35.1.265.g69c8d7142f-goog
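
A toy userspace model may help show why one missing flag on an error path can surface only at unmount. It is not f2fs code: the `toy_*` names are hypothetical, and the assumption that the flag is consumed later, when the inode is evicted, is a reading of the subject line and the trace rather than something the patch states.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the node manager's count of reserved ids. */
struct toy_nid_pool {
	unsigned int reserved;	/* ids handed out and not yet released */
};

/* Hypothetical stand-in for an inode carrying a FI_FREE_NID-like flag. */
struct toy_inode {
	bool free_nid;
};

static void toy_reserve(struct toy_nid_pool *pool)
{
	pool->reserved++;
}

/* Error branch only: with_fix selects whether the flag is set before bailing out. */
static void toy_lookup_failed(struct toy_inode *inode, bool with_fix)
{
	if (with_fix)
		inode->free_nid = true;
}

/* Assumed consumer of the flag: release the id when the inode goes away. */
static void toy_evict(struct toy_nid_pool *pool, const struct toy_inode *inode)
{
	if (inode->free_nid && pool->reserved)
		pool->reserved--;
}

/* Teardown check, analogous in spirit to the BUG hit at umount. */
static bool toy_teardown_clean(const struct toy_nid_pool *pool)
{
	return pool->reserved == 0;
}

/* Run the whole sequence once and report whether teardown would be clean. */
static bool toy_run(bool with_fix)
{
	struct toy_nid_pool pool = { 0 };
	struct toy_inode inode = { false };

	toy_reserve(&pool);
	assert(pool.reserved == 1);
	toy_lookup_failed(&inode, with_fix);
	toy_evict(&pool, &inode);
	return toy_teardown_clean(&pool);
}
```

In this model `toy_run(false)` leaves one id reserved at teardown, while `toy_run(true)` releases it.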


2022-02-14 10:04:53

by Jaegeuk Kim

Subject: [PATCH 2/2] f2fs: avoid an infinite loop in f2fs_sync_dirty_inodes

If one read IO is always failing, we can fall into an infinite loop in
f2fs_sync_dirty_inodes. This happens during xfstests/generic/475.

[ 142.803335] Buffer I/O error on dev dm-1, logical block 8388592, async page read
...
[ 382.887210] submit_bio_noacct+0xdd/0x2a0
[ 382.887213] submit_bio+0x80/0x110
[ 382.887223] __submit_bio+0x4d/0x300 [f2fs]
[ 382.887282] f2fs_submit_page_bio+0x125/0x200 [f2fs]
[ 382.887299] __get_meta_page+0xc9/0x280 [f2fs]
[ 382.887315] f2fs_get_meta_page+0x13/0x20 [f2fs]
[ 382.887331] f2fs_get_node_info+0x317/0x3c0 [f2fs]
[ 382.887350] f2fs_do_write_data_page+0x327/0x6f0 [f2fs]
[ 382.887367] f2fs_write_single_data_page+0x5b7/0x960 [f2fs]
[ 382.887386] f2fs_write_cache_pages+0x302/0x890 [f2fs]
[ 382.887405] ? preempt_count_add+0x7a/0xc0
[ 382.887408] f2fs_write_data_pages+0xfd/0x320 [f2fs]
[ 382.887425] ? _raw_spin_unlock+0x1a/0x30
[ 382.887428] do_writepages+0xd3/0x1d0
[ 382.887432] filemap_fdatawrite_wbc+0x69/0x90
[ 382.887434] filemap_fdatawrite+0x50/0x70
[ 382.887437] f2fs_sync_dirty_inodes+0xa4/0x270 [f2fs]
[ 382.887453] f2fs_write_checkpoint+0x189/0x1640 [f2fs]
[ 382.887469] ? schedule_timeout+0x114/0x150
[ 382.887471] ? ttwu_do_activate+0x6d/0xb0
[ 382.887473] ? preempt_count_add+0x7a/0xc0
[ 382.887476] kill_f2fs_super+0xca/0x100 [f2fs]
[ 382.887491] deactivate_locked_super+0x35/0xa0
[ 382.887494] deactivate_super+0x40/0x50
[ 382.887497] cleanup_mnt+0x139/0x190
[ 382.887499] __cleanup_mnt+0x12/0x20
[ 382.887501] task_work_run+0x64/0xa0
[ 382.887505] exit_to_user_mode_prepare+0x1b7/0x1c0
[ 382.887508] syscall_exit_to_user_mode+0x27/0x50
[ 382.887510] do_syscall_64+0x48/0xc0
[ 382.887513] entry_SYSCALL_64_after_hwframe+0x44/0xae

Signed-off-by: Jaegeuk Kim <[email protected]>
---
fs/f2fs/checkpoint.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 203a1577942d..756abfdf3628 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1059,13 +1059,13 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)
struct inode *inode;
struct f2fs_inode_info *fi;
bool is_dir = (type == DIR_INODE);
- unsigned long ino = 0;
+ unsigned long ino = 0, retry_count = DEFAULT_RETRY_IO_COUNT;

trace_f2fs_sync_dirty_inodes_enter(sbi->sb, is_dir,
get_pages(sbi, is_dir ?
F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
retry:
- if (unlikely(f2fs_cp_error(sbi))) {
+ if (unlikely(f2fs_cp_error(sbi) || !retry_count)) {
trace_f2fs_sync_dirty_inodes_exit(sbi->sb, is_dir,
get_pages(sbi, is_dir ?
F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
@@ -1096,10 +1096,12 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)

iput(inode);
/* We need to give cpu to another writers. */
- if (ino == cur_ino)
+ if (ino == cur_ino) {
+ retry_count--;
cond_resched();
- else
+ } else {
ino = cur_ino;
+ }
} else {
/*
* We should submit bio, since it exists several
--
2.35.1.265.g69c8d7142f-goog

2022-02-15 09:18:53

by Jaegeuk Kim

Subject: Re: [PATCH 2/2 v2] f2fs: avoid an infinite loop in f2fs_sync_dirty_inodes

If one read IO is always failing, we can fall into an infinite loop in
f2fs_sync_dirty_inodes. This happens during xfstests/generic/475.

[ 142.803335] Buffer I/O error on dev dm-1, logical block 8388592, async page read
...
[ 382.887210] submit_bio_noacct+0xdd/0x2a0
[ 382.887213] submit_bio+0x80/0x110
[ 382.887223] __submit_bio+0x4d/0x300 [f2fs]
[ 382.887282] f2fs_submit_page_bio+0x125/0x200 [f2fs]
[ 382.887299] __get_meta_page+0xc9/0x280 [f2fs]
[ 382.887315] f2fs_get_meta_page+0x13/0x20 [f2fs]
[ 382.887331] f2fs_get_node_info+0x317/0x3c0 [f2fs]
[ 382.887350] f2fs_do_write_data_page+0x327/0x6f0 [f2fs]
[ 382.887367] f2fs_write_single_data_page+0x5b7/0x960 [f2fs]
[ 382.887386] f2fs_write_cache_pages+0x302/0x890 [f2fs]
[ 382.887405] ? preempt_count_add+0x7a/0xc0
[ 382.887408] f2fs_write_data_pages+0xfd/0x320 [f2fs]
[ 382.887425] ? _raw_spin_unlock+0x1a/0x30
[ 382.887428] do_writepages+0xd3/0x1d0
[ 382.887432] filemap_fdatawrite_wbc+0x69/0x90
[ 382.887434] filemap_fdatawrite+0x50/0x70
[ 382.887437] f2fs_sync_dirty_inodes+0xa4/0x270 [f2fs]
[ 382.887453] f2fs_write_checkpoint+0x189/0x1640 [f2fs]
[ 382.887469] ? schedule_timeout+0x114/0x150
[ 382.887471] ? ttwu_do_activate+0x6d/0xb0
[ 382.887473] ? preempt_count_add+0x7a/0xc0
[ 382.887476] kill_f2fs_super+0xca/0x100 [f2fs]
[ 382.887491] deactivate_locked_super+0x35/0xa0
[ 382.887494] deactivate_super+0x40/0x50
[ 382.887497] cleanup_mnt+0x139/0x190
[ 382.887499] __cleanup_mnt+0x12/0x20
[ 382.887501] task_work_run+0x64/0xa0
[ 382.887505] exit_to_user_mode_prepare+0x1b7/0x1c0
[ 382.887508] syscall_exit_to_user_mode+0x27/0x50
[ 382.887510] do_syscall_64+0x48/0xc0
[ 382.887513] entry_SYSCALL_64_after_hwframe+0x44/0xae

Signed-off-by: Jaegeuk Kim <[email protected]>
---
Change log from v1:
- fix a regression to report EIO too early

fs/f2fs/checkpoint.c | 13 ++++++++-----
fs/f2fs/f2fs.h | 3 +++
2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
index 203a1577942d..56c81c68ef71 100644
--- a/fs/f2fs/checkpoint.c
+++ b/fs/f2fs/checkpoint.c
@@ -1059,13 +1059,13 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)
struct inode *inode;
struct f2fs_inode_info *fi;
bool is_dir = (type == DIR_INODE);
- unsigned long ino = 0;
+ unsigned long ino = 0, retry_count = DEFAULT_RETRY_SYNC_DIR_COUNT;

trace_f2fs_sync_dirty_inodes_enter(sbi->sb, is_dir,
get_pages(sbi, is_dir ?
F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
retry:
- if (unlikely(f2fs_cp_error(sbi))) {
+ if (unlikely(f2fs_cp_error(sbi) || (is_dir && !retry_count))) {
trace_f2fs_sync_dirty_inodes_exit(sbi->sb, is_dir,
get_pages(sbi, is_dir ?
F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
@@ -1096,10 +1096,13 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)

iput(inode);
/* We need to give cpu to another writers. */
- if (ino == cur_ino)
- cond_resched();
- else
+ if (ino == cur_ino) {
+ retry_count--;
+ io_schedule_timeout(DEFAULT_IO_TIMEOUT);
+ } else {
+ retry_count = DEFAULT_RETRY_SYNC_DIR_COUNT;
ino = cur_ino;
+ }
} else {
/*
* We should submit bio, since it exists several
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index c9515c3c54fd..f40ef7b61965 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -577,6 +577,9 @@ enum {
/* maximum retry quota flush count */
#define DEFAULT_RETRY_QUOTA_FLUSH_COUNT 8

+/* maximum retry sync dirty inodes */
+#define DEFAULT_RETRY_SYNC_DIR_COUNT 3000
+
#define F2FS_LINK_MAX 0xffffffff /* maximum link count per file */

#define MAX_DIR_RA_PAGES 4 /* maximum ra pages of dir */
--
2.35.1.265.g69c8d7142f-goog
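
The retry accounting in the v2 diff can be modelled in userspace as a rough sketch. The struct and function names here are hypothetical, `RETRY_BUDGET` simply mirrors DEFAULT_RETRY_SYNC_DIR_COUNT, the sleep between passes is left out, and the decrement is guarded against wrap-around, which the patch itself does not do.

```c
#include <stdbool.h>

#define RETRY_BUDGET 3000UL	/* mirrors DEFAULT_RETRY_SYNC_DIR_COUNT */

struct retry_state {
	unsigned long last_ino;		/* inode seen on the previous pass */
	unsigned long retry_count;	/* passes left for that same inode */
};

static void retry_state_init(struct retry_state *st)
{
	st->last_ino = 0;
	st->retry_count = RETRY_BUDGET;
}

/* Record one pass: spend budget on a repeat, refill it on a new inode. */
static void note_pass(struct retry_state *st, unsigned long cur_ino)
{
	if (st->last_ino == cur_ino) {
		if (st->retry_count)
			st->retry_count--;
	} else {
		st->retry_count = RETRY_BUDGET;
		st->last_ino = cur_ino;
	}
}

/* v2 only gives up for directory inodes; file inodes keep retrying. */
static bool should_give_up(const struct retry_state *st, bool is_dir)
{
	return is_dir && st->retry_count == 0;
}

/* Count passes over one permanently failing inode, stopping at cap. */
static unsigned long passes_until_exit(bool is_dir, unsigned long ino,
				       unsigned long cap)
{
	struct retry_state st;
	unsigned long passes = 0;

	retry_state_init(&st);
	while (passes < cap && !should_give_up(&st, is_dir)) {
		note_pass(&st, ino);
		passes++;
	}
	return passes;
}
```

For a nonzero inode number the first pass only records the inode, so a stuck directory inode is visited 3001 times in this model before the loop exits; a stuck file inode runs until the caller's cap.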

2022-02-24 08:56:14

by Chao Yu

Subject: Re: [f2fs-dev] [PATCH 1/2] f2fs: fix missing free nid in f2fs_handle_failed_inode

On 2022/2/12 22:20, Jaegeuk Kim wrote:
> This patch fixes xfstests/generic/475 failure.
>
> [ 293.680694] F2FS-fs (dm-1): May loss orphan inode, run fsck to fix.
> [ 293.685358] Buffer I/O error on dev dm-1, logical block 8388592, async page read
> [ 293.691527] Buffer I/O error on dev dm-1, logical block 8388592, async page read
> [ 293.691764] sh (7615): drop_caches: 3
> [ 293.691819] sh (7616): drop_caches: 3
> [ 293.694017] Buffer I/O error on dev dm-1, logical block 1, async page read
> [ 293.695659] sh (7618): drop_caches: 3
> [ 293.696979] sh (7617): drop_caches: 3
> [ 293.700290] sh (7623): drop_caches: 3
> [ 293.708621] sh (7626): drop_caches: 3
> [ 293.711386] sh (7628): drop_caches: 3
> [ 293.711825] sh (7627): drop_caches: 3
> [ 293.716738] sh (7630): drop_caches: 3
> [ 293.719613] sh (7632): drop_caches: 3
> [ 293.720971] sh (7633): drop_caches: 3
> [ 293.727741] sh (7634): drop_caches: 3
> [ 293.730783] sh (7636): drop_caches: 3
> [ 293.732681] sh (7635): drop_caches: 3
> [ 293.732988] sh (7637): drop_caches: 3
> [ 293.738836] sh (7639): drop_caches: 3
> [ 293.740568] sh (7641): drop_caches: 3
> [ 293.743053] sh (7640): drop_caches: 3
> [ 293.821889] ------------[ cut here ]------------
> [ 293.824654] kernel BUG at fs/f2fs/node.c:3334!
> [ 293.826226] invalid opcode: 0000 [#1] PREEMPT SMP PTI
> [ 293.828713] CPU: 0 PID: 7653 Comm: umount Tainted: G OE 5.17.0-rc1-custom #1
> [ 293.830946] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> [ 293.832526] RIP: 0010:f2fs_destroy_node_manager+0x33f/0x350 [f2fs]
> [ 293.833905] Code: e8 d6 3d f9 f9 48 8b 45 d0 65 48 2b 04 25 28 00 00 00 75 1a 48 81 c4 28 03 00 00 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b
> [ 293.837783] RSP: 0018:ffffb04ec31e7a20 EFLAGS: 00010202
> [ 293.839062] RAX: 0000000000000001 RBX: ffff9df947db2eb8 RCX: 0000000080aa0072
> [ 293.840666] RDX: 0000000000000000 RSI: ffffe86c0432a140 RDI: ffffffffc0b72a21
> [ 293.842261] RBP: ffffb04ec31e7d70 R08: ffff9df94ca85780 R09: 0000000080aa0072
> [ 293.843909] R10: ffff9df94ca85700 R11: ffff9df94e1ccf58 R12: ffff9df947db2e00
> [ 293.845594] R13: ffff9df947db2ed0 R14: ffff9df947db2eb8 R15: ffff9df947db2eb8
> [ 293.847855] FS: 00007f5a97379800(0000) GS:ffff9dfa77c00000(0000) knlGS:0000000000000000
> [ 293.850647] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 293.852940] CR2: 00007f5a97528730 CR3: 000000010bc76005 CR4: 0000000000370ef0
> [ 293.854680] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 293.856423] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 293.858380] Call Trace:
> [ 293.859302] <TASK>
> [ 293.860311] ? ttwu_do_wakeup+0x1c/0x170
> [ 293.861800] ? ttwu_do_activate+0x6d/0xb0
> [ 293.863057] ? _raw_spin_unlock_irqrestore+0x29/0x40
> [ 293.864411] ? try_to_wake_up+0x9d/0x5e0
> [ 293.865618] ? debug_smp_processor_id+0x17/0x20
> [ 293.866934] ? debug_smp_processor_id+0x17/0x20
> [ 293.868223] ? free_unref_page+0xbf/0x120
> [ 293.869470] ? __free_slab+0xcb/0x1c0
> [ 293.870614] ? preempt_count_add+0x7a/0xc0
> [ 293.871811] ? __slab_free+0xa0/0x2d0
> [ 293.872918] ? __wake_up_common_lock+0x8a/0xc0
> [ 293.874186] ? __slab_free+0xa0/0x2d0
> [ 293.875305] ? free_inode_nonrcu+0x20/0x20
> [ 293.876466] ? free_inode_nonrcu+0x20/0x20
> [ 293.877650] ? debug_smp_processor_id+0x17/0x20
> [ 293.878949] ? call_rcu+0x11a/0x240
> [ 293.880060] ? f2fs_destroy_stats+0x59/0x60 [f2fs]
> [ 293.881437] ? kfree+0x1fe/0x230
> [ 293.882674] f2fs_put_super+0x160/0x390 [f2fs]
> [ 293.883978] generic_shutdown_super+0x7a/0x120
> [ 293.885274] kill_block_super+0x27/0x50
> [ 293.886496] kill_f2fs_super+0x7f/0x100 [f2fs]
> [ 293.887806] deactivate_locked_super+0x35/0xa0
> [ 293.889271] deactivate_super+0x40/0x50
> [ 293.890513] cleanup_mnt+0x139/0x190
> [ 293.891689] __cleanup_mnt+0x12/0x20
> [ 293.892850] task_work_run+0x64/0xa0
> [ 293.894035] exit_to_user_mode_prepare+0x1b7/0x1c0
> [ 293.895409] syscall_exit_to_user_mode+0x27/0x50
> [ 293.896872] do_syscall_64+0x48/0xc0
> [ 293.898090] entry_SYSCALL_64_after_hwframe+0x44/0xae
> [ 293.899517] RIP: 0033:0x7f5a975cd25b
>
> Fixes: 7735730d39d7 ("f2fs: fix to propagate error from __get_meta_page()")
> Signed-off-by: Jaegeuk Kim <[email protected]>

Reviewed-by: Chao Yu <[email protected]>

Thanks,

2022-02-25 06:15:11

by Chao Yu

Subject: Re: [f2fs-dev] [PATCH 2/2 v2] f2fs: avoid an infinite loop in f2fs_sync_dirty_inodes

On 2022/2/15 7:27, Jaegeuk Kim wrote:
> If one read IO is always failing, we can fall into an infinite loop in
> f2fs_sync_dirty_inodes. This happens during xfstests/generic/475.
>
> [ 142.803335] Buffer I/O error on dev dm-1, logical block 8388592, async page read
> ...
> [ 382.887210] submit_bio_noacct+0xdd/0x2a0
> [ 382.887213] submit_bio+0x80/0x110
> [ 382.887223] __submit_bio+0x4d/0x300 [f2fs]
> [ 382.887282] f2fs_submit_page_bio+0x125/0x200 [f2fs]
> [ 382.887299] __get_meta_page+0xc9/0x280 [f2fs]
> [ 382.887315] f2fs_get_meta_page+0x13/0x20 [f2fs]
> [ 382.887331] f2fs_get_node_info+0x317/0x3c0 [f2fs]
> [ 382.887350] f2fs_do_write_data_page+0x327/0x6f0 [f2fs]
> [ 382.887367] f2fs_write_single_data_page+0x5b7/0x960 [f2fs]
> [ 382.887386] f2fs_write_cache_pages+0x302/0x890 [f2fs]
> [ 382.887405] ? preempt_count_add+0x7a/0xc0
> [ 382.887408] f2fs_write_data_pages+0xfd/0x320 [f2fs]
> [ 382.887425] ? _raw_spin_unlock+0x1a/0x30
> [ 382.887428] do_writepages+0xd3/0x1d0
> [ 382.887432] filemap_fdatawrite_wbc+0x69/0x90
> [ 382.887434] filemap_fdatawrite+0x50/0x70
> [ 382.887437] f2fs_sync_dirty_inodes+0xa4/0x270 [f2fs]
> [ 382.887453] f2fs_write_checkpoint+0x189/0x1640 [f2fs]
> [ 382.887469] ? schedule_timeout+0x114/0x150
> [ 382.887471] ? ttwu_do_activate+0x6d/0xb0
> [ 382.887473] ? preempt_count_add+0x7a/0xc0
> [ 382.887476] kill_f2fs_super+0xca/0x100 [f2fs]
> [ 382.887491] deactivate_locked_super+0x35/0xa0
> [ 382.887494] deactivate_super+0x40/0x50
> [ 382.887497] cleanup_mnt+0x139/0x190
> [ 382.887499] __cleanup_mnt+0x12/0x20
> [ 382.887501] task_work_run+0x64/0xa0
> [ 382.887505] exit_to_user_mode_prepare+0x1b7/0x1c0
> [ 382.887508] syscall_exit_to_user_mode+0x27/0x50
> [ 382.887510] do_syscall_64+0x48/0xc0
> [ 382.887513] entry_SYSCALL_64_after_hwframe+0x44/0xae
>
> Signed-off-by: Jaegeuk Kim <[email protected]>
> ---
> Change log from v1:
> - fix a regression to report EIO too early
>
> fs/f2fs/checkpoint.c | 13 ++++++++-----
> fs/f2fs/f2fs.h | 3 +++
> 2 files changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> index 203a1577942d..56c81c68ef71 100644
> --- a/fs/f2fs/checkpoint.c
> +++ b/fs/f2fs/checkpoint.c
> @@ -1059,13 +1059,13 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)
> struct inode *inode;
> struct f2fs_inode_info *fi;
> bool is_dir = (type == DIR_INODE);
> - unsigned long ino = 0;
> + unsigned long ino = 0, retry_count = DEFAULT_RETRY_SYNC_DIR_COUNT;
>
> trace_f2fs_sync_dirty_inodes_enter(sbi->sb, is_dir,
> get_pages(sbi, is_dir ?
> F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
> retry:
> - if (unlikely(f2fs_cp_error(sbi))) {
> + if (unlikely(f2fs_cp_error(sbi) || (is_dir && !retry_count))) {
> trace_f2fs_sync_dirty_inodes_exit(sbi->sb, is_dir,
> get_pages(sbi, is_dir ?
> F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
> @@ -1096,10 +1096,13 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)
>
> iput(inode);
> /* We need to give cpu to another writers. */
> - if (ino == cur_ino)
> - cond_resched();
> - else
> + if (ino == cur_ino) {
> + retry_count--;
> + io_schedule_timeout(DEFAULT_IO_TIMEOUT);
> + } else {
> + retry_count = DEFAULT_RETRY_SYNC_DIR_COUNT;
> ino = cur_ino;
> + }
> } else {
> /*
> * We should submit bio, since it exists several
> diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> index c9515c3c54fd..f40ef7b61965 100644
> --- a/fs/f2fs/f2fs.h
> +++ b/fs/f2fs/f2fs.h
> @@ -577,6 +577,9 @@ enum {
> /* maximum retry quota flush count */
> #define DEFAULT_RETRY_QUOTA_FLUSH_COUNT 8
>
> +/* maximum retry sync dirty inodes */
> +#define DEFAULT_RETRY_SYNC_DIR_COUNT 3000

3000 * 20ms/round = 60sec

How about just trying 5 or 10 sec?

Thanks,

> +
> #define F2FS_LINK_MAX 0xffffffff /* maximum link count per file */
>
> #define MAX_DIR_RA_PAGES 4 /* maximum ra pages of dir */
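
The arithmetic in the reply above can be checked with a small sketch. The helper names are hypothetical, and the 20 ms per round is taken from the reply rather than from the diff.

```c
#include <assert.h>
#include <stdint.h>

/* Worst-case wait in milliseconds when each of max_rounds retries
 * sleeps for per_round_ms. */
static inline uint64_t retry_budget_ms(uint64_t max_rounds,
				       uint64_t per_round_ms)
{
	return max_rounds * per_round_ms;
}

/* Retry count needed to cap the wait at target_ms, rounded up. */
static inline uint64_t rounds_for_budget(uint64_t target_ms,
					 uint64_t per_round_ms)
{
	assert(per_round_ms > 0);
	return (target_ms + per_round_ms - 1) / per_round_ms;
}
```

At 20 ms per round, `retry_budget_ms(3000, 20)` is 60000 ms, and the suggested 5 or 10 seconds would correspond to 250 or 500 rounds.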

2022-02-26 02:29:17

by Jaegeuk Kim

Subject: Re: [f2fs-dev] [PATCH 2/2 v2] f2fs: avoid an infinite loop in f2fs_sync_dirty_inodes

On 02/25, Chao Yu wrote:
> On 2022/2/15 7:27, Jaegeuk Kim wrote:
> > If one read IO is always failing, we can fall into an infinite loop in
> > f2fs_sync_dirty_inodes. This happens during xfstests/generic/475.
> >
> > [ 142.803335] Buffer I/O error on dev dm-1, logical block 8388592, async page read
> > ...
> > [ 382.887210] submit_bio_noacct+0xdd/0x2a0
> > [ 382.887213] submit_bio+0x80/0x110
> > [ 382.887223] __submit_bio+0x4d/0x300 [f2fs]
> > [ 382.887282] f2fs_submit_page_bio+0x125/0x200 [f2fs]
> > [ 382.887299] __get_meta_page+0xc9/0x280 [f2fs]
> > [ 382.887315] f2fs_get_meta_page+0x13/0x20 [f2fs]
> > [ 382.887331] f2fs_get_node_info+0x317/0x3c0 [f2fs]
> > [ 382.887350] f2fs_do_write_data_page+0x327/0x6f0 [f2fs]
> > [ 382.887367] f2fs_write_single_data_page+0x5b7/0x960 [f2fs]
> > [ 382.887386] f2fs_write_cache_pages+0x302/0x890 [f2fs]
> > [ 382.887405] ? preempt_count_add+0x7a/0xc0
> > [ 382.887408] f2fs_write_data_pages+0xfd/0x320 [f2fs]
> > [ 382.887425] ? _raw_spin_unlock+0x1a/0x30
> > [ 382.887428] do_writepages+0xd3/0x1d0
> > [ 382.887432] filemap_fdatawrite_wbc+0x69/0x90
> > [ 382.887434] filemap_fdatawrite+0x50/0x70
> > [ 382.887437] f2fs_sync_dirty_inodes+0xa4/0x270 [f2fs]
> > [ 382.887453] f2fs_write_checkpoint+0x189/0x1640 [f2fs]
> > [ 382.887469] ? schedule_timeout+0x114/0x150
> > [ 382.887471] ? ttwu_do_activate+0x6d/0xb0
> > [ 382.887473] ? preempt_count_add+0x7a/0xc0
> > [ 382.887476] kill_f2fs_super+0xca/0x100 [f2fs]
> > [ 382.887491] deactivate_locked_super+0x35/0xa0
> > [ 382.887494] deactivate_super+0x40/0x50
> > [ 382.887497] cleanup_mnt+0x139/0x190
> > [ 382.887499] __cleanup_mnt+0x12/0x20
> > [ 382.887501] task_work_run+0x64/0xa0
> > [ 382.887505] exit_to_user_mode_prepare+0x1b7/0x1c0
> > [ 382.887508] syscall_exit_to_user_mode+0x27/0x50
> > [ 382.887510] do_syscall_64+0x48/0xc0
> > [ 382.887513] entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > Signed-off-by: Jaegeuk Kim <[email protected]>
> > ---
> > Change log from v1:
> > - fix a regression to report EIO too early
> >
> > fs/f2fs/checkpoint.c | 13 ++++++++-----
> > fs/f2fs/f2fs.h | 3 +++
> > 2 files changed, 11 insertions(+), 5 deletions(-)
> >
> > diff --git a/fs/f2fs/checkpoint.c b/fs/f2fs/checkpoint.c
> > index 203a1577942d..56c81c68ef71 100644
> > --- a/fs/f2fs/checkpoint.c
> > +++ b/fs/f2fs/checkpoint.c
> > @@ -1059,13 +1059,13 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)
> > struct inode *inode;
> > struct f2fs_inode_info *fi;
> > bool is_dir = (type == DIR_INODE);
> > - unsigned long ino = 0;
> > + unsigned long ino = 0, retry_count = DEFAULT_RETRY_SYNC_DIR_COUNT;
> > trace_f2fs_sync_dirty_inodes_enter(sbi->sb, is_dir,
> > get_pages(sbi, is_dir ?
> > F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
> > retry:
> > - if (unlikely(f2fs_cp_error(sbi))) {
> > + if (unlikely(f2fs_cp_error(sbi) || (is_dir && !retry_count))) {
> > trace_f2fs_sync_dirty_inodes_exit(sbi->sb, is_dir,
> > get_pages(sbi, is_dir ?
> > F2FS_DIRTY_DENTS : F2FS_DIRTY_DATA));
> > @@ -1096,10 +1096,13 @@ int f2fs_sync_dirty_inodes(struct f2fs_sb_info *sbi, enum inode_type type)
> > iput(inode);
> > /* We need to give cpu to another writers. */
> > - if (ino == cur_ino)
> > - cond_resched();
> > - else
> > + if (ino == cur_ino) {
> > + retry_count--;
> > + io_schedule_timeout(DEFAULT_IO_TIMEOUT);
> > + } else {
> > + retry_count = DEFAULT_RETRY_SYNC_DIR_COUNT;
> > ino = cur_ino;
> > + }
> > } else {
> > /*
> > * We should submit bio, since it exists several
> > diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
> > index c9515c3c54fd..f40ef7b61965 100644
> > --- a/fs/f2fs/f2fs.h
> > +++ b/fs/f2fs/f2fs.h
> > @@ -577,6 +577,9 @@ enum {
> > /* maximum retry quota flush count */
> > #define DEFAULT_RETRY_QUOTA_FLUSH_COUNT 8
> > +/* maximum retry sync dirty inodes */
> > +#define DEFAULT_RETRY_SYNC_DIR_COUNT 3000
>
> 3000 * 20ms/round = 60sec
>
> How about just trying 5 or 10 sec?

It seems this causes another EIO issue in another test. Let me drop this for now.

>
> Thanks,
>
> > +
> > #define F2FS_LINK_MAX 0xffffffff /* maximum link count per file */
> > #define MAX_DIR_RA_PAGES 4 /* maximum ra pages of dir */