2024-02-07 02:38:42

by Zhiguo Niu

[permalink] [raw]
Subject: [PATCH v2 0/4] f2fs: fix panic issue in small capacity device

A panic occurred during a reboot test on a small-capacity device,
as follows:
1. The device size is 64MB, the main area has 24 segments, and
CONFIG_F2FS_CHECK_FS is not enabled.
2. No free segments are left in free_segmap_info, and then another
write request causes get_new_segment to return an out-of-bounds
segment with segno 24.
3. The panic happens in update_sit_entry because it accesses an invalid
bitmap pointer.

More detail is given in the following patches.
The patches are split because the modifications are relatively
independent and are more readable this way.

---
Changes in v2: stop checkpoint when getting an out-of-bounds segment
---

Zhiguo Niu (4):
  f2fs: correct counting methods of free_segments in __set_inuse
  f2fs: fix panic issue in update_sit_entry
  f2fs: enhance judgment conditions of GET_SEGNO
  f2fs: stop checkpoint when get a out-of-bounds segment

 fs/f2fs/file.c          |  7 ++++++-
 fs/f2fs/segment.c       | 21 ++++++++++++++++-----
 fs/f2fs/segment.h       |  7 ++++---
 include/linux/f2fs_fs.h |  1 +
 4 files changed, 27 insertions(+), 9 deletions(-)

--
1.9.1
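
[Editor's sketch, not part of the posted series] The segno 24 in the
cover letter above is what a "find next zero bit" search over a
24-entry bitmap yields when every bit is set: such searches
conventionally return the bitmap size when no clear bit exists. The
userspace model below assumes that convention; next_zero_bit_sketch()
and pick_free_segment() are hypothetical names used only for
illustration, and a uint32_t stands in for the free segment bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAIN_SEGS 24u   /* main-area segment count from the report */

/* Simplified stand-in for a find_next_zero_bit()-style search. */
static unsigned int next_zero_bit_sketch(uint32_t map, unsigned int size,
                                         unsigned int start)
{
    for (unsigned int i = start; i < size; i++)
        if (!(map & (UINT32_C(1) << i)))
            return i;
    return size;    /* no clear bit: the result equals the bitmap size */
}

/* Hypothetical caller: returns 0 and stores a segno, or -1 if none is free. */
static int pick_free_segment(uint32_t free_segmap, unsigned int *segno)
{
    unsigned int n;

    assert(segno != NULL);
    n = next_zero_bit_sketch(free_segmap, MAIN_SEGS, 0);
    if (n >= MAIN_SEGS)
        return -1;  /* must be handled as an error, never used as a segno */
    *segno = n;
    return 0;
}
```

With all 24 bits set, the search returns 24, one past the last valid
segno (23). If the caller only warns at that point and carries on, the
value is later used as an index, which matches the failure described
above.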



2024-02-07 02:39:05

by Zhiguo Niu

[permalink] [raw]
Subject: [PATCH v2 2/4] f2fs: fix panic issue in update_sit_entry

When CONFIG_F2FS_CHECK_FS is not enabled, f2fs_bug_on only prints a
warning, so get_new_segment may return an out-of-bounds segment when
there are no free segments. A block is then allocated from this invalid
segment, and update_sit_entry accesses an invalid bitmap address,
causing a system panic, as in the call stack below.

f2fs_allocate_data_block gets block address 0x4000, and the
partition size is 64MB:

[ 13.401997] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
[ 13.402003] Mem abort info:
[ 13.402006] ESR = 0x96000005
[ 13.402009] EC = 0x25: DABT (current EL), IL = 32 bits
[ 13.402015] SET = 0, FnV = 0
[ 13.402018] EA = 0, S1PTW = 0
[ 13.402021] FSC = 0x05: level 1 translation fault
[ 13.402025] Data abort info:
[ 13.402027] ISV = 0, ISS = 0x00000005
[ 13.402030] CM = 0, WnR = 0
[ 13.402034] user pgtable: 4k pages, 39-bit VAs, pgdp=00000001066ab000
[ 13.402038] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
[ 13.402052] Internal error: Oops: 96000005 [#1] PREEMPT SMP
[ 13.489854] pc : update_sit_entry+0x128/0x420
[ 13.490497] lr : f2fs_allocate_data_block+0x6b0/0xc2c
[ 13.491218] sp : ffffffc00e023440
[ 13.501530] Call trace:
[ 13.501930] update_sit_entry+0x128/0x420
[ 13.502523] f2fs_allocate_data_block+0x6b0/0xc2c
[ 13.503203] do_write_page+0xf0/0x1d4
[ 13.503752] f2fs_outplace_write_data+0x68/0xfc
[ 13.504408] f2fs_do_write_data_page+0x3a8/0x65c
[ 13.505076] move_data_page+0x294/0x7a8
[ 13.505647] gc_data_segment+0x4b8/0x800
[ 13.506229] do_garbage_collect+0x354/0x674
[ 13.506843] f2fs_gc+0x280/0x68c
[ 13.507340] f2fs_balance_fs+0x104/0x144
[ 13.507921] f2fs_create+0x310/0x3d8
[ 13.508458] path_openat+0x53c/0xc28
[ 13.508997] do_filp_open+0xbc/0x16c
[ 13.509535] do_sys_openat2+0xa0/0x2a0

So a sanity check should be added in update_sit_entry.
Also remove a redundant check in the caller.

Signed-off-by: Zhiguo Niu <[email protected]>
---
fs/f2fs/segment.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/fs/f2fs/segment.c b/fs/f2fs/segment.c
index ad6511f..f373ff7 100644
--- a/fs/f2fs/segment.c
+++ b/fs/f2fs/segment.c
@@ -2399,6 +2399,8 @@ static void update_sit_entry(struct f2fs_sb_info *sbi, block_t blkaddr, int del)
 #endif

 	segno = GET_SEGNO(sbi, blkaddr);
+	if (segno == NULL_SEGNO)
+		return;

 	se = get_seg_entry(sbi, segno);
 	new_vblocks = se->valid_blocks + del;
@@ -3464,8 +3466,7 @@ void f2fs_allocate_data_block(struct f2fs_sb_info *sbi, struct page *page,
 	 * since SSR needs latest valid block information.
 	 */
 	update_sit_entry(sbi, *new_blkaddr, 1);
-	if (GET_SEGNO(sbi, old_blkaddr) != NULL_SEGNO)
-		update_sit_entry(sbi, old_blkaddr, -1);
+	update_sit_entry(sbi, old_blkaddr, -1);

 	/*
 	 * If the current segment is full, flush it out and replace it with a
--
1.9.1
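
[Editor's sketch, not part of the patch] The diff above moves the
NULL_SEGNO check from the caller into update_sit_entry itself, so every
caller gets the guard. A minimal userspace model of that pattern
follows. The layout constants are assumptions: 512 blocks per segment
(2MB segments of 4KB blocks) and a hypothetical main-area start of
0x1000, chosen so that 24 segments end at block 0x4000 as in the
report. get_segno_sketch() and update_sit_entry_sketch() are
illustrative names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define NULL_SEGNO   ((unsigned int)~0u)
#define NULL_ADDR    0u        /* "no block" marker */
#define BLKS_PER_SEG 512u      /* assumed: 2MB segment / 4KB block */
#define MAIN_SEGS    24u       /* from the report */
#define MAIN_BLKADDR 0x1000u   /* hypothetical start of the main area */

static int valid_blocks[MAIN_SEGS];   /* stand-in for per-segment SIT data */

/* Simplified model: only NULL_ADDR maps to NULL_SEGNO here. */
static unsigned int get_segno_sketch(uint32_t blkaddr)
{
    if (blkaddr == NULL_ADDR)
        return NULL_SEGNO;
    return (blkaddr - MAIN_BLKADDR) / BLKS_PER_SEG;
}

/* Returns 1 if an entry was updated, 0 if the address was skipped. */
static int update_sit_entry_sketch(uint32_t blkaddr, int del)
{
    unsigned int segno = get_segno_sketch(blkaddr);

    if (segno == NULL_SEGNO)    /* the guard now lives in the callee */
        return 0;

    assert(segno < MAIN_SEGS);  /* model only: indexing must stay in range */
    valid_blocks[segno] += del;
    return 1;
}
```

A caller can now pass an old block address unconditionally, even when
it is NULL_ADDR. Note that in this model an out-of-range address such
as 0x4000 still maps to segno 24 rather than NULL_SEGNO, so the guard
alone does not catch it; the address also has to be range-checked when
the segno is computed.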


2024-02-21 18:10:50

by patchwork-bot+f2fs

[permalink] [raw]
Subject: Re: [f2fs-dev] [PATCH v2 0/4] f2fs: fix panic issue in small capacity device

Hello:

This series was applied to jaegeuk/f2fs.git (dev)
by Jaegeuk Kim <[email protected]>:

On Wed, 7 Feb 2024 10:01:00 +0800 you wrote:
> A panic issue happened in a reboot test in small capacity device
> as following:
> 1.The device size is 64MB, and main area has 24 segments, and
> CONFIG_F2FS_CHECK_FS is not enabled.
> 2.There is no any free segments left shown in free_segmap_info,
> then another write request cause get_new_segment get a out-of-bound
> segment with segno 24.
> 3.panic happen in update_sit_entry because access invalid bitmap
> pointer.
>
> [...]

Here is the summary with links:
- [f2fs-dev,v2,1/4] f2fs: correct counting methods of free_segments in __set_inuse
https://git.kernel.org/jaegeuk/f2fs/c/8bac4167fd14
- [f2fs-dev,v2,2/4] f2fs: fix panic issue in update_sit_entry
https://git.kernel.org/jaegeuk/f2fs/c/4acac2bf18d6
- [f2fs-dev,v2,3/4] f2fs: enhance judgment conditions of GET_SEGNO
(no matching commit)
- [f2fs-dev,v2,4/4] f2fs: stop checkpoint when get a out-of-bounds segment
(no matching commit)

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



2024-02-22 12:33:21

by Chao Yu

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] f2fs: fix panic issue in small capacity device

On 2024/2/7 10:01, Zhiguo Niu wrote:
> A panic issue happened in a reboot test in small capacity device
> as following:
> 1.The device size is 64MB, and main area has 24 segments, and
> CONFIG_F2FS_CHECK_FS is not enabled.
> 2.There is no any free segments left shown in free_segmap_info,
> then another write request cause get_new_segment get a out-of-bound
> segment with segno 24.
> 3.panic happen in update_sit_entry because access invalid bitmap
> pointer.

Zhiguo,

Can you please try the patch below to see whether it fixes your problem?

https://lore.kernel.org/linux-f2fs-devel/[email protected]

Thanks,


2024-02-23 02:38:53

by Chao Yu

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] f2fs: fix panic issue in small capacity device

On 2024/2/23 10:01, Zhiguo Niu wrote:
>
>
> On Thu, Feb 22, 2024 at 8:30 PM Chao Yu <[email protected]> wrote:
>
> On 2024/2/7 10:01, Zhiguo Niu wrote:
> > A panic issue happened in a reboot test in small capacity device
> > as following:
> > 1.The device size is 64MB, and main area has 24 segments, and
> > CONFIG_F2FS_CHECK_FS is not enabled.
> > 2.There is no any free segments left shown in free_segmap_info,
> > then another write request cause get_new_segment get a out-of-bound
> > segment with segno 24.
> > 3.panic happen in update_sit_entry because access invalid bitmap
> > pointer.
>
> Zhiguo,
>
> Can you please try below patch to see whether it can fix your problem?
>
> https://lore.kernel.org/linux-f2fs-devel/[email protected]
>
> Thanks,
>
>
> Dear Chao,
> I need to coordinate the testing resources. The previous testing has been stopped because it was fixed with the current patch. In addition, this requires stability testing to reproduce, so it will take a certain amount of time. If there is any situation, I will tell you in time.

Zhiguo, thank you!

BTW, I've tested this patch for a while, and it looks like there is no
issue w/ FAULT_NO_SEGMENT fault injection on.

> btw, Why can’t I see this patch on your branch^^?
> https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test

Too lazy to push patches in time, will do it this weekend. :P

> thanks!

2024-02-26 03:26:02

by Zhiguo Niu

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] f2fs: fix panic issue in small capacity device

Dear Chao,

On Fri, Feb 23, 2024 at 10:38 AM Chao Yu <[email protected]> wrote:
>
> On 2024/2/23 10:01, Zhiguo Niu wrote:
> >
> >
> > On Thu, Feb 22, 2024 at 8:30 PM Chao Yu <[email protected]> wrote:
> >
> > On 2024/2/7 10:01, Zhiguo Niu wrote:
> > > A panic issue happened in a reboot test in small capacity device
> > > as following:
> > > 1.The device size is 64MB, and main area has 24 segments, and
> > > CONFIG_F2FS_CHECK_FS is not enabled.
> > > 2.There is no any free segments left shown in free_segmap_info,
> > > then another write request cause get_new_segment get a out-of-bound
> > > segment with segno 24.
> > > 3.panic happen in update_sit_entry because access invalid bitmap
> > > pointer.
> >
> > Zhiguo,
> >
> > Can you please try below patch to see whether it can fix your problem?
> >
> > https://lore.kernel.org/linux-f2fs-devel/[email protected]
> >
> > Thanks,
> >
> >
> > Dear Chao,
> > I need to coordinate the testing resources. The previous testing has been stopped because it was fixed with the current patch. In addition, this requires stability testing to reproduce, so it will take a certain amount of time. If there is any situation, I will tell you in time.
>
> Zhiguo, thank you!

We tested this patch this weekend on the previous version that had the
problem, and the panic could not be reproduced,
so this patch should fix the original issue.
Thanks!


2024-02-26 06:48:28

by Chao Yu

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] f2fs: fix panic issue in small capacity device

On 2024/2/26 11:25, Zhiguo Niu wrote:
> Dear Chao,
>
> On Fri, Feb 23, 2024 at 10:38 AM Chao Yu <[email protected]> wrote:
>>
>> On 2024/2/23 10:01, Zhiguo Niu wrote:
>>>
>>>
>>> On Thu, Feb 22, 2024 at 8:30 PM Chao Yu <[email protected]> wrote:
>>>
>>> On 2024/2/7 10:01, Zhiguo Niu wrote:
>>> > A panic issue happened in a reboot test in small capacity device
>>> > as following:
>>> > 1.The device size is 64MB, and main area has 24 segments, and
>>> > CONFIG_F2FS_CHECK_FS is not enabled.
>>> > 2.There is no any free segments left shown in free_segmap_info,
>>> > then another write request cause get_new_segment get a out-of-bound
>>> > segment with segno 24.
>>> > 3.panic happen in update_sit_entry because access invalid bitmap
>>> > pointer.
>>>
>>> Zhiguo,
>>>
>>> Can you please try below patch to see whether it can fix your problem?
>>>
>>> https://lore.kernel.org/linux-f2fs-devel/[email protected]
>>>
>>> Thanks,
>>>
>>>
>>> Dear Chao,
>>> I need to coordinate the testing resources. The previous testing has been stopped because it was fixed with the current patch. In addition, this requires stability testing to reproduce, so it will take a certain amount of time. If there is any situation, I will tell you in time.
>>
>> Zhiguo, thank you!
>
> We tested this patch this weekend on the previous version with
> problem, and it can not reproduce panic issues,
> so this patch should fix the original issue.

Zhiguo,

Thanks a lot for the test!

Do you mind replying to the original patch with the tag below?

Tested-by: Zhiguo Niu <[email protected]>

Thanks,


2024-02-27 01:13:47

by Jaegeuk Kim

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] f2fs: fix panic issue in small capacity device

On 02/26, Zhiguo Niu wrote:
> Dear Chao,
>
> On Fri, Feb 23, 2024 at 10:38 AM Chao Yu <[email protected]> wrote:
> >
> > On 2024/2/23 10:01, Zhiguo Niu wrote:
> > >
> > >
> > > On Thu, Feb 22, 2024 at 8:30 PM Chao Yu <[email protected]> wrote:
> > >
> > > On 2024/2/7 10:01, Zhiguo Niu wrote:
> > > > A panic issue happened in a reboot test in small capacity device
> > > > as following:
> > > > 1.The device size is 64MB, and main area has 24 segments, and
> > > > CONFIG_F2FS_CHECK_FS is not enabled.
> > > > 2.There is no any free segments left shown in free_segmap_info,
> > > > then another write request cause get_new_segment get a out-of-bound
> > > > segment with segno 24.
> > > > 3.panic happen in update_sit_entry because access invalid bitmap
> > > > pointer.
> > >
> > > Zhiguo,
> > >
> > > Can you please try below patch to see whether it can fix your problem?
> > >
> > > https://lore.kernel.org/linux-f2fs-devel/[email protected]
> > >
> > > Thanks,
> > >
> > >
> > > Dear Chao,
> > > I need to coordinate the testing resources. The previous testing has been stopped because it was fixed with the current patch. In addition, this requires stability testing to reproduce, so it will take a certain amount of time. If there is any situation, I will tell you in time.
> >
> > Zhiguo, thank you!
>
> We tested this patch this weekend on the previous version with
> problem, and it can not reproduce panic issues,
> so this patch should fix the original issue.
> thanks!

Hey, could you guys please point out which patches were tested without which?
IOW, which patches should I remove if I keep Chao's patch?


2024-02-27 02:37:08

by Zhiguo Niu

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] f2fs: fix panic issue in small capacity device

On Tue, Feb 27, 2024 at 9:13 AM Jaegeuk Kim <[email protected]> wrote:
>
> On 02/26, Zhiguo Niu wrote:
> > Dear Chao,
> >
> > On Fri, Feb 23, 2024 at 10:38 AM Chao Yu <[email protected]> wrote:
> > >
> > > On 2024/2/23 10:01, Zhiguo Niu wrote:
> > > >
> > > >
> > > > On Thu, Feb 22, 2024 at 8:30 PM Chao Yu <[email protected]> wrote:
> > > >
> > > > On 2024/2/7 10:01, Zhiguo Niu wrote:
> > > > > A panic issue happened in a reboot test in small capacity device
> > > > > as following:
> > > > > 1.The device size is 64MB, and main area has 24 segments, and
> > > > > CONFIG_F2FS_CHECK_FS is not enabled.
> > > > > 2.There is no any free segments left shown in free_segmap_info,
> > > > > then another write request cause get_new_segment get a out-of-bound
> > > > > segment with segno 24.
> > > > > 3.panic happen in update_sit_entry because access invalid bitmap
> > > > > pointer.
> > > >
> > > > Zhiguo,
> > > >
> > > > Can you please try below patch to see whether it can fix your problem?
> > > >
> > > > https://lore.kernel.org/linux-f2fs-devel/[email protected]
> > > >
> > > > Thanks,
> > > >
> > > >
> > > > Dear Chao,
> > > > I need to coordinate the testing resources. The previous testing has been stopped because it was fixed with the current patch. In addition, this requires stability testing to reproduce, so it will take a certain amount of time. If there is any situation, I will tell you in time.
> > >
> > > Zhiguo, thank you!
> >
> > We tested this patch this weekend on the previous version with
> > problem, and it can not reproduce panic issues,
> > so this patch should fix the original issue.
> > thanks!
>
Dear Jaegeuk,
> Hey, do you guys please point out which patches were tested without what?
This problem occurred during our platform stability testing.
It can be fixed by my patch set, mainly by:
f2fs: fix panic issue in update_sit_entry & f2fs: enhance judgment
conditions of GET_SEGNO
Chao's patch can also fix this problem when tested without my patches.
> IOWs, which patches should I remove and keep Chao's patch?
I think Chao's patch is more reasonable; its error handling is more complete.
My patches just add some sanity checks on the return value of GET_SEGNO,
the same as other code does (update_segment_mtime),
and I think they are also needed, except for this part:

diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
index 3bf2ce46fa0907..bb22feeae1cfcb 100644
--- a/fs/f2fs/segment.h
+++ b/fs/f2fs/segment.h
@@ -96,7 +96,8 @@ static inline void sanity_check_seg_type(struct f2fs_sb_info *sbi,
 	(GET_SEGOFF_FROM_SEG0(sbi, blk_addr) & (BLKS_PER_SEG(sbi) - 1))
 #define GET_SEGNO(sbi, blk_addr) \
-	((!__is_valid_data_blkaddr(blk_addr)) ? \
+	((!__is_valid_data_blkaddr(blk_addr) || \
+	!f2fs_is_valid_blkaddr(sbi, blk_addr, DATA_GENERIC)) ? \
 	NULL_SEGNO : GET_L2R_SEGNO(FREE_I(sbi), \
 		GET_SEGNO_FROM_SEG0(sbi, blk_addr)))
 #define CAP_BLKS_PER_SEC(sbi)
Chao's patch sets new_addr to null_addr when get_new_segment
returns NOSPACE,
so I think this part can be reverted, which also saves run time.
Chao, what is your opinion?
Thanks!
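
[Editor's sketch, not from the thread] The GET_SEGNO hunk above adds a
range check so that an address outside the main area maps to
NULL_SEGNO instead of a segno past the end. The userspace model below
contrasts the two behaviours. It assumes the added validity check
amounts to MAIN_BLKADDR <= blkaddr < MAX_BLKADDR; the layout constants
(512 blocks per segment, main area starting at 0x1000) are
hypothetical, chosen so that 24 segments end at block 0x4000 as in the
report, and both function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define NULL_SEGNO   ((unsigned int)~0u)
#define BLKS_PER_SEG 512u      /* assumed: 2MB segment / 4KB block */
#define MAIN_SEGS    24u       /* from the report */
#define MAIN_BLKADDR 0x1000u   /* hypothetical start of the main area */
#define MAX_BLKADDR  (MAIN_BLKADDR + MAIN_SEGS * BLKS_PER_SEG)  /* 0x4000 */

/* Pure arithmetic: no upper-bound check on the address. */
static unsigned int get_segno_unchecked(uint32_t blkaddr)
{
    assert(blkaddr >= MAIN_BLKADDR);  /* model only: avoid unsigned wrap */
    return (blkaddr - MAIN_BLKADDR) / BLKS_PER_SEG;
}

/* With the range check: out-of-range addresses become NULL_SEGNO. */
static unsigned int get_segno_checked(uint32_t blkaddr)
{
    if (blkaddr < MAIN_BLKADDR || blkaddr >= MAX_BLKADDR)
        return NULL_SEGNO;
    return get_segno_unchecked(blkaddr);
}
```

Under these assumptions the unchecked version turns block 0x4000 into
segno 24, while the checked version returns NULL_SEGNO, which a guard
in the caller can then reject. The cost is an extra comparison on every
lookup, which is the run-time overhead the message above refers to.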

2024-02-27 17:18:34

by Jaegeuk Kim

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] f2fs: fix panic issue in small capacity device

On 02/27, Zhiguo Niu wrote:
> On Tue, Feb 27, 2024 at 9:13 AM Jaegeuk Kim <[email protected]> wrote:
> >
> > On 02/26, Zhiguo Niu wrote:
> > > Dear Chao,
> > >
> > > On Fri, Feb 23, 2024 at 10:38 AM Chao Yu <[email protected]> wrote:
> > > >
> > > > On 2024/2/23 10:01, Zhiguo Niu wrote:
> > > > >
> > > > >
> > > > > On Thu, Feb 22, 2024 at 8:30 PM Chao Yu <[email protected]> wrote:
> > > > >
> > > > > On 2024/2/7 10:01, Zhiguo Niu wrote:
> > > > > > A panic issue happened in a reboot test in small capacity device
> > > > > > as following:
> > > > > > 1.The device size is 64MB, and main area has 24 segments, and
> > > > > > CONFIG_F2FS_CHECK_FS is not enabled.
> > > > > > 2.There is no any free segments left shown in free_segmap_info,
> > > > > > then another write request cause get_new_segment get a out-of-bound
> > > > > > segment with segno 24.
> > > > > > 3.panic happen in update_sit_entry because access invalid bitmap
> > > > > > pointer.
> > > > >
> > > > > Zhiguo,
> > > > >
> > > > > Can you please try below patch to see whether it can fix your problem?
> > > > >
> > > > > https://lore.kernel.org/linux-f2fs-devel/[email protected]
> > > > >
> > > > > Thanks,
> > > > >
> > > > >
> > > > > Dear Chao,
> > > > > I need to coordinate the testing resources. The previous testing has been stopped because it was fixed with the current patch. In addition, this requires stability testing to reproduce, so it will take a certain amount of time. If there is any situation, I will tell you in time.
> > > >
> > > > Zhiguo, thank you!
> > >
> > > We tested this patch this weekend on the previous version with
> > > problem, and it can not reproduce panic issues,
> > > so this patch should fix the original issue.
> > > thanks!
> >
> Dear Jaegeuk,
> > Hey, do you guys please point out which patches were tested without what?
> This problem occurred during our platform stability testing.
> it can be fixed by my this patch set, mainly be fixed by:
> f2fs: fix panic issue in update_sit_entry & f2fs: enhance judgment
> conditions of GET_SEGNO
> and Chao's patch can also fix this problems testing without my patch
> > IOWs, which patches should I remove and keep Chao's patch?
> I think chao's patch is more reasonable, it does error handling more complete.
> but my patch just do some sanity check for return value of GET_SEGNO
> Same as other codes(update_segment_mtime)
> and i think it also needed except this part:

Thanks for confirmation. It seems it'd be better to revert yours and apply
Chao's patch first. If you think there's something to improve on top of it,
could you please send another patch afterwards?

>
> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
> index 3bf2ce46fa0907..bb22feeae1cfcb 100644
> --- a/fs/f2fs/segment.h
> +++ b/fs/f2fs/segment.h
> @@ -96,7 +96,8 @@ static inline void sanity_check_seg_type(struct
> f2fs_sb_info *sbi,
> (GET_SEGOFF_FROM_SEG0(sbi, blk_addr) & (BLKS_PER_SEG(sbi) - 1))
> #define GET_SEGNO(sbi, blk_addr) \
> - ((!__is_valid_data_blkaddr(blk_addr)) ? \
> + ((!__is_valid_data_blkaddr(blk_addr) || \
> + !f2fs_is_valid_blkaddr(sbi, blk_addr, DATA_GENERIC)) ? \
> NULL_SEGNO : GET_L2R_SEGNO(FREE_I(sbi), \
> GET_SEGNO_FROM_SEG0(sbi, blk_addr)))
> #define CAP_BLKS_PER_SEC(sbi)
> because Chao's patch let new_addr=null_addr when get_new_segment
> returns NOSPACE,
> so I think this can be reverted and it also saves code running time.
> How about Chao's opinions?
> thanks!
> >
> > >
> > > >
> > > > BTW, I've tested this patch for a while, and it looks there is no issue w/
> > > > FAULT_NO_SEGMENT fault injection is on.
> > > >
> > > > > btw, Why can’t I see this patch on your branch^^?
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test <https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test>
> > > >
> > > > Too lazy to push patches in time, will do it in this weekend. :P
> > > >
> > > > > thanks!
> > > > >
> > > > >
> > > > > >
> > > > > > More detail shown in following patch sets.
> > > > > > The three patches are splited here because the modifications are
> > > > > > relatively independent and more readable.
> > > > > >
> > > > > > ---
> > > > > > Changes of v2: stop checkpoint when get a out-of-bound segment
> > > > > > ---
> > > > > >
> > > > > > Zhiguo Niu (4):
> > > > > > f2fs: correct counting methods of free_segments in __set_inuse
> > > > > > f2fs: fix panic issue in update_sit_entry
> > > > > > f2fs: enhance judgment conditions of GET_SEGNO
> > > > > > f2fs: stop checkpoint when get a out-of-bounds segment
> > > > > >
> > > > > > fs/f2fs/file.c | 7 ++++++-
> > > > > > fs/f2fs/segment.c | 21 ++++++++++++++++-----
> > > > > > fs/f2fs/segment.h | 7 ++++---
> > > > > > include/linux/f2fs_fs.h | 1 +
> > > > > > 4 files changed, 27 insertions(+), 9 deletions(-)
> > > > > >
> > > > >

2024-02-28 01:01:07

by Zhiguo Niu

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] f2fs: fix panic issue in small capacity device

On Wed, Feb 28, 2024 at 1:18 AM Jaegeuk Kim <[email protected]> wrote:
>
> On 02/27, Zhiguo Niu wrote:
> > On Tue, Feb 27, 2024 at 9:13 AM Jaegeuk Kim <[email protected]> wrote:
> > >
> > > On 02/26, Zhiguo Niu wrote:
> > > > Dear Chao,
> > > >
> > > > On Fri, Feb 23, 2024 at 10:38 AM Chao Yu <[email protected]> wrote:
> > > > >
> > > > > On 2024/2/23 10:01, Zhiguo Niu wrote:
> > > > > >
> > > > > >
> > > > > > On Thu, Feb 22, 2024 at 8:30 PM Chao Yu <[email protected]> wrote:
> > > > > >
> > > > > > On 2024/2/7 10:01, Zhiguo Niu wrote:
> > > > > > > A panic issue happened in a reboot test in small capacity device
> > > > > > > as following:
> > > > > > > 1.The device size is 64MB, and main area has 24 segments, and
> > > > > > > CONFIG_F2FS_CHECK_FS is not enabled.
> > > > > > > 2.There is no any free segments left shown in free_segmap_info,
> > > > > > > then another write request cause get_new_segment get a out-of-bound
> > > > > > > segment with segno 24.
> > > > > > > 3.panic happen in update_sit_entry because access invalid bitmap
> > > > > > > pointer.
> > > > > >
> > > > > > Zhiguo,
> > > > > >
> > > > > > Can you please try below patch to see whether it can fix your problem?
> > > > > >
> > > > > > https://lore.kernel.org/linux-f2fs-devel/[email protected]
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > >
> > > > > > Dear Chao,
> > > > > > I need to coordinate the testing resources. The previous testing has been stopped because it was fixed with the current patch. In addition, this requires stability testing to reproduce, so it will take a certain amount of time. If there is any situation, I will tell you in time.
> > > > >
> > > > > Zhiguo, thank you!
> > > >
> > > > We tested this patch this weekend on the previous version with
> > > > problem, and it can not reproduce panic issues,
> > > > so this patch should fix the original issue.
> > > > thanks!
> > >
> > Dear Jaegeuk,
> > > Hey, do you guys please point out which patches were tested without what?
> > This problem occurred during our platform stability testing.
> > it can be fixed by my this patch set, mainly be fixed by:
> > f2fs: fix panic issue in update_sit_entry & f2fs: enhance judgment
> > conditions of GET_SEGNO
> > and Chao's patch can also fix this problems testing without my patch
> > > IOWs, which patches should I remove and keep Chao's patch?
> > I think chao's patch is more reasonable, it does error handling more complete.
> > but my patch just do some sanity check for return value of GET_SEGNO
> > Same as other codes(update_segment_mtime)
> > and i think it also needed except this part:
>
> Thanks for confirmation. It seems it'd be better to revert yours and apply
> Chao's patch first. If you think there's something to improve on top of it,
> could you please send another patch afterwards?

OK, I think these two patches are still needed:
f2fs: correct counting methods of free_segments in __set_inuse
f2fs: fix panic issue in update_sit_entry
I'll reorganize them.
Thanks
>
> >
> > diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
> > index 3bf2ce46fa0907..bb22feeae1cfcb 100644
> > --- a/fs/f2fs/segment.h
> > +++ b/fs/f2fs/segment.h
> > @@ -96,7 +96,8 @@ static inline void sanity_check_seg_type(struct
> > f2fs_sb_info *sbi,
> > (GET_SEGOFF_FROM_SEG0(sbi, blk_addr) & (BLKS_PER_SEG(sbi) - 1))
> > #define GET_SEGNO(sbi, blk_addr) \
> > - ((!__is_valid_data_blkaddr(blk_addr)) ? \
> > + ((!__is_valid_data_blkaddr(blk_addr) || \
> > + !f2fs_is_valid_blkaddr(sbi, blk_addr, DATA_GENERIC)) ? \
> > NULL_SEGNO : GET_L2R_SEGNO(FREE_I(sbi), \
> > GET_SEGNO_FROM_SEG0(sbi, blk_addr)))
> > #define CAP_BLKS_PER_SEC(sbi)
> > because Chao's patch let new_addr=null_addr when get_new_segment
> > returns NOSPACE,
> > so I think this can be reverted and it also saves code running time.
> > How about Chao's opinions?
> > thanks!
> > >
> > > >
> > > > >
> > > > > > BTW, I've tested this patch for a while, and it looks like there is no
> > > > > > issue when FAULT_NO_SEGMENT fault injection is on.
> > > > >
> > > > > > btw, Why can’t I see this patch on your branch^^?
> > > > > > > https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test
> > > > >
> > > > > Too lazy to push patches in time, will do it in this weekend. :P
> > > > >
> > > > > > thanks!
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > More detail shown in following patch sets.
> > > > > > > The three patches are splited here because the modifications are
> > > > > > > relatively independent and more readable.
> > > > > > >
> > > > > > > ---
> > > > > > > Changes of v2: stop checkpoint when get a out-of-bound segment
> > > > > > > ---
> > > > > > >
> > > > > > > Zhiguo Niu (4):
> > > > > > > f2fs: correct counting methods of free_segments in __set_inuse
> > > > > > > f2fs: fix panic issue in update_sit_entry
> > > > > > > f2fs: enhance judgment conditions of GET_SEGNO
> > > > > > > f2fs: stop checkpoint when get a out-of-bounds segment
> > > > > > >
> > > > > > > fs/f2fs/file.c | 7 ++++++-
> > > > > > > fs/f2fs/segment.c | 21 ++++++++++++++++-----
> > > > > > > fs/f2fs/segment.h | 7 ++++---
> > > > > > > include/linux/f2fs_fs.h | 1 +
> > > > > > > 4 files changed, 27 insertions(+), 9 deletions(-)
> > > > > > >
> > > > > >

2024-02-28 01:22:01

by Chao Yu

[permalink] [raw]
Subject: Re: [PATCH v2 0/4] f2fs: fix panic issue in small capacity device

On 2024/2/28 1:18, Jaegeuk Kim wrote:
> On 02/27, Zhiguo Niu wrote:
>> On Tue, Feb 27, 2024 at 9:13 AM Jaegeuk Kim <[email protected]> wrote:
>>>
>>> On 02/26, Zhiguo Niu wrote:
>>>> Dear Chao,
>>>>
>>>> On Fri, Feb 23, 2024 at 10:38 AM Chao Yu <[email protected]> wrote:
>>>>>
>>>>> On 2024/2/23 10:01, Zhiguo Niu wrote:
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 22, 2024 at 8:30 PM Chao Yu <[email protected]> wrote:
>>>>>>
>>>>>> On 2024/2/7 10:01, Zhiguo Niu wrote:
>>>>>> > A panic issue happened in a reboot test in small capacity device
>>>>>> > as following:
>>>>>> > 1.The device size is 64MB, and main area has 24 segments, and
>>>>>> > CONFIG_F2FS_CHECK_FS is not enabled.
>>>>>> > 2.There is no any free segments left shown in free_segmap_info,
>>>>>> > then another write request cause get_new_segment get a out-of-bound
>>>>>> > segment with segno 24.
>>>>>> > 3.panic happen in update_sit_entry because access invalid bitmap
>>>>>> > pointer.
>>>>>>
>>>>>> Zhiguo,
>>>>>>
>>>>>> Can you please try below patch to see whether it can fix your problem?
>>>>>>
>>>>>> https://lore.kernel.org/linux-f2fs-devel/[email protected]
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>>
>>>>>> Dear Chao,
>>>>>> I need to coordinate the testing resources. The previous testing has been stopped because it was fixed with the current patch. In addition, this requires stability testing to reproduce, so it will take a certain amount of time. If there is any situation, I will tell you in time.
>>>>>
>>>>> Zhiguo, thank you!
>>>>
>>>> We tested this patch this weekend on the previous version that had the
>>>> problem, and the panic could not be reproduced,
>>>> so this patch should fix the original issue.
>>>> thanks!
>>>
>> Dear Jaegeuk,
>>> Hey, could you guys please point out which patches were tested without what?
>> This problem occurred during our platform stability testing.
>> It can be fixed by my patch set, mainly by:
>> f2fs: fix panic issue in update_sit_entry & f2fs: enhance judgment
>> conditions of GET_SEGNO
>> and Chao's patch can also fix this problem when tested without my patches.
>>> IOWs, which patches should I remove and keep Chao's patch?
>> I think Chao's patch is more reasonable; its error handling is more complete.
>> My patch just does some sanity checks on the return value of GET_SEGNO,
>> the same as other code (update_segment_mtime),
>> and I think it is also needed, except for this part:
>
> Thanks for confirmation. It seems it'd be better to revert yours and apply
> Chao's patch first. If you think there's something to improve on top of it,
> could you please send another patch afterwards?

Agreed.

Thanks,

>
>>
>> diff --git a/fs/f2fs/segment.h b/fs/f2fs/segment.h
>> index 3bf2ce46fa0907..bb22feeae1cfcb 100644
>> --- a/fs/f2fs/segment.h
>> +++ b/fs/f2fs/segment.h
>> @@ -96,7 +96,8 @@ static inline void sanity_check_seg_type(struct
>> f2fs_sb_info *sbi,
>> (GET_SEGOFF_FROM_SEG0(sbi, blk_addr) & (BLKS_PER_SEG(sbi) - 1))
>> #define GET_SEGNO(sbi, blk_addr) \
>> - ((!__is_valid_data_blkaddr(blk_addr)) ? \
>> + ((!__is_valid_data_blkaddr(blk_addr) || \
>> + !f2fs_is_valid_blkaddr(sbi, blk_addr, DATA_GENERIC)) ? \
>> NULL_SEGNO : GET_L2R_SEGNO(FREE_I(sbi), \
>> GET_SEGNO_FROM_SEG0(sbi, blk_addr)))
>> #define CAP_BLKS_PER_SEC(sbi)
>> because Chao's patch sets new_addr=null_addr when get_new_segment
>> returns NOSPACE,
>> so I think this part can be reverted, which also saves run time.
>> What is Chao's opinion?
>> thanks!
>>>
>>>>
>>>>>
>>>>> BTW, I've tested this patch for a while, and it looks like there is no
>>>>> issue when FAULT_NO_SEGMENT fault injection is on.
>>>>>
>>>>>> btw, Why can’t I see this patch on your branch^^?
>>>>>> https://git.kernel.org/pub/scm/linux/kernel/git/chao/linux.git/log/?h=dev-test
>>>>>
>>>>> Too lazy to push patches in time, will do it in this weekend. :P
>>>>>
>>>>>> thanks!
>>>>>>
>>>>>>
>>>>>> >
>>>>>> > More detail shown in following patch sets.
>>>>>> > The three patches are splited here because the modifications are
>>>>>> > relatively independent and more readable.
>>>>>> >
>>>>>> > ---
>>>>>> > Changes of v2: stop checkpoint when get a out-of-bound segment
>>>>>> > ---
>>>>>> >
>>>>>> > Zhiguo Niu (4):
>>>>>> > f2fs: correct counting methods of free_segments in __set_inuse
>>>>>> > f2fs: fix panic issue in update_sit_entry
>>>>>> > f2fs: enhance judgment conditions of GET_SEGNO
>>>>>> > f2fs: stop checkpoint when get a out-of-bounds segment
>>>>>> >
>>>>>> > fs/f2fs/file.c | 7 ++++++-
>>>>>> > fs/f2fs/segment.c | 21 ++++++++++++++++-----
>>>>>> > fs/f2fs/segment.h | 7 ++++---
>>>>>> > include/linux/f2fs_fs.h | 1 +
>>>>>> > 4 files changed, 27 insertions(+), 9 deletions(-)
>>>>>> >
>>>>>>