2020-11-09 17:02:27

by Jaegeuk Kim

[permalink] [raw]
Subject: [PATCH] f2fs: avoid race condition for shinker count

Light reported sometimes shinker gets nat_cnt < dirty_nat_cnt resulting in
wrong do_shinker work. Basically the two counts should not happen like that.

So, I suspect this race condtion where:
- f2fs_try_to_free_nats __flush_nat_entry_set
nat_cnt=2, dirty_nat_cnt=2
__clear_nat_cache_dirty
spin_lock(nat_list_lock)
list_move()
spin_unlock(nat_list_lock)
spin_lock(nat_list_lock)
list_del()
spin_unlock(nat_list_lock)
nat_cnt=1, dirty_nat_cnt=2
nat_cnt=1, dirty_nat_cnt=1

Reported-by: Light Hsieh <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
---
fs/f2fs/node.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
index 42394de6c7eb..e8ec65e40f06 100644
--- a/fs/f2fs/node.c
+++ b/fs/f2fs/node.c
@@ -269,11 +269,10 @@ static void __clear_nat_cache_dirty(struct f2fs_nm_info *nm_i,
{
spin_lock(&nm_i->nat_list_lock);
list_move_tail(&ne->list, &nm_i->nat_entries);
- spin_unlock(&nm_i->nat_list_lock);
-
set_nat_flag(ne, IS_DIRTY, false);
set->entry_cnt--;
nm_i->dirty_nat_cnt--;
+ spin_unlock(&nm_i->nat_list_lock);
}

static unsigned int __gang_lookup_nat_set(struct f2fs_nm_info *nm_i,
--
2.29.2.222.g5d2a92d10f8-goog


2020-11-10 02:18:15

by Chao Yu

[permalink] [raw]
Subject: Re: [f2fs-dev] [PATCH] f2fs: avoid race condition for shinker count

On 2020/11/10 1:00, Jaegeuk Kim wrote:
> Light reported sometimes shinker gets nat_cnt < dirty_nat_cnt resulting in

I didn't get the problem clearly, did you mean __count_nat_entries() will
give the wrong shrink count due to race condition? should there be a lock
while reading these two variables?

> wrong do_shinker work. Basically the two counts should not happen like that.
>
> So, I suspect this race condtion where:
> - f2fs_try_to_free_nats __flush_nat_entry_set
> nat_cnt=2, dirty_nat_cnt=2
> __clear_nat_cache_dirty
> spin_lock(nat_list_lock)
> list_move()
> spin_unlock(nat_list_lock)
> spin_lock(nat_list_lock)
> list_del()
> spin_unlock(nat_list_lock)
> nat_cnt=1, dirty_nat_cnt=2
> nat_cnt=1, dirty_nat_cnt=1

nm_i->nat_cnt and nm_i->dirty_nat_cnt were protected by
nm_i->nat_tree_lock, I didn't see why expanding nat_list_lock range
will help... since there are still places nat_list_lock() didn't
cover these two reference counts.

Thanks,

>
> Reported-by: Light Hsieh <[email protected]>
> Signed-off-by: Jaegeuk Kim <[email protected]>
> ---
> fs/f2fs/node.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> index 42394de6c7eb..e8ec65e40f06 100644
> --- a/fs/f2fs/node.c
> +++ b/fs/f2fs/node.c
> @@ -269,11 +269,10 @@ static void __clear_nat_cache_dirty(struct f2fs_nm_info *nm_i,
> {
> spin_lock(&nm_i->nat_list_lock);
> list_move_tail(&ne->list, &nm_i->nat_entries);
> - spin_unlock(&nm_i->nat_list_lock);
> -
> set_nat_flag(ne, IS_DIRTY, false);
> set->entry_cnt--;
> nm_i->dirty_nat_cnt--;
> + spin_unlock(&nm_i->nat_list_lock);
> }
>
> static unsigned int __gang_lookup_nat_set(struct f2fs_nm_info *nm_i,
>

2020-11-10 04:21:50

by Jaegeuk Kim

[permalink] [raw]
Subject: Re: [f2fs-dev] [PATCH] f2fs: avoid race condition for shinker count

On 11/10, Chao Yu wrote:
> On 2020/11/10 1:00, Jaegeuk Kim wrote:
> > Light reported sometimes shinker gets nat_cnt < dirty_nat_cnt resulting in
>
> I didn't get the problem clearly, did you mean __count_nat_entries() will
> give the wrong shrink count due to race condition? should there be a lock
> while reading these two variables?
>
> > wrong do_shinker work. Basically the two counts should not happen like that.
> >
> > So, I suspect this race condtion where:
> > - f2fs_try_to_free_nats __flush_nat_entry_set
> > nat_cnt=2, dirty_nat_cnt=2
> > __clear_nat_cache_dirty
> > spin_lock(nat_list_lock)
> > list_move()
> > spin_unlock(nat_list_lock)
> > spin_lock(nat_list_lock)
> > list_del()
> > spin_unlock(nat_list_lock)
> > nat_cnt=1, dirty_nat_cnt=2
> > nat_cnt=1, dirty_nat_cnt=1
>
> nm_i->nat_cnt and nm_i->dirty_nat_cnt were protected by
> nm_i->nat_tree_lock, I didn't see why expanding nat_list_lock range
> will help... since there are still places nat_list_lock() didn't
> cover these two reference counts.

Yeah, I missed nat_tree_lock, and indeed it should cover this. So, the problem
is Light reported subtle case of nat_cnt < dirty_nat_cnt in shrink_count.
We may need to use nat_tree_lock in shrink_count?

>
> Thanks,
>
> >
> > Reported-by: Light Hsieh <[email protected]>
> > Signed-off-by: Jaegeuk Kim <[email protected]>
> > ---
> > fs/f2fs/node.c | 3 +--
> > 1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> > index 42394de6c7eb..e8ec65e40f06 100644
> > --- a/fs/f2fs/node.c
> > +++ b/fs/f2fs/node.c
> > @@ -269,11 +269,10 @@ static void __clear_nat_cache_dirty(struct f2fs_nm_info *nm_i,
> > {
> > spin_lock(&nm_i->nat_list_lock);
> > list_move_tail(&ne->list, &nm_i->nat_entries);
> > - spin_unlock(&nm_i->nat_list_lock);
> > -
> > set_nat_flag(ne, IS_DIRTY, false);
> > set->entry_cnt--;
> > nm_i->dirty_nat_cnt--;
> > + spin_unlock(&nm_i->nat_list_lock);
> > }
> > static unsigned int __gang_lookup_nat_set(struct f2fs_nm_info *nm_i,
> >

2020-11-10 06:08:14

by Chao Yu

[permalink] [raw]
Subject: Re: [f2fs-dev] [PATCH] f2fs: avoid race condition for shinker count

On 2020/11/10 12:19, Jaegeuk Kim wrote:
> On 11/10, Chao Yu wrote:
>> On 2020/11/10 1:00, Jaegeuk Kim wrote:
>>> Light reported sometimes shinker gets nat_cnt < dirty_nat_cnt resulting in
>>
>> I didn't get the problem clearly, did you mean __count_nat_entries() will
>> give the wrong shrink count due to race condition? should there be a lock
>> while reading these two variables?
>>
>>> wrong do_shinker work. Basically the two counts should not happen like that.
>>>
>>> So, I suspect this race condtion where:
>>> - f2fs_try_to_free_nats __flush_nat_entry_set
>>> nat_cnt=2, dirty_nat_cnt=2
>>> __clear_nat_cache_dirty
>>> spin_lock(nat_list_lock)
>>> list_move()
>>> spin_unlock(nat_list_lock)
>>> spin_lock(nat_list_lock)
>>> list_del()
>>> spin_unlock(nat_list_lock)
>>> nat_cnt=1, dirty_nat_cnt=2
>>> nat_cnt=1, dirty_nat_cnt=1
>>
>> nm_i->nat_cnt and nm_i->dirty_nat_cnt were protected by
>> nm_i->nat_tree_lock, I didn't see why expanding nat_list_lock range
>> will help... since there are still places nat_list_lock() didn't
>> cover these two reference counts.
>
> Yeah, I missed nat_tree_lock, and indeed it should cover this. So, the problem
> is Light reported subtle case of nat_cnt < dirty_nat_cnt in shrink_count.
> We may need to use nat_tree_lock in shrink_count?

change like this?

__count_nat_entries()

down_read(&nm_i->nat_tree_lock);
count = NM_I(sbi)->nat_cnt - NM_I(sbi)->dirty_nat_cnt;
up_read(&nm_i->nat_tree_lock);

Thanks,

>
>>
>> Thanks,
>>
>>>
>>> Reported-by: Light Hsieh <[email protected]>
>>> Signed-off-by: Jaegeuk Kim <[email protected]>
>>> ---
>>> fs/f2fs/node.c | 3 +--
>>> 1 file changed, 1 insertion(+), 2 deletions(-)
>>>
>>> diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
>>> index 42394de6c7eb..e8ec65e40f06 100644
>>> --- a/fs/f2fs/node.c
>>> +++ b/fs/f2fs/node.c
>>> @@ -269,11 +269,10 @@ static void __clear_nat_cache_dirty(struct f2fs_nm_info *nm_i,
>>> {
>>> spin_lock(&nm_i->nat_list_lock);
>>> list_move_tail(&ne->list, &nm_i->nat_entries);
>>> - spin_unlock(&nm_i->nat_list_lock);
>>> -
>>> set_nat_flag(ne, IS_DIRTY, false);
>>> set->entry_cnt--;
>>> nm_i->dirty_nat_cnt--;
>>> + spin_unlock(&nm_i->nat_list_lock);
>>> }
>>> static unsigned int __gang_lookup_nat_set(struct f2fs_nm_info *nm_i,
>>>
> .
>

2020-11-12 05:43:56

by Jaegeuk Kim

[permalink] [raw]
Subject: Re: [f2fs-dev] [PATCH] f2fs: avoid race condition for shinker count

On 11/10, Chao Yu wrote:
> On 2020/11/10 12:19, Jaegeuk Kim wrote:
> > On 11/10, Chao Yu wrote:
> > > On 2020/11/10 1:00, Jaegeuk Kim wrote:
> > > > Light reported sometimes shinker gets nat_cnt < dirty_nat_cnt resulting in
> > >
> > > I didn't get the problem clearly, did you mean __count_nat_entries() will
> > > give the wrong shrink count due to race condition? should there be a lock
> > > while reading these two variables?
> > >
> > > > wrong do_shinker work. Basically the two counts should not happen like that.
> > > >
> > > > So, I suspect this race condtion where:
> > > > - f2fs_try_to_free_nats __flush_nat_entry_set
> > > > nat_cnt=2, dirty_nat_cnt=2
> > > > __clear_nat_cache_dirty
> > > > spin_lock(nat_list_lock)
> > > > list_move()
> > > > spin_unlock(nat_list_lock)
> > > > spin_lock(nat_list_lock)
> > > > list_del()
> > > > spin_unlock(nat_list_lock)
> > > > nat_cnt=1, dirty_nat_cnt=2
> > > > nat_cnt=1, dirty_nat_cnt=1
> > >
> > > nm_i->nat_cnt and nm_i->dirty_nat_cnt were protected by
> > > nm_i->nat_tree_lock, I didn't see why expanding nat_list_lock range
> > > will help... since there are still places nat_list_lock() didn't
> > > cover these two reference counts.
> >
> > Yeah, I missed nat_tree_lock, and indeed it should cover this. So, the problem
> > is Light reported subtle case of nat_cnt < dirty_nat_cnt in shrink_count.
> > We may need to use nat_tree_lock in shrink_count?
>
> change like this?
>

Yup.

> __count_nat_entries()
>
> down_read(&nm_i->nat_tree_lock);
> count = NM_I(sbi)->nat_cnt - NM_I(sbi)->dirty_nat_cnt;
> up_read(&nm_i->nat_tree_lock);
>
> Thanks,
>
> >
> > >
> > > Thanks,
> > >
> > > >
> > > > Reported-by: Light Hsieh <[email protected]>
> > > > Signed-off-by: Jaegeuk Kim <[email protected]>
> > > > ---
> > > > fs/f2fs/node.c | 3 +--
> > > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > > >
> > > > diff --git a/fs/f2fs/node.c b/fs/f2fs/node.c
> > > > index 42394de6c7eb..e8ec65e40f06 100644
> > > > --- a/fs/f2fs/node.c
> > > > +++ b/fs/f2fs/node.c
> > > > @@ -269,11 +269,10 @@ static void __clear_nat_cache_dirty(struct f2fs_nm_info *nm_i,
> > > > {
> > > > spin_lock(&nm_i->nat_list_lock);
> > > > list_move_tail(&ne->list, &nm_i->nat_entries);
> > > > - spin_unlock(&nm_i->nat_list_lock);
> > > > -
> > > > set_nat_flag(ne, IS_DIRTY, false);
> > > > set->entry_cnt--;
> > > > nm_i->dirty_nat_cnt--;
> > > > + spin_unlock(&nm_i->nat_list_lock);
> > > > }
> > > > static unsigned int __gang_lookup_nat_set(struct f2fs_nm_info *nm_i,
> > > >
> > .
> >

2020-11-12 05:45:42

by Jaegeuk Kim

[permalink] [raw]
Subject: Re: [PATCH v2] f2fs: avoid race condition for shinker count

Light reported sometimes shinker gets nat_cnt < dirty_nat_cnt resulting in
wrong do_shinker work. Let's avoid to get stale data by using nat_tree_lock.

Reported-by: Light Hsieh <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
---
fs/f2fs/shrinker.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/f2fs/shrinker.c b/fs/f2fs/shrinker.c
index d66de5999a26..d42245ab07f4 100644
--- a/fs/f2fs/shrinker.c
+++ b/fs/f2fs/shrinker.c
@@ -18,7 +18,11 @@ static unsigned int shrinker_run_no;

static unsigned long __count_nat_entries(struct f2fs_sb_info *sbi)
{
- long count = NM_I(sbi)->nat_cnt - NM_I(sbi)->dirty_nat_cnt;
+ long count;
+
+ down_read(&nm_i->nat_tree_lock);
+ count = NM_I(sbi)->nat_cnt - NM_I(sbi)->dirty_nat_cnt;
+ up_read(&nm_i->nat_tree_lock);

return count > 0 ? count : 0;
}
--
2.29.2.299.gdc1121823c-goog