'node_map_lock' is a spinlock only used to protect calls to set_bit(),
clear_bit() and test_bit().
{set|clear}_bit() are already atomic and don't need this extra spinlock.
test_bit() only reads the bitmap for a given bit.
Remove this useless spinlock.
Signed-off-by: Christophe JAILLET <[email protected]>
---
test_bit() is NOT documented as an atomic function. However, I can't see
how it could return a wrong result here.
So review with care. There is maybe something I don't think about that is
lurking here.
---
fs/ocfs2/heartbeat.c | 11 ++++-------
fs/ocfs2/ocfs2.h | 2 --
2 files changed, 4 insertions(+), 9 deletions(-)
diff --git a/fs/ocfs2/heartbeat.c b/fs/ocfs2/heartbeat.c
index 1d72e0788943..4863ad35c242 100644
--- a/fs/ocfs2/heartbeat.c
+++ b/fs/ocfs2/heartbeat.c
@@ -35,7 +35,6 @@ static void ocfs2_node_map_init(struct ocfs2_node_map *map)
void ocfs2_init_node_maps(struct ocfs2_super *osb)
{
- spin_lock_init(&osb->node_map_lock);
ocfs2_node_map_init(&osb->osb_recovering_orphan_dirs);
}
@@ -67,9 +66,8 @@ void ocfs2_node_map_set_bit(struct ocfs2_super *osb,
if (bit==-1)
return;
BUG_ON(bit >= map->num_nodes);
- spin_lock(&osb->node_map_lock);
+
set_bit(bit, map->map);
- spin_unlock(&osb->node_map_lock);
}
void ocfs2_node_map_clear_bit(struct ocfs2_super *osb,
@@ -79,9 +77,8 @@ void ocfs2_node_map_clear_bit(struct ocfs2_super *osb,
if (bit==-1)
return;
BUG_ON(bit >= map->num_nodes);
- spin_lock(&osb->node_map_lock);
+
clear_bit(bit, map->map);
- spin_unlock(&osb->node_map_lock);
}
int ocfs2_node_map_test_bit(struct ocfs2_super *osb,
@@ -89,13 +86,13 @@ int ocfs2_node_map_test_bit(struct ocfs2_super *osb,
int bit)
{
int ret;
+
if (bit >= map->num_nodes) {
mlog(ML_ERROR, "bit=%d map->num_nodes=%d\n", bit, map->num_nodes);
BUG();
}
- spin_lock(&osb->node_map_lock);
+
ret = test_bit(bit, map->map);
- spin_unlock(&osb->node_map_lock);
return ret;
}
diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
index 740b64238312..1df193b97c30 100644
--- a/fs/ocfs2/ocfs2.h
+++ b/fs/ocfs2/ocfs2.h
@@ -302,8 +302,6 @@ struct ocfs2_super
u32 *slot_recovery_generations;
- spinlock_t node_map_lock;
-
u64 root_blkno;
u64 system_dir_blkno;
u64 bitmap_blkno;
--
2.34.1
From: Christophe JAILLET
> Sent: 19 July 2022 11:02
>
> 'node_map_lock' is a spinlock only used to protect calls to set_bit(),
> clear_bit() and test_bit().
>
> {set|clear}_bit() are already atomic and don't need this extra spinlock.
> test_bit() only reads the bitmap for a given bit.
>
> Remove this useless spinlock.
It looks to me like the calling code is racy
unless there is another lock in the callers.
While map->map is protected, the result of test_bit()
is stale - so can't be used for much.
David
>
> Signed-off-by: Christophe JAILLET <[email protected]>
> ---
> test_bit() is NOT documented as an atomic function. However, I can't see
> how it could return a wrong result here.
>
> So review with care. There is maybe something I don't think about that is
> lurking here.
> ---
> fs/ocfs2/heartbeat.c | 11 ++++-------
> fs/ocfs2/ocfs2.h | 2 --
> 2 files changed, 4 insertions(+), 9 deletions(-)
>
> diff --git a/fs/ocfs2/heartbeat.c b/fs/ocfs2/heartbeat.c
> index 1d72e0788943..4863ad35c242 100644
> --- a/fs/ocfs2/heartbeat.c
> +++ b/fs/ocfs2/heartbeat.c
> @@ -35,7 +35,6 @@ static void ocfs2_node_map_init(struct ocfs2_node_map *map)
>
> void ocfs2_init_node_maps(struct ocfs2_super *osb)
> {
> - spin_lock_init(&osb->node_map_lock);
> ocfs2_node_map_init(&osb->osb_recovering_orphan_dirs);
> }
>
> @@ -67,9 +66,8 @@ void ocfs2_node_map_set_bit(struct ocfs2_super *osb,
> if (bit==-1)
> return;
> BUG_ON(bit >= map->num_nodes);
> - spin_lock(&osb->node_map_lock);
> +
> set_bit(bit, map->map);
> - spin_unlock(&osb->node_map_lock);
> }
>
> void ocfs2_node_map_clear_bit(struct ocfs2_super *osb,
> @@ -79,9 +77,8 @@ void ocfs2_node_map_clear_bit(struct ocfs2_super *osb,
> if (bit==-1)
> return;
> BUG_ON(bit >= map->num_nodes);
> - spin_lock(&osb->node_map_lock);
> +
> clear_bit(bit, map->map);
> - spin_unlock(&osb->node_map_lock);
> }
>
> int ocfs2_node_map_test_bit(struct ocfs2_super *osb,
> @@ -89,13 +86,13 @@ int ocfs2_node_map_test_bit(struct ocfs2_super *osb,
> int bit)
> {
> int ret;
> +
> if (bit >= map->num_nodes) {
> mlog(ML_ERROR, "bit=%d map->num_nodes=%d\n", bit, map->num_nodes);
> BUG();
> }
> - spin_lock(&osb->node_map_lock);
> +
> ret = test_bit(bit, map->map);
> - spin_unlock(&osb->node_map_lock);
> return ret;
> }
>
> diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
> index 740b64238312..1df193b97c30 100644
> --- a/fs/ocfs2/ocfs2.h
> +++ b/fs/ocfs2/ocfs2.h
> @@ -302,8 +302,6 @@ struct ocfs2_super
>
> u32 *slot_recovery_generations;
>
> - spinlock_t node_map_lock;
> -
> u64 root_blkno;
> u64 system_dir_blkno;
> u64 bitmap_blkno;
> --
> 2.34.1
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Le 19/07/2022 à 12:24, David Laight a écrit :
> From: Christophe JAILLET
>> Sent: 19 July 2022 11:02
>>
>> 'node_map_lock' is a spinlock only used to protect calls to set_bit(),
>> clear_bit() and test_bit().
>>
>> {set|clear}_bit() are already atomic and don't need this extra spinlock.
>> test_bit() only reads the bitmap for a given bit.
>>
>> Remove this useless spinlock.
>
> It looks to me like the calling code is racy
> unless there is another lock in the callers.
The call chains are:
ocfs2_recover_orphans()
ocfs2_mark_recovering_orphan_dir()
spin_lock(&osb->osb_lock); <-- osb_lock spinlock
ocfs2_node_map_set_bit() <-- uses node_map_lock
...
spin_unlock(&osb->osb_lock);
...
ocfs2_clear_recovering_orphan_dir()
ocfs2_node_map_clear_bit() <-- uses node_map_lock
osb_lock is NOT taken
ocfs2_check_orphan_recovery_state()
spin_lock(&osb->osb_lock); <-- osb_lock spinlock
...
ocfs2_node_map_test_bit() <-- uses node_map_lock
...
spin_unlock(&osb->osb_lock);
So the code looks already protected by the 'osb_lock' spinlock, but I
don't know this code and ocfs2_mark_recovering_orphan_dir() looks tricky
to me. (so some other eyes are much welcome)
> While map->map is protected, the result of test_bit()
> is stale - so can't be used for much.
>
Anyway, should there be a locking issue, it is there with or without my
patch, right?
CJ
> David
>
>>
>> Signed-off-by: Christophe JAILLET <[email protected]>
>> ---
>> test_bit() is NOT documented as an atomic function. However, I can't see
>> how it could return a wrong result here.
>>
>> So review with care. There is maybe something I don't think about that is
>> lurking here.
>> ---
>> fs/ocfs2/heartbeat.c | 11 ++++-------
>> fs/ocfs2/ocfs2.h | 2 --
>> 2 files changed, 4 insertions(+), 9 deletions(-)
>>
>> diff --git a/fs/ocfs2/heartbeat.c b/fs/ocfs2/heartbeat.c
>> index 1d72e0788943..4863ad35c242 100644
>> --- a/fs/ocfs2/heartbeat.c
>> +++ b/fs/ocfs2/heartbeat.c
>> @@ -35,7 +35,6 @@ static void ocfs2_node_map_init(struct ocfs2_node_map *map)
>>
>> void ocfs2_init_node_maps(struct ocfs2_super *osb)
>> {
>> - spin_lock_init(&osb->node_map_lock);
>> ocfs2_node_map_init(&osb->osb_recovering_orphan_dirs);
>> }
>>
>> @@ -67,9 +66,8 @@ void ocfs2_node_map_set_bit(struct ocfs2_super *osb,
>> if (bit==-1)
>> return;
>> BUG_ON(bit >= map->num_nodes);
>> - spin_lock(&osb->node_map_lock);
>> +
>> set_bit(bit, map->map);
>> - spin_unlock(&osb->node_map_lock);
>> }
>>
>> void ocfs2_node_map_clear_bit(struct ocfs2_super *osb,
>> @@ -79,9 +77,8 @@ void ocfs2_node_map_clear_bit(struct ocfs2_super *osb,
>> if (bit==-1)
>> return;
>> BUG_ON(bit >= map->num_nodes);
>> - spin_lock(&osb->node_map_lock);
>> +
>> clear_bit(bit, map->map);
>> - spin_unlock(&osb->node_map_lock);
>> }
>>
>> int ocfs2_node_map_test_bit(struct ocfs2_super *osb,
>> @@ -89,13 +86,13 @@ int ocfs2_node_map_test_bit(struct ocfs2_super *osb,
>> int bit)
>> {
>> int ret;
>> +
>> if (bit >= map->num_nodes) {
>> mlog(ML_ERROR, "bit=%d map->num_nodes=%d\n", bit, map->num_nodes);
>> BUG();
>> }
>> - spin_lock(&osb->node_map_lock);
>> +
>> ret = test_bit(bit, map->map);
>> - spin_unlock(&osb->node_map_lock);
>> return ret;
>> }
>>
>> diff --git a/fs/ocfs2/ocfs2.h b/fs/ocfs2/ocfs2.h
>> index 740b64238312..1df193b97c30 100644
>> --- a/fs/ocfs2/ocfs2.h
>> +++ b/fs/ocfs2/ocfs2.h
>> @@ -302,8 +302,6 @@ struct ocfs2_super
>>
>> u32 *slot_recovery_generations;
>>
>> - spinlock_t node_map_lock;
>> -
>> u64 root_blkno;
>> u64 system_dir_blkno;
>> u64 bitmap_blkno;
>> --
>> 2.34.1
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
>
>
From: Christophe JAILLET
> Sent: 19 July 2022 14:25
>
> Le 19/07/2022 à 12:24, David Laight a écrit :
> > From: Christophe JAILLET
> >> Sent: 19 July 2022 11:02
> >>
> >> 'node_map_lock' is a spinlock only used to protect calls to set_bit(),
> >> clear_bit() and test_bit().
> >>
> >> {set|clear}_bit() are already atomic and don't need this extra spinlock.
> >> test_bit() only reads the bitmap for a given bit.
> >>
> >> Remove this useless spinlock.
> >
> > It looks to me like the calling code is racy
> > unless there is another lock in the callers.
>
> The call chains are:
> ocfs2_recover_orphans()
> ocfs2_mark_recovering_orphan_dir()
> spin_lock(&osb->osb_lock); <-- osb_lock spinlock
> ocfs2_node_map_set_bit() <-- uses node_map_lock
> ...
> spin_unlock(&osb->osb_lock);
> ...
> ocfs2_clear_recovering_orphan_dir()
> ocfs2_node_map_clear_bit() <-- uses node_map_lock
> osb_lock is NOT taken
>
>
> ocfs2_check_orphan_recovery_state()
> spin_lock(&osb->osb_lock); <-- osb_lock spinlock
> ...
> ocfs2_node_map_test_bit() <-- uses node_map_lock
> ...
> spin_unlock(&osb->osb_lock);
>
>
> So the code looks already protected by the 'osb_lock' spinlock, but I
> don't know this code and ocfs2_mark_recovering_orphan_dir() looks tricky
> to me. (so some other eyes are much welcome)
>
> > While map->map is protected, the result of test_bit()
> > is stale - so can't be used for much.
> >
>
> Anyway, should there be a locking issue, it is there with or without my
> patch, right?
Indeed.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On 7/19/22 9:25 PM, Christophe JAILLET wrote:
> Le 19/07/2022 à 12:24, David Laight a écrit :
>> From: Christophe JAILLET
>>> Sent: 19 July 2022 11:02
>>>
>>> 'node_map_lock' is a spinlock only used to protect calls to set_bit(),
>>> clear_bit() and test_bit().
>>>
>>> {set|clear}_bit() are already atomic and don't need this extra spinlock.
>>> test_bit() only reads the bitmap for a given bit.
>>>
>>> Remove this useless spinlock.
>>
>> It looks to me like the calling code is racy
>> unless there is another lock in the callers.
>
> The call chains are:
> ocfs2_recover_orphans()
> ocfs2_mark_recovering_orphan_dir()
> spin_lock(&osb->osb_lock); <-- osb_lock spinlock
> ocfs2_node_map_set_bit() <-- uses node_map_lock
> ...
> spin_unlock(&osb->osb_lock);
> ...
> ocfs2_clear_recovering_orphan_dir()
> ocfs2_node_map_clear_bit() <-- uses node_map_lock
> osb_lock is NOT taken
>
>
> ocfs2_check_orphan_recovery_state()
> spin_lock(&osb->osb_lock); <-- osb_lock spinlock
> ...
> ocfs2_node_map_test_bit() <-- uses node_map_lock
> ...
> spin_unlock(&osb->osb_lock);
>
>
> So the code looks already protected by the 'osb_lock' spinlock, but I don't know this code and ocfs2_mark_recovering_orphan_dir() looks tricky to me. (so some other eyes are much welcome)
osb_lock is to protect osb filed such as 'osb_orphan_wipes', while
node_map_lock is to protect the node map 'osb_recovering_orphan_dirs'
specifically.
Thanks,
Joseph
Le 20/07/2022 à 03:59, Joseph Qi a écrit :
>
> On 7/19/22 9:25 PM, Christophe JAILLET wrote:
>> Le 19/07/2022 à 12:24, David Laight a écrit :
>>> From: Christophe JAILLET
>>>> Sent: 19 July 2022 11:02
>>>>
>>>> 'node_map_lock' is a spinlock only used to protect calls to set_bit(),
>>>> clear_bit() and test_bit().
>>>>
>>>> {set|clear}_bit() are already atomic and don't need this extra spinlock.
>>>> test_bit() only reads the bitmap for a given bit.
>>>>
>>>> Remove this useless spinlock.
>>> It looks to me like the calling code is racy
>>> unless there is another lock in the callers.
>> The call chains are:
>> ocfs2_recover_orphans()
>> ocfs2_mark_recovering_orphan_dir()
>> spin_lock(&osb->osb_lock); <-- osb_lock spinlock
>> ocfs2_node_map_set_bit() <-- uses node_map_lock
>> ...
>> spin_unlock(&osb->osb_lock);
>> ...
>> ocfs2_clear_recovering_orphan_dir()
>> ocfs2_node_map_clear_bit() <-- uses node_map_lock
>> osb_lock is NOT taken
>>
>>
>> ocfs2_check_orphan_recovery_state()
>> spin_lock(&osb->osb_lock); <-- osb_lock spinlock
>> ...
>> ocfs2_node_map_test_bit() <-- uses node_map_lock
>> ...
>> spin_unlock(&osb->osb_lock);
>>
>>
>> So the code looks already protected by the 'osb_lock' spinlock, but I don't know this code and ocfs2_mark_recovering_orphan_dir() looks tricky to me. (so some other eyes are much welcome)
>
> osb_lock is to protect osb filed such as 'osb_orphan_wipes', while
> node_map_lock is to protect the node map 'osb_recovering_orphan_dirs'
> specifically.
Thanks for this explanation.
But does "node_map_lock" really protects anything?
It is just around some atomic function calls which shouldn't need any,
right?
test_bit() is not documented as atomic, but {clear|set}_bit() could be
executed just before or just after it with the current locking
mechanism, so I don't really see how it would make a difference.
I don't understand the logic of this lock here.
Can you elaborate?
CJ
> Thanks,
> Joseph
>
On 7/20/22 4:26 PM, Marion & Christophe JAILLET wrote:
>
> Le 20/07/2022 à 03:59, Joseph Qi a écrit :
>>
>> On 7/19/22 9:25 PM, Christophe JAILLET wrote:
>>> Le 19/07/2022 à 12:24, David Laight a écrit :
>>>> From: Christophe JAILLET
>>>>> Sent: 19 July 2022 11:02
>>>>>
>>>>> 'node_map_lock' is a spinlock only used to protect calls to set_bit(),
>>>>> clear_bit() and test_bit().
>>>>>
>>>>> {set|clear}_bit() are already atomic and don't need this extra spinlock.
>>>>> test_bit() only reads the bitmap for a given bit.
>>>>>
>>>>> Remove this useless spinlock.
>>>> It looks to me like the calling code is racy
>>>> unless there is another lock in the callers.
>>> The call chains are:
>>> ocfs2_recover_orphans()
>>> ocfs2_mark_recovering_orphan_dir()
>>> spin_lock(&osb->osb_lock); <-- osb_lock spinlock
>>> ocfs2_node_map_set_bit() <-- uses node_map_lock
>>> ...
>>> spin_unlock(&osb->osb_lock);
>>> ...
>>> ocfs2_clear_recovering_orphan_dir()
>>> ocfs2_node_map_clear_bit() <-- uses node_map_lock
>>> osb_lock is NOT taken
>>>
>>>
>>> ocfs2_check_orphan_recovery_state()
>>> spin_lock(&osb->osb_lock); <-- osb_lock spinlock
>>> ...
>>> ocfs2_node_map_test_bit() <-- uses node_map_lock
>>> ...
>>> spin_unlock(&osb->osb_lock);
>>>
>>>
>>> So the code looks already protected by the 'osb_lock' spinlock, but I don't know this code and ocfs2_mark_recovering_orphan_dir() looks tricky to me. (so some other eyes are much welcome)
>> osb_lock is to protect osb filed such as 'osb_orphan_wipes', while
>> node_map_lock is to protect the node map 'osb_recovering_orphan_dirs'
>> specifically.
>
> Thanks for this explanation.
>
> But does "node_map_lock" really protects anything?
> It is just around some atomic function calls which shouldn't need any, right?
>
> test_bit() is not documented as atomic, but {clear|set}_bit() could be executed just before or just after it with the current locking mechanism, so I don't really see how it would make a difference.
>
> I don't understand the logic of this lock here.
>
> Can you elaborate?
These code are introduced long time ago...
Refer to commit b4df6ed8db0c "[PATCH] ocfs2: fix orphan recovery
deadlock", I guess it plays a role 'barrier' and make sure test node map
is executed prior than signal orphan recovery thread. In other words, to
serialize evict inode and orphan recovery.
Thanks,
Joseph
Le 20/07/2022 à 11:48, Joseph Qi a écrit :
>
> These code are introduced long time ago...
> Refer to commit b4df6ed8db0c "[PATCH] ocfs2: fix orphan recovery
> deadlock", I guess it plays a role 'barrier' and make sure test node map
> is executed prior than signal orphan recovery thread. In other words, to
> serialize evict inode and orphan recovery.
>
> Thanks,
> Joseph
>
Ok, so just leave it as-is.
Should I resend the serie without this patch, or can 1/3 and 3/3 be
applied as-is?
CJ
On 7/20/22 9:32 PM, Christophe JAILLET wrote:
> Le 20/07/2022 à 11:48, Joseph Qi a écrit :
>>
>> These code are introduced long time ago...
>> Refer to commit b4df6ed8db0c "[PATCH] ocfs2: fix orphan recovery
>> deadlock", I guess it plays a role 'barrier' and make sure test node map
>> is executed prior than signal orphan recovery thread. In other words, to
>> serialize evict inode and orphan recovery.
>>
>> Thanks,
>> Joseph
>>
>
> Ok, so just leave it as-is.
>
> Should I resend the serie without this patch, or can 1/3 and 3/3 be applied as-is?
>
If you don't mind, please resend with my rvb and involve akpm as well.
Thanks,
Joseph