2020-07-30 10:09:10

by Zhang, Qiang

Subject: [PATCH v3] mm/slab.c: add node spinlock protect in __cache_free_alien

From: Zhang Qiang <[email protected]>

for example:

                                node0
cpu0                                            cpu1
slab_dead_cpu
  >mutex_lock(&slab_mutex)
  >cpuup_canceled                               slab_dead_cpu
    >mask = cpumask_of_node(node)                 >mutex_lock(&slab_mutex)
    >n = get_node(cachep0, node0)
    >spin_lock_irq(&n->list_lock)
    >if (!cpumask_empty(mask)) == true
    >spin_unlock_irq(&n->list_lock)
    >goto free_slab
    ....
  >mutex_unlock(&slab_mutex)

....                                            >cpuup_canceled
                                                  >mask = cpumask_of_node(node)
kmem_cache_free(cachep0)                          >n = get_node(cachep0, node0)
  >__cache_free_alien(cachep0)                    >spin_lock_irq(&n->list_lock)
    >n = get_node(cachep0, node0)                 >if (!cpumask_empty(mask)) == false
    >if (n->alien && n->alien[page_node])           >alien = n->alien
      >alien = n->alien[page_node]                  >n->alien = NULL
    >....                                         >spin_unlock_irq(&n->list_lock)
                                                  >....

Because multiple CPUs may be taken offline, the same kmem_cache_node of a node
may be operated on in parallel, so access to "n->alien" should be protected by
n->list_lock.

Fixes: 6731d4f12315 ("slab: Convert to hotplug state machine")
Signed-off-by: Zhang Qiang <[email protected]>
---
v1->v2->v3:
Changed the submission information and the Fixes tag.

mm/slab.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/mm/slab.c b/mm/slab.c
index a89633603b2d..290523c90b4e 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -759,8 +759,10 @@ static int __cache_free_alien(struct kmem_cache *cachep, void *objp,

n = get_node(cachep, node);
STATS_INC_NODEFREES(cachep);
+ spin_lock(&n->list_lock);
if (n->alien && n->alien[page_node]) {
alien = n->alien[page_node];
+ spin_unlock(&n->list_lock);
ac = &alien->ac;
spin_lock(&alien->lock);
if (unlikely(ac->avail == ac->limit)) {
@@ -769,14 +771,15 @@ static int __cache_free_alien(struct kmem_cache *cachep, void *objp,
}
ac->entry[ac->avail++] = objp;
spin_unlock(&alien->lock);
- slabs_destroy(cachep, &list);
} else {
+ spin_unlock(&n->list_lock);
n = get_node(cachep, page_node);
spin_lock(&n->list_lock);
free_block(cachep, &objp, 1, page_node, &list);
spin_unlock(&n->list_lock);
- slabs_destroy(cachep, &list);
}
+
+ slabs_destroy(cachep, &list);
return 1;
}

--
2.26.2


2020-07-30 23:49:46

by David Rientjes

Subject: Re: [PATCH v3] mm/slab.c: add node spinlock protect in __cache_free_alien

On Thu, 30 Jul 2020, [email protected] wrote:

> From: Zhang Qiang <[email protected]>
>
> for example:
>
>                                 node0
> cpu0                                            cpu1
> slab_dead_cpu
>   >mutex_lock(&slab_mutex)
>   >cpuup_canceled                               slab_dead_cpu
>     >mask = cpumask_of_node(node)                 >mutex_lock(&slab_mutex)
>     >n = get_node(cachep0, node0)
>     >spin_lock_irq(&n->list_lock)
>     >if (!cpumask_empty(mask)) == true
>     >spin_unlock_irq(&n->list_lock)
>     >goto free_slab
>     ....
>   >mutex_unlock(&slab_mutex)
>
> ....                                            >cpuup_canceled
>                                                   >mask = cpumask_of_node(node)
> kmem_cache_free(cachep0)                          >n = get_node(cachep0, node0)
>   >__cache_free_alien(cachep0)                    >spin_lock_irq(&n->list_lock)
>     >n = get_node(cachep0, node0)                 >if (!cpumask_empty(mask)) == false
>     >if (n->alien && n->alien[page_node])           >alien = n->alien
>       >alien = n->alien[page_node]                  >n->alien = NULL
>     >....                                         >spin_unlock_irq(&n->list_lock)
>                                                   >....
>

As mentioned in the review of v1 of this patch, we likely want to do a fix
for cpuup_canceled() instead.

2020-07-31 01:31:30

by Zhang, Qiang

Subject: Re: [PATCH v3] mm/slab.c: add node spinlock protect in __cache_free_alien



________________________________________
From: David Rientjes <[email protected]>
Sent: July 31, 2020 7:45
To: Zhang, Qiang
Cc: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; [email protected]
Subject: Re: [PATCH v3] mm/slab.c: add node spinlock protect in __cache_free_alien

On Thu, 30 Jul 2020, [email protected] wrote:

> From: Zhang Qiang <[email protected]>
>
> [race diagram snipped]
>

>As mentioned in the review of v1 of this patch, we likely want to do a fix
>for cpuup_canceled() instead.

I see, you mean the fix should be done in the "cpuup_canceled" function?

2020-07-31 08:11:17

by Zhang, Qiang

Subject: Re: [PATCH v3] mm/slab.c: add node spinlock protect in __cache_free_alien



________________________________________
From: Zhang, Qiang <[email protected]>
Sent: July 31, 2020 9:27
To: David Rientjes
Cc: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; [email protected]
Subject: Re: [PATCH v3] mm/slab.c: add node spinlock protect in __cache_free_alien



________________________________________
From: David Rientjes <[email protected]>
Sent: July 31, 2020 7:45
To: Zhang, Qiang
Cc: [email protected]; [email protected]; [email protected]; [email protected]; [email protected]; [email protected]
Subject: Re: [PATCH v3] mm/slab.c: add node spinlock protect in __cache_free_alien

On Thu, 30 Jul 2020, [email protected] wrote:

> From: Zhang Qiang <[email protected]>
>
> [race diagram snipped]
>

>As mentioned in the review of v1 of this patch, we likely want to do a fix
>for cpuup_canceled() instead.

>I see, you mean the fix should be done in the "cpuup_canceled" function?

I'm very sorry; because cpu_down() takes the global "cpu_hotplug_lock" for writing, multiple CPU offline operations are serialized, so the scenario I described above does not exist.