2018-03-06 22:49:34

by Pavel Tatashin

[permalink] [raw]
Subject: [PATCH v2] mm: might_sleep warning

Robot reported this issue:
https://lkml.org/lkml/2018/2/27/851

That is introduced by:
mm: initialize pages on demand during boot

The problem is caused by changing static branch value within spin lock.
Spin lock disables preemption, and changing static branch value takes
mutex lock in its path, and thus may sleep.

The fix is to add another boolean variable to avoid the need to change
static branch within spinlock.

Also, as noticed by Andrew, change spin_lock to spin_lock_irq, in order
to disable interrupts and avoid possible deadlock with
deferred_grow_zone().

Signed-off-by: Pavel Tatashin <[email protected]>
---
mm/page_alloc.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b337a026007c..5df1ca40a2ff 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1579,6 +1579,7 @@ static int __init deferred_init_memmap(void *data)
* page_alloc_init_late() soon after smp_init() is complete.
*/
static __initdata DEFINE_SPINLOCK(deferred_zone_grow_lock);
+static bool deferred_zone_grow __initdata = true;
static DEFINE_STATIC_KEY_TRUE(deferred_pages);

/*
@@ -1616,7 +1617,7 @@ deferred_grow_zone(struct zone *zone, unsigned int order)
* Bail if we raced with another thread that disabled on demand
* initialization.
*/
- if (!static_branch_unlikely(&deferred_pages)) {
+ if (!static_branch_unlikely(&deferred_pages) || !deferred_zone_grow) {
spin_unlock_irqrestore(&deferred_zone_grow_lock, flags);
return false;
}
@@ -1683,10 +1684,15 @@ void __init page_alloc_init_late(void)
/*
* We are about to initialize the rest of deferred pages, permanently
* disable on-demand struct page initialization.
+ *
+ * Note: it is prohibited to modify static branches in non-preemptible
+ * context. Since, spin_lock() disables preemption, we must use an
+ * extra boolean deferred_zone_grow.
*/
- spin_lock(&deferred_zone_grow_lock);
+ spin_lock_irq(&deferred_zone_grow_lock);
+ deferred_zone_grow = false;
+ spin_unlock_irq(&deferred_zone_grow_lock);
static_branch_disable(&deferred_pages);
- spin_unlock(&deferred_zone_grow_lock);

/* There will be num_node_state(N_MEMORY) threads */
atomic_set(&pgdat_init_n_undone, num_node_state(N_MEMORY));
--
2.16.2



2018-03-07 09:09:07

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH v2] mm: might_sleep warning

On 03/06/2018 11:40 PM, Pavel Tatashin wrote:
> Robot reported this issue:
> https://lkml.org/lkml/2018/2/27/851
>
> That is introduced by:
> mm: initialize pages on demand during boot
>
> The problem is caused by changing static branch value within spin lock.
> Spin lock disables preemption, and changing static branch value takes
> mutex lock in its path, and thus may sleep.
>
> The fix is to add another boolean variable to avoid the need to change
> static branch within spinlock.
>
> Also, as noticed by Andrew, change spin_lock to spin_lock_irq, in order
> to disable interrupts and avoid possible deadlock with
> deferred_grow_zone().
>
> Signed-off-by: Pavel Tatashin <[email protected]>
> ---
> mm/page_alloc.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index b337a026007c..5df1ca40a2ff 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1579,6 +1579,7 @@ static int __init deferred_init_memmap(void *data)
> * page_alloc_init_late() soon after smp_init() is complete.
> */
> static __initdata DEFINE_SPINLOCK(deferred_zone_grow_lock);
> +static bool deferred_zone_grow __initdata = true;
> static DEFINE_STATIC_KEY_TRUE(deferred_pages);
>
> /*
> @@ -1616,7 +1617,7 @@ deferred_grow_zone(struct zone *zone, unsigned int order)
> * Bail if we raced with another thread that disabled on demand
> * initialization.
> */
> - if (!static_branch_unlikely(&deferred_pages)) {
> + if (!static_branch_unlikely(&deferred_pages) || !deferred_zone_grow) {
> spin_unlock_irqrestore(&deferred_zone_grow_lock, flags);
> return false;
> }
> @@ -1683,10 +1684,15 @@ void __init page_alloc_init_late(void)
> /*
> * We are about to initialize the rest of deferred pages, permanently
> * disable on-demand struct page initialization.

Hi,

I've noticed that this function first disables the on-demand
initialization, and then runs the kthreads. Doesn't that leave a window
where allocations can fail? The chances are probably small, but I think
it would be better to avoid it completely, rare failures suck.

Fixing that probably means rethinking the whole synchronization more
dramatically though :/

Vlastimil

> + *
> + * Note: it is prohibited to modify static branches in non-preemptible
> + * context. Since, spin_lock() disables preemption, we must use an
> + * extra boolean deferred_zone_grow.
> */
> - spin_lock(&deferred_zone_grow_lock);
> + spin_lock_irq(&deferred_zone_grow_lock);
> + deferred_zone_grow = false;
> + spin_unlock_irq(&deferred_zone_grow_lock);
> static_branch_disable(&deferred_pages);
> - spin_unlock(&deferred_zone_grow_lock);
>
> /* There will be num_node_state(N_MEMORY) threads */
> atomic_set(&pgdat_init_n_undone, num_node_state(N_MEMORY));
>


2018-03-07 18:04:42

by Pavel Tatashin

[permalink] [raw]
Subject: Re: [PATCH v2] mm: might_sleep warning

> Hi,
>
> I've noticed that this function first disables the on-demand
> initialization, and then runs the kthreads. Doesn't that leave a window
> where allocations can fail? The chances are probably small, but I think
> it would be better to avoid it completely, rare failures suck.
>
> Fixing that probably means rethinking the whole synchronization more
> dramatically though :/
>
> Vlastimil

Hi Vlastimil,

You are right, there is a window, it is short, and probably not
possible to reproduce, as it happens before user threads are started,
and after init calls done by smp_init() are finished. The only way it
can happen, as far as I can see, is if some device fires an interrupt,
and interrupt handler decides to allocate a large chunk of memory. The
small allocations will succeed, as zone grow function growth more than
strictly requested, and also there are zones without deferred pages.

I will, however, think some more how to solve this problem to be future proof.

Thank you,
Pavel