2011-05-09 02:40:42

by Daniel J Blueman

[permalink] [raw]
Subject: [PATCH] ioapic: fix potential resume deadlock

Fix a potential deadlock when resuming; here the calling function
has disabled interrupts, so we cannot sleep.

Signed-off-by: Daniel J Blueman <[email protected]>

---
arch/x86/kernel/apic/io_apic.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 45fd33d..df63620 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -621,14 +621,14 @@ struct IO_APIC_route_entry **alloc_ioapic_entries(void)
struct IO_APIC_route_entry **ioapic_entries;

ioapic_entries = kzalloc(sizeof(*ioapic_entries) * nr_ioapics,
- GFP_KERNEL);
+ GFP_ATOMIC);
if (!ioapic_entries)
return 0;

for (apic = 0; apic < nr_ioapics; apic++) {
ioapic_entries[apic] =
kzalloc(sizeof(struct IO_APIC_route_entry) *
- nr_ioapic_registers[apic], GFP_KERNEL);
+ nr_ioapic_registers[apic], GFP_ATOMIC);
if (!ioapic_entries[apic])
goto nomem;
}
--
1.7.4.1


2011-05-10 05:32:52

by Daniel J Blueman

[permalink] [raw]
Subject: Re: [PATCH] ioapic: fix potential resume deadlock

Hi Thomas, Ingo, HPA,

On 9 May 2011 10:40, Daniel J Blueman <[email protected]> wrote:
> Fix a potential deadlock when resuming; here the calling function
> has disabled interrupts, so we cannot sleep.
>
> Signed-off-by: Daniel J Blueman <[email protected]>
>
> ---
> ?arch/x86/kernel/apic/io_apic.c | ? ?4 ++--
> ?1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index 45fd33d..df63620 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -621,14 +621,14 @@ struct IO_APIC_route_entry **alloc_ioapic_entries(void)
> ? ? ? ?struct IO_APIC_route_entry **ioapic_entries;
>
> ? ? ? ?ioapic_entries = kzalloc(sizeof(*ioapic_entries) * nr_ioapics,
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? GFP_KERNEL);
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? GFP_ATOMIC);
> ? ? ? ?if (!ioapic_entries)
> ? ? ? ? ? ? ? ?return 0;
>
> ? ? ? ?for (apic = 0; apic < nr_ioapics; apic++) {
> ? ? ? ? ? ? ? ?ioapic_entries[apic] =
> ? ? ? ? ? ? ? ? ? ? ? ?kzalloc(sizeof(struct IO_APIC_route_entry) *
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? nr_ioapic_registers[apic], GFP_KERNEL);
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? nr_ioapic_registers[apic], GFP_ATOMIC);
> ? ? ? ? ? ? ? ?if (!ioapic_entries[apic])
> ? ? ? ? ? ? ? ? ? ? ? ?goto nomem;
> ? ? ? ?}

Any feedback on this? I tested it on 2.6.39-rc6 and it does address
the potential deadlock warning I receive when resuming from suspend.
If positive, it would be good to get into -rc8, as rc7 is released.

Thanks,
Daniel
--
Daniel J Blueman

2011-05-10 07:36:14

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] ioapic: fix potential resume deadlock


* Daniel J Blueman <[email protected]> wrote:

> Fix a potential deadlock when resuming; here the calling function
> has disabled interrupts, so we cannot sleep.
>
> Signed-off-by: Daniel J Blueman <[email protected]>
>
> ---
> arch/x86/kernel/apic/io_apic.c | 4 ++--
> 1 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
> index 45fd33d..df63620 100644
> --- a/arch/x86/kernel/apic/io_apic.c
> +++ b/arch/x86/kernel/apic/io_apic.c
> @@ -621,14 +621,14 @@ struct IO_APIC_route_entry **alloc_ioapic_entries(void)
> struct IO_APIC_route_entry **ioapic_entries;
>
> ioapic_entries = kzalloc(sizeof(*ioapic_entries) * nr_ioapics,
> - GFP_KERNEL);
> + GFP_ATOMIC);
> if (!ioapic_entries)
> return 0;
>
> for (apic = 0; apic < nr_ioapics; apic++) {
> ioapic_entries[apic] =
> kzalloc(sizeof(struct IO_APIC_route_entry) *
> - nr_ioapic_registers[apic], GFP_KERNEL);
> + nr_ioapic_registers[apic], GFP_ATOMIC);
> if (!ioapic_entries[apic])
> goto nomem;
> }

Hm, there must be some other bug here.

ioapic entries should be allocated on system bootup and should never really be
deallocated.

The bug might be in the idea to call to enable_IR_x2apic() on resume - why are
ioapic entries reallocated there? That call in default_setup_apic_routing() was
introduced some time ago in:

fa47f7e52874: x86, x2apic: Simplify apic init in SMP and UP builds

and that it results in reallocation in the suspend patch is distinctly wrong.

I suspect the reason why this has not triggered for others is the relative
rarity of affected systems?

Suresh?

Thanks,

Ingo

2011-05-10 10:53:27

by Daniel J Blueman

[permalink] [raw]
Subject: Re: [PATCH] ioapic: fix potential resume deadlock

On 10 May 2011 15:35, Ingo Molnar <[email protected]> wrote:
> * Daniel J Blueman <[email protected]> wrote:
>
>> Fix a potential deadlock when resuming; here the calling function
>> has disabled interrupts, so we cannot sleep.
>>
>> Signed-off-by: Daniel J Blueman <[email protected]>
>>
>> ---
>> ?arch/x86/kernel/apic/io_apic.c | ? ?4 ++--
>> ?1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
>> index 45fd33d..df63620 100644
>> --- a/arch/x86/kernel/apic/io_apic.c
>> +++ b/arch/x86/kernel/apic/io_apic.c
>> @@ -621,14 +621,14 @@ struct IO_APIC_route_entry **alloc_ioapic_entries(void)
>> ? ? ? struct IO_APIC_route_entry **ioapic_entries;
>>
>> ? ? ? ioapic_entries = kzalloc(sizeof(*ioapic_entries) * nr_ioapics,
>> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? GFP_KERNEL);
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? GFP_ATOMIC);
>> ? ? ? if (!ioapic_entries)
>> ? ? ? ? ? ? ? return 0;
>>
>> ? ? ? for (apic = 0; apic < nr_ioapics; apic++) {
>> ? ? ? ? ? ? ? ioapic_entries[apic] =
>> ? ? ? ? ? ? ? ? ? ? ? kzalloc(sizeof(struct IO_APIC_route_entry) *
>> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? nr_ioapic_registers[apic], GFP_KERNEL);
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? nr_ioapic_registers[apic], GFP_ATOMIC);
>> ? ? ? ? ? ? ? if (!ioapic_entries[apic])
>> ? ? ? ? ? ? ? ? ? ? ? goto nomem;
>> ? ? ? }
>
> Hm, there must be some other bug here.
>
> ioapic entries should be allocated on system bootup and should never really be
> deallocated.
>
> The bug might be in the idea to call to enable_IR_x2apic() on resume - why are
> ioapic entries reallocated there? That call in default_setup_apic_routing() was
> introduced some time ago in:
>
> ?fa47f7e52874: x86, x2apic: Simplify apic init in SMP and UP builds
>
> and that it results in reallocation in the suspend patch is distinctly wrong.
>
> I suspect the reason why this has not triggered for others is the relative
> rarity of affected systems?
>
> Suresh?

For reference, this is the warning I get on resume with debugging enabled:

BUG: sleeping function called from invalid context at mm/slub.c:824
in_atomic(): 0, irqs_disabled(): 1, pid: 1842, name: pm-suspend
INFO: lockdep is turned off.
irq event stamp: 0
hardirqs last enabled at (0): [< (null)>] (null)
hardirqs last disabled at (0): [<ffffffff8105eca2>] copy_process+
0x4e2/0x1040
softirqs last enabled at (0): [<ffffffff8105eca2>] copy_process+0x4e2/0x1040
softirqs last disabled at (0): [< (null)>] (null)
Pid: 1842, comm: pm-suspend Tainted: G M 2.6.39-rc6-330cd+ #3
Call Trace:
[<ffffffff81098290>] ? print_irqtrace_events+0xd0/0xe0
[<ffffffff8104a11d>] __might_sleep+0xed/0x120
[<ffffffff8112f7c3>] __kmalloc+0xd3/0x140
[<ffffffff81025566>] alloc_ioapic_entries+0x66/0xb0
[<ffffffff81022bc5>] lapic_resume+0x265/0x350
[<ffffffff814059d5>] syscore_resume+0x35/0xd0
[<ffffffff810a8ace>] suspend_enter+0xde/0x120
[<ffffffff810a8b7a>] suspend_devices_and_enter+0x6a/0xb0
[<ffffffff810a8c91>] enter_state+0xd1/0x100
[<ffffffff810a81c6>] state_store+0xc6/0x100
[<ffffffff81329c37>] kobj_attr_store+0x17/0x20
[<ffffffff811a78d1>] sysfs_write_file+0xe1/0x160
[<ffffffff811400c6>] vfs_write+0xc6/0x180
[<ffffffff811403dc>] sys_write+0x4c/0x90
[<ffffffff8170927b>] system_call_fastpath+0x16/0x1b
--
Daniel J Blueman

2011-05-10 23:57:50

by Suresh Siddha

[permalink] [raw]
Subject: Re: [PATCH] ioapic: fix potential resume deadlock

On Tue, 2011-05-10 at 03:53 -0700, Daniel J Blueman wrote:
> On 10 May 2011 15:35, Ingo Molnar <[email protected]> wrote:
> > Hm, there must be some other bug here.
> >
> > ioapic entries should be allocated on system bootup and should never really be
> > deallocated.
> >
> > The bug might be in the idea to call to enable_IR_x2apic() on resume - why are
> > ioapic entries reallocated there? That call in default_setup_apic_routing() was
> > introduced some time ago in:
> >
> > fa47f7e52874: x86, x2apic: Simplify apic init in SMP and UP builds
> >
> > and that it results in reallocation in the suspend patch is distinctly wrong.
> >
> > I suspect the reason why this has not triggered for others is the relative
> > rarity of affected systems?
> >
> > Suresh?

This will happen only on interrupt-remapping enabled systems and hence
hidden so far.

enable_IR_x2apic() / default_setup_apic_routing() routines don't get
called during resume. Problem is with the lapic_resume() path which is
trying to re-enable x2apic mode and this requires masking io-apic
entries, re-enabling x2apic and interrupt-remapping, restoring io-apic
entries etc. ioapic entry allocation that is happening in lapic_resume()
is just for storing the original io-apic RTE's during the mask/restore
operations that happen around re-enabling x2apic.

Anyways looking at that code, there is some duplicate code between
x2apic code and the io-apic suspend/resume sequence. So a better fix
will be to remove this duplicate code.

Daniel, I appended a quick patch which should fix the problem you
reported. This is not a final patch, as I can do some more cleanup and
re-use the io-apic suspend/resume code a bit more.

If this patch works for you, then I can spend some more time to do the
complete cleanup patch in a day or two. Thanks.

arch/x86/include/asm/io_apic.h | 1 +
arch/x86/kernel/apic/apic.c | 26 ++++----------------------
arch/x86/kernel/apic/io_apic.c | 2 +-
3 files changed, 6 insertions(+), 23 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index a97a240..9e229be 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -177,6 +177,7 @@ extern void __init pre_init_apic_IRQ0(void);
extern void mp_save_irq(struct mpc_intsrc *m);

extern void disable_ioapic_support(void);
+extern struct IO_APIC_route_entry *ioapic_saved_data[MAX_IO_APICS];

#else /* !CONFIG_X86_IO_APIC */

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index fabf01e..9000409 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -2088,28 +2088,14 @@ static void lapic_resume(void)
{
unsigned int l, h;
unsigned long flags;
- int maxlvt, ret;
- struct IO_APIC_route_entry **ioapic_entries = NULL;
+ int maxlvt;

if (!apic_pm_state.active)
return;

local_irq_save(flags);
if (intr_remapping_enabled) {
- ioapic_entries = alloc_ioapic_entries();
- if (!ioapic_entries) {
- WARN(1, "Alloc ioapic_entries in lapic resume failed.");
- goto restore;
- }
-
- ret = save_IO_APIC_setup(ioapic_entries);
- if (ret) {
- WARN(1, "Saving IO-APIC state failed: %d\n", ret);
- free_ioapic_entries(ioapic_entries);
- goto restore;
- }
-
- mask_IO_APIC_setup(ioapic_entries);
+ mask_IO_APIC_setup(ioapic_saved_data);
legacy_pic->mask_all();
}

@@ -2152,13 +2138,9 @@ static void lapic_resume(void)
apic_write(APIC_ESR, 0);
apic_read(APIC_ESR);

- if (intr_remapping_enabled) {
+ if (intr_remapping_enabled)
reenable_intr_remapping(x2apic_mode);
- legacy_pic->restore_mask();
- restore_IO_APIC_setup(ioapic_entries);
- free_ioapic_entries(ioapic_entries);
- }
-restore:
+
local_irq_restore(flags);
}

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 45fd33d..5e6a78e 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2918,7 +2918,7 @@ static int __init io_apic_bug_finalize(void)

late_initcall(io_apic_bug_finalize);

-static struct IO_APIC_route_entry *ioapic_saved_data[MAX_IO_APICS];
+struct IO_APIC_route_entry *ioapic_saved_data[MAX_IO_APICS];

static void suspend_ioapic(int ioapic_id)
{

2011-05-11 16:15:10

by Daniel J Blueman

[permalink] [raw]
Subject: Re: [PATCH] ioapic: fix potential resume deadlock

On 11 May 2011 07:58, Suresh Siddha <[email protected]> wrote:
> On Tue, 2011-05-10 at 03:53 -0700, Daniel J Blueman wrote:
>> On 10 May 2011 15:35, Ingo Molnar <[email protected]> wrote:
>> > Hm, there must be some other bug here.
>> >
>> > ioapic entries should be allocated on system bootup and should never really be
>> > deallocated.
>> >
>> > The bug might be in the idea to call to enable_IR_x2apic() on resume - why are
>> > ioapic entries reallocated there? That call in default_setup_apic_routing() was
>> > introduced some time ago in:
>> >
>> > ?fa47f7e52874: x86, x2apic: Simplify apic init in SMP and UP builds
>> >
>> > and that it results in reallocation in the suspend patch is distinctly wrong.
>> >
>> > I suspect the reason why this has not triggered for others is the relative
>> > rarity of affected systems?
>> >
>> > Suresh?
>
> This will happen only on interrupt-remapping enabled systems and hence
> hidden so far.
>
> enable_IR_x2apic() / default_setup_apic_routing() routines don't get
> called during resume. Problem is with the lapic_resume() path which is
> trying to re-enable x2apic mode and this requires masking io-apic
> entries, re-enabling x2apic and interrupt-remapping, restoring io-apic
> entries etc. ioapic entry allocation that is happening in lapic_resume()
> is just for storing the original io-apic RTE's during the mask/restore
> operations that happen around re-enabling x2apic.
>
> Anyways looking at that code, there is some duplicate code between
> x2apic code and the io-apic suspend/resume sequence. So a better fix
> will be to remove this duplicate code.
>
> Daniel, I appended a quick patch which should fix the problem you
> reported. This is not a final patch, as I can do some more cleanup and
> re-use the io-apic suspend/resume code a bit more.
>
> If this patch works for you, then I can spend some more time to do the
> complete cleanup patch in a day or two. Thanks.
[...]

Superb, this works, tested against 2.6.39-rc7 and addresses the "BUG:
sleeping function called from invalid context at mm/slub.c:824"
warning I was previously seeing. It would be good to get this fix into
2.6.39-final if possible.

Tested-by: Daniel J Blueman <[email protected]>

Thanks,
Daniel
--
Daniel J Blueman

2011-05-13 17:48:30

by Suresh Siddha

[permalink] [raw]
Subject: Re: [PATCH] ioapic: fix potential resume deadlock

On Wed, 2011-05-11 at 09:15 -0700, Daniel J Blueman wrote:
> Superb, this works, tested against 2.6.39-rc7 and addresses the "BUG:
> sleeping function called from invalid context at mm/slub.c:824"
> warning I was previously seeing. It would be good to get this fix into
> 2.6.39-final if possible.
>
> Tested-by: Daniel J Blueman <[email protected]>

Thanks Daniel for testing my quick patch. I have appended the complete
patch which cleans up this code.

Ingo, This patch is relatively big (mostly removes the duplicate code
and changes the location where we allocate ioapic_saved_data, so that
this can be shared between interrupt-remapping and io-apic
suspend/resume flows). May be this can go into 2.6.40-rc1 and probably
go to 2.6.39-stable?

Or we can take the Daniel's GFP_ATOMIC patch for 2.6.39 and push this
patch for 2.6.40-rc1. I am ok either way.

Thanks.
---

From: Suresh Siddha <[email protected]>
Subject: x86, ioapic: Remove the duplicate code for saving/restoring io-apic RTE's

Daniel reported that x2apic and interrupt-remapping resume code flow is
allocating memory with GFP_KERNEL while the interrupts are disabled.

Code flow for enabling interrupt-remapping has its own routines for saving and
restoring io-apic RTE's. ioapic suspend/resume code flow also has similar
routines. Remove the duplicate code. This will consolidate code aswell
as remove the unnecessary allocation/free of the temporary buffers during
suspend/resume of interrupt-remapping enabled platforms (and thus avoid
the GFP_KERNEL allocation in the interrupts disabled context).

Reported-by: Daniel J Blueman <[email protected]>
Signed-off-by: Suresh Siddha <[email protected]>
Cc: [email protected] [v2.6.39]
---

arch/x86/include/asm/io_apic.h | 21 ++----
arch/x86/kernel/apic/apic.c | 49 ++++----------
arch/x86/kernel/apic/io_apic.c | 148 +++++++++++-----------------------------
3 files changed, 61 insertions(+), 157 deletions(-)

diff --git a/arch/x86/include/asm/io_apic.h b/arch/x86/include/asm/io_apic.h
index a97a240..7e52404 100644
--- a/arch/x86/include/asm/io_apic.h
+++ b/arch/x86/include/asm/io_apic.h
@@ -152,11 +152,9 @@ extern void ioapic_insert_resources(void);

int io_apic_setup_irq_pin_once(unsigned int irq, int node, struct io_apic_irq_attr *attr);

-extern struct IO_APIC_route_entry **alloc_ioapic_entries(void);
-extern void free_ioapic_entries(struct IO_APIC_route_entry **ioapic_entries);
-extern int save_IO_APIC_setup(struct IO_APIC_route_entry **ioapic_entries);
-extern void mask_IO_APIC_setup(struct IO_APIC_route_entry **ioapic_entries);
-extern int restore_IO_APIC_setup(struct IO_APIC_route_entry **ioapic_entries);
+extern int save_ioapic_entries(void);
+extern void mask_ioapic_entries(void);
+extern void restore_ioapic_entries(void);

extern int get_nr_irqs_gsi(void);

@@ -192,21 +190,14 @@ struct io_apic_irq_attr;
static inline int io_apic_set_pci_routing(struct device *dev, int irq,
struct io_apic_irq_attr *irq_attr) { return 0; }

-static inline struct IO_APIC_route_entry **alloc_ioapic_entries(void)
-{
- return NULL;
-}
-
-static inline void free_ioapic_entries(struct IO_APIC_route_entry **ent) { }
-static inline int save_IO_APIC_setup(struct IO_APIC_route_entry **ent)
+static inline int save_ioapic_entries(void)
{
return -ENOMEM;
}

-static inline void mask_IO_APIC_setup(struct IO_APIC_route_entry **ent) { }
-static inline int restore_IO_APIC_setup(struct IO_APIC_route_entry **ent)
+static inline void mask_ioapic_entries(void) { }
+static inline void restore_ioapic_entries(void)
{
- return -ENOMEM;
}

static inline void mp_save_irq(struct mpc_intsrc *m) { };
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index fabf01e..835766f 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1450,7 +1450,6 @@ int __init enable_IR(void)
void __init enable_IR_x2apic(void)
{
unsigned long flags;
- struct IO_APIC_route_entry **ioapic_entries;
int ret, x2apic_enabled = 0;
int dmar_table_init_ret;

@@ -1458,13 +1457,7 @@ void __init enable_IR_x2apic(void)
if (dmar_table_init_ret && !x2apic_supported())
return;

- ioapic_entries = alloc_ioapic_entries();
- if (!ioapic_entries) {
- pr_err("Allocate ioapic_entries failed\n");
- goto out;
- }
-
- ret = save_IO_APIC_setup(ioapic_entries);
+ ret = save_ioapic_entries();
if (ret) {
pr_info("Saving IO-APIC state failed: %d\n", ret);
goto out;
@@ -1472,7 +1465,7 @@ void __init enable_IR_x2apic(void)

local_irq_save(flags);
legacy_pic->mask_all();
- mask_IO_APIC_setup(ioapic_entries);
+ mask_ioapic_entries();

if (dmar_table_init_ret)
ret = 0;
@@ -1503,14 +1496,11 @@ void __init enable_IR_x2apic(void)

nox2apic:
if (!ret) /* IR enabling failed */
- restore_IO_APIC_setup(ioapic_entries);
+ restore_ioapic_entries();
legacy_pic->restore_mask();
local_irq_restore(flags);

out:
- if (ioapic_entries)
- free_ioapic_entries(ioapic_entries);
-
if (x2apic_enabled)
return;

@@ -2088,28 +2078,21 @@ static void lapic_resume(void)
{
unsigned int l, h;
unsigned long flags;
- int maxlvt, ret;
- struct IO_APIC_route_entry **ioapic_entries = NULL;
+ int maxlvt;

if (!apic_pm_state.active)
return;

local_irq_save(flags);
- if (intr_remapping_enabled) {
- ioapic_entries = alloc_ioapic_entries();
- if (!ioapic_entries) {
- WARN(1, "Alloc ioapic_entries in lapic resume failed.");
- goto restore;
- }
-
- ret = save_IO_APIC_setup(ioapic_entries);
- if (ret) {
- WARN(1, "Saving IO-APIC state failed: %d\n", ret);
- free_ioapic_entries(ioapic_entries);
- goto restore;
- }

- mask_IO_APIC_setup(ioapic_entries);
+ if (intr_remapping_enabled) {
+ /*
+ * IO-APIC and PIC have their own resume routines.
+ * We just mask them here to make sure the interrupt
+ * subsystem is completely quiet while we enable x2apic
+ * and interrupt-remapping.
+ */
+ mask_ioapic_entries();
legacy_pic->mask_all();
}

@@ -2152,13 +2135,9 @@ static void lapic_resume(void)
apic_write(APIC_ESR, 0);
apic_read(APIC_ESR);

- if (intr_remapping_enabled) {
+ if (intr_remapping_enabled)
reenable_intr_remapping(x2apic_mode);
- legacy_pic->restore_mask();
- restore_IO_APIC_setup(ioapic_entries);
- free_ioapic_entries(ioapic_entries);
- }
-restore:
+
local_irq_restore(flags);
}

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 45fd33d..13fe2f7 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -100,6 +100,12 @@ int mp_irq_entries;
/* GSI interrupts */
static int nr_irqs_gsi = NR_IRQS_LEGACY;

+/*
+ * Saved I/O APIC state during suspend/resume, or while enabling
+ * interrupt-remapping
+ */
+static struct IO_APIC_route_entry *ioapic_saved_data[MAX_IO_APICS];
+
#if defined (CONFIG_MCA) || defined (CONFIG_EISA)
int mp_bus_id_to_type[MAX_MP_BUSSES];
#endif
@@ -179,6 +185,14 @@ int __init arch_early_irq_init(void)
io_apic_irqs = ~0UL;
}

+ for (i = 0; i < nr_ioapics; i++) {
+ ioapic_saved_data[i] =
+ kzalloc(sizeof(struct IO_APIC_route_entry) *
+ nr_ioapic_registers[i], GFP_KERNEL);
+ if (!ioapic_saved_data[i])
+ pr_err("IOAPIC %d: suspend/resume impossible!\n", i);
+ }
+
cfg = irq_cfgx;
count = ARRAY_SIZE(irq_cfgx);
node = cpu_to_node(0);
@@ -615,74 +629,43 @@ static int __init ioapic_pirq_setup(char *str)
__setup("pirq=", ioapic_pirq_setup);
#endif /* CONFIG_X86_32 */

-struct IO_APIC_route_entry **alloc_ioapic_entries(void)
-{
- int apic;
- struct IO_APIC_route_entry **ioapic_entries;
-
- ioapic_entries = kzalloc(sizeof(*ioapic_entries) * nr_ioapics,
- GFP_KERNEL);
- if (!ioapic_entries)
- return 0;
-
- for (apic = 0; apic < nr_ioapics; apic++) {
- ioapic_entries[apic] =
- kzalloc(sizeof(struct IO_APIC_route_entry) *
- nr_ioapic_registers[apic], GFP_KERNEL);
- if (!ioapic_entries[apic])
- goto nomem;
- }
-
- return ioapic_entries;
-
-nomem:
- while (--apic >= 0)
- kfree(ioapic_entries[apic]);
- kfree(ioapic_entries);
-
- return 0;
-}
-
/*
* Saves all the IO-APIC RTE's
*/
-int save_IO_APIC_setup(struct IO_APIC_route_entry **ioapic_entries)
+int save_ioapic_entries(void)
{
int apic, pin;
-
- if (!ioapic_entries)
- return -ENOMEM;
+ int err = 0;

for (apic = 0; apic < nr_ioapics; apic++) {
- if (!ioapic_entries[apic])
- return -ENOMEM;
+ if (!ioapic_saved_data[apic]) {
+ err = -ENOMEM;
+ continue;
+ }

for (pin = 0; pin < nr_ioapic_registers[apic]; pin++)
- ioapic_entries[apic][pin] =
+ ioapic_saved_data[apic][pin] =
ioapic_read_entry(apic, pin);
}

- return 0;
+ return err;
}

/*
* Mask all IO APIC entries.
*/
-void mask_IO_APIC_setup(struct IO_APIC_route_entry **ioapic_entries)
+void mask_ioapic_entries(void)
{
int apic, pin;

- if (!ioapic_entries)
- return;
-
for (apic = 0; apic < nr_ioapics; apic++) {
- if (!ioapic_entries[apic])
- break;
+ if (!ioapic_saved_data[apic])
+ continue;

for (pin = 0; pin < nr_ioapic_registers[apic]; pin++) {
struct IO_APIC_route_entry entry;

- entry = ioapic_entries[apic][pin];
+ entry = ioapic_saved_data[apic][pin];
if (!entry.mask) {
entry.mask = 1;
ioapic_write_entry(apic, pin, entry);
@@ -694,32 +677,18 @@ void mask_IO_APIC_setup(struct IO_APIC_route_entry **ioapic_entries)
/*
* Restore IO APIC entries which was saved in ioapic_entries.
*/
-int restore_IO_APIC_setup(struct IO_APIC_route_entry **ioapic_entries)
+void restore_ioapic_entries(void)
{
int apic, pin;

- if (!ioapic_entries)
- return -ENOMEM;
-
for (apic = 0; apic < nr_ioapics; apic++) {
- if (!ioapic_entries[apic])
- return -ENOMEM;
+ if (!ioapic_saved_data[apic])
+ continue;

for (pin = 0; pin < nr_ioapic_registers[apic]; pin++)
ioapic_write_entry(apic, pin,
- ioapic_entries[apic][pin]);
+ ioapic_saved_data[apic][pin]);
}
- return 0;
-}
-
-void free_ioapic_entries(struct IO_APIC_route_entry **ioapic_entries)
-{
- int apic;
-
- for (apic = 0; apic < nr_ioapics; apic++)
- kfree(ioapic_entries[apic]);
-
- kfree(ioapic_entries);
}

/*
@@ -2918,39 +2887,10 @@ static int __init io_apic_bug_finalize(void)

late_initcall(io_apic_bug_finalize);

-static struct IO_APIC_route_entry *ioapic_saved_data[MAX_IO_APICS];
-
-static void suspend_ioapic(int ioapic_id)
+static void restore_ioapic_id(int ioapic_id)
{
- struct IO_APIC_route_entry *saved_data = ioapic_saved_data[ioapic_id];
- int i;
-
- if (!saved_data)
- return;
-
- for (i = 0; i < nr_ioapic_registers[ioapic_id]; i++)
- saved_data[i] = ioapic_read_entry(ioapic_id, i);
-}
-
-static int ioapic_suspend(void)
-{
- int ioapic_id;
-
- for (ioapic_id = 0; ioapic_id < nr_ioapics; ioapic_id++)
- suspend_ioapic(ioapic_id);
-
- return 0;
-}
-
-static void resume_ioapic(int ioapic_id)
-{
- struct IO_APIC_route_entry *saved_data = ioapic_saved_data[ioapic_id];
unsigned long flags;
union IO_APIC_reg_00 reg_00;
- int i;
-
- if (!saved_data)
- return;

raw_spin_lock_irqsave(&ioapic_lock, flags);
reg_00.raw = io_apic_read(ioapic_id, 0);
@@ -2959,8 +2899,6 @@ static void resume_ioapic(int ioapic_id)
io_apic_write(ioapic_id, 0, reg_00.raw);
}
raw_spin_unlock_irqrestore(&ioapic_lock, flags);
- for (i = 0; i < nr_ioapic_registers[ioapic_id]; i++)
- ioapic_write_entry(ioapic_id, i, saved_data[i]);
}

static void ioapic_resume(void)
@@ -2968,28 +2906,18 @@ static void ioapic_resume(void)
int ioapic_id;

for (ioapic_id = nr_ioapics - 1; ioapic_id >= 0; ioapic_id--)
- resume_ioapic(ioapic_id);
+ restore_ioapic_id(ioapic_id);
+
+ restore_ioapic_entries();
}

static struct syscore_ops ioapic_syscore_ops = {
- .suspend = ioapic_suspend,
+ .suspend = save_ioapic_entries,
.resume = ioapic_resume,
};

static int __init ioapic_init_ops(void)
{
- int i;
-
- for (i = 0; i < nr_ioapics; i++) {
- unsigned int size;
-
- size = nr_ioapic_registers[i]
- * sizeof(struct IO_APIC_route_entry);
- ioapic_saved_data[i] = kzalloc(size, GFP_KERNEL);
- if (!ioapic_saved_data[i])
- pr_err("IOAPIC %d: suspend/resume impossible!\n", i);
- }
-
register_syscore_ops(&ioapic_syscore_ops);

return 0;
@@ -3992,6 +3920,7 @@ static __init int bad_ioapic(unsigned long address)

void __init mp_register_ioapic(int id, u32 address, u32 gsi_base)
{
+ unsigned int size;
int idx = 0;
int entries;

@@ -4030,6 +3959,11 @@ void __init mp_register_ioapic(int id, u32 address, u32 gsi_base)
mp_gsi_routing[idx].gsi_base, mp_gsi_routing[idx].gsi_end);

nr_ioapics++;
+
+ size = nr_ioapic_registers[idx] * sizeof(struct IO_APIC_route_entry);
+ ioapic_saved_data[idx] = kzalloc(size, GFP_KERNEL);
+ if (!ioapic_saved_data[idx])
+ pr_err("IOAPIC %d: suspend/resume impossible!\n", idx);
}

/* Enable IOAPIC early just for system timer */

2011-05-14 14:39:06

by Daniel J Blueman

[permalink] [raw]
Subject: Re: [PATCH] ioapic: fix potential resume deadlock

Hi Suresh,

On 14 May 2011 01:48, Suresh Siddha <[email protected]> wrote:
> On Wed, 2011-05-11 at 09:15 -0700, Daniel J Blueman wrote:
>> Superb, this works, tested against 2.6.39-rc7 and addresses the "BUG:
>> sleeping function called from invalid context at mm/slub.c:824"
>> warning I was previously seeing. It would be good to get this fix into
>> 2.6.39-final if possible.
>>
>> Tested-by: Daniel J Blueman <[email protected]>
>
> Thanks Daniel for testing my quick patch. I have appended the complete
> patch which cleans up this code.
>
> Ingo, This patch is relatively big (mostly removes the duplicate code
> and changes the location where we allocate ioapic_saved_data, so that
> this can be shared between interrupt-remapping and io-apic
> suspend/resume flows). May be this can go into 2.6.40-rc1 and probably
> go to 2.6.39-stable?
>
> Or we can take the Daniel's GFP_ATOMIC patch for 2.6.39 and push this
> patch for 2.6.40-rc1. I am ok either way.
[]

Testing this, all looks well in that the patch resolves the
potentially sleeping allocation, however I do see (on boot) this
suspicious message (though suspend and resume does work):

IOAPIC 0: suspend/resume impossible!

I guess it's not expected...
--
Daniel J Blueman

2011-05-16 11:32:23

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] ioapic: fix potential resume deadlock


* Daniel J Blueman <[email protected]> wrote:

> Hi Suresh,
>
> On 14 May 2011 01:48, Suresh Siddha <[email protected]> wrote:
> > On Wed, 2011-05-11 at 09:15 -0700, Daniel J Blueman wrote:
> >> Superb, this works, tested against 2.6.39-rc7 and addresses the "BUG:
> >> sleeping function called from invalid context at mm/slub.c:824"
> >> warning I was previously seeing. It would be good to get this fix into
> >> 2.6.39-final if possible.
> >>
> >> Tested-by: Daniel J Blueman <[email protected]>
> >
> > Thanks Daniel for testing my quick patch. I have appended the complete
> > patch which cleans up this code.
> >
> > Ingo, This patch is relatively big (mostly removes the duplicate code
> > and changes the location where we allocate ioapic_saved_data, so that
> > this can be shared between interrupt-remapping and io-apic
> > suspend/resume flows). May be this can go into 2.6.40-rc1 and probably
> > go to 2.6.39-stable?
> >
> > Or we can take the Daniel's GFP_ATOMIC patch for 2.6.39 and push this
> > patch for 2.6.40-rc1. I am ok either way.
> []
>
> Testing this, all looks well in that the patch resolves the
> potentially sleeping allocation, however I do see (on boot) this
> suspicious message (though suspend and resume does work):
>
> IOAPIC 0: suspend/resume impossible!
>
> I guess it's not expected...

No. Has this been introduced by Suresh's patch?

Thanks,

Ingo

2011-05-16 11:34:27

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] ioapic: fix potential resume deadlock


* Suresh Siddha <[email protected]> wrote:

> On Wed, 2011-05-11 at 09:15 -0700, Daniel J Blueman wrote:
> > Superb, this works, tested against 2.6.39-rc7 and addresses the "BUG:
> > sleeping function called from invalid context at mm/slub.c:824"
> > warning I was previously seeing. It would be good to get this fix into
> > 2.6.39-final if possible.
> >
> > Tested-by: Daniel J Blueman <[email protected]>
>
> Thanks Daniel for testing my quick patch. I have appended the complete
> patch which cleans up this code.
>
> Ingo, This patch is relatively big (mostly removes the duplicate code
> and changes the location where we allocate ioapic_saved_data, so that
> this can be shared between interrupt-remapping and io-apic
> suspend/resume flows). May be this can go into 2.6.40-rc1 and probably
> go to 2.6.39-stable?

Could you please split it up into multiple steps? Commits with such a diffstat:

> arch/x86/include/asm/io_apic.h | 21 ++----
> arch/x86/kernel/apic/apic.c | 49 ++++----------
> arch/x86/kernel/apic/io_apic.c | 148 +++++++++++-----------------------------
> 3 files changed, 61 insertions(+), 157 deletions(-)

... rarely come without regressions attached, and it's a whole lot easier to
figure out what's wrong with a small patch out of 3-4 than with such a big
patch.

I'd suggest to make the end result exactly the same as this one big patch, so
that we preserve Daniel's testing results.

Thanks,

Ingo

2011-05-16 17:04:12

by Suresh Siddha

[permalink] [raw]
Subject: Re: [PATCH] ioapic: fix potential resume deadlock

On Mon, 2011-05-16 at 04:32 -0700, Ingo Molnar wrote:
> * Daniel J Blueman <[email protected]> wrote:
>
> > Hi Suresh,
> >
> > On 14 May 2011 01:48, Suresh Siddha <[email protected]> wrote:
> > > On Wed, 2011-05-11 at 09:15 -0700, Daniel J Blueman wrote:
> > >> Superb, this works, tested against 2.6.39-rc7 and addresses the "BUG:
> > >> sleeping function called from invalid context at mm/slub.c:824"
> > >> warning I was previously seeing. It would be good to get this fix into
> > >> 2.6.39-final if possible.
> > >>
> > >> Tested-by: Daniel J Blueman <[email protected]>
> > >
> > > Thanks Daniel for testing my quick patch. I have appended the complete
> > > patch which cleans up this code.
> > >
> > > Ingo, This patch is relatively big (mostly removes the duplicate code
> > > and changes the location where we allocate ioapic_saved_data, so that
> > > this can be shared between interrupt-remapping and io-apic
> > > suspend/resume flows). May be this can go into 2.6.40-rc1 and probably
> > > go to 2.6.39-stable?
> > >
> > > Or we can take the Daniel's GFP_ATOMIC patch for 2.6.39 and push this
> > > patch for 2.6.40-rc1. I am ok either way.
> > []
> >
> > Testing this, all looks well in that the patch resolves the
> > potentially sleeping allocation, however I do see (on boot) this
> > suspicious message (though suspend and resume does work):
> >
> > IOAPIC 0: suspend/resume impossible!
> >
> > I guess it's not expected...

oops. I had a spurious initialization, from the earlier attempts to fix
this that was still left out.

> No. Has this been introduced by Suresh's patch?

Ofcourse yes ;(

Will fix this and split the patch into multiple steps.

thanks,
suresh