Hi,
Commit 1c3c5ea "sched/core: Enable might_sleep() and smp_processor_id()
checks early" seems to have uncovered an issue with amd-iommu/x2apic.
Starting with that commit the following warning started to show up on AMD
systems during boot:
[ 0.140480] smpboot: Max logical packages: 6
[ 0.160000] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
[ 0.160000] in_atomic(): 0, irqs_disabled(): 1, pid: 1, name: swapper/0
[ 0.160000] no locks held by swapper/0/1.
[ 0.160000] irq event stamp: 304
[ 0.160000] hardirqs last enabled at (303): [<ffffffff818a87b6>] _raw_spin_unlock_irqrestore+0x36/0x60
[ 0.160000] hardirqs last disabled at (304): [<ffffffff8235d440>] enable_IR_x2apic+0x79/0x196
[ 0.160000] softirqs last enabled at (36): [<ffffffff818ae75f>] __do_softirq+0x35f/0x4ec
[ 0.160000] softirqs last disabled at (31): [<ffffffff810c1955>] irq_exit+0x105/0x120
[ 0.160000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc2.1.el7a.test.x86_64.debug #1
[ 0.160000] Hardware name: PowerEdge C6145 /040N24, BIOS 3.5.0 10/28/2014
[ 0.160000] Call Trace:
[ 0.160000] dump_stack+0x85/0xca
[ 0.160000] ___might_sleep+0x22a/0x260
[ 0.160000] __might_sleep+0x4a/0x80
[ 0.160000] __mutex_lock+0x58/0x960
[ 0.160000] ? iommu_completion_wait.part.17+0xb5/0x160
[ 0.160000] ? register_syscore_ops+0x1d/0x70
[ 0.160000] ? iommu_flush_all_caches+0x120/0x150
[ 0.160000] mutex_lock_nested+0x1b/0x20
[ 0.160000] register_syscore_ops+0x1d/0x70
[ 0.160000] state_next+0x119/0x910
[ 0.160000] iommu_go_to_state+0x29/0x30
[ 0.160000] amd_iommu_enable+0x13/0x23
[ 0.160000] irq_remapping_enable+0x1b/0x39
[ 0.160000] enable_IR_x2apic+0x91/0x196
[ 0.160000] default_setup_apic_routing+0x16/0x6e
[ 0.160000] native_smp_prepare_cpus+0x257/0x2d5
[ 0.160000] kernel_init_freeable+0x131/0x2a7
[ 0.160000] ? kernel_init+0xe/0x104
[ 0.160000] ? _raw_spin_unlock_irq+0x2c/0x40
[ 0.160000] ? rest_init+0xe0/0xe0
[ 0.160000] kernel_init+0xe/0x104
[ 0.160000] ret_from_fork+0x2a/0x40
[ 0.160010] Switched APIC routing to physical flat.
--
Regards,
Artem
On Tue, 25 Jul 2017, Artem Savkov wrote:
> Hi,
>
> Commit 1c3c5ea "sched/core: Enable might_sleep() and smp_processor_id()
> checks early" seems to have uncovered an issue with amd-iommu/x2apic.
>
> Starting with that commit the following warning started to show up on AMD
> systems during boot:
> [ 0.160000] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
> [ 0.160000] mutex_lock_nested+0x1b/0x20
> [ 0.160000] register_syscore_ops+0x1d/0x70
> [ 0.160000] state_next+0x119/0x910
> [ 0.160000] iommu_go_to_state+0x29/0x30
> [ 0.160000] amd_iommu_enable+0x13/0x23
> [ 0.160000] irq_remapping_enable+0x1b/0x39
> [ 0.160000] enable_IR_x2apic+0x91/0x196
> [ 0.160000] default_setup_apic_routing+0x16/0x6e
> [ 0.160000] native_smp_prepare_cpus+0x257/0x2d5
Yep, that's clearly stupid. The completely untested patch below should cure
the issue.
Thanks,
tglx
8<---------------
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -2440,7 +2440,6 @@ static int __init state_next(void)
 		break;
 	case IOMMU_ACPI_FINISHED:
 		early_enable_iommus();
-		register_syscore_ops(&amd_iommu_syscore_ops);
 		x86_platform.iommu_shutdown = disable_iommus;
 		init_state = IOMMU_ENABLED;
 		break;
@@ -2559,6 +2558,8 @@ static int __init amd_iommu_init(void)
 			for_each_iommu(iommu)
 				iommu_flush_all_caches(iommu);
 		}
+	} else {
+		register_syscore_ops(&amd_iommu_syscore_ops);
 	}
 
 	return ret;
Hi Artem, Thomas,
On Wed, Jul 26, 2017 at 12:42:49PM +0200, Thomas Gleixner wrote:
> On Tue, 25 Jul 2017, Artem Savkov wrote:
>
> > Hi,
> >
> > Commit 1c3c5ea "sched/core: Enable might_sleep() and smp_processor_id()
> > checks early" seems to have uncovered an issue with amd-iommu/x2apic.
> >
> > Starting with that commit the following warning started to show up on AMD
> > systems during boot:
>
> > [ 0.160000] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
>
> > [ 0.160000] mutex_lock_nested+0x1b/0x20
> > [ 0.160000] register_syscore_ops+0x1d/0x70
> > [ 0.160000] state_next+0x119/0x910
> > [ 0.160000] iommu_go_to_state+0x29/0x30
> > [ 0.160000] amd_iommu_enable+0x13/0x23
> > [ 0.160000] irq_remapping_enable+0x1b/0x39
> > [ 0.160000] enable_IR_x2apic+0x91/0x196
> > [ 0.160000] default_setup_apic_routing+0x16/0x6e
> > [ 0.160000] native_smp_prepare_cpus+0x257/0x2d5
Thanks for the report!
> --- a/drivers/iommu/amd_iommu_init.c
> +++ b/drivers/iommu/amd_iommu_init.c
> @@ -2440,7 +2440,6 @@ static int __init state_next(void)
> break;
> case IOMMU_ACPI_FINISHED:
> early_enable_iommus();
> - register_syscore_ops(&amd_iommu_syscore_ops);
> x86_platform.iommu_shutdown = disable_iommus;
> init_state = IOMMU_ENABLED;
> break;
> @@ -2559,6 +2558,8 @@ static int __init amd_iommu_init(void)
> for_each_iommu(iommu)
> iommu_flush_all_caches(iommu);
> }
> + } else {
> + register_syscore_ops(&amd_iommu_syscore_ops);
> }
>
> return ret;
Yes, that should fix it, but I think it's better to just move the
register_syscore_ops() call to a later initialization step, like in the
patch below. I tested it and will queue it to my iommu/fixes branch.
From 461242d7211c7777901b6ccdf349cc89235bd5da Mon Sep 17 00:00:00 2001
From: Joerg Roedel <[email protected]>
Date: Wed, 26 Jul 2017 14:17:55 +0200
Subject: [PATCH] iommu/amd: Fix schedule-while-atomic BUG in initialization code
The register_syscore_ops() function takes a mutex and might
sleep. In the IOMMU initialization code it is invoked during
irq-remapping setup already, where irqs are disabled.
This causes a schedule-while-atomic bug:
BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
in_atomic(): 0, irqs_disabled(): 1, pid: 1, name: swapper/0
no locks held by swapper/0/1.
irq event stamp: 304
hardirqs last enabled at (303): [<ffffffff818a87b6>] _raw_spin_unlock_irqrestore+0x36/0x60
hardirqs last disabled at (304): [<ffffffff8235d440>] enable_IR_x2apic+0x79/0x196
softirqs last enabled at (36): [<ffffffff818ae75f>] __do_softirq+0x35f/0x4ec
softirqs last disabled at (31): [<ffffffff810c1955>] irq_exit+0x105/0x120
CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.13.0-rc2.1.el7a.test.x86_64.debug #1
Hardware name: PowerEdge C6145 /040N24, BIOS 3.5.0 10/28/2014
Call Trace:
dump_stack+0x85/0xca
___might_sleep+0x22a/0x260
__might_sleep+0x4a/0x80
__mutex_lock+0x58/0x960
? iommu_completion_wait.part.17+0xb5/0x160
? register_syscore_ops+0x1d/0x70
? iommu_flush_all_caches+0x120/0x150
mutex_lock_nested+0x1b/0x20
register_syscore_ops+0x1d/0x70
state_next+0x119/0x910
iommu_go_to_state+0x29/0x30
amd_iommu_enable+0x13/0x23
Fix it by moving the register_syscore_ops() call to the next
initialization step, which runs with irqs enabled.
Signed-off-by: Joerg Roedel <[email protected]>
---
drivers/iommu/amd_iommu_init.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 5cc597b383c7..372303700566 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -2440,11 +2440,11 @@ static int __init state_next(void)
 		break;
 	case IOMMU_ACPI_FINISHED:
 		early_enable_iommus();
-		register_syscore_ops(&amd_iommu_syscore_ops);
 		x86_platform.iommu_shutdown = disable_iommus;
 		init_state = IOMMU_ENABLED;
 		break;
 	case IOMMU_ENABLED:
+		register_syscore_ops(&amd_iommu_syscore_ops);
 		ret = amd_iommu_init_pci();
 		init_state = ret ? IOMMU_INIT_ERROR : IOMMU_PCI_INIT;
 		enable_iommus_v2();
--
2.13.1
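
For readers following along, here is a rough user-space sketch of why
moving the call by one state silences the warning. It is only a model:
every name in it (model_might_sleep(), model_register_syscore_ops(),
model_state_next(), simulate_boot(), the irqs_off flag and the
three-value enum) is made up for illustration and is not the kernel's
code. It also assumes, as the commit message above says, that the first
step runs with irqs disabled and the next one with irqs enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration; not the kernel's definitions. */
enum init_state { ACPI_FINISHED, ENABLED, PCI_INIT };

static bool irqs_off;       /* models irqs_disabled() */
static int sleep_warnings;  /* how often the modelled check fired */

/* Models might_sleep(): complain when called with interrupts disabled. */
static void model_might_sleep(const char *caller)
{
	if (irqs_off) {
		sleep_warnings++;
		fprintf(stderr,
			"BUG: sleeping function called from invalid context (%s)\n",
			caller);
	}
}

/* Models register_syscore_ops(): it takes a mutex, so it may sleep. */
static void model_register_syscore_ops(void)
{
	model_might_sleep("register_syscore_ops");
}

/* One state-machine step; register_in picks the state that registers. */
static enum init_state model_state_next(enum init_state s,
					enum init_state register_in)
{
	assert(s < PCI_INIT);  /* only two transitions are modelled */
	if (s == register_in)
		model_register_syscore_ops();
	return (enum init_state)(s + 1);
}

/*
 * Runs the first step with interrupts off (as on the irq-remapping
 * path) and the second with interrupts on. Returns the warning count.
 */
static int simulate_boot(enum init_state register_in)
{
	enum init_state s = ACPI_FINISHED;

	sleep_warnings = 0;

	irqs_off = true;
	s = model_state_next(s, register_in);  /* ACPI_FINISHED -> ENABLED */
	irqs_off = false;

	s = model_state_next(s, register_in);  /* ENABLED -> PCI_INIT */
	assert(s == PCI_INIT);

	return sleep_warnings;
}
```

In this model, simulate_boot(ACPI_FINISHED) corresponds to the old
placement and reports one warning, while simulate_boot(ENABLED)
corresponds to the placement in the patch and reports none.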
On Wed, 26 Jul 2017, Joerg Roedel wrote:
> Yes, that should fix it, but I think it's better to just move the
> register_syscore_ops() call to a later initialization step, like in the
> patch below. I tested it and will queue it to my iommu/fixes branch.
Fair enough. Acked-by-me.
On Wed, Jul 26, 2017 at 02:26:14PM +0200, Joerg Roedel wrote:
> Hi Artem, Thomas,
>
> On Wed, Jul 26, 2017 at 12:42:49PM +0200, Thomas Gleixner wrote:
> > On Tue, 25 Jul 2017, Artem Savkov wrote:
> >
> > > Hi,
> > >
> > > Commit 1c3c5ea "sched/core: Enable might_sleep() and smp_processor_id()
> > > checks early" seems to have uncovered an issue with amd-iommu/x2apic.
> > >
> > > Starting with that commit the following warning started to show up on AMD
> > > systems during boot:
> >
> > > [ 0.160000] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:747
> >
> > > [ 0.160000] mutex_lock_nested+0x1b/0x20
> > > [ 0.160000] register_syscore_ops+0x1d/0x70
> > > [ 0.160000] state_next+0x119/0x910
> > > [ 0.160000] iommu_go_to_state+0x29/0x30
> > > [ 0.160000] amd_iommu_enable+0x13/0x23
> > > [ 0.160000] irq_remapping_enable+0x1b/0x39
> > > [ 0.160000] enable_IR_x2apic+0x91/0x196
> > > [ 0.160000] default_setup_apic_routing+0x16/0x6e
> > > [ 0.160000] native_smp_prepare_cpus+0x257/0x2d5
>
> Thanks for the report!
>
> > --- a/drivers/iommu/amd_iommu_init.c
> > +++ b/drivers/iommu/amd_iommu_init.c
> > @@ -2440,7 +2440,6 @@ static int __init state_next(void)
> > break;
> > case IOMMU_ACPI_FINISHED:
> > early_enable_iommus();
> > - register_syscore_ops(&amd_iommu_syscore_ops);
> > x86_platform.iommu_shutdown = disable_iommus;
> > init_state = IOMMU_ENABLED;
> > break;
> > @@ -2559,6 +2558,8 @@ static int __init amd_iommu_init(void)
> > for_each_iommu(iommu)
> > iommu_flush_all_caches(iommu);
> > }
> > + } else {
> > + register_syscore_ops(&amd_iommu_syscore_ops);
> > }
> >
> > return ret;
>
> Yes, that should fix it, but I think it's better to just move the
> register_syscore_ops() call to a later initialization step, like in the
> patch below. I tested it and will queue it to my iommu/fixes branch.
Checked it as well just in case, didn't see any issues. Thank you.
Reported-and-tested-by: Artem Savkov <[email protected]>
--
Regards,
Artem
On Wed, Jul 26, 2017 at 03:25:05PM +0200, Artem Savkov wrote:
> On Wed, Jul 26, 2017 at 02:26:14PM +0200, Joerg Roedel wrote:
> > Yes, that should fix it, but I think it's better to just move the
> > register_syscore_ops() call to a later initialization step, like in the
> > patch below. I tested it and will queue it to my iommu/fixes branch.
>
> Checked it as well just in case, didn't see any issues. Thank you.
>
> Reported-and-tested-by: Artem Savkov <[email protected]>
Thanks for testing it! I added your and Thomas' tags and applied the
patch to my tree. It should go upstream this week.
Joerg