2019-05-21 07:39:43

by Lu Baolu

Subject: [PATCH 1/2] iommu/vt-d: Fix lock inversion between iommu->lock and device_domain_lock

From: Dave Jiang <[email protected]>

Lockdep debug reported a lock inversion in the iommu code
caused by dmar_insert_one_dev_info() grabbing the iommu->lock and
the device_domain_lock out of order versus the code path in
iommu_flush_dev_iotlb(). Expanding the scope of the iommu->lock and
reversing the order of lock acquisition fixes the issue.

[ 76.238180] dsa_bus wq0.0: dsa wq wq0.0 disabled
[ 76.248706]
[ 76.250486] ========================================================
[ 76.257113] WARNING: possible irq lock inversion dependency detected
[ 76.263736] 5.1.0-rc5+ #162 Not tainted
[ 76.267854] --------------------------------------------------------
[ 76.274485] systemd-journal/521 just changed the state of lock:
[ 76.280685] 0000000055b330f5 (device_domain_lock){..-.}, at: iommu_flush_dev_iotlb.part.63+0x29/0x90
[ 76.290099] but this lock took another, SOFTIRQ-unsafe lock in the past:
[ 76.297093] (&(&iommu->lock)->rlock){+.+.}
[ 76.297094]
[ 76.297094]
[ 76.297094] and interrupts could create inverse lock ordering between them.
[ 76.297094]
[ 76.314257]
[ 76.314257] other info that might help us debug this:
[ 76.321448] Possible interrupt unsafe locking scenario:
[ 76.321448]
[ 76.328907] CPU0 CPU1
[ 76.333777] ---- ----
[ 76.338642] lock(&(&iommu->lock)->rlock);
[ 76.343165] local_irq_disable();
[ 76.349422] lock(device_domain_lock);
[ 76.356116] lock(&(&iommu->lock)->rlock);
[ 76.363154] <Interrupt>
[ 76.366134] lock(device_domain_lock);
[ 76.370548]
[ 76.370548] *** DEADLOCK ***

Fixes: 745f2586e78e ("iommu/vt-d: Simplify function get_domain_for_dev()")
Signed-off-by: Dave Jiang <[email protected]>
Reviewed-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel-iommu.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index a209199f3af6..91f4912c09c6 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -2512,6 +2512,7 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
}
}

+ spin_lock(&iommu->lock);
spin_lock_irqsave(&device_domain_lock, flags);
if (dev)
found = find_domain(dev);
@@ -2527,17 +2528,16 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,

if (found) {
spin_unlock_irqrestore(&device_domain_lock, flags);
+ spin_unlock(&iommu->lock);
free_devinfo_mem(info);
/* Caller must free the original domain */
return found;
}

- spin_lock(&iommu->lock);
ret = domain_attach_iommu(domain, iommu);
- spin_unlock(&iommu->lock);
-
if (ret) {
spin_unlock_irqrestore(&device_domain_lock, flags);
+ spin_unlock(&iommu->lock);
free_devinfo_mem(info);
return NULL;
}
@@ -2547,6 +2547,7 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
if (dev)
dev->archdata.iommu = info;
spin_unlock_irqrestore(&device_domain_lock, flags);
+ spin_unlock(&iommu->lock);

/* PASID table is mandatory for a PCI device in scalable mode. */
if (dev && dev_is_pci(dev) && sm_supported(iommu)) {
--
2.17.1



2019-05-27 14:34:49

by Joerg Roedel

Subject: Re: [PATCH 1/2] iommu/vt-d: Fix lock inversion between iommu->lock and device_domain_lock

On Tue, May 21, 2019 at 03:30:15PM +0800, Lu Baolu wrote:
> ---
> drivers/iommu/intel-iommu.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)

Applied both, thanks.

2019-06-18 21:02:52

by Chris Wilson

Subject: Re: [PATCH 1/2] iommu/vt-d: Fix lock inversion between iommu->lock and device_domain_lock

Quoting Lu Baolu (2019-05-21 08:30:15)
> From: Dave Jiang <[email protected]>
>
> Lockdep debug reported a lock inversion in the iommu code
> caused by dmar_insert_one_dev_info() grabbing the iommu->lock and
> the device_domain_lock out of order versus the code path in
> iommu_flush_dev_iotlb(). Expanding the scope of the iommu->lock and
> reversing the order of lock acquisition fixes the issue.

Which of course violates the property that device_domain_lock is the
outer lock...

<4>[ 1.252997] ======================================================
<4>[ 1.252999] WARNING: possible circular locking dependency detected
<4>[ 1.253002] 5.2.0-rc5-CI-CI_DRM_6299+ #1 Not tainted
<4>[ 1.253004] ------------------------------------------------------
<4>[ 1.253006] swapper/0/1 is trying to acquire lock:
<4>[ 1.253009] 0000000091462475 (&(&iommu->lock)->rlock){+.+.}, at: domain_context_mapping_one+0xa0/0x4f0
<4>[ 1.253015]
but task is already holding lock:
<4>[ 1.253017] 0000000069266737 (device_domain_lock){....}, at: domain_context_mapping_one+0x88/0x4f0
<4>[ 1.253021]
which lock already depends on the new lock.

<4>[ 1.253024]
the existing dependency chain (in reverse order) is:
<4>[ 1.253027]
-> #1 (device_domain_lock){....}:
<4>[ 1.253031] _raw_spin_lock_irqsave+0x33/0x50
<4>[ 1.253034] dmar_insert_one_dev_info+0xb8/0x520
<4>[ 1.253036] set_domain_for_dev+0x66/0xf0
<4>[ 1.253039] iommu_prepare_identity_map+0x48/0x95
<4>[ 1.253042] intel_iommu_init+0xfd8/0x138d
<4>[ 1.253045] pci_iommu_init+0x11/0x3a
<4>[ 1.253048] do_one_initcall+0x58/0x300
<4>[ 1.253051] kernel_init_freeable+0x2c0/0x359
<4>[ 1.253054] kernel_init+0x5/0x100
<4>[ 1.253056] ret_from_fork+0x3a/0x50
<4>[ 1.253058]
-> #0 (&(&iommu->lock)->rlock){+.+.}:
<4>[ 1.253062] lock_acquire+0xa6/0x1c0
<4>[ 1.253064] _raw_spin_lock+0x2a/0x40
<4>[ 1.253067] domain_context_mapping_one+0xa0/0x4f0
<4>[ 1.253070] pci_for_each_dma_alias+0x2b/0x160
<4>[ 1.253072] dmar_insert_one_dev_info+0x44e/0x520
<4>[ 1.253075] set_domain_for_dev+0x66/0xf0
<4>[ 1.253077] iommu_prepare_identity_map+0x48/0x95
<4>[ 1.253080] intel_iommu_init+0xfd8/0x138d
<4>[ 1.253082] pci_iommu_init+0x11/0x3a
<4>[ 1.253084] do_one_initcall+0x58/0x300
<4>[ 1.253086] kernel_init_freeable+0x2c0/0x359
<4>[ 1.253089] kernel_init+0x5/0x100
<4>[ 1.253091] ret_from_fork+0x3a/0x50
<4>[ 1.253093]
other info that might help us debug this:

<4>[ 1.253095] Possible unsafe locking scenario:

<4>[ 1.253095] CPU0 CPU1
<4>[ 1.253095] ---- ----
<4>[ 1.253095] lock(device_domain_lock);
<4>[ 1.253095] lock(&(&iommu->lock)->rlock);
<4>[ 1.253095] lock(device_domain_lock);
<4>[ 1.253095] lock(&(&iommu->lock)->rlock);
<4>[ 1.253095]
*** DEADLOCK ***

<4>[ 1.253095] 2 locks held by swapper/0/1:
<4>[ 1.253095] #0: 0000000076465a1e (dmar_global_lock){++++}, at: intel_iommu_init+0x1d3/0x138d
<4>[ 1.253095] #1: 0000000069266737 (device_domain_lock){....}, at: domain_context_mapping_one+0x88/0x4f0
<4>[ 1.253095]
stack backtrace:
<4>[ 1.253095] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.2.0-rc5-CI-CI_DRM_6299+ #1
<4>[ 1.253095] Hardware name: /NUC5i7RYB, BIOS RYBDWi35.86A.0362.2017.0118.0940 01/18/2017
<4>[ 1.253095] Call Trace:
<4>[ 1.253095] dump_stack+0x67/0x9b
<4>[ 1.253095] print_circular_bug+0x1c8/0x2b0
<4>[ 1.253095] __lock_acquire+0x1ce9/0x24c0
<4>[ 1.253095] ? lock_acquire+0xa6/0x1c0
<4>[ 1.253095] lock_acquire+0xa6/0x1c0
<4>[ 1.253095] ? domain_context_mapping_one+0xa0/0x4f0
<4>[ 1.253095] _raw_spin_lock+0x2a/0x40
<4>[ 1.253095] ? domain_context_mapping_one+0xa0/0x4f0
<4>[ 1.253095] domain_context_mapping_one+0xa0/0x4f0
<4>[ 1.253095] ? domain_context_mapping_one+0x4f0/0x4f0
<4>[ 1.253095] pci_for_each_dma_alias+0x2b/0x160
<4>[ 1.253095] dmar_insert_one_dev_info+0x44e/0x520
<4>[ 1.253095] set_domain_for_dev+0x66/0xf0
<4>[ 1.253095] iommu_prepare_identity_map+0x48/0x95
<4>[ 1.253095] intel_iommu_init+0xfd8/0x138d
<4>[ 1.253095] ? set_debug_rodata+0xc/0xc
<4>[ 1.253095] ? set_debug_rodata+0xc/0xc
<4>[ 1.253095] ? e820__memblock_setup+0x5b/0x5b
<4>[ 1.253095] ? pci_iommu_init+0x11/0x3a
<4>[ 1.253095] ? set_debug_rodata+0xc/0xc
<4>[ 1.253095] pci_iommu_init+0x11/0x3a
<4>[ 1.253095] do_one_initcall+0x58/0x300
<4>[ 1.253095] kernel_init_freeable+0x2c0/0x359
<4>[ 1.253095] ? rest_init+0x250/0x250
<4>[ 1.253095] kernel_init+0x5/0x100
<4>[ 1.253095] ret_from_fork+0x3a/0x50

2019-06-19 01:47:05

by Lu Baolu

Subject: Re: [PATCH 1/2] iommu/vt-d: Fix lock inversion between iommu->lock and device_domain_lock

Hi Chris,

On 6/19/19 5:02 AM, Chris Wilson wrote:
> Quoting Lu Baolu (2019-05-21 08:30:15)
>> From: Dave Jiang <[email protected]>
>>
>> Lockdep debug reported a lock inversion in the iommu code
>> caused by dmar_insert_one_dev_info() grabbing the iommu->lock and
>> the device_domain_lock out of order versus the code path in
>> iommu_flush_dev_iotlb(). Expanding the scope of the iommu->lock and
>> reversing the order of lock acquisition fixes the issue.
>
> Which of course violates the property that device_domain_lock is the
> outer lock...

Agreed.

I also realized that this might be an incorrect fix. I am looking into
it and will submit a new fix later.

Best regards,
Lu Baolu
