2023-03-14 05:19:48

by Lu Baolu

[permalink] [raw]
Subject: [PATCH] iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()

The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac932
("iommu/vt-d: Introduce a rwsem to protect global data structures"). It
is used to protect DMAR related global data from DMAR hotplug operations.

Using dmar_global_lock in intel_irq_remapping_alloc() is unnecessary as
the DMAR global data structures are not touched there. Remove it to avoid
below lockdep warning.

======================================================
WARNING: possible circular locking dependency detected
6.3.0-rc2 #468 Not tainted
------------------------------------------------------
swapper/0/1 is trying to acquire lock:
ff1db4cb40178698 (&domain->mutex){+.+.}-{3:3},
at: __irq_domain_alloc_irqs+0x3b/0xa0

but task is already holding lock:
ffffffffa0c1cdf0 (dmar_global_lock){++++}-{3:3},
at: intel_iommu_init+0x58e/0x880

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (dmar_global_lock){++++}-{3:3}:
lock_acquire+0xd6/0x320
down_read+0x42/0x180
intel_irq_remapping_alloc+0xad/0x750
mp_irqdomain_alloc+0xb8/0x2b0
irq_domain_alloc_irqs_locked+0x12f/0x2d0
__irq_domain_alloc_irqs+0x56/0xa0
alloc_isa_irq_from_domain.isra.7+0xa0/0xe0
mp_map_pin_to_irq+0x1dc/0x330
setup_IO_APIC+0x128/0x210
apic_intr_mode_init+0x67/0x110
x86_late_time_init+0x24/0x40
start_kernel+0x41e/0x7e0
secondary_startup_64_no_verify+0xe0/0xeb

-> #0 (&domain->mutex){+.+.}-{3:3}:
check_prevs_add+0x160/0xef0
__lock_acquire+0x147d/0x1950
lock_acquire+0xd6/0x320
__mutex_lock+0x9c/0xfc0
__irq_domain_alloc_irqs+0x3b/0xa0
dmar_alloc_hwirq+0x9e/0x120
iommu_pmu_register+0x11d/0x200
intel_iommu_init+0x5de/0x880
pci_iommu_init+0x12/0x40
do_one_initcall+0x65/0x350
kernel_init_freeable+0x3ca/0x610
kernel_init+0x1a/0x140
ret_from_fork+0x29/0x50

other info that might help us debug this:

Possible unsafe locking scenario:

CPU0 CPU1
---- ----
lock(dmar_global_lock);
lock(&domain->mutex);
lock(dmar_global_lock);
lock(&domain->mutex);

*** DEADLOCK ***

Fixes: 9dbb8e3452ab ("irqdomain: Switch to per-domain locking")
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/irq_remapping.c | 6 ------
1 file changed, 6 deletions(-)

diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c
index 6d01fa078c36..df9e261af0b5 100644
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -311,14 +311,12 @@ static int set_ioapic_sid(struct irte *irte, int apic)
if (!irte)
return -1;

- down_read(&dmar_global_lock);
for (i = 0; i < MAX_IO_APICS; i++) {
if (ir_ioapic[i].iommu && ir_ioapic[i].id == apic) {
sid = (ir_ioapic[i].bus << 8) | ir_ioapic[i].devfn;
break;
}
}
- up_read(&dmar_global_lock);

if (sid == 0) {
pr_warn("Failed to set source-id of IOAPIC (%d)\n", apic);
@@ -338,14 +336,12 @@ static int set_hpet_sid(struct irte *irte, u8 id)
if (!irte)
return -1;

- down_read(&dmar_global_lock);
for (i = 0; i < MAX_HPET_TBS; i++) {
if (ir_hpet[i].iommu && ir_hpet[i].id == id) {
sid = (ir_hpet[i].bus << 8) | ir_hpet[i].devfn;
break;
}
}
- up_read(&dmar_global_lock);

if (sid == 0) {
pr_warn("Failed to set source-id of HPET block (%d)\n", id);
@@ -1339,9 +1335,7 @@ static int intel_irq_remapping_alloc(struct irq_domain *domain,
if (!data)
goto out_free_parent;

- down_read(&dmar_global_lock);
index = alloc_irte(iommu, &data->irq_2_iommu, nr_irqs);
- up_read(&dmar_global_lock);
if (index < 0) {
pr_warn("Failed to allocate IRTE\n");
kfree(data);
--
2.34.1



2023-03-14 15:51:19

by Jacob Pan

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()

Hi BaoLu,

On Tue, 14 Mar 2023 13:18:36 +0800, Lu Baolu <[email protected]>
wrote:

> The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac932
> ("iommu/vt-d: Introduce a rwsem to protect global data structures"). It
> is used to protect DMAR related global data from DMAR hotplug operations.
>
> Using dmar_global_lock in intel_irq_remapping_alloc() is unnecessary as
> the DMAR global data structures are not touched there. Remove it to avoid
> below lockdep warning.
>
> ======================================================
> WARNING: possible circular locking dependency detected
> 6.3.0-rc2 #468 Not tainted
> ------------------------------------------------------
> swapper/0/1 is trying to acquire lock:
> ff1db4cb40178698 (&domain->mutex){+.+.}-{3:3},
> at: __irq_domain_alloc_irqs+0x3b/0xa0
>
> but task is already holding lock:
> ffffffffa0c1cdf0 (dmar_global_lock){++++}-{3:3},
> at: intel_iommu_init+0x58e/0x880
>
> which lock already depends on the new lock.
>
> the existing dependency chain (in reverse order) is:
>
> -> #1 (dmar_global_lock){++++}-{3:3}:
> lock_acquire+0xd6/0x320
> down_read+0x42/0x180
> intel_irq_remapping_alloc+0xad/0x750
> mp_irqdomain_alloc+0xb8/0x2b0
> irq_domain_alloc_irqs_locked+0x12f/0x2d0
> __irq_domain_alloc_irqs+0x56/0xa0
> alloc_isa_irq_from_domain.isra.7+0xa0/0xe0
> mp_map_pin_to_irq+0x1dc/0x330
> setup_IO_APIC+0x128/0x210
> apic_intr_mode_init+0x67/0x110
> x86_late_time_init+0x24/0x40
> start_kernel+0x41e/0x7e0
> secondary_startup_64_no_verify+0xe0/0xeb
>
> -> #0 (&domain->mutex){+.+.}-{3:3}:
> check_prevs_add+0x160/0xef0
> __lock_acquire+0x147d/0x1950
> lock_acquire+0xd6/0x320
> __mutex_lock+0x9c/0xfc0
> __irq_domain_alloc_irqs+0x3b/0xa0
> dmar_alloc_hwirq+0x9e/0x120
> iommu_pmu_register+0x11d/0x200
> intel_iommu_init+0x5de/0x880
> pci_iommu_init+0x12/0x40
> do_one_initcall+0x65/0x350
> kernel_init_freeable+0x3ca/0x610
> kernel_init+0x1a/0x140
> ret_from_fork+0x29/0x50
>
> other info that might help us debug this:
>
> Possible unsafe locking scenario:
>
> CPU0 CPU1
> ---- ----
> lock(dmar_global_lock);
> lock(&domain->mutex);
> lock(dmar_global_lock);
> lock(&domain->mutex);
>
> *** DEADLOCK ***
>
> Fixes: 9dbb8e3452ab ("irqdomain: Switch to per-domain locking")
> Signed-off-by: Lu Baolu <[email protected]>
> ---
> drivers/iommu/intel/irq_remapping.c | 6 ------
> 1 file changed, 6 deletions(-)
>
> diff --git a/drivers/iommu/intel/irq_remapping.c
> b/drivers/iommu/intel/irq_remapping.c index 6d01fa078c36..df9e261af0b5
> 100644 --- a/drivers/iommu/intel/irq_remapping.c
> +++ b/drivers/iommu/intel/irq_remapping.c
> @@ -311,14 +311,12 @@ static int set_ioapic_sid(struct irte *irte, int
> apic) if (!irte)
> return -1;
>
> - down_read(&dmar_global_lock);
> for (i = 0; i < MAX_IO_APICS; i++) {
> if (ir_ioapic[i].iommu && ir_ioapic[i].id == apic) {
> sid = (ir_ioapic[i].bus << 8) |
> ir_ioapic[i].devfn; break;
> }
> }
> - up_read(&dmar_global_lock);
>
> if (sid == 0) {
> pr_warn("Failed to set source-id of IOAPIC (%d)\n",
> apic); @@ -338,14 +336,12 @@ static int set_hpet_sid(struct irte *irte,
> u8 id) if (!irte)
> return -1;
>
> - down_read(&dmar_global_lock);
> for (i = 0; i < MAX_HPET_TBS; i++) {
> if (ir_hpet[i].iommu && ir_hpet[i].id == id) {
> sid = (ir_hpet[i].bus << 8) | ir_hpet[i].devfn;
> break;
> }
> }
> - up_read(&dmar_global_lock);
>
> if (sid == 0) {
> pr_warn("Failed to set source-id of HPET block (%d)\n",
> id); @@ -1339,9 +1335,7 @@ static int intel_irq_remapping_alloc(struct
> irq_domain *domain, if (!data)
> goto out_free_parent;
>
> - down_read(&dmar_global_lock);
> index = alloc_irte(iommu, &data->irq_2_iommu, nr_irqs);
> - up_read(&dmar_global_lock);
> if (index < 0) {
> pr_warn("Failed to allocate IRTE\n");
> kfree(data);
Reviewed-by: Jacob Pan <[email protected]>

slightly beyond the scope of this, do we need to take dmar_global_lock
below? shouldn't it be in single threaded context?

down_write(&dmar_global_lock);
ret = dmar_dev_scope_init();
up_write(&dmar_global_lock);

return ret;
}
rootfs_initcall(ir_dev_scope_init);

Thanks,

Jacob

2023-03-15 01:15:59

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()

On 3/14/23 11:54 PM, Jacob Pan wrote:
> Hi BaoLu,
>
> On Tue, 14 Mar 2023 13:18:36 +0800, Lu Baolu <[email protected]>
> wrote:
>
>> The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac932
>> ("iommu/vt-d: Introduce a rwsem to protect global data structures"). It
>> is used to protect DMAR related global data from DMAR hotplug operations.
>>
>> Using dmar_global_lock in intel_irq_remapping_alloc() is unnecessary as
>> the DMAR global data structures are not touched there. Remove it to avoid
>> below lockdep warning.
>>
>> ======================================================
>> WARNING: possible circular locking dependency detected
>> 6.3.0-rc2 #468 Not tainted
>> ------------------------------------------------------
>> swapper/0/1 is trying to acquire lock:
>> ff1db4cb40178698 (&domain->mutex){+.+.}-{3:3},
>> at: __irq_domain_alloc_irqs+0x3b/0xa0
>>
>> but task is already holding lock:
>> ffffffffa0c1cdf0 (dmar_global_lock){++++}-{3:3},
>> at: intel_iommu_init+0x58e/0x880
>>
>> which lock already depends on the new lock.
>>
>> the existing dependency chain (in reverse order) is:
>>
>> -> #1 (dmar_global_lock){++++}-{3:3}:
>> lock_acquire+0xd6/0x320
>> down_read+0x42/0x180
>> intel_irq_remapping_alloc+0xad/0x750
>> mp_irqdomain_alloc+0xb8/0x2b0
>> irq_domain_alloc_irqs_locked+0x12f/0x2d0
>> __irq_domain_alloc_irqs+0x56/0xa0
>> alloc_isa_irq_from_domain.isra.7+0xa0/0xe0
>> mp_map_pin_to_irq+0x1dc/0x330
>> setup_IO_APIC+0x128/0x210
>> apic_intr_mode_init+0x67/0x110
>> x86_late_time_init+0x24/0x40
>> start_kernel+0x41e/0x7e0
>> secondary_startup_64_no_verify+0xe0/0xeb
>>
>> -> #0 (&domain->mutex){+.+.}-{3:3}:
>> check_prevs_add+0x160/0xef0
>> __lock_acquire+0x147d/0x1950
>> lock_acquire+0xd6/0x320
>> __mutex_lock+0x9c/0xfc0
>> __irq_domain_alloc_irqs+0x3b/0xa0
>> dmar_alloc_hwirq+0x9e/0x120
>> iommu_pmu_register+0x11d/0x200
>> intel_iommu_init+0x5de/0x880
>> pci_iommu_init+0x12/0x40
>> do_one_initcall+0x65/0x350
>> kernel_init_freeable+0x3ca/0x610
>> kernel_init+0x1a/0x140
>> ret_from_fork+0x29/0x50
>>
>> other info that might help us debug this:
>>
>> Possible unsafe locking scenario:
>>
>> CPU0 CPU1
>> ---- ----
>> lock(dmar_global_lock);
>> lock(&domain->mutex);
>> lock(dmar_global_lock);
>> lock(&domain->mutex);
>>
>> *** DEADLOCK ***
>>
>> Fixes: 9dbb8e3452ab ("irqdomain: Switch to per-domain locking")
>> Signed-off-by: Lu Baolu <[email protected]>
>> ---
>> drivers/iommu/intel/irq_remapping.c | 6 ------
>> 1 file changed, 6 deletions(-)
>>
>> diff --git a/drivers/iommu/intel/irq_remapping.c
>> b/drivers/iommu/intel/irq_remapping.c index 6d01fa078c36..df9e261af0b5
>> 100644 --- a/drivers/iommu/intel/irq_remapping.c
>> +++ b/drivers/iommu/intel/irq_remapping.c
>> @@ -311,14 +311,12 @@ static int set_ioapic_sid(struct irte *irte, int
>> apic) if (!irte)
>> return -1;
>>
>> - down_read(&dmar_global_lock);
>> for (i = 0; i < MAX_IO_APICS; i++) {
>> if (ir_ioapic[i].iommu && ir_ioapic[i].id == apic) {
>> sid = (ir_ioapic[i].bus << 8) |
>> ir_ioapic[i].devfn; break;
>> }
>> }
>> - up_read(&dmar_global_lock);
>>
>> if (sid == 0) {
>> pr_warn("Failed to set source-id of IOAPIC (%d)\n",
>> apic); @@ -338,14 +336,12 @@ static int set_hpet_sid(struct irte *irte,
>> u8 id) if (!irte)
>> return -1;
>>
>> - down_read(&dmar_global_lock);
>> for (i = 0; i < MAX_HPET_TBS; i++) {
>> if (ir_hpet[i].iommu && ir_hpet[i].id == id) {
>> sid = (ir_hpet[i].bus << 8) | ir_hpet[i].devfn;
>> break;
>> }
>> }
>> - up_read(&dmar_global_lock);
>>
>> if (sid == 0) {
>> pr_warn("Failed to set source-id of HPET block (%d)\n",
>> id); @@ -1339,9 +1335,7 @@ static int intel_irq_remapping_alloc(struct
>> irq_domain *domain, if (!data)
>> goto out_free_parent;
>>
>> - down_read(&dmar_global_lock);
>> index = alloc_irte(iommu, &data->irq_2_iommu, nr_irqs);
>> - up_read(&dmar_global_lock);
>> if (index < 0) {
>> pr_warn("Failed to allocate IRTE\n");
>> kfree(data);
> Reviewed-by: Jacob Pan <[email protected]>
>
> slightly beyond the scope of this, do we need to take dmar_global_lock
> below? shouldn't it be in single threaded context?
>
> down_write(&dmar_global_lock);
> ret = dmar_dev_scope_init();
> up_write(&dmar_global_lock);
>
> return ret;
> }
> rootfs_initcall(ir_dev_scope_init);

Yes, agreed. This runs in a single threaded context. I will remove the
locking in a cleanup patch.

Best regards,
baolu

2023-03-27 12:21:59

by Jason Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()

On Tue, Mar 14, 2023 at 01:18:36PM +0800, Lu Baolu wrote:
> The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac932
> ("iommu/vt-d: Introduce a rwsem to protect global data structures"). It
> is used to protect DMAR related global data from DMAR hotplug operations.
>
> Using dmar_global_lock in intel_irq_remapping_alloc() is unnecessary as
> the DMAR global data structures are not touched there. Remove it to avoid
> below lockdep warning.

Tested-by: Jason Gunthorpe <[email protected]>

Solves my splat too

Let's send this to -rc please

Jason

2023-03-27 13:14:12

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()

On 2023/3/27 20:18, Jason Gunthorpe wrote:
> On Tue, Mar 14, 2023 at 01:18:36PM +0800, Lu Baolu wrote:
>> The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac932
>> ("iommu/vt-d: Introduce a rwsem to protect global data structures"). It
>> is used to protect DMAR related global data from DMAR hotplug operations.
>>
>> Using dmar_global_lock in intel_irq_remapping_alloc() is unnecessary as
>> the DMAR global data structures are not touched there. Remove it to avoid
>> below lockdep warning.
>
> Tested-by: Jason Gunthorpe <[email protected]>
>
> Solves my splat too
>
> Let's send this to -rc please

Thank you for the testing. I will queue it for Joerg this week.

Best regards,
baolu

2023-04-27 00:46:58

by Nicolin Chen

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()

Hi Baolu/Jason,

On Mon, Mar 27, 2023 at 06:13:10AM -0700, Baolu Lu wrote:
> On 2023/3/27 20:18, Jason Gunthorpe wrote:
> > On Tue, Mar 14, 2023 at 01:18:36PM +0800, Lu Baolu wrote:
> >> The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac932
> >> ("iommu/vt-d: Introduce a rwsem to protect global data structures"). It
> >> is used to protect DMAR related global data from DMAR hotplug operations.
> >>
> >> Using dmar_global_lock in intel_irq_remapping_alloc() is unnecessary as
> >> the DMAR global data structures are not touched there. Remove it to avoid
> >> below lockdep warning.
> >
> > Tested-by: Jason Gunthorpe <[email protected]>
> >
> > Solves my splat too
> >
> > Let's send this to -rc please
>
> Thank you for the testing. I will queue it for Joerg this week.

I found a couple of kernel warnings switching from v6.3-rc4
to v6.3-rc5. Git-bisect is pointing at this commit.

My test environment is MKT enabling irq_remap:
https://github.com/Mellanox/mkt
CONFIG_IOMMUFD=m
CONFIG_IOMMUFD_VFIO_CONTAINER=y
CONFIG_IOMMUFD_TEST=y
CONFIG_IRQ_REMAP=y

Any idea?

Thanks
Nicolin

Attaching WARNINGs:
[ 19.680725] ------------[ cut here ]------------
[ 19.681083] WARNING: CPU: 0 PID: 561 at include/linux/mmap_lock.h:161 track_pfn_remap+0xf5/0x100
[ 19.681356] Modules linked in: vfio_pci vfio_pci_core irqbypass vfio iommufd mlx5_ib ib_uverbs ib_core mlx5_core
[ 19.681654] CPU: 0 PID: 561 Comm: python3 Not tainted 6.3.0-rc6+ #1080
[ 19.681808] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 19.682108] RIP: 0010:track_pfn_remap+0xf5/0x100
[ 19.682270] Code: 5e 5d c3 48 89 f2 31 c9 4c 89 c6 4c 89 e7 e8 42 fc ff ff e9 54 ff ff ff 48 8d b8 88 01 00 00 31 f6 e8 5f 97 76 00 85 c0 75 be <0f> 0b eb ba 0f 1f 80 00 00 00 00 80 3d 71 72 ef 00 00 74 01 c3 55
[ 19.682678] RSP: 0000:ffffc900014b7ce8 EFLAGS: 00010246
[ 19.682805] RAX: 0000000000000000 RBX: 0000000002000000 RCX: 0000000000000000
[ 19.682984] RDX: 0000000000000000 RSI: ffff888104a709c8 RDI: ffff888108756150
[ 19.683214] RBP: ffffc900014b7d08 R08: 0000000000000001 R09: 0000000000000003
[ 19.683464] R10: 000000000e6a4a47 R11: 00000000620892b1 R12: 00000000fc000000
[ 19.683663] R13: ffffc900014b7d20 R14: ffff888106a35100 R15: 0000000002000000
[ 19.683843] FS: 00007fa8aa4edb80(0000) GS:ffff8881ba400000(0000) knlGS:0000000000000000
[ 19.684054] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 19.684223] CR2: 00007fa8a748f004 CR3: 000000010673b005 CR4: 00000000003706b0
[ 19.684414] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 19.684632] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 19.684817] Call Trace:
[ 19.684893] <TASK>
[ 19.684967] remap_pfn_range+0x3e/0xa0
[ 19.685084] vfio_pci_mmap_fault+0x8a/0x160 [vfio_pci_core]
[ 19.685253] __do_fault+0x30/0xa0
[ 19.685368] __handle_mm_fault+0xe08/0x1ff0
[ 19.685520] ? find_held_lock+0x31/0x80
[ 19.685655] ? mt_find+0x15d/0x400
[ 19.685759] ? lock_release+0xbc/0x240
[ 19.685862] handle_mm_fault+0xa8/0x170
[ 19.685963] ? find_vma+0x3c/0x70
[ 19.686066] exc_page_fault+0x1e6/0x7b0
[ 19.686167] asm_exc_page_fault+0x27/0x30
[ 19.686271] RIP: 0033:0x7fa8a986bfd5
[ 19.686373] Code: ef 41 89 c4 e8 dc 70 fc ff 45 85 e4 0f 85 a0 0e 00 00 48 89 df e8 3b ce ff ff 48 85 c0 0f 84 eb 0d 00 00 48 8b ab e8 02 00 00 <8b> 45 04 0f c8 c1 e8 10 83 f8 05 0f 85 b0 0b 00 00 8b 45 14 ba 01
[ 19.686773] RSP: 002b:00007ffc2392ab30 EFLAGS: 00010206
[ 19.686931] RAX: 0000559646201590 RBX: 0000559646471a50 RCX: 0000559646471d00
[ 19.687132] RDX: 0000559646471d00 RSI: 0000000000003b71 RDI: 0000000000000003
[ 19.687377] RBP: 00007fa8a748f000 R08: 00000000fedfffff R09: 0000000000000000
[ 19.687563] R10: 0000000000000022 R11: 0000000000000246 R12: 0000559646470904
[ 19.687744] R13: 00007ffc2392ab70 R14: 0000000000000002 R15: 0000559646470934
[ 19.687943] </TASK>
[ 19.688016] irq event stamp: 655597
[ 19.688114] hardirqs last enabled at (655605): [<ffffffff810c9683>] __up_console_sem+0x53/0x60
[ 19.688340] hardirqs last disabled at (655612): [<ffffffff810c9668>] __up_console_sem+0x38/0x60
[ 19.688554] softirqs last enabled at (655148): [<ffffffff81064e79>] irq_exit_rcu+0x69/0x90
[ 19.688733] softirqs last disabled at (655143): [<ffffffff81064e79>] irq_exit_rcu+0x69/0x90
[ 19.688915] ---[ end trace 0000000000000000 ]---
[ 19.689076] ------------[ cut here ]------------
[ 19.689197] WARNING: CPU: 0 PID: 561 at include/linux/mmap_lock.h:161 remap_pfn_range_notrack+0x40f/0x4f0
[ 19.689440] Modules linked in: vfio_pci vfio_pci_core irqbypass vfio iommufd mlx5_ib ib_uverbs ib_core mlx5_core
[ 19.689691] CPU: 0 PID: 561 Comm: python3 Tainted: G W 6.3.0-rc6+ #1080
[ 19.689867] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 19.690109] RIP: 0010:remap_pfn_range_notrack+0x40f/0x4f0
[ 19.690234] Code: 39 eb 0f 85 d9 fc ff ff 31 c0 eb 5e 48 8b 45 a8 31 f6 4c 89 45 d0 48 8d b8 88 01 00 00 e8 89 bc 5a 00 4c 8b 45 d0 85 c0 75 02 <0f> 0b 48 8b 43 20 e9 5e fc ff ff 48 8b 7d a8 48 89 c6 48 89 c3 e8
[ 19.690628] RSP: 0000:ffffc900014b7c68 EFLAGS: 00010246
[ 19.690750] RAX: 0000000000000000 RBX: ffff888106a35100 RCX: 0000000000000000
[ 19.690914] RDX: 0000000000000000 RSI: ffff888104a709c8 RDI: ffff888108756150
[ 19.691074] RBP: ffffc900014b7d08 R08: 00007fa8a948f000 R09: 0000000000000003
[ 19.691274] R10: 000000000e6a4a47 R11: 00000000620892b1 R12: 00000000000fc000
[ 19.691469] R13: 00007fa8a748f000 R14: 00007fa8a748f000 R15: 8000000000000037
[ 19.691647] FS: 00007fa8aa4edb80(0000) GS:ffff8881ba400000(0000) knlGS:0000000000000000
[ 19.691830] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 19.691980] CR2: 00007fa8a748f004 CR3: 000000010673b005 CR4: 00000000003706b0
[ 19.692159] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 19.692336] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 19.692538] Call Trace:
[ 19.692592] <TASK>
[ 19.692647] ? track_pfn_remap+0xf7/0x100
[ 19.692745] remap_pfn_range+0x57/0xa0
[ 19.692845] vfio_pci_mmap_fault+0x8a/0x160 [vfio_pci_core]
[ 19.692991] __do_fault+0x30/0xa0
[ 19.693089] __handle_mm_fault+0xe08/0x1ff0
[ 19.693186] ? find_held_lock+0x31/0x80
[ 19.693291] ? mt_find+0x15d/0x400
[ 19.693391] ? lock_release+0xbc/0x240
[ 19.693491] handle_mm_fault+0xa8/0x170
[ 19.693587] ? find_vma+0x3c/0x70
[ 19.693685] exc_page_fault+0x1e6/0x7b0
[ 19.693782] asm_exc_page_fault+0x27/0x30
[ 19.693880] RIP: 0033:0x7fa8a986bfd5
[ 19.693961] Code: ef 41 89 c4 e8 dc 70 fc ff 45 85 e4 0f 85 a0 0e 00 00 48 89 df e8 3b ce ff ff 48 85 c0 0f 84 eb 0d 00 00 48 8b ab e8 02 00 00 <8b> 45 04 0f c8 c1 e8 10 83 f8 05 0f 85 b0 0b 00 00 8b 45 14 ba 01
[ 19.694342] RSP: 002b:00007ffc2392ab30 EFLAGS: 00010206
[ 19.694466] RAX: 0000559646201590 RBX: 0000559646471a50 RCX: 0000559646471d00
[ 19.694619] RDX: 0000559646471d00 RSI: 0000000000003b71 RDI: 0000000000000003
[ 19.694778] RBP: 00007fa8a748f000 R08: 00000000fedfffff R09: 0000000000000000
[ 19.694934] R10: 0000000000000022 R11: 0000000000000246 R12: 0000559646470904
[ 19.695092] R13: 00007ffc2392ab70 R14: 0000000000000002 R15: 0000559646470934
[ 19.695302] </TASK>
[ 19.695373] irq event stamp: 656049
[ 19.695452] hardirqs last enabled at (656057): [<ffffffff810c9683>] __up_console_sem+0x53/0x60
[ 19.695657] hardirqs last disabled at (656064): [<ffffffff810c9668>] __up_console_sem+0x38/0x60
[ 19.695883] softirqs last enabled at (655148): [<ffffffff81064e79>] irq_exit_rcu+0x69/0x90
[ 19.696061] softirqs last disabled at (655143): [<ffffffff81064e79>] irq_exit_rcu+0x69/0x90
[ 19.696240] ---[ end trace 0000000000000000 ]---

2023-04-27 03:56:36

by Lu Baolu

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()

On 4/27/23 8:43 AM, Nicolin Chen wrote:
> Hi Baolu/Jason,

Hi Nicolin,

>
> On Mon, Mar 27, 2023 at 06:13:10AM -0700, Baolu Lu wrote:
>> On 2023/3/27 20:18, Jason Gunthorpe wrote:
>>> On Tue, Mar 14, 2023 at 01:18:36PM +0800, Lu Baolu wrote:
>>>> The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac932
>>>> ("iommu/vt-d: Introduce a rwsem to protect global data structures"). It
>>>> is used to protect DMAR related global data from DMAR hotplug operations.
>>>>
>>>> Using dmar_global_lock in intel_irq_remapping_alloc() is unnecessary as
>>>> the DMAR global data structures are not touched there. Remove it to avoid
>>>> below lockdep warning.
>>>
>>> Tested-by: Jason Gunthorpe <[email protected]>
>>>
>>> Solves my splat too
>>>
>>> Let's send this to -rc please
>>
>> Thank you for the testing. I will queue it for Joerg this week.
>
> I found a couple of kernel warnings switching from v6.3-rc4
> to v6.3-rc5. Git-bisect is pointing at this commit.
>
> My test environment is MKT enabling irq_remap:
> https://github.com/Mellanox/mkt
> CONFIG_IOMMUFD=m
> CONFIG_IOMMUFD_VFIO_CONTAINER=y
> CONFIG_IOMMUFD_TEST=y
> CONFIG_IRQ_REMAP=y
>
> Any idea?
>
> Thanks
> Nicolin
>
> Attaching WARNINGs:
> [ 19.680725] ------------[ cut here ]------------
> [ 19.681083] WARNING: CPU: 0 PID: 561 at include/linux/mmap_lock.h:161 track_pfn_remap+0xf5/0x100
> [ 19.681356] Modules linked in: vfio_pci vfio_pci_core irqbypass vfio iommufd mlx5_ib ib_uverbs ib_core mlx5_core
> [ 19.681654] CPU: 0 PID: 561 Comm: python3 Not tainted 6.3.0-rc6+ #1080
> [ 19.681808] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> [ 19.682108] RIP: 0010:track_pfn_remap+0xf5/0x100
> [ 19.682270] Code: 5e 5d c3 48 89 f2 31 c9 4c 89 c6 4c 89 e7 e8 42 fc ff ff e9 54 ff ff ff 48 8d b8 88 01 00 00 31 f6 e8 5f 97 76 00 85 c0 75 be <0f> 0b eb ba 0f 1f 80 00 00 00 00 80 3d 71 72 ef 00 00 74 01 c3 55
> [ 19.682678] RSP: 0000:ffffc900014b7ce8 EFLAGS: 00010246
> [ 19.682805] RAX: 0000000000000000 RBX: 0000000002000000 RCX: 0000000000000000
> [ 19.682984] RDX: 0000000000000000 RSI: ffff888104a709c8 RDI: ffff888108756150
> [ 19.683214] RBP: ffffc900014b7d08 R08: 0000000000000001 R09: 0000000000000003
> [ 19.683464] R10: 000000000e6a4a47 R11: 00000000620892b1 R12: 00000000fc000000
> [ 19.683663] R13: ffffc900014b7d20 R14: ffff888106a35100 R15: 0000000002000000
> [ 19.683843] FS: 00007fa8aa4edb80(0000) GS:ffff8881ba400000(0000) knlGS:0000000000000000
> [ 19.684054] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 19.684223] CR2: 00007fa8a748f004 CR3: 000000010673b005 CR4: 00000000003706b0
> [ 19.684414] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 19.684632] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 19.684817] Call Trace:
> [ 19.684893] <TASK>
> [ 19.684967] remap_pfn_range+0x3e/0xa0
> [ 19.685084] vfio_pci_mmap_fault+0x8a/0x160 [vfio_pci_core]
> [ 19.685253] __do_fault+0x30/0xa0
> [ 19.685368] __handle_mm_fault+0xe08/0x1ff0
> [ 19.685520] ? find_held_lock+0x31/0x80
> [ 19.685655] ? mt_find+0x15d/0x400
> [ 19.685759] ? lock_release+0xbc/0x240
> [ 19.685862] handle_mm_fault+0xa8/0x170
> [ 19.685963] ? find_vma+0x3c/0x70
> [ 19.686066] exc_page_fault+0x1e6/0x7b0
> [ 19.686167] asm_exc_page_fault+0x27/0x30
> [ 19.686271] RIP: 0033:0x7fa8a986bfd5
> [ 19.686373] Code: ef 41 89 c4 e8 dc 70 fc ff 45 85 e4 0f 85 a0 0e 00 00 48 89 df e8 3b ce ff ff 48 85 c0 0f 84 eb 0d 00 00 48 8b ab e8 02 00 00 <8b> 45 04 0f c8 c1 e8 10 83 f8 05 0f 85 b0 0b 00 00 8b 45 14 ba 01
> [ 19.686773] RSP: 002b:00007ffc2392ab30 EFLAGS: 00010206
> [ 19.686931] RAX: 0000559646201590 RBX: 0000559646471a50 RCX: 0000559646471d00
> [ 19.687132] RDX: 0000559646471d00 RSI: 0000000000003b71 RDI: 0000000000000003
> [ 19.687377] RBP: 00007fa8a748f000 R08: 00000000fedfffff R09: 0000000000000000
> [ 19.687563] R10: 0000000000000022 R11: 0000000000000246 R12: 0000559646470904
> [ 19.687744] R13: 00007ffc2392ab70 R14: 0000000000000002 R15: 0000559646470934
> [ 19.687943] </TASK>
> [ 19.688016] irq event stamp: 655597
> [ 19.688114] hardirqs last enabled at (655605): [<ffffffff810c9683>] __up_console_sem+0x53/0x60
> [ 19.688340] hardirqs last disabled at (655612): [<ffffffff810c9668>] __up_console_sem+0x38/0x60
> [ 19.688554] softirqs last enabled at (655148): [<ffffffff81064e79>] irq_exit_rcu+0x69/0x90
> [ 19.688733] softirqs last disabled at (655143): [<ffffffff81064e79>] irq_exit_rcu+0x69/0x90
> [ 19.688915] ---[ end trace 0000000000000000 ]---
> [ 19.689076] ------------[ cut here ]------------
> [ 19.689197] WARNING: CPU: 0 PID: 561 at include/linux/mmap_lock.h:161 remap_pfn_range_notrack+0x40f/0x4f0
> [ 19.689440] Modules linked in: vfio_pci vfio_pci_core irqbypass vfio iommufd mlx5_ib ib_uverbs ib_core mlx5_core
> [ 19.689691] CPU: 0 PID: 561 Comm: python3 Tainted: G W 6.3.0-rc6+ #1080
> [ 19.689867] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
> [ 19.690109] RIP: 0010:remap_pfn_range_notrack+0x40f/0x4f0
> [ 19.690234] Code: 39 eb 0f 85 d9 fc ff ff 31 c0 eb 5e 48 8b 45 a8 31 f6 4c 89 45 d0 48 8d b8 88 01 00 00 e8 89 bc 5a 00 4c 8b 45 d0 85 c0 75 02 <0f> 0b 48 8b 43 20 e9 5e fc ff ff 48 8b 7d a8 48 89 c6 48 89 c3 e8
> [ 19.690628] RSP: 0000:ffffc900014b7c68 EFLAGS: 00010246
> [ 19.690750] RAX: 0000000000000000 RBX: ffff888106a35100 RCX: 0000000000000000
> [ 19.690914] RDX: 0000000000000000 RSI: ffff888104a709c8 RDI: ffff888108756150
> [ 19.691074] RBP: ffffc900014b7d08 R08: 00007fa8a948f000 R09: 0000000000000003
> [ 19.691274] R10: 000000000e6a4a47 R11: 00000000620892b1 R12: 00000000000fc000
> [ 19.691469] R13: 00007fa8a748f000 R14: 00007fa8a748f000 R15: 8000000000000037
> [ 19.691647] FS: 00007fa8aa4edb80(0000) GS:ffff8881ba400000(0000) knlGS:0000000000000000
> [ 19.691830] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 19.691980] CR2: 00007fa8a748f004 CR3: 000000010673b005 CR4: 00000000003706b0
> [ 19.692159] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 19.692336] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 19.692538] Call Trace:
> [ 19.692592] <TASK>
> [ 19.692647] ? track_pfn_remap+0xf7/0x100
> [ 19.692745] remap_pfn_range+0x57/0xa0
> [ 19.692845] vfio_pci_mmap_fault+0x8a/0x160 [vfio_pci_core]
> [ 19.692991] __do_fault+0x30/0xa0
> [ 19.693089] __handle_mm_fault+0xe08/0x1ff0
> [ 19.693186] ? find_held_lock+0x31/0x80
> [ 19.693291] ? mt_find+0x15d/0x400
> [ 19.693391] ? lock_release+0xbc/0x240
> [ 19.693491] handle_mm_fault+0xa8/0x170
> [ 19.693587] ? find_vma+0x3c/0x70
> [ 19.693685] exc_page_fault+0x1e6/0x7b0
> [ 19.693782] asm_exc_page_fault+0x27/0x30
> [ 19.693880] RIP: 0033:0x7fa8a986bfd5
> [ 19.693961] Code: ef 41 89 c4 e8 dc 70 fc ff 45 85 e4 0f 85 a0 0e 00 00 48 89 df e8 3b ce ff ff 48 85 c0 0f 84 eb 0d 00 00 48 8b ab e8 02 00 00 <8b> 45 04 0f c8 c1 e8 10 83 f8 05 0f 85 b0 0b 00 00 8b 45 14 ba 01
> [ 19.694342] RSP: 002b:00007ffc2392ab30 EFLAGS: 00010206
> [ 19.694466] RAX: 0000559646201590 RBX: 0000559646471a50 RCX: 0000559646471d00
> [ 19.694619] RDX: 0000559646471d00 RSI: 0000000000003b71 RDI: 0000000000000003
> [ 19.694778] RBP: 00007fa8a748f000 R08: 00000000fedfffff R09: 0000000000000000
> [ 19.694934] R10: 0000000000000022 R11: 0000000000000246 R12: 0000559646470904
> [ 19.695092] R13: 00007ffc2392ab70 R14: 0000000000000002 R15: 0000559646470934
> [ 19.695302] </TASK>
> [ 19.695373] irq event stamp: 656049
> [ 19.695452] hardirqs last enabled at (656057): [<ffffffff810c9683>] __up_console_sem+0x53/0x60
> [ 19.695657] hardirqs last disabled at (656064): [<ffffffff810c9668>] __up_console_sem+0x38/0x60
> [ 19.695883] softirqs last enabled at (655148): [<ffffffff81064e79>] irq_exit_rcu+0x69/0x90
> [ 19.696061] softirqs last disabled at (655143): [<ffffffff81064e79>] irq_exit_rcu+0x69/0x90
> [ 19.696240] ---[ end trace 0000000000000000 ]---

I took a quick look. It seems that above warnings are irrelevant to this
commit. Can you please simply revert this commit and check whether there
are any changes?

Best regards,
baolu

2023-04-27 03:57:13

by Nicolin Chen

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()

On Thu, Apr 27, 2023 at 11:20:40AM +0800, Baolu Lu wrote:

> > Attaching WARNINGs:
> > [ 19.680725] ------------[ cut here ]------------
> > [ 19.681083] WARNING: CPU: 0 PID: 561 at include/linux/mmap_lock.h:161 track_pfn_remap+0xf5/0x100

> > [ 19.684817] Call Trace:
> > [ 19.684893] <TASK>
> > [ 19.684967] remap_pfn_range+0x3e/0xa0
> > [ 19.685084] vfio_pci_mmap_fault+0x8a/0x160 [vfio_pci_core]

> I took a quick look. It seems that above warnings are irrelevant to this
> commit. Can you please simply revert this commit and check whether there
> are any changes?

I tried on top of the v6.3-rc5 tag. The warnings were triggered
constantly. And reverting the commit fixes it:
nicolinc@Asurada-Nvidia:~/work/mkt/images/nicolinc/src/kernel$ git log --oneline -2
cb3dc9b2417e (HEAD -> v6.3-rc5) Revert "iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()"
7e364e56293b (tag: v6.3-rc5, jgg/linus) Linux 6.3-rc5

I don't think the commit is the causation yet there seems to be
a strong correlation here...

Thanks
Nic

2023-04-27 09:45:23

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()

On 2023-04-27 04:37, Nicolin Chen wrote:
> On Thu, Apr 27, 2023 at 11:20:40AM +0800, Baolu Lu wrote:
>
>>> Attaching WARNINGs:
>>> [ 19.680725] ------------[ cut here ]------------
>>> [ 19.681083] WARNING: CPU: 0 PID: 561 at include/linux/mmap_lock.h:161 track_pfn_remap+0xf5/0x100
>
>>> [ 19.684817] Call Trace:
>>> [ 19.684893] <TASK>
>>> [ 19.684967] remap_pfn_range+0x3e/0xa0
>>> [ 19.685084] vfio_pci_mmap_fault+0x8a/0x160 [vfio_pci_core]
>
>> I took a quick look. It seems that above warnings are irrelevant to this
>> commit. Can you please simply revert this commit and check whether there
>> are any changes?
>
> I tried on top of the v6.3-rc5 tag. The warnings were triggered
> constantly. And reverting the commit fixes it:
> nicolinc@Asurada-Nvidia:~/work/mkt/images/nicolinc/src/kernel$ git log --oneline -2
> cb3dc9b2417e (HEAD -> v6.3-rc5) Revert "iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()"
> 7e364e56293b (tag: v6.3-rc5, jgg/linus) Linux 6.3-rc5
>
> I don't think the commit is the causation yet there seems to be
> a strong correlation here...

The correlation is probably that you're now getting to see a genuine
warning from lockdep_assert_held(), since intel-iommu is no longer
causing debug_locks to be turned off earlier.

Robin.

2023-04-27 17:31:58

by Nicolin Chen

[permalink] [raw]
Subject: Re: [PATCH] iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()

On Thu, Apr 27, 2023 at 10:35:11AM +0100, Robin Murphy wrote:
> On 2023-04-27 04:37, Nicolin Chen wrote:
> > On Thu, Apr 27, 2023 at 11:20:40AM +0800, Baolu Lu wrote:
> >
> > > > Attaching WARNINGs:
> > > > [ 19.680725] ------------[ cut here ]------------
> > > > [ 19.681083] WARNING: CPU: 0 PID: 561 at include/linux/mmap_lock.h:161 track_pfn_remap+0xf5/0x100
> >
> > > > [ 19.684817] Call Trace:
> > > > [ 19.684893] <TASK>
> > > > [ 19.684967] remap_pfn_range+0x3e/0xa0
> > > > [ 19.685084] vfio_pci_mmap_fault+0x8a/0x160 [vfio_pci_core]
> >
> > > I took a quick look. It seems that above warnings are irrelevant to this
> > > commit. Can you please simply revert this commit and check whether there
> > > are any changes?
> >
> > I tried on top of the v6.3-rc5 tag. The warnings were triggered
> > constantly. And reverting the commit fixes it:
> > nicolinc@Asurada-Nvidia:~/work/mkt/images/nicolinc/src/kernel$ git log --oneline -2
> > cb3dc9b2417e (HEAD -> v6.3-rc5) Revert "iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()"
> > 7e364e56293b (tag: v6.3-rc5, jgg/linus) Linux 6.3-rc5
> >
> > I don't think the commit is the causation yet there seems to be
> > a strong correlation here...
>
> The correlation is probably that you're now getting to see a genuine
> warning from lockdep_assert_held(), since intel-iommu is no longer
> causing debug_locks to be turned off earlier.

Hmm, looks like so. The mmap_lock is held with a read permission,
in arch/x86/mm/fault.c file, v.s. the expectation of a write one
by remap_pfn_range().

Thanks
Nicolin