The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac932
("iommu/vt-d: Introduce a rwsem to protect global data structures"). It
is used to protect DMAR-related global data from DMAR hotplug operations.

Using dmar_global_lock in intel_irq_remapping_alloc() is unnecessary as
the DMAR global data structures are not touched there. Remove it to avoid
the lockdep warning below.
======================================================
WARNING: possible circular locking dependency detected
6.3.0-rc2 #468 Not tainted
------------------------------------------------------
swapper/0/1 is trying to acquire lock:
ff1db4cb40178698 (&domain->mutex){+.+.}-{3:3},
at: __irq_domain_alloc_irqs+0x3b/0xa0
but task is already holding lock:
ffffffffa0c1cdf0 (dmar_global_lock){++++}-{3:3},
at: intel_iommu_init+0x58e/0x880
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (dmar_global_lock){++++}-{3:3}:
lock_acquire+0xd6/0x320
down_read+0x42/0x180
intel_irq_remapping_alloc+0xad/0x750
mp_irqdomain_alloc+0xb8/0x2b0
irq_domain_alloc_irqs_locked+0x12f/0x2d0
__irq_domain_alloc_irqs+0x56/0xa0
alloc_isa_irq_from_domain.isra.7+0xa0/0xe0
mp_map_pin_to_irq+0x1dc/0x330
setup_IO_APIC+0x128/0x210
apic_intr_mode_init+0x67/0x110
x86_late_time_init+0x24/0x40
start_kernel+0x41e/0x7e0
secondary_startup_64_no_verify+0xe0/0xeb
-> #0 (&domain->mutex){+.+.}-{3:3}:
check_prevs_add+0x160/0xef0
__lock_acquire+0x147d/0x1950
lock_acquire+0xd6/0x320
__mutex_lock+0x9c/0xfc0
__irq_domain_alloc_irqs+0x3b/0xa0
dmar_alloc_hwirq+0x9e/0x120
iommu_pmu_register+0x11d/0x200
intel_iommu_init+0x5de/0x880
pci_iommu_init+0x12/0x40
do_one_initcall+0x65/0x350
kernel_init_freeable+0x3ca/0x610
kernel_init+0x1a/0x140
ret_from_fork+0x29/0x50
other info that might help us debug this:
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(dmar_global_lock);
                               lock(&domain->mutex);
                               lock(dmar_global_lock);
  lock(&domain->mutex);
*** DEADLOCK ***
Fixes: 9dbb8e3452ab ("irqdomain: Switch to per-domain locking")
Signed-off-by: Lu Baolu <[email protected]>
---
drivers/iommu/intel/irq_remapping.c | 6 ------
1 file changed, 6 deletions(-)
diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c
index 6d01fa078c36..df9e261af0b5 100644
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -311,14 +311,12 @@ static int set_ioapic_sid(struct irte *irte, int apic)
if (!irte)
return -1;
- down_read(&dmar_global_lock);
for (i = 0; i < MAX_IO_APICS; i++) {
if (ir_ioapic[i].iommu && ir_ioapic[i].id == apic) {
sid = (ir_ioapic[i].bus << 8) | ir_ioapic[i].devfn;
break;
}
}
- up_read(&dmar_global_lock);
if (sid == 0) {
pr_warn("Failed to set source-id of IOAPIC (%d)\n", apic);
@@ -338,14 +336,12 @@ static int set_hpet_sid(struct irte *irte, u8 id)
if (!irte)
return -1;
- down_read(&dmar_global_lock);
for (i = 0; i < MAX_HPET_TBS; i++) {
if (ir_hpet[i].iommu && ir_hpet[i].id == id) {
sid = (ir_hpet[i].bus << 8) | ir_hpet[i].devfn;
break;
}
}
- up_read(&dmar_global_lock);
if (sid == 0) {
pr_warn("Failed to set source-id of HPET block (%d)\n", id);
@@ -1339,9 +1335,7 @@ static int intel_irq_remapping_alloc(struct irq_domain *domain,
if (!data)
goto out_free_parent;
- down_read(&dmar_global_lock);
index = alloc_irte(iommu, &data->irq_2_iommu, nr_irqs);
- up_read(&dmar_global_lock);
if (index < 0) {
pr_warn("Failed to allocate IRTE\n");
kfree(data);
--
2.34.1
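[Editorial note: the two dependency chains in the splat above amount to a classic ABBA inversion — the boot IO-APIC path takes &domain->mutex and then dmar_global_lock, while intel_iommu_init() takes them in the opposite order. The toy lock-order tracker below (illustrative Python, not kernel code; the class and method names are invented for this sketch, the real logic lives in kernel/locking/lockdep.c) shows how lockdep-style checking flags the inversion.]

```python
# Toy lockdep: remember which locks were ever taken while another was held,
# and flag an acquisition that inverts a previously observed order.
class ToyLockdep:
    def __init__(self):
        self.taken_while_held = {}  # lock name -> set of locks taken under it
        self.held = []

    def acquire(self, name):
        for h in self.held:
            self.taken_while_held.setdefault(h, set()).add(name)
            # If 'h' was ever taken while 'name' was held, the order is inverted.
            if h in self.taken_while_held.get(name, set()):
                raise RuntimeError(f"possible circular locking dependency: "
                                   f"{name} -> {h} -> {name}")
        self.held.append(name)

    def release(self, name):
        self.held.remove(name)


ld = ToyLockdep()

# Chain #1 (boot IO-APIC path): &domain->mutex, then dmar_global_lock.
ld.acquire("&domain->mutex")
ld.acquire("dmar_global_lock")
ld.release("dmar_global_lock")
ld.release("&domain->mutex")

# Chain #0 (intel_iommu_init -> dmar_alloc_hwirq): the reverse order.
ld.acquire("dmar_global_lock")
try:
    ld.acquire("&domain->mutex")
except RuntimeError as e:
    print(e)
```

Dropping the down_read() in intel_irq_remapping_alloc() removes chain #1, so no ordering is ever recorded between the two locks.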
Hi BaoLu,
On Tue, 14 Mar 2023 13:18:36 +0800, Lu Baolu <[email protected]>
wrote:
> The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac932
> ("iommu/vt-d: Introduce a rwsem to protect global data structures"). It
> is used to protect DMAR related global data from DMAR hotplug operations.
>
> Using dmar_global_lock in intel_irq_remapping_alloc() is unnecessary as
> the DMAR global data structures are not touched there. Remove it to avoid
> below lockdep warning.
Reviewed-by: Jacob Pan <[email protected]>
Slightly beyond the scope of this patch: do we need to take dmar_global_lock
below? Shouldn't this run in a single-threaded context?
down_write(&dmar_global_lock);
ret = dmar_dev_scope_init();
up_write(&dmar_global_lock);
return ret;
}
rootfs_initcall(ir_dev_scope_init);
Thanks,
Jacob
On 3/14/23 11:54 PM, Jacob Pan wrote:
> Hi BaoLu,
>
> On Tue, 14 Mar 2023 13:18:36 +0800, Lu Baolu <[email protected]>
> wrote:
>
> Reviewed-by: Jacob Pan <[email protected]>
>
> slightly beyond the scope of this, do we need to take dmar_global_lock
> below? shouldn't it be in single threaded context?
>
> down_write(&dmar_global_lock);
> ret = dmar_dev_scope_init();
> up_write(&dmar_global_lock);
>
> return ret;
> }
> rootfs_initcall(ir_dev_scope_init);
Yes, agreed. This runs in a single-threaded context. I will remove the
locking in a cleanup patch.
Best regards,
baolu
On Tue, Mar 14, 2023 at 01:18:36PM +0800, Lu Baolu wrote:
> The global rwsem dmar_global_lock was introduced by commit 3a5670e8ac932
> ("iommu/vt-d: Introduce a rwsem to protect global data structures"). It
> is used to protect DMAR related global data from DMAR hotplug operations.
>
> Using dmar_global_lock in intel_irq_remapping_alloc() is unnecessary as
> the DMAR global data structures are not touched there. Remove it to avoid
> below lockdep warning.
Tested-by: Jason Gunthorpe <[email protected]>
Solves my splat too
Let's send this to -rc please
Jason
On 2023/3/27 20:18, Jason Gunthorpe wrote:
> On Tue, Mar 14, 2023 at 01:18:36PM +0800, Lu Baolu wrote:
>
> Tested-by: Jason Gunthorpe <[email protected]>
>
> Solves my splat too
>
> Let's send this to -rc please
Thank you for the testing. I will queue it for Joerg this week.
Best regards,
baolu
Hi Baolu/Jason,
On Mon, Mar 27, 2023 at 06:13:10AM -0700, Baolu Lu wrote:
> On 2023/3/27 20:18, Jason Gunthorpe wrote:
> > On Tue, Mar 14, 2023 at 01:18:36PM +0800, Lu Baolu wrote:
> >
> > Tested-by: Jason Gunthorpe <[email protected]>
> >
> > Solves my splat too
> >
> > Let's send this to -rc please
>
> Thank you for the testing. I will queue it for Joerg this week.
I found a couple of kernel warnings switching from v6.3-rc4
to v6.3-rc5. Git-bisect is pointing at this commit.
My test environment is MKT enabling irq_remap:
https://github.com/Mellanox/mkt
CONFIG_IOMMUFD=m
CONFIG_IOMMUFD_VFIO_CONTAINER=y
CONFIG_IOMMUFD_TEST=y
CONFIG_IRQ_REMAP=y
Any idea?
Thanks
Nicolin
Attaching WARNINGs:
[ 19.680725] ------------[ cut here ]------------
[ 19.681083] WARNING: CPU: 0 PID: 561 at include/linux/mmap_lock.h:161 track_pfn_remap+0xf5/0x100
[ 19.681356] Modules linked in: vfio_pci vfio_pci_core irqbypass vfio iommufd mlx5_ib ib_uverbs ib_core mlx5_core
[ 19.681654] CPU: 0 PID: 561 Comm: python3 Not tainted 6.3.0-rc6+ #1080
[ 19.681808] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 19.682108] RIP: 0010:track_pfn_remap+0xf5/0x100
[ 19.682270] Code: 5e 5d c3 48 89 f2 31 c9 4c 89 c6 4c 89 e7 e8 42 fc ff ff e9 54 ff ff ff 48 8d b8 88 01 00 00 31 f6 e8 5f 97 76 00 85 c0 75 be <0f> 0b eb ba 0f 1f 80 00 00 00 00 80 3d 71 72 ef 00 00 74 01 c3 55
[ 19.682678] RSP: 0000:ffffc900014b7ce8 EFLAGS: 00010246
[ 19.682805] RAX: 0000000000000000 RBX: 0000000002000000 RCX: 0000000000000000
[ 19.682984] RDX: 0000000000000000 RSI: ffff888104a709c8 RDI: ffff888108756150
[ 19.683214] RBP: ffffc900014b7d08 R08: 0000000000000001 R09: 0000000000000003
[ 19.683464] R10: 000000000e6a4a47 R11: 00000000620892b1 R12: 00000000fc000000
[ 19.683663] R13: ffffc900014b7d20 R14: ffff888106a35100 R15: 0000000002000000
[ 19.683843] FS: 00007fa8aa4edb80(0000) GS:ffff8881ba400000(0000) knlGS:0000000000000000
[ 19.684054] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 19.684223] CR2: 00007fa8a748f004 CR3: 000000010673b005 CR4: 00000000003706b0
[ 19.684414] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 19.684632] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 19.684817] Call Trace:
[ 19.684893] <TASK>
[ 19.684967] remap_pfn_range+0x3e/0xa0
[ 19.685084] vfio_pci_mmap_fault+0x8a/0x160 [vfio_pci_core]
[ 19.685253] __do_fault+0x30/0xa0
[ 19.685368] __handle_mm_fault+0xe08/0x1ff0
[ 19.685520] ? find_held_lock+0x31/0x80
[ 19.685655] ? mt_find+0x15d/0x400
[ 19.685759] ? lock_release+0xbc/0x240
[ 19.685862] handle_mm_fault+0xa8/0x170
[ 19.685963] ? find_vma+0x3c/0x70
[ 19.686066] exc_page_fault+0x1e6/0x7b0
[ 19.686167] asm_exc_page_fault+0x27/0x30
[ 19.686271] RIP: 0033:0x7fa8a986bfd5
[ 19.686373] Code: ef 41 89 c4 e8 dc 70 fc ff 45 85 e4 0f 85 a0 0e 00 00 48 89 df e8 3b ce ff ff 48 85 c0 0f 84 eb 0d 00 00 48 8b ab e8 02 00 00 <8b> 45 04 0f c8 c1 e8 10 83 f8 05 0f 85 b0 0b 00 00 8b 45 14 ba 01
[ 19.686773] RSP: 002b:00007ffc2392ab30 EFLAGS: 00010206
[ 19.686931] RAX: 0000559646201590 RBX: 0000559646471a50 RCX: 0000559646471d00
[ 19.687132] RDX: 0000559646471d00 RSI: 0000000000003b71 RDI: 0000000000000003
[ 19.687377] RBP: 00007fa8a748f000 R08: 00000000fedfffff R09: 0000000000000000
[ 19.687563] R10: 0000000000000022 R11: 0000000000000246 R12: 0000559646470904
[ 19.687744] R13: 00007ffc2392ab70 R14: 0000000000000002 R15: 0000559646470934
[ 19.687943] </TASK>
[ 19.688016] irq event stamp: 655597
[ 19.688114] hardirqs last enabled at (655605): [<ffffffff810c9683>] __up_console_sem+0x53/0x60
[ 19.688340] hardirqs last disabled at (655612): [<ffffffff810c9668>] __up_console_sem+0x38/0x60
[ 19.688554] softirqs last enabled at (655148): [<ffffffff81064e79>] irq_exit_rcu+0x69/0x90
[ 19.688733] softirqs last disabled at (655143): [<ffffffff81064e79>] irq_exit_rcu+0x69/0x90
[ 19.688915] ---[ end trace 0000000000000000 ]---
[ 19.689076] ------------[ cut here ]------------
[ 19.689197] WARNING: CPU: 0 PID: 561 at include/linux/mmap_lock.h:161 remap_pfn_range_notrack+0x40f/0x4f0
[ 19.689440] Modules linked in: vfio_pci vfio_pci_core irqbypass vfio iommufd mlx5_ib ib_uverbs ib_core mlx5_core
[ 19.689691] CPU: 0 PID: 561 Comm: python3 Tainted: G W 6.3.0-rc6+ #1080
[ 19.689867] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
[ 19.690109] RIP: 0010:remap_pfn_range_notrack+0x40f/0x4f0
[ 19.690234] Code: 39 eb 0f 85 d9 fc ff ff 31 c0 eb 5e 48 8b 45 a8 31 f6 4c 89 45 d0 48 8d b8 88 01 00 00 e8 89 bc 5a 00 4c 8b 45 d0 85 c0 75 02 <0f> 0b 48 8b 43 20 e9 5e fc ff ff 48 8b 7d a8 48 89 c6 48 89 c3 e8
[ 19.690628] RSP: 0000:ffffc900014b7c68 EFLAGS: 00010246
[ 19.690750] RAX: 0000000000000000 RBX: ffff888106a35100 RCX: 0000000000000000
[ 19.690914] RDX: 0000000000000000 RSI: ffff888104a709c8 RDI: ffff888108756150
[ 19.691074] RBP: ffffc900014b7d08 R08: 00007fa8a948f000 R09: 0000000000000003
[ 19.691274] R10: 000000000e6a4a47 R11: 00000000620892b1 R12: 00000000000fc000
[ 19.691469] R13: 00007fa8a748f000 R14: 00007fa8a748f000 R15: 8000000000000037
[ 19.691647] FS: 00007fa8aa4edb80(0000) GS:ffff8881ba400000(0000) knlGS:0000000000000000
[ 19.691830] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 19.691980] CR2: 00007fa8a748f004 CR3: 000000010673b005 CR4: 00000000003706b0
[ 19.692159] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 19.692336] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 19.692538] Call Trace:
[ 19.692592] <TASK>
[ 19.692647] ? track_pfn_remap+0xf7/0x100
[ 19.692745] remap_pfn_range+0x57/0xa0
[ 19.692845] vfio_pci_mmap_fault+0x8a/0x160 [vfio_pci_core]
[ 19.692991] __do_fault+0x30/0xa0
[ 19.693089] __handle_mm_fault+0xe08/0x1ff0
[ 19.693186] ? find_held_lock+0x31/0x80
[ 19.693291] ? mt_find+0x15d/0x400
[ 19.693391] ? lock_release+0xbc/0x240
[ 19.693491] handle_mm_fault+0xa8/0x170
[ 19.693587] ? find_vma+0x3c/0x70
[ 19.693685] exc_page_fault+0x1e6/0x7b0
[ 19.693782] asm_exc_page_fault+0x27/0x30
[ 19.693880] RIP: 0033:0x7fa8a986bfd5
[ 19.693961] Code: ef 41 89 c4 e8 dc 70 fc ff 45 85 e4 0f 85 a0 0e 00 00 48 89 df e8 3b ce ff ff 48 85 c0 0f 84 eb 0d 00 00 48 8b ab e8 02 00 00 <8b> 45 04 0f c8 c1 e8 10 83 f8 05 0f 85 b0 0b 00 00 8b 45 14 ba 01
[ 19.694342] RSP: 002b:00007ffc2392ab30 EFLAGS: 00010206
[ 19.694466] RAX: 0000559646201590 RBX: 0000559646471a50 RCX: 0000559646471d00
[ 19.694619] RDX: 0000559646471d00 RSI: 0000000000003b71 RDI: 0000000000000003
[ 19.694778] RBP: 00007fa8a748f000 R08: 00000000fedfffff R09: 0000000000000000
[ 19.694934] R10: 0000000000000022 R11: 0000000000000246 R12: 0000559646470904
[ 19.695092] R13: 00007ffc2392ab70 R14: 0000000000000002 R15: 0000559646470934
[ 19.695302] </TASK>
[ 19.695373] irq event stamp: 656049
[ 19.695452] hardirqs last enabled at (656057): [<ffffffff810c9683>] __up_console_sem+0x53/0x60
[ 19.695657] hardirqs last disabled at (656064): [<ffffffff810c9668>] __up_console_sem+0x38/0x60
[ 19.695883] softirqs last enabled at (655148): [<ffffffff81064e79>] irq_exit_rcu+0x69/0x90
[ 19.696061] softirqs last disabled at (655143): [<ffffffff81064e79>] irq_exit_rcu+0x69/0x90
[ 19.696240] ---[ end trace 0000000000000000 ]---
On 4/27/23 8:43 AM, Nicolin Chen wrote:
> Hi Baolu/Jason,
Hi Nicolin,
> I found a couple of kernel warnings switching from v6.3-rc4
> to v6.3-rc5. Git-bisect is pointing at this commit.
>
> Attaching WARNINGs:
> [ 19.680725] ------------[ cut here ]------------
> [ 19.681083] WARNING: CPU: 0 PID: 561 at include/linux/mmap_lock.h:161 track_pfn_remap+0xf5/0x100
> [ 19.684817] Call Trace:
> [ 19.684893] <TASK>
> [ 19.684967] remap_pfn_range+0x3e/0xa0
> [ 19.685084] vfio_pci_mmap_fault+0x8a/0x160 [vfio_pci_core]
I took a quick look. The above warnings seem unrelated to this commit.
Can you please revert this commit and check whether anything changes?
Best regards,
baolu
On Thu, Apr 27, 2023 at 11:20:40AM +0800, Baolu Lu wrote:
> > Attaching WARNINGs:
> > [ 19.680725] ------------[ cut here ]------------
> > [ 19.681083] WARNING: CPU: 0 PID: 561 at include/linux/mmap_lock.h:161 track_pfn_remap+0xf5/0x100
> > [ 19.684817] Call Trace:
> > [ 19.684893] <TASK>
> > [ 19.684967] remap_pfn_range+0x3e/0xa0
> > [ 19.685084] vfio_pci_mmap_fault+0x8a/0x160 [vfio_pci_core]
> I took a quick look. It seems that above warnings are irrelevant to this
> commit. Can you please simply revert this commit and check whether there
> are any changes?
I tried on top of the v6.3-rc5 tag. The warnings were triggered
constantly, and reverting the commit makes them go away:
nicolinc@Asurada-Nvidia:~/work/mkt/images/nicolinc/src/kernel$ git log --oneline -2
cb3dc9b2417e (HEAD -> v6.3-rc5) Revert "iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()"
7e364e56293b (tag: v6.3-rc5, jgg/linus) Linux 6.3-rc5
I don't think the commit is the cause, yet there seems to be
a strong correlation here...
Thanks
Nic
On 2023-04-27 04:37, Nicolin Chen wrote:
> On Thu, Apr 27, 2023 at 11:20:40AM +0800, Baolu Lu wrote:
>
>>> Attaching WARNINGs:
>>> [ 19.680725] ------------[ cut here ]------------
>>> [ 19.681083] WARNING: CPU: 0 PID: 561 at include/linux/mmap_lock.h:161 track_pfn_remap+0xf5/0x100
>
>>> [ 19.684817] Call Trace:
>>> [ 19.684893] <TASK>
>>> [ 19.684967] remap_pfn_range+0x3e/0xa0
>>> [ 19.685084] vfio_pci_mmap_fault+0x8a/0x160 [vfio_pci_core]
>
>> I took a quick look. It seems that above warnings are irrelevant to this
>> commit. Can you please simply revert this commit and check whether there
>> are any changes?
>
> I tried on top of the v6.3-rc5 tag. The warnings were triggered
> constantly. And reverting the commit fixes it:
> nicolinc@Asurada-Nvidia:~/work/mkt/images/nicolinc/src/kernel$ git log --oneline -2
> cb3dc9b2417e (HEAD -> v6.3-rc5) Revert "iommu/vt-d: Remove unnecessary locking in intel_irq_remapping_alloc()"
> 7e364e56293b (tag: v6.3-rc5, jgg/linus) Linux 6.3-rc5
>
> I don't think the commit is the causation yet there seems to be
> a strong correlation here...
The correlation is probably that you're now getting to see a genuine
warning from lockdep_assert_held(), since intel-iommu is no longer
causing debug_locks to be turned off earlier.
Robin.
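[Editorial note: Robin's explanation hinges on lockdep's warn-once behavior — the first splat calls debug_locks_off(), after which further checking is silently skipped, so an early bogus report hides later genuine ones. A minimal sketch of that masking effect (illustrative Python; the class and method names are invented for this sketch):]

```python
# Model of debug_locks: the first violation reported disables all further
# checking, so an early bogus splat masks later genuine ones.
class DebugLocks:
    def __init__(self):
        self.enabled = True
        self.reports = []

    def report(self, msg):
        if not self.enabled:
            return
        self.reports.append(msg)
        self.enabled = False  # analogous to debug_locks_off() in the kernel


# With the dmar_global_lock splat still present, it fires first and the
# genuine mmap_lock assertion is never reported:
dl = DebugLocks()
dl.report("circular dependency: dmar_global_lock vs &domain->mutex")
dl.report("mmap_assert_write_locked failed in track_pfn_remap")
print(dl.reports)  # only the first splat

# With the bogus splat fixed, the mmap_lock warning surfaces instead:
dl = DebugLocks()
dl.report("mmap_assert_write_locked failed in track_pfn_remap")
print(dl.reports)
```

This is why the fix appears "correlated" with the new warnings: it did not introduce them, it merely stopped suppressing them.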
On Thu, Apr 27, 2023 at 10:35:11AM +0100, Robin Murphy wrote:
> The correlation is probably that you're now getting to see a genuine
> warning from lockdep_assert_held(), since intel-iommu is no longer
> causing debug_locks to be turned off earlier.
Hmm, looks like so. The mmap_lock is taken for read in
arch/x86/mm/fault.c, whereas remap_pfn_range() expects it to be
held for write.
Thanks
Nicolin
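[Editorial note: the mode mismatch Nicolin describes can be modeled directly — the fault path takes the lock for read, while the remap_pfn_range() path asserts write ownership. A rough sketch (illustrative Python; mmap_lock is really a kernel rwsem and these names are invented for the sketch):]

```python
# Minimal model of mmap_assert_write_locked(): asserting write ownership
# while the lock is only held for read trips the warning.
class RWSemModel:
    def __init__(self):
        self.mode = None  # None, "read", or "write"

    def down_read(self):
        assert self.mode is None
        self.mode = "read"

    def down_write(self):
        assert self.mode is None
        self.mode = "write"

    def up(self):
        self.mode = None

    def assert_held_write(self):
        # Corresponds to the check at include/linux/mmap_lock.h:161.
        if self.mode != "write":
            raise AssertionError(f"held for {self.mode!r}, write expected")


mmap_lock = RWSemModel()
mmap_lock.down_read()              # fault path: handle_mm_fault() under read lock
try:
    mmap_lock.assert_held_write()  # remap_pfn_range() path expects write
except AssertionError as e:
    print("WARNING:", e)
```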