2024-02-13 16:08:28

by Shivaprasad G Bhat

Subject: [PATCH] powerpc/iommu: Fix the missing iommu_group_put() during platform domain attach

The function spapr_tce_platform_iommu_attach_dev() does not call
iommu_group_put() when the domain is already set. This refcount leak
shows up as a BUG_ON() during a DLPAR remove operation:

KernelBug: Kernel bug in state 'None': kernel BUG at arch/powerpc/platforms/pseries/iommu.c:100!
Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=8192 NUMA pSeries
<snip>
Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1060.00 (NH1060_016) hv:phyp pSeries
NIP: c0000000000ff4d4 LR: c0000000000ff4cc CTR: 0000000000000000
REGS: c0000013aed5f840 TRAP: 0700 Tainted: G I (6.8.0-rc3-autotest-g99bd3cb0d12e)
MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 44002402 XER: 20040000
CFAR: c000000000a0d170 IRQMASK: 0
GPR00: c0000000000ff4cc c0000013aed5fae0 c000000001512700 c0000013aa362138
GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000119c8afd0
GPR08: 0000000000000000 c000001284442b00 0000000000000001 0000000000001003
GPR12: 0000000300000000 c0000018ffff2f00 0000000000000000 0000000000000000
GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
GPR24: c0000013aed5fc40 0000000000000002 0000000000000000 c000000002757d90
GPR28: c0000000000ff440 c000000002757cb8 c00000183799c1a0 c0000013aa362b00
NIP [c0000000000ff4d4] iommu_reconfig_notifier+0x94/0x200
LR [c0000000000ff4cc] iommu_reconfig_notifier+0x8c/0x200
Call Trace:
[c0000013aed5fae0] [c0000000000ff4cc] iommu_reconfig_notifier+0x8c/0x200 (unreliable)
[c0000013aed5fb10] [c0000000001a27b0] notifier_call_chain+0xb8/0x19c
[c0000013aed5fb70] [c0000000001a2a78] blocking_notifier_call_chain+0x64/0x98
[c0000013aed5fbb0] [c000000000c4a898] of_reconfig_notify+0x44/0xdc
[c0000013aed5fc20] [c000000000c4add4] of_detach_node+0x78/0xb0
[c0000013aed5fc70] [c0000000000f96a8] ofdt_write.part.0+0x86c/0xbb8
[c0000013aed5fce0] [c00000000069b4bc] proc_reg_write+0xf4/0x150
[c0000013aed5fd10] [c0000000005bfeb4] vfs_write+0xf8/0x488
[c0000013aed5fdc0] [c0000000005c0570] ksys_write+0x84/0x140
[c0000013aed5fe10] [c000000000033358] system_call_exception+0x138/0x330
[c0000013aed5fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
--- interrupt: 3000 at 0x20000433acb4
<snip>
---[ end trace 0000000000000000 ]---

The patch adds the missing iommu_group_put() call.

Fixes: a8ca9fc9134c ("powerpc/iommu: Do not do platform domain attach atctions after probe")
Signed-off-by: Shivaprasad G Bhat <[email protected]>
---
arch/powerpc/kernel/iommu.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/iommu.c b/arch/powerpc/kernel/iommu.c
index d71eac3b2887..a9bebfd56b3b 100644
--- a/arch/powerpc/kernel/iommu.c
+++ b/arch/powerpc/kernel/iommu.c
@@ -1289,8 +1289,10 @@ spapr_tce_platform_iommu_attach_dev(struct iommu_domain *platform_domain,
 	struct iommu_table_group *table_group;
 
 	/* At first attach the ownership is already set */
-	if (!domain)
+	if (!domain) {
+		iommu_group_put(grp);
 		return 0;
+	}
 
 	table_group = iommu_group_get_iommudata(grp);
 	/*
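
The leak fixed above can be illustrated outside the kernel. What follows is a minimal userspace sketch, not kernel code: struct toy_group, toy_group_get(), toy_group_put() and the two toy_attach_*() functions are hypothetical stand-ins that only mimic the roles of struct iommu_group, iommu_group_get(), iommu_group_put() and spapr_tce_platform_iommu_attach_dev(). The have_domain flag is likewise an assumption standing in for the `domain` pointer check.

```c
/* Toy stand-in for a refcounted object; NOT the kernel's struct iommu_group. */
struct toy_group {
	int refcount;
};

/* Take a reference (plays the role of iommu_group_get()). */
static struct toy_group *toy_group_get(struct toy_group *g)
{
	if (g)
		g->refcount++;
	return g;
}

/* Drop a reference (plays the role of iommu_group_put()). */
static void toy_group_put(struct toy_group *g)
{
	if (g)
		g->refcount--;
}

/* Buggy shape: the early return skips the put and leaks one reference. */
static int toy_attach_leaky(struct toy_group *g, int have_domain)
{
	struct toy_group *grp = toy_group_get(g);

	if (!have_domain)
		return 0;	/* reference taken above is never dropped */

	toy_group_put(grp);
	return 0;
}

/* Fixed shape, mirroring the patch: put before the early return. */
static int toy_attach_fixed(struct toy_group *g, int have_domain)
{
	struct toy_group *grp = toy_group_get(g);

	if (!have_domain) {
		toy_group_put(grp);
		return 0;
	}

	toy_group_put(grp);
	return 0;
}
```

Calling toy_attach_leaky() with have_domain == 0 leaves the counter one higher than before the call, while toy_attach_fixed() leaves it unchanged on both paths. In the kernel, a reference that is never dropped keeps the group alive, which is consistent with the BUG_ON() reported during DLPAR remove.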




2024-02-13 17:21:45

by Jason Gunthorpe

Subject: Re: [PATCH] powerpc/iommu: Fix the missing iommu_group_put() during platform domain attach

On Tue, Feb 13, 2024 at 10:05:22AM -0600, Shivaprasad G Bhat wrote:
> The function spapr_tce_platform_iommu_attach_dev() does not call
> iommu_group_put() when the domain is already set. This refcount leak
> shows up as a BUG_ON() during a DLPAR remove operation:
>
> [...]
>
> The patch adds the missing iommu_group_put() call.
>
> Fixes: a8ca9fc9134c ("powerpc/iommu: Do not do platform domain attach atctions after probe")
> Signed-off-by: Shivaprasad G Bhat <[email protected]>
> ---
> arch/powerpc/kernel/iommu.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)

Doh, that is a weird splat for this but thanks for finding it

Reviewed-by: Jason Gunthorpe <[email protected]>

Jason

2024-02-14 08:09:51

by Venkat Rao Bagalkote

Subject: Re: [PATCH] powerpc/iommu: Fix the missing iommu_group_put() during platform domain attach

Thanks for the patch. Applied this patch and verified that the issue is fixed.

This issue was originally reported in the mail below.

https://marc.info/?l=linux-kernel&m=170737160630106&w=2


Tested-by: Venkat Rao Bagalkote <[email protected]>


2024-02-14 12:56:13

by Michael Ellerman

Subject: Re: [PATCH] powerpc/iommu: Fix the missing iommu_group_put() during platform domain attach

Venkat Rao Bagalkote <[email protected]> writes:
> Thanks for the patch. Applied this patch and verified that the issue is fixed.
>
> This issue was originally reported in the mail below.
>
> https://marc.info/?l=linux-kernel&m=170737160630106&w=2

Please use lore for links, in this case:

https://lore.kernel.org/all/[email protected]/

cheers


2024-02-14 12:58:32

by Jason Gunthorpe

Subject: Re: [PATCH] powerpc/iommu: Fix the missing iommu_group_put() during platform domain attach

On Wed, Feb 14, 2024 at 11:53:20PM +1100, Michael Ellerman wrote:
> Venkat Rao Bagalkote <[email protected]> writes:
> > Thanks for the patch. Applied this patch and verified that the issue is fixed.
> >
> > This issue was originally reported in the mail below.
> >
> > https://marc.info/?l=linux-kernel&m=170737160630106&w=2
>
> Please use lore for links, in this case:
>
> https://lore.kernel.org/all/[email protected]/

Also if you are respinning you may prefer this

@@ -1285,14 +1285,15 @@ spapr_tce_platform_iommu_attach_dev(struct iommu_domain *platform_domain,
 				    struct device *dev)
 {
 	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
-	struct iommu_group *grp = iommu_group_get(dev);
 	struct iommu_table_group *table_group;
+	struct iommu_group *grp;
 	int ret = -EINVAL;
 
 	/* At first attach the ownership is already set */
 	if (!domain)
 		return 0;
 
+	grp = iommu_group_get(dev);
 	if (!grp)
 		return -ENODEV;

Which is sort of why this happened in the first place :)

Jason
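
The suggestion above avoids the leak by construction: the reference is only taken after the early return, so that path has nothing to release. A minimal userspace sketch of that ordering follows. It is a toy model under stated assumptions, not the kernel implementation: struct toy_group, toy_group_get(), toy_group_put(), toy_attach_deferred() and the have_domain flag are hypothetical names used purely for illustration.

```c
#include <errno.h>

/* Toy stand-in for a refcounted object; NOT the kernel's struct iommu_group. */
struct toy_group {
	int refcount;
};

/* Take a reference; returns NULL if there is no group. */
static struct toy_group *toy_group_get(struct toy_group *g)
{
	if (g)
		g->refcount++;
	return g;
}

/* Drop a reference. */
static void toy_group_put(struct toy_group *g)
{
	if (g)
		g->refcount--;
}

/* Acquire late: the early return happens before any reference is held. */
static int toy_attach_deferred(struct toy_group *g, int have_domain)
{
	struct toy_group *grp;

	if (!have_domain)
		return 0;	/* nothing acquired yet, nothing to release */

	grp = toy_group_get(g);
	if (!grp)
		return -ENODEV;

	/* ... work that needs the group would go here ... */

	toy_group_put(grp);
	return 0;
}
```

Compared with adding a put on the early-return path, this ordering removes the exit path that needed cleanup in the first place, which is the point of the remark about why the bug happened.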

2024-02-14 18:15:58

by Shivaprasad G Bhat

Subject: Re: [PATCH] powerpc/iommu: Fix the missing iommu_group_put() during platform domain attach


On 2/14/24 18:28, Jason Gunthorpe wrote:
> On Wed, Feb 14, 2024 at 11:53:20PM +1100, Michael Ellerman wrote:
>> Venkat Rao Bagalkote <[email protected]> writes:
>>> Thanks for the patch. Applied this patch and verified that the issue is fixed.
>>>
>>> This issue was originally reported in the mail below.
>>>
>>> https://marc.info/?l=linux-kernel&m=170737160630106&w=2
>> Please use lore for links, in this case:
>>
>> https://lore.kernel.org/all/[email protected]/
> Also if you are respinning you may prefer this
>
> @@ -1285,14 +1285,15 @@ spapr_tce_platform_iommu_attach_dev(struct iommu_domain *platform_domain,
>  				    struct device *dev)
>  {
>  	struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> -	struct iommu_group *grp = iommu_group_get(dev);
>  	struct iommu_table_group *table_group;
> +	struct iommu_group *grp;
>  	int ret = -EINVAL;
>
>  	/* At first attach the ownership is already set */
>  	if (!domain)
>  		return 0;
>
> +	grp = iommu_group_get(dev);
>  	if (!grp)
>  		return -ENODEV;
>
> Which is sort of why this happened in the first place :)

Right! Posted the v2 here

https://lore.kernel.org/linux-iommu/[email protected]/


Thanks,

Shivaprasad

> Jason

2024-02-15 13:01:24

by Michael Ellerman

Subject: Re: [PATCH] powerpc/iommu: Fix the missing iommu_group_put() during platform domain attach

On Tue, 13 Feb 2024 10:05:22 -0600, Shivaprasad G Bhat wrote:
> The function spapr_tce_platform_iommu_attach_dev() does not call
> iommu_group_put() when the domain is already set. This refcount leak
> shows up as a BUG_ON() during a DLPAR remove operation:
>
> [...]

Applied to powerpc/fixes.

[1/1] powerpc/iommu: Fix the missing iommu_group_put() during platform domain attach
https://git.kernel.org/powerpc/c/0846dd77c8349ec92ca0079c9c71d130f34cb192

cheers