2022-07-21 00:42:23

by Jerry Snitselaar

[permalink] [raw]
Subject: [PATCH] iommu/amd: Disable guest vapic logging during early kdump init

With commit a8d4a37d1bb9 ("iommu/amd: Restore GA log/tail pointer on
host resume"), there now is a WARN_ON in iommu_ga_log_enable that
triggers if the GALOG_RUN bit is set in the MMIO status register, and
this warning has been reported when a kdump kernel is booting.

We probably don't really care about guest vapic logging during kdump,
but the proper thing seems to be disabling the GALOG_EN bit in the
MMIO control register in the kdump code path of
early_enable_iommus(). The iommu pci init code is going to set the ga
log address pointers to new values in the amd_iommu struct, and these
aren't pushed to the iommu until after the warning and return in
iommu_ga_log_enable(). So worst case, the GALOG bits are set and the
iommu has stale pointers to the ga log addresses. So disable GALOG_EN
in early_enable_iommus() and allow iommu_ga_log_enable() to set it up
properly and enable again.

Without the fix you will see the following warning when the kdump
kernel boots:

[ 7.731955] ------------[ cut here ]------------
[ 7.736575] WARNING: CPU: 0 PID: 1 at drivers/iommu/amd/init.c:829 iommu_ga_log_enable.isra.0+0x16f/0x190
[ 7.746135] Modules linked in:
[ 7.749193] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G W -------- --- 5.19.0-0.rc7.53.eln120.x86_64 #1
[ 7.759706] Hardware name: Dell Inc. PowerEdge R7525/04D5GJ, BIOS 2.1.6 03/09/2021
[ 7.767274] RIP: 0010:iommu_ga_log_enable.isra.0+0x16f/0x190
[ 7.772931] Code: 20 20 00 00 8b 00 f6 c4 01 74 da 48 8b 44 24 08 65 48 2b 04 25 28 00 00 00 75 13 48 83 c4 10 5b 5d e9 f5 00 72 00 0f 0b eb e1 <0f> 0b eb dd e8 f8 66 42 00 48 8b 15 f1 85 53 01 e9 29 ff ff ff 48
[ 7.791679] RSP: 0018:ffffc90000107d20 EFLAGS: 00010206
[ 7.796905] RAX: ffffc90000780000 RBX: 0000000000000100 RCX: ffffc90000780000
[ 7.804038] RDX: 0000000000000001 RSI: ffffc90000780000 RDI: ffff8880451f9800
[ 7.811170] RBP: ffff8880451f9800 R08: ffffffffffffffff R09: 0000000000000000
[ 7.818303] R10: 0000000000000000 R11: 0000000000000000 R12: 0008000000000000
[ 7.825435] R13: ffff8880462ea900 R14: 0000000000000021 R15: 0000000000000000
[ 7.832572] FS: 0000000000000000(0000) GS:ffff888054a00000(0000) knlGS:0000000000000000
[ 7.840657] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 7.846400] CR2: ffff888054dff000 CR3: 0000000053210000 CR4: 0000000000350eb0
[ 7.853533] Call Trace:
[ 7.855979] <TASK>
[ 7.858085] amd_iommu_enable_interrupts+0x180/0x270
[ 7.863051] ? iommu_setup+0x271/0x271
[ 7.866803] state_next+0x197/0x2c0
[ 7.870295] ? iommu_setup+0x271/0x271
[ 7.874049] iommu_go_to_state+0x24/0x2c
[ 7.877976] amd_iommu_init+0xf/0x29
[ 7.881554] pci_iommu_init+0xe/0x36
[ 7.885133] do_one_initcall+0x44/0x200
[ 7.888975] do_initcalls+0xc8/0xe1
[ 7.892466] kernel_init_freeable+0x14c/0x199
[ 7.896826] ? rest_init+0xd0/0xd0
[ 7.900231] kernel_init+0x16/0x130
[ 7.903723] ret_from_fork+0x22/0x30
[ 7.907306] </TASK>
[ 7.909497] ---[ end trace 0000000000000000 ]---

Fixes: 3ac3e5ee5ed5 ("iommu/amd: Copy old trans table from old kernel")
Cc: Joerg Roedel <[email protected]>
Cc: Suravee Suthikulpanit <[email protected]>
Cc: Will Deacon <[email protected]> (maintainer:IOMMU DRIVERS)
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Jerry Snitselaar <[email protected]>
---
drivers/iommu/amd/init.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 1d08f87e734b..2b00d7f28df7 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -815,6 +815,11 @@ static void free_ga_log(struct amd_iommu *iommu)
#endif
}

+static void iommu_ga_log_disable(struct amd_iommu *iommu)
+{
+ iommu_feature_disable(iommu, CONTROL_GALOG_EN);
+}
+
static int iommu_ga_log_enable(struct amd_iommu *iommu)
{
#ifdef CONFIG_IRQ_REMAP
@@ -2504,6 +2509,7 @@ static void early_enable_iommus(void)
for_each_iommu(iommu) {
iommu_disable_command_buffer(iommu);
iommu_disable_event_buffer(iommu);
+ iommu_ga_log_disable(iommu);
iommu_enable_command_buffer(iommu);
iommu_enable_event_buffer(iommu);
iommu_enable_ga(iommu);

base-commit: ca85855bdcae8f84f1512e88b4c75009ea17ea2f
--
2.36.1


2022-07-22 15:02:14

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH] iommu/amd: Disable guest vapic logging during early kdump init

On Wed, Jul 20, 2022 at 05:34:39PM -0700, Jerry Snitselaar wrote:
> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> index 1d08f87e734b..2b00d7f28df7 100644
> --- a/drivers/iommu/amd/init.c
> +++ b/drivers/iommu/amd/init.c
> @@ -815,6 +815,11 @@ static void free_ga_log(struct amd_iommu *iommu)
> #endif
> }
>
> +static void iommu_ga_log_disable(struct amd_iommu *iommu)
> +{
> + iommu_feature_disable(iommu, CONTROL_GALOG_EN);
> +}
> +
> static int iommu_ga_log_enable(struct amd_iommu *iommu)
> {
> #ifdef CONFIG_IRQ_REMAP
> @@ -2504,6 +2509,7 @@ static void early_enable_iommus(void)
> for_each_iommu(iommu) {
> iommu_disable_command_buffer(iommu);
> iommu_disable_event_buffer(iommu);
> + iommu_ga_log_disable(iommu);
> iommu_enable_command_buffer(iommu);
> iommu_enable_event_buffer(iommu);
> iommu_enable_ga(iommu);

Looks about right, but I also let Suravee comment on this.

Disabling the GA-Log under a device still using it should hopefully not
put it into some undefined state.

2022-07-23 16:55:15

by Jerry Snitselaar

[permalink] [raw]
Subject: Re: [PATCH] iommu/amd: Disable guest vapic logging during early kdump init

On Fri, Jul 22, 2022 at 04:37:13PM +0200, Joerg Roedel wrote:
> On Wed, Jul 20, 2022 at 05:34:39PM -0700, Jerry Snitselaar wrote:
> > diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
> > index 1d08f87e734b..2b00d7f28df7 100644
> > --- a/drivers/iommu/amd/init.c
> > +++ b/drivers/iommu/amd/init.c
> > @@ -815,6 +815,11 @@ static void free_ga_log(struct amd_iommu *iommu)
> > #endif
> > }
> >
> > +static void iommu_ga_log_disable(struct amd_iommu *iommu)
> > +{
> > + iommu_feature_disable(iommu, CONTROL_GALOG_EN);
> > +}
> > +
> > static int iommu_ga_log_enable(struct amd_iommu *iommu)
> > {
> > #ifdef CONFIG_IRQ_REMAP
> > @@ -2504,6 +2509,7 @@ static void early_enable_iommus(void)
> > for_each_iommu(iommu) {
> > iommu_disable_command_buffer(iommu);
> > iommu_disable_event_buffer(iommu);
> > + iommu_ga_log_disable(iommu);
> > iommu_enable_command_buffer(iommu);
> > iommu_enable_event_buffer(iommu);
> > iommu_enable_ga(iommu);
>
> Looks about right, but I also let Suravee comment on this.
>
> Disabling the GA-Log under a device still using it should hopefully not
> put it into some undefined state.

Hi Joerg,

Yes, I sent email to Suravee a few days ago as well to see what he thought,
but haven't heard back yet. It might be a couple days as I think he is
overseas at the moment.

Minor, stupid thing, but I wasn't sure whether it would be better to
name it iommu_disable_ga_log() like everything else called here, or
name it in the format of the other functions relating to logs. I opted
for the latter.

Did you see my other email about kdump kernels on vt-d with scalable
mode and passthrough enabled?

https://lore.kernel.org/linux-iommu/20220702154956.x2dxqalzjuzjraxk@cantor/

I've been looking at it, but haven't figured out what to do about tracking
the copying if can't borrow a bit in one of the pasid fields now. Any thoughts
on that?

Regards,
Jerry

2022-07-25 07:14:02

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [PATCH] iommu/amd: Disable guest vapic logging during early kdump init

Joeg / Jerry,

On 7/22/22 9:37 PM, Joerg Roedel wrote:
> On Wed, Jul 20, 2022 at 05:34:39PM -0700, Jerry Snitselaar wrote:
>> diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
>> index 1d08f87e734b..2b00d7f28df7 100644
>> --- a/drivers/iommu/amd/init.c
>> +++ b/drivers/iommu/amd/init.c
>> @@ -815,6 +815,11 @@ static void free_ga_log(struct amd_iommu *iommu)
>> #endif
>> }
>>
>> +static void iommu_ga_log_disable(struct amd_iommu *iommu)
>> +{
>> + iommu_feature_disable(iommu, CONTROL_GALOG_EN);
>> +}
>> +
>> static int iommu_ga_log_enable(struct amd_iommu *iommu)
>> {
>> #ifdef CONFIG_IRQ_REMAP
>> @@ -2504,6 +2509,7 @@ static void early_enable_iommus(void)
>> for_each_iommu(iommu) {
>> iommu_disable_command_buffer(iommu);
>> iommu_disable_event_buffer(iommu);
>> + iommu_ga_log_disable(iommu);
>> iommu_enable_command_buffer(iommu);
>> iommu_enable_event_buffer(iommu);
>> iommu_enable_ga(iommu);
>
> Looks about right, but I also let Suravee comment on this.
>
> Disabling the GA-Log under a device still using it should hopefully not
> put it into some undefined state.

Sorry for late reply. I have been actually working on the new GAM and GALOG enabling code,
which should also address this issue as well. I'll send out the patch soon.

Regards,
Suravee

2022-07-26 07:21:52

by Joerg Roedel

[permalink] [raw]
Subject: Re: [PATCH] iommu/amd: Disable guest vapic logging during early kdump init

Hi Suravee,

On Mon, Jul 25, 2022 at 02:05:59PM +0700, Suravee Suthikulpanit wrote:
> Sorry for late reply. I have been actually working on the new GAM and GALOG enabling code,
> which should also address this issue as well. I'll send out the patch soon.

Okay, how about basing your changes on Jerry's fix? A backportable fix
for a real issue is always better than a bigger rework (which can still
happen on-top).

Regards,

Joerg

2022-07-26 14:14:37

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [PATCH] iommu/amd: Disable guest vapic logging during early kdump init



On 7/26/22 2:12 PM, Joerg Roedel wrote:
> Hi Suravee,
>
> On Mon, Jul 25, 2022 at 02:05:59PM +0700, Suravee Suthikulpanit wrote:
>> Sorry for late reply. I have been actually working on the new GAM and GALOG enabling code,
>> which should also address this issue as well. I'll send out the patch soon.
>
> Okay, how about basing your changes on Jerry's fix? A backportable fix
> for a real issue is always better than a bigger rework (which can still
> happen on-top).
>
> Regards,
>
> Joerg

Ahh... I have just sent out the new patch

https://lore.kernel.org/lkml/[email protected]/

Please lemme know in case you want me to rebase this on top of Jerry's.

Suravee