From: Arnd Bergmann <[email protected]>
Using 'imply AMD_IOMMU_V2' does not guarantee that the driver can link
against the exported functions. If the GPU driver is built-in but the
IOMMU driver is a loadable module, the kfd_iommu.c file is indeed
built but does not work:
x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_bind_process_to_device':
kfd_iommu.c:(.text+0x516): undefined reference to `amd_iommu_bind_pasid'
x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_unbind_process':
kfd_iommu.c:(.text+0x691): undefined reference to `amd_iommu_unbind_pasid'
x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_suspend':
kfd_iommu.c:(.text+0x966): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0x97f): undefined reference to `amd_iommu_set_invalid_ppr_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0x9a4): undefined reference to `amd_iommu_free_device'
x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_resume':
kfd_iommu.c:(.text+0xa9a): undefined reference to `amd_iommu_init_device'
x86_64-linux-ld: kfd_iommu.c:(.text+0xadc): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0xaff): undefined reference to `amd_iommu_set_invalid_ppr_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0xc72): undefined reference to `amd_iommu_bind_pasid'
x86_64-linux-ld: kfd_iommu.c:(.text+0xe08): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0xe26): undefined reference to `amd_iommu_set_invalid_ppr_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0xe42): undefined reference to `amd_iommu_free_device'
Change the 'imply' to a weak dependency that still allows compiling
in all other configurations but disallows the configuration that
causes a link failure.
Fixes: 64d1c3a43a6f ("drm/amdkfd: Centralize IOMMUv2 code and make it conditional")
Signed-off-by: Arnd Bergmann <[email protected]>
---
drivers/gpu/drm/amd/amdkfd/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig b/drivers/gpu/drm/amd/amdkfd/Kconfig
index f02c938f75da..d01dba2af3bb 100644
--- a/drivers/gpu/drm/amd/amdkfd/Kconfig
+++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
@@ -6,7 +6,7 @@
config HSA_AMD
bool "HSA kernel driver for AMD GPU devices"
depends on DRM_AMDGPU && (X86_64 || ARM64 || PPC64)
- imply AMD_IOMMU_V2 if X86_64
+ depends on AMD_IOMMU_V2=y || DRM_AMDGPU=m
select HMM_MIRROR
select MMU_NOTIFIER
select DRM_AMDGPU_USERPTR
--
2.29.2
Am 2021-03-08 um 3:45 p.m. schrieb Arnd Bergmann:
> From: Arnd Bergmann <[email protected]>
>
> Using 'imply AMD_IOMMU_V2' does not guarantee that the driver can link
> against the exported functions. If the GPU driver is built-in but the
> IOMMU driver is a loadable module, the kfd_iommu.c file is indeed
> built but does not work:
>
> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_bind_process_to_device':
> kfd_iommu.c:(.text+0x516): undefined reference to `amd_iommu_bind_pasid'
> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_unbind_process':
> kfd_iommu.c:(.text+0x691): undefined reference to `amd_iommu_unbind_pasid'
> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_suspend':
> kfd_iommu.c:(.text+0x966): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0x97f): undefined reference to `amd_iommu_set_invalid_ppr_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0x9a4): undefined reference to `amd_iommu_free_device'
> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_resume':
> kfd_iommu.c:(.text+0xa9a): undefined reference to `amd_iommu_init_device'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xadc): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xaff): undefined reference to `amd_iommu_set_invalid_ppr_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xc72): undefined reference to `amd_iommu_bind_pasid'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xe08): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xe26): undefined reference to `amd_iommu_set_invalid_ppr_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xe42): undefined reference to `amd_iommu_free_device'
>
> Change the 'imply' to a weak dependency that still allows compiling
> in all other configurations but disallows the configuration that
> causes a link failure.
I don't like this solution. It hides the HSA_AMD option in menuconfig
and it's not intuitively obvious to someone configuring a kernel why
it's not available. They may not even notice that it's missing and then
wonder later, why KFD functionality is missing in their kernel.
What I'm trying to achieve is, that KFD can be compiled without
AMD-IOMMU-V2 support in this case. I just tested my version using
IS_REACHABLE. I'll reply with that patch in a moment.
Regards,
Felix
>
> Fixes: 64d1c3a43a6f ("drm/amdkfd: Centralize IOMMUv2 code and make it conditional")
> Signed-off-by: Arnd Bergmann <[email protected]>
> ---
> drivers/gpu/drm/amd/amdkfd/Kconfig | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/Kconfig b/drivers/gpu/drm/amd/amdkfd/Kconfig
> index f02c938f75da..d01dba2af3bb 100644
> --- a/drivers/gpu/drm/amd/amdkfd/Kconfig
> +++ b/drivers/gpu/drm/amd/amdkfd/Kconfig
> @@ -6,7 +6,7 @@
> config HSA_AMD
> bool "HSA kernel driver for AMD GPU devices"
> depends on DRM_AMDGPU && (X86_64 || ARM64 || PPC64)
> - imply AMD_IOMMU_V2 if X86_64
> + depends on AMD_IOMMU_V2=y || DRM_AMDGPU=m
> select HMM_MIRROR
> select MMU_NOTIFIER
> select DRM_AMDGPU_USERPTR
Using 'imply AMD_IOMMU_V2' does not guarantee that the driver can link
against the exported functions. If the GPU driver is built-in but the
IOMMU driver is a loadable module, the kfd_iommu.c file is indeed
built but does not work:
x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_bind_process_to_device':
kfd_iommu.c:(.text+0x516): undefined reference to `amd_iommu_bind_pasid'
x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_unbind_process':
kfd_iommu.c:(.text+0x691): undefined reference to `amd_iommu_unbind_pasid'
x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_suspend':
kfd_iommu.c:(.text+0x966): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0x97f): undefined reference to `amd_iommu_set_invalid_ppr_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0x9a4): undefined reference to `amd_iommu_free_device'
x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_resume':
kfd_iommu.c:(.text+0xa9a): undefined reference to `amd_iommu_init_device'
x86_64-linux-ld: kfd_iommu.c:(.text+0xadc): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0xaff): undefined reference to `amd_iommu_set_invalid_ppr_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0xc72): undefined reference to `amd_iommu_bind_pasid'
x86_64-linux-ld: kfd_iommu.c:(.text+0xe08): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0xe26): undefined reference to `amd_iommu_set_invalid_ppr_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0xe42): undefined reference to `amd_iommu_free_device'
Use IS_REACHABLE to only build IOMMU-V2 support if the amd_iommu symbols
are reachable by the amdkfd driver. Output a warning if they are not,
because that may not be what the user was expecting.
Fixes: 64d1c3a43a6f ("drm/amdkfd: Centralize IOMMUv2 code and make it conditional")
Reported-by: Arnd Bergmann <[email protected]>
Signed-off-by: Felix Kuehling <[email protected]>
---
drivers/gpu/drm/amd/amdkfd/kfd_iommu.c | 10 ++++++++++
drivers/gpu/drm/amd/amdkfd/kfd_iommu.h | 6 ++++--
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
index 66bbca61e3ef..7199eb833f66 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
@@ -20,6 +20,10 @@
* OTHER DEALINGS IN THE SOFTWARE.
*/
+#include <linux/kconfig.h>
+
+#if IS_REACHABLE(CONFIG_AMD_IOMMU_V2)
+
#include <linux/printk.h>
#include <linux/device.h>
#include <linux/slab.h>
@@ -355,3 +359,9 @@ int kfd_iommu_add_perf_counters(struct kfd_topology_device *kdev)
return 0;
}
+
+#else
+
+#warning "Moldular IOMMU-V2 is not usable by built-in KFD"
+
+#endif
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h
index dd23d9fdf6a8..b25365fc2c4e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h
@@ -23,7 +23,9 @@
#ifndef __KFD_IOMMU_H__
#define __KFD_IOMMU_H__
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
+#include <linux/kconfig.h>
+
+#if IS_REACHABLE(CONFIG_AMD_IOMMU_V2)
#define KFD_SUPPORT_IOMMU_V2
@@ -73,6 +75,6 @@ static inline int kfd_iommu_add_perf_counters(struct kfd_topology_device *kdev)
return 0;
}
-#endif /* defined(CONFIG_AMD_IOMMU_V2) */
+#endif /* IS_REACHABLE(CONFIG_AMD_IOMMU_V2) */
#endif /* __KFD_IOMMU_H__ */
--
2.30.0
On Tue, Mar 9, 2021 at 4:23 AM Felix Kuehling <[email protected]> wrote:
>
> Using 'imply AMD_IOMMU_V2' does not guarantee that the driver can link
> against the exported functions. If the GPU driver is built-in but the
> IOMMU driver is a loadable module, the kfd_iommu.c file is indeed
> built but does not work:
>
> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_bind_process_to_device':
> kfd_iommu.c:(.text+0x516): undefined reference to `amd_iommu_bind_pasid'
> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_unbind_process':
> kfd_iommu.c:(.text+0x691): undefined reference to `amd_iommu_unbind_pasid'
> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_suspend':
> kfd_iommu.c:(.text+0x966): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0x97f): undefined reference to `amd_iommu_set_invalid_ppr_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0x9a4): undefined reference to `amd_iommu_free_device'
> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_resume':
> kfd_iommu.c:(.text+0xa9a): undefined reference to `amd_iommu_init_device'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xadc): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xaff): undefined reference to `amd_iommu_set_invalid_ppr_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xc72): undefined reference to `amd_iommu_bind_pasid'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xe08): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xe26): undefined reference to `amd_iommu_set_invalid_ppr_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xe42): undefined reference to `amd_iommu_free_device'
>
> Use IS_REACHABLE to only build IOMMU-V2 support if the amd_iommu symbols
> are reachable by the amdkfd driver. Output a warning if they are not,
> because that may not be what the user was expecting.
This would fix the compile-time failure, but it still fails my CI
because you introduce
a compile-time warning. Can you change it into a runtime warning instead?
Neither type of warning is likely to actually reach the person trying
to debug the
problem, so you still end up with machines that don't do what they should,
but I can live with the runtime warning as long as the build doesn't break.
I think the proper fix would be to not rely on custom hooks into a particular
IOMMU driver, but to instead ensure that the amdgpu driver can do everything
it needs through the regular linux/iommu.h interfaces. I realize this
is more work,
but I wonder if you've tried that, and why it didn't work out.
Arnd
Am 2021-03-09 um 3:58 a.m. schrieb Arnd Bergmann:
> On Tue, Mar 9, 2021 at 4:23 AM Felix Kuehling <[email protected]> wrote:
>> Using 'imply AMD_IOMMU_V2' does not guarantee that the driver can link
>> against the exported functions. If the GPU driver is built-in but the
>> IOMMU driver is a loadable module, the kfd_iommu.c file is indeed
>> built but does not work:
>>
>> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_bind_process_to_device':
>> kfd_iommu.c:(.text+0x516): undefined reference to `amd_iommu_bind_pasid'
>> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_unbind_process':
>> kfd_iommu.c:(.text+0x691): undefined reference to `amd_iommu_unbind_pasid'
>> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_suspend':
>> kfd_iommu.c:(.text+0x966): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
>> x86_64-linux-ld: kfd_iommu.c:(.text+0x97f): undefined reference to `amd_iommu_set_invalid_ppr_cb'
>> x86_64-linux-ld: kfd_iommu.c:(.text+0x9a4): undefined reference to `amd_iommu_free_device'
>> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_resume':
>> kfd_iommu.c:(.text+0xa9a): undefined reference to `amd_iommu_init_device'
>> x86_64-linux-ld: kfd_iommu.c:(.text+0xadc): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
>> x86_64-linux-ld: kfd_iommu.c:(.text+0xaff): undefined reference to `amd_iommu_set_invalid_ppr_cb'
>> x86_64-linux-ld: kfd_iommu.c:(.text+0xc72): undefined reference to `amd_iommu_bind_pasid'
>> x86_64-linux-ld: kfd_iommu.c:(.text+0xe08): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
>> x86_64-linux-ld: kfd_iommu.c:(.text+0xe26): undefined reference to `amd_iommu_set_invalid_ppr_cb'
>> x86_64-linux-ld: kfd_iommu.c:(.text+0xe42): undefined reference to `amd_iommu_free_device'
>>
>> Use IS_REACHABLE to only build IOMMU-V2 support if the amd_iommu symbols
>> are reachable by the amdkfd driver. Output a warning if they are not,
>> because that may not be what the user was expecting.
> This would fix the compile-time failure, but it still fails my CI
> because you introduce
> a compile-time warning. Can you change it into a runtime warning instead?
Sure.
>
> Neither type of warning is likely to actually reach the person trying
> to debug the
> problem, so you still end up with machines that don't do what they should,
> but I can live with the runtime warning as long as the build doesn't break.
OK.
>
> I think the proper fix would be to not rely on custom hooks into a particular
> IOMMU driver, but to instead ensure that the amdgpu driver can do everything
> it needs through the regular linux/iommu.h interfaces. I realize this
> is more work,
> but I wonder if you've tried that, and why it didn't work out.
As far as I know this hasn't been tried. I see that intel-iommu has its
own SVM thing, which seems to be similar to what our IOMMUv2 does. I
guess we'd have to abstract that into a common API.
Regards,
Felix
>
> Arnd
Using 'imply AMD_IOMMU_V2' does not guarantee that the driver can link
against the exported functions. If the GPU driver is built-in but the
IOMMU driver is a loadable module, the kfd_iommu.c file is indeed
built but does not work:
x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_bind_process_to_device':
kfd_iommu.c:(.text+0x516): undefined reference to `amd_iommu_bind_pasid'
x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_unbind_process':
kfd_iommu.c:(.text+0x691): undefined reference to `amd_iommu_unbind_pasid'
x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_suspend':
kfd_iommu.c:(.text+0x966): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0x97f): undefined reference to `amd_iommu_set_invalid_ppr_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0x9a4): undefined reference to `amd_iommu_free_device'
x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_resume':
kfd_iommu.c:(.text+0xa9a): undefined reference to `amd_iommu_init_device'
x86_64-linux-ld: kfd_iommu.c:(.text+0xadc): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0xaff): undefined reference to `amd_iommu_set_invalid_ppr_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0xc72): undefined reference to `amd_iommu_bind_pasid'
x86_64-linux-ld: kfd_iommu.c:(.text+0xe08): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0xe26): undefined reference to `amd_iommu_set_invalid_ppr_cb'
x86_64-linux-ld: kfd_iommu.c:(.text+0xe42): undefined reference to `amd_iommu_free_device'
Use IS_REACHABLE to only build IOMMU-V2 support if the amd_iommu symbols
are reachable by the amdkfd driver. Output a warning if they are not,
because that may not be what the user was expecting.
Fixes: 64d1c3a43a6f ("drm/amdkfd: Centralize IOMMUv2 code and make it conditional")
Reported-by: Arnd Bergmann <[email protected]>
Signed-off-by: Felix Kuehling <[email protected]>
---
drivers/gpu/drm/amd/amdkfd/kfd_iommu.c | 6 ++++++
drivers/gpu/drm/amd/amdkfd/kfd_iommu.h | 9 +++++++--
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
index 66bbca61e3ef..9318936aa805 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
@@ -20,6 +20,10 @@
* OTHER DEALINGS IN THE SOFTWARE.
*/
+#include <linux/kconfig.h>
+
+#if IS_REACHABLE(CONFIG_AMD_IOMMU_V2)
+
#include <linux/printk.h>
#include <linux/device.h>
#include <linux/slab.h>
@@ -355,3 +359,5 @@ int kfd_iommu_add_perf_counters(struct kfd_topology_device *kdev)
return 0;
}
+
+#endif
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h
index dd23d9fdf6a8..afd420b01a0c 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h
@@ -23,7 +23,9 @@
#ifndef __KFD_IOMMU_H__
#define __KFD_IOMMU_H__
-#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
+#include <linux/kconfig.h>
+
+#if IS_REACHABLE(CONFIG_AMD_IOMMU_V2)
#define KFD_SUPPORT_IOMMU_V2
@@ -46,6 +48,9 @@ static inline int kfd_iommu_check_device(struct kfd_dev *kfd)
}
static inline int kfd_iommu_device_init(struct kfd_dev *kfd)
{
+#if IS_MODULE(CONFIG_AMD_IOMMU_V2)
+ WARN_ONCE(1, "iommu_v2 module is not usable by built-in KFD");
+#endif
return 0;
}
@@ -73,6 +78,6 @@ static inline int kfd_iommu_add_perf_counters(struct kfd_topology_device *kdev)
return 0;
}
-#endif /* defined(CONFIG_AMD_IOMMU_V2) */
+#endif /* IS_REACHABLE(CONFIG_AMD_IOMMU_V2) */
#endif /* __KFD_IOMMU_H__ */
--
2.30.0
Hi Felix,
On Tue, Mar 09, 2021 at 11:30:19AM -0500, Felix Kuehling wrote:
> > I think the proper fix would be to not rely on custom hooks into a particular
> > IOMMU driver, but to instead ensure that the amdgpu driver can do everything
> > it needs through the regular linux/iommu.h interfaces. I realize this
> > is more work,
> > but I wonder if you've tried that, and why it didn't work out.
>
> As far as I know this hasn't been tried. I see that intel-iommu has its
> own SVM thing, which seems to be similar to what our IOMMUv2 does. I
> guess we'd have to abstract that into a common API.
The common API was added in 26b25a2b98e4 and implemented by the Intel
driver in 064a57d7ddfc. To support it an IOMMU driver implements new IOMMU
ops:
.dev_has_feat()
.dev_feat_enabled()
.dev_enable_feat()
.dev_disable_feat()
.sva_bind()
.sva_unbind()
.sva_get_pasid()
And a device driver calls iommu_dev_enable_feature(IOMMU_DEV_FEAT_SVA)
followed by iommu_sva_bind_device().
If I remember correctly the biggest obstacle for KFD is the PASID
allocation, done by the GPU driver instead of the IOMMU driver, but there
may be others.
Thanks,
Jean
On Tue, Mar 9, 2021 at 12:55 PM Jean-Philippe Brucker
<[email protected]> wrote:
>
> Hi Felix,
>
> On Tue, Mar 09, 2021 at 11:30:19AM -0500, Felix Kuehling wrote:
> > > I think the proper fix would be to not rely on custom hooks into a particular
> > > IOMMU driver, but to instead ensure that the amdgpu driver can do everything
> > > it needs through the regular linux/iommu.h interfaces. I realize this
> > > is more work,
> > > but I wonder if you've tried that, and why it didn't work out.
> >
> > As far as I know this hasn't been tried. I see that intel-iommu has its
> > own SVM thing, which seems to be similar to what our IOMMUv2 does. I
> > guess we'd have to abstract that into a common API.
>
> The common API was added in 26b25a2b98e4 and implemented by the Intel
> driver in 064a57d7ddfc. To support it an IOMMU driver implements new IOMMU
> ops:
> .dev_has_feat()
> .dev_feat_enabled()
> .dev_enable_feat()
> .dev_disable_feat()
> .sva_bind()
> .sva_unbind()
> .sva_get_pasid()
>
> And a device driver calls iommu_dev_enable_feature(IOMMU_DEV_FEAT_SVA)
> followed by iommu_sva_bind_device().
>
> If I remember correctly the biggest obstacle for KFD is the PASID
> allocation, done by the GPU driver instead of the IOMMU driver, but there
> may be others.
IIRC, we tried to make the original IOMMUv2 functionality generic but
other vendors were not interested at the time, so it ended up being
AMD specific and since nothing else was using the pasid allocations we
put them in the GPU driver. I guess if this is generic now, it could
be moved to a common API and taken out of the driver.
Alex
Am 09.03.21 um 18:59 schrieb Alex Deucher:
> On Tue, Mar 9, 2021 at 12:55 PM Jean-Philippe Brucker
> <[email protected]> wrote:
>> Hi Felix,
>>
>> On Tue, Mar 09, 2021 at 11:30:19AM -0500, Felix Kuehling wrote:
>>>> I think the proper fix would be to not rely on custom hooks into a particular
>>>> IOMMU driver, but to instead ensure that the amdgpu driver can do everything
>>>> it needs through the regular linux/iommu.h interfaces. I realize this
>>>> is more work,
>>>> but I wonder if you've tried that, and why it didn't work out.
>>> As far as I know this hasn't been tried. I see that intel-iommu has its
>>> own SVM thing, which seems to be similar to what our IOMMUv2 does. I
>>> guess we'd have to abstract that into a common API.
>> The common API was added in 26b25a2b98e4 and implemented by the Intel
>> driver in 064a57d7ddfc. To support it an IOMMU driver implements new IOMMU
>> ops:
>> .dev_has_feat()
>> .dev_feat_enabled()
>> .dev_enable_feat()
>> .dev_disable_feat()
>> .sva_bind()
>> .sva_unbind()
>> .sva_get_pasid()
>>
>> And a device driver calls iommu_dev_enable_feature(IOMMU_DEV_FEAT_SVA)
>> followed by iommu_sva_bind_device().
>>
>> If I remember correctly the biggest obstacle for KFD is the PASID
>> allocation, done by the GPU driver instead of the IOMMU driver, but there
>> may be others.
> IIRC, we tried to make the original IOMMUv2 functionality generic but
> other vendors were not interested at the time, so it ended up being
> AMD specific and since nothing else was using the pasid allocations we
> put them in the GPU driver. I guess if this is generic now, it could
> be moved to a common API and taken out of the driver.
There has been quite some effort for this already for generic PASID
interface etc.. But it looks like that effort is stalled by now.
Anyway at least I'm perfectly fine to have the IOMMUv2 || !IOMMUv2
dependency on the core amdgpu driver for x86.
That should solve the build problem at hand quite nicely.
Regards,
Christian.
>
> Alex
On 2021-03-09 11:50 a.m., Felix Kuehling wrote:
> Using 'imply AMD_IOMMU_V2' does not guarantee that the driver can link
> against the exported functions. If the GPU driver is built-in but the
> IOMMU driver is a loadable module, the kfd_iommu.c file is indeed
> built but does not work:
>
> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_bind_process_to_device':
> kfd_iommu.c:(.text+0x516): undefined reference to `amd_iommu_bind_pasid'
> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_unbind_process':
> kfd_iommu.c:(.text+0x691): undefined reference to `amd_iommu_unbind_pasid'
> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_suspend':
> kfd_iommu.c:(.text+0x966): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0x97f): undefined reference to `amd_iommu_set_invalid_ppr_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0x9a4): undefined reference to `amd_iommu_free_device'
> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function `kfd_iommu_resume':
> kfd_iommu.c:(.text+0xa9a): undefined reference to `amd_iommu_init_device'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xadc): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xaff): undefined reference to `amd_iommu_set_invalid_ppr_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xc72): undefined reference to `amd_iommu_bind_pasid'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xe08): undefined reference to `amd_iommu_set_invalidate_ctx_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xe26): undefined reference to `amd_iommu_set_invalid_ppr_cb'
> x86_64-linux-ld: kfd_iommu.c:(.text+0xe42): undefined reference to `amd_iommu_free_device'
>
> Use IS_REACHABLE to only build IOMMU-V2 support if the amd_iommu symbols
> are reachable by the amdkfd driver. Output a warning if they are not,
> because that may not be what the user was expecting.
>
> Fixes: 64d1c3a43a6f ("drm/amdkfd: Centralize IOMMUv2 code and make it conditional")
> Reported-by: Arnd Bergmann <[email protected]>
> Signed-off-by: Felix Kuehling <[email protected]>
Ping. Can I get an R-b for this patch.
Thanks,
Felix
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_iommu.c | 6 ++++++
> drivers/gpu/drm/amd/amdkfd/kfd_iommu.h | 9 +++++++--
> 2 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
> index 66bbca61e3ef..9318936aa805 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
> @@ -20,6 +20,10 @@
> * OTHER DEALINGS IN THE SOFTWARE.
> */
>
> +#include <linux/kconfig.h>
> +
> +#if IS_REACHABLE(CONFIG_AMD_IOMMU_V2)
> +
> #include <linux/printk.h>
> #include <linux/device.h>
> #include <linux/slab.h>
> @@ -355,3 +359,5 @@ int kfd_iommu_add_perf_counters(struct kfd_topology_device *kdev)
>
> return 0;
> }
> +
> +#endif
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h
> index dd23d9fdf6a8..afd420b01a0c 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h
> @@ -23,7 +23,9 @@
> #ifndef __KFD_IOMMU_H__
> #define __KFD_IOMMU_H__
>
> -#if defined(CONFIG_AMD_IOMMU_V2_MODULE) || defined(CONFIG_AMD_IOMMU_V2)
> +#include <linux/kconfig.h>
> +
> +#if IS_REACHABLE(CONFIG_AMD_IOMMU_V2)
>
> #define KFD_SUPPORT_IOMMU_V2
>
> @@ -46,6 +48,9 @@ static inline int kfd_iommu_check_device(struct kfd_dev *kfd)
> }
> static inline int kfd_iommu_device_init(struct kfd_dev *kfd)
> {
> +#if IS_MODULE(CONFIG_AMD_IOMMU_V2)
> + WARN_ONCE(1, "iommu_v2 module is not usable by built-in KFD");
> +#endif
> return 0;
> }
>
> @@ -73,6 +78,6 @@ static inline int kfd_iommu_add_perf_counters(struct kfd_topology_device *kdev)
> return 0;
> }
>
> -#endif /* defined(CONFIG_AMD_IOMMU_V2) */
> +#endif /* IS_REACHABLE(CONFIG_AMD_IOMMU_V2) */
>
> #endif /* __KFD_IOMMU_H__ */
Am 10.03.21 um 23:13 schrieb Felix Kuehling:
> On 2021-03-09 11:50 a.m., Felix Kuehling wrote:
>> Using 'imply AMD_IOMMU_V2' does not guarantee that the driver can link
>> against the exported functions. If the GPU driver is built-in but the
>> IOMMU driver is a loadable module, the kfd_iommu.c file is indeed
>> built but does not work:
>>
>> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function
>> `kfd_iommu_bind_process_to_device':
>> kfd_iommu.c:(.text+0x516): undefined reference to `amd_iommu_bind_pasid'
>> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function
>> `kfd_iommu_unbind_process':
>> kfd_iommu.c:(.text+0x691): undefined reference to
>> `amd_iommu_unbind_pasid'
>> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function
>> `kfd_iommu_suspend':
>> kfd_iommu.c:(.text+0x966): undefined reference to
>> `amd_iommu_set_invalidate_ctx_cb'
>> x86_64-linux-ld: kfd_iommu.c:(.text+0x97f): undefined reference to
>> `amd_iommu_set_invalid_ppr_cb'
>> x86_64-linux-ld: kfd_iommu.c:(.text+0x9a4): undefined reference to
>> `amd_iommu_free_device'
>> x86_64-linux-ld: drivers/gpu/drm/amd/amdkfd/kfd_iommu.o: in function
>> `kfd_iommu_resume':
>> kfd_iommu.c:(.text+0xa9a): undefined reference to
>> `amd_iommu_init_device'
>> x86_64-linux-ld: kfd_iommu.c:(.text+0xadc): undefined reference to
>> `amd_iommu_set_invalidate_ctx_cb'
>> x86_64-linux-ld: kfd_iommu.c:(.text+0xaff): undefined reference to
>> `amd_iommu_set_invalid_ppr_cb'
>> x86_64-linux-ld: kfd_iommu.c:(.text+0xc72): undefined reference to
>> `amd_iommu_bind_pasid'
>> x86_64-linux-ld: kfd_iommu.c:(.text+0xe08): undefined reference to
>> `amd_iommu_set_invalidate_ctx_cb'
>> x86_64-linux-ld: kfd_iommu.c:(.text+0xe26): undefined reference to
>> `amd_iommu_set_invalid_ppr_cb'
>> x86_64-linux-ld: kfd_iommu.c:(.text+0xe42): undefined reference to
>> `amd_iommu_free_device'
>>
>> Use IS_REACHABLE to only build IOMMU-V2 support if the amd_iommu symbols
>> are reachable by the amdkfd driver. Output a warning if they are not,
>> because that may not be what the user was expecting.
>>
>> Fixes: 64d1c3a43a6f ("drm/amdkfd: Centralize IOMMUv2 code and make it
>> conditional")
>> Reported-by: Arnd Bergmann <[email protected]>
>> Signed-off-by: Felix Kuehling <[email protected]>
> Ping. Can I get an R-b for this patch.
Reviewed-by: Christian König <[email protected]>
>
> Thanks,
> Felix
>
>
>> ---
>> drivers/gpu/drm/amd/amdkfd/kfd_iommu.c | 6 ++++++
>> drivers/gpu/drm/amd/amdkfd/kfd_iommu.h | 9 +++++++--
>> 2 files changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
>> b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
>> index 66bbca61e3ef..9318936aa805 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.c
>> @@ -20,6 +20,10 @@
>> * OTHER DEALINGS IN THE SOFTWARE.
>> */
>> +#include <linux/kconfig.h>
>> +
>> +#if IS_REACHABLE(CONFIG_AMD_IOMMU_V2)
>> +
>> #include <linux/printk.h>
>> #include <linux/device.h>
>> #include <linux/slab.h>
>> @@ -355,3 +359,5 @@ int kfd_iommu_add_perf_counters(struct
>> kfd_topology_device *kdev)
>> return 0;
>> }
>> +
>> +#endif
>> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h
>> b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h
>> index dd23d9fdf6a8..afd420b01a0c 100644
>> --- a/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h
>> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_iommu.h
>> @@ -23,7 +23,9 @@
>> #ifndef __KFD_IOMMU_H__
>> #define __KFD_IOMMU_H__
>> -#if defined(CONFIG_AMD_IOMMU_V2_MODULE) ||
>> defined(CONFIG_AMD_IOMMU_V2)
>> +#include <linux/kconfig.h>
>> +
>> +#if IS_REACHABLE(CONFIG_AMD_IOMMU_V2)
>> #define KFD_SUPPORT_IOMMU_V2
>> @@ -46,6 +48,9 @@ static inline int kfd_iommu_check_device(struct
>> kfd_dev *kfd)
>> }
>> static inline int kfd_iommu_device_init(struct kfd_dev *kfd)
>> {
>> +#if IS_MODULE(CONFIG_AMD_IOMMU_V2)
>> + WARN_ONCE(1, "iommu_v2 module is not usable by built-in KFD");
>> +#endif
>> return 0;
>> }
>> @@ -73,6 +78,6 @@ static inline int
>> kfd_iommu_add_perf_counters(struct kfd_topology_device *kdev)
>> return 0;
>> }
>> -#endif /* defined(CONFIG_AMD_IOMMU_V2) */
>> +#endif /* IS_REACHABLE(CONFIG_AMD_IOMMU_V2) */
>> #endif /* __KFD_IOMMU_H__ */
On Tue, Mar 9, 2021 at 7:34 PM Christian König <[email protected]> wrote:
> Am 09.03.21 um 18:59 schrieb Alex Deucher:
>
> There has been quite some effort for this already for generic PASID
> interface etc.. But it looks like that effort is stalled by now.
>
> Anyway at least I'm perfectly fine to have the IOMMUv2 || !IOMMUv2
> dependency on the core amdgpu driver for x86.
>
> That should solve the build problem at hand quite nicely.
Right, that sounds better than the IS_REACHABLE() hack, and would fix
the immediate build regression until the driver is changed to use the proper
IOMMU interfaces.
Arnd