2022-08-11 15:50:38

by Fabio M. De Francesco

[permalink] [raw]
Subject: [PATCH 0/3] drm/i915: Replace kmap() with kmap_local_page()

kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and still valid.

Since its use in drm/i915 is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in drm/i915.

These changes should be tested in an 32 bits system, booting a kernel
with HIGHMEM enabled. Unfortunately I have no i915 based hardware,
therefore any help with testing would be greatly appreciated.

Suggested-by: Ira Weiny <[email protected]>
Signed-off-by: Fabio M. De Francesco <[email protected]>

Fabio M. De Francesco (3):
drm/i915: Replace kmap() with kmap_local_page()
drm/i915/gt: Replace kmap() with kmap_local_page()
drm/i915/gem: Replace kmap() with kmap_local_page()

drivers/gpu/drm/i915/gem/i915_gem_shmem.c | 6 ++----
drivers/gpu/drm/i915/gem/selftests/i915_gem_mman.c | 8 ++++----
drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 4 ++--
drivers/gpu/drm/i915/gt/shmem_utils.c | 11 ++++-------
drivers/gpu/drm/i915/i915_gem.c | 8 ++++----
5 files changed, 16 insertions(+), 21 deletions(-)

--
2.37.1


2022-08-11 16:10:18

by Fabio M. De Francesco

[permalink] [raw]
Subject: [PATCH 2/3] drm/i915/gt: Replace kmap() with kmap_local_page()

kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.

Since its use in i915/gt is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in i915/gt

Suggested-by: Ira Weiny <[email protected]>
Signed-off-by: Fabio M. De Francesco <[email protected]>
---
drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c | 4 ++--
drivers/gpu/drm/i915/gt/shmem_utils.c | 11 ++++-------
2 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
index 6ebda3d65086..21d8ce40b897 100644
--- a/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
+++ b/drivers/gpu/drm/i915/gt/intel_ggtt_fencing.c
@@ -747,7 +747,7 @@ static void swizzle_page(struct page *page)
char *vaddr;
int i;

- vaddr = kmap(page);
+ vaddr = kmap_local_page(page);

for (i = 0; i < PAGE_SIZE; i += 128) {
memcpy(temp, &vaddr[i], 64);
@@ -755,7 +755,7 @@ static void swizzle_page(struct page *page)
memcpy(&vaddr[i + 64], temp, 64);
}

- kunmap(page);
+ kunmap_local(vaddr);
}

/**
diff --git a/drivers/gpu/drm/i915/gt/shmem_utils.c b/drivers/gpu/drm/i915/gt/shmem_utils.c
index 402f085f3a02..48edbb8a33e5 100644
--- a/drivers/gpu/drm/i915/gt/shmem_utils.c
+++ b/drivers/gpu/drm/i915/gt/shmem_utils.c
@@ -98,22 +98,19 @@ static int __shmem_rw(struct file *file, loff_t off,
unsigned int this =
min_t(size_t, PAGE_SIZE - offset_in_page(off), len);
struct page *page;
- void *vaddr;

page = shmem_read_mapping_page_gfp(file->f_mapping, pfn,
GFP_KERNEL);
if (IS_ERR(page))
return PTR_ERR(page);

- vaddr = kmap(page);
if (write) {
- memcpy(vaddr + offset_in_page(off), ptr, this);
+ memcpy_to_page(page, offset_in_page(off), ptr, this);
set_page_dirty(page);
} else {
- memcpy(ptr, vaddr + offset_in_page(off), this);
+ memcpy_from_page(ptr, page, offset_in_page(off), this);
}
mark_page_accessed(page);
- kunmap(page);
put_page(page);

len -= this;
@@ -140,11 +137,11 @@ int shmem_read_to_iosys_map(struct file *file, loff_t off,
if (IS_ERR(page))
return PTR_ERR(page);

- vaddr = kmap(page);
+ vaddr = kmap_local_page(page);
iosys_map_memcpy_to(map, map_off, vaddr + offset_in_page(off),
this);
mark_page_accessed(page);
- kunmap(page);
+ kunmap_local(vaddr);
put_page(page);

len -= this;
--
2.37.1

2022-08-11 16:10:42

by Fabio M. De Francesco

[permalink] [raw]
Subject: [PATCH 1/3] drm/i915: Replace kmap() with kmap_local_page()

kmap() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.

Since its use in i915_gem.c is safe everywhere, it should be preferred.

Therefore, replace kmap() with kmap_local_page() in i915_gem.c

Suggested-by: Ira Weiny <[email protected]>
Signed-off-by: Fabio M. De Francesco <[email protected]>
---
drivers/gpu/drm/i915/i915_gem.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 702e5b89be22..43effce60e1b 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -212,14 +212,14 @@ shmem_pread(struct page *page, int offset, int len, char __user *user_data,
char *vaddr;
int ret;

- vaddr = kmap(page);
+ vaddr = kmap_local_page(page);

if (needs_clflush)
drm_clflush_virt_range(vaddr + offset, len);

ret = __copy_to_user(user_data, vaddr + offset, len);

- kunmap(page);
+ kunmap_local(vaddr);

return ret ? -EFAULT : 0;
}
@@ -634,7 +634,7 @@ shmem_pwrite(struct page *page, int offset, int len, char __user *user_data,
char *vaddr;
int ret;

- vaddr = kmap(page);
+ vaddr = kmap_local_page(page);

if (needs_clflush_before)
drm_clflush_virt_range(vaddr + offset, len);
@@ -643,7 +643,7 @@ shmem_pwrite(struct page *page, int offset, int len, char __user *user_data,
if (!ret && needs_clflush_after)
drm_clflush_virt_range(vaddr + offset, len);

- kunmap(page);
+ kunmap_local(vaddr);

return ret ? -EFAULT : 0;
}
--
2.37.1