2021-11-23 20:46:53

by Christophe JAILLET

[permalink] [raw]
Subject: [PATCH v2 1/2] drm/amdkfd: Use bitmap_zalloc() when applicable

'doorbell_bitmap' and 'queue_slot_bitmap' are bitmaps. So use
'bitmap_zalloc()' to simplify code, improve the semantic and avoid some
open-coded arithmetic in allocator arguments.

Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.

Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Christophe JAILLET <[email protected]>
---
v1 --> v2: Compile tested :)
Add a missing ',' (kernel test robot)
Add kfd_process_queue_manager.c (Felix Kuehling)
---
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 7 +++----
drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 7 +++----
2 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index d4c8a6948a9f..67bb1654becc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1011,7 +1011,7 @@ static void kfd_process_destroy_pdds(struct kfd_process *p)
free_pages((unsigned long)pdd->qpd.cwsr_kaddr,
get_order(KFD_CWSR_TBA_TMA_SIZE));

- kfree(pdd->qpd.doorbell_bitmap);
+ bitmap_free(pdd->qpd.doorbell_bitmap);
idr_destroy(&pdd->alloc_idr);

kfd_free_process_doorbells(pdd->dev, pdd->doorbell_index);
@@ -1433,9 +1433,8 @@ static int init_doorbell_bitmap(struct qcm_process_device *qpd,
if (!KFD_IS_SOC15(dev))
return 0;

- qpd->doorbell_bitmap =
- kzalloc(DIV_ROUND_UP(KFD_MAX_NUM_OF_QUEUES_PER_PROCESS,
- BITS_PER_BYTE), GFP_KERNEL);
+ qpd->doorbell_bitmap = bitmap_zalloc(KFD_MAX_NUM_OF_QUEUES_PER_PROCESS,
+ GFP_KERNEL);
if (!qpd->doorbell_bitmap)
return -ENOMEM;

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
index 4f8464658daf..c5f5a25c6dcc 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c
@@ -135,9 +135,8 @@ void kfd_process_dequeue_from_all_devices(struct kfd_process *p)
int pqm_init(struct process_queue_manager *pqm, struct kfd_process *p)
{
INIT_LIST_HEAD(&pqm->queues);
- pqm->queue_slot_bitmap =
- kzalloc(DIV_ROUND_UP(KFD_MAX_NUM_OF_QUEUES_PER_PROCESS,
- BITS_PER_BYTE), GFP_KERNEL);
+ pqm->queue_slot_bitmap = bitmap_zalloc(KFD_MAX_NUM_OF_QUEUES_PER_PROCESS,
+ GFP_KERNEL);
if (!pqm->queue_slot_bitmap)
return -ENOMEM;
pqm->process = p;
@@ -159,7 +158,7 @@ void pqm_uninit(struct process_queue_manager *pqm)
kfree(pqn);
}

- kfree(pqm->queue_slot_bitmap);
+ bitmap_free(pqm->queue_slot_bitmap);
pqm->queue_slot_bitmap = NULL;
}

--
2.30.2



2021-11-23 20:47:07

by Christophe JAILLET

[permalink] [raw]
Subject: [PATCH v2 2/2] drm/amdkfd: Slighly optimize 'init_doorbell_bitmap()'

The 'doorbell_bitmap' bitmap has just been allocated. So we can use the
non-atomic '__set_bit()' function to save a few cycles as no concurrent
access can happen.

Reviewed-by: Felix Kuehling <[email protected]>
Signed-off-by: Christophe JAILLET <[email protected]>
---
bitmap_set() could certainly also be use, but range checking would be
tricky.

v1 --> v2: No change
---
drivers/gpu/drm/amd/amdkfd/kfd_process.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 67bb1654becc..9158f9754a24 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -1446,9 +1446,9 @@ static int init_doorbell_bitmap(struct qcm_process_device *qpd,

for (i = 0; i < KFD_MAX_NUM_OF_QUEUES_PER_PROCESS / 2; i++) {
if (i >= range_start && i <= range_end) {
- set_bit(i, qpd->doorbell_bitmap);
- set_bit(i + KFD_QUEUE_DOORBELL_MIRROR_OFFSET,
- qpd->doorbell_bitmap);
+ __set_bit(i, qpd->doorbell_bitmap);
+ __set_bit(i + KFD_QUEUE_DOORBELL_MIRROR_OFFSET,
+ qpd->doorbell_bitmap);
}
}

--
2.30.2


2021-11-25 21:39:15

by Felix Kuehling

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] drm/amdkfd: Slighly optimize 'init_doorbell_bitmap()'

Am 2021-11-23 um 3:46 p.m. schrieb Christophe JAILLET:
> The 'doorbell_bitmap' bitmap has just been allocated. So we can use the
> non-atomic '__set_bit()' function to save a few cycles as no concurrent
> access can happen.
>
> Reviewed-by: Felix Kuehling <[email protected]>
> Signed-off-by: Christophe JAILLET <[email protected]>

Thank you! I applied the series to amd-staging-drm-next.

Regards,
  Felix


> ---
> bitmap_set() could certainly also be use, but range checking would be
> tricky.
>
> v1 --> v2: No change
> ---
> drivers/gpu/drm/amd/amdkfd/kfd_process.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index 67bb1654becc..9158f9754a24 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -1446,9 +1446,9 @@ static int init_doorbell_bitmap(struct qcm_process_device *qpd,
>
> for (i = 0; i < KFD_MAX_NUM_OF_QUEUES_PER_PROCESS / 2; i++) {
> if (i >= range_start && i <= range_end) {
> - set_bit(i, qpd->doorbell_bitmap);
> - set_bit(i + KFD_QUEUE_DOORBELL_MIRROR_OFFSET,
> - qpd->doorbell_bitmap);
> + __set_bit(i, qpd->doorbell_bitmap);
> + __set_bit(i + KFD_QUEUE_DOORBELL_MIRROR_OFFSET,
> + qpd->doorbell_bitmap);
> }
> }
>