2024-03-05 14:47:58

by Duoming Zhou

[permalink] [raw]
Subject: [PATCH v2] nouveau/dmem: handle kcalloc() allocation failure

The kcalloc() in nouveau_dmem_evict_chunk() will return null if
the physical memory has run out. As a result, if we dereference
src_pfns, dst_pfns or dma_addrs, the null pointer dereference bugs
will happen.

Moreover, the GPU is going away. If the kcalloc() fails, we could not
evict all pages mapping a chunk. So this patch adds a __GFP_NOFAIL
flag in kcalloc().

Fixes: 249881232e14 ("nouveau/dmem: evict device private memory during release")
Signed-off-by: Duoming Zhou <[email protected]>
---
Changes in v2:
- Allocate with __GFP_NOFAIL.

drivers/gpu/drm/nouveau/nouveau_dmem.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index 12feecf71e7..f5ae9724ee2 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -378,9 +378,9 @@ nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk)
dma_addr_t *dma_addrs;
struct nouveau_fence *fence;

- src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL);
- dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL);
- dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL);
+ src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL | __GFP_NOFAIL);
+ dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL | __GFP_NOFAIL);
+ dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL | __GFP_NOFAIL);

migrate_device_range(src_pfns, chunk->pagemap.range.start >> PAGE_SHIFT,
npages);
--
2.17.1



2024-03-05 17:08:33

by Danilo Krummrich

[permalink] [raw]
Subject: Re: [PATCH v2] nouveau/dmem: handle kcalloc() allocation failure

Hi Duoming,

thanks for sending a V2.

On 3/5/24 15:39, Duoming Zhou wrote:
> The kcalloc() in nouveau_dmem_evict_chunk() will return null if
> the physical memory has run out. As a result, if we dereference
> src_pfns, dst_pfns or dma_addrs, the null pointer dereference bugs
> will happen.
>
> Moreover, the GPU is going away. If the kcalloc() fails, we could not
> evict all pages mapping a chunk. So this patch adds a __GFP_NOFAIL
> flag in kcalloc().
>
> Fixes: 249881232e14 ("nouveau/dmem: evict device private memory during release")
> Signed-off-by: Duoming Zhou <[email protected]>
> ---
> Changes in v2:
> - Allocate with __GFP_NOFAIL.
>
> drivers/gpu/drm/nouveau/nouveau_dmem.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index 12feecf71e7..f5ae9724ee2 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -378,9 +378,9 @@ nouveau_dmem_evict_chunk(struct nouveau_dmem_chunk *chunk)
> dma_addr_t *dma_addrs;
> struct nouveau_fence *fence;
>
> - src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL);
> - dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL);
> - dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL);
> + src_pfns = kcalloc(npages, sizeof(*src_pfns), GFP_KERNEL | __GFP_NOFAIL);
> + dst_pfns = kcalloc(npages, sizeof(*dst_pfns), GFP_KERNEL | __GFP_NOFAIL);
> + dma_addrs = kcalloc(npages, sizeof(*dma_addrs), GFP_KERNEL | __GFP_NOFAIL);

I think we should also switch to kvcalloc(), AFAICS we don't need
physically contiguous memory.

Sorry I did not mention that in V1 already.

- Danilo

>
> migrate_device_range(src_pfns, chunk->pagemap.range.start >> PAGE_SHIFT,
> npages);