2020-06-22 23:45:37

by Ralph Campbell

Subject: [RESEND PATCH 1/3] nouveau: fix migrate page regression

The patch to add zero page migration to GPU memory inadvertantly included
part of a future change which broke normal page migration to GPU memory
by copying too much data and corrupting GPU memory.
Fix this by only copying one page instead of a byte count.

Fixes: 9d4296a7d4b3 ("drm/nouveau/nouveau/hmm: fix migrate zero page to GPU")
Signed-off-by: Ralph Campbell <[email protected]>
---
drivers/gpu/drm/nouveau/nouveau_dmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
index e5c230d9ae24..cc9993837508 100644
--- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
@@ -550,7 +550,7 @@ static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
DMA_BIDIRECTIONAL);
if (dma_mapping_error(dev, *dma_addr))
goto out_free_page;
- if (drm->dmem->migrate.copy_func(drm, page_size(spage),
+ if (drm->dmem->migrate.copy_func(drm, 1,
NOUVEAU_APER_VRAM, paddr, NOUVEAU_APER_HOST, *dma_addr))
goto out_dma_unmap;
} else {
--
2.20.1
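
For readers following along, here is a minimal sketch of why the wrong unit
matters. It assumes, as the patch description implies, that the second
argument of the copy hook is a page count and that the base page is 4 KiB.
The names sketch_bytes_copied() and SKETCH_PAGE_SIZE are hypothetical and
are not part of nouveau.

```c
#include <stdint.h>

/* Assumption: 4 KiB base page; the real value comes from PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Hypothetical stand-in for a copy hook whose second argument is a
 * page count: it reports how many bytes such a hook would move.
 */
static uint64_t sketch_bytes_copied(uint64_t npages)
{
	return npages * SKETCH_PAGE_SIZE;
}

/* Bytes moved when a byte count is mistakenly passed as a page count. */
static uint64_t sketch_buggy_call(void)
{
	return sketch_bytes_copied(SKETCH_PAGE_SIZE);
}

/* Bytes moved when exactly one page is requested. */
static uint64_t sketch_fixed_call(void)
{
	return sketch_bytes_copied(1);
}
```

Under those assumptions the mistaken call would move 4096 pages (16 MiB)
rather than one 4 KiB page, which would be consistent with the overwrite of
neighbouring GPU memory described above.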


2020-06-23 02:05:21

by John Hubbard

Subject: Re: [RESEND PATCH 1/3] nouveau: fix migrate page regression

On 2020-06-22 16:38, Ralph Campbell wrote:
> The patch to add zero page migration to GPU memory inadvertantly included

inadvertently

> part of a future change which broke normal page migration to GPU memory
> by copying too much data and corrupting GPU memory.
> Fix this by only copying one page instead of a byte count.
>
> Fixes: 9d4296a7d4b3 ("drm/nouveau/nouveau/hmm: fix migrate zero page to GPU")
> Signed-off-by: Ralph Campbell <[email protected]>
> ---
> drivers/gpu/drm/nouveau/nouveau_dmem.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> index e5c230d9ae24..cc9993837508 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> @@ -550,7 +550,7 @@ static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
> DMA_BIDIRECTIONAL);
> if (dma_mapping_error(dev, *dma_addr))
> goto out_free_page;
> - if (drm->dmem->migrate.copy_func(drm, page_size(spage),
> + if (drm->dmem->migrate.copy_func(drm, 1,
> NOUVEAU_APER_VRAM, paddr, NOUVEAU_APER_HOST, *dma_addr))
> goto out_dma_unmap;
> } else {
>


I Am Not A Nouveau Expert, nor is it really clear to me how
page_size(spage) came to contain something other than a page's worth of
byte count, but this fix looks accurate to me. It's better for
maintenance, too, because the function never intends to migrate "some
number of bytes". It intends to migrate exactly one page.

Hope I'm not missing something fundamental, but:

Reviewed-by: John Hubbard <[email protected]>


thanks,
--
John Hubbard
NVIDIA

2020-06-25 05:26:56

by Ben Skeggs

Subject: Re: [Nouveau] [RESEND PATCH 1/3] nouveau: fix migrate page regression

On Tue, 23 Jun 2020 at 10:51, John Hubbard <[email protected]> wrote:
>
> On 2020-06-22 16:38, Ralph Campbell wrote:
> > The patch to add zero page migration to GPU memory inadvertantly included
>
> inadvertently
>
> > part of a future change which broke normal page migration to GPU memory
> > by copying too much data and corrupting GPU memory.
> > Fix this by only copying one page instead of a byte count.
> >
> > Fixes: 9d4296a7d4b3 ("drm/nouveau/nouveau/hmm: fix migrate zero page to GPU")
> > Signed-off-by: Ralph Campbell <[email protected]>
> > ---
> > drivers/gpu/drm/nouveau/nouveau_dmem.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_dmem.c b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> > index e5c230d9ae24..cc9993837508 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_dmem.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_dmem.c
> > @@ -550,7 +550,7 @@ static unsigned long nouveau_dmem_migrate_copy_one(struct nouveau_drm *drm,
> > DMA_BIDIRECTIONAL);
> > if (dma_mapping_error(dev, *dma_addr))
> > goto out_free_page;
> > - if (drm->dmem->migrate.copy_func(drm, page_size(spage),
> > + if (drm->dmem->migrate.copy_func(drm, 1,
> > NOUVEAU_APER_VRAM, paddr, NOUVEAU_APER_HOST, *dma_addr))
> > goto out_dma_unmap;
> > } else {
> >
>
>
> I Am Not A Nouveau Expert, nor is it really clear to me how
> page_size(spage) came to contain something other than a page's worth of
> byte count, but this fix looks accurate to me. It's better for
> maintenance, too, because the function never intends to migrate "some
> number of bytes". It intends to migrate exactly one page.
>
> Hope I'm not missing something fundamental, but:

I'm actually a bit confused here too, because it *looks* like the
function takes a byte count, not a page count, and unless I'm missing
something too, it also sets up the copy class for a byte count.

>
> Reviewed-by: John Hubbard <[email protected]>
>
>
> thanks,
> --
> John Hubbard
> NVIDIA