2019-08-14 14:51:45

by Corentin Labbe

Subject: DMA-API: cacheline tracking ENOMEM, dma-debug disabled due to nouveau ?

Hello

For many releases now (at least since 4.19), I have been hitting the following error message:
DMA-API: cacheline tracking ENOMEM, dma-debug disabled

After hitting that, I tried to check which driver is creating so many DMA mappings, and I see:
cat /sys/kernel/debug/dma-api/dump | cut -d' ' -f2 | sort | uniq -c
6 ahci
257 e1000e
6 ehci-pci
5891 nouveau
24 uhci_hcd

Is it normal for nouveau to have this many DMA mappings?

Regards


2019-08-14 17:52:02

by Daniel Vetter

Subject: Re: DMA-API: cacheline tracking ENOMEM, dma-debug disabled due to nouveau ?

On Wed, Aug 14, 2019 at 04:50:33PM +0200, Corentin Labbe wrote:
> Hello
>
> Since lot of release (at least since 4.19), I hit the following error message:
> DMA-API: cacheline tracking ENOMEM, dma-debug disabled
>
> After hitting that, I try to check who is creating so many DMA mapping and see:
> cat /sys/kernel/debug/dma-api/dump | cut -d' ' -f2 | sort | uniq -c
> 6 ahci
> 257 e1000e
> 6 ehci-pci
> 5891 nouveau
> 24 uhci_hcd
>
> Does nouveau having this high number of DMA mapping is normal ?

Yeah seems perfectly fine for a gpu.
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch

2019-08-15 15:01:19

by Christoph Hellwig

Subject: Re: DMA-API: cacheline tracking ENOMEM, dma-debug disabled due to nouveau ?

On Wed, Aug 14, 2019 at 07:49:27PM +0200, Daniel Vetter wrote:
> On Wed, Aug 14, 2019 at 04:50:33PM +0200, Corentin Labbe wrote:
> > Hello
> >
> > Since lot of release (at least since 4.19), I hit the following error message:
> > DMA-API: cacheline tracking ENOMEM, dma-debug disabled
> >
> > After hitting that, I try to check who is creating so many DMA mapping and see:
> > cat /sys/kernel/debug/dma-api/dump | cut -d' ' -f2 | sort | uniq -c
> > 6 ahci
> > 257 e1000e
> > 6 ehci-pci
> > 5891 nouveau
> > 24 uhci_hcd
> >
> > Does nouveau having this high number of DMA mapping is normal ?
>
> Yeah seems perfectly fine for a gpu.

That is a lot, and it apparently overwhelms the dma-debug tracking. Robin
rewrote this code in Linux 4.21 to work a little better, so I'm curious
why this might have changed in 4.19, as dma-debug did not change at
all there.

2019-08-15 15:06:18

by Robin Murphy

Subject: Re: DMA-API: cacheline tracking ENOMEM, dma-debug disabled due to nouveau ?

On 15/08/2019 14:35, Christoph Hellwig wrote:
> On Wed, Aug 14, 2019 at 07:49:27PM +0200, Daniel Vetter wrote:
>> On Wed, Aug 14, 2019 at 04:50:33PM +0200, Corentin Labbe wrote:
>>> Hello
>>>
>>> Since lot of release (at least since 4.19), I hit the following error message:
>>> DMA-API: cacheline tracking ENOMEM, dma-debug disabled
>>>
>>> After hitting that, I try to check who is creating so many DMA mapping and see:
>>> cat /sys/kernel/debug/dma-api/dump | cut -d' ' -f2 | sort | uniq -c
>>> 6 ahci
>>> 257 e1000e
>>> 6 ehci-pci
>>> 5891 nouveau
>>> 24 uhci_hcd
>>>
>>> Does nouveau having this high number of DMA mapping is normal ?
>>
>> Yeah seems perfectly fine for a gpu.
>
> That is a lot and apparently overwhelm the dma-debug tracking. Robin
> rewrote this code in Linux 4.21 to work a little better, so I'm curious
> why this might have changes in 4.19, as dma-debug did not change at
> all there.

FWIW, the cacheline tracking entries are a separate thing from the
dma-debug entries that I rejigged - judging by those numbers there
should still be plenty of free dma-debug entries, but for some reason it
has failed to extend the radix tree :/

Robin.

2019-08-16 14:32:56

by Corentin Labbe

Subject: Re: DMA-API: cacheline tracking ENOMEM, dma-debug disabled due to nouveau ?

On Wed, Aug 14, 2019 at 07:49:27PM +0200, Daniel Vetter wrote:
> On Wed, Aug 14, 2019 at 04:50:33PM +0200, Corentin Labbe wrote:
> > Hello
> >
> > Since lot of release (at least since 4.19), I hit the following error message:
> > DMA-API: cacheline tracking ENOMEM, dma-debug disabled
> >
> > After hitting that, I try to check who is creating so many DMA mapping and see:
> > cat /sys/kernel/debug/dma-api/dump | cut -d' ' -f2 | sort | uniq -c
> > 6 ahci
> > 257 e1000e
> > 6 ehci-pci
> > 5891 nouveau
> > 24 uhci_hcd
> >
> > Does nouveau having this high number of DMA mapping is normal ?
>
> Yeah seems perfectly fine for a gpu.

Note that it never goes down, and when I terminate my X session, it stays the same.
So without any "real" GPU work, is it still normal to have so many active mappings?

For example, when doing some transfers, the number of ahci mappings changes and then always goes back down to 6.

2019-08-16 18:47:26

by Daniel Vetter

Subject: Re: DMA-API: cacheline tracking ENOMEM, dma-debug disabled due to nouveau ?

On Fri, Aug 16, 2019 at 4:31 PM Corentin Labbe wrote:
> On Wed, Aug 14, 2019 at 07:49:27PM +0200, Daniel Vetter wrote:
> > On Wed, Aug 14, 2019 at 04:50:33PM +0200, Corentin Labbe wrote:
> > > Hello
> > >
> > > Since lot of release (at least since 4.19), I hit the following error message:
> > > DMA-API: cacheline tracking ENOMEM, dma-debug disabled
> > >
> > > After hitting that, I try to check who is creating so many DMA mapping and see:
> > > cat /sys/kernel/debug/dma-api/dump | cut -d' ' -f2 | sort | uniq -c
> > > 6 ahci
> > > 257 e1000e
> > > 6 ehci-pci
> > > 5891 nouveau
> > > 24 uhci_hcd
> > >
> > > Does nouveau having this high number of DMA mapping is normal ?
> >
> > Yeah seems perfectly fine for a gpu.
>
> Note that it never go down and when I terminate my X session, it stays the same.
> So without any "real" GPU work, does it is still normal to have so many active mapping ?

Might just be the dma_alloc cache. It should go down under memory
pressure I think. Otherwise might also be a leak.

> For example, when doing some transfer, the ahci mapping number changes and then always go down to 6.

gpu drivers tend to cache everything, all the time ...
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
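
One way to approach the cache-versus-leak question raised in the thread is to snapshot the per-driver counts at different moments (for example before and after stopping X, or after applying memory pressure) and compare them. The following is only a minimal sketch: `count_mappings` is a hypothetical helper name, and it assumes, as the `cut -d' ' -f2` pipeline in the first message does, that the driver name is the second space-separated field of each dump line.

```shell
#!/bin/sh
# Hypothetical helper: tally dma-debug entries per driver.
# Assumes the driver name is field 2 of each line, as in the
# `cut -d' ' -f2` pipeline used earlier in the thread.
count_mappings() {
    # $1: dump file to read; defaults to the debugfs path from the thread.
    awk '{ n[$2]++ } END { for (d in n) print n[d], d }' \
        "${1:-/sys/kernel/debug/dma-api/dump}" | sort -k2
}
```

Saving the output of `count_mappings` to one file per snapshot and running `diff` on two of them would show whether the nouveau count ever drops. A count that stays flat even under memory pressure would point more toward a leak than toward a cache.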