2018-04-26 21:55:29

by Luis Chamberlain

[permalink] [raw]
Subject: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

Below are my notes on the ZONE_DMA discussion at LSF/MM 2018. There was some
discussion prior to my arrival at the session about moving ZONE_DMA around;
if someone has notes on that please share too :)

PS. I'm not subscribed to linux-mm

Luis

Determining you don't need to support ZONE_DMA on x86 at run time
=================================================================

In practice, if you don't have a floppy device on x86 you don't need ZONE_DMA,
and in that case you don't need to support it. Currently, however, disabling it
is only possible at compile time, and we won't know for sure until boot time
whether such a device is present. Not needing ZONE_DMA would mean we would not
have to deal with slab allocators for it and special-casing for it in a slew
of places. In particular, even kmalloc() has a branch which is always run if
CONFIG_ZONE_DMA is enabled.

ZONE_DMA is needed for old devices that require low addresses, since it makes
such allocations more reliable. There are likely more devices that require this,
not just the floppy though.

Christoph Lameter added CONFIG_ZONE_DMA to allow disabling ZONE_DMA at build
time, but most distributions enable it. If we could disable ZONE_DMA at run
time, once we know no device present requires it, we could get the same
benefit as compiling without CONFIG_ZONE_DMA.

Disabling CONFIG_ZONE_DMA used to help with performance, but we don't seem to
have modern benchmarks of the possible gains from removing it. Are the gains no
longer expected to be significant? Very likely there are no performance gains.
The assumption then is that the main advantage of being able to disable
ZONE_DMA on x86 these days would be purely aesthetic, and having x86 work more
like other architectures with respect to allocations. Use of ZONE_DMA in a
driver is also a good sign that the driver is old or may be deprecated.
Perhaps some of these on x86 should be moved to staging.

Note that some architectures rely on ZONE_DMA as well; the above notes
apply only to x86.

We can use certain kernel mechanisms to disable certain x86 features at
run time. Below are a few options:

* x86 binary patching
* ACPI_SIG_FADT
* static keys
* compiler multiverse (at least the R&D gcc proof of concept is now complete)

Detecting legacy x86 devices with ACPI ACPI_SIG_FADT
----------------------------------------------------

We could expand on ACPI_SIG_FADT with more legacy devices. This mechanism was
used to help determine if certain legacy x86 devices are present or not with
paravirtualization. For instance:

* ACPI_FADT_NO_VGA
* ACPI_FADT_NO_CMOS_RTC

CONFIG_ZONE_DMA
---------------

Christoph Lameter added CONFIG_ZONE_DMA through commit 4b51d66989218
("[PATCH] optional ZONE_DMA: optional ZONE_DMA in the VM") merged on
v2.6.21.

On x86 ZONE_DMA is defined as follows:

config ZONE_DMA
	bool "DMA memory allocation support" if EXPERT
	default y
	help
	  DMA memory allocation support allows devices with less than 32-bit
	  addressing to allocate within the first 16MB of address space.
	  Disable if no such devices will be used.

	  If unsure, say Y.

Most distributions enable CONFIG_ZONE_DMA.

Immediate impact of CONFIG_ZONE_DMA
-----------------------------------

CONFIG_ZONE_DMA affects kmalloc() as follows:

struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
{
	...
#ifdef CONFIG_ZONE_DMA
	if (unlikely((flags & GFP_DMA)))
		return kmalloc_dma_caches[index];
#endif
	...
}
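The cost here is a flag test on every call. A minimal userspace sketch of the same shape (hypothetical names throughout; this is not the kernel implementation) shows how the compile-time option keeps the branch in the common path:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for CONFIG_ZONE_DMA, GFP_DMA and the kmalloc
 * cache tables; names are invented for illustration. */
#define MY_CONFIG_ZONE_DMA 1
#define MY_GFP_DMA 0x01u

static const char *normal_caches[] = { "kmalloc-32", "kmalloc-64" };
static const char *dma_caches[]    = { "dma-kmalloc-32", "dma-kmalloc-64" };

/* Mirrors the shape of kmalloc_slab(): when DMA support is compiled in,
 * every call pays for the flag test, even if GFP_DMA is never passed. */
static const char *pick_cache(size_t index, unsigned int flags)
{
#ifdef MY_CONFIG_ZONE_DMA
	if (flags & MY_GFP_DMA)
		return dma_caches[index];
#endif
	return normal_caches[index];
}
```

With the option compiled out, pick_cache() collapses to a plain table lookup, which is the gain a run-time disable would want to approximate.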

ZONE_DMA users
==============

It turns out there are many more users of ZONE_DMA than expected, even on x86.

Explicit requirements for ZONE_DMA with gfp flags
-------------------------------------------------

All drivers which explicitly use any of these flags implicate use
of ZONE_DMA for allocations:

* GFP_DMA
* __GFP_DMA

Implicit ZONE_DMA users
-----------------------

There is a series of implicit ZONE_DMA users which use helpers. These are,
with details documented further below:

* blk_queue_bounce()
* blk_queue_bounce_limit()
* dma_alloc_coherent_gfp_flags()
* dma_generic_alloc_coherent()
* intel_alloc_coherent()
* _regmap_raw_write()
* mempool_alloc_pages_isa()

x86 implicit and explicit ZONE_DMA users
-----------------------------------------

We list below all x86 implicit and explicit ZONE_DMA users.

# Explicit x86 users of GFP_DMA or __GFP_DMA

* drivers/iio/common/ssp_sensors - wonder if enabling this on x86 was a mistake.
Note that this needs SPI, and SPI needs HAS_IOMEM. I only see HAS_IOMEM on
s390? I do think the Intel Minnowboard has SPI, but doubt it has
the ssp sensor stuff.

* drivers/input/rmi4/rmi_spi.c - same SPI question
* drivers/media/common/siano/ - make allyesconfig yields it enabled, but
not sure if this should ever be on x86
* drivers/media/platform/sti/bdisp/ - likewise
* drivers/media/platform/sti/hva/ - likewise
* drivers/media/usb/gspca/ - likewise
* drivers/mmc/host/wbsd.c - likewise
* drivers/mtd/nand/gpmi-nand/ - likewise
* drivers/net/can/spi/hi311x.c - likewise
* drivers/net/can/spi/mcp251x.c - likewise
* drivers/net/ethernet/agere/ - likewise
* drivers/net/ethernet/neterion/vxge/ - likewise
* drivers/net/ethernet/rocker/ - likewise
* drivers/net/usb/kalmia.c - likewise
* drivers/spi/spi-pic32-sqi.c - likewise
* drivers/spi/spi-sh-msiof.c - likewise
* drivers/spi/spi-ti-qspi.c - likewise

* drivers/tty/serial/mxs-auart.c - likewise - MXS AUART support
* drivers/tty/synclink.c - likewise Microgate SyncLink card support
* drivers/uio/uio_pruss - Texas Instruments PRUSS driver
* drivers/usb/dwc2 - CONFIG_USB_DWC2_DUAL_ROLE - DesignWare USB2 DRD Core Support for dual role mode
* drivers/usb/gadget/udc/ USB_GR_UDC - Aeroflex Gaisler GRUSBDC USB Peripheral Controller Driver
* drivers/video/fbdev/da8xx-fb.c - FB_DA8XX DA8xx/OMAP-L1xx/AM335x Framebuffer support
* drivers/video/fbdev/mb862xx/mb862xxfb_accel.c - CONFIG_FB_MB862XX - Fujitsu MB862xx GDC support
* drivers/video/fbdev/vermilion/vermilion.c - Intel LE80578 (Vermilion) support

Then we have a few drivers which we know we need on x86 but for these
we could use a run time flip to enable ZONE_DMA.

* drivers/net/ethernet/broadcom/b44.c - bleh, yeah and there are some hardware bug
workarounds for this, *but* again since it's also odd, we could deal with this
at run time
* drivers/net/wimax/i2400m/ - ugh, who cares about this crap anyway nowadays, my
point being another run time oddity
* drivers/net/wireless/broadcom/b43legacy/ - ugh, same
* drivers/platform/x86/asus-wmi.c - ugh same
* drivers/platform/x86/dell-smbios.c - ugh same

Staging drivers are expected to have flaws, but worth noting.

* drivers/staging/ - scattered drivers, rtlwifi/ is probably the only relevant one for x86

SCSI is *severely* affected:
* drivers/scsi/aacraid/ - crap Adaptec AACRAID support
* drivers/scsi/ch.c - SCSI media changer support...
* drivers/scsi/initio.c - Initio INI-9X00U/UW SCSI device driver...
* drivers/scsi/osst.c - CHR_DEV_OSST - SCSI OnStream SC-x0 tape support...
* drivers/scsi/pmcraid.c - CONFIG_SCSI_PMCRAID - PMC SIERRA Linux MaxRAID adapter support
* drivers/scsi/snic/ - Cisco SNIC Driver
* drivers/mmc/core/mmc_test.c - MMC_TEST - MMC host test driver
* drivers/net/wireless/broadcom/b43/ - means we'd have to at least use
static keys

Larger blockers (now I see one reason why SCSI is a disaster):

* drivers/scsi/hosts.c - scsi_host_alloc() always uses
__GFP_DMA if (sht->unchecked_isa_dma && privsize)
this could likely be adjusted or split off to other
callers where we know this to be true.
* drivers/scsi/scsi_scan.c - scsi_probe_and_add_lun() has a similar check
* drivers/scsi/sg.c - sg_build_indirect() has similar check
* drivers/scsi/sr.c - get_capabilities() *always* uses GFP_DMA,
and it is called on sr_probe(). WTF.
Don't drop this -- it's cdrom
* drivers/scsi/sr_ioctl.c - seriously...
* drivers/scsi/sr_vendor.c - sr_cd_check() - checks if the CD is
multisession, asks for offset etc
* drivers/scsi/st.c - SCSI tape support - enlarge_buffer(), which
BTW is recursive, is called on st_open(), the struct
file_operations open()...

Larger blockers (SPI is also severely affected):
* drivers/spi/spi.c - spi_pump_messages() which processes spi message queue

Larger blockers:

* drivers/tty/hvc/hvc_iucv.c - z/VM IUCV hypervisor console

And finally a non-issue:

* drivers/xen/swiotlb-xen.c - used in struct dma_map_ops
xen_swiotlb_dma_ops alloc(), only to check whether the caller
used it to set the dma_mask:

	dma_mask = dev->coherent_dma_mask;
	if (!dma_mask)
		dma_mask = (gfp & GFP_DMA) ? DMA_BIT_MASK(24) : DMA_BIT_MASK(32);
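As a sanity check of that fallback, a small userspace sketch (MY_GFP_DMA is a made-up stand-in flag, not the kernel's GFP_DMA; effective_mask() is a hypothetical helper) reproduces the mask selection:

```c
#include <assert.h>
#include <stdint.h>

/* Same definition as the kernel's DMA_BIT_MASK() macro. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))
#define MY_GFP_DMA 0x1u   /* hypothetical flag value for illustration */

/* Sketch of the fallback in the xen_swiotlb alloc() path: if the device
 * never set coherent_dma_mask, derive one from the gfp flags passed in. */
static uint64_t effective_mask(uint64_t coherent_dma_mask, unsigned int gfp)
{
	uint64_t dma_mask = coherent_dma_mask;

	if (!dma_mask)
		dma_mask = (gfp & MY_GFP_DMA) ? DMA_BIT_MASK(24)
					      : DMA_BIT_MASK(32);
	return dma_mask;
}
```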

That's the end of the review of all current explicit callers on x86.

# dma_alloc_coherent_gfp_flags() and dma_generic_alloc_coherent()

dma_alloc_coherent_gfp_flags() and dma_generic_alloc_coherent() set
GFP_DMA if dma_mask <= DMA_BIT_MASK(24).
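The mask comparison is easy to sketch in userspace; gfp_for_mask() below is a hypothetical helper mirroring that decision, with DMA_BIT_MASK() defined the same way as the kernel's macro and invented flag values:

```c
#include <assert.h>
#include <stdint.h>

/* Same definition as the kernel's DMA_BIT_MASK() in linux/dma-mapping.h. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

#define MY_GFP_KERNEL 0x0u   /* hypothetical flag values for illustration */
#define MY_GFP_DMA    0x1u

/* Sketch of the decision: a coherent mask that fits in 24 bits forces
 * the allocation into ZONE_DMA (the low 16MB). */
static unsigned int gfp_for_mask(uint64_t dma_mask)
{
	if (dma_mask <= DMA_BIT_MASK(24))
		return MY_GFP_DMA;
	return MY_GFP_KERNEL;
}
```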

# blk_queue_bounce()

void blk_queue_bounce(struct request_queue *q, struct bio **bio_orig)
{
	...
	/*
	 * for non-isa bounce case, just check if the bounce pfn is equal
	 * to or bigger than the highest pfn in the system -- in that case,
	 * don't waste time iterating over bio segments
	 */
	if (!(q->bounce_gfp & GFP_DMA)) {
		if (q->limits.bounce_pfn >= blk_max_pfn)
			return;
		pool = page_pool;
	} else {
		BUG_ON(!isa_page_pool);
		pool = isa_page_pool;
	}
	...
}

# blk_queue_bounce_limit()

void blk_queue_bounce_limit(struct request_queue *q, u64 max_addr)
{
	unsigned long b_pfn = max_addr >> PAGE_SHIFT;
	int dma = 0;

	q->bounce_gfp = GFP_NOIO;
#if BITS_PER_LONG == 64
	/*
	 * Assume anything <= 4GB can be handled by IOMMU. Actually
	 * some IOMMUs can handle everything, but I don't know of a
	 * way to test this here.
	 */
	if (b_pfn < (min_t(u64, 0xffffffffUL, BLK_BOUNCE_HIGH) >> PAGE_SHIFT))
		dma = 1;
	q->limits.bounce_pfn = max(max_low_pfn, b_pfn);
#else
	if (b_pfn < blk_max_low_pfn)
		dma = 1;
	q->limits.bounce_pfn = b_pfn;
#endif
	if (dma) {
		init_emergency_isa_pool();
		q->bounce_gfp = GFP_NOIO | GFP_DMA;
		q->limits.bounce_pfn = b_pfn;
	}
}
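The pfn arithmetic in the 64-bit branch can be sketched in userspace as follows (needs_isa_bounce() is a hypothetical helper; a true result corresponds to dma = 1 above, i.e. the device cannot reach all of low 4GB and gets the ISA pool with GFP_DMA):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12           /* 4 KiB pages, as on x86 */

/* Sketch of the 64-bit branch of blk_queue_bounce_limit(): compare the
 * device's highest reachable pfn against the 4GB boundary pfn. */
static int needs_isa_bounce(uint64_t max_addr)
{
	uint64_t b_pfn = max_addr >> PAGE_SHIFT;
	uint64_t limit_pfn = 0xffffffffULL >> PAGE_SHIFT; /* 4GB boundary */

	return b_pfn < limit_pfn;
}
```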

# intel_alloc_coherent()

intel_alloc_coherent() in drivers/iommu/intel-iommu.c also uses GFP_DMA
for DMA_BIT_MASK(32), as part of the struct dma_map_ops intel_dma_ops alloc().

# _regmap_raw_write()

_regmap_raw_write() seems to always use GFP_DMA for async writes.

# mempool_alloc_pages_isa()

mempool_alloc_pages_isa() implies use of GFP_DMA.

Architectures removed which used ZONE_DMA
-----------------------------------------

Although this topic pertains to x86, it's worth mentioning that in the v4.17-rc1
release 8 architectures were removed: blackfin, cris, frv, m32r, metag,
mn10300, score, tile. Of these 8 architectures, 3 defined and used their own
ZONE_DMA:

* blackfin
* cris
* m32r


2018-04-27 01:10:52

by Rik van Riel

[permalink] [raw]
Subject: Re: [Lsf-pc] [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Thu, 2018-04-26 at 21:54 +0000, Luis R. Rodriguez wrote:
> Below are my notes on the ZONE_DMA discussion at LSF/MM 2018. There
> were some
> earlier discussion prior to my arrival to the session about moving
> around
> ZOME_DMA around, if someone has notes on that please share too :)

We took notes during LSF/MM 2018. Not a whole lot
on your topic, but most of the MM and plenary
topics have some notes.

https://etherpad.wikimedia.org/p/LSFMM2018

--
All Rights Reversed.


Attachments:
signature.asc (499.00 B)
This is a digitally signed message part

2018-04-27 05:37:30

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Thu, Apr 26, 2018 at 09:54:06PM +0000, Luis R. Rodriguez wrote:
> In practice if you don't have a floppy device on x86, you don't need ZONE_DMA,

I call BS on that, and you actually explain later why it is BS, due
to some drivers using it more explicitly. But even more importantly
we have plenty of drivers using it through dma_alloc_* and a small DMA
mask, and they are in use - we actually had a 4.16 regression due to
them.

> SCSI is *severely* affected:

Not really. We have unchecked_isa_dma to support about 4 drivers,
and less than a handful of drivers doing stupid things, which can
be fixed easily and just need a volunteer.

> That's the end of the review of all current explicit callers on x86.
>
> # dma_alloc_coherent_gfp_flags() and dma_generic_alloc_coherent()
>
> dma_alloc_coherent_gfp_flags() and dma_generic_alloc_coherent() set
> GFP_DMA if if (dma_mask <= DMA_BIT_MASK(24))

All that code is long gone and replaced with dma-direct. Which still
uses GFP_DMA based on the dma mask, though - see above.

2018-04-27 07:20:06

by Michal Hocko

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Thu 26-04-18 22:35:56, Christoph Hellwig wrote:
> On Thu, Apr 26, 2018 at 09:54:06PM +0000, Luis R. Rodriguez wrote:
> > In practice if you don't have a floppy device on x86, you don't need ZONE_DMA,
>
> I call BS on that, and you actually explain later why it it BS due
> to some drivers using it more explicitly. But even more importantly
> we have plenty driver using it through dma_alloc_* and a small DMA
> mask, and they are in use - we actually had a 4.16 regression due to
> them.

Well, but do we need a zone for that purpose? The idea was to actually
replace the zone by a CMA pool (at least on x86). With the current
implementation of CMA we would move the [0-16M] pfn range into
zone_movable so it can be used, and we would get rid of all of the
overhead each zone brings (a bit in page flags, kmalloc caches and who
knows what else).
--
Michal Hocko
SUSE Labs

by Christopher Lameter

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Fri, 27 Apr 2018, Michal Hocko wrote:

> On Thu 26-04-18 22:35:56, Christoph Hellwig wrote:
> > On Thu, Apr 26, 2018 at 09:54:06PM +0000, Luis R. Rodriguez wrote:
> > > In practice if you don't have a floppy device on x86, you don't need ZONE_DMA,
> >
> > I call BS on that, and you actually explain later why it it BS due
> > to some drivers using it more explicitly. But even more importantly
> > we have plenty driver using it through dma_alloc_* and a small DMA
> > mask, and they are in use - we actually had a 4.16 regression due to
> > them.
>
> Well, but do we need a zone for that purpose? The idea was to actually
> replace the zone by a CMA pool (at least on x86). With the current
> implementation of the CMA we would move the range [0-16M] pfn range into
> zone_movable so it can be used and we would get rid of all of the
> overhead each zone brings (a bit in page flags, kmalloc caches and who
> knows what else)

Well it looks like what we are using it for is to force allocation from
low physical memory if we fail to obtain proper memory through a normal
channel. The use of ZONE_DMA is only there for emergency purposes.
I think we could substitute ZONE_DMA32 on x86 without a problem.

Which means that ZONE_DMA has no purpose anymore.

Can we make ZONE_DMA on x86 refer to the low 32 bit physical addresses
instead and remove ZONE_DMA32?

That would actually improve the fallback because you have more memory for
the old devices.


2018-04-27 16:16:32

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Thu, Apr 26, 2018 at 10:35:56PM -0700, Christoph Hellwig wrote:
> On Thu, Apr 26, 2018 at 09:54:06PM +0000, Luis R. Rodriguez wrote:
> > In practice if you don't have a floppy device on x86, you don't need ZONE_DMA,
>
> I call BS on that,

I should have made clear that it was not me who claimed this.
The list displayed below is the result of trying to confirm or deny the
claim, of seeing what could be done, and of evaluating whether there is
*any* gain in doing something about it.

But curious, on a standard qemu x86_64 KVM guest, which of the
drivers do we know for certain *are* being used from the ones
listed?

What about Xen guests, I wonder?

> and you actually explain later why it it BS due
> to some drivers using it more explicitly.

Or implicitly. The list I showed is the work to show that use
of GFP_DMA on x86 is *much* more widespread than expected from the
above claim.

I however did not also answer the above qemu x86_64 question, but
would be good to know. Note I stated that the claim was *in practice*.

> But even more importantly
> we have plenty driver using it through dma_alloc_* and a small DMA
> mask, and they are in use

Do we have a list of users for x86 with a small DMA mask?
Or, given that I'm not aware of a tool to be able to look
for this in an easy way, would it be good to find out which
x86 drivers do have a small mask?

> - we actually had a 4.16 regression due to them.

Ah what commit was the culprit? Is that fixed already? If so what
commit?

> > SCSI is *severely* affected:
>
> Not really. We have unchecked_isa_dma to support about 4 drivers,

Ah very neat:

* CONFIG_CHR_DEV_OSST - "SCSI OnStream SC-x0 tape support"
* CONFIG_SCSI_ADVANSYS - "AdvanSys SCSI support"
* CONFIG_SCSI_AHA1542 - "Adaptec AHA1542 support"
* CONFIG_SCSI_ESAS2R - "ATTO Technology's ExpressSAS RAID adapter driver"

> and less than a hand ful of drivers doing stupid things, which can
> be fixed easily, and just need a volunteer.

Care to list what needs to be done? Can an eager beaver student do it?

> > That's the end of the review of all current explicit callers on x86.
> >
> > # dma_alloc_coherent_gfp_flags() and dma_generic_alloc_coherent()
> >
> > dma_alloc_coherent_gfp_flags() and dma_generic_alloc_coherent() set
> > GFP_DMA if if (dma_mask <= DMA_BIT_MASK(24))
>
> All that code is long gone and replaced with dma-direct. Which still
> uses GFP_DMA based on the dma mask, though - see above.

And that's mostly IOMMU code, on the alloc() dma_map_ops.

Luis

2018-04-27 16:20:23

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Fri, Apr 27, 2018 at 11:07:07AM -0500, Christopher Lameter wrote:
> Well it looks like what we are using it for is to force allocation from
> low physical memory if we fail to obtain proper memory through a normal
> channel. The use of ZONE_DMA is only there for emergency purposes.
> I think we could subsitute ZONE_DMA32 on x87 without a problem.
>
> Which means that ZONE_DMA has no purpose anymore.
>
> Can we make ZONE_DMA on x86 refer to the low 32 bit physical addresses
> instead and remove ZONE_DMA32?
>
> That would actually improve the fallback because you have more memory for
> the old devices.

Some devices have incredibly bogus hardware like 28 bit addressing
or 39 bit addressing. We don't have a good way to allocate memory by
physical address other than saying "GFP_DMA for anything less than
32, GFP_DMA32 (or GFP_KERNEL on 32-bit) for anything less than 64 bit".

Even CMA doesn't have a "cma_alloc_phys()". Maybe that's the right place
to put such an allocation API.
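To make the coarseness of that rule concrete, here is a small sketch (my_zone and zone_for_bits() are illustrative names, not kernel code) of the mapping from device address bits to the zone actually used:

```c
#include <assert.h>

enum my_zone { MY_ZONE_DMA, MY_ZONE_DMA32, MY_ZONE_NORMAL };

/* Sketch of the coarse rule described above: anything below 32-bit
 * addressing falls back to GFP_DMA (low 16MB), anything below 64-bit
 * to GFP_DMA32. A 28-bit device is thus confined to far less memory
 * than it could actually address; a 39-bit device may get memory it
 * cannot reach without an IOMMU. */
static enum my_zone zone_for_bits(unsigned int addr_bits)
{
	if (addr_bits < 32)
		return MY_ZONE_DMA;      /* e.g. a 28-bit device */
	if (addr_bits < 64)
		return MY_ZONE_DMA32;    /* e.g. a 39-bit device */
	return MY_ZONE_NORMAL;
}
```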

2018-04-27 16:30:19

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Fri, Apr 27, 2018 at 04:14:56PM +0000, Luis R. Rodriguez wrote:
> > Not really. We have unchecked_isa_dma to support about 4 drivers,
>
> Ah very neat:
>
> * CONFIG_CHR_DEV_OSST - "SCSI OnStream SC-x0 tape support"

That's an upper level driver, like cdrom, disk and regular tapes.

> * CONFIG_SCSI_ADVANSYS - "AdvanSys SCSI support"

If we ditch support for the ISA boards, this can go away.

> * CONFIG_SCSI_AHA1542 - "Adaptec AHA1542 support"

Probably true.

> * CONFIG_SCSI_ESAS2R - "ATTO Technology's ExpressSAS RAID adapter driver"

That's being set to 0.

You missed BusLogic.c and gdth.c


by Christopher Lameter

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Fri, 27 Apr 2018, Matthew Wilcox wrote:

> Some devices have incredibly bogus hardware like 28 bit addressing
> or 39 bit addressing. We don't have a good way to allocate memory by
> physical address other than than saying "GFP_DMA for anything less than
> 32, GFP_DMA32 (or GFP_KERNEL on 32-bit) for anything less than 64 bit".
>
> Even CMA doesn't have a "cma_alloc_phys()". Maybe that's the right place
> to put such an allocation API.

The other way out of this would be to require an IOMMU?


2018-04-27 16:39:04

by Michal Hocko

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Fri 27-04-18 11:07:07, Christopher Lameter wrote:
> On Fri, 27 Apr 2018, Michal Hocko wrote:
>
> > On Thu 26-04-18 22:35:56, Christoph Hellwig wrote:
> > > On Thu, Apr 26, 2018 at 09:54:06PM +0000, Luis R. Rodriguez wrote:
> > > > In practice if you don't have a floppy device on x86, you don't need ZONE_DMA,
> > >
> > > I call BS on that, and you actually explain later why it it BS due
> > > to some drivers using it more explicitly. But even more importantly
> > > we have plenty driver using it through dma_alloc_* and a small DMA
> > > mask, and they are in use - we actually had a 4.16 regression due to
> > > them.
> >
> > Well, but do we need a zone for that purpose? The idea was to actually
> > replace the zone by a CMA pool (at least on x86). With the current
> > implementation of the CMA we would move the range [0-16M] pfn range into
> > zone_movable so it can be used and we would get rid of all of the
> > overhead each zone brings (a bit in page flags, kmalloc caches and who
> > knows what else)
>
> Well it looks like what we are using it for is to force allocation from
> low physical memory if we fail to obtain proper memory through a normal
> channel. The use of ZONE_DMA is only there for emergency purposes.
> I think we could subsitute ZONE_DMA32 on x87 without a problem.
>
> Which means that ZONE_DMA has no purpose anymore.

We still need to make sure the low 16MB is available on request. And
that is what CMA can help with. We do not really seem to need the whole
zone infrastructure for that.

> Can we make ZONE_DMA on x86 refer to the low 32 bit physical addresses
> instead and remove ZONE_DMA32?

Why would that be an advantage? If anything I would rename ZONE_DMA32 to
ZONE_ADDR32 or something like that.

> That would actually improve the fallback because you have more memory for
> the old devices.

I do not really understand how that is related to removing ZONE_DMA. We
are really talking about the lowest 16MB...

--
Michal Hocko
SUSE Labs

2018-04-28 08:32:30

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Fri, Apr 27, 2018 at 09:18:43AM +0200, Michal Hocko wrote:
> > On Thu, Apr 26, 2018 at 09:54:06PM +0000, Luis R. Rodriguez wrote:
> > > In practice if you don't have a floppy device on x86, you don't need ZONE_DMA,
> >
> > I call BS on that, and you actually explain later why it it BS due
> > to some drivers using it more explicitly. But even more importantly
> > we have plenty driver using it through dma_alloc_* and a small DMA
> > mask, and they are in use - we actually had a 4.16 regression due to
> > them.
>
> Well, but do we need a zone for that purpose? The idea was to actually
> replace the zone by a CMA pool (at least on x86). With the current
> implementation of the CMA we would move the range [0-16M] pfn range into
> zone_movable so it can be used and we would get rid of all of the
> overhead each zone brings (a bit in page flags, kmalloc caches and who
> knows what else)

That wasn't clear in the mail. But if we have another way to allocate
<16MB memory we don't need ZONE_DMA for the floppy driver either, so the
above conclusion is still wrong.

> --
> Michal Hocko
> SUSE Labs
---end quoted text---

2018-04-28 08:34:27

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Fri, Apr 27, 2018 at 11:07:07AM -0500, Christopher Lameter wrote:
> Well it looks like what we are using it for is to force allocation from
> low physical memory if we fail to obtain proper memory through a normal
> channel. The use of ZONE_DMA is only there for emergency purposes.
> I think we could subsitute ZONE_DMA32 on x87 without a problem.
>
> Which means that ZONE_DMA has no purpose anymore.
>
> Can we make ZONE_DMA on x86 refer to the low 32 bit physical addresses
> instead and remove ZONE_DMA32?

While < 32-bit allocations are much more common there are plenty
of requirements for < 24-bit or other weird masks still.

2018-04-28 08:35:18

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Fri, Apr 27, 2018 at 11:36:23AM -0500, Christopher Lameter wrote:
> On Fri, 27 Apr 2018, Matthew Wilcox wrote:
>
> > Some devices have incredibly bogus hardware like 28 bit addressing
> > or 39 bit addressing. We don't have a good way to allocate memory by
> > physical address other than than saying "GFP_DMA for anything less than
> > 32, GFP_DMA32 (or GFP_KERNEL on 32-bit) for anything less than 64 bit".
> >
> > Even CMA doesn't have a "cma_alloc_phys()". Maybe that's the right place
> > to put such an allocation API.
>
> The other way out of this would be to require a IOMMU?

Which on many systems doesn't exist. And even if it exists it might not
be usable.

2018-04-28 08:44:00

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Fri, Apr 27, 2018 at 04:14:56PM +0000, Luis R. Rodriguez wrote:
> But curious, on a standard qemu x86_x64 KVM guest, which of the
> drivers do we know for certain *are* being used from the ones
> listed?

On a KVM guest, probably none. But not all the world is sane,
standardized VMs, unfortunately.

> > But even more importantly
> > we have plenty driver using it through dma_alloc_* and a small DMA
> > mask, and they are in use
>
> Do we have a list of users for x86 with a small DMA mask?
> Or, given that I'm not aware of a tool to be able to look
> for this in an easy way, would it be good to find out which
> x86 drivers do have a small mask?

Basically you'll have to grep for calls to dma_set_mask/
dma_set_coherent_mask/dma_set_mask_and_coherent and their pci_*
wrappers with masks smaller than 32-bit. Some use numeric values,
some use DMA_BIT_MASK, and various places use local variables
or struct members to pass them, so finding them will be a bit
more work. Nothing a Coccinelle expert couldn't solve, though :)
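A rough first pass at that grep might look like the sketch below; the stand-in file under /tmp takes the place of a kernel tree (an assumption made so the example is self-contained), and in a real tree you would point it at drivers/ instead:

```shell
# Sketch of the grep described above. Create a stand-in source file,
# then look for dma_set_* calls whose DMA_BIT_MASK is below 32 bits.
mkdir -p /tmp/dma-grep-demo
cat > /tmp/dma-grep-demo/demo.c <<'EOF'
	err = dma_set_mask(&pdev->dev, DMA_BIT_MASK(24));
	err = dma_set_coherent_mask(&pdev->dev, DMA_BIT_MASK(32));
EOF
# In a real kernel tree, run this over drivers/ instead of the demo dir.
# The numeric alternation matches masks 0-31 and deliberately skips 32+.
grep -rnE 'dma_set_(mask|coherent_mask|mask_and_coherent)\(.*DMA_BIT_MASK\((1?[0-9]|2[0-9]|3[01])\)' \
    /tmp/dma-grep-demo
```

This only catches literal DMA_BIT_MASK() arguments; as noted above, masks held in local variables or struct members would need a semantic tool rather than grep.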

> > - we actually had a 4.16 regression due to them.
>
> Ah what commit was the culprit? Is that fixed already? If so what
> commit?

66bdb147 ("swiotlb: Use dma_direct_supported() for swiotlb_ops")

> > > SCSI is *severely* affected:
> >
> > Not really. We have unchecked_isa_dma to support about 4 drivers,
>
> Ah very neat:
>
> * CONFIG_CHR_DEV_OSST - "SCSI OnStream SC-x0 tape support"
> * CONFIG_SCSI_ADVANSYS - "AdvanSys SCSI support"
> * CONFIG_SCSI_AHA1542 - "Adaptec AHA1542 support"
> * CONFIG_SCSI_ESAS2R - "ATTO Technology's ExpressSAS RAID adapter driver"
>
> > and less than a hand ful of drivers doing stupid things, which can
> > be fixed easily, and just need a volunteer.
>
> Care to list what needs to be done? Can an eager beaver student do it?

Dropping the drivers, as in the branch I prepared a while ago, would be
easiest:

http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/unchecked_isa_dma

But unlike the other few, aha1542 actually seems to have active users,
or at least had recently. I'll need to send this out as an RFC, but
don't really expect it to fly.

If it doesn't, we'll need to enhance swiotlb to support an ISA DMA pool
in addition to the current 32-bit DMA pool, and also convert aha1542 to
use the DMA API. Not really student material.

> > > That's the end of the review of all current explicit callers on x86.
> > >
> > > # dma_alloc_coherent_gfp_flags() and dma_generic_alloc_coherent()
> > >
> > > dma_alloc_coherent_gfp_flags() and dma_generic_alloc_coherent() set
> > > GFP_DMA if if (dma_mask <= DMA_BIT_MASK(24))
> >
> > All that code is long gone and replaced with dma-direct. Which still
> > uses GFP_DMA based on the dma mask, though - see above.
>
> And that's mostly IOMMU code, on the alloc() dma_map_ops.

It is the dma mapping API, which translates the dma mask to the right
zone, and probably is the biggest user of ZONE_DMA in modern systems.

Currently there are still various arch and iommu specific
implementations of the allocator decisions, but I'm working to
consolidate them into common code.

2018-04-28 18:57:29

by Luis Chamberlain

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Sat, Apr 28, 2018 at 01:42:21AM -0700, Christoph Hellwig wrote:
> On Fri, Apr 27, 2018 at 04:14:56PM +0000, Luis R. Rodriguez wrote:
> > Do we have a list of users for x86 with a small DMA mask?
> > Or, given that I'm not aware of a tool to be able to look
> > for this in an easy way, would it be good to find out which
> > x86 drivers do have a small mask?
>
> Basically you'll have to grep for calls to dma_set_mask/
> dma_set_coherent_mask/dma_set_mask_and_coherent and their pci_*
> wrappers with masks smaller 32-bit. Some use numeric values,
> some use DMA_BIT_MASK and various places uses local variables
> or struct members to parse them, so finding them will be a bit
> more work. Nothing a coccinelle expert couldn't solve, though :)

Thing is, unless we have a specific flag used consistently, I don't believe we
can do this search with Coccinelle. I.e., if we have local variables, and things
are set based on some series of variables, that makes the grammatical expression
difficult to write. So Coccinelle is not designed for this purpose.

But I believe smatch [0] is intended exactly for this sort of purpose, is that
right, Dan? I gave it a cursory look and I think it'd take me significant time
to get such a hunt done.

[0] https://lwn.net/Articles/691882/

Luis

2018-04-28 19:48:16

by Julia Lawall

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love



On Sat, 28 Apr 2018, Luis R. Rodriguez wrote:

> On Sat, Apr 28, 2018 at 01:42:21AM -0700, Christoph Hellwig wrote:
> > On Fri, Apr 27, 2018 at 04:14:56PM +0000, Luis R. Rodriguez wrote:
> > > Do we have a list of users for x86 with a small DMA mask?
> > > Or, given that I'm not aware of a tool to be able to look
> > > for this in an easy way, would it be good to find out which
> > > x86 drivers do have a small mask?
> >
> > Basically you'll have to grep for calls to dma_set_mask/
> > dma_set_coherent_mask/dma_set_mask_and_coherent and their pci_*
> > wrappers with masks smaller 32-bit. Some use numeric values,
> > some use DMA_BIT_MASK and various places uses local variables
> > or struct members to parse them, so finding them will be a bit
> > more work. Nothing a coccinelle expert couldn't solve, though :)
>
> Thing is unless we have a specific flag used consistently I don't believe we
> can do this search with Coccinelle. ie, if we have local variables and based on
> some series of variables things are set, this makes the grammatical expression
> difficult to express. So Cocinelle is not designed for this purpose.
>
> But I believe smatch [0] is intended exactly for this sort of purpose, is that
> right Dan? I gave a cursory look and I think it'd take me significant time to
> get such hunt down.
>
> [0] https://lwn.net/Articles/691882/

FWIW, here is my semantic patch and the output - it reports on things that
appear to be too small and things that it doesn't know about.

What are the relevant pci wrappers? I didn't find them.

julia

@initialize:ocaml@
@@

let clean s = String.concat "" (Str.split (Str.regexp " ") s)

let shorten s = List.nth (Str.split (Str.regexp "linux-next/") s) 1

@bad1 exists@
identifier i,x;
expression e;
position p;
@@

x = DMA_BIT_MASK(i)
...
\(dma_set_mask@p\|dma_set_coherent_mask@p\|dma_set_mask_and_coherent@p\)(e,x)

@bad2@
identifier i;
expression e;
position p;
@@

\(dma_set_mask@p\|dma_set_coherent_mask@p\|dma_set_mask_and_coherent@p\)
(e,DMA_BIT_MASK(i))

@ok1 exists@
identifier x;
expression e;
constant c;
position p != bad1.p;
@@

x = \(DMA_BIT_MASK(c)\|0xffffffff\)
...
\(dma_set_mask@p\|dma_set_coherent_mask@p\|dma_set_mask_and_coherent@p\)(e,x)

@script:ocaml@
p << ok1.p;
c << ok1.c;
@@

let c = int_of_string c in
if c < 32
then
let p = List.hd p in
Printf.printf "too small: %s:%d: %d\n" (shorten p.file) p.line c

@ok2@
expression e;
constant c;
position p != bad2.p;
@@

\(dma_set_mask@p\|dma_set_coherent_mask@p\|dma_set_mask_and_coherent@p\)
(e,\(DMA_BIT_MASK(c)\|0xffffffff\))

@script:ocaml@
p << ok2.p;
c << ok2.c;
@@

let c = int_of_string c in
if c < 32
then
let p = List.hd p in
Printf.printf "too small: %s:%d: %d\n" (shorten p.file) p.line c

@unk@
expression e,e1 != ATA_DMA_MASK;
position p != {ok1.p,ok2.p};
@@

\(dma_set_mask@p\|dma_set_coherent_mask@p\|dma_set_mask_and_coherent@p\)(e,e1)

@script:ocaml@
p << unk.p;
e1 << unk.e1;
@@

let p = List.hd p in
Printf.printf "unknown: %s:%d: %s\n" (shorten p.file) p.line (clean e1)

-----------------

too small: drivers/gpu/drm/i915/i915_drv.c:1138: 30
too small: drivers/net/wireless/broadcom/b43/dma.c:1068: 30
unknown: sound/pci/ctxfi/cthw20k2.c:2033: DMA_BIT_MASK(dma_bits)
unknown: sound/pci/ctxfi/cthw20k2.c:2034: DMA_BIT_MASK(dma_bits)
unknown: drivers/scsi/megaraid/megaraid_sas_base.c:6036: consistent_mask
unknown: drivers/net/wireless/ath/wil6210/txrx.c:200: DMA_BIT_MASK(wil->dma_addr_size)
unknown: drivers/net/ethernet/netronome/nfp/nfp_main.c:452: DMA_BIT_MASK(NFP_NET_MAX_DMA_BITS)
unknown: drivers/gpu/host1x/dev.c:199: host->info->dma_mask
unknown: drivers/iommu/arm-smmu-v3.c:2691: DMA_BIT_MASK(smmu->oas)
too small: sound/pci/es1968.c:2692: 28
too small: sound/pci/es1968.c:2693: 28
too small: drivers/net/wireless/broadcom/b43legacy/dma.c:809: 30
unknown: drivers/virtio/virtio_mmio.c:573: DMA_BIT_MASK(32+PAGE_SHIFT)
unknown: drivers/ata/sata_nv.c:762: pp->adma_dma_mask
unknown: drivers/dma/mmp_pdma.c:1094: pdev->dev->coherent_dma_mask
too small: sound/pci/maestro3.c:2557: 28
too small: sound/pci/maestro3.c:2558: 28
too small: sound/pci/ice1712/ice1712.c:2533: 28
too small: sound/pci/ice1712/ice1712.c:2534: 28
unknown: drivers/net/wireless/ath/wil6210/pmc.c:132: DMA_BIT_MASK(wil->dma_addr_size)
unknown: drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c:313: DMA_BIT_MASK(tdev->func->iommu_bit)
unknown: drivers/net/ethernet/synopsys/dwc-xlgmac-common.c:96: DMA_BIT_MASK(pdata->hw_feat.dma_width)
too small: sound/pci/als4000.c:874: 24
too small: sound/pci/als4000.c:875: 24
unknown: drivers/hwtracing/coresight/coresight-tmc.c:335: DMA_BIT_MASK(dma_mask)
unknown: drivers/dma/xilinx/xilinx_dma.c:2634: DMA_BIT_MASK(addr_width)
too small: sound/pci/sonicvibes.c:1262: 24
too small: sound/pci/sonicvibes.c:1263: 24
too small: sound/pci/es1938.c:1600: 24
too small: sound/pci/es1938.c:1601: 24
unknown: drivers/crypto/ccree/cc_driver.c:260: dma_mask
unknown: sound/pci/hda/hda_intel.c:1888: DMA_BIT_MASK(dma_bits)
unknown: sound/pci/hda/hda_intel.c:1889: DMA_BIT_MASK(dma_bits)
unknown: drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c:1688: DMA_BIT_MASK(bits)
unknown: drivers/net/ethernet/amd/xgbe/xgbe-main.c:294: DMA_BIT_MASK(pdata->hw_feat.dma_width)
too small: sound/pci/ali5451/ali5451.c:2110: 31
too small: sound/pci/ali5451/ali5451.c:2111: 31
unknown: drivers/dma/pxa_dma.c:1375: op->dev.coherent_dma_mask
unknown: drivers/media/platform/qcom/venus/core.c:186: core->res->dma_mask
unknown: drivers/mtd/nand/raw/denali.c:1298: DMA_BIT_MASK(dma_bit)
unknown: drivers/net/wireless/ath/wil6210/pcie_bus.c:299: DMA_BIT_MASK(dma_addr_size[i])
unknown: drivers/gpu/drm/msm/msm_drv.c:1132: ~0
unknown: drivers/net/ethernet/altera/altera_tse_main.c:1449: DMA_BIT_MASK(priv->dmaops->dmamask)
unknown: drivers/net/ethernet/altera/altera_tse_main.c:1450: DMA_BIT_MASK(priv->dmaops->dmamask)
unknown: drivers/net/ethernet/sfc/efx.c:1298: dma_mask
too small: sound/pci/als300.c:661: 28
too small: sound/pci/als300.c:662: 28
unknown: drivers/hwtracing/intel_th/core.c:379: parent->coherent_dma_mask
too small: drivers/media/pci/sta2x11/sta2x11_vip.c:983: 26
too small: drivers/media/pci/sta2x11/sta2x11_vip.c:859: 29
too small: drivers/net/ethernet/broadcom/b44.c:2389: 30
too small: sound/pci/azt3328.c:2421: 24
too small: sound/pci/azt3328.c:2422: 24
too small: sound/pci/trident/trident_main.c:3552: 30
too small: sound/pci/trident/trident_main.c:3553: 30
unknown: drivers/net/ethernet/netronome/nfp/nfp_netvf_main.c:128: DMA_BIT_MASK(NFP_NET_MAX_DMA_BITS)
unknown: drivers/net/ethernet/sfc/falcon/efx.c:1251: dma_mask
unknown: drivers/virtio/virtio_pci_legacy.c:226: DMA_BIT_MASK(32+VIRTIO_PCI_QUEUE_ADDR_SHIFT)
unknown: sound/pci/ctxfi/cthw20k1.c:1908: DMA_BIT_MASK(dma_bits)
unknown: sound/pci/ctxfi/cthw20k1.c:1909: DMA_BIT_MASK(dma_bits)
unknown: drivers/iommu/arm-smmu.c:1848: DMA_BIT_MASK(size)
unknown: drivers/scsi/aic7xxx/aic7xxx_osm_pci.c:242: mask_39bit
unknown: sound/pci/emu10k1/emu10k1_main.c:1910: emu->dma_mask
unknown: drivers/usb/gadget/udc/bdc/bdc_pci.c:86: pci->dev.coherent_dma_mask
too small: sound/pci/sis7019.c:1328: 30

2018-04-28 20:45:32

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Sat, Apr 28, 2018 at 09:46:52PM +0200, Julia Lawall wrote:
> FWIW, here is my semantic patch and the output - it reports on things that
> appear to be too small and things that it doesn't know about.
>
> What are the relevant pci wrappers? I didn't find them.

Basically all of the functions in include/linux/pci-dma-compat.h

> too small: drivers/gpu/drm/i915/i915_drv.c:1138: 30
> too small: drivers/net/wireless/broadcom/b43/dma.c:1068: 30
> unknown: sound/pci/ctxfi/cthw20k2.c:2033: DMA_BIT_MASK(dma_bits)
> unknown: sound/pci/ctxfi/cthw20k2.c:2034: DMA_BIT_MASK(dma_bits)

This one's good:

const unsigned int dma_bits = BITS_PER_LONG;

> unknown: drivers/scsi/megaraid/megaraid_sas_base.c:6036: consistent_mask

and this one:
consistent_mask = (instance->adapter_type == VENTURA_SERIES) ?
DMA_BIT_MASK(64) : DMA_BIT_MASK(32);

> unknown: drivers/net/wireless/ath/wil6210/txrx.c:200: DMA_BIT_MASK(wil->dma_addr_size)

if (wil->dma_addr_size > 32)
dma_set_mask_and_coherent(dev,
DMA_BIT_MASK(wil->dma_addr_size));

> unknown: drivers/net/ethernet/netronome/nfp/nfp_main.c:452: DMA_BIT_MASK(NFP_NET_MAX_DMA_BITS)

drivers/net/ethernet/netronome/nfp/nfp_net.h:#define NFP_NET_MAX_DMA_BITS 40

> unknown: drivers/gpu/host1x/dev.c:199: host->info->dma_mask

Looks safe ...

drivers/gpu/host1x/bus.c: device->dev.coherent_dma_mask = host1x->dev->coherent_dma_mask;
drivers/gpu/host1x/bus.c: device->dev.dma_mask = &device->dev.coherent_dma_mask;
drivers/gpu/host1x/dev.c: .dma_mask = DMA_BIT_MASK(32),
drivers/gpu/host1x/dev.c: .dma_mask = DMA_BIT_MASK(32),
drivers/gpu/host1x/dev.c: .dma_mask = DMA_BIT_MASK(34),
drivers/gpu/host1x/dev.c: .dma_mask = DMA_BIT_MASK(34),
drivers/gpu/host1x/dev.c: .dma_mask = DMA_BIT_MASK(34),
drivers/gpu/host1x/dev.c: dma_set_mask_and_coherent(host->dev, host->info->dma_mask);
drivers/gpu/host1x/dev.h: u64 dma_mask; /* mask of addressable memory */

... but that reminds us that maybe some drivers aren't using dma_set_mask()
but rather touching dma_mask directly.

... 57 more to look at ...

2018-04-29 14:34:52

by Julia Lawall

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

Here are some improved results, also taking into account the pci
functions.

julia

too small: drivers/gpu/drm/i915/i915_drv.c:1138: 30
too small: drivers/hwtracing/coresight/coresight-tmc.c:335: 0
too small: drivers/media/pci/sta2x11/sta2x11_vip.c:859: 29
too small: drivers/media/pci/sta2x11/sta2x11_vip.c:983: 26
too small: drivers/net/ethernet/broadcom/b44.c:2389: 30
too small: drivers/net/wan/wanxl.c:585: 28
too small: drivers/net/wan/wanxl.c:586: 28
too small: drivers/net/wireless/broadcom/b43/dma.c:1068: 30
too small: drivers/net/wireless/broadcom/b43legacy/dma.c:809: 30
too small: drivers/scsi/aacraid/commsup.c:1581: 31
too small: drivers/scsi/aacraid/linit.c:1651: 31
too small: drivers/usb/host/ehci-pci.c:127: 31
too small: sound/pci/ali5451/ali5451.c:2110: 31
too small: sound/pci/ali5451/ali5451.c:2111: 31
too small: sound/pci/als300.c:661: 28
too small: sound/pci/als300.c:662: 28
too small: sound/pci/als4000.c:874: 24
too small: sound/pci/als4000.c:875: 24
too small: sound/pci/azt3328.c:2421: 24
too small: sound/pci/azt3328.c:2422: 24
too small: sound/pci/emu10k1/emu10k1x.c:916: 28
too small: sound/pci/emu10k1/emu10k1x.c:917: 28
too small: sound/pci/es1938.c:1600: 24
too small: sound/pci/es1938.c:1601: 24
too small: sound/pci/es1968.c:2692: 28
too small: sound/pci/es1968.c:2693: 28
too small: sound/pci/ice1712/ice1712.c:2533: 28
too small: sound/pci/ice1712/ice1712.c:2534: 28
too small: sound/pci/maestro3.c:2557: 28
too small: sound/pci/maestro3.c:2558: 28
too small: sound/pci/sis7019.c:1328: 30
too small: sound/pci/sonicvibes.c:1262: 24
too small: sound/pci/sonicvibes.c:1263: 24
too small: sound/pci/trident/trident_main.c:3552: 30
too small: sound/pci/trident/trident_main.c:3553: 30
unknown: arch/x86/pci/sta2x11-fixup.c:169: STA2X11_AMBA_SIZE-1
unknown: arch/x86/pci/sta2x11-fixup.c:170: STA2X11_AMBA_SIZE-1
unknown: drivers/ata/sata_nv.c:762: pp->adma_dma_mask
unknown: drivers/char/agp/intel-gtt.c:1409: DMA_BIT_MASK(mask)
unknown: drivers/char/agp/intel-gtt.c:1413: DMA_BIT_MASK(mask)
unknown: drivers/crypto/ccree/cc_driver.c:260: dma_mask
unknown: drivers/dma/mmp_pdma.c:1094: pdev->dev->coherent_dma_mask
unknown: drivers/dma/pxa_dma.c:1375: op->dev.coherent_dma_mask
unknown: drivers/dma/xilinx/xilinx_dma.c:2634: DMA_BIT_MASK(addr_width)
unknown: drivers/gpu/drm/ati_pcigart.c:117: gart_info->table_mask
unknown: drivers/gpu/drm/msm/msm_drv.c:1132: ~0
unknown: drivers/gpu/drm/nouveau/nvkm/engine/device/tegra.c:313: DMA_BIT_MASK(tdev->func->iommu_bit)
unknown: drivers/gpu/host1x/dev.c:199: host->info->dma_mask
unknown: drivers/hwtracing/intel_th/core.c:379: parent->coherent_dma_mask
unknown: drivers/iommu/arm-smmu.c:1848: DMA_BIT_MASK(size)
unknown: drivers/media/pci/intel/ipu3/ipu3-cio2.c:1759: CIO2_DMA_MASK
unknown: drivers/media/platform/qcom/venus/core.c:186: core->res->dma_mask
unknown: drivers/message/fusion/mptbase.c:4599: ioc->dma_mask
unknown: drivers/message/fusion/mptbase.c:4600: ioc->dma_mask
unknown: drivers/net/ethernet/altera/altera_tse_main.c:1449: DMA_BIT_MASK(priv->dmaops->dmamask)
unknown: drivers/net/ethernet/altera/altera_tse_main.c:1450: DMA_BIT_MASK(priv->dmaops->dmamask)
unknown: drivers/net/ethernet/amazon/ena/ena_netdev.c:2455: DMA_BIT_MASK(dma_width)
unknown: drivers/net/ethernet/amazon/ena/ena_netdev.c:2461: DMA_BIT_MASK(dma_width)
unknown: drivers/net/ethernet/amd/pcnet32.c:1558: PCNET32_DMA_MASK
unknown: drivers/net/ethernet/amd/xgbe/xgbe-main.c:294: DMA_BIT_MASK(pdata->hw_feat.dma_width)
unknown: drivers/net/ethernet/broadcom/bnx2.c:8234: persist_dma_mask
unknown: drivers/net/ethernet/broadcom/tg3.c:17781: persist_dma_mask
unknown: drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c:315: old_mask
unknown: drivers/net/ethernet/qlogic/netxen/netxen_nic_main.c:316: old_cmask
unknown: drivers/net/ethernet/sfc/efx.c:1298: dma_mask
unknown: drivers/net/ethernet/sfc/falcon/efx.c:1251: dma_mask
unknown: drivers/net/ethernet/synopsys/dwc-xlgmac-common.c:96: DMA_BIT_MASK(pdata->hw_feat.dma_width)
unknown: drivers/net/wireless/ath/wil6210/pcie_bus.c:299: DMA_BIT_MASK(dma_addr_size[i])
unknown: drivers/net/wireless/ath/wil6210/pmc.c:132: DMA_BIT_MASK(wil->dma_addr_size)
unknown: drivers/net/wireless/ath/wil6210/txrx.c:200: DMA_BIT_MASK(wil->dma_addr_size)
unknown: drivers/scsi/3w-xxxx.c:2260: TW_DMA_MASK
unknown: drivers/scsi/hptiop.c:1312: DMA_BIT_MASK(iop_ops->hw_dma_bit_mask)
unknown: drivers/scsi/megaraid/megaraid_sas_base.c:6036: consistent_mask
unknown: drivers/scsi/sym53c8xx_2/sym_glue.c:1315: DMA_DAC_MASK
unknown: drivers/usb/gadget/udc/bdc/bdc_pci.c:86: pci->dev.coherent_dma_mask
unknown: sound/pci/emu10k1/emu10k1_main.c:1910: emu->dma_mask

----------------

@initialize:ocaml@
@@

let clean s = String.concat "" (Str.split (Str.regexp " ") s)

let shorten s = List.nth (Str.split (Str.regexp "linux-next/") s) 1

let ios s =
match Str.split_delim (Str.regexp "ULL") s with
[n;""] -> int_of_string n
| _ -> int_of_string s

let number x = try ignore(ios x); true with _ -> false

let smallnumber x = (ios x) < 32

let longnumber x =
number x && String.length x >= 10 && not(String.get x 2 = '0')

@ok1 exists@
position p1,p2;
constant c1 : script:ocaml() { number c1 };
constant c2 : script:ocaml() { number c2 };
expression x, e, e1, e2;
@@

x =@p1 \(BITS_PER_LONG\|NFP_NET_MAX_DMA_BITS\|e1 ? c1 : c2\|c1\|c1+e1\)
... when any
when != x = e2
\(dma_set_mask@p2\|dma_set_coherent_mask@p2\|dma_set_mask_and_coherent@p2\|
pci_set_dma_mask@p2\|pci_set_consistent_dma_mask@p2\)(e,DMA_BIT_MASK(x))

@bad1 exists@
position ok1.p2;
position p1 != ok1.p1;
expression ok1.x, e, e2;
@@

x =@p1 e
... when any
when != x = e2
\(dma_set_mask@p2\|dma_set_coherent_mask@p2\|dma_set_mask_and_coherent@p2\|
pci_set_dma_mask@p2\|pci_set_consistent_dma_mask@p2\)(e,DMA_BIT_MASK(x))

@script:ocaml depends on bad1@
_p2 << ok1.p2;
@@

Coccilib.include_match false

@ok1a exists@
position ok1.p1,ok1.p2;
constant c1 : script:ocaml() { number c1 };
constant c2 : script:ocaml() { number c2 };
expression x, e, e1, e2;
@@

x =@p1 \(BITS_PER_LONG\|NFP_NET_MAX_DMA_BITS\|e1 ? c1 : c2\|c1\|c1+e1\)
... when any
when != x = e2
\(dma_set_mask@p2\|dma_set_coherent_mask@p2\|dma_set_mask_and_coherent@p2\|
pci_set_dma_mask@p2\|pci_set_consistent_dma_mask@p2\)(e,DMA_BIT_MASK(x))

@script:ocaml@
c1 << ok1a.c1;
c2 << ok1a.c2 = "64";
p2 << ok1.p2;
@@

let p2 = List.hd p2 in
(if smallnumber c1
then Printf.printf "too small: %s:%d: %s\n" (shorten p2.file) p2.line c1);
(if smallnumber c2
then Printf.printf "too small: %s:%d: %s\n" (shorten p2.file) p2.line c2)

(* ------------------------------------------------------------------ *)

@ok2 exists@
position p1,p2;
constant c1 : script:ocaml() { number c1 };
constant c2 : script:ocaml() { number c2 };
constant c3 : script:ocaml() { longnumber c3 };
expression x, e, e1, e2, e3;
@@

x =@p1 \(DMA_BIT_MASK(\(BITS_PER_LONG\|NFP_NET_MAX_DMA_BITS\|e1 ? c1 : c2\|c1\|c1+e1\))\|
DMA_BIT_MASK(\(BITS_PER_LONG\|NFP_NET_MAX_DMA_BITS\|e1 ? c1 : c2\|c1\|c1+e1\))&e3\|
c3\|ATA_DMA_MASK\)
... when any
when != x = e2
\(dma_set_mask@p2\|dma_set_coherent_mask@p2\|dma_set_mask_and_coherent@p2\|
pci_set_dma_mask@p2\|pci_set_consistent_dma_mask@p2\)(e,x)

@bad2 exists@
position ok2.p2;
position p1 != ok2.p1;
expression ok2.x, e, e2;
@@

x =@p1 e
... when any
when != x = e2
\(dma_set_mask@p2\|dma_set_coherent_mask@p2\|dma_set_mask_and_coherent@p2\|
pci_set_dma_mask@p2\|pci_set_consistent_dma_mask@p2\)(e,x)

@script:ocaml depends on bad2@
_p2 << ok2.p2;
@@

Coccilib.include_match false

@ok2a exists@
position ok2.p1,ok2.p2;
constant c1 : script:ocaml() { number c1 };
constant c2 : script:ocaml() { number c2 };
constant c3 : script:ocaml() { longnumber c3 };
expression x, e, e1, e2, e3;
@@

x =@p1 \(DMA_BIT_MASK(\(BITS_PER_LONG\|NFP_NET_MAX_DMA_BITS\|e1 ? c1 : c2\|c1\|c1+e1\))\|
DMA_BIT_MASK(\(BITS_PER_LONG\|NFP_NET_MAX_DMA_BITS\|e1 ? c1 : c2\|c1\|c1+e1\))&e3\|
c3\|ATA_DMA_MASK\)
... when any
when != x = e2
\(dma_set_mask@p2\|dma_set_coherent_mask@p2\|dma_set_mask_and_coherent@p2\|
pci_set_dma_mask@p2\|pci_set_consistent_dma_mask@p2\)(e,x)

@script:ocaml@
c1 << ok2a.c1;
c2 << ok2a.c2 = "64";
p2 << ok2.p2;
@@

let p2 = List.hd p2 in
(if smallnumber c1
then Printf.printf "too small: %s:%d: %s\n" (shorten p2.file) p2.line c1);
(if smallnumber c2
then Printf.printf "too small: %s:%d: %s\n" (shorten p2.file) p2.line c2)

(* ------------------------------------------------------------------ *)

@ok3@
position p2;
constant c1 : script:ocaml() { number c1 };
constant c2 : script:ocaml() { number c2 };
constant c3 : script:ocaml() { longnumber c3 };
expression e, e1, e3;
@@

\(dma_set_mask@p2\|dma_set_coherent_mask@p2\|dma_set_mask_and_coherent@p2\|
pci_set_dma_mask@p2\|pci_set_consistent_dma_mask@p2\)
(e,\(DMA_BIT_MASK(\(BITS_PER_LONG\|NFP_NET_MAX_DMA_BITS\|e1 ? c1 : c2\|c1\|c1+e1\))\|
DMA_BIT_MASK(\(BITS_PER_LONG\|NFP_NET_MAX_DMA_BITS\|e1 ? c1 : c2\|c1\|c1+e1\))&e3\|
c3\|ATA_DMA_MASK\))

@script:ocaml@
c1 << ok3.c1;
c2 << ok3.c2 = "64";
p2 << ok3.p2;
@@

let p2 = List.hd p2 in
(if smallnumber c1
then Printf.printf "too small: %s:%d: %s\n" (shorten p2.file) p2.line c1);
(if smallnumber c2
then Printf.printf "too small: %s:%d: %s\n" (shorten p2.file) p2.line c2)

@unk@
position p2 != {ok1.p2,ok2.p2,ok3.p2};
expression e, e1;
@@

\(dma_set_mask@p2\|dma_set_coherent_mask@p2\|dma_set_mask_and_coherent@p2\|
pci_set_dma_mask@p2\|pci_set_consistent_dma_mask@p2\)
(e,e1)

@script:ocaml@
p2 << unk.p2;
e1 << unk.e1;
@@

let p2 = List.hd p2 in
Printf.printf "unknown: %s:%d: %s\n" (shorten p2.file) p2.line (clean e1)

2018-05-03 08:21:35

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

Hi Luis,

On Thu, Apr 26, 2018 at 11:54 PM, Luis R. Rodriguez <[email protected]> wrote:
> x86 implicit and explicit ZONE_DMA users
> -----------------------------------------
>
> We list below all x86 implicit and explicit ZONE_DMA users.
>
> # Explicit x86 users of GFP_DMA or __GFP_DMA
>
> * drivers/iio/common/ssp_sensors - wonder if enabling this on x86 was a mistake.
> Note that this needs SPI and SPI needs HAS_IOMEM. I only see HAS_IOMEM on
> s390 ? But I do think the Intel Minnowboard has SPI, but doubt it has
> the ssp sensor stuff.
>
> * drivers/input/rmi4/rmi_spi.c - same SPI question
> * drivers/media/common/siano/ - make allyesconfig yields it enabled, but
> not sure if this should ever be on x86
> * drivers/media/platform/sti/bdisp/ - likewise
> * drivers/media/platform/sti/hva/ - likewise
> * drivers/media/usb/gspca/ - likewise
> * drivers/mmc/host/wbsd.c - likewise
> * drivers/mtd/nand/gpmi-nand/ - likewise
> * drivers/net/can/spi/hi311x.c - likewise
> * drivers/net/can/spi/mcp251x.c - likewise
> * drivers/net/ethernet/agere/ - likewise
> * drivers/net/ethernet/neterion/vxge/ - likewise
> * drivers/net/ethernet/rocker/ - likewise
> * drivers/net/usb/kalmia.c - likewise
> * drivers/spi/spi-pic32-sqi.c - likewise
> * drivers/spi/spi-sh-msiof.c - likewise

depends on ARCH_SHMOBILE || ARCH_RENESAS || COMPILE_TEST

> * drivers/spi/spi-ti-qspi.c - likewise

I haven't checked the others, but probably you want to disable COMPILE_TEST
to make more educated guesses about driver usage on x86.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2018-05-03 12:14:29

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Thu, May 03, 2018 at 02:03:38PM +0200, Michal Hocko wrote:
> On Sat 28-04-18 19:10:47, Matthew Wilcox wrote:
> > Another way we could approach this is to get rid of ZONE_DMA. Make GFP_DMA
> > a flag which doesn't map to a zone. Rather, it redirects to a separate
> > allocator. At boot, we hand all memory under 16MB to the DMA allocator. The
> > DMA allocator can have a shrinker which just hands back all the memory once
> > we're under memory pressure (if it's never had an allocation).
>
> Yeah, that was exactly the plan with the CMA allocator... We wouldn't
> need the shrinker because who cares about 16MB which is not usable
> anyway.

The CMA pool sounds fine. But please kill GFP_DMA off first / at the
same time. 95% of the users are either completely bogus or should be
using the DMA API, and the few others can use the new allocator directly.

2018-05-05 16:09:58

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Are media drivers abusing of GFP_DMA? - was: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

There was a recent discussion about the use/abuse of the GFP_DMA flag when
allocating memory at LSF/MM 2018 (see Luis's notes enclosed).

The idea seems to be to remove it, using CMA instead. Before doing that, it is
better to check whether what we have in media are valid use cases for it, or
if it is there just due to some misunderstanding (or because it was
copied from some other code).

Hans de Goede sent us today a patch stopping abuse at gspca, and I'm
also posting today two other patches meant to stop abuse of it on USB
drivers. Still, there are 4 platform drivers using it:

$ git grep -l -E "GFP_DMA\\b" drivers/media/
drivers/media/platform/omap3isp/ispstat.c
drivers/media/platform/sti/bdisp/bdisp-hw.c
drivers/media/platform/sti/hva/hva-mem.c
drivers/media/spi/cxd2880-spi.c

Could you please check if GFP_DMA is really needed there, or if it is
just because of some cut-and-paste from some other place?

Thanks!
Mauro


Em Thu, 26 Apr 2018 21:54:06 +0000
"Luis R. Rodriguez" <[email protected]> escreveu:

> Below are my notes on the ZONE_DMA discussion at LSF/MM 2018. There was some
> earlier discussion, prior to my arrival at the session, about moving ZONE_DMA
> around; if someone has notes on that please share too :)
>
> PS. I'm not subscribed to linux-mm
>
> Luis
>
> Determining you don't need to support ZONE_DMA on x86 at run time
> =================================================================
>
> In practice if you don't have a floppy device on x86, you don't need ZONE_DMA,
> and in that case you don't need to support ZONE_DMA; however, currently
> disabling it is only possible at compile time, and we won't know for sure
> until boot time whether you have such a device. Not needing ZONE_DMA, though,
> means we would not have to deal with slab allocators for it and special
> casings for it in a slew of places. In particular even kmalloc() has a branch
> which is always run if CONFIG_ZONE_DMA is enabled.
>
> ZONE_DMA is needed for old devices that require lower addresses, since it
> allows such allocations more reliably. There are likely more devices that
> require this than just the floppy, though.
>
> Christoph Lameter added CONFIG_ZONE_DMA to disable ZONE_DMA at build time but
> most distributions enable this. If we could disable ZONE_DMA at run time once
> we know we don't have any device present requiring it we could get the same
> benefit of compiling without CONFIG_ZONE_DMA at run time.
>
> It used to be that disabling CONFIG_ZONE_DMA could help with performance, but
> we don't seem to have modern benchmarks of the possible gains from removing
> it. Are the gains no longer expected to be significant? Very likely there are
> no performance gains. The assumption then is that the main advantage of being
> able to disable ZONE_DMA on x86 these days would be pure aesthetics, and
> having x86 work more like other architectures with allocations. Use of
> ZONE_DMA in drivers is also a good sign that those drivers are old or may be
> deprecated.
> Perhaps some of these on x86 should be moved to staging.
>
> Note that some architectures rely on ZONE_DMA as well; the above notes
> only apply to x86.
>
> We can use certain kernel mechanisms to disable usage of certain x86 features
> at run time. Below are a few options:
>
> * x86 binary patching
> * ACPI_SIG_FADT
> * static keys
> * compiler multiverse (at least the R&D gcc proof of concept is now complete)
>
> Detecting legacy x86 devices with ACPI ACPI_SIG_FADT
> ----------------------------------------------------
>
> We could expand on ACPI_SIG_FADT with more legacy devices. This mechanism was
> used to help determine if certain legacy x86 devices are present or not with
> paravirtualization. For instance:
>
> * ACPI_FADT_NO_VGA
> * ACPI_FADT_NO_CMOS_RTC
>
> CONFIG_ZONE_DMA
> ---------------
>
> Christoph Lameter added CONFIG_ZONE_DMA through commit 4b51d66989218
> ("[PATCH] optional ZONE_DMA: optional ZONE_DMA in the VM") merged on
> v2.6.21.
>
> On x86 ZONE_DMA is defined as follows:
>
> config ZONE_DMA
> bool "DMA memory allocation support" if EXPERT
> default y
> help
> DMA memory allocation support allows devices with less than 32-bit
> addressing to allocate within the first 16MB of address space.
> Disable if no such devices will be used.
>
> If unsure, say Y.
>
> Most distributions enable CONFIG_ZONE_DMA.
>
> Immediate impact of CONFIG_ZONE_DMA
> -----------------------------------
>
> CONFIG_ZONE_DMA implicates kmalloc() as follows:
>
> struct kmem_cache *kmalloc_slab(size_t size, gfp_t flags)
> {
> ...
> #ifdef CONFIG_ZONE_DMA
> if (unlikely((flags & GFP_DMA)))
> return kmalloc_dma_caches[index];
> #endif
> ...
> }
>
> ZONE_DMA users
> ==============
>
> Turns out there are many more users of ZONE_DMA than expected, even on x86.
>
> Explicit requirements for ZONE_DMA with gfp flags
> -------------------------------------------------
>
> All drivers which explicitly use any of these flags implicate use
> of ZONE_DMA for allocations:
>
> * GFP_DMA
> * __GFP_DMA
>
> Implicit ZONE_DMA users
> -----------------------
>
> There are a series of implicit users of ZONE_DMA which use helpers. These are,
> with details documented further below:
>
> * blk_queue_bounce()
> * blk_queue_bounce_limit()
> * dma_alloc_coherent_gfp_flags()
> * dma_generic_alloc_coherent()
> * intel_alloc_coherent()
> * _regmap_raw_write()
> * mempool_alloc_pages_isa()
>
> x86 implicit and explicit ZONE_DMA users
> -----------------------------------------
>
> We list below all x86 implicit and explicit ZONE_DMA users.
>
> # Explicit x86 users of GFP_DMA or __GFP_DMA
>
> * drivers/iio/common/ssp_sensors - wonder if enabling this on x86 was a mistake.
> Note that this needs SPI and SPI needs HAS_IOMEM. I only see HAS_IOMEM on
> s390 ? But I do think the Intel Minnowboard has SPI, but doubt it has
> the ssp sensor stuff.
>
> * drivers/input/rmi4/rmi_spi.c - same SPI question
> * drivers/media/common/siano/ - make allyesconfig yields it enabled, but
> not sure if this should ever be on x86
> * drivers/media/platform/sti/bdisp/ - likewise
> * drivers/media/platform/sti/hva/ - likewise
> * drivers/media/usb/gspca/ - likewise
> * drivers/mmc/host/wbsd.c - likewise
> * drivers/mtd/nand/gpmi-nand/ - likewise
> * drivers/net/can/spi/hi311x.c - likewise
> * drivers/net/can/spi/mcp251x.c - likewise
> * drivers/net/ethernet/agere/ - likewise
> * drivers/net/ethernet/neterion/vxge/ - likewise
> * drivers/net/ethernet/rocker/ - likewise
> * drivers/net/usb/kalmia.c - likewise
> * drivers/spi/spi-pic32-sqi.c - likewise
> * drivers/spi/spi-sh-msiof.c - likewise
> * drivers/spi/spi-ti-qspi.c - likewise
>
> * drivers/tty/serial/mxs-auart.c - likewise - MXS AUART support
> * drivers/tty/synclink.c - likewise Microgate SyncLink card support
> * drivers/uio/uio_pruss - Texas Instruments PRUSS driver
> * drivers/usb/dwc2 - CONFIG_USB_DWC2_DUAL_ROLE - DesignWare USB2 DRD Core Support for dual role mode
> * drivers/usb/gadget/udc/ USB_GR_UDC - Aeroflex Gaisler GRUSBDC USB Peripheral Controller Driver
> * drivers/video/fbdev/da8xx-fb.c - FB_DA8XX DA8xx/OMAP-L1xx/AM335x Framebuffer support
> * drivers/video/fbdev/mb862xx/mb862xxfb_accel.c - CONFIG_FB_MB862XX - Fujitsu MB862xx GDC support
> * drivers/video/fbdev/vermilion/vermilion.c - Intel LE80578 (Vermilion) support
>
> Then we have a few drivers which we know we need on x86 but for these
> we could use a run time flip to enable ZONE_DMA.
>
> * drivers/net/ethernet/broadcom/b44.c - bleh, yeah, and there are some hw bug
> workarounds for this, *but* again since it's also odd, we could deal with this
> at run time
> * drivers/net/wimax/i2400m/ - ugh, who cares about this crap anyway nowadays, my
> point being another run time oddity
> * drivers/net/wireless/broadcom/b43legacy/ - ugh, same
> * drivers/platform/x86/asus-wmi.c - ugh same
> * drivers/platform/x86/dell-smbios.c - ugh same
>
> Staging drivers are expected to have flaws, but worth noting.
>
> * drivers/staging/ - scattered drivers, rtlwifi/ is probably the only relevant one for x86
>
> SCSI is *severely* affected:
> * drivers/scsi/aacraid/ - crap Adaptec AACRAID support
> * drivers/scsi/ch.c - SCSI media changer support...
> * drivers/scsi/initio.c - Initio INI-9X00U/UW SCSI device driver...
> * drivers/scsi/osst.c - CHR_DEV_OSST - SCSI OnStream SC-x0 tape support...
> * drivers/scsi/pmcraid.c - CONFIG_SCSI_PMCRAID - PMC SIERRA Linux MaxRAID adapter support
> * drivers/scsi/snic/ - Cisco SNIC Driver
> * drivers/mmc/core/mmc_test.c - MMC_TEST - MMC host test driver
> * drivers/net/wireless/broadcom/b43/ - means we'd have to at least use
> static keys
>
> Larger blockers (now I see one reason why SCSI is a disaster):
>
> * drivers/scsi/hosts.c - scsi_host_alloc() always uses
> __GFP_DMA if (sht->unchecked_isa_dma && privsize)
> this could likely be adjusted or split off to other
> callers where we know this to be true.
> * drivers/scsi/scsi_scan.c - scsi_probe_and_add_lun() has a similar check
> * drivers/scsi/sg.c - sg_build_indirect() has similar check
> * drivers/scsi/sr.c - get_capabilities() *always* uses GFP_DMA
> which is called on sr_probe() WTF
> Don't drop this -- it's cdrom
> * drivers/scsi/sr_ioctl.c - seriously...
> * drivers/scsi/sr_vendor.c - sr_cd_check() - checks if the CD is
> multisession, asks for offset etc
> * drivers/scsi/st.c - SCSI tape support - on enlarge_buffer() this
> call BTW is recursive.. called on st_open(), the struct
> file_operations open()...
>
> Larger blockers (SPI is also severely affected):
> * drivers/spi/spi.c - spi_pump_messages() which processes spi message queue
>
> Larger blockers:
>
> * drivers/tty/hvc/hvc_iucv.c - z/VM IUCV hypervisor console
>
> And finally a non-issue:
>
> * drivers/xen/swiotlb-xen.c - used on struct dma_map_ops
> xen_swiotlb_dma_ops alloc() only to check if the caller
> used it to set the dma_mask:
> dma_mask = dev->coherent_dma_mask;
> if (!dma_mask)
> dma_mask = (gfp & GFP_DMA) ? DMA_BIT_MASK(24) : DMA_BIT_MASK(32);
>
> That's the end of the review of all current explicit callers on x86.
>
> # blk_queue_bounce()
>
> void blk_queue_bounce(struct request_queue *q, struct bio **bio_orig)
> {
> ...
> /*
> * for non-isa bounce case, just check if the bounce pfn is equal
> * to or bigger than the highest pfn in the system -- in that case,
> * don't waste time iterating over bio segments
> */
> if (!(q->bounce_gfp & GFP_DMA)) {
> if (q->limits.bounce_pfn >= blk_max_pfn)
> return;
> pool = page_pool;
> } else {
> BUG_ON(!isa_page_pool);
> pool = isa_page_pool;
> }
> ...
> }
>
> # blk_queue_bounce_limit()
>
> void blk_queue_bounce_limit(struct request_queue *q, u64 max_addr)
> {
> unsigned long b_pfn = max_addr >> PAGE_SHIFT;
> int dma = 0;
>
> q->bounce_gfp = GFP_NOIO;
> #if BITS_PER_LONG == 64
> /*
> * Assume anything <= 4GB can be handled by IOMMU. Actually
> * some IOMMUs can handle everything, but I don't know of a
> * way to test this here.
> */
> if (b_pfn < (min_t(u64, 0xffffffffUL, BLK_BOUNCE_HIGH) >> PAGE_SHIFT))
> dma = 1;
> q->limits.bounce_pfn = max(max_low_pfn, b_pfn);
> #else
> if (b_pfn < blk_max_low_pfn)
> dma = 1;
> q->limits.bounce_pfn = b_pfn;
> #endif
> if (dma) {
> init_emergency_isa_pool();
> q->bounce_gfp = GFP_NOIO | GFP_DMA;
> q->limits.bounce_pfn = b_pfn;
> }
> }
>
> # dma_alloc_coherent_gfp_flags() and dma_generic_alloc_coherent()
>
> dma_alloc_coherent_gfp_flags() sets GFP_DMA if (dma_mask <= DMA_BIT_MASK(24)).
> Likewise for dma_generic_alloc_coherent().
>
> # intel_alloc_coherent()
>
> intel_alloc_coherent() on drivers/iommu/intel-iommu.c also uses GFP_DMA
> for DMA_BIT_MASK(32), part of the struct dma_map_ops intel_dma_ops on alloc().
>
> # _regmap_raw_write()
>
> _regmap_raw_write() seems to always use GFP_DMA for async writes.
>
> # mempool_alloc_pages_isa()
>
> It implies you use GFP_DMA.
>
> Architectures removed which used ZONE_DMA
> -----------------------------------------
>
> Although this topic pertains to x86, it's worth mentioning that in the v4.17-rc1
> release 8 architectures were removed: blackfin, cris, frv, m32r, metag,
> mn10300, score, tile. Of these 8 architectures, 3 defined and used their own
> ZONE_DMA:
>
> * blackfin
> * cris
> * m32r



Thanks,
Mauro

2018-05-07 13:27:26

by Laurent Pinchart

[permalink] [raw]
Subject: Re: Are media drivers abusing of GFP_DMA? - was: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

Hi Mauro,

On Saturday, 5 May 2018 19:08:15 EEST Mauro Carvalho Chehab wrote:
> There was a recent discussion about the use/abuse of GFP_DMA flag when
> allocating memory at LSF/MM 2018 (see Luis' notes enclosed).
>
> The idea seems to be to remove it, using CMA instead. Before doing that,
> better to check if what we have in media are valid use cases for it, or
> if it is there just due to some misunderstanding (or because it was
> copied from some other code).
>
> Hans de Goede sent us today a patch stopping abuse at gspca, and I'm
> also posting today two other patches meant to stop abuse of it on USB
> drivers. Still, there are 4 platform drivers using it:
>
> $ git grep -l -E "GFP_DMA\\b" drivers/media/
> drivers/media/platform/omap3isp/ispstat.c
> drivers/media/platform/sti/bdisp/bdisp-hw.c
> drivers/media/platform/sti/hva/hva-mem.c
> drivers/media/spi/cxd2880-spi.c
>
> Could you please check if GFP_DMA is really needed there, or if it is
> just because of some cut-and-paste from some other place?

I started looking at that for the omap3isp driver but Sakari beat me to
submitting a patch. GFP_DMA isn't needed for omap3isp.

--
Regards,

Laurent Pinchart




2018-05-07 15:20:03

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: Are media drivers abusing of GFP_DMA? - was: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

Em Mon, 07 May 2018 16:26:08 +0300
Laurent Pinchart <[email protected]> escreveu:

> Hi Mauro,
>
> On Saturday, 5 May 2018 19:08:15 EEST Mauro Carvalho Chehab wrote:
> > There was a recent discussion about the use/abuse of GFP_DMA flag when
> > allocating memory at LSF/MM 2018 (see Luis' notes enclosed).
> >
> > The idea seems to be to remove it, using CMA instead. Before doing that,
> > better to check if what we have in media are valid use cases for it, or
> > if it is there just due to some misunderstanding (or because it was
> > copied from some other code).
> >
> > Hans de Goede sent us today a patch stopping abuse at gspca, and I'm
> > also posting today two other patches meant to stop abuse of it on USB
> > drivers. Still, there are 4 platform drivers using it:
> >
> > $ git grep -l -E "GFP_DMA\\b" drivers/media/
> > drivers/media/platform/omap3isp/ispstat.c
> > drivers/media/platform/sti/bdisp/bdisp-hw.c
> > drivers/media/platform/sti/hva/hva-mem.c
> > drivers/media/spi/cxd2880-spi.c
> >
> > Could you please check if GFP_DMA is really needed there, or if it is
> > just because of some cut-and-paste from some other place?
>
> I started looking at that for the omap3isp driver but Sakari beat me to
> submitting a patch. GFP_DMA isn't needed for omap3isp.
>
Thank you both for looking into it.

Regards,
Mauro



Thanks,
Mauro

2018-05-10 04:41:14

by Takiguchi, Yasunari

[permalink] [raw]
Subject: RE: Are media drivers abusing of GFP_DMA? - was: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

Dear Mauro

> -----Original Message-----
>
> There was a recent discussion about the use/abuse of GFP_DMA flag when
> allocating memory at LSF/MM 2018 (see Luis' notes enclosed).
>
> The idea seems to be to remove it, using CMA instead. Before doing that,
> better to check if what we have in media are valid use cases for it,
> or
> if it is there just due to some misunderstanding (or because it was
> copied from some other code).
>
> Hans de Goede sent us today a patch stopping abuse at gspca, and I'm
> also posting today two other patches meant to stop abuse of it on USB
> drivers. Still, there are 4 platform drivers using it:
>
> $ git grep -l -E "GFP_DMA\\b" drivers/media/
> drivers/media/platform/omap3isp/ispstat.c
> drivers/media/platform/sti/bdisp/bdisp-hw.c
> drivers/media/platform/sti/hva/hva-mem.c
> drivers/media/spi/cxd2880-spi.c
>
> Could you please check if GFP_DMA is really needed there, or if it is
> just because of some cut-and-paste from some other place?
About drivers/media/spi/cxd2880-spi.c:
we referred to the kmalloc() call in the spi_write_then_read() function of drivers/spi/spi.c when writing this code.
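For context, the pattern being referenced looks roughly like this (paraphrased from the spi_write_then_read() of that era; the exact code may differ):

```c
/* drivers/spi/spi.c, spi_write_then_read() (paraphrased): a bounce
 * buffer allocated with GFP_DMA "just in case" the SPI controller's
 * DMA engine needs low memory. */
local_buf = kmalloc(max((unsigned)SPI_BUFSIZ, n_tx + n_rx),
                    GFP_KERNEL | GFP_DMA);
```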

Regards,
Takiguchi

2018-05-14 08:01:40

by Fabien DESSENNE

[permalink] [raw]
Subject: Re: Are media drivers abusing of GFP_DMA? - was: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love



On 07/05/18 17:19, Mauro Carvalho Chehab wrote:
> Em Mon, 07 May 2018 16:26:08 +0300
> Laurent Pinchart <[email protected]> escreveu:
>
>> Hi Mauro,
>>
>> On Saturday, 5 May 2018 19:08:15 EEST Mauro Carvalho Chehab wrote:
>>> There was a recent discussion about the use/abuse of GFP_DMA flag when
>>> allocating memory at LSF/MM 2018 (see Luis' notes enclosed).
>>>
>>> The idea seems to be to remove it, using CMA instead. Before doing that,
>>> better to check if what we have in media are valid use cases for it, or
>>> if it is there just due to some misunderstanding (or because it was
>>> copied from some other code).
>>>
>>> Hans de Goede sent us today a patch stopping abuse at gspca, and I'm
>>> also posting today two other patches meant to stop abuse of it on USB
>>> drivers. Still, there are 4 platform drivers using it:
>>>
>>> $ git grep -l -E "GFP_DMA\\b" drivers/media/
>>> drivers/media/platform/omap3isp/ispstat.c
>>> drivers/media/platform/sti/bdisp/bdisp-hw.c
>>> drivers/media/platform/sti/hva/hva-mem.c

Hi Mauro,

The two STI drivers (bdisp-hw.c and hva-mem.c) are only expected to run
on ARM platforms, not on x86.
Since this thread deals with x86 & DMA trouble, I am not sure that we
actually have a problem for the sti drivers.

There are some other sti drivers that make use of this GFP_DMA flag
(drivers/gpu/drm/sti/sti_*.c) and it does not seem to be a problem.

Nevertheless I can see that the media sti drivers depend on COMPILE_TEST
(which is not the case for the DRM ones).
Would it be an acceptable solution to remove the COMPILE_TEST dependency?

BR

Fabien

>>> drivers/media/spi/cxd2880-spi.c
>>>
>>> Could you please check if GFP_DMA is really needed there, or if it is
>>> just because of some cut-and-paste from some other place?
>> I started looking at that for the omap3isp driver but Sakari beat me to
>> submitting a patch. GFP_DMA isn't needed for omap3isp.
>>
> Thank you both for looking into it.
>
> Regards,
> Mauro
>
>
>
> Thanks,
> Mauro

2018-05-14 10:35:53

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: Are media drivers abusing of GFP_DMA? - was: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

Hi Fabien,

Em Mon, 14 May 2018 08:00:37 +0000
Fabien DESSENNE <[email protected]> escreveu:

> On 07/05/18 17:19, Mauro Carvalho Chehab wrote:
> > Em Mon, 07 May 2018 16:26:08 +0300
> > Laurent Pinchart <[email protected]> escreveu:
> >
> >> Hi Mauro,
> >>
> >> On Saturday, 5 May 2018 19:08:15 EEST Mauro Carvalho Chehab wrote:
> >>> There was a recent discussion about the use/abuse of GFP_DMA flag when
> >>> allocating memory at LSF/MM 2018 (see Luis' notes enclosed).
> >>>
> >>> The idea seems to be to remove it, using CMA instead. Before doing that,
> >>> better to check if what we have in media are valid use cases for it, or
> >>> if it is there just due to some misunderstanding (or because it was
> >>> copied from some other code).
> >>>
> >>> Hans de Goede sent us today a patch stopping abuse at gspca, and I'm
> >>> also posting today two other patches meant to stop abuse of it on USB
> >>> drivers. Still, there are 4 platform drivers using it:
> >>>
> >>> $ git grep -l -E "GFP_DMA\\b" drivers/media/
> >>> drivers/media/platform/omap3isp/ispstat.c
> >>> drivers/media/platform/sti/bdisp/bdisp-hw.c
> >>> drivers/media/platform/sti/hva/hva-mem.c
>
> Hi Mauro,
>
> The two STI drivers (bdisp-hw.c and hva-mem.c) are only expected to run
> on ARM platforms, not on x86.
> Since this thread deals with x86 & DMA trouble, I am not sure that we
> actually have a problem for the sti drivers.
>
> There are some other sti drivers that make use of this GFP_DMA flag
> (drivers/gpu/drm/sti/sti_*.c) and it does not seem to be a problem.
>
> Nevertheless I can see that the media sti drivers depend on COMPILE_TEST
> (which is not the case for the DRM ones).
> Would it be an acceptable solution to remove the COMPILE_TEST dependency?

This has nothing to do with either x86 or COMPILE_TEST. The thing is
that there's a plan for removing GFP_DMA from the Kernel[1], as it was
originally meant to be used only by old PCs, where the DMA controllers
could address only the bottom 16 MB of memory (24 bits). IMHO, it is
very unlikely that any ARM SoC has such a limitation.

[1] https://lwn.net/Articles/753273/ (article will be freely available
on May, 17)

Anyway, before the removal of GFP_DMA happens, I'd like to better
understand why we're using it in media, and whether we can, instead,
set the DMA bit mask, just like almost all other media drivers
that need to confine DMA to a certain range do. In the case
of ARM, this is what we currently have:

drivers/media/platform/exynos-gsc/gsc-core.c: vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
drivers/media/platform/exynos4-is/fimc-core.c: vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
drivers/media/platform/exynos4-is/fimc-is.c: vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
drivers/media/platform/exynos4-is/fimc-lite.c: vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
drivers/media/platform/mtk-mdp/mtk_mdp_core.c: vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
drivers/media/platform/omap3isp/isp.c: ret = dma_coerce_mask_and_coherent(isp->dev, DMA_BIT_MASK(32));
drivers/media/platform/s5p-g2d/g2d.c: vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
drivers/media/platform/s5p-jpeg/jpeg-core.c: vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
drivers/media/platform/s5p-mfc/s5p_mfc.c: DMA_BIT_MASK(32));
drivers/media/platform/s5p-mfc/s5p_mfc.c: DMA_BIT_MASK(32));
drivers/media/platform/s5p-mfc/s5p_mfc.c: vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
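A minimal sketch of the substitution suggested above, with illustrative names (`buf`, `size`, `pdev`, `ret`) and an assumed 32-bit limit; the real mask depends on the hardware:

```c
/* Before: steer the allocation through the slab zone. */
buf = kmalloc(size, GFP_KERNEL | GFP_DMA);

/* After: declare the device's addressing limit once (typically in probe)
 * and drop GFP_DMA; the DMA mapping API then places or bounces buffers
 * as needed. */
ret = dma_coerce_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
if (ret)
        return ret;

buf = kmalloc(size, GFP_KERNEL);
```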

>
> BR
>
> Fabien
>
> >>> drivers/media/spi/cxd2880-spi.c
> >>>
> >>> Could you please check if GFP_DMA is really needed there, or if it is
> >>> just because of some cut-and-paste from some other place?
> >> I started looking at that for the omap3isp driver but Sakari beat me to
> >> submitting a patch. GFP_DMA isn't needed for omap3isp.
> >>
> > Thank you both for looking into it.
> >
> > Regards,
> > Mauro
> >
> >
> >
> > Thanks,
> > Mauro



Thanks,
Mauro

2018-05-14 10:40:46

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: Are media drivers abusing of GFP_DMA? - was: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

Em Mon, 14 May 2018 07:35:03 -0300
Mauro Carvalho Chehab <[email protected]> escreveu:

> Hi Fabien,
>
> Em Mon, 14 May 2018 08:00:37 +0000
> Fabien DESSENNE <[email protected]> escreveu:
>
> > On 07/05/18 17:19, Mauro Carvalho Chehab wrote:
> > > Em Mon, 07 May 2018 16:26:08 +0300
> > > Laurent Pinchart <[email protected]> escreveu:
> > >
> > >> Hi Mauro,
> > >>
> > >> On Saturday, 5 May 2018 19:08:15 EEST Mauro Carvalho Chehab wrote:
> > >>> There was a recent discussion about the use/abuse of GFP_DMA flag when
> > >>> allocating memory at LSF/MM 2018 (see Luis' notes enclosed).
> > >>>
> > >>> The idea seems to be to remove it, using CMA instead. Before doing that,
> > >>> better to check if what we have in media are valid use cases for it, or
> > >>> if it is there just due to some misunderstanding (or because it was
> > >>> copied from some other code).
> > >>>
> > >>> Hans de Goede sent us today a patch stopping abuse at gspca, and I'm
> > >>> also posting today two other patches meant to stop abuse of it on USB
> > >>> drivers. Still, there are 4 platform drivers using it:
> > >>>
> > >>> $ git grep -l -E "GFP_DMA\\b" drivers/media/
> > >>> drivers/media/platform/omap3isp/ispstat.c
> > >>> drivers/media/platform/sti/bdisp/bdisp-hw.c
> > >>> drivers/media/platform/sti/hva/hva-mem.c
> >
> > Hi Mauro,
> >
> > The two STI drivers (bdisp-hw.c and hva-mem.c) are only expected to run
> > on ARM platforms, not on x86.
> > Since this thread deals with x86 & DMA trouble, I am not sure that we
> > actually have a problem for the sti drivers.
> >
> > There are some other sti drivers that make use of this GFP_DMA flag
> > (drivers/gpu/drm/sti/sti_*.c) and it does not seem to be a problem.
> >
> > Nevertheless I can see that the media sti drivers depend on COMPILE_TEST
> > (which is not the case for the DRM ones).
> > Would it be an acceptable solution to remove the COMPILE_TEST dependency?
>
> This has nothing to do with either x86 or COMPILE_TEST. The thing is
> that there's a plan for removing GFP_DMA from the Kernel[1], as it was
> originally meant to be used only by old PCs, where the DMA controllers
> could address only the bottom 16 MB of memory (24 bits). IMHO, it is
> very unlikely that any ARM SoC has such a limitation.
>
> [1] https://lwn.net/Articles/753273/ (article will be freely available
> on May, 17)

Btw, you can also read about that at:
https://lwn.net/Articles/753274/

>
> Anyway, before the removal of GFP_DMA happens, I'd like to better
> understand why we're using it at media, and if we can, instead,
> set the DMA bit mask, just like almost all other media drivers
> that require to confine DMA into a certain range do. In the case
> of ARM, this is what we currently have:
>
> drivers/media/platform/exynos-gsc/gsc-core.c: vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
> drivers/media/platform/exynos4-is/fimc-core.c: vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
> drivers/media/platform/exynos4-is/fimc-is.c: vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
> drivers/media/platform/exynos4-is/fimc-lite.c: vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
> drivers/media/platform/mtk-mdp/mtk_mdp_core.c: vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
> drivers/media/platform/omap3isp/isp.c: ret = dma_coerce_mask_and_coherent(isp->dev, DMA_BIT_MASK(32));
> drivers/media/platform/s5p-g2d/g2d.c: vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
> drivers/media/platform/s5p-jpeg/jpeg-core.c: vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
> drivers/media/platform/s5p-mfc/s5p_mfc.c: DMA_BIT_MASK(32));
> drivers/media/platform/s5p-mfc/s5p_mfc.c: DMA_BIT_MASK(32));
> drivers/media/platform/s5p-mfc/s5p_mfc.c: vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
>
> >
> > BR
> >
> > Fabien
> >
> > >>> drivers/media/spi/cxd2880-spi.c
> > >>>
> > >>> Could you please check if GFP_DMA is really needed there, or if it is
> > >>> just because of some cut-and-paste from some other place?
> > >> I started looking at that for the omap3isp driver but Sakari beat me to
> > >> submitting a patch. GFP_DMA isn't needed for omap3isp.
> > >>
> > > Thank you both for looking into it.
> > >
> > > Regards,
> > > Mauro
> > >
> > >
> > >
> > > Thanks,
> > > Mauro
>
>
>
> Thanks,
> Mauro



Thanks,
Mauro

2018-05-15 07:31:20

by Fabien DESSENNE

[permalink] [raw]
Subject: Re: Are media drivers abusing of GFP_DMA? - was: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love



On 14/05/18 12:39, Mauro Carvalho Chehab wrote:
> Em Mon, 14 May 2018 07:35:03 -0300
> Mauro Carvalho Chehab <[email protected]> escreveu:
>
>> Hi Fabien,
>>
>> Em Mon, 14 May 2018 08:00:37 +0000
>> Fabien DESSENNE <[email protected]> escreveu:
>>
>>> On 07/05/18 17:19, Mauro Carvalho Chehab wrote:
>>>> Em Mon, 07 May 2018 16:26:08 +0300
>>>> Laurent Pinchart <[email protected]> escreveu:
>>>>
>>>>> Hi Mauro,
>>>>>
>>>>> On Saturday, 5 May 2018 19:08:15 EEST Mauro Carvalho Chehab wrote:
>>>>>> There was a recent discussion about the use/abuse of GFP_DMA flag when
>>>>>> allocating memory at LSF/MM 2018 (see Luis' notes enclosed).
>>>>>>
>>>>>> The idea seems to be to remove it, using CMA instead. Before doing that,
>>>>>> better to check if what we have in media are valid use cases for it, or
>>>>>> if it is there just due to some misunderstanding (or because it was
>>>>>> copied from some other code).
>>>>>>
>>>>>> Hans de Goede sent us today a patch stopping abuse at gspca, and I'm
>>>>>> also posting today two other patches meant to stop abuse of it on USB
>>>>>> drivers. Still, there are 4 platform drivers using it:
>>>>>>
>>>>>> $ git grep -l -E "GFP_DMA\\b" drivers/media/
>>>>>> drivers/media/platform/omap3isp/ispstat.c
>>>>>> drivers/media/platform/sti/bdisp/bdisp-hw.c
>>>>>> drivers/media/platform/sti/hva/hva-mem.c
>>> Hi Mauro,
>>>
>>> The two STI drivers (bdisp-hw.c and hva-mem.c) are only expected to run
>>> on ARM platforms, not on x86.
>>> Since this thread deals with x86 & DMA trouble, I am not sure that we
>>> actually have a problem for the sti drivers.
>>>
>>> There are some other sti drivers that make use of this GFP_DMA flag
>>> (drivers/gpu/drm/sti/sti_*.c) and it does not seem to be a problem.
>>>
>>> Nevertheless I can see that the media sti drivers depend on COMPILE_TEST
>>> (which is not the case for the DRM ones).
>>> Would it be an acceptable solution to remove the COMPILE_TEST dependency?
>> This has nothing to do with either x86 or COMPILE_TEST. The thing is
>> that there's a plan for removing GFP_DMA from the Kernel[1], as it was
>> originally meant to be used only by old PCs, where the DMA controllers
>> could address only the bottom 16 MB of memory (24 bits). IMHO, it is
>> very unlikely that any ARM SoC has such a limitation.
>>
>> [1] https://lwn.net/Articles/753273/ (article will be freely available
>> on May, 17)
> Btw, you can also read about that at:
> https://lwn.net/Articles/753274/
>
>> Anyway, before the removal of GFP_DMA happens, I'd like to better
>> understand why we're using it at media, and if we can, instead,
>> set the DMA bit mask, just like almost all other media drivers
>> that require to confine DMA into a certain range do. In the case
>> of ARM, this is what we currently have:
>>
>> drivers/media/platform/exynos-gsc/gsc-core.c: vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
>> drivers/media/platform/exynos4-is/fimc-core.c: vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
>> drivers/media/platform/exynos4-is/fimc-is.c: vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
>> drivers/media/platform/exynos4-is/fimc-lite.c: vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
>> drivers/media/platform/mtk-mdp/mtk_mdp_core.c: vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
>> drivers/media/platform/omap3isp/isp.c: ret = dma_coerce_mask_and_coherent(isp->dev, DMA_BIT_MASK(32));
>> drivers/media/platform/s5p-g2d/g2d.c: vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
>> drivers/media/platform/s5p-jpeg/jpeg-core.c: vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
>> drivers/media/platform/s5p-mfc/s5p_mfc.c: DMA_BIT_MASK(32));
>> drivers/media/platform/s5p-mfc/s5p_mfc.c: DMA_BIT_MASK(32));
>> drivers/media/platform/s5p-mfc/s5p_mfc.c: vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
>>

That's clearer now, thank you for the clarification.
I am about to send patches for the sti drivers (setting the DMA bit mask).

BR,
Fabien

>>> BR
>>>
>>> Fabien
>>>
>>>>>> drivers/media/spi/cxd2880-spi.c
>>>>>>
>>>>>> Could you please check if GFP_DMA is really needed there, or if it is
>>>>>> just because of some cut-and-paste from some other place?
>>>>> I started looking at that for the omap3isp driver but Sakari beat me at
>>>>> submitting a patch. GFP_DMA isn't needed for omap3isp.
>>>>>
>>>> Thank you both for looking into it.
>>>>
>>>> Regards,
>>>> Mauro
>>>>
>>>>
>>>>
>>>> Thanks,
>>>> Mauro
>>
>>
>> Thanks,
>> Mauro
>
>
> Thanks,
> Mauro

2018-05-15 08:29:21

by Laurent Pinchart

[permalink] [raw]
Subject: Re: Are media drivers abusing of GFP_DMA? - was: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

Hello,

On Tuesday, 15 May 2018 10:30:28 EEST Fabien DESSENNE wrote:
> On 14/05/18 12:39, Mauro Carvalho Chehab wrote:
> > Em Mon, 14 May 2018 07:35:03 -0300 Mauro Carvalho Chehab escreveu:
> >> Em Mon, 14 May 2018 08:00:37 +0000 Fabien DESSENNE escreveu:
> >>> On 07/05/18 17:19, Mauro Carvalho Chehab wrote:
> >>>> Em Mon, 07 May 2018 16:26:08 +0300 Laurent Pinchart escreveu:
> >>>>> On Saturday, 5 May 2018 19:08:15 EEST Mauro Carvalho Chehab wrote:
> >>>>>
> >>>>>> There was a recent discussion about the use/abuse of GFP_DMA flag
> >>>>>> when allocating memory at LSF/MM 2018 (see Luis' notes enclosed).
> >>>>>>
> >>>>>> The idea seems to be to remove it, using CMA instead. Before doing
> >>>>>> that, better to check if what we have in media are valid use cases
> >>>>>> for it, or if it is there just due to some misunderstanding (or
> >>>>>> because it was copied from some other code).
> >>>>>>
> >>>>>> Hans de Goede sent us today a patch stopping abuse at gspca, and I'm
> >>>>>> also posting today two other patches meant to stop abuse of it on
> >>>>>> USB drivers. Still, there are 4 platform drivers using it:
> >>>>>>
> >>>>>> $ git grep -l -E "GFP_DMA\\b" drivers/media/
> >>>>>> drivers/media/platform/omap3isp/ispstat.c
> >>>>>> drivers/media/platform/sti/bdisp/bdisp-hw.c
> >>>>>> drivers/media/platform/sti/hva/hva-mem.c
> >>>
> >>> The two STI drivers (bdisp-hw.c and hva-mem.c) are only expected to run
> >>> on ARM platforms, not on x86. Since this thread deals with x86 & DMA
> >>> trouble, I am not sure that we actually have a problem for the sti
> >>> drivers.
> >>>
> >>> There are some other sti drivers that make use of this GFP_DMA flag
> >>> (drivers/gpu/drm/sti/sti_*.c) and it does not seem to be a problem.
> >>>
> >>> Nevertheless I can see that the media sti drivers depend on COMPILE_TEST
> >>> (which is not the case for the DRM ones).
> >>> Would it be an acceptable solution to remove the COMPILE_TEST
> >>> dependency?
> >>
> >> This has nothing to do with either x86 or COMPILE_TEST. The thing is
> >> that there's a plan for removing GFP_DMA from the Kernel[1], as it was
> >> originally meant to be used only by old PCs, where the DMA controllers
> >> could address only the bottom 16 MB of memory (24 bits). IMHO, it is
> >> very unlikely that any ARM SoC has such a limitation.
> >>
> >> [1] https://lwn.net/Articles/753273/ (article will be freely available
> >> on May, 17)
> >
> > Btw, you can also read about that at:
> >
> > https://lwn.net/Articles/753274/
> >
> >> Anyway, before the removal of GFP_DMA happens, I'd like to better
> >> understand why we're using it at media, and if we can, instead,
> >> set the DMA bit mask, just like almost all other media drivers
> >> that require to confine DMA into a certain range do. In the case
> >> of ARM, this is what we currently have:
> >>
> >> drivers/media/platform/exynos-gsc/gsc-core.c:
> >> vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
> >> drivers/media/platform/exynos4-is/fimc-core.c:
> >> vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
> >> drivers/media/platform/exynos4-is/fimc-is.c:
> >> vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
> >> drivers/media/platform/exynos4-is/fimc-lite.c:
> >> vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
> >> drivers/media/platform/mtk-mdp/mtk_mdp_core.c:
> >> vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
> >> drivers/media/platform/omap3isp/isp.c: ret =
> >> dma_coerce_mask_and_coherent(isp->dev, DMA_BIT_MASK(32));
> >> drivers/media/platform/s5p-g2d/g2d.c:
> >> vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
> >> drivers/media/platform/s5p-jpeg/jpeg-core.c:
> >> vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
> >> drivers/media/platform/s5p-mfc/s5p_mfc.c:
> >> DMA_BIT_MASK(32));
> >> drivers/media/platform/s5p-mfc/s5p_mfc.c:
> >> DMA_BIT_MASK(32));
> >> drivers/media/platform/s5p-mfc/s5p_mfc.c:
> >> vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
>
> That's clearer now, thank you for the clarification
> I am about to send patches for the sti drivers (set the DMA bit mask)

Some drivers call vb2_dma_contig_set_max_seg_size() and some call
dma_coerce_mask_and_coherent(). Both are likely needed, the former telling the
DMA mapping API about the maximum size of a scatter-gather chunk that the
device supports (when using vb2-dma-contig that size should really be the full
address space supported by the device as we want DMA-contiguous buffers), and
the latter telling the DMA mapping API about the address space that is
accessible through DMA (and thus in which address range buffers must be
placed).
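A sketch of a probe() fragment combining the two calls described above; the "foo" names are illustrative and error handling is abbreviated:

```c
static int foo_probe(struct platform_device *pdev)
{
        int ret;

        /* Address range reachable by the device's DMA: bottom 4 GB here. */
        ret = dma_coerce_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
        if (ret)
                return ret;

        /* With vb2-dma-contig, let one contiguous segment span the whole
         * addressable range. */
        ret = vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
        if (ret)
                return ret;

        /* ... rest of probe ... */
        return 0;
}
```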

I wonder why the omap3isp driver works without a
vb2_dma_contig_set_max_seg_size() call. Sakari, any insight ?

> >>>>>> drivers/media/spi/cxd2880-spi.c
> >>>>>>
> >>>>>> Could you please check if GFP_DMA is really needed there, or if it
> >>>>>> is just because of some cut-and-paste from some other place?
> >>>>>
> >>>>> I started looking at that for the omap3isp driver but Sakari beat me
> >>>>> to submitting a patch. GFP_DMA isn't needed for omap3isp.
> >>>>
> >>>> Thank you both for looking into it.

--
Regards,

Laurent Pinchart




2018-05-15 10:31:41

by Mauro Carvalho Chehab

[permalink] [raw]
Subject: Re: Are media drivers abusing of GFP_DMA? - was: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

Hi Laurent,

Em Tue, 15 May 2018 11:27:44 +0300
Laurent Pinchart <[email protected]> escreveu:

> Hello,
>
> On Tuesday, 15 May 2018 10:30:28 EEST Fabien DESSENNE wrote:
> > On 14/05/18 12:39, Mauro Carvalho Chehab wrote:
> > > Em Mon, 14 May 2018 07:35:03 -0300 Mauro Carvalho Chehab escreveu:
> > >> Em Mon, 14 May 2018 08:00:37 +0000 Fabien DESSENNE escreveu:
> > >>> On 07/05/18 17:19, Mauro Carvalho Chehab wrote:
> > >>>> Em Mon, 07 May 2018 16:26:08 +0300 Laurent Pinchart escreveu:
> > >>>>> On Saturday, 5 May 2018 19:08:15 EEST Mauro Carvalho Chehab wrote:
> > >>>>>
> > >>>>>> There was a recent discussion about the use/abuse of GFP_DMA flag
> > >>>>>> when allocating memory at LSF/MM 2018 (see Luis' notes enclosed).
> > >>>>>>
> > >>>>>> The idea seems to be to remove it, using CMA instead. Before doing
> > >>>>>> that, better to check if what we have in media are valid use cases
> > >>>>>> for it, or if it is there just due to some misunderstanding (or
> > >>>>>> because it was copied from some other code).
> > >>>>>>
> > >>>>>> Hans de Goede sent us today a patch stopping abuse at gspca, and I'm
> > >>>>>> also posting today two other patches meant to stop abuse of it on
> > >>>>>> USB drivers. Still, there are 4 platform drivers using it:
> > >>>>>>
> > >>>>>> $ git grep -l -E "GFP_DMA\\b" drivers/media/
> > >>>>>> drivers/media/platform/omap3isp/ispstat.c
> > >>>>>> drivers/media/platform/sti/bdisp/bdisp-hw.c
> > >>>>>> drivers/media/platform/sti/hva/hva-mem.c
> > >>>
> > >>> The two STI drivers (bdisp-hw.c and hva-mem.c) are only expected to run
> > >>> on ARM platforms, not on x86. Since this thread deals with x86 & DMA
> > >>> trouble, I am not sure that we actually have a problem for the sti
> > >>> drivers.
> > >>>
> > >>> There are some other sti drivers that make use of this GFP_DMA flag
> > >>> (drivers/gpu/drm/sti/sti_*.c) and it does not seem to be a problem.
> > >>>
> > >>> Nevertheless I can see that the media sti drivers depend on COMPILE_TEST
> > >>> (which is not the case for the DRM ones).
> > >>> Would it be an acceptable solution to remove the COMPILE_TEST
> > >>> dependency?
> > >>
> > >> This has nothing to do with either x86 or COMPILE_TEST. The thing is
> > >> that there's a plan for removing GFP_DMA from the Kernel[1], as it was
> > >> originally meant to be used only by old PCs, where the DMA controllers
> > >> could address only the bottom 16 MB of memory (24 bits). IMHO, it is
> > >> very unlikely that any ARM SoC has such a limitation.
> > >>
> > >> [1] https://lwn.net/Articles/753273/ (article will be freely available
> > >> on May, 17)
> > >
> > > Btw, you can also read about that at:
> > >
> > > https://lwn.net/Articles/753274/
> > >
> > >> Anyway, before the removal of GFP_DMA happens, I'd like to better
> > >> understand why we're using it at media, and if we can, instead,
> > >> set the DMA bit mask, just like almost all other media drivers
> > >> that require to confine DMA into a certain range do. In the case
> > >> of ARM, this is what we currently have:
> > >>
> > >> drivers/media/platform/exynos-gsc/gsc-core.c:
> > >> vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
> > >> drivers/media/platform/exynos4-is/fimc-core.c:
> > >> vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
> > >> drivers/media/platform/exynos4-is/fimc-is.c:
> > >> vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
> > >> drivers/media/platform/exynos4-is/fimc-lite.c:
> > >> vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
> > >> drivers/media/platform/mtk-mdp/mtk_mdp_core.c:
> > >> vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
> > >> drivers/media/platform/omap3isp/isp.c: ret =
> > >> dma_coerce_mask_and_coherent(isp->dev, DMA_BIT_MASK(32));
> > >> drivers/media/platform/s5p-g2d/g2d.c:
> > >> vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
> > >> drivers/media/platform/s5p-jpeg/jpeg-core.c:
> > >> vb2_dma_contig_set_max_seg_size(&pdev->dev, DMA_BIT_MASK(32));
> > >> drivers/media/platform/s5p-mfc/s5p_mfc.c:
> > >> DMA_BIT_MASK(32));
> > >> drivers/media/platform/s5p-mfc/s5p_mfc.c:
> > >> DMA_BIT_MASK(32));
> > >> drivers/media/platform/s5p-mfc/s5p_mfc.c:
> > >> vb2_dma_contig_set_max_seg_size(dev, DMA_BIT_MASK(32));
> >
> > That's clearer now, thank you for the clarification
> > I am about to send patches for the sti drivers (set the DMA bit mask)
>
> Some drivers call vb2_dma_contig_set_max_seg_size() and some call
> dma_coerce_mask_and_coherent(). Both are likely needed, the former telling the
> DMA mapping API about the maximum size of a scatter-gather chunk that the
> device supports (when using vb2-dma-contig that size should really be the full
> address space supported by the device as we want DMA-contiguous buffers), and
> the latter telling the DMA mapping API about the address space that is
> accessible through DMA (and thus in which address range buffers must be
> placed).
>
> I wonder why the omap3isp driver works without a
> vb2_dma_contig_set_max_seg_size() call. Sakari, any insight ?

I checked the usage of vb2_dma_contig_set_max_seg_size(). What it
does is to allocate dev->dma_parms and call dma_set_max_seg_size().

Allocating dev->dma_parms changes the behavior of two function pairs
(inlined in dma-mapping.h):

1) dma_get_seg_boundary() / dma_set_seg_boundary()

As no media drivers use dev->dma_parms->segment_boundary_mask,
it will keep returning DMA_BIT_MASK(32) whether or not the
VB2-specific function is called.

2) dma_get_max_seg_size() / dma_set_max_seg_size()

Checking where dma_get_max_seg_size() is used returns:

$ git grep dma_get_max_seg_size
arch/alpha/kernel/pci_iommu.c: max_seg_size = dev ? dma_get_max_seg_size(dev) : 0;
arch/arm/mm/dma-mapping.c: unsigned int max = dma_get_max_seg_size(dev);
arch/ia64/hp/common/sba_iommu.c: unsigned int max_seg_size = dma_get_max_seg_size(dev);
arch/powerpc/kernel/iommu.c: max_seg_size = dma_get_max_seg_size(dev);
arch/s390/pci/pci_dma.c: unsigned int max = dma_get_max_seg_size(dev);
arch/sparc/kernel/iommu.c: max_seg_size = dma_get_max_seg_size(dev);
arch/sparc/kernel/pci_sun4v.c: max_seg_size = dma_get_max_seg_size(dev);
arch/x86/kernel/amd_gart_64.c: max_seg_size = dma_get_max_seg_size(dev);
drivers/firewire/sbp2.c: if (dma_get_max_seg_size(device->card->device) > SBP2_MAX_SEG_SIZE)
drivers/iio/buffer/industrialio-buffer-dmaengine.c: dmaengine_buffer->max_size = dma_get_max_seg_size(chan->device->dev);
drivers/iommu/dma-iommu.c: unsigned int cur_len = 0, max_len = dma_get_max_seg_size(dev);
drivers/media/common/videobuf2/videobuf2-dma-contig.c: if (dma_get_max_seg_size(dev) < size)
drivers/mmc/host/mmci.c: unsigned int max_seg_size = dma_get_max_seg_size(dev);
drivers/mmc/host/mmci.c: unsigned int max_seg_size = dma_get_max_seg_size(dev);
drivers/mmc/host/mxcmmc.c: mmc->max_seg_size = dma_get_max_seg_size(
drivers/mmc/host/mxs-mmc.c: mmc->max_seg_size = dma_get_max_seg_size(ssp->dmach->device->dev);
drivers/parisc/iommu-helpers.h: unsigned int max_seg_size = min(dma_get_max_seg_size(dev),
drivers/scsi/scsi_lib.c: blk_queue_max_segment_size(q, dma_get_max_seg_size(dev));
drivers/spi/spi.c: unsigned int max_seg_size = dma_get_max_seg_size(dev);
include/linux/dma-mapping.h:static inline unsigned int dma_get_max_seg_size(struct device *dev)
lib/dma-debug.c: unsigned int max_range = dma_get_max_seg_size(ref->dev);
sound/soc/soc-generic-dmaengine-pcm.c: hw.period_bytes_max = dma_get_max_seg_size(dma_dev);

As vb2_dma_contig_set_max_seg_size() is called only by ARM drivers
for Exynos and Mediatek, we only need to check how the value is used
inside arch/arm/mm/dma-mapping.c. There, only the __iommu_map_sg()
function reads it. So, unless it is mis-named, I would expect it to
matter only for scatter-gather DMA.

So, it seems that, with the current code, setting it for dma_contig
probably does nothing.

Please double-check, as things like that can be tricky.

IMO, it is worth removing vb2_dma_contig_set_max_seg_size() and making sure
that the drivers that call it are calling dma_coerce_mask_and_coherent()
instead.

Thanks,
Mauro

2018-05-15 16:26:20

by Luis Chamberlain

Subject: Re: Are media drivers abusing of GFP_DMA? - was: Re: [LSF/MM TOPIC NOTES] x86 ZONE_DMA love

On Mon, May 14, 2018 at 07:35:03AM -0300, Mauro Carvalho Chehab wrote:
> Hi Fabien,
>
> Em Mon, 14 May 2018 08:00:37 +0000
> Fabien DESSENNE <[email protected]> escreveu:
>
> > On 07/05/18 17:19, Mauro Carvalho Chehab wrote:
> > > Em Mon, 07 May 2018 16:26:08 +0300
> > > Laurent Pinchart <[email protected]> escreveu:
> > >
> > >> Hi Mauro,
> > >>
> > >> On Saturday, 5 May 2018 19:08:15 EEST Mauro Carvalho Chehab wrote:
> > >>> There was a recent discussion about the use/abuse of GFP_DMA flag when
> > >>> allocating memories at LSF/MM 2018 (see Luis notes enclosed).
> > >>>
> > >>> The idea seems to be to remove it, using CMA instead. Before doing that,
> > >>> better to check if what we have in media are valid use cases for it, or
> > >>> if it is there just due to some misunderstanding (or because it was
> > >>> copied from some other code).
> > >>>
> > >>> Hans de Goede sent us today a patch stopping abuse at gspca, and I'm
> > >>> also posting today two other patches meant to stop abuse of it on USB
> > >>> drivers. Still, there are 4 platform drivers using it:
> > >>>
> > >>> $ git grep -l -E "GFP_DMA\\b" drivers/media/
> > >>> drivers/media/platform/omap3isp/ispstat.c
> > >>> drivers/media/platform/sti/bdisp/bdisp-hw.c
> > >>> drivers/media/platform/sti/hva/hva-mem.c
> >
> > Hi Mauro,
> >
> > The two STI drivers (bdisp-hw.c and hva-mem.c) are only expected to run
> > on ARM platforms, not on x86.
> > Since this thread deals with x86 & DMA trouble, I am not sure that we
> > actually have a problem for the sti drivers.
> >
> > There are some other sti drivers that make use of this GFP_DMA flag
> > (drivers/gpu/drm/sti/sti_*.c) and it does not seem to be a problem.
> >
> > Nevertheless I can see that the media sti drivers depend on COMPILE_TEST
> > (which is not the case for the DRM ones).
> > Would it be an acceptable solution to remove the COMPILE_TEST dependency?
>
> This has nothing to do with either x86

Actually it does.

> or COMPILE_TEST. The thing is
> that there's a plan for removing GFP_DMA from the Kernel[1],

That would not be possible, given that architectures use GFP_DMA for other
things and there are plenty of legacy x86 drivers which still need to be
around. So the focus from the mm folks shifted to letting the x86 folks map
GFP_DMA onto the CMA pool. Long term this is nothing that driver developers
need to care about; it is just worth knowing that, behind the scenes, some
architectural cleanup is being done.

> as it was
> originally meant to be used only by old PCs, where the DMA controllers
> could only address the bottom 16 MB of memory (24 bits).

This is actually the part that is x86 specific.

Every other architecture may use it with its own definition, and it seems
some architectures use GFP_DMA all over the place. So the topic really should
be about x86.

> IMHO, it is
> very unlikely that any ARM SoC have such limitation.

Right, how the flag is used on other architectures varies, so in fact the
cleanup focus for now should be an x86 effort. Whether or not
other architectures do something silly with GFP_DMA is beyond the scope
of what was discussed.

Luis