2024-06-04 08:42:17

by kunyu

[permalink] [raw]
Subject: [PATCH] dma: direct: Optimize the code for the dma_direct_free function

The 'is_swiotlb-for-alloc' and 'dev_isdma_coherent' judgment functions
need to be called multiple times, so they are adjusted to variable
judgment, which can improve code conciseness.

Signed-off-by: kunyu <[email protected]>
---
kernel/dma/direct.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 4d543b1e9d57..041e316ad4c0 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -315,23 +315,24 @@ void dma_direct_free(struct device *dev, size_t size,
void *cpu_addr, dma_addr_t dma_addr, unsigned long attrs)
{
unsigned int page_order = get_order(size);
+ bool swiotlb_for_alloc = is_swiotlb_for_alloc(dev);
+ bool is_dma_coherent = dev_is_dma_coherent(dev);

if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) &&
- !force_dma_unencrypted(dev) && !is_swiotlb_for_alloc(dev)) {
+ !force_dma_unencrypted(dev) && !swiotlb_for_alloc) {
/* cpu_addr is a struct page cookie, not a kernel address */
dma_free_contiguous(dev, cpu_addr, size);
return;
}

if (IS_ENABLED(CONFIG_ARCH_HAS_DMA_ALLOC) &&
- !dev_is_dma_coherent(dev) &&
- !is_swiotlb_for_alloc(dev)) {
+ !is_dma_coherent && !swiotlb_for_alloc) {
arch_dma_free(dev, size, cpu_addr, dma_addr, attrs);
return;
}

if (IS_ENABLED(CONFIG_DMA_GLOBAL_POOL) &&
- !dev_is_dma_coherent(dev)) {
+ !is_dma_coherent) {
if (!dma_release_from_global_coherent(page_order, cpu_addr))
WARN_ON_ONCE(1);
return;
--
2.18.2



2024-06-04 12:41:56

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH] dma: direct: Optimize the code for the dma_direct_free function

On 04/06/2024 9:41 am, kunyu wrote:
> The 'is_swiotlb-for-alloc' and 'dev_isdma_coherent' judgment functions
> need to be called multiple times, so they are adjusted to variable
> judgment, which can improve code conciseness.
>
> Signed-off-by: kunyu <[email protected]>
> ---
> kernel/dma/direct.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)

It's hardly concise to add *more* code than was there before... :/

Personally I don't think shaving a handful of characters off each
invocation has any positive impact on readability in this case, while
the extra visual indirection, and breaking consistency with the rest of
this file, definitely has a negative one.

Also note that these "functions" are already just inline wrappers around
a single variable dereference - for my arm64 build at least, this patch
has no effect at all on the generated object code, since the compiler
can still optimise out the locals (so at least it doesn't make things
*worse* by forcing it to allocate a larger stack frame).

Thanks,
Robin.

> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 4d543b1e9d57..041e316ad4c0 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -315,23 +315,24 @@ void dma_direct_free(struct device *dev, size_t size,
> void *cpu_addr, dma_addr_t dma_addr, unsigned long attrs)
> {
> unsigned int page_order = get_order(size);
> + bool swiotlb_for_alloc = is_swiotlb_for_alloc(dev);
> + bool is_dma_coherent = dev_is_dma_coherent(dev);
>
> if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) &&
> - !force_dma_unencrypted(dev) && !is_swiotlb_for_alloc(dev)) {
> + !force_dma_unencrypted(dev) && !swiotlb_for_alloc) {
> /* cpu_addr is a struct page cookie, not a kernel address */
> dma_free_contiguous(dev, cpu_addr, size);
> return;
> }
>
> if (IS_ENABLED(CONFIG_ARCH_HAS_DMA_ALLOC) &&
> - !dev_is_dma_coherent(dev) &&
> - !is_swiotlb_for_alloc(dev)) {
> + !is_dma_coherent && !swiotlb_for_alloc) {
> arch_dma_free(dev, size, cpu_addr, dma_addr, attrs);
> return;
> }
>
> if (IS_ENABLED(CONFIG_DMA_GLOBAL_POOL) &&
> - !dev_is_dma_coherent(dev)) {
> + !is_dma_coherent) {
> if (!dma_release_from_global_coherent(page_order, cpu_addr))
> WARN_ON_ONCE(1);
> return;