2024-01-13 18:41:31

by Aurelien Jarno

[permalink] [raw]
Subject: [PATCH] media: solo6x10: replace max(a, min(b, c)) by clamp(b, a, c)

This patch replaces max(a, min(b, c)) by clamp(b, a, c) in the solo6x10
driver. This improves the readability and more importantly, for the
solo6x10-p2m.c file, this reduces on my system (x86-64, gcc 13):
- the preprocessed size from 121 MiB to 4.5 MiB;
- the build CPU time from 46.8 s to 1.6 s;
- the build memory from 2786 MiB to 98MiB.

In fine, this allows this relatively simple C file to be built on a
32-bit system.

Reported-by: Jiri Slaby <[email protected]>
Closes: https://lore.kernel.org/lkml/[email protected]/
Cc: [email protected] # v6.7+
Suggested-by: David Laight <[email protected]>
Signed-off-by: Aurelien Jarno <[email protected]>
---
drivers/media/pci/solo6x10/solo6x10-offsets.h | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/drivers/media/pci/solo6x10/solo6x10-offsets.h b/drivers/media/pci/solo6x10/solo6x10-offsets.h
index f414ee1316f2..fdbb817e6360 100644
--- a/drivers/media/pci/solo6x10/solo6x10-offsets.h
+++ b/drivers/media/pci/solo6x10/solo6x10-offsets.h
@@ -57,16 +57,16 @@
#define SOLO_MP4E_EXT_ADDR(__solo) \
(SOLO_EREF_EXT_ADDR(__solo) + SOLO_EREF_EXT_AREA(__solo))
#define SOLO_MP4E_EXT_SIZE(__solo) \
- max((__solo->nr_chans * 0x00080000), \
- min(((__solo->sdram_size - SOLO_MP4E_EXT_ADDR(__solo)) - \
- __SOLO_JPEG_MIN_SIZE(__solo)), 0x00ff0000))
+ clamp(__solo->sdram_size - SOLO_MP4E_EXT_ADDR(__solo) - \
+ __SOLO_JPEG_MIN_SIZE(__solo), \
+ __solo->nr_chans * 0x00080000, 0x00ff0000)

#define __SOLO_JPEG_MIN_SIZE(__solo) (__solo->nr_chans * 0x00080000)
#define SOLO_JPEG_EXT_ADDR(__solo) \
(SOLO_MP4E_EXT_ADDR(__solo) + SOLO_MP4E_EXT_SIZE(__solo))
#define SOLO_JPEG_EXT_SIZE(__solo) \
- max(__SOLO_JPEG_MIN_SIZE(__solo), \
- min((__solo->sdram_size - SOLO_JPEG_EXT_ADDR(__solo)), 0x00ff0000))
+ clamp(__solo->sdram_size - SOLO_JPEG_EXT_ADDR(__solo), \
+ __SOLO_JPEG_MIN_SIZE(__solo), 0x00ff0000)

#define SOLO_SDRAM_END(__solo) \
(SOLO_JPEG_EXT_ADDR(__solo) + SOLO_JPEG_EXT_SIZE(__solo))
--
2.42.0



2024-01-13 22:42:55

by David Laight

[permalink] [raw]
Subject: RE: [PATCH] media: solo6x10: replace max(a, min(b, c)) by clamp(b, a, c)

From: Aurelien Jarno
> Sent: 13 January 2024 18:34
>
> This patch replaces max(a, min(b, c)) by clamp(b, a, c) in the solo6x10
> driver. This improves the readability and more importantly, for the
> solo6x10-p2m.c file, this reduces on my system (x86-64, gcc 13):
> - the preprocessed size from 121 MiB to 4.5 MiB;
> - the build CPU time from 46.8 s to 1.6 s;
> - the build memory from 2786 MiB to 98MiB.
>
> In fine, this allows this relatively simple C file to be built on a
> 32-bit system.
>
> Reported-by: Jiri Slaby <[email protected]>
> Closes: https://lore.kernel.org/lkml/[email protected]/
> Cc: [email protected] # v6.7+
> Suggested-by: David Laight <[email protected]>
> Signed-off-by: Aurelien Jarno <[email protected]>

Reviewed-by: David Laight <[email protected]>

I was about to send the same patch.
Although I'm not sure why the cpu time is so large.
It compiles pretty immediately on my system.
I do have some patches to minmax.h that reduce the .i for the nested
clamp() to around 200k.
Mostly obtained be adding min/max_const() for the few places
that need a constant and min/max_ptr() for pointer types.
Supporting both causes the expansion to be a lot larger.

David

> ---
> drivers/media/pci/solo6x10/solo6x10-offsets.h | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/media/pci/solo6x10/solo6x10-offsets.h b/drivers/media/pci/solo6x10/solo6x10-
> offsets.h
> index f414ee1316f2..fdbb817e6360 100644
> --- a/drivers/media/pci/solo6x10/solo6x10-offsets.h
> +++ b/drivers/media/pci/solo6x10/solo6x10-offsets.h
> @@ -57,16 +57,16 @@
> #define SOLO_MP4E_EXT_ADDR(__solo) \
> (SOLO_EREF_EXT_ADDR(__solo) + SOLO_EREF_EXT_AREA(__solo))
> #define SOLO_MP4E_EXT_SIZE(__solo) \
> - max((__solo->nr_chans * 0x00080000), \
> - min(((__solo->sdram_size - SOLO_MP4E_EXT_ADDR(__solo)) - \
> - __SOLO_JPEG_MIN_SIZE(__solo)), 0x00ff0000))
> + clamp(__solo->sdram_size - SOLO_MP4E_EXT_ADDR(__solo) - \
> + __SOLO_JPEG_MIN_SIZE(__solo), \
> + __solo->nr_chans * 0x00080000, 0x00ff0000)
>
> #define __SOLO_JPEG_MIN_SIZE(__solo) (__solo->nr_chans * 0x00080000)
> #define SOLO_JPEG_EXT_ADDR(__solo) \
> (SOLO_MP4E_EXT_ADDR(__solo) + SOLO_MP4E_EXT_SIZE(__solo))
> #define SOLO_JPEG_EXT_SIZE(__solo) \
> - max(__SOLO_JPEG_MIN_SIZE(__solo), \
> - min((__solo->sdram_size - SOLO_JPEG_EXT_ADDR(__solo)), 0x00ff0000))
> + clamp(__solo->sdram_size - SOLO_JPEG_EXT_ADDR(__solo), \
> + __SOLO_JPEG_MIN_SIZE(__solo), 0x00ff0000)
>
> #define SOLO_SDRAM_END(__solo) \
> (SOLO_JPEG_EXT_ADDR(__solo) + SOLO_JPEG_EXT_SIZE(__solo))
> --
> 2.42.0

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


2024-01-14 11:04:23

by Hans Verkuil

[permalink] [raw]
Subject: Re: [PATCH] media: solo6x10: replace max(a, min(b, c)) by clamp(b, a, c)

Hi Aurelien,

On 13/01/2024 19:33, Aurelien Jarno wrote:
> This patch replaces max(a, min(b, c)) by clamp(b, a, c) in the solo6x10
> driver. This improves the readability and more importantly, for the
> solo6x10-p2m.c file, this reduces on my system (x86-64, gcc 13):
> - the preprocessed size from 121 MiB to 4.5 MiB;
> - the build CPU time from 46.8 s to 1.6 s;
> - the build memory from 2786 MiB to 98MiB.

Thank you for this patch.

Reviewed-by: Hans Verkuil <[email protected]>

I'll pick this up as a fix for v6.8.

Linus, if you prefer to pick this up directly, then that's fine as well.

I was aware for some time that something had changed elsewhere that caused
the solo driver compilation to take a lot of memory, but I didn't have the
time yet to investigate it. I now see that this was discussed in the thread
from the Closes tag. Note that it would have been nice if that was CCed to
linux-media, since it concerned a media driver.

The use of clamp() is definitely cleaner code, but it is rather insane
that max(a, min(b, c)) takes 2786 MiB over 98 MiB for clamp(). The driver
code looks perfectly innocuous, and you won't see the problem unless you compile
on a system or a VM with a lot less memory. (I only noticed it myself when
compiling in a VM.

Perhaps sparse or smatch should be extended with a check for nested min/max
and warn for that with a recommendation to use clamp instead? Although right
now these tools skip the solo driver since it takes too much memory :-)

I even found a min(max(max())) case in drivers/net/wireless/ath/ath11k/mac.c:

nss = min(nss, max(max(ath11k_mac_max_ht_nss(ht_mcs_mask),
ath11k_mac_max_vht_nss(vht_mcs_mask)),
ath11k_mac_max_he_nss(he_mcs_mask)));

I wonder how that expands in cpp.

So while this patch is a nice cleanup, the underlying problem remains.

Regards,

Hans

>
> In fine, this allows this relatively simple C file to be built on a
> 32-bit system.
>
> Reported-by: Jiri Slaby <[email protected]>
> Closes: https://lore.kernel.org/lkml/[email protected]/
> Cc: [email protected] # v6.7+
> Suggested-by: David Laight <[email protected]>
> Signed-off-by: Aurelien Jarno <[email protected]>
> ---
> drivers/media/pci/solo6x10/solo6x10-offsets.h | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/media/pci/solo6x10/solo6x10-offsets.h b/drivers/media/pci/solo6x10/solo6x10-offsets.h
> index f414ee1316f2..fdbb817e6360 100644
> --- a/drivers/media/pci/solo6x10/solo6x10-offsets.h
> +++ b/drivers/media/pci/solo6x10/solo6x10-offsets.h
> @@ -57,16 +57,16 @@
> #define SOLO_MP4E_EXT_ADDR(__solo) \
> (SOLO_EREF_EXT_ADDR(__solo) + SOLO_EREF_EXT_AREA(__solo))
> #define SOLO_MP4E_EXT_SIZE(__solo) \
> - max((__solo->nr_chans * 0x00080000), \
> - min(((__solo->sdram_size - SOLO_MP4E_EXT_ADDR(__solo)) - \
> - __SOLO_JPEG_MIN_SIZE(__solo)), 0x00ff0000))
> + clamp(__solo->sdram_size - SOLO_MP4E_EXT_ADDR(__solo) - \
> + __SOLO_JPEG_MIN_SIZE(__solo), \
> + __solo->nr_chans * 0x00080000, 0x00ff0000)
>
> #define __SOLO_JPEG_MIN_SIZE(__solo) (__solo->nr_chans * 0x00080000)
> #define SOLO_JPEG_EXT_ADDR(__solo) \
> (SOLO_MP4E_EXT_ADDR(__solo) + SOLO_MP4E_EXT_SIZE(__solo))
> #define SOLO_JPEG_EXT_SIZE(__solo) \
> - max(__SOLO_JPEG_MIN_SIZE(__solo), \
> - min((__solo->sdram_size - SOLO_JPEG_EXT_ADDR(__solo)), 0x00ff0000))
> + clamp(__solo->sdram_size - SOLO_JPEG_EXT_ADDR(__solo), \
> + __SOLO_JPEG_MIN_SIZE(__solo), 0x00ff0000)
>
> #define SOLO_SDRAM_END(__solo) \
> (SOLO_JPEG_EXT_ADDR(__solo) + SOLO_JPEG_EXT_SIZE(__solo))


2024-01-21 19:57:37

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] media: solo6x10: replace max(a, min(b, c)) by clamp(b, a, c)

On Sun, 14 Jan 2024 at 03:04, Hans Verkuil <[email protected]> wrote:
>
> I'll pick this up as a fix for v6.8.
>
> Linus, if you prefer to pick this up directly, then that's fine as well.

Bah, missed this email, and so a belated note that I picked the patch
up as commit 31e97d7c9ae3.

It even got your Reviewed-by thanks to b4 picking that up automatically.

Linus