2024-05-02 09:25:55

by Kuan-Wei Chiu

Subject: [PATCH v5 0/2] bitops: Optimize fns() for improved performance

Hello,

This patch series optimizes the fns() function by avoiding repeated
calls to __ffs(). Additionally, tests for fns() have been added in
lib/test_bitops.c.

Changes in v5:
- Reduce testing iterations from 1000000 to 10000 to decrease testing
time.
- Move 'buf' inside the function.
- Mark 'buf' as __initdata.
- Assign the results of fns() to a volatile variable to prevent
compiler optimization.
- Remove the iteration count from the benchmark result.
- Update benchmark results in the commit message.

Changes in v4:
- Correct get_random_long() -> get_random_bytes() in the commit
message.

Changes in v3:
- Move the benchmark test for fns() to lib/test_bitops.c.
- Exclude the overhead of random number generation from the benchmark
result.
- Change the output to print only a total gross instead of each n in
the benchmark result.
- Update the commit message in the second patch.

Changes in v2:
- Add benchmark test for fns() in lib/find_bit_benchmark.c.
- Change the loop in fns() by counting down from n to 0.
- Add find_bit benchmark result for find_nth_bit in commit message.

Link to v4: https://lkml.kernel.org/[email protected]
Link to v3: https://lkml.kernel.org/[email protected]
Link to v2: https://lkml.kernel.org/[email protected]
Link to v1: https://lkml.kernel.org/[email protected]

Kuan-Wei Chiu (2):
lib/test_bitops: Add benchmark test for fns()
bitops: Optimize fns() for improved performance

include/linux/bitops.h | 12 +++---------
lib/test_bitops.c | 22 ++++++++++++++++++++++
2 files changed, 25 insertions(+), 9 deletions(-)

--
2.34.1



2024-05-02 09:26:07

by Kuan-Wei Chiu

Subject: [PATCH v5 1/2] lib/test_bitops: Add benchmark test for fns()

Introduce a benchmark test for fns(). It measures the total time
taken by fns() to process 10,000 random words generated by
get_random_bytes(), for each n in the range [0, BITS_PER_LONG).

example:
test_bitops: fns: 7637268 ns

Signed-off-by: Kuan-Wei Chiu <[email protected]>
Suggested-by: Yury Norov <[email protected]>
---

Changes in v5:
- Reduce testing iterations from 1000000 to 10000 to decrease testing
time.
- Move 'buf' inside the function.
- Mark 'buf' as __initdata.
- Assign the results of fns() to a volatile variable to prevent
compiler optimization.
- Remove the iteration count from the benchmark result.

lib/test_bitops.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/lib/test_bitops.c b/lib/test_bitops.c
index 3b7bcbee84db..5c627b525a48 100644
--- a/lib/test_bitops.c
+++ b/lib/test_bitops.c
@@ -50,6 +50,26 @@ static unsigned long order_comb_long[][2] = {
};
#endif

+static int __init test_fns(void)
+{
+	static unsigned long buf[10000] __initdata;
+	static volatile __used unsigned long tmp __initdata;
+	unsigned int i, n;
+	ktime_t time;
+
+	get_random_bytes(buf, sizeof(buf));
+	time = ktime_get();
+
+	for (n = 0; n < BITS_PER_LONG; n++)
+		for (i = 0; i < 10000; i++)
+			tmp = fns(buf[i], n);
+
+	time = ktime_get() - time;
+	pr_err("fns: %18llu ns\n", time);
+
+	return 0;
+}
+
 static int __init test_bitops_startup(void)
 {
 	int i, bit_set;
@@ -94,6 +114,8 @@ static int __init test_bitops_startup(void)
 	if (bit_set != BITOPS_LAST)
 		pr_err("ERROR: FOUND SET BIT %d\n", bit_set);

+	test_fns();
+
 	pr_info("Completed bitops test\n");

 	return 0;
--
2.34.1
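
For readers who want to try the measurement pattern outside the kernel,
here is a minimal userspace sketch of the same idea. It is not part of
the patch. The names fns_sketch(), next_word(), bench_fns_ns() and
bench_sink are hypothetical; clock() (CPU time) stands in for
ktime_get(), a small xorshift-style generator stands in for
get_random_bytes(), and __builtin_ctzl() (a GCC/Clang builtin) stands in
for __ffs(). The point it illustrates is the volatile store, which keeps
the compiler from discarding calls whose results are otherwise unused.

```c
#include <limits.h>
#include <time.h>

#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))
#define NSAMPLES 10000u

/* Hypothetical stand-in for fns(): index of the n-th set bit, or BITS_PER_LONG. */
static unsigned long fns_sketch(unsigned long word, unsigned int n)
{
	while (word && n--)
		word &= word - 1;

	return word ? (unsigned long)__builtin_ctzl(word) : BITS_PER_LONG;
}

/* Assumption: a deterministic xorshift-style generator replaces get_random_bytes(). */
static unsigned long next_word(unsigned long *state)
{
	unsigned long x = *state;

	x ^= x << 13;
	x ^= x >> 7;
	x ^= x << 17;
	*state = x;
	return x;
}

/* Every store to a volatile object has to be performed, so the calls stay. */
volatile unsigned long bench_sink;

/* Returns the CPU time spent in the timed loops, in nanoseconds. */
double bench_fns_ns(void)
{
	static unsigned long buf[NSAMPLES];
	unsigned long state = 0x9e3779b9UL;
	unsigned int i, n;
	clock_t start, end;

	/* Fill the input before starting the clock, as the patch does. */
	for (i = 0; i < NSAMPLES; i++)
		buf[i] = next_word(&state);

	start = clock();
	for (n = 0; n < BITS_PER_LONG; n++)
		for (i = 0; i < NSAMPLES; i++)
			bench_sink = fns_sketch(buf[i], n);
	end = clock();

	return (double)(end - start) * 1e9 / CLOCKS_PER_SEC;
}
```

Calling bench_fns_ns() from a small main() and printing the result gives
a rough figure only; it is not comparable to the in-kernel numbers
quoted in this series.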


2024-05-02 09:40:25

by Kuan-Wei Chiu

Subject: [PATCH v5 2/2] bitops: Optimize fns() for improved performance

The current fns() repeatedly uses __ffs() to find the index of the
least significant set bit and then clears that bit using
__clear_bit(). Clearing the least significant set bit can be done more
cheaply with word &= word - 1 instead.

Typically, the execution time of one __ffs() plus one __clear_bit() is
longer than that of a bitwise AND operation and a subtraction. To
improve performance, the loop now clears the least significant set bit
with word &= word - 1, and a single __ffs() after the loop obtains the
answer. This reduces the number of __ffs() calls from up to n + 1 to
at most one, enhancing overall performance.

This modification significantly accelerates the fns() function in the
test_bitops benchmark, improving its speed by approximately 7.6 times.
Additionally, it enhances the performance of find_nth_bit() in the
find_bit benchmark by approximately 26%.

Before:
test_bitops: fns: 58033164 ns
find_nth_bit: 4254313 ns, 16525 iterations

After:
test_bitops: fns: 7637268 ns
find_nth_bit: 3362863 ns, 16501 iterations

Signed-off-by: Kuan-Wei Chiu <[email protected]>
---

Changes in v5:
- Update benchmark results in the commit message.

include/linux/bitops.h | 12 +++---------
1 file changed, 3 insertions(+), 9 deletions(-)

diff --git a/include/linux/bitops.h b/include/linux/bitops.h
index 2ba557e067fe..57ecef354f47 100644
--- a/include/linux/bitops.h
+++ b/include/linux/bitops.h
@@ -254,16 +254,10 @@ static inline unsigned long __ffs64(u64 word)
*/
static inline unsigned long fns(unsigned long word, unsigned int n)
{
-	unsigned int bit;
+	while (word && n--)
+		word &= word - 1;

-	while (word) {
-		bit = __ffs(word);
-		if (n-- == 0)
-			return bit;
-		__clear_bit(bit, &word);
-	}
-
-	return BITS_PER_LONG;
+	return word ? __ffs(word) : BITS_PER_LONG;
}

/**
--
2.34.1
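
To see that the two loop shapes in the diff above agree, here is a
minimal userspace sketch, not kernel code. The names ffs_sketch(),
fns_old_sketch(), fns_new_sketch() and fns_compare_sketch() are
hypothetical; __builtin_ctzl() (a GCC/Clang builtin) is assumed as a
stand-in for __ffs(), and a plain mask replaces __clear_bit().

```c
#include <assert.h>
#include <limits.h>

#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Stand-in for __ffs(): index of the lowest set bit; undefined for 0. */
static inline unsigned long ffs_sketch(unsigned long word)
{
	return (unsigned long)__builtin_ctzl(word);
}

/* Old shape: one __ffs() and one bit clear per loop iteration. */
static inline unsigned long fns_old_sketch(unsigned long word, unsigned int n)
{
	while (word) {
		unsigned long bit = ffs_sketch(word);

		if (n-- == 0)
			return bit;
		word &= ~(1UL << bit);
	}

	return BITS_PER_LONG;
}

/* New shape: drop the n lowest set bits, then a single __ffs(). */
static inline unsigned long fns_new_sketch(unsigned long word, unsigned int n)
{
	while (word && n--)
		word &= word - 1;	/* clears the lowest set bit */

	return word ? ffs_sketch(word) : BITS_PER_LONG;
}

/* Checks that both variants agree on one word for every n of interest. */
static inline void fns_compare_sketch(unsigned long word)
{
	unsigned int n;

	for (n = 0; n <= BITS_PER_LONG; n++)
		assert(fns_old_sketch(word, n) == fns_new_sketch(word, n));
}
```

For example, with word = 0x5 (bits 0 and 2 set), both variants return 0
for n = 0, 2 for n = 1, and BITS_PER_LONG for n = 2.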


2024-05-02 15:09:31

by Yury Norov

Subject: Re: [PATCH v5 0/2] bitops: Optimize fns() for improved performance

On Thu, May 02, 2024 at 05:24:41PM +0800, Kuan-Wei Chiu wrote:
> Hello,
>
> This patch series optimizes the fns() function by avoiding repeated
> calls to __ffs(). Additionally, tests for fns() have been added in
> lib/test_bitops.c.

OK, now looks good. Thanks for the work, Kuan-Wei.

I'll take it in bitmap-for-next. Andrew, can you drop the previous
version from -mm?

Thanks,
Yury


2024-05-03 21:56:10

by Yury Norov

Subject: Re: [PATCH v5 1/2] lib/test_bitops: Add benchmark test for fns()

On Fri, May 03, 2024 at 08:54:01AM -0700, Nathan Chancellor wrote:
> On Fri, May 03, 2024 at 03:38:59PM +0800, Kuan-Wei Chiu wrote:
> > On Fri, May 03, 2024 at 03:31:26PM +0800, Kuan-Wei Chiu wrote:
> > > On Thu, May 02, 2024 at 09:17:01PM -0700, Nathan Chancellor wrote:
> > > > Hi Kuan-Wei,
> > > >
> > > > On Fri, May 03, 2024 at 09:34:28AM +0800, Kuan-Wei Chiu wrote:
> > > > > On Fri, May 03, 2024 at 08:49:00AM +0800, kernel test robot wrote:
> > > > > > Hi Kuan-Wei,
> > > > > >
> > > > > > kernel test robot noticed the following build errors:
> > > > > >
> > > > > > [auto build test ERROR on linus/master]
> > > > > > [also build test ERROR on v6.9-rc6 next-20240502]
> > > > > > [cannot apply to akpm-mm/mm-everything akpm-mm/mm-nonmm-unstable]
> > > > > > [If your patch is applied to the wrong git tree, kindly drop us a note.
> > > > > > And when submitting patch, we suggest to use '--base' as documented in
> > > > > > https://git-scm.com/docs/git-format-patch#_base_tree_information]
> > > > > >
> > > > > > url: https://github.com/intel-lab-lkp/linux/commits/Kuan-Wei-Chiu/lib-test_bitops-Add-benchmark-test-for-fns/20240502-172638
> > > > > > base: linus/master
> > > > > > patch link: https://lore.kernel.org/r/20240502092443.6845-2-visitorckw%40gmail.com
> > > > > > patch subject: [PATCH v5 1/2] lib/test_bitops: Add benchmark test for fns()
> > > > > > config: x86_64-allyesconfig (https://download.01.org/0day-ci/archive/20240503/[email protected]/config)
> > > > > > compiler: clang version 18.1.4 (https://github.com/llvm/llvm-project e6c3289804a67ea0bb6a86fadbe454dd93b8d855)
> > > > > > reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240503/[email protected]/reproduce)
> > > > > >
> > > > > > If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > > > > > the same patch/commit), kindly add following tags
> > > > > > | Reported-by: kernel test robot <[email protected]>
> > > > > > | Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
> > > > > >
> > > > > > All errors (new ones prefixed by >>):
> > > > > >
> > > > > > >> lib/test_bitops.c:56:39: error: variable 'tmp' set but not used [-Werror,-Wunused-but-set-variable]
> > > > > > 56 | static volatile __used unsigned long tmp __initdata;
> > > > > > | ^
> > > > > > 1 error generated.
> > > > > >
> > > > > >
> > > > > > vim +/tmp +56 lib/test_bitops.c
> > > > > >
> > > > > > 52
> > > > > > 53 static int __init test_fns(void)
> > > > > > 54 {
> > > > > > 55 static unsigned long buf[10000] __initdata;
> > > > > > > 56 static volatile __used unsigned long tmp __initdata;
> > > > >
> > > > > I apologize for causing the compilation failure with clang. I'm not
> > > > > very familiar with clang and I'm not sure why something marked as
> > > > > __used would result in the warning mentioned above. Perhaps clang does
> > > > > not support attribute((used))? Is there a way to work around this
> > > > > issue?
> > > >
> > > > It looks like __attribute__((__used__)) is not enough to stop clang from
> > > > warning, unlike GCC. I can likely fix that in clang if it is acceptable
> > > > to the clang maintainers (although more below on why this might be
> > > > intentional) but the warning will still need to be resolved for older
> > > > versions. Looking at the current clang source code and tests, it looks
> > > > like __attribute__((__unused__)) should silence the warning, which the
> > > > kernel has available as __always_unused or __maybe_unused, depending on
> > > > the context.
> > > >
> > > > $ cat test.c
> > > > void foo(void)
> > > > {
> > > > 	int a;
> > > > 	a = 1;
> > > > }
> > > >
> > > > void bar(void)
> > > > {
> > > > 	static int b;
> > > > 	b = 2;
> > > > }
> > > >
> > > > void baz(void)
> > > > {
> > > > 	static int c __attribute__((__used__));
> > > > 	c = 3;
> > > > }
> > > >
> > > > void quux(void)
> > > > {
> > > > 	static int d __attribute__((__unused__));
> > > > 	d = 4;
> > > > }
> > > >
> > > > void foobar(void)
> > > > {
> > > > 	static int e __attribute__((__used__)) __attribute__((__unused__));
> > > > 	e = 1;
> > > > }
> > > >
> > > > $ gcc -Wunused-but-set-variable -c -o /dev/null test.c
> > > > test.c: In function ‘foo’:
> > > > test.c:3:13: warning: variable ‘a’ set but not used [-Wunused-but-set-variable]
> > > > 3 | int a;
> > > > | ^
> > > > test.c: In function ‘bar’:
> > > > test.c:9:20: warning: variable ‘b’ set but not used [-Wunused-but-set-variable]
> > > > 9 | static int b;
> > > > | ^
> > > >
> > > > $ clang -fsyntax-only -Wunused-but-set-variable test.c
> > > > test.c:3:6: warning: variable 'a' set but not used [-Wunused-but-set-variable]
> > > > 3 | int a;
> > > > | ^
> > > > test.c:9:13: warning: variable 'b' set but not used [-Wunused-but-set-variable]
> > > > 9 | static int b;
> > > > | ^
> > > > test.c:15:13: warning: variable 'c' set but not used [-Wunused-but-set-variable]
> > > > 15 | static int c __attribute__((__used__));
> > > > | ^
> > > > 3 warnings generated.
> > > >
> > > > I've attached a diff below that resolves the warning for me and it has
> > > > no code generation differences based on objdump. While having used and
> > > > unused attributes together might look unusual, reading the GCC attribute
> > > > manual makes it seem like these attributes fulfill similar yet different
> > > > roles, __unused__ prevents any unused warnings while __used__ forces the
> > > > variable to be emitted:
> > > >
> > > > https://gcc.gnu.org/onlinedocs/gcc-13.2.0/gcc/Common-Variable-Attributes.html#index-unused-variable-attribute
> > > >
> > > > A strict reading of that does not make it seem like __used__ implies
> > > > disabling unused warnings, so I can understand why clang's behavior is
> > > > the way that it is.
> > > >
> > > Thank you for your explanation and providing the solution. I tested the
> > > diff you provided, and it works well for me.
> > >
> > Should I submit an updated version of the patch to the bitmap
> > maintainer, or should this be a separate patch since the patch causing
> > build failure has already been accepted? My instinct is the latter, but
> > I'm concerned it might make git bisection more challenging.
>
> Yury would be the best person to answer these questions since each
> maintainer is different, some never rebase their trees while others will
> squash simple fixes in to avoid bisection issues and such. I've added
> him to the thread now to chime in (somehow he got dropped? the thread
> starts at https://lore.kernel.org/[email protected]/).
>
> I think the diff in my email should be directly applicable on top of
> your change with no conflicts so he could just squash that in if you are
> both happy with that.
>
> Cheers,
> Nathan
>
> > > > diff --git a/lib/test_bitops.c b/lib/test_bitops.c
> > > > index 5c627b525a48..28c91072cf85 100644
> > > > --- a/lib/test_bitops.c
> > > > +++ b/lib/test_bitops.c
> > > > @@ -53,7 +53,7 @@ static unsigned long order_comb_long[][2] = {
> > > > static int __init test_fns(void)
> > > > {
> > > > static unsigned long buf[10000] __initdata;
> > > > - static volatile __used unsigned long tmp __initdata;
> > > > + static volatile __always_unused __used unsigned long tmp __initdata;
> > > > unsigned int i, n;
> > > > ktime_t time;

Hi Nathan,

Thank you for sharing this.

I think this __used __unused thing may confuse readers when spotted in
random test code. What do you think about making it a new macro with a
proper comment to avoid confusion?

I did that in the patch below. If you like it, I can prepend it to
Kuan-Wei's series and fix the test in place.

Thanks,
Yury

From 987a021cc76495b32f680507e0c55a105e8edff3 Mon Sep 17 00:00:00 2001
From: Yury Norov <[email protected]>
Date: Fri, 3 May 2024 12:12:00 -0700
Subject: [PATCH] Compiler Attributes: Add __always_used macro

In some cases, such as performance benchmarking, we need to call a
function but don't need to read the returned value. If the compiler
recognizes the function as pure or const, it can remove the function
invocation, which is not what we want.

To prevent that, the common practice is to assign the return value to
a temporary static volatile __used variable. This works with GCC, but
clang still emits -Wunused-but-set-variable. To suppress that warning,
the variable additionally needs the 'unused' attribute.

Nathan Chancellor explained that in detail:

While having used and unused attributes together might look unusual,
reading the GCC attribute manual makes it seem like these attributes
fulfill similar yet different roles, __unused__ prevents any unused
warnings while __used__ forces the variable to be emitted. A strict
reading of that does not make it seem like __used__ implies disabling
unused warnings

The compiler documentation makes it clear what happens behind the used
and unused attributes, but the chosen names may confuse readers if such
a combination catches the eye in random code.

This patch adds the __always_used macro, which combines both attributes
and comments on what happens for those interested in the details.

Suggested-by: Nathan Chancellor <[email protected]>
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Yury Norov <[email protected]>
---
include/linux/compiler_attributes.h | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
index 8bdf6e0918c1..957b2d914119 100644
--- a/include/linux/compiler_attributes.h
+++ b/include/linux/compiler_attributes.h
@@ -361,6 +361,18 @@
*/
#define __used __attribute__((__used__))

+/*
+ * The __used attribute guarantees that the attributed variable will be
+ * always emitted by a compiler. It doesn't prevent the compiler from
+ * throwing the 'unused' warnings when it can't detect how the variable
+ * is actually used. It's a compiler implementation details either emit
+ * the warning in that case or not.
+ *
+ * The combination of both 'used' and 'unused' attributes ensures that
+ * the variable would be emitted, and will not trigger 'unused' warnings.
+ */
+#define __always_used __used __maybe_unused
+
/*
* gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-warn_005funused_005fresult-function-attribute
* clang: https://clang.llvm.org/docs/AttributeReference.html#nodiscard-warn-unused-result
--
2.40.1
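
As an illustration of what the proposed macro would expand to, the
following userspace sketch spells out both attributes by hand. The
SKETCH_* macro names and the functions count_set_bits() and
run_discarded() are hypothetical; they are not kernel identifiers.
Following the explanation quoted in this thread, 'used' asks the
compiler to emit the variable, while 'unused' silences the unused
warnings, including Clang's -Wunused-but-set-variable in Nathan's
test.c.

```c
/* Hypothetical userspace spellings of the kernel's __used / __maybe_unused. */
#define SKETCH_USED		__attribute__((__used__))
#define SKETCH_MAYBE_UNUSED	__attribute__((__unused__))
/* Mirrors the proposed __always_used: emitted, and no 'unused' warnings. */
#define SKETCH_ALWAYS_USED	SKETCH_USED SKETCH_MAYBE_UNUSED

/* Side-effect-free function whose result the caller does not need. */
static unsigned int count_set_bits(unsigned long word)
{
	unsigned int count = 0;

	while (word) {
		word &= word - 1;	/* clear the lowest set bit */
		count++;
	}

	return count;
}

/* Calls count_set_bits() only for its cost; the result is write-only. */
void run_discarded(unsigned long word)
{
	static volatile SKETCH_ALWAYS_USED unsigned long sink;

	sink = count_set_bits(word);
}
```

The variable is a function-local static here, matching the test_fns()
case that triggered the robot report.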


2024-05-03 22:23:48

by Nathan Chancellor

Subject: Re: [PATCH v5 1/2] lib/test_bitops: Add benchmark test for fns()

On Fri, May 03, 2024 at 02:55:57PM -0700, Yury Norov wrote:
> Hi Nathan,
>
> Thank you for sharing this.
>
> I think this __used __unused thing may confuse readers when spotted in
> random test code. What do you think about making it a new macro with a
> proper comment to avoid confusion?
>
> I did that in the patch below. If you like it, I can prepend it to
> Kuan-Wei's series and fix the test in place.
>
> Thanks,
> Yury
>
> From 987a021cc76495b32f680507e0c55a105e8edff3 Mon Sep 17 00:00:00 2001
> From: Yury Norov <[email protected]>
> Date: Fri, 3 May 2024 12:12:00 -0700
> Subject: [PATCH] Compiler Attributes: Add __always_used macro
>
> In some cases, such as performance benchmarking, we need to call a
> function but don't need to read the returned value. If the compiler
> recognizes the function as pure or const, it can remove the function
> invocation, which is not what we want.
>
> To prevent that, the common practice is to assign the return value to
> a temporary static volatile __used variable. This works with GCC, but
> clang still emits -Wunused-but-set-variable. To suppress that warning,
> the variable additionally needs the 'unused' attribute.
>
> Nathan Chancellor explained that in detail:
>
> While having used and unused attributes together might look unusual,
> reading the GCC attribute manual makes it seem like these attributes
> fulfill similar yet different roles, __unused__ prevents any unused
> warnings while __used__ forces the variable to be emitted. A strict
> reading of that does not make it seem like __used__ implies disabling
> unused warnings
>
> The compiler documentation makes it clear what happens behind the used
> and unused attributes, but the chosen names may confuse readers if such
> a combination catches the eye in random code.
>
> This patch adds the __always_used macro, which combines both attributes
> and comments on what happens for those interested in the details.
>
> Suggested-by: Nathan Chancellor <[email protected]>
> Reported-by: kernel test robot <[email protected]>
> Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
> Signed-off-by: Yury Norov <[email protected]>

Yeah I think this is reasonable to make this a macro, I am sure there
are other places where this might be useful.

Reviewed-by: Nathan Chancellor <[email protected]>

Adding Miguel as compiler attributes maintainer to make him aware of the
change. I think it would be reasonable to have you take it through the
bitops tree with your ack so that the test patch can make use of this as
the fix for the robot's issue.

One gotcha that might be worth mentioning is that this combination only
works on functions and non-local variables (i.e., static or global).

> ---
> include/linux/compiler_attributes.h | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h
> index 8bdf6e0918c1..957b2d914119 100644
> --- a/include/linux/compiler_attributes.h
> +++ b/include/linux/compiler_attributes.h
> @@ -361,6 +361,18 @@
> */
> #define __used __attribute__((__used__))
>
> +/*
> + * The __used attribute guarantees that the attributed variable will be
> + * always emitted by a compiler. It doesn't prevent the compiler from
> + * throwing the 'unused' warnings when it can't detect how the variable
> + * is actually used. It's a compiler implementation details either emit
> + * the warning in that case or not.
> + *
> + * The combination of both 'used' and 'unused' attributes ensures that
> + * the variable would be emitted, and will not trigger 'unused' warnings.
> + */
> +#define __always_used __used __maybe_unused
> +
> /*
> * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-warn_005funused_005fresult-function-attribute
> * clang: https://clang.llvm.org/docs/AttributeReference.html#nodiscard-warn-unused-result
> --
> 2.40.1
>

2024-05-05 10:44:27

by Miguel Ojeda

Subject: Re: [PATCH v5 1/2] lib/test_bitops: Add benchmark test for fns()

On Sat, May 4, 2024 at 12:23 AM Nathan Chancellor <[email protected]> wrote:
>
> Adding Miguel as compiler attributes maintainer to make him aware of the
> change. I think it would be reasonable to have you take it through the
> bitops tree with your ack so that the test patch can make use of this as
> the fix for the robot's issue.

Thanks Nathan!

Acked-by: Miguel Ojeda <[email protected]>

The new macro sounds OK to me. Perhaps we should add some docs to the
other two attributes about their relationship and/or the existence of
this new `__always_used` one (so that it is easier to find). And if
so, then perhaps the docs for `__always_used` can be simplified.

> One gotcha that might be worth mentioning is that this combination only
> works on functions and non-local variables (i.e., static or global).

Yeah, since the `unused` one only applies to that, right?

Cheers,
Miguel

2024-05-05 10:44:35

by Miguel Ojeda

Subject: Re: [PATCH v5 1/2] lib/test_bitops: Add benchmark test for fns()

On Fri, May 3, 2024 at 11:56 PM Yury Norov <[email protected]> wrote:
>
> + * The __used attribute guarantees that the attributed variable will be

We should probably mention functions as Nathan said (unless it does
not work for some reason).

> + * always emitted by a compiler. It doesn't prevent the compiler from
> + * throwing the 'unused' warnings when it can't detect how the variable

Nit: "throwing the" -> "throwing" (I think)

Also, perhaps "when ..." -> "when it appears that the variable is not
used" like in the documentation of the attribute or similar? (e.g. in
the case that triggered the report, it is really unused, while one
could read this as the compiler not being able to detect a use
somewhere).

> + * is actually used. It's a compiler implementation details either emit
> + * the warning in that case or not.

Is it an implementation detail or rather that they took different
alternatives/options on purpose (even if not documented)? If we think
it is just a consequence of their implementation, perhaps we should
mention that and what GCC/Clang do today in their latest version, in
case it changes (so that we know whether we need to remove the macro,
for instance).

None of the above is a big deal though -- thanks!

Cheers,
Miguel

2024-05-06 17:53:52

by Nathan Chancellor

Subject: Re: [PATCH v5 1/2] lib/test_bitops: Add benchmark test for fns()

On Sun, May 05, 2024 at 12:42:50PM +0200, Miguel Ojeda wrote:
> On Sat, May 4, 2024 at 12:23 AM Nathan Chancellor <[email protected]> wrote:
> > One gotcha that might be worth mentioning is that this combination only
> > works on functions and non-local variables (i.e., static or global).
>
> Yeah, since the `unused` one only applies to that, right?

No, unused can be used with local variables, used cannot.

https://godbolt.org/z/1hroMGzb1
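
A minimal sketch of that difference, assuming GCC/Clang attribute syntax. The names `sum_to` and `keep_me` are hypothetical and do not come from the kernel tree or from the godbolt link above:

```c
/* Hypothetical helper for illustration only. */
static int sum_to(int n)
{
	/*
	 * `unused` is accepted on a local variable: it only tells the
	 * compiler not to warn if the variable ends up unreferenced.
	 */
	int scratch __attribute__((unused)) = 0;
	int total = 0;

	for (int i = 1; i <= n; i++)
		total += i;
	return total;
}

/*
 * `used` asks the compiler to emit the object even when nothing
 * references it, so it is placed on a file-scope static here.
 * Putting it on a local such as `scratch` is expected to be
 * diagnosed and ignored rather than honored.
 */
static int keep_me __attribute__((used)) = 42;
```

The exact wording of the diagnostic for `used` on a local differs between compilers, so it is not reproduced here.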

Cheers,
Nathan

2024-05-06 17:56:55

by Nathan Chancellor

Subject: Re: [PATCH v5 1/2] lib/test_bitops: Add benchmark test for fns()

On Sun, May 05, 2024 at 12:42:53PM +0200, Miguel Ojeda wrote:
> On Fri, May 3, 2024 at 11:56 PM Yury Norov <[email protected]> wrote:
> >
> > + * The __used attribute guarantees that the attributed variable will be
>
> We should probably mention functions as Nathan said (unless it does
> not work for some reason).

Yeah, it should work for functions. I think clarifying it will not work
for local variables would probably be good as well, since __used__ does
not work on those like I replied in my other email, but it is not that
big of a deal.
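
As a rough illustration of the combination on both a function and a non-local variable, assuming GCC/Clang attribute syntax. `my_always_used`, `bench_sink`, and `never_called` are hypothetical names; the macro below is a stand-in and is not claimed to match the kernel's definition:

```c
/*
 * Hypothetical stand-in for the macro under discussion, spelled out
 * so the sketch builds outside the kernel: `used` forces emission,
 * `unused` silences the "unused" style warnings.
 */
#define my_always_used __attribute__((__used__, __unused__))

/* File-scope variable: kept in the object file, no unused warning. */
static volatile unsigned long bench_sink my_always_used;

/* Function: the same pair of attributes applies here as well. */
static int my_always_used never_called(int x)
{
	return x + 1;
}
```

The combination is restricted to these two cases because `used` is the limiting attribute; `unused` on its own would also be accepted on a local.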

> > + * is actually used. It's a compiler implementation details either emit
> > + * the warning in that case or not.
>
> Is it an implementation detail or rather that they took different
> alternatives/options on purpose (even if not documented)? If we think
> it is just a consequence of their implementation, perhaps we should
> mention that and what GCC/Clang do today in their latest version, in
> case it changes (so that we know whether we need to remove the macro,
> for instance).

Yeah, it is entirely possible that this is not intentional, but when
-Wunused-but-set-variable was introduced in Clang, I know there was a
lot of discussion around making the warning match GCC in certain ways as
well as breaking from GCC in others. I have not tried to dig up those
discussions to confirm though.

Cheers,
Nathan

2024-05-06 18:10:15

by Miguel Ojeda

Subject: Re: [PATCH v5 1/2] lib/test_bitops: Add benchmark test for fns()

On Mon, May 6, 2024 at 7:52 PM Nathan Chancellor <[email protected]> wrote:
>
> No, unused can be used with local variables, used cannot.

Yeah, sorry, I meant `used`, i.e. that is the one that constrains
the combination rather than `unused`.

From a quick look at the links in `compiler_attributes.h`, `unused`
can also be applied to types, while `used` cannot -- it is another
difference, but your sentence above already implies it anyway. :)
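
A small sketch of `unused` applied to a type, assuming GCC/Clang attribute syntax. `scratch_t` and `twice` are hypothetical names used only for illustration:

```c
/*
 * Hypothetical type: variables declared with it are meant to be
 * allowed to go unused without a warning.
 */
typedef int scratch_t __attribute__((unused));

static int twice(int x)
{
	scratch_t tmp;	/* never read; covered by the type-level attribute */

	return 2 * x;
}
```

There is no equivalent for `used`, since it concerns emitting a particular object or function rather than a property of a type.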

Thanks for the correction!

Cheers,
Miguel

2024-05-06 22:47:35

by Yury Norov

Subject: Re: [PATCH v5 1/2] lib/test_bitops: Add benchmark test for fns()

On Mon, May 06, 2024 at 08:08:41PM +0200, Miguel Ojeda wrote:
> On Mon, May 6, 2024 at 7:52 PM Nathan Chancellor <[email protected]> wrote:
> >
> > No, unused can be used with local variables, used cannot.
>
> Yeah, sorry, I meant `used`, i.e. that is the one that constrains
> the combination rather than `unused`.
>
> From a quick look at the links in `compiler_attributes.h`, `unused`
> can also be applied to types, while `used` cannot -- it is another
> difference, but your sentence above already implies it anyway. :)
>
> Thanks for the correction!

I have applied the patch in bitmap-for-next this weekend.

https://github.com/norov/linux/commit/eb21fc0c96b48d1e779a0ab16f9165a3e0cd76ad

Can you guys please take a look at it wrt the last comments? I think
it's OK. But if not, I will resend it.

Thanks,
Yury

2024-05-07 14:23:35

by Nathan Chancellor

Subject: Re: [PATCH v5 1/2] lib/test_bitops: Add benchmark test for fns()

On Mon, May 06, 2024 at 03:47:24PM -0700, Yury Norov wrote:
> On Mon, May 06, 2024 at 08:08:41PM +0200, Miguel Ojeda wrote:
> > On Mon, May 6, 2024 at 7:52 PM Nathan Chancellor <[email protected]> wrote:
> > >
> > > No, unused can be used with local variables, used cannot.
> >
> > Yeah, sorry, I meant `used`, i.e. that is the one that constrains
> > the combination rather than `unused`.
> >
> > From a quick look at the links in `compiler_attributes.h`, `unused`
> > can also be applied to types, while `used` cannot -- it is another
> > difference, but your sentence above already implies it anyway. :)
> >
> > Thanks for the correction!
>
> I have applied the patch in bitmap-for-next this weekend.
>
> https://github.com/norov/linux/commit/eb21fc0c96b48d1e779a0ab16f9165a3e0cd76ad
>
> Can you guys please take a look at it wrt the last comments? I think
> it's OK. But if not, I will resend it.

Yeah, I think it looks reasonable.

Cheers,
Nathan