2024-05-27 20:30:25

by Kuan-Wei Chiu

[permalink] [raw]
Subject: [PATCH 0/4] lib/sort: Optimizations and cleanups

Hi Andrew,

This patch series optimizes the handling of the last 2 or 3 elements in
lib/sort and adds a testcase in lib/test_sort to maintain 100% code
coverage reflecting this change. Additionally, it corrects outdated
descriptions regarding glibc qsort() and removes the unused pr_fmt
macro.

Regards,
Kuna-Wei

Kuan-Wei Chiu (4):
lib/sort: Remove unused pr_fmt macro
lib/sort: Fix outdated comment regarding glibc qsort()
lib/sort: Optimize heapsort for handling final 2 or 3 elements
lib/test_sort: Add a testcase to ensure code coverage

lib/sort.c | 14 +++++++-------
lib/test_sort.c | 14 +++++++++++++-
2 files changed, 20 insertions(+), 8 deletions(-)

--
2.34.1



2024-05-27 20:30:48

by Kuan-Wei Chiu

[permalink] [raw]
Subject: [PATCH 1/4] lib/sort: Remove unused pr_fmt macro

The pr_fmt macro is defined but not used in lib/sort.c. Since there are
no pr_* functions printing any messages, the pr_fmt macro is redundant
and can be safely removed.

Signed-off-by: Kuan-Wei Chiu <[email protected]>
---
lib/sort.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/lib/sort.c b/lib/sort.c
index a0509088f82a..651b73205f6d 100644
--- a/lib/sort.c
+++ b/lib/sort.c
@@ -10,8 +10,6 @@
* quicksort's O(n^2) worst case.
*/

-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
#include <linux/types.h>
#include <linux/export.h>
#include <linux/sort.h>
--
2.34.1


2024-05-27 20:31:01

by Kuan-Wei Chiu

[permalink] [raw]
Subject: [PATCH 3/4] lib/sort: Optimize heapsort for handling final 2 or 3 elements

After building the heap, the code continuously pops two elements from
the heap until only 2 or 3 elements remain, at which point it switches
back to a regular heapsort with one element popped at a time. However,
to handle the final 2 or 3 elements, an additional else-if statement in
the while loop was introduced, potentially increasing branch misses.
Moreover, when there are only 2 or 3 elements left, continuing with
regular heapify operations is unnecessary as these cases are simple
enough to be handled with a single comparison and 1 or 2 swaps outside
the while loop.

Eliminating the additional else-if statement and directly managing
cases involving 2 or 3 elements outside the loop reduces unnecessary
conditional branches resulting from the numerous loops and conditionals
in heapify.

This optimization maintains consistent numbers of comparisons and swaps
for arrays with even lengths while reducing swaps and comparisons for
arrays with odd lengths from 2.5 swaps and 1 comparison to 1.5 swaps
and 1 comparison.

Signed-off-by: Kuan-Wei Chiu <[email protected]>
---
lib/sort.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/lib/sort.c b/lib/sort.c
index b918ae15302d..048b7a6ef967 100644
--- a/lib/sort.c
+++ b/lib/sort.c
@@ -250,10 +250,7 @@ void sort_r(void *base, size_t num, size_t size,
a = size << shift;
n -= size;
do_swap(base + a, base + n, size, swap_func, priv);
- } else if (n > size) { /* Sorting: Extract root */
- n -= size;
- do_swap(base, base + n, size, swap_func, priv);
- } else { /* Sort complete */
+ } else { /* Sort complete */
break;
}

@@ -283,6 +280,11 @@ void sort_r(void *base, size_t num, size_t size,
do_swap(base + b, base + c, size, swap_func, priv);
}
}
+
+ n -= size;
+ do_swap(base, base + n, size, swap_func, priv);
+ if (n == size * 2 && do_cmp(base, base + size, cmp_func, priv) > 0)
+ do_swap(base, base + size, size, swap_func, priv);
}
EXPORT_SYMBOL(sort_r);

--
2.34.1


2024-05-27 20:31:04

by Kuan-Wei Chiu

[permalink] [raw]
Subject: [PATCH 2/4] lib/sort: Fix outdated comment regarding glibc qsort()

The existing comment in lib/sort refers to glibc qsort() using
quicksort. However, glibc qsort() no longer uses quicksort; it now uses
mergesort and falls back to heapsort if memory allocation for mergesort
fails. This makes the comment outdated and incorrect.

Update the comment to refer to quicksort in general rather than glibc's
implementation to provide accurate information about the comparisons
and trade-offs without implying an outdated implementation.

Signed-off-by: Kuan-Wei Chiu <[email protected]>
---
lib/sort.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/sort.c b/lib/sort.c
index 651b73205f6d..b918ae15302d 100644
--- a/lib/sort.c
+++ b/lib/sort.c
@@ -5,7 +5,7 @@
* This performs n*log2(n) + 0.37*n + o(n) comparisons on average,
* and 1.5*n*log2(n) + O(n) in the (very contrived) worst case.
*
- * Glibc qsort() manages n*log2(n) - 1.26*n for random inputs (1.63*n
+ * Quicksort manages n*log2(n) - 1.26*n for random inputs (1.63*n
* better) at the expense of stack usage and much larger code to avoid
* quicksort's O(n^2) worst case.
*/
--
2.34.1


2024-05-27 20:31:14

by Kuan-Wei Chiu

[permalink] [raw]
Subject: [PATCH 4/4] lib/test_sort: Add a testcase to ensure code coverage

The addition of an if statement in lib/sort to handle the final
unsorted 2 or 3 elements is not covered by existing test cases, leading
to incomplete test coverage. To ensure comprehensive testing and
maintain 100% code coverage, add a new testcase for scenarios where the
if statement is triggered.

Since the if statement is only triggered when the array length is odd
and the first element is greater than the second element, a testcase is
created using an array length of TEST_LEN - 1 and a suitable random
seed to maintain full code coverage.

Signed-off-by: Kuan-Wei Chiu <[email protected]>
---
lib/test_sort.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/lib/test_sort.c b/lib/test_sort.c
index be02e3a098cf..da4495125097 100644
--- a/lib/test_sort.c
+++ b/lib/test_sort.c
@@ -29,7 +29,19 @@ static void test_sort(struct kunit *test)

sort(a, TEST_LEN, sizeof(*a), cmpint, NULL);

- for (i = 0; i < TEST_LEN-1; i++)
+ for (i = 0; i < TEST_LEN - 1; i++)
+ KUNIT_ASSERT_LE(test, a[i], a[i + 1]);
+
+ r = 48;
+
+ for (i = 0; i < TEST_LEN - 1; i++) {
+ r = (r * 725861) % 6599;
+ a[i] = r;
+ }
+
+ sort(a, TEST_LEN - 1, sizeof(*a), cmpint, NULL);
+
+ for (i = 0; i < TEST_LEN - 2; i++)
KUNIT_ASSERT_LE(test, a[i], a[i + 1]);
}

--
2.34.1