nfit_mem_cmp() uses conditional branches to compare two device handles
and return -1, 1, or 0. However, a list_sort() comparison function only
needs to return a value that is <0, >0, or ==0. Make the comparison
branchless by returning the difference of the two handles. This replaces
one or two comparisons with a single subtraction, reducing the
instruction count and the code size.

Signed-off-by: Kuan-Wei Chiu <[email protected]>
---
drivers/acpi/nfit/core.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index f96bf32cd368..eea827d9af08 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -1138,11 +1138,7 @@ static int nfit_mem_cmp(void *priv, const struct list_head *_a,
 
 	handleA = __to_nfit_memdev(a)->device_handle;
 	handleB = __to_nfit_memdev(b)->device_handle;
-	if (handleA < handleB)
-		return -1;
-	else if (handleA > handleB)
-		return 1;
-	return 0;
+	return handleA - handleB;
 }
 
 static int nfit_mem_init(struct acpi_nfit_desc *acpi_desc)
--
2.25.1
nfit_mem_cmp() uses conditional branches to compare two device handles
and return -1, 1, or 0. However, a list_sort() comparison function only
needs to return a value that is <0, >0, or ==0. Make the comparison
branchless by returning the difference of the two handles. This replaces
one or two comparisons with a single subtraction, reducing the
instruction count and the code size.

Signed-off-by: Kuan-Wei Chiu <[email protected]>
---
v1 -> v2:
- Add explicit type cast in case the sizes of u32 and int differ.
drivers/acpi/nfit/core.c | 6 +-----
1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index f96bf32cd368..563a32eba888 100644
--- a/drivers/acpi/nfit/core.c
+++ b/drivers/acpi/nfit/core.c
@@ -1138,11 +1138,7 @@ static int nfit_mem_cmp(void *priv, const struct list_head *_a,
 
 	handleA = __to_nfit_memdev(a)->device_handle;
 	handleB = __to_nfit_memdev(b)->device_handle;
-	if (handleA < handleB)
-		return -1;
-	else if (handleA > handleB)
-		return 1;
-	return 0;
+	return (int)handleA - (int)handleB;
 }
 
 static int nfit_mem_init(struct acpi_nfit_desc *acpi_desc)
--
2.25.1
On Fri, Oct 13, 2023 at 2:22 PM Kuan-Wei Chiu <[email protected]> wrote:
> @@ -1138,11 +1138,7 @@ static int nfit_mem_cmp(void *priv, const struct list_head *_a,
>
> handleA = __to_nfit_memdev(a)->device_handle;
> handleB = __to_nfit_memdev(b)->device_handle;
> - if (handleA < handleB)
> - return -1;
> - else if (handleA > handleB)
> - return 1;
> - return 0;
> +       return (int)handleA - (int)handleB;

Are you sure that you are not losing bits in these conversions?

>  }
>
> static int nfit_mem_init(struct acpi_nfit_desc *acpi_desc)
> --
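
To make the concern about lost bits concrete, here is a minimal sketch that is not part of the patch. It assumes a 32-bit int and a compiler such as GCC or Clang that wraps out-of-range conversions to a signed type. The names cmp_branch(), cmp_sub() and check_cmp() are hypothetical. Since device_handle is a u32, the difference of two handles can exceed INT_MAX, and the sign of the result then no longer matches the ordering. The subtraction is modelled with unsigned arithmetic so that the sketch itself avoids signed overflow.

```c
#include <assert.h>
#include <stdint.h>

/* Three-way comparison, as in the original nfit_mem_cmp(). */
static int cmp_branch(uint32_t a, uint32_t b)
{
	if (a < b)
		return -1;
	else if (a > b)
		return 1;
	return 0;
}

/*
 * Subtraction-based comparison (hypothetical model of the patch):
 * the difference is taken modulo 2^32 and then reinterpreted as a
 * signed 32-bit value.
 */
static int cmp_sub(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b);
}

/* Hypothetical self-check showing where the two versions disagree. */
void check_cmp(void)
{
	/* Nearby values: both versions agree on the sign. */
	assert(cmp_branch(5, 3) > 0 && cmp_sub(5, 3) > 0);

	/* Difference above INT32_MAX: the subtraction flips the sign. */
	assert(cmp_branch(0x80000001u, 0) > 0);
	assert(cmp_sub(0x80000001u, 0) < 0);
	assert(cmp_branch(0, 0x80000001u) < 0);
	assert(cmp_sub(0, 0x80000001u) > 0);
}
```

Whether real device handles ever differ by that much is a separate question; the sketch only shows that the subtraction is not equivalent to the branching version over the full u32 range.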
On Wed, Oct 18, 2023 at 01:17:31PM +0200, Rafael J. Wysocki wrote:
> On Fri, Oct 13, 2023 at 2:22 PM Kuan-Wei Chiu <[email protected]> wrote:
> > @@ -1138,11 +1138,7 @@ static int nfit_mem_cmp(void *priv, const struct list_head *_a,
> >
> > handleA = __to_nfit_memdev(a)->device_handle;
> > handleB = __to_nfit_memdev(b)->device_handle;
> > - if (handleA < handleB)
> > - return -1;
> > - else if (handleA > handleB)
> > - return 1;
> > - return 0;
> > + return (int)handleA - (int)handleB;
>
> Are you sure that you are not losing bits in these conversions?

I believe your concerns are valid. Perhaps this was a stupid mistake I
made. Initially, I proposed this patch because I noticed that other
parts of the Linux kernel, such as the sram_reserve_cmp() function in
drivers/misc/sram.c, directly use subtraction for comparisons
involving u32. However, this approach could potentially lead to issues
when the size of int is 2 bytes instead of 4 bytes. I think we should
consider dropping this patch. I apologize for proposing an incorrect
patch.

Thanks,
Kuan-Wei Chiu
>
> > }
> >
> > static int nfit_mem_init(struct acpi_nfit_desc *acpi_desc)
> > --
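
For reference, one common way to avoid an explicit if/else chain while staying correct over the full u32 range is to subtract the results of the two relational tests rather than the operands themselves. This is only a sketch and was not proposed in this thread. cmp_u32() and check_cmp_u32() are hypothetical names, and whether the compiler actually emits branch-free code depends on the target and the optimization level.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: each relational test evaluates to 0 or 1, so
 * the result is always -1, 0 or 1 and cannot overflow, regardless of
 * how far apart a and b are.
 */
static int cmp_u32(uint32_t a, uint32_t b)
{
	return (a > b) - (a < b);
}

/* Hypothetical self-check covering values that break plain subtraction. */
void check_cmp_u32(void)
{
	assert(cmp_u32(3, 5) == -1);
	assert(cmp_u32(5, 3) == 1);
	assert(cmp_u32(7, 7) == 0);
	assert(cmp_u32(0x80000001u, 0) == 1);
	assert(cmp_u32(0, 0xFFFFFFFFu) == -1);
}
```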