2024-03-13 08:47:20

by Arnd Bergmann

[permalink] [raw]
Subject: [PATCH] mtd: ubi: avoid expensive do_div() on 32-bit machines

From: Arnd Bergmann <[email protected]>

The use of do_div() in ubi_nvmem_reg_read() makes calling it on
32-bit machines rather expensive. Since the 'from' variable is
known to be a 32-bit quantity, it is clearly never needed and
can be optimized into a regular division operation.

Fixes: b8a77b9a5f9c ("mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems")
Fixes: 3ce485803da1 ("mtd: ubi: provide NVMEM layer over UBI volumes")
Signed-off-by: Arnd Bergmann <[email protected]>
---
drivers/mtd/ubi/nvmem.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/mtd/ubi/nvmem.c b/drivers/mtd/ubi/nvmem.c
index 8aeb9c428e51..a94a1a9aaec1 100644
--- a/drivers/mtd/ubi/nvmem.c
+++ b/drivers/mtd/ubi/nvmem.c
@@ -6,7 +6,6 @@
/* UBI NVMEM provider */
#include "ubi.h"
#include <linux/nvmem-provider.h>
-#include <asm/div64.h>

/* List of all NVMEM devices */
static LIST_HEAD(nvmem_devices);
@@ -27,14 +26,15 @@ static int ubi_nvmem_reg_read(void *priv, unsigned int from,
struct ubi_nvmem *unv = priv;
struct ubi_volume_desc *desc;
uint32_t offs;
- uint64_t lnum = from;
+ uint32_t lnum;
int err = 0;

desc = ubi_open_volume(unv->ubi_num, unv->vol_id, UBI_READONLY);
if (IS_ERR(desc))
return PTR_ERR(desc);

- offs = do_div(lnum, unv->usable_leb_size);
+ offs = from % unv->usable_leb_size;
+ lnum = from / unv->usable_leb_size;
while (bytes_left) {
to_read = unv->usable_leb_size - offs;

--
2.39.2



2024-03-13 11:29:55

by Zhihao Cheng

[permalink] [raw]
Subject: Re: [PATCH] mtd: ubi: avoid expensive do_div() on 32-bit machines

?? 2024/3/13 16:46, Arnd Bergmann д??:
> From: Arnd Bergmann <[email protected]>
>
> The use of do_div() in ubi_nvmem_reg_read() makes calling it on
> 32-bit machines rather expensive. Since the 'from' variable is
> known to be a 32-bit quantity, it is clearly never needed and
> can be optimized into a regular division operation.
>
Do you meet a performance problem on a 32-bit machine? There are too
many places invoking do_div, why do you optimize this one?
Have you tested the influence on a x86_64 platform after this patch
applied? Looks like that do_div is more efficient in x86.

> Fixes: b8a77b9a5f9c ("mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems")
> Fixes: 3ce485803da1 ("mtd: ubi: provide NVMEM layer over UBI volumes")
> Signed-off-by: Arnd Bergmann <[email protected]>
> ---
> drivers/mtd/ubi/nvmem.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/mtd/ubi/nvmem.c b/drivers/mtd/ubi/nvmem.c
> index 8aeb9c428e51..a94a1a9aaec1 100644
> --- a/drivers/mtd/ubi/nvmem.c
> +++ b/drivers/mtd/ubi/nvmem.c
> @@ -6,7 +6,6 @@
> /* UBI NVMEM provider */
> #include "ubi.h"
> #include <linux/nvmem-provider.h>
> -#include <asm/div64.h>
>
> /* List of all NVMEM devices */
> static LIST_HEAD(nvmem_devices);
> @@ -27,14 +26,15 @@ static int ubi_nvmem_reg_read(void *priv, unsigned int from,
> struct ubi_nvmem *unv = priv;
> struct ubi_volume_desc *desc;
> uint32_t offs;
> - uint64_t lnum = from;
> + uint32_t lnum;
> int err = 0;
>
> desc = ubi_open_volume(unv->ubi_num, unv->vol_id, UBI_READONLY);
> if (IS_ERR(desc))
> return PTR_ERR(desc);
>
> - offs = do_div(lnum, unv->usable_leb_size);
> + offs = from % unv->usable_leb_size;
> + lnum = from / unv->usable_leb_size;
> while (bytes_left) {
> to_read = unv->usable_leb_size - offs;
>
>


2024-03-13 11:53:44

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] mtd: ubi: avoid expensive do_div() on 32-bit machines

On Wed, Mar 13, 2024, at 12:29, Zhihao Cheng wrote:
> 在 2024/3/13 16:46, Arnd Bergmann 写道:
>> From: Arnd Bergmann <[email protected]>
>>
>> The use of do_div() in ubi_nvmem_reg_read() makes calling it on
>> 32-bit machines rather expensive. Since the 'from' variable is
>> known to be a 32-bit quantity, it is clearly never needed and
>> can be optimized into a regular division operation.
>>
> Do you meet a performance problem on a 32-bit machine? There are too
> many places invoking do_div, why do you optimize this one?
> Have you tested the influence on a x86_64 platform after this patch
> applied? Looks like that do_div is more efficient in x86.

This one was just introduced. The call site looks like a fast
path and it caused a build regression that Daniel addressed with
an suboptimal commit b8a77b9a5f9c ("mtd: ubi: fix NVMEM over
UBI volumes on 32-bit systems").

The way it usually goes is that someone adds an open-coded
64-bit division that causes a link failure, which prompts
the original developer to either rewrite the code to avoid
the long division if possible, or add do_div() after showing
that it is now performance critical, e.g. only called at
probe time.

Arnd

2024-03-13 12:11:25

by Zhihao Cheng

[permalink] [raw]
Subject: Re: [PATCH] mtd: ubi: avoid expensive do_div() on 32-bit machines

在 2024/3/13 19:53, Arnd Bergmann 写道:
> On Wed, Mar 13, 2024, at 12:29, Zhihao Cheng wrote:
>> 在 2024/3/13 16:46, Arnd Bergmann 写道:
>>> From: Arnd Bergmann <[email protected]>
>>>
>>> The use of do_div() in ubi_nvmem_reg_read() makes calling it on
>>> 32-bit machines rather expensive. Since the 'from' variable is
>>> known to be a 32-bit quantity, it is clearly never needed and
>>> can be optimized into a regular division operation.
>>>
>> Do you meet a performance problem on a 32-bit machine? There are too
>> many places invoking do_div, why do you optimize this one?
>> Have you tested the influence on a x86_64 platform after this patch
>> applied? Looks like that do_div is more efficient in x86.
>
> This one was just introduced. The call site looks like a fast
> path and it caused a build regression that Daniel addressed with
> an suboptimal commit b8a77b9a5f9c ("mtd: ubi: fix NVMEM over
> UBI volumes on 32-bit systems").
>
> The way it usually goes is that someone adds an open-coded
> 64-bit division that causes a link failure, which prompts
I'm a little confused, what kind of link failure? Could you show an example?
> the original developer to either rewrite the code to avoid
> the long division if possible, or add do_div() after showing
> that it is now performance critical, e.g. only called at
> probe time.
>
> Arnd
> .
>


2024-03-13 12:21:56

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] mtd: ubi: avoid expensive do_div() on 32-bit machines

On Wed, Mar 13, 2024, at 13:10, Zhihao Cheng wrote:
> 在 2024/3/13 19:53, Arnd Bergmann 写道:
>> On Wed, Mar 13, 2024, at 12:29, Zhihao Cheng wrote:
>>
>> The way it usually goes is that someone adds an open-coded
>> 64-bit division that causes a link failure, which prompts
> I'm a little confused, what kind of link failure? Could you show an example?

The open-coded 64-bit division without using do_div() shows up as

x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read':
nvmem.c:(.text+0x10a): undefined reference to `__umoddi3'
x86_64-linux-ld: nvmem.c:(.text+0x11f): undefined reference to `__udivdi3'
x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read.cold':
nvmem.c:(.text.unlikely+0x2d): undefined reference to `__umoddi3'

The idea is that gcc expects __umoddi3 to be provided by libgcc,
but Linux intentionally leaves it out in order to catch accidental
64-bit divisions.

Arnd

2024-03-13 13:31:39

by Zhihao Cheng

[permalink] [raw]
Subject: Re: [PATCH] mtd: ubi: avoid expensive do_div() on 32-bit machines

在 2024/3/13 20:21, Arnd Bergmann 写道:
> On Wed, Mar 13, 2024, at 13:10, Zhihao Cheng wrote:
>> 在 2024/3/13 19:53, Arnd Bergmann 写道:
>>> On Wed, Mar 13, 2024, at 12:29, Zhihao Cheng wrote:
>>>
>>> The way it usually goes is that someone adds an open-coded
>>> 64-bit division that causes a link failure, which prompts
>> I'm a little confused, what kind of link failure? Could you show an example?
>
> The open-coded 64-bit division without using do_div() shows up as
>
> x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read':
> nvmem.c:(.text+0x10a): undefined reference to `__umoddi3'
> x86_64-linux-ld: nvmem.c:(.text+0x11f): undefined reference to `__udivdi3'
> x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read.cold':
> nvmem.c:(.text.unlikely+0x2d): undefined reference to `__umoddi3'
> > The idea is that gcc expects __umoddi3 to be provided by libgcc,
> but Linux intentionally leaves it out in order to catch accidental
> 64-bit divisions.
>

Thanks for explaination, which means that do_div is used for 64-bit
division to solve the link failure caused by missed libgcc. Since
parameter 'from' is u32, there is no need to invoke do_div on a 32-bit
platform, you just want to stop the wasting behavior on a 32-bit
platform. Do I understand right?

2024-03-13 13:39:42

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] mtd: ubi: avoid expensive do_div() on 32-bit machines

On Wed, Mar 13, 2024, at 14:29, Zhihao Cheng wrote:
> 在 2024/3/13 20:21, Arnd Bergmann 写道:
>> On Wed, Mar 13, 2024, at 13:10, Zhihao Cheng wrote:
>>> 在 2024/3/13 19:53, Arnd Bergmann 写道:
>>>> On Wed, Mar 13, 2024, at 12:29, Zhihao Cheng wrote:
>>>>
>>>> The way it usually goes is that someone adds an open-coded
>>>> 64-bit division that causes a link failure, which prompts
>>> I'm a little confused, what kind of link failure? Could you show an example?
>>
>> The open-coded 64-bit division without using do_div() shows up as
>>
>> x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read':
>> nvmem.c:(.text+0x10a): undefined reference to `__umoddi3'
>> x86_64-linux-ld: nvmem.c:(.text+0x11f): undefined reference to `__udivdi3'
>> x86_64-linux-ld: drivers/mtd/ubi/nvmem.o: in function `ubi_nvmem_reg_read.cold':
>> nvmem.c:(.text.unlikely+0x2d): undefined reference to `__umoddi3'
>> > The idea is that gcc expects __umoddi3 to be provided by libgcc,
>> but Linux intentionally leaves it out in order to catch accidental
>> 64-bit divisions.
>>
>
> Thanks for explaination, which means that do_div is used for 64-bit
> division to solve the link failure caused by missed libgcc. Since
> parameter 'from' is u32, there is no need to invoke do_div on a 32-bit
> platform, you just want to stop the wasting behavior on a 32-bit
> platform. Do I understand right?

Yes, correct.

Arnd

2024-03-13 13:41:24

by Zhihao Cheng

[permalink] [raw]
Subject: Re: [PATCH] mtd: ubi: avoid expensive do_div() on 32-bit machines

?? 2024/3/13 16:46, Arnd Bergmann д??:
> From: Arnd Bergmann <[email protected]>
>
> The use of do_div() in ubi_nvmem_reg_read() makes calling it on
> 32-bit machines rather expensive. Since the 'from' variable is
> known to be a 32-bit quantity, it is clearly never needed and
> can be optimized into a regular division operation.
>
> Fixes: b8a77b9a5f9c ("mtd: ubi: fix NVMEM over UBI volumes on 32-bit systems")
> Fixes: 3ce485803da1 ("mtd: ubi: provide NVMEM layer over UBI volumes")
> Signed-off-by: Arnd Bergmann <[email protected]>
> ---
> drivers/mtd/ubi/nvmem.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)

Reviewed-by: Zhihao Cheng <[email protected]>
>
> diff --git a/drivers/mtd/ubi/nvmem.c b/drivers/mtd/ubi/nvmem.c
> index 8aeb9c428e51..a94a1a9aaec1 100644
> --- a/drivers/mtd/ubi/nvmem.c
> +++ b/drivers/mtd/ubi/nvmem.c
> @@ -6,7 +6,6 @@
> /* UBI NVMEM provider */
> #include "ubi.h"
> #include <linux/nvmem-provider.h>
> -#include <asm/div64.h>
>
> /* List of all NVMEM devices */
> static LIST_HEAD(nvmem_devices);
> @@ -27,14 +26,15 @@ static int ubi_nvmem_reg_read(void *priv, unsigned int from,
> struct ubi_nvmem *unv = priv;
> struct ubi_volume_desc *desc;
> uint32_t offs;
> - uint64_t lnum = from;
> + uint32_t lnum;
> int err = 0;
>
> desc = ubi_open_volume(unv->ubi_num, unv->vol_id, UBI_READONLY);
> if (IS_ERR(desc))
> return PTR_ERR(desc);
>
> - offs = do_div(lnum, unv->usable_leb_size);
> + offs = from % unv->usable_leb_size;
> + lnum = from / unv->usable_leb_size;
> while (bytes_left) {
> to_read = unv->usable_leb_size - offs;
>
>


2024-03-13 13:44:11

by Zhihao Cheng

[permalink] [raw]
Subject: Re: [PATCH] mtd: ubi: avoid expensive do_div() on 32-bit machines

在 2024/3/13 21:39, Arnd Bergmann 写道:

>> Thanks for explaination, which means that do_div is used for 64-bit
>> division to solve the link failure caused by missed libgcc. Since
>> parameter 'from' is u32, there is no need to invoke do_div on a 32-bit
>> platform, you just want to stop the wasting behavior on a 32-bit
>> platform. Do I understand right?
>
> Yes, correct.
>
> Arnd
> .
>
How do you find it? I mean there are so many types and many do_div
callers, do you have a static check tool?

2024-03-13 13:58:47

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH] mtd: ubi: avoid expensive do_div() on 32-bit machines

On Wed, Mar 13, 2024, at 14:43, Zhihao Cheng wrote:
> 在 2024/3/13 21:39, Arnd Bergmann 写道:
>
>>> Thanks for explaination, which means that do_div is used for 64-bit
>>> division to solve the link failure caused by missed libgcc. Since
>>> parameter 'from' is u32, there is no need to invoke do_div on a 32-bit
>>> platform, you just want to stop the wasting behavior on a 32-bit
>>> platform. Do I understand right?
>>
> How do you find it? I mean there are so many types and many do_div
> callers, do you have a static check tool?

I had a local fix with the same effect that I applied when
the code originally broke, but had not gotten around to
sending my fix.

After b8a77b9a5f9c ("mtd: ubi: fix NVMEM over UBI volumes on
32-bit systems") introduced a different workaround, I got the
link error because I had removed the do_div().

Arnd