Just for better readability, no code logic change.
Signed-off-by: Yangtao Li <[email protected]>
---
fs/ext4/inode.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d251d705c276..d121cde74522 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
{
struct inode *inode = mpd->inode;
int err;
- ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
- >> inode->i_blkbits;
+ ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
if (ext4_verity_in_progress(inode))
blocks = EXT_MAX_BLOCKS;
--
2.25.1
On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> Just for better readability, no code logic change.
>
> Signed-off-by: Yangtao Li <[email protected]>
> ---
> fs/ext4/inode.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index d251d705c276..d121cde74522 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
> {
> struct inode *inode = mpd->inode;
> int err;
> - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
> - >> inode->i_blkbits;
> + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
>
Please don't do this. This makes the code compile down to a division, which is
far less efficient. I've verified this by checking the assembly generated.
- Eric
> Please don't do this. This makes the code compile down to a division, which is
> far less efficient. I've verified this by checking the assembly generated.
How much is the performance impact? So should the following be modified as shift operations as well?
fs/erofs/namei.c:92: int head = 0, back = DIV_ROUND_UP(dir->i_size, EROFS_BLKSIZ) - 1;
fs/erofs/zmap.c:252: const unsigned int totalidx = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
fs/erofs/decompressor.c:14:#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1)
fs/erofs/decompressor.c:56: DIV_ROUND_UP(distance, PAGE_SIZE) + 1 :
fs/erofs/decompressor.c:70: unsigned long bounced[DIV_ROUND_UP(LZ4_MAX_DISTANCE_PAGES,
fs/erofs/data.c:84: nblocks = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
Thx,
Yangtao
On 2023/3/10 14:27, Yangtao Li wrote:
>> Please don't do this. This makes the code compile down to a division, which is
>> far less efficient. I've verified this by checking the assembly generated.
>
> How much is the performance impact? So should the following be modified as shift operations as well?
>
> fs/erofs/namei.c:92: int head = 0, back = DIV_ROUND_UP(dir->i_size, EROFS_BLKSIZ) - 1;
> fs/erofs/zmap.c:252: const unsigned int totalidx = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
> fs/erofs/decompressor.c:14:#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1)
> fs/erofs/decompressor.c:56: DIV_ROUND_UP(distance, PAGE_SIZE) + 1 :
> fs/erofs/decompressor.c:70: unsigned long bounced[DIV_ROUND_UP(LZ4_MAX_DISTANCE_PAGES,
> fs/erofs/data.c:84: nblocks = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
Please stop taking EROFS as example on ext4 patches
and they will all be changed due to subpage support.
>
> Thx,
> Yangtao
On 2023/3/10 14:35, Gao Xiang wrote:
>
>
> On 2023/3/10 14:27, Yangtao Li wrote:
>>> Please don't do this. This makes the code compile down to a division, which is
>>> far less efficient. I've verified this by checking the assembly generated.
>>
>> How much is the performance impact? So should the following be modified as shift operations as well?
>>
>> fs/erofs/namei.c:92: int head = 0, back = DIV_ROUND_UP(dir->i_size, EROFS_BLKSIZ) - 1;
>> fs/erofs/zmap.c:252: const unsigned int totalidx = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
>> fs/erofs/decompressor.c:14:#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1)
>> fs/erofs/decompressor.c:56: DIV_ROUND_UP(distance, PAGE_SIZE) + 1 :
>> fs/erofs/decompressor.c:70: unsigned long bounced[DIV_ROUND_UP(LZ4_MAX_DISTANCE_PAGES,
>> fs/erofs/data.c:84: nblocks = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
>
> Please stop taking EROFS as example on ext4 patches
> and they will all be changed due to subpage support.
Here EROFS_BLKSIZ == PAGE_SIZE is a constant, so no difference
to use shift or division.
>
>>
>> Thx,
>> Yangtao
On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote:
> On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> > Just for better readability, no code logic change.
> >
> > Signed-off-by: Yangtao Li <[email protected]>
> > ---
> > fs/ext4/inode.c | 3 +--
> > 1 file changed, 1 insertion(+), 2 deletions(-)
> >
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index d251d705c276..d121cde74522 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
> > {
> > struct inode *inode = mpd->inode;
> > int err;
> > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
> > - >> inode->i_blkbits;
> > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
> >
>
> Please don't do this. This makes the code compile down to a division, which is
> far less efficient. I've verified this by checking the assembly generated.
Which compiler is doing that?
On Fri, Mar 10, 2023 at 02:27:38PM +0800, Yangtao Li wrote:
> > Please don't do this. This makes the code compile down to a division, which is
> > far less efficient. I've verified this by checking the assembly generated.
>
> How much is the performance impact? So should the following be modified as shift operations as well?
>
> fs/erofs/namei.c:92: int head = 0, back = DIV_ROUND_UP(dir->i_size, EROFS_BLKSIZ) - 1;
> fs/erofs/zmap.c:252: const unsigned int totalidx = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
> fs/erofs/decompressor.c:14:#define LZ4_MAX_DISTANCE_PAGES (DIV_ROUND_UP(LZ4_DISTANCE_MAX, PAGE_SIZE) + 1)
> fs/erofs/decompressor.c:56: DIV_ROUND_UP(distance, PAGE_SIZE) + 1 :
> fs/erofs/decompressor.c:70: unsigned long bounced[DIV_ROUND_UP(LZ4_MAX_DISTANCE_PAGES,
> fs/erofs/data.c:84: nblocks = DIV_ROUND_UP(inode->i_size, EROFS_BLKSIZ);
>
No, compilers can handle division by a power-of-2 constant just fine. It's just
division by a non-constant that is inefficient.
- Eric
On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote:
> On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote:
> > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> > > Just for better readability, no code logic change.
> > >
> > > Signed-off-by: Yangtao Li <[email protected]>
> > > ---
> > > fs/ext4/inode.c | 3 +--
> > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > >
> > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > index d251d705c276..d121cde74522 100644
> > > --- a/fs/ext4/inode.c
> > > +++ b/fs/ext4/inode.c
> > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
> > > {
> > > struct inode *inode = mpd->inode;
> > > int err;
> > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
> > > - >> inode->i_blkbits;
> > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
> > >
> >
> > Please don't do this. This makes the code compile down to a division, which is
> > far less efficient. I've verified this by checking the assembly generated.
>
> Which compiler is doing that?
While we are at it, replace
return (1 << node->i_blkbits);
with
return (1u << node->i_blkbits);
and see if that changes the things.
On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote:
> On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote:
> > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> > > Just for better readability, no code logic change.
> > >
> > > Signed-off-by: Yangtao Li <[email protected]>
> > > ---
> > > fs/ext4/inode.c | 3 +--
> > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > >
> > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > index d251d705c276..d121cde74522 100644
> > > --- a/fs/ext4/inode.c
> > > +++ b/fs/ext4/inode.c
> > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
> > > {
> > > struct inode *inode = mpd->inode;
> > > int err;
> > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
> > > - >> inode->i_blkbits;
> > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
> > >
> >
> > Please don't do this. This makes the code compile down to a division, which is
> > far less efficient. I've verified this by checking the assembly generated.
>
> Which compiler is doing that?
$ gcc --version
gcc (GCC) 12.2.1 20230201
i_blocksize(inode) is not a constant, so this should not be particularly
surprising. One might hope that a / (1 << b) would be optimized into a >> b,
but that doesn't seem to happen.
- Eric
On Thu, Mar 09, 2023 at 10:43:55PM -0800, Eric Biggers wrote:
> On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote:
> > On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote:
> > > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> > > > Just for better readability, no code logic change.
> > > >
> > > > Signed-off-by: Yangtao Li <[email protected]>
> > > > ---
> > > > fs/ext4/inode.c | 3 +--
> > > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > > >
> > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > > index d251d705c276..d121cde74522 100644
> > > > --- a/fs/ext4/inode.c
> > > > +++ b/fs/ext4/inode.c
> > > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
> > > > {
> > > > struct inode *inode = mpd->inode;
> > > > int err;
> > > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
> > > > - >> inode->i_blkbits;
> > > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
> > > >
> > >
> > > Please don't do this. This makes the code compile down to a division, which is
> > > far less efficient. I've verified this by checking the assembly generated.
> >
> > Which compiler is doing that?
>
> $ gcc --version
> gcc (GCC) 12.2.1 20230201
>
> i_blocksize(inode) is not a constant, so this should not be particularly
> surprising. One might hope that a / (1 << b) would be optimized into a >> b,
> but that doesn't seem to happen.
It really ought to be a / (1u << b), though...
On Fri, Mar 10, 2023 at 06:46:12AM +0000, Al Viro wrote:
> On Thu, Mar 09, 2023 at 10:43:55PM -0800, Eric Biggers wrote:
> > On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote:
> > > On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote:
> > > > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> > > > > Just for better readability, no code logic change.
> > > > >
> > > > > Signed-off-by: Yangtao Li <[email protected]>
> > > > > ---
> > > > > fs/ext4/inode.c | 3 +--
> > > > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > > > >
> > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > > > index d251d705c276..d121cde74522 100644
> > > > > --- a/fs/ext4/inode.c
> > > > > +++ b/fs/ext4/inode.c
> > > > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
> > > > > {
> > > > > struct inode *inode = mpd->inode;
> > > > > int err;
> > > > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
> > > > > - >> inode->i_blkbits;
> > > > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
> > > > >
> > > >
> > > > Please don't do this. This makes the code compile down to a division, which is
> > > > far less efficient. I've verified this by checking the assembly generated.
> > >
> > > Which compiler is doing that?
> >
> > $ gcc --version
> > gcc (GCC) 12.2.1 20230201
> >
> > i_blocksize(inode) is not a constant, so this should not be particularly
> > surprising. One might hope that a / (1 << b) would be optimized into a >> b,
> > but that doesn't seem to happen.
>
> It really ought to be a / (1u << b), though...
Sure, that does better:
uint64_t f(uint64_t a, int b)
{
return a / (1U << b);
}
gcc:
0000000000000000 <f>:
0: 48 89 f8 mov %rdi,%rax
3: 89 f1 mov %esi,%ecx
5: 48 d3 e8 shr %cl,%rax
8: c3 ret
clang:
0000000000000000 <f>:
0: 89 f1 mov %esi,%ecx
2: 48 89 f8 mov %rdi,%rax
5: 48 d3 e8 shr %cl,%rax
8: c3 ret
But with a signed dividend (which is the case here) it gets messed up:
int64_t f(int64_t a, int b)
{
return a / (1U << b);
}
gcc:
0000000000000000 <f>:
0: 48 89 f8 mov %rdi,%rax
3: 89 f1 mov %esi,%ecx
5: bf 01 00 00 00 mov $0x1,%edi
a: d3 e7 shl %cl,%edi
c: 48 99 cqto
e: 48 f7 ff idiv %rdi
11: c3 ret
clang:
0000000000000000 <f>:
0: 89 f1 mov %esi,%ecx
2: be 01 00 00 00 mov $0x1,%esi
7: d3 e6 shl %cl,%esi
9: 48 89 f8 mov %rdi,%rax
c: 48 89 f9 mov %rdi,%rcx
f: 48 c1 e9 20 shr $0x20,%rcx
13: 74 06 je 1b <f+0x1b>
15: 48 99 cqto
17: 48 f7 fe idiv %rsi
1a: c3 ret
1b: 31 d2 xor %edx,%edx
1d: f7 f6 div %esi
1f: c3 ret
On Thu, Mar 09, 2023 at 10:54:38PM -0800, Eric Biggers wrote:
> On Fri, Mar 10, 2023 at 06:46:12AM +0000, Al Viro wrote:
> > On Thu, Mar 09, 2023 at 10:43:55PM -0800, Eric Biggers wrote:
> > > On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote:
> > > > On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote:
> > > > > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> > > > > > Just for better readability, no code logic change.
> > > > > >
> > > > > > Signed-off-by: Yangtao Li <[email protected]>
> > > > > > ---
> > > > > > fs/ext4/inode.c | 3 +--
> > > > > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > > > > >
> > > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > > > > index d251d705c276..d121cde74522 100644
> > > > > > --- a/fs/ext4/inode.c
> > > > > > +++ b/fs/ext4/inode.c
> > > > > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
> > > > > > {
> > > > > > struct inode *inode = mpd->inode;
> > > > > > int err;
> > > > > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
> > > > > > - >> inode->i_blkbits;
> > > > > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
> > > > > >
> > > > >
> > > > > Please don't do this. This makes the code compile down to a division, which is
> > > > > far less efficient. I've verified this by checking the assembly generated.
> > > >
> > > > Which compiler is doing that?
> > >
> > > $ gcc --version
> > > gcc (GCC) 12.2.1 20230201
> > >
> > > i_blocksize(inode) is not a constant, so this should not be particularly
> > > surprising. One might hope that a / (1 << b) would be optimized into a >> b,
> > > but that doesn't seem to happen.
> >
> > It really ought to be a / (1u << b), though...
>
> Sure, that does better:
>
> uint64_t f(uint64_t a, int b)
> {
> return a / (1U << b);
> }
>
> gcc:
> 0000000000000000 <f>:
> 0: 48 89 f8 mov %rdi,%rax
> 3: 89 f1 mov %esi,%ecx
> 5: 48 d3 e8 shr %cl,%rax
> 8: c3 ret
>
> clang:
> 0000000000000000 <f>:
> 0: 89 f1 mov %esi,%ecx
> 2: 48 89 f8 mov %rdi,%rax
> 5: 48 d3 e8 shr %cl,%rax
> 8: c3 ret
>
> But with a signed dividend (which is the case here) it gets messed up:
>
> int64_t f(int64_t a, int b)
> {
> return a / (1U << b);
> }
*ow*
And i_size_read() is long long ;-/ Right.
On Fri, Mar 10, 2023 at 07:05:27AM +0000, Al Viro wrote:
> On Thu, Mar 09, 2023 at 10:54:38PM -0800, Eric Biggers wrote:
> > On Fri, Mar 10, 2023 at 06:46:12AM +0000, Al Viro wrote:
> > > On Thu, Mar 09, 2023 at 10:43:55PM -0800, Eric Biggers wrote:
> > > > On Fri, Mar 10, 2023 at 06:37:29AM +0000, Al Viro wrote:
> > > > > On Thu, Mar 09, 2023 at 10:17:16PM -0800, Eric Biggers wrote:
> > > > > > On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> > > > > > > Just for better readability, no code logic change.
> > > > > > >
> > > > > > > Signed-off-by: Yangtao Li <[email protected]>
> > > > > > > ---
> > > > > > > fs/ext4/inode.c | 3 +--
> > > > > > > 1 file changed, 1 insertion(+), 2 deletions(-)
> > > > > > >
> > > > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > > > > > > index d251d705c276..d121cde74522 100644
> > > > > > > --- a/fs/ext4/inode.c
> > > > > > > +++ b/fs/ext4/inode.c
> > > > > > > @@ -2218,8 +2218,7 @@ static int mpage_process_page_bufs(struct mpage_da_data *mpd,
> > > > > > > {
> > > > > > > struct inode *inode = mpd->inode;
> > > > > > > int err;
> > > > > > > - ext4_lblk_t blocks = (i_size_read(inode) + i_blocksize(inode) - 1)
> > > > > > > - >> inode->i_blkbits;
> > > > > > > + ext4_lblk_t blocks = DIV_ROUND_UP(i_size_read(inode), i_blocksize(inode));
> > > > > > >
> > > > > >
> > > > > > Please don't do this. This makes the code compile down to a division, which is
> > > > > > far less efficient. I've verified this by checking the assembly generated.
> > > > >
> > > > > Which compiler is doing that?
> > > >
> > > > $ gcc --version
> > > > gcc (GCC) 12.2.1 20230201
> > > >
> > > > i_blocksize(inode) is not a constant, so this should not be particularly
> > > > surprising. One might hope that a / (1 << b) would be optimized into a >> b,
> > > > but that doesn't seem to happen.
> > >
> > > It really ought to be a / (1u << b), though...
> >
> > Sure, that does better:
> >
> > uint64_t f(uint64_t a, int b)
> > {
> > return a / (1U << b);
> > }
> >
> > gcc:
> > 0000000000000000 <f>:
> > 0: 48 89 f8 mov %rdi,%rax
> > 3: 89 f1 mov %esi,%ecx
> > 5: 48 d3 e8 shr %cl,%rax
> > 8: c3 ret
> >
> > clang:
> > 0000000000000000 <f>:
> > 0: 89 f1 mov %esi,%ecx
> > 2: 48 89 f8 mov %rdi,%rax
> > 5: 48 d3 e8 shr %cl,%rax
> > 8: c3 ret
> >
> > But with a signed dividend (which is the case here) it gets messed up:
> >
> > int64_t f(int64_t a, int b)
> > {
> > return a / (1U << b);
> > }
>
> *ow*
>
> And i_size_read() is long long ;-/ Right.
Out of curiosity (and that's already too brittle for practical use) -
does DIV_ROUND_UP_ULL() do any better on full example?
Hi Yangtao,
I love your patch! Yet something to improve:
[auto build test ERROR on tytso-ext4/dev]
[also build test ERROR on linus/master v6.3-rc1 next-20230310]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Yangtao-Li/ext4-convert-to-DIV_ROUND_UP-in-mpage_process_page_bufs/20230310-140903
base: https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
patch link: https://lore.kernel.org/r/20230310060734.8780-1-frank.li%40vivo.com
patch subject: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
config: i386-defconfig (https://download.01.org/0day-ci/archive/20230311/[email protected]/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
# https://github.com/intel-lab-lkp/linux/commit/f4d2db5f59592a5688be6e4d2d3dd6f3f94d4f96
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Yangtao-Li/ext4-convert-to-DIV_ROUND_UP-in-mpage_process_page_bufs/20230310-140903
git checkout f4d2db5f59592a5688be6e4d2d3dd6f3f94d4f96
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=i386 olddefconfig
make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/
All errors (new ones prefixed by >>):
ld: fs/ext4/inode.o: in function `mpage_process_page_bufs':
>> inode.c:(.text+0xbda): undefined reference to `__divdi3'
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
Hi Yangtao,
I love your patch! Yet something to improve:
[auto build test ERROR on tytso-ext4/dev]
[also build test ERROR on linus/master v6.3-rc1 next-20230310]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Yangtao-Li/ext4-convert-to-DIV_ROUND_UP-in-mpage_process_page_bufs/20230310-140903
base: https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
patch link: https://lore.kernel.org/r/20230310060734.8780-1-frank.li%40vivo.com
patch subject: [PATCH] ext4: convert to DIV_ROUND_UP() in mpage_process_page_bufs()
config: i386-debian-10.3 (https://download.01.org/0day-ci/archive/20230311/[email protected]/config)
compiler: gcc-11 (Debian 11.3.0-8) 11.3.0
reproduce (this is a W=1 build):
# https://github.com/intel-lab-lkp/linux/commit/f4d2db5f59592a5688be6e4d2d3dd6f3f94d4f96
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Yangtao-Li/ext4-convert-to-DIV_ROUND_UP-in-mpage_process_page_bufs/20230310-140903
git checkout f4d2db5f59592a5688be6e4d2d3dd6f3f94d4f96
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 O=build_dir ARCH=i386 olddefconfig
make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <[email protected]>
| Link: https://lore.kernel.org/oe-kbuild-all/[email protected]/
All errors (new ones prefixed by >>, old ones prefixed by <<):
>> ERROR: modpost: "__divdi3" [fs/ext4/ext4.ko] undefined!
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests
On Fri, Mar 10, 2023 at 07:12:31AM +0000, Al Viro wrote:
>
> Out of curiosity (and that's already too brittle for practical use) -
> does DIV_ROUND_UP_ULL() do any better on full example?
'DIV_ROUND_UP_ULL(i_size_read(inode), i_blocksize(inode))' works properly with
clang but not gcc. If i_blocksize() is changed to do '1U << inode->i_blkbits'
instead of '1 << inode->i_blkbits', it works with gcc too.
- Eric
On Fri, Mar 10, 2023 at 02:07:34PM +0800, Yangtao Li wrote:
> Just for better readability, no code logic change.
All aside from the arguments over performance, I'm not at *all*
convinced by the "it's more readable" argument.
So yeah, let's not. We have i_blkbits for a reason, and it's because
shifting right is just simpler and easier.
BTW, doing a 64-bit division on a 32-bit platforms causes compile
failures, which was the cause of the test bot complaint:
ld: fs/ext4/inode.o: in function `mpage_process_page_bufs':
>> inode.c:(.text+0xbda): undefined reference to `__divdi3'
On 32-bit platforms --- i386 in particular --- the 64-bit division
results in an out-of-line call to a helper function that is not
supplied in the kernel compilation environment, so not only is it
slower, it Just Doesn't Work.
- Ted
...
> On 32-bit platforms --- i386 in particular --- the 64-bit division
> results in an out-of-line call to a helper function that is not
> supplied in the kernel compilation environment, so not only is it
> slower, it Just Doesn't Work.
Even on some 64-bit systems a 64bit divide can be significantly
slower than a 32-bit divide - even with the same arguments.
IIRC Intel x86-64 a 64-bit divide (128-bit dividend) is about
twice the clocks of a 32-bit one. On AMD cpu they are much the same.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)