2023-06-30 09:42:26

by zenghongling

Subject: [PATCH] fs: Optimize unixbench's file copy test

The iomap_set_range_uptodate function checks if the file is a private
mapping, and if it is, it needs to do something about it. UnixBench's
file copy tests mostly use shared mappings, so such a check reduces
file copy scores. We added the unlikely macro as an optimization,
and the file copy score improves after this branch optimization,
as follows:

./Run -c 8 -i 3 fstime fsbuffer fsdisk

Before the optimization
System Benchmarks Partial Index              BASELINE       RESULT    INDEX
File Copy 1024 bufsize 2000 maxblocks          3960.0     689276.0   1740.6
File Copy 256 bufsize 500 maxblocks            1655.0     204133.0   1233.4
File Copy 4096 bufsize 8000 maxblocks          5800.0    1526945.0   2632.7
                                                                   ========
System Benchmarks Index Score (Partial Only)                         1781.3

After the optimization
System Benchmarks Partial Index              BASELINE       RESULT    INDEX
File Copy 1024 bufsize 2000 maxblocks          3960.0     741524.0   1872.5
File Copy 256 bufsize 500 maxblocks            1655.0     208334.0   1258.8
File Copy 4096 bufsize 8000 maxblocks          5800.0    1641660.0   2830.4
                                                                   ========
System Benchmarks Index Score (Partial Only)                         1882.6

Signed-off-by: zenghongling <[email protected]>
---
fs/iomap/buffered-io.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 53cd7b2..35a50c2 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -148,7 +148,7 @@ iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
 	if (PageError(page))
 		return;

-	if (page_has_private(page))
+	if (unlikely(page_has_private(page)))
 		iomap_iop_set_range_uptodate(page, off, len);
 	else
 		SetPageUptodate(page);
--
2.1.0



2023-06-30 15:38:24

by Darrick J. Wong

Subject: Re: [PATCH] fs: Optimize unixbench's file copy test

On Fri, Jun 30, 2023 at 05:28:23PM +0800, zenghongling wrote:
> The iomap_set_range_uptodate function checks if the file is a private
> mapping, and if it is, it needs to do something about it. UnixBench's
> file copy tests mostly use shared mappings, so such a check reduces
> file copy scores. We added the unlikely macro as an optimization,
> and the file copy score improves after this branch optimization,
> as follows:
>
> ./Run -c 8 -i 3 fstime fsbuffer fsdisk
>
> Before the optimization
> System Benchmarks Partial Index              BASELINE       RESULT    INDEX
> File Copy 1024 bufsize 2000 maxblocks          3960.0     689276.0   1740.6
> File Copy 256 bufsize 500 maxblocks            1655.0     204133.0   1233.4
> File Copy 4096 bufsize 8000 maxblocks          5800.0    1526945.0   2632.7
>                                                                    ========
> System Benchmarks Index Score (Partial Only)                         1781.3
>
> After the optimization
> System Benchmarks Partial Index              BASELINE       RESULT    INDEX
> File Copy 1024 bufsize 2000 maxblocks          3960.0     741524.0   1872.5
> File Copy 256 bufsize 500 maxblocks            1655.0     208334.0   1258.8
> File Copy 4096 bufsize 8000 maxblocks          5800.0    1641660.0   2830.4
>                                                                    ========
> System Benchmarks Index Score (Partial Only)                         1882.6

Kernel version? And how does this intersect with the ongoing work to
use large folios throughout iomap?

--D

> Signed-off-by: zenghongling <[email protected]>
> ---
> fs/iomap/buffered-io.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 53cd7b2..35a50c2 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -148,7 +148,7 @@ iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
>  	if (PageError(page))
>  		return;
>
> -	if (page_has_private(page))
> +	if (unlikely(page_has_private(page)))
>  		iomap_iop_set_range_uptodate(page, off, len);
>  	else
>  		SetPageUptodate(page);
> --
> 2.1.0
>

2023-06-30 22:00:45

by Matthew Wilcox

Subject: Re: [PATCH] fs: Optimize unixbench's file copy test

On Fri, Jun 30, 2023 at 05:28:23PM +0800, zenghongling wrote:
> The iomap_set_range_uptodate function checks if the file is a private
> mapping, and if it is, it needs to do something about it. UnixBench's
> file copy tests mostly use shared mappings, so such a check reduces
> file copy scores. We added the unlikely macro as an optimization,
> and the file copy score improves after this branch optimization.
>
> - if (page_has_private(page))
> + if (unlikely(page_has_private(page)))

This changelog shows a complete misunderstanding of the code you're
changing. page_has_private() has nothing to do with whether the file
is "a private mapping", whatever that means. The test is whether the
filesystem has added private data to the page.

As Darrick said, this code has been completely rewritten in a current
kernel. You should test with something recent.

2023-07-05 05:19:29

by Ritesh Harjani

Subject: Re: [PATCH] fs: Optimize unixbench's file copy test

zenghongling <[email protected]> writes:

> The iomap_set_range_uptodate function checks if the file is a private
> mapping, and if it is, it needs to do something about it. UnixBench's
> file copy tests mostly use shared mappings, so such a check reduces
> file copy scores. We added the unlikely macro as an optimization,
> and the file copy score improves after this branch optimization,
> as follows:
>
> ./Run -c 8 -i 3 fstime fsbuffer fsdisk
>
> Before the optimization
> System Benchmarks Partial Index              BASELINE       RESULT    INDEX
> File Copy 1024 bufsize 2000 maxblocks          3960.0     689276.0   1740.6
> File Copy 256 bufsize 500 maxblocks            1655.0     204133.0   1233.4
> File Copy 4096 bufsize 8000 maxblocks          5800.0    1526945.0   2632.7
>                                                                    ========
> System Benchmarks Index Score (Partial Only)                         1781.3
>
> After the optimization
> System Benchmarks Partial Index              BASELINE       RESULT    INDEX
> File Copy 1024 bufsize 2000 maxblocks          3960.0     741524.0   1872.5
> File Copy 256 bufsize 500 maxblocks            1655.0     208334.0   1258.8
> File Copy 4096 bufsize 8000 maxblocks          5800.0    1641660.0   2830.4
>                                                                    ========
> System Benchmarks Index Score (Partial Only)                         1882.6
>
> Signed-off-by: zenghongling <[email protected]>
> ---
> fs/iomap/buffered-io.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 53cd7b2..35a50c2 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -148,7 +148,7 @@ iomap_set_range_uptodate(struct page *page, unsigned off, unsigned len)
>  	if (PageError(page))
>  		return;
>
> -	if (page_has_private(page))
> +	if (unlikely(page_has_private(page)))

IIUC, likely and unlikely are macros which provide hints to the compiler
to lay out the emitted code so that branch prediction favours the likely
case. Prefetchers can then prefetch instructions into the pipeline for the
likely case. But if the branch prediction goes wrong, it could cause a
performance problem too.

Now, with large folio support in XFS and for platforms with bs < ps
(block size smaller than page size), this patch is incorrect, as
page->private is always set for bs < ps and it can sometimes be set for
a large folio (in the latest upstream code).

I don't think this is the right fix anyway. However, I am curious to know
why you made this change. With advanced microarchitecture improvements,
I believe the cores already keep a branch target history (BTH) to know
which branch to take. So was this showing up in your perf profiles as
branch misses or something?

-ritesh

>  		iomap_iop_set_range_uptodate(page, off, len);
>  	else
>  		SetPageUptodate(page);
> --
> 2.1.0