I have a Sipeed Lichee RV dock board which has only 512MB of DDR, so
memory optimizations such as swap on zram are helpful. As seen in
commit d0637c505f8a ("arm64: enable THP_SWAP for arm64") and
commit bd4c82c22c367e ("mm, THP, swap: delay splitting THP after
swapped out"), THP_SWAP can improve swap throughput significantly.

Enable THP_SWAP for RV64. Running the micro-benchmark introduced by
commit d0637c505f8a ("arm64: enable THP_SWAP for arm64") on the
Lichee RV dock board gives the following numbers:

thp swp throughput w/o patch: 66908 bytes/ms (mean of 10 tests)
thp swp throughput w/ patch: 322638 bytes/ms (mean of 10 tests)

Improved by 382%!

Signed-off-by: Jisheng Zhang <[email protected]>
---
 arch/riscv/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index ed66c31e4655..19088c750c7f 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -45,6 +45,7 @@ config RISCV
 	select ARCH_WANT_FRAME_POINTERS
 	select ARCH_WANT_GENERAL_HUGETLB
 	select ARCH_WANT_HUGE_PMD_SHARE if 64BIT
+	select ARCH_WANTS_THP_SWAP if TRANSPARENT_HUGEPAGE
 	select BINFMT_FLAT_NO_DATA_START_OFFSET if !MMU
 	select BUILDTIME_TABLE_SORT if MMU
 	select CLONE_BACKWARDS
--
2.34.1
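
For anyone checking the 382% figure: it can be reproduced from the two
throughput means quoted in the commit message. The snippet below is only
a minimal sketch of that arithmetic; the variable and function names are
illustrative assumptions and are not part of the patch or of the
micro-benchmark itself.

```python
# Mean THP swap-out throughput reported in the commit message, in bytes/ms.
without_patch = 66908
with_patch = 322638


def relative_improvement_percent(before, after):
    """Return the percentage increase of `after` over `before`."""
    return (after - before) / before * 100.0


# Ratio of the two means (how many times faster) and the relative gain.
speedup = with_patch / without_patch
improvement = relative_improvement_percent(without_patch, with_patch)

print(f"speedup: {speedup:.2f}x, improvement: {improvement:.0f}%")
# prints "speedup: 4.82x, improvement: 382%"
```

So "improved by 382%" means the patched throughput is roughly 4.8 times
the unpatched one, not 3.8 times.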
On Mon, Aug 22, 2022 at 01:05:59AM +0800, Jisheng Zhang wrote:
That looks like a good idea to me.
Reviewed-by: Andrew Jones <[email protected]>
On Sun, 21 Aug 2022 10:05:59 PDT (-0700), [email protected] wrote:
Thanks, this is on for-next.
On Wed, Oct 05, 2022 at 07:35:53PM -0700, Palmer Dabbelt wrote:
> Thanks, this is on for-next.
FYI, this is v1 of a patchset that went to v3.
v3 only changed the commit message, but v2 had a functional change.
v3 is here:
https://lore.kernel.org/all/[email protected]/
Thanks,
Conor.
On Wed, 05 Oct 2022 23:53:03 PDT (-0700), [email protected] wrote:
> On Wed, Oct 05, 2022 at 07:35:53PM -0700, Palmer Dabbelt wrote:
>> Thanks, this is on for-next.
>
> FYI, this is v1 of a patchset that went to v3.
> v3 only changed the commit message, but v2 had a functional change.
>
> v3 is here:
> https://lore.kernel.org/all/[email protected]/
Thanks, not sure why I missed those. I've put the v3 on for-next.