On Sun, Feb 11, 2024 at 5:58 AM Chengming Zhou
<[email protected]> wrote:
>
> All zswap entries take a reference on the zswap_pool in
> zswap_store() and drop it when they are freed. Switching this
> reference to a percpu_ref is better for scalability.
>
> Testing kernel build in tmpfs with memory.max=2GB
> (zswap shrinker and writeback enabled with one 50GB swapfile).
>
>         mm-unstable  zswap-global-lru
> real    63.20        63.12
> user    1061.75      1062.95
> sys     268.74       264.44

Are these numbers from a single run or the average of multiple runs?
It just seems that the improvement is small, and percpu refcnt is
slightly less intuitive (and uses a bit more memory), so let's make
sure there is a real performance gain first.

It would also be useful to mention how many threads/CPUs are being used here.
On 2024/2/13 06:42, Yosry Ahmed wrote:
> On Sun, Feb 11, 2024 at 5:58 AM Chengming Zhou
> <[email protected]> wrote:
>>
>> All zswap entries take a reference on the zswap_pool in
>> zswap_store() and drop it when they are freed. Switching this
>> reference to a percpu_ref is better for scalability.
>>
>> Testing kernel build in tmpfs with memory.max=2GB
>> (zswap shrinker and writeback enabled with one 50GB swapfile).
>>
>>         mm-unstable  zswap-global-lru
>> real    63.20        63.12
>> user    1061.75      1062.95
>> sys     268.74       264.44
>
> Are these numbers from a single run or the average of multiple runs?

They are the average of 5 runs. I also compared the individual run results,
and the improvement is stable, so it should be a real performance gain.

> It just seems that the improvement is small, and percpu refcnt is
> slightly less intuitive (and uses a bit more memory), so let's make
> sure there is a real performance gain first.

Right, percpu_ref uses a bit more memory, which should be OK for our use case,
since we almost always have only one zswap_pool in use. The performance gain is
in the zswap_store/load hot path.

>
> It would also be useful to mention how many threads/CPUs are being used here.

My bad, the test uses 32 threads on a 128-CPU x86-64 machine.

Thanks.
On Tue, Feb 13, 2024 at 10:31:16PM +0800, Chengming Zhou wrote:
> On 2024/2/13 06:42, Yosry Ahmed wrote:
> > On Sun, Feb 11, 2024 at 5:58 AM Chengming Zhou
> > <[email protected]> wrote:
> >>
> >> All zswap entries take a reference on the zswap_pool in
> >> zswap_store() and drop it when they are freed. Switching this
> >> reference to a percpu_ref is better for scalability.
> >>
> >> Testing kernel build in tmpfs with memory.max=2GB
> >> (zswap shrinker and writeback enabled with one 50GB swapfile).
> >>
> >>         mm-unstable  zswap-global-lru
> >> real    63.20        63.12
> >> user    1061.75      1062.95
> >> sys     268.74       264.44
> >
> > Are these numbers from a single run or the average of multiple runs?
>
> They are the average of 5 runs. I also compared the individual run results,
> and the improvement is stable, so it should be a real performance gain.
>
> > It just seems that the improvement is small, and percpu refcnt is
> > slightly less intuitive (and uses a bit more memory), so let's make
> > sure there is a real performance gain first.
>
> Right, percpu_ref uses a bit more memory, which should be OK for our use case,
> since we almost always have only one zswap_pool in use. The performance gain is
> in the zswap_store/load hot path.
>
> >
> > It would also be useful to mention how many threads/CPUs are being used here.
>
> My bad, the test uses 32 threads on a 128-CPU x86-64 machine.

Thanks for the clarification. Please include such details in the commit
message.
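
For reference, the size of the improvement discussed in this thread can be
worked out from the quoted table. The snippet below is only a rough sketch:
it assumes the first column (mm-unstable) is the baseline and the second
column (zswap-global-lru) is the kernel being compared against it, which the
thread does not state explicitly. The variable and function names are
illustrative and not part of any patch.

```python
# Averages of 5 runs, in seconds, copied from the table quoted in the thread.
# Assumption: the first column is the baseline, the second is the comparison.
mm_unstable = {"real": 63.20, "user": 1061.75, "sys": 268.74}
zswap_global_lru = {"real": 63.12, "user": 1062.95, "sys": 264.44}


def relative_change(before, after):
    """Return the change from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100.0


# Negative values mean the second column took less time than the first.
changes = {
    name: relative_change(mm_unstable[name], zswap_global_lru[name])
    for name in mm_unstable
}

for name, pct in changes.items():
    print(f"{name}: {pct:+.2f}%")
```

Under that assumption, sys time changes by roughly 1.6% while real and user
time move by close to 0.1%, which is why the number of runs and the
thread/CPU counts matter for judging whether the gain is real.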