2024-02-06 20:25:44

by Barry Song

[permalink] [raw]
Subject: [PATCH v2] zram: easy the allocation of zcomp_strm's buffers through vmalloc

From: Barry Song <[email protected]>

Firstly, there is no need to keep zcomp_strm's buffers contiguous
physically.

Secondly, The recent mTHP project has provided the possibility to
swapout and swapin large folios. Compressing/decompressing large
blocks can hugely decrease CPU consumption and improve compression
ratio. This requires us to make zRAM support the compression and
decompression for large objects.
With the support of large objects in zRAM of our out-of-tree code,
we have observed many allocation failures during CPU hotplug as
large objects need larger buffers. So this change is also more
future-proof once we begin to bring up multiple sizes in zRAM.

Signed-off-by: Barry Song <[email protected]>
---
drivers/block/zram/zcomp.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index 55af4efd7983..8237b08c49d8 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -11,6 +11,7 @@
#include <linux/sched.h>
#include <linux/cpu.h>
#include <linux/crypto.h>
+#include <linux/vmalloc.h>

#include "zcomp.h"

@@ -37,7 +38,7 @@ static void zcomp_strm_free(struct zcomp_strm *zstrm)
{
if (!IS_ERR_OR_NULL(zstrm->tfm))
crypto_free_comp(zstrm->tfm);
- free_pages((unsigned long)zstrm->buffer, 1);
+ vfree(zstrm->buffer);
zstrm->tfm = NULL;
zstrm->buffer = NULL;
}
@@ -53,7 +54,7 @@ static int zcomp_strm_init(struct zcomp_strm *zstrm, struct zcomp *comp)
* allocate 2 pages. 1 for compressed data, plus 1 extra for the
* case when compressed size is larger than the original one
*/
- zstrm->buffer = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
+ zstrm->buffer = vzalloc(2 * PAGE_SIZE);
if (IS_ERR_OR_NULL(zstrm->tfm) || !zstrm->buffer) {
zcomp_strm_free(zstrm);
return -ENOMEM;
--
2.34.1



2024-02-07 02:00:28

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH v2] zram: easy the allocation of zcomp_strm's buffers through vmalloc

On (24/02/07 09:25), Barry Song wrote:
> From: Barry Song <[email protected]>
>
> Firstly, there is no need to keep zcomp_strm's buffers contiguous
> physically.
>
> Secondly, The recent mTHP project has provided the possibility to
> swapout and swapin large folios. Compressing/decompressing large
> blocks can hugely decrease CPU consumption and improve compression
> ratio. This requires us to make zRAM support the compression and
> decompression for large objects.
> With the support of large objects in zRAM of our out-of-tree code,
> we have observed many allocation failures during CPU hotplug as
> large objects need larger buffers. So this change is also more
> future-proof once we begin to bring up multiple sizes in zRAM.
>
> Signed-off-by: Barry Song <[email protected]>

Reviewed-by: Sergey Senozhatsky <[email protected]>

Note:
Taking it in NOT because of the out-of-tree code (we don't really
do that), but because this is executed from CPU offline/online
paths, which can happen on devices with fragmented memory (a valid
concern IMHO).

Minchan, if you have any objections, please chime in.

> @@ -37,7 +38,7 @@ static void zcomp_strm_free(struct zcomp_strm *zstrm)
> {
> if (!IS_ERR_OR_NULL(zstrm->tfm))
> crypto_free_comp(zstrm->tfm);
> - free_pages((unsigned long)zstrm->buffer, 1);
> + vfree(zstrm->buffer);
> zstrm->tfm = NULL;
> zstrm->buffer = NULL;
> }
> @@ -53,7 +54,7 @@ static int zcomp_strm_init(struct zcomp_strm *zstrm, struct zcomp *comp)
> * allocate 2 pages. 1 for compressed data, plus 1 extra for the
> * case when compressed size is larger than the original one
> */
> - zstrm->buffer = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
> + zstrm->buffer = vzalloc(2 * PAGE_SIZE);
> if (IS_ERR_OR_NULL(zstrm->tfm) || !zstrm->buffer) {
> zcomp_strm_free(zstrm);
> return -ENOMEM;
> --
> 2.34.1
>

2024-02-07 02:40:30

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH v2] zram: easy the allocation of zcomp_strm's buffers through vmalloc

On 2/6/24 6:44 PM, Sergey Senozhatsky wrote:
> On (24/02/07 09:25), Barry Song wrote:
>> From: Barry Song <[email protected]>
>>
>> Firstly, there is no need to keep zcomp_strm's buffers contiguous
>> physically.
>>
>> Secondly, The recent mTHP project has provided the possibility to
>> swapout and swapin large folios. Compressing/decompressing large
>> blocks can hugely decrease CPU consumption and improve compression
>> ratio. This requires us to make zRAM support the compression and
>> decompression for large objects.
>> With the support of large objects in zRAM of our out-of-tree code,
>> we have observed many allocation failures during CPU hotplug as
>> large objects need larger buffers. So this change is also more
>> future-proof once we begin to bring up multiple sizes in zRAM.
>>
>> Signed-off-by: Barry Song <[email protected]>
>
> Reviewed-by: Sergey Senozhatsky <[email protected]>
>
> Note:
> Taking it in NOT because of the out-of-tree code (we don't really
> do that), but because this is executed from CPU offline/online
> paths, which can happen on devices with fragmented memory (a valid
> concern IMHO).
>
> Minchan, if you have any objections, please chime in.

Not Minchan, but I do have an issue with the title of the commit, it
doesn't make any sense. Can the maintainer please re-write that to be
something that is appropriate and actually describes what the patch
does?

--
Jens Axboe


2024-02-07 03:17:30

by Jens Axboe

[permalink] [raw]
Subject: Re: [PATCH v2] zram: easy the allocation of zcomp_strm's buffers through vmalloc

On 2/6/24 8:14 PM, Sergey Senozhatsky wrote:
> On (24/02/06 19:40), Jens Axboe wrote:
>> On 2/6/24 6:44 PM, Sergey Senozhatsky wrote:
>>> On (24/02/07 09:25), Barry Song wrote:
>>>> From: Barry Song <[email protected]>
>>>>
>>>> Firstly, there is no need to keep zcomp_strm's buffers contiguous
>>>> physically.
>>>>
>>>> Secondly, The recent mTHP project has provided the possibility to
>>>> swapout and swapin large folios. Compressing/decompressing large
>>>> blocks can hugely decrease CPU consumption and improve compression
>>>> ratio. This requires us to make zRAM support the compression and
>>>> decompression for large objects.
>>>> With the support of large objects in zRAM of our out-of-tree code,
>>>> we have observed many allocation failures during CPU hotplug as
>>>> large objects need larger buffers. So this change is also more
>>>> future-proof once we begin to bring up multiple sizes in zRAM.
>>>>
>>>> Signed-off-by: Barry Song <[email protected]>
>>>
>>> Reviewed-by: Sergey Senozhatsky <[email protected]>
>>>
>>> Note:
>>> Taking it in NOT because of the out-of-tree code (we don't really
>>> do that), but because this is executed from CPU offline/online
>>> paths, which can happen on devices with fragmented memory (a valid
>>> concern IMHO).
>>>
>>> Minchan, if you have any objections, please chime in.
>>
>> Not Minchan, but I do have an issue with the title of the commit, it
>> doesn't make any sense. Can the maintainer please re-write that to be
>> something that is appropriate and actually describes what the patch
>> does?
>
> Thanks Jens. I fully agree, I requested a better commit message in
> v1 feedback, we probably still can improve on this.
>
>
> Something like this?
>
> ---
>
> zram: do not allocate physically contiguous strm buffers
>
> Currently zram allocates 2 physically contigous pages per-CPU's
> compression stream (we may have up to 3 streams per-CPU). Since
> those buffers are per-CPU we allocate them from CPU hotplug path,
> which may have higher risks of failed allocations on devices with
> fragmented memory.
>
> Switch to virtually contiguos allocations - crypto comp does not
> seem impose requirements on compression working buffers to be
> physically contiguous.

Yep, this is much better! Thanks.

--
Jens Axboe


2024-02-07 03:22:05

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH v2] zram: easy the allocation of zcomp_strm's buffers through vmalloc

On (24/02/06 20:17), Jens Axboe wrote:
[..]
> >>> Minchan, if you have any objections, please chime in.
> >>
> >> Not Minchan, but I do have an issue with the title of the commit, it
> >> doesn't make any sense. Can the maintainer please re-write that to be
> >> something that is appropriate and actually describes what the patch
> >> does?
> >
> > Thanks Jens. I fully agree, I requested a better commit message in
> > v1 feedback, we probably still can improve on this.
> >
> >
> > Something like this?
> >
> > ---
> >
> > zram: do not allocate physically contiguous strm buffers
> >
> > Currently zram allocates 2 physically contigous pages per-CPU's
> > compression stream (we may have up to 3 streams per-CPU). Since
> > those buffers are per-CPU we allocate them from CPU hotplug path,
> > which may have higher risks of failed allocations on devices with
> > fragmented memory.
> >
> > Switch to virtually contiguos allocations - crypto comp does not
> > seem impose requirements on compression working buffers to be
> > physically contiguous.
>
> Yep, this is much better! Thanks.

Thanks.

Barry, can you please send v3 with the suggested subject and commit
message?

2024-02-07 03:38:51

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH v2] zram: easy the allocation of zcomp_strm's buffers through vmalloc

On (24/02/06 19:40), Jens Axboe wrote:
> On 2/6/24 6:44 PM, Sergey Senozhatsky wrote:
> > On (24/02/07 09:25), Barry Song wrote:
> >> From: Barry Song <[email protected]>
> >>
> >> Firstly, there is no need to keep zcomp_strm's buffers contiguous
> >> physically.
> >>
> >> Secondly, The recent mTHP project has provided the possibility to
> >> swapout and swapin large folios. Compressing/decompressing large
> >> blocks can hugely decrease CPU consumption and improve compression
> >> ratio. This requires us to make zRAM support the compression and
> >> decompression for large objects.
> >> With the support of large objects in zRAM of our out-of-tree code,
> >> we have observed many allocation failures during CPU hotplug as
> >> large objects need larger buffers. So this change is also more
> >> future-proof once we begin to bring up multiple sizes in zRAM.
> >>
> >> Signed-off-by: Barry Song <[email protected]>
> >
> > Reviewed-by: Sergey Senozhatsky <[email protected]>
> >
> > Note:
> > Taking it in NOT because of the out-of-tree code (we don't really
> > do that), but because this is executed from CPU offline/online
> > paths, which can happen on devices with fragmented memory (a valid
> > concern IMHO).
> >
> > Minchan, if you have any objections, please chime in.
>
> Not Minchan, but I do have an issue with the title of the commit, it
> doesn't make any sense. Can the maintainer please re-write that to be
> something that is appropriate and actually describes what the patch
> does?

Thanks Jens. I fully agree, I requested a better commit message in
v1 feedback, we probably still can improve on this.


Something like this?

---

zram: do not allocate physically contiguous strm buffers

Currently zram allocates 2 physically contigous pages per-CPU's
compression stream (we may have up to 3 streams per-CPU). Since
those buffers are per-CPU we allocate them from CPU hotplug path,
which may have higher risks of failed allocations on devices with
fragmented memory.

Switch to virtually contiguos allocations - crypto comp does not
seem impose requirements on compression working buffers to be
physically contiguous.

2024-02-07 04:38:10

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [PATCH v2] zram: easy the allocation of zcomp_strm's buffers through vmalloc

On (24/02/07 12:14), Sergey Senozhatsky wrote:
> ---
>
> zram: do not allocate physically contiguous strm buffers
>
> Currently zram allocates 2 physically contigous pages per-CPU's
> compression stream (we may have up to 3 streams per-CPU). Since

Correction:
^ up to 4

> those buffers are per-CPU we allocate them from CPU hotplug path,
> which may have higher risks of failed allocations on devices with
> fragmented memory.
>
> Switch to virtually contiguos allocations - crypto comp does not
> seem impose requirements on compression working buffers to be
> physically contiguous.

2024-02-07 05:10:36

by Barry Song

[permalink] [raw]
Subject: Re: [PATCH v2] zram: easy the allocation of zcomp_strm's buffers through vmalloc

On Wed, Feb 7, 2024 at 4:21 PM Sergey Senozhatsky
<[email protected]> wrote:
>
> On (24/02/06 20:17), Jens Axboe wrote:
> [..]
> > >>> Minchan, if you have any objections, please chime in.
> > >>
> > >> Not Minchan, but I do have an issue with the title of the commit, it
> > >> doesn't make any sense. Can the maintainer please re-write that to be
> > >> something that is appropriate and actually describes what the patch
> > >> does?
> > >
> > > Thanks Jens. I fully agree, I requested a better commit message in
> > > v1 feedback, we probably still can improve on this.
> > >
> > >
> > > Something like this?
> > >
> > > ---
> > >
> > > zram: do not allocate physically contiguous strm buffers
> > >
> > > Currently zram allocates 2 physically contigous pages per-CPU's
> > > compression stream (we may have up to 3 streams per-CPU). Since
> > > those buffers are per-CPU we allocate them from CPU hotplug path,
> > > which may have higher risks of failed allocations on devices with
> > > fragmented memory.
> > >
> > > Switch to virtually contiguos allocations - crypto comp does not
> > > seem impose requirements on compression working buffers to be
> > > physically contiguous.
> >
> > Yep, this is much better! Thanks.
>
> Thanks.
>

Hi Sergey, Jens,

> Barry, can you please send v3 with the suggested subject and commit
> message?

Thanks for your comments and improvements. will send v3 accordingly.

Best regard
Barry