On 15.06.23 13:15, 贺中坤 wrote:
> On Thu, Jun 15, 2023 at 5:27 PM David Hildenbrand <[email protected]> wrote:
>>
>> Interesting. When looking at possible ways to achieve that in a clean
>> way, my understanding was that the page might not always be accounted to
>> a memcg (e.g., simply some buffer?). Besides the swap->zram case I was
>> thinking about other fs->zram case, or just a straight read/write to the
>> zram device.
>>
>> The easiest way to see where that goes wrong I think is
>> zram_bvec_write_partial(), where we simply allocate a temporary page.
>>
>> Either I am missing something important, or this only works in some
>> special cases.
>>
>> I came to the conclusion that the only reasonable way is to assign a
>> complete zram device to a single cgroup and have all memory accounted to
>> that cgroup. Does also not solve all use cases (especially the
>> swap->zram case that might be better offjust using zswap?) but at least
>> the memory gets accounted in a more predictable way.
>>
>> Happy to learn more.
>>
>> --
>> Cheers,
>>
>> David / dhildenb
>>
>
> Hi david, thanks for your reply. This should be my mistake, I didn't
> describe the
> problem clearly.
>
> The reason for the problem is not the temporary page allocated at the beginning
> of zram_bvec_write_partial() and released at the end,but the compressed memory
> allocated by zs_malloc() in zram_write_page().
>
> So, it may not meet the need to assign a complete zram device to a
> single cgroup.
> we need to charge the compressed memory to the original page's memory cgroup,
> which is not charged so far.
>
Yes, but my point is that there are cases where the pages you get are
not charged. zram_bvec_write_partial() is just one such example that
highlights the issue.
--
Cheers,
David / dhildenb
On Thu, Jun 15, 2023 at 7:19 PM David Hildenbrand <[email protected]> wrote:
>
> Yes, but my point is that there are cases where the pages you get are
> not charged. zram_bvec_write_partial() is just one such example that
> highlights the issue.
Sorry ,I got it.
>
> --
> Cheers,
>
> David / dhildenb
>
On 15.06.23 14:19, 贺中坤 wrote:
> On Thu, Jun 15, 2023 at 7:19 PM David Hildenbrand <[email protected]> wrote:
>>
>> Yes, but my point is that there are cases where the pages you get are
>> not charged. zram_bvec_write_partial() is just one such example that
>> highlights the issue.
>
> Sorry ,I got it.
I suspect for the swap->zram we should always get charged pages, because
we're effectively writing out charged anon/shmem pages only -- without
any buffer in between.
For the fs->zram or direct zram access device case I'm not so sure. It
highly depends on what gets mapped into the bio (e.g., a kernel buffer,
zeropage, ...). If it's a pagecache page, that should be charged and
we're good. No so sure about fs metadata or some other fs cases (e.g.,
write() to a file that bypass the pagecache).
--
Cheers,
David / dhildenb
> I suspect for the swap->zram we should always get charged pages, because
> we're effectively writing out charged anon/shmem pages only -- without
> any buffer in between.
Hi David,the charged memory will be released in swap->zram. New pages
are allocated by alloc_zspage(), and we did not charge the page directly,but
the objects(like slab), because the zspage are shared by any memcg.
>
> For the fs->zram or direct zram access device case I'm not so sure. It
> highly depends on what gets mapped into the bio (e.g., a kernel buffer,
> zeropage, ...). If it's a pagecache page, that should be charged and
> we're good. No so sure about fs metadata or some other fs cases (e.g.,
> write() to a file that bypass the pagecache).
>
Yes, the pagecaches are charged in fs->zram, but will be released if
we drop the cache. the compressed objects are not charged.
> --
> Cheers,
>
> David / dhildenb
>
On 15.06.23 15:40, 贺中坤 wrote:
>> I suspect for the swap->zram we should always get charged pages, because
>> we're effectively writing out charged anon/shmem pages only -- without
>> any buffer in between.
>
> Hi David,the charged memory will be released in swap->zram. New pages
> are allocated by alloc_zspage(), and we did not charge the page directly,but
> the objects(like slab), because the zspage are shared by any memcg.
>
>>
>> For the fs->zram or direct zram access device case I'm not so sure. It
>> highly depends on what gets mapped into the bio (e.g., a kernel buffer,
>> zeropage, ...). If it's a pagecache page, that should be charged and
>> we're good. No so sure about fs metadata or some other fs cases (e.g.,
>> write() to a file that bypass the pagecache).
>>
>
> Yes, the pagecaches are charged in fs->zram, but will be released if
> we drop the cache. the compressed objects are not charged.
Yes. But just to stress again, one issue I see is that if there is a
page in the BIO that is not charged, you cannot charge the compressed page.
Assume you have some FS on that zram block device and you want to make
sure it gets properly charged to whoever is reading/writing a file on
that filesystem. (imagine something like a compress shmem)
If a user (or the filesystem?) can trigger a BIO that has an uncharged
page in it, it would not get charged accordingly.
The "easy" reproducer would have been O_DIRECT write() using the shared
zeropage, but zram_write_page() is smart enough to optimize for that
case (page_same_filled()). :)
Maybe I'm over-thinking this (well, the we do have partial I/O support,
so something seems to be able to trigger such cases), and it would be
great if someone with more FS->BIO experience could comment.
I'll note that this is fundamentally different to zswap, because with
zswap you don't get arbitrary BIOs, you get an anon or shmem page (that
should be charged).
--
Cheers,
David / dhildenb
>
> Yes. But just to stress again, one issue I see is that if there is a
> page in the BIO that is not charged, you cannot charge the compressed page.
I got it. Thanks.
>
> Assume you have some FS on that zram block device and you want to make
> sure it gets properly charged to whoever is reading/writing a file on
> that filesystem. (imagine something like a compress shmem)
>
> If a user (or the filesystem?) can trigger a BIO that has an uncharged
> page in it, it would not get charged accordingly.
>
> The "easy" reproducer would have been O_DIRECT write() using the shared
> zeropage, but zram_write_page() is smart enough to optimize for that
> case (page_same_filled()). :)
Ok, I will try it.
>
> Maybe I'm over-thinking this (well, the we do have partial I/O support,
> so something seems to be able to trigger such cases), and it would be
> great if someone with more FS->BIO experience could comment.
>
> I'll note that this is fundamentally different to zswap, because with
> zswap you don't get arbitrary BIOs, you get an anon or shmem page (that
> should be charged).
>
Hi David, I know your concern and I will try to find the uncharged case.
> --
> Cheers,
>
> David / dhildenb
>