> For case (1), I think a cleaner solution would be to move the
> zswap_invalidate() call from swap_range_free() (which is called after
> the cached slots are freed) to __swap_entry_free_locked() if the usage
> goes to 0. I actually think conceptually this makes not just for
> zswap_invalidate(), but also for the arch call, memcg uncharging, etc.
> Slots caching is a swapfile optimization that should be internal to
> swapfile code. Once a swap entry is freed (i.e. swap count is 0 AND
> not in the swap cache), all the hooks should be called (memcg, zswap,
> arch, ..) as the swap entry is effectively freed. The fact that
> swapfile code internally batches and caches slots should be
> transparent to other parts of MM. I am not sure if the calls can just
> be moved or if there are underlying assumptions in the implementation
> that would be broken, but it feels like the right thing to do.
Good idea, This is indeed a clear solution. I'll try it in another
patch later.
>
> For case (2), I don't think zswap can just decide to free the entry.
>
> In that case, the page is in the swap cache pointing to a swp_entry
> which has a corresponding zswap entry, and the page is clean because
> it is already in swap/zswap, so we don't need to write it out again
> unless it is redirtied. If zswap just drops the entry, and reclaim
> tries to reclaim the page in the swap cache, it will drop the page
> assuming that there is a copy in swap/zswap (because it is clean). Now
> we lost all copies of the page.
>
> Am I missing something?
>
Ah, my bad. Missed the step of marking the page as dirty.
Please have a look, just like zswap_exclusive_loads_enabled,
set page dity so that it can be pageout again.
if (!page_was_allocated) {
if (!count) {
set_page_dirty(page);
ret = 0;
} else
ret = -EEXIST;
put_page(page);
}
Thanks for your feedback, Yosry.
On Wed, Nov 15, 2023 at 4:53 AM 贺中坤 <[email protected]> wrote:
>
> > For case (1), I think a cleaner solution would be to move the
> > zswap_invalidate() call from swap_range_free() (which is called after
> > the cached slots are freed) to __swap_entry_free_locked() if the usage
> > goes to 0. I actually think conceptually this makes not just for
> > zswap_invalidate(), but also for the arch call, memcg uncharging, etc.
> > Slots caching is a swapfile optimization that should be internal to
> > swapfile code. Once a swap entry is freed (i.e. swap count is 0 AND
> > not in the swap cache), all the hooks should be called (memcg, zswap,
> > arch, ..) as the swap entry is effectively freed. The fact that
> > swapfile code internally batches and caches slots should be
> > transparent to other parts of MM. I am not sure if the calls can just
> > be moved or if there are underlying assumptions in the implementation
> > that would be broken, but it feels like the right thing to do.
>
> Good idea, This is indeed a clear solution. I'll try it in another
> patch later.
>
> >
> > For case (2), I don't think zswap can just decide to free the entry.
> >
> > In that case, the page is in the swap cache pointing to a swp_entry
> > which has a corresponding zswap entry, and the page is clean because
> > it is already in swap/zswap, so we don't need to write it out again
> > unless it is redirtied. If zswap just drops the entry, and reclaim
> > tries to reclaim the page in the swap cache, it will drop the page
> > assuming that there is a copy in swap/zswap (because it is clean). Now
> > we lost all copies of the page.
> >
> > Am I missing something?
> >
>
> Ah, my bad. Missed the step of marking the page as dirty.
> Please have a look, just like zswap_exclusive_loads_enabled,
> set page dity so that it can be pageout again.
> if (!page_was_allocated) {
> if (!count) {
> set_page_dirty(page);
> ret = 0;
> } else
> ret = -EEXIST;
> put_page(page);
> }
I think we may need to try to lock the folio. Otherwise we may race
with reclaim reading the dirty bit before we set it.
Taking a step back, this seems like we are going behind exclusive
loads. We "should" keep the page in zswap as exclusive loads are
disabled and the page is not yet invalidated from zswap (the swap
entry is still in use). What you are trying to do here is sneakily
drop the page from zswap as if we wrote it back, but we didn't. We
just know that it was already loaded from zswap. We are essentially
making the previous load exclusive retroactively.
Is there a reason why exclusive loads cannot be enabled to achieve the
same result in the (arguably) correct way?
> Thanks for your feedback, Yosry.
On Thu, Nov 16, 2023 at 4:13 AM Yosry Ahmed <[email protected]> wrote:
>
> I think we may need to try to lock the folio. Otherwise we may race
> with reclaim reading the dirty bit before we set it.
>
> Taking a step back, this seems like we are going behind exclusive
> loads. We "should" keep the page in zswap as exclusive loads are
> disabled and the page is not yet invalidated from zswap (the swap
> entry is still in use). What you are trying to do here is sneakily
> drop the page from zswap as if we wrote it back, but we didn't.
If we want to reclaim the cached zswap_entry, Writing back might
be the easy way.
> We just know that it was already loaded from zswap. We are essentially
> making the previous load exclusive retroactively.
>
> Is there a reason why exclusive loads cannot be enabled to achieve the
> same result in the (arguably) correct way?
>
zswap_exclusive_loads is not enabled by default, so the shrink_worker
may fail if there are many cached zswap_entries on the zswap_pool->lru.
Is it possible to make zswap_exclusive_loads the only way in zswap_load?
It only makes sense when the page is read and no longer dirty.
If the page is read frequently, it should stay in cache rather than zswap.
The benefit of doing this is very small, two copies of the same page
in memory.
Thanks.
On Wed, Nov 15, 2023 at 7:33 PM 贺中坤 <[email protected]> wrote:
>
> On Thu, Nov 16, 2023 at 4:13 AM Yosry Ahmed <[email protected]> wrote:
> >
> > I think we may need to try to lock the folio. Otherwise we may race
> > with reclaim reading the dirty bit before we set it.
> >
> > Taking a step back, this seems like we are going behind exclusive
> > loads. We "should" keep the page in zswap as exclusive loads are
> > disabled and the page is not yet invalidated from zswap (the swap
> > entry is still in use). What you are trying to do here is sneakily
> > drop the page from zswap as if we wrote it back, but we didn't.
>
> If we want to reclaim the cached zswap_entry, Writing back might
> be the easy way.
>
> > We just know that it was already loaded from zswap. We are essentially
> > making the previous load exclusive retroactively.
> >
> > Is there a reason why exclusive loads cannot be enabled to achieve the
> > same result in the (arguably) correct way?
> >
>
> zswap_exclusive_loads is not enabled by default, so the shrink_worker
> may fail if there are many cached zswap_entries on the zswap_pool->lru.
It can be enabled at runtime, or enabled by default by using
CONFIG_ZSWAP_EXCLUSIVE_LOADS_DEFAULT_ON.
>
>
> Is it possible to make zswap_exclusive_loads the only way in zswap_load?
> It only makes sense when the page is read and no longer dirty.
> If the page is read frequently, it should stay in cache rather than zswap.
> The benefit of doing this is very small, two copies of the same page
> in memory.
The reason I added it behind runtime and config knobs is to preserve
the existing behavior in case someone depends on it. At Google, we
have been using exclusive loads for a long time. If other users of
zswap agree to make this the default behavior or make it the only way
to do zswap loads I don't have a problem with it.
>
>
> Thanks.
On Thu, Nov 16, 2023 at 12:10 PM Yosry Ahmed <[email protected]> wrote:
>
> It can be enabled at runtime, or enabled by default by using
> CONFIG_ZSWAP_EXCLUSIVE_LOADS_DEFAULT_ON.
>
Yes, I see it in the doc. Thanks.
>
>
> The reason I added it behind runtime and config knobs is to preserve
> the existing behavior in case someone depends on it. At Google, we
> have been using exclusive loads for a long time. If other users of
> zswap agree to make this the default behavior or make it the only way
> to do zswap loads I don't have a problem with it.
>
Got it. Thanks for your feedback.