2024-06-08 02:43:51

by Yosry Ahmed

Subject: [PATCH v2] mm: zswap: handle incorrect attempts to load large folios

Zswap does not support storing or loading large folios. Until proper
support is added, attempts to load large folios from zswap are a bug.

For example, if a swapin fault observes that contiguous PTEs are
pointing to contiguous swap entries and tries to swap them in as a large
folio, swap_read_folio() will pass in a large folio to zswap_load(), but
zswap_load() will only effectively load the first page in the folio. If
the first page is not in zswap, the folio will be read from disk, even
though other pages may be in zswap.

Either way, this will lead to silent data corruption. Proper support
needs to be added before large folio swapins and zswap can work
together.

Looking at the callers of swap_read_folio(), the folios passed in are
allocated either by __read_swap_cache_async() or by do_swap_page() in
the SWP_SYNCHRONOUS_IO path. Both allocate order-0 folios, so
everything is fine for now.

However, there is ongoing work to add support for large folio swapins
[1]. To make sure new development does not break zswap (or get broken by
zswap), add minimal handling of incorrect loads of large folios to
zswap.

First, move the call to folio_mark_uptodate() into zswap_load().

If a large folio load is attempted, and any page in that folio is in
zswap, return 'true' without calling folio_mark_uptodate(). This will
prevent the folio from being read from disk, and will emit an IO error
because the folio is not uptodate (e.g. do_swap_page() will return
VM_FAULT_SIGBUS). This may not be a reliable recovery path in all cases,
but it is better than nothing.
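
The recovery behavior described above can be modeled outside the kernel
as a small sketch (plain user-space C; the names and the `presence[]`
bitmap are illustrative stand-ins for the zswap xarray, not kernel API):

```c
#include <stdbool.h>
#include <stddef.h>

/* presence[i] plays the role of "swap offset i has a zswap entry". */
#define NSLOTS 64
static bool presence[NSLOTS];
static bool folio_uptodate;   /* stands in for folio_mark_uptodate() */

/*
 * Large-folio load: if any subpage is in "zswap", claim the load handled
 * the folio (true) but leave it !uptodate, so the caller sees an IO error
 * instead of silently corrupted data; if no subpage is present, return
 * false so the folio is read from disk.
 */
static bool load_large_folio(size_t offset, size_t nr_pages)
{
    folio_uptodate = false;
    for (size_t i = offset; i < offset + nr_pages && i < NSLOTS; i++)
        if (presence[i])
            return true;   /* partial hit: refuse, surface an error */
    return false;          /* full miss: safe to fall back to disk */
}
```

With no entries present, the caller falls back to disk; with even one
entry present, the load "succeeds" without uptodate, which surfaces as an
IO error rather than silent corruption.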

This was tested by hacking the allocation in __read_swap_cache_async()
to use order 2 and __GFP_COMP.

In the future, to handle this correctly, the swapin code should:
(a) Fall back to order-0 swapins if zswap was ever used on the machine,
because compressed pages remain in zswap after it is disabled.
(b) Add proper support for swapping in large folios from zswap (fully or
partially).

Probably start with (a), then follow up with (b).

[1] https://lore.kernel.org/linux-mm/[email protected]/

Signed-off-by: Yosry Ahmed <[email protected]>
---

v1: https://lore.kernel.org/lkml/[email protected]/

v1 -> v2:
- Instead of using VM_BUG_ON() use WARN_ON_ONCE() and add some recovery
handling (David Hildenbrand).

---
mm/page_io.c | 1 -
mm/zswap.c | 22 +++++++++++++++++++++-
2 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/mm/page_io.c b/mm/page_io.c
index f1a9cfab6e748..8f441dd8e109f 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -517,7 +517,6 @@ void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
delayacct_swapin_start();

if (zswap_load(folio)) {
- folio_mark_uptodate(folio);
folio_unlock(folio);
} else if (data_race(sis->flags & SWP_FS_OPS)) {
swap_read_folio_fs(folio, plug);
diff --git a/mm/zswap.c b/mm/zswap.c
index b9b35ef86d9be..ebb878d3e7865 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1557,6 +1557,26 @@ bool zswap_load(struct folio *folio)

VM_WARN_ON_ONCE(!folio_test_locked(folio));

+ /*
+ * Large folios should not be swapped in while zswap is being used, as
+ * they are not properly handled. Zswap does not properly load large
+ * folios, and a large folio may only be partially in zswap.
+ *
+ * If any of the subpages are in zswap, reading from disk would result
+ * in data corruption, so return true without marking the folio uptodate
+ * so that an IO error is emitted (e.g. do_swap_page() will sigfault).
+ *
+ * Otherwise, return false and read the folio from disk.
+ */
+ if (folio_test_large(folio)) {
+ if (xa_find(tree, &offset,
+ offset + folio_nr_pages(folio) - 1, XA_PRESENT)) {
+ WARN_ON_ONCE(1);
+ return true;
+ }
+ return false;
+ }
+
/*
* When reading into the swapcache, invalidate our entry. The
* swapcache can be the authoritative owner of the page and
@@ -1590,7 +1610,7 @@ bool zswap_load(struct folio *folio)
zswap_entry_free(entry);
folio_mark_dirty(folio);
}
-
+ folio_mark_uptodate(folio);
return true;
}

--
2.45.2.505.gda0bf45e8d-goog



2024-06-08 04:08:51

by Mika Penttilä

Subject: Re: [PATCH v2] mm: zswap: handle incorrect attempts to load large folios

On 6/8/24 05:36, Yosry Ahmed wrote:
> diff --git a/mm/zswap.c b/mm/zswap.c
> index b9b35ef86d9be..ebb878d3e7865 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -1557,6 +1557,26 @@ bool zswap_load(struct folio *folio)
>
> VM_WARN_ON_ONCE(!folio_test_locked(folio));
>
> + /*
> + * Large folios should not be swapped in while zswap is being used, as
> + * they are not properly handled. Zswap does not properly load large
> + * folios, and a large folio may only be partially in zswap.
> + *
> + * If any of the subpages are in zswap, reading from disk would result
> + * in data corruption, so return true without marking the folio uptodate
> + * so that an IO error is emitted (e.g. do_swap_page() will sigfault).
> + *
> + * Otherwise, return false and read the folio from disk.
> + */
> + if (folio_test_large(folio)) {
> + if (xa_find(tree, &offset,
> + offset + folio_nr_pages(folio) - 1, XA_PRESENT)) {
> + WARN_ON_ONCE(1);
> + return true;
> + }

How does that work? Should it be xa_find_after() to not always find
current entry?

And does it still mean those subsequent entries map to same folio?
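
For reference, xa_find() searches starting at *index inclusively, while
xa_find_after() starts strictly after it. A user-space model of the two
(illustrative only, not the xarray implementation; `slot[]` stands in for
the zswap tree):

```c
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS 64
static bool slot[NSLOTS];

/* first present slot i with first <= i <= max, i.e. xa_find()-like */
static long find_from(size_t first, size_t max)
{
    for (size_t i = first; i <= max && i < NSLOTS; i++)
        if (slot[i])
            return (long)i;
    return -1;
}

/* first present slot i with first < i <= max, i.e. xa_find_after()-like */
static long find_after(size_t first, size_t max)
{
    return find_from(first + 1, max);
}
```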


--Mika



2024-06-08 04:14:04

by Barry Song

Subject: Re: [PATCH v2] mm: zswap: handle incorrect attempts to load large folios

On Sat, Jun 8, 2024 at 10:37 AM Yosry Ahmed <[email protected]> wrote:
>
> [...]
> + if (folio_test_large(folio)) {
> + if (xa_find(tree, &offset,
> + offset + folio_nr_pages(folio) - 1, XA_PRESENT)) {
> + WARN_ON_ONCE(1);
> + return true;
> + }
> + return false;

IMHO, this appears to be over-designed. Personally, I would opt to
use

if (folio_test_large(folio))
return true;

Before we address large folio support in zswap, it’s essential
not to let them coexist. Expecting valid data by lunchtime is
not advisable.

> [...]

Thanks
Barry

2024-06-10 17:37:57

by Yosry Ahmed

Subject: Re: [PATCH v2] mm: zswap: handle incorrect attempts to load large folios

On Fri, Jun 7, 2024 at 9:08 PM Mika Penttilä <[email protected]> wrote:
>
> On 6/8/24 05:36, Yosry Ahmed wrote:
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index b9b35ef86d9be..ebb878d3e7865 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -1557,6 +1557,26 @@ bool zswap_load(struct folio *folio)
> >
> > VM_WARN_ON_ONCE(!folio_test_locked(folio));
> >
> > + /*
> > + * Large folios should not be swapped in while zswap is being used, as
> > + * they are not properly handled. Zswap does not properly load large
> > + * folios, and a large folio may only be partially in zswap.
> > + *
> > + * If any of the subpages are in zswap, reading from disk would result
> > + * in data corruption, so return true without marking the folio uptodate
> > + * so that an IO error is emitted (e.g. do_swap_page() will sigfault).
> > + *
> > + * Otherwise, return false and read the folio from disk.
> > + */
> > + if (folio_test_large(folio)) {
> > + if (xa_find(tree, &offset,
> > + offset + folio_nr_pages(folio) - 1, XA_PRESENT)) {
> > + WARN_ON_ONCE(1);
> > + return true;
> > + }
>
> How does that work? Should it be xa_find_after() to not always find
> current entry?

By "current entry" I believe you mean the entry corresponding to
"offset" (i.e. the first subpage of the folio). At this point, we
haven't checked whether that offset has a corresponding entry in zswap
or not. It may be on disk, or zswap may be disabled.

>
> And does it still mean those subsequent entries map to same folio?

If I understand correctly, a folio in the swapcache has contiguous
swap offsets for its subpages. So I am assuming that the large folio
swapin case will adhere to that (i.e. we only swapin a large folio if
the swap offsets are contiguous). Did I misunderstand something here?
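
The contiguity assumption above can be written down as a tiny model:
subpage i of a folio whose first swap entry has offset `base` lives at
offset `base + i`, so an order-N folio covers the inclusive range
[base, base + 2^N - 1], which is exactly the range the xa_find() call
scans. (Illustrative user-space C, not kernel code.)

```c
#include <stddef.h>

/* swap offset of subpage i of a folio whose first entry is at 'base' */
static size_t subpage_swap_offset(size_t base, size_t i)
{
    return base + i;
}

/* last swap offset covered by an order-'order' folio starting at 'base' */
static size_t folio_last_offset(size_t base, unsigned int order)
{
    return base + ((size_t)1 << order) - 1;
}
```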

>
>
> --Mika
>
>

2024-06-10 17:43:09

by Yosry Ahmed

Subject: Re: [PATCH v2] mm: zswap: handle incorrect attempts to load large folios

On Fri, Jun 7, 2024 at 9:13 PM Barry Song <[email protected]> wrote:
>
> On Sat, Jun 8, 2024 at 10:37 AM Yosry Ahmed <[email protected]> wrote:
> >
> > [...]
> > + if (folio_test_large(folio)) {
> > + if (xa_find(tree, &offset,
> > + offset + folio_nr_pages(folio) - 1, XA_PRESENT)) {
> > + WARN_ON_ONCE(1);
> > + return true;
> > + }
> > + return false;
>
> IMHO, this appears to be over-designed. Personally, I would opt to
> use
>
> if (folio_test_large(folio))
> return true;

I am sure you mean "return false" here. Always returning true means we
will never read a large folio from either zswap or disk, whether it's
in zswap or not, basically guaranteeing data corruption for large
folio swapin, even if zswap is disabled :)

>
> Before we address large folio support in zswap, it’s essential
> not to let them coexist. Expecting valid data by lunchtime is
> not advisable.

The goal here is to enable development for large folio swapin without
breaking zswap or being blocked on adding support in zswap. If we
always return false for large folios, as you suggest, then even if the
folio is in zswap (or parts of it), we will go read it from disk. This
will result in silent data corruption.

As you mentioned before, you spent a week debugging problems with your
large folio swapin series because of a zswap problem, and even after
then, the zswap_is_enabled() check you had is not enough to prevent
problems as I mentioned before (if zswap was enabled before). So we
need stronger checks to make sure we don't break things when we
support large folio swapin.

Since we can't just check if zswap is enabled or not, we need to
rather check if the folio (or any part of it) is in zswap or not. We
can only WARN in that case, but delivering the error to userspace is a
couple of extra lines of code (not set uptodate), and will make the
problem much easier to notice.

I am not sure I understand what you mean. The alternative is to
introduce a config option (perhaps internal) for large folio swapin,
and make this depend on !CONFIG_ZSWAP, or make zswap refuse to get
enabled if large folio swapin is enabled (through config or boot
option). This is until proper handling is added, of course.

>
> > [...]
>
> Thanks
> Barry

2024-06-10 20:06:13

by Barry Song

Subject: Re: [PATCH v2] mm: zswap: handle incorrect attempts to load large folios

On Tue, Jun 11, 2024 at 1:42 AM Yosry Ahmed <[email protected]> wrote:
>
> On Fri, Jun 7, 2024 at 9:13 PM Barry Song <[email protected]> wrote:
> >
> > On Sat, Jun 8, 2024 at 10:37 AM Yosry Ahmed <[email protected]> wrote:
> > >
> > > [...]
> > > + if (folio_test_large(folio)) {
> > > + if (xa_find(tree, &offset,
> > > + offset + folio_nr_pages(folio) - 1, XA_PRESENT)) {
> > > + WARN_ON_ONCE(1);
> > > + return true;
> > > + }
> > > + return false;
> >
> > IMHO, this appears to be over-designed. Personally, I would opt to
> > use
> >
> > if (folio_test_large(folio))
> > return true;
>
> I am sure you mean "return false" here. Always returning true means we
> will never read a large folio from either zswap or disk, whether it's
> in zswap or not. Basically guaranteeing corrupting data for large
> folio swapin, even if zswap is disabled :)
>
> >
> > Before we address large folio support in zswap, it’s essential
> > not to let them coexist. Expecting valid data by lunchtime is
> > not advisable.
>
> The goal here is to enable development for large folio swapin without
> breaking zswap or being blocked on adding support in zswap. If we
> always return false for large folios, as you suggest, then even if the
> folio is in zswap (or parts of it), we will go read it from disk. This
> will result in silent data corruption.
>
> As you mentioned before, you spent a week debugging problems with your
> large folio swapin series because of a zswap problem, and even after
> then, the zswap_is_enabled() check you had is not enough to prevent
> problems as I mentioned before (if zswap was enabled before). So we
> need stronger checks to make sure we don't break things when we
> support large folio swapin.
>
> Since we can't just check if zswap is enabled or not, we need to
> rather check if the folio (or any part of it) is in zswap or not. We
> can only WARN in that case, but delivering the error to userspace is a
> couple of extra lines of code (not set uptodate), and will make the
> problem much easier to notice.
>
> I am not sure I understand what you mean. The alternative is to
> introduce a config option (perhaps internal) for large folio swapin,
> and make this depend on !CONFIG_ZSWAP, or make zswap refuse to get
> enabled if large folio swapin is enabled (through config or boot
> option). This is until proper handling is added, of course.

Hi Yosry,
My point is that anybody attempting to do large folio swap-in should
either
1. always use small folios if zswap has been enabled before or is
enabled now, or
2. address the large folio swapin issues in zswap.

There is no 3rd way which you are providing.

It is over-designed to give users true or false based on whether the
data is in zswap, as there is always a chance the data could be in
zswap. So before approach 2 is done, we should always WARN_ON large
folios and report data corruption.

>
> >
> > > [...]

Thanks
Barry

2024-06-10 20:12:20

by Yosry Ahmed

Subject: Re: [PATCH v2] mm: zswap: handle incorrect attempts to load large folios

On Mon, Jun 10, 2024 at 1:06 PM Barry Song <[email protected]> wrote:
>
> On Tue, Jun 11, 2024 at 1:42 AM Yosry Ahmed <[email protected]> wrote:
> >
> > On Fri, Jun 7, 2024 at 9:13 PM Barry Song <[email protected]> wrote:
> > >
> > > On Sat, Jun 8, 2024 at 10:37 AM Yosry Ahmed <[email protected]> wrote:
> > > >
> > > > [...]
> > > > + if (folio_test_large(folio)) {
> > > > + if (xa_find(tree, &offset,
> > > > + offset + folio_nr_pages(folio) - 1, XA_PRESENT)) {
> > > > + WARN_ON_ONCE(1);
> > > > + return true;
> > > > + }
> > > > + return false;
> > >
> > > IMHO, this appears to be over-designed. Personally, I would opt to
> > > use
> > >
> > > if (folio_test_large(folio))
> > > return true;
> >
> > I am sure you mean "return false" here. Always returning true means we
> > will never read a large folio from either zswap or disk, whether it's
> > in zswap or not. Basically guaranteeing corrupting data for large
> > folio swapin, even if zswap is disabled :)
> >
> > >
> > > Before we address large folio support in zswap, it’s essential
> > > not to let them coexist. Expecting valid data by lunchtime is
> > > not advisable.
> >
> > The goal here is to enable development for large folio swapin without
> > breaking zswap or being blocked on adding support in zswap. If we
> > always return false for large folios, as you suggest, then even if the
> > folio is in zswap (or parts of it), we will go read it from disk. This
> > will result in silent data corruption.
> >
> > As you mentioned before, you spent a week debugging problems with your
> > large folio swapin series because of a zswap problem, and even after
> > then, the zswap_is_enabled() check you had is not enough to prevent
> > problems as I mentioned before (if zswap was enabled before). So we
> > need stronger checks to make sure we don't break things when we
> > support large folio swapin.
> >
> > Since we can't just check if zswap is enabled or not, we need to
> > rather check if the folio (or any part of it) is in zswap or not. We
> > can only WARN in that case, but delivering the error to userspace is a
> > couple of extra lines of code (not set uptodate), and will make the
> > problem much easier to notice.
> >
> > I am not sure I understand what you mean. The alternative is to
> > introduce a config option (perhaps internal) for large folio swapin,
> > and make this depend on !CONFIG_ZSWAP, or make zswap refuse to get
> > enabled if large folio swapin is enabled (through config or boot
> > option). This is until proper handling is added, of course.
>
> Hi Yosry,
> My point is that anybody attempts to do large folios swap-in should
> either
> 1. always use small folios if zswap has been once enabled before or now
> or
> 2. address the large folios swapin issues in zswap
>
> there is no 3rd way which you are providing.
>
> it is over-designed to give users true or false based on if data is zswap
> as there is always a chance data could be in zswap. so before approach
> 2 is done, we should always WARN_ON large folios and report data
> corruption.

We can't always WARN_ON for large folios, as this will fire even if
zswap was never enabled. The alternative is tracking whether zswap was
ever enabled, and checking that instead of checking if any part of the
folio is in zswap.

Basically replacing xa_find(..) with zswap_was_enabled(..) or something.

What I don't like about this is that we will report data corruption
even in cases where data is not really corrupted and it exists on
disk. For example, if zswap is globally enabled but disabled in a
cgroup, there shouldn't be any corruption swapping in large folios.

That being said, I don't feel strongly, as long as we either check
that part of the folio is in zswap or that zswap was ever enabled (or
maybe check if a page was ever stored, just in case zswap was enabled
and immediately disabled).

Johannes, Nhat, any opinions on which way to handle this?
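To spell out the semantics under discussion, here is a standalone userspace C model of the guard in the patch (the types and array are illustrative stand-ins, not the kernel implementation):

```c
#include <stdbool.h>
#include <assert.h>

/* Illustrative stand-ins for the kernel types; not the real structures. */
struct folio {
	int nr_pages;   /* folio_nr_pages() */
	bool uptodate;  /* set by folio_mark_uptodate() */
};

/* Stand-in for the zswap tree: which swap offsets have a zswap entry. */
bool zswap_present[16];

/*
 * Model of the patch's zswap_load() behavior:
 *  - large folio with any subpage in zswap: return true WITHOUT marking
 *    the folio uptodate, so the caller emits an IO error (e.g.
 *    VM_FAULT_SIGBUS) instead of silently reading stale data from disk;
 *  - large folio with nothing in zswap: return false, read from disk;
 *  - order-0 folio in zswap: load it and mark it uptodate as usual.
 */
bool zswap_load_model(struct folio *folio, int offset)
{
	if (folio->nr_pages > 1) {	/* folio_test_large() */
		for (int i = 0; i < folio->nr_pages; i++)
			if (zswap_present[offset + i])
				return true;	/* WARN_ON_ONCE(1) here */
		return false;
	}
	if (!zswap_present[offset])
		return false;
	folio->uptodate = true;		/* decompress + mark uptodate */
	return true;
}
```

The key case is a large folio that is only partially in zswap: the model returns true while leaving the folio !uptodate, which is exactly what turns silent corruption into a visible IO error.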

2024-06-10 21:00:24

by Barry Song

[permalink] [raw]
Subject: Re: [PATCH v2] mm: zswap: handle incorrect attempts to load of large folios

On Tue, Jun 11, 2024 at 4:12 AM Yosry Ahmed <[email protected]> wrote:
>
> On Mon, Jun 10, 2024 at 1:06 PM Barry Song <[email protected]> wrote:
> >
> > On Tue, Jun 11, 2024 at 1:42 AM Yosry Ahmed <[email protected]> wrote:
> > >
[..]
> >
> > Hi Yosry,
> > My point is that anybody attempts to do large folios swap-in should
> > either
> > 1. always use small folios if zswap has been once enabled before or now
> > or
> > 2. address the large folios swapin issues in zswap
> >
> > there is no 3rd way which you are providing.
> >
> > it is over-designed to give users true or false based on if data is zswap
> > as there is always a chance data could be in zswap. so before approach
> > 2 is done, we should always WARN_ON large folios and report data
> > corruption.
>
> We can't always WARN_ON for large folios, as this will fire even if
> zswap was never enabled. The alternative is tracking whether zswap was
> ever enabled, and checking that instead of checking if any part of the
> folio is in zswap.
>
> Basically replacing xa_find(..) with zswap_was_enabled(..) or something.

My point is that mm core should always fall back:

if (zswap_was_or_is_enabled())
goto fallback;

until zswap fixes the issue. This is the only way to enable large folio swap-in
development before we fix zswap.

>
> What I don't like about this is that we will report data corruption
> even in cases where data is not really corrupted and it exists on
> disk. For example, if zswap is globally enabled but disabled in a
> cgroup, there shouldn't be any corruption swapping in large folios.
>
> That being said, I don't feel strongly, as long as we either check
> that part of the folio is in zswap or that zswap was ever enabled (or
> maybe check if a page was ever stored, just in case zswap was enabled
> and immediately disabled).
>
> Johannes, Nhat, any opinions on which way to handle this?

Thanks
Barry
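The fallback being proposed can be modeled in a few lines of standalone C (the function names mirror the snippet above; the flag is a hypothetical stand-in for however the kernel would track "zswap was ever enabled"):

```c
#include <stdbool.h>
#include <assert.h>

/* Hypothetical one-way flag: set once zswap has ever been enabled. */
bool zswap_ever_enabled_flag;

bool is_zswap_ever_enabled(void)
{
	return zswap_ever_enabled_flag;
}

/*
 * Swapin order selection per the suggestion above: until zswap can load
 * large folios, fall back to order-0 whenever zswap is or ever was
 * enabled, because compressed pages may still be sitting in zswap even
 * after it is disabled.
 */
int choose_swapin_order(int requested_order)
{
	if (is_zswap_ever_enabled())
		return 0;	/* fallback: swap in page by page */
	return requested_order;
}
```

With this in mm core, the check inside zswap_load() only ever fires if the caller forgot the fallback, which is why it can be a plain WARN.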

2024-06-10 21:12:39

by Yosry Ahmed

[permalink] [raw]
Subject: Re: [PATCH v2] mm: zswap: handle incorrect attempts to load of large folios

On Mon, Jun 10, 2024 at 2:00 PM Barry Song <[email protected]> wrote:
>
> On Tue, Jun 11, 2024 at 4:12 AM Yosry Ahmed <[email protected]> wrote:
> >
[..]
> > We can't always WARN_ON for large folios, as this will fire even if
> > zswap was never enabled. The alternative is tracking whether zswap was
> > ever enabled, and checking that instead of checking if any part of the
> > folio is in zswap.
> >
> > Basically replacing xa_find(..) with zswap_was_enabled(..) or something.
>
> My point is that mm core should always fallback
>
> if (zswap_was_or_is_enabled())
> goto fallback;
>
> till zswap fixes the issue. This is the only way to enable large folios swap-in
> development before we fix zswap.

I agree with this, I just want an extra fallback in zswap itself in
case something was missed during large folio swapin development (which
can evidently happen).

2024-06-10 23:35:02

by Barry Song

[permalink] [raw]
Subject: Re: [PATCH v2] mm: zswap: handle incorrect attempts to load of large folios

On Tue, Jun 11, 2024 at 9:12 AM Yosry Ahmed <[email protected]> wrote:
>
> On Mon, Jun 10, 2024 at 2:00 PM Barry Song <[email protected]> wrote:
> >
> > On Tue, Jun 11, 2024 at 4:12 AM Yosry Ahmed <[email protected]> wrote:
> > >
[..]
> > > We can't always WARN_ON for large folios, as this will fire even if
> > > zswap was never enabled. The alternative is tracking whether zswap was
> > > ever enabled, and checking that instead of checking if any part of the
> > > folio is in zswap.
> > >
> > > Basically replacing xa_find(..) with zswap_was_enabled(..) or something.
> >
> > My point is that mm core should always fallback
> >
> > if (zswap_was_or_is_enabled())
> > goto fallback;
> >
> > till zswap fixes the issue. This is the only way to enable large folios swap-in
> > development before we fix zswap.
>
> I agree with this, I just want an extra fallback in zswap itself in
> case something was missed during large folio swapin development (which
> can evidently happen).

Yes, then I feel we only need to WARN_ON the case where mm-core fails to fall back.

I mean, only WARN_ON if is_zswap_ever_enabled() && large folio; there is no
need to do more. Before zswap brings up large folio support, mm-core
will need is_zswap_ever_enabled() to do the fallback.

diff --git a/include/linux/zswap.h b/include/linux/zswap.h
index 2a85b941db97..035e51ed89c4 100644
--- a/include/linux/zswap.h
+++ b/include/linux/zswap.h
@@ -36,6 +36,7 @@ void zswap_memcg_offline_cleanup(struct mem_cgroup *memcg);
void zswap_lruvec_state_init(struct lruvec *lruvec);
void zswap_folio_swapin(struct folio *folio);
bool is_zswap_enabled(void);
+bool is_zswap_ever_enabled(void);
#else

struct zswap_lruvec_state {};
@@ -65,6 +66,10 @@ static inline bool is_zswap_enabled(void)
return false;
}

+static inline bool is_zswap_ever_enabled(void)
+{
+ return false;
+}
#endif

#endif /* _LINUX_ZSWAP_H */
diff --git a/mm/zswap.c b/mm/zswap.c
index b9b35ef86d9b..bf2da5d37e47 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -86,6 +86,9 @@ static int zswap_setup(void);
static bool zswap_enabled = IS_ENABLED(CONFIG_ZSWAP_DEFAULT_ON);
static int zswap_enabled_param_set(const char *,
const struct kernel_param *);
+
+static bool zswap_ever_enabled;
+
static const struct kernel_param_ops zswap_enabled_param_ops = {
.set = zswap_enabled_param_set,
.get = param_get_bool,
@@ -136,6 +139,11 @@ bool is_zswap_enabled(void)
return zswap_enabled;
}

+bool is_zswap_ever_enabled(void)
+{
+ return zswap_enabled || zswap_ever_enabled;
+}
+
/*********************************
* data structures
**********************************/
@@ -1734,6 +1742,7 @@ static int zswap_setup(void)
pr_info("loaded using pool %s/%s\n", pool->tfm_name,
zpool_get_type(pool->zpools[0]));
list_add(&pool->list, &zswap_pools);
+ zswap_ever_enabled = true;
zswap_has_pool = true;
} else {
pr_err("pool creation failed\n");

Thanks
Barry

2024-06-10 23:42:03

by Yosry Ahmed

[permalink] [raw]
Subject: Re: [PATCH v2] mm: zswap: handle incorrect attempts to load of large folios

[..]
> > > > We can't always WARN_ON for large folios, as this will fire even if
> > > > zswap was never enabled. The alternative is tracking whether zswap was
> > > > ever enabled, and checking that instead of checking if any part of the
> > > > folio is in zswap.
> > > >
> > > > Basically replacing xa_find(..) with zswap_was_enabled(..) or something.
> > >
> > > My point is that mm core should always fallback
> > >
> > > if (zswap_was_or_is_enabled())
> > > goto fallback;
> > >
> > > till zswap fixes the issue. This is the only way to enable large folios swap-in
> > > development before we fix zswap.
> >
> > I agree with this, I just want an extra fallback in zswap itself in
> > case something was missed during large folio swapin development (which
> > can evidently happen).
>
> Yes. Then I feel we only need to WARN_ON the case where mm-core fails to fall back.
>
> I mean, only WARN_ON on is_zswap_ever_enabled() && large folio; there is
> no need to do more. Until zswap brings up large folio support, mm-core
> will need is_zswap_ever_enabled() to do the fallback.

I don't have a problem with doing it this way instead of checking if
any part of the folio is in zswap. Such a check may be needed for core
MM to fallback to order-0 anyway, as we discussed. But I'd rather have
this as a static key since it will never be changed.

Also, I still prefer that we do not mark the folio as uptodate in this
case. It is one extra line of code that also propagates the kernel
warning to userspace and makes it much more noticeable.


>
> diff --git a/include/linux/zswap.h b/include/linux/zswap.h
> index 2a85b941db97..035e51ed89c4 100644
> --- a/include/linux/zswap.h
> +++ b/include/linux/zswap.h
> @@ -36,6 +36,7 @@ void zswap_memcg_offline_cleanup(struct mem_cgroup *memcg);
> void zswap_lruvec_state_init(struct lruvec *lruvec);
> void zswap_folio_swapin(struct folio *folio);
> bool is_zswap_enabled(void);
> +bool is_zswap_ever_enabled(void);
> #else
>
> struct zswap_lruvec_state {};
> @@ -65,6 +66,10 @@ static inline bool is_zswap_enabled(void)
> return false;
> }
>
> +static inline bool is_zswap_ever_enabled(void)
> +{
> + return false;
> +}
> #endif
>
> #endif /* _LINUX_ZSWAP_H */
> diff --git a/mm/zswap.c b/mm/zswap.c
> index b9b35ef86d9b..bf2da5d37e47 100644
> --- a/mm/zswap.c
> +++ b/mm/zswap.c
> @@ -86,6 +86,9 @@ static int zswap_setup(void);
> static bool zswap_enabled = IS_ENABLED(CONFIG_ZSWAP_DEFAULT_ON);
> static int zswap_enabled_param_set(const char *,
> const struct kernel_param *);
> +
> +static bool zswap_ever_enabled;
> +
> static const struct kernel_param_ops zswap_enabled_param_ops = {
> .set = zswap_enabled_param_set,
> .get = param_get_bool,
> @@ -136,6 +139,11 @@ bool is_zswap_enabled(void)
> return zswap_enabled;
> }
>
> +bool is_zswap_ever_enabled(void)
> +{
> + return zswap_enabled || zswap_ever_enabled;
> +}
> +
> /*********************************
> * data structures
> **********************************/
> @@ -1734,6 +1742,7 @@ static int zswap_setup(void)
> pr_info("loaded using pool %s/%s\n", pool->tfm_name,
> zpool_get_type(pool->zpools[0]));
> list_add(&pool->list, &zswap_pools);
> + zswap_ever_enabled = true;
> zswap_has_pool = true;
> } else {
> pr_err("pool creation failed\n");
>
> Thanks
> Barry
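The one-way "ever enabled" latch being discussed can be modeled in a small userspace sketch. In the kernel, Yosry's suggestion would make this a static key (DEFINE_STATIC_KEY_FALSE plus static_branch_enable), since it flips at most once, at pool creation, and is then read on every swapin. The plain bools and the swapin_order() helper below are illustrative stand-ins, not kernel code:

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model of the proposed one-way latch. */
static bool zswap_enabled;        /* runtime-writable param */
static bool zswap_ever_enabled;   /* set once, never cleared */

static void zswap_pool_created(void)
{
	zswap_enabled = true;
	zswap_ever_enabled = true;
}

static void zswap_disable(void)
{
	/* Disabling zswap does not clear the latch: compressed pages
	 * may still sit in zswap pools after it is turned off. */
	zswap_enabled = false;
}

static bool is_zswap_enabled(void)
{
	return zswap_enabled;
}

static bool is_zswap_ever_enabled(void)
{
	return zswap_ever_enabled;
}

/* Hypothetical mm-core fallback: until zswap handles large folios,
 * swap in at order-0 whenever zswap is or was ever enabled. */
static unsigned int swapin_order(unsigned int desired_order)
{
	if (is_zswap_ever_enabled())
		return 0;
	return desired_order;
}
```

Once set, the latch stays true across an enable/disable cycle, which is exactly why a static key fits: the transition is one-way and the check sits on the swapin fast path.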

2024-06-11 00:11:36

by Barry Song

Subject: Re: [PATCH v2] mm: zswap: handle incorrect attempts to load of large folios

On Tue, Jun 11, 2024 at 11:41 AM Yosry Ahmed <[email protected]> wrote:
>
> [..]
> > > > > We can't always WARN_ON for large folios, as this will fire even if
> > > > > zswap was never enabled. The alternative is tracking whether zswap was
> > > > > ever enabled, and checking that instead of checking if any part of the
> > > > > folio is in zswap.
> > > > >
> > > > > Basically replacing xa_find(..) with zswap_was_enabled(..) or something.
> > > >
> > > > My point is that mm core should always fallback
> > > >
> > > > if (zswap_was_or_is_enabled())
> > > > goto fallback;
> > > >
> > > > till zswap fixes the issue. This is the only way to enable large folios swap-in
> > > > development before we fix zswap.
> > >
> > > I agree with this, I just want an extra fallback in zswap itself in
> > > case something was missed during large folio swapin development (which
> > > can evidently happen).
> >
> > Yes. Then I feel we only need to WARN_ON the case where mm-core fails to fall back.
> >
> > I mean, only WARN_ON on is_zswap_ever_enabled() && large folio; there is
> > no need to do more. Until zswap brings up large folio support, mm-core
> > will need is_zswap_ever_enabled() to do the fallback.
>
> I don't have a problem with doing it this way instead of checking if
> any part of the folio is in zswap. Such a check may be needed for core
> MM to fallback to order-0 anyway, as we discussed. But I'd rather have
> this as a static key since it will never be changed.

right. This is better.

>
> Also, I still prefer that we do not mark the folio as uptodate in this
> case. It is one extra line of code that also propagates the kernel
> warning to userspace and makes it much more noticeable.

Right. I have no objection to returning true and skipping the uptodate
mark. Just searching the xarray is not so useful anyway, as we have to
either fall back in mm-core or bring up large folio support in zswap.
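The behavior both sides agree on here, returning true from zswap_load() without marking the folio uptodate so the caller emits an IO error instead of silently reading stale data from disk, can be modeled in userspace. The enum, struct, and function names below are illustrative, not the kernel's:

```c
#include <assert.h>
#include <stdbool.h>

/* Model of the swap_read_folio()/zswap_load() contract: a true return
 * means "zswap handled it, do not touch disk"; the folio is usable only
 * if it was also marked uptodate. Returning true without uptodate thus
 * surfaces as an IO error rather than silent corruption. */
enum outcome { READ_FROM_DISK, OK_FROM_ZSWAP, IO_ERROR };

struct folio_model {
	bool large;
	bool any_page_in_zswap;
	bool uptodate;
};

static bool model_zswap_load(struct folio_model *f)
{
	if (f->large) {
		if (f->any_page_in_zswap)
			return true;   /* handled, but NOT uptodate */
		return false;          /* nothing in zswap: disk is safe */
	}
	f->uptodate = true;            /* order-0 load succeeds here */
	return true;
}

static enum outcome swap_read(struct folio_model *f)
{
	if (!model_zswap_load(f))
		return READ_FROM_DISK;
	return f->uptodate ? OK_FROM_ZSWAP : IO_ERROR;
}
```

In this model, a large folio with any subpage in zswap yields IO_ERROR (e.g. do_swap_page() returning VM_FAULT_SIGBUS), a large folio with nothing in zswap falls through to disk, and an order-0 folio loads normally.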

>
>
> >
> > diff --git a/include/linux/zswap.h b/include/linux/zswap.h
> > index 2a85b941db97..035e51ed89c4 100644
> > --- a/include/linux/zswap.h
> > +++ b/include/linux/zswap.h
> > @@ -36,6 +36,7 @@ void zswap_memcg_offline_cleanup(struct mem_cgroup *memcg);
> > void zswap_lruvec_state_init(struct lruvec *lruvec);
> > void zswap_folio_swapin(struct folio *folio);
> > bool is_zswap_enabled(void);
> > +bool is_zswap_ever_enabled(void);
> > #else
> >
> > struct zswap_lruvec_state {};
> > @@ -65,6 +66,10 @@ static inline bool is_zswap_enabled(void)
> > return false;
> > }
> >
> > +static inline bool is_zswap_ever_enabled(void)
> > +{
> > + return false;
> > +}
> > #endif
> >
> > #endif /* _LINUX_ZSWAP_H */
> > diff --git a/mm/zswap.c b/mm/zswap.c
> > index b9b35ef86d9b..bf2da5d37e47 100644
> > --- a/mm/zswap.c
> > +++ b/mm/zswap.c
> > @@ -86,6 +86,9 @@ static int zswap_setup(void);
> > static bool zswap_enabled = IS_ENABLED(CONFIG_ZSWAP_DEFAULT_ON);
> > static int zswap_enabled_param_set(const char *,
> > const struct kernel_param *);
> > +
> > +static bool zswap_ever_enabled;
> > +
> > static const struct kernel_param_ops zswap_enabled_param_ops = {
> > .set = zswap_enabled_param_set,
> > .get = param_get_bool,
> > @@ -136,6 +139,11 @@ bool is_zswap_enabled(void)
> > return zswap_enabled;
> > }
> >
> > +bool is_zswap_ever_enabled(void)
> > +{
> > + return zswap_enabled || zswap_ever_enabled;
> > +}
> > +
> > /*********************************
> > * data structures
> > **********************************/
> > @@ -1734,6 +1742,7 @@ static int zswap_setup(void)
> > pr_info("loaded using pool %s/%s\n", pool->tfm_name,
> > zpool_get_type(pool->zpools[0]));
> > list_add(&pool->list, &zswap_pools);
> > + zswap_ever_enabled = true;
> > zswap_has_pool = true;
> > } else {
> > pr_err("pool creation failed\n");
> >

Thanks
Barry

2024-06-11 04:14:35

by Mika Penttilä

Subject: Re: [PATCH v2] mm: zswap: handle incorrect attempts to load of large folios


On 6/10/24 20:35, Yosry Ahmed wrote:
> On Fri, Jun 7, 2024 at 9:08 PM Mika Penttilä <[email protected]> wrote:
>> On 6/8/24 05:36, Yosry Ahmed wrote:
>>> diff --git a/mm/zswap.c b/mm/zswap.c
>>> index b9b35ef86d9be..ebb878d3e7865 100644
>>> --- a/mm/zswap.c
>>> +++ b/mm/zswap.c
>>> @@ -1557,6 +1557,26 @@ bool zswap_load(struct folio *folio)
>>>
>>> VM_WARN_ON_ONCE(!folio_test_locked(folio));
>>>
>>> + /*
>>> + * Large folios should not be swapped in while zswap is being used, as
>>> + * they are not properly handled. Zswap does not properly load large
>>> + * folios, and a large folio may only be partially in zswap.
>>> + *
>>> + * If any of the subpages are in zswap, reading from disk would result
>>> + * in data corruption, so return true without marking the folio uptodate
>>> + * so that an IO error is emitted (e.g. do_swap_page() will sigfault).
>>> + *
>>> + * Otherwise, return false and read the folio from disk.
>>> + */
>>> + if (folio_test_large(folio)) {
>>> + if (xa_find(tree, &offset,
>>> + offset + folio_nr_pages(folio) - 1, XA_PRESENT)) {
>>> + WARN_ON_ONCE(1);
>>> + return true;
>>> + }
>> How does that work? Should it be xa_find_after() to not always find
>> current entry?
> By "current entry" I believe you mean the entry corresponding to
> "offset" (i.e. the first subpage of the folio). At this point, we
> haven't checked if that offset has a corresponding entry in zswap or
not. It may be on disk, or zswap may be disabled.

Okay, you test whether there's any matching offset in zswap for the folio.


>> And does it still mean those subsequent entries map to same folio?
> If I understand correctly, a folio in the swapcache has contiguous
> swap offsets for its subpages. So I am assuming that the large folio
> swapin case will adhere to that (i.e. we only swapin a large folio if
> the swap offsets are contiguous). Did I misunderstand something here?

Yes, I think that is a fair assumption for now. But I also saw your v3,
which doesn't depend on this.


>
>>
>> --Mika
>>
>>
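The inclusive range scan Mika asked about, xa_find(tree, &offset, offset + folio_nr_pages(folio) - 1, XA_PRESENT), can be modeled with a minimal userspace sketch. A fixed-size bitmap stands in for the per-tree xarray, and the names are illustrative:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A folio in the swapcache occupies nr contiguous swap offsets
 * [offset, offset + nr - 1]; the folio must be rejected if ANY of
 * them has a zswap entry, not just the first. */
#define NR_OFFSETS 64
static bool zswap_present[NR_OFFSETS];   /* stand-in for the xarray */

static bool any_subpage_in_zswap(size_t offset, size_t nr)
{
	/* Like xa_find() with XA_PRESENT: an inclusive range scan
	 * returning whether any entry in the range is present. */
	for (size_t i = offset; i < offset + nr && i < NR_OFFSETS; i++)
		if (zswap_present[i])
			return true;
	return false;
}
```

This also shows why xa_find() (and not xa_find_after()) is correct here: the first offset has not been looked up yet, so the scan must include it.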


2024-06-11 15:48:10

by Yosry Ahmed

Subject: Re: [PATCH v2] mm: zswap: handle incorrect attempts to load of large folios

On Mon, Jun 10, 2024 at 9:14 PM Mika Penttilä <[email protected]> wrote:
>
>
> On 6/10/24 20:35, Yosry Ahmed wrote:
> > On Fri, Jun 7, 2024 at 9:08 PM Mika Penttilä <[email protected]> wrote:
> >> On 6/8/24 05:36, Yosry Ahmed wrote:
> >>> diff --git a/mm/zswap.c b/mm/zswap.c
> >>> index b9b35ef86d9be..ebb878d3e7865 100644
> >>> --- a/mm/zswap.c
> >>> +++ b/mm/zswap.c
> >>> @@ -1557,6 +1557,26 @@ bool zswap_load(struct folio *folio)
> >>>
> >>> VM_WARN_ON_ONCE(!folio_test_locked(folio));
> >>>
> >>> + /*
> >>> + * Large folios should not be swapped in while zswap is being used, as
> >>> + * they are not properly handled. Zswap does not properly load large
> >>> + * folios, and a large folio may only be partially in zswap.
> >>> + *
> >>> + * If any of the subpages are in zswap, reading from disk would result
> >>> + * in data corruption, so return true without marking the folio uptodate
> >>> + * so that an IO error is emitted (e.g. do_swap_page() will sigfault).
> >>> + *
> >>> + * Otherwise, return false and read the folio from disk.
> >>> + */
> >>> + if (folio_test_large(folio)) {
> >>> + if (xa_find(tree, &offset,
> >>> + offset + folio_nr_pages(folio) - 1, XA_PRESENT)) {
> >>> + WARN_ON_ONCE(1);
> >>> + return true;
> >>> + }
> >> How does that work? Should it be xa_find_after() to not always find
> >> current entry?
> > By "current entry" I believe you mean the entry corresponding to
> > "offset" (i.e. the first subpage of the folio). At this point, we
> > haven't checked if that offset has a corresponding entry in zswap or
> not. It may be on disk, or zswap may be disabled.
>
> Okay, you test whether there's any matching offset in zswap for the folio.
>
>
> >> And does it still mean those subsequent entries map to same folio?
> > If I understand correctly, a folio in the swapcache has contiguous
> > swap offsets for its subpages. So I am assuming that the large folio
> > swapin case will adhere to that (i.e. we only swapin a large folio if
> > the swap offsets are contiguous). Did I misunderstand something here?
>
> Yes, I think that is a fair assumption for now. But I also saw your v3,
> which doesn't depend on this.

Yeah, Barry pointed out that we want to warn if a large folio reaches
zswap and there is a chance that it is in zswap (i.e. zswap was
enabled), even if it happens that the folio is not in zswap during
swapin. This gives wider coverage, and when zswap is disabled it is
cheaper than the lookups.

We will also need to check if zswap was ever enabled in the large
folio swapin series anyway, so the helper introduced in v3 should be
helpful there as well.