2022-03-04 08:50:49

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH mmotm] mm: filemap_unaccount_folio() large skip mapcount fixup

On Thu, Mar 03, 2022 at 08:21:19PM -0800, Hugh Dickins wrote:
> The page_mapcount_reset() when folio_mapped() while mapping_exiting()
> was devised long before there were huge or compound pages in the cache.
> It is still valid for small pages, but not at all clear what's right to
> check and reset on large pages. Just don't try when folio_test_large().

Thanks for bringing this up! I was really unsure about this chunk of code
when converting unaccount_page_cache_page() to filemap_unaccount_folio().

Part of me wants to just delete the whole thing. I'm unconvinced by
the argument; surely it's better to leak memory than perhaps reuse a
page which should not have been freed yet?

Also, the code doesn't take into account that folio_mapped() is freaking
expensive for THP (512 cache lines, blowing away 32kB of your L1 cache!),
and we may as well calculate folio_mapcount() while we're doing it.

Do you see this report often on machines that don't have
VM_BUG_ON_FOLIO() enabled?


2022-03-04 18:56:47

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH mmotm] mm: filemap_unaccount_folio() large skip mapcount fixup

On Fri, 4 Mar 2022, Matthew Wilcox wrote:
> On Thu, Mar 03, 2022 at 08:21:19PM -0800, Hugh Dickins wrote:
> > The page_mapcount_reset() when folio_mapped() while mapping_exiting()
> > was devised long before there were huge or compound pages in the cache.
> > It is still valid for small pages, but not at all clear what's right to
> > check and reset on large pages. Just don't try when folio_test_large().
>
> Thanks for bringing this up! I was really unsure about this chunk of code
> when converting unaccount_page_cache_page() to filemap_unaccount_folio().
>
> Part of me wants to just delete the whole thing.

A part of me feels the same way. Accepting to waste 2MB while
footling around over 4kB doesn't help my case for that code -
whose beauty is, umm, questionable.

> I'm unconvinced by
> the argument; surely it's better to leak memory than perhaps reuse a
> page which should not have been freed yet?

I know people who would agree with you. And in most cases we do prefer
to waste it, because it has become suspicious.

But by far the most common reason for getting here, is that a page table
has got corrupted in some way, so the pte or pmd was never identified to
unmap the page. Nothing suspect about the page itself, and (barring a
bug in vma unlinking) all the mappings which might hold the page have
been unmapped: we've just got a leftover in the mapcount.

>
> Also, the code doesn't take into account that folio_mapped() is freaking
> expensive for THP (512 cache lines, blowing away 32kB of your L1 cache!),
> and we may as well calculate folio_mapcount() while we're doing it.

Well, we don't need to optimize the rare case.

As you know, I've long thought that there should be one quick total
mapcount field; but it's never been a priority to work out the details
on that (and Linus seems to be driving mapcount out of fashion; and
you have your own ideas there).

>
> Do you see this report often on machines that don't have
> VM_BUG_ON_FOLIO() enabled?

No, not often at all. Almost never. I think it did come up occasionally
when that code was written, but seems to have faded away in recent years.
Cosmic rays getting weaker, or memory getting better, or fewer bugs perhaps
- I haven't looked back to check, may be wrong, but I think when I added
unmap_mapping_page() last year (now s/page/folio/), that was to fix one
real (but benign) cause of such warnings.

Hugh