So any arch code which uses page_mapping_file() might get the wrong
answer, this function will return NULL for a swap backed page which
can be a shmem pages. But shmem pages can still be shared among
multiple process (and possibly at different virtual addresses if
mremap was use).
Attached is a patch that changes page_mapping_file() to return the
shmem mapping for swap backed shmem page. I have not tested it (no
way for me to test all those architecture) and i spotted this while
working on something else. So i hope someone can take a closer look.
Cheers,
J?r?me
Hi Jerome,
On Wed, 27 May 2020, Jerome Glisse wrote:
> So any arch code which uses page_mapping_file() might get the wrong
> answer, this function will return NULL for a swap backed page which
> can be a shmem pages. But shmem pages can still be shared among
> multiple process (and possibly at different virtual addresses if
> mremap was use).
>
> Attached is a patch that changes page_mapping_file() to return the
> shmem mapping for swap backed shmem page. I have not tested it (no
> way for me to test all those architecture) and i spotted this while
> working on something else. So i hope someone can take a closer look.
I'm certainly no expert on flush_dcache_page() and friends, but I'd
be very surprised if such a problem exists, yet has gone unnoticed
for so long. page_mapping_file() itself is fairly new, added when
a risk of crashing on a race with swapoff came in: but the previous
use of page_mapping() would have suffered equally if there were such
a cache flushinhg problem here.
And I'm afraid your patch won't do anything to help if there is a
problem: very soon after shmem calls add_to_swap_cache(), it calls
shmem_delete_from_page_cache(), which sets page->mapping to NULL.
But I can assure you that a shmem page (unlike an anon page) is never
put into swap cache while it is mapped into userspace, and never
mapped into userspace while it is still in swap cache: does that help?
Hugh
> This might be a shmem page that is in a sense a file that
> can be mapped multiple times in different processes at
> possibly different virtual addresses (fork + mremap). So
> return the shmem mapping that will allow any arch code to
> find all mappings of the page.
>
> Note that even if page is not anonymous then the page might
> have a NULL page->mapping field if it is being truncated,
> but then it is fine as each pte poiting to the page will be
> remove and cache flushing should be handled properly by that
> part of the code.
>
> Signed-off-by: Jerome Glisse <[email protected]>
> Cc: "Huang, Ying" <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Mel Gorman <[email protected]>
> Cc: Russell King <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Mike Rapoport <[email protected]>
> Cc: "David S. Miller" <[email protected]>
> Cc: "James E.J. Bottomley" <[email protected]>
> ---
> mm/util.c | 18 +++++++++++++++++-
> 1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/mm/util.c b/mm/util.c
> index 988d11e6c17c..ec8739ab0cc3 100644
> --- a/mm/util.c
> +++ b/mm/util.c
> @@ -685,8 +685,24 @@ EXPORT_SYMBOL(page_mapping);
> */
> struct address_space *page_mapping_file(struct page *page)
> {
> - if (unlikely(PageSwapCache(page)))
> + if (unlikely(PageSwapCache(page))) {
> + /*
> + * This might be a shmem page that is in a sense a file that
> + * can be mapped multiple times in different processes at
> + * possibly different virtual addresses (fork + mremap). So
> + * return the shmem mapping that will allow any arch code to
> + * find all mappings of the page.
> + *
> + * Note that even if page is not anonymous then the page might
> + * have a NULL page->mapping field if it is being truncated,
> + * but then it is fine as each pte poiting to the page will be
> + * remove and cache flushing should be handled properly by that
> + * part of the code.
> + */
> + if (!PageAnon(page))
> + return page->mapping;
> return NULL;
> + }
> return page_mapping(page);
> }
>
> --
> 2.26.2
On Wed, May 27, 2020 at 08:46:22PM -0700, Hugh Dickins wrote:
> Hi Jerome,
>
> On Wed, 27 May 2020, Jerome Glisse wrote:
> > So any arch code which uses page_mapping_file() might get the wrong
> > answer, this function will return NULL for a swap backed page which
> > can be a shmem pages. But shmem pages can still be shared among
> > multiple process (and possibly at different virtual addresses if
> > mremap was use).
> >
> > Attached is a patch that changes page_mapping_file() to return the
> > shmem mapping for swap backed shmem page. I have not tested it (no
> > way for me to test all those architecture) and i spotted this while
> > working on something else. So i hope someone can take a closer look.
>
> I'm certainly no expert on flush_dcache_page() and friends, but I'd
> be very surprised if such a problem exists, yet has gone unnoticed
> for so long. page_mapping_file() itself is fairly new, added when
> a risk of crashing on a race with swapoff came in: but the previous
> use of page_mapping() would have suffered equally if there were such
> a cache flushinhg problem here.
>
> And I'm afraid your patch won't do anything to help if there is a
> problem: very soon after shmem calls add_to_swap_cache(), it calls
> shmem_delete_from_page_cache(), which sets page->mapping to NULL.
>
> But I can assure you that a shmem page (unlike an anon page) is never
> put into swap cache while it is mapped into userspace, and never
> mapped into userspace while it is still in swap cache: does that help?
>
You are right i missed/forgot the part where shmem is never swapcache
and mapped at the same time, thus page_mapping_file() can return NULL
for those as they can no longer have alias mapping.
Thank you Hugh
J?r?me