We used to have a single swap address space with swp_entry_t.val
as its radix tree index. This is not the case anymore. Now each
swp_type() has its own address space and should use swp_offset()
as radix tree index.
Signed-off-by: Yu Zhao <[email protected]>
---
mm/shmem.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index d44991ea5ed4..a92c6ae26915 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1509,11 +1509,13 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
{
struct page *oldpage, *newpage;
struct address_space *swap_mapping;
- pgoff_t swap_index;
+ swp_entry_t entry;
int error;
+ VM_BUG_ON(PageSwapCache(*pagep));
+
oldpage = *pagep;
- swap_index = page_private(oldpage);
+ entry.val = page_private(oldpage);
swap_mapping = page_mapping(oldpage);
/*
@@ -1532,7 +1534,7 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
__SetPageLocked(newpage);
__SetPageSwapBacked(newpage);
SetPageUptodate(newpage);
- set_page_private(newpage, swap_index);
+ set_page_private(newpage, entry.val);
SetPageSwapCache(newpage);
/*
@@ -1540,7 +1542,8 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
* a nice clean interface for us to replace oldpage by newpage there.
*/
xa_lock_irq(&swap_mapping->i_pages);
- error = shmem_replace_entry(swap_mapping, swap_index, oldpage, newpage);
+ error = shmem_replace_entry(swap_mapping, swp_offset(entry),
+ oldpage, newpage);
if (!error) {
__inc_node_page_state(newpage, NR_FILE_PAGES);
__dec_node_page_state(oldpage, NR_FILE_PAGES);
--
2.19.1.1215.g8438c0b245-goog
We used to have a single swap address space with swp_entry_t.val
as its radix tree index. This is not the case anymore. Now each
swp_type() has its own address space and should use swp_offset()
as radix tree index.
Signed-off-by: Yu Zhao <[email protected]>
---
mm/shmem.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index d44991ea5ed4..685faa3e0191 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1509,11 +1509,13 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
{
struct page *oldpage, *newpage;
struct address_space *swap_mapping;
- pgoff_t swap_index;
+ swp_entry_t entry;
int error;
+ VM_BUG_ON(!PageSwapCache(*pagep));
+
oldpage = *pagep;
- swap_index = page_private(oldpage);
+ entry.val = page_private(oldpage);
swap_mapping = page_mapping(oldpage);
/*
@@ -1532,7 +1534,7 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
__SetPageLocked(newpage);
__SetPageSwapBacked(newpage);
SetPageUptodate(newpage);
- set_page_private(newpage, swap_index);
+ set_page_private(newpage, entry.val);
SetPageSwapCache(newpage);
/*
@@ -1540,7 +1542,8 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
* a nice clean interface for us to replace oldpage by newpage there.
*/
xa_lock_irq(&swap_mapping->i_pages);
- error = shmem_replace_entry(swap_mapping, swap_index, oldpage, newpage);
+ error = shmem_replace_entry(swap_mapping, swp_offset(entry),
+ oldpage, newpage);
if (!error) {
__inc_node_page_state(newpage, NR_FILE_PAGES);
__dec_node_page_state(oldpage, NR_FILE_PAGES);
--
2.19.1.1215.g8438c0b245-goog
On Sun, 18 Nov 2018, Yu Zhao wrote:
> We used to have a single swap address space with swp_entry_t.val
> as its radix tree index. This is not the case anymore. Now each
> swp_type() has its own address space and should use swp_offset()
> as radix tree index.
>
> Signed-off-by: Yu Zhao <[email protected]>
This fix is a great find, thank you! But completely mis-described!
And could you do a smaller patch, keeping swap_index, that can go to
stable without getting into trouble with the recent xarrifications?
Fixes: bde05d1ccd51 ("shmem: replace page if mapping excludes its zone")
Cc: [email protected] # 3.5+
Seems shmem_replace_page() has been wrong since the day I wrote it:
good enough to work on swap "type" 0, which is all most people ever use
(especially those few who need shmem_replace_page() at all), but broken
once there are any non-0 swp_type bits set in the higher order bits.
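
To illustrate the failure mode described above, here is a minimal
user-space sketch (not kernel code). The model_* names, the 5-bit type
field and its position in the top bits are assumptions made up for this
example; the real encoding is architecture-dependent and hidden behind
swp_entry(), swp_type() and swp_offset(). It shows only that the raw
entry value equals the offset while the type is 0 and differs otherwise,
which is how using page_private() directly as the index can go unnoticed
when only swap type 0 is in use.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout for illustration only: type in the top 5 bits. */
#define MODEL_TYPE_BITS   5
#define MODEL_TYPE_SHIFT  (64 - MODEL_TYPE_BITS)
#define MODEL_OFFSET_MASK ((UINT64_C(1) << MODEL_TYPE_SHIFT) - 1)

/* Stand-in for swp_entry_t: one opaque word holding type and offset. */
typedef struct { uint64_t val; } model_swp_entry_t;

/* Pack a swap type and an offset into one entry value. */
static model_swp_entry_t model_swp_entry(unsigned int type, uint64_t offset)
{
	model_swp_entry_t e;

	e.val = ((uint64_t)type << MODEL_TYPE_SHIFT) |
		(offset & MODEL_OFFSET_MASK);
	return e;
}

/* Extract the swap type (which swap area the entry belongs to). */
static unsigned int model_swp_type(model_swp_entry_t e)
{
	return (unsigned int)(e.val >> MODEL_TYPE_SHIFT);
}

/* Extract the offset within that swap area. */
static uint64_t model_swp_offset(model_swp_entry_t e)
{
	return e.val & MODEL_OFFSET_MASK;
}

/* True when the raw value happens to equal the per-type index. */
static int model_raw_index_matches(model_swp_entry_t e)
{
	return e.val == model_swp_offset(e);
}

/* Type 0 hides the bug; any non-zero type exposes it. */
void model_self_check(void)
{
	model_swp_entry_t a = model_swp_entry(0, 42);
	model_swp_entry_t b = model_swp_entry(1, 42);

	assert(model_swp_type(a) == 0 && model_swp_offset(a) == 42);
	assert(model_swp_type(b) == 1 && model_swp_offset(b) == 42);
	assert(model_raw_index_matches(a));   /* looks fine with type 0 */
	assert(!model_raw_index_matches(b));  /* wrong slot with type 1 */
}
```

In this model the two entries share an offset but live in different
per-type trees, so a lookup keyed by the raw value would miss the second
one; the offset has to be used as the index while the full value stays
in page_private().
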
> ---
> mm/shmem.c | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index d44991ea5ed4..685faa3e0191 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1509,11 +1509,13 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
> {
> struct page *oldpage, *newpage;
> struct address_space *swap_mapping;
> - pgoff_t swap_index;
> + swp_entry_t entry;
Please keep swap_index as well as adding entry.
> int error;
>
> + VM_BUG_ON(!PageSwapCache(*pagep));
> +
I'd prefer you to drop that, it has no bearing on this patch;
we used to have it, along with lots of other VM_BUG_ONs in here,
but they outlived their usefulness, and don't need reintroducing -
they didn't help at all to prevent the actual bug you've found.
> oldpage = *pagep;
> - swap_index = page_private(oldpage);
> + entry.val = page_private(oldpage);
entry.val = page_private(oldpage);
swap_index = swp_offset(entry);
> swap_mapping = page_mapping(oldpage);
>
> /*
> @@ -1532,7 +1534,7 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
> __SetPageLocked(newpage);
> __SetPageSwapBacked(newpage);
> SetPageUptodate(newpage);
> - set_page_private(newpage, swap_index);
> + set_page_private(newpage, entry.val);
Yes.
> SetPageSwapCache(newpage);
>
> /*
> @@ -1540,7 +1542,8 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
> * a nice clean interface for us to replace oldpage by newpage there.
> */
> xa_lock_irq(&swap_mapping->i_pages);
> - error = shmem_replace_entry(swap_mapping, swap_index, oldpage, newpage);
> + error = shmem_replace_entry(swap_mapping, swp_offset(entry),
> + oldpage, newpage);
I'd prefer to omit that hunk, to avoid the xa_lock_irq() in the context;
the patch is just as good if we keep the swap_index variable.
> if (!error) {
> __inc_node_page_state(newpage, NR_FILE_PAGES);
> __dec_node_page_state(oldpage, NR_FILE_PAGES);
> --
> 2.19.1.1215.g8438c0b245-goog
Thanks,
Hugh
On Mon, Nov 19, 2018 at 02:11:27PM -0800, Hugh Dickins wrote:
> On Sun, 18 Nov 2018, Yu Zhao wrote:
>
> > We used to have a single swap address space with swp_entry_t.val
> > as its radix tree index. This is not the case anymore. Now each
> > swp_type() has its own address space and should use swp_offset()
> > as radix tree index.
> >
> > Signed-off-by: Yu Zhao <[email protected]>
>
> This fix is a great find, thank you! But completely mis-described!
Yes, now I remember: making the swap offset the key was done long after
the per swap device radix tree.
> And could you do a smaller patch, keeping swap_index, that can go to
> stable without getting into trouble with the recent xarrifications?
>
> Fixes: bde05d1ccd51 ("shmem: replace page if mapping excludes its zone")
> Cc: [email protected] # 3.5+
>
> Seems shmem_replace_page() has been wrong since the day I wrote it:
> good enough to work on swap "type" 0, which is all most people ever use
> (especially those few who need shmem_replace_page() at all), but broken
> once there are any non-0 swp_type bits set in the higher order bits.
But you did get it right when you wrote the function, which was before
the per swap device radix tree. So
Fixes: f6ab1f7f6b2d ("mm, swap: use offset of swap entry as key of swap cache")
looks good?
> > ---
> > mm/shmem.c | 11 +++++++----
> > 1 file changed, 7 insertions(+), 4 deletions(-)
> >
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index d44991ea5ed4..685faa3e0191 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -1509,11 +1509,13 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
> > {
> > struct page *oldpage, *newpage;
> > struct address_space *swap_mapping;
> > - pgoff_t swap_index;
> > + swp_entry_t entry;
>
> Please keep swap_index as well as adding entry.
Ack.
> > int error;
> >
> > + VM_BUG_ON(!PageSwapCache(*pagep));
> > +
>
> I'd prefer you to drop that, it has no bearing on this patch;
> we used to have it, along with lots of other VM_BUG_ONs in here,
> but they outlived their usefulness, and don't need reintroducing -
> they didn't help at all to prevent the actual bug you've found.
>
> > oldpage = *pagep;
> > - swap_index = page_private(oldpage);
> > + entry.val = page_private(oldpage);
>
> entry.val = page_private(oldpage);
> swap_index = swp_offset(entry);
>
> > swap_mapping = page_mapping(oldpage);
> >
> > /*
> > @@ -1532,7 +1534,7 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
> > __SetPageLocked(newpage);
> > __SetPageSwapBacked(newpage);
> > SetPageUptodate(newpage);
> > - set_page_private(newpage, swap_index);
> > + set_page_private(newpage, entry.val);
>
> Yes.
>
> > SetPageSwapCache(newpage);
> >
> > /*
> > @@ -1540,7 +1542,8 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
> > * a nice clean interface for us to replace oldpage by newpage there.
> > */
> > xa_lock_irq(&swap_mapping->i_pages);
> > - error = shmem_replace_entry(swap_mapping, swap_index, oldpage, newpage);
> > + error = shmem_replace_entry(swap_mapping, swp_offset(entry),
> > + oldpage, newpage);
>
> I'd prefer to omit that hunk, to avoid the xa_lock_irq() in the context;
> the patch is just as good if we keep the swap_index variable.
>
> > if (!error) {
> > __inc_node_page_state(newpage, NR_FILE_PAGES);
> > __dec_node_page_state(oldpage, NR_FILE_PAGES);
> > --
> > 2.19.1.1215.g8438c0b245-goog
>
> Thanks,
> Hugh
On Mon, 19 Nov 2018, Yu Zhao wrote:
> On Mon, Nov 19, 2018 at 02:11:27PM -0800, Hugh Dickins wrote:
> > On Sun, 18 Nov 2018, Yu Zhao wrote:
> >
> > > We used to have a single swap address space with swp_entry_t.val
> > > as its radix tree index. This is not the case anymore. Now each
> > > swp_type() has its own address space and should use swp_offset()
> > > as radix tree index.
> > >
> > > Signed-off-by: Yu Zhao <[email protected]>
> >
> > This fix is a great find, thank you! But completely mis-described!
>
> Yes, now I remember: making the swap offset the key was done long after
> the per swap device radix tree.
>
> > And could you do a smaller patch, keeping swap_index, that can go to
> > stable without getting into trouble with the recent xarrifications?
> >
> > Fixes: bde05d1ccd51 ("shmem: replace page if mapping excludes its zone")
> > Cc: [email protected] # 3.5+
> >
> > Seems shmem_replace_page() has been wrong since the day I wrote it:
> > good enough to work on swap "type" 0, which is all most people ever use
> > (especially those few who need shmem_replace_page() at all), but broken
> > once there are any non-0 swp_type bits set in the higher order bits.
>
> But you did get it right when you wrote the function, which was before
> the per swap device radix tree. So
> Fixes: f6ab1f7f6b2d ("mm, swap: use offset of swap entry as key of swap cache")
> looks good?
Oh, you're right, thank you. Yes, the fix is to that one, in 4.9 onwards.
I don't much like my original use of the name "swap_index", when it was
not the index in a swapfile (though it was the index in the radix tree);
but it will become a correct name with your patch.
Though Matthew Wilcox seems to want us to avoid saying "radix tree"...
Hugh
On Mon, Nov 19, 2018 at 09:07:27PM -0800, Hugh Dickins wrote:
> I don't much like my original use of the name "swap_index", when it was
> not the index in a swapfile (though it was the index in the radix tree);
> but it will become a correct name with your patch.
>
> Though Matthew Wilcox seems to want us to avoid saying "radix tree"...
Naming is hard ... but the Linux radix tree looks almost nothing like
a classic computer science radix tree. If you try to reconcile our
implementation with the wikipedia article on radix trees, you'll get
very confused.
A lot of places where we were saying 'radix tree' in comments should
really have said 'page cache'. So is this a swap cache index? I'm
not really familiar enough with the swapping code to say.
On Wed, Nov 21, 2018 at 02:54:42PM -0700, Yu Zhao wrote:
> We changed key of swap cache tree from swp_entry_t.val to
> swp_offset. Need to do so in shmem_replace_page() as well.
>
> Fixes: f6ab1f7f6b2d ("mm, swap: use offset of swap entry as key of swap cache")
> Cc: [email protected] # v4.9+
> Signed-off-by: Yu Zhao <[email protected]>
Reviewed-by: Matthew Wilcox <[email protected]>
We changed key of swap cache tree from swp_entry_t.val to
swp_offset. Need to do so in shmem_replace_page() as well.
Fixes: f6ab1f7f6b2d ("mm, swap: use offset of swap entry as key of swap cache")
Cc: [email protected] # v4.9+
Signed-off-by: Yu Zhao <[email protected]>
---
mm/shmem.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/mm/shmem.c b/mm/shmem.c
index d44991ea5ed4..42b70978e814 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1509,11 +1509,13 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
{
struct page *oldpage, *newpage;
struct address_space *swap_mapping;
+ swp_entry_t entry;
pgoff_t swap_index;
int error;
oldpage = *pagep;
- swap_index = page_private(oldpage);
+ entry.val = page_private(oldpage);
+ swap_index = swp_offset(entry);
swap_mapping = page_mapping(oldpage);
/*
@@ -1532,7 +1534,7 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
__SetPageLocked(newpage);
__SetPageSwapBacked(newpage);
SetPageUptodate(newpage);
- set_page_private(newpage, swap_index);
+ set_page_private(newpage, entry.val);
SetPageSwapCache(newpage);
/*
--
2.19.1.1215.g8438c0b245-goog
On Wed, 21 Nov 2018, Yu Zhao wrote:
> We changed key of swap cache tree from swp_entry_t.val to
> swp_offset. Need to do so in shmem_replace_page() as well.
>
> Fixes: f6ab1f7f6b2d ("mm, swap: use offset of swap entry as key of swap cache")
> Cc: [email protected] # v4.9+
> Signed-off-by: Yu Zhao <[email protected]>
Acked-by: Hugh Dickins <[email protected]>
Thanks!
> ---
> mm/shmem.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index d44991ea5ed4..42b70978e814 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1509,11 +1509,13 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
> {
> struct page *oldpage, *newpage;
> struct address_space *swap_mapping;
> + swp_entry_t entry;
> pgoff_t swap_index;
> int error;
>
> oldpage = *pagep;
> - swap_index = page_private(oldpage);
> + entry.val = page_private(oldpage);
> + swap_index = swp_offset(entry);
> swap_mapping = page_mapping(oldpage);
>
> /*
> @@ -1532,7 +1534,7 @@ static int shmem_replace_page(struct page **pagep, gfp_t gfp,
> __SetPageLocked(newpage);
> __SetPageSwapBacked(newpage);
> SetPageUptodate(newpage);
> - set_page_private(newpage, swap_index);
> + set_page_private(newpage, entry.val);
> SetPageSwapCache(newpage);
>
> /*
> --
> 2.19.1.1215.g8438c0b245-goog
On Wed, 21 Nov 2018 14:54:42 -0700 Yu Zhao <[email protected]> wrote:
> We changed key of swap cache tree from swp_entry_t.val to
> swp_offset. Need to do so in shmem_replace_page() as well.
What are the user-visible effects of this change?
> Fixes: f6ab1f7f6b2d ("mm, swap: use offset of swap entry as key of swap cache")
> Cc: [email protected] # v4.9+
Please always provide the user-impact information when fixing bugs. This
becomes especially important when proposing -stable backporting.
Hugh said
: shmem_replace_page() has been wrong since the day I wrote it: good
: enough to work on swap "type" 0, which is all most people ever use
: (especially those few who need shmem_replace_page() at all), but broken
: once there are any non-0 swp_type bits set in the higher order bits.
but we still don't have a description of "broken".
Thanks.
On Wed, 21 Nov 2018, Andrew Morton wrote:
> On Wed, 21 Nov 2018 14:54:42 -0700 Yu Zhao <[email protected]> wrote:
>
> > We changed key of swap cache tree from swp_entry_t.val to
> > swp_offset. Need to do so in shmem_replace_page() as well.
>
> What are the user-visible effects of this change?
Sorry, I don't know; and I don't know whether it's something Yu noticed
by source inspection, or in trying to use it, perhaps in some new way:
I assume the former, but he can add more info if the latter.
shmem_replace_page() was added in 3.5, to allow Intel's GMA500 graphics
driver to use the shmem support for GEM objects, despite its hardware
addressing limitations. (I did once try to extend it to NUMA placement,
but that turned out to be a waste, doing more harm than good.)
My guess at user-visible effects would be that the screen goes blank
or weird on GMA500 after a bout of swapping (because shmem_getpage_gfp()
failed to bring back a part of the object). The code in shmem.c looks
robust (amusing to see my "Is this possible? I think not" comment!),
but I don't know how it plays out at the graphics driver end.
>
> > Fixes: f6ab1f7f6b2d ("mm, swap: use offset of swap entry as key of swap cache")
> > Cc: [email protected] # v4.9+
>
> Please always provide the user-impact information when fixing bugs. This
> becomes especially important when proposing -stable backporting.
So far as I know, nothing but GMA500 gets to use it, but perhaps other
uses have been added since. We could ask around, but, frankly I'd prefer
just to remove the Fixes and Cc stable tags if they're causing trouble.
It's just a good simple bugfix, we thought stable would be glad of it.
>
> Hugh said
>
> : shmem_replace_page() has been wrong since the day I wrote it: good
> : enough to work on swap "type" 0, which is all most people ever use
> : (especially those few who need shmem_replace_page() at all), but broken
> : once there are any non-0 swp_type bits set in the higher order bits.
Yu then kindly restored my honour on that: I was misreading,
it was not broken from the start, but got broken by the commit in 4.9.
>
> but we still don't have a description of "broken".
>
> Thanks.
Hugh