2014-01-27 10:10:06

by Weijie Yang

[permalink] [raw]
Subject: [PATCH 2/8] mm/swap: fix race on swap_info reuse between swapoff and swapon

swapoff clear swap_info's SWP_USED flag prematurely and free its resources
after that. A concurrent swapon will reuse this swap_info while its previous
resources are not cleared completely.

These late freed resources are:
- p->percpu_cluster
- swap_cgroup_ctrl[type]
- block_device setting
- inode->i_flags &= ~S_SWAPFILE

This patch clear SWP_USED flag after all its resources freed, so that swapon
can reuse this swap_info by alloc_swap_info() safely.

This patch is just for a rare scenario, aim to correct of code.

Suggested-by: Heesub Shin <[email protected]>
Suggested-by: Mateusz Guzik <[email protected]>
Signed-off-by: Weijie Yang <[email protected]>
---
mm/swapfile.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 0a623a9..4d24158 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1977,7 +1977,6 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
p->swap_map = NULL;
cluster_info = p->cluster_info;
p->cluster_info = NULL;
- p->flags = 0;
frontswap_map = frontswap_map_get(p);
spin_unlock(&p->lock);
spin_unlock(&swap_lock);
@@ -2003,6 +2002,15 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
mutex_unlock(&inode->i_mutex);
}
filp_close(swap_file, NULL);
+
+ /*
+ * clear SWP_USED flag after all resources freed so that
+ * swapon can reuse this swap_info in alloc_swap_info() safely
+ * it is ok to not hold any lock after we cleared SWP_WRITEOK flag
+ */
+ smp_wmb();
+ p->flags = 0;
+
err = 0;
atomic_inc(&proc_poll_event);
wake_up_interruptible(&proc_poll_wait);
@@ -2050,7 +2058,7 @@ static void *swap_start(struct seq_file *swap, loff_t *pos)
for (type = 0; type < nr_swapfiles; type++) {
smp_rmb(); /* read nr_swapfiles before swap_info[type] */
si = swap_info[type];
- if (!(si->flags & SWP_USED) || !si->swap_map)
+ if (!(si->flags & SWP_WRITEOK))
continue;
if (!--l)
return si;
@@ -2072,7 +2080,7 @@ static void *swap_next(struct seq_file *swap, void *v, loff_t *pos)
for (; type < nr_swapfiles; type++) {
smp_rmb(); /* read nr_swapfiles before swap_info[type] */
si = swap_info[type];
- if (!(si->flags & SWP_USED) || !si->swap_map)
+ if (!(si->flags & SWP_WRITEOK))
continue;
++*pos;
return si;
--
1.7.10.4


2014-02-03 23:23:43

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 2/8] mm/swap: fix race on swap_info reuse between swapoff and swapon

On Mon, 27 Jan 2014 18:03:04 +0800 Weijie Yang <[email protected]> wrote:

> swapoff clear swap_info's SWP_USED flag prematurely and free its resources
> after that. A concurrent swapon will reuse this swap_info while its previous
> resources are not cleared completely.
>
> These late freed resources are:
> - p->percpu_cluster
> - swap_cgroup_ctrl[type]
> - block_device setting
> - inode->i_flags &= ~S_SWAPFILE
>
> This patch clear SWP_USED flag after all its resources freed, so that swapon
> can reuse this swap_info by alloc_swap_info() safely.
>
> This patch is just for a rare scenario, aim to correct of code.

I believe that
http://ozlabs.org/~akpm/mmots/broken-out/mm-swap-fix-race-on-swap_info-reuse-between-swapoff-and-swapon.patch
makes this patch redundant?

2014-02-03 23:29:31

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 2/8] mm/swap: fix race on swap_info reuse between swapoff and swapon

On Mon, 3 Feb 2014 15:23:40 -0800 Andrew Morton <[email protected]> wrote:

> On Mon, 27 Jan 2014 18:03:04 +0800 Weijie Yang <[email protected]> wrote:
>
> > swapoff clear swap_info's SWP_USED flag prematurely and free its resources
> > after that. A concurrent swapon will reuse this swap_info while its previous
> > resources are not cleared completely.
> >
> > These late freed resources are:
> > - p->percpu_cluster
> > - swap_cgroup_ctrl[type]
> > - block_device setting
> > - inode->i_flags &= ~S_SWAPFILE
> >
> > This patch clear SWP_USED flag after all its resources freed, so that swapon
> > can reuse this swap_info by alloc_swap_info() safely.
> >
> > This patch is just for a rare scenario, aim to correct of code.
>
> I believe that
> http://ozlabs.org/~akpm/mmots/broken-out/mm-swap-fix-race-on-swap_info-reuse-between-swapoff-and-swapon.patch
> makes this patch redundant?
>

oop, hang on. This patch *is* a stealth-updated version of
http://ozlabs.org/~akpm/mmots/broken-out/mm-swap-fix-race-on-swap_info-reuse-between-swapoff-and-swapon.patch.

Undocumented removals of si->swap_map have been added. What's going on
there?

I think I'll stick with the original patch for now. If you see
additional optimisations or changes, let's address that separately?

2014-02-04 04:14:55

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH 2/8] mm/swap: fix race on swap_info reuse between swapoff and swapon

On Mon, 3 Feb 2014, Andrew Morton wrote:
> On Mon, 3 Feb 2014 15:23:40 -0800 Andrew Morton <[email protected]> wrote:
> > On Mon, 27 Jan 2014 18:03:04 +0800 Weijie Yang <[email protected]> wrote:
> >
> > > swapoff clear swap_info's SWP_USED flag prematurely and free its resources
> > > after that. A concurrent swapon will reuse this swap_info while its previous
> > > resources are not cleared completely.
> > >
> > > These late freed resources are:
> > > - p->percpu_cluster
> > > - swap_cgroup_ctrl[type]
> > > - block_device setting
> > > - inode->i_flags &= ~S_SWAPFILE
> > >
> > > This patch clear SWP_USED flag after all its resources freed, so that swapon
> > > can reuse this swap_info by alloc_swap_info() safely.
> > >
> > > This patch is just for a rare scenario, aim to correct of code.
> >
> > I believe that
> > http://ozlabs.org/~akpm/mmots/broken-out/mm-swap-fix-race-on-swap_info-reuse-between-swapoff-and-swapon.patch
> > makes this patch redundant?
> >
>
> oop, hang on. This patch *is* a stealth-updated version of
> http://ozlabs.org/~akpm/mmots/broken-out/mm-swap-fix-race-on-swap_info-reuse-between-swapoff-and-swapon.patch.
>
> Undocumented removals of si->swap_map have been added. What's going on
> there?
>
> I think I'll stick with the original patch for now. If you see
> additional optimisations or changes, let's address that separately?

Correct decision, thanks: I explained in an answer when Acking the
previous version why I dislike this version (it would prevent you
from watching the slow progress of swapoff).

Hugh