2021-07-07 22:13:46

by Hugh Dickins

[permalink] [raw]
Subject: [PATCH] fs, mm: fix race in unlinking swapfile

We had a recurring situation in which admin procedures setting up
swapfiles would race with test preparation clearing away swapfiles;
and just occasionally that got stuck on a swapfile "(deleted)" which
could never be swapped off. That is not supposed to be possible.

2.6.28 commit f9454548e17c ("don't unlink an active swapfile") admitted
that it was leaving a race window open: now close it.

may_delete() makes the IS_SWAPFILE check (amongst many others) before
inode_lock has been taken on target: now repeat just that simple check
in vfs_unlink() and vfs_rename(), after taking inode_lock.

Which goes most of the way to fixing the race, but swapon() must also
check after it acquires inode_lock, that the file just opened has not
already been unlinked.

Fixes: f9454548e17c ("don't unlink an active swapfile")
Signed-off-by: Hugh Dickins <[email protected]>
---
fs/namei.c | 8 +++++++-
mm/swapfile.c | 6 ++++++
2 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/fs/namei.c b/fs/namei.c
index bf6d8a738c59..ff866c07f4d2 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -4024,7 +4024,9 @@ int vfs_unlink(struct user_namespace *mnt_userns, struct inode *dir,
return -EPERM;

inode_lock(target);
- if (is_local_mountpoint(dentry))
+ if (IS_SWAPFILE(target))
+ error = -EPERM;
+ else if (is_local_mountpoint(dentry))
error = -EBUSY;
else {
error = security_inode_unlink(dir, dentry);
@@ -4526,6 +4528,10 @@ int vfs_rename(struct renamedata *rd)
else if (target)
inode_lock(target);

+ error = -EPERM;
+ if (IS_SWAPFILE(source) || (target && IS_SWAPFILE(target)))
+ goto out;
+
error = -EBUSY;
if (is_local_mountpoint(old_dentry) || is_local_mountpoint(new_dentry))
goto out;
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 1e07d1c776f2..7527afd95284 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3130,6 +3130,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
struct filename *name;
struct file *swap_file = NULL;
struct address_space *mapping;
+ struct dentry *dentry;
int prio;
int error;
union swap_header *swap_header;
@@ -3173,6 +3174,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)

p->swap_file = swap_file;
mapping = swap_file->f_mapping;
+ dentry = swap_file->f_path.dentry;
inode = mapping->host;

error = claim_swapfile(p, inode);
@@ -3180,6 +3182,10 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
goto bad_swap;

inode_lock(inode);
+ if (d_unlinked(dentry) || cant_mount(dentry)) {
+ error = -ENOENT;
+ goto bad_swap_unlock_inode;
+ }
if (IS_SWAPFILE(inode)) {
error = -EBUSY;
goto bad_swap_unlock_inode;
--
2.26.2


2021-07-12 09:10:53

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] fs, mm: fix race in unlinking swapfile

On Wed 07-07-21 13:48:31, Hugh Dickins wrote:
> We had a recurring situation in which admin procedures setting up
> swapfiles would race with test preparation clearing away swapfiles;
> and just occasionally that got stuck on a swapfile "(deleted)" which
> could never be swapped off. That is not supposed to be possible.
>
> 2.6.28 commit f9454548e17c ("don't unlink an active swapfile") admitted
> that it was leaving a race window open: now close it.
>
> may_delete() makes the IS_SWAPFILE check (amongst many others) before
> inode_lock has been taken on target: now repeat just that simple check
> in vfs_unlink() and vfs_rename(), after taking inode_lock.
>
> Which goes most of the way to fixing the race, but swapon() must also
> check after it acquires inode_lock, that the file just opened has not
> already been unlinked.
>
> Fixes: f9454548e17c ("don't unlink an active swapfile")
> Signed-off-by: Hugh Dickins <[email protected]>

Looks good to me. Feel free to add:

Reviewed-by: Jan Kara <[email protected]>

Honza

> ---
> fs/namei.c | 8 +++++++-
> mm/swapfile.c | 6 ++++++
> 2 files changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index bf6d8a738c59..ff866c07f4d2 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -4024,7 +4024,9 @@ int vfs_unlink(struct user_namespace *mnt_userns, struct inode *dir,
> return -EPERM;
>
> inode_lock(target);
> - if (is_local_mountpoint(dentry))
> + if (IS_SWAPFILE(target))
> + error = -EPERM;
> + else if (is_local_mountpoint(dentry))
> error = -EBUSY;
> else {
> error = security_inode_unlink(dir, dentry);
> @@ -4526,6 +4528,10 @@ int vfs_rename(struct renamedata *rd)
> else if (target)
> inode_lock(target);
>
> + error = -EPERM;
> + if (IS_SWAPFILE(source) || (target && IS_SWAPFILE(target)))
> + goto out;
> +
> error = -EBUSY;
> if (is_local_mountpoint(old_dentry) || is_local_mountpoint(new_dentry))
> goto out;
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 1e07d1c776f2..7527afd95284 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -3130,6 +3130,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
> struct filename *name;
> struct file *swap_file = NULL;
> struct address_space *mapping;
> + struct dentry *dentry;
> int prio;
> int error;
> union swap_header *swap_header;
> @@ -3173,6 +3174,7 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
>
> p->swap_file = swap_file;
> mapping = swap_file->f_mapping;
> + dentry = swap_file->f_path.dentry;
> inode = mapping->host;
>
> error = claim_swapfile(p, inode);
> @@ -3180,6 +3182,10 @@ SYSCALL_DEFINE2(swapon, const char __user *, specialfile, int, swap_flags)
> goto bad_swap;
>
> inode_lock(inode);
> + if (d_unlinked(dentry) || cant_mount(dentry)) {
> + error = -ENOENT;
> + goto bad_swap_unlock_inode;
> + }
> if (IS_SWAPFILE(inode)) {
> error = -EBUSY;
> goto bad_swap_unlock_inode;
> --
> 2.26.2
>
--
Jan Kara <[email protected]>
SUSE Labs, CR