2020-05-04 16:18:46

by Jia-Ju Bai

Subject: [PATCH] fs: xfs: fix a possible data race in xfs_inode_set_reclaim_tag()

We find that xfs_inode_set_reclaim_tag() and xfs_reclaim_inode() are
concurrently executed at runtime in the following call contexts:

Thread1:
  xfs_fs_put_super()
    xfs_unmountfs()
      xfs_rtunmount_inodes()
        xfs_irele()
          xfs_fs_destroy_inode()
            xfs_inode_set_reclaim_tag()

Thread2:
  xfs_reclaim_worker()
    xfs_reclaim_inodes()
      xfs_reclaim_inodes_ag()
        xfs_reclaim_inode()

In xfs_inode_set_reclaim_tag():
pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
...
spin_lock(&ip->i_flags_lock);

In xfs_reclaim_inode():
spin_lock(&ip->i_flags_lock);
...
ip->i_ino = 0;
spin_unlock(&ip->i_flags_lock);

Thus, a data race can occur for ip->i_ino.

To fix this data race, the spinlock ip->i_flags_lock is used to protect
the access to ip->i_ino in xfs_inode_set_reclaim_tag().

This data race is found by our concurrency fuzzer.

Signed-off-by: Jia-Ju Bai <[email protected]>
---
fs/xfs/xfs_icache.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 8bf1d15be3f6..a2de08222ff5 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -229,9 +229,9 @@ xfs_inode_set_reclaim_tag(
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_perag	*pag;

+	spin_lock(&ip->i_flags_lock);
 	pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
 	spin_lock(&pag->pag_ici_lock);
-	spin_lock(&ip->i_flags_lock);

 	radix_tree_tag_set(&pag->pag_ici_root, XFS_INO_TO_AGINO(mp, ip->i_ino),
 			   XFS_ICI_RECLAIM_TAG);
--
2.17.1


2020-05-04 21:29:29

by Dave Chinner

Subject: Re: [PATCH] fs: xfs: fix a possible data race in xfs_inode_set_reclaim_tag()

On Tue, May 05, 2020 at 12:15:30AM +0800, Jia-Ju Bai wrote:
> We find that xfs_inode_set_reclaim_tag() and xfs_reclaim_inode() are
> concurrently executed at runtime in the following call contexts:
>
> Thread1:
>   xfs_fs_put_super()
>     xfs_unmountfs()
>       xfs_rtunmount_inodes()
>         xfs_irele()
>           xfs_fs_destroy_inode()
>             xfs_inode_set_reclaim_tag()
>
> Thread2:
>   xfs_reclaim_worker()
>     xfs_reclaim_inodes()
>       xfs_reclaim_inodes_ag()
>         xfs_reclaim_inode()
>
> In xfs_inode_set_reclaim_tag():
> pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
> ...
> spin_lock(&ip->i_flags_lock);
>
> In xfs_reclaim_inode():
> spin_lock(&ip->i_flags_lock);
> ...
> ip->i_ino = 0;
> spin_unlock(&ip->i_flags_lock);
>
> Thus, a data race can occur for ip->i_ino.
>
> To fix this data race, the spinlock ip->i_flags_lock is used to protect
> the access to ip->i_ino in xfs_inode_set_reclaim_tag().
>
> This data race is found by our concurrency fuzzer.

This data race cannot happen.

xfs_reclaim_inode() will not be called on this inode until -after-
the XFS_ICI_RECLAIM_TAG is set in the radix tree for this inode, and
setting that is protected by the i_flags_lock.

So while the xfs_perag_get() call doesn't lock the ip->i_ino access,
there are -multiple- i_flags_lock lock/unlock cycles before
ip->i_ino is cleared in the reclaim worker. Hence there is a full
unlock->lock memory barrier for the ip->i_ino reset inside the
critical section vs xfs_inode_set_reclaim_tag().

Hence even if the reclaim worker could access the inode before the
XFS_ICI_RECLAIM_TAG is set, no data race exists here.

> Signed-off-by: Jia-Ju Bai <[email protected]>
> ---
> fs/xfs/xfs_icache.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index 8bf1d15be3f6..a2de08222ff5 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -229,9 +229,9 @@ xfs_inode_set_reclaim_tag(
>  	struct xfs_mount	*mp = ip->i_mount;
>  	struct xfs_perag	*pag;
>
> +	spin_lock(&ip->i_flags_lock);
>  	pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
>  	spin_lock(&pag->pag_ici_lock);
> -	spin_lock(&ip->i_flags_lock);

Also, this creates a lock inversion deadlock here with
xfs_iget_cache_hit() clearing the XFS_IRECLAIMABLE flag.

Cheers,

Dave.
--
Dave Chinner
[email protected]
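
The lock inversion raised in the reply can be sketched in userspace C.
This is only an illustrative model, not kernel code: the enum values,
struct order_graph, record_nested() and set_reclaim_tag_inverts() are
hypothetical names, and the nesting order attributed to
xfs_iget_cache_hit() (pag_ici_lock taken first, i_flags_lock inside it)
is an assumption that the thread itself does not spell out. The model
records which lock is taken while another is held and reports when two
paths nest the same pair in opposite orders.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes standing in for the two spinlocks discussed. */
enum lock_class { PAG_ICI_LOCK, I_FLAGS_LOCK, NR_LOCK_CLASSES };

/* taken_inside[a][b] is true once some path took b while holding a. */
struct order_graph {
	bool taken_inside[NR_LOCK_CLASSES][NR_LOCK_CLASSES];
};

/*
 * Record that a code path takes 'outer' and then 'inner' while still
 * holding 'outer'. Returns true if the opposite nesting is already
 * recorded, i.e. the two paths form an AB-BA inversion.
 */
bool record_nested(struct order_graph *g,
		   enum lock_class outer, enum lock_class inner)
{
	g->taken_inside[outer][inner] = true;
	return g->taken_inside[inner][outer];
}

/*
 * Check a proposed nesting for xfs_inode_set_reclaim_tag() against the
 * order ASSUMED here for xfs_iget_cache_hit().
 */
bool set_reclaim_tag_inverts(enum lock_class outer, enum lock_class inner)
{
	struct order_graph g;

	memset(&g, 0, sizeof(g));

	/* Assumed existing path: pag_ici_lock, then i_flags_lock inside it. */
	record_nested(&g, PAG_ICI_LOCK, I_FLAGS_LOCK);

	/* Path under test. */
	return record_nested(&g, outer, inner);
}
```

Under that assumption, set_reclaim_tag_inverts(I_FLAGS_LOCK, PAG_ICI_LOCK)
models the patched ordering and reports an inversion, while
set_reclaim_tag_inverts(PAG_ICI_LOCK, I_FLAGS_LOCK) models the unpatched
ordering and does not.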