We find that xfs_inode_set_reclaim_tag() and xfs_reclaim_inode() are
concurrently executed at runtime in the following call contexts:

Thread1:
xfs_fs_put_super()
xfs_unmountfs()
xfs_rtunmount_inodes()
xfs_irele()
xfs_fs_destroy_inode()
xfs_inode_set_reclaim_tag()

Thread2:
xfs_reclaim_worker()
xfs_reclaim_inodes()
xfs_reclaim_inodes_ag()
xfs_reclaim_inode()

In xfs_inode_set_reclaim_tag():
pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
...
spin_lock(&ip->i_flags_lock);

In xfs_reclaim_inode():
spin_lock(&ip->i_flags_lock);
...
ip->i_ino = 0;
spin_unlock(&ip->i_flags_lock);

Thus, a data race can occur for ip->i_ino.

To fix this data race, the spinlock ip->i_flags_lock is used to protect
the access to ip->i_ino in xfs_inode_set_reclaim_tag().

This data race is found by our concurrency fuzzer.
Signed-off-by: Jia-Ju Bai <[email protected]>
---
fs/xfs/xfs_icache.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 8bf1d15be3f6..a2de08222ff5 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -229,9 +229,9 @@ xfs_inode_set_reclaim_tag(
struct xfs_mount *mp = ip->i_mount;
struct xfs_perag *pag;
+ spin_lock(&ip->i_flags_lock);
pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
spin_lock(&pag->pag_ici_lock);
- spin_lock(&ip->i_flags_lock);
radix_tree_tag_set(&pag->pag_ici_root, XFS_INO_TO_AGINO(mp, ip->i_ino),
XFS_ICI_RECLAIM_TAG);
--
2.17.1
On Tue, May 05, 2020 at 12:15:30AM +0800, Jia-Ju Bai wrote:
> We find that xfs_inode_set_reclaim_tag() and xfs_reclaim_inode() are
> concurrently executed at runtime in the following call contexts:
>
> Thread1:
> xfs_fs_put_super()
> xfs_unmountfs()
> xfs_rtunmount_inodes()
> xfs_irele()
> xfs_fs_destroy_inode()
> xfs_inode_set_reclaim_tag()
>
> Thread2:
> xfs_reclaim_worker()
> xfs_reclaim_inodes()
> xfs_reclaim_inodes_ag()
> xfs_reclaim_inode()
>
> In xfs_inode_set_reclaim_tag():
> pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
> ...
> spin_lock(&ip->i_flags_lock);
>
> In xfs_reclaim_inode():
> spin_lock(&ip->i_flags_lock);
> ...
> ip->i_ino = 0;
> spin_unlock(&ip->i_flags_lock);
>
> Thus, a data race can occur for ip->i_ino.
>
> To fix this data race, the spinlock ip->i_flags_lock is used to protect
> the access to ip->i_ino in xfs_inode_set_reclaim_tag().
>
> This data race is found by our concurrency fuzzer.
This data race cannot happen.
xfs_reclaim_inode() will not be called on this inode until -after-
the XFS_ICI_RECLAIM_TAG is set in the radix tree for this inode, and
setting that is protected by the i_flags_lock.
So while the xfs_perag_get() call doesn't lock the ip->i_ino access,
there are -multiple- i_flags_lock lock/unlock cycles before
ip->i_ino is cleared in the reclaim worker. Hence there is a full
unlock->lock memory barrier for the ip->i_ino reset inside the
critical section vs xfs_inode_set_reclaim_tag().
Hence even if the reclaim worker could access the inode before the
XFS_ICI_RECLAIM_TAG is set, no data race exists here.
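
To make the ordering concrete, here is a rough sketch of the two paths
as described above (illustrative only, not verbatim kernel code; the
unlocks are implied by the existing code, not shown in the diff):

	/* Thread1: xfs_inode_set_reclaim_tag() */
	pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino)); /* unlocked i_ino read */
	spin_lock(&pag->pag_ici_lock);
	spin_lock(&ip->i_flags_lock);
	radix_tree_tag_set(&pag->pag_ici_root,
			XFS_INO_TO_AGINO(mp, ip->i_ino), XFS_ICI_RECLAIM_TAG);
	spin_unlock(&ip->i_flags_lock);		/* release */
	spin_unlock(&pag->pag_ici_lock);

	/* Thread2: xfs_reclaim_inode(), only reached once the tag above is set */
	spin_lock(&ip->i_flags_lock);		/* acquire, pairs with the unlock above */
	...
	ip->i_ino = 0;				/* ordered after Thread1's i_ino read */
	spin_unlock(&ip->i_flags_lock);

The unlock->lock pairing on i_flags_lock provides the ordering, so the
unlocked read in the xfs_perag_get() call cannot race with the reset.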
> Signed-off-by: Jia-Ju Bai <[email protected]>
> ---
> fs/xfs/xfs_icache.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index 8bf1d15be3f6..a2de08222ff5 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
> @@ -229,9 +229,9 @@ xfs_inode_set_reclaim_tag(
> struct xfs_mount *mp = ip->i_mount;
> struct xfs_perag *pag;
>
> + spin_lock(&ip->i_flags_lock);
> pag = xfs_perag_get(mp, XFS_INO_TO_AGNO(mp, ip->i_ino));
> spin_lock(&pag->pag_ici_lock);
> - spin_lock(&ip->i_flags_lock);
Also, this creates a lock inversion deadlock here with
xfs_iget_cache_hit() clearing the XFS_IRECLAIMABLE flag.
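
Roughly, the inversion is the classic AB-BA pattern (simplified sketch,
assuming the existing pag_ici_lock -> i_flags_lock ordering in the
xfs_iget_cache_hit() recycle path):

	/* patched xfs_inode_set_reclaim_tag() */
	spin_lock(&ip->i_flags_lock);		/* A */
	spin_lock(&pag->pag_ici_lock);		/* B */

	/* xfs_iget_cache_hit() clearing XFS_IRECLAIMABLE */
	spin_lock(&pag->pag_ici_lock);		/* B */
	spin_lock(&ip->i_flags_lock);		/* A */

Once each side holds its first lock, neither can take its second, and
both spin forever.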
Cheers,
Dave.
--
Dave Chinner
[email protected]