2023-05-03 18:25:25

by Jan Kara

Subject: [PATCH] ext4: Fix data races when using cached status extents

When using the cached extent stored in the extent status tree in
tree->cache_es, another process holding ei->i_es_lock for reading can
race with us by setting a new value of tree->cache_es. If the compiler
decides to refetch tree->cache_es at an unfortunate moment, the
in_range() check could mix fields of two different extents and give a
bogus result. Fix the possible race by using READ_ONCE() wherever
tree->cache_es is used with ei->i_es_lock held only for reading.

Reported-by: [email protected]
Link: https://lore.kernel.org/all/[email protected]
Suggested-by: Dmitry Vyukov <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
---
fs/ext4/extents_status.c | 28 ++++++++++++----------------
1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 7bc221038c6c..ca2cb926894f 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -267,14 +267,12 @@ static void __es_find_extent_range(struct inode *inode,

 	/* see if the extent has been cached */
 	es->es_lblk = es->es_len = es->es_pblk = 0;
-	if (tree->cache_es) {
-		es1 = tree->cache_es;
-		if (in_range(lblk, es1->es_lblk, es1->es_len)) {
-			es_debug("%u cached by [%u/%u) %llu %x\n",
-				 lblk, es1->es_lblk, es1->es_len,
-				 ext4_es_pblock(es1), ext4_es_status(es1));
-			goto out;
-		}
+	es1 = READ_ONCE(tree->cache_es);
+	if (es1 && in_range(lblk, es1->es_lblk, es1->es_len)) {
+		es_debug("%u cached by [%u/%u) %llu %x\n",
+			 lblk, es1->es_lblk, es1->es_len,
+			 ext4_es_pblock(es1), ext4_es_status(es1));
+		goto out;
 	}
 
 	es1 = __es_tree_search(&tree->root, lblk);
@@ -931,14 +929,12 @@ int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,

 	/* find extent in cache firstly */
 	es->es_lblk = es->es_len = es->es_pblk = 0;
-	if (tree->cache_es) {
-		es1 = tree->cache_es;
-		if (in_range(lblk, es1->es_lblk, es1->es_len)) {
-			es_debug("%u cached by [%u/%u)\n",
-				 lblk, es1->es_lblk, es1->es_len);
-			found = 1;
-			goto out;
-		}
+	es1 = READ_ONCE(tree->cache_es);
+	if (es1 && in_range(lblk, es1->es_lblk, es1->es_len)) {
+		es_debug("%u cached by [%u/%u)\n",
+			 lblk, es1->es_lblk, es1->es_len);
+		found = 1;
+		goto out;
 	}
 
 	node = tree->root.rb_node;
--
2.35.3
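
The read-side pattern in this patch can be sketched outside the kernel as follows. This is a minimal userspace illustration, not the ext4 code: the READ_ONCE() macro here is a simplified stand-in built on a volatile access (assuming GNU C `__typeof__`), and `struct extent`, `struct es_tree`, and `cache_lookup()` are hypothetical, reduced versions of the real structures. The point it shows is that the pointer is loaded once into a local, so the NULL check and both field reads are guaranteed to refer to the same extent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the kernel's READ_ONCE(): the volatile access
 * tells the compiler to perform exactly one load and not to refetch.
 * (Assumption: GNU C __typeof__ is available.) */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Hypothetical, reduced versions of the ext4 structures. */
struct extent {
	uint32_t lblk;	/* first logical block covered */
	uint32_t len;	/* number of blocks covered */
};

struct es_tree {
	struct extent *cache_es;	/* may be changed by concurrent readers */
};

/* True if blk lies in [first, first + len). */
static int in_range(uint32_t blk, uint32_t first, uint32_t len)
{
	return blk >= first && blk - first < len;
}

/* Return the cached extent if it covers lblk, otherwise NULL. */
static const struct extent *cache_lookup(struct es_tree *tree, uint32_t lblk)
{
	struct extent *es1;

	assert(tree != NULL);

	/* One snapshot of the pointer; all later uses go through es1. */
	es1 = READ_ONCE(tree->cache_es);
	if (es1 && in_range(lblk, es1->lblk, es1->len))
		return es1;
	return NULL;
}
```

With a plain `tree->cache_es` access the compiler would be allowed to reload the pointer between the NULL check and the field reads, which is the refetch the commit message describes.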


2023-05-04 02:57:08

by Theodore Ts'o

Subject: Re: [PATCH] ext4: Fix data races when using cached status extents

On Wed, May 03, 2023 at 08:21:28PM +0200, Jan Kara wrote:
> When using cached extent stored in extent status tree in tree->cache_es
> another process holding ei->i_es_lock for reading can be racing with us
> setting new value of tree->cache_es. If the compiler would decide to
> refetch tree->cache_es at an unfortunate moment, it could result in a
> bogus in_range() check. Fix the possible race by using READ_ONCE() when
> using tree->cache_es only under ei->i_es_lock for reading.
>
> Reported-by: [email protected]
> Link: https://lore.kernel.org/all/[email protected]
> Suggested-by: Dmitry Vyukov <[email protected]>
> Signed-off-by: Jan Kara <[email protected]>

Don't we also need a WRITE_ONCE here?

diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 7bc221038c6c..4694582cf255 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -293,7 +293,7 @@ static void __es_find_extent_range(struct inode *inode,
 	}
 
 	if (es1 && matching_fn(es1)) {
-		tree->cache_es = es1;
+		WRITE_ONCE(tree->cache_es, es1);
 		es->es_lblk = es1->es_lblk;
 		es->es_len = es1->es_len;
 		es->es_pblk = es1->es_pblk;

- Ted
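
For readers following along, the pairing suggested here can be sketched in userspace C. This is only an illustration under stated assumptions: both macros are simplified volatile-based stand-ins for the kernel ones (assuming GNU C `__typeof__`), and `cache_update()` and `cache_peek()` are hypothetical helpers, not functions in fs/ext4. The marked store keeps the compiler from splitting or duplicating the write, and the marked load keeps it from refetching; neither adds any memory-ordering barrier.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel macros (assumption: GNU C). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical, reduced versions of the ext4 structures. */
struct extent {
	unsigned int lblk;
	unsigned int len;
};

struct es_tree {
	struct extent *cache_es;
};

/* Writer side: publish the new cached extent with one marked store. */
static void cache_update(struct es_tree *tree, struct extent *es1)
{
	assert(tree != NULL);
	WRITE_ONCE(tree->cache_es, es1);
}

/* Reader side: take one marked snapshot of the cached pointer. */
static struct extent *cache_peek(struct es_tree *tree)
{
	assert(tree != NULL);
	return READ_ONCE(tree->cache_es);
}
```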

2023-05-04 10:59:25

by Jan Kara

Subject: Re: [PATCH] ext4: Fix data races when using cached status extents

On Wed 03-05-23 22:41:57, Theodore Ts'o wrote:
> On Wed, May 03, 2023 at 08:21:28PM +0200, Jan Kara wrote:
> > When using cached extent stored in extent status tree in tree->cache_es
> > another process holding ei->i_es_lock for reading can be racing with us
> > setting new value of tree->cache_es. If the compiler would decide to
> > refetch tree->cache_es at an unfortunate moment, it could result in a
> > bogus in_range() check. Fix the possible race by using READ_ONCE() when
> > using tree->cache_es only under ei->i_es_lock for reading.
> >
> > Reported-by: [email protected]
> > Link: https://lore.kernel.org/all/[email protected]
> > Suggested-by: Dmitry Vyukov <[email protected]>
> > Signed-off-by: Jan Kara <[email protected]>
>
> Don't we also need a WRITE_ONCE here?

Right, we should do that as well. I'll update the patch.

Honza

>
> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
> index 7bc221038c6c..4694582cf255 100644
> --- a/fs/ext4/extents_status.c
> +++ b/fs/ext4/extents_status.c
> @@ -293,7 +293,7 @@ static void __es_find_extent_range(struct inode *inode,
>  	}
>  
>  	if (es1 && matching_fn(es1)) {
> -		tree->cache_es = es1;
> +		WRITE_ONCE(tree->cache_es, es1);
>  		es->es_lblk = es1->es_lblk;
>  		es->es_len = es1->es_len;
>  		es->es_pblk = es1->es_pblk;
>
> - Ted
--
Jan Kara <[email protected]>
SUSE Labs, CR