A data race occurs when two concurrent code paths access
fuse_conn->num_background simultaneously. Specifically,
fuse_request_end() reads and modifies ->num_background while holding
bg_lock, whereas fuse_readahead() reads ->num_background without
acquiring any lock beforehand. KCSAN flags this potential data race:
BUG: KCSAN: data-race in fuse_readahead [fuse] / fuse_request_end [fuse]
read-write to 0xffff8883a6666598 of 4 bytes by task 113809 on cpu 39:
fuse_request_end (fs/fuse/dev.c:318) fuse
fuse_dev_do_write (fs/fuse/dev.c:?) fuse
fuse_dev_write (fs/fuse/dev.c:?) fuse
...
read to 0xffff8883a6666598 of 4 bytes by task 113787 on cpu 8:
fuse_readahead (fs/fuse/file.c:1005) fuse
read_pages (mm/readahead.c:166)
page_cache_ra_unbounded (mm/readahead.c:?)
...
value changed: 0x00000001 -> 0x00000000
Annotate the reader with READ_ONCE() and the writer with WRITE_ONCE()
to silence such KCSAN reports.
Suggested-by: Miklos Szeredi <[email protected]>
Signed-off-by: Breno Leitao <[email protected]>
---
fs/fuse/dev.c | 6 ++++--
fs/fuse/file.c | 2 +-
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 3ec8bb5e68ff..8e63dba49eff 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -282,6 +282,7 @@ void fuse_request_end(struct fuse_req *req)
struct fuse_mount *fm = req->fm;
struct fuse_conn *fc = fm->fc;
struct fuse_iqueue *fiq = &fc->iq;
+ unsigned int num_background;
if (test_and_set_bit(FR_FINISHED, &req->flags))
goto put_request;
@@ -301,7 +302,8 @@ void fuse_request_end(struct fuse_req *req)
if (test_bit(FR_BACKGROUND, &req->flags)) {
spin_lock(&fc->bg_lock);
clear_bit(FR_BACKGROUND, &req->flags);
- if (fc->num_background == fc->max_background) {
+ num_background = READ_ONCE(fc->num_background);
+ if (num_background == fc->max_background) {
fc->blocked = 0;
wake_up(&fc->blocked_waitq);
} else if (!fc->blocked) {
@@ -315,7 +317,7 @@ void fuse_request_end(struct fuse_req *req)
wake_up(&fc->blocked_waitq);
}
- fc->num_background--;
+ WRITE_ONCE(fc->num_background, num_background - 1);
fc->active_background--;
flush_bg_queue(fc);
spin_unlock(&fc->bg_lock);
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index b57ce4157640..07331889bbf3 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1002,7 +1002,7 @@ static void fuse_readahead(struct readahead_control *rac)
struct fuse_io_args *ia;
struct fuse_args_pages *ap;
- if (fc->num_background >= fc->congestion_threshold &&
+ if (READ_ONCE(fc->num_background) >= fc->congestion_threshold &&
rac->ra->async_size >= readahead_count(rac))
/*
* Congested and only async pages left, so skip the
--
2.43.0
On Thu, 9 May 2024 at 14:57, Breno Leitao <[email protected]> wrote:
> Annotate the reader with READ_ONCE() and the writer with WRITE_ONCE()
> to silence such KCSAN reports.
I'm not sure the write side part is really needed, since the lock is
properly protecting against concurrent readers/writers within the
locked region.
Does KCSAN still complain if you just add the READ_ONCE() to fuse_readahead()?
Thanks,
Miklos
Hello Miklos,
On Fri, May 10, 2024 at 11:21:19AM +0200, Miklos Szeredi wrote:
> On Thu, 9 May 2024 at 14:57, Breno Leitao <[email protected]> wrote:
>
> > Annotate the reader with READ_ONCE() and the writer with WRITE_ONCE()
> > to silence such KCSAN reports.
>
> I'm not sure the write side part is really needed, since the lock is
> properly protecting against concurrent readers/writers within the
> locked region.
I understand that num_background is read from an unlocked region
(fuse_readahead()).
> Does KCSAN still complain if you just add the READ_ONCE() to fuse_readahead()?
I haven't checked, but, looking at the documentation, it says that both
sides need to be marked. Here is an example very similar to ours, from
tools/memory-model/Documentation/access-marking.txt:
Lock-Protected Writes With Lockless Reads
-----------------------------------------
For another example, suppose a shared variable "foo" is updated only
while holding a spinlock, but is read locklessly. The code might look
as follows:
int foo;
DEFINE_SPINLOCK(foo_lock);
void update_foo(int newval)
{
spin_lock(&foo_lock);
WRITE_ONCE(foo, newval);
ASSERT_EXCLUSIVE_WRITER(foo);
do_something(newval);
spin_unlock(&foo_lock);
}
int read_foo(void)
{
do_something_else();
return READ_ONCE(foo);
}
Because foo is read locklessly, all accesses are marked.
From my understanding, we need a WRITE_ONCE() inside the lock, because
the bg_lock lock in fuse_request_end() is invisible to fuse_readahead(),
and fuse_readahead() might read a num_background value that was written
non-atomically/torn (if there is no WRITE_ONCE()).
That said, if the reader (fuse_readahead()) can handle possibly
corrupted data, we can mark it with the data_race() annotation. Then I
understand we don't need to mark the write with WRITE_ONCE().
Here is what access-marking.txt says about this case:
Here are some situations where data_race() should be used instead of
READ_ONCE() and WRITE_ONCE():
1. Data-racy loads from shared variables whose values are used only
for diagnostic purposes.
2. Data-racy reads whose values are checked against marked reload.
3. Reads whose values feed into error-tolerant heuristics.
4. Writes setting values that feed into error-tolerant heuristics.
Anyway, I am more than happy to test with only a READ_ONCE() on the
reader side, if that is the approach you prefer.
Thanks!
On Mon, 13 May 2024 at 14:41, Breno Leitao <[email protected]> wrote:
> That said, if the reader (fuse_readahead()) can handle possibly
> corrupted data, we can mark it with the data_race() annotation. Then I
> understand we don't need to mark the write with WRITE_ONCE().
Adding Willy, since the readahead code in fuse is fairly special.
I don't think it actually matters if "fc->num_background >=
fc->congestion_threshold" returns a false positive or a false negative,
but I don't have a full understanding of how readahead works.
Willy, can you please look at fuse_readahead() to confirm that
breaking out of the loop is okay if (rac->ra->async_size >=
readahead_count(rac)), no matter what?
Thanks,
Miklos