2022-07-28 15:58:32

by Siddh Raman Pant

[permalink] [raw]
Subject: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

If not done, a reference to a freed pipe remains in the watch_queue,
as this function is called before freeing a pipe in free_pipe_info()
(see line 834 of fs/pipe.c).

This causes a UAF when post_one_notification() tries to access the pipe
on a key update, which is reported by syzbot.

We also need to use READ_ONCE() in post_one_notification() to prevent the
compiler from optimising and loading a non-NULL value from wqueue->pipe.

Bug report: https://syzkaller.appspot.com/bug?id=1870dd7791ba05f2ea7f47f7cbdde701173973fc
Reported-and-tested-by: [email protected]

Signed-off-by: Siddh Raman Pant <[email protected]>
---
Changes in v3:
- Restore the original unlock order, and clear before unlock.
- Use READ_ONCE() in post path.

This was explained by David Howells <[email protected]> in
reply to v1. Not added Suggested-by since he didn't reply yet.

Changes in v2:
- Removed the superfluous ifdef guard.

kernel/watch_queue.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c
index bb9962b33f95..617425e34252 100644
--- a/kernel/watch_queue.c
+++ b/kernel/watch_queue.c
@@ -99,7 +99,7 @@ static bool post_one_notification(struct watch_queue *wqueue,
struct watch_notification *n)
{
void *p;
- struct pipe_inode_info *pipe = wqueue->pipe;
+ struct pipe_inode_info *pipe = READ_ONCE(wqueue->pipe);
struct pipe_buffer *buf;
struct page *page;
unsigned int head, tail, mask, note, offset, len;
@@ -637,6 +637,12 @@ void watch_queue_clear(struct watch_queue *wqueue)
spin_lock_bh(&wqueue->lock);
}

+ /* Clearing the watch queue, so we should clean the associated pipe. */
+ if (wqueue->pipe) {
+ wqueue->pipe->watch_queue = NULL;
+ wqueue->pipe = NULL;
+ }
+
spin_unlock_bh(&wqueue->lock);
rcu_read_unlock();
}
--
2.35.1



2022-08-01 09:58:33

by Siddh Raman Pant

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

Hello Dipanjan,

It would be nice if you could test this patch and tell if it fixes the
issue on v5.10, as you had reported it earlier.

Please apply the following commits before applying this patch:
db8facfc9faf ("watch_queue, pipe: Free watchqueue state after clearing pipe ring")
353f7988dd84 ("watchqueue: make sure to serialize 'wqueue->defunct' properly")

I have tested locally on tag v5.10, using the reproducer available on
syzkaller dashboard. The crash occurred when the patches weren't applied,
and it no longer occurs after applying the three patches.

Thanks,
Siddh

> If not done, a reference to a freed pipe remains in the watch_queue,
> as this function is called before freeing a pipe in free_pipe_info()
> (see line 834 of fs/pipe.c).
>
> This causes a UAF when post_one_notification() tries to access the pipe
> on a key update, which is reported by syzbot.
>
> We also need to use READ_ONCE() in post_one_notification() to prevent the
> compiler from optimising and loading a non-NULL value from wqueue->pipe.
>
> Bug report: https://syzkaller.appspot.com/bug?id=1870dd7791ba05f2ea7f47f7cbdde701173973fc
> Reported-and-tested-by: [email protected]
>
> Signed-off-by: Siddh Raman Pant <[email protected]>
> ---
> Changes in v3:
> - Restore the original unlock order, and clear before unlock.
> - Use READ_ONCE() in post path.
>
> This was explained by David Howells <[email protected]> in
> reply to v1. Not added Suggested-by since he didn't reply yet.
>
> Changes in v2:
> - Removed the superfluous ifdef guard.
>
> kernel/watch_queue.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c
> index bb9962b33f95..617425e34252 100644
> --- a/kernel/watch_queue.c
> +++ b/kernel/watch_queue.c
> @@ -99,7 +99,7 @@ static bool post_one_notification(struct watch_queue *wqueue,
> struct watch_notification *n)
> {
> void *p;
> - struct pipe_inode_info *pipe = wqueue->pipe;
> + struct pipe_inode_info *pipe = READ_ONCE(wqueue->pipe);
> struct pipe_buffer *buf;
> struct page *page;
> unsigned int head, tail, mask, note, offset, len;
> @@ -637,6 +637,12 @@ void watch_queue_clear(struct watch_queue *wqueue)
> spin_lock_bh(&wqueue->lock);
> }
>
> + /* Clearing the watch queue, so we should clean the associated pipe. */
> + if (wqueue->pipe) {
> + wqueue->pipe->watch_queue = NULL;
> + wqueue->pipe = NULL;
> + }
> +
> spin_unlock_bh(&wqueue->lock);
> rcu_read_unlock();
> }
> --
> 2.35.1


2022-08-01 11:12:39

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

A: http://en.wikipedia.org/wiki/Top_post
Q: Were do I find info about this thing called top-posting?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Mon, Aug 01, 2022 at 03:04:33PM +0530, Siddh Raman Pant via Linux-kernel-mentees wrote:
> Hello Dipanjan,
>
> It would be nice if you could test this patch and tell if it fixes the
> issue on v5.10, as you had reported it earlier.
>
> Please apply the following commits before applying this patch:
> db8facfc9faf ("watch_queue, pipe: Free watchqueue state after clearing pipe ring")
> 353f7988dd84 ("watchqueue: make sure to serialize 'wqueue->defunct' properly")
>
> I have tested locally on tag v5.10, using the reproducer available on
> syzkaller dashboard. The crash occurred when the patches weren't applied,
> and it no longer occurs after applying the three patches.

Trying anything on tag v5.10 is not a good test as that kernel is very
old and obsolete and over 15000 changes behind what the latest 5.10.y
kernel release has in it.

So trying it on v5.10.134 would be best.

thanks,

greg k-h

2022-08-01 13:04:09

by Siddh Raman Pant

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

On Mon, 01 Aug 2022 18:28:25 +0530 Siddh Raman Pant <[email protected]> wrote:
> I now tried the 5.10.y branch of stable (which has v5.10.134), but the
> reproducer isn't triggering the bug for me.

By this, I mean without the patches. I should have been clear.

Thanks,
Siddh

2022-08-01 13:15:22

by Siddh Raman Pant

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

On Mon, 01 Aug 2022 15:54:03 +0530 Greg KH <[email protected]> wrote:
> A: http://en.wikipedia.org/wiki/Top_post
> Q: Were do I find info about this thing called top-posting?
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing in e-mail?
>
> A: No.
> Q: Should I include quotations after my reply?
>
> http://daringfireball.net/2007/07/on_top
>

Oops, sorry.

> Trying anything on tag v5.10 is not a good test as that kernel is very
> old and obsolete and over 15000 changes behind what the latest 5.10.y
> kernel release has in it.
>
> So trying it on v5.10.134 would be best.
>
> thanks,
>
> greg k-h
>

Noted.

I now tried the 5.10.y branch of stable (which has v5.10.134), but the
reproducer isn't triggering the bug for me.

Thanks,
Siddh

2022-08-01 16:51:58

by Dipanjan Das

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

On Mon, Aug 01, 2022 at 06:30:58PM +0530, Siddh Raman Pant wrote:
> On Mon, 01 Aug 2022 18:28:25 +0530 Siddh Raman Pant <[email protected]> wrote:
> > I now tried the 5.10.y branch of stable (which has v5.10.134), but the
> > reproducer isn't triggering the bug for me.

Are you referring to the reproducer attached to our original report?
https://lore.kernel.org/all/CANX2M5bHye2ZEEhEV6PUj1kYL2KdWYeJtgXw8KZRzwrNpLYz+A@mail.gmail.com/


2022-08-01 18:54:00

by Siddh Raman Pant

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

On Mon, 01 Aug 2022 21:46:42 +0530 Dipanjan Das <[email protected]> wrote:
> Are you referring to the reproducer attached to our original report?
> https://lore.kernel.org/all/CANX2M5bHye2ZEEhEV6PUj1kYL2KdWYeJtgXw8KZRzwrNpLYz+A@mail.gmail.com/

Yes, I meant the reproducer you gave.

I suspect I must have missed CONFIG_WATCH_QUEUE=y while setting the kernel
up, extremely sorry for it.

I now tried 5.10.y with it (using a modification of syzkaller's dashboard
config I had been using[1]), and I'm getting a __post_watch_notification()
crash (which is a related crash, as the fix[2][3] for that causes the
reproducer to not reproduce the post_one_notification crash on mainline),
but not the post_one_notification() crash you had reported. It seems if I
apply my patch, it doesn't trigger this related crash, so these bugs are
seem to be very related maybe due to racing? I haven't looked at that yet.

I then tried on v5.10.131 since that was the exact version you had
reproduced on, and it reproduces the post_one_notification() error
successfully. Applying 353f7988dd84 causes __post_watch_notification()
crash, and then applying this v3 patch does not trigger the issue, but
the patch to fix __post_watch_notification() crash is [2], which does
not really address the issue of post_one_notification() crash which
is due to the dangling reference to a freed pipe.

Can you please try reproducer at your end?

Thanks,
Siddh

[1] https://gist.github.com/siddhpant/06c7d64ca83273f0fd6604e4677f7c0d
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e64ab2dbd882933b65cd82ff6235d705ad65dbb6
[3] https://lore.kernel.org/linux-mm/[email protected]/

2022-08-03 01:52:59

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

On Thu, Jul 28, 2022 at 09:21:21PM +0530, Siddh Raman Pant wrote:
> If not done, a reference to a freed pipe remains in the watch_queue,
> as this function is called before freeing a pipe in free_pipe_info()
> (see line 834 of fs/pipe.c).
>
> This causes a UAF when post_one_notification() tries to access the pipe
> on a key update, which is reported by syzbot.
>
> We also need to use READ_ONCE() in post_one_notification() to prevent the
> compiler from optimising and loading a non-NULL value from wqueue->pipe.

Didn't this already get fixed by the following commit?

commit 353f7988dd8413c47718f7ca79c030b6fb62cfe5
Author: Linus Torvalds <[email protected]>
Date: Tue Jul 19 11:09:01 2022 -0700

watchqueue: make sure to serialize 'wqueue->defunct' properly

With that, post_one_notification() only runs while the watch_queue is locked and
not "defunct". So it's guaranteed that the pipe still exists. Any concurrent
free_pipe_info() waits for the watch_queue to be unlocked in watch_queue_clear()
before proceeding to free the pipe. So where is there still a bug?

>
> Bug report: https://syzkaller.appspot.com/bug?id=1870dd7791ba05f2ea7f47f7cbdde701173973fc
> Reported-and-tested-by: [email protected]

If this actually does fix something, then it's mixing Fixes and Cc stable tags.

> diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c
> index bb9962b33f95..617425e34252 100644
> --- a/kernel/watch_queue.c
> +++ b/kernel/watch_queue.c
> @@ -99,7 +99,7 @@ static bool post_one_notification(struct watch_queue *wqueue,
> struct watch_notification *n)
> {
> void *p;
> - struct pipe_inode_info *pipe = wqueue->pipe;
> + struct pipe_inode_info *pipe = READ_ONCE(wqueue->pipe);
> struct pipe_buffer *buf;
> struct page *page;
> unsigned int head, tail, mask, note, offset, len;
> @@ -637,6 +637,12 @@ void watch_queue_clear(struct watch_queue *wqueue)
> spin_lock_bh(&wqueue->lock);
> }
>
> + /* Clearing the watch queue, so we should clean the associated pipe. */
> + if (wqueue->pipe) {
> + wqueue->pipe->watch_queue = NULL;
> + wqueue->pipe = NULL;
> + }
> +
> spin_unlock_bh(&wqueue->lock);
> rcu_read_unlock();
> }

And this is clearly the wrong fix anyway, since it makes the call to
put_watch_queue() in free_pipe_info() never be executed. So AFAICT, this patch
introduces a memory leak, and doesn't actually fix anything...

- Eric

2022-08-03 02:09:43

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

On Tue, Aug 02, 2022 at 12:19:05AM +0530, Siddh Raman Pant wrote:
> On Mon, 01 Aug 2022 21:46:42 +0530 Dipanjan Das <[email protected]> wrote:
> > Are you referring to the reproducer attached to our original report?
> > https://lore.kernel.org/all/CANX2M5bHye2ZEEhEV6PUj1kYL2KdWYeJtgXw8KZRzwrNpLYz+A@mail.gmail.com/
>
> Yes, I meant the reproducer you gave.
>
> I suspect I must have missed CONFIG_WATCH_QUEUE=y while setting the kernel
> up, extremely sorry for it.
>
> I now tried 5.10.y with it (using a modification of syzkaller's dashboard
> config I had been using[1]), and I'm getting a __post_watch_notification()
> crash (which is a related crash, as the fix[2][3] for that causes the
> reproducer to not reproduce the post_one_notification crash on mainline),
> but not the post_one_notification() crash you had reported. It seems if I
> apply my patch, it doesn't trigger this related crash, so these bugs are
> seem to be very related maybe due to racing? I haven't looked at that yet.
>
> I then tried on v5.10.131 since that was the exact version you had
> reproduced on, and it reproduces the post_one_notification() error
> successfully. Applying 353f7988dd84 causes __post_watch_notification()
> crash, and then applying this v3 patch does not trigger the issue, but
> the patch to fix __post_watch_notification() crash is [2], which does
> not really address the issue of post_one_notification() crash which
> is due to the dangling reference to a freed pipe.
>
> Can you please try reproducer at your end?
>
> Thanks,
> Siddh

There were several watch_queue fixes that got merged in v5.10.134, so there's no
point in testing v5.10.131. If you're still seeing a bug in the *latest*
5.10.y, then please check whether it's also present upstream. If yes, then send
out a fix for it. If no, then tell [email protected] what commit(s) need
to be backported. Please provide *full* details in each case, including any
KASAN reports or other information that would help people understand the bug.

- Eric

2022-08-03 04:15:00

by Siddh Raman Pant

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

On Wed, 03 Aug 2022 06:38:36 +0530 Eric Biggers <[email protected]> wrote:
> On Thu, Jul 28, 2022 at 09:21:21PM +0530, Siddh Raman Pant wrote:
> > If not done, a reference to a freed pipe remains in the watch_queue,
> > as this function is called before freeing a pipe in free_pipe_info()
> > (see line 834 of fs/pipe.c).
> >
> > This causes a UAF when post_one_notification() tries to access the pipe
> > on a key update, which is reported by syzbot.
> >
> > We also need to use READ_ONCE() in post_one_notification() to prevent the
> > compiler from optimising and loading a non-NULL value from wqueue->pipe.
>
> Didn't this already get fixed by the following commit?
>
> commit 353f7988dd8413c47718f7ca79c030b6fb62cfe5
> Author: Linus Torvalds <[email protected]>
> Date: Tue Jul 19 11:09:01 2022 -0700
>
> watchqueue: make sure to serialize 'wqueue->defunct' properly
>
> With that, post_one_notification() only runs while the watch_queue is locked and
> not "defunct". So it's guaranteed that the pipe still exists. Any concurrent
> free_pipe_info() waits for the watch_queue to be unlocked in watch_queue_clear()
> before proceeding to free the pipe. So where is there still a bug?

It doesn't fix the dangling pointer to the freed pipe in the watch_queue, which
had caused this crash.

> >
> > Bug report: https://syzkaller.appspot.com/bug?id=1870dd7791ba05f2ea7f47f7cbdde701173973fc
> > Reported-and-tested-by: [email protected]
>
> If this actually does fix something, then it's mixing Fixes and Cc stable tags.

Noted.

> > diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c
> > index bb9962b33f95..617425e34252 100644
> > --- a/kernel/watch_queue.c
> > +++ b/kernel/watch_queue.c
> > @@ -99,7 +99,7 @@ static bool post_one_notification(struct watch_queue *wqueue,
> > struct watch_notification *n)
> > {
> > void *p;
> > - struct pipe_inode_info *pipe = wqueue->pipe;
> > + struct pipe_inode_info *pipe = READ_ONCE(wqueue->pipe);
> > struct pipe_buffer *buf;
> > struct page *page;
> > unsigned int head, tail, mask, note, offset, len;
> > @@ -637,6 +637,12 @@ void watch_queue_clear(struct watch_queue *wqueue)
> > spin_lock_bh(&wqueue->lock);
> > }
> >
> > + /* Clearing the watch queue, so we should clean the associated pipe. */
> > + if (wqueue->pipe) {
> > + wqueue->pipe->watch_queue = NULL;
> > + wqueue->pipe = NULL;
> > + }
> > +
> > spin_unlock_bh(&wqueue->lock);
> > rcu_read_unlock();
> > }
>
> And this is clearly the wrong fix anyway, since it makes the call to
> put_watch_queue() in free_pipe_info() never be executed. So AFAICT, this patch
> introduces a memory leak, and doesn't actually fix anything...
>
> - Eric
>

Sorry for overlooking that. wqueue->pipe->watch_queue = NULL; is unnecessary,
but wqueue->pipe = NULL; is needed.

I will send a final v4 once all issues are resolved in this thread.

Thanks,
Siddh

2022-08-03 04:15:48

by Dipanjan Das

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

On Mon, Aug 1, 2022 at 11:49 AM Siddh Raman Pant <[email protected]> wrote:
>
> On Mon, 01 Aug 2022 21:46:42 +0530 Dipanjan Das <[email protected]> wrote:
> > Are you referring to the reproducer attached to our original report?
> > https://lore.kernel.org/all/CANX2M5bHye2ZEEhEV6PUj1kYL2KdWYeJtgXw8KZRzwrNpLYz+A@mail.gmail.com/
>
> Yes, I meant the reproducer you gave.
>
> I suspect I must have missed CONFIG_WATCH_QUEUE=y while setting the kernel
> up, extremely sorry for it.
>
> I now tried 5.10.y with it (using a modification of syzkaller's dashboard
> config I had been using[1]), and I'm getting a __post_watch_notification()
> crash (which is a related crash, as the fix[2][3] for that causes the
> reproducer to not reproduce the post_one_notification crash on mainline),
> but not the post_one_notification() crash you had reported. It seems if I
> apply my patch, it doesn't trigger this related crash, so these bugs are
> seem to be very related maybe due to racing? I haven't looked at that yet.
>
> I then tried on v5.10.131 since that was the exact version you had
> reproduced on, and it reproduces the post_one_notification() error
> successfully. Applying 353f7988dd84 causes __post_watch_notification()
> crash, and then applying this v3 patch does not trigger the issue, but
> the patch to fix __post_watch_notification() crash is [2], which does
> not really address the issue of post_one_notification() crash which
> is due to the dangling reference to a freed pipe.
>
> Can you please try reproducer at your end?

We can confirm that the repro triggers the `post_watch_notification`
crash on 5.10.134 (7a62a4b6212a7f04251fdaf8b049c47aec59c54a). After
applying your v3 patch, the repro did not trigger the bug.

>
> [1] https://gist.github.com/siddhpant/06c7d64ca83273f0fd6604e4677f7c0d
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e64ab2dbd882933b65cd82ff6235d705ad65dbb6
> [3] https://lore.kernel.org/linux-mm/[email protected]/



--
Thanks and Regards,

Dipanjan

2022-08-03 04:20:58

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

On Wed, Aug 03, 2022 at 09:26:04AM +0530, Siddh Raman Pant wrote:
> On Wed, 03 Aug 2022 06:38:36 +0530 Eric Biggers <[email protected]> wrote:
> > On Thu, Jul 28, 2022 at 09:21:21PM +0530, Siddh Raman Pant wrote:
> > > If not done, a reference to a freed pipe remains in the watch_queue,
> > > as this function is called before freeing a pipe in free_pipe_info()
> > > (see line 834 of fs/pipe.c).
> > >
> > > This causes a UAF when post_one_notification() tries to access the pipe
> > > on a key update, which is reported by syzbot.
> > >
> > > We also need to use READ_ONCE() in post_one_notification() to prevent the
> > > compiler from optimising and loading a non-NULL value from wqueue->pipe.
> >
> > Didn't this already get fixed by the following commit?
> >
> > commit 353f7988dd8413c47718f7ca79c030b6fb62cfe5
> > Author: Linus Torvalds <[email protected]>
> > Date: Tue Jul 19 11:09:01 2022 -0700
> >
> > watchqueue: make sure to serialize 'wqueue->defunct' properly
> >
> > With that, post_one_notification() only runs while the watch_queue is locked and
> > not "defunct". So it's guaranteed that the pipe still exists. Any concurrent
> > free_pipe_info() waits for the watch_queue to be unlocked in watch_queue_clear()
> > before proceeding to free the pipe. So where is there still a bug?
>
> It doesn't fix the dangling pointer to the freed pipe in the watch_queue, which
> had caused this crash.
>

Under what circumstances is the pipe pointer still being dereferenced after the
pipe has been freed? I don't see how it can be; see my explanation above.

- Eric

2022-08-03 04:30:32

by Siddh Raman Pant

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

On Wed, 03 Aug 2022 06:46:13 +0530 Eric Biggers <[email protected]> wrote:
> There were several watch_queue fixes that got merged in v5.10.134, so there's no
> point in testing v5.10.131. If you're still seeing a bug in the *latest*
> 5.10.y, then please check whether it's also present upstream. If yes, then send
> out a fix for it. If no, then tell [email protected] what commit(s) need
> to be backported. Please provide *full* details in each case, including any
> KASAN reports or other information that would help people understand the bug.
>
> - Eric
>

Okay, noted.

Thanks,
Siddh

2022-08-03 05:37:13

by Siddh Raman Pant

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

On Wed, 03 Aug 2022 09:42:28 +0530 Eric Biggers <[email protected]> wrote:
> Under what circumstances is the pipe pointer still being dereferenced after the
> pipe has been freed? I don't see how it can be; see my explanation above.

It really didn't fix the crash. It caused the same crash reported here, which
I was already locally getting:
https://syzkaller.appspot.com/bug?extid=03d7b43290037d1f87ca

(It's same because __post_watch_notification calls post_one_notification, and
this patch seems to stop that crash too, as was verified by Dipanjan here).

While it has been fixed by e64ab2dbd882 ("watch_queue: Fix missing locking in
add_watch_to_object()"), it just shows there can be paths leading to it. (Also,
I posted this patch (v1, v2, v3) before that even landed, so I had no way of
knowing about it).

There is a null check in post_one_notification for the pipe, most probably
because it *expects* the pointer to be NULL'd. Also, there is no reason to have
a dangling pointer stay, it's just a recipe for further bugs.

Thanks,
Siddh

2022-08-03 05:48:38

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

On Wed, Aug 03, 2022 at 10:43:31AM +0530, Siddh Raman Pant wrote:
> On Wed, 03 Aug 2022 09:42:28 +0530 Eric Biggers <[email protected]> wrote:
> > Under what circumstances is the pipe pointer still being dereferenced after the
> > pipe has been freed? I don't see how it can be; see my explanation above.
>
> It really didn't fix the crash. It caused the same crash reported here, which
> I was already locally getting:
> https://syzkaller.appspot.com/bug?extid=03d7b43290037d1f87ca

I tested the syzbot reproducer
https://syzkaller.appspot.com/text?tag=ReproC&x=174ea97e080000, and it does
*not* trigger the bug on the latest upstream. But, it does trigger the bug if I
recent Linus's recent watch_queue fixes.

So I don't currently see any evidence of an unfixed bug. If, nevertheless, you
believe that a bug is still unfixed, then please provide a reproducer and a fix
patch that clearly explains what it is fixing.

> There is a null check in post_one_notification for the pipe, most probably
> because it *expects* the pointer to be NULL'd. Also, there is no reason to have
> a dangling pointer stay, it's just a recipe for further bugs.

If you want to send a patch or patches to clean up the code, that is fine, but
please make it super clear what is a cleanup and what is a fix.

- Eric

2022-08-03 08:48:19

by Siddh Raman Pant

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

On Wed, 03 Aug 2022 11:11:31 +0530 Eric Biggers <[email protected]> wrote:
> I tested the syzbot reproducer
> https://syzkaller.appspot.com/text?tag=ReproC&x=174ea97e080000, and it does
> *not* trigger the bug on the latest upstream. But, it does trigger the bug if I
> recent Linus's recent watch_queue fixes.
>
> So I don't currently see any evidence of an unfixed bug. If, nevertheless, you
> believe that a bug is still unfixed, then please provide a reproducer and a fix
> patch that clearly explains what it is fixing.
>
> > There is a null check in post_one_notification for the pipe, most probably
> > because it *expects* the pointer to be NULL'd. Also, there is no reason to have
> > a dangling pointer stay, it's just a recipe for further bugs.
>
> If you want to send a patch or patches to clean up the code, that is fine, but
> please make it super clear what is a cleanup and what is a fix.
>
> - Eric
>

I honestly feel like I am repeating myself yet again, but okay.

Of course, the race condition has been solved by a patch upstream, which I had
myself mentioned earlier.

But what I am saying is that it did *not* address *what* that race condition
had triggered, i.e. the visible cause of the UAF crash, which, among other
things, is *because* there is a dangling pointer to the freed pipe, which
*caused* the crash in post_one_notification() when it tried to access
&pipe->rd.wait_lock as an argument to spin_lock_irq(), a path it reached
after checking if wqueue->pipe is NULL and proceeded when it was not the case.

And the upstream commit was made *after* I had posted this patch, hence this
was a fix for the syzkaller issue. While I am *not* saying to accept it just
because this was posted earlier, I am saying this patch addresses a parallel
issue, i.e. the *actual use-after-free crash* which was reproduced by those
reproducers, i.e., what was attempted to be used after getting freed and
detected by KASAN.

We don't need to wait for another similar syzbot report to pop up before doing
this change, and say let's not fix a dangling pointer reference because now
another commit apparately fixes the specific syzkaller issue, causing the given
specific reproducer with its specific way of reproducing to fail, when we in
fact now know it *can* be a valid problem in practice and doing this change
too causes the specific reproducer under consideration to fail reproducing, as
was reported by the reproducer itself.

I really don't know how to create stress tests / reproducers like how syzkaller
makes, so if a similar new reproducer is really required for showing this
patch's validity disregarding any earlier reproducers, I unfortunately cannot
make it due to skill issue as I just started in kernel dev, and I am deeply
sorry for wasting the time of everyone, and I am thankful for your criticism of
my patch.

Thanks,
Siddh

2022-08-03 18:55:16

by Eric Biggers

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

On Wed, Aug 03, 2022 at 02:10:06PM +0530, Siddh Raman Pant wrote:
> On Wed, 03 Aug 2022 11:11:31 +0530 Eric Biggers <[email protected]> wrote:
> > I tested the syzbot reproducer
> > https://syzkaller.appspot.com/text?tag=ReproC&x=174ea97e080000, and it does
> > *not* trigger the bug on the latest upstream. But, it does trigger the bug if I
> > recent Linus's recent watch_queue fixes.
> >
> > So I don't currently see any evidence of an unfixed bug. If, nevertheless, you
> > believe that a bug is still unfixed, then please provide a reproducer and a fix
> > patch that clearly explains what it is fixing.
> >
> > > There is a null check in post_one_notification for the pipe, most probably
> > > because it *expects* the pointer to be NULL'd. Also, there is no reason to have
> > > a dangling pointer stay, it's just a recipe for further bugs.
> >
> > If you want to send a patch or patches to clean up the code, that is fine, but
> > please make it super clear what is a cleanup and what is a fix.
> >
> > - Eric
> >
>
> I honestly feel like I am repeating myself yet again, but okay.

Well, you should try listening instead. Because you are not listening.

> Of course, the race condition has been solved by a patch upstream, which I had
> myself mentioned earlier.
>
> But what I am saying is that it did *not* address *what* that race condition
> had triggered, i.e. the visible cause of the UAF crash, which, among other
> things, is *because* there is a dangling pointer to the freed pipe, which
> *caused* the crash in post_one_notification() when it tried to access
> &pipe->rd.wait_lock as an argument to spin_lock_irq(), a path it reached
> after checking if wqueue->pipe is NULL and proceeded when it was not the case.
>
> And the upstream commit was made *after* I had posted this patch, hence this
> was a fix for the syzkaller issue. While I am *not* saying to accept it just
> because this was posted earlier, I am saying this patch addresses a parallel
> issue, i.e. the *actual use-after-free crash* which was reproduced by those
> reproducers, i.e., what was attempted to be used after getting freed and
> detected by KASAN.

Even if wqueue->pipe was set to NULL during free_pipe_info(), there would still
have been a use-after-free, as the real bug was the lack of synchronization
between post_one_notification() and free_pipe_info(). That is fixed now.

>
> We don't need to wait for another similar syzbot report to pop up before doing
> this change, and say let's not fix a dangling pointer reference because now
> another commit apparately fixes the specific syzkaller issue, causing the given
> specific reproducer with its specific way of reproducing to fail, when we in
> fact now know it *can* be a valid problem in practice and doing this change
> too causes the specific reproducer under consideration to fail reproducing, as
> was reported by the reproducer itself.

To re-iterate, I encourage you to send a cleanup patch if you see an
opportunity. It looks like the state wqueue->defunct==true could be replaced
with wqueue->pipe==NULL, which would be simpler, so how about doing that? Just
don't claim that it is "fixing" something, unless it is, as that makes things
very confusing and difficult for everyone.

>
> I really don't know how to create stress tests / reproducers like how syzkaller
> makes, so if a similar new reproducer is really required for showing this
> patch's validity disregarding any earlier reproducers, I unfortunately cannot
> make it due to skill issue as I just started in kernel dev, and I am deeply
> sorry for wasting the time of everyone, and I am thankful for your criticism of
> my patch.

A reproducer can just be written as a normal program, in C or another language.
The syzkaller reproducers are really hard to read as they are auto-generated, so
don't read too much into them -- they're certainly not examples of good code.

- Eric

2022-08-04 09:04:16

by Siddh Raman Pant

[permalink] [raw]
Subject: Re: [PATCH v3] kernel/watch_queue: Make pipe NULL while clearing watch_queue

On Wed, 03 Aug 2022 23:45:31 +0530 Eric Biggers wrote:
> Well, you should try listening instead. Because you are not listening.

Sorry for that, never meant to come across like that.

> Even if wqueue->pipe was set to NULL during free_pipe_info(), there would still
> have been a use-after-free, as the real bug was the lack of synchronization
> between post_one_notification() and free_pipe_info(). That is fixed now.

Okay, noted.

> To re-iterate, I encourage you to send a cleanup patch if you see an
> opportunity. It looks like the state wqueue->defunct==true could be replaced
> with wqueue->pipe==NULL, which would be simpler, so how about doing that? Just
> don't claim that it is "fixing" something, unless it is, as that makes things
> very confusing and difficult for everyone.

Okay, I will do that. That actually seems like a plausible thing to do, in
v2 convo, David Howells had also remarked similarly about `defunct` to a reply.

https://lore.kernel.org/linux-kernel/[email protected]/

> A reproducer can just be written as a normal program, in C or another language.
> The syzkaller reproducers are really hard to read as they are auto-generated, so
> don't read too much into them -- they're certainly not examples of good code.
>
> - Eric

Okay, noted.

Thanks for your patience, I probably annoyed you.

Thanks,
Siddh