2023-03-13 09:06:40

by Zheng Wang

[permalink] [raw]
Subject: [PATCH net v2] 9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition

In xen_9pfs_front_probe, it calls xen_9pfs_front_alloc_dataring
to init priv->rings and bound &ring->work with p9_xen_response.

When it calls xen_9pfs_front_event_handler to handle IRQ requests,
it will finally call schedule_work to start the work.

When we call xen_9pfs_front_remove to remove the driver, there
may be a sequence as follows:

Fix it by finishing the work before cleanup in xen_9pfs_front_free.

Note that, this bug is found by static analysis, which might be
false positive.

CPU0 CPU1

|p9_xen_response
xen_9pfs_front_remove|
xen_9pfs_front_free|
kfree(priv) |
//free priv |
|p9_tag_lookup
|//use priv->client

Fixes: 71ebd71921e4 ("xen/9pfs: connect to the backend")
Signed-off-by: Zheng Wang <[email protected]>
---
v2:
- fix type error of ring found by kernel test robot
---
net/9p/trans_xen.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
index c64050e839ac..83764431c066 100644
--- a/net/9p/trans_xen.c
+++ b/net/9p/trans_xen.c
@@ -274,12 +274,17 @@ static const struct xenbus_device_id xen_9pfs_front_ids[] = {
static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
{
int i, j;
+ struct xen_9pfs_dataring *ring = NULL;

write_lock(&xen_9pfs_lock);
list_del(&priv->list);
write_unlock(&xen_9pfs_lock);

for (i = 0; i < priv->num_rings; i++) {
+ /*cancel work*/
+ ring = &priv->rings[i];
+ cancel_work_sync(&ring->work);
+
if (!priv->rings[i].intf)
break;
if (priv->rings[i].irq > 0)
--
2.25.1



2023-03-13 13:54:35

by Michal Swiatkowski

[permalink] [raw]
Subject: Re: [PATCH net v2] 9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition

On Mon, Mar 13, 2023 at 05:00:02PM +0800, Zheng Wang wrote:
> In xen_9pfs_front_probe, it calls xen_9pfs_front_alloc_dataring
> to init priv->rings and bound &ring->work with p9_xen_response.
>
> When it calls xen_9pfs_front_event_handler to handle IRQ requests,
> it will finally call schedule_work to start the work.
>
> When we call xen_9pfs_front_remove to remove the driver, there
> may be a sequence as follows:
>
> Fix it by finishing the work before cleanup in xen_9pfs_front_free.
>
> Note that, this bug is found by static analysis, which might be
> false positive.
>
> CPU0 CPU1
>
> |p9_xen_response
> xen_9pfs_front_remove|
> xen_9pfs_front_free|
> kfree(priv) |
> //free priv |
> |p9_tag_lookup
> |//use priv->client
>
> Fixes: 71ebd71921e4 ("xen/9pfs: connect to the backend")
> Signed-off-by: Zheng Wang <[email protected]>
> ---
> v2:
> - fix type error of ring found by kernel test robot
> ---
> net/9p/trans_xen.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
> index c64050e839ac..83764431c066 100644
> --- a/net/9p/trans_xen.c
> +++ b/net/9p/trans_xen.c
> @@ -274,12 +274,17 @@ static const struct xenbus_device_id xen_9pfs_front_ids[] = {
> static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
> {
> int i, j;
> + struct xen_9pfs_dataring *ring = NULL;
Move it before int i, j to have RCT.

>
> write_lock(&xen_9pfs_lock);
> list_del(&priv->list);
> write_unlock(&xen_9pfs_lock);
>
> for (i = 0; i < priv->num_rings; i++) {
> + /*cancel work*/
It isn't needed I think, the function cancel_work_sync() tells everything
here.

> + ring = &priv->rings[i];
> + cancel_work_sync(&ring->work);
> +
> if (!priv->rings[i].intf)
> break;
> if (priv->rings[i].irq > 0)
> --
> 2.25.1

2023-03-13 14:02:28

by Zheng Hacker

[permalink] [raw]
Subject: Re: [PATCH net v2] 9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition

Michal Swiatkowski <[email protected]> 于2023年3月13日周一 21:54写道:
>
> On Mon, Mar 13, 2023 at 05:00:02PM +0800, Zheng Wang wrote:
> > In xen_9pfs_front_probe, it calls xen_9pfs_front_alloc_dataring
> > to init priv->rings and bound &ring->work with p9_xen_response.
> >
> > When it calls xen_9pfs_front_event_handler to handle IRQ requests,
> > it will finally call schedule_work to start the work.
> >
> > When we call xen_9pfs_front_remove to remove the driver, there
> > may be a sequence as follows:
> >
> > Fix it by finishing the work before cleanup in xen_9pfs_front_free.
> >
> > Note that, this bug is found by static analysis, which might be
> > false positive.
> >
> > CPU0 CPU1
> >
> > |p9_xen_response
> > xen_9pfs_front_remove|
> > xen_9pfs_front_free|
> > kfree(priv) |
> > //free priv |
> > |p9_tag_lookup
> > |//use priv->client
> >
> > Fixes: 71ebd71921e4 ("xen/9pfs: connect to the backend")
> > Signed-off-by: Zheng Wang <[email protected]>
> > ---
> > v2:
> > - fix type error of ring found by kernel test robot
> > ---
> > net/9p/trans_xen.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
> > index c64050e839ac..83764431c066 100644
> > --- a/net/9p/trans_xen.c
> > +++ b/net/9p/trans_xen.c
> > @@ -274,12 +274,17 @@ static const struct xenbus_device_id xen_9pfs_front_ids[] = {
> > static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
> > {
> > int i, j;
> > + struct xen_9pfs_dataring *ring = NULL;
> Move it before int i, j to have RCT.
>
> >
> > write_lock(&xen_9pfs_lock);
> > list_del(&priv->list);
> > write_unlock(&xen_9pfs_lock);
> >
> > for (i = 0; i < priv->num_rings; i++) {
> > + /*cancel work*/
> It isn't needed I think, the function cancel_work_sync() tells everything
> here.
>

Get it, will remove it in the next version of patch.

Best regards,
Zheng

2023-03-13 14:08:18

by Michal Swiatkowski

[permalink] [raw]
Subject: Re: [PATCH net v2] 9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition

On Mon, Mar 13, 2023 at 10:01:21PM +0800, Zheng Hacker wrote:
> Michal Swiatkowski <[email protected]> 于2023年3月13日周一 21:54写道:
> >
> > On Mon, Mar 13, 2023 at 05:00:02PM +0800, Zheng Wang wrote:
> > > In xen_9pfs_front_probe, it calls xen_9pfs_front_alloc_dataring
> > > to init priv->rings and bound &ring->work with p9_xen_response.
> > >
> > > When it calls xen_9pfs_front_event_handler to handle IRQ requests,
> > > it will finally call schedule_work to start the work.
> > >
> > > When we call xen_9pfs_front_remove to remove the driver, there
> > > may be a sequence as follows:
> > >
> > > Fix it by finishing the work before cleanup in xen_9pfs_front_free.
> > >
> > > Note that, this bug is found by static analysis, which might be
> > > false positive.
> > >
> > > CPU0 CPU1
> > >
> > > |p9_xen_response
> > > xen_9pfs_front_remove|
> > > xen_9pfs_front_free|
> > > kfree(priv) |
> > > //free priv |
> > > |p9_tag_lookup
> > > |//use priv->client
> > >
> > > Fixes: 71ebd71921e4 ("xen/9pfs: connect to the backend")
> > > Signed-off-by: Zheng Wang <[email protected]>
> > > ---
> > > v2:
> > > - fix type error of ring found by kernel test robot
> > > ---
> > > net/9p/trans_xen.c | 5 +++++
> > > 1 file changed, 5 insertions(+)
> > >
> > > diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
> > > index c64050e839ac..83764431c066 100644
> > > --- a/net/9p/trans_xen.c
> > > +++ b/net/9p/trans_xen.c
> > > @@ -274,12 +274,17 @@ static const struct xenbus_device_id xen_9pfs_front_ids[] = {
> > > static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
> > > {
> > > int i, j;
> > > + struct xen_9pfs_dataring *ring = NULL;
> > Move it before int i, j to have RCT.
> >
Sorry I didn't notice it before, move the definition to for loop.

> > >
> > > write_lock(&xen_9pfs_lock);
> > > list_del(&priv->list);
> > > write_unlock(&xen_9pfs_lock);
> > >
> > > for (i = 0; i < priv->num_rings; i++) {
Here:
struct xen_9pfs_dataring *ring = &priv->rings[i];

> > > + /*cancel work*/
> > It isn't needed I think, the function cancel_work_sync() tells everything
> > here.
> >
>
> Get it, will remove it in the next version of patch.
>
> Best regards,
> Zheng

2023-03-13 14:11:17

by Zheng Hacker

[permalink] [raw]
Subject: Re: [PATCH net v2] 9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition

Michal Swiatkowski <[email protected]> 于2023年3月13日周一 22:08写道:
>
> On Mon, Mar 13, 2023 at 10:01:21PM +0800, Zheng Hacker wrote:
> > Michal Swiatkowski <[email protected]> 于2023年3月13日周一 21:54写道:
> > >
> > > On Mon, Mar 13, 2023 at 05:00:02PM +0800, Zheng Wang wrote:
> > > > In xen_9pfs_front_probe, it calls xen_9pfs_front_alloc_dataring
> > > > to init priv->rings and bound &ring->work with p9_xen_response.
> > > >
> > > > When it calls xen_9pfs_front_event_handler to handle IRQ requests,
> > > > it will finally call schedule_work to start the work.
> > > >
> > > > When we call xen_9pfs_front_remove to remove the driver, there
> > > > may be a sequence as follows:
> > > >
> > > > Fix it by finishing the work before cleanup in xen_9pfs_front_free.
> > > >
> > > > Note that, this bug is found by static analysis, which might be
> > > > false positive.
> > > >
> > > > CPU0 CPU1
> > > >
> > > > |p9_xen_response
> > > > xen_9pfs_front_remove|
> > > > xen_9pfs_front_free|
> > > > kfree(priv) |
> > > > //free priv |
> > > > |p9_tag_lookup
> > > > |//use priv->client
> > > >
> > > > Fixes: 71ebd71921e4 ("xen/9pfs: connect to the backend")
> > > > Signed-off-by: Zheng Wang <[email protected]>
> > > > ---
> > > > v2:
> > > > - fix type error of ring found by kernel test robot
> > > > ---
> > > > net/9p/trans_xen.c | 5 +++++
> > > > 1 file changed, 5 insertions(+)
> > > >
> > > > diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
> > > > index c64050e839ac..83764431c066 100644
> > > > --- a/net/9p/trans_xen.c
> > > > +++ b/net/9p/trans_xen.c
> > > > @@ -274,12 +274,17 @@ static const struct xenbus_device_id xen_9pfs_front_ids[] = {
> > > > static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
> > > > {
> > > > int i, j;
> > > > + struct xen_9pfs_dataring *ring = NULL;
> > > Move it before int i, j to have RCT.
> > >
> Sorry I didn't notice it before, move the definition to for loop.
>

Get it, will correct it in the next patch. Thanks for your notice :)

> > > >
> > > > write_lock(&xen_9pfs_lock);
> > > > list_del(&priv->list);
> > > > write_unlock(&xen_9pfs_lock);
> > > >
> > > > for (i = 0; i < priv->num_rings; i++) {
> Here:
> struct xen_9pfs_dataring *ring = &priv->rings[i];
>

Best regards,
Zheng

2023-03-13 21:31:03

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net v2] 9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition

On Mon, 13 Mar 2023 14:54:20 +0100 Michal Swiatkowski wrote:
> > @@ -274,12 +274,17 @@ static const struct xenbus_device_id xen_9pfs_front_ids[] = {
> > static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
> > {
> > int i, j;
> > + struct xen_9pfs_dataring *ring = NULL;
> Move it before int i, j to have RCT.
>
> >
> > write_lock(&xen_9pfs_lock);
> > list_del(&priv->list);
> > write_unlock(&xen_9pfs_lock);
> >
> > for (i = 0; i < priv->num_rings; i++) {
> > + /*cancel work*/
> It isn't needed I think, the function cancel_work_sync() tells everything
> here.

Note that 9p is more storage than networking, so this patch is likely
to go via a different tree than us.

2023-03-13 22:08:19

by Dominique Martinet

[permalink] [raw]
Subject: Re: [PATCH net v2] 9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition

Jakub Kicinski wrote on Mon, Mar 13, 2023 at 02:30:54PM -0700:
> On Mon, 13 Mar 2023 14:54:20 +0100 Michal Swiatkowski wrote:
> > > for (i = 0; i < priv->num_rings; i++) {
> > > + /*cancel work*/
> > It isn't needed I think, the function cancel_work_sync() tells everything
> > here.
>
> Note that 9p is more storage than networking, so this patch is likely
> to go via a different tree than us.

Any review done is useful anyway ;)

Either Eric or me will take the patch, but in the past such fixes have
sometimes also been taken into the net tree; honestly I wouldn't mind a
bit more "rule" here as it's a bit weird that some of our patches are Cc
to fsdevel@ (fs/ from fs/9p) and the other half netdev@ (net/ from
net/9p), but afaict the MAINTAINERS syntax doesn't have a way of
excluding e.g. net/9p from the `NETWORKING [GENERAL]` group so I guess
we just have to live with that.

There's little enough volume and netdev automation sends a mail when a
patch is picked up, so as long as there's no conflict (large majority of
the cases) such fixes can go either way as far as I'm concerned.

--
Dominique

2023-03-14 01:07:56

by Zheng Hacker

[permalink] [raw]
Subject: Re: [PATCH net v2] 9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition

Jakub Kicinski <[email protected]> 于2023年3月14日周二 05:30写道:
>
> On Mon, 13 Mar 2023 14:54:20 +0100 Michal Swiatkowski wrote:
> > > @@ -274,12 +274,17 @@ static const struct xenbus_device_id xen_9pfs_front_ids[] = {
> > > static void xen_9pfs_front_free(struct xen_9pfs_front_priv *priv)
> > > {
> > > int i, j;
> > > + struct xen_9pfs_dataring *ring = NULL;
> > Move it before int i, j to have RCT.
> >
> > >
> > > write_lock(&xen_9pfs_lock);
> > > list_del(&priv->list);
> > > write_unlock(&xen_9pfs_lock);
> > >
> > > for (i = 0; i < priv->num_rings; i++) {
> > > + /*cancel work*/
> > It isn't needed I think, the function cancel_work_sync() tells everything
> > here.
>
> Note that 9p is more storage than networking, so this patch is likely
> to go via a different tree than us.

Sorry I got confused.

Best regards,
Zheng

2023-03-14 01:16:09

by Zheng Hacker

[permalink] [raw]
Subject: Re: [PATCH net v2] 9p/xen : Fix use after free bug in xen_9pfs_front_remove due to race condition

<[email protected]> 于2023年3月14日周二 06:08写道:
>
> Jakub Kicinski wrote on Mon, Mar 13, 2023 at 02:30:54PM -0700:
> > On Mon, 13 Mar 2023 14:54:20 +0100 Michal Swiatkowski wrote:
> > > > for (i = 0; i < priv->num_rings; i++) {
> > > > + /*cancel work*/
> > > It isn't needed I think, the function cancel_work_sync() tells everything
> > > here.
> >
> > Note that 9p is more storage than networking, so this patch is likely
> > to go via a different tree than us.
>
> Any review done is useful anyway ;)
>
> Either Eric or me will take the patch, but in the past such fixes have
> sometimes also been taken into the net tree; honestly I wouldn't mind a
> bit more "rule" here as it's a bit weird that some of our patches are Cc
> to fsdevel@ (fs/ from fs/9p) and the other half netdev@ (net/ from
> net/9p), but afaict the MAINTAINERS syntax doesn't have a way of
> excluding e.g. net/9p from the `NETWORKING [GENERAL]` group so I guess
> we just have to live with that.

Dear Dominique,

Sorry for my confusion and thanks for your patient explanation. I'll take care
of it when submitting a fix to the linux kernel in the future.

>
> There's little enough volume and netdev automation sends a mail when a
> patch is picked up, so as long as there's no conflict (large majority of
> the cases) such fixes can go either way as far as I'm concerned.
>
Thanks again for your effort. Hope you have a good day :)

Best regards,
Zheng