2022-10-30 10:02:02

by Eli Billauer

[permalink] [raw]
Subject: [PATCH v2] char: xillybus: Prevent use-after-free due to race condition

The driver for XillyUSB devices maintains a kref reference count on each
xillyusb_dev structure, which represents a physical device. This reference
count reaches zero when the device has been disconnected and there are no
open file descriptors that are related to the device. When this occurs,
kref_put() calls cleanup_dev(), which clears up the device's data,
including the structure itself.

However, when xillyusb_open() is called, this reference count becomes
tricky: This function needs to obtain the xillyusb_dev structure that
relates to the inode's major and minor (as there can be several such).
xillybus_find_inode() (which is defined in xillybus_class.c) is called
for this purpose. xillybus_find_inode() holds a mutex that is global in
xillybus_class.c to protect the list of devices, and releases this
mutex before returning. As a result, nothing protects the xillyusb_dev's
reference counter from being decremented to zero before xillyusb_open()
increments it on its own behalf. Hence the structure can be freed
due to a rare race condition.

To solve this, a mutex is added. It is locked by xillyusb_open() before
the call to xillybus_find_inode() and is released only after the kref
counter has been incremented on behalf of the newly opened inode. This
protects the kref reference counters of all xillyusb_dev structs from
being decremented by xillyusb_disconnect() during this time segment, as
the call to kref_put() in this function is done with the same lock held.

There is no need to hold the lock on other calls to kref_put(), because
if xillybus_find_inode() finds a struct, xillyusb_disconnect() has not
made the call to remove it, and hence not made its call to kref_put(),
which takes place afterwards. Hence preventing xillyusb_disconnect's
call to kref_put() is enough to ensure that the reference doesn't reach
zero before it's incremented by xillyusb_open().

It would have been more natural to increment the reference count in
xillybus_find_inode() of course, however this function is also called by
Xillybus' driver for PCIe / OF, which registers a completely different
structure. Therefore, xillybus_find_inode() treats these structures as
void pointers, and accordingly can't make any changes.

Reported-by: Hyunwoo Kim <[email protected]>
Suggested-by: Alan Stern <[email protected]>
Signed-off-by: Eli Billauer <[email protected]>
---

Notes:
Changelog:

v2
-- Rewrite completely: Instead of juggling with a mutex, add a new
one.

drivers/char/xillybus/xillyusb.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/drivers/char/xillybus/xillyusb.c b/drivers/char/xillybus/xillyusb.c
index 39bcbfd908b4..5a5afa14ca8c 100644
--- a/drivers/char/xillybus/xillyusb.c
+++ b/drivers/char/xillybus/xillyusb.c
@@ -184,6 +184,14 @@ struct xillyusb_dev {
struct mutex process_in_mutex; /* synchronize wakeup_all() */
};

+/*
+ * kref_mutex is used in xillyusb_open() to prevent the xillyusb_dev
+ * struct from being freed during the gap between being found by
+ * xillybus_find_inode() and having its reference count incremented.
+ */
+
+static DEFINE_MUTEX(kref_mutex);
+
/* FPGA to host opcodes */
enum {
OPCODE_DATA = 0,
@@ -1237,9 +1245,16 @@ static int xillyusb_open(struct inode *inode, struct file *filp)
int rc;
int index;

+ mutex_lock(&kref_mutex);
+
rc = xillybus_find_inode(inode, (void **)&xdev, &index);
- if (rc)
+ if (rc) {
+ mutex_unlock(&kref_mutex);
return rc;
+ }
+
+ kref_get(&xdev->kref);
+ mutex_unlock(&kref_mutex);

chan = &xdev->channels[index];
filp->private_data = chan;
@@ -1275,8 +1290,6 @@ static int xillyusb_open(struct inode *inode, struct file *filp)
((filp->f_mode & FMODE_WRITE) && chan->open_for_write))
goto unmutex_fail;

- kref_get(&xdev->kref);
-
if (filp->f_mode & FMODE_READ)
chan->open_for_read = 1;

@@ -1413,6 +1426,7 @@ static int xillyusb_open(struct inode *inode, struct file *filp)
return rc;

unmutex_fail:
+ kref_put(&xdev->kref, cleanup_dev);
mutex_unlock(&chan->lock);
return rc;
}
@@ -2227,7 +2241,9 @@ static void xillyusb_disconnect(struct usb_interface *interface)

xdev->dev = NULL;

+ mutex_lock(&kref_mutex);
kref_put(&xdev->kref, cleanup_dev);
+ mutex_unlock(&kref_mutex);
}

static struct usb_driver xillyusb_driver = {
--
2.17.1



2022-10-30 17:19:05

by Alan Stern

[permalink] [raw]
Subject: Re: [PATCH v2] char: xillybus: Prevent use-after-free due to race condition

On Sun, Oct 30, 2022 at 11:42:09AM +0200, Eli Billauer wrote:
> The driver for XillyUSB devices maintains a kref reference count on each
> xillyusb_dev structure, which represents a physical device. This reference
> count reaches zero when the device has been disconnected and there are no
> open file descriptors that are related to the device. When this occurs,
> kref_put() calls cleanup_dev(), which clears up the device's data,
> including the structure itself.
>
> However, when xillyusb_open() is called, this reference count becomes
> tricky: This function needs to obtain the xillyusb_dev structure that
> relates to the inode's major and minor (as there can be several such).
> xillybus_find_inode() (which is defined in xillybus_class.c) is called
> for this purpose. xillybus_find_inode() holds a mutex that is global in
> xillybus_class.c to protect the list of devices, and releases this
> mutex before returning. As a result, nothing protects the xillyusb_dev's
> reference counter from being decremented to zero before xillyusb_open()
> increments it on its own behalf. Hence the structure can be freed
> due to a rare race condition.
>
> To solve this, a mutex is added. It is locked by xillyusb_open() before
> the call to xillybus_find_inode() and is released only after the kref
> counter has been incremented on behalf of the newly opened inode. This
> protects the kref reference counters of all xillyusb_dev structs from
> being decremented by xillyusb_disconnect() during this time segment, as
> the call to kref_put() in this function is done with the same lock held.
>
> There is no need to hold the lock on other calls to kref_put(), because
> if xillybus_find_inode() finds a struct, xillyusb_disconnect() has not
> made the call to remove it, and hence not made its call to kref_put(),
> which takes place afterwards. Hence preventing xillyusb_disconnect's
> call to kref_put() is enough to ensure that the reference doesn't reach
> zero before it's incremented by xillyusb_open().
>
> It would have been more natural to increment the reference count in
> xillybus_find_inode() of course, however this function is also called by
> Xillybus' driver for PCIe / OF, which registers a completely different
> structure. Therefore, xillybus_find_inode() treats these structures as
> void pointers, and accordingly can't make any changes.
>
> Reported-by: Hyunwoo Kim <[email protected]>
> Suggested-by: Alan Stern <[email protected]>
> Signed-off-by: Eli Billauer <[email protected]>

It looks like the xillybus driver already has a private mutex that would
have been very well suited for this task: unit_mutex defined in
xillybus_class.c. Of course, there's nothing wrong with using a new
mutex instead -- just make sure there aren't any ABBA locking order
problems.

Alan Stern

2022-10-30 19:16:09

by Eli Billauer

[permalink] [raw]
Subject: Re: [PATCH v2] char: xillybus: Prevent use-after-free due to race condition

On 30/10/2022 18:23, Alan Stern wrote:
> It looks like the xillybus driver already has a private mutex that would
> have been very well suited for this task: unit_mutex defined in
> xillybus_class.c.

Indeed so. The problem is that unit_mutex is global to xillybus_class.c,
and I need the mutex to be released in xillyusb.c: The kref counter,
which needs to be incremented with a mutex held, is inside a structure
that only the XillyUSB driver knows about, so xillybus_class can't do
that. Which is why xillybus_find_inode() passed a pointer to unit_mutex
to its caller in the v1 version of this patch. But that turned out to
be too odd.

> Of course, there's nothing wrong with using a new
> mutex instead -- just make sure there aren't any ABBA locking order
> problems.

The kref_mutex is always taken with no other (Xillybus-related) mutex
taken. So the locking order is assured.

But thanks for the reminder. Never hurts checking again.

Regards,
Eli

2022-11-13 08:49:31

by Hyunwoo Kim

[permalink] [raw]
Subject: Re: [PATCH v2] char: xillybus: Prevent use-after-free due to race condition

Dear,

Sorry for the late review.

This patch cannot prevent the UAF scenario I presented:
```
cpu0 cpu1
1. xillyusb_open()
mutex_lock(&kref_mutex); // unaffected lock
xillybus_find_inode()
mutex_lock(&unit_mutex);
unit = iter;
mutex_unlock(&unit_mutex);
2. xillyusb_disconnect()
xillybus_cleanup_chrdev()
mutex_lock(&unit_mutex);
kfree(unit);
mutex_unlock(&unit_mutex);
3. *private_data = unit->private_data; // UAF

```

This is a UAF for 'unit', not a UAF for 'xdev'.
So, the added 'kref_mutex' has no effect.


Regards,
Hyunwoo Kim

2022-11-13 09:20:49

by Hyunwoo Kim

[permalink] [raw]
Subject: Re: [PATCH v2] char: xillybus: Prevent use-after-free due to race condition

Dear,

On Sun, Nov 13, 2022 at 10:30:16AM +0200, Eli Billauer wrote:
> On 13/11/2022 10:05, Hyunwoo Kim wrote:
> > Dear,
> >
> > Sorry for the late review.
> >
> > This patch cannot prevent the UAF scenario I presented:
> > ```
> > cpu0 cpu1
> > 1. xillyusb_open()
> > mutex_lock(&kref_mutex); // unaffected lock
> > xillybus_find_inode()
> > mutex_lock(&unit_mutex);
> > unit = iter;
> > mutex_unlock(&unit_mutex);
> > 2. xillyusb_disconnect()
> > xillybus_cleanup_chrdev()
> > mutex_lock(&unit_mutex);
> > kfree(unit);
> > mutex_unlock(&unit_mutex);
> > 3. *private_data = unit->private_data; // UAF
> >
> > ```
> >
> > This is a UAF for 'unit', not a UAF for 'xdev'.
> > So, the added 'kref_mutex' has no effect.
> >
>
> You're correct. This submitted patch solves only one problem out of two. It
> prevents the content of @private_data to be freed, but you're right that
> @unit itself isn't protected well enough.
>
> The problem you're pointing at is that @unit can be freed before its
> attempted use, because the mutex is released too early.
>
> This is easily solved by moving down the mutex_unlock() call to after @unit
> has been used. Do you want the pleasure to submit this patch, or should I?

I hope you work as before.

And, even if the mutex_unlock(&unit_mutex); of xillybus_find_inode()
is finally moved, xdev may be released before kref_get() is executed
if xillyusb_disconnect() ends just before the function returns.
(Of course, this is an extremely rare case.)

So, in xillyusb_open() we need to move kref_get() above xillybus_find_inode().


Regards,
Hyunwoo Kim

2022-11-13 09:28:22

by Eli Billauer

[permalink] [raw]
Subject: Re: [PATCH v2] char: xillybus: Prevent use-after-free due to race condition

On 13/11/2022 10:05, Hyunwoo Kim wrote:
> Dear,
>
> Sorry for the late review.
>
> This patch cannot prevent the UAF scenario I presented:
> ```
> cpu0 cpu1
> 1. xillyusb_open()
> mutex_lock(&kref_mutex); // unaffected lock
> xillybus_find_inode()
> mutex_lock(&unit_mutex);
> unit = iter;
> mutex_unlock(&unit_mutex);
> 2. xillyusb_disconnect()
> xillybus_cleanup_chrdev()
> mutex_lock(&unit_mutex);
> kfree(unit);
> mutex_unlock(&unit_mutex);
> 3. *private_data = unit->private_data; // UAF
>
> ```
>
> This is a UAF for 'unit', not a UAF for 'xdev'.
> So, the added 'kref_mutex' has no effect.
>

You're correct. This submitted patch solves only one problem out of two.
It prevents the content of @private_data to be freed, but you're right
that @unit itself isn't protected well enough.

The problem you're pointing at is that @unit can be freed before its
attempted use, because the mutex is released too early.

This is easily solved by moving down the mutex_unlock() call to after
@unit has been used. Do you want the pleasure to submit this patch, or
should I?

Thanks,
Eli

2022-11-13 09:37:14

by Hyunwoo Kim

[permalink] [raw]
Subject: Re: [PATCH v2] char: xillybus: Prevent use-after-free due to race condition

On Sun, Nov 13, 2022 at 11:03:20AM +0200, Eli Billauer wrote:
> On 13/11/2022 10:47, Hyunwoo Kim wrote:
> > And, even if the mutex_unlock(&unit_mutex); of xillybus_find_inode()
> > is finally moved, xdev may be released before kref_get() is executed
> > if xillyusb_disconnect() ends just before the function returns.
> > (Of course, this is an extremely rare case.)
> >
> > So, in xillyusb_open() we need to move kref_get() above xillybus_find_inode().
>
> First of all, that's impossible. kref_get() is called on a member of a
> specific @xdev's struct, and it's xillybus_find_inode()'s job to find it. So
> before the call to xillybus_find_inode(), we don't know which @xdev it is.
> That's the tricky part of all this.
>
> The solution of this submitted patch was a lock that briefly prevents the
> kref_put() of all @xdevs. The way it works is that if an @xdev is found by
> xillybus_find_inode(), it necessarily means that xillyusb_disconnect()'s
> call to xillybus_cleanup_chrdev() hasn't returned (yet). Therefore, holding
> @kref_mutex guarantees that the kref_put() call, which is later on, isn't
> reached for the @xdev that has been found.
>
> If you've found a flaw in this mechanism, please be more specific about it.

you're right.

It seems that you only need to move the location of unit_mutex
in xillybus_find_inode().


Regards,
Hyunwoo Kim

2022-11-13 09:49:30

by Eli Billauer

[permalink] [raw]
Subject: Re: [PATCH v2] char: xillybus: Prevent use-after-free due to race condition

On 13/11/2022 10:47, Hyunwoo Kim wrote:
> And, even if the mutex_unlock(&unit_mutex); of xillybus_find_inode()
> is finally moved, xdev may be released before kref_get() is executed
> if xillyusb_disconnect() ends just before the function returns.
> (Of course, this is an extremely rare case.)
>
> So, in xillyusb_open() we need to move kref_get() above xillybus_find_inode().

First of all, that's impossible. kref_get() is called on a member of a
specific @xdev's struct, and it's xillybus_find_inode()'s job to find
it. So before the call to xillybus_find_inode(), we don't know which
@xdev it is. That's the tricky part of all this.

The solution of this submitted patch was a lock that briefly prevents
the kref_put() of all @xdevs. The way it works is that if an @xdev is
found by xillybus_find_inode(), it necessarily means that
xillyusb_disconnect()'s call to xillybus_cleanup_chrdev() hasn't
returned (yet). Therefore, holding @kref_mutex guarantees that the
kref_put() call, which is later on, isn't reached for the @xdev that has
been found.

If you've found a flaw in this mechanism, please be more specific about it.

Regards,
Eli

2022-11-13 12:18:15

by Eli Billauer

[permalink] [raw]
Subject: Re: [PATCH v2] char: xillybus: Prevent use-after-free due to race condition

On 13/11/2022 11:14, Hyunwoo Kim wrote:
>
> It seems that you only need to move the location of unit_mutex
> in xillybus_find_inode().
>

Agreed. I'll prepare a follow-up patch to fix this issue as well.

Thanks a lot for drawing my attention to this.

Eli