The driver for XillyUSB devices maintains a kref reference count on each
xillyusb_dev structure, which represents a physical device. This reference
count reaches zero when the device has been disconnected and there are no
open file descriptors that are related to the device. When this occurs,
kref_put() calls cleanup_dev(), which clears up the device's data,
including the structure itself.
However, when xillyusb_open() is called, this reference count becomes
tricky: This function needs to obtain the xillyusb_dev structure that
relates to the inode's major and minor (as there can be several such).
xillybus_find_inode() (which is defined in xillybus_class.c) is called
for this purpose. xillybus_find_inode() holds a mutex that is global in
xillybus_class.c to protect the list of devices, and releases this
mutex before returning. As a result, nothing protects the xillyusb_dev's
reference counter from being decremented to zero before xillyusb_open()
increments it on its own behalf. Hence the structure can be freed
due to a rare race condition.
To solve this, a mutex is added. It is locked by xillyusb_open() before
the call to xillybus_find_inode() and is released only after the kref
counter has been incremented on behalf of the newly opened inode. This
protects the kref reference counters of all xillyusb_dev structs from
being decremented by xillyusb_disconnect() during this time segment, as
the call to kref_put() in this function is done with the same lock held.
There is no need to hold the lock on other calls to kref_put(), because
if xillybus_find_inode() finds a struct, xillyusb_disconnect() has not
made the call to remove it, and hence not made its call to kref_put(),
which takes place afterwards. Hence preventing xillyusb_disconnect's
call to kref_put() is enough to ensure that the reference doesn't reach
zero before it's incremented by xillyusb_open().
It would have been more natural to increment the reference count in
xillybus_find_inode() of course, however this function is also called by
Xillybus' driver for PCIe / OF, which registers a completely different
structure. Therefore, xillybus_find_inode() treats these structures as
void pointers, and accordingly can't make any changes.
Reported-by: Hyunwoo Kim <[email protected]>
Suggested-by: Alan Stern <[email protected]>
Signed-off-by: Eli Billauer <[email protected]>
---
Notes:
Changelog:
v2
-- Rewrite completely: Instead of juggling with a mutex, add a new
one.
drivers/char/xillybus/xillyusb.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
diff --git a/drivers/char/xillybus/xillyusb.c b/drivers/char/xillybus/xillyusb.c
index 39bcbfd908b4..5a5afa14ca8c 100644
--- a/drivers/char/xillybus/xillyusb.c
+++ b/drivers/char/xillybus/xillyusb.c
@@ -184,6 +184,14 @@ struct xillyusb_dev {
struct mutex process_in_mutex; /* synchronize wakeup_all() */
};
+/*
+ * kref_mutex is used in xillyusb_open() to prevent the xillyusb_dev
+ * struct from being freed during the gap between being found by
+ * xillybus_find_inode() and having its reference count incremented.
+ */
+
+static DEFINE_MUTEX(kref_mutex);
+
/* FPGA to host opcodes */
enum {
OPCODE_DATA = 0,
@@ -1237,9 +1245,16 @@ static int xillyusb_open(struct inode *inode, struct file *filp)
int rc;
int index;
+ mutex_lock(&kref_mutex);
+
rc = xillybus_find_inode(inode, (void **)&xdev, &index);
- if (rc)
+ if (rc) {
+ mutex_unlock(&kref_mutex);
return rc;
+ }
+
+ kref_get(&xdev->kref);
+ mutex_unlock(&kref_mutex);
chan = &xdev->channels[index];
filp->private_data = chan;
@@ -1275,8 +1290,6 @@ static int xillyusb_open(struct inode *inode, struct file *filp)
((filp->f_mode & FMODE_WRITE) && chan->open_for_write))
goto unmutex_fail;
- kref_get(&xdev->kref);
-
if (filp->f_mode & FMODE_READ)
chan->open_for_read = 1;
@@ -1413,6 +1426,7 @@ static int xillyusb_open(struct inode *inode, struct file *filp)
return rc;
unmutex_fail:
+ kref_put(&xdev->kref, cleanup_dev);
mutex_unlock(&chan->lock);
return rc;
}
@@ -2227,7 +2241,9 @@ static void xillyusb_disconnect(struct usb_interface *interface)
xdev->dev = NULL;
+ mutex_lock(&kref_mutex);
kref_put(&xdev->kref, cleanup_dev);
+ mutex_unlock(&kref_mutex);
}
static struct usb_driver xillyusb_driver = {
--
2.17.1
On Sun, Oct 30, 2022 at 11:42:09AM +0200, Eli Billauer wrote:
> The driver for XillyUSB devices maintains a kref reference count on each
> xillyusb_dev structure, which represents a physical device. This reference
> count reaches zero when the device has been disconnected and there are no
> open file descriptors that are related to the device. When this occurs,
> kref_put() calls cleanup_dev(), which clears up the device's data,
> including the structure itself.
>
> However, when xillyusb_open() is called, this reference count becomes
> tricky: This function needs to obtain the xillyusb_dev structure that
> relates to the inode's major and minor (as there can be several such).
> xillybus_find_inode() (which is defined in xillybus_class.c) is called
> for this purpose. xillybus_find_inode() holds a mutex that is global in
> xillybus_class.c to protect the list of devices, and releases this
> mutex before returning. As a result, nothing protects the xillyusb_dev's
> reference counter from being decremented to zero before xillyusb_open()
> increments it on its own behalf. Hence the structure can be freed
> due to a rare race condition.
>
> To solve this, a mutex is added. It is locked by xillyusb_open() before
> the call to xillybus_find_inode() and is released only after the kref
> counter has been incremented on behalf of the newly opened inode. This
> protects the kref reference counters of all xillyusb_dev structs from
> being decremented by xillyusb_disconnect() during this time segment, as
> the call to kref_put() in this function is done with the same lock held.
>
> There is no need to hold the lock on other calls to kref_put(), because
> if xillybus_find_inode() finds a struct, xillyusb_disconnect() has not
> made the call to remove it, and hence not made its call to kref_put(),
> which takes place afterwards. Hence preventing xillyusb_disconnect's
> call to kref_put() is enough to ensure that the reference doesn't reach
> zero before it's incremented by xillyusb_open().
>
> It would have been more natural to increment the reference count in
> xillybus_find_inode() of course, however this function is also called by
> Xillybus' driver for PCIe / OF, which registers a completely different
> structure. Therefore, xillybus_find_inode() treats these structures as
> void pointers, and accordingly can't make any changes.
>
> Reported-by: Hyunwoo Kim <[email protected]>
> Suggested-by: Alan Stern <[email protected]>
> Signed-off-by: Eli Billauer <[email protected]>
It looks like the xillybus driver already has a private mutex that would
have been very well suited for this task: unit_mutex defined in
xillybus_class.c. Of course, there's nothing wrong with using a new
mutex instead -- just make sure there aren't any ABBA locking order
problems.
Alan Stern
On 30/10/2022 18:23, Alan Stern wrote:
> It looks like the xillybus driver already has a private mutex that would
> have been very well suited for this task: unit_mutex defined in
> xillybus_class.c.
Indeed so. The problem is that unit_mutex is global to xillybus_class.c,
and I need the mutex to be released in xillyusb.c: The kref counter,
which needs to be incremented with a mutex held, is inside a structure
that only the XillyUSB driver knows about, so xillybus_class can't do
that. Which is why xillybus_find_inode() passed a pointer to unit_mutex
to its caller in the v1 version of this patch. But that turned out to
be too odd.
> Of course, there's nothing wrong with using a new
> mutex instead -- just make sure there aren't any ABBA locking order
> problems.
The kref_mutex is always taken with no other (Xillybus-related) mutex
taken. So the locking order is assured.
But thanks for the reminder. Never hurts checking again.
Regards,
Eli
Dear,
Sorry for the late review.
This patch cannot prevent the UAF scenario I presented:
```
cpu0 cpu1
1. xillyusb_open()
mutex_lock(&kref_mutex); // unaffected lock
xillybus_find_inode()
mutex_lock(&unit_mutex);
unit = iter;
mutex_unlock(&unit_mutex);
2. xillyusb_disconnect()
xillybus_cleanup_chrdev()
mutex_lock(&unit_mutex);
kfree(unit);
mutex_unlock(&unit_mutex);
3. *private_data = unit->private_data; // UAF
```
This is a UAF for 'unit', not a UAF for 'xdev'.
So, the added 'kref_mutex' has no effect.
Regards,
Hyunwoo Kim
Dear,
On Sun, Nov 13, 2022 at 10:30:16AM +0200, Eli Billauer wrote:
> On 13/11/2022 10:05, Hyunwoo Kim wrote:
> > Dear,
> >
> > Sorry for the late review.
> >
> > This patch cannot prevent the UAF scenario I presented:
> > ```
> > cpu0 cpu1
> > 1. xillyusb_open()
> > mutex_lock(&kref_mutex); // unaffected lock
> > xillybus_find_inode()
> > mutex_lock(&unit_mutex);
> > unit = iter;
> > mutex_unlock(&unit_mutex);
> > 2. xillyusb_disconnect()
> > xillybus_cleanup_chrdev()
> > mutex_lock(&unit_mutex);
> > kfree(unit);
> > mutex_unlock(&unit_mutex);
> > 3. *private_data = unit->private_data; // UAF
> >
> > ```
> >
> > This is a UAF for 'unit', not a UAF for 'xdev'.
> > So, the added 'kref_mutex' has no effect.
> >
>
> You're correct. This submitted patch solves only one problem out of two. It
> prevents the content of @private_data to be freed, but you're right that
> @unit itself isn't protected well enough.
>
> The problem you're pointing at is that @unit can be freed before its
> attempted use, because the mutex is released too early.
>
> This is easily solved by moving down the mutex_unlock() call to after @unit
> has been used. Do you want the pleasure to submit this patch, or should I?
I hope you work as before.
And, even if the mutex_unlock(&unit_mutex); of xillybus_find_inode()
is finally moved, xdev may be released before kref_get() is executed
if xillyusb_disconnect() ends just before the function returns.
(Of course, this is an extremely rare case.)
So, in xillyusb_open() we need to move kref_get() above xillybus_find_inode().
Regards,
Hyunwoo Kim
On 13/11/2022 10:05, Hyunwoo Kim wrote:
> Dear,
>
> Sorry for the late review.
>
> This patch cannot prevent the UAF scenario I presented:
> ```
> cpu0 cpu1
> 1. xillyusb_open()
> mutex_lock(&kref_mutex); // unaffected lock
> xillybus_find_inode()
> mutex_lock(&unit_mutex);
> unit = iter;
> mutex_unlock(&unit_mutex);
> 2. xillyusb_disconnect()
> xillybus_cleanup_chrdev()
> mutex_lock(&unit_mutex);
> kfree(unit);
> mutex_unlock(&unit_mutex);
> 3. *private_data = unit->private_data; // UAF
>
> ```
>
> This is a UAF for 'unit', not a UAF for 'xdev'.
> So, the added 'kref_mutex' has no effect.
>
You're correct. This submitted patch solves only one problem out of two.
It prevents the content of @private_data to be freed, but you're right
that @unit itself isn't protected well enough.
The problem you're pointing at is that @unit can be freed before its
attempted use, because the mutex is released too early.
This is easily solved by moving down the mutex_unlock() call to after
@unit has been used. Do you want the pleasure to submit this patch, or
should I?
Thanks,
Eli
On Sun, Nov 13, 2022 at 11:03:20AM +0200, Eli Billauer wrote:
> On 13/11/2022 10:47, Hyunwoo Kim wrote:
> > And, even if the mutex_unlock(&unit_mutex); of xillybus_find_inode()
> > is finally moved, xdev may be released before kref_get() is executed
> > if xillyusb_disconnect() ends just before the function returns.
> > (Of course, this is an extremely rare case.)
> >
> > So, in xillyusb_open() we need to move kref_get() above xillybus_find_inode().
>
> First of all, that's impossible. kref_get() is called on a member of a
> specific @xdev's struct, and it's xillybus_find_inode()'s job to find it. So
> before the call to xillybus_find_inode(), we don't know which @xdev it is.
> That's the tricky part of all this.
>
> The solution of this submitted patch was a lock that briefly prevents the
> kref_put() of all @xdevs. The way it works is that if an @xdev is found by
> xillybus_find_inode(), it necessarily means that xillyusb_disconnect()'s
> call to xillybus_cleanup_chrdev() hasn't returned (yet). Therefore, holding
> @kref_mutex guarantees that the kref_put() call, which is later on, isn't
> reached for the @xdev that has been found.
>
> If you've found a flaw in this mechanism, please be more specific about it.
you're right.
It seems that you only need to move the location of unit_mutex
in xillybus_find_inode().
Regards,
Hyunwoo Kim
On 13/11/2022 10:47, Hyunwoo Kim wrote:
> And, even if the mutex_unlock(&unit_mutex); of xillybus_find_inode()
> is finally moved, xdev may be released before kref_get() is executed
> if xillyusb_disconnect() ends just before the function returns.
> (Of course, this is an extremely rare case.)
>
> So, in xillyusb_open() we need to move kref_get() above xillybus_find_inode().
First of all, that's impossible. kref_get() is called on a member of a
specific @xdev's struct, and it's xillybus_find_inode()'s job to find
it. So before the call to xillybus_find_inode(), we don't know which
@xdev it is. That's the tricky part of all this.
The solution of this submitted patch was a lock that briefly prevents
the kref_put() of all @xdevs. The way it works is that if an @xdev is
found by xillybus_find_inode(), it necessarily means that
xillyusb_disconnect()'s call to xillybus_cleanup_chrdev() hasn't
returned (yet). Therefore, holding @kref_mutex guarantees that the
kref_put() call, which is later on, isn't reached for the @xdev that has
been found.
If you've found a flaw in this mechanism, please be more specific about it.
Regards,
Eli
On 13/11/2022 11:14, Hyunwoo Kim wrote:
>
> It seems that you only need to move the location of unit_mutex
> in xillybus_find_inode().
>
Agreed. I'll prepare a follow-up patch to fix this issue as well.
Thanks a lot for drawing my attention to this.
Eli