2022-06-10 12:39:49

by zhangfei

[permalink] [raw]
Subject: [PATCH] uacce: fix concurrency of fops_open and uacce_remove

The uacce parent's module can be removed when uacce is working,
which may cause troubles.

If rmmod/uacce_remove happens just after fops_open: bind_queue,
the uacce_remove can not remove the bound queue since it is not
added to the queue list yet, which blocks the uacce_disable_sva.

Change queues_lock area to make sure the bound queue is added to
the list thereby can be searched in uacce_remove.

And uacce->parent->driver is checked immediately in case rmmod is
just happening.

Also the parent driver must always stop DMA before calling
uacce_remove.

Signed-off-by: Yang Shen <[email protected]>
Signed-off-by: Zhangfei Gao <[email protected]>
---
drivers/misc/uacce/uacce.c | 19 +++++++++++++------
1 file changed, 13 insertions(+), 6 deletions(-)

diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
index 281c54003edc..b6219c6bfb48 100644
--- a/drivers/misc/uacce/uacce.c
+++ b/drivers/misc/uacce/uacce.c
@@ -136,9 +136,16 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
if (!q)
return -ENOMEM;

+ mutex_lock(&uacce->queues_lock);
+
+ if (!uacce->parent->driver) {
+ ret = -ENODEV;
+ goto out_with_lock;
+ }
+
ret = uacce_bind_queue(uacce, q);
if (ret)
- goto out_with_mem;
+ goto out_with_lock;

q->uacce = uacce;

@@ -153,7 +160,6 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
uacce->inode = inode;
q->state = UACCE_Q_INIT;

- mutex_lock(&uacce->queues_lock);
list_add(&q->list, &uacce->queues);
mutex_unlock(&uacce->queues_lock);

@@ -161,7 +167,8 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)

out_with_bond:
uacce_unbind_queue(q);
-out_with_mem:
+out_with_lock:
+ mutex_unlock(&uacce->queues_lock);
kfree(q);
return ret;
}
@@ -171,10 +178,10 @@ static int uacce_fops_release(struct inode *inode, struct file *filep)
struct uacce_queue *q = filep->private_data;

mutex_lock(&q->uacce->queues_lock);
- list_del(&q->list);
- mutex_unlock(&q->uacce->queues_lock);
uacce_put_queue(q);
uacce_unbind_queue(q);
+ list_del(&q->list);
+ mutex_unlock(&q->uacce->queues_lock);
kfree(q);

return 0;
@@ -513,10 +520,10 @@ void uacce_remove(struct uacce_device *uacce)
uacce_put_queue(q);
uacce_unbind_queue(q);
}
- mutex_unlock(&uacce->queues_lock);

/* disable sva now since no opened queues */
uacce_disable_sva(uacce);
+ mutex_unlock(&uacce->queues_lock);

if (uacce->cdev)
cdev_device_del(uacce->cdev, &uacce->dev);
--
2.36.1


2022-06-15 15:17:58

by Jean-Philippe Brucker

[permalink] [raw]
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove

Hi,

On Fri, Jun 10, 2022 at 08:34:23PM +0800, Zhangfei Gao wrote:
> The uacce parent's module can be removed when uacce is working,
> which may cause troubles.
>
> If rmmod/uacce_remove happens just after fops_open: bind_queue,
> the uacce_remove can not remove the bound queue since it is not
> added to the queue list yet, which blocks the uacce_disable_sva.
>
> Change queues_lock area to make sure the bound queue is added to
> the list thereby can be searched in uacce_remove.
>
> And uacce->parent->driver is checked immediately in case rmmod is
> just happening.
>
> Also the parent driver must always stop DMA before calling
> uacce_remove.
>
> Signed-off-by: Yang Shen <[email protected]>
> Signed-off-by: Zhangfei Gao <[email protected]>
> ---
> drivers/misc/uacce/uacce.c | 19 +++++++++++++------
> 1 file changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
> index 281c54003edc..b6219c6bfb48 100644
> --- a/drivers/misc/uacce/uacce.c
> +++ b/drivers/misc/uacce/uacce.c
> @@ -136,9 +136,16 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
> if (!q)
> return -ENOMEM;
>
> + mutex_lock(&uacce->queues_lock);
> +
> + if (!uacce->parent->driver) {

I don't think this is useful, because the core clears parent->driver after
having run uacce_remove():

rmmod hisi_zip open()
... uacce_fops_open()
__device_release_driver() ...
pci_device_remove()
hisi_zip_remove()
hisi_qm_uninit()
uacce_remove()
... ...
mutex_lock(uacce->queues_lock)
... if (!uacce->parent->driver)
device_unbind_cleanup() /* driver still valid, proceed */
dev->driver = NULL

Since uacce_remove() disabled SVA, the following uacce_bind_queue() will
fail anyway. However, if uacce->flags does not have UACCE_DEV_SVA set,
we'll proceed further and call uacce->ops->get_queue(), which does not
exist anymore since the parent module is gone.

I think we need the global uacce_mutex to serialize uacce_remove() and
uacce_fops_open(). uacce_remove() would do everything, including
xa_erase(), while holding that mutex. And uacce_fops_open() would try to
obtain the uacce object from the xarray while holding the mutex, which
fails if the uacce object is being removed.

Thanks,
Jean

> + ret = -ENODEV;
> + goto out_with_lock;
> + }
> +
> ret = uacce_bind_queue(uacce, q);
> if (ret)
> - goto out_with_mem;
> + goto out_with_lock;
>
> q->uacce = uacce;
>
> @@ -153,7 +160,6 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
> uacce->inode = inode;
> q->state = UACCE_Q_INIT;
>
> - mutex_lock(&uacce->queues_lock);
> list_add(&q->list, &uacce->queues);
> mutex_unlock(&uacce->queues_lock);
>
> @@ -161,7 +167,8 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
>
> out_with_bond:
> uacce_unbind_queue(q);
> -out_with_mem:
> +out_with_lock:
> + mutex_unlock(&uacce->queues_lock);
> kfree(q);
> return ret;
> }
> @@ -171,10 +178,10 @@ static int uacce_fops_release(struct inode *inode, struct file *filep)
> struct uacce_queue *q = filep->private_data;
>
> mutex_lock(&q->uacce->queues_lock);
> - list_del(&q->list);
> - mutex_unlock(&q->uacce->queues_lock);
> uacce_put_queue(q);
> uacce_unbind_queue(q);
> + list_del(&q->list);
> + mutex_unlock(&q->uacce->queues_lock);
> kfree(q);
>
> return 0;
> @@ -513,10 +520,10 @@ void uacce_remove(struct uacce_device *uacce)
> uacce_put_queue(q);
> uacce_unbind_queue(q);
> }
> - mutex_unlock(&uacce->queues_lock);
>
> /* disable sva now since no opened queues */
> uacce_disable_sva(uacce);
> + mutex_unlock(&uacce->queues_lock);
>
> if (uacce->cdev)
> cdev_device_del(uacce->cdev, &uacce->dev);
> --
> 2.36.1
>

2022-06-16 04:13:44

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove

Hi, Jean

On 2022/6/15 下午11:16, Jean-Philippe Brucker wrote:
> Hi,
>
> On Fri, Jun 10, 2022 at 08:34:23PM +0800, Zhangfei Gao wrote:
>> The uacce parent's module can be removed when uacce is working,
>> which may cause troubles.
>>
>> If rmmod/uacce_remove happens just after fops_open: bind_queue,
>> the uacce_remove can not remove the bound queue since it is not
>> added to the queue list yet, which blocks the uacce_disable_sva.
>>
>> Change queues_lock area to make sure the bound queue is added to
>> the list thereby can be searched in uacce_remove.
>>
>> And uacce->parent->driver is checked immediately in case rmmod is
>> just happening.
>>
>> Also the parent driver must always stop DMA before calling
>> uacce_remove.
>>
>> Signed-off-by: Yang Shen <[email protected]>
>> Signed-off-by: Zhangfei Gao <[email protected]>
>> ---
>> drivers/misc/uacce/uacce.c | 19 +++++++++++++------
>> 1 file changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
>> index 281c54003edc..b6219c6bfb48 100644
>> --- a/drivers/misc/uacce/uacce.c
>> +++ b/drivers/misc/uacce/uacce.c
>> @@ -136,9 +136,16 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
>> if (!q)
>> return -ENOMEM;
>>
>> + mutex_lock(&uacce->queues_lock);
>> +
>> + if (!uacce->parent->driver) {
> I don't think this is useful, because the core clears parent->driver after
> having run uacce_remove():
>
> rmmod hisi_zip open()
> ... uacce_fops_open()
> __device_release_driver() ...
> pci_device_remove()
> hisi_zip_remove()
> hisi_qm_uninit()
> uacce_remove()
> ... ...
> mutex_lock(uacce->queues_lock)
> ... if (!uacce->parent->driver)
> device_unbind_cleanup() /* driver still valid, proceed */
> dev->driver = NULL

The check  if (!uacce->parent->driver) is required, otherwise NULL
pointer may happen.
iommu_sva_bind_device
const struct iommu_ops *ops = dev_iommu_ops(dev);  ->
dev->iommu->iommu_dev->ops

rmmod has no issue, but remove parent pci device has the issue.

Test:
sleep in fops_open before mutex.

estuary:/mnt$ ./work/a.out &
//sleep in fops_open

echo 1 > /sys/bus/pci/devices/0000:00:02.0/remove &
estuary:/mnt$ [   22.594348] uacce_remove!
[   22.594663] pci 0000:00:02.0: Removing from iommu group 2
[   22.595073] iommu_release_device dev->iommu=0
[   22.595076] CPU: 2 PID: 229 Comm: ash Not tainted
5.19.0-rc1-15071-gcbcf098c5257-dirty #633
[   22.595079] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0
02/06/2015
[   22.595080] Call trace:
[   22.595080]  dump_backtrace+0xe4/0xf0
[   22.595085]  show_stack+0x20/0x70
[   22.595086]  dump_stack_lvl+0x8c/0xb8
[   22.595089]  dump_stack+0x18/0x34
[   22.595091]  iommu_release_device+0x90/0x98
[   22.595095]  iommu_bus_notifier+0x58/0x68
[   22.595097]  blocking_notifier_call_chain+0x74/0xa8
[   22.595100]  device_del+0x268/0x3b0
[   22.595102]  pci_remove_bus_device+0x84/0x110
[   22.595106]  pci_stop_and_remove_bus_device_locked+0x30/0x60
...

estuary:/mnt$ [   31.466360] uacce: sleep end!
[   31.466362] uacce->parent->driver=0
[   31.466364] uacce->parent->iommu=0
[   31.466365] uacce_bind_queue!
[   31.466366] uacce_bind_queue call iommu_sva_bind_device!
[   31.466367] uacce->parent=d120d0
[   31.466371] Unable to handle kernel NULL pointer dereference at
virtual address 0000000000000038
[   31.472870] Mem abort info:
[   31.473450]   ESR = 0x0000000096000004
[   31.474223]   EC = 0x25: DABT (current EL), IL = 32 bits
[   31.475390]   SET = 0, FnV = 0
[   31.476031]   EA = 0, S1PTW = 0
[   31.476680]   FSC = 0x04: level 0 translation fault
[   31.477687] Data abort info:
[   31.478294]   ISV = 0, ISS = 0x00000004
[   31.479152]   CM = 0, WnR = 0
[   31.479785] user pgtable: 4k pages, 48-bit VAs, pgdp=00000000714d8000
[   31.481144] [0000000000000038] pgd=0000000000000000, p4d=0000000000000000
[   31.482622] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[   31.483784] Modules linked in: hisi_zip
[   31.484590] CPU: 2 PID: 228 Comm: a.out Not tainted
5.19.0-rc1-15071-gcbcf098c5257-dirty #633
[   31.486374] Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0
02/06/2015
[   31.487862] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS
BTYPE=--)
[   31.489390] pc : iommu_sva_bind_device+0x44/0xf4
[   31.490404] lr : uacce_fops_open+0x128/0x234

>
> Since uacce_remove() disabled SVA, the following uacce_bind_queue() will
> fail anyway. However, if uacce->flags does not have UACCE_DEV_SVA set,
> we'll proceed further and call uacce->ops->get_queue(), which does not
> exist anymore since the parent module is gone.
>
> I think we need the global uacce_mutex to serialize uacce_remove() and
> uacce_fops_open(). uacce_remove() would do everything, including
> xa_erase(), while holding that mutex. And uacce_fops_open() would try to
> obtain the uacce object from the xarray while holding the mutex, which
> fails if the uacce object is being removed.

Since fops_open get char device refcount, uacce_release will not happen
until open returns.
So either uacce = xa_load(&uacce_xa, iminor(inode)) is got,
uacce_release release uacce after fops_release.
Or uacce is not got and return -ENODEV.

open:
        uacce = xa_load(&uacce_xa, iminor(inode));
        if (!uacce)
                return -ENODEV;

uacce->dev.release = uacce_release;
uacce_release: kfree(uacce);

Thanks

2022-06-16 08:16:29

by Jean-Philippe Brucker

[permalink] [raw]
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove

On Thu, Jun 16, 2022 at 12:10:18PM +0800, Zhangfei Gao wrote:
> > > diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
> > > index 281c54003edc..b6219c6bfb48 100644
> > > --- a/drivers/misc/uacce/uacce.c
> > > +++ b/drivers/misc/uacce/uacce.c
> > > @@ -136,9 +136,16 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
> > > if (!q)
> > > return -ENOMEM;
> > > + mutex_lock(&uacce->queues_lock);
> > > +
> > > + if (!uacce->parent->driver) {
> > I don't think this is useful, because the core clears parent->driver after
> > having run uacce_remove():
> >
> > rmmod hisi_zip open()
> > ... uacce_fops_open()
> > __device_release_driver() ...
> > pci_device_remove()
> > hisi_zip_remove()
> > hisi_qm_uninit()
> > uacce_remove()
> > ... ...
> > mutex_lock(uacce->queues_lock)
> > ... if (!uacce->parent->driver)
> > device_unbind_cleanup() /* driver still valid, proceed */
> > dev->driver = NULL
>
> The check  if (!uacce->parent->driver) is required, otherwise NULL pointer
> may happen.

I agree we need something, what I mean is that this check is not
sufficient.

> iommu_sva_bind_device
> const struct iommu_ops *ops = dev_iommu_ops(dev);  ->
> dev->iommu->iommu_dev->ops
>
> rmmod has no issue, but remove parent pci device has the issue.

Ah right, relying on the return value of bind() wouldn't be enough even if
we mandated SVA.

[...]
> >
> > I think we need the global uacce_mutex to serialize uacce_remove() and
> > uacce_fops_open(). uacce_remove() would do everything, including
> > xa_erase(), while holding that mutex. And uacce_fops_open() would try to
> > obtain the uacce object from the xarray while holding the mutex, which
> > fails if the uacce object is being removed.
>
> Since fops_open get char device refcount, uacce_release will not happen
> until open returns.

The refcount only ensures that the uacce_device object is not freed as
long as there are open fds. But uacce_remove() can run while there are
open fds, or fds in the process of being opened. And atfer uacce_remove()
runs, the uacce_device object still exists but is mostly unusable. For
example once the module is freed, uacce->ops is not valid anymore. But
currently uacce_fops_open() may dereference the ops in this case:

uacce_fops_open()
if (!uacce->parent->driver)
/* Still valid, keep going */
... rmmod
uacce_remove()
... free_module()
uacce->ops->get_queue() /* BUG */

Accessing uacce->ops after free_module() is a use-after-free. We need all
the fops to synchronize with uacce_remove() to ensure they don't use any
resource of the parent after it's been freed.

I see uacce_fops_poll() may have the same problem, and should be inside
uacce_mutex.

Thanks,
Jean

2022-06-17 06:13:18

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove



On 2022/6/16 下午4:14, Jean-Philippe Brucker wrote:
> On Thu, Jun 16, 2022 at 12:10:18PM +0800, Zhangfei Gao wrote:
>>>> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
>>>> index 281c54003edc..b6219c6bfb48 100644
>>>> --- a/drivers/misc/uacce/uacce.c
>>>> +++ b/drivers/misc/uacce/uacce.c
>>>> @@ -136,9 +136,16 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
>>>> if (!q)
>>>> return -ENOMEM;
>>>> + mutex_lock(&uacce->queues_lock);
>>>> +
>>>> + if (!uacce->parent->driver) {
>>> I don't think this is useful, because the core clears parent->driver after
>>> having run uacce_remove():
>>>
>>> rmmod hisi_zip open()
>>> ... uacce_fops_open()
>>> __device_release_driver() ...
>>> pci_device_remove()
>>> hisi_zip_remove()
>>> hisi_qm_uninit()
>>> uacce_remove()
>>> ... ...
>>> mutex_lock(uacce->queues_lock)
>>> ... if (!uacce->parent->driver)
>>> device_unbind_cleanup() /* driver still valid, proceed */
>>> dev->driver = NULL
>> The check  if (!uacce->parent->driver) is required, otherwise NULL pointer
>> may happen.
> I agree we need something, what I mean is that this check is not
> sufficient.
>
>> iommu_sva_bind_device
>> const struct iommu_ops *ops = dev_iommu_ops(dev);  ->
>> dev->iommu->iommu_dev->ops
>>
>> rmmod has no issue, but remove parent pci device has the issue.
> Ah right, relying on the return value of bind() wouldn't be enough even if
> we mandated SVA.
>
> [...]
>>> I think we need the global uacce_mutex to serialize uacce_remove() and
>>> uacce_fops_open(). uacce_remove() would do everything, including
>>> xa_erase(), while holding that mutex. And uacce_fops_open() would try to
>>> obtain the uacce object from the xarray while holding the mutex, which
>>> fails if the uacce object is being removed.
>> Since fops_open get char device refcount, uacce_release will not happen
>> until open returns.
> The refcount only ensures that the uacce_device object is not freed as
> long as there are open fds. But uacce_remove() can run while there are
> open fds, or fds in the process of being opened. And atfer uacce_remove()
> runs, the uacce_device object still exists but is mostly unusable. For
> example once the module is freed, uacce->ops is not valid anymore. But
> currently uacce_fops_open() may dereference the ops in this case:
>
> uacce_fops_open()
> if (!uacce->parent->driver)
> /* Still valid, keep going */
> ... rmmod
> uacce_remove()
> ... free_module()
> uacce->ops->get_queue() /* BUG */

uacce_remove should wait for uacce->queues_lock, until fops_open release
the lock.
If open happen just after the uacce_remove: unlock, uacce_bind_queue in
open should fail.

> Accessing uacce->ops after free_module() is a use-after-free. We need all
you men parent release the resources.
> the fops to synchronize with uacce_remove() to ensure they don't use any
> resource of the parent after it's been freed.
After fops_open, currently we are counting on parent driver stop all dma
first, then call uacce_remove, which is assumption.
Like drivers/crypto/hisilicon/zip/zip_main.c: hisi_qm_wait_task_finish,
which will wait uacce_release.
If comments this , there may other issue,
Unable to handle kernel paging request at virtual address ffff80000b700204
pc : hisi_qm_cache_wb.part.0+0x2c/0xa0

> I see uacce_fops_poll() may have the same problem, and should be inside
> uacce_mutex.
Do we need consider this, uacce_remove can happen anytime but not
waiting dma stop?

Not sure uacce_mutex can do this.
Currently the sequence is
mutex_lock(&uacce->queues_lock);
mutex_lock(&uacce_mutex);

Or we set all the callbacks of uacce_ops to NULL?
Module_get/put only works for module, but not for removing device.

Thanks

>
> Thanks,
> Jean

2022-06-17 08:22:17

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove



On 2022/6/17 下午2:05, Zhangfei Gao wrote:
>
>
> On 2022/6/16 下午4:14, Jean-Philippe Brucker wrote:
>> On Thu, Jun 16, 2022 at 12:10:18PM +0800, Zhangfei Gao wrote:
>>>>> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
>>>>> index 281c54003edc..b6219c6bfb48 100644
>>>>> --- a/drivers/misc/uacce/uacce.c
>>>>> +++ b/drivers/misc/uacce/uacce.c
>>>>> @@ -136,9 +136,16 @@ static int uacce_fops_open(struct inode
>>>>> *inode, struct file *filep)
>>>>>        if (!q)
>>>>>            return -ENOMEM;
>>>>> +    mutex_lock(&uacce->queues_lock);
>>>>> +
>>>>> +    if (!uacce->parent->driver) {
>>>> I don't think this is useful, because the core clears
>>>> parent->driver after
>>>> having run uacce_remove():
>>>>
>>>>     rmmod hisi_zip        open()
>>>>      ...                 uacce_fops_open()
>>>>      __device_release_driver()      ...
>>>>       pci_device_remove()
>>>>        hisi_zip_remove()
>>>>         hisi_qm_uninit()
>>>>          uacce_remove()
>>>>           ...              ...
>>>>                        mutex_lock(uacce->queues_lock)
>>>>       ...                  if (!uacce->parent->driver)
>>>>       device_unbind_cleanup()      /* driver still valid, proceed */
>>>>        dev->driver = NULL
>>> The check  if (!uacce->parent->driver) is required, otherwise NULL
>>> pointer
>>> may happen.
>> I agree we need something, what I mean is that this check is not
>> sufficient.
>>
>>> iommu_sva_bind_device
>>> const struct iommu_ops *ops = dev_iommu_ops(dev);  ->
>>> dev->iommu->iommu_dev->ops
>>>
>>> rmmod has no issue, but remove parent pci device has the issue.
>> Ah right, relying on the return value of bind() wouldn't be enough
>> even if
>> we mandated SVA.
>>
>> [...]
>>>> I think we need the global uacce_mutex to serialize uacce_remove() and
>>>> uacce_fops_open(). uacce_remove() would do everything, including
>>>> xa_erase(), while holding that mutex. And uacce_fops_open() would
>>>> try to
>>>> obtain the uacce object from the xarray while holding the mutex, which
>>>> fails if the uacce object is being removed.
>>> Since fops_open get char device refcount, uacce_release will not happen
>>> until open returns.
>> The refcount only ensures that the uacce_device object is not freed as
>> long as there are open fds. But uacce_remove() can run while there are
>> open fds, or fds in the process of being opened. And atfer
>> uacce_remove()
>> runs, the uacce_device object still exists but is mostly unusable. For
>> example once the module is freed, uacce->ops is not valid anymore. But
>> currently uacce_fops_open() may dereference the ops in this case:
>>
>>     uacce_fops_open()
>>      if (!uacce->parent->driver)
>>      /* Still valid, keep going */
>>      ...                    rmmod
>>                          uacce_remove()
>>      ...                     free_module()
>>      uacce->ops->get_queue() /* BUG */
>
> uacce_remove should wait for uacce->queues_lock, until fops_open
> release the lock.
> If open happen just after the uacce_remove: unlock, uacce_bind_queue
> in open should fail.
>
>> Accessing uacce->ops after free_module() is a use-after-free. We need
>> all
> you men parent release the resources.
>> the fops to synchronize with uacce_remove() to ensure they don't use any
>> resource of the parent after it's been freed.
> After fops_open, currently we are counting on parent driver stop all
> dma first, then call uacce_remove, which is assumption.
> Like drivers/crypto/hisilicon/zip/zip_main.c:
> hisi_qm_wait_task_finish, which will wait uacce_release.
> If comments this , there may other issue,
> Unable to handle kernel paging request at virtual address
> ffff80000b700204
> pc : hisi_qm_cache_wb.part.0+0x2c/0xa0
>
>> I see uacce_fops_poll() may have the same problem, and should be inside
>> uacce_mutex.
> Do we need consider this, uacce_remove can happen anytime but not
> waiting dma stop?
>
> Not sure uacce_mutex can do this.
> Currently the sequence is
> mutex_lock(&uacce->queues_lock);
> mutex_lock(&uacce_mutex);
>
> Or we set all the callbacks of uacce_ops to NULL?
How about in uacce_remove
mutex_lock(&uacce_mutex);
uacce->ops = NULL;
mutex_unlock(&uacce_mutex);

And check uacce->ops  first when using.

Or set all ops of uacce->ops to NULL.

> Module_get/put only works for module, but not for removing device.
>
> Thanks
>
>>
>> Thanks,
>> Jean
>

2022-06-17 14:25:27

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove



On 2022/6/17 下午4:20, Zhangfei Gao wrote:
>
>
> On 2022/6/17 下午2:05, Zhangfei Gao wrote:
>>
>>
>> On 2022/6/16 下午4:14, Jean-Philippe Brucker wrote:
>>> On Thu, Jun 16, 2022 at 12:10:18PM +0800, Zhangfei Gao wrote:
>>>>>> diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
>>>>>> index 281c54003edc..b6219c6bfb48 100644
>>>>>> --- a/drivers/misc/uacce/uacce.c
>>>>>> +++ b/drivers/misc/uacce/uacce.c
>>>>>> @@ -136,9 +136,16 @@ static int uacce_fops_open(struct inode
>>>>>> *inode, struct file *filep)
>>>>>>        if (!q)
>>>>>>            return -ENOMEM;
>>>>>> +    mutex_lock(&uacce->queues_lock);
>>>>>> +
>>>>>> +    if (!uacce->parent->driver) {
>>>>> I don't think this is useful, because the core clears
>>>>> parent->driver after
>>>>> having run uacce_remove():
>>>>>
>>>>>     rmmod hisi_zip        open()
>>>>>      ...                 uacce_fops_open()
>>>>>      __device_release_driver()      ...
>>>>>       pci_device_remove()
>>>>>        hisi_zip_remove()
>>>>>         hisi_qm_uninit()
>>>>>          uacce_remove()
>>>>>           ...              ...
>>>>>                        mutex_lock(uacce->queues_lock)
>>>>>       ...                  if (!uacce->parent->driver)
>>>>>       device_unbind_cleanup()      /* driver still valid, proceed */
>>>>>        dev->driver = NULL
>>>> The check  if (!uacce->parent->driver) is required, otherwise NULL
>>>> pointer
>>>> may happen.
>>> I agree we need something, what I mean is that this check is not
>>> sufficient.
>>>
>>>> iommu_sva_bind_device
>>>> const struct iommu_ops *ops = dev_iommu_ops(dev);  ->
>>>> dev->iommu->iommu_dev->ops
>>>>
>>>> rmmod has no issue, but remove parent pci device has the issue.
>>> Ah right, relying on the return value of bind() wouldn't be enough
>>> even if
>>> we mandated SVA.
>>>
>>> [...]
>>>>> I think we need the global uacce_mutex to serialize uacce_remove()
>>>>> and
>>>>> uacce_fops_open(). uacce_remove() would do everything, including
>>>>> xa_erase(), while holding that mutex. And uacce_fops_open() would
>>>>> try to
>>>>> obtain the uacce object from the xarray while holding the mutex,
>>>>> which
>>>>> fails if the uacce object is being removed.
>>>> Since fops_open get char device refcount, uacce_release will not
>>>> happen
>>>> until open returns.
>>> The refcount only ensures that the uacce_device object is not freed as
>>> long as there are open fds. But uacce_remove() can run while there are
>>> open fds, or fds in the process of being opened. And atfer
>>> uacce_remove()
>>> runs, the uacce_device object still exists but is mostly unusable. For
>>> example once the module is freed, uacce->ops is not valid anymore. But
>>> currently uacce_fops_open() may dereference the ops in this case:
>>>
>>>     uacce_fops_open()
>>>      if (!uacce->parent->driver)
>>>      /* Still valid, keep going */
>>>      ...                    rmmod
>>>                          uacce_remove()
>>>      ...                     free_module()
>>>      uacce->ops->get_queue() /* BUG */
>>
>> uacce_remove should wait for uacce->queues_lock, until fops_open
>> release the lock.
>> If open happen just after the uacce_remove: unlock, uacce_bind_queue
>> in open should fail.
>>
>>> Accessing uacce->ops after free_module() is a use-after-free. We
>>> need all
>> you men parent release the resources.
>>> the fops to synchronize with uacce_remove() to ensure they don't use
>>> any
>>> resource of the parent after it's been freed.
>> After fops_open, currently we are counting on parent driver stop all
>> dma first, then call uacce_remove, which is assumption.
>> Like drivers/crypto/hisilicon/zip/zip_main.c:
>> hisi_qm_wait_task_finish, which will wait uacce_release.
>> If comments this , there may other issue,
>> Unable to handle kernel paging request at virtual address
>> ffff80000b700204
>> pc : hisi_qm_cache_wb.part.0+0x2c/0xa0
>>
>>> I see uacce_fops_poll() may have the same problem, and should be inside
>>> uacce_mutex.
>> Do we need consider this, uacce_remove can happen anytime but not
>> waiting dma stop?
>>
>> Not sure uacce_mutex can do this.
>> Currently the sequence is
>> mutex_lock(&uacce->queues_lock);
>> mutex_lock(&uacce_mutex);
>>
>> Or we set all the callbacks of uacce_ops to NULL?
> How about in uacce_remove
> mutex_lock(&uacce_mutex);
> uacce->ops = NULL;
> mutex_unlock(&uacce_mutex);
>
> And check uacce->ops  first when using.
>

Diff like this, will merge together.

 drivers/misc/uacce/uacce.c | 65 ++++++++++++++++++++++++++++++++------
 1 file changed, 56 insertions(+), 9 deletions(-)

diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
index b6219c6bfb48..311192728132 100644
--- a/drivers/misc/uacce/uacce.c
+++ b/drivers/misc/uacce/uacce.c
@@ -23,6 +23,11 @@ static int uacce_start_queue(struct uacce_queue *q)
         goto out_with_lock;
     }

+    if (!q->uacce->ops) {
+        ret = -EINVAL;
+        goto out_with_lock;
+    }
+
     if (q->uacce->ops->start_queue) {
         ret = q->uacce->ops->start_queue(q);
         if (ret < 0)
@@ -46,6 +51,9 @@ static int uacce_put_queue(struct uacce_queue *q)
     if (q->state == UACCE_Q_ZOMBIE)
         goto out;

+    if (!uacce->ops)
+        goto out;
+
     if ((q->state == UACCE_Q_STARTED) && uacce->ops->stop_queue)
         uacce->ops->stop_queue(q);

@@ -65,6 +73,7 @@ static long uacce_fops_unl_ioctl(struct file *filep,
 {
     struct uacce_queue *q = filep->private_data;
     struct uacce_device *uacce = q->uacce;
+    long ret;

     switch (cmd) {
     case UACCE_CMD_START_Q:
@@ -74,10 +83,17 @@ static long uacce_fops_unl_ioctl(struct file *filep,
         return uacce_put_queue(q);

     default:
-        if (!uacce->ops->ioctl)
-            return -EINVAL;
+        mutex_lock(&uacce_mutex);
+
+        if (!uacce->ops || !uacce->ops->ioctl) {
+            ret = -EINVAL;
+            goto out_with_lock;
+        }

-        return uacce->ops->ioctl(q, cmd, arg);
+        ret = uacce->ops->ioctl(q, cmd, arg);
+out_with_lock:
+        mutex_unlock(&uacce_mutex);
+        return ret;
     }
 }

@@ -138,10 +154,13 @@ static int uacce_fops_open(struct inode *inode,
struct file *filep)

     mutex_lock(&uacce->queues_lock);

-    if (!uacce->parent->driver) {
+    mutex_lock(&uacce_mutex);
+    if (!uacce->parent || !uacce->ops) {
+        mutex_unlock(&uacce_mutex);
         ret = -ENODEV;
         goto out_with_lock;
     }
+    mutex_unlock(&uacce_mutex);

     ret = uacce_bind_queue(uacce, q);
     if (ret)
@@ -226,6 +245,11 @@ static int uacce_fops_mmap(struct file *filep,
struct vm_area_struct *vma)

     mutex_lock(&uacce_mutex);

+    if (!uacce->ops) {
+        ret = -EINVAL;
+        goto out_with_lock;
+    }
+
     if (q->state != UACCE_Q_INIT && q->state != UACCE_Q_STARTED) {
         ret = -EINVAL;
         goto out_with_lock;
@@ -271,9 +295,18 @@ static __poll_t uacce_fops_poll(struct file *file,
poll_table *wait)
     struct uacce_device *uacce = q->uacce;

     poll_wait(file, &q->wait, wait);
-    if (uacce->ops->is_q_updated && uacce->ops->is_q_updated(q))
+
+    mutex_lock(&uacce_mutex);
+    if (!uacce->ops)
+        goto out_with_lock;
+
+    if (uacce->ops->is_q_updated && uacce->ops->is_q_updated(q)) {
+        mutex_unlock(&uacce_mutex);
         return EPOLLIN | EPOLLRDNORM;
+    }

+out_with_lock:
+    mutex_unlock(&uacce_mutex);
     return 0;
 }

@@ -312,12 +345,20 @@ static ssize_t available_instances_show(struct
device *dev,
                     char *buf)
 {
     struct uacce_device *uacce = to_uacce_device(dev);
+    ssize_t ret;

-    if (!uacce->ops->get_available_instances)
-        return -ENODEV;
+    mutex_lock(&uacce_mutex);
+    if (!uacce->ops || !uacce->ops->get_available_instances) {
+        ret = -ENODEV;
+        goto out_with_lock;
+    }
+
+    ret = sysfs_emit(buf, "%d\n",
+             uacce->ops->get_available_instances(uacce));

-    return sysfs_emit(buf, "%d\n",
-               uacce->ops->get_available_instances(uacce));
+out_with_lock:
+    mutex_unlock(&uacce_mutex);
+    return ret;
 }

 static ssize_t algorithms_show(struct device *dev,
@@ -523,6 +564,12 @@ void uacce_remove(struct uacce_device *uacce)

     /* disable sva now since no opened queues */
     uacce_disable_sva(uacce);
+
+    mutex_lock(&uacce_mutex);
+    uacce->parent = NULL;
+    uacce->ops = NULL;
+    mutex_unlock(&uacce_mutex);
+
     mutex_unlock(&uacce->queues_lock);

     if (uacce->cdev)
--
2.25.1

2022-06-20 14:02:57

by Jean-Philippe Brucker

[permalink] [raw]
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove

On Fri, Jun 17, 2022 at 02:05:21PM +0800, Zhangfei Gao wrote:
> > The refcount only ensures that the uacce_device object is not freed as
> > long as there are open fds. But uacce_remove() can run while there are
> > open fds, or fds in the process of being opened. And atfer uacce_remove()
> > runs, the uacce_device object still exists but is mostly unusable. For
> > example once the module is freed, uacce->ops is not valid anymore. But
> > currently uacce_fops_open() may dereference the ops in this case:
> >
> > uacce_fops_open()
> > if (!uacce->parent->driver)
> > /* Still valid, keep going */
> > ... rmmod
> > uacce_remove()
> > ... free_module()
> > uacce->ops->get_queue() /* BUG */
>
> uacce_remove should wait for uacce->queues_lock, until fops_open release the
> lock.
> If open happen just after the uacce_remove: unlock, uacce_bind_queue in open
> should fail.

Ah yes sorry, I lost sight of what this patch was adding. But we could
have the same issue with the patch, just in a different order, no?

uacce_fops_open()
uacce = xa_load()
... rmmod
uacce_remove()
mutex_lock()
mutex_unlock()
mutex_lock()
if (!uacce->parent->driver)
/* Still valid, keep going */ parent->driver = NULL
free_module()
uacce->ops->get_queue() /* BUG */


> > Accessing uacce->ops after free_module() is a use-after-free. We need all
> you men parent release the resources.
> > the fops to synchronize with uacce_remove() to ensure they don't use any
> > resource of the parent after it's been freed.
> After fops_open, currently we are counting on parent driver stop all dma
> first, then call uacce_remove, which is assumption.
> Like drivers/crypto/hisilicon/zip/zip_main.c: hisi_qm_wait_task_finish,
> which will wait uacce_release.
> If comments this , there may other issue,
> Unable to handle kernel paging request at virtual address ffff80000b700204
> pc : hisi_qm_cache_wb.part.0+0x2c/0xa0
>
> > I see uacce_fops_poll() may have the same problem, and should be inside
> > uacce_mutex.
> Do we need consider this, uacce_remove can happen anytime but not waiting
> dma stop?

No, the parent driver must stop DMA before calling uacce_remove(), there
is no way around that

>
> Not sure uacce_mutex can do this.
> Currently the sequence is
> mutex_lock(&uacce->queues_lock);
> mutex_lock(&uacce_mutex);

We should document why some ops use one lock or the other. I believe it's
to avoid circular lock dependency between ioctl and mmap, do you know if
there was another reason?

>
> Or we set all the callbacks of uacce_ops to NULL?

That would be cleaner, though we already use the queue state to indicate
whether it is usable or not. I think we just need to extend that to all
ops.

How about the following patch? Unfortunately it still has the lock
disparity between ioctl and mmap because of the circular lockking with
mmap_sem, I don't know how to make that cleaner.

--- 8< ---

From c7c2b051ec19285bbb973f8a2a5e58bb5326e00e Mon Sep 17 00:00:00 2001
From: Jean-Philippe Brucker <[email protected]>
Date: Mon, 20 Jun 2022 10:10:41 +0100
Subject: [PATCH] uacce: Tidy up locking

The uacce driver must deal with a possible removal of the parent driver
or device at any time. At the moment there are several issues that may
result in use-after-free. Tidy up the locking to handle module removal.

When unbinding the parent device from its driver, the driver calls
uacce_remove(). This function removes the cdev, ensuring that no new
uacce file descriptor will be opened, but existing fds are still open
and uacce fops may be called after uacce_remove() completes, when the
parent module is gone. Each open fd holds a reference to the uacce
device, ensuring that the structure cannot be freed until all fds are
closed. But the uacce fops may still access uacce->ops which belonged to
the parent module, now freed. To solve this:

* use the global uacce_mutex to serialize uacce_fops_open() against
uacce_remove(), and q->mutex to serialize all other fops against
uacce_remove().

* q->mutex replaces the less scalable uacce->queues_lock. The queues
list is now protected by uacce_mutex, and the queue state by q->mutex.
Note that scalability is only desirable for poll(), since the other
fops are only used during setup.

* uacce_queue_is_valid(), checked under q->mutex, denotes whether
uacce_remove() has disabled all queues. If that is the case, don't go
any further since the parent module may be gone. Any call to
uacce->ops must be done while holding q->mutex to ensure the state
doesn't change.

* Clear uacce->ops in uacce_remove(), just to make sure that a
programming error would result in NULL dereference rather than
use-after-free.

Signed-off-by: Jean-Philippe Brucker <[email protected]>
---
drivers/misc/uacce/uacce.c | 137 ++++++++++++++++++++++++-------------
include/linux/uacce.h | 4 +-
2 files changed, 91 insertions(+), 50 deletions(-)

diff --git a/drivers/misc/uacce/uacce.c b/drivers/misc/uacce/uacce.c
index 281c54003edc..0bb2743d8da3 100644
--- a/drivers/misc/uacce/uacce.c
+++ b/drivers/misc/uacce/uacce.c
@@ -9,43 +9,40 @@

static struct class *uacce_class;
static dev_t uacce_devt;
+/* Protects uacce_xa and uacce->queues */
static DEFINE_MUTEX(uacce_mutex);
static DEFINE_XARRAY_ALLOC(uacce_xa);

-static int uacce_start_queue(struct uacce_queue *q)
+/*
+ * If the parent driver or the device disappears, the queue state is invalid and
+ * ops are not usable anymore.
+ */
+static bool uacce_queue_is_valid(struct uacce_queue *q)
{
- int ret = 0;
+ return q->state == UACCE_Q_INIT || q->state == UACCE_Q_STARTED;
+}

- mutex_lock(&uacce_mutex);
+static int uacce_start_queue(struct uacce_queue *q)
+{
+ int ret;

- if (q->state != UACCE_Q_INIT) {
- ret = -EINVAL;
- goto out_with_lock;
- }
+ if (q->state != UACCE_Q_INIT)
+ return -EINVAL;

if (q->uacce->ops->start_queue) {
ret = q->uacce->ops->start_queue(q);
if (ret < 0)
- goto out_with_lock;
+ return ret;
}

q->state = UACCE_Q_STARTED;
-
-out_with_lock:
- mutex_unlock(&uacce_mutex);
-
- return ret;
+ return 0;
}

static int uacce_put_queue(struct uacce_queue *q)
{
struct uacce_device *uacce = q->uacce;

- mutex_lock(&uacce_mutex);
-
- if (q->state == UACCE_Q_ZOMBIE)
- goto out;
-
if ((q->state == UACCE_Q_STARTED) && uacce->ops->stop_queue)
uacce->ops->stop_queue(q);

@@ -54,8 +51,6 @@ static int uacce_put_queue(struct uacce_queue *q)
uacce->ops->put_queue(q);

q->state = UACCE_Q_ZOMBIE;
-out:
- mutex_unlock(&uacce_mutex);

return 0;
}
@@ -65,20 +60,36 @@ static long uacce_fops_unl_ioctl(struct file *filep,
{
struct uacce_queue *q = filep->private_data;
struct uacce_device *uacce = q->uacce;
+ long ret = -ENXIO;
+
+ /*
+ * uacce->ops->ioctl() may take the mmap_sem in order to copy arg
+ * to/from user. Avoid a circular lock dependency with
+ * uacce_fops_mmap(), which gets called with mmap_sem held, by taking
+ * uacce_mutex instead of q->mutex. Doing this in uacce_fops_mmap() is
+ * not possible because uacce_fops_open() calls iommu_sva_bind_device(),
+ * which takes mmap_sem, while holding uacce_mutex.
+ */
+ mutex_lock(&uacce_mutex);
+ if (!uacce_queue_is_valid(q))
+ goto out_unlock;

switch (cmd) {
case UACCE_CMD_START_Q:
- return uacce_start_queue(q);
-
+ ret = uacce_start_queue(q);
+ break;
case UACCE_CMD_PUT_Q:
- return uacce_put_queue(q);
-
+ ret = uacce_put_queue(q);
+ break;
default:
- if (!uacce->ops->ioctl)
- return -EINVAL;
-
- return uacce->ops->ioctl(q, cmd, arg);
+ if (uacce->ops->ioctl)
+ ret = uacce->ops->ioctl(q, cmd, arg);
+ else
+ ret = -EINVAL;
}
+out_unlock:
+ mutex_unlock(&uacce_mutex);
+ return ret;
}

#ifdef CONFIG_COMPAT
@@ -128,13 +139,18 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
struct uacce_queue *q;
int ret;

+ mutex_lock(&uacce_mutex);
uacce = xa_load(&uacce_xa, iminor(inode));
- if (!uacce)
- return -ENODEV;
+ if (!uacce) {
+ ret = -ENODEV;
+ goto out_with_lock;
+ }

q = kzalloc(sizeof(struct uacce_queue), GFP_KERNEL);
- if (!q)
- return -ENOMEM;
+ if (!q) {
+ ret = -ENOMEM;
+ goto out_with_lock;
+ }

ret = uacce_bind_queue(uacce, q);
if (ret)
@@ -152,10 +168,10 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
filep->private_data = q;
uacce->inode = inode;
q->state = UACCE_Q_INIT;
+ mutex_init(&q->mutex);

- mutex_lock(&uacce->queues_lock);
list_add(&q->list, &uacce->queues);
- mutex_unlock(&uacce->queues_lock);
+ mutex_unlock(&uacce_mutex);

return 0;

@@ -163,6 +179,8 @@ static int uacce_fops_open(struct inode *inode, struct file *filep)
uacce_unbind_queue(q);
out_with_mem:
kfree(q);
+out_with_lock:
+ mutex_unlock(&uacce_mutex);
return ret;
}

@@ -170,11 +188,11 @@ static int uacce_fops_release(struct inode *inode, struct file *filep)
{
struct uacce_queue *q = filep->private_data;

- mutex_lock(&q->uacce->queues_lock);
- list_del(&q->list);
- mutex_unlock(&q->uacce->queues_lock);
+ mutex_lock(&uacce_mutex);
uacce_put_queue(q);
uacce_unbind_queue(q);
+ list_del(&q->list);
+ mutex_unlock(&uacce_mutex);
kfree(q);

return 0;
@@ -217,10 +235,9 @@ static int uacce_fops_mmap(struct file *filep, struct vm_area_struct *vma)
vma->vm_private_data = q;
qfr->type = type;

- mutex_lock(&uacce_mutex);
-
- if (q->state != UACCE_Q_INIT && q->state != UACCE_Q_STARTED) {
- ret = -EINVAL;
+ mutex_lock(&q->mutex);
+ if (!uacce_queue_is_valid(q)) {
+ ret = -ENXIO;
goto out_with_lock;
}

@@ -248,12 +265,12 @@ static int uacce_fops_mmap(struct file *filep, struct vm_area_struct *vma)
}

q->qfrs[type] = qfr;
- mutex_unlock(&uacce_mutex);
+ mutex_unlock(&q->mutex);

return ret;

out_with_lock:
- mutex_unlock(&uacce_mutex);
+ mutex_unlock(&q->mutex);
kfree(qfr);
return ret;
}
@@ -262,12 +279,20 @@ static __poll_t uacce_fops_poll(struct file *file, poll_table *wait)
{
struct uacce_queue *q = file->private_data;
struct uacce_device *uacce = q->uacce;
+ int ret = 0;

poll_wait(file, &q->wait, wait);
+
+ mutex_lock(&q->mutex);
+ if (!uacce_queue_is_valid(q))
+ goto out_unlock;
+
if (uacce->ops->is_q_updated && uacce->ops->is_q_updated(q))
- return EPOLLIN | EPOLLRDNORM;
+ ret = EPOLLIN | EPOLLRDNORM;

- return 0;
+out_unlock:
+ mutex_unlock(&q->mutex);
+ return ret;
}

static const struct file_operations uacce_fops = {
@@ -450,7 +475,6 @@ struct uacce_device *uacce_alloc(struct device *parent,
goto err_with_uacce;

INIT_LIST_HEAD(&uacce->queues);
- mutex_init(&uacce->queues_lock);
device_initialize(&uacce->dev);
uacce->dev.devt = MKDEV(MAJOR(uacce_devt), uacce->dev_id);
uacce->dev.class = uacce_class;
@@ -507,13 +531,23 @@ void uacce_remove(struct uacce_device *uacce)
if (uacce->inode)
unmap_mapping_range(uacce->inode->i_mapping, 0, 0, 1);

+ /*
+ * uacce_fops_open() may be running concurrently, even after we remove
+ * the cdev. Holding uacce_mutex ensures that open() does not obtain a
+ * removed uacce device.
+ */
+ mutex_lock(&uacce_mutex);
/* ensure no open queue remains */
- mutex_lock(&uacce->queues_lock);
list_for_each_entry_safe(q, next_q, &uacce->queues, list) {
+ /*
+ * Taking q->mutex ensures that fops do not use the defunct
+ * uacce->ops after the queue is disabled.
+ */
+ mutex_lock(&q->mutex);
uacce_put_queue(q);
+ mutex_unlock(&q->mutex);
uacce_unbind_queue(q);
}
- mutex_unlock(&uacce->queues_lock);

/* disable sva now since no opened queues */
uacce_disable_sva(uacce);
@@ -521,7 +555,14 @@ void uacce_remove(struct uacce_device *uacce)
if (uacce->cdev)
cdev_device_del(uacce->cdev, &uacce->dev);
xa_erase(&uacce_xa, uacce->dev_id);
+ /*
+ * uacce exists as long as there are open fds, but ops will be freed
+ * now. Ensure that bugs cause NULL deref rather than use-after-free.
+ */
+ uacce->ops = NULL;
+ uacce->parent = NULL;
put_device(&uacce->dev);
+ mutex_unlock(&uacce_mutex);
}
EXPORT_SYMBOL_GPL(uacce_remove);

diff --git a/include/linux/uacce.h b/include/linux/uacce.h
index 48e319f40275..1ea2d753ef89 100644
--- a/include/linux/uacce.h
+++ b/include/linux/uacce.h
@@ -70,6 +70,7 @@ enum uacce_q_state {
* @wait: wait queue head
* @list: index into uacce queues list
* @qfrs: pointer of qfr regions
+ * @mutex: protects queue state
* @state: queue state machine
* @pasid: pasid associated to the mm
* @handle: iommu_sva handle returned by iommu_sva_bind_device()
@@ -80,6 +81,7 @@ struct uacce_queue {
wait_queue_head_t wait;
struct list_head list;
struct uacce_qfile_region *qfrs[UACCE_MAX_REGION];
+ struct mutex mutex;
enum uacce_q_state state;
u32 pasid;
struct iommu_sva *handle;
@@ -99,7 +101,6 @@ struct uacce_queue {
* @dev: dev of the uacce
* @priv: private pointer of the uacce
* @queues: list of queues
- * @queues_lock: lock for queues list
* @inode: core vfs
*/
struct uacce_device {
@@ -115,7 +116,6 @@ struct uacce_device {
struct device dev;
void *priv;
struct list_head queues;
- struct mutex queues_lock;
struct inode *inode;
};

--
2.25.1



Attachments:
(No filename) (13.30 kB)

2022-06-20 14:03:14

by Jean-Philippe Brucker

[permalink] [raw]
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove

On Fri, Jun 17, 2022 at 10:23:13PM +0800, Zhangfei Gao wrote:
> @@ -312,12 +345,20 @@ static ssize_t available_instances_show(struct device
> *dev,
>                      char *buf)
>  {
>      struct uacce_device *uacce = to_uacce_device(dev);
> +    ssize_t ret;
>
> -    if (!uacce->ops->get_available_instances)
> -        return -ENODEV;
> +    mutex_lock(&uacce_mutex);
> +    if (!uacce->ops || !uacce->ops->get_available_instances) {

Doesn't the sysfs group go away with uacce_remove()? We shouldn't need
this check

Thanks,
Jean

2022-06-20 14:47:18

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove

On Mon, Jun 20, 2022 at 02:24:31PM +0100, Jean-Philippe Brucker wrote:
> >From c7c2b051ec19285bbb973f8a2a5e58bb5326e00e Mon Sep 17 00:00:00 2001
> From: Jean-Philippe Brucker <[email protected]>
> Date: Mon, 20 Jun 2022 10:10:41 +0100
> Subject: [PATCH] uacce: Tidy up locking
>
> The uacce driver must deal with a possible removal of the parent driver
> or device at any time.

No it should not, if the reference counting logic is properly set up.
The parent driver should correctly tear things down here.

> At the moment there are several issues that may
> result in use-after-free. Tidy up the locking to handle module removal.

I don't think you did that, as module removal should never happen if a
file descriptor is opened as I previously mentioned.

thanks,

greg k-h

2022-06-20 14:47:36

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove

On Mon, Jun 20, 2022 at 02:24:31PM +0100, Jean-Philippe Brucker wrote:
> On Fri, Jun 17, 2022 at 02:05:21PM +0800, Zhangfei Gao wrote:
> > > The refcount only ensures that the uacce_device object is not freed as
> > > long as there are open fds. But uacce_remove() can run while there are
> > > open fds, or fds in the process of being opened. And atfer uacce_remove()
> > > runs, the uacce_device object still exists but is mostly unusable. For
> > > example once the module is freed, uacce->ops is not valid anymore. But
> > > currently uacce_fops_open() may dereference the ops in this case:
> > >
> > > uacce_fops_open()
> > > if (!uacce->parent->driver)
> > > /* Still valid, keep going */
> > > ... rmmod
> > > uacce_remove()
> > > ... free_module()
> > > uacce->ops->get_queue() /* BUG */
> >
> > uacce_remove should wait for uacce->queues_lock, until fops_open release the
> > lock.
> > If open happen just after the uacce_remove: unlock, uacce_bind_queue in open
> > should fail.
>
> Ah yes sorry, I lost sight of what this patch was adding. But we could
> have the same issue with the patch, just in a different order, no?
>
> uacce_fops_open()
> uacce = xa_load()
> ... rmmod

Um, how is rmmod called if the file descriptor is open?

That should not be possible if the owner of the file descriptor is
properly set. Please fix that up.

thanks,

greg k-h

2022-06-20 20:26:44

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH] uacce: Tidy up locking

Hi Jean-Philippe,

I love your patch! Perhaps something to improve:

[auto build test WARNING on char-misc/char-misc-testing]
[also build test WARNING on soc/for-next linus/master v5.19-rc2 next-20220617]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/intel-lab-lkp/linux/commits/Jean-Philippe-Brucker/uacce-Tidy-up-locking/20220620-220634
base: https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git 0a35780c755ccec097d15c6b4ff8b246a89f1689
config: i386-randconfig-s001 (https://download.01.org/0day-ci/archive/20220621/[email protected]/config)
compiler: gcc-11 (Debian 11.3.0-3) 11.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.4-30-g92122700-dirty
# https://github.com/intel-lab-lkp/linux/commit/3589b5391f54bea3dc85ed65fe0f036757a4f21c
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Jean-Philippe-Brucker/uacce-Tidy-up-locking/20220620-220634
git checkout 3589b5391f54bea3dc85ed65fe0f036757a4f21c
# save the config file
mkdir build_dir && cp config build_dir/.config
make W=1 C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=i386 SHELL=/bin/bash drivers/misc/uacce/

If you fix the issue, kindly add following tag where applicable
Reported-by: kernel test robot <[email protected]>


sparse warnings: (new ones prefixed by >>)
>> drivers/misc/uacce/uacce.c:291:21: sparse: sparse: incorrect type in assignment (different base types) @@ expected int ret @@ got restricted __poll_t @@
drivers/misc/uacce/uacce.c:291:21: sparse: expected int ret
drivers/misc/uacce/uacce.c:291:21: sparse: got restricted __poll_t
>> drivers/misc/uacce/uacce.c:295:16: sparse: sparse: incorrect type in return expression (different base types) @@ expected restricted __poll_t @@ got int ret @@
drivers/misc/uacce/uacce.c:295:16: sparse: expected restricted __poll_t
drivers/misc/uacce/uacce.c:295:16: sparse: got int ret

vim +291 drivers/misc/uacce/uacce.c

277
278 static __poll_t uacce_fops_poll(struct file *file, poll_table *wait)
279 {
280 struct uacce_queue *q = file->private_data;
281 struct uacce_device *uacce = q->uacce;
282 int ret = 0;
283
284 poll_wait(file, &q->wait, wait);
285
286 mutex_lock(&q->mutex);
287 if (!uacce_queue_is_valid(q))
288 goto out_unlock;
289
290 if (uacce->ops->is_q_updated && uacce->ops->is_q_updated(q))
> 291 ret = EPOLLIN | EPOLLRDNORM;
292
293 out_unlock:
294 mutex_unlock(&q->mutex);
> 295 return ret;
296 }
297

--
0-DAY CI Kernel Test Service
https://01.org/lkp

2022-06-21 07:50:24

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove



On 2022/6/20 下午9:36, Greg Kroah-Hartman wrote:
> On Mon, Jun 20, 2022 at 02:24:31PM +0100, Jean-Philippe Brucker wrote:
>> On Fri, Jun 17, 2022 at 02:05:21PM +0800, Zhangfei Gao wrote:
>>>> The refcount only ensures that the uacce_device object is not freed as
>>>> long as there are open fds. But uacce_remove() can run while there are
>>>> open fds, or fds in the process of being opened. And atfer uacce_remove()
>>>> runs, the uacce_device object still exists but is mostly unusable. For
>>>> example once the module is freed, uacce->ops is not valid anymore. But
>>>> currently uacce_fops_open() may dereference the ops in this case:
>>>>
>>>> uacce_fops_open()
>>>> if (!uacce->parent->driver)
>>>> /* Still valid, keep going */
>>>> ... rmmod
>>>> uacce_remove()
>>>> ... free_module()
>>>> uacce->ops->get_queue() /* BUG */
>>> uacce_remove should wait for uacce->queues_lock, until fops_open release the
>>> lock.
>>> If open happen just after the uacce_remove: unlock, uacce_bind_queue in open
>>> should fail.
>> Ah yes sorry, I lost sight of what this patch was adding. But we could
>> have the same issue with the patch, just in a different order, no?
>>
>> uacce_fops_open()
>> uacce = xa_load()
>> ... rmmod
> Um, how is rmmod called if the file descriptor is open?
>
> That should not be possible if the owner of the file descriptor is
> properly set. Please fix that up.
Thanks Greg

Set cdev owner or use module_get/put can block rmmod once fops_open.
-       uacce->cdev->owner = THIS_MODULE;
+       uacce->cdev->owner = uacce->parent->driver->owner;

However, still not find good method to block removing parent pci device.

$ echo 1 > /sys/bus/pci/devices/0000:00:02.0/remove &

[   32.563350]  uacce_remove+0x6c/0x148
[   32.563353]  hisi_qm_uninit+0x12c/0x178
[   32.563356]  hisi_zip_remove+0xa0/0xd0 [hisi_zip]
[   32.563361]  pci_device_remove+0x44/0xd8
[   32.563364]  device_remove+0x54/0x88
[   32.563367]  device_release_driver_internal+0xec/0x1a0
[   32.563370]  device_release_driver+0x20/0x30
[   32.563372]  pci_stop_bus_device+0x8c/0xe0
[   32.563375]  pci_stop_and_remove_bus_device_locked+0x28/0x60
[   32.563378]  remove_store+0x9c/0xb0
[   32.563379]  dev_attr_store+0x20/0x38

mutex_lock(&dev->device_lock) can be used, which used in
device_release_driver_internal.
Or use internal mutex.

Thanks

2022-06-21 08:06:35

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove

On Tue, Jun 21, 2022 at 03:37:31PM +0800, Zhangfei Gao wrote:
>
>
> On 2022/6/20 下午9:36, Greg Kroah-Hartman wrote:
> > On Mon, Jun 20, 2022 at 02:24:31PM +0100, Jean-Philippe Brucker wrote:
> > > On Fri, Jun 17, 2022 at 02:05:21PM +0800, Zhangfei Gao wrote:
> > > > > The refcount only ensures that the uacce_device object is not freed as
> > > > > long as there are open fds. But uacce_remove() can run while there are
> > > > > open fds, or fds in the process of being opened. And atfer uacce_remove()
> > > > > runs, the uacce_device object still exists but is mostly unusable. For
> > > > > example once the module is freed, uacce->ops is not valid anymore. But
> > > > > currently uacce_fops_open() may dereference the ops in this case:
> > > > >
> > > > > uacce_fops_open()
> > > > > if (!uacce->parent->driver)
> > > > > /* Still valid, keep going */
> > > > > ... rmmod
> > > > > uacce_remove()
> > > > > ... free_module()
> > > > > uacce->ops->get_queue() /* BUG */
> > > > uacce_remove should wait for uacce->queues_lock, until fops_open release the
> > > > lock.
> > > > If open happen just after the uacce_remove: unlock, uacce_bind_queue in open
> > > > should fail.
> > > Ah yes sorry, I lost sight of what this patch was adding. But we could
> > > have the same issue with the patch, just in a different order, no?
> > >
> > > uacce_fops_open()
> > > uacce = xa_load()
> > > ... rmmod
> > Um, how is rmmod called if the file descriptor is open?
> >
> > That should not be possible if the owner of the file descriptor is
> > properly set. Please fix that up.
> Thanks Greg
>
> Set cdev owner or use module_get/put can block rmmod once fops_open.
> -       uacce->cdev->owner = THIS_MODULE;
> +       uacce->cdev->owner = uacce->parent->driver->owner;
>
> However, still not find good method to block removing parent pci device.
>
> $ echo 1 > /sys/bus/pci/devices/0000:00:02.0/remove &
>
> [   32.563350]  uacce_remove+0x6c/0x148
> [   32.563353]  hisi_qm_uninit+0x12c/0x178
> [   32.563356]  hisi_zip_remove+0xa0/0xd0 [hisi_zip]
> [   32.563361]  pci_device_remove+0x44/0xd8
> [   32.563364]  device_remove+0x54/0x88
> [   32.563367]  device_release_driver_internal+0xec/0x1a0
> [   32.563370]  device_release_driver+0x20/0x30
> [   32.563372]  pci_stop_bus_device+0x8c/0xe0
> [   32.563375]  pci_stop_and_remove_bus_device_locked+0x28/0x60
> [   32.563378]  remove_store+0x9c/0xb0
> [   32.563379]  dev_attr_store+0x20/0x38

Removing the parent pci device does not remove the module code, it
removes the device itself. Don't confuse code vs. data here.

thanks,

greg k-h

2022-06-22 08:19:39

by zhangfei

[permalink] [raw]
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove

Hi, Greg

On 2022/6/21 下午3:44, Greg Kroah-Hartman wrote:
> On Tue, Jun 21, 2022 at 03:37:31PM +0800, Zhangfei Gao wrote:
>>
>> On 2022/6/20 下午9:36, Greg Kroah-Hartman wrote:
>>> On Mon, Jun 20, 2022 at 02:24:31PM +0100, Jean-Philippe Brucker wrote:
>>>> On Fri, Jun 17, 2022 at 02:05:21PM +0800, Zhangfei Gao wrote:
>>>>>> The refcount only ensures that the uacce_device object is not freed as
>>>>>> long as there are open fds. But uacce_remove() can run while there are
>>>>>> open fds, or fds in the process of being opened. And atfer uacce_remove()
>>>>>> runs, the uacce_device object still exists but is mostly unusable. For
>>>>>> example once the module is freed, uacce->ops is not valid anymore. But
>>>>>> currently uacce_fops_open() may dereference the ops in this case:
>>>>>>
>>>>>> uacce_fops_open()
>>>>>> if (!uacce->parent->driver)
>>>>>> /* Still valid, keep going */
>>>>>> ... rmmod
>>>>>> uacce_remove()
>>>>>> ... free_module()
>>>>>> uacce->ops->get_queue() /* BUG */
>>>>> uacce_remove should wait for uacce->queues_lock, until fops_open release the
>>>>> lock.
>>>>> If open happen just after the uacce_remove: unlock, uacce_bind_queue in open
>>>>> should fail.
>>>> Ah yes sorry, I lost sight of what this patch was adding. But we could
>>>> have the same issue with the patch, just in a different order, no?
>>>>
>>>> uacce_fops_open()
>>>> uacce = xa_load()
>>>> ... rmmod
>>> Um, how is rmmod called if the file descriptor is open?
>>>
>>> That should not be possible if the owner of the file descriptor is
>>> properly set. Please fix that up.
>> Thanks Greg
>>
>> Set cdev owner or use module_get/put can block rmmod once fops_open.
>> -       uacce->cdev->owner = THIS_MODULE;
>> +       uacce->cdev->owner = uacce->parent->driver->owner;
>>
>> However, still not find good method to block removing parent pci device.
>>
>> $ echo 1 > /sys/bus/pci/devices/0000:00:02.0/remove &
>>
>> [   32.563350]  uacce_remove+0x6c/0x148
>> [   32.563353]  hisi_qm_uninit+0x12c/0x178
>> [   32.563356]  hisi_zip_remove+0xa0/0xd0 [hisi_zip]
>> [   32.563361]  pci_device_remove+0x44/0xd8
>> [   32.563364]  device_remove+0x54/0x88
>> [   32.563367]  device_release_driver_internal+0xec/0x1a0
>> [   32.563370]  device_release_driver+0x20/0x30
>> [   32.563372]  pci_stop_bus_device+0x8c/0xe0
>> [   32.563375]  pci_stop_and_remove_bus_device_locked+0x28/0x60
>> [   32.563378]  remove_store+0x9c/0xb0
>> [   32.563379]  dev_attr_store+0x20/0x38
> Removing the parent pci device does not remove the module code, it
> removes the device itself. Don't confuse code vs. data here.

Do you mean even parent pci device is removed immediately, the code has
to wait, like dma etc?

Currently parent driver has to ensure all dma stopped then call
uacce_remove,
ie, after uacce_fops_open succeed, parent driver need wait fops_release,
then uacce_remove can be called.
For example:
drivers/crypto/hisilicon/zip/zip_main.c:
hisi_qm_wait_task_finish

If remove this wait , there may other issue,
Unable to handle kernel paging request at virtual address ffff80000b700204
pc : hisi_qm_cache_wb.part.0+0x2c/0xa0

So uacce only need serialize uacce_fops_open and uacce_remove.
After uacce_fops_open, we can assume uacce_remove only happen after
uacce_fops_release?
Then it would be much simpler.

Thanks


2022-06-22 08:27:45

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] uacce: fix concurrency of fops_open and uacce_remove

On Wed, Jun 22, 2022 at 04:14:45PM +0800, Zhangfei Gao wrote:
> Hi, Greg
>
> On 2022/6/21 下午3:44, Greg Kroah-Hartman wrote:
> > On Tue, Jun 21, 2022 at 03:37:31PM +0800, Zhangfei Gao wrote:
> > >
> > > On 2022/6/20 下午9:36, Greg Kroah-Hartman wrote:
> > > > On Mon, Jun 20, 2022 at 02:24:31PM +0100, Jean-Philippe Brucker wrote:
> > > > > On Fri, Jun 17, 2022 at 02:05:21PM +0800, Zhangfei Gao wrote:
> > > > > > > The refcount only ensures that the uacce_device object is not freed as
> > > > > > > long as there are open fds. But uacce_remove() can run while there are
> > > > > > > open fds, or fds in the process of being opened. And atfer uacce_remove()
> > > > > > > runs, the uacce_device object still exists but is mostly unusable. For
> > > > > > > example once the module is freed, uacce->ops is not valid anymore. But
> > > > > > > currently uacce_fops_open() may dereference the ops in this case:
> > > > > > >
> > > > > > > uacce_fops_open()
> > > > > > > if (!uacce->parent->driver)
> > > > > > > /* Still valid, keep going */
> > > > > > > ... rmmod
> > > > > > > uacce_remove()
> > > > > > > ... free_module()
> > > > > > > uacce->ops->get_queue() /* BUG */
> > > > > > uacce_remove should wait for uacce->queues_lock, until fops_open release the
> > > > > > lock.
> > > > > > If open happen just after the uacce_remove: unlock, uacce_bind_queue in open
> > > > > > should fail.
> > > > > Ah yes sorry, I lost sight of what this patch was adding. But we could
> > > > > have the same issue with the patch, just in a different order, no?
> > > > >
> > > > > uacce_fops_open()
> > > > > uacce = xa_load()
> > > > > ... rmmod
> > > > Um, how is rmmod called if the file descriptor is open?
> > > >
> > > > That should not be possible if the owner of the file descriptor is
> > > > properly set. Please fix that up.
> > > Thanks Greg
> > >
> > > Set cdev owner or use module_get/put can block rmmod once fops_open.
> > > -       uacce->cdev->owner = THIS_MODULE;
> > > +       uacce->cdev->owner = uacce->parent->driver->owner;
> > >
> > > However, still not find good method to block removing parent pci device.
> > >
> > > $ echo 1 > /sys/bus/pci/devices/0000:00:02.0/remove &
> > >
> > > [   32.563350]  uacce_remove+0x6c/0x148
> > > [   32.563353]  hisi_qm_uninit+0x12c/0x178
> > > [   32.563356]  hisi_zip_remove+0xa0/0xd0 [hisi_zip]
> > > [   32.563361]  pci_device_remove+0x44/0xd8
> > > [   32.563364]  device_remove+0x54/0x88
> > > [   32.563367]  device_release_driver_internal+0xec/0x1a0
> > > [   32.563370]  device_release_driver+0x20/0x30
> > > [   32.563372]  pci_stop_bus_device+0x8c/0xe0
> > > [   32.563375]  pci_stop_and_remove_bus_device_locked+0x28/0x60
> > > [   32.563378]  remove_store+0x9c/0xb0
> > > [   32.563379]  dev_attr_store+0x20/0x38
> > Removing the parent pci device does not remove the module code, it
> > removes the device itself. Don't confuse code vs. data here.
>
> Do you mean even parent pci device is removed immediately, the code has to
> wait, like dma etc?

No, reads will fail, as will DMA transfers, all PCI drivers need to
handle surprise removal like this as we have had PCI hotplug systems for
decades now.

> Currently parent driver has to ensure all dma stopped then call
> uacce_remove,
> ie, after uacce_fops_open succeed, parent driver need wait fops_release,
> then uacce_remove can be called.

remove can be called before the file close can happen all the time, you
need to handle that properly.

> For example:
> drivers/crypto/hisilicon/zip/zip_main.c:
> hisi_qm_wait_task_finish
>
> If remove this wait , there may other issue,
> Unable to handle kernel paging request at virtual address ffff80000b700204
> pc : hisi_qm_cache_wb.part.0+0x2c/0xa0
>
> So uacce only need serialize uacce_fops_open and uacce_remove.

That's not going to help much.

> After uacce_fops_open, we can assume uacce_remove only happen after
> uacce_fops_release?

Nope, again, device remove can happen at any point in time and is
independent of userspace open/close of file descriptors on the char
device.

This is a common problem/pattern that drivers need to handle, and
unfortunatly they all need to handle it on their own. We have discussed
ways of making it easier (see the ksummit discuss list archives from
last year), but no one has stepped up and done the work yet :(

> Then it would be much simpler.

Sorry. If you treat the structures as independant, and properly grab
some reference counts or a lock, you should be ok.

greg k-h