On 22 September 2016 at 19:28, Greg Kroah-Hartman
<[email protected]> wrote:
> 4.4-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: Jason Gunthorpe <[email protected]>
>
> commit d1e09f304a1d9651c5059ebfeb696dc2effc9b32 upstream.
>
> Fixes an oops that might happen if uverbs_close races with
> remove_one.
>
> Both contexts may run ib_uverbs_cleanup_ucontext, it depends
> on the flow.
>
> Currently, there is no protection for a case that remove_one
> didn't make the cleanup it runs to its end, the underlying
> ib_device was freed then uverbs_close will call
> ib_uverbs_cleanup_ucontext and OOPs.
>
> Above might happen if uverbs_close deleted the file from the list
> then remove_one didn't find it and runs to its end.
>
> Fixes to protect against that case by a new cleanup lock so that
> ib_uverbs_cleanup_ucontext will be called always before that
> remove_one is ended.
>
> Fixes: 35d4a0b63dc0 ("IB/uverbs: Fix race between ib_uverbs_open and remove_one")
> Reported-by: Devesh Sharma <[email protected]>
> Signed-off-by: Jason Gunthorpe <[email protected]>
> Signed-off-by: Yishai Hadas <[email protected]>
> Signed-off-by: Leon Romanovsky <[email protected]>
> Signed-off-by: Doug Ledford <[email protected]>
> Signed-off-by: Greg Kroah-Hartman <[email protected]>
>
> ---
> drivers/infiniband/core/uverbs.h | 1
> drivers/infiniband/core/uverbs_main.c | 37 ++++++++++++++++++++++------------
> 2 files changed, 25 insertions(+), 13 deletions(-)
>
> --- a/drivers/infiniband/core/uverbs.h
> +++ b/drivers/infiniband/core/uverbs.h
> @@ -116,6 +116,7 @@ struct ib_uverbs_event_file {
> struct ib_uverbs_file {
> struct kref ref;
> struct mutex mutex;
> + struct mutex cleanup_mutex; /* protect cleanup */
> struct ib_uverbs_device *device;
> struct ib_ucontext *ucontext;
> struct ib_event_handler event_handler;
> --- a/drivers/infiniband/core/uverbs_main.c
> +++ b/drivers/infiniband/core/uverbs_main.c
> @@ -922,6 +922,7 @@ static int ib_uverbs_open(struct inode *
> file->async_file = NULL;
> kref_init(&file->ref);
> mutex_init(&file->mutex);
> + mutex_init(&file->cleanup_mutex);
>
> filp->private_data = file;
> kobject_get(&dev->kobj);
> @@ -947,18 +948,20 @@ static int ib_uverbs_close(struct inode
> {
> struct ib_uverbs_file *file = filp->private_data;
> struct ib_uverbs_device *dev = file->device;
> - struct ib_ucontext *ucontext = NULL;
> +
> + mutex_lock(&file->cleanup_mutex);
> + if (file->ucontext) {
> + ib_uverbs_cleanup_ucontext(file, file->ucontext);
> + file->ucontext = NULL;
> + }
> + mutex_unlock(&file->cleanup_mutex);
>
> mutex_lock(&file->device->lists_mutex);
> - ucontext = file->ucontext;
> - file->ucontext = NULL;
> if (!file->is_closed) {
> list_del(&file->list);
> file->is_closed = 1;
> }
> mutex_unlock(&file->device->lists_mutex);
> - if (ucontext)
> - ib_uverbs_cleanup_ucontext(file, ucontext);
>
> if (file->async_file)
> kref_put(&file->async_file->ref, ib_uverbs_release_event_file);
> @@ -1172,22 +1175,30 @@ static void ib_uverbs_free_hw_resources(
> mutex_lock(&uverbs_dev->lists_mutex);
> while (!list_empty(&uverbs_dev->uverbs_file_list)) {
> struct ib_ucontext *ucontext;
> -
> file = list_first_entry(&uverbs_dev->uverbs_file_list,
> struct ib_uverbs_file, list);
> file->is_closed = 1;
> - ucontext = file->ucontext;
> list_del(&file->list);
> - file->ucontext = NULL;
> kref_get(&file->ref);
> mutex_unlock(&uverbs_dev->lists_mutex);
> - /* We must release the mutex before going ahead and calling
> - * disassociate_ucontext. disassociate_ucontext might end up
> - * indirectly calling uverbs_close, for example due to freeing
> - * the resources (e.g mmput).
> - */
> +
> ib_uverbs_event_handler(&file->event_handler, &event);
> +
> + mutex_lock(&file->cleanup_mutex);
> + ucontext = file->ucontext;
> + file->ucontext = NULL;
> + mutex_unlock(&file->cleanup_mutex);
> +
> + /* At this point ib_uverbs_close cannot be running
> + * ib_uverbs_cleanup_ucontext
> + */
> if (ucontext) {
> + /* We must release the mutex before going ahead and
> + * calling disassociate_ucontext. disassociate_ucontext
> + * might end up indirectly calling uverbs_close,
> + * for example due to freeing the resources
> + * (e.g mmput).
> + */
> ib_dev->disassociate_ucontext(ucontext);
> ib_uverbs_cleanup_ucontext(file, ucontext);
> }
Another code-quality (not stable) comment. We're told again and again
to "lock data, not code", e.g.
"And as Alan Cox says, “Lock data, not code”." (Unreliable Guide To
Locking, Rusty Russel, Documentation/DocBook/kernel-locking.tmpl)
"Big Fat Rule: Locks that simply wrap code regions are hard to
understand and prone to race conditions. Lock data, not code." (Linux
Kernel Development Second Edition, Robert Love)
This lock is literally called "cleanup mutex" and it's not really
documented what data it protects. Is there a better solution here?
Vegard
On Mon, Oct 03, 2016 at 02:28:40PM +0200, Vegard Nossum wrote:
> This lock is literally called "cleanup mutex" and it's not really
> documented what data it protects. Is there a better solution here?
I agree it is very complex and hard to understand. This is why it
needed patching :| The mutex is in fact pretty much locking
code. (ensuring that ib_uverbs_cleanup_ucontext only runs on one
thread during this race)
There are at least three locks involved in this process. I didn't see
any obvious way to extend any of the other locks to handle this case.
The argument against most simple solutions (eg a rw lock rather than
the srcu) has been performance on these paths.
Jason