2011-02-15 12:59:33

by Jan Kara

Subject: [PATCH 0/2] aio: Fix use after free bugs


Hi Al,

The following two patches fix use after free bugs in AIO code. I'm not
completely sure you're the right one to merge these but hopefully yes ;).
Could you please merge them? Thanks.

Honza


2011-02-15 12:59:43

by Jan Kara

Subject: [PATCH 1/2] fs: Fix aio rcu ioctx lookup

From: Nick Piggin <[email protected]>

The aio-dio-invalidate-failure test GPFs in aio_put_req(), called from
io_submit().

lookup_ioctx() doesn't implement the RCU lookup pattern properly:
rcu_read_lock() does not prevent the refcount from going to zero, so we
might take a reference on an ioctx whose count has already dropped to zero.

Fix the bug by atomically testing for a zero refcount before incrementing.
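
The heart of the fix is an "increment only if still non-zero" primitive
(the patch below adds try_get_ioctx(), built on atomic_inc_not_zero()).
As a rough illustration only, here is a standalone C11 model of that
primitive - the names are made up for the example, this is not kernel code:

    #include <stdatomic.h>
    #include <stdio.h>

    /* Take a reference only if the count has not already hit zero. */
    static int inc_not_zero(atomic_int *users)
    {
        int old = atomic_load(users);

        while (old != 0) {
            /* Try old -> old + 1; on failure 'old' is reloaded and we
             * retry, or give up if the count has reached zero. */
            if (atomic_compare_exchange_weak(users, &old, old + 1))
                return 1;   /* got a reference */
        }
        return 0;           /* count already zero: object is going away */
    }

    int main(void)
    {
        atomic_int users = 1;

        printf("live object:  %d\n", inc_not_zero(&users));  /* prints 1 */
        atomic_store(&users, 0);   /* last reference was dropped */
        printf("dying object: %d\n", inc_not_zero(&users));  /* prints 0 */
        return 0;
    }

Under RCU the structure itself stays readable for a grace period, so the
lookup may safely perform this test; what it must not do is blindly
atomic_inc() a count that may already be zero.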

[JK: Added comment into the code]

Reviewed-by: Jeff Moyer <[email protected]>
Signed-off-by: Nick Piggin <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
---
fs/aio.c | 35 ++++++++++++++++++++++++-----------
1 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index fc557a3..b4dd668 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -239,15 +239,23 @@ static void __put_ioctx(struct kioctx *ctx)
call_rcu(&ctx->rcu_head, ctx_rcu_free);
}

-#define get_ioctx(kioctx) do { \
- BUG_ON(atomic_read(&(kioctx)->users) <= 0); \
- atomic_inc(&(kioctx)->users); \
-} while (0)
-#define put_ioctx(kioctx) do { \
- BUG_ON(atomic_read(&(kioctx)->users) <= 0); \
- if (unlikely(atomic_dec_and_test(&(kioctx)->users))) \
- __put_ioctx(kioctx); \
-} while (0)
+static inline void get_ioctx(struct kioctx *kioctx)
+{
+ BUG_ON(atomic_read(&kioctx->users) <= 0);
+ atomic_inc(&kioctx->users);
+}
+
+static inline int try_get_ioctx(struct kioctx *kioctx)
+{
+ return atomic_inc_not_zero(&kioctx->users);
+}
+
+static inline void put_ioctx(struct kioctx *kioctx)
+{
+ BUG_ON(atomic_read(&kioctx->users) <= 0);
+ if (unlikely(atomic_dec_and_test(&kioctx->users)))
+ __put_ioctx(kioctx);
+}

/* ioctx_alloc
* Allocates and initializes an ioctx. Returns an ERR_PTR if it failed.
@@ -601,8 +609,13 @@ static struct kioctx *lookup_ioctx(unsigned long ctx_id)
rcu_read_lock();

hlist_for_each_entry_rcu(ctx, n, &mm->ioctx_list, list) {
- if (ctx->user_id == ctx_id && !ctx->dead) {
- get_ioctx(ctx);
+ /*
+ * RCU protects us against accessing freed memory but
+ * we have to be careful not to get a reference when the
+ * reference count already dropped to 0 (ctx->dead test
+ * is unreliable because of races).
+ */
+ if (ctx->user_id == ctx_id && !ctx->dead && try_get_ioctx(ctx)){
ret = ctx;
break;
}
--
1.7.1

2011-02-15 12:59:54

by Jan Kara

Subject: [PATCH 2/2] fs: Fix race between io_destroy() and io_submit() in AIO

A race can occur when io_submit() races with io_destroy():

 CPU1                                          CPU2
 io_submit()
   do_io_submit()
     ...
     ctx = lookup_ioctx(ctx_id);
                                               io_destroy()
     Now do_io_submit() holds the last reference to ctx.
     ...
     queue new AIO
     put_ioctx(ctx) - frees ctx with active AIOs

We solve this issue by checking whether ctx is being destroyed in the
AIO submission path after adding the new AIO to ctx. Then we are
guaranteed that either io_destroy() waits for the new AIO or we see
that ctx is being destroyed and bail out.
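
To make the ordering concrete, here is a minimal standalone model of the
handshake (a sketch only: a pthread mutex stands in for ctx->ctx_lock, a
condition variable for the wait in wait_for_all_aios(), and the names are
illustrative, not the kernel code):

    #include <pthread.h>
    #include <stdio.h>

    static pthread_mutex_t ctx_lock = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t  ctx_wait = PTHREAD_COND_INITIALIZER;
    static int dead;           /* models ctx->dead */
    static int reqs_active;    /* models ctx->reqs_active */

    /* Models io_destroy(): mark the context dead, then wait until no
     * requests remain before "freeing" it. */
    static void *destroyer(void *arg)
    {
        (void)arg;
        pthread_mutex_lock(&ctx_lock);
        dead = 1;
        while (reqs_active > 0)
            pthread_cond_wait(&ctx_wait, &ctx_lock);
        pthread_mutex_unlock(&ctx_lock);
        puts("destroy: nothing in flight, safe to free ctx");
        return NULL;
    }

    /* Models io_submit_one(): account the new request first, then check
     * the dead flag under the same lock and bail out if it is set. */
    static int submit_one(void)
    {
        int ret = 0;

        pthread_mutex_lock(&ctx_lock);
        reqs_active++;              /* "queue new AIO" */
        if (dead)
            ret = -1;               /* -EINVAL in the kernel */
        /* Pretend the request completes (or is dropped) right away. */
        if (--reqs_active == 0)
            pthread_cond_signal(&ctx_wait);
        pthread_mutex_unlock(&ctx_lock);
        return ret;
    }

    int main(void)
    {
        pthread_t t;

        pthread_create(&t, NULL, destroyer, NULL);
        printf("submit: %s\n", submit_one() ? "rejected, ctx dead" : "accepted");
        pthread_join(t, NULL);
        return 0;
    }

Because both sides run under the same lock, the critical sections
serialize: either the submitter sees dead already set and bails out, or
the destroyer sees the elevated request count and waits.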

Reviewed-by: Jeff Moyer <[email protected]>
Signed-off-by: Jan Kara <[email protected]>
CC: Nick Piggin <[email protected]>
---
fs/aio.c | 15 +++++++++++++++
1 files changed, 15 insertions(+), 0 deletions(-)

diff --git a/fs/aio.c b/fs/aio.c
index b4dd668..0244c04 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1642,6 +1642,21 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
goto out_put_req;

spin_lock_irq(&ctx->ctx_lock);
+ /*
+ * We could have raced with io_destroy() and are currently holding a
+ * reference to ctx which should be destroyed. We cannot submit IO
+ * since ctx gets freed as soon as io_submit() puts its reference.
+ * The check here is reliable since io_destroy() sets ctx->dead before
+ * waiting for outstanding IO. Thus if we don't see ctx->dead set here,
+ * io_destroy() waits for our IO to finish.
+ * The check is inside ctx->ctx_lock to avoid extra memory barrier
+ * in this fast path...
+ */
+ if (ctx->dead) {
+ spin_unlock_irq(&ctx->ctx_lock);
+ ret = -EINVAL;
+ goto out_put_req;
+ }
aio_run_iocb(req);
if (!list_empty(&ctx->run_list)) {
/* drain the run list */
--
1.7.1

2011-02-15 13:55:21

by Jeff Moyer

Subject: Re: [PATCH 0/2] aio: Fix use after free bugs

Jan Kara <[email protected]> writes:

> Hi Al,
>
> The following two patches fix use after free bugs in AIO code. I'm not
> completely sure you're the right one to merge these but hopefully yes ;).
> Could you please merge them? Thanks.

I've always pushed aio changes through akpm, fwiw.

Cheers,
Jeff

2011-02-15 16:58:47

by Milton Miller

Subject: [2/2] fs: Fix race between io_destroy() and io_submit() in AIO

> A race can occur when io_submit() races with io_destroy():
>
> CPU1                                          CPU2
> io_submit()
>   do_io_submit()
>     ...
>     ctx = lookup_ioctx(ctx_id);
>                                               io_destroy()
>     Now do_io_submit() holds the last reference to ctx.
>     ...
>     queue new AIO
>     put_ioctx(ctx) - frees ctx with active AIOs
>
> We solve this issue by checking whether ctx is being destroyed
> in AIO submission path after adding new AIO to ctx. Then we
> are guaranteed that either io_destroy() waits for new AIO or
> we see that ctx is being destroyed and bail out.
>
> Reviewed-by: Jeff Moyer <[email protected]>
> Signed-off-by: Jan Kara <[email protected]>
> CC: Nick Piggin <[email protected]>
>
> ---
> fs/aio.c | 15 +++++++++++++++
> 1 files changed, 15 insertions(+), 0 deletions(-)
>
> diff --git a/fs/aio.c b/fs/aio.c
> index b4dd668..0244c04 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -1642,6 +1642,21 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
> goto out_put_req;
>
> spin_lock_irq(&ctx->ctx_lock);
> + /*
> + * We could have raced with io_destroy() and are currently holding a
> + * reference to ctx which should be destroyed. We cannot submit IO
> + * since ctx gets freed as soon as io_submit() puts its reference.
> + * The check here is reliable since io_destroy() sets ctx->dead before
> + * waiting for outstanding IO. Thus if we don't see ctx->dead set here,
> + * io_destroy() waits for our IO to finish.
> + * The check is inside ctx->ctx_lock to avoid extra memory barrier
> + * in this fast path...
> + */

When reading this comment, and with all of the recent discussions I
had with Paul in the smp ipi thread (especially with respect to third
party writes), I looked to check that the spinlock was paired with the
spinlock that sets dead in io_destroy. It is not. It took me some time
to find that the paired lock is actually in wait_for_all_aios. Note that
dead is also set in aio_cancel_all, which runs under the same spinlock.

Please update this comment about the lack of a memory barrier to reflect
the locking.
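
For reference, this is roughly where that pairing actually lives (a
heavily abridged paraphrase of fs/aio.c of this era - see the tree for
the exact code):

    io_destroy()
      -> aio_cancel_all(ctx)
            spin_lock_irq(&ctx->ctx_lock);
            ctx->dead = 1;      /* dead is (also) set under ctx_lock */
            /* ... cancel everything still on the active list ... */
            spin_unlock_irq(&ctx->ctx_lock);
      -> wait_for_all_aios(ctx)
            spin_lock_irq(&ctx->ctx_lock);
            /* ... sleep until ctx->reqs_active drops to zero, retaking
             *     ctx->ctx_lock around every re-check ... */
            spin_unlock_irq(&ctx->ctx_lock);

So the lock that pairs with the one taken in io_submit_one() is
ctx->ctx_lock as used by aio_cancel_all() and wait_for_all_aios(), not
the lock taken directly in io_destroy() itself.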

> + if (ctx->dead) {
> + spin_unlock_irq(&ctx->ctx_lock);
> + ret = -EINVAL;
> + goto out_put_req;
> + }
> aio_run_iocb(req);
> if (!list_empty(&ctx->run_list)) {
> /* drain the run list */

thanks,
milton

2011-02-15 17:16:23

by Jan Kara

Subject: Re: [2/2] fs: Fix race between io_destroy() and io_submit() in AIO

On Tue 15-02-11 12:59:24, Milton Miller wrote:
> > A race can occur when io_submit() races with io_destroy():
> >
> > CPU1                                          CPU2
> > io_submit()
> >   do_io_submit()
> >     ...
> >     ctx = lookup_ioctx(ctx_id);
> >                                               io_destroy()
> >     Now do_io_submit() holds the last reference to ctx.
> >     ...
> >     queue new AIO
> >     put_ioctx(ctx) - frees ctx with active AIOs
> >
> > We solve this issue by checking whether ctx is being destroyed
> > in AIO submission path after adding new AIO to ctx. Then we
> > are guaranteed that either io_destroy() waits for new AIO or
> > we see that ctx is being destroyed and bail out.
> >
> > Reviewed-by: Jeff Moyer <[email protected]>
> > Signed-off-by: Jan Kara <[email protected]>
> > CC: Nick Piggin <[email protected]>
> >
> > ---
> > fs/aio.c | 15 +++++++++++++++
> > 1 files changed, 15 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/aio.c b/fs/aio.c
> > index b4dd668..0244c04 100644
> > --- a/fs/aio.c
> > +++ b/fs/aio.c
> > @@ -1642,6 +1642,21 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
> > goto out_put_req;
> >
> > spin_lock_irq(&ctx->ctx_lock);
> > + /*
> > + * We could have raced with io_destroy() and are currently holding a
> > + * reference to ctx which should be destroyed. We cannot submit IO
> > + * since ctx gets freed as soon as io_submit() puts its reference.
> > + * The check here is reliable since io_destroy() sets ctx->dead before
> > + * waiting for outstanding IO. Thus if we don't see ctx->dead set here,
> > + * io_destroy() waits for our IO to finish.
> > + * The check is inside ctx->ctx_lock to avoid extra memory barrier
> > + * in this fast path...
> > + */
>
> When reading this comment, and with all of the recent discussions I
> had with Paul in the smp ipi thread (especially with respect to third
> party writes), I looked to check that the spinlock was paired with the
> spinlock that sets dead in io_destroy. It is not. It took me some time
> to find that the paired lock is actually in wait_for_all_aios. Note that
> dead is also set in aio_cancel_all, which runs under the same spinlock.
>
> Please update this comment about the lack of a memory barrier to reflect
> the locking.
Hum, sorry but I don't understand. The above message wants to say that
io_destroy() does
    ctx->dead = 1
    barrier (implied by a spin_unlock)
    wait for reqs_active to get to 0

while io_submit() does
    increment reqs_active
    barrier (implied by a spin_lock - on a different lock but that does not
    matter as we only need the barrier semantics)
    check ctx->dead

So if io_submit() gets past the ctx->dead check, io_destroy() will certainly
wait for our reference in reqs_active to be released.

I don't see any lock pairing needed here... But maybe I'm missing something.

Honza
>
> > + if (ctx->dead) {
> > + spin_unlock_irq(&ctx->ctx_lock);
> > + ret = -EINVAL;
> > + goto out_put_req;
> > + }
> > aio_run_iocb(req);
> > if (!list_empty(&ctx->run_list)) {
> > /* drain the run list */
>
> thanks,
> milton
--
Jan Kara <[email protected]>
SUSE Labs, CR

2011-02-15 18:51:18

by Milton Miller

Subject: Re: [2/2] fs: Fix race between io_destroy() and io_submit() in AIO


Hi. I chatted with Paul a bit to clarify my thoughts.

On Tue, 15 Feb 2011 about 11:16:16 -0600, Jan Kara wrote:
> On Tue 15-02-11 12:59:24, Milton Miller wrote:
> > > A race can occur when io_submit() races with io_destroy():
> > >
> > > CPU1                                          CPU2
> > > io_submit()
> > >   do_io_submit()
> > >     ...
> > >     ctx = lookup_ioctx(ctx_id);
> > >                                               io_destroy()
> > >     Now do_io_submit() holds the last reference to ctx.
> > >     ...
> > >     queue new AIO
> > >     put_ioctx(ctx) - frees ctx with active AIOs
> > >
> > > We solve this issue by checking whether ctx is being destroyed
> > > in AIO submission path after adding new AIO to ctx. Then we
> > > are guaranteed that either io_destroy() waits for new AIO or
> > > we see that ctx is being destroyed and bail out.
> > >
> > > Reviewed-by: Jeff Moyer <[email protected]>
> > > Signed-off-by: Jan Kara <[email protected]>
> > > CC: Nick Piggin <[email protected]>
> > >
> > > ---
> > > fs/aio.c | 15 +++++++++++++++
> > > 1 files changed, 15 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/fs/aio.c b/fs/aio.c
> > > index b4dd668..0244c04 100644
> > > --- a/fs/aio.c
> > > +++ b/fs/aio.c
> > > @@ -1642,6 +1642,21 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
> > > goto out_put_req;
> > >
> > > spin_lock_irq(&ctx->ctx_lock);
> > > + /*
> > > + * We could have raced with io_destroy() and are currently holding a
> > > + * reference to ctx which should be destroyed. We cannot submit IO
> > > + * since ctx gets freed as soon as io_submit() puts its reference.
> > > + * The check here is reliable since io_destroy() sets ctx->dead before
> > > + * waiting for outstanding IO. Thus if we don't see ctx->dead set here,
> > > + * io_destroy() waits for our IO to finish.
> > > + * The check is inside ctx->ctx_lock to avoid extra memory barrier
> > > + * in this fast path...
> > > + */
> >
> > When reading this comment, and with all of the recent discussions I
> > had with Paul in the smp ipi thread (especially with respect to third
> > party writes), I looked to check that the spinlock was paired with the
> > spinlock that sets dead in io_destroy. It is not. It took me some time
> > to find that the paired lock is actually in wait_for_all_aios. Note that
> > dead is also set in aio_cancel_all, which runs under the same spinlock.
> >
> > Please update this comment about the lack of a memory barrier to reflect
> > the locking.

This locking description is wrong:

> Hum, sorry but I don't understand. The above message wants to say that
> io_destroy() does
> ctx->dead = 1
> barrier (implied by a spin_unlock)

No, spin_unlock only does a release barrier.

> wait for reqs_active to get to 0

This read can move up into the spinlocked region (up to the lock acquire).

>
> while io_submit() does
> increment reqs_active
> barrier (implied by a spin_lock - on a different lock but that does not
> matter as we only need the barrier semantics)

No, only an acquire barrier; old writes can move into the spinlocked region.

> check ctx->dead

The increment can move down past this check to the unlock here.

>
> So if io_submit() gets past ctx->dead check, io_destroy() will certainly
> wait for our reference in reqs_active to be released.
>
> I don't see any lock pairing needed here... But maybe I miss something.
>
> Honza

spin_lock and spin_unlock are only half barriers, as described in
Documentation/memory-barriers.txt.
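
In the terms of that file, a lock is an acquire and an unlock is a
release; each constrains motion in one direction only. Schematically
(the variables are made up for the illustration):

    x = 1;              /* may move DOWN into the critical section    */
    spin_lock(&l);      /* ACQUIRE: accesses after it stay after it   */
    a = y;              /* stays inside                               */
    spin_unlock(&l);    /* RELEASE: accesses before it stay before it */
    w = 2;              /* may move UP into the critical section      */

So a release on one CPU paired with an acquire of a different lock on
another CPU does not, by itself, order the destroyer's read of
reqs_active against the submitter's increment - which is exactly the
reordering described above.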


Now, as I said, the code is ok because the active count is read and
written under ctx->ctx_lock, and aio_cancel_all sets dead under
that lock.
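
Spelled out (a sketch of the argument, not a patch; every section below
takes the same ctx->ctx_lock, so they are totally ordered against each
other):

    io_destroy() side:
        aio_cancel_all():     lock;  ctx->dead = 1;                    unlock;
        wait_for_all_aios():  lock;  re-check ctx->reqs_active == 0;   unlock;  (repeated)

    io_submit_one() side:
        request accounting:   lock;  ctx->reqs_active++;               unlock;
        the new check:        lock;  if (ctx->dead) bail out; ...;     unlock;

If the submitter's dead check ran before aio_cancel_all() set dead, then
its reqs_active increment also ran before it, and wait_for_all_aios() is
guaranteed to see the elevated count and wait for that request.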

But the comment needs to reflect that, and not just say that the code is
inside some random spin_lock region instead of a memory barrier, which is
not sufficient. Bad locking descriptions lead to bad code in the future,
either through copying the code to another context or through future work
removing the additional constraints that were not mentioned.

So please, comment which locks are being used here, as what
you described is not enough.


thanks,
milton

2011-02-15 19:15:19

by Jan Kara

Subject: Re: [2/2] fs: Fix race between io_destroy() and io_submit() in AIO

Hi,

On Tue 15-02-11 12:50:32, Milton Miller wrote:
> On Tue, 15 Feb 2011 about 11:16:16 -0600, Jan Kara wrote:
> > On Tue 15-02-11 12:59:24, Milton Miller wrote:
> > > > A race can occur when io_submit() races with io_destroy():
> > > >
> > > > CPU1                                          CPU2
> > > > io_submit()
> > > >   do_io_submit()
> > > >     ...
> > > >     ctx = lookup_ioctx(ctx_id);
> > > >                                               io_destroy()
> > > >     Now do_io_submit() holds the last reference to ctx.
> > > >     ...
> > > >     queue new AIO
> > > >     put_ioctx(ctx) - frees ctx with active AIOs
> > > >
> > > > We solve this issue by checking whether ctx is being destroyed
> > > > in AIO submission path after adding new AIO to ctx. Then we
> > > > are guaranteed that either io_destroy() waits for new AIO or
> > > > we see that ctx is being destroyed and bail out.
> > > >
> > > > Reviewed-by: Jeff Moyer <[email protected]>
> > > > Signed-off-by: Jan Kara <[email protected]>
> > > > CC: Nick Piggin <[email protected]>
> > > >
> > > > ---
> > > > fs/aio.c | 15 +++++++++++++++
> > > > 1 files changed, 15 insertions(+), 0 deletions(-)
> > > >
> > > > diff --git a/fs/aio.c b/fs/aio.c
> > > > index b4dd668..0244c04 100644
> > > > --- a/fs/aio.c
> > > > +++ b/fs/aio.c
> > > > @@ -1642,6 +1642,21 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
> > > > goto out_put_req;
> > > >
> > > > spin_lock_irq(&ctx->ctx_lock);
> > > > + /*
> > > > + * We could have raced with io_destroy() and are currently holding a
> > > > + * reference to ctx which should be destroyed. We cannot submit IO
> > > > + * since ctx gets freed as soon as io_submit() puts its reference.
> > > > + * The check here is reliable since io_destroy() sets ctx->dead before
> > > > + * waiting for outstanding IO. Thus if we don't see ctx->dead set here,
> > > > + * io_destroy() waits for our IO to finish.
> > > > + * The check is inside ctx->ctx_lock to avoid extra memory barrier
> > > > + * in this fast path...
> > > > + */
> > >
> > > When reading this comment, and with all of the recent discussions I
> > > had with Paul in the smp ipi thread (especially with respect to third
> > > party writes), I looked to check that the spinlock was paired with the
> > > spinlock that sets dead in io_destroy. It is not. It took me some time
> > > to find that the paired lock is actually in wait_for_all_aios. Note that
> > > dead is also set in aio_cancel_all, which runs under the same spinlock.
> > >
> > > Please update this comment about the lack of a memory barrier to reflect
> > > the locking.
>
> This locking description is wrong:
>
> > Hum, sorry but I don't understand. The above message wants to say that
> > io_destroy() does
> > ctx->dead = 1
> > barrier (implied by a spin_unlock)
>
> no spin_unlock only does a release barrier.
>
> > wait for reqs_active to get to 0
>
> This read can move up into the spinlocked region (up to the lock acquire).
>
> >
> > while io_submit() does
> > increment reqs_active
> > barrier (implied by a spin_lock - on a different lock but that does not
> > matter as we only need the barrier semantics)
>
> No only an acquire barrier, old writes can move into the spinlock region
>
> > check ctx->dead
>
> the increment can move down past this check to the unlock here.
Ah OK, you're right. I was typing too fast and thinking too slow ;).

> > So if io_submit() gets past ctx->dead check, io_destroy() will certainly
> > wait for our reference in reqs_active to be released.
> >
> > I don't see any lock pairing needed here... But maybe I miss something.
> >
> > Honza
>
> spin lock and unlock are only half barriers as described in
> Documentation/memory-barriers.txt
>
>
> Now, as I said, the code is ok because the active count is read and
> written under ctx->ctx_lock, and aio_cancel_all sets dead under
> that lock.
>
> But the comment needs to reflect that, and not just say that the code is
> inside some random spin_lock region instead of a memory barrier, which is
> not sufficient. Bad locking descriptions lead to bad code in the future,
> either through copying the code to another context or through future work
> removing the additional constraints that were not mentioned.
>
> So please, comment which locks are being used here, as what
> you described is not enough.
Yep, I'll improve the comment. Thanks for the explanation.

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2011-02-15 19:33:24

by Jan Kara

Subject: Re: [2/2] fs: Fix race between io_destroy() and io_submit() in AIO

Hello,

On Tue 15-02-11 20:15:14, Jan Kara wrote:
> On Tue 15-02-11 12:50:32, Milton Miller wrote:
> > On Tue, 15 Feb 2011 about 11:16:16 -0600, Jan Kara wrote:
> > > On Tue 15-02-11 12:59:24, Milton Miller wrote:
> > > > > A race can occur when io_submit() races with io_destroy():
> > > > >
> > > > > CPU1                                          CPU2
> > > > > io_submit()
> > > > >   do_io_submit()
> > > > >     ...
> > > > >     ctx = lookup_ioctx(ctx_id);
> > > > >                                               io_destroy()
> > > > >     Now do_io_submit() holds the last reference to ctx.
> > > > >     ...
> > > > >     queue new AIO
> > > > >     put_ioctx(ctx) - frees ctx with active AIOs
> > > > >
> > > > > We solve this issue by checking whether ctx is being destroyed
> > > > > in AIO submission path after adding new AIO to ctx. Then we
> > > > > are guaranteed that either io_destroy() waits for new AIO or
> > > > > we see that ctx is being destroyed and bail out.
> > > > >
> > > > > Reviewed-by: Jeff Moyer <[email protected]>
> > > > > Signed-off-by: Jan Kara <[email protected]>
> > > > > CC: Nick Piggin <[email protected]>
> > > > >
> > > > > ---
> > > > > fs/aio.c | 15 +++++++++++++++
> > > > > 1 files changed, 15 insertions(+), 0 deletions(-)
> > > > >
> > > > > diff --git a/fs/aio.c b/fs/aio.c
> > > > > index b4dd668..0244c04 100644
> > > > > --- a/fs/aio.c
> > > > > +++ b/fs/aio.c
> > > > > @@ -1642,6 +1642,21 @@ static int io_submit_one(struct kioctx *ctx, struct iocb __user *user_iocb,
> > > > > goto out_put_req;
> > > > >
> > > > > spin_lock_irq(&ctx->ctx_lock);
> > > > > + /*
> > > > > + * We could have raced with io_destroy() and are currently holding a
> > > > > + * reference to ctx which should be destroyed. We cannot submit IO
> > > > > + * since ctx gets freed as soon as io_submit() puts its reference.
> > > > > + * The check here is reliable since io_destroy() sets ctx->dead before
> > > > > + * waiting for outstanding IO. Thus if we don't see ctx->dead set here,
> > > > > + * io_destroy() waits for our IO to finish.
> > > > > + * The check is inside ctx->ctx_lock to avoid extra memory barrier
> > > > > + * in this fast path...
> > > > > + */
> > > >
> > > > When reading this comment, and with all of the recent discussions I
> > > > had with Paul in the smp ipi thread (especially with respect to third
> > > > party writes), I looked to check that the spinlock was paired with the
> > > > spinlock that sets dead in io_destroy. It is not. It took me some time
> > > > to find that the paired lock is actually in wait_for_all_aios. Note that
> > > > dead is also set in aio_cancel_all, which runs under the same spinlock.
> > > >
> > > > Please update this comment about the lack of a memory barrier to reflect
> > > > the locking.
> >
> > This locking description is wrong:
> >
> > > Hum, sorry but I don't understand. The above message wants to say that
> > > io_destroy() does
> > > ctx->dead = 1
> > > barrier (implied by a spin_unlock)
> >
> > no spin_unlock only does a release barrier.
> >
> > > wait for reqs_active to get to 0
> >
> > This read can move up into the spinlocked region (up to the lock acquire).
> >
> > >
> > > while io_submit() does
> > > increment reqs_active
> > > barrier (implied by a spin_lock - on a different lock but that does not
> > > matter as we only need the barrier semantics)
> >
> > No only an acquire barrier, old writes can move into the spinlock region
> >
> > > check ctx->dead
> >
> > the increment can move down past this check to the unlock here.
> Ah OK, you're right. I was typing too fast and thinking too slow ;).
>
> > > So if io_submit() gets past ctx->dead check, io_destroy() will certainly
> > > wait for our reference in reqs_active to be released.
> > >
> > > I don't see any lock pairing needed here... But maybe I miss something.
> > >
> > > Honza
> >
> > spin lock and unlock are only half barriers as described in
> > Documentation/memory-barriers.txt
> >
> >
> > Now, as I said, the code is ok because the active count is read and
> > written under ctx->ctx_lock, and aio_cancel_all sets dead under
> > that lock.
> >
> > But the comment needs to reflect that, and not just say that the code is
> > inside some random spin_lock region instead of a memory barrier, which is
> > not sufficient. Bad locking descriptions lead to bad code in the future,
> > either through copying the code to another context or through future work
> > removing the additional constraints that were not mentioned.
> >
> > So please, comment which locks are being used here, as what
> > you described is not enough.
> Yep, I'll improve the comment. Thanks for explanation.
Do you like this comment better?
 /*
  * We could have raced with io_destroy() and are currently holding a
  * reference to ctx which should be destroyed. We cannot submit IO
  * since ctx gets freed as soon as io_submit() puts its reference. The
  * check here is reliable: io_destroy() sets ctx->dead before waiting
  * for outstanding IO and the barrier between these two is realized by
  * unlock of mm->ioctx_lock and lock of ctx->ctx_lock. Analogously we
  * increment ctx->reqs_active before checking for ctx->dead and the
  * barrier is realized by unlock and lock of ctx->ctx_lock. Thus if we
  * don't see ctx->dead set here, io_destroy() waits for our IO to
  * finish.
  */


Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR