2018-01-15 17:34:37

by Max Kellermann

[permalink] [raw]
Subject: [PATCH] drivers/scsi/qla2xxx: fix double free bug after firmware timeout

When the qla2xxx firmware is unavailable, eventually
qla2x00_sp_timeout() is reached, which calls the timeout function and
frees the srb_t instance.

The timeout function always resolves to qla2x00_async_iocb_timeout(),
which invokes another callback function called "done". All of these
qla2x00_*_sp_done() callbacks also free the srb_t instance; after
returning to qla2x00_sp_timeout(), it is freed again.

The fix is to remove the "sp->free(sp)" call from qla2x00_sp_timeout()
and add it to those code paths in qla2x00_async_iocb_timeout() which
do not already free the object.

This is how it looks like with KASAN:

BUG: KASAN: use-after-free in qla2x00_sp_timeout+0x228/0x250
Read of size 8 at addr ffff88278147a590 by task swapper/2/0

Allocated by task 1502:
save_stack+0x33/0xa0
kasan_kmalloc+0xa0/0xd0
kmem_cache_alloc+0xb8/0x1c0
mempool_alloc+0xd6/0x260
qla24xx_async_gnl+0x3c5/0x1100

Freed by task 0:
save_stack+0x33/0xa0
kasan_slab_free+0x72/0xc0
kmem_cache_free+0x75/0x200
qla24xx_async_gnl_sp_done+0x556/0x9e0
qla2x00_async_iocb_timeout+0x1c7/0x420
qla2x00_sp_timeout+0x16d/0x250
call_timer_fn+0x36/0x200

The buggy address belongs to the object at ffff88278147a440
which belongs to the cache qla2xxx_srbs of size 344
The buggy address is located 336 bytes inside of
344-byte region [ffff88278147a440, ffff88278147a598)

Signed-off-by: Max Kellermann <[email protected]>
---
drivers/scsi/qla2xxx/qla_init.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
index b5b48ddca962..801890564e00 100644
--- a/drivers/scsi/qla2xxx/qla_init.c
+++ b/drivers/scsi/qla2xxx/qla_init.c
@@ -58,7 +58,6 @@ qla2x00_sp_timeout(unsigned long __data)
req->outstanding_cmds[sp->handle] = NULL;
iocb = &sp->u.iocb_cmd;
iocb->timeout(sp);
- sp->free(sp);
spin_unlock_irqrestore(&vha->hw->hardware_lock, flags);
}

@@ -121,9 +120,11 @@ qla2x00_async_iocb_timeout(void *data)
ea.data[1] = lio->u.logio.data[1];
ea.sp = sp;
qla24xx_handle_plogi_done_event(fcport->vha, &ea);
+ sp->free(sp);
break;
case SRB_LOGOUT_CMD:
qlt_logo_completion_handler(fcport, QLA_FUNCTION_TIMEOUT);
+ sp->free(sp);
break;
case SRB_CT_PTHRU_CMD:
case SRB_MB_IOCB:


2018-01-15 19:58:52

by Madhani, Himanshu

[permalink] [raw]
Subject: Re: [PATCH] drivers/scsi/qla2xxx: fix double free bug after firmware timeout

Hi Max,

> On Jan 15, 2018, at 9:26 AM, Max Kellermann <[email protected]> wrote:
>
> When the qla2xxx firmware is unavailable, eventually
> qla2x00_sp_timeout() is reached, which calls the timeout function and
> frees the srb_t instance.
>
> The timeout function always resolves to qla2x00_async_iocb_timeout(),
> which invokes another callback function called "done". All of these
> qla2x00_*_sp_done() callbacks also free the srb_t instance; after
> returning to qla2x00_sp_timeout(), it is freed again.
>
> The fix is to remove the "sp->free(sp)" call from qla2x00_sp_timeout()
> and add it to those code paths in qla2x00_async_iocb_timeout() which
> do not already free the object.
>
> This is how it looks like with KASAN:
>
> BUG: KASAN: use-after-free in qla2x00_sp_timeout+0x228/0x250
> Read of size 8 at addr ffff88278147a590 by task swapper/2/0
>
> Allocated by task 1502:
> save_stack+0x33/0xa0
> kasan_kmalloc+0xa0/0xd0
> kmem_cache_alloc+0xb8/0x1c0
> mempool_alloc+0xd6/0x260
> qla24xx_async_gnl+0x3c5/0x1100
>
> Freed by task 0:
> save_stack+0x33/0xa0
> kasan_slab_free+0x72/0xc0
> kmem_cache_free+0x75/0x200
> qla24xx_async_gnl_sp_done+0x556/0x9e0
> qla2x00_async_iocb_timeout+0x1c7/0x420
> qla2x00_sp_timeout+0x16d/0x250
> call_timer_fn+0x36/0x200
>
> The buggy address belongs to the object at ffff88278147a440
> which belongs to the cache qla2xxx_srbs of size 344
> The buggy address is located 336 bytes inside of
> 344-byte region [ffff88278147a440, ffff88278147a598)
>
> Signed-off-by: Max Kellermann <[email protected]>
> ---
> drivers/scsi/qla2xxx/qla_init.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c
> index b5b48ddca962..801890564e00 100644
> --- a/drivers/scsi/qla2xxx/qla_init.c
> +++ b/drivers/scsi/qla2xxx/qla_init.c
> @@ -58,7 +58,6 @@ qla2x00_sp_timeout(unsigned long __data)
> req->outstanding_cmds[sp->handle] = NULL;
> iocb = &sp->u.iocb_cmd;
> iocb->timeout(sp);
> - sp->free(sp);
> spin_unlock_irqrestore(&vha->hw->hardware_lock, flags);
> }
>
> @@ -121,9 +120,11 @@ qla2x00_async_iocb_timeout(void *data)
> ea.data[1] = lio->u.logio.data[1];
> ea.sp = sp;
> qla24xx_handle_plogi_done_event(fcport->vha, &ea);
> + sp->free(sp);
> break;
> case SRB_LOGOUT_CMD:
> qlt_logo_completion_handler(fcport, QLA_FUNCTION_TIMEOUT);
> + sp->free(sp);
> break;
> case SRB_CT_PTHRU_CMD:
> case SRB_MB_IOCB:
>

We have patch to prevent this double free in 4.16/scsi-queue already.

https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git/commit/?h=4.16/scsi-queue&id=045d6ea200af794ba15515984cff63787a7fc3c0

Thanks,
- Himanshu

2018-01-15 20:37:36

by Max Kellermann

[permalink] [raw]
Subject: Re: [PATCH] drivers/scsi/qla2xxx: fix double free bug after firmware timeout

On 2018/01/15 20:58, "Madhani, Himanshu" <[email protected]> wrote:
> We have patch to prevent this double free in 4.16/scsi-queue
> already.

No, let me repeat: this is a different bug!

Your bug is about the free call after waiting for completion
synchronously in qla24xx_els_dcmd_iocb(), after it was already freed
by qla2x00_sp_timeout().

My bug is about free in qla2x00_*_sp_done() and again in
qla2x00_sp_timeout(). My patch description describes exactly that.

And you know what? My patch fixes both bugs. It is superior to the
one that was merged 4 weeks later, isn't it?

You NACKed my patch 5 weeks ago, and I explained to you that you were
talking about a different bug, but you never replied to that.

Max

2018-01-15 21:03:54

by Madhani, Himanshu

[permalink] [raw]
Subject: Re: [PATCH] drivers/scsi/qla2xxx: fix double free bug after firmware timeout

Hi Max,

> On Jan 15, 2018, at 12:37 PM, Max Kellermann <[email protected]> wrote:
>
> On 2018/01/15 20:58, "Madhani, Himanshu" <[email protected]> wrote:
>> We have patch to prevent this double free in 4.16/scsi-queue
>> already.
>
> No, let me repeat: this is a different bug!
>
> Your bug is about the free call after waiting for completion
> synchronously in qla24xx_els_dcmd_iocb(), after it was already freed
> by qla2x00_sp_timeout().
>
> My bug is about free in qla2x00_*_sp_done() and again in
> qla2x00_sp_timeout(). My patch description describes exactly that.
>
> And you know what? My patch fixes both bugs. It is superior to the
> one that was merged 4 weeks later, isn't it?
>
> You NACKed my patch 5 weeks ago, and I explained to you that you were
> talking about a different bug, but you never replied to that.
>
> Max

Thanks for refreshing my memory. Sorry about not getting back to you earlier.
I was out for couple weeks so it got dropped out of my bucket for response.

I do agree there is double free issue here. I’ll discuss this internally with folks who has
deeper history of this code and will get back to you and see which approach we want to
take.

Thanks,
- Himanshu

2018-01-18 01:03:15

by Madhani, Himanshu

[permalink] [raw]
Subject: Re: [PATCH] drivers/scsi/qla2xxx: fix double free bug after firmware timeout

Max,

> On Jan 15, 2018, at 12:37 PM, Max Kellermann <[email protected]> wrote:
>
> On 2018/01/15 20:58, "Madhani, Himanshu" <[email protected]> wrote:
>> We have patch to prevent this double free in 4.16/scsi-queue
>> already.
>
> No, let me repeat: this is a different bug!
>
> Your bug is about the free call after waiting for completion
> synchronously in qla24xx_els_dcmd_iocb(), after it was already freed
> by qla2x00_sp_timeout().
>
> My bug is about free in qla2x00_*_sp_done() and again in
> qla2x00_sp_timeout(). My patch description describes exactly that.
>
> And you know what? My patch fixes both bugs. It is superior to the
> one that was merged 4 weeks later, isn't it?
>
> You NACKed my patch 5 weeks ago, and I explained to you that you were
> talking about a different bug, but you never replied to that.
>
> Max

Just to update you, we are working on fixing this internally and will post
patch soon.

I’ll add you are Reporter and please review and let us know your comments.

Thanks,
- Himanshu