2020-09-11 07:18:18

by Brooke Basile

[permalink] [raw]
Subject: [PATCH] wireless: ath9k: hif_usb: fix race condition between usb_get_urb() and usb_kill_anchored_urbs()

Calls to usb_kill_anchored_urbs() after usb_kill_urb() on multiprocessor
systems create a race condition in which usb_kill_anchored_urbs() deallocates
the URB before the completer callback is called in usb_kill_urb(), resulting
in a use-after-free.
To fix this, add proper lock protection to usb_kill_urb() calls that can
possibly run concurrently with usb_kill_anchored_urbs().

Reported-by: [email protected]
Link: https://syzkaller.appspot.com/bug?id=cabffad18eb74197f84871802fd2c5117b61febf
Signed-off-by: Brooke Basile <[email protected]>
---
drivers/net/wireless/ath/ath9k/hif_usb.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)

diff --git a/drivers/net/wireless/ath/ath9k/hif_usb.c b/drivers/net/wireless/ath/ath9k/hif_usb.c
index 3f563e02d17d..2ed98aaed6fb 100644
--- a/drivers/net/wireless/ath/ath9k/hif_usb.c
+++ b/drivers/net/wireless/ath/ath9k/hif_usb.c
@@ -449,10 +449,19 @@ static void hif_usb_stop(void *hif_handle)
spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);

/* The pending URBs have to be canceled. */
+ spin_lock_irqsave(&hif_dev->tx.tx_lock, flags);
list_for_each_entry_safe(tx_buf, tx_buf_tmp,
&hif_dev->tx.tx_pending, list) {
+ usb_get_urb(tx_buf->urb);
+ spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);
usb_kill_urb(tx_buf->urb);
+ list_del(&tx_buf->list);
+ usb_free_urb(tx_buf->urb);
+ kfree(tx_buf->buf);
+ kfree(tx_buf);
+ spin_lock_irqsave(&hif_dev->tx.tx_lock, flags);
}
+ spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);

usb_kill_anchored_urbs(&hif_dev->mgmt_submitted);
}
@@ -762,27 +771,37 @@ static void ath9k_hif_usb_dealloc_tx_urbs(struct hif_device_usb *hif_dev)
struct tx_buf *tx_buf = NULL, *tx_buf_tmp = NULL;
unsigned long flags;

+ spin_lock_irqsave(&hif_dev->tx.tx_lock, flags);
list_for_each_entry_safe(tx_buf, tx_buf_tmp,
&hif_dev->tx.tx_buf, list) {
+ usb_get_urb(tx_buf->urb);
+ spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);
usb_kill_urb(tx_buf->urb);
list_del(&tx_buf->list);
usb_free_urb(tx_buf->urb);
kfree(tx_buf->buf);
kfree(tx_buf);
+ spin_lock_irqsave(&hif_dev->tx.tx_lock, flags);
}
+ spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);

spin_lock_irqsave(&hif_dev->tx.tx_lock, flags);
hif_dev->tx.flags |= HIF_USB_TX_FLUSH;
spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);

+ spin_lock_irqsave(&hif_dev->tx.tx_lock, flags);
list_for_each_entry_safe(tx_buf, tx_buf_tmp,
&hif_dev->tx.tx_pending, list) {
+ usb_get_urb(tx_buf->urb);
+ spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);
usb_kill_urb(tx_buf->urb);
list_del(&tx_buf->list);
usb_free_urb(tx_buf->urb);
kfree(tx_buf->buf);
kfree(tx_buf);
+ spin_lock_irqsave(&hif_dev->tx.tx_lock, flags);
}
+ spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);

usb_kill_anchored_urbs(&hif_dev->mgmt_submitted);
}
--
2.28.0


2020-09-20 02:11:38

by Brooke Basile

[permalink] [raw]
Subject: Re: [PATCH] wireless: ath9k: hif_usb: fix race condition between usb_get_urb() and usb_kill_anchored_urbs()

On 9/11/20 3:14 AM, Brooke Basile wrote:
> Calls to usb_kill_anchored_urbs() after usb_kill_urb() on multiprocessor
> systems create a race condition in which usb_kill_anchored_urbs() deallocates
> the URB before the completer callback is called in usb_kill_urb(), resulting
> in a use-after-free.
> To fix this, add proper lock protection to usb_kill_urb() calls that can
> possibly run concurrently with usb_kill_anchored_urbs().
>
> Reported-by: [email protected]
> Link: https://syzkaller.appspot.com/bug?id=cabffad18eb74197f84871802fd2c5117b61febf
> Signed-off-by: Brooke Basile <[email protected]>
> ---
> drivers/net/wireless/ath/ath9k/hif_usb.c | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
> diff --git a/drivers/net/wireless/ath/ath9k/hif_usb.c b/drivers/net/wireless/ath/ath9k/hif_usb.c
> index 3f563e02d17d..2ed98aaed6fb 100644
> --- a/drivers/net/wireless/ath/ath9k/hif_usb.c
> +++ b/drivers/net/wireless/ath/ath9k/hif_usb.c
> @@ -449,10 +449,19 @@ static void hif_usb_stop(void *hif_handle)
> spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);
>
> /* The pending URBs have to be canceled. */
> + spin_lock_irqsave(&hif_dev->tx.tx_lock, flags);
> list_for_each_entry_safe(tx_buf, tx_buf_tmp,
> &hif_dev->tx.tx_pending, list) {
> + usb_get_urb(tx_buf->urb);
> + spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);
> usb_kill_urb(tx_buf->urb);
> + list_del(&tx_buf->list);
> + usb_free_urb(tx_buf->urb);
> + kfree(tx_buf->buf);
> + kfree(tx_buf);
> + spin_lock_irqsave(&hif_dev->tx.tx_lock, flags);
> }
> + spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);
>
> usb_kill_anchored_urbs(&hif_dev->mgmt_submitted);
> }
> @@ -762,27 +771,37 @@ static void ath9k_hif_usb_dealloc_tx_urbs(struct hif_device_usb *hif_dev)
> struct tx_buf *tx_buf = NULL, *tx_buf_tmp = NULL;
> unsigned long flags;
>
> + spin_lock_irqsave(&hif_dev->tx.tx_lock, flags);
> list_for_each_entry_safe(tx_buf, tx_buf_tmp,
> &hif_dev->tx.tx_buf, list) {
> + usb_get_urb(tx_buf->urb);
> + spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);
> usb_kill_urb(tx_buf->urb);
> list_del(&tx_buf->list);
> usb_free_urb(tx_buf->urb);
> kfree(tx_buf->buf);
> kfree(tx_buf);
> + spin_lock_irqsave(&hif_dev->tx.tx_lock, flags);
> }
> + spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);
>
> spin_lock_irqsave(&hif_dev->tx.tx_lock, flags);
> hif_dev->tx.flags |= HIF_USB_TX_FLUSH;
> spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);
>
> + spin_lock_irqsave(&hif_dev->tx.tx_lock, flags);
> list_for_each_entry_safe(tx_buf, tx_buf_tmp,
> &hif_dev->tx.tx_pending, list) {
> + usb_get_urb(tx_buf->urb);
> + spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);
> usb_kill_urb(tx_buf->urb);
> list_del(&tx_buf->list);
> usb_free_urb(tx_buf->urb);
> kfree(tx_buf->buf);
> kfree(tx_buf);
> + spin_lock_irqsave(&hif_dev->tx.tx_lock, flags);
> }
> + spin_unlock_irqrestore(&hif_dev->tx.tx_lock, flags);
>
> usb_kill_anchored_urbs(&hif_dev->mgmt_submitted);
> }
> --
> 2.28.0
>

Hi,

Just wanted to check on the status of this patch, if there's anything
wrong I'm happy to make it right.
Sorry to bother!

Best,
Brooke Basile

2020-09-21 23:06:37

by Brooke Basile

[permalink] [raw]
Subject: Re: [PATCH] wireless: ath9k: hif_usb: fix race condition between usb_get_urb() and usb_kill_anchored_urbs()

On 9/21/20 9:05 AM, Kalle Valo wrote:
> Brooke Basile <[email protected]> wrote:
>
>> Calls to usb_kill_anchored_urbs() after usb_kill_urb() on multiprocessor
>> systems create a race condition in which usb_kill_anchored_urbs() deallocates
>> the URB before the completer callback is called in usb_kill_urb(), resulting
>> in a use-after-free.
>> To fix this, add proper lock protection to usb_kill_urb() calls that can
>> possibly run concurrently with usb_kill_anchored_urbs().
>>
>> Reported-by: [email protected]
>> Link: https://syzkaller.appspot.com/bug?id=cabffad18eb74197f84871802fd2c5117b61febf
>> Signed-off-by: Brooke Basile <[email protected]>
>> Signed-off-by: Kalle Valo <[email protected]>
>
> Patch applied to ath-next branch of ath.git, thanks.
>
> 03fb92a432ea ath9k: hif_usb: fix race condition between usb_get_urb() and usb_kill_anchored_urbs()
>

Thank you! :)

Best,
Brooke Basile

2021-03-30 19:39:40

by Pavel Skripkin

[permalink] [raw]
Subject: Memory leak in ath9k_hif_usb_dealloc_tx_urbs()

Hi!

I did some debugging on this
https://syzkaller.appspot.com/bug?id=3ea507fb3c47426497b52bd82b8ef0dd5b6cc7ee
and, I believe, I recognized the problem. The problem appears in case of
ath9k_htc_hw_init() fail. In case of this fail all tx_buf->urb krefs will be
initialized to 1, but in free function:

static void ath9k_hif_usb_dealloc_tx_urbs(struct hif_device_usb *hif_dev)

....

static void ath9k_hif_usb_dealloc_tx_urbs(struct hif_device_usb *hif_dev)
{
...
list_for_each_entry_safe(tx_buf, tx_buf_tmp,
&hif_dev->tx.tx_buf, list) {
usb_get_urb(tx_buf->urb);
...
usb_free_urb(tx_buf->urb);
...
}

Krefs are incremented and then decremented, that means urbs won't be freed.
I found your patch and I can't properly understand why You added usb_get_urb(tx_buf->urb).
Can You explain please, I believe this will help me or somebody to fix this ussue :)

With regards,
Pavel Skripkin

2021-04-27 11:37:13

by Atul Gopinathan

[permalink] [raw]
Subject: Re: Memory leak in ath9k_hif_usb_dealloc_tx_urbs()

On Wed, Mar 31, 2021 at 08:28:15AM +0200, Greg KH wrote:
> On Tue, Mar 30, 2021 at 10:36:52PM +0300, Pavel Skripkin wrote:
> > Hi!
> >
> > I did some debugging on this
> > https://syzkaller.appspot.com/bug?id=3ea507fb3c47426497b52bd82b8ef0dd5b6cc7ee
> > and, I believe, I recognized the problem. The problem appears in case of
> > ath9k_htc_hw_init() fail. In case of this fail all tx_buf->urb krefs will be
> > initialized to 1, but in free function:
> >
> > static void ath9k_hif_usb_dealloc_tx_urbs(struct hif_device_usb *hif_dev)
> >
> > ....
> >
> > static void ath9k_hif_usb_dealloc_tx_urbs(struct hif_device_usb *hif_dev)
> > {
> > ...
> > list_for_each_entry_safe(tx_buf, tx_buf_tmp,
> > &hif_dev->tx.tx_buf, list) {
> > usb_get_urb(tx_buf->urb);
> > ...
> > usb_free_urb(tx_buf->urb);
> > ...
> > }
> >
> > Krefs are incremented and then decremented, that means urbs won't be freed.
> > I found your patch and I can't properly understand why You added usb_get_urb(tx_buf->urb).
> > Can You explain please, I believe this will help me or somebody to fix this ussue :)
>
> I think almost everyone who has looked into this has given up due to the
> mess of twisty-passages here with almost no real-world benefits for
> unwinding them :)

Just wanted to confirm, what is the status of this bug then, as in is it
invalid (not sure if that's the correct word)? I happened to stumble
across the same syzkaller bug report Pavel posted above, in the morning.
Saw that there has been no patch tests/fixes on this yet according to
syzkaller. Spent a couple of hours going through it before sending a
test patch to syzbot which returned an "OK" (and the patch is exactly
what Pavel pointed out, I simply removed the `usb_get_urb()`). Before
sending anything to the mailing list, I made sure to search all the
relavant networking lists to see if this topic had been brought up (learnt
to do this from my preious mistakes of sending already accepted patches) and
luckily I found this.

Syzbot has had 380 crashes caused by this bug, with the latest being
today. So I wanted to confirm what should be done be about this bug.

Thank you!
Atul

2021-04-27 11:51:43

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: Memory leak in ath9k_hif_usb_dealloc_tx_urbs()

On Tue, Apr 27, 2021 at 05:05:59PM +0530, Atul Gopinathan wrote:
> On Wed, Mar 31, 2021 at 08:28:15AM +0200, Greg KH wrote:
> > On Tue, Mar 30, 2021 at 10:36:52PM +0300, Pavel Skripkin wrote:
> > > Hi!
> > >
> > > I did some debugging on this
> > > https://syzkaller.appspot.com/bug?id=3ea507fb3c47426497b52bd82b8ef0dd5b6cc7ee
> > > and, I believe, I recognized the problem. The problem appears in case of
> > > ath9k_htc_hw_init() fail. In case of this fail all tx_buf->urb krefs will be
> > > initialized to 1, but in free function:
> > >
> > > static void ath9k_hif_usb_dealloc_tx_urbs(struct hif_device_usb *hif_dev)
> > >
> > > ....
> > >
> > > static void ath9k_hif_usb_dealloc_tx_urbs(struct hif_device_usb *hif_dev)
> > > {
> > > ...
> > > list_for_each_entry_safe(tx_buf, tx_buf_tmp,
> > > &hif_dev->tx.tx_buf, list) {
> > > usb_get_urb(tx_buf->urb);
> > > ...
> > > usb_free_urb(tx_buf->urb);
> > > ...
> > > }
> > >
> > > Krefs are incremented and then decremented, that means urbs won't be freed.
> > > I found your patch and I can't properly understand why You added usb_get_urb(tx_buf->urb).
> > > Can You explain please, I believe this will help me or somebody to fix this ussue :)
> >
> > I think almost everyone who has looked into this has given up due to the
> > mess of twisty-passages here with almost no real-world benefits for
> > unwinding them :)
>
> Just wanted to confirm, what is the status of this bug then, as in is it
> invalid (not sure if that's the correct word)? I happened to stumble
> across the same syzkaller bug report Pavel posted above, in the morning.
> Saw that there has been no patch tests/fixes on this yet according to
> syzkaller. Spent a couple of hours going through it before sending a
> test patch to syzbot which returned an "OK" (and the patch is exactly
> what Pavel pointed out, I simply removed the `usb_get_urb()`). Before
> sending anything to the mailing list, I made sure to search all the
> relavant networking lists to see if this topic had been brought up (learnt
> to do this from my preious mistakes of sending already accepted patches) and
> luckily I found this.
>
> Syzbot has had 380 crashes caused by this bug, with the latest being
> today. So I wanted to confirm what should be done be about this bug.

If you think you can fix it, wonderful, go ahead please!

greg k-h

2021-04-27 12:30:17

by Pavel Skripkin

[permalink] [raw]
Subject: Re: Memory leak in ath9k_hif_usb_dealloc_tx_urbs()

On Tue, 27 Apr 2021 15:04:29 +0300
Pavel Skripkin <[email protected]> wrote:

> Hi!
>
> On Tue, 2021-04-27 at 17:05 +0530, Atul Gopinathan wrote:
> > On Wed, Mar 31, 2021 at 08:28:15AM +0200, Greg KH wrote:
> > > On Tue, Mar 30, 2021 at 10:36:52PM +0300, Pavel Skripkin wrote:
> > > > Hi!
> > > >
> > > > I did some debugging on this
> > > > https://syzkaller.appspot.com/bug?id=3ea507fb3c47426497b52bd82b8ef0dd5b6cc7ee
> > > > and, I believe, I recognized the problem. The problem appears in
> > > > case of
> > > > ath9k_htc_hw_init() fail. In case of this fail all tx_buf->urb
> > > > krefs will be
> > > > initialized to 1, but in free function:
> > > >
> > > > static void ath9k_hif_usb_dealloc_tx_urbs(struct hif_device_usb
> > > > *hif_dev)
> > > >
> > > > ....
> > > >
> > > > static void ath9k_hif_usb_dealloc_tx_urbs(struct hif_device_usb
> > > > *hif_dev)
> > > > {
> > > >     ...
> > > >         list_for_each_entry_safe(tx_buf, tx_buf_tmp,
> > > >                                  &hif_dev->tx.tx_buf, list) {
> > > >                 usb_get_urb(tx_buf->urb);
> > > >                 ...
> > > >                 usb_free_urb(tx_buf->urb);
> > > >                 ...
> > > >                 }
> > > >
> > > > Krefs are incremented and then decremented, that means urbs
> > > > won't be freed.
> > > > I found your patch and I can't properly understand why You added
> > > > usb_get_urb(tx_buf->urb).
> > > > Can You explain please, I believe this will help me or somebody
> > > > to fix this ussue :)
> > >
> > > I think almost everyone who has looked into this has given up due
> > > to the
> > > mess of twisty-passages here with almost no real-world benefits
> > > for unwinding them :)
> >
> > Just wanted to confirm, what is the status of this bug then, as in
> > is it
> > invalid (not sure if that's the correct word)? I happened to stumble
> > across the same syzkaller bug report Pavel posted above, in the
> > morning.
> > Saw that there has been no patch tests/fixes on this yet according
> > to syzkaller. Spent a couple of hours going through it before
> > sending a test patch to syzbot which returned an "OK" (and the
> > patch is exactly what Pavel pointed out, I simply removed the
> > `usb_get_urb()`). Before sending anything to the mailing list, I
> > made sure to search all the relavant networking lists to see if
> > this topic had been brought up (learnt
> > to do this from my preious mistakes of sending already accepted
> > patches) and
> > luckily I found this.
> >
> > Syzbot has had 380 crashes caused by this bug, with the latest being
> > today. So I wanted to confirm what should be done be about this
> > bug.
> >
>
> I saw on dashboard, that Dmitry tested latest upstream commit and
> syzbot returned "OK", but usb_get_urb(tx_buf->urb); is still there.
>

I am sorry, I clicked wrong link on dashboard :( My bad.

I believe, You can test your patch on this
https://syzkaller.appspot.com/bug?id=cabffad18eb74197f84871802fd2c5117b61febf.

usb_get_urb(tx_buf->urb) was introduced in patch related to this bug

> I think, this usb_get_urb prevents race condition, but I'm not sure
> about it, that's why I sent an email to patch author. As You can see,
> he has not responded yet :)
>
> > Thank you!
> > Atul
>
> With regards,
> Pavel Skripkin
>
>



With regards,
Pavel Skripkin

2021-04-27 13:02:27

by Atul Gopinathan

[permalink] [raw]
Subject: Re: Memory leak in ath9k_hif_usb_dealloc_tx_urbs()

On Tue, Apr 27, 2021 at 03:29:28PM +0300, Pavel Skripkin wrote:
> On Tue, 27 Apr 2021 15:04:29 +0300
> Pavel Skripkin <[email protected]> wrote:
>
> > Hi!
> >
> > On Tue, 2021-04-27 at 17:05 +0530, Atul Gopinathan wrote:
> > > On Wed, Mar 31, 2021 at 08:28:15AM +0200, Greg KH wrote:
> > > > On Tue, Mar 30, 2021 at 10:36:52PM +0300, Pavel Skripkin wrote:
> > > > > Hi!
> > > > >
> > > > > I did some debugging on this
> > > > > https://syzkaller.appspot.com/bug?id=3ea507fb3c47426497b52bd82b8ef0dd5b6cc7ee
> > > > > and, I believe, I recognized the problem. The problem appears in
> > > > > case of
> > > > > ath9k_htc_hw_init() fail. In case of this fail all tx_buf->urb
> > > > > krefs will be
> > > > > initialized to 1, but in free function:
> > > > >
> > > > > static void ath9k_hif_usb_dealloc_tx_urbs(struct hif_device_usb
> > > > > *hif_dev)
> > > > >
> > > > > ....
> > > > >
> > > > > static void ath9k_hif_usb_dealloc_tx_urbs(struct hif_device_usb
> > > > > *hif_dev)
> > > > > {
> > > > > ??? ...
> > > > > ????????list_for_each_entry_safe(tx_buf, tx_buf_tmp,
> > > > > ???????????????????????????????? &hif_dev->tx.tx_buf, list) {
> > > > > ????????????????usb_get_urb(tx_buf->urb);
> > > > > ????????????????...
> > > > > ????????????????usb_free_urb(tx_buf->urb);
> > > > > ????????????????...
> > > > > ????????????????}
> > > > >
> > > > > Krefs are incremented and then decremented, that means urbs
> > > > > won't be freed.
> > > > > I found your patch and I can't properly understand why You added
> > > > > usb_get_urb(tx_buf->urb).
> > > > > Can You explain please, I believe this will help me or somebody
> > > > > to fix this ussue :)
> > > >
> > > > I think almost everyone who has looked into this has given up due
> > > > to the
> > > > mess of twisty-passages here with almost no real-world benefits
> > > > for unwinding them :)
> > >
> > > Just wanted to confirm, what is the status of this bug then, as in
> > > is it
> > > invalid (not sure if that's the correct word)? I happened to stumble
> > > across the same syzkaller bug report Pavel posted above, in the
> > > morning.
> > > Saw that there has been no patch tests/fixes on this yet according
> > > to syzkaller. Spent a couple of hours going through it before
> > > sending a test patch to syzbot which returned an "OK" (and the
> > > patch is exactly what Pavel pointed out, I simply removed the
> > > `usb_get_urb()`). Before sending anything to the mailing list, I
> > > made sure to search all the relavant networking lists to see if
> > > this topic had been brought up (learnt
> > > to do this from my preious mistakes of sending already accepted
> > > patches) and
> > > luckily I found this.
> > >
> > > Syzbot has had 380 crashes caused by this bug, with the latest being
> > > today. So I wanted to confirm what should be done be about this
> > > bug.
> > >
> >
> > I saw on dashboard, that Dmitry tested latest upstream commit and
> > syzbot returned "OK", but usb_get_urb(tx_buf->urb); is still there.
> >
>
> I am sorry, I clicked wrong link on dashboard :( My bad.

Oh right, I forgot to mention. Just want to make it clear that the test
patch was mine. There was a bug in syzkaller, so when I sent the patches
for testing they returned a weird error. Dmitry later pointed out that
it was a syzkaller bug and was kind enough to re-send my patch on a
fixed commit of syzkaller.

https://groups.google.com/g/syzkaller-bugs/c/cBQP4fKjhFQ

>
> I believe, You can test your patch on this
> https://syzkaller.appspot.com/bug?id=cabffad18eb74197f84871802fd2c5117b61febf.
>
> usb_get_urb(tx_buf->urb) was introduced in patch related to this bug
>
> > I think, this usb_get_urb prevents race condition, but I'm not sure
> > about it, that's why I sent an email to patch author. As You can see,
> > he has not responded yet :)

Ah that's how it is. Well not sure we could do much here. Also thanks
for clarifying things, I thought that no one had been looking into this
bug especially when it had so many crash counts which suprised me, but I
guess I was wrong.

Thank you!
Atul