I am seeing two skb leaks in the BT sub-system for kernel 4.8-rc2. I only
recently re-enabled kmemleak, but I do not think I saw these leaks in 4.7.
The first leak is at btusb_recv_intr+0x12b/0x170 [btusb]. This address refers to
the call to bt_skb_alloc() in routine btusb_recv_intr().
The second leak is at hci_event_packet+0xb8/0x30b0 [bluetooth]. The backtrace
for this address is
0x13d38 is in hci_event_packet (net/bluetooth/hci_event.c:5254).
5249             * various handlers may modify the original one through
5250             * skb_pull() calls, etc.
5251             */
5252            if (req_complete_skb || event == HCI_EV_CMD_STATUS ||
5253                event == HCI_EV_CMD_COMPLETE)
5254                    orig_skb = skb_clone(skb, GFP_KERNEL);
5255
5256            skb_pull(skb, HCI_EVENT_HDR_SIZE);
5257
5258            switch (event) {
I am unable to unload module bluetooth to verify that the second leak is not a
false positive; however, the one in btusb is a real memory leak.
As always, I will be happy to test any patches.
Larry
On 08/25/2016 01:22 AM, Frederic Dalleau wrote:
> Hi Larry,
>
> On 24/08/2016 22:02, Larry Finger wrote:
>> On 08/21/2016 07:09 AM, Frederic Dalleau wrote:
>>>>>> I am unable to unload module bluetooth to verify that the second
>>>>>> leak is not a false positive; however, the one in btusb is a real
>>>>>> memory leak.
>
>>> I have a patch on the grill.
>
>> Any progress on this patch?
>
> Yes, it is in bluetooth-stable.
> http://marc.info/?l=linux-bluetooth&m=147205068529234&w=2
Fred,
Thanks for the link. That patch fixes the leaks that I was seeing from
bluetooth. It also seems to have fixed the leaks that were attributed to btusb.
At least, none have shown up in my preliminary testing.
Larry
Hi Larry,
On 24/08/2016 22:02, Larry Finger wrote:
> On 08/21/2016 07:09 AM, Frederic Dalleau wrote:
>>>>> I am unable to unload module bluetooth to verify that the second
>>>>> leak is not a false positive; however, the one in btusb is a real
>>>>> memory leak.
>> I have a patch on the grill.
> Any progress on this patch?
Yes, it is in bluetooth-stable.
http://marc.info/?l=linux-bluetooth&m=147205068529234&w=2
Regards
Fred
On 08/21/2016 07:09 AM, Frederic Dalleau wrote:
> Hi Marcel, Johan,
>
>>>> I am unable to unload module bluetooth to verify that the second
>>>> leak is not a false positive; however, the one in btusb is a real
>>>> memory leak.
>
> There was a bugzilla last week with that backtrace:
> https://bugzilla.kernel.org/show_bug.cgi?id=120691
>
> At the time, I thought the leak could originate from one of the
> req_complete_skb callbacks, but which one?
>
> Today, now that the issue has popped up again, I found that
> hci_req_sync_complete stores a reference to the skb in hdev->req_skb. It is
> called (via hci_req_run_skb) from either __hci_cmd_sync_ev, which passes the
> skb to its caller, or __hci_req_sync, which leaks it.
>
> I have a patch on the grill.
Frédéric,
Any progress on this patch?
Thanks,
Larry
Hi Marcel, Johan,
>>> I am unable to unload module bluetooth to verify that the second
>>> leak is not a false positive; however, the one in btusb is a real
>>> memory leak.
There was a bugzilla last week with that backtrace:
https://bugzilla.kernel.org/show_bug.cgi?id=120691
At the time, I thought the leak could originate from one of the
req_complete_skb callbacks, but which one?
Today, now that the issue has popped up again, I found that
hci_req_sync_complete stores a reference to the skb in hdev->req_skb. It is
called (via hci_req_run_skb) from either __hci_cmd_sync_ev, which passes the
skb to its caller, or __hci_req_sync, which leaks it.
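The reference pattern described above can be sketched as a small userspace model. This is only an illustration under stated assumptions: struct buf, struct dev, and every function below are hypothetical stand-ins, not the kernel's sk_buff or hci_dev API, and sync_fixed() shows just one possible way to balance the reference in this model, not necessarily what the actual patch does.

```c
/* Userspace model of the reference-count pattern; all names are hypothetical. */
#include <assert.h>
#include <stdlib.h>

struct buf {
	int refs;               /* outstanding references to this buffer */
};

struct dev {
	struct buf *req_buf;    /* models hdev->req_skb */
};

static int live_bufs;           /* buffers allocated and not yet freed */

static struct buf *buf_alloc(void)
{
	struct buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->refs = 1;
	live_bufs++;
	return b;
}

static struct buf *buf_get(struct buf *b)
{
	b->refs++;
	return b;
}

static void buf_put(struct buf *b)
{
	if (!b)
		return;
	assert(b->refs > 0);
	if (--b->refs == 0) {
		free(b);
		live_bufs--;
	}
}

/* Models the completion callback: stashes an extra reference in the device. */
static void complete(struct dev *d, struct buf *b)
{
	d->req_buf = buf_get(b);
}

/* Models a caller that hands the stashed buffer on to its own caller. */
static struct buf *sync_ev(struct dev *d)
{
	struct buf *b = d->req_buf;

	d->req_buf = NULL;
	return b;               /* ownership moves to the caller */
}

/* Models a caller that forgets the stashed buffer: the reference is lost. */
static void sync_leaky(struct dev *d)
{
	d->req_buf = NULL;
}

/* Assumed fix for this model only: drop the stashed reference explicitly. */
static void sync_fixed(struct dev *d)
{
	buf_put(d->req_buf);
	d->req_buf = NULL;
}

/* Runs one event through the chosen path; returns the number of buffers
 * leaked by this run (0 = handed on and released, 1 = leaky, 2 = fixed). */
static int run(int path)
{
	struct dev d = { NULL };
	int before = live_bufs;
	struct buf *ev = buf_alloc();

	if (!ev)
		return -1;
	complete(&d, ev);
	buf_put(ev);            /* the event handler drops its own reference */

	if (path == 0)
		buf_put(sync_ev(&d));
	else if (path == 1)
		sync_leaky(&d);
	else
		sync_fixed(&d);

	return live_bufs - before;
}
```

In this model, run(0) and run(2) leave no buffer behind, while run(1) leaves exactly one. That matches the observation that the event handler itself is balanced and the extra reference is lost by whichever caller does not consume the stashed buffer.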
I have a patch on the grill.
Best Regards,
Frédéric
On 08/20/2016 01:01 AM, Marcel Holtmann wrote:
> Hi Larry,
>
> I can not see a leak. Maybe Johan has an idea.
Marcel and Johan,
The hardware in question is an Intel device with USB ID 8087:07dc, which is part
of an Intel Wireless 7260.
The kmemleak backtraces for the two kinds of leaks are:
unreferenced object 0xffff8801e182e000 (size 1024):
  comm "hardirq", pid 0, jiffies 4312467853 (age 61.716s)
  hex dump (first 32 bytes):
    00 84 82 e1 01 88 ff ff 0e 04 01 10 20 00 20 00 ............ . .
    05 06 02 00 00 05 00 00 40 00 40 00 00 00 00 00 ........@.@.....
  backtrace:
    [<ffffffff8169ef8a>] kmemleak_alloc+0x4a/0xa0
    [<ffffffff811e9b18>] __kmalloc_node_track_caller+0x178/0x270
    [<ffffffff815acb41>] __kmalloc_reserve.isra.35+0x31/0x90
    [<ffffffff815aec3e>] __alloc_skb+0x7e/0x280
    [<ffffffffa04c54bb>] btusb_recv_intr+0x12b/0x170 [btusb]
    [<ffffffffa04c55c5>] btusb_intr_complete+0xc5/0x130 [btusb]
    [<ffffffffa00a6f45>] __usb_hcd_giveback_urb+0x85/0x110 [usbcore]
    [<ffffffffa00a70df>] usb_hcd_giveback_urb+0x3f/0x130 [usbcore]
    [<ffffffffa0163cfa>] handle_tx_event+0x4ca/0x13c0 [xhci_hcd]
    [<ffffffffa0164e62>] xhci_irq+0x272/0xa30 [xhci_hcd]
    [<ffffffffa0165631>] xhci_msi_irq+0x11/0x20 [xhci_hcd]
    [<ffffffff810cb42f>] __handle_irq_event_percpu+0x3f/0x1d0
    [<ffffffff810cb5e3>] handle_irq_event_percpu+0x23/0x60
    [<ffffffff810cb65c>] handle_irq_event+0x3c/0x60
    [<ffffffff810ced0b>] handle_edge_irq+0x9b/0x160
    [<ffffffff8101e340>] handle_irq+0x20/0x30
unreferenced object 0xffff88010ccbd200 (size 256):
  comm "kworker/u17:2", pid 684, jiffies 4312467853 (age 61.716s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
    1e c6 5f 75 10 90 6c 14 00 00 00 00 00 00 00 00 .._u..l.........
  backtrace:
    [<ffffffff8169ef8a>] kmemleak_alloc+0x4a/0xa0
    [<ffffffff811e8364>] kmem_cache_alloc+0xc4/0x1f0
    [<ffffffff815b04ac>] skb_clone+0x4c/0xa0
    [<ffffffffa0550d08>] hci_event_packet+0xb8/0x30b0 [bluetooth]
    [<ffffffffa054126d>] hci_rx_work+0x18d/0x380 [bluetooth]
    [<ffffffff81089f3b>] process_one_work+0x14b/0x430
    [<ffffffff8108a34b>] worker_thread+0x12b/0x490
    [<ffffffff8108fd49>] kthread+0xc9/0xe0
    [<ffffffff816a8e5f>] ret_from_fork+0x1f/0x40
    [<ffffffffffffffff>] 0xffffffffffffffff
I will attempt a bisection.
Larry
Hi Marcel,
On Sat, Aug 20, 2016, Marcel Holtmann wrote:
> > I am seeing two skb leaks in the BT sub-system for kernel 4.8-rc2. I
> > only recently re-enabled kmemleak, but I do not think I saw these
> > leaks in 4.7.
> >
> > The first leak is at btusb_recv_intr+0x12b/0x170 [btusb]. This
> > address refers to the call to bt_skb_alloc() in routine
> > btusb_recv_intr().
>
> do you have a backtrace for this one? Also which hardware is this?
>
> > The second leak is at hci_event_packet+0xb8/0x30b0 [bluetooth]. The
> > backtrace for this address is
> >
> > 0x13d38 is in hci_event_packet (net/bluetooth/hci_event.c:5254).
> > 5249 * various handlers may modify the original one through
> > 5250 * skb_pull() calls, etc.
> > 5251 */
> > 5252 if (req_complete_skb || event == HCI_EV_CMD_STATUS ||
> > 5253 event == HCI_EV_CMD_COMPLETE)
> > 5254 orig_skb = skb_clone(skb, GFP_KERNEL);
> > 5255
> > 5256 skb_pull(skb, HCI_EVENT_HDR_SIZE);
> > 5257
> > 5258 switch (event) {
> >
> > I am unable to unload module bluetooth to verify that the second
> > leak is not a false positive; however, the one in btusb is a real
> > memory leak.
>
> I can not see a leak. Maybe Johan has an idea.
Unfortunately I don't have any ideas either - there is no exit path from
hci_event_packet() after orig_skb has been allocated that would not
result in kfree_skb(orig_skb) being called first.
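For readers following along, the clone-then-pull shape of the quoted code can be modelled in userspace roughly as follows. This is a sketch under assumptions, not the kernel code: struct view and the view_* helpers are hypothetical stand-ins for sk_buff handling, and EVENT_HDR_SIZE only models HCI_EVENT_HDR_SIZE. It shows why a clone is taken before the header is pulled, and that both the clone and the original are released before the function returns.

```c
/* Userspace model of "clone, pull, handle, free both"; names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define EVENT_HDR_SIZE 2        /* models HCI_EVENT_HDR_SIZE */

struct view {
	const unsigned char *data;  /* current start of the packet data */
	size_t len;                 /* bytes remaining from data onward */
};

static int live_views;          /* views allocated and not yet freed */

static struct view *view_new(const unsigned char *data, size_t len)
{
	struct view *v = malloc(sizeof(*v));

	if (!v)
		return NULL;
	v->data = data;
	v->len = len;
	live_views++;
	return v;
}

/* Models a clone: a second view onto the same bytes, with its own offsets. */
static struct view *view_clone(const struct view *v)
{
	return view_new(v->data, v->len);
}

static void view_free(struct view *v)
{
	if (!v)
		return;
	free(v);
	live_views--;
}

/* Models a pull: advance past n leading bytes of this view only. */
static void view_pull(struct view *v, size_t n)
{
	assert(n <= v->len);
	v->data += n;
	v->len -= n;
}

/* Returns the first byte as seen through the untouched clone (the event
 * code), or -1 if the packet is too short or an allocation fails. */
static int handle_event(const unsigned char *pkt, size_t len)
{
	struct view *v, *orig;
	int first = -1;

	if (len < EVENT_HDR_SIZE)
		return -1;

	v = view_new(pkt, len);
	if (!v)
		return -1;

	orig = view_clone(v);           /* keep an unmodified view */
	view_pull(v, EVENT_HDR_SIZE);   /* handlers would see only the payload */

	if (orig)
		first = orig->data[0];      /* completion path reads the header */

	/* Single exit: both views are released before returning. */
	view_free(orig);
	view_free(v);
	return first;
}
```

A function shaped like this is balanced on its own, so live_views returns to zero after each call. Any leak would have to come from code that takes an additional reference to the clone and does not release it.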
Johan
Hi Larry,
> I am seeing two skb leaks in the BT sub-system for kernel 4.8-rc2. I only recently re-enabled kmemleak, but I do not think I saw these leaks in 4.7.
>
> The first leak is at btusb_recv_intr+0x12b/0x170 [btusb]. This address refers to the call to bt_skb_alloc() in routine btusb_recv_intr().
do you have a backtrace for this one? Also which hardware is this?
> The second leak is at hci_event_packet+0xb8/0x30b0 [bluetooth]. The backtrace for this address is
>
> 0x13d38 is in hci_event_packet (net/bluetooth/hci_event.c:5254).
> 5249 * various handlers may modify the original one through
> 5250 * skb_pull() calls, etc.
> 5251 */
> 5252 if (req_complete_skb || event == HCI_EV_CMD_STATUS ||
> 5253 event == HCI_EV_CMD_COMPLETE)
> 5254 orig_skb = skb_clone(skb, GFP_KERNEL);
> 5255
> 5256 skb_pull(skb, HCI_EVENT_HDR_SIZE);
> 5257
> 5258 switch (event) {
>
> I am unable to unload module bluetooth to verify that the second leak is not a false positive; however, the one in btusb is a real memory leak.
I can not see a leak. Maybe Johan has an idea.
Regards
Marcel