2015-05-16 12:17:29

by Haggai Eran

[permalink] [raw]
Subject: kernel page fault in r8712u

Hi,

I've encountered the oops below running the r8712u driver. It occurred
on Raspberry Pi (OpenELEC 5.95 beta, running kernel version 4.0.3),
with the following device:
> 0bda:8172 Realtek Semiconductor Corp. RTL8191SU 802.11n WLAN Adapter

I'd be happy to dig in and see what the problem is, but I wanted to
make sure this is the right driver to look at. linuxwireless.org says
that this driver is going to be replaced by rtl8192su [1].

Have you seen this issue?

Thanks,
Haggai

[1] http://linuxwireless.org/en/users/Drivers/rtl819x/

[ 834.537661] Unable to handle kernel paging request at virtual
address a9d797d7
[ 834.544900] pgd = 96b14000
[ 834.547604] [a9d797d7] *pgd=00000000
[ 834.551186] Internal error: Oops: 5 [#1] ARM
[ 834.555449] Modules linked in: cfg80211 bluetooth r8712u(C)
bcm2708_rng [last unloaded: btusb]
[ 834.564092] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G C
4.0.1 #1
[ 834.571303] Hardware name: BCM2708
[ 834.574702] task: 9703a700 ti: 97056000 task.ti: 97056000
[ 834.580125] PC is at put_compound_page+0x64/0x1d0
[ 834.584833] LR is at put_page+0x48/0x4c
[ 834.588670] pc : [<80076354>] lr : [<80076508>] psr: 00000113
[ 834.588670] sp : 97057d70 ip : 97057d88 fp : 97057d84
[ 834.600129] r10: 97335c00 r9 : 0000003c r8 : 00000c21
[ 834.605345] r7 : 9723cf00 r6 : 972a6900 r5 : 972a6900 r4 : 5d270b9a
[ 834.611862] r3 : d7a0d795 r2 : d7a0d795 r1 : 00000001 r0 : a9d797d7
[ 834.618390] Flags: nzcv IRQs on FIQs on Mode SVC_32 ISA ARM
Segment kernel
[ 834.625690] Control: 00c5387d Table: 16b14008 DAC: 00000015
[ 834.631429] Process ksoftirqd/0 (pid: 3, stack limit = 0x97056188)
[ 834.637601] Stack: (0x97057d70 to 0x97058000)
[ 834.641957] 7d60: 00000000
972a6900 97057d94 97057d88
[ 834.650132] 7d80: 80076508 800762fc 97057db4 97057d98 803d86c0
800764cc 9723cf00 89868bd0
[ 834.658308] 7da0: 89868420 89868bd0 97057dcc 97057db8 803d8748
803d8660 00000001 9723cf00
[ 834.666483] 7dc0: 97057de4 97057dd0 803d8844 803d872c 40000113
898c2d18 97057df4 97057de8
[ 834.674657] 7de0: 803e4b94 803d8828 97057e14 97057df8 7f01a714
803e4b5c 00000009 89868420
[ 834.682831] 7e00: 898c2d18 00000000 97057e4c 97057e18 7f01b094
7f01a6f4 8986a420 00000760
[ 834.691003] 7e20: 972a6260 89868420 898c2d18 8986a420 00000c80
00000c21 0000003c 97335c00
[ 834.699179] 7e40: 97057e6c 97057e50 7f01c6fc 7f01b014 000000ec
898c2d18 972a6906 00000018
[ 834.707353] 7e60: 97057ebc 97057e70 7f01a354 7f01c6d4 9703a730
89868c34 97322480 8986a420
[ 834.715529] 7e80: 89868bd0 9723cf00 fffff580 00000002 9703a730
00000000 00000000 8082cbec
[ 834.723702] 7ea0: 80864760 00000000 80864780 80864780 97057edc
97057ec0 8001e12c 7f01a1b8
[ 834.731874] 7ec0: 00000001 40000000 00000100 97056000 97057f2c
97057ee0 8001e4d8 8001e0c0
[ 834.740048] 7ee0: 80033bc0 80033a88 973db180 04208040 00027227
0000000a 80864780 97056020
[ 834.748223] 7f00: 97057f28 97002560 97056000 00000000 00000001
8082cbb8 00000002 00000000
[ 834.756397] 7f20: 97057f3c 97057f30 8001e630 8001e3fc 97057f64
97057f40 80036430 8001e610
[ 834.764568] 7f40: 00000000 97002580 97002560 80036320 00000000
00000000 97057fac 97057f68
[ 834.772741] 7f60: 80032ff0 8003632c c4c0d0c4 00000001 00000000
97002560 00000000 97057f7c
[ 834.780913] 7f80: 97057f7c 00000000 97057f88 97057f88 97002580
80032f1c 00000000 00000000
[ 834.789088] 7fa0: 00000000 97057fb0 8000e740 80032f28 00000000
00000000 00000000 00000000
[ 834.797261] 7fc0: 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000
[ 834.805432] 7fe0: 00000000 00000000 00000000 00000000 00000013
00000000 c4c0ceca d0c0c040
[ 834.813594] Backtrace:
[ 834.816072] [<800762f0>] (put_compound_page) from [<80076508>]
(put_page+0x48/0x4c)
[ 834.823722] r5:972a6900 r4:00000000
[ 834.827331] [<800764c0>] (put_page) from [<803d86c0>]
(skb_release_data+0x6c/0xcc)
[ 834.834904] [<803d8654>] (skb_release_data) from [<803d8748>]
(skb_release_all+0x28/0x2c)
[ 834.843071] r7:89868bd0 r6:89868420 r5:89868bd0 r4:9723cf00
[ 834.848759] [<803d8720>] (skb_release_all) from [<803d8844>]
(consume_skb+0x28/0x5c)
[ 834.856490] r4:9723cf00 r3:00000001
[ 834.860088] [<803d881c>] (consume_skb) from [<803e4b94>]
(__dev_kfree_skb_any+0x44/0x48)
[ 834.868167] r4:898c2d18 r3:40000113
[ 834.871974] [<803e4b50>] (__dev_kfree_skb_any) from [<7f01a714>]
(r8712_free_recvframe+0x2c/0x9c [r8712u])
[ 834.881891] [<7f01a6e8>] (r8712_free_recvframe [r8712u]) from
[<7f01b094>] (recv_func+0x8c/0x6e8 [r8712u])
[ 834.891533] r6:00000000 r5:898c2d18 r4:89868420 r3:00000009
[ 834.897456] [<7f01b008>] (recv_func [r8712u]) from [<7f01c6fc>]
(r8712_recv_entry+0x34/0x78 [r8712u])
[ 834.906667] r10:97335c00 r9:0000003c r8:00000c21 r7:00000c80
r6:8986a420 r5:898c2d18
[ 834.914518] r4:89868420
[ 834.917398] [<7f01c6c8>] (r8712_recv_entry [r8712u]) from
[<7f01a354>] (recv_tasklet+0x1a8/0x31c [r8712u])
[ 834.927047] r6:00000018 r5:972a6906 r4:898c2d18 r3:000000ec
[ 834.932903] [<7f01a1ac>] (recv_tasklet [r8712u]) from [<8001e12c>]
(tasklet_hi_action+0x78/0xcc)
[ 834.941680] r10:80864780 r9:80864780 r8:00000000 r7:80864760
r6:8082cbec r5:00000000
[ 834.949529] r4:00000000
[ 834.952086] [<8001e0b4>] (tasklet_hi_action) from [<8001e4d8>]
(__do_softirq+0xe8/0x214)
[ 834.960166] r7:97056000 r6:00000100 r5:40000000 r4:00000001
[ 834.965849] [<8001e3f0>] (__do_softirq) from [<8001e630>]
(run_ksoftirqd+0x2c/0x3c)
[ 834.973492] r10:00000000 r9:00000002 r8:8082cbb8 r7:00000001
r6:00000000 r5:97056000
[ 834.981339] r4:97002560
[ 834.983901] [<8001e604>] (run_ksoftirqd) from [<80036430>]
(smpboot_thread_fn+0x110/0x15c)
[ 834.992173] [<80036320>] (smpboot_thread_fn) from [<80032ff0>]
(kthread+0xd4/0xf0)
[ 834.999731] r9:00000000 r8:00000000 r7:80036320 r6:97002560
r5:97002580 r4:00000000
[ 835.007519] [<80032f1c>] (kthread) from [<8000e740>]
(ret_from_fork+0x14/0x34)
[ 835.014738] r7:00000000 r6:00000000 r5:80032f1c r4:97002580
[ 835.020420] Code: e590001c e5943000 e3130902 0a00001d (e5903000)
[ 835.026527] ---[ end trace 633783c74480b173 ]---
[ 835.031145] Kernel panic - not syncing: Fatal exception in interrupt
[ 835.037494] ---[ end Kernel panic - not syncing: Fatal exception in interrupt


2015-05-17 19:22:46

by Haggai Eran

[permalink] [raw]
Subject: Re: kernel page fault in r8712u

I added some debugging prints, trying to see more details about the
packet that fails the r8712_validate_recv_frame. I noticed I'm getting
many packets where recv_decache returns _FAIL. However, the last two
packets before the crash fail for different reasons. The first has the
ver field set to 3 (instead of zero). The second (the one that get's
freed and cause the crash apparently) has an unknown type (12). If I'm
not mistaken, 12 = WIFI_CTRL_TYPE | WIFI_DATA_TYPE. Is that possible?

It could be that the packet headers are garbled though.

Haggai

On 17 May 2015 at 20:20, Haggai Eran <[email protected]> wrote:
> On 17 May 2015 at 13:29, Arend van Spriel <[email protected]> wrote:
>> On 17-05-15 06:25, Haggai Eran wrote:
>>>
>>> On 16 May 2015 at 20:54, Larry Finger <[email protected]> wrote:
>>>>
>>>> Another location needed from gdb is "l *recv_func+0x8c".
>>>
>>>
>>> Here it is:
>>> (gdb) l *recv_func+0x8c
>>> 0x17094 is in recv_func (drivers/staging/rtl8712/rtl8712_recv.c:1004).
>>> 999 r8712_free_recvframe(orig_prframe,
>>> pfree_recv_queue);
>>> 1000 goto _exit_recv_func;
>>> 1001 }
>>> 1002 _exit_recv_func:
>>> 1003 return retval;
>>> 1004 }
>>> 1005
>>> 1006 static int recvbuf2recvframe(struct _adapter *padapter, struct
>>> sk_buff *pskb)
>>> 1007 {
>>> 1008 u8 *pbuf, shift_sz = 0;
>>>
>>> I don't think this means the relevant call is the one at line 999. I
>>> think it is an earlier call, after r8712_validate_recv_frame. Here's
>>> the disassembly:
>>
>>
>> can you provide the address of recv_func as well to determine the exact
>> location in assembly.
>
> Yes, it is in offset 0x17008 in the module:
>> 00017008 <recv_func>:
>
> Regards,
> Haggai
>
>>
>>> /* check the frame crtl field and decache */
>>> retval = r8712_validate_recv_frame(padapter, prframe);
>>> 17070: e1a00004 mov r0, r4
>>> 17074: e1a01005 mov r1, r5
>>> 17078: ebfffffe bl 17bc0 <r8712_validate_recv_frame>
>>> if (retval != _SUCCESS) {
>>> 1707c: e3500001 cmp r0, #1
>>> r8712_free_recvframe(orig_prframe,
>>> pfree_recv_queue);
>>> goto _exit_recv_func;
>>> }
>>> }
>>> /* check the frame crtl field and decache */
>>> retval = r8712_validate_recv_frame(padapter, prframe);
>>> 17080: e1a06000 mov r6, r0
>>> if (retval != _SUCCESS) {
>>> 17084: 0a000005 beq 170a0 <recv_func+0x98>
>>> /* free this recv_frame */
>>> r8712_free_recvframe(orig_prframe, pfree_recv_queue);
>>> 17088: e1a00005 mov r0, r5
>>> 1708c: e1a01007 mov r1, r7
>>> 17090: ebfffffe bl 166e8 <r8712_free_recvframe>
>>> r8712_free_recvframe(orig_prframe, pfree_recv_queue);
>>> goto _exit_recv_func;
>>> }
>>> _exit_recv_func:
>>> return retval;
>>> }
>>> 17094: e1a00006 mov r0, r6

2015-05-19 05:16:54

by Haggai Eran

[permalink] [raw]
Subject: Re: kernel page fault in r8712u

On 19 May 2015 at 07:52, Larry Finger <[email protected]> wrote:
> I checked the driver with Smatch and found a couple of array problems. These
> likely won't be the problem, but try the attached patches anyway.

I tried the patches. The first one prevents the driver from working. I
think the smatch warning may be a false positive, because HWXMIT_ENTRY
is checked specifically for values 4 and 5 in an if statement.

The second patch doesn't hurt, but it didn't solve the issue.

2015-05-16 17:41:14

by Larry Finger

[permalink] [raw]
Subject: Re: kernel page fault in r8712u

On 05/16/2015 12:16 PM, Haggai Eran wrote:
> On 16 May 2015 at 17:57, Larry Finger <[email protected]> wrote:
>> The problem appears to be from r8712u. From the stack dump, the problem
>> happens when r8712_free_recvframe() calls __dev_kfree_skb_any(). A
>> complication is that my copy of the kernel source does not show such a call.
>> :(
>>
>> Please use gdb to help with the debugging. From the main directory of your
>> source, enter the command 'gdb drivers/staging/rtl8712/r8712u.ko'. Once it
>> prompts you, enter 'l *r8712_free_recvframe+0x2c'. The first character is
>> ell, not one. That will show the actual line of code. Please post that info.
>
> Here's what I get:
>
> (gdb) l *r8712_free_recvframe+0x2c
> 0x16714 is in r8712_free_recvframe (drivers/staging/rtl8712/rtl8712_recv.c:145).
> 140 struct _adapter *padapter = precvframe->u.hdr.adapter;
> 141 struct recv_priv *precvpriv = &padapter->recvpriv;
> 142
> 143 if (precvframe->u.hdr.pkt) {
> 144
> dev_kfree_skb_any(precvframe->u.hdr.pkt);/*free skb by driver*/
> 145 precvframe->u.hdr.pkt = NULL;
> 146 }
> 147 spin_lock_irqsave(&pfree_recv_queue->lock, irqL);
> 148 list_del_init(&(precvframe->u.hdr.list));
> 149 list_add_tail(&(precvframe->u.hdr.list),
> &pfree_recv_queue->queue);
>
> It seems that dev_kfree_skb_any is an inline function that calls
> __dev_kfree_skb_any,
> so that should explain why that call didn't show up in the stack dump.

Thanks. From your original posting, the bad address for precvframe->u.hdr.pkt is
a9d797d7. I'm a little bothered by that odd address. I would have expected it to
be even, at least. The actual definition should be OK with the alignment, but
there are a number of places where there is a cast. If the alignment of the
object of the cast is wrong, then that might cause the problem. It will take a
while to trace back through your call chain to see if any of these are involved
here.

I will be back to you later.

Larry



2015-05-17 17:20:05

by Haggai Eran

[permalink] [raw]
Subject: Re: kernel page fault in r8712u

On 17 May 2015 at 13:29, Arend van Spriel <[email protected]> wrote:
> On 17-05-15 06:25, Haggai Eran wrote:
>>
>> On 16 May 2015 at 20:54, Larry Finger <[email protected]> wrote:
>>>
>>> Another location needed from gdb is "l *recv_func+0x8c".
>>
>>
>> Here it is:
>> (gdb) l *recv_func+0x8c
>> 0x17094 is in recv_func (drivers/staging/rtl8712/rtl8712_recv.c:1004).
>> 999 r8712_free_recvframe(orig_prframe,
>> pfree_recv_queue);
>> 1000 goto _exit_recv_func;
>> 1001 }
>> 1002 _exit_recv_func:
>> 1003 return retval;
>> 1004 }
>> 1005
>> 1006 static int recvbuf2recvframe(struct _adapter *padapter, struct
>> sk_buff *pskb)
>> 1007 {
>> 1008 u8 *pbuf, shift_sz = 0;
>>
>> I don't think this means the relevant call is the one at line 999. I
>> think it is an earlier call, after r8712_validate_recv_frame. Here's
>> the disassembly:
>
>
> can you provide the address of recv_func as well to determine the exact
> location in assembly.

Yes, it is in offset 0x17008 in the module:
> 00017008 <recv_func>:

Regards,
Haggai

>
>> /* check the frame crtl field and decache */
>> retval = r8712_validate_recv_frame(padapter, prframe);
>> 17070: e1a00004 mov r0, r4
>> 17074: e1a01005 mov r1, r5
>> 17078: ebfffffe bl 17bc0 <r8712_validate_recv_frame>
>> if (retval != _SUCCESS) {
>> 1707c: e3500001 cmp r0, #1
>> r8712_free_recvframe(orig_prframe,
>> pfree_recv_queue);
>> goto _exit_recv_func;
>> }
>> }
>> /* check the frame crtl field and decache */
>> retval = r8712_validate_recv_frame(padapter, prframe);
>> 17080: e1a06000 mov r6, r0
>> if (retval != _SUCCESS) {
>> 17084: 0a000005 beq 170a0 <recv_func+0x98>
>> /* free this recv_frame */
>> r8712_free_recvframe(orig_prframe, pfree_recv_queue);
>> 17088: e1a00005 mov r0, r5
>> 1708c: e1a01007 mov r1, r7
>> 17090: ebfffffe bl 166e8 <r8712_free_recvframe>
>> r8712_free_recvframe(orig_prframe, pfree_recv_queue);
>> goto _exit_recv_func;
>> }
>> _exit_recv_func:
>> return retval;
>> }
>> 17094: e1a00006 mov r0, r6

2015-05-16 17:16:22

by Haggai Eran

[permalink] [raw]
Subject: Re: kernel page fault in r8712u

On 16 May 2015 at 17:57, Larry Finger <[email protected]> wrote:
> The problem appears to be from r8712u. From the stack dump, the problem
> happens when r8712_free_recvframe() calls __dev_kfree_skb_any(). A
> complication is that my copy of the kernel source does not show such a call.
> :(
>
> Please use gdb to help with the debugging. From the main directory of your
> source, enter the command 'gdb drivers/staging/rtl8712/r8712u.ko'. Once it
> prompts you, enter 'l *r8712_free_recvframe+0x2c'. The first character is
> ell, not one. That will show the actual line of code. Please post that info.

Here's what I get:

(gdb) l *r8712_free_recvframe+0x2c
0x16714 is in r8712_free_recvframe (drivers/staging/rtl8712/rtl8712_recv.c:145).
140 struct _adapter *padapter = precvframe->u.hdr.adapter;
141 struct recv_priv *precvpriv = &padapter->recvpriv;
142
143 if (precvframe->u.hdr.pkt) {
144
dev_kfree_skb_any(precvframe->u.hdr.pkt);/*free skb by driver*/
145 precvframe->u.hdr.pkt = NULL;
146 }
147 spin_lock_irqsave(&pfree_recv_queue->lock, irqL);
148 list_del_init(&(precvframe->u.hdr.list));
149 list_add_tail(&(precvframe->u.hdr.list),
&pfree_recv_queue->queue);

It seems that dev_kfree_skb_any is an inline function that calls
__dev_kfree_skb_any,
so that should explain why that call didn't show up in the stack dump.

Haggai

2015-05-18 17:38:22

by Haggai Eran

[permalink] [raw]
Subject: Re: kernel page fault in r8712u

On 18 May 2015 at 18:31, Larry Finger <[email protected]> wrote:
> On 05/17/2015 02:22 PM, Haggai Eran wrote:
>>
>> I added some debugging prints, trying to see more details about the
>> packet that fails the r8712_validate_recv_frame. I noticed I'm getting
>> many packets where recv_decache returns _FAIL. However, the last two
>> packets before the crash fail for different reasons. The first has the
>> ver field set to 3 (instead of zero). The second (the one that get's
>> freed and cause the crash apparently) has an unknown type (12). If I'm
>> not mistaken, 12 = WIFI_CTRL_TYPE | WIFI_DATA_TYPE. Is that possible?
>>
>> It could be that the packet headers are garbled though.
>
>
> I think the headers are garbled. Did you log the address of the skb at
> precvframe->u.hdr.pkt in r8712_free_recvframe() or orig_prframe->u.hdr.pct
> in recv_func().
I haven't. I'll print that.

>
> I am still dubious of the cast "prframe = (union recv_frame *)pcontext;" in
> recv_func().

Why? As far as I can see, recv_func is called only at one place
(r8712_recv_entry), where it is passed a union recv_frame * as the
pcontext parameter.

Haggai

2015-05-16 17:54:22

by Larry Finger

[permalink] [raw]
Subject: Re: kernel page fault in r8712u

On 05/16/2015 12:16 PM, Haggai Eran wrote:
> On 16 May 2015 at 17:57, Larry Finger <[email protected]> wrote:
>> The problem appears to be from r8712u. From the stack dump, the problem
>> happens when r8712_free_recvframe() calls __dev_kfree_skb_any(). A
>> complication is that my copy of the kernel source does not show such a call.
>> :(
>>
>> Please use gdb to help with the debugging. From the main directory of your
>> source, enter the command 'gdb drivers/staging/rtl8712/r8712u.ko'. Once it
>> prompts you, enter 'l *r8712_free_recvframe+0x2c'. The first character is
>> ell, not one. That will show the actual line of code. Please post that info.

Another location needed from gdb is "l *recv_func+0x8c".

Thanks,

Larry



2015-05-19 05:00:25

by Haggai Eran

[permalink] [raw]
Subject: Re: kernel page fault in r8712u

On 19 May 2015 at 07:52, Larry Finger <[email protected]> wrote:
> OK, I will have to search further upstream to see how a faulty skb was
> provided.
>
> I have been testing r8712u on my x86_64 system with no difficulty.
>
> I checked the driver with Smatch and found a couple of array problems. These
> likely won't be the problem, but try the attached patches anyway.

I found one place that might be the cause for the fault. The
recvbuf2recvframe function has a line copying memory between the
incoming pskb and a new allocated skb:

1065 pkt_copy = netdev_alloc_skb(padapter->pnetdev, alloc_sz);
1066 if (pkt_copy) {
1067 precvframe->u.hdr.pkt = pkt_copy;
1068 skb_reserve(pkt_copy, 4 - ((addr_t)(pkt_copy->data)
1069 % 4));
1070 skb_reserve(pkt_copy, shift_sz);
1071 memcpy(pkt_copy->data, pbuf, tmp_len);
1072 precvframe->u.hdr.rx_head =
precvframe->u.hdr.rx_data =
1073 precvframe->u.hdr.rx_tail =
pkt_copy->data;
1074 precvframe->u.hdr.rx_end = pkt_copy->data
+ alloc_sz;

I added a BUG_ON there in case the memcpy overflows
(BUG_ON((pkt_copy->end - pkt_copy->data) < tmp_len)) and it trigerred.
I'm not sure why does the overflow occur though.

Haggai

2015-05-17 10:29:35

by Arend Van Spriel

[permalink] [raw]
Subject: Re: kernel page fault in r8712u



On 17-05-15 06:25, Haggai Eran wrote:
> On 16 May 2015 at 20:54, Larry Finger <[email protected]> wrote:
>> Another location needed from gdb is "l *recv_func+0x8c".
>
> Here it is:
> (gdb) l *recv_func+0x8c
> 0x17094 is in recv_func (drivers/staging/rtl8712/rtl8712_recv.c:1004).
> 999 r8712_free_recvframe(orig_prframe, pfree_recv_queue);
> 1000 goto _exit_recv_func;
> 1001 }
> 1002 _exit_recv_func:
> 1003 return retval;
> 1004 }
> 1005
> 1006 static int recvbuf2recvframe(struct _adapter *padapter, struct
> sk_buff *pskb)
> 1007 {
> 1008 u8 *pbuf, shift_sz = 0;
>
> I don't think this means the relevant call is the one at line 999. I
> think it is an earlier call, after r8712_validate_recv_frame. Here's
> the disassembly:

can you provide the address of recv_func as well to determine the exact
location in assembly.

Regards,
Arend

> /* check the frame crtl field and decache */
> retval = r8712_validate_recv_frame(padapter, prframe);
> 17070: e1a00004 mov r0, r4
> 17074: e1a01005 mov r1, r5
> 17078: ebfffffe bl 17bc0 <r8712_validate_recv_frame>
> if (retval != _SUCCESS) {
> 1707c: e3500001 cmp r0, #1
> r8712_free_recvframe(orig_prframe, pfree_recv_queue);
> goto _exit_recv_func;
> }
> }
> /* check the frame crtl field and decache */
> retval = r8712_validate_recv_frame(padapter, prframe);
> 17080: e1a06000 mov r6, r0
> if (retval != _SUCCESS) {
> 17084: 0a000005 beq 170a0 <recv_func+0x98>
> /* free this recv_frame */
> r8712_free_recvframe(orig_prframe, pfree_recv_queue);
> 17088: e1a00005 mov r0, r5
> 1708c: e1a01007 mov r1, r7
> 17090: ebfffffe bl 166e8 <r8712_free_recvframe>
> r8712_free_recvframe(orig_prframe, pfree_recv_queue);
> goto _exit_recv_func;
> }
> _exit_recv_func:
> return retval;
> }
> 17094: e1a00006 mov r0, r6
>
> Haggai
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

2015-05-18 18:38:59

by Haggai Eran

[permalink] [raw]
Subject: Re: kernel page fault in r8712u

On 18 May 2015 at 18:31, Larry Finger <[email protected]> wrote:
> On 05/17/2015 02:22 PM, Haggai Eran wrote:
>>
>> I added some debugging prints, trying to see more details about the
>> packet that fails the r8712_validate_recv_frame. I noticed I'm getting
>> many packets where recv_decache returns _FAIL. However, the last two
>> packets before the crash fail for different reasons. The first has the
>> ver field set to 3 (instead of zero). The second (the one that get's
>> freed and cause the crash apparently) has an unknown type (12). If I'm
>> not mistaken, 12 = WIFI_CTRL_TYPE | WIFI_DATA_TYPE. Is that possible?
>>
>> It could be that the packet headers are garbled though.
>
>
> I think the headers are garbled. Did you log the address of the skb at
> precvframe->u.hdr.pkt in r8712_free_recvframe() or orig_prframe->u.hdr.pct
> in recv_func().

I added prints of the skb pointer in every call to recv_func. Here are
the results:

...
[ 674.111771] recv_func: pcontext = 96335820, prframe->u.hdr.pkt = 9729fb40
[ 674.118782] recv_func: pcontext = 963359b8, prframe->u.hdr.pkt = 9729f6c0
[ 674.125777] recv_func: pcontext = 96335930, prframe->u.hdr.pkt = 9729f780
[ 674.132769] recv_func: pcontext = 963358a8, prframe->u.hdr.pkt = 973d56c0
[ 674.139753] recv_func: pcontext = 96335d70, prframe->u.hdr.pkt = 973d5000
[ 674.146922] recv_func: pcontext = 963361b0, prframe->u.hdr.pkt = 973d5000
[ 674.153961] recv_func: pcontext = 963360a0, prframe->u.hdr.pkt = 973d5000
[ 674.161023] recv_func: pcontext = 96336128, prframe->u.hdr.pkt = 973d5000
[ 674.168186] recv_func: pcontext = 96336018, prframe->u.hdr.pkt = 973d5000
[ 674.175231] recv_func: pcontext = 96335f90, prframe->u.hdr.pkt = 973d5000
[ 674.182141] r8712_validate_recv_frame: ver = 1
[ 674.186814] recv_func: pcontext = 96335f08, prframe->u.hdr.pkt = 973d5000
[ 674.193811] r8712_validate_recv_frame: ver = 1
[ 674.198530] recv_func: pcontext = 963363d0, prframe->u.hdr.pkt = 973d5000
[ 674.205434] r8712_validate_recv_frame: ver = 3
[ 674.210018] Unable to handle kernel NULL pointer dereference at
virtual address 00000001
[ 674.218209] pgd = 80004000
[ 674.221025] [00000001] *pgd=00000000
[ 674.224752] Internal error: Oops: 5 [#1] ARM
[ 674.229028] Modules linked in: rfcomm cfg80211 r8712u(C) btusb
bluetooth bcm2708_rng
[ 674.236857] CPU: 0 PID: 530 Comm: kworker/0:1 Tainted: G WC
4.0.3 #1
[ 674.244247] Hardware name: BCM2708
[ 674.247663] task: 962cdee0 ti: 960fc000 task.ti: 960fc000
[ 674.253082] PC is at put_page+0xc/0x68
[ 674.256853] LR is at skb_release_data+0x6c/0xcc
[ 674.261388] pc : [<800933e0>] lr : [<80433fe4>] psr: 20000113
[ 674.261388] sp : 960fdc18 ip : 960fdc28 fp : 960fdc24
[ 674.272856] r10: 84d6cc00 r9 : 0000fff8 r8 : 00002f17
[ 674.278079] r7 : 973d5000 r6 : 84dc7620 r5 : 84dc7620 r4 : 00000000
[ 674.284602] r3 : 00000037 r2 : 00000000 r1 : 00000001 r0 : 00000001
[ 674.291127] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM
Segment kernel
[ 674.298432] Control: 00c5387d Table: 168a4008 DAC: 00000015
[ 674.304179] Process kworker/0:1 (pid: 530, stack limit = 0x960fc188)
[ 674.310532] Stack: (0x960fdc18 to 0x960fe000)
[ 674.314893] dc00:
960fdc44 960fdc28
[ 674.323076] dc20: 80433fe4 800933e0 973d5000 96309010 96308520
96309010 960fdc5c 960fdc48
[ 674.331257] dc40: 8043406c 80433f84 00000001 973d5000 960fdc74
960fdc60 8043413c 80434050
[ 674.339437] dc60: 40000113 963363d0 960fdc84 960fdc78 80441c08
80434120 960fdcac 960fdc88
[ 674.347618] dc80: 7f10ca70 80441bd0 00000000 96308520 963363d0
00000000 96309010 00002f17
[ 674.355798] dca0: 960fdce4 960fdcb0 7f10d3f8 7f10ca50 960fdcd4
960fdcc0 80439aac 96308520
[ 674.363978] dcc0: 963363d0 9630a520 00002f80 00002f17 0000fff8
84d6cc00 960fdd04 960fdce8
[ 674.372157] dce0: 7f10eb84 7f10d36c 000000d2 963363d0 84dc7626
00000018 960fdd54 960fdd08
[ 674.380338] dd00: 7f10c65c 7f10eb5c 96308ff0 963090d4 9729f840
9630b520 96309010 973d5000
[ 674.388518] dd20: ffff2f00 00000002 808f4590 96309094 808f458c
8093f820 00000000 96a32900
[ 674.396698] dd40: 8093f840 40000000 960fdd7c 960fdd58 8001fbbc
7f10c4b8 0000833e 00000000
[ 674.404879] dd60: 00000000 00000102 960fc000 8093f840 960fddcc
960fdd80 8001ffc0 8001fb48
[ 674.413058] dd80: 8054e348 80052428 00000001 00000001 04208060
0001b61a 00000009 960fc000
[ 674.421237] dda0: 00000000 00000000 80920c94 00000000 00000000
00000000 8003555c 00000000
[ 674.429416] ddc0: 960fdde4 960fddd0 80020474 8001fea4 00000000
00000000 960fde0c 960fdde8
[ 674.437598] dde0: 80057298 800203bc 960fde20 8054e760 60000013
f200b200 960fde54 972ba1e0
[ 674.445777] de00: 960fde1c 960fde10 800081e4 80057224 960fde7c
960fde20 800127f8 800081cc
[ 674.453957] de20: 8054e75c 00000001 962cdee0 00000000 808f6668
969021e0 97051140 00000000
[ 674.462140] de40: 972ba1e0 8003555c 00000000 960fde7c 960fde58
960fde68 8004b37c 8054e760
[ 674.470319] de60: 60000013 ffffffff 00000000 808f6668 960fdeac
960fde80 8003e738 8054e73c
[ 674.478499] de80: 00000001 00000000 8003e6cc 960fde98 97239140
962cdee0 808f6668 972ba1e0
[ 674.486679] dea0: 960fded4 960fdeb0 80549748 8003e6d8 960fded8
960fc000 808f5834 808f5834
[ 674.494860] dec0: 808f5864 00000008 960fdeec 960fded8 80549ad8
80549558 962cdee0 971e55a0
[ 674.503039] dee0: 960fdf24 960fdef0 80035590 80549aa0 972c4940
971e55a0 800354cc 00000000
[ 674.511220] df00: 972c4940 971e55a0 800354cc 00000000 00000000
00000000 960fdfac 960fdf28
[ 674.519401] df20: 8003a080 800354d8 00000000 00000000 960fdf4c
971e55a0 00000000 00000001
[ 674.527581] df40: dead4ead ffffffff ffffffff 8093fd70 00000000
00000000 80646d2c 960fdf5c
[ 674.535759] df60: 960fdf5c 00000000 00000001 dead4ead ffffffff
ffffffff 8093fd70 00000000
[ 674.543939] df80: 00000000 80646d2c 960fdf88 960fdf88 972c4940
80039fa0 00000000 00000000
[ 674.552118] dfa0: 00000000 960fdfb0 8000e8f0 80039fac 00000000
00000000 00000000 00000000
[ 674.560296] dfc0: 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000
[ 674.568474] dfe0: 00000000 00000000 00000000 00000000 00000013
00000000 00000000 00000000
[ 674.576643] Backtrace:
[ 674.579129] [<800933d4>] (put_page) from [<80433fe4>]
(skb_release_data+0x6c/0xcc)
[ 674.586711] [<80433f78>] (skb_release_data) from [<8043406c>]
(skb_release_all+0x28/0x2c)
[ 674.594881] r7:96309010 r6:96308520 r5:96309010 r4:973d5000
[ 674.600599] [<80434044>] (skb_release_all) from [<8043413c>]
(consume_skb+0x28/0x5c)
[ 674.608338] r4:973d5000 r3:00000001
[ 674.611961] [<80434114>] (consume_skb) from [<80441c08>]
(__dev_kfree_skb_any+0x44/0x48)
[ 674.620045] r4:963363d0 r3:40000113
[ 674.623891] [<80441bc4>] (__dev_kfree_skb_any) from [<7f10ca70>]
(r8712_free_recvframe+0x2c/0x94 [r8712u])
[ 674.633827] [<7f10ca44>] (r8712_free_recvframe [r8712u]) from
[<7f10d3f8>] (recv_func+0x98/0x6f0 [r8712u])
[ 674.643477] r8:00002f17 r7:96309010 r6:00000000 r5:963363d0
r4:96308520 r3:00000000
[ 674.651563] [<7f10d360>] (recv_func [r8712u]) from [<7f10eb84>]
(r8712_recv_entry+0x34/0x78 [r8712u])
[ 674.660780] r10:84d6cc00 r9:0000fff8 r8:00002f17 r7:00002f80
r6:9630a520 r5:963363d0
[ 674.668666] r4:96308520
[ 674.671495] [<7f10eb50>] (r8712_recv_entry [r8712u]) from
[<7f10c65c>] (recv_tasklet+0x1b0/0x324 [r8712u])
[ 674.681145] r6:00000018 r5:84dc7626 r4:963363d0 r3:000000d2
[ 674.687002] [<7f10c4ac>] (recv_tasklet [r8712u]) from [<8001fbbc>]
(tasklet_hi_action+0x80/0xdc)
[ 674.695785] r10:40000000 r9:8093f840 r8:96a32900 r7:00000000
r6:8093f820 r5:808f458c
[ 674.703670] r4:96309094
[ 674.706228] [<8001fb3c>] (tasklet_hi_action) from [<8001ffc0>]
(__do_softirq+0x128/0x290)
[ 674.714399] r8:8093f840 r7:960fc000 r6:00000102 r5:00000000
r4:00000000 r3:0000833e
[ 674.722205] [<8001fe98>] (__do_softirq) from [<80020474>]
(irq_exit+0xc4/0x118)
[ 674.729509] r10:00000000 r9:8003555c r8:00000000 r7:00000000
r6:00000000 r5:80920c94
[ 674.737392] r4:00000000
[ 674.739968] [<800203b0>] (irq_exit) from [<80057298>]
(__handle_domain_irq+0x80/0xe0)
[ 674.747793] r4:00000000 r3:00000000
[ 674.751406] [<80057218>] (__handle_domain_irq) from [<800081e4>]
(asm_do_IRQ+0x24/0x28)
[ 674.759404] r8:972ba1e0 r7:960fde54 r6:f200b200 r5:60000013
r4:8054e760 r3:960fde20
[ 674.767228] [<800081c0>] (asm_do_IRQ) from [<800127f8>] (__irq_svc+0x38/0xb0)
[ 674.774361] Exception stack(0x960fde20 to 0x960fde68)
[ 674.779421] de20: 8054e75c 00000001 962cdee0 00000000 808f6668
969021e0 97051140 00000000
[ 674.787600] de40: 972ba1e0 8003555c 00000000 960fde7c 960fde58
960fde68 8004b37c 8054e760
[ 674.795771] de60: 60000013 ffffffff
[ 674.799294] [<8054e730>] (_raw_spin_unlock_irq) from [<8003e738>]
(finish_task_switch+0x6c/0x108)
[ 674.808157] r4:808f6668 r3:00000000
[ 674.811772] [<8003e6cc>] (finish_task_switch) from [<80549748>]
(__schedule+0x1fc/0x548)
[ 674.819856] r7:972ba1e0 r6:808f6668 r5:962cdee0 r4:97239140
[ 674.825572] [<8054954c>] (__schedule) from [<80549ad8>] (schedule+0x44/0x9c)
[ 674.832615] r8:00000008 r7:808f5864 r6:808f5834 r5:808f5834
r4:960fc000 r3:960fded8
[ 674.840438] [<80549a94>] (schedule) from [<80035590>]
(worker_thread+0xc4/0x4d0)
[ 674.847829] r4:971e55a0 r3:962cdee0
[ 674.851442] [<800354cc>] (worker_thread) from [<8003a080>]
(kthread+0xe0/0x100)
[ 674.858747] r10:00000000 r9:00000000 r8:00000000 r7:800354cc
r6:971e55a0 r5:972c4940
[ 674.866630] r4:00000000
[ 674.869192] [<80039fa0>] (kthread) from [<8000e8f0>]
(ret_from_fork+0x14/0x24)
[ 674.876409] r7:00000000 r6:00000000 r5:80039fa0 r4:972c4940
[ 674.882121] Code: 8009272c e1a0c00d e92dd800 e24cb004 (e5902000)
[ 674.888596] ---[ end trace 8b18691702087335 ]---
[ 674.893371] Kernel panic - not syncing: Fatal exception in interrupt
[ 674.899744] ---[ end Kernel panic - not syncing: Fatal exception in interrupt

The offsets are a little different, I guess because of the added
prints, and debugging features I enabled in the kernel. One thing I
notice is that the skb at 0x973d5000 gets reused a couple of times
before the crash. Also, this time the pointer being dereferenced is
NULL (0x1).

Haggai

2015-05-19 04:52:06

by Larry Finger

[permalink] [raw]
Subject: Re: kernel page fault in r8712u

On 05/18/2015 01:38 PM, Haggai Eran wrote:
> On 18 May 2015 at 18:31, Larry Finger <[email protected]> wrote:
>> On 05/17/2015 02:22 PM, Haggai Eran wrote:
>>>
>>> I added some debugging prints, trying to see more details about the
>>> packet that fails the r8712_validate_recv_frame. I noticed I'm getting
>>> many packets where recv_decache returns _FAIL. However, the last two
>>> packets before the crash fail for different reasons. The first has the
>>> ver field set to 3 (instead of zero). The second (the one that get's
>>> freed and cause the crash apparently) has an unknown type (12). If I'm
>>> not mistaken, 12 = WIFI_CTRL_TYPE | WIFI_DATA_TYPE. Is that possible?
>>>
>>> It could be that the packet headers are garbled though.
>>
>>
>> I think the headers are garbled. Did you log the address of the skb at
>> precvframe->u.hdr.pkt in r8712_free_recvframe() or orig_prframe->u.hdr.pct
>> in recv_func().
>
> I added prints of the skb pointer in every call to recv_func. Here are
> the results:
>
> ...
> [ 674.111771] recv_func: pcontext = 96335820, prframe->u.hdr.pkt = 9729fb40
> [ 674.118782] recv_func: pcontext = 963359b8, prframe->u.hdr.pkt = 9729f6c0
> [ 674.125777] recv_func: pcontext = 96335930, prframe->u.hdr.pkt = 9729f780
> [ 674.132769] recv_func: pcontext = 963358a8, prframe->u.hdr.pkt = 973d56c0
> [ 674.139753] recv_func: pcontext = 96335d70, prframe->u.hdr.pkt = 973d5000
> [ 674.146922] recv_func: pcontext = 963361b0, prframe->u.hdr.pkt = 973d5000
> [ 674.153961] recv_func: pcontext = 963360a0, prframe->u.hdr.pkt = 973d5000
> [ 674.161023] recv_func: pcontext = 96336128, prframe->u.hdr.pkt = 973d5000
> [ 674.168186] recv_func: pcontext = 96336018, prframe->u.hdr.pkt = 973d5000
> [ 674.175231] recv_func: pcontext = 96335f90, prframe->u.hdr.pkt = 973d5000
> [ 674.182141] r8712_validate_recv_frame: ver = 1
> [ 674.186814] recv_func: pcontext = 96335f08, prframe->u.hdr.pkt = 973d5000
> [ 674.193811] r8712_validate_recv_frame: ver = 1
> [ 674.198530] recv_func: pcontext = 963363d0, prframe->u.hdr.pkt = 973d5000
> [ 674.205434] r8712_validate_recv_frame: ver = 3
> [ 674.210018] Unable to handle kernel NULL pointer dereference at
> virtual address 00000001
> [ 674.218209] pgd = 80004000
> [ 674.221025] [00000001] *pgd=00000000
> [ 674.224752] Internal error: Oops: 5 [#1] ARM
> [ 674.229028] Modules linked in: rfcomm cfg80211 r8712u(C) btusb
> bluetooth bcm2708_rng
> [ 674.236857] CPU: 0 PID: 530 Comm: kworker/0:1 Tainted: G WC
> 4.0.3 #1
> [ 674.244247] Hardware name: BCM2708
> [ 674.247663] task: 962cdee0 ti: 960fc000 task.ti: 960fc000
> [ 674.253082] PC is at put_page+0xc/0x68
> [ 674.256853] LR is at skb_release_data+0x6c/0xcc
> [ 674.261388] pc : [<800933e0>] lr : [<80433fe4>] psr: 20000113
> [ 674.261388] sp : 960fdc18 ip : 960fdc28 fp : 960fdc24
> [ 674.272856] r10: 84d6cc00 r9 : 0000fff8 r8 : 00002f17
> [ 674.278079] r7 : 973d5000 r6 : 84dc7620 r5 : 84dc7620 r4 : 00000000
> [ 674.284602] r3 : 00000037 r2 : 00000000 r1 : 00000001 r0 : 00000001
> [ 674.291127] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM
> Segment kernel
> [ 674.298432] Control: 00c5387d Table: 168a4008 DAC: 00000015
> [ 674.304179] Process kworker/0:1 (pid: 530, stack limit = 0x960fc188)
> [ 674.310532] Stack: (0x960fdc18 to 0x960fe000)
> [ 674.314893] dc00:
> 960fdc44 960fdc28
> [ 674.323076] dc20: 80433fe4 800933e0 973d5000 96309010 96308520
> 96309010 960fdc5c 960fdc48
> [ 674.331257] dc40: 8043406c 80433f84 00000001 973d5000 960fdc74
> 960fdc60 8043413c 80434050
> [ 674.339437] dc60: 40000113 963363d0 960fdc84 960fdc78 80441c08
> 80434120 960fdcac 960fdc88
> [ 674.347618] dc80: 7f10ca70 80441bd0 00000000 96308520 963363d0
> 00000000 96309010 00002f17
> [ 674.355798] dca0: 960fdce4 960fdcb0 7f10d3f8 7f10ca50 960fdcd4
> 960fdcc0 80439aac 96308520
> [ 674.363978] dcc0: 963363d0 9630a520 00002f80 00002f17 0000fff8
> 84d6cc00 960fdd04 960fdce8
> [ 674.372157] dce0: 7f10eb84 7f10d36c 000000d2 963363d0 84dc7626
> 00000018 960fdd54 960fdd08
> [ 674.380338] dd00: 7f10c65c 7f10eb5c 96308ff0 963090d4 9729f840
> 9630b520 96309010 973d5000
> [ 674.388518] dd20: ffff2f00 00000002 808f4590 96309094 808f458c
> 8093f820 00000000 96a32900
> [ 674.396698] dd40: 8093f840 40000000 960fdd7c 960fdd58 8001fbbc
> 7f10c4b8 0000833e 00000000
> [ 674.404879] dd60: 00000000 00000102 960fc000 8093f840 960fddcc
> 960fdd80 8001ffc0 8001fb48
> [ 674.413058] dd80: 8054e348 80052428 00000001 00000001 04208060
> 0001b61a 00000009 960fc000
> [ 674.421237] dda0: 00000000 00000000 80920c94 00000000 00000000
> 00000000 8003555c 00000000
> [ 674.429416] ddc0: 960fdde4 960fddd0 80020474 8001fea4 00000000
> 00000000 960fde0c 960fdde8
> [ 674.437598] dde0: 80057298 800203bc 960fde20 8054e760 60000013
> f200b200 960fde54 972ba1e0
> [ 674.445777] de00: 960fde1c 960fde10 800081e4 80057224 960fde7c
> 960fde20 800127f8 800081cc
> [ 674.453957] de20: 8054e75c 00000001 962cdee0 00000000 808f6668
> 969021e0 97051140 00000000
> [ 674.462140] de40: 972ba1e0 8003555c 00000000 960fde7c 960fde58
> 960fde68 8004b37c 8054e760
> [ 674.470319] de60: 60000013 ffffffff 00000000 808f6668 960fdeac
> 960fde80 8003e738 8054e73c
> [ 674.478499] de80: 00000001 00000000 8003e6cc 960fde98 97239140
> 962cdee0 808f6668 972ba1e0
> [ 674.486679] dea0: 960fded4 960fdeb0 80549748 8003e6d8 960fded8
> 960fc000 808f5834 808f5834
> [ 674.494860] dec0: 808f5864 00000008 960fdeec 960fded8 80549ad8
> 80549558 962cdee0 971e55a0
> [ 674.503039] dee0: 960fdf24 960fdef0 80035590 80549aa0 972c4940
> 971e55a0 800354cc 00000000
> [ 674.511220] df00: 972c4940 971e55a0 800354cc 00000000 00000000
> 00000000 960fdfac 960fdf28
> [ 674.519401] df20: 8003a080 800354d8 00000000 00000000 960fdf4c
> 971e55a0 00000000 00000001
> [ 674.527581] df40: dead4ead ffffffff ffffffff 8093fd70 00000000
> 00000000 80646d2c 960fdf5c
> [ 674.535759] df60: 960fdf5c 00000000 00000001 dead4ead ffffffff
> ffffffff 8093fd70 00000000
> [ 674.543939] df80: 00000000 80646d2c 960fdf88 960fdf88 972c4940
> 80039fa0 00000000 00000000
> [ 674.552118] dfa0: 00000000 960fdfb0 8000e8f0 80039fac 00000000
> 00000000 00000000 00000000
> [ 674.560296] dfc0: 00000000 00000000 00000000 00000000 00000000
> 00000000 00000000 00000000
> [ 674.568474] dfe0: 00000000 00000000 00000000 00000000 00000013
> 00000000 00000000 00000000
> [ 674.576643] Backtrace:
> [ 674.579129] [<800933d4>] (put_page) from [<80433fe4>]
> (skb_release_data+0x6c/0xcc)
> [ 674.586711] [<80433f78>] (skb_release_data) from [<8043406c>]
> (skb_release_all+0x28/0x2c)
> [ 674.594881] r7:96309010 r6:96308520 r5:96309010 r4:973d5000
> [ 674.600599] [<80434044>] (skb_release_all) from [<8043413c>]
> (consume_skb+0x28/0x5c)
> [ 674.608338] r4:973d5000 r3:00000001
> [ 674.611961] [<80434114>] (consume_skb) from [<80441c08>]
> (__dev_kfree_skb_any+0x44/0x48)
> [ 674.620045] r4:963363d0 r3:40000113
> [ 674.623891] [<80441bc4>] (__dev_kfree_skb_any) from [<7f10ca70>]
> (r8712_free_recvframe+0x2c/0x94 [r8712u])
> [ 674.633827] [<7f10ca44>] (r8712_free_recvframe [r8712u]) from
> [<7f10d3f8>] (recv_func+0x98/0x6f0 [r8712u])
> [ 674.643477] r8:00002f17 r7:96309010 r6:00000000 r5:963363d0
> r4:96308520 r3:00000000
> [ 674.651563] [<7f10d360>] (recv_func [r8712u]) from [<7f10eb84>]
> (r8712_recv_entry+0x34/0x78 [r8712u])
> [ 674.660780] r10:84d6cc00 r9:0000fff8 r8:00002f17 r7:00002f80
> r6:9630a520 r5:963363d0
> [ 674.668666] r4:96308520
> [ 674.671495] [<7f10eb50>] (r8712_recv_entry [r8712u]) from
> [<7f10c65c>] (recv_tasklet+0x1b0/0x324 [r8712u])
> [ 674.681145] r6:00000018 r5:84dc7626 r4:963363d0 r3:000000d2
> [ 674.687002] [<7f10c4ac>] (recv_tasklet [r8712u]) from [<8001fbbc>]
> (tasklet_hi_action+0x80/0xdc)
> [ 674.695785] r10:40000000 r9:8093f840 r8:96a32900 r7:00000000
> r6:8093f820 r5:808f458c
> [ 674.703670] r4:96309094
> [ 674.706228] [<8001fb3c>] (tasklet_hi_action) from [<8001ffc0>]
> (__do_softirq+0x128/0x290)
> [ 674.714399] r8:8093f840 r7:960fc000 r6:00000102 r5:00000000
> r4:00000000 r3:0000833e
> [ 674.722205] [<8001fe98>] (__do_softirq) from [<80020474>]
> (irq_exit+0xc4/0x118)
> [ 674.729509] r10:00000000 r9:8003555c r8:00000000 r7:00000000
> r6:00000000 r5:80920c94
> [ 674.737392] r4:00000000
> [ 674.739968] [<800203b0>] (irq_exit) from [<80057298>]
> (__handle_domain_irq+0x80/0xe0)
> [ 674.747793] r4:00000000 r3:00000000
> [ 674.751406] [<80057218>] (__handle_domain_irq) from [<800081e4>]
> (asm_do_IRQ+0x24/0x28)
> [ 674.759404] r8:972ba1e0 r7:960fde54 r6:f200b200 r5:60000013
> r4:8054e760 r3:960fde20
> [ 674.767228] [<800081c0>] (asm_do_IRQ) from [<800127f8>] (__irq_svc+0x38/0xb0)
> [ 674.774361] Exception stack(0x960fde20 to 0x960fde68)
> [ 674.779421] de20: 8054e75c 00000001 962cdee0 00000000 808f6668
> 969021e0 97051140 00000000
> [ 674.787600] de40: 972ba1e0 8003555c 00000000 960fde7c 960fde58
> 960fde68 8004b37c 8054e760
> [ 674.795771] de60: 60000013 ffffffff
> [ 674.799294] [<8054e730>] (_raw_spin_unlock_irq) from [<8003e738>]
> (finish_task_switch+0x6c/0x108)
> [ 674.808157] r4:808f6668 r3:00000000
> [ 674.811772] [<8003e6cc>] (finish_task_switch) from [<80549748>]
> (__schedule+0x1fc/0x548)
> [ 674.819856] r7:972ba1e0 r6:808f6668 r5:962cdee0 r4:97239140
> [ 674.825572] [<8054954c>] (__schedule) from [<80549ad8>] (schedule+0x44/0x9c)
> [ 674.832615] r8:00000008 r7:808f5864 r6:808f5834 r5:808f5834
> r4:960fc000 r3:960fded8
> [ 674.840438] [<80549a94>] (schedule) from [<80035590>]
> (worker_thread+0xc4/0x4d0)
> [ 674.847829] r4:971e55a0 r3:962cdee0
> [ 674.851442] [<800354cc>] (worker_thread) from [<8003a080>]
> (kthread+0xe0/0x100)
> [ 674.858747] r10:00000000 r9:00000000 r8:00000000 r7:800354cc
> r6:971e55a0 r5:972c4940
> [ 674.866630] r4:00000000
> [ 674.869192] [<80039fa0>] (kthread) from [<8000e8f0>]
> (ret_from_fork+0x14/0x24)
> [ 674.876409] r7:00000000 r6:00000000 r5:80039fa0 r4:972c4940
> [ 674.882121] Code: 8009272c e1a0c00d e92dd800 e24cb004 (e5902000)
> [ 674.888596] ---[ end trace 8b18691702087335 ]---
> [ 674.893371] Kernel panic - not syncing: Fatal exception in interrupt
> [ 674.899744] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
>
> The offsets are a little different, I guess because of the added
> prints, and debugging features I enabled in the kernel. One thing I
> notice is that the skb at 0x973d5000 gets reused a couple of times
> before the crash. Also, this time the pointer being dereferenced is
> NULL (0x1).

OK, I will have to search further upstream to see how a faulty skb was provided.

I have been testing r8712u on my x86_64 system with no difficulty.

I checked the driver with Smatch and found a couple of array problems. These
likely won't be the problem, but try the attached patches anyway.

Larry


Attachments:
0001-staging-rtl8712-Fix-two-Smatch-errors-in-rtl8712_xmi.patch (1.00 kB)
0002-staging-rtl8712-Fix-Smatch-error-in-rtl8712_efuse.c.patch (1.06 kB)
Download all attachments

2015-05-18 15:31:45

by Larry Finger

[permalink] [raw]
Subject: Re: kernel page fault in r8712u

On 05/17/2015 02:22 PM, Haggai Eran wrote:
> I added some debugging prints, trying to see more details about the
> packet that fails the r8712_validate_recv_frame. I noticed I'm getting
> many packets where recv_decache returns _FAIL. However, the last two
> packets before the crash fail for different reasons. The first has the
> ver field set to 3 (instead of zero). The second (the one that get's
> freed and cause the crash apparently) has an unknown type (12). If I'm
> not mistaken, 12 = WIFI_CTRL_TYPE | WIFI_DATA_TYPE. Is that possible?
>
> It could be that the packet headers are garbled though.

I think the headers are garbled. Did you log the address of the skb at
precvframe->u.hdr.pkt in r8712_free_recvframe() or orig_prframe->u.hdr.pct in
recv_func().

I am still dubious of the cast "prframe = (union recv_frame *)pcontext;" in
recv_func().

Larry



2015-05-16 14:57:16

by Larry Finger

[permalink] [raw]
Subject: Re: kernel page fault in r8712u

On 05/16/2015 07:17 AM, Haggai Eran wrote:
> Hi,
>
> I've encountered the oops below running the r8712u driver. It occurred
> on Raspberry Pi (OpenELEC 5.95 beta, running kernel version 4.0.3),
> with the following device:
>> 0bda:8172 Realtek Semiconductor Corp. RTL8191SU 802.11n WLAN Adapter
>
> I'd be happy to dig in and see what the problem is, but I wanted to
> make sure this is the right driver to look at. linuxwireless.org says
> that this driver is going to be replaced by rtl8192su [1].
>
> Have you seen this issue?
>
> Thanks,
> Haggai
>
> [1] http://linuxwireless.org/en/users/Drivers/rtl819x/
>
> [ 834.537661] Unable to handle kernel paging request at virtual
> address a9d797d7
> [ 834.544900] pgd = 96b14000
> [ 834.547604] [a9d797d7] *pgd=00000000
> [ 834.551186] Internal error: Oops: 5 [#1] ARM
> [ 834.555449] Modules linked in: cfg80211 bluetooth r8712u(C)
> bcm2708_rng [last unloaded: btusb]
> [ 834.564092] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G C
> 4.0.1 #1
> [ 834.571303] Hardware name: BCM2708
> [ 834.574702] task: 9703a700 ti: 97056000 task.ti: 97056000
> [ 834.580125] PC is at put_compound_page+0x64/0x1d0
> [ 834.584833] LR is at put_page+0x48/0x4c
> [ 834.588670] pc : [<80076354>] lr : [<80076508>] psr: 00000113
> [ 834.588670] sp : 97057d70 ip : 97057d88 fp : 97057d84
> [ 834.600129] r10: 97335c00 r9 : 0000003c r8 : 00000c21
> [ 834.605345] r7 : 9723cf00 r6 : 972a6900 r5 : 972a6900 r4 : 5d270b9a
> [ 834.611862] r3 : d7a0d795 r2 : d7a0d795 r1 : 00000001 r0 : a9d797d7
> [ 834.618390] Flags: nzcv IRQs on FIQs on Mode SVC_32 ISA ARM
> Segment kernel
> [ 834.625690] Control: 00c5387d Table: 16b14008 DAC: 00000015
> [ 834.631429] Process ksoftirqd/0 (pid: 3, stack limit = 0x97056188)
> [ 834.637601] Stack: (0x97057d70 to 0x97058000)
> [ 834.641957] 7d60: 00000000

--snip--

> [ 834.813594] Backtrace:
> [ 834.816072] [<800762f0>] (put_compound_page) from [<80076508>]
> (put_page+0x48/0x4c)
> [ 834.823722] r5:972a6900 r4:00000000
> [ 834.827331] [<800764c0>] (put_page) from [<803d86c0>]
> (skb_release_data+0x6c/0xcc)
> [ 834.834904] [<803d8654>] (skb_release_data) from [<803d8748>]
> (skb_release_all+0x28/0x2c)
> [ 834.843071] r7:89868bd0 r6:89868420 r5:89868bd0 r4:9723cf00
> [ 834.848759] [<803d8720>] (skb_release_all) from [<803d8844>]
> (consume_skb+0x28/0x5c)
> [ 834.856490] r4:9723cf00 r3:00000001
> [ 834.860088] [<803d881c>] (consume_skb) from [<803e4b94>]
> (__dev_kfree_skb_any+0x44/0x48)
> [ 834.868167] r4:898c2d18 r3:40000113
> [ 834.871974] [<803e4b50>] (__dev_kfree_skb_any) from [<7f01a714>]
> (r8712_free_recvframe+0x2c/0x9c [r8712u])
> [ 834.881891] [<7f01a6e8>] (r8712_free_recvframe [r8712u]) from
> [<7f01b094>] (recv_func+0x8c/0x6e8 [r8712u])

Yes, this driver will likely be replaced, but I have no idea how soon that will be.

No, I am not aware of this problem, but see below.

The problem appears to be from r8712u. From the stack dump, the problem happens
when r8712_free_recvframe() calls __dev_kfree_skb_any(). A complication is that
my copy of the kernel source does not show such a call. :(

Please use gdb to help with the debugging. From the main directory of your
source, enter the command 'gdb drivers/staging/rtl8712/r8712u.ko'. Once it
prompts you, enter 'l *r8712_free_recvframe+0x2c'. The first character is ell,
not one. That will show the actual line of code. Please post that info.

This driver has been used for a long time with x86 architecture, but not much
with ARM, which has different alignment issues. That may be the source of the
problem.

Larry


2015-05-17 04:25:52

by Haggai Eran

[permalink] [raw]
Subject: Re: kernel page fault in r8712u

On 16 May 2015 at 20:54, Larry Finger <[email protected]> wrote:
> Another location needed from gdb is "l *recv_func+0x8c".

Here it is:
(gdb) l *recv_func+0x8c
0x17094 is in recv_func (drivers/staging/rtl8712/rtl8712_recv.c:1004).
999 r8712_free_recvframe(orig_prframe, pfree_recv_queue);
1000 goto _exit_recv_func;
1001 }
1002 _exit_recv_func:
1003 return retval;
1004 }
1005
1006 static int recvbuf2recvframe(struct _adapter *padapter, struct
sk_buff *pskb)
1007 {
1008 u8 *pbuf, shift_sz = 0;

I don't think this means the relevant call is the one at line 999. I
think it is an earlier call, after r8712_validate_recv_frame. Here's
the disassembly:

/* check the frame crtl field and decache */
retval = r8712_validate_recv_frame(padapter, prframe);
17070: e1a00004 mov r0, r4
17074: e1a01005 mov r1, r5
17078: ebfffffe bl 17bc0 <r8712_validate_recv_frame>
if (retval != _SUCCESS) {
1707c: e3500001 cmp r0, #1
r8712_free_recvframe(orig_prframe, pfree_recv_queue);
goto _exit_recv_func;
}
}
/* check the frame crtl field and decache */
retval = r8712_validate_recv_frame(padapter, prframe);
17080: e1a06000 mov r6, r0
if (retval != _SUCCESS) {
17084: 0a000005 beq 170a0 <recv_func+0x98>
/* free this recv_frame */
r8712_free_recvframe(orig_prframe, pfree_recv_queue);
17088: e1a00005 mov r0, r5
1708c: e1a01007 mov r1, r7
17090: ebfffffe bl 166e8 <r8712_free_recvframe>
r8712_free_recvframe(orig_prframe, pfree_recv_queue);
goto _exit_recv_func;
}
_exit_recv_func:
return retval;
}
17094: e1a00006 mov r0, r6

Haggai