2020-08-16 23:08:19

by Corey Minyard

[permalink] [raw]
Subject: Oops on current Raspian when closing an SCTP connection

I'm seeing the following when an SCTP connection terminates. This is on
Raspian on a Raspberry Pi, version is Linux version 5.4.51-v7+. That's
32-bit ARM.

I haven't looked into it yet, I thought I would report before trying to
chase anything down. I'm not seeing it on 5.4 x86_64 systems.

Aug 16 17:59:00 access kernel: [510640.326415] Unable to handle kernel NULL pointer dereference at virtual address 00000008
Aug 16 17:59:00 access kernel: [510640.341624] pgd = c00fc16c
Aug 16 17:59:00 access kernel: [510640.347834] [00000008] *pgd=355ef835, *pte=00000000, *ppte=00000000
Aug 16 17:59:00 access kernel: [510640.357731] Internal error: Oops: 17 [#22] SMP ARM
Aug 16 17:59:01 access kernel: [510640.365931] Modules linked in: md5 sctp ftdi_sio cp210x usbserial raspberrypi_hwmon bcm2835_codec(C) v4l2_mem2mem bcm2835_isp(C) bcm2835_v4l2(C) bcm2835_mmal_vchiq(C) videobuf2_vmalloc videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 snd_bcm2835(C) videobuf2_common i2c_bcm2835 snd_pcm snd_timer videodev snd mc vc_sm_cma(C) uio_pdrv_genirq uio fixed nf_nat_pptp nf_conntrack_pptp nf_nat nf_conntrack nf_defrag_ipv4 rtc_ds1307 regmap_i2c i2c_dev ip_tables x_tables ipv6 nf_defrag_ipv6
Aug 16 17:59:01 access kernel: [510640.425420] CPU: 1 PID: 4592 Comm: gtlsshd Tainted: G D C 5.4.51-v7+ #1327
Aug 16 17:59:01 access kernel: [510640.438008] Hardware name: BCM2835
Aug 16 17:59:01 access kernel: [510640.443823] PC is at sctp_ulpevent_free+0x38/0xa0 [sctp]
Aug 16 17:59:01 access kernel: [510640.451498] LR is at sctp_queue_purge_ulpevents+0x34/0x50 [sctp]
Aug 16 17:59:01 access kernel: [510640.459748] pc : [<7f1f3e08>] lr : [<7f1f3ea4>] psr: 20000013
Aug 16 17:59:01 access kernel: [510640.468311] sp : ad397dd0 ip : ad397df0 fp : ad397dec
Aug 16 17:59:01 access kernel: [510640.475811] r10: 00000008 r9 : b082c2c0 r8 : 80d04f48
Aug 16 17:59:01 access kernel: [510640.483282] r7 : 00000000 r6 : 00000000 r5 : b1ebc6a8 r4 : 00000000
Aug 16 17:59:01 access kernel: [510640.492079] r3 : 00000000 r2 : 00000000 r1 : 00000000 r0 : b1ebc6a8
Aug 16 17:59:01 access kernel: [510640.500815] Flags: nzCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment user
Aug 16 17:59:01 access kernel: [510640.510156] Control: 10c5387d Table: 2028806a DAC: 00000055
Aug 16 17:59:01 access kernel: [510640.518148] Process gtlsshd (pid: 4592, stack limit = 0x4ac0cb39)
Aug 16 17:59:01 access kernel: [510640.526489] Stack: (0xad397dd0 to 0xad398000)
Aug 16 17:59:01 access kernel: [510640.533055] 7dc0: 00000000 00000000 b1ec2a10 b1ea1300
Aug 16 17:59:01 access kernel: [510640.545471] 7de0: ad397e04 ad397df0 7f1f3ea4 7f1f3ddc b1ec2600 80d87e00 ad397e54 ad397e08
Aug 16 17:59:01 access kernel: [510640.557828] 7e00: 7f1fb208 7f1f3e7c ad397e94 00000000 802d4fec 8020b210 ad397e44 ad397e28
Aug 16 17:59:01 access kernel: [510640.570172] 7e20: aeacaf00 9ff167fd 8010bbbc b1ec2600 b082c240 7f21c380 b64b9550 94d54bb0
Aug 16 17:59:01 access kernel: [510640.582536] 7e40: b082c2c0 00000008 ad397e6c ad397e58 80826f9c 7f1fb1ac b1ec2600 b082c240
Aug 16 17:59:01 access kernel: [510640.595093] 7e60: ad397e84 ad397e70 7f00a23c 80826f5c b082c240 b082c340 ad397ea4 ad397e88
Aug 16 17:59:01 access kernel: [510640.607835] 7e80: 80751c38 7f00a20c 80751cac ad0006c0 000e0003 b082c2c0 ad397eb4 ad397ea8
Aug 16 17:59:01 access kernel: [510640.620747] 7ea0: 80751ccc 80751bf4 ad397ef4 ad397eb8 802e43f0 80751cb8 00000000 00000000
Aug 16 17:59:01 access kernel: [510640.633918] 7ec0: 8010cce8 ad0006c8 ad0006c0 00000000 b3c160b0 b3c15b80 b3c160d4 80dab2c0
Aug 16 17:59:01 access kernel: [510640.647206] 7ee0: ad0006c0 00000006 ad397f04 ad397ef8 802e459c 802e4358 ad397f2c ad397f08
Aug 16 17:59:01 access kernel: [510640.660620] 7f00: 801408e4 802e4590 ad396000 ad397fb0 80d04f48 80d04f4c 00000004 801011c4
Aug 16 17:59:01 access kernel: [510640.674165] 7f20: ad397fac ad397f30 8010cce8 80140838 ad397f4c ad397f40 802e4b1c 802e4a30
Aug 16 17:59:01 access kernel: [510640.687762] 7f40: ad397f6c ad397f50 802ddeb8 802e4b0c 00000052 00000000 b4335600 00000000
Aug 16 17:59:01 access kernel: [510640.701356] 7f60: ad397f94 ad397f70 80304e0c 802dde54 01614360 01614150 01615b18 9ff167fd
Aug 16 17:59:01 access kernel: [510640.714947] 7f80: 801011c4 01614360 01614150 01615b18 00000006 801011c4 ad396000 00000006
Aug 16 17:59:01 access kernel: [510640.728540] 7fa0: 00000000 ad397fb0 80101034 8010c884 00000000 00000000 000011f0 00000000
Aug 16 17:59:01 access kernel: [510640.742130] 7fc0: 01614360 01614150 01615b18 00000006 76ea3c04 00000001 00000000 00000009
Aug 16 17:59:01 access kernel: [510640.755714] 7fe0: 76ecd4a0 7ea20418 76e8d2f8 76b8c2f0 60000010 00000000 00000000 00000000
Aug 16 17:59:01 access kernel: [510640.769284] Backtrace:
Aug 16 17:59:01 access kernel: [510640.774544] [<7f1f3dd0>] (sctp_ulpevent_free [sctp]) from [<7f1f3ea4>] (sctp_queue_purge_ulpevents+0x34/0x50 [sctp])
Aug 16 17:59:01 access kernel: [510640.790327] r6:b1ea1300 r5:b1ec2a10 r4:00000000 r3:00000000
Aug 16 17:59:01 access kernel: [510640.798799] [<7f1f3e70>] (sctp_queue_purge_ulpevents [sctp]) from [<7f1fb208>] (sctp_close+0x68/0x2a0 [sctp])
Aug 16 17:59:01 access kernel: [510640.813882] r5:80d87e00 r4:b1ec2600
Aug 16 17:59:01 access kernel: [510640.820090] [<7f1fb1a0>] (sctp_close [sctp]) from [<80826f9c>] (inet_release+0x4c/0x80)
Aug 16 17:59:01 access kernel: [510640.833102] r10:00000008 r9:b082c2c0 r8:94d54bb0 r7:b64b9550 r6:7f21c380 r5:b082c240
Aug 16 17:59:01 access kernel: [510640.845916] r4:b1ec2600
Aug 16 17:59:01 access kernel: [510640.851117] [<80826f50>] (inet_release) from [<7f00a23c>] (inet6_release+0x3c/0x48 [ipv6])
Aug 16 17:59:01 access kernel: [510640.864202] r5:b082c240 r4:b1ec2600
Aug 16 17:59:01 access kernel: [510640.870320] [<7f00a200>] (inet6_release [ipv6]) from [<80751c38>] (__sock_release+0x50/0xc4)
Aug 16 17:59:01 access kernel: [510640.883425] r5:b082c340 r4:b082c240
Aug 16 17:59:01 access kernel: [510640.889316] [<80751be8>] (__sock_release) from [<80751ccc>] (sock_close+0x20/0x28)
Aug 16 17:59:01 access kernel: [510640.901384] r6:b082c2c0 r5:000e0003 r4:ad0006c0 r3:80751cac
Aug 16 17:59:01 access kernel: [510640.909443] [<80751cac>] (sock_close) from [<802e43f0>] (__fput+0xa4/0x238)
Aug 16 17:59:01 access kernel: [510640.918821] [<802e434c>] (__fput) from [<802e459c>] (____fput+0x18/0x1c)
Aug 16 17:59:01 access kernel: [510640.927899] r10:00000006 r9:ad0006c0 r8:80dab2c0 r7:b3c160d4 r6:b3c15b80 r5:b3c160b0
Aug 16 17:59:01 access kernel: [510640.940311] r4:00000000
Aug 16 17:59:01 access kernel: [510640.945087] [<802e4584>] (____fput) from [<801408e4>] (task_work_run+0xb8/0xdc)
Aug 16 17:59:01 access kernel: [510640.956925] [<8014082c>] (task_work_run) from [<8010cce8>] (do_work_pending+0x470/0x4f4)
Aug 16 17:59:01 access kernel: [510640.969577] r9:801011c4 r8:00000004 r7:80d04f4c r6:80d04f48 r5:ad397fb0 r4:ad396000
Aug 16 17:59:01 access kernel: [510640.981967] [<8010c878>] (do_work_pending) from [<80101034>] (slow_work_pending+0xc/0x20)
Aug 16 17:59:01 access kernel: [510640.994903] Exception stack(0xad397fb0 to 0xad397ff8)
Aug 16 17:59:01 access kernel: [510641.002430] 7fa0: 00000000 00000000 000011f0 00000000
Aug 16 17:59:01 access kernel: [510641.015375] 7fc0: 01614360 01614150 01615b18 00000006 76ea3c04 00000001 00000000 00000009
Aug 16 17:59:01 access kernel: [510641.028402] 7fe0: 76ecd4a0 7ea20418 76e8d2f8 76b8c2f0 60000010 00000000
Aug 16 17:59:01 access kernel: [510641.037533] r10:00000006 r9:ad396000 r8:801011c4 r7:00000006 r6:01615b18 r5:01614150
Aug 16 17:59:01 access kernel: [510641.050214] r4:01614360
Aug 16 17:59:01 access kernel: [510641.055157] Code: e590603c e3530000 0a000008 e5903084 (e5934008)


2020-08-17 13:48:03

by Marcelo Ricardo Leitner

[permalink] [raw]
Subject: Re: Oops on current Raspian when closing an SCTP connection

On Sun, Aug 16, 2020 at 06:06:24PM -0500, Corey Minyard wrote:
> I'm seeing the following when an SCTP connection terminates. This is on
> Raspian on a Raspberry Pi, version is Linux version 5.4.51-v7+. That's
> 32-bit ARM.
>
> I haven't looked into it yet, I thought I would report before trying to
> chase anything down. I'm not seeing it on 5.4 x86_64 systems.
>
> Aug 16 17:59:00 access kernel: [510640.326415] Unable to handle kernel NULL pointer dereference at virtual address 00000008
> Aug 16 17:59:00 access kernel: [510640.341624] pgd = c00fc16c
> Aug 16 17:59:00 access kernel: [510640.347834] [00000008] *pgd=355ef835, *pte=00000000, *ppte=00000000
> Aug 16 17:59:00 access kernel: [510640.357731] Internal error: Oops: 17 [#22] SMP ARM
> Aug 16 17:59:01 access kernel: [510640.365931] Modules linked in: md5 sctp ftdi_sio cp210x usbserial raspberrypi_hwmon bcm2835_codec(C) v4l2_mem2mem bcm2835_isp(C) bcm2835_v4l2(C) bcm2835_mmal_vchiq(C) videobuf2_vmalloc videobuf2_dma_contig videobuf2_memops videobuf2_v4l2 snd_bcm2835(C) videobuf2_common i2c_bcm2835 snd_pcm snd_timer videodev snd mc vc_sm_cma(C) uio_pdrv_genirq uio fixed nf_nat_pptp nf_conntrack_pptp nf_nat nf_conntrack nf_defrag_ipv4 rtc_ds1307 regmap_i2c i2c_dev ip_tables x_tables ipv6 nf_defrag_ipv6
> Aug 16 17:59:01 access kernel: [510640.425420] CPU: 1 PID: 4592 Comm: gtlsshd Tainted: G D C 5.4.51-v7+ #1327
> Aug 16 17:59:01 access kernel: [510640.438008] Hardware name: BCM2835
> Aug 16 17:59:01 access kernel: [510640.443823] PC is at sctp_ulpevent_free+0x38/0xa0 [sctp]
> Aug 16 17:59:01 access kernel: [510640.451498] LR is at sctp_queue_purge_ulpevents+0x34/0x50 [sctp]

Not ringing a bell here. Can you pinpoint on which line this crash
was? It seems, by the 0x8 offset and these function offsets, that this
could be when it was trying to access event->rmem_len, but if event
was NULL then it should have crashed earlier.

Marcelo

2020-08-17 22:41:36

by Corey Minyard

[permalink] [raw]
Subject: Re: Oops on current Raspian when closing an SCTP connection

On Mon, Aug 17, 2020 at 10:44:57AM -0300, Marcelo Ricardo Leitner wrote:
> On Sun, Aug 16, 2020 at 06:06:24PM -0500, Corey Minyard wrote:
> > I'm seeing the following when an SCTP connection terminates. This is on
> > Raspian on a Raspberry Pi, version is Linux version 5.4.51-v7+. That's
> > 32-bit ARM.
> >
> > I haven't looked into it yet, I thought I would report before trying to
> > chase anything down. I'm not seeing it on 5.4 x86_64 systems.
> >
> > Aug 16 17:59:01 access kernel: [510640.438008] Hardware name: BCM2835
> > Aug 16 17:59:01 access kernel: [510640.443823] PC is at sctp_ulpevent_free+0x38/0xa0 [sctp]
> > Aug 16 17:59:01 access kernel: [510640.451498] LR is at sctp_queue_purge_ulpevents+0x34/0x50 [sctp]
>
> Not ringing a bell here. Can you pinpoint on which line this crash
> was? It seems, by the 0x8 offset and these function offsets, that this
> could be when it was trying to access event->rmem_len, but if event
> was NULL then it should have crashed earlier.
>
> Marcelo

I think so:

00015e38 <sctp_ulpevent_free>:
15e38: e1a0c00d mov ip, sp
15e3c: e92dd878 push {r3, r4, r5, r6, fp, ip, lr, pc}
15e40: e24cb004 sub fp, ip, #4
15e44: e52de004 push {lr} ; (str lr, [sp, #-4]!)
15e48: ebfffffe bl 0 <__gnu_mcount_nc>

ulpevent.c:1102 if (sctp_ulpevent_is_notification(event))
15e4c: e1d032f0 ldrsh r3, [r0, #32]
15e50: e1a05000 mov r5, r0
15e54: e3530000 cmp r3, #0
15e58: ba000011 blt 15ea4 <sctp_ulpevent_free+0x6c>

This is the false branch from the above (high bit isn't set).

ulpevent.c:1061 if (!skb->data_len) (just the fetch part)
15e5c: e5903040 ldr r3, [r0, #64] ; 0x40

ulpevent.c:1059 len = skb->len;
15e60: e590603c ldr r6, [r0, #60] ; 0x3c

ulpevent.c:1061 if (!skb->data_len) (the compare part)
15e64: e3530000 cmp r3, #0
15e68: 0a000008 beq 15e90 <sctp_ulpevent_free+0x58>


ulpevent.c:1065 skb_walk_frags(skb, frag) {
skbuff.h:1395 return skb->end; (inside skb_shinfo())
15e6c: e5903084 ldr r3, [r0, #132] ; 0x84

skbuff.h:3461 iter = skb_shinfo(skb)->frag_list
15e70: e5934008 ldr r4, [r3, #8] <-- Crash occurs here

15e74: e3540000 cmp r4, #0
15e78: 0a000004 beq 15e90 <sctp_ulpevent_free+0x58>
15e7c: e2840018 add r0, r4, #24
15e80: ebfffa34 bl 14758 <sctp_ulpevent_release_frag_data>


So it appears that skb->end is NULL.

-corey