2019-01-01 21:36:24

by Nathan Royce

[permalink] [raw]
Subject: kernel: xhci_hcd 0000:00:14.0: ERROR unknown event type 37 - Kernel 4.19.13

Kernel 4.19.13

00:14.0 USB controller: Intel Corporation 9 Series Chipset Family USB
xHCI Controller

Around 400 "unknown event type 37" messages logged in a 2 second span.
*****
Jan 01 02:08:07 computername tvheadend[2370]: linuxdvb: Auvitek AU8522
QAM/8VSB Frontend #0 : ATSC-T #0 - poll TIMEOUT
Jan 01 02:08:00 computername tvheadend[2370]: linuxdvb: Auvitek AU8522
QAM/8VSB Frontend #0 : ATSC-T #0 - poll TIMEOUT
Jan 01 02:07:56 computername kernel: xhci_hcd 0000:00:14.0: ERROR
unknown event type 37
...
Jan 01 02:07:55 computername kernel: xhci_hcd 0000:00:14.0: ERROR
unknown event type 37
Jan 01 02:07:55 computername kernel: xhci_hcd 0000:00:14.0: ERROR
unknown event type 37
Jan 01 02:07:52 computername tvheadend[2370]: linuxdvb: Auvitek AU8522
QAM/8VSB Frontend #0 : ATSC-T #0 - poll TIMEOUT
Jan 01 02:07:44 computername tvheadend[2370]: linuxdvb: Auvitek AU8522
QAM/8VSB Frontend #0 : ATSC-T #0 - poll TIMEOUT
*****

I question whether this also caused kemleak to crash as well (will
post after this).

Regarding my tv tuner, it isn't supported by the kernel specifically,
but is close enough that all I have to do is alter a single source
file to include my device's pid, and it works just fine almost all of
the time.


2019-01-01 23:08:30

by Nathan Royce

[permalink] [raw]
Subject: Re: kernel: xhci_hcd 0000:00:14.0: ERROR unknown event type 37 - Kernel 4.19.13

Looks like this particular issue may have been due to a touchy/finicky
connection.

I removed my tuner from my hub and removed the hub from my
motherboard's USB and put my tuner in directly.
It STILL produced the error, but after I put everything back and
played around a little, the errors stopped.

Just to be sure, I also rebooted and it's still fine. No xhci errors at all.
The only thing I've done recently (within the past few days) was play
with my scanner which is also on that hub and maybe brushed my tuner
cable or something.

On Tue, Jan 1, 2019 at 12:57 PM Nathan Royce <[email protected]> wrote:
>
> Kernel 4.19.13
>
> 00:14.0 USB controller: Intel Corporation 9 Series Chipset Family USB
> xHCI Controller
>
> Around 400 "unknown event type 37" messages logged in a 2 second span.
> *****
> Jan 01 02:08:07 computername tvheadend[2370]: linuxdvb: Auvitek AU8522
> QAM/8VSB Frontend #0 : ATSC-T #0 - poll TIMEOUT
> Jan 01 02:08:00 computername tvheadend[2370]: linuxdvb: Auvitek AU8522
> QAM/8VSB Frontend #0 : ATSC-T #0 - poll TIMEOUT
> Jan 01 02:07:56 computername kernel: xhci_hcd 0000:00:14.0: ERROR
> unknown event type 37
> ...
> Jan 01 02:07:55 computername kernel: xhci_hcd 0000:00:14.0: ERROR
> unknown event type 37
> Jan 01 02:07:55 computername kernel: xhci_hcd 0000:00:14.0: ERROR
> unknown event type 37
> Jan 01 02:07:52 computername tvheadend[2370]: linuxdvb: Auvitek AU8522
> QAM/8VSB Frontend #0 : ATSC-T #0 - poll TIMEOUT
> Jan 01 02:07:44 computername tvheadend[2370]: linuxdvb: Auvitek AU8522
> QAM/8VSB Frontend #0 : ATSC-T #0 - poll TIMEOUT
> *****
>
> I question whether this also caused kemleak to crash as well (will
> post after this).
>
> Regarding my tv tuner, it isn't supported by the kernel specifically,
> but is close enough that all I have to do is alter a single source
> file to include my device's pid, and it works just fine almost all of
> the time.

2019-01-02 10:16:33

by Nathan Royce

[permalink] [raw]
Subject: Re: kernel: xhci_hcd 0000:00:14.0: ERROR unknown event type 37 - Kernel 4.19.13

...But then again, maybe it wasn't the cable. It's acting up again.

On Tue, Jan 1, 2019 at 3:14 PM Nathan Royce <[email protected]> wrote:
>
> Looks like this particular issue may have been due to a touchy/finicky
> connection.
>
> I removed my tuner from my hub and removed the hub from my
> motherboard's USB and put my tuner in directly.
> It STILL produced the error, but after I put everything back and
> played around a little, the errors stopped.
>
> Just to be sure, I also rebooted and it's still fine. No xhci errors at all.
> The only thing I've done recently (within the past few days) was play
> with my scanner which is also on that hub and maybe brushed my tuner
> cable or something.

2019-01-02 15:03:06

by Mathias Nyman

[permalink] [raw]
Subject: Re: kernel: xhci_hcd 0000:00:14.0: ERROR unknown event type 37 - Kernel 4.19.13

On 01.01.2019 20:57, Nathan Royce wrote:
> Kernel 4.19.13
>
> 00:14.0 USB controller: Intel Corporation 9 Series Chipset Family USB
> xHCI Controller
>
> Around 400 "unknown event type 37" messages logged in a 2 second span.
> *****
> Jan 01 02:08:07 computername tvheadend[2370]: linuxdvb: Auvitek AU8522
> QAM/8VSB Frontend #0 : ATSC-T #0 - poll TIMEOUT
> Jan 01 02:08:00 computername tvheadend[2370]: linuxdvb: Auvitek AU8522
> QAM/8VSB Frontend #0 : ATSC-T #0 - poll TIMEOUT
> Jan 01 02:07:56 computername kernel: xhci_hcd 0000:00:14.0: ERROR
> unknown event type 37
> ...
> Jan 01 02:07:55 computername kernel: xhci_hcd 0000:00:14.0: ERROR
> unknown event type 37
> Jan 01 02:07:55 computername kernel: xhci_hcd 0000:00:14.0: ERROR
> unknown event type 37
> Jan 01 02:07:52 computername tvheadend[2370]: linuxdvb: Auvitek AU8522
> QAM/8VSB Frontend #0 : ATSC-T #0 - poll TIMEOUT
> Jan 01 02:07:44 computername tvheadend[2370]: linuxdvb: Auvitek AU8522
> QAM/8VSB Frontend #0 : ATSC-T #0 - poll TIMEOUT
> *****
>
> I question whether this also caused kemleak to crash as well (will
> post after this).
>
> Regarding my tv tuner, it isn't supported by the kernel specifically,
> but is close enough that all I have to do is alter a single source
> file to include my device's pid, and it works just fine almost all of
> the time.
>

The event type 37 is a host controller event, most likely a event ring full error.

So there are probably so many events that we fill the event ring before we can handle them.

Could you take traces of this?
Note that the trace file will be huge.

mount -t debugfs none /sys/kernel/debug
echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable

copy the traces somewhere safe once the error is triggered:
cp /sys/kernel/debug/tracing/trace /<somewhere>

-Mathias

2019-01-06 19:34:43

by Nathan Royce

[permalink] [raw]
Subject: Re: kernel: xhci_hcd 0000:00:14.0: ERROR unknown event type 37 - Kernel 4.19.13

I'm only posting to say I'm still waiting...
The error came up while I slept, and when I copied that log and looked
at it (yes, it WAS huge, just as you said), the timestamps at the
head/tail were much later than the journal logged times.
So I made a little script to monitor the journal kernel entries for
that message and have it copy the file after maybe 5 seconds. And now,
I'm just waiting for that error to occur again.

On Wed, Jan 2, 2019 at 5:32 AM Mathias Nyman
<[email protected]> wrote:
>
> The event type 37 is a host controller event, most likely a event ring full error.
>
> So there are probably so many events that we fill the event ring before we can handle them.
>
> Could you take traces of this?
> Note that the trace file will be huge.
>
> mount -t debugfs none /sys/kernel/debug
> echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
> echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable
>
> copy the traces somewhere safe once the error is triggered:
> cp /sys/kernel/debug/tracing/trace /<somewhere>
>
> -Mathias

2019-01-09 03:10:26

by Nathan Royce

[permalink] [raw]
Subject: Re: kernel: xhci_hcd 0000:00:14.0: ERROR unknown event type 37 - Kernel 4.19.13

OK, I finally got one... but there was only 1 journal log entry. The
previous time there were like maybe 10 (also very little), but the 2
times before that had enough for me to have to page through the log.

I actually messed up on a variable in my script so missed the actual
time, but the trace still encompassed entries around the log entry
time when I copied it manually.
I fixed the script, tested it and have it running again for the next time.

GZip compressed the 1GB trace down to 43MB, but PLZip got it down to
19.5MB: https://1drv.ms/f/s!AkkOvekTOCrYn0kEFtJzreV7gCTD
All 3 files are from the same trace, but wanted to give you options in
case you didn't have plzip.
The journal entry (time): [501180.585516] computername kernel:
xhci_hcd 0000:00:14.0: ERROR unknown event type 37

On Wed, Jan 2, 2019 at 5:32 AM Mathias Nyman
<[email protected]> wrote:
>
> The event type 37 is a host controller event, most likely a event ring full error.
>
> So there are probably so many events that we fill the event ring before we can handle them.
>
> Could you take traces of this?
> Note that the trace file will be huge.
>
> mount -t debugfs none /sys/kernel/debug
> echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
> echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable
>
> copy the traces somewhere safe once the error is triggered:
> cp /sys/kernel/debug/tracing/trace /<somewhere>
>
> -Mathias

2019-01-09 10:41:44

by Nathan Royce

[permalink] [raw]
Subject: Re: kernel: xhci_hcd 0000:00:14.0: ERROR unknown event type 37 - Kernel 4.19.13

You can ignore the last set of files with a sample of 1.
I got a nice sample of like 150 about 6 hours ago.
The link I included in the previous reply contains the same filenames,
just updated.
The journal timestamps (to correspond with the trace times) go from
"[513438.430253] computername kernel: xhci_hcd 0000:00:14.0: ERROR
unknown event type 37"
to
"[513438.796965] computername kernel: xhci_hcd 0000:00:14.0: ERROR
unknown event type 37"
That's 150 of them in less than 1/2 second.

On Wed, Jan 2, 2019 at 5:32 AM Mathias Nyman
<[email protected]> wrote:
>
> The event type 37 is a host controller event, most likely a event ring full error.
>
> So there are probably so many events that we fill the event ring before we can handle them.
>
> Could you take traces of this?
> Note that the trace file will be huge.
>
> mount -t debugfs none /sys/kernel/debug
> echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
> echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable
>
> copy the traces somewhere safe once the error is triggered:
> cp /sys/kernel/debug/tracing/trace /<somewhere>
>
> -Mathias

2019-01-09 22:43:07

by Nathan Royce

[permalink] [raw]
Subject: Re: kernel: xhci_hcd 0000:00:14.0: ERROR unknown event type 37 - Kernel 4.19.13

Wow, my system got wrecked (exaggeration) during this latest stretch...
Pulseaudio was stretched to the limit and beyond and was forced to
restart. Anything that was producing audio had to be restarted to get
it back.
This time was much like the first time and went from timestamp
573100.060927 (line 1) to 572506.604155 (line 11069), where 100%
(literally) of it was that event 37 in the journal, no other kernel
log entries except for the systemd-hostnamed audit before it all went
down.
And as usual, it was my USB TV tuner (tvheadend really) giving the
Poll Timeout log entries.
Those same uploaded trace files will be updated with the latest bugout.

On Wed, Jan 2, 2019 at 5:32 AM Mathias Nyman
<[email protected]> wrote:
>
> The event type 37 is a host controller event, most likely a event ring full error.
>
> So there are probably so many events that we fill the event ring before we can handle them.
>
> Could you take traces of this?
> Note that the trace file will be huge.
>
> mount -t debugfs none /sys/kernel/debug
> echo 81920 > /sys/kernel/debug/tracing/buffer_size_kb
> echo 1 > /sys/kernel/debug/tracing/events/xhci-hcd/enable
>
> copy the traces somewhere safe once the error is triggered:
> cp /sys/kernel/debug/tracing/trace /<somewhere>
>
> -Mathias

2019-01-24 15:53:54

by Mathias Nyman

[permalink] [raw]
Subject: Re: kernel: xhci_hcd 0000:00:14.0: ERROR unknown event type 37 - Kernel 4.19.13

On 10.01.2019 00:11, Nathan Royce wrote:
> Wow, my system got wrecked (exaggeration) during this latest stretch...
> Pulseaudio was stretched to the limit and beyond and was forced to
> restart. Anything that was producing audio had to be restarted to get
> it back.
> This time was much like the first time and went from timestamp
> 573100.060927 (line 1) to 572506.604155 (line 11069), where 100%
> (literally) of it was that event 37 in the journal, no other kernel
> log entries except for the systemd-hostnamed audit before it all went
> down.
> And as usual, it was my USB TV tuner (tvheadend really) giving the
> Poll Timeout log entries.
> Those same uploaded trace files will be updated with the latest bugout.
>

Hi.

Finally had a chance to look at this. Sorry about the delay.

Logs show event ring is full:

573047.104801: xhci_handle_event: EVENT: TRB 0000000000000000 status 'Event Ring Full Error' len 0 slot 0 ep 0 type 'Host Controller Event' flags e:C

It's filled with 0 length short transfer events due to a the following loop:

1. Class driver asks for 58658 bytes from device (queues BULK IN URB)
2. short transfer event, xhci interrupts, saying we got less than 58658 bytes.
We actually got 0 bytes.
3. return URB with zero bytes to class driver
4. Class driver immediately queues a new URB, asking for 58658 bytes (see step 1)

Last 6ms before event ring is full this looped 255 times.

one cycle of the loop in log:

573047.104748: xhci_handle_event: EVENT: TRB 000000020b267770 status 'Short Packet' len 58658 slot 4 ep 7 type 'Transfer Event' flags e:C
573047.104749: xhci_handle_transfer: BULK: Buffer 000000001de20000 length 58658 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:c
573047.104749: xhci_inc_deq: BULK 0000000002f14758: enq 0x000000020b267860(0x000000020b267000) deq 0x000000020b267780(0x000000020b267000) segs 2 stream 0 free_trbs 495 bounce 512 cycle 0
573047.104752: xhci_urb_giveback: ep3in-bulk: urb 000000000bdcbe77 pipe 3225519232 slot 4 length 0/58658 sgs 0/0 stream 0 flags 00010200
573047.104758: xhci_urb_enqueue: ep3in-bulk: urb 000000000bdcbe77 pipe 3225519232 slot 4 length 0/58658 sgs 0/0 stream 0 flags 00010200
573047.104758: xhci_queue_trb: BULK: Buffer 000000001de20000 length 58658 TD size 0 intr 0 type 'Normal' flags b:i:I:c:s:I:e:C
573047.104759: xhci_inc_enq: BULK 0000000002f14758: enq 0x000000020b267870(0x000000020b267000) deq 0x000000020b267780(0x000000020b267000) segs 2 stream 0 free_trbs 494 bounce 512 cycle 0
573047.104759: xhci_inc_deq: EVENT 0000000022f906c2: enq 0x000000020b317000(0x000000020b317000) deq 0x000000020b3178d0(0x000000020b317000) segs 1 stream 0 free_trbs 254 bounce 0 cycle 1

I'll continue digging into this

-Mathias