Subject: iwlwifi module crash

I am using the iwd daemon for wifi. Once in a while I lose connectivity. Restarting
the daemon does not help, but once I restart the system, it starts working fine.
I am attaching the stack trace from the journal.

Regards,
Bala


Attachments:
forwarded message (3.26 kB)
Denis Kenzior : Re: iwd crashes randomly
iwd_crash.log (13.72 kB)

2019-06-07 09:44:23

by Emmanuel Grumbach

Subject: Re: iwlwifi module crash

On Fri, Jun 7, 2019 at 5:22 AM Balakrishnan Balasubramanian
<[email protected]> wrote:
>
> I am using the iwd daemon for wifi. Once in a while I lose connectivity. Restarting
> the daemon does not help, but once I restart the system, it starts working fine.
> I am attaching the stack trace from the journal.

This is because the device was removed from the PCI bus. Nothing can be
done from the iwlwifi side.
If that happens upon suspend / resume, I know there have been fixes in
the PCI bus driver. If not, check that the device sits correctly in its
socket.

>
> Regards,
> Bala
>
>
> ---------- Forwarded message ----------
> From: Denis Kenzior <[email protected]>
> To: Balakrishnan Balasubramanian <[email protected]>, [email protected]
> Cc:
> Bcc:
> Date: Thu, 06 Jun 2019 18:07:40 -0500
> Subject: Re: iwd crashes randomly
> Hi Bala,
>
> On 06/06/2019 06:00 PM, Balakrishnan Balasubramanian wrote:
> > Sometimes after a week and sometimes after two days. Once crashed, restarting
> > the service does not help. Had to restart the computer. Attaching stack trace
> > from journal.
>
> That implies that your kernel is crashing, not iwd. The attached log
> shows a kernel stack trace somewhere inside the iwlwifi module. I would
> post this trace to [email protected].
>
> If you have an associated iwd backtrace, then certainly post this here,
> but if the kernel module is crashing, there isn't much we can do.
>
> Regards,
> -Denis

Subject: Re: iwlwifi module crash

> This is because the device is removed from the PCI bus. Nothing from
> iwlwifi side can be done.

I am sure the device has not been physically disturbed. If that were the case, would it not stay down after restarting the system?

> If that happens upon suspend / resume, I know there have been fixes in
> the PCI bus driver.

To my knowledge I have disabled all power/suspend features, and I don't see related logs in the journal except the one below. Not sure if it is relevant.

Jun 03 21:33:14 zadesk kernel: wlan0: Limiting TX power to 14 (17 - 3) dBm as advertised by d4:5d:df:25:ee:90

Is there a way to restart the module safely without restarting the system?

Regards,
Bala



2019-06-10 06:54:28

by Emmanuel Grumbach

Subject: Re: iwlwifi module crash

On Fri, Jun 7, 2019 at 2:41 PM Balakrishnan Balasubramanian
<[email protected]> wrote:
>
> > This is because the device is removed from the PCI bus. Nothing from
> > iwlwifi side can be done.
>
> I am sure the device is not physically disturbed. If that was the case, should it not stay down when restarting the system?

Not necessarily. The disturbance may affect ASPM or something similar.

>
> > If that happens upon suspend / resume, I know there have been fixes in
> > the PCI bus driver.
>
> To my knowledge I have disabled all power/suspend features, and I don't see related logs in the journal except the one below. Not sure if it is relevant.
>
> Jun 03 21:33:14 zadesk kernel: wlan0: Limiting TX power to 14 (17 - 3) dBm as advertised by d4:5d:df:25:ee:90
>
> Is there a way to restart the module safely without restarting the system?

echo 1 > /sys/module/iwlwifi/devices/0000\:02\:00.0/remove
echo 1 > /sys/bus/pci/rescan


Subject: Re: iwlwifi module crash

Jun 13 21:41:56 zadesk kernel: iwlwifi 0000:02:00.0: Failed to wake NIC for hcmd
Jun 13 21:41:56 zadesk kernel: iwlwifi 0000:02:00.0: Error sending SCAN_OFFLOAD_REQUEST_CMD: enqueue_hcmd failed: -5
Jun 13 21:41:56 zadesk kernel: iwlwifi 0000:02:00.0: Scan failed! ret -5
Jun 13 21:41:56 zadesk iwd[483]: Received error during CMD_TRIGGER_SCAN: Input/output error (5)


Attachments:
error (371.00 B)

2019-06-23 09:10:01

by bkil

Subject: Re: iwlwifi module crash

devices/ is probably just a symlink. Try to find it manually:
find /sys -iname remove
lspci
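
As a rough sketch of the manual search above: on Linux, devices bound to a PCI driver appear as entries named after their PCI address under /sys/bus/pci/drivers/<driver>/, and each device directory under /sys/bus/pci/devices/ has a remove node. The function names and the SYSFS_ROOT override below are illustrative assumptions, not something taken from this thread.

```shell
#!/bin/sh
# SYSFS_ROOT is a hypothetical override (not a kernel feature) so that the
# functions can be pointed at a scratch directory instead of the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Print the PCI address of every device currently bound to driver $1.
list_bound_devices() {
    driver="$1"
    for entry in "$SYSFS_ROOT/bus/pci/drivers/$driver"/*; do
        name=$(basename "$entry")
        # Bound devices look like 0000:02:00.0; control files such as
        # bind, unbind and new_id contain no colon and are skipped.
        case "$name" in
            *:*:*.*) echo "$name" ;;
        esac
    done
}

# Print the path of the remove node for the PCI address $1.
remove_node_for() {
    echo "$SYSFS_ROOT/bus/pci/devices/$1/remove"
}
```

For example, `list_bound_devices iwlwifi` would be expected to print an address such as 0000:02:00.0 while the card is bound, and nothing once it has dropped off the bus.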

The interesting thing is that my iwlwifi card started to do the same
thing just recently (some weeks ago). However, I do suspend a lot and
it only happens after resuming, but not after every resume (maybe
5-10%). It always came back after restarting except on one day when it
needed three restarts, so maybe mine would be more about needing to
reseat the card.

> On Fri, Jun 14, 2019 at 4:54 AM Balakrishnan Balasubramanian <[email protected]> wrote:
>>
>> The issue occurred again today. I tried to restart the module.
>>
>> > echo 1 > /sys/module/iwlwifi/devices/0000\:02\:00.0/remove
>>
>> There is no folder 'devices'
>>
>> zadesk% ls /sys/module/iwlwifi
>> coresize drivers holders initsize initstate notes parameters refcnt
>> sections srcversion taint uevent
>>
>> > echo 1 > /sys/bus/pci/rescan
>>
>> Attached the error when trying to rescan.
>>
>> Thanks,
>> Bala
>>
>>
>>
>>

Subject: Re: iwlwifi module crash

Thanks for the tip. On my system, the path to remove was:

/sys/devices/pci0000:00/0000:00:1c.2/0000:02:00.0/remove
Also symlinked here:
/sys/module/iwlwifi/drivers/pci:iwlwifi/0000:02:00.0/remove

I am now able to restore internet without system restart. Now I need to find a
way to do this automatically whenever internet goes down.

Thanks,
Bala
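
The remove-and-rescan sequence from this thread could be wrapped in a small function along the following lines. This is only a sketch: it uses /sys/bus/pci/devices/<address>, which is a symlink to the same device directory as the longer path quoted above, and the names reset_pci_device, SYSFS_ROOT and RESET_DELAY are made up for illustration. The one-second pause before the rescan is a guess, not a documented requirement. It has to run as root.

```shell
#!/bin/sh
# SYSFS_ROOT and RESET_DELAY are hypothetical knobs for this sketch.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Detach the PCI device at address $1 (e.g. 0000:02:00.0), then ask the
# kernel to rescan the bus so the device is enumerated again.
reset_pci_device() {
    addr="$1"
    remove_node="$SYSFS_ROOT/bus/pci/devices/$addr/remove"
    rescan_node="$SYSFS_ROOT/bus/pci/rescan"

    # If the device has already vanished from the bus there is nothing to
    # remove, so go straight to the rescan.
    if [ -e "$remove_node" ]; then
        echo 1 > "$remove_node" || return 1
    fi

    # Short pause before re-enumerating; the length is an assumption.
    sleep "${RESET_DELAY:-1}"
    echo 1 > "$rescan_node"
}
```

Calling `reset_pci_device 0000:02:00.0` would then correspond to the two manual echo commands.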


2019-06-25 22:19:47

by bkil

Subject: Re: iwlwifi module crash

Maybe we don't have the same issue, but today this solved it for
me: `rmmod iwlmvm; rmmod iwlwifi; modprobe iwlwifi`. Well, kind of. It
couldn't connect to any network afterwards, but after another
such dance it was finally back to normal.

The device disappears from iwconfig for me when this happens, so it is
easy to detect and correct with a few lines of shell. It can also
disappear from lspci. Another alternative would be to set up a
syslog-ng program destination triggered by a patterndb sample matching
the noted failures among the kernel messages.

If it gets stuck without a visible sign, you could watch for a sharp
increase in inactive time and tx failures in `iw dev wlan0 station dump`.
If you need a bit more reliability, it is easy to ping
the AP or gateway every second and refresh the connection after
too many consecutive missing replies.
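
The ping-based idea could look roughly like the sketch below. It counts consecutive missed replies and runs a recovery command once a threshold is reached. The names probe, watchdog_step, PING_CMD and MAX_FAILURES are invented for illustration, the threshold of 5 is arbitrary, and reload_driver simply repeats the rmmod/modprobe sequence quoted above, so it needs root.

```shell
#!/bin/sh
# Hypothetical knobs; none of these names come from the thread.
PING_CMD="${PING_CMD:-ping -c 1 -W 1}"   # one echo request, 1 s timeout
MAX_FAILURES="${MAX_FAILURES:-5}"        # consecutive misses before recovery
failures=0

# Return 0 if a single probe of host $1 is answered.
probe() {
    $PING_CMD "$1" >/dev/null 2>&1
}

# Reload the driver modules using the sequence mentioned in the thread.
reload_driver() {
    rmmod iwlmvm
    rmmod iwlwifi
    modprobe iwlwifi
}

# One watchdog step: probe host $1, track consecutive misses, and run the
# recovery command $2 once MAX_FAILURES misses in a row have been seen.
watchdog_step() {
    target="$1"
    recover="$2"
    if probe "$target"; then
        failures=0
    else
        failures=$((failures + 1))
        if [ "$failures" -ge "$MAX_FAILURES" ]; then
            $recover
            failures=0
        fi
    fi
}
```

A loop such as `while :; do watchdog_step 192.168.1.1 reload_driver; sleep 1; done` would then probe once per second; the gateway address here is a placeholder. Resetting the counter after each recovery attempt gives the link time to come back before another reload is tried.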
