2018-12-11 18:06:43

by Ricardo Salveti

[permalink] [raw]
Subject: wlcore getting stuck on hikey after the runtime PM autosuspend support change

Hey Tony and John,

I just got to test an OpenEmbedded-based rootfs with kernel
4.19/4.20-rc6 on a HiKey board and wlcore is constantly getting stuck
right after boot (via NetworkManager).

As this works just fine with 4.18, I did a quick bisect and found that
the patch that enables runtime PM autosuspend support (9b71578de0) is
the one that made the hang to happen.

The hang trace with 4.20-rc6:
[ 484.321030] INFO: task NetworkManager:599 blocked for more than 120 seconds.
[ 484.328324] Not tainted 4.20.0-rc6 #1
[ 484.334057] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 484.342182] NetworkManager D 0 599 1 0x00000008
[ 484.347724] Call trace:
[ 484.350200] __switch_to+0xa0/0xf8
[ 484.353647] __schedule+0x2ac/0x948
[ 484.357158] schedule+0x38/0x98
[ 484.360318] schedule_timeout+0x288/0x458
[ 484.364368] wait_for_common+0x148/0x170
[ 484.368310] wait_for_completion+0x28/0x38
[ 484.372430] mmc_wait_for_req_done+0x38/0x198
[ 484.376806] mmc_wait_for_req+0xb0/0xf0
[ 484.380664] mmc_io_rw_extended+0x1d0/0x2c0
[ 484.384866] sdio_io_rw_ext_helper+0x180/0x1f8
[ 484.389356] sdio_memcpy_toio+0x44/0x58
[ 484.393216] wl12xx_sdio_raw_write+0xe0/0x1b0
[ 484.397596] wlcore_boot_upload_firmware+0x1a8/0x4c0
[ 484.402582] wl18xx_boot+0x7dc/0xbc0
[ 484.406181] wl1271_op_add_interface+0x558/0x910
[ 484.410842] drv_add_interface+0x5c/0x1e8
[ 484.414876] ieee80211_do_open+0x220/0x7f8
[ 484.418992] ieee80211_open+0x4c/0x68
[ 484.422697] __dev_open+0xdc/0x158
[ 484.426119] __dev_change_flags+0x15c/0x1c0
[ 484.430326] dev_change_flags+0x34/0x70
[ 484.434198] do_setlink+0x28c/0xba8
[ 484.437709] rtnl_newlink+0x408/0x768
[ 484.441392] rtnetlink_rcv_msg+0x12c/0x338
[ 484.445510] netlink_rcv_skb+0x60/0x120
[ 484.449365] rtnetlink_rcv+0x28/0x38
[ 484.452961] netlink_unicast+0x194/0x210
[ 484.456902] netlink_sendmsg+0x1a0/0x348
[ 484.460847] sock_sendmsg+0x34/0x50
[ 484.464354] ___sys_sendmsg+0x288/0x2c8
[ 484.468234] __sys_sendmsg+0x7c/0xd0
[ 484.471814] __arm64_sys_sendmsg+0x2c/0x38
[ 484.475932] el0_svc_common+0x94/0xe8
[ 484.479635] el0_svc_handler+0x74/0x90
[ 484.483405] el0_svc+0x8/0xc

Since it seems the same driver and board combination is working fine
for John (with Android), I decided to take a look at what could be
causing this from the NetworkManager side and found that the MAC
address randomization during scan is what triggers the hang. If I
disable MAC address randomization in NetworkManager
(wifi.scan-rand-mac-address=no) it works fine, so I wonder if there is
a possible suspend/resume logic issue with the if up -> change mac ->
scan flow.

John, did you have any similar issue on your test environment with
kernel >= 4.19?

I'm still trying to isolate this issue without NetworkManager to see
what exactly is causing the hang, but wanted to report this first in
case you guys have any idea about what could be causing the hang.

Thanks,
--
Ricardo Salveti de Araujo


2018-12-11 18:19:49

by Tony Lindgren

[permalink] [raw]
Subject: Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

Hi,

* Ricardo Salveti <[email protected]> [181211 18:06]:
> John, did you have any similar issue on your test environment with
> kernel >= 4.19?

This sounds like the same issue that got fixed in the dts
earlier?

f904390ac8b2 ("arm64: dts: hikey: Define wl1835 power capabilities")

Regards,

Tony

2018-12-11 18:53:21

by Ricardo Salveti

[permalink] [raw]
Subject: Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

Hi,

On Tue, Dec 11, 2018 at 4:19 PM Tony Lindgren <[email protected]> wrote:
>
> Hi,
>
> * Ricardo Salveti <[email protected]> [181211 18:06]:
> > John, did you have any similar issue on your test environment with
> > kernel >= 4.19?
>
> This sounds like the same issue that got fixed in the dts
> earlier?
>
> f904390ac8b2 ("arm64: dts: hikey: Define wl1835 power capabilities")

keep-power-in-suspend was removed in 8883ac1db3e2 ("arm64: dts: hikey:
Remove keep-power-in-suspend property"), but I just tested after
adding that property back in and made no difference.

The funny thing is that it works fine if I disable NetworkManager MAC
address randomization.

Thanks,
--
Ricardo Salveti de Araujo

2018-12-11 19:01:32

by Tony Lindgren

[permalink] [raw]
Subject: Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

* Ricardo Salveti <[email protected]> [181211 18:53]:
> Hi,
>
> On Tue, Dec 11, 2018 at 4:19 PM Tony Lindgren <[email protected]> wrote:
> >
> > Hi,
> >
> > * Ricardo Salveti <[email protected]> [181211 18:06]:
> > > John, did you have any similar issue on your test environment with
> > > kernel >= 4.19?
> >
> > This sounds like the same issue that got fixed in the dts
> > earlier?
> >
> > f904390ac8b2 ("arm64: dts: hikey: Define wl1835 power capabilities")
>
> keep-power-in-suspend was removed in 8883ac1db3e2 ("arm64: dts: hikey:
> Remove keep-power-in-suspend property"), but I just tested after
> adding that property back in and made no difference.

OK

> The funny thing is that it works fine if I disable NetworkManager MAC
> address randomization.

Hmm. I wonder if there's pm_runtime_get() and put missing
around some calls. Do you know the related iw commands
that happen there to try to reproduce this manually?

Regards,

Tony

2018-12-11 19:14:12

by John Stultz

[permalink] [raw]
Subject: Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

On Tue, Dec 11, 2018 at 10:06 AM Ricardo Salveti <[email protected]> wrote:
>
> Hey Tony and John,
>
> I just got to test an OpenEmbedded-based rootfs with kernel
> 4.19/4.20-rc6 on a HiKey board and wlcore is constantly getting stuck
> right after boot (via NetworkManager).
>
> As this works just fine with 4.18, I did a quick bisect and found that
> the patch that enables runtime PM autosuspend support (9b71578de0) is
> the one that made the hang to happen.
>
> The hang trace with 4.20-rc6:
> [ 484.321030] INFO: task NetworkManager:599 blocked for more than 120 seconds.
> [ 484.328324] Not tainted 4.20.0-rc6 #1
> [ 484.334057] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
> disables this message.
> [ 484.342182] NetworkManager D 0 599 1 0x00000008
> [ 484.347724] Call trace:
> [ 484.350200] __switch_to+0xa0/0xf8
> [ 484.353647] __schedule+0x2ac/0x948
> [ 484.357158] schedule+0x38/0x98
> [ 484.360318] schedule_timeout+0x288/0x458
> [ 484.364368] wait_for_common+0x148/0x170
> [ 484.368310] wait_for_completion+0x28/0x38
> [ 484.372430] mmc_wait_for_req_done+0x38/0x198
> [ 484.376806] mmc_wait_for_req+0xb0/0xf0
> [ 484.380664] mmc_io_rw_extended+0x1d0/0x2c0
> [ 484.384866] sdio_io_rw_ext_helper+0x180/0x1f8
> [ 484.389356] sdio_memcpy_toio+0x44/0x58
> [ 484.393216] wl12xx_sdio_raw_write+0xe0/0x1b0
> [ 484.397596] wlcore_boot_upload_firmware+0x1a8/0x4c0
> [ 484.402582] wl18xx_boot+0x7dc/0xbc0
> [ 484.406181] wl1271_op_add_interface+0x558/0x910
> [ 484.410842] drv_add_interface+0x5c/0x1e8
> [ 484.414876] ieee80211_do_open+0x220/0x7f8
> [ 484.418992] ieee80211_open+0x4c/0x68
> [ 484.422697] __dev_open+0xdc/0x158
> [ 484.426119] __dev_change_flags+0x15c/0x1c0
> [ 484.430326] dev_change_flags+0x34/0x70
> [ 484.434198] do_setlink+0x28c/0xba8
> [ 484.437709] rtnl_newlink+0x408/0x768
> [ 484.441392] rtnetlink_rcv_msg+0x12c/0x338
> [ 484.445510] netlink_rcv_skb+0x60/0x120
> [ 484.449365] rtnetlink_rcv+0x28/0x38
> [ 484.452961] netlink_unicast+0x194/0x210
> [ 484.456902] netlink_sendmsg+0x1a0/0x348
> [ 484.460847] sock_sendmsg+0x34/0x50
> [ 484.464354] ___sys_sendmsg+0x288/0x2c8
> [ 484.468234] __sys_sendmsg+0x7c/0xd0
> [ 484.471814] __arm64_sys_sendmsg+0x2c/0x38
> [ 484.475932] el0_svc_common+0x94/0xe8
> [ 484.479635] el0_svc_handler+0x74/0x90
> [ 484.483405] el0_svc+0x8/0xc
>
> Since it seems the same driver and board combination is working fine
> for John (with Android), I decided to take a look at what could be
> causing this from the NetworkManager side and found that the MAC
> address randomization during scan is what triggers the hang. If I
> disable MAC address randomization in NetworkManager
> (wifi.scan-rand-mac-address=no) it works fine, so I wonder if there is
> a possible suspend/resume logic issue with the if up -> change mac ->
> scan flow.
>
> John, did you have any similar issue on your test environment with
> kernel >= 4.19?
>
> I'm still trying to isolate this issue without NetworkManager to see
> what exactly is causing the hang, but wanted to report this first in
> case you guys have any idea about what could be causing the hang.

So this sounds very much like an issue Anders (cc'ed) was seeing in
the lab but not elsewhere. I think he had narrowed down to a similar
commit (my memory is foggy), but I think he hadn't isolated the
behavior to send details to the list.

I've not come across it myself, as the Android environment doesn't
seem to tickle this.

thanks
-john

2018-12-11 19:25:46

by Ricardo Salveti

[permalink] [raw]
Subject: Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

On Tue, Dec 11, 2018 at 5:01 PM Tony Lindgren <[email protected]> wrote:
>
> * Ricardo Salveti <[email protected]> [181211 18:53]:
> > Hi,
> >
> > On Tue, Dec 11, 2018 at 4:19 PM Tony Lindgren <[email protected]> wrote:
> > >
> > > Hi,
> > >
> > > * Ricardo Salveti <[email protected]> [181211 18:06]:
> > > > John, did you have any similar issue on your test environment with
> > > > kernel >= 4.19?
> > >
> > > This sounds like the same issue that got fixed in the dts
> > > earlier?
> > >
> > > f904390ac8b2 ("arm64: dts: hikey: Define wl1835 power capabilities")
> >
> > keep-power-in-suspend was removed in 8883ac1db3e2 ("arm64: dts: hikey:
> > Remove keep-power-in-suspend property"), but I just tested after
> > adding that property back in and made no difference.
>
> OK
>
> > The funny thing is that it works fine if I disable NetworkManager MAC
> > address randomization.
>
> Hmm. I wonder if there's pm_runtime_get() and put missing
> around some calls. Do you know the related iw commands
> that happen there to try to reproduce this manually?

Checking why this only happens with NetworkManager when MAC address
randomization is enabled, I noticed that NM performs the following
when starting:

Dec 11 18:59:04 hikey NetworkManager[1506]: <debug> [1544554744.3914]
device[0x5575703fe0] (wlan0): bringing up device 3
Dec 11 18:59:04 hikey NetworkManager[1506]: <debug> [1544554744.3915]
platform: link: setting up "wlan0" (3)
Dec 11 18:59:04 hikey NetworkManager[1506]: <debug> [1544554744.3939]
device[0x5575703fe0] (wlan0): taking down device 3
Dec 11 18:59:04 hikey NetworkManager[1506]: <debug> [1544554744.3940]
platform: link: setting down "wlan0" (3)
Dec 11 18:59:04 hikey NetworkManager[1506]: <info> [1544554744.4072]
device (wlan0): set-hw-addr: set MAC address to 22:D4:B1:49:B5:DB
(scanning)
Dec 11 18:59:04 hikey NetworkManager[1506]: <debug> [1544554744.4072]
device[0x5575703fe0] (wlan0): bringing up device 3
Dec 11 18:59:04 hikey NetworkManager[1506]: <debug> [1544554744.4073]
platform: link: setting up "wlan0" (3)

Then tried to reproduce with a simple 'while true; do ip link set dev
wlan0 down; ip link set dev wlan0 up; done;' and it was already enough
to cause the same hang. Adding a simple sleep 1 after down/up is
already enough to make it work, so something might be missing during
the down/up process that only happens when they get called right after
the other.

And it works fine with NetworkManager when I disable MAC address
randomization simply because it only brings the interface up once,
avoiding the hang.

Will enable debug and try to get a better trace.

Thanks,
--
Ricardo Salveti de Araujo

2018-12-11 19:51:14

by John Stultz

[permalink] [raw]
Subject: Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

On Tue, Dec 11, 2018 at 11:25 AM Ricardo Salveti <[email protected]> wrote:
>
> On Tue, Dec 11, 2018 at 5:01 PM Tony Lindgren <[email protected]> wrote:
> >
> > * Ricardo Salveti <[email protected]> [181211 18:53]:
> > > Hi,
> > >
> > > On Tue, Dec 11, 2018 at 4:19 PM Tony Lindgren <[email protected]> wrote:
> > > >
> > > > Hi,
> > > >
> > > > * Ricardo Salveti <[email protected]> [181211 18:06]:
> > > > > John, did you have any similar issue on your test environment with
> > > > > kernel >= 4.19?
> > > >
> > > > This sounds like the same issue that got fixed in the dts
> > > > earlier?
> > > >
> > > > f904390ac8b2 ("arm64: dts: hikey: Define wl1835 power capabilities")
> > >
> > > keep-power-in-suspend was removed in 8883ac1db3e2 ("arm64: dts: hikey:
> > > Remove keep-power-in-suspend property"), but I just tested after
> > > adding that property back in and made no difference.
> >
> > OK
> >
> > > The funny thing is that it works fine if I disable NetworkManager MAC
> > > address randomization.
> >
> > Hmm. I wonder if there's pm_runtime_get() and put missing
> > around some calls. Do you know the related iw commands
> > that happen there to try to reproduce this manually?
>
> Checking why this only happens with NetworkManager when MAC address
> randomization is enabled, I noticed that NM performs the following
> when starting:
>
> Dec 11 18:59:04 hikey NetworkManager[1506]: <debug> [1544554744.3914]
> device[0x5575703fe0] (wlan0): bringing up device 3
> Dec 11 18:59:04 hikey NetworkManager[1506]: <debug> [1544554744.3915]
> platform: link: setting up "wlan0" (3)
> Dec 11 18:59:04 hikey NetworkManager[1506]: <debug> [1544554744.3939]
> device[0x5575703fe0] (wlan0): taking down device 3
> Dec 11 18:59:04 hikey NetworkManager[1506]: <debug> [1544554744.3940]
> platform: link: setting down "wlan0" (3)
> Dec 11 18:59:04 hikey NetworkManager[1506]: <info> [1544554744.4072]
> device (wlan0): set-hw-addr: set MAC address to 22:D4:B1:49:B5:DB
> (scanning)
> Dec 11 18:59:04 hikey NetworkManager[1506]: <debug> [1544554744.4072]
> device[0x5575703fe0] (wlan0): bringing up device 3
> Dec 11 18:59:04 hikey NetworkManager[1506]: <debug> [1544554744.4073]
> platform: link: setting up "wlan0" (3)
>
> Then tried to reproduce with a simple 'while true; do ip link set dev
> wlan0 down; ip link set dev wlan0 up; done;' and it was already enough
> to cause the same hang. Adding a simple sleep 1 after down/up is
> already enough to make it work, so something might be missing during
> the down/up process that only happens when they get called right after
> the other.
>
> And it works fine with NetworkManager when I disable MAC address
> randomization simply because it only brings the interface up once,
> avoiding the hang.
>
> Will enable debug and try to get a better trace.

I'm not totally sure you're both seeing the same issue, but if its
helpful, here's the bug that Anders was working:
https://bugs.linaro.org/show_bug.cgi?id=3960

There's some mixing of a separate USB issue that has since been
resolved. But from the notes in that bug, Anders had possibly isolated
the wlcore issue down to the following commit:
https://git.linaro.org/people/anders.roxell/linux.git/commit/?h=tests-lwcore&id=598bfffb593d3fd0e31e790d604b44c9c5df368e

I wonder if something in that pm_runtime change effects this?

thanks
-john

2018-12-11 20:12:26

by Tony Lindgren

[permalink] [raw]
Subject: Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

* John Stultz <[email protected]> [181211 19:51]:
> On Tue, Dec 11, 2018 at 11:25 AM Ricardo Salveti <[email protected]> wrote:
> > Then tried to reproduce with a simple 'while true; do ip link set dev
> > wlan0 down; ip link set dev wlan0 up; done;' and it was already enough
> > to cause the same hang. Adding a simple sleep 1 after down/up is
> > already enough to make it work, so something might be missing during
> > the down/up process that only happens when they get called right after
> > the other.

With while true test above, I'm getting:

wlcore: ERROR timeout waiting for the hardware to complete initialization

Then after ctrl-c, wlan0 connects to the access point just fine
for me.

Is that what you're seeing or some oops?

> > And it works fine with NetworkManager when I disable MAC address
> > randomization simply because it only brings the interface up once,
> > avoiding the hang.
> >
> > Will enable debug and try to get a better trace.
>
> I'm not totally sure you're both seeing the same issue, but if its
> helpful, here's the bug that Anders was working:
> https://bugs.linaro.org/show_bug.cgi?id=3960
>
> There's some mixing of a separate USB issue that has since been
> resolved. But from the notes in that bug, Anders had possibly isolated
> the wlcore issue down to the following commit:
> https://git.linaro.org/people/anders.roxell/linux.git/commit/?h=tests-lwcore&id=598bfffb593d3fd0e31e790d604b44c9c5df368e
>
> I wonder if something in that pm_runtime change effects this?

Maybe yeah. Anyways, should be trivial to fix now once we figure
out how to reproduce it manually :)

Regards,

Tony

2018-12-11 20:24:26

by Ricardo Salveti

[permalink] [raw]
Subject: Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

On Tue, Dec 11, 2018 at 6:12 PM Tony Lindgren <[email protected]> wrote:
>
> * John Stultz <[email protected]> [181211 19:51]:
> > On Tue, Dec 11, 2018 at 11:25 AM Ricardo Salveti <[email protected]> wrote:
> > > Then tried to reproduce with a simple 'while true; do ip link set dev
> > > wlan0 down; ip link set dev wlan0 up; done;' and it was already enough
> > > to cause the same hang. Adding a simple sleep 1 after down/up is
> > > already enough to make it work, so something might be missing during
> > > the down/up process that only happens when they get called right after
> > > the other.
>
> With while true test above, I'm getting:
>
> wlcore: ERROR timeout waiting for the hardware to complete initialization
>
> Then after ctrl-c, wlan0 connects to the access point just fine
> for me.
>
> Is that what you're seeing or some oops?

I'm getting the following oops with the while true test:
[ 1088.565996] INFO: task ifconfig:941 blocked for more than 120 seconds.
[ 1088.572552] Not tainted 4.20.0-rc6-lmp-standard #1
[ 1088.577913] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[ 1088.585777] ifconfig D 0 941 916 0x00000008
[ 1088.591316] Call trace:
[ 1088.593772] __switch_to+0xa0/0xf8
[ 1088.597200] __schedule+0x2ac/0x948
[ 1088.600715] schedule+0x38/0x98
[ 1088.603882] schedule_timeout+0x288/0x458
[ 1088.607918] wait_for_common+0x148/0x170
[ 1088.611866] wait_for_completion+0x28/0x38
[ 1088.616013] mmc_wait_for_req_done+0x38/0x198
[ 1088.620397] mmc_wait_for_req+0xb0/0xf0
[ 1088.624259] mmc_io_rw_extended+0x1d0/0x2c0
[ 1088.628469] sdio_io_rw_ext_helper+0x180/0x1f8
[ 1088.633112] sdio_memcpy_toio+0x44/0x58
[ 1088.636979] wl12xx_sdio_raw_write+0xe0/0x1b0
[ 1088.641386] wlcore_boot_upload_firmware+0x1a8/0x4c0
[ 1088.646378] wl18xx_boot+0x7dc/0xbc0
[ 1088.649986] wl1271_op_add_interface+0x558/0x910
[ 1088.654659] drv_add_interface+0x5c/0x1e8
[ 1088.658716] ieee80211_do_open+0x220/0x7f8
[ 1088.662857] ieee80211_open+0x4c/0x68
[ 1088.666591] __dev_open+0xdc/0x158
[ 1088.670034] __dev_change_flags+0x15c/0x1c0
[ 1088.674246] dev_change_flags+0x34/0x70
[ 1088.678131] devinet_ioctl+0x3fc/0x740
[ 1088.681908] inet_ioctl+0x108/0x278
[ 1088.685425] sock_do_ioctl+0x5c/0x240
[ 1088.689133] sock_ioctl+0x1e8/0x3f0
[ 1088.692652] do_vfs_ioctl+0xc4/0x868
[ 1088.696253] ksys_ioctl+0x84/0xb8
[ 1088.699594] __arm64_sys_ioctl+0x28/0x38
[ 1088.703545] el0_svc_common+0x94/0xe8
[ 1088.707233] el0_svc_handler+0x74/0x90
[ 1088.711031] el0_svc+0x8/0xc

Added some more debugging, and this is what I'm getting when I add
some delay between down/up (working behavior):

wlan0 down:
Dec 11 20:17:06 hikey kernel: wlcore: mac80211 remove interface
Dec 11 20:17:06 hikey kernel: wlcore: mac80211 configure filter
changed 0 total 80000000
Dec 11 20:17:06 hikey kernel: wlcore: down
Dec 11 20:17:06 hikey kernel: wlcore: waking up chip from elp
Dec 11 20:17:06 hikey kernel: wlcore: wakeup time: 12 ms
Dec 11 20:17:06 hikey kernel: wlcore: cmd role disable
Dec 11 20:17:06 hikey kernel: wlcore: mac80211 stop
Dec 11 20:17:06 hikey kernel: wlcore: IRQ work
Dec 11 20:17:06 hikey kernel: wlcore: intr: 0x20 (fw_rx_counter = 172,
drv_rx_counter = 11, tx_results_counter = 0)
Dec 11 20:17:06 hikey kernel: wlcore: WL1271_ACX_INTR_HW_AVAILABLE

wlan0 up:
Dec 11 20:17:11 hikey kernel: wlcore: mac80211 start
Dec 11 20:17:11 hikey kernel: wlcore: mac80211 add interface type 2
mac 84:dd:20:92:54:5b
Dec 11 20:17:11 hikey kernel: wlcore: allocating hw queue base: 0
Dec 11 20:17:11 hikey kernel: mmc_host mmc2: Bus speed (slot 0) =
24800000Hz (slot req 300000Hz, actual 295238HZ div = 42)
Dec 11 20:17:11 hikey kernel: mmc_host mmc2: Bus speed (slot 0) =
24800000Hz (slot req 25000000Hz, actual 24800000HZ div = 0)
Dec 11 20:17:12 hikey kernel: wlcore: clock freq 5 (13, 120, 0, 0, spit)
Dec 11 20:17:12 hikey kernel: wlcore: chip id 0x6030111
Dec 11 20:17:12 hikey kernel: wlcore: firmware chunks to be uploaded: 15
Dec 11 20:17:12 hikey kernel: wlcore: chunk 14 addr 0x0 len 120432
Dec 11 20:17:12 hikey kernel: wlcore: starting firmware upload
Dec 11 20:17:12 hikey kernel: wlcore: fw_data_len 120432 chunk_size 16384
Dec 11 20:17:12 hikey kernel: wlcore: uploading fw chunk
0x000000002d8a6a40 to 0x0
Dec 11 20:17:12 hikey kernel: wlcore: uploading fw chunk
0x0000000048dff62f to 0x4000
Dec 11 20:17:12 hikey kernel: wlcore: uploading fw chunk
0x00000000f9372750 to 0x8000
Dec 11 20:17:12 hikey kernel: wlcore: uploading fw chunk
0x00000000f2685ee4 to 0xc000
Dec 11 20:17:12 hikey kernel: wlcore: uploading fw chunk
0x00000000c4fc484d to 0x10000
Dec 11 20:17:12 hikey kernel: wlcore: uploading fw chunk
0x00000000a6d3c9bf to 0x14000
Dec 11 20:17:12 hikey kernel: wlcore: uploading fw chunk
0x000000007404dc0f to 0x18000
Dec 11 20:17:12 hikey kernel: wlcore: uploading fw last chunk (5744 B)
0x00000000d7beb1f6 to 0x1c000
...

And now with while true:
Dec 11 20:18:13 hikey kernel: wlcore: mac80211 configure filter
changed 0 total 80000000
Dec 11 20:18:13 hikey kernel: wlcore: waking up chip from elp
Dec 11 20:18:13 hikey kernel: wlcore: wakeup time: 8 ms
Dec 11 20:18:13 hikey kernel: wlcore: acx group address tbl
Dec 11 20:18:13 hikey kernel: wlcore: cmd configure (63)
Dec 11 20:18:13 hikey kernel: wlcore: mac80211 remove interface
Dec 11 20:18:13 hikey kernel: wlcore: down
Dec 11 20:18:13 hikey kernel: wlcore: cmd role disable
Dec 11 20:18:13 hikey kernel: wlcore: IRQ work
Dec 11 20:18:13 hikey kernel: wlcore: mac80211 stop
Dec 11 20:18:13 hikey kernel: wlcore: intr: 0x20 (fw_rx_counter = 0,
drv_rx_counter = 0, tx_results_counter = 0)
Dec 11 20:18:13 hikey kernel: wlcore: WL1271_ACX_INTR_HW_AVAILABLE

Dec 11 20:18:13 hikey kernel: wlcore: mac80211 start
Dec 11 20:18:13 hikey kernel: wlcore: mac80211 add interface type 2
mac 84:dd:20:92:54:5b
Dec 11 20:18:13 hikey kernel: wlcore: allocating hw queue base: 0
Dec 11 20:18:13 hikey kernel: wlcore: clock freq 5 (13, 120, 0, 0, spit)
Dec 11 20:18:13 hikey kernel: wlcore: chip id 0x6030111
Dec 11 20:18:13 hikey kernel: wlcore: firmware chunks to be uploaded: 15
Dec 11 20:18:13 hikey kernel: wlcore: chunk 14 addr 0x0 len 120432
Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 120432 chunk_size 16384
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x000000002d8a6a40 to 0x0
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x0000000048dff62f to 0x4000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x00000000f9372750 to 0x8000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x00000000f2685ee4 to 0xc000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x00000000c4fc484d to 0x10000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x00000000a6d3c9bf to 0x14000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x000000007404dc0f to 0x18000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw last chunk (5744 B)
0x00000000d7beb1f6 to 0x1c000
Dec 11 20:18:13 hikey kernel: wlcore: chunk 13 addr 0x100000 len 131072
Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 131072 chunk_size 16384
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x00000000c3cad69f to 0x100000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x00000000a9e4f338 to 0x104000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x000000003169e0a1 to 0x108000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x00000000cc5b7549 to 0x10c000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x000000009c0bb98a to 0x110000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x00000000f4d0b2c2 to 0x114000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x0000000026e1d203 to 0x118000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x00000000a3905e9c to 0x11c000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw last chunk (0 B)
0x00000000f0c022b6 to 0x120000
Dec 11 20:18:13 hikey kernel: wlcore: chunk 12 addr 0x20400000 len 52600
Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 52600 chunk_size 16384
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x000000004f52801e to 0x20400000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x0000000066503358 to 0x20404000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x00000000433fff1b to 0x20408000
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw last chunk (3448 B)
0x00000000a38c9bd9 to 0x2040c000
Dec 11 20:18:13 hikey kernel: wlcore: chunk 11 addr 0xb00000 len 536
Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 536 chunk_size 16384
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw last chunk (536 B)
0x00000000ced82f93 to 0xb00000
Dec 11 20:18:13 hikey kernel: wlcore: chunk 10 addr 0xb0ae40 len 28120
Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 28120 chunk_size 16384
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x00000000d6832925 to 0xb0ae40
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw last chunk (11736
B) 0x0000000037ae96d5 to 0xb0ee40
Dec 11 20:18:13 hikey kernel: wlcore: chunk 9 addr 0x80900000 len 61140
Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 61140 chunk_size 16384
Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
0x0000000096b9ba67 to 0x80900000 -------> hangs

See the missing 'mmc_host mmc2' messages, which probably indicates
that mmc_host wasn't properly set after the down/up sequence, causing
the hang when uploading the firmware.

Thanks,
--
Ricardo Salveti de Araujo

2018-12-11 20:45:31

by Ricardo Salveti

[permalink] [raw]
Subject: Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

On Tue, Dec 11, 2018 at 6:23 PM Ricardo Salveti <[email protected]> wrote:
>
> On Tue, Dec 11, 2018 at 6:12 PM Tony Lindgren <[email protected]> wrote:
> >
> > * John Stultz <[email protected]> [181211 19:51]:
> > > On Tue, Dec 11, 2018 at 11:25 AM Ricardo Salveti <[email protected]> wrote:
> > > > Then tried to reproduce with a simple 'while true; do ip link set dev
> > > > wlan0 down; ip link set dev wlan0 up; done;' and it was already enough
> > > > to cause the same hang. Adding a simple sleep 1 after down/up is
> > > > already enough to make it work, so something might be missing during
> > > > the down/up process that only happens when they get called right after
> > > > the other.
> >
> > With while true test above, I'm getting:
> >
> > wlcore: ERROR timeout waiting for the hardware to complete initialization
> >
> > Then after ctrl-c, wlan0 connects to the access point just fine
> > for me.
> >
> > Is that what you're seeing or some oops?

Tried the same while true test on my beaglebone black wireless and
also got a trace, but without hanging the kernel (sdio write failed
instead):

root@beaglebone-yocto:~# while true; do ifconfig wlan0 down; ifconfig
wlan0 up; done;
[ 31.467917] wlcore: down
[ 33.205670] ------------[ cut here ]------------
[ 33.211048] WARNING: CPU: 0 PID: 407 at
drivers/net/wireless/ti/wlcore/sdio.c:145
wl12xx_sdio_raw_write+0x134/0x1c0 [wlcore_sdio]
[ 33.226358] Modules linked in: arc4 wl18xx wlcore mac80211
musb_dsps musb_hdrc cppi41 udc_core snd_soc_hdmi_codec phy_am335x
phy_generic phy_am335x_control pm33xx snd_soc_simple_card
snd_soc_simple_card_utils wkup_m3_ipc wkup_m3_rproc remoteproc virtio
virtio_ring ti_emif_sram snd_soc_davinci_mcasp hci_uart snd_soc_edma
snd_soc_sdma btqca btbcm tilcdc btintel snd_soc_core snd_pcm_dmaengine
snd_pcm snd_timer omap_mailbox wlcore_sdio snd soundcore musb_am335x
rtc_omap at24 tda998x cpufreq_dt thermal_sys fuse bluetooth_6lowpan
nhc_udp nhc_dest nhc_fragment nhc_hop nhc_ipv6 nhc_mobility
nhc_routing 6lowpan
[ 33.292595] CPU: 0 PID: 407 Comm: ifconfig Not tainted
4.20.0-rc6-lmp-standard #1
[ 33.304730] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 33.311136] Backtrace:
[ 33.314930] [<c010f118>] (dump_backtrace) from [<c010f4b8>]
(show_stack+0x20/0x24)
[ 33.323540] r7:dc94cc10 r6:00000009 r5:00000000 r4:00000000
[ 33.329520] [<c010f498>] (show_stack) from [<c0a0f230>]
(dump_stack+0x24/0x28)
[ 33.338072] [<c0a0f20c>] (dump_stack) from [<c013a804>]
(__warn.part.3+0xcc/0xe4)
[ 33.346441] [<c013a738>] (__warn.part.3) from [<c013a99c>]
(warn_slowpath_null+0x54/0x5c)
[ 33.355623] r6:bf1393b4 r5:00000091 r4:bf13abe0
[ 33.360512] [<c013a948>] (warn_slowpath_null) from [<bf1393b4>]
(wl12xx_sdio_raw_write+0x134/0x1c0 [wlcore_sdio])
[ 33.372301] r6:00004000 r5:c0f03088 r4:dd69ec00
[ 33.377244] [<bf139280>] (wl12xx_sdio_raw_write [wlcore_sdio]) from
[<bf40b33c>] (wlcore_boot_upload_firmware+0x1ec/0x55c [wlcore])
[ 33.390554] r10:dca18000 r9:dcba4d60 r8:bf139280 r7:00014000
r6:bf41f480 r5:80904000
[ 33.399378] r4:00000001
[ 33.402673] [<bf40b150>] (wlcore_boot_upload_firmware [wlcore])
from [<bf436228>] (wl18xx_boot+0x8e4/0xd4c [wl18xx])
[ 33.414270] r10:00000000 r9:00000014 r8:c0cdf170 r7:bf41f480
r6:c0f03088 r5:00000000
[ 33.422993] r4:dcba4d60
[ 33.425722] [<bf435944>] (wl18xx_boot [wl18xx]) from [<bf3fba38>]
(wl1271_op_add_interface+0x580/0x9b0 [wlcore])
[ 33.437644] r9:00000003 r8:dcba4da0 r7:bf41f480 r6:c0f03088
r5:dcba4d60 r4:00000000
[ 33.446685] [<bf3fb4b8>] (wl1271_op_add_interface [wlcore]) from
[<bf345e10>] (drv_add_interface+0x58/0x1a8 [mac80211])
[ 33.458858] r10:00000000 r9:ffffffff r8:00000001 r7:de7eb4e0
r6:dcba4460 r5:dcba4460
[ 33.467670] r4:de7eb4e0
[ 33.470571] [<bf345db8>] (drv_add_interface [mac80211]) from
[<bf35d3a0>] (ieee80211_do_open+0x490/0x984 [mac80211])
[ 33.482807] r7:de7eb4e0 r6:dcba4460 r5:de7eb000 r4:de7eb4e8
[ 33.488965] [<bf35cf10>] (ieee80211_do_open [mac80211]) from
[<bf35d8e8>] (ieee80211_open+0x54/0x60 [mac80211])
[ 33.500593] r10:db2f2aa0 r9:00000000 r8:00001002 r7:c0f03088
r6:bf3a94e0 r5:c0f03088
[ 33.509482] r4:de7eb4e0
[ 33.512904] [<bf35d894>] (ieee80211_open [mac80211]) from
[<c07fa648>] (__dev_open+0xec/0x170)
[ 33.521946] r5:c0f03088 r4:de7eb000
[ 33.526686] [<c07fa55c>] (__dev_open) from [<c07faa68>]
(__dev_change_flags+0x174/0x1d4)
[ 33.535768] r6:00001043 r5:00000001 r4:de7eb000
[ 33.540656] [<c07fa8f4>] (__dev_change_flags) from [<c07faaf0>]
(dev_change_flags+0x28/0x58)
[ 33.550439] r9:00000000 r8:c0fa2060 r7:de7eb144 r6:00000000
r5:00001002 r4:de7eb000
[ 33.559205] [<c07faac8>] (dev_change_flags) from [<c0895d80>]
(devinet_ioctl+0x6d4/0x704)
[ 33.568345] r9:00000000 r8:c0fa2060 r7:dd0e4a0c r6:c0f03088
r5:dd1cbe08 r4:00000000
[ 33.576962] [<c08956ac>] (devinet_ioctl) from [<c0898048>]
(inet_ioctl+0x214/0x3c4)
[ 33.585575] r10:db2f2aa0 r9:00000004 r8:00000051 r7:c0fa2060
r6:00008914 r5:be9f7b0c
[ 33.594298] r4:c0f03088
[ 33.596979] [<c0897e34>] (inet_ioctl) from [<c07d4b14>]
(sock_ioctl+0x1d0/0x4e0)
[ 33.605773] r9:00000004 r8:c0fa2060 r7:be9f7b0c r6:dccc9b40
r5:c0f03088 r4:00008914
[ 33.614530] [<c07d4944>] (sock_ioctl) from [<c02e4b9c>]
(do_vfs_ioctl+0xc0/0x95c)
[ 33.622983] r9:00000004 r8:be9f7b0c r7:00000004 r6:dccc9b40
r5:be9f7b0c r4:c0f03088
[ 33.631144] [<c02e4adc>] (do_vfs_ioctl) from [<c02e54a4>]
(ksys_ioctl+0x6c/0x90)
[ 33.639762] r10:00000036 r9:00000004 r8:be9f7b0c r7:00008914
r6:00000000 r5:dccc9b40
[ 33.648502] r4:dccc9b40
[ 33.651177] [<c02e5438>] (ksys_ioctl) from [<c02e54e0>] (sys_ioctl+0x18/0x1c)
[ 33.659609] r9:dd1ca000 r8:c0101204 r7:00000036 r6:b6fdb900
r5:004d5e44 r4:be9f7b0c
[ 33.668370] [<c02e54c8>] (sys_ioctl) from [<c0101000>]
(ret_fast_syscall+0x0/0x54)
[ 33.676833] Exception stack(0xdd1cbfa8 to 0xdd1cbff0)
[ 33.682604] bfa0: be9f7b0c 004d5e44 00000004
00008914 be9f7b0c 00001002
[ 33.691187] bfc0: be9f7b0c 004d5e44 b6fdb900 00000036 be9f7bec
004d6384 004d5e44 be9f7efe
[ 33.700592] bfe0: 004d5ec4 be9f7afc 004bf0d1 b6f520f8
[ 33.706359] ---[ end trace 39415e1e5f535790 ]---
[ 33.711217] wl1271_sdio mmc2:0001:2: sdio write failed (-110)

Looks like it could be the same issue, but showing up differently on
hikey due the mmc controller used.

Thanks,
--
Ricardo Salveti de Araujo

2018-12-11 21:51:08

by Ricardo Salveti

[permalink] [raw]
Subject: Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

On Tue, Dec 11, 2018 at 6:23 PM Ricardo Salveti <[email protected]> wrote:
> And now with while true:
> Dec 11 20:18:13 hikey kernel: wlcore: mac80211 configure filter
> changed 0 total 80000000
> Dec 11 20:18:13 hikey kernel: wlcore: waking up chip from elp
> Dec 11 20:18:13 hikey kernel: wlcore: wakeup time: 8 ms
> Dec 11 20:18:13 hikey kernel: wlcore: acx group address tbl
> Dec 11 20:18:13 hikey kernel: wlcore: cmd configure (63)
> Dec 11 20:18:13 hikey kernel: wlcore: mac80211 remove interface
> Dec 11 20:18:13 hikey kernel: wlcore: down
> Dec 11 20:18:13 hikey kernel: wlcore: cmd role disable
> Dec 11 20:18:13 hikey kernel: wlcore: IRQ work
> Dec 11 20:18:13 hikey kernel: wlcore: mac80211 stop
> Dec 11 20:18:13 hikey kernel: wlcore: intr: 0x20 (fw_rx_counter = 0,
> drv_rx_counter = 0, tx_results_counter = 0)
> Dec 11 20:18:13 hikey kernel: wlcore: WL1271_ACX_INTR_HW_AVAILABLE
>
> Dec 11 20:18:13 hikey kernel: wlcore: mac80211 start
> Dec 11 20:18:13 hikey kernel: wlcore: mac80211 add interface type 2
> mac 84:dd:20:92:54:5b
> Dec 11 20:18:13 hikey kernel: wlcore: allocating hw queue base: 0
> Dec 11 20:18:13 hikey kernel: wlcore: clock freq 5 (13, 120, 0, 0, spit)
> Dec 11 20:18:13 hikey kernel: wlcore: chip id 0x6030111
> Dec 11 20:18:13 hikey kernel: wlcore: firmware chunks to be uploaded: 15
> Dec 11 20:18:13 hikey kernel: wlcore: chunk 14 addr 0x0 len 120432
> Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
> Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 120432 chunk_size 16384
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x000000002d8a6a40 to 0x0
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x0000000048dff62f to 0x4000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x00000000f9372750 to 0x8000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x00000000f2685ee4 to 0xc000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x00000000c4fc484d to 0x10000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x00000000a6d3c9bf to 0x14000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x000000007404dc0f to 0x18000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw last chunk (5744 B)
> 0x00000000d7beb1f6 to 0x1c000
> Dec 11 20:18:13 hikey kernel: wlcore: chunk 13 addr 0x100000 len 131072
> Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
> Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 131072 chunk_size 16384
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x00000000c3cad69f to 0x100000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x00000000a9e4f338 to 0x104000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x000000003169e0a1 to 0x108000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x00000000cc5b7549 to 0x10c000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x000000009c0bb98a to 0x110000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x00000000f4d0b2c2 to 0x114000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x0000000026e1d203 to 0x118000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x00000000a3905e9c to 0x11c000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw last chunk (0 B)
> 0x00000000f0c022b6 to 0x120000
> Dec 11 20:18:13 hikey kernel: wlcore: chunk 12 addr 0x20400000 len 52600
> Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
> Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 52600 chunk_size 16384
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x000000004f52801e to 0x20400000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x0000000066503358 to 0x20404000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x00000000433fff1b to 0x20408000
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw last chunk (3448 B)
> 0x00000000a38c9bd9 to 0x2040c000
> Dec 11 20:18:13 hikey kernel: wlcore: chunk 11 addr 0xb00000 len 536
> Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
> Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 536 chunk_size 16384
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw last chunk (536 B)
> 0x00000000ced82f93 to 0xb00000
> Dec 11 20:18:13 hikey kernel: wlcore: chunk 10 addr 0xb0ae40 len 28120
> Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
> Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 28120 chunk_size 16384
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x00000000d6832925 to 0xb0ae40
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw last chunk (11736
> B) 0x0000000037ae96d5 to 0xb0ee40
> Dec 11 20:18:13 hikey kernel: wlcore: chunk 9 addr 0x80900000 len 61140
> Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
> Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 61140 chunk_size 16384
> Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> 0x0000000096b9ba67 to 0x80900000 -------> hangs
>
> See the missing 'mmc_host mmc2' messages, which probably indicates
> that mmc_host wasn't properly set after the down/up sequence, causing
> the hang when uploading the firmware.

Tried to change back wl12xx_sdio_power_off to use pm_runtime_put_sync
as a test, and noticed I always get -EBUSY when reproducing the hang,
so it looks like this could be a race between pm_runtime_put/get when
doing if down/up (and the side effect on the mmc controller).

Cheers,
--
Ricardo Salveti de Araujo

2018-12-11 23:45:03

by Anders Roxell

[permalink] [raw]
Subject: Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

On Tue, 11 Dec 2018 at 22:51, Ricardo Salveti <[email protected]> wrote:
>
> On Tue, Dec 11, 2018 at 6:23 PM Ricardo Salveti <[email protected]> wrote:
> > And now with while true:
> > Dec 11 20:18:13 hikey kernel: wlcore: mac80211 configure filter
> > changed 0 total 80000000
> > Dec 11 20:18:13 hikey kernel: wlcore: waking up chip from elp
> > Dec 11 20:18:13 hikey kernel: wlcore: wakeup time: 8 ms
> > Dec 11 20:18:13 hikey kernel: wlcore: acx group address tbl
> > Dec 11 20:18:13 hikey kernel: wlcore: cmd configure (63)
> > Dec 11 20:18:13 hikey kernel: wlcore: mac80211 remove interface
> > Dec 11 20:18:13 hikey kernel: wlcore: down
> > Dec 11 20:18:13 hikey kernel: wlcore: cmd role disable
> > Dec 11 20:18:13 hikey kernel: wlcore: IRQ work
> > Dec 11 20:18:13 hikey kernel: wlcore: mac80211 stop
> > Dec 11 20:18:13 hikey kernel: wlcore: intr: 0x20 (fw_rx_counter = 0,
> > drv_rx_counter = 0, tx_results_counter = 0)
> > Dec 11 20:18:13 hikey kernel: wlcore: WL1271_ACX_INTR_HW_AVAILABLE
> >
> > Dec 11 20:18:13 hikey kernel: wlcore: mac80211 start
> > Dec 11 20:18:13 hikey kernel: wlcore: mac80211 add interface type 2
> > mac 84:dd:20:92:54:5b
> > Dec 11 20:18:13 hikey kernel: wlcore: allocating hw queue base: 0
> > Dec 11 20:18:13 hikey kernel: wlcore: clock freq 5 (13, 120, 0, 0, spit)
> > Dec 11 20:18:13 hikey kernel: wlcore: chip id 0x6030111
> > Dec 11 20:18:13 hikey kernel: wlcore: firmware chunks to be uploaded: 15
> > Dec 11 20:18:13 hikey kernel: wlcore: chunk 14 addr 0x0 len 120432
> > Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
> > Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 120432 chunk_size 16384
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x000000002d8a6a40 to 0x0
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x0000000048dff62f to 0x4000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x00000000f9372750 to 0x8000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x00000000f2685ee4 to 0xc000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x00000000c4fc484d to 0x10000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x00000000a6d3c9bf to 0x14000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x000000007404dc0f to 0x18000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw last chunk (5744 B)
> > 0x00000000d7beb1f6 to 0x1c000
> > Dec 11 20:18:13 hikey kernel: wlcore: chunk 13 addr 0x100000 len 131072
> > Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
> > Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 131072 chunk_size 16384
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x00000000c3cad69f to 0x100000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x00000000a9e4f338 to 0x104000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x000000003169e0a1 to 0x108000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x00000000cc5b7549 to 0x10c000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x000000009c0bb98a to 0x110000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x00000000f4d0b2c2 to 0x114000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x0000000026e1d203 to 0x118000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x00000000a3905e9c to 0x11c000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw last chunk (0 B)
> > 0x00000000f0c022b6 to 0x120000
> > Dec 11 20:18:13 hikey kernel: wlcore: chunk 12 addr 0x20400000 len 52600
> > Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
> > Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 52600 chunk_size 16384
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x000000004f52801e to 0x20400000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x0000000066503358 to 0x20404000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x00000000433fff1b to 0x20408000
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw last chunk (3448 B)
> > 0x00000000a38c9bd9 to 0x2040c000
> > Dec 11 20:18:13 hikey kernel: wlcore: chunk 11 addr 0xb00000 len 536
> > Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
> > Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 536 chunk_size 16384
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw last chunk (536 B)
> > 0x00000000ced82f93 to 0xb00000
> > Dec 11 20:18:13 hikey kernel: wlcore: chunk 10 addr 0xb0ae40 len 28120
> > Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
> > Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 28120 chunk_size 16384
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x00000000d6832925 to 0xb0ae40
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw last chunk (11736
> > B) 0x0000000037ae96d5 to 0xb0ee40
> > Dec 11 20:18:13 hikey kernel: wlcore: chunk 9 addr 0x80900000 len 61140
> > Dec 11 20:18:13 hikey kernel: wlcore: starting firmware upload
> > Dec 11 20:18:13 hikey kernel: wlcore: fw_data_len 61140 chunk_size 16384
> > Dec 11 20:18:13 hikey kernel: wlcore: uploading fw chunk
> > 0x0000000096b9ba67 to 0x80900000 -------> hangs
> >
> > See the missing 'mmc_host mmc2' messages, which probably indicates
> > that mmc_host wasn't properly set after the down/up sequence, causing
> > the hang when uploading the firmware.
>
> Tried to change back wl12xx_sdio_power_off to use pm_runtime_put_sync
> as a test, and noticed I always get -EBUSY when reproducing the hang,
> so it looks like this could be a race between pm_runtime_put/get when
> doing if down/up (and the side effect on the mmc controller).

I have different serious, one when our board hangs on udhcpc [1], then I see
wlcore and wl18xx_driver.

[ 8.824419] cfg80211: Loading compiled-in X.509 certificates for
regulatory database
[ 8.852546] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
[ 8.861006] platform regulatory.0: Direct firmware load for
regulatory.db failed with error -2
[ 8.870248] cfg80211: failed to load regulatory.db
[ 8.895501] asix 1-1.1:1.0 eth0: register 'asix' at
usb-f72c0000.usb-1.1, ASIX AX88772B USB 2.0 Ethernet,
00:50:b6:14:c8:76
[ 8.912285] usbcore: registered new interface driver asix
[ 9.053373] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot
req 100000Hz, actual 100000HZ div = 124)
[ 9.130886] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot
req 25000000Hz, actual 24800000HZ div = 0)
[ 9.149318] wl18xx_driver wl18xx.3.auto: Direct firmware load for
ti-connectivity/wl18xx-conf.bin failed with error -2
[ 9.160459] wlcore: ERROR could not get configuration binary
ti-connectivity/wl18xx-conf.bin: -2
[ 9.169615] wlcore: WARNING falling back to default config
[ 9.214108] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot
req 100000Hz, actual 100000HZ div = 124)
[ 9.294294] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot
req 25000000Hz, actual 24800000HZ div = 0)
[[0;32m OK [0m] Listening on Load/Save RF Kill Switch Status
/dev/rfkill Watch.
[[0;32m OK [0m] Reached target Sockets.
[[0;32m OK [0m] Reached target Basic System.
Starting Entropy Daemon based on the HAVEGE algorithm...
[[0;32m OK [0m] Started TEE Supplicant.
[ 9.514713] wlcore: wl18xx HW: 183x or 180x, PG 2.2 (ROM 0x11)
[ 9.526319] wlcore: WARNING Detected unconfigured mac address in
nvs, derive from fuse instead.
[ 9.535454] wlcore: WARNING This default nvs file can be removed
from the file system

and when it passes that stage I can't find wlcore nor wl18xx_driver.
However, I can see
wl1271_sdio failing instead. See [2]

[ 8.435849] Bluetooth: HCI UART protocol Broadcom registered
[ 8.439669] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot
req 200000Hz, actual 200000HZ div = 62)
[[0;32m OK [0m] Started dnf-automatic-download timer.
[ 8.469492] wl1271_sdio: probe of mmc2:0001:2 failed with error -16
[[0;32m OK [0m] Started Daily Cleanup of Temporary Directories.
[[0;32m OK [0m] Started dnf-automatic-install timer.
[ 8.525443] Console: switching to colour frame buffer device 256x72
[ 8.533489] Bluetooth: hci0: change remote baud rate command in firmware
[ 8.584607] kirin-drm f4100000.ade: fb0: DRM emulated frame buffer device
[[0;32m OK [0m] Started dnf-automatic-notifyonly timer.
[[0;32m OK [0m] Listening on R[ 8.614417] [drm] Initialized kirin
1.0.0 20150718 for f4100000.ade on minor 0

and another job that have the similar issue with the previous is [3]:

[ 8.670110] adv7511 2-0039: 2-0039 supply pvdd not found, using
dummy regulator
[ 8.677908] adv7511 2-0039: 2-0039 supply a2vdd not found, using
dummy regulator
[ 8.685788] adv7511 2-0039: 2-0039 supply v3p3 not found, using
dummy regulator
[ 8.688699] wl1271_sdio: probe of mmc2:0001:1 failed with error -16
[ 8.690154] Bluetooth: Core ver 2.22
[ 8.690346] NET: Registered protocol family 31
[ 8.690355] Bluetooth: HCI device and connection manager
initialized
[ 8.693495] adv7511 2-0039: 2-0039 supply v1p2 not found, using
dummy regulator
[ 8.700240] wl1271_sdio: probe of mmc2:0001:2 failed with error -16
[ 8.700992] Bluetooth: HCI socket layer initialized
[ 8.701022] Bluetooth: L2CAP socket layer initialized
[ 8.701554] Bluetooth: SCO socket layer initialized
Starting sshd.socket.
[[0;32m OK [0m] Listening on RPCbind Server Activation Socket.
[[0;32m OK [0m] Started Daily rotation of log files.
[ 8.791540] Bluetooth: HCI UART driver ver 2.3


I'm confused why we only see wlcore wl18xx in job [1] and that we see wl1271
in the other and that we are able to start running tests, even though
I don't think
the jobs where we run tests booted up OK, since we can see
"wl1271_sdio: probe of mmc2:0001:1 failed with error -16".

Cheers,
Anders

[1] https://lkft.validation.linaro.org/scheduler/job/543012
[2] https://lkft.validation.linaro.org/scheduler/job/506683
[3] https://lkft.validation.linaro.org/scheduler/job/487494

2018-12-12 01:25:44

by Tony Lindgren

[permalink] [raw]
Subject: Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

* Ricardo Salveti <[email protected]> [181211 21:51]:
> Tried to change back wl12xx_sdio_power_off to use pm_runtime_put_sync
> as a test, and noticed I always get -EBUSY when reproducing the hang,
> so it looks like this could be a race between pm_runtime_put/get when
> doing if down/up (and the side effect on the mmc controller).

Hmm so -EBUSY for pm_runtime means transition is already happening
and in many cases the test should be:

error = pm_runtime_get(dev);
if (error < 0 && error != -EBUSY) {
...
}
...
pm_runtime_put(dev);

Where are you getting it?

Regards,

Tony

2018-12-12 01:46:01

by Tony Lindgren

[permalink] [raw]
Subject: Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

* Ricardo Salveti <[email protected]> [181211 20:45]:
> On Tue, Dec 11, 2018 at 6:23 PM Ricardo Salveti <[email protected]> wrote:
> >
> > On Tue, Dec 11, 2018 at 6:12 PM Tony Lindgren <[email protected]> wrote:
> > >
> > > * John Stultz <[email protected]> [181211 19:51]:
> > > > On Tue, Dec 11, 2018 at 11:25 AM Ricardo Salveti <[email protected]> wrote:
> > > > > Then tried to reproduce with a simple 'while true; do ip link set dev
> > > > > wlan0 down; ip link set dev wlan0 up; done;' and it was already enough
> > > > > to cause the same hang. Adding a simple sleep 1 after down/up is
> > > > > already enough to make it work, so something might be missing during
> > > > > the down/up process that only happens when they get called right after
> > > > > the other.
> > >
> > > With while true test above, I'm getting:
> > >
> > > wlcore: ERROR timeout waiting for the hardware to complete initialization
> > >
> > > Then after ctrl-c, wlan0 connects to the access point just fine
> > > for me.
> > >
> > > Is that what you're seeing or some oops?
>
> Tried the same while true test on my beaglebone black wireless and
> also got a trace, but without hanging the kernel (sdio write failed
> instead):
>
> root@beaglebone-yocto:~# while true; do ifconfig wlan0 down; ifconfig
> wlan0 up; done;
> [ 31.467917] wlcore: down
> [ 33.205670] ------------[ cut here ]------------
> [ 33.211048] WARNING: CPU: 0 PID: 407 at
> drivers/net/wireless/ti/wlcore/sdio.c:145

Hmm OK so SDIO write failing. Adding also Eyal to Cc.

Regards,

Tony

2018-12-12 07:27:39

by Reizer, Eyal

[permalink] [raw]
Subject: RE: [EXTERNAL] Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

>
> * Ricardo Salveti <[email protected]> [181211 20:45]:
> > On Tue, Dec 11, 2018 at 6:23 PM Ricardo Salveti <[email protected]>
> wrote:
> > >
> > > On Tue, Dec 11, 2018 at 6:12 PM Tony Lindgren <[email protected]>
> wrote:
> > > >
> > > > * John Stultz <[email protected]> [181211 19:51]:
> > > > > On Tue, Dec 11, 2018 at 11:25 AM Ricardo Salveti
> <[email protected]> wrote:
> > > > > > Then tried to reproduce with a simple 'while true; do ip link set dev
> > > > > > wlan0 down; ip link set dev wlan0 up; done;' and it was already
> enough
> > > > > > to cause the same hang. Adding a simple sleep 1 after down/up is
> > > > > > already enough to make it work, so something might be missing
> during
> > > > > > the down/up process that only happens when they get called right
> after
> > > > > > the other.
> > > >
> > > > With while true test above, I'm getting:
> > > >
> > > > wlcore: ERROR timeout waiting for the hardware to complete
> initialization
> > > >
> > > > Then after ctrl-c, wlan0 connects to the access point just fine
> > > > for me.
> > > >
> > > > Is that what you're seeing or some oops?
> >
> > Tried the same while true test on my beaglebone black wireless and
> > also got a trace, but without hanging the kernel (sdio write failed
> > instead):
> >
> > root@beaglebone-yocto:~# while true; do ifconfig wlan0 down; ifconfig
> > wlan0 up; done;
> > [ 31.467917] wlcore: down
> > [ 33.205670] ------------[ cut here ]------------
> > [ 33.211048] WARNING: CPU: 0 PID: 407 at
> > drivers/net/wireless/ti/wlcore/sdio.c:145
>
> Hmm OK so SDIO write failing. Adding also Eyal to Cc.
>

I Just tried on an available am335x-evm using 4.20.0-rc1 which I believe has all the patches merged already.
I am using the same script and not seeing any failure yet.
See below:

root@am335x-evm:/usr/share/wl18xx# uname -r
4.20.0-rc1-11287-gf487c00
root@am335x-evm:/usr/share/wl18xx#
root@am335x-evm:/usr/share/wl18xx#
root@am335x-evm:/usr/share/wl18xx#
root@am335x-evm:/usr/share/wl18xx#
root@am335x-evm:/usr/share/wl18xx# while true; do ifconfig wlan0 down; ifconfig wlan0 up; done;
[1378786.491101] wlcore: down
[1378787.093006] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378787.168523] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378787.215897] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378787.279120] wlcore: down
[1378787.852406] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378787.923964] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378787.973034] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378788.036862] wlcore: down
[1378788.613743] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378788.693838] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378788.740383] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378788.804205] wlcore: down
[1378789.392353] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378789.473698] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378789.525618] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378789.589460] wlcore: down
[1378790.193421] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378790.268987] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378790.319464] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378790.385539] wlcore: down
[1378790.959832] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378791.037329] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378791.085311] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378791.149050] wlcore: down
[1378791.728624] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378791.801074] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378791.851683] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378791.914852] wlcore: down
[1378792.481659] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378792.558157] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378792.608917] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378792.671808] wlcore: down
[1378793.257493] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378793.335242] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378793.381733] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378793.445155] wlcore: down
[1378794.022673] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378794.099983] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378794.148400] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378794.211785] wlcore: down
[1378794.793126] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378794.870457] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378794.918269] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378794.980576] wlcore: down
[1378795.567425] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378795.643666] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378795.689798] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378795.753733] wlcore: down
[1378796.321779] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378796.395530] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378796.446662] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378796.509674] wlcore: down
[1378797.082703] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378797.155784] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378797.204702] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378797.272142] wlcore: down
[1378797.852175] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378797.927954] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378797.977066] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378798.040515] wlcore: down
[1378798.629674] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378798.705860] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378798.753307] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378798.817071] wlcore: down
[1378799.383758] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378799.456630] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378799.504191] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378799.567613] wlcore: down
[1378800.149263] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378800.293733] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378800.393680] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378800.458602] wlcore: down
[1378801.038925] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378801.115460] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378801.166596] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378801.243967] wlcore: down
[1378801.819651] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378801.896467] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378801.947433] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378802.009906] wlcore: down
[1378802.591682] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378802.665409] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378802.713678] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378802.777449] wlcore: down
[1378803.359033] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378803.440687] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378803.490562] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378803.554929] wlcore: down
[1378804.133670] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378804.209757] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378804.258120] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378804.328504] wlcore: down
[1378804.901077] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378804.978251] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378805.028641] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378805.092694] wlcore: down
[1378805.673109] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378805.748414] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378805.798805] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378805.862123] wlcore: down
[1378806.442260] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378806.516679] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378806.563867] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378806.628781] wlcore: down
[1378807.190830] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378807.266159] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378807.315008] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378807.379133] wlcore: down
[1378807.948528] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378808.019867] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378808.067142] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378808.130354] wlcore: down
[1378808.691478] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378808.765724] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378808.813940] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378808.877296] wlcore: down
[1378809.463092] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378809.535343] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378809.584844] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378809.648089] wlcore: down
[1378810.221439] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378810.300780] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378810.350360] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378810.473263] wlcore: down
[1378811.090841] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378811.163899] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378811.214713] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378811.278291] wlcore: down
[1378811.859060] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378811.940978] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378811.990527] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378812.055433] wlcore: down
[1378812.632900] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378812.706303] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378812.753410] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378812.816633] wlcore: down
[1378813.391444] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378813.464240] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378813.512363] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378813.577216] wlcore: down
[1378814.160131] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378814.237682] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378814.290315] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378814.354754] wlcore: down
[1378814.920495] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378814.998592] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378815.046547] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378815.109031] wlcore: down
[1378815.689486] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378815.762435] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378815.812226] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378815.875213] wlcore: down
[1378816.441837] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378816.515729] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378816.563795] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378816.657909] wlcore: down
[1378817.248446] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378817.322038] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378817.370687] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378817.434946] wlcore: down
[1378818.018837] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378818.096283] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378818.148273] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378818.210775] wlcore: down
[1378818.799762] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378818.875681] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378818.924567] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378818.988274] wlcore: down
[1378819.561957] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378819.636484] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378819.684097] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378819.747501] wlcore: down
[1378820.321689] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378820.397763] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378820.448642] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378820.512129] wlcore: down
[1378821.088666] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378821.160442] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378821.211085] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378821.273888] wlcore: down
[1378821.839906] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378821.915729] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378821.966909] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378822.031125] wlcore: down
[1378822.615046] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378822.693064] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378822.739599] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378822.803289] wlcore: down
[1378823.380169] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378823.458113] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378823.509566] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378823.572602] wlcore: down
[1378824.159983] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378824.237711] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378824.290808] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378824.353941] wlcore: down
[1378824.931810] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378825.005707] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378825.055433] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378825.118176] wlcore: down
[1378825.692724] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378825.765798] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378825.813798] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378825.878006] wlcore: down
[1378826.442286] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378826.518544] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378826.569183] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378826.633600] wlcore: down
[1378827.213265] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378827.288824] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378827.336628] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378827.399026] wlcore: down
[1378827.971426] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378828.048369] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378828.098243] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378828.160854] wlcore: down
[1378828.744644] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378828.817231] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378828.866908] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378828.929402] wlcore: down
[1378829.493338] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378829.568975] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378829.617434] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378829.679476] wlcore: down
[1378830.264191] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378830.337065] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378830.386654] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378830.449473] wlcore: down
[1378831.076081] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378831.153519] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378831.200456] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378831.266049] wlcore: down
[1378831.853474] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378831.927762] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378831.976934] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378832.045476] wlcore: down
[1378832.612943] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378832.689160] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378832.736919] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378832.799808] wlcore: down
[1378833.382691] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378833.455715] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378833.504109] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378833.567781] wlcore: down
[1378834.133028] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378834.205805] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378834.253361] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378834.318232] wlcore: down
[1378834.892304] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378834.968815] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378835.016491] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378835.079376] wlcore: down
[1378835.642688] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378835.718502] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378835.766202] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378835.829487] wlcore: down
[1378836.419008] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378836.495673] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378836.544773] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378836.607245] wlcore: down
[1378837.181581] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378837.255567] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378837.303845] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378837.367595] wlcore: down
[1378837.949135] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378838.025595] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378838.073999] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378838.137830] wlcore: down
[1378838.712319] wlcore: PHY firmware version: Rev 8.2.0.0.242
[1378838.788453] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378838.836558] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[1378838.900204] wlcore: down
[1378839.489400] wlcore: PHY firmware version: Rev 8.2.0.0.242
^C[1378839.568368] wlcore: firmware booted (Rev 8.9.0.1.79)
[1378839.619773] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready

root@am335x-evm:/usr/share/wl18xx#

Best Regards,
Eyal

2018-12-12 18:31:27

by Tony Lindgren

[permalink] [raw]
Subject: Re: [EXTERNAL] Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

* Reizer, Eyal <[email protected]> [181212 07:27]:
> I Just tried on an available am335x-evm using 4.20.0-rc1 which I believe has all the patches merged already.
> I am using the same script and not seeing any failure yet.
> See below:
>
> root@am335x-evm:/usr/share/wl18xx# uname -r
> 4.20.0-rc1-11287-gf487c00
> root@am335x-evm:/usr/share/wl18xx#
> root@am335x-evm:/usr/share/wl18xx#
> root@am335x-evm:/usr/share/wl18xx#
> root@am335x-evm:/usr/share/wl18xx#
> root@am335x-evm:/usr/share/wl18xx# while true; do ifconfig wlan0 down; ifconfig wlan0 up; done;
> [1378786.491101] wlcore: down
> [1378787.093006] wlcore: PHY firmware version: Rev 8.2.0.0.242
> [1378787.168523] wlcore: firmware booted (Rev 8.9.0.1.79)
> [1378787.215897] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
...

Yeah I was seeing the same yesterday when I tried it.

Could it be that there needs to be a little msleep() after
loading the firmware on some platforms and before enabling
PM runtime?

Regards,

Tony


2018-12-12 18:33:53

by Tony Lindgren

[permalink] [raw]
Subject: Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

* Anders Roxell <[email protected]> [181211 23:45]:
> I have different serious, one when our board hangs on udhcpc [1], then I see
> wlcore and wl18xx_driver.

Hmm so on what SoC is this with? Adding Eyeal to Cc here too.

Regards,

Tony

> [ 8.824419] cfg80211: Loading compiled-in X.509 certificates for
> regulatory database
> [ 8.852546] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
> [ 8.861006] platform regulatory.0: Direct firmware load for
> regulatory.db failed with error -2
> [ 8.870248] cfg80211: failed to load regulatory.db
> [ 8.895501] asix 1-1.1:1.0 eth0: register 'asix' at
> usb-f72c0000.usb-1.1, ASIX AX88772B USB 2.0 Ethernet,
> 00:50:b6:14:c8:76
> [ 8.912285] usbcore: registered new interface driver asix
> [ 9.053373] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot
> req 100000Hz, actual 100000HZ div = 124)
> [ 9.130886] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot
> req 25000000Hz, actual 24800000HZ div = 0)
> [ 9.149318] wl18xx_driver wl18xx.3.auto: Direct firmware load for
> ti-connectivity/wl18xx-conf.bin failed with error -2
> [ 9.160459] wlcore: ERROR could not get configuration binary
> ti-connectivity/wl18xx-conf.bin: -2
> [ 9.169615] wlcore: WARNING falling back to default config
> [ 9.214108] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot
> req 100000Hz, actual 100000HZ div = 124)
> [ 9.294294] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot
> req 25000000Hz, actual 24800000HZ div = 0)
> [[0;32m OK [0m] Listening on Load/Save RF Kill Switch Status
> /dev/rfkill Watch.
> [[0;32m OK [0m] Reached target Sockets.
> [[0;32m OK [0m] Reached target Basic System.
> Starting Entropy Daemon based on the HAVEGE algorithm...
> [[0;32m OK [0m] Started TEE Supplicant.
> [ 9.514713] wlcore: wl18xx HW: 183x or 180x, PG 2.2 (ROM 0x11)
> [ 9.526319] wlcore: WARNING Detected unconfigured mac address in
> nvs, derive from fuse instead.
> [ 9.535454] wlcore: WARNING This default nvs file can be removed
> from the file system
>
> and when it passes that stage I can't find wlcore nor wl18xx_driver.
> However, I can see
> wl1271_sdio failing instead. See [2]
>
> [ 8.435849] Bluetooth: HCI UART protocol Broadcom registered
> [ 8.439669] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot
> req 200000Hz, actual 200000HZ div = 62)
> [[0;32m OK [0m] Started dnf-automatic-download timer.
> [ 8.469492] wl1271_sdio: probe of mmc2:0001:2 failed with error -16
> [[0;32m OK [0m] Started Daily Cleanup of Temporary Directories.
> [[0;32m OK [0m] Started dnf-automatic-install timer.
> [ 8.525443] Console: switching to colour frame buffer device 256x72
> [ 8.533489] Bluetooth: hci0: change remote baud rate command in firmware
> [ 8.584607] kirin-drm f4100000.ade: fb0: DRM emulated frame buffer device
> [[0;32m OK [0m] Started dnf-automatic-notifyonly timer.
> [[0;32m OK [0m] Listening on R[ 8.614417] [drm] Initialized kirin
> 1.0.0 20150718 for f4100000.ade on minor 0
>
> and another job that have the similar issue with the previous is [3]:
>
> [ 8.670110] adv7511 2-0039: 2-0039 supply pvdd not found, using
> dummy regulator
> [ 8.677908] adv7511 2-0039: 2-0039 supply a2vdd not found, using
> dummy regulator
> [ 8.685788] adv7511 2-0039: 2-0039 supply v3p3 not found, using
> dummy regulator
> [ 8.688699] wl1271_sdio: probe of mmc2:0001:1 failed with error -16
> [ 8.690154] Bluetooth: Core ver 2.22
> [ 8.690346] NET: Registered protocol family 31
> [ 8.690355] Bluetooth: HCI device and connection manager
> initialized
> [ 8.693495] adv7511 2-0039: 2-0039 supply v1p2 not found, using
> dummy regulator
> [ 8.700240] wl1271_sdio: probe of mmc2:0001:2 failed with error -16
> [ 8.700992] Bluetooth: HCI socket layer initialized
> [ 8.701022] Bluetooth: L2CAP socket layer initialized
> [ 8.701554] Bluetooth: SCO socket layer initialized
> Starting sshd.socket.
> [[0;32m OK [0m] Listening on RPCbind Server Activation Socket.
> [[0;32m OK [0m] Started Daily rotation of log files.
> [ 8.791540] Bluetooth: HCI UART driver ver 2.3
>
>
> I'm confused why we only see wlcore wl18xx in job [1] and that we see wl1271
> in the other and that we are able to start running tests, even though
> I don't think
> the jobs where we run tests booted up OK, since we can see
> "wl1271_sdio: probe of mmc2:0001:1 failed with error -16".
>
> Cheers,
> Anders
>
> [1] https://lkft.validation.linaro.org/scheduler/job/543012
> [2] https://lkft.validation.linaro.org/scheduler/job/506683
> [3] https://lkft.validation.linaro.org/scheduler/job/487494

2018-12-12 19:17:22

by Ricardo Salveti

[permalink] [raw]
Subject: Re: [EXTERNAL] Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

On Wed, Dec 12, 2018 at 5:27 AM Reizer, Eyal <[email protected]> wrote:
> I Just tried on an available am335x-evm using 4.20.0-rc1 which I believe has all the patches merged already.
> I am using the same script and not seeing any failure yet.
> See below:
>
> root@am335x-evm:/usr/share/wl18xx# uname -r
> 4.20.0-rc1-11287-gf487c00
> root@am335x-evm:/usr/share/wl18xx#
> root@am335x-evm:/usr/share/wl18xx#
> root@am335x-evm:/usr/share/wl18xx#
> root@am335x-evm:/usr/share/wl18xx#
> root@am335x-evm:/usr/share/wl18xx# while true; do ifconfig wlan0 down; ifconfig wlan0 up; done;
> [1378786.491101] wlcore: down
> [1378787.093006] wlcore: PHY firmware version: Rev 8.2.0.0.242
> [1378787.168523] wlcore: firmware booted (Rev 8.9.0.1.79)

I noticed your firmware is different here (8.9.0.1.79 instead of
8.9.0.0.79), but unclear if that would cause the issue.

Tested with 4.20-rc1 on my beaglebone black wireless and had the same
crash as with rc6:

[ 157.947356] wlcore: PHY firmware version: Rev 8.2.0.0.242
[ 157.994951] wlcore: firmware booted (Rev 8.9.0.0.79)
[ 158.026020] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[ 158.042400] wlcore: down
[ 159.789119] ------------[ cut here ]------------
[ 159.794351] WARNING: CPU: 0 PID: 629 at
drivers/net/wireless/ti/wlcore/sdio.c:145
wl12xx_sdio_raw_write+0x134/0x1c0 [wlcore_sdio]
[ 159.809198] Modules linked in: ipt_MASQUERADE nf_conntrack_netlink
nfnetlink xfrm_user iptable_nat nf_nat_ipv4 xt_addrtype iptable_filter
ip_tables xt_conntrack x_tables nf_nat nf_conntrack nf_defrag_ipv6
nf_defrag_ipv4 libcrc32c br_netfilter bridge stp llc arc4 wl18xx
wlcore mac80211 musb_dsps musb_hdrc phy_am335x cppi41 udc_core
phy_am335x_control phy_generic snd_soc_hdmi_codec snd_soc_simple_card
snd_soc_simple_card_utils pm33xx wkup_m3_rproc wkup_m3_ipc remoteproc
virtio virtio_ring ti_emif_sram snd_soc_davinci_mcasp hci_uart
snd_soc_edma snd_soc_sdma tilcdc btqca btbcm btintel snd_soc_core
snd_pcm_dmaengine omap_mailbox snd_pcm wlcore_sdio snd_timer snd
soundcore musb_am335x rtc_omap tda998x at24 cpufreq_dt thermal_sys
fuse nhc_udp bluetooth_6lowpan nhc_routing nhc_hop nhc_mobility
nhc_fragment nhc_dest nhc_ipv6 6lowpan
[ 159.891095] CPU: 0 PID: 629 Comm: ifconfig Not tainted
4.20.0-rc1-lmp-standard #1
[ 159.905673] Hardware name: Generic AM33XX (Flattened Device Tree)
[ 159.912806] Backtrace:
[ 159.915412] [<c010f0b8>] (dump_backtrace) from [<c010f458>]
(show_stack+0x20/0x24)
[ 159.924309] r7:dc9f4410 r6:00000009 r5:00000000 r4:00000000
[ 159.930269] [<c010f438>] (show_stack) from [<c0a0d510>]
(dump_stack+0x24/0x28)
[ 159.939016] [<c0a0d4ec>] (dump_stack) from [<c013a864>]
(__warn.part.3+0xcc/0xe4)
[ 159.947485] [<c013a798>] (__warn.part.3) from [<c013a9fc>]
(warn_slowpath_null+0x54/0x5c)
[ 159.956626] r6:bf1383b4 r5:00000091 r4:bf139be0
[ 159.961995] [<c013a9a8>] (warn_slowpath_null) from [<bf1383b4>]
(wl12xx_sdio_raw_write+0x134/0x1c0 [wlcore_sdio])
[ 159.973266] r6:00004000 r5:c0f03088 r4:dd588400
[ 159.978201] [<bf138280>] (wl12xx_sdio_raw_write [wlcore_sdio]) from
[<bf42d33c>] (wlcore_boot_upload_firmware+0x1ec/0x55c [wlcore])
[ 159.991553] r10:dcf58000 r9:de490d60 r8:bf138280 r7:00014000
r6:bf441480 r5:80904000
[ 159.999766] r4:00000001
[ 160.003427] [<bf42d150>] (wlcore_boot_upload_firmware [wlcore])
from [<bf458228>] (wl18xx_boot+0x8e4/0xd4c [wl18xx])
[ 160.015176] r10:00000000 r9:00000014 r8:c0cdee98 r7:bf441480
r6:c0f03088 r5:00000000
[ 160.023933] r4:de490d60
[ 160.026673] [<bf457944>] (wl18xx_boot [wl18xx]) from [<bf41da38>]
(wl1271_op_add_interface+0x580/0x9b0 [wlcore])
[ 160.039580] r9:00000003 r8:de490da0 r7:bf441480 r6:c0f03088
r5:de490d60 r4:00000000
[ 160.048653] [<bf41d4b8>] (wl1271_op_add_interface [wlcore]) from
[<bf367df8>] (drv_add_interface+0x58/0x1a8 [mac80211])
[ 160.060950] r10:00000000 r9:ffffffff r8:00000001 r7:de4794e0
r6:de490460 r5:de490460
[ 160.069167] r4:de4794e0
[ 160.073230] [<bf367da0>] (drv_add_interface [mac80211]) from
[<bf37f37c>] (ieee80211_do_open+0x490/0x984 [mac80211])
[ 160.085115] r7:de4794e0 r6:de490460 r5:de479000 r4:de4794e8
[ 160.091854] [<bf37eeec>] (ieee80211_do_open [mac80211]) from
[<bf37f8c4>] (ieee80211_open+0x54/0x60 [mac80211])
[ 160.103055] r10:db213c60 r9:00000000 r8:00001002 r7:c0f03088
r6:bf3cb4e0 r5:c0f03088
[ 160.114672] r4:de4794e0
[ 160.117620] [<bf37f870>] (ieee80211_open [mac80211]) from
[<c07f8fec>] (__dev_open+0xec/0x170)
[ 160.132084] r5:c0f03088 r4:de479000
[ 160.135883] [<c07f8f00>] (__dev_open) from [<c07f940c>]
(__dev_change_flags+0x174/0x1d4)
[ 160.145604] r6:00001043 r5:00000001 r4:de479000
[ 160.151249] [<c07f9298>] (__dev_change_flags) from [<c07f9494>]
(dev_change_flags+0x28/0x58)
[ 160.160109] r9:00000000 r8:c0fa2060 r7:de479144 r6:00000000
r5:00001002 r4:de479000
[ 160.169241] [<c07f946c>] (dev_change_flags) from [<c0894650>]
(devinet_ioctl+0x6d4/0x704)
[ 160.178361] r9:00000000 r8:c0fa2060 r7:dcb0fd0c r6:c0f03088
r5:dcfffe08 r4:00000000
[ 160.187128] [<c0893f7c>] (devinet_ioctl) from [<c0896918>]
(inet_ioctl+0x214/0x3c4)
[ 160.195687] r10:db213c60 r9:00000004 r8:00000051 r7:c0fa2060
r6:00008914 r5:be811b0c
[ 160.204386] r4:c0f03088
[ 160.207064] [<c0896704>] (inet_ioctl) from [<c07d354c>]
(sock_ioctl+0x1d0/0x4e0)
[ 160.215677] r9:00000004 r8:c0fa2060 r7:be811b0c r6:d1041240
r5:c0f03088 r4:00008914
[ 160.224361] [<c07d337c>] (sock_ioctl) from [<c02e4314>]
(do_vfs_ioctl+0xc0/0x95c)
[ 160.232820] r9:00000004 r8:be811b0c r7:00000004 r6:d1041240
r5:be811b0c r4:c0f03088
[ 160.241466] [<c02e4254>] (do_vfs_ioctl) from [<c02e4c1c>]
(ksys_ioctl+0x6c/0x90)
[ 160.249230] r10:00000036 r9:00000004 r8:be811b0c r7:00008914
r6:00000000 r5:d1041240
[ 160.258250] r4:d1041240
[ 160.261515] [<c02e4bb0>] (ksys_ioctl) from [<c02e4c58>] (sys_ioctl+0x18/0x1c)
[ 160.269006] r9:dcffe000 r8:c0101204 r7:00000036 r6:b6fe7900
r5:004c3e44 r4:be811b0c
[ 160.278049] [<c02e4c40>] (sys_ioctl) from [<c0101000>]
(ret_fast_syscall+0x0/0x54)
[ 160.286585] Exception stack(0xdcffffa8 to 0xdcfffff0)
[ 160.292443] ffa0: be811b0c 004c3e44 00000004
00008914 be811b0c 00001002
[ 160.301530] ffc0: be811b0c 004c3e44 b6fe7900 00000036 be811bec
004c4384 004c3e44 be811efe
[ 160.310107] ffe0: 004c3ec4 be811afc 004ad0d1 b6f5e0f8
[ 160.316308] ---[ end trace e1cf141728071aef ]---
[ 160.321681] wl1271_sdio mmc2:0001:2: sdio write failed (-110)

I'm guessing this is just a timing issue that is happening on beagle,
and also possibly related to the kernel config used here.

Cheers,
--
Ricardo Salveti de Araujo

2018-12-12 19:20:45

by Ricardo Salveti

[permalink] [raw]
Subject: Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

On Tue, Dec 11, 2018 at 11:25 PM Tony Lindgren <[email protected]> wrote:
>
> * Ricardo Salveti <[email protected]> [181211 21:51]:
> > Tried to change back wl12xx_sdio_power_off to use pm_runtime_put_sync
> > as a test, and noticed I always get -EBUSY when reproducing the hang,
> > so it looks like this could be a race between pm_runtime_put/get when
> > doing if down/up (and the side effect on the mmc controller).
>
> Hmm so -EBUSY for pm_runtime means transition is already happening
> and in many cases the test should be:
>
> error = pm_runtime_get(dev);
> if (error < 0 && error != -EBUSY) {
> ...
> }
> ...
> pm_runtime_put(dev);
>
> Where are you getting it?

What I noticed is that with autosuspend support I always get -EBUSY in
wl12xx_sdio_power_off -> pm_runtime_put, and I would guess that the
additional delay here creates the issue later when bringing the
interface up again.

Without autosuspend I never get -EBUSY in wl12xx_sdio_power_off.

Cheers,
--
Ricardo Salveti de Araujo

2018-12-12 19:25:15

by Ricardo Salveti

[permalink] [raw]
Subject: Re: [EXTERNAL] Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

On Wed, Dec 12, 2018 at 4:31 PM Tony Lindgren <[email protected]> wrote:
> * Reizer, Eyal <[email protected]> [181212 07:27]:
> > I Just tried on an available am335x-evm using 4.20.0-rc1 which I believe has all the patches merged already.
> > I am using the same script and not seeing any failure yet.
> > See below:
> >
> > root@am335x-evm:/usr/share/wl18xx# uname -r
> > 4.20.0-rc1-11287-gf487c00
> > root@am335x-evm:/usr/share/wl18xx#
> > root@am335x-evm:/usr/share/wl18xx#
> > root@am335x-evm:/usr/share/wl18xx#
> > root@am335x-evm:/usr/share/wl18xx#
> > root@am335x-evm:/usr/share/wl18xx# while true; do ifconfig wlan0 down; ifconfig wlan0 up; done;
> > [1378786.491101] wlcore: down
> > [1378787.093006] wlcore: PHY firmware version: Rev 8.2.0.0.242
> > [1378787.168523] wlcore: firmware booted (Rev 8.9.0.1.79)
> > [1378787.215897] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
> ...
>
> Yeah I was seeing the same yesterday when I tried it.
>
> Could it be that there needs to be a little msleep() after
> loading the firmware on some platforms and before enabling
> PM runtime?

The issue I'm having seems to happen when trying to load the firmware,
getting stuck in mmc_wait_for_req_done -> wait_for_completion, so the
firmware is never really fully loaded when the hang happens.

Cheers,
--
Ricardo Salveti de Araujo

2018-12-13 07:49:31

by Reizer, Eyal

[permalink] [raw]
Subject: RE: [EXTERNAL] Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

> > > I Just tried on an available am335x-evm using 4.20.0-rc1 which I believe
> has all the patches merged already.
> > > I am using the same script and not seeing any failure yet.
> > > See below:
> > >
> > > root@am335x-evm:/usr/share/wl18xx# uname -r
> > > 4.20.0-rc1-11287-gf487c00
> > > root@am335x-evm:/usr/share/wl18xx#
> > > root@am335x-evm:/usr/share/wl18xx#
> > > root@am335x-evm:/usr/share/wl18xx#
> > > root@am335x-evm:/usr/share/wl18xx#
> > > root@am335x-evm:/usr/share/wl18xx# while true; do ifconfig wlan0
> down; ifconfig wlan0 up; done;
> > > [1378786.491101] wlcore: down
> > > [1378787.093006] wlcore: PHY firmware version: Rev 8.2.0.0.242
> > > [1378787.168523] wlcore: firmware booted (Rev 8.9.0.1.79)
> > > [1378787.215897] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not
> ready
> > ...
> >
> > Yeah I was seeing the same yesterday when I tried it.
> >
> > Could it be that there needs to be a little msleep() after
> > loading the firmware on some platforms and before enabling
> > PM runtime?
>
> The issue I'm having seems to happen when trying to load the firmware,
> getting stuck in mmc_wait_for_req_done -> wait_for_completion, so the
> firmware is never really fully loaded when the hang happens.
>
When the interface is brought up the wlan_renable gpio comes up (enabled) and turning the wl18xx device on.
It might be that what happens here is that the chip is not yet enabled when the first sdio command is started causing a timeout.
There is normally a delay associated with the fixed voltage regulator that is tied to this gpio and used for disable/enable (startup-delay-us).

Similar to this:
/ {
model = "TI AM335x BeagleBone Black Wireless";
compatible = "ti,am335x-bone-black-wireless", "ti,am335x-bone-black", "ti,am335x-bone", "ti,am33xx";

wlan_en_reg: fixedregulator@2 {
...
...
startup-delay-us= <70000>;

/* WL_EN */
gpio = <&gpio3 9 0>;
enable-active-high;
};
};

Perhaps, as a debug step, try to add a delay in "wl12xx_sdio_power_on":
https://github.com/torvalds/linux/blob/master/drivers/net/wireless/ti/wlcore/sdio.c

and see if it makes a change?

Best Regards,
Eyal

2018-12-13 13:53:17

by Ricardo Salveti

[permalink] [raw]
Subject: Re: [EXTERNAL] Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

On Thu, Dec 13, 2018 at 5:49 AM Reizer, Eyal <[email protected]> wrote:
> > > > I Just tried on an available am335x-evm using 4.20.0-rc1 which I believe
> > has all the patches merged already.
> > > > I am using the same script and not seeing any failure yet.
> > > > See below:
> > > >
> > > > root@am335x-evm:/usr/share/wl18xx# uname -r
> > > > 4.20.0-rc1-11287-gf487c00
> > > > root@am335x-evm:/usr/share/wl18xx#
> > > > root@am335x-evm:/usr/share/wl18xx#
> > > > root@am335x-evm:/usr/share/wl18xx#
> > > > root@am335x-evm:/usr/share/wl18xx#
> > > > root@am335x-evm:/usr/share/wl18xx# while true; do ifconfig wlan0
> > down; ifconfig wlan0 up; done;
> > > > [1378786.491101] wlcore: down
> > > > [1378787.093006] wlcore: PHY firmware version: Rev 8.2.0.0.242
> > > > [1378787.168523] wlcore: firmware booted (Rev 8.9.0.1.79)
> > > > [1378787.215897] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not
> > ready
> > > ...
> > >
> > > Yeah I was seeing the same yesterday when I tried it.
> > >
> > > Could it be that there needs to be a little msleep() after
> > > loading the firmware on some platforms and before enabling
> > > PM runtime?
> >
> > The issue I'm having seems to happen when trying to load the firmware,
> > getting stuck in mmc_wait_for_req_done -> wait_for_completion, so the
> > firmware is never really fully loaded when the hang happens.
> >
> When the interface is brought up the wlan_renable gpio comes up (enabled) and turning the wl18xx device on.
> It might be that what happens here is that the chip is not yet enabled when the first sdio command is started causing a timeout.
> There is normally a delay associated with the fixed voltage regulator that is tied to this gpio and used for disable/enable (startup-delay-us).

Right, what is not yet clear to me is why we need a longer delay in
the enable path when using the autosuspend patch. I would guess that
the suspend/resume logic in the mmc driver is what causes the extra
delay (in the case of hikey).

> Similar to this:
> / {
> model = "TI AM335x BeagleBone Black Wireless";
> compatible = "ti,am335x-bone-black-wireless", "ti,am335x-bone-black", "ti,am335x-bone", "ti,am33xx";
>
> wlan_en_reg: fixedregulator@2 {
> ...
> ...
> startup-delay-us= <70000>;
>
> /* WL_EN */
> gpio = <&gpio3 9 0>;
> enable-active-high;
> };
> };
>
> Perhaps, as a debug step, try to add a delay in "wl12xx_sdio_power_on":
> https://github.com/torvalds/linux/blob/master/drivers/net/wireless/ti/wlcore/sdio.c
>
> and see if it makes a change?

I tried just increasing WL1271_PRE_POWER_ON_SLEEP from 20ms to 50ms
(which gets used before calling wl12xx_sdio_power_on), and that was
already enough to fix my issues on both hikey and beaglebone black
wireless.

Should we define another pre power on sleep specifically for the sdio
case (and use directly in wl12xx_sdio_power_on)?

Thanks,
--
Ricardo Salveti de Araujo

2018-12-13 14:45:26

by Tony Lindgren

[permalink] [raw]
Subject: Re: [EXTERNAL] Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

* Ricardo Salveti <[email protected]> [181213 13:53]:
> I tried just increasing WL1271_PRE_POWER_ON_SLEEP from 20ms to 50ms
> (which gets used before calling wl12xx_sdio_power_on), and that was
> already enough to fix my issues on both hikey and beaglebone black
> wireless.

OK good to hear that helps :)

> Should we define another pre power on sleep specifically for the sdio
> case (and use directly in wl12xx_sdio_power_on)?

I'd probably prefer to just increase WL1271_PRE_POWER_ON_SLEEP,
chances are the same delay is needed in all cases.

Up to Eyeal to decide though as far as I'm concerned.

Regards,

Tony

2018-12-13 14:53:27

by Reizer, Eyal

[permalink] [raw]
Subject: RE: [EXTERNAL] Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

> > I tried just increasing WL1271_PRE_POWER_ON_SLEEP from 20ms to
> 50ms
> > (which gets used before calling wl12xx_sdio_power_on), and that was
> > already enough to fix my issues on both hikey and beaglebone black
> > wireless.
>
> OK good to hear that helps :)
>
> > Should we define another pre power on sleep specifically for the sdio
> > case (and use directly in wl12xx_sdio_power_on)?
>
> I'd probably prefer to just increase WL1271_PRE_POWER_ON_SLEEP,
> chances are the same delay is needed in all cases.
>
> Up to Eyeal to decide though as far as I'm concerned.
>

I tend to agree with Tony. Probably better to just increase the value.

Best Regards,
Eyal

2018-12-13 14:55:53

by Ricardo Salveti

[permalink] [raw]
Subject: Re: [EXTERNAL] Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

On Thu, Dec 13, 2018 at 12:53 PM Reizer, Eyal <[email protected]> wrote:
> > > I tried just increasing WL1271_PRE_POWER_ON_SLEEP from 20ms to
> > 50ms
> > > (which gets used before calling wl12xx_sdio_power_on), and that was
> > > already enough to fix my issues on both hikey and beaglebone black
> > > wireless.
> >
> > OK good to hear that helps :)
> >
> > > Should we define another pre power on sleep specifically for the sdio
> > > case (and use directly in wl12xx_sdio_power_on)?
> >
> > I'd probably prefer to just increase WL1271_PRE_POWER_ON_SLEEP,
> > chances are the same delay is needed in all cases.
> >
> > Up to Eyeal to decide though as far as I'm concerned.
> >
>
> I tend to agree with Tony. Probably better to just increase the value.

Cool, will do a bit more stress testing and send the patch then.

Thanks,
--
Ricardo Salveti de Araujo

2018-12-14 20:42:26

by Ricardo Salveti

[permalink] [raw]
Subject: Re: [EXTERNAL] Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

Hi,

On Thu, Dec 13, 2018 at 12:45 PM Tony Lindgren <[email protected]> wrote:
> * Ricardo Salveti <[email protected]> [181213 13:53]:
> > I tried just increasing WL1271_PRE_POWER_ON_SLEEP from 20ms to 50ms
> > (which gets used before calling wl12xx_sdio_power_on), and that was
> > already enough to fix my issues on both hikey and beaglebone black
> > wireless.
>
> OK good to hear that helps :)
>
> > Should we define another pre power on sleep specifically for the sdio
> > case (and use directly in wl12xx_sdio_power_on)?
>
> I'd probably prefer to just increase WL1271_PRE_POWER_ON_SLEEP,
> chances are the same delay is needed in all cases.

Investigating this issue a bit more, I noticed that the hang happens
when PM decides to not suspend the device when doing an if down/up
cycle.

Basically since commit 60f36637bbbd ("wlcore: sdio: allow pm to handle
sdio power") PM is now handling the sdio power off/on process, and if
wl12xx_sdio_power_on gets called right after wl12xx_sdio_power_off (if
down/up), the device will not go to the required power off/on sequence
(since PM will abort the suspend process), and the firmware loading
process will fail. I would guess the problem only happens with
autosuspend because of the extra delay it causes (pm_runtime_put
always returns -EBUSY on wl12xx_sdio_power_off with autosuspend).

Is there a way to force the suspend on wl12xx_sdio_power_off, or
should we partially restore the old behavior?

Thanks,
--
Ricardo Salveti de Araujo

2018-12-14 23:29:03

by Tony Lindgren

[permalink] [raw]
Subject: Re: [EXTERNAL] Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

* Ricardo Salveti <[email protected]> [181214 12:42]:
> Hi,
>
> On Thu, Dec 13, 2018 at 12:45 PM Tony Lindgren <[email protected]> wrote:
> > * Ricardo Salveti <[email protected]> [181213 13:53]:
> > > I tried just increasing WL1271_PRE_POWER_ON_SLEEP from 20ms to 50ms
> > > (which gets used before calling wl12xx_sdio_power_on), and that was
> > > already enough to fix my issues on both hikey and beaglebone black
> > > wireless.
> >
> > OK good to hear that helps :)
> >
> > > Should we define another pre power on sleep specifically for the sdio
> > > case (and use directly in wl12xx_sdio_power_on)?
> >
> > I'd probably prefer to just increase WL1271_PRE_POWER_ON_SLEEP,
> > chances are the same delay is needed in all cases.
>
> Investigating this issue a bit more, I noticed that the hang happens
> when PM decides to not suspend the device when doing an if down/up
> cycle.
>
> Basically since commit 60f36637bbbd ("wlcore: sdio: allow pm to handle
> sdio power") PM is now handling the sdio power off/on process, and if
> wl12xx_sdio_power_on gets called right after wl12xx_sdio_power_off (if
> down/up), the device will not go to the required power off/on sequence
> (since PM will abort the suspend process), and the firmware loading
> process will fail. I would guess the problem only happens with
> autosuspend because of the extra delay it causes (pm_runtime_put
> always returns -EBUSY on wl12xx_sdio_power_off with autosuspend).

OK thanks for the update, that's interesting.

> Is there a way to force the suspend on wl12xx_sdio_power_off, or
> should we partially restore the old behavior?

Well usually we could do pm_runtime_put_sync_suspend() but
here it won't help as the pm_runtime_put() is already in
progress by the SDIO subsystem and that's why we get -EBUSY.

Does adding a little wait at the end of wl12xx_sdio_power_off()
before return? Maybe something like:

/* Make sure the card gets powered off */
while (error == -EBUSY && !pm_runtime_suspended(&card->dev) &&
retries--) {
msleep(100);
}

Regards,

Tony


2018-12-15 03:38:56

by Ricardo Salveti

[permalink] [raw]
Subject: Re: [EXTERNAL] Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

On Fri, Dec 14, 2018 at 9:29 PM Tony Lindgren <[email protected]> wrote:
> * Ricardo Salveti <[email protected]> [181214 12:42]:
> > Basically since commit 60f36637bbbd ("wlcore: sdio: allow pm to handle
> > sdio power") PM is now handling the sdio power off/on process, and if
> > wl12xx_sdio_power_on gets called right after wl12xx_sdio_power_off (if
> > down/up), the device will not go to the required power off/on sequence
> > (since PM will abort the suspend process), and the firmware loading
> > process will fail. I would guess the problem only happens with
> > autosuspend because of the extra delay it causes (pm_runtime_put
> > always returns -EBUSY on wl12xx_sdio_power_off with autosuspend).
>
> OK thanks for the update, that's interesting.
>
> > Is there a way to force the suspend on wl12xx_sdio_power_off, or
> > should we partially restore the old behavior?
>
> Well usually we could do pm_runtime_put_sync_suspend() but
> here it won't help as the pm_runtime_put() is already in
> progress by the SDIO subsystem and that's why we get -EBUSY.
>
> Does adding a little wait at the end of wl12xx_sdio_power_off()
> before return? Maybe something like:
>
> /* Make sure the card gets powered off */
> while (error == -EBUSY && !pm_runtime_suspended(&card->dev) &&
> retries--) {
> msleep(100);
> }

Yes, this is enough to force the power off process and fixes the hang
I was having.

Also tried playing a bit with the msleep value and can confirm it
usually needs to wait between 30-40 milliseconds before powering the
card off.

Do you mind sending a patch with the above changes then? I can help
testing with the boards I have.

Thanks,
--
Ricardo Salveti de Araujo

2018-12-17 09:36:38

by Anders Roxell

[permalink] [raw]
Subject: Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

On Wed, 12 Dec 2018 at 19:33, Tony Lindgren <[email protected]> wrote:
>
> * Anders Roxell <[email protected]> [181211 23:45]:
> > I have different serious, one when our board hangs on udhcpc [1], then I see
> > wlcore and wl18xx_driver.
>
> Hmm so on what SoC is this with?

I'm sorry for the late reply. The Soc is Kirin 620.

Cheers,
Anders

> Adding Eyeal to Cc here too.
>
> Regards,
>
> Tony
>
> > [ 8.824419] cfg80211: Loading compiled-in X.509 certificates for
> > regulatory database
> > [ 8.852546] cfg80211: Loaded X.509 cert 'sforshee: 00b28ddf47aef9cea7'
> > [ 8.861006] platform regulatory.0: Direct firmware load for
> > regulatory.db failed with error -2
> > [ 8.870248] cfg80211: failed to load regulatory.db
> > [ 8.895501] asix 1-1.1:1.0 eth0: register 'asix' at
> > usb-f72c0000.usb-1.1, ASIX AX88772B USB 2.0 Ethernet,
> > 00:50:b6:14:c8:76
> > [ 8.912285] usbcore: registered new interface driver asix
> > [ 9.053373] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot
> > req 100000Hz, actual 100000HZ div = 124)
> > [ 9.130886] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot
> > req 25000000Hz, actual 24800000HZ div = 0)
> > [ 9.149318] wl18xx_driver wl18xx.3.auto: Direct firmware load for
> > ti-connectivity/wl18xx-conf.bin failed with error -2
> > [ 9.160459] wlcore: ERROR could not get configuration binary
> > ti-connectivity/wl18xx-conf.bin: -2
> > [ 9.169615] wlcore: WARNING falling back to default config
> > [ 9.214108] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot
> > req 100000Hz, actual 100000HZ div = 124)
> > [ 9.294294] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot
> > req 25000000Hz, actual 24800000HZ div = 0)
> > [[0;32m OK [0m] Listening on Load/Save RF Kill Switch Status
> > /dev/rfkill Watch.
> > [[0;32m OK [0m] Reached target Sockets.
> > [[0;32m OK [0m] Reached target Basic System.
> > Starting Entropy Daemon based on the HAVEGE algorithm...
> > [[0;32m OK [0m] Started TEE Supplicant.
> > [ 9.514713] wlcore: wl18xx HW: 183x or 180x, PG 2.2 (ROM 0x11)
> > [ 9.526319] wlcore: WARNING Detected unconfigured mac address in
> > nvs, derive from fuse instead.
> > [ 9.535454] wlcore: WARNING This default nvs file can be removed
> > from the file system
> >
> > and when it passes that stage I can't find wlcore nor wl18xx_driver.
> > However, I can see
> > wl1271_sdio failing instead. See [2]
> >
> > [ 8.435849] Bluetooth: HCI UART protocol Broadcom registered
> > [ 8.439669] mmc_host mmc2: Bus speed (slot 0) = 24800000Hz (slot
> > req 200000Hz, actual 200000HZ div = 62)
> > [[0;32m OK [0m] Started dnf-automatic-download timer.
> > [ 8.469492] wl1271_sdio: probe of mmc2:0001:2 failed with error -16
> > [[0;32m OK [0m] Started Daily Cleanup of Temporary Directories.
> > [[0;32m OK [0m] Started dnf-automatic-install timer.
> > [ 8.525443] Console: switching to colour frame buffer device 256x72
> > [ 8.533489] Bluetooth: hci0: change remote baud rate command in firmware
> > [ 8.584607] kirin-drm f4100000.ade: fb0: DRM emulated frame buffer device
> > [[0;32m OK [0m] Started dnf-automatic-notifyonly timer.
> > [[0;32m OK [0m] Listening on R[ 8.614417] [drm] Initialized kirin
> > 1.0.0 20150718 for f4100000.ade on minor 0
> >
> > and another job that have the similar issue with the previous is [3]:
> >
> > [ 8.670110] adv7511 2-0039: 2-0039 supply pvdd not found, using
> > dummy regulator
> > [ 8.677908] adv7511 2-0039: 2-0039 supply a2vdd not found, using
> > dummy regulator
> > [ 8.685788] adv7511 2-0039: 2-0039 supply v3p3 not found, using
> > dummy regulator
> > [ 8.688699] wl1271_sdio: probe of mmc2:0001:1 failed with error -16
> > [ 8.690154] Bluetooth: Core ver 2.22
> > [ 8.690346] NET: Registered protocol family 31
> > [ 8.690355] Bluetooth: HCI device and connection manager
> > initialized
> > [ 8.693495] adv7511 2-0039: 2-0039 supply v1p2 not found, using
> > dummy regulator
> > [ 8.700240] wl1271_sdio: probe of mmc2:0001:2 failed with error -16
> > [ 8.700992] Bluetooth: HCI socket layer initialized
> > [ 8.701022] Bluetooth: L2CAP socket layer initialized
> > [ 8.701554] Bluetooth: SCO socket layer initialized
> > Starting sshd.socket.
> > [[0;32m OK [0m] Listening on RPCbind Server Activation Socket.
> > [[0;32m OK [0m] Started Daily rotation of log files.
> > [ 8.791540] Bluetooth: HCI UART driver ver 2.3
> >
> >
> > I'm confused why we only see wlcore wl18xx in job [1] and that we see wl1271
> > in the other and that we are able to start running tests, even though
> > I don't think
> > the jobs where we run tests booted up OK, since we can see
> > "wl1271_sdio: probe of mmc2:0001:1 failed with error -16".
> >
> > Cheers,
> > Anders
> >
> > [1] https://lkft.validation.linaro.org/scheduler/job/543012
> > [2] https://lkft.validation.linaro.org/scheduler/job/506683
> > [3] https://lkft.validation.linaro.org/scheduler/job/487494

2018-12-17 14:45:33

by Tony Lindgren

[permalink] [raw]
Subject: Re: [EXTERNAL] Re: wlcore getting stuck on hikey after the runtime PM autosuspend support change

* Ricardo Salveti <[email protected]> [181215 03:38]:
> On Fri, Dec 14, 2018 at 9:29 PM Tony Lindgren <[email protected]> wrote:
> > * Ricardo Salveti <[email protected]> [181214 12:42]:
> > > Basically since commit 60f36637bbbd ("wlcore: sdio: allow pm to handle
> > > sdio power") PM is now handling the sdio power off/on process, and if
> > > wl12xx_sdio_power_on gets called right after wl12xx_sdio_power_off (if
> > > down/up), the device will not go to the required power off/on sequence
> > > (since PM will abort the suspend process), and the firmware loading
> > > process will fail. I would guess the problem only happens with
> > > autosuspend because of the extra delay it causes (pm_runtime_put
> > > always returns -EBUSY on wl12xx_sdio_power_off with autosuspend).
> >
> > OK thanks for the update, that's interesting.
> >
> > > Is there a way to force the suspend on wl12xx_sdio_power_off, or
> > > should we partially restore the old behavior?
> >
> > Well usually we could do pm_runtime_put_sync_suspend() but
> > here it won't help as the pm_runtime_put() is already in
> > progress by the SDIO subsystem and that's why we get -EBUSY.
> >
> > Does adding a little wait at the end of wl12xx_sdio_power_off()
> > before return? Maybe something like:
> >
> > /* Make sure the card gets powered off */
> > while (error == -EBUSY && !pm_runtime_suspended(&card->dev) &&
> > retries--) {
> > msleep(100);
> > }
>
> Yes, this is enough to force the power off process and fixes the hang
> I was having.

OK

> Also tried playing a bit with the msleep value and can confirm it
> usually needs to wait between 30-40 milliseconds before powering the
> card off.

OK thanks for checking that, I was wondering about that too.

> Do you mind sending a patch with the above changes then? I can help
> testing with the boards I have.

Yes I'll send a patch today.

Regards,

Tony