2017-02-04 15:29:09

by Dmitry Osipenko

Subject: rtlwifi: rtl8192c_common: "BUG: KASAN: slab-out-of-bounds"

Hello,

I ran a 4.9.7 kernel with CONFIG_KASAN=y on my machine and it detected the
following problem:

[ 23.012238] rtl8192cu: MAC auto ON okay!
[ 23.045656] rtl8192cu: Tx queue select: 0x05
[ 23.541152] ==================================================================
[ 23.541160] BUG: KASAN: slab-out-of-bounds in
rtl92c_dm_bt_coexist+0x858/0x1e40 [rtl8192c_common] at addr ffff8801c90edb08
[ 23.541161] Read of size 1 by task kworker/0:1/38
[ 23.541163] page:ffffea0007243800 count:1 mapcount:0 mapping: (null)
index:0x0 compound_mapcount: 0
[ 23.543194] flags: 0x8000000000004000(head)
[ 23.545178] page dumped because: kasan: bad access detected
[ 23.545181] CPU: 0 PID: 38 Comm: kworker/0:1 Not tainted 4.9.7-gentoo #3
[ 23.545182] Hardware name: Gigabyte Technology Co., Ltd. To be filled by
O.E.M./Z77-DS3H, BIOS F11a 11/13/2013
[ 23.545186] Workqueue: rtl92c_usb rtl_watchdog_wq_callback [rtlwifi]
[ 23.545187] 0000000000000000 ffffffff829eea33 ffff8801d7f0fa30 ffff8801c90edb08
[ 23.545189] ffffffff824c0f09 ffff8801d4abee80 0000000000000004 0000000000000297
[ 23.545191] ffffffffc070b57c ffff8801c7aa7c48 ffff880100000004 ffffffff000003e8
[ 23.545192] Call Trace:
[ 23.545197] [<ffffffff829eea33>] ? dump_stack+0x5c/0x79
[ 23.545200] [<ffffffff824c0f09>] ? kasan_report_error+0x4b9/0x4e0
[ 23.545202] [<ffffffffc070b57c>] ? _usb_read_sync+0x15c/0x280 [rtl_usb]
[ 23.545204] [<ffffffff824c0f75>] ? __asan_report_load1_noabort+0x45/0x50
[ 23.545206] [<ffffffffc06d7a88>] ? rtl92c_dm_bt_coexist+0x858/0x1e40
[rtl8192c_common]
[ 23.545208] [<ffffffffc06d7a88>] ? rtl92c_dm_bt_coexist+0x858/0x1e40
[rtl8192c_common]
[ 23.545210] [<ffffffffc06d0cbe>] ? rtl92c_dm_rf_saving+0x96e/0x1330
[rtl8192c_common]
[ 23.545212] [<ffffffffc06dab80>] ? rtl92c_dm_watchdog+0x1130/0x4590
[rtl8192c_common]
[ 23.545214] [<ffffffffc06d9a50>] ? rtl92c_dm_dynamic_txpower+0x9e0/0x9e0
[rtl8192c_common]
[ 23.545216] [<ffffffff8222038d>] ? pick_next_entity+0x18d/0x400
[ 23.545218] [<ffffffff8223fdde>] ? pick_next_task_fair+0xa6e/0xf60
[ 23.545220] [<ffffffff8207548a>] ? __switch_to+0x7ba/0x1160
[ 23.545223] [<ffffffffc0689a2b>] ? rtl_watchdog_wq_callback+0xb7b/0x1190
[rtlwifi]
[ 23.545225] [<ffffffffc0688eb0>] ? rtl_tx_mgmt_proc+0x2c0/0x2c0 [rtlwifi]
[ 23.545228] [<ffffffff82d1ecfe>] ? drm_fb_helper_dirty_work+0x25e/0x2f0
[ 23.545230] [<ffffffff83c65d20>] ? io_schedule_timeout+0x390/0x390
[ 23.545233] [<ffffffff82b8cef0>] ? update_attr.isra.2+0x170/0x170
[ 23.545234] [<ffffffff82b80a01>] ? fb_flashcursor+0x331/0x3e0
[ 23.545237] [<ffffffff821d7af9>] ? process_one_work+0x539/0x12b0
[ 23.545239] [<ffffffff821d894f>] ? worker_thread+0xdf/0x13e0
[ 23.545240] [<ffffffff8225343c>] ? __wake_up_common+0xbc/0x160
[ 23.545242] [<ffffffff821d8870>] ? process_one_work+0x12b0/0x12b0
[ 23.545244] [<ffffffff821e7f89>] ? kthread+0x1b9/0x210
[ 23.545245] [<ffffffff821e7dd0>] ? kthread_park+0x80/0x80
[ 23.545247] [<ffffffff821e7dd0>] ? kthread_park+0x80/0x80
[ 23.545248] [<ffffffff821e7dd0>] ? kthread_park+0x80/0x80
[ 23.545250] [<ffffffff83c724c5>] ? ret_from_fork+0x25/0x30
[ 23.545251] Memory state around the buggy address:
[ 23.545253] ffff8801c90eda00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[ 23.545254] ffff8801c90eda80: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[ 23.545255] >ffff8801c90edb00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[ 23.545256] ^
[ 23.545257] ffff8801c90edb80: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[ 23.545257] ffff8801c90edc00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[ 23.545258] ==================================================================
[ 23.545258] Disabling lock debugging due to kernel taint
[ 23.545300] ==================================================================
[ 23.545303] BUG: KASAN: slab-out-of-bounds in
rtl92c_dm_watchdog+0x3dfb/0x4590 [rtl8192c_common] at addr ffff8801c90edb1c
[ 23.545304] Read of size 4 by task kworker/0:1/38
[ 23.545305] page:ffffea0007243800 count:1 mapcount:0 mapping: (null)
index:0x0 compound_mapcount: 0
[ 23.547353] flags: 0x8000000000004000(head)
[ 23.547354] page dumped because: kasan: bad access detected
[ 23.547355] CPU: 0 PID: 38 Comm: kworker/0:1 Tainted: G B
4.9.7-gentoo #3
[ 23.547356] Hardware name: Gigabyte Technology Co., Ltd. To be filled by
O.E.M./Z77-DS3H, BIOS F11a 11/13/2013
[ 23.547358] Workqueue: rtl92c_usb rtl_watchdog_wq_callback [rtlwifi]
[ 23.547360] 0000000000000000 ffffffff829eea33 ffff8801d7f0fa80 ffff8801c90edb1c
[ 23.547361] ffffffff824c0f09 ffff8801c90e0748 ffff8801c90e1420 0000000000000297
[ 23.547363] ffff8801c90e1420 ffffffff824c0f75 ffff8801c90edb08 ffff8801c90edb08
[ 23.547363] Call Trace:
[ 23.547367] [<ffffffff829eea33>] ? dump_stack+0x5c/0x79
[ 23.547368] [<ffffffff824c0f09>] ? kasan_report_error+0x4b9/0x4e0
[ 23.547370] [<ffffffff824c0f75>] ? __asan_report_load1_noabort+0x45/0x50
[ 23.547371] [<ffffffff824c1015>] ? __asan_report_load4_noabort+0x45/0x50
[ 23.547373] [<ffffffffc06dd84b>] ? rtl92c_dm_watchdog+0x3dfb/0x4590
[rtl8192c_common]
[ 23.547375] [<ffffffffc06dd84b>] ? rtl92c_dm_watchdog+0x3dfb/0x4590
[rtl8192c_common]
[ 23.547377] [<ffffffffc06d9a50>] ? rtl92c_dm_dynamic_txpower+0x9e0/0x9e0
[rtl8192c_common]
[ 23.547378] [<ffffffff8222038d>] ? pick_next_entity+0x18d/0x400
[ 23.547380] [<ffffffff8223fdde>] ? pick_next_task_fair+0xa6e/0xf60
[ 23.547381] [<ffffffff8207548a>] ? __switch_to+0x7ba/0x1160
[ 23.547383] [<ffffffffc0689a2b>] ? rtl_watchdog_wq_callback+0xb7b/0x1190
[rtlwifi]
[ 23.547385] [<ffffffffc0688eb0>] ? rtl_tx_mgmt_proc+0x2c0/0x2c0 [rtlwifi]
[ 23.547387] [<ffffffff82d1ecfe>] ? drm_fb_helper_dirty_work+0x25e/0x2f0
[ 23.547388] [<ffffffff83c65d20>] ? io_schedule_timeout+0x390/0x390
[ 23.547390] [<ffffffff82b8cef0>] ? update_attr.isra.2+0x170/0x170
[ 23.547391] [<ffffffff82b80a01>] ? fb_flashcursor+0x331/0x3e0
[ 23.547393] [<ffffffff821d7af9>] ? process_one_work+0x539/0x12b0
[ 23.547394] [<ffffffff821d894f>] ? worker_thread+0xdf/0x13e0
[ 23.547396] [<ffffffff8225343c>] ? __wake_up_common+0xbc/0x160
[ 23.547398] [<ffffffff821d8870>] ? process_one_work+0x12b0/0x12b0
[ 23.547399] [<ffffffff821e7f89>] ? kthread+0x1b9/0x210
[ 23.547400] [<ffffffff821e7dd0>] ? kthread_park+0x80/0x80
[ 23.547401] [<ffffffff821e7dd0>] ? kthread_park+0x80/0x80
[ 23.547403] [<ffffffff821e7dd0>] ? kthread_park+0x80/0x80
[ 23.547404] [<ffffffff83c724c5>] ? ret_from_fork+0x25/0x30
[ 23.547405] Memory state around the buggy address:
[ 23.547406] ffff8801c90eda00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[ 23.547407] ffff8801c90eda80: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[ 23.547407] >ffff8801c90edb00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[ 23.547408] ^
[ 23.547409] ffff8801c90edb80: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[ 23.547410] ffff8801c90edc00: fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe fe
[ 23.547410] ==================================================================
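
As a reading aid for the "Memory state" dumps above: generic KASAN keeps one shadow byte per 8 bytes of memory, and the rows in the report are 0x80 bytes apart, so each row shows 16 shadow bytes. The sketch below only works out which shadow byte in a row belongs to a faulting address. The helper names are made up for illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

/* Generic KASAN: one shadow byte describes 8 bytes of memory. */
#define KASAN_GRANULE 8u
/* Rows in the report above start 0x80 bytes apart (16 shadow bytes each). */
#define ROW_BYTES 0x80u

static_assert(ROW_BYTES % KASAN_GRANULE == 0, "a row must hold whole granules");

/* Start address of the report row that contains addr (hypothetical helper). */
static uint64_t row_base(uint64_t addr)
{
	return addr & ~(uint64_t)(ROW_BYTES - 1);
}

/* Zero-based position of addr's shadow byte within its row (hypothetical helper). */
static unsigned int shadow_index(uint64_t addr)
{
	return (unsigned int)((addr - row_base(addr)) / KASAN_GRANULE);
}
```

For the two reported addresses, ffff8801c90edb08 and ffff8801c90edb1c, both fall in the row marked with ">" (base ffff8801c90edb00), at shadow positions 1 and 3 respectively. Every shadow byte in that row is fe, so neither access lands in memory that KASAN considers valid.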

--
Dmitry


2017-02-05 17:30:33

by Larry Finger

Subject: Re: rtlwifi: rtl8192c_common: "BUG: KASAN: slab-out-of-bounds"

On 02/05/2017 05:34 AM, Dmitry Osipenko wrote:
> BTW, I have an issue with the 8192cu: WiFi stops to work after a while (3-15
> minutes) if I enable WMM QoS on the AP. There is nothing suspicious in KMSG,
> connection is up but no packets go in/out. I tried to enable debug messages in
> the driver, so when the WiFi stops to work I see that some "temperature/led"
> notify still going on in the driver, but nothing happens when I try to initiate
> a transfer (say to open a web page) - the log is silent, like the requests are
> getting stuck/dropped somewhere before reaching the driver. Is it a known issue?
> With the QoS disabled everything works hunky-dory, however I get 2x-4x faster
> download speed with QoS enabled (while it works.)
>
> I noticed that rtl92c_init_edca_param() isn't wired in the driver, so I suppose
> the QoS isn't implemented yet, right?
>
> If it is an expected behaviour, I think at least printing a warning message in
> the KMSG like "QoS unimplemented, you may expect problems" should be good enough
> to avoid confusion.

As you have already seen, I decided to defer the more invasive patch. When
backporting to stable, the smaller the change the better.

I have no knowledge of the internals of the RTL8192CU chip. As a result, the
kinds of changes I can make are limited. I do know that the chip does implement
QoS. I also noticed that the set_qos() callback routine was very different in
rtl8192ce than in rtl8192cu. Attached is an untested patch to make the CU
routine look like the CE version. Please see if it makes a difference.

Driver rtl8192cu has never been maintained by Realtek, and it will likely be
removed from the kernel in the next few cycles. As you are running a new kernel,
I would recommend rtl8xxxu instead. That driver has high reliability, and the
speed is improving. Your other option would be a driver offered by the vendor of
your particular device. Realtek used to have these drivers on their web site,
but they now seem to have been removed. If your vendor does not have a driver,
http://www.edimax.com/edimax/mw/cufiles/files/download/Driver_Utility/transfer/Wireless/NIC/EW-7811Un/EW-7811Un_Linux_driver_v1.0.0.5.zip
should work.

Larry




Attachments:
rtl8192cu_add_qos.patch (0.98 kB)

2017-02-06 15:45:33

by Larry Finger

Subject: Re: rtlwifi: rtl8192c_common: "BUG: KASAN: slab-out-of-bounds"

On 02/06/2017 04:29 AM, Johannes Berg wrote:
> On Sat, 2017-02-04 at 12:41 -0600, Larry Finger wrote:
>> On 02/04/2017 10:58 AM, Dmitry Osipenko wrote:
>>> Seems the problem is caused by rtl92c_dm_*() casting .priv to
>>> "struct
>>> rtl_pci_priv", while it is "struct rtl_usb_priv".
>>
>> Those routines are shared by rtl8192ce and rtl8192cu, thus we need to
>> make that
>> difference in cast to be immaterial. I think we need to move "struct
>> bt_coexist_info" to the beginning of both rtlpci_priv and
>> rtl_usb_priv. Then it
>> should not matter.
>
> I think you really should consider putting a struct rtl_common into
> that or something, and getting rid of all the casting that causes this
> problem to start with?

The fix you suggest is prepared and will be submitted soon. As it is much more
invasive with ~150 insertions and ~160 deletions, I decided not to have it be
the one that is pushed to all stable kernels from 4.0 onward.

Larry

2017-02-05 18:15:55

by Dmitry Osipenko

Subject: Re: rtlwifi: rtl8192c_common: "BUG: KASAN: slab-out-of-bounds"

On 05.02.2017 20:30, Larry Finger wrote:
> On 02/05/2017 05:34 AM, Dmitry Osipenko wrote:
>> BTW, I have an issue with the 8192cu: WiFi stops to work after a while (3-15
>> minutes) if I enable WMM QoS on the AP. There is nothing suspicious in KMSG,
>> connection is up but no packets go in/out. I tried to enable debug messages in
>> the driver, so when the WiFi stops to work I see that some "temperature/led"
>> notify still going on in the driver, but nothing happens when I try to initiate
>> a transfer (say to open a web page) - the log is silent, like the requests are
>> getting stuck/dropped somewhere before reaching the driver. Is it a known issue?
>> With the QoS disabled everything works hunky-dory, however I get 2x-4x faster
>> download speed with QoS enabled (while it works.)
>>
>> I noticed that rtl92c_init_edca_param() isn't wired in the driver, so I suppose
>> the QoS isn't implemented yet, right?
>>
>> If it is an expected behaviour, I think at least printing a warning message in
>> the KMSG like "QoS unimplemented, you may expect problems" should be good enough
>> to avoid confusion.
>
> As you have already seen, I decided to defer the more invasive patch. When
> backporting to stable, the smaller the change the better.

That is the right approach.

> I have no knowledge of the internals of the RTL8192CU chip. As a result, the
> kinds of changes I can make are limited. I do know that the chip does implement
> QoS. I also noticed that the set_qos() callback routine was very different in
> rtl8192ce than in rtl8192cu. Attached is an untested patch to make the CU
> routine look like the CE version. Please see if it makes a difference.

Thanks a lot. Unfortunately, it doesn't help.

> Driver rtl8192cu has never been maintained by Realtek, and it will likely be
> removed from the kernel in the next few cycles. As you are running a new kernel,
> I would recommend rtl8xxxu instead. That driver has high reliability, and the
> speed is improving. Your other option would be a driver offered by the vendor of
> your particular device. Realtek used to have these drivers on their web site,
> but they now seem to have been removed. If your vendor does not have a driver,
> http://www.edimax.com/edimax/mw/cufiles/files/download/Driver_Utility/transfer/Wireless/NIC/EW-7811Un/EW-7811Un_Linux_driver_v1.0.0.5.zip
> should work.

Oh, I wasn't aware of rtl8xxxu. I thought it was a driver for some other
hardware. Thanks a lot for pointing me to it, I will give it a whirl.

I have been maintaining a personal fork of a downstream driver for a dozen
kernel releases now, because I sometimes need hostapd. Fortunately, it's not a
big burden.

Thanks a lot again, keep up the good work.

--
Dmitry

2017-02-07 17:14:54

by Dmitry Osipenko

Subject: Re: rtlwifi: rtl8192c_common: "BUG: KASAN: slab-out-of-bounds"

On 07.02.2017 19:45, Tobias Guggenmos wrote:
> On Monday, 6 February 2017, 09:45:31 CET, Larry Finger wrote:
>> On 02/06/2017 04:29 AM, Johannes Berg wrote:
>>> On Sat, 2017-02-04 at 12:41 -0600, Larry Finger wrote:
>>>> On 02/04/2017 10:58 AM, Dmitry Osipenko wrote:
>>>>> Seems the problem is caused by rtl92c_dm_*() casting .priv to
>>>>> "struct
>>>>> rtl_pci_priv", while it is "struct rtl_usb_priv".
>>>>
>>>> Those routines are shared by rtl8192ce and rtl8192cu, thus we need to
>>>> make that
>>>> difference in cast to be immaterial. I think we need to move "struct
>>>> bt_coexist_info" to the beginning of both rtlpci_priv and
>>>> rtl_usb_priv. Then it
>>>> should not matter.
>>>
>>> I think you really should consider putting a struct rtl_common into
>>> that or something, and getting rid of all the casting that causes this
>>> problem to start with?
>>
>> The fix you suggest is prepared and will be submitted soon. As it is much
>> more invasive with ~150 insertions and ~160 deletions, I decided not to
>> have it be the one that is pushed to all stable kernels from 4.0 onward.
>>
>> Larry
>
> This is possibly related to the following Fedora Bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=1391987
>

The bug only affects USB adapters (8192cu); PCIe (8192ce) should be fine. The
Fedora bug sounds like the one I see with QoS enabled on the AP.

--
Dmitry

2017-02-04 18:41:30

by Larry Finger

Subject: Re: rtlwifi: rtl8192c_common: "BUG: KASAN: slab-out-of-bounds"

On 02/04/2017 10:58 AM, Dmitry Osipenko wrote:
> Seems the problem is caused by rtl92c_dm_*() casting .priv to "struct
> rtl_pci_priv", while it is "struct rtl_usb_priv".

Those routines are shared by rtl8192ce and rtl8192cu, so the difference in the
cast needs to become immaterial. I think we need to move "struct
bt_coexist_info" to the beginning of both rtl_pci_priv and rtl_usb_priv. Then it
should not matter which of the two the shared code casts to.
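
To illustrate why moving the member to the front helps, here is a minimal userspace sketch. The struct and field names are simplified stand-ins (assumptions, not the real rtlwifi definitions), and the shared helper reads only the leading member instead of casting to one of the bus-specific types.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real rtlwifi structures have many more fields. */
struct bt_coexist_info {
	unsigned char bt_coexistence;
	int bt_state;
};
struct pci_dev_state { char regs[256]; };  /* hypothetical PCI-only state */
struct usb_dev_state { char pipes[32]; };  /* hypothetical USB-only state */

/* After the reorder, both private areas start with the shared member. */
struct pci_priv_sketch {
	struct bt_coexist_info bt_coexist;
	struct pci_dev_state dev;
};
struct usb_priv_sketch {
	struct bt_coexist_info bt_coexist;
	struct usb_dev_state dev;
};

static_assert(offsetof(struct pci_priv_sketch, bt_coexist) == 0,
	      "shared member must lead the PCI private area");
static_assert(offsetof(struct usb_priv_sketch, bt_coexist) == 0,
	      "shared member must lead the USB private area");

/* Shared code: a pointer to either struct also points at its first member. */
static unsigned char shared_bt_flag(void *priv)
{
	struct bt_coexist_info *bt = priv;

	return bt->bt_coexistence;
}
```

With the shared member at offset 0 in both layouts, the shared routine reads the same bytes regardless of which bus driver allocated the private area. Any other field that shared code touches would need the same treatment, which is why this is only a small, backportable fix.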

I do not have a gcc version new enough to turn KASAN testing on, so the
attached patch is only compile-tested. Does it fix the problem?

Larry


Attachments:
reorder_private_data.patch (1.03 kB)

2017-02-07 17:42:59

by Dmitry Osipenko

Subject: Re: rtlwifi: rtl8192c_common: "BUG: KASAN: slab-out-of-bounds"

On 07.02.2017 20:22, Tobias Guggenmos wrote:
> On Sunday, 5 February 2017, 11:30:30 CET, Larry Finger wrote:
>> On 02/05/2017 05:34 AM, Dmitry Osipenko wrote:
>>> BTW, I have an issue with the 8192cu: WiFi stops to work after a while
>>> (3-15 minutes) if I enable WMM QoS on the AP. There is nothing suspicious
>>> in KMSG, connection is up but no packets go in/out. I tried to enable
>>> debug messages in the driver, so when the WiFi stops to work I see that
>>> some "temperature/led" notify still going on in the driver, but nothing
>>> happens when I try to initiate a transfer (say to open a web page) - the
>>> log is silent, like the requests are getting stuck/dropped somewhere
>>> before reaching the driver. Is it a known issue? With the QoS disabled
>>> everything works hunky-dory, however I get 2x-4x faster download speed
>>> with QoS enabled (while it works.)
>>>
>>> I noticed that rtl92c_init_edca_param() isn't wired in the driver, so I
>>> suppose the QoS isn't implemented yet, right?
>>>
>>> If it is an expected behaviour, I think at least printing a warning
>>> message in the KMSG like "QoS unimplemented, you may expect problems"
>>> should be good enough to avoid confusion.
>>
>> As you have already seen, I decided to defer the more invasive patch. When
>> backporting to stable, the smaller the change the better.
>>
>> I have no knowledge of the internals of the RTL8192CU chip. As a result, the
>> kinds of changes I can make are limited. I do know that the chip does
>> implement QoS. I also noticed that the set_qos() callback routine was very
>> different in rtl8192ce than in rtl8192cu. Attached is an untested patch to
>> make the CU routine look like the CE version. Please see if it makes a
>> difference.
>>
>> Driver rtl8192cu has never been maintained by Realtek, and it will likely be
>> removed from the kernel in the next few cycles. As you are running a new
>> kernel, I would recommend rtl8xxxu instead. That driver has high
>> reliability, and the speed is improving. Your other option would be a
>> driver offered by the vendor of your particular device. Realtek used to
>> have these drivers on their web site, but they now seem to have been
>> removed. If your vendor does not have a driver,
>> http://www.edimax.com/edimax/mw/cufiles/files/download/Driver_Utility/trans
>> fer/Wireless/NIC/EW-7811Un/EW-7811Un_Linux_driver_v1.0.0.5.zip should work.
>>
>> Larry
>
> On my Realtek RTL8188CE card (using the rtl8192ce driver) the patch seems to
> fix the Issue (on Kernel 4.9.0).
>
> In contrast to what Dmitry Osipenko experienced, before the patch was applied,
> the WIFI usually crashed already a few seconds instead of 3-15 minutes after
> connecting to a network.
>

The QoS issue is unrelated to the original bug. I think you are referring to
"reorder_private_data.patch" here; it shouldn't affect anything other than USB.
Maybe some other memory corruption is going on?

--
Dmitry

2017-02-04 16:58:40

by Dmitry Osipenko

Subject: Re: rtlwifi: rtl8192c_common: "BUG: KASAN: slab-out-of-bounds"

It seems the problem is caused by rtl92c_dm_*() casting .priv to "struct
rtl_pci_priv", while for the USB driver it is actually "struct rtl_usb_priv".
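
A minimal sketch of how such a cast can read out of bounds, assuming simplified stand-in structs (the names and sizes below are hypothetical, not the real rtlwifi layouts). It only compares offsets and sizes and never performs the bad read.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; sizes are invented purely for illustration. */
struct bt_coexist_info {
	unsigned char bt_coexistence;
	int bt_state;
};
struct pci_dev_state { char regs[256]; };  /* hypothetical, large */
struct usb_dev_state { char pipes[32]; };  /* hypothetical, small */

/* PCI layout: the BT coexistence data sits behind the PCI-only state. */
struct pci_priv_sketch {
	struct pci_dev_state dev;
	struct bt_coexist_info bt_coexist;
};

/* USB layout: a smaller allocation with no member at that offset. */
struct usb_priv_sketch {
	struct usb_dev_state dev;
};

static_assert(sizeof(struct usb_priv_sketch) < sizeof(struct pci_priv_sketch),
	      "the sketch assumes the USB private area is the smaller one");

/*
 * Returns 1 if reading bt_coexist through a PCI-typed pointer would reach
 * past the end of an allocation sized for the USB private area.
 */
static int pci_view_overruns_usb_alloc(void)
{
	size_t member_end = offsetof(struct pci_priv_sketch, bt_coexist) +
			    sizeof(struct bt_coexist_info);

	return member_end > sizeof(struct usb_priv_sketch);
}
```

In this sketch the member the shared code wants lies entirely beyond the USB-sized object, which matches the kind of read KASAN flagged in rtl92c_dm_bt_coexist().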


--
Dmitry

2017-02-07 17:22:45

by Tobias Guggenmos

Subject: Re: rtlwifi: rtl8192c_common: "BUG: KASAN: slab-out-of-bounds"

On Sunday, 5 February 2017, 11:30:30 CET, Larry Finger wrote:
> On 02/05/2017 05:34 AM, Dmitry Osipenko wrote:
> > BTW, I have an issue with the 8192cu: WiFi stops to work after a while
> > (3-15 minutes) if I enable WMM QoS on the AP. There is nothing suspicious
> > in KMSG, connection is up but no packets go in/out. I tried to enable
> > debug messages in the driver, so when the WiFi stops to work I see that
> > some "temperature/led" notify still going on in the driver, but nothing
> > happens when I try to initiate a transfer (say to open a web page) - the
> > log is silent, like the requests are getting stuck/dropped somewhere
> > before reaching the driver. Is it a known issue? With the QoS disabled
> > everything works hunky-dory, however I get 2x-4x faster download speed
> > with QoS enabled (while it works.)
> >
> > I noticed that rtl92c_init_edca_param() isn't wired in the driver, so I
> > suppose the QoS isn't implemented yet, right?
> >
> > If it is an expected behaviour, I think at least printing a warning
> > message in the KMSG like "QoS unimplemented, you may expect problems"
> > should be good enough to avoid confusion.
>
> As you have already seen, I decided to defer the more invasive patch. When
> backporting to stable, the smaller the change the better.
>
> I have no knowledge of the internals of the RTL8192CU chip. As a result, the
> kinds of changes I can make are limited. I do know that the chip does
> implement QoS. I also noticed that the set_qos() callback routine was very
> different in rtl8192ce than in rtl8192cu. Attached is an untested patch to
> make the CU routine look like the CE version. Please see if it makes a
> difference.
>
> Driver rtl8192cu has never been maintained by Realtek, and it will likely be
> removed from the kernel in the next few cycles. As you are running a new
> kernel, I would recommend rtl8xxxu instead. That driver has high
> reliability, and the speed is improving. Your other option would be a
> driver offered by the vendor of your particular device. Realtek used to
> have these drivers on their web site, but they now seem to have been
> removed. If your vendor does not have a driver,
> http://www.edimax.com/edimax/mw/cufiles/files/download/Driver_Utility/trans
> fer/Wireless/NIC/EW-7811Un/EW-7811Un_Linux_driver_v1.0.0.5.zip should work.
>
> Larry

On my Realtek RTL8188CE card (using the rtl8192ce driver), the patch seems to
fix the issue (on kernel 4.9.0).

In contrast to what Dmitry Osipenko experienced, before the patch was applied
WiFi usually crashed within a few seconds of connecting to a network, rather
than after 3-15 minutes.



2017-02-05 11:34:22

by Dmitry Osipenko

Subject: Re: rtlwifi: rtl8192c_common: "BUG: KASAN: slab-out-of-bounds"

On 05.02.2017 04:05, Larry Finger wrote:
> On 02/04/2017 01:32 PM, Dmitry Osipenko wrote:
>> On 04.02.2017 21:41, Larry Finger wrote:
>>> On 02/04/2017 10:58 AM, Dmitry Osipenko wrote:
>>>> Seems the problem is caused by rtl92c_dm_*() casting .priv to "struct
>>>> rtl_pci_priv", while it is "struct rtl_usb_priv".
>>>
>>> Those routines are shared by rtl8192ce and rtl8192cu, thus we need to make that
>>> difference in cast to be immaterial. I think we need to move "struct
>>> bt_coexist_info" to the beginning of both rtlpci_priv and rtl_usb_priv. Then it
>>> should not matter.
>>>
>>> I do not have a gcc version new enough to turn KASAN testing on, thus the
>>> attached patch is only compile tested. Does it fix the problem?
>>
>> Thank you for the patch, it indeed fixes the bug.
>>
>> I noticed that struct rtl_priv contains .btcoexist, isn't it duplicated in the
>> struct rtl_pci_priv?
>
> Thanks for testing. When I submit the patch, is it OK to cite your reporting and
> testing?

Sure, Tested-by: Dmitry Osipenko <[email protected]>

> Yes, the bt_coexist_info structure is in two different places. I will change the
> code in rtl8192c-common and rtl8192ce to use only the one in rtlpriv. That
> should satisfy the problem you reported, as well as clean up the code.
>
> Thanks again,

Good, thank you.

BTW, I have an issue with the 8192cu: WiFi stops working after a while (3-15
minutes) if I enable WMM QoS on the AP. There is nothing suspicious in KMSG; the
connection is up but no packets go in or out. I enabled debug messages in the
driver, and once WiFi stops working I see that some "temperature/led"
notifications are still going on in the driver, but nothing happens when I try
to initiate a transfer (say, open a web page). The log stays silent, as if the
requests were getting stuck or dropped somewhere before reaching the driver. Is
it a known issue? With QoS disabled everything works hunky-dory, but I get
2x-4x faster download speed with QoS enabled (while it works).

I noticed that rtl92c_init_edca_param() isn't wired up in the driver, so I
suppose QoS isn't implemented yet, right?

If this is expected behaviour, I think printing a warning in KMSG such as
"QoS unimplemented, you may expect problems" would at least be enough to avoid
confusion.

--
Dmitry

2017-02-07 16:53:15

by Tobias Guggenmos

Subject: Re: rtlwifi: rtl8192c_common: "BUG: KASAN: slab-out-of-bounds"

On Monday, 6 February 2017, 09:45:31 CET, Larry Finger wrote:
> On 02/06/2017 04:29 AM, Johannes Berg wrote:
> > On Sat, 2017-02-04 at 12:41 -0600, Larry Finger wrote:
> >> On 02/04/2017 10:58 AM, Dmitry Osipenko wrote:
> >>> Seems the problem is caused by rtl92c_dm_*() casting .priv to
> >>> "struct
> >>> rtl_pci_priv", while it is "struct rtl_usb_priv".
> >>
> >> Those routines are shared by rtl8192ce and rtl8192cu, thus we need to
> >> make that
> >> difference in cast to be immaterial. I think we need to move "struct
> >> bt_coexist_info" to the beginning of both rtlpci_priv and
> >> rtl_usb_priv. Then it
> >> should not matter.
> >
> > I think you really should consider putting a struct rtl_common into
> > that or something, and getting rid of all the casting that causes this
> > problem to start with?
>
> The fix you suggest is prepared and will be submitted soon. As it is much
> more invasive with ~150 insertions and ~160 deletions, I decided not to
> have it be the one that is pushed to all stable kernels from 4.0 onward.
>
> Larry

This is possibly related to the following Fedora Bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1391987



2017-02-04 19:33:03

by Dmitry Osipenko

Subject: Re: rtlwifi: rtl8192c_common: "BUG: KASAN: slab-out-of-bounds"

On 04.02.2017 21:41, Larry Finger wrote:
> On 02/04/2017 10:58 AM, Dmitry Osipenko wrote:
>> Seems the problem is caused by rtl92c_dm_*() casting .priv to "struct
>> rtl_pci_priv", while it is "struct rtl_usb_priv".
>
> Those routines are shared by rtl8192ce and rtl8192cu, thus we need to make that
> difference in cast to be immaterial. I think we need to move "struct
> bt_coexist_info" to the beginning of both rtlpci_priv and rtl_usb_priv. Then it
> should not matter.
>
> I do not have a gcc version new enough to turn KASAN testing on, thus the
> attached patch is only compile tested. Does it fix the problem?

Thank you for the patch; it does indeed fix the bug.

I noticed that struct rtl_priv contains .btcoexist. Isn't it duplicated in
struct rtl_pci_priv?

--
Dmitry

2017-02-08 00:53:26

by Larry Finger

Subject: Re: rtlwifi: rtl8192c_common: "BUG: KASAN: slab-out-of-bounds"

On 02/07/2017 10:45 AM, Tobias Guggenmos wrote:
> On Monday, 6 February 2017, 09:45:31 CET, Larry Finger wrote:
>> On 02/06/2017 04:29 AM, Johannes Berg wrote:
>>> On Sat, 2017-02-04 at 12:41 -0600, Larry Finger wrote:
>>>> On 02/04/2017 10:58 AM, Dmitry Osipenko wrote:
>>>>> Seems the problem is caused by rtl92c_dm_*() casting .priv to
>>>>> "struct
>>>>> rtl_pci_priv", while it is "struct rtl_usb_priv".
>>>>
>>>> Those routines are shared by rtl8192ce and rtl8192cu, thus we need to
>>>> make that
>>>> difference in cast to be immaterial. I think we need to move "struct
>>>> bt_coexist_info" to the beginning of both rtlpci_priv and
>>>> rtl_usb_priv. Then it
>>>> should not matter.
>>>
>>> I think you really should consider putting a struct rtl_common into
>>> that or something, and getting rid of all the casting that causes this
>>> problem to start with?
>>
>> The fix you suggest is prepared and will be submitted soon. As it is much
>> more invasive with ~150 insertions and ~160 deletions, I decided not to
>> have it be the one that is pushed to all stable kernels from 4.0 onward.
>>
>> Larry
>
> This is possibly related to the following Fedora Bug:
> https://bugzilla.redhat.com/show_bug.cgi?id=1391987

This bug is unlikely to be the cause of that problem. In fact, this bug only
affects rtl8192cu, not rtl8192ce. The Red Hat problem is more likely caused by
the issue addressed in the not-yet-merged patch entitled "rtlwifi: rtl8192ce:
Fix loading of incorrect firmware".

Larry

2017-02-05 01:05:03

by Larry Finger

Subject: Re: rtlwifi: rtl8192c_common: "BUG: KASAN: slab-out-of-bounds"

On 02/04/2017 01:32 PM, Dmitry Osipenko wrote:
> On 04.02.2017 21:41, Larry Finger wrote:
>> On 02/04/2017 10:58 AM, Dmitry Osipenko wrote:
>>> Seems the problem is caused by rtl92c_dm_*() casting .priv to "struct
>>> rtl_pci_priv", while it is "struct rtl_usb_priv".
>>
>> Those routines are shared by rtl8192ce and rtl8192cu, thus we need to make that
>> difference in cast to be immaterial. I think we need to move "struct
>> bt_coexist_info" to the beginning of both rtlpci_priv and rtl_usb_priv. Then it
>> should not matter.
>>
>> I do not have a gcc version new enough to turn KASAN testing on, thus the
>> attached patch is only compile tested. Does it fix the problem?
>
> Thank you for the patch, it indeed fixes the bug.
>
> I noticed that struct rtl_priv contains .btcoexist, isn't it duplicated in the
> struct rtl_pci_priv?

Thanks for testing. When I submit the patch, is it OK to cite your reporting and
testing?

Yes, the bt_coexist_info structure is in two different places. I will change the
code in rtl8192c-common and rtl8192ce to use only the one in rtlpriv. That
should resolve the problem you reported, as well as clean up the code.

Thanks again,

Larry

2017-02-06 10:29:47

by Johannes Berg

Subject: Re: rtlwifi: rtl8192c_common: "BUG: KASAN: slab-out-of-bounds"

On Sat, 2017-02-04 at 12:41 -0600, Larry Finger wrote:
> On 02/04/2017 10:58 AM, Dmitry Osipenko wrote:
> > Seems the problem is caused by rtl92c_dm_*() casting .priv to
> > "struct
> > rtl_pci_priv", while it is "struct rtl_usb_priv".
>
> Those routines are shared by rtl8192ce and rtl8192cu, thus we need to
> make that 
> difference in cast to be immaterial. I think we need to move "struct 
> bt_coexist_info" to the beginning of both rtlpci_priv and
> rtl_usb_priv. Then it 
> should not matter.

I think you really should consider putting a struct rtl_common into
that or something, and getting rid of all the casting that causes this
problem to start with?
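
One way to read this suggestion, as a rough userspace sketch: embed a common struct in each bus-specific private area and have shared code take a pointer to that common part by type, so no cast is involved. The name rtl_common_sketch and the field layout are assumptions for illustration, not the real rtlwifi definitions or the patch that was later submitted.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the shared BT coexistence state. */
struct bt_coexist_info {
	unsigned char bt_coexistence;
	int bt_state;
};

/* Hypothetical container for everything the shared routines need. */
struct rtl_common_sketch {
	struct bt_coexist_info bt_coexist;
};

/* Each bus-specific private area embeds the common part by value. */
struct pci_priv_sketch {
	struct rtl_common_sketch common;
	char regs[256];   /* hypothetical PCI-only state */
};
struct usb_priv_sketch {
	struct rtl_common_sketch common;
	char pipes[32];   /* hypothetical USB-only state */
};

/* Shared code is typed on the common part, so there is nothing to cast. */
static int bt_coexist_enabled(const struct rtl_common_sketch *common)
{
	assert(common != NULL);
	return common->bt_coexist.bt_coexistence != 0;
}
```

Each bus driver would pass &priv->common explicitly, so the compiler checks the type and the position of the common part inside the private area no longer matters.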

johannes