2014-12-21 17:25:23

by Eric Biggers

Subject: [BUG] rtl8192se: panic accessing unmapped memory in skb

Hi,

I have an RTL8192SE wireless card, attached via PCI. Usually it works with no
issues, but I recently had a kernel panic occur in the rtl8192se driver. The
kernel version is 3.18. Based on my analysis of the panic dump, the panic was
caused by a memory access violation in this block of code in
rtl92se_rx_query_desc():

	if (stats->decrypted) {
		hdr = (struct ieee80211_hdr *)(skb->data +
		       stats->rx_drvinfo_size + stats->rx_bufshift);

		if ((_ieee80211_is_robust_mgmt_frame(hdr)) &&
		    (ieee80211_has_protected(hdr->frame_control)))
			rx_status->flag &= ~RX_FLAG_DECRYPTED;
		else
			rx_status->flag |= RX_FLAG_DECRYPTED;
	}

Specifically, the violation occurred the first time hdr->frame_control was
accessed, as part of _ieee80211_is_robust_mgmt_frame().

The panic occurred when the system was under heavy filesystem load, but it does
not seem to be easily reproducible.

There was recently a NULL check that was removed from this exact place in the
code, but it was certainly useless. Instead, what's much more suspect to me is
that inside _rtl_pci_rx_interrupt(), there is no error checking of the return
value of _rtl_pci_init_one_rxdesc(), which might fail if the skb couldn't be
allocated. I am wondering if this could be causing the problem.

Eric
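
The concern about an unchecked return value can be illustrated with a minimal userspace sketch. This is not the rtlwifi code: `init_one_slot()`, `init_ring()`, `try_alloc()`, `allocs_before_failure`, and the sizes are hypothetical names chosen for illustration, and `malloc()` stands in for skb allocation. It only shows a per-slot initializer that reports failure and a caller that checks the result and unwinds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define RING_SIZE 4   /* hypothetical ring length */
#define BUF_SIZE  256 /* hypothetical buffer size */

/* Hypothetical hook to simulate allocation failure; -1 means never fail. */
int allocs_before_failure = -1;

void *try_alloc(size_t n)
{
	if (allocs_before_failure == 0)
		return NULL;
	if (allocs_before_failure > 0)
		allocs_before_failure--;
	return malloc(n);
}

struct rx_slot {
	unsigned char *buf; /* NULL means the slot holds no buffer */
};

/* Returns true on success, false if no buffer could be allocated. */
bool init_one_slot(struct rx_slot *slot)
{
	unsigned char *buf;

	assert(slot != NULL);
	buf = try_alloc(BUF_SIZE);
	if (!buf)
		return false;
	slot->buf = buf;
	return true;
}

void free_ring(struct rx_slot *ring)
{
	for (size_t i = 0; i < RING_SIZE; i++) {
		free(ring[i].buf);
		ring[i].buf = NULL;
	}
}

/* Fill every slot; if any allocation fails, release the ones already made. */
bool init_ring(struct rx_slot *ring)
{
	for (size_t i = 0; i < RING_SIZE; i++)
		ring[i].buf = NULL;

	for (size_t i = 0; i < RING_SIZE; i++) {
		if (!init_one_slot(&ring[i])) {
			free_ring(ring);
			return false;
		}
	}
	return true;
}
```

With the check in place, a failed allocation leaves every slot empty and visible to the caller, rather than leaving a slot whose buffer pointer is later dereferenced.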


2014-12-21 23:02:07

by Larry Finger

Subject: Re: [BUG] rtl8192se: panic accessing unmapped memory in skb

On 12/21/2014 11:25 AM, Eric Biggers wrote:
> Hi,
>
> I have an RTL8192SE wireless card, attached via PCI. Usually it works with no
> issues, but I recently had a kernel panic occur in the rtl8192se driver. The
> kernel version is 3.18. Based on my analysis of the panic dump, the panic was
> caused by a memory access violation in this block of code in
> rtl92se_rx_query_desc():
>
> 	if (stats->decrypted) {
> 		hdr = (struct ieee80211_hdr *)(skb->data +
> 		       stats->rx_drvinfo_size + stats->rx_bufshift);
>
> 		if ((_ieee80211_is_robust_mgmt_frame(hdr)) &&
> 		    (ieee80211_has_protected(hdr->frame_control)))
> 			rx_status->flag &= ~RX_FLAG_DECRYPTED;
> 		else
> 			rx_status->flag |= RX_FLAG_DECRYPTED;
> 	}
>
> Specifically, the violation occurred the first time hdr->frame_control was
> accessed, as part of _ieee80211_is_robust_mgmt_frame().
>
> The panic occurred when the system was under heavy filesystem load, but it does
> not seem to be easily reproducible.
>
> There was recently a NULL check that was removed from this exact place in the
> code, but it was certainly useless. Instead, what's much more suspect to me is
> that inside _rtl_pci_rx_interrupt(), there is no error checking of the return
> value of _rtl_pci_init_one_rxdesc(), which might fail if the skb couldn't be
> allocated. I am wondering if this could be causing the problem.

Your analysis is probably correct; however, I'm not sure what to do if the
allocation of an skb fails. As the name says, this routine is entered through an
interrupt, and I'm not sure what to do other than to exit.

The attached patch logs an error and then exits in that case. Please patch
your system and report back.

How much RAM does your system have? That info might be useful in trying to
reproduce the problem, which might indeed be difficult. Although pci.c was
extensively reworked in the 3.17 => 3.18 transition, most of the changes were
added to implement the changed descriptor structure for the RTL8192EE, and I do
not remember any changes that would affect any of the other drivers. As a
result, the current structure has been in place for some time, and this problem
has not been reported before.

Larry


Attachments:
rtlwifi_report_add_new_rxdesc_failure (2.13 kB)

2014-12-21 23:47:21

by Eric Biggers

Subject: Re: [BUG] rtl8192se: panic accessing unmapped memory in skb

Hi,

To get your patched version to work at all I had to update
_rtl_pci_init_rx_ring() to account for the new return value of
_rtl_pci_init_one_rxdesc(). I will let you know if anything shows up in the
kernel log, but I expect this is a highly sporadic problem. The system has 4 GB
of memory, and I used the 3.18 kernel for 10 days prior to the panic with no
issues. The panic occurred while upgrading system packages, so it's possible
that the system was experiencing memory pressure.

I upgraded from 3.17 to 3.18 on Dec 8, so I've actually only had since then to
notice any bugs that may have been introduced since 3.17.

It does appear there were changes made to pci.c between 3.17 and 3.18. It
appears the 3.17 code will drop the incoming packet if a new skb can't be
allocated, whereas the 3.18 code assumes a new skb can always be allocated. The
3.17 behavior seems more logical to me. I don't know how either of these
behaviors compares to other networking drivers, however.

Eric
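
The drop-on-failure behavior attributed to 3.17 above can be sketched in userspace C. This is an illustration under stated assumptions, not the driver's code: `rx_one()`, `alloc_rx_buf()`, `simulate_alloc_failure`, and `RX_BUF_SIZE` are hypothetical, and `malloc()` stands in for skb allocation. The replacement buffer is allocated before the filled one is handed up; if that allocation fails, the packet is counted as dropped and the old buffer stays in the ring slot.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define RX_BUF_SIZE 256 /* hypothetical buffer size */

/* Hypothetical switch used to simulate memory pressure. */
bool simulate_alloc_failure = false;

struct rx_slot {
	unsigned char *buf; /* buffer currently owned by the ring slot */
};

unsigned char *alloc_rx_buf(void)
{
	return simulate_alloc_failure ? NULL : malloc(RX_BUF_SIZE);
}

/*
 * Handle one received buffer.
 * Returns the filled buffer (the caller now owns it and must free it),
 * or NULL if the packet was dropped because no replacement was available.
 */
unsigned char *rx_one(struct rx_slot *slot, unsigned long *dropped)
{
	unsigned char *fresh;
	unsigned char *filled;

	assert(slot != NULL && slot->buf != NULL && dropped != NULL);

	/* Allocate the replacement first, before giving anything away. */
	fresh = alloc_rx_buf();
	if (!fresh) {
		/* Drop the packet; the slot keeps its old, valid buffer. */
		(*dropped)++;
		return NULL;
	}

	filled = slot->buf;
	slot->buf = fresh; /* the slot never ends up without a buffer */
	return filled;
}
```

The design point is the ordering: because the slot is only updated after the new allocation succeeds, a later pass over the ring always finds a valid buffer, at the cost of losing one packet under memory pressure.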

2014-12-22 17:43:42

by Larry Finger

Subject: Re: [BUG] rtl8192se: panic accessing unmapped memory in skb

On 12/21/2014 05:47 PM, Eric Biggers wrote:
> Hi,
>
> To get your patched version to work at all I had to update
> _rtl_pci_init_rx_ring() to account for the new return value of
> _rtl_pci_init_one_rxdesc(). I will let you know if anything shows up in the
> kernel log, but I expect this is a highly sporadic problem. The system has 4 GB
> of memory, and I used the 3.18 kernel for 10 days prior to the panic with no
> issues. The panic occurred while upgrading system packages, so it's possible
> that the system was experiencing memory pressure.
>
> I upgraded from 3.17 to 3.18 on Dec 8, so I've actually only had since then to
> notice any bugs that may have been introduced since 3.17.
>
> It does appear there were changes made to pci.c between 3.17 and 3.18. It
> appears the 3.17 code will drop the incoming packet if a new skb can't be
> allocated, whereas the 3.18 code assumes a new skb can always be allocated. The
> 3.17 behavior seems more logical to me. I don't know how either of these
> behaviors compares to other networking drivers, however.

Sorry about missing the necessary changes in the rest of the driver. That is
what I get for only compile testing.

I reviewed the 3.17 => 3.18 changes and found the difference in logic that you
noticed and that I missed earlier. As a result, I need to push this change for
3.19, with a notation that it also be applied to 3.18. You have probably
received this patch already. As it needs to be backported, I decided to forgo
changing the
return value of _rtl_pci_init_one_rxdesc(). That change should be made, but
there is no emergency there.

Thanks,

Larry