2007-08-29 22:37:57

by Jochen Voss

[permalink] [raw]
Subject: oops with recent wireless-dev tree

Hi,

when I use the b43 driver with my PCMCIA LinkSys WRT54GL adapter
(Broadcom 4318), run hostapd on the interface, and then try to add the
interface to a bridge, I get the following oops:

BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
*pde = 00000000
Oops: 0000 [#1]
CPU: 0
EIP: 0060:[<c02a2cb2>] Not tainted VLI
EFLAGS: 00010292 (2.6.23-rc3-wd #1)
EIP is at port_cost+0x11/0xaa
eax: c3c40000 ebx: c3c40000 ecx: 00000000 edx: 000000c0
esi: 00000000 edi: c2a3ea80 ebp: c2a31e00 esp: c2a31dc8
ds: 007b es: 007b fs: 0000 gs: 0033 ss: 0068
Process brctl (pid: 1498, ti=c2a30000 task=c25734c0 task.ti=c2a30000)
Stack: c0119bcf c3680140 c10beea0 c2a31ddc c0148dc4 0000000d c2a31e00 00000296
f000007e c10beea0 000080d0 c2bb7700 c3c40000 c2a3ea80 c2a31e2c c02a2ee3
c251b600 c3680270 00000001 c2bb7700 c3073380 c01117df c3c40000 00000001
Call Trace:
[<c010294f>] show_trace_log_lvl+0x1a/0x2f
[<c0102a01>] show_stack_log_lvl+0x9d/0xa5
[<c0102d76>] show_registers+0x1a5/0x277
[<c0102f24>] die+0xdc/0x1a5
[<c01082d1>] do_page_fault+0x461/0x530
[<c02db49a>] error_code+0x6a/0x70
[<c02a2ee3>] br_add_if+0x119/0x290
[<c02a35b0>] add_del_if+0x40/0x56
[<c02a3acc>] br_dev_ioctl+0x506/0x537
[<c0219bc8>] dev_ifsioc+0x3c1/0x3dd
[<c0219f2b>] dev_ioctl+0x347/0x3d4
[<c02110ec>] sock_ioctl+0x183/0x18f
[<c01435d4>] do_ioctl+0x1c/0x4b
[<c01437d3>] vfs_ioctl+0x1d0/0x1df
[<c0143815>] sys_ioctl+0x33/0x4a
[<c0102422>] syscall_call+0x7/0xb
=======================
Code: ff 31 c0 eb 05 b8 ea ff ff ff 5b 5d c3 55 89 e5 83 e8 7c e8 35 58 e9 ff 5d c3 55 89 e5 57 56 53 83 ec 2c 89 c3 8b b0 b4 00 00 00 <83> 3e 00 74 53
EIP: [<c02a2cb2>] port_cost+0x11/0xaa SS:ESP 0068:c2a31dc8

If I read this correctly, the EIP in the last line corresponds to
net/bridge/br_if.c, line 36:

static int port_cost(struct net_device *dev)
{
if (dev->ethtool_ops->get_settings) {
^^^^

As far as I can figure out, dev->ethtool_ops is NULL and the crash
happens while trying to derefernce ...->get_settings.

Is dev->ethtool_ops allowed to be NULL? In this case the appended
patch might be the correct fix. At least it makes the oops disappear
for me. Another possible fix would be to add an ethtool_ops structure
to the device created by b43.

I hope this helps,
Jochen
--
http://seehuhn.de/
----------------------------------------------------------------------
Avoid crashes while determining initial path cost for a bridge.

Check whether 'dev->ethtool_ops' is non-NULL before accessing
'dev->ethtool_ops->get_settings'.

Signed-off-by: Jochen Voss <[email protected]>

diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index b40dada..5b396ea 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -33,7 +33,7 @@
*/
static int port_cost(struct net_device *dev)
{
- if (dev->ethtool_ops->get_settings) {
+ if (dev->ethtool_ops && dev->ethtool_ops->get_settings) {
struct ethtool_cmd ecmd = { ETHTOOL_GSET };
int err = dev->ethtool_ops->get_settings(dev, &ecmd);
if (!err) {


Attachments:
(No filename) (3.21 kB)
signature.asc (189.00 B)
Digital signature
Download all attachments

2007-08-30 15:45:15

by Matthew Wilcox

[permalink] [raw]
Subject: Re: oops with recent wireless-dev tree

On Thu, Aug 30, 2007 at 05:01:31PM +0200, Johannes Berg wrote:
> Jochen had a patch: http://marc.info/?l=linux-wireless&m=118842715026614&w=2

That's exactly the right patch, please add

Acked-by: Matthew Wilcox <[email protected]>

I just checked over the commit that introduced the bug, and I didn't
make the same mistake anywhere else.

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."

2007-08-30 12:04:24

by Johannes Berg

[permalink] [raw]
Subject: Re: oops with recent wireless-dev tree

Hi Jochen,

[added CCs since it affects bridge code]

> If I read this correctly, the EIP in the last line corresponds to
> net/bridge/br_if.c, line 36:
>
> static int port_cost(struct net_device *dev)
> {
> if (dev->ethtool_ops->get_settings) {
> ^^^^
>
> As far as I can figure out, dev->ethtool_ops is NULL and the crash
> happens while trying to derefernce ...->get_settings.
>
> Is dev->ethtool_ops allowed to be NULL? In this case the appended
> patch might be the correct fix. At least it makes the oops disappear
> for me. Another possible fix would be to add an ethtool_ops structure
> to the device created by b43.

I don't think adding ethtool_ops in mac80211 should be necessary.
Stephen?

johannes


Attachments:
signature.asc (190.00 B)
This is a digitally signed message part

2007-08-30 16:48:14

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH] bridge: fix OOPS when bridging device without ethtool

On Thu, Aug 30, 2007 at 08:29:32AM -0700, Stephen Hemminger wrote:
> Bridge code calls ethtool to get speed. The conversion to using
> only ethtool_ops broke the case of devices without ethtool_ops.
> This is a new regression in 2.6.23.
>
> Rearranged the switch to a logical order, and use gcc initializer.
>
> Ps: speed should have been part of the network device structure from
> the start rather than burying it in ethtool.

Feel free to do the conversion ;-) One of the things I like about the
ethtool framework is it gives us a way to take stuff out of the drivers
and put it in the midlayer without disturbing userspace.

> Signed-off-by: Stephen Hemminger <[email protected]>

Acked-by: Matthew Wilcox <[email protected]>

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."

2007-08-30 14:58:42

by Matthew Wilcox

[permalink] [raw]
Subject: Re: oops with recent wireless-dev tree

On Thu, Aug 30, 2007 at 07:49:49AM -0700, Stephen Hemminger wrote:
> > > static int port_cost(struct net_device *dev)
> > > {
> > > if (dev->ethtool_ops->get_settings) {
> > > ^^^^
> > >
> > > As far as I can figure out, dev->ethtool_ops is NULL and the crash
> > > happens while trying to derefernce ...->get_settings.
>
> Devices aren't required to have ethtool_ops. The code there used to
> call ethtool directly, and it would handle the error cases. I'll rollup
> a fix this morning.

Yep, clearly my fault; that should read:

if (dev->ethtool_ops && dev->ethtool_ops->get_settings) {

Since Stephen's already bagged fixing this, I shan't send a patch.
But if it includes something like the line above please add:

Acked-by: Matthew Wilcox <[email protected]>

--
Intel are signing my paycheques ... these opinions are still mine
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."

2007-08-30 15:31:12

by Stephen Hemminger

[permalink] [raw]
Subject: [PATCH] bridge: fix OOPS when bridging device without ethtool

Bridge code calls ethtool to get speed. The conversion to using
only ethtool_ops broke the case of devices without ethtool_ops.
This is a new regression in 2.6.23.

Rearranged the switch to a logical order, and use gcc initializer.

Ps: speed should have been part of the network device structure from
the start rather than burying it in ethtool.

Signed-off-by: Stephen Hemminger <[email protected]>


--- a/net/bridge/br_if.c 2007-08-30 07:49:01.000000000 -0700
+++ b/net/bridge/br_if.c 2007-08-30 07:48:16.000000000 -0700
@@ -33,17 +33,17 @@
*/
static int port_cost(struct net_device *dev)
{
- if (dev->ethtool_ops->get_settings) {
- struct ethtool_cmd ecmd = { ETHTOOL_GSET };
- int err = dev->ethtool_ops->get_settings(dev, &ecmd);
- if (!err) {
+ if (dev->ethtool_ops && dev->ethtool_ops->get_settings) {
+ struct ethtool_cmd ecmd = { .cmd = ETHTOOL_GSET, };
+
+ if (!dev->ethtool_ops->get_settings(dev, &ecmd)) {
switch(ecmd.speed) {
- case SPEED_100:
- return 19;
- case SPEED_1000:
- return 4;
case SPEED_10000:
return 2;
+ case SPEED_1000:
+ return 4;
+ case SPEED_100:
+ return 19;
case SPEED_10:
return 100;
}

2007-08-30 15:00:41

by Johannes Berg

[permalink] [raw]
Subject: Re: oops with recent wireless-dev tree

On Thu, 2007-08-30 at 07:49 -0700, Stephen Hemminger wrote:

> Devices aren't required to have ethtool_ops. The code there used to
> call ethtool directly, and it would handle the error cases. I'll rollup
> a fix this morning.

Great, thanks.

Jochen had a patch: http://marc.info/?l=linux-wireless&m=118842715026614&w=2

johannes


Attachments:
signature.asc (190.00 B)
This is a digitally signed message part

2007-08-31 05:16:43

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] bridge: fix OOPS when bridging device without ethtool

From: Matthew Wilcox <[email protected]>
Date: Thu, 30 Aug 2007 10:48:13 -0600

> On Thu, Aug 30, 2007 at 08:29:32AM -0700, Stephen Hemminger wrote:
> > Bridge code calls ethtool to get speed. The conversion to using
> > only ethtool_ops broke the case of devices without ethtool_ops.
> > This is a new regression in 2.6.23.
> >
> > Rearranged the switch to a logical order, and use gcc initializer.
> >
> > Ps: speed should have been part of the network device structure from
> > the start rather than burying it in ethtool.
>
> Feel free to do the conversion ;-) One of the things I like about the
> ethtool framework is it gives us a way to take stuff out of the drivers
> and put it in the midlayer without disturbing userspace.
>
> > Signed-off-by: Stephen Hemminger <[email protected]>
>
> Acked-by: Matthew Wilcox <[email protected]>

Applied, thanks!