2007-11-19 01:37:53

by Shaddy Baddah

[permalink] [raw]
Subject: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

Hi,

I have successfully built a 2.6.22 kernel for sparc64, by taking the
configuration for the Debian build of this kernel, and additionally
configuring the zd1211rw driver.

I am able to modprobe this module, and the device is correctly detected.
However, beyond that, I am experiencing problems with the device here
and there.

The first noticeable problem is when using iwconfig on the device. I
cannot set a key for the device, and a query of the configuration always
claims that the "<encryption key is too long>".

If I try to scan for APs using iwlist, I get one AP listed, before a bus
error occurs. This of course, does not suggest that the problem is with
the driver, but I mention it for the record.

If I configure my AP for open authentication, I can actually configure
the device, and be assigned an address via DHCP. However, the connection
drops out frequently, and is quite unstable.

Whilst the device is up, I continually get console messages like those I
have inlined below:

Kernel unaligned access at TPC[100ee4d4] do_rx+0x244/0x614 [zd1211rw]
Kernel unaligned access at TPC[100ee4e8] do_rx+0x258/0x614 [zd1211rw]
Kernel unaligned access at TPC[100ee4cc] do_rx+0x23c/0x614 [zd1211rw]


I am supposing that the driver hasn't really been tested extensively on
sparc64 architecture. However, can anyone shed some light on why the
driver wouldn't just work as it (I assume) would do on 386 architecture?

Thanks in advance,
Shaddy


2007-11-20 12:34:23

by David Miller

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

From: David Miller <[email protected]>
Date: Mon, 19 Nov 2007 00:27:55 -0800 (PST)

> From: Shaddy Baddah <[email protected]>
> Date: Mon, 19 Nov 2007 11:56:39 +1100
>
> > If I try to scan for APs using iwlist, I get one AP listed, before a bus
> > error occurs. This of course, does not suggest that the problem is with
> > the driver, but I mention it for the record.
>
> This is an unaligned data access in the userland tools.
> Try to catch it with GDB and give us ths backtrace.

I think I've figured out what's happening here.

The kernel makes no effort whatsoever to translate iwe streams in
compat environments. And userspace then tries to "correct" this and
does so miserably. Likely this is what causes the bus error.

The fix is that we need to add some handling code
fs/compat_ioctl.c:do_wireless_ioctl() for the case where we are
returning an iwe stream (SIOCGIWSCAN).

It should not be very difficult to do this, since the compat format
will be the same size or smaller, it should be easy to recode the
thing in-place in the user buffer.

So you make a pass over the user buffer fixing things up and then you
adjust the iw_data length for the new size.

Alternatively, you can allocate a kernel buffer for this, use a 'fs =
get_fs(); set_fs(KERNEL_DS); ioctl(); set_fs(fs);' sequence, fixup the
iwe stream, then copy the everything back out to userspace.

Then we can delete all of this incredibly stupid code in the wireless
tools that attempts to fix this up in userspace.

Can someone implement this and test it or send Shaddy a patch to test?

Thanks.

2007-11-19 12:11:49

by Shaddy Baddah

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

David Miller wrote:
>> If I try to scan for APs using iwlist, I get one AP listed, before a bus
>> error occurs. This of course, does not suggest that the problem is with
>> the driver, but I mention it for the record.
>
> This is an unaligned data access in the userland tools.
> Try to catch it with GDB and give us ths backtrace.

I will try and do this at some point. Unfortunately, the system isn't
readily at hand, and it might take a couple of days before I am able to
attend to it.

>> Whilst the device is up, I continually get console messages like those I
>> have inlined below:
>>
>> Kernel unaligned access at TPC[100ee4d4] do_rx+0x244/0x614 [zd1211rw]
>> Kernel unaligned access at TPC[100ee4e8] do_rx+0x258/0x614 [zd1211rw]
>> Kernel unaligned access at TPC[100ee4cc] do_rx+0x23c/0x614 [zd1211rw]
>
> These are unaligned data accesses in the kernel driver.

Sure. Just to make sure I understand, does TPC stand for something like
Trap Program Code?


2007-11-21 13:32:52

by Shaddy Baddah

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

Hi,

Shaddy Baddah wrote:
=> Please find this in-lined below (with key protected):
>
> # iwconfig eth2
> eth2 IEEE 802.11b/g ESSID:off/any Nickname:"zd1211"
> Mode:Managed Frequency:2.462 GHz Access Point: Invalid
> Bit Rate=1 Mb/s
> Encryption key:<too big>
>
> # iwconfig eth2 essid dixieland key "GGGGGGGGGG"
> # iwconfig eth2
> eth2 IEEE 802.11b/g ESSID:off/any Nickname:"zd1211"
> Mode:Managed Frequency:2.462 GHz Access Point:
> GG:GG:GG:GG:GG:GG
> Bit Rate=24 Mb/s
> Encryption key:<too big>
>
> #
>


Interestingly, after performing the above actions, I am now getting
additional log messages:

SoftMAC: Open Authentication completed with GG:GG:GG:GG:GG:GG
Kernel unaligned access at TPC[100df410]
ieee80211softmac_handle_assoc_response]
Kernel unaligned access at TPC[100df548]
ieee80211softmac_handle_assoc_response]
ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
Kernel unaligned access at TPC[100d03ec] ieee80211_copy_snap+0x74/0x78
[ieee802]
Kernel unaligned access at TPC[100d03ec] ieee80211_copy_snap+0x74/0x78
[ieee802]
Kernel unaligned access at TPC[100d03ec] ieee80211_copy_snap+0x74/0x78
[ieee802]
Kernel unaligned access at TPC[100d03ec] ieee80211_copy_snap+0x74/0x78
[ieee802]
Kernel unaligned access at TPC[100d03ec] ieee80211_copy_snap+0x74/0x78
[ieee802]
Kernel unaligned access at TPC[100ee624] do_rx+0x394/0x5ec [zd1211rw]

Kernel unaligned access at TPC[100ee62c] do_rx+0x39c/0x5ec [zd1211rw]

Kernel unaligned access at TPC[100ee638] do_rx+0x3a8/0x5ec [zd1211rw]

Kernel unaligned access at TPC[100ee668] do_rx+0x3d8/0x5ec [zd1211rw]

Kernel unaligned access at TPC[100ee670] do_rx+0x3e0/0x5ec [zd1211rw]



Hope this helps,
Shaddy

2007-11-19 15:01:44

by Johannes Berg

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)


> > The first noticeable problem is when using iwconfig on the device. I
> > cannot set a key for the device, and a query of the configuration always
> > claims that the "<encryption key is too long>".
>
> Likely some wireless ioctl compat layer bug in the kernel.

Is this 32/64 interop? The kernel has never been fixed and never will
be. New wireless tools should "support" this, but imnsho you're *much*
better off compiling 64-bit wireless tools.

> > If I try to scan for APs using iwlist, I get one AP listed, before a bus
> > error occurs. This of course, does not suggest that the problem is with
> > the driver, but I mention it for the record.
>
> This is an unaligned data access in the userland tools.
> Try to catch it with GDB and give us ths backtrace.

For Jean.

> > Whilst the device is up, I continually get console messages like those I
> > have inlined below:
> >
> > Kernel unaligned access at TPC[100ee4d4] do_rx+0x244/0x614 [zd1211rw]
> > Kernel unaligned access at TPC[100ee4e8] do_rx+0x258/0x614 [zd1211rw]
> > Kernel unaligned access at TPC[100ee4cc] do_rx+0x23c/0x614 [zd1211rw]
>
> These are unaligned data accesses in the kernel driver.

For Daniel.

johannes


Attachments:
signature.asc (828.00 B)
This is a digitally signed message part

2007-11-21 18:46:28

by Jean Tourrilhes

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Thu, Nov 22, 2007 at 12:18:45AM +1100, Shaddy Baddah wrote:
> Hi Jean
>

Hi,

I decided to reorganise things a bit... Here are my answers...

> > Definitely, it sounds familiar.
> > This is that I need :
> > o version of the kernel
>
> Debian sparc64 2.6.22 (built myself, using make-kpg, to include the driver)

The patch from Masakazu Mokuno I was talking about is not in
your kernel. It defintely would explain why you would get the
"Encryption key:<too big>". Basically the length of any variable field
is not properly return to user space. This patch would fix that
problem.
The patch from Masakazu Mokuno is not in 2.6.22, and it not
either in 2.6.14. The patch is only included in 2.6.23 and later. I
would suggest upgrading to 2.6.23 to get that patch. Alternatively,
I've included the patch as attachement, and you can use it with
2.6.22.
I would like you to try that patch and report.

> > Can you give me the exact error text as reported by iwconfig ?
> >I'll probably have to send you a test version to see what's happening
> >under the cover.
>
> Please find this in-lined below (with key protected):
>
> # iwconfig eth2
> eth2 IEEE 802.11b/g ESSID:off/any Nickname:"zd1211"
> Mode:Managed Frequency:2.462 GHz Access Point: Invalid
> Bit Rate=1 Mb/s
> Encryption key:<too big>

Note that under Wireless Tools 30, you will notice that it
won't return too bug but will print an absurdly long encryption
key. Two symptoms of the same bug.

> > o version of Wireless Tools (iwconfig --version).
> > Most likely, you need to upgrade your Wireless Tools to
> >version 29 which fixes this 32/64 interop problem.
> > With the latest kernel and the latest wireless tools, the only
> >known bugs are the two ESSID bugs.
>
> I think I'm right for version:
> # iwconfig --version
> iwconfig Wireless-Tools version 29
> Compatible with Wireless Extension v11 to v22.
>
> Kernel Currently compiled with Wireless Extension v22.
>
> eth2 Recommend Wireless Extension v20 or later,
> Currently compiled with Wireless Extension v22.

Yep, that's the correct version. I was afraid you were running
Debian stable. I'll need to dig up a little bit more in this.
Note that the patch above *may* help with this issue as well.

> Hope that helps.
>
> Regards,
> Shaddy

Have fun...

Jean


Attachments:
(No filename) (2.31 kB)
iw262_return_compat_ioctl.diff (1.67 kB)
Download all attachments

2007-11-21 18:05:22

by Kyle McMartin

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Wed, Nov 21, 2007 at 02:23:52PM +0000, Daniel Drake wrote:
> network = ieee80211softmac_get_network_by_bssid_locked(mac,
> resp->header.addr3);
>
> addr3 is offset 20 bytes in the struct and is 6 bytes long. Because 20
> is not evenly divisible by 6 does that make it an unaligned access?
>

Sort of. An "unaligned" access is one where the referenced address is
not aligned to the size of the data type being accessed. For example,
a 4-byte load must be accessed on 4-byte aligned boundaries:
0xffff0000 0xffff0004 0xffff0008, and so forth.

Obviously a single byte load is always aligned.

Unaligned accesses are universally bad. On some architectures (x86, ppc)
they will be emulated in hardware at a performance cost. On others
(sparc, parisc, ia64) they will trap, giving you the option of software
emulation (very slow) or sending a SIGBUS. On still others (arm, I
think[1]) they'll "align" in hardware, resulting in loads or stores to
the wrong address.

The Linux networking code is a pretty good example of when to use the
support macros (get|put)_unaligned to do the right thing. For something
like a packed structure accessed in a data stream, you're invariably going
to have to use these.

> Is there any documentation I can read on this topic? In my current
> uneducated state I'm likely to write further code with these problems...
>

cheers,
Kyle

1. Just a guess based on <asm-arm/unaligned.h> providing macros that
actually do something, and the lack of an obvious unaligned trap
handler.

2007-11-19 08:27:56

by David Miller

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

From: Shaddy Baddah <[email protected]>
Date: Mon, 19 Nov 2007 11:56:39 +1100

> The first noticeable problem is when using iwconfig on the device. I
> cannot set a key for the device, and a query of the configuration always
> claims that the "<encryption key is too long>".

Likely some wireless ioctl compat layer bug in the kernel.

> If I try to scan for APs using iwlist, I get one AP listed, before a bus
> error occurs. This of course, does not suggest that the problem is with
> the driver, but I mention it for the record.

This is an unaligned data access in the userland tools.
Try to catch it with GDB and give us ths backtrace.

> Whilst the device is up, I continually get console messages like those I
> have inlined below:
>
> Kernel unaligned access at TPC[100ee4d4] do_rx+0x244/0x614 [zd1211rw]
> Kernel unaligned access at TPC[100ee4e8] do_rx+0x258/0x614 [zd1211rw]
> Kernel unaligned access at TPC[100ee4cc] do_rx+0x23c/0x614 [zd1211rw]

These are unaligned data accesses in the kernel driver.

2007-11-21 13:09:08

by Shaddy Baddah

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

Hi again,

Shaddy Baddah wrote:
>> This is an unaligned data access in the userland tools.
>> Try to catch it with GDB and give us ths backtrace.
>
> I will try and do this at some point. Unfortunately, the system isn't
> readily at hand, and it might take a couple of days before I am able to
> attend to it.

I've tried to do this as best as I can (I've munged the ethernet address
for privacy), and have included the output in-line:

# gdb --args iwlist eth2 scanning
GNU gdb 6.4.90-debian
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "sparc-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".

(gdb) run
Starting program: /sbin/iwlist eth2 scanning
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
eth2 Scan completed :
Cell 01 - Address: GG:GG:GG:GG:GG:GG
ESSID:"dixieland"
Protocol:IEEE 802.11bg
Mode:Master
Channel:11
Frequency:2.462 GHz (Channel 11)

Program received signal SIGBUS, Bus error.
0xf7fa9eb8 in iw_extract_event_stream () from /lib/libiw.so.29
(gdb) bt
#0 0xf7fa9eb8 in iw_extract_event_stream () from /lib/libiw.so.29
#1 0x00011db0 in ?? ()
#2 0x00011db0 in ?? ()
Previous frame identical to this frame (corrupt stack?)
(gdb)


Hope that helps.

Regards,
Shaddy


2007-11-19 22:36:20

by Jean Tourrilhes

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Mon, Nov 19, 2007 at 02:20:46PM -0800, David Miller wrote:
>
> This is an unreasonable requirement, all kernel ABIs should work in
> 32-bit compat tasks. And when it's very difficult or impossible to
> make it work, there should be a new ABI to transition to that does
> work in compat mode and in particular would work now for the user.

My personal goal is to fix those problems. I understand that
John and Johannes are not very interested.
I personally think it's not difficult to fix, so far, every
bug reported was quite easy to fix. The main issue is lack of testing,
I don't have a 64 bit box. Bug reports are *very* difficult to get
because there are very few users of wireless on 64 bits and because
"the box is in production, I can't try this".
Now, the good news is that many of such bugs have been fixed
in the last few months, actually more that in the few previous
years. Some of the bugs reported *may* be fixed in later versions.
Have fun...

Jean


2007-11-20 12:43:03

by Johannes Berg

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)



> It has to be, %99 of sparc64 userland is 32-bit.

Thought so. Same on powerpc.

> > The kernel has never been fixed and never will be. New wireless
> > tools should "support" this, but imnsho you're *much* better off
> > compiling 64-bit wireless tools.
>
> This is an unreasonable requirement, all kernel ABIs should work in
> 32-bit compat tasks. And when it's very difficult or impossible to
> make it work, there should be a new ABI to transition to that does
> work in compat mode and in particular would work now for the user.

Well, I complained too but it's unfixable without breaking one thing or
the other, consider netlink messages with iw_point in them that are
broadcast. Ugh.

johannes


Attachments:
signature.asc (828.00 B)
This is a digitally signed message part

2007-11-30 03:45:12

by Shaddy Baddah

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

Hi,

First off, sorry for the delay in responding. I am now trying to catch
up with my backlog on this issue.

Jean Tourrilhes wrote:
> The patch from Masakazu Mokuno I was talking about is not in
> your kernel. It defintely would explain why you would get the
> "Encryption key:<too big>". Basically the length of any variable field
> is not properly return to user space. This patch would fix that
> problem.
> The patch from Masakazu Mokuno is not in 2.6.22, and it not
> either in 2.6.14. The patch is only included in 2.6.23 and later. I
> would suggest upgrading to 2.6.23 to get that patch. Alternatively,
> I've included the patch as attachement, and you can use it with
> 2.6.22.
> I would like you to try that patch and report.

Yes, this solved that problem. Thank you,

>>> Can you give me the exact error text as reported by iwconfig ?
>>> I'll probably have to send you a test version to see what's happening
>>> under the cover.
>> Please find this in-lined below (with key protected):
>>
>> # iwconfig eth2
>> eth2 IEEE 802.11b/g ESSID:off/any Nickname:"zd1211"
>> Mode:Managed Frequency:2.462 GHz Access Point: Invalid
>> Bit Rate=1 Mb/s
>> Encryption key:<too big>
>
> Note that under Wireless Tools 30, you will notice that it
> won't return too bug but will print an absurdly long encryption
> key. Two symptoms of the same bug.
>
>>> o version of Wireless Tools (iwconfig --version).
>>> Most likely, you need to upgrade your Wireless Tools to
>>> version 29 which fixes this 32/64 interop problem.
>>> With the latest kernel and the latest wireless tools, the only
>>> known bugs are the two ESSID bugs.
>> I think I'm right for version:
>> # iwconfig --version
>> iwconfig Wireless-Tools version 29
>> Compatible with Wireless Extension v11 to v22.
>>
>> Kernel Currently compiled with Wireless Extension v22.
>>
>> eth2 Recommend Wireless Extension v20 or later,
>> Currently compiled with Wireless Extension v22.
>
> Yep, that's the correct version. I was afraid you were running
> Debian stable. I'll need to dig up a little bit more in this.
> Note that the patch above *may* help with this issue as well.

No, it does not seem to have. An iwlist eth2 scanning command still
lists a single access point, then fails with a bus error.

Thanks for your help,
Shaddy


2007-11-20 21:56:43

by David Miller

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

From: Jean Tourrilhes <[email protected]>
Date: Tue, 20 Nov 2007 09:43:30 -0800

> On Mon, Nov 19, 2007 at 11:42:01PM -0800, David Miller wrote:
> > From: Jean Tourrilhes <[email protected]>
> > Date: Mon, 19 Nov 2007 14:35:54 -0800
> >
> > > I don't have a 64 bit box.
> >
> > Let me know when you have joined us in the 21st century.
>
> Well, if you are so rich, please send me a new computer. I'm
> not paid to do any of this stuff.

The irony is that you don't even need a 64-bit system to work on this
kind of bug, consider qemu or similar.

But on my side I did all of my analysis with code inspection.

> As I was saying, we made a lot of progress in the last few
> months. And it works.

If you look at my bug fix, wext does anything but work, it writes over
random data in userspace on all 64-bit platforms.

2007-11-19 18:06:03

by Jean Tourrilhes

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Mon, Nov 19, 2007 at 04:03:02PM +0100, Johannes Berg wrote:
>
> > > The first noticeable problem is when using iwconfig on the device. I
> > > cannot set a key for the device, and a query of the configuration always
> > > claims that the "<encryption key is too long>".
> >
> > Likely some wireless ioctl compat layer bug in the kernel.
>
> Is this 32/64 interop? The kernel has never been fixed and never will
> be. New wireless tools should "support" this, but imnsho you're *much*
> better off compiling 64-bit wireless tools.

There was a patch for this kind of issue from Masakazu Mokuno
that went in 2.6.22 or 2.6.23 that was touching that part. I don't
think he went in the stable series, but probably should have. There
are other potential cause of problem here, and I suspect he might get
the same error with 64 bits userspace.
Can you give me the exact error text as reported by iwconfig ?
I'll probably have to send you a test version to see what's happening
under the cover.

> > > If I try to scan for APs using iwlist, I get one AP listed, before a bus
> > > error occurs. This of course, does not suggest that the problem is with
> > > the driver, but I mention it for the record.
> >
> > This is an unaligned data access in the userland tools.
> > Try to catch it with GDB and give us ths backtrace.
>
> For Jean.

Definitely, it sounds familiar.
This is that I need :
o version of the kernel
o version of the driver if not stock driver
o version of Wireless Tools (iwconfig --version).
Most likely, you need to upgrade your Wireless Tools to
version 29 which fixes this 32/64 interop problem.
With the latest kernel and the latest wireless tools, the only
known bugs are the two ESSID bugs.

Regards,

Jean

2007-11-30 20:22:40

by Jean Tourrilhes

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Fri, Nov 30, 2007 at 02:42:37PM +1100, Shaddy Baddah wrote:
> Hi,
>
> First off, sorry for the delay in responding. I am now trying to catch
> up with my backlog on this issue.

That's ok, I'm not in a hurry.

> > The patch from Masakazu Mokuno is not in 2.6.22, and it not
> >either in 2.6.14. The patch is only included in 2.6.23 and later. I
> >would suggest upgrading to 2.6.23 to get that patch. Alternatively,
> >I've included the patch as attachement, and you can use it with
> >2.6.22.
> > I would like you to try that patch and report.
>
> Yes, this solved that problem. Thank you,

Ok, one down, one to go. That's progress.

> > Note that the patch above *may* help with this issue as well.
>
> No, it does not seem to have. An iwlist eth2 scanning command still
> lists a single access point, then fails with a bus error.

Hmm... That's strange as I was seeing my code behaving as
expected.
I will ask you to compile a version of wireless tools with
debugging enabled and run it with gdb. See my other e-mail.

> Thanks for your help,
> Shaddy

No, no. Thanks for your help tracking those issues down. You
have been great.

Jean


2007-11-30 05:34:15

by Shaddy Baddah

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

Hi Daniel,

Daniel Drake wrote:
>> Interestingly, after performing the above actions, I am now getting
>> additional log messages:
>>
>> SoftMAC: Open Authentication completed with GG:GG:GG:GG:GG:GG
>> Kernel unaligned access at TPC[100df410]
>> ieee80211softmac_handle_assoc_response]
>> Kernel unaligned access at TPC[100df548]
>> ieee80211softmac_handle_assoc_response]
>
> This one doesn't include any offsets, did the line get cut off?

Yes, it did. Sorry about that. Please find it extracted from syslog:

Kernel unaligned access at TPC[100df410]
ieee80211softmac_handle_assoc_response+0x18/0x1fc [ieee80211softmac]
Kernel unaligned access at TPC[100df548]
ieee80211softmac_handle_assoc_response+0x150/0x1fc [ieee80211softmac]

>> ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
>> Kernel unaligned access at TPC[100d03ec] ieee80211_copy_snap+0x74/0x78
>> [ieee802]
>> Kernel unaligned access at TPC[100d03ec] ieee80211_copy_snap+0x74/0x78
>> [ieee802]
>> Kernel unaligned access at TPC[100d03ec] ieee80211_copy_snap+0x74/0x78
>> [ieee802]
>> Kernel unaligned access at TPC[100d03ec] ieee80211_copy_snap+0x74/0x78
>> [ieee802]
>> Kernel unaligned access at TPC[100d03ec] ieee80211_copy_snap+0x74/0x78
>> [ieee802]
>
> This one should be fixed by the attached patch. Sorry for not sending it
> sooner, the contributor has not yet solved all problems and I was
> waiting to see if more patches would come.
>
>> Kernel unaligned access at TPC[100ee624] do_rx+0x394/0x5ec [zd1211rw]
>> Kernel unaligned access at TPC[100ee62c] do_rx+0x39c/0x5ec [zd1211rw]
>> Kernel unaligned access at TPC[100ee638] do_rx+0x3a8/0x5ec [zd1211rw]
>> Kernel unaligned access at TPC[100ee668] do_rx+0x3d8/0x5ec [zd1211rw]
>> Kernel unaligned access at TPC[100ee670] do_rx+0x3e0/0x5ec [zd1211rw]
>
> These might be solved by the patch David sent to the list a few days ago
> (thanks!). Have you applied it? If you can confirm it helps I will send
> it up through John.

I am catching up with patches at the moment, and will report back on
this shortly.

Thanks for your help,
Shaddy


2007-11-21 14:24:36

by Daniel Drake

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

[PATCH] ieee80211: fix unaligned access in ieee80211_copy_snap

From: Daniel Drake <[email protected]>

Based on a patch from Jun Sun.

Signed-off-by: Daniel Drake <[email protected]>

diff --git a/net/ieee80211/ieee80211_tx.c b/net/ieee80211/ieee80211_tx.c
index a4c3c51..6d06f13 100644
--- a/net/ieee80211/ieee80211_tx.c
+++ b/net/ieee80211/ieee80211_tx.c
@@ -144,7 +144,8 @@ static int ieee80211_copy_snap(u8 * data, u16 h_proto)
snap->oui[1] = oui[1];
snap->oui[2] = oui[2];

- *(u16 *) (data + SNAP_SIZE) = htons(h_proto);
+ h_proto = htons(h_proto);
+ memcpy(data + SNAP_SIZE, &h_proto, sizeof(u16));

return SNAP_SIZE + sizeof(u16);
}


Attachments:
ieee80211-copy-snap.patch (645.00 B)

2007-11-20 17:38:51

by Jean Tourrilhes

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Tue, Nov 20, 2007 at 01:40:54PM +0100, Johannes Berg wrote:
>
> Well, I complained too but it's unfixable without breaking one thing or
> the other, consider netlink messages with iw_point in them that are
> broadcast. Ugh.

Actually, iw_point are never included in netlink messages.

> johannes

Jean



2007-11-20 17:43:46

by Jean Tourrilhes

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Mon, Nov 19, 2007 at 11:42:01PM -0800, David Miller wrote:
> From: Jean Tourrilhes <[email protected]>
> Date: Mon, 19 Nov 2007 14:35:54 -0800
>
> > I don't have a 64 bit box.
>
> Let me know when you have joined us in the 21st century.

Well, if you are so rich, please send me a new computer. I'm
not paid to do any of this stuff.

> > Bug reports are *very* difficult to get because
>
> I guess writing a dummy wireless driver with just some printk()'s and
> some test userland programs to make sure the data goes to and
> from the handlers with the proper layout is thinking way too far
> outside the box for you.

There is no point for that. The difficulty is finding out that
there is a problem. To find out that there is a problem, it's
sufficient to run a standard driver, no need for a dummy
driver. People already have little incentive to run real wireless
drivers on 64 bit boxes, I don't see who they would run dummy drivers.

> Nothing using iw_point in these request objects can possibly work in
> compat situations, for example. I think it works by accident on
> little-endian.

As I was saying, we made a lot of progress in the last few
months. And it works.

Jean


2007-11-20 12:46:00

by David Miller

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

From: Johannes Berg <[email protected]>
Date: Tue, 20 Nov 2007 13:40:54 +0100

> > > The kernel has never been fixed and never will be. New wireless
> > > tools should "support" this, but imnsho you're *much* better off
> > > compiling 64-bit wireless tools.
> >
> > This is an unreasonable requirement, all kernel ABIs should work in
> > 32-bit compat tasks. And when it's very difficult or impossible to
> > make it work, there should be a new ABI to transition to that does
> > work in compat mode and in particular would work now for the user.
>
> Well, I complained too but it's unfixable without breaking one thing or
> the other, consider netlink messages with iw_point in them that are
> broadcast. Ugh.

That's fine, I understand how difficult it is to translate things
embedded in netlink messages but:

1) even the netlink case can be fixed, we need some infrastructure
to set some state at the top of the syscall path, and then when
we call into the netlink protocol handler it will know what
format the netlink payload needs to be in

and

2) the ioctls are broken too and that definitely can be fixed

We cannot have tools and interfaces that only work on native
binaries.

2007-11-19 12:19:51

by David Miller

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

From: Shaddy Baddah <[email protected]>
Date: Mon, 19 Nov 2007 23:11:43 +1100

> David Miller wrote:
> >> Whilst the device is up, I continually get console messages like those I
> >> have inlined below:
> >>
> >> Kernel unaligned access at TPC[100ee4d4] do_rx+0x244/0x614 [zd1211rw]
> >> Kernel unaligned access at TPC[100ee4e8] do_rx+0x258/0x614 [zd1211rw]
> >> Kernel unaligned access at TPC[100ee4cc] do_rx+0x23c/0x614 [zd1211rw]
> >
> > These are unaligned data accesses in the kernel driver.
>
> Sure. Just to make sure I understand, does TPC stand for something like
> Trap Program Code?

It's simply the program counter of the CPU at the time of the trap.
It maps to the symbol and offset printed right afterwards, which
would be somewhere in do_rx() in the zd1211rw driver.

Probably it's parsing some data structure in the incoming packet
buffer which is misaligned.

2007-11-20 13:49:14

by Johannes Berg

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)


> I think I've figured out what's happening here.
>
> The kernel makes no effort whatsoever to translate iwe streams in
> compat environments. And userspace then tries to "correct" this and
> does so miserably. Likely this is what causes the bus error.

Quite possible. I wanted this fixed too but Jean refused to do it in the
kernel. And personally, I'm no longer touching wext with a 10 foot pole.
Too much backslash.

> The fix is that we need to add some handling code
> fs/compat_ioctl.c:do_wireless_ioctl() for the case where we are
> returning an iwe stream (SIOCGIWSCAN).
>
> It should not be very difficult to do this, since the compat format
> will be the same size or smaller, it should be easy to recode the
> thing in-place in the user buffer.
>
> So you make a pass over the user buffer fixing things up and then you
> adjust the iw_data length for the new size.
>
> Alternatively, you can allocate a kernel buffer for this, use a 'fs =
> get_fs(); set_fs(KERNEL_DS); ioctl(); set_fs(fs);' sequence, fixup the
> iwe stream, then copy the everything back out to userspace.

That may work, but wext also broadcasts iw_point inside netlink messages
for scan notifications etc. I don't see a good way to fix this part.

> Then we can delete all of this incredibly stupid code in the wireless
> tools that attempts to fix this up in userspace.

I wish. Really, I do.

johannes


Attachments:
signature.asc (828.00 B)
This is a digitally signed message part

2007-11-21 14:33:44

by Daniel Drake

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

Shaddy Baddah wrote:
> Interestingly, after performing the above actions, I am now getting
> additional log messages:
>
> SoftMAC: Open Authentication completed with GG:GG:GG:GG:GG:GG
> Kernel unaligned access at TPC[100df410]
> ieee80211softmac_handle_assoc_response]
> Kernel unaligned access at TPC[100df548]
> ieee80211softmac_handle_assoc_response]
> ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
> Kernel unaligned access at TPC[100d03ec] ieee80211_copy_snap+0x74/0x78
> [ieee802]
> Kernel unaligned access at TPC[100d03ec] ieee80211_copy_snap+0x74/0x78
> [ieee802]
> Kernel unaligned access at TPC[100d03ec] ieee80211_copy_snap+0x74/0x78
> [ieee802]
> Kernel unaligned access at TPC[100d03ec] ieee80211_copy_snap+0x74/0x78
> [ieee802]
> Kernel unaligned access at TPC[100d03ec] ieee80211_copy_snap+0x74/0x78
> [ieee802]
> Kernel unaligned access at TPC[100ee624] do_rx+0x394/0x5ec [zd1211rw]
> Kernel unaligned access at TPC[100ee62c] do_rx+0x39c/0x5ec [zd1211rw]
> Kernel unaligned access at TPC[100ee638] do_rx+0x3a8/0x5ec [zd1211rw]
> Kernel unaligned access at TPC[100ee668] do_rx+0x3d8/0x5ec [zd1211rw]
> Kernel unaligned access at TPC[100ee670] do_rx+0x3e0/0x5ec [zd1211rw]

One thing I forgot to mention: in 2.6.25 zd1211rw is moving away from
ieee80211/softmac and changing to mac80211. I'm not sure if mac80211 has
been tested on architectures with alignment constraints, so we may be
greeted with new problems.

Or some might go away. Speaking only from memory, I think the code with
compare_ether_addr problem that David found is not present in the
mac80211 port.

Daniel

2007-11-19 22:20:47

by David Miller

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

From: Johannes Berg <[email protected]>
Date: Mon, 19 Nov 2007 16:03:02 +0100

>
> > > The first noticeable problem is when using iwconfig on the device. I
> > > cannot set a key for the device, and a query of the configuration always
> > > claims that the "<encryption key is too long>".
> >
> > Likely some wireless ioctl compat layer bug in the kernel.
>
> Is this 32/64 interop?

It has to be, %99 of sparc64 userland is 32-bit.

> The kernel has never been fixed and never will be. New wireless
> tools should "support" this, but imnsho you're *much* better off
> compiling 64-bit wireless tools.

This is an unreasonable requirement, all kernel ABIs should work in
32-bit compat tasks. And when it's very difficult or impossible to
make it work, there should be a new ABI to transition to that does
work in compat mode and in particular would work now for the user.

2007-11-21 13:22:08

by Shaddy Baddah

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

Hi Jean

Jean Tourrilhes wrote:
> On Mon, Nov 19, 2007 at 04:03:02PM +0100, Johannes Berg wrote:
>>>> The first noticeable problem is when using iwconfig on the device. I
>>>> cannot set a key for the device, and a query of the configuration always
>>>> claims that the "<encryption key is too long>".
>>> Likely some wireless ioctl compat layer bug in the kernel.
>> Is this 32/64 interop? The kernel has never been fixed and never will
>> be. New wireless tools should "support" this, but imnsho you're *much*
>> better off compiling 64-bit wireless tools.
>
> There was a patch for this kind of issue from Masakazu Mokuno
> that went in 2.6.22 or 2.6.23 that was touching that part. I don't
> think he went in the stable series, but probably should have. There
> are other potential cause of problem here, and I suspect he might get
> the same error with 64 bits userspace.
> Can you give me the exact error text as reported by iwconfig ?
> I'll probably have to send you a test version to see what's happening
> under the cover.

Please find this in-lined below (with key protected):

# iwconfig eth2
eth2 IEEE 802.11b/g ESSID:off/any Nickname:"zd1211"
Mode:Managed Frequency:2.462 GHz Access Point: Invalid
Bit Rate=1 Mb/s
Encryption key:<too big>

# iwconfig eth2 essid dixieland key "GGGGGGGGGG"
# iwconfig eth2
eth2 IEEE 802.11b/g ESSID:off/any Nickname:"zd1211"
Mode:Managed Frequency:2.462 GHz Access Point:
GG:GG:GG:GG:GG:GG
Bit Rate=24 Mb/s
Encryption key:<too big>

#

>>>> If I try to scan for APs using iwlist, I get one AP listed, before a bus
>>>> error occurs. This of course, does not suggest that the problem is with
>>>> the driver, but I mention it for the record.
>>> This is an unaligned data access in the userland tools.
>>> Try to catch it with GDB and give us ths backtrace.
>> For Jean.
>
> Definitely, it sounds familiar.
> This is that I need :
> o version of the kernel

Debian sparc64 2.6.22 (built myself, using make-kpg, to include the driver)

> o version of the driver if not stock driver
As stock with the kernel described above

> o version of Wireless Tools (iwconfig --version).
> Most likely, you need to upgrade your Wireless Tools to
> version 29 which fixes this 32/64 interop problem.
> With the latest kernel and the latest wireless tools, the only
> known bugs are the two ESSID bugs.

I think I'm right for version:
# iwconfig --version
iwconfig Wireless-Tools version 29
Compatible with Wireless Extension v11 to v22.

Kernel Currently compiled with Wireless Extension v22.

eth2 Recommend Wireless Extension v20 or later,
Currently compiled with Wireless Extension v22.


Hope that helps.

Regards,
Shaddy



2007-11-20 07:42:03

by David Miller

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

From: Jean Tourrilhes <[email protected]>
Date: Mon, 19 Nov 2007 14:35:54 -0800

> I don't have a 64 bit box.

Let me know when you have joined us in the 21st century.

> Bug reports are *very* difficult to get because

I guess writing a dummy wireless driver with just some printk()'s and
some test userland programs to make sure the data goes to and
from the handlers with the proper layout is thinking way too far
outside the box for you.

Since you're not interested proactively doing things and everyone else
is also pushing the problem away from themselves I'll fix all of this.

Nothing using iw_point in these request objects can possibly work in
compat situations, for example. I think it works by accident on
little-endian.

2007-11-21 19:06:27

by Jean Tourrilhes

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Thu, Nov 22, 2007 at 12:35:44AM +1100, Shaddy Baddah wrote:
> Hi Jean,
>
> Jean Tourrilhes wrote:
> >> This is a special version of Wireless Tools with some debug
> >>code.
> >
> > Sorry, I forgot to enable some debug. This version should do it.
>
> I've chosen to respond directly, because I hope not to publish my MAC
> address to the list,

Wise move. I made sure to blank it out.

> I've run as follows:
>
> # ./iwlist eth2 scanning 2>&1 | tee scan.log
>
> and have attached the output.
>
> Hope that helps,
> Shaddy

> Scan result 4096 [
{removed for privacy reasons}
00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00]

The first thing to notice is the large number of zeros at the
end. This is the same problem as you had with the encryption key, the
lenght of the buffer is not returned properly to user space. In
theory, we should not crash on bad data, but having all this extra
junk is not helpful.
The fix for that is the patch I sent in the other e-mail,
already included in 2.6.23.

> DBG - stream->current = 0x2c039, stream->value = (nil), stream->end = 0x2d008
> DBG - iwe->cmd = 0x8B01, iwe->len = 24
> DBG - event_type = 2, event_len = 16, pointer = 0x2c03d
> DBG - alt iwe->len = 20
> Protocol:IEEE 802.11bg
> DBG - stream->current = 0x2c051, stream->value = (nil), stream->end = 0x2d008
> DBG - iwe->cmd = 0x8B07, iwe->len = 12
> DBG - event_type = 4, event_len = 4, pointer = 0x2c055
> DBG - alt iwe->len = 8

Are you certain the log really stop there ? It looks truncated
to me. As far as I can see, the parsing works properly and the
workaround kicks in properly.
Note that there is only a single AP in your scan log. After
the protocol, you should see "mode: Master", "freq/channel" (twice),
and "encoding". After that, it should try to decode "rate" but will
most likely fail because of the extra junk (I'll look into that).

If you have time, you can try with gdb. Here is what you would
do.
1) edit Makefile. Go to CFLAGS, replace "-Os" with "-g".
2) make clean ; make
3) gdb --args iwlist eth0 scan
4) run
5) bt

I'll look into that.
Thanks a lot !

Jean




2007-12-06 22:01:23

by Jean Tourrilhes

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Thu, Dec 06, 2007 at 10:25:35AM +1100, Shaddy Baddah wrote:
> Hi Jean,

Hi again,

> > So, you'll find attached a new version of Wireless Tools. Just
> >tell me if it work. You can remove the debugging output by commenting
> >out the "#define DEBUG" lines (grep for it).
>
> Yes, that has fixed it. Thank you very much.

Would you mind testing the attached version ? Following the
suggestions of David, I came up with a slightly better fix (use a
union instead of memcpy).
As soon as I know it worka, I'll push that to the Debian
maintainer...

> Regards,
> Shaddy

Thanks again for your help...

Jean


2007-12-04 00:03:13

by Jean Tourrilhes

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Fri, Nov 30, 2007 at 12:21:15PM -0800, jt wrote:
> On Fri, Nov 30, 2007 at 02:42:37PM +1100, Shaddy Baddah wrote:
> >
> > No, it does not seem to have. An iwlist eth2 scanning command still
> > lists a single access point, then fails with a bus error.
>
> Hmm... That's strange as I was seeing my code behaving as
> expected.
> I will ask you to compile a version of wireless tools with
> debugging enabled and run it with gdb. See my other e-mail.

Hi,

Sorry to bother you again. Next week I will go on vacation,
and may not be able to continue working on that bug. Your best bet for
a fix is this week.

Now, this is what you will do :
1) Get wireless_tools.30.pre3-debug2.tar.gz I sent you a while
ago. I know you have it, as you sent me debug log from it.
2) Go in the wireless_tools.30 directory.
3) edit Makefile. Go to line defining CFLAGS, replace "-Os"
with "-g". It should look like :
---------------------------------------
CFLAGS=-g -W -Wall -Wstrict-prototypes -Wmissing-prototypes -Wshadow \
-Wpointer-arith -Wcast-qual -Winline -I.
---------------------------------------
4) Recompile : make clean ; make
5) Start the debugger : gdb --args iwlist eth0 scan
6) Run the program : run
7) Get a stack trace : bt

If you have properly enabled debug, the stack trace will show
the precise line number where it crashes. That's mostly what I'm
after.
I would also like to have the debugging output of iwlist. The
function where it seems to crash is called many time, I want to know
in which context it is called. Feel free to send it private.

Thanks in advance...

Jean

2007-12-08 12:53:37

by Daniel Drake

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

Michael Buesch wrote:
> Uhm, ok.
> But I guess then I'm not the only one who got all this wrong. ;)

Probably. I recently submitted some documentation to help people like
you and me:
http://article.gmane.org/gmane.linux.kernel/609571

There's also an article in this week's LWN.

Daniel

2007-12-05 23:44:08

by Shaddy Baddah

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

Hi Jean,

Jean Tourrilhes wrote:
> Personally, I was under the impression that in userspace libc
> trap unaligned access and make them work. I mean, this code was tested
> on other 64/32 bit platforms (PPC, AMD-64, PS3) and was working there,
> so this seems to be specific to your platform.

Sure. The platform is a Sun Ultra 10, running Debian Lenny:

Linux trad 2.6.22 #2 Fri Nov 30 13:47:43 EST 2007 sparc64 GNU/Linux

(just noticed that the clock is wrong. I'll sort that out later).

> In any case, I've modified my code to use a memcpy() instead
> of direct memory access. I'm not 100% sure it will fix it because many
> time memcpy() is optimised away to a direct memory access.
>
> So, you'll find attached a new version of Wireless Tools. Just
> tell me if it work. You can remove the debugging output by commenting
> out the "#define DEBUG" lines (grep for it).

Yes, that has fixed it. Thank you very much.

Regards,
Shaddy


2007-12-07 11:36:53

by Michael Büsch

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Friday 07 December 2007 04:52:14 David Miller wrote:
> From: Jean Tourrilhes <[email protected]>
> Date: Thu, 6 Dec 2007 13:43:41 -0800
>
> > On Thu, Dec 06, 2007 at 10:33:18PM +0100, Michael Buesch wrote:
> > > Userspace handles this transparently through fault traps.
> >
> > This is exactly what I was assuming. On the SPARC, in this
> > instance, it did not (check back earlier in this discussion).
>
> Right, you get a SIGBUS, and this is what will happen on
> MIPS as well.
>
> It is even defined in the application ABIs for these platforms that
> all objects must be aligned to their type's natural size.
>
>

So, how do we handle unaligned data in userspace then?
A _lot_ of applications are broken if that's right what you say.

--
Greetings Michael.

2007-12-07 03:52:15

by David Miller

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

From: Jean Tourrilhes <[email protected]>
Date: Thu, 6 Dec 2007 13:43:41 -0800

> On Thu, Dec 06, 2007 at 10:33:18PM +0100, Michael Buesch wrote:
> > Userspace handles this transparently through fault traps.
>
> This is exactly what I was assuming. On the SPARC, in this
> instance, it did not (check back earlier in this discussion).

Right, you get a SIGBUS, and this is what will happen on
MIPS as well.

It is even defined in the application ABIs for these platforms that
all objects must be aligned to their type's natural size.

2007-12-07 12:34:09

by David Miller

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

From: Michael Buesch <[email protected]>
Date: Fri, 7 Dec 2007 12:35:11 +0100

> So, how do we handle unaligned data in userspace then?
> A _lot_ of applications are broken if that's right what you say.

Applications make sure data is always properly aligned, that's how
it's handled. I don't see what the big mystery is. :-)

In the rare cases where these bugs pop up, they simply get fixed.

And you should be thankful these things are being explicitly
flagged as errors on at least some platforms, they could be
silent performance problems otherwise.

2007-12-08 01:36:04

by David Miller

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

From: Daniel Drake <[email protected]>
Date: Fri, 07 Dec 2007 14:48:46 +0000

> Please correct me if I'm wrong, but adding the "packed" attribute causes
> gcc to generate extra instructions to avoid the potentially unaligned
> accesses.
> (i.e. it basically does put_aligned/get_unaligned automatically)

Exactly.

2007-12-05 23:41:18

by Jean Tourrilhes

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Thu, Dec 06, 2007 at 10:25:35AM +1100, Shaddy Baddah wrote:
> Hi Jean,
>
> Linux trad 2.6.22 #2 Fri Nov 30 13:47:43 EST 2007 sparc64 GNU/Linux
>
> (just noticed that the clock is wrong. I'll sort that out later).

ntpdate + ntpd is your friend. I believe the CMOS battery is
dead on my oldest computers.

> > In any case, I've modified my code to use a memcpy() instead
> >of direct memory access. I'm not 100% sure it will fix it because many
> >time memcpy() is optimised away to a direct memory access.
> >
> > So, you'll find attached a new version of Wireless Tools. Just
> >tell me if it work. You can remove the debugging output by commenting
> >out the "#define DEBUG" lines (grep for it).
>
> Yes, that has fixed it. Thank you very much.

Wow ! You means it's all working good now ? You don't see any
more problems ?
I also believe that this was the last issue you had with
Wireless Tools/Wireless Extensions. I believe the driver issue is in
good hands.
If that's the case, I'll release soon a new Wireless Tools
version and push that to distros.

> Regards,
> Shaddy

Thanks a lot !

Jean

2007-12-08 01:35:45

by David Miller

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

From: Michael Buesch <[email protected]>
Date: Fri, 7 Dec 2007 14:36:08 +0100

> The wire protocol for the iv files is the following structure:
>
> struct b43_iv {
> be16 a;
> union {
> be16 x;
> be32 y;
> } ((attr_packed));
> } ((attr_packed));
>
> How do I handle that in userspace? In kernel space it's simple
> (and we already use get_unaligned() for the 32bit value there).

If you marked it packed, on platforms like sparc, MIPS, and IA-64, the
compiler is only going to use byte loads and stores to objects of this
type.

You don't need to use get_unaligned() on members of structures that
are marked packed.

If you look at the asm-generic/unaligned.h pure-C implementation,
it uses the packed attribute, which really shows how redundant it
is to use get_unaligned() on packed structure member access :-)

2007-12-08 11:23:43

by Michael Büsch

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Saturday 08 December 2007 02:35:44 David Miller wrote:
> From: Michael Buesch <[email protected]>
> Date: Fri, 7 Dec 2007 14:36:08 +0100
>
> > The wire protocol for the iv files is the following structure:
> >
> > struct b43_iv {
> > be16 a;
> > union {
> > be16 x;
> > be32 y;
> > } ((attr_packed));
> > } ((attr_packed));
> >
> > How do I handle that in userspace? In kernel space it's simple
> > (and we already use get_unaligned() for the 32bit value there).
>
> If you marked it packed, on platforms like sparc, MIPS, and IA-64, the
> compiler is only going to use byte loads and stores to objects of this
> type.
>
> You don't need to use get_unaligned() on members of structures that
> are marked packed.
>
> If you look at the asm-generic/unaligned.h pure-C implementation,
> it uses the packed attribute, which really shows how redundant it
> is to use get_unaligned() on packed structure member access :-)

Uhm, ok.
But I guess then I'm not the only one who got all this wrong. ;)

--
Greetings Michael.

2007-12-07 14:48:07

by Daniel Drake

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

Michael Buesch wrote:
> The wire protocol for the iv files is the following structure:
>
> struct b43_iv {
> be16 a;
> union {
> be16 x;
> be32 y;
> } ((attr_packed));
> } ((attr_packed));
>
> How do I handle that in userspace? In kernel space it's simple
> (and we already use get_unaligned() for the 32bit value there).

Please correct me if I'm wrong, but adding the "packed" attribute causes
gcc to generate extra instructions to avoid the potentially unaligned
accesses.
(i.e. it basically does put_aligned/get_unaligned automatically)

Daniel

2007-12-07 03:50:25

by David Miller

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

From: Michael Buesch <[email protected]>
Date: Thu, 6 Dec 2007 22:33:18 +0100

> On Thursday 06 December 2007 22:25:25 Jean Tourrilhes wrote:
> > On Wed, Dec 05, 2007 at 06:36:28PM -0800, David Miller wrote:
> > Now, I'm wondering how to deal with unaligned access in
> > userspace. If you get data from hardware or the network, you can not
> > guarantee that everything will always be aligned, so we need a way to
> > deal with it.
> > In the kernel, we have get_unaligned(). I wonder what's the
> > equivalent in userspace.
>
> Userspace handles this transparently through fault traps.

It does not, the process gets a SIGBUS.

2007-12-06 21:26:55

by Jean Tourrilhes

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Wed, Dec 05, 2007 at 06:36:28PM -0800, David Miller wrote:
> From: Jean Tourrilhes <[email protected]>
> Date: Wed, 5 Dec 2007 13:56:00 -0800
>
> > Personally, I was under the impression that in userspace libc
> > trap unaligned access and make them work. I mean, this code was tested
> > on other 64/32 bit platforms (PPC, AMD-64, PS3) and was working there,
> > so this seems to be specific to your platform.
>
> None of those listed platforms trap on unaligned accesses like sparc
> does.

I knew it was a problem in kernel space, and this is why we
have nice macros to perform unaligned access. My assumption that
userspace was behaving differently, and that all this stuff was hidden
from the end user through some trap handler.

> > In any case, I've modified my code to use a memcpy() instead
> > of direct memory access. I'm not 100% sure it will fix it because many
> > time memcpy() is optimised away to a direct memory access.
>
> This only works if you hide the underlying types from the compiler,
> using something like:
>
> static void copy_object(void *dst, void *src, int len)
> {
> memcpy(dst, src, len);
> }
>
> And use the copy_object() thing to move unaligned things around.

Most likely, copy_object() will be inlined.

> Otherwise gcc can see the underlying types and assume alignment, and
> thus generate the memcpy inline using word sized loads and stores, and
> you'll just get the same unaligned access problems.

Well, it seems that the memcpy() did the trick. I guess that
the gcc optimiser is smart enough to recognise the fact that the
pointers may be unaligned.
I don't like being at the mercy of gcc, so I'll fix that
better.

> The fix is to eliminate the unaligned data in the first place. This
> compat fixup stuff really belongs in the kernel.

If people were using the kernel API directly and not using
iwlib, I would agree 100% with you.
My long term plan is to fix the problem at the source,
i.e. always generate the data with the proper alignement regardless of
32 or 64 bits, so that you don't need any "fix" in the compat layer or
in userspace. Unfortunately, it's too early for that.


Now, I'm wondering how to deal with unaligned access in
userspace. If you get data from hardware or the network, you can not
guarantee that everything will always be aligned, so we need a way to
deal with it.
In the kernel, we have get_unaligned(). I wonder what's the
equivalent in userspace.

Have fun...

Jean

2007-12-06 02:36:29

by David Miller

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

From: Jean Tourrilhes <[email protected]>
Date: Wed, 5 Dec 2007 13:56:00 -0800

> Personally, I was under the impression that in userspace libc
> trap unaligned access and make them work. I mean, this code was tested
> on other 64/32 bit platforms (PPC, AMD-64, PS3) and was working there,
> so this seems to be specific to your platform.

None of those listed platforms trap on unaligned accesses like sparc
does.

> In any case, I've modified my code to use a memcpy() instead
> of direct memory access. I'm not 100% sure it will fix it because many
> time memcpy() is optimised away to a direct memory access.

This only works if you hide the underlying types from the compiler,
using something like:

static void copy_object(void *dst, void *src, int len)
{
memcpy(dst, src, len);
}

And use the copy_object() thing to move unaligned things around.

Otherwise gcc can see the underlying types and assume alignment, and
thus generate the memcpy inline using word sized loads and stores, and
you'll just get the same unaligned access problems.

The fix is to eliminate the unaligned data in the first place. This
compat fixup stuff really belongs in the kernel.

2007-12-07 13:38:24

by Michael Büsch

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Friday 07 December 2007 13:34:07 David Miller wrote:
> From: Michael Buesch <[email protected]>
> Date: Fri, 7 Dec 2007 12:35:11 +0100
>
> > So, how do we handle unaligned data in userspace then?
> > A _lot_ of applications are broken if that's right what you say.
>
> Applications make sure data is always properly aligned, that's how
> it's handled. I don't see what the big mystery is. :-)

That's not always possible. For example if we are handling
some kind of wire protocol data.

let's give you some real world example: b43-fwcutter

The wire protocol for the iv files is the following structure:

struct b43_iv {
be16 a;
union {
be16 x;
be32 y;
} ((attr_packed));
} ((attr_packed));

How do I handle that in userspace? In kernel space it's simple
(and we already use get_unaligned() for the 32bit value there).

--
Greetings Michael.

2007-12-06 21:35:13

by Michael Büsch

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Thursday 06 December 2007 22:25:25 Jean Tourrilhes wrote:
> On Wed, Dec 05, 2007 at 06:36:28PM -0800, David Miller wrote:
> > From: Jean Tourrilhes <[email protected]>
> > Date: Wed, 5 Dec 2007 13:56:00 -0800
> >
> > > Personally, I was under the impression that in userspace libc
> > > trap unaligned access and make them work. I mean, this code was tested
> > > on other 64/32 bit platforms (PPC, AMD-64, PS3) and was working there,
> > > so this seems to be specific to your platform.
> >
> > None of those listed platforms trap on unaligned accesses like sparc
> > does.
>

> Now, I'm wondering how to deal with unaligned access in
> userspace. If you get data from hardware or the network, you can not
> guarantee that everything will always be aligned, so we need a way to
> deal with it.
> In the kernel, we have get_unaligned(). I wonder what's the
> equivalent in userspace.

Userspace handles this transparently through fault traps.

--
Greetings Michael.

2007-12-07 03:49:39

by David Miller

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

From: Jean Tourrilhes <[email protected]>
Date: Thu, 6 Dec 2007 13:25:25 -0800

> On Wed, Dec 05, 2007 at 06:36:28PM -0800, David Miller wrote:
> > This only works if you hide the underlying types from the compiler,
> > using something like:
> >
> > static void copy_object(void *dst, void *src, int len)
> > {
> > memcpy(dst, src, len);
> > }
> >
> > And use the copy_object() thing to move unaligned things around.
>
> Most likely, copy_object() will be inlined.

It doesn't matter, by passing the pointer through the arguments of a
function you've casted the type to "void *" before the builtin
memcpy() logic in the compiler can look at it, and thus the compiler
is no longer allowed to see past that cast back to the original type
and assume any kind of alignment.

I used this to fix similar bugs in libgtk+ and gthumb.

> Now, I'm wondering how to deal with unaligned access in
> userspace. If you get data from hardware or the network, you can not
> guarantee that everything will always be aligned, so we need a way to
> deal with it.
> In the kernel, we have get_unaligned(). I wonder what's the
> equivalent in userspace.

attribute((__packed__)) usually

Better to enforce the alignment in the kernel though. We recently
found a case with link attributes in netlink where u64 attributes end
up unaligned in userspace because of how we size the netlink attribute
headers and the total attribute lengths. So we have to use __packed__
for these things now.

And you _DONT_ want to do this, because the code generated is
completely awful. It does byte at a time accesses for every access
and this is completely silly if the data is aligned most of the time.

So, again, better to put the onus on the kernel or whatever parses the
data to align things properly.

I totally disagree about your assertions about iwlib BTW, anybody
should be able to write code to parse these blobs coming out of the
kernel. Saying that everyone who wants to do so has to go through
iwlib is unreasonable.

If the kernel was converting the data streams correctly, we wouldn't
even having this conversation. And it really ticks me off that you
keep ignoring this point. If it had been done correctly from the
start, we'd have none of this compat code in iwlib, and none of these
problems.

But it's OK that we didn't fix it correctly at first, what is not
OK is refusing to recognize that the kernel does need to do these
conversions so we can fix this properly.

I think I'm finally angry enough about this to do the implementation
so expect to see some code soon...

2007-12-06 21:43:58

by Jean Tourrilhes

[permalink] [raw]
Subject: Re: zd1211rw (2.6.22 sparc64): unaligned access (do_rx)

On Thu, Dec 06, 2007 at 10:33:18PM +0100, Michael Buesch wrote:
> On Thursday 06 December 2007 22:25:25 Jean Tourrilhes wrote:
> > On Wed, Dec 05, 2007 at 06:36:28PM -0800, David Miller wrote:
> > > From: Jean Tourrilhes <[email protected]>
> > > Date: Wed, 5 Dec 2007 13:56:00 -0800
> > >
> > > > Personally, I was under the impression that in userspace libc
> > > > trap unaligned access and make them work. I mean, this code was tested
> > > > on other 64/32 bit platforms (PPC, AMD-64, PS3) and was working there,
> > > > so this seems to be specific to your platform.
> > >
> > > None of those listed platforms trap on unaligned accesses like sparc
> > > does.
> >
>
> > Now, I'm wondering how to deal with unaligned access in
> > userspace. If you get data from hardware or the network, you can not
> > guarantee that everything will always be aligned, so we need a way to
> > deal with it.
> > In the kernel, we have get_unaligned(). I wonder what's the
> > equivalent in userspace.
>
> Userspace handles this transparently through fault traps.

This is exactly what I was assuming. On the SPARC, in this
instance, it did not (check back earlier in this discussion).

> Greetings Michael.

Regards,

Jean