From: Michael Hennerich <[email protected]>
Signed-off-by: Michael Hennerich <[email protected]>
Signed-off-by: Bryan Wu <[email protected]>
---
drivers/usb/host/isp1760-hcd.c | 67 ++++++++++++++++++++++++++++-----------
1 files changed, 48 insertions(+), 19 deletions(-)
diff --git a/drivers/usb/host/isp1760-hcd.c b/drivers/usb/host/isp1760-hcd.c
index 8017f1c..00bece2 100644
--- a/drivers/usb/host/isp1760-hcd.c
+++ b/drivers/usb/host/isp1760-hcd.c
@@ -136,12 +136,21 @@ static void priv_read_copy(struct isp1760_hcd *priv, u32 *src,
return;
}
- while (len >= 4) {
- *src = __raw_readl(dst);
- len -= 4;
- src++;
- dst++;
- }
+ if (unlikely((u32)src & 0x3)) {
+ while (len >= 4) {
+ put_unaligned(__raw_readl(dst), src);
+ len -= 4;
+ src++;
+ dst++;
+ }
+ } else {
+ while (len >= 4) {
+ *src = __raw_readl(dst);
+ len -= 4;
+ src++;
+ dst++;
+ }
+ }
if (!len)
return;
@@ -159,25 +168,45 @@ static void priv_read_copy(struct isp1760_hcd *priv, u32 *src,
len--;
buff8++;
}
+
}
static void priv_write_copy(const struct isp1760_hcd *priv, const u32 *src,
__u32 __iomem *dst, u32 len)
{
- while (len >= 4) {
- __raw_writel(*src, dst);
- len -= 4;
- src++;
- dst++;
- }
- if (!len)
- return;
- /* in case we have 3, 2 or 1 by left. The buffer is allocated and the
- * extra bytes should not be read by the HW
- */
-
- __raw_writel(*src, dst);
+ if (unlikely((u32)src & 0x3)) {
+ while (len >= 4) {
+ __raw_writel(get_unaligned(src), dst);
+ len -= 4;
+ src++;
+ dst++;
+ }
+
+ if (!len)
+ return;
+ /* in case we have 3, 2 or 1 by left. The buffer is allocated and the
+ * extra bytes should not be read by the HW
+ */
+
+ __raw_writel(get_unaligned(src), dst);
+
+ } else{
+ while (len >= 4) {
+ __raw_writel(*src, dst);
+ len -= 4;
+ src++;
+ dst++;
+ }
+
+ if (!len)
+ return;
+ /* in case we have 3, 2 or 1 by left. The buffer is allocated and the
+ * extra bytes should not be read by the HW
+ */
+
+ __raw_writel(*src, dst);
+ }
}
/* memory management of the 60kb on the chip from 0x1000 to 0xffff */
--
1.5.6.3
Bryan Wu wrote:
> From: Michael Hennerich <[email protected]>
> Signed-off-by: Michael Hennerich <[email protected]>
> Signed-off-by: Bryan Wu <[email protected]>
> ---
> drivers/usb/host/isp1760-hcd.c | 67 ++++++++++++++++++++++++++++-----------
> 1 files changed, 48 insertions(+), 19 deletions(-)
>
> diff --git a/drivers/usb/host/isp1760-hcd.c b/drivers/usb/host/isp1760-hcd.c
> index 8017f1c..00bece2 100644
> --- a/drivers/usb/host/isp1760-hcd.c
> +++ b/drivers/usb/host/isp1760-hcd.c
> @@ -136,12 +136,21 @@ static void priv_read_copy(struct isp1760_hcd *priv, u32 *src,
> return;
> }
>
> - while (len >= 4) {
> - *src = __raw_readl(dst);
> - len -= 4;
> - src++;
> - dst++;
> - }
> + if (unlikely((u32)src & 0x3)) {
> + while (len >= 4) {
> + put_unaligned(__raw_readl(dst), src);
> + len -= 4;
> + src++;
> + dst++;
> + }
> + } else {
> + while (len >= 4) {
> + *src = __raw_readl(dst);
> + len -= 4;
> + src++;
> + dst++;
> + }
> + }
>
> if (!len)
> return;
> @@ -159,25 +168,45 @@ static void priv_read_copy(struct isp1760_hcd *priv, u32 *src,
> len--;
> buff8++;
> }
> +
> }
>
> static void priv_write_copy(const struct isp1760_hcd *priv, const u32 *src,
> __u32 __iomem *dst, u32 len)
> {
> - while (len >= 4) {
> - __raw_writel(*src, dst);
> - len -= 4;
> - src++;
> - dst++;
> - }
>
> - if (!len)
> - return;
> - /* in case we have 3, 2 or 1 by left. The buffer is allocated and the
> - * extra bytes should not be read by the HW
> - */
> -
> - __raw_writel(*src, dst);
The link [1] you sent me reports an unaligned access that occurred here.
So I think the access to *src should use either a get_unaligned() helper or a
byte read loop, as I did in the read path.
The r8a66597 driver does the same thing (as you suggest) for one SH machine.
However, I'm not convinced this is the way to fix it: the buffer should be
properly aligned by the driver. Unless there is HW which requires
unaligned data, I would prefer to fix just this one unaligned access.
According to the thread in [1], a similar patch fixed it for the user until
he ran into another bug.
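For illustration only, a minimal userspace sketch of the two options named above (an unaligned helper for whole words, a byte copy for the 1 to 3 byte tail). The names load_u32_unaligned(), store_u32_unaligned() and copy_to_words() are hypothetical stand-ins, not kernel or driver API, and the plain array stands in for the chip memory. Unlike the patch, the tail here is zero-padded rather than read as a full word past the end of the source:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for get_unaligned(): memcpy() makes no
 * assumption about the alignment of p. */
static uint32_t load_u32_unaligned(const void *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Hypothetical stand-in for put_unaligned(). */
static void store_u32_unaligned(uint32_t v, void *p)
{
	memcpy(p, &v, sizeof(v));
}

/* Copy len bytes from src (any alignment) into 32-bit words at dst.
 * A trailing 1..3 bytes are copied byte-wise into a zeroed word, so
 * src is never read beyond len. Returns the number of words written. */
static size_t copy_to_words(uint32_t *dst, const uint8_t *src, size_t len)
{
	size_t words = 0;

	while (len >= 4) {
		dst[words++] = load_u32_unaligned(src);
		src += 4;
		len -= 4;
	}

	if (len) {
		uint32_t tail = 0;

		memcpy(&tail, src, len);
		dst[words++] = tail;
	}

	return words;
}
```

Whether zero-padding the tail is acceptable for the real hardware is an assumption of this sketch; the original comment only says the extra bytes should not be read by the HW.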
> + if (unlikely((u32)src & 0x3)) {
> + while (len >= 4) {
> + __raw_writel(get_unaligned(src), dst);
> + len -= 4;
> + src++;
> + dst++;
> + }
> +
> + if (!len)
> + return;
> + /* in case we have 3, 2 or 1 by left. The buffer is allocated and the
> + * extra bytes should not be read by the HW
> + */
> +
> + __raw_writel(get_unaligned(src), dst);
> +
> + } else{
> + while (len >= 4) {
> + __raw_writel(*src, dst);
> + len -= 4;
> + src++;
> + dst++;
> + }
> +
> + if (!len)
> + return;
> + /* in case we have 3, 2 or 1 by left. The buffer is allocated and the
> + * extra bytes should not be read by the HW
> + */
> +
> + __raw_writel(*src, dst);
> + }
> }
>
> /* memory management of the 60kb on the chip from 0x1000 to 0xffff */
[1]
http://blackfin.uclinux.org/gf/project/uclinux-dist/forum/?action=ForumBrowse&_forum_action=MessageReply&message_id=64889
Sebastian
Sebastian,
>The link [1] you sent me reports an unaligned access that occurred here.
>So I think the access to *src should use either a get_unaligned() helper or a
>byte read loop, as I did in the read path.
It's not just that single spot.
I've seen unaligned pointers with count > 3 coming from various drivers.
Here just two examples:
1) The generic Bluetooth USB driver: CONFIG_BT_HCIUSB
Bluez-utils: hcitool scan:
priv_write_copy: src = 00efaa09, dst = 203c1200, len = 13
Full trace attached.
2) RTL8150 based USB Ethernet adapter: CONFIG_USB_RTL8150
dhcpcd:
priv_read_copy: src = 00ea4812, dst = 203d8000, len = 64
This trace was taken with the unaligned workaround for lengths < 4.
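As a worked check of the two addresses above, here is a small sketch of the same test the patch applies, (u32)src & 0x3. The helper names are hypothetical and exist only for this illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr is not a multiple of 4, mirroring the
 * ((u32)src & 0x3) test used in the patch. */
static bool is_unaligned32(uintptr_t addr)
{
	return (addr & 0x3) != 0;
}

/* Number of bytes addr lies past the previous 4-byte boundary (0..3). */
static unsigned int misalignment32(uintptr_t addr)
{
	return (unsigned int)(addr & 0x3);
}
```

With these, src = 00efaa09 is 1 byte past a word boundary and src = 00ea4812 is 2 bytes past one, while both dst values (203c1200 and 203d8000) are word aligned.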
I wonder if it's only us (NOMMU) seeing these odd aligned buffers?
-Michael
>-----Original Message-----
>From: Sebastian Andrzej Siewior [mailto:[email protected]]
>Sent: Tuesday, November 18, 2008 11:52 AM
>To: Bryan Wu
>Cc: [email protected]; [email protected]; Michael
>Hennerich
>Subject: Re: [PATCH] USB/ISP1760: Fix for unaligned exceptions
>
* Hennerich, Michael | 2008-11-18 15:41:01 [-0000]:
>Sebastian,
Michael,
>It's not just that single spot.
>I've seen unaligned pointers with count > 3 coming from various drivers.
>
>Here just two examples:
>
>1) The generic Bluetooth USB driver: CONFIG_BT_HCIUSB
>Bluez-utils: hcitool scan:
>
>priv_write_copy: src = 00efaa09, dst = 203c1200, len = 13
>
>Full trace attached.
The trace is missing the kernel stack, isn't it?
>
>2) RTL8150 based USB Ethernet adapter: CONFIG_USB_RTL8150
>dhcpcd:
>
>priv_read_copy: src = 00ea4812, dst = 203d8000, len = 64
0x00ea4812 doesn't feel right. Unless I'm missing something, this is
coming from rtl8150_open() while it was calling set_registers() to set
the MAC address. So I assume the buffer is the MAC address. This is
hardly possible because the MAC address itself is 6 bytes long and the
accompanying control packet has 8 bytes, while the trace says that the
transfer length is 64 bytes. And since this is a control message, we
should not receive any response from the device.
Anyway, with WirelessEXT & NETPOLL in 32-bit mode the offset from the
beginning of netdev to the MAC address is 0x013c bytes, which should be fine
for 32-bit access. So either the netdev struct isn't properly aligned or
this is a different transfer.
>I wonder if it's only us (NOMMU) seeing these odd aligned buffers?
Not sure. The only problem I have with this patch is that it might
cover up bugs in drivers, and you won't notice them anymore since you
"voluntarily" choose the slow path.
>-Michael
Sebastian
>-----Original Message-----
>From: Sebastian Andrzej Siewior [mailto:[email protected]]
>Sent: Wednesday, November 19, 2008 10:19 AM
>To: Hennerich, Michael
>Cc: Bryan Wu; [email protected]; [email protected]
>Subject: Re: [PATCH] USB/ISP1760: Fix for unaligned exceptions
>
>* Hennerich, Michael | 2008-11-18 15:41:01 [-0000]:
>
>>Sebastian,
>Michael,
>
>>It's not just that single spot.
>>I've seen unaligned pointers with count > 3 coming from various drivers.
>>
>>Here just two examples:
>>
>>1) The generic Bluetooth USB driver: CONFIG_BT_HCIUSB
>>Bluez-utils: hcitool scan:
>>
>>priv_write_copy: src = 00efaa09, dst = 203c1200, len = 13
>>
>>Full trace attached.
>The trace is missing the kernel stack, isn't it?
Well, in that particular case this doesn't look right.
I need to check the way we print the kernel stack in the dump.
>
>>
>>2) RTL8150 based USB Ethernet adapter: CONFIG_USB_RTL8150
>>dhcpcd:
>>
>>priv_read_copy: src = 00ea4812, dst = 203d8000, len = 64
>0x00ea4812 doesn't feel right. Unless I'm missing something, this is
>coming from rtl8150_open() while it was calling set_registers() to set
>the MAC address. So I assume the buffer is the MAC address. This is
>hardly possible because the MAC address itself is 6 bytes long and the
>accompanying control packet has 8 bytes, while the trace says that the
>transfer length is 64 bytes. And since this is a control message, we
>should not receive any response from the device.
>Anyway, with WirelessEXT & NETPOLL in 32-bit mode the offset from the
>beginning of netdev to the MAC address is 0x013c bytes, which should be fine
>for 32-bit access. So either the netdev struct isn't properly aligned or
>this is a different transfer.
I know the issue originates in either the RTL8150 set_registers() or
get_registers(). An unaligned address from the stack gets passed down to
the ISP1760 priv_read/write_copy.
The RTL8150 driver does something like this:
u8 data[3], tmp;
data[0] = phy;
data[1] = data[2] = 0;
tmp = indx | PHY_READ | PHY_GO;
i = 0;
set_registers(dev, PHYADD, sizeof(data), data);
With gcc-3.x this was never a problem because u8 data[] always
ended up 4-byte aligned. However, when compiled with gcc-4.x, u8 data[]
can end up at an odd address.
>
>>I wonder if it's only us (NOMMU) seeing these odd aligned buffers?
>
>Not sure. The only problem I have with this patch is that it might
>cover up bugs in drivers, and you won't notice them anymore since you
>"voluntarily" choose the slow path.
Well, here I disagree, although I agree that there are buggy
drivers.
Since most processors running Linux do have unaligned access handling,
this issue goes unnoticed on all of them. Believe me, the penalty taken
by any processor doing this automatically and unnoticed is typically
much higher than that of using get/put_unaligned.
I'm tired of fixing all the unaligned issues in drivers. It's a hassle getting
the fixes merged, since most people don't care. Having a workaround in a
single place, the hcd driver, is much easier.
>
>>-Michael
>
>Sebastian
Hennerich, Michael wrote:
> I know the issue originates in either the RTL8150 set_registers() or
> get_registers(). An unaligned address from the stack gets passed down to
> the ISP1760 priv_read/write_copy.
>
> The RTL8150 driver does something like this:
>
> u8 data[3], tmp;
>
> data[0] = phy;
> data[1] = data[2] = 0;
> tmp = indx | PHY_READ | PHY_GO;
> i = 0;
>
> set_registers(dev, PHYADD, sizeof(data), data);
Ach. So that's wrong anyway. There are arches which can't DMA from stack
memory, so fixing this properly fixes more than just your arch.
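To make "fixing this properly" concrete, here is a userspace sketch of the general idea: copy the small payload into a heap buffer instead of handing out the address of a stack array. dup_for_transfer() is a hypothetical helper invented for this illustration; in a driver the analogous step would presumably be a kmalloc()'d buffer, which is an assumption here and not what the RTL8150 driver does today.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate a small payload into heap storage.
 * malloc() returns memory aligned for any fundamental type, so the
 * buffer is at least 4-byte aligned on common platforms regardless of
 * where the compiler placed the caller's u8 array.
 * Returns NULL on allocation failure; the caller frees the buffer. */
static uint8_t *dup_for_transfer(const uint8_t *data, size_t len)
{
	uint8_t *buf = malloc(len ? len : 1);

	if (!buf)
		return NULL;

	memcpy(buf, data, len);
	return buf;
}
```

The caller would pass the returned buffer to the transfer routine in place of the stack array and free it once the transfer has completed.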
>>> I wonder if it's only us (NOMMU) seeing these odd aligned buffers?
>> Not sure. The only problem I have with this patch is that it might
>> cover up bugs in drivers, and you won't notice them anymore since you
>> "voluntarily" choose the slow path.
>
> Well, here I disagree, although I agree that there are buggy
> drivers.
>
> Since most processors running Linux do have unaligned access handling,
> this issue goes unnoticed on all of them. Believe me, the penalty taken
> by any processor doing this automatically and unnoticed is typically
> much higher than that of using get/put_unaligned.
Okay. A packed struct with a u8 followed by a u16, where the layout is
required by the spec, can't be fixed. An unaligned helper is the only
solution there, I agree.
Allocating memory on the stack for a DMA transfer, however, is wrong.
On PowerPC and x86, get_unaligned() does not behave any differently from a
normal dereference, so I doubt that there is a performance improvement.
> I'm tired of fixing all the unaligned issues in drivers. It's a hassle getting
> the fixes merged, since most people don't care. Having a workaround in a
> single place, the hcd driver, is much easier.
Having a fixup in the exception handler, like sparc does, is probably a little
slower than the fixup here. On the other hand, you would not have to fix
unaligned accesses anymore.
>>> -Michael
Sebastian