2019-04-28 08:47:28

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: endianness swapped

Hi Angelo,

CC commit relations

On Sat, Apr 27, 2019 at 10:22 PM Angelo Dureghello <[email protected]> wrote:
> On Sat, Apr 27, 2019 at 05:32:22PM +0200, Angelo Dureghello wrote:
> > as you may know, i am working on mcf5441x.
> > Sorry for not following carefully all the threads, but from a certain
> > kernel version (likely 4.19 or near there), seems ioread32be
> > reads the bytes swapped in endianness (mcf-edma dma driver not working
> > anymore).
> >
> > Has there been a change about this in the architecture I/O access ?
> > How should i proceed now ? Fixing the DMA driver read/write, or what ?

> looks like the reason of my ioread32be now swapped is:
>
> https://patchwork.kernel.org/patch/10766673/
>
> Trying to figure out what to do now.

This is commit aecc787c06f4300f ("iomap: Use non-raw io functions for
io{read|write}XXbe"):

--- a/lib/iomap.c
+++ b/lib/iomap.c
@@ -65,8 +65,8 @@ static void bad_io_access(unsigned long port, const
char *access)
#endif

#ifndef mmio_read16be
-#define mmio_read16be(addr) be16_to_cpu(__raw_readw(addr))
-#define mmio_read32be(addr) be32_to_cpu(__raw_readl(addr))
+#define mmio_read16be(addr) swab16(readw(addr))
+#define mmio_read32be(addr) swab32(readl(addr))
#endif

unsigned int ioread8(void __iomem *addr)
@@ -106,8 +106,8 @@ EXPORT_SYMBOL(ioread32be);
#endif

#ifndef mmio_write16be
-#define mmio_write16be(val,port) __raw_writew(be16_to_cpu(val),port)
-#define mmio_write32be(val,port) __raw_writel(be32_to_cpu(val),port)
+#define mmio_write16be(val,port) writew(swab16(val),port)
+#define mmio_write32be(val,port) writel(swab32(val),port)

On big endian, the raw accessors are assumed to be non-swapping,
while non-raw accessors are assumed to be swapping.
The latter is not true for Coldfire internal registers, cfr.
arch/m68k/include/asm/io_no.h:

static inline u16 readw(const volatile void __iomem *addr)
{
if (cf_internalio(addr))
return __raw_readw(addr);
return __le16_to_cpu(__raw_readw(addr));
}

Orthogonal to how Coldfire's read[wl]() should be fixed, I find it a bit
questionable to swap data twice on big endian architectures.

Fortunately we can avoid that by defining our own
mmio_{read,write}{16,32}be()...

Gr{oetje,eeting}s,

Geert


--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds


2019-04-28 09:25:29

by Arnd Bergmann

[permalink] [raw]
Subject: Re: endianness swapped

On Sun, Apr 28, 2019 at 10:46 AM Geert Uytterhoeven
<[email protected]> wrote:
> On Sat, Apr 27, 2019 at 10:22 PM Angelo Dureghello <[email protected]> wrote:
> > On Sat, Apr 27, 2019 at 05:32:22PM +0200, Angelo Dureghello wrote:
> > > as you may know, i am working on mcf5441x.
> > > Sorry for not following carefully all the threads, but from a certain
> > > kernel version (likely 4.19 or near there), seems ioread32be
> > > reads the bytes swapped in endianness (mcf-edma dma driver not working
> > > anymore).
> > >
> > > Has there been a change about this in the architecture I/O access ?
> > > How should i proceed now ? Fixing the DMA driver read/write, or what ?
>
> > looks like the reason of my ioread32be now swapped is:
> >
> > https://patchwork.kernel.org/patch/10766673/
> >
> > Trying to figure out what to do now.
>
> This is commit aecc787c06f4300f ("iomap: Use non-raw io functions for
> io{read|write}XXbe"):
>
> --- a/lib/iomap.c
> +++ b/lib/iomap.c
> @@ -65,8 +65,8 @@ static void bad_io_access(unsigned long port, const
> char *access)
> #endif
>
> #ifndef mmio_read16be
> -#define mmio_read16be(addr) be16_to_cpu(__raw_readw(addr))
> -#define mmio_read32be(addr) be32_to_cpu(__raw_readl(addr))
> +#define mmio_read16be(addr) swab16(readw(addr))
> +#define mmio_read32be(addr) swab32(readl(addr))
> #endif
>
> unsigned int ioread8(void __iomem *addr)
> @@ -106,8 +106,8 @@ EXPORT_SYMBOL(ioread32be);
> #endif
>
> #ifndef mmio_write16be
> -#define mmio_write16be(val,port) __raw_writew(be16_to_cpu(val),port)
> -#define mmio_write32be(val,port) __raw_writel(be32_to_cpu(val),port)
> +#define mmio_write16be(val,port) writew(swab16(val),port)
> +#define mmio_write32be(val,port) writel(swab32(val),port)
>
> On big endian, the raw accessors are assumed to be non-swapping,
> while non-raw accessors are assumed to be swapping.
> The latter is not true for Coldfire internal registers, cfr.
> arch/m68k/include/asm/io_no.h:

The raw accessors are always assumed to be non-swapping
in the asm-generic code, while the non-raw ones are assumed to
be little-endian in order for them to work with portable drivers.

We have some other cases of big-endian machines that use
a hardware byteswap on their MMIO buses (iirc some mips
and superh parts), but they then need to swap the __raw_*
accessor data words in software to get back to the normal
behavior, as well as swizzle the address for accesses that are
less than 32 bit wide.

Coldfire makes the behavior of readw()/readl() depend on the
MMIO address, presumably since that was the easiest way to
get drivers working originally, but it breaks the assumption
in the asm-generic code.

> static inline u16 readw(const volatile void __iomem *addr)
> {
> if (cf_internalio(addr))
> return __raw_readw(addr);
> return __le16_to_cpu(__raw_readw(addr));
> }
>
> Orthogonal to how Coldfire's read[wl]() should be fixed, I find it a bit
> questionable to swap data twice on big endian architectures.

I would expect that the compiler is capable of detecting a double
swap and optimize it out. Even if it can't, there are not that many
instances of io{read,write}{16,32}be in the kernel, so the increase
in kernel image size from a double swap should be limited to a
few extra instructions, and the runtime overhead should be
negligible compared to the bus access.

> Fortunately we can avoid that by defining our own
> mmio_{read,write}{16,32}be()...

Yes, that should be the easiest workaround. I doubt that anyone
would want to audit all drivers for internal I/O on coldfire and
convert them to use the big-endian accessors in place of the
readw/readl ones in order to use the asm-generic/io.h conventions.

Arnd

2019-04-28 14:10:52

by Greg Ungerer

[permalink] [raw]
Subject: Re: endianness swapped


On 28/4/19 7:21 pm, Arnd Bergmann wrote:
> On Sun, Apr 28, 2019 at 10:46 AM Geert Uytterhoeven
> <[email protected]> wrote:
>> On Sat, Apr 27, 2019 at 10:22 PM Angelo Dureghello <[email protected]> wrote:
>>> On Sat, Apr 27, 2019 at 05:32:22PM +0200, Angelo Dureghello wrote:
>>>> as you may know, i am working on mcf5441x.
>>>> Sorry for not following carefully all the threads, but from a certain
>>>> kernel version (likely 4.19 or near there), seems ioread32be
>>>> reads the bytes swapped in endianness (mcf-edma dma driver not working
>>>> anymore).
>>>>
>>>> Has there been a change about this in the architecture I/O access ?
>>>> How should i proceed now ? Fixing the DMA driver read/write, or what ?
>>
>>> looks like the reason of my ioread32be now swapped is:
>>>
>>> https://patchwork.kernel.org/patch/10766673/
>>>
>>> Trying to figure out what to do now.
>>
>> This is commit aecc787c06f4300f ("iomap: Use non-raw io functions for
>> io{read|write}XXbe"):
>>
>> --- a/lib/iomap.c
>> +++ b/lib/iomap.c
>> @@ -65,8 +65,8 @@ static void bad_io_access(unsigned long port, const
>> char *access)
>> #endif
>>
>> #ifndef mmio_read16be
>> -#define mmio_read16be(addr) be16_to_cpu(__raw_readw(addr))
>> -#define mmio_read32be(addr) be32_to_cpu(__raw_readl(addr))
>> +#define mmio_read16be(addr) swab16(readw(addr))
>> +#define mmio_read32be(addr) swab32(readl(addr))
>> #endif
>>
>> unsigned int ioread8(void __iomem *addr)
>> @@ -106,8 +106,8 @@ EXPORT_SYMBOL(ioread32be);
>> #endif
>>
>> #ifndef mmio_write16be
>> -#define mmio_write16be(val,port) __raw_writew(be16_to_cpu(val),port)
>> -#define mmio_write32be(val,port) __raw_writel(be32_to_cpu(val),port)
>> +#define mmio_write16be(val,port) writew(swab16(val),port)
>> +#define mmio_write32be(val,port) writel(swab32(val),port)
>>
>> On big endian, the raw accessors are assumed to be non-swapping,
>> while non-raw accessors are assumed to be swapping.
>> The latter is not true for Coldfire internal registers, cfr.
>> arch/m68k/include/asm/io_no.h:
>
> The raw accessors are always assumed to be non-swapping
> in the asm-generic code, while the non-raw ones are assumed to
> be little-endian in order for them to work with portable drivers.
>
> We have some other cases of big-endian machines that use
> a hardware byteswap on their MMIO buses (iirc some mips
> and superh parts), but they then need to swap the __raw_*
> accessor data words in software to get back to the normal
> behavior, as well as swizzle the address for accesses that are
> less than 32 bit wide.
>
> Coldfire makes the behavior of readw()/readl() depend on the
> MMIO address, presumably since that was the easiest way to
> get drivers working originally, but it breaks the assumption
> in the asm-generic code.

Yes, that is right.

There is a number of common hardware modules that Freescale have
used in the ColdFire SoC parts and in their ARM based parts (iMX
families). The ARM parts are pretty much always little endian, and
the ColdFire is always big endian. The hardware registers in those
hardware blocks are always accessed in native endian of the processor.

So the address range checks are to deal with those internal
hardware blocks (i2c, spi, dma, etc), since we know those are
at fixed addresses. That leaves the usual endian swapping in place for
other general (ie external) devices (PCI devices, network chips, etc).


>> static inline u16 readw(const volatile void __iomem *addr)
>> {
>> if (cf_internalio(addr))
>> return __raw_readw(addr);
>> return __le16_to_cpu(__raw_readw(addr));
>> }
>>
>> Orthogonal to how Coldfire's read[wl]() should be fixed, I find it a bit
>> questionable to swap data twice on big endian architectures.
>
> I would expect that the compiler is capable of detecting a double
> swap and optimize it out. Even if it can't, there are not that many
> instances of io{read,write}{16,32}be in the kernel, so the increase
> in kernel image size from a double swap should be limited to a
> few extra instructions, and the runtime overhead should be
> negligible compared to the bus access.
>
>> Fortunately we can avoid that by defining our own
>> mmio_{read,write}{16,32}be()...

Makes sense.

Regards
Greg


2019-04-28 18:48:51

by Arnd Bergmann

[permalink] [raw]
Subject: Re: endianness swapped

On Sun, Apr 28, 2019 at 3:59 PM Greg Ungerer <[email protected]> wrote:
> On 28/4/19 7:21 pm, Arnd Bergmann wrote:
> > On Sun, Apr 28, 2019 at 10:46 AM Geert Uytterhoeven
> > <[email protected]> wrote:
> >> On Sat, Apr 27, 2019 at 10:22 PM Angelo Dureghello <[email protected]> wrote:
> >>> On Sat, Apr 27, 2019 at 05:32:22PM +0200, Angelo Dureghello wrote:
> >
> > Coldfire makes the behavior of readw()/readl() depend on the
> > MMIO address, presumably since that was the easiest way to
> > get drivers working originally, but it breaks the assumption
> > in the asm-generic code.
>
> Yes, that is right.
>
> There is a number of common hardware modules that Freescale have
> used in the ColdFire SoC parts and in their ARM based parts (iMX
> families). The ARM parts are pretty much always little endian, and
> the ColdFire is always big endian. The hardware registers in those
> hardware blocks are always accessed in native endian of the processor.

In later Freescale/NXP ARM SoCs (i.MX and Layerscape), we
also get a lot of devices pulled over from PowerPC, with random
endianess. In some cases, the same device that had big-endian
registers originally ends up in two different ARM products and one of
them uses big-endian while the other one uses little-endian registers.

> So the address range checks are to deal with those internal
> hardware blocks (i2c, spi, dma, etc), since we know those are
> at fixed addresses. That leaves the usual endian swapping in place for
> other general (ie external) devices (PCI devices, network chips, etc).

Is there a complete list of coldfire on-chip device drivers?

Looking at some of the drivers:

- drivers/i2c/busses/i2c-imx.c uses only 8-bit accesses and works either way,
same for drivers/tty/serial/mcf.c
- drivers/spi/spi-coldfire-qspi.c is apparently coldfire-only and could use
ioread32be for a portable to do big-endian register access.
- edma-common has a wrapper to support both big-endian and little-endian
configurations in the same kernel image, but the mcf interrupt handler
is hardcoded to the (normally) little-endian ioread32 function.
- drivers/net/ethernet/freescale/fec_main.c is shared between coldfire
and i.MX (but not mpc52xx), and is hardcoded to readl/writel, and
would need the same trick as edma to make it portable.

Arnd

2019-04-28 21:41:04

by Angelo Dureghello

[permalink] [raw]
Subject: Re: endianness swapped

Hi all,

On Sun, Apr 28, 2019 at 08:44:03PM +0200, Arnd Bergmann wrote:
> On Sun, Apr 28, 2019 at 3:59 PM Greg Ungerer <[email protected]> wrote:
> > On 28/4/19 7:21 pm, Arnd Bergmann wrote:
> > > On Sun, Apr 28, 2019 at 10:46 AM Geert Uytterhoeven
> > > <[email protected]> wrote:
> > >> On Sat, Apr 27, 2019 at 10:22 PM Angelo Dureghello <[email protected]> wrote:
> > >>> On Sat, Apr 27, 2019 at 05:32:22PM +0200, Angelo Dureghello wrote:
> > >
> > > Coldfire makes the behavior of readw()/readl() depend on the
> > > MMIO address, presumably since that was the easiest way to
> > > get drivers working originally, but it breaks the assumption
> > > in the asm-generic code.
> >
> > Yes, that is right.
> >
> > There is a number of common hardware modules that Freescale have
> > used in the ColdFire SoC parts and in their ARM based parts (iMX
> > families). The ARM parts are pretty much always little endian, and
> > the ColdFire is always big endian. The hardware registers in those
> > hardware blocks are always accessed in native endian of the processor.
>
> In later Freescale/NXP ARM SoCs (i.MX and Layerscape), we
> also get a lot of devices pulled over from PowerPC, with random
> endianess. In some cases, the same device that had big-endian
> registers originally ends up in two different ARM products and one of
> them uses big-endian while the other one uses little-endian registers.
>

Yes, this seems confirmed also from the drivers/dma/fsl-edma-common.h
comment:

/*
* R/W functions for big- or little-endian registers:
* The eDMA controller's endian is independent of the CPU core's endian.
* For the big-endian IP module, the offset for 8-bit or 16-bit registers
* should also be swapped opposite to that in little-endian IP.
*/

> > So the address range checks are to deal with those internal
> > hardware blocks (i2c, spi, dma, etc), since we know those are
> > at fixed addresses. That leaves the usual endian swapping in place for
> > other general (ie external) devices (PCI devices, network chips, etc).
>
> Is there a complete list of coldfire on-chip device drivers?
>

I can list those i worked on

i2c-imx.c
spi-fsl-dspi.c
mcf-edma.c + fsl-edma.common.c
now working on a sdhci-esdhc-mcf.c

And about mcf5441x, some other drivers as usb or probably can have still
to be enabled/mainlined.

> Looking at some of the drivers:
>
> - drivers/i2c/busses/i2c-imx.c uses only 8-bit accesses and works either way,
> same for drivers/tty/serial/mcf.c
> - drivers/spi/spi-coldfire-qspi.c is apparently coldfire-only and could use
> ioread32be for a portable to do big-endian register access.
> - edma-common has a wrapper to support both big-endian and little-endian
> configurations in the same kernel image, but the mcf interrupt handler
> is hardcoded to the (normally) little-endian ioread32 function.
> - drivers/net/ethernet/freescale/fec_main.c is shared between coldfire
> and i.MX (but not mpc52xx), and is hardcoded to readl/writel, and
> would need the same trick as edma to make it portable.
>
> Arnd

Regards,
Angelo

2019-04-29 07:05:26

by Greg Ungerer

[permalink] [raw]
Subject: Re: endianness swapped

Hi Arnd,

On 29/4/19 4:44 am, Arnd Bergmann wrote:
> On Sun, Apr 28, 2019 at 3:59 PM Greg Ungerer <[email protected]> wrote:
>> On 28/4/19 7:21 pm, Arnd Bergmann wrote:
>>> On Sun, Apr 28, 2019 at 10:46 AM Geert Uytterhoeven
>>> <[email protected]> wrote:
>>>> On Sat, Apr 27, 2019 at 10:22 PM Angelo Dureghello <[email protected]> wrote:
>>>>> On Sat, Apr 27, 2019 at 05:32:22PM +0200, Angelo Dureghello wrote:
>>>
>>> Coldfire makes the behavior of readw()/readl() depend on the
>>> MMIO address, presumably since that was the easiest way to
>>> get drivers working originally, but it breaks the assumption
>>> in the asm-generic code.
>>
>> Yes, that is right.
>>
>> There is a number of common hardware modules that Freescale have
>> used in the ColdFire SoC parts and in their ARM based parts (iMX
>> families). The ARM parts are pretty much always little endian, and
>> the ColdFire is always big endian. The hardware registers in those
>> hardware blocks are always accessed in native endian of the processor.
>
> In later Freescale/NXP ARM SoCs (i.MX and Layerscape), we
> also get a lot of devices pulled over from PowerPC, with random
> endianess. In some cases, the same device that had big-endian
> registers originally ends up in two different ARM products and one of
> them uses big-endian while the other one uses little-endian registers.
>
>> So the address range checks are to deal with those internal
>> hardware blocks (i2c, spi, dma, etc), since we know those are
>> at fixed addresses. That leaves the usual endian swapping in place for
>> other general (ie external) devices (PCI devices, network chips, etc).
>
> Is there a complete list of coldfire on-chip device drivers?
>
> Looking at some of the drivers:
>
> - drivers/i2c/busses/i2c-imx.c uses only 8-bit accesses and works either way,
> same for drivers/tty/serial/mcf.c
> - drivers/spi/spi-coldfire-qspi.c is apparently coldfire-only and could use
> ioread32be for a portable to do big-endian register access.
> - edma-common has a wrapper to support both big-endian and little-endian
> configurations in the same kernel image, but the mcf interrupt handler
> is hardcoded to the (normally) little-endian ioread32 function.
> - drivers/net/ethernet/freescale/fec_main.c is shared between coldfire
> and i.MX (but not mpc52xx), and is hardcoded to readl/writel, and
> would need the same trick as edma to make it portable.

That matches up with what we list out in arch/m68k/coldfire/devices.c.
I can't think of any other drivers.

There is a lot of use readl/writel and friends in the architecture
specific code too, in arch/m68k/coldfire. At first I used __raw_readl/
__raw_writel to always get native endianess. But quote a few uses of
readl/writel have crept in over the years.

Regards
Greg


2019-04-29 08:41:31

by Arnd Bergmann

[permalink] [raw]
Subject: Re: endianness swapped

On Mon, Apr 29, 2019 at 9:04 AM Greg Ungerer <[email protected]> wrote:
> On 29/4/19 4:44 am, Arnd Bergmann wrote:
> > Is there a complete list of coldfire on-chip device drivers?
> >
> > Looking at some of the drivers:
> >
> > - drivers/i2c/busses/i2c-imx.c uses only 8-bit accesses and works either way,
> > same for drivers/tty/serial/mcf.c
> > - drivers/spi/spi-coldfire-qspi.c is apparently coldfire-only and could use
> > ioread32be for a portable to do big-endian register access.
> > - edma-common has a wrapper to support both big-endian and little-endian
> > configurations in the same kernel image, but the mcf interrupt handler
> > is hardcoded to the (normally) little-endian ioread32 function.
> > - drivers/net/ethernet/freescale/fec_main.c is shared between coldfire
> > and i.MX (but not mpc52xx), and is hardcoded to readl/writel, and
> > would need the same trick as edma to make it portable.
>
> That matches up with what we list out in arch/m68k/coldfire/devices.c.
> I can't think of any other drivers.

Ok

> There is a lot of use readl/writel and friends in the architecture
> specific code too, in arch/m68k/coldfire. At first I used __raw_readl/
> __raw_writel to always get native endianess. But quote a few uses of
> readl/writel have crept in over the years.

I generally recommend avoiding the __raw_ accessors altogether
except for very special cases (e.g. copying from device memory
into main RAM in fixed-size units). On most architectures these
days, MMIO registers require additional properties that are
provided by readl() but not __raw_readl(), in particular atomicity
and ordering against other memory accesses and spinlocks.
Presumably m68k is similar to i386 here in that the __raw_ version
already provides all those guarantees due to the way the architecture
is defined.

What tends to work best in cases like this is to have a bus specific
wrapper around readl/ioread32/ioread32be that does exactly what
is needed for the bus that a device is on. As we concluded earlier,
it's easy enough to define coldfire specific ioread32be etc that
do what we want here, but it also sounds like it would not be hard
to change arch/m68k/coldfire/*.c, drivers/net/ethernet/freescale/fec*
and drivers/dma/*edma*.c to use big-endian accessors, and
then make coldfire use the generic readl/writel and
ioread32be/iowrite32be.

Arnd

2019-04-29 08:45:50

by Geert Uytterhoeven

[permalink] [raw]
Subject: Re: endianness swapped

Hi Greg,

On Sun, Apr 28, 2019 at 3:59 PM Greg Ungerer <[email protected]> wrote:
> On 28/4/19 7:21 pm, Arnd Bergmann wrote:
> > On Sun, Apr 28, 2019 at 10:46 AM Geert Uytterhoeven
> > <[email protected]> wrote:
> >> On Sat, Apr 27, 2019 at 10:22 PM Angelo Dureghello <[email protected]> wrote:
> >>> On Sat, Apr 27, 2019 at 05:32:22PM +0200, Angelo Dureghello wrote:
> >>>> as you may know, i am working on mcf5441x.
> >>>> Sorry for not following carefully all the threads, but from a certain
> >>>> kernel version (likely 4.19 or near there), seems ioread32be
> >>>> reads the bytes swapped in endianness (mcf-edma dma driver not working
> >>>> anymore).
> >>>>
> >>>> Has there been a change about this in the architecture I/O access ?
> >>>> How should i proceed now ? Fixing the DMA driver read/write, or what ?
> >>
> >>> looks like the reason of my ioread32be now swapped is:
> >>>
> >>> https://patchwork.kernel.org/patch/10766673/
> >>>
> >>> Trying to figure out what to do now.
> >>
> >> This is commit aecc787c06f4300f ("iomap: Use non-raw io functions for
> >> io{read|write}XXbe"):
> >>
> >> --- a/lib/iomap.c
> >> +++ b/lib/iomap.c
> >> @@ -65,8 +65,8 @@ static void bad_io_access(unsigned long port, const
> >> char *access)
> >> #endif
> >>
> >> #ifndef mmio_read16be
> >> -#define mmio_read16be(addr) be16_to_cpu(__raw_readw(addr))
> >> -#define mmio_read32be(addr) be32_to_cpu(__raw_readl(addr))
> >> +#define mmio_read16be(addr) swab16(readw(addr))
> >> +#define mmio_read32be(addr) swab32(readl(addr))
> >> #endif
> >>
> >> unsigned int ioread8(void __iomem *addr)
> >> @@ -106,8 +106,8 @@ EXPORT_SYMBOL(ioread32be);
> >> #endif
> >>
> >> #ifndef mmio_write16be
> >> -#define mmio_write16be(val,port) __raw_writew(be16_to_cpu(val),port)
> >> -#define mmio_write32be(val,port) __raw_writel(be32_to_cpu(val),port)
> >> +#define mmio_write16be(val,port) writew(swab16(val),port)
> >> +#define mmio_write32be(val,port) writel(swab32(val),port)
> >>
> >> On big endian, the raw accessors are assumed to be non-swapping,
> >> while non-raw accessors are assumed to be swapping.
> >> The latter is not true for Coldfire internal registers, cfr.
> >> arch/m68k/include/asm/io_no.h:

> >> static inline u16 readw(const volatile void __iomem *addr)
> >> {
> >> if (cf_internalio(addr))
> >> return __raw_readw(addr);
> >> return __le16_to_cpu(__raw_readw(addr));
> >> }
> >>
> >> Orthogonal to how Coldfire's read[wl]() should be fixed, I find it a bit
> >> questionable to swap data twice on big endian architectures.
> >
> > I would expect that the compiler is capable of detecting a double
> > swap and optimize it out. Even if it can't, there are not that many
> > instances of io{read,write}{16,32}be in the kernel, so the increase
> > in kernel image size from a double swap should be limited to a
> > few extra instructions, and the runtime overhead should be
> > negligible compared to the bus access.

Probably the overhead is not negligible on old m68k...

> >> Fortunately we can avoid that by defining our own
> >> mmio_{read,write}{16,32}be()...
>
> Makes sense.

I've just sent a patch to do that, as a fix for v5.1 or v5.2.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2019-04-29 11:16:28

by Arnd Bergmann

[permalink] [raw]
Subject: Re: endianness swapped

On Mon, Apr 29, 2019 at 10:44 AM Geert Uytterhoeven
<[email protected]> wrote:
> On Sun, Apr 28, 2019 at 3:59 PM Greg Ungerer <[email protected]> wrote:
> > On 28/4/19 7:21 pm, Arnd Bergmann wrote:
> > > On Sun, Apr 28, 2019 at 10:46 AM Geert Uytterhoeven
> > > <[email protected]> wrote:
> > >>
> > >> Orthogonal to how Coldfire's read[wl]() should be fixed, I find it a bit
> > >> questionable to swap data twice on big endian architectures.
> > >
> > > I would expect that the compiler is capable of detecting a double
> > > swap and optimize it out. Even if it can't, there are not that many
> > > instances of io{read,write}{16,32}be in the kernel, so the increase
> > > in kernel image size from a double swap should be limited to a
> > > few extra instructions, and the runtime overhead should be
> > > negligible compared to the bus access.
>
> Probably the overhead is not negligible on old m68k...

Maybe the I/O devices are faster than I expected then. I was guessing
that there is still a factor of 100x or more between going to an on-chip
bus and single byterev register-to-register instruction.

I did notice that __arch_swab32() is an inline assembly, so the
compiler has no way of eliminating the double swap, but
setting CONFIG_ARCH_USE_BUILTIN_BSWAP makes it
do the right thing.

Arnd