2022-08-16 21:07:42

by Linus Walleij

[permalink] [raw]
Subject: [PATCH 1/2 v5] regmap: Support accelerated noinc operations

Several architectures have accelerated operations for MMIO
operations writing to a single register, such as writesb, writesw,
writesl, writesq, readsb, readsw, readsl and readsq but regmap
currently cannot use them because we have no hooks for providing
an accelerated noinc back-end for MMIO.

Solve this by providing reg_[read/write]_noinc callbacks for
the bus abstraction, so that the regmap-mmio bus can use this.

Currently I do not see a need to support this for custom regmaps
so it is only added to the bus.

Callbacks are passed a void * with the array of values and a
count which is the number of items of the byte chunk size for
the specific register width.

Signed-off-by: Linus Walleij <[email protected]>
---
ChangeLog v4->v5:
- Repost with the other patch as it was discovered that the
Hexagon needs fixing to support writesb/readsb.
ChangeLog v3->v4:
- Rebase on regmap for-next
ChangeLog v2->v3:
- Rebase on kernel v6.0-rc1
ChangeLog v1->v2:
- Factor out and reuse the code to format and read or write a
buffer of data to a noinc register at the cost of dropping
const from the buffer pointer in the write call. This is a
deadly sin in Rust and therefore impossible, but hey, this is
C, and dropping a const is a lesser evil than not being
able to reuse code.
---
drivers/base/regmap/regmap.c | 123 ++++++++++++++++++++++++++++++++++-
include/linux/regmap.h | 8 +++
2 files changed, 128 insertions(+), 3 deletions(-)

diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
index e371acea7e0e..41ff9f18b6a3 100644
--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -2129,6 +2129,99 @@ int regmap_raw_write(struct regmap *map, unsigned int reg,
}
EXPORT_SYMBOL_GPL(regmap_raw_write);

+static int regmap_noinc_readwrite(struct regmap *map, unsigned int reg,
+ void *val, unsigned int val_len, bool write)
+{
+ size_t val_bytes = map->format.val_bytes;
+ size_t val_count = val_len / val_bytes;
+ unsigned int lastval;
+ u8 *u8p;
+ u16 *u16p;
+ u32 *u32p;
+#ifdef CONFIG_64BIT
+ u64 *u64p;
+#endif
+ int ret;
+ int i;
+
+ switch (val_bytes) {
+ case 1:
+ u8p = val;
+ if (write)
+ lastval = (unsigned int)u8p[val_count - 1];
+ break;
+ case 2:
+ u16p = val;
+ if (write)
+ lastval = (unsigned int)u16p[val_count - 1];
+ break;
+ case 4:
+ u32p = val;
+ if (write)
+ lastval = (unsigned int)u32p[val_count - 1];
+ break;
+#ifdef CONFIG_64BIT
+ case 8:
+ u64p = val;
+ if (write)
+ lastval = (unsigned int)u64p[val_count - 1];
+ break;
+#endif
+ default:
+ return -EINVAL;
+ }
+
+ /*
+ * Update the cache with the last value we write, the rest is just
+ * gone down in the hardware FIFO. We can't cache FIFOs. This makes
+ * sure a single read from the cache will work.
+ */
+ if (write) {
+ if (!map->cache_bypass && !map->defer_caching) {
+ ret = regcache_write(map, reg, lastval);
+ if (ret != 0)
+ return ret;
+ if (map->cache_only) {
+ map->cache_dirty = true;
+ return 0;
+ }
+ }
+ ret = map->bus->reg_noinc_write(map->bus_context, reg, val, val_count);
+ } else {
+ ret = map->bus->reg_noinc_read(map->bus_context, reg, val, val_count);
+ }
+
+ if (!ret && regmap_should_log(map)) {
+ dev_info(map->dev, "%x %s [", reg, write ? "<=" : "=>");
+ for (i = 0; i < val_len; i++) {
+ switch (val_bytes) {
+ case 1:
+ pr_cont("%x", u8p[i]);
+ break;
+ case 2:
+ pr_cont("%x", u16p[i]);
+ break;
+ case 4:
+ pr_cont("%x", u32p[i]);
+ break;
+#ifdef CONFIG_64BIT
+ case 8:
+ pr_cont("%llx", u64p[i]);
+ break;
+#endif
+ default:
+ break;
+ }
+ if (i == (val_len - 1))
+ pr_cont("]\n");
+ else
+ pr_cont(",");
+ }
+ }
+
+ return 0;
+}
+
/**
* regmap_noinc_write(): Write data from a register without incrementing the
* register number
@@ -2156,9 +2249,8 @@ int regmap_noinc_write(struct regmap *map, unsigned int reg,
size_t write_len;
int ret;

- if (!map->write)
- return -ENOTSUPP;
-
+ if (!map->write && !(map->bus && map->bus->reg_noinc_write))
+ return -EINVAL;
if (val_len % map->format.val_bytes)
return -EINVAL;
if (!IS_ALIGNED(reg, map->reg_stride))
@@ -2173,6 +2265,15 @@ int regmap_noinc_write(struct regmap *map, unsigned int reg,
goto out_unlock;
}

+ /*
+ * Use the accelerated operation if we can. The val drops the const
+ * typing in order to facilitate code reuse in regmap_noinc_readwrite().
+ */
+ if (map->bus->reg_noinc_write) {
+ ret = regmap_noinc_readwrite(map, reg, (void *)val, val_len, true);
+ goto out_unlock;
+ }
+
while (val_len) {
if (map->max_raw_write && map->max_raw_write < val_len)
write_len = map->max_raw_write;
@@ -2943,6 +3044,22 @@ int regmap_noinc_read(struct regmap *map, unsigned int reg,
goto out_unlock;
}

+ /* Use the accelerated operation if we can */
+ if (map->bus->reg_noinc_read) {
+ /*
+ * We have not defined the FIFO semantics for cache, as the
+ * cache is just one value deep. Should we return the last
+ * written value? Just avoid this by always reading the FIFO
+ * even when using cache. Cache only will not work.
+ */
+ if (map->cache_only) {
+ ret = -EBUSY;
+ goto out_unlock;
+ }
+ ret = regmap_noinc_readwrite(map, reg, val, val_len, false);
+ goto out_unlock;
+ }
+
while (val_len) {
if (map->max_raw_read && map->max_raw_read < val_len)
read_len = map->max_raw_read;
diff --git a/include/linux/regmap.h b/include/linux/regmap.h
index 8cccc247cd37..ca3434dca3a0 100644
--- a/include/linux/regmap.h
+++ b/include/linux/regmap.h
@@ -492,8 +492,12 @@ typedef int (*regmap_hw_read)(void *context,
void *val_buf, size_t val_size);
typedef int (*regmap_hw_reg_read)(void *context, unsigned int reg,
unsigned int *val);
+typedef int (*regmap_hw_reg_noinc_read)(void *context, unsigned int reg,
+ void *val, size_t val_count);
typedef int (*regmap_hw_reg_write)(void *context, unsigned int reg,
unsigned int val);
+typedef int (*regmap_hw_reg_noinc_write)(void *context, unsigned int reg,
+ const void *val, size_t val_count);
typedef int (*regmap_hw_reg_update_bits)(void *context, unsigned int reg,
unsigned int mask, unsigned int val);
typedef struct regmap_async *(*regmap_hw_async_alloc)(void);
@@ -514,6 +518,8 @@ typedef void (*regmap_hw_free_context)(void *context);
* must serialise with respect to non-async I/O.
* @reg_write: Write a single register value to the given register address. This
* write operation has to complete when returning from the function.
+ * @reg_write_noinc: Write multiple register value to the same register. This
+ * write operation has to complete when returning from the function.
* @reg_update_bits: Update bits operation to be used against volatile
* registers, intended for devices supporting some mechanism
* for setting clearing bits without having to
@@ -541,9 +547,11 @@ struct regmap_bus {
regmap_hw_gather_write gather_write;
regmap_hw_async_write async_write;
regmap_hw_reg_write reg_write;
+ regmap_hw_reg_noinc_write reg_noinc_write;
regmap_hw_reg_update_bits reg_update_bits;
regmap_hw_read read;
regmap_hw_reg_read reg_read;
+ regmap_hw_reg_noinc_read reg_noinc_read;
regmap_hw_free_context free_context;
regmap_hw_async_alloc async_alloc;
u8 read_flag_mask;
--
2.37.2


2022-08-16 21:58:40

by Linus Walleij

[permalink] [raw]
Subject: [PATCH 2/2 v5] regmap: mmio: Support accelerared noinc operations

Use the newly added callback for accelerated noinc MMIO
to provide writesb, writesw, writesl, writesq, readsb, readsw,
readsl and readsq.

A special quirk is needed to deal with big endian regmaps: there
are no accelerated operations defined for big endian, so fall
back to calling the big endian operations itereatively for this
case.

The Hexagon architecture turns out to have an incomplete
<asm/io.h>: writesb() is not implemented. Fix this by doing
what other architectures do: include <asm-generic/io.h> into
the <asm/io.h> file.

Cc: Brian Cain <[email protected]>
Cc: [email protected]
Cc: Arnd Bergmann <[email protected]>
Signed-off-by: Linus Walleij <[email protected]>
---
ChangeLog v4->v5:
- Halfway throug posting v4 and the build robot complains
that regmap-mmio.c does not complain on Hexagon. Fix it and
hope the build robots are happy now.
ChangeLog v3->v4:
- Rebase on regmap for-next
- Notice Andy's patch rewriting *be accessors to use swab16 and
swab32 and changed my patch accordingly.
- Using swab64 we can actually implement proper 64bit register
handling as well pretty easily, so fix that while at it.
ChangeLog v2->v3:
- Rebase on kernel v6.0-rc1
ChangeLog v1->v2:
- No changes.
---
arch/hexagon/include/asm/io.h | 2 +
drivers/base/regmap/regmap-mmio.c | 162 ++++++++++++++++++++++++++++++
2 files changed, 164 insertions(+)

diff --git a/arch/hexagon/include/asm/io.h b/arch/hexagon/include/asm/io.h
index c33241425a5c..8e938dc1ca4b 100644
--- a/arch/hexagon/include/asm/io.h
+++ b/arch/hexagon/include/asm/io.h
@@ -308,6 +308,8 @@ static inline void outsl(unsigned long port, const void *buffer, int count)
}
}

+#include <asm-generic/io.h>
+
#endif /* __KERNEL__ */

#endif
diff --git a/drivers/base/regmap/regmap-mmio.c b/drivers/base/regmap/regmap-mmio.c
index eed488aad1b0..e8d2675463ac 100644
--- a/drivers/base/regmap/regmap-mmio.c
+++ b/drivers/base/regmap/regmap-mmio.c
@@ -16,6 +16,7 @@
struct regmap_mmio_context {
void __iomem *regs;
unsigned int val_bytes;
+ bool big_endian;

bool attached_clk;
struct clk *clk;
@@ -165,6 +166,83 @@ static int regmap_mmio_write(void *context, unsigned int reg, unsigned int val)
return 0;
}

+static int regmap_mmio_noinc_write(void *context, unsigned int reg,
+ const void *val, size_t val_count)
+{
+ struct regmap_mmio_context *ctx = context;
+ int ret = 0;
+ int i;
+
+ if (!IS_ERR(ctx->clk)) {
+ ret = clk_enable(ctx->clk);
+ if (ret < 0)
+ return ret;
+ }
+
+ /*
+ * There are no native, assembly-optimized write single register
+ * operations for big endian, so fall back to emulation if this
+ * is needed. (Single bytes are fine, they are not affected by
+ * endianness.)
+ */
+ if (ctx->big_endian && (ctx->val_bytes > 1)) {
+ switch (ctx->val_bytes) {
+ case 2:
+ {
+ const u16 *valp = (const u16 *)val;
+ for (i = 0; i < val_count; i++)
+ writew(swab16(valp[i]), ctx->regs + reg);
+ goto out_clk;
+ }
+ case 4:
+ {
+ const u32 *valp = (const u32 *)val;
+ for (i = 0; i < val_count; i++)
+ writel(swab32(valp[i]), ctx->regs + reg);
+ goto out_clk;
+ }
+#ifdef CONFIG_64BIT
+ case 8:
+ {
+ const u64 *valp = (const u64 *)val;
+ for (i = 0; i < val_count; i++)
+ writeq(swab64(valp[i]), ctx->regs + reg);
+ goto out_clk;
+ }
+#endif
+ default:
+ ret = -EINVAL;
+ goto out_clk;
+ }
+ }
+
+ switch (ctx->val_bytes) {
+ case 1:
+ writesb(ctx->regs + reg, (const u8 *)val, val_count);
+ break;
+ case 2:
+ writesw(ctx->regs + reg, (const u16 *)val, val_count);
+ break;
+ case 4:
+ writesl(ctx->regs + reg, (const u32 *)val, val_count);
+ break;
+#ifdef CONFIG_64BIT
+ case 8:
+ writesq(ctx->regs + reg, (const u64 *)val, val_count);
+ break;
+#endif
+ default:
+ ret = -EINVAL;
+ break;
+ }
+
+out_clk:
+ if (!IS_ERR(ctx->clk))
+ clk_disable(ctx->clk);
+
+ return ret;
+}
+
static unsigned int regmap_mmio_read8(struct regmap_mmio_context *ctx,
unsigned int reg)
{
@@ -262,6 +340,87 @@ static int regmap_mmio_read(void *context, unsigned int reg, unsigned int *val)
return 0;
}

+static int regmap_mmio_noinc_read(void *context, unsigned int reg,
+ void *val, size_t val_count)
+{
+ struct regmap_mmio_context *ctx = context;
+ int ret = 0;
+ int i;
+
+ if (!IS_ERR(ctx->clk)) {
+ ret = clk_enable(ctx->clk);
+ if (ret < 0)
+ return ret;
+ }
+
+ switch (ctx->val_bytes) {
+ case 1:
+ readsb(ctx->regs + reg, (u8 *)val, val_count);
+ break;
+ case 2:
+ readsw(ctx->regs + reg, (u16 *)val, val_count);
+ break;
+ case 4:
+ readsl(ctx->regs + reg, (u32 *)val, val_count);
+ break;
+#ifdef CONFIG_64BIT
+ case 8:
+ readsq(ctx->regs + reg, (u64 *)val, val_count);
+ break;
+#endif
+ default:
+ ret = -EINVAL;
+ goto out_clk;
+ }
+
+ /*
+ * There are no native, assembly-optimized write single register
+ * operations for big endian, so fall back to emulation if this
+ * is needed. (Single bytes are fine, they are not affected by
+ * endianness.)
+ */
+ if (ctx->big_endian && (ctx->val_bytes > 1)) {
+ switch (ctx->val_bytes) {
+ case 2:
+ {
+ u16 *valp = (u16 *)val;
+ for (i = 0; i < val_count; i++)
+ valp[i] = swab16(valp[i]);
+ break;
+ }
+ case 4:
+ {
+ u32 *valp = (u32 *)val;
+ for (i = 0; i < val_count; i++)
+ valp[i] = swab32(valp[i]);
+ break;
+ }
+#ifdef CONFIG_64BIT
+ case 8:
+ {
+ u64 *valp = (u64 *)val;
+ for (i = 0; i < val_count; i++)
+ valp[i] = swab64(valp[i]);
+ break;
+ }
+#endif
+ default:
+ ret = -EINVAL;
+ break;
+ }
+ }
+
+
+out_clk:
+ if (!IS_ERR(ctx->clk))
+ clk_disable(ctx->clk);
+
+ return ret;
+
+ return 0;
+}
+
+
static void regmap_mmio_free_context(void *context)
{
struct regmap_mmio_context *ctx = context;
@@ -278,6 +437,8 @@ static const struct regmap_bus regmap_mmio = {
.fast_io = true,
.reg_write = regmap_mmio_write,
.reg_read = regmap_mmio_read,
+ .reg_noinc_write = regmap_mmio_noinc_write,
+ .reg_noinc_read = regmap_mmio_noinc_read,
.free_context = regmap_mmio_free_context,
.val_format_endian_default = REGMAP_ENDIAN_LITTLE,
};
@@ -368,6 +529,7 @@ static struct regmap_mmio_context *regmap_mmio_gen_context(struct device *dev,
#ifdef __BIG_ENDIAN
case REGMAP_ENDIAN_NATIVE:
#endif
+ ctx->big_endian = true;
switch (config->val_bits) {
case 8:
if (config->io_port) {
--
2.37.2

2022-08-17 13:38:10

by Mark Brown

[permalink] [raw]
Subject: Re: [PATCH 1/2 v5] regmap: Support accelerated noinc operations

On Tue, 16 Aug 2022 22:48:31 +0200, Linus Walleij wrote:
> Several architectures have accelerated operations for MMIO
> operations writing to a single register, such as writesb, writesw,
> writesl, writesq, readsb, readsw, readsl and readsq but regmap
> currently cannot use them because we have no hooks for providing
> an accelerated noinc back-end for MMIO.
>
> Solve this by providing reg_[read/write]_noinc callbacks for
> the bus abstraction, so that the regmap-mmio bus can use this.
>
> [...]

Applied to

https://git.kernel.org/pub/scm/linux/kernel/git/broonie/regmap.git for-next

Thanks!

[1/2] regmap: Support accelerated noinc operations
commit: c20cc099b30abd50f563e422aa72edcd7f92da55
[2/2] regmap: mmio: Support accelerared noinc operations
commit: 81c0386c1376da54f05d6916936db5220df9f97d

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark

2022-08-30 16:47:33

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 1/2 v5] regmap: Support accelerated noinc operations

On Tue, Aug 16, 2022 at 10:48:31PM +0200, Linus Walleij wrote:
> Several architectures have accelerated operations for MMIO
> operations writing to a single register, such as writesb, writesw,
> writesl, writesq, readsb, readsw, readsl and readsq but regmap
> currently cannot use them because we have no hooks for providing
> an accelerated noinc back-end for MMIO.
>
> Solve this by providing reg_[read/write]_noinc callbacks for
> the bus abstraction, so that the regmap-mmio bus can use this.
>
> Currently I do not see a need to support this for custom regmaps
> so it is only added to the bus.
>
> Callbacks are passed a void * with the array of values and a
> count which is the number of items of the byte chunk size for
> the specific register width.

I see these applied, but consider below for the possible followups.

...

> + ret = regcache_write(map, reg, lastval);
> + if (ret != 0)

if (ret) ?

> + return ret;

...

> + dev_info(map->dev, "%x %s [", reg, write ? "<=" : "=>");
> + for (i = 0; i < val_len; i++) {
> + switch (val_bytes) {
> + case 1:
> + pr_cont("%x", u8p[i]);
> + break;
> + case 2:
> + pr_cont("%x", u16p[i]);
> + break;
> + case 4:
> + pr_cont("%x", u32p[i]);
> + break;
> +#ifdef CONFIG_64BIT
> + case 8:
> + pr_cont("%llx", u64p[i]);
> + break;
> +#endif
> + default:
> + break;
> + }
> + if (i == (val_len - 1))
> + pr_cont("]\n");
> + else
> + pr_cont(",");
> + }

I'm wondering why we can't use hex_dump_to_buffer() approach? Or even better,
introduce eventually dev_hex_dump() (as it's done for seq_file and printk)
and use it.

--
With Best Regards,
Andy Shevchenko


2022-08-30 16:47:51

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 2/2 v5] regmap: mmio: Support accelerared noinc operations

On Tue, Aug 16, 2022 at 10:48:32PM +0200, Linus Walleij wrote:
> Use the newly added callback for accelerated noinc MMIO
> to provide writesb, writesw, writesl, writesq, readsb, readsw,
> readsl and readsq.
>
> A special quirk is needed to deal with big endian regmaps: there
> are no accelerated operations defined for big endian, so fall
> back to calling the big endian operations itereatively for this
> case.
>
> The Hexagon architecture turns out to have an incomplete
> <asm/io.h>: writesb() is not implemented. Fix this by doing
> what other architectures do: include <asm-generic/io.h> into
> the <asm/io.h> file.

Wonderful!

So, I have seen a recent blow up by kernel bot due to Alpha issues on these
accessors.

Also see below for the further followups.

...

> + if (!IS_ERR(ctx->clk)) {
> + ret = clk_enable(ctx->clk);
> + if (ret < 0)
> + return ret;
> + }

It's a new place of the duplicating check, can we have a helper for that?

...

> + /*
> + * There are no native, assembly-optimized write single register
> + * operations for big endian, so fall back to emulation if this
> + * is needed. (Single bytes are fine, they are not affected by
> + * endianness.)
> + */

Wouldn't be faster to memdup() / swap / use corresponding IO accessor?

...

> + /*
> + * There are no native, assembly-optimized write single register
> + * operations for big endian, so fall back to emulation if this
> + * is needed. (Single bytes are fine, they are not affected by
> + * endianness.)
> + */
> + if (ctx->big_endian && (ctx->val_bytes > 1)) {
> + switch (ctx->val_bytes) {
> + case 2:
> + {
> + u16 *valp = (u16 *)val;
> + for (i = 0; i < val_count; i++)
> + valp[i] = swab16(valp[i]);
> + break;
> + }
> + case 4:
> + {
> + u32 *valp = (u32 *)val;
> + for (i = 0; i < val_count; i++)
> + valp[i] = swab32(valp[i]);
> + break;
> + }
> +#ifdef CONFIG_64BIT
> + case 8:
> + {
> + u64 *valp = (u64 *)val;
> + for (i = 0; i < val_count; i++)
> + valp[i] = swab64(valp[i]);
> + break;
> + }
> +#endif
> + default:
> + ret = -EINVAL;
> + break;
> + }
> + }

So, two questions here:

1) can we use helpers from include/linux/byteorder/generic.h, such as
cpu_to_be32_array()/be32_to_cpu_array()?

2) have you considered using memcpy_toio() / memcpy_fromio() and why
it's not okay to use them?

...

> +
> +

Single blank line is enough.

> +out_clk:
> + if (!IS_ERR(ctx->clk))
> + clk_disable(ctx->clk);
> +
> + return ret;
> +
> + return 0;

Seems like misrebase? I believe this has to be fixed.

--
With Best Regards,
Andy Shevchenko


2022-08-31 14:29:53

by Linus Walleij

[permalink] [raw]
Subject: Re: [PATCH 2/2 v5] regmap: mmio: Support accelerared noinc operations

On Tue, Aug 30, 2022 at 6:39 PM Andy Shevchenko
<[email protected]> wrote:
> On Tue, Aug 16, 2022 at 10:48:32PM +0200, Linus Walleij wrote:
> > Use the newly added callback for accelerated noinc MMIO
> > to provide writesb, writesw, writesl, writesq, readsb, readsw,
> > readsl and readsq.
> >
> > A special quirk is needed to deal with big endian regmaps: there
> > are no accelerated operations defined for big endian, so fall
> > back to calling the big endian operations itereatively for this
> > case.
> >
> > The Hexagon architecture turns out to have an incomplete
> > <asm/io.h>: writesb() is not implemented. Fix this by doing
> > what other architectures do: include <asm-generic/io.h> into
> > the <asm/io.h> file.
>
> Wonderful!
>
> So, I have seen a recent blow up by kernel bot due to Alpha issues on these
> accessors.

There is a patch for that:
https://lore.kernel.org/linux-arch/[email protected]/

Alpha maintainance is not very active.

The problem is that some (fringe) architectures do not fulfil the contract
to provide full accessors. I can fix them all, I am fixing powerpc right now,
but the breakage is just random compile tests, they don't really use
regmap-mmio, we're just enabling regmap-mmio to compile on archs
that don't ever use it, so it's not urgent.

> > + if (!IS_ERR(ctx->clk)) {
> > + ret = clk_enable(ctx->clk);
> > + if (ret < 0)
> > + return ret;
> > + }
>
> It's a new place of the duplicating check, can we have a helper for that?
>
> ...
>
> > + /*
> > + * There are no native, assembly-optimized write single register
> > + * operations for big endian, so fall back to emulation if this
> > + * is needed. (Single bytes are fine, they are not affected by
> > + * endianness.)
> > + */
>
> Wouldn't be faster to memdup() / swap / use corresponding IO accessor?

Hm I would like a real BE target to test on and profile that.
If someone has a target I can make a patch.

> > + /*
> > + * There are no native, assembly-optimized write single register
> > + * operations for big endian, so fall back to emulation if this
> > + * is needed. (Single bytes are fine, they are not affected by
> > + * endianness.)
> > + */
> > + if (ctx->big_endian && (ctx->val_bytes > 1)) {
> > + switch (ctx->val_bytes) {
> > + case 2:
> > + {
> > + u16 *valp = (u16 *)val;
> > + for (i = 0; i < val_count; i++)
> > + valp[i] = swab16(valp[i]);
> > + break;
> > + }
> > + case 4:
> > + {
> > + u32 *valp = (u32 *)val;
> > + for (i = 0; i < val_count; i++)
> > + valp[i] = swab32(valp[i]);
> > + break;
> > + }
> > +#ifdef CONFIG_64BIT
> > + case 8:
> > + {
> > + u64 *valp = (u64 *)val;
> > + for (i = 0; i < val_count; i++)
> > + valp[i] = swab64(valp[i]);
> > + break;
> > + }
> > +#endif
> > + default:
> > + ret = -EINVAL;
> > + break;
> > + }
> > + }
>
> So, two questions here:
>
> 1) can we use helpers from include/linux/byteorder/generic.h, such as
> cpu_to_be32_array()/be32_to_cpu_array()?
>
> 2) have you considered using memcpy_toio() / memcpy_fromio() and why
> it's not okay to use them?

I got scared of all of these accessors because of
commit 7e7ba58c94127efa97c249e38cc2d1c0ed78b58f
"regmap: mmio: Fix MMIO accessors to avoid talking to IO port"
so I don't know if I dare to trust them. Therefore I opted for the
simplest thing that I could write that fulfils the requirement.

Again, if someone has a BE target to test on, I can write a patch!

> > +out_clk:
> > + if (!IS_ERR(ctx->clk))
> > + clk_disable(ctx->clk);
> > +
> > + return ret;
> > +
> > + return 0;
>
> Seems like misrebase? I believe this has to be fixed.

Ooops I fix. Also the double newline.

Yours,
Linus Walleij

2022-08-31 14:36:48

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH 1/2 v5] regmap: Support accelerated noinc operations

On Wed, Aug 31, 2022 at 03:53:15PM +0200, Linus Walleij wrote:
> On Tue, Aug 30, 2022 at 6:36 PM Andy Shevchenko
> <[email protected]> wrote:
> > On Tue, Aug 16, 2022 at 10:48:31PM +0200, Linus Walleij wrote:

...

> > > + dev_info(map->dev, "%x %s [", reg, write ? "<=" : "=>");
> > > + for (i = 0; i < val_len; i++) {
> > > + switch (val_bytes) {
> > > + case 1:
> > > + pr_cont("%x", u8p[i]);
> > > + break;
> > > + case 2:
> > > + pr_cont("%x", u16p[i]);
> > > + break;
> > > + case 4:
> > > + pr_cont("%x", u32p[i]);
> > > + break;
> > > +#ifdef CONFIG_64BIT
> > > + case 8:
> > > + pr_cont("%llx", u64p[i]);
> > > + break;
> > > +#endif
> > > + default:
> > > + break;
> > > + }
> > > + if (i == (val_len - 1))
> > > + pr_cont("]\n");
> > > + else
> > > + pr_cont(",");
> > > + }
> >
> > I'm wondering why we can't use hex_dump_to_buffer() approach? Or even better,
> > introduce eventually dev_hex_dump() (as it's done for seq_file and printk)
> > and use it.
>
> Hmmmmm this is under regmap_should_log() which turns on
> dev_info() simple prints like x <= y and y => x to the console.
> e.g. dev_info(map->dev, "%x <= %x\n", reg, val);
> It would be rather print_hex_dump() the problem being
> that hex_dump_* accessors assumes line oriented
> linebufs "must be 16 or 32" (values per line), and here we probably
> don't want that:
> we want to show what we shoehorned into the FIFO.

Perhaps we may add a modifier to %ph (like 2,4,8) to group together several
bytes and use it here instead?

--
With Best Regards,
Andy Shevchenko


2022-08-31 14:50:15

by Linus Walleij

[permalink] [raw]
Subject: Re: [PATCH 1/2 v5] regmap: Support accelerated noinc operations

On Tue, Aug 30, 2022 at 6:36 PM Andy Shevchenko
<[email protected]> wrote:
> On Tue, Aug 16, 2022 at 10:48:31PM +0200, Linus Walleij wrote:

> > + ret = regcache_write(map, reg, lastval);
> > + if (ret != 0)
>
> if (ret) ?

This was done to follow the style of the rest of the .c file.

Could be patched everywhere though.

> > + dev_info(map->dev, "%x %s [", reg, write ? "<=" : "=>");
> > + for (i = 0; i < val_len; i++) {
> > + switch (val_bytes) {
> > + case 1:
> > + pr_cont("%x", u8p[i]);
> > + break;
> > + case 2:
> > + pr_cont("%x", u16p[i]);
> > + break;
> > + case 4:
> > + pr_cont("%x", u32p[i]);
> > + break;
> > +#ifdef CONFIG_64BIT
> > + case 8:
> > + pr_cont("%llx", u64p[i]);
> > + break;
> > +#endif
> > + default:
> > + break;
> > + }
> > + if (i == (val_len - 1))
> > + pr_cont("]\n");
> > + else
> > + pr_cont(",");
> > + }
>
> I'm wondering why we can't use hex_dump_to_buffer() approach? Or even better,
> introduce eventually dev_hex_dump() (as it's done for seq_file and printk)
> and use it.

Hmmmmm this is under regmap_should_log() which turns on
dev_info() simple prints like x <= y and y => x to the console.
e.g. dev_info(map->dev, "%x <= %x\n", reg, val);
It would be rather print_hex_dump() the problem being
that hex_dump_* accessors assumes line oriented
linebufs "must be 16 or 32" (values per line), and here we probably
don't want that:
we want to show what we shoehorned into the FIFO.

Yours,
Linus Walleij