2022-06-17 14:49:42

by Zhang Boyang

Subject: [PATCH v2 4/5] rslib: Improve the performance of encode_rs.c

This patch improves the performance of the RS encoder in two ways:

1) Avoid memmove(). The shift performed by memmove() can be merged
into the calculation loop that precedes it.

2) Introduce rs_modnn_fast(). The original rs_modnn() contains a loop,
which may be slow. Since (fb + genpoly[...]) is always strictly less
than (2 * rs->nn), a single conditional subtraction, written with the
ternary operator, gives the same result. rs_modnn_fast(x) requires
0 <= x < 2*nn, whereas the original rs_modnn(x) only requires x >= 0.
To make this clear, the documentation of the original rs_modnn() is
also updated.

Signed-off-by: Zhang Boyang <[email protected]>
---
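The range argument in point 2 can be checked with a small userspace
sketch. It is an illustration, not part of the patch. The names
rs_codec_sketch, modnn_loop, modnn_fast and count_mismatches are
hypothetical, and the loop body in modnn_loop is an assumption about
what rs_modnn() does, since the diff context only shows its last lines.

```c
#include <assert.h>

/* Hypothetical, reduced stand-in for struct rs_codec. */
struct rs_codec_sketch {
	int mm;	/* bits per symbol */
	int nn;	/* symbols per block, (1 << mm) - 1 */
};

/* Loop-based reduction, assumed to mirror rs_modnn(); needs x >= 0. */
static inline int modnn_loop(const struct rs_codec_sketch *rs, int x)
{
	while (x >= rs->nn) {
		x -= rs->nn;
		x = (x >> rs->mm) + (x & rs->nn);
	}
	return x;
}

/* One conditional subtraction, as in rs_modnn_fast(); needs 0 <= x < 2*nn. */
static inline int modnn_fast(const struct rs_codec_sketch *rs, int x)
{
	return x - rs->nn < 0 ? x : x - rs->nn;
}

/* Count inputs in 0 <= x < 2*nn where the two reductions disagree. */
int count_mismatches(int mm)
{
	struct rs_codec_sketch rs;
	int x, bad = 0;

	assert(mm >= 2 && mm <= 16);
	rs.mm = mm;
	rs.nn = (1 << mm) - 1;

	for (x = 0; x < 2 * rs.nn; x++)
		if (modnn_loop(&rs, x) != modnn_fast(&rs, x))
			bad++;
	return bad;
}
```

Calling count_mismatches() for each symbol size of interest exercises
the whole permitted input range. Outside that range (x >= 2*nn) the
fast variant is not expected to agree with the loop.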
include/linux/rslib.h | 14 +++++++++++++-
lib/reed_solomon/encode_rs.c | 21 ++++++++++-----------
2 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/include/linux/rslib.h b/include/linux/rslib.h
index cd0b5a7a5698..44ec7c6f24b2 100644
--- a/include/linux/rslib.h
+++ b/include/linux/rslib.h
@@ -110,7 +110,7 @@ void free_rs(struct rs_control *rs);
/** modulo replacement for galois field arithmetics
*
* @rs: Pointer to the RS codec
- * @x: the value to reduce
+ * @x: x >= 0 ; the value to reduce
*
* where
* rs->mm = number of bits per symbol
@@ -127,4 +127,16 @@ static inline int rs_modnn(struct rs_codec *rs, int x)
return x;
}

+/** modulo replacement for galois field arithmetics
+ *
+ * @rs: Pointer to the RS codec
+ * @x: 0 <= x < 2*nn ; the value to reduce
+ *
+ * Same as rs_modnn(x), but faster, at the cost of limited value range of @x
+*/
+static inline int rs_modnn_fast(struct rs_codec *rs, int x)
+{
+ return x - rs->nn < 0 ? x : x - rs->nn;
+}
+
#endif
diff --git a/lib/reed_solomon/encode_rs.c b/lib/reed_solomon/encode_rs.c
index 9112d46e869e..6e3847b17ad4 100644
--- a/lib/reed_solomon/encode_rs.c
+++ b/lib/reed_solomon/encode_rs.c
@@ -27,19 +27,18 @@

 	for (i = 0; i < len; i++) {
 		fb = index_of[((((uint16_t) data[i])^invmsk) & msk) ^ par[0]];
-		/* feedback term is non-zero */
 		if (fb != nn) {
-			for (j = 1; j < nroots; j++) {
-				par[j] ^= alpha_to[rs_modnn(rs, fb +
-							 genpoly[nroots - j])];
-			}
-		}
-		/* Shift */
-		memmove(&par[0], &par[1], sizeof(uint16_t) * (nroots - 1));
-		if (fb != nn) {
-			par[nroots - 1] = alpha_to[rs_modnn(rs,
-				fb + genpoly[0])];
+			/* feedback term is non-zero */
+			for (j = 1; j < nroots; j++)
+				par[j - 1] = par[j] ^ alpha_to[rs_modnn_fast(rs,
+						  fb +
+						  genpoly[nroots - j])];
+			par[nroots - 1] = alpha_to[rs_modnn_fast(rs,
+					  fb +
+					  genpoly[0])];
 		} else {
+			for (j = 1; j < nroots; j++)
+				par[j - 1] = par[j];
 			par[nroots - 1] = 0;
 		}
 	}
--
2.30.2
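
The merged shift from point 1 can be compared against the memmove()
version in userspace. This is only a sketch under stated assumptions:
step_memmove, step_merged, steps_agree and SKETCH_MAX_ROOTS are
hypothetical names, and term[j] stands in for the already computed
alpha_to[...] value that the encoder XORs into par[j] (term[0] being
the value stored in the last parity slot).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_ROOTS 16	/* arbitrary bound for the local copies */

/* Old shape: update par[1..nroots-1] in place, then shift with memmove(). */
void step_memmove(uint16_t *par, const uint16_t *term, int nroots, int nonzero)
{
	int j;

	if (nonzero)
		for (j = 1; j < nroots; j++)
			par[j] ^= term[j];
	memmove(&par[0], &par[1], sizeof(uint16_t) * (nroots - 1));
	par[nroots - 1] = nonzero ? term[0] : 0;
}

/* New shape: write each updated value directly into its shifted slot. */
void step_merged(uint16_t *par, const uint16_t *term, int nroots, int nonzero)
{
	int j;

	if (nonzero) {
		for (j = 1; j < nroots; j++)
			par[j - 1] = par[j] ^ term[j];
		par[nroots - 1] = term[0];
	} else {
		for (j = 1; j < nroots; j++)
			par[j - 1] = par[j];
		par[nroots - 1] = 0;
	}
}

/* Returns 1 if both shapes produce the same parity from the same input. */
int steps_agree(const uint16_t *par, const uint16_t *term, int nroots, int nonzero)
{
	uint16_t a[SKETCH_MAX_ROOTS], b[SKETCH_MAX_ROOTS];

	assert(nroots >= 1 && nroots <= SKETCH_MAX_ROOTS);
	memcpy(a, par, sizeof(uint16_t) * nroots);
	memcpy(b, par, sizeof(uint16_t) * nroots);
	step_memmove(a, term, nroots, nonzero);
	step_merged(b, term, nroots, nonzero);
	return memcmp(a, b, sizeof(uint16_t) * nroots) == 0;
}
```

The merged loop is safe because iteration j reads par[j] and writes
par[j - 1], so no slot is overwritten before it has been read.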


2022-06-19 01:49:08

by Randy Dunlap

Subject: Re: [PATCH v2 4/5] rslib: Improve the performance of encode_rs.c

Hi--

On 6/17/22 07:46, Zhang Boyang wrote:
> Signed-off-by: Zhang Boyang <[email protected]>
> ---
> include/linux/rslib.h | 14 +++++++++++++-
> lib/reed_solomon/encode_rs.c | 21 ++++++++++-----------
> 2 files changed, 23 insertions(+), 12 deletions(-)
>
> diff --git a/include/linux/rslib.h b/include/linux/rslib.h
> index cd0b5a7a5698..44ec7c6f24b2 100644
> --- a/include/linux/rslib.h
> +++ b/include/linux/rslib.h
> @@ -110,7 +110,7 @@ void free_rs(struct rs_control *rs);
> /** modulo replacement for galois field arithmetics
> *
> * @rs: Pointer to the RS codec
> - * @x: the value to reduce
> + * @x: x >= 0 ; the value to reduce
> *
> * where
> * rs->mm = number of bits per symbol
> @@ -127,4 +127,16 @@ static inline int rs_modnn(struct rs_codec *rs, int x)
> return x;
> }
>
> +/** modulo replacement for galois field arithmetics

/**
* rs_modnn_fast() - modulo replacement for galois field arithmetics

> + *
> + * @rs: Pointer to the RS codec
> + * @x: 0 <= x < 2*nn ; the value to reduce
> + *
> + * Same as rs_modnn(x), but faster, at the cost of limited value range of @x
> +*/
> +static inline int rs_modnn_fast(struct rs_codec *rs, int x)
> +{
> + return x - rs->nn < 0 ? x : x - rs->nn;
> +}
> +
> #endif

--
~Randy