2021-07-15 12:38:34

by Andy Shevchenko

[permalink] [raw]
Subject: [PATCH v1 1/3] clk: fractional-divider: Export approximation algo to the CCF users

At least one user currently duplicates some functions that are provided
by fractional divider module. Let's export approximation algo and replace
the open-coded variant.

As a bonus the exported function will get better documentation in place.

Signed-off-by: Andy Shevchenko <[email protected]>
---
drivers/clk/clk-fractional-divider.c | 11 +++++++----
drivers/clk/clk-fractional-divider.h | 9 +++++++++
drivers/clk/rockchip/clk.c | 17 +++--------------
3 files changed, 19 insertions(+), 18 deletions(-)
create mode 100644 drivers/clk/clk-fractional-divider.h

diff --git a/drivers/clk/clk-fractional-divider.c b/drivers/clk/clk-fractional-divider.c
index b1e556f20911..535d299af646 100644
--- a/drivers/clk/clk-fractional-divider.c
+++ b/drivers/clk/clk-fractional-divider.c
@@ -14,6 +14,8 @@
#include <linux/slab.h>
#include <linux/rational.h>

+#include "clk-fractional-divider.h"
+
static inline u32 clk_fd_readl(struct clk_fractional_divider *fd)
{
if (fd->flags & CLK_FRAC_DIVIDER_BIG_ENDIAN)
@@ -68,9 +70,10 @@ static unsigned long clk_fd_recalc_rate(struct clk_hw *hw,
return ret;
}

-static void clk_fd_general_approximation(struct clk_hw *hw, unsigned long rate,
- unsigned long *parent_rate,
- unsigned long *m, unsigned long *n)
+void clk_fractional_divider_general_approximation(struct clk_hw *hw,
+ unsigned long rate,
+ unsigned long *parent_rate,
+ unsigned long *m, unsigned long *n)
{
struct clk_fractional_divider *fd = to_clk_fd(hw);
unsigned long scale;
@@ -102,7 +105,7 @@ static long clk_fd_round_rate(struct clk_hw *hw, unsigned long rate,
if (fd->approximation)
fd->approximation(hw, rate, parent_rate, &m, &n);
else
- clk_fd_general_approximation(hw, rate, parent_rate, &m, &n);
+ clk_fractional_divider_general_approximation(hw, rate, parent_rate, &m, &n);

ret = (u64)*parent_rate * m;
do_div(ret, n);
diff --git a/drivers/clk/clk-fractional-divider.h b/drivers/clk/clk-fractional-divider.h
new file mode 100644
index 000000000000..4fa359a12ef4
--- /dev/null
+++ b/drivers/clk/clk-fractional-divider.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+struct clk_hw;
+
+void clk_fractional_divider_general_approximation(struct clk_hw *hw,
+ unsigned long rate,
+ unsigned long *parent_rate,
+ unsigned long *m,
+ unsigned long *n);
diff --git a/drivers/clk/rockchip/clk.c b/drivers/clk/rockchip/clk.c
index 049e5e0b64f6..ce7b71594827 100644
--- a/drivers/clk/rockchip/clk.c
+++ b/drivers/clk/rockchip/clk.c
@@ -22,7 +22,9 @@
#include <linux/regmap.h>
#include <linux/reboot.h>
#include <linux/rational.h>
+
#include "clk.h"
+#include "clk-fractional-divider.h"

/*
* Register a clock branch.
@@ -178,10 +180,8 @@ static void rockchip_fractional_approximation(struct clk_hw *hw,
unsigned long rate, unsigned long *parent_rate,
unsigned long *m, unsigned long *n)
{
- struct clk_fractional_divider *fd = to_clk_fd(hw);
unsigned long p_rate, p_parent_rate;
struct clk_hw *p_parent;
- unsigned long scale;

p_rate = clk_hw_get_rate(clk_hw_get_parent(hw));
if ((rate * 20 > p_rate) && (p_rate % rate != 0)) {
@@ -190,18 +190,7 @@ static void rockchip_fractional_approximation(struct clk_hw *hw,
*parent_rate = p_parent_rate;
}

- /*
- * Get rate closer to *parent_rate to guarantee there is no overflow
- * for m and n. In the result it will be the nearest rate left shifted
- * by (scale - fd->nwidth) bits.
- */
- scale = fls_long(*parent_rate / rate - 1);
- if (scale > fd->nwidth)
- rate <<= scale - fd->nwidth;
-
- rational_best_approximation(rate, *parent_rate,
- GENMASK(fd->mwidth - 1, 0), GENMASK(fd->nwidth - 1, 0),
- m, n);
+ clk_fractional_divider_general_approximation(hw, rate, parent_rate, m, n);
}

static struct clk *rockchip_clk_register_frac_branch(
--
2.30.2


2021-07-15 14:08:13

by Andy Shevchenko

[permalink] [raw]
Subject: [PATCH v1 2/3] clk: fractional-divider: Introduce NO_PRESCALER flag

The newly introduced flag, when set, makes the flow to skip
the assumption that the caller will use an additional 2^scale
prescaler to get the desired clock rate.

Reported-by: Liu Ying <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
---
drivers/clk/clk-fractional-divider.c | 2 +-
include/linux/clk-provider.h | 5 +++++
2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/clk/clk-fractional-divider.c b/drivers/clk/clk-fractional-divider.c
index 535d299af646..b2f9aae9f172 100644
--- a/drivers/clk/clk-fractional-divider.c
+++ b/drivers/clk/clk-fractional-divider.c
@@ -84,7 +84,7 @@ void clk_fractional_divider_general_approximation(struct clk_hw *hw,
* by (scale - fd->nwidth) bits.
*/
scale = fls_long(*parent_rate / rate - 1);
- if (scale > fd->nwidth)
+ if (scale > fd->nwidth && !(fd->flags & CLK_FRAC_DIVIDER_NO_PRESCALER))
rate <<= scale - fd->nwidth;

rational_best_approximation(rate, *parent_rate,
diff --git a/include/linux/clk-provider.h b/include/linux/clk-provider.h
index d83b829305c0..f74d0afe275f 100644
--- a/include/linux/clk-provider.h
+++ b/include/linux/clk-provider.h
@@ -1001,6 +1001,10 @@ struct clk_hw *devm_clk_hw_register_fixed_factor(struct device *dev,
* CLK_FRAC_DIVIDER_BIG_ENDIAN - By default little endian register accesses are
* used for the divider register. Setting this flag makes the register
* accesses big endian.
+ * CLK_FRAC_DIVIDER_NO_PRESCALER - By default the resulting rate may be shifted
+ * left by a few bits in case when the asked one is quite small to satisfy
+ * the desired range of denominator. If the caller wants to get the best
+ * rate without using an additional prescaler, this flag may be set.
*/
struct clk_fractional_divider {
struct clk_hw hw;
@@ -1022,6 +1026,7 @@ struct clk_fractional_divider {

#define CLK_FRAC_DIVIDER_ZERO_BASED BIT(0)
#define CLK_FRAC_DIVIDER_BIG_ENDIAN BIT(1)
+#define CLK_FRAC_DIVIDER_NO_PRESCALER BIT(2)

extern const struct clk_ops clk_fractional_divider_ops;
struct clk *clk_register_fractional_divider(struct device *dev,
--
2.30.2

2021-07-15 16:08:30

by Andy Shevchenko

[permalink] [raw]
Subject: [PATCH v1 3/3] clk: fractional-divider: Document the arithmetics used behind the code

It appears that some code lines raise the question why they are needed
and how they are participated in the calculus of the resulting values.

Document this in a form of the top comment in the module file.

Reported-by: Liu Ying <[email protected]>
Signed-off-by: Andy Shevchenko <[email protected]>
---
drivers/clk/clk-fractional-divider.c | 34 +++++++++++++++++++++++++++-
1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/drivers/clk/clk-fractional-divider.c b/drivers/clk/clk-fractional-divider.c
index b2f9aae9f172..29b32b51210d 100644
--- a/drivers/clk/clk-fractional-divider.c
+++ b/drivers/clk/clk-fractional-divider.c
@@ -3,8 +3,38 @@
* Copyright (C) 2014 Intel Corporation
*
* Adjustable fractional divider clock implementation.
- * Output rate = (m / n) * parent_rate.
* Uses rational best approximation algorithm.
+ *
+ * Output is calculated as
+ *
+ * rate = (m / n) * parent_rate (1)
+ *
+ * This is useful when on die we have a prescaler block which asks for
+ * m (numerator) and n (denominator) values to be provided to satisfy
+ * the (1) as much as possible.
+ *
+ * Since m and n have the limitation by a range, e.g.
+ *
+ * n >= 1, n < N_lim, where N_lim = 2^nlim (2)
+ *
+ * for some cases the output may be saturated. Hence, from (1) and (2),
+ * assuming the worst case when m = 1, the inequality
+ *
+ * ln2(parent_rate / rate) <= nlim (3)
+ *
+ * may be derived. Thus, in cases when
+ *
+ * (parent_rate / rate) >> N_lim (4)
+ *
+ * we scale up the rate by 2^scale, where
+ *
+ * scale = ln2(parent_rate / rate) - nlim (5)
+ *
+ * and assume that the IP, that needs m and n, has also its own
+ * prescaler, which is capable to divide by 2^scale. In this way
+ * we get the denominator to satisfy the desired range (2) and
+ * at the same time much much better result of m and n than simple
+ * saturated values.
*/

#include <linux/clk-provider.h>
@@ -82,6 +112,8 @@ void clk_fractional_divider_general_approximation(struct clk_hw *hw,
* Get rate closer to *parent_rate to guarantee there is no overflow
* for m and n. In the result it will be the nearest rate left shifted
* by (scale - fd->nwidth) bits.
+ *
+ * For the detailed explanation see the top comment in this file.
*/
scale = fls_long(*parent_rate / rate - 1);
if (scale > fd->nwidth && !(fd->flags & CLK_FRAC_DIVIDER_NO_PRESCALER))
--
2.30.2

2021-07-15 16:50:19

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] clk: fractional-divider: Export approximation algo to the CCF users

Hi Andy,

I love your patch! Yet something to improve:

[auto build test ERROR on clk/clk-next]
[also build test ERROR on rockchip/for-next linux/master linus/master v5.14-rc1 next-20210715]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Andy-Shevchenko/clk-fractional-divider-Export-approximation-algo-to-the-CCF-users/20210715-200828
base: https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git clk-next
config: arm64-randconfig-s032-20210715 (attached as .config)
compiler: aarch64-linux-gcc (GCC) 10.3.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# apt-get install sparse
# sparse version: v0.6.3-341-g8af24329-dirty
# https://github.com/0day-ci/linux/commit/bcba401c1ad5c03f7d638f99c799576a771e87ca
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Andy-Shevchenko/clk-fractional-divider-Export-approximation-algo-to-the-CCF-users/20210715-200828
git checkout bcba401c1ad5c03f7d638f99c799576a771e87ca
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-10.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=arm64 SHELL=/bin/bash drivers/clk/rockchip/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

>> drivers/clk/rockchip/clk.c:27:10: fatal error: clk-fractional-divider.h: No such file or directory
27 | #include "clk-fractional-divider.h"
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.


vim +27 drivers/clk/rockchip/clk.c

25
26 #include "clk.h"
> 27 #include "clk-fractional-divider.h"
28

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (2.14 kB)
.config.gz (39.57 kB)
Download all attachments

2021-07-15 18:24:06

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] clk: fractional-divider: Export approximation algo to the CCF users

On Thu, Jul 15, 2021 at 7:49 PM kernel test robot <[email protected]> wrote:
> I love your patch! Yet something to improve:

Definitely!

> All errors (new ones prefixed by >>):
>
> >> drivers/clk/rockchip/clk.c:27:10: fatal error: clk-fractional-divider.h: No such file or directory
> 27 | #include "clk-fractional-divider.h"
> | ^~~~~~~~~~~~~~~~~~~~~~~~~~
> compilation terminated.

Indeed. No idea how to compile-test this on x86.
Let me see what I can do to avoid other issues.


--
With Best Regards,
Andy Shevchenko

2021-07-15 18:28:12

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] clk: fractional-divider: Export approximation algo to the CCF users

On 2021-07-15 17:51, Andy Shevchenko wrote:
> On Thu, Jul 15, 2021 at 7:49 PM kernel test robot <[email protected]> wrote:
>> I love your patch! Yet something to improve:
>
> Definitely!
>
>> All errors (new ones prefixed by >>):
>>
>>>> drivers/clk/rockchip/clk.c:27:10: fatal error: clk-fractional-divider.h: No such file or directory
>> 27 | #include "clk-fractional-divider.h"
>> | ^~~~~~~~~~~~~~~~~~~~~~~~~~
>> compilation terminated.
>
> Indeed. No idea how to compile-test this on x86.
> Let me see what I can do to avoid other issues.

Ha, the individual SoC-specific parts have COMPILE_TEST support, but the
top-level COMMON_CLK_ROCKCHIP doesn't. That could probably be fixed.

Otherwise, you can always grab a toolchain from [1] (if your distro
doesn't offer one already) and cross-compile - defconfig for arm64 or
multi_v7_defconfig for arm should cover it.

Robin.

[1] https://cdn.kernel.org/pub/tools/crosstool/

2021-07-15 22:45:33

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] clk: fractional-divider: Export approximation algo to the CCF users

Hi Andy,

I love your patch! Yet something to improve:

[auto build test ERROR on clk/clk-next]
[also build test ERROR on rockchip/for-next linux/master linus/master v5.14-rc1 next-20210715]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Andy-Shevchenko/clk-fractional-divider-Export-approximation-algo-to-the-CCF-users/20210715-200828
base: https://git.kernel.org/pub/scm/linux/kernel/git/clk/linux.git clk-next
config: arm64-buildonly-randconfig-r001-20210715 (attached as .config)
compiler: clang version 13.0.0 (https://github.com/llvm/llvm-project 0e49c54a8cbd3e779e5526a5888c683c01cc3c50)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm64 cross compiling tool for clang build
# apt-get install binutils-aarch64-linux-gnu
# https://github.com/0day-ci/linux/commit/bcba401c1ad5c03f7d638f99c799576a771e87ca
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Andy-Shevchenko/clk-fractional-divider-Export-approximation-algo-to-the-CCF-users/20210715-200828
git checkout bcba401c1ad5c03f7d638f99c799576a771e87ca
# save the attached .config to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross O=build_dir ARCH=arm64 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

>> drivers/clk/rockchip/clk.c:27:10: fatal error: 'clk-fractional-divider.h' file not found
#include "clk-fractional-divider.h"
^~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.


vim +27 drivers/clk/rockchip/clk.c

25
26 #include "clk.h"
> 27 #include "clk-fractional-divider.h"
28

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (2.19 kB)
.config.gz (41.67 kB)
Download all attachments

2021-07-16 02:47:04

by Liu Ying

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] clk: fractional-divider: Introduce NO_PRESCALER flag

On Thu, 2021-07-15 at 15:07 +0300, Andy Shevchenko wrote:
> The newly introduced flag, when set, makes the flow to skip
> the assumption that the caller will use an additional 2^scale
> prescaler to get the desired clock rate.

Now, I start to be aware of the reason why the "left shifting" is
needed but still not 100% sure that details are all right. IIUC, you
are considering a potential HW prescaler here, while I thought the HW
model is just a fractional divider(M/N) and the driver is fully
agnostic to the potential HW prescaler.

>
> Reported-by: Liu Ying <[email protected]>
> Signed-off-by: Andy Shevchenko <[email protected]>
> ---
> drivers/clk/clk-fractional-divider.c | 2 +-
> include/linux/clk-provider.h | 5 +++++
> 2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/clk/clk-fractional-divider.c b/drivers/clk/clk-fractional-divider.c
> index 535d299af646..b2f9aae9f172 100644
> --- a/drivers/clk/clk-fractional-divider.c
> +++ b/drivers/clk/clk-fractional-divider.c
> @@ -84,7 +84,7 @@ void clk_fractional_divider_general_approximation(struct clk_hw *hw,
> * by (scale - fd->nwidth) bits.
> */
> scale = fls_long(*parent_rate / rate - 1);
> - if (scale > fd->nwidth)
> + if (scale > fd->nwidth && !(fd->flags & CLK_FRAC_DIVIDER_NO_PRESCALER))
> rate <<= scale - fd->nwidth;

First of all, check the CLK_FRAC_DIVIDER_NO_PRESCALER flag for the
entire above snippet of code?

Second and more important, it seems that it would be good to decouple
the prescaler knowledge from this fractional divider clk driver so as
to make it simple(Output rate = (m / n) * parent_rate). This way, the
CLK_FRAC_DIVIDER_NO_PRESCALER flag is not even needed at the first
place, which means rational_best_approximation() just _directly_
offer best_{numerator,denominator} for all cases. Further more, is it
possilbe for rational_best_approximation() to make sure there is no
risk of overflow for best_{numerator,denominator}, since
max_{numerator,denominator} are already handed over to
rational_best_approximation()? Overflowed/unreasonable
best_{numerator,denominator} don't sound like the "best" offered value.
If that's impossible, then audit best_{numerator,denominator} after
calling rational_best_approximation()?

Make sense?

Regards,
Liu Ying

>
> rational_best_approximation(rate, *parent_rate,
> diff --git a/include/linux/clk-provider.h b/include/linux/clk-provider.h
> index d83b829305c0..f74d0afe275f 100644
> --- a/include/linux/clk-provider.h
> +++ b/include/linux/clk-provider.h
> @@ -1001,6 +1001,10 @@ struct clk_hw *devm_clk_hw_register_fixed_factor(struct device *dev,
> * CLK_FRAC_DIVIDER_BIG_ENDIAN - By default little endian register accesses are
> * used for the divider register. Setting this flag makes the register
> * accesses big endian.
> + * CLK_FRAC_DIVIDER_NO_PRESCALER - By default the resulting rate may be shifted
> + * left by a few bits in case when the asked one is quite small to satisfy
> + * the desired range of denominator. If the caller wants to get the best
> + * rate without using an additional prescaler, this flag may be set.
> */
> struct clk_fractional_divider {
> struct clk_hw hw;
> @@ -1022,6 +1026,7 @@ struct clk_fractional_divider {
>
> #define CLK_FRAC_DIVIDER_ZERO_BASED BIT(0)
> #define CLK_FRAC_DIVIDER_BIG_ENDIAN BIT(1)
> +#define CLK_FRAC_DIVIDER_NO_PRESCALER BIT(2)
>
> extern const struct clk_ops clk_fractional_divider_ops;
> struct clk *clk_register_fractional_divider(struct device *dev,

2021-07-16 13:11:05

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v1 1/3] clk: fractional-divider: Export approximation algo to the CCF users

On Thu, Jul 15, 2021 at 06:58:40PM +0100, Robin Murphy wrote:
> On 2021-07-15 17:51, Andy Shevchenko wrote:
> > On Thu, Jul 15, 2021 at 7:49 PM kernel test robot <[email protected]> wrote:
> > > I love your patch! Yet something to improve:
> >
> > Definitely!
> >
> > > All errors (new ones prefixed by >>):
> > >
> > > > > drivers/clk/rockchip/clk.c:27:10: fatal error: clk-fractional-divider.h: No such file or directory
> > > 27 | #include "clk-fractional-divider.h"
> > > | ^~~~~~~~~~~~~~~~~~~~~~~~~~
> > > compilation terminated.
> >
> > Indeed. No idea how to compile-test this on x86.
> > Let me see what I can do to avoid other issues.
>
> Ha, the individual SoC-specific parts have COMPILE_TEST support, but the
> top-level COMMON_CLK_ROCKCHIP doesn't. That could probably be fixed.
>
> Otherwise, you can always grab a toolchain from [1] (if your distro doesn't
> offer one already) and cross-compile - defconfig for arm64 or
> multi_v7_defconfig for arm should cover it.

With a hack patch I was able to compile-test.

--
With Best Regards,
Andy Shevchenko


2021-07-16 13:22:51

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] clk: fractional-divider: Introduce NO_PRESCALER flag

On Fri, Jul 16, 2021 at 10:43:57AM +0800, Liu Ying wrote:
> On Thu, 2021-07-15 at 15:07 +0300, Andy Shevchenko wrote:
> > The newly introduced flag, when set, makes the flow to skip
> > the assumption that the caller will use an additional 2^scale
> > prescaler to get the desired clock rate.
>
> Now, I start to be aware of the reason why the "left shifting" is
> needed but still not 100% sure that details are all right. IIUC, you
> are considering a potential HW prescaler here, while I thought the HW
> model is just a fractional divider(M/N) and the driver is fully
> agnostic to the potential HW prescaler.

It's not AFAICS. Otherwise we will get saturated values which is much worse
then shifted left frequency. Anyway, this driver appeared first for the hardware
that has it for all users, so currently the assumption stays.

...

> > scale = fls_long(*parent_rate / rate - 1);
> > - if (scale > fd->nwidth)
> > + if (scale > fd->nwidth && !(fd->flags & CLK_FRAC_DIVIDER_NO_PRESCALER))
> > rate <<= scale - fd->nwidth;
>
> First of all, check the CLK_FRAC_DIVIDER_NO_PRESCALER flag for the
> entire above snippet of code?

OK.

> Second and more important, it seems that it would be good to decouple
> the prescaler knowledge from this fractional divider clk driver so as
> to make it simple(Output rate = (m / n) * parent_rate). This way, the
> CLK_FRAC_DIVIDER_NO_PRESCALER flag is not even needed at the first
> place, which means rational_best_approximation() just _directly_
> offer best_{numerator,denominator} for all cases.

Feel free to submit a patch, just give a good test to avoid breakage of almost
all users of this driver.

> Further more, is it
> possilbe for rational_best_approximation() to make sure there is no
> risk of overflow for best_{numerator,denominator}, since
> max_{numerator,denominator} are already handed over to
> rational_best_approximation()?

How? It can not be satisfied for all possible inputs.

> Overflowed/unreasonable
> best_{numerator,denominator} don't sound like the "best" offered value.

I don't follow here. If you got saturated values it means that your input is
not convergent. In practice it means that we will supply quite a bad value to
the caller.

> If that's impossible, then audit best_{numerator,denominator} after
> calling rational_best_approximation()?

And? I do not understand what you will do if you get the values of m and n
as m = 1, n = 2^nlim - 1.

> Make sense?

Not really. I probably miss your point, sorry.

So, I will submit v2 with addressed first comment and LKP noticed compiler
error.

--
With Best Regards,
Andy Shevchenko


2021-07-19 03:19:04

by Liu Ying

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] clk: fractional-divider: Introduce NO_PRESCALER flag

On Fri, 2021-07-16 at 16:19 +0300, Andy Shevchenko wrote:
> On Fri, Jul 16, 2021 at 10:43:57AM +0800, Liu Ying wrote:
> > On Thu, 2021-07-15 at 15:07 +0300, Andy Shevchenko wrote:
> > > The newly introduced flag, when set, makes the flow to skip
> > > the assumption that the caller will use an additional 2^scale
> > > prescaler to get the desired clock rate.
> >
> > Now, I start to be aware of the reason why the "left shifting" is
> > needed but still not 100% sure that details are all right. IIUC, you
> > are considering a potential HW prescaler here, while I thought the HW
> > model is just a fractional divider(M/N) and the driver is fully
> > agnostic to the potential HW prescaler.
>
> It's not AFAICS. Otherwise we will get saturated values which is much worse
> then shifted left frequency. Anyway, this driver appeared first for the hardware
> that has it for all users, so currently the assumption stays.
>
> ...
>
> > > scale = fls_long(*parent_rate / rate - 1);
> > > - if (scale > fd->nwidth)
> > > + if (scale > fd->nwidth && !(fd->flags & CLK_FRAC_DIVIDER_NO_PRESCALER))
> > > rate <<= scale - fd->nwidth;
> >
> > First of all, check the CLK_FRAC_DIVIDER_NO_PRESCALER flag for the
> > entire above snippet of code?
>
> OK.
>
> > Second and more important, it seems that it would be good to decouple
> > the prescaler knowledge from this fractional divider clk driver so as
> > to make it simple(Output rate = (m / n) * parent_rate). This way, the
> > CLK_FRAC_DIVIDER_NO_PRESCALER flag is not even needed at the first
> > place, which means rational_best_approximation() just _directly_
> > offer best_{numerator,denominator} for all cases.
>
> Feel free to submit a patch, just give a good test to avoid breakage of almost
> all users of this driver.

Maybe someone may do that. I just shared my thought that it sounds
like a good idea to decouple the prescaler knowledge from this
fractional divider clk driver.

>
> > Further more, is it
> > possilbe for rational_best_approximation() to make sure there is no
> > risk of overflow for best_{numerator,denominator}, since
> > max_{numerator,denominator} are already handed over to
> > rational_best_approximation()?
>
> How? It can not be satisfied for all possible inputs.

Just have rational_best_approximation() make sure
best_{numerator,denominator} are in the range of
[1, max_{numerator,denominator}] for all given_{numerator,denominator}.
At the same time, best_numerator/best_denominator should be as close
to given_numerator/given_denominator as possible. For this particular
fractional divider clk use case, clk_round_rate() can be called
multiple times until users find rounded rate is ok.

>
> > Overflowed/unreasonable
> > best_{numerator,denominator} don't sound like the "best" offered value.
>
> I don't follow here. If you got saturated values it means that your input is
> not convergent. In practice it means that we will supply quite a bad value to
> the caller.

Just like I mentioned above, if given_{numerator,denominator} are not
convergent, best_numerator/best_denominator should be as close
to given_numerator/given_denominator as possible and at the same time
best_{numerator,denominator} are in the range of
[1, max_{numerator,denominator}]. This way, caller may have chance to
propose convergent inputs.

Regards,
Liu Ying

>
> > If that's impossible, then audit best_{numerator,denominator} after
> > calling rational_best_approximation()?
>
> And? I do not understand what you will do if you get the values of m and n
> as m = 1, n = 2^nlim - 1.
>
> > Make sense?
>
> Not really. I probably miss your point, sorry.
>
> So, I will submit v2 with addressed first comment and LKP noticed compiler
> error.
>

2021-07-19 12:10:57

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] clk: fractional-divider: Introduce NO_PRESCALER flag

On Mon, Jul 19, 2021 at 11:16:07AM +0800, Liu Ying wrote:
> On Fri, 2021-07-16 at 16:19 +0300, Andy Shevchenko wrote:
> > On Fri, Jul 16, 2021 at 10:43:57AM +0800, Liu Ying wrote:
> > > On Thu, 2021-07-15 at 15:07 +0300, Andy Shevchenko wrote:

...

> > > Second and more important, it seems that it would be good to decouple
> > > the prescaler knowledge from this fractional divider clk driver so as
> > > to make it simple(Output rate = (m / n) * parent_rate). This way, the
> > > CLK_FRAC_DIVIDER_NO_PRESCALER flag is not even needed at the first
> > > place, which means rational_best_approximation() just _directly_
> > > offer best_{numerator,denominator} for all cases.
> >
> > Feel free to submit a patch, just give a good test to avoid breakage of almost
> > all users of this driver.
>
> Maybe someone may do that.

Perhaps. The idea per se is good I think, but I doubt that the implementation
will be plausible.

> I just shared my thought that it sounds
> like a good idea

Thanks!

> to decouple the prescaler knowledge from this
> fractional divider clk driver.

Are you suggesting that each of the device that has _private_ pre-scaler has to
be a clock provider at the same time?

OTOH you will probably need irrespresentable hierarchy to avoid saturated values.

At least those two issues I believe makes the idea fade in complications of the
actual implementation. But again, send the code (you or anybody else) and we will
see how it looks like.

...

> > > Further more, is it
> > > possilbe for rational_best_approximation() to make sure there is no
> > > risk of overflow for best_{numerator,denominator}, since
> > > max_{numerator,denominator} are already handed over to
> > > rational_best_approximation()?
> >
> > How? It can not be satisfied for all possible inputs.
>
> Just have rational_best_approximation() make sure
> best_{numerator,denominator} are in the range of
> [1, max_{numerator,denominator}] for all given_{numerator,denominator}.
> At the same time, best_numerator/best_denominator should be as close
> to given_numerator/given_denominator as possible. For this particular
> fractional divider clk use case, clk_round_rate() can be called
> multiple times until users find rounded rate is ok.

How is it supposed to work IRL? E.g. this driver is being used for UART. Serial
core (or even TTY) has a specific function to approximate the baud rate and it
tries it 2 or 3 times. In case of *saturated* values it won't progress anyhow
because from best rational approximation algorithm the very first attempt would
be done against the best possible clock rate.

Can you provide some code skeleton to see?

...

> > > Overflowed/unreasonable
> > > best_{numerator,denominator} don't sound like the "best" offered value.
> >
> > I don't follow here. If you got saturated values it means that your input is
> > not convergent. In practice it means that we will supply quite a bad value to
> > the caller.
>
> Just like I mentioned above, if given_{numerator,denominator} are not
> convergent, best_numerator/best_denominator should be as close
> to given_numerator/given_denominator as possible and at the same time
> best_{numerator,denominator} are in the range of
> [1, max_{numerator,denominator}]. This way, caller may have chance to
> propose convergent inputs.

How? Again, provide some code to understand this better.
(Spoiler: arithmetics won't allow you to do this. Or maybe
I'm badly missing something very simple and obvious...)

And, if it's possible to achieve, are you suggesting that part of
what CCF driver should do the users will have been doing by their
own?

TL;DR: please send a code to discuss.

Thanks for review and you review of v2 is warmly welcomed!

--
With Best Regards,
Andy Shevchenko


2021-07-22 06:06:39

by Liu Ying

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] clk: fractional-divider: Introduce NO_PRESCALER flag

On Mon, 2021-07-19 at 15:09 +0300, Andy Shevchenko wrote:
> On Mon, Jul 19, 2021 at 11:16:07AM +0800, Liu Ying wrote:
> > On Fri, 2021-07-16 at 16:19 +0300, Andy Shevchenko wrote:
> > > On Fri, Jul 16, 2021 at 10:43:57AM +0800, Liu Ying wrote:
> > > > On Thu, 2021-07-15 at 15:07 +0300, Andy Shevchenko wrote:
>
> ...
>
> > > > Second and more important, it seems that it would be good to decouple
> > > > the prescaler knowledge from this fractional divider clk driver so as
> > > > to make it simple(Output rate = (m / n) * parent_rate). This way, the
> > > > CLK_FRAC_DIVIDER_NO_PRESCALER flag is not even needed at the first
> > > > place, which means rational_best_approximation() just _directly_
> > > > offer best_{numerator,denominator} for all cases.
> > >
> > > Feel free to submit a patch, just give a good test to avoid breakage of almost
> > > all users of this driver.
> >
> > Maybe someone may do that.
>
> Perhaps. The idea per se is good I think, but I doubt that the implementation
> will be plausible.
>
> > I just shared my thought that it sounds
> > like a good idea
>
> Thanks!
>
> > to decouple the prescaler knowledge from this
> > fractional divider clk driver.
>
> Are you suggesting that each of the device that has _private_ pre-scaler has to
> be a clock provider at the same time?

Maybe it depends on specific devices. But, if a device is designed to
dedicatedly control clocks, being a clock provider seems to be
intuitive.

>
> OTOH you will probably need irrespresentable hierarchy to avoid saturated values.
>
> At least those two issues I believe makes the idea fade in complications of the
> actual implementation. But again, send the code (you or anybody else) and we will
> see how it looks like.

Aside from making this fractional divider clk driver simple, there
seems to be another reason for decoupling the prescaler knowledge from
the driver. That is, the 'left shifting' done in
clk_fd_general_approximation()/clk_fd_round_rate() is likely to cause
mis-match bewteen 'rate = clk_round_rate(clk, r);' and
'clk_set_rate(clk, r); rate = clk_get_rate(clk);' as kerneldoc
of clk_round_rate() mentions that they are kinda equivalent
in include/linux/clk.h. clk_fd_set_rate() doesn't really contain the
'left shifting'.

So, it looks like decoupling is the right way to go.

>
> ...
>
> > > > Further more, is it
> > > > possilbe for rational_best_approximation() to make sure there is no
> > > > risk of overflow for best_{numerator,denominator}, since
> > > > max_{numerator,denominator} are already handed over to
> > > > rational_best_approximation()?
> > >
> > > How? It can not be satisfied for all possible inputs.
> >
> > Just have rational_best_approximation() make sure
> > best_{numerator,denominator} are in the range of
> > [1, max_{numerator,denominator}] for all given_{numerator,denominator}.
> > At the same time, best_numerator/best_denominator should be as close
> > to given_numerator/given_denominator as possible. For this particular
> > fractional divider clk use case, clk_round_rate() can be called
> > multiple times until users find rounded rate is ok.
>
> How is it supposed to work IRL? E.g. this driver is being used for UART. Serial

I guess the drivers are drivers/acpi/acpi_lpss.c and drivers/mfd/intel-
lpss.c? Both for Intel.

> core (or even TTY) has a specific function to approximate the baud rate and it
> tries it 2 or 3 times. In case of *saturated* values it won't progress anyhow
> because from best rational approximation algorithm the very first attempt would
> be done against the best possible clock rate.
>
> Can you provide some code skeleton to see?

Perhaps, two approaches can be taken in driver which uses the
fractional divider clock:
1) Tune prescaler to generate higher rate or lower rate accordingly
when clk_round_rate() for the fractional divider clock returns lower or
higher rates then desired rate. This might take several rounds until
desired rate is satisfied w/wo a tolerated bias.
2) Put working clock rates and/or parent clock rates in a table as sort
of prior knowledge, which means less code for rate negotiation.

>
> ...
>
> > > > Overflowed/unreasonable
> > > > best_{numerator,denominator} don't sound like the "best" offered value.
> > >
> > > I don't follow here. If you got saturated values it means that your input is
> > > not convergent. In practice it means that we will supply quite a bad value to
> > > the caller.
> >
> > Just like I mentioned above, if given_{numerator,denominator} are not
> > convergent, best_numerator/best_denominator should be as close
> > to given_numerator/given_denominator as possible and at the same time
> > best_{numerator,denominator} are in the range of
> > [1, max_{numerator,denominator}]. This way, caller may have chance to
> > propose convergent inputs.
>
> How? Again, provide some code to understand this better.
> (Spoiler: arithmetics won't allow you to do this. Or maybe
> I'm badly missing something very simple and obvious...)
>
> And, if it's possible to achieve, are you suggesting that part of
> what CCF driver should do the users will have been doing by their
> own?

Well, I just think it doesn't seem to be necessary for the CCF/common
frational drivider clk driver to have the prescaler knowledge. The
prescaler knowledge can be in a dedicated clk provider(if appropriate)
or somewhere else.

>
> TL;DR: please send a code to discuss.

It seems that you have some experience on those intel drivers, this
clock driver and rational algorithm driver and you probably have intel
HWs to test. May I encourage you to look into this and decouple the
prescaler knowledge out :-)

>
> Thanks for review and you review of v2 is warmly welcomed!

I'd like to see patches to decouple the prescaler knowledge out.
V2, like v1, tries to consolidate the knowledge in this fractional
divider clk driver. So, not the right direction I think.

Regards,
Liu Ying

>

2021-07-22 07:25:57

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] clk: fractional-divider: Introduce NO_PRESCALER flag

On Thu, Jul 22, 2021 at 9:04 AM Liu Ying <[email protected]> wrote:
> On Mon, 2021-07-19 at 15:09 +0300, Andy Shevchenko wrote:
> > On Mon, Jul 19, 2021 at 11:16:07AM +0800, Liu Ying wrote:
> > > On Fri, 2021-07-16 at 16:19 +0300, Andy Shevchenko wrote:
> > > > On Fri, Jul 16, 2021 at 10:43:57AM +0800, Liu Ying wrote:
> > > > > On Thu, 2021-07-15 at 15:07 +0300, Andy Shevchenko wrote:

> > > > > Second and more important, it seems that it would be good to decouple
> > > > > the prescaler knowledge from this fractional divider clk driver so as
> > > > > to make it simple(Output rate = (m / n) * parent_rate). This way, the
> > > > > CLK_FRAC_DIVIDER_NO_PRESCALER flag is not even needed at the first
> > > > > place, which means rational_best_approximation() just _directly_
> > > > > offer best_{numerator,denominator} for all cases.
> > > >
> > > > Feel free to submit a patch, just give a good test to avoid breakage of almost
> > > > all users of this driver.
> > >
> > > Maybe someone may do that.
> >
> > Perhaps. The idea per se is good I think, but I doubt that the implementation
> > will be plausible.
> >
> > > I just shared my thought that it sounds
> > > like a good idea
> >
> > Thanks!
> >
> > > to decouple the prescaler knowledge from this
> > > fractional divider clk driver.
> >
> > Are you suggesting that each of the device that has _private_ pre-scaler has to
> > be a clock provider at the same time?
>
> Maybe it depends on specific devices. But, if a device is designed to
> dedicatedly control clocks, being a clock provider seems to be
> intuitive.

OK.

> > OTOH you will probably need irrespresentable hierarchy to avoid saturated values.
> >
> > At least those two issues I believe makes the idea fade in complications of the
> > actual implementation. But again, send the code (you or anybody else) and we will
> > see how it looks like.
>
> Aside from making this fractional divider clk driver simple, there
> seems to be another reason for decoupling the prescaler knowledge from
> the driver. That is, the 'left shifting' done in
> clk_fd_general_approximation()/clk_fd_round_rate() is likely to cause
> mis-match bewteen 'rate = clk_round_rate(clk, r);' and
> 'clk_set_rate(clk, r); rate = clk_get_rate(clk);' as kerneldoc
> of clk_round_rate() mentions that they are kinda equivalent
> in include/linux/clk.h. clk_fd_set_rate() doesn't really contain the
> 'left shifting'.
>
> So, it looks like decoupling is the right way to go.

OK.

...

> > > > > Further more, is it
> > > > > possilbe for rational_best_approximation() to make sure there is no
> > > > > risk of overflow for best_{numerator,denominator}, since
> > > > > max_{numerator,denominator} are already handed over to
> > > > > rational_best_approximation()?
> > > >
> > > > How? It can not be satisfied for all possible inputs.
> > >
> > > Just have rational_best_approximation() make sure
> > > best_{numerator,denominator} are in the range of
> > > [1, max_{numerator,denominator}] for all given_{numerator,denominator}.
> > > At the same time, best_numerator/best_denominator should be as close
> > > to given_numerator/given_denominator as possible. For this particular
> > > fractional divider clk use case, clk_round_rate() can be called
> > > multiple times until users find rounded rate is ok.
> >
> > How is it supposed to work IRL? E.g. this driver is being used for UART. Serial
>
> I guess the drivers are drivers/acpi/acpi_lpss.c and drivers/mfd/intel-
> lpss.c? Both for Intel.

At least those I have knowledge of. Others, if any, seem to have taken
this into account.

> > core (or even TTY) has a specific function to approximate the baud rate and it
> > tries it 2 or 3 times. In case of *saturated* values it won't progress anyhow
> > because from best rational approximation algorithm the very first attempt would
> > be done against the best possible clock rate.
> >
> > Can you provide some code skeleton to see?
>
> Perhaps, two approaches can be taken in driver which uses the
> fractional divider clock:
> 1) Tune prescaler to generate higher rate or lower rate accordingly
> when clk_round_rate() for the fractional divider clock returns lower or
> higher rates then desired rate. This might take several rounds until
> desired rate is satisfied w/wo a tolerated bias.
> 2) Put working clock rates and/or parent clock rates in a table as sort
> of prior knowledge, which means less code for rate negotiation.

Often 2) is a bad idea which I'm against from day 1. I prefer to
calculate what can be calculated.
The 1) looks better but requires several (unnecessary IIRC) rounds.
Why not supply the additional parameter(s) to tell that we have a
prescaller with certain limitations?

...

> > > > > Overflowed/unreasonable
> > > > > best_{numerator,denominator} don't sound like the "best" offered value.
> > > >
> > > > I don't follow here. If you got saturated values it means that your input is
> > > > not convergent. In practice it means that we will supply quite a bad value to
> > > > the caller.
> > >
> > > Just like I mentioned above, if given_{numerator,denominator} are not
> > > convergent, best_numerator/best_denominator should be as close
> > > to given_numerator/given_denominator as possible and at the same time
> > > best_{numerator,denominator} are in the range of
> > > [1, max_{numerator,denominator}]. This way, caller may have chance to
> > > propose convergent inputs.
> >
> > How? Again, provide some code to understand this better.
> > (Spoiler: arithmetics won't allow you to do this. Or maybe
> > I'm badly missing something very simple and obvious...)
> >
> > And, if it's possible to achieve, are you suggesting that part of
> > what CCF driver should do the users will have been doing by their
> > own?
>
> Well, I just think it doesn't seem to be necessary for the CCF/common
> frational drivider clk driver to have the prescaler knowledge. The
> prescaler knowledge can be in a dedicated clk provider(if appropriate)
> or somewhere else.

I might disagree on the grounds of the HW hierarchy and the best that
we may achieve in _one_ pass. For example, for a 16-bit additional
prescaler it will require up to 16 steps to get the best possible
values for the m/n. Instead we may supply to this driver the
information about subordinate prescaler and get the best m/n. The
caller will need to just divide the resulting rate by the asked rate
to get a prescaler value.

...

> > TL;DR: please send a code to discuss.
>
> It seems that you have some experience on those intel drivers, this
> clock driver and rational algorithm driver and you probably have intel
> HWs to test. May I encourage you to look into this and decouple the
> prescaler knowledge out :-)
>
> >
> > Thanks for review and you review of v2 is warmly welcomed!
>
> I'd like to see patches to decouple the prescaler knowledge out.

Then produce them! Currently the code works for all its users and does
not need any changes (documentation is indeed a gap).

> V2, like v1, tries to consolidate the knowledge in this fractional
> divider clk driver. So, not the right direction I think.

Then why are you commenting here and not there? :-)
I think I would drop patch 2 from the set (patch 1 is Acked and patch
3 is definitely needed to describe current state of affairs) on the
grounds of the comments.

--
With Best Regards,
Andy Shevchenko

2021-07-22 09:13:14

by Liu Ying

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] clk: fractional-divider: Introduce NO_PRESCALER flag

On Thu, 2021-07-22 at 10:24 +0300, Andy Shevchenko wrote:
> On Thu, Jul 22, 2021 at 9:04 AM Liu Ying <[email protected]> wrote:
> > On Mon, 2021-07-19 at 15:09 +0300, Andy Shevchenko wrote:
> > > On Mon, Jul 19, 2021 at 11:16:07AM +0800, Liu Ying wrote:
> > > > On Fri, 2021-07-16 at 16:19 +0300, Andy Shevchenko wrote:
> > > > > On Fri, Jul 16, 2021 at 10:43:57AM +0800, Liu Ying wrote:
> > > > > > On Thu, 2021-07-15 at 15:07 +0300, Andy Shevchenko wrote:
> > > > > > Second and more important, it seems that it would be good to decouple
> > > > > > the prescaler knowledge from this fractional divider clk driver so as
> > > > > > to make it simple(Output rate = (m / n) * parent_rate). This way, the
> > > > > > CLK_FRAC_DIVIDER_NO_PRESCALER flag is not even needed at the first
> > > > > > place, which means rational_best_approximation() just _directly_
> > > > > > offer best_{numerator,denominator} for all cases.
> > > > >
> > > > > Feel free to submit a patch, just give a good test to avoid breakage of almost
> > > > > all users of this driver.
> > > >
> > > > Maybe someone may do that.
> > >
> > > Perhaps. The idea per se is good I think, but I doubt that the implementation
> > > will be plausible.
> > >
> > > > I just shared my thought that it sounds
> > > > like a good idea
> > >
> > > Thanks!
> > >
> > > > to decouple the prescaler knowledge from this
> > > > fractional divider clk driver.
> > >
> > > Are you suggesting that each of the device that has _private_ pre-scaler has to
> > > be a clock provider at the same time?
> >
> > Maybe it depends on specific devices. But, if a device is designed to
> > dedicatedly control clocks, being a clock provider seems to be
> > intuitive.
>
> OK.
>
> > > OTOH you will probably need irrespresentable hierarchy to avoid saturated values.
> > >
> > > At least those two issues I believe makes the idea fade in complications of the
> > > actual implementation. But again, send the code (you or anybody else) and we will
> > > see how it looks like.
> >
> > Aside from making this fractional divider clk driver simple, there
> > seems to be another reason for decoupling the prescaler knowledge from
> > the driver. That is, the 'left shifting' done in
> > clk_fd_general_approximation()/clk_fd_round_rate() is likely to cause
> > mis-match bewteen 'rate = clk_round_rate(clk, r);' and
> > 'clk_set_rate(clk, r); rate = clk_get_rate(clk);' as kerneldoc
> > of clk_round_rate() mentions that they are kinda equivalent
> > in include/linux/clk.h. clk_fd_set_rate() doesn't really contain the
> > 'left shifting'.
> >
> > So, it looks like decoupling is the right way to go.
>
> OK.
>
> ...
>
> > > > > > Further more, is it
> > > > > > possilbe for rational_best_approximation() to make sure there is no
> > > > > > risk of overflow for best_{numerator,denominator}, since
> > > > > > max_{numerator,denominator} are already handed over to
> > > > > > rational_best_approximation()?
> > > > >
> > > > > How? It can not be satisfied for all possible inputs.
> > > >
> > > > Just have rational_best_approximation() make sure
> > > > best_{numerator,denominator} are in the range of
> > > > [1, max_{numerator,denominator}] for all given_{numerator,denominator}.
> > > > At the same time, best_numerator/best_denominator should be as close
> > > > to given_numerator/given_denominator as possible. For this particular
> > > > fractional divider clk use case, clk_round_rate() can be called
> > > > multiple times until users find rounded rate is ok.
> > >
> > > How is it supposed to work IRL? E.g. this driver is being used for UART. Serial
> >
> > I guess the drivers are drivers/acpi/acpi_lpss.c and drivers/mfd/intel-
> > lpss.c? Both for Intel.
>
> At least those I have knowledge of. Others, if any, seem to have taken
> this into account.
>
> > > core (or even TTY) has a specific function to approximate the baud rate and it
> > > tries it 2 or 3 times. In case of *saturated* values it won't progress anyhow
> > > because from best rational approximation algorithm the very first attempt would
> > > be done against the best possible clock rate.
> > >
> > > Can you provide some code skeleton to see?
> >
> > Perhaps, two approaches can be taken in driver which uses the
> > fractional divider clock:
> > 1) Tune prescaler to generate higher rate or lower rate accordingly
> > when clk_round_rate() for the fractional divider clock returns lower or
> > higher rates then desired rate. This might take several rounds until
> > desired rate is satisfied w/wo a tolerated bias.
> > 2) Put working clock rates and/or parent clock rates in a table as sort
> > of prior knowledge, which means less code for rate negotiation.
>
> Often 2) is a bad idea which I'm against from day 1. I prefer to
> calculate what can be calculated.
> The 1) looks better but requires several (unnecessary IIRC) rounds.
> Why not supply the additional parameter(s) to tell that we have a
> prescaller with certain limitations?

To me, it's kinda too much information to this common frational divider
clk driver. Making the common driver simple and easy to maintain is
important.

>
> ...
>
> > > > > > Overflowed/unreasonable
> > > > > > best_{numerator,denominator} don't sound like the "best" offered value.
> > > > >
> > > > > I don't follow here. If you got saturated values it means that your input is
> > > > > not convergent. In practice it means that we will supply quite a bad value to
> > > > > the caller.
> > > >
> > > > Just like I mentioned above, if given_{numerator,denominator} are not
> > > > convergent, best_numerator/best_denominator should be as close
> > > > to given_numerator/given_denominator as possible and at the same time
> > > > best_{numerator,denominator} are in the range of
> > > > [1, max_{numerator,denominator}]. This way, caller may have chance to
> > > > propose convergent inputs.
> > >
> > > How? Again, provide some code to understand this better.
> > > (Spoiler: arithmetics won't allow you to do this. Or maybe
> > > I'm badly missing something very simple and obvious...)
> > >
> > > And, if it's possible to achieve, are you suggesting that part of
> > > what CCF driver should do the users will have been doing by their
> > > own?
> >
> > Well, I just think it doesn't seem to be necessary for the CCF/common
> > frational drivider clk driver to have the prescaler knowledge. The
> > prescaler knowledge can be in a dedicated clk provider(if appropriate)
> > or somewhere else.
>
> I might disagree on the grounds of the HW hierarchy and the best that
> we may achieve in _one_ pass. For example, for a 16-bit additional
> prescaler it will require up to 16 steps to get the best possible

Would that be an unacceptable performance penalty?

> values for the m/n. Instead we may supply to this driver the
> information about subordinate prescaler and get the best m/n. The
> caller will need to just divide the resulting rate by the asked rate
> to get a prescaler value.

IMHO, a simpler fractional divider clk driver without the prescaler
knowledge wins the tradeoff.

>
> ...
>
> > > TL;DR: please send a code to discuss.
> >
> > It seems that you have some experience on those intel drivers, this
> > clock driver and rational algorithm driver and you probably have intel
> > HWs to test. May I encourage you to look into this and decouple the
> > prescaler knowledge out :-)
> >
> > > Thanks for review and you review of v2 is warmly welcomed!
> >
> > I'd like to see patches to decouple the prescaler knowledge out.
>
> Then produce them! Currently the code works for all its users and does
> not need any changes (documentation is indeed a gap).

IIUC, only the two Intel drivers mentioned before are affected.
Rockchip has it's own ->approximation() callback and i.MX7ulp hasn't
the prescaler(IIUC), thus kinda not affected. So, perhaps you may help
look into this and decouple the prescaler knowledge out, as it seems
that you have experience on the relevant drivers and HW to test.
Anyway, to me, it is _not_ a must to have if you really think it's hard
to do or unnesessary :-)

>
> > V2, like v1, tries to consolidate the knowledge in this fractional
> > divider clk driver. So, not the right direction I think.
>
> Then why are you commenting here and not there? :-)

Maybe v2 was sent too quickly as the decoupling comment on v1 hasn't
been sufficiently discussed :-) I'll comment v2 briefly.

> I think I would drop patch 2 from the set (patch 1 is Acked and patch
> 3 is definitely needed to describe current state of affairs) on the
> grounds of the comments.

Please consider i.MX7ulp, as it hasn't the prescaler IIUC. i.MX7ulp
needs NO_PRESCALER flag, if we keep the prescaler knowledge in this
driver ofc.

Regards,
Liu Ying



2021-07-22 09:36:22

by Andy Shevchenko

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] clk: fractional-divider: Introduce NO_PRESCALER flag

On Thu, Jul 22, 2021 at 12:11 PM Liu Ying <[email protected]> wrote:
> On Thu, 2021-07-22 at 10:24 +0300, Andy Shevchenko wrote:
> > On Thu, Jul 22, 2021 at 9:04 AM Liu Ying <[email protected]> wrote:
> > > On Mon, 2021-07-19 at 15:09 +0300, Andy Shevchenko wrote:
> > > > On Mon, Jul 19, 2021 at 11:16:07AM +0800, Liu Ying wrote:
> > > > > On Fri, 2021-07-16 at 16:19 +0300, Andy Shevchenko wrote:
> > > > > > On Fri, Jul 16, 2021 at 10:43:57AM +0800, Liu Ying wrote:
> > > > > > > On Thu, 2021-07-15 at 15:07 +0300, Andy Shevchenko wrote:

...

> > > > core (or even TTY) has a specific function to approximate the baud rate and it
> > > > tries it 2 or 3 times. In case of *saturated* values it won't progress anyhow
> > > > because from best rational approximation algorithm the very first attempt would
> > > > be done against the best possible clock rate.
> > > >
> > > > Can you provide some code skeleton to see?
> > >
> > > Perhaps, two approaches can be taken in driver which uses the
> > > fractional divider clock:
> > > 1) Tune prescaler to generate higher rate or lower rate accordingly
> > > when clk_round_rate() for the fractional divider clock returns lower or
> > > higher rates then desired rate. This might take several rounds until
> > > desired rate is satisfied w/wo a tolerated bias.
> > > 2) Put working clock rates and/or parent clock rates in a table as sort
> > > of prior knowledge, which means less code for rate negotiation.
> >
> > Often 2) is a bad idea which I'm against from day 1. I prefer to
> > calculate what can be calculated.
> > The 1) looks better but requires several (unnecessary IIRC) rounds.
> > Why not supply the additional parameter(s) to tell that we have a
> > prescaller with certain limitations?
>
> To me, it's kinda too much information to this common frational divider
> clk driver. Making the common driver simple and easy to maintain is
> important.

But it has to have it due to the nature of the hardware design. If you
leave it w/o that you have immediately come into the situation where
the clock rate will be far too wrong because of *saturated* values.
Have you done the arithmetics on the paper by the way?

...

> > I might disagree on the grounds of the HW hierarchy and the best that
> > we may achieve in _one_ pass. For example, for a 16-bit additional
> > prescaler it will require up to 16 steps to get the best possible
>
> Would that be an unacceptable performance penalty?

Yes.

> > values for the m/n. Instead we may supply to this driver the
> > information about subordinate prescaler and get the best m/n. The
> > caller will need to just divide the resulting rate by the asked rate
> > to get a prescaler value.
>
> IMHO, a simpler fractional divider clk driver without the prescaler
> knowledge wins the tradeoff.

I'm far from being convinced.

...

> > > > TL;DR: please send a code to discuss.

^^^^ I am tired of telling you this, btw.

> > > It seems that you have some experience on those intel drivers, this
> > > clock driver and rational algorithm driver and you probably have intel
> > > HWs to test. May I encourage you to look into this and decouple the
> > > prescaler knowledge out :-)
> > >
> > > > Thanks for review and you review of v2 is warmly welcomed!
> > >
> > > I'd like to see patches to decouple the prescaler knowledge out.
> >
> > Then produce them! Currently the code works for all its users and does
> > not need any changes (documentation is indeed a gap).
>
> IIUC, only the two Intel drivers mentioned before are affected.
> Rockchip has it's own ->approximation() callback

...which is using the same algo, look at the patch 1 of the series. It
seems you missed to actually review. Just review the series as a
whole, please!

> and i.MX7ulp hasn't
> the prescaler(IIUC), thus kinda not affected. So, perhaps you may help
> look into this and decouple the prescaler knowledge out, as it seems
> that you have experience on the relevant drivers and HW to test.

> Anyway, to me, it is _not_ a must to have if you really think it's hard
> to do or unnesessary :-)

...

> > > V2, like v1, tries to consolidate the knowledge in this fractional
> > > divider clk driver. So, not the right direction I think.
> >
> > Then why are you commenting here and not there? :-)
>
> Maybe v2 was sent too quickly as the decoupling comment on v1 hasn't
> been sufficiently discussed :-)

Maybe.

> I'll comment v2 briefly.

Thanks!

...

> > I think I would drop patch 2 from the set (patch 1 is Acked and patch
> > 3 is definitely needed to describe current state of affairs) on the
> > grounds of the comments.
>
> Please consider i.MX7ulp, as it hasn't the prescaler IIUC. i.MX7ulp
> needs NO_PRESCALER flag, if we keep the prescaler knowledge in this
> driver ofc.

Then we need a flag and v2 can go as is.

--
With Best Regards,
Andy Shevchenko

2021-07-22 10:09:57

by Liu Ying

[permalink] [raw]
Subject: Re: [PATCH v1 2/3] clk: fractional-divider: Introduce NO_PRESCALER flag

On Thu, 2021-07-22 at 12:34 +0300, Andy Shevchenko wrote:
> On Thu, Jul 22, 2021 at 12:11 PM Liu Ying <[email protected]> wrote:
> > On Thu, 2021-07-22 at 10:24 +0300, Andy Shevchenko wrote:
> > > On Thu, Jul 22, 2021 at 9:04 AM Liu Ying <[email protected]> wrote:
> > > > On Mon, 2021-07-19 at 15:09 +0300, Andy Shevchenko wrote:
> > > > > On Mon, Jul 19, 2021 at 11:16:07AM +0800, Liu Ying wrote:
> > > > > > On Fri, 2021-07-16 at 16:19 +0300, Andy Shevchenko wrote:
> > > > > > > On Fri, Jul 16, 2021 at 10:43:57AM +0800, Liu Ying wrote:
> > > > > > > > On Thu, 2021-07-15 at 15:07 +0300, Andy Shevchenko wrote:
>
> ...
>
> > > > > core (or even TTY) has a specific function to approximate the baud rate and it
> > > > > tries it 2 or 3 times. In case of *saturated* values it won't progress anyhow
> > > > > because from best rational approximation algorithm the very first attempt would
> > > > > be done against the best possible clock rate.
> > > > >
> > > > > Can you provide some code skeleton to see?
> > > >
> > > > Perhaps, two approaches can be taken in driver which uses the
> > > > fractional divider clock:
> > > > 1) Tune prescaler to generate higher rate or lower rate accordingly
> > > > when clk_round_rate() for the fractional divider clock returns lower or
> > > > higher rates then desired rate. This might take several rounds until
> > > > desired rate is satisfied w/wo a tolerated bias.
> > > > 2) Put working clock rates and/or parent clock rates in a table as sort
> > > > of prior knowledge, which means less code for rate negotiation.
> > >
> > > Often 2) is a bad idea which I'm against from day 1. I prefer to
> > > calculate what can be calculated.
> > > The 1) looks better but requires several (unnecessary IIRC) rounds.
> > > Why not supply the additional parameter(s) to tell that we have a
> > > prescaller with certain limitations?
> >
> > To me, it's kinda too much information to this common frational divider
> > clk driver. Making the common driver simple and easy to maintain is
> > important.
>
> But it has to have it due to the nature of the hardware design. If you
> leave it w/o that you have immediately come into the situation where
> the clock rate will be far too wrong because of *saturated* values.
> Have you done the arithmetics on the paper by the way?
>
> ...
>
> > > I might disagree on the grounds of the HW hierarchy and the best that
> > > we may achieve in _one_ pass. For example, for a 16-bit additional
> > > prescaler it will require up to 16 steps to get the best possible
> >
> > Would that be an unacceptable performance penalty?
>
> Yes.
>
> > > values for the m/n. Instead we may supply to this driver the
> > > information about subordinate prescaler and get the best m/n. The
> > > caller will need to just divide the resulting rate by the asked rate
> > > to get a prescaler value.
> >
> > IMHO, a simpler fractional divider clk driver without the prescaler
> > knowledge wins the tradeoff.
>
> I'm far from being convinced.
>
> ...
>
> > > > > TL;DR: please send a code to discuss.
>
> ^^^^ I am tired of telling you this, btw.
>
> > > > It seems that you have some experience on those intel drivers, this
> > > > clock driver and rational algorithm driver and you probably have intel
> > > > HWs to test. May I encourage you to look into this and decouple the
> > > > prescaler knowledge out :-)
> > > >
> > > > > Thanks for review and you review of v2 is warmly welcomed!
> > > >
> > > > I'd like to see patches to decouple the prescaler knowledge out.
> > >
> > > Then produce them! Currently the code works for all its users and does
> > > not need any changes (documentation is indeed a gap).
> >
> > IIUC, only the two Intel drivers mentioned before are affected.
> > Rockchip has it's own ->approximation() callback
>
> ...which is using the same algo, look at the patch 1 of the series. It
> seems you missed to actually review. Just review the series as a
> whole, please!

But, the topic is to decouple the prescaler knowledge.
I reviewed it as a whole though I was not Cc'ed for the patch 1/3. It
looks like Rockchip driver doesn't have to be touched if the prescaler
knowledge is decoupled from this fractional divider clk driver. If you
consolidate the prescaler knowledge in the Rockchip driver as patch 1/3
does, you touch it.

Regards,
Liu Ying

>
> > and i.MX7ulp hasn't
> > the prescaler(IIUC), thus kinda not affected. So, perhaps you may help
> > look into this and decouple the prescaler knowledge out, as it seems
> > that you have experience on the relevant drivers and HW to test.
> > Anyway, to me, it is _not_ a must to have if you really think it's hard
> > to do or unnesessary :-)
>
> ...
>
> > > > V2, like v1, tries to consolidate the knowledge in this fractional
> > > > divider clk driver. So, not the right direction I think.
> > >
> > > Then why are you commenting here and not there? :-)
> >
> > Maybe v2 was sent too quickly as the decoupling comment on v1 hasn't
> > been sufficiently discussed :-)
>
> Maybe.
>
> > I'll comment v2 briefly.
>
> Thanks!
>
> ...
>
> > > I think I would drop patch 2 from the set (patch 1 is Acked and patch
> > > 3 is definitely needed to describe current state of affairs) on the
> > > grounds of the comments.
> >
> > Please consider i.MX7ulp, as it hasn't the prescaler IIUC. i.MX7ulp
> > needs NO_PRESCALER flag, if we keep the prescaler knowledge in this
> > driver ofc.
>
> Then we need a flag and v2 can go as is.
>