2023-07-02 18:04:14

by Frank Oltmanns

[permalink] [raw]
Subject: [PATCH v3 0/8] clk: sunxi-ng: Consider alternative parent rates when determining NKM clock rate

This patchset enables NKM clocks to consider alternative parent rates
and utilize this new feature to adjust the pll-video0 clock on Allwinner
A64 (PATCH 1 and 2).

Furthermore, with this patchset pll-video0 considers rates that are
higher than the requested rate when finding the closest rate. In
consequence, higher rates are also considered by pll-video0's
descandents (PATCH 3 et. seq.).

This allows us to achieve an optimal rate for driving the board's panel.

To provide some context, the clock structure involved in this process is
as follows:
clock clock type
--------------------------------------
pll-video0 ccu_nm
pll-mipi ccu_nkm
tcon0 ccu_mux
tcon-data-clock sun4i_dclk

The divider between tcon0 and tcon-data-clock is fixed at 4. Therefore,
in order to achieve a rate that closely matches the desired rate of the
panel, pll-mipi needs to operate at a specific rate.

Tests
=====
So far, this has been successfully tested on the A64-based Pinephone
using three different panel rates:

1. A panel rate that can be matched exactly by pll-video0.
2. A panel rate that requires pll-video0 to undershoot to get the
closest rate.
3. A panel rate that requires pll-video0 to overshoot to get the
closest rate.

Test records:

Re 1:
-----
Panel requests tcon-data-clock of 103500000 Hz, i.e., pll-mipi needs to
run at 414000000 Hz. This results in the following clock rates:
clock rate
----------------------------------------
pll-video0 207000000
hdmi-phy-clk 51750000
hdmi 207000000
tcon1 207000000
pll-mipi 414000000
tcon0 414000000
tcon-data-clock 103500000

The results of the find_best calls:
[ 12.345862] ccu_nkm_find_best_with_parent_adj: rate=414000000, best_rate=414000000, best_parent_rate=207000000, n=1, k=2, m=1
[ 12.346111] ccu_nkm_find_best_with_parent_adj: rate=414000000, best_rate=414000000, best_parent_rate=207000000, n=1, k=2, m=1
[ 12.346291] ccu_nkm_find_best_with_parent_adj: rate=414000000, best_rate=414000000, best_parent_rate=207000000, n=1, k=2, m=1
[ 12.346471] ccu_nkm_find_best_with_parent_adj: rate=414000000, best_rate=414000000, best_parent_rate=207000000, n=1, k=2, m=1
[ 12.346867] ccu_nkm_find_best: rate=414000000, best_rate=414000000, parent_rate=207000000, n=1, k=2, m=1

Re 2:
-----
Panel requests tcon-data-clock of 103650000 Hz, i.e., pll-mipi needs to
run at 414600000 Hz. This results in the following clock rates:
clock rate
----------------------------------------
pll-video0 282666666
hdmi-phy-clk 70666666
hdmi 282666666
tcon1 282666666
pll-mipi 414577776
tcon0 414577776
tcon-data-clock 103644444

The results of the find_best calls:
[ 13.638954] ccu_nkm_find_best_with_parent_adj: rate=414600000, best_rate=414577776, best_parent_rate=282666666, n=11, k=2, m=15
[ 13.639212] ccu_nkm_find_best_with_parent_adj: rate=414600000, best_rate=414577776, best_parent_rate=282666666, n=11, k=2, m=15
[ 13.639395] ccu_nkm_find_best_with_parent_adj: rate=414577776, best_rate=414577776, best_parent_rate=282666666, n=11, k=2, m=15
[ 13.639577] ccu_nkm_find_best_with_parent_adj: rate=414577776, best_rate=414577776, best_parent_rate=282666666, n=11, k=2, m=15
[ 13.639913] ccu_nkm_find_best: rate=414577776, best_rate=414577776, parent_rate=282666666, n=11, k=2, m=15

Here, we consistently ask the pll-video0 for a rate that it can't
provide exactly:
- rate=414600000: We ask the parent for 282681818 (rate * m / (n * k)),
it returns 282666666. Here the parent undershoots.
- rate=414577776: We ask the parent for 282666665 (rate * m / (n * k)),
it returns 282666666. Here the parent overshoots.

So, in both cases it rounds to the nearest rate (first down, then up),
which is the intended behaviour.

Re 3:
-----
Panel requests tcon-data-clock of 112266000 Hz, i.e., pll-mipi needs to
run at 449064000 Hz. This results in the following clock rates:
clock rate
----------------------------------------
pll-video0 207272727
hdmi-phy-clk 51818181
hdmi 207272727
tcon1 207272727
pll-mipi 449090908
tcon0 449090908
tcon-data-clock 112272727

The results of the find_best calls:
[ 13.871022] ccu_nkm_find_best_with_parent_adj: rate=449064000, best_rate=449090908, best_parent_rate=207272727, n=13, k=2, m=12
[ 13.871277] ccu_nkm_find_best_with_parent_adj: rate=449064000, best_rate=449090908, best_parent_rate=207272727, n=13, k=2, m=12
[ 13.871461] ccu_nkm_find_best_with_parent_adj: rate=449090908, best_rate=449090908, best_parent_rate=207272727, n=13, k=2, m=12
[ 13.871646] ccu_nkm_find_best_with_parent_adj: rate=449090908, best_rate=449090908, best_parent_rate=207272727, n=13, k=2, m=12
[ 13.872050] ccu_nkm_find_best: rate=449090908, best_rate=449090908, parent_rate=207272727, n=13, k=2, m=12

Here, we consistently ask the pll-video0 for a rate that it can't
provide exactly:
- rate=449064000: We ask the parent for 207260307 (rate * m / (n * k)),
it returns 207272727.
- rate=449090908: We ask the parent for 207272726 (rate * m / (n * k)),
it returns 207272727.

So, in both cases, it rounds up to the nearest rate, which is the
intended behavior.

Changes in v3:
- Use dedicated function for finding the best rate in cases where an
nkm clock supports setting its parent's rate, streamlining it with
the structure that is used in other sunxi-ng ccus such as ccu_mp
(PATCH 1).
- Therefore, remove the now obsolete comments that were introduced in
v2 (PATCH 1).
- Remove the dedicated function for calculating the optimal parent rate
for nkm clocks that was introduced in v2. Instead use a simple
calculation and require the parent clock to select the closest rate to
achieve optimal results (PATCH 1).
- Therefore, add support to set the closest rate for nm clocks (because
pll-mipi's parent pll-video0 is an nm clock) and all clock types that
are descendants of a64's pll-video0, i.e., nkm, mux, and div (PATCH 3
et. seq.).
- Link to v2: https://lore.kernel.org/all/[email protected]/

Changes in V2:
- Move optimal parent rate calculation to dedicated function
- Choose a parent rate that does not to overshoot requested rate
- Add comments to ccu_nkm_find_best
- Make sure that best_parent_rate stays at original parent rate in the unlikely
case that all combinations overshoot.

Link to V1:
https://lore.kernel.org/lkml/[email protected]/

---
Frank Oltmanns (8):
clk: sunxi-ng: nkm: consider alternative parent rates when determining rate
clk: sunxi-ng: a64: allow pll-mipi to set parent's rate
clk: sunxi-ng: Add feature to find closest rate
clk: sunxi-ng: nm: Support finding closest rate
clk: sunxi-ng: nkm: Support finding closest rate
clk: sunxi-ng: mux: Support finding closest rate
clk: sunxi-ng: div: Support finding closest rate
clk: sunxi-ng: a64: select closest rate for pll-video0

drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 25 ++++++-----
drivers/clk/sunxi-ng/ccu-sun8i-h3.c | 3 +-
drivers/clk/sunxi-ng/ccu-sun8i-r40.c | 6 ++-
drivers/clk/sunxi-ng/ccu_common.h | 1 +
drivers/clk/sunxi-ng/ccu_div.h | 30 +++++++++++++
drivers/clk/sunxi-ng/ccu_mux.c | 36 +++++++++++++---
drivers/clk/sunxi-ng/ccu_mux.h | 17 ++++++++
drivers/clk/sunxi-ng/ccu_nkm.c | 80 ++++++++++++++++++++++++++++++++---
drivers/clk/sunxi-ng/ccu_nm.c | 23 +++++++++-
drivers/clk/sunxi-ng/ccu_nm.h | 6 ++-
10 files changed, 198 insertions(+), 29 deletions(-)
---
base-commit: 6995e2de6891c724bfeb2db33d7b87775f913ad1
change-id: 20230626-pll-mipi_set_rate_parent-3363fc0d6e6f

Best regards,
--
Frank Oltmanns <[email protected]>



2023-07-02 18:21:03

by Frank Oltmanns

[permalink] [raw]
Subject: [PATCH v3 1/8] clk: sunxi-ng: nkm: consider alternative parent rates when determining rate

In case the CLK_SET_RATE_PARENT flag is set, consider using a different
parent rate when determining a new rate.

To find the best match for the requested rate, perform the following
steps for each NKM combination:
- calculate the optimal parent rate,
- find the best parent rate that the parent clock actually supports
- use that parent rate to calculate the effective rate.

In case the clk does not support setting the parent rate, use the same
algorithm as before.

Signed-off-by: Frank Oltmanns <[email protected]>
---
drivers/clk/sunxi-ng/ccu_nkm.c | 48 +++++++++++++++++++++++++++++++++++++++---
1 file changed, 45 insertions(+), 3 deletions(-)

diff --git a/drivers/clk/sunxi-ng/ccu_nkm.c b/drivers/clk/sunxi-ng/ccu_nkm.c
index a0978a50edae..d83843e69c25 100644
--- a/drivers/clk/sunxi-ng/ccu_nkm.c
+++ b/drivers/clk/sunxi-ng/ccu_nkm.c
@@ -6,6 +6,7 @@

#include <linux/clk-provider.h>
#include <linux/io.h>
+#include <linux/math.h>

#include "ccu_gate.h"
#include "ccu_nkm.h"
@@ -16,6 +17,44 @@ struct _ccu_nkm {
unsigned long m, min_m, max_m;
};

+static unsigned long ccu_nkm_find_best_with_parent_adj(unsigned long *parent, unsigned long rate,
+ struct _ccu_nkm *nkm, struct clk_hw *phw)
+{
+ unsigned long best_rate = 0, best_parent_rate = *parent, tmp_parent = *parent;
+ unsigned long best_n = 0, best_k = 0, best_m = 0;
+ unsigned long _n, _k, _m;
+
+ for (_k = nkm->min_k; _k <= nkm->max_k; _k++) {
+ for (_n = nkm->min_n; _n <= nkm->max_n; _n++) {
+ for (_m = nkm->min_m; _m <= nkm->max_m; _m++) {
+ unsigned long tmp_rate;
+
+ tmp_parent = clk_hw_round_rate(phw, rate * _m / (_n * _k));
+
+ tmp_rate = tmp_parent * _n * _k / _m;
+ if (tmp_rate > rate)
+ continue;
+
+ if ((rate - tmp_rate) < (rate - best_rate)) {
+ best_rate = tmp_rate;
+ best_parent_rate = tmp_parent;
+ best_n = _n;
+ best_k = _k;
+ best_m = _m;
+ }
+ }
+ }
+ }
+
+ nkm->n = best_n;
+ nkm->k = best_k;
+ nkm->m = best_m;
+
+ *parent = best_parent_rate;
+
+ return best_rate;
+}
+
static unsigned long ccu_nkm_find_best(unsigned long parent, unsigned long rate,
struct _ccu_nkm *nkm)
{
@@ -106,7 +145,7 @@ static unsigned long ccu_nkm_recalc_rate(struct clk_hw *hw,
}

static unsigned long ccu_nkm_round_rate(struct ccu_mux_internal *mux,
- struct clk_hw *hw,
+ struct clk_hw *parent_hw,
unsigned long *parent_rate,
unsigned long rate,
void *data)
@@ -124,7 +163,10 @@ static unsigned long ccu_nkm_round_rate(struct ccu_mux_internal *mux,
if (nkm->common.features & CCU_FEATURE_FIXED_POSTDIV)
rate *= nkm->fixed_post_div;

- rate = ccu_nkm_find_best(*parent_rate, rate, &_nkm);
+ if (!clk_hw_can_set_rate_parent(&nkm->common.hw))
+ rate = ccu_nkm_find_best(*parent_rate, rate, &_nkm);
+ else
+ rate = ccu_nkm_find_best_with_parent_adj(parent_rate, rate, &_nkm, parent_hw);

if (nkm->common.features & CCU_FEATURE_FIXED_POSTDIV)
rate /= nkm->fixed_post_div;
@@ -159,7 +201,7 @@ static int ccu_nkm_set_rate(struct clk_hw *hw, unsigned long rate,
_nkm.min_m = 1;
_nkm.max_m = nkm->m.max ?: 1 << nkm->m.width;

- ccu_nkm_find_best(parent_rate, rate, &_nkm);
+ ccu_nkm_find_best(&parent_rate, rate, &_nkm);

spin_lock_irqsave(nkm->common.lock, flags);


--
2.41.0


2023-07-02 18:23:25

by Frank Oltmanns

[permalink] [raw]
Subject: [PATCH v3 7/8] clk: sunxi-ng: div: Support finding closest rate

Add initalization macros for divisor clocks with mux
(SUNXI_CCU_M_WITH_MUX) to support finding the closest rate. This clock
type requires the appropriate flags to be set in the .common structure
(for the mux part of the clock) and the .div part.

Signed-off-by: Frank Oltmanns <[email protected]>
---
drivers/clk/sunxi-ng/ccu_div.h | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)

diff --git a/drivers/clk/sunxi-ng/ccu_div.h b/drivers/clk/sunxi-ng/ccu_div.h
index 948e2b0c0c3b..90d49ee8e0cc 100644
--- a/drivers/clk/sunxi-ng/ccu_div.h
+++ b/drivers/clk/sunxi-ng/ccu_div.h
@@ -143,6 +143,26 @@ struct ccu_div {
}, \
}

+#define SUNXI_CCU_M_WITH_MUX_TABLE_GATE_CLOSEST(_struct, _name, \
+ _parents, _table, \
+ _reg, \
+ _mshift, _mwidth, \
+ _muxshift, _muxwidth, \
+ _gate, _flags) \
+ struct ccu_div _struct = { \
+ .enable = _gate, \
+ .div = _SUNXI_CCU_DIV_FLAGS(_mshift, _mwidth, CLK_DIVIDER_ROUND_CLOSEST), \
+ .mux = _SUNXI_CCU_MUX_TABLE(_muxshift, _muxwidth, _table), \
+ .common = { \
+ .reg = _reg, \
+ .hw.init = CLK_HW_INIT_PARENTS(_name, \
+ _parents, \
+ &ccu_div_ops, \
+ _flags), \
+ .features = CCU_FEATURE_CLOSEST_RATE, \
+ }, \
+ }
+
#define SUNXI_CCU_M_WITH_MUX_GATE(_struct, _name, _parents, _reg, \
_mshift, _mwidth, _muxshift, _muxwidth, \
_gate, _flags) \
@@ -152,6 +172,16 @@ struct ccu_div {
_muxshift, _muxwidth, \
_gate, _flags)

+#define SUNXI_CCU_M_WITH_MUX_GATE_CLOSEST(_struct, _name, _parents, \
+ _reg, _mshift, _mwidth, \
+ _muxshift, _muxwidth, \
+ _gate, _flags) \
+ SUNXI_CCU_M_WITH_MUX_TABLE_GATE_CLOSEST(_struct, _name, \
+ _parents, NULL, \
+ _reg, _mshift, _mwidth, \
+ _muxshift, _muxwidth, \
+ _gate, _flags)
+
#define SUNXI_CCU_M_WITH_MUX(_struct, _name, _parents, _reg, \
_mshift, _mwidth, _muxshift, _muxwidth, \
_flags) \

--
2.41.0


2023-07-02 18:24:39

by Frank Oltmanns

[permalink] [raw]
Subject: [PATCH v3 6/8] clk: sunxi-ng: mux: Support finding closest rate

When finding the best rate for a mux clock, consider rates that are
higher than the requested rate by introducing a new clk_ops structure
that uses the existing __clk_mux_determine_rate_closest function.
Furthermore introduce an initialization macro that uses this new
structure.

Signed-off-by: Frank Oltmanns <[email protected]>
---
drivers/clk/sunxi-ng/ccu_mux.c | 13 +++++++++++++
drivers/clk/sunxi-ng/ccu_mux.h | 17 +++++++++++++++++
2 files changed, 30 insertions(+)

diff --git a/drivers/clk/sunxi-ng/ccu_mux.c b/drivers/clk/sunxi-ng/ccu_mux.c
index 8594d6a4addd..49a592bdeacf 100644
--- a/drivers/clk/sunxi-ng/ccu_mux.c
+++ b/drivers/clk/sunxi-ng/ccu_mux.c
@@ -264,6 +264,19 @@ static unsigned long ccu_mux_recalc_rate(struct clk_hw *hw,
parent_rate);
}

+const struct clk_ops ccu_mux_closest_ops = {
+ .disable = ccu_mux_disable,
+ .enable = ccu_mux_enable,
+ .is_enabled = ccu_mux_is_enabled,
+
+ .get_parent = ccu_mux_get_parent,
+ .set_parent = ccu_mux_set_parent,
+
+ .determine_rate = __clk_mux_determine_rate_closest,
+ .recalc_rate = ccu_mux_recalc_rate,
+};
+EXPORT_SYMBOL_NS_GPL(ccu_mux_closest_ops, SUNXI_CCU);
+
const struct clk_ops ccu_mux_ops = {
.disable = ccu_mux_disable,
.enable = ccu_mux_enable,
diff --git a/drivers/clk/sunxi-ng/ccu_mux.h b/drivers/clk/sunxi-ng/ccu_mux.h
index 2c1811a445b0..c4ee14e43719 100644
--- a/drivers/clk/sunxi-ng/ccu_mux.h
+++ b/drivers/clk/sunxi-ng/ccu_mux.h
@@ -46,6 +46,22 @@ struct ccu_mux {
struct ccu_common common;
};

+#define SUNXI_CCU_MUX_TABLE_WITH_GATE_CLOSEST(_struct, _name, _parents, _table, \
+ _reg, _shift, _width, _gate, \
+ _flags) \
+ struct ccu_mux _struct = { \
+ .enable = _gate, \
+ .mux = _SUNXI_CCU_MUX_TABLE(_shift, _width, _table), \
+ .common = { \
+ .reg = _reg, \
+ .hw.init = CLK_HW_INIT_PARENTS(_name, \
+ _parents, \
+ &ccu_mux_closest_ops, \
+ _flags), \
+ .features = CCU_FEATURE_CLOSEST_RATE, \
+ } \
+ }
+
#define SUNXI_CCU_MUX_TABLE_WITH_GATE(_struct, _name, _parents, _table, \
_reg, _shift, _width, _gate, \
_flags) \
@@ -113,6 +129,7 @@ static inline struct ccu_mux *hw_to_ccu_mux(struct clk_hw *hw)
}

extern const struct clk_ops ccu_mux_ops;
+extern const struct clk_ops ccu_mux_closest_ops;

unsigned long ccu_mux_helper_apply_prediv(struct ccu_common *common,
struct ccu_mux_internal *cm,

--
2.41.0


2023-07-02 18:28:50

by Frank Oltmanns

[permalink] [raw]
Subject: [PATCH v3 8/8] clk: sunxi-ng: a64: select closest rate for pll-video0

Selecting the closest rate for pll-video0 instead of the closest rate
that is less than the requested rate has no downside for this clock,
while allowing for selecting a more suitable rate, e.g. for the
connected panels.

Furthermore, the algorithm that sets an NKM clock's parent benefits from
the closest rate. Without it, the NKM clock's rate might drift away from
the requested rate in the multiple successive calls to
ccu_nkm_determine_rate that the clk framework performs when setting a
clock rate.

Therefore, configure pll-video0 and, in consequence, all of its
descendents to select the closest rate.

Signed-off-by: Frank Oltmanns <[email protected]>
---
drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
index ebfab1fdbb96..016432ff3bde 100644
--- a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
+++ b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
@@ -81,7 +81,7 @@ static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX(pll_video0_clk, "pll-video0",
BIT(31), /* gate */
BIT(28), /* lock */
CLK_SET_RATE_UNGATE,
- 0); /* features */
+ CCU_FEATURE_CLOSEST_RATE);

static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK(pll_ve_clk, "pll-ve",
"osc24M", 0x018,
@@ -183,6 +183,7 @@ static struct ccu_nkm pll_mipi_clk = {
.hw.init = CLK_HW_INIT("pll-mipi", "pll-video0",
&ccu_nkm_ops,
CLK_SET_RATE_UNGATE | CLK_SET_RATE_PARENT),
+ .features = CCU_FEATURE_CLOSEST_RATE,
},
};

@@ -533,7 +534,7 @@ static SUNXI_CCU_M_WITH_MUX_GATE(de_clk, "de", de_parents,

static const char * const tcon0_parents[] = { "pll-mipi", "pll-video0-2x" };
static const u8 tcon0_table[] = { 0, 2, };
-static SUNXI_CCU_MUX_TABLE_WITH_GATE(tcon0_clk, "tcon0", tcon0_parents,
+static SUNXI_CCU_MUX_TABLE_WITH_GATE_CLOSEST(tcon0_clk, "tcon0", tcon0_parents,
tcon0_table, 0x118, 24, 3, BIT(31),
CLK_SET_RATE_PARENT);

@@ -541,7 +542,7 @@ static const char * const tcon1_parents[] = { "pll-video0", "pll-video1" };
static const u8 tcon1_table[] = { 0, 2, };
static struct ccu_div tcon1_clk = {
.enable = BIT(31),
- .div = _SUNXI_CCU_DIV(0, 4),
+ .div = _SUNXI_CCU_DIV_FLAGS(0, 4, CLK_DIVIDER_ROUND_CLOSEST),
.mux = _SUNXI_CCU_MUX_TABLE(24, 2, tcon1_table),
.common = {
.reg = 0x11c,
@@ -549,6 +550,7 @@ static struct ccu_div tcon1_clk = {
tcon1_parents,
&ccu_div_ops,
CLK_SET_RATE_PARENT),
+ .features = CCU_FEATURE_CLOSEST_RATE,
},
};

@@ -580,8 +582,8 @@ static SUNXI_CCU_GATE(avs_clk, "avs", "osc24M",
0x144, BIT(31), 0);

static const char * const hdmi_parents[] = { "pll-video0", "pll-video1" };
-static SUNXI_CCU_M_WITH_MUX_GATE(hdmi_clk, "hdmi", hdmi_parents,
- 0x150, 0, 4, 24, 2, BIT(31), CLK_SET_RATE_PARENT);
+static SUNXI_CCU_M_WITH_MUX_GATE_CLOSEST(hdmi_clk, "hdmi", hdmi_parents,
+ 0x150, 0, 4, 24, 2, BIT(31), CLK_SET_RATE_PARENT);

static SUNXI_CCU_GATE(hdmi_ddc_clk, "hdmi-ddc", "osc24M",
0x154, BIT(31), 0);
@@ -593,9 +595,9 @@ static SUNXI_CCU_M_WITH_MUX_GATE(mbus_clk, "mbus", mbus_parents,

static const char * const dsi_dphy_parents[] = { "pll-video0", "pll-periph0" };
static const u8 dsi_dphy_table[] = { 0, 2, };
-static SUNXI_CCU_M_WITH_MUX_TABLE_GATE(dsi_dphy_clk, "dsi-dphy",
- dsi_dphy_parents, dsi_dphy_table,
- 0x168, 0, 4, 8, 2, BIT(15), CLK_SET_RATE_PARENT);
+static SUNXI_CCU_M_WITH_MUX_TABLE_GATE_CLOSEST(dsi_dphy_clk, "dsi-dphy",
+ dsi_dphy_parents, dsi_dphy_table,
+ 0x168, 0, 4, 8, 2, BIT(15), CLK_SET_RATE_PARENT);

static SUNXI_CCU_M_WITH_GATE(gpu_clk, "gpu", "pll-gpu",
0x1a0, 0, 3, BIT(31), CLK_SET_RATE_PARENT);

--
2.41.0


2023-07-02 18:29:03

by Frank Oltmanns

[permalink] [raw]
Subject: [PATCH v3 3/8] clk: sunxi-ng: Add feature to find closest rate

The default behaviour of clocks in the sunxi-ng driver is to select a
clock rate that is closest to but less than the requested rate.

Add the CCU_FEATURE_CLOSEST_RATE flag, which can be used to allow clocks
to find the closest rate instead.

Signed-off-by: Frank Oltmanns <[email protected]>
---
drivers/clk/sunxi-ng/ccu_common.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/clk/sunxi-ng/ccu_common.h b/drivers/clk/sunxi-ng/ccu_common.h
index fbf16c6b896d..5ad219f041d5 100644
--- a/drivers/clk/sunxi-ng/ccu_common.h
+++ b/drivers/clk/sunxi-ng/ccu_common.h
@@ -18,6 +18,7 @@
#define CCU_FEATURE_MMC_TIMING_SWITCH BIT(6)
#define CCU_FEATURE_SIGMA_DELTA_MOD BIT(7)
#define CCU_FEATURE_KEY_FIELD BIT(8)
+#define CCU_FEATURE_CLOSEST_RATE BIT(9)

/* MMC timing mode switch bit */
#define CCU_MMC_NEW_TIMING_MODE BIT(30)

--
2.41.0


2023-07-02 18:29:56

by Frank Oltmanns

[permalink] [raw]
Subject: [PATCH v3 4/8] clk: sunxi-ng: nm: Support finding closest rate

Incorporate a new function, ccu_nm_find_best_closest, that selects the
closest possible rate instead of the closest rate that is less than the
requested rate. Utilizing rational_best_approximation has the additional
effect of boosting performance in clock rate selection.

In cases where the CCU_FEATURE_CLOSEST_RATE flag is set, use
ccu_nm_find_best_closest instead of the original ccu_nm_find_best
function.

Extend the macro SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX to allow
selecting additional features and update all usages of the macro with no
additional flags set.

Signed-off-by: Frank Oltmanns <[email protected]>
---
drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 6 ++++--
drivers/clk/sunxi-ng/ccu-sun8i-h3.c | 3 ++-
drivers/clk/sunxi-ng/ccu-sun8i-r40.c | 6 ++++--
drivers/clk/sunxi-ng/ccu_nm.c | 23 +++++++++++++++++++++--
drivers/clk/sunxi-ng/ccu_nm.h | 6 ++++--
5 files changed, 35 insertions(+), 9 deletions(-)

diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
index a139a5c438d4..ebfab1fdbb96 100644
--- a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
+++ b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
@@ -80,7 +80,8 @@ static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX(pll_video0_clk, "pll-video0",
297000000, /* frac rate 1 */
BIT(31), /* gate */
BIT(28), /* lock */
- CLK_SET_RATE_UNGATE);
+ CLK_SET_RATE_UNGATE,
+ 0); /* features */

static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK(pll_ve_clk, "pll-ve",
"osc24M", 0x018,
@@ -143,7 +144,8 @@ static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX(pll_video1_clk, "pll-video1",
297000000, /* frac rate 1 */
BIT(31), /* gate */
BIT(28), /* lock */
- CLK_SET_RATE_UNGATE);
+ CLK_SET_RATE_UNGATE,
+ 0); /* features */

static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK(pll_gpu_clk, "pll-gpu",
"osc24M", 0x038,
diff --git a/drivers/clk/sunxi-ng/ccu-sun8i-h3.c b/drivers/clk/sunxi-ng/ccu-sun8i-h3.c
index bfebe8dbbe65..1e2669648a24 100644
--- a/drivers/clk/sunxi-ng/ccu-sun8i-h3.c
+++ b/drivers/clk/sunxi-ng/ccu-sun8i-h3.c
@@ -76,7 +76,8 @@ static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX(pll_video_clk, "pll-video",
297000000, /* frac rate 1 */
BIT(31), /* gate */
BIT(28), /* lock */
- CLK_SET_RATE_UNGATE);
+ CLK_SET_RATE_UNGATE,
+ 0); /* features */

static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK(pll_ve_clk, "pll-ve",
"osc24M", 0x0018,
diff --git a/drivers/clk/sunxi-ng/ccu-sun8i-r40.c b/drivers/clk/sunxi-ng/ccu-sun8i-r40.c
index 31eca0d3bc1e..63907b86ce06 100644
--- a/drivers/clk/sunxi-ng/ccu-sun8i-r40.c
+++ b/drivers/clk/sunxi-ng/ccu-sun8i-r40.c
@@ -82,7 +82,8 @@ static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX(pll_video0_clk, "pll-video0",
297000000, /* frac rate 1 */
BIT(31), /* gate */
BIT(28), /* lock */
- CLK_SET_RATE_UNGATE);
+ CLK_SET_RATE_UNGATE,
+ 0); /* features */

/* TODO: The result of N/M is required to be in [8, 25] range. */
static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK(pll_ve_clk, "pll-ve",
@@ -169,7 +170,8 @@ static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX(pll_video1_clk, "pll-video1",
297000000, /* frac rate 1 */
BIT(31), /* gate */
BIT(28), /* lock */
- CLK_SET_RATE_UNGATE);
+ CLK_SET_RATE_UNGATE,
+ 0); /* features */

static struct ccu_nkm pll_sata_clk = {
.enable = BIT(31),
diff --git a/drivers/clk/sunxi-ng/ccu_nm.c b/drivers/clk/sunxi-ng/ccu_nm.c
index c1fd11542c45..97d8d9e3255c 100644
--- a/drivers/clk/sunxi-ng/ccu_nm.c
+++ b/drivers/clk/sunxi-ng/ccu_nm.c
@@ -6,6 +6,7 @@

#include <linux/clk-provider.h>
#include <linux/io.h>
+#include <linux/rational.h>

#include "ccu_frac.h"
#include "ccu_gate.h"
@@ -56,6 +57,18 @@ static unsigned long ccu_nm_find_best(unsigned long parent, unsigned long rate,
return best_rate;
}

+static unsigned long ccu_nm_find_best_closest(unsigned long parent, unsigned long rate,
+ struct _ccu_nm *nm)
+{
+ unsigned long best_rate = 0;
+
+ rational_best_approximation(rate, parent, nm->max_n, nm->max_m, &nm->n, &nm->m);
+
+ best_rate = ccu_nm_calc_rate(parent, nm->n, nm->m);
+
+ return best_rate;
+}
+
static void ccu_nm_disable(struct clk_hw *hw)
{
struct ccu_nm *nm = hw_to_ccu_nm(hw);
@@ -159,7 +172,10 @@ static long ccu_nm_round_rate(struct clk_hw *hw, unsigned long rate,
_nm.min_m = 1;
_nm.max_m = nm->m.max ?: 1 << nm->m.width;

- rate = ccu_nm_find_best(*parent_rate, rate, &_nm);
+ if (nm->common.features & CCU_FEATURE_CLOSEST_RATE)
+ rate = ccu_nm_find_best_closest(*parent_rate, rate, &_nm);
+ else
+ rate = ccu_nm_find_best(*parent_rate, rate, &_nm);

if (nm->common.features & CCU_FEATURE_FIXED_POSTDIV)
rate /= nm->fixed_post_div;
@@ -210,7 +226,10 @@ static int ccu_nm_set_rate(struct clk_hw *hw, unsigned long rate,
&_nm.m, &_nm.n);
} else {
ccu_sdm_helper_disable(&nm->common, &nm->sdm);
- ccu_nm_find_best(parent_rate, rate, &_nm);
+ if (nm->common.features & CCU_FEATURE_CLOSEST_RATE)
+ ccu_nm_find_best_closest(parent_rate, rate, &_nm);
+ else
+ ccu_nm_find_best(parent_rate, rate, &_nm);
}

spin_lock_irqsave(nm->common.lock, flags);
diff --git a/drivers/clk/sunxi-ng/ccu_nm.h b/drivers/clk/sunxi-ng/ccu_nm.h
index 2904e67f05a8..a3825c4e8d42 100644
--- a/drivers/clk/sunxi-ng/ccu_nm.h
+++ b/drivers/clk/sunxi-ng/ccu_nm.h
@@ -116,7 +116,8 @@ struct ccu_nm {
_frac_en, _frac_sel, \
_frac_rate_0, \
_frac_rate_1, \
- _gate, _lock, _flags) \
+ _gate, _lock, _flags, \
+ _features) \
struct ccu_nm _struct = { \
.enable = _gate, \
.lock = _lock, \
@@ -129,7 +130,8 @@ struct ccu_nm {
.max_rate = _max_rate, \
.common = { \
.reg = _reg, \
- .features = CCU_FEATURE_FRACTIONAL, \
+ .features = CCU_FEATURE_FRACTIONAL | \
+ _features, \
.hw.init = CLK_HW_INIT(_name, \
_parent, \
&ccu_nm_ops, \

--
2.41.0


2023-07-03 06:53:20

by Maxime Ripard

[permalink] [raw]
Subject: Re: [PATCH v3 1/8] clk: sunxi-ng: nkm: consider alternative parent rates when determining rate

Hi,

On Sun, Jul 02, 2023 at 07:55:20PM +0200, Frank Oltmanns wrote:
> In case the CLK_SET_RATE_PARENT flag is set, consider using a different
> parent rate when determining a new rate.
>
> To find the best match for the requested rate, perform the following
> steps for each NKM combination:
> - calculate the optimal parent rate,
> - find the best parent rate that the parent clock actually supports
> - use that parent rate to calculate the effective rate.
>
> In case the clk does not support setting the parent rate, use the same
> algorithm as before.
>
> Signed-off-by: Frank Oltmanns <[email protected]>
> ---
> drivers/clk/sunxi-ng/ccu_nkm.c | 48 +++++++++++++++++++++++++++++++++++++++---
> 1 file changed, 45 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/clk/sunxi-ng/ccu_nkm.c b/drivers/clk/sunxi-ng/ccu_nkm.c
> index a0978a50edae..d83843e69c25 100644
> --- a/drivers/clk/sunxi-ng/ccu_nkm.c
> +++ b/drivers/clk/sunxi-ng/ccu_nkm.c
> @@ -6,6 +6,7 @@
>
> #include <linux/clk-provider.h>
> #include <linux/io.h>
> +#include <linux/math.h>
>
> #include "ccu_gate.h"
> #include "ccu_nkm.h"
> @@ -16,6 +17,44 @@ struct _ccu_nkm {
> unsigned long m, min_m, max_m;
> };
>
> +static unsigned long ccu_nkm_find_best_with_parent_adj(unsigned long *parent, unsigned long rate,
> + struct _ccu_nkm *nkm, struct clk_hw *phw)

The usual order in that driver (and Linux in general) would make the
clk_hw and nkm structure pointers first, and then the parent rate and
rate.

But something looks off to me: ccu_nkm_find_best_with_parent_adj takes a
pointer to the parent rate which makes sense since we're going to modify
it.

> +{
> + unsigned long best_rate = 0, best_parent_rate = *parent, tmp_parent = *parent;
> + unsigned long best_n = 0, best_k = 0, best_m = 0;
> + unsigned long _n, _k, _m;
> +
> + for (_k = nkm->min_k; _k <= nkm->max_k; _k++) {
> + for (_n = nkm->min_n; _n <= nkm->max_n; _n++) {
> + for (_m = nkm->min_m; _m <= nkm->max_m; _m++) {
> + unsigned long tmp_rate;
> +
> + tmp_parent = clk_hw_round_rate(phw, rate * _m / (_n * _k));
> +
> + tmp_rate = tmp_parent * _n * _k / _m;
> + if (tmp_rate > rate)
> + continue;
> +
> + if ((rate - tmp_rate) < (rate - best_rate)) {
> + best_rate = tmp_rate;
> + best_parent_rate = tmp_parent;
> + best_n = _n;
> + best_k = _k;
> + best_m = _m;
> + }
> + }
> + }
> + }
> +
> + nkm->n = best_n;
> + nkm->k = best_k;
> + nkm->m = best_m;
> +
> + *parent = best_parent_rate;
> +
> + return best_rate;
> +}
> +
> static unsigned long ccu_nkm_find_best(unsigned long parent, unsigned long rate,
> struct _ccu_nkm *nkm)

You haven't modified ccu_nkm_find_best though, and it still takes the
parent rate value.

> {
> @@ -106,7 +145,7 @@ static unsigned long ccu_nkm_recalc_rate(struct clk_hw *hw,
> }
>
> static unsigned long ccu_nkm_round_rate(struct ccu_mux_internal *mux,
> - struct clk_hw *hw,
> + struct clk_hw *parent_hw,

(This should be another patch)

> unsigned long *parent_rate,
> unsigned long rate,
> void *data)
> @@ -124,7 +163,10 @@ static unsigned long ccu_nkm_round_rate(struct ccu_mux_internal *mux,
> if (nkm->common.features & CCU_FEATURE_FIXED_POSTDIV)
> rate *= nkm->fixed_post_div;
>
> - rate = ccu_nkm_find_best(*parent_rate, rate, &_nkm);

parent_rate is a pointer, we were dereferencing it to pass its value to
ccu_nkm_find_best. All good so far.

> + if (!clk_hw_can_set_rate_parent(&nkm->common.hw))
> + rate = ccu_nkm_find_best(*parent_rate, rate, &_nkm);

Still passing by value

> + else
> + rate = ccu_nkm_find_best_with_parent_adj(parent_rate, rate, &_nkm, parent_hw);

And passing the pointer there since it takes a pointer. Still good.

>
> if (nkm->common.features & CCU_FEATURE_FIXED_POSTDIV)
> rate /= nkm->fixed_post_div;
> @@ -159,7 +201,7 @@ static int ccu_nkm_set_rate(struct clk_hw *hw, unsigned long rate,
> _nkm.min_m = 1;
> _nkm.max_m = nkm->m.max ?: 1 << nkm->m.width;
>
> - ccu_nkm_find_best(parent_rate, rate, &_nkm);
> + ccu_nkm_find_best(&parent_rate, rate, &_nkm);

But here, we're passing a pointer to parent_rate to ccu_nkm_find_best,
while it's still supposed to take it by value?

Maxime


Attachments:
(No filename) (4.28 kB)
signature.asc (235.00 B)
Download all attachments

2023-07-03 07:13:09

by Maxime Ripard

[permalink] [raw]
Subject: Re: [PATCH v3 3/8] clk: sunxi-ng: Add feature to find closest rate

On Sun, 2 Jul 2023 19:55:22 +0200, Frank Oltmanns wrote:
> The default behaviour of clocks in the sunxi-ng driver is to select a
> clock rate that is closest to but less than the requested rate.
>
> Add the CCU_FEATURE_CLOSEST_RATE flag, which can be used to allow clocks
> to find the closest rate instead.
>
> [ ... ]

Acked-by: Maxime Ripard <[email protected]>

Thanks!
Maxime

2023-07-03 07:33:39

by Maxime Ripard

[permalink] [raw]
Subject: Re: [PATCH v3 4/8] clk: sunxi-ng: nm: Support finding closest rate

Hi,

On Sun, Jul 02, 2023 at 07:55:23PM +0200, Frank Oltmanns wrote:
> Incorporate a new function, ccu_nm_find_best_closest, that selects the
> closest possible rate instead of the closest rate that is less than the
> requested rate. Utilizing rational_best_approximation has the additional
> effect of boosting performance in clock rate selection.
>
> In cases where the CCU_FEATURE_CLOSEST_RATE flag is set, use
> ccu_nm_find_best_closest instead of the original ccu_nm_find_best
> function.
>
> Extend the macro SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX to allow
> selecting additional features and update all usages of the macro with no
> additional flags set.
>
> Signed-off-by: Frank Oltmanns <[email protected]>
> ---
> drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 6 ++++--
> drivers/clk/sunxi-ng/ccu-sun8i-h3.c | 3 ++-
> drivers/clk/sunxi-ng/ccu-sun8i-r40.c | 6 ++++--
> drivers/clk/sunxi-ng/ccu_nm.c | 23 +++++++++++++++++++++--
> drivers/clk/sunxi-ng/ccu_nm.h | 6 ++++--
> 5 files changed, 35 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
> index a139a5c438d4..ebfab1fdbb96 100644
> --- a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
> @@ -80,7 +80,8 @@ static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX(pll_video0_clk, "pll-video0",
> 297000000, /* frac rate 1 */
> BIT(31), /* gate */
> BIT(28), /* lock */
> - CLK_SET_RATE_UNGATE);
> + CLK_SET_RATE_UNGATE,
> + 0); /* features */
>
> static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK(pll_ve_clk, "pll-ve",
> "osc24M", 0x018,
> @@ -143,7 +144,8 @@ static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX(pll_video1_clk, "pll-video1",
> 297000000, /* frac rate 1 */
> BIT(31), /* gate */
> BIT(28), /* lock */
> - CLK_SET_RATE_UNGATE);
> + CLK_SET_RATE_UNGATE,
> + 0); /* features */
>
> static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK(pll_gpu_clk, "pll-gpu",
> "osc24M", 0x038,
> diff --git a/drivers/clk/sunxi-ng/ccu-sun8i-h3.c b/drivers/clk/sunxi-ng/ccu-sun8i-h3.c
> index bfebe8dbbe65..1e2669648a24 100644
> --- a/drivers/clk/sunxi-ng/ccu-sun8i-h3.c
> +++ b/drivers/clk/sunxi-ng/ccu-sun8i-h3.c
> @@ -76,7 +76,8 @@ static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX(pll_video_clk, "pll-video",
> 297000000, /* frac rate 1 */
> BIT(31), /* gate */
> BIT(28), /* lock */
> - CLK_SET_RATE_UNGATE);
> + CLK_SET_RATE_UNGATE,
> + 0); /* features */
>
> static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK(pll_ve_clk, "pll-ve",
> "osc24M", 0x0018,
> diff --git a/drivers/clk/sunxi-ng/ccu-sun8i-r40.c b/drivers/clk/sunxi-ng/ccu-sun8i-r40.c
> index 31eca0d3bc1e..63907b86ce06 100644
> --- a/drivers/clk/sunxi-ng/ccu-sun8i-r40.c
> +++ b/drivers/clk/sunxi-ng/ccu-sun8i-r40.c
> @@ -82,7 +82,8 @@ static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX(pll_video0_clk, "pll-video0",
> 297000000, /* frac rate 1 */
> BIT(31), /* gate */
> BIT(28), /* lock */
> - CLK_SET_RATE_UNGATE);
> + CLK_SET_RATE_UNGATE,
> + 0); /* features */
>
> /* TODO: The result of N/M is required to be in [8, 25] range. */
> static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK(pll_ve_clk, "pll-ve",
> @@ -169,7 +170,8 @@ static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX(pll_video1_clk, "pll-video1",
> 297000000, /* frac rate 1 */
> BIT(31), /* gate */
> BIT(28), /* lock */
> - CLK_SET_RATE_UNGATE);
> + CLK_SET_RATE_UNGATE,
> + 0); /* features */
>
> static struct ccu_nkm pll_sata_clk = {
> .enable = BIT(31),
> diff --git a/drivers/clk/sunxi-ng/ccu_nm.c b/drivers/clk/sunxi-ng/ccu_nm.c
> index c1fd11542c45..97d8d9e3255c 100644
> --- a/drivers/clk/sunxi-ng/ccu_nm.c
> +++ b/drivers/clk/sunxi-ng/ccu_nm.c
> @@ -6,6 +6,7 @@
>
> #include <linux/clk-provider.h>
> #include <linux/io.h>
> +#include <linux/rational.h>
>
> #include "ccu_frac.h"
> #include "ccu_gate.h"
> @@ -56,6 +57,18 @@ static unsigned long ccu_nm_find_best(unsigned long parent, unsigned long rate,
> return best_rate;
> }
>
> +static unsigned long ccu_nm_find_best_closest(unsigned long parent, unsigned long rate,
> + struct _ccu_nm *nm)
> +{
> + unsigned long best_rate = 0;
> +
> + rational_best_approximation(rate, parent, nm->max_n, nm->max_m, &nm->n, &nm->m);
> +
> + best_rate = ccu_nm_calc_rate(parent, nm->n, nm->m);
> +
> + return best_rate;
> +}
> +
> static void ccu_nm_disable(struct clk_hw *hw)
> {
> struct ccu_nm *nm = hw_to_ccu_nm(hw);
> @@ -159,7 +172,10 @@ static long ccu_nm_round_rate(struct clk_hw *hw, unsigned long rate,
> _nm.min_m = 1;
> _nm.max_m = nm->m.max ?: 1 << nm->m.width;
>
> - rate = ccu_nm_find_best(*parent_rate, rate, &_nm);
> + if (nm->common.features & CCU_FEATURE_CLOSEST_RATE)
> + rate = ccu_nm_find_best_closest(*parent_rate, rate, &_nm);
> + else
> + rate = ccu_nm_find_best(*parent_rate, rate, &_nm);

So this ends up creating a completely different path and algo for the
"closest" case, and I'm not super comfortable with that.

I think you can simplify this a bit by creating a comparison function
that will take two rates and a set of flags and will behave differently
depending on whether CCU_FEATURE_CLOSEST_RATE is set, like
mux_is_better_rate is doing for example.

You'll also avoid code duplication that way that can be shown a bit in
you subsequent patches.

>
> if (nm->common.features & CCU_FEATURE_FIXED_POSTDIV)
> rate /= nm->fixed_post_div;
> @@ -210,7 +226,10 @@ static int ccu_nm_set_rate(struct clk_hw *hw, unsigned long rate,
> &_nm.m, &_nm.n);
> } else {
> ccu_sdm_helper_disable(&nm->common, &nm->sdm);
> - ccu_nm_find_best(parent_rate, rate, &_nm);
> + if (nm->common.features & CCU_FEATURE_CLOSEST_RATE)
> + ccu_nm_find_best_closest(parent_rate, rate, &_nm);
> + else
> + ccu_nm_find_best(parent_rate, rate, &_nm);
> }
>
> spin_lock_irqsave(nm->common.lock, flags);
> diff --git a/drivers/clk/sunxi-ng/ccu_nm.h b/drivers/clk/sunxi-ng/ccu_nm.h
> index 2904e67f05a8..a3825c4e8d42 100644
> --- a/drivers/clk/sunxi-ng/ccu_nm.h
> +++ b/drivers/clk/sunxi-ng/ccu_nm.h
> @@ -116,7 +116,8 @@ struct ccu_nm {
> _frac_en, _frac_sel, \
> _frac_rate_0, \
> _frac_rate_1, \
> - _gate, _lock, _flags) \
> + _gate, _lock, _flags, \
> + _features) \
> struct ccu_nm _struct = { \
> .enable = _gate, \
> .lock = _lock, \
> @@ -129,7 +130,8 @@ struct ccu_nm {
> .max_rate = _max_rate, \
> .common = { \
> .reg = _reg, \
> - .features = CCU_FEATURE_FRACTIONAL, \
> + .features = CCU_FEATURE_FRACTIONAL | \
> + _features, \

It's a bit odd to me that some features are set by the macro function,
and some are passed as an argument. I'm fine with either but we
shouldn't allow both.

It looks like (for NM clocks at least) we would only need the round
closest flag for pll-video0, so we could maybe just add a new variant of
the macro which will also reduce the changes in that patch.

Maxime


Attachments:
(No filename) (7.16 kB)
signature.asc (235.00 B)
Download all attachments

2023-07-03 07:52:51

by Maxime Ripard

[permalink] [raw]
Subject: Re: [PATCH v3 7/8] clk: sunxi-ng: div: Support finding closest rate

On Sun, 2 Jul 2023 19:55:26 +0200, Frank Oltmanns wrote:
> Add initalization macros for divisor clocks with mux
> (SUNXI_CCU_M_WITH_MUX) to support finding the closest rate. This clock
> type requires the appropriate flags to be set in the .common structure
> (for the mux part of the clock) and the .div part.
>
>
> [ ... ]

Acked-by: Maxime Ripard <[email protected]>

Thanks!
Maxime

2023-07-03 07:54:23

by Maxime Ripard

[permalink] [raw]
Subject: Re: [PATCH v3 6/8] clk: sunxi-ng: mux: Support finding closest rate

On Sun, Jul 02, 2023 at 07:55:25PM +0200, Frank Oltmanns wrote:
> When finding the best rate for a mux clock, consider rates that are
> higher than the requested rate by introducing a new clk_ops structure
> that uses the existing __clk_mux_determine_rate_closest function.
> Furthermore introduce an initialization macro that uses this new
> structure.
>
> Signed-off-by: Frank Oltmanns <[email protected]>
> ---
> drivers/clk/sunxi-ng/ccu_mux.c | 13 +++++++++++++
> drivers/clk/sunxi-ng/ccu_mux.h | 17 +++++++++++++++++
> 2 files changed, 30 insertions(+)
>
> diff --git a/drivers/clk/sunxi-ng/ccu_mux.c b/drivers/clk/sunxi-ng/ccu_mux.c
> index 8594d6a4addd..49a592bdeacf 100644
> --- a/drivers/clk/sunxi-ng/ccu_mux.c
> +++ b/drivers/clk/sunxi-ng/ccu_mux.c
> @@ -264,6 +264,19 @@ static unsigned long ccu_mux_recalc_rate(struct clk_hw *hw,
> parent_rate);
> }
>
> +const struct clk_ops ccu_mux_closest_ops = {
> + .disable = ccu_mux_disable,
> + .enable = ccu_mux_enable,
> + .is_enabled = ccu_mux_is_enabled,
> +
> + .get_parent = ccu_mux_get_parent,
> + .set_parent = ccu_mux_set_parent,
> +
> + .determine_rate = __clk_mux_determine_rate_closest,
> + .recalc_rate = ccu_mux_recalc_rate,
> +};
> +EXPORT_SYMBOL_NS_GPL(ccu_mux_closest_ops, SUNXI_CCU);
> +

This is also a bit inconsistent with the other clocks: most (all?) of
them will simply handle this through a flag, but this one requires a new
set of clk_ops as well?

I think we should create our own wrapper here around
__clk_mux_determine_rate and either call
__clk_mux_determine_rate_closest or __clk_mux_determine_rate depending
on the state of the flags, or call __clk_mux_determine_rate_flags with
the proper flags set for our clock.

The former is probably slightly simpler.

> const struct clk_ops ccu_mux_ops = {
> .disable = ccu_mux_disable,
> .enable = ccu_mux_enable,
> diff --git a/drivers/clk/sunxi-ng/ccu_mux.h b/drivers/clk/sunxi-ng/ccu_mux.h
> index 2c1811a445b0..c4ee14e43719 100644
> --- a/drivers/clk/sunxi-ng/ccu_mux.h
> +++ b/drivers/clk/sunxi-ng/ccu_mux.h
> @@ -46,6 +46,22 @@ struct ccu_mux {
> struct ccu_common common;
> };
>
> +#define SUNXI_CCU_MUX_TABLE_WITH_GATE_CLOSEST(_struct, _name, _parents, _table, \
> + _reg, _shift, _width, _gate, \
> + _flags) \
> + struct ccu_mux _struct = { \
> + .enable = _gate, \
> + .mux = _SUNXI_CCU_MUX_TABLE(_shift, _width, _table), \
> + .common = { \
> + .reg = _reg, \
> + .hw.init = CLK_HW_INIT_PARENTS(_name, \
> + _parents, \
> + &ccu_mux_closest_ops, \
> + _flags), \
> + .features = CCU_FEATURE_CLOSEST_RATE, \
> + } \
> + }
> +

I'm fine with that one, but like we discussed on the NM (I think?) patch
already, this creates some clocks and macros that will use the feature
as a flag, and some will encode it into their name.

Given that we need it here too, I'm really inclined to prefer what you
did there, and thus create a new macro for pll-video0 instead of
modifying the existing one.

Maxime


Attachments:
(No filename) (3.06 kB)
signature.asc (235.00 B)
Download all attachments

2023-07-03 07:59:45

by Maxime Ripard

[permalink] [raw]
Subject: Re: [PATCH v3 0/8] clk: sunxi-ng: Consider alternative parent rates when determining NKM clock rate

Hi,

On Sun, Jul 02, 2023 at 07:55:19PM +0200, Frank Oltmanns wrote:
> Changes in v3:
> - Use dedicated function for finding the best rate in cases where an
> nkm clock supports setting its parent's rate, streamlining it with
> the structure that is used in other sunxi-ng ccus such as ccu_mp
> (PATCH 1).
> - Therefore, remove the now obsolete comments that were introduced in
> v2 (PATCH 1).
> - Remove the dedicated function for calculating the optimal parent rate
> for nkm clocks that was introduced in v2. Instead use a simple
> calculation and require the parent clock to select the closest rate to
> achieve optimal results (PATCH 1).
> - Therefore, add support to set the closest rate for nm clocks (because
> pll-mipi's parent pll-video0 is an nm clock) and all clock types that
> are descendants of a64's pll-video0, i.e., nkm, mux, and div (PATCH 3
> et. seq.).
> - Link to v2: https://lore.kernel.org/all/[email protected]/

Thanks so much for that new version. I know it's been a long discussion,
but it definitely moves in the right direction and we're fairly close to
a final version now.

Maxime


Attachments:
(No filename) (1.17 kB)
signature.asc (235.00 B)
Download all attachments

2023-07-03 08:22:22

by Frank Oltmanns

[permalink] [raw]
Subject: Re: [PATCH v3 1/8] clk: sunxi-ng: nkm: consider alternative parent rates when determining rate


On 2023-07-03 at 08:47:43 +0200, Maxime Ripard <[email protected]> wrote:
> [[PGP Signed Part:Undecided]]
> Hi,
>
> On Sun, Jul 02, 2023 at 07:55:20PM +0200, Frank Oltmanns wrote:
>> In case the CLK_SET_RATE_PARENT flag is set, consider using a different
>> parent rate when determining a new rate.
>>
>> To find the best match for the requested rate, perform the following
>> steps for each NKM combination:
>> - calculate the optimal parent rate,
>> - find the best parent rate that the parent clock actually supports
>> - use that parent rate to calculate the effective rate.
>>
>> In case the clk does not support setting the parent rate, use the same
>> algorithm as before.
>>
>> Signed-off-by: Frank Oltmanns <[email protected]>
>> ---
>> drivers/clk/sunxi-ng/ccu_nkm.c | 48 +++++++++++++++++++++++++++++++++++++++---
>> 1 file changed, 45 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/clk/sunxi-ng/ccu_nkm.c b/drivers/clk/sunxi-ng/ccu_nkm.c
>> index a0978a50edae..d83843e69c25 100644
>> --- a/drivers/clk/sunxi-ng/ccu_nkm.c
>> +++ b/drivers/clk/sunxi-ng/ccu_nkm.c
>> @@ -6,6 +6,7 @@
>>
>> #include <linux/clk-provider.h>
>> #include <linux/io.h>
>> +#include <linux/math.h>
>>
>> #include "ccu_gate.h"
>> #include "ccu_nkm.h"
>> @@ -16,6 +17,44 @@ struct _ccu_nkm {
>> unsigned long m, min_m, max_m;
>> };
>>
>> +static unsigned long ccu_nkm_find_best_with_parent_adj(unsigned long *parent, unsigned long rate,
>> + struct _ccu_nkm *nkm, struct clk_hw *phw)
>
> The usual order in that driver (and Linux in general) would make the
> clk_hw and nkm structure pointers first, and then the parent rate and
> rate.

I'll address that in v4.

>
> But something looks off to me: ccu_nkm_find_best_with_parent_adj takes a
> pointer to the parent rate which makes sense since we're going to modify
> it.
>
>> +{
>> + unsigned long best_rate = 0, best_parent_rate = *parent, tmp_parent = *parent;
>> + unsigned long best_n = 0, best_k = 0, best_m = 0;
>> + unsigned long _n, _k, _m;
>> +
>> + for (_k = nkm->min_k; _k <= nkm->max_k; _k++) {
>> + for (_n = nkm->min_n; _n <= nkm->max_n; _n++) {
>> + for (_m = nkm->min_m; _m <= nkm->max_m; _m++) {
>> + unsigned long tmp_rate;
>> +
>> + tmp_parent = clk_hw_round_rate(phw, rate * _m / (_n * _k));
>> +
>> + tmp_rate = tmp_parent * _n * _k / _m;
>> + if (tmp_rate > rate)
>> + continue;
>> +
>> + if ((rate - tmp_rate) < (rate - best_rate)) {
>> + best_rate = tmp_rate;
>> + best_parent_rate = tmp_parent;
>> + best_n = _n;
>> + best_k = _k;
>> + best_m = _m;
>> + }
>> + }
>> + }
>> + }
>> +
>> + nkm->n = best_n;
>> + nkm->k = best_k;
>> + nkm->m = best_m;
>> +
>> + *parent = best_parent_rate;
>> +
>> + return best_rate;
>> +}
>> +
>> static unsigned long ccu_nkm_find_best(unsigned long parent, unsigned long rate,
>> struct _ccu_nkm *nkm)
>
> You haven't modified ccu_nkm_find_best though, and it still takes the
> parent rate value.
>
>> {
>> @@ -106,7 +145,7 @@ static unsigned long ccu_nkm_recalc_rate(struct clk_hw *hw,
>> }
>>
>> static unsigned long ccu_nkm_round_rate(struct ccu_mux_internal *mux,
>> - struct clk_hw *hw,
>> + struct clk_hw *parent_hw,
>
> (This should be another patch)

Ok, will do in v4.

>
>> unsigned long *parent_rate,
>> unsigned long rate,
>> void *data)
>> @@ -124,7 +163,10 @@ static unsigned long ccu_nkm_round_rate(struct ccu_mux_internal *mux,
>> if (nkm->common.features & CCU_FEATURE_FIXED_POSTDIV)
>> rate *= nkm->fixed_post_div;
>>
>> - rate = ccu_nkm_find_best(*parent_rate, rate, &_nkm);
>
> parent_rate is a pointer, we were dereferencing it to pass its value to
> ccu_nkm_find_best. All good so far.
>
>> + if (!clk_hw_can_set_rate_parent(&nkm->common.hw))
>> + rate = ccu_nkm_find_best(*parent_rate, rate, &_nkm);
>
> Still passing by value
>
>> + else
>> + rate = ccu_nkm_find_best_with_parent_adj(parent_rate, rate, &_nkm, parent_hw);
>
> And passing the pointer there since it takes a pointer. Still good.
>
>>
>> if (nkm->common.features & CCU_FEATURE_FIXED_POSTDIV)
>> rate /= nkm->fixed_post_div;
>> @@ -159,7 +201,7 @@ static int ccu_nkm_set_rate(struct clk_hw *hw, unsigned long rate,
>> _nkm.min_m = 1;
>> _nkm.max_m = nkm->m.max ?: 1 << nkm->m.width;
>>
>> - ccu_nkm_find_best(parent_rate, rate, &_nkm);
>> + ccu_nkm_find_best(&parent_rate, rate, &_nkm);
>
> But here, we're passing a pointer to parent_rate to ccu_nkm_find_best,
> while it's still supposed to take it by value?

Ugh. Yeah, sorry. I had that error in V2 but squashed the correction
into patch 5 instead of patch 1. I'll fix that in v4.

Thanks,
Frank

>
> Maxime
>
> [[End of PGP Signed Part]]

2023-07-03 08:22:46

by Maxime Ripard

[permalink] [raw]
Subject: Re: [PATCH v3 8/8] clk: sunxi-ng: a64: select closest rate for pll-video0

On Sun, Jul 02, 2023 at 07:55:27PM +0200, Frank Oltmanns wrote:
> @@ -541,7 +542,7 @@ static const char * const tcon1_parents[] = { "pll-video0", "pll-video1" };
> static const u8 tcon1_table[] = { 0, 2, };
> static struct ccu_div tcon1_clk = {
> .enable = BIT(31),
> - .div = _SUNXI_CCU_DIV(0, 4),
> + .div = _SUNXI_CCU_DIV_FLAGS(0, 4, CLK_DIVIDER_ROUND_CLOSEST),
> .mux = _SUNXI_CCU_MUX_TABLE(24, 2, tcon1_table),
> .common = {
> .reg = 0x11c,
> @@ -549,6 +550,7 @@ static struct ccu_div tcon1_clk = {
> tcon1_parents,
> &ccu_div_ops,
> CLK_SET_RATE_PARENT),
> + .features = CCU_FEATURE_CLOSEST_RATE,
> },
> };

I'm not super comfortable with having to set it twice for dividers (or
composite clocks). Could we set CLK_DIVIDER_ROUND_CLOSEST automatically
if CCU_FEATURE_CLOSEST_RATE is set?

I'm guessing we would need it for muxes as well?

Maxime


Attachments:
(No filename) (936.00 B)
signature.asc (235.00 B)
Download all attachments

2023-07-03 08:51:20

by Frank Oltmanns

[permalink] [raw]
Subject: Re: [PATCH v3 4/8] clk: sunxi-ng: nm: Support finding closest rate


On 2023-07-03 at 09:24:53 +0200, Maxime Ripard <[email protected]> wrote:
> [[PGP Signed Part:Undecided]]
> Hi,
>
> On Sun, Jul 02, 2023 at 07:55:23PM +0200, Frank Oltmanns wrote:
>> Incorporate a new function, ccu_nm_find_best_closest, that selects the
>> closest possible rate instead of the closest rate that is less than the
>> requested rate. Utilizing rational_best_approximation has the additional
>> effect of boosting performance in clock rate selection.
>>
>> In cases where the CCU_FEATURE_CLOSEST_RATE flag is set, use
>> ccu_nm_find_best_closest instead of the original ccu_nm_find_best
>> function.
>>
>> Extend the macro SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX to allow
>> selecting additional features and update all usages of the macro with no
>> additional flags set.
>>
>> Signed-off-by: Frank Oltmanns <[email protected]>
>> ---
>> drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 6 ++++--
>> drivers/clk/sunxi-ng/ccu-sun8i-h3.c | 3 ++-
>> drivers/clk/sunxi-ng/ccu-sun8i-r40.c | 6 ++++--
>> drivers/clk/sunxi-ng/ccu_nm.c | 23 +++++++++++++++++++++--
>> drivers/clk/sunxi-ng/ccu_nm.h | 6 ++++--
>> 5 files changed, 35 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
>> index a139a5c438d4..ebfab1fdbb96 100644
>> --- a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
>> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
>> @@ -80,7 +80,8 @@ static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX(pll_video0_clk, "pll-video0",
>> 297000000, /* frac rate 1 */
>> BIT(31), /* gate */
>> BIT(28), /* lock */
>> - CLK_SET_RATE_UNGATE);
>> + CLK_SET_RATE_UNGATE,
>> + 0); /* features */
>>
>> static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK(pll_ve_clk, "pll-ve",
>> "osc24M", 0x018,
>> @@ -143,7 +144,8 @@ static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX(pll_video1_clk, "pll-video1",
>> 297000000, /* frac rate 1 */
>> BIT(31), /* gate */
>> BIT(28), /* lock */
>> - CLK_SET_RATE_UNGATE);
>> + CLK_SET_RATE_UNGATE,
>> + 0); /* features */
>>
>> static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK(pll_gpu_clk, "pll-gpu",
>> "osc24M", 0x038,
>> diff --git a/drivers/clk/sunxi-ng/ccu-sun8i-h3.c b/drivers/clk/sunxi-ng/ccu-sun8i-h3.c
>> index bfebe8dbbe65..1e2669648a24 100644
>> --- a/drivers/clk/sunxi-ng/ccu-sun8i-h3.c
>> +++ b/drivers/clk/sunxi-ng/ccu-sun8i-h3.c
>> @@ -76,7 +76,8 @@ static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX(pll_video_clk, "pll-video",
>> 297000000, /* frac rate 1 */
>> BIT(31), /* gate */
>> BIT(28), /* lock */
>> - CLK_SET_RATE_UNGATE);
>> + CLK_SET_RATE_UNGATE,
>> + 0); /* features */
>>
>> static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK(pll_ve_clk, "pll-ve",
>> "osc24M", 0x0018,
>> diff --git a/drivers/clk/sunxi-ng/ccu-sun8i-r40.c b/drivers/clk/sunxi-ng/ccu-sun8i-r40.c
>> index 31eca0d3bc1e..63907b86ce06 100644
>> --- a/drivers/clk/sunxi-ng/ccu-sun8i-r40.c
>> +++ b/drivers/clk/sunxi-ng/ccu-sun8i-r40.c
>> @@ -82,7 +82,8 @@ static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX(pll_video0_clk, "pll-video0",
>> 297000000, /* frac rate 1 */
>> BIT(31), /* gate */
>> BIT(28), /* lock */
>> - CLK_SET_RATE_UNGATE);
>> + CLK_SET_RATE_UNGATE,
>> + 0); /* features */
>>
>> /* TODO: The result of N/M is required to be in [8, 25] range. */
>> static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK(pll_ve_clk, "pll-ve",
>> @@ -169,7 +170,8 @@ static SUNXI_CCU_NM_WITH_FRAC_GATE_LOCK_MIN_MAX(pll_video1_clk, "pll-video1",
>> 297000000, /* frac rate 1 */
>> BIT(31), /* gate */
>> BIT(28), /* lock */
>> - CLK_SET_RATE_UNGATE);
>> + CLK_SET_RATE_UNGATE,
>> + 0); /* features */
>>
>> static struct ccu_nkm pll_sata_clk = {
>> .enable = BIT(31),
>> diff --git a/drivers/clk/sunxi-ng/ccu_nm.c b/drivers/clk/sunxi-ng/ccu_nm.c
>> index c1fd11542c45..97d8d9e3255c 100644
>> --- a/drivers/clk/sunxi-ng/ccu_nm.c
>> +++ b/drivers/clk/sunxi-ng/ccu_nm.c
>> @@ -6,6 +6,7 @@
>>
>> #include <linux/clk-provider.h>
>> #include <linux/io.h>
>> +#include <linux/rational.h>
>>
>> #include "ccu_frac.h"
>> #include "ccu_gate.h"
>> @@ -56,6 +57,18 @@ static unsigned long ccu_nm_find_best(unsigned long parent, unsigned long rate,
>> return best_rate;
>> }
>>
>> +static unsigned long ccu_nm_find_best_closest(unsigned long parent, unsigned long rate,
>> + struct _ccu_nm *nm)
>> +{
>> + unsigned long best_rate = 0;
>> +
>> + rational_best_approximation(rate, parent, nm->max_n, nm->max_m, &nm->n, &nm->m);
>> +
>> + best_rate = ccu_nm_calc_rate(parent, nm->n, nm->m);
>> +
>> + return best_rate;
>> +}
>> +
>> static void ccu_nm_disable(struct clk_hw *hw)
>> {
>> struct ccu_nm *nm = hw_to_ccu_nm(hw);
>> @@ -159,7 +172,10 @@ static long ccu_nm_round_rate(struct clk_hw *hw, unsigned long rate,
>> _nm.min_m = 1;
>> _nm.max_m = nm->m.max ?: 1 << nm->m.width;
>>
>> - rate = ccu_nm_find_best(*parent_rate, rate, &_nm);
>> + if (nm->common.features & CCU_FEATURE_CLOSEST_RATE)
>> + rate = ccu_nm_find_best_closest(*parent_rate, rate, &_nm);
>> + else
>> + rate = ccu_nm_find_best(*parent_rate, rate, &_nm);
>
> So this ends up creating a completely different path and algo for the
> "closest" case, and I'm not super comfortable with that.

What I like about this (and why I proposed it), is that we can use the
same functionality that is proven to work well in other clocks. So, this
would bring us closer to other (not sunxi-ng) clocks, many of which are
using the clk-fractional-divider directly. We can't do that in
sunxi-ng, because we still need to control when overshooting is fine and
when it is not fine. Other frameworks don't make that distinction.

I thougth about if it was possible to scrap most of ccu_nm and replace
it with clk-fractional-divider, but due to this limitation I did not see
a good way forward there. I see the following options:
a. Add a flag to clk-fractional-divider to support clocks that must not
overshoot. --> I don't think, I want to do that.
b. Analyze if it's it's really a requirement for ccu_nm to support
clocks that don't overshoot. --> I don't know who could / would take
up such a task, given that the current structure is proven to work.
c. Add a new clock ccu_fd that uses clk-fractional-divider internally
and must only be used by clock's that allow overshooting - such as
A64's pll-video0. --> This could work as a migration path for
scenario b as well. But it's even more complex than what I'm
proposing in this patch, so I'm not sure how much you'd like that.

Anyway, all of this seems rather involved and therefore, for this
patchset it makes sense to follow your proposal below.

>
> I think you can simplify this a bit by creating a comparison function
> that will take two rates and a set of flags and will behave differently
> depending on whether CCU_FEATURE_CLOSEST_RATE is set, like
> mux_is_better_rate is doing for example.
>
> You'll also avoid code duplication that way that can be shown a bit in
> you subsequent patches.

I'll look into that.

>
>>
>> if (nm->common.features & CCU_FEATURE_FIXED_POSTDIV)
>> rate /= nm->fixed_post_div;
>> @@ -210,7 +226,10 @@ static int ccu_nm_set_rate(struct clk_hw *hw, unsigned long rate,
>> &_nm.m, &_nm.n);
>> } else {
>> ccu_sdm_helper_disable(&nm->common, &nm->sdm);
>> - ccu_nm_find_best(parent_rate, rate, &_nm);
>> + if (nm->common.features & CCU_FEATURE_CLOSEST_RATE)
>> + ccu_nm_find_best_closest(parent_rate, rate, &_nm);
>> + else
>> + ccu_nm_find_best(parent_rate, rate, &_nm);
>> }
>>
>> spin_lock_irqsave(nm->common.lock, flags);
>> diff --git a/drivers/clk/sunxi-ng/ccu_nm.h b/drivers/clk/sunxi-ng/ccu_nm.h
>> index 2904e67f05a8..a3825c4e8d42 100644
>> --- a/drivers/clk/sunxi-ng/ccu_nm.h
>> +++ b/drivers/clk/sunxi-ng/ccu_nm.h
>> @@ -116,7 +116,8 @@ struct ccu_nm {
>> _frac_en, _frac_sel, \
>> _frac_rate_0, \
>> _frac_rate_1, \
>> - _gate, _lock, _flags) \
>> + _gate, _lock, _flags, \
>> + _features) \
>> struct ccu_nm _struct = { \
>> .enable = _gate, \
>> .lock = _lock, \
>> @@ -129,7 +130,8 @@ struct ccu_nm {
>> .max_rate = _max_rate, \
>> .common = { \
>> .reg = _reg, \
>> - .features = CCU_FEATURE_FRACTIONAL, \
>> + .features = CCU_FEATURE_FRACTIONAL | \
>> + _features, \
>
> It's a bit odd to me that some features are set by the macro function,
> and some are passed as an argument. I'm fine with either but we
> shouldn't allow both.
>
> It looks like (for NM clocks at least) we would only need the round
> closest flag for pll-video0, so we could maybe just add a new variant of
> the macro which will also reduce the changes in that patch.

Ok. I'll address that in v4.

>
> Maxime
>
> [[End of PGP Signed Part]]

2023-07-03 09:32:17

by Frank Oltmanns

[permalink] [raw]
Subject: Re: [PATCH v3 6/8] clk: sunxi-ng: mux: Support finding closest rate


On 2023-07-03 at 09:38:48 +0200, Maxime Ripard <[email protected]> wrote:
> [[PGP Signed Part:Undecided]]
> On Sun, Jul 02, 2023 at 07:55:25PM +0200, Frank Oltmanns wrote:
>> When finding the best rate for a mux clock, consider rates that are
>> higher than the requested rate by introducing a new clk_ops structure
>> that uses the existing __clk_mux_determine_rate_closest function.
>> Furthermore introduce an initialization macro that uses this new
>> structure.
>>
>> Signed-off-by: Frank Oltmanns <[email protected]>
>> ---
>> drivers/clk/sunxi-ng/ccu_mux.c | 13 +++++++++++++
>> drivers/clk/sunxi-ng/ccu_mux.h | 17 +++++++++++++++++
>> 2 files changed, 30 insertions(+)
>>
>> diff --git a/drivers/clk/sunxi-ng/ccu_mux.c b/drivers/clk/sunxi-ng/ccu_mux.c
>> index 8594d6a4addd..49a592bdeacf 100644
>> --- a/drivers/clk/sunxi-ng/ccu_mux.c
>> +++ b/drivers/clk/sunxi-ng/ccu_mux.c
>> @@ -264,6 +264,19 @@ static unsigned long ccu_mux_recalc_rate(struct clk_hw *hw,
>> parent_rate);
>> }
>>
>> +const struct clk_ops ccu_mux_closest_ops = {
>> + .disable = ccu_mux_disable,
>> + .enable = ccu_mux_enable,
>> + .is_enabled = ccu_mux_is_enabled,
>> +
>> + .get_parent = ccu_mux_get_parent,
>> + .set_parent = ccu_mux_set_parent,
>> +
>> + .determine_rate = __clk_mux_determine_rate_closest,
>> + .recalc_rate = ccu_mux_recalc_rate,
>> +};
>> +EXPORT_SYMBOL_NS_GPL(ccu_mux_closest_ops, SUNXI_CCU);
>> +
>
> This is also a bit inconsistent with the other clocks: most (all?) of
> them will simply handle this through a flag, but this one requires a new
> set of clk_ops as well?
>
> I think we should create our own wrapper here around
> __clk_mux_determine_rate and either call
> __clk_mux_determine_rate_closest or __clk_mux_determine_rate depending
> on the state of the flags, or call __clk_mux_determine_rate_flags with
> the proper flags set for our clock.
>
> The former is probably slightly simpler.

Ok, I will address that in v4.

>
>> const struct clk_ops ccu_mux_ops = {
>> .disable = ccu_mux_disable,
>> .enable = ccu_mux_enable,
>> diff --git a/drivers/clk/sunxi-ng/ccu_mux.h b/drivers/clk/sunxi-ng/ccu_mux.h
>> index 2c1811a445b0..c4ee14e43719 100644
>> --- a/drivers/clk/sunxi-ng/ccu_mux.h
>> +++ b/drivers/clk/sunxi-ng/ccu_mux.h
>> @@ -46,6 +46,22 @@ struct ccu_mux {
>> struct ccu_common common;
>> };
>>
>> +#define SUNXI_CCU_MUX_TABLE_WITH_GATE_CLOSEST(_struct, _name, _parents, _table, \
>> + _reg, _shift, _width, _gate, \
>> + _flags) \
>> + struct ccu_mux _struct = { \
>> + .enable = _gate, \
>> + .mux = _SUNXI_CCU_MUX_TABLE(_shift, _width, _table), \
>> + .common = { \
>> + .reg = _reg, \
>> + .hw.init = CLK_HW_INIT_PARENTS(_name, \
>> + _parents, \
>> + &ccu_mux_closest_ops, \
>> + _flags), \
>> + .features = CCU_FEATURE_CLOSEST_RATE, \
>> + } \
>> + }
>> +
>
> I'm fine with that one, but like we discussed on the NM (I think?) patch
> already, this creates some clocks and macros that will use the feature
> as a flag, and some will encode it into their name.
>
> Given that we need it here too, I'm really inclined to prefer what you
> did there, and thus create a new macro for pll-video0 instead of
> modifying the existing one.

Ok. Just to be clear: What I did in this patch is fine and I should use
the same approach for NM. Did I get that right?

Thanks,
Frank

>
> Maxime
>
> [[End of PGP Signed Part]]

2023-07-03 09:39:53

by Frank Oltmanns

[permalink] [raw]
Subject: Re: [PATCH v3 8/8] clk: sunxi-ng: a64: select closest rate for pll-video0


On 2023-07-03 at 09:50:05 +0200, Maxime Ripard <[email protected]> wrote:
> [[PGP Signed Part:Undecided]]
> On Sun, Jul 02, 2023 at 07:55:27PM +0200, Frank Oltmanns wrote:
>> @@ -541,7 +542,7 @@ static const char * const tcon1_parents[] = { "pll-video0", "pll-video1" };
>> static const u8 tcon1_table[] = { 0, 2, };
>> static struct ccu_div tcon1_clk = {
>> .enable = BIT(31),
>> - .div = _SUNXI_CCU_DIV(0, 4),
>> + .div = _SUNXI_CCU_DIV_FLAGS(0, 4, CLK_DIVIDER_ROUND_CLOSEST),
>> .mux = _SUNXI_CCU_MUX_TABLE(24, 2, tcon1_table),
>> .common = {
>> .reg = 0x11c,
>> @@ -549,6 +550,7 @@ static struct ccu_div tcon1_clk = {
>> tcon1_parents,
>> &ccu_div_ops,
>> CLK_SET_RATE_PARENT),
>> + .features = CCU_FEATURE_CLOSEST_RATE,
>> },
>> };
>
> I'm not super comfortable with having to set it twice for dividers (or
> composite clocks). Could we set CLK_DIVIDER_ROUND_CLOSEST automatically
> if CCU_FEATURE_CLOSEST_RATE is set?

You're of course right. If I'm not mistaken, I can use
SUNXI_CCU_M_WITH_MUX_TABLE_GATE_CLOSEST that I introduced in div patch
(PATCH 7). Otherwise I'll create a similar macro for use with tcon1.

>
> I'm guessing we would need it for muxes as well?
>

Yes, it's already in the mux and div patches.

Best regards,
Frank

>
> Maxime
>
> [[End of PGP Signed Part]]

2023-07-03 10:03:36

by Frank Oltmanns

[permalink] [raw]
Subject: Re: [PATCH v3 0/8] clk: sunxi-ng: Consider alternative parent rates when determining NKM clock rate


On 2023-07-03 at 09:51:22 +0200, Maxime Ripard <[email protected]> wrote:
> [[PGP Signed Part:Undecided]]
> Hi,
>
> On Sun, Jul 02, 2023 at 07:55:19PM +0200, Frank Oltmanns wrote:
>> Changes in v3:
>> - Use dedicated function for finding the best rate in cases where an
>> nkm clock supports setting its parent's rate, streamlining it with
>> the structure that is used in other sunxi-ng ccus such as ccu_mp
>> (PATCH 1).
>> - Therefore, remove the now obsolete comments that were introduced in
>> v2 (PATCH 1).
>> - Remove the dedicated function for calculating the optimal parent rate
>> for nkm clocks that was introduced in v2. Instead use a simple
>> calculation and require the parent clock to select the closest rate to
>> achieve optimal results (PATCH 1).
>> - Therefore, add support to set the closest rate for nm clocks (because
>> pll-mipi's parent pll-video0 is an nm clock) and all clock types that
>> are descendants of a64's pll-video0, i.e., nkm, mux, and div (PATCH 3
>> et. seq.).
>> - Link to v2: https://lore.kernel.org/all/[email protected]/
>
> Thanks so much for that new version. I know it's been a long discussion,
> but it definitely moves in the right direction and we're fairly close to
> a final version now.
>

I think it was a good discussion. So, thank you for that! I appreciate
your feedback even though we don't always agree. :)

Thanks,
Frank

>
> Maxime
>
> [[End of PGP Signed Part]]

2023-07-03 11:48:32

by Maxime Ripard

[permalink] [raw]
Subject: Re: [PATCH v3 6/8] clk: sunxi-ng: mux: Support finding closest rate

On Mon, Jul 03, 2023 at 11:17:24AM +0200, Frank Oltmanns wrote:
>
> On 2023-07-03 at 09:38:48 +0200, Maxime Ripard <[email protected]> wrote:
> > [[PGP Signed Part:Undecided]]
> > On Sun, Jul 02, 2023 at 07:55:25PM +0200, Frank Oltmanns wrote:
> >> When finding the best rate for a mux clock, consider rates that are
> >> higher than the requested rate by introducing a new clk_ops structure
> >> that uses the existing __clk_mux_determine_rate_closest function.
> >> Furthermore introduce an initialization macro that uses this new
> >> structure.
> >>
> >> Signed-off-by: Frank Oltmanns <[email protected]>
> >> ---
> >> drivers/clk/sunxi-ng/ccu_mux.c | 13 +++++++++++++
> >> drivers/clk/sunxi-ng/ccu_mux.h | 17 +++++++++++++++++
> >> 2 files changed, 30 insertions(+)
> >>
> >> diff --git a/drivers/clk/sunxi-ng/ccu_mux.c b/drivers/clk/sunxi-ng/ccu_mux.c
> >> index 8594d6a4addd..49a592bdeacf 100644
> >> --- a/drivers/clk/sunxi-ng/ccu_mux.c
> >> +++ b/drivers/clk/sunxi-ng/ccu_mux.c
> >> @@ -264,6 +264,19 @@ static unsigned long ccu_mux_recalc_rate(struct clk_hw *hw,
> >> parent_rate);
> >> }
> >>
> >> +const struct clk_ops ccu_mux_closest_ops = {
> >> + .disable = ccu_mux_disable,
> >> + .enable = ccu_mux_enable,
> >> + .is_enabled = ccu_mux_is_enabled,
> >> +
> >> + .get_parent = ccu_mux_get_parent,
> >> + .set_parent = ccu_mux_set_parent,
> >> +
> >> + .determine_rate = __clk_mux_determine_rate_closest,
> >> + .recalc_rate = ccu_mux_recalc_rate,
> >> +};
> >> +EXPORT_SYMBOL_NS_GPL(ccu_mux_closest_ops, SUNXI_CCU);
> >> +
> >
> > This is also a bit inconsistent with the other clocks: most (all?) of
> > them will simply handle this through a flag, but this one requires a new
> > set of clk_ops as well?
> >
> > I think we should create our own wrapper here around
> > __clk_mux_determine_rate and either call
> > __clk_mux_determine_rate_closest or __clk_mux_determine_rate depending
> > on the state of the flags, or call __clk_mux_determine_rate_flags with
> > the proper flags set for our clock.
> >
> > The former is probably slightly simpler.
>
> Ok, I will address that in v4.
>
> >
> >> const struct clk_ops ccu_mux_ops = {
> >> .disable = ccu_mux_disable,
> >> .enable = ccu_mux_enable,
> >> diff --git a/drivers/clk/sunxi-ng/ccu_mux.h b/drivers/clk/sunxi-ng/ccu_mux.h
> >> index 2c1811a445b0..c4ee14e43719 100644
> >> --- a/drivers/clk/sunxi-ng/ccu_mux.h
> >> +++ b/drivers/clk/sunxi-ng/ccu_mux.h
> >> @@ -46,6 +46,22 @@ struct ccu_mux {
> >> struct ccu_common common;
> >> };
> >>
> >> +#define SUNXI_CCU_MUX_TABLE_WITH_GATE_CLOSEST(_struct, _name, _parents, _table, \
> >> + _reg, _shift, _width, _gate, \
> >> + _flags) \
> >> + struct ccu_mux _struct = { \
> >> + .enable = _gate, \
> >> + .mux = _SUNXI_CCU_MUX_TABLE(_shift, _width, _table), \
> >> + .common = { \
> >> + .reg = _reg, \
> >> + .hw.init = CLK_HW_INIT_PARENTS(_name, \
> >> + _parents, \
> >> + &ccu_mux_closest_ops, \
> >> + _flags), \
> >> + .features = CCU_FEATURE_CLOSEST_RATE, \
> >> + } \
> >> + }
> >> +
> >
> > I'm fine with that one, but like we discussed on the NM (I think?) patch
> > already, this creates some clocks and macros that will use the feature
> > as a flag, and some will encode it into their name.
> >
> > Given that we need it here too, I'm really inclined to prefer what you
> > did there, and thus create a new macro for pll-video0 instead of
> > modifying the existing one.
>
> Ok. Just to be clear: What I did in this patch is fine and I should use
> the same approach for NM. Did I get that right?

Yes. Leave the NM macro as it is, and add a new _CLOSEST variant that
sets the flag like you did there.

Maxime


Attachments:
(No filename) (3.76 kB)
signature.asc (235.00 B)
Download all attachments