2017-12-04 05:26:21

by Chen-Yu Tsai

[permalink] [raw]
Subject: [PATCH 0/2] clk: sunxi-ng: sun50i: a64: Add 2x fixed post-divider to MMC module clocks

Hi,

This is a small fix to get MMC performance up to proper speeds on the
A64. According to the BSP kernel, the MMC module clocks have a /2 fixed
post-divider between the clock output and the MMC module, like what
we've seen with the "new MMC timing mode" on the A83T, but the A64 does
not have the mode switch.

Sub-par performance was observed on the Banana Pi M64 eMMC. It only
reached half the read throughput of other Banana Pi boards, using a
standard sequential readout with a large block size. After these
patches, the performance is up to spec.

The A64 can also do DDR transfer modes, but the clock delay config
registers in the MMC module are different from what we've seen so
far. One can just force enable DDR modes without tuning the delays,
and it does work. Proper support for this is left for another time.


ChenYu

Chen-Yu Tsai (2):
clk: sunxi-ng: Support fixed post-dividers on MP style clocks
clk: sunxi-ng: sun50i: a64: Add 2x fixed post-divider to MMC module
clocks

drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 57 +++++++++++++++++++++++------------
drivers/clk/sunxi-ng/ccu_mp.c | 20 ++++++++++--
drivers/clk/sunxi-ng/ccu_mp.h | 24 +++++++++++++++
3 files changed, 79 insertions(+), 22 deletions(-)

--
2.15.0


2017-12-04 05:26:23

by Chen-Yu Tsai

[permalink] [raw]
Subject: [PATCH 1/2] clk: sunxi-ng: Support fixed post-dividers on MP style clocks

On the A64, the MMC module clocks are fixed in the new timing mode,
i.e. they do not have a bit to select the mode. These clocks have
a 2x divider somewhere between the clock and the MMC module.

To be consistent with other SoCs supporting the new timing mode,
we model the 2x divider as a fixed post-divider on the MMC module
clocks.

To do this, we first add fixed post-divider to the MP style clocks,
which the MMC module clocks are.

Signed-off-by: Chen-Yu Tsai <[email protected]>
---
drivers/clk/sunxi-ng/ccu_mp.c | 20 ++++++++++++++++++--
drivers/clk/sunxi-ng/ccu_mp.h | 24 ++++++++++++++++++++++++
2 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/drivers/clk/sunxi-ng/ccu_mp.c b/drivers/clk/sunxi-ng/ccu_mp.c
index 688855e7dc8c..5d0af4051737 100644
--- a/drivers/clk/sunxi-ng/ccu_mp.c
+++ b/drivers/clk/sunxi-ng/ccu_mp.c
@@ -50,12 +50,19 @@ static unsigned long ccu_mp_round_rate(struct ccu_mux_internal *mux,
unsigned int max_m, max_p;
unsigned int m, p;

+ if (cmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
+ rate *= cmp->fixed_post_div;
+
max_m = cmp->m.max ?: 1 << cmp->m.width;
max_p = cmp->p.max ?: 1 << ((1 << cmp->p.width) - 1);

ccu_mp_find_best(*parent_rate, rate, max_m, max_p, &m, &p);
+ rate = *parent_rate / p / m;
+
+ if (cmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
+ rate /= cmp->fixed_post_div;

- return *parent_rate / p / m;
+ return rate;
}

static void ccu_mp_disable(struct clk_hw *hw)
@@ -83,6 +90,7 @@ static unsigned long ccu_mp_recalc_rate(struct clk_hw *hw,
unsigned long parent_rate)
{
struct ccu_mp *cmp = hw_to_ccu_mp(hw);
+ unsigned long rate;
unsigned int m, p;
u32 reg;

@@ -101,7 +109,11 @@ static unsigned long ccu_mp_recalc_rate(struct clk_hw *hw,
p = reg >> cmp->p.shift;
p &= (1 << cmp->p.width) - 1;

- return (parent_rate >> p) / m;
+ rate = (parent_rate >> p) / m;
+ if (cmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
+ rate /= cmp->fixed_post_div;
+
+ return rate;
}

static int ccu_mp_determine_rate(struct clk_hw *hw,
@@ -129,6 +141,10 @@ static int ccu_mp_set_rate(struct clk_hw *hw, unsigned long rate,
max_m = cmp->m.max ?: 1 << cmp->m.width;
max_p = cmp->p.max ?: 1 << ((1 << cmp->p.width) - 1);

+ /* Adjust target rate according to post-dividers */
+ if (cmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
+ rate = rate * cmp->fixed_post_div;
+
ccu_mp_find_best(parent_rate, rate, max_m, max_p, &m, &p);

spin_lock_irqsave(cmp->common.lock, flags);
diff --git a/drivers/clk/sunxi-ng/ccu_mp.h b/drivers/clk/sunxi-ng/ccu_mp.h
index aaef11d747ea..5107635e61de 100644
--- a/drivers/clk/sunxi-ng/ccu_mp.h
+++ b/drivers/clk/sunxi-ng/ccu_mp.h
@@ -33,9 +33,33 @@ struct ccu_mp {
struct ccu_div_internal m;
struct ccu_div_internal p;
struct ccu_mux_internal mux;
+
+ unsigned int fixed_post_div;
+
struct ccu_common common;
};

+#define SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(_struct, _name, _parents, _reg, \
+ _mshift, _mwidth, \
+ _pshift, _pwidth, \
+ _muxshift, _muxwidth, \
+ _gate, _postdiv, _flags) \
+ struct ccu_mp _struct = { \
+ .enable = _gate, \
+ .m = _SUNXI_CCU_DIV(_mshift, _mwidth), \
+ .p = _SUNXI_CCU_DIV(_pshift, _pwidth), \
+ .mux = _SUNXI_CCU_MUX(_muxshift, _muxwidth), \
+ .fixed_post_div = _postdiv, \
+ .common = { \
+ .reg = _reg, \
+ .features = CCU_FEATURE_FIXED_POSTDIV, \
+ .hw.init = CLK_HW_INIT_PARENTS(_name, \
+ _parents, \
+ &ccu_mp_ops, \
+ _flags), \
+ } \
+ }
+
#define SUNXI_CCU_MP_WITH_MUX_GATE(_struct, _name, _parents, _reg, \
_mshift, _mwidth, \
_pshift, _pwidth, \
--
2.15.0

2017-12-04 05:26:18

by Chen-Yu Tsai

[permalink] [raw]
Subject: [PATCH 2/2] clk: sunxi-ng: sun50i: a64: Add 2x fixed post-divider to MMC module clocks

On the A64, the MMC module clocks are fixed in the new timing mode,
i.e. they do not have a bit to select the mode. These clocks have
a 2x divider somewhere between the clock and the MMC module.

To be consistent with other SoCs supporting the new timing mode,
we model the 2x divider as a fixed post-divider on the MMC module
clocks.

This patch adds the post-dividers to the MMC clocks.

Signed-off-by: Chen-Yu Tsai <[email protected]>
---
drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 57 +++++++++++++++++++++++------------
1 file changed, 37 insertions(+), 20 deletions(-)

diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
index 2bb4cabf802f..ee9c12cf3f08 100644
--- a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
+++ b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
@@ -400,28 +400,45 @@ static SUNXI_CCU_MP_WITH_MUX_GATE(nand_clk, "nand", mod0_default_parents, 0x080,
BIT(31), /* gate */
0);

+/*
+ * MMC clocks are the new timing mode (see A83T & H3) variety, but without
+ * the mode switch. This means they have a 2x post divider between the clock
+ * and the MMC module. This is not documented in the manual, but is taken
+ * into consideration when setting the mmc module clocks in the BSP kernel.
+ * Without it, MMC performance is degraded.
+ *
+ * We model it here to be consistent with other SoCs supporting this mode.
+ * The alternative would be to add the 2x multiplier when setting the MMC
+ * module clock in the MMC driver, just for the A64.
+ */
static const char * const mmc_default_parents[] = { "osc24M", "pll-periph0-2x",
"pll-periph1-2x" };
-static SUNXI_CCU_MP_WITH_MUX_GATE(mmc0_clk, "mmc0", mmc_default_parents, 0x088,
- 0, 4, /* M */
- 16, 2, /* P */
- 24, 2, /* mux */
- BIT(31), /* gate */
- 0);
-
-static SUNXI_CCU_MP_WITH_MUX_GATE(mmc1_clk, "mmc1", mmc_default_parents, 0x08c,
- 0, 4, /* M */
- 16, 2, /* P */
- 24, 2, /* mux */
- BIT(31), /* gate */
- 0);
-
-static SUNXI_CCU_MP_WITH_MUX_GATE(mmc2_clk, "mmc2", mmc_default_parents, 0x090,
- 0, 4, /* M */
- 16, 2, /* P */
- 24, 2, /* mux */
- BIT(31), /* gate */
- 0);
+static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc0_clk, "mmc0",
+ mmc_default_parents, 0x088,
+ 0, 4, /* M */
+ 16, 2, /* P */
+ 24, 2, /* mux */
+ BIT(31), /* gate */
+ 2, /* post-div */
+ 0);
+
+static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc1_clk, "mmc1",
+ mmc_default_parents, 0x08c,
+ 0, 4, /* M */
+ 16, 2, /* P */
+ 24, 2, /* mux */
+ BIT(31), /* gate */
+ 2, /* post-div */
+ 0);
+
+static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc2_clk, "mmc2",
+ mmc_default_parents, 0x090,
+ 0, 4, /* M */
+ 16, 2, /* P */
+ 24, 2, /* mux */
+ BIT(31), /* gate */
+ 2, /* post-div */
+ 0);

static const char * const ts_parents[] = { "osc24M", "pll-periph0", };
static SUNXI_CCU_MP_WITH_MUX_GATE(ts_clk, "ts", ts_parents, 0x098,
--
2.15.0

2017-12-04 23:18:41

by Andre Przywara

[permalink] [raw]
Subject: Re: [linux-sunxi] [PATCH 1/2] clk: sunxi-ng: Support fixed post-dividers on MP style clocks

Hi Chen-Yu,

On 04/12/17 05:19, Chen-Yu Tsai wrote:
> On the A64, the MMC module clocks are fixed in the new timing mode,
> i.e. they do not have a bit to select the mode. These clocks have
> a 2x divider somewhere between the clock and the MMC module.
>
> To be consistent with other SoCs supporting the new timing mode,
> we model the 2x divider as a fixed post-divider on the MMC module
> clocks.
>
> To do this, we first add fixed post-divider to the MP style clocks,
> which the MMC module clocks are.
>
> Signed-off-by: Chen-Yu Tsai <[email protected]>
> ---
> drivers/clk/sunxi-ng/ccu_mp.c | 20 ++++++++++++++++++--
> drivers/clk/sunxi-ng/ccu_mp.h | 24 ++++++++++++++++++++++++
> 2 files changed, 42 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/clk/sunxi-ng/ccu_mp.c b/drivers/clk/sunxi-ng/ccu_mp.c
> index 688855e7dc8c..5d0af4051737 100644
> --- a/drivers/clk/sunxi-ng/ccu_mp.c
> +++ b/drivers/clk/sunxi-ng/ccu_mp.c
> @@ -50,12 +50,19 @@ static unsigned long ccu_mp_round_rate(struct ccu_mux_internal *mux,
> unsigned int max_m, max_p;
> unsigned int m, p;
>
> + if (cmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
> + rate *= cmp->fixed_post_div;

Can't you just initialise fixed_post_div to 1 normally and save the
CCU_FEATURE_FIXED_POSTDIV?

> +
> max_m = cmp->m.max ?: 1 << cmp->m.width;
> max_p = cmp->p.max ?: 1 << ((1 << cmp->p.width) - 1);
>
> ccu_mp_find_best(*parent_rate, rate, max_m, max_p, &m, &p);
> + rate = *parent_rate / p / m;
> +
> + if (cmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
> + rate /= cmp->fixed_post_div;
>
> - return *parent_rate / p / m;
> + return rate;
> }
>
> static void ccu_mp_disable(struct clk_hw *hw)
> @@ -83,6 +90,7 @@ static unsigned long ccu_mp_recalc_rate(struct clk_hw *hw,
> unsigned long parent_rate)
> {
> struct ccu_mp *cmp = hw_to_ccu_mp(hw);
> + unsigned long rate;
> unsigned int m, p;
> u32 reg;
>
> @@ -101,7 +109,11 @@ static unsigned long ccu_mp_recalc_rate(struct clk_hw *hw,
> p = reg >> cmp->p.shift;
> p &= (1 << cmp->p.width) - 1;
>
> - return (parent_rate >> p) / m;
> + rate = (parent_rate >> p) / m;
> + if (cmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
> + rate /= cmp->fixed_post_div;
> +
> + return rate;
> }
>
> static int ccu_mp_determine_rate(struct clk_hw *hw,
> @@ -129,6 +141,10 @@ static int ccu_mp_set_rate(struct clk_hw *hw, unsigned long rate,
> max_m = cmp->m.max ?: 1 << cmp->m.width;
> max_p = cmp->p.max ?: 1 << ((1 << cmp->p.width) - 1);
>
> + /* Adjust target rate according to post-dividers */
> + if (cmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
> + rate = rate * cmp->fixed_post_div;
> +
> ccu_mp_find_best(parent_rate, rate, max_m, max_p, &m, &p);
>
> spin_lock_irqsave(cmp->common.lock, flags);
> diff --git a/drivers/clk/sunxi-ng/ccu_mp.h b/drivers/clk/sunxi-ng/ccu_mp.h
> index aaef11d747ea..5107635e61de 100644
> --- a/drivers/clk/sunxi-ng/ccu_mp.h
> +++ b/drivers/clk/sunxi-ng/ccu_mp.h
> @@ -33,9 +33,33 @@ struct ccu_mp {
> struct ccu_div_internal m;
> struct ccu_div_internal p;
> struct ccu_mux_internal mux;
> +
> + unsigned int fixed_post_div;
> +
> struct ccu_common common;
> };
>
> +#define SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(_struct, _name, _parents, _reg, \
> + _mshift, _mwidth, \
> + _pshift, _pwidth, \
> + _muxshift, _muxwidth, \
> + _gate, _postdiv, _flags) \
> + struct ccu_mp _struct = { \
> + .enable = _gate, \
> + .m = _SUNXI_CCU_DIV(_mshift, _mwidth), \
> + .p = _SUNXI_CCU_DIV(_pshift, _pwidth), \
> + .mux = _SUNXI_CCU_MUX(_muxshift, _muxwidth), \
> + .fixed_post_div = _postdiv, \
> + .common = { \
> + .reg = _reg, \
> + .features = CCU_FEATURE_FIXED_POSTDIV, \
> + .hw.init = CLK_HW_INIT_PARENTS(_name, \
> + _parents, \
> + &ccu_mp_ops, \
> + _flags), \
> + } \
> + }
> +

This looks suspiciously like a copy of the macro below. What about you
define the one below as a special case of this new one above?
Should be even more straightforward with defaulting postdiv to 1 and
loosing the feature flags.

Cheers,
Andre.

> #define SUNXI_CCU_MP_WITH_MUX_GATE(_struct, _name, _parents, _reg, \
> _mshift, _mwidth, \
> _pshift, _pwidth, \
>

2017-12-04 23:18:50

by Andre Przywara

[permalink] [raw]
Subject: Re: [linux-sunxi] [PATCH 2/2] clk: sunxi-ng: sun50i: a64: Add 2x fixed post-divider to MMC module clocks

On 04/12/17 05:19, Chen-Yu Tsai wrote:
> On the A64, the MMC module clocks are fixed in the new timing mode,
> i.e. they do not have a bit to select the mode. These clocks have
> a 2x divider somewhere between the clock and the MMC module.
>
> To be consistent with other SoCs supporting the new timing mode,
> we model the 2x divider as a fixed post-divider on the MMC module
> clocks.
>
> This patch adds the post-dividers to the MMC clocks.
>
> Signed-off-by: Chen-Yu Tsai <[email protected]>

Reviewed-by: Andre Przywara <[email protected]>

Cheers,
Andre

> ---
> drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 57 +++++++++++++++++++++++------------
> 1 file changed, 37 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
> index 2bb4cabf802f..ee9c12cf3f08 100644
> --- a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
> @@ -400,28 +400,45 @@ static SUNXI_CCU_MP_WITH_MUX_GATE(nand_clk, "nand", mod0_default_parents, 0x080,
> BIT(31), /* gate */
> 0);
>
> +/*
> + * MMC clocks are the new timing mode (see A83T & H3) variety, but without
> + * the mode switch. This means they have a 2x post divider between the clock
> + * and the MMC module. This is not documented in the manual, but is taken
> + * into consideration when setting the mmc module clocks in the BSP kernel.
> + * Without it, MMC performance is degraded.
> + *
> + * We model it here to be consistent with other SoCs supporting this mode.
> + * The alternative would be to add the 2x multiplier when setting the MMC
> + * module clock in the MMC driver, just for the A64.
> + */
> static const char * const mmc_default_parents[] = { "osc24M", "pll-periph0-2x",
> "pll-periph1-2x" };
> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc0_clk, "mmc0", mmc_default_parents, 0x088,
> - 0, 4, /* M */
> - 16, 2, /* P */
> - 24, 2, /* mux */
> - BIT(31), /* gate */
> - 0);
> -
> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc1_clk, "mmc1", mmc_default_parents, 0x08c,
> - 0, 4, /* M */
> - 16, 2, /* P */
> - 24, 2, /* mux */
> - BIT(31), /* gate */
> - 0);
> -
> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc2_clk, "mmc2", mmc_default_parents, 0x090,
> - 0, 4, /* M */
> - 16, 2, /* P */
> - 24, 2, /* mux */
> - BIT(31), /* gate */
> - 0);
> +static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc0_clk, "mmc0",
> + mmc_default_parents, 0x088,
> + 0, 4, /* M */
> + 16, 2, /* P */
> + 24, 2, /* mux */
> + BIT(31), /* gate */
> + 2, /* post-div */
> + 0);
> +
> +static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc1_clk, "mmc1",
> + mmc_default_parents, 0x08c,
> + 0, 4, /* M */
> + 16, 2, /* P */
> + 24, 2, /* mux */
> + BIT(31), /* gate */
> + 2, /* post-div */
> + 0);
> +
> +static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc2_clk, "mmc2",
> + mmc_default_parents, 0x090,
> + 0, 4, /* M */
> + 16, 2, /* P */
> + 24, 2, /* mux */
> + BIT(31), /* gate */
> + 2, /* post-div */
> + 0);
>
> static const char * const ts_parents[] = { "osc24M", "pll-periph0", };
> static SUNXI_CCU_MP_WITH_MUX_GATE(ts_clk, "ts", ts_parents, 0x098,
>

2017-12-04 23:26:02

by Andre Przywara

[permalink] [raw]
Subject: Re: [linux-sunxi] [PATCH 0/2] clk: sunxi-ng: sun50i: a64: Add 2x fixed post-divider to MMC module clocks

On 04/12/17 05:19, Chen-Yu Tsai wrote:
> Hi,
>
> This is a small fix to get MMC performance up to proper speeds on the

Maybe a small fix for a skilled developer, but a giant leap for all
users ;-)
MMC performance goes from: (4.15-rc1)

SD: Timing buffered disk reads: 36 MB in 3.17 seconds = 11.35 MB/sec
eMMC: Timing buffered disk reads: 66 MB in 3.03 seconds = 21.81 MB/sec

to: (4.15-rc2 plus those two patches)

SD: Timing buffered disk reads: 68 MB in 3.01 seconds = 22.61 MB/sec
eMMC: Timing buffered disk reads: 132 MB in 3.01 seconds = 43.80 MB/sec

So yes, factor of two ...

Tested-by: Andre Przywara <[email protected]>

Given the impact I wonder if this is a candidate for stable as well.

> A64. According to the BSP kernel, the MMC module clocks have a /2 fixed
> post-divider between the clock output and the MMC module, like what
> we've seen with the "new MMC timing mode" on the A83T, but the A64 does
> not have the mode switch.
>
> Sub-par performance was observed on the Banana Pi M64 eMMC. It only
> reached half the read throughput of other Banana Pi boards, using a
> standard sequential readout with a large block size. After these
> patches, the performance is up to spec.
>
> The A64 can also do DDR transfer modes, but the clock delay config
> registers in the MMC module are different from what we've seen so
> far.

But the BSP doesn't set those as well, does it? I mean to remember that
they were all zero, expect for HS200/HS400?

Thanks!
Andre.

One can just force enable DDR modes without tuning the delays,
> and it does work. Proper support for this is left for another time.
>
>
> ChenYu
>
> Chen-Yu Tsai (2):
> clk: sunxi-ng: Support fixed post-dividers on MP style clocks
> clk: sunxi-ng: sun50i: a64: Add 2x fixed post-divider to MMC module
> clocks
>
> drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 57 +++++++++++++++++++++++------------
> drivers/clk/sunxi-ng/ccu_mp.c | 20 ++++++++++--
> drivers/clk/sunxi-ng/ccu_mp.h | 24 +++++++++++++++
> 3 files changed, 79 insertions(+), 22 deletions(-)
>

2017-12-05 03:01:39

by Chen-Yu Tsai

[permalink] [raw]
Subject: Re: [linux-sunxi] [PATCH 1/2] clk: sunxi-ng: Support fixed post-dividers on MP style clocks

On Tue, Dec 5, 2017 at 7:18 AM, AndrĂ© Przywara <[email protected]> wrote:
> Hi Chen-Yu,
>
> On 04/12/17 05:19, Chen-Yu Tsai wrote:
>> On the A64, the MMC module clocks are fixed in the new timing mode,
>> i.e. they do not have a bit to select the mode. These clocks have
>> a 2x divider somewhere between the clock and the MMC module.
>>
>> To be consistent with other SoCs supporting the new timing mode,
>> we model the 2x divider as a fixed post-divider on the MMC module
>> clocks.
>>
>> To do this, we first add fixed post-divider to the MP style clocks,
>> which the MMC module clocks are.
>>
>> Signed-off-by: Chen-Yu Tsai <[email protected]>
>> ---
>> drivers/clk/sunxi-ng/ccu_mp.c | 20 ++++++++++++++++++--
>> drivers/clk/sunxi-ng/ccu_mp.h | 24 ++++++++++++++++++++++++
>> 2 files changed, 42 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/clk/sunxi-ng/ccu_mp.c b/drivers/clk/sunxi-ng/ccu_mp.c
>> index 688855e7dc8c..5d0af4051737 100644
>> --- a/drivers/clk/sunxi-ng/ccu_mp.c
>> +++ b/drivers/clk/sunxi-ng/ccu_mp.c
>> @@ -50,12 +50,19 @@ static unsigned long ccu_mp_round_rate(struct ccu_mux_internal *mux,
>> unsigned int max_m, max_p;
>> unsigned int m, p;
>>
>> + if (cmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
>> + rate *= cmp->fixed_post_div;
>
> Can't you just initialise fixed_post_div to 1 normally and save the
> CCU_FEATURE_FIXED_POSTDIV?

I'll refer to Maxime about this. The feature flag was there from day
one. We only started to implement support for it later on. I'm not
sure if there was a reason to add them as feature flags, instead of
a field that defaults to something (0 even).

Otherwise it's a reasonable change. And we probably don't have to
do a wholesale change for the other clocks in one go. Incidentally
I have a A83T audio series that also adds post-dividers for another
clock type. I'll wait for a conclusion on this end before posting
it.

>
>> +
>> max_m = cmp->m.max ?: 1 << cmp->m.width;
>> max_p = cmp->p.max ?: 1 << ((1 << cmp->p.width) - 1);
>>
>> ccu_mp_find_best(*parent_rate, rate, max_m, max_p, &m, &p);
>> + rate = *parent_rate / p / m;
>> +
>> + if (cmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
>> + rate /= cmp->fixed_post_div;
>>
>> - return *parent_rate / p / m;
>> + return rate;
>> }
>>
>> static void ccu_mp_disable(struct clk_hw *hw)
>> @@ -83,6 +90,7 @@ static unsigned long ccu_mp_recalc_rate(struct clk_hw *hw,
>> unsigned long parent_rate)
>> {
>> struct ccu_mp *cmp = hw_to_ccu_mp(hw);
>> + unsigned long rate;
>> unsigned int m, p;
>> u32 reg;
>>
>> @@ -101,7 +109,11 @@ static unsigned long ccu_mp_recalc_rate(struct clk_hw *hw,
>> p = reg >> cmp->p.shift;
>> p &= (1 << cmp->p.width) - 1;
>>
>> - return (parent_rate >> p) / m;
>> + rate = (parent_rate >> p) / m;
>> + if (cmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
>> + rate /= cmp->fixed_post_div;
>> +
>> + return rate;
>> }
>>
>> static int ccu_mp_determine_rate(struct clk_hw *hw,
>> @@ -129,6 +141,10 @@ static int ccu_mp_set_rate(struct clk_hw *hw, unsigned long rate,
>> max_m = cmp->m.max ?: 1 << cmp->m.width;
>> max_p = cmp->p.max ?: 1 << ((1 << cmp->p.width) - 1);
>>
>> + /* Adjust target rate according to post-dividers */
>> + if (cmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
>> + rate = rate * cmp->fixed_post_div;
>> +
>> ccu_mp_find_best(parent_rate, rate, max_m, max_p, &m, &p);
>>
>> spin_lock_irqsave(cmp->common.lock, flags);
>> diff --git a/drivers/clk/sunxi-ng/ccu_mp.h b/drivers/clk/sunxi-ng/ccu_mp.h
>> index aaef11d747ea..5107635e61de 100644
>> --- a/drivers/clk/sunxi-ng/ccu_mp.h
>> +++ b/drivers/clk/sunxi-ng/ccu_mp.h
>> @@ -33,9 +33,33 @@ struct ccu_mp {
>> struct ccu_div_internal m;
>> struct ccu_div_internal p;
>> struct ccu_mux_internal mux;
>> +
>> + unsigned int fixed_post_div;
>> +
>> struct ccu_common common;
>> };
>>
>> +#define SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(_struct, _name, _parents, _reg, \
>> + _mshift, _mwidth, \
>> + _pshift, _pwidth, \
>> + _muxshift, _muxwidth, \
>> + _gate, _postdiv, _flags) \
>> + struct ccu_mp _struct = { \
>> + .enable = _gate, \
>> + .m = _SUNXI_CCU_DIV(_mshift, _mwidth), \
>> + .p = _SUNXI_CCU_DIV(_pshift, _pwidth), \
>> + .mux = _SUNXI_CCU_MUX(_muxshift, _muxwidth), \
>> + .fixed_post_div = _postdiv, \
>> + .common = { \
>> + .reg = _reg, \
>> + .features = CCU_FEATURE_FIXED_POSTDIV, \
>> + .hw.init = CLK_HW_INIT_PARENTS(_name, \
>> + _parents, \
>> + &ccu_mp_ops, \
>> + _flags), \
>> + } \
>> + }
>> +
>
> This looks suspiciously like a copy of the macro below. What about you
> define the one below as a special case of this new one above?

Correct. But you can't unset the feature flag. So a copy is needed.

> Should be even more straightforward with defaulting postdiv to 1 and
> loosing the feature flags.

See above about the feature flag.

ChenYu

>
> Cheers,
> Andre.
>
>> #define SUNXI_CCU_MP_WITH_MUX_GATE(_struct, _name, _parents, _reg, \
>> _mshift, _mwidth, \
>> _pshift, _pwidth, \
>>
>

2017-12-05 03:05:13

by Chen-Yu Tsai

[permalink] [raw]
Subject: Re: [linux-sunxi] [PATCH 0/2] clk: sunxi-ng: sun50i: a64: Add 2x fixed post-divider to MMC module clocks

On Tue, Dec 5, 2017 at 7:25 AM, AndrĂ© Przywara <[email protected]> wrote:
> On 04/12/17 05:19, Chen-Yu Tsai wrote:
>> Hi,
>>
>> This is a small fix to get MMC performance up to proper speeds on the
>
> Maybe a small fix for a skilled developer, but a giant leap for all
> users ;-)
> MMC performance goes from: (4.15-rc1)
>
> SD: Timing buffered disk reads: 36 MB in 3.17 seconds = 11.35 MB/sec
> eMMC: Timing buffered disk reads: 66 MB in 3.03 seconds = 21.81 MB/sec
>
> to: (4.15-rc2 plus those two patches)
>
> SD: Timing buffered disk reads: 68 MB in 3.01 seconds = 22.61 MB/sec
> eMMC: Timing buffered disk reads: 132 MB in 3.01 seconds = 43.80 MB/sec
>
> So yes, factor of two ...
>
> Tested-by: Andre Przywara <[email protected]>
>
> Given the impact I wonder if this is a candidate for stable as well.

It could. But then again, nothing was broken. And it depends on the first
patch. I'm not sure stable would like that.

>
>> A64. According to the BSP kernel, the MMC module clocks have a /2 fixed
>> post-divider between the clock output and the MMC module, like what
>> we've seen with the "new MMC timing mode" on the A83T, but the A64 does
>> not have the mode switch.
>>
>> Sub-par performance was observed on the Banana Pi M64 eMMC. It only
>> reached half the read throughput of other Banana Pi boards, using a
>> standard sequential readout with a large block size. After these
>> patches, the performance is up to spec.
>>
>> The A64 can also do DDR transfer modes, but the clock delay config
>> registers in the MMC module are different from what we've seen so
>> far.
>
> But the BSP doesn't set those as well, does it? I mean to remember that
> they were all zero, expect for HS200/HS400?

What I meant was we should make it clear in the mmc driver that this
is another new configuration mechanism that is different from the
previous ones.

ChenYu

>
> Thanks!
> Andre.
>
>> One can just force enable DDR modes without tuning the delays,
>> and it does work. Proper support for this is left for another time.
>>
>>
>> ChenYu
>>
>> Chen-Yu Tsai (2):
>> clk: sunxi-ng: Support fixed post-dividers on MP style clocks
>> clk: sunxi-ng: sun50i: a64: Add 2x fixed post-divider to MMC module
>> clocks
>>
>> drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 57 +++++++++++++++++++++++------------
>> drivers/clk/sunxi-ng/ccu_mp.c | 20 ++++++++++--
>> drivers/clk/sunxi-ng/ccu_mp.h | 24 +++++++++++++++
>> 3 files changed, 79 insertions(+), 22 deletions(-)
>>
>

2017-12-05 19:32:56

by Maxime Ripard

[permalink] [raw]
Subject: Re: [linux-sunxi] [PATCH 1/2] clk: sunxi-ng: Support fixed post-dividers on MP style clocks

1;5002;0c
On Tue, Dec 05, 2017 at 11:01:11AM +0800, Chen-Yu Tsai wrote:
> On Tue, Dec 5, 2017 at 7:18 AM, Andr? Przywara <[email protected]> wrote:
> > Hi Chen-Yu,
> >
> > On 04/12/17 05:19, Chen-Yu Tsai wrote:
> >> On the A64, the MMC module clocks are fixed in the new timing mode,
> >> i.e. they do not have a bit to select the mode. These clocks have
> >> a 2x divider somewhere between the clock and the MMC module.
> >>
> >> To be consistent with other SoCs supporting the new timing mode,
> >> we model the 2x divider as a fixed post-divider on the MMC module
> >> clocks.
> >>
> >> To do this, we first add fixed post-divider to the MP style clocks,
> >> which the MMC module clocks are.
> >>
> >> Signed-off-by: Chen-Yu Tsai <[email protected]>
> >> ---
> >> drivers/clk/sunxi-ng/ccu_mp.c | 20 ++++++++++++++++++--
> >> drivers/clk/sunxi-ng/ccu_mp.h | 24 ++++++++++++++++++++++++
> >> 2 files changed, 42 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/clk/sunxi-ng/ccu_mp.c b/drivers/clk/sunxi-ng/ccu_mp.c
> >> index 688855e7dc8c..5d0af4051737 100644
> >> --- a/drivers/clk/sunxi-ng/ccu_mp.c
> >> +++ b/drivers/clk/sunxi-ng/ccu_mp.c
> >> @@ -50,12 +50,19 @@ static unsigned long ccu_mp_round_rate(struct ccu_mux_internal *mux,
> >> unsigned int max_m, max_p;
> >> unsigned int m, p;
> >>
> >> + if (cmp->common.features & CCU_FEATURE_FIXED_POSTDIV)
> >> + rate *= cmp->fixed_post_div;
> >
> > Can't you just initialise fixed_post_div to 1 normally and save the
> > CCU_FEATURE_FIXED_POSTDIV?
>
> I'll refer to Maxime about this. The feature flag was there from day
> one. We only started to implement support for it later on. I'm not
> sure if there was a reason to add them as feature flags, instead of
> a field that defaults to something (0 even).
>
> Otherwise it's a reasonable change. And we probably don't have to
> do a wholesale change for the other clocks in one go. Incidentally
> I have a A83T audio series that also adds post-dividers for another
> clock type. I'll wait for a conclusion on this end before posting
> it.

We can definitely remove that feature flag. However, that would also
mean going over all the clocks we define everywhere to set it to 1,
which is going to be cumbersome and likely to introduce some bugs. I
don't really want to tie those two patches to that effort.

I applied both, thanks!
Maxime

--
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


Attachments:
(No filename) (2.43 kB)
signature.asc (833.00 B)
Download all attachments

2017-12-05 20:00:12

by Maxime Ripard

[permalink] [raw]
Subject: Re: [PATCH 2/2] clk: sunxi-ng: sun50i: a64: Add 2x fixed post-divider to MMC module clocks

Hi,

On Mon, Dec 04, 2017 at 01:19:12PM +0800, Chen-Yu Tsai wrote:
> On the A64, the MMC module clocks are fixed in the new timing mode,
> i.e. they do not have a bit to select the mode. These clocks have
> a 2x divider somewhere between the clock and the MMC module.
>
> To be consistent with other SoCs supporting the new timing mode,
> we model the 2x divider as a fixed post-divider on the MMC module
> clocks.
>
> This patch adds the post-dividers to the MMC clocks.
>
> Signed-off-by: Chen-Yu Tsai <[email protected]>

I had a doubt applying that one... sorry.

> ---
> drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 57 +++++++++++++++++++++++------------
> 1 file changed, 37 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
> index 2bb4cabf802f..ee9c12cf3f08 100644
> --- a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
> @@ -400,28 +400,45 @@ static SUNXI_CCU_MP_WITH_MUX_GATE(nand_clk, "nand", mod0_default_parents, 0x080,
> BIT(31), /* gate */
> 0);
>
> +/*
> + * MMC clocks are the new timing mode (see A83T & H3) variety, but without
> + * the mode switch. This means they have a 2x post divider between the clock
> + * and the MMC module. This is not documented in the manual, but is taken
> + * into consideration when setting the mmc module clocks in the BSP kernel.
> + * Without it, MMC performance is degraded.
> + *
> + * We model it here to be consistent with other SoCs supporting this mode.
> + * The alternative would be to add the 2x multiplier when setting the MMC
> + * module clock in the MMC driver, just for the A64.
> + */
> static const char * const mmc_default_parents[] = { "osc24M", "pll-periph0-2x",
> "pll-periph1-2x" };
> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc0_clk, "mmc0", mmc_default_parents, 0x088,
> - 0, 4, /* M */
> - 16, 2, /* P */
> - 24, 2, /* mux */
> - BIT(31), /* gate */
> - 0);
> -
> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc1_clk, "mmc1", mmc_default_parents, 0x08c,
> - 0, 4, /* M */
> - 16, 2, /* P */
> - 24, 2, /* mux */
> - BIT(31), /* gate */
> - 0);
> -
> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc2_clk, "mmc2", mmc_default_parents, 0x090,
> - 0, 4, /* M */
> - 16, 2, /* P */
> - 24, 2, /* mux */
> - BIT(31), /* gate */
> - 0);
> +static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc0_clk, "mmc0",
> + mmc_default_parents, 0x088,
> + 0, 4, /* M */
> + 16, 2, /* P */
> + 24, 2, /* mux */
> + BIT(31), /* gate */
> + 2, /* post-div */
> + 0);
> +
> +static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc1_clk, "mmc1",
> + mmc_default_parents, 0x08c,
> + 0, 4, /* M */
> + 16, 2, /* P */
> + 24, 2, /* mux */
> + BIT(31), /* gate */
> + 2, /* post-div */
> + 0);
> +

Are you sure that the divider there for the non-eMMC clocks? Usually,
the new mode is only here for the eMMC, so we would divide the rate by
two in the non-eMMC case.

Maxime

--
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


Attachments:
(No filename) (3.11 kB)
signature.asc (833.00 B)
Download all attachments

2017-12-06 02:30:55

by Chen-Yu Tsai

[permalink] [raw]
Subject: Re: [PATCH 2/2] clk: sunxi-ng: sun50i: a64: Add 2x fixed post-divider to MMC module clocks

On Wed, Dec 6, 2017 at 3:59 AM, Maxime Ripard
<[email protected]> wrote:
> Hi,
>
> On Mon, Dec 04, 2017 at 01:19:12PM +0800, Chen-Yu Tsai wrote:
>> On the A64, the MMC module clocks are fixed in the new timing mode,
>> i.e. they do not have a bit to select the mode. These clocks have
>> a 2x divider somewhere between the clock and the MMC module.
>>
>> To be consistent with other SoCs supporting the new timing mode,
>> we model the 2x divider as a fixed post-divider on the MMC module
>> clocks.
>>
>> This patch adds the post-dividers to the MMC clocks.
>>
>> Signed-off-by: Chen-Yu Tsai <[email protected]>
>
> I had a doubt applying that one... sorry.
>
>> ---
>> drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 57 +++++++++++++++++++++++------------
>> 1 file changed, 37 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
>> index 2bb4cabf802f..ee9c12cf3f08 100644
>> --- a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
>> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
>> @@ -400,28 +400,45 @@ static SUNXI_CCU_MP_WITH_MUX_GATE(nand_clk, "nand", mod0_default_parents, 0x080,
>> BIT(31), /* gate */
>> 0);
>>
>> +/*
>> + * MMC clocks are the new timing mode (see A83T & H3) variety, but without
>> + * the mode switch. This means they have a 2x post divider between the clock
>> + * and the MMC module. This is not documented in the manual, but is taken
>> + * into consideration when setting the mmc module clocks in the BSP kernel.
>> + * Without it, MMC performance is degraded.
>> + *
>> + * We model it here to be consistent with other SoCs supporting this mode.
>> + * The alternative would be to add the 2x multiplier when setting the MMC
>> + * module clock in the MMC driver, just for the A64.
>> + */
>> static const char * const mmc_default_parents[] = { "osc24M", "pll-periph0-2x",
>> "pll-periph1-2x" };
>> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc0_clk, "mmc0", mmc_default_parents, 0x088,
>> - 0, 4, /* M */
>> - 16, 2, /* P */
>> - 24, 2, /* mux */
>> - BIT(31), /* gate */
>> - 0);
>> -
>> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc1_clk, "mmc1", mmc_default_parents, 0x08c,
>> - 0, 4, /* M */
>> - 16, 2, /* P */
>> - 24, 2, /* mux */
>> - BIT(31), /* gate */
>> - 0);
>> -
>> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc2_clk, "mmc2", mmc_default_parents, 0x090,
>> - 0, 4, /* M */
>> - 16, 2, /* P */
>> - 24, 2, /* mux */
>> - BIT(31), /* gate */
>> - 0);
>> +static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc0_clk, "mmc0",
>> + mmc_default_parents, 0x088,
>> + 0, 4, /* M */
>> + 16, 2, /* P */
>> + 24, 2, /* mux */
>> + BIT(31), /* gate */
>> + 2, /* post-div */
>> + 0);
>> +
>> +static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc1_clk, "mmc1",
>> + mmc_default_parents, 0x08c,
>> + 0, 4, /* M */
>> + 16, 2, /* P */
>> + 24, 2, /* mux */
>> + BIT(31), /* gate */
>> + 2, /* post-div */
>> + 0);
>> +
>
> Are you sure that the divider there for the non-eMMC clocks? Usually,
> the new mode is only here for the eMMC, so we would divide the rate by
> two in the non-eMMC case.

The new mode is there for all MMC controllers. The other two MMC
controllers even have the old/new timing mode switch. In case you
forgot we have ".need_new_timings" set for the A64 compatible.

But to eliminate any doubts or concerns, I've rerun tests for the
micro SD card, instead of the eMMC. And yes the results are the same,
2x improvement (12 MB/s vs 23.7 MB/s).

ChenYu

2017-12-06 15:56:23

by Maxime Ripard

[permalink] [raw]
Subject: Re: [PATCH 2/2] clk: sunxi-ng: sun50i: a64: Add 2x fixed post-divider to MMC module clocks

Hi,

On Wed, Dec 06, 2017 at 10:30:26AM +0800, Chen-Yu Tsai wrote:
> On Wed, Dec 6, 2017 at 3:59 AM, Maxime Ripard
> <[email protected]> wrote:
> > Hi,
> >
> > On Mon, Dec 04, 2017 at 01:19:12PM +0800, Chen-Yu Tsai wrote:
> >> On the A64, the MMC module clocks are fixed in the new timing mode,
> >> i.e. they do not have a bit to select the mode. These clocks have
> >> a 2x divider somewhere between the clock and the MMC module.
> >>
> >> To be consistent with other SoCs supporting the new timing mode,
> >> we model the 2x divider as a fixed post-divider on the MMC module
> >> clocks.
> >>
> >> This patch adds the post-dividers to the MMC clocks.
> >>
> >> Signed-off-by: Chen-Yu Tsai <[email protected]>
> >
> > I had a doubt applying that one... sorry.
> >
> >> ---
> >> drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 57 +++++++++++++++++++++++------------
> >> 1 file changed, 37 insertions(+), 20 deletions(-)
> >>
> >> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
> >> index 2bb4cabf802f..ee9c12cf3f08 100644
> >> --- a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
> >> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
> >> @@ -400,28 +400,45 @@ static SUNXI_CCU_MP_WITH_MUX_GATE(nand_clk, "nand", mod0_default_parents, 0x080,
> >> BIT(31), /* gate */
> >> 0);
> >>
> >> +/*
> >> + * MMC clocks are the new timing mode (see A83T & H3) variety, but without
> >> + * the mode switch. This means they have a 2x post divider between the clock
> >> + * and the MMC module. This is not documented in the manual, but is taken
> >> + * into consideration when setting the mmc module clocks in the BSP kernel.
> >> + * Without it, MMC performance is degraded.
> >> + *
> >> + * We model it here to be consistent with other SoCs supporting this mode.
> >> + * The alternative would be to add the 2x multiplier when setting the MMC
> >> + * module clock in the MMC driver, just for the A64.
> >> + */
> >> static const char * const mmc_default_parents[] = { "osc24M", "pll-periph0-2x",
> >> "pll-periph1-2x" };
> >> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc0_clk, "mmc0", mmc_default_parents, 0x088,
> >> - 0, 4, /* M */
> >> - 16, 2, /* P */
> >> - 24, 2, /* mux */
> >> - BIT(31), /* gate */
> >> - 0);
> >> -
> >> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc1_clk, "mmc1", mmc_default_parents, 0x08c,
> >> - 0, 4, /* M */
> >> - 16, 2, /* P */
> >> - 24, 2, /* mux */
> >> - BIT(31), /* gate */
> >> - 0);
> >> -
> >> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc2_clk, "mmc2", mmc_default_parents, 0x090,
> >> - 0, 4, /* M */
> >> - 16, 2, /* P */
> >> - 24, 2, /* mux */
> >> - BIT(31), /* gate */
> >> - 0);
> >> +static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc0_clk, "mmc0",
> >> + mmc_default_parents, 0x088,
> >> + 0, 4, /* M */
> >> + 16, 2, /* P */
> >> + 24, 2, /* mux */
> >> + BIT(31), /* gate */
> >> + 2, /* post-div */
> >> + 0);
> >> +
> >> +static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc1_clk, "mmc1",
> >> + mmc_default_parents, 0x08c,
> >> + 0, 4, /* M */
> >> + 16, 2, /* P */
> >> + 24, 2, /* mux */
> >> + BIT(31), /* gate */
> >> + 2, /* post-div */
> >> + 0);
> >> +
> >
> > Are you sure that the divider there for the non-eMMC clocks? Usually,
> > the new mode is only here for the eMMC, so we would divide the rate by
> > two in the non-eMMC case.
>
> The new mode is there for all MMC controllers. The other two MMC
> controllers even have the old/new timing mode switch. In case you
> forgot we have ".need_new_timings" set for the A64 compatible.

But then, shouldn't we model them as such, using the work you did on
the A83t clocks?

> But to eliminate any doubts or concerns, I've rerun tests for the
> micro SD card, instead of the eMMC. And yes the results are the same,
> 2x improvement (12 MB/s vs 23.7 MB/s).

Ok, good.

Thanks!
Maxime

--
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


Attachments:
(No filename) (5.03 kB)
signature.asc (833.00 B)
Download all attachments

2017-12-06 16:11:24

by Chen-Yu Tsai

[permalink] [raw]
Subject: Re: [PATCH 2/2] clk: sunxi-ng: sun50i: a64: Add 2x fixed post-divider to MMC module clocks

On Wed, Dec 6, 2017 at 11:56 PM, Maxime Ripard
<[email protected]> wrote:
> Hi,
>
> On Wed, Dec 06, 2017 at 10:30:26AM +0800, Chen-Yu Tsai wrote:
>> On Wed, Dec 6, 2017 at 3:59 AM, Maxime Ripard
>> <[email protected]> wrote:
>> > Hi,
>> >
>> > On Mon, Dec 04, 2017 at 01:19:12PM +0800, Chen-Yu Tsai wrote:
>> >> On the A64, the MMC module clocks are fixed in the new timing mode,
>> >> i.e. they do not have a bit to select the mode. These clocks have
>> >> a 2x divider somewhere between the clock and the MMC module.
>> >>
>> >> To be consistent with other SoCs supporting the new timing mode,
>> >> we model the 2x divider as a fixed post-divider on the MMC module
>> >> clocks.
>> >>
>> >> This patch adds the post-dividers to the MMC clocks.
>> >>
>> >> Signed-off-by: Chen-Yu Tsai <[email protected]>
>> >
>> > I had a doubt applying that one... sorry.
>> >
>> >> ---
>> >> drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 57 +++++++++++++++++++++++------------
>> >> 1 file changed, 37 insertions(+), 20 deletions(-)
>> >>
>> >> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
>> >> index 2bb4cabf802f..ee9c12cf3f08 100644
>> >> --- a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
>> >> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
>> >> @@ -400,28 +400,45 @@ static SUNXI_CCU_MP_WITH_MUX_GATE(nand_clk, "nand", mod0_default_parents, 0x080,
>> >> BIT(31), /* gate */
>> >> 0);
>> >>
>> >> +/*
>> >> + * MMC clocks are the new timing mode (see A83T & H3) variety, but without
>> >> + * the mode switch. This means they have a 2x post divider between the clock
>> >> + * and the MMC module. This is not documented in the manual, but is taken
>> >> + * into consideration when setting the mmc module clocks in the BSP kernel.
>> >> + * Without it, MMC performance is degraded.
>> >> + *
>> >> + * We model it here to be consistent with other SoCs supporting this mode.
>> >> + * The alternative would be to add the 2x multiplier when setting the MMC
>> >> + * module clock in the MMC driver, just for the A64.
>> >> + */
>> >> static const char * const mmc_default_parents[] = { "osc24M", "pll-periph0-2x",
>> >> "pll-periph1-2x" };
>> >> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc0_clk, "mmc0", mmc_default_parents, 0x088,
>> >> - 0, 4, /* M */
>> >> - 16, 2, /* P */
>> >> - 24, 2, /* mux */
>> >> - BIT(31), /* gate */
>> >> - 0);
>> >> -
>> >> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc1_clk, "mmc1", mmc_default_parents, 0x08c,
>> >> - 0, 4, /* M */
>> >> - 16, 2, /* P */
>> >> - 24, 2, /* mux */
>> >> - BIT(31), /* gate */
>> >> - 0);
>> >> -
>> >> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc2_clk, "mmc2", mmc_default_parents, 0x090,
>> >> - 0, 4, /* M */
>> >> - 16, 2, /* P */
>> >> - 24, 2, /* mux */
>> >> - BIT(31), /* gate */
>> >> - 0);
>> >> +static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc0_clk, "mmc0",
>> >> + mmc_default_parents, 0x088,
>> >> + 0, 4, /* M */
>> >> + 16, 2, /* P */
>> >> + 24, 2, /* mux */
>> >> + BIT(31), /* gate */
>> >> + 2, /* post-div */
>> >> + 0);
>> >> +
>> >> +static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc1_clk, "mmc1",
>> >> + mmc_default_parents, 0x08c,
>> >> + 0, 4, /* M */
>> >> + 16, 2, /* P */
>> >> + 24, 2, /* mux */
>> >> + BIT(31), /* gate */
>> >> + 2, /* post-div */
>> >> + 0);
>> >> +
>> >
>> > Are you sure that the divider there for the non-eMMC clocks? Usually,
>> > the new mode is only here for the eMMC, so we would divide the rate by
>> > two in the non-eMMC case.
>>
>> The new mode is there for all MMC controllers. The other two MMC
>> controllers even have the old/new timing mode switch. In case you
>> forgot we have ".need_new_timings" set for the A64 compatible.
>
> But then, shouldn't we model them as such, using the work you did on
> the A83t clocks?

On the A64, the clocks don't have the switch. Only the MMC controller
does. On the A83T, both do.

ChenYu

>> But to eliminate any doubts or concerns, I've rerun tests for the
>> micro SD card, instead of the eMMC. And yes the results are the same,
>> 2x improvement (12 MB/s vs 23.7 MB/s).
>
> Ok, good.
>
> Thanks!
> Maxime
>
> --
> Maxime Ripard, Free Electrons
> Embedded Linux and Kernel engineering
> http://free-electrons.com

2017-12-07 09:10:47

by Maxime Ripard

[permalink] [raw]
Subject: Re: [PATCH 2/2] clk: sunxi-ng: sun50i: a64: Add 2x fixed post-divider to MMC module clocks

On Thu, Dec 07, 2017 at 12:10:39AM +0800, Chen-Yu Tsai wrote:
> On Wed, Dec 6, 2017 at 11:56 PM, Maxime Ripard
> <[email protected]> wrote:
> > Hi,
> >
> > On Wed, Dec 06, 2017 at 10:30:26AM +0800, Chen-Yu Tsai wrote:
> >> On Wed, Dec 6, 2017 at 3:59 AM, Maxime Ripard
> >> <[email protected]> wrote:
> >> > Hi,
> >> >
> >> > On Mon, Dec 04, 2017 at 01:19:12PM +0800, Chen-Yu Tsai wrote:
> >> >> On the A64, the MMC module clocks are fixed in the new timing mode,
> >> >> i.e. they do not have a bit to select the mode. These clocks have
> >> >> a 2x divider somewhere between the clock and the MMC module.
> >> >>
> >> >> To be consistent with other SoCs supporting the new timing mode,
> >> >> we model the 2x divider as a fixed post-divider on the MMC module
> >> >> clocks.
> >> >>
> >> >> This patch adds the post-dividers to the MMC clocks.
> >> >>
> >> >> Signed-off-by: Chen-Yu Tsai <[email protected]>
> >> >
> >> > I had a doubt applying that one... sorry.
> >> >
> >> >> ---
> >> >> drivers/clk/sunxi-ng/ccu-sun50i-a64.c | 57 +++++++++++++++++++++++------------
> >> >> 1 file changed, 37 insertions(+), 20 deletions(-)
> >> >>
> >> >> diff --git a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
> >> >> index 2bb4cabf802f..ee9c12cf3f08 100644
> >> >> --- a/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
> >> >> +++ b/drivers/clk/sunxi-ng/ccu-sun50i-a64.c
> >> >> @@ -400,28 +400,45 @@ static SUNXI_CCU_MP_WITH_MUX_GATE(nand_clk, "nand", mod0_default_parents, 0x080,
> >> >> BIT(31), /* gate */
> >> >> 0);
> >> >>
> >> >> +/*
> >> >> + * MMC clocks are the new timing mode (see A83T & H3) variety, but without
> >> >> + * the mode switch. This means they have a 2x post divider between the clock
> >> >> + * and the MMC module. This is not documented in the manual, but is taken
> >> >> + * into consideration when setting the mmc module clocks in the BSP kernel.
> >> >> + * Without it, MMC performance is degraded.
> >> >> + *
> >> >> + * We model it here to be consistent with other SoCs supporting this mode.
> >> >> + * The alternative would be to add the 2x multiplier when setting the MMC
> >> >> + * module clock in the MMC driver, just for the A64.
> >> >> + */
> >> >> static const char * const mmc_default_parents[] = { "osc24M", "pll-periph0-2x",
> >> >> "pll-periph1-2x" };
> >> >> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc0_clk, "mmc0", mmc_default_parents, 0x088,
> >> >> - 0, 4, /* M */
> >> >> - 16, 2, /* P */
> >> >> - 24, 2, /* mux */
> >> >> - BIT(31), /* gate */
> >> >> - 0);
> >> >> -
> >> >> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc1_clk, "mmc1", mmc_default_parents, 0x08c,
> >> >> - 0, 4, /* M */
> >> >> - 16, 2, /* P */
> >> >> - 24, 2, /* mux */
> >> >> - BIT(31), /* gate */
> >> >> - 0);
> >> >> -
> >> >> -static SUNXI_CCU_MP_WITH_MUX_GATE(mmc2_clk, "mmc2", mmc_default_parents, 0x090,
> >> >> - 0, 4, /* M */
> >> >> - 16, 2, /* P */
> >> >> - 24, 2, /* mux */
> >> >> - BIT(31), /* gate */
> >> >> - 0);
> >> >> +static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc0_clk, "mmc0",
> >> >> + mmc_default_parents, 0x088,
> >> >> + 0, 4, /* M */
> >> >> + 16, 2, /* P */
> >> >> + 24, 2, /* mux */
> >> >> + BIT(31), /* gate */
> >> >> + 2, /* post-div */
> >> >> + 0);
> >> >> +
> >> >> +static SUNXI_CCU_MP_WITH_MUX_GATE_POSTDIV(mmc1_clk, "mmc1",
> >> >> + mmc_default_parents, 0x08c,
> >> >> + 0, 4, /* M */
> >> >> + 16, 2, /* P */
> >> >> + 24, 2, /* mux */
> >> >> + BIT(31), /* gate */
> >> >> + 2, /* post-div */
> >> >> + 0);
> >> >> +
> >> >
> >> > Are you sure that the divider there for the non-eMMC clocks? Usually,
> >> > the new mode is only here for the eMMC, so we would divide the rate by
> >> > two in the non-eMMC case.
> >>
> >> The new mode is there for all MMC controllers. The other two MMC
> >> controllers even have the old/new timing mode switch. In case you
> >> forgot we have ".need_new_timings" set for the A64 compatible.
> >
> > But then, shouldn't we model them as such, using the work you did on
> > the A83t clocks?
>
> On the A64, the clocks don't have the switch. Only the MMC controller
> does. On the A83T, both do.

Ah, right. I've applied both patches.

Maxime

--
Maxime Ripard, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com


Attachments:
(No filename) (5.41 kB)
signature.asc (833.00 B)
Download all attachments