These patches optimize the cie1913() implementation by using fewer 64
bit divisions and multiplications. It also contains a minor fix for
the linear constant used.
v2:
- Drop patch 5.
- Fix thinko in patch 4, otherwise no code change.
- Better changelog in patch 3.
- Add Daniel's Reviewed-By to the four patches.
Daniel, I took the liberty of adding your R-B to patch 4 despite
changing it a little to fix a thinko - I should add 1<<31 and not
1<<15. Please tell me if that was inappropriate.
Rasmus Villemoes (4):
backlight: pwm_bl: fix cie1913 comments and constant
backlight: pwm_bl: eliminate a 64/32 division
backlight: pwm_bl: drop use of int_pow()
backlight: pwm_bl: switch to power-of-2 base for fixed-point math
drivers/video/backlight/pwm_bl.c | 29 +++++++++++++++++++----------
1 file changed, 19 insertions(+), 10 deletions(-)
--
2.20.1
The "break-even" point for the two formulas is L==8, which is also
what the code actually implements. [Incidentally, at that point one
has Y=0.008856, not 0.08856].
Moreover, all the sources I can find say the linear factor is 903.3
rather than 902.3, which makes sense since then the formulas agree at
L==8, both yielding the 0.008856 figure to four significant digits.
Reviewed-by: Daniel Thompson <[email protected]>
Signed-off-by: Rasmus Villemoes <[email protected]>
---
drivers/video/backlight/pwm_bl.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c
index 746eebc411df..cc44a02e95c7 100644
--- a/drivers/video/backlight/pwm_bl.c
+++ b/drivers/video/backlight/pwm_bl.c
@@ -155,8 +155,8 @@ static const struct backlight_ops pwm_backlight_ops = {
*
* The CIE 1931 lightness formula is what actually describes how we perceive
* light:
- * Y = (L* / 902.3) if L* ≤ 0.08856
- * Y = ((L* + 16) / 116)^3 if L* > 0.08856
+ * Y = (L* / 903.3) if L* ≤ 8
+ * Y = ((L* + 16) / 116)^3 if L* > 8
*
* Where Y is the luminance, the amount of light coming out of the screen, and
* is a number between 0.0 and 1.0; and L* is the lightness, how bright a human
@@ -169,9 +169,15 @@ static u64 cie1931(unsigned int lightness, unsigned int scale)
{
u64 retval;
+ /*
+ * @lightness is given as a number between 0 and 1, expressed
+ * as a fixed-point number in scale @scale. Convert to a
+ * percentage, still expressed as a fixed-point number, so the
+ * above formulas can be applied.
+ */
lightness *= 100;
if (lightness <= (8 * scale)) {
- retval = DIV_ROUND_CLOSEST_ULL(lightness * 10, 9023);
+ retval = DIV_ROUND_CLOSEST_ULL(lightness * 10, 9033);
} else {
retval = int_pow((lightness + (16 * scale)) / 116, 3);
retval = DIV_ROUND_CLOSEST_ULL(retval, (scale * scale));
--
2.20.1
Using a power-of-2 instead of power-of-10 base makes the computations
much cheaper. 2^16 is safe; retval never becomes more than 2^48 +
2^32/2. On a 32 bit platform, the very expensive 64/32 division at the
end of cie1931() instead becomes essentially free (a shift by 32 is
just a register rename).
Reviewed-by: Daniel Thompson <[email protected]>
Signed-off-by: Rasmus Villemoes <[email protected]>
---
drivers/video/backlight/pwm_bl.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c
index 273d3fb628a0..a99c2210c935 100644
--- a/drivers/video/backlight/pwm_bl.c
+++ b/drivers/video/backlight/pwm_bl.c
@@ -148,7 +148,8 @@ static const struct backlight_ops pwm_backlight_ops = {
};
#ifdef CONFIG_OF
-#define PWM_LUMINANCE_SCALE 10000 /* luminance scale */
+#define PWM_LUMINANCE_SHIFT 16
+#define PWM_LUMINANCE_SCALE (1 << PWM_LUMINANCE_SHIFT) /* luminance scale */
/*
* CIE lightness to PWM conversion.
@@ -165,23 +166,25 @@ static const struct backlight_ops pwm_backlight_ops = {
* The following function does the fixed point maths needed to implement the
* above formula.
*/
-static u64 cie1931(unsigned int lightness, unsigned int scale)
+static u64 cie1931(unsigned int lightness)
{
u64 retval;
/*
* @lightness is given as a number between 0 and 1, expressed
- * as a fixed-point number in scale @scale. Convert to a
- * percentage, still expressed as a fixed-point number, so the
- * above formulas can be applied.
+ * as a fixed-point number in scale
+ * PWM_LUMINANCE_SCALE. Convert to a percentage, still
+ * expressed as a fixed-point number, so the above formulas
+ * can be applied.
*/
lightness *= 100;
- if (lightness <= (8 * scale)) {
+ if (lightness <= (8 * PWM_LUMINANCE_SCALE)) {
retval = DIV_ROUND_CLOSEST(lightness * 10, 9033);
} else {
- retval = (lightness + (16 * scale)) / 116;
+ retval = (lightness + (16 * PWM_LUMINANCE_SCALE)) / 116;
retval *= retval * retval;
- retval = DIV_ROUND_CLOSEST_ULL(retval, (scale * scale));
+ retval += 1ULL << (2*PWM_LUMINANCE_SHIFT - 1);
+ retval >>= 2*PWM_LUMINANCE_SHIFT;
}
return retval;
@@ -215,8 +218,7 @@ int pwm_backlight_brightness_default(struct device *dev,
/* Fill the table using the cie1931 algorithm */
for (i = 0; i < data->max_brightness; i++) {
retval = cie1931((i * PWM_LUMINANCE_SCALE) /
- data->max_brightness, PWM_LUMINANCE_SCALE) *
- period;
+ data->max_brightness) * period;
retval = DIV_ROUND_CLOSEST_ULL(retval, PWM_LUMINANCE_SCALE);
if (retval > UINT_MAX)
return -EINVAL;
--
2.20.1
lightness*1000 is nowhere near overflowing 32 bits, so we can just use
an ordinary 32/32 division, which is much cheaper than the 64/32 done
via do_div().
Reviewed-by: Daniel Thompson <[email protected]>
Signed-off-by: Rasmus Villemoes <[email protected]>
---
drivers/video/backlight/pwm_bl.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c
index cc44a02e95c7..672c5e7cfcd1 100644
--- a/drivers/video/backlight/pwm_bl.c
+++ b/drivers/video/backlight/pwm_bl.c
@@ -177,7 +177,7 @@ static u64 cie1931(unsigned int lightness, unsigned int scale)
*/
lightness *= 100;
if (lightness <= (8 * scale)) {
- retval = DIV_ROUND_CLOSEST_ULL(lightness * 10, 9033);
+ retval = DIV_ROUND_CLOSEST(lightness * 10, 9033);
} else {
retval = int_pow((lightness + (16 * scale)) / 116, 3);
retval = DIV_ROUND_CLOSEST_ULL(retval, (scale * scale));
--
2.20.1
For a fixed small exponent of 3, it is more efficient to simply use
two explicit multiplications rather than calling the int_pow() library
function: Aside from the function call overhead, its implementation
using repeated squaring means it ends up doing four 64x64
multiplications.
Reviewed-by: Daniel Thompson <[email protected]>
Signed-off-by: Rasmus Villemoes <[email protected]>
---
drivers/video/backlight/pwm_bl.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/video/backlight/pwm_bl.c b/drivers/video/backlight/pwm_bl.c
index 672c5e7cfcd1..273d3fb628a0 100644
--- a/drivers/video/backlight/pwm_bl.c
+++ b/drivers/video/backlight/pwm_bl.c
@@ -179,7 +179,8 @@ static u64 cie1931(unsigned int lightness, unsigned int scale)
if (lightness <= (8 * scale)) {
retval = DIV_ROUND_CLOSEST(lightness * 10, 9033);
} else {
- retval = int_pow((lightness + (16 * scale)) / 116, 3);
+ retval = (lightness + (16 * scale)) / 116;
+ retval *= retval * retval;
retval = DIV_ROUND_CLOSEST_ULL(retval, (scale * scale));
}
--
2.20.1
On Tue, 08 Oct 2019, Rasmus Villemoes wrote:
> The "break-even" point for the two formulas is L==8, which is also
> what the code actually implements. [Incidentally, at that point one
> has Y=0.008856, not 0.08856].
>
> Moreover, all the sources I can find say the linear factor is 903.3
> rather than 902.3, which makes sense since then the formulas agree at
> L==8, both yielding the 0.008856 figure to four significant digits.
>
> Reviewed-by: Daniel Thompson <[email protected]>
> Signed-off-by: Rasmus Villemoes <[email protected]>
> ---
> drivers/video/backlight/pwm_bl.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
Applied, thanks.
--
Lee Jones [李琼斯]
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog
On Tue, 08 Oct 2019, Rasmus Villemoes wrote:
> For a fixed small exponent of 3, it is more efficient to simply use
> two explicit multiplications rather than calling the int_pow() library
> function: Aside from the function call overhead, its implementation
> using repeated squaring means it ends up doing four 64x64
> multiplications.
>
> Reviewed-by: Daniel Thompson <[email protected]>
> Signed-off-by: Rasmus Villemoes <[email protected]>
> ---
> drivers/video/backlight/pwm_bl.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
Applied, thanks.
--
Lee Jones [李琼斯]
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog
On Tue, 08 Oct 2019, Rasmus Villemoes wrote:
> Using a power-of-2 instead of power-of-10 base makes the computations
> much cheaper. 2^16 is safe; retval never becomes more than 2^48 +
> 2^32/2. On a 32 bit platform, the very expensive 64/32 division at the
> end of cie1931() instead becomes essentially free (a shift by 32 is
> just a register rename).
>
> Reviewed-by: Daniel Thompson <[email protected]>
> Signed-off-by: Rasmus Villemoes <[email protected]>
> ---
> drivers/video/backlight/pwm_bl.c | 22 ++++++++++++----------
> 1 file changed, 12 insertions(+), 10 deletions(-)
Applied, thanks.
--
Lee Jones [李琼斯]
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog
On Tue, 08 Oct 2019, Rasmus Villemoes wrote:
> lightness*1000 is nowhere near overflowing 32 bits, so we can just use
> an ordinary 32/32 division, which is much cheaper than the 64/32 done
> via do_div().
>
> Reviewed-by: Daniel Thompson <[email protected]>
> Signed-off-by: Rasmus Villemoes <[email protected]>
> ---
> drivers/video/backlight/pwm_bl.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
Applied, thanks.
--
Lee Jones [李琼斯]
Linaro Services Technical Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog
On 14/10/2019 09.27, Lee Jones wrote:
> Applied, thanks.
I'm not seeing the series in next-20191023, should it be there?
Rasmus