2019-10-16 14:21:19

by Jacopo Mondi

Subject: [PATCH] iio: adc: max9611: Defer probe on POR read

The max9611 driver tests communications with the chip by reading the die
temperature during the probe function. If the temperature register
POR (power-on reset) value is returned from the test read, defer probe to
give the chip a bit more time to properly exit from reset.

Reported-by: Geert Uytterhoeven <[email protected]>
Signed-off-by: Jacopo Mondi <[email protected]>

---
Geert,
I've not been able to reproduce the issue on my boards (M3-N
Salvator-XS and M3-W Salvator-X). Since you reported the issue, you might
be able to reproduce it. Could you please test this?

Also, I opted for deferring probe instead of arbitrarily repeating the
temperature read. What's your opinion?

Thanks
j
---
drivers/iio/adc/max9611.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iio/adc/max9611.c b/drivers/iio/adc/max9611.c
index da073d72f649..30ae5879252c 100644
--- a/drivers/iio/adc/max9611.c
+++ b/drivers/iio/adc/max9611.c
@@ -80,6 +80,7 @@
  * The complete formula to calculate temperature is:
  *     ((adc_read >> 7) * 1000) / (1 / 480 * 1000)
  */
+#define MAX9611_TEMP_POR		0x8000
 #define MAX9611_TEMP_MAX_POS		0x7f80
 #define MAX9611_TEMP_MAX_NEG		0xff80
 #define MAX9611_TEMP_MIN_NEG		0xd980
@@ -480,8 +481,10 @@ static int max9611_init(struct max9611_dev *max9611)
 	if (ret)
 		return ret;

-	regval &= MAX9611_TEMP_MASK;
+	if (regval == MAX9611_TEMP_POR)
+		return -EPROBE_DEFER;

+	regval &= MAX9611_TEMP_MASK;
 	if ((regval > MAX9611_TEMP_MAX_POS &&
 	     regval < MAX9611_TEMP_MIN_NEG) ||
 	     regval > MAX9611_TEMP_MAX_NEG) {
--
2.23.0
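
For readers who want to try the check outside the kernel, the validation
in the hunk above can be approximated as a stand-alone function. This is
only a sketch: the four TEMP_* limits are copied from the patch, but the
value of MAX9611_TEMP_MASK (bits 15..7) and the SKETCH_* errno stand-ins
are assumptions made here for illustration, and max9611_check_temp() is a
hypothetical helper, not a function in the driver.

```c
#include <stdint.h>

/* Values copied from the patch above. */
#define MAX9611_TEMP_POR	0x8000
#define MAX9611_TEMP_MAX_POS	0x7f80
#define MAX9611_TEMP_MAX_NEG	0xff80
#define MAX9611_TEMP_MIN_NEG	0xd980

/* Assumption: the temperature occupies bits 15..7 of the register. */
#define MAX9611_TEMP_MASK	0xff80

/* Assumption: local stand-ins for the kernel's EIO and EPROBE_DEFER. */
#define SKETCH_EIO		5
#define SKETCH_EPROBE_DEFER	517

/*
 * Hypothetical helper mirroring the patched test read: the raw POR
 * value asks for a deferred probe, anything outside the valid
 * temperature window is an I/O error, everything else is accepted.
 */
int max9611_check_temp(uint16_t regval)
{
	/* Compare the raw value first, as the patch does. */
	if (regval == MAX9611_TEMP_POR)
		return -SKETCH_EPROBE_DEFER;

	regval &= MAX9611_TEMP_MASK;
	if ((regval > MAX9611_TEMP_MAX_POS &&
	     regval < MAX9611_TEMP_MIN_NEG) ||
	     regval > MAX9611_TEMP_MAX_NEG)
		return -SKETCH_EIO;

	return 0;
}
```

Under these assumptions only the exact raw value 0x8000 is treated as
"still in reset"; a raw value such as 0x8001 masks down to 0x8000 and is
rejected as out of range instead.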


2019-10-18 15:55:25

by Geert Uytterhoeven

Subject: Re: [PATCH] iio: adc: max9611: Defer probe on POR read

Hi Jacopo,

CC i2c

On Wed, Oct 16, 2019 at 12:23 PM Jacopo Mondi <[email protected]> wrote:
> The max9611 driver tests communications with the chip by reading the die
> temperature during the probe function. If the temperature register
> POR (power-on reset) value is returned from the test read, defer probe to
> give the chip a bit more time to properly exit from reset.
>
> Reported-by: Geert Uytterhoeven <[email protected]>
> Signed-off-by: Jacopo Mondi <[email protected]>

Thanks for your patch!

> Geert,
> I've not been able to reproduce the issue on my boards (M3-N
> Salvator-XS and M3-W Salvator-X). As you reported the issue you might be
> able to reproduce it, could you please test this?

I can reproduce it on Salvator-XS with R-Car H3 ES2.0.
According to my logs, I've seen the issue on all Salvator-X(S) boards,
but not with the same frequency. Probability is highest on H3 ES2.0
(ca. 5% of the boots since I first saw the issue), followed by H3 ES1.0,
M3-W, and M3-N.

After more investigation, my findings are:
  1. I cannot reproduce the issue if the max9611 driver is modular.
     Is it related to using max9611 "too soon" after i2c bus init?
     How can "i2c bus init" impact a slave device?
     Perhaps due to pin configuration, e.g. changing from another pin
     function or GPIO to function i2c4?
  2. Adding a delay at the top of max9611_init() fixes the issue.
     This would explain why the issue is less likely to happen on slower
     SoCs like M3-N.
  3. Disabling all other i2c slaves on i2c4 in DTS fixes the issue.
     Before, max9611 was initialized last, so this moves init earlier,
     contradicting theory #1.
  4. Just disabling the adv7482 (which registers 11 dummy i2c slaves)
     in DTS does not fix the issue.

Unfortunately i2c4 is exposed on a 60-pin Samtec QSH connector only,
for which I have no breakout adapter.

Wolfram: do you have any clues?

> Also, I opted for deferring probe instead of arbitrarily repeating the
> temperature read. What's your opinion?

While this is probably OK if the max9611 driver is built-in, I'm afraid
this may lead to unbounded delays for a reprobe in case the driver
is modular.
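
To make the alternative concrete, a bounded repeat of the temperature
read could look roughly like the sketch below. It is not taken from the
driver: read_temp_bounded(), the two callback types, the retry bound
SKETCH_MAX_TRIES and the errno stand-in are all hypothetical, and the
fake chip exists only so the loop can be exercised in user space.

```c
#include <stdint.h>

#define MAX9611_TEMP_POR	0x8000	/* from the patch */
#define SKETCH_MAX_TRIES	3	/* assumption: arbitrary bound */
#define SKETCH_ETIMEDOUT	110	/* stand-in for the kernel errno */

/* Hypothetical callback standing in for the driver's register read. */
typedef int (*read_temp_fn)(void *ctx, uint16_t *regval);
/* Hypothetical callback standing in for a sleep such as msleep(). */
typedef void (*delay_fn)(void *ctx);

/*
 * Re-read the temperature while the POR value is returned, sleeping
 * between attempts, and give up after SKETCH_MAX_TRIES reads so the
 * total delay stays bounded.
 */
int read_temp_bounded(read_temp_fn read_temp, delay_fn delay,
		      void *ctx, uint16_t *regval)
{
	int i, ret;

	for (i = 0; i < SKETCH_MAX_TRIES; i++) {
		ret = read_temp(ctx, regval);
		if (ret)
			return ret;
		if (*regval != MAX9611_TEMP_POR)
			return 0;
		delay(ctx);
	}

	return -SKETCH_ETIMEDOUT;
}

/* Fake chip for illustration: reports POR a given number of times. */
struct fake_chip {
	int por_reads_left;
	int delays;
};

int fake_read(void *ctx, uint16_t *regval)
{
	struct fake_chip *chip = ctx;

	if (chip->por_reads_left > 0) {
		chip->por_reads_left--;
		*regval = MAX9611_TEMP_POR;
	} else {
		*regval = 0x0c80;	/* arbitrary in-range reading */
	}
	return 0;
}

void fake_delay(void *ctx)
{
	struct fake_chip *chip = ctx;

	chip->delays++;
}
```

The trade-off is the one discussed here: the retry count is arbitrary,
but the worst-case wait is known, whereas a deferred probe leaves the
timing of the next attempt to the driver core.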

> --- a/drivers/iio/adc/max9611.c
> +++ b/drivers/iio/adc/max9611.c
> @@ -80,6 +80,7 @@
> * The complete formula to calculate temperature is:
> * ((adc_read >> 7) * 1000) / (1 / 480 * 1000)
> */
> +#define MAX9611_TEMP_POR 0x8000
> #define MAX9611_TEMP_MAX_POS 0x7f80
> #define MAX9611_TEMP_MAX_NEG 0xff80
> #define MAX9611_TEMP_MIN_NEG 0xd980
> @@ -480,8 +481,10 @@ static int max9611_init(struct max9611_dev *max9611)
> if (ret)
> return ret;
>
> - regval &= MAX9611_TEMP_MASK;
> + if (regval == MAX9611_TEMP_POR)
> + return -EPROBE_DEFER;
>
> + regval &= MAX9611_TEMP_MASK;
> if ((regval > MAX9611_TEMP_MAX_POS &&
> regval < MAX9611_TEMP_MIN_NEG) ||
> regval > MAX9611_TEMP_MAX_NEG) {

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2019-11-10 17:16:26

by Jonathan Cameron

Subject: Re: [PATCH] iio: adc: max9611: Defer probe on POR read


On Thu, 17 Oct 2019 14:55:58 +0200
Geert Uytterhoeven <[email protected]> wrote:

> Hi Jacopo,
>
> CC i2c

Ping. Wolfram, a query in here for you.

Thanks,

Jonathan

>
> On Wed, Oct 16, 2019 at 12:23 PM Jacopo Mondi <[email protected]> wrote:
> > The max9611 driver tests communications with the chip by reading the die
> > temperature during the probe function. If the temperature register
> > POR (power-on reset) value is returned from the test read, defer probe to
> > give the chip a bit more time to properly exit from reset.
> >
> > Reported-by: Geert Uytterhoeven <[email protected]>
> > Signed-off-by: Jacopo Mondi <[email protected]>
>
> Thanks for your patch!
>
> > Geert,
> > I've not been able to reproduce the issue on my boards (M3-N
> > Salvator-XS and M3-W Salvator-X). As you reported the issue you might be
> > able to reproduce it, could you please test this?
>
> I can reproduce it on Salvator-XS with R-Car H3 ES2.0.
> According to my logs, I've seen the issue on all Salvator-X(S) boards,
> but not with the same frequency. Probability is highest on H3 ES2.0
> (ca. 5% of the boots since I first saw the issue), followed by H3 ES1.0,
> M3-W, and M3-N.
>
> After more investigation, my findings are:
> 1. I cannot reproduce the issue if the max9611 driver is modular.
> Is it related to using max9611 "too soon" after i2c bus init?
> How can "i2c bus init" impact a slave device?
> Perhaps due to pin configuration, e.g. changing from another pin
> function or GPIO to function i2c4?
> 2. Adding a delay at the top of max9611_init() fixes the issue.
> > This would explain why the issue is less likely to happen on slower
> SoCs like M3-N.
> 3. Disabling all other i2c slaves on i2c4 in DTS fixes the issue.
> Before, max9611 was initialized last, so this moves init earlier,
> contradicting theory #1.
> 4. Just disabling the adv7482 (which registers 11 dummy i2c slaves)
> in DTS does not fix the issue.
>
> Unfortunately i2c4 is exposed on a 60-pin Samtec QSH connector only,
> for which I have no breakout adapter.
>
> Wolfram: do you have any clues?
>
> > Also, I opted for deferring probe instead of arbitrarily repeating the
> > temperature read. What's your opinion?
>
> While this is probably OK if the max9611 driver is built-in, I'm afraid
> this may lead to unbounded delays for a reprobe in case the driver
> is modular.
>
> > --- a/drivers/iio/adc/max9611.c
> > +++ b/drivers/iio/adc/max9611.c
> > @@ -80,6 +80,7 @@
> > * The complete formula to calculate temperature is:
> > * ((adc_read >> 7) * 1000) / (1 / 480 * 1000)
> > */
> > +#define MAX9611_TEMP_POR 0x8000
> > #define MAX9611_TEMP_MAX_POS 0x7f80
> > #define MAX9611_TEMP_MAX_NEG 0xff80
> > #define MAX9611_TEMP_MIN_NEG 0xd980
> > @@ -480,8 +481,10 @@ static int max9611_init(struct max9611_dev *max9611)
> > if (ret)
> > return ret;
> >
> > - regval &= MAX9611_TEMP_MASK;
> > + if (regval == MAX9611_TEMP_POR)
> > + return -EPROBE_DEFER;
> >
> > + regval &= MAX9611_TEMP_MASK;
> > if ((regval > MAX9611_TEMP_MAX_POS &&
> > regval < MAX9611_TEMP_MIN_NEG) ||
> > regval > MAX9611_TEMP_MAX_NEG) {
>
> Gr{oetje,eeting}s,
>
> Geert
>

2019-11-10 18:56:51

by Geert Uytterhoeven

Subject: Re: [PATCH] iio: adc: max9611: Defer probe on POR read

On Thu, Oct 17, 2019 at 2:55 PM Geert Uytterhoeven <[email protected]> wrote:
> On Wed, Oct 16, 2019 at 12:23 PM Jacopo Mondi <[email protected]> wrote:
> > The max9611 driver tests communications with the chip by reading the die
> > temperature during the probe function. If the temperature register
> > POR (power-on reset) value is returned from the test read, defer probe to
> > give the chip a bit more time to properly exit from reset.
> >
> > Reported-by: Geert Uytterhoeven <[email protected]>
> > Signed-off-by: Jacopo Mondi <[email protected]>
>
> Thanks for your patch!
>
> > Geert,
> > I've not been able to reproduce the issue on my boards (M3-N
> > Salvator-XS and M3-W Salvator-X). As you reported the issue you might be
> > able to reproduce it, could you please test this?
>
> I can reproduce it on Salvator-XS with R-Car H3 ES2.0.
> According to my logs, I've seen the issue on all Salvator-X(S) boards,
> but not with the same frequency. Probability is highest on H3 ES2.0
> (ca. 5% of the boots since I first saw the issue), followed by H3 ES1.0,
> M3-W, and M3-N.
>
> After more investigation, my findings are:
> 1. I cannot reproduce the issue if the max9611 driver is modular.
> Is it related to using max9611 "too soon" after i2c bus init?
> How can "i2c bus init" impact a slave device?
> Perhaps due to pin configuration, e.g. changing from another pin
> function or GPIO to function i2c4?
> 2. Adding a delay at the top of max9611_init() fixes the issue.
> This would explain why the issue is less likely to happen on slower
> SoCs like M3-N.
> 3. Disabling all other i2c slaves on i2c4 in DTS fixes the issue.
> Before, max9611 was initialized last, so this moves init earlier,
> contradicting theory #1.
> 4. Just disabling the adv7482 (which registers 11 dummy i2c slaves)
> in DTS does not fix the issue.
>
> Unfortunately i2c4 is exposed on a 60-pin Samtec QSH connector only,
> for which I have no breakout adapter.

Some soldering fixed that. Still investigating.
Here's a status update:

  A. I can reproduce the issue at 100 kHz instead of 400 kHz.
  B. 3 above doesn't seem to be true: I can reproduce it with all other
     slaves disabled.
  C. The code says:

            /*
             * need a delay here to make register configuration
             * stabilize. 1 msec at least, from empirical testing.
             */
            usleep_range(1000, 2000);

     However, the datasheet says:

         Parameter           MIN     TYP     MAX
         Conversion Time     -       2 ms    -

     So 1 ms is definitely too short.
     Unfortunately the datasheet has no maximum value.

  D. For 2: msleep(1) is sufficient, usleep_range(200, 500) is not.
     And this is still not explained by C.
     I also don't know yet who's resetting the chip on reboot, as it
     does not have a reset line, but all registers are zeroed (except
     for the POR temperature value).

To be investigated more...

Gr{oetje,eeting}s,

Geert


2019-11-13 09:45:13

by Geert Uytterhoeven

Subject: Re: [PATCH] iio: adc: max9611: Defer probe on POR read

On Sun, Nov 10, 2019 at 7:45 PM Geert Uytterhoeven <[email protected]> wrote:
> On Thu, Oct 17, 2019 at 2:55 PM Geert Uytterhoeven <[email protected]> wrote:
> > On Wed, Oct 16, 2019 at 12:23 PM Jacopo Mondi <[email protected]> wrote:
> > > The max9611 driver tests communications with the chip by reading the die
> > > temperature during the probe function. If the temperature register
> > > POR (power-on reset) value is returned from the test read, defer probe to
> > > give the chip a bit more time to properly exit from reset.
> > >
> > > Reported-by: Geert Uytterhoeven <[email protected]>
> > > Signed-off-by: Jacopo Mondi <[email protected]>
> >
> > > I've not been able to reproduce the issue on my boards (M3-N
> > > Salvator-XS and M3-W Salvator-X). As you reported the issue you might be
> > > able to reproduce it, could you please test this?
> >
> > I can reproduce it on Salvator-XS with R-Car H3 ES2.0.
> > According to my logs, I've seen the issue on all Salvator-X(S) boards,
> > but not with the same frequency. Probability is highest on H3 ES2.0
> > (ca. 5% of the boots since I first saw the issue), followed by H3 ES1.0,
> > M3-W, and M3-N.
> >
> > After more investigation, my findings are:
> > 1. I cannot reproduce the issue if the max9611 driver is modular.
> > Is it related to using max9611 "too soon" after i2c bus init?
> > How can "i2c bus init" impact a slave device?
> > Perhaps due to pin configuration, e.g. changing from another pin
> > function or GPIO to function i2c4?

Not true: I managed to reproduce it with a modular driver.

> > 2. Adding a delay at the top of max9611_init() fixes the issue.
> > This would explain why the issue is less likely to happen on slower
> > SoCs like M3-N.
> > 3. Disabling all other i2c slaves on i2c4 in DTS fixes the issue.
> > Before, max9611 was initialized last, so this moves init earlier,
> > contradicting theory #1.
> > 4. Just disabling the adv7482 (which registers 11 dummy i2c slaves)
> > in DTS does not fix the issue.
> >
> > Unfortunately i2c4 is exposed on a 60-pin Samtec QSH connector only,
> > for which I have no breakout adapter.
>
> Some soldering fixed that. Still investigating.
> Here's a status update:
>
>   A. I can reproduce the issue at 100 kHz instead of 400 kHz.
>   B. 3 above doesn't seem to be true: I can reproduce it with all other
>      slaves disabled.
>   C. The code says:
>
>             /*
>              * need a delay here to make register configuration
>              * stabilize. 1 msec at least, from empirical testing.
>              */
>             usleep_range(1000, 2000);
>
>      However, the datasheet says:
>
>          Parameter           MIN     TYP     MAX
>          Conversion Time     -       2 ms    -
>
>      So 1 ms is definitely too short.
>      Unfortunately the datasheet has no maximum value.

usleep_range(1000, 2000) usually results in a sleep time of 2.0 ms: OK
It may take longer: I saw 4.8 -- 7.7 ms (nothing in between 2.0 -- 4.8!): OK
It may take shorter:
- 1.2 -- 1.7 ms: FAIL
- 1.8 ms - 2 ms: OK

So a minimum delay of 2 ms seems like a good value.
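
The reasoning behind that choice can be written down as a small
calculation. This is only an illustrative sketch: the sample table uses
the endpoints of the ranges listed above, and the struct, the array and
both helper functions are hypothetical names, not driver code.

```c
#include <stddef.h>

/* Typical conversion time from the datasheet, as quoted earlier. */
#define MAX9611_CONV_TIME_TYP_US	2000

struct sleep_sample {
	unsigned int elapsed_us;	/* measured sleep duration */
	int read_ok;			/* 1 if the following read was valid */
};

/* Endpoints of the observed ranges reported above. */
const struct sleep_sample observed_sleeps[] = {
	{ 1200, 0 }, { 1700, 0 },
	{ 1800, 1 }, { 2000, 1 },
	{ 4800, 1 }, { 7700, 1 },
};

#define N_OBSERVED \
	(sizeof(observed_sleeps) / sizeof(observed_sleeps[0]))

/* Shortest passing sleep that is longer than every failing one. */
unsigned int shortest_passing_us(const struct sleep_sample *s, size_t n)
{
	unsigned int longest_fail = 0, best = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (!s[i].read_ok && s[i].elapsed_us > longest_fail)
			longest_fail = s[i].elapsed_us;

	for (i = 0; i < n; i++)
		if (s[i].read_ok && s[i].elapsed_us > longest_fail &&
		    (best == 0 || s[i].elapsed_us < best))
			best = s[i].elapsed_us;

	return best;
}

/* Never go below the datasheet value, even if shorter sleeps passed. */
unsigned int min_conversion_delay_us(const struct sleep_sample *s, size_t n)
{
	unsigned int empirical = shortest_passing_us(s, n);

	return empirical > MAX9611_CONV_TIME_TYP_US ?
	       empirical : MAX9611_CONV_TIME_TYP_US;
}
```

With these samples the empirical threshold is 1800 us, and taking the
larger of that and the datasheet's typical 2 ms gives the 2000 us
minimum suggested above.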

> D. For 2: msleep(1) is sufficient, usleep_range(200, 500) is not.
> And this is still not explained by C.

Without adding an msleep() call to max9611_init(), the usleep_range()
call in max9611_read_single() happens at an arbitrary moment.
After adding an msleep() call to max9611_init(), the code becomes
synchronized to the jiffies clock, and the usleep_range() call in
max9611_read_single() never completes in less than 2 ms, thus avoiding
the issue.

> I also don't know yet who's resetting the chip on reboot, as it
> does not have a reset line, but all registers are zeroed (except
> for the POR temperature value).

Looks like the PMIC powers down the +3.3V rail for ca. 25 ms when PSCI
initiates a system reboot.

Patch sent: "[PATCH] iio: adc: max9611: Fix too short conversion time
delay"
(https://lore.kernel.org/lkml/[email protected]/).

Gr{oetje,eeting}s,

Geert
