2012-05-28 17:36:57

by Torne (Richard Coles)

[permalink] [raw]
Subject: [PATCH V2] MMC: core: cap MMC card timeouts at 2 seconds.

From: "Torne (Richard Coles)" <[email protected]>

MMC CSD info can specify very large, ridiculous timeouts, big enough to
overflow timeout_ns on 32-bit machines. This can result in the card
timing out on every operation because the wrapped timeout value is far
too small.

Fix the overflow by capping the result at 2 seconds. Cards specifying
longer timeouts are almost certainly insane, and host controllers
generally cannot support timeouts that long in any case.

2 seconds should be plenty of time for any card to actually function;
the timeout calculation code is already using 1 second as a "worst case"
timeout for cards running in SPI mode.
---
drivers/mmc/core/core.c | 11 ++++++++++-
1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 0b6141d..3b4a9fc 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -512,7 +512,16 @@ void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card)
if (data->flags & MMC_DATA_WRITE)
mult <<= card->csd.r2w_factor;

- data->timeout_ns = card->csd.tacc_ns * mult;
+ /*
+ * The timeout in nanoseconds may overflow with some cards. Cap it at
+ * two seconds both to avoid the overflow and also because host
+ * controllers cannot generally generate timeouts that long anyway.
+ */
+ if (card->csd.tacc_ns <= (2 * NSEC_PER_SEC) / mult)
+ data->timeout_ns = card->csd.tacc_ns * mult;
+ else
+ data->timeout_ns = 2 * NSEC_PER_SEC;
+
data->timeout_clks = card->csd.tacc_clks * mult;

/*
--
1.7.7.3


2012-05-29 02:32:37

by Ben Hutchings

[permalink] [raw]
Subject: Re: [PATCH V2] MMC: core: cap MMC card timeouts at 2 seconds.

On Mon, 2012-05-28 at 18:31 +0100, Torne (Richard Coles) wrote:
> From: "Torne (Richard Coles)" <[email protected]>
>
> MMC CSD info can specify very large, ridiculous timeouts, big enough to
> overflow timeout_ns on 32-bit machines. This can result in the card
> timing out on every operation because the wrapped timeout value is far
> too small.
>
> Fix the overflow by capping the result at 2 seconds. Cards specifying
> longer timeouts are almost certainly insane, and host controllers
> generally cannot support timeouts that long in any case.
>
> 2 seconds should be plenty of time for any card to actually function;
> the timeout calculation code is already using 1 second as a "worst case"
> timeout for cards running in SPI mode.

Needs a 'Signed-off-by'.

> ---
> drivers/mmc/core/core.c | 11 ++++++++++-
> 1 files changed, 10 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index 0b6141d..3b4a9fc 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -512,7 +512,16 @@ void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card)
> if (data->flags & MMC_DATA_WRITE)
> mult <<= card->csd.r2w_factor;
>
> - data->timeout_ns = card->csd.tacc_ns * mult;
> + /*
> + * The timeout in nanoseconds may overflow with some cards. Cap it at
> + * two seconds both to avoid the overflow and also because host
> + * controllers cannot generally generate timeouts that long anyway.
> + */
> + if (card->csd.tacc_ns <= (2 * NSEC_PER_SEC) / mult)
> + data->timeout_ns = card->csd.tacc_ns * mult;
> + else
> + data->timeout_ns = 2 * NSEC_PER_SEC;

We clearly need to guard against overflow here, and this is the correct
way to clamp the multiplication. I can't speak as to whether 2 seconds
is the right limit.

Ben.

> data->timeout_clks = card->csd.tacc_clks * mult;
>
> /*

--
Ben Hutchings
Teamwork is essential - it allows you to blame someone else.


Attachments:
signature.asc (828.00 B)
This is a digitally signed message part

2012-05-29 09:02:06

by Torne (Richard Coles)

[permalink] [raw]
Subject: Re: [PATCH V2] MMC: core: cap MMC card timeouts at 2 seconds.

On 29 May 2012 03:32, Ben Hutchings <[email protected]> wrote:
> On Mon, 2012-05-28 at 18:31 +0100, Torne (Richard Coles) wrote:
>> From: "Torne (Richard Coles)" <[email protected]>
>>
>> MMC CSD info can specify very large, ridiculous timeouts, big enough to
>> overflow timeout_ns on 32-bit machines. This can result in the card
>> timing out on every operation because the wrapped timeout value is far
>> too small.
>>
>> Fix the overflow by capping the result at 2 seconds. ?Cards specifying
>> longer timeouts are almost certainly insane, and host controllers
>> generally cannot support timeouts that long in any case.
>>
>> 2 seconds should be plenty of time for any card to actually function;
>> the timeout calculation code is already using 1 second as a "worst case"
>> timeout for cards running in SPI mode.
>
> Needs a 'Signed-off-by'.

Oops, sorry. I'm still getting used to the patch workflow. Will resend.

--
Torne (Richard Coles)
[email protected]

2012-05-29 09:02:29

by Torne (Richard Coles)

[permalink] [raw]
Subject: [PATCH] MMC: core: cap MMC card timeouts at 2 seconds.

From: "Torne (Richard Coles)" <[email protected]>

MMC CSD info can specify very large, ridiculous timeouts, big enough to
overflow timeout_ns on 32-bit machines. This can result in the card
timing out on every operation because the wrapped timeout value is far
too small.

Fix the overflow by capping the result at 2 seconds. Cards specifying
longer timeouts are almost certainly insane, and host controllers
generally cannot support timeouts that long in any case.

2 seconds should be plenty of time for any card to actually function;
the timeout calculation code is already using 1 second as a "worst case"
timeout for cards running in SPI mode.

Signed-off-by: Torne (Richard Coles) <[email protected]>
---
drivers/mmc/core/core.c | 11 ++++++++++-
1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 0b6141d..3b4a9fc 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -512,7 +512,16 @@ void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card)
if (data->flags & MMC_DATA_WRITE)
mult <<= card->csd.r2w_factor;

- data->timeout_ns = card->csd.tacc_ns * mult;
+ /*
+ * The timeout in nanoseconds may overflow with some cards. Cap it at
+ * two seconds both to avoid the overflow and also because host
+ * controllers cannot generally generate timeouts that long anyway.
+ */
+ if (card->csd.tacc_ns <= (2 * NSEC_PER_SEC) / mult)
+ data->timeout_ns = card->csd.tacc_ns * mult;
+ else
+ data->timeout_ns = 2 * NSEC_PER_SEC;
+
data->timeout_clks = card->csd.tacc_clks * mult;

/*
--
1.7.7.3

2012-05-31 10:00:47

by Ulf Hansson

[permalink] [raw]
Subject: Re: [PATCH] MMC: core: cap MMC card timeouts at 2 seconds.

Hi Richard,

On 29 May 2012 17:02, Torne (Richard Coles) <[email protected]> wrote:
> From: "Torne (Richard Coles)" <[email protected]>
>
> MMC CSD info can specify very large, ridiculous timeouts, big enough to
> overflow timeout_ns on 32-bit machines. This can result in the card
> timing out on every operation because the wrapped timeout value is far
> too small.
>
> Fix the overflow by capping the result at 2 seconds. ?Cards specifying
> longer timeouts are almost certainly insane, and host controllers
> generally cannot support timeouts that long in any case.
>
> 2 seconds should be plenty of time for any card to actually function;
> the timeout calculation code is already using 1 second as a "worst case"
> timeout for cards running in SPI mode.
>
> Signed-off-by: Torne (Richard Coles) <[email protected]>
> ---
> ?drivers/mmc/core/core.c | ? 11 ++++++++++-
> ?1 files changed, 10 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index 0b6141d..3b4a9fc 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -512,7 +512,16 @@ void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card)
> ? ? ? ?if (data->flags & MMC_DATA_WRITE)
> ? ? ? ? ? ? ? ?mult <<= card->csd.r2w_factor;
>
> - ? ? ? data->timeout_ns = card->csd.tacc_ns * mult;
> + ? ? ? /*
> + ? ? ? ?* The timeout in nanoseconds may overflow with some cards. Cap it at
> + ? ? ? ?* two seconds both to avoid the overflow and also because host
> + ? ? ? ?* controllers cannot generally generate timeouts that long anyway.
> + ? ? ? ?*/
> + ? ? ? if (card->csd.tacc_ns <= (2 * NSEC_PER_SEC) / mult)
> + ? ? ? ? ? ? ? data->timeout_ns = card->csd.tacc_ns * mult;
> + ? ? ? else
> + ? ? ? ? ? ? ? data->timeout_ns = 2 * NSEC_PER_SEC;
> +

The above looks OK to me, although if doing this for tacc_ns we might
want to do this for tacc_clks as well. Could we include that is this
patch as well?

> ? ? ? ?data->timeout_clks = card->csd.tacc_clks * mult;
>
> ? ? ? ?/*
> --
> 1.7.7.3
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to [email protected]
> More majordomo info at ?http://vger.kernel.org/majordomo-info.html

Kind regards
Ulf Hansson

2012-05-31 10:15:19

by Torne (Richard Coles)

[permalink] [raw]
Subject: Re: [PATCH] MMC: core: cap MMC card timeouts at 2 seconds.

On 31 May 2012 11:00, Ulf Hansson <[email protected]> wrote:
> Hi Richard,
>
> On 29 May 2012 17:02, Torne (Richard Coles) <[email protected]> wrote:
>> From: "Torne (Richard Coles)" <[email protected]>
>>
>> MMC CSD info can specify very large, ridiculous timeouts, big enough to
>> overflow timeout_ns on 32-bit machines. This can result in the card
>> timing out on every operation because the wrapped timeout value is far
>> too small.
>>
>> Fix the overflow by capping the result at 2 seconds. ?Cards specifying
>> longer timeouts are almost certainly insane, and host controllers
>> generally cannot support timeouts that long in any case.
>>
>> 2 seconds should be plenty of time for any card to actually function;
>> the timeout calculation code is already using 1 second as a "worst case"
>> timeout for cards running in SPI mode.
>>
>> Signed-off-by: Torne (Richard Coles) <[email protected]>
>> ---
>> ?drivers/mmc/core/core.c | ? 11 ++++++++++-
>> ?1 files changed, 10 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>> index 0b6141d..3b4a9fc 100644
>> --- a/drivers/mmc/core/core.c
>> +++ b/drivers/mmc/core/core.c
>> @@ -512,7 +512,16 @@ void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card)
>> ? ? ? ?if (data->flags & MMC_DATA_WRITE)
>> ? ? ? ? ? ? ? ?mult <<= card->csd.r2w_factor;
>>
>> - ? ? ? data->timeout_ns = card->csd.tacc_ns * mult;
>> + ? ? ? /*
>> + ? ? ? ?* The timeout in nanoseconds may overflow with some cards. Cap it at
>> + ? ? ? ?* two seconds both to avoid the overflow and also because host
>> + ? ? ? ?* controllers cannot generally generate timeouts that long anyway.
>> + ? ? ? ?*/
>> + ? ? ? if (card->csd.tacc_ns <= (2 * NSEC_PER_SEC) / mult)
>> + ? ? ? ? ? ? ? data->timeout_ns = card->csd.tacc_ns * mult;
>> + ? ? ? else
>> + ? ? ? ? ? ? ? data->timeout_ns = 2 * NSEC_PER_SEC;
>> +
>
> The above looks OK to me, although if doing this for tacc_ns we might
> want to do this for tacc_clks as well. Could we include that is this
> patch as well?

timeout_clks can't overflow, that I can see: the highest possible
value for tacc_clks is 25500 (it's an 8 bit field specified in
100-clock increments) and the highest mult is 12800 (r2w_factor only
goes up to 128). This only makes 0x13747800 :)

>> ? ? ? ?data->timeout_clks = card->csd.tacc_clks * mult;
>>
>> ? ? ? ?/*
>> --
>> 1.7.7.3
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
>> the body of a message to [email protected]
>> More majordomo info at ?http://vger.kernel.org/majordomo-info.html
>
> Kind regards
> Ulf Hansson



--
Torne (Richard Coles)
[email protected]

2012-05-31 15:00:34

by Ulf Hansson

[permalink] [raw]
Subject: Re: [PATCH] MMC: core: cap MMC card timeouts at 2 seconds.

Hi Richard,

On 31 May 2012 18:15, Torne (Richard Coles) <[email protected]> wrote:
> On 31 May 2012 11:00, Ulf Hansson <[email protected]> wrote:
>> Hi Richard,
>>
>> On 29 May 2012 17:02, Torne (Richard Coles) <[email protected]> wrote:
>>> From: "Torne (Richard Coles)" <[email protected]>
>>>
>>> MMC CSD info can specify very large, ridiculous timeouts, big enough to
>>> overflow timeout_ns on 32-bit machines. This can result in the card
>>> timing out on every operation because the wrapped timeout value is far
>>> too small.
>>>
>>> Fix the overflow by capping the result at 2 seconds. ?Cards specifying
>>> longer timeouts are almost certainly insane, and host controllers
>>> generally cannot support timeouts that long in any case.
>>>
>>> 2 seconds should be plenty of time for any card to actually function;
>>> the timeout calculation code is already using 1 second as a "worst case"
>>> timeout for cards running in SPI mode.
>>>
>>> Signed-off-by: Torne (Richard Coles) <[email protected]>
>>> ---
>>> ?drivers/mmc/core/core.c | ? 11 ++++++++++-
>>> ?1 files changed, 10 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>> index 0b6141d..3b4a9fc 100644
>>> --- a/drivers/mmc/core/core.c
>>> +++ b/drivers/mmc/core/core.c
>>> @@ -512,7 +512,16 @@ void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card)
>>> ? ? ? ?if (data->flags & MMC_DATA_WRITE)
>>> ? ? ? ? ? ? ? ?mult <<= card->csd.r2w_factor;
>>>
>>> - ? ? ? data->timeout_ns = card->csd.tacc_ns * mult;
>>> + ? ? ? /*
>>> + ? ? ? ?* The timeout in nanoseconds may overflow with some cards. Cap it at
>>> + ? ? ? ?* two seconds both to avoid the overflow and also because host
>>> + ? ? ? ?* controllers cannot generally generate timeouts that long anyway.
>>> + ? ? ? ?*/
>>> + ? ? ? if (card->csd.tacc_ns <= (2 * NSEC_PER_SEC) / mult)
>>> + ? ? ? ? ? ? ? data->timeout_ns = card->csd.tacc_ns * mult;
>>> + ? ? ? else
>>> + ? ? ? ? ? ? ? data->timeout_ns = 2 * NSEC_PER_SEC;
>>> +
>>
>> The above looks OK to me, although if doing this for tacc_ns we might
>> want to do this for tacc_clks as well. Could we include that is this
>> patch as well?
>
> timeout_clks can't overflow, that I can see: the highest possible
> value for tacc_clks is 25500 (it's an 8 bit field specified in
> 100-clock increments) and the highest mult is 12800 (r2w_factor only
> goes up to 128). This only makes 0x13747800 :)
>

Thanks for sorting this out and sorry for me being too lazy to read
the spec. :-)

Acked-by: Ulf Hansson <[email protected]>

Kind regards
Ulf Hansson

2012-06-01 08:36:07

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH V2] MMC: core: cap MMC card timeouts at 2 seconds.

On 29/05/12 05:32, Ben Hutchings wrote:
> On Mon, 2012-05-28 at 18:31 +0100, Torne (Richard Coles) wrote:
>> From: "Torne (Richard Coles)" <[email protected]>
>>
>> MMC CSD info can specify very large, ridiculous timeouts, big enough to
>> overflow timeout_ns on 32-bit machines. This can result in the card
>> timing out on every operation because the wrapped timeout value is far
>> too small.
>>
>> Fix the overflow by capping the result at 2 seconds. Cards specifying
>> longer timeouts are almost certainly insane, and host controllers
>> generally cannot support timeouts that long in any case.
>>
>> 2 seconds should be plenty of time for any card to actually function;
>> the timeout calculation code is already using 1 second as a "worst case"
>> timeout for cards running in SPI mode.
>
> Needs a 'Signed-off-by'.
>
>> ---
>> drivers/mmc/core/core.c | 11 ++++++++++-
>> 1 files changed, 10 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>> index 0b6141d..3b4a9fc 100644
>> --- a/drivers/mmc/core/core.c
>> +++ b/drivers/mmc/core/core.c
>> @@ -512,7 +512,16 @@ void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card)
>> if (data->flags & MMC_DATA_WRITE)
>> mult <<= card->csd.r2w_factor;
>>
>> - data->timeout_ns = card->csd.tacc_ns * mult;
>> + /*
>> + * The timeout in nanoseconds may overflow with some cards. Cap it at
>> + * two seconds both to avoid the overflow and also because host
>> + * controllers cannot generally generate timeouts that long anyway.
>> + */
>> + if (card->csd.tacc_ns <= (2 * NSEC_PER_SEC) / mult)
>> + data->timeout_ns = card->csd.tacc_ns * mult;
>> + else
>> + data->timeout_ns = 2 * NSEC_PER_SEC;
>
> We clearly need to guard against overflow here, and this is the correct
> way to clamp the multiplication. I can't speak as to whether 2 seconds
> is the right limit.

The host controllers I have looked at have a limit of around 2.5 seconds.

But why not just use the size of the type as the limit? e.g.

if (card->csd.tacc_ns <= UINT_MAX / mult)
data->timeout_ns = card->csd.tacc_ns * mult;
else
data->timeout_ns = UINT_MAX;

>
> Ben.
>
>> data->timeout_clks = card->csd.tacc_clks * mult;
>>
>> /*
>

2012-06-01 09:31:18

by Torne (Richard Coles)

[permalink] [raw]
Subject: Re: [PATCH V2] MMC: core: cap MMC card timeouts at 2 seconds.

On 1 June 2012 09:35, Adrian Hunter <[email protected]> wrote:
> On 29/05/12 05:32, Ben Hutchings wrote:
>> On Mon, 2012-05-28 at 18:31 +0100, Torne (Richard Coles) wrote:
>>> From: "Torne (Richard Coles)" <[email protected]>
>>>
>>> MMC CSD info can specify very large, ridiculous timeouts, big enough to
>>> overflow timeout_ns on 32-bit machines. This can result in the card
>>> timing out on every operation because the wrapped timeout value is far
>>> too small.
>>>
>>> Fix the overflow by capping the result at 2 seconds. ?Cards specifying
>>> longer timeouts are almost certainly insane, and host controllers
>>> generally cannot support timeouts that long in any case.
>>>
>>> 2 seconds should be plenty of time for any card to actually function;
>>> the timeout calculation code is already using 1 second as a "worst case"
>>> timeout for cards running in SPI mode.
>>
>> Needs a 'Signed-off-by'.
>>
>>> ---
>>> ?drivers/mmc/core/core.c | ? 11 ++++++++++-
>>> ?1 files changed, 10 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>> index 0b6141d..3b4a9fc 100644
>>> --- a/drivers/mmc/core/core.c
>>> +++ b/drivers/mmc/core/core.c
>>> @@ -512,7 +512,16 @@ void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card)
>>> ? ? ?if (data->flags & MMC_DATA_WRITE)
>>> ? ? ? ? ? ? ?mult <<= card->csd.r2w_factor;
>>>
>>> - ? ?data->timeout_ns = card->csd.tacc_ns * mult;
>>> + ? ?/*
>>> + ? ? * The timeout in nanoseconds may overflow with some cards. Cap it at
>>> + ? ? * two seconds both to avoid the overflow and also because host
>>> + ? ? * controllers cannot generally generate timeouts that long anyway.
>>> + ? ? */
>>> + ? ?if (card->csd.tacc_ns <= (2 * NSEC_PER_SEC) / mult)
>>> + ? ? ? ? ? ?data->timeout_ns = card->csd.tacc_ns * mult;
>>> + ? ?else
>>> + ? ? ? ? ? ?data->timeout_ns = 2 * NSEC_PER_SEC;
>>
>> We clearly need to guard against overflow here, and this is the correct
>> way to clamp the multiplication. ?I can't speak as to whether 2 seconds
>> is the right limit.
>
> The host controllers I have looked at have a limit of around 2.5 seconds.
>
> But why not just use the size of the type as the limit? e.g.
>
> ? ? ? ?if (card->csd.tacc_ns <= UINT_MAX / mult)
> ? ? ? ? ? ? ? ?data->timeout_ns = card->csd.tacc_ns * mult;
> ? ? ? ?else
> ? ? ? ? ? ? ? ?data->timeout_ns = UINT_MAX;

The host controller drivers don't seem to all do a very good job of
preventing further overflows or handling large values correctly
(though some do). sdhci takes the especially annoying additional step
of printk'ing a warning for *every single MMC command* where
data->timeout_ns is larger than the controller can accommodate.
Capping it to a value with a sensible order of magnitude seems to make
it more likely that cards with obviously bogus CSD parameters will
actually work. I don't object to using a larger number for the limit,
but UINT_MAX on a 64-bit system obviously doesn't limit this at all
and will leave you with timeouts up to 17 minutes, which seems
ridiculous :)

My original motivation for this patch is that I have a device with an
eMMC that specifies a 25.5 second timeout, attached to a sdhci host
whose maximum timeout is 2.8 seconds. Originally I proposed a patch to
just remove the warning in sdhci, but nobody replied, and when I
realised there was actually an overflow happening I opted to fix that
instead.

So, yeah, we could use UINT_MAX, but then at minimum I also need to
kill the warning in sdhci to make my device work, and probably all the
host controller drivers need to be checked to make sure they don't use
timeout_ns in a way that can overflow.

I've also just noticed that struct mmc_data's comment for timeout_ns
says /* data timeout (in ns, max 80ms) */ which is not true (the max
is 102.4 seconds if my math is correct), which may have contributed to
the host drivers not being too careful :)

What do you think?

>>
>> Ben.
>>
>>> ? ? ?data->timeout_clks = card->csd.tacc_clks * mult;
>>>
>>> ? ? ?/*
>>
>



--
Torne (Richard Coles)
[email protected]

2012-06-01 09:32:42

by Torne (Richard Coles)

[permalink] [raw]
Subject: Re: [PATCH V2] MMC: core: cap MMC card timeouts at 2 seconds.

On 1 June 2012 10:31, Torne (Richard Coles) <[email protected]> wrote:
> On 1 June 2012 09:35, Adrian Hunter <[email protected]> wrote:
>> On 29/05/12 05:32, Ben Hutchings wrote:
>>> On Mon, 2012-05-28 at 18:31 +0100, Torne (Richard Coles) wrote:
>>>> From: "Torne (Richard Coles)" <[email protected]>
>>>>
>>>> MMC CSD info can specify very large, ridiculous timeouts, big enough to
>>>> overflow timeout_ns on 32-bit machines. This can result in the card
>>>> timing out on every operation because the wrapped timeout value is far
>>>> too small.
>>>>
>>>> Fix the overflow by capping the result at 2 seconds. ?Cards specifying
>>>> longer timeouts are almost certainly insane, and host controllers
>>>> generally cannot support timeouts that long in any case.
>>>>
>>>> 2 seconds should be plenty of time for any card to actually function;
>>>> the timeout calculation code is already using 1 second as a "worst case"
>>>> timeout for cards running in SPI mode.
>>>
>>> Needs a 'Signed-off-by'.
>>>
>>>> ---
>>>> ?drivers/mmc/core/core.c | ? 11 ++++++++++-
>>>> ?1 files changed, 10 insertions(+), 1 deletions(-)
>>>>
>>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>>> index 0b6141d..3b4a9fc 100644
>>>> --- a/drivers/mmc/core/core.c
>>>> +++ b/drivers/mmc/core/core.c
>>>> @@ -512,7 +512,16 @@ void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card)
>>>> ? ? ?if (data->flags & MMC_DATA_WRITE)
>>>> ? ? ? ? ? ? ?mult <<= card->csd.r2w_factor;
>>>>
>>>> - ? ?data->timeout_ns = card->csd.tacc_ns * mult;
>>>> + ? ?/*
>>>> + ? ? * The timeout in nanoseconds may overflow with some cards. Cap it at
>>>> + ? ? * two seconds both to avoid the overflow and also because host
>>>> + ? ? * controllers cannot generally generate timeouts that long anyway.
>>>> + ? ? */
>>>> + ? ?if (card->csd.tacc_ns <= (2 * NSEC_PER_SEC) / mult)
>>>> + ? ? ? ? ? ?data->timeout_ns = card->csd.tacc_ns * mult;
>>>> + ? ?else
>>>> + ? ? ? ? ? ?data->timeout_ns = 2 * NSEC_PER_SEC;
>>>
>>> We clearly need to guard against overflow here, and this is the correct
>>> way to clamp the multiplication. ?I can't speak as to whether 2 seconds
>>> is the right limit.
>>
>> The host controllers I have looked at have a limit of around 2.5 seconds.
>>
>> But why not just use the size of the type as the limit? e.g.
>>
>> ? ? ? ?if (card->csd.tacc_ns <= UINT_MAX / mult)
>> ? ? ? ? ? ? ? ?data->timeout_ns = card->csd.tacc_ns * mult;
>> ? ? ? ?else
>> ? ? ? ? ? ? ? ?data->timeout_ns = UINT_MAX;
>
> The host controller drivers don't seem to all do a very good job of
> preventing further overflows or handling large values correctly
> (though some do). sdhci takes the especially annoying additional step
> of printk'ing a warning for *every single MMC command* where
> data->timeout_ns is larger than the controller can accommodate.
> Capping it to a value with a sensible order of magnitude seems to make
> it more likely that cards with obviously bogus CSD parameters will
> actually work. I don't object to using a larger number for the limit,
> but UINT_MAX on a 64-bit system obviously doesn't limit this at all
> and will leave you with timeouts up to 17 minutes, which seems
> ridiculous :)

Er, not 17 minutes; 102.4 seconds as I used later in my mail. SD cards
have their timeouts capped already, so their larger 100x multiplier is
not a problem; 102.4 seconds is the longest for an MMC card.

> My original motivation for this patch is that I have a device with an
> eMMC that specifies a 25.5 second timeout, attached to a sdhci host
> whose maximum timeout is 2.8 seconds. Originally I proposed a patch to
> just remove the warning in sdhci, but nobody replied, and when I
> realised there was actually an overflow happening I opted to fix that
> instead.
>
> So, yeah, we could use UINT_MAX, but then at minimum I also need to
> kill the warning in sdhci to make my device work, and probably all the
> host controller drivers need to be checked to make sure they don't use
> timeout_ns in a way that can overflow.
>
> I've also just noticed that struct mmc_data's comment for timeout_ns
> says /* data timeout (in ns, max 80ms) */ which is not true (the max
> is 102.4 seconds if my math is correct), which may have contributed to
> the host drivers not being too careful :)
>
> What do you think?
>
>>>
>>> Ben.
>>>
>>>> ? ? ?data->timeout_clks = card->csd.tacc_clks * mult;
>>>>
>>>> ? ? ?/*
>>>
>>
>
>
>
> --
> Torne (Richard Coles)
> [email protected]



--
Torne (Richard Coles)
[email protected]

2012-06-01 10:09:41

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH V2] MMC: core: cap MMC card timeouts at 2 seconds.

On 01/06/12 12:32, Torne (Richard Coles) wrote:
> On 1 June 2012 10:31, Torne (Richard Coles) <[email protected]> wrote:
>> On 1 June 2012 09:35, Adrian Hunter <[email protected]> wrote:
>>> On 29/05/12 05:32, Ben Hutchings wrote:
>>>> On Mon, 2012-05-28 at 18:31 +0100, Torne (Richard Coles) wrote:
>>>>> From: "Torne (Richard Coles)" <[email protected]>
>>>>>
>>>>> MMC CSD info can specify very large, ridiculous timeouts, big enough to
>>>>> overflow timeout_ns on 32-bit machines. This can result in the card
>>>>> timing out on every operation because the wrapped timeout value is far
>>>>> too small.
>>>>>
>>>>> Fix the overflow by capping the result at 2 seconds. Cards specifying
>>>>> longer timeouts are almost certainly insane, and host controllers
>>>>> generally cannot support timeouts that long in any case.
>>>>>
>>>>> 2 seconds should be plenty of time for any card to actually function;
>>>>> the timeout calculation code is already using 1 second as a "worst case"
>>>>> timeout for cards running in SPI mode.
>>>>
>>>> Needs a 'Signed-off-by'.
>>>>
>>>>> ---
>>>>> drivers/mmc/core/core.c | 11 ++++++++++-
>>>>> 1 files changed, 10 insertions(+), 1 deletions(-)
>>>>>
>>>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>>>> index 0b6141d..3b4a9fc 100644
>>>>> --- a/drivers/mmc/core/core.c
>>>>> +++ b/drivers/mmc/core/core.c
>>>>> @@ -512,7 +512,16 @@ void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card)
>>>>> if (data->flags & MMC_DATA_WRITE)
>>>>> mult <<= card->csd.r2w_factor;
>>>>>
>>>>> - data->timeout_ns = card->csd.tacc_ns * mult;
>>>>> + /*
>>>>> + * The timeout in nanoseconds may overflow with some cards. Cap it at
>>>>> + * two seconds both to avoid the overflow and also because host
>>>>> + * controllers cannot generally generate timeouts that long anyway.
>>>>> + */
>>>>> + if (card->csd.tacc_ns <= (2 * NSEC_PER_SEC) / mult)
>>>>> + data->timeout_ns = card->csd.tacc_ns * mult;
>>>>> + else
>>>>> + data->timeout_ns = 2 * NSEC_PER_SEC;
>>>>
>>>> We clearly need to guard against overflow here, and this is the correct
>>>> way to clamp the multiplication. I can't speak as to whether 2 seconds
>>>> is the right limit.
>>>
>>> The host controllers I have looked at have a limit of around 2.5 seconds.
>>>
>>> But why not just use the size of the type as the limit? e.g.
>>>
>>> if (card->csd.tacc_ns <= UINT_MAX / mult)
>>> data->timeout_ns = card->csd.tacc_ns * mult;
>>> else
>>> data->timeout_ns = UINT_MAX;
>>
>> The host controller drivers don't seem to all do a very good job of
>> preventing further overflows or handling large values correctly
>> (though some do). sdhci takes the especially annoying additional step
>> of printk'ing a warning for *every single MMC command* where
>> data->timeout_ns is larger than the controller can accommodate.
>> Capping it to a value with a sensible order of magnitude seems to make
>> it more likely that cards with obviously bogus CSD parameters will
>> actually work. I don't object to using a larger number for the limit,
>> but UINT_MAX on a 64-bit system obviously doesn't limit this at all
>> and will leave you with timeouts up to 17 minutes, which seems
>> ridiculous :)
>
> Er, not 17 minutes; 102.4 seconds as I used later in my mail. SD cards
> have their timeouts capped already, so their larger 100x multiplier is
> not a problem; 102.4 seconds is the longest for an MMC card.
>

Linux is LP64. i.e. "int" is always 32-bit in the kernel

>> My original motivation for this patch is that I have a device with an
>> eMMC that specifies a 25.5 second timeout, attached to a sdhci host
>> whose maximum timeout is 2.8 seconds. Originally I proposed a patch to
>> just remove the warning in sdhci, but nobody replied, and when I
>> realised there was actually an overflow happening I opted to fix that
>> instead.
>>
>> So, yeah, we could use UINT_MAX, but then at minimum I also need to
>> kill the warning in sdhci to make my device work, and probably all the
>> host controller drivers need to be checked to make sure they don't use
>> timeout_ns in a way that can overflow.
>>
>> I've also just noticed that struct mmc_data's comment for timeout_ns
>> says /* data timeout (in ns, max 80ms) */ which is not true (the max
>> is 102.4 seconds if my math is correct), which may have contributed to
>> the host drivers not being too careful :)
>>
>> What do you think?

If you can identify the card, the you could make a new quirk in a fashion
similar to mmc_card_long_read_time().

Alternatively you could make use of SDHCI_QUIRK_BROKEN_TIMEOUT_VAL or
introduce your own sdhci quirk to suppress the warning.

>>
>>>>
>>>> Ben.
>>>>
>>>>> data->timeout_clks = card->csd.tacc_clks * mult;
>>>>>
>>>>> /*
>>>>
>>>
>>
>>
>>
>> --
>> Torne (Richard Coles)
>> [email protected]
>
>
>

2012-06-01 10:20:12

by Torne (Richard Coles)

[permalink] [raw]
Subject: Re: [PATCH V2] MMC: core: cap MMC card timeouts at 2 seconds.

On 1 June 2012 11:09, Adrian Hunter <[email protected]> wrote:
> On 01/06/12 12:32, Torne (Richard Coles) wrote:
>> On 1 June 2012 10:31, Torne (Richard Coles) <[email protected]> wrote:
>>> On 1 June 2012 09:35, Adrian Hunter <[email protected]> wrote:
>>>> On 29/05/12 05:32, Ben Hutchings wrote:
>>>>> On Mon, 2012-05-28 at 18:31 +0100, Torne (Richard Coles) wrote:
>>>>>> From: "Torne (Richard Coles)" <[email protected]>
>>>>>>
>>>>>> MMC CSD info can specify very large, ridiculous timeouts, big enough to
>>>>>> overflow timeout_ns on 32-bit machines. This can result in the card
>>>>>> timing out on every operation because the wrapped timeout value is far
>>>>>> too small.
>>>>>>
>>>>>> Fix the overflow by capping the result at 2 seconds. ?Cards specifying
>>>>>> longer timeouts are almost certainly insane, and host controllers
>>>>>> generally cannot support timeouts that long in any case.
>>>>>>
>>>>>> 2 seconds should be plenty of time for any card to actually function;
>>>>>> the timeout calculation code is already using 1 second as a "worst case"
>>>>>> timeout for cards running in SPI mode.
>>>>>
>>>>> Needs a 'Signed-off-by'.
>>>>>
>>>>>> ---
>>>>>> ?drivers/mmc/core/core.c | ? 11 ++++++++++-
>>>>>> ?1 files changed, 10 insertions(+), 1 deletions(-)
>>>>>>
>>>>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>>>>> index 0b6141d..3b4a9fc 100644
>>>>>> --- a/drivers/mmc/core/core.c
>>>>>> +++ b/drivers/mmc/core/core.c
>>>>>> @@ -512,7 +512,16 @@ void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card)
>>>>>> ? ? ?if (data->flags & MMC_DATA_WRITE)
>>>>>> ? ? ? ? ? ? ?mult <<= card->csd.r2w_factor;
>>>>>>
>>>>>> - ? ?data->timeout_ns = card->csd.tacc_ns * mult;
>>>>>> + ? ?/*
>>>>>> + ? ? * The timeout in nanoseconds may overflow with some cards. Cap it at
>>>>>> + ? ? * two seconds both to avoid the overflow and also because host
>>>>>> + ? ? * controllers cannot generally generate timeouts that long anyway.
>>>>>> + ? ? */
>>>>>> + ? ?if (card->csd.tacc_ns <= (2 * NSEC_PER_SEC) / mult)
>>>>>> + ? ? ? ? ? ?data->timeout_ns = card->csd.tacc_ns * mult;
>>>>>> + ? ?else
>>>>>> + ? ? ? ? ? ?data->timeout_ns = 2 * NSEC_PER_SEC;
>>>>>
>>>>> We clearly need to guard against overflow here, and this is the correct
>>>>> way to clamp the multiplication. ?I can't speak as to whether 2 seconds
>>>>> is the right limit.
>>>>
>>>> The host controllers I have looked at have a limit of around 2.5 seconds.
>>>>
>>>> But why not just use the size of the type as the limit? e.g.
>>>>
>>>> ? ? ? ?if (card->csd.tacc_ns <= UINT_MAX / mult)
>>>> ? ? ? ? ? ? ? ?data->timeout_ns = card->csd.tacc_ns * mult;
>>>> ? ? ? ?else
>>>> ? ? ? ? ? ? ? ?data->timeout_ns = UINT_MAX;
>>>
>>> The host controller drivers don't seem to all do a very good job of
>>> preventing further overflows or handling large values correctly
>>> (though some do). sdhci takes the especially annoying additional step
>>> of printk'ing a warning for *every single MMC command* where
>>> data->timeout_ns is larger than the controller can accommodate.
>>> Capping it to a value with a sensible order of magnitude seems to make
>>> it more likely that cards with obviously bogus CSD parameters will
>>> actually work. I don't object to using a larger number for the limit,
>>> but UINT_MAX on a 64-bit system obviously doesn't limit this at all
>>> and will leave you with timeouts up to 17 minutes, which seems
>>> ridiculous :)
>>
>> Er, not 17 minutes; 102.4 seconds as I used later in my mail. SD cards
>> have their timeouts capped already, so their larger 100x multiplier is
>> not a problem; 102.4 seconds is the longest for an MMC card.
>>
>
> Linux is LP64. i.e. "int" is always 32-bit in the kernel

Oh, sorry; didn't think that through. So, yeah, that'd be 4.29
seconds, which is still too long for many hosts :)

>>> My original motivation for this patch is that I have a device with an
>>> eMMC that specifies a 25.5 second timeout, attached to a sdhci host
>>> whose maximum timeout is 2.8 seconds. Originally I proposed a patch to
>>> just remove the warning in sdhci, but nobody replied, and when I
>>> realised there was actually an overflow happening I opted to fix that
>>> instead.
>>>
>>> So, yeah, we could use UINT_MAX, but then at minimum I also need to
>>> kill the warning in sdhci to make my device work, and probably all the
>>> host controller drivers need to be checked to make sure they don't use
>>> timeout_ns in a way that can overflow.
>>>
>>> I've also just noticed that struct mmc_data's comment for timeout_ns
>>> says /* data timeout (in ns, max 80ms) */ which is not true (the max
>>> is 102.4 seconds if my math is correct), which may have contributed to
>>> the host drivers not being too careful :)
>>>
>>> What do you think?
>
> If you can identify the card, the you could make a new quirk in a fashion
> similar to mmc_card_long_read_time().
>
> Alternatively you could make use of SDHCI_QUIRK_BROKEN_TIMEOUT_VAL or
> introduce your own sdhci quirk to suppress the warning.

Those would work, but it seems silly to me to suppress the warning
only for some cards, or to cap the timeout only for some cards. The
best way to identify a card that has a "broken" timeout value is.. if
the timeout value is a really big number, no? I am very skeptical that
there is a card out there anywhere that will actually take more than
two seconds (or 4.29 seconds, if you prefer) to successfully complete
a command.

The warning itself seems to have extremely limited use; there's
nothing you can do about it other than suppress it (the driver is
already capping the timeout for you), and because timeouts are
calculated per-command the warning is absurdly noisy (in fact, the fun
part on my system was the warning being logged to klogd, being written
to logfiles on the eMMC, causing more warnings, causing more log
messages, etc) :) sdhci is the only host driver that complains about
this; it seems logical for the warning to apply to all hosts, or to
none of them...

>>>
>>>>>
>>>>> Ben.
>>>>>
>>>>>> ? ? ?data->timeout_clks = card->csd.tacc_clks * mult;
>>>>>>
>>>>>> ? ? ?/*
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Torne (Richard Coles)
>>> [email protected]
>>
>>
>>
>



--
Torne (Richard Coles)
[email protected]

2012-06-01 12:59:11

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH V2] MMC: core: cap MMC card timeouts at 2 seconds.

On 01/06/12 13:20, Torne (Richard Coles) wrote:
> On 1 June 2012 11:09, Adrian Hunter <[email protected]> wrote:
>> On 01/06/12 12:32, Torne (Richard Coles) wrote:
>>> On 1 June 2012 10:31, Torne (Richard Coles) <[email protected]> wrote:
>>>> On 1 June 2012 09:35, Adrian Hunter <[email protected]> wrote:
>>>>> On 29/05/12 05:32, Ben Hutchings wrote:
>>>>>> On Mon, 2012-05-28 at 18:31 +0100, Torne (Richard Coles) wrote:
>>>>>>> From: "Torne (Richard Coles)" <[email protected]>
>>>>>>>
>>>>>>> MMC CSD info can specify very large, ridiculous timeouts, big enough to
>>>>>>> overflow timeout_ns on 32-bit machines. This can result in the card
>>>>>>> timing out on every operation because the wrapped timeout value is far
>>>>>>> too small.
>>>>>>>
>>>>>>> Fix the overflow by capping the result at 2 seconds. Cards specifying
>>>>>>> longer timeouts are almost certainly insane, and host controllers
>>>>>>> generally cannot support timeouts that long in any case.
>>>>>>>
>>>>>>> 2 seconds should be plenty of time for any card to actually function;
>>>>>>> the timeout calculation code is already using 1 second as a "worst case"
>>>>>>> timeout for cards running in SPI mode.
>>>>>>
>>>>>> Needs a 'Signed-off-by'.
>>>>>>
>>>>>>> ---
>>>>>>> drivers/mmc/core/core.c | 11 ++++++++++-
>>>>>>> 1 files changed, 10 insertions(+), 1 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>>>>>> index 0b6141d..3b4a9fc 100644
>>>>>>> --- a/drivers/mmc/core/core.c
>>>>>>> +++ b/drivers/mmc/core/core.c
>>>>>>> @@ -512,7 +512,16 @@ void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card)
>>>>>>> if (data->flags & MMC_DATA_WRITE)
>>>>>>> mult <<= card->csd.r2w_factor;
>>>>>>>
>>>>>>> - data->timeout_ns = card->csd.tacc_ns * mult;
>>>>>>> + /*
>>>>>>> + * The timeout in nanoseconds may overflow with some cards. Cap it at
>>>>>>> + * two seconds both to avoid the overflow and also because host
>>>>>>> + * controllers cannot generally generate timeouts that long anyway.
>>>>>>> + */
>>>>>>> + if (card->csd.tacc_ns <= (2 * NSEC_PER_SEC) / mult)
>>>>>>> + data->timeout_ns = card->csd.tacc_ns * mult;
>>>>>>> + else
>>>>>>> + data->timeout_ns = 2 * NSEC_PER_SEC;
>>>>>>
>>>>>> We clearly need to guard against overflow here, and this is the correct
>>>>>> way to clamp the multiplication. I can't speak as to whether 2 seconds
>>>>>> is the right limit.
>>>>>
>>>>> The host controllers I have looked at have a limit of around 2.5 seconds.
>>>>>
>>>>> But why not just use the size of the type as the limit? e.g.
>>>>>
>>>>> if (card->csd.tacc_ns <= UINT_MAX / mult)
>>>>> data->timeout_ns = card->csd.tacc_ns * mult;
>>>>> else
>>>>> data->timeout_ns = UINT_MAX;
>>>>
>>>> The host controller drivers don't seem to all do a very good job of
>>>> preventing further overflows or handling large values correctly
>>>> (though some do). sdhci takes the especially annoying additional step
>>>> of printk'ing a warning for *every single MMC command* where
>>>> data->timeout_ns is larger than the controller can accommodate.
>>>> Capping it to a value with a sensible order of magnitude seems to make
>>>> it more likely that cards with obviously bogus CSD parameters will
>>>> actually work. I don't object to using a larger number for the limit,
>>>> but UINT_MAX on a 64-bit system obviously doesn't limit this at all
>>>> and will leave you with timeouts up to 17 minutes, which seems
>>>> ridiculous :)
>>>
>>> Er, not 17 minutes; 102.4 seconds as I used later in my mail. SD cards
>>> have their timeouts capped already, so their larger 100x multiplier is
>>> not a problem; 102.4 seconds is the longest for an MMC card.
>>>
>>
>> Linux is LP64. i.e. "int" is always 32-bit in the kernel
>
> Oh, sorry; didn't think that through. So, yeah, that'd be 4.29
> seconds, which is still too long for many hosts :)
>
>>>> My original motivation for this patch is that I have a device with an
>>>> eMMC that specifies a 25.5 second timeout, attached to a sdhci host
>>>> whose maximum timeout is 2.8 seconds. Originally I proposed a patch to
>>>> just remove the warning in sdhci, but nobody replied, and when I
>>>> realised there was actually an overflow happening I opted to fix that
>>>> instead.
>>>>
>>>> So, yeah, we could use UINT_MAX, but then at minimum I also need to
>>>> kill the warning in sdhci to make my device work, and probably all the
>>>> host controller drivers need to be checked to make sure they don't use
>>>> timeout_ns in a way that can overflow.
>>>>
>>>> I've also just noticed that struct mmc_data's comment for timeout_ns
>>>> says /* data timeout (in ns, max 80ms) */ which is not true (the max
>>>> is 102.4 seconds if my math is correct), which may have contributed to
>>>> the host drivers not being too careful :)
>>>>
>>>> What do you think?
>>
>> If you can identify the card, the you could make a new quirk in a fashion
>> similar to mmc_card_long_read_time().
>>
>> Alternatively you could make use of SDHCI_QUIRK_BROKEN_TIMEOUT_VAL or
>> introduce your own sdhci quirk to suppress the warning.
>
> Those would work, but it seems silly to me to suppress the warning
> only for some cards, or to cap the timeout only for some cards. The
> best way to identify a card that has a "broken" timeout value is.. if
> the timeout value is a really big number, no? I am very skeptical that
> there is a card out there anywhere that will actually take more than
> two seconds (or 4.29 seconds, if you prefer) to successfully complete
> a command.
>
> The warning itself seems to have extremely limited use; there's
> nothing you can do about it other than suppress it (the driver is
> already capping the timeout for you), and because timeouts are
> calculated per-command the warning is absurdly noisy (in fact, the fun
> part on my system was the warning being logged to klogd, being written
> to logfiles on the eMMC, causing more warnings, causing more log
> messages, etc) :) sdhci is the only host driver that complains about
> this; it seems logical for the warning to apply to all hosts, or to
> none of them...

I just noticed that from linux 3.4, the SD write timeout is now 3 seconds
triggering the sdhci driver warning on every write on every SD card.
So change pr_warning to DGB in sdhci_calc_timeout(). Chris?

2012-06-01 13:12:43

by Torne (Richard Coles)

[permalink] [raw]
Subject: Re: [PATCH V2] MMC: core: cap MMC card timeouts at 2 seconds.

On 1 June 2012 13:59, Adrian Hunter <[email protected]> wrote:
> On 01/06/12 13:20, Torne (Richard Coles) wrote:
>> On 1 June 2012 11:09, Adrian Hunter <[email protected]> wrote:
>>> On 01/06/12 12:32, Torne (Richard Coles) wrote:
>>>> On 1 June 2012 10:31, Torne (Richard Coles) <[email protected]> wrote:
>>>>> On 1 June 2012 09:35, Adrian Hunter <[email protected]> wrote:
>>>>>> On 29/05/12 05:32, Ben Hutchings wrote:
>>>>>>> On Mon, 2012-05-28 at 18:31 +0100, Torne (Richard Coles) wrote:
>>>>>>>> From: "Torne (Richard Coles)" <[email protected]>
>>>>>>>>
>>>>>>>> MMC CSD info can specify very large, ridiculous timeouts, big enough to
>>>>>>>> overflow timeout_ns on 32-bit machines. This can result in the card
>>>>>>>> timing out on every operation because the wrapped timeout value is far
>>>>>>>> too small.
>>>>>>>>
>>>>>>>> Fix the overflow by capping the result at 2 seconds. ?Cards specifying
>>>>>>>> longer timeouts are almost certainly insane, and host controllers
>>>>>>>> generally cannot support timeouts that long in any case.
>>>>>>>>
>>>>>>>> 2 seconds should be plenty of time for any card to actually function;
>>>>>>>> the timeout calculation code is already using 1 second as a "worst case"
>>>>>>>> timeout for cards running in SPI mode.
>>>>>>>
>>>>>>> Needs a 'Signed-off-by'.
>>>>>>>
>>>>>>>> ---
>>>>>>>> ?drivers/mmc/core/core.c | ? 11 ++++++++++-
>>>>>>>> ?1 files changed, 10 insertions(+), 1 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
>>>>>>>> index 0b6141d..3b4a9fc 100644
>>>>>>>> --- a/drivers/mmc/core/core.c
>>>>>>>> +++ b/drivers/mmc/core/core.c
>>>>>>>> @@ -512,7 +512,16 @@ void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card)
>>>>>>>> ? ? ?if (data->flags & MMC_DATA_WRITE)
>>>>>>>> ? ? ? ? ? ? ?mult <<= card->csd.r2w_factor;
>>>>>>>>
>>>>>>>> - ? ?data->timeout_ns = card->csd.tacc_ns * mult;
>>>>>>>> + ? ?/*
>>>>>>>> + ? ? * The timeout in nanoseconds may overflow with some cards. Cap it at
>>>>>>>> + ? ? * two seconds both to avoid the overflow and also because host
>>>>>>>> + ? ? * controllers cannot generally generate timeouts that long anyway.
>>>>>>>> + ? ? */
>>>>>>>> + ? ?if (card->csd.tacc_ns <= (2 * NSEC_PER_SEC) / mult)
>>>>>>>> + ? ? ? ? ? ?data->timeout_ns = card->csd.tacc_ns * mult;
>>>>>>>> + ? ?else
>>>>>>>> + ? ? ? ? ? ?data->timeout_ns = 2 * NSEC_PER_SEC;
>>>>>>>
>>>>>>> We clearly need to guard against overflow here, and this is the correct
>>>>>>> way to clamp the multiplication. ?I can't speak as to whether 2 seconds
>>>>>>> is the right limit.
>>>>>>
>>>>>> The host controllers I have looked at have a limit of around 2.5 seconds.
>>>>>>
>>>>>> But why not just use the size of the type as the limit? e.g.
>>>>>>
>>>>>> ? ? ? ?if (card->csd.tacc_ns <= UINT_MAX / mult)
>>>>>> ? ? ? ? ? ? ? ?data->timeout_ns = card->csd.tacc_ns * mult;
>>>>>> ? ? ? ?else
>>>>>> ? ? ? ? ? ? ? ?data->timeout_ns = UINT_MAX;
>>>>>
>>>>> The host controller drivers don't seem to all do a very good job of
>>>>> preventing further overflows or handling large values correctly
>>>>> (though some do). sdhci takes the especially annoying additional step
>>>>> of printk'ing a warning for *every single MMC command* where
>>>>> data->timeout_ns is larger than the controller can accommodate.
>>>>> Capping it to a value with a sensible order of magnitude seems to make
>>>>> it more likely that cards with obviously bogus CSD parameters will
>>>>> actually work. I don't object to using a larger number for the limit,
>>>>> but UINT_MAX on a 64-bit system obviously doesn't limit this at all
>>>>> and will leave you with timeouts up to 17 minutes, which seems
>>>>> ridiculous :)
>>>>
>>>> Er, not 17 minutes; 102.4 seconds as I used later in my mail. SD cards
>>>> have their timeouts capped already, so their larger 100x multiplier is
>>>> not a problem; 102.4 seconds is the longest for an MMC card.
>>>>
>>>
>>> Linux is LP64. i.e. "int" is always 32-bit in the kernel
>>
>> Oh, sorry; didn't think that through. So, yeah, that'd be 4.29
>> seconds, which is still too long for many hosts :)
>>
>>>>> My original motivation for this patch is that I have a device with an
>>>>> eMMC that specifies a 25.5 second timeout, attached to a sdhci host
>>>>> whose maximum timeout is 2.8 seconds. Originally I proposed a patch to
>>>>> just remove the warning in sdhci, but nobody replied, and when I
>>>>> realised there was actually an overflow happening I opted to fix that
>>>>> instead.
>>>>>
>>>>> So, yeah, we could use UINT_MAX, but then at minimum I also need to
>>>>> kill the warning in sdhci to make my device work, and probably all the
>>>>> host controller drivers need to be checked to make sure they don't use
>>>>> timeout_ns in a way that can overflow.
>>>>>
>>>>> I've also just noticed that struct mmc_data's comment for timeout_ns
>>>>> says /* data timeout (in ns, max 80ms) */ which is not true (the max
>>>>> is 102.4 seconds if my math is correct), which may have contributed to
>>>>> the host drivers not being too careful :)
>>>>>
>>>>> What do you think?
>>>
>>> If you can identify the card, the you could make a new quirk in a fashion
>>> similar to mmc_card_long_read_time().
>>>
>>> Alternatively you could make use of SDHCI_QUIRK_BROKEN_TIMEOUT_VAL or
>>> introduce your own sdhci quirk to suppress the warning.
>>
>> Those would work, but it seems silly to me to suppress the warning
>> only for some cards, or to cap the timeout only for some cards. The
>> best way to identify a card that has a "broken" timeout value is.. if
>> the timeout value is a really big number, no? I am very skeptical that
>> there is a card out there anywhere that will actually take more than
>> two seconds (or 4.29 seconds, if you prefer) to successfully complete
>> a command.
>>
>> The warning itself seems to have extremely limited use; there's
>> nothing you can do
about it other than suppress it (the driver is
>> already capping the timeout for you), and because timeouts are
>> calculated per-command the warning is absurdly noisy (in fact, the fun
>> part on my system was the warning being logged to klogd, being written
>> to logfiles on the eMMC, causing more warnings, causing more log
>> messages, etc) :) sdhci is the only host driver that complains about
>> this; it seems logical for the warning to apply to all hosts, or to
>> none of them...
>
> I just noticed that from linux 3.4, the SD write timeout is now 3 seconds
> triggering the sdhci driver warning on every write on every SD card.
> So change pr_warning to DGB in sdhci_calc_timeout(). ?Chris?
>

Woah, I didn't notice that change. Yeah, sdhci definitely needs changing then :)

If we do that, then it makes sense to just cap the MMC timeout at
UINT_MAX, I think. I'll send a new patch that does that.

--
Torne (Richard Coles)
[email protected]

2012-06-01 13:20:24

by Torne (Richard Coles)

[permalink] [raw]
Subject: [PATCH V3] MMC: core: cap MMC card timeouts at UINT_MAX

From: "Torne (Richard Coles)" <[email protected]>

MMC CSD info can specify very large, ridiculous timeouts, big enough to
overflow timeout_ns. This can result in the card timing out on every
operation because the wrapped timeout value is far too small.

Fix the overflow by capping the result at UINT_MAX. Cards specifying
longer timeouts are almost certainly insane, and host controllers
generally cannot support timeouts that long in any case.

Signed-off-by: Torne (Richard Coles) <[email protected]>
---
drivers/mmc/core/core.c | 11 ++++++++++-
1 files changed, 10 insertions(+), 1 deletions(-)

diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 0b6141d..74ec3d4 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -512,7 +512,16 @@ void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card)
if (data->flags & MMC_DATA_WRITE)
mult <<= card->csd.r2w_factor;

- data->timeout_ns = card->csd.tacc_ns * mult;
+ /*
+ * The timeout in nanoseconds may overflow with some cards. Cap it at
+ * UINT_MAX to avoid the overflow; host controllers cannot generally
+ * generate timeouts that long anyway.
+ */
+ if (card->csd.tacc_ns <= UINT_MAX / mult)
+ data->timeout_ns = card->csd.tacc_ns * mult;
+ else
+ data->timeout_ns = UINT_MAX;
+
data->timeout_clks = card->csd.tacc_clks * mult;

/*
--
1.7.7.3

2012-06-01 14:48:18

by Chris Ball

[permalink] [raw]
Subject: Re: [PATCH V2] MMC: core: cap MMC card timeouts at 2 seconds.

Hi,

On Fri, Jun 01 2012, Adrian Hunter wrote:
> I just noticed that from linux 3.4, the SD write timeout is now 3 seconds
> triggering the sdhci driver warning on every write on every SD card.
> So change pr_warning to DGB in sdhci_calc_timeout(). Chris?

Oops, thanks for noticing:


Subject: mmc: sdhci: Use DBG() instead of pr_warning() on large timeout

3bdc9ba892d6 ("mmc: use really long write timeout to deal with crappy
cards") in 3.4 increased the write timeout that the core sends to host
drivers to 3 seconds. This makes sdhci's "requested timeout too large"
warning trigger on every write; so, change this pr_warning() to a DBG().

Reported-by: Adrian Hunter <[email protected]>
Signed-off-by: Chris Ball <[email protected]>
---
drivers/mmc/host/sdhci.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index e626732..f4b8b4d 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -680,8 +680,8 @@ static u8 sdhci_calc_timeout(struct sdhci_host *host, struct mmc_command *cmd)
}

if (count >= 0xF) {
- pr_warning("%s: Too large timeout 0x%x requested for CMD%d!\n",
- mmc_hostname(host->mmc), count, cmd->opcode);
+ DBG("%s: Too large timeout 0x%x requested for CMD%d!\n",
+ mmc_hostname(host->mmc), count, cmd->opcode);
count = 0xE;
}

--
Chris Ball <[email protected]> <http://printf.net/>
One Laptop Per Child

2012-06-04 08:13:00

by Adrian Hunter

[permalink] [raw]
Subject: Re: [PATCH V3] MMC: core: cap MMC card timeouts at UINT_MAX

On 01/06/12 16:20, Torne (Richard Coles) wrote:
> From: "Torne (Richard Coles)" <[email protected]>
>
> MMC CSD info can specify very large, ridiculous timeouts, big enough to
> overflow timeout_ns. This can result in the card timing out on every
> operation because the wrapped timeout value is far too small.
>
> Fix the overflow by capping the result at UINT_MAX. Cards specifying
> longer timeouts are almost certainly insane, and host controllers
> generally cannot support timeouts that long in any case.
>
> Signed-off-by: Torne (Richard Coles) <[email protected]>
> ---
> drivers/mmc/core/core.c | 11 ++++++++++-
> 1 files changed, 10 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
> index 0b6141d..74ec3d4 100644
> --- a/drivers/mmc/core/core.c
> +++ b/drivers/mmc/core/core.c
> @@ -512,7 +512,16 @@ void mmc_set_data_timeout(struct mmc_data *data, const struct mmc_card *card)
> if (data->flags & MMC_DATA_WRITE)
> mult <<= card->csd.r2w_factor;
>
> - data->timeout_ns = card->csd.tacc_ns * mult;
> + /*
> + * The timeout in nanoseconds may overflow with some cards. Cap it at
> + * UINT_MAX to avoid the overflow; host controllers cannot generally
> + * generate timeouts that long anyway.
> + */
> + if (card->csd.tacc_ns <= UINT_MAX / mult)
> + data->timeout_ns = card->csd.tacc_ns * mult;
> + else
> + data->timeout_ns = UINT_MAX;
> +
> data->timeout_clks = card->csd.tacc_clks * mult;
>
> /*

Acked-by: Adrian Hunter <[email protected]>