2013-10-30 12:46:34

by Michal Simek

Subject: [PATCH] ARM: mm: Fix ECC mem policy printk



2013-10-30 13:08:31

by Russell King - ARM Linux

Subject: Re: [PATCH] ARM: mm: Fix ECC mem policy printk

On Wed, Oct 30, 2013 at 01:46:18PM +0100, Michal Simek wrote:
> Russell, Will: We discussed at KS that it would be good
> to rephrase this or have different logic around it.
> I am not sure whether we can also test if this bit is
> implemented by a particular SoC or not.
>
> Maybe the logic should be that if the SoC uses this bit,
> the message is shown in its original format to declare
> that ECC is enabled or disabled.
> When the SoC doesn't implement it, do not show this message.

This is not quite what I meant - by making the change you have, you also
omit to print the data cache policy.

> @@ -556,8 +556,9 @@ static void __init build_mem_type_table(void)
>  		mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_WB;
>  		break;
>  	}
> -	printk("Memory policy: ECC %sabled, Data cache %s\n",
> -		ecc_mask ? "en" : "dis", cp->policy);
> +	if (ecc_mask)
> +		pr_info("Memory policy: ECC enabled, Data cache %s\n",
> +			cp->policy);

	pr_info("Memory policy: %sData cache %s\n",
		ecc_mask ? "ECC enabled, " : "", cp->policy);

is more what I was suggesting.
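
For readers following along, here is a minimal userspace sketch of the difference between the original message and the suggested one. It uses snprintf() in place of printk()/pr_info(), and the helper names format_original() and format_suggested() are hypothetical, not kernel functions. The policy string passed in is only an example value.

```c
#include <stdio.h>
#include <stddef.h>

/* Original format: always states ECC as "enabled" or "disabled". */
static int format_original(char *buf, size_t len,
			   unsigned int ecc_mask, const char *policy)
{
	return snprintf(buf, len,
			"Memory policy: ECC %sabled, Data cache %s\n",
			ecc_mask ? "en" : "dis", policy);
}

/*
 * Suggested format: mention ECC only when ecc_mask is set, but always
 * print the data cache policy.
 */
static int format_suggested(char *buf, size_t len,
			    unsigned int ecc_mask, const char *policy)
{
	return snprintf(buf, len, "Memory policy: %sData cache %s\n",
			ecc_mask ? "ECC enabled, " : "", policy);
}
```

With ecc_mask equal to zero the suggested format still reports the cache policy, which is the point of the objection to wrapping the whole message in `if (ecc_mask)`.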

2013-10-30 14:23:30

by Michal Simek

Subject: Re: [PATCH] ARM: mm: Fix ECC mem policy printk

On 10/30/2013 02:07 PM, Russell King - ARM Linux wrote:
> On Wed, Oct 30, 2013 at 01:46:18PM +0100, Michal Simek wrote:
>> Russell, Will: We discussed at KS that it would be good
>> to rephrase this or have different logic around it.
>> I am not sure whether we can also test if this bit is
>> implemented by a particular SoC or not.
>>
>> Maybe the logic should be that if the SoC uses this bit,
>> the message is shown in its original format to declare
>> that ECC is enabled or disabled.
>> When the SoC doesn't implement it, do not show this message.
>
> This is not quite what I meant - by making the change you have, you also
> omit to print the data cache policy.
>
>> @@ -556,8 +556,9 @@ static void __init build_mem_type_table(void)
>>  		mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_WB;
>>  		break;
>>  	}
>> -	printk("Memory policy: ECC %sabled, Data cache %s\n",
>> -		ecc_mask ? "en" : "dis", cp->policy);
>> +	if (ecc_mask)
>> +		pr_info("Memory policy: ECC enabled, Data cache %s\n",
>> +			cp->policy);
>
> 	pr_info("Memory policy: %sData cache %s\n",
> 		ecc_mask ? "ECC enabled, " : "", cp->policy);
>
> is more what I was suggesting.

If this is what you would like to see there, I am fine with that too.

Thanks,
Michal


--
Michal Simek, Ing. (M.Eng), OpenPGP -> KeyID: FE3D1F91
w: http://www.monstr.eu p: +42-0-721842854
Maintainer of Linux kernel - Microblaze cpu - http://www.monstr.eu/fdt/
Maintainer of Linux kernel - Xilinx Zynq ARM architecture
Microblaze U-BOOT custodian and responsible for u-boot arm zynq platform




2013-10-30 14:32:14

by Michal Simek

Subject: Re: [PATCH] ARM: mm: Fix ECC mem policy printk

On 10/30/2013 03:23 PM, Michal Simek wrote:
> On 10/30/2013 02:07 PM, Russell King - ARM Linux wrote:
>> On Wed, Oct 30, 2013 at 01:46:18PM +0100, Michal Simek wrote:
>>> Russell, Will: We discussed at KS that it would be good
>>> to rephrase this or have different logic around it.
>>> I am not sure whether we can also test if this bit is
>>> implemented by a particular SoC or not.
>>>
>>> Maybe the logic should be that if the SoC uses this bit,
>>> the message is shown in its original format to declare
>>> that ECC is enabled or disabled.
>>> When the SoC doesn't implement it, do not show this message.
>>
>> This is not quite what I meant - by making the change you have, you also
>> omit to print the data cache policy.
>>
>>> @@ -556,8 +556,9 @@ static void __init build_mem_type_table(void)
>>>  		mem_types[MT_CACHECLEAN].prot_sect |= PMD_SECT_WB;
>>>  		break;
>>>  	}
>>> -	printk("Memory policy: ECC %sabled, Data cache %s\n",
>>> -		ecc_mask ? "en" : "dis", cp->policy);
>>> +	if (ecc_mask)
>>> +		pr_info("Memory policy: ECC enabled, Data cache %s\n",
>>> +			cp->policy);
>>
>> 	pr_info("Memory policy: %sData cache %s\n",
>> 		ecc_mask ? "ECC enabled, " : "", cp->policy);
>>
>> is more what I was suggesting.
>
> If this is what you would like to see there, I am fine with that too.

btw: passing ecc=on on the command line causes the "ECC enabled" message
to appear even on systems which don't implement this bit.
It is just a side effect of both of these solutions.
Isn't there an easy way to test whether this bit is implemented, just by setting
it and clearing it?

Thanks,
Michal

2013-10-30 15:01:59

by Russell King - ARM Linux

Subject: Re: [PATCH] ARM: mm: Fix ECC mem policy printk

On Wed, Oct 30, 2013 at 03:32:09PM +0100, Michal Simek wrote:
> btw: passing ecc=on on the command line causes the "ECC enabled"
> message to appear even on systems which don't implement this bit.
> It is just a side effect of both of these solutions.

It is a hint, nothing more. There is no way to detect whether it's
implemented or even how it has been implemented.

> Isn't there an easy way to test whether this bit is implemented, just
> by setting it and clearing it?

So... let's summarise the message that you're giving.

"My SoC doesn't implement this bit other than to provide ECC at the L1
cache, instead implementing a separate ECC scheme for system memory.
Therefore, I want to change it to describe my implementation, because
my customers are complaining that it says ECC is disabled when that
is not the case. If it can't describe my setup, I want to remove the
whole facility."

That's a very selfish attitude. Sorry, but it would be wrong of me
to allow your situation to change what we have beyond the proposed
patch.

I've shown you the ARM architecture reference manual where this bit in
the page tables is described, both older and newer versions. What we're
doing is in the spirit of the descriptions of bit 9 in the L1 page tables.

I don't think there's any sensible short description which would
adequately describe this setting which would satisfy both your situation
and situations on other SoCs. We could make the kernel print an entire
paragraph on it, something like:

"ECC might be %sabled. The exact ECC setting depends on how your SoC
is implemented. Please refer to your SoC's technical reference manual
for a description of bit 9 in the level one page tables for further
information on how to interpret this statement."

but that would be idiotic.

Of course, we could just print nothing, but the purpose of printing this
is so that _we_ as developers looking at the kernel messages know the
status of this bit, particularly when interpreting oops dumps. Hiding
this information would make some oops dumps harder to diagnose. So...
this is a matter for user education if your users are complaining about
it.
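
To illustrate why the message can only ever be a hint, here is a simplified userspace model of how an "ecc=" command-line option could translate into the mask that drives the message. This is a sketch under assumptions, not the kernel source: parse_ecc_param(), ecc_prefix() and SKETCH_PMD_PROTECTION are hypothetical names, and the mask value is assumed to correspond to bit 9 of a level-one descriptor as discussed above.

```c
#include <string.h>

/* Assumption: bit 9 of a level-one page table descriptor. */
#define SKETCH_PMD_PROTECTION (1u << 9)

/*
 * Hypothetical stand-in for an "ecc=" early parameter handler.
 * It records what the user asked for; nothing here probes hardware.
 */
static unsigned int parse_ecc_param(const char *arg, unsigned int current_mask)
{
	if (strncmp(arg, "on", 2) == 0)
		return SKETCH_PMD_PROTECTION;
	if (strncmp(arg, "off", 3) == 0)
		return 0;
	return current_mask;	/* unrecognised value: leave unchanged */
}

/* Prefix used by the suggested message format. */
static const char *ecc_prefix(unsigned int ecc_mask)
{
	return ecc_mask ? "ECC enabled, " : "";
}
```

In this model the prefix depends only on the command-line request, so "ECC enabled" appears whenever ecc=on is passed, whether or not the SoC acts on the bit.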

2013-10-30 15:14:09

by Michal Simek

Subject: Re: [PATCH] ARM: mm: Fix ECC mem policy printk

Hi Russell,

On 10/30/2013 04:01 PM, Russell King - ARM Linux wrote:
> On Wed, Oct 30, 2013 at 03:32:09PM +0100, Michal Simek wrote:
>> btw: passing ecc=on on the command line causes the "ECC enabled"
>> message to appear even on systems which don't implement this bit.
>> It is just a side effect of both of these solutions.
>
> It is a hint, nothing more. There is no way to detect whether it's
> implemented or even how it has been implemented.

ok. That's what I wanted to know.


>> Isn't there an easy way to test whether this bit is implemented, just
>> by setting it and clearing it?
>
> So... let's summarise the message that you're giving.
>
> "My SoC doesn't implement this bit other than to provide ECC at the L1
> cache, instead implementing a separate ECC scheme for system memory.
> Therefore, I want to change it to describe my implementation, because
> my customers are complaining that it says ECC is disabled when that
> is not the case. If it can't describe my setup, I want to remove the
> whole facility."
>
> That's a very selfish attitude. Sorry, but it would be wrong of me
> to allow your situation to change what we have beyond the proposed
> patch.

I thought the situation was quite clear here. I am just saying
that there is a way to get it back, and it is a task for us to educate
our users/customers on how to get ECC to work on Zynq.

>
> I've shown you the ARM architecture reference manual where this bit in
> the page tables is described, both older and newer versions. What we're
> doing is in the spirit of the descriptions of bit 9 in the L1 page tables.
>
> I don't think there's any sensible short description which would
> adequately describe this setting which would satisfy both your situation
> and situations on other SoCs. We could make the kernel print an entire
> paragraph on it, something like:

It is not my situation, nor either of my two use cases.
I just want to point out that if any "user" uses this option without knowing
what it means, we will get that message back.
I am not saying it is good or bad. Just saying that there is a way
to get it back. And the purpose of this second email was just to check
that we can't detect that. That's it - nothing more, nothing less.

>
> "ECC might be %sabled. The exact ECC setting depends on how your SoC
> is implemented. Please refer to your SoC's technical reference manual
> for a description of bit 9 in the level one page tables for further
> information on how to interpret this statement."
>
> but that would be idiotic.

I agree with you, and no one is asking for this.


> Of course, we could just print nothing, but the purpose of printing this
> is so that _we_ as developers looking at the kernel messages know the
> status of this bit, particularly when interpreting oops dumps. Hiding
> this information would make some oops dumps harder to diagnose. So...
> this is a matter for user education if your users are complaining about
> it.

I have no problem with that. I just wanted to check that there is no way
we can detect that. Then your proposed fix is completely fine with me.

Thanks,
Michal
