2012-06-13 13:43:32

by Seiji Aguchi

[permalink] [raw]
Subject: [RFC][Patch]efi_pstore:Introduce an efi_pstore_overwrite parameter to avoid missing messages in NVRAM

Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
So, in the following scenario, we will lose 1st panic messages.

1. kernel panics.
2. efi_pstore is kicked and writes panic messages to NVRAM.
3. system shutdowns and boots up.
4. kernel panics again before a user checks the 1st panic messages in NVRAM.


To avoid missing 1st panic messages, this patch adds a new parameter ,efi_pstore_overwrite,
to efi_pstore so that we can specify whether efi_pstore overwrites existing entries in NVRAM or not.


Signed-off-by: Seiji Aguchi <[email protected]>

---
Documentation/kernel-parameters.txt | 6 ++++
drivers/firmware/efivars.c | 55 +++++++++++++++++++---------------
2 files changed, 37 insertions(+), 24 deletions(-)

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index a92c5eb..990befa 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -786,6 +786,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
edd= [EDD]
Format: {"off" | "on" | "skip[mbr]"}

+ efivars.efi_pstore_overwrite=
+ Specify whether efi_pstore overwrites existing entries
+ in NVRAM
+ Format: <bool> (1/Y/y=yes, 0/N/n=no)
+ default: yes
+
eisa_irq_edge= [PARISC,HW]
See header of drivers/parisc/eisa.c.

diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c index 47408e8..94fbd1d 100644
--- a/drivers/firmware/efivars.c
+++ b/drivers/firmware/efivars.c
@@ -685,6 +685,10 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
return 0;
}

+static bool efi_pstore_overwrite = 1;
+module_param_named(efi_pstore_overwrite, efi_pstore_overwrite,
+ bool, S_IRUGO | S_IWUSR);
+
static int efi_pstore_write(enum pstore_type_id type,
enum kmsg_dump_reason reason, u64 *id,
unsigned int part, size_t size, struct pstore_info *psi) @@ -705,33 +709,36 @@ static int efi_pstore_write(enum pstore_type_id type,
for (i = 0; i < DUMP_NAME_LEN; i++)
efi_name[i] = stub_name[i];

- /*
- * Clean up any entries with the same name
- */
+ if (efi_pstore_overwrite) {
+ /*
+ * Clean up any entries with the same name
+ */
+
+ list_for_each_entry(entry, &efivars->list, list) {
+ get_var_data_locked(efivars, &entry->var);
+
+ if (efi_guidcmp(entry->var.VendorGuid, vendor))
+ continue;
+ if (utf16_strncmp(entry->var.VariableName, efi_name,
+ utf16_strlen(efi_name)))
+ continue;
+ /* Needs to be a prefix */
+ if (entry->var.VariableName[utf16_strlen(efi_name)]
+ == 0)
+ continue;
+
+ /* found */
+ found = entry;
+ efivars->ops->set_variable(entry->var.VariableName,
+ &entry->var.VendorGuid,
+ PSTORE_EFI_ATTRIBUTES,
+ 0, NULL);
+ }

- list_for_each_entry(entry, &efivars->list, list) {
- get_var_data_locked(efivars, &entry->var);
-
- if (efi_guidcmp(entry->var.VendorGuid, vendor))
- continue;
- if (utf16_strncmp(entry->var.VariableName, efi_name,
- utf16_strlen(efi_name)))
- continue;
- /* Needs to be a prefix */
- if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
- continue;
-
- /* found */
- found = entry;
- efivars->ops->set_variable(entry->var.VariableName,
- &entry->var.VendorGuid,
- PSTORE_EFI_ATTRIBUTES,
- 0, NULL);
+ if (found)
+ list_del(&found->list);
}

- if (found)
- list_del(&found->list);
-
for (i = 0; i < DUMP_NAME_LEN; i++)
efi_name[i] = name[i];

--
1.7.1


2012-06-13 20:25:17

by Mike Waychison

[permalink] [raw]
Subject: Re: [RFC][Patch]efi_pstore:Introduce an efi_pstore_overwrite parameter to avoid missing messages in NVRAM

On Wed, Jun 13, 2012 at 6:42 AM, Seiji Aguchi <[email protected]> wrote:
>
> ? ?Currently, efi_pstore driver simply overwrites existing panic messages in NVRAM.
> ? ?So, in the following scenario, we will lose 1st panic messages.
>
> ? ?1. kernel panics.
> ? ?2. efi_pstore is kicked and writes panic messages to NVRAM.
> ? ?3. system shutdowns and boots up.
> ? ?4. kernel panics again before a user checks the 1st panic messages in NVRAM.
>
>
> ? ?To avoid missing 1st panic messages, this patch adds a new parameter ,efi_pstore_overwrite,
> ? ?to efi_pstore so that we can specify whether efi_pstore overwrites existing entries in NVRAM or not.
>
>
> ?Signed-off-by: Seiji Aguchi <[email protected]>


Acked-by: Mike Waychison <[email protected]>

>
> ---
> ?Documentation/kernel-parameters.txt | ? ?6 ++++
> ?drivers/firmware/efivars.c ? ? ? ? ?| ? 55 +++++++++++++++++++---------------
> ?2 files changed, 37 insertions(+), 24 deletions(-)
>
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index a92c5eb..990befa 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -786,6 +786,12 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
> ? ? ? ?edd= ? ? ? ? ? ?[EDD]
> ? ? ? ? ? ? ? ? ? ? ? ?Format: {"off" | "on" | "skip[mbr]"}
>
> + ? ? ? efivars.efi_pstore_overwrite=
> + ? ? ? ? ? ? ? ? ? ? ? Specify whether efi_pstore overwrites existing entries
> + ? ? ? ? ? ? ? ? ? ? ? in NVRAM
> + ? ? ? ? ? ? ? ? ? ? ? Format: <bool> ?(1/Y/y=yes, 0/N/n=no)
> + ? ? ? ? ? ? ? ? ? ? ? default: yes
> +
> ? ? ? ?eisa_irq_edge= ?[PARISC,HW]
> ? ? ? ? ? ? ? ? ? ? ? ?See header of drivers/parisc/eisa.c.
>
> diff --git a/drivers/firmware/efivars.c b/drivers/firmware/efivars.c index 47408e8..94fbd1d 100644
> --- a/drivers/firmware/efivars.c
> +++ b/drivers/firmware/efivars.c
> @@ -685,6 +685,10 @@ static ssize_t efi_pstore_read(u64 *id, enum pstore_type_id *type,
> ? ? ? ?return 0;
> ?}
>
> +static bool efi_pstore_overwrite = 1;
> +module_param_named(efi_pstore_overwrite, efi_pstore_overwrite,
> + ? ? ? ? ? ? ? ? ?bool, S_IRUGO | S_IWUSR);
> +
> ?static int efi_pstore_write(enum pstore_type_id type,
> ? ? ? ? ? ? ? ?enum kmsg_dump_reason reason, u64 *id,
> ? ? ? ? ? ? ? ?unsigned int part, size_t size, struct pstore_info *psi) @@ -705,33 +709,36 @@ static int efi_pstore_write(enum pstore_type_id type,
> ? ? ? ?for (i = 0; i < DUMP_NAME_LEN; i++)
> ? ? ? ? ? ? ? ?efi_name[i] = stub_name[i];
>
> - ? ? ? /*
> - ? ? ? ?* Clean up any entries with the same name
> - ? ? ? ?*/
> + ? ? ? if (efi_pstore_overwrite) {
> + ? ? ? ? ? ? ? /*
> + ? ? ? ? ? ? ? ?* Clean up any entries with the same name
> + ? ? ? ? ? ? ? ?*/
> +
> + ? ? ? ? ? ? ? list_for_each_entry(entry, &efivars->list, list) {
> + ? ? ? ? ? ? ? ? ? ? ? get_var_data_locked(efivars, &entry->var);
> +
> + ? ? ? ? ? ? ? ? ? ? ? if (efi_guidcmp(entry->var.VendorGuid, vendor))
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? continue;
> + ? ? ? ? ? ? ? ? ? ? ? if (utf16_strncmp(entry->var.VariableName, efi_name,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? utf16_strlen(efi_name)))
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? continue;
> + ? ? ? ? ? ? ? ? ? ? ? /* Needs to be a prefix */
> + ? ? ? ? ? ? ? ? ? ? ? if (entry->var.VariableName[utf16_strlen(efi_name)]
> + ? ? ? ? ? ? ? ? ? ? ? ? ? == 0)
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? continue;
> +
> + ? ? ? ? ? ? ? ? ? ? ? /* found */
> + ? ? ? ? ? ? ? ? ? ? ? found = entry;
> + ? ? ? ? ? ? ? ? ? ? ? efivars->ops->set_variable(entry->var.VariableName,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?&entry->var.VendorGuid,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?PSTORE_EFI_ATTRIBUTES,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?0, NULL);
> + ? ? ? ? ? ? ? }
>
> - ? ? ? list_for_each_entry(entry, &efivars->list, list) {
> - ? ? ? ? ? ? ? get_var_data_locked(efivars, &entry->var);
> -
> - ? ? ? ? ? ? ? if (efi_guidcmp(entry->var.VendorGuid, vendor))
> - ? ? ? ? ? ? ? ? ? ? ? continue;
> - ? ? ? ? ? ? ? if (utf16_strncmp(entry->var.VariableName, efi_name,
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? utf16_strlen(efi_name)))
> - ? ? ? ? ? ? ? ? ? ? ? continue;
> - ? ? ? ? ? ? ? /* Needs to be a prefix */
> - ? ? ? ? ? ? ? if (entry->var.VariableName[utf16_strlen(efi_name)] == 0)
> - ? ? ? ? ? ? ? ? ? ? ? continue;
> -
> - ? ? ? ? ? ? ? /* found */
> - ? ? ? ? ? ? ? found = entry;
> - ? ? ? ? ? ? ? efivars->ops->set_variable(entry->var.VariableName,
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?&entry->var.VendorGuid,
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?PSTORE_EFI_ATTRIBUTES,
> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?0, NULL);
> + ? ? ? ? ? ? ? if (found)
> + ? ? ? ? ? ? ? ? ? ? ? list_del(&found->list);
> ? ? ? ?}
>
> - ? ? ? if (found)
> - ? ? ? ? ? ? ? list_del(&found->list);
> -
> ? ? ? ?for (i = 0; i < DUMP_NAME_LEN; i++)
> ? ? ? ? ? ? ? ?efi_name[i] = name[i];
>
> --
> 1.7.1

2012-06-13 20:43:22

by Luck, Tony

[permalink] [raw]
Subject: RE: [RFC][Patch]efi_pstore:Introduce an efi_pstore_overwrite parameter to avoid missing messages in NVRAM

> ? ?4. kernel panics again before a user checks the 1st panic messages in NVRAM.
>
>
> ? ?To avoid missing 1st panic messages, this patch adds a new parameter ,efi_pstore_overwrite,
> ? ?to efi_pstore so that we can specify whether efi_pstore overwrites existing entries in NVRAM or not.

Since the EFI backend has so little storage space ... perhaps it should
have some overwrite rules built into it?

E.g. in the case where there is an OOPS already logged, and a panic happens,
then I think the right thing to do is overwrite the oops with the panic.
[The oops *might* have already made it to /var/log/messages, but even if
it didn't, a user with a PICK ONLY ONE choice would have to go for the
log of the panic]

In the situation you describe where we already have a panic, then
I don't think than anyone would want the earlier panic to be overwritten
by the new one.

If this seems a plausible route ... we'd need to tabulate the overwrite
rules. I think they are:
shutdown/reboot/kexec - can be overwritten by OOPS
OOPS can be overwritten by panic
panic can never be overwritten

I'd like to avoid a kernel parameter ... chances are too high that the
machine would be automatically rebooted without the right boot argument.

-Tony

2012-06-13 23:14:40

by Seiji Aguchi

[permalink] [raw]
Subject: RE: [RFC][Patch]efi_pstore:Introduce an efi_pstore_overwrite parameter to avoid missing messages in NVRAM

Thank you for your comment.

> Since the EFI backend has so little storage space ... perhaps it should have some overwrite rules built into it?

I think so.
If multiple triggers don't stride across the reboot, I think we can make an automated rule.

> If this seems a plausible route ... we'd need to tabulate the overwrite rules. I think they are:
> shutdown/reboot/kexec - can be overwritten by OOPS

In my usecase, emergency_restart can be overwritten by OOPS

> OOPS can be overwritten by panic
> panic can never be overwritten

I can make a patch in accordance with the rule above.

Should it be in pstore layer?
- introducing a new global variable storing a previous trigger.
- deciding if we can continue by checking a value of previous trigger at the beginning of pstore_dump().


> I'd like to avoid a kernel parameter ... chances are too high that the machine would be automatically rebooted without the right boot
> argument.

But, I'm not sure if we can simply apply the rule above in case where multiple triggers stride across the reboot...
In this case, emergency_restart cannot be overwritten by panic..

Anyway, I will consider the possibility of avoiding a kernel parameter.

Seiji

2012-06-13 23:20:20

by Luck, Tony

[permalink] [raw]
Subject: RE: [RFC][Patch]efi_pstore:Introduce an efi_pstore_overwrite parameter to avoid missing messages in NVRAM

> Should it be in pstore layer?
> - introducing a new global variable storing a previous trigger.
> - deciding if we can continue by checking a value of previous trigger at the beginning of pstore_dump().

I think not - the limitation is EFI (ERST can handle several errors, I think that the ramoops
patches are working with the assumption that enough storage is available too).

The question is how easy is it for EFI to determine the "type" of the saved
error (which may be from the previous booted kernel - so no use using a
global variable ... the information needs to be in persistent store).

-Tony

2012-06-13 23:28:23

by Seiji Aguchi

[permalink] [raw]
Subject: RE: [RFC][Patch]efi_pstore:Introduce an efi_pstore_overwrite parameter to avoid missing messages in NVRAM


> I think not - the limitation is EFI (ERST can handle several errors, I think that the ramoops patches are working with the assumption that
> enough storage is available too).
>

I see.

> The question is how easy is it for EFI to determine the "type" of the saved error (which may be from the previous booted kernel - so
> no use using a global variable ... the information needs to be in persistent store).

OK. I will not use the global variable.
I can check the previous event from a header supplied by pstore ,such as "Panic#1 Part1".

Seiji

2012-06-13 23:38:42

by Seiji Aguchi

[permalink] [raw]
Subject: RE: [RFC][Patch]efi_pstore:Introduce an efi_pstore_overwrite parameter to avoid missing messages in NVRAM

Tony,

> - the limitation is EFI

I have one question. (It is not directly related to a discussion here.)
Do you mean the limitation is due to EFI specification?
Or do you mean there is no EFI server capable of a much amount of NVRAM?

Seiji

2012-06-13 23:55:31

by Luck, Tony

[permalink] [raw]
Subject: RE: [RFC][Patch]efi_pstore:Introduce an efi_pstore_overwrite parameter to avoid missing messages in NVRAM

> I have one question. (It is not directly related to a discussion here.)
> Do you mean the limitation is due to EFI specification?
> Or do you mean there is no EFI server capable of a much amount of NVRAM?

Maybe Matthew can answer that - I wouldn't think that the specification would
limit the amount of NVRAM - but I would expect implementations to do so. Generally
the EFI persistent space has to share the same flash chip as the BIOS ... and the
BIOS folks are good at filling up all the space with their own code & data. These
chips are still only a few (4, 8) Megabytes or so.

-Tony